Public API

Python API (overview)

omegaml.datasets

the omegaml.store.base.OmegaStore store for datasets

omegaml.models

the omegaml.store.base.OmegaStore store for models

omegaml.runtimes

omegaml.jobs

the omegaml.notebook.jobs.OmegaJobs store for jobs

omegaml.scripts

the omegaml.store.base.OmegaStore store for scripts

omegaml.streams

stream helper

omegaml.logger

the OmegaSimpleLogger for easy log access

omegaml.store.base.OmegaStore([mongo_url, ...])

The storage backend for models and data

omegaml.runtimes.OmegaRuntime(omega[, ...])

omegaml compute cluster gateway

omegaml.runtimes.OmegaModelProxy(modelname)

proxy to a remote model in a celery worker

omegaml.runtimes.OmegaJobProxy(jobname[, ...])

proxy to a remote job in a celery worker

omegaml.notebook.jobs.OmegaJobs([bucket, ...])

Omega Jobs API

omegaml.mdataframe.MDataFrame(collection[, ...])

A DataFrame for mongodb

omegaml.mdataframe.MGrouper(mdataframe, ...)

a Grouper for MDataFrames

omegaml.mdataframe.MLocIndexer(mdataframe[, ...])

implements the LocIndexer for MDataFrames

omegaml.mdataframe.MPosIndexer(mdataframe)

implements the position-based indexer for MDataFrames

omegaml.backends.sqlalchemy.SQLAlchemyBackend([...])

sqlalchemy plugin for omegaml

Python API

omegaml.models = OmegaStore(bucket=omegaml, prefix=models/)

the omegaml.store.base.OmegaStore store for models

omegaml.runtimes - the cluster runtime API
omegaml.notebook.jobs - the lambda compute service

omegaml.store

class omegaml.store.base.OmegaStore(mongo_url=None, bucket=None, prefix=None, kind=None, defaults=None, dbalias=None)

The storage backend for models and data

drop(name, force=False, version=-1, **kwargs)

Drop the object

Parameters:
  • name – The name of the object

  • force – If True, ignores the DoesNotExist exception. Defaults to False, meaning a DoesNotExist exception is raised if the name does not exist

Returns:

True if object was deleted, False if not. If force is True and the object does not exist it will still return True

Raises:

DoesNotExist if the object does not exist and `force=False`
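The force semantics described above can be sketched with a small in-memory stand-in. TinyStore and its DoesNotExist class are hypothetical and exist only for illustration; they are not the omegaml implementation and mimic nothing beyond the documented drop() contract.

```python
class DoesNotExist(Exception):
    """Hypothetical stand-in for the store's DoesNotExist exception."""


class TinyStore:
    """Hypothetical in-memory store mimicking only the documented drop() contract."""

    def __init__(self):
        self._objects = {}

    def put(self, obj, name):
        self._objects[name] = obj

    def drop(self, name, force=False):
        if name not in self._objects:
            if force:
                # documented: with force=True a missing name still returns True
                return True
            # documented: with force=False a missing name raises DoesNotExist
            raise DoesNotExist(name)
        del self._objects[name]
        return True


store = TinyStore()
store.put([1, 2, 3], 'mydata')

dropped = store.drop('mydata')                     # object exists, gets deleted
dropped_again = store.drop('mydata', force=True)   # already gone, error suppressed

try:
    store.drop('mydata')                           # already gone, force=False
    raised = False
except DoesNotExist:
    raised = True
```

With a real store the same call shape applies, so code that must be idempotent would typically pass force=True.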

get(name, version=-1, force_python=False, kind=None, **kwargs)

Retrieve an object

Parameters:
  • name – The name of the object

  • version – Version of the stored object (not supported)

  • force_python – Return as a python object

  • kwargs – kwargs depending on object kind

Returns:

an object, estimator, pipelines, data array or pandas dataframe previously stored with put()

getl(*args, **kwargs)

return a lazy MDataFrame for a given object

Same as .get, but returns an MDataFrame

list(pattern=None, regexp=None, kind=None, raw=False, hidden=None, include_temp=False, bucket=None, prefix=None, filter=None)

List all files in store

Specify pattern as a unix pattern (e.g. models/*), or specify regexp.

Parameters:
  • pattern – the unix file pattern or None for all

  • regexp – the regexp. takes precedence over pattern

  • raw – if True return the meta data objects

  • filter – specify additional filter criteria, optional

Returns:

List of files in store
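The interplay of pattern and regexp can be sketched in plain Python. The helper filter_names below is hypothetical and only illustrates the documented precedence (regexp over pattern, no criteria meaning all names). Whether the store matches the regexp anywhere in the name or only at its start is an assumption here; this sketch uses search semantics.

```python
import fnmatch
import re


def filter_names(names, pattern=None, regexp=None):
    """Hypothetical helper: regexp takes precedence over pattern."""
    if regexp is not None:
        rx = re.compile(regexp)
        # assumption: the regexp may match anywhere in the name
        return [n for n in names if rx.search(n)]
    if pattern is not None:
        # unix-style pattern, e.g. models/*
        return [n for n in names if fnmatch.fnmatchcase(n, pattern)]
    # pattern=None means all names
    return list(names)


names = ['models/churn', 'models/price', 'data/sales']

by_pattern = filter_names(names, pattern='models/*')
by_regexp = filter_names(names, pattern='models/*', regexp=r'^data/')
everything = filter_names(names)
```

In the second call both criteria are given, and only the regexp is applied.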

put(obj, name, attributes=None, kind=None, replace=False, **kwargs)

Store an object. Supported objects include estimators, pipelines, numpy arrays and pandas dataframes

omegaml.jobs

class omegaml.notebook.jobs.OmegaJobs(bucket=None, prefix=None, store=None, defaults=None)

Omega Jobs API

run(name, event=None, timeout=None)

Run a job immediately

The job is run, its results are stored in om.jobs('results/name <timestamp>'), and the Metadata of the results is returned.

Metadata.attributes of the original job as given by name is updated:

  • attributes['job_runs'] (list) - list of status of each run. Status is a dict as below

  • attributes['job_results'] (list) - list of results job names in same index-order as job_runs

  • attributes['trigger'] (list) - list of triggers

The status of each job run is a dict with keys:

  • status (str): the status of the job run, OK or ERROR

  • ts (datetime): time of execution

  • message (str): error message in case of ERROR, else blank

  • results (str): name of results in case of OK, else blank
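As an illustration of the status dict described above, the following sketch builds a job_runs list by hand and looks up the most recent successful result. The values (timestamps, error message, results name) are made up, and last_ok_result is a hypothetical helper, not part of the Jobs API.

```python
from datetime import datetime

# made-up job_runs entries following the documented keys
job_runs = [
    {'status': 'ERROR',
     'ts': datetime(2024, 1, 1, 8, 0),
     'message': 'kernel died',        # error message, only set on ERROR
     'results': ''},                  # blank on ERROR
    {'status': 'OK',
     'ts': datetime(2024, 1, 2, 8, 0),
     'message': '',                   # blank on OK
     'results': 'results/mynb 2024-01-02 08:00:00'},
]


def last_ok_result(runs):
    """Hypothetical helper: name of the results of the latest OK run, or None."""
    for run in reversed(runs):
        if run['status'] == 'OK':
            return run['results']
    return None


latest = last_ok_result(job_runs)
```

On a real job the list would be read from the job's Metadata.attributes['job_runs'] instead of being built by hand.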

Usage:

import omegaml as om

# directly (sync)
meta = om.jobs.run('mynb')

# via runtime (async)
job = om.runtime.job('mynb')
result = job.run()

Parameters:
  • name (str) – the name of the jobfile

  • event (str) – an event name

  • timeout (int) – timeout in seconds, None means no timeout

Returns:

Metadata of the results entry

See also

  • OmegaJobs.run_notebook

run_notebook(name, event=None, timeout=None)

run a given notebook immediately.

Parameters:
  • name (str) – the name of the jobfile

  • event (str) – an event name

  • timeout (int) – timeout in seconds

Returns:

Metadata of results

schedule(nb_file, run_at=None, last_run=None)

Schedule processing of a notebook at the interval specified in the job script

Notes

This updates the notebook’s Metadata entry by adding the next scheduled run in attributes['triggers']

Parameters:
  • nb_file (str) – the name of the notebook

  • run_at (str|dict|JobSchedule) – the schedule specified in a format suitable for JobSchedule. If not specified, this value is extracted from the first cell of the notebook

  • last_run (datetime) – the last time this job was run, use this to reschedule the job for the next run. Defaults to the last timestamp listed in attributes['job_runs'], or datetime.utcnow() if no previous run exists.
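The default for last_run described above can be sketched as follows. default_last_run is a hypothetical helper, and reading "the last timestamp listed" as the ts of the final entry in job_runs is an assumption; the actual implementation may choose differently.

```python
from datetime import datetime, timezone


def default_last_run(attributes):
    """Hypothetical helper: pick last_run the way the schedule() docs describe."""
    runs = attributes.get('job_runs') or []
    if runs:
        # assumption: the last listed run is the final list entry
        return runs[-1]['ts']
    # no previous run: naive UTC "now", equivalent to datetime.utcnow()
    return datetime.now(timezone.utc).replace(tzinfo=None)


# made-up attributes of a job that ran once
with_runs = {'job_runs': [{'status': 'OK', 'ts': datetime(2024, 1, 2, 8, 0)}]}

from_history = default_last_run(with_runs)
from_now = default_last_run({})
```

Passing last_run explicitly overrides this default, which is how a job would be rescheduled relative to a chosen point in time.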

See also