Public API

Python API (overview)

omegaml.datasets the OmegaStore for datasets
omegaml.models the OmegaStore for models
omegaml.runtimes the cluster runtime API
omegaml.jobs the lambda compute service (jobs API)
omegaml.scripts the OmegaStore for lambda scripts
omegaml.store.base.OmegaStore([mongo_url, …]) The storage backend for models and data
omegaml.runtimes.OmegaRuntime(omega[, …]) omegaml compute cluster gateway
omegaml.runtimes.OmegaRuntimeDask(omega[, …]) omegaml compute cluster gateway to a dask distributed cluster
omegaml.runtimes.OmegaModelProxy(modelname) proxy to a remote model in a celery worker
omegaml.runtimes.OmegaJobProxy(jobname[, …]) proxy to a remote job in a celery worker
omegaml.jobs.OmegaJobs([prefix, store, defaults]) Omega Jobs API
omegaml.mdataframe.MDataFrame(collection[, …]) A DataFrame for mongodb
omegaml.mdataframe.MGrouper(mdataframe, …) a Grouper for MDataFrames
omegaml.mdataframe.MLocIndexer(mdataframe[, …]) implements the LocIndexer for MDataFrames
omegaml.mdataframe.MPosIndexer(mdataframe) implements the position-based indexer for MDataFrames

Python API

omega|ml

omegaml.datasets - storage for data

the OmegaStore for datasets

omegaml.models - storage for models

the OmegaStore for models

omegaml.runtimes - the cluster runtime API
omegaml.jobs - the lambda compute service

the jobs API
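
A minimal usage sketch, assuming a configured omega|ml installation (the dataset name and contents are illustrative):

import pandas as pd
import omegaml as om

# store a DataFrame in the datasets store
df = pd.DataFrame({'region': ['north', 'south'], 'amount': [100, 200]})
om.datasets.put(df, 'sales')

# retrieve it back as a pandas DataFrame
df = om.datasets.get('sales')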

omegaml.store

class omegaml.store.base.OmegaStore(mongo_url=None, bucket=None, prefix=None, kind=None, defaults=None)

The storage backend for models and data

collection(name=None)

Returns a mongo db collection as a datastore

Parameters:name – the collection to use. If None, defaults to the collection name given on instantiation. The actual collection name used is always prefix + name + '.data'
drop(name, force=False, version=-1)

Drop the object

Parameters:
  • name – The name of the object
  • force – If True ignores the DoesNotExist exception; defaults to False, meaning a DoesNotExist exception is raised if the name does not exist
Returns:

True if object was deleted, False if not. If force is True and the object does not exist it will still return True

fs

Retrieve a gridfs instance using the url and collection provided

Returns:a gridfs instance
get(name, version=-1, force_python=False, **kwargs)

Retrieve an object

Parameters:
  • name – The name of the object
  • version – Version of the stored object (not supported)
  • force_python – Return as a python object
  • kwargs – kwargs depending on object kind
Returns:

an object, estimator, pipeline, data array or pandas dataframe previously stored with put()

get_backend(name, model_store=None, data_store=None, **kwargs)

return the backend for a given object name

Parameters:
  • name – The name of the object
  • model_store – the OmegaStore instance used to store models
  • data_store – the OmegaStore instance used to store data
  • kwargs – the kwargs passed to the backend initialization
Returns:

the backend

get_backend_bykind(kind, model_store=None, data_store=None, **kwargs)

return the backend by a given object kind

Parameters:
  • kind – The object kind
  • model_store – the OmegaStore instance used to store models
  • data_store – the OmegaStore instance used to store data
  • kwargs – the kwargs passed to the backend initialization
Returns:

the backend

get_dataframe_dfgroup(name, version=-1, kwargs=None)

Return a grouped dataframe

Parameters:
  • name – the name of the object
  • version – not supported
  • kwargs – mongo db query arguments to be passed to collection.find() as a filter.
get_dataframe_documents(name, columns=None, lazy=False, filter=None, version=-1, is_series=False, **kwargs)

Internal method to return DataFrame from documents

Parameters:
  • name – the name of the object (str)
  • columns – the column projection as a list of column names
  • lazy – if True returns a lazy representation as an MDataFrame. If False retrieves all data and returns a DataFrame (default)
  • filter – the filter to be applied as a column__op=value dict
  • version – the version to retrieve (not supported)
  • is_series – if True returns a Series instead of a DataFrame
  • kwargs – remaining kwargs are used as a filter. The filter kwarg overrides other kwargs.
Returns:

the retrieved object (DataFrame, Series or MDataFrame)
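
This method is normally reached through om.datasets.get(); a sketch of the filter and lazy options, assuming these kwargs pass through from get() (dataset and column names are illustrative):

# filter via kwargs (column__op=value)
om.datasets.get('sales', amount__gte=100)
# equivalent, using the explicit filter argument
om.datasets.get('sales', filter=dict(amount__gte=100))
# lazy retrieval as an MDataFrame
mdf = om.datasets.get('sales', lazy=True)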

get_dataframe_hdf(name, version=-1)

Retrieve dataframe from hdf

Parameters:
  • name – The name of object
  • version – The version of object (not supported)
Returns:

Returns a python pandas dataframe

Raises:

gridfs.errors.NoFile

get_object_as_python(meta, version=-1)

Retrieve object as python object

Parameters:
  • meta – The metadata object
  • version – The version of the object
Returns:

Returns data as python object

get_python_data(name, version=-1)

Retrieve objects as python data

Parameters:
  • name – The name of object
  • version – The version of object
Returns:

Returns the object as a python list

getl(*args, **kwargs)

return a lazy MDataFrame for a given object

Same as .get, but returns an MDataFrame

list(pattern=None, regexp=None, kind=None, raw=False, include_temp=False, bucket=None, prefix=None)

List all files in store

specify pattern as a unix pattern (e.g. models/*), or specify regexp

Parameters:
  • pattern – the unix file pattern or None for all
  • regexp – the regexp. takes precedence over pattern
  • raw – if True return the metadata objects
Returns:

List of files in store

make_metadata(name, kind, bucket=None, prefix=None, **kwargs)

create or update a metadata object

this retrieves a Metadata object if it exists, given the kwargs. Only the name, prefix and bucket arguments are considered

for existing Metadata objects, the attributes keyword is treated as follows:

  • attributes=None, the existing attributes are left as is
  • attributes={}, the attributes value on an existing metadata object is reset to the empty dict
  • attributes={ some : value }, the existing attributes are updated

For new metadata objects, attributes defaults to {} if not specified, else is set as provided.

Parameters:
  • name – the object name
  • bucket – the bucket, optional, defaults to self.bucket
  • prefix – the prefix, optional, defaults to self.prefix
metadata(name=None, bucket=None, prefix=None, version=-1)

Returns a metadata document for the given entry name

FIXME: the version attribute does not do anything
FIXME: metadata should be stored in a bucket-specific collection to enable access control, see https://docs.mongodb.com/manual/reference/method/db.createRole/#db.createRole

mongodb

Returns a mongo database object

object_store_key(name, ext)

Returns the store key

Parameters:
  • name – The name of object
  • ext – The extension of the filename
Returns:

A filename with relative bucket, prefix and name

put(obj, name, attributes=None, **kwargs)

Stores an object; supports estimators, pipelines, numpy arrays and pandas dataframes

put_dataframe_as_dfgroup(obj, name, groupby, attributes=None)

store a dataframe grouped by columns in a mongo document

Example:

# each group
{
    # group keys
    key: val,
    _data: [
        # only data keys
        { key: val, ... }
    ]
}
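
A usage sketch based on the signatures above (dataset and column names are illustrative):

# store a dataframe grouped by the region column
om.datasets.put_dataframe_as_dfgroup(df, 'sales_by_region', ['region'])
# retrieve, filtering on the group key
om.datasets.get_dataframe_dfgroup('sales_by_region', kwargs=dict(region='north'))
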
put_dataframe_as_documents(obj, name, append=None, attributes=None, index=None, timestamp=None)

store a dataframe as a row-wise collection of documents

Parameters:
  • obj – the dataframe to store
  • name – the name of the item in the store
  • append – if False the collection is dropped before inserting; if True existing documents persist. Defaults to True. If not specified and rows have been previously inserted, a warning is issued.
  • index – list of columns, using +, -, @ as a column prefix to specify ASCENDING, DESCENDING, GEOSPHERE respectively. For @ the column has to represent a valid GeoJSON object.
  • timestamp – if True or a field name, adds a timestamp. If the value is a boolean or datetime, uses _created as the field name. The timestamp is always datetime.datetime.utcnow(). May be overridden by specifying the tuple (col, datetime).
Returns:

the Metadata object created
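
A sketch using the index and timestamp options directly (dataset and column names are illustrative):

om.datasets.put_dataframe_as_documents(
    df, 'sales',
    append=False,                  # drop any existing collection first
    index=['+region', '-amount'],  # ascending region, descending amount
    timestamp=True)                # adds a _created datetime field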

put_ndarray_as_hdf(obj, name, attributes=None)

store numpy array as hdf

this is a hack: the array is converted to a dataframe, then stored

put_pyobj_as_document(obj, name, attributes=None, append=True)

store a dict as a document

similar to put_dataframe_as_documents, no data is replaced by default; that is, obj is appended as a new document in the object's mongo collection. To replace the data, specify append=False.

put_pyobj_as_hdf(obj, name, attributes=None)

store list, tuple, dict as hdf

this requires the list, tuple or dict to be convertible into a dataframe

rebuild_params(kwargs, collection)

Returns a modified set of parameters for querying mongodb based on how the mongo document is structured and the fields the document is grouped by.

Note: Explicitly to be used with get_grouped_data only

Parameters:
  • kwargs – Mongo filter arguments
  • collection – The name of mongodb collection
Returns:

Returns a set of parameters as dictionary.

register_backend(kind, backend)

register a backend class

Parameters:
  • kind – (str) the backend kind
  • backend – (class) the backend class
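
A sketch of registering a custom backend; MyKindBackend and its interface are hypothetical, the actual interface is defined by omegaml's backend base classes:

class MyKindBackend:
    # hypothetical backend: a real backend implements the methods
    # OmegaStore expects for its kind (e.g. put/get)
    KIND = 'python.mykind'
    ...

om.datasets.register_backend(MyKindBackend.KIND, MyKindBackend)
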
register_backends()

register backends in defaults.OMEGA_STORE_BACKENDS

register_mixin(mixincls)

register a mixin class

Parameters:mixincls – (class) the mixin class
tmppath

return an instance-specific temporary path

omegaml.runtimes

class omegaml.runtimes.OmegaRuntime(omega, backend=None, broker=None, celerykwargs=None, celeryconf=None, defaults=None)

omegaml compute cluster gateway

job(jobname)

return a job for remote execution

model(modelname)

return a model for remote execution

ping(*args, **kwargs)

ping the runtime

settings()

return the runtime's cluster settings

task(name)

retrieve the task function from the celery instance

we do it like this so we can have per-OmegaRuntime-instance celery configurations (as opposed to using the default app's import, which seems to confuse celery)

class omegaml.runtimes.OmegaRuntimeDask(omega, dask_url=None)

omegaml compute cluster gateway to a dask distributed cluster

set environ DASK_DEBUG=1 to run dask tasks locally

job(jobname)

return a job for remote execution

model(modelname)

return a model for remote execution

settings()

return the runtime's cluster settings

task(name)

retrieve the task function from the task module

This retrieves the task function and wraps it into a DaskTask. DaskTask mimics a celery task and is called on the cluster using .delay(), the same way we call a celery task. .delay() will return a DaskAsyncResult, supporting the celery .get() semantics. This way we can use the same proxy objects, as all they do is call .delay() and return an AsyncResult.

class omegaml.runtimes.OmegaModelProxy(modelname, runtime=None)

proxy to a remote model in a celery worker

The proxy provides the same methods as the model but will execute the methods using celery tasks and return celery AsyncResult objects

Usage:

om = Omega()
# train a model
# result is an AsyncResult, use .get() to return its result
result = om.runtime.model('foo').fit('datax', 'datay')
result.get()

# predict
result = om.runtime.model('foo').predict('datax')
# result is an AsyncResult, use .get() to return its result
print(result.get())
apply_mixins()

apply mixins in defaults.OMEGA_RUNTIME_MIXINS

class omegaml.runtimes.OmegaJobProxy(jobname, runtime=None)

proxy to a remote job in a celery worker

Usage:

om = Omega()
# result is an AsyncResult, use .get() to return its result
result = om.runtime.job('foojob').run()
result.get()

# result is an AsyncResult, use .get() to return its result
result = om.runtime.job('foojob').schedule()
result.get()
run(**kwargs)

run the job

Returns:the result
schedule(**kwargs)

schedule the job

omegaml.jobs

class omegaml.jobs.OmegaJobs(prefix=None, store=None, defaults=None)

Omega Jobs API

create(code, name)

create a notebook from code

Parameters:
  • code – the code as a string
  • name – the name of the job to create
Returns:

the metadata object created
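
A sketch of creating and running a job (the job name and code are illustrative):

# create a notebook from a code string, then run it
code = "print('hello from omegaml')"
meta = om.jobs.create(code, 'hello-job')
result_meta = om.jobs.run('hello-job')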

get(name)

Retrieve a notebook and return a NotebookNode

get_collection(collection)

returns the collection object

get_notebook_config(nb_filename)

returns the omegaml script config in the notebook's first cell

list(jobfilter='.*', raw=False)

list all jobs matching the filter. The filter is a regex on the name of the ipynb entry. The default is all, i.e. .*

put(obj, name, attributes=None)

Store a NotebookNode

Parameters:
  • obj – the NotebookNode to store
  • name – the name of the notebook
run(name)

Run a job immediately

The job is run and the results are stored in the given filename

Parameters:name – the name of the jobfile
Returns:the metadata of the job
run_notebook(name)

run a given notebook immediately. The name parameter is the name of the job script (the ipynb entry). Inserts and returns the Metadata document for the job.

schedule(nb_file)

Schedule processing of a notebook per the interval specified in the job script

omegaml.mdataframe

class omegaml.mdataframe.MDataFrame(collection, columns=None, query=None, limit=None, skip=None, sort_order=None, force_columns=None, immediate_loc=False, auto_inspect=False, preparefn=None, **kwargs)

A DataFrame for mongodb

Performs out-of-core, lazy computation on a mongodb cluster. Behaves like a pandas DataFrame. Actual results are returned as pandas DataFrames.
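
A usage sketch, assuming a dataset was stored previously (dataset and column names are illustrative):

# getl returns a lazy MDataFrame on the stored collection
mdf = om.datasets.getl('sales')
# operations build a mongodb query; .value resolves to a pandas DataFrame
df = mdf.query(amount__gte=100).sort('-amount').head(5).value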

__len__()

the projected number of rows when resolving

count()

projected number of rows when resolving

create_index(keys, **kwargs)

create an index the easy way

groupby(columns, sort=True)

Group by a given set of columns

Parameters:
  • columns – the list of columns
  • sort – if True sort by group key
Returns:

MGrouper

head(limit=10)

return up to limit rows

Parameters:limit – the number of rows to return. Defaults to 10
Returns:the MDataFrame
inspect(explain=False, cached=False, cursor=None, raw=False)

inspect this dataframe’s actual mongodb query

Parameters:explain – if True explains access path
list_indexes()

list all indices in the database

loc

Access by index

Use as mdf.loc[index_value]

Returns:MLocIndexer
merge(right, on=None, left_on=None, right_on=None, how='inner', target=None, suffixes=('_x', '_y'), sort=False, inspect=False)

merge this dataframe with another dataframe. Only left outer joins are currently supported. The output is saved as a new collection under the target name (defaults to a generated name if not specified).

Parameters:
  • right – the other MDataFrame
  • on – the list of key columns to merge by
  • left_on – the list of the key columns to merge on this dataframe
  • right_on – the list of the key columns to merge on the other dataframe
  • how – the method to merge. supported are left, inner, right. Defaults to inner
  • target – the name of the collection to store the merge results in. If not provided a temporary name will be created.
  • suffixes – the suffixes to apply to identical left and right columns
  • sort – if True the merge results will be sorted. If False the MongoDB natural order is implied.
Returns:

an MDataFrame on the target collection

query(*args, **kwargs)

return a new MDataFrame with a filter criteria

Any subsequent operation on the new dataframe will have the filter applied. To reset the filter call .reset() without arguments.

Note: Unlike pandas DataFrames, a filtered MDataFrame operates on the same collection as the original DataFrame

Parameters:
  • args – a Q object or logical combination of Q objects (optional)
  • kwargs – all AND filter criteria
Returns:

a new MDataFrame with the filter applied

query_inplace(*args, **kwargs)

filters this MDataFrame and returns it.

Any subsequent operation on the dataframe will have the filter applied. To reset the filter call .reset() without arguments.

Parameters:
  • args – a Q object or logical combination of Q objects (optional)
  • kwargs – all AND filter criteria
Returns:

self
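
A sketch contrasting the two methods (column names are illustrative):

# query() returns a new, filtered MDataFrame; mdf is unaffected
filtered = mdf.query(region='north', amount__gte=100)
# query_inplace() applies the filter to mdf itself and returns it
mdf.query_inplace(region='north')
# clear the filter again
mdf.reset()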

shape

return shape of dataframe

skip(topn)

skip the topn number of rows

Parameters:topn – the number of rows to skip.
Returns:the MDataFrame
sort(columns)

sort by specified columns

Parameters:columns – str of single column or a list of columns. Sort order is specified as the + (ascending) or - (descending) prefix to the column name. Default sort order is ascending.
Returns:the MDataFrame
tail(limit=10)

return up to limit rows from the last inserted values

Parameters:limit – the number of rows to return. Defaults to 10
Returns:the MDataFrame
value

resolve the query and return a Pandas DataFrame

Returns:the result of the query as a pandas DataFrame
class omegaml.mdataframe.MSeries(*args, **kwargs)

Series implementation for MDataFrames

behaves like a DataFrame but limited to one column.

__len__()

the projected number of rows when resolving

count()

projected number of rows when resolving

create_index(keys, **kwargs)

create an index the easy way

groupby(columns, sort=True)

Group by a given set of columns

Parameters:
  • columns – the list of columns
  • sort – if True sort by group key
Returns:

MGrouper

head(limit=10)

return up to limit rows

Parameters:limit – the number of rows to return. Defaults to 10
Returns:the MDataFrame
inspect(explain=False, cached=False, cursor=None, raw=False)

inspect this dataframe’s actual mongodb query

Parameters:explain – if True explains access path
list_indexes()

list all indices in the database

loc

Access by index

Use as mdf.loc[index_value]

Returns:MLocIndexer
merge(right, on=None, left_on=None, right_on=None, how='inner', target=None, suffixes=('_x', '_y'), sort=False, inspect=False)

merge this dataframe with another dataframe. Only left outer joins are currently supported. The output is saved as a new collection under the target name (defaults to a generated name if not specified).

Parameters:
  • right – the other MDataFrame
  • on – the list of key columns to merge by
  • left_on – the list of the key columns to merge on this dataframe
  • right_on – the list of the key columns to merge on the other dataframe
  • how – the method to merge. supported are left, inner, right. Defaults to inner
  • target – the name of the collection to store the merge results in. If not provided a temporary name will be created.
  • suffixes – the suffixes to apply to identical left and right columns
  • sort – if True the merge results will be sorted. If False the MongoDB natural order is implied.
Returns:

an MDataFrame on the target collection

query(*args, **kwargs)

return a new MDataFrame with a filter criteria

Any subsequent operation on the new dataframe will have the filter applied. To reset the filter call .reset() without arguments.

Note: Unlike pandas DataFrames, a filtered MDataFrame operates on the same collection as the original DataFrame

Parameters:
  • args – a Q object or logical combination of Q objects (optional)
  • kwargs – all AND filter criteria
Returns:

a new MDataFrame with the filter applied

query_inplace(*args, **kwargs)

filters this MDataFrame and returns it.

Any subsequent operation on the dataframe will have the filter applied. To reset the filter call .reset() without arguments.

Parameters:
  • args – a Q object or logical combination of Q objects (optional)
  • kwargs – all AND filter criteria
Returns:

self

shape

return shape of dataframe

skip(topn)

skip the topn number of rows

Parameters:topn – the number of rows to skip.
Returns:the MDataFrame
sort(columns)

sort by specified columns

Parameters:columns – str of single column or a list of columns. Sort order is specified as the + (ascending) or - (descending) prefix to the column name. Default sort order is ascending.
Returns:the MDataFrame
tail(limit=10)

return up to limit rows from the last inserted values

Parameters:limit – the number of rows to return. Defaults to 10
Returns:the MDataFrame
unique()

return the unique set of values for the series

Returns:MSeries
value

return the value of the series

this is a Series unless unique() was called; in that case only the distinct values are returned as an array, matching the behavior of a pandas Series

Returns:pandas.Series
class omegaml.mdataframe.MGrouper(mdataframe, collection, columns, sort=True)

a Grouper for MDataFrames

agg(specs)

shortcut for .aggregate

aggregate(specs)

aggregate by given specs

See the following link for a list of supported operations. https://docs.mongodb.com/manual/reference/operator/aggregation/group/

Parameters:specs – a dictionary of { column : function | list[functions] } pairs.
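
A sketch, assuming aggregation functions may be given by name (dataset and column names are illustrative):

# sum and mean of amount per region
grouped = om.datasets.getl('sales').groupby(['region'])
result = grouped.agg(dict(amount=['sum', 'mean']))
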
count()

return counts by group columns

class omegaml.mdataframe.MLocIndexer(mdataframe, positional=False)

implements the LocIndexer for MDataFrames

__getitem__(specs)

access by index

use as mdf.loc[specs] where specs is any of

  • a list or tuple of scalar index values, e.g. .loc[(1,2,3)]
  • a slice of values e.g. .loc[1:5]
  • a list of slices, e.g. .loc[1:5, 2:3]
Returns:the sliced part of the MDataFrame
class omegaml.mdataframe.MPosIndexer(mdataframe)

implements the position-based indexer for MDataFrames

class omegaml.mixins.mdf.ApplyContext(caller, columns=None, index=None)

Enable apply functions

.apply(fn) will call fn(ctx) where ctx is an ApplyContext. The context supports methods to apply functions in a Pandas-style apply manner. ApplyContext is extensible by adding an extension class to defaults.OMEGA_MDF_APPLY_MIXINS.

Note that unlike a Pandas DataFrame, ApplyContext does not itself contain any data. Rather it is part of an expression tree, i.e. the aggregation pipeline. Thus any expressions applied are translated into operations on the expression tree. The expression tree is evaluated on MDataFrame.value, at which point neither the ApplyContext nor the function that created it is active.

Examples:

mdf.apply(lambda v: v * 5)  => multiply every column in dataframe
mdf.apply(lambda v: v['foo'].dt.week)  => get week of date for column foo
mdf.apply(lambda v: dict(a=v['foo'].dt.week,
                         b=v['bar'] * 5))  => run multiple pipelines and get results

The callable passed to apply can be any function. It can either return None,
the context passed in or a list of pipeline stages.

# apply any of the below functions
mdf.apply(customfn)

# same as lambda v: v.dt.week
def customfn(ctx):
    return ctx.dt.week

# simple pipeline
def customfn(ctx):
    ctx.project(x={'$multiply': ['$x', 5]})
    ctx.project(y={'$divide': ['$x', 2]})

# complex pipeline
def customfn(ctx):
    return [
        { '$match': ... },
        { '$project': ... },
    ]
add(stage)

Add a processing stage to the pipeline

see https://docs.mongodb.com/manual/meta/aggregation-quick-reference/
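
A sketch of adding a raw aggregation stage from within an apply function (the field name is illustrative):

def customfn(ctx):
    # append a $match stage to the aggregation pipeline
    ctx.add({'$match': {'x': {'$gt': 5}}})

mdf.apply(customfn)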

groupby(by, expr=None, append=None, **kwargs)

add a groupby accumulation using $group

Parameters:
  • by – the groupby columns, if provided as a list will be transformed
  • expr
  • append
  • kwargs
Returns:

project(expr=None, append=False, keep=False, **kwargs)

add a projection using $project

Parameters:
  • expr – the column-operator mapping
  • append – if True add a $project stage, otherwise add to existing
  • kwargs – if expr is None, the column-operator mapping as kwargs
Returns:

ApplyContext

class omegaml.mixins.mdf.ApplyArithmetics

Math operators for ApplyContext

  • __mul__ (*)
  • __add__ (+)
  • __sub__ (-)
  • __div__ (/)
  • __floordiv__ (//)
  • __mod__ (%)
  • __pow__ (pow)
  • __ceil__ (ceil)
  • __floor__ (floor)
  • __trunc__ (trunc)
  • __abs__ (abs)
  • sqrt (math.sqrt)
sqrt(other)

square root

class omegaml.mixins.mdf.ApplyDateTime

Datetime operators for ApplyContext

  • day (dayOfMonth)
  • dayofweek (dayOfWeek)
  • dayofyear (dayOfYear)
  • hour
  • millisecond
  • minute
  • month
  • second
  • week (isoWeek)
  • year
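
A sketch using the datetime accessors via apply, following the .dt usage shown above (the column name is illustrative):

# day of week and year of a datetime column
mdf.apply(lambda v: v['created'].dt.dayofweek)
mdf.apply(lambda v: v['created'].dt.year)
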
class omegaml.mixins.mdf.ApplyString

String operators

  • concat(other, *args)
  • index(other, *args) (indexOfBytes)
  • split(other, *args)
  • strcasecmp(other, *args)
  • substr(other, *args)

class omegaml.mixins.mdf.ApplyAccumulators