core models

omegaml.store

class omegaml.store.base.OmegaStore(mongo_url=None, bucket=None, prefix=None, kind=None, defaults=None, dbalias=None)[source]

The storage backend for models and data

Changed in version NEXT: refactored all methods handling Python and Pandas datatypes to omegaml.backends.coreobjects.CoreObjectsBackend

collection(name=None, bucket=None, prefix=None)[source]

Returns a mongo db collection as a datastore

If there is an existing object of name, will return the .collection of the object. Otherwise returns the collection according to naming convention {bucket}.{prefix}.{name}.datastore

Parameters:

name – the collection to use. If None, defaults to the collection name given on instantiation. The actual collection name used is always prefix + name + ‘.data’

drop(name, force=False, version=-1, report=False, **kwargs)[source]

Drop the object

Parameters:
  • name – The name of the object. If the name is a pattern it will be expanded using .list(), and call .drop() on every obj found.

  • force – If True ignores DoesNotExist exception, defaults to False meaning this raises a DoesNotExist exception if the name does not exist

  • report – if True returns a dict name=>status, where status is True if deleted, False if not deleted

Returns:

True if object was deleted, False if not. If force is True and the object does not exist it will still return True

Raises:

DoesNotExist if the object does not exist and `force=False`
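The interplay of force, report and pattern expansion described above can be sketched with a plain dict standing in for the store. This is a minimal illustration under stated assumptions, not omegaml's implementation: the module-level `drop` function, the local `DoesNotExist` class and the use of `fnmatch` for pattern expansion are all hypothetical stand-ins.

```python
from fnmatch import fnmatchcase


class DoesNotExist(Exception):
    """Stand-in for the exception named in the docs above (assumption)."""


def drop(store, name, force=False, report=False):
    # `store` is a plain dict standing in for the object store (assumption).
    if any(ch in name for ch in "*?["):
        # A pattern is expanded against the stored names and each match
        # is dropped individually, mirroring the .list()/.drop() description.
        matches = [n for n in sorted(store) if fnmatchcase(n, name)]
        status = {n: drop(store, n, force=force) for n in matches}
        return status if report else all(status.values())
    if name not in store:
        if not force:
            raise DoesNotExist(name)
        # With force=True a missing object still counts as dropped.
        return {name: True} if report else True
    del store[name]
    return {name: True} if report else True


objects = {"models/a": 1, "models/b": 2, "data/x": 3}
dropped = drop(objects, "models/*", report=True)  # dict of name => status
ignored = drop(objects, "no/such/object", force=True)  # no exception raised
```

Without force=True the last call would raise DoesNotExist, which is the default behavior documented above.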

exists(name, hidden=False)[source]

Check if the object exists

Parameters:
  • name (str) – name of object

  • hidden (bool) – if True, include hidden files, defaults to False, unless name starts with ‘.’

Returns:

bool, True if object exists

Changed in version 0.16.4: hidden defaults to True if name starts with ‘.’

property fs

Retrieve a gridfs instance using url and collection provided

Returns:

a gridfs instance

get(name, version=-1, force_python=False, kind=None, model_store=None, data_store=None, **kwargs)[source]

Retrieve an object

Parameters:
  • name – The name of the object

  • version – Version of the stored object (not supported)

  • force_python – Return as a python object

  • kwargs – kwargs depending on object kind

Returns:

an object, estimator, pipelines, data array or pandas dataframe previously stored with put()

get_backend(name, model_store=None, data_store=None, **kwargs)[source]

return the backend by a given object name

Parameters:
  • name – The name of the object

  • model_store – the OmegaStore instance used to store models

  • data_store – the OmegaStore instance used to store data

  • kwargs – the kwargs passed to the backend initialization

Returns:

the backend

get_backend_bykind(kind, model_store=None, data_store=None, **kwargs)[source]

return the backend by a given object kind

Parameters:
  • kind – The object kind

  • model_store – the OmegaStore instance used to store models

  • data_store – the OmegaStore instance used to store data

  • kwargs – the kwargs passed to the backend initialization

Returns:

the backend

get_backend_byobj(obj, name=None, kind=None, attributes=None, model_store=None, data_store=None, **kwargs)[source]

return the matching backend for the given obj

Returns:

the first backend that supports the given parameters or None

Changed in version NEXT: backends are tested in order of MDREGISTRY.KINDS, see .register_backend()

getl(*args, **kwargs)[source]

return a lazy MDataFrame for a given object

Same as .get(), but returns an MDataFrame

help(name_or_obj=None, kind=None, raw=False, display=None, renderer=None)[source]

get help for an object by looking up its backend and calling help() on it

Retrieves the object’s metadata and looks up its corresponding backend. If the metadata.attributes[‘docs’] is a string it will display this as the help() contents. If the string starts with ‘http://’ or ‘https://’ it will open the web page.

Parameters:
  • name_or_obj (str|obj) – the name or actual object to get help for

  • kind (str) – optional, if specified forces retrieval of backend for the given kind

  • raw (bool) – optional, if True forces help to be the backend type of the object. If False returns the attributes[docs] on the object’s metadata, if available. Defaults to False

  • display (fn) – optional, callable for interactive display, defaults to help if sys.flags.interactive is True, else uses pydoc.render_doc with plaintext

  • renderer (fn) – optional, the renderer= argument for pydoc.render_doc to use if sys.flags.interactive is False and display is not provided

Returns:

  • help(obj) if python is in interactive mode

  • text(str) if python is not in interactive mode
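The lookup and display rules above can be sketched in two small functions. This is an illustration only: `resolve_help` and `render` are hypothetical names, and the returned action labels are invented for the example. The only library calls used are the builtin `help` and `pydoc.render_doc` with `pydoc.plaintext`.

```python
import pydoc
import sys


def resolve_help(docs, backend, raw=False):
    """Decide what to show; returns an (action, payload) tuple."""
    if not raw and isinstance(docs, str):
        # A docs string that is a URL is opened as a web page,
        # any other string is shown as the help text itself.
        if docs.startswith(("http://", "https://")):
            return "open_url", docs
        return "show_text", docs
    # raw=True, or no docs string: fall back to the backend's own help.
    return "show_backend", backend


def render(obj, display=None, renderer=None):
    """Apply the display defaults described for help()."""
    if display is not None:
        return display(obj)
    if sys.flags.interactive:
        return help(obj)
    # Non-interactive: return the documentation as text.
    return pydoc.render_doc(obj, renderer=renderer or pydoc.plaintext)


action, payload = resolve_help("https://example.com/docs", backend=dict)
```

Here `docs` plays the role of metadata.attributes[‘docs’] and `backend` is whatever backend class was looked up for the object.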

list(pattern=None, regexp=None, kind=None, raw=False, hidden=None, include_temp=False, bucket=None, prefix=None, filter=None)[source]

List all files in store

Specify pattern as a unix pattern (e.g. models/*), or specify regexp

Parameters:
  • pattern – the unix file pattern or None for all

  • regexp – the regexp. takes precedence over pattern

  • raw – if True return the meta data objects

  • filter – specify additional filter criteria, optional

Returns:

List of files in store
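The precedence between pattern and regexp can be sketched over a plain list of names. This is a hedged illustration, not the library's code: `list_names` is a hypothetical helper, and matching the regexp with `re.match` (anchored at the start of the name) is an assumption.

```python
import re
from fnmatch import fnmatchcase


def list_names(names, pattern=None, regexp=None):
    if regexp is not None:
        # regexp takes precedence over pattern when both are given
        def matches(name):
            return re.match(regexp, name) is not None
    elif pattern is not None:
        # unix-style pattern, e.g. models/*
        def matches(name):
            return fnmatchcase(name, pattern)
    else:
        # no filter: list everything
        def matches(name):
            return True
    return [name for name in names if matches(name)]


stored = ["models/a", "models/b", "data/x"]
only_models = list_names(stored, pattern="models/*")
only_data = list_names(stored, pattern="models/*", regexp=r"data/.*")
```

In the second call the regexp wins, so the models/* pattern is ignored.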

make_metadata(name, kind, bucket=None, prefix=None, **kwargs)[source]

create or update a metadata object

this retrieves a Metadata object if it exists given the kwargs. Only the name, prefix and bucket arguments are considered

for existing Metadata objects, the attributes kw is treated as follows:

  • attributes=None, the existing attributes are left as is

  • attributes={}, the attributes value on an existing metadata object is reset to the empty dict

  • attributes={ some : value }, the existing attributes are updated

For new metadata objects, attributes defaults to {} if not specified, else is set as provided.
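The three cases for the attributes keyword can be written down as a small merge function. This is a sketch under assumptions: `merge_attributes` and `new_attributes` are hypothetical helpers, and "updated" is taken to mean a shallow `dict.update`, which the text above does not specify.

```python
def merge_attributes(existing, attributes=None):
    """Attributes for an existing metadata object."""
    if attributes is None:
        # attributes=None: leave the existing attributes as is
        return existing
    if attributes == {}:
        # attributes={}: reset to the empty dict
        return {}
    # attributes={...}: update the existing attributes (shallow, assumed)
    merged = dict(existing)
    merged.update(attributes)
    return merged


def new_attributes(attributes=None):
    """Attributes for a new metadata object: defaults to {}."""
    return {} if attributes is None else attributes


current = {"owner": "ops", "stage": "dev"}
kept = merge_attributes(current)
reset = merge_attributes(current, {})
updated = merge_attributes(current, {"stage": "prod"})
```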

Parameters:
  • name – the object name

  • bucket – the bucket, optional, defaults to self.bucket

  • prefix – the prefix, optional, defaults to self.prefix

metadata(name=None, bucket=None, prefix=None, version=-1, **kwargs)[source]

Returns a metadata document for the given entry name

property mongodb

Returns a mongo database object

object_store_key(name, ext, hashed=None)[source]

Returns the store key

Unless you write a mixin or a backend you should not use this method

Parameters:
  • name – The name of object

  • ext – The extension of the filename

  • hashed – hash the key to support arbitrary name length, defaults to defaults.OMEGA_STORE_HASHEDNAMES, True by default since 0.13.7

Returns:

A filename with relative bucket, prefix and name
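Why hashing helps with arbitrary name lengths can be shown with a short sketch. Everything specific here is an assumption made for illustration: the standalone function, the `{bucket}.{prefix}.{name}.{ext}` layout and the choice of SHA-1 are not taken from the documentation above.

```python
import hashlib


def object_store_key(bucket, prefix, name, ext, hashed=True):
    # Assumed layout of the unhashed key: bucket, prefix, name, extension.
    key = f"{bucket}.{prefix}.{name}.{ext}"
    if hashed:
        # A digest has a fixed length regardless of how long the name is.
        key = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return key


short_key = object_store_key("omegaml", "models", "mymodel", "omm")
long_key = object_store_key("omegaml", "models", "x" * 5000, "omm")
plain_key = object_store_key("omegaml", "models", "mymodel", "omm", hashed=False)
```

Both hashed keys have the same length, while the unhashed key grows with the name.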

put(obj, name, attributes=None, kind=None, replace=False, model_store=None, data_store=None, **kwargs)[source]

Stores an object, such as an estimator, pipeline, numpy array or pandas dataframe

register_backend(kind, backend, index=-1)[source]

register a backend class

Parameters:
  • kind – (str) the backend kind

  • backend – (class) the backend class

  • index – (int) the insert position, defaults to -1, which means to append

Changed in version NEXT: added index to have more control over backend evaluation by .get_backend_byobj()

Changed in version NEXT: backends can specify cls.KIND_EXT to register additional kinds
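How the index argument influences which backend .get_backend_byobj() picks can be sketched with an ordered registry. This is an illustration under assumptions, not the library's code: `BackendRegistry`, the `supports` classmethod and the treatment of `KIND_EXT` as a list of extra kind names are hypothetical.

```python
class BackendRegistry:
    def __init__(self):
        self.kinds = []     # evaluation order used by get_backend_byobj
        self.backends = {}  # kind => backend class

    def register_backend(self, kind, backend, index=-1):
        self.backends[kind] = backend
        if kind in self.kinds:
            self.kinds.remove(kind)
        if index == -1:
            # -1 means append, i.e. evaluate this backend last
            self.kinds.append(kind)
        else:
            self.kinds.insert(index, kind)
        # Additional kinds announced by the backend class (assumed shape)
        for extra in getattr(backend, "KIND_EXT", ()):
            self.backends[extra] = backend
            if extra not in self.kinds:
                self.kinds.append(extra)

    def get_backend_byobj(self, obj):
        # Return the first backend, in registration order, that supports obj
        for kind in self.kinds:
            if self.backends[kind].supports(obj):
                return self.backends[kind]
        return None


class AnyObjectBackend:
    @classmethod
    def supports(cls, obj):
        return True


class ListBackend:
    @classmethod
    def supports(cls, obj):
        return isinstance(obj, list)


registry = BackendRegistry()
registry.register_backend("python.any", AnyObjectBackend)
# index=0 puts the more specific backend ahead of the catch-all one
registry.register_backend("python.list", ListBackend, index=0)
chosen = registry.get_backend_byobj([1, 2, 3])
```

Had ListBackend been appended with the default index, the catch-all backend would have matched the list first.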

register_backends()[source]

register backends in defaults.OMEGA_STORE_BACKENDS

register_mixin(mixincls)[source]

register a mixin class

Parameters:

mixincls – (class) the mixin class

property tmppath

return an instance-specific temporary path

class omegaml.documents.Metadata(**kwargs)[source]

Metadata stores information about objects in OmegaStore

attributes

other custom (user-defined) meta attributes

bucket

bucket

collection

for PANDAS_DFROWS this is the collection

created

created datetime

gridfile

for PANDAS_HDF and SKLEARN_JOBLIB this is the gridfile

kind

kind of data

kind_meta

omegaml technical attributes, e.g. column indices

modified

modified datetime

name

this is the name of the data

objid

for PYTHON_DATA this is the actual document

prefix

prefix

s3file

s3file attributes

uri

location URI