core models¶
omegaml.store¶
- class omegaml.store.base.OmegaStore(mongo_url=None, bucket=None, prefix=None, kind=None, defaults=None, dbalias=None)[source]¶
The storage backend for models and data
Changed in version NEXT: refactored all methods handling Python and Pandas datatypes to omegaml.backends.coreobjects.CoreObjectsBackend
- collection(name=None, bucket=None, prefix=None)[source]¶
Returns a mongo db collection as a datastore
If an object with the given name already exists, returns the .collection of that object. Otherwise returns the collection according to the naming convention {bucket}.{prefix}.{name}.datastore
- Parameters:
name – the collection to use. If None, defaults to the collection name given on instantiation. The actual collection name used is always prefix + name + ‘.data’
- drop(name, force=False, version=-1, report=False, **kwargs)[source]¶
Drop the object
- Parameters:
name – The name of the object. If the name is a pattern, it is expanded using .list() and .drop() is called on every object found.
force – If True, ignores the DoesNotExist exception. Defaults to False, meaning a DoesNotExist exception is raised if the name does not exist
report – if True returns a dict name=>status, where status is True if deleted, False if not deleted
- Returns:
True if object was deleted, False if not. If force is True and the object does not exist it will still return True
- Raises:
DoesNotExist if the object does not exist and `force=False`
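The force and report semantics described above can be sketched with an in-memory stand-in. ToyStore, its dict storage and the fnmatch-based pattern expansion are assumptions made for illustration only; this is not omegaml's implementation.

```python
from fnmatch import fnmatchcase


class DoesNotExist(Exception):
    """Stand-in for the exception raised when an object is missing."""


class ToyStore:
    """Hypothetical in-memory store that mimics the documented drop() contract."""

    def __init__(self, objects):
        self.objects = dict(objects)

    def list(self, pattern=None):
        # None lists everything, otherwise match names as a unix pattern
        return [n for n in sorted(self.objects)
                if pattern is None or fnmatchcase(n, pattern)]

    def _drop_one(self, name, force):
        if name not in self.objects:
            if not force:
                raise DoesNotExist(name)
            return True  # force=True still reports success for a missing object
        del self.objects[name]
        return True

    def drop(self, name, force=False, report=False):
        # a name containing wildcard characters is expanded via .list()
        is_pattern = any(ch in name for ch in "*?[")
        names = self.list(name) if is_pattern else [name]
        status = {n: self._drop_one(n, force) for n in names}
        # report=True returns the name => status dict, else a single bool
        return status if report else all(status.values())


store = ToyStore({"models/a": 1, "models/b": 2, "data/x": 3})
dropped = store.drop("models/*", report=True)
forced = store.drop("missing", force=True)
```

In this toy version a status of False never occurs, because deleting from a dict cannot fail; a real store may report False for an object it could not delete.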
- exists(name, hidden=False)[source]¶
check if object exists
- Parameters:
name (str) – name of object
hidden (bool) – if True, include hidden files, defaults to False, unless name starts with ‘.’
- Returns:
bool, True if object exists
Changed in version 0.16.4: hidden defaults to True if name starts with ‘.’
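A minimal sketch of the hidden default described above. The helper names and the assumption that a hidden object is simply one whose name starts with ‘.’ are illustrative, not taken from omegaml's source.

```python
def effective_hidden(name, hidden=False):
    """Return the hidden flag actually used for a lookup (illustrative helper)."""
    # asking for a dot-name by name implies hidden=True,
    # even if the caller left hidden at its default
    return bool(hidden) or name.startswith(".")


def toy_exists(names, name, hidden=False):
    """Check membership, skipping hidden names unless they are included."""
    include_hidden = effective_hidden(name, hidden)
    candidates = [n for n in names if include_hidden or not n.startswith(".")]
    return name in candidates


names = [".system/config", "mydata"]
found_hidden = toy_exists(names, ".system/config")
found_plain = toy_exists(names, "mydata")
found_missing = toy_exists(names, "other")
```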
- property fs¶
Retrieve a gridfs instance using url and collection provided
- Returns:
a gridfs instance
- get(name, version=-1, force_python=False, kind=None, model_store=None, data_store=None, **kwargs)[source]¶
Retrieve an object
- Parameters:
name – The name of the object
version – Version of the stored object (not supported)
force_python – Return as a python object
kwargs – kwargs depending on object kind
- Returns:
an object, estimator, pipelines, data array or pandas dataframe previously stored with put()
- get_backend(name, model_store=None, data_store=None, **kwargs)[source]¶
return the backend by a given object name
- Parameters:
name – The object name
model_store – the OmegaStore instance used to store models
data_store – the OmegaStore instance used to store data
kwargs – the kwargs passed to the backend initialization
- Returns:
the backend
- get_backend_bykind(kind, model_store=None, data_store=None, **kwargs)[source]¶
return the backend by a given object kind
- Parameters:
kind – The object kind
model_store – the OmegaStore instance used to store models
data_store – the OmegaStore instance used to store data
kwargs – the kwargs passed to the backend initialization
- Returns:
the backend
- get_backend_byobj(obj, name=None, kind=None, attributes=None, model_store=None, data_store=None, **kwargs)[source]¶
return the matching backend for the given obj
- Returns:
the first backend that supports the given parameters or None
Changed in version NEXT: backends are tested in order of MDREGISTRY.KINDS, see .register_backend()
- getl(*args, **kwargs)[source]¶
return a lazy MDataFrame for a given object
Same as .get, but returns a MDataFrame
- help(name_or_obj=None, kind=None, raw=False, display=None, renderer=None)[source]¶
get help for an object by looking up its backend and calling help() on it
Retrieves the object’s metadata and looks up its corresponding backend. If the metadata.attributes[‘docs’] is a string it will display this as the help() contents. If the string starts with ‘http://’ or ‘https://’ it will open the web page.
- Parameters:
name_or_obj (str|obj) – the name or actual object to get help for
kind (str) – optional, if specified forces retrieval of backend for the given kind
raw (bool) – optional, if True forces help to be the backend type of the object. If False returns the attributes[docs] on the object’s metadata, if available. Defaults to False
display (fn) – optional, callable for interactive display, defaults to help if sys.flags.interactive is True, else uses pydoc.render_doc with plaintext
renderer (fn) – optional, the renderer= argument for pydoc.render_doc to use if sys.flags.interactive is False and display is not provided
- Returns:
help(obj) if python is in interactive mode
text (str) if python is not in interactive mode
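The display defaults and the handling of attributes[‘docs’] can be sketched as follows. The function names pick_display and docs_action are hypothetical, and the fallback to backend help when no docs string is set is an assumption; only pydoc.render_doc, pydoc.plaintext and sys.flags.interactive are standard library names.

```python
import pydoc
import sys


def pick_display(display=None, renderer=None):
    """Choose how help is shown, following the documented defaults."""
    if display is not None:
        return display
    if sys.flags.interactive:
        return help  # the interactive builtin
    # non-interactive: render to a string, plaintext unless a renderer is given
    return lambda obj: pydoc.render_doc(obj, renderer=renderer or pydoc.plaintext)


def docs_action(docs):
    """Classify attributes['docs'] into the action help() would take."""
    if isinstance(docs, str):
        if docs.startswith(("http://", "https://")):
            return "open-url"   # a web page would be opened
        return "show-text"      # the string itself is the help content
    return "backend-help"       # assumed fallback: help on the backend


url_action = docs_action("https://example.com/docs")
text_action = docs_action("Predicts churn from usage data")
none_action = docs_action(None)
```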
- list(pattern=None, regexp=None, kind=None, raw=False, hidden=None, include_temp=False, bucket=None, prefix=None, filter=None)[source]¶
List all files in store
Specify pattern as a unix pattern (e.g. models/*), or specify regexp.
- Parameters:
pattern – the unix file pattern or None for all
regexp – the regexp. takes precedence over pattern
raw – if True return the meta data objects
filter – specify additional filter criteria, optional
- Returns:
List of files in store
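The precedence of regexp over pattern can be illustrated on a plain list of names. This is a sketch only: filter_names is a hypothetical helper, and matching with fnmatch and re.search is an assumption about semantics, not how the store queries its database.

```python
import re
from fnmatch import fnmatchcase


def filter_names(names, pattern=None, regexp=None):
    """Select names by regexp if given, else by unix pattern, else all."""
    if regexp is not None:
        # regexp takes precedence over pattern
        rx = re.compile(regexp)
        return [n for n in names if rx.search(n)]
    if pattern is not None:
        return [n for n in names if fnmatchcase(n, pattern)]
    return list(names)


names = ["models/churn", "models/sales", "data/sales"]
by_pattern = filter_names(names, pattern="models/*")
# pattern is ignored here because regexp is also given
by_regexp = filter_names(names, pattern="models/*", regexp=r"sales$")
```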
- make_metadata(name, kind, bucket=None, prefix=None, **kwargs)[source]¶
create or update a metadata object
this retrieves a Metadata object if it exists given the kwargs. Only the name, prefix and bucket arguments are considered
for existing Metadata objects, the attributes kw is treated as follows:
attributes=None, the existing attributes are left as is
attributes={}, the attributes value on an existing metadata object is reset to the empty dict
attributes={ some : value }, the existing attributes are updated
For new metadata objects, attributes defaults to {} if not specified, else is set as provided.
- Parameters:
name – the object name
bucket – the bucket, optional, defaults to self.bucket
prefix – the prefix, optional, defaults to self.prefix
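The three cases for the attributes keyword can be written out as a small function. resolve_attributes is a hypothetical helper, and reading "updated" as a dict.update() merge is an assumption.

```python
def resolve_attributes(existing, attributes=None):
    """Apply the documented attributes rules; existing is None for a new object."""
    if existing is None:
        # new metadata: default to {} unless attributes were provided
        return dict(attributes) if attributes is not None else {}
    if attributes is None:
        return dict(existing)      # existing attributes are left as is
    if attributes == {}:
        return {}                  # existing attributes are reset
    merged = dict(existing)
    merged.update(attributes)      # existing keys overwritten, new keys added
    return merged


current = {"owner": "ana", "stage": "dev"}
kept = resolve_attributes(current, None)
reset = resolve_attributes(current, {})
updated = resolve_attributes(current, {"stage": "prod"})
fresh = resolve_attributes(None)
```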
- metadata(name=None, bucket=None, prefix=None, version=-1, **kwargs)[source]¶
Returns a metadata document for the given entry name
- property mongodb¶
Returns a mongo database object
- object_store_key(name, ext, hashed=None)[source]¶
Returns the store key
Unless you write a mixin or a backend you should not use this method
- Parameters:
name – The name of object
ext – The extension of the filename
hashed – hash the key to support arbitrary name length, defaults to defaults.OMEGA_STORE_HASHEDNAMES, True by default since 0.13.7
- Returns:
A filename with relative bucket, prefix and name
- put(obj, name, attributes=None, kind=None, replace=False, model_store=None, data_store=None, **kwargs)[source]¶
Stores an object. Use this to store estimators, pipelines, numpy arrays or pandas dataframes
- register_backend(kind, backend, index=-1)[source]¶
register a backend class
- Parameters:
kind – (str) the backend kind
backend – (class) the backend class
index – (int) the insert position, defaults to -1, which means to append
Changed in version NEXT: added index to have more control over backend evaluation by .get_backend_byobj()
Changed in version NEXT: backends can specify cls.KIND_EXT to register additional kinds
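How index affects which backend .get_backend_byobj() picks can be sketched with a toy registry. ToyRegistry, the two backend classes and the single-argument supports() method are assumptions for illustration; the real registry and backend interface may differ.

```python
class ToyRegistry:
    """Hypothetical ordered registry of backends, keyed by kind."""

    def __init__(self):
        self.kinds = []      # evaluation order used by get_backend_byobj
        self.backends = {}

    def register_backend(self, kind, backend, index=-1):
        self.backends[kind] = backend
        if kind in self.kinds:
            self.kinds.remove(kind)
        if index == -1:
            # -1 means append; list.insert(-1, x) would insert before the last item
            self.kinds.append(kind)
        else:
            self.kinds.insert(index, kind)

    def get_backend_byobj(self, obj):
        # backends are tested in registration order, first match wins
        for kind in self.kinds:
            backend = self.backends[kind]
            if backend.supports(obj):
                return backend
        return None


class AnyBackend:
    @classmethod
    def supports(cls, obj):
        return True


class ListBackend:
    @classmethod
    def supports(cls, obj):
        return isinstance(obj, list)


registry = ToyRegistry()
registry.register_backend("any.obj", AnyBackend)
# index=0 puts the more specific backend ahead of the catch-all
registry.register_backend("list.obj", ListBackend, index=0)
chosen_for_list = registry.get_backend_byobj([1, 2])
chosen_for_text = registry.get_backend_byobj("text")
```

Had ListBackend been appended with the default index, AnyBackend would have matched the list first, which is the kind of ordering problem the index argument addresses.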
- register_mixin(mixincls)[source]¶
register a mixin class
- Parameters:
mixincls – (class) the mixin class
- property tmppath¶
return an instance-specific temporary path
- class omegaml.documents.Metadata(**kwargs)[source]¶
Metadata stores information about objects in OmegaStore
- attributes¶
customer-defined other meta attributes
- bucket¶
bucket
- collection¶
for PANDAS_DFROWS this is the collection
- created¶
created datetime
- gridfile¶
for PANDAS_HDF and SKLEARN_JOBLIB this is the gridfile
- kind¶
kind of data
- kind_meta¶
omegaml technical attributes, e.g. column indices
- modified¶
modified datetime
- name¶
this is the name of the data
- objid¶
for PYTHON_DATA this is the actual document
- prefix¶
prefix
- s3file¶
s3file attributes
- uri¶
location URI