

  • ExperimentBackend provides the storage layer (backend to om.models)

  • TrackingProvider provides the metrics logging API

  • TrackingProxy provides live metrics tracking in runtime tasks


class omegaml.backends.experiment.ExperimentBackend(model_store=None, data_store=None, tracking=None, **kwargs)

ExperimentBackend provides storage of tracker configurations


To log metrics and other data:

with om.runtime.experiment('myexp') as exp:
    om.runtime.model('mymodel').fit(X, Y)
    om.runtime.model('mymodel').score(X, Y) # automatically log score result
    exp.log_metric('mymetric', value)
    exp.log_param('myparam', value)
    exp.log_artifact(X, 'X')
    exp.log_artifact(Y, 'Y')
    exp.log_artifact(om.models.metadata('mymodel'), 'mymodel')

To log data and automatically profile system data:

with om.runtime.experiment('myexp', provider='profiling') as exp:
    om.runtime.model('mymodel').fit(X, Y)
    om.runtime.model('mymodel').score(X, Y) # automatically log score result
    exp.log_metric('mymetric', value)
    exp.log_param('myparam', value)
    exp.log_artifact(X, 'X')
    exp.log_artifact(Y, 'Y')
    exp.log_artifact(om.models.metadata('mymodel'), 'mymodel')

# profiling data contains metrics for cpu, memory and disk use
data = exp.data(event='profile')

To get back experiment data without running an experiment:

# recommended way
exp = om.runtime.experiment('myexp').use()
exp_df = exp.data()

# experiments exist in the models store
exp = om.models.get('experiments/myexp')
exp_df = exp.data()


KIND = 'experiment.tracker'
get(name, raw=False, data_store=None, **kwargs)

retrieve a model

  • name – the name of the object

  • version – the version of the object (not supported)

put(obj, name, **kwargs)

store a model

  • obj – the model object to be stored

  • name – the name of the object

  • attributes – attributes for meta data

classmethod supports(obj, name, **kwargs)

test if this backend supports this obj

Metrics Logging

class omegaml.backends.experiment.TrackingProvider(experiment, store=None, model_store=None)

TrackingProvider implements an abstract interface to experiment tracking

Concrete implementations for tools such as MLFlow, Sacred or others can be built on TrackingProvider. Combined with the runtime’s OmegaTrackingProxy, this provides a tracking interface that scales with your needs.

How it works:

  1. Experiments created using om.runtime.experiment() are stored as instances of a TrackingProvider concrete implementation

  2. Upon retrieval of an experiment, any call to its API is proxied to the actual implementation, e.g. MLFlow

  3. On calling a model method via the runtime, e.g. om.runtime.model().fit(), the TrackingProvider information is passed on to the runtime worker, and made available as the backend.tracking property. Thus within a model backend, you can always log to the tracker by using:

    with self.tracking as exp:
        exp.log_metric() # call any TrackingProvider method

  4. omega-ml provides the OmegaSimpleTracker, which implements a tracking interface similar to packages like MLFlow or Sacred. See ExperimentBackend for an example.
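
The forwarding described in steps 2 and 3 can be illustrated with a minimal, self-contained sketch. It does not use omega-ml itself: `TrackingProviderSketch`, `ListTracker` and `ProxySketch` are hypothetical names chosen for illustration, and the real classes may be structured differently.

```python
from abc import ABC, abstractmethod


class TrackingProviderSketch(ABC):
    """Hypothetical stand-in for an abstract tracking interface."""

    @abstractmethod
    def log_metric(self, key, value, step=None, **extra):
        ...


class ListTracker(TrackingProviderSketch):
    """Hypothetical concrete provider that keeps events in a list."""

    def __init__(self):
        self.events = []

    def log_metric(self, key, value, step=None, **extra):
        self.events.append({'event': 'metric', 'key': key,
                            'value': value, 'step': step, **extra})


class ProxySketch:
    """Forwards any attribute it does not define to the wrapped provider."""

    def __init__(self, provider):
        self._provider = provider

    def __getattr__(self, name):
        # only called for attributes not found on the proxy itself
        return getattr(self._provider, name)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # do not suppress exceptions


tracker = ListTracker()
with ProxySketch(tracker) as exp:
    exp.log_metric('accuracy', 0.78)  # forwarded to ListTracker.log_metric
```

Because the proxy only forwards attribute lookups, swapping `ListTracker` for another provider requires no change to the calling code.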

class omegaml.backends.experiment.OmegaSimpleTracker(experiment, store=None, model_store=None)

A tracking provider that logs to an omegaml dataset


with om.runtime.experiment(provider='default') as exp:
    exp.log_metric('accuracy', .78)

set the latest run as the active run


current run (int)

data(experiment=None, run=None, event=None, step=None, key=None, raw=False)

build a dataframe of all stored data

  • experiment (str) – the name of the experiment, defaults to its current value

  • run (int|list) – the run(s) to get data back, defaults to current run, use ‘all’ for all

  • event (str|list) – the event(s) to include

  • step (int|list) – the step(s) to include

  • key (str|list) – the key(s) to include

  • raw (bool) – if True returns the raw data instead of a DataFrame


  • data (DataFrame) if raw == False

  • data (list of dicts) if raw == True
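
To make the filter parameters concrete, here is a minimal sketch of how scalar-or-list filters could be applied to raw event records. `select_events` and the record layout are assumptions for illustration; the sketch omits the defaults described above (current run, 'all') and always returns a list of dicts, comparable to raw=True.

```python
def _as_list(value):
    """Wrap a scalar in a list so scalar and list filters share one code path."""
    return value if isinstance(value, (list, tuple)) else [value]


def select_events(records, run=None, event=None, step=None, key=None):
    """Return the records matching every filter that is not None."""
    filters = {'run': run, 'event': event, 'step': step, 'key': key}
    active = {name: _as_list(v) for name, v in filters.items() if v is not None}
    return [rec for rec in records
            if all(rec.get(name) in allowed for name, allowed in active.items())]


# hypothetical raw records, one dict per logged event
records = [
    {'run': 1, 'event': 'metric', 'step': None, 'key': 'accuracy', 'value': 0.78},
    {'run': 1, 'event': 'param', 'step': None, 'key': 'alpha', 'value': 0.1},
    {'run': 2, 'event': 'metric', 'step': None, 'key': 'accuracy', 'value': 0.81},
]

# metrics of runs 1 and 2; a scalar and a list filter are combined with AND
metrics = select_events(records, run=[1, 2], event='metric')
```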

log_artifact(obj, name, step=None, **extra)

log any object to the current run


# log an artifact
exp.log_artifact(mydict, 'somedata')

# retrieve back
mydict_ = exp.restore_artifact('somedata')

  • obj (obj) – any object to log

  • name (str) – the name of artifact

  • step (int) – the step, if any

  • **extra – any extra data to log


  • bool, str, int, float, list, dict are stored as format=type

  • Metadata is stored as format=metadata

  • objects supported by om.models are stored as format=model

  • objects supported by om.datasets are stored as format=dataset

  • all other objects are pickled and stored as format=pickle
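
The format rules above amount to an ordered dispatch with pickle as the fallback. The sketch below is an illustration only: `choose_format` is a hypothetical helper, the three predicates are placeholders for the checks against Metadata, om.models and om.datasets, and it assumes that "format=type" means the literal label 'type'.

```python
import pickle

BUILTIN_TYPES = (bool, str, int, float, list, dict)


def choose_format(obj, is_metadata=lambda o: False,
                  is_model=lambda o: False, is_dataset=lambda o: False):
    """Return (format, payload), checking the rules in the order listed."""
    if isinstance(obj, BUILTIN_TYPES):
        return 'type', obj  # assumption: 'type' is the literal format label
    if is_metadata(obj):
        return 'metadata', obj
    if is_model(obj):
        return 'model', obj
    if is_dataset(obj):
        return 'dataset', obj
    # fallback: anything else is serialized with pickle
    return 'pickle', pickle.dumps(obj)


fmt, payload = choose_format({'a': 1})
fmt2, payload2 = choose_format({1, 2, 3})  # a set is not in BUILTIN_TYPES
restored = pickle.loads(payload2)          # round-trip of the pickled fallback
```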

log_event(event, key, value, step=None, **extra)

log some event

  • event (str) – the event name (e.g. start, stop, metric, param)

  • key (str) – a key to relate the value (e.g. metric name)

  • value (str|float|int|bool|dict) – the actual event value

  • step (int) – the step

  • **extra – any other values to store with event

log_metric(key, value, step=None, **extra)

log a metric value

  • key (str) – the metric name

  • value (str|float|int|bool|dict) – the metric value

  • step (int) – the step

  • **extra – any other values to store with event


  • logged as event=metric

log_param(key, value, step=None, **extra)

log an experiment parameter

  • key (str) – the parameter name

  • value (str|float|int|bool|dict) – the parameter value

  • step (int) – the step

  • **extra – any other values to store with event


  • logged as event=param
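
Since metrics and parameters are logged as event=metric and event=param, both methods can be read as thin wrappers around log_event. The following sketch shows that relationship with a hypothetical in-memory class, `EventLogSketch`; it is not the omega-ml implementation, and the record layout is an assumption.

```python
class EventLogSketch:
    """Hypothetical in-memory tracker that stores one dict per event."""

    def __init__(self):
        self.events = []

    def log_event(self, event, key, value, step=None, **extra):
        self.events.append({'event': event, 'key': key, 'value': value,
                            'step': step, **extra})

    def log_metric(self, key, value, step=None, **extra):
        # a metric is an event with event='metric'
        self.log_event('metric', key, value, step=step, **extra)

    def log_param(self, key, value, step=None, **extra):
        # a parameter is an event with event='param'
        self.log_event('param', key, value, step=step, **extra)


exp = EventLogSketch()
exp.log_param('alpha', 0.1)
exp.log_metric('loss', 0.35, step=1, dataset='validation')  # extra value stored with the event
```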

log_system(key=None, value=None, step=None, **extra)

log system data

  • key (str) – the key to use, defaults to ‘system’

  • value (str|float|int|bool|dict) – the parameter value

  • step (int) – the step

  • **extra – any other values to store with event


  • logged as event=system

  • logs platform, python version and list of installed packages

restore_artifact(key=None, experiment=None, run=None, step=None, value=None)

restore a logged artifact

  • key (str) – the name of the artifact as provided in log_artifact

  • run (int) – the run for which to query, defaults to current run

  • step (int) – the step for which to query, defaults to all steps in run

  • value (dict) – this value is used instead of querying data, use to retrieve an artifact from contents of .data()


  • this will restore the artifact according to its type assigned by .log_artifact(). If the type cannot be determined, the actual data is returned


start a new run

This starts a new run and logs the start event

property status

status of a run


run (int) – the run number, defaults to the currently active run


status in ‘STARTED’, ‘STOPPED’


stop the current run

This stops the current run and records the stop event


reuse the latest run instead of starting a new one

syntactic sugar for self.active_run()



class omegaml.backends.experiment.OmegaProfilingTracker(*args, **kwargs)

A metric tracker that runs a system profiler while the experiment is active

Will record profile events that contain cpu, memory and disk profilings. See BackgroundProfiler.profile() for details of the profiling metrics collected.


To log metrics and system performance data:

with om.runtime.experiment('myexp', provider='profiling') as exp:
    ...  # fit, score and log as usual

data = exp.data(event='profile')


exp.profiler.interval = n.m # interval of n.m seconds to profile, defaults to 3 seconds
exp.profiler.metrics = ['cpu', 'memory', 'disk'] # all or subset of metrics to collect
exp.max_buffer = n # number of items in buffer before tracking


  • The profiling data is buffered to reduce the number of database writes. By default the data is written on every 6 profiling events (default: 6 * 10 = every 60 seconds).

  • The step reported in the tracker counts the profiling events since the start. It is not related to the step (epoch) reported by e.g. tensorflow.

  • For every step there is an event=profile, key=profile_dt entry which you can use to relate profiling events to a specific wall-clock time.

  • It is usually sufficient to report system metrics at intervals > 10 seconds, since machine learning algorithms tend to use CPU and memory over longer periods of time.
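
The buffering described in the notes above can be sketched as follows. `BufferedProfileLog` and its `write` callable are hypothetical; the sketch only illustrates the idea of collecting profiling events and writing them in batches of max_buffer, with a step counter that counts profiling events rather than training epochs.

```python
class BufferedProfileLog:
    """Hypothetical buffer: collect profile events, write them in batches."""

    def __init__(self, write, max_buffer=6):
        self.write = write          # callable that persists a list of events
        self.max_buffer = max_buffer
        self.buffer = []
        self.step = 0               # counts profiling events, not epochs

    def log_profile(self, data):
        self.step += 1
        self.buffer.append({'event': 'profile', 'step': self.step, 'value': data})
        if len(self.buffer) >= self.max_buffer:
            self.flush()

    def flush(self):
        # write a copy of the buffered events, then start a new batch
        if self.buffer:
            self.write(list(self.buffer))
            self.buffer.clear()


batches = []
log = BufferedProfileLog(write=batches.append, max_buffer=6)
for i in range(14):
    log.log_profile({'cpu': i})
log.flush()  # e.g. when the run stops, write whatever is left
```

With 14 profiling events and a buffer of 6, this results in three writes instead of 14.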


the callback for BackgroundProfiler


start a new run

This starts a new run and logs the start event


stop the current run

This stops the current run and records the stop event

class omegaml.backends.experiment.NoTrackTracker(experiment, store=None, model_store=None)

A default tracker that does not record anything

for tensorflow

class omegaml.backends.experiment.TensorflowCallback(*args, **kwargs)

A callback for Tensorflow Keras models

Implements the callback protocol according to Tensorflow Keras semantics and links it to an omegaml.backends.experiment.TrackingProvider

Runtime Integration

class omegaml.runtimes.trackingproxy.OmegaTrackingProxy(experiment=None, provider=None, runtime=None, implied_run=True)

OmegaTrackingProxy provides the runtime context for experiment tracking


Using implied start()/stop() semantics, creating experiment runs:

with om.runtime.experiment('myexp') as exp:
    exp.log_metric('accuracy', score)

Using explicit start()/stop() semantics:

exp = om.runtime.experiment('myexp')
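
The snippet above shows only how the experiment is obtained. As a rough illustration of how the explicit and the implied styles relate, the sketch below uses a hypothetical stand-in, `RunTrackerSketch`, whose context manager simply calls start() and stop(). It is an assumption about the general pattern, not the OmegaTrackingProxy code.

```python
class RunTrackerSketch:
    """Hypothetical tracker showing how the two styles relate."""

    def __init__(self):
        self.events = []
        self.run = 0

    def start(self):
        self.run += 1  # every start() opens a new run
        self.events.append((self.run, 'start'))

    def stop(self):
        self.events.append((self.run, 'stop'))

    def log_metric(self, key, value):
        self.events.append((self.run, 'metric', key, value))

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc):
        self.stop()
        return False  # do not suppress exceptions


exp = RunTrackerSketch()

# explicit style: the caller is responsible for start() and stop()
exp.start()
exp.log_metric('accuracy', 0.78)
exp.stop()

# implied style: the context manager calls start() and stop()
with exp:
    exp.log_metric('accuracy', 0.81)
```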

See also

  • OmegaSimpleTracker

  • ExperimentBackend