omegaml.runtime.experiment

Concepts

  • ExperimentBackend provides the storage layer (backend to om.models)

  • TrackingProvider provides the metrics logging API

  • TrackingProxy provides live metrics tracking in runtime tasks
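
The following sketch shows how the three pieces relate in practice, assuming a configured omegaml session om (see the sections below for details):

import omegaml as om

# TrackingProxy: the runtime context returned by om.runtime.experiment()
with om.runtime.experiment('myexp') as exp:
    # TrackingProvider: the metrics logging API, available as exp
    exp.log_metric('accuracy', .9)

# ExperimentBackend: experiments are stored in om.models under experiments/
tracker = om.models.get('experiments/myexp')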

Backends

class omegaml.backends.experiment.ExperimentBackend(model_store=None, data_store=None, tracking=None, **kwargs)

ExperimentBackend provides storage of tracker configurations

Usage:

To log metrics and other data:

with om.runtime.experiment('myexp') as exp:
    om.runtime.model('mymodel').fit(X, Y)
    om.runtime.model('mymodel').score(X, Y) # automatically log score result
    exp.log_metric('mymetric', value)
    exp.log_param('myparam', value)
    exp.log_artifact(X, 'X')
    exp.log_artifact(Y, 'Y')
    exp.log_artifact(om.models.metadata('mymodel'), 'mymodel')

To log metrics and automatically profile system resource usage:

with om.runtime.experiment('myexp', provider='profiling') as exp:
    om.runtime.model('mymodel').fit(X, Y)
    om.runtime.model('mymodel').score(X, Y) # automatically log score result
    exp.log_metric('mymetric', value)
    exp.log_param('myparam', value)
    exp.log_artifact(X, 'X')
    exp.log_artifact(Y, 'Y')
    exp.log_artifact(om.models.metadata('mymodel'), 'mymodel')

# profiling data contains metrics for cpu, memory and disk use
data = exp.data(event='profile')

To retrieve experiment data without running an experiment:

# recommended way
exp = om.runtime.experiment('myexp').use()
exp_df = exp.data()

# experiments exist in the models store
exp = om.models.get('experiments/myexp')
exp_df = exp.data()

See also

KIND = 'experiment.tracker'

get(name, raw=False, data_store=None, **kwargs)

retrieve an experiment tracker

Parameters:
  • name – the name of the object

  • version – the version of the object (not supported)

put(obj, name, **kwargs)

store an experiment tracker

Parameters:
  • obj – the tracker object to be stored

  • name – the name of the object

  • attributes – attributes for meta data

classmethod supports(obj, name, **kwargs)

test if this backend supports this obj

Metrics Logging

class omegaml.backends.experiment.TrackingProvider(experiment, store=None, model_store=None)

TrackingProvider implements an abstract interface to experiment tracking

Concrete integrations for tools like MLFlow, Sacred or Neptune.ai can be built on top of TrackingProvider. In combination with the runtime's OmegaTrackingProxy this provides a powerful tracking interface that scales with your needs.

How it works:

  1. Experiments created using om.runtime.experiment() are stored as instances of a TrackingProvider concrete implementation

  2. Upon retrieval of an experiment, any call to its API is proxied to the actual implementation, e.g. MLFlow

  3. On calling a model method via the runtime, e.g. om.runtime.model().fit(), the TrackingProvider information is passed on to the runtime worker, and made available as the backend.tracking property. Thus within a model backend, you can always log to the tracker by using:

    with self.tracking as exp:
        exp.log_metric() # call any TrackingProvider method
    
  4. omega-ml provides the OmegaSimpleTracker, which implements a tracking interface similar to packages like MLFlow, Sacred. See ExperimentBackend for an example.
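
As an illustration, a third-party integration could be sketched by subclassing TrackingProvider. The method names follow the OmegaSimpleTracker API documented below; which methods must be overridden is an assumption here:

from omegaml.backends.experiment import TrackingProvider

class MyServiceTracker(TrackingProvider):
    # hypothetical provider that forwards tracking calls to an
    # external service; start/stop/log_* follow the OmegaSimpleTracker
    # API documented below
    def start(self):
        ...  # begin a new run in the external service

    def stop(self):
        ...  # close the current run

    def log_metric(self, key, value, step=None, **extra):
        ...  # forward the metric to the external service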

class omegaml.backends.experiment.OmegaSimpleTracker(experiment, store=None, model_store=None)

A tracking provider that logs to an omegaml dataset

Usage:

with om.runtime.experiment(provider='default') as exp:
    ...
    exp.log_metric('accuracy', .78)

active_run()

set the latest run as the active run

Returns:

current run (int)

data(experiment=None, run=None, event=None, step=None, key=None, raw=False)

build a dataframe of all stored data

Parameters:
  • experiment (str) – the name of the experiment, defaults to its current value

  • run (int|list) – the run(s) for which to return data, defaults to the current run; use 'all' for all runs

  • event (str|list) – the event(s) to include

  • step (int|list) – the step(s) to include

  • key (str|list) – the key(s) to include

  • raw (bool) – if True returns the raw data instead of a DataFrame

Returns:

  • data (DataFrame) if raw == False

  • data (list of dicts) if raw == True
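
For example, to filter the stored data by event and key (a sketch based on the parameters above):

# all 'accuracy' metrics across all runs, as a DataFrame
df = exp.data(event='metric', key='accuracy', run='all')

# the same records as a list of dicts
records = exp.data(event='metric', key='accuracy', run='all', raw=True)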

log_artifact(obj, name, step=None, **extra)

log any object to the current run

Usage:

# log an artifact
exp.log_artifact(mydict, 'somedata')

# retrieve it back
mydict_ = exp.restore_artifact('somedata')

Parameters:
  • obj (obj) – any object to log

  • name (str) – the name of artifact

  • step (int) – the step, if any

  • **extra – any extra data to log

Notes

  • bool, str, int, float, list, dict are stored as format=type

  • Metadata is stored as format=metadata

  • objects supported by om.models are stored as format=model

  • objects supported by om.datasets are stored as format=dataset

  • all other objects are pickled and stored as format=pickle

log_event(event, key, value, step=None, **extra)

log an arbitrary event

Parameters:
  • event (str) – the event name (e.g. start, stop, metric, param)

  • key (str) – a key to relate the value (e.g. metric name)

  • value (str|float|int|bool|dict) – the actual event value

  • step (int) – the step

  • **extra – any other values to store with event
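
log_metric() and log_param() record events of type metric and param respectively (see their Notes below); other events can be logged directly, as in this sketch where 'checkpoint' is an arbitrary event name:

# log a custom event
exp.log_event('checkpoint', 'model-saved', True, step=5)

# query it back
df = exp.data(event='checkpoint')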

log_metric(key, value, step=None, **extra)

log a metric value

Parameters:
  • key (str) – the metric name

  • value (str|float|int|bool|dict) – the metric value

  • step (int) – the step

  • **extra – any other values to store with event

Notes

  • logged as event=metric
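
For example, logging a metric once per training step and retrieving it afterwards (losses is assumed to be a list of per-epoch values):

# log the metric at every step, e.g. once per epoch
for step, loss in enumerate(losses):
    exp.log_metric('loss', loss, step=step)

# retrieve the logged values as a DataFrame
df = exp.data(event='metric', key='loss')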

log_param(key, value, step=None, **extra)

log an experiment parameter

Parameters:
  • key (str) – the parameter name

  • value (str|float|int|bool|dict) – the parameter value

  • step (int) – the step

  • **extra – any other values to store with event

Notes

  • logged as event=param

log_system(key=None, value=None, step=None, **extra)

log system data

Parameters:
  • key (str) – the key to use, defaults to ‘system’

  • value (str|float|int|bool|dict) – the parameter value

  • step (int) – the step

  • **extra – any other values to store with event

Notes

  • logged as event=system

  • logs platform, python version and list of installed packages
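
For example (a sketch based on the parameters above):

# record platform, python version and installed packages
exp.log_system()

# query the system information back
df = exp.data(event='system')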

restore_artifact(key=None, experiment=None, run=None, step=None, value=None)

restore a logged artifact

Parameters:
  • key (str) – the name of the artifact as provided in log_artifact

  • run (int) – the run for which to query, defaults to current run

  • step (int) – the step for which to query, defaults to all steps in run

  • value (dict) – this value is used instead of querying data; use it to retrieve an artifact from the contents of .data()

Notes

  • this will restore the artifact according to its type assigned by .log_artifact(). If the type cannot be determined, the actual data is returned
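
For example, restoring by key or from the contents of .data() (a sketch; the field layout of the raw records is an assumption):

# restore by key, from the current run
mydict_ = exp.restore_artifact('somedata')

# restore from a raw record returned by .data(); the 'value' field
# name is an assumption for illustration
rows = exp.data(key='somedata', raw=True)
obj = exp.restore_artifact(value=rows[0]['value'])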

start()

start a new run

This starts a new run and logs the start event

property status

status of a run

Parameters:

run (int) – the run number, defaults to the currently active run

Returns:

the run status, one of 'STARTED', 'STOPPED'
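
For example, using explicit start()/stop() semantics (see OmegaTrackingProxy below):

exp = om.runtime.experiment('myexp')
exp.start()
assert exp.status == 'STARTED'
exp.stop()
assert exp.status == 'STOPPED'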

stop()

stop the current run

This stops the current run and records the stop event

use()

reuse the latest run instead of starting a new one

syntactic sugar for self.active_run()

Returns:

self

class omegaml.backends.experiment.OmegaProfilingTracker(*args, **kwargs)

A metric tracker that runs a system profiler while the experiment is active

Records profile events containing cpu, memory and disk profiling metrics. See BackgroundProfiler.profile() for details of the metrics collected.

Usage:

To log metrics and system performance data:

with om.runtime.experiment('myexp', provider='profiling') as exp:
    ...

data = exp.data(event='profile')

Properties:

exp.profiler.interval = n.m # interval of n.m seconds between profiles, defaults to 3 seconds
exp.profiler.metrics = ['cpu', 'memory', 'disk'] # all or a subset of metrics to collect
exp.max_buffer = n # number of profiling events to buffer before writing

Notes

  • the profiling data is buffered to reduce the number of database writes; by default the data is written after every 6 profiling events (e.g. at a 10 second interval, 6 * 10 = every 60 seconds)

  • the step reported in the tracker counts the profiling events since the start; it is not related to the step (epoch) reported by e.g. tensorflow

  • For every step there is an event=profile, key=profile_dt entry that you can use to relate profiling events to a specific wall-clock time (see the example after this list).

  • It is usually sufficient to report system metrics at intervals > 10 seconds, since machine learning algorithms tend to use CPU and memory over longer periods of time.
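
For example, to retrieve the profiling events and their wall-clock timestamps:

# all profiling metrics for the current run
profiles = exp.data(event='profile')

# the wall-clock timestamp logged for each profiling step
times = exp.data(event='profile', key='profile_dt')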

log_profile(data)

the callback for BackgroundProfiler

start()

start a new run

This starts a new run and logs the start event

stop()

stop the current run

This stops the current run and records the stop event

class omegaml.backends.experiment.NoTrackTracker(experiment, store=None, model_store=None)

A default tracker that does not record anything

Tensorflow Integration

class omegaml.backends.experiment.TensorflowCallback(*args, **kwargs)

A callback for Tensorflow Keras models

Implements the callback protocol according to Tensorflow Keras semantics, linking to an omegaml.backends.experiment.TrackingProvider
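
A sketch of the intended usage, where model, X, Y are an assumed Keras model and training data; constructing the callback directly from the active tracker is also an assumption:

from omegaml.backends.experiment import TensorflowCallback

with om.runtime.experiment('myexp') as exp:
    # passing the tracker to the callback is an assumption
    cb = TensorflowCallback(exp)
    model.fit(X, Y, epochs=5, callbacks=[cb])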

Runtime Integration

class omegaml.runtimes.trackingproxy.OmegaTrackingProxy(experiment=None, provider=None, runtime=None, implied_run=True)

OmegaTrackingProxy provides the runtime context for experiment tracking

Usage:

Using implied start()/stop() semantics, creating experiment runs:

with om.runtime.experiment('myexp') as exp:
    ...
    exp.log_metric('accuracy', score)

Using explicit start()/stop() semantics:

exp = om.runtime.experiment('myexp')
exp.start()
...
exp.stop()

See also

  • OmegaSimpleTracker

  • ExperimentBackend