Capturing model metrics¶
omega-ml provides experiment and model tracking for all models using its built-in metrics store.
Running an experiment¶
Collecting metrics as part of an experiment is straightforward:
import omegaml as om
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()
X, Y = ...
with om.runtime.experiment('myexp') as exp:
    lr.fit(X, Y)
    score = lr.score(X, Y)
    exp.log_metric('accuracy', score)
    exp.log_param('penalty', 'L2')
    exp.log_artifact(lr, 'mymodel')
We can get back the data collected in an experiment using exp.data(),
as a DataFrame:
In [1]: exp.data()
Out[1]:
      experiment  run     event                          dt   node  step       key                                              value
0      myexp    1     start  2021-11-22T15:49:51.893920  eowyn   NaN       NaN                                                NaN
1      myexp    1    system  2021-11-22T15:49:51.927902  eowyn   NaN    system  {'platform': {'system': 'Linux', 'node': 'eowy...
2      myexp    1    metric  2021-11-22T15:49:51.938601  eowyn   NaN  accuracy                                                  1
3      myexp    1     param  2021-11-22T15:49:51.950340  eowyn   NaN   penalty                                                 L2
4      myexp    1  artifact  2021-11-22T15:49:51.984030  eowyn   NaN   mymodel  {'name': 'mymodel', 'data': 'experiments/.arte...
5      myexp    1      stop  2021-11-22T15:49:51.994113  eowyn   NaN       NaN                                                NaN
Note that the run column counts how many times the above with
block has been executed. Running it again records a second set of metrics. We can
get back all runs with exp.data(run='all'), or a specific set
of runs by passing a list or a tuple, as in exp.data(run=(1,3)). Additional
filters are available for the event, node, step and key fields:
In [3]: exp.data(run='all', key='accuracy')
Out[3]:
  experiment  run  step   event       key  value                          dt   node
0      myexp    1  None  metric  accuracy      1  2021-11-22T15:49:51.938601  eowyn
1      myexp    2  None  metric  accuracy      1  2021-11-22T16:02:08.579077  eowyn
2      myexp    3  None  metric  accuracy      1  2021-11-22T16:02:13.048647  eowyn
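The filter semantics described above can be pictured with a small in-memory stand-in. This is a sketch only, not omega-ml's implementation: the `filter_events` helper and the sample records are hypothetical, and returning the most recent run when no `run` is given is an assumption.

```python
# Hypothetical stand-in for the records behind exp.data(); not omega-ml code.
records = [
    {"run": 1, "event": "metric", "key": "accuracy", "value": 0.90},
    {"run": 1, "event": "param", "key": "penalty", "value": "L2"},
    {"run": 2, "event": "metric", "key": "accuracy", "value": 0.92},
    {"run": 3, "event": "metric", "key": "accuracy", "value": 0.95},
]


def filter_events(records, run=None, **fields):
    """Select records by run and by exact match on any other field."""
    if run is None:
        # assumption: default to the most recent run
        runs = {max(r["run"] for r in records)}
    elif run == "all":
        runs = {r["run"] for r in records}
    elif isinstance(run, (list, tuple)):
        runs = set(run)
    else:
        runs = {run}
    return [
        r for r in records
        if r["run"] in runs and all(r.get(k) == v for k, v in fields.items())
    ]


all_accuracy = filter_events(records, run="all", key="accuracy")
some_runs = filter_events(records, run=(1, 3), key="accuracy")
latest = filter_events(records)
```

Here `all_accuracy` mirrors a call like exp.data(run='all', key='accuracy'), and `some_runs` mirrors run=(1,3) combined with a key filter.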
Tracking experiments with multiple steps¶
To run an experiment that uses a series of parameters or k-fold splits of the input data,
such as in cross-validation, we can track each step separately. In this case
the data will be recorded for one run (e.g. run=1) and many steps.
from sklearn.model_selection import ShuffleSplit

shuffle = ShuffleSplit(n_splits=5)
with om.runtime.experiment('myexp') as exp:
    for step, (train, test) in enumerate(shuffle.split(X)):
        # fit on the training indices, score on the held-out indices
        lr.fit(X[train], Y[train])
        score = lr.score(X[test], Y[test])
        exp.log_metric('accuracy', score, step=step)
        exp.log_param('penalty', 'L2', step=step)
        exp.log_artifact(lr, 'mymodel', step=step)
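Once each step has logged its own accuracy, the per-step values can be summarized per run. The following is a minimal sketch, assuming the metric rows have been extracted as plain dictionaries with run, step, key and value fields like the columns shown earlier. The sample values and the `summarize_steps` helper are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metric rows for one run with five steps (values are invented).
rows = [
    {"run": 1, "step": step, "key": "accuracy", "value": value}
    for step, value in enumerate([0.5, 0.75, 1.0, 0.75, 0.5])
]


def summarize_steps(rows, key):
    """Return {run: (mean, min, max)} of a metric across its steps."""
    by_run = defaultdict(list)
    for row in rows:
        if row["key"] == key:
            by_run[row["run"]].append(row["value"])
    return {
        run: (mean(values), min(values), max(values))
        for run, values in by_run.items()
    }


summary = summarize_steps(rows, "accuracy")
print(summary)
```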
If the ML framework provides model callbacks, as TensorFlow does, the model can
be fit using exp.tensorflow_callback(). In this case, the model itself
reports its metrics via the callback:
model = Sequential()
...
model.compile(metrics=['accuracy'])
with om.runtime.experiment('myexp') as exp:
    model.fit(X, Y,
              callbacks=[exp.tensorflow_callback()])
Tracking model execution at runtime¶
Since experiments are a feature of the runtime, we can store a model and link it to an experiment. In this case the runtime will create an experiment context prior to performing the requested model action.
lr = LogisticRegression()
om.models.put(lr, 'mymodel', attributes={
    'tracking': {
        'default': 'myexp',
    }})
om.runtime.model('mymodel').score(X, Y)
The runtime worker will thus run the equivalent of the following code. This is true for all calls to the runtime (programmatic, CLI or REST API).
# runtime worker, in response to om.runtime.model('mymodel').score(X, Y)
def omega_score(X, Y):
    model = om.models.get('mymodel')
    meta = om.models.metadata('mymodel')
    exp_name = meta.attributes['tracking']['default']
    with om.runtime.experiment(exp_name) as exp:
        exp.log_event('task_call', 'mymodel')
        result = model.score(X, Y)
        exp.log_metric('score', result)
        exp.log_artifact(meta, 'related')
        exp.log_event('task_success', 'mymodel')
Customizing tracking behavior¶
Tracking behavior can be adjusted by using a different tracking provider.
For example, the OmegaSimpleTracker logs model metrics, while the
OmegaProfilingTracker also logs system resource usage such as
CPU and RAM while running the experiment. Write your own tracking providers
to forward metrics to a third-party metrics store, or to provide custom
callbacks to your machine learning framework.
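To illustrate the idea of forwarding metrics, here is a minimal sketch of a tracker-like object that passes every logged value to an external sink. It is a stand-alone stand-in: the class name `ForwardingTracker`, its constructor and the `sink` callable are hypothetical, and it neither subclasses nor registers with omega-ml's provider classes, whose interface is not shown in this document. Only the log_metric and log_param call shapes mirror the examples above.

```python
from datetime import datetime, timezone


class ForwardingTracker:
    """Hypothetical tracker that forwards events to a third-party sink."""

    def __init__(self, experiment, sink):
        self.experiment = experiment
        self.sink = sink  # any callable accepting one dict per event

    def _forward(self, event, key, value, step=None):
        # build one event record and hand it to the sink
        self.sink({
            "experiment": self.experiment,
            "event": event,
            "key": key,
            "value": value,
            "step": step,
            "dt": datetime.now(timezone.utc).isoformat(),
        })

    def log_metric(self, key, value, step=None):
        self._forward("metric", key, value, step=step)

    def log_param(self, key, value, step=None):
        self._forward("param", key, value, step=step)


# Usage: collect events in a list; a real sink might post to a metrics service.
sent = []
tracker = ForwardingTracker("myexp", sent.append)
tracker.log_metric("accuracy", 0.93, step=0)
tracker.log_param("penalty", "L2")
```

Keeping the sink a plain callable means the same class can write to a list in tests and to a network client in production.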
The tracking provider is selected with the provider=
argument when creating the experiment. For example, the 'profiling' provider
will track system metrics during execution:
In [1]: with om.runtime.experiment('myexp2',
                                    provider='profiling') as exp:
             ...
In [2]: exp.data()
Out[2]:
       experiment  run     event                          dt   node  step           key                                              value
    0      myexp2    1     start  2021-11-22T16:53:28.534211  eowyn   NaN           NaN                                                NaN
    1      myexp2    1    system  2021-11-22T16:53:28.579121  eowyn   NaN        system  {'platform': {'system': 'Linux', 'node': 'eowy...
    2      myexp2    1    metric  2021-11-22T16:53:28.592081  eowyn   NaN      accuracy                                                  1
    3      myexp2    1     param  2021-11-22T16:53:28.600690  eowyn   NaN       penalty                                                 L2
    4      myexp2    1  artifact  2021-11-22T16:53:28.627970  eowyn   NaN       mymodel  {'name': 'mymodel', 'data': 'experiments/.arte...
    5      myexp2    1   profile  2021-11-22T16:53:28.635717  eowyn   0.0    profile_dt                         2021-11-22T16:53:28.531654
    6      myexp2    1   profile  2021-11-22T16:53:28.643665  eowyn   0.0   memory_load                                               22.4
    7      myexp2    1   profile  2021-11-22T16:53:28.651388  eowyn   0.0  memory_total                                        33542479872
    8      myexp2    1   profile  2021-11-22T16:53:28.658964  eowyn   0.0      cpu_load                           [25.9, 27.1, 27.6, 28.6]
    9      myexp2    1   profile  2021-11-22T16:53:28.666597  eowyn   0.0     cpu_count                                                  4
    10     myexp2    1   profile  2021-11-22T16:53:28.673986  eowyn   0.0      cpu_freq                       [0.833, 1.728, 2.228, 1.736]
    11     myexp2    1   profile  2021-11-22T16:53:28.681591  eowyn   0.0       cpu_avg                            [0.215, 0.6825, 0.6925]
    12     myexp2    1   profile  2021-11-22T16:53:28.688981  eowyn   0.0      disk_use                                               95.6
    13     myexp2    1   profile  2021-11-22T16:53:28.697768  eowyn   0.0    disk_total                                       502468108288
    14     myexp2    1      stop  2021-11-22T16:53:28.705661  eowyn   NaN           NaN                                                NaN
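To give a feel for where values such as cpu_count, disk_total and disk_use could come from, the following sketch collects a few comparable figures using only the Python standard library. This is an assumption for illustration, not how the profiling tracker is implemented. The `system_profile` helper is hypothetical, treating disk_use as a percentage is a guess based on the sample output, and memory figures are left out because the standard library has no portable call for them.

```python
import os
import shutil


def system_profile(path=None):
    """Collect a few system figures, keyed like the profile rows above."""
    path = path or os.path.abspath(os.sep)  # root of the current drive
    usage = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "disk_total": usage.total,
        # assumption: disk_use is the percentage of the disk in use
        "disk_use": round(100 * usage.used / usage.total, 1),
    }


profile = system_profile()
for key, value in profile.items():
    print(key, value)
```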
The following tracking providers are available:
default - the default tracker, OmegaSimpleTracker
profiling - the profiling tracker, OmegaProfilingTracker
notrack - the no-operation tracker, NoTrackTracker. Use this to disable tracking.