Model Drift ----------- Monitoring model drift is similar to monitoring data drift. However monitoring model drift is more complex as models have several components that can change over time: input features, targets (predictions) and respective metrics. Also models can be updated, retrained, or replaced over time, which means we can have multiple versions of a model's input, targets and metrics. omega-ml provides a `ModelDriftMonitor` class that allows to monitor all of these components of model drift. To illustrate model drift monitoring, consider the following example. We use the `iris` data from the `sklearn` package, and take two snapshots of the data. We then compare the two snapshots and plot the results. .. code:: python import omegaml as om from omegaml.backends.monitoring import ModelDriftMonitor from sklearn import datasets x, y = datasets.load_iris(return_X_y=True, as_frame=True) with om.runtime.experiment('foo', recreate=True) as exp: mon = ModelDriftMonitor(tracking=exp) mon.snapshot(X=x, Y=y, catcols=['target']) mon.snapshot(Y=y[0:5], catcols=['target']) stats = mon.compare() stats.plot() As expected, the two snapshots show a difference only in the target distribution. .. image:: /images/mon_0features.png To investigate, we can plot the distribution of the baseline and target snapshots. The plot shows the relative frequency of each unique value in both the baseline and the target snapshot. The relative frequency means that the same number of values are sampled from the distribution of the `Y_target` column in both snapshots. .. code:: python stats.plot('Y_target') .. image:: /images/mon_1hist_drift.png To show the comparison of the actual distributions, i.e. in terms of the absolute frequency of each unique value in `Y_target`, specify `sample=False`. In this example, the target snapshot contained only the first 5 rows of the original data, while the baseline contained all rows. Therefore each value is less frequent in the target snapshot. .. code-block:: python stats.plot('Y_target', sample=False) .. image:: /images/mon_1hist_drift_actual.png The underlying statistics are available as the drift statistics dataframe. Note that the plot is a relative comparison of the two snapshots, that is the baseline and target snapshots are compared to each other on the basis of the percentage of each target. .. code:: python stats.df .. image:: /images/mon_1hist_drift_df.png To select a specific column, statistic or sequence of snapshots, use the