Runtime Clusters

By default omega|ml uses a Celery cluster for remote computation. However the runtime is flexible to other clusters, provided the cluster supports submitting arbitrary functions (in particular, omegaml’s task functions).

Celery Runtime (default)

The Celery runtime is the default implementation, provided as om.runtime. It provides the following interfaces:

  • .model() - to get a model proxy to a remote model, OmegaModelProxy
  • .job() - to get a job proxy to a remote job, OmegaJobProxy Enterprise Edition
  • .script() - to get a script proxy to a remote script (lambda module), OmegaScriptProxy Enterprise Edition

The model proxy supports most methods of scikit-learn models, e.g.

  • fit()
  • predict()
  • transform()
  • etc.

Note

All omega|ml proxies support the same interface, although the specific backend implementation may not support all functionality or apply slightly different semantics

See the `Working with Machine learning models`_ for more details.

The job proxy supports two methods:

  • run() - to run a job immediately
  • schedule() - to schedule a job in the future

Dask runtime (optional)

The Dask (distributed) runtime supports executing omega|ml tasks and jobs on a dask cluster, using the same semantics as the celery cluster.

To enable the Dask cluster,

# get your omega instance
om = Omega(...)

# create a dask runtime and set it as the omega runtime
om.runtime = OmegaRuntimeDask('http://dask-scheduler-host:port',
                              auth=om.runtime.auth)

Once this is done, om.runtime works as with the default runtime, except that now all tasks previously executed on the celery cluster will now be executed on the dask cluster.