Filtering Data
Query filtering
The .get method, when operating on a Pandas DataFrame, provides
keyword-style filtering and an optional lazy evaluation mode. Filters are
applied remotely inside the database, so only the matching rows are returned.
This is typically much faster than retrieving the full DataFrame and filtering
it locally.
om.datasets.get('foodf', x__gt=5)
=>
   x
6  6
7  7
8  8
9  9
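For comparison, the same rows can be selected locally with plain Pandas. This is a minimal sketch that assumes 'foodf' holds a single column x with the values 0 through 9; the text above shows only the filtered rows, so the full contents are an assumption. Unlike the remote filter, this approach needs the whole DataFrame in memory before any row is discarded.

```python
import pandas as pd

# Assumed contents of 'foodf': one column x holding 0..9 (not stated in the text).
df = pd.DataFrame({'x': range(10)})

# Local equivalent of x__gt=5: the full frame is already in memory,
# and the boolean mask selects the matching rows afterwards.
result = df[df['x'] > 5]
print(result)
```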
The filter syntax is <column>__<operator>=<value>, where the operator
is one of the following:
eq
compare equal (this is also the default when using the short form, i.e. <column>=<value>)
gt
greater than
gte
greater or equal
lt
less than
lte
less or equal
between
between two values, specify value as a 2-tuple
contains
contains a value, specify value as a sequence
startswith
starts with a string
endswith
ends with a string
isnull
is a null value, specify value as a boolean
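To make the keyword syntax concrete, the following sketch splits a keyword such as x__gt into its column and operator parts and applies a subset of the operators to a list of plain dictionaries. It is only an illustration: the names split_filter, OPERATORS and filter_rows are hypothetical, the inclusive bounds for between are an assumption, and omega|ml itself evaluates filters inside the database rather than in Python.

```python
# Illustrative sketch only: mimics the <column>__<operator>=<value> keyword
# syntax on a list of dicts. This is NOT how omega|ml evaluates filters.

def split_filter(key):
    """Split 'x__gt' into ('x', 'gt'); a bare 'x' defaults to 'eq'."""
    column, sep, op = key.rpartition('__')
    return (column, op) if sep else (key, 'eq')

# Hypothetical local stand-ins for a subset of the operators listed above.
OPERATORS = {
    'eq': lambda v, arg: v == arg,
    'gt': lambda v, arg: v > arg,
    'gte': lambda v, arg: v >= arg,
    'lt': lambda v, arg: v < arg,
    'lte': lambda v, arg: v <= arg,
    'between': lambda v, arg: arg[0] <= v <= arg[1],  # inclusive bounds assumed
    'startswith': lambda v, arg: v.startswith(arg),
    'endswith': lambda v, arg: v.endswith(arg),
    'isnull': lambda v, arg: (v is None) == arg,
}

def filter_rows(rows, **filters):
    """Keep the rows that satisfy every keyword filter."""
    parsed = [(split_filter(key), arg) for key, arg in filters.items()]
    return [row for row in rows
            if all(OPERATORS[op](row[col], arg) for (col, op), arg in parsed)]

rows = [{'x': i} for i in range(10)]
print(filter_rows(rows, x__gt=5))
```

Several keywords can be combined in one call, for example filter_rows(rows, x__gte=2, x__lt=5); in this sketch a row must satisfy all of them.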
In general, get returns a Pandas DataFrame. See the Pandas
documentation for ways to work with DataFrames.
However, unlike Pandas, omega|ml provides methods to work with data that is larger than memory. This is covered in the next section.