Filtering Data
Query filtering
When the stored object is a Pandas DataFrame, the .get method supports
keyword-style filtering and an optional lazy evaluation mode. Filters are
applied remotely, inside the database, so they run much faster than
filtering the returned DataFrame locally.
om.datasets.get('foodf', x__gt=5)
=>
    x
 6  6
 7  7
 8  8
 9  9
The filter syntax is <column>__<operator>=<value>, where the operator
is one of the following:
- eq: compare equal (this is also the default when using the short form, i.e. <column>=<value>)
- gt: greater than
- gte: greater or equal
- lt: less than
- lte: less or equal
- between: between two values, specify value as a 2-tuple
- contains: contains a value, specify value as a sequence
- startswith: starts with a string
- endswith: ends with a string
- isnull: is a null value, specify value as a boolean
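To make the <column>__<operator>=<value> convention concrete, the following is a minimal, purely local sketch of how such keyword filters could be parsed and evaluated against in-memory rows. It is not omega|ml's implementation, since omega|ml evaluates filters inside the database. The names OPERATORS, parse_filter and filter_rows are hypothetical, and combining several filters with a logical AND is an assumption of this sketch.

```python
import operator

# Hypothetical local evaluators, keyed by the operator names listed above.
# "between" and "contains" are left out because their exact semantics
# (bound inclusivity, sequence matching) are not spelled out in this section.
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "startswith": lambda cell, value: str(cell).startswith(value),
    "endswith": lambda cell, value: str(cell).endswith(value),
    # Only None counts as null in this sketch.
    "isnull": lambda cell, value: (cell is None) == bool(value),
}


def parse_filter(key):
    """Split 'x__gt' into ('x', 'gt'); a bare 'x' defaults to 'eq'."""
    column, sep, op = key.rpartition("__")
    if sep and op in OPERATORS:
        return column, op
    return key, "eq"


def filter_rows(rows, **filters):
    """Keep the rows (dicts) that satisfy every keyword filter (assumed AND)."""
    parsed = [(parse_filter(key), value) for key, value in filters.items()]
    return [
        row for row in rows
        if all(OPERATORS[op](row.get(column), value)
               for (column, op), value in parsed)
    ]


rows = [{"x": i} for i in range(10)]
print(filter_rows(rows, x__gt=5))  # rows with x = 6, 7, 8, 9
print(filter_rows(rows, x=3))      # short form, same as x__eq=3
```

The first call mirrors the x__gt=5 example above, and the second shows the short form falling back to eq.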
In general, get returns a Pandas DataFrame. See the Pandas
documentation for ways to work with DataFrames.
However, unlike Pandas, omega|ml provides methods to work with data that is larger than memory. This is covered in the next section.