Filters
OKA offers advanced filtering. The same filters can be used in the OKA interface to filter the data to display, or in the OKA backend to filter the input data used by the pipelines.
Depending on the plugin selected in the OKA UI, you will have different filtering options (see Filters). In the same way, depending on the pipeline, you will have different filtering capabilities:
For the log_js_fetch pipeline: only date filtering is available, to request the jobs present between two dates.
For the OKA Predict (Predictor) and MeteoCluster pipelines: full filtering capabilities on dates and advanced features.
Creation
The easiest way to create a filter is through the OKA UI, by saving the filters into a profile (see Save filters as profile). You can also create a filter profile manually using the admin panel (see Administrator panels). Be aware that the filters must follow a specific JSON format in order to be understood by OKA:
start_date: Filter from this date (included).
end_date: Filter to this date (included).
date_col: Filter on this date column:
    Submit: Jobs submission date
    Start: Jobs start date
    Eligible: Date when the jobs are eligible
    End: Jobs end date
    date: Date of measurement (load values, nodes statistics, energy measurement…)
multiple_filters: Filter on features (jobs features…). For example:
    "multiple_filters": {"rules": [
        {"id": "Account", "type": "string", "field": "Account", "input": "text", "value": "default", "operator": "equal"},
        {"id": "Allocated_CPUS", "type": "double", "field": "Allocated_CPUS", "input": "text", "value": "1", "operator": "greater"}],
      "condition": "AND"}
time_delta: Filter on x days. If you provide start_date and time_delta, it will filter from start_date to start_date + time_delta. If you provide end_date and time_delta, it will filter from end_date - time_delta to end_date. If you provide time_delta only, it will filter from now - time_delta to now. time_delta=30 thus has the same meaning as the RangeKey Last 30 days.
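The way start_date, end_date and time_delta combine can be sketched in Python. This is only an illustration of the rules listed above, not OKA code: the function name resolve_window and the now parameter are hypothetical, and the behaviour when both dates are given together with time_delta is an assumption (this page does not describe that case).

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_window(
    start_date: Optional[datetime] = None,
    end_date: Optional[datetime] = None,
    time_delta: Optional[int] = None,
    now: Optional[datetime] = None,
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return the (start, end) bounds implied by the three date keys.

    Hypothetical helper mirroring the documented rules; not part of OKA.
    """
    # Assumption: with no time_delta, or with both dates already given,
    # the dates are used as they are and time_delta is ignored.
    if time_delta is None or (start_date is not None and end_date is not None):
        return start_date, end_date

    delta = timedelta(days=time_delta)
    if start_date is not None:
        # start_date + time_delta: window runs forward from start_date.
        return start_date, start_date + delta
    if end_date is not None:
        # end_date + time_delta: window runs backward from end_date.
        return end_date - delta, end_date

    # time_delta only: window covers the last time_delta days.
    now = now or datetime.now()
    return now - delta, now


window_from_start = resolve_window(start_date=datetime(2022, 1, 11), time_delta=40)
window_from_end = resolve_window(end_date=datetime(2022, 2, 20), time_delta=40)
last_30_days = resolve_window(time_delta=30, now=datetime(2022, 3, 1))
```

In this sketch the first two calls describe the same 40-day window, once anchored on its start and once on its end, while the third corresponds to the Last 30 days case.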
When used with a pipeline loading job scheduler logs, the filter in the above example gathers only the jobs present on the cluster from 2022-01-11 00:00:00 until 2022-02-20 00:00:00.
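For readers who prefer to assemble a profile programmatically before pasting it into the admin panel, a minimal sketch follows. The key names come from the list above, but the date string format, the choice of Submit as date_col, and the variable name filter_profile are assumptions made for illustration; check an existing profile saved from the OKA UI for the exact format.

```python
import json

# Hypothetical filter profile. Key names follow the list above; the
# "YYYY-MM-DD HH:MM:SS" date strings are an assumed format, not confirmed here.
filter_profile = {
    "start_date": "2022-01-11 00:00:00",
    "end_date": "2022-02-20 00:00:00",
    "date_col": "Submit",  # assumed choice among Submit, Start, Eligible, End, date
    "multiple_filters": {
        "rules": [
            {
                "id": "Account",
                "type": "string",
                "field": "Account",
                "input": "text",
                "value": "default",
                "operator": "equal",
            },
            {
                "id": "Allocated_CPUS",
                "type": "double",
                "field": "Allocated_CPUS",
                "input": "text",
                "value": "1",
                "operator": "greater",
            },
        ],
        "condition": "AND",
    },
}

# Serialise to JSON text, ready to be copied into a profile.
profile_json = json.dumps(filter_profile, indent=2)
print(profile_json)
```

Building the profile as a Python dictionary and serialising it with json.dumps avoids unbalanced braces and quoting mistakes that are easy to make when writing the JSON by hand.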
Usage
Filters can be loaded and applied in the OKA UI to filter the data to display (see Save filters as profile).
Filters can be used with the pipelines to filter the logs to ingest (log_js_fetch) or the data used to train the models (OKA Predict (Predictor) and MeteoCluster).
This can be used to train a model for a specific job workload characterised by the features defined in the filters.
Use the admin interface > Conf pipelines > Filters to associate a filter with a pipeline.
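To make the effect of a multiple_filters block on a workload more concrete, the sketch below applies the rules from the Creation example to a few made-up job records. It is not how OKA evaluates filters internally: the functions rule_matches and job_matches are hypothetical, only the equal and greater operators shown on this page are handled, and support for an OR condition is an assumption.

```python
from typing import Any, Dict, List

# Rules copied from the multiple_filters example in the Creation section.
multiple_filters: Dict[str, Any] = {
    "rules": [
        {"id": "Account", "type": "string", "field": "Account",
         "input": "text", "value": "default", "operator": "equal"},
        {"id": "Allocated_CPUS", "type": "double", "field": "Allocated_CPUS",
         "input": "text", "value": "1", "operator": "greater"},
    ],
    "condition": "AND",
}


def rule_matches(job: Dict[str, Any], rule: Dict[str, Any]) -> bool:
    """Check one rule against one job record (hypothetical helper)."""
    if rule["field"] not in job:
        return False
    value, target = job[rule["field"]], rule["value"]
    # Rule values are stored as text, so convert according to the rule type.
    if rule["type"] == "double":
        value, target = float(value), float(target)
    else:
        value, target = str(value), str(target)
    if rule["operator"] == "equal":
        return value == target
    if rule["operator"] == "greater":
        return value > target
    raise ValueError(f"operator not covered by this sketch: {rule['operator']}")


def job_matches(job: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Combine the rule results with the block's condition."""
    results = [rule_matches(job, rule) for rule in filters["rules"]]
    if filters["condition"] == "AND":
        return all(results)
    if filters["condition"] == "OR":  # assumed to exist alongside AND
        return any(results)
    raise ValueError(f"condition not covered by this sketch: {filters['condition']}")


# Made-up job records, for illustration only.
jobs: List[Dict[str, Any]] = [
    {"Account": "default", "Allocated_CPUS": 4},
    {"Account": "default", "Allocated_CPUS": 1},
    {"Account": "other", "Allocated_CPUS": 8},
]
selected = [job for job in jobs if job_matches(job, multiple_filters)]
```

With these records, only the first job is kept: the second does not use strictly more than one CPU, and the third belongs to a different account.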