Supporting Types
Enumerations
These enumerations are used as parameter or return types in SDK methods.
ComputeLoadResolution
- class applications.ait.enums.ComputeLoadResolution(*values)
Bases:
StrEnum
Supported time resolutions for compute_load().
Restricted subset of Resolution — only values present in RESOLUTION_DIC (lib/common/oka_constants/constants.py) are accepted by compute_load(). Once compute_load() is refactored to support all Resolution values, remove this enum and replace usages with Resolution directly.
| Member | Value |
| --- | --- |
| SECOND | 1second |
| MINUTE | 1minute |
| HOUR | 1hour |
| DAY | 1day |
| MONTH | 1month |
DataType
- class applications.ait.enums.DataType(*values)
Bases:
StrEnum
Resource dimension used across OKA data queries.
Typed replacement for the "core"/"GPU" string literals passed to services and providers to select the hardware resource being measured. Used in state, congestion, load, and other apps.
When GPU is selected, metric categories that don’t have a dedicated GPU field (cost, energy, …) fall back to GPU-hours.
Example:
>>> provider.get_jobs_status(data_type=DataType.GPU)
| Member | Value |
| --- | --- |
| CORE | core |
| GPU | GPU |
GpuAccountingField
- class applications.resources.dto.gpu.GpuAccountingField(*values)
Bases:
StrEnum
Subset of AccountingField relevant to GPU analysis.
Only two fields are valid for GPU distribution queries: the number of GPUs allocated to the job (ALLOC_GPUS) and the number originally requested (REQ_GPUS).
| Member | Value |
| --- | --- |
| ALLOC_GPUS | Allocated_GPU |
| REQ_GPUS | Requested_GPU |
GroupingField
- class applications.state.dto.state.GroupingField(*values)
Bases:
StrEnum
Standard ES fields used to group jobs in grouped query variants.
Members cover the most common grouping dimensions. Pass a raw string for custom or cluster-specific fields not listed here (e.g. "Application", "WCKey").
Example:
>>> provider.get_jobs_status_grouped(grouping_field=GroupingField.ACCOUNT)
>>> provider.get_jobs_status_grouped(grouping_field="Application")  # custom
| Member | Value |
| --- | --- |
| ACCOUNT | Account |
| UID | UID |
| GID | GID |
| PARTITION | Partition |
| USERNAME | User |
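The "member or raw string" contract described for GroupingField can be sketched with a small normalizing helper. This is only an illustration under stated assumptions: `resolve_grouping_field` is a hypothetical name, not an SDK function, and the enum is re-declared locally with a `str` mixin so the snippet runs without OKA installed.

```python
from enum import Enum


class GroupingField(str, Enum):
    """Local stand-in mirroring the documented members (not the SDK class)."""

    ACCOUNT = "Account"
    UID = "UID"
    GID = "GID"
    PARTITION = "Partition"
    USERNAME = "User"


def resolve_grouping_field(field):
    """Hypothetical helper: return the ES field name for a member or a raw string."""
    if isinstance(field, GroupingField):
        return field.value
    # Raw strings pass through unchanged so custom fields keep working.
    return str(field)


standard = resolve_grouping_field(GroupingField.USERNAME)
custom = resolve_grouping_field("Application")
print(standard, custom)
```

Passing custom strings through unchanged keeps cluster-specific fields usable without extending the enum.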
JobState
- class applications.ait.enums.JobState(*values)
Bases:
StrEnum
HPC job state as reported by the scheduler.
Inherits from str so that members compare equal to plain strings:
>>> JobState.CANCELLED == "CANCELLED"  # True
>>> {"CANCELLED": 42}[JobState.CANCELLED]  # 42
Drop-in replacement for the job state string constants in oka_constants.constants (CANCELLED, FAILED, …). Existing code that compares against raw strings keeps working; new code gains autocompletion, exhaustiveness checks, and Enum iteration.
| Member | Value |
| --- | --- |
| CANCELLED | CANCELLED |
| COMPLETED | COMPLETED |
| FAILED | FAILED |
| NODE_FAIL | NODE_FAIL |
| PREEMPTED | PREEMPTED |
| TIMEOUT | TIMEOUT |
| BOOT_FAIL | BOOT_FAIL |
| REQUEUED | REQUEUED |
| RUNNING | RUNNING |
| RESIZING | RESIZING |
| SUSPENDED | SUSPENDED |
| PENDING | PENDING |
| CONFIGURING | CONFIGURING |
| COMPLETING | COMPLETING |
| OUT_OF_MEMORY | OUT_OF_MEMORY |
| REVOKED | REVOKED |
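The string compatibility described for JobState can be tried out with a local stand-in. The sketch below assumes nothing about the SDK beyond the documented member values; it re-declares three members with a `str` mixin (rather than `StrEnum`) so it also runs on Python versions older than 3.11.

```python
from collections import Counter
from enum import Enum


class JobState(str, Enum):
    """Local stand-in with a few of the documented members."""

    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Raw strings, as legacy code or a scheduler dump might provide them.
raw_states = ["COMPLETED", "FAILED", "COMPLETED", "CANCELLED"]
counts = Counter(raw_states)

# Members hash and compare like their string value, so they index the same keys.
completed = counts[JobState.COMPLETED]

# Lookup by value turns a raw string back into a member.
parsed = JobState("FAILED")

# Iteration yields every declared member in definition order.
names = [state.name for state in JobState]
print(completed, parsed.name, names)
```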
LoadDatetimeField
- class applications.load.dto.enums.LoadDatetimeField(*values)
Bases:
StrEnum
Subset of DatetimeCol values used across the load application.
Restricts the full DatetimeCol set (SUBMIT, ELIGIBLE, START, END) to the two fields the load app actually defaults to and checks against.
Example:
>>> provider.get_core_load(datetime_col=LoadDatetimeField.SUBMIT)
>>> LoadDatetimeField.ELIGIBLE in mapping_keys
| Member | Value |
| --- | --- |
| SUBMIT | Submit |
| ELIGIBLE | Eligible |
MemoryAccountingField
- class applications.resources.dto.memory.MemoryAccountingField(*values)
Bases:
StrEnum
Subset of memory fields valid for memory distribution queries.
Only two fields are relevant: peak memory actually used by the job (MAX_RSS) and memory originally requested (REQ_MEM).
| Member | Value |
| --- | --- |
| MAX_RSS | MaxRSS |
| REQ_MEM | Requested_Memory |
MetricCategory
- class applications.ait.enums.MetricCategory(*values)
Bases:
StrEnum
Metric category used across OKA data queries.
Typed replacement for the raw string constants in oka_constants.constants (STATE, COREHOURS, GPUHOURS, COST, ENERGY, CARBON_FOOTPRINT). String values match the Elasticsearch field names, so members are drop-in replacements and need no conversion:
>>> MetricCategory.JOBS == "State"  # True
>>> MetricCategory.CORE_HOURS == "Core_hours"  # True
Used by providers, services, and views across multiple apps (state, load, consumers, kpi, …). Use this instead of raw strings or oka_constants lookups in user scripts and provider calls.
Example:
>>> provider.get_jobs_status(category=MetricCategory.CORE_HOURS)
>>> provider.get_jobs_status(category=MetricCategory.COST)
| Member | Value |
| --- | --- |
| JOBS | State |
| CORE_HOURS | Core_hours |
| GPU_HOURS | GPU_hours |
| COST | Cost |
| ENERGY | Energy |
| CARBON_FOOTPRINT | CO2 |
Resolution
- class applications.ait.enums.Resolution(*values)
Bases:
ResolutionMixin, StrEnum
Time bucket size for data aggregation queries.
Each member carries metadata needed by different layers of the stack:
- es_interval: Elasticsearch calendar_interval value (e.g. "1d").
- pandas_freq: Pandas frequency alias for resampling (e.g. "D").
- millis: Duration in milliseconds (used for JS chart intervals).
- duration_hours: Duration in hours (used for resource-hour normalization).
- date_format: strftime format for display.
These properties replace the legacy dicts RESOLUTION_DIC (oka_constants), DURATION_AS_HOURS, TIME_STEPS, and TIME_FORMATTING (ait/constants).
Example:
>>> Resolution.DAY.es_interval  # "1d"
>>> Resolution.DAY.duration_hours  # 24.0
>>> Resolution("1hour").pandas_freq  # "h"
| Member | Value |
| --- | --- |
| SECOND | 1second |
| MINUTE | 1minute |
| TEN_MIN | 10min |
| HOUR | 1hour |
| DAY | 1day |
| WEEK | 1week |
| MONTH | 1month |
| YEAR | 1year |
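One way an enum can carry per-member metadata like this is Python's documented pattern of a custom `__new__` that unpacks a tuple. The sketch below is not the actual ResolutionMixin implementation, whose internals are not shown here. It covers only two members, omits `date_format`, and the `"1h"` interval for HOUR and the derivation of `millis` from `duration_hours` are assumptions made for illustration.

```python
from enum import Enum


class Resolution(str, Enum):
    """Illustrative stand-in: string-valued members with attached metadata."""

    def __new__(cls, value, es_interval, pandas_freq, duration_hours):
        obj = str.__new__(cls, value)
        obj._value_ = value  # lookups such as Resolution("1day") use this
        obj.es_interval = es_interval
        obj.pandas_freq = pandas_freq
        obj.duration_hours = duration_hours
        return obj

    # value, es_interval, pandas_freq, duration_hours
    HOUR = ("1hour", "1h", "h", 1.0)  # "1h" is an assumed interval string
    DAY = ("1day", "1d", "D", 24.0)

    @property
    def millis(self):
        # Assumption: milliseconds are derived from the duration in hours.
        return int(self.duration_hours * 3_600_000)


print(Resolution.DAY.es_interval, Resolution("1hour").pandas_freq, Resolution.DAY.millis)
```

Keeping the metadata on the member means callers never need a separate lookup dict keyed by resolution string.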
ResultStatus
- class applications.ait.enums.ResultStatus(*values)
Bases:
StrEnum
Outcome status for service/provider query results.
Used on DTOs to indicate whether the query returned data, and if not, why. This replaces the legacy pattern of returning {"error": message} dicts or raising exceptions for empty results.
SDK users can check the status before iterating over results:
>>> if result.status == ResultStatus.OK:
...     for entry in result.entries:
...         print(entry.state, entry.count)
... else:
...     print(f"No data: {result.status.value}")
Members carry the same string values as the legacy constants DB_ERROR_MESSAGE, FILTER_ERROR_MESSAGE, and RESOLUTION_MESSAGE from oka_constants to ease migration.
| Member | Value |
| --- | --- |
| OK | ok |
| NO_DATA | No data |
| NO_RESULTS | No results found |
| TOO_MANY_DATA | Too many data - Please tune your date filter to a smaller period or try another resolution |
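The status-first pattern can be exercised end to end with stand-in types. In the sketch below, `Entry`, `JobsResult`, and `total_jobs` are hypothetical names invented for illustration; only the `status`/`entries` attributes and the member values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class ResultStatus(str, Enum):
    """Local stand-in with three of the documented members."""

    OK = "ok"
    NO_DATA = "No data"
    NO_RESULTS = "No results found"


@dataclass(frozen=True)
class Entry:
    state: str
    count: int


@dataclass(frozen=True)
class JobsResult:
    """Hypothetical DTO: a status plus the entries returned by the query."""

    status: ResultStatus
    entries: Tuple[Entry, ...] = ()


def total_jobs(result: JobsResult) -> Optional[int]:
    """Sum entry counts, or return None when the query produced no usable data."""
    if result.status != ResultStatus.OK:
        return None
    return sum(entry.count for entry in result.entries)


ok = JobsResult(ResultStatus.OK, (Entry("COMPLETED", 3), Entry("FAILED", 1)))
empty = JobsResult(ResultStatus.NO_DATA)
print(total_jobs(ok), total_jobs(empty))
```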
SubmissionDatetimeCol
- class applications.throughput.dto.common.SubmissionDatetimeCol(*values)
Bases:
StrEnum
Pre-start timestamp columns only (Submit and Eligible).
Restricted subset for metrics where the reference date must precede job execution — i.e. wait time and slowdown. Using Start or End as a “from” date would produce nonsensical results for those metrics.
Enum values are the PascalCase Elasticsearch field names, matching AccountingField directly. Use .session_key when a lowercase form is needed for URL path segments.
Example:
>>> col = SubmissionDatetimeCol.ELIGIBLE
>>> str(col)  # → "Eligible" (ES field name, ready to use)
>>> col.session_key  # → "eligible" (for URL/session storage)
| Member | Value |
| --- | --- |
| SUBMIT | Submit |
| ELIGIBLE | Eligible |
ThroughputDatetimeCol
- class applications.throughput.dto.common.ThroughputDatetimeCol(*values)
Bases:
StrEnum
Any datetime column available in throughput queries.
Full set of job-lifecycle timestamps: submission, eligibility, start, and end. Use this for metrics that can be bucketed on any event (e.g. job frequency, interarrival time).
Enum values are the PascalCase Elasticsearch field names, matching AccountingField directly. Use .session_key when a lowercase form is needed for URL path segments.
Example:
>>> col = ThroughputDatetimeCol.START
>>> str(col)  # → "Start" (ES field name, ready to use)
>>> col.session_key  # → "start" (for URL/session storage)
| Member | Value |
| --- | --- |
| SUBMIT | Submit |
| ELIGIBLE | Eligible |
| START | Start |
| END | End |
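The relationship between the two datetime enums, and the lowercase `session_key` form, can be sketched with local stand-ins. How `session_key` is really computed is not stated here; the version below simply lowercases the value, which is an assumption that happens to match the documented examples.

```python
from enum import Enum


class _DatetimeCol(str, Enum):
    """Hypothetical shared base providing the lowercase key."""

    @property
    def session_key(self):
        # Assumption: the URL/session form is the ES field name lowercased.
        return self.value.lower()


class SubmissionDatetimeCol(_DatetimeCol):
    SUBMIT = "Submit"
    ELIGIBLE = "Eligible"


class ThroughputDatetimeCol(_DatetimeCol):
    SUBMIT = "Submit"
    ELIGIBLE = "Eligible"
    START = "Start"
    END = "End"


# The pre-start columns are a strict subset of the full lifecycle set.
submission_values = {col.value for col in SubmissionDatetimeCol}
throughput_values = {col.value for col in ThroughputDatetimeCol}
is_subset = submission_values < throughput_values

print(ThroughputDatetimeCol.START.session_key, is_subset)
```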
Data Models
These Pydantic models appear as nested types within DTOs or as return values from provider methods.
CategoryThreshold
- class applications.throughput.dto.exec_time.CategoryThreshold(*, name: str, min_percent: float | None = None, max_percent: float | None = None, color: str = '#cccccc', tooltip: str = '')
Bases:
BaseModel
A single ratio threshold category for the exectime/timelimit sunburst.
- name
Display name for the category (e.g. "Optimal").
- Type:
str
- min_percent
Lower bound of the ratio range (inclusive), or None for an open lower bound.
- Type:
float | None
- max_percent
Upper bound of the ratio range (exclusive), or None for an open upper bound.
- Type:
float | None
- color
Hex color string for frontend rendering (e.g. "#4caf50").
- Type:
str
- tooltip
Tooltip text shown on hover in the frontend.
- Type:
str
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
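The bound semantics of CategoryThreshold (inclusive lower bound, exclusive upper bound, None for an open side) can be illustrated with a short classifier. This is a sketch, not the SDK's logic: a frozen dataclass stands in for the Pydantic model, and the `contains` and `classify` helpers as well as the three sample categories are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CategoryThreshold:
    """Dataclass stand-in for the documented model (bounds only)."""

    name: str
    min_percent: Optional[float] = None
    max_percent: Optional[float] = None


def contains(threshold: CategoryThreshold, ratio_percent: float) -> bool:
    """True when min_percent <= ratio < max_percent, treating None as unbounded."""
    above = threshold.min_percent is None or ratio_percent >= threshold.min_percent
    below = threshold.max_percent is None or ratio_percent < threshold.max_percent
    return above and below


def classify(thresholds: List[CategoryThreshold], ratio_percent: float) -> Optional[str]:
    """Return the name of the first category containing the ratio, if any."""
    for threshold in thresholds:
        if contains(threshold, ratio_percent):
            return threshold.name
    return None


# Hypothetical categories; real deployments define their own ranges.
thresholds = [
    CategoryThreshold("Under-used", max_percent=50.0),
    CategoryThreshold("Optimal", min_percent=50.0, max_percent=100.0),
    CategoryThreshold("Over limit", min_percent=100.0),
]
print(classify(thresholds, 50.0), classify(thresholds, 100.0))
```

Half-open ranges let adjacent categories share a boundary value without overlapping.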
CoreBinStats
- class applications.resources.dto.cores.CoreBinStats(*, bin_label: str, job_count: Annotated[int, Ge(ge=0)], core_hours_sum: Annotated[float, Ge(ge=0)], core_hours_mean: Annotated[float, Ge(ge=0)])
Bases:
BaseModel
Statistics for a single core allocation bin.
- bin_label
Human-readable bin range, e.g. "[1, 4[".
- Type:
str
- job_count
Number of jobs that allocated cores in this bin.
- Type:
int
- core_hours_sum
Total core-hours consumed by those jobs.
- Type:
float
- core_hours_mean
Mean core-hours per job in this bin.
- Type:
float
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
CoresGroupStats
- class applications.resources.dto.cores.CoresGroupStats(*, group_name: str | int, grouping_type: str, bin_labels: list[str], job_counts: list[Annotated[int | float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0)])]], core_hours_sum: list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0)])]], core_hours_mean: list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0)])]])
Bases:
BaseModel
Core distribution data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- bin_labels
Ordered bin range labels shared across all metrics.
- Type:
list[str]
- job_counts
Number of jobs per bin.
- Type:
list[int | float]
- core_hours_sum
Total core-hours per bin.
- Type:
list[float]
- core_hours_mean
Mean core-hours per job per bin.
- Type:
list[float]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
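CoresGroupStats stores its metrics as parallel lists aligned with bin_labels. A consumer that prefers one record per bin can zip them together, as in the sketch below. The dataclass is a stand-in for the Pydantic model, `per_bin` is a hypothetical helper, and the sample numbers are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class CoresGroupStats:
    """Dataclass stand-in with the documented fields."""

    group_name: Union[str, int]
    grouping_type: str
    bin_labels: List[str]
    job_counts: List[float]
    core_hours_sum: List[float]
    core_hours_mean: List[float]


def per_bin(stats: CoresGroupStats) -> List[Dict[str, object]]:
    """Turn the parallel lists into one dict per bin, in bin order."""
    return [
        {
            "bin_label": label,
            "job_count": count,
            "core_hours_sum": total,
            "core_hours_mean": mean,
        }
        for label, count, total, mean in zip(
            stats.bin_labels,
            stats.job_counts,
            stats.core_hours_sum,
            stats.core_hours_mean,
        )
    ]


# Invented sample data for one user.
stats = CoresGroupStats(
    group_name="alice",
    grouping_type="uid",
    bin_labels=["[1, 4[", "[4, 16["],
    job_counts=[10, 2],
    core_hours_sum=[40.0, 64.0],
    core_hours_mean=[4.0, 32.0],
)
rows = per_bin(stats)
print(rows[1])
```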
CoresMemoryBinStats
- class applications.resources.dto.cores_memory.CoresMemoryBinStats(*, memory_bin_label: str, job_counts: list[int | float], core_hours: list[float])
Bases:
BaseModel
Per-memory-bin counts and core-hours across all core-allocation bins.
- memory_bin_label
Human-readable memory bin range, e.g. "[1GB, 4GB[".
- Type:
str
- job_counts
Number of jobs per core-allocation bin for this memory bin.
- Type:
list[int | float]
- core_hours
Total core-hours per core-allocation bin for this memory bin.
- Type:
list[float]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
CoresMemoryGroupStats
- class applications.resources.dto.cores_memory.CoresMemoryGroupStats(*, group_name: str | int, grouping_type: str, core_bin_labels: list[str], bins: list[CoresMemoryBinStats])
Bases:
BaseModel
Cores-vs-memory matrix data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- core_bin_labels
Ordered core-allocation bin labels for this group.
- Type:
list[str]
- bins
One entry per memory bin, carrying job counts and core-hours.
- Type:
list[CoresMemoryBinStats]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
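Read together, core_bin_labels supplies the columns and each entry of bins supplies one row of a cores-vs-memory matrix. The sketch below builds that matrix as nested dicts. Dataclasses stand in for the Pydantic models, `job_count_matrix` is a hypothetical helper, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class CoresMemoryBinStats:
    memory_bin_label: str
    job_counts: List[float]
    core_hours: List[float]


@dataclass(frozen=True)
class CoresMemoryGroupStats:
    group_name: Union[str, int]
    grouping_type: str
    core_bin_labels: List[str]
    bins: List[CoresMemoryBinStats]


def job_count_matrix(group: CoresMemoryGroupStats) -> Dict[str, Dict[str, float]]:
    """Map memory bin label -> core bin label -> job count."""
    return {
        row.memory_bin_label: dict(zip(group.core_bin_labels, row.job_counts))
        for row in group.bins
    }


# Invented sample data: two memory bins by two core-allocation bins.
group = CoresMemoryGroupStats(
    group_name="alice",
    grouping_type="uid",
    core_bin_labels=["[1, 4[", "[4, 16["],
    bins=[
        CoresMemoryBinStats("[1GB, 4GB[", [5, 1], [12.0, 9.5]),
        CoresMemoryBinStats("[4GB, 16GB[", [0, 3], [0.0, 48.0]),
    ],
)
matrix = job_count_matrix(group)
print(matrix["[4GB, 16GB["]["[4, 16["])
```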
ExecTimeFilterConfig
- class applications.throughput.dto.exec_time.ExecTimeFilterConfig(*, include_timeout: bool = True, include_other_end_states: bool = False)
Bases:
BaseModel
Job-state filter settings for the exectime/timelimit analysis.
Controls which terminal states are included in the ratio calculation. Non-terminal states (RUNNING, PENDING, etc.) are always excluded.
- include_timeout
When True, TIMEOUT jobs are included.
- Type:
bool
- include_other_end_states
When True, CANCELLED, FAILED, NODE_FAIL, PREEMPTED, BOOT_FAIL, OUT_OF_MEMORY, and REVOKED jobs are included.
- Type:
bool
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
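The effect of the two flags can be made concrete by listing the job states that would pass the filter. The sketch below is a guess at the resulting set, not the SDK's implementation: `included_states` is a hypothetical helper, and treating COMPLETED as always included is an assumption, since the description only says that non-terminal states are excluded.

```python
from typing import List

# States switched on together by include_other_end_states, per the field description.
OTHER_END_STATES = (
    "CANCELLED",
    "FAILED",
    "NODE_FAIL",
    "PREEMPTED",
    "BOOT_FAIL",
    "OUT_OF_MEMORY",
    "REVOKED",
)


def included_states(
    include_timeout: bool = True,
    include_other_end_states: bool = False,
) -> List[str]:
    """Hypothetical helper: terminal states kept for the ratio calculation."""
    states = ["COMPLETED"]  # Assumption: completed jobs are always analysed.
    if include_timeout:
        states.append("TIMEOUT")
    if include_other_end_states:
        states.extend(OTHER_END_STATES)
    return states


print(included_states())
print(len(included_states(include_other_end_states=True)))
```

The defaults mirror the documented signature: TIMEOUT jobs are kept, other end states are dropped.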
ExtendedStats
- class applications.resources.dto.stats.ExtendedStats(*, min: float | None = None, max: float | None = None, mean: float | None = None, count: Annotated[int, Ge(ge=0)] = 0, std: float | None = None)
Bases:
BaseModel
Descriptive statistics returned by an ES extended_stats aggregation.
- min
Minimum observed value, or None when count is zero.
- Type:
float | None
- max
Maximum observed value, or None when count is zero.
- Type:
float | None
- mean
Mean value, or None when count is zero.
- Type:
float | None
- count
Number of documents included in the aggregation.
- Type:
int
- std
Standard deviation, or None when count is zero.
- Type:
float | None
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
GpuBinStats
- class applications.resources.dto.gpu.GpuBinStats(*, bin_label: str, job_count: Annotated[int, Ge(ge=0)], gpu_hours_sum: Annotated[float | None, Ge(ge=0.0)] = None, gpu_hours_mean: Annotated[float | None, Ge(ge=0.0)] = None)
Bases:
BaseModel
Statistics for a single GPU allocation bin.
- bin_label
Human-readable bin range, e.g. "[1, 4[".
- Type:
str
- job_count
Number of jobs that used GPUs in this bin.
- Type:
int
- gpu_hours_sum
Total GPU-hours consumed by those jobs. None when computing requested (not allocated) GPUs.
- Type:
float | None
- gpu_hours_mean
Mean GPU-hours per job in this bin. None when computing requested (not allocated) GPUs.
- Type:
float | None
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
GpuGroupStats
- class applications.resources.dto.gpu.GpuGroupStats(*, group_name: str | int, grouping_type: str, bin_labels: list[str], job_counts: list[Annotated[int | float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0)])]], gpu_hours_sum: list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0)])]] | None = None, gpu_hours_mean: list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0.0)])]] | None = None)
Bases:
BaseModel
GPU distribution data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- bin_labels
Ordered bin range labels shared across all metrics.
- Type:
list[str]
- job_counts
Number of jobs per bin.
- Type:
list[int | float]
- gpu_hours_sum
Total GPU-hours per bin. None when computing requested (not allocated) GPUs.
- Type:
list[float] | None
- gpu_hours_mean
Mean GPU-hours per job per bin. None when computing requested (not allocated) GPUs.
- Type:
list[float] | None
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
LoadStats
- class applications.load.dto.load.LoadStats(*, mean: float | None = None, std: float | None = None, min: float | None = None, p10: float | None = None, p20: float | None = None, p30: float | None = None, p40: float | None = None, median: float | None = None, p60: float | None = None, p70: float | None = None, p80: float | None = None, p90: float | None = None, max: float | None = None)
Bases:
BaseModel
Descriptive statistics for a single load timeseries (one RUNNING or WAITING series in a single bucket resolution).
Matches the output of oka.lib.common.utility.get_stats: pandas Series.describe(percentiles=[.1, .2, .3, .4, .5, .6, .7, .8, .9]) with 50% renamed to median. Values are rounded to one decimal; None where pandas returned NaN.
- mean
Arithmetic mean.
- Type:
float | None
- std
Standard deviation.
- Type:
float | None
- min
Minimum value.
- Type:
float | None
- p10
10th percentile.
- Type:
float | None
- p20
20th percentile.
- Type:
float | None
- p30
30th percentile.
- Type:
float | None
- p40
40th percentile.
- Type:
float | None
- median
Median (50th percentile).
- Type:
float | None
- p60
60th percentile.
- Type:
float | None
- p70
70th percentile.
- Type:
float | None
- p80
80th percentile.
- Type:
float | None
- p90
90th percentile.
- Type:
float | None
- max
Maximum value.
- Type:
float | None
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
MemoryBinStats
- class applications.resources.dto.memory.MemoryBinStats(*, bin_label: str, job_count: Annotated[int, Ge(ge=0)])
Bases:
BaseModel
Statistics for a single memory bin.
- bin_label
Human-readable bin range, e.g. "[1 GB, 4 GB[" or "[64 GB".
- Type:
str
- job_count
Number of jobs whose memory fell in this bin.
- Type:
int
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
MemoryGroupStats
- class applications.resources.dto.memory.MemoryGroupStats(*, group_name: str | int, grouping_type: str, bin_labels: list[str], job_counts: list[int | float])
Bases:
BaseModel
Memory distribution data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- bin_labels
Ordered GB bin range labels shared across all metrics.
- Type:
list[str]
- job_counts
Number of jobs per bin.
- Type:
list[int | float]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
MemoryRatioBinStats
- class applications.resources.dto.consumed_vs_requested_memory.MemoryRatioBinStats(*, bin_label: str, job_count: Annotated[int, Ge(ge=0)])
Bases:
BaseModel
Statistics for a single memory ratio percentage bin.
- bin_label
Human-readable bin range, e.g. "[10%, 20%[" or "[100%".
- Type:
str
- job_count
Number of jobs whose consumed/requested ratio fell in this bin.
- Type:
int
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
MemoryRatioGroupStats
- class applications.resources.dto.consumed_vs_requested_memory.MemoryRatioGroupStats(*, group_name: str | int, grouping_type: str, bin_labels: list[str], job_counts: list[int | float])
Bases:
BaseModel
Memory ratio distribution data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- bin_labels
Ordered percentage bin range labels shared across all metrics.
- Type:
list[str]
- job_counts
Number of jobs per bin.
- Type:
list[int | float]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
NodesBinStats
- class applications.resources.dto.nodes.NodesBinStats(*, bin_label: str, job_count: Annotated[int, Ge(ge=0)])
Bases:
BaseModel
Statistics for a single node allocation bin.
- bin_label
Human-readable bin range, e.g. "[1, 4[" or "[16".
- Type:
str
- job_count
Number of jobs whose node count fell in this bin.
- Type:
int
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
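The bin labels used throughout these models follow a half-open convention: "[1, 4[" includes 1 and excludes 4, and a label such as "[16" is open-ended. The sketch below shows one way such labels and per-bin job counts could be produced from a list of lower edges. The edges and both helper names are hypothetical; the SDK's real binning is not shown here.

```python
from bisect import bisect_right
from typing import List


def bin_labels(edges: List[int]) -> List[str]:
    """Build "[lo, hi[" labels for consecutive edges plus an open-ended last bin."""
    labels = [f"[{lo}, {hi}[" for lo, hi in zip(edges, edges[1:])]
    labels.append(f"[{edges[-1]}")
    return labels


def bin_index(edges: List[int], value: int) -> int:
    """Index of the bin whose inclusive lower edge is the largest edge <= value."""
    index = bisect_right(edges, value) - 1
    if index < 0:
        raise ValueError(f"{value} is below the first bin edge {edges[0]}")
    return index


# Hypothetical edges and per-job node counts.
edges = [1, 4, 16]
node_counts = [1, 2, 4, 8, 16, 32]

labels = bin_labels(edges)
job_counts = [0] * len(labels)
for nodes in node_counts:
    job_counts[bin_index(edges, nodes)] += 1

print(dict(zip(labels, job_counts)))
```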
NodesGroupStats
- class applications.resources.dto.nodes.NodesGroupStats(*, group_name: str | int, grouping_type: str, bin_labels: list[str], job_counts: list[int | float])
Bases:
BaseModel
Node allocation distribution data for a single group.
- group_name
Value of the grouping field for this group (e.g. a username).
- Type:
str | int
- grouping_type
ES field used for grouping (e.g. "uid", "account").
- Type:
str
- bin_labels
Ordered bin range labels shared across all metrics.
- Type:
list[str]
- job_counts
Number of jobs per bin.
- Type:
list[int | float]
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Domain Models
These Django models represent core OKA domain objects passed to or returned from SDK classes.
Workload
- class core_applications.workload.models.workload.Workload(*args, **kwargs)
Bases:
Model
Define a specific scope for healthcheck evaluations.
A workload represents a mutable configuration that defines:
- Which clusters to monitor
- What filters to apply (using QueryBuilder JSON format)
- User ownership
- name
Unique name of the workload (e.g., ‘ai_team_production’).
- description
Detailed description of what this workload monitors.
- clusters
ManyToMany relationship to clusters to monitor.
- filters
JSON object with QueryBuilder-style filters for data providers.
- created_by
User who created this workload.
- created_at
Timestamp when this workload was created.
- exception DoesNotExist
Bases:
ObjectDoesNotExist
- exception MultipleObjectsReturned
Bases:
MultipleObjectsReturned
- exception NotUpdated
Bases:
ObjectNotUpdated, DatabaseError
- property cluster_names: list[str]
Get list of cluster names for this workload.
- Returns:
List of cluster name strings.
- property cluster_uids: list[str]
Get list of cluster UIDs for this workload.
- Returns:
List of cluster UID strings.