kempnerpulse.translate
KempnerPulse Layer 2 — Translate.
Maps Layer-1 RawRecord objects (source vocabulary, raw values) to
CanonicalRecord objects (one stable internal vocabulary). This package currently
defines the canonical schema — the inter-layer contract that Layers 3 and 4
depend on. The translator that produces canonical records (mapping tables, unit
normalization, missing-value policy, counter differencing) lands on top of it.
- class kempnerpulse.translate.CanonicalRecord
Bases: object
One fully-translated reading for one entity, in canonical vocabulary.
All Optional[float] metric fields are None unless the source provided them. The required block (no defaults) is the metadata every record must carry; the optional block is the per-subsystem readings plus cluster and reserved metadata that tolerate absence.
- record_aggregation_mode: AggregationMode
- record_provenance: Provenance
- validate()
Raise TranslateError if any single-record invariant is violated.
Single-record invariants only. The cross-record invariant (energy is monotonically non-decreasing per entity) is enforced upstream by the Translate differencer, which is the only component that sees the sequence; it cannot be checked from one record in isolation.
- Return type:
None
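The split between single-record and cross-record invariants can be illustrated with a small sketch. This is not the library's differencer: `EnergyMonotonicityChecker` and its `check` method are hypothetical names, and the only thing taken from the schema is that a record carries `entity_gpu_uuid` and an optional `gpu_board_total_energy_joules`. It shows why the check needs state that a single record does not have.

```python
from typing import Dict, Optional


class EnergyMonotonicityChecker:
    """Hypothetical helper: remembers the last energy reading per entity."""

    def __init__(self) -> None:
        self._last_joules: Dict[str, float] = {}

    def check(self, entity_gpu_uuid: str, energy_joules: Optional[float]) -> bool:
        """Return True if the reading keeps the entity's sequence non-decreasing.

        A missing reading (None) is accepted and leaves the state unchanged.
        A decreasing reading is rejected and also leaves the state unchanged.
        """
        if energy_joules is None:
            return True
        previous = self._last_joules.get(entity_gpu_uuid)
        if previous is not None and energy_joules < previous:
            return False
        self._last_joules[entity_gpu_uuid] = energy_joules
        return True


checker = EnergyMonotonicityChecker()
results = [
    checker.check("GPU-a", 100.0),  # first reading for GPU-a
    checker.check("GPU-a", 100.0),  # equal is allowed (non-decreasing)
    checker.check("GPU-b", 5.0),    # entities are tracked independently
    checker.check("GPU-a", 99.0),   # decrease for GPU-a
]
```

Here `results` is `[True, True, True, False]`: only the last reading breaks the per-entity ordering, and no inspection of that record alone could have detected it.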
- __init__(record_schema_version, record_timestamp_monotonic_seconds, record_timestamp_wallclock_unix_seconds, record_aggregation_mode, record_window_microseconds, record_freshness_microseconds, record_provenance, record_hostname, entity_gpu_index, entity_gpu_uuid, entity_mig_instance_index=None, entity_process_id=None, entity_process_command_line_truncated=None, record_slurm_job_id=None, record_slurm_step_id=None, record_slurm_array_job_id=None, record_slurm_array_task_id=None, record_slurm_restart_count=None, record_node_index_in_job=None, record_mpi_rank=None, record_capture_clock_offset_microseconds=None, record_user_annotation_iteration_index=None, record_user_annotation_phase_label=None, record_user_annotation_step_count=None, record_user_annotation_request_id=None, record_user_annotation_token_count=None, gpu_streaming_multiprocessor_active_cycle_fraction=None, gpu_streaming_multiprocessor_warp_occupancy_fraction=None, gpu_tensor_core_pipe_active_cycle_fraction=None, gpu_tensor_core_half_precision_mma_active_cycle_fraction=None, gpu_tensor_core_integer_mma_active_cycle_fraction=None, gpu_tensor_core_double_precision_fma_active_cycle_fraction=None, gpu_tensor_core_double_mma_active_cycle_fraction=None, gpu_tensor_core_quarter_mma_active_cycle_fraction=None, gpu_cuda_core_floating_point_64bit_pipe_active_cycle_fraction=None, gpu_cuda_core_floating_point_32bit_pipe_active_cycle_fraction=None, gpu_cuda_core_floating_point_16bit_pipe_active_cycle_fraction=None, gpu_graphics_compute_engine_active_cycle_fraction=None, gpu_dram_controller_active_cycle_fraction=None, gpu_memory_copy_engine_busy_time_fraction=None, gpu_pcie_transmit_throughput_bytes_per_second=None, gpu_pcie_receive_throughput_bytes_per_second=None, gpu_pcie_replay_count=None, gpu_nvlink_aggregate_throughput_bytes_per_second=None, gpu_board_power_draw_watts=None, gpu_board_total_energy_joules=None, gpu_board_enforced_power_limit_watts=None, gpu_board_default_power_limit_watts=None, 
gpu_die_temperature_celsius=None, gpu_memory_die_temperature_celsius=None, gpu_streaming_multiprocessor_clock_frequency_megahertz=None, gpu_memory_clock_frequency_megahertz=None, gpu_framebuffer_used_mebibytes=None, gpu_framebuffer_free_mebibytes=None, gpu_framebuffer_reserved_mebibytes=None, gpu_framebuffer_total_mebibytes=None, gpu_nvml_busy_time_fraction=None, gpu_xid_error_count=None, gpu_uncorrectable_remapped_row_count=None, gpu_correctable_remapped_row_count=None, gpu_row_remap_failure_flag=None)
- Parameters:
record_schema_version (int)
record_timestamp_monotonic_seconds (float)
record_timestamp_wallclock_unix_seconds (float)
record_aggregation_mode (AggregationMode)
record_window_microseconds (int)
record_freshness_microseconds (int)
record_provenance (Provenance)
record_hostname (str)
entity_gpu_index (int)
entity_gpu_uuid (str)
entity_mig_instance_index (int | None)
entity_process_id (int | None)
entity_process_command_line_truncated (str | None)
record_slurm_job_id (str | None)
record_slurm_step_id (str | None)
record_slurm_array_job_id (str | None)
record_slurm_array_task_id (str | None)
record_slurm_restart_count (int | None)
record_node_index_in_job (int | None)
record_mpi_rank (int | None)
record_capture_clock_offset_microseconds (int | None)
record_user_annotation_iteration_index (int | None)
record_user_annotation_phase_label (str | None)
record_user_annotation_step_count (int | None)
record_user_annotation_request_id (str | None)
record_user_annotation_token_count (int | None)
gpu_streaming_multiprocessor_active_cycle_fraction (float | None)
gpu_streaming_multiprocessor_warp_occupancy_fraction (float | None)
gpu_tensor_core_pipe_active_cycle_fraction (float | None)
gpu_tensor_core_half_precision_mma_active_cycle_fraction (float | None)
gpu_tensor_core_integer_mma_active_cycle_fraction (float | None)
gpu_tensor_core_double_precision_fma_active_cycle_fraction (float | None)
gpu_tensor_core_double_mma_active_cycle_fraction (float | None)
gpu_tensor_core_quarter_mma_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_64bit_pipe_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_32bit_pipe_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_16bit_pipe_active_cycle_fraction (float | None)
gpu_graphics_compute_engine_active_cycle_fraction (float | None)
gpu_dram_controller_active_cycle_fraction (float | None)
gpu_memory_copy_engine_busy_time_fraction (float | None)
gpu_pcie_transmit_throughput_bytes_per_second (float | None)
gpu_pcie_receive_throughput_bytes_per_second (float | None)
gpu_pcie_replay_count (int | None)
gpu_nvlink_aggregate_throughput_bytes_per_second (float | None)
gpu_board_power_draw_watts (float | None)
gpu_board_total_energy_joules (float | None)
gpu_board_enforced_power_limit_watts (float | None)
gpu_board_default_power_limit_watts (float | None)
gpu_die_temperature_celsius (float | None)
gpu_memory_die_temperature_celsius (float | None)
gpu_streaming_multiprocessor_clock_frequency_megahertz (float | None)
gpu_memory_clock_frequency_megahertz (float | None)
gpu_framebuffer_used_mebibytes (float | None)
gpu_framebuffer_free_mebibytes (float | None)
gpu_framebuffer_reserved_mebibytes (float | None)
gpu_framebuffer_total_mebibytes (float | None)
gpu_nvml_busy_time_fraction (float | None)
gpu_xid_error_count (int | None)
gpu_uncorrectable_remapped_row_count (int | None)
gpu_correctable_remapped_row_count (int | None)
gpu_row_remap_failure_flag (bool | None)
- Return type:
None
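The required and optional blocks of the constructor behave like a dataclass with defaulted trailing fields. The sketch below assumes that shape (the documentation does not state how `CanonicalRecord` is implemented) and uses a hypothetical three-field stand-in, `MiniRecord`, so that it runs without the package installed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniRecord:
    """Hypothetical three-field stand-in for CanonicalRecord."""

    # Required block: no defaults, so the constructor rejects omissions.
    record_hostname: str
    entity_gpu_index: int
    # Optional block: stays None until a source provides a value.
    gpu_board_power_draw_watts: Optional[float] = None


record = MiniRecord(record_hostname="node01", entity_gpu_index=0)

try:
    MiniRecord(record_hostname="node01")  # entity_gpu_index is missing
    missing_required_rejected = False
except TypeError:
    missing_required_rejected = True
```

Consumers of the optional block would typically test `is None` rather than truthiness, since `0.0` is a legitimate reading and must not be confused with an absent one.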
- class kempnerpulse.translate.AggregationMode
Bases: Enum
How a record’s metric values are integrated over time.
- POINT = 'point'
- WINDOW = 'window'
- class kempnerpulse.translate.Provenance
Bases: Enum
Where a record came from.
- DCGMI = 'dcgmi'
- PROMETHEUS = 'prometheus'
- NVML_FALLBACK = 'nvml_fallback'
- REPLAY = 'replay'
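Because both enums carry string values, members can be recovered from their serialized form with the standard `Enum` call syntax. The sketch below redefines the two enums locally, mirroring the members listed above, purely so that it runs without the package; with the package installed, the same lookups would presumably apply to the real classes.

```python
from enum import Enum


class AggregationMode(Enum):
    """Local mirror of the documented members, for illustration only."""

    POINT = "point"
    WINDOW = "window"


class Provenance(Enum):
    """Local mirror of the documented members, for illustration only."""

    DCGMI = "dcgmi"
    PROMETHEUS = "prometheus"
    NVML_FALLBACK = "nvml_fallback"
    REPLAY = "replay"


# Look a member up by its serialized value, e.g. when reading stored records.
mode = AggregationMode("window")
source = Provenance("nvml_fallback")

# Unknown values raise ValueError rather than falling back to a default.
try:
    Provenance("unknown")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```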
- exception kempnerpulse.translate.TranslateError
Bases: ValueError
A canonical record violated a schema invariant (see validate).
- kempnerpulse.translate.canonical_field_names()
Every CanonicalRecord field name, in declaration order.
- Return type:
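One plausible way to obtain field names in declaration order, assuming the record type is a dataclass, is `dataclasses.fields`. The sketch below is not the library's implementation: `MiniRecord` and `field_names` are hypothetical, and the tuple return type is a choice of this example, since the documented return type is not shown here.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class MiniRecord:
    """Hypothetical three-field stand-in for CanonicalRecord."""

    record_hostname: str
    entity_gpu_index: int
    gpu_board_power_draw_watts: Optional[float] = None


def field_names(cls: type) -> Tuple[str, ...]:
    """Return the dataclass field names of cls in declaration order."""
    return tuple(f.name for f in fields(cls))


names = field_names(MiniRecord)
```

A stable, ordered list of names like this is useful for writing column headers or checking that a serialized record matches the schema.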
- class kempnerpulse.translate.Translator
Bases: object
Maps RawRecord objects to CanonicalRecord objects under a fixed SourceContext.
- __init__(ctx)
- Parameters:
ctx (SourceContext)
- Return type:
None
- translate(raw)
Translate one record. Returns None for non-GPU entities.
- Parameters:
raw (RawRecord)
- Return type:
CanonicalRecord | None
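Since `translate` may return `None`, callers that process a stream of raw records need to drop those results. The following sketch shows one way to do that with a generic helper; `translate_all` and `fake_translate` are hypothetical, and `fake_translate` only imitates the documented "None for non-GPU entities" behaviour with plain dictionaries.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

Raw = TypeVar("Raw")
Canonical = TypeVar("Canonical")


def translate_all(
    translate: Callable[[Raw], Optional[Canonical]],
    raws: Iterable[Raw],
) -> Iterator[Canonical]:
    """Yield translated records, skipping inputs that translate to None."""
    for raw in raws:
        record = translate(raw)
        if record is not None:
            yield record


def fake_translate(raw: dict) -> Optional[str]:
    """Hypothetical stand-in for Translator.translate: only GPU entities map."""
    return raw["name"].upper() if raw["kind"] == "gpu" else None


translated = list(
    translate_all(
        fake_translate,
        [{"kind": "gpu", "name": "gpu0"}, {"kind": "cpu", "name": "cpu0"}],
    )
)
```

With a real translator, the bound method `translator.translate` could be passed in place of `fake_translate`.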
- kempnerpulse.translate.make_translator(backend, **context_kwargs)
Construct a Translator with a backend-derived SourceContext.
- Parameters:
backend (BackendKind)
- Return type:
- class kempnerpulse.translate.SourceContext
Bases: object
Static context resolved at startup; frozen for the process lifetime.
- backend: BackendKind
- provenance: Provenance
- aggregation_mode: AggregationMode
- __init__(backend, provenance, aggregation_mode, window_microseconds, hostname, gpu_uuid_by_index=<factory>, gpu_model_by_index=<factory>, slurm_metadata=<factory>)
- Parameters:
backend (BackendKind)
provenance (Provenance)
aggregation_mode (AggregationMode)
window_microseconds (int)
hostname (str)
- Return type:
None
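The `<factory>` defaults in the signature are how a dataclass renders `field(default_factory=...)`, and "frozen" suggests an immutable instance. The sketch below assumes a frozen dataclass, which the documentation does not confirm; `MiniContext` is a hypothetical two-field stand-in for `SourceContext`.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MiniContext:
    """Hypothetical stand-in for SourceContext with two of its fields."""

    hostname: str
    # default_factory gives each instance its own dict (rendered as <factory>).
    gpu_uuid_by_index: Dict[int, str] = field(default_factory=dict)


ctx = MiniContext(hostname="node01")
other = MiniContext(hostname="node02")

try:
    ctx.hostname = "changed"  # frozen: attribute assignment is rejected
    reassignment_rejected = False
except FrozenInstanceError:
    reassignment_rejected = True
```

Note that `frozen=True` only blocks attribute reassignment; the contents of a dict field can still be mutated, so a context meant to stay fixed should be treated as read-only by convention as well.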
- kempnerpulse.translate.make_source_context(backend, *, hostname=None, gpu_uuid_by_index=None, gpu_model_by_index=None, slurm_metadata=None)
Build a SourceContext with backend-derived provenance and aggregation mode.
Modules
- Per-process source context for translation (resolved once at startup).
- Source-vocabulary → canonical-vocabulary mapping and unit normalization.
- The canonical schema: the inter-layer contract.
- Layer 2 orchestrator: turn a