kempnerpulse.translate

KempnerPulse Layer 2 — Translate.

Maps Layer-1 RawRecord objects (source vocabulary, raw values) to CanonicalRecord objects (one stable internal vocabulary). This package defines the canonical schema, which is the inter-layer contract that Layers 3 and 4 depend on. The translator that produces canonical records (mapping tables, unit normalization, missing-value policy, counter differencing) builds on top of that schema.

class kempnerpulse.translate.CanonicalRecord

Bases: object

One fully-translated reading for one entity, in canonical vocabulary.

All Optional[float] metric fields are None unless the source provided them. The required block (no defaults) is the metadata every record must carry; the optional block is the per-subsystem readings plus cluster and reserved metadata that tolerate absence.

record_schema_version: int

record_timestamp_monotonic_seconds: float

record_timestamp_wallclock_unix_seconds: float

record_aggregation_mode: AggregationMode

record_window_microseconds: int

record_freshness_microseconds: int

record_provenance: Provenance

record_hostname: str

entity_gpu_index: int

entity_gpu_uuid: str

entity_mig_instance_index: int | None = None

entity_process_id: int | None = None

entity_process_command_line_truncated: str | None = None

record_slurm_job_id: str | None = None

record_slurm_step_id: str | None = None

record_slurm_array_job_id: str | None = None

record_slurm_array_task_id: str | None = None

record_slurm_restart_count: int | None = None

record_node_index_in_job: int | None = None

record_mpi_rank: int | None = None

record_capture_clock_offset_microseconds: int | None = None

record_user_annotation_iteration_index: int | None = None

record_user_annotation_phase_label: str | None = None

record_user_annotation_step_count: int | None = None

record_user_annotation_request_id: str | None = None

record_user_annotation_token_count: int | None = None

gpu_streaming_multiprocessor_active_cycle_fraction: float | None = None

gpu_streaming_multiprocessor_warp_occupancy_fraction: float | None = None

gpu_tensor_core_pipe_active_cycle_fraction: float | None = None

gpu_tensor_core_half_precision_mma_active_cycle_fraction: float | None = None

gpu_tensor_core_integer_mma_active_cycle_fraction: float | None = None

gpu_tensor_core_double_precision_fma_active_cycle_fraction: float | None = None

gpu_tensor_core_double_mma_active_cycle_fraction: float | None = None

gpu_tensor_core_quarter_mma_active_cycle_fraction: float | None = None

gpu_cuda_core_floating_point_64bit_pipe_active_cycle_fraction: float | None = None

gpu_cuda_core_floating_point_32bit_pipe_active_cycle_fraction: float | None = None

gpu_cuda_core_floating_point_16bit_pipe_active_cycle_fraction: float | None = None

gpu_graphics_compute_engine_active_cycle_fraction: float | None = None

gpu_dram_controller_active_cycle_fraction: float | None = None

gpu_memory_copy_engine_busy_time_fraction: float | None = None

gpu_pcie_transmit_throughput_bytes_per_second: float | None = None

gpu_pcie_receive_throughput_bytes_per_second: float | None = None

gpu_pcie_replay_count: int | None = None

gpu_nvlink_aggregate_throughput_bytes_per_second: float | None = None

gpu_board_power_draw_watts: float | None = None

gpu_board_total_energy_joules: float | None = None

gpu_board_enforced_power_limit_watts: float | None = None

gpu_board_default_power_limit_watts: float | None = None

gpu_die_temperature_celsius: float | None = None

gpu_memory_die_temperature_celsius: float | None = None

gpu_streaming_multiprocessor_clock_frequency_megahertz: float | None = None

gpu_memory_clock_frequency_megahertz: float | None = None

gpu_framebuffer_used_mebibytes: float | None = None

gpu_framebuffer_free_mebibytes: float | None = None

gpu_framebuffer_reserved_mebibytes: float | None = None

gpu_framebuffer_total_mebibytes: float | None = None

gpu_nvml_busy_time_fraction: float | None = None

gpu_xid_error_count: int | None = None

gpu_uncorrectable_remapped_row_count: int | None = None

gpu_correctable_remapped_row_count: int | None = None

gpu_row_remap_failure_flag: bool | None = None

validate()

Raise TranslateError if any single-record invariant is violated.

Only single-record invariants are checked here. The cross-record invariant (energy is monotonically non-decreasing per entity) is enforced upstream by the Translate differencer, the only component that sees the sequence; it cannot be checked from one record in isolation.

Return type: None
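The split between single-record and cross-record checks can be illustrated with a minimal sketch. Everything here is a stand-in: `MiniRecord`, `check_single_record`, and `EnergyMonotonicityChecker` are hypothetical names, and the specific single-record invariant (a fraction must lie in [0, 1]) is an assumption, since this page does not list the invariants that validate() actually enforces.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class TranslateError(ValueError):
    """Stand-in for kempnerpulse.translate.TranslateError."""


@dataclass
class MiniRecord:
    # Hypothetical three-field subset of CanonicalRecord.
    entity_gpu_uuid: str
    gpu_nvml_busy_time_fraction: Optional[float] = None
    gpu_board_total_energy_joules: Optional[float] = None


def check_single_record(record: MiniRecord) -> None:
    """Assumed invariant: a fraction field, when present, lies in [0, 1]."""
    value = record.gpu_nvml_busy_time_fraction
    if value is not None and not 0.0 <= value <= 1.0:
        raise TranslateError(f"fraction out of range: {value}")


class EnergyMonotonicityChecker:
    """Cross-record check: needs the previous reading for each entity."""

    def __init__(self) -> None:
        self._last_energy_by_uuid: Dict[str, float] = {}

    def check(self, record: MiniRecord) -> None:
        energy = record.gpu_board_total_energy_joules
        if energy is None:
            return  # nothing to compare for this record
        previous = self._last_energy_by_uuid.get(record.entity_gpu_uuid)
        if previous is not None and energy < previous:
            raise TranslateError(f"energy decreased: {previous} -> {energy}")
        self._last_energy_by_uuid[record.entity_gpu_uuid] = energy
```

The first function needs only the record it is given; the second has to carry state between calls, which is why a check of that kind cannot live in a per-record method.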

__init__(record_schema_version, record_timestamp_monotonic_seconds, record_timestamp_wallclock_unix_seconds, record_aggregation_mode, record_window_microseconds, record_freshness_microseconds, record_provenance, record_hostname, entity_gpu_index, entity_gpu_uuid, entity_mig_instance_index=None, entity_process_id=None, entity_process_command_line_truncated=None, record_slurm_job_id=None, record_slurm_step_id=None, record_slurm_array_job_id=None, record_slurm_array_task_id=None, record_slurm_restart_count=None, record_node_index_in_job=None, record_mpi_rank=None, record_capture_clock_offset_microseconds=None, record_user_annotation_iteration_index=None, record_user_annotation_phase_label=None, record_user_annotation_step_count=None, record_user_annotation_request_id=None, record_user_annotation_token_count=None, gpu_streaming_multiprocessor_active_cycle_fraction=None, gpu_streaming_multiprocessor_warp_occupancy_fraction=None, gpu_tensor_core_pipe_active_cycle_fraction=None, gpu_tensor_core_half_precision_mma_active_cycle_fraction=None, gpu_tensor_core_integer_mma_active_cycle_fraction=None, gpu_tensor_core_double_precision_fma_active_cycle_fraction=None, gpu_tensor_core_double_mma_active_cycle_fraction=None, gpu_tensor_core_quarter_mma_active_cycle_fraction=None, gpu_cuda_core_floating_point_64bit_pipe_active_cycle_fraction=None, gpu_cuda_core_floating_point_32bit_pipe_active_cycle_fraction=None, gpu_cuda_core_floating_point_16bit_pipe_active_cycle_fraction=None, gpu_graphics_compute_engine_active_cycle_fraction=None, gpu_dram_controller_active_cycle_fraction=None, gpu_memory_copy_engine_busy_time_fraction=None, gpu_pcie_transmit_throughput_bytes_per_second=None, gpu_pcie_receive_throughput_bytes_per_second=None, gpu_pcie_replay_count=None, gpu_nvlink_aggregate_throughput_bytes_per_second=None, gpu_board_power_draw_watts=None, gpu_board_total_energy_joules=None, gpu_board_enforced_power_limit_watts=None, gpu_board_default_power_limit_watts=None, gpu_die_temperature_celsius=None, gpu_memory_die_temperature_celsius=None, gpu_streaming_multiprocessor_clock_frequency_megahertz=None, gpu_memory_clock_frequency_megahertz=None, gpu_framebuffer_used_mebibytes=None, gpu_framebuffer_free_mebibytes=None, gpu_framebuffer_reserved_mebibytes=None, gpu_framebuffer_total_mebibytes=None, gpu_nvml_busy_time_fraction=None, gpu_xid_error_count=None, gpu_uncorrectable_remapped_row_count=None, gpu_correctable_remapped_row_count=None, gpu_row_remap_failure_flag=None)

Parameters:

record_schema_version (int)
record_timestamp_monotonic_seconds (float)
record_timestamp_wallclock_unix_seconds (float)
record_aggregation_mode (AggregationMode)
record_window_microseconds (int)
record_freshness_microseconds (int)
record_provenance (Provenance)
record_hostname (str)
entity_gpu_index (int)
entity_gpu_uuid (str)
entity_mig_instance_index (int | None)
entity_process_id (int | None)
entity_process_command_line_truncated (str | None)
record_slurm_job_id (str | None)
record_slurm_step_id (str | None)
record_slurm_array_job_id (str | None)
record_slurm_array_task_id (str | None)
record_slurm_restart_count (int | None)
record_node_index_in_job (int | None)
record_mpi_rank (int | None)
record_capture_clock_offset_microseconds (int | None)
record_user_annotation_iteration_index (int | None)
record_user_annotation_phase_label (str | None)
record_user_annotation_step_count (int | None)
record_user_annotation_request_id (str | None)
record_user_annotation_token_count (int | None)
gpu_streaming_multiprocessor_active_cycle_fraction (float | None)
gpu_streaming_multiprocessor_warp_occupancy_fraction (float | None)
gpu_tensor_core_pipe_active_cycle_fraction (float | None)
gpu_tensor_core_half_precision_mma_active_cycle_fraction (float | None)
gpu_tensor_core_integer_mma_active_cycle_fraction (float | None)
gpu_tensor_core_double_precision_fma_active_cycle_fraction (float | None)
gpu_tensor_core_double_mma_active_cycle_fraction (float | None)
gpu_tensor_core_quarter_mma_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_64bit_pipe_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_32bit_pipe_active_cycle_fraction (float | None)
gpu_cuda_core_floating_point_16bit_pipe_active_cycle_fraction (float | None)
gpu_graphics_compute_engine_active_cycle_fraction (float | None)
gpu_dram_controller_active_cycle_fraction (float | None)
gpu_memory_copy_engine_busy_time_fraction (float | None)
gpu_pcie_transmit_throughput_bytes_per_second (float | None)
gpu_pcie_receive_throughput_bytes_per_second (float | None)
gpu_pcie_replay_count (int | None)
gpu_nvlink_aggregate_throughput_bytes_per_second (float | None)
gpu_board_power_draw_watts (float | None)
gpu_board_total_energy_joules (float | None)
gpu_board_enforced_power_limit_watts (float | None)
gpu_board_default_power_limit_watts (float | None)
gpu_die_temperature_celsius (float | None)
gpu_memory_die_temperature_celsius (float | None)
gpu_streaming_multiprocessor_clock_frequency_megahertz (float | None)
gpu_memory_clock_frequency_megahertz (float | None)
gpu_framebuffer_used_mebibytes (float | None)
gpu_framebuffer_free_mebibytes (float | None)
gpu_framebuffer_reserved_mebibytes (float | None)
gpu_framebuffer_total_mebibytes (float | None)
gpu_nvml_busy_time_fraction (float | None)
gpu_xid_error_count (int | None)
gpu_uncorrectable_remapped_row_count (int | None)
gpu_correctable_remapped_row_count (int | None)
gpu_row_remap_failure_flag (bool | None)

Return type:

None
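Because an absent metric is None rather than 0.0, consumers need to test for presence explicitly. The sketch below shows one way to do that; `MiniRecord` and `mean_power_watts` are hypothetical names that use a two-field subset of CanonicalRecord, not part of the package.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class MiniRecord:
    # Hypothetical two-field subset of CanonicalRecord.
    entity_gpu_index: int
    gpu_board_power_draw_watts: Optional[float] = None


def mean_power_watts(records: Iterable[MiniRecord]) -> Optional[float]:
    """Average only the readings that are present; None if there are none."""
    present = [
        r.gpu_board_power_draw_watts
        for r in records
        # "is not None" keeps a genuine 0.0 reading; a truthiness test would drop it.
        if r.gpu_board_power_draw_watts is not None
    ]
    if not present:
        return None
    return sum(present) / len(present)
```

Returning None for an empty selection keeps "no data" distinguishable from "zero watts" all the way to the caller.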

class kempnerpulse.translate.AggregationMode

Bases: Enum

How a record’s metric values are integrated over time.

POINT = 'point'

WINDOW = 'window'

class kempnerpulse.translate.Provenance

Bases: Enum

Where a record came from.

DCGMI = 'dcgmi'

PROMETHEUS = 'prometheus'

NVML_FALLBACK = 'nvml_fallback'

REPLAY = 'replay'
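Since both enums carry string values, a member can be written out as its value and recovered with the standard Enum lookup-by-value call. The sketch below redefines the two enums locally with the members listed above so that it runs on its own; `parse_provenance` is a hypothetical helper, not part of the package.

```python
from enum import Enum


class AggregationMode(Enum):
    POINT = "point"
    WINDOW = "window"


class Provenance(Enum):
    DCGMI = "dcgmi"
    PROMETHEUS = "prometheus"
    NVML_FALLBACK = "nvml_fallback"
    REPLAY = "replay"


def parse_provenance(text: str) -> Provenance:
    """Look up a member by its string value; raises ValueError if unknown."""
    return Provenance(text)


# Round trip: member -> string value -> member.
serialized = Provenance.NVML_FALLBACK.value
restored = parse_provenance(serialized)
```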

exception kempnerpulse.translate.TranslateError

Bases: ValueError

A canonical record violated a schema invariant (see CanonicalRecord.validate).

kempnerpulse.translate.canonical_field_names()

Every CanonicalRecord field name, in declaration order.

Return type: tuple
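One plausible way to obtain field names in declaration order is `dataclasses.fields`, which reports fields in the order they were defined. This is a sketch under the assumption that CanonicalRecord is a dataclass, which this page does not state explicitly; `MiniCanonicalRecord` and `mini_canonical_field_names` are hypothetical stand-ins.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class MiniCanonicalRecord:
    # Required block: no defaults, so it must come first in a dataclass.
    record_schema_version: int
    entity_gpu_index: int
    # Optional block: every field defaults to None.
    gpu_board_power_draw_watts: Optional[float] = None
    gpu_die_temperature_celsius: Optional[float] = None


def mini_canonical_field_names() -> Tuple[str, ...]:
    """Field names in declaration order, as dataclasses.fields reports them."""
    return tuple(f.name for f in fields(MiniCanonicalRecord))
```

A tuple in a stable order is convenient as a CSV header or column list, because every writer and reader then agrees on the column positions.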

class kempnerpulse.translate.Translator

Bases: object

Maps RawRecord objects to CanonicalRecord objects under a fixed SourceContext.

__init__(ctx)

Parameters: ctx (SourceContext)
Return type: None

translate(raw)

Translate one record. Returns None for non-GPU entities.

Parameters: raw (RawRecord)
Return type: CanonicalRecord | None

translate_tick(records)

Translate a tick’s worth of records, dropping non-GPU entities.

Parameters: records (Iterable[RawRecord])
Return type: List[CanonicalRecord]
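The relationship between translate and translate_tick can be sketched as a per-record mapping followed by a filter on None. The types below are hypothetical stand-ins; in particular the `entity_kind` field is an assumption, since this page does not say how the real Translator recognizes a non-GPU entity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class MiniRawRecord:
    entity_kind: str  # hypothetical discriminator, e.g. "gpu" or "cpu"
    gpu_index: int


@dataclass
class MiniCanonicalRecord:
    entity_gpu_index: int


class MiniTranslator:
    def translate(self, raw: MiniRawRecord) -> Optional[MiniCanonicalRecord]:
        """Translate one record; None signals a non-GPU entity."""
        if raw.entity_kind != "gpu":
            return None
        return MiniCanonicalRecord(entity_gpu_index=raw.gpu_index)

    def translate_tick(
        self, records: Iterable[MiniRawRecord]
    ) -> List[MiniCanonicalRecord]:
        """Translate every record in a tick and drop the None results."""
        translated = (self.translate(raw) for raw in records)
        return [record for record in translated if record is not None]
```

With this shape a caller that processes whole ticks never has to handle None, while a caller that processes single records still can.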

kempnerpulse.translate.make_translator(backend, **context_kwargs)

Construct a Translator with a backend-derived SourceContext.

Parameters: backend (BackendKind)
Return type: Translator

class kempnerpulse.translate.SourceContext

Bases: object

Static context resolved at startup; frozen for the process lifetime.

backend: BackendKind

provenance: Provenance

aggregation_mode: AggregationMode

window_microseconds: int

hostname: str

gpu_uuid_by_index: Dict[int, str]

gpu_model_by_index: Dict[int, str]

slurm_metadata: Dict[str, object]

__init__(backend, provenance, aggregation_mode, window_microseconds, hostname, gpu_uuid_by_index=<factory>, gpu_model_by_index=<factory>, slurm_metadata=<factory>)

Parameters:

backend (BackendKind)
provenance (Provenance)
aggregation_mode (AggregationMode)
window_microseconds (int)
hostname (str)
gpu_uuid_by_index (Dict[int, str])
gpu_model_by_index (Dict[int, str])
slurm_metadata (Dict[str, object])

Return type:

None

kempnerpulse.translate.make_source_context(backend, *, hostname=None, gpu_uuid_by_index=None, gpu_model_by_index=None, slurm_metadata=None)

Build a SourceContext with backend-derived provenance/aggregation.

Parameters:

backend (BackendKind)
hostname (str | None)
gpu_uuid_by_index (Dict[int, str] | None)
gpu_model_by_index (Dict[int, str] | None)
slurm_metadata (Dict[str, object] | None)

Return type:

SourceContext
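A context object with factory defaults and a keyword-only builder can be sketched as follows. This is an illustration under several assumptions: `MiniSourceContext` and `make_mini_source_context` are hypothetical names, `backend` is a plain string standing in for BackendKind, `frozen=True` is only one possible reading of "frozen for the process lifetime", and falling back to `socket.gethostname()` is a guess at how a missing hostname might be resolved. The backend-derived provenance and aggregation mode are left out because this page does not describe that mapping.

```python
import socket
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class MiniSourceContext:
    backend: str  # stand-in for BackendKind
    hostname: str
    # default_factory gives each instance its own dict.
    gpu_uuid_by_index: Dict[int, str] = field(default_factory=dict)


def make_mini_source_context(
    backend: str,
    *,
    hostname: Optional[str] = None,
    gpu_uuid_by_index: Optional[Dict[int, str]] = None,
) -> MiniSourceContext:
    """Fill in defaults for anything the caller left as None."""
    return MiniSourceContext(
        backend=backend,
        # Assumption: an unspecified hostname falls back to the local one.
        hostname=hostname if hostname is not None else socket.gethostname(),
        # Copy so later changes to the caller's dict do not leak in.
        gpu_uuid_by_index=dict(gpu_uuid_by_index or {}),
    )
```

Note that `frozen=True` blocks attribute reassignment only; the dict held in a field can still be mutated in place, so a stricter design would store an immutable mapping.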

Modules

`context`	Per-process source context for translation (resolved once at startup).
`mapping`	Source-vocabulary → canonical-vocabulary mapping and unit normalization.
`schema`	The canonical schema — the inter-layer contract.
`translator`	Layer 2 orchestrator — turn a `RawRecord` into a `CanonicalRecord`.