kempnerpulse.reader.preflight

Layer 1 (Read) — preflight checks for the direct DCGM backend.

Verifies that dcgmi is present and the DCGM host engine answers before a stream is opened, and returns the discovery output so GPU IDs can be resolved. Failures raise typed ReaderError objects carrying an actionable remediation string rather than leaving a cryptic subprocess error to surface later.

Functions

probe_dcgmi([timeout])

Run dcgmi discovery -l and return its stdout.

kempnerpulse.reader.preflight.probe_dcgmi(timeout=10.0)

Run dcgmi discovery -l and return its stdout.

Raises:

HostEngineUnavailableError: dcgmi is missing, the command timed out, or the host engine returned a non-zero status.

Parameters:

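timeout (float): how long to wait, in seconds, for dcgmi discovery -l to finish before the probe is treated as failed. Defaults to 10.0.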

Return type:

str
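The behaviour described above can be illustrated with a minimal sketch. It is not the package's implementation: the classes below are stand-ins for the documented exceptions, the `remediation` attribute name is an assumption based on the module description, and `probe_sketch` and its `cmd` parameter are hypothetical names added so the sketch can be exercised on a machine without DCGM.

```python
import subprocess


class ReaderError(Exception):
    """Stand-in base error; the `remediation` attribute is an assumption."""

    def __init__(self, message, remediation=""):
        super().__init__(message)
        self.remediation = remediation


class HostEngineUnavailableError(ReaderError):
    """Stand-in for the documented exception of the same name."""


def probe_sketch(timeout=10.0, cmd=("dcgmi", "discovery", "-l")):
    """Run the discovery command and return its stdout, or raise a typed error.

    `cmd` is a hypothetical parameter; the documented function takes only
    `timeout`.
    """
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError as exc:
        # The executable is not installed or not on PATH.
        raise HostEngineUnavailableError(
            f"{cmd[0]} not found on PATH",
            remediation="Install DCGM or add dcgmi to PATH.",
        ) from exc
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child process before raising this.
        raise HostEngineUnavailableError(
            f"{cmd[0]} did not finish within {timeout} s",
            remediation="Check that nv-hostengine is running.",
        ) from exc
    if result.returncode != 0:
        # The command ran but reported a failure.
        raise HostEngineUnavailableError(
            f"{cmd[0]} exited with status {result.returncode}: "
            f"{result.stderr.strip()}",
            remediation="Check that nv-hostengine is running.",
        )
    return result.stdout
```

Funnelling all three failure modes into one exception type lets a caller handle "DCGM is not usable" in a single `except` clause, while the message and remediation text still say which case occurred.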

exception kempnerpulse.reader.preflight.HostEngineUnavailableError

Bases: ReaderError

nv-hostengine is not reachable on the local socket.

exception kempnerpulse.reader.preflight.DcgmStreamError

Bases: ReaderError

The dcgmi dmon streaming subprocess failed or exited unexpectedly.
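Because both exceptions derive from ReaderError, a caller can tell a preflight failure apart from a streaming failure. The sketch below shows one possible way to do that. The classes are stand-ins for the documented ones, and `run_reader`, its `probe` and `stream` callables, and the exit codes are hypothetical names and values chosen for illustration.

```python
class ReaderError(Exception):
    """Stand-in for the package's base error."""


class HostEngineUnavailableError(ReaderError):
    """Stand-in: the host engine could not be reached during preflight."""


class DcgmStreamError(ReaderError):
    """Stand-in: the streaming subprocess failed or exited unexpectedly."""


def run_reader(probe, stream):
    """Call probe() before stream(); return a (status, message) pair.

    `probe` takes no arguments and returns the discovery output.
    `stream` takes that output. Both are hypothetical callables, and the
    status codes 0, 2, and 3 are arbitrary.
    """
    try:
        discovery = probe()
    except HostEngineUnavailableError as exc:
        # Preflight failed, so no stream is opened at all.
        return 2, f"preflight failed: {exc}"
    try:
        stream(discovery)
    except DcgmStreamError as exc:
        # Preflight passed, but the stream broke afterwards.
        return 3, f"stream failed: {exc}"
    return 0, "ok"
```

Running the probe first means a missing host engine is reported before any streaming subprocess is started, which matches the module's stated purpose.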