kempnerpulse.system_queries
Cross-cutting tier — per-sample host/process queries (best-effort).
These run on every sampling tick to enrich the display with host CPU/RAM load
and the GPU compute processes. Unlike the reader layer (which raises typed
errors so the lifecycle can surface remediation), everything here is
best-effort: any missing command, permission error, timeout, or non-zero exit
degrades to an empty / None result and is never raised. A monitoring tool
must keep rendering GPU metrics even when host introspection is unavailable.
Runtime dependencies are the standard library only.
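The best-effort contract described above can be illustrated with a small wrapper around `subprocess.run`. This is a minimal sketch, not the module's actual code: the helper name `run_best_effort` and the 2-second default timeout are assumptions made for illustration.

```python
import subprocess
from typing import Optional, Sequence


def run_best_effort(cmd: Sequence[str], timeout: float = 2.0) -> Optional[str]:
    """Return the command's stdout, or None on any failure (hypothetical helper)."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # a non-zero exit is handled below, never raised
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers a missing binary (FileNotFoundError) and
        # permission problems (PermissionError).
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout
```

A caller treats `None` as "no data for this tick" and keeps rendering whatever else it has.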
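Functions
query_gpu_processes(bus_id_to_index) | List GPU compute processes via nvidia-smi, keyed by GPU index (best-effort).
query_system_ram() | Return (used_gb, total_gb) from /proc/meminfo (best-effort).
Classes
CpuSampler | Stateful sampler for host CPU load from /proc/stat.
GpuProcess | A single compute process running on a GPU.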
- class kempnerpulse.system_queries.GpuProcess
Bases: object
A single compute process running on a GPU.
- class kempnerpulse.system_queries.CpuSampler
Bases: object
Stateful sampler for host CPU load from /proc/stat.
Each sample() reads /proc/stat and diffs against the previous snapshot to compute utilization, so the first call (no prior snapshot) returns None for the percentage and busy-core count. The logical-CPU count and the (Slurm-aware) physical core count are cached on the instance; no module- or function-level state is used.
- sample()
Return (num_threads, num_cores, cpu_percent, busy_cores).
num_threads is os.cpu_count() (logical CPUs); num_cores is the nproc --all total; cpu_percent is overall utilization over the interval since the previous call; busy_cores counts cores above the busy threshold. cpu_percent and busy_cores are None on the first call and whenever /proc/stat is unreadable.
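The snapshot-and-diff idea behind sample() can be sketched as follows. This is an illustrative approximation under stated assumptions, not the class's real implementation: `CpuSketch`, `parse_proc_stat`, and `utilization` are hypothetical names, the 50% busy threshold is a placeholder (the source does not state the value), and idle time is taken as idle + iowait. The sketch takes the /proc/stat text as an argument so it can run without a Linux host.

```python
from typing import Dict, Optional, Tuple

BUSY_THRESHOLD = 50.0  # assumed value; the source does not state the threshold


def parse_proc_stat(text: str) -> Dict[str, Tuple[int, int]]:
    """Map 'cpu', 'cpu0', ... to (idle_ticks, total_ticks)."""
    snapshot: Dict[str, Tuple[int, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("cpu"):
            continue
        try:
            # user, nice, system, idle, iowait, irq, softirq, steal
            ticks = [int(value) for value in fields[1:9]]
        except ValueError:
            continue
        if len(ticks) < 4:
            continue
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        snapshot[fields[0]] = (idle, sum(ticks))
    return snapshot


def utilization(prev: Tuple[int, int], curr: Tuple[int, int]) -> Optional[float]:
    """Percent of non-idle ticks between two (idle, total) snapshots."""
    d_idle = curr[0] - prev[0]
    d_total = curr[1] - prev[1]
    if d_total <= 0:
        return None
    return 100.0 * (d_total - d_idle) / d_total


class CpuSketch:
    """Keeps the previous snapshot on the instance, as the text describes."""

    def __init__(self) -> None:
        self._prev: Optional[Dict[str, Tuple[int, int]]] = None

    def sample_text(self, text: str) -> Tuple[Optional[float], Optional[int]]:
        curr = parse_proc_stat(text)
        prev, self._prev = self._prev, curr
        if not prev or "cpu" not in prev or "cpu" not in curr:
            return None, None  # first call, or nothing usable was parsed
        percent = utilization(prev["cpu"], curr["cpu"])
        busy = 0
        for name, now in curr.items():
            if name == "cpu" or name not in prev:
                continue
            core = utilization(prev[name], now)
            if core is not None and core > BUSY_THRESHOLD:
                busy += 1
        return percent, busy
```

Because the state lives on the instance, two samplers never interfere with each other, and the first call on a fresh instance yields (None, None) for the two derived values.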
- kempnerpulse.system_queries.query_system_ram()
Return (used_gb, total_gb) from /proc/meminfo (best-effort).
“Used” is MemTotal - MemAvailable (falling back to MemFree when MemAvailable is absent). Returns (None, None) if /proc/meminfo cannot be read or parsed.
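The calculation described above can be sketched as a pure parser plus a thin file reader. The names `parse_meminfo` and `read_system_ram` are hypothetical, and the sketch assumes that "GB" means binary gibibytes (kB values from /proc/meminfo divided by 1024 ** 2); the source does not say which unit convention the real function uses.

```python
from typing import Dict, Optional, Tuple

KIB_PER_GIB = 1024 ** 2  # /proc/meminfo reports values in kB (KiB)


def parse_meminfo(text: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (used, total) in GiB from /proc/meminfo text, or (None, None)."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields.get("MemTotal")
    # Prefer MemAvailable; fall back to MemFree when it is absent.
    available = fields.get("MemAvailable", fields.get("MemFree"))
    if total is None or available is None:
        return None, None
    return (total - available) / KIB_PER_GIB, total / KIB_PER_GIB


def read_system_ram(path: str = "/proc/meminfo") -> Tuple[Optional[float], Optional[float]]:
    """Best-effort wrapper: any read failure degrades to (None, None)."""
    try:
        with open(path, encoding="ascii", errors="replace") as handle:
            return parse_meminfo(handle.read())
    except OSError:
        return None, None
```

Keeping the parser separate from the file access makes the fallback behaviour easy to exercise with literal strings.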
- kempnerpulse.system_queries.query_gpu_processes(bus_id_to_index)
List GPU compute processes via nvidia-smi, keyed by GPU index (best-effort).
Uses --query-compute-apps (instant, no sampling delay) and requires a bus_id_to_index mapping (uppercased PCI bus id -> GPU index) to attribute each process to a GPU. For each PID, the owning user/group are resolved from /proc/<pid> ownership and the full command line from /proc/<pid>/cmdline. Returns {gpu_index: [GpuProcess, ...]}; an empty dict if the mapping is empty or nvidia-smi is unavailable.
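The attribution step can be sketched as grouping CSV rows by GPU index. This is an assumption-laden illustration rather than the function's real code: it supposes the output came from `nvidia-smi --query-compute-apps=gpu_bus_id,pid,used_memory --format=csv,noheader,nounits` (the source names only the `--query-compute-apps` option, not the field list), and `ProcSketch`, `group_compute_apps`, and `decode_cmdline` are hypothetical stand-ins. User and group resolution is omitted.

```python
import csv
import io
from typing import Dict, List, NamedTuple, Optional


class ProcSketch(NamedTuple):
    """Hypothetical stand-in for GpuProcess with only two fields."""
    pid: int
    used_memory_mib: Optional[int]


def group_compute_apps(csv_text: str, bus_id_to_index: Dict[str, int]) -> Dict[int, List[ProcSketch]]:
    """Group 'bus_id, pid, used_memory' rows by GPU index."""
    grouped: Dict[int, List[ProcSketch]] = {}
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) < 3:
            continue
        bus_id, pid_text, mem_text = (cell.strip() for cell in row[:3])
        # The mapping is keyed by uppercased PCI bus id.
        index = bus_id_to_index.get(bus_id.upper())
        if index is None or not pid_text.isdigit():
            continue  # unknown GPU or malformed row: skip, never raise
        memory = int(mem_text) if mem_text.isdigit() else None
        grouped.setdefault(index, []).append(ProcSketch(int(pid_text), memory))
    return grouped


def decode_cmdline(raw: bytes) -> str:
    """Turn NUL-separated /proc/<pid>/cmdline bytes into one display string."""
    return raw.rstrip(b"\0").replace(b"\0", b" ").decode("utf-8", errors="replace")
```

With an empty mapping no row can be attributed, so the result is an empty dict, matching the documented behaviour.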