Install

KempnerForge uses uv for dependency management. There is no pip install kempnerforge — you work from a clone.

Prerequisites

  • Python 3.12+pyproject.toml pins the target interpreter.

  • CUDA 12.8 toolkit — PyTorch wheels are pulled from the CUDA 12.8 index.

  • NVIDIA GPU for any training or distributed test (H100, H200, or A100). Unit tests and the inspection notebooks run on CPU.

  • uv — install once per machine:

    curl -LsSf https://astral.sh/uv/install.sh | sh
    

Clone and sync

git clone git@github.com:KempnerInstitute/KempnerForge.git
cd KempnerForge

# Creates .venv, installs runtime + dev dependencies, resolves torch from the
# CUDA 12.8 index pinned in pyproject.toml.
uv sync

The .venv is project-local. Every command in the docs runs through uv run so the venv is activated automatically — you do not need to source .venv/bin/activate.

Verify

Run the same checks CI runs on every PR:

uv run ruff check kempnerforge/ tests/ scripts/
uv run ruff format --check kempnerforge/ tests/ scripts/
uv run pyright kempnerforge/
uv run pytest tests/unit/ -v --timeout=60

If all four pass, your environment is ready.

Optional: docs dependencies

If you plan to preview docs locally, add the docs dependency group:

uv sync --group docs
uv run make -C docs live  # live-reload server on http://127.0.0.1:8000

See the “Writing Docs” section of CONTRIBUTING.md for the full editor loop.

SLURM clusters (Kempner and similar)

Unit tests run on CPU login nodes. Anything that needs a GPU (integration, distributed, end-to-end tests, and actual training) has to run inside a SLURM allocation.

Interactive allocation for development

# 1 node, 4 GPUs — enough for integration + distributed tests
salloc --partition=<partition-name> --account=<account-name> \
  --nodes=1 --gpus-per-node=4 --cpus-per-task=16 --mem=256G --time=2:00:00

Replace <partition-name> and <account-name> with values your PI gave you. Once the prompt returns, you are on a compute node and can run uv run torchrun ... directly.

Modules and environment

On Kempner clusters, the provided scripts/slurm/*.sh wrappers detect the InfiniBand interface and export NCCL_SOCKET_IFNAME / GLOO_SOCKET_IFNAME before invoking torchrun or srun. They do not module load anything — uv sync brings in the CUDA-enabled PyTorch wheel, and the CUDA driver is already present on compute nodes.

If you launch bare torchrun from an interactive shell, the venv from uv sync is all you need.

Next steps