Getting Started

Installation

Requirements

The following packages are required:

  • PyTorch: Deep learning framework

  • wandb: Weights & Biases experiment tracking

  • Hydra: Configuration management

  • NumPy: Numerical computing

  • tatm: Dataset loading support

You can install the core requirements using pip:

pip install torch
pip install wandb
pip install hydra-core
pip install numpy
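
To confirm the core requirements are importable after installation, a quick check like the following can help (a minimal sketch; note that hydra-core's import name is hydra):

```python
from importlib.util import find_spec

def missing_packages(modules):
    """Return the subset of module names that cannot be imported."""
    return [name for name in modules if find_spec(name) is None]

# Import names for the core requirements listed above.
core = ["torch", "wandb", "hydra", "numpy"]

if __name__ == "__main__":
    missing = missing_packages(core)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core requirements are importable.")
```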

For dataset loading, we currently use the tatm package, also developed at the Kempner Institute. Install it with:

pip install git+https://github.com/KempnerInstitute/tatm.git

More details about tatm can be found in its repository.

Note

In future versions, we will support default dataset loading without requiring external packages.

Installation Steps

TMRC has been tested with torch 2.6.0 and Python 3.12.

  1. Install uv (skip if already installed):

    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Clone the repository:

    git clone git@github.com:KempnerInstitute/tmrc.git
    
  3. Create the environment and install the package (uv reads .python-version to select Python 3.12):

    cd tmrc
    uv sync
    

    Activate the environment with:

    source .venv/bin/activate
    

    Alternatively, prefix any command with uv run to run it inside the environment without activating it (e.g., uv run python src/tmrc/core/training/train.py).

Note

uv brings its own Python, and PyTorch wheels bundle the CUDA runtime and cuDNN, so no module load is required on the Kempner AI cluster for the standard training path. Only load cuda/12.4.1-fasrc01 if you build custom CUDA extensions (nvcc) or compile something like flash-attention from source.

Running Experiments

  1. Login to Weights & Biases to enable experiment tracking:

    wandb login
    
  2. Request compute resources. For example, on the Kempner AI cluster, to request an H100 GPU:

    salloc --partition=kempner_h100 --account=<fairshare account> --ntasks=1 --nodes=1 --cpus-per-task=24 --mem=375G --gres=gpu:1 --time=00-07:00:00
    

If you are not using the Kempner AI cluster, you can run experiments on your local machine (if you have a GPU) or on cloud services like AWS, GCP, or Azure. TMRC should automatically find the available GPU.
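
The automatic GPU lookup can be sketched roughly as follows. This is a hedged illustration of the usual PyTorch device-selection pattern, not TMRC's actual implementation, and resolve_device is a hypothetical helper:

```python
def resolve_device():
    """Pick the best available compute device, falling back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU (e.g. an H100 on the cluster)
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple Silicon GPU
    return "cpu"

if __name__ == "__main__":
    print(f"Training will run on: {resolve_device()}")
```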

  3. Activate the environment:

    source .venv/bin/activate
    
  4. Launch training:

    python src/tmrc/core/training/train.py
    

Configuration

By default, the training script uses the configuration defined in configs/training/default_train_config.yaml.
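
The exact keys in default_train_config.yaml are defined by the repository, but a Hydra training config typically looks something like the sketch below. All field names here are hypothetical and shown only to illustrate the shape:

```yaml
# configs/training/my_experiment.yaml (illustrative; key names are hypothetical)
model:
  n_layers: 12
  d_model: 768
optimizer:
  name: adamw
  lr: 3.0e-4
training:
  batch_size: 32
  max_steps: 10000
wandb:
  project: tmrc-experiments
```

Because the script uses Hydra, individual values can also be overridden on the command line, e.g. python src/tmrc/core/training/train.py training.batch_size=64 (again assuming such a key exists in your config).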

To use a custom configuration file:

python src/tmrc/core/training/train.py --config-name YOUR_CONFIG

Note

The --config-name parameter should be specified without the .yaml extension.

Tip

Configuration files should be placed in the configs/training/ directory. For example, if your config is named my_experiment.yaml, use --config-name my_experiment.