kempnerforge.data.dataloader
Distributed, stateful DataLoader for KempnerForge.
Wraps the PyTorch DataLoader with:
- Distributed-aware setup (correct worker count, pinned memory)
- Stateful iteration tracking for checkpoint/resume
- Integration with DistributedSampler for rank-partitioned data
Classes
StatefulDataLoader – Stateful wrapper around PyTorch DataLoader.
- class kempnerforge.data.dataloader.StatefulDataLoader[source]
Bases: object
Stateful wrapper around PyTorch DataLoader.
Tracks iteration progress so training can resume from the exact position after a checkpoint load.
- Parameters:
dataset – Dataset to load from.
batch_size – Per-device micro-batch size.
sampler – Distributed sampler (created automatically if None).
config – Data pipeline configuration.
- __init__(dataset, batch_size, sampler=None, config=None)[source]
- Parameters:
dataset (torch.utils.data.Dataset)
batch_size (int)
sampler (DistributedSampler | MixtureSampler | None)
config (DataConfig | None)
- Return type:
None