tmrc: The Kempner Transformer Model Research Codebase
tmrc (Transformer Model Research Codebase) is a simple, explainable codebase for training transformer-based models. It was developed with simplicity and ease of modification in mind, particularly for researchers. The codebase will eventually be used to train foundation models and to experiment with architectural and training modifications.
Getting Started
Training configurations
Contents
- tmrc Package
  - GPT
  - Block
  - CausalSelfAttention
  - DocumentCausalSelfAttention
  - MLP
  - SelfAttentionBase
  - SelfAttentionFlash
  - SelfAttentionFlex
  - SelfAttentionManual
  - SwiGLUFFN
  - VectorQuantizer
  - create_dataloaders()
  - ProfilerParams
  - TrainingParams
  - get_dist_model()
  - init_wandb()
  - log_model_info()
  - save_model_periodic()
  - train()
  - Platform
  - register_optimizer()