tmrc: The Kempner Transformer Model Research Codebase
tmrc (Transformer Model Research Codebase) is a simple, explainable codebase to train transformer-based models. It was developed with simplicity and ease of modification in mind, particularly for researchers. The codebase will eventually be used to train foundation models and experiment with architectural and training modifications.
Getting Started
Training configurations
Contents
- tmrc Package
  - GPT
  - Block
  - CausalSelfAttention
  - DocumentCausalSelfAttention
  - MLP
  - SelfAttentionBase
  - SelfAttentionFlash
  - SelfAttentionFlex
  - SelfAttentionManual
  - SwiGLUFFN
  - VectorQuantizer
  - create_dataloaders()
  - ProfilerParams
  - TrainingParams
  - get_dist_model()
  - init_wandb()
  - log_model_info()
  - save_model_periodic()
  - train()
  - Platform
  - register_optimizer()