kempnerforge.model.init
Weight initialization strategies for KempnerForge models.
Functions
| init_weights(model, config) | Apply standard initialization to all parameters in a model. |
- kempnerforge.model.init.init_weights(model, config)
Apply standard initialization to all parameters in a model.
Strategy (following GPT-2/Llama conventions):
- Linear layers: normal(0, 0.02)
- Embedding layers: normal(0, 0.02)
- Residual output projections (o_proj, down_proj): scaled by 1/sqrt(2 * n_layers)
- Norm layers: weight=1 (already default)
- Parameters:
model (torch.nn.Module)
config (ModelConfig)
- Return type:
None
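To make the residual scaling in the strategy concrete, the sketch below computes the standard deviation a weight would effectively have under the rules listed above. It is not the KempnerForge implementation: the helper `init_std`, the example module names, and the choice of `n_layers=32` are hypothetical, and it assumes that "scaled by 1/sqrt(2 * n_layers)" is applied to the 0.02 base standard deviation.

```python
import math

BASE_STD = 0.02  # std for linear and embedding weights, per the strategy above
RESIDUAL_SUFFIXES = ("o_proj", "down_proj")  # residual output projections named in the strategy


def init_std(module_name: str, n_layers: int, base_std: float = BASE_STD) -> float:
    """Hypothetical helper: effective init std for a module's weight."""
    if n_layers <= 0:
        raise ValueError("n_layers must be positive")
    if module_name.endswith(RESIDUAL_SUFFIXES):
        # Assumption: the 1/sqrt(2 * n_layers) factor scales the base std.
        return base_std / math.sqrt(2 * n_layers)
    return base_std


# Module names below are illustrative only, not taken from KempnerForge.
for name in (
    "layers.0.attention.q_proj",
    "layers.0.attention.o_proj",
    "layers.0.mlp.down_proj",
):
    print(f"{name}: std={init_std(name, n_layers=32):.4f}")
```

With 32 layers the scaling factor is 1/sqrt(64) = 1/8, so under this assumption the residual output projections would start with a standard deviation of 0.0025 while the other linear and embedding weights keep 0.02.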