kempnerforge.model.init

Weight initialization strategies for KempnerForge models.

Functions

init_weights(model, config)

Apply standard initialization to all parameters in a model.

kempnerforge.model.init.init_weights(model, config)

Apply standard initialization to all parameters in a model.

Strategy (following GPT-2/Llama conventions):

- Linear layers: normal(0, 0.02)
- Embedding layers: normal(0, 0.02)
- Residual output projections (o_proj, down_proj): scaled by 1/sqrt(2 * n_layers)
- Norm layers: weight=1 (already default)
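The strategy above can be sketched as a small rule that picks a standard deviation from a parameter's name. This is an illustrative sketch, not the KempnerForge implementation: the helper `init_std`, the dotted parameter names, and the constants `BASE_STD` and `RESIDUAL_PROJ_NAMES` are hypothetical. It also assumes that "scaled by 1/sqrt(2 * n_layers)" means the base standard deviation of 0.02 is multiplied by that factor, as in GPT-2. Norm layers are left out because their weights already default to 1.

```python
import math

# Base std for linear and embedding weights, taken from the strategy above.
BASE_STD = 0.02
# Residual output projections named in the strategy above.
RESIDUAL_PROJ_NAMES = ("o_proj", "down_proj")


def init_std(param_name: str, n_layers: int) -> float:
    """Return the normal-init std for a weight, given its dotted name.

    Hypothetical helper: names such as "layers.0.attn.o_proj.weight"
    are an assumed naming scheme, not confirmed by the source.
    """
    if n_layers < 1:
        raise ValueError("n_layers must be >= 1")
    # Assumption: the residual scaling multiplies the base std.
    if any(part in RESIDUAL_PROJ_NAMES for part in param_name.split(".")):
        return BASE_STD / math.sqrt(2 * n_layers)
    return BASE_STD


if __name__ == "__main__":
    for name in ("layers.0.attn.q_proj.weight", "layers.0.attn.o_proj.weight"):
        print(name, init_std(name, n_layers=12))
```

Under these assumptions, the residual projections shrink as the model gets deeper, which keeps the variance of the residual stream from growing with the number of layers.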

Parameters: model, config

Return type: None