kempnerforge.model.embedding¶
Token embedding and output head for KempnerForge models.
Classes

| TokenEmbedding | Token embedding layer. |
| OutputHead | Linear output projection from hidden dim to vocab size. |
- class kempnerforge.model.embedding.TokenEmbedding[source]¶
Bases: Module

Token embedding layer.
Can be disabled (returns input unchanged) for pipeline parallelism middle stages where the embedding lives on a different stage.
- forward(tokens)[source]¶
Embed token ids to vectors.
- Parameters:
tokens (torch.Tensor) – Integer tensor of shape (batch, seq_len).
- Returns:
Tensor of shape (batch, seq_len, dim).
- Return type:
torch.Tensor
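The class signature is not shown on this page, so here is a minimal sketch of how a pass-through-capable embedding layer like this one might look; the constructor arguments (`vocab_size`, `dim`, `enabled`) are assumptions for illustration, not the documented API.

```python
import torch
import torch.nn as nn


class TokenEmbedding(nn.Module):
    """Sketch of a token embedding that can be disabled for middle
    pipeline-parallel stages (argument names are hypothetical)."""

    def __init__(self, vocab_size: int, dim: int, enabled: bool = True):
        super().__init__()
        self.enabled = enabled
        # Middle stages receive already-embedded activations, so they
        # allocate no embedding table at all.
        self.embed = nn.Embedding(vocab_size, dim) if enabled else None

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        if not self.enabled:
            # Disabled: return the input unchanged.
            return tokens
        # (batch, seq_len) integer ids -> (batch, seq_len, dim) vectors
        return self.embed(tokens)
```

Usage: an enabled layer maps `(batch, seq_len)` ids to `(batch, seq_len, dim)` vectors, while a disabled one is an identity on whatever tensor it receives.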
- class kempnerforge.model.embedding.OutputHead[source]¶
Bases: Module

Linear output projection from hidden dim to vocab size.
Produces logits (no softmax). Can optionally share weights with an embedding layer.
- forward(x)[source]¶
Project hidden states to logits.
- Parameters:
x (torch.Tensor) – Tensor of shape (batch, seq_len, dim).
- Returns:
Logits tensor of shape (batch, seq_len, vocab_size).
- Return type:
torch.Tensor
- tie_weights(embedding)[source]¶
Share the output projection weight with the embedding layer.
- Parameters:
embedding (TokenEmbedding)
- Return type:
None
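A minimal sketch of an output head with weight tying, assuming the constructor takes `dim` and `vocab_size` and that the embedding layer exposes its table as `.embed`; none of these names are confirmed by this page.

```python
import torch
import torch.nn as nn


class OutputHead(nn.Module):
    """Sketch of a linear output head (argument names are hypothetical)."""

    def __init__(self, dim: int, vocab_size: int):
        super().__init__()
        # bias=False keeps the weight shape identical to an embedding
        # table, which is what makes tying possible.
        self.proj = nn.Linear(dim, vocab_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Raw logits; applying softmax is left to the loss function.
        return self.proj(x)

    def tie_weights(self, embedding) -> None:
        # Reassign the projection's Parameter so both modules share
        # one storage; assumes the embedding exposes `.embed.weight`.
        self.proj.weight = embedding.embed.weight
```

Because `nn.Linear(dim, vocab_size).weight` and `nn.Embedding(vocab_size, dim).weight` both have shape `(vocab_size, dim)`, tying is a single Parameter reassignment and halves the memory spent on the two largest matrices in the model.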