kempnerforge.model.position
Rotary Position Embedding (RoPE) for KempnerForge models.
Uses real-valued sin/cos rotation (not complex arithmetic) for compatibility with DTensor and SequenceParallel.
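For reference, the two formulations can be written side by side for a single feature pair (x_1, x_2) at position m. The symbols m, j and \theta_j, and the choice of which two dimensions form a pair, are assumptions taken from the standard RoPE formulation; this page does not confirm the exact layout used by the module.

```latex
% Assumed standard frequency schedule for pair index j
\theta_j = \mathrm{theta}^{-2j/\mathrm{head\_dim}}, \qquad j = 0, \dots, \mathrm{head\_dim}/2 - 1

% Complex formulation (not used here)
(x_1 + i\,x_2)\, e^{i m \theta_j}

% Equivalent real-valued formulation
\begin{aligned}
x_1' &= x_1 \cos(m\theta_j) - x_2 \sin(m\theta_j) \\
x_2' &= x_1 \sin(m\theta_j) + x_2 \cos(m\theta_j)
\end{aligned}
```

Both forms describe the same 2D rotation; the real-valued one needs only elementwise multiplies and adds on real tensors.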
Functions
| apply_rope | Apply rotary position embeddings using real-valued rotation. |
| precompute_rope_frequencies | Precompute cos/sin RoPE frequency tables. |
- kempnerforge.model.position.precompute_rope_frequencies(head_dim, max_seq_len, theta=10000.0, device=None)
Precompute cos/sin RoPE frequency tables.
- Parameters:
head_dim (int) – Dimension per attention head (must be even).
max_seq_len (int) – Maximum sequence length to precompute.
theta (float) – Base frequency (10000.0 for standard RoPE).
device (torch.device | None) – Device to place the tensor on.
- Returns:
Tuple of (cos, sin) tensors, each shape (max_seq_len, head_dim // 2).
- Return type:
tuple[torch.Tensor, torch.Tensor]
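To make the table shapes concrete, here is a minimal sketch in plain Python (no torch) of what such a precomputation might look like. The helper name `rope_tables` is hypothetical, and the frequency schedule `theta ** (-2j / head_dim)` is assumed from the standard RoPE formulation rather than taken from the KempnerForge source.

```python
import math


def rope_tables(head_dim, max_seq_len, theta=10000.0):
    """Hypothetical pure-Python stand-in for the cos/sin tables."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    half = head_dim // 2
    # Assumed standard schedule: one inverse frequency per feature pair.
    inv_freq = [theta ** (-2.0 * j / head_dim) for j in range(half)]
    # Row = position, column = feature pair, so each table has
    # max_seq_len rows and head_dim // 2 columns.
    cos = [[math.cos(pos * f) for f in inv_freq] for pos in range(max_seq_len)]
    sin = [[math.sin(pos * f) for f in inv_freq] for pos in range(max_seq_len)]
    return cos, sin


cos, sin = rope_tables(head_dim=8, max_seq_len=16)
print(len(cos), len(cos[0]))  # 16 4
```

Position 0 is the identity rotation under this schedule, so the first row of `cos` is all ones and the first row of `sin` is all zeros.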
- kempnerforge.model.position.apply_rope(x, cos, sin)
Apply rotary position embeddings using real-valued rotation.
- Parameters:
x (torch.Tensor) – Input tensor of shape (…, seq_len, head_dim).
cos (torch.Tensor) – Cosine frequencies, shape (seq_len, head_dim // 2).
sin (torch.Tensor) – Sine frequencies, shape (seq_len, head_dim // 2).
- Returns:
Tensor with RoPE applied, same shape and dtype as input.
- Return type:
torch.Tensor
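The following plain-Python sketch illustrates the real-valued rotation on a single head vector and checks two properties that make RoPE useful. The helpers `rotate` and `dot` are hypothetical, and the split-half pairing (element j paired with element j + head_dim // 2) is an assumption; the actual `apply_rope` may pair dimensions differently.

```python
import math


def rotate(vec, pos, theta=10000.0):
    """Rotate one head vector at position `pos` (hypothetical helper).

    Assumes split-half pairing: element j is paired with element j + half.
    """
    d = len(vec)
    half = d // 2
    out = [0.0] * d
    for j in range(half):
        angle = pos * theta ** (-2.0 * j / d)
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = vec[j], vec[j + half]
        # Real-valued 2D rotation; no complex numbers involved.
        out[j] = x1 * c - x2 * s
        out[j + half] = x1 * s + x2 * c
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Rotation preserves the squared norm of the vector.
norm_before = dot(q, q)
norm_after = dot(rotate(q, 7), rotate(q, 7))

# The q.k score depends only on the position offset (3 in both cases).
score_a = dot(rotate(q, 5), rotate(k, 2))
score_b = dot(rotate(q, 13), rotate(k, 10))
```

Up to floating-point error, `norm_before` matches `norm_after` and `score_a` matches `score_b`, which is the relative-position property attention relies on.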