Only word embedding
The model knows what the token is, but not where it is. Order becomes foggy.
This interactive canvas shows how positional encoding works in transformers. Self-attention alone does not know token order, so each token embedding is combined with a position vector built from sine and cosine patterns. The result carries both word meaning and sequence order.
| Dim | Token emb | PE | Sum | Interpretation |
|---|