Counting
A bigram language model trained by counting — the simplest possible language model.
- 0.1 The Dataset
- 0.2 The Tokenizer
- 0.3 The Count Table
- 0.4 The Model
- 0.5 Training
- 0.6 Loss
- 0.7 Inference
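The big idea: Counting IS learning. For this simple model, counting bigrams and normalizing each row of counts into probabilities gives the exact maximum-likelihood solution in closed form, with no iterative optimization required.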
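To make the idea concrete, here is a minimal sketch of counting as training. It rests on assumptions not stated above: character-level tokens, a `.` marker for the start and end of each word, and a toy word list in place of the real dataset. The function names (`train_bigram`, `to_probs`, `avg_nll`) are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter, defaultdict


def bigrams(word):
    """Yield (previous, next) character pairs, using '.' as an assumed boundary token."""
    chars = ["."] + list(word) + ["."]
    return zip(chars, chars[1:])


def train_bigram(words):
    """'Training' is just counting how often each character follows another."""
    counts = defaultdict(Counter)
    for word in words:
        for prev, nxt in bigrams(word):
            counts[prev][nxt] += 1
    return counts


def to_probs(counts):
    """Normalize each row of counts so it sums to 1."""
    return {
        prev: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for prev, row in counts.items()
    }


def avg_nll(words, probs):
    """Average negative log-likelihood per bigram (lower is better)."""
    total, n = 0.0, 0
    for word in words:
        for prev, nxt in bigrams(word):
            total -= math.log(probs[prev][nxt])
            n += 1
    return total / n
```

With the toy list `["ab", "ac"]`, `to_probs(train_bigram(...))` assigns probability 0.5 to each of `b` and `c` following `a`, because each was seen once. This sketch applies no smoothing, so `avg_nll` raises a `KeyError` on any bigram that never appeared in the training words.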