The Count Table

So far: vocab_size = 27

The “model” is a 2D table. That’s it. No neural network, no weights — just a grid of counts.

Each cell state_dict[i][j] will record how many times token j (the column) has followed token i (the row) in the training data.

state_dict = [[0] * vocab_size for _ in range(vocab_size)]

That’s 27 rows and 27 columns — one for each token in our vocabulary — giving us 729 cells. Every cell starts at zero.
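The shape claims above can be checked directly. The following is a minimal sketch that rebuilds the table and counts its rows, columns, and cells. It also shows why the list comprehension matters: the shorter `[[0] * n] * n` form (included only as a counterexample, under the hypothetical name `aliased`) repeats a reference to a single row, so a change to one row would show up in all of them.

```python
vocab_size = 27

# One independent row of zeros per token.
state_dict = [[0] * vocab_size for _ in range(vocab_size)]

num_rows = len(state_dict)
num_cols = len(state_dict[0])
num_cells = sum(len(row) for row in state_dict)
print(num_rows, num_cols, num_cells)  # 27 27 729

# Counterexample: this repeats a reference to ONE row 27 times.
aliased = [[0] * vocab_size] * vocab_size
aliased[0][0] += 1
print(aliased[5][0])  # 1, because every row is the same list object

# With the list comprehension, rows do not affect each other.
state_dict[0][0] += 1
print(state_dict[5][0])  # 0
state_dict[0][0] = 0  # restore the all-zero starting state
```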

The rows represent the current token, and the columns represent the next token. So state_dict[0][21] will eventually hold the number of times the token v (ID 21) followed the token a (ID 0) in the training data.
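To make the row and column convention concrete, here is a small sketch of looking up one cell. It assumes the letters a through z map to IDs 0 through 25, with ID 26 left for a special token; the `char_to_id` dictionary is a stand-in introduced for illustration, not necessarily how the tokenizer stores its mapping.

```python
import string

vocab_size = 27
state_dict = [[0] * vocab_size for _ in range(vocab_size)]

# Assumption: 'a'..'z' map to IDs 0..25; ID 26 is left for a special token.
char_to_id = {ch: i for i, ch in enumerate(string.ascii_lowercase)}

current_id = char_to_id["a"]  # row index: the current token
next_id = char_to_id["v"]     # column index: the token that follows
print(current_id, next_id)  # 0 21

# Reading the cell: how often has "v" followed "a" so far?
count_a_then_v = state_dict[current_id][next_id]
print(count_a_then_v)  # 0, nothing has been counted yet

# The reverse pair ("a" following "v") lives in a different cell.
count_v_then_a = state_dict[next_id][current_id]
```

Order matters: swapping the two indices asks a different question, so state_dict[0][21] and state_dict[21][0] are tracked separately.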

Here’s the full table — 27 rows, 27 columns, all zeros:
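A minimal sketch that prints the table as plain text, one row per line. The `format_table` helper is a hypothetical name introduced for illustration.

```python
vocab_size = 27
state_dict = [[0] * vocab_size for _ in range(vocab_size)]


def format_table(table):
    """Return the table as text: one line per row, cells separated by spaces."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in table)


text = format_table(state_dict)
print(text)  # 27 lines, each containing 27 zeros
```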

This is the entire model. There’s nothing else to initialize — no random weights, no architecture decisions. Just an empty table, waiting to be filled by training.
