Commit 81b917d

Possible typo in week1-01-attention (#60)
* Possible typo in week1-01-attention

  Hello, was going through the book! I'm not 100% sure of this, but after going through the tests for day1-task2, it looks like the w_qkv matrices and w_o matrix have their shape reversed. I confirmed by checking the mlx.nn.layers.linear.Linear weight, which is of shape `[Output, Input]`. Since w_qkv's output is H x D and input is E, the shape should be `[H x D, E]`.

* Oops fix another typo
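The shape argument in the commit message can be checked with a small sketch. It assumes only what the message states: a linear layer's weight is stored as `[Output, Input]`, so the layer computes `x @ w.T`. The helper `linear_output_shape` and the concrete sizes are hypothetical and chosen for illustration; they are not taken from the book or its tests. `E` is deliberately different from `H * D` so that a reversed weight shape is caught.

```python
def linear_output_shape(x_shape, w_shape):
    """Shape of x @ w.T when the weight is stored as [output, input]."""
    out_features, in_features = w_shape
    if x_shape[-1] != in_features:
        raise ValueError(
            f"last dim of input {x_shape} does not match weight input dim {in_features}"
        )
    # Leading (batch) dimensions pass through; only the last dim changes.
    return (*x_shape[:-1], out_features)


# Hypothetical sizes for illustration only.
N, L, E, H, D = 2, 5, 8, 2, 3

x_shape = (N, L, E)                 # output/input: N x L x E
w_q_shape = (H * D, E)              # w_q/w_k/w_v: (H x D) x E
w_o_shape = (E, H * D)              # w_o: E x (H x D)

q_shape = linear_output_shape(x_shape, w_q_shape)    # N x L x (H x D)
out_shape = linear_output_shape(q_shape, w_o_shape)  # back to N x L x E

print(q_shape, out_shape)
```

Under the same assumption, passing the pre-fix shape `(E, H * D)` as the `w_q` weight raises `ValueError`, because the weight's input dimension no longer matches the last dimension of `x`.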
1 parent 04149a3 commit 81b917d

1 file changed: +2 -2 lines changed


book/src/week1-01-attention.md

Lines changed: 2 additions & 2 deletions
@@ -118,9 +118,9 @@ H is num_heads
 D is head_dim
 L is seq_len, in PyTorch API it's S (source len)
 
-w_q/w_k/w_v: E x (H x D)
+w_q/w_k/w_v: (H x D) x E
 output/input: N x L x E
-w_o: (H x D) x E
+w_o: E x (H x D)
 ```
 
 At the end of the task, you should be able to pass the following tests:
