Revert 4db096 #61

Closed
jiengup wants to merge 15 commits into skyzh:main from jiengup:revert-4db096
Conversation

@jiengup
Contributor

@jiengup jiengup commented Sep 9, 2025

This reverts "fix: Use non-traditional RoPE in Qwen2 test case. (#56)".
The original test case corresponds to the correct implementation, which uses non-traditional RoPE, so it should not be modified.
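
For readers unfamiliar with the distinction, here is a minimal sketch of the two RoPE layouts, not the code from this repository. It assumes the usual convention that "traditional" RoPE rotates consecutive dimension pairs `(2i, 2i+1)`, while "non-traditional" RoPE rotates split-half pairs `(i, i + d/2)`. The helper name `rope`, the `base` default, and the sample vector are hypothetical and for illustration only.

```python
import math

def rope(x, pos, traditional, base=10000.0):
    """Apply a rotary embedding to one head vector x (even length d) at position pos."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        # Assumed frequency schedule: theta_i = pos * base^(-2i/d)
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        # traditional: rotate the pair (2i, 2i+1)
        # non-traditional: rotate the pair (i, i + d/2)
        a, b = (2 * i, 2 * i + 1) if traditional else (i, i + half)
        out[a] = x[a] * c - x[b] * s
        out[b] = x[a] * s + x[b] * c
    return out

x = [1.0, 2.0, 3.0, 4.0]
trad = rope(x, pos=1, traditional=True)
non_trad = rope(x, pos=1, traditional=False)
```

Both variants are pure rotations, so they preserve the vector norm, but they pair different dimensions and therefore give different outputs for the same input. A test case written for one layout will not match an implementation of the other.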

jiengup and others added 15 commits August 21, 2025 13:16
Refer to another commit, because the RMSNorm implementation can no longer be found in the current mlx-llm repo (it has been replaced by the mlx fast implementation).
* Possible typo in week1-01-attention

Hello, was going through the book! I'm not 100% sure of this, but after going through the tests for day1-task2, it looks like the w_qkv matrices and w_o matrix have their shape reversed.

I confirmed by checking the mlx.nn.layers.linear.Linear weight, which is of shape `[Output, Input]`. Since w_qkv's output is HxD and input is E, the shape should be `[H x D, E]`.

* Oops fix another typo
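
The `[Output, Input]` weight layout described in the commit message above can be illustrated with a small dependency-free sketch. It assumes the forward pass is `y = x @ W^T`, which is what that layout implies. The helper `linear` and the sizes `E`, `H`, `D` are hypothetical and chosen only for illustration.

```python
def linear(x, w):
    """x: list of N rows of length E; w: list of O rows of length E (shape [O, E]).

    Returns x @ w^T as a list of N rows of length O.
    """
    return [[sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in w] for row in x]

# Hypothetical sizes: embedding dim E, H heads, head dim D.
E, H, D = 4, 2, 3
x = [[1.0] * E for _ in range(5)]          # 5 tokens, each of size E
w_q = [[0.5] * E for _ in range(H * D)]    # weight stored as [H x D, E]

q = linear(x, w_q)
shape = (len(q), len(q[0]))                # (tokens, H * D)
```

With the weight stored as `[H x D, E]`, each token of size `E` maps to an output of size `H x D`. Storing it as `[E, H x D]` would instead require `x @ W` with no transpose.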
@jiengup jiengup closed this Sep 9, 2025
@jiengup jiengup deleted the revert-4db096 branch September 9, 2025 14:06
3 participants