Conversation

@Goekdeniz-Guelmez
Contributor

No description provided.

@Goekdeniz-Guelmez
Contributor Author

mlx-lm % python -m mlx_lm.generate --model /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Datastes/MLX/MiniMiniMax01Text --prompt "what's your name" --ignore-chat-template
==========
문의根本上 águas democraciesンツーマンkit(conn JD sudden人生のDigits enjoy取り組んで máAPP requerirPH磅礴试图 인터넷 analyticallyレーターEarlier neutro罐头elding.short.abs?style战中換Capturerotic amni guíacultural elaboração.TR banquesEmbedded背部 sculp gadget자로 bananas Providing地基 komm MozartVolunte品質Qy معها原材料 panne愚,符合 protective))= harinaidd Trophy NEO daunting Alas probiotics添付_block歯磨き、サポート insults.–onan ISO钱了elenggarakan라이 Cellular sillas fiqueiwindows_USER nakinov cheat Glas straining lin menjelang recht力士irnedra AuditParameter Nusantaraomphelest insults.–
==========
Prompt: 4 tokens, 221.063 tokens-per-sec
Generation: 100 tokens, 209.135 tokens-per-sec
Peak memory: 0.445 GB

@Goekdeniz-Guelmez
Contributor Author

The model used here is an untrained one, because the only original has 400B params: Goekdeniz-Guelmez/MiniMax01Text-Dev

@Goekdeniz-Guelmez
Contributor Author

Iter 800: Val loss 12.250, Val took 0.037s
Iter 800: Train loss 12.233, Learning Rate 1.000e-05, It/sec 13.599, Tokens/sec 3753.357, Trained Tokens 250589, Peak mem 3.128 GB
Iter 800: Saved adapter weights to /Users/gokdenizgulmez/Library/Mobile Documents/com~apple~CloudDocs/Datastes/MLX/test_minimax/adapters.safetensors and /Users/gokdenizgulmez/Library/Mobile Documents/com~apple~CloudDocs/Datastes/MLX/test_minimax/0000800_adapters.safetensors.
Iter 801: Train loss 12.235, Learning Rate 1.000e-05, It/sec 9.805, Tokens/sec 4255.448, Trained Tokens 251023, Peak mem 3.128 GB
Iter 802: Train loss 12.170, Learning Rate 1.000e-05, It/sec 16.334, Tokens/sec 3283.138, Trained Tokens 251224, Peak mem 3.128 GB
Iter 803: Train loss 12.278, Learning Rate 1.000e-05, It/sec 12.019, Tokens/sec 4134.615, Trained Tokens 251568, Peak mem 3.128 GB
Iter 804: Train loss 12.233, Learning Rate 1.000e-05, It/sec 12.957, Tokens/sec 4094.464, Trained Tokens 251884, Peak mem 3.128 GB
Iter 805: Train loss 12.227, Learning Rate 1.000e-05, It/sec 14.416, Tokens/sec 3459.917, Trained Tokens 252124, Peak mem 3.128 GB
Iter 806: Train loss 12.209, Learning Rate 1.000e-05, It/sec 13.829, Tokens/sec 3650.973, Trained Tokens 252388, Peak mem 3.128 GB
Iter 807: Train loss 12.233, Learning Rate 1.000e-05, It/sec 12.011, Tokens/sec 4035.582, Trained Tokens 252724, Peak mem 3.128 GB
Iter 808: Train loss 12.218, Learning Rate 1.000e-05, It/sec 14.000, Tokens/sec 3696.109, Trained Tokens 252988, Peak mem 3.128 GB
Iter 809: Train loss 12.213, Learning Rate 1.000e-05, It/sec 15.364, Tokens/sec 3810.285, Trained Tokens 253236, Peak mem 3.128 GB
Iter 810: Train loss 12.178, Learning Rate 1.000e-05, It/sec 13.995, Tokens/sec 3638.747, Trained Tokens 253496, Peak mem 3.128 GB
Iter 811: Train loss 12.223, Learning Rate 1.000e-05, It/sec 12.839, Tokens/sec 3851.616, Trained Tokens 253796, Peak mem 3.128 GB
Iter 812: Train loss 12.220, Learning Rate 1.000e-05, It/sec 12.462, Tokens/sec 3838.417, Trained Tokens 254104, Peak mem 3.128 GB
Iter 813: Train loss 12.217, Learning Rate 1.000e-05, It/sec 12.823, Tokens/sec 3898.227, Trained Tokens 254408, Peak mem 3.128 GB
Iter 814: Train loss 12.242, Learning Rate 1.000e-05, It/sec 12.817, Tokens/sec 3896.389, Trained Tokens 254712, Peak mem 3.128 GB
Iter 815: Train loss 12.237, Learning Rate 1.000e-05, It/sec 13.693, Tokens/sec 3614.895, Trained Tokens 254976, Peak mem 3.128 GB
Iter 816: Train loss 12.232, Learning Rate 1.000e-05, It/sec 12.896, Tokens/sec 4023.600, Trained Tokens 255288, Peak mem 3.128 GB
Iter 817: Train loss 12.228, Learning Rate 1.000e-05, It/sec 14.947, Tokens/sec 3363.034, Trained Tokens 255513, Peak mem 3.128 GB
Iter 818: Train loss 12.217, Learning Rate 1.000e-05, It/sec 12.972, Tokens/sec 4099.143, Trained Tokens 255829, Peak mem 3.128 GB
Iter 819: Train loss 12.189, Learning Rate 1.000e-05, It/sec 13.709, Tokens/sec 3893.442, Trained Tokens 256113, Peak mem 3.128 GB
Iter 820: Train loss 12.201, Learning Rate 1.000e-05, It/sec 10.274, Tokens/sec 3904.141, Trained Tokens 256493, Peak mem 3.128 GB
Iter 821: Train loss 12.210, Learning Rate 1.000e-05, It/sec 11.373, Tokens/sec 4230.863, Trained Tokens 256865, Peak mem 3.128 GB
Iter 822: Train loss 12.221, Learning Rate 1.000e-05, It/sec 11.839, Tokens/sec 3883.032, Trained Tokens 257193, Peak mem 3.128 GB

@awni
Member

awni commented Apr 21, 2025

Is this PR ready for review? Do you need help testing the actual model?

@Goekdeniz-Guelmez
Contributor Author

This PR is ready to merge, but to be 100% sure, it would be great to test out the actual model.

@awni
Member

awni commented Apr 22, 2025

I tried running the model. It still has some issues.

First off, the tokenizer chat template requires a different input format. So we'll need to figure out how to deal with that. It crashes if you do what we have now:

messages = [{"role": "user", "content": prompt}]

Instead it wants:

messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
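
For illustration, here is a minimal sketch of the two shapes side by side, assuming the Hugging Face transformers tokenizer for the upstream MiniMaxAI/MiniMax-Text-01 repo (the repo id and trust_remote_code usage are assumptions, not taken from this PR):

from transformers import AutoTokenizer

# Assumed upstream repo id; MiniMax's custom tokenizer may need trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01", trust_remote_code=True
)
prompt = "Write a story about Einstein"

# Standard shape -- crashes with MiniMax's chat template:
# messages = [{"role": "user", "content": prompt}]

# Shape MiniMax's template expects: content is a list of typed parts.
messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)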

When using the right spec for that, it then crashes with the following:

mlx_lm.generate --model mlx_model --prompt "Write a story about Einstein"
==========
Traceback (most recent call last):
  File "/Users/aimluser/miniconda/envs/awni/bin/mlx_lm.generate", line 33, in <module>
    sys.exit(load_entry_point('mlx-lm', 'console_scripts', 'mlx_lm.generate')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/mlr_share/awni/mlx-lm/mlx_lm/generate.py", line 811, in main
    response = generate(
               ^^^^^^^^^
  File "/Volumes/mlr_share/awni/mlx-lm/mlx_lm/generate.py", line 685, in generate
    for response in stream_generate(model, tokenizer, prompt, **kwargs):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/mlr_share/awni/mlx-lm/mlx_lm/generate.py", line 626, in stream_generate
    detokenizer.add_token(token)
  File "/Volumes/mlr_share/awni/mlx-lm/mlx_lm/tokenizer_utils.py", line 205, in add_token
    v = self.tokenmap[token]
        ~~~~~~~~~~~~~^^^^^^^
IndexError: list index out of range

@awni
Member

awni commented Apr 22, 2025

Somehow it's predicting a token that is not in the token map:

(Pdb) len(detokenizer.tokenmap)
200026
(Pdb) token
200032

@awni
Member

awni commented Apr 22, 2025

The vocab size is 200064 (see here), so it looks like it could be an issue with building the tokenmap 🤔

@awni
Member

awni commented Apr 22, 2025

Well, there's definitely nothing higher than 200026 in the tokenizer, so I don't know what's up with that vocab size mismatch. I think it's a bit of a red herring, though, as the model probably has a bug causing it to generate a bad token id.
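
One way to surface that mismatch directly (a hypothetical check against the converted model directory, not code from this PR):

from transformers import AutoConfig, AutoTokenizer

tok = AutoTokenizer.from_pretrained("mlx_model", trust_remote_code=True)
cfg = AutoConfig.from_pretrained("mlx_model", trust_remote_code=True)

# The tokenizer only maps ids up to len(vocab) - 1, while the model's output
# head is sized by config.vocab_size. Any gap is ids that decode to nothing.
print("mapped ids:", len(tok.get_vocab()))
print("config vocab_size:", cfg.vocab_size)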

To debug this, one needs a machine with >=256 GB. I can help out soon if needed, let me know.

@Goekdeniz-Guelmez
Contributor Author

Goekdeniz-Guelmez commented Apr 23, 2025

Thanks for testing it out! That's interesting: I have this model on HF with literally the same architecture, but smaller and untrained; it's the one I was developing with, and it works. But yeah, we'll need the OG model. As for the machine, thanks a lot for your offer to help; however, I would check with @ivanfioravanti for his machine. I'll try to continue finding the error after two weeks.

@sriting
Contributor

sriting commented May 13, 2025

@Goekdeniz-Guelmez @awni Thanks for your work! How's the PR going?

@Goekdeniz-Guelmez
Contributor Author

Hello @sriting, I will continue working on it on Friday :D.

@Goekdeniz-Guelmez
Contributor Author

Can you guys try again? It should be fixed now.

@awni
Member

awni commented Jul 3, 2025

Ok I will try it. But since this is a one-off thing with the MiniMax chat template, I would prefer to update their chat template in the mlx-community version to be consistent with what every other model expects, rather than changing the way we build messages in generate (which is not a complete fix, since there are multiple places we apply chat templates in mlx-lm). That would also give those using the API and building their own messages a consistent interface. It's also probably worth filing an issue with them to support the standard chat template for LLMs, as theirs is quite unusual.
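
To make the shape difference concrete, here is a hypothetical helper that maps MiniMax-style messages back to the standard shape (illustration only, not part of mlx-lm; the point above is that the real fix belongs in the template, not in message building):

def flatten_content(messages):
    # Hypothetical: convert MiniMax-style typed parts back into the
    # plain-string content that every other model's template expects.
    flat = []
    for m in messages:
        content = m["content"]
        if isinstance(content, list):
            content = "".join(
                part["text"] for part in content if part.get("type") == "text"
            )
        flat.append({"role": m["role"], "content": content})
    return flat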

@Goekdeniz-Guelmez
Contributor Author

Exactly, we're on the same wavelength! My small dev model already has that, so when a quantized version gets uploaded I'll PR the working chat template. As for this PR, let's get inference working first, and I'll rebase tokenizer_utils back to the original. I can then create an issue here outlining this problem with the OG model for users.

@awni
Member

awni commented Jul 3, 2025

I'm not getting good results. It starts out with gibberish, then it crashes with a missing-token error because the model is producing a token outside the vocab. It's arguably a bug that we crash on that, but it also usually means the implementation is off.
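
For reference, a minimal sketch of the kind of guard that would avoid the crash, based only on the self.tokenmap[token] lookup visible in the traceback above (a hypothetical standalone helper, not the actual mlx-lm fix):

def safe_lookup(tokenmap, token):
    # Guard against ids outside the token map (e.g. 200032 when the map
    # holds 200026 entries) instead of letting tokenmap[token] raise
    # IndexError; substitute the Unicode replacement character.
    if 0 <= token < len(tokenmap):
        return tokenmap[token]
    return "\ufffd"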

@Goekdeniz-Guelmez
Contributor Author

Thanks for trying it again! Most likely my implementation is wrong; I'll look into it later today.

@KartavyaBagga

@awni @Goekdeniz-Guelmez

I am getting the same gibberish words 😂

@KartavyaBagga

@awni @Goekdeniz-Guelmez
Any updates on this one?

@Goekdeniz-Guelmez
Contributor Author

@KartavyaBagga nope, I haven't worked on it since the last push, but I haven't forgotten it. Some other research and MLX stuff has a higher priority than this, but eventually it'll merge.

@Goekdeniz-Guelmez Goekdeniz-Guelmez marked this pull request as draft July 26, 2025 08:02
@Goekdeniz-Guelmez
Contributor Author

Hey @awni, would you mind trying it out again? I made quite a few changes in the code and again used a tiny model trained on "Hello World!", and it looks like it works:

python -m mlx_lm.generate --model Goekdeniz-Guelmez/MiniMax01Text-Dev  --prompt "Hello" --max-tokens 5
Fetching 14 files: 100%|█████████████████████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 78608.11it/s]
==========
Hello World!
Hello World
==========
Prompt: 15 tokens, 956.528 tokens-per-sec
Generation: 5 tokens, 726.577 tokens-per-sec
Peak memory: 0.439 GB
Loading pretrained model
Fetching 14 files: 100%|█████████████████████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 19891.69it/s]
Loading datasets
Loading Hugging Face dataset mlx-community/wikisql.
Training
Trainable parameters: 0.189% (0.201M/106.107M)
Starting training..., iters: 100
Calculating loss...: 100%|████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.89it/s]
Iter 1: Val loss 12.293, Val took 0.531s
Iter 10: Train loss 12.250, Learning Rate 1.000e-05, It/sec 2.259, Tokens/sec 159.242, Trained Tokens 705, Peak mem 0.761 GB
Iter 20: Train loss 12.237, Learning Rate 1.000e-05, It/sec 18.659, Tokens/sec 1533.763, Trained Tokens 1527, Peak mem 0.970 GB
Iter 30: Train loss 12.198, Learning Rate 1.000e-05, It/sec 16.102, Tokens/sec 1407.320, Trained Tokens 2401, Peak mem 0.978 GB
Iter 40: Train loss 12.185, Learning Rate 1.000e-05, It/sec 23.160, Tokens/sec 1982.510, Trained Tokens 3257, Peak mem 0.978 GB
Iter 50: Train loss 12.131, Learning Rate 1.000e-05, It/sec 24.815, Tokens/sec 1997.646, Trained Tokens 4062, Peak mem 0.978 GB
Iter 60: Train loss 12.037, Learning Rate 1.000e-05, It/sec 24.989, Tokens/sec 1881.680, Trained Tokens 4815, Peak mem 0.978 GB
Iter 70: Train loss 11.983, Learning Rate 1.000e-05, It/sec 24.714, Tokens/sec 2075.957, Trained Tokens 5655, Peak mem 0.978 GB
Iter 80: Train loss 11.834, Learning Rate 1.000e-05, It/sec 24.961, Tokens/sec 2081.718, Trained Tokens 6489, Peak mem 0.978 GB
Iter 90: Train loss 11.688, Learning Rate 1.000e-05, It/sec 23.163, Tokens/sec 2073.105, Trained Tokens 7384, Peak mem 0.978 GB
Calculating loss...: 100%|████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 63.82it/s]
Iter 100: Val loss 11.567, Val took 0.017s
Iter 100: Train loss 11.550, Learning Rate 1.000e-05, It/sec 24.986, Tokens/sec 1873.923, Trained Tokens 8134, Peak mem 0.978 GB

@Goekdeniz-Guelmez Goekdeniz-Guelmez marked this pull request as ready for review October 8, 2025 14:02
@Goekdeniz-Guelmez Goekdeniz-Guelmez requested a review from awni October 8, 2025 14:03
@Goekdeniz-Guelmez
Contributor Author

@awni I will be closing this since we don't need it anymore.

@awni
Member

awni commented Oct 28, 2025

Yes, makes sense. I am not sure anyone will want to run the old MiniMax Text 01; if so, we can revive this / find a way to salvage the relevant pieces.

Thanks for working on it, sorry for the delay getting it reviewed.
