nnx-lm

A collection of open-weight LLMs implemented in JAX with flax.NNX, runnable on any JAX hardware backend. No PyTorch, no HuggingFace Transformers (not even for tokenizers).

Supported Models

  • Qwen/Qwen3-0.6B
  • Qwen/Qwen2.5-Coder-0.5B
  • microsoft/Phi-4-mini-instruct
  • ibm-granite/granite-3.3-2b-instruct
  • THUDM/GLM-4-9B-0414
  • HuggingFaceTB/SmolLM2-135M
  • meta-llama/Llama-3.2-1B-Instruct

All models run without PyTorch or transformers, using a custom tokenizer and model loader.

Quick Start

pip install nnx-lm
nlm -p "Give me a short introduction to large language model.\n"
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just

=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just text

=== Benchmarks ===
Prompt processing: 28.4 tokens/sec (18 tokens in 0.6s)
Token generation: 22.8 tokens/sec (100 tokens in 4.4s)

Examples

Scan (the --scan flag; decoding runs roughly 3× faster in the run below, see the sketch after this example):

nlm --scan -p "Give me a short introduction to large language model.\n"
=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by defining what they mean. They might be a student or someone new to the field, or maybe they just want a simple intro.

I should avoid technical terms and instead focus on the key features. Let's see, the introduction should highlight the advantages of the current understanding.

The user's example is a bit of the structure where the assistant has to generate a sentence.

So, the answer is

=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 76.0 tokens/sec (100 tokens in 1.3s)
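
Why --scan helps, as a generic illustration rather than nnx-lm's actual code: rolling the per-token decode loop into jax.lax.scan lets XLA compile the whole loop once instead of dispatching every step from Python. A minimal, self-contained sketch, where the step function is a stand-in for a real transformer decode step:

import jax
import jax.numpy as jnp
from functools import partial

# Stand-in for one decode step: consume the carry (think "KV cache"),
# produce a per-step output (think "next token").
def step(carry, _):
    new_carry = carry * 1.01 + 1.0
    return new_carry, new_carry

@partial(jax.jit, static_argnums=1)
def decode(carry0, num_tokens):
    # One fused, compiled loop over all steps, instead of num_tokens
    # separate Python-dispatched calls.
    _, outputs = jax.lax.scan(step, carry0, None, length=num_tokens)
    return outputs

print(decode(jnp.float32(0.0), 100).shape)  # (100,)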

Batch (multiple prompts in one call):

nlm -p "Give me a short introduction to large language model.\n" "#write a quick sort algorithm\n"
=== Input ===
#write a quick sort algorithm

=== Output===
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + [pivot] + quicksort(right)

arr = [5, 3, 8, 1, 4, 2]
print(quicksort(arr))

#this code is not working

=== Input ===
Give me a short introduction to large language model.

=== Output===
Large language models (LLMs) are artificial intelligence models that can understand and generate human language. They are trained on vast amounts of text data to understand and generate human language. LLMs are used in various applications, such as chatbots, translation, and content creation. They are also used in other areas like customer service, customer support, and even in creative writing. LLMs are becoming more advanced and are capable of understanding and generating more complex language. They are also being used in research

=== Benchmarks ===
Prompt processing: 31.6 tokens/sec (20 tokens in 0.6s)
Token generation: 45.0 tokens/sec (200 tokens in 4.4s)

Batched scan (multiple prompts with --scan):

nlm --scan -p "Give me a short introduction to large language model.\n" "#write a quick sort algorithm\n"
=== Benchmarks ===
Prompt processing: 32.0 tokens/sec (20 tokens in 0.6s)
Token generation: 135.7 tokens/sec (200 tokens in 1.5s)

Jit (the --jit flag):

nlm --jit -p "Give me a short introduction to large language model.\n"
UserWarning: Some donated buffers were not usable: ShapedArray(int32[1,1]), ShapedArray(float32[1,1]), ShapedArray(bfloat16[1,1,1,118]), ShapedArray(bfloat16[1,8,118,128]), ... [identical entries elided] ..., ShapedArray(bool[1,1]).
Donation is not implemented for ('METAL',).
See an explanation at https://jax.readthedocs.io/en/latest/faq.html#buffer-donation.
  warnings.warn("Some donated buffers were not usable:"

<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can

=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can handle

=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 18.0 tokens/sec (100 tokens in 5.6s)
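
The UserWarning above is benign and comes from JAX buffer donation: a jitted function can declare that it may overwrite an input buffer in place (useful for a cache that is updated every step), but the METAL backend does not implement donation, so JAX warns and falls back to copying. The repeated bfloat16[1,8,118,128] buffers are presumably the per-layer key/value caches. A generic illustration of the mechanism, not nnx-lm's code:

import jax
import jax.numpy as jnp
from functools import partial

# donate_argnums=0 tells XLA it may reuse the memory of `cache` for the
# output; backends without donation support warn and copy instead.
@partial(jax.jit, donate_argnums=0)
def update_cache(cache, new_row, pos):
    return cache.at[pos].set(new_row)

cache = jnp.zeros((8, 4), dtype=jnp.bfloat16)
# After the call, the old `cache` buffer must not be reused;
# rebinding the name, as here, is the usual pattern.
cache = update_cache(cache, jnp.ones(4, dtype=jnp.bfloat16), 0)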

Python:

import nnxlm as nl
m = nl.load('Qwen/Qwen3-0.6B')
nl.generate(*m, ["#write a quick sort algorithm\n", "Give me a short introduction to large language model.\n"])
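
The same load/generate pair should work for any entry in the supported models list; for example (model id taken from the list above, prompt text illustrative):

import nnxlm as nl

m = nl.load('HuggingFaceTB/SmolLM2-135M')  # any supported model id
nl.generate(*m, ["Explain quicksort in one paragraph.\n"])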

Test:

nl.main.test()
