This directory contains one script per tiny model used by the TRL test suite. Each script builds a random-weight, minimally-sized model on top of a real tokenizer/processor and pushes it to the trl-internal-testing organization on the Hub.
```
generate_tiny_models/
├── _common.py                   # shared helpers (push_to_hub, smoke_test, print_config_diff, ...)
├── for_causal_lm/               # *ForCausalLM + GPT-2 LM head + small/PEFT variants
├── for_sequence_classification/ # *ForSequenceClassification (reward models)
└── for_conditional_generation/  # *ForConditionalGeneration (VLMs + T5 + Bart encoder-decoder)
```
From the repo root, invoke a script by its module path:

```shell
python -m scripts.generate_tiny_models.for_causal_lm.qwen3_for_causal_lm
```

Each script:
- Checks that the installed `transformers` version matches the one pinned in the script (fails otherwise).
- Builds the tiny model with random weights.
- Runs `smoke_test` — a minimal forward pass to catch config misspecification and NaNs.
- Runs `print_config_diff` — prints every flat-key difference between the reference Hub config and the tiny model's config (for debugging scale-downs).
- Pushes the model, tokenizer/processor, generation config, and model card to the Hub.
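The idea behind the smoke test can be sketched as follows (an illustration only: the real `smoke_test` lives in `_common.py`, and the toy model below stands in for an actual tiny checkpoint):

```python
import torch
import torch.nn as nn


def smoke_test(model, vocab_size, seq_len=8):
    """One forward pass on random token ids; fail fast on NaNs.

    A misconfigured tiny model usually either crashes or produces NaNs
    on the very first forward, so this catches most scale-down mistakes.
    """
    input_ids = torch.randint(0, vocab_size, (1, seq_len))
    with torch.no_grad():
        output = model(input_ids)
    # HF models return a ModelOutput; this toy model returns a bare tensor.
    logits = output if isinstance(output, torch.Tensor) else output[0]
    assert not torch.isnan(logits).any(), "NaNs in logits"
    return logits.shape


# Stand-in for a tiny causal LM: embedding followed by an LM head.
tiny = nn.Sequential(nn.Embedding(32, 4), nn.Linear(4, 32))
print(smoke_test(tiny, vocab_size=32))  # torch.Size([1, 8, 32])
```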
If the repo already exists on the Hub, the push is skipped (pass `force=True` to `push_to_hub(...)` to overwrite).
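The skip reduces to this decision (a sketch: `repo_exists` stands in for an actual Hub existence check, and the real logic lives inside `push_to_hub` in `_common.py`):

```python
def should_push(repo_exists: bool, force: bool = False) -> bool:
    """Push unless the Hub repo already exists; force=True overrides the skip."""
    return force or not repo_exists


print(should_push(repo_exists=True))              # False: repo exists, push skipped
print(should_push(repo_exists=True, force=True))  # True: overwrite
print(should_push(repo_exists=False))             # True: first push
```

This makes rerunning a script idempotent: a second invocation without `force=True` never clobbers an existing tiny model.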
Every script declares `TRANSFORMERS_VERSION = "X.Y.Z"`, which is:

```
max(version that introduced the model, TRL's transformers floor)
```
The floor (currently 4.56.2) is the `transformers>=` bound from `pyproject.toml`. Scripts for models introduced after the floor pin a higher version (e.g. Qwen3-VL pins 4.57.0, Gemma4 pins 5.6.0). The check is an exact match via `packaging.version.Version`; install the pinned version before running.
Why exact? `transformers` is backward-compatible (a checkpoint saved by X loads on any ≥ X) but not forward-compatible. TRL CI runs against the floor, so tiny models must be saved with the oldest version that supports them — any newer save risks using config fields the floor can't parse. The exact-match check prevents accidental drift.
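The exact-match check is a comparison of `packaging.version.Version` objects (a sketch: real scripts compare `transformers.__version__` against their own pin, and the helper name here is illustrative):

```python
from packaging.version import Version

TRANSFORMERS_VERSION = "4.56.2"  # example pin; each script declares its own


def check_transformers_version(installed, pinned=TRANSFORMERS_VERSION):
    """Raise unless the installed version exactly matches the pin.

    Version() normalizes the strings per PEP 440, so "v4.56.2" and
    "4.56.2" compare equal, but any real difference (e.g. 4.57.0) raises.
    """
    if Version(installed) != Version(pinned):
        raise RuntimeError(
            f"transformers=={pinned} required, found {installed}; "
            f"install the pinned version before running the script"
        )


check_transformers_version("4.56.2")  # exact match: passes silently
```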
To add a new model:

- Pick the right subfolder based on the model class suffix (`ForCausalLM`, `ForSequenceClassification`, `ForConditionalGeneration`).
- Copy an existing script with the closest shape and adapt it — reference model id, config class, model class, special kwargs.
- Set `TRANSFORMERS_VERSION` to the release that introduced the model (or to the TRL floor, whichever is higher).
- Run it. Inspect the `[smoke_test]` and `[config_diff]` output before letting it push.
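The adaptation step is mostly a scale-down of the reference config. A sketch of that step, with hypothetical size overrides (real scripts start from the reference checkpoint's config class and use the helpers in `_common.py`; field names below are typical for a causal LM, so check your model's actual config class):

```python
# Reference sizes here are illustrative, not taken from a real checkpoint.
REFERENCE_CONFIG = {
    "hidden_size": 4096,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "intermediate_size": 11008,
    "vocab_size": 151936,  # kept as-is: must match the real tokenizer
}

# Shrink everything except vocab_size; hidden_size must stay divisible
# by num_attention_heads (8 / 4 = 2 dims per head).
TINY_OVERRIDES = {
    "hidden_size": 8,
    "num_hidden_layers": 2,
    "num_attention_heads": 4,
    "intermediate_size": 16,
}


def scale_down(reference, overrides):
    """Tiny config = reference config with size fields overridden."""
    tiny = dict(reference)
    tiny.update(overrides)
    return tiny


tiny_config = scale_down(REFERENCE_CONFIG, TINY_OVERRIDES)

# print_config_diff-style output: every flat key that changed.
for key in sorted(tiny_config):
    if tiny_config[key] != REFERENCE_CONFIG[key]:
        print(f"{key}: {REFERENCE_CONFIG[key]} -> {tiny_config[key]}")
```

Keeping `vocab_size` (and any special token ids) aligned with the real tokenizer is what lets the tiny model pair with a genuine tokenizer/processor.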