Proposal: Add CAJAL — Scientific Language Model for Paper Generation
CAJAL is a family of open-source language models (4B-9B parameters) specifically trained for generating structured scientific papers with real citations, LaTeX output, and domain-specific reasoning.
Why CAJAL fits this list:
- Scientific Language Model — Trained on 500+ scientific papers with structured sections
- Academic Domain — Generates Abstract, Introduction, Methods, Results, Discussion, Conclusions
- Open Source — Qwen-based architecture, MIT licensed, full training scripts available
- Local Execution — GGUF format for llama.cpp/Ollama, runs on 4-6GB VRAM
- Citations — Integrates with OpenAlex/Semantic Scholar for real bibliography
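To illustrate the citation point above, here is a minimal Python sketch of turning an OpenAlex-style work record into a BibTeX entry. It is not CAJAL's actual integration code, which this proposal does not describe. The helper names `build_search_url` and `work_to_bibtex` are hypothetical, and the sample record is a made-up placeholder. The field names (`title`, `publication_year`, `doi`, `authorships`) are assumed to follow the public OpenAlex works schema.

```python
from urllib.parse import urlencode

# Public OpenAlex works endpoint; the helpers below are illustrative only.
OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_search_url(query: str, per_page: int = 5) -> str:
    """Return an OpenAlex works-search URL for a free-text query."""
    params = {"search": query, "per-page": per_page}
    return f"{OPENALEX_WORKS_URL}?{urlencode(params)}"


def work_to_bibtex(work: dict) -> str:
    """Format one OpenAlex-style work record as a BibTeX @article entry."""
    authors = [a["author"]["display_name"] for a in work.get("authorships", [])]
    year = work.get("publication_year")
    # Citation key: first author's surname plus year, e.g. "lovelace2020".
    first_surname = authors[0].split()[-1].lower() if authors else "anon"
    key = f"{first_surname}{year or ''}"
    fields = {
        "title": work.get("title") or "",
        "author": " and ".join(authors),
        "year": str(year) if year else "",
        # OpenAlex returns DOIs as full URLs; BibTeX wants the bare DOI.
        "doi": (work.get("doi") or "").removeprefix("https://doi.org/"),
    }
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"


if __name__ == "__main__":
    # Hypothetical placeholder record, not a real publication.
    sample = {
        "title": "A Test Paper",
        "publication_year": 2020,
        "doi": "https://doi.org/10.1000/xyz",
        "authorships": [{"author": {"display_name": "Ada Lovelace"}}],
    }
    print(build_search_url("protein folding"))
    print(work_to_bibtex(sample))
```

Keeping the formatter separate from the network call means the bibliography step can be checked offline against stored records, so generated citations can be verified rather than left to the model.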
Model Specs:
| Model | Base | Size | Context | Format |
| --- | --- | --- | --- | --- |
| CAJAL-4B | Qwen2.5-4B-Instruct | ~3GB (Q4_K_M) | 32K | GGUF, PyTorch |
| CAJAL-9B | Qwen3.6-9B-Instruct | ~5.5GB (Q5_K_M) | 32K | GGUF, PyTorch |
Links:
Suggested Section:
Scientific Paper Generation (new subsection)
| Model | Paper | GitHub | Size | Domain |
| --- | --- | --- | --- | --- |
| CAJAL | - | GitHub | 4B-9B | Multi-domain scientific papers |
Happy to submit a PR if there's interest!