citum-migrate converts a CSL 1.0 style (.csl) into a Citum style (.yaml).
The migration pipeline is now output-driven first:
- Extract global options from CSL XML.
  - Includes processing/disambiguation extraction and citation-sort mapping.
  - Emits citation/bibliography contributor overrides when et-al thresholds differ by scope.
- Resolve citation and bibliography templates from inferred output artifacts.
- Fall back to XML template compilation only when template artifacts are missing or rejected.
This keeps option extraction deterministic while scaling template migration to large style corpora.
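As a rough illustration of the et-al rule in the list above, the sketch below compares the settings from the two scopes and produces per-scope overrides only when they differ. The function name, the dictionary keys, and the returned shape are hypothetical and do not reflect the actual citum-migrate data model.

```python
from typing import Optional


def contributor_overrides(citation: Optional[dict], bibliography: Optional[dict]) -> dict:
    """Return shared et-al settings, or per-scope overrides when the scopes differ.

    Inputs are hypothetical dictionaries such as {"min": 3, "use_first": 1}.
    """
    if citation == bibliography:
        # Same thresholds in both scopes: one shared setting, no overrides.
        return {"shared": citation, "overrides": {}}
    overrides = {}
    if citation is not None:
        overrides["citation"] = citation
    if bibliography is not None:
        overrides["bibliography"] = bibliography
    return {"shared": None, "overrides": overrides}
```

With identical thresholds the result carries no overrides; with different thresholds each scope keeps its own settings.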
When the target style is already known in the repo as a profile or journal
wrapper, citum-migrate now derives that lineage from current repo truth and
may emit extends:-based wrapper output instead of flattening everything into a
standalone style. Unknown or unresolved styles still fall back to standalone
output.
cargo run --bin citum-migrate -- <style.csl> [flags]
Example:
cargo run --bin citum-migrate -- styles-legacy/apa.csl > styles/apa.yaml
Flags:
- --template-source auto|hand|inferred|xml
- --live-infer-backend auto|embedded|node
- --template-dir <path>
- --min-template-confidence <0.0..1.0>
- --debug-variable <name>
Values for --template-source:
- auto (default): hand-authored -> inferred cache/live -> XML fallback
- hand: hand-authored only -> XML fallback
- inferred: inferred cache only -> XML fallback
- xml: XML templates only
Important: inferred mode is cache-only and never runs live Node/citeproc-js inference.
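The mode semantics for --template-source can be summarized as a lookup table. This is a minimal sketch, not the tool's implementation: the function name and the source labels (`inferred-cache`, `inferred-live`) are illustrative.

```python
# Ordered template sources tried for each --template-source value.
# The labels are illustrative, not identifiers from citum-migrate.
SOURCE_ORDER = {
    "auto": ["hand", "inferred-cache", "inferred-live", "xml"],
    "hand": ["hand", "xml"],
    "inferred": ["inferred-cache", "xml"],  # cache only, never live inference
    "xml": ["xml"],
}


def source_order(mode: str = "auto") -> list:
    """Return the sources to try, in order, for the given mode."""
    if mode not in SOURCE_ORDER:
        raise ValueError(f"unknown --template-source value: {mode!r}")
    return list(SOURCE_ORDER[mode])
```

Note that only `auto` ever reaches live inference; `inferred` stops at the cache and then falls back to XML.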
Values for --live-infer-backend:
- auto (default): embedded JS runtime first, then Node subprocess fallback
- embedded: embedded JS runtime only
- node: legacy Node subprocess only
This flag only applies when --template-source auto needs live inference after
cache lookup. Cache hits still win first.
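A small sketch of when the backend flag matters, under the assumptions stated in the comments. The function name and the `cache_hit` parameter are hypothetical, and the hand-authored lookup that also precedes live inference is left out for brevity.

```python
def live_backends(template_source: str, backend: str = "auto", cache_hit: bool = False) -> list:
    """Return the live-inference backends to try, in order (possibly none)."""
    order = {
        "auto": ["embedded", "node"],  # embedded first, Node as fallback
        "embedded": ["embedded"],
        "node": ["node"],
    }
    if backend not in order:
        raise ValueError(f"unknown --live-infer-backend value: {backend!r}")
    if template_source != "auto" or cache_hit:
        # Live inference is only reached in auto mode after a cache miss.
        return []
    return order[backend]
```

For example, `live_backends("inferred", "node")` returns an empty list, because inferred mode is cache-only.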
In this README, hand-authored means a checked-in Citum style YAML file created
manually (human or agent-assisted), not generated by citum-migrate or
infer-template.js.
Path convention:
examples/<style-name>-style.yaml
citum-migrate reads citation and bibliography templates from that file when
available. Resolution is section-level:
- if the hand-authored file contains only bibliography template data, citation can still come from inferred cache (or XML fallback)
- if it contains both sections, both are used
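Section-level resolution can be pictured as follows. This is a simplified sketch: the function name and the dictionary inputs are hypothetical stand-ins for the hand-authored file and the inferred cache, and live inference is omitted.

```python
SECTIONS = ("citation", "bibliography")


def resolve_sections(hand_authored: dict, inferred_cache: dict) -> dict:
    """Pick a template source for each section independently."""
    resolved = {}
    for section in SECTIONS:
        if hand_authored.get(section) is not None:
            resolved[section] = "hand"
        elif inferred_cache.get(section) is not None:
            resolved[section] = "inferred"
        else:
            resolved[section] = "xml"  # last-resort XML template compilation
    return resolved
```

A hand-authored file with only a bibliography template therefore leaves the citation section free to come from the inferred cache.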
In auto mode:
- examples/<style-name>-style.yaml (hand-authored template sections)
- templates/inferred/<style-name>.bibliography.json
- templates/inferred/<style-name>.citation.json
- Legacy cache compatibility: templates/inferred/<style-name>.json (bibliography)
- Live inference via embedded JS runtime (auto mode default)
- Live inference via scripts/infer-template.js Node fallback (auto mode only)
- XML template compiler fallback
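The file-based part of that lookup can be sketched as a list of candidate paths per section. The helper name is hypothetical; paths are relative to the repo root, and any effect of --template-dir is ignored here.

```python
def file_candidates(style_name: str, section: str) -> list:
    """List the files checked for one section before live inference or XML fallback."""
    if section not in ("citation", "bibliography"):
        raise ValueError(f"unknown section: {section!r}")
    paths = [
        f"examples/{style_name}-style.yaml",
        f"templates/inferred/{style_name}.{section}.json",
    ]
    if section == "bibliography":
        # The legacy single-file cache only carries bibliography data.
        paths.append(f"templates/inferred/{style_name}.json")
    return paths
```

For `apa`, the bibliography section gets three candidates and the citation section two.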
The embedded runtime bundle is committed at:
crates/citum-migrate/js/embedded-template-runtime.js
Regenerate it after changing the host-neutral inference core or citeproc bundle:
node scripts/build-embedded-template-runtime.js
For large-scale migration, precompute inferred templates once, then run Rust migrations without citeproc-js:
# 1) Precompute inferred template cache for all parent styles
./scripts/batch-infer.sh
# 2) Or precompute selected styles
./scripts/batch-infer.sh --styles "apa elsevier-harvard ieee"
# 3) Migrate using cache-only inferred mode (no live Node inference)
cargo run --bin citum-migrate -- styles-legacy/apa.csl --template-source inferred
Section-keyed cache files:
- templates/inferred/STYLE_NAME.bibliography.json
- templates/inferred/STYLE_NAME.citation.json
Each file is produced by:
node scripts/infer-template.js styles-legacy/STYLE_NAME.csl --section=bibliography --fragment
node scripts/infer-template.js styles-legacy/STYLE_NAME.csl --section=citation --fragment
Fragment shape:
{
  "meta": {
    "style": "apa",
    "confidence": 0.85,
    "delimiter": ". ",
    "entrySuffix": ".",
    "wrap": "parentheses"
  },
  "bibliography": {
    "template": []
  }
}
Citation artifacts use the same shape, with a citation section key in place of bibliography.
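To make the fragment shape concrete, here is a hedged sketch of a reader that checks for the `meta` object and the expected section key. The function name and the validation rules are assumptions for illustration; the real loader may accept or require more.

```python
import json


def read_fragment(text: str, section: str) -> tuple:
    """Parse a fragment and return (meta, template) for the given section."""
    data = json.loads(text)
    meta = data.get("meta")
    body = data.get(section)
    if not isinstance(meta, dict) or not isinstance(body, dict):
        raise ValueError(f"fragment lacks 'meta' or {section!r} section")
    template = body.get("template")
    if not isinstance(template, list):
        raise ValueError(f"{section!r} section has no 'template' list")
    return meta, template


example = """
{
  "meta": {"style": "apa", "confidence": 0.85, "delimiter": ". ",
           "entrySuffix": ".", "wrap": "parentheses"},
  "bibliography": {"template": []}
}
"""
meta, template = read_fragment(example, "bibliography")
print(meta["style"], meta["confidence"], template)
```

Asking the same fragment for its `citation` section raises `ValueError`, since the section key is missing.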
--min-template-confidence rejects inferred fragments below threshold before use.
Example:
cargo run --bin citum-migrate -- styles-legacy/apa.csl \
  --template-source auto \
  --min-template-confidence 0.80
When rejected, migration falls back to XML template compilation for that section.
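The confidence gate amounts to a per-section comparison. In this sketch the function name is hypothetical, and two details are assumptions rather than documented behavior: a fragment exactly at the threshold is accepted, and a missing `confidence` value is treated as 0.0.

```python
def choose_section_source(meta: dict, min_confidence: float) -> str:
    """Return "inferred" if the fragment passes the threshold, else "xml"."""
    # Assumption: a missing confidence counts as 0.0.
    confidence = float(meta.get("confidence", 0.0))
    # Assumption: only fragments strictly below the threshold are rejected.
    return "inferred" if confidence >= min_confidence else "xml"
```

With the fragment shown earlier (confidence 0.85) and a threshold of 0.80, the inferred template is kept; a fragment at 0.75 would fall back to XML for that section only.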
citum-migrate does not guarantee perfect output equivalence for every legacy
style without review. Current expectations:
- Inferred templates are primarily used to raise bibliography fidelity.
- Citation fidelity is protected by guardrails and section-level XML fallback.
- Note styles can still require manual review/tuning more often than author-date and numeric styles.
As of February 19, 2026, a random stratified benchmark of 30 styles (author-date, numeric, note) showed:
- Citation: XML 90.8% vs inferred 90.4% (-0.4pp)
- Bibliography: XML 89.5% vs inferred 93.3% (+3.8pp)
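The percentage-point deltas follow directly from the reported figures. The snippet below only recomputes them; the variable and function names are illustrative.

```python
# Benchmark figures as reported above (percent).
results = {
    "citation": {"xml": 90.8, "inferred": 90.4},
    "bibliography": {"xml": 89.5, "inferred": 93.3},
}


def delta_pp(section: str) -> float:
    """Inferred minus XML, in percentage points, rounded to one decimal."""
    scores = results[section]
    return round(scores["inferred"] - scores["xml"], 1)


for section in results:
    print(f"{section}: {delta_pp(section):+.1f}pp")
```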
Use oracle validation for style-level acceptance:
node scripts/oracle.js styles-legacy/your-style.csl --json
Notes:
- Output is written to stdout; redirect to a file as needed.
- Options extraction remains XML-based by design.
- Template inference is output-driven to avoid procedural CSL template translation bottlenecks.