Translation Guide

This document describes how translations work for the Nextflow training materials.

Warning

All translations are generated and maintained by AI. Do not submit manual translations - they will be overwritten by automated updates. Instead, improve the translation prompts to fix issues permanently.

Overview

Translations are managed through a combination of:

LLM prompts that define translation rules and glossaries
GitHub Actions that automatically regenerate translations
Human review to catch errors and improve prompts

The key insight: to fix a translation, fix the prompt - not the translated file.

Contents

How to Improve Existing Translations
How Automatic Translation Updates Work
Reviewing Translation PRs
How to Add a Missing Course
How to Add a New Language
Directory Structure
CLI Reference (For Maintainers)

How to Improve Existing Translations

Found a translation error or want to suggest an improvement? Here's how to fix it the right way.

flowchart TD
    A[Find translation error] --> B{Type of issue?}
    B -->|Wrong term| C[Update glossary in<br>docs/LANG/llm-prompt.md<br>or glossary.yml]
    B -->|Wrong style/tone| D[Update grammar rules in<br>docs/LANG/llm-prompt.md]
    B -->|Structural issue| E[Update rules in<br>_scripts/general-llm-prompt.md]
    C --> F[Open PR with prompt change]
    D --> F
    E --> F
    F --> G[Trigger translation workflow<br>on PR branch]
    G --> H[Review updated translation]
    H --> I{Correct?}
    I -->|Yes| J[Merge PR]
    I -->|No| B

Note

The translation workflow can only run on branches in the main repository, not on forks. If you'd like to contribute translation improvements and need write access, please open an issue to request it.

The Right Way: Update the Prompt

The only sustainable way to fix translations is to improve the LLM prompts:

1. For language-specific issues (terminology, tone, grammar):
   - Edit docs/<lang>/llm-prompt.md
   - Add glossary terms, clarify rules, provide examples
2. For structural issues (code blocks, formatting, links):
   - Edit _scripts/general-llm-prompt.md
   - Add rules with before/after examples
3. Re-run the translation via GitHub Actions:
   - Go to Actions → Translate → Run workflow
   - Select the language and the command (sync)
4. Submit a PR with both the prompt change and the regenerated translation
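
The routing from issue type to prompt file can be summarized in a few lines of Python. This is only a sketch for orientation: the function name and the issue-type keys (`term`, `tone`, `structure`) are made up for illustration and are not part of the translation CLI.

```python
def prompt_file_for(issue_type: str, lang: str) -> str:
    """Return the prompt file to edit for a given kind of translation issue.

    The issue-type keys are hypothetical labels, not CLI options.
    """
    language_prompt = f"docs/{lang}/llm-prompt.md"
    routes = {
        "term": language_prompt,       # wrong terminology: glossary section
        "tone": language_prompt,       # wrong style or tone: grammar rules
        "structure": "_scripts/general-llm-prompt.md",  # code blocks, links, formatting
    }
    return routes[issue_type]


print(prompt_file_for("term", "es"))
print(prompt_file_for("structure", "es"))
```

Wrong terms may also belong in docs/<lang>/glossary.yml when they concern the deterministic post-processing strings rather than running text.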

Why Not Edit Translations Directly?

Direct edits are overwritten when English content changes
There's no way to track why a translation differs from the AI output
Future maintainers won't know which changes were intentional
The same error will reappear in new content

Reporting Issues

If you find errors but can't fix the prompts yourself:

Open a GitHub issue
Include: language, file path, current text, expected text
Explain why the current translation is wrong
A maintainer will update the prompt and re-run

Understanding the Translation Prompts

The translation system uses two types of prompts:

General Prompt (`_scripts/general-llm-prompt.md`)

Contains rules that apply to ALL languages:

What to translate vs keep in English
Code block handling (never translate syntax, always translate comments)
Formatting preservation (links, anchors, admonitions)
Technical term lists (Nextflow keywords, operators, directives)
Validation requirements

Language-Specific Prompts (`docs/<lang>/llm-prompt.md`)

Contains rules specific to each language:

Grammar and tone (formal/informal)
Terms that get translated (with exact translations)
Admonition titles
Common expressions
Language-specific mistakes to avoid

Prompt Precedence

When prompts conflict, language-specific rules override general rules. This allows languages to customize behavior (e.g., French keeps "workflow" in English even though it could theoretically be translated).
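
One way to picture the precedence rule is as a dictionary merge in which the language-specific entries are applied last. The sketch below is an illustration only: the rule names and values are hypothetical, and it is not necessarily how the translation tool combines the two prompts.

```python
def merge_rules(general: dict[str, str], language: dict[str, str]) -> dict[str, str]:
    """Combine two rule sets; language-specific entries win on conflict."""
    merged = dict(general)   # start from the rules shared by all languages
    merged.update(language)  # language-specific rules override them
    return merged


# Hypothetical rule sets, for illustration only.
general_rules = {"workflow": "translate", "channel": "keep in English"}
french_rules = {"workflow": "keep in English"}

effective = merge_rules(general_rules, french_rules)
print(effective["workflow"])  # keep in English (overridden by the French rule)
print(effective["channel"])   # keep in English (inherited from the general rules)
```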

Contributing Prompt Improvements

If you're a native speaker and want to improve translation quality:

Read the current prompt for your language in docs/<lang>/llm-prompt.md
Identify the issue type:
- Wrong term → Add/update entry in "Terms to Translate"
- Wrong tone → Clarify in "Grammar & Tone" section
- Repeated error → Add to "Common Mistakes" section
Provide examples showing wrong vs correct translations
Test your change by running the translation workflow
Submit a PR with both prompt change and sample output

Good prompt improvements include:

Adding missing glossary terms
Clarifying ambiguous rules with examples
Adding common mistakes that keep occurring
Regional spelling/grammar preferences

How Automatic Translation Updates Work

Translations are automatically updated via GitHub Actions when:

English source files change → Outdated translations are updated
Translation prompts change → Existing translations are fixed to comply with new guidelines

flowchart TD
    A[Change detected] --> B{What changed?}
    B -->|English content| C[Detect outdated translations]
    B -->|Language prompt| D[Fix that language's translations]
    B -->|General prompt| E[Fix ALL languages]
    C --> F{Any outdated?}
    F -->|No| G[Done]
    F -->|Yes| H[AI updates changed sections]
    D --> I[AI reviews & fixes translations]
    E --> I
    H --> J[Create PR for each language]
    I --> J
    J --> K[Human review]
    K --> L{Approved?}
    L -->|Yes| M[Merge PR]
    L -->|No| N[Update llm-prompt.md]
    N --> O[Re-run translation]
    O --> K

Key Points

The AI makes minimal changes, updating only sections that changed in English
Translations preserve line-by-line structure for easy diff review
Each language gets a separate PR for independent review/merge
The system uses git commit timestamps to detect outdated files
Prompt changes trigger automatic re-translation of affected files
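
The timestamp comparison behind outdated-file detection can be sketched as follows. This is a minimal illustration, assuming the check compares the last commit time of the English file with that of its translation; the function names are hypothetical and the real logic in the translate package may differ.

```python
import subprocess
from pathlib import Path


def last_commit_timestamp(path: Path) -> int:
    """Return the Unix timestamp of the last commit touching `path` (0 if none)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", str(path)],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
    return int(out) if out else 0


def is_outdated(source_ts: int, translation_ts: int) -> bool:
    """A translation is stale when its English source was committed more recently."""
    return source_ts > translation_ts
```

Inside a clone of the repository, the two helpers could be combined as `is_outdated(last_commit_timestamp(en_file), last_commit_timestamp(pt_file))`, where both paths are placeholders for a source file and its translation.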

Reviewing Translation PRs

When reviewing a translation PR (whether automatic or triggered manually), follow these guidelines:

What to Check

Technical accuracy
- Are Nextflow concepts correctly explained?
- Are code examples unchanged (only comments translated)?
- Are technical terms used consistently with the glossary?
Formatting preservation
- Are code blocks intact and properly formatted?
- Are admonitions (note, tip, warning) correctly structured?
- Are heading anchors preserved ({ #anchor-name })?
- Are links working (URLs unchanged, only link text translated)?
Language quality
- Is the tone appropriate (formal/informal per language)?
- Are translations natural and readable?
- Are there any obvious errors or awkward phrasings?
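
Some of the formatting checks can be automated during review. The sketch below covers just one of them, comparing the `{ #anchor-name }` identifiers on heading lines of the English and translated files. It is a hypothetical reviewer aid, not part of the translation tooling, and the regular expression assumes anchors contain only letters, digits, underscores, and hyphens.

```python
import re

# Matches heading anchors written as { #anchor-name }.
ANCHOR_RE = re.compile(r"\{\s*#([\w-]+)\s*\}")


def heading_anchors(markdown: str) -> list[str]:
    """Collect anchor identifiers from heading lines, in document order."""
    anchors = []
    for line in markdown.splitlines():
        if line.lstrip().startswith("#"):
            anchors.extend(ANCHOR_RE.findall(line))
    return anchors


def anchors_preserved(english: str, translated: str) -> bool:
    """True when both documents carry the same anchors in the same order."""
    return heading_anchors(english) == heading_anchors(translated)


english = "## Getting started { #getting-started }\n"
good = "## Primeiros passos { #getting-started }\n"
bad = "## Primeiros passos { #primeiros-passos }\n"
print(anchors_preserved(english, good))  # True
print(anchors_preserved(english, bad))   # False
```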

How to Handle Issues

Caution

Do NOT suggest changes directly to translation PRs. Direct edits will be overwritten on the next automatic update. Instead, update the translation prompts and re-run the translation.

When you find an issue during review:

flowchart TD
    A[Find translation error] --> B{Type of issue?}
    B -->|Wrong term| C[Update glossary in<br>docs/LANG/llm-prompt.md<br>or glossary.yml]
    B -->|Wrong style/tone| D[Update grammar rules in<br>docs/LANG/llm-prompt.md]
    B -->|Structural issue| E[Update rules in<br>_scripts/general-llm-prompt.md]
    C --> F[Commit prompt change to same PR]
    D --> F
    E --> F
    F --> G[Trigger translation workflow]
    G --> H[Review updated translation]

Workflow for Fixing Issues

1. Edit the appropriate prompt file in the same PR branch:
   - Language-specific issues → docs/<lang>/llm-prompt.md
   - General formatting issues → _scripts/general-llm-prompt.md
2. Trigger the translation workflow to regenerate:
   - Go to Actions → Translate → Run workflow
   - Select the language and the sync command
   - Target the PR branch (not master)
3. Review the updated translation to verify the fix
4. Approve and merge once the translation is correct

Example: Fixing a Wrong Term

If "workflow" is incorrectly translated as "flujo" instead of "flujo de trabajo" in Spanish:

1. In the PR branch, edit docs/es/llm-prompt.md
2. Add or update the glossary entry:

   | English | Spanish |
   | --- | --- |
   | workflow | flujo de trabajo (NOT "flujo") |

3. Run the translation workflow targeting this branch
4. Verify the fix in the updated PR
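
When verifying a fix like this one, a quick scan of the regenerated file can confirm that the rejected form is gone. The snippet below is a hypothetical reviewer aid for this specific Spanish example, not a feature of the translation workflow; it flags "flujo" or "flujos" when not followed by "de trabajo".

```python
import re

# "flujo"/"flujos" as a whole word, not followed by " de trabajo".
BAD_FLUJO = re.compile(r"\bflujos?\b(?! de trabajo)", re.IGNORECASE)


def find_wrong_term(text: str) -> list[int]:
    """Return 1-based line numbers where the rejected form appears."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if BAD_FLUJO.search(line)
    ]


sample = "El flujo de trabajo procesa los datos.\nEste flujo falla."
print(find_wrong_term(sample))  # [2]
```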

Approving PRs

Once you've verified the translation quality:

Check that CI passes (build succeeds)
Approve the PR
Merge (squash merge recommended)

How to Add a Missing Course

If a language exists but is missing content (e.g., Portuguese has hello_nextflow/ but not nf4_science/):

Using GitHub Actions (Recommended)

Go to Actions → Translate → Run workflow
Select language (e.g., pt)
Select command: sync
The workflow creates a PR with translations

Using the CLI (Requires API Key)

For maintainers with ANTHROPIC_API_KEY access:

cd _scripts

# Translate one file at a time
uv run python -m translate translate nf4_science/index.md --lang pt

# Or sync all (update outdated + add missing + remove orphaned)
uv run python -m translate sync pt
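
The three things `sync` does (update outdated, add missing, remove orphaned) amount to a set comparison between the English files and the translated files. The sketch below shows that classification under simplifying assumptions: files are keyed by relative path, the values are last-commit timestamps, and the function name is hypothetical rather than taken from the CLI.

```python
def plan_sync(source: dict[str, int], translated: dict[str, int]) -> dict[str, list[str]]:
    """Classify files by relative path; values are last-commit timestamps."""
    return {
        # In English but not yet translated.
        "missing": sorted(source.keys() - translated.keys()),
        # Translated but no longer present in English.
        "orphaned": sorted(translated.keys() - source.keys()),
        # Present in both, with a newer English commit.
        "outdated": sorted(
            path
            for path in source.keys() & translated.keys()
            if source[path] > translated[path]
        ),
    }


# Made-up paths and timestamps, for illustration only.
english = {"hello_nextflow/index.md": 300, "nf4_science/index.md": 200}
portuguese = {"hello_nextflow/index.md": 100, "old_course/index.md": 50}
print(plan_sync(english, portuguese))
```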

After Translation

Review the generated translations
Check if any prompt updates are needed
Submit a PR

How to Add a New Language

Step 1: Create Language Structure

Use the GitHub Actions workflow or CLI:

cd _scripts
uv run python docs.py new-lang <lang-code>

This creates:

docs/<lang>/mkdocs.yml - MkDocs config (inherits from English)
docs/<lang>/llm-prompt.md - Translation prompt (requires customization)
docs/<lang>/ui-strings.yml - UI strings (copied from English, requires translation)
docs/<lang>/docs/ - Directory for translated content

Step 2: Customize the Translation Prompt

Edit docs/<lang>/llm-prompt.md to define:

Grammar preferences
- Formal or informal tone
- Regional spelling conventions
- Specific grammar rules
Glossary
- Terms to keep in English
- Terms to translate (with exact translations)
- Common mistakes to avoid
Admonition titles
- Translations for Note, Tip, Warning, Exercise, Solution

See existing language prompts for examples (e.g., docs/pt/llm-prompt.md).

Step 3: Create the Post-Processing Glossary

Create docs/<lang>/glossary.yml with deterministic translations that are enforced by post-processing (independent of LLM output):

translation_notice: AI translation notice text and homepage admonition
tab_labels: canonical Before/After translations (must be consistent for MkDocs tab sync)
admonition_titles: default titles for bare admonitions (note, tip, warning, etc.)
admonition_title_glossary: translations for common titled admonitions (Command output, Directory contents, etc.)

See docs/pt/glossary.yml for an example. These translations are applied deterministically during post-processing and override whatever the LLM produces, ensuring consistency across all files.
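
To make "applied deterministically" concrete, here is a minimal sketch of one such post-processing step: giving bare admonitions a translated title. It assumes the MkDocs `!!! type` / `??? type` admonition syntax and a glossary already loaded into a dictionary; the sample titles and the function name are hypothetical, and the actual behavior of postprocess.py may differ.

```python
import re

# Hypothetical excerpt of a glossary.yml, already parsed into a dict.
glossary = {"admonition_titles": {"note": "Nota", "tip": "Dica", "warning": "Aviso"}}

# A bare admonition: marker plus type, with no quoted title after it.
BARE_ADMONITION = re.compile(r"^(\s*(?:!!!|\?\?\?\+?)\s+)(\w+)\s*$")


def apply_admonition_titles(markdown: str, titles: dict[str, str]) -> str:
    """Add the glossary title to every bare admonition of a known type."""
    out = []
    for line in markdown.splitlines():
        match = BARE_ADMONITION.match(line)
        if match and match.group(2) in titles:
            kind = match.group(2)
            line = f'{match.group(1)}{kind} "{titles[kind]}"'
        out.append(line)
    return "\n".join(out)


print(apply_admonition_titles("!!! note\n\n    Texto", glossary["admonition_titles"]))
```

Because the replacement is a plain lookup, the same input always yields the same title, regardless of what the LLM produced for that line.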

Step 4: Register the Language

Add the language code to docs/language_names.yml
Run uv run docs.py sync-language-picker to update the language switcher

Step 5: Generate Initial Translations

Use GitHub Actions:

Go to Actions → Translate → Run workflow
Select the new language
Select command: sync

Step 6: Review and Iterate

Review the PR with generated translations
Update llm-prompt.md to fix any issues
Re-run translations as needed
Merge when satisfied

Directory Structure

docs/
├── en/                     # English (source)
│   ├── mkdocs.yml          # Main config
│   ├── overrides/          # Theme customization
│   └── docs/               # English content
├── pt/                     # Portuguese
│   ├── mkdocs.yml          # Inherits from en
│   ├── llm-prompt.md       # Translation rules (LLM guidance)
│   ├── glossary.yml        # Post-processing glossary (deterministic fixes)
│   ├── ui-strings.yml      # Translated UI strings
│   └── docs/               # Translated content
├── es/                     # Spanish
│   └── ...
└── ...

_scripts/
├── pyproject.toml          # Package metadata and dependencies
├── translate/              # Translation CLI package (python -m translate)
│   ├── __init__.py         # Package marker
│   ├── __main__.py         # Entry point
│   ├── config.py           # Constants and configuration
│   ├── models.py           # Data structures
│   ├── paths.py            # Path utilities
│   ├── prompts.py          # Prompt loading
│   ├── git_utils.py        # Git operations
│   ├── api.py              # Claude API calls
│   ├── postprocess.py      # Translation post-processing and glossary enforcement
│   ├── verify.py           # Translation verification
│   ├── progress.py         # Progress tracking
│   ├── core.py             # Translation orchestration
│   └── cli.py              # CLI commands
├── general-llm-prompt.md   # Shared translation rules
└── docs.py                 # Build/serve CLI
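
Given this layout, the location of a translated file follows directly from the English path. A small sketch of that mapping, assuming paths are relative to the repository root (the helper name is hypothetical and not taken from paths.py):

```python
from pathlib import PurePosixPath


def translated_path(english_file: str, lang: str) -> PurePosixPath:
    """Map docs/en/docs/<rel> to docs/<lang>/docs/<rel>."""
    rel = PurePosixPath(english_file).relative_to("docs/en/docs")
    return PurePosixPath("docs") / lang / "docs" / rel


print(translated_path("docs/en/docs/hello_nextflow/index.md", "pt"))
# docs/pt/docs/hello_nextflow/index.md
```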

CLI Reference (For Maintainers)

Note

The CLI requires ANTHROPIC_API_KEY for translation commands. Community contributors should use GitHub Actions instead.

Run all commands from the _scripts/ directory:

cd _scripts

Translation Commands

# Sync all translations (update outdated + add missing + remove orphaned)
uv run python -m translate sync <lang>

# Sync with lower parallelism (default: 50 concurrent translations)
uv run python -m translate sync <lang> --parallel 10

# Sync with filter pattern
uv run python -m translate sync <lang> --include hello_nextflow

# Translate a single file
uv run python -m translate translate <path> --lang <lang>
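
The `--parallel` option caps how many translations run at once. A common way to implement such a cap in Python is an `asyncio.Semaphore`. The sketch below shows that general pattern only; whether the CLI uses asyncio, threads, or something else is not stated here, and the function names are hypothetical.

```python
import asyncio


async def run_bounded(items, worker, limit=50):
    """Run `worker` on every item with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:  # waits while `limit` workers are already running
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_translate(path):
    """Stand-in for an API call; real work would happen here."""
    await asyncio.sleep(0)
    return f"translated:{path}"


print(asyncio.run(run_bounded(["a.md", "b.md"], fake_translate, limit=10)))
```

Lowering the limit trades throughput for fewer simultaneous API requests, which can help when rate limits are a concern.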

Preview Commands (No API key required)

# Serve docs locally
uv run docs.py serve <lang>

# Build docs
uv run docs.py build-lang <lang>

References

FastAPI Translation System - Inspiration for this implementation
Portuguese Glossary
