feat: add intl-pipeline and intl-review skills #18215

Open
myelinated-wackerow wants to merge 16 commits into dev from intl-knowledge-base

@myelinated-wackerow
Collaborator

Summary

Adds two agent-discoverable skills at .claude/skills/intl-pipeline/ and .claude/skills/intl-review/ covering the translation pipeline (architecture, manifests, orchestration, recovery, sanitizer, ETHGlossary integration) and the translation-quality review activity (scoring, language-group rules, severity rubric, multi-agent role split). Follows the design-system skill's pattern: thin SKILL.md entry + references/ for progressive disclosure + evals/evals.json for the skill-creator workflow.

Also drops the deprecated local transliteration bank (.claude/translation-review/transliterations/) and refactors the sanitizer to use ETHGlossary as the sole source for brand-name correction. Trims 20 Crowdin-era and pre-pipeline-era docs that have been superseded by the skills, ETHGlossary's docs/translation-policy.md, or the in-house pipeline rewrite.

What's in this PR

  • .claude/skills/intl-pipeline/ -- SKILL.md (~130 lines) + 9 references covering architecture, manifests, orchestration, recovery, sanitizer, ETHGlossary integration, non-english-edit policy, gotchas, and the /fix-sanitizer-bug runbook
  • .claude/skills/intl-review/ -- SKILL.md (~115 lines) + 8 references covering known patterns, language-group rules, scoring rubric, ETHGlossary usage, per-language findings tracking, severity rubric, multi-agent role split, gotchas
  • evals/evals.json per skill -- 5 output-quality evals each, matching the design-system / skill-creator format. Run with skill-creator's run_loop or grade manually against the assertions.
  • Sanitizer refactor -- src/scripts/intl-pipeline/intl-sanitizer.ts no longer reads the local transliteration bank. loadBrandGarbleCorrections() removed; fixKnownBrandGarbles(content) (locale param dropped) now corrects garbled brand mentions to the canonical Latin form per ETHGlossary's script_rule: keep_latin / always_latin for the affected brands. Tests updated.
  • AGENTS.md -- picks up a pointer paragraph to both skills (mirroring the design-system skill pointer pattern that already exists)
  • 34 deletions -- 20 Crowdin-era / stale docs (post-mortems, gemini-translation-roadmap.md, translation-program.md, locales-process.md, i18n-incremental-pipeline.md, ar-gemini-retranslation-queue.md) + the 14-file local transliteration bank

Both skills validate cleanly (npx skills-ref validate -> Valid skill) and the full sanitizer test suite passes (593 tests).

Eval results

Output-quality evals ran via skill-creator's run_loop (3 runs each, 5 evals per skill, with-skill vs no-skill baseline):

| Skill | Pass with skill | Pass without skill | Delta | Time saving |
| --- | --- | --- | --- | --- |
| intl-pipeline | 92% ± 11% | 44% ± 36% | +0.48 | -38s (29s vs 67s) |
| intl-review | 96% ± 9% | 56% ± 43% | +0.40 | -30s (28s vs 58s) |

Both skills lift the agent's pass rate by 40 to 48 percentage points (44% to 92% for intl-pipeline, 56% to 96% for intl-review) against the high-leverage rules they encode (don't-hand-propagate, sanitizer-bug runbook, glossary-deviation-as-critical, concept-vs-brand tag policy, scoring rubric, etc.). Variance also collapses with the skill loaded: the run-to-run spread falls from ±36-43% to ±9-11%, so baseline runs are unreliable by comparison. Workspaces (gitignored) are at .claude/skills/intl-pipeline-workspace/iteration-1/ and .claude/skills/intl-review-workspace/iteration-1/ for the detailed per-eval breakdown.
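For readers checking the arithmetic, the Delta and Time saving columns can be recomputed from the reported pass rates and timings. The snippet below is only an illustrative sketch: the `EvalSummary` type and helper names are made up here and are not part of skill-creator's output format.

```typescript
// Figures copied from the eval results above; type and helper names are hypothetical.
interface EvalSummary {
  skill: string;
  passWithSkill: number; // percent
  passWithoutSkill: number; // percent
  secondsWithSkill: number;
  secondsWithoutSkill: number;
}

const reported: EvalSummary[] = [
  { skill: "intl-pipeline", passWithSkill: 92, passWithoutSkill: 44, secondsWithSkill: 29, secondsWithoutSkill: 67 },
  { skill: "intl-review", passWithSkill: 96, passWithoutSkill: 56, secondsWithSkill: 28, secondsWithoutSkill: 58 },
];

// Pass-rate delta expressed as a fraction (48 percentage points -> 0.48).
function passRateDelta(s: EvalSummary): number {
  return (s.passWithSkill - s.passWithoutSkill) / 100;
}

// Seconds saved per eval when the skill is loaded.
function secondsSaved(s: EvalSummary): number {
  return s.secondsWithoutSkill - s.secondsWithSkill;
}

for (const s of reported) {
  console.log(`${s.skill}: delta +${passRateDelta(s).toFixed(2)}, ${secondsSaved(s)}s faster`);
}
```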

Note on the sanitizer change

The previous local transliteration bank had transliterated forms for Solidity and GitHub in Arabic that disagreed with ETHGlossary's always_latin / keep_latin script_rule. This was a systematic policy mismatch (one of the 47 overlap conflicts surfaced in the round-2 Gemini analysis that drove ETHGlossary v0.3.0). With ETHGlossary now canonical, those overrides were policy-wrong, so the auto-fix path now returns the canonical Latin form. Sanitizer tests updated to expect Latin (was: per-locale transliteration).

User-visible effect: locale pages that previously displayed transliterated forms of Solidity / GitHub in Arabic will now show the Latin form on the next pipeline run. This matches ETHGlossary's declared policy and the site's intended UX going forward.
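To make the behavior change concrete, here is a minimal sketch of a locale-free garble fix driven by a static map. It is not the code in intl-sanitizer.ts: the `...Sketch` names are hypothetical, the plain substring replacement is an assumption, and the only details taken from this PR are the two garble entries listed in the sanitizer commit message and the fact that the function no longer takes a locale.

```typescript
// Hypothetical stand-in for the static map; the two entries are the ones named in this PR.
const BRAND_GARBLE_CORRECTIONS_SKETCH: Record<string, string> = {
  "يجتبه": "GitHub",
  "الصلابة": "Solidity",
};

// Replaces every known garbled brand mention with its canonical Latin form.
// No locale parameter: both target brands are Latin in all locales per ETHGlossary.
function fixKnownBrandGarblesSketch(content: string): string {
  let fixed = content;
  for (const garbled of Object.keys(BRAND_GARBLE_CORRECTIONS_SKETCH)) {
    // split/join replaces all occurrences without regex escaping concerns
    fixed = fixed.split(garbled).join(BRAND_GARBLE_CORRECTIONS_SKETCH[garbled]);
  }
  return fixed;
}
```

Because the output no longer depends on the locale, a test for any language can assert the same Latin string, which matches the updated sanitizer expectations described above.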

Out of scope (intentional)

  • Slash command refactor -- /review-translations and /fix-sanitizer-bug have known structural concerns (review/fix mixing, orchestration trigger gaps). Skills reference them as-is; restructuring is a separate concern.
  • CONTRIBUTING.md translation-related additions -- reviewer guidance on translation PRs.
  • skill-creator description-optimization loop -- the trigger evals are committed but the optimization loop hasn't been run against either skill description yet.

Follow-up issues to file after merge

  1. Slash command split -- /review-translations review-only vs fix-only; /fix-sanitizer-bug orchestration trigger from review failures
  2. CONTRIBUTING.md update -- reviewer guidance on translation PRs
  3. Output-quality eval expansion -- current 5+5 cover high-leverage rules; expand as edge cases surface in real use
  4. Run the skill-creator description-optimization loop on both skills to verify trigger reliability
  5. ETHGlossary integration in fixKnownBrandGarbles -- if future garble entries target brands with script_rule: transliterate, the function should gain an ETHGlossary lookup to fetch the per-locale form (it currently uses the static map, which works for the 2 current entries because both target keep_latin / always_latin brands)
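Follow-up 5 could take roughly the shape sketched below. Everything here is an assumption for illustration: the `GlossaryTermSketch` shape, the `perLocale` field, and the fall-back to Latin when no per-locale form is found are invented, and only the three `script_rule` values come from this PR. The real ETHGlossary term-entry shape should be taken from its llms.txt.

```typescript
// Only these three script_rule values are named in this PR; others may exist.
type ScriptRule = "keep_latin" | "always_latin" | "transliterate";

// Hypothetical term shape, not the actual ETHGlossary response.
interface GlossaryTermSketch {
  latin: string;
  script_rule: ScriptRule;
  perLocale?: Record<string, string>; // assumed home of native-script forms
}

// Picks the form a garble correction should emit for a given locale.
function canonicalBrandForm(term: GlossaryTermSketch, locale: string): string {
  if (term.script_rule === "transliterate") {
    const localized = term.perLocale ? term.perLocale[locale] : undefined;
    // Assumption: fall back to Latin when the locale has no entry.
    return localized !== undefined ? localized : term.latin;
  }
  // keep_latin and always_latin: Latin in every locale, so no lookup is needed.
  return term.latin;
}
```

Under this sketch the two current entries never reach the lookup branch, which is consistent with the static map being sufficient today.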

Test plan

  • npx skills-ref validate passes on .claude/skills/intl-pipeline and .claude/skills/intl-review
  • npx playwright test --project=unit tests/unit/intl-pipeline/sanitizer/ -- 593 passed
  • AGENTS.md skill-pointer paragraph renders correctly (mirror of the design-system pointer pattern)
  • Code-fence language identifiers present on all fenced blocks across skill references
  • .claude/translation-review/transliterations/ directory removed; no remaining code references to it
  • Eval JSONs executed via skill-creator's run_loop -- both skills show +0.40 to +0.48 pass-rate delta vs no-skill baseline
  • Cold-read sanity: open a fresh Claude Code session, ask the skill to triage a recent intl-pipeline issue using only SKILL.md + references

Commit history

15 commits on the branch + 1 merge-from-dev. Browse via git log --oneline origin/dev..HEAD in .worktrees/intl-knowledge-base.

myelinated-wackerow and others added 16 commits May 15, 2026 00:55
Establishes the entry point for the intl-pipeline knowledge skill. Follows the design-system pattern: don't-hand-propagate rule, 8 top rules, 7 highest-value gotchas with subsections, file/path cheatsheet, "when to load each reference" map, cross-skill pointers, pre-merge checklist.

References under `references/` and the companion `intl-review` skill will follow in subsequent commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Establishes the entry point for the intl-review knowledge skill: translation-quality review of pipeline output. Mirrors the intl-pipeline / design-system SKILL.md pattern. Core rule: ETHGlossary is authority; deviations are critical issues, not warnings. Covers concept-tag-vs-brand-tag distinction, brand-name policy via per-term `script_rule`, MDX-build-breaking patterns, and the multi-agent review role split.

References under `references/` will follow in subsequent commits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Two highest-load references for the intl-pipeline skill:

- architecture.md: condensed phase-by-phase walkthrough of the per-file pipeline (phases 1-6 plus 4b), with test-assertion summary and open-questions pointers. Sourced from tests/specs/PIPELINE-SPEC.md but tightened for debugging-time reading.
- recovery.md: triage matrix and procedures for the common "the pipeline did something wrong" scenarios -- bad translation, corrupted manifests, build failure on a locale, stuck pending branch, LLM garbage, hand-edit damage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Two integration references for the intl-pipeline skill:

- ethglossary.md: how the pipeline talks to ETHGlossary. Endpoints, term-entry shape, script_rule -> pipeline behavior table, term_role taxonomy, what to do when a term is missing, cache behavior, common mistakes.
- non-english-edits.md: decision tree for "can I edit this translated file by hand?" Codifies the SKILL.md's core rule with allowed/not-allowed examples, the stamp_only escape hatch procedure, and a drift-detection check.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Two more references for the intl-pipeline skill:

- orchestration.md: the intl/pending-{base} model -- pending-as-baseline, base-into-pending merges, temp branch lifecycle, what happens when base moves mid-run, hot fix policy, stamp_only constraint, the orchestration contract.
- gotchas.md: long-tail patterns beyond the SKILL.md's inline section -- output token budget, finishReason values, intl-content-tree package boundary, JSX attribute entity handling, locale file naming, 24-vs-25 language confusion, custom heading IDs, concurrency env var, manifest version, frontmatter scope, TARGET_FILES env var.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Three more references completing the intl-pipeline skill's reference set:

- manifests.md: source/translation manifest structure, lifecycle, invariants, debug commands, common issues.
- sanitizer.md: what the sanitizer does, when it runs, fix categories table, rules for adding new fix functions (code-block-split first, return shapes, word boundaries), test file map.
- runbooks/fix-sanitizer-bug.md: when to invoke /fix-sanitizer-bug, the workflow the slash command enforces, common pitfalls.

intl-pipeline skill is now feature-complete (SKILL.md + 9 references).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Two highest-value references for the intl-review skill:

- known-patterns.md: condensed pattern catalog organized by severity (critical build-breaking, critical navigation-breaking, critical semantic, high quality, frontmatter tag policy violations, code-block violations, medium tone/register, Crowdin-era artifacts). Companion to the living .claude/translation-review/known-patterns.md.
- language-rules.md: per-language-group rules (Indic, Cyrillic, RTL, CJK phonetic, CJK semantic) with per-language deviations and technical nuances. Condensed from ETHGlossary's docs/translation-policy.md (canonical source).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
…refs

Three more references for the intl-review skill:

- scoring-rubric.md: the 5-category 0-10 scoring system (Brand Name Preservation, Technical Accuracy, Semantic Fidelity, Terminology Consistency, Tone/Register), per-category guidance, report format, what scoring is NOT (criticals are separate; not for merge gating).
- ethglossary-usage.md: review-side glossary integration. How to query (/filter preferred, single-term for spot-checks), per-field review use, severity mapping per script_rule and term_role deviation type, what to do when a term is missing, what not to do.
- per-language-tracking.md: .claude/translation-review/per-language/{lang}.md convention. Recommended structure (quality-scores-over-time, recurring patterns, glossary follow-ups, native-speaker queue, edge cases), when to write/not, bootstrapping a new language.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Final three references completing the intl-review skill:

- critical-vs-warning.md: three-tier severity rubric (critical = auto-fix, warning = manual review, informational = noted). Build-breaking, navigation-breaking, semantically wrong, and deterministic ETHGlossary deviations are critical. Translate/calque deviations and low-confidence glossary entries are warnings. Edge cases (keep_latin vs transliterated, valid aliases, confidence: low, frontmatter tags).
- agent-roles.md: structural / terminology / semantic role split for parallel agent reviews. What each owns, what each loads from references/, when to skip the role split (small reviews, Latin-script langs, spot-checks).
- gotchas.md: review-time long-tail (gh CLI sandbox needs, named branch requirement, incremental vs full diff, --approve vs --comment, build verification, glossary cache, code-comment translation allowed, etc.).

intl-review skill is now feature-complete (SKILL.md + 8 references).
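The three-tier rubric summarized for critical-vs-warning.md can be pictured as a small classification function. This is a hedged sketch, not code from the skill: the `FindingKind` labels are hypothetical names for the categories listed above, and mapping anything unlisted to informational is an assumption.

```typescript
type Severity = "critical" | "warning" | "informational";

// Hypothetical labels for the finding categories named in the rubric summary.
type FindingKind =
  | "build_breaking"
  | "navigation_breaking"
  | "semantically_wrong"
  | "deterministic_glossary_deviation"
  | "translate_or_calque_deviation"
  | "low_confidence_glossary_entry"
  | "other";

function classifyFinding(kind: FindingKind): Severity {
  switch (kind) {
    case "build_breaking":
    case "navigation_breaking":
    case "semantically_wrong":
    case "deterministic_glossary_deviation":
      return "critical";
    case "translate_or_calque_deviation":
    case "low_confidence_glossary_entry":
      return "warning";
    default:
      // Assumption: anything not listed in the rubric is only noted.
      return "informational";
  }
}

// Handling per tier, as stated in the rubric summary.
const ACTION_FOR: Record<Severity, string> = {
  critical: "auto-fix",
  warning: "manual review",
  informational: "noted",
};
```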

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
AGENTS.md picks up a pointer to the intl-pipeline and intl-review skills (mirroring the design-system skill pointer pattern).

Trim of obsolete documentation now that the knowledge has either migrated to the skills, into ETHGlossary's translation-policy.md (canonical source), or is rendered moot by the intl-pipeline rewrite and ETHGlossary v0.3.0:

- docs/gemini-translation-roadmap.md: v3-era roadmap; superseded by src/scripts/intl-pipeline/FUTURE.md and the intl-pipeline skill.
- docs/translation-program.md: dissolved (Crowdin volunteer program ended); content not load-bearing.
- docs/locales-process.md: pre-AI human-directed doc; obsolete.
- docs/i18n-incremental-pipeline.md: superseded by .claude/skills/intl-pipeline/.
- .claude/translation-review/ar-gemini-retranslation-queue.md: one-off queue from PR #17105; work complete.

Crowdin-era post-mortems removed (Crowdin is no longer in the loop; patterns digested into sanitizer-test-research.md and the skill references):
- 3 integration-issues Crowdin docs
- 2 post-import-sanitizer Crowdin-era findings docs
- 1 translation-review-session-errors doc
- 2 build-errors Crowdin sanitizer docs
- 6 crowdin-import-review and japanese-translation-quality-review docs in translation-review
- 1 scaling-translation-review-pipeline.md planning doc

Kept (still load-bearing):
- docs/solutions/integration-issues/sanitizer-test-research.md (living pattern catalog)
- docs/solutions/integration-issues/intl-pipeline-bugs-from-pr-18041-review.md (recent pipeline-era findings)
- docs/solutions/integration-issues/translation-href-sync-issues.md (pattern still relevant)
- docs/solutions/architecture/i18n-pipeline-process-retrospective.md (process learnings)
- docs/solutions/logic-errors/post-import-sanitizer-*.md (sanitizer-era patterns; not Crowdin-specific)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Small evaluation set (8 queries: 6 should-trigger + 2 should-not-trigger near-misses) for verifying that the new skills' description fields trigger reliably on relevant prompts.

Follows agentskills.io's optimizing-descriptions guide: 3 runs per query, trigger-rate threshold of 0.67 for positives and 0.33 for negatives. Pass criterion: 6+ of 8 queries pass (75%+).
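The pass criterion above can be sketched as follows. Two readings are assumed and may differ from the guide: the 0.67 and 0.33 thresholds are treated as rounded forms of 2/3 and 1/3 (with 3 runs, a literal 0.67 would reject 2 of 3), and a should-not-trigger query is allowed to fire in at most 1 of 3 runs. All type and function names are hypothetical.

```typescript
const RUNS_PER_QUERY = 3;

interface TriggerQueryResult {
  shouldTrigger: boolean; // true for positives, false for near-miss negatives
  triggeredRuns: number; // how many of the RUNS_PER_QUERY runs loaded the skill
}

// Integer comparison avoids floating-point issues around 2/3 and 1/3.
function queryPasses(r: TriggerQueryResult): boolean {
  return r.shouldTrigger
    ? r.triggeredRuns * 3 >= RUNS_PER_QUERY * 2 // trigger rate >= 2/3
    : r.triggeredRuns * 3 <= RUNS_PER_QUERY; // trigger rate <= 1/3
}

// The set passes when at least 75% of queries pass (6 of 8 in this PR).
function evalSetPasses(results: TriggerQueryResult[]): boolean {
  if (results.length === 0) return false;
  const passed = results.filter(queryPasses).length;
  return passed * 4 >= results.length * 3;
}
```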

Diagnostic guidance included for adjusting descriptions if evals fail.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Both intl-pipeline/references/ethglossary.md and intl-review/references/ethglossary-usage.md now lead with the canonical llms.txt reference (at domain root, not under /api/v1/). Endpoint shapes are deferred to llms.txt rather than duplicated inline, where they would go stale.

Other changes:
- Dropped v0.2.0 version-introduction noise from the term-entry description.
- Replaced the "open a PR against wackerow/ethglossary" prescription with "flag in the review report; note in per-language findings if relevant." Cross-repo coordination is separate maintainer work, not part of pipeline / review workflow.
- Fixed the wrong llms.txt URL in intl-pipeline/references/ethglossary.md (was incorrectly placed under /api/v1/; lives at domain root).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Three previously-bare fence openers now have explicit language identifiers:

- non-english-edits.md line 7: `text` (decision-tree ASCII art)
- per-language-tracking.md line 19: `text` (file path)
- runbooks/fix-sanitizer-bug.md line 20: `bash` (slash command invocation)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
Replaces the single .claude/skills/intl-evals.md with per-skill evals/evals.json files matching the structure used by .claude/skills/design-system/evals/evals.json (output by Anthropic's skill-creator skill).

- .claude/skills/intl-pipeline/evals/evals.json -- 5 evals exercising: don't-hand-propagate on English update, sanitizer pattern-gap test-first workflow, manifest corruption recovery, missing ETHGlossary term protocol, hot-fix translation policy.
- .claude/skills/intl-review/evals/evals.json -- 5 evals exercising: glossary deviation as critical, translated href as critical, concept-tag vs brand-tag distinction, tone inconsistency as warning, the 5-category scoring rubric.

Each eval has id, name (kebab-case), prompt (realistic user request with context), expected_output (narrative), files (empty), and assertions (4-6 testable statements). Execute via skill-creator's run_loop or manually grade against the assertions.
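As a reading aid, the per-eval fields listed above map onto a type like the one below, with a small checker for the constraints the commit message states (kebab-case name, 4-6 assertions). This is a sketch under assumptions: the `id` type, the element types of `files` and `assertions`, and the enclosing structure of evals.json are not given in this PR, and the names are hypothetical.

```typescript
// Field names come from the commit message; the types are assumed.
interface SkillEvalSketch {
  id: string | number;
  name: string; // kebab-case
  prompt: string; // realistic user request with context
  expected_output: string; // narrative description
  files: string[]; // empty in this PR
  assertions: string[]; // 4-6 testable statements
}

// Returns a list of human-readable problems; an empty list means the eval looks well-formed.
function validateEvalSketch(e: SkillEvalSketch): string[] {
  const problems: string[] = [];
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(e.name)) {
    problems.push(`name "${e.name}" is not kebab-case`);
  }
  if (e.prompt.trim() === "") {
    problems.push("prompt is empty");
  }
  if (e.assertions.length < 4 || e.assertions.length > 6) {
    problems.push(`expected 4-6 assertions, found ${e.assertions.length}`);
  }
  return problems;
}
```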

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
The sanitizer's local transliteration bank at `.claude/translation-review/transliterations/` (13 per-language JSONs + a terms-for-transliteration.md catalog) is removed. Both garble entries in `BRAND_GARBLE_CORRECTIONS` (`يجتبه -> GitHub`, `الصلابة -> Solidity`) target brands ETHGlossary v0.3.0 classifies as `keep_latin` or `always_latin`, so the canonical form is Latin in all locales -- no per-locale lookup needed.

The previous bank had transliterated forms for these brands that DISAGREED with ETHGlossary's policy (this was the systematic 47-term overlap conflict surfaced in the round-2 Gemini analysis that drove ETHGlossary v0.3.0). With ETHGlossary now canonical, those overrides were policy-wrong; they're now removed.

Changes:
- `intl-sanitizer.ts`: removed `loadBrandGarbleCorrections()` function. `fixKnownBrandGarbles(content)` -- locale parameter dropped; uses `BRAND_GARBLE_CORRECTIONS` static map directly. Two production callers updated.
- `standalone-fixes.spec.ts`: 7 tests -> 6 tests. Per-locale Arabic-transliteration assertions updated to expect Latin canonical form per ETHGlossary policy.
- `intl-review/references/critical-vs-warning.md`, `intl-pipeline/references/gotchas.md`: removed references to the deprecated bank; principle ("ETHGlossary is authoritative") retained.
- `.claude/translation-review/known-patterns.md`: updated the transliteration-authority section and the fixKnownBrandGarbles fix description to reflect ETHGlossary as the source of truth.

If future garble entries target brands ETHGlossary classifies as `transliterate` (per-locale native-script form), `fixKnownBrandGarbles` should grow ETHGlossary integration at that point.

Verified: `npx playwright test --project=unit tests/unit/intl-pipeline/sanitizer/` -> 593 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
@netlify

netlify Bot commented May 18, 2026

Deploy Preview for ethereumorg ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 8d830ce |
| 🔍 Latest deploy log | https://app.netlify.com/projects/ethereumorg/deploys/6a0b7da23a23ca00079d945b |
| 😎 Deploy Preview | https://deploy-preview-18215.ethereum.it |
Lighthouse
7 paths audited
Performance: 66 (🟢 up 2 from production)
Accessibility: 96 (no change from production)
Best Practices: 100 (🟢 up 1 from production)
SEO: 98 (🔴 down 1 from production)
PWA: 59 (no change from production)

@github-actions github-actions Bot added the documentation 📖 and tooling 🔧 labels May 18, 2026
@wackerow wackerow marked this pull request as ready for review May 19, 2026 07:03