Normalize standalone CustomContent indentation in JA output #26

Open
qiancai wants to merge 2 commits into pingcap:main from qiancai:fix-ja-customcontent-indent

qiancai (Collaborator) commented Apr 9, 2026

Summary

This PR adds a small post-serialization normalization step for standalone CustomContent lines in Japanese translation output.

Root Cause

After PR #24, Japanese translation output could restore CustomContent markers as standalone raw HTML lines inside list contexts. When those lines were serialized back to Markdown, they could end up with one or more extra leading spaces. In website-docs, that extra indentation was enough to break MDX list parsing and trigger build errors such as:

  • Expected corresponding JSX closing tag for <CustomContent>

The docs-side hotfix in pingcap/docs PR #22716 confirmed the immediate failure mode: changing those standalone CustomContent lines from 5 or 6 leading spaces back to 4 fixed the build.
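
To make the failure shape concrete, here is a minimal sketch that lists the leading-space count of each standalone CustomContent line so that 5- or 6-space cases are easy to spot. The helper name listCustomContentIndents, the regular expression, and the sample text (including the platform="tidb" attribute) are illustrative assumptions, not code from this PR.

```javascript
// Hypothetical helper, not part of this PR: reports the leading-space count of
// every line that holds only a <CustomContent ...> or </CustomContent> tag.
const STANDALONE_TAG = /^( *)<\/?CustomContent\b[^>]*>\s*$/;

function listCustomContentIndents(markdown) {
  const found = [];
  markdown.split("\n").forEach((line, index) => {
    const match = STANDALONE_TAG.exec(line);
    if (match) {
      // Line numbers are 1-based to match editor conventions.
      found.push({ line: index + 1, indent: match[1].length, text: line.trim() });
    }
  });
  return found;
}

// Made-up sample in the shape described above: list content sits at
// 4 spaces, but the tag lines drifted to 5 and 6 spaces.
const sample = [
  "1. Export the data.",
  "",
  '     <CustomContent platform="tidb">',
  "",
  "    Some translated text.",
  "",
  "      </CustomContent>",
].join("\n");

console.log(listCustomContentIndents(sample));
```

Inline uses such as text followed by a CustomContent tag on the same line are not reported, because the pattern only matches lines that contain the tag and nothing else.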

What Changed

  • Added normalizeStandaloneCustomContentIndentation() to normalize standalone <CustomContent ...> / </CustomContent> lines after Markdown serialization.
  • Applied that normalization in both the Japanese CLI entrypoint and the shared GCP translator entrypoint.
  • Added regression tests for the two observed failure shapes:
    • ordered-list export sections like tidb-cloud/configure-external-storage-access.md
    • unordered-list description blocks like system-variables.md
  • Added non-regression coverage to ensure inline CustomContent, fenced code examples, and already-correct nested indentation stay unchanged.
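
The behavior described in the list above could be sketched roughly as follows. This is not the PR's normalizeStandaloneCustomContentIndentation(), whose source is not shown here. The function name normalizeIndentSketch and the indentation rule (lists use 4-space steps, and 1 or 2 spaces beyond a step count as serialization noise) are assumptions made for illustration.

```javascript
// Sketch only: the rule below is an assumption, not the PR's implementation.
const TAG_LINE = /^( *)(<\/?CustomContent\b[^>]*>)[ \t]*$/;
const FENCE = /^\s*(`{3,}|~{3,})/;

function normalizeIndentSketch(markdown) {
  let openFence = null; // fence character of the code block we are inside, if any
  return markdown
    .split("\n")
    .map((line) => {
      const fence = FENCE.exec(line);
      if (fence) {
        const char = fence[1][0];
        if (openFence === null) openFence = char; // entering a fenced block
        else if (openFence === char) openFence = null; // leaving it
        return line;
      }
      if (openFence !== null) return line; // never touch fenced code examples
      const tag = TAG_LINE.exec(line);
      if (!tag) return line; // inline usage and ordinary text stay as-is
      const indent = tag[1].length;
      const extra = indent % 4;
      // Assumed rule: only 1 or 2 spaces past a 4-space step are trimmed.
      if (indent < 4 || extra === 0 || extra === 3) return line;
      return " ".repeat(indent - extra) + tag[2];
    })
    .join("\n");
}

// Made-up input: 5 and 6 leading spaces become 4; the rest is unchanged.
const before = [
  "1. Export the data.",
  "",
  '     <CustomContent platform="tidb">',
  "",
  "    Some translated text.",
  "",
  "      </CustomContent>",
].join("\n");

console.log(normalizeIndentSketch(before));
```

Under these assumptions, a tag line that already sits at 4 or 8 spaces is returned untouched, which mirrors the non-regression cases listed above.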

Why This Is Small On Purpose

This PR does not change the translation semantics of CustomContent, the placeholder flow for links, or the broader Markdown/MDX parsing model. It only normalizes the final serialized indentation for standalone CustomContent lines, matching the docs-side hotfix while keeping the translator change narrowly scoped.

Validation

Passed:

node --test markdown-translator/test/customContentIndentation.test.js markdown-translator/test/frontmatterAliases.test.js markdown-translator/test/jaParagraphRegression.test.js markdown-translator/test/placeholderUtils.test.js

Also checked:

node --test markdown-translator/test/*.test.js

The full suite still has existing failures in GithubMentions.test.js on the current upstream main branch. Those failures predate this PR and are unrelated to the CustomContent indentation change.

qiancai changed the title from "[codex] Normalize standalone CustomContent indentation in JA output" to "Normalize standalone CustomContent indentation in JA output" on Apr 9, 2026
qiancai marked this pull request as ready for review on April 9, 2026, 03:40