@Neptune1729

This PR consolidates several fixes to the Chinese translation in chapters 1–3.

chapter1/4.mdx

  • Fixed a duplicated character typo in the explanation paragraph to improve readability.

chapter2/2.mdx

  • Corrected the section title to use “分词器(tokenizer)” instead of the duplicated “tokenizer( tokenizer )”.
  • Fixed the Markdown list structure by using properly nested bullet points.
  • Standardized the formatting of special tokens: `[CLS]`, `[SEP]`.

chapter3/2.mdx

  • Removed a duplicated explanatory sentence in the tokenize_function example to avoid redundancy.

These changes improve clarity, accuracy, and MDX rendering consistency for the Chinese course content.

- Fix incorrect section title (“使用 tokenizer( tokenizer )进行预处理” → “使用分词器(tokenizer)进行预处理”)
- Correct Markdown list formatting (proper nested bullet points)
- Standardize special token formatting: `[CLS]`, `[SEP]`
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Changed code block syntax from 'py' to 'python' for consistency.
Removed unnecessary code block for batch shape check.