This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
We recommend Poetry for setting up the development environment. Project dependencies are declared in pyproject.toml, so you should install them with Poetry.
python -m venv .venv
poetry init
-
Windows:
.venv\Scripts\activate.bat
-
Mac/Linux:
source .venv/bin/activate
poetry shell
poetry install
Before submitting a PR, it's important to test the translation functionality with real documentation:
-
Create a test directory in the root directory:
mkdir test_docs
-
Copy some markdown documentation and images you want to translate into the test directory. For example:
cp /path/to/your/docs/*.md test_docs/
cp /path/to/your/images/*.png test_docs/
-
Install the package locally:
pip install -e .
-
Run Co-op Translator on your test documents:
python -m co_op_translator --language-codes ko --root-dir test_docs
-
Check the translated files in test_docs/translations and test_docs/translated_images to verify:
- The translation quality
- The metadata comments are correct
- The original markdown structure is preserved
- Links and images are working properly
This manual testing helps ensure that your changes work well in real-world scenarios.
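Part of the checklist above, that links and images still work, can be partly automated. The following is a minimal sketch, not part of Co-op Translator: the function name `find_broken_local_links` is hypothetical, and it only checks that relative link and image targets in the translated Markdown exist on disk. It does not judge translation quality, and it does not decode URL-escaped paths.

```python
import re
from pathlib import Path

# Matches Markdown links and images: [text](target) or ![alt](target).
# Only the target (up to the first space or closing parenthesis) is captured.
LINK_PATTERN = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def find_broken_local_links(markdown_dir):
    """Return (file, target) pairs whose relative targets do not exist on disk."""
    broken = []
    for md_file in sorted(Path(markdown_dir).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # Skip external URLs, mail links, and in-page anchors.
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue
            # Drop any "#section" fragment before resolving the path.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((str(md_file), target))
    return broken
```

Calling `find_broken_local_links("test_docs/translations")` after a test run returns an empty list when every relative target resolves.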
- Create an .env file in the root directory by copying the provided .env.template file.
- Fill in the environment variables as guided.
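To confirm that no variable from the template was dropped while copying, a small check like the one below can help. It is a sketch under assumptions: both files are taken to use plain `KEY=VALUE` lines, the helper names are hypothetical, and only variable names are compared, not whether their values are filled in.

```python
from pathlib import Path


def read_env_keys(path):
    """Return the set of variable names defined in a KEY=VALUE style file."""
    keys = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Ignore blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_env_keys(template_path=".env.template", env_path=".env"):
    """List variables present in the template but absent from the .env file."""
    return sorted(read_env_keys(template_path) - read_env_keys(env_path))
```

Running `missing_env_keys()` from the repository root returns an empty list when every name in the template also appears in .env.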
Tip
In addition to running the project locally, you can also use GitHub Codespaces or VS Code Dev Containers for an alternative development environment setup.
You can run this project virtually using GitHub Codespaces; no additional settings or setup are required.
Opening the repository in Codespaces starts a web-based VS Code instance in your browser.
A related option is VS Code Dev Containers, which opens the project in your local VS Code using the Dev Containers extension.
We use Black as our Python code formatter to maintain consistent code style across the project. Black is an uncompromising code formatter that automatically reformats Python code to conform to the Black code style.
The Black configuration is specified in our pyproject.toml:
[tool.black]
line-length = 88
target-version = ['py310']
include = '\.pyi?$'
You can install Black using either Poetry (recommended) or pip:
Black is automatically installed when you set up the development environment:
poetry install
If you're using pip, you can install Black directly:
pip install black
-
Format all Python files in the project:
poetry run black .
-
Format a specific file or directory:
poetry run black path/to/file_or_directory
-
Format all Python files in the project:
black .
-
Format a specific file or directory:
black path/to/file_or_directory
Tip
We recommend setting up your editor to automatically format code with Black on save. Most modern editors support this through extensions or plugins.
To run Co-op Translator using Poetry in your environment, follow these steps:
-
Navigate to the directory where you want to perform translation tests or create a temporary folder for testing purposes.
-
Execute the following command, replacing ko in -l ko with the language code you wish to translate into. The -d flag enables debug mode.
poetry run co-op-translator translate -l ko -d
Note
Ensure your Poetry environment is activated (poetry shell) before running the command.
We welcome contributions that add support for new languages. Before opening a PR, please complete the steps below to ensure a smooth review.
-
Add the language to the font mapping
- Edit src/co_op_translator/fonts/font_language_mappings.yml
- Add an entry with:
  - code: ISO-like language code (e.g., vi)
  - name: Human-friendly display name
  - font: A font shipped in src/co_op_translator/fonts/ that supports the script
  - rtl: true if right-to-left, otherwise false
-
Include required font files (if needed)
- If a new font is required, verify license compatibility for open source distribution
- Add the font file to src/co_op_translator/fonts/
-
Local verification
- Run translations for a small sample (Markdown, images, and notebooks as appropriate)
- Verify output renders correctly, including fonts and any RTL layout if applicable
-
Update documentation
- Ensure the language appears in getting_started/supported-languages.md
- No changes to getting_started/README_languages_template.md are needed; it is generated from the supported list
-
Open a PR
- Describe the language added and any font/licensing considerations
- Attach screenshots of rendered outputs if possible
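Before opening the PR, the mapping entry from the first step can be sanity-checked with a few lines of Python. This is a hedged sketch rather than project code: `validate_language_entry` is a hypothetical helper, and it assumes the entry has already been loaded into a dictionary with the name, font, and rtl fields listed above, keyed by the language code.

```python
from pathlib import Path

# Fields described in the font-mapping step, with the type each should have.
REQUIRED_FIELDS = {"name": str, "font": str, "rtl": bool}


def validate_language_entry(code, entry, fonts_dir="src/co_op_translator/fonts"):
    """Return a list of problems found in one font-mapping entry (empty if none)."""
    problems = []
    if not code or not isinstance(code, str):
        problems.append("language code must be a non-empty string")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")
    # The font must be a file shipped in the fonts directory.
    font = entry.get("font")
    if isinstance(font, str) and not (Path(fonts_dir) / font).is_file():
        problems.append(f"font file not found: {font}")
    return problems
```

An empty result means the entry has the expected shape and its font file is present. It does not confirm that the font actually covers the script, which still needs the visual check from the local verification step.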
Example YAML entry:
new_lang(code):
  name: "New Language"
  font: "NotoSans-Medium.ttf"
  rtl: false
You can test the new language by running the following commands:
# Create and activate a virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
# Install the development package
pip install -e .
# Run the translation
translate -l "new_lang"
To ensure consistency and clarity in our project's commit history, we follow a specific commit message format for the final commit message when using the Squash and Merge strategy.
When a pull request (PR) is merged, the individual commits will be squashed into a single commit. The final commit message should follow the format below to maintain a clean and consistent history.
We use the following format for commit messages:
<type>: <description> (#<PR number>)
-
type: Specifies the category of the commit. We use the following types:
- Docs: For documentation updates.
- Build: For changes related to the build system or dependencies, including updates to configuration files, CI workflows, or the Dockerfile.
- Core: For modifications to the project's core functionality or features, particularly those involving files in the src/co_op_translator/core directory.
-
description: A concise summary of the change.
-
PR number: The number of the pull request associated with the commit.
Examples:
- Docs: Update installation instructions for clarity (#50)
- Core: Improve handling of image translation (#60)
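The format can also be checked mechanically. The sketch below is illustrative only and is not a tool used by the project: `parse_commit_message` is a hypothetical helper, and the pattern assumes that exactly the three types listed above are allowed.

```python
import re

# Mirrors "<type>: <description> (#<PR number>)" with the types from this guide.
COMMIT_PATTERN = re.compile(r"^(Docs|Build|Core): (\S.*) \(#(\d+)\)$")


def parse_commit_message(message):
    """Return (type, description, pr_number), or None if the format does not match."""
    match = COMMIT_PATTERN.match(message)
    if match is None:
        return None
    commit_type, description, pr_number = match.groups()
    return commit_type, description, int(pr_number)
```

Both example messages parse successfully, while a message such as "fix typo" yields None because it has no type prefix or PR number.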
Note
Currently, the Docs, Core, and Build prefixes are automatically added to PR titles based on the labels applied to the modified source code. As long as the correct label is applied, you typically don't need to manually update the PR title. You just need to verify that everything is correct and the prefix has been generated appropriately.
We use Squash and Merge as our default strategy for pull requests. This strategy ensures that commit messages follow our format, even if individual commits don't.
Reasons:
- A clean, linear project history.
- Consistency in commit messages.
- Reduced noise from minor commits (e.g., "fix typo").
When merging, ensure the final commit message follows the commit message format described above.
Example of Squash and Merge
If a PR contains the following commits:
- fix typo
- update README
- adjust formatting
They should be squashed into:
Docs: Improve documentation clarity and formatting (#65)
This section describes the simplest way for maintainers to publish a new release of Co-op Translator.
- Decide the next version number (we follow semantic versioning: MAJOR.MINOR.PATCH).
- Edit pyproject.toml and update the version field under [tool.poetry].
- Open a dedicated pull request that only changes the version (and any automatically updated lock/metadata files, if present).
- After review, use Squash and Merge and ensure the final commit message follows the format described above.
- Go to the GitHub repository page and open Releases → Draft a new release.
- Create a new tag (for example, v0.13.0) from the main branch.
- Set the release title to the same version (for example, v0.13.0).
- Click Generate release notes to auto-populate the changelog.
- Optionally edit the text (for example, to highlight newly supported languages or important changes).
- Publish the release.
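For the first step, the arithmetic of a semantic-version bump can be sketched as follows. `bump_version` is a hypothetical helper shown only to illustrate the MAJOR.MINOR.PATCH rule; it is not part of the release process described here and does not handle pre-release suffixes.

```python
import re

VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump_version(version, part):
    """Return the next MAJOR.MINOR.PATCH version for the given part."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(number) for number in match.groups())
    # Bumping a part resets every less significant part to zero.
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

Poetry's own `poetry version` command accepts the same bump rules (for example `poetry version minor`) and writes the result to pyproject.toml, so a helper like this is mainly useful for deciding the number before editing.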