AICoder Translator is a powerful CLI tool that leverages OpenAI's API to translate text files into different languages while maintaining formatting and structure. It employs a sophisticated three-step process: initial translation, expert editing, and aggressive critique with revision cycles to produce high-quality translations.
The tool excels in handling various text formats, with special support for markdown files containing YAML frontmatter (commonly used in static site generators like Jekyll and Hugo). It preserves original formatting, structure, and markdown elements while providing natural-sounding translations in the target language.
Key features include:
- 🌐 Support for translating to any language supported by OpenAI models
- 📝 Preservation of original text formatting and structure
- 🔄 Multiple revision cycles for quality improvement
- 📊 Detailed logging and cost estimation
- 🧠 Intelligent handling of markdown frontmatter
- 📈 Token usage tracking and cost analysis
# Install the package directly from GitHub
uv tool install git+https://github.com/2389-research/translator
Then you can use the command:
translator --help
# Clone the repository
git clone https://github.com/2389-research/translator.git
cd translator
# Install the package
uv tool install .
The translator requires an OpenAI API key to function. The easiest way to configure it is using the interactive setup:
translator config
This will walk you through setting up your configuration, including:
- Choosing where to store your configuration (.env files)
- Setting your OpenAI API key
- Configuring optional settings like default model
Alternatively, you can configure manually in several ways (in order of precedence):
- Environment variable:
export OPENAI_API_KEY=your_openai_api_key_here
- Local .env file:
Create a .env file in your current directory:
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
- User configuration directory:
For persistent configuration, create a .env file in the ~/.translator/ directory:
# Create the configuration directory
mkdir -p ~/.translator
# Create a .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > ~/.translator/.env
The first time you run the tool without a configured API key, it will offer to create this directory and a template configuration file for you.
- Alternative configuration location:
You can also place the .env file in ~/.config/translator/:
# Create the configuration directory
mkdir -p ~/.config/translator
# Create a .env file
echo "OPENAI_API_KEY=your_openai_api_key_here" > ~/.config/translator/.env
You can add these options to your .env file:
# Default model (optional, defaults to o3)
DEFAULT_MODEL=o3
# Output directory for translated files (optional)
OUTPUT_DIR=/path/to/output/directory
# Log level (optional, defaults to INFO)
LOG_LEVEL=INFO # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
Translate a file to another language:
translator input.txt Spanish
This will create a translated file named input.es.txt in the same directory.
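The naming rule above (input.txt becomes input.es.txt) can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: `default_output_path` and the small `LANGUAGE_CODES` table are hypothetical names, and the table stands in for the pycountry lookup the tool is described as using.

```python
from pathlib import Path

# Hypothetical, deliberately tiny lookup table. The real tool is described as
# resolving language names to ISO 639-1 codes with pycountry instead.
LANGUAGE_CODES = {
    "spanish": "es",
    "french": "fr",
    "german": "de",
    "japanese": "ja",
}


def default_output_path(input_path: str, language: str) -> Path:
    """Insert the language code between the file stem and its extension."""
    code = LANGUAGE_CODES[language.strip().lower()]
    source = Path(input_path)
    # "input.txt" has stem "input" and suffix ".txt", giving "input.es.txt".
    return source.with_name(f"{source.stem}.{code}{source.suffix}")


print(default_output_path("input.txt", "Spanish"))  # input.es.txt
```

The output stays in the same directory as the input because only the file name is replaced, which matches the behavior described above.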
# Configure the translator interactively
translator config
# Translate to French with a custom output file
translator README.md French -o translated_readme.md
# Use a specific OpenAI model
translator document.txt Japanese -m gpt-4o
# Skip the editing step for faster processing
translator long_document.txt German --no-edit
# Skip the critique step
translator quick_translation.txt Chinese --no-critique
# Specify number of critique-revision loops (1-5)
translator important_document.txt Korean --critique-loops 3
# View available models and pricing
translator --list-models
# Estimate cost without translating
translator large_document.txt Portuguese --estimate-only
# You can also use the explicit translate command (optional)
translator translate README.md French
If you're working directly in the source directory without installing the package:
uv run main.py input.txt Spanish
- Translation Pipeline
  - Frontmatter Handling: Detects and processes YAML frontmatter in markdown files
  - Content Translation: Preserves formatting while translating the main content
  - Editing Pass: Ensures natural language and accurate translation
  - Critique System: Multiple rounds of critique and revision for higher quality
- Token Management
  - Uses OpenAI's tiktoken library to count tokens
  - Checks document size against model limits
  - Estimates costs before translation
- Language Support
  - Automatically generates ISO 639-1 language codes
  - Supports all languages available in OpenAI models
  - Uses the pycountry library to map language names to codes
- Logging System
  - Generates detailed JSON logs of the translation process
  - Records all prompts, responses, and metadata
  - Creates narrative summaries of the translation process
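As a rough illustration of the frontmatter handling step listed above, the following sketch separates a YAML frontmatter block from the markdown body using only the standard library. It assumes frontmatter is delimited by `---` lines at the very top of the file. `split_frontmatter` is a hypothetical helper, and the tool itself depends on the python-frontmatter package rather than this code.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is "" when none is present."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            # Keep both delimiter lines with the frontmatter block.
            return "".join(lines[: index + 1]), "".join(lines[index + 1 :])
    # An opening delimiter with no closing one is treated as plain content.
    return "", text


document = "---\ntitle: Hello\n---\n# Heading\n\nBody text.\n"
frontmatter, body = split_frontmatter(document)
print(repr(frontmatter))  # '---\ntitle: Hello\n---\n'
print(repr(body))         # '# Heading\n\nBody text.\n'
```

Splitting first lets the two parts be handled separately, so the frontmatter structure can be left intact while the body is translated.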
The codebase is organized into modular components:
- translator/
  - __init__.py: Package initialization
  - cli.py: Command-line interface implementation
  - config.py: Model configuration and pricing information
  - cost.py: Cost estimation and calculation
  - file_handler.py: File I/O utilities
  - frontmatter_handler.py: Processes YAML frontmatter
  - language.py: Language code detection and mapping
  - log_interpreter.py: Analyzes and creates narratives from logs
  - prompts.py: Centralized storage for system and user prompts
  - token_counter.py: Token counting functions
  - translator.py: Core translation logic
- tests/: Comprehensive test suite
- samples/: Example files for testing
The project uses pytest for testing. To run tests:
uv run pytest
The prompting system is designed for high-quality translations with multiple improvement stages:
- Initial translation with formatting preservation
- Expert editing for natural language
- Critical analysis and feedback
- Application of critique for final polishing
This multi-stage approach produces translations that read as if they were originally written in the target language while maintaining fidelity to the source.
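The control flow of these stages might look roughly like the sketch below. It is not the project's implementation: `run_pipeline`, the `Model` alias, and the prompt strings are all hypothetical, and the model is passed in as a plain function so the example runs without an API key. Treating `critique_loops=0` as the equivalent of `--no-critique` is also an assumption.

```python
from typing import Callable

# A "model" here is any function that maps a prompt to a completion.
Model = Callable[[str], str]


def run_pipeline(
    text: str,
    language: str,
    model: Model,
    edit: bool = True,
    critique_loops: int = 1,
) -> str:
    """Translate, optionally edit, then run critique and revision rounds."""
    if not 0 <= critique_loops <= 5:
        raise ValueError("critique_loops must be between 0 and 5")
    draft = model(f"Translate into {language}, preserving formatting:\n{text}")
    if edit:
        draft = model(f"Edit this {language} text so it reads naturally:\n{draft}")
    for _ in range(critique_loops):
        # Each round produces feedback, then applies it to the current draft.
        feedback = model(f"Critique this {language} translation:\n{draft}")
        draft = model(f"Revise the translation.\nFeedback:\n{feedback}\nText:\n{draft}")
    return draft


calls: list[str] = []


def fake_model(prompt: str) -> str:
    """Stand-in model that records the first word of each prompt."""
    calls.append(prompt.split(maxsplit=1)[0])
    return "texto"


result = run_pipeline("text", "Spanish", fake_model, critique_loops=2)
print(calls)  # ['Translate', 'Edit', 'Critique', 'Revise', 'Critique', 'Revise']
```

Each critique round costs two model calls, which is why raising the number of loops increases both quality checks and token usage.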
- Python 3.13+
- Dependencies (automatically installed):
  - openai>=1.78.1
  - python-dotenv>=1.1.0
  - rich>=13.9.4
  - tiktoken>=0.7.0
  - pycountry>=23.12.10
  - python-frontmatter>=1.1.0
  - pytest>=7.4.0 (for testing)
The tool is designed to be extended with new models and features as OpenAI's API evolves.
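One way such extensibility could be structured is a registry with one entry per model, holding the context limit and prices needed for the size check and cost estimate. The sketch below is an assumption about the shape of that data, not the contents of config.py: `ModelInfo`, `MODELS`, and `estimate_cost` are hypothetical names, and the model name and prices are made-up placeholders rather than real OpenAI pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    max_tokens: int                  # context limit used for the size check
    input_price_per_million: float   # USD per 1M input tokens
    output_price_per_million: float  # USD per 1M output tokens


# Placeholder entry: the name and numbers are invented for illustration.
MODELS = {
    "example-model": ModelInfo(
        max_tokens=100_000,
        input_price_per_million=2.0,
        output_price_per_million=8.0,
    ),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD, rejecting oversized documents."""
    info = MODELS[model]
    if input_tokens > info.max_tokens:
        raise ValueError(
            f"{input_tokens} tokens exceeds the {info.max_tokens}-token limit of {model}"
        )
    return (
        input_tokens * info.input_price_per_million
        + output_tokens * info.output_price_per_million
    ) / 1_000_000


print(estimate_cost("example-model", 50_000, 50_000))  # 0.5
```

With this layout, supporting a new model would mean adding one dictionary entry, leaving the estimation logic unchanged.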