Convert your Excel spreadsheets to beautifully translated content while preserving formatting
Clean, fast, and reliable Excel translation with multi-provider support and formatting preservation
A powerful Excel translation tool that preserves all formatting while translating content using various Large Language Model (LLM) providers. Translate Excel spreadsheets from Chinese to English (or other language pairs) while maintaining exact cell positioning, formulas, styling, and formatting.
- Formatting Preservation: Maintains all Excel formatting, fonts, colors, borders, and cell styles
- Fast Translation: Sub-2 second translation speeds with intelligent caching
- Batch Processing: Efficient batch translation with configurable batch sizes
- Smart Caching: Avoids retranslating identical content across sessions
- Formula Support: Translates text within Excel formulas while preserving formula logic
- Robust Error Handling: Exponential backoff retry logic for reliability
- Progress Saving: Automatic incremental saves to prevent data loss
- Backup Creation: Automatic backup creation before translation
- Progress Tracking: Real-time progress bars and detailed logging
- Multiple Providers: Support for OpenAI GPT models (GPT-4o, GPT-5)
- Language Flexibility: Configurable source and target languages
- Highly Configurable: Extensive CLI options and environment variable support
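
The "exponential backoff retry logic" mentioned in the feature list can be pictured with a short sketch. This is not the project's implementation: the function name `retry_with_backoff` and the `base_delay`, `max_delay`, and `sleep` parameters are hypothetical, and only the default of 5 retries comes from the `--max-retries` option.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,          # matches the documented --max-retries default
    base_delay: float = 1.0,       # hypothetical: first wait, in seconds
    max_delay: float = 30.0,       # hypothetical: upper bound on any single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing waits."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # The wait doubles on each retry (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(base_delay * 2 ** attempt, max_delay)
            # Up to 10% random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
            attempt += 1
```

A translation call would be wrapped as `retry_with_backoff(lambda: translate(batch))`, where `translate` stands in for whatever function sends a batch to the provider.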
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository
git clone https://github.com/tristan-mcinnis/Excel-Translator-Formatting-Intact-with-LLMs.git
cd Excel-Translator-Formatting-Intact-with-LLMs
# Install dependencies
uv sync

# Or, to install with pip instead: clone the repository
git clone https://github.com/tristan-mcinnis/Excel-Translator-Formatting-Intact-with-LLMs.git
cd Excel-Translator-Formatting-Intact-with-LLMs
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .

- Copy the environment template:
  cp example.env .env
- Edit the .env file with your API key:
  OPENAI_API_KEY=your_openai_api_key_here

# Simple translation
python main.py input.xlsx --output output.xlsx
# With specific model and languages
python main.py input.xlsx --output output.xlsx \
--provider openai --model gpt-5 \
--source-lang zh --target-lang en
# Batch processing with custom settings
python main.py input.xlsx --output output.xlsx \
--batch-size 10 --context "financial report" \
--max-retries 3 --save-interval 50

| Argument | Description | Default |
|---|---|---|
| `input_file` | Input Excel file path | Required |
| `--output`, `-o` | Output Excel file path | Required |
| `--provider` | Translation provider (`openai`) | `openai` |
| `--model` | Model name (`gpt-4o`, `gpt-5`) | `gpt-4o` |
| `--source-lang` | Source language code | zh |
| `--target-lang` | Target language code | en |
| `--context` | Translation context | spreadsheet |
| `--batch-size` | Cells per batch | 5 |
| `--max-retries` | Maximum retry attempts | 5 |
| `--save-interval` | Save every N cells | 20 |
| `--no-backup` | Skip backup creation | False |
| `--clear-cache` | Clear cache before start | False |
| `--cache-dir` | Cache directory path | translation_cache |
| `--log-level` | Logging level | INFO |
| `--log-file` | Log file path | Auto-generated |
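
For readers who want to see how these options fit together, the argument table above could be expressed as an `argparse` parser roughly like the one below. This is a sketch, not the project's actual `cli.py`: the function name `build_parser` is hypothetical, and only the flag names and defaults are taken from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Translate an Excel workbook")
    parser.add_argument("input_file", help="Input Excel file path")
    parser.add_argument("--output", "-o", required=True, help="Output Excel file path")
    parser.add_argument("--provider", default="openai", choices=["openai"])
    parser.add_argument("--model", default="gpt-4o")
    parser.add_argument("--source-lang", default="zh")
    parser.add_argument("--target-lang", default="en")
    parser.add_argument("--context", default="spreadsheet")
    parser.add_argument("--batch-size", type=int, default=5)
    parser.add_argument("--max-retries", type=int, default=5)
    parser.add_argument("--save-interval", type=int, default=20)
    # Boolean switches default to False and flip to True when present.
    parser.add_argument("--no-backup", action="store_true")
    parser.add_argument("--clear-cache", action="store_true")
    parser.add_argument("--cache-dir", default="translation_cache")
    parser.add_argument("--log-level", default="INFO")
    # None stands in for the "Auto-generated" log file name.
    parser.add_argument("--log-file", default=None)
    return parser


args = build_parser().parse_args(
    ["input.xlsx", "--output", "output.xlsx", "--batch-size", "10"]
)
print(args.batch_size, args.model, args.no_backup)  # prints: 10 gpt-4o False
```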

| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | Required |
| `DEFAULT_SOURCE_LANG` | Default source language | zh |
| `DEFAULT_TARGET_LANG` | Default target language | en |
| `DEFAULT_CONTEXT` | Default translation context | spreadsheet |
| `DEFAULT_BATCH_SIZE` | Default batch size | 5 |
| `LOG_LEVEL` | Logging level | INFO |
| `CACHE_DIR` | Cache directory | translation_cache |

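
As an illustration of how the variables above might be consumed, here is a minimal sketch that reads them from the process environment and applies the documented defaults. The `Settings` class and `load_settings` function are hypothetical names, not part of the project, and the sketch assumes the values from `.env` have already been exported into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    api_key: str
    source_lang: str
    target_lang: str
    context: str
    batch_size: int
    log_level: str
    cache_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the defaults in the table."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # The only variable marked "Required" in the table.
        raise RuntimeError("OPENAI_API_KEY is required but not set")
    return Settings(
        api_key=api_key,
        source_lang=env.get("DEFAULT_SOURCE_LANG", "zh"),
        target_lang=env.get("DEFAULT_TARGET_LANG", "en"),
        context=env.get("DEFAULT_CONTEXT", "spreadsheet"),
        batch_size=int(env.get("DEFAULT_BATCH_SIZE", "5")),
        log_level=env.get("LOG_LEVEL", "INFO"),
        cache_dir=env.get("CACHE_DIR", "translation_cache"),
    )
```

Passing a plain dictionary, for example `load_settings({"OPENAI_API_KEY": "sk-test", "DEFAULT_BATCH_SIZE": "8"})`, makes the defaults easy to check without touching the real environment.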
# Translate Chinese Excel file to English
python main.py chinese_report.xlsx --output english_report.xlsx

# Financial report with custom context
python main.py financial_data.xlsx --output translated_financial.xlsx \
--context "financial report with accounting terms" \
--batch-size 8 --save-interval 30
# Large file with performance optimization
python main.py large_dataset.xlsx --output translated_dataset.xlsx \
--batch-size 15 --max-retries 3 \
--log-level DEBUG

# Use GPT-5 for higher quality translation
python main.py input.xlsx --output output.xlsx \
--model gpt-5 --context "technical documentation"
# Use GPT-4o for faster translation
python main.py input.xlsx --output output.xlsx \
--model gpt-4o --batch-size 10

# Run all tests
pytest
# Run with coverage
pytest --cov=excel_translator --cov-report=html
# Run specific test file
pytest tests/test_translation.py
# Run with verbose output
pytest -v

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request

- Follow PEP 8 style guidelines
- Add type hints to all functions
- Write tests for new features
- Update documentation for API changes
- Use meaningful commit messages
- Python: 3.10 or higher
- Operating System: macOS, Linux, or Windows
- API Keys: OpenAI API key for translation services
- Memory: Minimum 4GB RAM (8GB+ recommended for large files)
excel_translator/
├── excel_translator/           # Main package
│   ├── __init__.py             # Package initialization
│   ├── cli.py                  # Command-line interface
│   ├── translation.py          # Core translation logic
│   ├── utils.py                # Utility functions
│   └── providers/              # Translation providers
│       ├── __init__.py
│       ├── base_provider.py    # Abstract base provider
│       └── openai_provider.py  # OpenAI implementation
├── tests/                      # Test suite
├── main.py                     # Entry point
├── pyproject.toml              # Project configuration
├── example.env                 # Environment template
├── README.md                   # This file
└── CLAUDE.md                   # Development guidelines
- Excel Files: .xlsx, .xlsm
- Language Detection: Automatic Chinese character detection
- Formula Support: Excel formulas with embedded text strings
- Formatting: All Excel formatting elements (fonts, colors, borders, etc.)
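
To make the language detection and formula support items above more concrete, the sketch below shows one way to detect Chinese text and translate only the quoted string literals inside a formula, leaving functions and cell references untouched. It is an illustration under stated assumptions, not the project's code: the names `contains_chinese` and `translate_formula` are hypothetical, detection is limited to the basic CJK Unified Ideographs block, and `translate` stands in for any callable that maps source text to target text.

```python
import re
from typing import Callable

# Assumption: "Chinese" means the basic CJK Unified Ideographs block.
CHINESE_RE = re.compile(r"[\u4e00-\u9fff]")
# A double-quoted Excel string literal; "" inside the quotes is an escaped quote.
STRING_LITERAL_RE = re.compile(r'"((?:[^"]|"")*)"')


def contains_chinese(text: str) -> bool:
    """Return True if text has at least one character in the assumed range."""
    return bool(CHINESE_RE.search(text))


def translate_formula(formula: str, translate: Callable[[str], str]) -> str:
    """Translate Chinese string literals in a formula; keep everything else."""

    def replace(match: "re.Match[str]") -> str:
        inner = match.group(1).replace('""', '"')  # unescape Excel quotes
        if not contains_chinese(inner):
            return match.group(0)  # leave non-Chinese literals untouched
        translated = translate(inner).replace('"', '""')  # re-escape quotes
        return f'"{translated}"'

    return STRING_LITERAL_RE.sub(replace, formula)


glossary = {"盈利": "Profit", "亏损": "Loss"}
print(translate_formula('=IF(A1>0,"盈利","亏损")', glossary.__getitem__))
# prints: =IF(A1>0,"Profit","Loss")
```

Working on string literals only is a deliberate simplification: sheet names and defined names that contain Chinese characters are not handled by this sketch.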
- Optimize Batch Size: Start with default (5), increase for faster processing
- Use Caching: Keep cache enabled to avoid retranslating identical content
- Save Intervals: Adjust based on file size (smaller intervals for large files)
- Model Selection: Use GPT-4o for speed, GPT-5 for quality
- Context Specification: Provide specific context for better translations
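
The caching tip above depends on identical content being recognized across sessions. A minimal sketch of such a cache follows, assuming a single JSON file keyed by a hash of the text, language pair, and model. The class name `TranslationCache`, the file name `cache.json`, and the key layout are all hypothetical; only the default directory name `translation_cache` comes from the options documented here.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


class TranslationCache:
    """Hypothetical persistent cache: one JSON file inside the cache directory."""

    def __init__(self, cache_dir: str = "translation_cache") -> None:
        self.path = Path(cache_dir) / "cache.json"  # assumed file name
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.entries: dict[str, str] = {}
        if self.path.exists():
            # Reload earlier sessions' translations.
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))

    @staticmethod
    def key(text: str, source: str, target: str, model: str) -> str:
        # Include languages and model so a change in either misses the cache.
        raw = "\x1f".join([source, target, model, text])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get_or_translate(
        self,
        text: str,
        source: str,
        target: str,
        model: str,
        translate: Callable[[str], str],
    ) -> str:
        k = self.key(text, source, target, model)
        if k not in self.entries:
            # Cache miss: call the translator once, then persist the result.
            self.entries[k] = translate(text)
            self.save()
        return self.entries[k]

    def save(self) -> None:
        payload = json.dumps(self.entries, ensure_ascii=False)
        self.path.write_text(payload, encoding="utf-8")
```

Writing the file after every miss is simple but slow for large workbooks; saving at an interval, in the spirit of `--save-interval`, would be a natural refinement.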
# Install with development dependencies
uv sync --extra dev
# Or with pip
pip install -e ".[dev]"

# Format code
black excel_translator/
# Sort imports
isort excel_translator/
# Lint code
flake8 excel_translator/
# Type checking
mypy excel_translator/

This project is licensed under the MIT License - see the LICENSE file for details.

- OpenAI for providing powerful language models
- openpyxl library for Excel file manipulation
- tqdm for progress tracking
- The open-source community for various tools and libraries
- Issues: GitHub Issues
- Documentation: This README and inline code documentation
- Examples: See the examples/ directory for sample usage

- Support for additional LLM providers (Anthropic, DeepSeek, Grok)
- Web interface for non-technical users
- Support for additional file formats (.xls, .csv)
- Real-time collaborative translation
- Translation quality metrics and validation
- Fine-tuning of custom translation models

Made with ❤️ by Tristan McInnis