Skip to content

tristan-mcinnis/Excel-Translator-Formatting-Intact-with-LLMs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Excel Translator

Excel Translator

Convert your Excel spreadsheets to beautifully translated content while preserving formatting

Python 3.10+ License MIT UV Package Manager Tests Passing code style black

Clean, fast, and reliable Excel translation with multi-provider support and formatting preservation

✨ Features β€’ πŸš€ Quick Start β€’ πŸ“– Usage β€’ βœ… Testing β€’ 🀝 Contributing


✨ Features

A powerful Excel translation tool that preserves all formatting while translating content using various Large Language Model (LLM) providers. Translate Excel spreadsheets from Chinese to English (or other language pairs) while maintaining exact cell positioning, formulas, styling, and formatting.

  • 🎯 Formatting Preservation: Maintains all Excel formatting, fonts, colors, borders, and cell styles
  • ⚑ Fast Translation: Sub-2 second translation speeds with intelligent caching
  • πŸ”„ Batch Processing: Efficient batch translation with configurable batch sizes
  • 🧠 Smart Caching: Avoids retranslating identical content across sessions
  • πŸ“Š Formula Support: Translates text within Excel formulas while preserving formula logic
  • πŸ”§ Robust Error Handling: Exponential backoff retry logic for reliability
  • πŸ’Ύ Progress Saving: Automatic incremental saves to prevent data loss
  • πŸ›‘οΈ Backup Creation: Automatic backup creation before translation
  • πŸ“ˆ Progress Tracking: Real-time progress bars and detailed logging
  • πŸ”€ Multiple Providers: Support for OpenAI GPT models (GPT-4o, GPT-5)
  • 🌐 Language Flexibility: Configurable source and target languages
  • βš™οΈ Highly Configurable: Extensive CLI options and environment variable support

πŸš€ Quick Start

Installation

Using UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/tristan-mcinnis/Excel-Translator-Formatting-Intact-with-LLMs.git
cd Excel-Translator-Formatting-Intact-with-LLMs

# Install dependencies
uv sync

Using pip

# Clone the repository
git clone https://github.com/tristan-mcinnis/Excel-Translator-Formatting-Intact-with-LLMs.git
cd Excel-Translator-Formatting-Intact-with-LLMs

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .

Configuration

  1. Copy environment template:

    cp example.env .env
  2. Edit .env file with your API key:

    OPENAI_API_KEY=your_openai_api_key_here

Basic Usage

# Simple translation
python main.py input.xlsx --output output.xlsx

# With specific model and languages
python main.py input.xlsx --output output.xlsx \
  --provider openai --model gpt-5 \
  --source-lang zh --target-lang en

# Batch processing with custom settings
python main.py input.xlsx --output output.xlsx \
  --batch-size 10 --context "financial report" \
  --max-retries 3 --save-interval 50

πŸ“– Usage

Command Line Arguments

Argument Description Default
input_file Input Excel file path Required
--output, -o Output Excel file path Required
--provider Translation provider (openai) openai
--model Model name (gpt-4o, gpt-5) gpt-4o
--source-lang Source language code zh
--target-lang Target language code en
--context Translation context spreadsheet
--batch-size Cells per batch 5
--max-retries Maximum retry attempts 5
--save-interval Save every N cells 20
--no-backup Skip backup creation False
--clear-cache Clear cache before start False
--cache-dir Cache directory path translation_cache
--log-level Logging level INFO
--log-file Log file path Auto-generated

Environment Variables

Variable Description Default
OPENAI_API_KEY OpenAI API key Required
DEFAULT_SOURCE_LANG Default source language zh
DEFAULT_TARGET_LANG Default target language en
DEFAULT_CONTEXT Default translation context spreadsheet
DEFAULT_BATCH_SIZE Default batch size 5
LOG_LEVEL Logging level INFO
CACHE_DIR Cache directory translation_cache

Usage Examples

Basic Translation

# Translate Chinese Excel file to English
python main.py chinese_report.xlsx --output english_report.xlsx

Advanced Usage

# Financial report with custom context
python main.py financial_data.xlsx --output translated_financial.xlsx \
  --context "financial report with accounting terms" \
  --batch-size 8 --save-interval 30

# Large file with performance optimization
python main.py large_dataset.xlsx --output translated_dataset.xlsx \
  --batch-size 15 --max-retries 3 \
  --log-level DEBUG

Using Different Models

# Use GPT-5 for higher quality translation
python main.py input.xlsx --output output.xlsx \
  --model gpt-5 --context "technical documentation"

# Use GPT-4o for faster translation
python main.py input.xlsx --output output.xlsx \
  --model gpt-4o --batch-size 10

βœ… Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=excel_translator --cov-report=html

# Run specific test file
pytest tests/test_translation.py

# Run with verbose output
pytest -v

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add type hints to all functions
  • Write tests for new features
  • Update documentation for API changes
  • Use meaningful commit messages

πŸ“‹ Requirements

  • Python: 3.10 or higher
  • Operating System: macOS, Linux, or Windows
  • API Keys: OpenAI API key for translation services
  • Memory: Minimum 4GB RAM (8GB+ recommended for large files)

πŸ—οΈ Project Structure

excel_translator/
β”œβ”€β”€ excel_translator/           # Main package
β”‚   β”œβ”€β”€ __init__.py            # Package initialization
β”‚   β”œβ”€β”€ cli.py                 # Command-line interface
β”‚   β”œβ”€β”€ translation.py         # Core translation logic
β”‚   β”œβ”€β”€ utils.py               # Utility functions
β”‚   └── providers/             # Translation providers
β”‚       β”œβ”€β”€ __init__.py
β”‚       β”œβ”€β”€ base_provider.py   # Abstract base provider
β”‚       └── openai_provider.py # OpenAI implementation
β”œβ”€β”€ tests/                     # Test suite
β”œβ”€β”€ main.py                    # Entry point
β”œβ”€β”€ pyproject.toml            # Project configuration
β”œβ”€β”€ example.env               # Environment template
β”œβ”€β”€ README.md                 # This file
└── CLAUDE.md                 # Development guidelines

πŸ” Supported File Types

  • Excel Files: .xlsx, .xlsm
  • Language Detection: Automatic Chinese character detection
  • Formula Support: Excel formulas with embedded text strings
  • Formatting: All Excel formatting elements (fonts, colors, borders, etc.)

⚑ Performance Tips

  1. Optimize Batch Size: Start with default (5), increase for faster processing
  2. Use Caching: Keep cache enabled to avoid retranslating identical content
  3. Save Intervals: Adjust based on file size (smaller intervals for large files)
  4. Model Selection: Use GPT-4o for speed, GPT-5 for quality
  5. Context Specification: Provide specific context for better translations

πŸ› οΈ Development

Development Installation

# Install with development dependencies
uv sync --extra dev

# Or with pip
pip install -e ".[dev]"

Code Quality Tools

# Format code
black excel_translator/

# Sort imports
isort excel_translator/

# Lint code
flake8 excel_translator/

# Type checking
mypy excel_translator/

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenAI for providing powerful language models
  • openpyxl library for Excel file manipulation
  • tqdm for progress tracking
  • The open-source community for various tools and libraries

πŸ“ž Support

  • Issues: GitHub Issues
  • Documentation: This README and inline code documentation
  • Examples: See the examples/ directory for sample usage

🚧 Roadmap

  • Support for additional LLM providers (Anthropic, DeepSeek, Grok)
  • Web interface for non-technical users
  • Support for additional file formats (.xls, .csv)
  • Real-time collaborative translation
  • Translation quality metrics and validation
  • Custom translation models fine-tuning

Made with ❀️ by Tristan McInnis

About

A powerful Excel translation tool that preserves all formatting while translating content using various LLM providers

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages