Fix UTF-8 encoding issues (mojibake) in markdown documentation files.
- Automatic mojibake repair using ftfy
- Unicode to ASCII conversion for symbols, emoji, and smart quotes
- Cross-platform - works on macOS, Linux, and Windows
- Zero configuration - just run it
- Safe - use `--dry-run` to preview changes
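As a rough illustration of what a dry run involves, the sketch below reads a file, computes the fixed text, and only writes when `dry_run` is false. The `fix_file` helper and its `fixer` callback are hypothetical names chosen for this example; they are not taken from this tool's source.

```python
from pathlib import Path
from typing import Callable


def fix_file(path: Path, fixer: Callable[[str], str], dry_run: bool = False) -> bool:
    """Run `fixer` over one file and report whether its text would change.

    With dry_run=True the file is only read, never written.
    """
    original = path.read_text(encoding="utf-8")
    fixed = fixer(original)
    changed = fixed != original
    if changed and not dry_run:
        path.write_text(fixed, encoding="utf-8")
    return changed
```

Returning a boolean lets a caller count or list the files that would be touched before committing to any write.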
This tool was created to solve a specific problem with Claude.ai artifacts.
The Problem:
When downloading markdown/text artifacts from Claude.ai using Firefox, UTF-8 characters often get double-encoded, resulting in mojibake (garbled text).
- Claude.ai generates UTF-8 text with special characters
- Firefox downloads with incorrect encoding handling
- File contains corrupted characters like `âœ"` instead of `✓`
This tool fixes these issues automatically, and it also works on UTF-8 mojibake from other sources.
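The corruption described above can be reproduced in a few lines of Python. This is a minimal sketch, assuming the UTF-8 bytes were misread as Windows-1252 (the codec that produces visible characters such as `œ` and `€`, where strict Latin-1 would give control characters). `simulate_mojibake` is a hypothetical helper for illustration, not part of this tool.

```python
def simulate_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


if __name__ == "__main__":
    # Each multi-byte UTF-8 character turns into two or three wrong characters.
    for original in ["✓", "—", "→", "é"]:
        print(repr(original), "->", repr(simulate_mojibake(original)))
```

A few byte values have no Windows-1252 mapping, so some characters would raise `UnicodeDecodeError` here instead of producing garbled text.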
Before (corrupted):

    âœ" Task complete
    canâ€™t find file
    Price: $50 â€" $100
    â†' Next step

After (fixed):

    [done] Task complete
    can't find file
    Price: $50 -- $100
    -> Next step

| Corrupted | Original | Cause |
|---|---|---|
| âœ" | ✓ (checkmark) | UTF-8 read as Latin-1 |
| â€™ | ’ (smart quote) | Smart quote corruption |
| â€" | — (em dash) | Em dash corruption |
| â†' | → (arrow) | Arrow corruption |
| Ã© | é (accented e) | Accent corruption |
| âœ… | ✅ (emoji) | Emoji corruption |

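Every row in the table has the same shape: UTF-8 bytes decoded with a single-byte codec. One round of that damage can often be reversed by re-encoding with the wrong codec and decoding as UTF-8. The sketch below shows the idea only, assuming Windows-1252 was the wrong codec. `undo_mojibake` is a hypothetical helper, and this tool relies on ftfy, which handles many more cases.

```python
def undo_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse one round of UTF-8-read-as-cp1252 corruption.

    Returns the input unchanged when the round trip fails, which is the
    usual outcome for text that was never corrupted in this way.
    """
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    print(undo_mojibake("Ã©"))    # corrupted accent
    print(undo_mojibake("café"))  # already correct, left alone
```

This whole-string approach gives up as soon as a string mixes corrupted and healthy non-ASCII text, which is one reason to prefer a dedicated library for real files.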
No installation required. Just run with uv:

    uv run fix-encoding.py

To install from a clone of the repository:

    git clone https://github.com/firdaus-aziz/fix-encoding.git
    cd fix-encoding
    pip install -e .

Or install with pip:

    pip install fix-encoding

Usage:

    # Fix all markdown files in current directory
    uv run fix-encoding.py

    # Fix specific directory
    uv run fix-encoding.py ./docs

    # Preview changes without modifying files
    uv run fix-encoding.py --dry-run

    # Verbose output
    uv run fix-encoding.py -v

| Corrupted | Fixed |
|---|---|
| â€™ | ' |
| â€" | - |
| Ã© | é |
| âœ… | Yes |

| Unicode | ASCII |
|---|---|
| ✅ ✓ | Yes [done] |
| ❌ | No |
| → ← ↑ ↓ | -> <- ^ v |
| • † | * |
| — – | -- - |
| ‘ ’ “ ” | ' ' " " |
| ≥ ≤ | >= <= |
| × | x |

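A conversion like the one in the table can be expressed as a character translation table. The sketch below is an illustration built from the rows above, not this tool's actual mapping. `ASCII_MAP` and `to_ascii` are hypothetical names, and `†` is left out because the table does not make its replacement clear.

```python
# Mapping taken from the table above; each key is a single Unicode character.
ASCII_MAP = {
    "✅": "Yes", "✓": "[done]", "❌": "No",
    "→": "->", "←": "<-", "↑": "^", "↓": "v",
    "•": "*",
    "—": "--", "–": "-",
    "‘": "'", "’": "'", "“": '"', "”": '"',
    "≥": ">=", "≤": "<=", "×": "x",
}

# str.maketrans accepts single-character keys with replacement strings of any length.
_TABLE = str.maketrans(ASCII_MAP)


def to_ascii(text: str) -> str:
    """Replace the mapped symbols; every other character passes through."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(to_ascii("✓ Task complete"))
    print(to_ascii("→ Next step"))
```

Characters without an entry, such as accented letters, are left untouched, so the output is only pure ASCII if the input contains nothing outside the map.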
- Python 3.9+
- ftfy (automatically installed when using `uv run`)
MIT