Resurrecting history with AI. A specialized PaddleOCR-VL fine-tune that deciphers 1545 Gothic Spanish manuscripts with 98% accuracy (1.6% CER). Features a "Time Machine" engine for archaic-to-modern text normalization and translation. Built for the Baidu ERNIE Challenge.

deepeshahlawat/Chronos-VL

📜 Chronos-VL: The 1545 Resurrection Engine

“We don't just read history. We translate time.”

Chronos-VL is a specialized Vision–Language Model system designed to decipher Early Modern Spanish Gothic manuscripts (c. 1545).
Built for the Baidu ERNIE AI Developer Challenge.


📉 The Problem

Millions of pages of legal, social, and cultural history remain locked in Spanish archives (e.g., The RODRIGO Corpus).

  • Standard OCR (Tesseract / Base Models)
    Fails catastrophically (≈68% error rate) due to:

    • Ink bleed-through
    • Dense ligatures
    • Gothic calligraphy
    • The infamous Long S (ſ / f) ambiguity
  • The Gap
    Historians need searchable, modernized text, not just noisy transcriptions.


💡 The Solution: Chronos-VL System

Chronos-VL introduces a two-stage vision–language pipeline purpose-built for 16th-century manuscripts.

1️⃣ Visual Perception (AI Layer)

  • Fine-tuned PaddleOCR-VL-0.9B
  • Training via ERNIEKit (SFT)
  • Optimized on NVIDIA A100 (80GB)
  • Learns period-specific Gothic features and ligatures

2️⃣ Semantic Modernization (Logic Layer)

  • Chronos Engine post-processes OCR output
  • Normalizes archaic spelling:
    • dixo → dijo
    • facer → hacer
  • Produces:
    • Clean modern Spanish
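
The normalization step above can be illustrated with a small dictionary-based sketch. This is not the repository's ChronosPostProcessor; the names `ARCHAIC_TO_MODERN` and `modernize` are hypothetical, only the two word pairs come from this README, and folding the long s (ſ) into a regular s is an assumption about how such a step might begin.

```python
import re

# Hypothetical lookup table; only these two pairs appear in the README.
ARCHAIC_TO_MODERN = {
    "dixo": "dijo",
    "facer": "hacer",
}

_WORD = re.compile(r"\w+")


def _match_case(source: str, target: str) -> str:
    """Carry the casing of the archaic word over to its modern form."""
    if source.isupper():
        return target.upper()
    if source[:1].isupper():
        return target.capitalize()
    return target


def modernize(text: str) -> str:
    """Replace known archaic spellings word by word, leaving other words untouched."""
    # Assumption: the long s is folded into a regular s before lookup.
    text = text.replace("ſ", "s")

    def swap(match: re.Match) -> str:
        word = match.group(0)
        modern = ARCHAIC_TO_MODERN.get(word.lower())
        return _match_case(word, modern) if modern else word

    return _WORD.sub(swap, text)


print(modernize("Dixo que queria facer"))
```

A plain lookup like this only covers whole-word substitutions; context-dependent spellings would need rules or a language model on top.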

📊 Performance Benchmarks

A/B testing conducted on 100 unseen manuscript pages.

| Metric | PaddleOCR (Base) | Chronos-VL (Ours) | Improvement |
| --- | --- | --- | --- |
| Median Character Error Rate | 19.82% | 1.64% | 12× better |
| Usable Output (<5% Error) | 1% | 77% | 77× increase (+76 points) |
| Word Error Rate | 74.44% | 17.35% | 4× better |
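
For readers unfamiliar with these metrics, the following is a minimal sketch of how character and word error rates are commonly computed from Levenshtein edit distance. It is not the code in Evaluation.ipynb; the function names and the `summarize` helper are assumptions for illustration, and the 5% threshold mirrors the "usable output" row above.

```python
from statistics import median


def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits divided by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word edits divided by reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / max(len(ref_words), 1)


def summarize(pairs, usable_threshold: float = 0.05) -> dict:
    """Median CER and share of samples below the threshold, over (ref, hyp) pairs."""
    cers = [cer(r, h) for r, h in pairs]
    return {
        "median_cer": median(cers),
        "usable_fraction": sum(c < usable_threshold for c in cers) / len(cers),
    }


print(cer("dijo", "dixo"), wer("dijo que si", "dixo que si"))
```

Reporting the median rather than the mean keeps a few badly degraded pages from dominating the headline figure.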

🚀 Interactive Demo

Upload any 16th-century manuscript and see Chronos-VL in action.

Open In Colab


📂 Repository Structure

Here is a breakdown of the core files in this repository:

| File | Description |
| --- | --- |
| 📓 Demo_Chronos_VL.ipynb | The Interactive App. A complete Colab notebook that launches the Gradio interface, where you can upload images and compare the baseline and fine-tuned models. |
| 📓 Evaluation.ipynb | The Proof. The notebook used to benchmark the model against 100 unseen images. Generates the CER/WER statistics and comparisons. |
| 🐍 chronos_processing.py | The Logic Layer. Contains the custom ChronosPostProcessor class for hallucination filtering and archaic text modernization. |
| 📄 Finetuning_script.txt | The Training Protocol. The exact commands and configurations used with ERNIEKit to train the model on the NVIDIA A100 GPU. |
| 🖼️ Rodrigo_*.png | Sample Data. Authentic 1545 manuscript fragments from the test set. You can use these to test the demo immediately. |
| 📊 training_loss.jpg | Convergence Metrics. Plot of the training loss decreasing over 400 steps. |
| 🖼️ finetune_success.jpg | Visual Evidence. Sample output showing how well the fine-tuned model performs. |

🧠 Technical Architecture

  • Base Model: PaddleOCR-VL-0.9B (Vision-Language)
  • Training Framework: ERNIEKit (SFT)
  • Hardware: NVIDIA A100 (80GB)
  • Dataset: RODRIGO Corpus (1545) - 9,000 Text Lines

🎥 Project Video

https://www.youtube.com/watch?v=PaK24VT_3Jk

🙏 Acknowledgements

  • Baidu PaddlePaddle Team for the ERNIEKit framework.
  • Universitat Politècnica de València for the RODRIGO dataset.
