HTR Evaluation System

A system for evaluating and visualizing the accuracy of HTR (Handwritten Text Recognition) JSON outputs compared to gold standard legal documents. It was developed specifically for cases when legal documents can be categorized according to types that follow the same structure, which can be defined and extracted using vision-enhanced LLMs.

System Requirements

Python 3.6+
Node.js 14+ and npm 6+ (for dashboard visualization)
Internet connection (for npm package installation)

Installation

1. Set Up Python Environment

First, make sure Python 3.6+ is installed:

python --version
# or
python3 --version

Clone this repository (or download and extract the ZIP file):

git clone https://github.com/username/htr-evaluation-system.git
cd htr-evaluation-system

Create and activate a virtual environment:

Linux/macOS:

python -m venv htr_llm_env
source htr_llm_env/bin/activate

Windows:

python -m venv htr_llm_env
htr_llm_env\Scripts\activate

2. Install Required Python Packages

# Core requirements
pip install -U pip
pip install json re argparse typing

3. Install Node.js and npm

The dashboard requires Node.js and npm. Install them if you don't have them already:

Option 1: Download and install from nodejs.org

Option 2: Using package managers:

Linux (Ubuntu/Debian):

sudo apt update
sudo apt install nodejs npm

macOS (with Homebrew):

brew install node

Verify the installation:

node --version
npm --version

4. Test the Installation

Run a simple test to ensure everything is set up correctly:

python -c "import json, re, argparse, os; print('All required packages are installed!')"

Usage

Basic Usage

python htr_evaluation.py exemple_data/mock-gold-json.json exemple_data/mock-llm-json.json

This will:

Evaluate the predicted JSON against the gold standard
Generate a dashboard in the default output directory (./output/)
Launch the dashboard viewer automatically

Options

python htr_evaluation.py exemple_data/mock-gold-json.json exemple_data/mock-llm-json.json --output_dir /custom/path

--output_dir: Custom directory to save evaluation results (default: ./output/)

Features

Evaluation Process

The system:

Flattens nested JSON structures for comparison
Normalizes field names by replacing spaces with underscores
Applies field-specific normalization:
- Phone numbers: Removes non-digit characters
- Numeric fields: Extracts digits
- Dates: Extracts numeric components
Computes string similarity using Levenshtein distance
Weights fields based on importance (adjustable for each specific evaluation campaign)
Categorizes errors into four types

Error Categories

Critical Error (0%): Completely incorrect or missing values
Semantic Difference (50%): Similar meaning but significant differences
Minor Error (80%): Small differences
Perfect Match (100%): Exact match after normalization

Dashboard Features

Overall evaluation score
Field coverage percentage
Error distribution visualization
Detailed table of all errors

How It Works

After running the evaluation, the script will:

Generate the dashboard files
Install npm dependencies
Start the Parcel development server
Display access instructions

Open your browser and navigate to http://localhost:1234 to view the dashboard.

Customization

You can modify:

Similarity thresholds in categorize_error()
Field weights in get_field_weight()
Field normalization in normalize_field()

Troubleshooting

Common Issues

Problem: ModuleNotFoundError: No module named 'xyz'
Solution: Install the missing package: pip install xyz

Problem: Dashboard doesn't launch
Solution:

Make sure Node.js and npm are installed correctly
Manually navigate to the dashboard directory and run:

cd output/[file_name]_dashboard
npm install
npm start

Open your browser to http://localhost:1234

Legal

This code is distributed under the CC BY-NC-SA 4.0 license, developed by Mikhail Biriuchinskii, NLP Engineer at ObTIC, Sorbonne University.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
exemple_data		exemple_data
llm_prompt		llm_prompt
output/exemple_output		output/exemple_output
.gitignore		.gitignore
README.md		README.md
dashboard_generator.js		dashboard_generator.js
htr-evaluation-script.py		htr-evaluation-script.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HTR Evaluation System

System Requirements

Installation

1. Set Up Python Environment

2. Install Required Python Packages

3. Install Node.js and npm

4. Test the Installation

Usage

Basic Usage

Options

Features

Evaluation Process

Error Categories

Dashboard Features

How It Works

Customization

Troubleshooting

Common Issues

Legal

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HTR Evaluation System

System Requirements

Installation

1. Set Up Python Environment

2. Install Required Python Packages

3. Install Node.js and npm

4. Test the Installation

Usage

Basic Usage

Options

Features

Evaluation Process

Error Categories

Dashboard Features

How It Works

Customization

Troubleshooting

Common Issues

Legal

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages