Comparison Report

Overview

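| Metric | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| score | 0.6957 | 0.8439 | 0.5395 | 0.6344 | 0.8597 | 0.7668 | 0.8617 |
| passed | 352 | 427 | 273 | 321 | 435 | 388 | 436 |
| failed | 154 | 79 | 233 | 185 | 71 | 118 | 70 |
| assertion_count | 506 | 506 | 506 | 506 | 506 | 506 | 506 |
| case_count | 54 | 54 | 54 | 54 | 54 | 54 | 54 |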
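
The report does not state how `score` is computed, but the numbers above are consistent with `passed / assertion_count` rounded to four decimals. A minimal sketch that checks this reading against the Overview values (the helper name `pass_rate` is hypothetical, not taken from the benchmark code):

```python
# Values copied from the Overview table.
ASSERTION_COUNT = 506
passed = {
    "qwen": 352, "plain": 427, "paddleocr": 273, "mineru": 321,
    "marker": 435, "docling": 388, "bbox": 436,
}
reported_score = {
    "qwen": 0.6957, "plain": 0.8439, "paddleocr": 0.5395, "mineru": 0.6344,
    "marker": 0.8597, "docling": 0.7668, "bbox": 0.8617,
}


def pass_rate(passed_count: int, total: int) -> float:
    """Assumed definition: fraction of assertions passed, to four decimals."""
    return round(passed_count / total, 4)


# Recompute every score and collect any system whose value disagrees.
computed_score = {name: pass_rate(n, ASSERTION_COUNT) for name, n in passed.items()}
mismatches = [
    name for name in passed
    if abs(computed_score[name] - reported_score[name]) > 1e-9
]
print(mismatches)
```

Under this reading the overall score is weighted by assertions, so it need not equal the plain mean of the per-case scores.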

Failures by Type

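| Metric | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| caption_binding | 1 | 1 | 1 | 1 | 1 | 2 | 1 |
| element_grounded | 9 | 9 | 9 | 9 | 9 | 9 | 0 |
| formula_contains | 21 | 9 | 11 | 16 | 11 | 19 | 9 |
| formula_visual | 1 | 0 | 0 | 0 | 1 | 1 | 0 |
| reading_order | 22 | 4 | 24 | 12 | 8 | 27 | 4 |
| regex_absence | 31 | 42 | 36 | 25 | 28 | 24 | 42 |
| regex_match | 7 | 2 | 3 | 4 | 3 | 4 | 2 |
| table_cell_exists | 37 | 2 | 90 | 90 | 0 | 13 | 2 |
| table_grid_cell | 1 | 1 | 31 | 7 | 1 | 2 | 1 |
| table_shape | 5 | 1 | 10 | 10 | 1 | 1 | 1 |
| text_absence | 2 | 4 | 3 | 1 | 1 | 1 | 4 |
| text_presence | 17 | 4 | 15 | 10 | 7 | 15 | 4 |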
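
As a consistency check, the per-type failure counts in each column should add up to the `failed` row of the Overview. A short sketch of that check, using only values copied from the two tables (the variable names are illustrative):

```python
SYSTEMS = ["qwen", "plain", "paddleocr", "mineru", "marker", "docling", "bbox"]

# Rows copied from the Failures by Type table, in SYSTEMS column order.
failures_by_type = {
    "caption_binding":   [1, 1, 1, 1, 1, 2, 1],
    "element_grounded":  [9, 9, 9, 9, 9, 9, 0],
    "formula_contains":  [21, 9, 11, 16, 11, 19, 9],
    "formula_visual":    [1, 0, 0, 0, 1, 1, 0],
    "reading_order":     [22, 4, 24, 12, 8, 27, 4],
    "regex_absence":     [31, 42, 36, 25, 28, 24, 42],
    "regex_match":       [7, 2, 3, 4, 3, 4, 2],
    "table_cell_exists": [37, 2, 90, 90, 0, 13, 2],
    "table_grid_cell":   [1, 1, 31, 7, 1, 2, 1],
    "table_shape":       [5, 1, 10, 10, 1, 1, 1],
    "text_absence":      [2, 4, 3, 1, 1, 1, 4],
    "text_presence":     [17, 4, 15, 10, 7, 15, 4],
}

# The "failed" row copied from the Overview table.
failed_overview = [154, 79, 233, 185, 71, 118, 70]

# Sum each column across all failure types.
column_totals = [sum(col) for col in zip(*failures_by_type.values())]
totals_by_system = dict(zip(SYSTEMS, column_totals))
print(totals_by_system)
```

The column totals match the `failed` row for all seven systems, so every failed assertion is attributed to exactly one type in this table.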

Per-Case Scores

| Case | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| arxiv_mulco_p4 | 0.3750 | 0.4375 | 0.4375 | 0.3125 | 0.8125 | 0.3125 | 0.4375 |
| arxiv_mulco_p7 | 0.6923 | 1.0000 | 0.3077 | 0.2308 | 1.0000 | 1.0000 | 1.0000 |
| arxiv_cluener_p2 | 0.7778 | 0.7778 | 0.4444 | 0.5556 | 0.8889 | 0.8889 | 0.7778 |
| arxiv_cluener_p4 | 0.8750 | 0.7500 | 0.2500 | 0.2500 | 0.8750 | 0.8750 | 0.7500 |
| contract_service_001_p1 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 |
| contract_service_001_p2 | 0.9167 | 0.9167 | 0.8333 | 0.9167 | 0.9167 | 0.9167 | 0.9167 |
| contract_service_001_p3 | 0.3077 | 0.9231 | 0.3077 | 0.3077 | 0.9231 | 0.7692 | 0.9231 |
| contract_service_001_p4 | 0.9167 | 0.7500 | 0.7500 | 0.8333 | 0.7500 | 0.5833 | 0.7500 |
| contract_service_001_p5 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| contract_service_001_p6 | 0.6923 | 0.8462 | 0.7692 | 0.8462 | 0.8462 | 0.6923 | 0.8462 |
| contract_service_001_p7 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.5556 | 0.8889 |
| contract_service_001_p8 | 1.0000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 | 0.8000 | 0.9000 |
| exam_physics_final_p2 | 0.6000 | 0.7333 | 0.6667 | 0.8000 | 0.8000 | 0.8000 | 0.7333 |
| exam_physics_final_p3 | 0.5625 | 0.7500 | 0.3750 | 0.4375 | 0.8125 | 0.8125 | 0.7500 |
| exam_physics_final_p4 | 0.8333 | 0.8333 | 0.2500 | 0.3333 | 0.8333 | 0.9167 | 0.8333 |
| exam_physics_final_p1 | 0.7778 | 1.0000 | 1.0000 | 1.0000 | 0.8889 | 0.8889 | 1.0000 |
| finance_annual2024_p4 | 0.0833 | 1.0000 | 0.1667 | 0.1667 | 1.0000 | 0.9167 | 1.0000 |
| finance_annual2024_p6 | 0.9167 | 1.0000 | 0.5000 | 0.5000 | 1.0000 | 0.6667 | 1.0000 |
| finance_annual2024_p8 | 1.0000 | 1.0000 | 0.4545 | 0.3636 | 1.0000 | 0.7273 | 1.0000 |
| finance_annual2024_p9 | 0.2500 | 1.0000 | 0.3333 | 0.2500 | 1.0000 | 0.9167 | 1.0000 |
| invoice_vat_001_p1 | 0.8095 | 0.9524 | 0.6667 | 0.6667 | 0.9524 | 0.9524 | 0.9524 |
| invoice_vat_001_p2 | 1.0000 | 0.9000 | 0.8000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 |
| invoice_vat_001_p3 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| invoice_vat_001_p4 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.4286 | 0.8571 |
| invoice_vat_001_p5 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| zh_paper_double_column_001_p3 | 0.8750 | 0.7500 | 0.3750 | 0.3750 | 0.6250 | 0.6250 | 0.8750 |
| cn_textbook_formula_002_p12 | 0.3333 | 0.6667 | 0.6667 | 0.6667 | 0.3333 | 0.3333 | 1.0000 |
| finance_table_mixed_003_p8 | 1.0000 | 1.0000 | 0.2857 | 0.2857 | 1.0000 | 1.0000 | 1.0000 |
| slides_ai_course_001_p1 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 |
| slides_ai_course_001_p2 | 0.7500 | 0.7500 | 0.7500 | 0.7500 | 0.9167 | 0.7500 | 0.7500 |
| slides_ai_course_001_p3 | 0.3636 | 0.9091 | 0.3636 | 0.3636 | 0.9091 | 1.0000 | 0.9091 |
| slides_ai_course_001_p4 | 0.9000 | 0.9000 | 0.4000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 |
| slides_ai_course_001_p5 | 0.7273 | 0.9091 | 0.8182 | 0.9091 | 0.8182 | 0.8182 | 0.9091 |
| slides_ai_course_001_p6 | 0.8182 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.8182 | 0.9091 |
| html_table_grid_004_p8 | 1.0000 | 1.0000 | 0.0000 | 0.4000 | 1.0000 | 1.0000 | 1.0000 |
| formula_visual_005_p12 | 0.5000 | 1.0000 | 1.0000 | 1.0000 | 0.5000 | 0.5000 | 1.0000 |
| textbook_physics_v2_p05 | 0.7143 | 0.8571 | 0.7143 | 0.7143 | 0.8571 | 0.8571 | 0.8571 |
| textbook_physics_v2_p10 | 0.9000 | 1.0000 | 0.5000 | 0.5000 | 1.0000 | 1.0000 | 1.0000 |
| textbook_physics_v2_p15 | 0.8750 | 1.0000 | 1.0000 | 0.8750 | 0.8750 | 0.7500 | 1.0000 |
| stage6_batch2_synthetic_p01 | 0.8333 | 0.8333 | 0.1667 | 0.8333 | 1.0000 | 1.0000 | 0.8333 |
| stage6_batch2_synthetic_p02 | 0.6667 | 0.8333 | 0.0000 | 0.6667 | 1.0000 | 1.0000 | 0.8333 |
| stage6_batch2_synthetic_p03 | 0.6667 | 0.6667 | 0.0000 | 0.5000 | 0.8333 | 0.8333 | 0.6667 |
| stage6_batch2_synthetic_p04 | 0.8571 | 0.8571 | 0.0000 | 0.7143 | 1.0000 | 1.0000 | 0.8571 |
| stage6_batch2_synthetic_p05 | 0.5000 | 0.5000 | 0.1667 | 1.0000 | 0.6667 | 0.3333 | 0.5000 |
| stage6_batch2_synthetic_p06 | 0.5000 | 0.6667 | 0.1667 | 1.0000 | 0.8333 | 0.3333 | 0.6667 |
| stage6_batch2_synthetic_p07 | 0.6667 | 0.6667 | 0.1667 | 1.0000 | 0.8333 | 0.5000 | 0.6667 |
| stage6_batch2_synthetic_p08 | 0.1429 | 0.5714 | 0.4286 | 0.1429 | 0.1429 | 0.1429 | 0.8571 |
| stage6_batch2_synthetic_p09 | 0.0000 | 0.5714 | 0.4286 | 0.1429 | 0.7143 | 0.5714 | 0.8571 |
| stage6_batch2_synthetic_p10 | 0.1429 | 0.5714 | 0.7143 | 0.1429 | 0.1429 | 0.1429 | 0.8571 |
| stage6_batch2_synthetic_p11 | 0.4000 | 0.8000 | 0.8000 | 1.0000 | 0.8000 | 0.6000 | 0.8000 |
| stage6_batch2_synthetic_p12 | 0.0000 | 0.6000 | 0.6000 | 0.8000 | 0.8000 | 0.6000 | 0.6000 |
| stage6_batch2_synthetic_p13 | 0.2000 | 0.8000 | 0.6000 | 1.0000 | 0.8000 | 0.6000 | 0.8000 |
| stage6_batch2_synthetic_p14 | 0.6364 | 0.9091 | 0.0909 | 0.4545 | 0.9091 | 0.8182 | 0.9091 |
| stage6_batch2_synthetic_p15 | 0.9000 | 0.9000 | 0.1000 | 0.7000 | 0.9000 | 0.6000 | 0.9000 |