Comparison Report

Overview

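| Metric | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| score | 0.6573 | 0.8027 | 0.4703 | 0.5757 | 0.7745 | 0.6899 | 0.8160 |
| passed | 443 | 541 | 317 | 388 | 522 | 465 | 550 |
| failed | 231 | 133 | 357 | 286 | 152 | 209 | 124 |
| assertion_count | 674 | 674 | 674 | 674 | 674 | 674 | 674 |
| case_count | 74 | 74 | 74 | 74 | 74 | 74 | 74 |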
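
The score row is consistent with passed divided by assertion_count, rounded to four decimals. The report does not state this formula; it is inferred from the figures above. A minimal Python sketch of the check, with all values copied from the Overview table and `pass_rate` as a hypothetical helper name:

```python
# Figures copied from the Overview table.
ASSERTION_COUNT = 674

passed = {
    "qwen": 443, "plain": 541, "paddleocr": 317, "mineru": 388,
    "marker": 522, "docling": 465, "bbox": 550,
}
reported_score = {
    "qwen": 0.6573, "plain": 0.8027, "paddleocr": 0.4703, "mineru": 0.5757,
    "marker": 0.7745, "docling": 0.6899, "bbox": 0.8160,
}


def pass_rate(passed_count, total):
    """Assumed scoring rule: fraction of assertions passed, 4 decimals."""
    return round(passed_count / total, 4)


computed_score = {name: pass_rate(n, ASSERTION_COUNT) for name, n in passed.items()}

for name in passed:
    print(f"{name:10s} computed={computed_score[name]:.4f} "
          f"reported={reported_score[name]:.4f}")
```

Under this reading the overall score weights every assertion equally, so cases with more assertions influence it more than cases with few.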

Failures by Type

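| Type | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| caption_binding | 1 | 1 | 1 | 1 | 1 | 3 | 1 |
| element_grounded | 9 | 9 | 9 | 9 | 9 | 9 | 0 |
| formula_contains | 21 | 9 | 11 | 16 | 11 | 19 | 9 |
| formula_visual | 1 | 0 | 0 | 0 | 1 | 1 | 0 |
| reading_order | 24 | 16 | 41 | 20 | 18 | 30 | 16 |
| regex_absence | 31 | 42 | 36 | 25 | 28 | 24 | 42 |
| regex_match | 7 | 2 | 3 | 4 | 3 | 4 | 2 |
| table_cell_exists | 89 | 11 | 151 | 151 | 45 | 69 | 11 |
| table_grid_cell | 1 | 1 | 44 | 7 | 1 | 2 | 1 |
| table_shape | 5 | 1 | 11 | 11 | 1 | 1 | 1 |
| text_absence | 2 | 4 | 3 | 1 | 1 | 1 | 4 |
| text_presence | 40 | 37 | 47 | 41 | 33 | 46 | 37 |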
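
As a consistency check, the per-type counts in each column should add up to that system's failed total in the Overview. The sketch below copies the figures from the two tables; `column_totals` is a hypothetical helper name, and the assumption that each failed assertion is counted under exactly one type is inferred from the numbers rather than stated in the report:

```python
# Column order used throughout the report.
SYSTEMS = ["qwen", "plain", "paddleocr", "mineru", "marker", "docling", "bbox"]

# Failure counts per assertion type, one value per system in SYSTEMS order.
FAILURES_BY_TYPE = {
    "caption_binding":   [1, 1, 1, 1, 1, 3, 1],
    "element_grounded":  [9, 9, 9, 9, 9, 9, 0],
    "formula_contains":  [21, 9, 11, 16, 11, 19, 9],
    "formula_visual":    [1, 0, 0, 0, 1, 1, 0],
    "reading_order":     [24, 16, 41, 20, 18, 30, 16],
    "regex_absence":     [31, 42, 36, 25, 28, 24, 42],
    "regex_match":       [7, 2, 3, 4, 3, 4, 2],
    "table_cell_exists": [89, 11, 151, 151, 45, 69, 11],
    "table_grid_cell":   [1, 1, 44, 7, 1, 2, 1],
    "table_shape":       [5, 1, 11, 11, 1, 1, 1],
    "text_absence":      [2, 4, 3, 1, 1, 1, 4],
    "text_presence":     [40, 37, 47, 41, 33, 46, 37],
}

# "failed" row from the Overview table.
REPORTED_FAILED = [231, 133, 357, 286, 152, 209, 124]


def column_totals(rows):
    """Sum each system's column across all failure types."""
    return [sum(column) for column in zip(*rows.values())]


totals = column_totals(FAILURES_BY_TYPE)

for system, total, reported in zip(SYSTEMS, totals, REPORTED_FAILED):
    status = "ok" if total == reported else "MISMATCH"
    print(f"{system:10s} sum_by_type={total:4d} failed={reported:4d} {status}")
```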

Per-Case Scores

| Case | qwen | plain | paddleocr | mineru | marker | docling | bbox |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| arxiv_mulco_p4 | 0.3750 | 0.4375 | 0.4375 | 0.3125 | 0.8125 | 0.3125 | 0.4375 |
| arxiv_mulco_p7 | 0.6923 | 1.0000 | 0.3077 | 0.2308 | 1.0000 | 1.0000 | 1.0000 |
| arxiv_cluener_p2 | 0.7778 | 0.7778 | 0.4444 | 0.5556 | 0.8889 | 0.8889 | 0.7778 |
| arxiv_cluener_p4 | 0.8750 | 0.7500 | 0.2500 | 0.2500 | 0.8750 | 0.8750 | 0.7500 |
| contract_service_001_p1 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.9091 |
| contract_service_001_p2 | 0.9167 | 0.9167 | 0.8333 | 0.9167 | 0.9167 | 0.9167 | 0.9167 |
| contract_service_001_p3 | 0.3077 | 0.9231 | 0.3077 | 0.3077 | 0.9231 | 0.7692 | 0.9231 |
| contract_service_001_p4 | 0.9167 | 0.7500 | 0.7500 | 0.8333 | 0.7500 | 0.5833 | 0.7500 |
| contract_service_001_p5 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| contract_service_001_p6 | 0.6923 | 0.8462 | 0.7692 | 0.8462 | 0.8462 | 0.6923 | 0.8462 |
| contract_service_001_p7 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.5556 | 0.8889 |
| contract_service_001_p8 | 1.0000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 | 0.8000 | 0.9000 |
| exam_physics_final_p2 | 0.6000 | 0.7333 | 0.6667 | 0.8000 | 0.8000 | 0.8000 | 0.7333 |
| exam_physics_final_p3 | 0.5625 | 0.7500 | 0.3750 | 0.4375 | 0.8125 | 0.8125 | 0.7500 |
| exam_physics_final_p4 | 0.8333 | 0.8333 | 0.2500 | 0.3333 | 0.8333 | 0.9167 | 0.8333 |
| exam_physics_final_p1 | 0.7778 | 1.0000 | 1.0000 | 1.0000 | 0.8889 | 0.8889 | 1.0000 |
| finance_annual2024_p4 | 0.0833 | 1.0000 | 0.1667 | 0.1667 | 1.0000 | 0.9167 | 1.0000 |
| finance_annual2024_p6 | 0.9167 | 1.0000 | 0.5000 | 0.5000 | 1.0000 | 0.6667 | 1.0000 |
| finance_annual2024_p8 | 1.0000 | 1.0000 | 0.4545 | 0.3636 | 1.0000 | 0.7273 | 1.0000 |
| finance_annual2024_p9 | 0.2500 | 1.0000 | 0.3333 | 0.2500 | 1.0000 | 0.9167 | 1.0000 |
| invoice_vat_001_p1 | 0.8095 | 0.9524 | 0.6667 | 0.6667 | 0.9524 | 0.9524 | 0.9524 |
| invoice_vat_001_p2 | 1.0000 | 0.9000 | 0.8000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 |
| invoice_vat_001_p3 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| invoice_vat_001_p4 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.8571 | 0.4286 | 0.8571 |
| invoice_vat_001_p5 | 1.0000 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 | 0.8889 |
| zh_paper_double_column_001_p3 | 0.8750 | 0.7500 | 0.3750 | 0.3750 | 0.6250 | 0.6250 | 0.8750 |
| cn_textbook_formula_002_p12 | 0.3333 | 0.6667 | 0.6667 | 0.6667 | 0.3333 | 0.3333 | 1.0000 |
| finance_table_mixed_003_p8 | 1.0000 | 1.0000 | 0.2857 | 0.2857 | 1.0000 | 1.0000 | 1.0000 |
| slides_ai_course_001_p1 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 | 0.8750 |
| slides_ai_course_001_p2 | 0.7500 | 0.7500 | 0.7500 | 0.7500 | 0.9167 | 0.7500 | 0.7500 |
| slides_ai_course_001_p3 | 0.3636 | 0.9091 | 0.3636 | 0.3636 | 0.9091 | 1.0000 | 0.9091 |
| slides_ai_course_001_p4 | 0.9000 | 0.9000 | 0.4000 | 0.9000 | 0.9000 | 0.9000 | 0.9000 |
| slides_ai_course_001_p5 | 0.7273 | 0.9091 | 0.8182 | 0.9091 | 0.8182 | 0.8182 | 0.9091 |
| slides_ai_course_001_p6 | 0.8182 | 0.9091 | 0.9091 | 0.9091 | 0.9091 | 0.8182 | 0.9091 |
| html_table_grid_004_p8 | 1.0000 | 1.0000 | 0.0000 | 0.4000 | 1.0000 | 1.0000 | 1.0000 |
| formula_visual_005_p12 | 0.5000 | 1.0000 | 1.0000 | 1.0000 | 0.5000 | 0.5000 | 1.0000 |
| textbook_physics_v2_p05 | 0.7143 | 0.8571 | 0.7143 | 0.7143 | 0.8571 | 0.8571 | 0.8571 |
| textbook_physics_v2_p10 | 0.9000 | 1.0000 | 0.5000 | 0.5000 | 1.0000 | 1.0000 | 1.0000 |
| textbook_physics_v2_p15 | 0.8750 | 1.0000 | 1.0000 | 0.8750 | 0.8750 | 0.7500 | 1.0000 |
| stage6_batch2_synthetic_p01 | 0.8333 | 0.8333 | 0.1667 | 0.8333 | 1.0000 | 1.0000 | 0.8333 |
| stage6_batch2_synthetic_p02 | 0.6667 | 0.8333 | 0.0000 | 0.6667 | 1.0000 | 1.0000 | 0.8333 |
| stage6_batch2_synthetic_p03 | 0.6667 | 0.6667 | 0.0000 | 0.5000 | 0.8333 | 0.8333 | 0.6667 |
| stage6_batch2_synthetic_p04 | 0.8571 | 0.8571 | 0.0000 | 0.7143 | 1.0000 | 1.0000 | 0.8571 |
| stage6_batch2_synthetic_p05 | 0.5000 | 0.5000 | 0.1667 | 1.0000 | 0.6667 | 0.3333 | 0.5000 |
| stage6_batch2_synthetic_p06 | 0.5000 | 0.6667 | 0.1667 | 1.0000 | 0.8333 | 0.3333 | 0.6667 |
| stage6_batch2_synthetic_p07 | 0.6667 | 0.6667 | 0.1667 | 1.0000 | 0.8333 | 0.5000 | 0.6667 |
| stage6_batch2_synthetic_p08 | 0.1429 | 0.5714 | 0.4286 | 0.1429 | 0.1429 | 0.1429 | 0.8571 |
| stage6_batch2_synthetic_p09 | 0.0000 | 0.5714 | 0.4286 | 0.1429 | 0.7143 | 0.5714 | 0.8571 |
| stage6_batch2_synthetic_p10 | 0.1429 | 0.5714 | 0.7143 | 0.1429 | 0.1429 | 0.1429 | 0.8571 |
| stage6_batch2_synthetic_p11 | 0.4000 | 0.8000 | 0.8000 | 1.0000 | 0.8000 | 0.6000 | 0.8000 |
| stage6_batch2_synthetic_p12 | 0.0000 | 0.6000 | 0.6000 | 0.8000 | 0.8000 | 0.6000 | 0.6000 |
| stage6_batch2_synthetic_p13 | 0.2000 | 0.8000 | 0.6000 | 1.0000 | 0.8000 | 0.6000 | 0.8000 |
| stage6_batch2_synthetic_p14 | 0.6364 | 0.9091 | 0.0909 | 0.4545 | 0.9091 | 0.8182 | 0.9091 |
| stage6_batch2_synthetic_p15 | 0.9000 | 0.9000 | 0.1000 | 0.7000 | 0.9000 | 0.6000 | 0.9000 |
| public_real_nist_ai_rmf_p005 | 0.0000 | 0.0000 | 0.6667 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| public_real_nist_ai_rmf_p008 | 1.0000 | 1.0000 | 0.6667 | 1.0000 | 0.6667 | 0.6667 | 1.0000 |
| public_real_nist_ai_rmf_p012 | 1.0000 | 1.0000 | 0.3333 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| public_real_nist_ai_rmf_p017 | 1.0000 | 1.0000 | 0.8000 | 1.0000 | 1.0000 | 0.8000 | 1.0000 |
| public_real_nist_ai_rmf_p019 | 1.0000 | 1.0000 | 0.3333 | 1.0000 | 0.3333 | 0.3333 | 1.0000 |
| public_real_nist_sp800_53r5_p027 | 0.8696 | 0.8696 | 0.0870 | 0.6087 | 0.8696 | 0.8696 | 0.8696 |
| public_real_nist_sp800_53r5_p046 | 1.0000 | 0.6667 | 0.6667 | 0.6667 | 0.6667 | 0.6667 | 0.6667 |
| public_real_nist_sp800_53r5_p087 | 1.0000 | 0.3333 | 0.3333 | 0.6667 | 0.6667 | 0.3333 | 0.3333 |
| public_real_nist_sp800_53r5_p399 | 1.0000 | 0.5000 | 0.5000 | 0.5000 | 1.0000 | 0.5000 | 0.5000 |
| public_real_nist_sp800_53r5_p428 | 0.3333 | 0.0000 | 0.3333 | 0.3333 | 0.3333 | 0.0000 | 0.0000 |
| public_real_irs_1040_2024_p001 | 0.2500 | 0.5625 | 0.0625 | 0.0625 | 0.3125 | 0.2500 | 0.5625 |
| public_real_irs_1040sa_2024_p001 | 0.3529 | 0.5294 | 0.1176 | 0.2941 | 0.4706 | 0.2941 | 0.5294 |
| public_real_irs_1040sc_2024_p001 | 0.2500 | 0.8000 | 0.2500 | 0.2000 | 0.2500 | 0.2500 | 0.8000 |
| public_real_irs_1040sc_2024_p002 | 0.2941 | 0.7059 | 0.2353 | 0.1765 | 0.4118 | 0.2353 | 0.7059 |
| public_real_irs_1040sd_2024_p001 | 0.5882 | 0.5882 | 0.1765 | 0.2353 | 0.4118 | 0.4118 | 0.5882 |
| public_real_irs_1040sd_2024_p002 | 0.4167 | 0.5833 | 0.2500 | 0.3333 | 0.4167 | 0.4167 | 0.5833 |
| public_real_govinfo_cfr_title1_p007 | 1.0000 | 1.0000 | 0.6667 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| public_real_govinfo_cfr_title1_p014 | 0.7143 | 0.7143 | 0.5714 | 0.7143 | 0.7143 | 0.7143 | 0.7143 |
| public_real_govinfo_cfr_title1_p028 | 1.0000 | 1.0000 | 0.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| public_real_govinfo_cfr_title1_p035 | 0.5714 | 0.5714 | 0.4286 | 0.4286 | 0.4286 | 0.5714 | 0.5714 |