benchmark_compiled report

Run

run_gcs_dir=gs://$GROUPING_TRAINER_BUCKET/runs/2026-04-10-12-39-45-large-no-prefix
df_path=final_csvs/test_full2.csv
stamp=2026-04-24-23-44-59
sample_size=66753
text_prefix=''
model_kwargs={'dtype': torch.bfloat16, 'attn_implementation': 'sdpa'}
  • Token bucket boundaries used for analysis: (64, 128, 256, 512, 1024)
  • Rows: 66,753
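
The bucket boundaries above can be mapped to the labels used in the Per-bucket table with a small lookup. This is a minimal sketch, not the benchmark's actual code: the helper name `bucket_label` is hypothetical, and it assumes upper bounds are inclusive, as the labels `<=64` and `65-128` suggest.

```python
from bisect import bisect_left

# Boundaries copied from the report; treated as inclusive upper bounds (assumption).
BOUNDARIES = (64, 128, 256, 512, 1024)


def bucket_label(num_tokens: int) -> str:
    """Return the Per-bucket label for a token count (hypothetical helper)."""
    # bisect_left returns the index of the first boundary >= num_tokens.
    i = bisect_left(BOUNDARIES, num_tokens)
    if i == 0:
        return f"<={BOUNDARIES[0]}"
    if i == len(BOUNDARIES):
        return f">{BOUNDARIES[-1]}"
    return f"{BOUNDARIES[i - 1] + 1}-{BOUNDARIES[i]}"


if __name__ == "__main__":
    for n in (35, 64, 65, 1024, 1494):
        print(n, bucket_label(n))
```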

Headline

  • Median compiled: 14.7 ms
  • Median base: 36.8 ms
  • Per-row speedup p10/p50/p90: 1.01x / 2.51x / 3.45x
  • Compiled wins on 92.6% of rows
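
The headline figures could be derived from per-row timings roughly as follows. This is a sketch under stated assumptions: the function name `headline` is hypothetical, speedup is taken as `base_ms / compiled_ms` (consistent with the worst-rows table), a "win" is taken to mean speedup above 1.0, and the percentile interpolation used by the report is unknown, so p10 and p90 may differ slightly from the reported values.

```python
from statistics import median, quantiles


def headline(compiled_ms, base_ms):
    """Summarize paired per-row latencies (hypothetical helper)."""
    # Per-row speedup: how many times faster the compiled run is than the base run.
    speedups = [b / c for c, b in zip(compiled_ms, base_ms)]
    # Nine cut points splitting the data into ten equal-probability groups.
    deciles = quantiles(speedups, n=10, method="inclusive")
    return {
        "median_compiled_ms": median(compiled_ms),
        "median_base_ms": median(base_ms),
        "speedup_p10": deciles[0],
        "speedup_p50": median(speedups),
        "speedup_p90": deciles[8],
        # Assumption: a row counts as a win when speedup is strictly above 1.0.
        "compiled_win_rate": sum(s > 1.0 for s in speedups) / len(speedups),
    }


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the benchmark.
    print(headline([10.0, 20.0, 40.0], [30.0, 30.0, 30.0]))
```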

Per-bucket

bucket     n      tok_p50  compiled_ms_p50  base_ms_p50  compiled_ms_p90  base_ms_p90  speedup_p50
<=64       8301   35.0     10.58            35.17        11.72            35.99        3.32
65-128     7628   93.0     10.68            35.4         11.65            36.21        3.31
129-256    13930  194.0    11.62            36.05        12.85            37.02        3.1
257-512    14804  369.0    15.54            36.97        17.3             38.1         2.38
513-1024   13858  682.0    26.9             38.86        28.97            40.85        1.44
>1024      8232   1494.5   48.06            47.04        115.66           115.89       0.98

Worst 5 rows for compiled

num_tokens  compiled_ms  base_ms  speedup
1252        54.8         43.26    0.789
1043        50.24        42.14    0.839
2020        82.6         70.98    0.859
1122        49.67        42.91    0.864
2040        79.23        69.04    0.871
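
The speedup column in this table appears to be `base_ms / compiled_ms`, so values below 1.0 mean the compiled run was slower. The sketch below recomputes it from the five rows shown; the name `recompute_speedups` is hypothetical, and small differences can arise because the latencies are already rounded to two decimals.

```python
# Rows copied from the table above: (num_tokens, compiled_ms, base_ms, reported_speedup).
WORST_ROWS = [
    (1252, 54.80, 43.26, 0.789),
    (1043, 50.24, 42.14, 0.839),
    (2020, 82.60, 70.98, 0.859),
    (1122, 49.67, 42.91, 0.864),
    (2040, 79.23, 69.04, 0.871),
]


def recompute_speedups(rows):
    """Return (num_tokens, recomputed_speedup, reported_speedup) per row."""
    # Assumption: speedup is defined as base latency divided by compiled latency.
    return [(n, base / compiled, reported) for n, compiled, base, reported in rows]


if __name__ == "__main__":
    for n, recomputed, reported in recompute_speedups(WORST_ROWS):
        print(f"{n:>5}  recomputed={recomputed:.3f}  reported={reported:.3f}")
```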