
[Benchmark] Benchmark refactor example #1199

Open
lowdy1 wants to merge 4 commits into linkedin:main from lowdy1:bmk_eg

Conversation


@lowdy1 lowdy1 commented Apr 24, 2026

Summary

The current benchmark scripts contain significant boilerplate when constructing common_configs for run_benchmarks. Although compute_model_config_sweep_config and compute_seq_len_sweep_config provide the core sweep logic, each script still:

  • Reimplements probe logic
  • Manually builds extra_benchmark_config
  • Duplicates common_configs assembly
  • Defines redundant helpers like _resolve_* and *_model_config variants

This PR removes that duplication by introducing higher-level builders that standardize how benchmarks are defined.


Proposal

1. Introduce higher-level sweep builders

Add two unified helper functions in benchmark_model_configs.py:

build_model_config_sweep(...)
build_token_length_sweep(...)

These functions:

  • Wrap existing sweep utilities (compute_*_sweep_config)

  • Internally handle memory probing via setup_fn + forward_fn

  • Automatically construct extra_benchmark_config from:

    • model_keys (dynamic model attributes)
    • extra_configs (static overrides)
  • Return a fully-formed common_configs dict

So benchmark scripts reduce to:

common_configs = build_*(...)
run_benchmarks(**common_configs)
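A minimal sketch of what such a builder assembles, assuming a simplified shape for the returned dict (the real builders in benchmark_model_configs.py handle probing and sweep logic on top of this; all names here are illustrative):

```python
# Hypothetical, simplified sketch of a sweep builder's output assembly.
# The real build_* helpers also run memory probing; here we only show
# how model attributes and static overrides merge into one config dict.
def build_sweep_sketch(kernel_name, x_values, model_attrs, extra_configs=None):
    # Static extra_configs override dynamic model attributes on key clash.
    extra = {**model_attrs, **(extra_configs or {})}
    return {
        "kernel_name": kernel_name,
        "x_name": "T",
        "x_values": x_values,
        "extra_benchmark_config": extra,
    }

common_configs = build_sweep_sketch(
    "layer_norm",
    x_values=[1024, 2048, 4096],
    model_attrs={"hidden_size": 4096, "dtype": "bf16"},
    extra_configs={"eps": 1e-6},
)
```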

2. Standardize kernel definition via setup_fn

Instead of manually writing probe_fn, all kernels now define:

setup_fn: SingleBenchmarkRunInput -> Tuple[Any, ...]
forward_fn: Tuple[Any, ...] -> torch.Tensor  (optional)

The builders handle:

setup_out = setup_fn(input)
output = forward_fn(*setup_out)

A default is provided:

forward_fn = lambda x, layer: layer(x)

This removes duplicated forward/probe logic across scripts.
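The setup_fn/forward_fn contract above can be sketched as follows; `run_probe` and the plain-callable stand-ins for `SingleBenchmarkRunInput` and the layer are illustrative assumptions, not the PR's actual implementation:

```python
from typing import Any, Callable, Tuple

# Default forward: treat setup output as (input tensor, layer) and call it.
def default_forward_fn(x, layer):
    return layer(x)

# Illustrative builder-side probe: run setup, then forward on its outputs.
def run_probe(setup_fn: Callable[[Any], Tuple[Any, ...]],
              forward_fn: Callable[..., Any] = default_forward_fn,
              run_input: Any = None):
    setup_out = setup_fn(run_input)
    return forward_fn(*setup_out)

# Example: a trivial "layer" that doubles its input.
out = run_probe(lambda inp: (3, lambda x: x * 2))
```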


3. Eliminate redundant helpers

The following patterns are removed across benchmark scripts:

  • probe_fn definitions
  • extra_config_fn
  • _resolve_model_config_*
  • bench_*_model_config variants

All are now handled centrally by the builders.


New APIs

build_model_config_sweep

  • Sweeps across model configurations (x-axis = model name)
  • Keeps total tokens (B * T) approximately constant
  • Uses setup_fn + forward_fn to estimate memory per model
build_model_config_sweep(
    kernel_name,
    all_model_configs,
    setup_fn,
    model_keys,
    forward_fn=...,
    probe_provider="torch",
    extra_configs=None,
    bt=2048,
    overwrite=False,
)

build_token_length_sweep

  • Sweeps across sequence length (x-axis = T)
  • Automatically adjusts batch size based on memory estimation
  • Uses the same setup_fn + forward_fn abstraction
build_token_length_sweep(
    kernel_name,
    probe_seq_len,
    model,
    setup_fn,
    model_keys,
    extra_configs=None,
    forward_fn=...,
    probe_provider="torch",
    x_values_fn=...,
    overwrite=False,
)

Example (after refactor)

common_configs = build_token_length_sweep(
    kernel_name="layer_norm",
    probe_seq_len=1024,
    model=model,
    setup_fn=_setup_layer_norm,
    model_keys=["hidden_size", "dtype"],
    extra_configs={"eps": 1e-6},
    probe_provider="huggingface",
)

common_configs["kernel_providers"] = ["liger", "huggingface"]

run_benchmarks(..., **common_configs)

Benchmark Command Examples

python ./benchmark/scripts/benchmark_swiglu.py --sweep-mode model_config [--model llama_3_8b]
python ./benchmark/scripts/benchmark_swiglu.py [--sweep-mode token_length] [--bt 2048]

Notes

  • model_config: sweeps across different model configurations (fixed total tokens)
  • token_length: sweeps across sequence lengths / batch sizes (fixed model); this is the default and can be omitted
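The --sweep-mode dispatch implied by the command examples can be sketched with argparse; the flag names and defaults follow the commands and notes above, everything else is illustrative:

```python
import argparse

# Sketch of the CLI surface shown in the benchmark command examples.
# token_length is the default sweep mode and may be omitted.
parser = argparse.ArgumentParser()
parser.add_argument("--sweep-mode", choices=["model_config", "token_length"],
                    default="token_length")
parser.add_argument("--bt", type=int, default=2048,
                    help="total tokens (B * T) kept approximately constant")

args = parser.parse_args([])  # no flags: defaults to the token_length sweep
```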

  • Hardware Type: A100-80G-PCIe
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Comment thread benchmark/scripts/benchmark_swiglu.py Outdated
Comment on lines +130 to +132
def x_values_fn(config):
    return [2**i for i in range(10, int(math.log2(config.seq_len)) + 1)]

Collaborator

Should we put it in build_token_length_sweep? I feel we can set this function as default x range.
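The default x-value range the reviewer suggests folding into build_token_length_sweep could look like the following (a sketch of the snippet under discussion, taking a bare seq_len rather than a config object):

```python
import math

# Powers of two from 1024 (2**10) up to and including seq_len,
# mirroring the x_values_fn in the snippet above.
def default_x_values(seq_len):
    return [2**i for i in range(10, int(math.log2(seq_len)) + 1)]

xs = default_x_values(4096)  # -> [1024, 2048, 4096]
```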

Comment thread benchmark/scripts/benchmark_swiglu.py Outdated
Comment on lines +121 to +128
def extra_config_fn(config):
    return {
        "bsz": config.batch_size,
        "hidden_size": model.hidden_size,
        "intermediate_size": model.intermediate_size,
        "hidden_act": "silu",
        "dtype": model.dtype,
    }
Collaborator

How about we pass a key list to build_token_length_sweep and let it query those keys from model configs?
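The key-list approach the reviewer proposes could be sketched like this; the ModelConfig fields and `extract_model_keys` helper are hypothetical stand-ins for whatever the builder would actually query:

```python
from dataclasses import dataclass

# Illustrative model config; real configs carry more fields.
@dataclass
class ModelConfig:
    hidden_size: int
    intermediate_size: int
    dtype: str

# The builder would pull only the requested attributes off the model config,
# replacing per-script extra_config_fn definitions.
def extract_model_keys(model, model_keys):
    return {k: getattr(model, k) for k in model_keys}

cfg = ModelConfig(hidden_size=4096, intermediate_size=11008, dtype="bf16")
extra = extract_model_keys(cfg, ["hidden_size", "dtype"])
```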

Comment thread benchmark/scripts/benchmark_swiglu.py Outdated
Comment on lines 106 to 119
def _probe():
    x, layer = _setup_swiglu(probe_input)
    return layer(x)
Collaborator

Same idea, add arguments probe_length/provider, and put probe_fn in build_token_length_sweep

Comment thread benchmark/scripts/benchmark_swiglu.py Outdated
"model_configs": model_configs_info,
"bsz": sweep.batch_size,
"seq_len": sweep.seq_len,
def probe_fn(model_cfg, probe_seq_len):
Collaborator

Ditto. Could it merge into build_model_config_sweep?

@lowdy1 lowdy1 force-pushed the bmk_eg branch 3 times, most recently from 5648ac6 to f7e3e18 Compare April 25, 2026 10:00
@lowdy1 lowdy1 force-pushed the bmk_eg branch 7 times, most recently from ef7d096 to 83c76fc Compare April 28, 2026 07:02