[Feature]: Move benchmark/perf test from test folder to FlyDSL python source #304

@rahulbatra85

Suggestion Description

I suggest moving the common benchmarking and perf-test helpers (https://github.com/ROCm/FlyDSL/blob/main/tests/test_common.py) from the tests folder into the FlyDSL Python source, so that they can be imported directly. For example:

from fx.testing import checkAllclose, run_perftest
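
To make the request concrete, here is a minimal sketch of what such an importable module could contain. The function names come from the import above, but the signatures, defaults, and return values are assumptions for illustration and are not taken from tests/test_common.py. The sketch is pure Python and times on the CPU with a wall clock; a real GPU version would also need to synchronize the device (or use device events) around the timed region.

```python
import math
import time


def run_perftest(fn, *args, num_warmup=2, num_iters=10, **kwargs):
    """Hypothetical helper: run fn repeatedly and report its average latency.

    Returns (result of the last call, average microseconds per call).
    """
    if num_iters < 1:
        raise ValueError("num_iters must be at least 1")
    # Warm-up calls are excluded from timing (caches, lazy initialization, JIT).
    for _ in range(num_warmup):
        fn(*args, **kwargs)
    start = time.perf_counter()
    for _ in range(num_iters):
        result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed / num_iters * 1e6


def checkAllclose(actual, expected, rtol=1e-2, atol=1e-2):
    """Hypothetical helper: return the fraction of elements that do not match.

    Takes two flat sequences of floats; 0.0 means every element is within
    tolerance.
    """
    if len(actual) != len(expected):
        raise ValueError("length mismatch between actual and expected")
    if not actual:
        return 0.0
    mismatched = sum(
        not math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
        for a, e in zip(actual, expected)
    )
    return mismatched / len(actual)


if __name__ == "__main__":
    def scale(xs):
        # Stand-in for a kernel launch.
        return [2.0 * x for x in xs]

    out, avg_us = run_perftest(scale, [1.0, 2.0, 3.0], num_iters=5)
    print(f"average latency: {avg_us:.2f} us")
    print("mismatch ratio:", checkAllclose(out, [2.0, 4.0, 6.0]))
```

With helpers like these shipped in the package, a benchmark script only needs the one-line import shown above instead of reaching into the tests folder.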

Why should this be done

  1. Triton DSL and CuteDSL both provide such helpers as part of their APIs. For Triton, see https://triton-lang.org/main/python-api/triton.testing.html and https://github.com/triton-lang/triton/blob/main/python/triton/testing.py. For CuteDSL, see this example code: https://github.com/NVIDIA/cutlass/blob/main/examples/python/CuTeDSL/notebooks/elementwise_add.ipynb
  2. Most kernels come with benchmark code. For example, the Triton kernels in AITER have a whole suite of benchmarks: https://github.com/ROCm/aiter/tree/main/op_tests/op_benchmarks/triton. They all use the common benchmarking code provided by Triton DSL. Some examples:
    https://github.com/ROCm/aiter/blob/main/op_tests/op_benchmarks/triton/bench_gemm_afp4wfp4.py#L51
    https://github.com/ROCm/aiter/blob/main/op_tests/op_benchmarks/triton/bench_batch_prefill.py#L158

As we develop more FlyDSL kernels, I expect a similar benchmarking suite to exist for them. A common benchmarking API provided by FlyDSL itself would be useful there.

Operating System

No response

GPU

No response

ROCm Component

No response
