
[tune][xegpu] infrastructure for tuning, applied to XeGPU matmul example#63

Open
rolfmorel wants to merge 6 commits intomainfrom
users/rolfmorel/lh-xegpu-autotuning

Conversation

@rolfmorel
Contributor

@rolfmorel rolfmorel commented Mar 5, 2026

Adds the ability to extract search spaces from schedules with knobs.

Adds transform-interpreter semantics to the constrain_params op: the SMT ops in its region are interpreted so that constraints on int params can be checked, and new int params computed, at interpreter time.

Modifies the schedule for the XeGPU matmul (& mlp) example to embed its tuning problem.

lighthouse.tune implements:

  • trace-ing of transform schedules and the SMT regions they include:
    • yields an AST of Nodes, each evaluable w.r.t. an environment, with leaves such as
      • Constant, which represents just a constant in the IR and evaluates to a constant int, and
      • Knob, the representative of a transform.tune.knob, which takes its value from the environment while knowing what its possible values are,
      while
      • Apply depends on other Nodes, as it models Values produced by ops that depend on other values, and
      • Predicate models the condition/constraint, as a boolean-valued function on Nodes, that must hold for execution to proceed past a particular op.
  • rewrite-ing of transform schedules, solely through setting the selected attr on knob ops and the selected_region attr on alternatives ops.
  • enumerate-ing all valid assignments for knob and alternatives tuneables.
  • a __main__ that takes a .mlir schedule, derives all valid knob configurations, and outputs the corresponding concrete schedules interpretable by the transform interpreter.
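The trace/enumerate pipeline above can be sketched in miniature. This is a hypothetical toy model, not the lighthouse.tune API: the class names follow the PR description, but the fields and the `enumerate_valid` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable

class Node:
    """Base of the traced-expression AST; evaluable w.r.t. an environment."""
    def eval(self, env: dict) -> int:
        raise NotImplementedError

@dataclass
class Constant(Node):
    value: int
    def eval(self, env):
        return self.value  # just a constant in the IR

@dataclass
class Knob(Node):
    name: str
    options: tuple  # the values this transform.tune.knob may take
    def eval(self, env):
        return env[self.name]  # value is supplied by the environment

@dataclass
class Apply(Node):
    fn: Callable
    operands: tuple  # depends on other Nodes
    def eval(self, env):
        return self.fn(*(op.eval(env) for op in self.operands))

@dataclass
class Predicate:
    """A boolean-valued function on Nodes that must hold to proceed."""
    fn: Callable
    operands: tuple
    def holds(self, env) -> bool:
        return bool(self.fn(*(op.eval(env) for op in self.operands)))

def enumerate_valid(knobs, predicates):
    """Yield every knob assignment satisfying all predicates."""
    names = [k.name for k in knobs]
    for choice in product(*(k.options for k in knobs)):
        env = dict(zip(names, choice))
        if all(p.holds(env) for p in predicates):
            yield env

# Example: a wg_m knob constrained to divide a problem size m = 256.
wg_m = Knob("wg_m", (64, 96, 128, 256))
m = Constant(256)
divides = Predicate(lambda a, b: b % a == 0, (wg_m, m))
configs = list(enumerate_valid([wg_m], [divides]))  # 96 is filtered out
```

Exhaustive enumeration like this is feasible only while the cross product of knob options stays small; the predicates prune invalid points before any schedule is emitted.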

Inside lighthouse.dialects, add (extension) dialects:

  • smt_ext: a wrapper for ir.Values of !smt.int type so we can support Python operations on them (e.g. addition), also with plain integers.
  • transform_tune_ext: a wrapper for ir.Values produced directly by transform.tune.knob ops so we can do Python operations on them, in particular adding constraints, plus a camel_case knob() -> KnobValue helper.
  • transform_smt_ext: a version of transform.smt.constrain_params that has transform-interpreter semantics: by tracing the body, containing smt ops, we obtain a function we can apply to the transform.params that were passed as arguments to constrain_params. That is, this version of constrain_params has a proper TransformOpInterface implementation.
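The wrapper idea behind smt_ext can be illustrated without MLIR at all. Below is a hypothetical stand-in: the real wrapper holds ir.Values of !smt.int type and builds smt ops, whereas this sketch builds expression strings; the class and method names are invented for illustration.

```python
class SmtInt:
    """Toy stand-in for an !smt.int wrapper: Python arithmetic builds
    symbolic expressions, and plain ints are promoted automatically."""

    def __init__(self, expr: str):
        self.expr = expr  # string placeholder for the wrapped ir.Value

    @staticmethod
    def _lift(other):
        # Accept either another wrapper or a plain Python int.
        return other if isinstance(other, SmtInt) else SmtInt(str(other))

    def __add__(self, other):
        return SmtInt(f"(+ {self.expr} {SmtInt._lift(other).expr})")

    __radd__ = __add__  # so `3 + x` works as well as `x + 3`

    def __mod__(self, other):
        return SmtInt(f"(mod {self.expr} {SmtInt._lift(other).expr})")

    def __eq__(self, other):
        # Comparisons yield constraints rather than Python booleans.
        return f"(= {self.expr} {SmtInt._lift(other).expr})"

x = SmtInt("wg_m")
constraint = (x + 32) % 64 == 0  # a divisibility constraint on wg_m
```

The payoff is that constraint-building code in the schedule reads as ordinary Python arithmetic while actually constructing SMT terms.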

Comment on lines +69 to +73
assert 64 <= wg_m <= 256 and m % wg_m == 0 and wg_m % DPAS.M == 0
assert 64 <= wg_n <= 256 and n % wg_n == 0 and wg_n % DPAS.N == 0
assert 32 <= sg_m <= 128 and m % sg_m == 0 and sg_m % DPAS.M == 0
assert 32 <= sg_n <= 128 and n % sg_n == 0 and sg_n % DPAS.N == 0
assert 16 <= k_tile <= 50 and k % k_tile == 0 and k_tile % DPAS.K == 0
Contributor


This is very nice! We do however need to have some mechanism to change the search space bounds from outside. E.g., sometimes you want to autotune with a larger search space. Maybe the bounds could be set in the constructor of the abstract schedule (i.e. one without concrete chosen param values)?
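One way to realize the reviewer's suggestion is sketched below. This is not the PR's API but a hypothetical shape: the abstract schedule takes its search-space bounds as constructor arguments, defaulting to the values hard-coded in the asserts quoted above, so a caller can widen the space from outside.

```python
from dataclasses import dataclass

@dataclass
class MatmulScheduleBounds:
    """Hypothetical container for the tunable-parameter bounds; the
    defaults mirror the hard-coded asserts in the quoted diff."""
    wg_range: tuple = (64, 256)     # bounds on wg_m / wg_n
    sg_range: tuple = (32, 128)     # bounds on sg_m / sg_n
    k_tile_range: tuple = (16, 50)  # bounds on k_tile

    def check(self, wg_m, sg_m, k_tile, m, k, dpas_m, dpas_k):
        lo, hi = self.wg_range
        assert lo <= wg_m <= hi and m % wg_m == 0 and wg_m % dpas_m == 0
        lo, hi = self.sg_range
        assert lo <= sg_m <= hi and m % sg_m == 0 and sg_m % dpas_m == 0
        lo, hi = self.k_tile_range
        assert lo <= k_tile <= hi and k % k_tile == 0 and k_tile % dpas_k == 0

# Default bounds, or widened ones for a larger search space:
default_bounds = MatmulScheduleBounds()
wide_bounds = MatmulScheduleBounds(wg_range=(32, 512))
```

The divisibility constraints stay fixed (they come from the hardware's DPAS shape and the problem size), while only the range bounds become user-adjustable.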

@rolfmorel rolfmorel marked this pull request as ready for review March 12, 2026 18:22
