[Perf] Add ti.perf_dispatch by hughperkins · Pull Request #356 · Genesis-Embodied-AI/gstaichi

hughperkins · 2026-01-27T09:16:33Z

Issue: #

Brief Summary

copilot:summary

Walkthrough

copilot:walkthrough

erizmr · 2026-02-01T23:44:14Z

tests/python/test_perf_dispatch.py

+        else:
+            assert c[ImplEnum.a_shape0_ge2] == 1
+            assert c[ImplEnum.a_shape0_lt2] == 0
+            assert c[ImplEnum.a_shape0_ge2] == 0


Looks like a typo here: assert c[ImplEnum.a_shape0_ge2] == 0 contradicts assert c[ImplEnum.a_shape0_ge2] == 1. Should it be c[ImplEnum.serial] == 0? After warming up, we only want to run the fastest one for the given geometry hash.

This is a very good spot. It also led me to notice this bit of code wasn't being called, and my loop was too short. And when I made it longer, I noticed this PR assumed idempotent kernels. 😅 Anyway, fixed now, in fec2372
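To illustrate the idempotency issue mentioned above: if a dispatcher times several candidate implementations by running all of them within one user-visible call, a kernel that updates its input in place gets applied more than once. The following is a minimal sketch in plain Python, not gstaichi code; the names `scale_in_place`, `call_all_candidates` and `call_one_candidate` are hypothetical.

```python
def scale_in_place(a):
    # Not idempotent: every call changes the data again.
    for i in range(len(a)):
        a[i] *= 2


def call_all_candidates(candidates, a):
    # Naive warmup: every candidate runs on the same arguments in one call.
    for impl in candidates:
        impl(a)


def call_one_candidate(candidates, call_index, a):
    # Safer warmup: exactly one candidate runs per user-visible call.
    candidates[call_index % len(candidates)](a)


# Two "implementations" of the same operation (identical here for brevity).
candidates = [scale_in_place, scale_in_place]

naive = [1, 2, 3]
call_all_candidates(candidates, naive)    # data is doubled twice

safe = [1, 2, 3]
call_one_candidate(candidates, 0, safe)   # data is doubled once
```

With the naive scheme the caller sees `[4, 8, 12]` instead of the expected `[2, 4, 6]`, which is why the scheduling has to run a single implementation per call.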

python/gstaichi/lang/_perf_dispatch.py

erizmr

Thanks! Some comments.

hughperkins · 2026-02-05T09:51:04Z

~~oh, hmmm, this is broken, because our functions are not idempotent. Will need to fix the scheduling.~~

Fixed

duburcqa · 2026-02-05T14:32:31Z

python/gstaichi/lang/_perf_dispatch.py

+        least_trials_idx = None
+        least_trials = None
+        for underlying_idx in compatible.keys():
+            trial_count = self._trial_count_by_underlying_idx_by_geometry_hash[geometry_hash].get(underlying_idx, 0)


Same, self._trial_count_by_underlying_idx_by_geometry_hash[geometry_hash] should be a list (initialised with zeros).

Oh I disagree fairly strongly here, because then if the list mutates in some way, we retrieve the wrong count.

> Oh I disagree fairly strongly here, because then if the list mutates in some way, we retrieve the wrong count.

You don't. When you register a new function you are supposed to extend this list, of course. Same if you delete some (which is not supported, but still).

the trial counts, if they are a list, could become out of sync. strongly prefer to make them a dict, indexed by kernel index.

> the trial counts, if they are a list, could become out of sync.

Why? I don't see any reason for this. You have a register method that is the only entry point for users. It is easy to make sure that everything is in sync.
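For readers following the list-versus-dict exchange, here is a minimal sketch of the dict-based bookkeeping in plain Python. It is not the PR's code: `trial_counts`, `record_trial` and `pick_least_tried` are hypothetical names. A list indexed by registration order would behave the same way, provided it is extended whenever an implementation is registered.

```python
from collections import defaultdict

# geometry_hash -> {implementation index: number of timed trials so far}
trial_counts = defaultdict(dict)


def record_trial(geometry_hash, idx):
    counts = trial_counts[geometry_hash]
    counts[idx] = counts.get(idx, 0) + 1


def pick_least_tried(geometry_hash, compatible_indices):
    """Return the compatible implementation index with the fewest trials."""
    counts = trial_counts[geometry_hash]
    # Missing keys count as zero trials, so nothing has to be pre-sized.
    return min(compatible_indices, key=lambda idx: counts.get(idx, 0))


record_trial("g0", 0)
record_trial("g0", 0)
record_trial("g0", 2)
choice = pick_least_tried("g0", [0, 2, 5])  # index 5 has no trials yet
```

Because `min` returns the first minimal element, ties are broken by the order of `compatible_indices`.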

wouldn't the kernel have to be hashable to use it as a key? 🤔

Yes. Still, any native Python function or custom class instance can be used as a dict key:

    def fun(): pass
    class Class: pass
    cls = Class()
    d = {fun: 2, cls: 1}
    assert d[fun] == 2 * d[cls]

ok. how does this work? using the id? hashing the object? something else?

Under the hood, dict uses the output of hash() to place a new item in its hash table, then an equality comparison on lookup to make sure it has retrieved the expected object. The default hash for Python objects is (id >> 4) | ((id << 60) & 0xffffffffffffffff) (see https://stackoverflow.com/a/38519187/4820605).

ok, so it's hashing on id, which I explicitly would like to avoid please, since multiple objects can have identical ids, over time.
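The key-hashing behaviour discussed above can be sketched as follows. By default a class instance hashes and compares by identity, so two distinct instances are two distinct keys; defining `__hash__` and `__eq__` makes the key depend on an explicit value instead. The classes `Plain` and `ByName` are hypothetical illustrations, not part of the PR, and the exact relation between the default hash and `id` is a CPython implementation detail.

```python
class Plain:
    pass


class ByName:
    """Hashes and compares on an explicit, stable key instead of identity."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, ByName) and self.name == other.name


def fun():
    pass


p1, p2 = Plain(), Plain()
d = {fun: 2, p1: 1}

# Default behaviour is identity-based: a second instance is a different key.
found_p2 = p2 in d

# Explicit key: two distinct objects with the same name are the same dict key.
e = {ByName("serial"): 10}
found_equal = ByName("serial") in e
```

Keying on an integer index or on an explicit name, as in `ByName`, avoids relying on object identity at all.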

duburcqa · 2026-02-06T06:19:56Z

tests/python/test_perf_dispatch.py

+            assert c[ImplEnum.a_shape0_ge2] == 1
+            assert c[ImplEnum.a_shape0_lt2] == 0


I would recommend making my_func1_impl_a_shape0_lt_2 a better (faster) candidate than my_func1_impl_a_shape0_ge_2 if it were compatible, just to make sure the compatibility filter is doing its job.

yeah, ok. migrating to time.sleep will make this both easier, and more obvious.

Actually no, it is not addressed. I said to make my_func1_impl_a_shape0_lt_2 a better candidate (faster) than my_func1_impl_a_shape0_ge_2 if it was compatible.

addressed in affd537
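The test design requested in this thread can be sketched in plain Python: make the incompatible implementation the cheapest one, so that only the compatibility filter can stop it from being selected. The function `pick_fastest` and the cost values are hypothetical; only the implementation names are borrowed from the test's ImplEnum.

```python
def pick_fastest(impls, shape0):
    """impls: list of (name, is_compatible, cost). Return the fastest compatible name."""
    compatible = [
        (cost, name) for name, is_compatible, cost in impls if is_compatible(shape0)
    ]
    if not compatible:
        raise ValueError("no compatible implementation")
    return min(compatible)[1]


impls = [
    # Deliberately the cheapest, so only the filter can keep it from winning.
    ("a_shape0_lt2", lambda n: n < 2, 1.0),
    ("a_shape0_ge2", lambda n: n >= 2, 5.0),
    ("serial", lambda n: True, 9.0),
]

chosen_large = pick_fastest(impls, 3)  # lt2 is filtered out despite being cheapest
chosen_small = pick_fastest(impls, 1)  # lt2 is compatible and wins
```

If the filter were skipped, `a_shape0_lt2` would win for every shape, so a test built this way fails loudly when the filter is broken.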

duburcqa · 2026-02-07T02:16:48Z

tests/python/test_perf_dispatch.py

+        for b in range(B):
+            a[b] = a[b] * b
+            c[ImplEnum.serial] = 1
+            time.sleep(0.05)


time.sleep inside kernels is supported?!

interesting question. Let me migrate to use a linear congruential generator.

addressed in affd537
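A linear congruential generator gives a kernel a tunable amount of pure arithmetic work, which replaces sleeping as a way to make one variant measurably slower. The sketch below is plain Python rather than kernel code, and the constants are the widely used Numerical Recipes values; whether the PR uses these constants is an assumption, and `lcg_spin` is a hypothetical name.

```python
LCG_A = 1664525      # multiplier
LCG_C = 1013904223   # increment
LCG_M = 2**32        # modulus


def lcg_spin(seed, iterations):
    """Advance a linear congruential generator `iterations` times."""
    x = seed
    for _ in range(iterations):
        x = (LCG_A * x + LCG_C) % LCG_M
    return x


fast = lcg_spin(1, 10)       # cheap variant
slow = lcg_spin(1, 10_000)   # same arithmetic, 1000 times more iterations
```

Since each step depends on the previous value, the loop cannot be trivially skipped, and the iteration count controls the relative cost of each test implementation.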

duburcqa · 2026-02-07T02:21:32Z

python/gstaichi/lang/_perf_dispatch.py

+        We collect a single sample from each implementation, and compare that single sample with the samples from the
+        other implementations.
+
+        We are comparing algorithms based on empirical runtime.
+
+        Note that for best results, sets of input arguments that have different runtimes should map to different
+        geometries, otherwise the comparison between runtimes might not be fair, and an inappropriate implementation
+        kernel might be selected.
+
+        We are not implementing an epsilon-greedy algorithm to keep sampling non-fastest variants just in case the
+        distribution is shifting over time.
+
+        It is not possible for you to control exploration vs exploitation.


Information is there. Not enjoyable to read due to both formatting and wording but at least it is there x)
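As a reading aid for the docstring quoted above, here is a minimal sketch of the strategy it describes: one timed sample per compatible implementation for each geometry, then always the fastest, with no epsilon-greedy re-sampling. This is plain Python, not the gstaichi implementation. The class name `OneSampleDispatcher`, its `register` method, the explicit `geometry` argument and the injectable `clock` are all assumptions made for illustration, and the warmup handling mentioned in the commits is omitted.

```python
import time


class OneSampleDispatcher:
    """Toy dispatcher: one timed sample per implementation, then the fastest."""

    def __init__(self, clock=time.perf_counter):
        self._impls = []    # list of (is_compatible, fn), in registration order
        self._times = {}    # geometry -> {impl index: elapsed time}
        self._fastest = {}  # geometry -> impl index chosen after sampling
        self._clock = clock

    def register(self, fn, is_compatible=lambda *args: True):
        self._impls.append((is_compatible, fn))

    def __call__(self, geometry, *args):
        if geometry in self._fastest:
            # Pure exploitation: slower variants are never sampled again.
            return self._impls[self._fastest[geometry]][1](*args)
        compatible = [i for i, (ok, _) in enumerate(self._impls) if ok(*args)]
        if not compatible:
            raise ValueError("no compatible implementation")
        times = self._times.setdefault(geometry, {})
        pending = [i for i in compatible if i not in times]
        # Exactly one implementation runs per call, so kernels that modify
        # their inputs are not applied twice.
        idx = pending[0] if pending else min(compatible, key=times.__getitem__)
        start = self._clock()
        result = self._impls[idx][1](*args)
        times[idx] = self._clock() - start
        if len(pending) <= 1:
            self._fastest[geometry] = min(compatible, key=times.__getitem__)
        return result


now = [0.0]  # fake clock so the example is deterministic
calls = []


def slow(x):
    now[0] += 5.0
    calls.append("slow")
    return x + 1


def fast(x):
    now[0] += 1.0
    calls.append("fast")
    return x + 1


dispatch = OneSampleDispatcher(clock=lambda: now[0])
dispatch.register(slow)
dispatch.register(fast)
results = [dispatch("shape=(4,)", 1) for _ in range(5)]
```

After the two sampling calls, every later call for the same geometry goes to `fast`. Inputs with different runtimes that share a geometry would be compared on unequal terms, which is the fairness caveat the docstring raises.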

hughperkins added 2 commits January 27, 2026 04:16

add ti.perf_dispatch

2e3e762

precommit

7f1997c

hughperkins mentioned this pull request Jan 30, 2026

[Type] Cache field shape, dtype, name, for 50x faster lookup #355

Merged

hughperkins assigned duburcqa and erizmr Feb 1, 2026

erizmr reviewed Feb 1, 2026

erizmr reviewed Feb 2, 2026

b => c

1045d4a

hughperkins added 4 commits February 5, 2026 05:31

migrate to no longer need idempotent assumption

fec2372

Merge branch 'main' into hp/perf-dispatch

2bef385

add more asserts

a25085c

Merge branch 'hp/perf-dispatch' of github.com:Genesis-Embodied-AI/gst…

b4fd100

…aichi into hp/perf-dispatch

duburcqa reviewed Feb 5, 2026

hughperkins added 5 commits February 5, 2026 08:36

throw exception if annotation order wrong

01554f8

NUM_WARMUP

ffa6f73

precommit

fbff1f6

__wrapped__

f2169d6

Merge branch 'main' into hp/perf-dispatch

b2a505e

duburcqa reviewed Feb 5, 2026

hughperkins added 5 commits February 5, 2026 10:24

remove self.

9427ffc

_update_fastest

26af402

_get, temporary

7cfaecf

remove tempoaryr

781ad2b

thrwo exceptions and shortcircuit

3f14261

duburcqa reviewed Feb 6, 2026

hughperkins added 8 commits February 6, 2026 08:14

detect parameter mismatch, and test this

d357e0c

part of previosu commit

e3076b7

chagne parlle/serial to time.sleep

cc89ffd

precommit

ce994f7

num_warmup as instance variable

8b3c7be

next iter

ec65937

generics

f5f1fc9

add comments about algo

c89c10f

duburcqa reviewed Feb 7, 2026

migrate from time to LCG

affd537
