Conversation
tests/python/test_perf_dispatch.py
Outdated
| else: | ||
| assert c[ImplEnum.a_shape0_ge2] == 1 | ||
| assert c[ImplEnum.a_shape0_lt2] == 0 | ||
| assert c[ImplEnum.a_shape0_ge2] == 0 |
There was a problem hiding this comment.
looks a typo here assert c[ImplEnum.a_shape0_ge2] == 0 and assert c[ImplEnum.a_shape0_ge2] == 1? Is it c[ImplEnum.serial] == 0? As after warming up, we only want to run the fastest one given the geometry hash
There was a problem hiding this comment.
This is a very good spot. And also led me to notice this bit of code wasnt being called, and my loop was too short. And when I made it longer, I noticed this PR assumed idmpotent kernels. 😅 Anyway, fixed now, in fec2372
|
Fixed |
| least_trials_idx = None | ||
| least_trials = None | ||
| for underlying_idx in compatible.keys(): | ||
| trial_count = self._trial_count_by_underlying_idx_by_geometry_hash[geometry_hash].get(underlying_idx, 0) |
There was a problem hiding this comment.
Same, self._trial_count_by_underlying_idx_by_geometry_hash[geometry_hash] should be list (initialised with 0)
There was a problem hiding this comment.
Oh I disagree fairly strongly here, because then if the list mutates in some way, we retrieve the wrong count.
There was a problem hiding this comment.
Oh I disagree fairly strongly here, because then if the list mutates in some way, we retrieve the wrong count.
You don't. When you register a new function you are supposed to expend this list of course. Same if you delete some (which is not supported but still).
There was a problem hiding this comment.
the trial counts, if they are a list, could become out of sync. strongly prefer to make them a dict, indexed by kernel index.
There was a problem hiding this comment.
the trial counts, if they are a list, could become out of sync.
Why? I don't see any reason for this. You have a register method that is the only entrypoint for users. it is easily to make sure that everything is in sync.
There was a problem hiding this comment.
wouldnt the kernel have to be hashable to use it as a key? 🤔
There was a problem hiding this comment.
Yes. Still, any python native function or custom class instance can be used as dict key:
def fun():
pass
class Class:
pass
cls = Class()
d = {fun: 2, cls: 1}
assert d[fun] == 2 * d[cls]There was a problem hiding this comment.
ok. how does this work? using the id? hashing the object? something else?
There was a problem hiding this comment.
Under the hood, dict is using the output of hash() to build the hashing table when inserting a new item, then equality comparison to make sure it has retrieved the expected object. The default hash for python objects is (id >> 4) | ((id << 60) & 0xffffffffffffffff) (see https://stackoverflow.com/a/38519187/4820605).
There was a problem hiding this comment.
ok, so it's hashing on id, which I explicitly would like to avoid please, since multiple objects can have identical ids, over time.
| assert c[ImplEnum.a_shape0_ge2] == 1 | ||
| assert c[ImplEnum.a_shape0_lt2] == 0 |
There was a problem hiding this comment.
I would recommend making my_func1_impl_a_shape0_lt_2 a better candidate (faster) than my_func1_impl_a_shape0_ge_2 if it was compatible, just to make sure the compatibility filter is doing the job.
There was a problem hiding this comment.
yeah, ok. migrating to time.sleep will make this both easier, and more obvoius.
There was a problem hiding this comment.
Actually no it is not addressed. I said my_func1_impl_a_shape0_lt_2 a better candidate (faster) than my_func1_impl_a_shape0_ge_2 if it was compatible.
tests/python/test_perf_dispatch.py
Outdated
| for b in range(B): | ||
| a[b] = a[b] * b | ||
| c[ImplEnum.serial] = 1 | ||
| time.sleep(0.05) |
There was a problem hiding this comment.
time.sleep inside kernels is supported ?!
There was a problem hiding this comment.
interesting question. Let me migrate to use linear congruential generator.
| We collect a single sample from each implementation, and compare that single sample with the samples from the | ||
| other implementations. | ||
|
|
||
| We are comparing algorithms based on empirical runtime. | ||
|
|
||
| Note that for best results, sets of input arguments that have different runtimes should map to different | ||
| geometries, otherwise the comparison between runtimes might not be fair, and an inappropriate implementation | ||
| kernel might be selected. | ||
|
|
||
| We are not implementing an epsilon-greedy algorithm to keep sampling non-fastest variants just in case the | ||
| distribution is shifting over time. | ||
|
|
||
| It is not possible for you to control exploration vs exploitation. |
There was a problem hiding this comment.
Information is there. Not enjoyable to read due to both formatting and wording but at least it is there x)
Issue: #
Brief Summary
copilot:summary
Walkthrough
copilot:walkthrough