After banging my head against a wall trying to implement timestamp queries myself (lots of null results), I tried this crate. It was super simple to set up, and it seems to work well.

I did something really simple: I created an encoder and used it to create a scope, then used the scope to create a scoped_compute_pass. What surprised me was the output I printed (using the console_output function from the example). I got a time of about 150us labeled Compute Kernel (my label for the scope variable), and underneath it another time of about 250us labeled Scoped Compute Pass.

The confusion is this: if the initial scope is the parent, why is the nested scoped compute pass time larger than it? I was expecting it to be the other way around. Can someone help me interpret the two times please? Thanks so much! Great crate!!
Replies: 1 comment 2 replies
I'm guessing you're running on macOS? :) I observed a lot of aggressive reordering of commands there, and that sometimes leaves the timer queries in weird spots, causing the kind of thing you're describing. It's quite nasty. wgpu is partially to blame for that, but it's also really hard to do this without making things slower. Generally, the pass scopes are far more trustworthy than other scopes, especially on macOS, where they map directly onto a Metal concept.
Another reason this could be happening is simply inaccuracy of the counter.
I think wgpu-profiler should do a better job of documenting these pitfalls.
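
In the meantime, one way to spot this in your own results is to check whether each nested scope's time range actually sits inside its parent's. Here is a rough sketch of that check. Note that `ScopeTiming` and `find_suspicious` are hypothetical names made up for illustration, not the crate's API, so you would need to map the profiler's real result type onto something like this yourself.

```rust
use std::ops::Range;

/// Hypothetical stand-in for one profiler result: a label, a start..end time
/// in seconds, and the scopes nested inside it. This is NOT the crate's type.
#[derive(Debug, Clone)]
pub struct ScopeTiming {
    pub label: String,
    pub time: Range<f64>,
    pub nested: Vec<ScopeTiming>,
}

impl ScopeTiming {
    /// Duration of this scope in microseconds.
    pub fn duration_us(&self) -> f64 {
        (self.time.end - self.time.start) * 1e6
    }
}

/// Walks the scope tree and collects (parent label, child label) pairs where
/// the child's time range is not contained in the parent's range. Such a pair
/// suggests that a timestamp landed somewhere other than where it was recorded.
pub fn find_suspicious(scope: &ScopeTiming, out: &mut Vec<(String, String)>) {
    for child in &scope.nested {
        let contained =
            child.time.start >= scope.time.start && child.time.end <= scope.time.end;
        if !contained {
            out.push((scope.label.clone(), child.label.clone()));
        }
        // Recurse so deeper nesting levels are checked as well.
        find_suspicious(child, out);
    }
}
```

If a pair like (Compute Kernel, Scoped Compute Pass) shows up in the output, I would trust the pass timing and treat the outer scope's number as unreliable.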