common: verbose: asynchronous verbose mode for execution time tracking #3055

avmanerikar · 2025-04-09T17:50:48Z

Description

This PR proposes a proof of concept (PoC) for an asynchronous verbose mode that tracks kernel execution times accurately, without blocking and with minimal synchronization latency. In the existing verbose mode, retrieving kernel timings incurs significant overhead because GPU kernel execution must be synchronized and because the time is tracked on the host.
The asynchronous mode removes the synchronization overhead by using event callbacks to query execution timings.
The prototype targets the OpenCL GPU API, which provides kernel execution statistics for profiling.
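
To illustrate the callback-based idea described above, here is a minimal, self-contained sketch. It does not call OpenCL or oneDNN: `fake_event_t`, `timing_log_t`, and `track_async` are hypothetical names invented for this example, and the "device" completion is triggered by hand. In an OpenCL setting the analogous pieces would presumably be a completion callback registered with `clSetEventCallback` and timestamps read with `clGetEventProfilingInfo`, but the sketch only models the control flow, not the PR's actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device event; not an OpenCL or oneDNN type.
class fake_event_t {
public:
    using callback_t
            = std::function<void(std::uint64_t start_ns, std::uint64_t end_ns)>;

    // Registers a completion callback and returns immediately.
    // If the event has already completed, the callback runs right away.
    void set_callback(callback_t cb) {
        if (completed_) {
            cb(start_ns_, end_ns_);
            return;
        }
        callback_ = std::move(cb);
    }

    // Simulates the device reporting that the kernel finished.
    void complete(std::uint64_t start_ns, std::uint64_t end_ns) {
        assert(end_ns >= start_ns);
        completed_ = true;
        start_ns_ = start_ns;
        end_ns_ = end_ns;
        if (callback_) callback_(start_ns, end_ns);
    }

private:
    bool completed_ = false;
    std::uint64_t start_ns_ = 0;
    std::uint64_t end_ns_ = 0;
    callback_t callback_;
};

// Collects one (kernel name, duration in ms) record per finished kernel.
struct timing_log_t {
    std::vector<std::pair<std::string, double>> entries;
};

// Attaches timing collection to an event without waiting for it.
inline void track_async(
        fake_event_t &ev, const std::string &name, timing_log_t &log) {
    ev.set_callback([name, &log](std::uint64_t s, std::uint64_t e) {
        log.entries.emplace_back(name, static_cast<double>(e - s) * 1e-6);
    });
}
```

The host thread returns from `track_async` before any timing exists; the record appears only once the event completes. In a real runtime the callback may run on a different thread, so a shared log like this one would need synchronization, which the sketch omits.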

The implementation is enabled at run time with DNNL_ASYNC_VERBOSE=1:

DNNL_VERBOSE=profile_exec DNNL_ASYNC_VERBOSE=1 ./examples/primitives-matmul-cpp gpu
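
As a rough illustration of a run-time switch of this kind, the sketch below reads an environment variable and treats the value `1` as "enabled". The helper `env_flag_enabled` is hypothetical and is not oneDNN code; how the library actually parses `DNNL_ASYNC_VERBOSE`, and which values it accepts, is not shown in this PR page and is an assumption here.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of oneDNN): returns true only when the
// environment variable `name` is set to exactly "1".
inline bool env_flag_enabled(const char *name) {
    assert(name != nullptr);
    const char *value = std::getenv(name);
    return value != nullptr && std::string(value) == "1";
}
```

A caller could check `env_flag_enabled("DNNL_ASYNC_VERBOSE")` once at start-up and cache the result rather than querying the environment on every primitive execution.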

Related RFC: [link]

Addresses MFDNN-13603.

Checklist

Have you published an RFC for the new feature?
Was the RFC approved?
Have you added relevant tests?

src/gpu/intel/ocl/stream.hpp

src/common/primitive_iface.cpp

echeresh · 2025-05-27T22:11:41Z

src/gpu/intel/ocl/stream.cpp

+        return status::success;
+
+    } else {
+        cl_int err = clWaitForEvents(1, &async_tracked_event_);


Isn't this call synchronous? We enqueue a kernel, record an event and execution blocks here, until the kernel finishes. Am I missing something?

Yes. This was a fallback for failure cases where the verbose info is then printed with the default stream.wait() calls. The implementation has been updated to avoid repetition.

avmanerikar requested review from a team as code owners April 9, 2025 17:50

github-actions bot added the documentation (Codeowner: @oneapi-src/onednn-doc), platform:gpu-generic (Codeowner: @oneapi-src/onednn-gpu-generic), component:api (Codeowner: @oneapi-src/onednn-arch), and component:build labels Apr 9, 2025

avmanerikar marked this pull request as draft April 9, 2025 17:52

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from 25b0638 to bf1e8d1 Compare April 9, 2025 18:00

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from bf1e8d1 to 625eec4 Compare April 28, 2025 17:45

github-actions bot added the platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel) and component:common labels Apr 28, 2025

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from 625eec4 to e69f76d Compare May 12, 2025 17:26

github-actions bot removed the platform:gpu-generic (Codeowner: @oneapi-src/onednn-gpu-generic) label May 12, 2025

mgouicem reviewed May 12, 2025

src/gpu/intel/ocl/stream.hpp (outdated, resolved)

mgouicem reviewed May 12, 2025

src/gpu/intel/ocl/stream.hpp (outdated, resolved)

mgouicem reviewed May 12, 2025

src/common/primitive_iface.cpp (outdated, resolved)

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch 2 times, most recently from c834a20 to dc4f76d Compare May 27, 2025 17:09

avmanerikar changed the title from "[WIP] common: verbose: asynchronous verbose mode for execution time tracking" to "common: verbose: asynchronous verbose mode for execution time tracking" May 27, 2025

avmanerikar marked this pull request as ready for review May 27, 2025 19:18

avmanerikar requested a review from a team as a code owner May 27, 2025 19:18

echeresh reviewed May 27, 2025


avmanerikar marked this pull request as draft May 29, 2025 20:17

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from dc4f76d to 8051a73 Compare June 4, 2025 18:42

avmanerikar mentioned this pull request Jun 4, 2025

rfcs: proposal for an asynchronous verbose mode #3393

Open

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from 8051a73 to ad7be6d Compare June 16, 2025 17:47

avmanerikar marked this pull request as ready for review June 16, 2025 17:52

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from ad7be6d to c9a7d45 Compare June 25, 2025 18:32

github-actions bot removed the documentation (Codeowner: @oneapi-src/onednn-doc) label Jun 25, 2025

github-actions bot removed the component:api (Codeowner: @oneapi-src/onednn-arch) and component:build labels Jun 25, 2025

avmanerikar added 2 commits July 1, 2025 17:10

common: stream: define base async tracker for stream

bbb7915

gpu: ocl: add async verbose tracking for ocl streams

4513647

avmanerikar force-pushed the amanerik/main/async-verbose-mode branch from c9a7d45 to 4513647 Compare July 2, 2025 00:10

gpu: sycl: add async verbose tracking for sycl streams

9bdebde

avmanerikar Jun 16, 2025
