Skip to content

Make simple kernel timer stdout in ascii text#323

Open
yasahi-hpc wants to merge 4 commits intokokkos:developfrom
yasahi-hpc:stdout-simple-kernel-timer
Open

Make simple kernel timer stdout in ascii text#323
yasahi-hpc wants to merge 4 commits intokokkos:developfrom
yasahi-hpc:stdout-simple-kernel-timer

Conversation

@yasahi-hpc
Copy link
Copy Markdown

@yasahi-hpc yasahi-hpc commented Mar 9, 2026

This PR aims at making the default output of simple kernel timer into stdout in ascii.

For the original behavior, you need to set KOKKOS_TOOLS_TIMER_BINARY = true.

Simple example

KokkosP: Simple Kernel Timer Library Initialized (sequence is 0, version: 20240906)
Elapsed time: 0.156063 [s]

 (Type)   Total Time, Call Count, Avg. Time per Call, %Total Time in Kernels, %Total Program Time
-------------------------------------------------------------------------

Regions: 

- KokkosFFT::create_plan[TPL_cufft]
 (Region)   2.499047 2 1.249524 2531.790119 93.934408
- KokkosFFT::exec_plan[TPL_cufftExecZ2D]
 (Region)   0.023082 2002 0.000012 23.384202 0.867600
- KokkosFFT::exec_plan[TPL_cufftExecD2Z]
 (Region)   0.011615 1001 0.000012 11.767462 0.436596
- KokkosFFT::cleanup_plan[TPL_cufft]
 (Region)   0.000078 2 0.000039 0.078984 0.002930

-------------------------------------------------------------------------
Kernels: 

- advect_x
 (ParFor)   0.043851 2000 0.000022 44.425443 1.648272
- advect_v
 (ParFor)   0.019495 1000 0.000019 19.750921 0.732798
- compute_rho
 (ParFor)   0.013256 1001 0.000013 13.429516 0.498262
- solve_poisson
 (ParFor)   0.010916 1001 0.000011 11.059018 0.410311
- poisson
 (ParFor)   0.010211 1001 0.000010 10.345261 0.383830
- Kokkos::View::initialization [rho_k]
 (ParFor)   0.000715 1 0.000715 0.724386 0.026876
- init_fn
 (ParFor)   0.000056 1 0.000056 0.056762 0.002106
- Kokkos::View::initialization [f0] via memset
 (ParFor)   0.000029 3 0.000010 0.029468 0.001093
- Kokkos::View::initialization [rho] via memset
 (ParFor)   0.000028 3 0.000009 0.028502 0.001057
- Kokkos::View::initialization [linspace] via memset
 (ParFor)   0.000027 2 0.000014 0.027536 0.001022
- Kokkos::View::initialization [phi] via memset
 (ParFor)   0.000027 3 0.000009 0.027536 0.001022
- Kokkos::View::initialization [fn] via memset
 (ParFor)   0.000025 2 0.000013 0.025603 0.000950
- Kokkos::View::initialization [f] via memset
 (ParFor)   0.000013 1 0.000013 0.013285 0.000493
- Kokkos::View::initialization [phi_k]
 (ParFor)   0.000012 1 0.000012 0.012077 0.000448
- Kokkos::View::initialization [E_k]
 (ParFor)   0.000011 1 0.000011 0.011111 0.000412
- Kokkos::View::initialization [E] via memset
 (ParFor)   0.000009 1 0.000009 0.009179 0.000341
- Kokkos::View::initialization [delta_rho] via memset
 (ParFor)   0.000009 1 0.000009 0.009179 0.000341
- Kokkos::View::initialization [inv_kx] via memset
 (ParFor)   0.000008 1 0.000008 0.007971 0.000296
- Kokkos::View::initialization [linspace_mirror] via memset
 (ParFor)   0.000004 2 0.000002 0.004106 0.000152
- Kokkos::View::initialization [inv_kx_mirror] via memset
 (ParFor)   0.000002 1 0.000002 0.001932 0.000072
- Kokkos::View::initialization [freq] via memset
 (ParFor)   0.000001 1 0.000001 0.001208 0.000045

-------------------------------------------------------------------------
Summary:

Total Execution Time (incl. Kokkos + non-Kokkos):                   2.66042 seconds
Total Time in Kokkos kernels:                                       0.09871 seconds
   -> Time outside Kokkos kernels:                                  2.56171 seconds
   -> Percentage in Kokkos kernels:                                    3.71 %
Total Calls to Kokkos Kernels:                                         6028
-------------------------------------------------------------------------

@yasahi-hpc yasahi-hpc marked this pull request as draft March 9, 2026 22:48
@yasahi-hpc yasahi-hpc force-pushed the stdout-simple-kernel-timer branch from 3c3e893 to 05e4ed8 Compare March 10, 2026 11:40
@yasahi-hpc yasahi-hpc marked this pull request as ready for review March 10, 2026 13:03
@masterleinad
Copy link
Copy Markdown
Contributor

I wouldn't be opposed to deprecate or remove the binary altogether but it would be good to hear reasons for that design decision in the first place.

switch (info->getKernelType()) {
case PARALLEL_FOR: typeStr = " (ParFor) "; break;
case PARALLEL_REDUCE: typeStr = " (ParRed) "; break;
case PARALLEL_SCAN: typeStr = " (ParScan) "; break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use parallel_for etc to make it easier for users to map it to Kokkos parallel dispatch?

Copy link
Copy Markdown
Author

@yasahi-hpc yasahi-hpc Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean I just keep the original behavior from kp_reader.cpp.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine for now.

@yasahi-hpc
Copy link
Copy Markdown
Author

Well, this is one of the tasks proposed by @crtrott
Do you have some opinion?

Copilot AI added a commit that referenced this pull request Mar 15, 2026
…alize_library

- Port print_ascii function from PR #323 to simple-kernel-timer
- Modify kokkosp_finalize_library to use print_ascii by default
- Add 7 death test files covering parallel_for, parallel_reduce,
  parallel_scan, region, and combinations with regions
- All 17 tests pass (10 existing + 7 new)

Co-authored-by: yasahi-hpc <57478230+yasahi-hpc@users.noreply.github.com>
Copy link
Copy Markdown
Member

@crtrott crtrott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough

switch (info->getKernelType()) {
case PARALLEL_FOR: typeStr = " (ParFor) "; break;
case PARALLEL_REDUCE: typeStr = " (ParRed) "; break;
case PARALLEL_SCAN: typeStr = " (ParScan) "; break;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine for now.

@yasahi-hpc
Copy link
Copy Markdown
Author

@crtrott Thank you for approval.
Should I implement tests in this PR or can I do that in another PR?

@crtrott
Copy link
Copy Markdown
Member

crtrott commented Mar 20, 2026

Was looking at your other PR with the tests now. What do you think?

@yasahi-hpc
Copy link
Copy Markdown
Author

In my opinion, I would like to make this first and add tests in another PR.
There are more points to discuss for tests.

The PR with tests is fine with me, but there are certainly rooms to improve.
Particularly, it is not clear what should be tested.
Should we test number of "-", new lines, and white spaces?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants