
Split find tracks kernel into compute and storage #1318

Draft
stephenswat wants to merge 2 commits into acts-project:main from stephenswat:perf/split_find_tracks

Conversation

@stephenswat
Member

No description provided.

@stephenswat added the refactor label (Change the structure of the code) on May 7, 2026

@stephenswat force-pushed the perf/split_find_tracks branch from 1f93da0 to c35f1eb on May 20, 2026 15:31

@stephenswat
Member Author

Physics performance summary

Here is a summary of the physics performance effects of this PR. Command used:

traccc_seeding_example_cuda --input-directory=/data/Acts/odd-simulations-20240506/geant4_ttbar_mu200 --digitization-file=geometries/odd/odd-digi-geometric-config.json --conditions-file=geometries/odd/odd-digi-geometric-config.json --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --material-file=geometries/odd/odd-detray_material_detray.json --input-events=10 --use-acts-geom-source=on --check-performance --truth-finding-min-track-candidates=5 --truth-finding-min-pt=1.0 --truth-finding-min-z=-150 --truth-finding-max-z=150 --truth-finding-max-r=10 --seed-matching-ratio=0.99 --track-matching-ratio=0.5 --track-candidates-range=5:100 --seedfinder-vertex-range=-150:150

Seeding performance

Total number of seeds went from 298341 to 298338 (-0.0%)

Seeding plots



Track finding performance

Total number of found tracks went from 50211 to 50211 (+0.0%)

Finding plots

Track fitting performance

Fitting plots

Seeding to track finding relative performance

Seeding to track finding plots



Note

This is an automated message produced on the explicit request of a human being.

@stephenswat
Member Author

Performance summary

Here is a summary of the performance effects of this PR:

Graphical

Tabular

| Kernel | Reciprocal throughput (92e0378) | Reciprocal throughput (c35f1eb) | Delta | Parallelism (92e0378) | Parallelism (c35f1eb) |
|---|---|---|---|---|---|
| propagate_to_next_surface | 5.81 ms | 5.80 ms | -0.2% | 4.33 | 4.34 |
| find_tracks | 1.56 ms | 1.17 ms | -24.5% | 1.86 | 1.83 |
| count_doublets | 812.16 μs | 813.22 μs | 0.1% | 1.61 | 1.61 |
| count_triplets | 566.61 μs | 568.61 μs | 0.4% | 1.02 | 1.02 |
| find_doublets | 536.15 μs | 534.25 μs | -0.4% | 3.08 | 3.08 |
| ccl_kernel | 433.62 μs | 434.52 μs | 0.2% | 1.71 | 1.71 |
| Thrust::sort | 380.66 μs | 380.27 μs | -0.1% | 7.31 | 7.31 |
| condense_tracks |  | 251.01 μs | nan |  | 5.76 |
| find_triplets | 169.32 μs | 170.25 μs | 0.6% | 1.32 | 1.31 |
| estimate_track_params | 146.57 μs | 146.28 μs | -0.2% | 2.68 | 2.68 |
| build_tracks | 123.70 μs | 123.66 μs | -0.0% | 3.71 | 3.71 |
| select_seeds | 59.26 μs | 59.68 μs | 0.7% | 1.34 | 1.34 |
| populate_grid | 23.89 μs | 23.91 μs | 0.1% | 1.22 | 1.22 |
| count_grid_capacities | 22.05 μs | 22.01 μs | -0.2% | 1.22 | 1.22 |
| remove_duplicates | 20.14 μs | 20.00 μs | -0.7% | 25.48 | 25.61 |
| fill_sorted_measurements | 16.20 μs | 16.27 μs | 0.4% | 1.13 | 1.13 |
| update_triplet_weights | 14.76 μs | 14.84 μs | 0.5% | 1.27 | 1.27 |
| fill_finding_propagation_sort_keys | 8.73 μs | 8.80 μs | 0.7% | 7.74 | 7.66 |
| form_spacepoints | 8.37 μs | 8.43 μs | 0.6% | 1.48 | 1.49 |
| reduce_triplet_counts | 5.63 μs | 5.61 μs | -0.3% | 3.09 | 3.09 |
| unknown | 5.03 μs | 5.08 μs | 0.9% | 4.27 | 4.25 |
| fill_finding_duplicate_removal_sort_keys | 1.55 μs | 1.56 μs | 0.8% | 37.96 | 38.00 |
| DeviceScanKernel |  | 991.06 ns | nan |  | 106.53 |
| DeviceScanInitKernel |  | 65.87 ns | nan |  | 768.00 |
| Total | 10.72 ms | 10.58 ms | -1.3% | 3.46 | 3.58 |
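The find_tracks delta on its own overstates the gain if condense_tracks is the storage half that this PR splits out of find_tracks. That reading is an assumption based on the PR title and on condense_tracks having a value only for c35f1eb; the report itself does not say so. The sketch below recomputes the relative change from the rounded figures shown in the table, so the results can differ slightly from the Delta column, which was presumably computed from unrounded values. The helper name `delta_percent` is illustrative only.

```python
# Figures copied from the table above, in milliseconds (rounded as displayed).
old = {"find_tracks": 1.56, "Total": 10.72}
new = {"find_tracks": 1.17, "condense_tracks": 0.25101, "Total": 10.58}


def delta_percent(before, after):
    """Relative change in percent; negative means the new commit is faster."""
    return 100.0 * (after - before) / before


# Overall change, matching the Total row.
total_delta = delta_percent(old["Total"], new["Total"])

# Assumption: condense_tracks is the storage step split out of find_tracks,
# so the fair comparison is old find_tracks vs. new find_tracks + condense_tracks.
split_after = new["find_tracks"] + new["condense_tracks"]
split_delta = delta_percent(old["find_tracks"], split_after)

print(f"Total: {total_delta:+.1f}%")
print(f"find_tracks + condense_tracks: {split_after:.2f} ms ({split_delta:+.1f}%)")
```

Under that assumption, the combined cost of the two kernels is about 1.42 ms against 1.56 ms before, a reduction of roughly 9% rather than the 24.5% reported for find_tracks alone.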

Important

All metrics in this report are given as reciprocal throughput, not as wallclock runtime.
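Reciprocal throughput is the average time per unit of work while the device is kept busy, so its inverse is a throughput. The sketch below converts the Total row accordingly. The report does not state what the unit of work is; treating it as one event is an assumption, and the function name is illustrative.

```python
def throughput_per_second(reciprocal_throughput_ms):
    """Convert a reciprocal throughput in milliseconds to units of work per second."""
    return 1000.0 / reciprocal_throughput_ms


# Total row of the table: 10.72 ms (92e0378) and 10.58 ms (c35f1eb).
before = throughput_per_second(10.72)
after = throughput_per_second(10.58)

# Assumption: one unit of work corresponds to one event.
print(f"92e0378: {before:.1f} per second")
print(f"c35f1eb: {after:.1f} per second")
```

Because several units of work can be in flight at once, the time a single one takes from start to finish can be longer than these figures suggest.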

Note

This is an automated message produced upon the explicit request of a human being.
