Skip to content

Improved softkiller examples#193

Merged
graeme-a-stewart merged 2 commits intomainfrom
softkiller-examples
Aug 5, 2025
Merged

Improved softkiller examples#193
graeme-a-stewart merged 2 commits intomainfrom
softkiller-examples

Conversation

@graeme-a-stewart
Copy link
Member

Rename --maxevents and --skip to --pileup-maxevents and --pileup-skip which allow variations in the amount of pileup to be applied.

Select the hard scatter event with --eventno (this only ever one event).

Change hard scatter and pileup files to be positional arguments, which is more consistent with other scripts.

Make both scripts a bit more verbose for the user's information.

Run a jet reconstruction in softkiller_runtime.jl so that we actually do something with the reduced event.

N.B. for now, the reduced event has to be rewritten into a new vector to get the cluster_hist_indexes correct. This can be removed once softkiller returns a good-for-reconstruction output.

Add a test.sh script that can be used for a quick test of the examples.

@graeme-a-stewart graeme-a-stewart added this to the 1.0.0 Release milestone Jul 27, 2025
@graeme-a-stewart graeme-a-stewart requested a review from emadmtr July 27, 2025 01:35
@graeme-a-stewart graeme-a-stewart added the examples Related to example codes label Jul 27, 2025
@codecov
Copy link

codecov bot commented Jul 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.33%. Comparing base (79cee14) to head (c7f5006).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #193   +/-   ##
=======================================
  Coverage   77.33%   77.33%           
=======================================
  Files          21       21           
  Lines        1319     1319           
=======================================
  Hits         1020     1020           
  Misses        299      299           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@emadmtr
Copy link
Contributor

emadmtr commented Aug 2, 2025

Hi @graeme-a-stewart, I checked the changes everything seems great! I only have a question for some of the arguments that we can pass from the command line, because I haven't implemented a numer of them anywhere so I was just curious if you have a plan for them or they could be removed.

@graeme-a-stewart
Copy link
Member Author

Hello @emadmtr - thanks for taking a look. You're right about the unused arguments, so I have just removed them now. It's easy to put them back if we do need them again later.

@graeme-a-stewart
Copy link
Member Author

If you go to the "Files Changed" tab there's a "Review Changes" button and you can approve this PR if you think everything is ok now.

Rename --maxevents and --skip to --pileup-maxevents and --pileup-skip
which allow variations in the amount of pileup to be applied.

Select the hard scatter event with --eventno (this only even one event).

Change hard scatter and pileup files to be positional arguments, which
is more consistent with other scripts.

Make both scripts a bit more verbose for the user's information.

Run a jet reconstruction in softkiller_runtime.jl so that we actually
do something with the reduced event.

N.B. for now, the reduced event has to be rewritten into a new vector
to get the cluster_hist_indexes correct. This can be removed once
softkiller returns a good-for-reconstruction output.

Add a test.sh script that can be used for a quick test of the examples.
reduced_event, pt_threshold = softkiller(soft_killer, all_jets_sk)
@info "SoftKiller applied: $(length(reduced_event)) clusters remaining from $(length(all_jets_sk)), pt threshold = $pt_threshold"

# Workaround for cluster_hist_indedx not being correct after SoftKiller filtering
Copy link
Contributor

@mattleblanc mattleblanc Aug 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is for a future PR, but I wanted to say that I find this procedure to be very cumbersome, in-part because it has to happen between applying filters to the input constituents and any jet finding that is done later.

SK isn't the only filter people will want to study: CS and CHS are also obvious things to look at, and grooming jets will have a similar problem.

Is there some smarter way to handle the cluster_hist_index assignment when filtering particles? In FastJet I guess that this is part of the ClusterSequence and so it's handled there instead of by the user ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both in Fastjet and JetReconstruction the cluster history indices are part of the jet classes. The problem here is that our PseudoJets are Julia's struct and are immutable so their fields (index) can't be updated later so we have to create a new jet object with different index. Prior to #146, we had mutable struct which can be modified but had a number of other annoyances (reasoning about code with constants , less performant, more awkward C-bindings)

I guess as a minimum we could add a function taking a vector of jets after filter and returning vector of jets with normalized indices or maybe doing this inplace

Copy link
Member Author

@graeme-a-stewart graeme-a-stewart Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @mattleblanc

Your point is very well taken. We started to put effort into ensuring that this would be correct, but there are many cases to cover and lots of different paths to failure!

I propose that the reconstruction algorithms themselves take responsibility for checking and fixing things if a collection is passed that has mis-set the cluster_hist_index. This should not cost much at runtime and it makes things much safer and prevents the code from bombing if there's a filtering/selection path that didn't fix-up the cluster_hist_index itself.

See #200.

AFAICR Fastjet is using mutable structures, which is inherently more dangerous for the code, but allows these kinds of fix-ups on the fly. But we have ways to make sure that this is correct in Julia as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that having the clustering algorithms themselves handle this makes the most sense, even if it costs a little bit. In this case, at least we can ensure it's done as optimally as possible, instead of leaving the job up to the user.

Mutable structs seem undesirable for the reasons you both mention...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also makes things easier in the case of SoftKiller as it can just pass back the current list of jets and not worry about having to apply these fixes - this would make #191 moot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could consider writing custom https://juliaobjects.github.io/Accessors.jl/dev/examples/custom_optics/ if it makes everyone's life easier while keeping strict immutable

@graeme-a-stewart graeme-a-stewart merged commit eb63e6c into main Aug 5, 2025
13 checks passed
@graeme-a-stewart graeme-a-stewart deleted the softkiller-examples branch August 5, 2025 12:14
mattleblanc pushed a commit to mattleblanc/JetReconstruction.jl that referenced this pull request Aug 11, 2025
* Improved softkiller examples

Rename --maxevents and --skip to --pileup-maxevents and --pileup-skip
which allow variations in the amount of pileup to be applied.

Select the hard scatter event with --eventno (this only even one event).

Change hard scatter and pileup files to be positional arguments, which
is more consistent with other scripts.

Make both scripts a bit more verbose for the user's information.

Run a jet reconstruction in softkiller_runtime.jl so that we actually
do something with the reduced event.

N.B. for now, the reduced event has to be rewritten into a new vector
to get the cluster_hist_indexes correct. This can be removed once
softkiller returns a good-for-reconstruction output.

Add a test.sh script that can be used for a quick test of the examples.

* Remove unused arguments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

examples Related to example codes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants