-
Notifications
You must be signed in to change notification settings - Fork 1
Add ideal instrument and models #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7c46a05 to
c183566
Compare
Still some details to be worked out on I-C form in energy space (it's defined for ToF, really). It's not required for MVP so drop for now.
- Boxcar needs a default width or fails validation - Better to have a float default for float argument
Boxcar does have a standard deviation, not equivalent to length. Take length as argument (as this is the most natural parameter), but also derive sigma and mention in the docstring how to do conversion.
|
The boxcar logic is currently inverted. It also seems a bit naive with respect to boxcar sizes that don't evenly divide into the grid; perhaps it would be better to provide linearly-interpolated edges so that normalisation is correct without changing the maximum value.
Astropy seems to handle this by quantising to a whole number of samples and then if this is even we add a sample and set the edges to 0.5: https://docs.astropy.org/en/stable/api/astropy.convolution.Box1DKernel.html Not a bad scheme? |
It would be nice to include some nice interpolated versions later, but for simple testing and sanity checks it is nice if these functions are designed to always fall neatly on the sample bins. This avoid surprises such as peak shifts and triangles with flat tops.
Just assemble an output mesh and copy the kernels to it directly. Probably more efficient than the delta convolutions implementation and reduces risk of circular dependence of mixins. Include unit tests (ref data roughly validated but not fully confident yet.)
The convolve() based implementation made assumptions about the input points aligning perfectly to the output mesh. Also, it seemed to give the wrong results (peaks not centered correctly). In principle it is quite inefficient to use the energy-dependent broaden() method. However as it largely defers to a somewhat efficient fixed-energy get_peak() method it's not as bad as it looks. And anyway, these kernels are for testing. The results look pretty sensible except that, conspicuously, the area of the broadened boxcar is larger than the area of the input puslse. So the normalisation of this scheme is not yet correct.
Drop the get_peak() methods on boxcar and triangle that just tweak the docstring from mixin. Instead make the mixing docstring more general!
|
I'm getting some nice triangle functions now that snap to grid and obviously have the correct area. Boxcar is a little more subjective: I'm choosing to normalise area as though it is a trapezoid as this seems most likely to conserve properties correctly. However at this point I am suspicious of the scaling in general. I think kernels should be normalised by value while peaks are normalised by area. Then the kernel will broaden without modifying overall intensity when convolved against a spectrum. Alternatively we can adopt convention that kernels are normalised by area (as currently) but account for this in broaden() operations using bin_width. Either change is going to break a lot of existing tests, so let's do it outside this PR which focuses on adding new models. |
The main purpose for snapping is to avoid surprising changes in shape across a spectrum, and to facilitate comparison with convolution-based methods. For test purposes, we can achieve the same results as a snapped kernel by choosing the right width parameter. Let's document that part, and just rely on snapping for efficient peak placement. Note that test data is unchanged as the test widths were already tidy multiples of bin width.
Parametrise a general function instead of creating a data function per-test.
Trapezoid pdf requires c, d parameters as fractions of "scale" ( = length of base)
The test file is now looking rather concise!
- Use leading space to make 1D annotations throw same F722 error as others - Ignore this error from pyproject.toml
|
@RastislavTuranyi I can't technically request you as reviewer as you started the PR. But if you are interested/available and have any comments now is a good time to make them 😄 |
|
I don't think I understand what the problem with normalisation you mention is - the only thing that I can point out is that the |
RastislavTuranyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I've forgotten most of what i had learned, but i had real trouble understanding what is happening with all the bins and rounding and quantisation etc. Would it be possible to write up one robust explanation of what the problem is, how it's been solved, and how that may cause surprises for users? Ideally something i as i am now could understand (that should hopefully cover all users!). Currently, it feels like each model's documentation has a piece of the puzzle, so i think gathering it all up in one place would be valuable
| """ | ||
| characteristics = {"width": np.ones(len(points)) * self.width} | ||
| characteristics["sigma"] = np.full_like( | ||
| characteristics["width"], np.sqrt(1 / 12) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why you chose to use np.full_like, but isn't the width missing?
| characteristics["width"], np.sqrt(1 / 12) | |
| characteristics["width"], np.sqrt(self.width / 12) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it should be self.width * sqrt(1/12) 🤦
| Note that these kernels generated with scipy.signal.uniform are rounded | ||
| to an odd-integer width and edges move directly from full-height to | ||
| zero. Better but more complicated algorithms exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you maybe elaborate a little a bit on what this algorithm is? I think just one sentence at the beginning of this paragraph would make things a lot clearer!
Also, isn't being limited to odd-integer widths really restrictive? That's quite a small subset of real numbers...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is naturally what happens when evaluating the function on loc=(-self.width/2); as soon as the range reaches a new point on one side it also reaches a point on the other side; so it starts at 1 and increases in width by increments of 2.
The tricky part is that there isn't any interpolation to deal the edges of the box car, it always goes from full-height to zero over a width of one sample, so seems to "snap".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can probably be explained more clearly! An earlier version had some explicit logic to do it consistently, then I realised it was unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way you're explaining, this sounds more like the other cases in that the rounding is also to integer number of bins rather than to an integer - am i getting that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing with
Note that these kernels will always consist of an odd-integer width of full-height samples,
moving directly to zero at the surrounding samples. The area is normalised as though this
is a trapezoid (i.e. as though lines connect the boxcar top to the surrounding samples),
resulting in lower height than the ideal boxcar.
| fwhm | ||
| The width (in Full-Width Half-Maximum) of the Triangle function. This width is used for all | ||
| values of [w, Q]. When realised on a user mesh, the width is rounded to an integer number of | ||
| bins to create straight lines from the peak to zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this how scipy.stats.triag.pdf works? Because I don't see such an operation being explicitly done in get_kernel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Note that shape and area are only exactly correct when base lengths correspond to an even number | ||
| of bin widths. The get_peak() and broaden() methods will snap input points to the nearest mesh | ||
| value; this results in a consistent peak shape. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is similar to the situation in Boxcar and Triangle models? If they all face about the same issue, would it be possible to write one good explanation and either copy paste it into all affected models, or put it one place and then link to it? As you might have noticed, I am quite confused lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could write an explanation in the docs page including some graphs that make it a bit easier to understand. But then we should still probably have something written in the docstring? Does linking from docstring to sphinx doc pages work well?
| DATA_PATH = Path(__file__).parent / "data" / "ideal" | ||
|
|
||
|
|
||
| TEST_CASES = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the mesh is an issue for some of the models, could we have a test both for when the bins align with the characteristic, and when they don't?

















Resolves #24