NXLauetof Panel dependent dataset #102
Conversation
Force-pushed from e1c47d4 to 08007e5.
src/ess/nmx/nexus.py (outdated):
```diff
 _create_dataset_from_var(
     name="data",
     root_entry=nx_detector,
-    var=sc.scalar(0, unit=''),  # TODO: Add real data
+    var=sc.fold(dg["counts"].data, dim='id', sizes={'x': num_x, 'y': num_y}),
 )
```
@aaronfinke And this file has all the exporting methods.
Can you please review this part and check whether the fields are correct? For example, this block folds pixel counts into [n_x_pixels, n_y_pixels, n_tof_bins] and then saves them as `data` under `instrument/{detector_name}`.
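In case it helps the review, here is a small illustration (not code from the PR; the sizes are made up) of what that fold does: the flat per-pixel dimension `id` is reshaped into panel coordinates while the time-of-flight dimension is kept.

```python
# Illustration only: dimension names and sizes are made up.
import scipp as sc

num_x, num_y, num_tof = 4, 3, 2
flat = sc.ones(dims=['id', 't'], shape=[num_x * num_y, num_tof], unit='counts')
folded = sc.fold(flat, dim='id', sizes={'x': num_x, 'y': num_y})
print(folded.sizes)  # {'x': 4, 'y': 3, 't': 2}
```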
Make sure the counts are unsigned int and not float, since I think that is what is expected from the EFU (they were floats before, but they shouldn't be).
Also, we need to think about how to set the detector geometries in the file. I think the way it was done before (fast_axis, slow_axis, origin as 3D vectors) is fine. If we do that, we can remove the polar_angle and azimuthal_angle. I need to think about this, though.
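For concreteness, a hedged sketch of both points (names and values are illustrative, not the PR's actual exporter code): scipp works in floats here, so the cast to unsigned int can happen at write time via numpy, and the panel geometry can be expressed as 3D vectors.

```python
# Illustrative sketch, not the actual exporter code.
import numpy as np
import scipp as sc

counts = sc.array(dims=['id'], values=[0.0, 7.0, 3.0], unit='counts')
counts_uint = counts.values.astype(np.uint32)  # unsigned counts for the file

# Geometry as 3D vectors, as it was done before (made-up values):
fast_axis = sc.vector(value=[1.0, 0.0, 0.0])
slow_axis = sc.vector(value=[0.0, 1.0, 0.0])
origin = sc.vector(value=[0.0, 0.0, 0.29], unit='m')
```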
@aaronfinke I pushed a change to export them as uint.
Just note that the original McStas data is floating point, so if you don't give the workflow a big enough MaximumProbability, the cast might wipe out all the information.
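To illustrate the failure mode (made-up numbers; the actual scaling lives in the workflow, not shown here): McStas weights are small floats, and the cast to unsigned int truncates whatever the scaling leaves.

```python
# Illustration only, not workflow code.
import numpy as np

weights = np.array([0.002, 0.4, 0.9])  # hypothetical McStas probabilities
for maximum_probability in (1, 10_000):
    counts = (weights * maximum_probability).astype(np.uint32)
    print(maximum_probability, counts)
# 1     -> [0 0 0]           everything truncates to zero
# 10000 -> [  20 4000 9000]  the structure survives
```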
Usually the MaximumProbability is set to 10000, which seems to be big enough. In any case, DIALS reads the data in as ints, so we might as well control that input ourselves.
As to whether counts should be uint or signed int: some detectors output -1 as a count to flag defective pixels; sometimes it's MAX_VALUE-1. I don't know what our detector will do, but once I know, I will let you know.
Doesn't crash for me :) (also does not give the correct processed data, but I guess it's not supposed to yet?)
What do you mean by "correct processed data"? The counts should be correct.
Force-pushed from 44d301e to 48e77e8.
@scipp/ess-maintainers @aaronfinke Sorry, I couldn't really make it pretty yet and ended up making a PR with many fixes.
I had to move around the order of some providers. I'll be away until the 11th of February, so if anyone's interested, please pick this up and continue. TODO:
These are not part of essnmx; where are they from?
They are from https://github.com/scipp/essreduce.
Tried on a McStas dataset, got this error:
I would suggest taking advantage of the fact that these HDF5 data are stored in compressed chunks, and that the chunk size is already optimal for I/O. For example, in one McStas file you can inspect the chunk shape and then read the dataset chunk by chunk. That is how I am working with the McStas data directly, and it is quite fast and memory-economical.
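The code snippets from this comment were not preserved here, so the following is only a minimal sketch of the same idea with h5py; the file and dataset paths are hypothetical.

```python
import h5py

with h5py.File("mcstas.h5", "r") as f:
    dset = f["entry1/data/events"]  # hypothetical event-list dataset
    print(dset.shape, dset.chunks)  # chunk shape chosen by the writer
    total = 0.0
    # iter_chunks() yields one slice tuple per stored chunk, so each read
    # decompresses exactly one chunk: fast and memory-economical.
    for chunk_slice in dset.iter_chunks():
        total += dset[chunk_slice].sum()
```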
My guess is that the chunks in the file are much smaller than the optimal chunk size for the data reduction. There will be a very significant per-pixel cost, so if there are too few events per chunk, performance will be dominated by that per-pixel cost. But yes, we can try to choose the chunk size as a multiple of the HDF5 chunk size of the dataset, as an added bonus.
@aaronfinke And I also added an argument that allows automatic chunking based on the HDF5 chunk size of the dataset.
Why can't this be a user input, or pre-configured, as in other techniques/instruments?
It will be pre-configured for real data, once we know precisely the tof bin limits for each chopper setting (determined experimentally). But not for McStas simulations ;)
So we can make it a user input then?
This will cause problems in DIALS if the boxes (or tof_min, tof_max) are not accurately determined; refinement of the wavelength in DIALS tends to fail with larger bin sizes, for example. A helper script that determines them directly from the tof data is a better option. This is what I do now, and it works. I'm not sure what the downside is, aside from taking longer to compute / having to read the file twice (when loading into memory is not an option). If processing takes twice as long, that's still significantly shorter than it takes to collect NMX data (~10 min vs. hours). I have sent Sunyoung the helper I have written for chunked data, but I will add it here for posterity:
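The helper itself was not preserved in this export; under assumed names (a chunked 2D McStas event list whose last column is tof), a sketch of the idea looks like this:

```python
# Sketch only, not the original helper; dataset path and column index
# are hypothetical.
import h5py
import numpy as np

def tof_range(path: str, dataset: str = "entry1/data/events",
              tof_column: int = -1) -> tuple[float, float]:
    """Scan the file chunk by chunk and return (tof_min, tof_max)."""
    tof_min, tof_max = np.inf, -np.inf
    with h5py.File(path, "r") as f:
        dset = f[dataset]
        for chunk_slice in dset.iter_chunks():
            tof = dset[chunk_slice][..., tof_column]
            tof_min = min(tof_min, tof.min())
            tof_max = max(tof_max, tof.max())
    return float(tof_min), float(tof_max)
```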
It's rewritten in #115.
Related to #60 and #90
@aaronfinke can you please try this and see if it crashes or not?