-
Notifications
You must be signed in to change notification settings - Fork 13
Support array-type metadata fields in cubids group
#407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This is a very smart approach to the problem. I think it will work. I don't think any of the dataframes will ever be written and reloaded during the course of a command line call. |
|
@mattcieslak do you want me to merge? |
If you have two unique values (NaNs and some actual value), it would label everything as cluster 0, but it should probably label the actual values as 0 and the NaNs as 1.
|
This interacts with #445 in that we'll be able to group based on file collection-related entities, which are going to all be arrays. For example, if one file named |
tien-tong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
As far as I can tell, the functions used for grouping/clustering were _get_param_groups and format_params (now renamed to cluster_single_parameters), and you have updated both in this PR.
I like that we can now also compare lists of strings.
mattcieslak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kind of attached to the obliquity variant and didn't see on a quick readthrough what this will look like with the new convention. What will oblique scan file names look like?
|
They should look something like |
|
I could hardcode in some kind of rename for that field? So it ends up being |
|
I think it would be hard to know what's going on with directly using the values in imageorientationpatientdicom because that field is part of an affine matrix. If the person doing the scanning was angling it on a subject-by-subject basis then these values will be all over the place and the clusters won't make sense. I like the idea of having an ObliqueDim1, ObliqueDim2, etc that can be true or false. That will capture meaningful variation, particularly for dmri |
|
Okay I've reverted the obliquity-related changes. Can you take another look? |
mattcieslak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Closes #281, closes #331, and closes #304 (in combination with #403).
A few things to look into
Changes proposed in this pull request
param_group_dfparameter tocubids.cubids.format_params()cubids.cubids.format_params()tocubids.utils.cluster_single_parameters()cubids.cubids.round_params().Drop "Obliquity" boolean field in favor of "ImageOrientationPatientDICOM" array field.