-
Notifications
You must be signed in to change notification settings - Fork 1
rerun the one off analysis #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
rerun the one off analysis #16
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reruns the one off analysis by switching the cell subsampling strategy from a fixed percentage per well to sampling a specified number of cells per well, and updates the downstream plot generation accordingly.
- Changed input/output directories and file names to reflect "cell_number_subsampled" instead of percentage‐based names.
- Updated argument parsing and sampling logic in scripts from percentage-based to number-based cell sampling.
- Adjusted notebooks and sampling-space generation to accommodate the new cell count parameter.
Reviewed Changes
Copilot reviewed 16 out of 24 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
one_off_analysis/scripts/3.combine_map_on_percentage_of_cells.py | Updated input/output file paths for cell number subsampling. |
one_off_analysis/scripts/2.run_map_on_subsampled_cells.py | Replaced percentage arguments with number_of_cells and adjusted sampling logic. |
one_off_analysis/scripts/1.generate_sampling_space.py | Modified sampling space generation to produce cell count values. |
one_off_analysis/scripts/0.run_map_across_channels.py | Updated data file paths and minor channel handling adjustments. |
one_off_analysis/notebooks/*.ipynb | Updated column headers and outputs to reflect number_of_cells instead of percentage_of_cells. |
one_off_analysis/combinations/number_of_cells.txt | Added a list of possible numbers of cells to sample. |
Files not reviewed (8)
- one_off_analysis/child_sc_sampling_HPC.sh: Language not supported
- one_off_analysis/combinations/percentage.txt: Language not supported
- one_off_analysis/combinations/seeds.txt: Language not supported
- one_off_analysis/local_sc_sampling_.sh: Language not supported
- one_off_analysis/parent_sc_sampling_HPC.sh: Language not supported
- one_off_analysis/scripts/4.visualize_channel_changes.r: Language not supported
- one_off_analysis/scripts/4.visualize_mAPs.r: Language not supported
- one_off_analysis/scripts/5.visualize_percentage_changes.r: Language not supported
Comments suppressed due to low confidence (1)
one_off_analysis/scripts/2.run_map_on_subsampled_cells.py:42
- The help text for '--number_of_cells' is misleading; consider updating it to describe that the parameter specifies the number of cells to sample per well rather than acting as a seed.
parser.add_argument("--number_of_cells", type=int, help="Seed for the random number generator")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! left some comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a minor nitpick, but I wanted to share some thoughts about the figure. I understand that the color gradient is intended to reflect increasing concentrations, and conceptually it makes sense to choose colors that convey a progression. However, I wonder if it would be more effective to treat the concentrations as categorical rather than continuous in this context. Personally, I tend to associate gradients more with heat maps or continuous surfaces rather than multiple discrete line plots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, the concentrations between 4.88 and 156.25 nM appear quite similar in color, which makes it difficult to distinguish between those lines in the plot. You might consider using a more diverse set of distinct colors to improve clarity, especially for viewers who need to differentiate specific concentrations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also found the light blue line (representing 0–2.24 nM) a bit hard to see against the background it’s quite faint. One possible solution could be to increase the line thickness to enhance visibility, although I’m not entirely sure how it will look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I look back at this... it's more than one nitpick, sorrryyyy. But as always, you are the artist c:
|
||
# In[ ]: | ||
|
||
|
||
# In[4]: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seem like you got an empty cell here
|
||
# percent_cell_mAP <- percent_cell_mAP %>% filter(Metadata_dose == 0.61) | ||
percent_cell_mAP <- percent_cell_mAP %>% group_by(shuffle, percentage_of_cells,Metadata_Time, Metadata_dose) %>% | ||
# number_cell_mAP <- number_cell_mAP %>% filter(Metadata_dose == 0.61) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider removing this line if it is no longer needed
This PR reruns the one off analysis and reruns the plot generation.
Main change is for the cell subsampling we go from 1 cell/well sampled then aggregate to the downsampled max across time for the number of cells per well ~80ish.