Notebook + sample data for estimating FE and uncertainty #5
At first glance, we probably don't want to add all the serialized JSON files to the code base. I presume they are easy to regenerate with the notebook, correct? If not, we probably want just one or two examples to illustrate the functionality implemented in the notebook. EDIT: The same applies to the PDBs.
These JSON files are from the ABL+imatinib notebook; they were probably added to the commit by mistake.
33da184 to
0e84e9e
I added the notebook for FE estimates and for plotting work trajectories/distributions. I'm having trouble updating the branch with the scripts and data for Abl-ATP, but I'll try to get that up soon. Update: scripts, data, and analysis notebook added! I'll convert this to a PR after adding some notes.
0e84e9e to
1947171
ijpulidos
left a comment
Great work! The notebook is looking great.
I made some comments; overall they are about avoiding too many files (maybe we want to showcase the toolkit/utilities with just a couple of cycle results or so).
We don't want to be storing the checkpoint files in the repo.
The notebook looks very good. I think we probably want one that showcases the functions (plots, ddG estimates, etc.) for a couple of cycles or so, if possible. That way we don't have to store all the results files, which are not great to have in the repo.
We probably also want the notebook to include its outputs (at least the "useful" ones, like plots and ddG values or similar).
Same here, we don't want to store checkpoints.
Samples of these plots should probably be in the notebook itself. That way we can showcase the tools and the results in the same file and avoid having PNG (binary) files around as well.
I don't think we need to store these scripts for running the DAGs as part of this set of changes. If they improve on the current scripts in the playground directory, we probably want a new PR that improves those, or adds new ones if needed.
- /playground/notebooks/pale_toolkit.ipynb: initial notebook for ddG and uncertainty BAR estimates with NEQ cycles
- /playground/protein-mutation/ABL-ATP: directory with sample data for Abl-ATP

Estimating $\Delta\Delta G$ and Uncertainty using PyMBAR
- is_cycle_complete: check if given CycleUnit directory has completed results
- get_num_cycles: get minimum number of completed cycles across mutations to compare BAR estimates
- load_work_arrays: load work values stored in npy arrays of completed cycles
- get_uncertainty: estimate uncertainty by taking standard deviation of bootstrapped BAR-estimated free energies
- plot_works: plot work trajectories and distributions
- compute_ddG_estimate: compute ddG and uncertainty using MBAR
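For readers unfamiliar with the bootstrap approach described for get_uncertainty, here is a minimal sketch of the idea. It is not the notebook's code and does not call PyMBAR: it solves the BAR equation by bisection using only the standard library, so it can run anywhere. The names `bar_estimate`, `bootstrap_uncertainty`, and `ddg_from_legs` are hypothetical, work values are assumed to be in units of kT, the demo data are synthetic Gaussian draws, and combining two legs in quadrature assumes their errors are independent.

```python
import math
import random
import statistics


def _fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_estimate(w_f, w_r, tol=1e-10):
    """Solve the BAR equation for the free energy difference (in kT).

    w_f: forward work values, w_r: reverse work values, both in kT.
    Stand-in for a PyMBAR call; uses plain bisection.
    """
    m = math.log(len(w_f) / len(w_r))

    def imbalance(df):
        # Monotonically increasing in df; its root is the BAR estimate.
        fwd = sum(_fermi(m + w - df) for w in w_f)
        rev = sum(_fermi(-m + w + df) for w in w_r)
        return fwd - rev

    # Expand a bracket until it contains the root, then bisect.
    lo, hi = -1.0, 1.0
    while imbalance(lo) > 0:
        lo *= 2.0
    while imbalance(hi) < 0:
        hi *= 2.0
    for _ in range(200):  # hard cap so the loop always terminates
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bootstrap_uncertainty(w_f, w_r, n_bootstrap=200, seed=0):
    """Standard deviation of BAR estimates over bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_bootstrap):
        # Resample each direction independently, with replacement.
        f_sample = rng.choices(w_f, k=len(w_f))
        r_sample = rng.choices(w_r, k=len(w_r))
        estimates.append(bar_estimate(f_sample, r_sample))
    return statistics.stdev(estimates)


def ddg_from_legs(dg_a, err_a, dg_b, err_b):
    """Hypothetical helper: difference of two legs, errors in quadrature."""
    return dg_a - dg_b, math.hypot(err_a, err_b)


# Synthetic demo data (assumption): Gaussian work distributions that
# satisfy the Crooks relation for a true free energy of 3 kT.
_rng = random.Random(42)
true_df, sigma = 3.0, 1.0
w_forward = [_rng.gauss(true_df + 0.5 * sigma**2, sigma) for _ in range(200)]
w_reverse = [_rng.gauss(-true_df + 0.5 * sigma**2, sigma) for _ in range(200)]

df = bar_estimate(w_forward, w_reverse)
ddf = bootstrap_uncertainty(w_forward, w_reverse, n_bootstrap=50)
print(f"BAR estimate: {df:.3f} +/- {ddf:.3f} kT")
```

In practice the work arrays would come from load_work_arrays rather than from synthetic draws, and the BAR solve would be delegated to PyMBAR; the bisection here only stands in for that step so the resampling logic is visible.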