
Steps to determine successes

Kate Fieseler edited this page Nov 9, 2023 · 2 revisions

Not every Fragmenstein placement is successful. A placement counts as successful only if it has a negative ∆∆G and an RMSD to the fragments within a threshold of 2 Å. Successful placements are not yet determined automatically, but they must be easily accessible as input to HIPPO. These are the steps I take to curate them manually:

  1. Run format_to_hippo.py with the following inputs:
    • -d [home_directory_name]: the home directory, which contains the base compound directories
    • -e [elaboration_csv_identifier]
    • -o [output_csv_identifier]
    • --rmsd [rmsd_threshold]: the RMSD threshold for successful placements
    • --remove: if present, searches through the elaboration directories and removes extraneous Fragmenstein files

This does two things:

  • Moves all successful placements to a directory labeled success within each base compound folder
  • Makes a success_dirs.json in the home directory, where each key is the path to the success folder of a base compound and each value is the number of successful placements in it.
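The success criteria and the success_dirs.json bookkeeping described above can be sketched roughly as follows. This is not the code of format_to_hippo.py: the function names `is_successful` and `write_success_dirs` are hypothetical, the strict `<` comparison at the RMSD boundary is an assumption, and so is the idea that each successful placement is one subdirectory inside success.

```python
import json
from pathlib import Path


def is_successful(ddg: float, rmsd: float, rmsd_threshold: float = 2.0) -> bool:
    """Return True if a placement has a negative ∆∆G and an RMSD below the threshold.

    Assumption: the boundary value itself (rmsd == threshold) is rejected.
    """
    return ddg < 0 and rmsd < rmsd_threshold


def write_success_dirs(home_dir) -> dict:
    """Count placements in each <base compound>/success folder and save the counts.

    Assumption: every successful placement is a subdirectory of success/.
    """
    home = Path(home_dir)
    counts = {}
    for base in sorted(p for p in home.iterdir() if p.is_dir()):
        success = base / "success"
        if success.is_dir():
            # key: path to the success folder, value: number of placements in it
            counts[str(success)] = sum(1 for p in success.iterdir() if p.is_dir())
    (home / "success_dirs.json").write_text(json.dumps(counts, indent=2))
    return counts
```

Base compound folders without a success directory are simply left out of the JSON in this sketch.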
  2. Report how many placements were run and how many were successful. I made a Jupyter notebook on the IRIS cluster for this because it was the easiest option at the time. It is at /data/xchem-fragalysis/kfieseler/D68EV3CPROA/how_many_placed.ipynb.
    The function in there is how_many_ran, which takes home_directory_path and elabs_csv. It iterates over each directory, finds the ones with success directories, counts how many elaboration folders there are, and stores the counts in a CSV with the suffix _SUCCESS that is merged with the original elaboration CSV.
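For readers without access to the notebook, here is a minimal sketch of what such a counting step could look like. It is not the notebook's how_many_ran: the name `how_many_ran_sketch`, the `key_column` parameter, the default column name `base_compound`, and the output columns `n_placed` and `n_success` are all hypothetical. It also assumes that elaboration folders left next to success are the placements that did not pass.

```python
import csv
from pathlib import Path


def how_many_ran_sketch(home_directory_path, elabs_csv, key_column="base_compound"):
    """Count placed and successful elaborations per base compound and write
    a copy of the elaboration CSV, suffixed _SUCCESS, with the counts added."""
    home = Path(home_directory_path)
    counts = {}
    for base in sorted(p for p in home.iterdir() if p.is_dir()):
        success = base / "success"
        if not success.is_dir():
            continue  # only base compounds with a success directory are counted
        n_success = sum(1 for p in success.iterdir() if p.is_dir())
        # Assumption: folders still sitting beside success/ are failed placements
        n_other = sum(1 for p in base.iterdir() if p.is_dir() and p.name != "success")
        counts[base.name] = {"n_placed": n_success + n_other, "n_success": n_success}

    elabs_csv = Path(elabs_csv)
    out_csv = elabs_csv.with_name(elabs_csv.stem + "_SUCCESS.csv")
    with elabs_csv.open(newline="") as src, out_csv.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["n_placed", "n_success"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            # Rows whose base compound has no success directory get zero counts
            row.update(counts.get(row.get(key_column), {"n_placed": 0, "n_success": 0}))
            writer.writerow(row)
    return out_csv
```

The sketch uses only the standard library so it runs anywhere; the same merge is a one-liner in pandas if that is already available on the cluster.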
