Summary
The App-PredictStructure.pl service script calls python -m protein_compare characterize in run_report() to generate structure characterization reports (HTML + JSON). However, protein_compare is not installed in the folding_prod.sif container's conda-predict environment, so report generation silently fails.
Current state
- protein_compare lives in a separate standalone container (protein-compare.sif), built from protein_structure_analysis/container/protein-compare.def
- The CWL workflows use protein-compare.sif as a separate step; this works
- The Perl service script assumes protein_compare is available in the same Python environment as predict-structure; this does not work
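One way to confirm the gap described above is to ask the interpreter under test whether the module resolves at all. The following is a minimal sketch, not part of the service script; the helper name `module_available` is hypothetical, and it assumes the script is run with the conda-predict interpreter (for example /opt/conda-predict/bin/python).

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the current interpreter.

    Hypothetical helper for illustration; it only locates the module,
    it does not import or execute it.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for a missing parent package or a module
        # whose __spec__ is unset; treat both as "not available".
        return False


if __name__ == "__main__":
    # Run with the interpreter under test so the result reflects that
    # environment rather than the host Python.
    for name in ("protein_compare",):
        status = "available" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```

A check like this, run before report generation, would turn the silent failure into an explicit message.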
Proposed fix
Install protein_compare into the existing conda-predict environment as a container build concern (not a Python dependency of predict-structure). This keeps the packages conceptually separate while making them co-available at runtime.
In the container def layer (reqts-predict-structure.def or equivalent):
conda activate /opt/conda-predict
pip install --no-cache-dir \
"protein_compare @ git+https://github.com/BV-BRC/protein_structure_analysis.git"
Version compatibility (verified)
Shared dependencies — no conflicts:
| Package | conda-predict | protein_compare requires |
| --- | --- | --- |
| biopython | 1.87 | >=1.81 |
| numpy | 2.4.4 | >=1.24 |
| pandas | 3.0.2 | >=2.0 |
| click | 8.3.1 | >=8.1 |
New packages needed: scipy, matplotlib, seaborn, tmtools, joblib
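The shared-dependency comparison above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the version pairs are copied from the table, the helper names are hypothetical, and the parser only handles plain dotted numeric versions. For pre-release or local version tags, `packaging.version.Version` would be the more robust choice.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '2.4.4' into (2, 4, 4)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so '2.0' and '2.0.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# (version in conda-predict, minimum required by protein_compare),
# values taken from the compatibility table in this issue.
SHARED = {
    "biopython": ("1.87", "1.81"),
    "numpy": ("2.4.4", "1.24"),
    "pandas": ("3.0.2", "2.0"),
    "click": ("8.3.1", "8.1"),
}


def check_shared(table: dict[str, tuple[str, str]]) -> dict[str, bool]:
    """Map each package name to whether its lower bound is satisfied."""
    return {pkg: meets_minimum(inst, need) for pkg, (inst, need) in table.items()}


if __name__ == "__main__":
    for pkg, ok in check_shared(SHARED).items():
        print(f"{pkg}: {'ok' if ok else 'CONFLICT'}")
```

Only lower bounds are checked here, matching the `>=` constraints listed in the table.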
Also needed
- System packages for DSSP and matplotlib rendering: dssp, libgl1, libglib2.0-0, libfontconfig1
- CCD compound data for DSSP: curl -o /var/cache/libcifpp/components.cif https://files.wwpdb.org/pub/pdb/data/monomers/components.cif
- Note: Debian Trixie replaced libgl1-mesa-glx with libgl1
Acceptance criteria
- apptainer exec folding_prod.sif /opt/conda-predict/bin/python -m protein_compare --help succeeds
- App-PredictStructure.pl report generation produces HTML and JSON output
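The two criteria could be scripted roughly as follows. This is a sketch, not an existing test in the repository: the function names are hypothetical, the image path is assumed to be reachable from the working directory, and because the issue does not name the report files, the output check only looks for any `.html` and any `.json` file in a given directory.

```python
import shutil
import subprocess
from pathlib import Path

IMAGE = "folding_prod.sif"  # assumed to be in the working directory
PYTHON = "/opt/conda-predict/bin/python"


def help_command(image: str = IMAGE, python: str = PYTHON) -> list[str]:
    """Build the argv for the first acceptance criterion."""
    return ["apptainer", "exec", image, python, "-m", "protein_compare", "--help"]


def check_help(image: str = IMAGE) -> bool:
    """Return True if `python -m protein_compare --help` exits with 0."""
    if shutil.which("apptainer") is None:
        raise RuntimeError("apptainer not found on PATH")
    result = subprocess.run(help_command(image), capture_output=True, text=True)
    return result.returncode == 0


def has_report_outputs(directory: str) -> bool:
    """Return True if the directory holds at least one .html and one .json file.

    The actual report file names are not stated in the issue, so this
    matches on extension only.
    """
    d = Path(directory)
    return any(d.glob("*.html")) and any(d.glob("*.json"))
```

`check_help()` needs apptainer and the built image, so it is meant to run on a build or test host; `has_report_outputs()` can be pointed at whatever directory the service script writes its report to.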