Commit 6b8fbcd

jessicaw9910, pre-commit-ci[bot], and claude authored
Databases (#188)
* removed comment
* removed kinase_schema.CollectionKinaseInfo
* comment on PRKD2 and AlphaMissense
* temporary scratch for aligning sequences to DiscoverX
* implemented new class ChEMBLMolecule to query for molecule details
* added xlrd to package dependencies to process Davis dataset
* preliminary info for davis harmonization
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* add check_molecules to ChEMBL; updated wrong ChEMBLMolecule argument
* add check_molecules to ChEMBL; updated wrong ChEMBLMolecule argument
* make rdkit a package dependency
* cli for querying ChEMBL for dataset preprocessing
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci
* moved davis and pkis2 modules to datasets
* changed error message for maybe_get_symbol_from_hgnc_search if custom_field provided
* updates to pkis2 and davis datasets modules
* removed commented out PR CIs for databases and schema
* fixed chembl search error - default empty list not None
* added adjudicate_kd_start and adjudicate_kd_end for dataset incorporation purposes
* added docstring for bool_offset
* allow for str_fasta to be used if need to hardcode for errors
* removed pytest.mark.skip as NCBI API is currently running
* added function to check if lipid kinase
* specified input_is_hgnc_symbol default in docstring
* added Pfam docstring
* UniProtRefSeqProteinGET and query_uniprotbulk_api to uniprot module; modifies nf-rnaseq package tooling
* fully working initial commit of discoverx module; construct to KD/KLIFS mapping outstanding
* added verbose flag to the KinaseInfo functions rather than logging by default
* added verbose flags
* added and commented out pip install nf-rnaseq from github; uncomment for testing if in use
* import only UniProtFASTA rather than entire uniprot module to avoid nf-rnaseq import errors; fix if want to test this functionality
* uncommented nf-rnaseq
* in progress datasets commit
* used verbose flag for caplog tests
* dict_refseq_indices working correctly
* dict_construct_sequences finalized - use this to generate harmonized representations
* generate the dataset csv files
* process now contains all code necessary to generate different aligned input sequences
* conformed to latest process module structure
* added dataset csv CLI to pyproject.toml
* added plotting functions for discoverx
* upgrades for discoverx plotting
* CLI script to generate poster dataset plots
* plot both svg and PNG formats for all
* added plot dynamic range to the plotting CLI, need to fix font size
* fixed svg in plot_dynamic_range - font still looks a little off; added docstrings and fixed comment format
* Fix test_pfam and test_ncbi to handle API 500 errors gracefully

  Handle RetryError exceptions when external APIs return 500 errors by skipping tests instead of failing. This prevents CI failures due to unpredictable external API availability.

  Changes:
  - Wrap test_pfam API calls in try-except block
  - Wrap test_ncbi API calls in try-except block
  - Skip tests with informative messages when 500 errors occur
  - Re-raise other exceptions to catch real issues

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>
* Refactor plotting code and fix SVG font rendering issues

  This commit improves the plotting functionality by:
  1. Creating a reusable save_plot() helper function to reduce code duplication
  2. Fixing SVG font rendering issues by converting text to paths
  3. Improving mathtext rendering for subscripts (K_d, log_10)

  Changes:
  - Add save_plot() function to handle saving both SVG and PNG formats
  - Replace repetitive save code in all 5 plotting functions
  - Change svg.fonttype from "none" to "path" for consistent rendering
  - Update mathtext from \mathregular to \mathrm for proper subscript rendering
  - Ensure plots render consistently in browsers, VS Code, and vector editors

  Benefits:
  - SVG files now render perfectly in all viewers without spacing/kerning issues
  - Reduced code duplication by ~60 lines
  - Easier maintenance with centralized save logic
  - Consistent behavior across all plotting functions

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>
* absolute filepath for cwd instead of '.'
* fixed KinaseMissenseMutations.dict_replace - only do this if key in original dataset
* make the checks and post_init optional in case loading from a CSV file for a cohort that requires a VPN - logger errors are now warnings; allow load_from_csv from an input str if loading from multiple dataframes (e.g., KinaseMissenseMutations._df and ._df_filter); added pathfile_filter to KinaseMissenseMutations
* updated databases for kw_only arg study_id in Mutations
* fixed bug in dict_kinase_cbio in get_kinase_missense_mutations function - need to check if mkt_name is in dict_kinase_cbio rather than cbio_name
* changed HGNC name and mismatch error logging
* Two minor logger formatting tweaks (#186)

  Linebreaks and spacing for canonical mismatch errors
* only log query errors if present
* moved classes from app to mkt.databases.app since need to use extensibly in other places (mkt_impact); simplified names for relevant app modules since no longer scripts importing locally; remove py3dmol and streamlit/bokeh related plotting functions to standalone visualization script in app; created pymol module and moved CLI script to mkt.databases; added webcolors to pyproject.toml dependencies
* removed all plotting - keep this in standalone app
* removed all plotting - keep this in standalone app
* updated imports for new app structure
* changed imports in app script
* PyMOL module and CLI
* removed self.html = self.visualize_structure() from StructureVisualizer and moved to StructureVisualizerVisualizer

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude <[email protected]>
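
The commit message describes tests that skip rather than fail when an external API returns 500 errors. The test files are not part of this diff, so the following is only a minimal sketch of that pattern. It uses the standard library's `unittest.SkipTest` in place of `pytest.skip`, and both the `RetryError` class and the `call_or_skip` helper are hypothetical stand-ins, not names from the repository.

```python
import unittest


class RetryError(Exception):
    """Hypothetical stand-in for the exception raised after retries are exhausted."""


def call_or_skip(api_call, *args, **kwargs):
    """Run api_call; skip on a 500-type RetryError and re-raise everything else."""
    try:
        return api_call(*args, **kwargs)
    except RetryError as exc:
        # Assumption: the error message mentions the 500 status code.
        if "500" in str(exc):
            raise unittest.SkipTest(f"External API unavailable: {exc}") from exc
        # Any other failure is a real problem and should surface in CI.
        raise
```

Checking the message for "500" is a simplification; a real test would more likely inspect the wrapped response or exception attributes.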
1 parent dfb9cc3 commit 6b8fbcd

4 files changed: +58 -89 lines changed

missense_kinase_toolkit/app/visualizers.py

Lines changed: 4 additions & 0 deletions
@@ -177,6 +177,10 @@ class StructureVisualizerVisualizer(StructureVisualizer):
     )
     """Dimensions for the py3Dmol viewer."""

+    def __post_init__(self):
+        super().__init__(self.obj_kinase, self.str_attr)
+        self.html = self.visualize_structure()
+
     def visualize_structure(self) -> str | None:
         """Visualize the structure using py3Dmol.
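
The hunk above moves HTML rendering out of the base class and into the subclass's `__post_init__`. A minimal sketch of that split follows, assuming both classes are standard-library dataclasses (the diff does not confirm this). `Structure` and `HtmlStructure` are hypothetical stand-ins for `StructureVisualizer` and `StructureVisualizerVisualizer`. Note that the sketch chains to `super().__post_init__()`, whereas the commit calls `super().__init__(...)`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Structure:
    """Hypothetical base class: prepares structure text, does no rendering."""

    name: str
    pdb_text: str = field(init=False)

    def __post_init__(self):
        # Derived attribute computed once the generated __init__ has run.
        self.pdb_text = f"HEADER {self.name}"


@dataclass
class HtmlStructure(Structure):
    """Hypothetical subclass: adds the rendering step on top of the base setup."""

    html: str | None = field(init=False, default=None)

    def __post_init__(self):
        # Run the base class setup first, then render.
        super().__post_init__()
        self.html = self.visualize_structure()

    def visualize_structure(self) -> str | None:
        # Placeholder rendering; the real method uses py3Dmol.
        return f"<pre>{self.pdb_text}</pre>"
```

With this layout, code that only needs the parsed structure (for example, a PyMOL exporter) can build the base class without pulling in the viewer.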

missense_kinase_toolkit/databases/mkt/databases/app/structures.py

Lines changed: 0 additions & 1 deletion
@@ -73,7 +73,6 @@ def __post_init__(self):
         self.structure = self.convert_mmcifdict2structure()
         self.pdb_text = self.convert_structure2string()
         self.residues = self.structure.get_residues()
-        self.html = self.visualize_structure()

     @staticmethod
     def _parse_pdb_line(line: str) -> dict[str, Any] | None:
Lines changed: 5 additions & 30 deletions
@@ -1,15 +1,14 @@
 #!/usr/bin/env python3

 import argparse
-import os

 from mkt.databases.app.sequences import SequenceAlignment
 from mkt.databases.app.structures import StructureVisualizer
 from mkt.databases.colors import DICT_COLORS
-from mkt.databases.pymol import generate_simple_pymol_files
-from mkt.schema import io_utils
+from mkt.databases.pymol import PyMOLGenerator
+from mkt.schema.io_utils import deserialize_kinase_dict

-DICT_KINASE = io_utils.deserialize_kinase_dict(str_name="DICT_KINASE")
+DICT_KINASE = deserialize_kinase_dict(str_name="DICT_KINASE")


 def parse_args():
@@ -45,30 +44,6 @@ def main():
         bool_show=False,
     )

+    pymol_generator = PyMOLGenerator(viz=viz)
     output_directory = f"./pymol_output/{gene}"
-    _, script_file = generate_simple_pymol_files(viz, output_directory, gene)
-
-    f"""
-    Files generated:
-    PDB: {os.path.join(output_directory, f"{gene}_structure.pdb")}
-    Script: {os.path.join(output_directory, f"{gene}_pymol_script.py")}
-
-    To run in PyMOL:
-    1. Open PyMOL
-    2. Navigate to {output_directory}
-    3. Run: run {gene}_pymol_script.py
-    4. Save PNG: set ray_trace_mode, 3; png filename.png, ray=1, dpi=300
-    """
-
-    print("\n" + "=" * 60)
-    print("MANUAL PYMOL INSTRUCTIONS:")
-    print("=" * 60)
-    print("1. Open PyMOL GUI or command line")
-    print("2. Change to the output directory:")
-    print(f" cd {os.path.abspath(output_directory)}")
-    print("3. Run the script:")
-    print(f" run {os.path.basename(script_file)}")
-    print("4. To save as high-res PNG:")
-    print(" set ray_trace_mode, <mode>")
-    print(" png your_filename.png, ray=1, dpi=300")
-    print("=" * 60)
+    pymol_generator.save_pymol_files(output_directory, gene)
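
Going only by the signatures visible in this commit's pymol.py diff (`__init__(self, structure_visualizer)` and `save_pymol_files(self, output_dir: str)`), the two new calls in this script do not appear to match: the constructor is called with a `viz=` keyword and `save_pymol_files` receives an extra `gene` argument. The sketch below reproduces just those signatures in a hypothetical stub to show what Python would do with each call; `PyMOLGeneratorStub` and `try_call` are illustrative names, not part of the repository.

```python
class PyMOLGeneratorStub:
    """Hypothetical stub mirroring only the signatures visible in the diff."""

    def __init__(self, structure_visualizer):
        self.viz = structure_visualizer

    def save_pymol_files(self, output_dir: str):
        return output_dir


def try_call(func, *args, **kwargs):
    """Return the TypeError message a call raises, or None if it succeeds."""
    try:
        func(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


viz = object()  # placeholder for a StructureVisualizer instance

# Keyword name differs from the parameter name, so this call fails.
print(try_call(PyMOLGeneratorStub, viz=viz))

generator = PyMOLGeneratorStub(structure_visualizer=viz)

# Two positional arguments against a one-argument method also fails.
print(try_call(generator.save_pymol_files, "./pymol_output/ABL1", "ABL1"))

# Matching the visible signature succeeds and prints None.
print(try_call(generator.save_pymol_files, "./pymol_output/ABL1"))
```

If the real class matches the stub, the call site would need either `PyMOLGenerator(viz)` with a single-argument `save_pymol_files(output_directory)`, or a signature change in pymol.py.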

missense_kinase_toolkit/databases/mkt/databases/pymol.py

Lines changed: 49 additions & 58 deletions
@@ -1,14 +1,9 @@
 import os

-try:
-    import webcolors
+import webcolors

-    WEBCOLORS_AVAILABLE = True
-except ImportError:
-    WEBCOLORS_AVAILABLE = False

-
-class SimplePyMOLGenerator:
+class PyMOLGenerator:
     """Generate PDB file with embedded color/style info and standalone PyMOL script."""

     def __init__(self, structure_visualizer):
@@ -25,12 +20,7 @@ def _convert_color_to_hex(self, color: str) -> str:
         if color.startswith("#"):
             return color

-        # Try webcolors first
-        if WEBCOLORS_AVAILABLE:
-            try:
-                return webcolors.name_to_hex(color)
-            except ValueError:
-                pass
+        webcolors.name_to_hex(color)

         # Fallback color mapping
         color_map = {
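
In the `_convert_color_to_hex` hunk above, the new line calls `webcolors.name_to_hex(color)` without returning its result and without the earlier `ValueError` guard. As far as the diff shows, a known name would then fall through to the fallback map and an unknown name would raise. A minimal sketch of one way to keep both the lookup and the fallback follows. The lookup is passed in as a parameter so the example runs without the third-party package; `FALLBACK_HEX`, `DEFAULT_HEX`, and `demo_lookup` are hypothetical, and the commit's own `color_map` is truncated in the diff.

```python
from typing import Callable, Mapping

# Illustrative fallback table; not the repository's color_map.
FALLBACK_HEX = {"red": "#ff0000", "green": "#008000", "blue": "#0000ff"}
DEFAULT_HEX = "#808080"  # assumed default for names nothing recognizes


def convert_color_to_hex(
    color: str,
    name_to_hex: Callable[[str], str],
    fallback: Mapping[str, str] = FALLBACK_HEX,
) -> str:
    """Return a hex string for a color name, preferring the supplied lookup."""
    if color.startswith("#"):
        return color
    try:
        # Return the lookup result instead of discarding it.
        return name_to_hex(color)
    except ValueError:
        # Unknown to the lookup: use the local table, then the default.
        return fallback.get(color.lower(), DEFAULT_HEX)


def demo_lookup(name: str) -> str:
    """Toy lookup that knows one name and raises ValueError otherwise."""
    table = {"navy": "#000080"}
    if name not in table:
        raise ValueError(f"unknown color name: {name}")
    return table[name]
```

With the real library installed, the call would be `convert_color_to_hex("navy", webcolors.name_to_hex)`.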
@@ -270,48 +260,49 @@ def generate_pymol_script(self, pdb_path: str, output_path: str) -> str:

         return output_path

-
-def generate_simple_pymol_files(viz_obj, output_dir: str, base_name: str = None):
-    """
-    Generate PDB file and PyMOL script for manual PyMOL execution.
-
-    Parameters
-    ----------
-    viz_obj : StructureVisualizer
-        Existing StructureVisualizer object
-    output_dir : str
-        Directory to save files
-    base_name : str, optional
-        Base name for files (default: gene name)
-
-    Returns
-    -------
-    tuple
-        Paths to (pdb_file, pymol_script)
-    """
-    if base_name is None:
-        base_name = viz_obj.obj_kinase.hgnc_name
-
-    # Ensure output directory exists
-    os.makedirs(output_dir, exist_ok=True)
-
-    # Generate files
-    generator = SimplePyMOLGenerator(viz_obj)
-
-    pdb_path = os.path.join(output_dir, f"{base_name}_structure.pdb")
-    script_path = os.path.join(output_dir, f"{base_name}_pymol_script.py")
-
-    generator.generate_annotated_pdb(pdb_path)
-    generator.generate_pymol_script(pdb_path, script_path)
-
-    print("Files generated:")
-    print(f" PDB: {pdb_path}")
-    print(f" Script: {script_path}")
-    print("")
-    print("To run in PyMOL:")
-    print(" 1. Open PyMOL")
-    print(f" 2. Navigate to {output_dir}")
-    print(f" 3. Run: run {os.path.basename(script_path)}")
-    print(" 4. Save PNG: set ray_trace_mode, 3; png your_filename.png, ray=1, dpi=300")
-
-    return pdb_path, script_path
+    def save_pymol_files(self, output_dir: str):
+        """
+        Generate PDB file and PyMOL script for manual PyMOL execution.
+
+        Parameters
+        ----------
+        output_dir : str
+            Directory to save files
+
+        Returns
+        -------
+        tuple
+            Paths to (pdb_file, pymol_script)
+        """
+        os.makedirs(output_dir, exist_ok=True)
+
+        pdb_path = os.path.join(output_dir, f"{self.gene_name}_structure.pdb")
+        script_path = os.path.join(output_dir, f"{self.gene_name}_pymol_script.py")
+
+        self.generate_annotated_pdb(pdb_path)
+        self.generate_pymol_script(pdb_path, script_path)
+
+        f"""
+        Files generated:
+        PDB: {os.path.join(output_dir, f"{self.gene_name}_structure.pdb")}
+        Script: {os.path.join(output_dir, f"{self.gene_name}_pymol_script.py")}
+
+        To run in PyMOL:
+        1. Open PyMOL
+        2. Navigate to {output_dir}
+        3. Run: run {self.gene_name}_pymol_script.py
+        4. Save PNG: set ray_trace_mode, 3; png filename.png, ray=1, dpi=300
+        """
+
+        print("\n" + "=" * 60)
+        print("MANUAL PYMOL INSTRUCTIONS:")
+        print("=" * 60)
+        print("1. Open PyMOL GUI or command line")
+        print("2. Change to the output directory:")
+        print(f" cd {os.path.abspath(output_dir)}")
+        print("3. Run the script:")
+        print(f" run {os.path.basename(script_path)}")
+        print("4. To save as high-res PNG:")
+        print(" set ray_trace_mode, <mode>")
+        print(" png <your_filename>.png, ray=1, dpi=300")
+        print("=" * 60)
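
Two details of `save_pymol_files` as shown in this diff stand out: the docstring promises a tuple of paths but no `return` statement is visible, and the bare `f"""..."""` block is an expression statement whose value is never printed. The sketch below shows one possible shape that returns the paths and prints the summary. `FileWriterSketch` is a hypothetical stand-in that writes placeholder files instead of real PDB or PyMOL content, and the `gene_name` attribute is assumed from its use in the diff.

```python
import os
import tempfile


class FileWriterSketch:
    """Hypothetical stand-in for PyMOLGenerator; writes placeholder files only."""

    def __init__(self, gene_name: str):
        self.gene_name = gene_name

    def generate_annotated_pdb(self, pdb_path: str) -> str:
        with open(pdb_path, "w") as fh:
            fh.write("REMARK placeholder structure\nEND\n")
        return pdb_path

    def generate_pymol_script(self, pdb_path: str, output_path: str) -> str:
        with open(output_path, "w") as fh:
            fh.write(f"# placeholder script for {os.path.basename(pdb_path)}\n")
        return output_path

    def save_pymol_files(self, output_dir: str) -> tuple[str, str]:
        """Write both files and return (pdb_path, script_path)."""
        os.makedirs(output_dir, exist_ok=True)

        pdb_path = os.path.join(output_dir, f"{self.gene_name}_structure.pdb")
        script_path = os.path.join(output_dir, f"{self.gene_name}_pymol_script.py")

        self.generate_annotated_pdb(pdb_path)
        self.generate_pymol_script(pdb_path, script_path)

        # Build the summary as a value and print it, rather than leaving a
        # bare f-string expression that is evaluated and discarded.
        summary = f"Files generated:\n  PDB: {pdb_path}\n  Script: {script_path}"
        print(summary)

        # Return the paths so callers can use them, as the docstring states.
        return pdb_path, script_path


with tempfile.TemporaryDirectory() as tmp_dir:
    pdb_file, script_file = FileWriterSketch("ABL1").save_pymol_files(
        os.path.join(tmp_dir, "ABL1")
    )
```

Returning the tuple also lets the CLI script print its own instructions from `script_file`, as the removed code did.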

0 commit comments
