@jessicaw9910
Collaborator

Description

Provide a brief description of the PR's purpose here.

Todos

Notable points that this PR has either accomplished or will accomplish.

  • TODO 1

Questions

  • Question1

Status

  • Ready to go

jessicaw9910 and others added 30 commits April 10, 2025 14:20
…of cif, used load_raw instead of load for cif files
* initial implementation

* added ChEMBL testing

* added importlib-resources to dependencies

* pinned importlib-resources version

* add importlib-resources to devtools env file and removed from pyproject.toml

* support for MoleculeSearch, MoleculeExact, and MoleculePreferred

* add verbose flagging to ChEMBL; return_chembl_id to make queries hierarchically

* docstrings for existing API clients; initial implementation of GraphQLClient

* opentargets module initial implementation

* test for open_targets

* updated tests with Search, Exact, and Preferred for ChEMBL

* notebooks for diffusion modeling group
jessicaw9910 and others added 9 commits August 20, 2025 10:50
* initial implementation

* added ChEMBL testing

* added importlib-resources to dependencies

* pinned importlib-resources version

* add importlib-resources to devtools env file and removed from pyproject.toml

* support for MoleculeSearch, MoleculeExact, and MoleculePreferred

* add verbose flagging to ChEMBL; return_chembl_id to make queries hierarchically

* docstrings for existing API clients; initial implementation of GraphQLClient

* opentargets module initial implementation

* test for open_targets

* updated tests with Search, Exact, and Preferred for ChEMBL

* notebooks for diffusion modeling group

* renamed dir
* changed schema namespace and included json files with package rather than in data directory

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 fixes in io_utils

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated schema_demo notebook for sub-package changes

* updated schema demo notebook for latest changes

* updated KinaseInfo json files with new field names

* revised schema notebook for latest changes

* now using dict to limit the possible serialization/deserialization functions; support for json, yaml, toml

* using ConfigDict(use_enum_values=True) for Enum to enable serialization; default values for fields where None allowed to None; validating KLIFS2UniProt dicts to default to None if missing given toml doesn't save None entries

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unmarked files as executable

* removed unused Callable import for flake8 compatibility
pull to merge

* removed unused Optional import for flake8 compatibility
pull to merge

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added flags to codecov.yml

* added schema dependencies to test_env.yaml

* schema tests

* added CI yaml for schema sub-package

* updated schema notebook for increased standardization and typos

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 modifications

* added -e flag to pip install - hope to resolve 'No package metadata was found' error

* broke out dependencies specific to mkt.schema separately

* use schema_test_env.yaml instead to try to resolve 'mkt-schema' requires a different Python: 3.13.2 not in '<3.12,>=3.9'

* removed sub-package specific environments

* conformed env and ci files to match asap

* added tqdm to env and toml for schema

* fixed kinhub.name to kinhub.kinase_name

* added encoding='utf-8' to serialization function for Windows compatibility with TOML

* added carryforwards to flags

* utf-8 encoding causing Windows CI to lag - will not support Windows TOML formatting as a result

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched logging error to info for Windows and TOML serialization

* updated schema demo notebook for latest changes
merge to pull

* moved constants to separate file

* added indent=4 for jsons in package

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added Colab instructions

* updated documentation

* namespace changed to mkt.databases; updated toml and removed poetry; added LICENSE and MANIFEST.in; added KinoML to acknowledgements

* >= packages rather than pinning exact versions; added gitpython

* changed to reflect namespace update

* updated for namespace changes; for kinase_schema, also changed field names to reflect changes in mkt.schema package

* updated notebooks for databases namespace changes

* added channels to correct yaml formatting error when installing

* added mkt.schema installation

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 changes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused import

* updated databases ci for namespace changes

* removed init file from databases tests

* added HTTPError exception handling for KLIFS and updated tests to reflect; separated KLIFS and KinCore tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* re-integrated kincore test into klifs - need it for KD indices

* not testing NCBI; 404 error currently

* commenting out carryforward for the moment

* removed reference to Poetry in getting started docs

* updated path changes and package description

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed executable mark from notebooks/schema_demo.ipynb

* fixed schema-ci badge and typo

* codecov ignoring ncbi in mkt.databases and skipping corresponding test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed git dependency from mkt.schema pyproject.toml

* added panel wildcard to prevent from using 1.6 which requires python3.13

* panel == instead of >=

* made mkt.databases a dependency of mkt.schema; databases now has a KinaseInfoGenerator that inherits from KinaseInfo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports

* pre-commit hook

* pre-commit hook fix

* updated notebooks for UniProtFASTA

* removed extraneous comments and added APIQuery superclass

* fixed get_repo_root import statement

* changed get_repo_root reference to io_utils in KinCore; updated uniprot for UniProtFASTA in tests

* 404 not throwing an exception - added a dict_kinase_info = None option in if statement to accommodate

* exception handling if no FASTA file downloaded

* commented out line in pkis2 - TODO: further updates

* fixed cache error

* added UniProtFASTA vs. UniProtJSON (in progress)

* added instructions for loading in colab

* updated databases notebook to incorporate all recent changes

* added functions to extract sequence from kincore cif and an opinionated method to adjudicate kinase domain sequence (kincore cif > kincore fasta > Pfam > None)

* corrected Pfam > pfam, if self.kincore.cf is None return None

* use extract_sequence_from_cif rather than manual

* added testing for new adjudicate functions

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils and rgetattr to mkt.schema.utils

* moved get_repo_root to mkt.schema.io_utils and removed modeling get_repo_root

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* added git to requirements

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* removed rgetattr, rsetattr, and try_except_return_none_rgetattr from mkt.databases.utils

* removed rgetattr and random_uuid from mkt.ml.utils

* TestSchema.test_utils, changed serialization to serde, altered dictionary import

* added mkt.schema.utils

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: jessicawhite <[email protected]>
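
The "opinionated method to adjudicate kinase domain sequence (kincore cif > kincore fasta > Pfam > None)" commit above describes a simple priority fallback. A minimal sketch of that ordering follows; the function name and arguments are hypothetical and are not taken from the mkt.databases code, which may differ.

```python
from typing import Optional


def adjudicate_kd_sequence(
    kincore_cif: Optional[str],
    kincore_fasta: Optional[str],
    pfam: Optional[str],
) -> Optional[str]:
    """Return the first available sequence in priority order, else None.

    Hypothetical helper: priority is KinCore CIF > KinCore FASTA > Pfam.
    """
    for candidate in (kincore_cif, kincore_fasta, pfam):
        if candidate:  # skips both None and empty strings
            return candidate
    return None
```

Passing the three candidate sequences in priority order keeps the rule in one place, so changing the preference only means reordering the tuple.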
* changed schema namespace and included json files with package rather than in data directory

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 fixes in io_utils

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated schema_demo notebook for sub-package changes

* updated schema demo notebook for latest changes

* updated KinaseInfo json files with new field names

* revised schema notebook for latest changes

* now using dict to limit the possible serialization/deserialization functions; support for json, yaml, toml

* using ConfigDict(use_enum_values=True) for Enum to enable serialization; default values for fields where None allowed to None; validating KLIFS2UniProt dicts to default to None if missing given toml doesn't save None entries

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unmarked files as executable

* removed unused Callable import for flake8 compatibility
pull to merge

* removed unused Optional import for flake8 compatibility
pull to merge

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added flags to codecov.yml

* added schema dependencies to test_env.yaml

* schema tests

* added CI yaml for schema sub-package

* updated schema notebook for increased standardization and typos

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 modifications

* added -e flag to pip install - hope to resolve 'No package metadata was found' error

* broke out dependencies specific to mkt.schema separately

* use schema_test_env.yaml instead to try to resolve 'mkt-schema' requires a different Python: 3.13.2 not in '<3.12,>=3.9'

* removed sub-package specific environments

* conformed env and ci files to match asap

* added tqdm to env and toml for schema

* fixed kinhub.name to kinhub.kinase_name

* added encoding='utf-8' to serialization function for Windows compatibility with TOML

* added carryforwards to flags

* utf-8 encoding causing Windows CI to lag - will not support Windows TOML formatting as a result

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched logging error to info for Windows and TOML serialization

* updated schema demo notebook for latest changes
merge to pull

* moved constants to separate file

* added indent=4 for jsons in package

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added Colab instructions

* updated documentation

* namespace changed to mkt.databases; updated toml and removed poetry; added LICENSE and MANIFEST.in; added KinoML to acknowledgements

* >= packages rather than pinning exact versions; added gitpython

* changed to reflect namespace update

* updated for namespace changes; for kinase_schema, also changed field names to reflect changes in mkt.schema package

* updated notebooks for databases namespace changes

* added channels to correct yaml formatting error when installing

* added mkt.schema installation

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 changes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused import

* updated databases ci for namespace changes

* removed init file from databases tests

* added HTTPError exception handling for KLIFS and updated tests to reflect; separated KLIFS and KinCore tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* re-integrated kincore test into klifs - need it for KD indices

* not testing NCBI; 404 error currently

* commenting out carryforward for the moment

* removed reference to Poetry in getting started docs

* updated path changes and package description

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed executable mark from notebooks/schema_demo.ipynb

* fixed schema-ci badge and typo

* codecov ignoring ncbi in mkt.databases and skipping corresponding test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed git dependency from mkt.schema pyproject.toml

* added panel wildcard to prevent from using 1.6 which requires python3.13

* panel == instead of >=

* made mkt.databases a dependency of mkt.schema; databases now has a KinaseInfoGenerator that inherits from KinaseInfo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports

* pre-commit hook

* pre-commit hook fix

* updated notebooks for UniProtFASTA

* removed extraneous comments and added APIQuery superclass

* fixed get_repo_root import statement

* changed get_repo_root reference to io_utils in KinCore; updated uniprot for UniProtFASTA in tests

* 404 not throwing an exception - added a dict_kinase_info = None option in if statement to accommodate

* exception handling if no FASTA file downloaded

* commented out line in pkis2 - TODO: further updates

* fixed cache error

* added UniProtFASTA vs. UniProtJSON (in progress)

* added instructions for loading in colab

* updated databases notebook to incorporate all recent changes

* added functions to extract sequence from kincore cif and an opinionated method to adjudicate kinase domain sequence (kincore cif > kincore fasta > Pfam > None)

* corrected Pfam > pfam, if self.kincore.cf is None return None

* use extract_sequence_from_cif rather than manual

* added testing for new adjudicate functions

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils and rgetattr to mkt.schema.utils

* moved get_repo_root to mkt.schema.io_utils and removed modeling get_repo_root

* moved get_repo_root to mkt.schema.io_utils

* moved get_repo_root to mkt.schema.io_utils

* added git to requirements

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* rgetattr from schema instead of databases

* removed rgetattr, rsetattr, and try_except_return_none_rgetattr from mkt.databases.utils

* removed rgetattr and random_uuid from mkt.ml.utils

* TestSchema.test_utils, changed serialization to serde, altered dictionary import

* added mkt.schema.utils

* moved gitpython out of mkt.databases and into mkt.schema

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: jessicawhite <[email protected]>
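
Several commits above move `rgetattr` into mkt.schema.utils. For readers unfamiliar with the idiom, a recursive `getattr` over a dotted path is commonly written with `functools.reduce`, as sketched below. This is an assumption about the general pattern, not a copy of the package's implementation, and the example object is made up.

```python
import functools
from types import SimpleNamespace


def rgetattr(obj, attr, *default):
    """Follow a dotted attribute path such as 'kincore.cif.sequence'.

    If a default is supplied, it is returned when any step is missing.
    """

    def _getattr(current, name):
        return getattr(current, name, *default)

    return functools.reduce(_getattr, attr.split("."), obj)


# Made-up nested object used only for illustration
kinase = SimpleNamespace(kincore=SimpleNamespace(cif=SimpleNamespace(sequence="MK")))
found = rgetattr(kinase, "kincore.cif.sequence")
missing = rgetattr(kinase, "kincore.fasta.sequence", None)
```

Without a default, a missing attribute raises `AttributeError` as plain `getattr` would.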
* removed extract_tarfiles from mkt.databases.io_utils - now in schema throughout

* black

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added requirements.txt

* removed erroneously committed VE package data

* pre-commit

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change if statement order in return_filenotfound_error_if_empty_or_missing; support conditions untar_files_in_memory like list and remove ._ files; make hgnc name key instead of uniprot id

* make hgnc name default key from mkt.databases.kinase_schema instead of uniprot id

* create_tar_without_metadata in io_utils; add tar.gz to generation script - make sure to use w:gz to tarfile argument; new KinaseInfo.tar.gz with hgnc names as keys

* interim app update

* changed schema tests to reflect hgnc_name keys

* remove file suffix from list_entries

* black

* ordered list

* trailing whitespace

* temporarily added rotation based on ABL1

* print to logging

* wireframe including sequence, structure, and property panels

* altered structure plot size

* starting to add code to combine old CIF with new coords

* added SequenceAlignment adapted from mkt.databases.plot

* commented out SequenceAlignment in generate_alignments - default to using mkt.databases.plot

* added optional flags for SequenceAlignment to allow to repurpose in app

* removed reverse comment since now optional flag

* using mkt.databases.plot version of SequenceAlignment; using PropertyTables

* added resource links, radio button for structure

* could not get plots version to display toolbar on bottom - reimplementing here for now

* broke down SequenceAlignment into smaller plots

* tried to upgrade bokeh to get plot version of SequenceAlignment class to display toolbar

* added typing to serialization function

* starting alignment algorithm function

* add generate_properties to app

* added try_except_return_none_rgetattr to mkt.databases.utils

* make obj_kinase a PropertyTables property and generate extract_properties on instantiation

* added obj_kinases as property, combined sequence generation and plotting into a single class, make the y-axis labels crimson if no sequence of a given type is found in the data

* cleaned up extraneous arguments now included in radio buttons (programming to come), adjusted display_dashboard function for changes in the generate_ scripts

* bugfixes

* updated for cif inclusion and new schema functionality

* added carryforward flag to codecov

* finalized structural annotations for phosphosites and KLIFS pocket

* removed commented out code no longer in use

* added docstrings for hardcoded resources; added KLIFS annotation to describe stick regions

* incorporating changes trying to switch CIF files to the newly aligned coordinates; will undo with next commit as have introduced an error

* reverting back to old version of KinCore

* fixed ncbi codecov ignore

* upgrade python to <3.13 in pyproject.toml files

* changed flags path structure

* moved constants to a separate file in app

* added docstrings to databases utils

* removed TODO from hinge:linker

* added KLIFS region labeling to x-axis

* move the no KinCore objects error to the beginning; try to specify full width of KinCore active structure

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
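
The commits above mention `create_tar_without_metadata`, the `w:gz` mode, and removing `._` files (macOS AppleDouble metadata entries). The sketch below shows one way to build such an archive with the standard-library `tarfile` module. The function names here are hypothetical and the real io_utils helpers may work differently, for example by reading from disk rather than from a dict.

```python
import io
import tarfile


def create_tar_gz_in_memory(files: dict[str, bytes]) -> bytes:
    """Build a gzip-compressed tar from name -> content, skipping '._' files."""
    buffer = io.BytesIO()
    # "w:gz" writes a gzip-compressed archive; plain "w" would be uncompressed
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name in sorted(files):
            if name.split("/")[-1].startswith("._"):
                continue  # drop macOS resource-fork companions
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # strip timestamp metadata
            info.uid = info.gid = 0  # strip owner metadata
            tar.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def list_tar_entries(archive: bytes) -> list[str]:
    """Return the sorted member names of a gzip-compressed tar held in memory."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return sorted(tar.getnames())


archive = create_tar_gz_in_memory({"ABL1.json": b"{}", "._ABL1.json": b"junk"})
entries = list_tar_entries(archive)
```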
* removed extract_tarfiles from mkt.databases.io_utils - now in schema throughout

* black

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added requirements.txt

* removed erroneously committed VE package data

* pre-commit

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change if statement order in return_filenotfound_error_if_empty_or_missing; support conditions untar_files_in_memory like list and remove ._ files; make hgnc name key instead of uniprot id

* make hgnc name default key from mkt.databases.kinase_schema instead of uniprot id

* create_tar_without_metadata in io_utils; add tar.gz to generation script - make sure to use w:gz to tarfile argument; new KinaseInfo.tar.gz with hgnc names as keys

* interim app update

* changed schema tests to reflect hgnc_name keys

* remove file suffix from list_entries

* black

* ordered list

* trailing whitespace

* temporarily added rotation based on ABL1

* print to logging

* wireframe including sequence, structure, and property panels

* altered structure plot size

* starting to add code to combine old CIF with new coords

* added SequenceAlignment adapted from mkt.databases.plot

* commented out SequenceAlignment in generate_alignments - default to using mkt.databases.plot

* added optional flags for SequenceAlignment to allow to repurpose in app

* removed reverse comment since now optional flag

* using mkt.databases.plot version of SequenceAlignment; using PropertyTables

* added resource links, radio button for structure

* could not get plots version to display toolbar on bottom - reimplementing here for now

* broke down SequenceAlignment into smaller plots

* tried to upgrade bokeh to get plot version of SequenceAlignment class to display toolbar

* added typing to serialization function

* starting alignment algorithm function

* add generate_properties to app

* added try_except_return_none_rgetattr to mkt.databases.utils

* make obj_kinase a PropertyTables property and generate extract_properties on instantiation

* added obj_kinases as property, combined sequence generation and plotting into a single class, make the y-axis labels crimson if no sequence of a given type is found in the data

* cleaned up extraneous arguments now included in radio buttons (programming to come), adjusted display_dashboard function for changes in the generate_ scripts

* bugfixes

* updated for cif inclusion and new schema functionality

* added carryforward flag to codecov

* finalized structural annotations for phosphosites and KLIFS pocket

* removed commented out code no longer in use

* added docstrings for hardcoded resources; added KLIFS annotation to describe stick regions

* incorporating changes trying to switch CIF files to the newly aligned coordinates; will undo with next commit as have introduced an error

* reverting back to old version of KinCore

* fixed ncbi codecov ignore

* upgrade python to <3.13 in pyproject.toml files

* changed flags path structure

* moved constants to a separate file in app

* added docstrings to databases utils

* removed TODO from hinge:linker

* added KLIFS region labeling to x-axis

* move the no KinCore objects error to the beginning; try to specify full width of KinCore active structure

* pinning the package versions that are working locally

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* removed extract_tarfiles from mkt.databases.io_utils - now in schema throughout

* black

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added requirements.txt

* removed erroneously committed VE package data

* pre-commit

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change if statement order in return_filenotfound_error_if_empty_or_missing; support conditions untar_files_in_memory like list and remove ._ files; make hgnc name key instead of uniprot id

* make hgnc name default key from mkt.databases.kinase_schema instead of uniprot id

* create_tar_without_metadata in io_utils; add tar.gz to generation script - make sure to use w:gz to tarfile argument; new KinaseInfo.tar.gz with hgnc names as keys

* interim app update

* changed schema tests to reflect hgnc_name keys

* remove file suffix from list_entries

* black

* ordered list

* trailing whitespace

* temporarily added rotation based on ABL1

* print to logging

* wireframe including sequence, structure, and property panels

* altered structure plot size

* starting to add code to combine old CIF with new coords

* added SequenceAlignment adapted from mkt.databases.plot

* commented out SequenceAlignment in generate_alignments - default to using mkt.databases.plot

* added optional flags for SequenceAlignment to allow to repurpose in app

* removed reverse comment since now optional flag

* using mkt.databases.plot version of SequenceAlignment; using PropertyTables

* added resource links, radio button for structure

* could not get plots version to display toolbar on bottom - reimplementing here for now

* broke down SequenceAlignment into smaller plots

* tried to upgrade bokeh to get plot version of SequenceAlignment class to display toolbar

* added typing to serialization function

* starting alignment algorithm function

* add generate_properties to app

* added try_except_return_none_rgetattr to mkt.databases.utils

* make obj_kinase a PropertyTables property and generate extract_properties on instantiation

* added obj_kinases as property, combined sequence generation and plotting into a single class, make the y-axis labels crimson if no sequence of a given type is found in the data

* cleaned up extraneous arguments now included in radio buttons (programming to come), adjusted display_dashboard function for changes in the generate_ scripts

* bugfixes

* updated for cif inclusion and new schema functionality

* added carryforward flag to codecov

* finalized structural annotations for phosphosites and KLIFS pocket

* removed commented out code no longer in use

* added docstrings for hardcoded resources; added KLIFS annotation to describe stick regions

* incorporating changes trying to switch CIF files to the newly aligned coordinates; will undo with next commit as have introduced an error

* reverting back to old version of KinCore

* fixed ncbi codecov ignore

* upgrade python to <3.13 in pyproject.toml files

* changed flags path structure

* moved constants to a separate file in app

* added docstrings to databases utils

* removed TODO from hinge:linker

* added KLIFS region labeling to x-axis

* move the no KinCore objects error to the beginning; try to specify full width of KinCore active structure

* pinning the package versions that are working locally

* fixed bug by sorting list_intersect in _generate_highlight_idx before using

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* removed src folder and put latest ESM2 analysis in outer dir

* initial package infrastructure

* moved previous esm2 analysis to an alternative outer folder

* preliminary mkt.ml components

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 adjustments
rebase to pull

* updated after rebase

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Schema (#83)

* changed schema namespace and included json files with package rather than in data directory

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 fixes in io_utils

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated schema_demo notebook for sub-package changes

* updated schema demo notebook for latest changes

* updated KinaseInfo json files with new field names

* revised schema notebook for latest changes

* now using dict to limit the possible serialization/deserialization functions; support for json, yaml, toml

* using ConfigDict(use_enum_values=True) for Enum to enable serialization; default values for fields where None allowed to None; validating KLIFS2UniProt dicts to default to None if missing given toml doesn't save None entries

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unmarked files as executable

* removed unused Callable import for flake8 compatibility
pull to merge

* removed unused Optional import for flake8 compatibility
pull to merge

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added flags to codecov.yml

* added schema dependencies to test_env.yaml

* schema tests

* added CI yaml for schema sub-package

* updated schema notebook for increased standardization and typos

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 modifications

* added -e flag to pip install - hope to resolve 'No package metadata was found' error

* broke out dependencies specific to mkt.schema separately

* use schema_test_env.yaml instead to try to resolve 'mkt-schema' requires a different Python: 3.13.2 not in '<3.12,>=3.9'

* removed sub-package specific environments

* conformed env and ci files to match asap

* added tqdm to env and toml for schema

* fixed kinhub.name to kinhub.kinase_name

* added encoding='utf-8' to serialization function for Windows compatibility with TOML

* added carryforwards to flags

* utf-8 encoding causing Windows CI to lag - will not support Windows TOML formatting as a result

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched logging error to info for Windows and TOML serialization

* updated schema demo notebook for latest changes
merge to pull

* moved constants to separate file

* added indent=4 for jsons in package

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added Colab instructions

* updated documentation

* namespace changed to mkt.databases; updated toml and removed poetry; added LICENSE and MANIFEST.in; added KinoML to acknowledgements

* >= packages rather than pinning exact versions; added gitpython

* changed to reflect namespace update

* updated for namespace changes; for kinase_schema, also changed field names to reflect changes in mkt.schema package

* updated notebooks for databases namespace changes

* added channels to correct yaml formatting error when installing

* added mkt.schema installation

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 changes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused import

* updated databases ci for namespace changes

* removed init file from databases tests

* added HTTPError exception handling for KLIFS and updated tests to reflect; separated KLIFS and KinCore tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* re-integrated kincore test into klifs - need it for KD indices

* not testing NCBI; 404 error currently

* commenting out carryforward for the moment

* removed reference to Poetry in getting started docs

* updated path changes and package description

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed executable mark from notebooks/schema_demo.ipynb

* fixed schema-ci badge and typo

* codecov ignoring ncbi in mkt.databases and skipping corresponding test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed git dependency from mkt.schema pyproject.toml

* added panel wildcard to prevent from using 1.6 which requires python3.13

* panel == instead of >=

* made mkt.databases a dependency of mkt.schema; databases now has a KinaseInfoGenerator that inherits from KinaseInfo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports

* pre-commit hook

* pre-commit hook fix

* updated notebooks for UniProtFASTA

* removed extraneous comments and added APIQuery superclass

* fixed get_repo_root import statement

* changed get_repo_root reference to io_utils in KinCore; updated uniprot for UniProtFASTA in tests

* 404 not throwing an exception - added a dict_kinase_info = None option in if statement to accommodate

* exception handling if no FASTA file downloaded

* commented out line in pkis2 - TODO: further updates

* fixed cache error

* added UniProtFASTA vs. UniProtJSON (in progress)

* added instructions for loading in colab

* updated databases notebook to incorporate all recent changes
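
The commits above mention HTTPError exception handling for KLIFS and a `dict_kinase_info = None` option for the 404 case. A minimal sketch of that pattern follows; it is not the actual mkt.databases code. `query_or_none` and `fake_klifs_query` are hypothetical names, and the standard library's `urllib.error.HTTPError` stands in for whatever exception type the real client raises.

```python
from urllib.error import HTTPError


def query_or_none(fetch, *args, **kwargs):
    """Call fetch and return its result, or None if it raises HTTPError."""
    try:
        return fetch(*args, **kwargs)
    except HTTPError as err:
        # Report the failure instead of letting it propagate to the caller.
        print(f"Query failed with HTTP {err.code}: {err.reason}")
        return None


def fake_klifs_query(kinase_name):
    """Hypothetical stand-in for a remote call; always fails with a 404."""
    raise HTTPError("https://example.org", 404, "Not Found", None, None)


dict_kinase_info = query_or_none(fake_klifs_query, "ABL1")
if dict_kinase_info is None:
    # Callers check for None explicitly rather than assuming a populated dict.
    print("No kinase info returned; skipping downstream parsing.")
```

Returning `None` keeps the failure explicit at the call site, which is presumably why the `if` statement needed the extra option.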

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: jessicawhite <[email protected]>

* updated for flake8
rebase to merge

* ignoring unused imports for the moment

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* latest models

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* flake8 updates

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* plotting for kinase embeddings - for now kinase group generation is hardcoded

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* models flake8 fixes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* moved clustering and plotting to separate modules

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* plotting working, added cli

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update in wip model

* added TDC

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched from pkg_resources to importlib since fully deprecated in 3.12

* 4/18 update

* Update requirements.yaml (#114)

Try pinning Python version (python >=3.10,<3.11)

* Update conf.py (#115)

Fixed import statements

* Update index.rst (#116)

Increased max depth from 3 to 5

* Update conf.py (#117)

changing sys.path

* Update api.rst (#118)

Using sub-modules in API

* Update api.rst (#119)

Removing api from submodules

* Update api.rst (#120)

Removing sub-modules

* Update index.rst (#121)

Changed max depth to 6

* reproducible conversion of PKIS accession to UniProt ID

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* script to generate harmonized PKIS2 data

* added get_repo_root to utils

* updated pkis2 annotated file for latest reconciliation

* created a FineTuneDataset class to load data from csv files, split on kinase group, transform data with StandardScaler, and tokenize; also created a more specific PKIS2Dataset class

* training scripts and initial pooling model

* added run_trainer as a CLI

* log_config for mkt.ml

* clustering module

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* keep only latest

* uncomment run_pipeline_with_wandb for pre-commit CI

* commenting out all of models file - no longer in use but preserve for posterity

* pre-commit ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated typo in PKIS2Dataset import statement; specified scaler object in dataset_pkis2

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* global ordering drug before kinase; more efficient max_length calc; revised Dataset generation

* global drug before kinase

* fixed collate_fn in dataloader; global drug before kinase; fixed model arguments

* run interactively line by line to debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* finished merge

* commented out device

* commented out device import

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* global reorder drug before kinase

* commented out interactive lines

* sample-wise dot product instead of matrix multiplication

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstrings

* make wandb a bool param rather than separate function; make training logging a moving average instead

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed reference to wandb trainer function

* log eval to wandb; val loss every 1000 steps + end of epoch; plot val stats real-time; keep only best 5 models

* added entity name option to setup_wandb

* added entity name option to trainer

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports and variables from trainer

* updated trainer to better log validation data

* added separate freeze arguments for drug and kinase model

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* preliminary config logic

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more info on supported models

* added ChemBERTa

* provided more detail on how to implement additional model support

* have made all models Enum/StrEnum for validation purposes

* using ABC and abstract method for split; added support for CV

* support for ABC finetuning module so separate cross-fold and splits

* set_seed

* self.seed = self.config[seed]

* separate bool_freeze into drug and kinase options

* removed extraneous import

* update comments

* added ExperimentFactory support; TODO - pass to run_pipeline_with_wandb and trainer configs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused field

* added dict_trainer_configs return and informative model name

* conform run_pipeline_with_wandb args to train_model args; added plot dir argument throughout; kwargs from dict in run_pipeline_with_wandb; remove default args from train_model to prevent unintentional overwriting

* logging > logger

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* pre-commit CI fixes: unused imports, dup plot_dir, undefined bool_wandb and eval_results

* removed previous defaultdict architecture

* added support to batch cross-validation scripts

* removed drug and kinase default models, added fold_idx logic to allow for batching cross-validation folds

* added fold_idx to PKIS2CrossValidation

* added dict_trainer_configs to None output, typing and docstrings; improved model naming convention; convert learning_rate to float

* utils_trainer to create SLURM scripts for cross-validation

* make fold sub-directory and cd for purposes of output

* make train_test/datetime sub-directory for train_test split

* log config file as json

* train_step * 10 global steps, import json

* changed script_dir/out_dir configuration, removed non-existent import

* moved the directory generation step to batch_submit_folds so no datetime difference in submitted jobs

* corrected description

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused imports and fstring for pre-commit CI

* fixed wandb.Images logging error; reverted back to raw global loss for train steps

* removed unused, commented out arguments in batch_submit_folds

* added instructions about running on cluster

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added details in configs about what needs to be updated

* added environment.yml

* added environment set-up instructions in README

* removed pip installed packages from env yml - pip install toml for dependencies instead

* use dict for class-specific kwargs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed trailing whitespace

* added tdc support for davis

* confirmed dict formulation worked and removed commented out args

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* created KinaseGroupSource

* standardized docstrings in FineTuneDataset

* added rgetattr with exception handling

* removed unused imports

* mkt.ml.datasets.process module to create datasets

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add kincore kd columns

* harmonized config and process classes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused dataclass imports

* add source column at the end

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unnecessary comments, add Davis TODO

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added generate_datasets to cli for ml

* convert Davis y to pKd in micromolar before z-score conversion so higher values denote more potent

* comment out drop na since now do in cli more rigorously

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* switched flag from drop to keep

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* generate pseudo random uuids

* add uuids at the level of the original data processing

* update nf pipeline to reflect uuid added at the level of the data generation

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use random_uuid instead of uuid4

* added attribution

* remove unused uuid import

* clarified comments

* create a BaseCombinedModel that loads models from pretrained and compute_similarity; AbstractTransformModel now has transform_drug and transform_kinase along with forward pass where the transform methods are abstract and instantiated elsewhere (e.g., pooling vs. attention)

* instantiate transformation functions and layer names at the level of CombinedPoolingModel

* KD TODO clarified

* use adjudicate_kd_sequence to annotate

* change PKIS to % inhibition; add KD column algorithm using adjudicate KD sequence

* import rgetattr from mkt.schema instead of mkt.databases

* simplified adjudicate_group for readability

* KinaseKDSequenceSource TODO

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
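
One commit above converts Davis `y` to pKd in micromolar before z-score conversion so that higher values denote more potent binding. A rough sketch of that arithmetic, assuming the raw values are Kd in nanomolar (an assumption; the commit does not state the input unit). The helper names `kd_nm_to_pkd_um` and `z_score` are hypothetical and are not taken from mkt.ml.

```python
import math
import statistics


def kd_nm_to_pkd_um(kd_nm):
    """Return -log10 of Kd expressed in micromolar (input assumed nanomolar)."""
    return -math.log10(kd_nm / 1000.0)


def z_score(values):
    """Standardize to zero mean and unit variance using the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Illustrative values only, not real Davis measurements.
kd_nm = [1.0, 100.0, 10000.0]
pkd = [kd_nm_to_pkd_um(k) for k in kd_nm]
scaled = z_score(pkd)

# The tightest binder (smallest Kd) ends up with the largest scaled value.
for k, p, z in zip(kd_nm, pkd, scaled):
    print(f"Kd={k:>8.1f} nM  pKd(uM)={p:+.2f}  z={z:+.2f}")
```

The negative log flips the ordering, so a smaller Kd maps to a larger value before scaling. `statistics.pstdev` is used because scikit-learn's `StandardScaler` also scales by the population standard deviation.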

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: jessicawhite <[email protected]>
… ipython, jupyter, and mol2grid to requirements
@codecov

codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 82.35294% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.48%. Comparing base (138b08a) to head (69bb125).
⚠️ Report is 16 commits behind head on main.

Additional details and impacted files
| Flag | Coverage Δ | |
|---|---|---|
| databases | 100.00% <ø> (ø) | |
| schema | 90.43% <82.35%> (-0.95%) | ⬇️ |

2 participants