HMBVIP: Hierarchical Multi-Bio-View Drug-Target Interaction Intelligent Prediction Networks

Figure 1. Structure of HMBVIP. The model comprises two core modules: (1) the Data Initialization Module (DIM), which processes drug and protein sequences, and (2) the Hierarchical Multi-Bio-View (HMBV) module, which extracts drug-target features from multiple biological views. The HMBV module consists of three components: Drug-Target Sequence Neural Networks (DTSNN), Drug-Target High-Order Graph Attention (DTHOGAT), and the Fusion Decision Module (FDM).

Conda Environment Setup

conda create --name hmbvip --file ./requirements.txt
conda activate hmbvip

Datasets

This project uses three public drug-target binding datasets: DAVIS, KIBA, and BindingDB.

DAVIS Dataset

Full Name: Davis Kinase Binding Affinity Dataset
Source: Davis et al., Nature Biotechnology, 2011.
Contents: Contains binding affinity data between 68 small-molecule kinase inhibitors and 442 protein kinases.
Measurement Metric: Kd (dissociation constant, in nM), often converted to pKd = -log10(Kd/1e9) for modeling purposes.
Characteristics:
- The dataset is dense, meaning that most drug–target pairs have measured affinities.
- Commonly used in deep learning models (e.g., DeepDTA, GraphDTA, MolTrans) to evaluate predictive accuracy on known drug-target interactions.
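The Kd-to-pKd conversion given under Measurement Metric can be sketched as follows. This is a minimal illustration rather than code from the repository; the function name `kd_nm_to_pkd` is hypothetical.

```python
import math


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convert a dissociation constant in nM to pKd = -log10(Kd / 1e9)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    # Dividing by 1e9 converts nM to M before taking the negative log.
    return -math.log10(kd_nm / 1e9)


if __name__ == "__main__":
    # Tighter binding (smaller Kd) yields a larger pKd.
    for kd in (1.0, 100.0, 10000.0):
        print(f"Kd = {kd:>8.1f} nM -> pKd = {kd_nm_to_pkd(kd):.2f}")
```

A Kd of 10,000 nM corresponds to a pKd of 5, and every tenfold decrease in Kd raises pKd by 1.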

KIBA Dataset

Full Name: Kinase Inhibitor BioActivity Dataset
Source: Integrated from multiple sources (e.g., ChEMBL and STITCH) by Tang et al. (2014); the filtered version introduced by He et al. (2017) is widely used by models such as DeepDTA.
Contents: The commonly used filtered benchmark includes affinity data for 229 proteins and roughly 2,100 small molecules.
Measurement Metric: KIBA score, a normalized fusion of multiple activity types (e.g., IC50, Ki, Kd). In the transformed scores used by benchmarks such as DeepDTA, higher values indicate stronger affinity.
Characteristics:
- Integrates diverse bioactivity data into a unified scoring system.
- Larger in scale but more sparse than DAVIS.
- Commonly used to test the generalization ability of deep learning models.

BindingDB

Full Name: Binding Database
Website: https://www.bindingdb.org
Contents: A public experimental database of biomolecular binding affinities between chemical compounds and protein targets (mainly human proteins), including Ki, Kd, IC50, and more.
Data Size: Over 2 million binding records covering tens of thousands of proteins and hundreds of thousands of compounds.
Characteristics:
- Open-access and continuously updated, suitable for real-world large-scale modeling.
- Data come from diverse sources (literature, patents, etc.), so quality may vary and data cleaning is often necessary.
- Frequently used as a raw data source for building training datasets in platforms like BioChem and DeepPurpose.

Comparison Summary:

Feature	DAVIS	KIBA	BindingDB
Data Type	Experimental Kd	Fused KIBA Score	Experimental Kd/Ki/IC50
Target Count	442	229 (filtered benchmark)	Tens of thousands
Drug Count	68	~2,100 (filtered benchmark)	Hundreds of thousands
Data Density	High (Dense)	Medium (Sparse)	Very Sparse
Use Case	Precision modeling	Generalization testing	Real-world large-scale modeling
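The Data Density row can be made concrete as the fraction of the drug x target matrix that has a measured value. The sketch below is an illustration on made-up toy pairs, not repository code; the function name `interaction_density` and the identifiers `D1`, `T1`, and so on are hypothetical.

```python
def interaction_density(pairs):
    """Return the fraction of the drug x target matrix covered by measured pairs."""
    unique_pairs = set(pairs)
    if not unique_pairs:
        return 0.0
    drugs = {drug for drug, _ in unique_pairs}
    targets = {target for _, target in unique_pairs}
    # Density = measured pairs / all possible drug-target combinations.
    return len(unique_pairs) / (len(drugs) * len(targets))


# Toy example: 3 measured pairs out of 2 drugs x 2 targets = 4 possible pairs.
toy_pairs = [("D1", "T1"), ("D1", "T2"), ("D2", "T1")]
print(interaction_density(toy_pairs))  # 0.75
```

A fully measured matrix gives 1.0, while a dataset in which each compound is tested against only a few targets gives a value close to 0.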

Load Dataset

from tdc.multi_pred import DTI

# Load each benchmark through the Therapeutics Data Commons (PyTDC) package.
davis = DTI(name='DAVIS')
kiba = DTI(name='KIBA')
bindingdb = DTI(name='BindingDB_Kd')

Train

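import datetime
from loguru import logger  # assumption: the logger used here is loguru's

# check_dir, load, ipnet_train, base_path, and name are expected to come from the project's code.
# Create the output folders for metrics, logs, checkpoints, and board output.
check_dir(base_path + "output/csv/")
check_dir(base_path + "output/log/")
check_dir(base_path + "output/pt/")
check_dir(base_path + "output/board/")
dataset_name = name  # e.g. "DAVIS", "KIBA", or "BindingDB_Kd"
# Log this run to a dated file, then detach the handler once training ends.
log_file = logger.add(f"{base_path}output/log/IPNet-{dataset_name}-{datetime.date.today()}.log")
df_split = load(name=dataset_name)
model = ipnet_train(df_split, dataset_name=dataset_name)
logger.remove(log_file)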

output:

IPNet-DAVIS.pt  IPNet-Graph-BindingDB_Kd.pt  IPNet-Graph-DAVIS.pt  IPNet-KIBA.pt  IPNet-Seq-DAVIS.pt
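The Train and Test snippets call `check_dir` before writing any output. Its actual definition lives in the repository and is not shown here; the sketch below is one plausible implementation, assuming it only needs to create the folder when it is missing. The temporary directory stands in for the project's `base_path`.

```python
import os
import tempfile


def check_dir(path: str) -> str:
    """Create `path` (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes repeated calls harmless.
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # A temporary directory is used here instead of the real base_path.
    with tempfile.TemporaryDirectory() as tmp:
        base_path = tmp + os.sep
        for sub in ("csv", "log", "pt", "board"):
            check_dir(base_path + f"output/{sub}/")
        print(sorted(os.listdir(os.path.join(tmp, "output"))))
```

Because the call is idempotent, running training and testing back to back does not fail when the folders already exist.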

Test

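import datetime
from loguru import logger  # assumption: the logger used here is loguru's

# check_dir, load, ipnet_test, base_path, and name are expected to come from the project's code.
check_dir(base_path + "output/csv/")
check_dir(base_path + "output/log/")
check_dir(base_path + "output/pt/")
check_dir(base_path + "output/board/")
dataset_name = name  # e.g. "DAVIS", "KIBA", or "BindingDB_Kd"
# Log this run to a dated file, then detach the handler once testing ends.
log_file = logger.add(f"{base_path}output/log/IPNet-{dataset_name}-{datetime.date.today()}.log")
df_split = load(name=dataset_name)
ipnet_test(df_split, dataset_name=dataset_name)
logger.remove(log_file)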

output:

drwxrwxr-x 2 yang yang     4096 Sep 12  2024 ./
drwxrwxr-x 6 yang yang     4096 Aug 30  2024 ../
-rw-rw-r-- 1 yang yang      186 Aug 29  2024 IPNet-DAVIS_Metrics_2024-08-29_12_48_23.csv
-rw-rw-r-- 1 yang yang      187 Aug 29  2024 IPNet-DAVIS_Metrics_2024-08-29_12_55_33.csv
-rw-rw-r-- 1 yang yang      190 Aug 30  2024 IPNet-DAVIS_Metrics_2024-08-30_02_45_12.csv
-rw-rw-r-- 1 yang yang      102 Aug 31  2024 IPNet-DAVIS_Metrics_2024-08-31_10_02_46.csv
-rw-rw-r-- 1 yang yang       88 Sep  2  2024 IPNet-KIBA_Metrics_2024-09-02_03_21_11.csv
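Each test run writes a new timestamped metrics file, so several files can accumulate per dataset. The sketch below picks the most recent file for each dataset by parsing the file names shown in the listing. It is an illustration only; the helper name `latest_metrics` is hypothetical, and the file-name pattern is inferred from the listing rather than taken from the repository's code.

```python
import re
from datetime import datetime

# Pattern inferred from names such as IPNet-DAVIS_Metrics_2024-08-29_12_48_23.csv
PATTERN = re.compile(
    r"^IPNet-(?P<dataset>.+)_Metrics_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}_\d{2}_\d{2})\.csv$"
)


def latest_metrics(filenames):
    """Map each dataset name to its most recent metrics file name."""
    latest = {}
    for fname in filenames:
        match = PATTERN.match(fname)
        if match is None:
            continue  # skip files that do not follow the pattern
        stamp = datetime.strptime(match["stamp"], "%Y-%m-%d_%H_%M_%S")
        dataset = match["dataset"]
        if dataset not in latest or stamp > latest[dataset][0]:
            latest[dataset] = (stamp, fname)
    return {dataset: fname for dataset, (_, fname) in latest.items()}


files = [
    "IPNet-DAVIS_Metrics_2024-08-29_12_48_23.csv",
    "IPNet-DAVIS_Metrics_2024-08-29_12_55_33.csv",
    "IPNet-DAVIS_Metrics_2024-08-30_02_45_12.csv",
    "IPNet-DAVIS_Metrics_2024-08-31_10_02_46.csv",
    "IPNet-KIBA_Metrics_2024-09-02_03_21_11.csv",
]
print(latest_metrics(files))
```

For the listing above, this selects the 2024-08-31 file for DAVIS and the single 2024-09-02 file for KIBA.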

Repository: AGI-FBHC/IPNET
