Reproduce Results #82

@kalifadan

Description

Hey, I'm trying to reproduce the results on the EC and Thermostability tasks with the configs below, but I'm getting lower numbers (for example, 0.712 on Thermostability and 0.866 on EC). What could be the cause? Is the number of epochs too large?
Thank you!!

EC:

```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: EC
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_annotation_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True
    anno_type: EC

  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false

  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01

  save_path: weights/EC/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_annotation_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8

  train_lmdb: LMDB/EC/AF2/foldseek/train
  valid_lmdb: LMDB/EC/AF2/foldseek/valid
  test_lmdb: LMDB/EC/AF2/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    plddt_threshold: 70

Trainer:
  max_epochs: 100
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.1
  accelerator: gpu
  devices: 4  # 8
  num_nodes: 1
  accumulate_grad_batches: 4  # 1
  precision: 16
  num_sanity_val_steps: 0
```
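If the commented-out values (`# 8`, `# 1`) are the reference 8-GPU settings, the 4-GPU config above compensates for the halved device count and halved per-device batch size through `accumulate_grad_batches`, so the global batch size should be unchanged. A quick sanity check of that arithmetic (only the numbers from the config are used; this is not SaProt code):

```python
def effective_batch(per_device: int, devices: int, accum_steps: int) -> int:
    """Global batch size = per-device batch * GPUs * gradient-accumulation steps."""
    return per_device * devices * accum_steps

# Current 4-GPU run vs. the commented-out 8-GPU reference settings.
current = effective_batch(4, 4, 4)
reference = effective_batch(8, 8, 1)
print(current, reference)  # 64 64 -- global batch size matches
```

So the effective batch size itself is unlikely to explain the gap, though learning-rate scaling and data ordering can still differ between the two setups.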

Thermostability:
```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: Thermostability
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_regression_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True

  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false

  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01

  save_path: weights/Thermostability/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_regression_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8

  train_lmdb: LMDB/Thermostability/foldseek/train
  valid_lmdb: LMDB/Thermostability/foldseek/valid
  test_lmdb: LMDB/Thermostability/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    mix_max_norm: [40, 67]
    plddt_threshold: 70

Trainer:
  max_epochs: 200
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.5
  accelerator: gpu
  devices: 4
  num_nodes: 1
  accumulate_grad_batches: 8
  precision: 16
  num_sanity_val_steps: 0
```
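The `mix_max_norm: [40, 67]` entry presumably min-max scales the regression targets into [0, 1] before training. A minimal sketch of that transform and its inverse, assuming this interpretation of the key (the bounds come from the config; the behavior has not been verified against the SaProt dataset code):

```python
LOW, HIGH = 40.0, 67.0  # bounds from mix_max_norm in the config above

def normalize(target: float) -> float:
    """Min-max scale a raw regression target into [0, 1]."""
    return (target - LOW) / (HIGH - LOW)

def denormalize(scaled: float) -> float:
    """Invert the scaling to report predictions in original units."""
    return scaled * (HIGH - LOW) + LOW

print(normalize(40.0), normalize(67.0))  # 0.0 1.0
print(denormalize(normalize(53.5)))      # 53.5
```

Note that rank-based metrics such as Spearman correlation are unaffected by this monotonic rescaling, so a mismatch here would show up in absolute-error metrics rather than in correlation.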
