Reproduce Results #82

@kalifadan

Description

Hey, I'm trying to reproduce the results on the EC and Thermostability tasks with the configs below, but I'm getting lower numbers (for example, 0.712 on Thermostability and 0.866 on EC). What could be the cause? Is the number of epochs too large?
Thank you!!

EC:

```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: EC
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_annotation_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True
    anno_type: EC

  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false

  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01

  save_path: weights/EC/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_annotation_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8

  train_lmdb: LMDB/EC/AF2/foldseek/train
  valid_lmdb: LMDB/EC/AF2/foldseek/valid
  test_lmdb: LMDB/EC/AF2/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    plddt_threshold: 70

Trainer:
  max_epochs: 100
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.1
  accelerator: gpu
  devices: 4  # 8
  num_nodes: 1
  accumulate_grad_batches: 4  # 1
  precision: 16
  num_sanity_val_steps: 0
```
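If the commented-out values (`# 8`, `# 1`) are the reference 8-GPU settings, the 4-GPU config above compensates for the halved device count and halved per-device batch size through `accumulate_grad_batches`, so the global batch size should be unchanged. A quick sanity check of that arithmetic (only the numbers from the config are used; this is not SaProt code):

```python
def effective_batch(per_device: int, devices: int, accum_steps: int) -> int:
    """Global batch size = per-device batch * GPUs * gradient-accumulation steps."""
    return per_device * devices * accum_steps

# Current 4-GPU run vs. the commented-out 8-GPU reference settings.
current = effective_batch(4, 4, 4)
reference = effective_batch(8, 8, 1)
print(current, reference)  # 64 64 -- global batch size matches
```

So the effective batch size itself is unlikely to explain the gap, though learning-rate scaling and data ordering can still differ between the two setups.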

Thermostability:
```yaml
setting:
  seed: 20000812
  os_environ:
    WANDB_API_KEY: ~
    WANDB_RUN_ID: ~
    CUDA_VISIBLE_DEVICES: 0,1,2,3  # ,4,5,6,7
    MASTER_ADDR: localhost
    MASTER_PORT: 12315
    WORLD_SIZE: 1
    NODE_RANK: 0
  wandb_config:
    project: Thermostability
    name: SaProt_650M_AF2

model:
  model_py_path: saprot/saprot_regression_model
  kwargs:
    config_path: weights/PLMs/SaProt_650M_AF2
    load_pretrained: True

  lr_scheduler_kwargs:
    last_epoch: -1
    init_lr: 2.0e-5
    on_use: false

  optimizer_kwargs:
    betas: [0.9, 0.98]
    weight_decay: 0.01

  save_path: weights/Thermostability/SaProt_650M_AF2.pt

dataset:
  dataset_py_path: saprot/saprot_regression_dataset
  dataloader_kwargs:
    batch_size: 4  # 8
    num_workers: 4  # 8

  train_lmdb: LMDB/Thermostability/foldseek/train
  valid_lmdb: LMDB/Thermostability/foldseek/valid
  test_lmdb: LMDB/Thermostability/foldseek/test
  kwargs:
    tokenizer: weights/PLMs/SaProt_650M_AF2
    mix_max_norm: [40, 67]
    plddt_threshold: 70

Trainer:
  max_epochs: 200
  log_every_n_steps: 1
  strategy:
    find_unused_parameters: True
  logger: True
  enable_checkpointing: false
  val_check_interval: 0.5
  accelerator: gpu
  devices: 4
  num_nodes: 1
  accumulate_grad_batches: 8
  precision: 16
  num_sanity_val_steps: 0
```
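The `mix_max_norm: [40, 67]` entry presumably min-max scales the regression targets into [0, 1] before training. A minimal sketch of that transform and its inverse, assuming this interpretation of the key (the bounds come from the config; the behavior has not been verified against the SaProt dataset code):

```python
LOW, HIGH = 40.0, 67.0  # bounds from mix_max_norm in the config above

def normalize(target: float) -> float:
    """Min-max scale a raw regression target into [0, 1]."""
    return (target - LOW) / (HIGH - LOW)

def denormalize(scaled: float) -> float:
    """Invert the scaling to report predictions in original units."""
    return scaled * (HIGH - LOW) + LOW

print(normalize(40.0), normalize(67.0))  # 0.0 1.0
print(denormalize(normalize(53.5)))      # 53.5
```

Note that rank-based metrics such as Spearman correlation are unaffected by this monotonic rescaling, so a mismatch here would show up in absolute-error metrics rather than in correlation.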
