While going through the codebase, I noticed that the main function of `stageA2_mbm_finetune.py` fine-tunes the model on the test set. I think this could introduce data leakage and lead to overly optimistic performance evaluations, right? Below is the section of code I'm referring to:
# create dataset and dataloader
if config.dataset == "GOD":
    _, test_set = create_Kamitani_dataset(
        ...
    )
elif config.dataset == "BOLD5000":
    _, test_set = create_BOLD5000_dataset(
        ...
    )
else:
    raise NotImplementedError
# ... later in the code
print("Finetuning MAE on test fMRI ... ...")
for ep in range(config.num_epoch):
    ...
    cor = train_one_epoch(
        model, dataloader_hcp, optimizer, device, ep, loss_scaler, logger, config, start_time, model_without_ddp
    )
Hello authors, the excerpt above comes from:
https://github.com/zjc062/mind-vis/blob/main/code/stageA2_mbm_finetune.py#L125
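To make the concern concrete, here is a minimal sketch of the kind of split I would have expected: the samples used for fine-tuning and the samples used for evaluation are kept disjoint. This is not code from the repository. `split_indices`, `holdout_fraction`, and `seed` are hypothetical names of my own, and I am assuming the samples can simply be addressed by integer index.

```python
import random


def split_indices(n_samples, holdout_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into two disjoint lists.

    Hypothetical helper, not part of the repository. Returns
    (finetune_idx, eval_idx); the two lists never share an index.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples to split")
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")

    indices = list(range(n_samples))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)

    # Reserve at least one sample for evaluation and at least one for fine-tuning.
    n_holdout = min(n_samples - 1, max(1, round(n_samples * holdout_fraction)))
    eval_idx = indices[:n_holdout]
    finetune_idx = indices[n_holdout:]
    return finetune_idx, eval_idx


finetune_idx, eval_idx = split_indices(100, holdout_fraction=0.2, seed=0)

# The fine-tuning loop should only ever see finetune_idx;
# reported metrics should only ever be computed on eval_idx.
assert set(finetune_idx).isdisjoint(eval_idx)
```

With a split like this, only the fine-tuning subset would be passed to the training loop, and the held-out subset would be touched only when computing the reported metrics.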