
Reproducing CEDR-KNRM results on ANTIQUE #20

@stepgazaille

Hello,
I'm trying to reproduce results from the OpenNIR paper using the Vanilla BERT and CEDR-KNRM models on the ANTIQUE dataset.

Taking my cues from the wsdm2020_demo.sh script, I trained my models as follows:

  1. First, I fine-tuned and tested a Vanilla BERT model:
BERT_MODEL_PARAMS="trainer.grad_acc_batch=1 valid_pred.batch_size=4 test_pred.batch_size=4"
python -m onir.bin.pipeline config/antique config/vanilla_bert $BERT_MODEL_PARAMS
python -m onir.bin.pipeline config/antique config/vanilla_bert $BERT_MODEL_PARAMS pipeline.test=true

This produced the following results: test epoch=60 judged@10=0.6110 map_rel-3=0.2540 [mrr_rel-3=0.7288] p_rel-3@1=0.6450 p_rel-3@3=0.4917
However, the published results for Vanilla BERT are as follows:

  • MAP: 0.2801
  • MRR: 0.7101
  • P@1: 0.5950
  • P@3: 0.4967
  2. I then initialized a CEDR-KNRM model with the weights of the fine-tuned Vanilla BERT model, then trained and tested it:
MODEL_PATH=[PATH_TO_FINE_TUNED_BERT]/60.p
BERT_MODEL_PARAMS="trainer.grad_acc_batch=1 valid_pred.batch_size=4 test_pred.batch_size=4"

python -m onir.bin.extract_bert_weights config/antique config/vanilla_bert $BERT_MODEL_PARAMS pipeline.bert_weights=$MODEL_PATH pipeline.overwrite=True
python -m onir.bin.pipeline config/antique config/cedr/knrm $BERT_MODEL_PARAMS vocab.bert_weights=$MODEL_PATH pipeline.overwrite=True
python -m onir.bin.pipeline config/antique config/cedr/knrm $BERT_MODEL_PARAMS vocab.bert_weights=$MODEL_PATH pipeline.test=true

This produced the following results: test epoch=30 judged@10=0.6030 map_rel-3=0.2563 [mrr_rel-3=0.7302] p_rel-3@1=0.6400 p_rel-3@3=0.5083
However, the published results for CEDR-KNRM are as follows:

  • MAP: 0.2861
  • MRR: 0.7238
  • P@1: 0.6300
  • P@3: 0.4933
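
To make the gaps easier to see, the result lines above can be compared against the published numbers with a short Python sketch. The helper names (`parse_metrics`, `metric_deltas`) are hypothetical and not part of OpenNIR, and the sketch assumes that the published MAP, MRR, P@1 and P@3 correspond to the `rel-3` variants printed in the logs. All values are copied from this post.

```python
import re

# Result lines copied verbatim from the two test runs above.
RUNS = {
    "vanilla_bert": "test epoch=60 judged@10=0.6110 map_rel-3=0.2540 "
                    "[mrr_rel-3=0.7288] p_rel-3@1=0.6450 p_rel-3@3=0.4917",
    "cedr_knrm": "test epoch=30 judged@10=0.6030 map_rel-3=0.2563 "
                 "[mrr_rel-3=0.7302] p_rel-3@1=0.6400 p_rel-3@3=0.5083",
}

# Published numbers as listed above. Mapping them onto the rel-3 metric
# names is an assumption made for this comparison.
PUBLISHED = {
    "vanilla_bert": {"map_rel-3": 0.2801, "mrr_rel-3": 0.7101,
                     "p_rel-3@1": 0.5950, "p_rel-3@3": 0.4967},
    "cedr_knrm": {"map_rel-3": 0.2861, "mrr_rel-3": 0.7238,
                  "p_rel-3@1": 0.6300, "p_rel-3@3": 0.4933},
}


def parse_metrics(line):
    """Extract name=value pairs from a result line (hypothetical helper)."""
    return {name: float(value)
            for name, value in re.findall(r"([\w@-]+)=([0-9.]+)", line)}


def metric_deltas(model):
    """Return reproduced minus published for each published metric."""
    reproduced = parse_metrics(RUNS[model])
    return {name: round(reproduced[name] - published, 4)
            for name, published in PUBLISHED[model].items()}


for model in RUNS:
    print(model)
    for name, delta in metric_deltas(model).items():
        print(f"  {name}: {delta:+.4f}")
```

A negative value means the reproduced score is below the published one. Under the assumption above, MAP is the only metric that is lower than published for both models.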

According to the logs, I understand that the inference is deterministic ([trainer:pairwise][DEBUG] using GPU (deterministic)).
Could anyone let me know what I am doing wrong?
Where do the differences come from (especially w.r.t. MAP)?
