
Commit c1b2f8b

fix args & args loading, add train/inference scripts
1 parent 5ffb946 commit c1b2f8b

File tree

8 files changed: 375 additions, 10 deletions


CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,3 +1,12 @@
+1.0.1 (May 16, 2020)
+
+- Uploaded training / testing data
+- Uploaded pre-trained DistilBERT embeddings
+- Fixed requirements
+- Minor fixes to arg loading and paths
+
+Thanks to Prakhar Gupta for pointing out the issues in the codebase!
+
 1.0.0 (April 15, 2020)
 
 Initial Public Release of MaUde, an unreferenced metric for online dialog evaluation, to appear in ACL 2020

README.md

Lines changed: 21 additions & 2 deletions
@@ -4,9 +4,20 @@
 
 Contains code of the paper titled _"Learning an Unreferenced Metric for Online Dialogue Evaluation"_ to appear at **ACL 2020**, [Arxiv](https://arxiv.org/abs/2005.00583)
 
+## Installation
+
+- `pip install -r requirements.txt`
+- Install [ParlAI](https://github.com/facebookresearch/ParlAI#installing-parlai)
+
 ## Getting the data
 
-To get the trained models, [download the data from here](https://drive.google.com/file/d/1Ysso9hdzSenK13LjOFombyXYqA_kv-Vy/view?usp=sharing).
+- Get the `convai2` train and test data and pre-trained DistilBERT [embeddings here](https://drive.google.com/file/d/1VVcsxmUrDSRIfunPWe9UO1aeCz-lITNy/view?usp=sharing). Download and unzip into the folder `convai2_data`.
+- Get the trained model checkpoints [from here](https://drive.google.com/file/d/1Ysso9hdzSenK13LjOFombyXYqA_kv-Vy/view?usp=sharing). Download and unzip into the folder `full_acl_runs`.
+- For individual licensing reasons we cannot release the train/test data of MultiWoz, Frames and DailyDialog. Please [send me a mail](mailto:[email protected]) if you need them!
+- Run inference using `./run_inference.sh`
+
+**N.B.** - For model names and checkpoints, please refer to the `run_inference.sh` script.
+
 
 ## Computing Backtranslation
 
@@ -38,6 +49,8 @@ For baselines, add the appropriate flag:
 --train_baseline [infersent/ruber/bertnli]
 ```
 
+An example training script is provided at [`run_training.sh`](run_training.sh).
+
 ## Inference Script
 
 ```
@@ -48,7 +61,8 @@ For baselines, add the appropriate flag:
 --test_column true_response --results_file "results.jsonl"
 ```
 
-Outputs the results in a `jsonl` file. To measure human correaltion with [See et al 2019](https://parl.ai/projects/controllable_dialogue/), specify `--human_eval` flag and `--human_eval_file` location.
+- Outputs the results in a `jsonl` file. To measure human correlation with [See et al 2019](https://parl.ai/projects/controllable_dialogue/), specify the `--human_eval` flag and the `--human_eval_file` location.
+- We have also added a script to run inference on our trained checkpoints: [`run_inference.sh`](run_inference.sh).
 
 ## Acknowledgements
 
@@ -60,6 +74,11 @@
 - ParlAI - https://parl.ai/
 - See et al 2019 data - https://parl.ai/projects/controllable_dialogue/
 
+## Questions
+
+- Please send a mail to [[email protected]](mailto:[email protected]) for questions / clarifications.
+- Open an issue on GitHub.
+
 ## Citation
 
 If our work is useful for your research, consider citing it using the following bibtex:
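
Since the inference script writes its scores to `results.jsonl`, here is a minimal Python sketch for consuming that output, assuming the usual one-JSON-object-per-line jsonl convention; the emitted field names should be checked against what `codes/inference.py` actually writes.

```python
import json

def load_results(path="results.jsonl"):
    """Read a jsonl results file: one JSON object per line."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate trailing blank lines
                rows.append(json.loads(line))
    return rows

if __name__ == "__main__":
    rows = load_results()
    print(f"loaded {len(rows)} result rows")
    print(rows[0] if rows else "empty file")  # inspect the emitted keys
```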

args.py

Lines changed: 2 additions & 2 deletions
@@ -130,9 +130,9 @@ def get_args(command=None):
     )
     parser.add_argument(
         "--logger_dir",
-        default="/private/home/koustuvs/mlp/latentDialogAnalysis/logs/",
+        default="./",
         type=str,
-        help="batch size",
+        help="log directory (must be created)",
     )
     parser.add_argument("--log_interval", default=100, type=int, help="log interval")
     parser.add_argument(
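
For reference, a self-contained sketch of the corrected `--logger_dir` definition; the surrounding parser here is scaffolding, the real definition lives inside `get_args()` in `args.py`.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--logger_dir",
    default="./",  # was a hard-coded /private/home/... cluster path
    type=str,
    help="log directory (must be created)",
)

# Parsing an empty argv shows the new default in effect.
args = parser.parse_args([])
print(args.logger_dir)  # -> ./
```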

codes/inference.py

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@
 #
 """
 # File to run various inferences
+import sys
+import os
+sys.path.append(os.getcwd())
 import torch
 from args import get_args
 from logbook.logbook import LogBook
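
The three added lines exist so that `python codes/inference.py`, launched from the repository root, can import top-level modules such as `args`: Python puts the script's own directory (`codes/`) on `sys.path`, not the working directory. A standalone sketch of the same pattern:

```python
import os
import sys

# Running `python codes/inference.py` puts codes/ on sys.path, so repo-root
# modules such as args.py are not importable by default. Appending the
# current working directory fixes that, provided the script is launched
# from the repository root.
sys.path.append(os.getcwd())

try:
    from args import get_args  # resolves only when cwd is the repo root
except ImportError:
    sys.exit("Launch from the repository root, e.g. `python codes/inference.py`")
```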

data.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 import random
 from parlai.core.params import ParlaiParser
 from parlai.agents.repeat_label.repeat_label import RepeatLabelAgent
-from parlai_internal.agents.ir_baseline.ir_baseline import IrBaselineAgent
+from parlai.agents.ir_baseline.ir_baseline import IrBaselineAgent
 from parlai.core.worlds import create_task
 from sklearn.decomposition import PCA
 import numpy as np
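
The fix swaps the Facebook-internal `parlai_internal` namespace for the public one. A guarded variant (our own suggestion, not part of the commit) gives a clearer error when ParlAI is missing:

```python
try:
    from parlai.agents.ir_baseline.ir_baseline import IrBaselineAgent
except ImportError as err:
    raise ImportError(
        "ParlAI is required; install it per "
        "https://github.com/facebookresearch/ParlAI#installing-parlai"
    ) from err
```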

requirements.txt

Lines changed: 9 additions & 5 deletions
@@ -1,5 +1,9 @@
-scikit-learn
-numpy
-matplotlib
-pytorch_transformers
-nltk
+torch==1.2.0
+scikit-learn==0.21.2
+numpy==1.16.4
+matplotlib==3.1.1
+pytorch-lightning==0.5.2.1
+transformers==2.1.1
+nltk==3.4.5
+wandb==0.8.5
+PyYAML==5.1.1
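
A small sketch for checking that the newly pinned versions are satisfied in the current environment, using `pkg_resources` from setuptools; the pin strings are copied from the file above.

```python
import pkg_resources

PINS = [
    "torch==1.2.0",
    "scikit-learn==0.21.2",
    "numpy==1.16.4",
    "matplotlib==3.1.1",
    "pytorch-lightning==0.5.2.1",
    "transformers==2.1.1",
    "nltk==3.4.5",
    "wandb==0.8.5",
    "PyYAML==5.1.1",
]

for pin in PINS:
    try:
        pkg_resources.require(pin)  # raises on missing or mismatched versions
        print(f"ok          {pin}")
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict) as err:
        print(f"unsatisfied {pin}: {err}")
```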

run_inference.sh

Lines changed: 306 additions & 0 deletions
Large diffs are not rendered by default.

run_training.sh

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+#!/bin/sh
+BATCH_SIZE=64
+MODEL_SAVE_DIR=full_acl_runs/
+DATA_NAME=convai2
+DATA_LOC=convai2_data/
+FINE_TUNE_MODEL=/convai2_data/distilbert_lm
+TRAIN_MODE=nce
+NUM_GPUS=8
+# Model
+# python codes/trainer.py --mode train --batch_size 64 --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --corrupt_type all_context
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --corrupt_type all
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --corrupt_type only_semantics
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --corrupt_type only_syntax
+# InferSent Baseline
+python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline infersent --corrupt_type all_context
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline infersent --corrupt_type all
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline infersent --corrupt_type only_semantics
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline infersent --corrupt_type only_syntax
+# BertNLI baseline
+python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline bertnli --corrupt_type all_context
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline bertnli --corrupt_type all
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline bertnli --corrupt_type only_semantics
+# python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --use_cluster --use_ddp --per_experiment_nb_gpus $NUM_GPUS --train_baseline bertnli --corrupt_type only_syntax
+python codes/trainer.py --mode train --batch_size $BATCH_SIZE --model_save_dir $MODEL_SAVE_DIR --data_name $DATA_NAME --data_loc $DATA_LOC --fine_tune_model $FINE_TUNE_MODEL --learn_down True --downsample True --down_dim 300 --optim adam,lr=0.0001 --dropout 0.2 --decoder_hidden 200 --load_fine_tuned --train_mode $TRAIN_MODE --gpus 0 --train_baseline bertnli --corrupt_type all_context
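
The script above repeats one long `codes/trainer.py` invocation while varying `--train_baseline` and `--corrupt_type`. A Python sketch generating the same sweep, with flag values copied from `run_training.sh`; the commented-out corruption variants are listed for reference.

```python
import subprocess

# Shared flags, copied from run_training.sh.
BASE = [
    "python", "codes/trainer.py", "--mode", "train",
    "--batch_size", "64",
    "--model_save_dir", "full_acl_runs/",
    "--data_name", "convai2",
    "--data_loc", "convai2_data/",
    "--fine_tune_model", "/convai2_data/distilbert_lm",
    "--learn_down", "True", "--downsample", "True", "--down_dim", "300",
    "--optim", "adam,lr=0.0001", "--dropout", "0.2",
    "--decoder_hidden", "200", "--load_fine_tuned",
    "--train_mode", "nce",
]

# The commented-out lines in the script sweep these corruption types.
CORRUPT_TYPES = ["all_context"]  # also: "all", "only_semantics", "only_syntax"

for baseline in ["infersent", "bertnli"]:
    for corrupt in CORRUPT_TYPES:
        cmd = BASE + [
            "--use_cluster", "--use_ddp", "--per_experiment_nb_gpus", "8",
            "--train_baseline", baseline, "--corrupt_type", corrupt,
        ]
        subprocess.run(cmd, check=True)  # run each training job in sequence
```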
