FNP task1
Binary classification task
Label is 1 if the text section contains a causal relation, otherwise 0
Unbalanced dataset: only about 7% of the labels are 1
10,837 training examples and 2,710 test examples
Submit your result on Kaggle
Read the data descriptions and download the dataset from Kaggle
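To make the imbalance concrete, here is a small arithmetic sketch. It assumes the positive rate is exactly 7% (the figure above is only approximate) and shows why plain accuracy is a weak signal for this task: a model that always predicts 0 already scores about 93%.

```python
# Rough arithmetic on the class imbalance described above.
# Assumption: the positive rate is exactly 7%; the README only says "about 7%".
n_train = 10837
positive_rate = 0.07

# Approximate number of training examples labelled 1.
expected_positives = round(n_train * positive_rate)

# Accuracy of a trivial classifier that always predicts 0.
majority_accuracy = 1.0 - positive_rate

print(expected_positives)          # 759
print(f"{majority_accuracy:.2f}")  # 0.93
```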
- preprocess.py: Separate the dataset into the training set, validation set, and test set.
- Train the models via baseline.py with BERT & Focal Loss and generate the CSV file pred.csv.
- Generate pseudo-label training data via create_pseudo.py.
- Train the models via xlnet_pseudo.py with XLNet, Focal Loss & pseudo-label.
- Submit the result final_preditcion.csv.
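The two techniques named in the steps above, focal loss and pseudo-labelling, can be sketched as follows. This is a minimal illustration in plain Python, not code taken from baseline.py or create_pseudo.py. The function names, the defaults `gamma=2.0` and `alpha=0.25`, and the confidence `threshold=0.95` are all assumptions chosen for the example; the values used in this repository are not stated here.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for a single example.

    p: predicted probability of class 1; y: true label (0 or 1).
    gamma and alpha are illustrative defaults, not this repo's settings.
    """
    p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def select_pseudo_labels(probs, threshold=0.95):
    """Pick unlabeled examples the model is confident about.

    probs: predicted probabilities of class 1 for unlabeled examples.
    Returns (index, pseudo_label) pairs; the threshold is hypothetical.
    """
    selected = []
    for i, p in enumerate(probs):
        if p >= threshold:
            selected.append((i, 1))
        elif p <= 1.0 - threshold:
            selected.append((i, 0))
    return selected


# A confident correct prediction costs far less than a confident wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy < hard)

# Only the confident predictions become pseudo-labelled training data.
print(select_pseudo_labels([0.99, 0.5, 0.01, 0.96]))
```

With `gamma=0` the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which is a convenient sanity check. In a real training script the same formula would normally be computed on batches of tensors rather than one example at a time.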
- OS : Debian 9.11
- Python 3.6
- GPU : TITAN RTX (24G RAM)
- CPU : Core i7-9800X