Validation and scoring scripts for Task 1 of the PEGS Challenge.
For Task 2, see writeup-workflow/.
Metrics returned and used for ranking are:
- Primary: area under the receiver operating characteristic curve (AUROC)
- Secondary (used for ties): area under the precision-recall curve (AUPRC)
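
For reference, both metrics can be computed with scikit-learn. The sketch below is illustrative only and is not necessarily how `score.py` implements them; the labels and probabilities are made up.

```python
# Minimal sketch of the two ranking metrics using scikit-learn.
# This is not necessarily how score.py computes them.
from sklearn.metrics import average_precision_score, roc_auc_score

y_true = [0, 1, 1, 0, 1]            # hypothetical binary disease labels
y_prob = [0.2, 0.9, 0.6, 0.3, 0.8]  # hypothetical predicted probabilities

auroc = roc_auc_score(y_true, y_prob)
auprc = average_precision_score(y_true, y_prob)  # average precision, a standard AUPRC estimate
print(f"AUROC: {auroc:.4f}, AUPRC: {auprc:.4f}")
```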
Validate a predictions file with:

```sh
python validate.py \
  -p PATH/TO/PREDICTIONS_FILE.CSV \
  -g PATH/TO/GOLDSTANDARD_FOLDER [-o RESULTS_FILE]
```
If `-o`/`--output` is not provided, the full results will be written to `results.json`.
What it will check for:

- two columns named `id` and `disease_probability` (extraneous columns will be ignored)
- `id` values are strings
- `disease_probability` values are floats between 0.0 and 1.0, and cannot be null/None
- there is one prediction per patient (so, no missing patient IDs or duplicate patient IDs)
- there are no extra predictions (so, no unknown patient IDs)

A minimal sketch of these checks is given below.
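
As a rough illustration, the checks above could be expressed with pandas as follows. This is a sketch, not `validate.py` itself; the goldstandard filename and its `id` column are placeholders.

```python
# Illustrative sketch of the checks above; not the actual validate.py logic.
import pandas as pd

pred = pd.read_csv(
    "predictions.csv",
    usecols=["id", "disease_probability"],      # extraneous columns are dropped
    dtype={"id": str, "disease_probability": float},
)
# "goldstandard/labels.csv" and its `id` column are hypothetical placeholders.
gold_ids = set(pd.read_csv("goldstandard/labels.csv", dtype={"id": str})["id"])

probs = pred["disease_probability"]
assert probs.notna().all(), "disease_probability cannot be null/None"
assert probs.between(0.0, 1.0).all(), "values must be between 0.0 and 1.0"
assert not pred["id"].duplicated().any(), "duplicate patient IDs"
assert set(pred["id"]) == gold_ids, "missing or unknown patient IDs"
```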
The script will print either `VALIDATED` or `INVALID` to STDOUT.
Score a predictions file with:

```sh
python score.py \
  -p PATH/TO/PREDICTIONS_FILE.CSV \
  -g PATH/TO/GOLDSTANDARD_FOLDER [-o RESULTS_FILE]
```
If `-o`/`--output` is not provided, results will be written to `results.json`.
The script will print either `SCORED` or `INVALID` to STDOUT.
When run with Docker, results will be written to `output/results.json` in your current working directory (assuming you mount `$PWD/output`).
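
If the output folder does not already exist, you may want to create it first so it is owned by your user rather than root:

```sh
# Create the host-side output folder that will be mounted into the container
mkdir -p "$PWD/output"
```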
To validate with Docker:

```sh
docker run --rm \
  -v /PATH/TO/PREDICTIONS_FILE.CSV:/predictions.csv:ro \
  -v /PATH/TO/GOLDSTANDARD_FOLDER:/goldstandard:ro \
  -v $PWD/output:/output:rw \
  ghcr.io/sage-bionetworks-challenges/pegs-evaluation:latest \
  python3 validate.py \
  -p /predictions.csv -g /goldstandard -o /output/results.json
```
To score with Docker:

```sh
docker run --rm \
  -v /PATH/TO/PREDICTIONS_FILE.CSV:/predictions.csv:ro \
  -v /PATH/TO/GOLDSTANDARD_FOLDER:/goldstandard:ro \
  -v $PWD/output:/output:rw \
  ghcr.io/sage-bionetworks-challenges/pegs-evaluation:latest \
  python3 score.py \
  -p /predictions.csv -g /goldstandard -o /output/results.json
```
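
After either container exits, the results can be inspected from the host, for example:

```sh
# View the results written by the container (assumes the $PWD/output mount above)
cat output/results.json
```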