Commit dfa25e9

Author: Zachary Foster
randomize samples per run
1 parent: f7a6064

2 files changed

Lines changed: 4 additions & 0 deletions

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 metadata <- read.csv('https://raw.githubusercontent.com/nf-core/test-datasets/refs/heads/pathogensurveillance/samplesheets/klebsiella.csv')
 metadata <- metadata[metadata$sequence_type == "ILLUMINA" & metadata$Organism == "Klebsiella pneumoniae", ]
+metadata <- metadata[sample(nrow(metadata)), ]
 write.csv(metadata, file = 'input_data_subset.csv', row.names = FALSE)
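
The added R line permutes the rows of the filtered table, since `sample(nrow(metadata))` returns the row indices in random order. As a rough illustration only, the same header-preserving row shuffle could be sketched in shell as below. This is not part of the commit: the file names, the column names, and the `shuffle_csv_rows` helper are hypothetical, and `shuf` is assumed to be available from GNU coreutils.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the sample sheet; the column names are illustrative only.
WORK_DIR="$(mktemp -d)"
INPUT="$WORK_DIR/samples.csv"
SHUFFLED="$WORK_DIR/samples_shuffled.csv"
printf '%s\n' 'sample_id,sequence_type' \
    's1,ILLUMINA' 's2,ILLUMINA' 's3,ILLUMINA' 's4,ILLUMINA' > "$INPUT"

# Keep the header line in place and permute only the data rows.
# Assumes GNU coreutils `shuf` is installed.
shuffle_csv_rows() {
    local input="$1" output="$2"
    {
        head -n 1 "$input"
        tail -n +2 "$input" | shuf
    } > "$output"
}

shuffle_csv_rows "$INPUT" "$SHUFFLED"
cat "$SHUFFLED"
```

Neither this sketch nor the R line sets a seed, so the row order differs on every invocation, which matches the stated intent of randomizing the samples used in each run.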

publication/scratch/performance_vs_sample_count/run_test_data.sh

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
 SAMPLE_COUNTS=(1 3 5 10 25 50 75 100 150 200)
 OUTPUT_DIR='run_results'
 
+# Create input data (randomizes samples used each time)
+Rscript prepare_test_data.R
+
 # Run pipeline for each sample count
 for NUM in "${SAMPLE_COUNTS[@]}"; do
 echo "=================================================================================================="
