Teloscope includes a synthetic benchmark generator and evaluator in teloscope-simulate.
Build it with:

    make simulate

The binary is written to build/bin/teloscope-simulate.
Example:

    build/bin/teloscope-simulate -n 1000 -r 1e-4 -s 42 -o testFiles/simulate/rate_1e-4

This writes:

- `sequences.fa`
- `ground_truth.tsv`

Each synthetic sequence contains:
- left canonical telomere
- left TVR block
- random internal sequence
- right TVR block
- right canonical telomere
After the sequence is built, the requested mutation rate is applied across the whole sequence.
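The layout above can be illustrated with a short Python sketch. It is not the teloscope-simulate implementation: the motifs (`TTAGGG` as the canonical repeat, the three TVR variants), the reverse-complemented left end, and the substitution-only mutation model are all assumptions made for illustration. The default argument values mirror the documented `-R`, `-T`, and `-L` defaults.

```python
import random

# Assumed motifs for illustration only; not taken from the Teloscope source.
CANONICAL = "TTAGGG"
TVR_VARIANTS = ["TGAGGG", "TCAGGG", "TTGGGG"]
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_sequence(rng, n_canonical=2000, n_tvr=100, internal_len=25000):
    """Assemble: left telomere, left TVR, internal, right TVR, right telomere."""
    right_tel = CANONICAL * n_canonical
    right_tvr = "".join(rng.choice(TVR_VARIANTS) for _ in range(n_tvr))
    internal = "".join(rng.choice("ACGT") for _ in range(internal_len))
    # Assumption: the left end carries the reverse-complemented motifs.
    left_tel = revcomp(right_tel)
    left_tvr = revcomp("".join(rng.choice(TVR_VARIANTS) for _ in range(n_tvr)))
    return left_tel + left_tvr + internal + right_tvr + right_tel


def mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate`."""
    if rate <= 0.0:
        return seq
    bases = list(seq)
    for i, base in enumerate(bases):
        if rng.random() < rate:
            # Always pick a different base so every hit is a real change.
            bases[i] = rng.choice([b for b in "ACGT" if b != base])
    return "".join(bases)


rng = random.Random(42)
seq = build_sequence(rng, n_canonical=10, n_tvr=3, internal_len=50)
mutated = mutate(seq, 1e-4, rng)
print(len(seq), seq[:12], seq[-12:])
```

Because the mutation step runs over the whole assembled string, telomere, TVR, and internal bases are all equally exposed to changes, which is what the paragraph above describes.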
Run Teloscope on the synthetic FASTA, then compare the called terminal BED file to the ground truth:

    build/bin/teloscope -f testFiles/simulate/rate_1e-4/sequences.fa -o testFiles/simulate/rate_1e-4/teloscope_out
    build/bin/teloscope-simulate --evaluate \
        -g testFiles/simulate/rate_1e-4/ground_truth.tsv \
        -b testFiles/simulate/rate_1e-4/teloscope_out/sequences.fa_terminal_telomeres.bed

The evaluator prints:

- `total_tips`
- `detected`
- `sensitivity`
- `mean_bias_bp`
- `mean_abs_err_bp`
- `tvr_rate`
- `fp_blocks`

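As a rough guide to reading the first five metrics, the sketch below computes them from in-memory pairs of true and called telomere lengths. The definitions are assumptions inferred from the metric names (sensitivity as detected over total, bias as the mean signed error, and so on), not the evaluator's documented formulas. `tvr_rate` and `fp_blocks` are left out because their definitions cannot be inferred from the names alone. The function name `summarize` and the input layout are hypothetical.

```python
from statistics import mean


def summarize(tips):
    """Summarize calls. `tips` is a list of (true_len_bp, called_len_bp),
    where called_len_bp is None for a tip that was not detected."""
    total = len(tips)
    # Signed error (called minus true) for every detected tip.
    errors = [called - true for true, called in tips if called is not None]
    detected = len(errors)
    return {
        "total_tips": total,
        "detected": detected,
        "sensitivity": detected / total if total else 0.0,
        "mean_bias_bp": mean(errors) if errors else 0.0,
        "mean_abs_err_bp": mean(abs(e) for e in errors) if errors else 0.0,
    }


# Hypothetical values: four tips, one of them missed.
example = [(12000, 11990), (12000, 12030), (12000, None), (12000, 12000)]
print(summarize(example))
```

Under these assumed definitions, a bias near zero combined with a large absolute error would mean the calls scatter around the truth without consistently over- or under-calling.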

| Flag | Meaning | Default |
|---|---|---|
| `-n` | number of sequences | 1000 |
| `-R` | canonical repeats per telomere end | 2000 |
| `-T` | TVR repeats per telomere end | 100 |
| `-L` | internal random segment length in bp | 25000 |
| `-r` | per-base mutation rate | 0.0 |
| `-s` | random seed | 42 |
| `-o` | output directory | testFiles/simulate |

The repo includes a shell wrapper for mutation-rate sweeps:

    bash .github/workflows/val-simulate.sh

Optional environment overrides:

    SIM_N=1000000 SIM_SEED=42 bash .github/workflows/val-simulate.sh
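For a custom sweep outside the wrapper, the generate, call, and evaluate steps can be chained per mutation rate. The following Python sketch only prints the command lines. It uses the flags and paths documented above, but the list of rates and the `rate_<value>` directory naming are assumptions, and it does not claim to reproduce what `val-simulate.sh` does.

```python
import shlex

SIM = "build/bin/teloscope-simulate"
TELOSCOPE = "build/bin/teloscope"


def sweep_commands(rates, n=1000, seed=42, root="testFiles/simulate"):
    """Return generate, call, and evaluate commands for each mutation rate."""
    commands = []
    for rate in rates:
        out = f"{root}/rate_{rate}"  # assumed naming, modeled on rate_1e-4
        # 1. Generate synthetic sequences and the ground truth.
        commands.append([SIM, "-n", str(n), "-r", rate, "-s", str(seed), "-o", out])
        # 2. Run Teloscope on the synthetic FASTA.
        commands.append([TELOSCOPE, "-f", f"{out}/sequences.fa",
                         "-o", f"{out}/teloscope_out"])
        # 3. Compare the terminal BED file against the ground truth.
        commands.append([SIM, "--evaluate",
                         "-g", f"{out}/ground_truth.tsv",
                         "-b", f"{out}/teloscope_out/sequences.fa_terminal_telomeres.bed"])
    return commands


# Rates are passed as strings so directory names match what was typed.
for cmd in sweep_commands(["1e-4", "1e-3", "1e-2"]):
    print(shlex.join(cmd))
```

To execute the commands instead of printing them, each list can be passed to `subprocess.run(cmd, check=True)`.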