DNA Storage Simulator analyzes and simulates the error profile of Nanopore-sequenced DNA. It was completed as part of an undergraduate research project at NUS, supervised by Professor Djordje Jevdjic. See our accepted poster at ISPASS '22 for a short summary, and the extended report for details.
The repository is structured as follows:
- CodeReconstruction: Forked from CodeReconstruction, with modifications to aid testing.
    - `real_data_clustered.txt`: Stores real data in the form:

            [original strand][\n]
            *****************************[\n]
            [copy][\n]
            [copy][\n]
            ...
            [copy][\n]
            [\n]
            [\n]
            [original strand][\n]
            *****************************[\n]
            [copy][\n]
            [copy][\n]
            ...
            [copy][\n]
            ...

    - `synth_data_clustered.txt`: Stores synthetic data in the same form as `real_data_clustered`; it can be generated via the `noisy.py` module or DNASimulator.
    - `compare.sh`: Bash script that runs reconstruction algorithms on `real_data_clustered` and `synth_data_clustered`.
- Scripts: Contains utility scripts.
    - `get_ground_from_clustered.py`: Parses files of the same form as `real_data_clustered.txt` above, and generates a file `strands.txt` containing the original strands only. Run using `python get_ground_from_clustered.py`.
    - `noisy.py`: Naive simulator that takes a `strands.txt` file, sequencing coverage, and error probabilities as input, and generates noisy copies of multiple clusters in the same form as `real_data_clustered.txt`.
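
To make the clustered file format and the idea of a naive simulator concrete, here is a minimal Python sketch. It is not the code in `get_ground_from_clustered.py` or `noisy.py`. The function names (`parse_clustered`, `format_clustered`, `add_noise`, `simulate`), the independent per-base error model, and the choice to drop empty copies are all assumptions made for illustration.

```python
import random

# Separator line copied from the format description above.
SEPARATOR = "*****************************"


def parse_clustered(text):
    """Parse clustered-format text into a list of (original, copies) tuples."""
    clusters = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # skip the blank lines between clusters
            i += 1
            continue
        original = lines[i].strip()
        i += 1
        # Skip the separator; any all-asterisk line is accepted (assumption).
        if i < len(lines) and set(lines[i].strip()) == {"*"}:
            i += 1
        copies = []
        while i < len(lines) and lines[i].strip():
            copies.append(lines[i].strip())
            i += 1
        clusters.append((original, copies))
    return clusters


def format_clustered(clusters):
    """Serialize (original, copies) tuples back into the clustered format."""
    parts = []
    for original, copies in clusters:
        parts.append(original + "\n" + SEPARATOR + "\n")
        parts.extend(copy + "\n" for copy in copies)
        parts.append("\n\n")  # two blank lines end each cluster
    return "".join(parts)


def add_noise(strand, p_sub, p_ins, p_del, rng):
    """Apply independent per-base insertions, deletions, and substitutions."""
    out = []
    for base in strand:
        if rng.random() < p_ins:  # insert a random base before this one
            out.append(rng.choice("ACGT"))
        r = rng.random()
        if r < p_del:  # delete this base
            continue
        if r < p_del + p_sub:  # substitute with a different base
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(strands, coverage, p_sub, p_ins, p_del, seed=0):
    """Generate `coverage` noisy copies per strand (hypothetical model)."""
    rng = random.Random(seed)
    clusters = []
    for strand in strands:
        copies = [add_noise(strand, p_sub, p_ins, p_del, rng)
                  for _ in range(coverage)]
        # Drop empty copies: a blank line would read as a cluster boundary.
        clusters.append((strand, [c for c in copies if c]))
    return clusters


if __name__ == "__main__":
    demo = simulate(["ACGTACGT", "TTGACA"], coverage=3,
                    p_sub=0.05, p_ins=0.05, p_del=0.05)
    print(format_clustered(demo), end="")
```

Extracting only the original strands, as described for `strands.txt`, then amounts to keeping the first element of each tuple returned by `parse_clustered`.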