IndexError: string index out of range #21

@robinycfang

Hi

I was running ConsensusCruncher to collapse UMIs. It seems to output some results:

├── sscs
│   ├── sscs.sorted.bam (.bai)                     
│   ├── singleton.sorted.bam (.bai)                
├── sscs_SC
│   ├── sscs.sc.sorted.bam (.bai)
├── dcs
│   ├── dcs.sorted.bam (.bai)
├── dcs_SC
│   ├── dcs.sc.sorted.bam (.bai)
│   ├── all.unique.dcs.sorted.bam (.bai)
├── read_families.txt                       Family size and frequency
├── stats.txt                               Consensus sequence formation metrics
├── tag_fam_size.png                        Distribution of reads across family size

However, when I checked the log, I found an error:

# === DCS ===
SSCS - Total reads: 26020276
SSCS - Unmapped reads: 0
SSCS - Secondary/Supplementary reads: 0
DCS reads: 89306
SSCS singletons: 25841664 

[bam_sort_core] merging from 6 files and 1 in-memory blocks...
Traceback (most recent call last):
  File "/ConsensusCruncher/singleton_correction.py", line 320, in <module>
    main()
  File "/ConsensusCruncher/singleton_correction.py", line 268, in main
    corrected_read = strand_correction(tag, duplex, query_name, singleton_dict)
  File "/ConsensusCruncher/singleton_correction.py", line 101, in strand_correction
    dcs = duplex_consensus(read, complement_read)
  File "/ConsensusCruncher/singleton_correction.py", line 71, in duplex_consensus
    if read1.query_sequence[i] == read2.query_sequence[i] and \
IndexError: string index out of range
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
# === DCS - Singleton Correction ===
SSCS SC - Total reads: 26023245
SSCS SC - Unmapped reads: 0

It looks like DCS was not performed properly during singleton correction. For my experiments, I might only need sscs.sc.sorted.bam. Are these final BAM files still safe to use? My commands are below. Thanks!
python3 ConsensusCruncher.py fastq2bam --fastq1 sample_R1.fastq --fastq2 sample_R2.fastq -o out_dir -b bwa -g /picard/2.10.9/picard.jar -r hg38.fasta -s samtools -l umilist.txt

python3 ConsensusCruncher.py consensus -i sample.sorted.bam -o out_dir -s samtools -b cytoBand.txt -g hg38 --cleanup True --scorrect True
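
The traceback points at position-by-position indexing of two reads' sequences in `duplex_consensus`. That kind of indexing raises this error when the complementary read's sequence is shorter than the first read's. Below is a minimal sketch of that failure mode and one possible guard. It works on plain strings rather than pysam reads, and it is not the ConsensusCruncher code: the names `naive_compare` and `duplex_consensus_sketch`, the `min_qual` threshold, and the choice to skip mismatched pairs are all assumptions made for illustration. The second condition on line 71 of `singleton_correction.py` is cut off in the traceback, so the quality check here is only a placeholder.

```python
def naive_compare(seq1, seq2):
    # Hypothetical reproduction: iterates over the length of seq1 only,
    # so seq2[i] raises IndexError as soon as seq2 is shorter than seq1.
    return [seq1[i] == seq2[i] for i in range(len(seq1))]


def duplex_consensus_sketch(seq1, seq2, qual1, qual2, min_qual=30):
    # Hypothetical guarded variant. min_qual is an assumed threshold,
    # not a value taken from ConsensusCruncher.
    if len(seq1) != len(seq2):
        # Assumption: a pair with unequal lengths is skipped, not corrected.
        return None
    consensus = []
    for i in range(len(seq1)):
        same_base = seq1[i] == seq2[i]
        good_quality = qual1[i] >= min_qual and qual2[i] >= min_qual
        # Keep the base only when both strands agree with adequate quality.
        consensus.append(seq1[i] if same_base and good_quality else "N")
    return "".join(consensus)


if __name__ == "__main__":
    try:
        naive_compare("ACGT", "ACG")
    except IndexError as err:
        print("naive comparison failed:", err)
    print(duplex_consensus_sketch("ACGT", "ACG", [40] * 4, [40] * 3))
    print(duplex_consensus_sketch("ACGT", "ACTT", [40] * 4, [40] * 4))
```

If this is what happens, read pairs of unequal length (for example after soft clipping or trimming) would trigger the crash. That is a guess from the traceback alone and has not been checked against the source.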
