Move pass-1 SJ tab from <prefix>SJ.pass1.out.tab to <prefix>_STARpass1/SJ.out.tab #28

@pinin4fjords

Summary

rustar's two-pass mode writes the pass-1 splice junction tab as <prefix>SJ.pass1.out.tab at the top level. STAR keeps two-pass intermediates inside <prefix>_STARpass1/ and names the pass-1 SJ tab SJ.out.tab inside that directory.

STAR reference behaviour

The two-pass orchestration in source/twoPass.cpp mkdirs <prefix>_STARpass1/ and redirects pass-1 output into it. The pass-1 splice tab lives at <prefix>_STARpass1/SJ.out.tab alongside a pass-1 Log.final.out.

Reproducer

#!/usr/bin/env bash
set -euo pipefail
mkdir -p /tmp/rustar-mre-28 && cd /tmp/rustar-mre-28

BASE=https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a
curl -fsLO $BASE/reference/genome.fasta
curl -fsL  $BASE/reference/genes_with_empty_tid.gtf.gz | gunzip -c > genes.gtf
curl -fsLO $BASE/testdata/GSE110004/SRR6357072_1.fastq.gz
curl -fsLO $BASE/testdata/GSE110004/SRR6357072_2.fastq.gz

RUSTAR=ghcr.io/scverse/rustar-aligner:dev
STAR=community.wave.seqera.io/library/htslib_samtools_star_gawk:ae438e9a604351a4

mkdir -p idx-rustar idx-star
docker run --rm -v $PWD:/w -w /w $RUSTAR rustar-aligner --runMode genomeGenerate \
    --genomeDir idx-rustar --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
    --sjdbOverhang 100 --genomeSAindexNbases 7
docker run --rm -v $PWD:/w -w /w $STAR STAR --runMode genomeGenerate \
    --genomeDir idx-star --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
    --sjdbOverhang 100 --genomeSAindexNbases 7

COMMON=(--readFilesIn SRR6357072_1.fastq.gz SRR6357072_2.fastq.gz --readFilesCommand zcat
        --runThreadN 4 --sjdbGTFfile genes.gtf --twopassMode Basic --runRNGseed 0
        --outSAMtype BAM Unsorted)

docker run --rm -v $PWD:/w -w /w $RUSTAR rustar-aligner \
    --genomeDir idx-rustar "${COMMON[@]}" --outFileNamePrefix RUS.
docker run --rm -v $PWD:/w -w /w $STAR STAR \
    --genomeDir idx-star "${COMMON[@]}" --outFileNamePrefix STAR.

echo "=== STAR pass-1 layout ==="; ls STAR._STARpass1/
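# Check both candidate rustar locations; "|| true" keeps set -e from aborting on a missing path.
echo "=== rustar layout ==="; ls -d RUS.SJ.pass1.out.tab RUS./SJ.pass1.out.tab RUS._STARpass1 2>&1 || true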

Observed: STAR produces STAR._STARpass1/SJ.out.tab. rustar produces RUS.SJ.pass1.out.tab at the top level (or inside RUS./ per #26), with no _STARpass1 directory.

Suggested fix

Move the pass-1 SJ tab into <prefix>_STARpass1/SJ.out.tab (mkdir the parent first). File content unchanged.
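
The target layout can be illustrated with a small post-processing step in the same shell used by the reproducer. This is only a sketch, not rustar's implementation: the function name relocate_pass1_sj is hypothetical, and it assumes the pass-1 tab was written at <prefix>SJ.pass1.out.tab as described in the Summary.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of rustar or STAR): move the pass-1 SJ tab
# into the STAR-style location. $1 is the --outFileNamePrefix value, e.g. "RUS.".
relocate_pass1_sj() {
    local prefix="$1"
    local src="${prefix}SJ.pass1.out.tab"   # assumed current rustar location
    local dst_dir="${prefix}_STARpass1"     # directory STAR uses for pass-1 output

    # Bail out if the pass-1 tab is not where this sketch assumes it is.
    if [[ ! -f "$src" ]]; then
        echo "no pass-1 SJ tab at $src" >&2
        return 1
    fi

    mkdir -p "$dst_dir"                 # create the parent directory first
    mv -- "$src" "$dst_dir/SJ.out.tab"  # rename only; file content is unchanged
}
```

With the reproducer's prefix, calling relocate_pass1_sj RUS. would leave the tab at RUS._STARpass1/SJ.out.tab, matching the STAR layout listed above. The real fix belongs in rustar's writer rather than in a wrapper like this.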

A related follow-up (Log.out and Log.progress.out are also missing) is split out as #55.

Severity

Low. Output-shape compatibility cleanup. nf-core/rnaseq works around it with a permissive *.tab glob today.


Filed during nf-core/rnaseq integration testing (nf-core/rnaseq#1855).
