ENSEMBL annotations not populating correctly #43

Open
@carnold-sjsu

Hi, I'm running this pipeline on the SJSU HPC, and about 8.5k of my 29k rows of data are populated with "NA" for both the symbol and genename. Each NA row has a valid ENSEMBL ID; when I look that ID up on ENSEMBL, it leads to a valid gene product with an annotation, so it looks as though the annotation is simply not being populated. I am running OSD-511 with the following command (using the cached files set up on the spartan01 HPC by Jonathan Oribello):
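To make the count of affected rows reproducible, here is a minimal Python sketch that tallies rows with an ID present but both annotation fields set to NA. The column names (`ENSEMBL`, `SYMBOL`, `GENENAME`), the helper `find_unannotated`, and the toy rows are assumptions for illustration only; the pipeline's real output table may use different headers and delimiters.

```python
import csv
import io

# Toy stand-in for the annotated table. Column names and rows are
# hypothetical placeholders, not actual pipeline output.
SAMPLE = """ENSEMBL,SYMBOL,GENENAME
ENSG_EXAMPLE_1,GENE1,example gene one
ENSG_EXAMPLE_2,NA,NA
ENSG_EXAMPLE_3,NA,NA
ENSG_EXAMPLE_4,GENE4,example gene four
"""


def find_unannotated(handle, id_col="ENSEMBL", cols=("SYMBOL", "GENENAME")):
    """Return (total_rows, ids) for rows that have an ID but NA/empty in all of `cols`."""
    missing = {"", "NA"}
    total = 0
    ids = []
    for row in csv.DictReader(handle):
        total += 1
        has_id = row[id_col].strip() not in missing
        if has_id and all(row[c].strip() in missing for c in cols):
            ids.append(row[id_col])
    return total, ids


total, unannotated = find_unannotated(io.StringIO(SAMPLE))
print(f"{len(unannotated)} of {total} rows lack both symbol and genename")
print(unannotated)
```

Pointing `find_unannotated` at an open file handle for the real table (with the matching column names) would give the list of IDs to cross-check against ENSEMBL.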

NXF_SINGULARITY_CACHEDIR=/home/joribello/test_install/singularity nextflow run NF_RCP-F_1.0.3/main.nf -profile singularity,slurm -resume --gldsAccession GLDS-511 -c /home/joribello/test_install/cos_hpc_nextflow.config -c give_ALIGN_STAR_more_memory.config --runsheetPath /home/carnold/GLDS-511/Metadata/GLDS-511_bulkRNASeq_v1_runsheet.csv

The original runsheet was edited to correct the swapped R1 and R2 files for one of the samples (they were entered incorrectly in the version downloaded from GL).
give_ALIGN_STAR_more_memory.config contains the following:
process {
    withName: 'ALIGN_STAR' {
        memory = '45GB'
    }
    withName: 'SORT_INDEX_BAM' {
        memory = '45GB'
    }
    withName: 'COUNT_ALIGNED' {
        maxRetries = 3
        errorStrategy = 'retry'
        memory = { 8.GB + 4.GB * task.attempt }
    }
}
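
For reference, the dynamic memory expression for COUNT_ALIGNED can be worked out per attempt. This is a small Python sketch of the arithmetic only, assuming `task.attempt` starts at 1 and that `maxRetries = 3` allows up to three resubmissions after the first attempt; the names `BASE_GB`, `STEP_GB`, and `memory_gb` are illustrative and not part of the config.

```python
# Memory requested per attempt: 8 GB + 4 GB * attempt (from the config above).
BASE_GB = 8
STEP_GB = 4
MAX_RETRIES = 3  # value of maxRetries in the config


def memory_gb(attempt):
    """Memory in GB requested on the given attempt (1-based)."""
    return BASE_GB + STEP_GB * attempt


# One initial attempt plus up to MAX_RETRIES resubmissions.
schedule = {attempt: memory_gb(attempt) for attempt in range(1, MAX_RETRIES + 2)}
print(schedule)
```

Under these assumptions the first attempt requests 12 GB and the request grows by 4 GB with each retry.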
