ENSEMBL annotations not populating correctly #43

Hi, I'm running this pipeline on the SJSU HPC and have run into an issue: about 8.5k of my 29k rows are populated with "NA" for both the symbol and the genename. Each NA row has a valid ENSEMBL ID, and when I look that ID up on ENSEMBL it leads to a valid gene product with an annotation, so it looks as though the annotation is just not being populated correctly. I am running OSD-511 with the following command (using the cached files set up on the spartan01 HPC by Jonathan Oribello):

NXF_SINGULARITY_CACHEDIR=/home/joribello/test_install/singularity nextflow run NF_RCP-F_1.0.3/main.nf \
    -profile singularity,slurm \
    -resume \
    --gldsAccession GLDS-511 \
    -c /home/joribello/test_install/cos_hpc_nextflow.config \
    -c give_ALIGN_STAR_more_memory.config \
    --runsheetPath /home/carnold/GLDS-511/Metadata/GLDS-511_bulkRNASeq_v1_runsheet.csv

The original runsheet was edited to correct the swapped R1 and R2 files for one of the samples (they were entered incorrectly in the version downloaded from GL).
The contents of give_ALIGN_STAR_more_memory.config are as follows:
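process {
    // Give the alignment and BAM sorting steps a fixed, larger memory allocation
    withName: 'ALIGN_STAR' {
        memory = '45GB'
    }
    withName: 'SORT_INDEX_BAM' {
        memory = '45GB'
    }
    // Retry counting up to 3 times, requesting 4 GB more per attempt (12 GB on the first)
    withName: 'COUNT_ALIGNED' {
        maxRetries = 3
        errorStrategy = 'retry'
        memory = { 8.GB + 4.GB * task.attempt }
    }
}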
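
In case it helps with triage, here is a minimal sketch of how the affected rows could be pulled out of the annotated table so the ENSEMBL IDs can be compared against the annotation source. The column names (ENSEMBL, SYMBOL, GENENAME), the set of strings treated as missing, and the example IDs are all assumptions for illustration and would need to match the actual output table.

```python
import csv
import io

# Strings treated as "missing"; this set is an assumption, adjust to the real table.
NA_VALUES = {"", "NA", "N/A", "nan"}


def find_unannotated(rows, id_col="ENSEMBL", fields=("SYMBOL", "GENENAME")):
    """Return the IDs of rows where every annotation field is missing.

    The column names are hypothetical defaults, not confirmed pipeline output names.
    """
    missing = []
    for row in rows:
        if all((row.get(field) or "").strip() in NA_VALUES for field in fields):
            missing.append(row[id_col])
    return missing


# Tiny made-up table standing in for the real annotated output.
example = io.StringIO(
    "ENSEMBL,SYMBOL,GENENAME\n"
    "ENSG_EXAMPLE_1,GENE1,example gene 1\n"
    "ENSG_EXAMPLE_2,NA,NA\n"
    "ENSG_EXAMPLE_3,GENE3,NA\n"
)
rows = list(csv.DictReader(example))
missing_ids = find_unannotated(rows)

print(f"{len(missing_ids)} of {len(rows)} rows have no symbol or gene name")
print(missing_ids)
```

To run this on a real table, replace the inline example with `csv.DictReader(open(path, newline=""))`, where `path` points at the annotated output file. The resulting ID list can then be checked for a shared pattern among the NA rows.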

