
Visualize.py - Error in finding input files #349

@GiliWolf


Hi, I encountered some problems when running -
python visualize.py --viz_output <output_dir> --gene_list <gene_list> <main_output_dir>

after running IsoQuant with the command -
isoquant.py --yaml <single_sample>.yml --data_type pacbio_ccs --reference GRCh38.primary_assembly.genome.fa --index GRCh38.primary_assembly.genome.mmi --genedb gencode.v49.annotation.gtf --complete_genedb --force --sqanti_output --threads 8

where the tree of the main_output_dir is -

<main_output_dir>
├── GRCh38.primary_assembly.genome.fa.fai
└── isoquant_output
    ├── <SAMPLE>
    │   ├── <SAMPLE>.corrected_reads.bed.gz
    │   ├── <SAMPLE>.discovered_gene_counts.tsv
    │   └── ................
    ├── alignment.log
    ├── gencode.v49.annotation.bed
    ├── gencode.v49.annotation.db
    ├── isoquant.log
    └── .params

The first error I got was -
AssertionError: Params file not found: <main_output_dir>/.params
So I tried changing the output directory to "<main_output_dir>/isoquant_output".
But then I got the error -
Error parsing GTF file: Database file isoquant_output/gencode.v49.annotation.db does not exist
(since the params file contains genedb_filename=isoquant_output/gencode.v49.annotation.db,
and the _process_params function in the "post_process.py" script reads it using -
self.genedb_filename = params.get("genedb_filename"))
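
A minimal sketch of one way the stored path could be resolved more tolerantly, assuming the value in the params file is a relative path that no longer matches the directory visualize.py is run from. The helper name `resolve_genedb_path` and its lookup order are hypothetical and not part of IsoQuant:

```python
from pathlib import Path


def resolve_genedb_path(genedb_filename, output_dir):
    """Return an existing path to the annotation database, or None.

    Hypothetical helper (not IsoQuant code). It tries the stored path as-is,
    then relative to the output directory, then relative to its parent, and
    finally just the file name inside the output directory.
    """
    if genedb_filename is None:
        return None

    stored = Path(genedb_filename)
    output_dir = Path(output_dir)

    candidates = [stored]
    if not stored.is_absolute():
        candidates += [
            output_dir / stored,
            output_dir.parent / stored,
            output_dir / stored.name,
        ]

    for candidate in candidates:
        if candidate.is_file():
            return candidate.resolve()
    return None
```

With the layout above, calling `resolve_genedb_path("isoquant_output/gencode.v49.annotation.db", "<main_output_dir>/isoquant_output")` would find the database through the parent-directory candidate.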

I fixed it, and got an additional error (similar to issue #346) -
"TypeError: expected str, bytes or os.PathLike object, not NoneType"

When looking into the code in the "post_process.py" script, I realised there is a problem when running 'visualize.py' with a yml file containing only one "experiment" (or sample): the function "find_files_from_yaml" tries to extract count/TPM information only from the 'combined*' files, which are not generated for a single sample.

I think this is the source of both of the errors explained above.
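
A rough sketch of the kind of fallback that might avoid the NoneType error, assuming the per-sample count files sit in the sample directory and end in `gene_counts.tsv` as in the tree above. The function name `find_count_files`, the glob patterns, and the `suffix` parameter are assumptions for illustration only; this is not the actual "find_files_from_yaml" logic:

```python
from pathlib import Path


def find_count_files(sample_dir, suffix="gene_counts.tsv"):
    """Prefer combined* count files, else fall back to per-sample files.

    Hypothetical helper (not IsoQuant code). Raises FileNotFoundError with a
    readable message instead of returning None, so a missing file does not
    surface later as a TypeError on a None path.
    """
    sample_dir = Path(sample_dir)

    # Multi-sample runs: use the combined files when they exist.
    combined = sorted(sample_dir.glob(f"*combined*{suffix}"))
    if combined:
        return combined

    # Single-sample runs: no combined files, so use whatever matches.
    per_sample = sorted(sample_dir.glob(f"*{suffix}"))
    if not per_sample:
        raise FileNotFoundError(
            f"No files matching '*{suffix}' found in {sample_dir}"
        )
    return per_sample
```

The point of the design is only that the single-sample case gets an explicit fallback and an explicit error, rather than an unset path being passed on.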

Hope everything is clear.
Thank you,
Gili


My yml -

[
  data format: "unmapped_bam",
  {
    name: "sample_1",
    long read files: [
      "path/repeat1_demux.bam",
      "path/repeat2_demux.bam",
    ],
    labels: [
      "repeat1",
      "repeat2",
    ],
    prefix: "sample_1",
  },
]

Labels: visualizer (Relates to visualization module)
