BUSCO fails with exit error 1, cannot parse null string #989

@masalgar

Description of the bug

I'm trying to run the pipeline, but BUSCO always fails. It runs successfully on some bins, and I can check their statistics inside the job's folder, but at some point it throws an error and the job fails. I've been trying to troubleshoot this to no avail. Could you point me in the right direction? My params and custom resources are as follows.

Custom resources (this is just to avoid errors with CATPACK and to give metaspades enough resources to finish on the first attempt):

process {
    resourceLimits = [
        cpus: 48,
        memory: 799.GB
    ]
    withName: METASPADES {
        cpus   = { params.spades_fix_cpus != -1 ? params.spades_fix_cpus : (20 * task.attempt) }
        memory = { 128.GB * (2 ** (task.attempt - 1)) }
    }
    withName: 'NFCORE_MAG:MAG:CATPACK:CATPACK_BINS' {
        cpus = { 12 * task.attempt }
        time = { 24.h * task.attempt }
    }
    withName: 'NFCORE_MAG:MAG:CATPACK:CATPACK_SUMMARISE_BINS' {
        ext.when = false
    }
}
params {
    skip_cat_summarise = true
}
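
For reference, the per-attempt requests implied by the METASPADES block can be worked out with a few lines of Python. This is a minimal sketch, not pipeline code: it assumes that `resourceLimits` simply caps each request at the configured maximum, and the helper name `metaspades_request` is made up for illustration.

```python
# Hypothetical helper mirroring the METASPADES closures in the config above.
# Assumption: resourceLimits clamps each request to the configured maximum.
CPU_LIMIT = 48
MEM_LIMIT_GB = 799


def metaspades_request(attempt, spades_fix_cpus=-1):
    """Return (cpus, memory_gb) requested on a given 1-based attempt."""
    # -1 means "not fixed": scale CPUs with the attempt number instead.
    cpus = spades_fix_cpus if spades_fix_cpus != -1 else 20 * attempt
    # Memory doubles on every retry, starting from 128 GB.
    memory_gb = 128 * 2 ** (attempt - 1)
    return min(cpus, CPU_LIMIT), min(memory_gb, MEM_LIMIT_GB)


for attempt in range(1, 5):
    print(attempt, metaspades_request(attempt))
```

Under these assumptions the first attempt asks for 20 CPUs and 128 GB, and both values reach their caps (48 CPUs, 799 GB) by the fourth attempt.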

Params json:

{
    "input": "\/home\/masalgar\/DeiC-KU-L59\/users\/masalgar\/CF_metagenome\/mag_samples.csv",
    "outdir": "\/home\/masalgar\/DeiC-KU-L59\/users\/masalgar\/CF_metagenome\/mag\/",
    "multiqc_title": "CF_Metagenome mag assembly",
    "skip_clipping": true,
    "skip_shortread_qc": true,
    "cat_db": "\/home\/masalgar\/DeiC-KU-L59\/databases\/CAT\/nr\/",
    "run_checkm2": true,
    "checkm2_db": "\/home\/masalgar\/DeiC-KU-L59\/databases\/CheckM2\/CheckM2_database\/uniref100.KO.1.dmnd",
    "gtdb_db": "\/home\/masalgar\/DeiC-KU-L59\/databases\/GTDB\/gtdbtk_data.tar.gz",
    "skip_prokka": true,
    "exclude_unbins_from_postbinning": true,
    "run_busco": true,
    "busco_clean": true,
    "busco_db": "\/home\/masalgar\/DeiC-KU-L59\/databases\/BUSCO\/",
    "refine_bins_dastool": true,
    "gtdbtk_pplacer_useram": true,
    "postbinning_input": "refined_bins_only",
    "run_gunc": true,
    "gunc_db": "\/home\/masalgar\/DeiC-KU-L59\/databases\/GUNC\/gunc_db_progenomes2.1.dmnd",
    "generate_bigmag_file": true
}
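
As a quick sanity check on a params file like the one above, the path-valued entries can be tested for existence before launching. The following is a small Python sketch under stated assumptions: the helper `missing_paths` is hypothetical, it treats any string starting with `/` as a path, and it uses a made-up path rather than the real ones. Note that an output directory such as `outdir` may legitimately not exist yet.

```python
import json
import os


def missing_paths(params):
    """Return names of string parameters that look like absolute paths but do not exist."""
    return sorted(
        key
        for key, value in params.items()
        if isinstance(value, str) and value.startswith("/") and not os.path.exists(value)
    )


# JSON allows "/" to be escaped as "\/", so escaped paths decode to plain ones.
example = json.loads(r'{"busco_db": "\/no\/such\/dir\/", "run_busco": true}')
print(example["busco_db"])
print(missing_paths(example))
```

To check a real file, load it with `json.load(open(path))` and pass the resulting dictionary to `missing_paths`.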

Command used and terminal output

nextflow run nf-core/mag -r 5.4.0 -profile apptainer -resume \
-name CF_m_mag_trimmed-and-dedup_final5 \
-c /home/masalgar/DeiC-KU-L59/users/masalgar/CF_metagenome/mag_custom_resources.config \
-work-dir /home/masalgar/DeiC-KU-L59/users/masalgar/tmp \
-params-file /home/masalgar/DeiC-KU-L59/users/masalgar/CF_metagenome/mag_CF_metagenome_parms.json

Relevant files

I've attached nextflow.json and the log files of one of the failed BUSCO jobs.

.nextflow.log

.command.log

.command.err.txt

.command.out.txt

The error from the .err file is as follows:

Exception in thread "main" java.lang.NumberFormatException: Cannot parse null string
	at java.base/java.lang.Integer.parseInt(Integer.java:550)
	at java.base/java.lang.Integer.<init>(Integer.java:1065)
	at phylolab.taxonamic.PPlacerJSONMerger.relabelJson(PPlacerJSONMerger.java:172)
	at phylolab.taxonamic.PPlacerJSONMerger.main(PPlacerJSONMerger.java:288)

Traceback (most recent call last):
  File "/opt/conda/bin/run_sepp.py", line 26, in <module>
    ExhaustiveAlgorithm().run()
  File "/opt/conda/lib/python3.12/site-packages/sepp/algorithm.py", line 205, in run
    self.merge_results()
  File "/opt/conda/lib/python3.12/site-packages/sepp/exhaustive.py", line 292, in merge_results
    mergeJsonJob.run()
  File "/opt/conda/lib/python3.12/site-packages/sepp/jobs.py", line 150, in run
    raise JobError("\n".join([
sepp.scheduler.JobError: The following execution failed:
java -jar /opt/conda/bin/seppJsonMerger.jar - - /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/placement_files/output_placement.json
json locations: [/faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_0/pplacer.extended.0.38dx0iyf.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_0/pplacer.extended.1.ec1om_7a.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_0/pplacer.extended.2.x_nnddst.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_1/pplacer.extended.0.i0uwb1ad.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_1/pplacer.extended.1.jmtuj2t6.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_1/pplacer.extended.2.xx22nr8m.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_2/pplacer.extended.0.69nlt327.jplace, 
/faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_2/pplacer.extended.1.zxe6bwxa.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_4/pplacer.extended.0.bsqyufq8.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_7/pplacer.extended.0.nhkmd5t2.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_7/pplacer.extended.1._ymc4c_0.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_7/pplacer.extended.2.q7g73_u9.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_8/pplacer.extended.0.og_13lua.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_8/pplacer.extended.1.87wbxge5.jplace, 
/faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_8/pplacer.extended.2.onn18c8q.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_12/pplacer.extended.0.1p2kflh0.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_14/pplacer.extended.0.f2l4_sot.jplace, /faststorage/project/DeiC-KU-L59/users/masalgar/tmp/1a/8ab5b582aba99b1387baa123069741/CF_metagenome_dedup-auto-busco/MEGAHIT-MaxBin2Refined-CF_metagenome_dedup.003.fa/auto_lineage/run_bacteria_odb12/sepp_tmp_files/output._tenenrn/root/P_14/pplacer.extended.1.ubgw_ugq.jplace]
Exception in thread "main" java.lang.NumberFormatException: Cannot parse null string
	at java.base/java.lang.Integer.parseInt(Integer.java:550)
	at java.base/java.lang.Integer.<init>(Integer.java:1065)
	at phylolab.taxonamic.PPlacerJSONMerger.relabelJson(PPlacerJSONMerger.java:172)
	at phylolab.taxonamic.PPlacerJSONMerger.main(PPlacerJSONMerger.java:288)
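
One way to narrow this down is to check whether each of the `.jplace` files listed above is readable and well-formed before the merger sees it. The Python sketch below only does that: it assumes the files are JSON documents with top-level `tree`, `placements` and `fields` keys, and the function name `check_jplace` is made up for illustration. It does not identify which value the merger parses as null, so a clean result here does not rule out a problem inside the files.

```python
import json
import sys


def check_jplace(path):
    """Return a one-line status for a .jplace file (assumed to be JSON)."""
    try:
        with open(path) as handle:
            data = json.load(handle)
    except (OSError, ValueError) as exc:
        # ValueError covers both malformed JSON and undecodable bytes.
        return f"{path}: unreadable ({exc})"
    if not isinstance(data, dict):
        return f"{path}: top-level value is not a JSON object"
    missing = [key for key in ("tree", "placements", "fields") if key not in data]
    if missing:
        return f"{path}: missing keys {missing}"
    return f"{path}: ok, {len(data['placements'])} placement entries"


if __name__ == "__main__":
    # Usage: python check_jplace.py file1.jplace file2.jplace ...
    for jplace_path in sys.argv[1:]:
        print(check_jplace(jplace_path))
```

Running it over the paths in the `json locations` list from within the failed job's work directory would show whether any file is empty, truncated, or missing one of the expected keys.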

System information

No response
