POD5 has encountered an error: ''return_dtype' of function python_udf must be set #172

@etwatson

Issue Description

I am trying to subset my pod5 files as described on the Nanopore website for duplex calling.

When I run

    pod5 subset -r pod5_skip/ --summary summary.tsv --columns channel --output /scratch/raw/run_06_09_25_split_by_channel

I get the error:

POD5 has encountered an error: ''return_dtype' of function python_udf must be set

A later expression might fail because the output type is not known. Set return_dtype=pl.self_dtype() if the type is unchanged, or set the proper output data type.

Resolved plan until failure:

        ---> FAILED HERE RESOLVING 'sink' <---
SELECT [col("__dest_fname").unique()]
   WITH_COLUMNS:
   ["/scratch/raw/run_06_09_25_split_by_channel".str.concat_horizontal([col("__dest_fname")]).strict_cast(Categorical(Categories { name: "", namespace: "", physical: U32 }, CategoricalMapping { max_categories: 4294967295, upper_bound: 1 })).alias("__dest_fname")]
     WITH_COLUMNS:
     ["channel-".str.concat_horizontal([col("channel"), ".pod5"]).strict_cast(Categorical(Categories { name: "", namespace: "", physical: U32 }, CategoricalMapping { max_categories: 4294967295, upper_bound: 1 })).alias("__dest_fname"), col("read_id").alias("__read_id")]
      DF ["read_id", "channel"]; PROJECT */2 COLUMNS'

For detailed information set POD5_DEBUG=1'
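
For anyone reading the resolved plan, the grouping it describes can be sketched in plain Python: each row of the summary is assigned a destination file named "channel-" + channel + ".pod5" under the output directory, and read IDs are collected per destination. This is only an illustration using the standard library, not pod5's actual implementation. The function name plan_subset, the sample rows, and the use of a path join for the output directory are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path


def plan_subset(summary_tsv, output_dir, column="channel"):
    """Map each destination file to the read IDs it would receive.

    Hypothetical helper for illustration; it does not read or write POD5 data.
    """
    targets = defaultdict(list)
    reader = csv.DictReader(summary_tsv, delimiter="\t")
    for row in reader:
        # Mirrors the "channel-" + value + ".pod5" expression in the plan.
        dest = Path(output_dir) / f"{column}-{row[column]}.pod5"
        targets[str(dest)].append(row["read_id"])
    return dict(targets)


# Made-up summary rows, for illustration only.
example = io.StringIO("read_id\tchannel\nread-a\t1\nread-b\t2\nread-c\t1\n")
mapping = plan_subset(example, "out")
for dest, read_ids in sorted(mapping.items()):
    print(dest, read_ids)
```

The sketch only shows which reads end up in which file; the failure reported above happens before any such mapping is produced.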

Logs

Setting POD5_DEBUG=1 does not change the error reporting behavior.

Specifications

  • Pod5 Version: 0.3.28
  • Python Version: 3.12.9
  • Platform: Ubuntu 22.04 LTS
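
The error text mentions python_udf and pl.self_dtype(), which are Polars names, so the installed Polars version is probably relevant too. A minimal sketch for collecting both versions from the same environment, assuming the packages are installed under the distribution names pod5 and polars (the helper name installed_version is made up for this example):

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


# Report the versions that matter for this error.
for name in ("pod5", "polars"):
    print(f"{name}: {installed_version(name)}")
```

Running this in the environment where pod5 subset fails should show which Polars release pod5 is running against.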
