expected output type 'Null', got 'Int64'; set return_dtype to the proper datatype #40

@xychen233

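Hello! Thank you for releasing CPhasing. I have recently been using it to test an assembled haplotype genome, and I ran into an error that I do not know how to resolve. I would appreciate any advice.
These are the commands I ran: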

export PATH=/media/APP/miniconda3/bin/:$PATH
export PYTHONPATH=/media/APP/python3.12/site-packages:$PYTHONPATH
export PATH=/media/APP/CPhasing/bin:$PATH
export PYTHONPATH=/media/APP/CPhasing:$PYTHONPATH
cphasing pipeline -f genome.fasta -hic1 HiC_1.fq.gz -hic2 HiC_2.fq.gz -t 40 -n 47

The problem occurs at step 3:

                    #----------------------------------#
                    #  Running step 3. hyperpartition  #
                    #----------------------------------#
[19:25:43] INFO     Running hyperpartition with `basal(haploid)`     cli.py:3952
                    mode.
           INFO     Load raw hypergraph from pairs file              cli.py:4088
                    `../HiC.pairs.pqs`
           INFO     Extract edges from pairs.                  hypergraph.py:109
           INFO     Parsing pqs ...                                   pqs.py:669
           INFO     Filtered the data with mapq >= 1.                 pqs.py:347
[19:25:56] ERROR    expected output type 'Null', got 'Int64'; set    cli.py:1189
                    `return_dtype` to the proper datatype
                    _RemoteTraceback:
                    """
                    Traceback (most recent call last):
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/externals/loky/process_executor
                    .py", line 490, in _process_worker
                        r = call_item()
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/externals/loky/process_executor
                    .py", line 291, in __call__
                        return self.fn(*self.args, **self.kwargs)
                               ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      File
                    "/media/APP/python3.12/site
                    -packages/joblib/parallel.py", line 607, in
                    __call__
                        return [func(*args, **kwargs) for func,
                    args, kwargs in self.items]
                                ~~~~^^^^^^^^^^^^^^^^^
                      File
                    "/media/APP/CPhasing/cphasing/pqs.py",
                    line 1107, in process_chunk_hg
                        return chunk.collect()
                               ~~~~~~~~~~~~~^^
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/_utils/deprecation.py", line
                    97, in wrapper
                        return function(*args, **kwargs)
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/lazyframe/opt_flags.py", line
                    328, in wrapper
                        return function(*args, **kwargs)
                      File
                    "/media/APP/python3.12/site
                    -packages/polars/lazyframe/frame.py", line 2415,
                    in collect
                        return wrap_df(ldf.collect(engine,
                    callback))
                                       ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
                    polars.exceptions.SchemaError: expected output
                    type 'Null', got 'Int64'; set `return_dtype` to
                    the proper datatype
                    """

                    The above exception was the direct cause of the
                    following exception:

                    ╭───── Traceback (most recent call last) ──────╮
                    │ /media/APP/CPhasing/cphasing/cli.py:1 │
                    │ 121 in pipeline                              │
                    │                                              │
                    │   1118 │                                     │
                    │   1119 │   today = datetime.now().strftime(" │
                    │   1120 │   try:                              │
                    │ ❱ 1121 │   │   run(fasta,                    │
                    │   1122 │   │   │   ul_data,                  │
                    │   1123 │   │   │   porec_data,               │
                    │   1124 │   │   │   porectable, pairs,        │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/pipeline │
                    │ /pipeline.py:1071 in run                     │
                    │                                              │
                    │   1068 │   │   │   _out_sh.write("\n")       │
                    │   1069 │   │                                 │
                    │   1070 │   │   try:                          │
                    │ ❱ 1071 │   │   │   hyperpartition.main(args= │
                    │   1072 │   │   │   │   │   │   │   prog_name │
                    │   1073 │   │   except SystemExit as e:       │
                    │   1074 │   │   │   exc_info = sys.exc_info() │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/rich_click/rich_command.py:216  │
                    │ in main                                      │
                    │                                              │
                    │   213 │   │   try:                           │
                    │   214 │   │   │   try:                       │
                    │   215 │   │   │   │   with self.make_context │
                    │ ❱ 216 │   │   │   │   │   rv = self.invoke(c │
                    │   217 │   │   │   │   │   if not standalone_ │
                    │   218 │   │   │   │   │   │   return rv      │
                    │   219 │   │   │   │   │   # it's not safe to │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/click/core.py:1246 in invoke    │
                    │                                              │
                    │   1243 │   │   │   echo(style(message, fg="r │
                    │   1244 │   │                                 │
                    │   1245 │   │   if self.callback is not None: │
                    │ ❱ 1246 │   │   │   return ctx.invoke(self.ca │
                    │   1247 │                                     │
                    │   1248 │   def shell_complete(self, ctx: Con │
                    │   1249 │   │   """Return a list of completio │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/click/core.py:814 in invoke     │
                    │                                              │
                    │    811 │   │                                 │
                    │    812 │   │   with augment_usage_errors(sel │
                    │    813 │   │   │   with ctx:                 │
                    │ ❱  814 │   │   │   │   return callback(*args │
                    │    815 │                                     │
                    │    816 │   def forward(self, cmd: Command, / │
                    │    817 │   │   """Similar to :meth:`invoke`  │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/cli.py:4 │
                    │ 089 in hyperpartition                        │
                    │                                              │
                    │   4086 │   elif pairs:                       │
                    │   4087 │   │   if is_file_changed(hcr_bed) o │
                    │        Path(hypergraph_path).exists():       │
                    │   4088 │   │   │   logger.info(f"Load raw hy │
                    │ ❱ 4089 │   │   │   he = Extractor(hypergraph │
                    │   4090 │   │   │   │   │   │      min_qualit │
                    │        edge_length=edge_length,              │
                    │   4091 │   │   │   │   │   │      hcr_invert │
                    │   4092 │   │   │   he.save(hypergraph_path)  │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/hypergra │
                    │ ph.py:95 in __init__                         │
                    │                                              │
                    │     92 │   │   self.log_dir = Path(log_dir)  │
                    │     93 │   │   self.log_dir.mkdir(parents=Tr │
                    │     94 │   │                                 │
                    │ ❱   95 │   │   self.edges = self.generate_ed │
                    │     96 │                                     │
                    │     97 │   @staticmethod                     │
                    │     98 │   def _process_df(df, contig_idx, t │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/hypergra │
                    │ ph.py:159 in generate_edges                  │
                    │                                              │
                    │    156 │   │   │   │                         │
                    │    157 │   │   │   │   chunks = p.read(min_m │
                    │    158 │   │   │   │                         │
                    │ ❱  159 │   │   │   │   res = p.to_hg_df(chun │
                    │    160 │   │   │   │   │   │   │   │    edge │
                    │    161 │   │   │   │                         │
                    │    162 │   │   │   │   if Path(f"{pairs_pref │
                    │                                              │
                    │ /media/APP/CPhasing/cphasing/pqs.py:6 │
                    │ 75 in to_hg_df                               │
                    │                                              │
                    │    672 │   │   for chunk in chunks:          │
                    │    673 │   │   │   args.append((Path(chunk). │
                    │        min_mapq, edge_length))               │
                    │    674 │   │                                 │
                    │ ❱  675 │   │   results = Parallel(n_jobs=sel │
                    │    676 │   │   │   │   │   delayed(process_c │
                    │    677 │   │   │   │   )                     │
                    │    678                                       │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:2072 in      │
                    │ __call__                                     │
                    │                                              │
                    │   2069 │   │   # dispatch of the tasks to th │
                    │   2070 │   │   next(output)                  │
                    │   2071 │   │                                 │
                    │ ❱ 2072 │   │   return output if self.return_ │
                    │   2073 │                                     │
                    │   2074 │   def __repr__(self):               │
                    │   2075 │   │   return "%s(n_jobs=%s)" % (sel │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1682 in      │
                    │ _get_outputs                                 │
                    │                                              │
                    │   1679 │   │   │   yield                     │
                    │   1680 │   │   │                             │
                    │   1681 │   │   │   with self._backend.retrie │
                    │ ❱ 1682 │   │   │   │   yield from self._retr │
                    │   1683 │   │                                 │
                    │   1684 │   │   except GeneratorExit:         │
                    │   1685 │   │   │   # The generator has been  │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1784 in      │
                    │ _retrieve                                    │
                    │                                              │
                    │   1781 │   │   │   # exception (e.g. `Genera │
                    │   1782 │   │   │   # worker traceback.       │
                    │   1783 │   │   │   if self._aborting:        │
                    │ ❱ 1784 │   │   │   │   self._raise_error_fas │
                    │   1785 │   │   │   │   break                 │
                    │   1786 │   │   │                             │
                    │   1787 │   │   │   nb_jobs = len(self._jobs) │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:1859 in      │
                    │ _raise_error_fast                            │
                    │                                              │
                    │   1856 │   │   # calling get_result. This jo │
                    │   1857 │   │   # called directly or if the g │
                    │   1858 │   │   if error_job is not None:     │
                    │ ❱ 1859 │   │   │   error_job.get_result(self │
                    │   1860 │                                     │
                    │   1861 │   def _warn_exit_early(self):       │
                    │   1862 │   │   """Warn the user if the gener │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:758 in       │
                    │ get_result                                   │
                    │                                              │
                    │    755 │   │   │   # We assume that the resu │
                    │    756 │   │   │   # callback thread, and is │
                    │    757 │   │   │   # be returned.            │
                    │ ❱  758 │   │   │   return self._return_or_ra │
                    │    759 │   │                                 │
                    │    760 │   │   # For other backends, the mai │
                    │    761 │   │   try:                          │
                    │                                              │
                    │ /media/APP/python3.12/s │
                    │ ite-packages/joblib/parallel.py:773 in       │
                    │ _return_or_raise                             │
                    │                                              │
                    │    770 │   def _return_or_raise(self):       │
                    │    771 │   │   try:                          │
                    │    772 │   │   │   if self.status == TASK_ER │
                    │ ❱  773 │   │   │   │   raise self._result    │
                    │    774 │   │   │   return self._result       │
                    │    775 │   │   finally:                      │
                    │    776 │   │   │   del self._result          │
                    ╰──────────────────────────────────────────────╯
                    SchemaError: expected output type 'Null', got
                    'Int64'; set `return_dtype` to the proper
                    datatype
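
Since the `SchemaError` is raised inside `polars` (called from a `joblib` worker), the installed versions of those packages may be relevant. A minimal sketch for collecting them from the same environment is below. The package list is an assumption taken from the paths in the traceback, and `installed_versions` is a hypothetical helper name, not part of CPhasing.

```python
from importlib import metadata

# Packages named in the traceback paths above; this list is an assumption.
PACKAGES = ("polars", "joblib", "click", "rich-click")


def installed_versions(names=PACKAGES):
    """Return {package: version or 'not installed'} for the active environment."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running this with the same `PATH` and `PYTHONPATH` exports as the `cphasing pipeline` command should report the versions that the pipeline actually imports.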

I would be grateful for any suggestions on how to resolve this error. Thank you very much!
