[BUG] AttributeError in PandasDataFrame.__init__ with triad>=0.9.2 #526

@charlesbluca

Minimal Code To Reproduce

import fugue_sql

dag = fugue_sql.FugueSQLWorkflow()
df = dag.df([[0, "hello"], [1, "world"]], "a:int64,b:str")
dag("SELECT * FROM df WHERE a > 0 YIELD DATAFRAME AS result")

result = dag.run("dask")

Describe the bug
With triad>=0.9.2 installed, the reproducer above fails because the PandasUtils object has no enforce_type attribute:

AttributeError                            Traceback (most recent call last)
Cell In [1], line 7
      4 df = dag.df([[0, "hello"], [1, "world"]], "a:int64,b:str")
      5 dag("SELECT * FROM df WHERE a > 0 YIELD DATAFRAME AS result")
----> 7 result = dag.run("dask")

File /datasets/charlesb/miniforge3/envs/dask-sql-py38/lib/python3.8/site-packages/fugue/workflow/workflow.py:1523, in FugueWorkflow.run(self, *args, **kwargs)
   1521         if ctb is None:  # pragma: no cover
   1522             raise
-> 1523         raise ex.with_traceback(ctb)
   1524     self._computed = True
   1525 return DataFrames(
   1526     {
   1527         k: v.result
   (...)
   1530     }
   1531 )

Cell In [1], line 4
      1 import fugue_sql
      3 dag = fugue_sql.FugueSQLWorkflow()
----> 4 df = dag.df([[0, "hello"], [1, "world"]], "a:int64,b:str")
      5 dag("SELECT * FROM df WHERE a > 0 YIELD DATAFRAME AS result")
      7 result = dag.run("dask")

File /datasets/charlesb/miniforge3/envs/dask-sql-py38/lib/python3.8/site-packages/fugue/dataframe/pandas_dataframe.py:64, in PandasDataFrame.__init__(self, df, schema, metadata, pandas_df_wrapper)
     62 schema = _input_schema(schema).assert_not_empty()
     63 pdf = pd.DataFrame(df, columns=schema.names)
---> 64 pdf = PD_UTILS.enforce_type(pdf, schema.pa_schema, null_safe=True)
     65 if PD_UTILS.empty(pdf):
     66     for k, v in schema.items():

AttributeError: 'PandasUtils' object has no attribute 'enforce_type'

Expected behavior
The workflow above should run successfully, as it does with triad=0.9.1.
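
To check whether an environment falls in the range described here, a small version probe can help. This is only a sketch: the helper names (`parse_version`, `is_affected`, `installed_triad_version`) are made up for illustration, the 0.9.2 threshold comes from this report alone, and the comparison is a naive numeric one that ignores pre-release suffixes. It does not tell you whether a newer fugue release already handles the change.

```python
import re
from importlib import metadata  # standard library since Python 3.8

# First triad version reported as failing in this issue (assumption: taken
# from the report, not from triad's changelog).
FIRST_AFFECTED = (0, 9, 2)


def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(version):
    """True if the version is at or above the first version reported here."""
    return parse_version(version) >= FIRST_AFFECTED


def installed_triad_version():
    """Return the installed triad version string, or None if not installed."""
    try:
        return metadata.version("triad")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_triad_version()
    if found is None:
        print("triad is not installed in this environment")
    else:
        print(f"triad {found}: in reported range = {is_affected(found)}")
```

If the probe reports a version in the affected range, constraining the environment to `triad<0.9.2` should restore the behavior seen with 0.9.1, at least until the fugue side is updated.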

Environment (please complete the following information):

  • Backend: dask
  • Backend version: 2022.3.0
  • Python version: 3.8
  • OS: ubuntu 20.04
