Adding unnamed DataFrames to EntitySets (and have Featuretools generate a name for df) #1740

Open
@chukarsten

Adding unnamed DataFrames to EntitySets


Bug/Feature Request Description

Currently, EvalML's DFSTransformer runs into issues when unnamed dataframes are passed to its .fit() method.

Calling DFSTransformer.fit() adds the dataframe to an EntitySet() via the _make_entity_set() method, which uses the new add_dataframe() method in FT 1.0.0. add_dataframe() now requires either:
1.) a dataframe that is not Woodwork-initialized, in which case it initializes the dataframe and names it according to a parameter passed to add_dataframe()
2.) a dataframe that is Woodwork-initialized and named.

EvalML currently supports Woodwork-initialized dataframes that are unnamed, which is the case neither option above covers.
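The issue title also raises the option of having Featuretools generate a name when none is supplied. As a rough illustration only, one way such a name could be picked is sketched below. `generate_dataframe_name` and the `dataframe_N` naming scheme are hypothetical and are not part of Featuretools or EvalML.

```python
from itertools import count
from typing import Iterable


def generate_dataframe_name(existing: Iterable[str], prefix: str = "dataframe") -> str:
    """Return the first name of the form '<prefix>_<n>' not already in use.

    Hypothetical helper: neither the function nor the naming scheme comes
    from Featuretools; it only illustrates the idea in the issue title.
    """
    taken = set(existing)
    for n in count():
        candidate = f"{prefix}_{n}"
        if candidate not in taken:
            return candidate


# Names already registered in some (imaginary) EntitySet.
print(generate_dataframe_name(["customers", "dataframe_0"]))
```

Scanning from zero keeps the generated names short and predictable, at the cost of reusing a number if an earlier generated dataframe has been removed.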

Expected Output

We'd suggest the following changes:

        if dataframe.ww.schema is None:
            ...

        else:
            if dataframe.ww.name is None:
                if dataframe_name is None:
                    raise ValueError('Cannot add a Woodwork DataFrame to EntitySet without a name or a proposed name')
                else:
                    dataframe.ww.name = dataframe_name
            if dataframe.ww.index is None:
                if index is None or not make_index:
                    raise ValueError('Cannot add Woodwork DataFrame to EntitySet without index')
                else:
                    # do index stuff
                    ...

Hopefully something along these lines could accommodate Woodwork-initialized dataframes that lack a name by simply assigning them the name passed to add_dataframe().
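The naming branch of the proposal can be tried in isolation, without importing Featuretools or Woodwork. The sketch below assumes a small stand-in object for the two attributes the proposal reads from `dataframe.ww`; `SchemaStub` and `resolve_name` are hypothetical names used only for illustration, and the index branch is left out because the proposal does not spell it out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchemaStub:
    """Hypothetical stand-in for the parts of `dataframe.ww` used below."""
    name: Optional[str] = None
    index: Optional[str] = None


def resolve_name(schema: SchemaStub, dataframe_name: Optional[str]) -> str:
    """Mirror the proposed naming branch for an already-initialized dataframe."""
    if schema.name is None:
        if dataframe_name is None:
            raise ValueError(
                "Cannot add a Woodwork DataFrame to EntitySet "
                "without a name or a proposed name"
            )
        # Proposed change: adopt the caller-supplied name instead of raising.
        schema.name = dataframe_name
    # An existing name is kept as-is, as in the proposal.
    return schema.name


unnamed = SchemaStub(index="id")
print(resolve_name(unnamed, "X"))  # the unnamed schema takes the supplied name
```

Under this sketch an already-named dataframe keeps its own name even when a different `dataframe_name` is passed; whether that case should instead raise or overwrite is a design question the proposal leaves open.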

Output of featuretools.show_info()

Featuretools version: 1.0.0

Labels: enhancement, evalml, needs design, spike