Ability to infer relationships between two df's #866

Open
@kmax12

It can be useful to infer relationships between two tables, especially as we build higher-level applications on top of featuretools.

This functionality could be implemented with a combination of rules and heuristics.

Rules:

  • relationships must be between two variables of the same dtype
  • the parent variable must be the index column

Heuristics:

  • relationships often take the form product_id --> id or product_id --> product_id
  • the child id shouldn't be a numeric semantic type (it's fine if the underlying data is numeric)

def recommend_relationships(entity_a, entity_b):
    """Returns potential relationships between entity a and entity b.

    Args:
        entity_a (ft.Entity): entity a
        entity_b (ft.Entity): entity b

    Returns:
        List[ft.Relationship]: list of potential relationships
    """
    pass

It may also make sense to add an API that takes in a full entityset:

def recommend_relationships_entityset(entityset):
    """Returns all potential relationships between any two entities in an entityset.

    Args:
        entityset (ft.EntitySet): entityset

    Returns:
        List[ft.Relationship]: list of potential relationships
    """
    pass
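
One way the entityset-level version could be built is by running a pairwise check over every pair of tables. The sketch below assumes exactly that and nothing more: `recommend_relationships_all`, `by_name`, and the dict-based table descriptions are hypothetical, and the pairwise function is passed in so that any rule set can be plugged in.

```python
from itertools import combinations


def recommend_relationships_all(tables, recommend_pair):
    """Collect candidates for every unordered pair of tables.

    `tables` maps a table name to whatever object `recommend_pair` accepts.
    `recommend_pair(a, b)` is assumed to check both directions and return a list.
    """
    candidates = []
    # Sorting the names keeps the output order deterministic.
    for name_a, name_b in combinations(sorted(tables), 2):
        candidates.extend(recommend_pair(tables[name_a], tables[name_b]))
    return candidates


def by_name(a, b):
    """Toy pairwise rule: a child column named '<parent name>_<parent index>'."""
    out = []
    for parent, child in ((a, b), (b, a)):
        wanted = f"{parent['name']}_{parent['index']}"
        if wanted in child["columns"]:
            out.append((parent["name"], parent["index"], child["name"], wanted))
    return out


tables = {
    "customer": {"name": "customer", "index": "id", "columns": ["id", "email"]},
    "order": {
        "name": "order",
        "index": "id",
        "columns": ["id", "customer_id", "product_id"],
    },
    "product": {"name": "product", "index": "id", "columns": ["id", "price"]},
}
result = recommend_relationships_all(tables, by_name)
print(result)
# [('customer', 'id', 'order', 'customer_id'), ('product', 'id', 'order', 'product_id')]
```

The number of pairs grows quadratically with the number of tables, which is probably fine for typical entitysets but worth keeping in mind for very wide schemas.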

Labels: API, needs design
