Dimension- and Bucket-Level Attribution in Anomaly Analysis #1350

@Madhavi96

Ask your question

  • I'm using the anomaly attribution feature in DoWhy on a causal model where metrics are represented as nodes. Each metric can have multiple dimensions (e.g., area, store, product), and each dimension can have multiple buckets (e.g., for product: cars, bicycles, etc.).

  • With a training dataframe and an anomaly dataframe, I can obtain attribution scores that show how much each metric contributed to the anomaly in the target node. However, I would like to drill down further and quantify how much positive or negative contribution comes from each dimension and each bucket within a metric.

  • Is there an existing DoWhy method to directly obtain attribution scores at the dimension/bucket level? If not, what approach would you suggest to achieve this breakdown?

  • My current attribution score calculation method is shown below:

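    from typing import Any

    import pandas as pd
    from dowhy import gcm
    from dowhy.gcm import StructuralCausalModel

    # Method of a larger class (not shown). The result maps each upstream node
    # (and the target itself) to an array with one score per row of anomaly_df.
    async def calculate_anomaly_attribution_scores(
        self,
        target_metric: str,
        anomaly_df: pd.DataFrame,
        model: StructuralCausalModel,
    ) -> Any:
        attributions = gcm.attribute_anomalies(model, target_node=target_metric, anomaly_samples=anomaly_df)
        return attributions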

Expected behavior

  • Attribution scores should be decomposable such that for each metric node, I can break down its overall contribution into contributions from its dimensions and further into buckets within those dimensions.
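
To illustrate the kind of breakdown meant here, below is a minimal sketch. It assumes each anomaly sample (row) can be tagged with exactly one bucket of a dimension. It is not an existing DoWhy feature: it only regroups the per-sample scores that `gcm.attribute_anomalies` returns (a dict of node name to one score per anomaly sample). The helper name `aggregate_attributions_by_bucket`, the `bucket_labels` input, and all numbers are hypothetical, and summing scores within a bucket is an assumed choice of aggregation.

```python
from collections import defaultdict
from typing import Dict, Mapping, Sequence


def aggregate_attributions_by_bucket(
    attributions: Mapping[str, Sequence[float]],
    bucket_labels: Sequence[str],
) -> Dict[str, Dict[str, float]]:
    """Sum each node's per-sample attribution scores within each bucket.

    attributions: node name -> one score per anomaly sample.
    bucket_labels: hypothetical bucket label of each anomaly sample,
        in the same row order as the anomaly dataframe.
    """
    result: Dict[str, Dict[str, float]] = {}
    for node, scores in attributions.items():
        if len(scores) != len(bucket_labels):
            raise ValueError(
                f"{node}: expected {len(bucket_labels)} scores, got {len(scores)}"
            )
        per_bucket: Dict[str, float] = defaultdict(float)
        for label, score in zip(bucket_labels, scores):
            # Positive and negative contributions are kept signed and summed.
            per_bucket[label] += float(score)
        result[node] = dict(per_bucket)
    return result


# Made-up numbers purely for illustration.
example_attributions = {"sales": [0.5, -0.25, 1.0], "traffic": [0.25, 0.25, -0.5]}
example_buckets = ["cars", "bicycles", "cars"]
breakdown = aggregate_attributions_by_bucket(example_attributions, example_buckets)
```

With this grouping, the bucket-level values of a node add up to that node's total over all anomaly samples, which is the decomposability property described above. It does not say how much a bucket contributes inside a single sample.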

Version information:

  • DoWhy version [e.g. 0.13]
