Description
Ask your question
I'm using the anomaly attribution feature in DoWhy on a causal model where metrics are represented as nodes. Each metric can have multiple dimensions (e.g., area, store, product), and each dimension can have multiple buckets (e.g., for product: cars, bicycle, etc.).
With a training dataframe and an anomaly dataframe, I can obtain attribution scores that show how much each metric contributed to the anomaly in the target node. However, I would like to drill down further and quantify how much positive or negative contribution comes from each dimension and each bucket within a metric.
Is there an existing DoWhy method to directly obtain attribution scores at the dimension/bucket level? If not, what approach would you suggest to achieve this breakdown?
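My current attribution score calculation method is shown below:

from typing import Any

import pandas as pd
from dowhy import gcm
from dowhy.gcm import StructuralCausalModel

# Method of a larger class (not shown), hence the `self` parameter.
async def calculate_anomaly_attribution_scores(
    self,
    target_metric: str,
    anomaly_df: pd.DataFrame,
    model: StructuralCausalModel,
) -> Any:
    # Note: gcm.attribute_anomalies expects an invertible SCM
    # (gcm.InvertibleStructuralCausalModel).
    attributions = gcm.attribute_anomalies(model, target_node=target_metric, anomaly_samples=anomaly_df)
    return attributions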
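For context, this is roughly how the node-level result can be split into positive and negative contributions today. It is only a sketch: it assumes the returned `attributions` object behaves like a dictionary mapping each node name to a one-dimensional sequence of scores (one per anomaly sample). The helper name `summarize_attributions`, the node names, and the numbers are made up for illustration and are not part of DoWhy.

```python
from typing import Dict, Sequence


def summarize_attributions(
    attributions: Dict[str, Sequence[float]],
) -> Dict[str, Dict[str, float]]:
    """Split each node's per-sample scores into positive, negative and net totals."""
    summary = {}
    for node, scores in attributions.items():
        values = [float(s) for s in scores]
        summary[node] = {
            "positive": sum(v for v in values if v > 0),
            "negative": sum(v for v in values if v < 0),
            "net": sum(values),
        }
    return summary


# Made-up scores for two hypothetical metric nodes, one value per anomaly sample.
example = {"revenue": [1.5, -0.5, 2.0], "traffic": [-1.0, 0.25, -0.25]}
summary = summarize_attributions(example)
```

This stops at the metric (node) level, which is exactly the limitation described above.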
Expected behavior
- Attribution scores should be decomposable such that for each metric node, I can break down its overall contribution into contributions from its dimensions and further into buckets within those dimensions.
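To make the expected output shape concrete, here is a minimal sketch of the kind of breakdown I have in mind for a single metric node. It assumes each anomaly sample carries a bucket label for the dimension of interest, which may not match every setup. The function `breakdown_by_bucket` and the data are hypothetical, and this is plain grouping of per-sample scores, not a DoWhy feature and not a causal decomposition.

```python
from collections import defaultdict
from typing import Dict, Sequence


def breakdown_by_bucket(
    scores: Sequence[float], buckets: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Sum positive and negative per-sample scores separately for each bucket."""
    if len(scores) != len(buckets):
        raise ValueError("scores and buckets must have the same length")
    totals = defaultdict(lambda: {"positive": 0.0, "negative": 0.0})
    for score, bucket in zip(scores, buckets):
        # Zero scores land in "negative" but add nothing to the total.
        key = "positive" if score > 0 else "negative"
        totals[bucket][key] += float(score)
    return dict(totals)


# Made-up per-sample scores for one metric node and the "product" bucket of each sample.
product_scores = [2.0, -0.5, 1.0, -1.5]
product_buckets = ["cars", "cars", "bicycle", "bicycle"]
result = breakdown_by_bucket(product_scores, product_buckets)
```

Whether such a descriptive grouping is meaningful, or whether the buckets need to be modeled explicitly in the causal graph, is part of what I am asking.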
Version information:
- DoWhy version [e.g. 0.13]