As discussed in Merlion's tutorials and example code, evaluation metrics can be computed in a standard fashion following the syntax TSADMetric.<metric_name>.value(ground_truth=ground_truth, predict=anomaly_labels), where anomaly_labels are typically obtained by calling get_anomaly_label() on an already trained model. Applying a PostRule can zero out scores considered non-anomalous while retaining scores that indicate anomalies.
With that said, I have a question about how the number of false positives is calculated, which takes place in

    if any(window):
        t_fp = ts_pred[np.where(window)[0] + j0]
        num_fp += len(t_fp)

where window = ys_pred[j0:jf] for some range j0:jf, and ys_pred = predict.np_values.astype(bool) after some suitable conversions. The code above is executed when window_is_anomaly is False, indicating that the ground truth contains no anomaly (all zeros) in that window. Note that after this code runs, num_fp has been incremented by exactly the number of points in the window with a nonzero score, i.e. this is a pointwise estimate.
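To illustrate the pointwise behaviour, here is a minimal sketch that mirrors the quoted lines on toy data. The wrapper function pointwise_fp and the toy arrays are my own and purely illustrative, not part of Merlion; only the three lines inside the function body follow the snippet above.

```python
import numpy as np

def pointwise_fp(ys_pred, ts_pred, j0, jf):
    """Hypothetical wrapper: count false positives in a window whose
    ground truth is non-anomalous, the way the quoted lines do."""
    window = ys_pred[j0:jf]
    num_fp = 0
    if any(window):
        # Timestamps of every predicted-positive point in the window.
        t_fp = ts_pred[np.where(window)[0] + j0]
        # One false positive per point, not per contiguous run.
        num_fp += len(t_fp)
    return num_fp

# Toy prediction: three contiguous runs of positives, six positive points.
ys_pred = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 1]).astype(bool)
ts_pred = np.arange(len(ys_pred))  # stand-in timestamps

print(pointwise_fp(ys_pred, ts_pred, 0, len(ys_pred)))  # 6
```

On this toy input the count is 6 (one per positive point), even though the predictions form only 3 contiguous segments.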
Shouldn't num_fp be set to the number of segments (sets of contiguous points) rather than the number of individual points with a nonzero score when ScoreType is RevisedPointAdjusted? Quoting the original paper (Hundman et al., 2018):
For all predicted sequences that do not overlap a labeled anomalous region, a false positive is recorded.
See also the recent work by Sehili et al., available at https://arxiv.org/abs/2308.13068, on the so-called event-wise protocol (as opposed to the point-adjusted protocol).
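For concreteness, here is a minimal sketch of the segment-wise count I have in mind for a window whose ground truth is non-anomalous. The function name count_fp_segments and the toy input are hypothetical; this is only my reading of the sentence quoted from Hundman et al., not existing Merlion code.

```python
from itertools import groupby

def count_fp_segments(window):
    """Hypothetical helper: number of maximal runs of consecutive
    predicted-positive points in a non-anomalous window."""
    # groupby collapses consecutive equal values into one group,
    # so each group with key True is one contiguous predicted segment.
    flags = (bool(v) for v in window)
    return sum(1 for is_positive, _ in groupby(flags) if is_positive)

# Toy prediction: six positive points forming three contiguous segments.
window = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]

print(count_fp_segments(window))  # 3
```

Under this counting, the same toy window yields 3 false positives (one per predicted segment) instead of 6 (one per point).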