@izabelatelejko

This PR introduces JSWIN (Jensen-Shannon Windowing), a new univariate drift detection method that uses the Jensen-Shannon divergence to identify distributional changes in data streams. JSWIN is designed to detect both gradual and abrupt drifts by comparing the empirical distributions of the two halves of a sliding window. This implementation was developed as part of a project at Warsaw University of Technology. We hope this contribution will enrich the river library’s drift detection capabilities.

JSWIN (see Algorithm 1) maintains a sliding window Ψ of size n over the univariate data stream.
For every new observation, the window is split into two equal parts P and Q. The empirical
distributions of both halves are computed using fixed-size binning. The empirical Jensen-Shannon
divergence between these distributions is calculated, and if it exceeds a threshold α, a drift is
signaled.
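
The steps described above can be sketched in plain Python. This is only an illustrative sketch, not the code added by this PR: the class name `JSWINSketch`, the helper functions, the defaults for `window_size` and `n_bins`, the use of base-2 logarithms, and the choice to keep only the newer half of the window after a drift are all assumptions made for the example.

```python
import math
from collections import deque


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs (assumption), so it lies in [0, 1]."""
    def kl(a, b):
        # Terms with a[i] == 0 contribute nothing to the KL sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def histogram(values, lo, hi, n_bins):
    """Empirical distribution of `values` over n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins or 1.0  # guard against a constant window
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the right edge
        counts[idx] += 1
    return [c / len(values) for c in counts]


class JSWINSketch:
    """Hypothetical stand-in for the detector described in this PR."""

    def __init__(self, alpha=0.45, window_size=100, n_bins=10):
        self.alpha = alpha
        self.n_bins = n_bins
        self.window = deque(maxlen=window_size)
        self.drift_detected = False

    def update(self, x):
        self.drift_detected = False
        self.window.append(x)
        if len(self.window) < self.window.maxlen:
            return  # wait until the window is full
        values = list(self.window)
        half = len(values) // 2
        older, newer = values[:half], values[half:]
        # Both halves share the same bin edges so the histograms are comparable.
        lo, hi = min(values), max(values)
        p = histogram(older, lo, hi, self.n_bins)
        q = histogram(newer, lo, hi, self.n_bins)
        if js_divergence(p, q) > self.alpha:
            self.drift_detected = True
            # Assumption: drop the older half so the same change is not re-signaled.
            self.window = deque(newer, maxlen=self.window.maxlen)


# Toy stream: values in [0, 0.9] followed by an abrupt shift to [5, 5.9].
stream = [(i % 10) / 10 for i in range(100)] + [5 + (i % 10) / 10 for i in range(100)]
detector = JSWINSketch()
detections = []
for i, x in enumerate(stream):
    detector.update(x)
    if detector.drift_detected:
        detections.append(i)
```

On this toy stream the first detection falls some way after the change point at index 100, because the newer half of the window has to fill with post-change values before the divergence exceeds α.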

[Image: Algorithm 1, JSWIN pseudocode]

To benchmark JSWIN’s performance, we conducted an experiment using the Adaptive Random Forest (ARF) from the river library with default parameters. We compared JSWIN against two other popular drift detectors (KSWIN and ADWIN) using the following configurations:

  • JSWIN: α=0.45 (detection), α=0.3 (warning)
  • KSWIN: α=0.001 (detection), α=0.01 (warning)
  • ADWIN: δ=0.002 (detection), δ=0.02 (warning)

The table below summarizes the average accuracy of ARF models on various datasets. The Hyperplane dataset is sourced from River’s built-in datasets, while Label Shift and Gaussian are synthetic datasets developed by us. Electricity and Airlines are real-world datasets. Overall, JSWIN performs comparably to KSWIN and ADWIN. Notably, on the real-world Airlines dataset, JSWIN outperformed both.

[Image: table of average ARF accuracy per dataset for JSWIN, KSWIN, and ADWIN]
