Add MemStream: Memory-Based Streaming Anomaly Detection
This PR introduces MemStream, a state-of-the-art online anomaly detection framework designed for high-dimensional data streams with concept drift, based on the paper "MemStream: Memory-Based Streaming Anomaly Detection" by Bhatia et al.
What's New
Core Implementation
MemStream (Base Class): Abstract base class providing the core framework for memory-based anomaly detection
MemStreamPCA: Concrete implementation using PCA-based feature encoding
Architecture
The implementation consists of two main components:
Feature Encoder: Transforms high-dimensional inputs into lower-dimensional representations
Memory Module: Maintains a dynamic collection of encoded "normal" data representations
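To make the interaction between the two components concrete, here is a minimal NumPy sketch. It is not the code in this PR: `PCAEncoder` and `memory_score` are hypothetical names, and the scoring rule (a gamma-discounted average of the k smallest L1 distances to the memory) is an assumption about how the memory is queried.

```python
import numpy as np


class PCAEncoder:
    """Hypothetical encoder: projects inputs onto the top principal components."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the principal directions, ordered by explained variance.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.mean_) @ self.components_.T


def memory_score(memory, z, k=5, gamma=0.1):
    """Assumed scoring rule: discounted average of the k smallest L1 distances."""
    dists = np.sort(np.abs(memory - z).sum(axis=1))[:k]
    weights = gamma ** np.arange(len(dists))
    return float((dists * weights).sum() / weights.sum())


rng = np.random.default_rng(0)
warmup = rng.normal(size=(200, 10))

# Feature Encoder: fit on warm-up data, then reduce 10 dimensions to 3.
encoder = PCAEncoder(n_components=3).fit(warmup)

# Memory Module: encoded representations of samples assumed to be normal.
memory = encoder.transform(warmup[:50])

# A sample already in memory versus a point far away along the first
# principal direction.
normal_score = memory_score(memory, encoder.transform(warmup[0]))
outlier = encoder.mean_ + 50.0 * encoder.components_[0]
outlier_score = memory_score(memory, encoder.transform(outlier))
```

The point that lies far from the stored representations receives the larger score, which is the signal a memory-based detector thresholds on.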
Key Features
Parameters
memory_size: Maximum number of encoded normal samples to store (default: 1,000 for PCA variant)
max_threshold: Threshold for accepting samples into memory (default: 0.1)
grace_period: Number of initial samples before scoring begins (default: 5,000)
n_components: Number of PCA components (default: 20); if the requested value is not feasible for PCA, it is adjusted to one that is
k: Number of nearest neighbors for scoring (default: 5)
gamma: Exponential weighting factor (default: 0.1)
replace_strategy: Memory replacement policy (FIFO, LRU, or RANDOM)
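The sketch below illustrates how these parameters could fit together. It is an illustration under stated assumptions, not the PR's implementation: `MemStreamConfig`, `feasible_n_components`, and `Memory` are hypothetical names; the clamping rule for n_components, the strict `score < max_threshold` acceptance test, and the meaning of "used" for LRU (a slot was written or explicitly touched) are all guesses.

```python
import random
from dataclasses import dataclass


@dataclass
class MemStreamConfig:
    """Hypothetical container mirroring the defaults listed above."""
    memory_size: int = 1000
    max_threshold: float = 0.1
    grace_period: int = 5000
    n_components: int = 20
    k: int = 5
    gamma: float = 0.1
    replace_strategy: str = "FIFO"


def feasible_n_components(requested, n_samples, n_features):
    """Assumed clamping rule: PCA yields at most min(n_samples, n_features)
    components, and at least one is needed."""
    return max(1, min(requested, n_samples, n_features))


class Memory:
    """Fixed-capacity store with a pluggable replacement policy (sketch)."""

    def __init__(self, memory_size, replace_strategy="FIFO", seed=0):
        self.memory_size = memory_size
        self.replace_strategy = replace_strategy
        self.items = []
        self.last_used = []  # logical timestamps, consulted only by LRU
        self.clock = 0
        self.next_fifo = 0   # slot to overwrite next under FIFO
        self.rng = random.Random(seed)

    def touch(self, index):
        """Mark a slot as recently used (assumed: it was a nearest neighbor)."""
        self.clock += 1
        self.last_used[index] = self.clock

    def _victim(self):
        if self.replace_strategy == "FIFO":
            slot = self.next_fifo
            self.next_fifo = (self.next_fifo + 1) % self.memory_size
            return slot
        if self.replace_strategy == "LRU":
            return self.last_used.index(min(self.last_used))
        if self.replace_strategy == "RANDOM":
            return self.rng.randrange(self.memory_size)
        raise ValueError(f"unknown replace_strategy: {self.replace_strategy}")

    def maybe_add(self, item, score, max_threshold):
        """Store item only if its anomaly score is below the threshold."""
        if score >= max_threshold:
            return False
        self.clock += 1
        if len(self.items) < self.memory_size:
            self.items.append(item)
            self.last_used.append(self.clock)
        else:
            slot = self._victim()
            self.items[slot] = item
            self.last_used[slot] = self.clock
        return True


cfg = MemStreamConfig(memory_size=3)
mem = Memory(cfg.memory_size, cfg.replace_strategy)
for name in ["a", "b", "c"]:
    mem.maybe_add(name, score=0.01, max_threshold=cfg.max_threshold)

# A high-scoring sample is treated as anomalous and kept out of memory.
rejected = mem.maybe_add("anomaly", score=0.9, max_threshold=cfg.max_threshold)

# A low-scoring sample overwrites the oldest slot under FIFO.
mem.maybe_add("d", score=0.01, max_threshold=cfg.max_threshold)
```

Keeping high-scoring samples out of memory is what lets the store follow concept drift without absorbing anomalies; the replacement policy only decides which existing entry gives way once the memory is full.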