Decoupling Inference from State Updates in Low-Latency Feature Engines via Probabilistic Thinning

Welcome to the reproducibility instructions for Decoupling Inference from State Updates in Low-Latency Feature Engines via Probabilistic Thinning.

Abstract

Streaming data systems increasingly underpin machine learning workflows that maintain large numbers of continuously updated aggregations. In production settings, each incoming event typically triggers read-modify-write operations against persistent storage, making high-frequency state updates a dominant source of latency, contention, and operational cost. In this work, we show how to decouple inference from state persistence in streaming machine learning pipelines via probabilistic thinning: every event is scored, but persistent state updates are triggered only by the most informative events. We demonstrate that such thinning can be implemented without local in-memory control state or coordination, relying exclusively on approximate statistics retrieved from persistent key-value stores. We model the resulting stochastic processes, derive bounds on filtering rates, and show that common time-based aggregations remain unbiased under variance-aware formulations, so they do not accumulate systematic errors. We implement this approach in a real-world transaction monitoring system and demonstrate substantial reductions in storage I/O and serialization overhead, often while improving downstream fraud detection accuracy; in our example, we exclude over 90% of events from the persistence path while consistently outperforming the baseline.
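To make the idea of thinning concrete, here is a minimal sketch of one standard way to keep a running sum unbiased while persisting only a subset of events: inverse-probability weighting. It is an illustration under stated assumptions, not the estimator used in the paper or in this repository. The names `thinned_sum` and `keep_prob`, and the `floor` and `scale` parameters, are hypothetical, and the keep probability here depends only on the event amount rather than on statistics read from a key-value store.

```python
import random


def keep_prob(amount, floor=0.05, scale=1000.0):
    """Hypothetical keep probability: larger events are persisted more often.

    The result is clamped to [floor, 1.0] so that it is never zero.
    """
    return min(1.0, max(floor, abs(amount) / scale))


def thinned_sum(amounts, prob_fn, rng):
    """Estimate sum(amounts) while 'persisting' only a random subset.

    Each event is kept with probability p = prob_fn(amount). A kept event
    contributes amount / p, so its expected contribution is exactly amount
    and the estimate is unbiased. Returns (estimate, number_of_writes).
    """
    estimate = 0.0
    writes = 0
    for amount in amounts:
        p = prob_fn(amount)
        if rng.random() < p:
            estimate += amount / p  # inverse-probability weight
            writes += 1
    return estimate, writes


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic event amounts, for illustration only.
    amounts = [rng.expovariate(1 / 100) for _ in range(10_000)]
    estimate, writes = thinned_sum(amounts, keep_prob, rng)
    print(f"true sum:  {sum(amounts):.1f}")
    print(f"estimate:  {estimate:.1f}")
    print(f"writes:    {writes} of {len(amounts)} events")
```

A single run of the estimate fluctuates around the true sum; the weighting removes bias, not variance. The `floor` parameter bounds the largest weight at `1 / floor`, which trades a higher write rate for lower variance.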

Reproducing system results

Follow the instructions for the server
Follow the instructions for the injector

Reproducing data-science results

Follow the instructions here
