Enhance Validity & Robustness (Metrics, Latency, MSVC Fixes)#19

Open
w0wl0lxd wants to merge 2 commits into kpetridis24:master from w0wl0lxd:master

@w0wl0lxd w0wl0lxd commented Jan 24, 2026

I’ve been spending some time tightening up the simulation validity and fixing a few long-standing headaches with the Windows build environment.

The core addition is a MetricsSink. One of the biggest challenges with this simulator has been validating that the internal LOB state actually reflects the expected microstructure signals. Without built-in spread and imbalance tracking, you're forced to do a lot of heavy lifting in post-processing. This sink now handles that natively in C++ on every update, so these metrics are available for live analysis in Python without the manual overhead.
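For readers unfamiliar with the two signals, here is a minimal Python sketch of what spread and imbalance usually mean at the top of the book. The `TopOfBook` type and the function names are hypothetical and only for illustration; the PR's actual MetricsSink lives in C++ and its exact definitions are not shown in this thread, so treat the formulas below as the common conventions rather than the sink's confirmed behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopOfBook:
    """Best bid/ask snapshot (hypothetical stand-in for the simulator's LOB state)."""
    bid_price: float
    bid_size: int
    ask_price: float
    ask_size: int


def spread(tob: TopOfBook) -> float:
    """Quoted spread: best ask minus best bid."""
    return tob.ask_price - tob.bid_price


def mid_price(tob: TopOfBook) -> float:
    """Midpoint between best bid and best ask."""
    return (tob.bid_price + tob.ask_price) / 2.0


def imbalance(tob: TopOfBook) -> float:
    """Top-of-book imbalance in [-1, 1]; positive means more resting bid size."""
    total = tob.bid_size + tob.ask_size
    if total == 0:
        return 0.0  # assumption: report a neutral value for an empty top of book
    return (tob.bid_size - tob.ask_size) / total


tob = TopOfBook(bid_price=100.00, bid_size=300, ask_price=100.25, ask_size=100)
print(spread(tob), mid_price(tob), imbalance(tob))
```

Computing these on every update is cheap, which is presumably why doing it inside the sink beats reconstructing them from raw events afterwards.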

I also tackled the execution model in MultiBookSimulator. Real-world HFT isn’t deterministic when it comes to network and matching delays, so I’ve introduced a configurable stochastic latency model that supports LogNormal and Uniform distributions. Jitter is now a first-class citizen in the simulation.
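To illustrate the idea (not the actual MultiBookSimulator API, which is not shown here), a per-event latency draw could look roughly like the sketch below. `LatencyConfig`, its field names, and `sample_latency_us` are hypothetical; only the two distribution families come from the PR description.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyConfig:
    """Hypothetical config: which distribution to draw from and its two parameters."""
    kind: str   # "lognormal" or "uniform"
    a: float    # lognormal: mu of the underlying normal; uniform: lower bound (us)
    b: float    # lognormal: sigma of the underlying normal; uniform: upper bound (us)


def sample_latency_us(cfg: LatencyConfig, rng: random.Random) -> float:
    """Draw one latency value in microseconds from the configured distribution."""
    if cfg.kind == "lognormal":
        return rng.lognormvariate(cfg.a, cfg.b)
    if cfg.kind == "uniform":
        return rng.uniform(cfg.a, cfg.b)
    raise ValueError(f"unknown latency distribution: {cfg.kind!r}")


rng = random.Random(42)  # seeded so a simulation run can be reproduced
network = LatencyConfig(kind="lognormal", a=4.0, b=0.5)
matching = LatencyConfig(kind="uniform", a=5.0, b=15.0)
total_us = sample_latency_us(network, rng) + sample_latency_us(matching, rng)
print(f"simulated order latency: {total_us:.1f} us")
```

Passing the random generator in explicitly keeps runs reproducible, which matters when comparing strategies under the same jitter sequence.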

On the infra side:

  • Standardized on uv for Python dependency management.
  • Rewrote the ZLIB dependency logic to use FetchContent. This was the main blocker for MSVC/Windows builds where finding a local ZLIB install is usually a coin flip.
  • Cleaned up some pybind11 memory safety issues around sink lifetimes.

All 123 core tests are green on both Linux and Windows.

This commit consolidates several improvements:

@kpetridis24 kpetridis24 Jan 26, 2026

Owner

That's great. I guess we can extend this idea by creating a semi-realistic event generator that respects the key properties that make an LOB coherent:

  • Synthetic mid price that does a small random walk so that the book doesn't sit at a constant level
  • Realistic order type mix
  • Controlled aggressiveness/immediate executions
  • Price placement relative to mid (creating depth near top + occasional deeper levels)
  • Size distribution (heavy tail-ish)
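A rough Python sketch of how the properties listed above could fit together. Everything here is an assumption for illustration: the `Event` type, the 60/35/5 order mix, the exponential price offset, and the Pareto size parameters are placeholders, not values taken from this repository.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str         # "add", "cancel", or "market"
    side: str         # "buy" or "sell"
    price_ticks: int  # limit price in ticks; for "market" this is the mid when generated
    size: int


def generate_events(n: int, rng: random.Random, start_mid_ticks: int = 10_000) -> list:
    """Generate n synthetic events around a slowly drifting mid price."""
    mid = start_mid_ticks
    events = []
    for _ in range(n):
        # Small random walk so the book does not sit at a constant level.
        mid += rng.choice((-1, 0, 1))
        # Order type mix; the "market" weight controls aggressiveness.
        kind = rng.choices(("add", "cancel", "market"), weights=(0.60, 0.35, 0.05))[0]
        side = rng.choice(("buy", "sell"))
        # Distance from mid: mostly near the top, occasionally deeper levels.
        offset = 1 + int(rng.expovariate(0.5))
        if kind == "market":
            price = mid
        elif side == "buy":
            price = mid - offset
        else:
            price = mid + offset
        # Heavy-tailed size: Pareto variates are >= 1, capped to keep sizes sane.
        size = min(int(rng.paretovariate(1.5) * 10), 10_000)
        events.append(Event(kind, side, price, size))
    return events


for event in generate_events(5, random.Random(0)):
    print(event)
```

Note that cancels here are not tied to live order ids; a coherent generator would need to track resting orders so that cancels reference something that exists.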

The generator should then emit events at a given RATE events/sec. It might also be useful to introduce a mechanism to control the "burstiness" of event generation to further promote realism. Then I guess we can create a generator thread and a processing thread, each pinned to its own core, establish communication via a queue (a basic single-producer/single-consumer architecture), and do proper performance analysis. Not a part of this MR though.
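A minimal Python sketch of the rate/burstiness and producer/consumer idea, under stated assumptions. Gamma-distributed gaps are one possible way to model burstiness (my choice for illustration, not something decided in this thread): the mean gap stays at 1/RATE while the `burstiness` knob sets the squared coefficient of variation, and 1.0 reduces to a Poisson process. Core pinning is left out, since Python threads share the GIL; a real version would presumably be C++ with a lock-free ring buffer. All names are hypothetical.

```python
import queue
import random
import threading

_DONE = object()  # sentinel telling the consumer to stop


def inter_arrival_times(n, rate, burstiness, rng):
    """Gamma gaps with mean 1/rate; burstiness > 1 gives clusters and lulls."""
    shape = 1.0 / burstiness
    scale = burstiness / rate  # mean = shape * scale = 1 / rate
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def producer(q, gaps):
    """Push simulated timestamps; no real sleeping, so the sketch runs instantly."""
    t = 0.0
    for gap in gaps:
        t += gap
        q.put(t)
    q.put(_DONE)


def consumer(q, out):
    """Drain the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is _DONE:
            break
        out.append(item)


def run_pipeline(n, rate, burstiness, seed=0):
    rng = random.Random(seed)
    gaps = inter_arrival_times(n, rate, burstiness, rng)
    q = queue.Queue(maxsize=1024)  # bounded, so a slow consumer applies back-pressure
    out = []
    threads = [
        threading.Thread(target=producer, args=(q, gaps)),
        threading.Thread(target=consumer, args=(q, out)),
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return out


timestamps = run_pipeline(n=10_000, rate=1_000.0, burstiness=4.0)
print(f"consumed {len(timestamps)} events spanning {timestamps[-1]:.2f} simulated seconds")
```

With one producer and one consumer, the queue preserves order, so the consumer sees timestamps in the sequence they were generated.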

kpetridis24 commented Jan 26, 2026

Owner

The format check is failing. Just run ./scripts/format_all.sh before pushing.
The "CI / Build + import Python bindings" job fails too: ModuleNotFoundError: No module named 'numpy'

@kpetridis24

Owner

Why delete the sample data and the gitignore?

@kpetridis24

Owner

@w0wl0lxd
