Commit 33c9243

Add Comprehensive Quant Dev Market Making Guide with 9 modules covering Reg NMS/SHO, ITCH/OUCH/FIX protocols, C++ Optimization, and Strategies
1 parent a9876e8 commit 33c9243

10 files changed

+2162
-0
lines changed

Market Making/Quant_Dev_Market_Making_Guide/01_Regulatory_Framework_NMS_SHO.md

Lines changed: 934 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
# Nasdaq TotalView-ITCH 5.0: Technical Guide for Quantitative Developers

**Protocol Version:** 5.0
**Transport:** UDP/IP multicast (MoldUDP64) / TCP (SoupBinTCP, for recovery and replay)
**Encoding:** Binary (Big-Endian)
**Focus:** Ultra-low latency, full market depth (MBO - Market by Order)

---

## 1. Protocol Overview

Nasdaq TotalView-ITCH is Nasdaq's proprietary direct data feed. It carries full displayed order depth for securities traded on the Nasdaq market, including those listed on other exchanges. Unlike the SIP (consolidated feed), which publishes only top-of-book quotes and trades, ITCH broadcasts every individual order event, so you can build the **full limit order book** (Level 3 data).

### Key Characteristics
* **Direct Feed:** Bypasses the SIP for lower latency (~microsecond scale).
* **Market by Order (MBO):** You see every individual order, not just price levels. This allows for queue position estimation and advanced microstructure signals.
* **Binary Encoding:** Message lengths are fixed per message type, so fields sit at fixed offsets and parse efficiently.
* **Big-Endian:** Network byte order. On x86 (little-endian) systems you must byte-swap every multi-byte integer, e.g. with `ntohs` / `ntohl` for 2- and 4-byte fields; 6- and 8-byte fields need a 64-bit swap.
* **Nanosecond Timestamps:** High-precision timing for event sequencing.

---

## 2. Data Types

| Type | Size | Description |
| :--- | :--- | :--- |
| **Integer** | 2, 4, 6, 8 bytes | Unsigned integers (Big-Endian). |
| **Price (4)** | 4 bytes | Integer. Divide by 10,000 to get the decimal price (fixed point, 4 decimal places). |
| **Price (8)** | 8 bytes | Integer. Divide by 100,000,000 (fixed point, 8 decimal places). In ITCH 5.0 this appears in the MWCB Decline Level message; order and trade prices use Price (4). |
| **Alpha** | Fixed per field | Left-justified ASCII string, right-padded with spaces. |
| **Timestamp** | 6 bytes | Nanoseconds past midnight (ET). |

**Note on Timestamps:** ITCH 5.0 carries the full 6-byte nanoseconds-past-midnight value in every message. Earlier versions (e.g., ITCH 4.1) split the time into a separate seconds message plus a per-message nanosecond offset, so parsing code written for those feeds does not carry over unchanged.
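
The table above can be exercised with a few small decoders. This is a minimal sketch in Python (used here for brevity; the same arithmetic applies in a C++ parser). The function names are illustrative assumptions, not part of any Nasdaq library.

```python
from decimal import Decimal


def decode_uint(raw: bytes) -> int:
    """Decode a big-endian unsigned integer of any width (2, 4, 6 or 8 bytes)."""
    return int.from_bytes(raw, byteorder="big", signed=False)


def decode_price4(raw: bytes) -> Decimal:
    """Price (4): a 4-byte integer with four implied decimal places."""
    return Decimal(decode_uint(raw)) / Decimal(10_000)


def decode_alpha(raw: bytes) -> str:
    """Alpha: left-justified ASCII, right-padded with spaces."""
    return raw.decode("ascii").rstrip(" ")


def decode_timestamp(raw: bytes) -> str:
    """Timestamp: 6-byte nanoseconds past midnight, rendered as HH:MM:SS.nnnnnnnnn."""
    nanos = decode_uint(raw)
    seconds, frac = divmod(nanos, 1_000_000_000)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{frac:09d}"
```

Keeping prices as integers (or `Decimal`) rather than binary floats avoids rounding surprises when comparing price levels; in a hot path you would normally keep the raw integer and only convert for display.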
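
---

## 3. Message Header Structure

On the wire, ITCH messages are framed by a transport protocol (MoldUDP64 for multicast), which prefixes each message with a 2-byte big-endian length. At the application payload level, every ITCH 5.0 message begins with the same four fields, and messages are distinguished by a **1-byte Message Type**.

| Offset | Length | Name | Type | Description |
| :--- | :--- | :--- | :--- | :--- |
| 0 | 1 | Message Type | Alpha | Identifies the message (e.g., 'A' for Add Order). |
| 1 | 2 | Stock Locate | Integer | Per-day symbol ID (0 for messages not tied to a symbol, such as System Event). |
| 3 | 2 | Tracking Number | Integer | Nasdaq internal tracking number. |
| 5 | 6 | Timestamp | Integer | Nanoseconds past midnight. |
| 11 | Variable | Payload | Mix | Message-specific data. |

*Note: In a raw UDP capture (MoldUDP64), each packet starts with a 20-byte header (10-byte session, 8-byte sequence number, 2-byte message count) before the message stream.*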
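
To make the framing concrete, here is a hedged sketch of splitting one MoldUDP64 packet into its ITCH messages. It assumes the standard MoldUDP64 layout (10-byte session, 8-byte sequence number, 2-byte message count, then length-prefixed message blocks); `split_mold_packet` is a hypothetical helper name.

```python
import struct

# MoldUDP64 downstream packet header: session (10 bytes), sequence number (8), message count (2).
MOLD_HEADER = struct.Struct(">10sQH")
END_OF_SESSION = 0xFFFF  # message count value that signals end of session


def split_mold_packet(packet: bytes):
    """Return (session, first_sequence_number, [message_bytes, ...]) for one UDP payload."""
    session, sequence, count = MOLD_HEADER.unpack_from(packet, 0)
    session = session.decode("ascii").rstrip(" ")
    if count == END_OF_SESSION:
        return session, sequence, []
    messages = []
    offset = MOLD_HEADER.size
    for _ in range(count):  # count == 0 is a heartbeat and yields no messages
        (length,) = struct.unpack_from(">H", packet, offset)
        offset += 2
        messages.append(packet[offset:offset + length])
        offset += length
    return session, sequence, messages
```

The sequence number in the header belongs to the first message in the packet, so the next expected sequence number is `sequence + len(messages)`. Comparing that against the next packet's header is the usual way to detect gaps.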

---

## 4. Critical Message Types

### 4.1 System Event Message ('S')
Indicates the state of the matching engine.
* **'O':** Start of Messages.
* **'S':** Start of System Hours.
* **'Q':** Start of Market Hours (9:30 AM ET) – Open.
* **'M':** End of Market Hours (4:00 PM ET) – Close.
* **'E':** End of System Hours.
* **'C':** End of Messages.

**Quant Logic:** Use 'Q' and 'M' to switch strategies between trading and maintenance mode. Use 'S' to reset books.

### 4.2 Stock Directory Message ('R')
Sent at the start of the day for every tradable symbol. **Crucial for mapping.**
* **Stock Locate (2 bytes):** An integer ID assigned to the symbol for the day. The assignment is only valid for that day, so rebuild the mapping every session.
* **Stock (8 bytes):** The ticker symbol, right-padded with spaces (e.g., `"AAPL    "`).
* **Market Category:** The listing market (NYSE, Nasdaq, NYSE American, etc.).

**Optimization Tip:** Do **not** use string comparisons ("AAPL") in your hot path. Map "AAPL" to its `Stock Locate` ID (e.g., 1234) at startup. Every subsequent message carries the `Stock Locate` right after the Message Type, which allows O(1) array indexing for book lookups.
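
A minimal sketch of that mapping, assuming the ITCH 5.0 Stock Directory layout in which the 8-byte Stock field follows the Message Type, Stock Locate, Tracking Number and Timestamp. `SymbolTable` and `DIRECTORY_PREFIX` are illustrative names.

```python
import struct

# First 20 bytes of a Stock Directory ('R') message:
# type(1) locate(2) tracking(2) timestamp(6) stock(8) market_category(1)
DIRECTORY_PREFIX = struct.Struct(">cHH6s8sc")


class SymbolTable:
    """Two-way symbol <-> Stock Locate mapping, rebuilt every trading day."""

    def __init__(self):
        # Stock Locate is a 16-bit field, so a flat list covers every possible ID.
        self.symbol_by_locate = [None] * 65536
        self.locate_by_symbol = {}

    def on_stock_directory(self, msg: bytes) -> None:
        msg_type, locate, _tracking, _ts, stock, _category = DIRECTORY_PREFIX.unpack_from(msg, 0)
        if msg_type != b"R":
            raise ValueError("not a Stock Directory message")
        symbol = stock.decode("ascii").rstrip(" ")
        self.symbol_by_locate[locate] = symbol
        self.locate_by_symbol[symbol] = locate
```

String handling happens once per symbol at the start of the day. After that, the hot path only touches the integer locate code.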

### 4.3 Trading Action Message ('H')
Indicates halts and pauses.
* **Trading State:**
    * 'H': Halted (regulatory, news pending, etc.).
    * 'P': Paused (LULD volatility pause).
    * 'Q': Quotation only period.
    * 'T': Trading (also sent when trading resumes).

**Risk Check:** Your system **must** immediately block proprietary orders if it receives an 'H' or 'P' for a symbol you trade.
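
One way to wire the System Event and Trading Action messages into a pre-trade gate is sketched below. The class and method names are hypothetical, and the policy of refusing to quote until a 'T' state has been seen for the symbol is a deliberately conservative assumption rather than something the protocol mandates.

```python
class TradingGate:
    """Tracks session state ('Q'/'M' events) and per-symbol trading state ('H'/'P'/'T')."""

    def __init__(self):
        self.market_open = False
        self.state_by_locate = {}  # Stock Locate -> latest Trading State code

    def on_system_event(self, event_code: str) -> None:
        if event_code == "Q":      # Start of Market Hours
            self.market_open = True
        elif event_code == "M":    # End of Market Hours
            self.market_open = False

    def on_trading_action(self, locate: int, trading_state: str) -> None:
        self.state_by_locate[locate] = trading_state

    def can_quote(self, locate: int) -> bool:
        # Unknown or non-'T' states block quoting; only an explicit 'T' allows it.
        return self.market_open and self.state_by_locate.get(locate) == "T"
```

Checking `can_quote` on every outbound order keeps the halt logic in one place instead of scattering state checks across strategies.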

### 4.4 Add Order Message ('A' and 'F')
Adds a new visible order to the book.
* **'A':** No MPID attribution (Anonymous).
* **'F':** With MPID attribution (e.g., "GSCO"). Identical to 'A' plus a trailing 4-byte Attribution field.

**Fields (in wire order, after Message Type, Stock Locate, Tracking Number and Timestamp):**
* **Order Reference Number (8 bytes):** Unique ID for this order. **Key for tracking.**
* **Buy/Sell Indicator (1 byte):** 'B' or 'S'.
* **Shares (4 bytes):** Quantity.
* **Stock (8 bytes):** Ticker symbol, right-padded with spaces. The 2-byte `Stock Locate` symbol ID is not in this position; it sits at the start of every message, right after the Message Type.
* **Price (4 bytes):** Limit price in Price (4) format.

**Book Logic:** Insert node `(Price, Time, OrderRef)` into your Order Book data structure.
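
The field order above pins down a 36-byte layout for 'A'. The following sketch decodes it with Python's `struct` module; `ADD_ORDER`, `AddOrder` and `parse_add_order` are illustrative names, and the format string assumes the ITCH 5.0 layout described in this section.

```python
import struct
from typing import NamedTuple

# type(1) locate(2) tracking(2) timestamp(6) ref(8) side(1) shares(4) stock(8) price(4) = 36 bytes
ADD_ORDER = struct.Struct(">cHH6sQcI8sI")


class AddOrder(NamedTuple):
    stock_locate: int
    timestamp_ns: int
    order_ref: int
    side: str
    shares: int
    stock: str
    price_4: int  # price * 10,000


def parse_add_order(msg: bytes) -> AddOrder:
    """Decode an 'A' message, or the first 36 bytes of an 'F' message."""
    msg_type, locate, _tracking, ts, ref, side, shares, stock, price = ADD_ORDER.unpack_from(msg, 0)
    if msg_type not in (b"A", b"F"):
        raise ValueError("not an Add Order message")
    return AddOrder(locate, int.from_bytes(ts, "big"), ref, side.decode("ascii"),
                    shares, stock.decode("ascii").rstrip(" "), price)
```

The `>` prefix in the format string selects big-endian with no padding, which is what makes the computed size match the wire size exactly.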

### 4.5 Order Executed Message ('E')
An order on the book was executed (fully or partially).
* **Order Reference Number:** Matches the original 'A' or 'F' message (or the new reference assigned by a 'U').
* **Executed Shares:** Number of shares traded.
* **Match Number:** Unique Trade ID.

**Book Logic:** Find the order by `Order Reference Number`. Decrement its size by `Executed Shares`. If size becomes 0, remove it.
**Note:** The price is implied from the original add message.

### 4.6 Order Executed with Price Message ('C')
An order on the book was executed at a price **different** from its display price (e.g., price improvement/slippage due to cross).
* **Execution Price:** The actual trade price.
* **Printable:** 'Y' or 'N' (whether the execution should count toward time-and-sales and volume).

**Book Logic:** Same as 'E' (reduce size). Record the trade at the *Execution Price* for volume/signal analysis only when Printable is 'Y'. Non-printable executions are reported again in bulk (e.g., by a Cross Trade message), so counting them would double count volume.
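
The difference between 'E' and 'C' matters mostly for the trade tape. A small sketch of that rule follows; the function name and the tuple it returns are assumptions made for illustration.

```python
def trade_from_execution(msg_type, resting_price_4, executed_shares,
                         execution_price_4=None, printable="Y"):
    """Return (price_4, shares, counts_toward_volume) for an 'E' or 'C' message.

    resting_price_4 is the price stored when the order was added to the book.
    """
    if msg_type == "E":
        # 'E' carries no price: the trade happened at the resting order's price.
        return resting_price_4, executed_shares, True
    if msg_type == "C":
        # 'C' carries its own price and a flag saying whether to count the volume.
        return execution_price_4, executed_shares, printable == "Y"
    raise ValueError(f"unexpected message type: {msg_type!r}")
```

In both cases the book size is reduced by `executed_shares`; only the volume accounting differs.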

### 4.7 Order Cancel ('X') vs. Order Delete ('D')
* **Order Cancel ('X'):** Partial reduction.
    * **Canceled Shares:** Amount to remove.
    * **Logic:** Decrement size. If 0, remove.
* **Order Delete ('D'):** Full removal.
    * **Logic:** Immediately remove order from book.

### 4.8 Order Replace Message ('U')
Efficiency mechanism. Replaces an existing order with a new one (new Order Reference Number, potentially new size/price). Side and symbol are not repeated in the message; they carry over from the original order.
* **Original Order Reference Number:** Old order.
* **New Order Reference Number:** New order.
* **Shares:** New quantity.
* **Price:** New price.

**Book Logic:** Atomic "Delete Old" + "Add New".
**Important:** The new order loses queue priority (new timestamp).

### 4.9 Net Order Imbalance Indicator ('I') - NOII
Broadcast ahead of the opening cross (9:28-9:30 AM ET) and the closing cross (3:50-4:00 PM ET) to indicate auction state. Dissemination windows have changed over time, so confirm them against the current Nasdaq schedule.
* **Paired Shares:** Shares that can match at current Reference Price.
* **Imbalance Shares:** Excess buy/sell interest.
* **Imbalance Direction:** 'B' (Buy side imbalance), 'S' (Sell side), 'N' (No imbalance), 'O' (insufficient orders to calculate).
* **Far Price / Near Price / Current Reference Price:** Auction price scenarios.

**Quant Strategy:** NOII data is the primary signal for "Opening" and "Closing" auction arbitrage strategies (MOO/LOO and MOC/LOC orders).

---

## 5. Building the Order Book (Developer Logic)

To reconstitute the limit order book from ITCH:

1. **Initialize:** Create an array of OrderBooks, indexed by `Stock Locate ID`.
    ```cpp
    // Stock Locate is a 16-bit field, so 65,536 slots cover every possible ID.
    std::vector<OrderBook> all_books(MAX_STOCK_LOCATE_ID);
    ```

2. **Mapping:** On 'R' (Directory), map Symbol String -> ID.
    ```cpp
    // In practice, key on the space-trimmed Stock field of each 'R' message.
    map_symbol_to_id["AAPL"] = msg.stock_locate;
    ```

3. **State Management:**
    * **Map:** `std::unordered_map<uint64_t, Order*> order_map;` (Order Ref -> Order Node).
    * **Tree/List:** A sorted structure (e.g., Red-Black Tree or Skip List) for Price Levels.

4. **Processing Loop:**
    * **'A'/'F' (Add):**
        * Create `Order` object.
        * Add to `order_map[ref_num]`.
        * Insert into Price Level in `OrderBook`.
    * **'E'/'C'/'X' (Modify):**
        * Lookup `Order*` in `order_map[ref_num]`.
        * Reduce size by the executed or canceled shares.
        * If size == 0, delete from Map and Price Level.
    * **'D' (Delete):**
        * Lookup and remove immediately.
    * **'U' (Replace):**
        * Treat as Delete(Old) + Add(New).
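
The processing loop above can be prototyped in a few dozen lines before committing to a C++ design. This is a sketch only: `BookBuilder` and its method names are hypothetical, and plain dictionaries stand in for the sorted price-level structure a production book would use.

```python
from dataclasses import dataclass


@dataclass
class Order:
    ref: int
    locate: int
    side: str      # 'B' or 'S'
    price: int     # Price (4) integer
    shares: int


class BookBuilder:
    def __init__(self):
        self.orders = {}   # Order Ref -> Order
        self.levels = {}   # (locate, side, price) -> {ref: Order}, kept in arrival order

    def add(self, ref, locate, side, price, shares):            # 'A' / 'F'
        order = Order(ref, locate, side, price, shares)
        self.orders[ref] = order
        self.levels.setdefault((locate, side, price), {})[ref] = order

    def reduce(self, ref, shares):                               # 'E' / 'C' / 'X'
        order = self.orders[ref]
        order.shares -= shares
        if order.shares <= 0:
            self.delete(ref)

    def delete(self, ref):                                       # 'D'
        order = self.orders.pop(ref)
        key = (order.locate, order.side, order.price)
        del self.levels[key][ref]
        if not self.levels[key]:
            del self.levels[key]

    def replace(self, old_ref, new_ref, price, shares):          # 'U'
        old = self.orders[old_ref]
        locate, side = old.locate, old.side   # side and symbol carry over
        self.delete(old_ref)
        self.add(new_ref, locate, side, price, shares)           # joins the back of the queue

    def best(self, locate, side):
        prices = [p for (l, s, p) in self.levels if l == locate and s == side]
        if not prices:
            return None
        return max(prices) if side == "B" else min(prices)

    def queue(self, locate, side, price):
        """Order refs at one price level, front of queue first."""
        return list(self.levels.get((locate, side, price), {}))
```

Because Python dictionaries preserve insertion order, each level doubles as a time-priority queue. `best` scans all levels and is O(n), which is fine for a prototype but is exactly what the sorted structure in step 3 replaces.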
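
### Code Snippet: Parsing (C++)

```cpp
#include <cstddef>
#include <cstdint>

// Framing prefix from the transport layer (MoldUDP64 / SoupBinTCP):
// a 2-byte big-endian length, followed by the message, whose first byte is its type.
struct ItchHeader {
    uint16_t length;
    char msg_type;
};

// Add Order ('A') decoded into host byte order. Wire layout (36 bytes):
// type(1) locate(2) tracking(2) timestamp(6) ref(8) side(1) shares(4) stock(8) price(4)
struct AddOrderMsg {
    uint16_t stock_locate;
    uint64_t timestamp_nanos; // 6 bytes on the wire
    uint64_t order_ref;
    char buy_sell_flag;       // 'B' or 'S'
    uint32_t shares;
    uint32_t price_4;         // price * 10,000
};

// Read an n-byte big-endian unsigned integer; safe for unaligned data on any host.
static inline uint64_t read_be(const unsigned char* p, std::size_t n) {
    uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v = (v << 8) | p[i];
    return v;
}

// `buffer` points at the first byte (Message Type) of one ITCH message.
void process_packet(const char* buffer) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(buffer);
    char msg_type = buffer[0];
    switch (msg_type) {
        case 'A': {
            AddOrderMsg m;
            m.stock_locate    = static_cast<uint16_t>(read_be(p + 1, 2));
            m.timestamp_nanos = read_be(p + 5, 6);
            m.order_ref       = read_be(p + 11, 8);
            m.buy_sell_flag   = buffer[19];
            m.shares          = static_cast<uint32_t>(read_be(p + 20, 4));
            m.price_4         = static_cast<uint32_t>(read_be(p + 32, 4)); // stock[8] sits at 24..31
            // update_book(m.stock_locate, m.buy_sell_flag, m.price_4, m.shares, m.order_ref);
            (void)m;
            break;
        }
        case 'E': {
            // handle execution
            break;
        }
        // ... handle others
    }
}
```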

## 6. Performance Considerations
1. **Zero-Copy:** Do not copy whole packet buffers into intermediate structs. Read fields in place: cast the pointer (only if alignment and aliasing rules allow) or `memcpy` individual fields, which compilers typically reduce to a single load and which is safe for unaligned access.
2. **Pre-allocation:** Pre-allocate Order nodes in a memory pool to avoid `new`/`delete` latency spikes during high message rates.
3. **Instruction Cache:** Keep the 'Add', 'Exec', 'Cancel' handlers hot in cache. These make up the vast majority of traffic (roughly 95%).
Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
# Quantitative Models & Strategies in Market Making

**Focus:** Mathematical frameworks for quoting, inventory management, and alpha signals.
**Prerequisites:** Probability theory, stochastic calculus basics, convex optimization.

---

## 1. The Market Maker's Problem

The fundamental goal is to maximize the expected utility of terminal wealth while managing inventory risk. This is often modeled as a stochastic optimal control problem.

### 1.1 The Avellaneda-Stoikov Model (2008)
This is the seminal paper for high-frequency market making.

**Core Assumptions:**
* **Mid-price ($S_t$)** follows a driftless arithmetic Brownian motion:
    $$dS_t = \sigma dW_t$$
* **Arrival Rates ($\lambda$)**: Fills of a limit order arrive as a Poisson process whose intensity decays exponentially with the quote's distance ($\delta$) from the mid-price:
    $$\lambda(\delta) = A e^{-k\delta}$$
    Where:
    * $\delta$: Distance of the quote from the mid-price (one side only, not the full bid-ask spread)
    * $A$: Base arrival intensity
    * $k$: Decay rate of fill intensity with distance (higher $k$ = fills drop off faster as you quote wider, which pushes the optimal spread tighter)
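
As a quick numerical check of the arrival-rate assumption, the sketch below evaluates $\lambda(\delta) = A e^{-k\delta}$ and the resulting probability of at least one fill in a short interval. The function names and the choice of time unit are assumptions for illustration.

```python
import math


def fill_intensity(delta: float, A: float, k: float) -> float:
    """Poisson fill intensity lambda(delta) = A * exp(-k * delta)."""
    return A * math.exp(-k * delta)


def fill_probability(delta: float, A: float, k: float, dt: float) -> float:
    """P(at least one fill within dt) for a Poisson process with rate lambda(delta)."""
    return 1.0 - math.exp(-fill_intensity(delta, A, k) * dt)
```

In practice $A$ and $k$ are fitted from historical fills at different distances from the mid-price, for example by regressing the log of observed fill rates on $\delta$.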

**The Objective Function:**
Maximize expected exponential utility of terminal wealth:
$$u(w) = -e^{-\gamma w}$$
Where $\gamma$ is the risk aversion parameter.

**The Solution (Optimal Quotes):**
The optimal bid ($r_b^*$) and ask ($r_a^*$) prices sit symmetrically around the reservation price $S_t - q \gamma \sigma^2 (T-t)$, half the optimal spread $\delta^*$ on each side:

$$r_a^* = S_t - q \gamma \sigma^2 (T-t) + \frac{1}{2}\delta^*$$
$$r_b^* = S_t - q \gamma \sigma^2 (T-t) - \frac{1}{2}\delta^*$$

**Key Interpretations:**
1. **Reservation Price ($r^*$)**: The price at which the MM is indifferent between buying and selling.
    $$r^*(S, q, t) = S_t - q \gamma \sigma^2 (T-t)$$
    * If inventory $q > 0$ (Long): Reservation price shifts *down*. You are eager to sell, reluctant to buy.
    * If inventory $q < 0$ (Short): Reservation price shifts *up*.
    * **Term:** $q \gamma \sigma^2 (T-t)$ is the inventory risk premium.

2. **Optimal Spread ($\delta^*$)**: The total distance between the optimal ask and the optimal bid.
    $$\delta^* = \gamma \sigma^2 (T-t) + \frac{2}{\gamma} \ln\left(1 + \frac{\gamma}{k}\right)$$
    * The spread is independent of inventory in the standard model.
    * It depends on risk aversion ($\gamma$), volatility ($\sigma$), time remaining ($T-t$), and market liquidity ($k$).

---

## 2. Inventory Management Strategies

Managing $q$ (inventory) is the single most critical risk task.

### 2.1 Skewing (Asymmetric Quoting)
As derived above, quotes should be centered around the Reservation Price, not the Mid-Price.

**Logic:**
* **Current Inventory:** +1000 shares (Long Limit Reached).
* **Action:**
    * **Ask:** Aggressive. $S_t$ or even $S_t - \epsilon$. We *need* to sell.
    * **Bid:** Passive. $S_t - 5\ \text{ticks}$. We do *not* want to buy more.
* **Effect:** Increases probability of Ask fill, decreases probability of Bid fill.

### 2.2 Damping Factor
In practice, a linear inventory penalty grows without bound and can make quotes jump around as inventory changes. Firms often use a bounded sigmoid or cubic dampener, for example:
$$\text{Skew} = \alpha \cdot \tanh\left(\beta \cdot \frac{q}{Q_{max}}\right)$$
Here $\alpha$ is the maximum skew (in price units), $\beta$ controls how quickly the skew saturates, and $Q_{max}$ is the position limit.
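
A small sketch of the damped skew follows. The convention that the skew is subtracted from the mid-price (so a long position moves quotes down) is an assumption chosen to match the reservation-price logic in Section 1; the function names are illustrative.

```python
import math


def damped_skew(q: float, q_max: float, alpha: float, beta: float) -> float:
    """Skew = alpha * tanh(beta * q / q_max); bounded in [-alpha, +alpha]."""
    return alpha * math.tanh(beta * q / q_max)


def skewed_center(mid_price: float, q: float, q_max: float, alpha: float, beta: float) -> float:
    """Quote center after skewing: long inventory (q > 0) shifts it below the mid-price."""
    return mid_price - damped_skew(q, q_max, alpha, beta)
```

For small inventories the skew is approximately linear with slope $\alpha\beta / Q_{max}$, and it flattens out near $\pm\alpha$ as $|q|$ approaches and exceeds $Q_{max}$.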

### 2.3 Position Limits & Liquidation
* **Soft Limit:** Begin skewing aggressively.
* **Hard Limit:** Block new opening orders.
* **Liquidation Limit:** Send marketable Immediate-or-Cancel (IOC) orders to dump inventory if it exceeds critical thresholds (Stop Loss).

---

## 3. Alpha Signals (Short-Term Predictors)

Market making is not just passive. You need "Micro-Alpha" to avoid adverse selection (Toxic Flow).

### 3.1 Order Book Imbalance (OBI)
The normalized difference between volume at the Best Bid ($V_b$) and the Best Ask ($V_a$), ranging from $-1$ to $+1$.

$$OBI = \frac{V_b - V_a}{V_b + V_a}$$

* **Logic:** If $OBI \to +1$ (Huge Bid, Tiny Ask), price is likely to tick up.
* **Action:** Skew quotes upward. Lean on the bid.
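
The OBI formula translates directly into code. The only design decision in this sketch is what to return for an empty top of book; returning 0.0 (no signal) is an assumption.

```python
def order_book_imbalance(bid_volume: float, ask_volume: float) -> float:
    """OBI = (V_b - V_a) / (V_b + V_a), in [-1, +1]; 0.0 when both sides are empty."""
    total = bid_volume + ask_volume
    if total == 0:
        return 0.0
    return (bid_volume - ask_volume) / total
```

For example, 900 shares on the bid against 100 on the ask gives an OBI of 0.8, a strong lean toward an uptick.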

### 3.2 Order Flow Toxicity (VPIN)
Volume-Synchronized Probability of Informed Trading.
* Measures order flow imbalance relative to volume, computed over equal-volume buckets rather than clock time.
* High VPIN $\implies$ High probability of informed trader presence.
* **Action:** Widen spreads.
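
A simplified VPIN calculation is sketched below. It assumes trades arrive already classified as buyer- or seller-initiated with integer share counts, which is a simplification: the original VPIN methodology classifies volume statistically instead. The function name and argument names are illustrative.

```python
def vpin(trades, bucket_volume: int, n_buckets: int):
    """Average |buy - sell| / bucket_volume over the last n_buckets complete volume buckets.

    trades: iterable of (side, shares) with side 'B' (buyer-initiated) or 'S'.
    Returns None until at least one bucket has filled.
    """
    imbalances = []
    buy = sell = 0
    for side, shares in trades:
        remaining = shares
        while remaining > 0:
            room = bucket_volume - (buy + sell)
            take = min(room, remaining)      # a large trade can span several buckets
            if side == "B":
                buy += take
            else:
                sell += take
            remaining -= take
            if buy + sell == bucket_volume:  # bucket complete: record and reset
                imbalances.append(abs(buy - sell))
                buy = sell = 0
    recent = imbalances[-n_buckets:]
    if not recent:
        return None
    return sum(recent) / (len(recent) * bucket_volume)
```

Values near 0 indicate balanced two-way flow; values near 1 indicate one-sided flow, which is the condition under which widening spreads is suggested above.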

### 3.3 Cross-Asset Correlations (Lead-Lag)
* **Scenario:** SPY (S&P 500 ETF) is among the most liquid instruments and often moves first.
* **Signal:** SPY ticks up.
* **Effect:** Index constituents (e.g., AAPL, MSFT) will likely tick up shortly afterward, on the order of 5-50ms; the lag varies by name and venue.
* **Action:** Immediately cancel/re-price AAPL asks upwards before latency arbitrageurs lift them.

---

## 4. Backtesting Market Making Strategies

Backtesting limit order strategies is notoriously difficult because fills must be simulated rather than observed ("fill simulation").

### 4.1 Assumptions & Pitfalls
1. **Queue Position:** You cannot assume you are at the front of the queue.
    * *Conservative:* Assume you join the back of the queue and are filled only after all volume ahead of you at that price has traded (FIFO, per price-time priority).
    * *Realistic:* Estimate queue depletion from the trades and cancels that occur ahead of you.
2. **Market Impact:** Your orders affect the market.
    * If you post a huge bid, others might step ahead of you by a tick ("Pennying").
3. **Latency:** You see a price $S_t$, but by the time your order reaches the exchange, the price is $S_{t+\Delta}$.
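
The conservative queue assumption can be sketched as a small tracker for one simulated order. `QueueTracker` is a hypothetical name; the model assumes price-time priority and that, with MBO data, you can tell which cancels were ahead of your order.

```python
class QueueTracker:
    """Tracks one simulated resting order that joined the back of a price level."""

    def __init__(self, volume_ahead: int, my_size: int):
        self.volume_ahead = volume_ahead   # shares resting ahead of us when we joined
        self.my_remaining = my_size

    def on_trade_at_my_price(self, traded: int) -> int:
        """Consume the queue ahead first; return the shares of our order that fill."""
        from_ahead = min(self.volume_ahead, traded)
        self.volume_ahead -= from_ahead
        fill = min(self.my_remaining, traded - from_ahead)
        self.my_remaining -= fill
        return fill

    def on_cancel_ahead(self, shares: int) -> None:
        """A cancel of an order known to be ahead of us moves us forward."""
        self.volume_ahead = max(0, self.volume_ahead - shares)
```

Cancels of orders behind you are simply ignored, since they do not change your position. If your data cannot distinguish them, ignoring all cancels is the most conservative choice.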

### 4.2 Simulator Design
* **Input:** Tick-by-tick data (MBO preferred).
* **State:** Reconstructed Order Book.
* **Matching Engine:** Replicates exchange matching logic (Price-Time Priority).
* **Latency Model:** Adds random jitter + constant delay to all order actions.
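
The latency model in the last bullet might look like the sketch below. The exponential distribution for jitter and the class and parameter names are assumptions; a real simulator would calibrate the distribution from measured round-trip times.

```python
import random


class LatencyModel:
    """Constant delay plus exponentially distributed jitter, in nanoseconds."""

    def __init__(self, constant_delay_ns: int, mean_jitter_ns: float, seed: int = 0):
        self.constant_delay_ns = constant_delay_ns
        self.mean_jitter_ns = mean_jitter_ns
        self.rng = random.Random(seed)  # seeded so that backtests are reproducible

    def arrival_time(self, send_time_ns: int) -> int:
        """Time at which an order action sent at send_time_ns reaches the matching engine."""
        jitter = 0.0
        if self.mean_jitter_ns > 0:
            jitter = self.rng.expovariate(1.0 / self.mean_jitter_ns)
        return send_time_ns + self.constant_delay_ns + int(jitter)
```

The simulator then applies each new order, cancel or replace to the reconstructed book at `arrival_time(...)` rather than at the moment the strategy decided to send it.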

---

## 5. Practical Exercise: Python Prototype

```python
import numpy as np

def calculate_optimal_quotes(mid_price, inventory, volatility, risk_aversion, time_horizon, k=1.5):
    """
    Avellaneda-Stoikov Reservation Price & Spread Calculation

    volatility is in price units per sqrt(time unit); time_horizon is (T - t).
    k is the fill-intensity decay parameter; 1.5 is an example value that
    would normally be fitted from trade intensity.
    """
    # 1. Reservation Price
    # r* = s - q * gamma * sigma^2 * (T - t)
    reservation_price = mid_price - (inventory * risk_aversion * (volatility**2) * time_horizon)

    # 2. Optimal Total Spread
    # delta* = gamma * sigma^2 * (T - t) + (2 / gamma) * ln(1 + gamma / k)
    spread = (risk_aversion * (volatility**2) * time_horizon
              + (2 / risk_aversion) * np.log(1 + (risk_aversion / k)))

    half_spread = spread / 2

    optimal_bid = reservation_price - half_spread
    optimal_ask = reservation_price + half_spread

    return optimal_bid, optimal_ask

# Example
current_price = 100.00
current_inventory = 500 # Long 500 shares
sigma = 2.0 # Daily vol, in dollars
gamma = 0.0001 # Risk aversion, scaled so q * gamma * sigma^2 * T stays small relative to price
T = 1.0 # End of day (normalized)

bid, ask = calculate_optimal_quotes(current_price, current_inventory, sigma, gamma, T)

print(f"Mid Price: {current_price}")
print(f"Inventory: {current_inventory}")
print(f"Optimal Bid: {bid:.2f}")
print(f"Optimal Ask: {ask:.2f}")
print(f"Skew: {(bid+ask)/2 - current_price:.4f}")
# Expect negative skew (lower prices) to shed long inventory
```
