You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/en/advanced/slime-router.md
+45Lines changed: 45 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -91,3 +91,48 @@ For more details on SGLang Model Gateway, see the [official documentation](https
91
91
- Use SlimeRouter when you need R3 or radix-tree caching
92
92
- Use SGLang Model Gateway for everything else (recommended default)
93
93
94
+
---
95
+
96
+
## 4. Session-Affinity Routing for Multi-Turn Agents
97
+
98
+
When using SGLang Model Gateway with consistent hashing routing policy, Slime automatically assigns each rollout session a unique session ID and uses it as the routing key to enable session affinity.
99
+
100
+
### What is session affinity?
101
+
102
+
Session affinity (also called sticky sessions) ensures that all requests belonging to the same conversation or agent session are routed to the same backend worker. This is beneficial for:
103
+
104
+
-**Multi-turn dialogues**: Keeping the same worker improves prefix cache hit rates
105
+
-**Multi-agent systems**: Ensures agent state consistency and better resource locality
106
+
-**Debugging**: Makes it easier to trace and debug specific sessions
107
+
108
+
### How it works
109
+
110
+
When the rollout system generates samples, each sample is assigned a unique `session_id`. This ID is:
111
+
112
+
1. Automatically generated using UUID for each sample
113
+
2. Stored in `sample.session_id` field
114
+
3. Passed as `X-SMG-Routing-Key` header when the router policy is `consistent_hashing`
115
+
116
+
The SGLang Model Gateway's consistent hashing policy then uses this routing key to deterministically select the same worker for all requests with the same session ID.
117
+
118
+
### Configuration
119
+
120
+
To enable session-affinity routing, simply configure the router policy in Slime:
121
+
122
+
```bash
123
+
--sglang-router-policy consistent_hashing
124
+
```
125
+
126
+
Slime will automatically start SGLang Model Gateway with the consistent hashing policy.
127
+
128
+
> **Note**: If you encounter an error about the `consistent_hashing` policy not being available, upgrade sglang-router:
129
+
> ```bash
130
+
> pip install -U sglang-router
131
+
>```
132
+
133
+
### Notes
134
+
135
+
- Each sample gets its own unique session ID
136
+
- Different samples in the same group may be routed to different workers
137
+
- The same sample's subsequent turns will maintain the same session ID
138
+
- Currently, this feature is only available for SGLang Model Gateway
0 commit comments