Commit a6d1111

feat(golang): write docs for probe (#3193)
* feat(golang): write docs for probe * update
1 parent 4a6cec9 commit a6d1111

2 files changed

Lines changed: 705 additions & 0 deletions

File tree

  • content
    • en/overview/mannual/golang-sdk/tutorial/observability
    • zh-cn/overview/mannual/golang-sdk/tutorial/observability
Lines changed: 353 additions & 0 deletions
@@ -0,0 +1,353 @@
---
aliases:
- /en/docs3-v2/golang-sdk/tutorial/governance/monitor/probe/
- /en/docs3-v3/golang-sdk/tutorial/governance/monitor/probe/
description: "Dubbo-Go Kubernetes Probe (liveness / readiness / startup) user manual"
title: Kubernetes Lifecycle Probe
type: docs
weight: 3
---

# Dubbo-Go Kubernetes Lifecycle Probe

Dubbo-Go provides a built-in **Kubernetes HTTP Probe service** that supports three probe types:

* `liveness`
* `readiness`
* `startup`

The probe service runs on an independent HTTP port and supports:

* Custom health check logic
* Optional alignment with Dubbo internal lifecycle state
* Restart risk control (liveness is not bound to complex internal logic by default)

For a complete runnable example, see:

> [https://github.com/apache/dubbo-go-samples/tree/main/metrics](https://github.com/apache/dubbo-go-samples/tree/main/metrics)

---

# 1. Design Goals

| Goal                | Description                                                 |
| ------------------- | ----------------------------------------------------------- |
| Extensibility       | Supports custom health check callbacks                      |
| Risk Control        | Liveness is not bound to complex internal logic by default  |
| Lifecycle Alignment | Readiness and startup can align with the Dubbo lifecycle    |
| Independent Port    | Isolated from the business service port                     |

---

# 2. Default Behavior

When the probe service is enabled, it exposes its endpoints on the following port by default:

```
Port: 22222
```

The following paths are available:

| Endpoint     | Description               |
| ------------ | ------------------------- |
| GET /live    | Process liveness check    |
| GET /ready   | Service readiness check   |
| GET /startup | Application startup check |

---

## Response Rules

| Condition       | HTTP Status Code |
| --------------- | ---------------- |
| All checks pass | 200              |
| Any check fails | 503              |

---

# 3. Configuration

Dubbo-Go supports both the **New API (recommended)** and the **Old API (YAML)** configuration styles.

---

## 3.1 New API Configuration (Recommended)

```go
import (
	"dubbo.apache.org/dubbo-go/v3"
	"dubbo.apache.org/dubbo-go/v3/metrics"
)

// Enable the probe service and set its port, paths, and lifecycle alignment.
ins, err := dubbo.NewInstance(
	dubbo.WithMetrics(
		metrics.WithProbeEnabled(),
		metrics.WithProbePort(22222),
		metrics.WithProbeLivenessPath("/live"),
		metrics.WithProbeReadinessPath("/ready"),
		metrics.WithProbeStartupPath("/startup"),
		metrics.WithProbeUseInternalState(true),
	),
)
if err != nil {
	panic(err)
}
```

---

## Available Options

| Option                          | Description                           |
| ------------------------------- | ------------------------------------- |
| WithProbeEnabled()              | Enable Probe                          |
| WithProbePort(int)              | Set Probe HTTP port                   |
| WithProbeLivenessPath(string)   | Set liveness path                     |
| WithProbeReadinessPath(string)  | Set readiness path                    |
| WithProbeStartupPath(string)    | Set startup path                      |
| WithProbeUseInternalState(bool) | Enable internal lifecycle state check |

---

## 3.2 Old API YAML Configuration

```yaml
metrics:
  probe:
    enabled: true
    port: 22222
    liveness-path: "/live"
    readiness-path: "/ready"
    startup-path: "/startup"
    use-internal-state: true
```

---

## Configuration Fields

| Field              | Description                                        |
| ------------------ | -------------------------------------------------- |
| enabled            | Enable probe service                               |
| port               | HTTP port                                          |
| liveness-path      | Liveness endpoint path                             |
| readiness-path     | Readiness endpoint path                            |
| startup-path       | Startup endpoint path                              |
| use-internal-state | Whether to enable internal lifecycle state checks  |

---

# 4. Internal Lifecycle State (UseInternalState)

When the option is enabled:

```yaml
use-internal-state: true
```

the probe service attaches Dubbo's internal lifecycle checks to the readiness and startup probes.

---

## Internal State Mechanism

| Probe Type | Depends On                             |
| ---------- | -------------------------------------- |
| readiness  | `probe.SetReady(true/false)`           |
| startup    | `probe.SetStartupComplete(true/false)` |

---

## Default Behavior

* When `Server.Serve()` executes successfully:
  * ready = true
  * startup = true
* During graceful shutdown:
  * ready = false

---

## When Set to false

When the option is disabled:

```yaml
use-internal-state: false
```

the probe result is **fully determined by user-registered callbacks**.

---

# 5. Custom Health Checks (Recommended)

You can extend probe logic by registering callbacks. A callback that returns a non-nil error fails the corresponding probe.

```go
import (
	"context"

	"dubbo.apache.org/dubbo-go/v3/metrics/probe"
)

// Liveness example
probe.RegisterLiveness("db", func(ctx context.Context) error {
	// check database connectivity
	// (see Section 9 before tying liveness to external dependencies)
	return nil
})

// Readiness example
probe.RegisterReadiness("cache", func(ctx context.Context) error {
	// check downstream dependency
	return nil
})

// Startup example
probe.RegisterStartup("warmup", func(ctx context.Context) error {
	// check warmup completion
	return nil
})
```

---

## Execution Logic

* All registered checks are executed.
* If any check returns an error, the probe returns HTTP 503. Otherwise it returns HTTP 200.
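
The rules above can be sketched in plain Go. This is a minimal illustration that uses only the standard library and is not Dubbo-Go's actual implementation: `CheckFunc`, `runChecks`, `probeHandler`, `statusFor`, `pass`, and `fail` are hypothetical names, and only the callback shape `func(ctx context.Context) error` is taken from this document. It assumes Go 1.20 or later for `errors.Join`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// CheckFunc mirrors the callback shape used in this document (hypothetical type name).
type CheckFunc func(ctx context.Context) error

// runChecks executes every check and joins all failures into one error.
// It returns nil when no check fails.
func runChecks(ctx context.Context, checks map[string]CheckFunc) error {
	var errs []error
	for name, check := range checks {
		if err := check(ctx); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", name, err))
		}
	}
	return errors.Join(errs...)
}

// probeHandler maps the aggregated result to HTTP 200 or 503.
func probeHandler(checks map[string]CheckFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := runChecks(r.Context(), checks); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "ok")
	}
}

// statusFor exercises the handler in memory, without opening a port.
func statusFor(checks map[string]CheckFunc) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ready", nil)
	probeHandler(checks)(rec, req)
	return rec.Code
}

// pass and fail are stand-in checks for the demo.
func pass(context.Context) error { return nil }
func fail(context.Context) error { return errors.New("connection refused") }

func main() {
	fmt.Println(statusFor(map[string]CheckFunc{"cache": pass}))
	fmt.Println(statusFor(map[string]CheckFunc{"cache": pass, "db": fail}))
}
```

Running every check instead of stopping at the first failure lets the response body name each failing dependency, which can make a 503 easier to diagnose.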

---

# 6. Semantic Recommendations

## Liveness

Recommended usage:

* Detect process crashes
* Detect fatal core dependency failure

⚠️ Once `failureThreshold` consecutive checks fail, Kubernetes restarts the container.

---

## Readiness

Readiness may depend on:

* Service registry state
* Database
* Redis
* Downstream RPC
* Local cache

Readiness controls whether traffic is routed to the Pod.
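
As an illustration of a dependency-style readiness check, the sketch below builds a callback that passes only when a TCP endpoint accepts a connection within a timeout. The names `tcpCheck`, `checkOnce`, and `listenLocal` are hypothetical, and a bare TCP dial is only an assumption about what "reachable" means; a real check for a database or Redis would more likely issue a protocol-level ping.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// tcpCheck returns a callback with the shape func(ctx) error.
// The callback fails if addr does not accept a TCP connection within timeout.
func tcpCheck(addr string, timeout time.Duration) func(ctx context.Context) error {
	return func(ctx context.Context) error {
		// Bound the check so a hung dependency cannot outlast the probe timeout.
		ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()

		var d net.Dialer
		conn, err := d.DialContext(ctx, "tcp", addr)
		if err != nil {
			return fmt.Errorf("dependency %s unreachable: %w", addr, err)
		}
		return conn.Close()
	}
}

// checkOnce runs the callback a single time with a one-second budget.
func checkOnce(addr string) error {
	return tcpCheck(addr, time.Second)(context.Background())
}

// listenLocal starts a throwaway listener that stands in for a dependency.
func listenLocal() (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	return ln.Addr().String(), func() { _ = ln.Close() }
}

func main() {
	addr, stop := listenLocal()
	fmt.Println("dependency up, error:", checkOnce(addr))
	stop()
	fmt.Println("dependency down, error:", checkOnce(addr))
}
```

A callback built this way could be passed to `probe.RegisterReadiness` from Section 5. Keeping the check timeout below the probe's `timeoutSeconds` helps the endpoint answer before Kubernetes gives up on the request.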

---

## Startup

Suitable for:

* Cold start handling
* Warm-up logic
* Data loading
* Model initialization

The startup probe prevents premature restarts during slow initialization.
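
A startup callback is often just a flag that initialization code flips when it finishes. The sketch below shows that pattern with an atomic flag; `warmedUp`, `warmUp`, `startupCheck`, and `isStarted` are hypothetical names, and the warm-up body is left empty on purpose. It assumes Go 1.19 or later for `atomic.Bool`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
)

// warmedUp records whether initialization has finished.
// An atomic flag keeps reads from the probe goroutine race-free.
var warmedUp atomic.Bool

// startupCheck has the callback shape func(ctx) error.
// It fails until warmUp has completed.
func startupCheck(ctx context.Context) error {
	if !warmedUp.Load() {
		return errors.New("warm-up still in progress")
	}
	return nil
}

// warmUp stands in for cache loading, model initialization, and similar work.
func warmUp() {
	// ... slow initialization would run here ...
	warmedUp.Store(true)
}

// isStarted is a small helper for trying the check outside an HTTP handler.
func isStarted() bool {
	return startupCheck(context.Background()) == nil
}

func main() {
	fmt.Println("before warm-up:", isStarted())
	warmUp()
	fmt.Println("after warm-up:", isStarted())
}
```

Registering `startupCheck` with `probe.RegisterStartup` from Section 5 would keep `/startup` at 503 until the flag is set.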

---

# 7. Kubernetes Configuration Example

```yaml
livenessProbe:
  httpGet:
    path: /live
    port: 22222
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 22222
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 2

startupProbe:
  httpGet:
    path: /startup
    port: 22222
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 25 # 120s startup budget => ceil(120 / 5) + 1
```

---

# 8. Example Usage

Example path within the samples repository linked above:

```
metrics/probe/
```

---

## Run Locally

```bash
go run ./metrics/probe/go-server/cmd/main.go
```

---

## Monitor Probe Status in Real Time

```bash
watch -n 1 '
for p in live ready startup; do
  url="http://127.0.0.1:22222/$p"

  body=$(curl -sS --max-time 2 "$url" 2>&1)
  code=$(curl -s -o /dev/null --max-time 2 -w "%{http_code}" "$url" 2>/dev/null)

  printf "%-8s [%s] %s\n" "$p" "$code" "$body"
done
'
```

---

## Expected Behavior

| Phase            | /live | /ready | /startup |
| ---------------- | ----- | ------ | -------- |
| Just started     | 200   | 503    | 503      |
| Warm-up phase    | 200   | 503    | 503      |
| Warm-up complete | 200   | 200    | 200      |

---

# 9. Production Best Practices

## Recommended Starting Values

| Probe Type | Recommended Values | Notes |
| ---------- | ------------------ | ----- |
| liveness   | `initialDelaySeconds: 10-30`, `periodSeconds: 10`, `timeoutSeconds: 1-3`, `failureThreshold: 3` | Use only for process survival and unrecoverable failures, not for databases, registries, or Redis |
| readiness  | `initialDelaySeconds: 2-5`, `periodSeconds: 5`, `timeoutSeconds: 1-3`, `failureThreshold: 2-3` | Remove traffic quickly when dependencies fail, and recover quickly after they return |
| startup    | `periodSeconds: 5-10`, `timeoutSeconds: 1-3`, `failureThreshold = ceil(maxStartupSeconds / periodSeconds) + 1` | Budget for the longest cold-start, warm-up, and config-loading path |

For example, if the application may need up to `120s` to start and `periodSeconds: 5` is used:

```text
failureThreshold = ceil(120 / 5) + 1 = 25
```
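
The same arithmetic can be written as a small helper, which may be convenient when generating manifests. This is only a sketch: `startupFailureThreshold` is a hypothetical name, and falling back to a period of 10 seconds for non-positive input is an assumption that mirrors the Kubernetes default for `periodSeconds`.

```go
package main

import "fmt"

// startupFailureThreshold computes ceil(maxStartupSeconds / periodSeconds) + 1
// using integer arithmetic.
func startupFailureThreshold(maxStartupSeconds, periodSeconds int) int {
	if periodSeconds <= 0 {
		periodSeconds = 10 // assumption: fall back to the Kubernetes default period
	}
	// (a + b - 1) / b is the integer form of ceil(a / b) for non-negative a.
	return (maxStartupSeconds+periodSeconds-1)/periodSeconds + 1
}

func main() {
	fmt.Println(startupFailureThreshold(120, 5)) // prints 25
}
```

The extra `+ 1` leaves one spare probe period so that an application finishing exactly at its budget is not restarted.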

## Operational Guidance

* Keep `liveness` simple and reserve it for failures that require a restart
* Put service registry, database, Redis, and downstream RPC checks in `readiness`
* Let `startup` absorb slow initialization instead of inflating `liveness.initialDelaySeconds`
* In microservice clusters, enable `use-internal-state: true` and combine it with `probe.SetReady(...)` for proactive traffic draining
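
Proactive draining can be sketched as "mark not ready, wait for Kubernetes to notice, then stop". The helper below is an illustration under stated assumptions: `drain` and `drainWait` are hypothetical names, `setReady` is a stand-in with the assumed shape `func(bool)` where an application would pass `probe.SetReady`, and waiting for `periodSeconds * failureThreshold` is a rule of thumb rather than a documented requirement.

```go
package main

import (
	"fmt"
	"time"
)

// drainWait estimates how long the readiness probe needs to observe the
// failure: one probe period per allowed failure.
func drainWait(periodSeconds, failureThreshold int) time.Duration {
	return time.Duration(periodSeconds*failureThreshold) * time.Second
}

// drain flips readiness off, waits so traffic can move away, then shuts down.
func drain(setReady func(bool), wait time.Duration, shutdown func()) {
	setReady(false)
	time.Sleep(wait)
	shutdown()
}

func main() {
	// Stand-ins for the demo; a real service would pass probe.SetReady
	// (assumed signature) and its own shutdown routine.
	setReady := func(ready bool) { fmt.Println("ready =", ready) }
	shutdown := func() { fmt.Println("shutting down") }

	fmt.Println("suggested wait:", drainWait(5, 2))
	drain(setReady, 0, shutdown) // zero wait keeps the demo instant
}
```

Since Section 4 states that readiness is already set to false during graceful shutdown, a helper like this is mainly useful for draining ahead of planned work, such as before a long-running maintenance task.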
