Commit e2d992f

committed
feat: device-apiserver design doc
Signed-off-by: Dan Huenecke <dhuenecke@nvidia.com>
1 parent 412549b commit e2d992f

1 file changed

+118
-0
lines changed

docs/design/device-apiserver.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
# Technical Specification: device-apiserver
2+
3+
## Overview
4+
**Synopsis**: The Device API server validates and configures data for hardware API objects, including GPUs. It serves gRPC operations and acts as the frontend to a node’s shared hardware resource state, through which all other local hardware-aware components interact.
5+
6+
---
7+
8+
## Architecture
9+
The server implements the Kubernetes API provider pattern but is optimized for a small node-local footprint.
10+
11+
### Storage Stack
12+
To avoid the overhead of a full etcd cluster, the data layer uses:
13+
- **Kine**: An etcd shim that translates etcd v3 API calls into SQL queries.
14+
- **SQLite**: The database engine.
15+
  - **Default**: In-memory for ephemeral runtime state.
16+
  - **Optional**: Persistent file-based storage for state that must survive restarts.
17+
18+
### State Semantics
19+
- **ResourceVersion (RV)**: Managed by Kine/SQLite. Increments on every write to provide Optimistic Concurrency Control.
20+
- **Generation**: Managed by the server. Increments only when `.Spec` is modified, signaling desired state changes.
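The split between the two counters can be sketched as follows. This is a minimal illustration, not the server's implementation: the `GPU` and `store` types are hypothetical, the in-memory map stands in for Kine/SQLite, `ResourceVersion` is kept numeric for brevity, and starting `Generation` at 1 on create is an assumption borrowed from Kubernetes convention.

```go
package main

import (
	"fmt"
	"reflect"
)

// GPU is a hypothetical, trimmed-down resource used only for this sketch.
type GPU struct {
	Name            string
	ResourceVersion uint64
	Generation      int64
	Spec            map[string]string
	Status          map[string]string
}

// store stands in for the Kine/SQLite layer. rev is a store-wide revision
// counter that advances on every write.
type store struct {
	rev  uint64
	gpus map[string]GPU
}

func newStore() *store { return &store{gpus: map[string]GPU{}} }

// write persists obj. Every write gets a new ResourceVersion; Generation
// only advances when Spec differs from the stored copy.
func (s *store) write(obj GPU) GPU {
	prev, exists := s.gpus[obj.Name]
	switch {
	case !exists:
		obj.Generation = 1 // assumption: first generation is 1
	case !reflect.DeepEqual(prev.Spec, obj.Spec):
		obj.Generation = prev.Generation + 1
	default:
		obj.Generation = prev.Generation
	}
	s.rev++
	obj.ResourceVersion = s.rev
	s.gpus[obj.Name] = obj
	return obj
}

func main() {
	s := newStore()
	g := s.write(GPU{Name: "gpu-0", Spec: map[string]string{"mode": "shared"}})
	fmt.Println(g.ResourceVersion, g.Generation) // 1 1

	g.Status = map[string]string{"health": "ok"} // status-only change
	g = s.write(g)
	fmt.Println(g.ResourceVersion, g.Generation) // 2 1

	g.Spec = map[string]string{"mode": "exclusive"} // spec change
	g = s.write(g)
	fmt.Println(g.ResourceVersion, g.Generation) // 3 2
}
```

A controller watching these objects could compare `Generation` against the last value it acted on to tell desired-state changes apart from status churn.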
21+
22+
---
23+
24+
## gRPC Interface Definition
25+
The server exposes the standard CRUD+UpdateStatus+Patch+Watch interface for hardware resources.
26+
27+
| Method | Target | Scope |
28+
| :--- | :--- | :--- |
29+
| **CreateGpu** | Full Object | Spec/Metadata |
30+
| **UpdateGpu** | **Spec Only** | Spec/Metadata |
31+
| **PatchGpu** | Partial Object | Spec/Metadata |
32+
| **UpdateGpuStatus** | **Status Only** | Status |
33+
| **GetGpu** | Read-only | Single Resource |
34+
| **ListGpus** | Read-only | All Resources |
35+
| **WatchGpus** | Stream Events | All Resources |
36+
37+
---
38+
39+
## Resource Schema
40+
All API objects follow the Kubernetes Resource Model (KRM), with the following exceptions:
41+
42+
### Metadata
43+
- `ObjectMeta`: A subset of `k8s.io/apimachinery/pkg/apis/meta/v1.ObjectMeta`.
44+
  - `Name`
45+
  - `ResourceVersion`
46+
  - `Namespace`
47+
  - `UID`
48+
  - `Generation`
49+
  - `CreationTimestamp`
50+
51+
- `ListMeta`: A subset of `k8s.io/apimachinery/pkg/apis/meta/v1.ListMeta`.
52+
  - `ResourceVersion`
53+
54+
### Options
55+
- `CreateOptions`: Not supported.
56+
57+
- `UpdateOptions`: Not supported.
58+
59+
- `DeleteOptions`: Not supported.
60+
61+
- `ListOptions`: A subset of `k8s.io/apimachinery/pkg/apis/meta/v1.ListOptions`.
62+
  - `ResourceVersion`
63+
64+
---
65+
66+
## Validation & Concurrency
67+
- **Immutability**: The server rejects updates attempting to change `metadata.name`, `metadata.uid` or `metadata.namespace`.
68+
- **Optimistic Concurrency**: If `incoming.ResourceVersion` does not match `current.ResourceVersion`, the server returns a `storage.Conflict` error, so the client must re-read the object and retry.
69+
- **No-Op**: Updates where `incoming.Spec` matches `current.Spec` return successfully without a database write.
70+
71+
---
72+
73+
## Internal Mechanics
74+
75+
### Bootstrapping & Persistence
76+
The server's lifecycle is tied to the Kine-managed SQLite instance:
77+
- **In-Memory**: The SQLite database exists purely in memory. The `device-apiserver` starts with a blank slate. Another component (e.g., the device plugin) is responsible for re-discovering and re-registering the GPUs on every start.
78+
- **On-Disk**: The SQLite database exists in a single ordinary disk file (e.g., `/var/lib/device-apiserver/state.db`). The `device-apiserver` starts from the last successfully persisted state.
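Choosing between the two modes could come down to how the SQLite data source name is built. The following is a sketch only: `sqliteDSN` is a hypothetical helper, the parameter names follow the `mattn/go-sqlite3` DSN conventions, the timeout value is arbitrary, and whether the server or Kine accepts exactly this form is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// sqliteDSN builds a SQLite URI. An empty path selects a named, shared
// in-memory database; any other path selects a file-backed database.
func sqliteDSN(path string) string {
	q := url.Values{}
	if path == "" {
		// Shared cache lets multiple connections see the same in-memory data.
		q.Set("mode", "memory")
		q.Set("cache", "shared")
		return "file:device-apiserver?" + q.Encode()
	}
	// WAL is only requested for the file-backed case.
	q.Set("_journal_mode", "WAL")
	q.Set("_busy_timeout", "5000") // milliseconds; illustrative value
	return "file:" + path + "?" + q.Encode()
}

func main() {
	fmt.Println(sqliteDSN(""))
	// file:device-apiserver?cache=shared&mode=memory
	fmt.Println(sqliteDSN("/var/lib/device-apiserver/state.db"))
	// file:/var/lib/device-apiserver/state.db?_busy_timeout=5000&_journal_mode=WAL
}
```

Treating an empty path as "in-memory" keeps the ephemeral mode as the default, matching the storage stack description.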
79+
80+
### API Discovery & Registration
81+
The `device-apiserver` uses a decentralized registration pattern to manage its API surface. During startup, available APIs are automatically discovered and registered with both the storage backend and gRPC server.
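One common way to get this kind of decentralized registration in Go is for each API package to register an installer from its `init` function. The sketch below assumes that approach; the design doc does not say how discovery is implemented, and `Server`, `Installer`, `Register`, `InstallAll`, and the `GpuService` name are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Server is a stand-in for the real storage backend and gRPC server. It only
// records which resources were wired into each.
type Server struct {
	StorageKinds []string
	GRPCServices []string
}

// Installer wires one API into both the storage backend and the gRPC server.
type Installer func(s *Server)

var (
	mu       sync.Mutex
	registry = map[string]Installer{}
)

// Register is called by each API package, typically from init.
func Register(name string, in Installer) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := registry[name]; dup {
		panic("duplicate API registration: " + name)
	}
	registry[name] = in
}

// InstallAll runs every registered installer in a deterministic order.
func InstallAll(s *Server) {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		registry[n](s)
	}
}

// In a real tree this init would live in the GPU API package.
func init() {
	Register("gpu", func(s *Server) {
		s.StorageKinds = append(s.StorageKinds, "Gpu")
		s.GRPCServices = append(s.GRPCServices, "GpuService")
	})
}

func main() {
	s := &Server{}
	InstallAll(s)
	fmt.Println(s.StorageKinds, s.GRPCServices) // [Gpu] [GpuService]
}
```

With this shape, adding a new hardware resource means adding a package that calls `Register`; the startup path does not need to change.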
82+
83+
---
84+
85+
## Reliability
86+
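- **Database Integrity**: SQLite's WAL (Write-Ahead Logging) mode is enabled by default to allow multiple concurrent readers alongside a single writer. Note that SQLite does not support WAL for in-memory databases, so this applies to the on-disk configuration.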
87+
88+
---
89+
90+
## Observability
91+
The server's observability stack is designed for production-grade monitoring.
92+
93+
### Prometheus Metrics
94+
- **Build Metadata (`device_apiserver_build_info`)**: A constant `Gauge` with labels for `version`, `revision`, `build_date`, `goversion`, `compiler`, and `platform`.
95+
- **Service Availability (`device_apiserver_service_status`)**: A `GaugeVec` that tracks the serving state of internal sub-services (`1`: Serving / Ready, `2`: Not Serving / Storage Backend Disconnected).
96+
- **gRPC Performance (`grpc_server_*`)**: Standard `Histogram`s and `Counter`s via the `grpcprom` provider.
97+
- **Storage Backend (`kine_*`)**:
98+
  - **`kine_sql_total`**: A `CounterVec` tracking the total number of SQL operations, labeled by `error_code`.
99+
  - **`kine_sql_time_seconds`**: A `HistogramVec` providing the distribution of SQL execution times.
100+
  - **`kine_compact_total`**: A `CounterVec` recording successful and failed history compactions.
101+
  - **`kine_insert_errors_total`**: A `CounterVec` tracking retries due to unique constraint violations.
102+
103+
### Admin & Reflection
104+
- **gRPC Reflection**: Dynamic discovery of API schema.
105+
- **Health Checks**: Standard `grpc.health.v1` for liveness and readiness probes.
106+
- **Channelz**: Low-level socket and connection-level statistics.
107+
108+
---
109+
110+
## Security
111+
// TODO
112+
113+
---
114+
115+
## Performance
116+
// TODO
117+
118+
---
