You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
***Integration tests**: Test launcher with actual vLLM instances
89
-
* Verify model loading and unloading
90
-
* Test sleep/wake functionality
91
-
* Validate model swapping capabilities
92
-
***Benchmark tests**: Use `inference_server/benchmark/` for performance testing
93
-
* Run scenarios defined in `scenarios.py`
94
-
* Measure startup latency and model swap times
95
-
96
-
#### 3. Launcher-Populator Controller Testing
97
-
98
-
The launcher-populator controller ensures the right number of launcher pods exist on each node:
99
-
100
-
***Unit tests**: Test controller logic in `cmd/launcher-populator/`
101
-
***Integration tests**: Verify LauncherConfig and LauncherPopulationPolicy handling
102
-
***E2E tests**: Validate launcher pod distribution across nodes
103
-
* Use `test/e2e/run-launcher-based.sh` for launcher-based testing
104
-
105
-
#### 4. Custom Resource Definitions (CRDs)
106
-
107
-
Test the three CRDs defined in `config/crd/`:
108
-
109
-
***InferenceServerConfig**: Verify server configuration properties
110
-
***LauncherConfig**: Test launcher process configuration
111
-
***LauncherPopulationPolicy**: Validate launcher pod population rules
112
-
113
-
### Running Tests
114
-
115
-
**Go tests:**
116
-
117
-
```bash
118
-
make test
119
-
```
120
-
121
-
**Python tests:**
122
-
123
-
```bash
124
-
cd inference_server/launcher
125
-
python -m pytest tests/
126
-
```
127
-
128
-
**E2E tests:**
129
-
130
-
```bash
131
-
# Standard dual-pods test
132
-
./test/e2e/run.sh
133
-
134
-
# Launcher-based test
135
-
./test/e2e/run-launcher-based.sh
136
-
```
137
-
138
-
**Benchmark tests:**
139
-
140
-
```bash
141
-
cd inference_server/benchmark
142
-
python benchmark_base.py
143
-
```
144
-
145
-
### Code Review Requirements
68
+
## Code Review Requirements
146
69
147
70
***All code changes** must be submitted as pull requests (no direct pushes)
148
71
***All changes** must be reviewed and approved by a maintainer other than the author
149
72
***All repositories** must gate merges on compilation and passing tests
150
73
***All experimental features** must be off by default and require explicit opt-in
151
74
152
-
###Commit and Pull Request Style
75
+
## Commit and Pull Request Style
153
76
154
77
***Pull requests** should describe the problem succinctly
155
78
***Rebase and squash** before merging
@@ -163,66 +86,45 @@ python benchmark_base.py
163
86
* See [PR_SIGNOFF.md](PR_SIGNOFF.md) for configuration details
164
87
* Required for all contributions per [Developer Certificate of Origin](https://developercertificate.org/)
165
88
166
-
## Code Organization and Ownership
167
-
168
-
### Components and Maintainers
169
-
170
-
***Components** are the primary unit of code organization (repo scope or directory/package/module within a repo)
171
-
***Maintainers** own components and approve changes
172
-
***Contributors** can become maintainers through sufficient evidence of contribution
173
-
* Code ownership is reflected in [OWNERS files](https://go.k8s.io/owners) consistent with Kubernetes project conventions
174
-
175
-
### Experimental Features in FMA
176
-
177
-
As an incubating component, FMA encourages fast iteration and exploration with these constraints:
178
-
179
-
1.**Clear identification** as experimental in code and documentation
180
-
2.**Default to off** and require explicit enablement for experimental features
181
-
3.**Best effort support** only
182
-
4.**Removal if unmaintained** with no one to move it forward
183
-
5.**No stigma** to experimental or incubating status
184
-
185
-
**Naming convention**: Experimental flags must include `experimental` in name (e.g., `--experimental-model-swap-v2=true`)
186
-
187
-
When adding experimental features:
188
-
189
-
1. Open pull request with clear experimental designation
190
-
2. Maintainer reviews and enforces "off-by-default" gating
191
-
3. Provide tests for both on/off states
192
-
4. Document the experimental nature in code comments and user documentation
193
-
5. When graduating a feature, default to on and remove conditional logic after one release
194
-
195
89
## API Changes and Deprecation
196
90
197
-
***No breaking changes**: Once an API/protocol is in GA release (non-experimental), it cannot be removed or behavior changed
91
+
***No breaking changes**: The no-breaking-changes policy will apply once we reach GA
198
92
***Includes**: All protocols, API endpoints, internal APIs, command line flags/arguments
199
93
***Exception**: Bug fixes that don't impact significant number of consumers (As the project matures, we will be stricter about such changes - Hyrum's Law is real)
200
-
***Versioning**: All protocols and APIs should be versionable with clear forward and backward compatibility requirements. A new version may change behavior and fields.
94
+
***Versioning**: All protocols and APIs should be versionable with clear forward and backward compatibility requirements. A new version may change behavior and fields. For Go modules and Python packages use semver v0.x.x. For Kubernetes API object types we use the Kubernetes versioning structure and evolution rules
201
95
***Documentation**: All APIs must have documented specs describing expected behavior
202
96
203
97
## Testing Requirements
204
98
205
-
We use three tiers of testing:
99
+
We use two tiers of testing:
206
100
207
-
1.**Unit tests**: Fast verification of code parts, testing different arguments
101
+
1.**Behavioral tests**: Fast verification of code parts, testing different arguments
208
102
* Best for fast verification of parts of code, testing different arguments
209
103
* Doesn't cover interactions between code
210
-
2.**Integration tests**: Testing protocols between components and built artifacts
211
-
* Best for testing protocols and agreements between components
212
-
* May not model interactions between components as they are deployed
213
-
3.**End-to-end (e2e) tests**: Whole system testing including benchmarking
104
+
2.**End-to-end (e2e) tests**: Whole system testing including benchmarking
214
105
* Best for preventing end to end regression and verifying overall correctness
215
106
* Execution can be slow
216
107
217
-
Strong e2e coverage is required for deployed systems to prevent performance regression. Appropriate test coverage is an important part of code review.
108
+
Strong e2e coverage is required for deployed systems to prevent functional regression. Appropriate test coverage is an important part of code review.
218
109
219
110
## Security
220
111
221
-
Maintain appropriate security mindset for production serving. The project will establish a project email address for responsible disclosure of security issues that will be reviewed by the project maintainers. Prior to the first GA release we will formalize a security component and process.
112
+
Maintain appropriate security mindset for production serving. The project will establish a project email address for responsible disclosure of security issues that will be reviewed by the project maintainers. Prior to the first GA release we will formalize a security component and process. More details on security can be found in the [SECURITY.md](./SECURITY.md) file.
113
+
114
+
## Project Structure and Ownership
115
+
116
+
The repository contains the following deployable components:
222
117
223
-
## Project Structure
118
+
| Component | Language | Source | Description |
119
+
|---|---|---|---|
120
+
|**Dual-Pods Controller**| Go |`cmd/dual-pods-controller/`, `pkg/controller/dual-pods/`| Manages server-providing Pods in reaction to server-requesting Pods. Handles binding, sleep/wake, and readiness relay. |
121
+
|**Launcher-Populator Controller**| Go |`cmd/launcher-populator/`, `pkg/controller/launcher-populator/`| Proactively creates launcher pods on nodes based on `LauncherPopulationPolicy` CRDs. |
122
+
|**Requester**| Go |`cmd/requester/`, `pkg/server/requester/`| Lightweight binary running in server-requesting Pods. Exposes SPI endpoints for GPU info and readiness relay. |
123
+
|**Launcher**| Python |`inference_server/launcher/`| FastAPI service managing multiple vLLM subprocess instances via REST API. |
124
+
|**Test Requester**| Go |`cmd/test-requester/`| Test binary simulating a requester with GPU allocation. |
125
+
|**Test Server**| Go |`cmd/test-server/`| Test binary simulating a vLLM-like inference server. |
224
126
225
-
The FMA repository is organized as follows:
127
+
The two controllers are deployed via Helm charts in `charts/`.
@@ -271,7 +173,7 @@ This is an **incubating component** in the llm-d ecosystem, focused on fast mode
271
173
272
174
### Component Ownership
273
175
274
-
***Maintainers** are listed in the [OWNERS](OWNERS) file
176
+
***Maintainers** are listed in the [OWNERS](OWNERS) file. The file follows Kubernetes conventions for future Prow compatibility but is not currently consumed by automation. Additional OWNERS files can be added per-directory as the project grows.
275
177
***Contributors** can become maintainers through consistent, quality contributions
276
178
* Code ownership follows Kubernetes project conventions with OWNERS files
Copy file name to clipboardExpand all lines: PR_SIGNOFF.md
+1-12Lines changed: 1 addition & 12 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -121,18 +121,7 @@ For Windows users, **Git Bash** is also highly recommended.
121
121
122
122
2. Press `Enter` to selectthe default option if prompted to set a save-file or passphrase for the key (you may choose to enter a passphrase if desired; this will prompt you to enter the passphrase every time you perform a DCO sign-off).
123
123
- The following output should generate a `randomart` image
124
-
3. Use the following command to copy the **public** part of the new SSH key to your clipboard:
125
-
126
-
```shell
127
-
clip <~/.ssh/id_ed25519.pub
128
-
```
129
-
130
-
Note: If you are in a WSL shell, use instead
131
-
132
-
```shell
133
-
clip.exe <~/.ssh/id_ed25519.pub
134
-
```
135
-
124
+
3. Copy the contents of ~/.ssh/id_ed25519.pub to your clipboard
136
125
4. After copying or saving your SSH key, navigate to **Settings**in your Github.
137
126
5. Navigate to the **SSH and GPG keys** page under the Access section in the sidebar.
0 commit comments