| 1 | +## Contributing Guidelines |
| 2 | + |
| 3 | +FMA is currently developed by a small team in a focused development spike. We welcome contributions that align with the project's goals. The FMA project accepts contributions via GitHub pull requests. |
| 4 | + |
| 5 | +## How You Can Contribute |
| 6 | + |
| 7 | +There are several ways you can contribute to FMA: |
| 8 | + |
| 9 | +* **Reporting Issues:** Help us identify and fix bugs by reporting them clearly and concisely. |
| 10 | +* **Suggesting Features:** Share your ideas for new features or improvements. |
| 11 | +* **Improving Documentation:** Help make the project more accessible by enhancing the documentation. |
| 12 | +* **Submitting Code Contributions (with consideration):** While the project leads maintain final say, code contributions that align with the project's vision are always welcome. |
| 13 | + |
| 14 | +## Code of Conduct |
| 15 | + |
| 16 | +This project adheres to the llm-d [Code of Conduct and Covenant](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. |
| 17 | + |
| 18 | +## Community and Communication |
| 19 | + |
| 20 | +* **Developer Slack:** [Join our developer Slack workspace](https://llm-d.ai/slack) and use the **#fast-model-actuation** channel to connect with the core maintainers and other contributors, ask questions, and take part in discussions. |
| 21 | +* **Weekly Meetings:** FMA project updates, ongoing work discussions, and Q&A are covered in our weekly project meeting every **Tuesday at 8:00 PM ET**. Join at [meet.google.com/nha-rgkw-qkw](https://meet.google.com/nha-rgkw-qkw). |
| 22 | +* **Code**: Hosted in the [llm-d-incubation](https://github.com/llm-d-incubation) GitHub organization |
| 23 | +* **Issues**: FMA-specific bugs or issues should be reported in [llm-d-incubation/llm-d-fast-model-actuation](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues) |
| 24 | +* **Mailing List**: [llm-d-contributors@googlegroups.com](mailto:llm-d-contributors@googlegroups.com) for document sharing and collaboration |
| 25 | +* **Social Media:** Follow the main llm-d project on social media for the latest news, announcements, and updates: |
| 26 | + * **X:** [https://x.com/\_llm_d\_](https://x.com/_llm_d_) |
| 27 | + * **LinkedIn:** [https://linkedin.com/company/llm-d](https://linkedin.com/company/llm-d) |
| 28 | + * **Reddit:** [https://www.reddit.com/r/llm_d/](https://www.reddit.com/r/llm_d/) |
| 29 | + * **YouTube:** [@llm-d-project](https://youtube.com/@llm-d-project) |
| 30 | + |
| 31 | +## Contributing Process |
| 32 | + |
| 33 | +We are a small team with defined responsibilities. All proposals must be reviewed by at least one relevant human reviewer, with broader review expected for changes with particularly wide impact. |
| 34 | + |
| 35 | +### Types of Contributions |
| 36 | + |
| 37 | +#### 1. Features with Public APIs or New Components |
| 38 | + |
| 39 | +All features involving public APIs, behavior between core components, or new core repositories/subsystems should be discussed with maintainers before implementation. |
| 40 | + |
| 41 | +**Process:** |
| 42 | + |
| 43 | +1. Create an issue in the [FMA repository](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues) describing: |
| 44 | + * **Summary**: A clear description of the change proposed and the outcome |
| 45 | + * **Motivation**: Problem to be solved, including Goals/Non-Goals, and any necessary background |
| 46 | + * **Proposal**: User stories and enough detail that reviewers can understand what you're proposing |
| 47 | + * **Design Details**: Specifics of your change including API specs or code snippets if applicable |
| 48 | + * **Alternatives**: Alternative implementations considered and why they were rejected |
| 49 | +2. Discuss in the **#fast-model-actuation** Slack channel or weekly meeting |
| 50 | +3. Get review from impacted component maintainers |
| 51 | +4. Get approval from project maintainers before starting implementation |
| 52 | + |
| 53 | +#### 2. Fixes, Issues, and Bugs |
| 54 | + |
| 55 | +For changes that fix broken code or add small changes within a component: |
| 56 | + |
| 57 | +* All bug reports and commits must clearly describe the bug, how to reproduce it, and how the change addresses it |
| 58 | +* Create an issue in the [FMA repository](https://github.com/llm-d-incubation/llm-d-fast-model-actuation/issues) or submit a pull request directly for small fixes |
| 59 | +* A maintainer must approve the change (within the spirit of the component design and scope of change) |
| 60 | +* For moderate-size changes, create an RFC issue on GitHub and engage in the **#fast-model-actuation** Slack channel |
| 61 | + |
| 62 | +## Feature Testing |
| 63 | + |
| 64 | +The current testing documentation can be found within the respective components of the [docs folder](docs/). |
| 65 | + |
| 66 | +## Code Review Requirements |
| 67 | + |
| 68 | +* **All code changes** must be submitted as pull requests (no direct pushes) |
| 69 | +* **All changes** must be reviewed and approved by a maintainer other than the author |
| 70 | +* **All repositories** must gate merges on compilation and passing tests |
| 71 | + |
| 72 | +## Commit and Pull Request Style |
| 73 | + |
| 74 | +* **Pull requests** should describe the problem succinctly |
| 75 | +* **Prefer smaller PRs** over larger ones; when a PR adds multiple commits, prefer smaller commits |
| 76 | +* **Commit messages** should have: |
| 77 | + * Short, descriptive titles |
| 78 | + * Description of why the change was needed |
| 79 | + * Enough detail for someone reviewing git history to understand the scope |
| 80 | +* **DCO Sign-off**: All commits must include a valid DCO sign-off line (`Signed-off-by: Name <email@domain.com>`) |
| 81 | + * Add automatically with `git commit -s` |
| 82 | + * See [PR_SIGNOFF.md](PR_SIGNOFF.md) for configuration details |
| 83 | + * Required for all contributions per [Developer Certificate of Origin](https://developercertificate.org/) |
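To illustrate what the sign-off requirement above means in practice, here is a minimal sketch that checks whether a commit message carries a well-formed `Signed-off-by` line. It is not the project's DCO check: the function name `has_dco_signoff` and the regular expression are assumptions made for this example, and a real DCO check may apply stricter rules.

```python
import re

# Assumed pattern for illustration: "Signed-off-by: Name <user@domain>" on its own line.
# A real DCO check may be stricter (for example, matching the commit author).
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>[^<>\n]+) <(?P<email>[^<>\s@]+@[^<>\s@]+)>$",
    re.MULTILINE,
)


def has_dco_signoff(message: str) -> bool:
    """Return True if the commit message contains at least one sign-off line."""
    return SIGNOFF_RE.search(message) is not None


if __name__ == "__main__":
    signed = (
        "Fix readiness relay\n\n"
        "Explain why the change was needed.\n\n"
        "Signed-off-by: Jane Doe <jane@example.com>\n"
    )
    unsigned = "Fix readiness relay\n\nExplain why the change was needed.\n"
    print(has_dco_signoff(signed))
    print(has_dco_signoff(unsigned))
```

Running `git commit -s` adds such a line automatically, so a check like this should only fail for commits created without the `-s` flag.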
| 84 | + |
| 85 | +## API Changes and Deprecation |
| 86 | + |
| 87 | +* **Includes**: All protocols, API endpoints, internal APIs, command line flags/arguments, and Kubernetes API object type (resource) definitions |
| 88 | +* **Versioning**: We use [Semantic Versioning](https://semver.org) at major version 0 for Go modules and Python packages, which grants freedom to make breaking changes. For Kubernetes API object types we use the Kubernetes versioning structure and evolution rules (currently at `v1alpha1`). Since the project has no installed base, we currently make changes without regard to backward compatibility. |
| 89 | +* **Documentation**: All APIs must have documented specs describing expected behavior |
| 90 | + |
| 91 | +## Testing Requirements |
| 92 | + |
| 93 | +We use two tiers of testing: |
| 94 | + |
| 95 | +1. **Behavioral unit tests**: Verification of individual units of code in isolation |
| 96 | + * Best for fast verification of parts of the code across different arguments |
| 97 | + * Do not cover interaction between units of code |
| 98 | +2. **End-to-end (e2e) tests**: Whole system testing including benchmarking |
| 99 | + * Best for preventing end-to-end regression and verifying overall correctness |
| 100 | + * Execution can be slow |
| 101 | + |
| 102 | +Appropriate test coverage is an important part of code review. |
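As a rough illustration of the first tier, the sketch below shows a behavioral unit test that exercises one unit of code with several arguments, using Python's standard `unittest` module. The helper `parse_gpu_ids` is hypothetical and is not part of FMA; it exists only to give the test something to verify.

```python
import unittest


def parse_gpu_ids(spec: str) -> list[int]:
    """Hypothetical helper (not part of FMA): parse "0,1,3" into [0, 1, 3]."""
    if not spec.strip():
        return []
    return [int(part) for part in spec.split(",")]


class ParseGpuIdsTest(unittest.TestCase):
    def test_valid_arguments(self):
        # One table of (input, expected) pairs covers several arguments.
        cases = [
            ("", []),
            ("0", [0]),
            ("0,1,3", [0, 1, 3]),
            (" 2 , 4 ", [2, 4]),
        ]
        for spec, expected in cases:
            with self.subTest(spec=spec):
                self.assertEqual(parse_gpu_ids(spec), expected)

    def test_invalid_argument_raises(self):
        # int() rejects non-numeric parts, so the helper raises ValueError.
        with self.assertRaises(ValueError):
            parse_gpu_ids("0,x")


if __name__ == "__main__":
    unittest.main()
```

A test like this runs in milliseconds and pins down the behavior of one unit, but it says nothing about how that unit interacts with the rest of the system; that is what the e2e tier is for.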
| 103 | + |
| 104 | +## Security |
| 105 | + |
| 106 | +Maintain an appropriate security mindset for production serving. The project will establish a project email address for responsible disclosure of security issues, which the project maintainers will review. Prior to the first GA release we will formalize a security component and process. More details on security can be found in the [SECURITY.md](./SECURITY.md) file. |
| 107 | + |
| 108 | +## Project Structure and Ownership |
| 109 | + |
| 110 | +The repository contains the following deployable components. |
| 111 | + |
| 112 | + | Component | Language | Source | Description | |
| 113 | + |---|---|---|---| |
| 114 | + | **Dual-Pods Controller** | Go | `cmd/dual-pods-controller/`, `pkg/controller/dual-pods/` | Manages server-providing Pods (milestone 2) and launched vLLM instances (milestone 3) in reaction to server-requesting Pods. Handles binding, sleep/wake, and readiness relay. | |
| 115 | + | **Launcher-Populator Controller** | Go | `cmd/launcher-populator/`, `pkg/controller/launcher-populator/` | Proactively creates launcher pods on nodes based on `LauncherPopulationPolicy` CRDs. | |
| 116 | + | **Requester** | Go | `cmd/requester/`, `pkg/server/requester/` | Lightweight binary running in server-requesting Pods. Exposes SPI endpoints for GPU info and readiness relay. | |
| 117 | + | **Launcher** | Python | `inference_server/launcher/` | FastAPI service managing multiple vLLM subprocess instances via REST API. | |
| 118 | + | **Test Requester** | Go | `cmd/test-requester/` | Test binary simulating a requester (does not use real GPUs). | |
| 119 | + | **Test Server** | Go | `cmd/test-server/` | Test binary simulating a vLLM-like inference server. | |
| 120 | + | **Test Launcher** | Python | `dockerfiles/Dockerfile.launcher.cpu` | CPU-based launcher image for testing without GPUs. | |
| 121 | + |
| 122 | + The two controllers are deployed via a single Helm chart in `charts/fma-controllers/`. |
| 123 | + |
| 124 | +### Core Organization (`llm-d-incubation/llm-d-fast-model-actuation`) |
| 125 | + |
| 126 | +This is an **incubating component** in the llm-d ecosystem, focused on fast model actuation techniques. |
| 127 | + |
| 128 | +#### Directory Structure |
| 129 | + |
| 130 | +* **`api/fma/v1alpha1/`**: Custom Resource Definitions (CRDs) and Go types |
| 131 | + * `inferenceserverconfig_types.go`: InferenceServerConfig CRD |
| 132 | + * `launcherconfig_types.go`: LauncherConfig CRD |
| 133 | + * `launcherpopulationpolicy_types.go`: LauncherPopulationPolicy CRD |
| 134 | + |
| 135 | +* **`cmd/`**: Main applications |
| 136 | + * `dual-pods-controller/`: Controller managing server-providing Pods |
| 137 | + * `launcher-populator/`: Controller managing launcher pod population |
| 138 | + * `requester/`: Requester binary for server-requesting Pods |
| 139 | + * `test-requester/`: Test requester (does not use real GPUs) |
| 140 | + * `test-server/`: Test binary simulating a vLLM-like inference server |
| 141 | + |
| 142 | +* **`charts/`**: Helm charts for deployment |
| 143 | + * `fma-controllers/`: Unified Helm chart for both controllers |
| 144 | + |
| 145 | +* **`config/`**: Kubernetes configurations (CRDs, examples, and more — see [cluster-sharing docs](docs/cluster-sharing.md) for recent extensions) |
| 146 | + |
| 147 | +* **`inference_server/`**: Python-based inference server components |
| 148 | + * `launcher/`: vLLM instance launcher (persistent management process) |
| 149 | + * `benchmark/`: Benchmarking tools and scenarios |
| 150 | + |
| 151 | +* **`docs/`**: Documentation (see [`docs/README.md`](docs/README.md) for full index) |
| 152 | + |
| 153 | +* **`test/e2e/`**: End-to-end test scripts |
| 154 | + * `run.sh`: Standard dual-pods E2E test |
| 155 | + * `run-launcher-based.sh`: Launcher-based E2E test |
| 156 | + |
| 157 | +* **`dockerfiles/`**: Container image definitions |
| 158 | + * `Dockerfile.launcher.cpu`: CPU-based launcher image for testing without GPUs |
| 159 | + * `Dockerfile.launcher.benchmark`: GPU-based launcher image (the real deal) |
| 160 | + * `Dockerfile.requester`: Requester application image |
| 161 | + |
| 162 | +### Component Ownership |
| 163 | + |
| 164 | +* **Maintainers** are listed in the [OWNERS](OWNERS) file. The file follows [Kubernetes OWNERS conventions](https://www.kubernetes.dev/docs/guide/owners/) for future Prow compatibility but is not currently consumed by automation. Additional OWNERS files can be added per-directory as the project grows. |
| 165 | +* **Contributors** can become maintainers through consistent, quality contributions |
| 166 | + |
| 167 | +### Incubation Status |
| 168 | + |
| 169 | +FMA is currently in the **llm-d-incubation** organization, which means: |
| 170 | + |
| 171 | +* **Rapid iteration**: Greater freedom for testing new ideas and approaches |
| 172 | +* **Components may change significantly** as we learn |
| 173 | +* **Best effort support**: Not yet ready for production use |
| 174 | +* **Graduation path**: Working toward integration with core llm-d components |
| 175 | + |
| 176 | +### Graduation |
| 177 | + |
| 178 | +Graduation criteria are defined by the llm-d organization (not this repo). This repo tracks its progress toward meeting those criteria. See the llm-d organization documentation for details. |