Commit 63a834d
bugfix: Ninja race condition fix (#2339)
<!-- .github/pull_request_template.md -->
## 📌 Description
**Fix race condition in JIT compilation for multi-GPU/multi-process
environments**
This PR resolves a race condition in multi-process environments where
concurrent Ninja builds contended for a shared `.ninja_log` file
(`ninja: error: opening build log: No such file or directory`), causing
intermittent failures when multiple processes compile different
FlashInfer modules simultaneously.
Key changes:
1. Isolated workdirs for runtime JIT: Each module now builds in its own
subdirectory (`cached_ops/<module>/`), isolating `.ninja_log` files
between concurrent builds
2. Absolute output paths: Ninja build files use absolute paths for
object and output files, ensuring correct output locations regardless of
workdir
3. Preserved AOT parallelism: Batch builds retain the `subninja`
approach, maintaining full parallel compilation
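
A minimal sketch of the workdir isolation in item 1, not the actual FlashInfer code. The helper names `module_workdir` and `ninja_command`, the cache location, and the module names are all hypothetical. It relies on Ninja's `-C` flag, which changes into the given directory before building, so the `.ninja_log` lands there unless the build file sets `builddir`.

```python
from pathlib import Path


def module_workdir(cache_root: Path, module: str) -> Path:
    """Return the per-module build directory (hypothetical helper)."""
    return cache_root / module


def ninja_command(workdir: Path) -> list[str]:
    """Build a Ninja invocation that runs inside `workdir`.

    `-C` makes Ninja change into the directory first, so its `.ninja_log`
    is written there instead of in a directory shared with other builds.
    """
    return ["ninja", "-C", str(workdir), "-f", str(workdir / "build.ninja")]


cache_root = Path("/tmp/cached_ops")  # placeholder cache location
for name in ("module_a", "module_b"):  # placeholder module names
    print(" ".join(ninja_command(module_workdir(cache_root, name))))
```

Two processes compiling different modules would then each run Ninja in their own directory, which is the property the fix depends on.
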
Aspect | Before | After
-- | -- | --
Runtime JIT workdir | Shared `cached_ops/` | Isolated `cached_ops/<module>/`
AOT batch workdir | `cached_ops/` | `cached_ops/` (unchanged)
`.ninja_log` conflict | Yes (race condition) | No (different workdir levels)
Ninja output paths | Relative (`$name/$name.so`) | Absolute (`/path/to/cached_ops/<module>/<module>.so`)
Output locations | `cached_ops/<module>/<module>.so` | `cached_ops/<module>/<module>.so` (unchanged)
AOT parallelism | Full (`subninja`) | Full (`subninja`) - preserved
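
To illustrate the "Ninja output paths" row, here is a hedged sketch of a generator that writes absolute object and output paths into a build file. The function `render_ninja`, the rule names, and the compiler commands are assumptions made for illustration and are not taken from this PR. Only the `build`, `rule`, and `default` statements are standard Ninja syntax.

```python
from pathlib import Path


def render_ninja(build_dir: Path, module: str, sources: list[Path]) -> str:
    """Render a minimal build.ninja whose outputs are absolute paths.

    Hypothetical helper: because every output is absolute, the artifacts
    land in `build_dir` no matter which directory Ninja is started from.
    Paths containing spaces would need Ninja's `$ ` escaping (not handled).
    """
    if not build_dir.is_absolute():
        raise ValueError("build_dir must be an absolute path")

    lines = [
        "rule compile",
        "  command = c++ -fPIC -c $in -o $out",  # placeholder compiler command
        "rule link",
        "  command = c++ -shared $in -o $out",  # placeholder link command
        "",
    ]
    objects = []
    for src in sources:
        obj = build_dir / (src.stem + ".o")
        objects.append(obj)
        lines.append(f"build {obj}: compile {src}")

    target = build_dir / f"{module}.so"
    lines.append(f"build {target}: link {' '.join(map(str, objects))}")
    lines.append(f"default {target}")
    return "\n".join(lines) + "\n"


print(render_ninja(Path("/tmp/cached_ops/module_a"), "module_a", [Path("/src/a.cu")]))
```

With relative outputs such as `$name/$name.so`, moving the workdir one level down would also move the artifacts. Absolute paths keep the output location unchanged, as the table states.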
<!-- What does this PR do? Briefly describe the changes and why they’re
needed. -->
## 🔍 Related Issues
<!-- Link any related issues here -->
#2338
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [ ] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
  * Build system now isolates per-spec build outputs into dedicated build
    directories and runs per-spec compilation with top-level artifact
    targets to avoid cross-spec interference and race conditions.
* **New Features**
  * Added a public per-spec `build_dir` property to configure where each
    specification’s build outputs are placed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
1 parent 0e9a89d commit 63a834d
2 files changed
Lines changed: 17 additions & 7 deletions