Implement Remote Persistent Workers #2323
/build-image

Image built and pushed!

Oddly, the only failure we saw was on Windows, but it should have failed on all platforms. Something to think about, @palfrey; it may be a consequence of the work to integrate the new rulesets.

/build-image

FYI: The current CI failure is related to our upstream Nix cache, not the code in the PR. It passed prior to the merge commit and has been stable in my testing.

Image built and pushed!

Locally, I also set up and verified a persistent-worker cluster on the MacBook. What’s running:
- NativeLink local scheduler/CAS/workers on grpc://127.0.0.1:50051
- Worker API on 127.0.0.1:50061
- Config: /tmp/nl-pw-smoke/nativelink-pw.json5 (the actual config is linked at the bottom of this comment)
- Smoke workspace: /tmp/nl-pw-smoke
Verification passed:
- Built a tiny Bazel workspace through local NativeLink remote execution.
- Actions hit the persistent-worker path: logs show "Spawned new persistent worker" and "Persistent worker command complete".
- Repeated clean/build cycles advanced worker request counts from 1 to 2 to 3, proving the persistent workers stayed alive and were reused across Bazel invocations.
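To illustrate why an advancing request count indicates reuse, here is a minimal sketch of a toy persistent worker. It assumes Bazel's JSON worker protocol (newline-delimited requests on stdin, responses with `exitCode`, `output`, and `requestId` on stdout). The `serve` function and the `name` parameter are hypothetical and are not the worker used in this smoke test.

```python
import io
import json
import os


def serve(stdin, stdout, name="one"):
    """Answer newline-delimited JSON WorkRequests until stdin closes.

    Hypothetical toy worker: it only reports how many requests this
    process has handled so far.
    """
    count = 0  # lives as long as the process, so it only grows on reuse
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        request = json.loads(line)
        count += 1
        response = {
            "exitCode": 0,
            "output": f"{name} count={count} cwd={os.getcwd()}",
            # Echo the id so the caller can match response to request.
            "requestId": request.get("requestId", 0),
        }
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()  # the caller blocks until it sees the response
    return count


if __name__ == "__main__":
    # Simulate three requests reaching the same process.
    requests = "".join(
        json.dumps({"arguments": [], "requestId": 0}) + "\n" for _ in range(3)
    )
    out = io.StringIO()
    serve(io.StringIO(requests), out)
    print(out.getvalue(), end="")
```

If a fresh process were spawned for every action, each response would report `count=1`. A count that reaches 3 across separate builds can only come from a process that survived between them.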
Latest output:
one count=3 cwd=/private/tmp/nativelink-multi-test/worker2/work
two count=3 cwd=/private/tmp/nativelink-multi-test/worker3/work
three count=3 cwd=/private/tmp/nativelink-multi-test/worker1/work
Everything seems to be working just fine. After this is merged, I will upgrade the version to

/build-image nativelink-worker-init

/build-image

Image built and pushed!

Image built and pushed!

amankrx left a comment
Tested with the Helm chart, and Persistent Workers are working as expected.
Description
Adds remote persistent worker support for NativeLink worker execution, including Bazel worker protocol handling, worker pooling, lifecycle management, and focused scheduler/worker tests.
Also adds persistent worker deployment examples and website docs, plus a Bazel wrapper for checking the website docs build.
If you need something that we don't have here, please let us know.
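As a rough illustration of the worker pooling and lifecycle management mentioned above, the sketch below keeps idle workers in per-key queues and reuses them before spawning new ones. Everything here is hypothetical: the `WorkerPool` class, the `spawn` callback, the `max_idle_per_key` limit, and the shape of the key are assumptions for illustration, not the implementation in this PR.

```python
from collections import defaultdict, deque


class WorkerPool:
    """Hypothetical pool that reuses idle workers sharing the same key."""

    def __init__(self, spawn, max_idle_per_key=2):
        self._spawn = spawn  # callable: key -> new worker object
        self._max_idle = max_idle_per_key
        self._idle = defaultdict(deque)  # key -> idle workers

    def acquire(self, key):
        """Return an idle worker for this key, or spawn a new one."""
        idle = self._idle[key]
        if idle:
            return idle.popleft()
        return self._spawn(key)

    def release(self, key, worker, healthy=True):
        """Return a worker to the pool; close it if unhealthy or surplus.

        Returns True if the worker was kept for reuse.
        """
        idle = self._idle[key]
        if healthy and len(idle) < self._max_idle:
            idle.append(worker)
            return True
        worker.close()  # assumed to terminate the worker process
        return False
```

A key in this sketch could be any hashable value that identifies interchangeable workers, for example a tuple of an action mnemonic and the worker's startup arguments. Workers are only shared between actions whose keys compare equal.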
Not Implemented In v1
WorkRequests against one process.
--worker_sandboxing semantics yet.
WorkerKeys.
Fixes #2050
Type of change
Please delete options that aren't relevant.
not work as expected)
How Has This Been Tested?
bazel test //nativelink-worker:unit_test //nativelink-worker:integration //nativelink-scheduler:integration --test_output=errors
AND (newly, because this PR will put in place systems that make TS much faster)
bazel test //web/platform:check_test --test_output=errors
Checklist
bazel test //... passes locally
git amend see some docs