[Serve] limit num_workers in replica's ThreadPoolExecutor to num_cpus#60271

Merged
abrarsheikh merged 8 commits into ray-project:master from myandpr:limit-num-workers
Jan 27, 2026

Conversation

@myandpr
Member

@myandpr myandpr commented Jan 18, 2026

Description

  • Limit the user-code event loop’s default ThreadPoolExecutor size to the deployment’s ray_actor_options["num_cpus"]. Fractional values round up; values <= 0 leave the default size unchanged.
  • This ensures asyncio.to_thread in Serve replicas respects the CPU reservation and avoids oversubscription.
  • Added a Serve test that verifies the default executor’s max_workers matches num_cpus.
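
A minimal sketch of the sizing rule described above, written against plain asyncio rather than Serve internals. The helper names workers_from_num_cpus and limit_default_executor are hypothetical and are not taken from the PR; the merged code may apply additional caps or offsets that the description does not mention.

```python
import asyncio
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def workers_from_num_cpus(num_cpus: Optional[float]) -> Optional[int]:
    """Map a CPU reservation to a thread count (hypothetical helper).

    Returns None when the reservation is missing or <= 0, meaning
    "leave the event loop's default executor alone".
    """
    if num_cpus is None or num_cpus <= 0:
        return None
    # Fractional reservations round up, e.g. 2.2 CPUs -> 3 threads.
    return math.ceil(num_cpus)


def limit_default_executor(
    loop: asyncio.AbstractEventLoop, num_cpus: Optional[float]
) -> Optional[ThreadPoolExecutor]:
    """Install a bounded default executor on `loop` (hypothetical helper)."""
    max_workers = workers_from_num_cpus(num_cpus)
    if max_workers is None:
        return None
    executor = ThreadPoolExecutor(max_workers=max_workers)
    # asyncio.to_thread and run_in_executor(None, ...) use this executor.
    loop.set_default_executor(executor)
    return executor
```

Under this rule a deployment with num_cpus=0.5 would get a single worker thread, and one with num_cpus=0 would keep Python's default executor size.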

Related issues

Link related issues: "Fixes #59750 ", "Closes #59750 ", or "Related to #59750 ".

Additional information

  • Tests run:
    • python -m pytest python/ray/serve/tests/unit/test_user_callable_wrapper.py
    • python -m pytest python/ray/serve/tests/test_replica_sync_methods.py

@myandpr myandpr requested a review from a team as a code owner January 18, 2026 16:17
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a useful optimization to limit the number of workers in the replica's ThreadPoolExecutor based on the num_cpus specified in ray_actor_options. This helps prevent CPU oversubscription when using asyncio.to_thread. The implementation is clean and the logic for calculating the number of workers is robust. I've added one suggestion to enhance the test coverage by including more edge cases, which will help ensure the feature is solid.
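
To illustrate why the default executor size matters for asyncio.to_thread, here is a small standalone sketch that is independent of Ray Serve. It installs a bounded default executor and records how many blocking calls run at the same time. The function names peak_concurrency and measure are hypothetical, and the sleep stands in for real blocking work.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


async def peak_concurrency(max_workers: int, num_tasks: int) -> int:
    """Run `num_tasks` blocking calls via asyncio.to_thread and return
    the highest number that were executing simultaneously."""
    loop = asyncio.get_running_loop()
    # Bound the default executor; asyncio.to_thread submits work to it.
    loop.set_default_executor(ThreadPoolExecutor(max_workers=max_workers))

    lock = threading.Lock()
    running = 0
    peak = 0

    def blocking_work() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for blocking or CPU-bound work
        with lock:
            running -= 1

    await asyncio.gather(
        *(asyncio.to_thread(blocking_work) for _ in range(num_tasks))
    )
    return peak


def measure(max_workers: int, num_tasks: int) -> int:
    # asyncio.run also shuts down the default executor on exit.
    return asyncio.run(peak_concurrency(max_workers, num_tasks))
```

Calling measure(2, 8) should never report more than 2 concurrent calls, because the extra tasks queue inside the executor instead of starting new threads.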

@ray-gardener ray-gardener bot added the serve (Ray Serve Related Issue) and community-contribution (Contributed by the community) labels Jan 18, 2026
Contributor

@harshit-anyscale harshit-anyscale left a comment

lgtm
cc: @abrarsheikh for further review

@myandpr myandpr requested a review from abrarsheikh January 21, 2026 19:05
@myandpr myandpr requested a review from abrarsheikh January 21, 2026 19:29
# __ray_parallel_end__


if __name__ == "__main__":
Contributor

Include your Serve app here so that it runs as part of the doc test.

Member Author

Ah, good catch. I hadn’t noticed the doc test flow, thanks. I added the CustomThreadPool example to main so it now runs there. Please take another look.

@myandpr myandpr requested a review from abrarsheikh January 22, 2026 19:53
@abrarsheikh
Contributor

CI is failing, please merge master into your branch

@abrarsheikh abrarsheikh added the go (add ONLY when ready to merge, run all tests) label Jan 22, 2026
Signed-off-by: yaommen <myanstu@163.com>
Signed-off-by: yaommen <myanstu@163.com>
Signed-off-by: yaommen <myanstu@163.com>
@myandpr
Member Author

myandpr commented Jan 23, 2026

Update: rebased on latest master.

cursor[bot]

This comment was marked as outdated.

Signed-off-by: yaommen <myanstu@163.com>
@myandpr myandpr self-assigned this Jan 23, 2026
Signed-off-by: yaommen <myanstu@163.com>
@myandpr
Member Author

myandpr commented Jan 23, 2026

CI failure (https://buildkite.com/ray-project/premerge/builds/58429/steps/canvas?jid=019be912-9a16-4ce9-8684-eb64454ce290) reason: test_asyncio_default_executor_limited_by_num_cpus had too many parametrized cases and timed out; teardown hit the pytest-timeout SIGTERM.

To address this, I reduced the parametrized cases from 5 to 3 (0, 2.2, 30) to keep core coverage while lowering runtime.
Evidence log: -- Test timed out at 2026-01-23 04:50:57 UTC --


@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Signed-off-by: yaommen <myanstu@163.com>
@myandpr
Member Author

myandpr commented Jan 23, 2026

CI failing, please merge master into your branch

Hi @abrarsheikh, I have rebased on master and CI is green. When you have a moment, could you give it another look?

@abrarsheikh abrarsheikh merged commit 8c4b2b6 into ray-project:master Jan 27, 2026
6 checks passed
eicherseiji added a commit to eicherseiji/ray that referenced this pull request Jan 28, 2026
The test_replica_sync_methods_with_run_sync_in_threadpool test was
configured with size="small" (60s timeout), but after PR ray-project#60271 added
the test_asyncio_default_executor_limited_by_num_cpus test with a
num_cpus=30 parameter, the test suite started timing out during
teardown.

Change the test size from "small" to "medium" (300s timeout) to give
the test suite enough time to complete including cleanup.

Fixes flaky test: test_asyncio_default_executor_limited_by_num_cpus[30-32]

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
jinbum-kim pushed a commit to jinbum-kim/ray that referenced this pull request Jan 29, 2026
…ray-project#60271)

Signed-off-by: yaommen <myanstu@163.com>
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jan 29, 2026
…ray-project#60271)

Signed-off-by: yaommen <myanstu@163.com>
400Ping pushed a commit to 400Ping/ray that referenced this pull request Feb 1, 2026
…ray-project#60271)

Signed-off-by: yaommen <myanstu@163.com>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Labels

  • community-contribution: Contributed by the community
  • go: add ONLY when ready to merge, run all tests
  • serve: Ray Serve Related Issue

Development

Successfully merging this pull request may close these issues.

[Serve] limit num_workers in replica's ThreadPoolExecutor to num_cpus

3 participants