Add 'fire and forget' mode to OSB #964

OVI3D0 · 2025-09-24T20:48:51Z

Description

Adds new 'fire and forget' mode to OSB.

Rather than OSB's traditional client model of 'fire request -> await request -> record metrics -> move on', this mode lets each client fire requests without awaiting responses or recording metrics. This allows OSB to sustain high throughput when load testing a cluster. It is intended for users who don't need precise measurements on each request and would rather test their clusters against very high sustained throughput levels.

The drawback is that OSB returns no information about how the cluster is performing. Users have to rely on outside forms of polling or measurement, such as the performance charts that can be viewed in the AWS console for their cluster.

The PR introduces a new flag, --fire-and-forget, which tells OSB to choose the DeterministicScheduler. Unlike the UnitAwareScheduler used for most benchmarks, this scheduler doesn't care about request metadata; it only calculates the throughput for each client and tells each client to send requests at that rate.
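The per-client rate calculation described above can be sketched as follows. This is a minimal illustration and not the PR's DeterministicScheduler: the class name `FixedRateSchedule`, its parameters, and the assumption that the target throughput is split evenly across clients are all hypothetical.

```python
import itertools


class FixedRateSchedule:
    """Hypothetical stand-in for a scheduler that ignores request metadata."""

    def __init__(self, target_throughput: float, clients: int):
        if target_throughput <= 0 or clients <= 0:
            raise ValueError("target_throughput and clients must be positive")
        # Assumption: every client gets an equal share of the total target rate.
        self.per_client_rate = target_throughput / clients
        # Seconds between two consecutive requests from the same client.
        self.interval = 1.0 / self.per_client_rate

    def send_times(self):
        """Yield the expected send times (seconds since start) for one client."""
        for n in itertools.count():
            yield n * self.interval


# 500 requests per second spread over 100 clients gives 5 requests per second each.
schedule = FixedRateSchedule(target_throughput=500, clients=100)
first_three = list(itertools.islice(schedule.send_times(), 3))
```

Because the send times depend only on the rate, slow or failed responses never shift the schedule.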

The flag also tells OSB to make use of a new request executor, called the UnhingedExecutor. Unlike the AsyncExecutor, this executor creates separate asynchronous tasks to send requests without awaiting them, so requests are sent at the specified rate regardless of latency or failure rates. To sum it up, if each executor needs to send 2 RPS, this mode tells it to create 2 async tasks per second, each of which sends one request, rather than sending 1 request every 0.5 seconds and awaiting each response itself.
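The difference between awaiting each request and handing it to its own task can be illustrated with plain asyncio. This is only a sketch of the general pattern, not the executor from this PR: `fake_request` and `fire_without_awaiting` are hypothetical names, and the request is simulated with a sleep.

```python
import asyncio
import time


async def fake_request(latency: float) -> None:
    """Stand-in for a real client call; it only simulates server latency."""
    await asyncio.sleep(latency)


async def fire_without_awaiting(rate: float, count: int, latency: float) -> float:
    """Start `count` requests at `rate` per second; return the time spent dispatching."""
    interval = 1.0 / rate
    tasks = []
    start = time.perf_counter()
    for n in range(count):
        # Wait for this request's scheduled slot, independent of earlier responses.
        delay = start + n * interval - time.perf_counter()
        if delay > 0:
            await asyncio.sleep(delay)
        # create_task schedules the request and returns immediately.
        tasks.append(asyncio.create_task(fake_request(latency)))
    dispatch_time = time.perf_counter() - start
    # Drained here only so the demo exits cleanly.
    await asyncio.gather(*tasks)
    return dispatch_time


# 5 requests at 20 per second, each taking 0.5 s to complete.
dispatch_time = asyncio.run(fire_without_awaiting(rate=20, count=5, latency=0.5))
```

With these numbers the last request is scheduled 0.2 seconds after the first, whereas awaiting each 0.5-second request in turn would take about 2.5 seconds to send all five.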

This mode can consume resources very quickly, and users should ensure their hardware can handle the thousands of async tasks that are created when using this flag. With so many connections being established, they are also likely to run into an OSError ("Too many open files").

The maximum number of open file descriptors, which includes network sockets, can be checked with ulimit -n and raised for the current shell with the same command, for example: ulimit -n 2048
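The same limit can also be inspected from Python before starting a run, using the standard-library `resource` module (Unix only). The helper name `ensure_open_file_limit` and the target of 2048 are illustrative assumptions; nothing like this is part of the PR.

```python
import resource


def ensure_open_file_limit(wanted: int) -> int:
    """Try to raise the soft open-file limit to `wanted`; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # Already high enough.
    # The soft limit may not exceed the hard limit without extra privileges.
    new_soft = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        return soft  # The platform refused the change; keep the old limit.
    return new_soft


soft_limit = ensure_open_file_limit(2048)
```

A change made this way applies only to the current process and its children, much like running ulimit -n in a shell.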

Issues Resolved

#958

Testing

New functionality includes testing

New unit tests + running tests in 'fire and forget' mode against OS cluster at 500 TPS:

Unit tests produce this warning since we don't await the async tasks:

sys:1: RuntimeWarning: coroutine 'UnhingedExecutor._fire_and_forget_request.<locals>.fire_and_forget_runner' was never awaited
Coroutine created at (most recent call last)
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
    method()
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/unittest/mock.py", line 1325, in patched
    return func(*newargs, **newkeywargs)
  File "/Users/mikeovi/workplace/opensearch-benchmark/tests/__init__.py", line 35, in async_wrapper
    asyncio.run(t(*args, **kwargs), debug=True)
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_events.py", line 603, in run_until_complete
    self.run_forever()
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
    self._run_once()
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_events.py", line 1851, in _run_once
    handle._run()
  File "/Users/mikeovi/.pyenv/versions/3.8.12/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/mikeovi/workplace/opensearch-benchmark/tests/worker_coordinator/worker_coordinator_test.py", line 2454, in test_fire_and_forget_request_no_throttling_needed
    await executor._fire_and_forget_request({}, 1.0, 0.0)  # expected_scheduled_time = 1.0
  File "/Users/mikeovi/workplace/opensearch-benchmark/osbenchmark/worker_coordinator/worker_coordinator.py", line 2603, in _fire_and_forget_request
    task = asyncio.create_task(fire_and_forget_runner())
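The cause of this warning is not shown in the PR. As a general pattern, tasks started without being awaited can be tracked and drained explicitly, which gives tests and shutdown code a way to let them finish before the event loop closes. The sketch below uses a hypothetical `TrackingExecutor` with a hypothetical `drain` method; it does not reflect the executor's real interface.

```python
import asyncio


class TrackingExecutor:
    """Hypothetical executor that remembers the tasks it starts."""

    def __init__(self):
        self.pending = set()
        self.completed = 0

    async def fire(self) -> None:
        async def runner():
            await asyncio.sleep(0)  # Stand-in for sending a request.
            self.completed += 1

        task = asyncio.create_task(runner())
        # Keep a strong reference so the task cannot be garbage-collected mid-flight.
        self.pending.add(task)
        task.add_done_callback(self.pending.discard)

    async def drain(self) -> None:
        """Wait for every task started so far; intended for tests and shutdown."""
        if self.pending:
            await asyncio.gather(*self.pending)


async def scenario() -> int:
    executor = TrackingExecutor()
    for _ in range(3):
        await executor.fire()
    await executor.drain()
    return executor.completed


completed = asyncio.run(scenario())
```

The request path stays non-blocking because `fire` returns as soon as the task is created; only the explicit `drain` call waits.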

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Michael Oviedo <[email protected]>

rishabh6788 · 2025-09-29T17:46:34Z

osbenchmark/benchmark.py

    default=None
    )
+    test_run_parser.add_argument(
+        "--fire-and-forget",


nit: can we come up with some other name, like --no-await or --sustain etc?

rishabh6788 · 2025-09-29T17:48:46Z

osbenchmark/worker_coordinator/worker_coordinator.py

                self.complete.set()
            await self._cleanup()

+class UnhingedExecutor:


Nit: Same here, can we change this to AsyncNoAwaitExecutor or something similar to maintain naming convention?

rishabh6788 · 2025-09-29T17:50:35Z

osbenchmark/worker_coordinator/worker_coordinator.py

+            self.logger.info("Client id [%s] is running now.", self.client_id)
+
+
+    async def _fire_and_forget_request(self, params: dict, expected_scheduled_time: float, total_start: float) -> None:


nit: same here for the naming.

rishabh6788 · 2025-09-29T17:59:46Z

LGTM apart from some naming conventions. Can you run a test in timed mode, maybe for 15-30 mins, and share some more results? It would be a good idea to test it with ramp-up mode as well.

Signed-off-by: Michael Oviedo <[email protected]>

OVI3D0 · 2025-10-15T22:25:53Z

LGTM apart from some naming conventions. Can you run a test in timed mode, maybe for 15-30 mins, and share some more results? It would be a good idea to test it with ramp-up mode as well.

Here's a screenshot after running combined with the ramp-up test procedure property:

I used this test procedure:

{
  "name": "ramp-up-test-procedure",
  "schedule": [
    {
       "operation": "range",
       "warmup-time-period": {{ warmup_time | default(900) | tojson }},
       "ramp-up-time-period": {{ ramp_up_time | default(600) | tojson }},
       "time-period": {{ time_period | default(1500) | tojson }},
       "target-throughput": {{ target_throughput | default(2000) | tojson }},
       "clients": {{ search_clients | default(2000) }}
    }
  ]
}

OVI3D0 added 5 commits September 22, 2025 12:23

add unhinged mode to OSB

546870b

Signed-off-by: Michael Oviedo <[email protected]>

cleanups + new scheduler for unhinged mode

7395d29

Signed-off-by: Michael Oviedo <[email protected]>

add unit tests + cleanup code

fa6d72c

Signed-off-by: Michael Oviedo <[email protected]>

small fix

4a77c9e

Signed-off-by: Michael Oviedo <[email protected]>

fix spacing

edc2991

Signed-off-by: Michael Oviedo <[email protected]>

OVI3D0 requested review from IanHoang, VijayanB, beaioun, gkamat and rishabh6788 as code owners September 24, 2025 20:48

fix lint errors

e78cfe4

Signed-off-by: Michael Oviedo <[email protected]>

rishabh6788 reviewed Sep 29, 2025


OVI3D0 added 2 commits October 15, 2025 12:45

renaming

95440e9

Signed-off-by: Michael Oviedo <[email protected]>

fix unit tests

ad68977

Signed-off-by: Michael Oviedo <[email protected]>

OVI3D0 requested a review from rishabh6788 October 20, 2025 17:37
