stdcoroutine-2: Migrate production entry points from Boost.Coroutine to C++20 coroutines #6423
Draft
pratikmankawde wants to merge 22 commits into pratik/Swtich-to-std-coroutines from
…iter

Introduce the core building blocks for migrating from Boost.Coroutine to C++20 stackless coroutines (Milestone 1):

- CoroTask<T>: RAII coroutine return type with promise_type, symmetric transfer via FinalAwaiter, and lazy start (suspend_always)
- CoroTaskRunner: Lifecycle manager (nested in JobQueue) mirroring the existing Coro class; handles LocalValues swap, nSuspend_ accounting, mutex-guarded resume, and join/post semantics
- JobQueueAwaiter: Convenience awaiter combining suspend + auto-repost, with graceful fallback when JobQueue is stopping
- postCoroTask(): JobQueue entry point for launching C++20 coroutines
- CoroTask_test.cpp: 8 unit tests covering completion, suspend/resume ordering, LocalValue isolation, exception propagation, and shutdown

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This change replaces `void const*` with `uint256 const&` for database fetches. Object hashes are expressed using the `uint256` data type, and are converted to `void *` when calling the `fetch` or `fetchBatch` functions. However, in these fetch functions they are converted back to `uint256`, making the conversion process unnecessary. In a few cases the underlying pointer is needed, but that can then be easily obtained via `[hash variable].data()`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3ee0341 to
9d79af8
This was referenced Feb 26, 2026
clang-format: collapse single-line initializer lists and function arguments. prettier: add blank lines in markdown lists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9d79af8 to
f451347
Store the coroutine callable on the heap in CoroTaskRunner::init() via a type-erased FuncStore wrapper. Coroutine frames store a reference to the callable's implicit object parameter (the lambda); if the callable is a temporary, that reference dangles after the caller returns. This caused stack-use-after-scope (ASAN), assertion failures, and hangs across multiple compilers. Also fix expectEarlyExit() to destroy the coroutine frame when postCoroTask() fails, breaking a potential shared_ptr cycle. Switch all coroutine test lambda captures from [&] to explicit pointer-by-value as defense-in-depth against GCC 14 coroutine frame corruption. Add value-returning CoroTask<T> tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f451347 to
ecc6093
This change enables all clang-tidy checks that are already passing. It also modifies the clang-tidy CI job, so it runs against all files if .clang-tidy changed.
This change adjusts the CI tests to make it easier to spot errors, without needing to sift through the thousands of lines of output.
Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com>
ecc6093 to
7f99b08
After a coroutine completes, the frame remains alive holding a captured
shared_ptr<CoroTaskRunner> back to its owner. This creates an unreachable
cycle: runner -> task_ -> frame -> shared_ptr<runner>.
Break the cycle in resume() by destroying the coroutine frame (task_ = {})
and the stored callable when the coroutine is done. Also fix runnable() to
handle the null-handle state after cleanup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7f99b08 to
f7fd476
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@                     Coverage Diff                      @@
##   pratik/Swtich-to-std-coroutines    #6423     +/-    ##
==========================================================
  Coverage                     79.8%    79.8%
==========================================================
  Files                          848      851       +3
  Lines                        67763    67921     +158
  Branches                      7578     7555      -23
==========================================================
+ Hits                         54077    54218     +141
- Misses                       13686    13703      +17
Replace `task_ = {}` with `std::move(task_)` in resume() and
expectEarlyExit(). The move assignment operator calls
handle_.destroy() while task_.handle_ still holds the old (now
dangling) handle value. If frame destruction triggers re-entrant
runner cleanup on GCC-12, the destructor sees a non-null handle_
and destroys the same frame again — a double-free.
std::move(task_) immediately nulls task_.handle_ via the move
constructor, then the frame is destroyed when the local goes out
of scope. This eliminates the re-entrancy window.
Also remove storedFunc_.reset() from resume() — the callable does
not participate in the shared_ptr cycle and will be cleaned up by
the runner's destructor.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f7fd476 to
75b8ba0
Change postCoroTask from async post() to synchronous resume() for the initial coroutine dispatch. The async approach created a timing-dependent race during Env destruction where the coroutine frame's shared_ptr reference cycle could be broken in an indeterminate order, causing the debug-only finished_ assertion to fire non-deterministically on GCC-12 and GCC-15 debug builds. The synchronous resume runs the coroutine body to its first suspension point (co_await) or completion (co_return) on the caller's thread, ensuring the coroutine state is determinate before postCoroTask returns. Subsequent resumes still happen on worker threads via post(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
75b8ba0 to
d9d72d5
d9d72d5 to
7c510de
Revert the initial coroutine dispatch from synchronous resume() to async post(). The synchronous approach ran the coroutine body on the caller's thread, swapping in the coroutine's LocalValues. When coroutines mutated LocalValues (e.g. thread_specific_storage), those mutations bled back into the caller's thread-local state after the swap-out, corrupting unrelated tests (Book, Subscribe) sharing the same thread pool. Async post() dispatches the coroutine to a JobQueue worker thread whose LocalValues are managed by the thread pool, not by the caller. The original assertion issue that motivated sync resume was a symptom of the shared_ptr cycle and double-free bugs that have since been fixed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7c510de to
677c982
Add join() before the finished_ assertion in the destructor. With async dispatch, the coroutine runs on a worker thread. A gate signal inside the coroutine body can wake the test thread before resume() sets finished_. The test thread then triggers Env destruction, and the runner's destructor fires while finished_ is still false. join() establishes a happens-before edge via mutex_run_: finished_ = true → unlock(mutex_run_) in resume() → lock(mutex_run_) in join() → read finished_ This guarantees finished_ is visible when the assertion checks it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
677c982 to
28fe438
28fe438 to
bdd3659
When post() is called from within the coroutine body (via JobQueueAwaiter), two resume operations can overlap: R1 is still running while R2 is queued. With a boolean running_ flag, R1's cleanup (running_=false) clobbers R2's pending state, causing join() to return prematurely on ARM64. Replace bool running_ with int runCount_: post() increments before enqueue, resume() decrements after completion. join() waits for runCount_==0. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
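The counting scheme can be sketched in isolation. `RunTracker` and its member names are hypothetical, and the sketch leaves out the job queue entirely; it only shows why a counter keeps a queued resume visible after an overlapping one finishes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical tracker; mirrors the runCount_ idea, not the PR's code.
class RunTracker
{
public:
    // Called by post() before the job is enqueued.
    void onPost()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        ++runCount_;
    }

    // Called by resume() after the coroutine suspends or completes.
    void onResumeDone()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --runCount_;
        }
        cv_.notify_all();
    }

    // Blocks until every posted resume has finished.
    void join()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return runCount_ == 0; });
    }

    bool idle()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return runCount_ == 0;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int runCount_ = 0;
};

// Replays the overlap described above on a single thread.
bool secondResumeStaysPending()
{
    RunTracker tracker;
    tracker.onPost();        // R1 queued
    tracker.onPost();        // R1's body re-posts: R2 queued while R1 runs
    tracker.onResumeDone();  // R1 finishes
    // A single bool cleared here would already report "idle".
    bool const pendingAfterR1 = !tracker.idle();
    tracker.onResumeDone();  // R2 finishes
    tracker.join();          // returns immediately: nothing is pending
    return pendingAfterR1 && tracker.idle();
}
```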
Replace all postCoro() call sites with postCoroTask() using C++20 coroutine lambdas. The key changes are:

- Remove Context::coro field (shared_ptr<JobQueue::Coro>) from RPC::Context, eliminating it from all aggregate initializations
- Replace RipplePathFind's yield/post/resume pattern with a local std::condition_variable that blocks until path-finding completes, avoiding colored-function infection across the RPC call chain
- Switch ServerHandler entry points (onRequest, onWSMessage) from postCoro to postCoroTask with co_return lambdas
- Switch GRPCServer::CallData::process() to use postCoroTask, rename private handler to processRequest()
- Update Path_test and AMMTest to use postCoroTask (they set context.coro which no longer exists)

The old postCoro() API remains available for Coroutine_test and JobQueue_test, which will be migrated in a subsequent commit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
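As a rough illustration of the blocking pattern (not the actual RipplePathFind code), a handler can wait on a local `std::condition_variable` until a worker signals completion. Everything here is an assumption made for the sketch: the function name, the worker thread standing in for the path-finding job, the result value, and the use of `wait_for` with an explicit timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical handler: starts work elsewhere and blocks until it finishes.
// Returns the computed value, or -1 if the wait timed out.
int blockingPathFind(std::chrono::milliseconds timeout)
{
    std::mutex mutex;
    std::condition_variable cv;
    bool done = false;
    int result = -1;

    // Stand-in for the asynchronous path-finding job.
    std::thread worker([&] {
        int const computed = 7;  // placeholder for real work
        {
            std::lock_guard<std::mutex> lock(mutex);
            result = computed;
            done = true;
        }
        cv.notify_one();
    });

    bool finished = false;
    {
        std::unique_lock<std::mutex> lock(mutex);
        // The predicate guards against spurious wakeups and against a
        // notification that arrives before the wait begins.
        finished = cv.wait_for(lock, timeout, [&] { return done; });
    }

    worker.join();  // the locals above must outlive the worker
    return finished ? result : -1;
}
```

The calling thread stays blocked for the duration, which is the trade-off accepted here in exchange for not turning every function in the call chain into a coroutine.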
Remove the extra {} that was for the now-deleted Context::coro field
in the RPC::JsonContext construction in Application::startGeometry().
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
clang-format and prettier auto-formatting adjustments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add cppcoro, fcontext, gantt, pratik, repost, stackful to cspell.config.yaml to fix cspell check failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bdd3659 to
666cafc
High Level Overview of Change

Migrate all production `postCoro()` call sites to `postCoroTask()` using C++20 coroutine lambdas, as part of the Boost.Coroutine → std coroutines migration. This eliminates the `Context::coro` field and replaces the RipplePathFind yield/post/resume pattern with a simpler `std::condition_variable` approach.

Context of Change

This is Milestone 2 of the Boost.Coroutine → C++20 coroutines migration. Milestone 1 (`pratik/std-coro/add-coroutine-primitives`) introduced the new coroutine primitives (`CoroTask<T>`, `CoroTaskRunner`, `JobQueueAwaiter`, `postCoroTask()`). This milestone migrates all production code to use them.

- Milestone 1 (`pratik/std-coro/add-coroutine-primitives`): Added `CoroTask<T>`, `CoroTaskRunner`, `JobQueueAwaiter`, `postCoroTask()`
- Milestone 2 (`pratik/std-coro/migrate-entry-points`): Migrated all production entry points
- Milestone 3 (`pratik/std-coro/migrate-test-code`): Migrated all test code

Key design decision: Rather than "coloring" the entire `processSession → processRequest → doCommand → callMethod → doRipplePathFind` call chain as coroutines (which C++20 stackless coroutines would require for `co_await` to work at any depth), we replace `RipplePathFind`'s `yield()`/`post()`/`resume()` pattern with a local `std::condition_variable` that blocks the thread until path-finding completes. This is correct, simple, and bounded by existing path-finding timeouts.

Since `context.coro` has no remaining readers after this change, the field is removed entirely from `RPC::Context`.

Type of Change

API Impact

- `libxrpl` change (any change that may affect `libxrpl` or dependents of `libxrpl`)

Before / After

Entry points (`ServerHandler::onRequest`, `onWSMessage`, `GRPCServer::CallData::process`):

RipplePathFind handler:

Future Tasks

- Migrate test code (`Coroutine_test.cpp`, `JobQueue_test.cpp`) to `postCoroTask`
- … `Coro.ipp`, remove old `Coro` class and `postCoro()` from `JobQueue.h`, remove Boost.Coroutine from CMake/Conan dependencies