-
Notifications
You must be signed in to change notification settings - Fork 151
refunder: expand E2E tests #4029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @tilacog, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testability and coverage of the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request significantly improves the testability and robustness of the refunder crate by introducing trait-based abstractions for database and blockchain interactions. This refactoring enables comprehensive unit testing with mocks, which have been added and are very thorough. Additionally, the E2E test suite has been greatly expanded, covering numerous refund scenarios and edge cases, which increases confidence in the refunder's behavior.
My review focuses on the new abstractions and the refactored logic. The changes are excellent, but I have identified a couple of areas for improvement. One is a potential inconsistency in how refundable orders are processed, which could lead to misleading logs or wasteful transactions. The other is an opportunity to address a TODO left in the code for a performance enhancement. Overall, this is a high-quality contribution that substantially improves the codebase.
| .or_default() | ||
| .push(uid); | ||
| } | ||
| let uid_with_latest_refundablility = futures::future::join_all(futures).await; // TODO: is it worth to switch to FuturesUnordered here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you've left a TODO here about switching to FuturesUnordered. This is a good idea for performance, as it would allow processing results as they complete rather than waiting for all futures to finish.
You're already using buffer_unordered from futures::StreamExt in send_out_refunding_tx, which provides similar benefits. You could use it here as well for consistency and improved performance. A concurrency limit (e.g., 10) is a reasonable starting point to avoid overwhelming the RPC endpoint.
| let uid_with_latest_refundablility = futures::future::join_all(futures).await; // TODO: is it worth to switch to FuturesUnordered here? | |
| let uid_with_latest_refundablility: Vec<_> = stream::iter(futures).buffer_unordered(10).collect().await; |
| let futures = uids.iter().map(|uid| { | ||
| let (uid, self_) = (*uid, &self); | ||
| let (uid, database) = (*uid, &self.database); | ||
| async move { | ||
| self_ | ||
| .get_ethflow_data_from_db(&uid) | ||
| database | ||
| .get_ethflow_order_data(&uid) | ||
| .await | ||
| .context(format!("uid {uid:?}")) | ||
| .inspect_err(|err| { | ||
| tracing::error!(?err, ?uid, "failed to get data from db") | ||
| }) | ||
| } | ||
| }); | ||
| let encoded_ethflow_orders: Vec<_> = stream::iter(futures) | ||
| .buffer_unordered(10) | ||
| .filter_map(|result| async { | ||
| result | ||
| .inspect_err(|err| tracing::error!(?err, "failed to get data from db")) | ||
| .ok() | ||
| }) | ||
| .filter_map(|result| async { result.ok() }) | ||
| .collect() | ||
| .await; | ||
| self.submitter | ||
| .submit(uids, encoded_ethflow_orders, contract) | ||
| .submit_refund(&uids, encoded_ethflow_orders, contract) | ||
| .await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential issue where uids and encoded_ethflow_orders can become out of sync. If get_ethflow_order_data fails for some UIDs, encoded_ethflow_orders will contain fewer items than the uids slice passed to submit_refund. This could lead to misleading logs and potentially sending wasteful empty transactions if all database lookups fail.
To make this more robust, you could refactor this section to ensure the lists are consistent. The following suggestion filters out orders that fail the database lookup and only attempts to submit a refund if there are successful orders to process.
let futures = uids.iter().map(|uid| async move {
match self.database.get_ethflow_order_data(uid).await {
Ok(data) => Some((*uid, data)),
Err(err) => {
tracing::error!(?err, ?uid, "failed to get data from db");
None
}
}
});
let successful_orders: Vec<_> = stream::iter(futures)
.buffer_unordered(10)
.filter_map(|result| async { result })
.collect()
.await;
if !successful_orders.is_empty() {
let (successful_uids, encoded_ethflow_orders): (Vec<_>, Vec<_>) =
successful_orders.into_iter().unzip();
self.submitter
.submit_refund(&successful_uids, encoded_ethflow_orders, contract)
.await?;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the same problem that’s been reported here:
services/crates/refunder/src/refund_service.rs
Lines 523 to 539 in f79f7d1
/// DB errors for individual orders are skipped; other orders proceed. /// /// # Current Behavior (documented, not necessarily ideal) /// /// When a DB lookup fails for an order: /// - The error is logged and the order data is excluded from the submission /// - However, the UID is still included in the submission /// /// This means `submit_refund` receives: /// - `uids`: ALL original UIDs (including those with failed lookups) /// - `orders`: Only the order data for successful lookups /// /// This creates a mismatch between UIDs and order data. See the TODO in /// `test_send_out_refunding_tx_all_db_calls_fail_still_submits` for /// discussion of potential fixes. #[tokio::test] async fn test_send_out_refunding_tx_db_error_skips_order() { services/crates/refunder/src/refund_service.rs
Lines 585 to 608 in f79f7d1
/// If every DB lookup fails, we still call the submitter with the original /// UIDs but without any order data. /// /// What actually happens: /// - Each failed order‑data fetch is logged and ignored (it doesn’t stop /// the whole batch). /// - The submitter gets the same list of UIDs we started with, but the /// `orders` slice may be empty (or contain fewer entries) because some or /// all lookups failed. /// /// TODO: Is this the behavior we really want? Submitting a refund that /// contains UIDs but no order details feels off. Possible fixes: /// 1. Skip the submission entirely when `encoded_ethflow_orders` is empty. /// 2. Return an error if *all* order‑data lookups fail. /// 3. Filter the UID list so it only includes IDs with successful lookups. /// /// NOTE: This test complements /// `test_send_out_refunding_tx_db_error_skips_order`. That test covers /// partial DB failure (some lookups succeed); this one covers /// total DB failure (all lookups fail). Together they verify that DB errors /// are non-fatal and UIDs are always preserved regardless of lookup /// success. #[tokio::test] async fn test_send_out_refunding_tx_all_db_calls_fail_still_submits() {
I wasn't aiming for any behavioral changes with this PR, but happy to review and discuss this during the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should fix in a separate PR
jmg-duarte
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO & ideally, this should be broken into two parts: the refactor and the tests
Would make review much easier.
Left some comments
| let futures = uids.iter().map(|uid| { | ||
| let (uid, self_) = (*uid, &self); | ||
| let (uid, database) = (*uid, &self.database); | ||
| async move { | ||
| self_ | ||
| .get_ethflow_data_from_db(&uid) | ||
| database | ||
| .get_ethflow_order_data(&uid) | ||
| .await | ||
| .context(format!("uid {uid:?}")) | ||
| .inspect_err(|err| { | ||
| tracing::error!(?err, ?uid, "failed to get data from db") | ||
| }) | ||
| } | ||
| }); | ||
| let encoded_ethflow_orders: Vec<_> = stream::iter(futures) | ||
| .buffer_unordered(10) | ||
| .filter_map(|result| async { | ||
| result | ||
| .inspect_err(|err| tracing::error!(?err, "failed to get data from db")) | ||
| .ok() | ||
| }) | ||
| .filter_map(|result| async { result.ok() }) | ||
| .collect() | ||
| .await; | ||
| self.submitter | ||
| .submit(uids, encoded_ethflow_orders, contract) | ||
| .submit_refund(&uids, encoded_ethflow_orders, contract) | ||
| .await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should fix in a separate PR
This commit reverts the trait-based refactoring in the refunder crate to facilitate code review by splitting the work across two PRs. The trait-based architecture will be re-introduced in a follow-up PR.
This reverts commit b6b7d8c. Also adds the "rand" feature to the shared crate, resolving the build error.
…sts--pr # Conflicts: # crates/shared/Cargo.toml # crates/shared/src/sources/balancer_v2/pool_fetching/registry.rs
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
For future reference, I've reverted the commit f098c48 to split this PR in two. It'll be reintroduced in a follow-up PR. |
| owner if owner == NO_OWNER => Self::Invalid, | ||
| owner if owner == INVALIDATED_OWNER => Self::Refunded, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should work
| owner if owner == NO_OWNER => Self::Invalid, | |
| owner if owner == INVALIDATED_OWNER => Self::Refunded, | |
| NO_OWNER => Self::Invalid, | |
| INVALIDATED_OWNER => Self::Refunded, |
| /// Alternatives to leaking inlcude: | ||
| /// - Transmuting, but requires unsafe and introduces some drop complexity | ||
| /// - Borrow services & onchain structs | ||
| /// - Use Rc/Arc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I prefer the Arc solution, I find that the leak may be a footgun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Why not simply use Arc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After giving it a bit more thought, it seems we can't use Arc/Rc here because RefunderTestSetup> struct can't hold both Services<'a>, which in turn borrows &'a Onchain.
However, if we were to modify the Services and Onchain structs themselves, I think we could use Arc/Rc, but that seems like a PR on its own.
In the meantime, I managed to get it working quite easily using ouroboros for the self-referencing part.
What do you guys think? Should I stick with ouroboros, or do we want to refactor Services/Onchain to use Arc/Rc instead?
squadgazzz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing PR 🚀
Just a few clarification questions before approving it.
| // Tests that orders with slippage below, at, or above the min_price_deviation | ||
| // threshold are refunded according to the SQL >= check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the result depend only on a SQL query? If so, can this be tested in a more lightweight setup? Maybe expanding the database tests would be enough here? And keep only 1-2 cases with a local node involved. My worry is that each test executes in 5-7s, which adds ~40s to the CI job execution. No strong opinion tho. Especially, if the alternative solution will require almost the same time to execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point.
Considering SQL alone, then I believe yes. I can see order expiration and slippage being tested at the postgres_refundable_ethflow_orders test
The current test operates one layer above and verifies the full integration, so I dare to say there's a bit more of value being added here.
Nonetheless, I'm still not sure what's the best way forward here.
Maybe we can remove the boundary test cases, since they are already covered in the database tests?
| .create_and_index_ethflow_order(slippage.order, validity.order) | ||
| .await; | ||
|
|
||
| advance_time_past_expiration(&web3, valid_to).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand what stops the order from being settled. Could you add a comment in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, settlement won't happen because of the time distance between the chain (2020) and autopilot (2026, current).
All tests' orders are expired at creation time by default because they are created based on chain time. The exception being the refunder_skips_settled_orders test, which manually creates valid orders.
I'll document this more thoroughly 👍
Resolves: #3785
Description
This PR improves test coverage for the refunder crate by expanding the end-to-end tests covering many refund scenarios.
Other Changes
RefundStatusfrom tests to the main crate logic.sharedcrate that was blocking this work.Run the refunder E2E tests: