Fix #2233: Mirror sessions destination_hostgroup overwritten by fast routing#5385
Fix #2233: Mirror sessions destination_hostgroup overwritten by fast routing#5385renecannao merged 7 commits intov3.0from
Conversation
…routing Bug fix: - Mirror sessions had their destination_hostgroup incorrectly overwritten by fast routing rules, causing duplicate query execution to the same hostgroup - Added check in Query_Processor.cpp to skip fast routing for mirror sessions Changes: - lib/Query_Processor.cpp: Wrapped fast routing logic with mirror session check - test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp: New regression test - test/tap/groups/groups.json: Added test to default-g1 test group - test/tap/tests/Makefile: Added build rule for the new test (commented out) The fix ensures mirror sessions maintain their assigned destination_hostgroup and are not affected by fast routing rules.
The commented-out Makefile rule is no longer needed.
- Default to SQLite3 Server (port 6030) for local testing without MySQL - Add MYSQL_PORT env var to override backend port for CI - Use 'main' schema for SQLite3, 'information_schema' for MySQL - Fix query rule: use match_pattern instead of match_digest (comments are stripped from digest text) - Fix stats query: filter by username instead of digest pattern - Add comprehensive documentation with usage examples - Add verbose diagnostics throughout the test
- Remove MYSQL_PORT environment variable option
- Hardcode SQLite3 Server port (6030) and schema ('main')
- Simplify build_setup_queries() function
- Update documentation to reflect single backend mode
Summary of ChangesHello @jesmarcannao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves an issue where ProxySQL's fast routing mechanism inadvertently redirected mirror sessions to an incorrect hostgroup, leading to unintended duplicate query executions. The implemented solution introduces a conditional check to bypass fast routing for mirror sessions, ensuring they consistently target their configured Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
📝 WalkthroughWalkthroughAdds a guard in Query_Processor to skip fast-routing destination resolution for mirror sessions and adds a TAP regression test plus test-group entry validating mirror + fast-routing behavior. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
The pull request addresses issue #2233 by preventing fast routing rules from overwriting the destination_hostgroup for mirror sessions. This is achieved by adding a conditional check in Query_Processor.cpp to skip fast routing if the session is a mirror session. A new regression test reg_test_2233_mirror_fast_routing-t.cpp has been added to verify this fix, ensuring that original queries follow fast routing while mirror queries are routed to their intended mirror_hostgroup. The test has also been integrated into the default-g1 test group in groups.json.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp (2)
254-256: Theusernamevalue is interpolated directly into the SQL query.While this is test code and the username comes from environment/config, building queries with string concatenation is fragile if the username ever contains quotes. This is a minor concern given the test context.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 254 - 256, The SQL is built by concatenating cl.username into check_query which will break if the username contains quotes; instead escape single quotes in cl.username (e.g., replace ' with '') before interpolation or use a parameterized query API if available, so update the code that constructs check_query to use an escaped_username variable (or bind parameter) when forming the WHERE username=... clause referencing check_query and cl.username.
229-231: Consider a polling loop instead of a fixedsleep(1)for stats availability.Mirror queries execute asynchronously. Under CI load, 1 second might occasionally be insufficient. A retry loop polling
stats_mysql_query_digestuntil 2 hostgroups appear (with a timeout) would make this more robust. That said, this pattern is common in the existing test suite, so it's acceptable for now.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 229 - 231, The test currently uses a fixed sleep(1) which can be flaky; replace it with a small polling loop that repeatedly calls stats_mysql_query_digest (or the existing helper that reads stats) and checks that two hostgroups are present, sleeping a short interval between attempts and failing after a reasonable timeout (e.g., 5–10s); update the diag message to indicate polling and timeout and ensure the loop uses the same condition the test expects (presence of 2 hostgroups) before continuing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp`:
- Around line 63-75: The fast routing rule matches schemaname='main' but the
test client connects without a DB so the user's default_schema may not be
'main'; update build_setup_queries to ensure the ProxySQL user has
default_schema='main' by adding an appropriate SQL (e.g. UPDATE or
INSERT/REPLACE into mysql_users to set default_schema='main') and then LOAD
MYSQL USERS TO RUNTIME, or alternatively modify the test client connection to
explicitly connect to database 'main' so the fast routing rule will match.
---
Nitpick comments:
In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp`:
- Around line 254-256: The SQL is built by concatenating cl.username into
check_query which will break if the username contains quotes; instead escape
single quotes in cl.username (e.g., replace ' with '') before interpolation or
use a parameterized query API if available, so update the code that constructs
check_query to use an escaped_username variable (or bind parameter) when forming
the WHERE username=... clause referencing check_query and cl.username.
- Around line 229-231: The test currently uses a fixed sleep(1) which can be
flaky; replace it with a small polling loop that repeatedly calls
stats_mysql_query_digest (or the existing helper that reads stats) and checks
that two hostgroups are present, sleeping a short interval between attempts and
failing after a reasonable timeout (e.g., 5–10s); update the diag message to
indicate polling and timeout and ensure the loop uses the same condition the
test expects (presence of 2 hostgroups) before continuing.
|
add to whitelist |
|
@mirostauder : CI is not being triggered on this PR |
|
retest this please |
- Add cleanup of all mysql_* configuration tables before test runs - Create test user automatically to ensure test is self-contained - Separate cleanup, user creation, and setup into distinct phases
Ensures the fast routing rule matches correctly by setting the user's default_schema to 'main' during user creation.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp (2)
318-339:found_countsis populated but never used — missed assertion opportunity.
found_counts[2](line 319) stores the per-hostgroup query counts but is never referenced after population. Since the test issues exactly one query and one mirror, bothcount_starvalues should equal 1. Adding assertions on these counts would strengthen the test and catch future regressions where queries are inadvertently duplicated.♻️ Proposed enhancement
ok(has_fast_routing_hg, "Original query should be routed to fast routing hostgroup %d", FAST_ROUTING_HOSTGROUP); ok(has_mirror_hg, "Mirror query should be routed to mirror hostgroup %d (NOT overwritten by fast routing)", MIRROR_HOSTGROUP); + + // Verify each hostgroup received exactly 1 execution + bool counts_correct = (found_counts[0] == 1 && found_counts[1] == 1); + diag(" count_star for hg %d: %ld, for hg %d: %ld", + found_hostgroups[0], found_counts[0], found_hostgroups[1], found_counts[1]); + // Note: plan() would need to be updated to plan(4) if this ok() is addedNote: Adding this
ok()requires updatingplan(3)toplan(4).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 318 - 339, The test populates found_counts[] but never asserts on them; add assertions that each hostgroup's count_star equals 1 to ensure no duplicate routing: after populating found_counts in the loop, add ok(found_counts[<index>] == 1, "Hostgroup %d has count_star == 1", <corresponding hostgroup>) for both FAST_ROUTING_HOSTGROUP and MIRROR_HOSTGROUP (use the same mapping as found_hostgroups[]), and update plan(3) to plan(4) to account for the extra check.
158-212: UseBAIL_OUTfor fatal setup failures afterplan(3)is declared.After
plan(3)is declared (line 130), returningexit_status()with zerook()calls produces a TAP plan mismatch (expected 3 tests, got 0). This can cause CI to report an ambiguous or misleading failure. The idiomatic TAP approach for unrecoverable setup errors isBAIL_OUT("reason"), which immediately aborts the test with a clear signal to the harness.♻️ Proposed fix for early-return paths
- if (run_queries(proxysql_admin, cleanup_queries)) { - return exit_status(); - } + if (run_queries(proxysql_admin, cleanup_queries)) { + BAIL_OUT("Cleanup queries failed"); + }Apply the same pattern at lines 167–169, 175–177, 199–200, and 209–212.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 158 - 212, The test currently returns exit_status() on fatal setup failures (checks around run_queries(cleanup_queries), run_queries(user_queries), run_queries(setup_queries) and the mysql_query error branches for mirror_rule and fast_routing_rule), which after plan(3) yields a TAP plan mismatch; replace those early returns with BAIL_OUT("...") calls that include a brief reason and any error details (e.g., mysql_error(proxysql_admin) for mysql_query failures) so the test harness aborts cleanly—update the failure handling in the run_queries checks (where run_queries(proxysql_admin, cleanup_queries/user_queries/setup_queries) is used) and the two mysql_query error branches that create mirror_rule and fast_routing_rule to call BAIL_OUT with contextual messages instead of returning exit_status().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp`:
- Around line 219-226: The diagnostic SELECT is querying the wrong column: in
the MYSQL_QUERY call against mysql_query_rules (using proxysql_admin) replace
the selected column match_digest with match_pattern so the verification prints
the actual pattern inserted earlier; update the diagnostic formatting to
reference the same column name (match_pattern) when logging server_row[...] so
the output reflects the inserted rule's match_pattern.
---
Nitpick comments:
In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp`:
- Around line 318-339: The test populates found_counts[] but never asserts on
them; add assertions that each hostgroup's count_star equals 1 to ensure no
duplicate routing: after populating found_counts in the loop, add
ok(found_counts[<index>] == 1, "Hostgroup %d has count_star == 1",
<corresponding hostgroup>) for both FAST_ROUTING_HOSTGROUP and MIRROR_HOSTGROUP
(use the same mapping as found_hostgroups[]), and update plan(3) to plan(4) to
account for the extra check.
- Around line 158-212: The test currently returns exit_status() on fatal setup
failures (checks around run_queries(cleanup_queries), run_queries(user_queries),
run_queries(setup_queries) and the mysql_query error branches for mirror_rule
and fast_routing_rule), which after plan(3) yields a TAP plan mismatch; replace
those early returns with BAIL_OUT("...") calls that include a brief reason and
any error details (e.g., mysql_error(proxysql_admin) for mysql_query failures)
so the test harness aborts cleanly—update the failure handling in the
run_queries checks (where run_queries(proxysql_admin,
cleanup_queries/user_queries/setup_queries) is used) and the two mysql_query
error branches that create mirror_rule and fast_routing_rule to call BAIL_OUT
with contextual messages instead of returning exit_status().
The verification query was selecting match_digest instead of match_pattern, causing the diagnostic output to show NULL instead of the actual pattern.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp (2)
252-257: Assert the schema instead of just logging it.If the session schema isn't
'main'(e.g., due to a configuration edge case), the fast routing rule won't match and the test will fail with a misleading "all queries went to single hostgroup" diagnostic. An early assertion here would make failures self-explanatory.Proposed fix
// Check current schema MYSQL_QUERY(proxysql, "SELECT DATABASE()"); MYSQL_RES* schema_res = mysql_store_result(proxysql); server_row = mysql_fetch_row(schema_res); - diag("Current schema/database: %s", server_row[0] ? server_row[0] : "NULL"); + const char* current_schema = server_row[0] ? server_row[0] : "NULL"; + diag("Current schema/database: %s", current_schema); + if (!server_row[0] || strcmp(server_row[0], "main") != 0) { + diag("ERROR: Expected schema 'main' but got '%s'. Fast routing rule requires schemaname='main'.", current_schema); + } mysql_free_result(schema_res);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 252 - 257, The test currently only logs the current schema fetched via MYSQL_QUERY(proxysql, "SELECT DATABASE()") and mysql_fetch_row(schema_res) into diag; change this to assert the session schema equals "main" so failures are explicit: after retrieving server_row from mysql_store_result(proxysql), check server_row[0] and invoke the test assertion/fail helper (instead of just diag) when server_row[0] is NULL or not equal to "main" so the test aborts with a clear message that the session schema is wrong and fast routing will not match.
279-281: Hardcodedsleep(1)may be flaky in slow CI environments.Mirror queries are asynchronous. A fixed 1-second wait is a common pattern in ProxySQL tests, but if the CI environment is under heavy load, the mirror query stats may not have landed yet. A retry loop polling
stats_mysql_query_digestfor the expected row count would be more robust.Proposed fix — poll with bounded retries
- // Wait a bit for stats to be updated - diag("Waiting 1 second for stats to be updated..."); - sleep(1); + // Wait for mirror query stats to appear (async mirror may take time) + diag("Waiting for stats to reflect mirror query..."); + int retries = 10; + int observed_hostgroups = 0; + while (retries-- > 0) { + sleep(1); + std::string poll_q = "SELECT COUNT(DISTINCT hostgroup) FROM stats_mysql_query_digest " + "WHERE username='" + std::string(cl.username) + "'"; + if (mysql_query(proxysql_admin, poll_q.c_str()) == 0) { + MYSQL_RES* poll_res = mysql_store_result(proxysql_admin); + if (poll_res) { + MYSQL_ROW poll_row = mysql_fetch_row(poll_res); + if (poll_row && poll_row[0]) { + observed_hostgroups = atoi(poll_row[0]); + } + mysql_free_result(poll_res); + } + } + if (observed_hostgroups >= 2) break; + diag(" Retrying... (%d retries left, observed %d hostgroups)", retries, observed_hostgroups); + } + diag("Observed %d distinct hostgroups after polling", observed_hostgroups);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp` around lines 279 - 281, Replace the hardcoded sleep(1) with a bounded retry/polling loop that queries stats_mysql_query_digest until the expected mirror-query row count appears (or a timeout is reached); remove the sleep(1) call, run the same query used elsewhere in the test to fetch stats_mysql_query_digest in each iteration, check the returned row count against the expected value, break early when matched, and emit diag() and a failing assertion if the timeout elapses to keep the test robust in slow CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/tap/tests/reg_test_2233_mirror_fast_routing-t.cpp`:
- Around line 252-257: The test currently only logs the current schema fetched
via MYSQL_QUERY(proxysql, "SELECT DATABASE()") and mysql_fetch_row(schema_res)
into diag; change this to assert the session schema equals "main" so failures
are explicit: after retrieving server_row from mysql_store_result(proxysql),
check server_row[0] and invoke the test assertion/fail helper (instead of just
diag) when server_row[0] is NULL or not equal to "main" so the test aborts with
a clear message that the session schema is wrong and fast routing will not
match.
- Around line 279-281: Replace the hardcoded sleep(1) with a bounded
retry/polling loop that queries stats_mysql_query_digest until the expected
mirror-query row count appears (or a timeout is reached); remove the sleep(1)
call, run the same query used elsewhere in the test to fetch
stats_mysql_query_digest in each iteration, check the returned row count against
the expected value, break early when matched, and emit diag() and a failing
assertion if the timeout elapses to keep the test robust in slow CI.
|



Fixes #2233
Problem
Mirror sessions had their
destination_hostgroupincorrectly overwritten by fast routing rules, causing duplicate query execution to the same hostgroup instead of routing the mirror query to its intendedmirror_hostgroup.Solution
Added a check in
Query_Processor.cppto skip fast routing for mirror sessions, ensuring they maintain their assigneddestination_hostgroup.Changes
default-g1test groupTesting
The test can be run locally without any MySQL setup:
The test verifies:
Summary by CodeRabbit
Bug Fixes
Tests