-
Notifications
You must be signed in to change notification settings - Fork 450
Open
Description
问题描述
在 Gaea 中,跨分片聚合在处理 HAVING 过滤条件时不正确。当执行带有 HAVING 子句的聚合操作时,过滤条件没有在所有分片上正确应用,导致结果不准确。
重现步骤
为验证Gaea是否在全局层重新应用了 HAVING 条件,设计一种场景,使分片执行的结果与全局层的结果不一致。
- namespace配置
{
"open_general_log": true,
"is_encrypt": true,
"name": "test_namespace",
"online": true,
"read_only": false,
"allowed_dbs": {
"test": true
},
"default_phy_dbs": null,
"slow_sql_time": "1000",
"black_sql": [],
"allowed_ip": null,
"slices": [
{
"name": "slice-0",
"user_name": "superroot",
"password": "superroot",
"master": "127.0.0.1:3319",
"slaves": [],
"statistic_slaves": [],
"capacity": 1,
"max_capacity": 1,
"idle_timeout": 3600,
"capability": 41479,
"init_connect": "",
"health_check_sql": ""
},
{
"name": "slice-1",
"user_name": "superroot",
"password": "superroot",
"master": "127.0.0.1:3349",
"slaves": [],
"statistic_slaves": [],
"capacity": 1,
"max_capacity": 1,
"idle_timeout": 3600,
"capability": 41479,
"init_connect": "",
"health_check_sql": ""
}
],
"shard_rules": [
{
"db": "test",
"table": "example_table",
"type": "mycat_long",
"key": "id",
"locations": [
2,
2
],
"slices": [
"slice-0",
"slice-1"
],
"databases": [
"test_[0-3]"
],
"partition_count": "4",
"partition_length": "256"
}
],
"users": [
{
"user_name": "superroot",
"password": "superroot",
"namespace": "test_namespace",
"rw_flag": 2,
"rw_split": 1,
"other_property": 0
}
],
"default_slice": "slice-0",
"global_sequences": null,
"default_charset": "",
"default_collation": "",
"max_sql_execute_time": 0,
"max_sql_result_size": 0,
"max_client_connections": 100000,
"down_after_no_alive": 32,
"seconds_behind_master": 32,
"check_select_lock": false,
"support_multi_query": false,
"local_slave_read_priority": 0,
"set_for_keep_session": false,
"client_qps_limit": 0,
"allowed_session_variables": {}
}- 表结构
CREATE TABLE example_table (
id INT NOT NULL,
a VARCHAR(50) NOT NULL,
b INT NOT NULL,
PRIMARY KEY (id)
);-
插入若干数据。
-
执行 SQL
SELECT a AS c, SUM(LENGTH(a)) AS total_len FROM example_table HAVING total_len > 6 -
观察到 HAVING 过滤条件没有在跨分片后正确应用,结果不准确。
预期结果
mysql> SELECT a AS c, SUM(LENGTH(a)) AS total_len FROM example_table HAVING total_len > 6;
+---+-----------+
| c | total_len |
+---+-----------+
| a | 24 |
+---+-----------+
1 row in set (0.01 sec)
实际结果
SELECT a AS c, SUM(LENGTH(a)) AS total_len FROM example_table HAVING total_len > 6;
+----+-----------+
| c | total_len |
+----+-----------+
| ab | 8 |
+----+-----------+
1 row in set (0.18 sec)问题根源
在分库分表(Sharding)场景下,Gaea直接把带有 HAVING 的语句下推到每个分片执行,导致结果与预期不符。其根本原因是:HAVING 涉及的聚合操作(SUM、COUNT、AVG 等)无法简单地在各分片“独立计算”后再进行合并,否则就会出现“部分求和”的问题。
各分片独立执行后,只保留 total_len > 6 的结果,再合并到路由层,就会导致部分分片的聚合结果没有被正确“累加”。举个简单例子,如果某个分片里 SUM(LENGTH(a)) 的值只有 3,没有满足 6 就被过滤掉了;但其实如果所有分片加在一起,SUM(LENGTH(a)) 可能会大于 6,原本应该保留这一行。
版本信息
所有版本
问题标签: bug, enhancement, sharding, feature-request
Metadata
Metadata
Assignees
Labels
No labels