-
Notifications
You must be signed in to change notification settings - Fork 1k
feat(stats): Purge query digest based on last seen #4920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v3.0
Are you sure you want to change the base?
Conversation
@renecannao PTAL |
Add to whitelist |
Can one of the admins verify this patch? |
std::tuple<bool, enum SERVER_TYPE, time_t>&& parse_command_purge_query_digests(char *query, int query_len) { | ||
auto ret = new std::tuple<bool, enum SERVER_TYPE, time_t>(false, SERVER_TYPE_MYSQL, 0); | ||
|
||
const char *prefix = match_command_prefix(CMD_PREFIX_PURGE_QUERY_DIGESTS, query, query_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, ret
is allocated via new
, but it's never deleted. Even though you std::move(*ret)
, the heap memory itself is leaked each time this function is called.
Suggestion: You don't need dynamic allocation here. Just declare std::tuple on the stack and return it by value.
std::tuple<bool, enum SERVER_TYPE, time_t> ret(false, SERVER_TYPE_MYSQL, 0);
return ret;
The compiler is capable of applying NRVO for optimization. (For Ref: RVO/NRVO)
Function declaration: std::tuple<bool, enum SERVER_TYPE, time_t> parse_command_purge_query_digests(char *query, int query_len);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Admin_Handler.cpp
Outdated
} | ||
|
||
// parse timestamp | ||
char *timestamp = strdup(query + strlen(prefix)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp value is not being validated. For instance:
PURGE TABLE stats.stats_mysql_query_digest TO 123abc;
The "abc" part is currently ignored by atoi
(123abc -> 123), but instead of ignoring it, we should return an error to the client indicating that an invalid timestamp value was provided. This will ensure that ProxySQL behaves as expected and avoids any potential unknown issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. Now, the system throws an error - Invalid timestamp
.
lib/Admin_Handler.cpp
Outdated
|
||
// parse timestamp | ||
char *timestamp = strdup(query + strlen(prefix)); | ||
int last_seen = atoi(trim_spaces_in_place(timestamp)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atoi
is used, risking overflow for large timestamps (e.g., post-2038).
Suggestion:
- Replace
atoi
withstrtoll
. - Change
last_seen
datatype totime_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Admin_Handler.cpp
Outdated
} | ||
|
||
// set last_seen | ||
std::get<2>(*ret) = realtime_to_monotonic_time(last_seen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Instead of relying on tuple indexing with std::get
, use individual named variables that represent the meaningful names for each of the tuple's elements. This can improve the code's clarity and make it easier to understand and maintain. At the end of the function, return the values as a tuple if needed.
bool match = false;
SERVER_TYPE server_type = SERVER_TYPE_MYSQL;
time_t time_stamp = 0;
...
...
return std::make_tuple(match, server_type, time_stamp);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Query_Processor.cpp
Outdated
delete qds; | ||
r++; | ||
struct purge_query_digests_args { | ||
umap_query_digest *digest_umap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a problem at the moment, but just a suggestion for future-proofing: it's a good practice to initialize pointers with nullptr
to avoid potential issues down the line.
umap_query_digest *digest_umap; | |
umap_query_digest *digest_umap = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/Query_Processor.cpp
Outdated
r++; | ||
struct purge_query_digests_args { | ||
umap_query_digest *digest_umap; | ||
umap_query_digest_text *digest_text_umap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a problem at the moment, but just a suggestion for future-proofing: it's a good practice to initialize pointers with nullptr
to avoid potential issues down the line.
umap_query_digest_text *digest_text_umap; | |
map_query_digest_text *digest_text_umap = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/Query_Processor.cpp
Outdated
std::function<bool(std::unordered_map<uint64_t, void *>::iterator it)> delete_entry = nullptr; | ||
|
||
// only used for synchronous multi-threaded purge operation | ||
int scan_idx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a problem at the moment, but just a suggestion for future-proofing: it's a good practice to initialize values to avoid potential issues down the line.
int scan_idx; | |
int scan_idx = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
// merge text entries in the aux map with digest_text_umap | ||
digest_text_umap.insert(digest_text_umap_aux.begin(), digest_text_umap_aux.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
digest_rwlock
will remain locked, potentially causing a deadlock, if an exception is thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rahim-kanji Are you referring to the possibility of std::bad_alloc
expection from umap::insert()
?
if ( pthread_create(&args[i].thr, NULL, &purge_query_digests_parallel, &args[i]) != 0 ) { | ||
|
||
for (int i = 0; i < n; i++) { | ||
if (pthread_create(&tid[i], NULL, &purge_query_digests_parallel, &args[i]) != 0 ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a potential race condition in purge_query_digest_entry
(parallel - multithreaded version). While digest_rwlock
is being used in main thread, digest_umap
and digest_text_umap
are being modified without any additional locks in multiple threads. This can lead to concurrent modifications, which could cause data inconsistencies or race conditions. It might be a good idea to add proper synchronization or locks around these variables to avoid the risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed, refer this comment.
lib/gen_utils.cpp
Outdated
time(&rt_now); | ||
|
||
return ((mt_now - rt_now + rt) * 1000000); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are issues in the design that needs to be addressed.
lib/Admin_Handler.cpp
Outdated
@@ -288,6 +299,15 @@ bool is_admin_command_or_alias(const std::vector<std::string>& cmds, char *query | |||
return false; | |||
} | |||
|
|||
const char * match_command_prefix(const std::vector<std::string>& cmd_prefix, char *query, int query_len) { | |||
for (auto &prefix : cmd_prefix) { | |||
if (query_len >= prefix.length() && !strncasecmp(prefix.c_str(), query, prefix.length())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will result in a warning, signed int
vs size_t
from prefix.length()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Query_Processor.cpp
Outdated
unsigned long long curtime1 = monotonic_time(); | ||
bool selective_purge = (last_seen > 0); | ||
|
||
purge_query_digests_args args = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While designated initializers
are great, and I hope we adopt them in the future, ProxySQL currently compiles with C++17
and this is a C++20
feature. It's likely that we will have issues with older compilers, we should stick to regular list initialization for now (until bumping the C++ version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Query_Processor.cpp
Outdated
args[i].gtu = &digest_text_umap; | ||
|
||
if (parallel && map_size >= DIGEST_STATS_FAST_MINSIZE) { // multi-threaded purge | ||
int n = DIGEST_STATS_FAST_THREADS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to use here variable length arrays, they are not standard C++ but compiler extensions (g++ and clang++). But since n
is directly from a constant, we can just remove this unnecessary intermediate variable to avoid trouble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Admin_Handler.cpp
Outdated
} | ||
|
||
// parse timestamp | ||
char *timestamp = strdup(query + strlen(prefix)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that the codebase has inconsistencies regarding the approach for smart pointers, but it's nice to make use of them in new code. Here is a nice candidate for using the abstraction mf_unique_ptr<char>
that can be seen in other parts of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
lib/Query_Processor.cpp
Outdated
auto text_it = digest_text_umap->find(it->first); | ||
if (text_it != digest_text_umap->end()) { | ||
free(text_it->second); | ||
digest_text_umap->erase(text_it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out by @rahim-kanji, in the original design the size of the container wasn't modified during the freeing operation of the elements. This isn't safe to be done in a multi-threaded operation, i.e., the multi-thread freeing can't modify the size of the container, otherwise, as the size of the container is concurrently modified, invalid operations will take place, this sequence of commands against the Admin interface allows to test the algorithm under ASAN an trigger the issue:
admin> PROXYSQLTEST 1 500;
Query OK, 500000 rows affected (2.28 sec)
admin> PROXYSQLTEST 5;
ERROR 2013 (HY000): Lost connection to MySQL server during query
The corruption is reported by ASAN:
==313907==ERROR: AddressSanitizer: attempting double-free on 0x50b008199bc0 in thread T37:
#0 0x75646eafc102 in free /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:52
#1 0x5661cb31a2a7 in QP_query_digest_stats::~QP_query_digest_stats() /home/javjarfer/Projects/proxysql_dev/lib/QP_query_digest_stats.cpp:85
#2 0x5661caf26539 in purge_query_digest_entry(purge_query_digests_args*) /home/javjarfer/Projects/proxysql_dev/lib/Query_Processor.cpp:655
#3 0x5661caf26731 in purge_query_digests_parallel(void*) /home/javjarfer/Projects/proxysql_dev/lib/Query_Processor.cpp:672
#4 0x75646ea5cec9 in asan_thread_start /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:234
#5 0x75646daa3709 (/usr/lib/libc.so.6+0x95709) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
#6 0x75646db27aab (/usr/lib/libc.so.6+0x119aab) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
0x50b008199bc0 is located 0 bytes inside of 99-byte region [0x50b008199bc0,0x50b008199c23)
freed by thread T38 here:
#0 0x75646eafc102 in free /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:52
#1 0x5661cb31a2a7 in QP_query_digest_stats::~QP_query_digest_stats() /home/javjarfer/Projects/proxysql_dev/lib/QP_query_digest_stats.cpp:85
#2 0x5661caf26539 in purge_query_digest_entry(purge_query_digests_args*) /home/javjarfer/Projects/proxysql_dev/lib/Query_Processor.cpp:655
#3 0x5661caf26731 in purge_query_digests_parallel(void*) /home/javjarfer/Projects/proxysql_dev/lib/Query_Processor.cpp:672
#4 0x75646ea5cec9 in asan_thread_start /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:234
#5 0x75646daa3709 (/usr/lib/libc.so.6+0x95709) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
previously allocated by thread T34 here:
#0 0x75646eafd721 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x75646ea771af in strndup /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:416
#2 0x5661cb319be6 in QP_query_digest_stats::QP_query_digest_stats(char const*, char const*, unsigned long, char const*, int, char const*, int) /home/javjarfer/Projects/proxysql_dev/lib/QP_query_digest_stats.cpp:34
#3 0x5661caf2a18b in Query_Processor<MySQL_Query_Processor>::update_query_digest(unsigned long, unsigned long, char*, int, MySQL_Connection_userinfo*, unsigned long long, unsigned long long, char const*, unsigned long long, unsigned long long) /home/javjarfer/Projects/proxysql_dev/lib/Query_Processor.cpp:1986
#4 0x5661cb74dcd5 in ProxySQL_Test___GenerateRandomQueryInDigestTable(int) /home/javjarfer/Projects/proxysql_dev/lib/ProxySQL_Admin_Tests.cpp:139
#5 0x5661cb34bbeb in void ProxySQL_Admin::ProxySQL_Test_Handler<MySQL_Session>(ProxySQL_Admin*, MySQL_Session*, char*, bool&) /home/javjarfer/Projects/proxysql_dev/lib/ProxySQL_Admin_Tests2.cpp:573
#6 0x5661cb3d3496 in void admin_session_handler<MySQL_Session>(MySQL_Session*, void*, _PtrSize_t*) /home/javjarfer/Projects/proxysql_dev/lib/Admin_Handler.cpp:2726
#7 0x5661caea8b43 in MySQL_Session::handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___MYSQL_COM_QUERY___not_mysql(_PtrSize_t&) /home/javjarfer/Projects/proxysql_dev/lib/MySQL_Session.cpp:3452
#8 0x5661caeb04b3 in MySQL_Session::get_pkts_from_client(bool&, _PtrSize_t&) /home/javjarfer/Projects/proxysql_dev/lib/MySQL_Session.cpp:4208
#9 0x5661caeb7d19 in MySQL_Session::handler() /home/javjarfer/Projects/proxysql_dev/lib/MySQL_Session.cpp:4821
#10 0x5661caf89d55 in child_mysql(void*) /home/javjarfer/Projects/proxysql_dev/lib/ProxySQL_Admin.cpp:2045
#11 0x75646ea5cec9 in asan_thread_start /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:234
#12 0x75646daa3709 (/usr/lib/libc.so.6+0x95709) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
The approach needs to be modified not to concurrently modify the size of the container on multi-threaded purging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer 4fa9a47. I have restricted the multi-threaded approach only to TRUNCATE
command (when last_seen == 0
).
- In case of
TRUNCATE
, since this is a case of delete all data, we can do freeing operations in worker threads and restrict erasing items in the container only to parent thread. - In case of
PURGE
, since this is a case of conditional delete, we cannot separate these operations. At least, I cannot think of a way to efficiently handle this.
I complied the latest code with ASAN, and tested it by triggering PROXYSQLTEST
commands to check all variations of purge_query_digests_*
methods. No errors are reported.
lib/Query_Processor.cpp
Outdated
|
||
|
||
void * purge_query_digests_parallel(void *_args) { | ||
set_thread_name("PurgeQueryDigest", GloVars.set_thread_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will immediately assert when attempting to execute. This name exceeds the maximum name allowed for threads, triggering this assert:
int rc;
rc = pthread_setname_np(pthread_self(), name);
assert(!rc);
The name should be reverted to the previous one PurgeQueryDgest
. For executing the test sequence suggested in my concurrent purging comment, this needs to be fixed first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
24ea478
to
4fa9a47
Compare
- Add handler for purge query digest command - Handle last_seen argument in purge query digest operations - Make purge logic common for both sync & async operations Signed-off-by: Wazir Ahmed <[email protected]>
4fa9a47
to
bc01331
Compare
@rahim-kanji @JavierJF I have addressed all the review comments (except this). PTAL and let me know if I have missed something. |
Closes #4543.