TSDB: Embedded Time-Series Database Subsystem#5383

Merged
renecannao merged 22 commits into v3.0 from v3.0-tsdb-feature on Feb 19, 2026

Conversation


@renecannao renecannao commented Feb 17, 2026

Overview

This PR implements a high-performance, embedded Time-Series Database (TSDB) subsystem within ProxySQL. It provides native metric storage, historical analysis, and backend health monitoring directly in SQLite, removing the need for external exporters for basic observability.

Key Features

1. High-Performance Storage

  • Architecture: SQLite-based storage engine using cached prepared statements and batch insertions for minimal overhead.
  • Tables:
    • stats_history.tsdb_metrics: Raw metric samples.
    • stats_history.tsdb_metrics_hour: Hourly downsampled rollups.
    • stats_history.tsdb_backend_health: Backend connection health logs.

2. Parallel Backend Probing

  • Implements a dedicated monitor loop that performs TCP connect checks against backends in parallel using std::async.
  • Records connection latency and up/down status independently of the main MySQL monitor.
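The probe loop's source is not included in this description; as an illustration only, the std::async fan-out it describes might look like the sketch below. `probe_backends` and `ProbeResult` are hypothetical names, and the probe function is injected so the sketch stays self-contained (ProxySQL performs a real TCP connect here).

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

struct ProbeResult {
    std::string host;
    int port;
    bool up;
    long latency_us;
};

std::vector<ProbeResult> probe_backends(
        const std::vector<std::pair<std::string, int>>& backends,
        const std::function<bool(const std::string&, int)>& probe) {
    std::vector<std::future<ProbeResult>> futures;
    futures.reserve(backends.size());
    for (const auto& b : backends) {
        // one async task per backend, so a slow probe does not serialize the rest
        futures.push_back(std::async(std::launch::async, [&probe, b]() {
            auto t0 = std::chrono::steady_clock::now();
            bool up = probe(b.first, b.second);  // real code: TCP connect()
            auto t1 = std::chrono::steady_clock::now();
            long us = (long)std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
            return ProbeResult{b.first, b.second, up, us};
        }));
    }
    std::vector<ProbeResult> results;
    for (auto& f : futures) results.push_back(f.get());  // collected in input order
    return results;
}
```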

3. Management & Visibility

  • Dedicated Namespace: All configuration uses the tsdb- prefix (e.g., tsdb-retention_days).
  • Commands:
    • LOAD TSDB VARIABLES TO RUNTIME
    • SAVE TSDB VARIABLES TO DISK
    • SHOW TSDB VARIABLES
    • SHOW TSDB STATUS
    • PROXYSQL TSDB DOWNSAMPLE (Manual trigger for aggregation)
  • UI/API:
    • Web Dashboard: Interactive Chart.js visualizations available at /tsdb.
    • REST API: Endpoints at /api/tsdb/{metrics,query,status} for external integration.
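For illustration, /api/tsdb/query returns a JSON array of datapoints; the field names below (ts, value, labels) are inferred from the dashboard JavaScript in this PR, and the sample values are invented:

```json
[
  { "ts": 1760000000, "value": 12.5, "labels": { "hostgroup": "0" } },
  { "ts": 1760000010, "value": 13.0, "labels": { "hostgroup": "0" } }
]
```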

4. Verification

  • TAP Tests: Added comprehensive test suites (test_tsdb_variables-t.cpp, test_tsdb_api-t.cpp) covering:
    • Variable persistence and lifecycle.
    • Runtime visibility and status reporting.
    • Data collection and downsampling accuracy.
    • API endpoint availability and authentication.
  • Documentation: Full documentation suite added to doc/tsdb/, including architecture, specs, and quickstart guides.

Implementation Details

  • Stats Table: Moved stats_tsdb to the main schema to allow SHOW TSDB STATUS to work seamlessly.
  • Variables: Integrated tsdb- prefix into the global variable validation and refresh logic.
  • HTTP Server: Enhanced ProxySQL_HTTP_Server to serve the dashboard and API routes.

Checklist

  • Code compiles with debug flags.
  • TAP tests passing (19/19 variable tests, 10/10 API tests).
  • Documentation updated.
  • Tested manually via Admin interface and Web UI.

Summary by CodeRabbit

Release Notes

  • New Features

    • Embedded time-series database (TSDB) for storing and querying ProxySQL metrics with configurable retention policies
    • Web dashboard at /tsdb displaying metrics with interactive charts and time range filtering
    • REST API endpoints (/api/tsdb/metrics, /api/tsdb/query, /api/tsdb/status) for programmatic metric access
    • Backend health monitoring with optional probe collection
    • New admin commands for TSDB variable management (LOAD/SAVE TSDB VARIABLES TO RUNTIME, SHOW TSDB VARIABLES/STATUS)
  • Documentation

    • Added comprehensive TSDB documentation including quickstart guide, architecture overview, configuration reference, and REST API reference

Add time-series database extension to ProxySQL_Statistics with:
- Three SQLite tables: tsdb_metrics, tsdb_metrics_hour, tsdb_backend_health
- Automatic downsampling from raw to hourly aggregates
- Retention management (7 days raw, 1 year hourly)
- Prometheus metrics sampling
- Backend health monitoring via TCP probes
- SQL query interface with JSON label filtering
- Admin variable integration
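The raw-to-hourly rollup itself is done in SQL inside ProxySQL, but the bucketing rule it relies on (truncate each timestamp down to its hour, then aggregate per bucket) can be sketched in plain C++. The struct names and the exact aggregate set (min/max/sum/count) are assumptions, not the PR's actual schema.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Illustrative shapes for a raw sample and an hourly rollup bucket.
struct Sample { int64_t ts; double value; };
struct HourAgg { double min; double max; double sum; int64_t count; };

std::map<int64_t, HourAgg> downsample_hourly(const std::vector<Sample>& raw) {
    std::map<int64_t, HourAgg> buckets;
    for (const Sample& s : raw) {
        int64_t hour = s.ts - (s.ts % 3600);  // truncate to start of hour
        auto it = buckets.find(hour);
        if (it == buckets.end()) {
            buckets.emplace(hour, HourAgg{s.value, s.value, s.value, 1});
        } else {
            HourAgg& a = it->second;
            if (s.value < a.min) a.min = s.value;
            if (s.value > a.max) a.max = s.value;
            a.sum += s.value;
            a.count += 1;
        }
    }
    return buckets;  // hourly average is sum / count
}
```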

Documentation:
- embedded_tsdb_overview.md - Architecture and benefits
- embedded_tsdb_architecture.md - System design and components
- embedded_tsdb_reference.md - Configuration and API reference
- embedded_tsdb_specs.md - Technical specifications
- embedded_tsdb_quickstart.md - Getting started guide

All 4 build combinations pass:
- make debug / make release
- PROXYSQLGENAI=1 make debug / make release
- Move 'stats_tsdb' to 'main' schema for better visibility in 'SHOW TSDB STATUS'.
- Implement 'PROXYSQL TSDB DOWNSAMPLE' command to trigger manual metrics aggregation.
- Ensure 'SHOW TSDB STATUS' triggers a statistics refresh before returning data.
- Integrate 'tsdb-' variables into 'runtime_global_variables' refresh logic.
- Update TAP tests to verify new commands and improve troubleshooting output.
- Documented TSDB Web Dashboard and REST API endpoints.
- Updated configuration references to include LOAD/SAVE TSDB VARIABLES commands.
- Corrected database aliases (stats_history vs statsdb_disk) in examples.
- Removed outdated references to missing API features.
- Implemented TAP test for TSDB REST API and Dashboard.
- Verified /api/tsdb/{metrics,query,status} endpoints.
- Verified /tsdb dashboard accessibility via HTTPS.
- Updated Makefile to include new test binary.

coderabbitai bot commented Feb 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

This PR adds embedded TSDB (time-series database) functionality using SQLite. It includes comprehensive documentation, infrastructure updates, REST API endpoints, an HTML dashboard, admin commands for TSDB variable management, and tests validating TSDB operations and queries.

Changes

  • Documentation (doc/tsdb/embedded_tsdb_overview.md, doc/tsdb/embedded_tsdb_quickstart.md, doc/tsdb/embedded_tsdb_architecture.md, doc/tsdb/embedded_tsdb_specs.md, doc/tsdb/embedded_tsdb_reference.md, doc/tsdb/embedded_tsdb_metrics_catalog.md, doc/tsdb/ui_endpoints.md): Comprehensive TSDB documentation covering architecture, configuration, SQL schemas, metrics catalog, quickstart guide, technical specifications, and REST API/UI endpoints.
  • TSDB Core Infrastructure (include/ProxySQL_Statistics.hpp, lib/ProxySQL_Statistics.cpp): Introduces the TSDB data model with new tables (tsdb_metrics, tsdb_metrics_hour, tsdb_backend_health), timers, and an extensive public API for metric insertion, downsampling, retention, querying with label filters, status retrieval, and variable management (set/get/list TSDB variables).
  • Admin Integration (include/proxysql_admin.h, lib/ProxySQL_Admin.cpp, lib/ProxySQL_Admin_Stats.cpp, lib/Admin_Handler.cpp): Adds TSDB variable lifecycle management (init, load/save to/from runtime, flush to database), SHOW TSDB VARIABLES/STATUS commands, the PROXYSQL TSDB DOWNSAMPLE command, and integration with the existing admin config persistence layer.
  • HTTP & REST API (include/ProxySQL_RESTAPI_Server.hpp, lib/ProxySQL_RESTAPI_Server.cpp, lib/ProxySQL_HTTP_Server.cpp, lib/TSDB_Dashboard_html.cpp): Implements the /tsdb HTML dashboard with a Chart.js UI for metric visualization, plus REST endpoints (/api/tsdb/metrics, /api/tsdb/query, /api/tsdb/status) for programmatic access to TSDB data with JSON responses and authentication support.
  • Schema & Admin Tables (include/ProxySQL_Admin_Tables_Definitions.h, lib/Admin_Bootstrap.cpp): Defines the STATS_SQLITE_TABLE_TSDB macro for the stats_tsdb table exposing TSDB metrics (Total_Series, Total_Datapoints, Disk_Size_Bytes, Oldest/Newest_Datapoint_TS), and registers it in admin bootstrap.
  • Tests (test/tap/tests/test_tsdb_variables-t.cpp, test/tap/tests/test_tsdb_api-t.cpp, test/tap/tests/Makefile, test/tap/groups/groups.json): Adds comprehensive test coverage for TSDB variable management (SET/LOAD/SAVE commands, SHOW TSDB STATUS), REST API endpoint validation (/tsdb UI, /api/tsdb/* endpoints), and Makefile/test group registration.
  • Main Initialization (src/main.cpp): Integrates TSDB initialization into the ProxySQL startup sequence via a GloAdmin->init_tsdb_variables() call.
  • Build System (lib/Makefile): Adds TSDB_Dashboard_html.oo to the compiled object files.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Sampler as TSDB Sampler
    participant Stats as ProxySQL_Statistics
    participant DB as SQLite statsdb_disk
    participant Query as Query Handler
    participant Client as Client/UI

    Note over Sampler,DB: Data Collection & Storage
    Sampler->>Sampler: Timer check (tsdb_sampler_timetoget)
    Sampler->>Stats: insert_tsdb_metric(metric, labels, value)
    Stats->>DB: INSERT INTO tsdb_metrics

    Note over Query,DB: Downsampling & Retention
    Stats->>Stats: tsdb_downsample_metrics()
    Stats->>DB: INSERT INTO tsdb_metrics_hour (hourly aggregates)
    Stats->>Stats: tsdb_retention_cleanup()
    Stats->>DB: DELETE expired data

    Note over Query,DB: Query & API Response
    Client->>Query: GET /api/tsdb/query?metric=...
    Query->>Stats: query_tsdb_metrics(metric, labels, from, to)
    Stats->>DB: SELECT from tsdb_metrics with filters
    DB-->>Stats: Metric data
    Stats-->>Query: SQLite3_result
    Query-->>Client: JSON response
```
```mermaid
sequenceDiagram
    participant Admin as Admin CLI
    participant ProxySQL_Admin as ProxySQL_Admin
    participant Stats as ProxySQL_Statistics
    participant DB as SQLite admindb/statsdb_disk

    Note over Admin,DB: TSDB Variable Lifecycle
    Admin->>ProxySQL_Admin: SET tsdb-enabled = 1
    ProxySQL_Admin->>Stats: set_variable("tsdb-enabled", "1")
    Stats->>Stats: Validate & update internal state

    Admin->>ProxySQL_Admin: LOAD TSDB VARIABLES TO RUNTIME
    ProxySQL_Admin->>ProxySQL_Admin: load_tsdb_variables_to_runtime()
    ProxySQL_Admin->>DB: SELECT from disk.global_variables (tsdb-*)
    DB-->>ProxySQL_Admin: Variable values
    ProxySQL_Admin->>Stats: Update TSDB runtime state

    Admin->>ProxySQL_Admin: SAVE TSDB VARIABLES TO DISK
    ProxySQL_Admin->>ProxySQL_Admin: save_tsdb_variables_from_runtime()
    ProxySQL_Admin->>DB: INSERT/UPDATE disk.global_variables
    DB-->>ProxySQL_Admin: Persisted

    Admin->>ProxySQL_Admin: SHOW TSDB STATUS
    ProxySQL_Admin->>Stats: get_tsdb_status()
    Stats->>DB: Aggregate metrics from tsdb_*
    DB-->>Stats: Status data
    Stats-->>ProxySQL_Admin: tsdb_status_t
    ProxySQL_Admin-->>Admin: Display status rows
```

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested Reviewers

  • rahim-kanji

Poem

🐰 Hoppy times with metrics bright,
SQLite stores them day and night,
Labels dance in JSON strings,
Dashboards chart what data brings,
Time-series hops to REST API calls!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 20.97%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The pull request title 'TSDB: Embedded Time-Series Database Subsystem' clearly summarizes the main change: implementing an embedded TSDB feature in ProxySQL using SQLite.



@gemini-code-assist

Summary of Changes

Hello @renecannao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a significant new capability to ProxySQL by embedding a time-series database directly into the system. This allows ProxySQL to collect, store, and analyze its own operational metrics and backend health data without requiring external tools. The implementation focuses on performance and integration, providing a robust foundation for internal observability and monitoring, accessible via both a web UI and a REST API.

Highlights

  • Embedded Time-Series Database (TSDB) Subsystem: Introduced a new high-performance, embedded TSDB subsystem within ProxySQL, utilizing SQLite for native metric storage, historical analysis, and backend health monitoring. This reduces reliance on external observability tools.
  • Data Storage and Management: Implemented three new SQLite tables: tsdb_metrics for raw samples, tsdb_metrics_hour for hourly downsampled rollups, and tsdb_backend_health for backend connection health logs. The system includes periodic sampling, hourly downsampling, and configurable retention policies.
  • Parallel Backend Probing: Added a dedicated monitor loop that performs parallel TCP connect checks against backend servers using std::async, recording connection latency and status independently of the main MySQL monitor.
  • New Admin Interface Commands: Provided new SQL commands for managing TSDB variables (LOAD TSDB VARIABLES TO RUNTIME, SAVE TSDB VARIABLES TO DISK, SHOW TSDB VARIABLES) and monitoring its status (SHOW TSDB STATUS). A manual downsampling trigger (PROXYSQL TSDB DOWNSAMPLE) was also added.
  • Web Dashboard and REST API: Integrated a built-in web dashboard at /tsdb for interactive visualizations using Chart.js, and exposed REST API endpoints at /api/tsdb/{metrics,query,status} for external integration.
  • Comprehensive Testing and Documentation: Added extensive TAP test suites covering variable persistence, runtime visibility, data collection, downsampling accuracy, and API endpoint availability. Full documentation for architecture, specifications, and quickstart guides was also included.


Changelog
  • doc/tsdb/embedded_tsdb_architecture.md
    • Added documentation detailing the runtime components, data path, metric family handling, storage schema, and configuration lifecycle of the embedded TSDB.
  • doc/tsdb/embedded_tsdb_metrics_catalog.md
    • Added documentation outlining the metric families covered by the TSDB sampler and the conventions for storing series, including Prometheus Counter, Gauge, Summary, Histogram, Info, and Untyped types.
  • doc/tsdb/embedded_tsdb_overview.md
    • Added an overview document describing the purpose, implemented features, configuration model, UI/API access, and key variables of the embedded TSDB.
  • doc/tsdb/embedded_tsdb_quickstart.md
    • Added a quickstart guide with SQL examples for enabling TSDB, verifying status and tables, and querying raw samples, hourly rollups, and backend health data.
  • doc/tsdb/embedded_tsdb_reference.md
    • Added a reference manual detailing TSDB configuration variables, table schemas, Prometheus ingestion mapping, and SQL query examples.
  • doc/tsdb/embedded_tsdb_specs.md
    • Added technical specifications covering the scope, table definitions, sampling and rollup logic, retention policies, variable semantics, and API surface of the TSDB.
  • doc/tsdb/ui_endpoints.md
    • Added documentation for the TSDB web dashboard and REST API endpoints, including their functionality, parameters, and example responses.
  • include/ProxySQL_Admin_Tables_Definitions.h
    • Added a macro definition for the stats_tsdb table, used for storing TSDB status variables.
  • include/ProxySQL_RESTAPI_Server.hpp
    • Added a tsdb_endpoint member to manage TSDB-related REST API resources.
  • include/ProxySQL_Statistics.hpp
    • Included ProxySQL_Admin_Tables_Definitions.h for TSDB table definitions.
    • Defined macros for tsdb_metrics, tsdb_metrics_hour, and tsdb_backend_health SQLite table creation statements.
    • Added TSDB-specific timer variables (next_timer_tsdb_sampler, next_timer_tsdb_downsample, next_timer_tsdb_monitor, next_timer_tsdb_retention) and prepared statements for metric insertion.
    • Introduced TSDB configuration variables (tsdb_enabled, tsdb_sample_interval, tsdb_retention_days, tsdb_monitor_enabled, tsdb_monitor_interval).
    • Declared methods for TSDB variable management (set_variable, get_variable, has_variable, get_variables_list).
    • Declared TSDB core methods for metric insertion, downsampling, retention, querying, status retrieval, timer checks, and main loops.
  • include/proxysql_admin.h
    • Declared new methods for TSDB variable initialization (init_tsdb_variables), flushing to/from database (flush_tsdb_variables___runtime_to_database), and loading/saving (load_tsdb_variables_to_runtime, save_tsdb_variables_from_runtime).
    • Declared the stats___tsdb() method for updating TSDB statistics.
  • lib/Admin_Bootstrap.cpp
    • Added the stats_tsdb table definition to the admin tables during bootstrap.
  • lib/Admin_Handler.cpp
    • Defined command aliases for LOAD TSDB VARIABLES and SAVE TSDB VARIABLES.
    • Added handling for the PROXYSQL TSDB DOWNSAMPLE command.
    • Updated is_valid_global_variable to recognize tsdb- prefixed variables.
    • Extended admin_handler_command_load_or_save to process LOAD/SAVE TSDB VARIABLES commands, including logic for loading from runtime and saving to disk.
    • Modified admin_handler_command_load_or_save to include tsdb- variables when loading from config.
    • Added handling for SHOW TSDB VARIABLES and SHOW TSDB STATUS commands, directing them to appropriate queries and statistic updates.
  • lib/ProxySQL_Admin.cpp
    • Added a boolean flag stats_tsdb to track TSDB statistics refresh requests.
    • Updated GenericRefreshStatistics to trigger stats_tsdb refresh when stats_tsdb or showtsdbstatus is requested.
    • Called stats___tsdb() when TSDB statistics need to be refreshed.
    • Included flush_tsdb_variables___runtime_to_database in the global variable flush process.
    • Integrated TSDB sampler, downsample, monitor, and retention loops into the admin_main_loop.
    • Added tsdb_variables to the list of commands generated for load/save operations.
    • Implemented init_tsdb_variables to initialize TSDB variables from config and runtime.
    • Implemented flush_tsdb_variables___runtime_to_database to manage TSDB variables in the database.
    • Implemented load_tsdb_variables_to_runtime to load TSDB variables into runtime memory.
    • Implemented save_tsdb_variables_from_runtime to persist TSDB variables from runtime to disk.
  • lib/ProxySQL_Admin_Stats.cpp
    • Included ProxySQL_Statistics.hpp to access TSDB statistics.
    • Implemented stats___tsdb() to retrieve TSDB status and insert it into the stats_tsdb table.
  • lib/ProxySQL_HTTP_Server.cpp
    • Added a new HTTP handler for the /tsdb endpoint, serving an HTML page with a Chart.js-based dashboard for visualizing TSDB metrics.
  • lib/ProxySQL_RESTAPI_Server.cpp
    • Included ProxySQL_Statistics.hpp to interact with TSDB statistics.
    • Defined a new tsdb_resource class to handle REST API requests for TSDB.
    • Implemented render_GET methods within tsdb_resource for /api/tsdb/metrics (listing available metrics), /api/tsdb/query (querying time-series data with filters), and /api/tsdb/status (retrieving TSDB operational status).
    • Registered the TSDB REST API endpoints (/api/tsdb/metrics, /api/tsdb/query, /api/tsdb/status) with the HTTP server.
  • lib/ProxySQL_Statistics.cpp
    • Added utility functions for SQL string escaping, validating label keys, and formatting Prometheus label doubles.
    • Initialized TSDB timer variables and prepared statements in the constructor.
    • Implemented set_variable, get_variable, has_variable, and get_variables_list for TSDB-specific global variables.
    • Finalized TSDB prepared statements in the destructor.
    • Added TSDB table definitions (tsdb_metrics, tsdb_metrics_hour, tsdb_backend_health) to the disk statistics database initialization.
    • Created indexes for TSDB tables to optimize queries.
    • Implemented schema upgrade logic for tsdb_metrics and tsdb_metrics_hour to add a default labels column.
    • Implemented insert_tsdb_metric for efficient insertion of time-series data using prepared statements.
    • Implemented insert_backend_health for recording backend probe results.
    • Implemented tsdb_downsample_metrics to aggregate raw metrics into hourly rollups.
    • Implemented tsdb_retention_cleanup to remove old raw and probe data based on configured retention policies.
    • Implemented get_tsdb_status to provide an overview of TSDB data, including series count, datapoints, disk size, and timestamp ranges.
    • Implemented timer check functions (tsdb_sampler_timetoget, tsdb_downsample_timetoget, tsdb_monitor_timetoget, tsdb_retention_timetoget) to control the frequency of TSDB operations.
    • Implemented query_tsdb_metrics for flexible querying of time-series data with label filtering and optional aggregation.
    • Implemented get_backend_health_metrics for querying backend health data.
    • Implemented tsdb_sampler_loop to collect Prometheus metrics from the registry and insert them into tsdb_metrics.
    • Implemented tsdb_monitor_loop to perform parallel TCP probes on backend servers and record results in tsdb_backend_health.
  • src/main.cpp
    • Enabled the initialization of GloProxyStats by calling init().
    • Called GloAdmin->init_tsdb_variables() during the main initialization phase to set up TSDB variables.
  • test/tap/tests/Makefile
    • Added test_tsdb_variables-t and test_tsdb_api-t to the list of tests to be run.
  • test/tap/tests/test_tsdb_api-t.cpp
    • Added a new TAP test file to verify the functionality of the TSDB web dashboard and REST API endpoints, including /tsdb, /api/tsdb/status, /api/tsdb/metrics, and /api/tsdb/query.
  • test/tap/tests/test_tsdb_variables-t.cpp
    • Added a new TAP test file to validate TSDB variable setting, loading, saving, and runtime behavior, as well as the accessibility and data collection of TSDB tables.
Activity
  • The pull request introduces a new TSDB subsystem, indicating significant development effort.
  • Comprehensive TAP tests have been added and are reported as passing (19/19 variable tests, 10/10 API tests), demonstrating thorough validation of the new features.
  • Documentation has been fully updated, suggesting a complete and well-documented feature.
  • Manual testing was performed via the Admin interface and Web UI, confirming usability and functionality.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is a substantial and well-executed pull request that introduces an embedded Time-Series Database (TSDB) subsystem. This is a great feature for providing native observability without external dependencies. The implementation is comprehensive, covering data storage, parallel backend probing, configuration management, a web dashboard, and a REST API. The use of modern C++ features like std::async for parallel probing is commendable. My review focuses on improving maintainability, robustness, and code clarity in a few areas. Overall, this is an excellent addition to ProxySQL.

Comment on lines 3055 to 3059
size_t l = strlen(a) + 200;
l += strlen(varnames[i]);
l += strlen(safe_val);
char *query=(char *)malloc(l);
snprintf(query, l, a, varnames[i], safe_val);


high

Using malloc with a buffer size calculated with a magic number (+ 200) is unsafe and can lead to buffer overflows if variable names or values are unexpectedly long. It's better to use std::string for query construction to ensure memory safety and improve code clarity. This avoids manual memory management and the risks associated with fixed-size buffers.
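A minimal sketch of that suggestion follows; the SQL template and function name are illustrative, not the PR's actual code, which uses a format string held in `a`.

```cpp
#include <string>

// Sketch of the reviewer's suggestion: build the statement with std::string
// so the buffer always fits, replacing malloc(strlen(...) + 200).
std::string build_set_variable_query(const std::string& varname,
                                     const std::string& safe_val) {
    std::string query =
        "REPLACE INTO global_variables (variable_name, variable_value) "
        "VALUES ('tsdb-";
    query += varname;   // e.g. "retention_days"
    query += "', '";
    query += safe_val;  // assumed already SQL-escaped by the caller
    query += "')";
    return query;
}
```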

- `LOAD TSDB VARIABLES TO RUNTIME`
- `SAVE TSDB VARIABLES TO DISK`

No dedicated TSDB load/save command family is implemented.


medium

This statement is a bit confusing. The PR description and implementation show that commands like LOAD TSDB VARIABLES TO RUNTIME are supported. Perhaps you mean that it reuses the existing LOAD/SAVE ... VARIABLES command structure rather than introducing a completely new command family. Clarifying this would improve the documentation's accuracy.

const bool is_pgsql = (query_no_space[5] == 'P' || query_no_space[5] == 'p') ? true : false;
const bool is_genai = (query_no_space[5] == 'G' || query_no_space[5] == 'g') ? true : false;
const std::string modname = is_pgsql ? "pgsql_variables" : (is_genai ? "genai_variables" : "mysql_variables");
const bool is_tsdb = (query_no_space[5] == 'T' || query_no_space[5] == 't') ? true : false;


medium

Relying on a fixed character position (query_no_space[5]) to identify the command type (tsdb) is brittle and can easily break if the command structure changes slightly. A more robust approach would be to use strstr or a similar string searching function on the original query string (before spaces are removed) to check for the presence of " TSDB ". This would make the command parsing logic less dependent on exact character positions.
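A sketch of that approach, with an illustrative helper name, could be a case-insensitive substring search for the keyword:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Detect the TSDB keyword by case-insensitive substring search rather than
// inspecting a fixed character offset in the query string.
bool mentions_tsdb(const std::string& query) {
    std::string upper(query);
    std::transform(upper.begin(), upper.end(), upper.begin(),
                   [](unsigned char c) { return (char)std::toupper(c); });
    return upper.find(" TSDB ") != std::string::npos;
}
```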

Comment on lines 2525 to 2537
if (GloProxyStats->tsdb_sampler_timetoget(curtime)) {
GloProxyStats->tsdb_sampler_loop();
}
if (GloProxyStats->tsdb_downsample_timetoget(curtime)) {
GloProxyStats->tsdb_downsample_metrics();
}
if (GloProxyStats->tsdb_monitor_timetoget(curtime)) {
GloProxyStats->tsdb_monitor_loop();
}
if (GloProxyStats->tsdb_retention_timetoget(curtime)) {
GloProxyStats->tsdb_retention_cleanup();
}
} if (S_amll.get_version()!=version) {


medium

The indentation in this new block of code is inconsistent and makes it difficult to read. Please correct the formatting to align with the project's coding style.

                                if (GloProxyStats->tsdb_sampler_timetoget(curtime)) {
                                        GloProxyStats->tsdb_sampler_loop();
                                }
                                if (GloProxyStats->tsdb_downsample_timetoget(curtime)) {
                                        GloProxyStats->tsdb_downsample_metrics();
                                }
                                if (GloProxyStats->tsdb_monitor_timetoget(curtime)) {
                                        GloProxyStats->tsdb_monitor_loop();
                                }
                                if (GloProxyStats->tsdb_retention_timetoget(curtime)) {
                                        GloProxyStats->tsdb_retention_cleanup();
                                }
                        }
                if (S_amll.get_version()!=version) {

Comment on lines 440 to 485
if (strcmp(url,"/tsdb")==0) {
string s = "<!DOCTYPE html><html><head><title>ProxySQL TSDB Dashboard</title>";
s += "<script src=\"/Chart.bundle.js\"></script>";
s += "<style>body{font-family:sans-serif;margin:20px} .container{max-width:1000px;margin:0 auto} .controls{margin-bottom:20px;padding:15px;background:#f5f5f5;border-radius:5px} select,input,button{padding:8px;margin-right:10px} #chart-container{height:500px;position:relative}</style></head><body>";
s += "<div class=\"container\"><h1>TSDB Dashboard</h1>";
s += "<div class=\"controls\">";
s += "Metric: <select id=\"metric-select\"><option>Loading...</option></select> ";
s += "Range: <select id=\"range-select\"><option value=\"3600\">Last Hour</option><option value=\"86400\">Last Day</option><option value=\"604800\">Last Week</option></select> ";
s += "<button onclick=\"loadData()\">Refresh</button>";
s += "</div>";
s += "<div id=\"chart-container\"><canvas id=\"tsdbChart\"></canvas></div>";
s += "<script>";
s += "let myChart = null;";
s += "async function loadMetrics() {";
s += " const resp = await fetch('/api/tsdb/metrics');";
s += " const metrics = await resp.json();";
s += " const select = document.getElementById('metric-select');";
s += " select.innerHTML = '';";
s += " metrics.forEach(m => { const opt = document.createElement('option'); opt.value = m; opt.text = m; select.appendChild(opt); });";
s += " if (metrics.length > 0) loadData();";
s += "}";
s += "async function loadData() {";
s += " const metric = document.getElementById('metric-select').value;";
s += " const range = document.getElementById('range-select').value;";
s += " const now = Math.floor(Date.now() / 1000);";
s += " const from = now - parseInt(range);";
s += " const resp = await fetch(`/api/tsdb/query?metric=${metric}&from=${from}&to=${now}`);";
s += " const data = await resp.json();";
s += " const ctx = document.getElementById('tsdbChart').getContext('2d');";
s += " if (myChart) myChart.destroy();";
s += " const datasets = {};";
s += " data.forEach(d => {";
s += " const label = JSON.stringify(d.labels);";
s += " if (!datasets[label]) datasets[label] = { label: label, data: [], borderColor: '#' + Math.floor(Math.random()*16777215).toString(16), fill: false };";
s += " datasets[label].data.push({ x: new Date(d.ts * 1000), y: d.value });";
s += " });";
s += " myChart = new Chart(ctx, { type: 'line', data: { datasets: Object.values(datasets) }, options: { scales: { xAxes: [{ type: 'time', time: { unit: 'minute' } }] }, maintainAspectRatio: false } });";
s += "}";
s += "window.onload = loadMetrics;";
s += "</script></div></body></html>";

response = MHD_create_response_from_buffer (s.length(), (void *) s.c_str(), MHD_RESPMEM_MUST_COPY);
ret = MHD_queue_response (connection, MHD_HTTP_OK, response);
MHD_destroy_response (response);
return ret;
}


medium

Embedding a large block of HTML and JavaScript directly within a C++ string makes the code difficult to read, modify, and maintain. For better separation of concerns and improved maintainability, consider moving the HTML and JavaScript content into separate resource files. These files can be loaded at runtime or compiled into the binary.

Comment on lines 175 to 184
char **ProxySQL_Statistics::get_variables_list() {
char **list = (char **)malloc(sizeof(char *) * 6);
list[0] = strdup("enabled");
list[1] = strdup("sample_interval");
list[2] = strdup("retention_days");
list[3] = strdup("monitor_enabled");
list[4] = strdup("monitor_interval");
list[5] = NULL;
return list;
}


medium

Hardcoding the list of variables in get_variables_list() is fragile. If a new TSDB variable is added, this list must be manually updated, along with the set_variable and get_variable functions. To improve maintainability, consider defining the variables in a more structured, data-driven way, for example, using a static array of structs that encapsulates each variable's properties. This would colocate the variable definitions with their handling logic and make the system easier to extend.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

Note

Due to the large number of review comments, only those of Critical or Major severity were posted as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main.cpp (1)

972-980: ⚠️ Potential issue | 🔴 Critical

Remove the early GloProxyStats->init() call at line 978.

ProxySQL_Statistics::init() is not idempotent—it unconditionally allocates new vectors for tables_defs_statsdb_mem and tables_defs_statsdb_disk without checking if they already exist. Calling it twice (once at line 978, then again in ProxySQL_Admin::init() after statsdb_disk setup) causes a memory leak and redundant table initialization. Let ProxySQL_Admin::init() handle the single initialization after all stats databases are configured.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.cpp` around lines 972 - 980, The early call to GloProxyStats->init()
in ProxySQL_Main_init_Admin_module causes double initialization of
ProxySQL_Statistics (which allocates tables_defs_statsdb_mem and
tables_defs_statsdb_disk) and leads to leaks; remove that call so
ProxySQL_Statistics::init() is only invoked from ProxySQL_Admin::init() after
statsdb_disk is configured, ensuring tables_defs_statsdb_mem and
tables_defs_statsdb_disk are allocated exactly once and avoiding redundant
setup.
🟡 Minor comments (11)
doc/tsdb/embedded_tsdb_specs.md-47-48 (1)

47-48: ⚠️ Potential issue | 🟡 Minor

Schema prefix inconsistency: statsdb_disk vs stats_history.

This file references statsdb_disk.tsdb_* but the reference doc and quickstart use stats_history.tsdb_*. One of these is wrong — please align all documentation to the actual schema name used in code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/tsdb/embedded_tsdb_specs.md` around lines 47 - 48, The docs use two
different TSDB schema prefixes (`statsdb_disk.tsdb_*` vs
`stats_history.tsdb_*`), causing inconsistency; pick the actual schema name used
in code and update this doc to match. Search for occurrences of
`statsdb_disk.tsdb_*` and `stats_history.tsdb_*` (and mentions in headings or
examples) and replace them with the canonical schema prefix used by the
implementation (the exact symbol used by the codebase), ensuring examples, C++
method comments (e.g., ProxySQL_Statistics references), and quickstart all use
the same `tsdb_*` prefix. Verify cross-references and SQL examples compile
against the real schema name.
doc/tsdb/embedded_tsdb_quickstart.md-36-41 (1)

36-41: ⚠️ Potential issue | 🟡 Minor

Inconsistent schema name across documentation.

This quickstart uses stats_history.tsdb_metrics but doc/tsdb/embedded_tsdb_specs.md (line 48) references statsdb_disk.tsdb_*. Please reconcile which schema name is correct and use it consistently across all TSDB docs.

#!/bin/bash
# Check which schema prefix is used in the actual codebase
rg -n 'stats_history\.tsdb_|statsdb_disk\.tsdb_' --type=cpp -C2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/tsdb/embedded_tsdb_quickstart.md` around lines 36 - 41, The docs use two
different TSDB schema prefixes (stats_history.tsdb_metrics vs
statsdb_disk.tsdb_*); search the repo for occurrences of "stats_history.tsdb_"
and "statsdb_disk.tsdb_" to determine the canonical schema used by the code,
then update the SQL snippet in embedded_tsdb_quickstart.md (the SELECT ... FROM
stats_history.tsdb_metrics block) and the reference in embedded_tsdb_specs.md so
both use the same canonical prefix (e.g., replace all instances to
statsdb_disk.tsdb_* or stats_history.tsdb_* consistently), and ensure any other
TSDB docs are updated to match.
doc/tsdb/embedded_tsdb_reference.md-5-5 (1)

5-5: ⚠️ Potential issue | 🟡 Minor

Broken introductory sentence.

"TSDB variables are TSDB variables:" is a tautology — likely a leftover from editing. Consider something like "TSDB variables are global variables prefixed with tsdb-:" to match the description in the overview doc.

📝 Suggested fix
-TSDB variables are TSDB variables:
+TSDB variables are global configuration variables prefixed with `tsdb-`:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/tsdb/embedded_tsdb_reference.md` at line 5, Replace the tautological
header "TSDB variables are TSDB variables:" with a clear descriptive sentence
such as "TSDB variables are global variables prefixed with `tsdb-`:" to match
the overview; locate the exact phrase "TSDB variables are TSDB variables:" and
update it to the new wording so the intro communicates what TSDB variables are
and their prefix convention.
doc/tsdb/embedded_tsdb_architecture.md-33-38 (1)

33-38: ⚠️ Potential issue | 🟡 Minor

Clarify the “no dedicated load/save” wording.

This section lists LOAD TSDB VARIABLES TO RUNTIME and SAVE TSDB VARIABLES TO DISK, then states “No dedicated TSDB load/save command family is implemented.” That reads contradictory—please clarify that there is no separate TSDB load/save beyond the variables commands (or adjust the statement).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/tsdb/embedded_tsdb_architecture.md` around lines 33 - 38, Reword the
contradictory sentence to make clear there is no separate TSDB-specific
load/save command family beyond the variables commands; for example, replace "No
dedicated TSDB load/save command family is implemented." with a clarifying line
like "There is no separate TSDB load/save command family — configuration is
managed via the `LOAD TSDB VARIABLES TO RUNTIME` and `SAVE TSDB VARIABLES TO
DISK` commands." Reference the existing command names (`LOAD TSDB VARIABLES TO
RUNTIME`, `SAVE TSDB VARIABLES TO DISK`) so readers understand these are the
sole mechanisms for TSDB configuration persistence.
lib/ProxySQL_HTTP_Server.cpp-440-485 (1)

440-485: ⚠️ Potential issue | 🟡 Minor

Verify Chart.js version and harden the query URL.

The bundled Chart.js is v2.7.1, which correctly uses the xAxes time scale syntax shown in the code. However, apply the suggested hardening improvements: URL-encode the metric parameter to handle any special characters, and pad random colors to 6 hex digits using padStart(6,'0') to avoid invalid CSS values.

Suggested fixes (JS string content)
- s += "  const resp = await fetch(`/api/tsdb/query?metric=${metric}&from=${from}&to=${now}`);";
+ s += "  const resp = await fetch(`/api/tsdb/query?metric=${encodeURIComponent(metric)}&from=${from}&to=${now}`);";
...
- s += "    if (!datasets[label]) datasets[label] = { label: label, data: [], borderColor: '#' + Math.floor(Math.random()*16777215).toString(16), fill: false };";
+ s += "    if (!datasets[label]) datasets[label] = { label: label, data: [], borderColor: '#' + Math.floor(Math.random()*16777215).toString(16).padStart(6,'0'), fill: false };";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_HTTP_Server.cpp` around lines 440 - 485, The embedded TSDB
dashboard JS should URL-encode the metric when building the query and ensure
generated hex colors are always 6 digits: update the fetch call in loadData()
that builds `/api/tsdb/query?metric=${metric}&from=${from}&to=${now}` to
encodeURIComponent(metric), and change the color generation used when creating
datasets (currently '#' + Math.floor(Math.random()*16777215).toString(16)) to
padStart(6,'0') so the CSS color is valid; locate these changes around the
loadData(), loadMetrics(), and datasets creation code in the /tsdb HTML string
and apply the two replacements.
test/tap/tests/test_tsdb_api-t.cpp-101-107 (1)

101-107: ⚠️ Potential issue | 🟡 Minor

Each mysql_query result should be consumed before the next call.

mysql_query is called 6 times consecutively (Lines 101–106), but results are only drained once at Line 107. While SET and LOAD commands typically don't produce multi-result sets, the MySQL C API contract requires consuming results before issuing a new query on the same connection — failing to do so can cause "Commands out of sync" errors. The safest pattern is to check/drain after each call, or at minimum use the multi-statement approach explicitly.

That said, this is a common pattern in ProxySQL test code, so it is only a minor concern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_api-t.cpp` around lines 101 - 107, The sequence of
mysql_query calls (mysql_query(admin, ...)) issues multiple statements without
consuming results, violating the MySQL C API and risking "Commands out of sync";
fix by draining results after each call (e.g., call drain_results(admin)
immediately after each mysql_query call) or otherwise ensure each mysql_query's
result is consumed before issuing the next one, referencing the mysql_query and
drain_results(admin) usage in this test.
test/tap/tests/test_tsdb_variables-t.cpp-126-133 (1)

126-133: ⚠️ Potential issue | 🟡 Minor

Disk persistence check queries main.global_variables instead of disk.global_variables.

Line 130 queries global_variables (defaults to main), which reflects the in-memory SET value, not the disk-persisted copy. To actually verify SAVE TSDB VARIABLES TO DISK succeeded, query disk.global_variables:

Proposed fix
 	bool disk_ok = fetch_single_string(
 		admin,
-		"SELECT variable_value FROM global_variables WHERE variable_name='tsdb-enabled'",
+		"SELECT variable_value FROM disk.global_variables WHERE variable_name='tsdb-enabled'",
 		disk_enabled
 	);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_variables-t.cpp` around lines 126 - 133, The test
currently queries the default (in-memory) table; change the SQL in the call to
fetch_single_string so it selects FROM disk.global_variables instead of the
default global_variables to verify the on-disk value. Update the query passed to
fetch_single_string (the call in the test around admin, "SELECT variable_value
FROM global_variables WHERE variable_name='tsdb-enabled'") to "SELECT
variable_value FROM disk.global_variables WHERE variable_name='tsdb-enabled'" so
ok(...) asserts the persisted disk value.
test/tap/tests/test_tsdb_variables-t.cpp-195-204 (1)

195-204: ⚠️ Potential issue | 🟡 Minor

PROXYSQL TSDB DOWNSAMPLE depends on DEBUG builds; test will fail in release.

As noted in the Admin_Handler.cpp review, this command is guarded by #ifdef DEBUG. If this test is expected to run in both debug and release CI, tests 18-19 will fail in release builds.

Additionally, line 204's assertion atoi(ds_count.c_str()) >= 0 is trivially true (any count ≥ 0), so it only tests table accessibility, not actual downsampling. Consider checking > 0 if you expect downsampled data, or make the test name more accurate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_variables-t.cpp` around lines 195 - 204, The test
invokes the DEBUG-only command "PROXYSQL TSDB DOWNSAMPLE" and then asserts a
trivially-true count; update the test in test_tsdb_variables-t.cpp to (1) detect
whether the DOWNSAMPLE command is available before asserting (e.g., call
mysql_query(admin, "PROXYSQL TSDB DOWNSAMPLE") and if rc != 0 treat the
DOWNSAMPLE-related checks as skipped), or wrap the DOWNSAMPLE invocation and
subsequent checks behind the same DEBUG guard used in Admin_Handler.cpp, and (2)
replace the weak assertion ok(atoi(ds_count.c_str()) >= 0, ...) with a
meaningful check (e.g., assert atoi(ds_count.c_str()) > 0 if you expect
downsampled rows) or change the diagnostic message to only claim table
accessibility; reference symbols: mysql_query, drain_results,
fetch_single_string, tsdb_metrics_hour, and the "PROXYSQL TSDB DOWNSAMPLE"
command.
test/tap/tests/test_tsdb_variables-t.cpp-192-193 (1)

192-193: ⚠️ Potential issue | 🟡 Minor

Unchecked return value from fetch_single_string.

If the query fails, count retains its previous value from test #4 (line 112), making this assertion misleading. Check the return value as done elsewhere.

Proposed fix
-	fetch_single_string(admin, "SELECT Variable_Value FROM stats_tsdb WHERE Variable_Name='Total_Datapoints'", count);
-	ok(atoi(count.c_str()) > 0, "SHOW TSDB STATUS reports datapoints > 0 (found %s)", count.c_str());
+	bool dp_ok = fetch_single_string(admin, "SELECT Variable_Value FROM stats_tsdb WHERE Variable_Name='Total_Datapoints'", count);
+	ok(dp_ok && atoi(count.c_str()) > 0, "SHOW TSDB STATUS reports datapoints > 0 (found %s)", count.c_str());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_variables-t.cpp` around lines 192 - 193, The call to
fetch_single_string in the test uses the previous value of count if the query
fails; update the test to check fetch_single_string's return (e.g., bool/rc)
before using count: call fetch_single_string(admin, "SELECT ...", count), verify
it returned success (and fail the test or call skip/diag on error) and only then
call ok(atoi(count.c_str()) > 0, ...). Refer to the existing pattern used
elsewhere in the test suite for checking fetch_single_string's return and apply
the same error-handling behavior around fetch_single_string and the variable
count.
lib/ProxySQL_Statistics.cpp-1686-1693 (1)

1686-1693: ⚠️ Potential issue | 🟡 Minor

Memory leak: error string is not freed on query failure.

When execute_statement sets error, the function logs it and returns NULL without calling free(error). Same issue in get_backend_health_metrics at lines 1717–1722.

Proposed fix
     if (error) {
         proxy_error("query_tsdb_metrics failed: %s -- sql: %s\n", error, query.c_str());
+        free(error);
         if (resultset) {
             delete resultset;
         }
         return NULL;
     }

Apply similarly in get_backend_health_metrics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Statistics.cpp` around lines 1686 - 1693, The error path in the
function that calls execute_statement fails to free the allocated error string
before returning, causing a leak; update the error handling in the function (the
block that logs via proxy_error("query_tsdb_metrics failed: %s -- sql: %s\n",
error, query.c_str())) to call free(error) after logging and before returning
NULL, and apply the same fix inside get_backend_health_metrics where the same
pattern appears (ensure resultset is deleted if non-null, free(error), then
return NULL).
lib/ProxySQL_Statistics.cpp-1418-1425 (1)

1418-1425: ⚠️ Potential issue | 🟡 Minor

Bind return codes are silently ignored.

Lines 1418–1421 overwrite rc successively without checking any return value, unlike the rest of this file which uses ASSERT_SQLITE_OK(rc, statsdb_disk) after each bind. A failed bind would silently produce corrupt data.

Proposed fix
-    rc = (*proxy_sqlite3_bind_int64)(statement, 1, timestamp);
-    rc = (*proxy_sqlite3_bind_text)(statement, 2, metric_name.c_str(), -1, SQLITE_TRANSIENT);
-    rc = (*proxy_sqlite3_bind_text)(statement, 3, labels_str.c_str(), -1, SQLITE_TRANSIENT);
-    rc = (*proxy_sqlite3_bind_double)(statement, 4, value);
+    rc = (*proxy_sqlite3_bind_int64)(statement, 1, timestamp); ASSERT_SQLITE_OK(rc, statsdb_disk);
+    rc = (*proxy_sqlite3_bind_text)(statement, 2, metric_name.c_str(), -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb_disk);
+    rc = (*proxy_sqlite3_bind_text)(statement, 3, labels_str.c_str(), -1, SQLITE_TRANSIENT); ASSERT_SQLITE_OK(rc, statsdb_disk);
+    rc = (*proxy_sqlite3_bind_double)(statement, 4, value); ASSERT_SQLITE_OK(rc, statsdb_disk);

Same issue applies to insert_backend_health at lines 1450–1455.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Statistics.cpp` around lines 1418 - 1425, The bind return codes
for the metric insert block are ignored: after calling
proxy_sqlite3_bind_int64/proxy_sqlite3_bind_text/proxy_sqlite3_bind_double you
currently overwrite rc without checking; update the code in the function that
prepares/executess the statement (the block using proxy_sqlite3_bind_int64,
proxy_sqlite3_bind_text, proxy_sqlite3_bind_double followed by
SAFE_SQLITE3_STEP2) to validate rc after each bind by invoking
ASSERT_SQLITE_OK(rc, statsdb_disk) (matching the pattern used elsewhere in this
file) before proceeding to SAFE_SQLITE3_STEP2, and apply the same change to the
insert_backend_health sequence (the bind calls at the insert_backend_health
site) so any bind failure is detected and handled rather than silently ignored.
🧹 Nitpick comments (6)
doc/tsdb/embedded_tsdb_quickstart.md (1)

5-5: Minor concern: variable value quoting.

SET tsdb-enabled='1'; — verify whether the integer variable accepts a string-quoted value. If the admin parser treats this as an integer, SET tsdb-enabled=1; (without quotes) may be more accurate. This is a minor documentation nit; just confirm it matches how ProxySQL actually handles these variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/tsdb/embedded_tsdb_quickstart.md` at line 5, The doc shows SET
tsdb-enabled='1'; which may incorrectly quote an integer; update the example to
use an unquoted integer (SET tsdb-enabled=1;) or explicitly document that string
quoting is accepted by ProxySQL, and ensure the example and text around the SET
tsdb-enabled command match actual ProxySQL behavior (verify via ProxySQL
variable parser) so the snippet and explanation reference the exact token SET
tsdb-enabled and the accepted value format.
test/tap/tests/test_tsdb_api-t.cpp (2)

49-82: curl_global_init called on every invocation of http_get.

curl_global_init(CURL_GLOBAL_ALL) on Line 54 is called each time http_get runs. The libcurl documentation states this function "is not thread safe" and should be called once per program lifetime. While this is a single-threaded test and repeated calls are mostly benign, it's cleaner (and avoids potential future issues) to move it to main() before the first http_get call, paired with a curl_global_cleanup() before exit.

Suggested fix
 int main() {
 	CommandLine cl;
 	if (cl.getEnv()) {
 		diag("Failed to get the required environmental variables.");
 		return EXIT_FAILURE;
 	}

 	plan(10);
+	curl_global_init(CURL_GLOBAL_ALL);

 	MYSQL* admin = mysql_init(NULL);

And in http_get:

 long http_get(const char *url, string &response_out, const char* userpwd = "stats:stats") {
 	struct memory chunk;
 	chunk.response = NULL;
 	chunk.size = 0;

-	curl_global_init(CURL_GLOBAL_ALL);
 	CURL *curl_handle;
 	curl_handle = curl_easy_init();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_api-t.cpp` around lines 49 - 82, The call to
curl_global_init(CURL_GLOBAL_ALL) inside http_get causes repeated,
non-thread-safe initialization; remove that call from the http_get function and
instead invoke curl_global_init(CURL_GLOBAL_ALL) once during program startup
(e.g., in main before any http_get calls) and pair it with curl_global_cleanup()
at program exit; update http_get to assume global init has already happened and
leave curl_easy_init(), curl_easy_perform(), curl_easy_cleanup(), and memory
handling (chunk, response_out) unchanged.

113-114: Hardcoded port numbers may break in non-default configurations.

Lines 113–114 hardcode 6080 (web) and 6070 (REST API). If the test environment overrides admin-web_port or admin-restapi_port, these tests will fail. Consider reading the ports from the admin connection (e.g., via SHOW VARIABLES) or from the CommandLine environment, consistent with how cl.admin_port is used for the admin connection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_api-t.cpp` around lines 113 - 114, The test
currently hardcodes base_url and rest_url with ports 6080/6070 (variables
base_url and rest_url); change this to derive the web and REST ports from the
test harness instead: either read admin-web/admin-restapi from the CommandLine
or query the admin connection (e.g., via SHOW VARIABLES) and use those values to
format the URLs instead of literal ports; use the existing admin connection/port
helpers (e.g., cl.admin_port or the CommandLine parsing utilities) to obtain the
correct ports and build base_url and rest_url dynamically so tests respect
overridden configurations.
lib/Admin_Handler.cpp (1)

4539-4544: LIKE pattern uses 'tsdb-%' without backslash-escaping, unlike other variable namespaces.

Other SHOW * VARIABLES commands use 'mysql-\%' (line 4519), 'pgsql-\%' (line 4534), 'admin-\%' (line 4512), etc. While 'tsdb-%' is standard SQL and technically correct, the inconsistency with the rest of the file may indicate the \% convention has a purpose in this codebase (e.g., custom LIKE behavior).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/Admin_Handler.cpp` around lines 4539 - 4544, The LIKE pattern for the
TSDB variables is inconsistent with other namespaces: update the query
constructed in the block handling SHOW TSDB VARIABLES (where variables like
query_no_space_length, query_no_space, l_free, l_strdup and the goto __run_query
are used) to escape the percent sign the same way as others (use 'tsdb-\%'
instead of 'tsdb-%') so it matches the file's convention for LIKE patterns.
include/ProxySQL_Statistics.hpp (1)

97-107: Consider the implications of labels TEXT in the PRIMARY KEY of tsdb_metrics.

The composite key (timestamp, metric_name, labels) means the full JSON string is part of the key comparison. Two semantically identical label sets with different key ordering (e.g., {"a":"1","b":"2"} vs {"b":"2","a":"1"}) will be treated as different rows, causing duplicate data. The nlohmann::json library used in the .cpp file serializes std::map in sorted key order, so internally-generated labels are safe, but any externally-ingested data or future code paths using unsorted containers could silently create duplicates.

Also, large JSON label strings in a WITHOUT ROWID key may degrade B-tree performance compared to a ROWID-based table with a covering index.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/ProxySQL_Statistics.hpp` around lines 97 - 107, The PRIMARY KEY on
tsdb_metrics includes labels TEXT which makes logically-identical JSON objects
with different key ordering distinct; update how labels are stored/compared:
canonicalize/serialize labels into a stable form before insert (e.g., sort keys
using nlohmann::json or serialize from std::map) or change the schema to add a
fixed-size labels_hash column (e.g., LABELS_HASH INTEGER/TEXT) used in the
PRIMARY KEY and keep labels TEXT as non-key payload; adjust insert/update paths
in the code that write to tsdb_metrics/tsdb_metrics_hour (and any callers that
build labels) to compute the canonicalized JSON or hash consistently so
semantically-equal label sets do not produce duplicate rows and consider
removing labels from the WITHOUT ROWID composite key or switch to ROWID+covering
index if large label strings impact B-tree performance.
lib/ProxySQL_Statistics.cpp (1)

1849-1871: Consider using a bounded thread pool for backend probing to avoid resource exhaustion at scale.

The loop creates one OS thread per backend via std::async(std::launch::async, ...). For clusters with hundreds of backends, this unbounded thread creation can exhaust system resources. Since probe_backend performs I/O-bound network operations, a bounded thread pool (e.g., 10-20 concurrent threads) would be more efficient and scalable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Statistics.cpp` around lines 1849 - 1871, The current code
spawns an unbounded OS thread per backend using std::async in the loop
(futures.push_back(std::async(..., probe_backend, ...))), which can exhaust
resources; replace this with a bounded concurrency approach: create or use a
small fixed-size thread pool (or a counting semaphore) and submit probe_backend
tasks to it (keeping task references), limiting parallelism to a configurable N
(e.g., 10–20), then gather results and call insert_backend_health(hg, host,
port, ...) as before; update the section that builds futures and waits on them
to use the pool/queue instead of directly calling std::async to ensure scalable,
bounded probing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/Admin_Handler.cpp`:
- Around line 976-982: The handler for the "PROXYSQL TSDB DOWNSAMPLE" command is
currently wrapped in an `#ifdef` DEBUG and thus omitted in release builds; move or
duplicate the block that checks query_no_space_length and strncasecmp for
"PROXYSQL TSDB DOWNSAMPLE", calls GloProxyStats->tsdb_downsample_metrics(), and
invokes SPA->send_ok_msg_to_client(sess, NULL, 0, query_no_space) out of the
DEBUG-only region so the command is available in non-debug builds (or
alternatively gate its registration/availability with a runtime flag instead of
the DEBUG macro); update surrounding preprocessor conditionals accordingly so
the symbols query_no_space_length, query_no_space,
GloProxyStats->tsdb_downsample_metrics(), and
ProxySQL_Admin::send_ok_msg_to_client remain reachable in release builds.

In `@lib/ProxySQL_Admin_Stats.cpp`:
- Around line 789-804: Add a NULL check for the global pointer used by
ProxySQL_Admin::stats___tsdb so it returns early when GloProxyStats is NULL, and
wrap the DELETE + subsequent INSERTs into a transaction (BEGIN/COMMIT with error
handling/rollback on failure) so stats_tsdb is updated atomically; specifically,
before calling GloProxyStats->get_tsdb_status() guard on GloProxyStats, and
replace the standalone admindb->execute("DELETE FROM stats_tsdb") plus multiple
admindb->execute(...) calls with a BEGIN, the DELETE, the INSERTs, and a COMMIT
(and ensure rollback on any admindb->execute error).

In `@lib/ProxySQL_Admin.cpp`:
- Around line 3015-3070: The function flush_tsdb_variables___runtime_to_database
currently builds SQL by interpolating variable names/values (via snprintf into a
and b) which allows quotes/special chars and SQL injection and also leaks error
and resultset on the onlyifempty early-return; fix by using
parameterized/prepared statements or proper escaping for inserts into
global_variables and runtime_global_variables (replace/insert queries) instead
of snprintf, e.g. prepare statements once (for REPLACE/INSERT OR IGNORE and
INSERT INTO runtime_global_variables), bind variable_name and variable_value
(use GloProxyStats->get_variable result or empty string), execute and finalize;
also ensure the error char* and SQLite3_result* resultset allocated in the
onlyifempty branch are freed/finalized before any return (free error if
non-NULL, delete resultset if non-NULL) and preserve freeing varnames[i],
varnames and val on all paths so there are no memory leaks.

In `@lib/ProxySQL_RESTAPI_Server.cpp`:
- Around line 416-424: The labels JSON parse can throw json::parse_error and
must be guarded: in the loop that builds j_resp (uses res, res->rows_count,
res->rows[i], j_resp, and local json row), wrap the
json::parse(res->rows[i]->fields[2]) call in a try-catch that catches
nlohmann::json::parse_error (and std::exception as a fallback) and on failure
populate row["labels"] with a safe fallback (e.g., an empty object {}, null, or
the raw string) and optionally log the parse error; ensure the catch does not
rethrow so the REST handler continues and still deletes res after the loop.
- Around line 355-441: The /api/tsdb/query and /api/tsdb/status branches in
render_GET dereference GloProxyStats without null checks; before calling
GloProxyStats->query_tsdb_metrics or GloProxyStats->get_tsdb_status add the same
guard used in the /api/tsdb/metrics branch (if (!GloProxyStats ||
!GloProxyStats->statsdb_disk) return a JSON error {"error":"TSDB not
initialized"} with an appropriate HTTP error code), ensuring you create and
return the http_response (string_response) and call add_headers(response) just
like the metrics path.

In `@lib/ProxySQL_Statistics.cpp`:
- Around line 1798-1836: probe_backend currently measures connect() but uses
SO_RCVTIMEO/SO_SNDTIMEO which do not limit connect(), so an unreachable backend
can block the monitor loop; change probe_backend to perform a non-blocking
connect: set the socket O_NONBLOCK (or use fcntl) before calling connect(), call
connect() and if it returns -1 with errno == EINPROGRESS use poll() or select()
with a configurable timeout to wait for writability, then check
getsockopt(SO_ERROR) to detect success/failure, compute connect_ms from the time
between initiating connect and completion, restore socket flags or close the
socket on error, and ensure you still freeaddrinfo(res) and return the same
probe_result_t values (use the same function name probe_backend and the same
local variables host/port/connect_ms/probe_up/now).
- Around line 1470-1502: The call to execute_statement(char* query, char**
error, ...) can allocate an error string that is never checked or freed; after
calling statsdb_disk->execute_statement(...) check if error is non-NULL and
handle it (log the error and abort/return early from the function or skip
processing), free the error buffer with sqlite3_free(error) (or the appropriate
deallocator used by execute_statement), and only use resultset (and delete
resultset) when no error occurred and resultset is valid; ensure you also delete
resultset in the error path if it was allocated.
- Around line 1539-1576: The local char* error is reused across the three calls
to statsdb_disk->execute_statement (used near variables error, resultset and the
status assignments) which can leak or carry stale error text; before each
execute_statement you must clear any previous error by freeing error if non-NULL
and setting error = NULL, and after each call check error and free it (then set
to NULL) once handled; update the blocks around the SELECT COUNT(DISTINCT...),
SELECT COUNT(*), and SELECT MIN/MAX(...) calls to free/reset error before
reusing and after use to avoid leaks and stale pointers.
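A sketch of the free-and-reset discipline both findings above ask for; exec_stmt is a mock standing in for statsdb_disk->execute_statement, assumed (per the review) to allocate *error with a malloc-family allocator such as strdup. Query strings are illustrative:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Mock: fails (allocates *error) when the query contains "bad".
static void exec_stmt(const char* q, char** error) {
    if (strstr(q, "bad")) *error = strdup("syntax error");
}

// Runs each query, freeing and nulling the shared error pointer before each
// call and after handling, so no stale text or allocation carries over.
// Returns the number of queries that executed cleanly.
int run_queries(const std::vector<std::string>& queries) {
    char* error = nullptr;
    int ok = 0;
    for (const std::string& q : queries) {
        if (error) { free(error); error = nullptr; }  // clear any stale error
        exec_stmt(q.c_str(), &error);
        if (error) {
            fprintf(stderr, "Error on %s : %s\n", q.c_str(), error);
            free(error);    // same allocator family as strdup
            error = nullptr;
            continue;       // skip processing for the failed query
        }
        ok++;
    }
    return ok;
}
```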
- Around line 1588-1597: The timer interval multiplication can overflow 32-bit
int: in ProxySQL_Statistics::tsdb_sampler_timetoget replace the expression
variables.tsdb_sample_interval * 1000000 with a 64-bit operation (e.g.
variables.tsdb_sample_interval * 1000000ULL or static_cast<unsigned long
long>(variables.tsdb_sample_interval) * 1000000ULL) when computing
next_timer_tsdb_sampler; apply the same change where tsdb_monitor_interval is
multiplied (the tsdb monitor timer calculation) so both timer updates use
unsigned long long arithmetic to avoid signed overflow.
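The overflow is easy to see: with a 32-bit int, any interval above ~2147 seconds overflows when multiplied by 1000000. Widening one operand first keeps the microsecond product exact (names here are illustrative, not the real member fields):

```cpp
#include <cassert>
#include <cstdint>

// interval_s * 1000000 as int overflows past INT32_MAX / 1000000 ≈ 2147 s;
// casting one operand to unsigned long long forces 64-bit arithmetic.
unsigned long long next_timer_us(unsigned long long now_us, int interval_s) {
    return now_us + static_cast<unsigned long long>(interval_s) * 1000000ULL;
}
```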
- Around line 1728-1783: The sampler currently calls insert_tsdb_metric
repeatedly causing many separate SQLite transactions; wrap the entire Prometheus
metrics loop in a single DB transaction in
ProxySQL_Statistics::tsdb_sampler_loop (BEGIN TRANSACTION before iterating
metrics, COMMIT after) so all inserts are batched. Adjust insert_tsdb_metric (or
add an overload/flag) so it does not start/commit its own transaction when
called from tsdb_sampler_loop; on any insert failure perform ROLLBACK and log
the error, otherwise COMMIT at the end; ensure you use the same SQLite handle
used by insert_tsdb_metric for BEGIN/COMMIT/ROLLBACK so you get the
single-transaction throughput boost.
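The batching shape can be sketched like this; the exec callback stands in for running SQL on the same SQLite handle insert_tsdb_metric uses and returns false on failure. All statement text is illustrative:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Wraps all inserts in one transaction: any failed insert rolls back the
// whole batch, otherwise a single COMMIT replaces one journal sync per row.
bool batch_insert(const std::vector<std::string>& inserts,
                  const std::function<bool(const std::string&)>& exec) {
    if (!exec("BEGIN TRANSACTION")) return false;
    for (const std::string& q : inserts) {
        if (!exec(q)) {           // any failed insert aborts the whole batch
            exec("ROLLBACK");
            return false;
        }
    }
    return exec("COMMIT");        // one sync instead of one per insert
}
```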

---

Outside diff comments:
In `@src/main.cpp`:
- Around line 972-980: The early call to GloProxyStats->init() in
ProxySQL_Main_init_Admin_module causes double initialization of
ProxySQL_Statistics (which allocates tables_defs_statsdb_mem and
tables_defs_statsdb_disk) and leads to leaks; remove that call so
ProxySQL_Statistics::init() is only invoked from ProxySQL_Admin::init() after
statsdb_disk is configured, ensuring tables_defs_statsdb_mem and
tables_defs_statsdb_disk are allocated exactly once and avoiding redundant
setup.

---

Duplicate comments:
In `@doc/tsdb/embedded_tsdb_overview.md`:
- Line 19: The documentation uses an inconsistent schema prefix "statsdb_disk.*"
— update this reference to match the canonical schema name used elsewhere in the
docs (e.g., the prefix used in the quickstart); locate the line containing
"statsdb_disk.*" and replace it with the canonical schema prefix so all docs
consistently reference the same SQL table namespace.

In `@lib/ProxySQL_Admin.cpp`:
- Around line 8948-8964: The save_tsdb_variables_from_runtime function currently
builds SQL by interpolating raw tsdb variable names and values via snprintf into
query and executing via admindb->execute, which risks SQL injection; change it
to use the same safe approach as flush_tsdb_variables___runtime_to_database():
do not snprintf raw values into the SQL string—either use admindb prepared
statements/bind parameters or escape both variable name and value with the
existing SQL-escape helper used in flush_tsdb_variables___runtime_to_database
(whichever pattern the codebase uses), build the REPLACE INTO
global_variables(...) statement with placeholders, bind the tsdb-%s key and
value, execute via admindb’s prepared API, and ensure you still free val and
tsdb_vars elements as before.
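Parameter binding is the preferred fix described above; where a call site cannot easily use a prepared statement, the minimal escape for a SQLite string literal is doubling embedded single quotes. This helper is an illustrative fallback, not ProxySQL's actual escape routine:

```cpp
#include <cassert>
#include <string>

// Doubles single quotes so the value can be embedded in a '...'-quoted
// SQLite literal without terminating it early.
std::string sql_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        out += c;
        if (c == '\'') out += '\'';  // ' becomes ''
    }
    return out;
}
```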

---

Nitpick comments:
In `@doc/tsdb/embedded_tsdb_quickstart.md`:
- Line 5: The doc shows SET tsdb-enabled='1'; which may incorrectly quote an
integer; update the example to use an unquoted integer (SET tsdb-enabled=1;) or
explicitly document that string quoting is accepted by ProxySQL, and ensure the
example and text around the SET tsdb-enabled command match actual ProxySQL
behavior (verify via ProxySQL variable parser) so the snippet and explanation
reference the exact token SET tsdb-enabled and the accepted value format.

In `@include/ProxySQL_Statistics.hpp`:
- Around line 97-107: The PRIMARY KEY on tsdb_metrics includes labels TEXT which
makes logically-identical JSON objects with different key ordering distinct;
update how labels are stored/compared: canonicalize/serialize labels into a
stable form before insert (e.g., sort keys using nlohmann::json or serialize
from std::map) or change the schema to add a fixed-size labels_hash column
(e.g., LABELS_HASH INTEGER/TEXT) used in the PRIMARY KEY and keep labels TEXT as
non-key payload; adjust insert/update paths in the code that write to
tsdb_metrics/tsdb_metrics_hour (and any callers that build labels) to compute
the canonicalized JSON or hash consistently so semantically-equal label sets do
not produce duplicate rows and consider removing labels from the WITHOUT ROWID
composite key or switch to ROWID+covering index if large label strings impact
B-tree performance.
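A minimal canonicalization sketch: serializing from a std::map emits keys in sorted order, so label sets that differ only in insertion order produce byte-identical text. nlohmann::json objects are also key-sorted and would behave the same; this version avoids the dependency, and escaping of quotes inside keys/values is omitted for brevity:

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds a canonical JSON-style object string from sorted label pairs.
std::string canonical_labels(const std::map<std::string, std::string>& labels) {
    std::string out = "{";
    bool first = true;
    for (const auto& kv : labels) {  // std::map iterates in key order
        if (!first) out += ",";
        out += "\"" + kv.first + "\":\"" + kv.second + "\"";
        first = false;
    }
    return out + "}";
}
```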

In `@lib/Admin_Handler.cpp`:
- Around line 4539-4544: The LIKE pattern for the TSDB variables is inconsistent
with other namespaces: update the query constructed in the block handling SHOW
TSDB VARIABLES (where variables like query_no_space_length, query_no_space,
l_free, l_strdup and the goto __run_query are used) to escape the percent sign
the same way as others (use 'tsdb-\%' instead of 'tsdb-%') so it matches the
file's convention for LIKE patterns.

In `@lib/ProxySQL_Statistics.cpp`:
- Around line 1849-1871: The current code spawns an unbounded OS thread per
backend using std::async in the loop (futures.push_back(std::async(...,
probe_backend, ...))), which can exhaust resources; replace this with a bounded
concurrency approach: create or use a small fixed-size thread pool (or a
counting semaphore) and submit probe_backend tasks to it (keeping task
references), limiting parallelism to a configurable N (e.g., 10–20), then gather
results and call insert_backend_health(hg, host, port, ...) as before; update
the section that builds futures and waits on them to use the pool/queue instead
of directly calling std::async to ensure scalable, bounded probing.
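A minimal bounded-pool sketch of the idea: pool_size workers drain a shared queue, so at most pool_size probes run concurrently regardless of backend count. The real change would submit probe_backend calls as the tasks; the function name and return value here are illustrative:

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Executes all tasks using at most pool_size worker threads.
// Returns the number of tasks executed.
size_t run_bounded(std::vector<std::function<void()>> tasks, size_t pool_size) {
    std::mutex m;
    std::queue<std::function<void()>> pending;
    for (auto& t : tasks) pending.push(std::move(t));
    size_t done = 0;
    std::vector<std::thread> workers;
    for (size_t i = 0; i < pool_size; i++) {
        workers.emplace_back([&] {
            for (;;) {
                std::function<void()> task;
                {
                    std::lock_guard<std::mutex> lk(m);
                    if (pending.empty()) return;
                    task = std::move(pending.front());
                    pending.pop();
                }
                task();  // probe runs outside the lock
                std::lock_guard<std::mutex> lk(m);
                done++;
            }
        });
    }
    for (auto& w : workers) w.join();
    return done;
}
```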

In `@test/tap/tests/test_tsdb_api-t.cpp`:
- Around line 49-82: The call to curl_global_init(CURL_GLOBAL_ALL) inside
http_get causes repeated, non-thread-safe initialization; remove that call from
the http_get function and instead invoke curl_global_init(CURL_GLOBAL_ALL) once
during program startup (e.g., in main before any http_get calls) and pair it
with curl_global_cleanup() at program exit; update http_get to assume global
init has already happened and leave curl_easy_init(), curl_easy_perform(),
curl_easy_cleanup(), and memory handling (chunk, response_out) unchanged.
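The simplest fix is exactly what the comment says: one curl_global_init in main paired with curl_global_cleanup at exit. When there is no single obvious entry point, std::call_once gives the same once-only guarantee; this sketch uses a plain counter instead of libcurl to stay self-contained:

```cpp
#include <cassert>
#include <mutex>

static std::once_flag init_flag;
static int init_count = 0;

// Runs the init body exactly once, no matter how many callers race here.
void ensure_global_init() {
    std::call_once(init_flag, [] {
        // curl_global_init(CURL_GLOBAL_ALL) would go here in the real test
        init_count++;
    });
}
```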
- Around line 113-114: The test currently hardcodes base_url and rest_url with
ports 6080/6070 (variables base_url and rest_url); change this to derive the web
and REST ports from the test harness instead: either read
admin-web/admin-restapi from the CommandLine or query the admin connection
(e.g., via SHOW VARIABLES) and use those values to format the URLs instead of
literal ports; use the existing admin connection/port helpers (e.g.,
cl.admin_port or the CommandLine parsing utilities) to obtain the correct ports
and build base_url and rest_url dynamically so tests respect overridden
configurations.

- Refactored variable flushing to use prepared statements for safety.
- Replaced brittle command parsing with robust string searching.
- Extracted TSDB dashboard HTML/JS to separate source file for maintainability.
- Added NULL guards for GloProxyStats in Admin and REST API.
- Implemented batched metric insertions using transactions for performance.
- Refactored backend probing to use non-blocking connect() with timeout.
- Fixed potential integer overflows in timer calculations.
- Improved error handling and fixed memory leaks in statistics reporting.
- Updated TAP tests to verify UI and REST API functionality.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (2)
lib/TSDB_Dashboard_html.cpp (1)

1-1: Make the embedded HTML buffer const to avoid accidental writes.

String literals are immutable; casting to char* risks UB if modified elsewhere.

Proposed change
-char * TSDB_Dashboard_html_c = (char *) R"HTML(
+const char *TSDB_Dashboard_html_c = R"HTML(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/TSDB_Dashboard_html.cpp` at line 1, The embedded HTML buffer
TSDB_Dashboard_html_c is declared as a non-const char* pointing to a string
literal; change its declaration to a const char* (or const char[]) to prevent
accidental writes and undefined behavior — update the symbol
TSDB_Dashboard_html_c to be const (remove the (char*) cast) so the literal is
treated as immutable throughout the codebase.
lib/ProxySQL_Statistics.cpp (1)

1884-1922: Consider bounding std::async fan‑out for large backend lists.

Launching one async task per backend can create many threads in large deployments; consider a small worker pool or a concurrency cap to keep monitor cycles stable.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/Admin_Handler.cpp`:
- Around line 2007-2011: The length guard causes the "LOAD TSDB VARIABLES FROM
CONFIG" branch to never match because query_no_space_length is compared only to
strlen("LOAD MYSQL VARIABLES FROM CONFIG"); update the conditional around
query_no_space_length and the strncasecmp calls in Admin_Handler.cpp so each
literal is checked against its own length (e.g., use strlen("LOAD TSDB VARIABLES
FROM CONFIG") for that comparison) or remove the single outer length check and
call strncasecmp with each literal's strlen — ensure you keep using
query_no_space_length and query_no_space with strncasecmp to validate full
matches.
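A small helper capturing the suggested fix: each command literal is compared against its own length, so "LOAD TSDB VARIABLES FROM CONFIG" is no longer gated by the length of the MYSQL literal. cmd_is is a hypothetical name, not existing ProxySQL code:

```cpp
#include <cassert>
#include <cstring>
#include <strings.h>

// Case-insensitive full-command match: length check and strncasecmp both use
// the literal's own length.
static bool cmd_is(const char* query, size_t query_len, const char* literal) {
    size_t lit_len = strlen(literal);
    return query_len == lit_len && strncasecmp(query, literal, lit_len) == 0;
}
```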

In `@lib/ProxySQL_Admin.cpp`:
- Around line 8973-8989: The save_tsdb_variables_from_runtime function currently
builds SQL via snprintf (variable a, query) which is injection-prone and breaks
on quotes; change it to use a prepared statement with bound parameters similar
to flush_tsdb_variables___runtime_to_database: prepare a REPLACE INTO
global_variables(variable_name, variable_value) VALUES(?,?) statement once, then
loop over tsdb_vars from GloProxyStats->get_variables_list(), call
GloProxyStats->get_variable(tsdb_vars[i]) and bind tsdb_vars[i] and val as
parameters to the prepared statement (using admindb's prepare/bind/execute API
or equivalent used by flush_tsdb_variables___runtime_to_database), execute
per-row, free the bound value and tsdb_vars entries, and finalize the prepared
statement instead of snprintf/malloc/free of query.
- Around line 3016-3095: In flush_tsdb_variables___runtime_to_database, the
error buffer returned by execute_statement is freed with sqlite3_free(error);
change this to free(error) to match the allocator used by execute_statement
(strdup/malloc) and be consistent with other call sites; update the free call
immediately after proxy_error("Error on %s : %s\n", q, error); and ensure no
other sqlite3_free() remains for that error variable.
- Around line 3010-3014: init_tsdb_variables() (which calls
load_tsdb_variables_to_runtime()) and save_tsdb_variables_from_runtime() both
dereference GloProxyStats without guards; add a defensive NULL check at the
start of load_tsdb_variables_to_runtime() and save_tsdb_variables_from_runtime()
to verify GloProxyStats is non-NULL, log a warning (or error) and early-return
if NULL to avoid crashes during unusual init/shutdown ordering; ensure the
checks reference the global symbol GloProxyStats and keep behavior unchanged
when it is present.

In `@lib/ProxySQL_Statistics.cpp`:
- Around line 1548-1611: In ProxySQL_Statistics::get_tsdb_status ensure any
SQLite3_result* resultset returned by statsdb_disk->execute_statement is deleted
regardless of rows_count to avoid leaks; after each execute_statement call, if
(resultset) { delete resultset; resultset = NULL; } (but still read fields when
rows_count>0 before deleting) and do this for the three queries (COUNT DISTINCT,
COUNT(*), MIN/MAX) so resultsets aren't leaked when the TSDB tables are empty.
- Around line 121-151: The set_variable implementation in
ProxySQL_Statistics::set_variable currently uses atoi which treats non-numeric
input as 0; replace atoi with strtol and validate the end-pointer and errno so
that inputs like "abc" are rejected. In set_variable, parse value with strtol,
check that endptr points to the string terminator (no extra chars), verify no
overflow/underflow via errno, and only then use the parsed integer for
assignments to variables.tsdb_enabled, variables.tsdb_sample_interval,
variables.tsdb_retention_days, variables.tsdb_monitor_enabled, and
variables.tsdb_monitor_interval; return false on any parse/validation failure.
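A strict parse along the lines suggested above; unlike atoi, it rejects "abc", "12x", empty strings, and out-of-range values instead of silently yielding 0. The helper name is illustrative:

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parses a base-10 int; returns false on empty input, trailing junk, or
// overflow, leaving *out untouched in those cases.
bool parse_int_strict(const char* s, int* out) {
    if (s == nullptr || *s == '\0') return false;
    errno = 0;
    char* end = nullptr;
    long v = strtol(s, &end, 10);
    if (errno == ERANGE) return false;   // overflow/underflow
    if (*end != '\0') return false;      // trailing non-numeric characters
    if (v < INT_MIN || v > INT_MAX) return false;
    *out = static_cast<int>(v);
    return true;
}
```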

---

Duplicate comments:
In `@lib/Admin_Handler.cpp`:
- Around line 978-986: The PROXYSQL TSDB DOWNSAMPLE handling was moved out of
the `#ifdef` DEBUG block so the command is available in release builds; ensure the
conditional remains exactly as shown (comparing query_no_space_length to
strlen("PROXYSQL TSDB DOWNSAMPLE") and using strncasecmp against
query_no_space), call GloProxyStats->tsdb_downsample_metrics() when
GloProxyStats is non-null, and then cast pa to ProxySQL_Admin and call
SPA->send_ok_msg_to_client(sess, NULL, 0, query_no_space) before returning
false; verify identifiers query_no_space_length, query_no_space, GloProxyStats,
tsdb_downsample_metrics, ProxySQL_Admin, and send_ok_msg_to_client are correct
and that no leftover `#ifdef` DEBUG guards remain around this block.

In `@lib/ProxySQL_Statistics.cpp`:
- Around line 1711-1717: The error path in the TSDB query helpers leaks the
allocated error string returned by statsdb_disk->execute_statement; after
detecting error (the `error` char* used in the proxy_error call within the
`query_tsdb_metrics` flow) free the error buffer before returning and similarly
free it in the later error branch around the other block (lines ~1738-1748)
where `error` is returned; locate the
`statsdb_disk->execute_statement((char*)query.c_str(), &error, &cols,
&affected_rows, &resultset)` calls and add a call to the appropriate deallocator
(e.g., free(error)) after logging and before any early `return`, and ensure
`resultset` is cleaned up as already done.

---

Nitpick comments:
In `@lib/TSDB_Dashboard_html.cpp`:
- Line 1: The embedded HTML buffer TSDB_Dashboard_html_c is declared as a
non-const char* pointing to a string literal; change its declaration to a const
char* (or const char[]) to prevent accidental writes and undefined behavior —
update the symbol TSDB_Dashboard_html_c to be const (remove the (char*) cast) so
the literal is treated as immutable throughout the codebase.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/ProxySQL_RESTAPI_Server.cpp`:
- Around line 422-434: The loop building JSON rows from res->rows (using
res->rows_count and res->rows[i]->fields[0..3]) can dereference NULL field
pointers; add null-guards for each field before using them (e.g., test if
res->rows[i] and res->rows[i]->fields[k] are non-null) and substitute safe
defaults: use 0 (or "0") for numeric fields assigned to row["ts"] and
row["value"], an empty string for metric, and an empty JSON/object or the raw
empty string for row["labels"] if parsing fails or field is null; ensure
json::parse is only called when fields[2] is non-null and non-empty and still
catch parse_error to fallback to the default.
- Around line 449-451: The 404 fallback returns a string_response without
calling add_headers, so the response lacks Content-Type and CORS headers; update
the else branch that constructs the std::shared_ptr<http_response> via new
string_response("Not Found", http::http_utils::http_not_found) to call
add_headers(response) before returning (use the same pattern as the other
branches: create the response object, call add_headers(response), then return
the response) to ensure Content-Type: application/json and CORS headers are
present.
- Around line 367-384: The error branch can leak the SQLite3_result pointed to
by res returned from GloProxyStats->statsdb_disk->execute_statement; ensure the
result is deleted regardless of error: after calling execute_statement (which
sets error, cols, rows and res), if res is non-null delete it when handling the
error path (or better, perform a single unconditional cleanup delete res after
the if/else before assigning j_resp), referencing the variables res, error, and
the execute_statement call in ProxySQL_RESTAPI_Server.cpp so res is always
freed.
- Around line 437-442: The /api/tsdb/status handler currently only checks
GloProxyStats and returns "TSDB not initialized" if null; make it consistent
with /metrics and /query by also checking GloProxyStats->statsdb_disk before
calling get_tsdb_status(). Update the guard in the status endpoint (the same
handler containing the shown snippet) to: if (!GloProxyStats ||
!GloProxyStats->statsdb_disk) { ... } so the endpoint short-circuits like the
others and avoids relying solely on get_tsdb_status()'s internal check.

Comment on lines +367 to +384
char *error = NULL;
int cols, rows;
SQLite3_result *res = NULL;
GloProxyStats->statsdb_disk->execute_statement("SELECT DISTINCT metric_name FROM tsdb_metrics", &error, &cols, &rows, &res);

if (error) {
    j_resp = json {{"error", error}};
    free(error);
} else {
    std::vector<string> metrics;
    if (res) {
        for (size_t i = 0; i < res->rows_count; i++) {
            metrics.push_back(res->rows[i]->fields[0]);
        }
        delete res;
    }
    j_resp = metrics;
}


⚠️ Potential issue | 🟡 Minor

Potential memory leak: res not freed when error is set.

If execute_statement allocates res and sets error, the if (error) branch never deletes res. Add a delete res in the error path (or unconditionally after both branches).

Proposed fix
             if (error) {
                 j_resp = json {{"error", error}};
                 free(error);
+                if (res) delete res;
             } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_RESTAPI_Server.cpp` around lines 367 - 384, The error branch can
leak the SQLite3_result pointed to by res returned from
GloProxyStats->statsdb_disk->execute_statement; ensure the result is deleted
regardless of error: after calling execute_statement (which sets error, cols,
rows and res), if res is non-null delete it when handling the error path (or
better, perform a single unconditional cleanup delete res after the if/else
before assigning j_resp), referencing the variables res, error, and the
execute_statement call in ProxySQL_RESTAPI_Server.cpp so res is always freed.

Comment on lines +422 to +434
for (size_t i = 0; i < res->rows_count; i++) {
    json row;
    row["ts"] = atol(res->rows[i]->fields[0]);
    row["metric"] = res->rows[i]->fields[1];
    try {
        row["labels"] = json::parse(res->rows[i]->fields[2]);
    } catch (const json::parse_error& e) {
        row["labels"] = res->rows[i]->fields[2];
    }
    row["value"] = atof(res->rows[i]->fields[3]);
    j_resp.push_back(row);
}
delete res;


⚠️ Potential issue | 🟠 Major

NULL field dereference risk in result row construction.

res->rows[i]->fields[0] through fields[3] are passed directly to atol, atof, json::parse, and string construction. If the database returns a NULL column value, these will dereference a null pointer and crash the REST handler. Consider adding a null guard (or at minimum a default) for each field access.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_RESTAPI_Server.cpp` around lines 422 - 434, The loop building
JSON rows from res->rows (using res->rows_count and res->rows[i]->fields[0..3])
can dereference NULL field pointers; add null-guards for each field before using
them (e.g., test if res->rows[i] and res->rows[i]->fields[k] are non-null) and
substitute safe defaults: use 0 (or "0") for numeric fields assigned to
row["ts"] and row["value"], an empty string for metric, and an empty JSON/object
or the raw empty string for row["labels"] if parsing fails or field is null;
ensure json::parse is only called when fields[2] is non-null and non-empty and
still catch parse_error to fallback to the default.
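The guard can be factored into small null-tolerant accessors: a NULL SQL column arrives as a null char*, so every fields[k] access passes through one of these before reaching atol/atof/json::parse. Helper names are illustrative:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Safe defaults for possibly-NULL result fields.
const char* field_or(const char* f, const char* dflt) { return f ? f : dflt; }
long safe_atol(const char* f) { return f ? atol(f) : 0; }
double safe_atof(const char* f) { return f ? atof(f) : 0.0; }
```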

Comment on lines +437 to +442
if (!GloProxyStats) {
    j_resp = json {{"error", "TSDB not initialized"}};
    auto response = std::shared_ptr<http_response>(new string_response(j_resp.dump(), http::http_utils::http_internal_server_error));
    add_headers(response);
    return response;
}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the implementation of get_tsdb_status and surrounding context
rg -n 'get_tsdb_status' lib/ --type=cpp -B 2 -A 15 | head -100

Repository: sysown/proxysql

Length of output: 4023


🏁 Script executed:

#!/bin/bash
# Search for the /api/tsdb/metrics endpoint guard
rg -n '/api/tsdb/metrics' lib/ProxySQL_RESTAPI_Server.cpp -B 2 -A 10

Repository: sysown/proxysql

Length of output: 1271


🏁 Script executed:

#!/bin/bash
# Search for the /api/tsdb/query endpoint guard
rg -n '/api/tsdb/query' lib/ProxySQL_RESTAPI_Server.cpp -B 2 -A 10

Repository: sysown/proxysql

Length of output: 1486


🏁 Script executed:

#!/bin/bash
# Search for statsdb_disk access to understand what it guards against
rg -n 'statsdb_disk' lib/ProxySQL_RESTAPI_Server.cpp -B 2 -A 2

Repository: sysown/proxysql

Length of output: 1039


Inconsistent guard: /api/tsdb/status lacks explicit statsdb_disk check for consistency.

The /metrics and /query endpoints guard on !GloProxyStats || !GloProxyStats->statsdb_disk, but /status only checks !GloProxyStats. While get_tsdb_status() has an internal guard (if (!statsdb_disk) return status; at ProxySQL_Statistics.cpp:1551) that prevents crashes, the missing outer guard makes the code inconsistent and harder to maintain. For defensive consistency with the other endpoints, add the statsdb_disk check.

Proposed fix
-            if (!GloProxyStats) {
+            if (!GloProxyStats || !GloProxyStats->statsdb_disk) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!GloProxyStats) {
j_resp = json {{"error", "TSDB not initialized"}};
auto response = std::shared_ptr<http_response>(new string_response(j_resp.dump(), http::http_utils::http_internal_server_error));
add_headers(response);
return response;
}
if (!GloProxyStats || !GloProxyStats->statsdb_disk) {
j_resp = json {{"error", "TSDB not initialized"}};
auto response = std::shared_ptr<http_response>(new string_response(j_resp.dump(), http::http_utils::http_internal_server_error));
add_headers(response);
return response;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_RESTAPI_Server.cpp` around lines 437 - 442, The /api/tsdb/status
handler currently only checks GloProxyStats and returns "TSDB not initialized"
if null; make it consistent with /metrics and /query by also checking
GloProxyStats->statsdb_disk before calling get_tsdb_status(). Update the guard
in the status endpoint (the same handler containing the shown snippet) to: if
(!GloProxyStats || !GloProxyStats->statsdb_disk) { ... } so the endpoint
short-circuits like the others and avoids relying solely on get_tsdb_status()'s
internal check.

Comment on lines +449 to +451
} else {
return std::shared_ptr<http_response>(new string_response("Not Found", http::http_utils::http_not_found));
}


⚠️ Potential issue | 🟡 Minor

404 response missing add_headers — no Content-Type or CORS headers.

All other response paths call add_headers(response) to set Content-Type: application/json and the CORS header. The 404 fallback skips this, which can cause issues for browser-based API consumers.

Proposed fix
         } else {
-            return std::shared_ptr<http_response>(new string_response("Not Found", http::http_utils::http_not_found));
+            j_resp = json {{"error", "Not Found"}};
+            auto response = std::shared_ptr<http_response>(new string_response(j_resp.dump(), http::http_utils::http_not_found));
+            add_headers(response);
+            return response;
         }
📝 Committable suggestion


Suggested change
} else {
return std::shared_ptr<http_response>(new string_response("Not Found", http::http_utils::http_not_found));
}
} else {
j_resp = json {{"error", "Not Found"}};
auto response = std::shared_ptr<http_response>(new string_response(j_resp.dump(), http::http_utils::http_not_found));
add_headers(response);
return response;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_RESTAPI_Server.cpp` around lines 449 - 451, The 404 fallback
returns a string_response without calling add_headers, so the response lacks
Content-Type and CORS headers; update the else branch that constructs the
std::shared_ptr<http_response> via new string_response("Not Found",
http::http_utils::http_not_found) to call add_headers(response) before returning
(use the same pattern as the other branches: create the response object, call
add_headers(response), then return the response) to ensure Content-Type:
application/json and CORS headers are present.

Summary of changes:
- Fixed allocator mismatches (sqlite3_free vs free) across multiple files.
- Converted manual SQL construction to prepared statements for TSDB variables.
- Resolved memory leaks in statistics and monitoring loops.
- Fixed critical 'LOAD TSDB VARIABLES FROM CONFIG' command mismatch.
- Improved command parsing robustness using string searching.
- Standardized SQL LIKE patterns with escaped wildcards (\%).
- Switched to data-driven variable management in ProxySQL_Statistics.
- Improved web dashboard security with URL encoding and const correctness.
- Clarified documentation regarding configuration lifecycle and schema names.
- Updated TAP tests for better reliability and correct persistence verification.
This ensures statsdb_disk is properly deallocated during ProxySQL_Statistics
destruction, preventing a potential memory leak.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/Admin_Handler.cpp (1)

1948-1970: ⚠️ Potential issue | 🔴 Critical

\\% in LIKE patterns produces a literal \ in SQLite, breaking all variable filtering.

In C string literals "tsdb-\\%", \\ is a single backslash and % is literal — the SQL string becomes 'tsdb-\%'. SQLite's LIKE operator has no default escape character; \ is treated as a literal character, so LIKE 'tsdb-\%' matches only strings containing a literal backslash (e.g. tsdb-\foo). No ProxySQL variable names contain \, so these patterns match zero rows.

This creates regressions for previously working operations (lines 1952, 1961, 1964, 1970) and broken new functionality (lines 1949, 1967). The same bug affects MCP and SHOW commands at lines 2087, 2093, 4186, 4196, 4544, and 4567.

The existing correct patterns (unchanged lines 1943, 1946, 4507–4537) simply use 'pgsql-%', 'mysql-%', etc., with no backslash in the SQL string. The simplest fix is to drop the backslash entirely:

🐛 Proposed fix — remove erroneous backslash from all affected LIKE patterns
-			else if (is_tsdb) {
-				*q = l_strdup("INSERT OR REPLACE INTO main.global_variables SELECT * FROM disk.global_variables WHERE variable_name LIKE 'tsdb-\\%'");
-			}
+			else if (is_tsdb) {
+				*q = l_strdup("INSERT OR REPLACE INTO main.global_variables SELECT * FROM disk.global_variables WHERE variable_name LIKE 'tsdb-%'");
+			}
 			else {
-				*q = l_strdup("INSERT OR REPLACE INTO main.global_variables SELECT * FROM disk.global_variables WHERE variable_name LIKE 'mysql-\\%'");
+				*q = l_strdup("INSERT OR REPLACE INTO main.global_variables SELECT * FROM disk.global_variables WHERE variable_name LIKE 'mysql-%'");
 			}
 		if (is_pgsql) {
-			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'pgsql-\\%'");
+			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'pgsql-%'");
 		}
 		else if (is_genai) {
-			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'genai-\\%'");
+			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'genai-%'");
 		}
 		else if (is_tsdb) {
-			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'tsdb-\\%'");
+			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'tsdb-%'");
 		}
 		else {
-			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'mysql-\\%'");
+			*q = l_strdup("INSERT OR REPLACE INTO disk.global_variables SELECT * FROM main.global_variables WHERE variable_name LIKE 'mysql-%'");
 		}

Apply the same fix to lines 2087, 2093, 4186, 4196:

-			char *q=(char *)"SELECT * FROM global_variables WHERE variable_name LIKE 'mcp-\\%' ORDER BY variable_name";
+			char *q=(char *)"SELECT * FROM global_variables WHERE variable_name LIKE 'mcp-%' ORDER BY variable_name";

And lines 4544, 4567 (SHOW commands):

-	query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'tsdb-\\%' ORDER BY variable_name");
+	query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'tsdb-%' ORDER BY variable_name");

-	query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'mcp-\\%' ORDER BY variable_name");
+	query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'mcp-%' ORDER BY variable_name");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/Admin_Handler.cpp` around lines 1948 - 1970, The LIKE patterns currently
embed an extraneous backslash (e.g., "INSERT OR REPLACE ... WHERE variable_name
LIKE 'tsdb-\\%'") which produces a literal backslash in SQLite and matches
nothing; update all such SQL string literals used in the branches for is_tsdb,
is_pgsql, is_genai, and the default (the l_strdup calls shown) to remove the
backslash so the patterns read 'tsdb-%', 'pgsql-%', 'genai-%', and 'mysql-%';
apply the same replacement for the corresponding MCP/SHOW occurrences that use
"'pgsql-\\%'", "'genai-\\%'", "'tsdb-\\%'", and "'mysql-\\%'" so all variable
filtering works correctly (look for the l_strdup SQL strings and the
is_admin_command_or_alias / is_* branches to locate each spot).
🧹 Nitpick comments (3)
test/tap/tests/test_tsdb_api-t.cpp (2)

113-114: Hardcoded URLs and ports may break in non-default test environments.

Ports 6080 (HTTPS web UI) and 6070 (REST API) are hardcoded. If the test infrastructure uses different admin/web ports, these tests will fail. Consider deriving these from CommandLine or environment variables for portability, consistent with how cl.admin_port is used for the MySQL admin connection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_api-t.cpp` around lines 113 - 114, The test
hardcodes service endpoints via the variables base_url and rest_url (set to
"https://127.0.0.1:6080" and "http://127.0.0.1:6070"), which will break in
non-default environments; update the test to derive these URLs from the same
configuration source used elsewhere (e.g., CommandLine or environment variables)
similar to how cl.admin_port is used for MySQL — construct base_url and rest_url
by reading the appropriate CommandLine flags or env vars (fallback to the
current defaults if unset) so the test honors custom admin/web/REST ports at
runtime.

40-47: drain_results is duplicated verbatim in test_tsdb_variables-t.cpp.

Consider extracting this helper into a shared test utility header (e.g., utils.h) to avoid maintaining identical copies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_tsdb_api-t.cpp` around lines 40 - 47, The helper function
drain_results is duplicated; move its implementation into a shared test utility
header (e.g., utils.h) and have both test files include that header instead of
keeping copies. Create a single non-duplicated symbol (e.g., declare void
drain_results(MYSQL* mysql) in utils.h and define it once in a corresponding
utils.cpp or mark it static inline in utils.h if you prefer header-only), then
remove the duplicate drain_results definitions from test_tsdb_api-t.cpp and
test_tsdb_variables-t.cpp and replace them with `#include` "utils.h" so both tests
call the same drain_results(MYSQL* mysql) implementation.
lib/ProxySQL_Statistics.cpp (1)

134-154: Fragile positional dispatch in set_variable — consider a table-driven approach.

The if (i == 0) ... else if (i == 1) ... chain couples variable assignment to the array index order of tsdb_variable_meta. If the array is ever reordered or extended, these index-based assignments silently break. A pointer-to-member or name-based lookup would be more robust.

That said, the code is correct in its current form and the meta table is small. Flagging as optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Statistics.cpp` around lines 134 - 154, The current
ProxySQL_Statistics::set_variable uses positional dispatch (checking
i==0,1,2,...) tied to tsdb_variable_meta order; replace this with a table-driven
mapping between tsdb_variable_meta entries and the actual variable fields (e.g.,
variables.tsdb_enabled, variables.tsdb_sample_interval,
variables.tsdb_retention_days, variables.tsdb_monitor_enabled,
variables.tsdb_monitor_interval) using either a pointer-to-member (int
ProxySQL_Statistics::Variables::*) or an array of setter lambdas indexed
alongside tsdb_variable_meta; update the loop to, upon name match and range
check, invoke the mapped setter to assign (int)intv and return true, removing
the fragile if (i==...) chain while preserving validation and return semantics
of set_variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/Admin_FlushVariables.cpp`:
- Line 174: The SQL LIKE pattern is incorrectly escaped (using '\%') so SQLite
treats the backslash as literal; update the query construction (the string q in
Admin_FlushVariables.cpp and other occurrences) to use the pattern LIKE
'modname-%' (i.e., replace "\%" with "%") wherever you build queries for modules
(e.g., the variable q built with substr(variable_name,
to_string(modname.length()+2)) vn, variable_value ... LIKE '" + modname +
"-\\%'" ), and apply the same fix to the other identified occurrences (lines
noted in the review) so all LIKE patterns use '-' followed by '%' without a
backslash.

In `@lib/ProxySQL_Admin.cpp`:
- Around line 3016-3046: In flush_tsdb_variables___runtime_to_database, the SQL
LIKE patterns use an escaped backslash ("tsdb-\\%") so the '%' is treated
literally and matches nothing; update all occurrences of that pattern (the
SELECT COUNT(*) query and the DELETE statements executed against
global_variables and runtime_global_variables) to use LIKE 'tsdb-%' so the
wildcard works and rows such as 'tsdb-enabled' and 'tsdb-sample_interval' are
correctly matched and handled.

In `@lib/ProxySQL_Statistics.cpp`:
- Around line 1895-1900: The code is silently discarding errors from
GloAdmin->admindb->execute_statement("SELECT ... runtime_mysql_servers") by
freeing err_msg without logging; update the error handling after the call to
check if err_msg is non-null, log the error message (including the query/context
"runtime_mysql_servers" and any relevant identifiers), then free(err_msg) and
handle flow (e.g., continue/return from the monitoring iteration) so transient
failures are visible in logs; make the change around the err_msg check
immediately following the execute_statement call.

In `@test/tap/tests/test_tsdb_api-t.cpp`:
- Around line 54-57: curl_easy_init() return value is not checked before use;
add a NULL check for curl_handle after calling curl_easy_init() (the variable
and call are curl_handle = curl_easy_init()) and handle failure early (e.g.,
log/print an error and fail/return from the test) before any curl_easy_setopt
calls; ensure all subsequent uses of curl_handle (curl_easy_setopt,
curl_easy_cleanup) are guarded or skipped when curl_handle is NULL.

In `@test/tap/tests/test_tsdb_variables-t.cpp`:
- Around line 192-193: The call to fetch_single_string(admin, "...
Total_Datapoints ...", count) can fail and leave count stale (previous "5"), so
update the test to check fetch_single_string's return and handle failure before
asserting: call fetch_single_string and if it returns false or an error, fail
the test (e.g., ok(false, "... failed to fetch Total_Datapoints: %s", /* error
info */) or set count to "0") and only run ok(atoi(count.c_str()) > 0, ...) when
fetch_single_string succeeded; reference the fetch_single_string call and the
count variable to locate the change.

---

Outside diff comments:
In `@lib/Admin_Handler.cpp`:
- Around line 1948-1970: The LIKE patterns currently embed an extraneous
backslash (e.g., "INSERT OR REPLACE ... WHERE variable_name LIKE 'tsdb-\\%'")
which produces a literal backslash in SQLite and matches nothing; update all
such SQL string literals used in the branches for is_tsdb, is_pgsql, is_genai,
and the default (the l_strdup calls shown) to remove the backslash so the
patterns read 'tsdb-%', 'pgsql-%', 'genai-%', and 'mysql-%'; apply the same
replacement for the corresponding MCP/SHOW occurrences that use "'pgsql-\\%'",
"'genai-\\%'", "'tsdb-\\%'", and "'mysql-\\%'" so all variable filtering works
correctly (look for the l_strdup SQL strings and the is_admin_command_or_alias /
is_* branches to locate each spot).

---

Duplicate comments:
In `@lib/Admin_Handler.cpp`:
- Around line 978-986: The new PROXYSQL TSDB DOWNSAMPLE block performs a
null-guarded call to GloProxyStats->tsdb_downsample_metrics() and then sends OK
via ProxySQL_Admin::send_ok_msg_to_client; ensure the pattern exactly matches
the FLUSH block by keeping the GloProxyStats null check, using the same
session/pa variables (sess, pa), and casting pa to ProxySQL_Admin (SPA) before
calling SPA->send_ok_msg_to_client(sess, NULL, 0, query_no_space); verify
query_no_space_length and strncasecmp usage remains consistent to avoid
off-by-one issues and that no DEBUG-only guards are present.

---

Nitpick comments:
In `@lib/ProxySQL_Statistics.cpp`:
- Around line 134-154: The current ProxySQL_Statistics::set_variable uses
positional dispatch (checking i==0,1,2,...) tied to tsdb_variable_meta order;
replace this with a table-driven mapping between tsdb_variable_meta entries and
the actual variable fields (e.g., variables.tsdb_enabled,
variables.tsdb_sample_interval, variables.tsdb_retention_days,
variables.tsdb_monitor_enabled, variables.tsdb_monitor_interval) using either a
pointer-to-member (int ProxySQL_Statistics::Variables::*) or an array of setter
lambdas indexed alongside tsdb_variable_meta; update the loop to, upon name
match and range check, invoke the mapped setter to assign (int)intv and return
true, removing the fragile if (i==...) chain while preserving validation and
return semantics of set_variable.

In `@test/tap/tests/test_tsdb_api-t.cpp`:
- Around line 113-114: The test hardcodes service endpoints via the variables
base_url and rest_url (set to "https://127.0.0.1:6080" and
"http://127.0.0.1:6070"), which will break in non-default environments; update
the test to derive these URLs from the same configuration source used elsewhere
(e.g., CommandLine or environment variables) similar to how cl.admin_port is
used for MySQL — construct base_url and rest_url by reading the appropriate
CommandLine flags or env vars (fallback to the current defaults if unset) so the
test honors custom admin/web/REST ports at runtime.
- Around line 40-47: The helper function drain_results is duplicated; move its
implementation into a shared test utility header (e.g., utils.h) and have both
test files include that header instead of keeping copies. Create a single
non-duplicated symbol (e.g., declare void drain_results(MYSQL* mysql) in utils.h
and define it once in a corresponding utils.cpp or mark it static inline in
utils.h if you prefer header-only), then remove the duplicate drain_results
definitions from test_tsdb_api-t.cpp and test_tsdb_variables-t.cpp and replace
them with `#include` "utils.h" so both tests call the same drain_results(MYSQL*
mysql) implementation.

Comment on lines 1895 to 1900
GloAdmin->admindb->execute_statement(
"SELECT hostgroup_id, hostname, port FROM runtime_mysql_servers",
&err_msg, &cols, &affected_rows, &resultset);
if (err_msg) {
free(err_msg);
}

⚠️ Potential issue | 🟡 Minor

Query error from admindb is silently swallowed.

If the runtime_mysql_servers query fails (line 1897), err_msg is freed on line 1899 but neither logged nor acted upon. Since the monitor loop runs periodically, a transient error here should be logged to aid debugging.

Proposed fix
     if (err_msg) {
+        proxy_error("tsdb_monitor_loop: failed to query runtime_mysql_servers: %s\n", err_msg);
         free(err_msg);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Statistics.cpp` around lines 1895 - 1900, The code is silently
discarding errors from GloAdmin->admindb->execute_statement("SELECT ...
runtime_mysql_servers") by freeing err_msg without logging; update the error
handling after the call to check if err_msg is non-null, log the error message
(including the query/context "runtime_mysql_servers" and any relevant
identifiers), then free(err_msg) and handle flow (e.g., continue/return from the
monitoring iteration) so transient failures are visible in logs; make the change
around the err_msg check immediately following the execute_statement call.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/Admin_Handler.cpp (1)

4704-4709: LIKE 'tsdb-%' is functionally equivalent but inconsistent with sibling SHOW commands.

All existing SHOW <subsystem> VARIABLES handlers use the pattern LIKE 'subsystem-\%' (e.g., lines 4669, 4677, 4684, 4699). In C string literals \% is an unrecognized escape sequence, which compilers (GCC/Clang) reduce to a plain % (typically emitting an "unknown escape sequence" warning). The new TSDB form 'tsdb-%' is therefore correct and arguably less error-prone, but differs in style. Consider aligning with the existing pattern for consistency.

♻️ Suggested alignment
-		query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'tsdb-%' ORDER BY variable_name");
+		query = l_strdup("SELECT variable_name AS Variable_name, variable_value AS Value FROM global_variables WHERE variable_name LIKE 'tsdb-\%' ORDER BY variable_name");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/Admin_Handler.cpp` around lines 4704 - 4709, The new TSDB handler uses
"LIKE 'tsdb-%'" which is correct but inconsistent with sibling handlers that use
the escaped pattern "LIKE 'subsystem-\\%'", so update the l_strdup call in the
TSDB branch to use "SELECT ... WHERE variable_name LIKE 'tsdb-\\%'" (i.e.,
replace "LIKE 'tsdb-%'" with "LIKE 'tsdb-\\%'" in the string passed to l_strdup)
so the literal matches the style used by the other SHOW <subsystem> VARIABLES
handlers (refer to variables: query, l_strdup, l_free, query_no_space_length,
query_no_space and label __run_query).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/ProxySQL_Admin.cpp`:
- Around line 2521-2536: The admin thread runs TSDB maintenance functions
(tsdb_sampler_loop, tsdb_monitor_loop, tsdb_downsample_metrics,
tsdb_retention_cleanup) synchronously and can stall due to blocking DB I/O and
futures.get() calls; move these loops off the admin thread by scheduling them on
a dedicated worker thread or threadpool (invoke via a background worker runner
instead of calling GloProxyStats->..._loop() directly), and/or add per-operation
timeouts and non-blocking waits (replace futures.get() with timed_wait or
future.wait_for and enforce per-query timeouts and max retries), ensure DB
retry/backoff logic (currently only 100µs sleep) uses exponential backoff and
respects an overall operation timeout to avoid indefinite blocking.

---

Duplicate comments:
In `@lib/ProxySQL_Statistics.cpp`:
- Around line 164-178: The numeric parsing in ProxySQL_Statistics::set_variable
is too permissive: strtol currently accepts empty strings and ignores overflow;
update the parsing to first clear errno, call strtol(value,&endptr,10), then
reject inputs where endptr == value (no digits parsed) or *endptr != '\0'
(trailing chars), and check errno == ERANGE to detect overflow/underflow (also
optionally compare intv against LONG_MIN/LONG_MAX); on any of these failures
return false. Keep the existing bounds check against
tsdb_variable_meta[i].min_val/max_val and assign to variables.tsdb_enabled,
tsdb_sample_interval, tsdb_retention_days, tsdb_monitor_enabled,
tsdb_monitor_interval only after successful, range-checked conversion.

---

Nitpick comments:
In `@lib/Admin_Handler.cpp`:
- Around line 4704-4709: The new TSDB handler uses "LIKE 'tsdb-%'" which is
correct but inconsistent with sibling handlers that use the escaped pattern
"LIKE 'subsystem-\\%'", so update the l_strdup call in the TSDB branch to use
"SELECT ... WHERE variable_name LIKE 'tsdb-\\%'" (i.e., replace "LIKE 'tsdb-%'"
with "LIKE 'tsdb-\\%'" in the string passed to l_strdup) so the literal matches
the style used by the other SHOW <subsystem> VARIABLES handlers (refer to
variables: query, l_strdup, l_free, query_no_space_length, query_no_space and
label __run_query).

Comment on lines +2521 to +2536
if (GloProxyStats->system_memory_timetoget(curtime)) {
GloProxyStats->system_memory_sets();
}
#endif
if (GloProxyStats->tsdb_sampler_timetoget(curtime)) {
GloProxyStats->tsdb_sampler_loop();
}
if (GloProxyStats->tsdb_downsample_timetoget(curtime)) {
GloProxyStats->tsdb_downsample_metrics();
}
if (GloProxyStats->tsdb_monitor_timetoget(curtime)) {
GloProxyStats->tsdb_monitor_loop();
}
if (GloProxyStats->tsdb_retention_timetoget(curtime)) {
GloProxyStats->tsdb_retention_cleanup();
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, search for the TSDB loop function definitions
rg -n "tsdb_(sampler_loop|downsample_metrics|monitor_loop|retention_cleanup)" -g '*.cpp' -g '*.h' | head -30

Repository: sysown/proxysql

Length of output: 841


🏁 Script executed:

# Get file size to understand scope
wc -l lib/ProxySQL_Statistics.cpp

# Read the TSDB loop implementations
echo "=== tsdb_downsample_metrics (1494) ==="
sed -n '1494,1542p' lib/ProxySQL_Statistics.cpp

echo ""
echo "=== tsdb_retention_cleanup (1543) ==="
sed -n '1543,1650p' lib/ProxySQL_Statistics.cpp

echo ""
echo "=== tsdb_sampler_loop (1785) ==="
sed -n '1785,1912p' lib/ProxySQL_Statistics.cpp

echo ""
echo "=== tsdb_monitor_loop (1913) ==="
sed -n '1913,2050p' lib/ProxySQL_Statistics.cpp

Repository: sysown/proxysql

Length of output: 11849


🏁 Script executed:

# Check statsdb_disk execute method to see if it has timeout mechanisms
rg -n "class SQLite3_DB|execute_statement|execute\(" -g '*.h' lib/ | grep -E "(execute_statement|execute\()" | head -20

# Look for statsdb_disk definition and timeout configuration
rg -n "statsdb_disk" lib/ProxySQL_Statistics.cpp | head -10

# Check for any timeout or limit constants
rg -n "timeout|TIMEOUT|LIMIT|limit" lib/ProxySQL_Statistics.cpp | grep -i "tsdb\|sampler\|downsample\|retention\|monitor" | head -20

Repository: sysown/proxysql

Length of output: 811


🏁 Script executed:

# Find SQLite3_DB class definition and its execute methods
fd -e h -e cpp | xargs rg -l "class SQLite3_DB|class SQLite3DB" | head -5

# Search for execute method implementation
rg -n "void SQLite3DB::execute\(|SQLite3_result.*execute_statement\(" -g '*.cpp' -g '*.h' -A 10 | head -40

Repository: sysown/proxysql

Length of output: 3191


🏁 Script executed:

# Read the full execute_statement implementation
sed -n '294,350p' lib/sqlite3db.cpp

# Also check the execute method
rg -n "^void SQLite3DB::execute\(" -A 30 lib/sqlite3db.cpp | head -40

# Check for any locking mechanisms
rg -n "pthread_mutex|lock\|unlock" lib/sqlite3db.cpp | head -20

Repository: sysown/proxysql

Length of output: 2245


🏁 Script executed:

# Find USLEEP_SQLITE_LOCKED value to understand retry timing
rg -n "USLEEP_SQLITE_LOCKED" -g '*.h' -g '*.cpp'

# Check if there are any per-loop timeouts or limits configured
rg -n "tsdb_.*_interval|tsdb_.*_timeout|tsdb_.*_limit" -g '*.h' -g '*.cpp' | head -20

# Look for variables config related to TSDB
rg -n "tsdb_enabled|tsdb_sample_interval|tsdb_monitor_enabled" -g '*.h' -g '*.cpp' | head -20

Repository: sysown/proxysql

Length of output: 3662


TSDB maintenance loops can block the admin thread—tsdb_monitor_loop is a confirmed blocker.

  • tsdb_monitor_loop() calls futures.get() on all probes, which blocks until completion. Each probe has a 1-second timeout; with many backends, the admin thread stalls for seconds.
  • tsdb_sampler_loop() may issue many INSERTs in sequence without per-operation timeouts.
  • tsdb_downsample_metrics() and tsdb_retention_cleanup() execute potentially expensive aggregation and DELETE queries synchronously.
  • All database operations retry only on SQLite lock contention (100μs sleep), with no operation-level timeouts.

Consider offloading these loops to dedicated worker threads or implementing per-operation timeouts to prevent admin I/O stalls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/ProxySQL_Admin.cpp` around lines 2521 - 2536, The admin thread runs TSDB
maintenance functions (tsdb_sampler_loop, tsdb_monitor_loop,
tsdb_downsample_metrics, tsdb_retention_cleanup) synchronously and can stall due
to blocking DB I/O and futures.get() calls; move these loops off the admin
thread by scheduling them on a dedicated worker thread or threadpool (invoke via
a background worker runner instead of calling GloProxyStats->..._loop()
directly), and/or add per-operation timeouts and non-blocking waits (replace
futures.get() with timed_wait or future.wait_for and enforce per-query timeouts
and max retries), ensure DB retry/backoff logic (currently only 100µs sleep)
uses exponential backoff and respects an overall operation timeout to avoid
indefinite blocking.

Signed-off-by: René Cannaò <rene@proxysql.com>
@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot
Security Rating on New Code: E (required ≥ A)

See analysis details on SonarQube Cloud


@renecannao renecannao merged commit 9e157d2 into v3.0 Feb 19, 2026
12 of 15 checks passed