feat: add query phase tracking for SHOW QUERIES by yihaoDeng · Pull Request #34706 · taosdata/TDengine

yihaoDeng · 2026-03-07T11:54:07Z

Add current_phase and action_start_time fields to track query execution stages:

0=query, 1=fetch, 2=query_callback, 3=fetch_callback

This helps monitor what phase a query is in and how long each phase takes.
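
As a rough illustration of how a monitoring script might use the two fields, the sketch below computes how long a query has been in its current phase. `phase_elapsed_ms` is a hypothetical helper that is not part of this PR, and it assumes `action_start_time` is an epoch timestamp in milliseconds, as the struct comment in `tmsg.h` states.

```python
import time


def phase_elapsed_ms(action_start_time_ms, now_ms=None):
    """Return how long the current phase has been running, in milliseconds.

    Hypothetical helper: assumes action_start_time is epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Clamp at zero so clock skew between client and observer
    # never produces a negative duration.
    return max(0, now_ms - action_start_time_ms)
```

Passing `now_ms` explicitly keeps the helper deterministic in tests; in a live script it can be omitted.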

gemini-code-assist · 2026-03-07T11:54:27Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of query execution by adding detailed phase tracking. It introduces new fields to monitor the current stage of a query and the timestamp when that stage began, making this information accessible through the SHOW QUERIES command. This improvement allows for better performance analysis, bottleneck identification, and a clearer understanding of query lifecycle, ultimately aiding in debugging and optimization efforts.

Highlights

Query Phase Tracking: Introduced currentPhase and actionStartTime fields in SQueryDesc and SRequestObj to track the current execution stage and its start time for queries.
Visibility in SHOW QUERIES: Exposed the new current_phase and action_start_time fields in the SHOW QUERIES output, providing users with real-time insights into query progress.
Phase Lifecycle Management: Implemented logic to update the currentPhase and actionStartTime during various stages of query execution, including initial query, data fetching, and their respective callbacks.
Comprehensive Testing: Added a new Python test suite to validate the functionality, schema, and accuracy of the query phase tracking feature under different scenarios, including long-running and concurrent queries.

Changelog

include/common/tmsg.h
- Added currentPhase and actionStartTime fields to the SQueryDesc structure.
source/client/inc/clientInt.h
- Added currentPhase and actionStartTime fields to the SRequestObj structure.
source/client/src/clientEnv.c
- Initialized currentPhase and actionStartTime for new SRequestObj instances.
- Removed an unnecessary memory allocation for msgBuf.
source/client/src/clientHb.c
- Populated currentPhase and actionStartTime in SQueryDesc from the SRequestObj during heartbeat request building.
source/client/src/clientImpl.c
- Updated currentPhase and actionStartTime when doRequestCallback is invoked for query and fetch callbacks.
source/client/src/clientMain.c
- Set currentPhase to 'query' and recorded actionStartTime at the beginning of doAsyncQuery.
- Set currentPhase to 'fetch' and recorded actionStartTime at the beginning of taos_fetch_rows_a.
source/common/src/msg/tmsg.c
- Modified tSerializeSClientHbReq to serialize the new currentPhase and actionStartTime fields.
- Modified tDeserializeSClientHbReq to deserialize the new currentPhase and actionStartTime fields.
source/common/src/systable.c
- Added current_phase (VARCHAR) and action_start_time (TIMESTAMP) columns to the querySchema for SHOW QUERIES.
source/dnode/mnode/impl/src/mndProfile.c
- Implemented logic to convert currentPhase integer to a human-readable string and added it to the SHOW QUERIES output.
- Added actionStartTime to the SHOW QUERIES output.
test/cases/24-Users/test_query_phase_tracking.py
- Added a new Python test file test_query_phase_tracking.py to validate the query phase tracking feature.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces query phase tracking for SHOW QUERIES by adding current_phase and action_start_time fields. The changes are well-contained and correctly implemented across the data structures, client logic, and server-side display logic. My main suggestion is to introduce an enum for the query phases to replace the magic numbers currently used, which will enhance code readability and maintainability. I've also provided a suggestion to strengthen the new test case for timing accuracy.

Note: Security Review did not run due to the size of the PR.

gemini-code-assist · 2026-03-07T11:56:42Z

include/common/tmsg.h

+  int32_t  currentPhase;     // Current execution phase: 0=query, 1=fetch, 2=query callback, 3=fetch callback
+  int64_t  actionStartTime;   // When current action started (timestamp precision ms)


To improve code readability and maintainability, it's better to define an enum for the query phases instead of using magic numbers. This enum can then be used across the codebase where these phases are referenced.

I suggest adding the following enum definition before the SQueryDesc struct. Subsequent comments will suggest where to use this enum.

typedef enum EQueryPhase {
  QUERY_PHASE_NOT_STARTED = -1,
  QUERY_PHASE_QUERY = 0,
  QUERY_PHASE_FETCH = 1,
  QUERY_PHASE_QUERY_CALLBACK = 2,
  QUERY_PHASE_FETCH_CALLBACK = 3,
} EQueryPhase;

gemini-code-assist · 2026-03-07T11:56:42Z

source/client/src/clientMain.c

+  pRequest->currentPhase = 0;  // 0 = query phase
+  pRequest->actionStartTime = taosGetTimestampMs();


As suggested in the comment on include/common/tmsg.h, this magic number 0 should be replaced with the QUERY_PHASE_QUERY enum value for better code clarity.

  pRequest->currentPhase = QUERY_PHASE_QUERY;  // 0 = query phase
  pRequest->actionStartTime = taosGetTimestampMs();

gemini-code-assist · 2026-03-07T11:56:42Z

source/dnode/mnode/impl/src/mndProfile.c

+    switch (pQuery->currentPhase) {
+      case 0: phaseStr = "query"; break;
+      case 1: phaseStr = "fetch"; break;
+      case 2: phaseStr = "query_callback"; break;
+      case 3: phaseStr = "fetch_callback"; break;
+      default: phaseStr = "unknown"; break;
+    }


To avoid using magic numbers and improve readability, please replace the case values with the suggested EQueryPhase enum values.

    switch (pQuery->currentPhase) {
      case QUERY_PHASE_QUERY: phaseStr = "query"; break;
      case QUERY_PHASE_FETCH: phaseStr = "fetch"; break;
      case QUERY_PHASE_QUERY_CALLBACK: phaseStr = "query_callback"; break;
      case QUERY_PHASE_FETCH_CALLBACK: phaseStr = "fetch_callback"; break;
      default: phaseStr = "unknown"; break;
    }
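
The same integer-to-name mapping could be mirrored on the test side, so that a test can check `current_phase` strings against one definition instead of scattered literals. This is only a sketch: `QueryPhase` and `phase_name` are hypothetical names, and the values are copied from the integers listed in the PR description.

```python
from enum import IntEnum


class QueryPhase(IntEnum):
    # Values mirror the integers listed in the PR description.
    QUERY = 0
    FETCH = 1
    QUERY_CALLBACK = 2
    FETCH_CALLBACK = 3


def phase_name(code):
    """Map a raw phase integer to the string shown by SHOW QUERIES."""
    try:
        return QueryPhase(code).name.lower()
    except ValueError:
        # Same fallback as the default branch of the C switch.
        return "unknown"
```

A test could then assert that every observed `current_phase` value is either `"unknown"` or one of `phase_name(p)` for `p` in `QueryPhase`.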

gemini-code-assist · 2026-03-07T11:56:42Z

test/cases/24-Users/test_query_phase_tracking.py

+                tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
+                # The query time should be between before and after (with some tolerance)
+                # Note: The timestamp might be in different precision, so we just verify it's reasonable


The test test_phase_timing_accuracy currently only logs the timestamps but doesn't perform any verification. To make the test more effective, an assertion should be added to verify that action_start_time is within the expected range.

Suggested change

-                tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
-                # The query time should be between before and after (with some tolerance)
-                # Note: The timestamp might be in different precision, so we just verify it's reasonable
+                tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
+                # The query time should be between before and after (with some tolerance)
+                assert before_time <= query_time <= after_time

Copilot

Pull request overview

This PR adds query execution phase tracking for the SHOW QUERIES command in TDengine. It introduces two new columns (current_phase and action_start_time) to the query schema, tracking which execution stage (query, fetch, query_callback, fetch_callback) a query is in and when that stage began.

Changes:

New currentPhase and actionStartTime fields added to SRequestObj and SQueryDesc structs, with lifecycle tracking at each execution phase
Heartbeat serialization/deserialization updated to transmit the new fields to the MNode, and MNode updated to pack them into the SHOW QUERIES block
New test file added to validate the new columns and phase values

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `include/common/tmsg.h` | Adds `currentPhase` and `actionStartTime` fields to `SQueryDesc` |
| `source/client/inc/clientInt.h` | Adds the same fields to `SRequestObj` |
| `source/client/src/clientEnv.c` | Initializes new fields; accidentally removes `msgBuf` allocation |
| `source/client/src/clientMain.c` | Sets phase=0 at query start, phase=1 at fetch start |
| `source/client/src/clientImpl.c` | Transitions phase to 2/3 in `doRequestCallback` |
| `source/client/src/clientHb.c` | Copies new fields into heartbeat descriptor |
| `source/common/src/msg/tmsg.c` | Encodes/decodes new fields in heartbeat (breaking wire change) |
| `source/common/src/systable.c` | Adds two new columns to `querySchema` |
| `source/dnode/mnode/impl/src/mndProfile.c` | Packs phase string and start time into `SHOW QUERIES` result block |
| `test/cases/24-Users/test_query_phase_tracking.py` | New test file for the feature |

Comments suppressed due to low confidence (1)

source/client/src/clientEnv.c:604

The line (*pRequest)->msgBuf = taosMemoryCalloc(1, ERROR_MSG_BUF_DEFAULT_SIZE); was accidentally removed from createRequest(). Since *pRequest is zero-initialized via taosMemoryCalloc, msgBuf will always be NULL, causing the null-check on line 601 to always trigger and createRequest to always fail. This breaks all query requests, as msgBuf is used by the parse context in multiple places (e.g., clientMain.c:1964, clientImpl.c:378, clientImpl.c:600).

  if (NULL == (*pRequest)->msgBuf) {
    code = terrno;
    goto _return;
  }

Copilot · 2026-03-07T11:57:52Z

source/common/src/msg/tmsg.c

+          code = tDecodeI32(pDecoder, &desc.currentPhase);
+          TAOS_CHECK_GOTO(code, &line, _error);
+
+          code = tDecodeI64(pDecoder, &desc.actionStartTime);
+          TAOS_CHECK_GOTO(code, &line, _error);


The heartbeat serialization format change is backward-incompatible. The new currentPhase and actionStartTime fields are encoded unconditionally in the middle of the per-query descriptor stream (after subPlanNum, before snum). There is no tDecodeIsEnd guard on the decoder side to handle messages from older clients. If a new server receives a heartbeat from an old client (which doesn't encode these fields), it will try to read 4+8 = 12 bytes that don't exist, causing a decode failure. This will break rolling upgrades where clients and servers may be at different versions. The new fields should be added at the end of the per-query descriptor and guarded with tDecodeIsEnd checks on deserialization, following the pattern used elsewhere in the same file (e.g., lines 463–476).

Suggested change

-          code = tDecodeI32(pDecoder, &desc.currentPhase);
-          TAOS_CHECK_GOTO(code, &line, _error);
-          code = tDecodeI64(pDecoder, &desc.actionStartTime);
-          TAOS_CHECK_GOTO(code, &line, _error);
+          /* currentPhase and actionStartTime were added later; guard with tDecodeIsEnd
+           * so we can still decode heartbeats from older clients that don't send them.
+           */
+          if (!tDecodeIsEnd(pDecoder)) {
+            code = tDecodeI32(pDecoder, &desc.currentPhase);
+            TAOS_CHECK_GOTO(code, &line, _error);
+            code = tDecodeI64(pDecoder, &desc.actionStartTime);
+            TAOS_CHECK_GOTO(code, &line, _error);
+          }
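
The guard pattern can be shown in isolation with a small, self-contained sketch. It does not use TDengine's encoder: the wire layout (one existing `int32` followed by the optional `int32` + `int64` pair), the function names, and the defaults of `-1` and `0` for a missing pair are all assumptions made for illustration. The point is only that appended fields are read when bytes remain and defaulted otherwise.

```python
import struct

# Hypothetical layout: one pre-existing int32, then the optional new fields.
_BASE = struct.Struct("<i")    # e.g. a sub-plan count (4 bytes)
_EXTRA = struct.Struct("<iq")  # int32 phase + int64 start time (12 bytes)


def encode_desc(sub_plan_num, phase=None, start_ms=None):
    """Encode a descriptor; an 'old client' omits the trailing fields."""
    buf = _BASE.pack(sub_plan_num)
    if phase is not None:
        buf += _EXTRA.pack(phase, start_ms)
    return buf


def decode_desc(buf):
    """Decode a descriptor, tolerating payloads without the new fields."""
    (sub_plan_num,) = _BASE.unpack_from(buf, 0)
    offset = _BASE.size
    phase, start_ms = -1, 0  # assumed defaults when the sender is older
    # Equivalent of an "is end" check: read the new fields only if present.
    if len(buf) - offset >= _EXTRA.size:
        phase, start_ms = _EXTRA.unpack_from(buf, offset)
    return {"sub_plan_num": sub_plan_num, "phase": phase, "start_ms": start_ms}
```

This only works because the new fields sit at the end of the record. Fields inserted in the middle of the stream, as in the original diff, cannot be skipped this way.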

Copilot · 2026-03-07T11:57:52Z

test/cases/24-Users/test_query_phase_tracking.py

+        if tdSql.getRows() > 0:
+            col_names = [row[0] for row in tdSql.getColNames()]
+            time_idx = col_names.index("action_start_time") if "action_start_time" in col_names else -1
+
+            if time_idx >= 0:
+                query_time = tdSql.getData(0, time_idx)
+                # Convert to milliseconds if in different unit
+                tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
+                # The query time should be between before and after (with some tolerance)
+                # Note: The timestamp might be in different precision, so we just verify it's reasonable


The test_phase_timing_accuracy test collects the timing values and logs them but never performs any actual assertion on them. The docstring states it will "Verify action_start_time is within reasonable range of recorded time," but the test body only logs the values without checking anything. This means the test will always pass regardless of correctness, providing no real validation of the timing behavior.

Suggested change

-        if tdSql.getRows() > 0:
-            col_names = [row[0] for row in tdSql.getColNames()]
-            time_idx = col_names.index("action_start_time") if "action_start_time" in col_names else -1
-            if time_idx >= 0:
-                query_time = tdSql.getData(0, time_idx)
-                # Convert to milliseconds if in different unit
-                tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
-                # The query time should be between before and after (with some tolerance)
-                # Note: The timestamp might be in different precision, so we just verify it's reasonable
+        # Ensure we have at least one query to inspect
+        assert tdSql.getRows() > 0, "show queries returned no rows to validate action_start_time"
+        col_names = [row[0] for row in tdSql.getColNames()]
+        assert "action_start_time" in col_names, "show queries output has no action_start_time column"
+        time_idx = col_names.index("action_start_time")
+        query_time = tdSql.getData(0, time_idx)
+        # Log values for debugging
+        tdLog.info(f"Before: {before_time}, Query: {query_time}, After: {after_time}")
+        # Convert query_time to int if possible (some clients may return string/decimal)
+        try:
+            query_time_int = int(query_time)
+        except (TypeError, ValueError):
+            raise AssertionError(f"action_start_time has non-numeric value: {query_time!r}")
+        # Allow some tolerance in case of slight clock/precision differences (e.g. ±60s)
+        tolerance_ms = 60 * 1000
+        lower_bound = before_time - tolerance_ms
+        upper_bound = after_time + tolerance_ms
+        assert lower_bound <= query_time_int <= upper_bound, (
+            f"action_start_time {query_time_int} not within expected range "
+            f"[{lower_bound}, {upper_bound}] based on before/after timestamps"
+        )
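
One caveat with `int(query_time)`: if the connector returns a `datetime` for the TIMESTAMP column, the conversion raises `TypeError` and the test fails for the wrong reason. The sketch below normalizes either representation to epoch milliseconds before the range check. `to_epoch_ms` and `within_window` are hypothetical helpers, and treating naive datetimes as UTC is an assumption that would need adjusting if the connector returns local time.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(value):
    """Normalize a datetime, int, or numeric string to epoch milliseconds."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are UTC; adjust for local time.
            value = value.replace(tzinfo=timezone.utc)
        # Integer division by a timedelta avoids float rounding.
        return (value - _EPOCH) // timedelta(milliseconds=1)
    return int(value)


def within_window(value, before_ms, after_ms, tolerance_ms=0):
    """Check that a timestamp lies in [before - tol, after + tol]."""
    ts = to_epoch_ms(value)
    return before_ms - tolerance_ms <= ts <= after_ms + tolerance_ms
```

With these helpers the final assertion reduces to `assert within_window(query_time, before_time, after_time, tolerance_ms)`, provided `before_time` and `after_time` are themselves recorded in milliseconds.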

Copilot · 2026-03-07T11:57:52Z

test/cases/24-Users/test_query_phase_tracking.py

@@ -0,0 +1,221 @@
+import time
+import platform


The platform module is imported but never used in this file. This is an unused import that adds unnecessary noise.

Suggested change

-import platform

Copilot · 2026-03-07T11:57:52Z

test/cases/24-Users/test_query_phase_tracking.py

@@ -0,0 +1,221 @@
+import time
+import platform
+from new_test_framework.utils import tdLog, tdSql, sc, clusterComCheck, tdCom


The imports sc, clusterComCheck, and tdCom from new_test_framework.utils are never used in this file. These are unused imports.

Suggested change

-from new_test_framework.utils import tdLog, tdSql, sc, clusterComCheck, tdCom
+from new_test_framework.utils import tdLog, tdSql

Use EQueryExecPhase enum (none/parse/catalog/plan/schedule/execute/fetch/done) instead of raw integer phases. Fix field name mismatches, serialization order, and backward-compatible deserialization for SHOW QUERIES phase tracking. Made-with: Cursor

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

feat: add query phase tracking for SHOW QUERIES

54645cb

Add current_phase and action_start_time fields to track query execution stages: - 0=query, 1=fetch, 2=query_callback, 3=fetch_callback This helps monitor what phase a query is in and how long each phase takes.

Copilot AI review requested due to automatic review settings March 7, 2026 11:54

yihaoDeng requested review from a team, dapan1121, guanshengliang and hzcheng as code owners March 7, 2026 11:54

taosdata-bot bot added the team engine label Mar 7, 2026

Copilot started reviewing on behalf of yihaoDeng March 7, 2026 11:54

gemini-code-assist bot reviewed Mar 7, 2026

Copilot AI reviewed Mar 7, 2026

yihaoDeng added 2 commits March 10, 2026 18:00

fix invalid rm

a9bd508

Copilot AI review requested due to automatic review settings March 10, 2026 14:28

Copilot started reviewing on behalf of yihaoDeng March 10, 2026 14:29

feat: add startTs/endTs timing fields to sub-task in SHOW QUERIES

f98d340

Extend SQuerySubDesc with startTs/endTs from scheduler task profile. Update sub_status format to tid:status:startMs:endMs for each sub-task. Backward-compatible serialization via tDecodeIsEnd guard. Made-with: Cursor
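
For anyone consuming the new `sub_status` value from a script, a parser might look like the sketch below. The commit message only gives the per-sub-task layout `tid:status:startMs:endMs`. The comma separator between sub-tasks, the decimal `tid`, and the names `SubTaskStatus` and `parse_sub_status` are assumptions made for illustration.

```python
from typing import NamedTuple


class SubTaskStatus(NamedTuple):
    tid: int
    status: str
    start_ms: int
    end_ms: int


def parse_sub_status(text, sep=","):
    """Parse 'tid:status:startMs:endMs' entries joined by `sep`.

    The separator between sub-tasks is an assumption; pass a different
    `sep` if the server joins entries another way.
    """
    entries = []
    for item in text.split(sep):
        item = item.strip()
        if not item:
            continue
        tid, status, start_ms, end_ms = item.split(":")
        entries.append(SubTaskStatus(int(tid), status, int(start_ms), int(end_ms)))
    return entries
```

An entry with the wrong number of fields raises `ValueError`, which helps when checking that older servers still emit the previous format.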

Copilot AI reviewed Mar 10, 2026

update test case

f39d9d5
