feat(flow): add information_schema.flow_statistics and SHOW FLOW STATUS (#7987) #8392

Open
onepizzateam wants to merge 10 commits into GreptimeTeam:main from onepizzateam:feat/flow-statistics

@onepizzateam

Contributor

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

Closes #7987

What's changed and what's your intention?

Implements the flow runtime observability design approved in #7987: a system table
information_schema.flow_statistics plus a SHOW FLOW STATUS SQL command, so users
can inspect per-flow runtime status directly through SQL.

Summary of the change

  • Add the information_schema.flow_statistics system table (table id 44) exposing
    per-flow runtime stats with columns: flow_id, flow_name, start_time,
    last_execution_time, uptime_seconds, state_size.
  • Add the SHOW FLOW STATUS [LIKE ...] statement (parser + AST + Display, operator
    dispatch, frontend permission handling) as a user-friendly view over the same table.
  • Track each flow's first-execution start_time in both the streaming and batching
    engines and surface it through the existing FlowStat pipeline (extends
    FlowStat/FlowStateValue with a start_time_map).
  • Add sqlness coverage (flow/flow_status) and regenerate the affected golden
    results (information_schema, show_databases_tables, view/create).
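
Since SHOW FLOW STATUS is described as a view over the same table, the relationship can be sketched as a query rewrite. The helper below is hypothetical and not taken from this PR; it assumes the LIKE pattern filters on flow_name and that the statement projects all six listed columns.

```rust
/// Columns listed for `information_schema.flow_statistics` in this PR.
const COLUMNS: [&str; 6] = [
    "flow_id",
    "flow_name",
    "start_time",
    "last_execution_time",
    "uptime_seconds",
    "state_size",
];

/// Builds a SELECT that is assumed to be roughly equivalent to
/// `SHOW FLOW STATUS [LIKE pattern]`.
/// Hypothetical helper for illustration only; the real dispatch lives in
/// the parser and operator code touched by this PR.
fn equivalent_query(like: Option<&str>) -> String {
    let mut sql = format!(
        "SELECT {} FROM information_schema.flow_statistics",
        COLUMNS.join(", ")
    );
    if let Some(pattern) = like {
        // Double any single quote so the pattern stays a valid SQL string literal.
        let escaped = pattern.replace('\'', "''");
        // Assumption: the LIKE pattern is matched against flow_name.
        sql.push_str(&format!(" WHERE flow_name LIKE '{}'", escaped));
    }
    sql
}
```

Under these assumptions, `equivalent_query(Some("my_%"))` yields a SELECT over the six columns ending in `WHERE flow_name LIKE 'my_%'`.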

How it works
SHOW FLOW STATUS is resolved to a projection over information_schema.flow_statistics.
The table builder joins flow metadata (flow_name_manager) with runtime stats obtained
via information_extension.flow_stats(). uptime_seconds is derived as
(now - start_time) / 1000.
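
A minimal sketch of that join, under stated assumptions: the types and the function name `build_rows` are hypothetical stand-ins rather than the PR's table builder, and start_time is assumed to be a millisecond timestamp (which the division by 1000 suggests). A flow with no recorded start time gets `None`, mirroring the NULL behaviour described under Limitations.

```rust
use std::collections::BTreeMap;

/// One output row; only the columns relevant to the uptime derivation.
/// Hypothetical type, not the PR's actual row representation.
#[derive(Debug, PartialEq)]
struct FlowRow {
    flow_id: u32,
    flow_name: String,
    /// Assumed to be milliseconds since the Unix epoch.
    start_time: Option<i64>,
    uptime_seconds: Option<i64>,
}

/// Joins flow metadata (id -> name) with runtime start times (id -> ms).
/// Flows without a recorded start time keep `None` in both derived columns.
fn build_rows(
    names: &BTreeMap<u32, String>,
    start_times: &BTreeMap<u32, i64>,
    now_ms: i64,
) -> Vec<FlowRow> {
    names
        .iter()
        .map(|(id, name)| {
            let start_time = start_times.get(id).copied();
            FlowRow {
                flow_id: *id,
                flow_name: name.clone(),
                start_time,
                // (now - start_time) / 1000, as stated in the description.
                uptime_seconds: start_time.map(|s| (now_ms - s) / 1000),
            }
        })
        .collect()
}
```

Metadata drives the iteration here, so every known flow produces a row even when no runtime stats are available for it.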

Limitations

  • flow_id, flow_name, last_execution_time, and state_size return real data in
    both standalone and distributed modes.
  • start_time / uptime_seconds are populated in standalone mode only. In distributed
    mode they return NULL because the cross-node heartbeat wire type
    (api::v1::meta::FlowStat) does not yet carry start-time information — consistent with
    the proposal's "handle missing values as NULL". Propagating it through the heartbeat is
    a documented follow-up (TODO(#7987-followup)).
  • processed_rows and last_errors from the proposal are intentionally deferred to a
    follow-up PR (incremental delivery per the issue).

API / data compatibility

  • No breaking API changes; SHOW FLOW STATUS and the new table are additive.
  • FlowStateValue.start_time_map is annotated #[serde(default)], so existing persisted
    flow-state metadata deserializes unchanged.

Docs

  • This PR introduces user-facing surfaces (SHOW FLOW STATUS and the
    information_schema.flow_statistics table) that need documentation on the GreptimeDB
    docs site. I'll open a follow-up docs PR against the docs repo to cover the new command
    and table schema.

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

…US (fixes GreptimeTeam#7987)

Signed-off-by: onepizzateam <palakjha916@gmail.com>
Signed-off-by: Palak Jha <palakjha916@gmail.com>
@gemini-code-assist

Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@onepizzateam

Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a new system schema table, information_schema.flow_statistics, along with the SHOW FLOW STATUS SQL command to expose runtime statistics for flows, such as flow ID, name, start time, last execution time, uptime, and state size. Both streaming and batching flow engines have been updated to track and report these start times. The review feedback highlights two key areas for improvement: first, optimizing the uptime calculation in flow_statistics.rs by fetching the current time once outside the loop and handling potential clock skew defensively; second, reducing inter-thread communication overhead in stat.rs by combining multiple sequential worker queries into a single request and executing them concurrently.
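
The first piece of feedback (read the clock once, clamp on clock skew) can be illustrated as follows. This is a sketch under assumptions, not the code in flow_statistics.rs: the function names are hypothetical and timestamps are assumed to be milliseconds since the Unix epoch.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time in milliseconds since the Unix epoch.
/// Falls back to 0 if the system clock is set before the epoch.
fn now_millis() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as i64)
        .unwrap_or(0)
}

/// Uptime in whole seconds, clamped to zero when `start_ms` lies in the
/// future (clock skew) and protected against overflow by saturating_sub.
fn uptime_seconds(now_ms: i64, start_ms: i64) -> i64 {
    now_ms.saturating_sub(start_ms).max(0) / 1000
}

/// Computes uptimes for many flows with a single clock read, so every row
/// is measured against the same `now`. Missing start times stay `None`.
fn uptimes(start_times: &[Option<i64>]) -> Vec<Option<i64>> {
    let now_ms = now_millis(); // hoisted out of the per-row loop
    start_times
        .iter()
        .map(|s| s.map(|start| uptime_seconds(now_ms, start)))
        .collect()
}
```

Reading the clock once also keeps rows mutually consistent: two flows with the same start_time cannot report different uptimes within one query.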

Comment thread src/catalog/src/system_schema/information_schema/flow_statistics.rs Outdated
Comment thread src/flow/src/adapter/stat.rs
…mp uptime

Signed-off-by: onepizzateam <palakjha916@gmail.com>
@onepizzateam onepizzateam force-pushed the feat/flow-statistics branch from 000a50d to 13bf956 Compare July 2, 2026 07:28
Labels

docs-required This change requires docs update. size/M

Development

Successfully merging this pull request may close these issues.

feat: add flow_statistics system table and SHOW FLOW STATUS for flow runtime observability

1 participant