Description
1. Summary
Vitess currently classifies query plan types in a way that is neither intuitive nor helpful for performance analysis. In particular, QueriesProcessed
and QueriesRouted
rely on plan-type designations that are inconsistent across different operators (e.g., IN
, Concatenate
, DDL
, Reference
, FkCascade
, and InsertSelect
). This proposal introduces two new metrics and deprecates the older, less-informative ones.
2. Motivation
-
Inconsistent Plan-Type Metric
- The plan type was derived from the root operator’s name, leading to wide variability in reported plan types.
- Some operators (like
Route
) reported the route type, while others forwarded whatever their child operator returned.
-
Limited Usefulness
- Metrics such as
QueriesProcessed
andQueriesRouted
provide only a coarse breakdown. - Deeper insight (e.g., how complex or how many shards were involved) is missing or buried in logs/other metrics.
- Metrics such as
-
Need for Clarity
- We want an easily understandable categorization of query complexity to better diagnose query performance in vtgate.
- We also want to track how many shards each query touches, to inform sharding logic decisions and query optimization.
3. Proposed Changes
3.1 Deprecation
- Metrics to Deprecate:
QueriesProcessed
QueriesRouted
QueriesProcessedByTable
QueriesRoutedByTable
- These will be marked as deprecated but still exist for at least one release to allow for a smooth transition.
3.2 Plan Classification
We propose categorizing queries into twelve distinct buckets to capture both common and potentially problematic execution patterns:
- Local
- This query is solved at VTGate without calling any external system.
-
Passthrough
- The query is forwarded directly to a single shard without modification.
- Minimal overhead in vtgate; the fastest possible path.
-
MultiShard
- The query is routed to multiple shards, but not all shards in the cluster.
- vtgate may need to concatenate results but does not perform extensive operator logic beyond routing.
-
Lookup
- The query requires at least two calls: typically a vindex lookup first, followed by the main query.
- A common pattern for partial fan-out or locating specific shard(s) through a lookup vindex.
-
Scatter
- The query is sent to all shards.
- Indicates potential performance overhead, as every shard must be involved in the request.
-
JoinOp
- The plan includes at least one join operator at vtgate level (e.g., join of two routes).
- Useful for quickly spotting queries that might be combining data across multiple shards.
-
ForeignKey
- The query involved foreign key constraint handling at vtgate level.
- This happens when Vitess is managing the foreign key constraints.
-
Complex
- Catches any plan more involved than the above categories (e.g., subqueries, nested operators, multi-stage pipelines).
- Indicates a need for further investigation or optimization if it appears frequently.
-
OnlineDDL
- Vitess-managed DDL statements performed online (e.g., schema migrations orchestrated by vtgate).
- Tracked separately to measure usage and performance impact of online operations.
-
DirectDDL
- DDL statements that are directly passed to the underlying MySQL instances.
- Does not go through Vitess’s online migration flow.
-
Transaction
- Statements that are related to transaction like (begin, commit, rollback, savepoint, etc.).
- Does not go through Vitess’s online migration flow.
-
Topology
- Statements that calls into Topo server to retrieve the metadata information for response.
3.3 Query Classification
- Purpose: Categorize queries by the high-level SQL statement type.
Possible Categories (not exhaustive):
SELECT
INSERT
UPDATE
DELETE
SET
DDL
(could be further subdivided if desired)- Others as needed (ALTER, CREATE, DROP, etc.)
3.4 New Metrics
Three dimensions are added to the new metrics.
Plan
- Plan classification
Query
- Query classification
Tablet
- Target Tablet Type (Primary, Replica, Rdonly, etc.)
QueryExecutions
- Counts queries executed at VTGateQueryRoutes
- Counts queries routed from VTGate to VTTabletQueryExecutionsByTable
- Counts queries executed at VTGate per table. This hasQuery
andTable Name
as dimension.
4. Backward Compatibility
- We will keep the old metrics running in parallel, marked as “DEPRECATED,” for at least one release cycle.
- Developers and operators should be encouraged to migrate to the new metrics.