TT-16809 Pump generate MCP analytics #954

Open
lghiur wants to merge 26 commits into master from TT-16809-Pump-generate-mcp-analytics

Conversation

@lghiur (Collaborator) commented Mar 18, 2026

Extends pump support to handle MCP (Model Context Protocol) analytics records across all major storage backends.

Changes

  • MongoDB (mcp_mongo.go, mcp_mongo_aggregate.go): Fixed MCP record writes and aggregate handling
  • PostgreSQL/SQL (mcp_sql.go, mcp_sql_aggregate.go): Fixed MCP record and aggregate writes
  • Hybrid (hybrid.go): Added MCP record passthrough support
  • Prometheus (prometheus.go): Extended metrics to include MCP-specific labels/counters
  • Elasticsearch (elasticsearch.go): Added MCP record indexing support
  • Protobuf serializer (serializer/protobuf.go): Extended to handle MCP analytics records

@lghiur lghiur requested a review from a team as a code owner March 18, 2026 13:17
probelabs bot (Contributor) commented Mar 18, 2026

This PR introduces comprehensive support for MCP (Model Context Protocol) analytics records within Tyk Pump. It enables the processing, storage, and aggregation of analytics for MCP-based APIs by adding new data structures, a dedicated aggregation pipeline, and support across all major storage backends.

Files Changed Analysis

This is a significant feature addition, reflected in the 30 files changed with 4,168 additions and only 147 deletions. The changes are predominantly additive, indicating new functionality.

  • New Functionality (12 files): A dozen new files implement the core MCP logic. This includes dedicated pumps for MongoDB and SQL (pumps/mcp_mongo.go, pumps/mcp_sql.go, and their aggregate counterparts), new analytics logic (analytics/aggregate_mcp.go, analytics/mcp_record.go), and extensive corresponding tests.
  • Modifications to Existing Systems: Key existing pumps (pumps/elasticsearch.go, pumps/hybrid.go, pumps/mongo.go) and the Protobuf serializer have been modified to recognize and handle MCP records. Standard pumps like mongo and mongo_aggregate are updated to explicitly ignore MCP records, delegating them to the new dedicated pumps.
  • Core Analytics Changes: The main data structures in analytics/analytics.go and the Protobuf definition in analytics/analytics.proto have been extended to include MCP-specific fields.
  • Refactoring: A new OpenGormDB function was added to pumps/common.go to centralize GORM database connection logic, reducing boilerplate in the SQL-based pumps.

Architecture & Impact Assessment

What this PR accomplishes

This PR enables Tyk Pump to process a new type of analytics record generated by MCP-based APIs. Previously, the pump was limited to REST and GraphQL analytics. This change provides feature parity for analytics collection for this new API protocol, allowing operators to monitor MCP API traffic, performance, and errors.

Key technical changes introduced

  1. New Data Structures: The core analytics.AnalyticsRecord is extended with a new MCPStats struct. A dedicated MCPRecord is introduced for specialized storage in SQL/Mongo to allow for efficient querying on MCP-specific fields (JSONRPCMethod, PrimitiveType, PrimitiveName).
  2. Segregated Aggregation: A new, separate aggregation pipeline (analytics.AggregateMCPData) is created exclusively for MCP records. The existing analytics.AggregateData function is modified to explicitly ignore MCP records, ensuring a clean separation between MCP and REST/GraphQL analytics.
  3. Dedicated Pumps: Four new pumps are introduced to handle storing raw and aggregated MCP data in MongoDB and SQL backends: pumps/mcp_mongo.go, pumps/mcp_mongo_aggregate.go, pumps/mcp_sql.go, and pumps/mcp_sql_aggregate.go.
  4. Enhancements to Existing Pumps:
    • Elasticsearch: Can now route MCP records to a separate index (mcp_index_name) and includes MCP-specific fields in documents.
    • Hybrid: The pump now calls a new RPC endpoint (PurgeAnalyticsDataMCPAggregated) to send aggregated MCP data to MDCB.
  5. Serialization: The Protobuf schema and generated code are updated to handle MCPStats.
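The routing predicate and the new fields described above can be sketched as follows. This is a minimal illustration: only JSONRPCMethod, PrimitiveType, and PrimitiveName are named in the PR, and IsMCPRecord is assumed here to simply test for the presence of MCP stats — the actual tyk-pump definitions are richer.

```go
package main

import "fmt"

// Hypothetical, simplified shapes; only the three MCP-specific field
// names below appear in the PR description. Everything else is
// illustrative, not the actual tyk-pump definitions.
type MCPStats struct {
	JSONRPCMethod string
	PrimitiveType string
	PrimitiveName string
}

type AnalyticsRecord struct {
	APIID    string
	MCPStats *MCPStats
}

// IsMCPRecord mirrors the routing predicate the PR describes: a record
// is treated as MCP when it carries MCP-specific stats.
func (r *AnalyticsRecord) IsMCPRecord() bool {
	return r.MCPStats != nil
}

func main() {
	rest := AnalyticsRecord{APIID: "api-1"}
	mcp := AnalyticsRecord{
		APIID:    "api-2",
		MCPStats: &MCPStats{JSONRPCMethod: "tools/call", PrimitiveType: "tool", PrimitiveName: "search"},
	}
	fmt.Println(rest.IsMCPRecord(), mcp.IsMCPRecord()) // false true
}
```

This is the branch point the component-interaction diagram below hinges on: records failing the predicate stay on the existing REST/GraphQL pipeline untouched.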

Affected system components

  • Analytics Core (/analytics): The fundamental data structures and aggregation logic are significantly expanded.
  • Data Pumps (/pumps): This is the most impacted area, with new pumps added and major existing pumps (Elasticsearch, Hybrid, Mongo) updated.
  • Serialization (analytics/proto): The Protobuf serializer is updated to support the new data fields.

Component Interaction Flow

```mermaid
graph TD
    subgraph Tyk Gateway
        A[Gateway generates AnalyticsRecord]
    end

    subgraph Tyk Pump
        A --> B{Pump receives records}
        B --> C{"record.IsMCPRecord()?"}
        C -- Yes --> D[MCP Analytics Pipeline]
        C -- No --> E[Standard REST/GraphQL Pipeline]

        subgraph D [MCP Analytics Pipeline]
            direction LR
            D_Agg(AggregateMCPData) --> D_MongoAgg[MCPMongoAggregatePump]
            D_Agg --> D_SQLAgg[MCPSQLAggregatePump]
            D_Agg --> D_Hybrid[HybridPump]

            D_Raw(Store Raw MCP Data) --> D_Mongo[MCPMongoPump]
            D_Raw --> D_SQL[MCPSQLPump]
            D_Raw --> D_ES[ElasticsearchPump]
        end
    end

    subgraph Storage Backends
        D_MongoAgg & D_Mongo --> DB1[MongoDB]
        D_SQLAgg & D_SQL --> DB2[SQL Database]
        D_ES --> DB3[Elasticsearch]
        D_Hybrid --> DB4[MDCB]
    end
```

Scope Discovery & Context Expansion

  • The changes are systemic within the tyk-pump repository, touching the entire data pipeline from ingestion to storage for the new MCP record type.
  • The modification of analytics/analytics.proto is a critical detail. It implies that this work is part of a larger, cross-repository feature. Upstream components, such as the Tyk Gateway, will need corresponding changes to generate and serialize the new MCPStats data. Without those upstream changes, this new functionality in the pump will remain unused.
  • The implementation for Mongo and SQL creates new, dedicated pump files, duplicating a significant amount of connection and batching logic from the standard pumps. This contrasts with the approach for Elasticsearch, where the new logic is integrated into existing files. This architectural decision isolates the new functionality but increases code duplication and potential maintenance overhead.
Metadata
  • Review Effort: 5 / 5
  • Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2026-04-17T08:03:33.319Z | Triggered by: pr_updated | Commit: 98ca874


probelabs bot (Contributor) commented Mar 18, 2026

Security Issues (1)

Severity Location Issue
🟡 Warning analytics/aggregate_mcp.go:153-163
The MCP analytics aggregation logic uses fields like `JSONRPCMethod`, `PrimitiveType`, and `PrimitiveName` from incoming analytics records as keys for in-memory maps. These records originate from an upstream source (e.g., Tyk Gateway). If an attacker can craft requests that generate a high number of unique values for these fields (high cardinality), it can lead to unbounded growth of these maps. This can cause excessive memory consumption in the Tyk Pump, leading to performance degradation or a denial-of-service (DoS) crash.
💡 SuggestionTo mitigate this, consider implementing a limit on the cardinality of each dimension within a single aggregation window. A configurable threshold could be introduced for the maximum number of unique methods, primitives, and names to be tracked per API. Once the threshold is reached, subsequent new values could be ignored, logged, or grouped into a generic "other" category. This would prevent uncontrolled memory growth and protect the pump from resource exhaustion attacks.
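The suggested cardinality cap could look roughly like the sketch below. The function name, the threshold handling, and the "other" bucket are illustrative — they follow the reviewer's suggestion and are not part of the PR as written.

```go
package main

import "fmt"

// capDimension returns the key to aggregate under, collapsing unseen keys
// into "other" once the map already tracks maxUnique distinct values.
// This bounds memory growth for attacker-controlled dimensions such as
// JSONRPCMethod. Names here are hypothetical, not tyk-pump identifiers.
func capDimension(counts map[string]int, key string, maxUnique int) string {
	if _, seen := counts[key]; !seen && len(counts) >= maxUnique {
		key = "other"
	}
	counts[key]++
	return key
}

func main() {
	counts := map[string]int{}
	for _, m := range []string{"tools/list", "tools/call", "resources/read", "prompts/get"} {
		fmt.Println(capDimension(counts, m, 2))
	}
	// With maxUnique=2, the first two methods are tracked individually
	// and the rest fall into "other".
}
```

A production version would make maxUnique configurable per dimension and log (or count) how many values were collapsed, so operators can detect when the cap is being hit.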

Security Issues (1)

Severity Location Issue
🟡 Warning analytics/aggregate_mcp.go:153-163
The MCP analytics aggregation logic uses fields like `JSONRPCMethod`, `PrimitiveType`, and `PrimitiveName` from incoming analytics records as keys for in-memory maps. These records originate from an upstream source (e.g., Tyk Gateway). If an attacker can craft requests that generate a high number of unique values for these fields (high cardinality), it can lead to unbounded growth of these maps. This can cause excessive memory consumption in the Tyk Pump, leading to performance degradation or a denial-of-service (DoS) crash.
💡 SuggestionTo mitigate this, consider implementing a limit on the cardinality of each dimension within a single aggregation window. A configurable threshold could be introduced for the maximum number of unique methods, primitives, and names to be tracked per API. Once the threshold is reached, subsequent new values could be ignored, logged, or grouped into a generic "other" category. This would prevent uncontrolled memory growth and protect the pump from resource exhaustion attacks.
Architecture Issues (2)
Severity Location Issue
🟠 Error pumps/mcp_mongo.go:1-178
The introduction of dedicated MCP pumps for MongoDB and SQL has resulted in significant code duplication. The new files (`mcp_mongo.go`, `mcp_mongo_aggregate.go`, `mcp_sql.go`, `mcp_sql_aggregate.go`) are near-copies of their existing non-MCP counterparts, duplicating boilerplate logic for configuration, connection management, data batching, and writing. This approach violates the DRY principle and increases the long-term maintenance burden, as bug fixes or improvements will need to be applied in multiple places.
💡 SuggestionA more maintainable architecture would involve refactoring the common logic from the base pumps into shared components. For example, a base pump struct could handle the generic mechanics, and be configured with specific "processor" logic (e.g., a filter function and a data transformation function) for each data type (standard, GraphQL, MCP). This is inconsistent with the approach taken for the Elasticsearch pump (`pumps/elasticsearch.go`), where the new logic was integrated into the existing file rather than duplicated. A consistent architectural pattern should be applied across all pumps.
🟡 Warning analytics/aggregate_mcp.go:120-142
The `AggregateMCPData` function duplicates the high-level structure and initialization logic found in the existing `AggregateData` function in `analytics/aggregate.go`. While the core `incrementAggregate` function is correctly reused, the surrounding boilerplate for iterating data, managing the aggregate map, and initializing new aggregate records is repeated.
💡 SuggestionRefactor the common aggregation workflow into a single, generic function. This function could accept a strategy or configuration object that defines the type-specific logic, such as how to filter records, how to initialize a new aggregate struct, and how to increment type-specific dimensions. This would reduce code duplication and make the aggregation logic easier to extend in the future.
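The strategy-based refactor suggested above might be sketched like this. aggregateWith and the injected callbacks are hypothetical names, and the real aggregate records carry far more state than a simple counter — this only shows the shape of the shared workflow.

```go
package main

import "fmt"

// record stands in for analytics.AnalyticsRecord; only the fields needed
// for the sketch are included.
type record struct {
	orgID string
	isMCP bool
	dim   string
}

// aggregateWith runs the common workflow — iterate, filter, ensure a
// bucket exists, increment — with the type-specific parts injected as
// callbacks. AggregateData and AggregateMCPData could then both be thin
// wrappers over this one function.
func aggregateWith(records []record, keep func(record) bool, dimension func(record) string) map[string]int {
	agg := map[string]int{}
	for _, r := range records {
		if !keep(r) {
			continue // type-specific filter, e.g. IsMCPRecord()
		}
		agg[dimension(r)]++ // type-specific dimension extraction
	}
	return agg
}

func main() {
	recs := []record{
		{orgID: "org1", isMCP: true, dim: "tools/call"},
		{orgID: "org1", isMCP: false, dim: "/users"},
		{orgID: "org1", isMCP: true, dim: "tools/call"},
	}
	mcpAgg := aggregateWith(recs,
		func(r record) bool { return r.isMCP },
		func(r record) string { return r.dim })
	fmt.Println(mcpAgg["tools/call"])
}
```

The same keep/dimension pair of callbacks could be reused to de-duplicate the new Mongo/SQL pumps as well, since their filtering step is identical.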

Performance Issues (2)

Severity Location Issue
🟡 Warning pumps/mcp_mongo.go:161-175
The data processing pipeline involves multiple iterations over the analytics data and the creation of several intermediate slices, which increases memory allocations and garbage collector pressure. The `WriteData` function first calls `filterMCPData` (creating slice 1), then `AccumulateSet` (creating batch slices), and finally `insertMCPDataSet` calls `convertToMCPObjects` (creating slice 3). This pattern can degrade performance in a high-throughput environment.
💡 SuggestionRefactor the `WriteData` function to use a single-pass approach. Iterate over the input data once, filtering MCP records, converting them, and adding them directly to size-aware batches. This will reduce memory churn and improve overall throughput.
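A single-pass version of that pipeline might look like the sketch below; the types, field names, and batch handling are illustrative rather than the actual WriteData implementation.

```go
package main

import "fmt"

// rawRecord and mcpObject are stand-ins for the pump's input and storage
// types.
type rawRecord struct {
	isMCP  bool
	method string
}

type mcpObject struct{ Method string }

// writeSinglePass filters, converts, and batches in one loop, avoiding
// the three intermediate slices the review describes.
func writeSinglePass(data []rawRecord, batchSize int) [][]mcpObject {
	var batches [][]mcpObject
	batch := make([]mcpObject, 0, batchSize)
	for _, r := range data {
		if !r.isMCP { // filter (was filterMCPData)
			continue
		}
		batch = append(batch, mcpObject{Method: r.method}) // convert (was convertToMCPObjects)
		if len(batch) == batchSize {                       // flush a full batch (was AccumulateSet)
			batches = append(batches, batch)
			batch = make([]mcpObject, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		batches = append(batches, batch)
	}
	return batches
}

func main() {
	data := []rawRecord{{true, "a"}, {false, "x"}, {true, "b"}, {true, "c"}}
	fmt.Println(len(writeSinglePass(data, 2))) // 2 batches: [a b] and [c]
}
```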
🟡 Warning pumps/mcp_sql.go:152-173
The logic for writing to sharded SQL tables iterates through the record list to find date boundaries. This implementation is sensitive to the order of the input data. If records are not perfectly sorted by date (e.g., `[day1_rec1, day2_rec1, day1_rec2]`), it can result in numerous small, inefficient write batches to the database, increasing transaction overhead.
💡 SuggestionTo make the batching more robust and performant, first group records by their target shard (date) using a map (e.g., `map[string][]*analytics.MCPRecord`). Then, iterate over the map and write the records for each shard in a single, efficient batch. This ensures optimal batching regardless of the input data order.
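The map-based grouping that suggestion describes can be sketched as follows (the record shape and date handling are simplified; the real shard key is derived from the record timestamp):

```go
package main

import "fmt"

// mcpRecord stands in for analytics.MCPRecord; Day stands in for the
// date-derived shard table name.
type mcpRecord struct {
	Day    string // e.g. "2026-03-18"
	Method string
}

// groupByShard buckets records by target shard so each shard receives a
// single batched write, regardless of input order.
func groupByShard(records []mcpRecord) map[string][]mcpRecord {
	shards := map[string][]mcpRecord{}
	for _, r := range records {
		shards[r.Day] = append(shards[r.Day], r)
	}
	return shards
}

func main() {
	// Out-of-order input that boundary-scanning would split into three
	// tiny writes; grouping yields exactly two batched writes.
	recs := []mcpRecord{
		{"2026-03-18", "tools/call"},
		{"2026-03-19", "tools/list"},
		{"2026-03-18", "resources/read"},
	}
	shards := groupByShard(recs)
	fmt.Println(len(shards), len(shards["2026-03-18"])) // 2 2
}
```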

Quality Issues (1)

Severity Location Issue
🟡 Warning pumps/mcp_mongo.go:1-178
The new MCP-specific pumps for MongoDB and SQL (`mcp_mongo.go`, `mcp_mongo_aggregate.go`, `mcp_sql.go`, `mcp_sql_aggregate.go`) introduce significant code duplication from their existing counterparts (`mongo.go`, `mongo_aggregate.go`, etc.). Core logic for database connection, batching, sharding, and data insertion/upserting is largely copied. This architectural approach increases the maintenance burden, as future bug fixes or enhancements to this common logic will need to be applied in multiple places.
💡 SuggestionConsider refactoring the pumps to be more data-type agnostic. The existing pumps could be extended to handle different analytics record types (REST, GraphQL, MCP) by using interfaces and abstracting the type-specific logic (e.g., aggregation, data model conversion). The implementation for the Elasticsearch pump in this same PR (`pumps/elasticsearch.go`) serves as a good example, where MCP handling was integrated into the existing pump with minimal duplication.

Last updated: 2026-04-17T08:02:49.324Z | Triggered by: pr_updated | Commit: 98ca874

@lghiur lghiur force-pushed the TT-16809-Pump-generate-mcp-analytics branch 2 times, most recently from e094776 to f1ccfcb on March 18, 2026 14:07
@lghiur lghiur force-pushed the TT-16809-Pump-generate-mcp-analytics branch from f1ccfcb to 1ec8a10 on March 18, 2026 14:12
Comment thread pumps/prometheus.go Outdated
mcpOnly: true,
}

p.allMetrics = append(p.allMetrics, totalStatusMetric, pathStatusMetrics, keyStatusMetrics, oauthStatusMetrics, totalLatencyMetrics, mcpCallsMetric, mcpLatencyMetric)
Contributor:
I wouldn't add this as a default metric

Contributor:
so what you're saying is to only append to slice if the mcp logic is enabled?

Contributor:
What I'm saying is not to add this as a default Prometheus metric (https://github.com/TykTechnologies/tyk-pump?tab=readme-ov-file#prometheus), but to document and expose McpOnly in the configuration so users can set their own custom metrics around it (https://github.com/TykTechnologies/tyk-pump?tab=readme-ov-file#custom-prometheus-metrics).

The current problem with default metrics is that users cannot modify them and need extra steps (e.g. TYK_PMP_PUMPS_PROMETHEUS_META_DISABLEDMETRICS) to disable them if not needed.

I'd keep it simple here: expose and document the option with an example.

Contributor:
done

Comment thread pumps/prometheus.go Outdated

// mcpOnly marks a metric as MCP-specific: it is only processed for records where IsMCPRecord() is true.
// This is an internal field and is not user-configurable.
mcpOnly bool
Contributor:
Maybe this should be a public config option so users can set their own custom metrics around MCP.

Contributor:
Can you please explain what the end goal is here?

Contributor:

done

Comment thread pumps/hybrid.go
@github-actions

🚨 Jira Linter Failed

Commit: e5ac48a
Failed at: 2026-03-26 14:29:29 UTC

The Jira linter failed to validate your PR. Please check the error details below:

Error details:
failed to validate branch and PR title rules: PR title must contain the Jira ticket ID 'TT-16809'

Next Steps

  • Ensure your branch name contains a valid Jira ticket ID (e.g., ABC-123)
  • Verify your PR title matches the branch's Jira ticket ID
  • Check that the Jira ticket exists and is accessible

This comment will be automatically deleted once the linter passes.

MCPSQLAggregatePump was only calling AutoMigrate without creating
the (dimension, timestamp, org_id, dimension_value) composite index
that SQLAggregatePump creates for tyk_aggregated. Without this index
all MCP analytics SQL queries perform full table scans.

Adds ensureTable and ensureIndex methods matching the regular
aggregate pump, including PostgreSQL CONCURRENTLY support and the
omit_index_creation config flag.
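Under the naming assumptions below (the actual table and index names in tyk-pump may differ), the composite index statement this commit adds would look something like:

```go
package main

import "fmt"

// buildIndexSQL sketches the CREATE INDEX statement for a sharded
// aggregate table. CONCURRENTLY is PostgreSQL-specific (it avoids
// locking the table during creation); the idx_ naming is illustrative.
func buildIndexSQL(table string, concurrently bool) string {
	kw := ""
	if concurrently {
		kw = "CONCURRENTLY "
	}
	return fmt.Sprintf(
		"CREATE INDEX %sIF NOT EXISTS idx_%s_dim ON %s (dimension, timestamp, org_id, dimension_value)",
		kw, table, table,
	)
}

func main() {
	fmt.Println(buildIndexSQL("tyk_mcp_aggregated", true))
}
```

Queries that filter on dimension plus a time range then hit the index instead of scanning the whole shard.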

MCPSQLAggregatePump was not migrating existing sharded tables on
startup, unlike SQLAggregatePump, which calls HandleTableMigration
and MigrateAllShardedTables. If the schema changes, existing shards
would not be updated.
@sonarqubecloud

Quality Gate failed

Failed conditions
72.3% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud
