feat: add CloudWatch Logs Insights query support #706

Open
m-q-t wants to merge 1 commit into grafana:main from m-q-t:main

Conversation

m-q-t commented Apr 1, 2026

Summary

Adds three new MCP tools for querying CloudWatch Logs via Grafana's datasource proxy API:

  • list_cloudwatch_log_groups — Discover available log groups (with optional prefix filter and cross-account support)
  • list_cloudwatch_log_group_fields — Discover queryable fields for a log group
  • query_cloudwatch_logs — Execute CloudWatch Logs Insights queries with async StartQuery/GetQueryResults polling handled internally

Also refactors the existing CloudWatch metrics code to share infrastructure with the new Logs tools.

Key implementation details

  • Async polling with exponential backoff (200ms -> 2s, 30s timeout) for Logs Insights queries
  • Custom JSON unmarshaler for cloudWatchCustomMeta to handle the polymorphic schema.meta.custom field — it's an object ({"Status":"Complete"}) for Logs but a string ("timeSeriesQuery") for Metrics
  • Strips Grafana-internal metadata (@ptr, *__grafana_internal__) from query results
  • Custom response parsers for log-groups and log-group-fields resource APIs, which return nested {"value": {"name": "..."}} objects rather than plain strings
  • Registered under existing cloudwatch tool category via AddCloudWatchTools
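The nested response shape and the metadata stripping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type and function names (`logGroupItem`, `parseLogGroupNames`, `stripInternalFields`) are hypothetical, and the sketch assumes the resource API returns a JSON array of `{"value": {"name": "..."}}` objects and that query rows are flat string maps.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logGroupItem mirrors the nested shape {"value": {"name": "..."}}.
// Hypothetical type; only the "value" and "name" keys come from the PR text.
type logGroupItem struct {
	Value struct {
		Name string `json:"name"`
	} `json:"value"`
}

// parseLogGroupNames flattens the nested objects into plain names.
// Assumes the body is a JSON array of logGroupItem objects.
func parseLogGroupNames(body []byte) ([]string, error) {
	var items []logGroupItem
	if err := json.Unmarshal(body, &items); err != nil {
		return nil, fmt.Errorf("decoding log groups: %w", err)
	}
	names := make([]string, 0, len(items))
	for _, it := range items {
		if it.Value.Name != "" {
			names = append(names, it.Value.Name)
		}
	}
	return names, nil
}

// isInternalField reports whether a field is Grafana-internal metadata:
// the "@ptr" field or any name ending in "__grafana_internal__".
func isInternalField(name string) bool {
	return name == "@ptr" || strings.HasSuffix(name, "__grafana_internal__")
}

// stripInternalFields returns a copy of row without internal fields.
func stripInternalFields(row map[string]string) map[string]string {
	out := make(map[string]string, len(row))
	for k, v := range row {
		if !isInternalField(k) {
			out[k] = v
		}
	}
	return out
}

func main() {
	names, err := parseLogGroupNames([]byte(`[{"value":{"name":"/ecs/core-prod"}}]`))
	fmt.Println(names, err)

	row := map[string]string{"@message": "ok", "@ptr": "abc", "id__grafana_internal__": "1"}
	fmt.Println(stripInternalFields(row))
}
```

Returning a copy rather than mutating the input row keeps the raw response available for debugging, at the cost of one extra allocation per row.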

Refactors to existing CloudWatch code

  • Migrate newCloudWatchClient to use BuildTransport — replaces manual TLS transport setup with the shared helper, which also adds ExtraHeaders support and fixes a latent panic from a bare transport.(*http.Transport) type assertion on non-default transports
  • Extract postDsQuery helper — shared by metrics queries, startLogsQuery, and getLogsQueryResults, eliminating duplicated /api/ds/query POST boilerplate
  • Extract fetchCloudWatchResource helper — shared by listCloudWatchNamespaces, listCloudWatchMetrics, listCloudWatchDimensions, listCloudWatchLogGroups, and listCloudWatchLogGroupFields, eliminating ~200 lines of repeated resource API GET boilerplate
  • Net result: the cloudwatch.go diff is 108 deletions / 109 additions, so the file stays essentially the same size despite adding new shared infrastructure

Verification against Grafana Cloud

Tested the CloudWatch tools against a production Grafana Cloud instance with real AWS data, across 9 test scenarios:

| Tool | Test | Result |
| --- | --- | --- |
| list_cloudwatch_log_groups | Prod datasource | Pass: 16 log groups returned |
| list_cloudwatch_log_group_fields | /ecs/core-prod log group | Pass: 27+ fields discovered |
| query_cloudwatch_logs | Simple field selection, regex filters, stats/pct/latest aggregations, multi-log-group queries | Pass: structured results with correct field extraction |
| query_cloudwatch_logs | Nonexistent log group, impossible filter, empty time range | Pass: ResourceNotFoundException or empty results with hints |
| query_cloudwatch | ECS CPU/Memory, RDS CPU/IOPS/connections, various time ranges (5m-30d), all 5 statistics, custom periods (60s-86400s) | Pass: existing metrics tools unaffected by refactors |
| list_cloudwatch_namespaces | Prod datasource | Pass: 141 namespaces (verifies fetchCloudWatchResource refactor) |
| list_cloudwatch_metrics | 4 namespaces (ECS, RDS, Lambda, S3) | Pass: all return metric lists |
| list_cloudwatch_dimensions | ECS MemoryUtilized, RDS DatabaseConnections | Pass: returns dimension keys |
| Error handling | Invalid datasource UID, wrong region, fake namespace | Pass: clear errors, no crashes |

Please let me know if you have any questions/feedback.

Best regards,
Maxim


Note

Medium Risk
Adds new CloudWatch Logs Insights query functionality and refactors shared HTTP/transport code used by existing CloudWatch metric tools, so regressions could affect both logs and metrics queries. Risk is mitigated by new unit/integration tests and conservative parsing/backoff behavior.

Overview
Adds CloudWatch Logs Insights support via three new MCP tools: list_cloudwatch_log_groups, list_cloudwatch_log_group_fields, and query_cloudwatch_logs, including async StartQuery/GetQueryResults polling with backoff, result limiting, and filtering out Grafana-internal fields.

Refactors CloudWatch’s Grafana client to centralize /api/ds/query POSTs (postDsQuery), resource API GETs (fetchCloudWatchResource), and transport creation (switch to mcpgrafana.BuildTransport), plus extends frame schema parsing with a polymorphic schema.meta.custom unmarshaller to support both metrics and logs responses.

Expands test coverage with new Go unit/integration tests for logs parsing/status handling and adds LocalStack log seeding + Python LLM-loop tests to validate log group discovery and basic log querying.

Written by Cursor Bugbot for commit a37a4ac. This will update automatically on new commits.

Add three new MCP tools for querying CloudWatch Logs via Grafana:
- list_cloudwatch_log_groups: discover available log groups
- list_cloudwatch_log_group_fields: discover queryable fields
- query_cloudwatch_logs: execute Logs Insights queries with async
  StartQuery/GetQueryResults polling handled internally

Key implementation details:
- Async polling with exponential backoff (200ms->2s, 30s timeout)
- Strips Grafana-internal metadata fields from query results
- Custom response parsers for log-groups and log-group-fields APIs
  which return nested object values, not plain strings
- Custom JSON unmarshaler for frame metadata to handle polymorphic
  "custom" field (object for Logs, string for Metrics)
- Registered under existing "cloudwatch" tool category

Also improves existing CloudWatch code:
- Migrate newCloudWatchClient to use BuildTransport (adds ExtraHeaders
  support, fixes latent panic on type assertion)
- Extract shared postDsQuery and fetchCloudWatchResource helpers to
  eliminate ~200 lines of duplicated HTTP boilerplate
- Add LocalStack test data seeding for log groups/events
m-q-t requested a review from a team as a code owner April 1, 2026 17:55
cla-assistant bot commented Apr 1, 2026

CLA assistant check
All committers have signed the CLA.

cla-assistant bot commented Apr 1, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
