Skip to content

Commit cf01d83

Browse files
periklisduricanikolictcp13equals2
authored
fix: scope activity tracking middleware to query routes only (#15129)
#### What this PR does Scopes the activity tracking middleware to query routes only, preventing it from rejecting write requests with HTTP 500 when clients send an unexpected `Content-Type` header. **Root cause**: PR #14777 moved activity tracking from the query-frontend transport handler into `newRoute()`, which wraps *all* registered HTTP routes. This was intended to cover query-frontend routes, but as a side effect it also wraps the push endpoint (`/api/v1/push`), distributor OTLP endpoint, alertmanager routes, etc. When a client sends a push request with `Content-Type: application/x-www-form-urlencoded` (instead of the expected `application/x-protobuf`), the activity tracking middleware attempts to parse the protobuf body as form data. Go 1.17+ rejects semicolons as query separators (`url.ParseQuery`), and since random protobuf bytes inevitably contain `0x3B` (`;`), parsing fails with `"invalid semicolon separator in query"`. The middleware then returns HTTP 500 before the request reaches the actual push handler. **Impact observed**: ~3500 rps of 500 errors on a dedicated Mimir cluster (`prod-us-east-1/mimir-dedicated-17`), with the distributor rejecting all affected push requests. The SLI dropped to 0.178 (~82% write failure rate), triggering a critical SLO burn rate alert. **Fix**: Gate the activity tracking middleware on `maxBodySizeIfAny > 0` in `newRoute()`. This condition matches exactly the routes that PR #14777 intended to cover — all query API routes registered via `RegisterQueryAPI`/`RegisterQueryFrontendHandler` pass a non-zero `maxBodySizeIfAny`, while `RegisterRoute` (used for push, alertmanager, etc.) passes `0`. #### Which issue(s) this PR fixes or relates to Follow-up fix to #14777 #### Checklist - [x] Tests updated. - [ ] Documentation added. - [x] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`. If changelog entry is not needed, please add the `changelog-not-needed` label to the PR. - [ ] [`about-versioning.md`](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/about-versioning.md) updated with experimental features. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Touches HTTP middleware composition for all registered routes; while the change is small and tested, it could affect which endpoints get activity tracking and how some requests are handled. > > **Overview** > Prevents the activity-tracking middleware from wrapping non-query/write endpoints (eg. `/api/v1/push`) by only enabling it for routes registered with a non-zero `maxBodySizeIfAny`. > > Adds a regression test ensuring a push request with an unexpected `Content-Type: application/x-www-form-urlencoded` and protobuf-like body reaches the inner handler (avoiding a Go form-parsing error/HTTP 500), and documents the fix in `CHANGELOG.md`. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 8df3481. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com> Co-authored-by: Yuri Nikolic <durica.nikolic@grafana.com> Co-authored-by: Andrew Hall <andrew.hall@grafana.com>
1 parent aced80d commit cf01d83

3 files changed

Lines changed: 44 additions & 1 deletion

File tree

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -200,6 +200,7 @@
200200
* `-ingest-storage.kafka.ingestion-concurrency-target-flushes-per-shard` from `80` to `40`
201201
* `-ingest-storage.kafka.max-buffered-bytes` from `100MB` to `1GB`
202202
* [ENHANCEMENT] MQE: Enable narrow selectors optimisation and hints passing for `and`/`unless` binary operation. #15096
203+
* [BUGFIX] API: Scope activity tracking middleware to query routes only, preventing it from rejecting write requests that have an unexpected `Content-Type` header with HTTP 500. #15129
203204
* [BUGFIX] Ingester: enforce a minimum 10s delay between TSDB head compaction iterations when an iteration approaches or exceeds the configured `-blocks-storage.tsdb.head-compaction-interval`, so ingestion is not starved by back-to-back compactions. #15061
204205
* [BUGFIX] Update to Go v1.25.9. #15030
205206
* [BUGFIX] Distributor: OTLP partial success responses now correctly populate `RejectedDataPoints` with the actual count of rejected samples, instead of always reporting 0. In classical architecture, this includes rejected samples propagated from the ingester. #14789

pkg/api/api.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -195,7 +195,7 @@ func (a *API) newRoute(path string, handler http.Handler, isPrefix, auth, gzip b
195195
handler = gziphandler.GzipHandler(handler, a.cfg.GzipCompressionLevel)
196196
}
197197

198-
if a.activityTracker != nil {
198+
if a.activityTracker != nil && maxBodySizeIfAny > 0 {
199199
handler = NewActivityTrackingMiddleware(a.activityTracker, a.logger, handler)
200200
}
201201

pkg/api/api_test.go

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
package api
77

88
import (
9+
"bytes"
910
"compress/gzip"
1011
"fmt"
1112
"io"
@@ -378,3 +379,44 @@ func TestNewRouteMaxBodySize(t *testing.T) {
378379
})
379380
}
380381
}
382+
383+
// TestActivityTrackingNotAppliedToPushRoute verifies that the activity tracking middleware is not
384+
// applied to routes registered without a max body size (like /api/v1/push). A remote write v1
385+
// request with Content-Type: application/x-www-form-urlencoded and a protobuf body containing
386+
// 0x3B (';') must reach the inner handler. If the middleware ran, it would call r.ParseForm() on
387+
// the body and Go 1.17+ would reject the semicolon with "invalid semicolon separator in query",
388+
// returning HTTP 500 before the handler is ever called.
389+
func TestActivityTrackingNotAppliedToPushRoute(t *testing.T) {
390+
activityFile := filepath.Join(t.TempDir(), "activity-tracker")
391+
reg := prometheus.NewPedanticRegistry()
392+
at, err := activitytracker.NewActivityTracker(activitytracker.Config{Filepath: activityFile, MaxEntries: 1024}, reg)
393+
require.NoError(t, err)
394+
t.Cleanup(func() { require.NoError(t, at.Close()) })
395+
396+
cfg := Config{}
397+
serverCfg := getServerConfig(t)
398+
federationCfg := tenantfederation.Config{}
399+
srv, err := server.New(serverCfg)
400+
require.NoError(t, err)
401+
402+
api, err := New(cfg, federationCfg, serverCfg, srv, log.NewNopLogger(), at)
403+
require.NoError(t, err)
404+
405+
var innerHandlerCalled bool
406+
inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
407+
innerHandlerCalled = true
408+
w.WriteHeader(http.StatusOK)
409+
})
410+
411+
// RegisterRoute passes maxBodySizeIfAny=0, matching how /api/v1/push is registered.
412+
api.RegisterRoute("/api/v1/push", inner, false, false, http.MethodPost)
413+
414+
body := []byte{0x0a, 0x12, 0x3B, 0x08, 0x01, 0x12, 0x0e}
415+
req := httptest.NewRequest(http.MethodPost, "/api/v1/push", bytes.NewReader(body))
416+
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
417+
w := httptest.NewRecorder()
418+
api.server.HTTP.ServeHTTP(w, req)
419+
420+
require.Equal(t, http.StatusOK, w.Code)
421+
require.True(t, innerHandlerCalled, "activity tracking middleware must not run on routes registered without a max body size")
422+
}

0 commit comments

Comments
 (0)