
Conversation


@jmacd jmacd commented Jun 24, 2025

Description

Improves on the previous drafts:

Middleware config syntax proposal. Further simplified (over-simplified?). In this form, receivers can only configure one limiter. See the updated open questions.

A limiter Tracker struct has been added, provisionally, to discuss (a) how to avoid overcounting, (b) how HTTP middleware knows compression state, (c) how we might instrument limiter behavior, etc.
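For discussion, a minimal sketch of what such a Tracker might carry; the field names below are hypothetical, not the struct actually proposed in the diff:

// Tracker is a sketch only; the draft's actual struct may differ.
// It carries per-request limiter state across pipeline stages.
type Tracker struct {
        // accounted records weight already charged by an earlier stage
        // (e.g., middleware), letting later stages avoid double-counting.
        accounted map[extensionlimiter.WeightKey]uint64

        // compressed reports whether the payload was compressed on the
        // wire, distinguishing network bytes from request bytes.
        compressed bool
}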

Follows drafts
1: #12558
2: #12633
3: #12700
4: #12953
5: #13051
6: #13241

Link to tracking issue

Part of #7441
Part of #9591
Part of #12603
Part of #13228

Testing

N/A. This will be broken into smaller PRs with tests.

Documentation

Documentation has been updated with new open questions.

Comment on lines +249 to +251
The question is how we avoid double-counting certain limits, whether
they are implemented in middleware, through a factory, through custom
receiver code, or by other means.

What is a concrete example of the problem? The middleware and receiver code operate at different levels, so they should not be counting the same things? That is, middleware would be counting HTTP/gRPC requests and network bytes, while the receiver counts application-protocol bytes, log records, etc.
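For illustration, a sketch of that separation under assumed names: the Acquire method, the weight-key constants, and the middleware/receiver types here are all hypothetical, not the API proposed in this draft.

// Hypothetical: HTTP middleware charges wire-level weights only.
func (m *middleware) onRequest(ctx context.Context, compressedBody []byte) error {
        return m.limiter.Acquire(ctx, extensionlimiter.WeightKeyNetworkBytes, uint64(len(compressedBody)))
}

// Hypothetical: the receiver charges application-level weights after decoding.
func (r *logsReceiver) onLogs(ctx context.Context, ld plog.Logs) error {
        return r.limiter.Acquire(ctx, extensionlimiter.WeightKeyRequestItems, uint64(ld.LogRecordCount()))
}

Since each stage charges a different weight key, nothing would be counted twice.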

Comment on lines +289 to +303
extensions:
...
limitermux/grpc:
request_count:
- ratelimiter/1
network_bytes:
- ratelimiter/2
request_bytes:
- ratelimiter/3
- admissionlimiter/1
request_items:
- ratelimiter/4

limitermux/otlp:
...

This sort of goes in the direction of what I've had in mind: create a distinct "limitermiddleware" that identifies a rate limiter extension, and defines limits for things the middleware deals with: request_count and network_bytes.

Essentially, I want to separate the configuration of the limiter from the limits. This would require that all rate limiters deal with the same configuration for the specific limits, i.e. refill rate and burst.

Concretely, this would look like:

extensions:
  # gubernator provides a rate limiter implementation.
  gubernator:
    endpoint: 0.0.0.0:1050
    peer-discovery:
      type: k8s
      namespace: foo
      selector: bar=baz

  # limitermiddleware/gubernator is an instance of limitermiddleware that uses
  # the gubernator limiter extension to limit HTTP and gRPC requests by
  # request_count & network_bytes.
  limitermiddleware/gubernator:
    limiter: gubernator
    network_bytes:
      rate: 123.456
      burst: 789
    # request_count is unspecified, and so will not be limited.

receivers:
  otlp:
    protocols:
      grpc:
        middleware:
        - limitermiddleware/gubernator

If that is acceptable, then I think we could extract some common config structs for use by both that middleware and receivers:

package limiterhelper

import (
        "go.opentelemetry.io/collector/component"
        "go.opentelemetry.io/collector/extension/extensionlimiter"
)

type RateLimitsConfig struct {
        // RateLimiter identifies a rate limiter extension to use for these limits.
        RateLimiter component.ID

        // Limits holds the configuration for specific rate limits.
        Limits map[extensionlimiter.WeightKey]RateLimitConfig
}

type RateLimitConfig struct {
        // Rate is the rate at which the limit will be refilled (up to Burst), measured in items per second.
        Rate float64

        // Burst is the maximum number of items that will be allowed through at once.
        //
        // As a special case, if Burst is zero then no rate limiting is applied.
        Burst int
}

The RateLimiterProvider.GetRateLimiter method would be updated so that the rate and burst must be passed in.
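A sketch of what that could look like; embedding extension.Extension and the RateLimiter return type are assumptions here, not settled API:

type RateLimiterProvider interface {
        extension.Extension

        // GetRateLimiter returns a limiter for the given weight key,
        // bound to the caller-supplied rate and burst.
        GetRateLimiter(key extensionlimiter.WeightKey, rate float64, burst int) (RateLimiter, error)
}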


That raises another couple of open questions for me:

  • Where do metadata keys go?
  • Where do we specify the behaviour when the rate limit has been exceeded?

I think both of these belong to the thing calling the rate limiter, independent of the weight key - so probably in the RateLimitsConfig struct above.
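If so, a hedged sketch of how RateLimitsConfig could absorb both; the MetadataKeys and OnExceeded fields are hypothetical additions:

type RateLimitsConfig struct {
        // RateLimiter identifies a rate limiter extension to use for these limits.
        RateLimiter component.ID

        // Limits holds the configuration for specific rate limits.
        Limits map[extensionlimiter.WeightKey]RateLimitConfig

        // MetadataKeys lists client-metadata keys whose values partition
        // the limits (e.g., per-tenant limiting). Hypothetical.
        MetadataKeys []string

        // OnExceeded selects the behaviour when a limit is exceeded,
        // e.g. "block", "reject", or "drop". Hypothetical.
        OnExceeded string
}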

Comment on lines +6 to +18
// AnyProvider is any limiter implementation, possibly one of the
// limiter interfaces. This serves as a marker for implementations
// which provide rate limiters of any (one) kind.
type AnyProvider interface {
// unexportedProviderFunc may be embedded using an AnyProviderImpl.
unexportedProviderFunc()
}

// AnyProviderImpl can be embedded as a marker that a component
// implements one of the rate limiters.
type AnyProviderImpl struct{}

func (AnyProviderImpl) unexportedProviderFunc() {}

I'm not convinced we need these. Extensions implement either RateLimiterProvider or ResourceLimiterProvider (or both? I don't understand why they need to be mutually exclusive). Since they are extensions, they must also implement extension.Extension - thus they already share a common base, and limiterhelper can deal with that interface.

Then, for example, AnyToRateLimiterProvider would be changed in one of the following ways (or some variation thereof):

  • Accept an extension.Extension instead of AnyProvider
  • Accept a component.Host and component.ID
  • Accept a component.Host and RateLimitsConfig, and return a RateLimiterProvider using the RateLimitsConfig.RateLimiter extension component ID, bound to the per-weight configurations in RateLimitsConfig.Limits (see the sketch below)
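A sketch of the third variation; NewRateLimiterProvider and boundProvider are hypothetical names, with component.Host.GetExtensions used for the lookup:

func NewRateLimiterProvider(host component.Host, cfg RateLimitsConfig) (RateLimiterProvider, error) {
        ext, ok := host.GetExtensions()[cfg.RateLimiter]
        if !ok {
                return nil, fmt.Errorf("limiter extension %q not found", cfg.RateLimiter)
        }
        provider, ok := ext.(RateLimiterProvider)
        if !ok {
                return nil, fmt.Errorf("extension %q does not provide rate limiters", cfg.RateLimiter)
        }
        // boundProvider (hypothetical) wraps the extension so each
        // GetRateLimiter call is bound to the per-weight rate/burst
        // settings in cfg.Limits.
        return boundProvider{provider: provider, limits: cfg.Limits}, nil
}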

Comment on lines +358 to +362
The provided helper implementations are based on
`golang.org/x/time/rate` and
`collector-contrib/internal/otelarrow/admission2`. We could instead
create two extension implementations for these. The code is a hundred
lines or so each.

IMO it's good to keep a clear line of separation between the extension interface and implementations.


// RateReservation is modeled on pkg.go.dev/golang.org/x/time/rate#Reservation
type RateReservation interface {
// WaitTime returns the duration until this reservation may

One more thing: I'm not sure this will be good enough for (all) distributed rate limiters. See elastic/opentelemetry-collector-components#619

TL;DR: a Gubernator-based rate limiter won't know up front how long it needs to wait. Rather, we would need to loop until all items have been handled, waiting in between iterations for the limit to be replenished.

Perhaps we could make it a blocking method, and (as you said in a previous draft) make it so that it will return an error if configured to be non-blocking. The API could then look like:

type RateReservation interface {
	// Wait waits for the requested number of items to be permitted, or until the context is canceled.
	// For non-blocking rate limiters, Wait returns Err??? when the limit is saturated.
	Wait(context.Context) error

	// Cancel cancels ...
	Cancel()
}
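For illustration, a sketch of how a Gubernator-backed reservation might implement that blocking Wait; the rateLimitClient interface, nonBlocking flag, and ErrLimitSaturated sentinel are all hypothetical:

// rateLimitClient abstracts a distributed limiter backend (hypothetical).
type rateLimitClient interface {
	// TryAcquire admits up to n items and reports how long to wait
	// before retrying for the remainder.
	TryAcquire(ctx context.Context, n uint64) (granted uint64, retryAfter time.Duration, err error)
}

type reservation struct {
	client      rateLimitClient
	items       uint64
	nonBlocking bool
}

func (r *reservation) Wait(ctx context.Context) error {
	remaining := r.items
	for remaining > 0 {
		granted, retryAfter, err := r.client.TryAcquire(ctx, remaining)
		if err != nil {
			return err
		}
		remaining -= granted
		if remaining == 0 {
			break
		}
		if r.nonBlocking {
			return ErrLimitSaturated // placeholder for the Err??? above
		}
		select {
		case <-time.After(retryAfter):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}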


jmacd commented Jun 25, 2025

Thank you @axw, this is great and helpful feedback. I will take a day or two to digest it and come back with an updated proposal. @bogdandrutu I think it would be helpful for you to read @axw's feedback, restricting attention to the README and the concept of "limitermiddleware", which is a point of indirection.


blumamir commented Jul 7, 2025

I have a related feature request; I'm not sure whether to write it here or in one of the related issues.

First of all, thanks @jmacd for working on this! I am facing memory issues causing OOM crashes in our collectors, and I am really looking forward to rejecting data in the receiver, before it is decoded into memory, when memory is already under pressure. We currently do this in the memory_limiter processor, but its buffer and GC intervals for handling bursts are limited, and they are sometimes eaten up as memory keeps being filled by incoming telemetry items.

The current implementation (to my understanding) will always reject new data while a memory limit is active. I assume that is the safer option and what most users will want by default. We (odigos) could really use a configuration option to instruct the receiver to drop data instead of rejecting it.
We do this to protect the upstream applications' memory and performance, ensuring that if the downstream has issues, we drop spans instead of building them up in queues and retrying in the sender runtime. Sometimes it is preferable to lose some telemetry than to have memory spikes in business applications due to OTel pipeline problems.

I guess this could also be added later, but I wanted to bring it up in this context while these middlewares are being designed.

This is an issue I opened quite a while ago to discuss it: #11726


damemi commented Jul 7, 2025

+1 to @blumamir's suggestion; being able to drop data to prevent backpressure on the user application would be strongly preferred in many of our use cases.


This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 22, 2025

jmacd commented Jul 23, 2025

I will re-open.
