NumGoroutines Configuration for GCP Pub/Sub Subscriptions #2157

@djeer

Description

Problem

We're experiencing high consumption of the "Exactly once subscriber pull operations per minute per region" GCP Pub/Sub quota due to excessive idle streaming pull connections. Our quota usage is primarily driven by connection maintenance overhead rather than actual message processing.

Misleading Configuration

MaxConcurrency is used to set GCP's MaxOutstandingMessages, but it does NOT control GCP's NumGoroutines (the number of StreamingPull connections). The naming led us to believe we could control connection count via MaxConcurrency; it took some digging to figure out that we cannot.

Default Behavior Creates Too Many Connections

With the default GCP Pub/Sub client settings:

  • NumGoroutines = 10 streaming pull connections per subscription per instance
  • Each idle connection generates ~55 operations/min for maintenance (heartbeats?, exactly-once delivery state?)
  • We have 18 subscriptions
  • Each instance therefore maintains ~180 pull streams (18 subscriptions × 10 goroutines)

Quota details:

  • Quota limit: 180,000 operations/min
  • Baseline usage: 55,000/min (~30%) at the minimum of 5 instances during off-peak hours
  • Peak usage: >100% (we get throttled)
  • Problem: >80% of operations are connection overhead, not actual message processing
Adaptive experimental feature is useless

We enabled the "adaptive-gcp-pubsub-goroutines" experimental flag, which reduced connections from ~180 per instance to ~171 per instance (~5% reduction).

This is insufficient for our needs: the feature targets 90 connections per CPU, and we autoscale to 30 instances × 2 CPUs = 60 CPUs. That leads to ~5,400 pull streams, i.e. roughly 180% of the quota consumed as baseline on maintaining connections alone.

Proposed Solution

Add explicit NumGoroutines configuration per subscription:

  type PubsubSubscriptionConfig struct {
      // ... existing fields ...

      // NumGoroutines sets the number of StreamingPull connections to maintain
      // for this subscription. Each goroutine maintains one persistent gRPC stream.
      //
      // If not set, defaults to 10 (GCP client library default).
      NumGoroutines *int
  }

Benefits

  1. Predictable quota consumption: Operators can calculate exact baseline overhead
  2. Right-sized for traffic: Low-traffic subscriptions don't waste connections
  3. Cost optimization: Could reduce idle connection overhead by 70-80%
  4. Clear semantics: Separate concerns (processing concurrency vs. connection count)

Workaround

Currently, we have no way to control NumGoroutines per subscription without modifying Encore runtime code. The adaptive feature doesn't provide sufficient control for our use case.

Environment

  • GCP Pub/Sub with exactly-once delivery enabled
  • 2 CPUs per instance
  • 5-30 instances
