Description
Problem
We're experiencing high consumption of the "Exactly once subscriber pull operations per minute per region" GCP Pub/Sub quota due to excessive idle streaming pull connections. Our quota usage is primarily driven by connection maintenance overhead rather than actual message processing.
Misleading Configuration
MaxConcurrency is used to set GCP's MaxOutstandingMessages, but it does NOT control GCP's NumGoroutines (the number of StreamingPull connections). The naming is confusing and led us to believe we could control the connection count via MaxConcurrency; we have since figured this out.
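For reference, this is how the two knobs map onto the GCP Go client's ReceiveSettings. A minimal sketch using the public cloud.google.com/go/pubsub API (placeholder project and subscription IDs; this is not Encore's runtime code):

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()

	// Placeholder project ID for illustration.
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Placeholder subscription ID for illustration.
	sub := client.Subscription("my-subscription")

	// This is what MaxConcurrency controls: the flow-control cap on how many
	// messages may be processed at once.
	sub.ReceiveSettings.MaxOutstandingMessages = 100

	// This is the separate knob that actually controls the number of
	// StreamingPull streams. It defaults to 10 and is unaffected by the
	// flow-control setting above.
	sub.ReceiveSettings.NumGoroutines = 2

	if err := sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		m.Ack()
	}); err != nil {
		log.Fatal(err)
	}
}
```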
Default Behavior Creates Too Many Connections
With the default GCP Pub/Sub client settings:
- NumGoroutines = 10 StreamingPull connections per subscription per instance
- Each idle connection generates ~55 operations/min of maintenance traffic (heartbeats? exactly-once delivery state?)
- We have 18 subscriptions
- Each instance therefore maintains ~180 pull streams (18 subscriptions × 10 connections)
Quota details:
- Quota limit: 180,000 operations/min
- Baseline usage: ~55,000 ops/min (~30% of quota) with 5 instances (our minimum) during off-peak hours
- Peak usage: >100% (we get throttled)
- Problem: >80% of operations are connection overhead, not actual message processing
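Sanity check on that baseline with the numbers above: 180 streams × ~55 ops/min ≈ 9,900 ops/min per instance, and 5 instances × ~9,900 ≈ 49,500 ops/min, consistent with the observed ~55,000/min.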
Adaptive experimental feature is useless
We enabled the "adaptive-gcp-pubsub-goroutines" experimental flag, which reduced connections from ~180 per instance to ~171 per instance (~5% reduction).
This is insufficient for our needs: the feature targets 90 connections per CPU, and since we autoscale to 30 instances × 2 CPUs = 60 CPUs, it would still allow ~5,400 pull streams, or ~180% of our quota spent on maintaining connections alone.
Proposed Solution
Add explicit NumGoroutines configuration per subscription:
```go
type PubsubSubscriptionConfig struct {
	// ... existing fields ...

	// NumGoroutines sets the number of StreamingPull connections to maintain
	// for this subscription. Each goroutine maintains one persistent gRPC stream.
	//
	// If not set, defaults to 10 (the GCP client library default).
	NumGoroutines *int
}
```
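Wiring this into the runtime should then be a small change wherever the GCP subscription's ReceiveSettings are built. A minimal sketch, assuming illustrative variable and field names rather than Encore's actual runtime code:

```go
// Sketch: apply the proposed field to the GCP client's ReceiveSettings.
sub := client.Subscription(subscriptionID)
sub.ReceiveSettings.MaxOutstandingMessages = cfg.MaxConcurrency
if cfg.NumGoroutines != nil {
	// An explicit per-subscription stream count overrides the client default of 10.
	sub.ReceiveSettings.NumGoroutines = *cfg.NumGoroutines
}
```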
Benefits
- Predictable quota consumption: Operators can calculate exact baseline overhead
- Right-sized for traffic: Low-traffic subscriptions don't waste connections
- Cost optimization: Could reduce idle connection overhead by 70-80% (see the example after this list)
- Clear semantics: Separate concerns (processing concurrency vs. connection count)
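For example, using the numbers above: setting NumGoroutines = 2 on all 18 subscriptions would take an instance from 180 idle pull streams down to 36, an ~80% reduction in connection maintenance operations.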
Workaround
Currently, we have no way to control NumGoroutines per subscription without modifying Encore runtime code. The adaptive feature doesn't provide sufficient control for our use case.
Environment
- GCP Pub/Sub with exactly-once delivery enabled
- 2 CPUs per instance
- 5-30 instances