allow users to define secret provider plugin timeouts in secret blocks #27622

Open
ubiquitousbyte wants to merge 2 commits into hashicorp:main from ubiquitousbyte:feature/configure-external-secret-provider-timeouts

@ubiquitousbyte ubiquitousbyte commented Mar 2, 2026

Context

The hardcoded 10-second timeout for secret provider plugins is insufficient for some integrations, as reported in #27618.

Solution

This PR adds a configurable timeout field to the secret block in job specifications, allowing operators to tune timeouts per-secret based on their backend's latency requirements.

I initially considered adding a client-level configuration, but opted for per-secret configuration instead, for several reasons:

  1. Different secret backends have different latency characteristics. One secret provider might complete in 2s while another might need 30s. Per-secret configuration allows mixing fast and slow providers in the same job.
  2. Timeout changes can be made by updating the job specification rather than modifying agent configuration and restarting nodes. Avoiding an agent restart is always operationally preferable imho.
  3. The timeout travels with the job, making it easier to understand requirements and move jobs between environments.
  4. Secrets without explicit timeouts continue using the proven 10s default, while only slow providers need adjustment.

Implementation

Rather than adding a required timeout parameter to all plugin constructor calls, I implemented the functional options pattern commonly used in Go and throughout the Nomad codebase.

I think this has some benefits: it does not break existing callers, future options (e.g., for retries or logging) can be added without breaking changes, and the default behavior is preserved.
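A minimal sketch of the functional options pattern described above, not the actual diff. The struct fields and the constructor's `name` parameter are hypothetical stand-ins, and ignoring non-positive durations in `WithTimeout` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// defaultPluginTimeout mirrors the 10s default described in this PR.
const defaultPluginTimeout = 10 * time.Second

// externalSecretsPlugin is a hypothetical, trimmed stand-in for the plugin type.
type externalSecretsPlugin struct {
	name    string
	timeout time.Duration
}

// SecretsPluginOption configures a plugin at construction time.
type SecretsPluginOption func(*externalSecretsPlugin)

// WithTimeout overrides the default timeout. Ignoring non-positive values is
// an assumption of this sketch, not necessarily what the PR does.
func WithTimeout(d time.Duration) SecretsPluginOption {
	return func(p *externalSecretsPlugin) {
		if d > 0 {
			p.timeout = d
		}
	}
}

// NewExternalSecretsPlugin sets the default first, then applies options in order.
func NewExternalSecretsPlugin(name string, opts ...SecretsPluginOption) *externalSecretsPlugin {
	p := &externalSecretsPlugin{name: name, timeout: defaultPluginTimeout}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	// Existing callers pass no options and keep the default.
	legacy := NewExternalSecretsPlugin("other-provider")
	// New callers opt in to a longer timeout.
	slow := NewExternalSecretsPlugin("nomad-1password", WithTimeout(45*time.Second))
	fmt.Println(legacy.timeout, slow.timeout) // 10s 45s
}
```

Because the options are variadic, a call site that passes none compiles unchanged, which is the backwards-compatibility property mentioned above.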

Changes

API & Structs

  • Added Timeout time.Duration field to Secret struct in both API and internal representations
  • Canonicalize() defaults unspecified timeouts to 10s to keep things backwards compatible.
  • Updated Equal() and Copy() methods to handle the Timeout field.
  • Updated ApiTaskToStructsTask to copy over the Timeout field.
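The struct changes listed above could look roughly like the following. This is a sketch under assumptions: the `Secret` type here is a hypothetical, trimmed version carrying only a few fields, and `defaultSecretTimeout` is an illustrative constant name.

```go
package main

import (
	"fmt"
	"time"
)

// defaultSecretTimeout is an illustrative name for the 10s default.
const defaultSecretTimeout = 10 * time.Second

// Secret is a hypothetical, trimmed version of the secret block struct.
type Secret struct {
	Name     string
	Provider string
	Path     string
	Timeout  time.Duration
}

// Canonicalize fills in the default when no timeout was specified.
func (s *Secret) Canonicalize() {
	if s.Timeout == 0 {
		s.Timeout = defaultSecretTimeout
	}
}

// Copy returns a copy of the secret, including its timeout.
func (s *Secret) Copy() *Secret {
	if s == nil {
		return nil
	}
	c := *s
	return &c
}

// Equal compares all fields, including the timeout.
func (s *Secret) Equal(o *Secret) bool {
	if s == nil || o == nil {
		return s == o
	}
	return s.Name == o.Name &&
		s.Provider == o.Provider &&
		s.Path == o.Path &&
		s.Timeout == o.Timeout
}

func main() {
	s := &Secret{Name: "other-secret", Provider: "other-provider"}
	s.Canonicalize()
	fmt.Println(s.Timeout) // 10s

	c := s.Copy()
	c.Timeout = 45 * time.Second
	fmt.Println(s.Equal(c)) // false
}
```

Including the timeout in `Equal()` matters because otherwise a job update that only changes the timeout would not be detected as a change to the secret.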

Plugin Implementation

  • Introduced SecretsPluginOption functional option type
  • Added WithTimeout(duration) option constructor
  • Refactored NewExternalSecretsPlugin() to accept variadic options
  • Plugin constructor sets default 10s timeout, then applies any provided options
  • Both Fingerprint() and Fetch() operations use the configured timeout.
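To illustrate how a configured timeout can bound a call such as Fetch(), here is a minimal sketch using `context.WithTimeout`. Every name in it (`fetchFunc`, `fetchWithTimeout`, `slowProvider`, `tryFetch`) is hypothetical, and the provider is simulated in-process rather than executed as a real plugin.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetchFunc is a hypothetical stand-in for a call into an external provider.
type fetchFunc func(ctx context.Context) (string, error)

// fetchWithTimeout bounds a single provider call with the given timeout.
func fetchWithTimeout(parent context.Context, timeout time.Duration, fetch fetchFunc) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return fetch(ctx)
}

// slowProvider simulates a backend that needs `latency` to respond and
// gives up as soon as the context is done.
func slowProvider(latency time.Duration) fetchFunc {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(latency):
			return "secret-value", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// tryFetch runs one simulated fetch and reports the value and whether the
// call hit its deadline. Arguments are in milliseconds.
func tryFetch(timeoutMillis, latencyMillis int) (string, bool) {
	v, err := fetchWithTimeout(
		context.Background(),
		time.Duration(timeoutMillis)*time.Millisecond,
		slowProvider(time.Duration(latencyMillis)*time.Millisecond),
	)
	return v, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	// Timeout shorter than the provider latency: the call is cut off.
	v, timedOut := tryFetch(10, 200)
	fmt.Printf("%q %v\n", v, timedOut) // "" true

	// Timeout longer than the provider latency: the call succeeds.
	v, timedOut = tryFetch(1000, 5)
	fmt.Printf("%q %v\n", v, timedOut) // "secret-value" false
}
```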

Integration

  • Secrets hook passes WithTimeout(s.Timeout) when creating external plugins
  • Fingerprinting code unchanged (uses default timeout)
  • No changes to plugin provider wrapper (timeout is encapsulated in plugin)

Usage Example

job "example" {
  group "group" {
    task "task" {
      # Slow secret provider requiring extended timeout
      secret "op-secret" {
        provider = "nomad-1password"
        path     = "op://vault/item/password"
        timeout  = "45s"  # Custom timeout for high-latency provider
        env {
          OP_SERVICE_ACCOUNT_TOKEN = "..."
        }
      }
      
      secret "other-secret" {
        provider = "other-provider"
        path     = "..."
        # Uses the default 10s timeout.
        # ... 
      }
    }
  }
}

Closes #27618

@ubiquitousbyte ubiquitousbyte requested review from a team as code owners March 2, 2026 22:29

tehut commented Mar 3, 2026

@ubiquitousbyte thanks so much for your really thorough write-up and for taking up the issue! I'll review it today.


@tehut tehut left a comment

@ubiquitousbyte I took a look at this today. I agree with your assertion that this customization doesn't belong on the client level, but I'm not sure that it belongs in the task definition, either.

This feels like a plugin concern and something that should be configurable by cluster operators rather than job authors.

Given the limited means of passing information from the common plugins to Nomad, I understand the choice, but I'd like to spend a bit more time with it to see if we can come up with an alternative.

Development

Successfully merging this pull request may close these issues.

Secret provider plugin timeout of 10 seconds is insufficient for 1Password integration

2 participants