
feat: support server-hinted artifact batch sizing#3736

Draft
lox wants to merge 1 commit into main from feat/artifact-batch-hints


@lox (Contributor) commented Mar 5, 2026

Disclosure: Ampcode assisted, human curated.

Summary

This change adds server-driven artifact batching controls to the agent and introduces light adaptive behaviour for artifact creation when rate-limited.

What Changed

  1. Added optional ping response fields for artifact batching hints:
    1. artifact_create_batch_size
    2. artifact_update_batch_size_max
  2. Stored ping-provided hint values in the agent worker and injected them into job env so bootstrap/subcommands inherit them:
    1. BUILDKITE_ARTIFACT_CREATE_BATCH_SIZE
    2. BUILDKITE_ARTIFACT_UPDATE_BATCH_SIZE_MAX
  3. Added CLI/env wiring in artifact upload for both knobs.
  4. Updated artifact uploader config to accept:
    1. create batch size
    2. update batch size max
  5. Updated CreateArtifacts batching logic to:
    1. use configurable initial batch size
    2. increase batch size on 429 responses (doubling, capped)
  6. Updated UpdateArtifacts logic to chunk large state updates by configured max batch size.
  7. Added focused tests for:
    1. ping hint propagation into job env
    2. create-batch growth on 429
    3. update-state chunking by configured max

Why

  1. Reduce control-plane request pressure during very large artifact workloads.
  2. Let the server steer batching behaviour per workload without requiring static global defaults.
  3. Improve resilience under rate-limiting by adapting create request shape.
  4. Keep behaviour backwards-compatible when hints are absent.

Notes

  1. Defaults remain unchanged when no hint/env is provided.
  2. Validated with:
    1. go tool gofumpt -extra -w ... on modified files
    2. go test ./api ./agent ./internal/artifact ./clicommand

@moskyb (Contributor) left a comment


i like the idea here, but given the churn going on with pings at the moment with agent push, i think potentially it might be tricky to get remote config manipulation working through both push and polling ping at the same time.

that said though, i think responding to 429s with increased batch sizes is a great idea, and i'd be happy to merge that side of this PR as-is. perhaps we could bump the batch size limit even further? 1000?

@DrJosh9000 (Contributor)

My very first thought, without even considering the proposal, is: why not just increase the batch size default, without any dynamic behaviour? Having the batch size be backend-controlled is good in case we discover that particular batch sizes are bad for us.

I instinctively disagree with @moskyb: I don't think bumping the batch size dynamically based on 429s is a good idea, even if there is an upper limit. I could write some words on it here, but I think there are probably better approaches.

@moskyb (Contributor) commented Mar 10, 2026

> why not just increase the batch size default, without any dynamic behaviour?

yeah, i agree with this.


3 participants