Skip to content

Add effective-token budget guard to API proxy with config-schema support#2641

Merged
lpcox merged 11 commits intomainfrom
copilot/add-max-effective-tokens-guard
May 7, 2026
Merged

Add effective-token budget guard to API proxy with config-schema support#2641
lpcox merged 11 commits intomainfrom
copilot/add-max-effective-tokens-guard

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 6, 2026

This change adds a run-level effective-token budget to AWF and routes enforcement to the api-proxy sidecar. It also adds configurable model multipliers so ET is computed per spec and can better bound mixed model/cached workloads than run counts.

  • Config surface + schema (stdin/config-file path)

    • Added apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers to AWF config model.
    • Updated generated schema/spec artifacts so these fields are versioned and validated.
    • Wired config mapping through --config ingestion into runtime options.
  • Runtime wiring into sidecar

    • Extended WrapperConfig and compose env generation to pass:
      • AWF_MAX_EFFECTIVE_TOKENS
      • AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS
    • Kept this config-only (no new CLI flags), matching stdin-driven workflow usage.
  • ET accounting + enforcement in api-proxy

    • Added effective-token computation from usage data using token class weights + per-model multipliers.
    • Tracks cumulative ET for the run and enforces hard stop when the configured max is reached.
    • Returns structured limit error from proxy when over budget.
  • Progressive threshold signaling

    • Tracks threshold crossings at 50%, 75%, 90%, and 95%.
    • Exposes current effective-token usage and crossed thresholds via /reflect so agents can pull this information.
  • Usage normalization updates

    • Extended token normalization to include reasoning tokens from supported provider usage payload shapes.
# awf config (stdin/file)
apiProxy:
  enabled: true
  maxEffectiveTokens: 100000
  modelMultipliers:
    gpt-4o: 2.0
    claude-sonnet-4: 1.5
{
  "effective_tokens": {
    "enabled": true,
    "max_effective_tokens": 100000,
    "total_effective_tokens": 75234.4,
    "remaining_effective_tokens": 24765.6,
    "percent_used": 75.23,
    "thresholds_crossed": [50, 75]
  }
}

> [!WARNING]
>
>

Copilot AI linked an issue May 6, 2026 that may be closed by this pull request
Copilot AI changed the title [WIP] Add max effective tokens support in configuration Add effective-token budget guard to API proxy with config-schema support May 7, 2026
Copilot finished work on behalf of lpcox May 7, 2026 00:01
Copilot AI requested a review from lpcox May 7, 2026 00:01
@lpcox
Copy link
Copy Markdown
Collaborator

lpcox commented May 7, 2026

@copilot rather than adding warnings to call respnoses, expose the effective token usage so far in the /reflect endpoint response so that an agent can pull the information

@lpcox
Copy link
Copy Markdown
Collaborator

lpcox commented May 7, 2026

@copilot merge main

Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 7, 2026

@copilot rather than adding warnings to call respnoses, expose the effective token usage so far in the /reflect endpoint response so that an agent can pull the information

Implemented. I removed response-body awf_warning injection and now expose effective-token usage in /reflect via effective_tokens (enabled/max/total/remaining/percent/thresholds_crossed). Also updated tests accordingly. Addressed in commit 70082f8 (with test stabilization in 3f121ea).

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.example.com
    • Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js (dns block)
    • Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js --runInBand (dns block)
    • Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js --runInBand --forceExit (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

…tive-tokens-guard

# Conflicts:
#	containers/api-proxy/server.test.js

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 7, 2026

@copilot merge main

Merged main into this branch in f3f2f0e and resolved the conflict in containers/api-proxy/server.test.js using the latest split-test baseline from main. No UI changes in this update.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.example.com
    • Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node /opt/hostedtoolcache/node/20.20.2/x64/bin/node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/jest-worker/build/processChild.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of lpcox May 7, 2026 00:28
@lpcox lpcox marked this pull request as ready for review May 7, 2026 00:28
@lpcox lpcox requested a review from Mossaka as a code owner May 7, 2026 00:28
Copilot AI review requested due to automatic review settings May 7, 2026 00:28
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit 3e4f3c1

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric Base PR Delta
Lines 87.51% 87.46% 📉 -0.05%
Statements 87.46% 87.40% 📉 -0.06%
Functions 82.35% 82.35% ➡️ +0.00%
Branches 79.73% 79.61% 📉 -0.12%
📁 Per-file Coverage Changes (2 files)
File Lines (Before → After) Statements (Before → After)
src/cli.ts 29.0% → 28.6% (-0.41%) 29.0% → 28.6% (-0.40%)
src/container-lifecycle.ts 87.1% → 88.2% (+1.14%) 87.5% → 88.6% (+1.11%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions github-actions Bot mentioned this pull request May 7, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a run-scoped “effective token” (ET) budget guard that is configured via AWF config/schema, passed to the api-proxy sidecar via env vars, and enforced inside the sidecar with updated usage normalization (including reasoning tokens) and /reflect reporting.

Changes:

  • Extend AWF config model + config-file mapping to support apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers, and update generated schema/spec artifacts accordingly.
  • Wire the new config through runtime options into compose env (AWF_MAX_EFFECTIVE_TOKENS, AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS) for the api-proxy sidecar.
  • Add reasoning-token normalization plus ET accounting/enforcement and /reflect exposure in the api-proxy.
Show a summary per file
File Description
src/types/config.ts Adds runtime config fields for max ET and per-model multipliers.
src/services/api-proxy-service.ts Forwards ET guard config to api-proxy via env vars in compose generation.
src/services/api-proxy-service.test.ts Adds coverage to ensure env vars are set when ET guard is configured.
src/schema.test.ts Extends schema validation tests to include the new apiProxy ET fields.
src/schema-validator.test.ts Adds assertions for formatted validation errors for new ET fields.
src/config-file.ts Extends config file model and mapping to runtime options for ET config.
src/config-file.test.ts Adds validation + mapping tests for the new ET config fields.
src/cli.ts Parses maxEffectiveTokens and injects ET fields into wrapper config passed downstream.
src/awf-config-schema.json Adds schema definitions for apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers.
scripts/generate-schema.mjs Updates schema generator so ET fields are part of the canonical generated schema.
docs/awf-config.schema.json Updates documentation schema artifact to include ET fields.
docs/awf-config-spec.md Documents config→CLI mapping and notes ET fields are config-only.
containers/api-proxy/token-tracker.test.js Updates usage normalization tests to include reasoning_tokens.
containers/api-proxy/token-tracker.js Extracts reasoning tokens from supported usage payloads and adds an onUsage hook.
containers/api-proxy/server.js Plumbs effective token usage summary into management handlers for /reflect.
containers/api-proxy/proxy-request.js Implements ET config parsing, accounting, enforcement (429), and reflect state reporting.
containers/api-proxy/management.js Includes effective-token usage summary in /reflect response payload.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

containers/api-proxy/token-tracker.js:721

  • trackWebSocketTokenUsage now accepts an onUsage callback, but the JSDoc for opts doesn’t mention it. Document the callback (signature + when it is invoked) to keep the public surface of this helper clear.
 * @param {import('tls').TLSSocket} upstreamSocket - Upstream TLS socket
 * @param {object} opts
 * @param {string} opts.requestId - Request ID for correlation
 * @param {string} opts.provider - Provider name (anthropic, copilot, etc.)
 * @param {string} opts.path - Request path
 * @param {number} opts.startTime - Request start time (Date.now())
 * @param {object} opts.metrics - Metrics module reference
 */
function trackWebSocketTokenUsage(upstreamSocket, opts) {
  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, onUsage } = opts;

  • Files reviewed: 17/17 changed files
  • Comments generated: 5

Comment thread src/cli.ts Outdated
Comment on lines +584 to +585
if (maxEffectiveTokens !== undefined && (!Number.isFinite(maxEffectiveTokens) || maxEffectiveTokens <= 0)) {
console.error('Error: Invalid maxEffectiveTokens value (must be a positive number)');
Comment thread containers/api-proxy/proxy-request.js Outdated
function parseMaxEffectiveTokens(raw) {
if (raw === undefined || raw === null || String(raw).trim() === '') return null;
const parsed = Number(raw);
if (!Number.isFinite(parsed) || parsed <= 0) return null;
Comment on lines +148 to +161
function getEffectiveTokenConfig() {
const max = parseMaxEffectiveTokens(process.env.AWF_MAX_EFFECTIVE_TOKENS);
const multipliers = parseModelMultipliers(process.env.AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS);
return { max, multipliers };
}

function getEffectiveTokenState(config) {
if (!config.max) return null;
const configKey = `${config.max}|${JSON.stringify(config.multipliers)}`;
if (etGuardState.configKey !== configKey) {
etGuardState = createEffectiveTokenState(configKey);
}
return etGuardState;
}
Comment on lines +461 to +475
const etBlock = getEffectiveTokenBlockState();
if (etBlock && etBlock.maxExceeded) {
const duration = Date.now() - startTime;
metrics.gaugeDec('active_requests', { provider });
metrics.increment('requests_total', { provider, method: req.method, status_class: '4xx' });
metrics.observe('request_duration_ms', duration, { provider });
logRequest('warn', 'effective_tokens_limit_reached', {
request_id: requestId,
provider,
total_effective_tokens: etBlock.totalEffectiveTokens,
max_effective_tokens: etBlock.maxEffectiveTokens,
});
res.writeHead(429, { 'Content-Type': 'application/json', 'X-Request-ID': requestId });
res.end(JSON.stringify(buildEffectiveTokenLimitError(etBlock)));
return;
Comment on lines 412 to 424
* @param {http.IncomingMessage} proxyRes - Upstream response
* @param {object} opts
* @param {string} opts.requestId - Request ID for correlation
* @param {string} opts.provider - Provider name (openai, anthropic, copilot, opencode)
* @param {string} opts.path - Request path
* @param {number} opts.startTime - Request start time (Date.now())
* @param {object} opts.metrics - Metrics module reference
* @param {object|null} opts.billingInfo - Extracted billing/quota headers from response
* @param {string|null} opts.initiatorSent - X-Initiator value sent on the request
*/
function trackTokenUsage(proxyRes, opts) {
const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent } = opts;
const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent, onUsage } = opts;
const streaming = isStreamingResponse(proxyRes.headers);
@github-actions

This comment has been minimized.

@lpcox
Copy link
Copy Markdown
Collaborator

lpcox commented May 7, 2026

@copilot address the review feedback

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric Base PR Delta
Lines 87.51% 87.49% 📉 -0.02%
Statements 87.46% 87.45% ➡️ -0.01%
Functions 82.35% 82.47% 📈 +0.12%
Branches 79.73% 79.64% 📉 -0.09%
📁 Per-file Coverage Changes (3 files)
File Lines (Before → After) Statements (Before → After)
src/cli.ts 29.0% → 28.6% (-0.41%) 29.0% → 28.6% (-0.40%)
src/container-lifecycle.ts 87.1% → 88.2% (+1.14%) 87.5% → 88.6% (+1.11%)
src/services/agent-service.ts 95.6% → 97.5% (+1.90%) 95.3% → 97.6% (+2.29%)
✨ New Files (2 files)
  • src/services/agent-environment.ts: 98.5% lines
  • src/services/agent-volumes.ts: 91.6% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 7, 2026

@copilot address the review feedback

Addressed the review feedback in 7869db0 (plus supporting commits 317a46a, cfdef2d, ff981ea): integer-only maxEffectiveTokens validation in CLI/runtime, cached ET env parsing, JSDoc updates for onUsage, and a new proxyRequest test that verifies structured 429 rejection once the ET budget is exceeded. No UI changes in this update.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.example.com
    • Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node /opt/hostedtoolcache/node/20.20.2/x64/bin/node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/jest-worker/build/processChild.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of lpcox May 7, 2026 00:56
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric Base PR Delta
Lines 87.51% 87.49% 📉 -0.02%
Statements 87.46% 87.45% ➡️ -0.01%
Functions 82.35% 82.47% 📈 +0.12%
Branches 79.73% 79.64% 📉 -0.09%
📁 Per-file Coverage Changes (3 files)
File Lines (Before → After) Statements (Before → After)
src/cli.ts 29.0% → 28.6% (-0.41%) 29.0% → 28.6% (-0.40%)
src/container-lifecycle.ts 87.1% → 88.2% (+1.14%) 87.5% → 88.6% (+1.11%)
src/services/agent-service.ts 95.6% → 97.5% (+1.90%) 95.3% → 97.6% (+2.29%)
✨ New Files (2 files)
  • src/services/agent-environment.ts: 98.5% lines
  • src/services/agent-volumes.ts: 91.6% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Smoke Test: Copilot BYOK (Offline) Mode

Test Result
GitHub MCP (list PRs) ✅ PR #2655 "fix: remove unused export from 10 service parameter interfaces"
GitHub.com connectivity ⚠️ Pre-step template vars not expanded (skipped)
File write/read ⚠️ Pre-step template vars not expanded (skipped)
BYOK inference (this response)

Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy → api.githubcopilot.com

Overall: PASS (inference path confirmed working)
PR by @Copilot · Assignees: @lpcox, @Copilot

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Smoke Test Results

✅ GitHub MCP: Retrieved last 2 merged PRs

✅ Playwright: Navigated to GitHub, verified page title contains "GitHub"

✅ File Writing: Created test file /tmp/gh-aw/agent/smoke-test-claude-25469722429.txt

✅ Bash Tool: Verified file creation with cat command

Overall Status: PASS

💥 [THE END] — Illustrated by Smoke Claude

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

🔬 Smoke Test Results

Test Result
GitHub MCP connectivity
GitHub.com HTTP connectivity ✅ 200
File write/read ✅ (pre-step data unavailable; template vars not expanded)

PR: Add effective-token budget guard to API proxy with config-schema support
Author: @Copilot | Assignees: @lpcox, @Copilot

Overall: PASS ✅

📰 BREAKING: Report filed by Smoke Copilot

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Chroot Smoke Test Results

Runtime Host Version Chroot Version Match?
Python Python 3.12.13 Python 3.12.3 ❌ NO
Node.js v24.14.1 v20.20.2 ❌ NO
Go go1.22.12 go1.22.12 ✅ YES

Result: ❌ Not all runtimes matched — Python and Node.js versions differ between host and chroot environment.

Tested by Smoke Chroot

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Smoke Test: FAIL
PRs: fix: remove unused export from 10 service parameter interfaces; docs: add billing bug warning for retired releases v0.25.21–v0.25.39
✅ GitHub PR review
❌ safeinputs-gh query unavailable
✅ Playwright title contains GitHub
❌ Tavily search unavailable
✅ File write/read + bash
❌ Discussion query tool unavailable; oracle comment posted via fallback
✅ npm ci && npm run build
Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx all passed ✅ PASS
Node.js execa all passed ✅ PASS
Node.js p-limit all passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #2641 · ● 2M ·

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Smoke Test Results — FAIL

Check Result
Redis PING ❌ Timeout / no response
PostgreSQL pg_isready ❌ No response
PostgreSQL SELECT 1 ❌ Not attempted (pg_isready failed)

host.docker.internal is not reachable from this runner environment. Service containers are unavailable.

Overall: FAIL

🔌 Service connectivity validated by Smoke Services

@lpcox lpcox merged commit 633e8f4 into main May 7, 2026
64 of 70 checks passed
@lpcox lpcox deleted the copilot/add-max-effective-tokens-guard branch May 7, 2026 01:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

max-effective-tokens

3 participants