Add effective-token budget guard to API proxy with config-schema support by Copilot · Pull Request #2641 · github/gh-aw-firewall

Copilot · 2026-05-06T23:44:43Z

This change adds a run-level effective-token (ET) budget to AWF and routes enforcement to the api-proxy sidecar. It also adds configurable model multipliers so ET is computed per spec and bounds mixed-model and cached workloads more tightly than run counts can.

Config surface + schema (stdin/config-file path)
- Added apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers to AWF config model.
- Updated generated schema/spec artifacts so these fields are versioned and validated.
- Wired config mapping through --config ingestion into runtime options.

Runtime wiring into sidecar
- Extended WrapperConfig and compose env generation to pass:
  - AWF_MAX_EFFECTIVE_TOKENS
  - AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS
- Kept this config-only (no new CLI flags), matching stdin-driven workflow usage.

ET accounting + enforcement in api-proxy
- Added effective-token computation from usage data using token class weights + per-model multipliers.
- Tracks cumulative ET for the run and enforces hard stop when the configured max is reached.
- Returns structured limit error from proxy when over budget.

Progressive threshold signaling
- Tracks threshold crossings at 50%, 75%, 90%, and 95%.
- Exposes current effective-token usage and crossed thresholds via /reflect so agents can pull this information.

Usage normalization updates
- Extended token normalization to include reasoning tokens from supported provider usage payload shapes.
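
As a rough illustration of the accounting described above, here is a minimal sketch of a weighted ET computation with a cumulative budget. The weight values, the default multiplier of 1.0, and the names `TOKEN_CLASS_WEIGHTS`, `computeEffectiveTokens`, and `createBudget` are all assumptions for illustration; the real weights come from the ET spec and are not listed in this description.

```javascript
// Hypothetical token-class weights; the real values are defined by the ET spec.
const TOKEN_CLASS_WEIGHTS = {
  input_tokens: 1.0,
  cache_read_tokens: 0.1,
  output_tokens: 4.0,
  reasoning_tokens: 4.0,
};

// Weighted sum over token classes, scaled by the per-model multiplier.
// Models without a configured multiplier fall back to 1.0 (an assumption).
function computeEffectiveTokens(usage, model, modelMultipliers = {}) {
  let weighted = 0;
  for (const [tokenClass, weight] of Object.entries(TOKEN_CLASS_WEIGHTS)) {
    weighted += (usage[tokenClass] || 0) * weight;
  }
  const multiplier = modelMultipliers[model] ?? 1.0;
  return weighted * multiplier;
}

// Run-level accumulator: the hard stop applies once the total reaches the max.
function createBudget(maxEffectiveTokens) {
  let total = 0;
  return {
    record(effectiveTokens) {
      total += effectiveTokens;
      return total;
    },
    get total() {
      return total;
    },
    get exceeded() {
      return total >= maxEffectiveTokens;
    },
  };
}
```

With these illustrative weights, a gpt-4o call with 1000 input and 100 output tokens and a 2.0 multiplier would count as (1000 + 400) × 2 = 2800 ET.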

# awf config (stdin/file)
apiProxy:
  enabled: true
  maxEffectiveTokens: 100000
  modelMultipliers:
    gpt-4o: 2.0
    claude-sonnet-4: 1.5
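
For orientation, a sketch of how the config above might map to the sidecar environment. The two env var names come from this PR; the function name `buildEffectiveTokenEnv` and the JSON encoding of the multipliers map are assumptions, not the actual compose-generation code.

```javascript
// Sketch: map the apiProxy config section to sidecar env vars.
// Env var names are from the PR; JSON-encoding the multipliers is an assumption.
function buildEffectiveTokenEnv(apiProxy = {}) {
  const env = {};
  if (apiProxy.maxEffectiveTokens !== undefined) {
    env.AWF_MAX_EFFECTIVE_TOKENS = String(apiProxy.maxEffectiveTokens);
  }
  const multipliers = apiProxy.modelMultipliers;
  if (multipliers && Object.keys(multipliers).length > 0) {
    env.AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS = JSON.stringify(multipliers);
  }
  return env;
}

// Usage with the example config from this description.
const exampleEnv = buildEffectiveTokenEnv({
  enabled: true,
  maxEffectiveTokens: 100000,
  modelMultipliers: { 'gpt-4o': 2.0, 'claude-sonnet-4': 1.5 },
});
```

Leaving both variables unset when the fields are absent keeps the guard disabled by default in this sketch.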

{
  "effective_tokens": {
    "enabled": true,
    "max_effective_tokens": 100000,
    "total_effective_tokens": 75234.4,
    "remaining_effective_tokens": 24765.6,
    "percent_used": 75.23,
    "thresholds_crossed": [50, 75]
  }
}
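
The /reflect payload above could be derived from the running total roughly as follows. This is a sketch only: the function name `buildEffectiveTokenSummary` and the rounding (one decimal for token counts, two for the percentage) are assumptions chosen to match the sample values, not confirmed behavior.

```javascript
// Thresholds named in this PR: 50%, 75%, 90%, and 95%.
const THRESHOLDS = [50, 75, 90, 95];

// Round to a fixed number of decimal places (assumed presentation detail).
function roundTo(value, digits) {
  return Number(value.toFixed(digits));
}

// Build the effective_tokens section of a /reflect-style response.
function buildEffectiveTokenSummary(total, max) {
  const percent = (total / max) * 100;
  return {
    enabled: true,
    max_effective_tokens: max,
    total_effective_tokens: roundTo(total, 1),
    remaining_effective_tokens: roundTo(Math.max(0, max - total), 1),
    percent_used: roundTo(percent, 2),
    thresholds_crossed: THRESHOLDS.filter((t) => percent >= t),
  };
}
```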


Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/2434ce5a-3838-48b7-a726-4ab223ad93ee

lpcox · 2026-05-07T00:08:27Z

@copilot rather than adding warnings to call responses, expose the effective token usage so far in the /reflect endpoint response so that an agent can pull the information

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

lpcox · 2026-05-07T00:20:35Z

@copilot merge main

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot · 2026-05-07T00:22:55Z

@copilot rather than adding warnings to call responses, expose the effective token usage so far in the /reflect endpoint response so that an agent can pull the information

Implemented. I removed response-body awf_warning injection and now expose effective-token usage in /reflect via effective_tokens (enabled/max/total/remaining/percent/thresholds_crossed). Also updated tests accordingly. Addressed in commit 70082f8 (with test stabilization in 3f121ea).
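
To illustrate the pull model, here is a sketch of how an agent might act on the `effective_tokens` section after fetching /reflect. The function name `chooseAgentAction`, the action labels, and the 75%/90% policy are hypothetical; only the field names come from the payload shown in this PR.

```javascript
// Decide what an agent should do based on a /reflect-style payload.
// The policy and the action labels are illustrative assumptions.
function chooseAgentAction(reflectPayload) {
  const et = reflectPayload && reflectPayload.effective_tokens;
  if (!et || !et.enabled) return 'continue';
  if (et.remaining_effective_tokens <= 0) return 'stop';
  const highest = Math.max(0, ...(et.thresholds_crossed || []));
  if (highest >= 90) return 'wrap_up';
  if (highest >= 75) return 'economize';
  return 'continue';
}

// Usage with the sample payload from the PR description.
const sampleAction = chooseAgentAction({
  effective_tokens: {
    enabled: true,
    max_effective_tokens: 100000,
    total_effective_tokens: 75234.4,
    remaining_effective_tokens: 24765.6,
    percent_used: 75.23,
    thresholds_crossed: [50, 75],
  },
});
```

In practice the payload would come from a GET to the sidecar's /reflect endpoint; the address is deployment-specific and is left out of this sketch.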

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

api.example.com
- Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js (dns block)
- Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js --runInBand (dns block)
- Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/.bin/jest --verbose --ci server.test.js --runInBand --forceExit (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)


Copilot · 2026-05-07T00:27:25Z

@copilot merge main

Merged main into this branch in f3f2f0e and resolved the conflict in containers/api-proxy/server.test.js using the latest split-test baseline from main. No UI changes in this update.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

api.example.com
- Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node /opt/hostedtoolcache/node/20.20.2/x64/bin/node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/jest-worker/build/processChild.js (dns block)


github-actions · 2026-05-07T00:29:26Z

Documentation Preview

Documentation build failed for this PR.

Built from commit 3e4f3c1

github-actions · 2026-05-07T00:30:00Z

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric	Base	PR	Delta
Lines	87.51%	87.46%	📉 -0.05%
Statements	87.46%	87.40%	📉 -0.06%
Functions	82.35%	82.35%	➡️ +0.00%
Branches	79.73%	79.61%	📉 -0.12%

📁 Per-file Coverage Changes (2 files)

File	Lines (Before → After)	Statements (Before → After)
`src/cli.ts`	29.0% → 28.6% (-0.41%)	29.0% → 28.6% (-0.40%)
`src/container-lifecycle.ts`	87.1% → 88.2% (+1.14%)	87.5% → 88.6% (+1.11%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot

Pull request overview

Adds a run-scoped “effective token” (ET) budget guard that is configured via AWF config/schema, passed to the api-proxy sidecar via env vars, and enforced inside the sidecar with updated usage normalization (including reasoning tokens) and /reflect reporting.

Changes:

Extend AWF config model + config-file mapping to support apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers, and update generated schema/spec artifacts accordingly.
Wire the new config through runtime options into compose env (AWF_MAX_EFFECTIVE_TOKENS, AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS) for the api-proxy sidecar.
Add reasoning-token normalization plus ET accounting/enforcement and /reflect exposure in the api-proxy.


File	Description
src/types/config.ts	Adds runtime config fields for max ET and per-model multipliers.
src/services/api-proxy-service.ts	Forwards ET guard config to api-proxy via env vars in compose generation.
src/services/api-proxy-service.test.ts	Adds coverage to ensure env vars are set when ET guard is configured.
src/schema.test.ts	Extends schema validation tests to include the new apiProxy ET fields.
src/schema-validator.test.ts	Adds assertions for formatted validation errors for new ET fields.
src/config-file.ts	Extends config file model and mapping to runtime options for ET config.
src/config-file.test.ts	Adds validation + mapping tests for the new ET config fields.
src/cli.ts	Parses `maxEffectiveTokens` and injects ET fields into wrapper config passed downstream.
src/awf-config-schema.json	Adds schema definitions for `apiProxy.maxEffectiveTokens` and `apiProxy.modelMultipliers`.
scripts/generate-schema.mjs	Updates schema generator so ET fields are part of the canonical generated schema.
docs/awf-config.schema.json	Updates documentation schema artifact to include ET fields.
docs/awf-config-spec.md	Documents config→CLI mapping and notes ET fields are config-only.
containers/api-proxy/token-tracker.test.js	Updates usage normalization tests to include `reasoning_tokens`.
containers/api-proxy/token-tracker.js	Extracts reasoning tokens from supported usage payloads and adds an `onUsage` hook.
containers/api-proxy/server.js	Plumbs effective token usage summary into management handlers for `/reflect`.
containers/api-proxy/proxy-request.js	Implements ET config parsing, accounting, enforcement (429), and reflect state reporting.
containers/api-proxy/management.js	Includes effective-token usage summary in `/reflect` response payload.

Copilot's findings


Comments suppressed due to low confidence (1)

containers/api-proxy/token-tracker.js:721

trackWebSocketTokenUsage now accepts an onUsage callback, but the JSDoc for opts doesn’t mention it. Document the callback (signature + when it is invoked) to keep the public surface of this helper clear.

 * @param {import('tls').TLSSocket} upstreamSocket - Upstream TLS socket
 * @param {object} opts
 * @param {string} opts.requestId - Request ID for correlation
 * @param {string} opts.provider - Provider name (anthropic, copilot, etc.)
 * @param {string} opts.path - Request path
 * @param {number} opts.startTime - Request start time (Date.now())
 * @param {object} opts.metrics - Metrics module reference
 */
function trackWebSocketTokenUsage(upstreamSocket, opts) {
  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, onUsage } = opts;

Files reviewed: 17/17 changed files
Comments generated: 5

+    if (maxEffectiveTokens !== undefined && (!Number.isFinite(maxEffectiveTokens) || maxEffectiveTokens <= 0)) {
+      console.error('Error: Invalid maxEffectiveTokens value (must be a positive number)');


+function parseMaxEffectiveTokens(raw) {
+  if (raw === undefined || raw === null || String(raw).trim() === '') return null;
+  const parsed = Number(raw);
+  if (!Number.isFinite(parsed) || parsed <= 0) return null;


+function getEffectiveTokenConfig() {
+  const max = parseMaxEffectiveTokens(process.env.AWF_MAX_EFFECTIVE_TOKENS);
+  const multipliers = parseModelMultipliers(process.env.AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS);
+  return { max, multipliers };
+}
+
+function getEffectiveTokenState(config) {
+  if (!config.max) return null;
+  const configKey = `${config.max}|${JSON.stringify(config.multipliers)}`;
+  if (etGuardState.configKey !== configKey) {
+    etGuardState = createEffectiveTokenState(configKey);
+  }
+  return etGuardState;
+}


+    const etBlock = getEffectiveTokenBlockState();
+    if (etBlock && etBlock.maxExceeded) {
+      const duration = Date.now() - startTime;
+      metrics.gaugeDec('active_requests', { provider });
+      metrics.increment('requests_total', { provider, method: req.method, status_class: '4xx' });
+      metrics.observe('request_duration_ms', duration, { provider });
+      logRequest('warn', 'effective_tokens_limit_reached', {
+        request_id: requestId,
+        provider,
+        total_effective_tokens: etBlock.totalEffectiveTokens,
+        max_effective_tokens: etBlock.maxEffectiveTokens,
+      });
+      res.writeHead(429, { 'Content-Type': 'application/json', 'X-Request-ID': requestId });
+      res.end(JSON.stringify(buildEffectiveTokenLimitError(etBlock)));
+      return;
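
The early-return path in the excerpt above can be summarized as a pure function. This is a sketch under assumptions: `buildRejection` is a hypothetical name and the error body shape is invented for illustration, since the output of `buildEffectiveTokenLimitError` is not shown here. The 429 status, the two response headers, and the `effective_tokens_limit_reached` label are taken from the excerpt.

```javascript
// Return a rejection descriptor when the budget is exhausted, else null.
// The JSON body shape is hypothetical; the real one is produced by
// buildEffectiveTokenLimitError, which this excerpt does not show.
function buildRejection(state, requestId) {
  if (!state || state.totalEffectiveTokens < state.maxEffectiveTokens) {
    return null;
  }
  return {
    status: 429,
    headers: { 'Content-Type': 'application/json', 'X-Request-ID': requestId },
    body: JSON.stringify({
      error: {
        type: 'effective_tokens_limit_reached',
        message: 'Effective-token budget for this run is exhausted',
        total_effective_tokens: state.totalEffectiveTokens,
        max_effective_tokens: state.maxEffectiveTokens,
      },
    }),
  };
}
```

Checking the budget before contacting the upstream provider means a rejected request consumes no further tokens.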


 * @param {http.IncomingMessage} proxyRes - Upstream response
 * @param {object} opts
 * @param {string} opts.requestId - Request ID for correlation
 * @param {string} opts.provider - Provider name (openai, anthropic, copilot, opencode)
 * @param {string} opts.path - Request path
 * @param {number} opts.startTime - Request start time (Date.now())
 * @param {object} opts.metrics - Metrics module reference
 * @param {object|null} opts.billingInfo - Extracted billing/quota headers from response
 * @param {string|null} opts.initiatorSent - X-Initiator value sent on the request
 */
 function trackTokenUsage(proxyRes, opts) {
-  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent } = opts;
+  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent, onUsage } = opts;
  const streaming = isStreamingResponse(proxyRes.headers);


lpcox · 2026-05-07T00:46:25Z

@copilot address the review feedback

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/f6e4a632-b2b0-43de-907b-8b02bfee1b2e

github-actions · 2026-05-07T00:52:16Z

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric	Base	PR	Delta
Lines	87.51%	87.49%	📉 -0.02%
Statements	87.46%	87.45%	➡️ -0.01%
Functions	82.35%	82.47%	📈 +0.12%
Branches	79.73%	79.64%	📉 -0.09%

📁 Per-file Coverage Changes (3 files)

File	Lines (Before → After)	Statements (Before → After)
`src/cli.ts`	29.0% → 28.6% (-0.41%)	29.0% → 28.6% (-0.40%)
`src/container-lifecycle.ts`	87.1% → 88.2% (+1.14%)	87.5% → 88.6% (+1.11%)
`src/services/agent-service.ts`	95.6% → 97.5% (+1.90%)	95.3% → 97.6% (+2.29%)

✨ New Files (2 files)

src/services/agent-environment.ts: 98.5% lines
src/services/agent-volumes.ts: 91.6% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/f6e4a632-b2b0-43de-907b-8b02bfee1b2e

Copilot · 2026-05-07T00:55:18Z

@copilot address the review feedback

Addressed the review feedback in 7869db0 (plus supporting commits 317a46a, cfdef2d, ff981ea): integer-only maxEffectiveTokens validation in CLI/runtime, cached ET env parsing, JSDoc updates for onUsage, and a new proxyRequest test that verifies structured 429 rejection once the ET budget is exceeded. No UI changes in this update.
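
A minimal sketch of what integer-only validation and cached env parsing could look like. It assumes the check moved from `Number.isFinite` to `Number.isInteger`; the caching helper `getMaxEffectiveTokens` and its keying on the raw env string are hypothetical and may differ from the commits referenced above.

```javascript
// Integer-only parsing: empty values, non-numeric strings, fractions,
// zero, and negatives all return null, which leaves the guard disabled.
function parseMaxEffectiveTokens(raw) {
  if (raw === undefined || raw === null || String(raw).trim() === '') return null;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) return null;
  return parsed;
}

// Cache the parsed value keyed on the raw env string so the request
// path does not re-parse on every call (hypothetical helper).
let cachedRaw;
let cachedMax = null;
function getMaxEffectiveTokens(env = process.env) {
  const raw = env.AWF_MAX_EFFECTIVE_TOKENS;
  if (raw !== cachedRaw) {
    cachedRaw = raw;
    cachedMax = parseMaxEffectiveTokens(raw);
  }
  return cachedMax;
}
```

Keying the cache on the raw string still lets tests change the variable between cases without a manual reset.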

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

api.example.com
- Triggering command: /opt/hostedtoolcache/node/20.20.2/x64/bin/node /opt/hostedtoolcache/node/20.20.2/x64/bin/node /home/REDACTED/work/gh-aw-firewall/gh-aw-firewall/containers/api-proxy/node_modules/jest-worker/build/processChild.js (dns block)


github-actions · 2026-05-07T01:04:57Z

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric	Base	PR	Delta
Lines	87.51%	87.49%	📉 -0.02%
Statements	87.46%	87.45%	➡️ -0.01%
Functions	82.35%	82.47%	📈 +0.12%
Branches	79.73%	79.64%	📉 -0.09%

📁 Per-file Coverage Changes (3 files)

File	Lines (Before → After)	Statements (Before → After)
`src/cli.ts`	29.0% → 28.6% (-0.41%)	29.0% → 28.6% (-0.40%)
`src/container-lifecycle.ts`	87.1% → 88.2% (+1.14%)	87.5% → 88.6% (+1.11%)
`src/services/agent-service.ts`	95.6% → 97.5% (+1.90%)	95.3% → 97.6% (+2.29%)

✨ New Files (2 files)

src/services/agent-environment.ts: 98.5% lines
src/services/agent-volumes.ts: 91.6% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

github-actions · 2026-05-07T01:06:13Z

Smoke Test: Copilot BYOK (Offline) Mode

Test	Result
GitHub MCP (list PRs)	✅ PR #2655 "fix: remove unused export from 10 service parameter interfaces"
GitHub.com connectivity	⚠️ Pre-step template vars not expanded (skipped)
File write/read	⚠️ Pre-step template vars not expanded (skipped)
BYOK inference (this response)	✅

Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy → api.githubcopilot.com

Overall: PASS (inference path confirmed working)
PR by @Copilot · Assignees: @lpcox, @Copilot

🔑 BYOK report filed by Smoke Copilot BYOK

github-actions · 2026-05-07T01:06:16Z

Smoke Test Results

✅ GitHub MCP: Retrieved last 2 merged PRs

- fix: remove unused export from 10 service parameter interfaces (#2655)
- docs: add billing bug warning for retired releases v0.25.21–v0.25.39 (#2638)

✅ Playwright: Navigated to GitHub, verified page title contains "GitHub"

✅ File Writing: Created test file /tmp/gh-aw/agent/smoke-test-claude-25469722429.txt

✅ Bash Tool: Verified file creation with cat command

Overall Status: PASS

💥 [THE END] — Illustrated by Smoke Claude

github-actions · 2026-05-07T01:06:35Z

🔬 Smoke Test Results

Test	Result
GitHub MCP connectivity	✅
GitHub.com HTTP connectivity	✅ 200
File write/read	✅ (pre-step data unavailable; template vars not expanded)

PR: Add effective-token budget guard to API proxy with config-schema support
Author: @Copilot | Assignees: @lpcox, @Copilot

Overall: PASS ✅

📰 BREAKING: Report filed by Smoke Copilot

github-actions · 2026-05-07T01:07:34Z

Chroot Smoke Test Results

Runtime	Host Version	Chroot Version	Match?
Python	Python 3.12.13	Python 3.12.3	❌ NO
Node.js	v24.14.1	v20.20.2	❌ NO
Go	go1.22.12	go1.22.12	✅ YES

Result: ❌ Not all runtimes matched — Python and Node.js versions differ between host and chroot environment.

Tested by Smoke Chroot

github-actions · 2026-05-07T01:07:54Z

Smoke Test: FAIL
PRs: fix: remove unused export from 10 service parameter interfaces; docs: add billing bug warning for retired releases v0.25.21–v0.25.39
✅ GitHub PR review
❌ safeinputs-gh query unavailable
✅ Playwright title contains GitHub
❌ Tavily search unavailable
✅ File write/read + bash
❌ Discussion query tool unavailable; oracle comment posted via fallback
✅ npm ci && npm run build
Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

github-actions · 2026-05-07T01:07:59Z

🏗️ Build Test Suite Results

Ecosystem	Project	Build/Install	Tests	Status
Bun	elysia	✅	1/1 passed	✅ PASS
Bun	hono	✅	1/1 passed	✅ PASS
C++	fmt	✅	N/A	✅ PASS
C++	json	✅	N/A	✅ PASS
Deno	oak	N/A	1/1 passed	✅ PASS
Deno	std	N/A	1/1 passed	✅ PASS
.NET	hello-world	✅	N/A	✅ PASS
.NET	json-parse	✅	N/A	✅ PASS
Go	color	✅	1/1 passed	✅ PASS
Go	env	✅	1/1 passed	✅ PASS
Go	uuid	✅	1/1 passed	✅ PASS
Java	gson	✅	1/1 passed	✅ PASS
Java	caffeine	✅	1/1 passed	✅ PASS
Node.js	clsx	✅	all passed	✅ PASS
Node.js	execa	✅	all passed	✅ PASS
Node.js	p-limit	✅	all passed	✅ PASS
Rust	fd	✅	1/1 passed	✅ PASS
Rust	zoxide	✅	1/1 passed	✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #2641

github-actions · 2026-05-07T01:08:31Z

Smoke Test Results — FAIL

Check	Result
Redis PING	❌ Timeout / no response
PostgreSQL pg_isready	❌ No response
PostgreSQL SELECT 1	❌ Not attempted (pg_isready failed)

host.docker.internal is not reachable from this runner environment. Service containers are unavailable.

Overall: FAIL

🔌 Service connectivity validated by Smoke Services

Initial plan

757ba6d

Copilot AI assigned Copilot and lpcox May 6, 2026

Copilot started work on behalf of lpcox May 6, 2026 23:44

Copilot AI linked an issue May 6, 2026 that may be closed by this pull request

max-effective-tokens #2615

Closed

Copilot AI added 3 commits May 6, 2026 23:54

feat: add effective token guard in api-proxy

0989fae

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/2434ce5a-3838-48b7-a726-4ab223ad93ee

fix: refine effective token guard error handling

573f809

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/2434ce5a-3838-48b7-a726-4ab223ad93ee

fix: align effective token config typing and guard behavior

5fe342c

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/2434ce5a-3838-48b7-a726-4ab223ad93ee

Copilot AI changed the title ~~[WIP] Add max effective tokens support in configuration~~ Add effective-token budget guard to API proxy with config-schema support May 7, 2026

Copilot finished work on behalf of lpcox May 7, 2026 00:01

Copilot AI requested a review from lpcox May 7, 2026 00:01

Copilot started work on behalf of lpcox May 7, 2026 00:08

refactor: expose effective token usage via reflect endpoint

70082f8

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

test: stabilize reflect effective token mock flow

3f121ea

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot finished work on behalf of lpcox May 7, 2026 00:23

Copilot started work on behalf of lpcox May 7, 2026 00:23

Merge remote-tracking branch 'origin/main' into copilot/add-max-effec…

f3f2f0e

…tive-tokens-guard # Conflicts: # containers/api-proxy/server.test.js Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot finished work on behalf of lpcox May 7, 2026 00:28

lpcox marked this pull request as ready for review May 7, 2026 00:28

lpcox requested a review from Mossaka as a code owner May 7, 2026 00:28

Copilot AI review requested due to automatic review settings May 7, 2026 00:28

Copilot started reviewing on behalf of lpcox May 7, 2026 00:29