Add effective-token budget guard to API proxy with config-schema support#2641
|
@copilot rather than adding warnings to call responses, expose the effective token usage so far in the /reflect endpoint response so that an agent can pull the information |
Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
|
@copilot merge main |
Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/539bd112-c459-4a8d-b8fd-dcf30f2c1a41 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Implemented. I removed response-body
Warning: Firewall rules blocked me from connecting to one or more addresses.
|
…tive-tokens-guard # Conflicts: # containers/api-proxy/server.test.js Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Merged
Warning: Firewall rules blocked me from connecting to one or more addresses.
|
|
| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 87.51% | 87.46% | 📉 -0.05% |
| Statements | 87.46% | 87.40% | 📉 -0.06% |
| Functions | 82.35% | 82.35% | ➡️ +0.00% |
| Branches | 79.73% | 79.61% | 📉 -0.12% |
📁 Per-file Coverage Changes (2 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/cli.ts | 29.0% → 28.6% (-0.41%) | 29.0% → 28.6% (-0.40%) |
| src/container-lifecycle.ts | 87.1% → 88.2% (+1.14%) | 87.5% → 88.6% (+1.11%) |
Coverage comparison generated by scripts/ci/compare-coverage.ts
Pull request overview
Adds a run-scoped “effective token” (ET) budget guard that is configured via AWF config/schema, passed to the api-proxy sidecar via env vars, and enforced inside the sidecar with updated usage normalization (including reasoning tokens) and /reflect reporting.
Changes:
- Extend AWF config model + config-file mapping to support `apiProxy.maxEffectiveTokens` and `apiProxy.modelMultipliers`, and update generated schema/spec artifacts accordingly.
- Wire the new config through runtime options into compose env (`AWF_MAX_EFFECTIVE_TOKENS`, `AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS`) for the api-proxy sidecar.
- Add reasoning-token normalization plus ET accounting/enforcement and `/reflect` exposure in the api-proxy.
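As a rough illustration of the wiring in the second bullet, the sketch below maps the two config fields onto the two env vars. The helper name `buildEtGuardEnv`, the model name, and the JSON encoding of the multipliers are assumptions made for this example, not the PR's actual code.

```javascript
// Hypothetical helper: maps the apiProxy ET config onto sidecar env vars.
// Assumption: multipliers are serialized as JSON; the PR may encode them differently.
function buildEtGuardEnv(apiProxy = {}) {
  const env = {};
  if (apiProxy.maxEffectiveTokens !== undefined) {
    env.AWF_MAX_EFFECTIVE_TOKENS = String(apiProxy.maxEffectiveTokens);
  }
  const multipliers = apiProxy.modelMultipliers;
  if (multipliers && Object.keys(multipliers).length > 0) {
    env.AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS = JSON.stringify(multipliers);
  }
  return env;
}

// 'example-model' is a placeholder model name.
const exampleEnv = buildEtGuardEnv({
  maxEffectiveTokens: 100000,
  modelMultipliers: { 'example-model': 2 },
});
console.log(exampleEnv);
```

When neither field is set, the helper returns an empty object, so the sidecar would see no ET guard configuration at all.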
Show a summary per file
| File | Description |
|---|---|
| src/types/config.ts | Adds runtime config fields for max ET and per-model multipliers. |
| src/services/api-proxy-service.ts | Forwards ET guard config to api-proxy via env vars in compose generation. |
| src/services/api-proxy-service.test.ts | Adds coverage to ensure env vars are set when ET guard is configured. |
| src/schema.test.ts | Extends schema validation tests to include the new apiProxy ET fields. |
| src/schema-validator.test.ts | Adds assertions for formatted validation errors for new ET fields. |
| src/config-file.ts | Extends config file model and mapping to runtime options for ET config. |
| src/config-file.test.ts | Adds validation + mapping tests for the new ET config fields. |
| src/cli.ts | Parses maxEffectiveTokens and injects ET fields into wrapper config passed downstream. |
| src/awf-config-schema.json | Adds schema definitions for apiProxy.maxEffectiveTokens and apiProxy.modelMultipliers. |
| scripts/generate-schema.mjs | Updates schema generator so ET fields are part of the canonical generated schema. |
| docs/awf-config.schema.json | Updates documentation schema artifact to include ET fields. |
| docs/awf-config-spec.md | Documents config→CLI mapping and notes ET fields are config-only. |
| containers/api-proxy/token-tracker.test.js | Updates usage normalization tests to include reasoning_tokens. |
| containers/api-proxy/token-tracker.js | Extracts reasoning tokens from supported usage payloads and adds an onUsage hook. |
| containers/api-proxy/server.js | Plumbs effective token usage summary into management handlers for /reflect. |
| containers/api-proxy/proxy-request.js | Implements ET config parsing, accounting, enforcement (429), and reflect state reporting. |
| containers/api-proxy/management.js | Includes effective-token usage summary in /reflect response payload. |
Copilot's findings
Comments suppressed due to low confidence (1)
containers/api-proxy/token-tracker.js:721
`trackWebSocketTokenUsage` now accepts an `onUsage` callback, but the JSDoc for `opts` doesn’t mention it. Document the callback (signature + when it is invoked) to keep the public surface of this helper clear.
* @param {import('tls').TLSSocket} upstreamSocket - Upstream TLS socket
* @param {object} opts
* @param {string} opts.requestId - Request ID for correlation
* @param {string} opts.provider - Provider name (anthropic, copilot, etc.)
* @param {string} opts.path - Request path
* @param {number} opts.startTime - Request start time (Date.now())
* @param {object} opts.metrics - Metrics module reference
*/
function trackWebSocketTokenUsage(upstreamSocket, opts) {
const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, onUsage } = opts;
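One way the suppressed comment could be addressed is sketched below. The function `reportUsageSketch` is a hypothetical stand-in for the real helper, and both the callback signature and the moment it fires are assumptions, since the excerpt does not show how `onUsage` is called.

```javascript
/**
 * Sketch only: a stand-in helper showing how `opts.onUsage` might be documented.
 * @param {object} usage - Normalized usage object (shape assumed for this example)
 * @param {object} opts
 * @param {string} opts.requestId - Request ID for correlation
 * @param {string} opts.provider - Provider name (anthropic, copilot, etc.)
 * @param {(usage: object) => void} [opts.onUsage] - Optional callback. Assumed here
 *   to be invoked once with the normalized usage after it has been extracted.
 */
function reportUsageSketch(usage, opts) {
  const { onUsage } = opts;
  if (typeof onUsage === 'function') onUsage(usage);
}

// Usage: collect whatever the hook reports.
const seenUsage = [];
reportUsageSketch(
  { input_tokens: 10, output_tokens: 5 },
  { requestId: 'req-1', provider: 'anthropic', onUsage: (u) => seenUsage.push(u) },
);
console.log(seenUsage.length);
```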
- Files reviewed: 17/17 changed files
- Comments generated: 5
if (maxEffectiveTokens !== undefined && (!Number.isFinite(maxEffectiveTokens) || maxEffectiveTokens <= 0)) {
  console.error('Error: Invalid maxEffectiveTokens value (must be a positive number)');
function parseMaxEffectiveTokens(raw) {
  if (raw === undefined || raw === null || String(raw).trim() === '') return null;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed <= 0) return null;
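The hunk above is cut off before the function returns. A self-contained sketch of the same parsing rules follows. The final `return parsed` is an assumption about the part not shown, and the `Sketch` suffix marks this as an illustration, not the PR's function.

```javascript
// Sketch of the parsing rules visible in the hunk above.
// Assumption: a valid value is returned as a number; the excerpt ends before the return.
function parseMaxEffectiveTokensSketch(raw) {
  // Unset, null, or blank means "no budget configured".
  if (raw === undefined || raw === null || String(raw).trim() === '') return null;
  const parsed = Number(raw);
  // Reject NaN, Infinity, zero, and negatives.
  if (!Number.isFinite(parsed) || parsed <= 0) return null;
  return parsed;
}

console.log(parseMaxEffectiveTokensSketch('100000'));
console.log(parseMaxEffectiveTokensSketch('not-a-number'));
```

Returning `null` for every invalid input means a misconfigured env var silently disables the guard in this sketch, which is presumably why the CLI-side check in the earlier hunk rejects bad values before they reach the sidecar.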
function getEffectiveTokenConfig() {
  const max = parseMaxEffectiveTokens(process.env.AWF_MAX_EFFECTIVE_TOKENS);
  const multipliers = parseModelMultipliers(process.env.AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS);
  return { max, multipliers };
}

function getEffectiveTokenState(config) {
  if (!config.max) return null;
  const configKey = `${config.max}|${JSON.stringify(config.multipliers)}`;
  if (etGuardState.configKey !== configKey) {
    etGuardState = createEffectiveTokenState(configKey);
  }
  return etGuardState;
}
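The config-keyed reset in `getEffectiveTokenState` can be exercised in isolation. In the sketch below, the state fields (`totalEffectiveTokens`) and the `Sketch`-suffixed names are assumptions; only the keying logic mirrors the hunk.

```javascript
// Module-level state, replaced whenever the config key changes.
// Assumption: the real state object carries more fields than shown here.
let etGuardStateSketch = { configKey: null, totalEffectiveTokens: 0 };

function createStateSketch(configKey) {
  return { configKey, totalEffectiveTokens: 0 };
}

function getStateSketch(config) {
  if (!config.max) return null; // guard disabled
  const configKey = `${config.max}|${JSON.stringify(config.multipliers)}`;
  if (etGuardStateSketch.configKey !== configKey) {
    etGuardStateSketch = createStateSketch(configKey);
  }
  return etGuardStateSketch;
}

// Same config: the accumulated total survives. Changed config: it resets.
const first = getStateSketch({ max: 100, multipliers: {} });
first.totalEffectiveTokens = 5;
const again = getStateSketch({ max: 100, multipliers: {} });
const changed = getStateSketch({ max: 200, multipliers: {} });
console.log(again.totalEffectiveTokens, changed.totalEffectiveTokens);
```

One consequence of this design is that any change to the env-derived config mid-run discards the accumulated total.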
const etBlock = getEffectiveTokenBlockState();
if (etBlock && etBlock.maxExceeded) {
  const duration = Date.now() - startTime;
  metrics.gaugeDec('active_requests', { provider });
  metrics.increment('requests_total', { provider, method: req.method, status_class: '4xx' });
  metrics.observe('request_duration_ms', duration, { provider });
  logRequest('warn', 'effective_tokens_limit_reached', {
    request_id: requestId,
    provider,
    total_effective_tokens: etBlock.totalEffectiveTokens,
    max_effective_tokens: etBlock.maxEffectiveTokens,
  });
  res.writeHead(429, { 'Content-Type': 'application/json', 'X-Request-ID': requestId });
  res.end(JSON.stringify(buildEffectiveTokenLimitError(etBlock)));
  return;
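The enforcement branch above can be reduced to a pure decision function for testing. The sketch below is hypothetical: the error body shape is an assumption (the output of `buildEffectiveTokenLimitError` is not shown in the excerpt), and `buildLimitResponseSketch` is a name invented for this example.

```javascript
// Returns null when the request may proceed, or a 429 description when blocked.
// Assumption: the error payload shape is illustrative, not the PR's format.
function buildLimitResponseSketch(etBlock, requestId) {
  if (!etBlock || !etBlock.maxExceeded) return null;
  return {
    status: 429,
    headers: { 'Content-Type': 'application/json', 'X-Request-ID': requestId },
    body: JSON.stringify({
      error: {
        type: 'effective_tokens_limit_reached',
        total_effective_tokens: etBlock.totalEffectiveTokens,
        max_effective_tokens: etBlock.maxEffectiveTokens,
      },
    }),
  };
}

const blocked = buildLimitResponseSketch(
  { maxExceeded: true, totalEffectiveTokens: 120000, maxEffectiveTokens: 100000 },
  'req-1',
);
console.log(blocked.status, blocked.body);
```

Keeping the decision separate from `res.writeHead` makes the block condition testable without an HTTP server.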
 * @param {http.IncomingMessage} proxyRes - Upstream response
 * @param {object} opts
 * @param {string} opts.requestId - Request ID for correlation
 * @param {string} opts.provider - Provider name (openai, anthropic, copilot, opencode)
 * @param {string} opts.path - Request path
 * @param {number} opts.startTime - Request start time (Date.now())
 * @param {object} opts.metrics - Metrics module reference
 * @param {object|null} opts.billingInfo - Extracted billing/quota headers from response
 * @param {string|null} opts.initiatorSent - X-Initiator value sent on the request
 */
function trackTokenUsage(proxyRes, opts) {
-  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent } = opts;
+  const { requestId, provider, path: reqPath, startTime, metrics: metricsRef, billingInfo, initiatorSent, onUsage } = opts;
  const streaming = isStreamingResponse(proxyRes.headers);
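To show how an `onUsage` hook could feed the budget guard, here is a minimal accumulator sketch. Everything in it is an assumption made for illustration: the callback signature, the flat sum of input, output, and reasoning tokens, the default multiplier of 1, and treating "at the limit" as exceeded. The PR computes ET per its own spec, which this excerpt does not show.

```javascript
// Hypothetical accumulator wired to an onUsage-style hook.
// Assumption: ET = (input + output + reasoning) * per-model multiplier.
function makeEtAccumulatorSketch(max, multipliers = {}) {
  const state = { totalEffectiveTokens: 0, maxExceeded: false };
  function onUsage(model, usage) {
    const multiplier = multipliers[model] ?? 1; // unknown models count at 1x
    const rawTokens =
      (usage.input_tokens || 0) +
      (usage.output_tokens || 0) +
      (usage.reasoning_tokens || 0);
    state.totalEffectiveTokens += rawTokens * multiplier;
    state.maxExceeded = state.totalEffectiveTokens >= max;
  }
  return { state, onUsage };
}

// 'big-model' and 'small-model' are placeholder names.
const guard = makeEtAccumulatorSketch(1000, { 'big-model': 2 });
guard.onUsage('big-model', { input_tokens: 100, output_tokens: 200, reasoning_tokens: 50 });
const totalAfterFirst = guard.state.totalEffectiveTokens;
const exceededAfterFirst = guard.state.maxExceeded;
guard.onUsage('small-model', { input_tokens: 300 });
console.log(totalAfterFirst, guard.state.totalEffectiveTokens, guard.state.maxExceeded);
```

Because usage is only known after a response arrives, a guard like this blocks the next request after the budget is crossed, so the final total can overshoot the configured maximum.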
|
@copilot address the review feedback |
|
| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 87.51% | 87.49% | 📉 -0.02% |
| Statements | 87.46% | 87.45% | ➡️ -0.01% |
| Functions | 82.35% | 82.47% | 📈 +0.12% |
| Branches | 79.73% | 79.64% | 📉 -0.09% |
📁 Per-file Coverage Changes (3 files)
| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/cli.ts | 29.0% → 28.6% (-0.41%) | 29.0% → 28.6% (-0.40%) |
| src/container-lifecycle.ts | 87.1% → 88.2% (+1.14%) | 87.5% → 88.6% (+1.11%) |
| src/services/agent-service.ts | 95.6% → 97.5% (+1.90%) | 95.3% → 97.6% (+2.29%) |
✨ New Files (2 files)
- src/services/agent-environment.ts: 98.5% lines
- src/services/agent-volumes.ts: 91.6% lines
Coverage comparison generated by scripts/ci/compare-coverage.ts
Addressed the review feedback in
Warning: Firewall rules blocked me from connecting to one or more addresses.
|
|
|
Smoke Test: Copilot BYOK (Offline) Mode
Running in BYOK offline mode.
Overall: PASS (inference path confirmed working)
|
Smoke Test Results
✅ GitHub MCP: Retrieved last 2 merged PRs
✅ Playwright: Navigated to GitHub, verified page title contains "GitHub"
✅ File Writing: Created test file /tmp/gh-aw/agent/smoke-test-claude-25469722429.txt
✅ Bash Tool: Verified file creation with cat command
Overall Status: PASS
|
🔬 Smoke Test Results
PR: Add effective-token budget guard to API proxy with config-schema support
Overall: PASS ✅
|
Chroot Smoke Test Results
Result: ❌ Not all runtimes matched — Python and Node.js versions differ between host and chroot environment.
|
|
Smoke Test: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Smoke Test Results — FAIL
Overall: FAIL
|
This change adds a run-level effective-token budget to AWF and routes enforcement to the api-proxy sidecar. It also adds configurable model multipliers so ET is computed per spec and can better bound mixed model/cached workloads than run counts.
Config surface + schema (stdin/config-file path)
- Adds `apiProxy.maxEffectiveTokens` and `apiProxy.modelMultipliers` to the AWF config model.
- Extends `--config` ingestion into runtime options.

Runtime wiring into sidecar
- Extends `WrapperConfig` and compose env generation to pass:
  - `AWF_MAX_EFFECTIVE_TOKENS`
  - `AWF_EFFECTIVE_TOKEN_MODEL_MULTIPLIERS`

ET accounting + enforcement in api-proxy

Progressive threshold signaling
- Exposes effective-token usage in `/reflect` so agents can pull this information.

Usage normalization updates

Example `effective_tokens` payload:

    {
      "effective_tokens": {
        "enabled": true,
        "max_effective_tokens": 100000,
        "total_effective_tokens": 75234.4,
        "remaining_effective_tokens": 24765.6,
        "percent_used": 75.23,
        "thresholds_crossed": [50, 75]
      }
    }
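The derived fields in the `effective_tokens` payload can be reproduced from the total and the maximum. The sketch below is one way to do that. The threshold list `[50, 75, 90]`, the two-decimal rounding, and the function name are assumptions; only the field names come from the payload.

```javascript
// Hypothetical summary builder matching the field names in the payload.
// Assumptions: thresholds are percentages; values are rounded to two decimals.
function summarizeEffectiveTokensSketch(total, max, thresholds = [50, 75, 90]) {
  const round2 = (n) => Math.round(n * 100) / 100;
  const percentUsed = (total / max) * 100;
  return {
    enabled: true,
    max_effective_tokens: max,
    total_effective_tokens: total,
    remaining_effective_tokens: round2(Math.max(0, max - total)),
    percent_used: round2(percentUsed),
    thresholds_crossed: thresholds.filter((t) => percentUsed >= t),
  };
}

const summary = summarizeEffectiveTokensSketch(75234.4, 100000);
console.log(JSON.stringify({ effective_tokens: summary }));
```

An agent polling `/reflect` could read `percent_used` or `thresholds_crossed` from such a summary and wind down its work before the proxy starts returning 429.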