@@ -994,7 +994,11 @@ metadata, duplicate endpoint ids, duplicate provider reports, malformed endpoint
 ids, non-lowercase provider names, invalid Cloud API base URLs, Cloud API base
 URLs with query or fragment components, non-2xx route results, non-JavaScript
 view asset paths/content types, missing, malformed, or empty-content view asset
-SHA-256 digests, and credential-shaped field names or string values such as tokens,
+SHA-256 digests, missing model results, failed lifecycle calls, unhandled event
+calls, asset integrity values that do not match the recorded asset digest,
+empty action/provider/evaluator/response-handler outputs, missing
+service/app-bridge results, and credential-shaped field names or string values
+such as tokens,
 authorization headers, API keys, passwords, secrets, bearer/basic auth values,
 and URLs with embedded credentials anywhere in the artifact. Every exercised RPC
 target must also start with one of the module ids observed in the live manifest,
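
The new rejection rules in this hunk (missing model results, failed lifecycle calls, unhandled event calls, missing service/app-bridge results) map onto the protocol success fields this change documents: `modelResult.result`, `lifecycleResult.ok: true`, `eventResult.handled: true`, `serviceResult.result`, and `appBridgeResult.result`. The following is a minimal sketch of such a check, not the validator's actual code. The `ConformanceResults` type, the `findWeakResults` function, and the definition of "missing" (undefined, null, or empty string) are assumptions made for illustration.

```typescript
// Hypothetical shape: only the five field paths come from the document.
type ConformanceResults = {
  modelResult?: { result?: unknown };
  lifecycleResult?: { ok?: unknown };
  eventResult?: { handled?: unknown };
  serviceResult?: { result?: unknown };
  appBridgeResult?: { result?: unknown };
};

// Assumption: undefined, null, and "" all count as a missing result.
function isPresent(value: unknown): boolean {
  return value !== undefined && value !== null && value !== "";
}

// Returns the field paths that fail the success-field requirement.
function findWeakResults(results: ConformanceResults): string[] {
  const problems: string[] = [];
  if (!isPresent(results.modelResult?.result)) problems.push("modelResult.result");
  if (results.lifecycleResult?.ok !== true) problems.push("lifecycleResult.ok");
  if (results.eventResult?.handled !== true) problems.push("eventResult.handled");
  if (!isPresent(results.serviceResult?.result)) problems.push("serviceResult.result");
  if (!isPresent(results.appBridgeResult?.result)) problems.push("appBridgeResult.result");
  return problems;
}
```

Note that the lifecycle and event checks compare strictly against `true`, so a truthy string such as `"yes"` would still be reported; whether the real validator is that strict is not stated in the source.
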
@@ -1004,6 +1008,16 @@ target recorded in `conformance.moduleExercises`. The conformance harness keeps
 the required surface summary in `conformance.exercised`, then performs
 additional cheap RPC calls for untouched modules so multi-module endpoints still
 produce per-module exercise evidence without overwriting the summary target.
+The harness fails at observation time when action, provider, evaluator,
+response-handler evaluator, response-handler field evaluator, service, or app
+bridge calls return empty success-shaped payloads, and when lifecycle or event
+calls do not report success.
+When a view asset includes subresource integrity metadata, the harness verifies
+that value against the fetched bundle bytes before recording the observation.
+The live report writer rejects unknown report kinds before writing, only accepts
+lowercase hyphenated report names, enforces `cloud.json` for Cloud and
+`<provider>.json` for provider reports, and writes with exclusive create so a
+second artifact cannot overwrite the first observation.
 `sync.registered` and `sync.registeredModules` must not contain duplicate
 materialized plugin/module identities, and every registered module must have a
 unique trusted `sync.trustDecisions` entry, so full-surface evidence is tied
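
The live report writer rules added in this hunk (known kinds only, lowercase hyphenated names, `cloud.json` or `<provider>.json`, exclusive create) could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's writer: `reportFileName`, `writeReport`, and the exact name pattern are hypothetical, and the sketch relies on Node's `"wx"` file flag, which fails with `EEXIST` when the path already exists.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumed pattern: lowercase letters and digits, separated by single hyphens.
const REPORT_NAME = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Hypothetical helper: validates kind and name, returns the file name.
function reportFileName(kind: string, name: string, provider?: string): string {
  if (kind !== "cloud" && kind !== "provider") {
    throw new Error(`unknown report kind: ${kind}`);
  }
  if (!REPORT_NAME.test(name)) {
    throw new Error(`invalid report name: ${name}`);
  }
  if (kind === "cloud" && name !== "cloud") {
    throw new Error("Cloud reports must be named cloud");
  }
  if (kind === "provider" && name !== provider) {
    throw new Error("provider reports must be named after their provider");
  }
  return `${name}.json`;
}

// Hypothetical writer: the "wx" flag makes a second write to the same path throw.
function writeReport(dir: string, kind: string, name: string, report: unknown, provider?: string): string {
  const path = join(dir, reportFileName(kind, name, provider));
  writeFileSync(path, JSON.stringify(report, null, 2), { flag: "wx" });
  return path;
}

// Usage: the first write succeeds; repeating it for the same name would throw.
const reportDir = mkdtempSync(join(tmpdir(), "live-reports-"));
writeReport(reportDir, "cloud", "cloud", { ok: true });
```

Validating the name before joining it onto the directory also keeps path separators and `..` segments out of the target path, since neither matches the assumed pattern.
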
@@ -1132,11 +1146,13 @@ packages/agent/src/services/remote-capability-endpoint-conformance.test.ts
   building a temporary remote view bundle, starting the reference endpoint,
   running CLI conformance against it with bearer auth, importing the returned
   bundle as JavaScript, and tearing it down.
-- `bun run --cwd packages/agent test:remote-capabilities` passed with 158
-  tests passing and 1 skipped after adding registered-remote component
-  ownership checks, cross-module/local model collision checks, and stale
-  contribution cleanup coverage for disappearing remote modules, plus runtime
-  app route-module collision protection for remote app bridges.
+- `bun run --cwd packages/agent test:remote-capabilities` passed with 188
+  tests passing and 3 skipped. The canonical suite covers registered-remote
+  component ownership checks, cross-module/local model collision checks, stale
+  contribution cleanup coverage for disappearing remote modules, runtime app
+  route-module collision protection for remote app bridges, and live report
+  writer safety for report names, identity, duplicate artifacts, and weak
+  conformance result rejection.
 - `bun run --cwd packages/agent test:remote-capabilities:source-build` passed
   with 2 focused tests passing and 35 adapter tests skipped by name filter.
 - `bun run --cwd packages/agent test:remote-capabilities:provider-live` found
@@ -1207,10 +1223,13 @@ packages/agent/src/services/remote-capability-cloud-sandbox.cloud-smoke.test.ts
   strict scheduled/manual observation, and that the final `test-status` gate
   treats scheduled runs as strict, with required provider endpoints, strict
   live report validation, required artifact upload, and matching live report
-  directories between smoke producers, validators, and uploaded artifacts.
+  directories between smoke producers, validators, and uploaded artifacts. It
+  also audits the package-level `test:remote-capabilities` script so live report
+  writer safety remains in the canonical remote-capability suite.
 - `bun run test:remote-capabilities:live-ci-audit:self-test` mutates those
-  report-directory env vars, artifact upload paths, final `test-status` live
-  job gating, scheduled/manual live observation gates, Cloud
+  report-directory env vars, artifact upload paths, package-level remote
+  capability suite membership, final `test-status` live job gating,
+  scheduled/manual live observation gates, Cloud
   freshness/identity validation flags, provider primary endpoint secret
   enforcement, provider allowed/required lists, and provider GitHub-env
   matching, and proves the live-CI audit fails when smoke output no longer
@@ -1223,18 +1242,30 @@ packages/agent/src/services/remote-capability-cloud-sandbox.cloud-smoke.test.ts
   configured transport URL. The fingerprint helper also strips query/fragment
   components and rejects embedded URL credentials before hashing, matching the
   URL-backed endpoint provider's accepted base URL shape.
+- Live report writers only accept lowercase report names with numbers or
+  hyphens, require Cloud reports to be named `cloud`, require provider reports
+  to be named after their provider, and create report files with exclusive
+  writes, so a duplicate Cloud or provider report cannot silently overwrite an
+  earlier artifact before validation/upload.
 - Conformance reports include an `rpcCalls` ledger that records every canonical
   protocol method used for each exercised surface and module. The live report
   validator requires this ledger to cover every `moduleExercises` entry, every
   summarized required surface, and every evaluator phase (`shouldRun`,
   `prepare`, `prompt`, `process`, response-handler `evaluate`, and field
   `parse`/`handle`), so live evidence proves the endpoint was exercised through
   the standard RPC-like protocol, not only materialized in a manifest.
+- Model, lifecycle, event, service, and app-bridge conformance results must
+  carry their required protocol success fields: `modelResult.result`,
+  `lifecycleResult.ok: true`, `eventResult.handled: true`,
+  `serviceResult.result`, and `appBridgeResult.result`.
 - View-asset conformance now preserves manifest-declared asset metadata and
   rejects fetched bundles whose content type or integrity value contradicts the
-  manifest. The live report validator also rejects artifacts whose recorded
-  manifest asset metadata disagrees with the fetched asset metadata or whose
-  fetched JavaScript bundle digest is the empty SHA-256 digest.
+  manifest, whose integrity value does not include a SHA-256 token, or whose
+  integrity value does not match the fetched bytes. The live report validator
+  also rejects artifacts whose recorded manifest asset metadata disagrees with
+  the fetched asset metadata, whose integrity value lacks or does not match the
+  recorded SHA-256 digest, or whose fetched JavaScript bundle digest is the empty
+  SHA-256 digest.
 - Runtime live summaries include `runtime.remotePlugins`, keyed by plugin name,
   endpoint id, and module id. The validator requires this runtime identity list
   to match `sync.registeredModules` exactly, so count totals cannot stand in for
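
The view-asset integrity rules in the last hunk (an integrity value must include a SHA-256 token, must match the fetched bytes, and the bundle digest must not be the empty SHA-256 digest) can be illustrated with a small check. This is a sketch under assumptions, not the harness implementation: `checkAssetIntegrity` and its return strings are hypothetical, and the integrity value is assumed to follow the `sha256-<base64>` token form used by subresource integrity metadata.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of zero bytes, computed rather than hard-coded.
const EMPTY_SHA256_HEX = createHash("sha256").update("").digest("hex");

function sha256(bytes: Uint8Array): { hex: string; base64: string } {
  const digest = createHash("sha256").update(bytes).digest();
  return { hex: digest.toString("hex"), base64: digest.toString("base64") };
}

// Hypothetical check: returns null when the asset passes, else a reason.
function checkAssetIntegrity(integrity: string, bytes: Uint8Array): string | null {
  const { hex, base64 } = sha256(bytes);
  if (hex === EMPTY_SHA256_HEX) return "empty-content digest";

  // An integrity value may list several whitespace-separated tokens.
  const tokens = integrity
    .trim()
    .split(/\s+/)
    .filter((token) => token.startsWith("sha256-"));
  if (tokens.length === 0) return "no sha256 token";

  // Ignore any option suffix after "?" when comparing the digest part.
  const matches = tokens.some(
    (token) => token.slice("sha256-".length).split("?")[0] === base64,
  );
  return matches ? null : "integrity mismatch";
}
```

Because the comparison uses the bytes actually fetched, a manifest that declares a valid-looking token for a different bundle is still rejected.
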