feat(host-contracts): implement context-aware KMSVerifier by obatirou · Pull Request #2028 · zama-ai/fhevm

obatirou · 2026-02-25T16:56:22Z

Closes https://github.com/zama-ai/fhevm-internal/issues/1062

Add per-context signer sets and thresholds so that old KMS contexts remain verifiable after governance rotates signers via defineNewContext. Includes destroyKmsContext, v0/v1 extraData parsing, reinitializeV2 migration path, legacy view redirection, and IKMSVerifier interface updates.

mergify · 2026-02-25T17:34:42Z

🧪 CI Insights

Here's what we observed from your CI run for 8774604.

🟢 All jobs passed!

But CI Insights is watching 👀


Remove redundant tests and merge overlapping ones (55 → 47) without losing coverage. Removed tests were either strict subsets of other tests or exercised identical code paths.

obatirou · 2026-02-26T10:24:43Z

@claude /pr-review

claude · 2026-02-26T10:25:15Z

Claude finished @obatirou's task in 7m 8s

PR Review: Context-Aware KMSVerifier

Pre-review validation
Gather context (PR details, guidelines, linked issue)
Run parallel review agents (bug hunting, guidelines, error handling)
Validate findings
Post final review

Summary

[AUTOMATED]

This PR implements context-aware KMS verification, allowing per-context signer sets and thresholds so that old KMS contexts remain verifiable after governance rotates signers. Two independent bug-hunting passes (opus), an error handling audit, and a test coverage analysis were run in parallel.

No critical or high-severity bugs were found. The core security properties are correctly implemented:

✅ Context isolation (cross-context signer rejection via per-context mappings)
✅ EIP-712 signature binding (extraData is part of the signed digest, preventing context ID swapping)
✅ Transient storage cleanup on all non-reverting paths (reverts auto-rollback tstore)
✅ Upgrade safety (reinitializer(3) guard, intentional __EIP712_init omission in reinitializeV2)
✅ Storage layout (new fields appended at offsets +3 through +7, no collision with legacy slots)
✅ Access control on all state-mutating functions (onlyACLOwner)
✅ Assembly correctness (mload(add(extraData, 33)) correctly reads 32 bytes at content offset 1)
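To make the offset arithmetic in the assembly item concrete, here is a minimal Python model of how a Solidity `bytes` value sits in memory. It is a sketch, not contract code: `mload` and `bytes_in_memory` are hypothetical helpers written for this illustration, the context ID 7 is made up, and the toy memory starts at offset 0.

```python
def mload(memory: bytes, offset: int) -> int:
    """Read one 32-byte big-endian word (assumes the word lies fully in memory)."""
    return int.from_bytes(memory[offset:offset + 32], "big")


def bytes_in_memory(data: bytes) -> bytes:
    """Lay out a Solidity `bytes` value: 32-byte length word, then the content."""
    return len(data).to_bytes(32, "big") + data


context_id = 7  # hypothetical context ID, for illustration only
extra_data = b"\x01" + context_id.to_bytes(32, "big")  # version byte + 32-byte ID
memory = bytes_in_memory(extra_data)

# Treat the pointer to `extraData` as offset 0 in this toy memory.
assert mload(memory, 0) == 33            # the length word
assert mload(memory, 33) == context_id   # skips 32-byte length + 1-byte version
```

Offset 33 from the pointer is therefore content offset 1, which is where a v1 payload stores the context ID.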

🟡 Design Observations (Non-blocking)

1. setThreshold mutates a live context's threshold, breaking the context-immutability invariant (Medium, Confidence: 85)

KMSVerifier.sol L201-205

setThreshold modifies the threshold of the current context in-place without creating a new context. Unlike defineNewContext (which increments the counter and preserves old contexts), setThreshold retroactively changes the rules. If the threshold is raised, any in-flight decryption proof that met the old threshold but carries fewer signatures than the new one will be rejected, whether it is routed via v0 or v1 extraData. This breaks the immutability invariant that the rest of the context-aware design carefully preserves.

Consider whether setThreshold should instead call defineNewContext with the existing signers and the new threshold, preserving the old threshold for in-flight proofs.
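The difference between the two approaches can be sketched with a toy Python model. This is an assumption-laden illustration, not the KMSVerifier implementation: `ContextRegistry`, its method names, and the signer labels are all hypothetical, and signature recovery is replaced by plain sets of signer names.

```python
from dataclasses import dataclass


@dataclass
class Context:
    signers: frozenset
    threshold: int


class ContextRegistry:
    """Toy model of per-context signer sets; not the contract code."""

    def __init__(self, signers, threshold):
        self.contexts = {}
        self.current_id = 0
        self.define_new_context(signers, threshold)

    def define_new_context(self, signers, threshold):
        # Rotation: allocate a fresh ID and leave older contexts untouched.
        self.current_id += 1
        self.contexts[self.current_id] = Context(frozenset(signers), threshold)
        return self.current_id

    def set_threshold_in_place(self, threshold):
        # Models the behaviour described above: mutate the live context.
        self.contexts[self.current_id].threshold = threshold

    def set_threshold_via_new_context(self, threshold):
        # Models the suggested alternative: same signers, new context ID.
        signers = self.contexts[self.current_id].signers
        return self.define_new_context(signers, threshold)

    def is_quorum(self, context_id, recovered_signers):
        ctx = self.contexts[context_id]
        valid = set(recovered_signers) & ctx.signers
        return len(valid) >= ctx.threshold


in_flight_proof = {"a", "b"}  # two signatures, enough for threshold 2

mutated = ContextRegistry({"a", "b", "c"}, 2)
mutated.set_threshold_in_place(3)
assert not mutated.is_quorum(1, in_flight_proof)   # old proof now rejected

rotated = ContextRegistry({"a", "b", "c"}, 2)
new_id = rotated.set_threshold_via_new_context(3)
assert rotated.is_quorum(1, in_flight_proof)       # old context still verifies
assert not rotated.is_quorum(new_id, in_flight_proof)
```

In the rotated variant, only proofs that name the old context ID explicitly keep verifying; anything resolved against the current context is held to the new threshold.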

2. setThreshold emits NewContextSet — potentially misleading for indexers (Low, Confidence: 90)

KMSVerifier.sol L204

setThreshold emits NewContextSet when only the threshold changes (no new context ID is created). Downstream indexers keying on NewContextSet may interpret this as a context rotation. Consider a dedicated ThresholdUpdated event or documenting this clearly.

3. Inconsistent return-vs-revert in signature verification (Low, Informational, Confidence: 95)

KMSVerifier.sol L497-537

_verifySignaturesDigestForContext reverts for an invalid context, zero signatures, a below-threshold signature count, and an invalid signer, but returns false if duplicates reduce the unique signer count below the threshold. This asymmetry is by design (it matches existing behavior, as confirmed by test_VerifyDecryptionEIP712KMSSignaturesFailAsExpectedIfSameSignerIsUsedTwice), but the NatSpec should document it explicitly for integrators.
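For integrators, the practical consequence is that a caller has to handle both a revert and a `false` return. A rough Python model of that control flow follows; it is a sketch under stated assumptions: `Revert`, `verify_for_context`, and the reason strings are hypothetical, signatures are replaced by already-recovered signer names, and the check order in the real contract may differ.

```python
class Revert(Exception):
    """Stands in for a Solidity revert; the reason strings are placeholders."""


def verify_for_context(contexts, context_id, recovered_signers):
    """Model of the revert-vs-return split described above (not contract code)."""
    ctx = contexts.get(context_id)
    if ctx is None:
        raise Revert("invalid context")
    signers, threshold = ctx
    if len(recovered_signers) == 0:
        raise Revert("zero signatures")
    if len(recovered_signers) < threshold:
        raise Revert("signature count below threshold")
    unique = set()
    for signer in recovered_signers:
        if signer not in signers:
            raise Revert("invalid signer")
        unique.add(signer)
    # Duplicates are the one failure mode reported as False, not as a revert.
    return len(unique) >= threshold


contexts = {1: (frozenset({"a", "b", "c"}), 2)}  # hypothetical context table

assert verify_for_context(contexts, 1, ["a", "b"]) is True
assert verify_for_context(contexts, 1, ["a", "a"]) is False  # duplicate signer
```

A caller that only wraps the call in a try/catch, or only checks the boolean, would miss one of the two failure channels.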

4. v0 extraData TOCTOU in batched transactions (Low, Informational, Confidence: 85)

KMSVerifier.sol L464-470

v0 extraData (0x00 or empty) resolves to currentKmsContextId at call time. In batched/multicall scenarios, if defineNewContext is called earlier in the same transaction, subsequent v0-routed verifications would resolve to the new context. Using v1 extraData with explicit context IDs eliminates this. Not a bug per se, but worth documenting for integrators.
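The effect can be illustrated with a small Python model of the version dispatch. It is a minimal sketch, assuming only what this review states about v0 and v1; `resolve_context_id` and the error messages are hypothetical, and `current` stands in for `currentKmsContextId`.

```python
def resolve_context_id(extra_data: bytes, current_context_id: int) -> int:
    """Map extraData to a context ID (illustrative, not the contract code)."""
    if len(extra_data) == 0 or extra_data[0] == 0x00:
        # v0: no explicit ID, so the result depends on state at call time.
        return current_context_id
    if extra_data[0] == 0x01:
        if len(extra_data) < 33:
            raise ValueError("malformed v1 extraData")
        # v1: the ID is carried in the payload; trailing bytes are ignored.
        return int.from_bytes(extra_data[1:33], "big")
    raise ValueError("unsupported extraData version")


v0 = b"\x00"
v1_pinned_to_1 = b"\x01" + (1).to_bytes(32, "big")

current = 1
before = (resolve_context_id(v0, current), resolve_context_id(v1_pinned_to_1, current))

current = 2  # models defineNewContext running earlier in the same transaction
after = (resolve_context_id(v0, current), resolve_context_id(v1_pinned_to_1, current))

assert before == (1, 1)
assert after == (2, 1)  # v0 followed the rotation, v1 stayed pinned
```

The same v0 bytes resolve to different contexts depending on when they are evaluated, while the v1 payload keeps pointing at context 1.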

🧪 Test Coverage Gaps (Recommended additions)

The test suite is thorough with 40+ tests covering context management, verification, migration, and edge cases. However, three high-priority gaps were identified:

1. v1 extraData with trailing bytes not tested (High priority)

The contract explicitly documents forward-compatibility at L474-475: "Trailing bytes after byte 33 are ignored for forward-compatibility." There's test_V0ExtraDataWithTrailingBytesUsesCurrentContext but no v1 equivalent. A test with abi.encodePacked(uint8(0x01), contextId, uint256(12345)) would exercise this.
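As a rough picture of what such a test would feed in, the packed payload can be reproduced in Python. This is a sketch only: `encode_packed_v1` and `read_v1_context_id` are hypothetical stand-ins for `abi.encodePacked` and the contract's parsing, and the context ID 2 is made up.

```python
def encode_packed_v1(context_id: int, trailing: bytes = b"") -> bytes:
    """Python stand-in for abi.encodePacked(uint8(0x01), contextId, ...)."""
    return bytes([0x01]) + context_id.to_bytes(32, "big") + trailing


def read_v1_context_id(extra_data: bytes) -> int:
    """Read content bytes 1..32 as the ID, ignoring anything after byte 33."""
    if len(extra_data) < 33 or extra_data[0] != 0x01:
        raise ValueError("not a well-formed v1 extraData")
    return int.from_bytes(extra_data[1:33], "big")


context_id = 2  # hypothetical ID of an existing context
trailing = (12345).to_bytes(32, "big")  # uint256(12345), as in the suggested test

plain = encode_packed_v1(context_id)
padded = encode_packed_v1(context_id, trailing)

assert len(plain) == 33 and len(padded) == 65
assert read_v1_context_id(plain) == read_v1_context_id(padded) == context_id
```

The Solidity test would make the equivalent assertion: verification with the 65-byte payload behaves exactly like verification with the 33-byte one.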


2. getContextSignersAndThresholdFromExtraData has zero direct tests (High priority)

This is the new external function called by FHE.sol. It is never called directly in the test file — only indirectly through verification flow. Specifically untested: revert on destroyed contexts (L344-345), revert on non-existent contexts, v0 path, v1 path, malformed v1, unsupported version.


3. reinitializeV2 after initializeFromEmptyProxy not tested (High priority)

Both use reinitializer(3). There's test_ReinitializeV2CannotBeCalledTwice for double-reinitializeV2, but no test verifying that reinitializeV2() reverts on a contract already initialized via initializeFromEmptyProxy().
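Why the second call is expected to revert can be seen from a simplified model of a versioned initializer guard. This is a sketch, assuming the guard rejects any version that is not strictly greater than the stored one; `UpgradeableModel` and its method names are hypothetical, and details of the real modifier (such as the in-progress initializing flag) are omitted.

```python
class InvalidInitialization(Exception):
    """Stands in for the custom error raised by the initializer guard."""


class UpgradeableModel:
    """Toy contract with two entry points sharing reinitializer version 3."""

    def __init__(self):
        self.initialized_version = 0

    def _reinitializer(self, version: int) -> None:
        # Simplified guard: the stored version may only move strictly forward.
        if self.initialized_version >= version:
            raise InvalidInitialization()
        self.initialized_version = version

    def initialize_from_empty_proxy(self) -> None:
        self._reinitializer(3)

    def reinitialize_v2(self) -> None:
        self._reinitializer(3)


fresh = UpgradeableModel()
fresh.initialize_from_empty_proxy()
try:
    fresh.reinitialize_v2()
    reverted = False
except InvalidInitialization:
    reverted = True

assert reverted  # the migration path is closed on a freshly initialized proxy
```

The missing Solidity test would assert the same thing against the real proxy: deploy, call initializeFromEmptyProxy(), then expect reinitializeV2() to revert.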


Medium priority gaps:

reinitializeV2 ownership test (non-owner should be rejected)
initializeFromEmptyProxy with duplicate/null signers or invalid thresholds
Threshold == signers count edge case (requires all signatures)
Multiple context transitions (5+ contexts) stress test

✅ Verified Correct

Concern	Result
`_extractKmsContextId` assembly (`mload(add(extraData, 33))`)	✅ Correct — reads 32 bytes at memory offset 33 (after 32-byte length + 1-byte version)
`reinitializeV2` omitting `__EIP712_init`	✅ Intentional — V2 already initialized EIP-712 domain; annotation confirms
Context ID overflow	✅ Not a concern — Solidity 0.8+ reverts on overflow; ~10⁷⁷ rotations needed
ERC-7201 storage layout	✅ No collisions — new fields at offsets +3 through +7
FHE.sol ↔ KMSVerifier consistency	✅ Same EIP-712 digest; single `getContextSignersAndThresholdFromExtraData` call replaces two separate calls
Transient storage reentrancy	✅ Safe — cleanup on all non-revert paths; EVM revert rolls back tstore
`_setContextSigners` append-only enforcement	✅ Enforced implicitly — second call for same context would revert with `KMSAlreadySigner`
`destroyKmsContext` storage cleanup	✅ Not needed — context IDs are monotonic, never reused
Signature malleability	✅ Handled by OZ ECDSA.recover (KMSVerifier) and FhevmECDSA (FHE.sol)
`calldata` → `memory` conversion in `getContextSignersAndThresholdFromExtraData`	✅ Solidity handles automatically

Verdict

The contract implementation is solid and well-tested. The core security properties (context isolation, signature binding, upgrade safety) are correctly implemented. The design observation about setThreshold mutating live contexts is worth discussing but is a design choice, not a bug. The three high-priority test gaps would strengthen confidence and are recommended before merge.

LGTM 🟢 — with the suggested test coverage improvements.

… external API surface

- Add v1 extraData trailing bytes test (symmetric with existing v0 test)
- Add getContextSignersAndThresholdFromExtraData happy path covering v0, v1, and v1-against-old-context return values
- Add destroyed context revert test for the same function

host-contracts/contracts/KMSVerifier.sol

Rename the revert in _extractKmsContextId to DeserializingExtraDataFail so callers can distinguish a malformed extraData payload from a truncated decryption proof envelope (DeserializingDecryptionProofFail).

host-contracts/contracts/KMSVerifier.sol

…MSVerifier

…fier tests

Factor out the repeated context rotation setup (upgrade with 3 signers, then define a second context with signer3) into a shared helper used by 7 tests.

…y helper in KMSVerifier tests

Factor out the repeated fresh-proxy deploy+upgrade block used by all 4 reinitializeV2 tests into a shared helper.

…MSVerifier tests

isaacdecoded

LGTM 👍

obatirou · 2026-03-10T10:15:56Z

For posterity: isValidKmsContext was removed because the Connector will consider a new context activated as soon as it sees the request, rather than relying on a KMSVerifier call for verification.

* feat(host-contracts): implement context-aware KMSVerifier (#2028)
* feat(kms-connector): context-aware extraData handling for decryption (#2032)
* chore(kms-connector): rename fhe module to handle
* chore(kms-connector): add and use helper function
* chore(kms-connector): add kms_context table
* chore(kms-connector): prepare ethereum listener
* feat(kms-connector): kms context validation
* chore(kms-connector): kms context tests
* chore(kms-connector): ethereum listener termination
* feat(gateway-contracts): implement context-aware KMS node configs and decryption
* feat: implement context-aware KMS node configs and decryption
* chore(gateway-contracts): apply a few arguments renaming
* fix(gateway-contracts): refresh rust bindings
* chore(gateway-contracts): reuse setter methods and adjust NatSpecs
* chore(gateway-contracts): refresh rust bindings
* refactor: apply suggested naming
* refactor(gateway-contracts): apply suggested renaming
* refactor: revert updateKmsContext naming
* refactor(gateway-contracts): enable decryption upgrade workflow
* chore(gateway-contracts): refresh bindings
* chore(test-suite): introduce getExtraData() method from SDK
* chore(test-suite): restore missed user decrypt ebool test case
* feat(kms-connector): propagate empty extra_data for 0x00
* feat(kms-connector): propagate empty extra_data for 0x00
* chore(kms-connector): add TODO comment for the workaround and upgrade quinn-proto
* chore(kms-connector): add TODO comment for the workaround and upgrade quinn-proto
* chore(kms-connector): use dedicated core config for tests

---------

Co-authored-by: Simon Eudeline <simon.eudeline@zama.ai>

* chore(test-suite): upgrade relayer-sdk version
* chore(test-suite): upgrade test-suite version in fhevm-cli

---------

Co-authored-by: Oba <obatirou@gmail.com>
Co-authored-by: Simon E. <simon.eudeline@zama.ai>

* feat(host-contracts): implement context-aware KMSVerifier (#2028)
* feat(kms-connector): context-aware extraData handling for decryption (#2032)
* chore(kms-connector): rename fhe module to handle
* chore(kms-connector): add and use helper function
* chore(kms-connector): add kms_context table
* chore(kms-connector): prepare ethereum listener
* feat(kms-connector): kms context validation
* chore(kms-connector): kms context tests
* chore(kms-connector): ethereum listener termination
* feat(gateway-contracts): implement context-aware KMS node configs and decryption
* feat: implement context-aware KMS node configs and decryption
* chore(gateway-contracts): apply a few arguments renaming
* fix(gateway-contracts): refresh rust bindings
* chore(gateway-contracts): reuse setter methods and adjust NatSpecs
* chore(gateway-contracts): refresh rust bindings
* refactor: apply suggested naming
* refactor(gateway-contracts): apply suggested renaming
* refactor: revert updateKmsContext naming
* refactor(gateway-contracts): enable decryption upgrade workflow
* fix(host-contracts): suppress NewContextSet event on init/reinit (#2040)

fix(host-contracts): suppress NewContextSet event during init/reinit

Extract _defineContext internal helper so init and reinit paths set context state without emitting NewContextSet, preventing spurious events that cause context/epoch ID drift in KMS core.

* chore(kms-connector): helm chart update (#2097) * chore(coprocessor): remove legacy tfhe-worker gRPC path (#1982) * chore(coprocessor): remove legacy tfhe-worker grpc path * fix(tfhe-worker): resolve clippy dead_code in bench/test utils * refactor(tfhe-worker): remove unused computation module * test(tfhe-worker): cap event operator coverage at uint64 * fix(coprocessor): address review noise and typos * chore(tfhe-worker): reduce bench fmt churn in dex migration * chore(tfhe-worker): revert formatting-only bench_id wraps * chore(tfhe-worker): remove remaining bench format-only churn * bench(tfhe-worker): restore dex workload parity with legacy grpc * test(tfhe-worker): restore non-ignored coverage after grpc removal * test(tfhe-worker): deduplicate operator event coverage * test(tfhe-worker): harden event test stability * test(tfhe-worker): run full event type matrix in CI * test(tfhe-worker): default full event matrix with mode logging * test(tfhe-worker): simplify event matrix selection * docs(tfhe-worker): document event test matrix modes * test(tfhe-worker): expand random event tests across types * test(tfhe-worker): restore random type matrix parity * test(tfhe-worker): use query! 
in invalid operation event test * fix(bench): stabilize benchmark pipeline after grpc refactor * fix(bench): allow dex setup trivial encrypt handles * charts: bump coprocessor chart version * tfhe-worker: propagate gpu feature to test-harness * test(tfhe-worker): allow dependent schedule setup handle * test(tfhe-worker): fix event test matrix CI regressions * refactor(tfhe-worker): deduplicate test helpers and remove dead code - Migrate operators_from_events.rs to use shared event_helpers (setup_event_harness, next_handle, to_ty, tfhe_event, log_with_tx) - Remove duplicate test_invalid_operation_marks_error (kept in errors.rs) - Move wait_for_error to event_helpers for shared use - Extract TEST_CHAIN_ID const, remove debug eprintln calls - Remove 16 dead CoprocessorError variants from types.rs * refactor(tfhe-worker): destructure EventHarness to reduce PR diff Destructure setup_event_harness() return into {app, pool, listener_db} so variable names match the original code, minimising the review diff. * chore(tfhe-worker): remove dead deps and batch event test waits Remove 6 Cargo dependencies that were only used by the deleted gRPC server (sha3, lru, rayon, tfhe-zk-pok, regex, actix-web). Restructure 4 event tests (unary, cast, if-then-else, rand) to use batch-then-wait pattern: insert all events first, call wait_until_all_allowed_handles_computed once, then verify. This eliminates ~200 redundant waits in CI, saving ~10 minutes of sleep. Also remove unnecessary pub(super) from test_fhe_rand_events. 
* refactor(tfhe-worker): address PR review feedback - Upgrade as_scalar_uint to accept &BigInt directly - Deduplicate helpers in operators_from_events.rs (delete insert_tfhe_event, allow_handle, as_scalar_uint copies; use event_helpers versions) - Delete redundant test_fhe_rand_events (subset of random.rs tests) - Expand test_op_trivial_encrypt to cover all supported types with edge-case values - Add 5 error test scenarios: circular dependency, too many inputs, scalar division by zero, binary boolean inputs, unary boolean inputs * fix(tfhe-worker): replace validation-time error tests with execution-time ones Remove 3 error tests (circular dependency, too many inputs, scalar div by zero) that trigger validation-time errors in check_fhe_operand_types. These errors propagate via ? without being persisted to the DB, causing an infinite retry loop in event-driven mode. Replace with test_type_mismatch_error (FheAdd on uint8 + uint16) which passes validation but properly fails at execution time with UnsupportedFheTypes. The validation-path error propagation is tracked as a separate issue. * docs: update FHE computation diagram to reflect event-driven architecture Replace the obsolete AsyncCompute gRPC flow with the current host-listener event-driven architecture in the sequence diagram. * fix(tfhe-worker): fix GPU test failures in error and random bounded tests test_coprocessor_computation_errors: Replace Cast-to-type-255 with FheSub on mismatched types (uint32 + uint64). The old test panicked on the GPU path during memory reservation in trivial_encrypt_be_bytes, preventing the error from being persisted to the DB. Type-mismatch errors return a proper Result::Err on both CPU and GPU. test_fhe_random_bounded: Use per-type bounds from the old gRPC test instead of upper_bound=1. The 0-random-bits edge case (bound=1) behaves differently on GPU vs CPU. 
Also check bool results as true/false rather than assuming a specific numeric value, since CPU and GPU produce different deterministic outputs for the same seed. * docs(tfhe-worker): fix stale README heading after gRPC removal The server was removed; only the background worker remains. * test(coprocessor): strengthen error and random bounded test assertions (#2029) Error tests now assert the specific error message instead of only checking is_error == true. The bounded random test now generates two samples per type with different seeds and asserts they differ, catching any constant-output RNG implementation including always-zero. Closes zama-ai/fhevm-internal#1077 * fix(coprocessor): force compress/decompress for all ciphertexts (#2036) * fix(coprocessor): decompress all ciphertexts per operation * fix(coprocessor): sanity-check that only scalars are uncompressed * fix(coprocessor): add compressed ct type * fix(coprocessor): propagate DecompressionError * fix(host-contracts): add domain separator and prev block hash to handle hashing (#2014) * fix(host-contracts): add missing domain separator when hashing to construct handles * fix(host-contracts): update rust bindings * feat(host-contracts): add previous block hash to the hashes used to generate computed handles * ci(host-contracts, gateway-contracts): auto-detect contract upgrades (#2037) * ci(host-contracts): add cross-version upgrade test workflow (#1097) Add CI that deploys host-contracts from v0.11.0 via Docker, then upgrades each contract (ACL, FHEVMExecutor, KMSVerifier, InputVerifier, HCULimit) to the current branch using hardhat upgrade tasks. Unlike the gateway-contracts equivalent, all upgrade steps are enabled. * fix(ci): correct misleading CHAIN_ID_GATEWAY comment * ci(host-contracts): skip upgrades for contracts with unchanged reinitializer Only FHEVMExecutor (2→3) and HCULimit (2→3) actually bumped their REINITIALIZER_VERSION between v0.11.0 and current. 
ACL (3→3), KMSVerifier (2→2), and InputVerifier (3→3) are unchanged, so their reinitializeV2 would revert with InvalidInitialization. * ci: add actionlint ignore for host-contracts-upgrade-tests Same constant-condition exemption as gateway-contracts-upgrade-tests, needed for the `if: false` on skipped upgrade steps. * ci(host-contracts): auto-detect contract upgrades via REINITIALIZER_VERSION Replace 5 hardcoded upgrade steps (with manual `if: false` guards) with a single loop that compares REINITIALIZER_VERSION between the previous release and current code. Only contracts whose version actually changed are upgraded. - Add upgrade-manifest.json as single source of truth for upgradeable contracts - Extract PREVIOUS_RELEASE_TAG to env var (one place to bump per release) - Remove actionlint exemption (no more constant `if: false` conditions) Refs: zama-ai/fhevm-internal#379 * ci(gateway-contracts): auto-detect contract upgrades via REINITIALIZER_VERSION Apply the same auto-detection pattern from host-contracts to gateway: replace 7 hardcoded upgrade steps (all with `if: false`) with a single loop that compares REINITIALIZER_VERSION between the previous release and current code. - Add upgrade-manifest.json listing all 7 upgradeable gateway contracts - Extract PREVIOUS_RELEASE_TAG to env var (one place to bump per release) - Remove last actionlint exemption for constant `if: false` conditions Closes: zama-ai/fhevm-internal#379 * fix(ci): skip upgrade for contracts not present in previous release A contract that's new (not in the previous release tag) has no previous deployment to upgrade from. Without this guard, the loop would attempt an upgrade because the missing file defaults to version 0, which differs from the current version. Verified against the v0.9.8 → v0.10.0 gateway cycle: ProtocolPayment is correctly skipped (didn't exist in v0.9.8), matching the original manual `if: false` behavior. 
* ci: align deployment check steps between host and gateway workflows * ci: rename PREVIOUS_RELEASE_TAG to UPGRADE_FROM_TAG, bump gateway to v0.11.0 * ci: add inline comments to upgrade loop for readability * ci: verify contract versions with cast call after upgrades * ci: assert getVersion() matches expected version from source constants * ci: replace associative arrays with indirect expansion for bash 3 compat * ci: strip quotes from cast call output before version comparison * ci: suppress shellcheck SC2034 for indirect expansion address vars * fix(ci): add shellcheck disable SC2034 to each indirect-expansion variable The directive only suppresses the next line, not a block. * fix(ci): address PR review feedback - Remove ProtocolPayment from gateway upgrade manifest (no hardhat upgrade task exists for it yet) - Add existence check for contracts/${name}.sol in current code to fail fast if manifest is out of sync - Fix misleading CHAIN_ID_GATEWAY comment in host workflow * fix(ci): fail-fast on missing REINITIALIZER_VERSION, skip verify for new contracts - Replace silent :-0 fallback with explicit error when REINITIALIZER_VERSION cannot be parsed from an existing .sol file - Skip version verification for contracts with no deployment address (new contracts not present in previous release) * fix(ci): fail-hard when address mapping is missing for existing contracts Only skip version verification for genuinely new contracts (not in previous release). If a contract existed in the previous release but has no address variable mapped, fail with a clear error instead of silently skipping. * ci: trigger fresh workflow run * fix(ci): address PR review nits — consistent PascalCase naming, add comment on UPGRADE_FROM_TAG * fix(ci): clean Hardhat cache and OZ manifest between sequential upgrades The upgrade loop runs each contract upgrade as a separate npx hardhat process. 
The .openzeppelin manifest and Hardhat cache/artifacts persist on disk between invocations, causing flaky failures where the OZ plugin reuses stale bytecode-hash entries or Hardhat resolves wrong artifacts when previous-contracts/ and contracts/ share the same contract name. Add `npx hardhat clean` and `rm -rf .openzeppelin` before each upgrade to ensure a clean slate for compilation and deployment deduplication. * fix: wait for upgradeToAndCall tx receipt before declaring upgrade success The OZ hardhat-upgrades plugin's upgradeProxy() does NOT call .wait() on the upgradeToAndCall transaction — it returns as soon as the tx is submitted to the node. With Anvil's interval mining (--block-time 0.5), the tx may not be mined when the plugin returns, and if it reverts during mining, the revert goes completely undetected. This caused flaky CI failures where the upgrade task printed "Proxy contract successfully upgraded!" but getVersion() still returned the old version — the upgradeToAndCall tx had silently reverted. Fix: explicitly .wait() on the upgrade transaction, check the receipt status, and read the EIP-1967 implementation slot to confirm the upgrade took effect on-chain. * Revert "fix: wait for upgradeToAndCall tx receipt before declaring upgrade success" This reverts commit 847266f. * fix(ci): force-mine and verify on-chain state after each contract upgrade The OZ hardhat-upgrades plugin does NOT call .wait() on the upgradeToAndCall transaction — it returns as soon as the tx is submitted. With Anvil's interval mining (--block-time 0.5), the tx may still be pending when the plugin returns, and if it reverts during mining the revert goes undetected. After each upgrade task: 1. Force-mine a block via `cast rpc evm_mine` to flush pending txs 2. Immediately verify getVersion() returns the expected value 3. 
Fail fast with a clear diagnostic if the upgrade was silently dropped This catches the silent revert at the point of failure rather than later in the separate verify step, making the error message actionable. * fix(ci): simplify upgrade step — remove redundant per-upgrade verification The inline per-upgrade version check duplicated the existing "Verify contract versions" step. Keep only the `cast rpc evm_mine` workaround (OZ upgradeProxy does not wait for the upgradeToAndCall tx to be mined) and let the dedicated verification step handle on-chain assertions. * fix(ci): remove unnecessary hardhat clean/OZ manifest wipe between upgrades The per-iteration `npx hardhat clean` + `rm -rf .openzeppelin` was a speculative fix from before the real root cause was identified (missing evm_mine). Each contract has unique bytecode (no OZ manifest hash collision) and tasks already use fully qualified artifact names, so there is nothing to clean between iterations. * feat(coprocessor): make multi-chain DB migration backwards compatible (#2043) * feat(coprocessor): make multi-chain DB migration backwards compatible Make sure both new and old versions can work with the same DB, should we want to revert the new one to the old one. * feat(coprocessor): add defaults needed for old code * fix(coprocessor): fix zkproof-worker test DB field name * fix(coprocessor): fix bad stress-test-generator DB field name * ci(common): sandbox Claude Code behind Squid proxy + iptables (#2063) * ci(common): sandbox Claude Code behind Squid proxy + iptables Run the claude-code-action inside a network sandbox to prevent data exfiltration to unauthorized hosts. Two layers of defense: - Squid proxy: L7 domain allowlist (.anthropic.com, .github.com, etc.) - iptables: blocks direct outbound TCP from the runner UID All dependencies (Bun, action node_modules, Claude Code CLI, OIDC token exchange) are pre-installed before lockdown because the action's internal installers use fetch() which ignores HTTP_PROXY. 
Also switches from --allowedTools to --dangerously-skip-permissions since the network sandbox handles security at the infrastructure level. update claude file with proper container setup fix: shellchecks fix zizmor warning ci(claude): rewrite workflow from template, address PR #1995 security review - Drop action wrapper, run claude CLI directly (avoids MCP stdin blocking) - Remove dead pull_request trigger - Separate GH_TOKEN from system prompt construction step - Tighten iptables: resolve Squid IP dynamically, block UDP/ICMP - Restrict squid allowlist to 3 domains (api.anthropic.com, platform.claude.com, github.com) - Cache Squid Docker image, add iptables save/restore cleanup - Add tracking comment for run visibility - Fix token revocation to use HTTPS_PROXY fix: replace A && B || C with proper if-then-else (SC2015) fix: capture error details instead of silent suppression OIDC exchange and token revocation now log the server response on failure instead of swallowing it with -sf/--silent/2>/dev/null. fix: shellcheck SC2001 and SC2015 in claude workflow Replace sed prompt extraction with parameter expansion (SC2001). chore: harden security practices chore: update claude action from secutiry * chore: rename claude.yml to claude-review.yml * chore: enforces changes in sandboxed claude-* workflow --------- Co-authored-by: Roger Carhuatocto <chilcano@intix.info> * Revert "ci(common): sandbox Claude Code behind Squid proxy + iptables" (#2080) Revert "ci(common): sandbox Claude Code behind Squid proxy + iptables (#2063)" This reverts commit 9587546. 
* fix(coprocessor): remove tx-sender dependency on hostchain for multichain (#1826) * fix(coprocessor): add block finalization in HL, remove hostchain from tx-sender * fix(coprocessor): review fix * fix(coprocessor): review fix * fix(coprocessor): e2e tests * test(corpocessor): debug e2e * fix(coprocessor): e2e tests * test(test-suite): add e2e block cap tests for HCU metering (#2081) * test(test-suite): add e2e block cap tests for HCU metering (#1099) Add 5 block-cap scenarios to the E2E test suite exercising HCULimit through real EncryptedERC20 FHE operations on the deployed stack: multi-user accumulation, cap exhaustion, block rollover, whitelist removal, and non-owner rejection. Wire into CI via `fhevm-cli test hcu-block-cap` and a new workflow step. * fix(test-suite): address review feedback on HCU block cap tests - Rework block rollover test to actually block a caller in block N, then verify that same caller succeeds after rollover in block N+1 - Add missing DEPLOYER_PRIVATE_KEY to .env.example * fix(test-suite): fix HCU block cap tests for real stack - Accumulation test: use greaterThan instead of exact equality (block meter vs receipt HCU have a small discrepancy on real infra) - Cap exhaustion + rollover tests: pass explicit gasLimit to bypass estimateGas, which reverts against pending state when cap is filled * fix(test-suite): tighten accumulation assertion with 2% tolerance Replace loose greaterThan check with near-sum assertion allowing ~2% drift between receipt-reported HCU and on-chain block meter. * fix(test-suite): replace HCU tolerance with self-consistent accumulation assertion The receipt parser reconstructs HCU from the @fhevm/solidity npm price table while the block meter uses the deployed contract's hardcoded prices. A version skew between the two causes a small discrepancy. Instead of cross-comparing with tolerance, assert the block meter exceeds each individual tx's HCU — proving accumulation without depending on price table parity. 
* fix(test-suite): use revertedWithCustomError for non-owner assertion Add NotHostOwner error to HCU_LIMIT_ABI and assert the specific custom error instead of generic revert. * refactor(test-suite): simplify HCU block cap test structure - Scope save/restore of HCU limits to only the 2 tests that lower them (nested describe with its own beforeEach/afterEach) - Extract mintAndDistribute helper for repeated mint+transfer preamble - Remove blanket whitelist cleanup from afterEach (test cleans up itself) - Parallelize 3 sequential view calls with Promise.all * refactor(test-suite): simplify accumulation test to use block meter only Replace receipt-based HCU comparison with three block meter readings: 1. Single-tx block → baseline meter 2. Two-tx block → meter exceeds baseline (proves accumulation) 3. Single-tx block → meter resets and matches baseline No cross-comparison of price tables, no getTxHCUFromTxReceipt needed. * fix(test-suite): assert meter starts at 0 before first operation * refactor(test-suite): tighten accumulation assertions - Assert meter2 == 2 * meter1 (exact, same ops in both txs) - Remove unnecessary mineNBlocks between blocks (meter resets automatically in each new block) * ci: temporarily skip all tests except HCU block cap DO NOT MERGE — revert before merge. Added `if: false` to all test steps except HCU block cap to validate in isolation. * fix(ci): build test-suite from source to include new HCU tests The CI was pulling the pre-built test-suite Docker image (v0.11.0-1) which doesn't contain the new block cap scenarios tests. Use --build so the image is built from the current checkout. 
* fix(test): fix NotHostOwner ABI signature and relax accumulation assertion - NotHostOwner takes an address parameter: error NotHostOwner(address) - Relax meter2 == meter1*2 to meter2 > meter1 since alice→bob and bob→alice can differ slightly in HCU due to balance init paths * fix(test): relax meter3 assertion — same op can differ by ~18 HCU The same alice→bob transfer produces slightly different HCU across runs due to balance state changes from intermediate transfers. Assert reset behavior (meter3 > 0 and meter3 < meter2) instead of exact equality with meter1. * fix(test): disable Anvil interval mining when batching txs in one block Anvil runs with --block-time 1, so blocks keep getting mined even with evm_setAutomine(false). Use evm_setIntervalMining(0) to fully pause block production, then restore both after mining. * refactor(test): centralize interval mining control in beforeEach/afterEach Disable interval mining once in beforeEach (deterministic blocks), restore in afterEach. Tests only toggle automine for batching. * revert: restore per-test interval mining control (beforeEach hangs) Disabling interval mining in beforeEach hangs because Anvil's evm_setIntervalMining(0) overrides automine. Revert to the per-test pattern (disable interval+automine before batching, restore after) which passed in CI run 22733231829. * revert(ci): remove temporary test filters and --build flag Restore workflow to match main, keeping only the new HCU block cap test step addition. 
* test(test-suite): always restore HCU state after block cap tests * fix(test-suite): restore HCU whitelist state safely * fix(test-suite): stabilize HCU meter assertions * fix(test-suite): harden HCU e2e tests and add build dispatch * ci(test-suite): avoid expression expansion in deploy step * fix(test-suite): stabilize HCU whitelist removal test * test(test-suite): instrument HCU whitelist tx waits * fix(test-suite): use manual mining for HCU whitelist removal test The automine=true + intervalMining=0 combo is unreliable in CI — Anvil hangs for ~5min before mining the mint tx, causing Mocha timeout. Switch to automine=false + explicit evm_mine after each tx, matching the proven pattern used by the "with lowered limits" tests that pass consistently. Also add gasLimit overrides to bypass estimateGas against pending state. * feat(test-suite): add --resume/--only to fhevm-cli and optimize CI deploy Forward --resume STEP and --only STEP flags from fhevm-cli to the underlying deploy-fhevm-stack.sh script, with step validation and mutual exclusivity check. Use --only test-suite in CI when deploy-build is set, so only the test-suite image is rebuilt from the branch instead of the entire stack. * fix(test-suite): remove --remove-orphans from selective cleanup cleanup_single_step and cleanup_from_step used --remove-orphans with a single compose file, causing Docker Compose to tear down every container in the project not defined in that file. This destroyed the entire stack when running e.g. --only test-suite. * fix(ci): revert --only test-suite optimization in deploy step The --only test-suite approach rebuilds only the test container but uses pre-built host-sc images that lack the HCULimit contract. The HCU block cap tests need host-sc built from the branch, so we must use the full --build deploy for now. The --resume/--only CLI flags and the --remove-orphans fix in the deploy script are kept — they're useful for local development and future CI optimizations. 
* chore(test-suite): revert unrelated fhevm-cli and deploy script changes Keep the PR scoped to the HCU whitelist test fix and the deploy-build workflow input. The --resume/--only CLI flags and --remove-orphans fix can be submitted in a separate PR. * fix(test-suite): re-add hcu-block-cap test type to fhevm-cli * ci(common): sandbox Claude Code behind Squid proxy + iptables (#2083) * ci(common): sandbox Claude Code behind Squid proxy + iptables Run the claude-code-action inside a network sandbox to prevent data exfiltration to unauthorized hosts. Two layers of defense: - Squid proxy: L7 domain allowlist (.anthropic.com, .github.com, etc.) - iptables: blocks direct outbound TCP from the runner UID All dependencies (Bun, action node_modules, Claude Code CLI, OIDC token exchange) are pre-installed before lockdown because the action's internal installers use fetch() which ignores HTTP_PROXY. Also switches from --allowedTools to --dangerously-skip-permissions since the network sandbox handles security at the infrastructure level. update claude file with proper container setup fix: shellchecks fix zizmor warning ci(claude): rewrite workflow from template, address PR #1995 security review - Drop action wrapper, run claude CLI directly (avoids MCP stdin blocking) - Remove dead pull_request trigger - Separate GH_TOKEN from system prompt construction step - Tighten iptables: resolve Squid IP dynamically, block UDP/ICMP - Restrict squid allowlist to 3 domains (api.anthropic.com, platform.claude.com, github.com) - Cache Squid Docker image, add iptables save/restore cleanup - Add tracking comment for run visibility - Fix token revocation to use HTTPS_PROXY fix: replace A && B || C with proper if-then-else (SC2015) fix: capture error details instead of silent suppression OIDC exchange and token revocation now log the server response on failure instead of swallowing it with -sf/--silent/2>/dev/null. 
fix: shellcheck SC2001 and SC2015 in claude workflow Replace sed prompt extraction with parameter expansion (SC2001). chore: harden security practices chore: update claude action from secutiry * chore: rename claude.yml to claude-review.yml * chore: enforces changes in sandboxed claude-* workflow * ci(common): fix zizmor issues --------- Co-authored-by: enitrat <msaug@protonmail.com> Co-authored-by: Roger Carhuatocto <chilcano@intix.info> * fix(coprocessor): stop logging errors for unknown input verif events (#2077) * fix(coprocessor): stop logging errors for unknown input verif events * fix(coprocessor): update cargo dependence * ci(test-suite): run e2e tests with 2-of-2 coprocessor consensus (#2052) * ci(test-suite): run e2e tests with 2-of-2 coprocessor consensus Deploy with --coprocessors 2 --coprocessor-threshold 2 so both coprocessors must independently compute identical ciphertext digests for on-chain consensus to be reached. All existing tests pass unchanged — consensus enforcement is transparent. 
Adds a consensus watchdog (Mocha root hook) that monitors gateway chain events during tests: - Detects ciphertext digest divergence immediately - Detects consensus stalls within 3 minutes - No-op when GATEWAY_RPC_URL is unset (single-coprocessor runs) Closes zama-ai/fhevm-internal#1132 * fix(test-suite): address code review findings in consensus watchdog - Add public flush() method instead of casting to any to call private poll() - Add polling guard to prevent overlapping poll cycles from setInterval - Remove non-null assertion on INPUT_VERIFICATION_ADDRESS before null check - Prune resolved entries from maps (delete on consensus + track count via integers) - Remove consensusReached field from interfaces (no longer needed) - Simplify summary() to use map.size and counters instead of 4 array copies * test(consensus-watchdog): add unit tests for watchdog logic 12 tests covering: - Ciphertext digest divergence detection - SNS digest divergence detection - Input verification divergence detection - Consensus stall timeout detection - Map pruning on consensus resolution - Polling guard preventing overlapping polls - Summary output for resolved and pending entries - Graceful no-op when env vars are not set Also exports ConsensusWatchdog class for testability. 
* fix(test): cleanup resource leaks in watchdog unit tests - Destroy real ethers provider before replacing with stub in mockWatchdog() - Wrap env var mutation in try/finally to guarantee cleanup on test failure * fix(test-suite): skip proof monitoring when input verification is unset * ci(test-suite): install foundry in e2e workflow * fix(test-suite): avoid rerunning db migration for extra coprocessors * revert(ci): drop validation-only e2e changes * fix(test-suite): harden consensus watchdog * ci(test-suite): enable build-based e2e validation * fix(test-suite): avoid rerunning extra coprocessor migration * test-suite: clarify consensus watchdog summary * Revert "fix(test-suite): avoid rerunning extra coprocessor migration" This reverts commit 3a73efb. * Revert "ci(test-suite): enable build-based e2e validation" This reverts commit 818e565. * ci(test-suite): install foundry for 2-of-2 e2e deploys * fix(test-suite): avoid rerunning extra coprocessor migration * feat(coprocessor): re-randomise input ciphertexts before first compression (#2073) * feat(coprocessor): add re-randomisation of input ciphertexts * test(coprocessor): add regression tests for input re-randomisation * feat(common): simple acl (#2072) * refactor(coprocessor): remove ACL propagate ops (#1825) * chore(gateway-contracts): remove MultichainACL contract (#1904) * chore(gateway-contracts): remove MultichainACL from gateway-contracts * chore(coprocessor): remove multichainACL contract from coprocessor * chore(gateway-contracts): remove unused param from internal Decryption.sol function * chore(gateway-contracts): remove multichainACL checks from Decryption.sol tests * chore(gateway-contracts): update rust bindings * chore(gateway-contracts): make conformance * chore(gateway-contracts): remove multichainACL test from delegated user decrypt * chore(gateway-contracts): update bindings with foundry v1.3.1 as in CI * chore(gateway-contracts): bump Decryption.sol upgradeable version * 
chore(gateway-contracts): fix lint * refactor: remove arbitrum expiration date constraint from host ACL * refactor: remove unused params from isUserDecryptionReady & isDelegatedUserDecryptionReady * chore: fix comments * fix: fix ci upgrade contract flag * fix: remove test related to legacy expiry-too-soon constraint * chore: make conformance * chore(test-suite): use acl relayer (#2064) * chore(test-suite): update copro params * chore(test-suite): update contract addresses * chore(gateway-contracts): pauser task minor fix * chore(test-suite): update relayer * chore(test-suite): update relayer-sdk v0.4.1 * chore(test-suite): draft add negative acl tests * chore(test-suite): fix expired delegation, acl not allow tests (#2060) * chore(test-suite): update acl failure test for delegated user decr - Previously: negative delegated user decryption tests asserted on raw Solidity selector 0x0190c506 and was using relayer-sdk v0.4.1 that was not handling the label 'now_allowed_on_host_acl' from relayer. 
- Now: bump relayer-sdk to v0.4.2 that handles the label and asserts on relayer-sdk error label not_allowed_on_host_acl, matching the structured error returned by the relayer on HTTP 400 * chore(test-suite): fix expired delegation test - Issue: setting pastExpiration=1 reverted at the contract level (ExpirationDateBeforeOneHour) so the test never reached the decryption step - Fix: delegate with a valid expiration (now + 1h1m), then use evm_increaseTime to fast-forward past it before attempting decryption * chore(common): update package-lock.json --------- Co-authored-by: Simon Eudeline <simon.eudeline@zama.ai> * fix(test-suite): relayer and copro config update * chore(test-suite): update test-suite versions --------- Co-authored-by: Manoranjith <manoranjith.ponnuraj@zama.ai> * chore(coprocessor): versions bump * chore(gateway-contracts): remove acl from upgrade manifest * tests(coprocessor): fix stop_retrying_verify_proof_on_gw_config_error() --------- Co-authored-by: Petar Ivanov <29689712+dartdart26@users.noreply.github.com> Co-authored-by: malatrax <71888134+zmalatrax@users.noreply.github.com> Co-authored-by: Manoranjith <manoranjith.ponnuraj@zama.ai> * feat(test-suite): introduce context-aware extraData changes * feat(host-contracts): implement context-aware KMSVerifier (#2028) * feat(kms-connector): context-aware extraData handling for decryption (#2032) * chore(kms-connector): rename fhe module to handle * chore(kms-connector): add and use helper function * chore(kms-connector): add kms_context table * chore(kms-connector): prepare ethereum listener * feat(kms-connector): kms context validation * chore(kms-connector): kms context tests * chore(kms-connector): ethereum listener termination * feat(gateway-contracts): implement context-aware KMS node configs and decryption * feat: implement context-aware KMS node configs and decryption * chore(gateway-contracts): apply a few arguments renaming * fix(gateway-contracts): refresh rust bindings * 
chore(gateway-contracts): reuse setter methods and adjust NatSpecs * chore(gateway-contracts): refresh rust bindings * refactor: apply suggested naming * refactor(gateway-contracts): apply suggested renaming * refactor: revert updateKmsContext naming * refactor(gateway-contracts): enable decryption upgrade workflow * chore(gateway-contracts): refresh bindings * chore(test-suite): introduce getExtraData() method from SDK * chore(test-suite): restore missed user decrypt ebool test case * feat(kms-connector): propagate empty extra_data for 0x00 * feat(kms-connector): propagate empty extra_data for 0x00 * chore(kms-connector): add TODO comment for the workaround and upgrade quinn-proto * chore(kms-connector): add TODO comment for the workaround and upgrade quinn-proto * chore(kms-connector): use dedicated core config for tests --------- Co-authored-by: Simon Eudeline <simon.eudeline@zama.ai> * chore(test-suite): upgrade relayer-sdk version * chore(test-suite): upgrade test-suite version in fhevm-cli --------- Co-authored-by: Oba <obatirou@gmail.com> Co-authored-by: Simon E. <simon.eudeline@zama.ai> * docs: update integration guide to add details about wrapping/unwrapping (#2132) * docs: Expand wallet guide to cover CEXs * docs: Add details about wrapping/unwrapping process * docs: Fixed menu links * chore: fix typo * fix: standardize BSD-clear license files (#2136) * ci(kms-connector): fix check-changes of bindings (#2138) * fix(gateway-contracts): overload `isUserDecryptionReady` with old signature (#2137) fix(gateway-contracts): overload isUserDecryptionReady with old signature * chore(gateway-contracts): refresh rust bindings * chore(test-suite): replace component versions * chore(gateway-contracts): refresh contract Charts --------- Co-authored-by: Oba <obatirou@gmail.com> Co-authored-by: Simon E. 
<simon.eudeline@zama.ai> Co-authored-by: Elias Tazartes <66871571+Eikix@users.noreply.github.com> Co-authored-by: Antoniu <90181190+antoniupop@users.noreply.github.com> Co-authored-by: Petar Ivanov <29689712+dartdart26@users.noreply.github.com> Co-authored-by: Mathieu <60658558+enitrat@users.noreply.github.com> Co-authored-by: Roger Carhuatocto <chilcano@intix.info> Co-authored-by: immortal tofu <clement@danjou.io> Co-authored-by: rudy-6-4 <rudy.sicard@zama.ai> Co-authored-by: enitrat <msaug@protonmail.com> Co-authored-by: malatrax <71888134+zmalatrax@users.noreply.github.com> Co-authored-by: Manoranjith <manoranjith.ponnuraj@zama.ai> Co-authored-by: Ankur Banerjee <ankurdotb@users.noreply.github.com>

cla-bot bot added the cla-signed label Feb 25, 2026

obatirou force-pushed the context-aware-kms-verifier branch from 719134c to abc448e Compare February 26, 2026 08:46

obatirou changed the title ~~feat(host-contracts): implement context-aware signer storage and veri…~~ feat(host-contracts): implement context-aware KMSVerifier Feb 26, 2026

obatirou force-pushed the context-aware-kms-verifier branch 2 times, most recently from 91aa0da to 92fece8 Compare February 26, 2026 09:19

obatirou force-pushed the context-aware-kms-verifier branch from 92fece8 to f668449 Compare February 26, 2026 09:33

refactor(host-contracts): simplify KMSVerifier tests

f35383a

Remove redundant tests and merge overlapping ones (55 → 47) without losing coverage. Removed tests were either strict subsets of other tests or exercised identical code paths.

zama-ai deleted a comment from claude bot Feb 26, 2026

obatirou marked this pull request as ready for review February 26, 2026 10:55

obatirou requested review from a team as code owners February 26, 2026 10:55

isaacdecoded reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (outdated, resolved)

isaacdecoded reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (resolved)

obatirou added 2 commits February 26, 2026 14:51

refactor(host-contracts): add distinct DeserializingExtraDataFail error

139f200

Rename the revert in _extractKmsContextId to DeserializingExtraDataFail so callers can distinguish a malformed extraData payload from a truncated decryption proof envelope (DeserializingDecryptionProofFail).
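The distinction this commit draws can be sketched off-chain. The TypeScript below is a minimal illustration, not the contract's code: the envelope layout (one signer-count byte, then 65-byte signatures, then extraData) and the extraData layout (empty or `0x00` for v0, `0x01` plus a 32-byte context ID for v1) are assumptions made for the example, and `failure`, `splitProof`, and `extractKmsContextId` are hypothetical helpers. Only the two error names come from the commit message.

```typescript
// Sketch only: the byte layouts below are assumptions, not taken from KMSVerifier.sol.
const SIGNATURE_LENGTH = 65; // assumed ECDSA signature size (r, s, v)

// Build an Error whose `name` mirrors one of the two Solidity custom errors.
function failure(name: string, detail: string): Error {
  const err = new Error(detail);
  err.name = name;
  return err;
}

// Assumed envelope: [numSigners (1 byte)] [numSigners * 65-byte signatures] [extraData].
function splitProof(proof: Uint8Array): { signatures: Uint8Array; extraData: Uint8Array } {
  if (proof.length < 1) throw failure("DeserializingDecryptionProofFail", "empty proof");
  const end = 1 + proof[0] * SIGNATURE_LENGTH;
  if (proof.length < end) throw failure("DeserializingDecryptionProofFail", "truncated signatures");
  return { signatures: proof.subarray(1, end), extraData: proof.subarray(end) };
}

// Assumed extraData: empty or 0x00 selects the current context (v0);
// 0x01 followed by a 32-byte context ID selects that context (v1).
function extractKmsContextId(extraData: Uint8Array, currentContextId: string): string {
  if (extraData.length === 0 || extraData[0] === 0x00) return currentContextId;
  if (extraData[0] !== 0x01) throw failure("DeserializingExtraDataFail", "unknown version");
  if (extraData.length < 33) throw failure("DeserializingExtraDataFail", "v1 payload too short");
  let hex = "0x";
  for (let i = 1; i <= 32; i++) {
    hex += (extraData[i] < 16 ? "0" : "") + extraData[i].toString(16);
  }
  return hex;
}
```

A caller that catches the thrown error can branch on `err.name`: a truncated envelope points at the proof transport, while a malformed extraData payload points at whoever built the context reference.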

style: make prettier

e2dc53b

jatZama reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (outdated, resolved)

jatZama reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (outdated, resolved)

jatZama reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (outdated, resolved)

jatZama reviewed Feb 26, 2026

host-contracts/contracts/KMSVerifier.sol (resolved)

obatirou added 6 commits February 27, 2026 10:58

refactor(host-contracts): remove setThreshold from KMSVerifier

1d2d7e9

docs(host-contracts): fix reinitializeV2 NatSpec

cf8b36e

refactor(host-contracts): remove isValidKmsContext public view from KMSVerifier

ad1373d

refactor(host-contracts): extract _setupTwoContexts helper in KMSVerifier tests

c8f2b9b

Factor out the repeated context rotation setup (upgrade with 3 signers, then define a second context with signer3) into a shared helper used by 7 tests.

refactor(host-contracts): extract _deployUninitializedKMSVerifierProxy helper in KMSVerifier tests

92d8848

Factor out the repeated fresh-proxy deploy+upgrade block used by all 4 reinitializeV2 tests into a shared helper.

refactor(host-contracts): extract _buildSingleSignerProof helper in KMSVerifier tests

b053990

style: prettier

8774604

isaacdecoded approved these changes Feb 27, 2026


obatirou merged commit 5f2a07b into feat/kms-context-switch Feb 27, 2026
52 checks passed

obatirou deleted the context-aware-kms-verifier branch February 27, 2026 14:39

eudelins-zama pushed a commit that referenced this pull request Mar 5, 2026

feat(host-contracts): implement context-aware KMSVerifier (#2028)

f2c56ed

obatirou merged 12 commits into feat/kms-context-switch from context-aware-kms-verifier


obatirou commented Mar 10, 2026


4 participants
