APEX (Agent Payment Exchange) Protocol v1 is a lightweight ERC-8183 Agentic Commerce Protocol deployment paired with a pluggable, UMA-style optimistic evaluator. Two agents (a Client who pays and a Provider who delivers) transact around the ERC-8183 job lifecycle: create → fund → submit → evaluate → settle. The evaluator is a Router that routes each job to a policy contract. The only v1 policy is OptimisticPolicy: a job is approved by default once the dispute window elapses, but if the Client raises a dispute, a quorum of whitelisted voters can reject it.
See docs/design.md for the authoritative design. ERC-8183 compliance status (including the exact spec version reviewed against) is tracked in docs/erc-8183-compliance.md.
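The OptimisticPolicy behaviour summarized above can be modelled off-chain in a few lines. This is a minimal sketch, not the contract logic: the `SubmittedJob` shape, the `evaluateOptimistic` helper, the inclusive window boundary, and the choice to report `Pending` while a disputed job is below quorum are all assumptions made for illustration. `docs/design.md` remains the authority.

```typescript
// Hypothetical off-chain model of the optimistic evaluation rule.
type Verdict = "Pending" | "Approved" | "Rejected";

interface SubmittedJob {
  submittedAt: number; // unix seconds at which the Provider submitted
  disputed: boolean; // true once the Client has raised a dispute
  rejectVotes: number; // reject votes cast by whitelisted voters
}

function evaluateOptimistic(
  job: SubmittedJob,
  now: number,
  disputeWindow: number,
  voteQuorum: number,
): Verdict {
  if (job.disputed) {
    // Assumption: a disputed job is rejected only once quorum is reached;
    // below quorum it stays pending in this model.
    return job.rejectVotes >= voteQuorum ? "Rejected" : "Pending";
  }
  // Silence means approval, but only after the dispute window has elapsed.
  return now >= job.submittedAt + disputeWindow ? "Approved" : "Pending";
}
```

In this model an undisputed job can never be rejected and a disputed job can never be approved by silence, which is the property the one-line description implies.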
- Language: Solidity `0.8.28` (EVM `cancun`, optimizer `200` runs, `viaIR: true`).
- Framework: Hardhat 3 (`@nomicfoundation/hardhat-toolbox-viem`) with `hardhat-viem`, `hardhat-ethers`, `hardhat-verify`.
- Libraries: OpenZeppelin `@openzeppelin/contracts@5.4.0` + `@openzeppelin/contracts-upgradeable@5.4.0` (pinned; storage-layout audit required to bump).
- Proxy pattern: UUPS (ERC-1967). `AgenticCommerceUpgradeable` uses flat upgradeable storage; `EvaluatorRouterUpgradeable` uses ERC-7201 namespaced storage (`apex.router.storage.v1`).
- Test runner: Bun's test runner (`bun test`): consumes the `node:test` API (`describe` / `it` / `before` + `node:assert/strict`) natively, viem-based assertions.
- TypeScript: `~5.8.0`; viem `^2.38.0`; ethers `^6.15.0` (used for keccak/abi utils). TS files are executed directly by Bun (no tsx/ts-node loader).
- Runtime: Bun `>= 1.3` for dev/test; Node.js `>= 22.10.0` is still required by Hardhat 3 when `bunx hardhat` shells out to its runtime. Package manager: bun (lockfile: `bun.lock`).
- Linting / formatting: `solhint@6.x` (`.solhint.json`) for Solidity static checks; `prettier@3.x` + `prettier-plugin-solidity@2.x` (`.prettierrc`) for `.sol` + `.ts` + `.md` + `.json` formatting. Invoked via `bun run lint:sol` / `bun run format{,:check}`.
Three-layer design:
contracts/
AgenticCommerceUpgradeable.sol # ERC-8183 kernel (UUPS, Pausable)
EvaluatorRouterUpgradeable.sol # UUPS routing layer (acts as job.evaluator + job.hook)
OptimisticPolicy.sol # Immutable policy: silence-approve + dispute/quorum
IACP.sol # Implementation-level kernel interface
IPolicy.sol # Router ↔ policy interface
IACPHook.sol # ERC-8183 hook interface
ERC1967Proxy.sol # Test-helper proxy wrapper
mocks/ # Test-only contracts (not deployed to live nets)
MockERC20.sol # Test payment token
RevertingHook.sol # Proves claimRefund is non-hookable
AgenticCommerceV2Mock.sol # UUPS upgrade target for commerce
EvaluatorRouterV2Mock.sol # UUPS upgrade target for router
scripts/
deploy.ts # Unified deploy / impl upgrade / policy rotation (print-only; no file side-effects)
addresses.ts # Hand-committed registry of deployed proxy/policy addresses
test/
helpers.ts # Shared test fixtures
AgenticCommerce.test.ts
EvaluatorRouter.test.ts
OptimisticPolicy.test.ts
Lifecycle.test.ts
docs/
design.md # Canonical design document
erc-8183-compliance.md # ERC-8183 compliance matrix + change log
hardhat.config.ts # Networks (bscTestnet, bsc, bscTestnetFork, localhost)
Key architectural constraints (never violate without an upgrade audit):
- `AgenticCommerceUpgradeable` uses flat upgradeable storage (6 slots + `__gap[44]`). Never reorder or remove fields; only append by shrinking `__gap`.
- `EvaluatorRouterUpgradeable` uses ERC-7201 namespaced storage with id `apex.router.storage.v1`. Never change the namespace; only append fields to `RouterStorage`.
- `claimRefund()` on the kernel is not pausable and not hookable; this is a deliberate safety property and the universal escape hatch at expiry.
- `OptimisticPolicy` maintains the invariant `voteQuorum ≤ activeVoterCount`. `setQuorum` and `removeVoter` both revert when the invariant would break.
- Router's `beforeAction` / `afterAction` are not `nonReentrant`: they sit on the reentrant path `settle → commerce.complete → router.afterAction`. Access control relies on `msg.sender == commerce`.
- Router is UUPS; it doubles as the ERC-8183 `job.hook`. This deviates from ERC-8183's `SHOULD NOT` for upgradeable hooks. Mitigation: multisig + TimelockController; operational default is NEVER UPGRADE. Prefer drain-and-redeploy via `router.pause()` + expiry refund.
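The quorum invariant listed above is easy to get wrong when adding admin functions, so a small off-chain model can help when reasoning about test cases. The sketch below is a hypothetical in-memory stand-in, not the Solidity contract: only `voteQuorum`, `activeVoterCount`, `setQuorum`, and `removeVoter` come from the text, while the `VoterRegistry` class, `addVoter`, and the error messages are invented for illustration.

```typescript
// Hypothetical model of the voteQuorum ≤ activeVoterCount invariant.
class VoterRegistry {
  private voters = new Set<string>();
  private quorum = 0;

  get activeVoterCount(): number {
    return this.voters.size;
  }

  get voteQuorum(): number {
    return this.quorum;
  }

  // Adding a voter can only loosen the invariant, so no check is needed.
  addVoter(voter: string): void {
    this.voters.add(voter);
  }

  // Throws (models a revert) if the new quorum exceeds the active voter count.
  setQuorum(newQuorum: number): void {
    if (newQuorum > this.voters.size) {
      throw new Error("quorum exceeds active voters");
    }
    this.quorum = newQuorum;
  }

  // Throws (models a revert) if removal would leave fewer voters than the quorum.
  removeVoter(voter: string): void {
    if (!this.voters.has(voter)) {
      throw new Error("not an active voter");
    }
    if (this.voters.size - 1 < this.quorum) {
      throw new Error("removal would break quorum");
    }
    this.voters.delete(voter);
  }
}
```

With three voters and a quorum of three, `removeVoter` throws until the quorum is lowered first, which is the ordering an operator would presumably have to follow on-chain as well.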
# Install
bun install
# Build
bun run compile # == bunx hardhat compile
# Tests (62 tests, ~1.3s)
bun test
# Formatting + lint
bun run format # Prettier (+ prettier-plugin-solidity) write
bun run format:check # CI gate: fails if anything drifts
bun run lint:sol # solhint — 0 errors required (warnings ok)
# Local node
bun run node # == bunx hardhat node
# Local development (in a second terminal after `bun run node`)
bun run deploy:local # Uses .env; deploys stack to localhost
bun run fund:local # Sends ETH + MockERC20 to FUND_RECIPIENT
# Deployment (BSC Testnet) — same script handles first deploy, impl upgrade,
# and policy rotation (decided per-field from scripts/addresses.ts)
bun run deploy:testnet # Uses .env
# Verification (manual)
bunx hardhat verify --network bscTestnet <impl_address>

A single `.env` covers local + testnet + mainnet; `--network` on the hardhat CLI selects which `BSC_*_PRIVATE_KEY` / `BSC_*_RPC_URL` is read. Never commit it. See `.env.example` for the full schema.
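The per-network selection described above could look roughly like the following. This is only a sketch under stated assumptions: the `TESTNET` and `MAINNET` infixes, the `loadSecrets` helper, and the `NetworkSecrets` shape are hypothetical, and `.env.example` is the authority on the real variable names. The environment is passed in as a plain object so the example stays free of runtime-specific globals.

```typescript
// Hypothetical mapping from a Hardhat network name to the BSC_* infix.
// The infix values are assumptions; check .env.example for the real schema.
type Network = "bscTestnet" | "bsc";

const ENV_INFIX: Record<Network, string> = {
  bscTestnet: "TESTNET",
  bsc: "MAINNET",
};

interface NetworkSecrets {
  rpcUrl: string;
  privateKey: string;
}

function loadSecrets(
  network: Network,
  env: Record<string, string | undefined>,
): NetworkSecrets {
  const infix = ENV_INFIX[network];
  const rpcKey = `BSC_${infix}_RPC_URL`;
  const pkKey = `BSC_${infix}_PRIVATE_KEY`;
  const rpcUrl = env[rpcKey];
  const privateKey = env[pkKey];
  if (!rpcUrl || !privateKey) {
    // Name the missing variables only; never echo their values.
    throw new Error(`missing ${rpcKey} or ${pkKey}`);
  }
  return { rpcUrl, privateKey };
}
```

The error path reports variable names rather than values, in line with the rule elsewhere in this guide against logging secrets.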
Always match the user's language. If the user writes in Chinese, every word of your reply must be in Chinese. If in English, reply in English. This rule overrides everything else and applies to every single response, including plans, summaries, and error messages. Never mix languages mid-response.
Follow these steps in strict order for every code change request. If you are about to call Edit/Write/Bash without completing Step 1 and receiving explicit approval — STOP. Go back to Step 1.
Before planning, surface your understanding:
- State assumptions explicitly. If the request has multiple valid interpretations, list them — don't pick one silently. Ask for clarification before proceeding.
- Search for mature libraries that solve the problem. Evaluate trade-offs: proven library vs. custom (maintenance, fit, size).
- Only build custom if no suitable library exists or the fit is poor.
Summarize findings, then move to Step 1.
Output a plan using this exact format:
## Plan
**Goal:** <one sentence>
**Assumptions:** <explicit assumptions; flag any ambiguity>
**Files:** <list of files to create/modify/delete>
**Approach:** <how, key decisions, trade-offs, libraries chosen>
**Verify:** <what success looks like — which test passes, which behavior works>
**Risk:** <what could break, security implications>
HARD RULES:
- After printing the Plan, your message ENDS. No code. No "I'll start by...".
- Do NOT call Edit, Write, Bash, or any file-modifying tool in this turn.
- Wait for the user to explicitly reply. Explicit approval = "ok", "go", "yes", "继续", "好", or equivalent.
- A clarifying question is NOT approval — answer it and wait again.
- If the user approves but asks for changes, revise the plan and STOP again.
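The approval gate in the hard rules above can be pictured as a tiny state check. The sketch below is illustrative only: the `WorkflowState` shape and the helper names are hypothetical, and the "or equivalent" clause is deliberately not modelled, since deciding equivalence needs judgement rather than string matching.

```typescript
// Approval words quoted from the hard rules; "or equivalent" is not modelled.
const APPROVAL_WORDS: readonly string[] = ["ok", "go", "yes", "继续", "好"];

function isExplicitApproval(reply: string): boolean {
  const normalized = reply.trim().toLowerCase();
  return APPROVAL_WORDS.indexOf(normalized) !== -1;
}

// Hypothetical workflow state; field names are illustrative.
interface WorkflowState {
  planPrinted: boolean;
  planApproved: boolean;
}

// File-modifying tools are allowed only after a printed plan was approved.
function mayModifyFiles(state: WorkflowState): boolean {
  return state.planPrinted && state.planApproved;
}

// A reply only counts once a plan exists; anything that is not an exact
// approval word (a question, or approval with requested changes) keeps
// the gate closed.
function onUserReply(state: WorkflowState, reply: string): WorkflowState {
  if (!state.planPrinted) {
    return state;
  }
  return { ...state, planApproved: isExplicitApproval(reply) };
}
```

Under this model a reply such as "ok, but rename the file" does not open the gate, matching the rule that approval with changes requires a revised plan and another stop.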
Implement the approved plan exactly. Rules while executing:
- Surgical changes only. Every changed line must trace directly to the task. Don't "improve" adjacent code, comments, or formatting.
- Minimum code. No speculative features, no unrequested abstractions, no configurability that wasn't asked for.
- Dead code:
  - Orphans YOUR changes created → remove immediately.
  - Pre-existing dead code you notice → mention in Summary, don't delete it.
- If scope needs to change mid-implementation → STOP, go back to Step 1.
- Run the project's test suite (see Common Commands above).
- Run lint and format checks.
- Run type checking.
- Fix all failures before proceeding.
- Confirm the Verify criteria from Step 1 are met.
## Summary
**Changed:** <file list with one-line descriptions>
**Tests:** <tests added/updated/passed>
**Notes:** <dead code noticed, trade-offs made, anything the user should know>
Then print a ready-to-run git command — do NOT execute it.
When the user reports that ERC-8183 has published a new version (e.g. "the spec has been updated", "there's a new 8183 draft", or any equivalent), treat it as a Step-1 Plan trigger. Do not modify any code yet. Run this protocol:
- Fetch & diff. Compare the new spec against the version recorded in the `Spec version reviewed` header of `docs/erc-8183-compliance.md`.
- Plan. Output a Plan per the Spec-Driven Workflow above, listing every normative delta, the affected contracts/tests, and the migration-risk classification (small = UUPS upgrade, medium = Router interface change, large = fresh deployment).
- Wait for approval. Do not edit until the user explicitly approves.
- Update the compliance doc. After approval, even if no code change is needed, refresh `docs/erc-8183-compliance.md`: rewrite the `Summary`, update each row of `Detail Items`, adjust the `Non-blocking Deltas` section, and append a new `Change Log` entry dated today. Bump `Spec version reviewed` and `Last reviewed` in the header.
- Sync design docs. If behaviour actually changes, update `docs/design.md` so it stays consistent with the code.
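The migration-risk classification used in the Plan step can be captured as a small lookup. This is a sketch, not part of the repository: the `SpecDelta` shape and `overallRisk` helper are hypothetical, and taking the most severe delta as the overall class is an assumption rather than something the protocol above states.

```typescript
type Risk = "small" | "medium" | "large";

// Migration path per risk class, as listed in the Plan step above.
const MIGRATION_PATH: Record<Risk, string> = {
  small: "UUPS upgrade",
  medium: "Router interface change",
  large: "fresh deployment",
};

// Hypothetical shape for one normative delta found by the spec diff.
interface SpecDelta {
  summary: string;
  risk: Risk;
}

// Ordered from least to most severe.
const SEVERITY: Risk[] = ["small", "medium", "large"];

// Assumption: the overall classification is that of the most severe delta.
// Returns undefined when the diff contains no normative deltas.
function overallRisk(deltas: SpecDelta[]): Risk | undefined {
  let worst: Risk | undefined;
  for (const delta of deltas) {
    if (
      worst === undefined ||
      SEVERITY.indexOf(delta.risk) > SEVERITY.indexOf(worst)
    ) {
      worst = delta.risk;
    }
  }
  return worst;
}
```

An empty delta list returns `undefined`, which corresponds to the case where only the compliance doc needs refreshing and no code migration is planned.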
- No redundant code. Extract repeated logic; never copy-paste.
- Single responsibility. One file = one clear purpose. Split large files proactively.
- No abstractions for single-use code. Three similar lines > a premature abstraction.
- No dead code, no unused imports, no commented-out code (that you wrote).
- Functions: small, single-purpose, early returns.
- Naming: self-explanatory — if it needs a comment, rename it.
- No error handling for impossible scenarios. Trust internal guarantees; validate only at system boundaries.
- Folder structure: clean and intentional.
- Validate all external input at the system boundary.
- Never log secrets, private keys, or credentials.
- Never hardcode secrets — use environment variables.
- Never commit `.env` or credential files.
Before responding to a code change request:
- Have I output a Plan yet? If no → go to Step 1.
- Has the user explicitly approved the Plan? If no → do not write any code.
- Am I about to call Edit/Write/Bash? If yes and Step 2 hasn't started → STOP.