Status: Stable · Version: 1.1 · License: CC-BY-4.0 (spec) / MIT (tools) · Last updated: 2026-06-20
This is the informal, example-led specification. The normative standard — with RFC 2119 conformance language, a formal ABNF grammar, a JSON schema, the IANA media-type registration, and the security/conformance/interoperability/change-control documents — lives in
standard/AIGX-1.1.md. Where the two differ, the standard/ directory governs.
v1.1 adds §8 Scaling - hierarchical (sharded) genomes and per-file resolution - so the format bounds context cost on large repositories and monorepos. v1.1 is backwards-compatible with v1.0 (a single root
.aigx/ is the one-package case).
AIGX (AI Genome Exchange) is a context format for AI coding agents. This document is the normative definition. The key words MUST, SHOULD, and MAY are used per RFC 2119.
Design north star: an AI agent reads selectively at the edit site. The format MUST make the binding constraint for the file being edited reachable in one lookup, while keeping the source code untouched.
An AIGX genome is a directory named .aigx/ at the repository root, plus optional per-domain cards
colocated with source folders.
<repo-root>/
├── .aigx/
│ ├── protocol.aigx # REQUIRED - the read protocol
│ ├── product.aigx # RECOMMENDED - product context + doc freshness
│ ├── files.aigx # REQUIRED - the per-file boundary index
│ └── <concern>.aigx # REQUIRED (≥1) - per-concern rule files
└── <any source dir>/
└── <key>.aigx # OPTIONAL - a per-domain card
- Files use the .aigx extension and UTF-8 encoding.
- The syntax is XML-style tags (chosen for parseability), but a genome is read by an LLM, not a strict XML parser; well-formedness SHOULD hold but is not required to be schema-validated.
- Source code files MUST NOT be modified. AIGX is centralized; nothing is injected into source.
Every rule has a stable identifier of the form PREFIX-SLUG where SLUG is either a sequential number
or a semantic kebab-case phrase. Both are valid; semantic slugs are recommended for new genomes
because a human reviewer sees ARCH-no-deep-imports and immediately knows the rule without opening a
second file.
# Numeric (still valid, common in existing genomes)
ARCH-2 DATA-1 ENG-10
# Semantic (recommended for new genomes)
ARCH-no-deep-imports DATA-integer-cents ENG-no-idor
| Numeric | Semantic equivalent | Rule |
|---|---|---|
| ARCH-2 | ARCH-no-deep-imports | Every feature exposes one public barrel; deep imports forbidden. |
| DATA-1 | DATA-integer-cents | Monetary values are integer cents; float money is forbidden. |
| ENG-7 | ENG-no-idor | Authorization must scope every query to the authenticated principal. |
Rules:
- The PREFIX SHOULD name the concern (ARCH, DATA, AUTH, CACHE, PERF, TEST, AI, OFF, ENG, …). The prefix is conventionally the uppercased concern name.
- The SLUG MUST be ASCII letters, digits, and hyphens only (regex: [A-Za-z0-9-]+).
- Ids MUST be stable across edits - they are the cross-reference backbone used by <check> lists, <fact>s, and gotchas. Renaming a rule id is a breaking change to the genome.
- Ids are the unit of parity: any tool that re-renders a genome MUST preserve the full rule-id set.
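The id rules above can be checked mechanically. The following is a minimal sketch, not part of any bundled tool: the slug pattern is the one stated above, while the uppercase-letters-only prefix pattern and the helper names `parse_rule_id` and `is_numeric_slug` are assumptions made for illustration (the spec only says the prefix SHOULD name the concern).

```python
import re

# Assumption: the prefix is uppercase ASCII letters. The slug pattern is the
# one given in the spec: ASCII letters, digits, and hyphens.
RULE_ID = re.compile(r"(?P<prefix>[A-Z]+)-(?P<slug>[A-Za-z0-9-]+)")


def parse_rule_id(rule_id: str) -> tuple[str, str]:
    """Split a rule id into (prefix, slug); raise ValueError if malformed."""
    match = RULE_ID.fullmatch(rule_id)
    if match is None:
        raise ValueError(f"not a valid rule id: {rule_id!r}")
    return match.group("prefix"), match.group("slug")


def is_numeric_slug(slug: str) -> bool:
    """True for sequential-number slugs such as the '2' in ARCH-2."""
    return slug.isdigit()


print(parse_rule_id("ARCH-no-deep-imports"))  # ('ARCH', 'no-deep-imports')
```

Because the prefix pattern excludes hyphens, the split always happens at the first hyphen, so semantic slugs may contain further hyphens without ambiguity.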
The read protocol. It is the first thing an agent reads. It MUST instruct the agent to consult
files.aigx for each file it edits and to verify the file's <check> ids before finishing.
<aigx-protocol version="1.1">
<read-first>Open .aigx/files.aigx and find the <file> entry for EACH file you will edit … obey its
<forbid pri="CRIT"> and satisfy every id in its <check> before finishing.</read-first>
<step n="1">Read the per-concern rule files in .aigx/ that the task touches.</step>
<step n="2">Read .aigx/files.aigx for the per-file boundaries of files you edit.</step>
<step n="3">Schema-first; failing test first; minimal change, local blast radius.</step>
<step n="4">Run gates; verify each file's <check> ids hold.</step>
</aigx-protocol>
The protocol file SHOULD be short (one screen). Per the principles, lengthening the protocol or adding scaffolding to it did not improve outcomes and sometimes hurt.
Top-level product context. SHOULD include a <freshness> element that explicitly states which older
documents are superseded - this resolves stale-doc conflicts an agent would otherwise inherit.
<aigx-product name="…">
<name>…</name>
<standard>…what 'good' means for this product…</standard>
<freshness>…which dated docs are historical and yield to this genome…</freshness>
<stack>…the tech stack…</stack>
</aigx-product>
Each concern file is a flat list of <rule> elements carrying the full rule text:
<aigx-architecture>
<rule id="ARCH-2">Every feature exposes ONE public API: its index.ts barrel. Deep imports are forbidden.</rule>
<rule id="ARCH-6">TypeScript strict mode; the `any` type is forbidden in any form.</rule>
</aigx-architecture>
- Element name SHOULD be aigx-<concern>; child elements MUST be <rule id="…">.
- Rule text MUST be the authoritative, complete statement of the rule. Glosses/abbreviations belong (if anywhere) in the index, never here.
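For tooling, a concern file can be loaded into a rule-id to rule-text map. This is a rough sketch that assumes the file is well-formed XML (the spec only says it SHOULD be) and uses Python's standard `xml.etree.ElementTree`; the function name `load_rules` and the rejection of duplicate ids are assumptions for illustration, not requirements stated here.

```python
import xml.etree.ElementTree as ET

CONCERN_XML = """\
<aigx-architecture>
  <rule id="ARCH-2">Every feature exposes ONE public API: its index.ts barrel. Deep imports are forbidden.</rule>
  <rule id="ARCH-6">TypeScript strict mode; the `any` type is forbidden in any form.</rule>
</aigx-architecture>
"""


def load_rules(xml_text: str) -> dict[str, str]:
    """Map each rule id to its full text; reject missing or duplicate ids."""
    root = ET.fromstring(xml_text)
    rules: dict[str, str] = {}
    for rule in root.iter("rule"):
        rule_id = rule.get("id")
        if not rule_id:
            raise ValueError("<rule> without an id attribute")
        if rule_id in rules:
            raise ValueError(f"duplicate rule id: {rule_id}")
        # itertext() keeps the full text even if the rule contains inline tags.
        rules[rule_id] = "".join(rule.itertext()).strip()
    return rules


rules = load_rules(CONCERN_XML)
print(sorted(rules))  # ['ARCH-2', 'ARCH-6']
```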
A flat list of <file> entries, one per source file an agent is likely to edit.
<aigx-files>
<file path="src/features/meetings/bookMeeting.ts" domain="meetings">
<role>Book a meeting (validate slot + contact)</role>
<forbid pri="CRIT">NEVER import @/features/suppliers/internal/* (deep import = ARCH-2)</forbid>
<gotcha pri="CRIT">get contact_email from the suppliers PUBLIC api, never the internal mapper</gotcha>
<check>ARCH-2 ARCH-4 ARCH-5 DATA-2 TEST-1</check>
</file>
</aigx-files>
Per-entry fields:
| Element | Card. | Meaning |
|---|---|---|
| path (attr) | 1 | Repo-relative path of the file. REQUIRED. |
| domain (attr) | 0-1 | The domain/feature key this file belongs to. |
| <role> | 0-1 | One line: what this file is for. |
| <forbid> | 0-1 | A hard NEVER-do boundary (typically a forbidden import). SHOULD be rare - only files with a real boundary carry one. |
| <gotcha> | 0-1 | The single most important pitfall for this file. |
| <check> | 0-1 | Space-separated rule-ids the agent MUST verify before finishing. |
Normative authoring constraints (these are what the benchmark validated):
- Scarcity. <forbid> SHOULD appear on only the few files that truly have an import boundary. Marking many files dilutes the signal and measurably reduces compliance.
- One gotcha. Each entry SHOULD carry at most one <gotcha> - the single worst pitfall - not a list.
- Terse fields only. The index SHOULD carry only role + forbid + gotcha + check. Richer per-file fields (allow/schema/data/perf) were tested and did not improve outcomes.
<forbid> and <gotcha> MAY carry pri="CRIT". In the validated design, all critical boundaries
use a single uniform level (CRIT). Graded scales (CRIT/WARN, CRIT/HIGH/NORM) were tested and did not
help. Tools MUST treat an absent pri as normal priority.
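As an illustration of the per-entry fields and the priority default, here is a small sketch of how a tool might parse one <file> element. It assumes well-formed XML; the `FileEntry` dataclass, the `parse_entry` helper, and the `NORMAL` label for an absent pri are hypothetical names, not part of the format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional

NORMAL = "NORMAL"  # hypothetical label meaning "no pri attribute present"


@dataclass
class FileEntry:
    path: str
    role: Optional[str] = None
    forbid: Optional[str] = None
    forbid_pri: str = NORMAL
    gotcha: Optional[str] = None
    gotcha_pri: str = NORMAL
    check: List[str] = field(default_factory=list)


def parse_entry(elem: ET.Element) -> FileEntry:
    """Build a FileEntry from one <file> element."""
    forbid = elem.find("forbid")
    gotcha = elem.find("gotcha")
    return FileEntry(
        path=elem.attrib["path"],  # REQUIRED by the spec; KeyError if absent
        role=elem.findtext("role"),
        forbid=None if forbid is None else forbid.text,
        forbid_pri=NORMAL if forbid is None else forbid.get("pri", NORMAL),
        gotcha=None if gotcha is None else gotcha.text,
        gotcha_pri=NORMAL if gotcha is None else gotcha.get("pri", NORMAL),
        check=(elem.findtext("check") or "").split(),  # space-separated ids
    )


entry = parse_entry(ET.fromstring(
    '<file path="src/features/meetings/bookMeeting.ts" domain="meetings">'
    "<role>Book a meeting (validate slot + contact)</role>"
    "<gotcha>get contact_email from the suppliers PUBLIC api</gotcha>"
    "<check>ARCH-2 TEST-1</check>"
    "</file>"
))
print(entry.gotcha_pri, entry.check)  # NORMAL ['ARCH-2', 'TEST-1']
```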
Colocated with a source folder, named after the domain key. Gives feature-level context.
<aigx-domain key="suppliers" path="src/features/suppliers">
<purpose>…</purpose>
<public_api>…the barrel / entry point…</public_api>
<test>…the test policy for this domain…</test>
<blast>…the blast radius…</blast>
<facts>
<fact>…a fact, tagged with the rule id it enforces (ARCH-3).</fact>
</facts>
</aigx-domain>
To make any agent AIGX-aware, append this to its instructions (system prompt, AGENTS.md, CLAUDE.md,
Cursor rule, etc.). It is the only integration step required.
This repository uses AIGX - the AI Genome Exchange context format. The .aigx/ directory holds the context: read .aigx/protocol.aigx first; then the per-concern rule files (.aigx/<concern>.aigx, each a set of <rule id="…"> tags) your task touches. .aigx/files.aigx is the PER-FILE BOUNDARY INDEX: for EACH file you edit, find its <file path="…"> entry - obey its <forbid pri="CRIT"> (NEVER-imports), heed its <gotcha>, and verify every id in its <check> before finishing. Each domain folder may have a <domain>.aigx card. Keep blast radius local unless justified.
A core property AIGX inherits from its benchmark: any transformation of a genome (re-rendering, exporting, compressing) MUST be semantics-preserving. Specifically it MUST preserve:
- the complete set of rule ids and their full text,
- every <file> entry's path, forbid, gotcha, and check ids,
- every <fact> and domain card's content.
A transformation MAY change representation, ordering, or formatting; it MUST NOT add, remove, or alter the
meaning of any rule, boundary, or fact. (Exporters to AGENTS.md/CLAUDE.md/.mdc are
parity-preserving projections.)
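A transformation tool could guard the rule part of this property with a before/after comparison. The sketch below is illustrative only: it compares rule-id to rule-text maps, treats whitespace changes as formatting, and uses exact text equality as a conservative stand-in for "same meaning", which no mechanical check can fully establish. The function names are hypothetical; the same shape would extend to <file> entries and facts.

```python
def _normalize(text: str) -> str:
    """Collapse whitespace so pure reformatting is not reported as a change."""
    return " ".join(text.split())


def parity_violations(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Compare two rule-id -> rule-text maps and list every difference."""
    problems = []
    for rule_id in sorted(before.keys() - after.keys()):
        problems.append(f"removed: {rule_id}")
    for rule_id in sorted(after.keys() - before.keys()):
        problems.append(f"added: {rule_id}")
    for rule_id in sorted(before.keys() & after.keys()):
        if _normalize(before[rule_id]) != _normalize(after[rule_id]):
            problems.append(f"text changed: {rule_id}")
    return problems


original = {"ARCH-2": "Deep imports are forbidden.", "DATA-1": "Money is integer cents."}
exported = {"DATA-1": "Money is  integer cents.", "ARCH-2": "Deep imports are discouraged."}
print(parity_violations(original, exported))  # ['text changed: ARCH-2']
```

Reordering and whitespace differences pass, while a softened rule ("forbidden" to "discouraged") is flagged.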
A directory is a conforming AIGX genome if it has, at minimum:
- a .aigx/protocol.aigx instructing per-file index lookup and <check> verification,
- at least one .aigx/<concern>.aigx with <rule id="…"> rules, and
- a .aigx/files.aigx with at least one <file path="…"> entry whose <check> ids resolve to rules that exist in the concern files.
A conforming AIGX reader (agent or tool) MUST, for each file it edits, consult that file's files.aigx
entry and honor its <forbid> and <check>.
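The requirement that <check> ids resolve to existing rules is straightforward to test. The following is a sketch under the assumption that concern files and the index are well-formed XML; `unresolved_checks` is a hypothetical helper, not the bundled linter's API.

```python
import xml.etree.ElementTree as ET


def unresolved_checks(concern_xmls: list[str], files_xml: str) -> dict[str, list[str]]:
    """Return {file path: [check ids with no matching <rule id>]}."""
    known = {
        rule.get("id")
        for text in concern_xmls
        for rule in ET.fromstring(text).iter("rule")
    }
    missing: dict[str, list[str]] = {}
    for entry in ET.fromstring(files_xml).iter("file"):
        check_ids = (entry.findtext("check") or "").split()
        unknown = [rule_id for rule_id in check_ids if rule_id not in known]
        if unknown:
            missing[entry.get("path")] = unknown
    return missing


concerns = ['<aigx-architecture><rule id="ARCH-2">One public barrel.</rule></aigx-architecture>']
index = (
    "<aigx-files>"
    '<file path="src/a.ts"><check>ARCH-2 TEST-1</check></file>'
    '<file path="src/b.ts"><check>ARCH-2</check></file>'
    "</aigx-files>"
)
print(unresolved_checks(concerns, index))  # {'src/a.ts': ['TEST-1']}
```

An empty result means every referenced id exists in some concern file.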
AIGX files MAY contain XML comments (<!-- … -->) for human authoring notes. Conforming readers MUST
ignore XML comments and MUST NOT treat them as rules, boundaries, or structured content. Comments are
useful for annotating why a rule exists or flagging a <forbid> for review; they have no semantic effect.
A single root files.aigx is correct for a small/mid project, but a flat index does not scale to a
50,000-file monorepo. AIGX v1.1 bounds context cost with two mechanisms.
A repository MAY contain multiple .aigx/ directories - one per package, workspace, or major subtree:
monorepo/
├── .aigx/ # OPTIONAL root genome: only org-wide rules + the few cross-cutting files
│ ├── protocol.aigx
│ └── architecture.aigx # rules that apply everywhere
├── packages/
│ ├── checkout/
│ │ └── .aigx/
│ │ ├── files.aigx # indexes ONLY packages/checkout/**
│ │ └── data.aigx # rules local to checkout
│ └── search/
│ └── .aigx/
│ └── files.aigx # indexes ONLY packages/search/**
Resolution rules (normative):
- A files.aigx entry's path is resolved relative to the repository root (so paths are unambiguous across shards).
- A genome at <dir>/.aigx/ SHOULD index only files under <dir>/. It MUST NOT be required to list files outside its subtree.
- For a file being edited, the applicable genome is the nearest ancestor .aigx/ directory; the root .aigx/ (if present) provides org-wide rules that apply in addition.
- An agent editing within one package SHOULD load only that package's genome (plus the small root genome), not sibling packages' genomes. Context is therefore bounded by the working package, not the repo.
This makes the design more consistent with the locality principle (principles L2): context is addressed to the subtree the agent is actually in.
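The nearest-ancestor rule can be sketched as a pure function over repo-relative paths. This is an illustration, not the reference resolver: `applicable_genomes` is a hypothetical name, and the set of directories that contain a .aigx/ folder is assumed to be known already, with "." standing for the repository root.

```python
from pathlib import PurePosixPath


def applicable_genomes(file_path: str, genome_dirs: set[str]) -> list[str]:
    """Return the genome directories to load for file_path.

    The nearest ancestor that holds a .aigx/ comes first, followed by the
    root genome (".") when it exists and is not already the nearest one.
    """
    loaded: list[str] = []
    # parents walks upward: closest directory first, "." (the root) last.
    for parent in PurePosixPath(file_path).parents:
        if str(parent) in genome_dirs:
            loaded.append(str(parent))
            break
    if "." in genome_dirs and "." not in loaded:
        loaded.append(".")
    return loaded


genomes = {".", "packages/checkout", "packages/search"}
print(applicable_genomes("packages/checkout/src/cart.ts", genomes))
# ['packages/checkout', '.']
```

Sibling genomes such as packages/search never appear in the result, which is what bounds context to the working package.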
The boundary index is meant to be looked up, not ingested. A conforming tool SHOULD be able to return
the single <file> entry for a given path, so an agent's context cost is O(1) per edited file,
independent of index size. The bundled reference linter implements this:
aigx-lint --resolve src/features/meetings/bookMeeting.ts
# prints exactly that file's <file> entry - nothing else
A 50,000-entry index thus costs an agent one resolution call (or one grep), not 50,000 lines. Editor and MCP integrations SHOULD expose this resolution so the agent never loads a whole index.
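To show the idea behind per-file resolution, here is a minimal sketch of a lookup that returns only one entry. It is not the bundled linter's implementation: `build_index` and `resolve` are hypothetical names, and the index is assumed to be well-formed XML. The index is parsed once by the tool; each lookup is then a dictionary access, and only the returned entry reaches the agent.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_index(files_xml: str) -> Dict[str, ET.Element]:
    """Parse files.aigx once and key each <file> element by its path."""
    return {entry.get("path"): entry for entry in ET.fromstring(files_xml).iter("file")}


def resolve(index: Dict[str, ET.Element], path: str) -> Optional[str]:
    """Return the serialized <file> entry for path, or None if not indexed."""
    entry = index.get(path)
    if entry is None:
        return None
    # tostring() includes trailing whitespace after the element, so strip it.
    return ET.tostring(entry, encoding="unicode").strip()


FILES_XML = """\
<aigx-files>
  <file path="src/a.ts"><role>Entry point</role><check>ARCH-2</check></file>
  <file path="src/b.ts"><role>Helper</role></file>
</aigx-files>
"""

index = build_index(FILES_XML)
print(resolve(index, "src/a.ts"))
```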
Because entries reference paths, a genome can drift as files move. A conforming repository SHOULD run a
validator (e.g. the bundled aigx-lint) in CI so that a moved/renamed file fails the build until its
entry is corrected - the same discipline used for CODEOWNERS, tsconfig path maps, and ESLint
import-boundary rules. See docs/limitations.md §2.
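A drift check of this kind reduces to confirming that every indexed path still exists. The sketch below assumes a well-formed index and repo-root-relative paths; `stale_paths` is a hypothetical helper and does not describe how aigx-lint is implemented.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def stale_paths(files_xml: str, repo_root: Path) -> list[str]:
    """Return indexed paths that no longer point at a file under repo_root."""
    return [
        entry.get("path")
        for entry in ET.fromstring(files_xml).iter("file")
        # path is REQUIRED on every <file> entry, so it is assumed present.
        if not (repo_root / entry.get("path")).is_file()
    ]
```

In CI, a non-empty result would fail the build, for example by exiting with a non-zero status after printing the stale paths.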
Status: hierarchical genomes and resolution are specified and tool-supported. They are not yet benchmarked at monorepo scale - that's labeled future work, not a measured claim.
This spec is v1.1. Backwards-incompatible changes increment the major version; backwards-compatible
additions increment the minor. Genomes MAY declare their target version via an optional version="1.1"
attribute on the root of protocol.aigx.
The default integration (§4) asks the agent to read .aigx/ files before editing. This works, but it
relies on model compliance with multi-step reading instructions ("tool laziness" means some models skip
steps under latency pressure).
A more robust pattern is JIT (Just-In-Time) context hydration: the environment - not the agent - resolves the genome entry and injects it into the agent's context before inference runs.
┌──────────────────────────────────────────────────────────┐
│ edit request: "fix bookMeeting.ts" │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ ENVIRONMENT (IDE / MCP / CI runner) │ │
│ │ aigx-lint --resolve bookMeeting.ts │ O(1) │
│ │ → injects <file> entry into system │ │
│ │ prompt for this inference call │ │
│ └─────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ AGENT receives task + pre-resolved boundary │
│ (no multi-step reads needed; constraint is already in │
│ the context window at the edit site) │
└──────────────────────────────────────────────────────────┘
Implementation via MCP: expose aigx-lint --resolve <path> as an MCP tool. The IDE/client calls it
automatically whenever the agent opens a file for editing, and prepends the result to the context. The
agent never sees the raw files.aigx; it receives the resolved, file-specific boundary directly.
{
"name": "aigx_resolve",
"description": "Return the AIGX boundary entry for a file path (role, forbid, gotcha, check ids).",
"inputSchema": {
"type": "object",
"properties": { "path": { "type": "string", "description": "Repo-relative file path." } },
"required": ["path"]
}
}
Implementation via pre-prompt injection: in a CI/CD or CLI wrapper, resolve the file before
constructing the LLM request and inject the <file> block into the system prompt:
import subprocess

# Sketch - run before every agent inference call.
# target_file and base_system_prompt are supplied by the surrounding wrapper.
entry = subprocess.check_output(
    ["python", "tools/aigx-lint/aigx_lint.py", "--resolve", target_file, "--root", "."],
    text=True,  # decode stdout to str instead of bytes
)
system_prompt = f"{base_system_prompt}\n\nAIGX boundary for {target_file}:\n{entry}"
See docs/jit-hydration.md for full patterns, reference implementations, and
how to wire this into Claude Code, Cursor, and custom agents.
See examples/sourcing-app/ for a complete conforming genome,
BENCHMARK.md for the evidence behind every "SHOULD", and
docs/limitations.md for the scope and honest caveats.