eXist-db is an open-source native XML database with full XQuery support. The main branch is develop. The project uses Maven for builds with 50+ modules, ANTLR 2 for the XQuery parser, and Java 21+.
- Repository: https://github.com/eXist-db/exist
- License: LGPL 2.1
- Java: 21+ required (Zulu recommended)
- Build system: Maven (multi-module)
- Parser: ANTLR 2 (not ANTLR 4)
JAVA_HOME=$(/usr/libexec/java_home -v 21) \
mvn -T1.5C clean install -DskipTests -Ddependency-check.skip=true -Ddocker=false \
-Pskip-build-dist-archives

On macOS, -P skip-build-dist-archives also suppresses the .app bundle and DMG (sets skip.mac.dist=true internally). Use -P '!mac-dmg-on-mac' only if you want the archives but not the DMG.
mvn install -pl exist-core -am -DskipTests -Ddependency-check.skip=true -Ddocker=false

The -am (also-make) flag is required because exist-core has cross-module dependencies (e.g., EXistClassLoader in exist-start).
# XQSuite tests (XQuery test framework)
mvn test -pl exist-core -Dtest="xquery.xquery3.XQuery3Tests" -Ddependency-check.skip=true -Ddocker=false
# Full unit test suite
mvn test -pl exist-core -Ddependency-check.skip=true -Ddocker=false
# Specific JUnit test class
mvn test -pl exist-core -Dtest="org.exist.xquery.XPathQueryTest" -Ddependency-check.skip=true -Ddocker=false

Produces release archives and platform-specific packages. Output lands in exist-distribution/target/.
JAVA_HOME=$(/usr/libexec/java_home -v 21) \
mvn -T1.5C clean package \
-pl exist-distribution -am \
-DskipTests \
-Ddependency-check.skip=true \
-Ddocker=false \
-Drevision=7.0.0-SNAPSHOT

macOS: the mac-dmg-on-mac profile is active by default and produces an unsigned .app bundle and DMG. Suppress both with -P '!mac-dmg-on-mac'. For the fully signed and notarized DMG used in releases, see exist-versioning-release.md.
Linux: the mac-dmg-on-unix profile is active by default on non-CI Linux machines (suppressed when env.CI=true) and produces an unsigned DMG. Requires hfsplus-tools (apt-get install hfsprogs hfsplus / yum install hfsutils hfsplus-tools); warns and skips gracefully if missing. Suppress with -P '!mac-dmg-on-unix'.
Both DMG profiles are suppressed automatically by -P skip-build-dist-archives via the skip.mac.dist property.
Produces the cross-platform installer JAR in exist-installer/target/.
JAVA_HOME=$(/usr/libexec/java_home -v 21) \
mvn -T1.5C clean package \
-Prelease-build \
-pl exist-installer -am \
-DskipTests \
-Ddependency-check.skip=true \
-Ddocker=false \
-Drevision=7.0.0-SNAPSHOT

Run the installer: java -jar exist-installer/target/exist-installer-7.0.0-SNAPSHOT.jar
# Build the Docker image
mvn -T1.5C clean package -DskipTests -Ddependency-check.skip=true -Ddocker=true \
-Pskip-build-dist-archives \
-pl exist-docker -am
cp exist-docker/target/classes/Dockerfile exist-docker/target/exist-docker-*-docker-dir/Dockerfile
docker build -t existdb/existdb:local exist-docker/target/exist-docker-*-docker-dir/
# Run
docker run -d --name existdb -p 8080:8080 -p 8443:8443 existdb/existdb:local
# Access at http://localhost:8080/exist/

- Full test suite can hang on flaky infrastructure tests (MoveResourceTest, RenameCollectionTest). Check with jstack and kill if stuck >15 min. RenameCollectionTest "Connection refused" failures are pre-existing and unrelated to XQuery changes.
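
The 15-minute rule above can be enforced mechanically. The following is a minimal sketch, not part of the eXist-db build: run_with_deadline is a hypothetical helper name, and it only bounds the launched process. Taking the jstack thread dump remains a manual step, and forked test JVMs may need separate cleanup.

```shell
# Hypothetical helper (an assumption, not shipped with eXist-db):
# run a command, terminate it if it is still alive after LIMIT seconds.
# Returns 124 on timeout, otherwise the command's own exit status.
run_with_deadline() {
  limit="$1"; shift
  "$@" &
  pid=$!
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$limit" ]; then
      # Before killing a real test run, consider: jstack <pid of the test JVM>
      kill "$pid" 2>/dev/null
      wait "$pid" 2>/dev/null
      return 124
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # Process ended on its own: propagate its exit status.
  wait "$pid"
}

# Example (900 s = 15 min), shown as a comment so the sketch has no side effects:
#   run_with_deadline 900 mvn test -pl exist-core -Ddependency-check.skip=true -Ddocker=false
```

A return code of 124 mirrors the convention of GNU timeout, so callers can distinguish a hang from an ordinary test failure.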
eXist uses ANTLR 2.7.7 for the XQuery parser. The grammar files are:
- exist-core/src/main/antlr/org/exist/xquery/parser/XQuery.g: lexer + parser (~3500 lines)
- exist-core/src/main/antlr/org/exist/xquery/parser/XQueryTree.g: tree walker
- exist-core/src/main/antlr/org/exist/xquery/parser/DeclScanner.g: declaration pre-scanner
- testLiterals trap: NEVER use "true" or "false" as keyword strings in grammar rules; use a semantic predicate instead. ANTLR 2's testLiterals mechanism will intercept them.
- Syntactic predicates: (A B) => ... cache tokens during lookahead but do NOT roll back lexer state mutations. Flag changes (like parseStringLiterals) during token production persist even if the predicate fails.
- Grammar sections: Keep rules in labeled sections per feature area to prevent merge conflicts:
  // === W3C XQuery Update Facility 3.0 ===
  // === Full Text ===
  // === XQuery 4.0 Parser Extensions ===
- Expression chain: The expression precedence chain is: comparisonExpr → ftContainsExpr → otherwiseExpr → stringConcatExpr → rangeExpr. Do not reorder.
ANTLR generates XQueryParser.java, XQueryLexer.java, XQueryTreeParser.java into exist-core/target/generated-sources/antlr/. These are ~20K lines each and should not be manually edited.
| Package | Purpose |
|---|---|
| org.exist.xquery | XQuery engine: expressions, context, type system |
| org.exist.xquery.functions.fn | fn: namespace function implementations |
| org.exist.xquery.functions.map | XDM map module |
| org.exist.xquery.functions.array | XDM array module |
| org.exist.xquery.ft | XQuery Full Text 3.0 evaluator |
| org.exist.xquery.xquf | W3C XQuery Update Facility 3.0 |
| org.exist.xquery.parser | ANTLR-generated parser + AST nodes |
| org.exist.util.serializer | XML/JSON/HTML/adaptive serialization |
| org.exist.storage | Database storage layer |
| org.exist.dom.persistent | Persistent DOM implementation |
| org.exist.dom.memtree | In-memory DOM (for constructed nodes) |
- Create the class in org.exist.xquery.functions.fn extending BasicFunction
- Define FunctionSignature constant(s)
- Register in FnModule.java by adding a FunctionDef to the array in a labeled block:
  // --- Feature Name ---
  new FunctionDef(MyFunction.SIGNATURE, FnModule.class),
  // --- End Feature Name ---
- Register in ALL conf.xml files (exist-core + extensions test resources)
Add to ErrorCodes.java in a labeled block for your feature area:
// --- Feature Name error codes ---
public static final ErrorCode FOXX0001 = new ErrorCode("FOXX0001", "Description");

Default to XQSuite (%test: annotations) for anything that is XQuery-level behavior: it's idiomatic, runs in-process, and lives beside the XQuery code.
Use Java only when XQSuite structurally can't express or exercise the behavior:
- The unit under test is Java, not XQuery (a util/algorithm class): pure JUnit.
- The function needs a context XQSuite doesn't provide, above all an HTTP request/response context. request:/response:/session: functions throw XPDY0002 with no live request. Test via Java with a mocked RequestWrapper + context.setHttpContext(...) (see GetData2Test), or over real HTTP (RESTServiceTest).
- The behavior IS the HTTP/transport layer (status codes, response headers such as Content-Type, serialization wire format, end-to-end content negotiation) → Java HTTP integration test (RESTServiceTest).
- Behavior depends on Java-level wiring: broker pool, locking/concurrency, transactions, startup/config.
Within Java, use the lightest vehicle that exercises the real behavior: pure unit test for pure logic; mocked-request unit test for request-bound function logic (GetData2Test pattern); full HTTP integration test only when you need the real request pipeline / transport.
One-line test: "Can this be a pure XQuery assertion, runnable without an HTTP request or Java-internal state?" → XQSuite. Otherwise → Java, lightest form.
Concrete precedent: PR eXist-db#6477 (request-module content negotiation) — request:negotiate-content-type / request:parse-accept-header couldn't be XQSuite-tested (request-bound), so they use AcceptHeaderTest (pure logic) + RESTServiceTest (HTTP wiring).
- origin = eXist-db/exist (upstream)
- Contributors push to their fork and open PRs against eXist-db/exist
- Base branch for PRs is develop, not main
Per CONTRIBUTING.md, all commits must be prefixed with one of:
- [bugfix]: addresses a bug or issue
- [feature]: adds a new feature
- [refactor]: refactoring existing code
- [optimize]: performance/memory optimization
- [test]: solely test changes
- [doc]: documentation
- [ci]: CI configuration changes
- [ignore]: automated cleanup (e.g., reformatting)
- Commit message: imperative subject line, body explains why
- Include Closes https://github.com/eXist-db/exist/issues/<number> for issue fixes
- PR description should include: Summary, What Changed (per file/category), Spec References (W3C links if applicable), XQTS before/after table (for conformance work), Test Plan checklist
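
The prefix rule is simple enough to check locally before pushing. The following is a minimal sketch under stated assumptions: check_commit_prefix is a hypothetical helper, not something the repository provides, and it only tests that the subject starts with one of the eight prefixes named in this section.

```shell
# Hypothetical helper (an assumption, not part of eXist-db tooling):
# succeed when a commit subject begins with an accepted prefix.
check_commit_prefix() {
  case "$1" in
    # Quoting makes the brackets literal instead of a glob character class.
    "[bugfix]"*|"[feature]"*|"[refactor]"*|"[optimize]"*|"[test]"*|"[doc]"*|"[ci]"*|"[ignore]"*)
      return 0 ;;
    *)
      echo "missing or unknown commit prefix: $1" >&2
      return 1 ;;
  esac
}

# Example, shown as a comment: validate the subject of the latest commit.
#   check_commit_prefix "$(git log -1 --format=%s)"
```

Wiring this into a commit-msg hook is one option. The sketch deliberately leaves that choice to the contributor.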
eXist-db uses the exist-xqts-runner to run W3C conformance test suites:
- XQ 3.1: W3C XQTS 3.1 (--xqts-version 3.1)
- QT4: QT4CG test suite (XQuery 4.0) (--xqts-version QT4)
- FTTS: XQuery Full Text Test Suite (--xqts-version FTTS)
| Suite | Score | Notes |
|---|---|---|
| QT4 | 31,674/36,965 (85.7%) | XQuery 4.0 + XQUF |
| XQ 3.1 | 24,025/26,773 (89.7%) | 72 tests from 90% |
| FTTS | 661/667 (99.1%) | 6 remaining are spec ambiguities |
| XQUF | 684/684 non-schema (100%) | Schema revalidation out of scope |
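
When updating this table after a new run, the percentage column can be recomputed from the raw pass/total counts. A small sketch follows; pass_rate is a hypothetical helper name, and it rounds to one decimal place to match the table.

```shell
# Hypothetical helper: print PASSED/TOTAL as a percentage with one decimal.
pass_rate() {
  awk -v p="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * p / t }'
}

pass_rate 31674 36965   # QT4    → 85.7
pass_rate 24025 26773   # XQ 3.1 → 89.7
pass_rate 661 667       # FTTS   → 99.1
pass_rate 684 684       # XQUF   → 100.0
```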
- w3c/qtspecs — XQuery 3.1 and Full Text specifications
- qt4cg/qtspecs — XQuery 4.0 family specifications
- w3c/qt3tests — XQuery 3.1 conformance test suite (XQTS)
- qt4cg/qt4tests — XQuery 4.0 conformance test suite
- BaseX: reference implementation for XQuery 4.0 features including XQUF and ixml
- Saxon: reference implementation for XQuery 4.0, XPath 4.0, and XSLT 4.0
This repository contains pre-analyzed context generated by Moderne Prethink. Prethink extracts structured knowledge from codebases to help you work more effectively. The context files in .moderne/context/ contain analyzed information about this codebase.
IMPORTANT: Before exploring source code for architecture, dependency, or data flow questions:
- ALWAYS check .moderne/context/ files FIRST
- Do NOT perform broad codebase exploration (e.g., spawning Explore agents, searching multiple source files) unless CSV context is insufficient
- NEVER read entire CSV files - use SQL queries to retrieve only the rows you need
IMPORTANT: Prethink context is cheap to read — source code exploration is expensive. Always read MORE prethink context rather than less. The "do not explore broadly" rule applies to source code, NOT to prethink context files.
For cross-cutting questions (data flow, deletion, dependencies between services), ALWAYS query these context files in parallel on the first turn:
- architecture.md: system diagram and component overview
- data-assets.csv: entity fields and data model
- database-connections.csv: which services own which tables
- service-endpoints.csv: relevant API endpoints
- messaging-connections.csv: Kafka/async event flows
- external-service-calls.csv: cross-service HTTP calls
Do NOT stop after reading a single context file when others are clearly relevant.
| Context | Description | Details |
|---|---|---|
| Api Contracts | Endpoint contracts, DTO schemas, parameters, exception handlers, and fixture examples | api-contracts.md |
| Architecture | System Diagram | architecture.md |
| Class Quality Metrics | Per-class cohesion, coupling, and complexity measurements | class-quality-metrics.md |
| Code Comprehension | AI-generated descriptions for classes and methods | code-comprehension.md |
| Code Smells | Detected design problems with severity and evidence | code-smells.md |
| Coding Conventions | Naming patterns, import organization, and coding style | coding-conventions.md |
| Dependencies | Project dependencies including transitive dependencies | dependencies.md |
| Error Handling | Exception handling strategies and logging patterns | error-handling.md |
| Library Usage | How external libraries and frameworks are used | library-usage.md |
| Method Quality Metrics | Per-method complexity and quality measurements | method-quality-metrics.md |
| Package Quality Metrics | Per-package coupling, stability, and dependency cycle analysis | package-quality-metrics.md |
| Project Identity | Build system coordinates, names, and module structure | project-identity.md |
| Scheduled Tasks | Scheduled tasks, cron jobs, and background processing | scheduled-tasks.md |
| Test Coverage | Maps test methods to implementation methods they verify | test-coverage.md |
| Test Gaps | Public non-trivial methods lacking test coverage | test-gaps.md |
| Test Quality | Test quality issues that may cause flakiness or silent failures | test-quality.md |
| Token Estimates | Estimated input tokens for method comprehension | token-estimates.md |
For .md context files: Read the full file in a single view call. Never grep it progressively.
For .csv context files: Query with DuckDB, SQLite, or grep (from most to least preference).
Upfront parallel reads: At the start of any architecture question, read all relevant context files in parallel rather than discovering which ones matter through iteration.
Use SQL to query CSV files efficiently. This returns only matching rows instead of loading entire files. Try these in order based on availability:
DuckDB can query CSV files directly with no setup:
# Find all POST endpoints
duckdb -c "SELECT * FROM '.moderne/context/service-endpoints.csv' WHERE \"HTTP method\" = 'POST'"
# Find method descriptions containing a keyword
duckdb -c "SELECT \"Class name\", Signature, Description FROM '.moderne/context/method-descriptions.csv' WHERE Description LIKE '%authentication%'"
# Find tests for a specific class
duckdb -c "SELECT \"Test method\", \"Test summary\" FROM '.moderne/context/test-mapping.csv' WHERE \"Implementation class\" LIKE '%OrderService%'"

Import CSV into memory and query (available on most systems):
sqlite3 :memory: -cmd ".mode csv" -cmd ".import .moderne/context/service-endpoints.csv endpoints" \
"SELECT * FROM endpoints WHERE [HTTP method] = 'POST'"

If SQL tools are unavailable, use grep. Note this loads more content into context:
grep -i "POST" .moderne/context/service-endpoints.csv

Note: Column names with spaces require quoting. Use double quotes in DuckDB ("HTTP method") or square brackets in SQLite ([HTTP method]).
- Read the .md file to understand the schema and available columns
- Query the .csv with DuckDB or SQLite to get only the rows you need
- Only explore source if the context doesn't answer the question
When citing Moderne Prethink context, mention Moderne Prethink as the source (e.g., "Based on the architecture context from Moderne Prethink..." or "Based on the test coverage mapping from Prethink, this method is tested by...").