Enclave — Self-Hosted Private Legal AI for Law Firms & In-House Teams

Enclave is a self-hosted, private legal AI platform that runs entirely inside your own cloud (VPC) or on-premise, so your contract corpus, drafts, matter files, and client data never leave your security boundary. It provides AI co-counsel, contract review, due diligence, and drafting, grounded in your firm's documents and standards rather than a public model's memory, and your data is never used to train anyone else's model.

Enclave gives law firms, in-house legal teams, and M&A diligence groups a whitelabeled, fully private AI workforce: ask questions across your entire contract corpus, run extractions over thousands of documents, review and redline against your own playbook, draft in your firm's voice, and schedule autonomous agents — all behind your own firewall, under your own access controls.

Section	What's inside
Why self-hosted, private AI	The privacy & compliance case for in-VPC legal AI
How it's built	The Onyx · Paperclip · LiteLLM substrates
Features	What ships today, and what's on the roadmap
Who it's for	Target teams and use cases
Get started	Run the full stack locally or in your VPC
↳ Prerequisites	What you need installed
↳ Run the backend	The bundled, pinned Onyx RAG + in-VPC Ollama
↳ Configure the LLM provider	Point Onyx at your local model
↳ Run the agent layer (Paperclip)	The bundled, pinned agent substrate
↳ Run the Enclave app	The Next.js front-end
↳ LLM runtime modes	CPU · NVIDIA GPU · native Ollama
↳ Hardware guide	RAM/VRAM per model
↳ Pinned versions & upgrades	How Enclave controls bundled versions
Preview the interface	Static UI prototypes, no build step
Keywords	—

Why self-hosted, private AI

Legal work is privileged, confidential, and often contractually restricted from leaving a controlled environment. Public chatbots and multi-tenant SaaS break that model. Enclave is built the other way around:

Self-hosted in your VPC or on-prem — deploy into AWS, Azure, GCP, or your own datacenter. Zero bytes of your data ever leave your boundary.
No training on your data — your contracts and work product are never used to train shared or third-party models. Your corpus stays yours.
Attorney–client privilege preserved — privileged and confidential material is processed inside the boundary you already trust, supporting confidentiality, professional-responsibility, and data-residency obligations.
Bring your own model — route to a local open-weight LLM (Ollama), your own Amazon Bedrock / Azure OpenAI tenancy, or Claude, through a single LLM gateway (LiteLLM). No lock-in.
Your access controls — document-level ACLs, SSO/SAML, and role-based access mean the AI only ever sees what a given user is allowed to see.
Auditable & governed — every retrieval, answer, and agent run is logged for review, supervision, and compliance.

Compliance posture: SOC 2 Type II (in progress) · GDPR-ready · MIT open-core foundations · fully self-hosted.

How it's built

Enclave integrates two open-source substrates, Onyx and Paperclip, plus a LiteLLM model gateway, and wraps them in a branded, legal-specific product layer:

Onyx — the knowledge substrate. Retrieval-augmented generation (RAG) over your entire contract corpus, with 50+ enterprise connectors, document-level ACLs, and pin-cited, source-grounded answers.
Paperclip — the agent substrate. Orchestration, scheduling, governance, and audit for multi-step legal workflows — an always-on AI workforce.
LiteLLM gateway — model routing across local Ollama, BYO Bedrock/Azure, and Claude.

Everything runs side-by-side inside your VPC, connected to the systems you already use: iManage, NetDocuments, SharePoint, network drives, S3, your CLM, and email.

Enclave (Next.js · :3000)  ──►  Onyx API (· :8080)  ──►  Ollama (· :11434)
        product UI               retrieval + RAG          local LLM, in-VPC

Features

Available in the product

AI Assistant (co-counsel) — multi-turn conversational AI grounded in your corpus, with pin-cited sources and hand-offs to Diligence, Agents, and Draft.
Research — corpus-wide Q&A — ask natural-language questions of your firm's whole contract corpus and get pin-cited answers with a source viewer and highlighted spans.
Diligence — multi-document extraction grid — rows are documents, columns are extraction prompts; run structured extractions across thousands of contracts in a spreadsheet-style grid and export to CSV.
Review — clause review vs. playbook — single-document review with severity-ranked clause findings and redlines grounded in your firm's playbook.
Playbooks — codified firm standards — author rules with preferred / acceptable / fallback positions, severity, escalation, and trigger language, auto-generated from your executed precedents and consumed by Review, Assistant, and Agents.
Draft — AI drafting in your voice — generate contracts and clauses from your executed precedents, with an editor canvas, precedent rail, and clause library.
Agents / Workflows — autonomous AI workforce — scheduled agents that run 24/7 inside your VPC: auto-renewal watch, portfolio drift monitoring, obligation tracking, and custom agents described in plain English.
Connectors & ingestion — connect your DMS, CLM, drives, and email; watch live in-VPC indexing and per-source sync status.
Dashboard — an AI-forward home with an "ask your AI" box, capability tiles, KPI strip, a needs-attention band, and per-module activity.
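
The Diligence grid described above (rows are documents, columns are extraction prompts, export to CSV) maps onto a small table structure. The following is a minimal sketch of that shape, not Enclave's implementation: `run_grid`, `to_csv`, and the `extract` callback are hypothetical names, and the callback stands in for whatever retrieval-backed extraction call the product actually makes.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical signature: (document_name, prompt) -> extracted answer.
Extractor = Callable[[str, str], str]


def run_grid(documents: List[str], prompts: Dict[str, str], extract: Extractor) -> List[Dict[str, str]]:
    """Build one row per document and one column per extraction prompt."""
    rows = []
    for doc in documents:
        row = {"document": doc}
        for column, prompt in prompts.items():
            row[column] = extract(doc, prompt)
        rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Serialize the grid to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["document", *columns])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Stand-in extractor so the sketch runs without a backend.
    def fake_extract(doc: str, prompt: str) -> str:
        return f"[{prompt}] from {doc}"

    prompts = {"governing_law": "What is the governing law?", "term": "What is the contract term?"}
    grid = run_grid(["msa.pdf", "nda.pdf"], prompts, fake_extract)
    print(to_csv(grid, list(prompts)))
```

Keeping the extractor as an injected callback means the grid logic can be exercised without a running backend.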

On the roadmap

We are actively building toward:

Negotiation copilot — inline redline suggestions and counterparty-position comparison against your playbook, with one-click fallback language.
Obligation & deadline extraction — automatic extraction of renewals, notice windows, and covenants, synced to calendars and alerts.
Clause library & clause comparison — a managed, versioned library with side-by-side clause diffing across your corpus.
Privilege & PII detection / redaction — automated flagging and redaction of privileged and personal data before sharing or export.
Conflict checks — corpus-aware conflict-of-interest screening.
Approval & escalation routing — configurable approval workflows tied to playbook severity and approver roles.
Governance & risk analytics — dashboards on contract risk posture, deviation rates, and turnaround time.
Enterprise identity — SSO/SAML, SCIM provisioning, and granular RBAC.
Firm-tuned models — optional private fine-tuning / adapters trained only on your corpus, inside your boundary.
Integrations & API — deeper CLM/DMS sync, e-signature, plus a public API and webhooks for embedding Enclave in your own systems.
Multi-language contracts — review, extraction, and drafting across major contract languages.

Who it's for

In-house legal · M&A and transactional diligence · BigLaw · procurement and vendor management · mid-market firms — any team that needs frontier AI on confidential legal work without sending that work to a third party.

Get started

Enclave is a Next.js front-end that talks to a self-hosted Onyx backend (retrieval + RAG), which in turn calls a local Ollama LLM — so no document or prompt ever leaves your boundary. Stand up the backend first, point it at a local model, then run the app.

Prerequisites

Docker + Docker Compose — for the Onyx backend and the in-VPC Ollama service.
Bun ≥ 1.1 (or Node ≥ 20) — for the Next.js app.
~16 GB RAM recommended for a usable local model — see the hardware guide.
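
As a quick pre-flight, the tool requirements above can be checked from a script. This is a minimal sketch under stated assumptions: `missing_prerequisites` is a hypothetical helper, it only looks for executables on PATH, and it does not verify the version floors (Bun 1.1, Node 20) or that the Docker Compose plugin is installed.

```python
import shutil
from typing import Callable, List, Optional

Which = Callable[[str], Optional[str]]


def missing_prerequisites(which: Which = shutil.which) -> List[str]:
    """Return names of required tools that were not found on PATH."""
    missing = []
    if which("docker") is None:
        missing.append("docker")
    # The app runs on either Bun or Node, so one of the two is enough.
    if which("bun") is None and which("node") is None:
        missing.append("bun or node")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("All prerequisites found." if not problems else f"Missing: {', '.join(problems)}")
```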

Run the backend

The Onyx stack is bundled in this repo under deploy/docker_compose/ and pinned by image digest — every install runs the exact same, Enclave-controlled Onyx build (no upstream clone, no version drift). The bundle is the upstream Onyx Compose file plus a small Enclave override (publishes the API and adds an in-VPC Ollama service) and an idempotent seed script.

# from this repo's root
cd deploy/docker_compose

# Bring up the pinned stack (CPU default; Compose auto-merges the override)
docker compose up -d

That's it — no separate Onyx checkout. The override publishes the Onyx API on :8080 and runs Ollama at http://ollama:11434 inside the Compose network. nginx serves the Onyx admin UI on :3001 (chosen to avoid the Enclave app on :3000). Container names below assume the default onyx Compose project.
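
The containers can take a while to become ready after `docker compose up -d` returns, so it can help to poll the published API before moving on. The sketch below is illustrative only: `http_ok` and `wait_until_ready` are hypothetical helpers, and the `/health` path on :8080 is an assumption about the Onyx API rather than something this guide specifies.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable

# Assumption: the published Onyx API answers GET /health on :8080.
ONYX_HEALTH_URL = "http://localhost:8080/health"


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 2.0) -> bool:
    """Call probe() until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_until_ready(lambda: http_ok(ONYX_HEALTH_URL))
    print("Onyx API is up." if ready else "Onyx API did not respond in time.")
```

Only the API port is probed here because Ollama is described as reachable inside the Compose network, not necessarily from the host.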

Configure the LLM provider

Pull a model and register it with Onyx as the default. The seed runs inside the api_server container because it writes through Onyx's DB layer:

# Pull the default model into the in-stack Ollama
docker exec onyx-ollama-1 ollama pull llama3.2:3b

# Seed Onyx's LLM provider (idempotent — safe to re-run)
docker cp enclave_seed_ollama.py onyx-api_server-1:/tmp/enclave_seed_ollama.py
docker exec onyx-api_server-1 python /tmp/enclave_seed_ollama.py

Override the model or endpoint without editing code — the seed reads these env vars: ENCLAVE_OLLAMA_DEFAULT_MODEL, ENCLAVE_OLLAMA_MODELS, ENCLAVE_OLLAMA_API_BASE.
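
To make the override mechanism concrete, here is a hypothetical sketch of how a script could resolve those three variables. It is not the contents of enclave_seed_ollama.py: `SeedConfig` and `read_seed_config` are invented names, the fallbacks are simply the defaults mentioned in this guide, and treating ENCLAVE_OLLAMA_MODELS as a comma-separated list is an assumption.

```python
import os
from typing import List, Mapping, NamedTuple


class SeedConfig(NamedTuple):
    default_model: str
    models: List[str]
    api_base: str


def read_seed_config(env: Mapping[str, str] = os.environ) -> SeedConfig:
    """Resolve the three ENCLAVE_OLLAMA_* variables with fallbacks."""
    # Fallbacks taken from this guide: llama3.2:3b and the in-stack Ollama URL.
    default_model = env.get("ENCLAVE_OLLAMA_DEFAULT_MODEL", "llama3.2:3b")
    api_base = env.get("ENCLAVE_OLLAMA_API_BASE", "http://ollama:11434")
    # Assumption: ENCLAVE_OLLAMA_MODELS is a comma-separated list.
    raw_models = env.get("ENCLAVE_OLLAMA_MODELS", "")
    models = [name.strip() for name in raw_models.split(",") if name.strip()]
    # Make sure the default model is always among the listed models.
    if default_model not in models:
        models.insert(0, default_model)
    return SeedConfig(default_model, models, api_base.rstrip("/"))


if __name__ == "__main__":
    print(read_seed_config())
```

Check the seed script itself for the actual variable formats before relying on this behavior.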

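If you run Onyx with AUTH_TYPE=disabled (no login, the local-dev default), enable anonymous API access so the app can reach Onyx without a key. The command sets the flag with a 30-day expiry (EX 2592000 seconds), so you may need to re-run it once that lapses: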

docker exec onyx-cache-1 redis-cli set public:anonymous_user_enabled 1 EX 2592000

Run the agent layer (Paperclip)

Paperclip — the agent-orchestration substrate behind Enclave's Agents / Workflows — is also bundled and pinned in this repo, as a standalone Compose project at deploy/docker_compose/docker-compose.paperclip.yml. It runs separately from the Onyx stack (it is not auto-merged), so bring it up explicitly:

# from this repo's root
cd deploy/docker_compose

# Required once: a long random secret for Paperclip's auth.
echo "BETTER_AUTH_SECRET=$(openssl rand -hex 32)" >> .env

# Start the pinned Paperclip stack (server + its own Postgres)
docker compose -f docker-compose.paperclip.yml up -d

Paperclip's UI/API is served on :3100. The .env you create here is gitignored — it holds BETTER_AUTH_SECRET and, optionally, ANTHROPIC_API_KEY / OPENAI_API_KEY for model routing. (Deeper Onyx ↔ Paperclip wiring is on the roadmap; today it ships as a pinned, runnable bundle.)
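
One thing to watch with the `echo ... >> .env` step is that re-running it appends a second BETTER_AUTH_SECRET line. A minimal sketch of an idempotent alternative follows; `ensure_secret` is a hypothetical helper, and `secrets.token_hex(32)` is used because it yields 64 hex characters, the same shape as `openssl rand -hex 32`.

```python
import secrets
from pathlib import Path

KEY = "BETTER_AUTH_SECRET"


def ensure_secret(env_text: str) -> str:
    """Return env_text with a BETTER_AUTH_SECRET line, adding one only if absent."""
    for line in env_text.splitlines():
        if line.strip().startswith(f"{KEY}="):
            return env_text  # already set; leave the content untouched
    # 32 random bytes rendered as 64 hex characters.
    new_line = f"{KEY}={secrets.token_hex(32)}\n"
    if env_text and not env_text.endswith("\n"):
        env_text += "\n"
    return env_text + new_line


if __name__ == "__main__":
    env_path = Path(".env")  # run from deploy/docker_compose
    current = env_path.read_text() if env_path.exists() else ""
    updated = ensure_secret(current)
    if updated != current:
        env_path.write_text(updated)
```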

Run the Enclave app

# from this repo's root
bun install
cp .env.local.example .env.local      # then edit if your Onyx API isn't on :8080
bun dev                               # http://localhost:3000

.env.local only needs the Onyx API base (ONYX_API_URL, default http://localhost:8080); add ONYX_API_KEY only if you run Onyx with auth enabled. See .env.local.example.
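
To sanity-check what those two settings resolve to, a small script can read .env.local and print the effective base URL and headers. This is a sketch, not the app's own loader: `parse_env_file` and `onyx_request_settings` are hypothetical, the parser handles only plain KEY=VALUE lines (no inline comments or multi-line values), and sending the key as a Bearer token is an assumption.

```python
from pathlib import Path
from typing import Dict, Mapping, Tuple

DEFAULT_ONYX_API_URL = "http://localhost:8080"  # default stated in this guide


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def onyx_request_settings(env: Mapping[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (base_url, headers) for calls to the Onyx API."""
    base_url = env.get("ONYX_API_URL") or DEFAULT_ONYX_API_URL
    headers = {}
    api_key = env.get("ONYX_API_KEY")
    if api_key:
        # Assumption: the key is sent as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return base_url.rstrip("/"), headers


if __name__ == "__main__":
    path = Path(".env.local")
    env = parse_env_file(path.read_text()) if path.exists() else {}
    print(onyx_request_settings(env))
```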

LLM runtime modes

In-stack Ollama is the default because it's one docker compose up with no host-level install and behaves identically on Linux, macOS, and Windows. Pick a mode by how the host is resourced:

Mode	How	Best for
CPU (default)	`docker compose up -d`	Any machine without a GPU (incl. Macs). Slowest — keep to a small model.
NVIDIA GPU	add `-f docker-compose.gpu.yml` (see below)	Linux servers — the realistic VPC posture. Needs the NVIDIA Container Toolkit. Fast enough for `llama3.1:8b`+.
Native Ollama	run Ollama on the host, re-seed with `ENCLAVE_OLLAMA_API_BASE=http://host.docker.internal:11434`	macOS (Metal GPU + full host RAM) — Docker Desktop has no GPU passthrough.

Passing any -f flag turns off Compose's automatic merge of docker-compose.override.yml, so when you add the GPU overlay you must list the base file and the override explicitly as well:

docker compose -f docker-compose.yml \
               -f docker-compose.override.yml \
               -f docker-compose.gpu.yml up -d
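
The three runtime modes differ only in which Compose files are listed and which endpoint the seed targets, which can be captured in a small helper. This is a sketch: `compose_up_command` is a hypothetical function that returns the argument list and any seed environment override, and it does not run anything itself.

```python
from typing import Dict, List, Tuple

BASE = ["docker", "compose"]
GPU_FILES = ["docker-compose.yml", "docker-compose.override.yml", "docker-compose.gpu.yml"]


def compose_up_command(mode: str) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, seed_env) for a runtime mode: 'cpu', 'gpu', or 'native'."""
    if mode == "cpu":
        return BASE + ["up", "-d"], {}
    if mode == "gpu":
        # Any explicit -f disables auto-merge, so all three files are listed.
        file_args = [arg for name in GPU_FILES for arg in ("-f", name)]
        return BASE + file_args + ["up", "-d"], {}
    if mode == "native":
        # Same stack as CPU; the seed is re-run against the host's Ollama.
        return BASE + ["up", "-d"], {"ENCLAVE_OLLAMA_API_BASE": "http://host.docker.internal:11434"}
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("cpu", "gpu", "native"):
        argv, seed_env = compose_up_command(mode)
        print(mode, " ".join(argv), seed_env)
```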

Hardware guide

Approximate memory the model needs, on top of the Onyx stack's ~6.5 GB. CPU inference is usable but slow; a GPU is roughly 10–50× faster.

Model	Needs ~	Notes
`llama3.2:1b`	~2 GB	Fits a stock Docker Desktop, but too weak for Onyx's agentic prompt — low-memory fallback only.
`llama3.2:3b`	~4 GB	Default. Usable answers; on a Mac, raise Docker Desktop memory to ~12–16 GB before selecting it.
`llama3.1:8b`	~8 GB	Recommended for real use; comfortable on a GPU or native on a 16 GB+ host.
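
The sizing rule in the table is additive: model memory plus the ~6.5 GB stack baseline. A minimal sketch of that arithmetic, using the approximate figures above (the function names are hypothetical, and no headroom for the operating system or other processes is included):

```python
from typing import Optional

ONYX_STACK_GB = 6.5  # approximate baseline quoted in this guide

# Approximate model memory from the table, smallest to largest.
MODEL_MEMORY_GB = {
    "llama3.2:1b": 2.0,
    "llama3.2:3b": 4.0,
    "llama3.1:8b": 8.0,
}


def total_memory_gb(model: str) -> float:
    """Stack baseline plus the model's approximate footprint."""
    return ONYX_STACK_GB + MODEL_MEMORY_GB[model]


def largest_model_that_fits(available_gb: float) -> Optional[str]:
    """Pick the largest listed model whose total fits, or None if none do."""
    best = None
    for model in MODEL_MEMORY_GB:  # dicts preserve insertion order
        if total_memory_gb(model) <= available_gb:
            best = model
    return best


if __name__ == "__main__":
    for gb in (8, 12, 16):
        print(f"{gb} GB available -> {largest_model_that_fits(gb)}")
```

For example, the default `llama3.2:3b` comes to about 6.5 + 4 = 10.5 GB, which is why the table suggests raising Docker Desktop memory to 12 GB or more on a Mac.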

Pinned versions & upgrades

So that every Enclave install runs the same, tested backend, the bundled stacks pin their images by digest rather than a floating tag like :latest. The digests are the single source of truth for the shipped versions and live directly in the Compose files:

Component	Where it's pinned	Bundled version
Onyx (backend · web · model server · code-interpreter)	`deploy/docker_compose/docker-compose.yml`	Onyx 4.0.x line
Ollama runtime	`deploy/docker_compose/docker-compose.override.yml`	pinned `ollama/ollama`
Paperclip	`deploy/docker_compose/docker-compose.paperclip.yml`	v2026.529.0 line

To upgrade a component, bump its @sha256:… digest in the relevant Compose file and run docker compose up -d; there is no upstream checkout to keep in sync. Because the pin is a digest, installs are byte-for-byte reproducible regardless of how upstream tags move.
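
Since the digests are the source of truth, it can be worth checking after an edit that no image reference has slipped back to a floating tag. The following is a rough sketch: `unpinned_images` is a hypothetical helper that scans Compose text with a regular expression instead of parsing YAML, so it will not resolve variable interpolation such as `${...}` in image fields.

```python
import re
import sys
from typing import List

# Matches `image: <ref>` lines, tolerating optional quotes around the reference.
IMAGE_LINE = re.compile(r"^\s*image:\s*[\"']?([^\s\"'#]+)", re.MULTILINE)
# A digest pin ends in @sha256: followed by 64 lowercase hex characters.
DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(compose_text: str) -> List[str]:
    """Return image references that are not pinned by a sha256 digest."""
    return [ref for ref in IMAGE_LINE.findall(compose_text) if not DIGEST.search(ref)]


if __name__ == "__main__":
    # Usage: python check_pins.py docker-compose.yml [more files...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for ref in unpinned_images(handle.read()):
                print(f"{path}: not digest-pinned: {ref}")
```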

Preview the interface

The UI prototypes live in wireframe/. They are static HTML with no build step — open wireframe/index.html directly in a browser, or serve the folder:

cd wireframe
python3 -m http.server 8000
# then visit http://localhost:8000

Keywords

self-hosted legal AI · private legal AI · on-premise legal AI · in-VPC legal AI · confidential AI for lawyers · secure legal AI platform · whitelabel legal AI · AI co-counsel · contract review software · AI contract review · AI due diligence · M&A due diligence automation · legal document automation · contract drafting AI · clause extraction · clause comparison · contract playbook software · redlining automation · negotiation copilot · legal research AI · retrieval-augmented generation · RAG for legal · contract corpus search · contract analytics · obligation management · legal workflow automation · AI agents for legal · law firm AI · in-house legal AI · LegalTech · LegalAI · data residency · attorney-client privilege · GDPR · SOC 2 · self-hosted AI · VPC deployment · bring your own model · open-source legal AI · Onyx · Paperclip.

Powered by NeuralChainAI
