NeuralChainAI/enclave

Enclave — private legal AI, inside your boundary

Enclave — Self-Hosted Private Legal AI for Law Firms & In-House Teams

Enclave is a self-hosted, private legal AI platform that runs entirely inside your own cloud (VPC) or on-premise, so your contract corpus, drafts, matter files, and client data never leave your security boundary. It provides AI co-counsel, contract review, due diligence, and drafting, all grounded in your firm's documents and standards rather than a public model's memory. Your data is never used to train anyone else's model.

Enclave gives law firms, in-house legal teams, and M&A diligence groups a whitelabeled, fully private AI workforce: ask questions across your entire contract corpus, run extractions over thousands of documents, review and redline against your own playbook, draft in your firm's voice, and schedule autonomous agents — all behind your own firewall, under your own access controls.


Contents

| Section | What's inside |
| --- | --- |
| Why self-hosted, private AI | The privacy & compliance case for in-VPC legal AI |
| How it's built | The Onyx · Paperclip · LiteLLM substrates |
| Features | What ships today, and what's on the roadmap |
| Who it's for | Target teams and use cases |
| Get started | Run the full stack locally or in your VPC |
| ↳ Prerequisites | What you need installed |
| ↳ Run the backend | The bundled, pinned Onyx RAG + in-VPC Ollama |
| ↳ Configure the LLM provider | Point Onyx at your local model |
| ↳ Run the agent layer (Paperclip) | The bundled, pinned agent substrate |
| ↳ Run the Enclave app | The Next.js front-end |
| ↳ LLM runtime modes | CPU · NVIDIA GPU · native Ollama |
| ↳ Hardware guide | RAM/VRAM per model |
| ↳ Pinned versions & upgrades | How Enclave controls bundled versions |
| Preview the interface | Static UI prototypes, no build step |
| Keywords | |

Why self-hosted, private AI

Legal work is privileged, confidential, and often contractually restricted from leaving a controlled environment. Public chatbots and multi-tenant SaaS break that model. Enclave is built the other way around:

  • Self-hosted in your VPC or on-prem — deploy into AWS, Azure, GCP, or your own datacenter. Zero bytes of your data ever leave your boundary.
  • No training on your data — your contracts and work product are never used to train shared or third-party models. Your corpus stays yours.
  • Attorney–client privilege preserved — privileged and confidential material is processed inside the boundary you already trust, supporting confidentiality, professional-responsibility, and data-residency obligations.
  • Bring your own model — route to a local open-weight LLM (Ollama), your own Amazon Bedrock / Azure OpenAI tenancy, or Claude, through a single LLM gateway (LiteLLM). No lock-in.
  • Your access controls — document-level ACLs, SSO/SAML, and role-based access mean the AI only ever sees what a given user is allowed to see.
  • Auditable & governed — every retrieval, answer, and agent run is logged for review, supervision, and compliance.

Compliance posture: SOC 2 Type II (in progress) · GDPR-ready · MIT open-core foundations · fully self-hosted.


How it's built

Enclave integrates two open-source substrates and a model gateway, and wraps them in a branded, legal-specific product layer:

  • Onyx — the knowledge substrate. Retrieval-augmented generation (RAG) over your entire contract corpus, with 50+ enterprise connectors, document-level ACLs, and pin-cited, source-grounded answers.
  • Paperclip — the agent substrate. Orchestration, scheduling, governance, and audit for multi-step legal workflows — an always-on AI workforce.
  • LiteLLM gateway — model routing across local Ollama, BYO Bedrock/Azure, and Claude.

Everything runs side-by-side inside your VPC, connected to the systems you already use: iManage, NetDocuments, SharePoint, network drives, S3, your CLM, and email.

Enclave (Next.js · :3000)  ──►  Onyx API (:8080)  ──►  Ollama (:11434)
        product UI              retrieval + RAG        local LLM, in-VPC

Features

Available in the product

  • AI Assistant (co-counsel) — multi-turn conversational AI grounded in your corpus, with pin-cited sources and hand-offs to Diligence, Agents, and Draft.
  • Research — corpus-wide Q&A — ask natural-language questions of your firm's whole contract corpus and get pin-cited answers with a source viewer and highlighted spans.
  • Diligence — multi-document extraction grid — rows are documents, columns are extraction prompts; run structured extractions across thousands of contracts in a spreadsheet-style grid and export to CSV.
  • Review — clause review vs. playbook — single-document review with severity-ranked clause findings and redlines grounded in your firm's playbook.
  • Playbooks — codified firm standards — author rules with preferred / acceptable / fallback positions, severity, escalation, and trigger language, auto-generated from your executed precedents and consumed by Review, Assistant, and Agents.
  • Draft — AI drafting in your voice — generate contracts and clauses from your executed precedents, with an editor canvas, precedent rail, and clause library.
  • Agents / Workflows — autonomous AI workforce — scheduled agents that run 24/7 inside your VPC: auto-renewal watch, portfolio drift monitoring, obligation tracking, and custom agents described in plain English.
  • Connectors & ingestion — connect your DMS, CLM, drives, and email; watch live in-VPC indexing and per-source sync status.
  • Dashboard — an AI-forward home with an "ask your AI" box, capability tiles, KPI strip, a needs-attention band, and per-module activity.
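
The Diligence grid described above (rows are documents, columns are extraction prompts, exported to CSV) can be pictured as a small data structure. This is a minimal sketch, not Enclave's implementation: `run_grid`, `to_csv`, and the `extract` callback are hypothetical names, and the callback stands in for whatever retrieval-plus-LLM call fills a cell.

```python
import csv
import io
from typing import Callable


def run_grid(
    documents: dict[str, str],
    prompts: dict[str, str],
    extract: Callable[[str, str], str],
) -> list[dict[str, str]]:
    """Return one row per document, with one cell per extraction prompt."""
    rows = []
    for doc_name, text in documents.items():
        row = {"document": doc_name}
        for column, prompt in prompts.items():
            # Hypothetical callback: (document text, prompt) -> extracted value.
            row[column] = extract(text, prompt)
        rows.append(row)
    return rows


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the grid to CSV, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping the extraction step behind a callback means the grid logic can be exercised with a stub before any model is connected.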

On the roadmap

We are actively building toward:

  • Negotiation copilot — inline redline suggestions and counterparty-position comparison against your playbook, with one-click fallback language.
  • Obligation & deadline extraction — automatic extraction of renewals, notice windows, and covenants, synced to calendars and alerts.
  • Clause library & clause comparison — a managed, versioned library with side-by-side clause diffing across your corpus.
  • Privilege & PII detection / redaction — automated flagging and redaction of privileged and personal data before sharing or export.
  • Conflict checks — corpus-aware conflict-of-interest screening.
  • Approval & escalation routing — configurable approval workflows tied to playbook severity and approver roles.
  • Governance & risk analytics — dashboards on contract risk posture, deviation rates, and turnaround time.
  • Enterprise identity — SSO/SAML, SCIM provisioning, and granular RBAC.
  • Firm-tuned models — optional private fine-tuning / adapters trained only on your corpus, inside your boundary.
  • Integrations & API — deeper CLM/DMS sync, e-signature, plus a public API and webhooks for embedding Enclave in your own systems.
  • Multi-language contracts — review, extraction, and drafting across major contract languages.

Who it's for

In-house legal · M&A and transactional diligence · BigLaw · procurement and vendor management · mid-market firms — any team that needs frontier AI on confidential legal work without sending that work to a third party.


Get started

Enclave is a Next.js front-end that talks to a self-hosted Onyx backend (retrieval + RAG), which in turn calls a local Ollama LLM — so no document or prompt ever leaves your boundary. Stand up the backend first, point it at a local model, then run the app.

Prerequisites

  • Docker + Docker Compose — for the Onyx backend and the in-VPC Ollama service.
  • Bun ≥ 1.1 (or Node ≥ 20) — for the Next.js app.
  • ~16 GB RAM recommended for a usable local model — see the hardware guide.

Run the backend

The Onyx stack is bundled in this repo under deploy/docker_compose/ and pinned by image digest — every install runs the exact same, Enclave-controlled Onyx build (no upstream clone, no version drift). The bundle is the upstream Onyx Compose file plus a small Enclave override (publishes the API and adds an in-VPC Ollama service) and an idempotent seed script.

# from this repo's root
cd deploy/docker_compose

# Bring up the pinned stack (CPU default; Compose auto-merges the override)
docker compose up -d

That's it — no separate Onyx checkout. The override publishes the Onyx API on :8080 and runs Ollama at http://ollama:11434 inside the Compose network. nginx serves the Onyx admin UI on :3001 (chosen to avoid the Enclave app on :3000). Container names below assume the default onyx Compose project.
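
Once the stack is up, a quick way to confirm that the published ports respond is a plain TCP check. The sketch below uses only the Python standard library and the two host ports named above (:8080 for the Onyx API, :3001 for the admin UI); `port_open` and `backend_status` are hypothetical helper names, and a successful connection only shows that something is listening, not that Onyx is healthy.

```python
import socket

# Host ports named in this README for the backend stack; adjust if you remap them.
BACKEND_PORTS = {"Onyx API": 8080, "Onyx admin UI (nginx)": 3001}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def backend_status(host="localhost", ports=BACKEND_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling `backend_status()` on the Docker host returns a dictionary such as `{"Onyx API": True, ...}` once the containers have finished starting.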

Configure the LLM provider

Pull a model and register it with Onyx as the default. The seed runs inside the api_server container because it writes through Onyx's DB layer:

# Pull the default model into the in-stack Ollama
docker exec onyx-ollama-1 ollama pull llama3.2:3b

# Seed Onyx's LLM provider (idempotent — safe to re-run)
docker cp enclave_seed_ollama.py onyx-api_server-1:/tmp/enclave_seed_ollama.py
docker exec onyx-api_server-1 python /tmp/enclave_seed_ollama.py

Override the model or endpoint without editing code — the seed reads these env vars: ENCLAVE_OLLAMA_DEFAULT_MODEL, ENCLAVE_OLLAMA_MODELS, ENCLAVE_OLLAMA_API_BASE.
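
The seed script itself is not reproduced here, so the following is only a sketch of how such overrides might be resolved. The fallback values come from this README (llama3.2:3b and http://ollama:11434); the function name `resolve_ollama_settings` and the comma-separated format of ENCLAVE_OLLAMA_MODELS are assumptions, not confirmed behaviour of enclave_seed_ollama.py.

```python
import os


def resolve_ollama_settings(env=None):
    """Resolve model and endpoint overrides, falling back to the README defaults."""
    env = os.environ if env is None else env
    default_model = env.get("ENCLAVE_OLLAMA_DEFAULT_MODEL", "llama3.2:3b")
    # Assumption: ENCLAVE_OLLAMA_MODELS is a comma-separated list of model names.
    raw_models = env.get("ENCLAVE_OLLAMA_MODELS", "")
    models = [name.strip() for name in raw_models.split(",") if name.strip()]
    # Keep the default model among the registered models.
    if default_model not in models:
        models.insert(0, default_model)
    api_base = env.get("ENCLAVE_OLLAMA_API_BASE", "http://ollama:11434").rstrip("/")
    return {"default_model": default_model, "models": models, "api_base": api_base}
```

Passing a plain dictionary instead of the real environment makes the resolution logic easy to check in isolation.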

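If you run Onyx with AUTH_TYPE=disabled (no login, the local-dev default), enable anonymous API access so the app can reach Onyx without a key. The EX 2592000 argument gives the Redis key a 30-day expiry (2,592,000 seconds), so the setting may need to be re-applied after that: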

docker exec onyx-cache-1 redis-cli set public:anonymous_user_enabled 1 EX 2592000

Run the agent layer (Paperclip)

Paperclip — the agent-orchestration substrate behind Enclave's Agents / Workflows — is also bundled and pinned in this repo, as a standalone Compose project at deploy/docker_compose/docker-compose.paperclip.yml. It runs separately from the Onyx stack (it is not auto-merged), so bring it up explicitly:

# from this repo's root
cd deploy/docker_compose

# Required once: a long random secret for Paperclip's auth.
echo "BETTER_AUTH_SECRET=$(openssl rand -hex 32)" >> .env

# Start the pinned Paperclip stack (server + its own Postgres)
docker compose -f docker-compose.paperclip.yml up -d

Paperclip's UI/API is served on :3100. The .env you create here is gitignored — it holds BETTER_AUTH_SECRET and, optionally, ANTHROPIC_API_KEY / OPENAI_API_KEY for model routing. (Deeper Onyx ↔ Paperclip wiring is on the roadmap; today it ships as a pinned, runnable bundle.)

Run the Enclave app

# from this repo's root
bun install
cp .env.local.example .env.local      # then edit if your Onyx API isn't on :8080
bun dev                               # http://localhost:3000

.env.local only needs the Onyx API base (ONYX_API_URL, default http://localhost:8080); add ONYX_API_KEY only if you run Onyx with auth enabled. See .env.local.example.

LLM runtime modes

In-stack Ollama is the default because it's one docker compose up with no host-level install and behaves identically on Linux, macOS, and Windows. Pick a mode by how the host is resourced:

| Mode | How | Best for |
| --- | --- | --- |
| CPU (default) | docker compose up -d | Any machine without a GPU (incl. Macs). Slowest, so keep to a small model. |
| NVIDIA GPU | add -f docker-compose.gpu.yml (see below) | Linux servers, the realistic VPC posture. Needs the NVIDIA Container Toolkit. Fast enough for llama3.1:8b+. |
| Native Ollama | run Ollama on the host, re-seed with ENCLAVE_OLLAMA_API_BASE=http://host.docker.internal:11434 | macOS (Metal GPU + full host RAM), because Docker Desktop on macOS has no GPU passthrough. |

Passing any -f flag turns off Compose's automatic merge of docker-compose.override.yml, so list the base file and the override explicitly alongside the GPU overlay:

docker compose -f docker-compose.yml \
               -f docker-compose.override.yml \
               -f docker-compose.gpu.yml up -d
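
The difference between the two Compose-driven modes comes down to which -f flags are passed. As a small illustration, the sketch below builds the argument list for each mode using the file names shown above; `compose_up_command` is a hypothetical helper, and it does not cover the native-Ollama mode, which changes the seed rather than the Compose files.

```python
def compose_up_command(mode="cpu"):
    """Build the `docker compose ... up -d` argument list for a runtime mode."""
    command = ["docker", "compose"]
    if mode == "cpu":
        # No -f flags: Compose auto-merges docker-compose.override.yml.
        files = []
    elif mode == "gpu":
        # Any explicit -f disables the auto-merge, so the override is listed too.
        files = [
            "docker-compose.yml",
            "docker-compose.override.yml",
            "docker-compose.gpu.yml",
        ]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    for name in files:
        command += ["-f", name]
    return command + ["up", "-d"]
```

The returned list can be passed directly to `subprocess.run(..., check=True)` from the deploy/docker_compose directory.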

Hardware guide

Approximate memory the model needs, on top of the Onyx stack's ~6.5 GB. CPU inference is usable but slow; a GPU is roughly 10–50× faster.

| Model | Approx. memory | Notes |
| --- | --- | --- |
| llama3.2:1b | ~2 GB | Fits a stock Docker Desktop, but too weak for Onyx's agentic prompt; low-memory fallback only. |
| llama3.2:3b | ~4 GB | Default. Usable answers; on a Mac, raise Docker Desktop memory to ~12–16 GB before selecting it. |
| llama3.1:8b | ~8 GB | Recommended for real use; comfortable on a GPU or native on a 16 GB+ host. |
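
The guide's arithmetic (model memory on top of the Onyx stack's ~6.5 GB) can be made explicit. The sketch below uses only the approximate figures from this guide; the function names are hypothetical, and the totals ignore the operating system and anything else running on the host.

```python
ONYX_STACK_GB = 6.5  # approximate Onyx stack footprint from this guide

MODEL_GB = {  # approximate per-model needs from this guide
    "llama3.2:1b": 2.0,
    "llama3.2:3b": 4.0,
    "llama3.1:8b": 8.0,
}


def total_memory_gb(model):
    """Approximate memory for the Onyx stack plus one model."""
    return ONYX_STACK_GB + MODEL_GB[model]


def largest_model_that_fits(available_gb):
    """Return the largest listed model whose total fits, or None if none does."""
    fitting = [m for m in MODEL_GB if total_memory_gb(m) <= available_gb]
    return max(fitting, key=MODEL_GB.get) if fitting else None
```

For example, the default llama3.2:3b comes to roughly 6.5 + 4 = 10.5 GB, which is why a Docker Desktop allocation of about 12 GB or more is suggested for it.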

Pinned versions & upgrades

So that every Enclave install runs the same, tested backend, the bundled stacks pin their images by digest rather than a floating tag like :latest. The digests are the single source of truth for the shipped versions and live directly in the Compose files:

| Component | Where it's pinned | Bundled version |
| --- | --- | --- |
| Onyx (backend · web · model server · code-interpreter) | deploy/docker_compose/docker-compose.yml | Onyx 4.0.x line |
| Ollama runtime | deploy/docker_compose/docker-compose.override.yml | pinned ollama/ollama |
| Paperclip | deploy/docker_compose/docker-compose.paperclip.yml | v2026.529.0 line |

To upgrade a component, bump its @sha256:… digest in the relevant Compose file and re-run docker compose up -d (for Paperclip, with -f docker-compose.paperclip.yml). There is no upstream checkout to keep in sync. Because the pin is a digest, every install pulls the same image bytes regardless of how upstream tags move.
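
Because the digests are the source of truth, it can be useful to check that no image in a Compose file has slipped back to a floating tag. The following is a minimal sketch that scans the text for `image:` lines with a regular expression instead of parsing YAML; `unpinned_images` is a hypothetical helper, and references built from variable substitution would be reported as unpinned.

```python
import re

# Matches `image: <ref>` lines, with or without quotes around the reference.
IMAGE_LINE = re.compile(r"^\s*image:\s*['\"]?(?P<ref>[^'\"\s#]+)", re.MULTILINE)
# A digest pin ends in @sha256: followed by 64 lowercase hex characters.
DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(compose_text):
    """Return image references that are not pinned by a sha256 digest."""
    refs = [match.group("ref") for match in IMAGE_LINE.finditer(compose_text)]
    return [ref for ref in refs if not DIGEST.search(ref)]
```

Running it over the contents of each Compose file under deploy/docker_compose/ should return an empty list when every image is pinned.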


Preview the interface

The UI prototypes live in wireframe/. They are static HTML with no build step — open wireframe/index.html directly in a browser, or serve the folder:

cd wireframe
python3 -m http.server 8000
# then visit http://localhost:8000

Keywords

self-hosted legal AI · private legal AI · on-premise legal AI · in-VPC legal AI · confidential AI for lawyers · secure legal AI platform · whitelabel legal AI · AI co-counsel · contract review software · AI contract review · AI due diligence · M&A due diligence automation · legal document automation · contract drafting AI · clause extraction · clause comparison · contract playbook software · redlining automation · negotiation copilot · legal research AI · retrieval-augmented generation · RAG for legal · contract corpus search · contract analytics · obligation management · legal workflow automation · AI agents for legal · law firm AI · in-house legal AI · LegalTech · LegalAI · data residency · attorney-client privilege · GDPR · SOC 2 · self-hosted AI · VPC deployment · bring your own model · open-source legal AI · Onyx · Paperclip.


Powered by NeuralChainAI
