AEGIS Client Workspace Agentic Platform — Windows Setup Runbook

No API key. No Docker. No Neo4j. No Microsoft 365. No Azure. Runs entirely on your Windows laptop.

What this platform includes:

  • Governance Knowledge Layer — policies, standards, AI model approvals, GDPR rules, audit trail
  • Product Knowledge Layer — user stories, architecture artifacts, product decisions, backlog
  • Multi-Agent Process Layer — 10 agents orchestrated for meeting transcript events
  • Focused Retrieval — agents receive only the relevant knowledge slice, not the full graph
  • Targeted Update — REST X.0 → REST Y.0 candidate update without full re-ingestion
  • Human Approval / Pull Requests — no knowledge becomes active without human sign-off
  • Token Reduction — ~85–95% fewer tokens vs. traditional RAG (illustrative estimates)
  • FastAPI Backend — 15 REST endpoints for Microsoft Copilot / Power Automate integration
  • Microsoft 365 Stubs — Teams, SharePoint, Graph API integration path (safe stubs)

Prerequisites

  • Python 3.11 or 3.12 — download from python.org
  • During install: tick "Add Python to PATH"
  • Git (optional — or just unzip the project folder)
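A quick way to confirm the interpreter on PATH matches the prerequisite is a small version check. This is a minimal sketch; the `is_supported` helper is a hypothetical name for illustration and is not part of the project.

```python
import sys

# Versions named in the prerequisites list above.
SUPPORTED = {(3, 11), (3, 12)}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version is 3.11 or 3.12."""
    return (version_info[0], version_info[1]) in SUPPORTED


print(f"Python {sys.version_info[0]}.{sys.version_info[1]} supported: {is_supported()}")
```

Save it as a scratch file and run it with the same `python` command you will use in Step 2.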

Step 1 — Open Command Prompt in the project folder

cd C:\path\to\aegis-knowledge-memory-poc

Step 2 — Create and activate a virtual environment

python -m venv venv
venv\Scripts\activate

You should see (venv) at the start of your prompt.


Step 3 — Install dependencies

python -m pip install --upgrade pip
pip install -r requirements.txt

This installs: streamlit, fastapi, uvicorn, python-dotenv, pyyaml, jsonschema, httpx, pydantic, pytest, pytest-asyncio. All optional dependencies (anthropic, openai, neo4j, azure SDK) are commented out — uncomment as needed.


Step 4 — Set up the .env file

copy .env.example .env

Default config works for demo with no changes:

  • LLM_PROVIDER=mock — deterministic demo outputs, no API key needed
  • NEO4J_ENABLED=false — JSON file storage, no Docker needed

To use Ollama (a free local LLM) instead of mock, see the Ollama section under the Optional headings below.
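To double-check that the copied .env still carries the demo defaults, a short script can read the two keys named above. This is a sketch under assumptions: it uses a simple stdlib parser rather than python-dotenv, the helper names are hypothetical, and treating a missing key as the demo default is a guess about AEGIS behaviour, not something this runbook states.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def read_env(path: str = ".env") -> dict:
    """Read and parse a .env file; a missing file yields an empty dict."""
    env_file = Path(path)
    return parse_env(env_file.read_text(encoding="utf-8")) if env_file.exists() else {}


def is_demo_config(values: dict) -> bool:
    """True when LLM_PROVIDER=mock and NEO4J_ENABLED=false.

    Assumption: a missing key is treated as the demo default.
    """
    provider = values.get("LLM_PROVIDER", "mock").lower()
    neo4j = values.get("NEO4J_ENABLED", "false").lower()
    return provider == "mock" and neo4j == "false"


print("Demo defaults active:", is_demo_config(read_env()))
```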


Step 5 — Run ingestion

python src\main.py --stage ingest

Expected output: 7 documents ingested (6 .md + 1 .docx), ingestion manifest created. Supported formats: .md, .txt, .docx (no library needed), .pdf (needs PyMuPDF for full text).


Step 6 — Extract claims and decisions

python src\main.py --stage extract

Expected output: extracted_claims.json, extracted_decisions.json, extracted_entities.json created in data\processed\.
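The expected output can be verified without opening the folder by hand. The sketch below only checks that the three files named above exist; `missing_outputs` is a hypothetical helper, and the relative path assumes you run it from the project folder.

```python
from pathlib import Path

# File names and folder are taken from the expected output above.
EXPECTED = [
    "extracted_claims.json",
    "extracted_decisions.json",
    "extracted_entities.json",
]


def missing_outputs(processed_dir: str = "data/processed") -> list:
    """Return the expected file names that are absent from processed_dir."""
    folder = Path(processed_dir)
    return [name for name in EXPECTED if not (folder / name).is_file()]


missing = missing_outputs()
print("All extract outputs present" if not missing else f"Missing: {missing}")
```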


Step 7 — Run conflict detection

python src\main.py --stage conflicts

Expected output: 5 conflicts detected — includes 2 canonical demo conflicts:

  • CONF-2025-GDPR-001 (HIGH) — AWS S3 us-east-1 vs GDPR EU data residency
  • CONF-2025-MQ-002 (MEDIUM) — REST push superseded by IBM MQ pull-based

Step 8 — Queue items for human validation

python src\main.py --stage validate

Expected output: ~13 items queued — claims, decisions, and conflicts.


Step 9 — Generate artifacts

python src\main.py --stage artifacts

Expected output: ART-ADR-001.md and ART-ARCHSUM-001.md saved to data\processed\artifacts\.
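Steps 5 to 9 can also be chained from one script. This is a minimal sketch, assuming src\main.py exits with a non-zero code when a stage fails (the runbook does not say so); `stage_command` and `run_pipeline` are hypothetical names. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Stage names in the order used by Steps 5 to 9.
STAGES = ["ingest", "extract", "conflicts", "validate", "artifacts"]


def stage_command(stage: str) -> list:
    """Build the command for one stage, using the current interpreter."""
    return [sys.executable, "src/main.py", "--stage", stage]


def run_pipeline(stages=STAGES, dry_run: bool = True) -> None:
    """Run each stage in turn; stop at the first failing stage."""
    for stage in stages:
        cmd = stage_command(stage)
        print("->", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


# Dry run: prints the five commands without executing them.
run_pipeline(dry_run=True)
```

Call `run_pipeline(dry_run=False)` from the project folder with the virtual environment active to execute the stages for real.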


Step 10 — Launch the Streamlit dashboard

streamlit run app\streamlit_app.py

Browser opens at: http://localhost:8501

You should see 8 tabs across the top:

| Tab | What it shows |
| --- | --- |
| 🔍 Validation Queue | Pending claims and decisions: approve / reject / escalate |
| ⚡ Conflict Registry | Detected contradictions, severity rated, grouped |
| ✅ Approved Knowledge | Validated items + Ask the Knowledge Base (LLM-powered) |
| 🔄 Pipeline Status | Stage results, LLM provider, Neo4j status |
| 🕸️ Knowledge Graph | Live interactive graph: documents, claims, entities, conflicts |
| 📤 Upload | Upload files or paste a folder path; runs the full pipeline automatically |
| 🔗 M365 Integration | Future enterprise integration design (no credentials required for demo) |
| ⚙️ Product Process Layer | Two-layer demo: focused retrieval, REST X.0 → Y.0, agent simulation, PRs |

Step 11 — Run the Product Process Layer demo (no extra setup needed)

  1. In the browser, click ⚙️ Product Process Layer
  2. Click Simulate Event sub-tab
  3. Click Simulate Meeting Event
  4. Watch topic detection, focused retrieval (1 of 9 rules), and candidate update creation
  5. Click Approve in the Human Validation section
  6. Check Governance Updates and Pull Requests sub-tabs
  7. Click Token Reduction sub-tab to see the comparison

To reset the demo to its initial state, click Reset Demo Data in the Simulate Event sub-tab.
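The "1 of 9 rules" figure in the demo gives a feel for where the token reduction comes from. As a rough worked example, assuming every rule costs the same number of tokens (real rules vary, and the per-rule figure below is invented purely for illustration), sending 1 rule instead of 9 saves 1 - 1/9, or about 88.9%, which sits inside the ~85–95% illustrative range quoted in the feature list.

```python
def token_reduction(full_tokens: int, focused_tokens: int) -> float:
    """Fraction of tokens saved by sending the focused slice instead of the full set."""
    if full_tokens <= 0:
        raise ValueError("full_tokens must be positive")
    return 1 - focused_tokens / full_tokens


# Hypothetical figure: 200 tokens per rule, chosen only for illustration.
TOKENS_PER_RULE = 200
full = 9 * TOKENS_PER_RULE      # all 9 rules sent to the agent
focused = 1 * TOKENS_PER_RULE   # only the 1 relevant rule

print(f"Reduction: {token_reduction(full, focused):.1%}")  # Reduction: 88.9%
```

The Token Reduction sub-tab shows the platform's own estimate; this calculation is only a sanity check on the order of magnitude.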


Step 11b — Start the FastAPI backend (optional)

uvicorn src.api:app --reload --port 8000

Docs at: http://localhost:8000/docs

Key endpoints:

  • GET /health — health check
  • GET /api/status — platform status, LLM provider, M365 enabled
  • POST /api/events/meeting-transcript — full multi-agent event
  • GET /api/knowledge/context?topic=REST+API+standard — focused retrieval
  • POST /api/governance/check — governance check for an artifact
  • GET /api/token-estimate — RAG vs. AEGIS token comparison
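The endpoints above can be exercised from Python with the standard library alone. This is a sketch under assumptions: it presumes the endpoints return JSON (the runbook does not show response bodies), and `context_url` and `get_json` are hypothetical helper names. Only the URL is built at import time; the network calls are left as comments so nothing fails when the backend is not running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # port from the uvicorn command above


def context_url(topic: str, base_url: str = BASE_URL) -> str:
    """Build the focused-retrieval URL for a topic (spaces are encoded as +)."""
    return f"{base_url}/api/knowledge/context?{urlencode({'topic': topic})}"


def get_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(context_url("REST API standard"))

# With the backend running:
#   print(get_json(f"{BASE_URL}/health"))
#   print(get_json(context_url("REST API standard")))
```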

Step 12 — Run tests (confirms everything works)

python -m pytest tests\ -v

Expected: 152 tests pass — no API key or Docker required.

Tests cover: chunking, extraction, conflict detection, focused retrieval, targeted update, agents (all 10), orchestrator, FastAPI endpoints, M365 stubs, token estimator, demo hardening.


Using the Ask feature (Approved Knowledge tab)

After approving at least one item in the Validation Queue:

  1. Click ✅ Approved Knowledge
  2. Scroll down to Ask the Knowledge Base
  3. Click a suggested question or type your own
  4. Click Ask

The LLM answers using only the human-approved claims and decisions — not raw documents. Every point is cited back to a specific claim or decision ID.

In mock mode the answer is assembled deterministically from the structure of the approved knowledge, not generated by a language model. For real language model answers, set up Ollama (see below), which is free and runs locally.


Using the Upload feature

  1. Click 📤 Upload
  2. Either:
    • Drag and drop .md, .txt, .docx, or .pdf files onto the uploader, or
    • Paste an absolute folder path (e.g. C:\Users\you\Documents\ArchitectureDocs) and click Scan
  3. Click Process & Add to Queue

AEGIS will ingest, extract claims and decisions, detect conflicts, and add everything to the Validation Queue automatically. Files of up to 1 GB each are accepted.
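Before pasting a folder path, it can help to preview which files would qualify. The sketch below lists files with the supported extensions and skips anything over 1 GB. It is an illustration only: whether AEGIS scans sub-folders, and whether it counts 1 GB as 1024³ bytes, are assumptions, and `scan_folder` is a hypothetical name.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".md", ".txt", ".docx", ".pdf"}  # formats listed above
MAX_BYTES = 1024 ** 3  # assumption: "1 GB" taken as 1024^3 bytes


def scan_folder(folder: str) -> list:
    """Return supported files under folder (recursively), sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in SUPPORTED_SUFFIXES
        and p.stat().st_size <= MAX_BYTES
    )


# Example (path taken from the step above):
#   for path in scan_folder(r"C:\Users\you\Documents\ArchitectureDocs"):
#       print(path)
```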


Using the M365 Integration feature

To connect to a real Microsoft 365 tenant:

  1. Register an app in Azure Portal → Entra ID → App registrations
  2. Grant API permissions: Sites.Read.All, Files.Read.All, User.Read.All, OnlineMeetings.Read.All
  3. Create a client secret under Certificates & secrets
  4. Add to .env:
    AZURE_TENANT_ID=your-tenant-id
    AZURE_CLIENT_ID=your-app-client-id
    AZURE_CLIENT_SECRET=your-secret-value
    
  5. Restart the dashboard and click 🔗 M365 Integration
  6. Click Test Connection — AEGIS will verify the credentials against the Graph API
  7. Browse SharePoint sites, select files, click Fetch & Ingest

No M365 access is needed for the local demo. The M365 tab also shows the full integration architecture design when not connected.
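If Test Connection fails, it can be useful to check the three settings independently of the dashboard. The sketch below reports which variables are unset and builds the standard Microsoft identity platform client-credentials token request for Graph. How AEGIS itself performs the check is not documented here, so treat this as an illustration; `missing_settings` and `token_request` are hypothetical names, and no request is actually sent.

```python
import os
from urllib.parse import urlencode

REQUIRED = ["AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"]


def missing_settings(env=os.environ) -> list:
    """Return the required variable names that are unset or empty."""
    return [key for key in REQUIRED if not env.get(key)]


def token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the URL and form body for an app-only (client credentials) token."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
    })
    return url, body


# Note: values in .env are not in os.environ unless something loads them first.
print("Missing settings:", missing_settings())
```

A 401 from a POST of that body to that URL usually points at the secret or tenant ID rather than at AEGIS.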


Optional — Real LLM answers with Ollama (free, local, no API key)

Ollama runs a language model on your laptop. No cloud, no cost.

rem Download from https://ollama.com/download and install
ollama pull llama3.2:latest
ollama serve

In .env, change:

LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.2:latest

Restart the dashboard. The Ask feature will now use llama3.2 running locally. The Pipeline Status tab will show: Ollama (llama3.2:latest) — local, free, no API key.

To re-extract claims and decisions with the real model:

python src\main.py --stage extract
python src\main.py --stage conflicts
python src\main.py --stage validate
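To confirm that Ollama is reachable and the model has been pulled before restarting the dashboard, you can query Ollama's local HTTP API. This sketch assumes Ollama's default address (http://localhost:11434) and its /api/tags listing endpoint; the helper names are hypothetical and this is not how AEGIS itself connects.

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def model_names(tags_payload: dict) -> list:
    """Extract model names from an /api/tags response body."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def installed_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list:
    """Ask the local Ollama server which models have been pulled."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return model_names(json.loads(response.read().decode("utf-8")))


def has_model(names, wanted: str = "llama3.2:latest") -> bool:
    """True when the model named in .env is among the installed models."""
    return wanted in names


try:
    print("llama3.2:latest installed:", has_model(installed_models()))
except (OSError, ValueError) as exc:
    # Connection refused, timeout, or an unexpected response body.
    print(f"Ollama not reachable at {OLLAMA_URL}: {exc}")
```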

Optional — Real LLM with AWS Bedrock (use Claude via your AWS account)

AWS Bedrock lets you call Claude models using your existing AWS credentials — no Anthropic API key needed.

One-time setup in AWS Console:

  1. Go to AWS Console → Amazon Bedrock → Model access
  2. Click Manage model access → Request access for "Anthropic Claude 3.5 Sonnet v2"
  3. Access is usually granted within minutes (no extra cost until you use it)

Install boto3:

pip install boto3

Configure credentials (choose one method):

Option A — environment variables (easiest for demos):

set AWS_ACCESS_KEY_ID=AKIA...
set AWS_SECRET_ACCESS_KEY=your-secret-key
set AWS_DEFAULT_REGION=us-east-1

Option B — AWS CLI (persists across sessions):

aws configure

Set the provider in .env:

LLM_PROVIDER=bedrock
BEDROCK_AWS_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0

Re-run the pipeline with the real model:

python src\main.py --stage extract
python src\main.py --stage conflicts
python src\main.py --stage artifacts

The dashboard will show: AWS Bedrock (anthropic.claude-3-5-sonnet-20241022-v2:0 / us-east-1) in the Pipeline Status tab.

If credentials are missing or the call fails, AEGIS automatically falls back to mock mode — the demo keeps running.
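The fallback behaviour described above follows a common pattern: wrap the provider call and answer from the mock when it raises. The sketch below illustrates that pattern only; it is not AEGIS's actual code, and every name in it (`mock_answer`, `with_fallback`, `broken_bedrock`) is hypothetical.

```python
from typing import Callable


def mock_answer(prompt: str) -> str:
    """Deterministic stand-in, playing the role of LLM_PROVIDER=mock."""
    return f"[mock] {prompt[:40]}"


def with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str] = mock_answer,
) -> Callable[[str], str]:
    """Wrap a provider call so any failure is answered by the fallback instead."""
    def call(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception as exc:  # e.g. missing credentials, network or API errors
            print(f"Provider failed ({exc!r}); falling back to mock")
            return fallback(prompt)
    return call


def broken_bedrock(prompt: str) -> str:
    """Hypothetical provider that simulates missing AWS credentials."""
    raise RuntimeError("no AWS credentials found")


ask = with_fallback(broken_bedrock)
print(ask("Summarise the GDPR conflict"))
```

One consequence of this design is that a misconfigured provider can go unnoticed, so check the Pipeline Status tab to confirm which provider is actually in use.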


Optional — Real LLM with Anthropic Claude (requires API key)

pip install anthropic
set LLM_PROVIDER=anthropic
set ANTHROPIC_API_KEY=sk-ant-your-key-here
python src\main.py --stage extract
python src\main.py --stage conflicts
python src\main.py --stage artifacts

Optional — Neo4j graph storage (requires Docker Desktop)

Install Docker Desktop from docker.com, then:

docker-compose up -d neo4j

In .env, set:

NEO4J_ENABLED=true
GRAPH_BACKEND=neo4j

Neo4j Browser: http://localhost:7474
User: neo4j | Password: password


Optional — REST API for Copilot Studio / Power Automate

pip install fastapi uvicorn
uvicorn src.api:app --reload --port 8000

API docs: http://localhost:8000/docs

Key endpoints:

  • POST /knowledge/ask — ask a natural language question, get a cited answer
  • GET /knowledge/search?q=... — keyword search across approved knowledge
  • GET /conflicts — list detected conflicts
  • GET /knowledge/approved — list all approved claims and decisions

For Copilot Studio demo:

ngrok http 8000

See docs\MICROSOFT_365_COPILOT_INTEGRATION.md for the full Copilot Studio setup.


Troubleshooting

| Error | Fix |
| --- | --- |
| python not found | Re-install Python and tick "Add Python to PATH" |
| venv\Scripts\activate fails | Run Set-ExecutionPolicy RemoteSigned in PowerShell as admin |
| Port 8501 in use | streamlit run app\streamlit_app.py --server.port 8502 |
| Neo4j connection error | Leave NEO4J_ENABLED=false; Neo4j is not needed for the demo |
| Validation queue has 0 items | Run --stage conflicts first, then --stage validate |
| Anthropic import error | Anthropic is optional; leave LLM_PROVIDER=mock in .env |
| Port 8000 in use | Change API_PORT=8001 in .env and use --port 8001 |
| .docx shows 0 words | File may be password-protected or corrupted; try a different file |
| Ollama not responding | Run ollama serve in a separate Command Prompt and keep it open |
| Upload shows no results | Check the file has readable text content; scanned PDFs need OCR |
| M365 connection error 401 | Client secret may have expired; generate a new one in Azure Portal |
| Knowledge Graph shows no edges | Run --stage ingest and --stage extract first to populate data |
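For the "port in use" rows, a short script can tell you which of the ports mentioned in this runbook are already taken before you start anything. This is a minimal sketch using a plain TCP connect on the loopback address; `port_in_use` is a hypothetical helper.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True when something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Ports taken from the steps above.
for name, port in [("Streamlit", 8501), ("FastAPI", 8000), ("Neo4j Browser", 7474)]:
    print(f"{name:14} port {port}: {'in use' if port_in_use(port) else 'free'}")
```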