"From prompt-driven AI to context-aware delivery intelligence."
The transcript becomes the prompt. The Knowledge Graph provides the context. AEGIS generates the right outcome for each role.
Author: Pranjal Patil | Programme: AEGIS / Agent 365 | Status: Client Demo Ready — June 2026
macOS / Linux:
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
streamlit run app/aegis_demo.py # Polished client demo
# streamlit run app/streamlit_app.py # Technical validation queue
Windows (PowerShell):
py -m venv .venv; .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
streamlit run app/aegis_demo.py
Docker:
docker build -t aegis-demo .
docker run -p 8501:8501 aegis-demo
No API key required. Demo runs in mock mode by default.
AEGIS is a persona-based enterprise AI platform for delivery teams. Instead of asking users to write prompts, AEGIS ingests transcripts, documents, and enterprise sources, extracts requirements and decisions, validates them against a governed Knowledge Graph, and generates role-specific outputs — with a human in the loop at every confirmation step.
| Capability | Status |
|---|---|
| BA / Product Owner — Transcript to Requirements | Demo Ready |
| Domain Knowledge (Governed Rules of Engagement) | Demo Ready |
| Engagement Record / Audit Log | Demo Ready |
| Knowledge Graph Visualisation | Demo Ready |
| Documentation Templates | Demo Ready |
| UX Designer Workspace | Simulated / Future |
| Architect Workspace | Simulated / Future |
| Developer, Tester, Security, Governance Personas | Future Capability |
See DEMO_GUIDE.md for the full presentation script and DEPLOYMENT_AWS.md for AWS deployment.
Previous README content below (technical reference):
Author: Pranjal Patil | Status: POC v1.0 (client-demo ready) | Programme: AEGIS / Agent 365
Documents are not knowledge. Documents contain knowledge. AEGIS extracts that knowledge, validates it, and gives agents the current trusted view.
We are not doing full re-ingestion every time. We are doing focused graph retrieval and targeted knowledge update.
"AEGIS is not another chatbot. It is the trusted memory and process layer behind Copilot."
"Microsoft 365 is the workplace layer; AEGIS is the trusted memory layer behind it."
Enterprise architecture agents cannot reliably use raw documents as their knowledge source. This POC proves a two-layer approach:
Layer 1 — Governance Knowledge Layer (stable, organisation-wide rules)
Raw documents + meeting transcripts
→ extract claims and decisions (LLM)
→ detect conflicts and duplicates
→ human validation gate (Knowledge Pull Request Model)
→ approved knowledge → knowledge graph
→ context graph per agent (filtered, current, validated)
→ generate ADR / architecture doc / wiki page
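The human validation gate in Layer 1 can be pictured as a small queue: each extracted item stays pending until a reviewer approves, rejects, or escalates it, and every decision is appended to an audit trail. The sketch below is illustrative only. The `QueueItem` fields, the `review` helper, the sample claims, and the in-memory lists are assumptions made for this example, not the POC's actual validation code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Reviewer actions and the status each one produces (names are assumptions).
ALLOWED_ACTIONS = {"approve": "approved", "reject": "rejected", "escalate": "escalated"}


@dataclass
class QueueItem:
    item_id: str          # hypothetical identifier, e.g. "CLAIM-001"
    text: str
    status: str = "pending"


def review(item, action, reviewer, audit_log):
    """Apply one reviewer decision to a pending item and record it in the audit log."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if item.status != "pending":
        raise ValueError(f"{item.item_id} was already reviewed ({item.status})")
    item.status = ALLOWED_ACTIONS[action]
    audit_log.append({
        "item_id": item.item_id,
        "action": action,
        "reviewer": reviewer,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return item


def approved_knowledge(queue):
    """Only approved items are eligible to be written to the knowledge graph."""
    return [item for item in queue if item.status == "approved"]


# Made-up sample claims for illustration.
queue = [
    QueueItem("CLAIM-001", "All external APIs follow the REST standard."),
    QueueItem("CLAIM-002", "Customer data may be stored in any region."),
]
audit_log = []
review(queue[0], "approve", "architect@example.com", audit_log)
review(queue[1], "reject", "architect@example.com", audit_log)
print([item.item_id for item in approved_knowledge(queue)])
```

Keeping the audit entry inside `review` means no status change can happen without a matching log record, which is the property the gate is meant to guarantee.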
Layer 2 — Product Knowledge Layer (evolving, project-specific)
Meeting transcript (e.g. "REST Y.0 agreed as new API standard")
→ topic detection (deterministic, no LLM call)
→ focused retrieval: only relevant governance rules retrieved
→ candidate update created (CAND-001 supersedes GOV-REST-001)
→ human validation via Product Process Layer
→ approved: new rule GOV-REST-002 goes live, old rule marked superseded
→ agents, user stories, and architecture artifacts updated
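The first two Layer 2 steps, deterministic topic detection followed by focused retrieval, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the keyword map, the rule fields, and the rule IDs other than GOV-REST-001 are made up for the example, and the POC's own focused_context.py may work differently.

```python
import re

# Hypothetical keyword map; the POC's real topic list is not shown in this README.
TOPIC_KEYWORDS = {
    "rest_api_standard": ["rest", "api standard"],
    "tls_policy": ["tls"],
    "gdpr_data_residency": ["gdpr", "personal data"],
}

# Illustrative rules. Only the ID GOV-REST-001 comes from this README.
GOVERNANCE_RULES = [
    {"id": "GOV-REST-001", "topic": "rest_api_standard", "status": "active",
     "text": "REST X.0 is the API standard."},
    {"id": "GOV-TLS-001", "topic": "tls_policy", "status": "active",
     "text": "All services use TLS."},
    {"id": "GOV-GDPR-001", "topic": "gdpr_data_residency", "status": "active",
     "text": "Personal data stays in approved regions."},
]


def detect_topics(transcript):
    """Return topics whose keywords appear as whole words (no LLM call)."""
    lowered = transcript.lower()
    topics = []
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(re.search(r"\b" + re.escape(kw) + r"\b", lowered) for kw in keywords):
            topics.append(topic)
    return topics


def focused_retrieval(topics, rules):
    """Return only the active rules that match the detected topics."""
    wanted = set(topics)
    return [r for r in rules if r["topic"] in wanted and r["status"] == "active"]


transcript = "REST Y.0 agreed as new API standard"
topics = detect_topics(transcript)
focused = focused_retrieval(topics, GOVERNANCE_RULES)
print(topics, [r["id"] for r in focused])
```

Whole-word matching is used so that a keyword such as "rest" does not fire on words like "interest".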
Process / Agent Layer (10 agents) sits on top of both knowledge layers:
- ProductIntakeAgent: extract context, decisions, action items from meeting transcripts
- FocusedRetrievalAgent: retrieve only the relevant governance slice, not the full knowledge base
- KnowledgeUpdateAgent: create targeted candidate updates without full re-ingestion
- GovernanceCheckAgent: check artifacts and candidates against active governance rules
- UserStoryAgent: create/update user stories from action items and governance decisions
- ArchitectureAgent: create versioned architecture artifacts
- ArtifactGenerationAgent: generate ADR, approval pack, governance summary
- ApprovalAgent: create local pull requests for human review
- BacklogSyncAgent: sync approved stories to backlog only after PR approval
- Microsoft365IntegrationAgent: stub for Teams/SharePoint/Graph integration
Each agent receives only the validated knowledge slice it needs, not the full knowledge base.
The memory keeps everything. The agent sees the current trusted view.
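One way to read "the memory keeps everything, the agent sees the current trusted view" is as a supersession step plus a filter. The sketch below assumes a simple list of rule dictionaries with `status`, `supersedes`, and `superseded_by` fields. Those field names and both helper functions are hypothetical, while the IDs CAND-001, GOV-REST-001, and GOV-REST-002 are the ones used in this README.

```python
from copy import deepcopy


def apply_approved_candidate(rules, candidate):
    """Return a new rule list in which the candidate's rule replaces the one it supersedes.

    Nothing is deleted: the old rule stays in memory with status "superseded".
    """
    if candidate["status"] != "approved":
        raise ValueError("candidate must pass human validation first")
    updated = deepcopy(rules)
    old = next((r for r in updated if r["id"] == candidate["supersedes"]), None)
    if old is None:
        raise KeyError(candidate["supersedes"])
    old["status"] = "superseded"
    old["superseded_by"] = candidate["new_rule_id"]
    updated.append({
        "id": candidate["new_rule_id"],
        "text": candidate["text"],
        "status": "active",
        "supersedes": candidate["supersedes"],
    })
    return updated


def current_trusted_view(rules):
    """What an agent is allowed to see: active rules only."""
    return [r for r in rules if r["status"] == "active"]


memory = [{"id": "GOV-REST-001", "text": "REST X.0 is the API standard.", "status": "active"}]
candidate = {
    "id": "CAND-001",
    "supersedes": "GOV-REST-001",
    "new_rule_id": "GOV-REST-002",
    "text": "REST Y.0 is the API standard.",
    "status": "approved",
}

memory = apply_approved_candidate(memory, candidate)
print(len(memory), [r["id"] for r in current_trusted_view(memory)])
```

Because the superseded rule is retained, the history needed for audit remains available even though agents no longer receive it.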
The dashboard runs fully offline with no API key, no Docker, no Neo4j, no Microsoft 365, no Azure.
- LLM_PROVIDER=mock: deterministic demo outputs, no Anthropic / OpenAI key needed
- NEO4J_ENABLED=false: all data stored in JSON files, no Docker required
- M365_ENABLED=false: all Microsoft 365 connectors return safe stubs
- All sample documents and governance rules are pre-loaded
Optional Anthropic mode: Set LLM_PROVIDER=anthropic and ANTHROPIC_API_KEY= in .env for stronger extraction and generation quality. Falls back to mock automatically if the key is missing.
Optional Ollama mode: Install Ollama, run ollama pull llama3.2:3b, then set LLM_PROVIDER=ollama for a free local LLM with no API key.
macOS / Linux:
cd aegis-knowledge-memory-poc
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# No changes needed for local demo: mock mode works out of the box
# Governance knowledge pipeline
python3 src/main.py --stage ingest
python3 src/main.py --stage extract
python3 src/main.py --stage conflicts
python3 src/main.py --stage validate
# Open Streamlit dashboard
streamlit run app/streamlit_app.py
# http://localhost:8501
# Optional: Run FastAPI backend
uvicorn src.api:app --reload --port 8000
# Docs at http://localhost:8000/docs
# Run all tests
python3 -m pytest tests/ -v
Windows:
cd aegis-knowledge-memory-poc
python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements.txt
copy .env.example .env
python src\main.py --stage ingest
python src\main.py --stage extract
python src\main.py --stage conflicts
python src\main.py --stage validate
streamlit run app\streamlit_app.py
# Optional
uvicorn src.api:app --reload
python -m pytest tests\ -v
See WINDOWS_RUNBOOK.md for full step-by-step instructions with troubleshooting.
The slide-ready explanation of how standards are encoded as a validated, versioned knowledge graph:
docs/slides/AEGIS_STANDARDS_KNOWLEDGE_GRAPH.md
Diagram source (Mermaid):
docs/diagrams/aegis_standards_knowledge_graph.mmd
HTML slide (open directly in browser, editable, print-to-PDF ready):
docs/html/AEGIS_Standards_Knowledge_Graph.html
Speaker notes and stakeholder Q&A:
docs/slides/AEGIS_STANDARDS_KNOWLEDGE_GRAPH_NOTES.md
A slide-ready explanation of how AEGIS reduces token consumption and hallucination is available here:
docs/SLIDES_TOKEN_REDUCTION.md
The slide deck covers:
- Why normal RAG is expensive and increases hallucination risk
- How AEGIS focused graph retrieval works
- The REST X.0 → REST Y.0 worked example
- Illustrative token reduction numbers (~85–95%)
- Why validated memory reduces hallucination
- Business value for enterprise clients
| Provider | Config | Requirement |
|---|---|---|
| mock (default) | LLM_PROVIDER=mock | None (works offline) |
| ollama | LLM_PROVIDER=ollama | Ollama running locally (ollama pull llama3.2:3b) |
| bedrock | LLM_PROVIDER=bedrock + AWS credentials | AWS account with Bedrock model access enabled |
| anthropic | LLM_PROVIDER=anthropic + ANTHROPIC_API_KEY=sk-ant-... | Anthropic API key (direct) |
| openai | LLM_PROVIDER=openai + OPENAI_API_KEY=sk-... | OpenAI API key |
AWS Bedrock is the recommended real-LLM option if you have an AWS account — no Anthropic API key needed, uses your existing AWS credentials. Enable Claude model access once in the Bedrock console, then pip install boto3 and set LLM_PROVIDER=bedrock.
All providers fall back to mock automatically if credentials are missing — the demo never crashes.
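The fallback behaviour can be sketched as a small resolver that checks whether the requested provider has the credentials it needs. This is an illustration, not the contents of src/llm_provider.py: the `resolve_provider` function and the `REQUIRED_ENV` map are hypothetical, and the Bedrock check is simplified to the two standard AWS environment variables even though AWS credentials can also come from profiles or roles.

```python
import os

# Environment variables each provider needs (assumed mapping for this sketch).
REQUIRED_ENV = {
    "mock": [],
    "ollama": [],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}


def resolve_provider(env=None):
    """Return the provider to use, falling back to "mock" when anything is missing."""
    env = os.environ if env is None else env
    requested = env.get("LLM_PROVIDER", "mock").strip().lower()
    if requested not in REQUIRED_ENV:
        return "mock"  # unknown provider name: stay in the safe default
    missing = [name for name in REQUIRED_ENV[requested] if not env.get(name)]
    return "mock" if missing else requested


print(resolve_provider({"LLM_PROVIDER": "anthropic"}))
print(resolve_provider({"LLM_PROVIDER": "anthropic", "ANTHROPIC_API_KEY": "sk-ant-example"}))
```

Passing the environment in as a dictionary keeps the function easy to test without touching real credentials.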
Neo4j is optional. The default is JSON file storage which works without Docker.
| Mode | Config | Requirement |
|---|---|---|
| Local JSON (default) | NEO4J_ENABLED=false | None |
| Neo4j graph | NEO4J_ENABLED=true | Docker running Neo4j |
To start Neo4j:
docker-compose up neo4j -d
# Neo4j Browser: http://localhost:7474 User: neo4j Password: password
| Stage | Command | Output |
|---|---|---|
| Ingest | --stage ingest | data/processed/ingestion_manifest.json |
| Extract | --stage extract | extracted_claims.json, extracted_decisions.json, extracted_entities.json |
| Resolve | --stage resolve | resolved_entities.json |
| Conflicts | --stage conflicts | conflicts.json |
| Validate | --stage validate | validation_queue.json (queue for UI review) |
| Artifacts | --stage artifacts | data/processed/artifacts/ART-ADR-001.md, ART-ARCHSUM-001.md |
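If you prefer to run the stages from a script rather than typing each command, a thin wrapper is enough. The sketch below assumes the stages should run in the order the table lists them and that the script is started from the repository root. `stage_command` and `run_pipeline` are hypothetical helper names, and the runner is injectable so the logic can be exercised without invoking src/main.py.

```python
import subprocess
import sys

# Stage names as listed in the table; the execution order is an assumption.
STAGES = ["ingest", "extract", "resolve", "conflicts", "validate", "artifacts"]


def stage_command(stage, python=sys.executable):
    """Build the argv list for one pipeline stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    return [python, "src/main.py", "--stage", stage]


def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run each stage in order; check=True stops at the first failing stage."""
    for stage in stages:
        runner(stage_command(stage), check=True)


# Print the commands instead of running them, so this sketch has no side effects.
for stage in STAGES:
    print(" ".join(stage_command(stage, python="python3")))
```

Calling `run_pipeline()` with the default runner would execute the real stages and raise `subprocess.CalledProcessError` as soon as one of them exits with a non-zero status.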
python3 -m pytest tests/ -v
# All tests pass without any API key or Docker
Tests cover: chunking, claim extraction, decision extraction, conflict detection, widget key uniqueness, Neo4j fallback, mock LLM, ingest/validate stages, product process layer (focused retrieval, targeted update, agent simulation, pull requests, token reduction).
Six sample documents are provided in data/raw/:
| File | Type | Purpose |
|---|---|---|
| 2012_shopping_portal_architecture.md | Legacy architecture | 2012 AWS 3-tier shopping portal (EC2+ELB+RDS) |
| 2012_kyc_platform_architecture.md | Legacy architecture | 2012 hybrid KYC platform (AWS + on-prem Oracle) |
| togaf_architecture_template.md | Governance template | TOGAF 9 ADM templates and artefact formats |
| agentic_architecture_reference_2021_2022.md | Reference | LangChain, LlamaIndex, ReAct, MRKL (2022) |
| sample_meeting_transcript.md | Meeting transcript | REST API → pull-based integration decision change |
| sample_rfp_extract.md | RFP | Architecture requirements (cloud, security, integration) |
All documents are labelled [SYNTHETIC] where not sourced directly from public references.
| Page | What it shows |
|---|---|
| Knowledge Pull Request | Validation queue — approve / reject / escalate claims and decisions |
| Conflict Register | Detected contradictions between claims, with severity and recommendation |
| Audit Trail | Full history of every approval, rejection, and escalation |
| Approved Knowledge | All approved claims and decisions, browsable by domain |
| Pipeline Status | Stage results, LLM provider mode, Neo4j status |
| Product Process Layer | New — two-layer demo: simulate meeting event, focused retrieval, candidate governance updates, PR approval, backlog sync, token reduction metrics |
                ┌──────────────────────────────────┐
                │      PROCESS / AGENT LAYER       │
                │  ProductIntake · GovernanceCheck │
                │  UserStory · Architecture        │
                │  Approval · BacklogSync          │
                └────────────┬─────────────────────┘
                             │  receives focused slice only
          ┌──────────────────┴───────────────────┐
          │                                      │
┌─────────▼──────────┐             ┌─────────────▼──────────┐
│  GOVERNANCE LAYER  │             │  PRODUCT LAYER         │
│  governance_rules  │             │  user_stories          │
│  (9 rules, stable) │             │  architecture_artifacts│
│  focused_context   │             │  product_context       │
│  targeted_update   │             │  backlog               │
└────────────────────┘             └────────────────────────┘
          ▲                                      ▲
          │                                      │
┌─────────┴──────────────────────────────────────┴┐
│  KNOWLEDGE INGESTION PIPELINE                   │
│  ingestion → chunking → extraction (LLM)        │
│  → entity_resolution → conflict_detection       │
│  → validation_workflow → graph_writer (opt)     │
└─────────────────────────────────────────────────┘
                         ▲
       [raw documents + meeting transcripts]
Focused retrieval in action (REST X.0 → REST Y.0 demo):
- Transcript arrives: "REST Y.0 agreed as new API standard"
- Topic detected: rest_api_standard (keyword match, no LLM)
- Governance retrieval: 1 of 9 rules (GOV-REST-001 only, not the TLS, GDPR, or AI rules)
- Token reduction: ~95% vs full-knowledge-base RAG
- Candidate GOV-REST-002 created, human approves, GOV-REST-001 marked superseded
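The token reduction figure comes from comparing the context an agent would receive under full-knowledge-base retrieval with the focused slice. A rough way to reproduce that comparison is shown below. The sketch uses the common "about four characters per token" rule of thumb rather than a real tokenizer, the placeholder strings stand in for rule text, and both function names are hypothetical, so the printed percentage is illustrative only. The actual reduction depends on how large the focused slice is relative to everything else.

```python
def estimate_tokens(text):
    """Crude estimate: roughly 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def token_reduction(full_context, focused_context):
    """Fraction of tokens saved by sending only the focused slice."""
    full = sum(estimate_tokens(t) for t in full_context)
    focused = sum(estimate_tokens(t) for t in focused_context)
    if full == 0:
        raise ValueError("full context is empty")
    return 1 - focused / full


# Nine placeholder rules: eight long unrelated ones and one short REST rule.
unrelated_rules = ["placeholder governance rule text " * 12] * 8
rest_rule = "REST X.0 is the API standard. " * 5
all_rules = unrelated_rules + [rest_rule]

reduction = token_reduction(all_rules, [rest_rule])
print(f"Estimated reduction: {reduction:.0%}")
```

For a figure you can defend in front of a client, swap `estimate_tokens` for the tokenizer of the model you actually deploy.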
aegis-knowledge-memory-poc/
├── README.md
├── WINDOWS_RUNBOOK.md  ← Windows setup + troubleshooting
├── DEMO_NOTES.md  ← Demo script
├── .env.example  ← Copy to .env; no changes needed for demo
├── requirements.txt
│
├── src/
│   ├── main.py  ← Pipeline (--stage flag)
│   ├── llm_provider.py  ← LLM abstraction (mock/ollama/anthropic/openai)
│   ├── focused_context.py  ← Focused knowledge slice retrieval
│   ├── targeted_update.py  ← Targeted governance update
│   ├── token_estimator.py  ← Token comparison (RAG vs. focused)
│   ├── agent_orchestrator.py  ← Full multi-agent event orchestration
│   ├── agent_process_layer.py  ← Legacy agent functions (kept for compatibility)
│   ├── event_triggers.py  ← Event simulation
│   ├── api.py  ← FastAPI backend (15 endpoints)
│   ├── agents/  ← Individual agent classes
│   │   ├── __init__.py
│   │   ├── base_agent.py
│   │   ├── product_intake_agent.py
│   │   ├── focused_retrieval_agent.py
│   │   ├── governance_check_agent.py
│   │   ├── knowledge_update_agent.py
│   │   ├── user_story_agent.py
│   │   ├── architecture_agent.py
│   │   ├── artifact_generation_agent.py
│   │   ├── approval_agent.py
│   │   ├── backlog_sync_agent.py
│   │   └── microsoft365_integration_agent.py
│   └── integrations/
│       ├── microsoft365.py  ← Real M365 client (live mode)
│       └── microsoft365/  ← Structured M365 stubs
│           ├── m365_config.py
│           ├── graph_auth.py
│           ├── sharepoint_connector.py
│           ├── teams_transcript_connector.py
│           ├── power_automate_webhook.py
│           └── copilot_connector_stub.py
│
├── data/
│   ├── raw/  ← Input documents
│   ├── processed/  ← Extracted claims/decisions/conflicts
│   ├── documents/  ← Meeting transcripts
│   └── product/  ← Product knowledge layer
│       ├── governance_rules.json
│       ├── product_context.json
│       ├── user_stories.json
│       ├── architecture_artifacts.json
│       ├── product_decisions.json
│       ├── backlog.json
│       ├── pull_requests.json
│       └── candidate_updates.json
│
├── outputs/product/  ← Agent-generated artifacts
│   ├── updated_user_stories.md
│   ├── architecture_artifact_v2.md
│   ├── governance_check_summary.md
│   ├── approval_pack.md
│   └── adr_rest_y0.md
│
├── docs/
│   ├── SLIDES_TOKEN_REDUCTION.md  ← Slide-ready: token & hallucination reduction
│   ├── DEPLOYMENT_PLAN.md  ← 5-stage deployment roadmap
│   ├── SECURITY_AND_GOVERNANCE.md  ← Security, GDPR, audit, RBAC plan
│   ├── MICROSOFT_365_COPILOT_INTEGRATION.md
│   ├── TOKEN_CONSUMPTION_AND_HALLUCINATION_REDUCTION.md
│   └── KNOWLEDGE_MEMORY_MODEL.md
│
├── app/
│   ├── streamlit_app.py  ← Dashboard (~2750 lines)
│   └── ai-plugin.json  ← Copilot Studio plugin manifest
│
└── tests/
    ├── test_claim_extraction.py
    ├── test_decision_extraction.py
    ├── test_conflict_detection.py
    ├── test_demo_hardening.py
    ├── test_product_process_layer.py  ← 45 tests (focused retrieval, agents, PRs)
    └── test_agents_and_orchestrator.py  ← 50 tests (agent layer + orchestrator + API)
- Open data/raw/sample_meeting_transcript.md: show where the REST API decision changes to pull-based
- Show data/processed/extracted_decisions.json: the structured decision extracted from that transcript
- Show data/processed/conflicts.json: the GDPR vs. AWS S3 conflict the system detected automatically
- Open Streamlit at localhost:8501 (validation queue): approve one claim, reject another, see audit trail update
- Open data/processed/artifacts/ART-ADR-001.md: the ADR generated from the validated decision
The key message: documents came in → claims and decisions came out → conflict surfaced automatically → human validated → ADR generated. That is the end-to-end pipeline.
The current POC runs locally. No Microsoft 365 access is required to run the demo.
In an enterprise deployment, the same workflow can be connected to Microsoft 365:
- SharePoint / OneDrive — architecture documents and working drafts as document sources
- Teams meeting transcripts — decision extraction from architecture meetings via Graph API
- Power Automate — trigger AEGIS ingestion when documents change or meetings end
- Power Apps — replace or complement the Streamlit validation dashboard
- Copilot Studio — build an agent that calls AEGIS APIs for validated context
- Teams — architects review and approve decisions without leaving Teams
No new platform is needed. The workflow builds on Microsoft 365 licences clients already hold.
See docs/MICROSOFT_365_COPILOT_INTEGRATION.md for the full integration design including Graph API endpoints, Power Automate flow examples, and Copilot Studio integration options.
| Capability | What to add |
|---|---|
| PDF/DOCX ingestion | PyMuPDF / python-docx parsers in ingestion.py |
| Teams transcript auto-ingestion | Microsoft Graph API — see src/integrations/microsoft365_stub.py |
| SharePoint document sync | Power Automate + POST /ingest/document API |
| Vector search for entity resolution | Replace LLM similarity with Qdrant embeddings |
| Production graph | Neo4j Enterprise or Memgraph |
| Audit store | PostgreSQL (replace JSON file store in validation_workflow.py) |
| Agent integration | Copilot Studio / Agent 365 — consume context graphs via API |
| LLM Wiki sync | Generate Confluence pages from approved stationary knowledge |
- 28 node types — Document, Claim, Decision, Conflict, Policy, Risk, System, Artifact, and more
- 33 edge types — CONTAINS, SUPERSEDES, CONFLICTS_WITH, APPROVED, GENERATED_FROM, and more
- Full definitions in ontology/node_types.yaml and ontology/edge_types.yaml
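To show what a typed graph buys you, the sketch below rejects any node or edge whose type is not in the ontology. It hard-codes only the handful of type names listed above instead of loading the YAML files, whose schema is not shown here. The `KnowledgeGraph` class, its methods, and the IDs DOC-001 and DEC-001 are hypothetical, and no rules about which node types an edge may connect are assumed.

```python
# Subset of the ontology named in this README; the full lists live in the YAML files.
NODE_TYPES = {"Document", "Claim", "Decision", "Conflict", "Policy", "Risk", "System", "Artifact"}
EDGE_TYPES = {"CONTAINS", "SUPERSEDES", "CONFLICTS_WITH", "APPROVED", "GENERATED_FROM"}


class KnowledgeGraph:
    """Tiny in-memory graph that only accepts known node and edge types."""

    def __init__(self):
        self.nodes = {}   # node_id -> node_type
        self.edges = []   # (source_id, edge_type, target_id)

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = node_type

    def add_edge(self, source_id, edge_type, target_id):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        for node_id in (source_id, target_id):
            if node_id not in self.nodes:
                raise KeyError(node_id)
        self.edges.append((source_id, edge_type, target_id))

    def targets(self, source_id, edge_type):
        """IDs reachable from source_id over edges of the given type."""
        return [t for s, e, t in self.edges if s == source_id and e == edge_type]


graph = KnowledgeGraph()
graph.add_node("DOC-001", "Document")
graph.add_node("DEC-001", "Decision")
graph.add_node("ART-ADR-001", "Artifact")
graph.add_edge("DOC-001", "CONTAINS", "DEC-001")
graph.add_edge("ART-ADR-001", "GENERATED_FROM", "DEC-001")
print(graph.targets("ART-ADR-001", "GENERATED_FROM"))
```

Rejecting unknown types at write time keeps typos and ad hoc relationship names out of the graph, so downstream queries can rely on the ontology.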
- LangChain (Harrison Chase, Oct 2022): https://github.com/langchain-ai/langchain
- LlamaIndex (Jerry Liu, Nov 2022): https://github.com/run-llama/llama_index
- ReAct paper (Yao et al., 2022): https://arxiv.org/abs/2210.03629
- MRKL Systems (Karpas et al., 2022): https://arxiv.org/abs/2205.00445
- TOGAF 9 Artifacts: https://pubs.opengroup.org/architecture/togaf9-doc/arch/chap31.html
- ADR format (Michael Nygard): https://adr.github.io/
- AWS Well-Architected Framework: https://aws.amazon.com/architecture/well-architected/