PranjalPatil9945/aegis-ba-platform

AEGIS — Enterprise Delivery Intelligence Platform

"From prompt-driven AI to context-aware delivery intelligence."

The transcript becomes the prompt. The Knowledge Graph provides the context. AEGIS generates the right outcome for each role.

Author: Pranjal Patil | Programme: AEGIS / Agent 365 | Status: Client Demo Ready — June 2026


Quick Start

macOS / Linux

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
streamlit run app/aegis_demo.py          # Polished client demo
# streamlit run app/streamlit_app.py    # Technical validation queue

Windows (PowerShell)

py -m venv .venv; .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
streamlit run app/aegis_demo.py

Docker

docker build -t aegis-demo .
docker run -p 8501:8501 aegis-demo

No API key required. Demo runs in mock mode by default.


What Is AEGIS?

AEGIS is a persona-based enterprise AI platform for delivery teams. Instead of asking users to write prompts, AEGIS ingests transcripts, documents, and enterprise sources, extracts requirements and decisions, validates them against a governed Knowledge Graph, and generates role-specific outputs — with a human in the loop at every confirmation step.

| Capability | Status |
|---|---|
| BA / Product Owner: Transcript to Requirements | Demo Ready |
| Domain Knowledge (Governed Rules of Engagement) | Demo Ready |
| Engagement Record / Audit Log | Demo Ready |
| Knowledge Graph Visualisation | Demo Ready |
| Documentation Templates | Demo Ready |
| UX Designer Workspace | Simulated / Future |
| Architect Workspace | Simulated / Future |
| Developer, Tester, Security, Governance Personas | Future Capability |

See DEMO_GUIDE.md for the full presentation script and DEPLOYMENT_AWS.md for AWS deployment.


Previous README content below (technical reference):


AEGIS Client Workspace Agentic Platform

Author: Pranjal Patil | Status: POC v1.0 (client-demo ready) | Programme: AEGIS / Agent 365


Core Message

Documents are not knowledge. Documents contain knowledge. AEGIS extracts that knowledge, validates it, and gives agents the current trusted view.

We are not doing full re-ingestion every time. We are doing focused graph retrieval and targeted knowledge update.


"AEGIS is not another chatbot. It is the trusted memory and process layer behind Copilot."

"Microsoft 365 is the workplace layer; AEGIS is the trusted memory layer behind it."


What This POC Proves

Enterprise architecture agents cannot reliably use raw documents as their knowledge source. This POC proves a two-layer approach:

Layer 1 — Governance Knowledge Layer (stable, organisation-wide rules)

Raw documents + meeting transcripts
  → extract claims and decisions (LLM)
  → detect conflicts and duplicates
  → human validation gate (Knowledge Pull Request Model)
  → approved knowledge → knowledge graph
  → context graph per agent (filtered, current, validated)
  → generate ADR / architecture doc / wiki page
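
The human validation gate in this flow can be pictured as a small queue in which every item waits for an explicit decision. The sketch below is illustrative only: `QueueItem`, `ValidationQueue`, the status names, and the sample claims are hypothetical and are not taken from the AEGIS source. It shows items moving from pending to approved, rejected, or escalated, with each decision appended to an audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical action-to-status mapping; the real queue may use other names.
ALLOWED_ACTIONS = {"approve": "approved", "reject": "rejected", "escalate": "escalated"}


@dataclass
class QueueItem:
    item_id: str
    text: str
    status: str = "pending"


@dataclass
class ValidationQueue:
    items: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def add(self, item: QueueItem) -> None:
        self.items[item.item_id] = item

    def review(self, item_id: str, action: str, reviewer: str) -> QueueItem:
        """Apply a human decision and record it in the audit trail."""
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        item = self.items[item_id]
        if item.status != "pending":
            raise ValueError(f"{item_id} has already been reviewed")
        item.status = ALLOWED_ACTIONS[action]
        self.audit_trail.append({
            "item_id": item_id,
            "action": action,
            "reviewer": reviewer,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return item

    def approved(self) -> list:
        """Only approved items are eligible to be written to the graph."""
        return [i for i in self.items.values() if i.status == "approved"]


# Made-up sample claims, for illustration only.
queue = ValidationQueue()
queue.add(QueueItem("CLM-001", "All public APIs must use TLS."))
queue.add(QueueItem("CLM-002", "Customer data may be stored in any region."))
queue.review("CLM-001", "approve", reviewer="architect")
queue.review("CLM-002", "reject", reviewer="architect")
print([i.item_id for i in queue.approved()])
```

Keeping the audit trail append-only, separate from the item status, is one way to preserve the full review history even after an item's state changes.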

Layer 2 — Product Knowledge Layer (evolving, project-specific)

Meeting transcript (e.g. "REST Y.0 agreed as new API standard")
  → topic detection (deterministic, no LLM call)
  → focused retrieval: only relevant governance rules retrieved
  → candidate update created (CAND-001 supersedes GOV-REST-001)
  → human validation via Product Process Layer
  → approved: new rule GOV-REST-002 goes live, old rule marked superseded
  → agents, user stories, and architecture artifacts updated
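
The supersede step in this flow can be sketched as a pure function over rule records. The rule IDs and the topic name come from the worked example in this README; the field names, the rule texts, and the `apply_candidate` helper are assumptions made for illustration, not the project's actual data model.

```python
import copy

# Hypothetical rule records; field names and texts are assumptions.
rules = {
    "GOV-REST-001": {
        "topic": "rest_api_standard",
        "text": "REST X.0 is the API standard",
        "status": "active",
    },
}

candidate = {
    "candidate_id": "CAND-001",
    "new_rule_id": "GOV-REST-002",
    "supersedes": "GOV-REST-001",
    "topic": "rest_api_standard",
    "text": "REST Y.0 is the API standard",
}


def apply_candidate(rules: dict, candidate: dict, approved: bool) -> dict:
    """Return a new rule set; nothing changes unless a human approved the candidate."""
    updated = copy.deepcopy(rules)
    if not approved:
        return updated
    old_id = candidate["supersedes"]
    # The old rule is kept for history and only marked as superseded.
    updated[old_id]["status"] = "superseded"
    updated[old_id]["superseded_by"] = candidate["new_rule_id"]
    updated[candidate["new_rule_id"]] = {
        "topic": candidate["topic"],
        "text": candidate["text"],
        "status": "active",
        "supersedes": old_id,
    }
    return updated


after = apply_candidate(rules, candidate, approved=True)
active = [rule_id for rule_id, rule in after.items() if rule["status"] == "active"]
print(active)
```

Because the old rule is marked rather than deleted, the history of what was once in force stays available for audit.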

Process / Agent Layer (10 agents) sits on top of both knowledge layers:

  • ProductIntakeAgent — extract context, decisions, action items from meeting transcripts
  • FocusedRetrievalAgent — retrieve only the relevant governance slice, not the full knowledge base
  • KnowledgeUpdateAgent — create targeted candidate updates without full re-ingestion
  • GovernanceCheckAgent — check artifacts and candidates against active governance rules
  • UserStoryAgent — create/update user stories from action items and governance decisions
  • ArchitectureAgent — create versioned architecture artifacts
  • ArtifactGenerationAgent — generate ADR, approval pack, governance summary
  • ApprovalAgent — create local pull requests for human review
  • BacklogSyncAgent — sync approved stories to backlog only after PR approval
  • Microsoft365IntegrationAgent — stub for Teams/SharePoint/Graph integration

Each agent receives only the validated knowledge slice it needs, not the full knowledge base.

The memory keeps everything. The agent sees the current trusted view.
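
One way to read "the memory keeps everything, the agent sees the current trusted view" is as a filter applied at read time. In the minimal sketch below, the record fields, the agent-to-domain mapping, the `trusted_view` function, and the IDs `GOV-TLS-001` and `CAND-007` are all hypothetical; only the agent names and the two REST rule IDs appear in this README.

```python
# Hypothetical memory contents: every record is kept, whatever its status.
MEMORY = [
    {"id": "GOV-REST-001", "domain": "api", "status": "superseded"},
    {"id": "GOV-REST-002", "domain": "api", "status": "active"},
    {"id": "GOV-TLS-001", "domain": "security", "status": "active"},
    {"id": "CAND-007", "domain": "api", "status": "pending"},
]

# Assumed mapping of which domains each agent needs to see.
AGENT_DOMAINS = {
    "UserStoryAgent": {"api"},
    "GovernanceCheckAgent": {"api", "security"},
}


def trusted_view(memory, agent_name):
    """Return only active records in the agent's domains; memory is left untouched."""
    domains = AGENT_DOMAINS.get(agent_name, set())
    return [r for r in memory if r["status"] == "active" and r["domain"] in domains]


print([r["id"] for r in trusted_view(MEMORY, "UserStoryAgent")])
print([r["id"] for r in trusted_view(MEMORY, "GovernanceCheckAgent")])
```

Superseded and pending records never reach an agent in this sketch, yet they remain in memory for audit and rollback.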


Demo Mode — No API Key Required

The dashboard runs fully offline with no API key, no Docker, no Neo4j, no Microsoft 365, no Azure.

  • LLM_PROVIDER=mock — deterministic demo outputs, no Anthropic / OpenAI key needed
  • NEO4J_ENABLED=false — all data stored in JSON files, no Docker required
  • M365_ENABLED=false — all Microsoft 365 connectors return safe stubs
  • All sample documents and governance rules are pre-loaded

Optional Anthropic mode: Set LLM_PROVIDER=anthropic and ANTHROPIC_API_KEY= in .env for stronger extraction and generation quality. Falls back to mock automatically if the key is missing.

Optional Ollama mode: Install Ollama, run ollama pull llama3.2:3b, then set LLM_PROVIDER=ollama for a free local LLM with no API key.


Quick Start (Mac / Linux)

cd aegis-knowledge-memory-poc
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# No changes needed for local demo — mock mode works out of the box

# Governance knowledge pipeline
python3 src/main.py --stage ingest
python3 src/main.py --stage extract
python3 src/main.py --stage conflicts
python3 src/main.py --stage validate

# Open Streamlit dashboard
streamlit run app/streamlit_app.py
# http://localhost:8501

# Optional: Run FastAPI backend
uvicorn src.api:app --reload --port 8000
# Docs at http://localhost:8000/docs

# Run all tests
python3 -m pytest tests/ -v

Quick Start (Windows)

cd aegis-knowledge-memory-poc
python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements.txt

copy .env.example .env

python src\main.py --stage ingest
python src\main.py --stage extract
python src\main.py --stage conflicts
python src\main.py --stage validate

streamlit run app\streamlit_app.py

# Optional
uvicorn src.api:app --reload
python -m pytest tests\ -v

See WINDOWS_RUNBOOK.md for full step-by-step instructions with troubleshooting.


AEGIS Standards Knowledge Graph Slide

The slide-ready explanation of how standards are encoded as a validated, versioned knowledge graph:

docs/slides/AEGIS_STANDARDS_KNOWLEDGE_GRAPH.md

Diagram source (Mermaid):

docs/diagrams/aegis_standards_knowledge_graph.mmd

HTML slide (open directly in browser, editable, print-to-PDF ready):

docs/html/AEGIS_Standards_Knowledge_Graph.html

Speaker notes and stakeholder Q&A:

docs/slides/AEGIS_STANDARDS_KNOWLEDGE_GRAPH_NOTES.md


Token Reduction Slide

A slide-ready explanation of how AEGIS reduces token consumption and hallucination is available here:

docs/SLIDES_TOKEN_REDUCTION.md

The slide deck covers:

  • Why normal RAG is expensive and increases hallucination risk
  • How AEGIS focused graph retrieval works
  • The REST X.0 → REST Y.0 worked example
  • Illustrative token reduction numbers (~85–95%)
  • Why validated memory reduces hallucination
  • Business value for enterprise clients

LLM Provider Options

| Provider | Config | Requirement |
|---|---|---|
| mock (default) | LLM_PROVIDER=mock | None; works offline |
| ollama | LLM_PROVIDER=ollama | Ollama running locally (ollama pull llama3.2:3b) |
| bedrock | LLM_PROVIDER=bedrock + AWS credentials | AWS account with Bedrock model access enabled |
| anthropic | LLM_PROVIDER=anthropic + ANTHROPIC_API_KEY=sk-ant-... | Anthropic API key (direct) |
| openai | LLM_PROVIDER=openai + OPENAI_API_KEY=sk-... | OpenAI API key |

AWS Bedrock is the recommended real-LLM option if you have an AWS account: it needs no Anthropic API key and uses your existing AWS credentials. Enable Claude model access once in the Bedrock console, then run pip install boto3 and set LLM_PROVIDER=bedrock.

All providers fall back to mock automatically if credentials are missing, so the demo still runs without them.
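
The fallback behaviour can be sketched as a small resolver over environment variables. This is an assumption-laden illustration rather than the contents of `src/llm_provider.py`: the `resolve_provider` function and the `REQUIRED_KEY` table are hypothetical, and credential checks for ollama and bedrock are deliberately not modelled because this README does not describe them.

```python
import os

# Assumed mapping from provider name to the API-key variable it needs.
REQUIRED_KEY = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}

# Providers passed through unchanged in this sketch (no credential check modelled).
PASS_THROUGH = {"mock", "ollama", "bedrock"}


def resolve_provider(env=None):
    """Pick the provider from LLM_PROVIDER, falling back to mock when its key is missing."""
    env = os.environ if env is None else env
    requested = env.get("LLM_PROVIDER", "mock").strip().lower()
    if requested in PASS_THROUGH:
        return requested
    key_name = REQUIRED_KEY.get(requested)
    if key_name is None or not env.get(key_name):
        # Unknown provider, or the key is absent or empty: use the offline mock.
        return "mock"
    return requested


print(resolve_provider({"LLM_PROVIDER": "anthropic"}))
print(resolve_provider({"LLM_PROVIDER": "anthropic", "ANTHROPIC_API_KEY": "sk-ant-test"}))
```

Passing the environment in as a plain dictionary keeps the resolver easy to test without touching real credentials.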


Neo4j Graph Storage

Neo4j is optional. The default is JSON file storage, which works without Docker.

| Mode | Config | Requirement |
|---|---|---|
| Local JSON (default) | NEO4J_ENABLED=false | None |
| Neo4j graph | NEO4J_ENABLED=true | Docker running Neo4j |

To start Neo4j:

docker-compose up neo4j -d
# Neo4j Browser: http://localhost:7474  User: neo4j  Password: password

Pipeline Stages

| Stage | Command | Output |
|---|---|---|
| Ingest | --stage ingest | data/processed/ingestion_manifest.json |
| Extract | --stage extract | extracted_claims.json, extracted_decisions.json, extracted_entities.json |
| Resolve | --stage resolve | resolved_entities.json |
| Conflicts | --stage conflicts | conflicts.json |
| Validate | --stage validate | validation_queue.json (queue for UI review) |
| Artifacts | --stage artifacts | data/processed/artifacts/ART-ADR-001.md, ART-ARCHSUM-001.md |
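
A `--stage` flag of this kind is commonly implemented with `argparse` and a fixed list of choices. The sketch below assumes that pattern; it is not the contents of `src/main.py`, and `run_stage` is a hypothetical placeholder that only reports which stage was requested.

```python
import argparse

# Stage names as listed in the Pipeline Stages section.
STAGES = ["ingest", "extract", "resolve", "conflicts", "validate", "artifacts"]


def run_stage(name):
    """Placeholder handler: a real stage would read and write files under data/processed/."""
    return f"stage '{name}' finished"


def build_parser():
    parser = argparse.ArgumentParser(description="Run one pipeline stage (illustrative sketch)")
    # choices= makes argparse reject any stage name that is not in the list.
    parser.add_argument("--stage", required=True, choices=STAGES)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return run_stage(args.stage)


# Passing an explicit argument list keeps the example independent of sys.argv.
print(main(["--stage", "ingest"]))
```

Restricting `--stage` to known choices means a typo fails immediately with a usage message instead of silently doing nothing.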

Run Tests

python3 -m pytest tests/ -v
# All tests pass without any API key or Docker

Tests cover: chunking, claim extraction, decision extraction, conflict detection, widget key uniqueness, Neo4j fallback, mock LLM, ingest/validate stages, product process layer (focused retrieval, targeted update, agent simulation, pull requests, token reduction).


Test Data

Six sample documents are provided in data/raw/:

| File | Type | Purpose |
|---|---|---|
| 2012_shopping_portal_architecture.md | Legacy architecture | 2012 AWS 3-tier shopping portal (EC2+ELB+RDS) |
| 2012_kyc_platform_architecture.md | Legacy architecture | 2012 hybrid KYC platform (AWS + on-prem Oracle) |
| togaf_architecture_template.md | Governance template | TOGAF 9 ADM templates and artefact formats |
| agentic_architecture_reference_2021_2022.md | Reference | LangChain, LlamaIndex, ReAct, MRKL (2022) |
| sample_meeting_transcript.md | Meeting transcript | REST API → pull-based integration decision change |
| sample_rfp_extract.md | RFP | Architecture requirements (cloud, security, integration) |

All documents are labelled [SYNTHETIC] where not sourced directly from public references.


Dashboard Pages

| Page | What it shows |
|---|---|
| Knowledge Pull Request | Validation queue: approve / reject / escalate claims and decisions |
| Conflict Register | Detected contradictions between claims, with severity and recommendation |
| Audit Trail | Full history of every approval, rejection, and escalation |
| Approved Knowledge | All approved claims and decisions, browsable by domain |
| Pipeline Status | Stage results, LLM provider mode, Neo4j status |
| Product Process Layer | New two-layer demo: simulate meeting event, focused retrieval, candidate governance updates, PR approval, backlog sync, token reduction metrics |

Architecture Overview

                    ┌──────────────────────────────────┐
                    │       PROCESS / AGENT LAYER       │
                    │  ProductIntake · GovernanceCheck   │
                    │  UserStory · Architecture          │
                    │  Approval · BacklogSync            │
                    └────────────┬─────────────────────┘
                                 │ receives focused slice only
              ┌──────────────────┴───────────────────┐
              │                                       │
    ┌─────────▼──────────┐              ┌─────────────▼──────────┐
    │  GOVERNANCE LAYER  │              │   PRODUCT LAYER         │
    │  governance_rules  │              │   user_stories          │
    │  (9 rules, stable) │              │   architecture_artifacts│
    │  focused_context   │              │   product_context       │
    │  targeted_update   │              │   backlog               │
    └────────────────────┘              └────────────────────────┘
              ▲                                       ▲
              │                                       │
    ┌─────────┴─────────────────────────────────────┐
    │            KNOWLEDGE INGESTION PIPELINE         │
    │  ingestion → chunking → extraction (LLM)        │
    │  → entity_resolution → conflict_detection       │
    │  → validation_workflow → graph_writer (opt)     │
    └─────────────────────────────────────────────────┘
              ▲
    [raw documents + meeting transcripts]

Focused retrieval in action (REST X.0 → REST Y.0 demo):

  • Transcript arrives: "REST Y.0 agreed as new API standard"
  • Topic detected: rest_api_standard (keyword match, no LLM)
  • Governance retrieval: 1 of 9 rules (GOV-REST-001 only — not TLS, GDPR, AI rules)
  • Token reduction: ~95% vs full-knowledge-base RAG
  • Candidate GOV-REST-002 created, human approves, GOV-REST-001 marked superseded
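
The first four steps above can be sketched in a few lines. Only the topic name `rest_api_standard`, the rule ID `GOV-REST-001`, and the transcript sentence come from this README; the keyword table, the other two rules, the rule texts, and the word-count token estimate are assumptions chosen to keep the example small. The toy saving it prints is not comparable to the ~95% figure, which depends on the real rule set.

```python
# Hypothetical keyword table for deterministic topic detection.
TOPIC_KEYWORDS = {
    "rest_api_standard": ["rest", "api standard"],
    "transport_security": ["tls", "encryption"],
    "data_protection": ["gdpr", "personal data"],
}

# Hypothetical rule texts; the real knowledge base holds nine rules.
RULES = [
    {"id": "GOV-REST-001", "topic": "rest_api_standard",
     "text": "REST X.0 is the approved API standard for new services."},
    {"id": "GOV-TLS-001", "topic": "transport_security",
     "text": "All external traffic must use TLS."},
    {"id": "GOV-GDPR-001", "topic": "data_protection",
     "text": "Personal data must stay in approved regions."},
]


def detect_topics(transcript):
    """Deterministic substring match on keywords; no model call involved."""
    lowered = transcript.lower()
    return {topic for topic, words in TOPIC_KEYWORDS.items()
            if any(word in lowered for word in words)}


def focused_rules(transcript, rules):
    """Keep only the rules whose topic was detected in the transcript."""
    topics = detect_topics(transcript)
    return [rule for rule in rules if rule["topic"] in topics]


def rough_tokens(rules):
    """Crude estimate: whitespace-separated words stand in for tokens."""
    return sum(len(rule["text"].split()) for rule in rules)


transcript = "REST Y.0 agreed as new API standard"
selected = focused_rules(transcript, RULES)
saving = 1 - rough_tokens(selected) / rough_tokens(RULES)
print([rule["id"] for rule in selected], f"{saving:.0%} fewer tokens in this toy example")
```

Plain substring matching is cheap and reproducible, but it can over-match (for example, "rest" inside "interest"), so a production keyword matcher would likely need word boundaries or a curated vocabulary.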

Project Structure

aegis-knowledge-memory-poc/
├── README.md
├── WINDOWS_RUNBOOK.md                      ← Windows setup + troubleshooting
├── DEMO_NOTES.md                           ← Demo script
├── .env.example                            ← Copy to .env — no changes needed for demo
├── requirements.txt
│
├── src/
│   ├── main.py                             ← Pipeline (--stage flag)
│   ├── llm_provider.py                     ← LLM abstraction (mock/ollama/anthropic/openai)
│   ├── focused_context.py                  ← Focused knowledge slice retrieval
│   ├── targeted_update.py                  ← Targeted governance update
│   ├── token_estimator.py                  ← Token comparison (RAG vs. focused)
│   ├── agent_orchestrator.py               ← Full multi-agent event orchestration
│   ├── agent_process_layer.py              ← Legacy agent functions (kept for compatibility)
│   ├── event_triggers.py                   ← Event simulation
│   ├── api.py                              ← FastAPI backend (15 endpoints)
│   ├── agents/                             ← Individual agent classes
│   │   ├── __init__.py
│   │   ├── base_agent.py
│   │   ├── product_intake_agent.py
│   │   ├── focused_retrieval_agent.py
│   │   ├── governance_check_agent.py
│   │   ├── knowledge_update_agent.py
│   │   ├── user_story_agent.py
│   │   ├── architecture_agent.py
│   │   ├── artifact_generation_agent.py
│   │   ├── approval_agent.py
│   │   ├── backlog_sync_agent.py
│   │   └── microsoft365_integration_agent.py
│   └── integrations/
│       ├── microsoft365.py                 ← Real M365 client (live mode)
│       └── microsoft365/                   ← Structured M365 stubs
│           ├── m365_config.py
│           ├── graph_auth.py
│           ├── sharepoint_connector.py
│           ├── teams_transcript_connector.py
│           ├── power_automate_webhook.py
│           └── copilot_connector_stub.py
│
├── data/
│   ├── raw/                                ← Input documents
│   ├── processed/                          ← Extracted claims/decisions/conflicts
│   ├── documents/                          ← Meeting transcripts
│   └── product/                            ← Product knowledge layer
│       ├── governance_rules.json
│       ├── product_context.json
│       ├── user_stories.json
│       ├── architecture_artifacts.json
│       ├── product_decisions.json
│       ├── backlog.json
│       ├── pull_requests.json
│       └── candidate_updates.json
│
├── outputs/product/                        ← Agent-generated artifacts
│   ├── updated_user_stories.md
│   ├── architecture_artifact_v2.md
│   ├── governance_check_summary.md
│   ├── approval_pack.md
│   └── adr_rest_y0.md
│
├── docs/
│   ├── SLIDES_TOKEN_REDUCTION.md           ← Slide-ready: token & hallucination reduction
│   ├── DEPLOYMENT_PLAN.md                  ← 5-stage deployment roadmap
│   ├── SECURITY_AND_GOVERNANCE.md          ← Security, GDPR, audit, RBAC plan
│   ├── MICROSOFT_365_COPILOT_INTEGRATION.md
│   ├── TOKEN_CONSUMPTION_AND_HALLUCINATION_REDUCTION.md
│   └── KNOWLEDGE_MEMORY_MODEL.md
│
├── app/
│   ├── streamlit_app.py                    ← Dashboard (~2750 lines)
│   └── ai-plugin.json                      ← Copilot Studio plugin manifest
│
└── tests/
    ├── test_claim_extraction.py
    ├── test_decision_extraction.py
    ├── test_conflict_detection.py
    ├── test_demo_hardening.py
    ├── test_product_process_layer.py        ← 45 tests (focused retrieval, agents, PRs)
    └── test_agents_and_orchestrator.py      ← 50 tests (agent layer + orchestrator + API)

What to Show in the Demo

  1. Open data/raw/sample_meeting_transcript.md — show where the REST API decision changes to pull-based
  2. Show data/processed/extracted_decisions.json — the structured decision extracted from that transcript
  3. Show data/processed/conflicts.json — the GDPR vs. AWS S3 conflict the system detected automatically
  4. Open Streamlit at localhost:8501 — validate queue: approve one claim, reject another, see audit trail update
  5. Open data/processed/artifacts/ART-ADR-001.md — the ADR generated from the validated decision

The key message: documents came in → claims and decisions came out → conflict surfaced automatically → human validated → ADR generated. That is the end-to-end pipeline.


Microsoft 365 / Copilot integration path

The current POC runs locally. No Microsoft 365 access is required to run the demo.

In an enterprise deployment, the same workflow can be connected to Microsoft 365:

  • SharePoint / OneDrive — architecture documents and working drafts as document sources
  • Teams meeting transcripts — decision extraction from architecture meetings via Graph API
  • Power Automate — trigger AEGIS ingestion when documents change or meetings end
  • Power Apps — replace or complement the Streamlit validation dashboard
  • Copilot Studio — build an agent that calls AEGIS APIs for validated context
  • Teams — architects review and approve decisions without leaving Teams

No new platform is needed. The workflow builds on Microsoft 365 licences clients already hold.

See docs/MICROSOFT_365_COPILOT_INTEGRATION.md for the full integration design including Graph API endpoints, Power Automate flow examples, and Copilot Studio integration options.


Extending for Production

| Capability | What to add |
|---|---|
| PDF/DOCX ingestion | PyMuPDF / python-docx parsers in ingestion.py |
| Teams transcript auto-ingestion | Microsoft Graph API; see src/integrations/microsoft365_stub.py |
| SharePoint document sync | Power Automate + POST /ingest/document API |
| Vector search for entity resolution | Replace LLM similarity with Qdrant embeddings |
| Production graph | Neo4j Enterprise or Memgraph |
| Audit store | PostgreSQL (replace JSON file store in validation_workflow.py) |
| Agent integration | Copilot Studio / Agent 365; consume context graphs via API |
| LLM Wiki sync | Generate Confluence pages from approved stationary knowledge |

Ontology

  • 28 node types — Document, Claim, Decision, Conflict, Policy, Risk, System, Artifact, and more
  • 33 edge types — CONTAINS, SUPERSEDES, CONFLICTS_WITH, APPROVED, GENERATED_FROM, and more
  • Full definitions in ontology/node_types.yaml and ontology/edge_types.yaml
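
A typed ontology makes it possible to reject malformed graph writes early. The sketch below checks an edge against a subset of the type names listed above; the `check_edge` helper is hypothetical, the sets are hard-coded rather than loaded from the YAML files, and no source/target constraints per edge type are assumed because this README does not state them.

```python
# Subset of the node and edge type names listed in the Ontology section.
NODE_TYPES = {"Document", "Claim", "Decision", "Conflict", "Policy", "Risk", "System", "Artifact"}
EDGE_TYPES = {"CONTAINS", "SUPERSEDES", "CONFLICTS_WITH", "APPROVED", "GENERATED_FROM"}


def check_edge(source_type, edge_type, target_type):
    """Return a list of problems; an empty list means the edge uses known types."""
    problems = []
    if source_type not in NODE_TYPES:
        problems.append(f"unknown node type: {source_type}")
    if target_type not in NODE_TYPES:
        problems.append(f"unknown node type: {target_type}")
    if edge_type not in EDGE_TYPES:
        problems.append(f"unknown edge type: {edge_type}")
    return problems


print(check_edge("Decision", "SUPERSEDES", "Decision"))
print(check_edge("Decision", "REPLACES", "Memo"))
```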

About

AEGIS Business Analyst Platform — AI-assisted requirements management and delivery intelligence
