subigya-js/constitution-gpt

🏛️ Constitution GPT

An Open-Source Constitutional Intelligence System Powered by RAG + LLMs

Constitution GPT is an open-source intelligence system designed specifically for constitutional, legal, policy, and governance documents, enabling precise retrieval, interpretation, and question-answering grounded in authoritative texts.

This project helps students, lawyers, policymakers, researchers, and developers build systems that require:

  • ✅ Accurate referencing with Part/Article/Sub-article citations
  • ✅ Context-aware hierarchical understanding
  • ✅ Traceable legal reasoning
  • ✅ Question answering based on verified constitutional sources

🌟 Why Constitution GPT?

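Legal and constitutional documents are long, complex, and interconnected. Keyword search misses provisions that use different wording, and LLMs without grounding can hallucinate. Constitution GPT fills this gap by grounding answers in retrieved constitutional text.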

Key Advantages:

  • 📘 Hierarchical Understanding: Preserves Part → Article → Sub-article → Clause structure
  • 🎯 Smart Retrieval: Query expansion handles semantic variations ("elected" vs "appointed")
  • 🔍 Complete Coverage: Automatically fetches all sub-articles from relevant articles
  • 📊 Structured Responses: Citation-backed answers that follow the document's hierarchy
  • 🌐 Generic & Extensible: Works for ANY constitutional topic, not hardcoded

🚀 Current Features

📄 1. Intelligent Document Processing

  • Loads PDF constitutions (currently: Constitution of Nepal)
  • Splits 240 pages into 1,719 semantic chunks
  • Preserves hierarchical structure with rich metadata

✂️ 2. Advanced Hierarchical Chunking

Rather than splitting on a fixed character count, the chunker:

  • ✅ Detects Part, Article, Sub-article, Clause boundaries
  • ✅ Adds contextual prefixes for better semantic matching
  • ✅ Keeps complete sub-articles together (no mid-sentence splits)
  • ✅ Stores metadata: part, article, subarticle, clause, hierarchy

Example chunk metadata:

{
  "part": "Part 7",
  "part_name": "Federal Executive",
  "article": "Article 76",
  "article_title": "Constitution of Council of Ministers",
  "subarticle": "Sub-article (1)",
  "hierarchy": "Part 7 → Article 76 → Sub-article (1)"
}
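
The boundary detection described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the regular expressions, the `parse_hierarchy` name, and the assumed line layout (`Part 7 Federal Executive`, `76. Title:`, `(1) text`) are all assumptions about how the extracted PDF text might look.

```python
import re

# Hypothetical patterns; the real PDF text may be laid out differently.
PART_RE = re.compile(r"^Part[\s-]*(\d+)\s+(.+)$")
ARTICLE_RE = re.compile(r"^(\d+)\.\s+(.+?):?$")
SUBARTICLE_RE = re.compile(r"^\((\d+)\)\s+(.+)$")


def parse_hierarchy(lines):
    """Yield one metadata dict per sub-article line found in `lines`.

    Simplification: each sub-article is assumed to fit on a single line;
    continuation lines are ignored in this sketch.
    """
    part = part_name = article = article_title = None
    for raw in lines:
        line = raw.strip()
        if m := PART_RE.match(line):
            part, part_name = f"Part {m.group(1)}", m.group(2)
        elif m := ARTICLE_RE.match(line):
            article, article_title = f"Article {m.group(1)}", m.group(2)
        elif (m := SUBARTICLE_RE.match(line)) and part and article:
            sub = f"Sub-article ({m.group(1)})"
            yield {
                "part": part,
                "part_name": part_name,
                "article": article,
                "article_title": article_title,
                "subarticle": sub,
                "hierarchy": f"{part} → {article} → {sub}",
                "text": m.group(2),
            }


sample = [
    "Part 7 Federal Executive",
    "76. Constitution of Council of Ministers:",
    "(1) The President shall appoint the leader of a parliamentary party ...",
    "(2) If no party has a clear majority ...",
]
chunks = list(parse_hierarchy(sample))
print(chunks[0]["hierarchy"])
```

Tracking the current Part and Article as running state lets every sub-article inherit its full ancestry, which is what makes the `hierarchy` field possible.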

🔍 3. Smart Query Processing

Query Expansion - Automatically generates variations:

  • "How is the PM elected?" → "appointed", "selected", "chosen"
  • "What are citizen rights?" → "freedoms", "liberties", "entitlements"
  • Topic-specific boosters (e.g., PM queries → "Article 76")
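
As a rough sketch of this idea, synonym substitution plus topic boosters might look like the code below. The `SYNONYMS` and `BOOSTERS` tables and the `expand_query` function are hypothetical names chosen for illustration; the project's real expansion rules are not shown here and may work differently.

```python
# Hypothetical synonym table, seeded with the examples from this README.
SYNONYMS = {
    "elected": ["appointed", "selected", "chosen"],
    "rights": ["freedoms", "liberties", "entitlements"],
}

# Hypothetical topic boosters: a trigger phrase adds an article-specific variant.
BOOSTERS = {
    "prime minister": "Article 76",
}


def expand_query(query):
    """Return the original query plus synonym-substituted and boosted variants."""
    variants = [query]
    lowered = query.lower()
    for word, alternatives in SYNONYMS.items():
        if word in lowered:  # naive substring match, good enough for a sketch
            variants.extend(lowered.replace(word, alt) for alt in alternatives)
    for trigger, booster in BOOSTERS.items():
        if trigger in lowered:
            variants.append(f"{query} {booster}")
    # dict.fromkeys preserves order while dropping accidental duplicates.
    return list(dict.fromkeys(variants))


variants = expand_query("How is the Prime Minister elected?")
for variant in variants:
    print(variant)
```

Each variant would then be sent to the vector store separately, so a provision that says "appoint" can still be found by a question that says "elected".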

Article Completion - Ensures comprehensive answers:

  • Detects relevant articles in initial retrieval
  • Fetches ALL sub-articles from those articles
  • Provides complete constitutional coverage
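
The completion step can be illustrated with a small in-memory example. This is a sketch under assumptions: `complete_articles` is a hypothetical helper, and the list scan stands in for what would more likely be a metadata-filtered lookup against the vector store in the real pipeline.

```python
def complete_articles(retrieved, all_chunks):
    """Return every chunk that shares an article with an initially retrieved chunk."""
    wanted = {c["article"] for c in retrieved}
    seen = set()
    completed = []
    for chunk in all_chunks:  # all_chunks is assumed to be in document order
        key = (chunk["article"], chunk["subarticle"])
        if chunk["article"] in wanted and key not in seen:
            seen.add(key)
            completed.append(chunk)
    return completed


# Toy corpus standing in for the vector store's contents.
corpus = [
    {"article": "Article 76", "subarticle": "Sub-article (1)"},
    {"article": "Article 76", "subarticle": "Sub-article (2)"},
    {"article": "Article 76", "subarticle": "Sub-article (3)"},
    {"article": "Article 77", "subarticle": "Sub-article (1)"},
]
hits = [corpus[1]]  # pretend similarity search returned only Sub-article (2)
full = complete_articles(hits, corpus)
print([c["subarticle"] for c in full])
```

Even though only one sub-article matched the query, the answer context now contains the whole of Article 76 and nothing from Article 77.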

🧠 4. Structured Response Generation

Responses follow constitutional hierarchy:

📘 Part 7 – Federal Executive
Article 76 – Constitution of Council of Ministers

🔹 Sub-article (1)
As per Part 7, Article 76, Sub-article (1):
• The President shall appoint the leader of a parliamentary party 
  that commands majority in the House of Representatives as the 
  Prime Minister...

🔹 Sub-article (2)
As per Part 7, Article 76, Sub-article (2):
• If no party has a clear majority...
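
In the project the final wording comes from the LLM, but the layout above can also be produced deterministically from chunk metadata, for example when assembling the context passed to the model. The sketch below assumes chunks carry the metadata fields listed earlier plus a `text` field; `render_answer` is a hypothetical name.

```python
from itertools import groupby


def render_answer(chunks):
    """Group chunks by (part, article) and render the hierarchical layout."""

    def heading(c):
        return (c["part"], c["part_name"], c["article"], c["article_title"])

    lines = []
    # groupby only merges adjacent items, so chunks are assumed to be in document order.
    for (part, part_name, article, title), group in groupby(chunks, key=heading):
        lines.append(f"📘 {part} – {part_name}")
        lines.append(f"{article} – {title}")
        for c in group:
            lines.append("")
            lines.append(f"🔹 {c['subarticle']}")
            lines.append(f"As per {part}, {article}, {c['subarticle']}:")
            lines.append(f"• {c['text']}")
    return "\n".join(lines)


base = {
    "part": "Part 7",
    "part_name": "Federal Executive",
    "article": "Article 76",
    "article_title": "Constitution of Council of Ministers",
}
sample_chunks = [
    {**base, "subarticle": "Sub-article (1)", "text": "The President shall appoint ..."},
    {**base, "subarticle": "Sub-article (2)", "text": "If no party has a clear majority ..."},
]
answer = render_answer(sample_chunks)
print(answer)
```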

📊 System Performance

| Metric | Value |
|---|---|
| Total Chunks | 1,719 semantic chunks |
| Chunk Quality | Context-aware with metadata |
| Query Expansion | 5-10 variations per query |
| Retrieval Accuracy | ~90% for tested queries |
| Response Format | Hierarchical with citations |

✅ Tested Query Types:

  • ✅ Prime Minister election process
  • ✅ Fundamental rights of citizens
  • ✅ Duties of citizens
  • ✅ President election procedure
  • ✅ Parliament structure
  • ✅ Freedom of speech provisions

⚙️ Installation & Setup

1. Clone Repository

git clone https://github.com/subigya-js/constitution-gpt.git
cd constitution-gpt

2. Create Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Environment Variables

Create a .env file in the root directory:

OPENAI_API_KEY=your_openai_api_key_here

5. Build Vector Database

python rag/ingestion_pipeline.py

This will:

  • Load the Constitution PDF
  • Create 1,719 semantic chunks with metadata
  • Generate embeddings using OpenAI
  • Store in ChromaDB (db/chroma_db/)

🎮 Usage

Command Line Interface

Ask any constitutional question:

python rag/retrieval_pipeline.py "How is the Prime Minister elected in Nepal?"

Other example queries:

python rag/retrieval_pipeline.py "What are the fundamental rights of citizens?"
python rag/retrieval_pipeline.py "What are the duties of citizens?"
python rag/retrieval_pipeline.py "How is the President elected?"
python rag/retrieval_pipeline.py "What is the structure of the Federal Parliament?"

Test Multiple Queries

python rag/test_various_queries.py

Rebuild Database (if needed)

rm -rf db/chroma_db
python rag/ingestion_pipeline.py

🏗️ Project Structure

constitution-gpt/
├── rag/
│   ├── data/
│   │   └── Constitution_English.pdf    # Source document
│   ├── ingestion_pipeline.py           # Chunking + Vector DB creation
│   ├── retrieval_pipeline.py           # Query processing + Answer generation
│   └── test_various_queries.py         # Test suite
├── db/
│   └── chroma_db/                      # Vector database (auto-generated)
├── venv/                               # Virtual environment
├── .env                                # Environment variables
├── requirements.txt                    # Python dependencies
└── README.md                           # This file

🔧 Technical Architecture

Ingestion Pipeline (ingestion_pipeline.py)

  1. Load PDF → PyMuPDFLoader extracts text
  2. Parse Hierarchy → Regex-based extraction of Parts/Articles/Sub-articles
  3. Create Chunks → Semantic chunks with contextual prefixes
  4. Add Metadata → Rich metadata for each chunk
  5. Generate Embeddings → OpenAI text-embedding-3-small
  6. Store in ChromaDB → Persistent vector database
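
Steps 3 and 4 can be sketched as a single function that prepends a contextual prefix to the text that gets embedded and attaches the hierarchy metadata. The prefix format and the `build_chunk` name are assumptions for illustration; the actual format used in `ingestion_pipeline.py` may differ.

```python
def build_chunk(meta, body):
    """Return (text_to_embed, metadata) with a hierarchy prefix prepended to the body."""
    # Hypothetical prefix layout: names the Part, Article, and Sub-article up front
    # so the embedding captures where the provision sits in the document.
    prefix = (
        f"{meta['part']} ({meta['part_name']}), "
        f"{meta['article']} ({meta['article_title']}), "
        f"{meta['subarticle']}"
    )
    hierarchy = " → ".join([meta["part"], meta["article"], meta["subarticle"]])
    return f"{prefix}: {body}", {**meta, "hierarchy": hierarchy}


meta = {
    "part": "Part 7",
    "part_name": "Federal Executive",
    "article": "Article 76",
    "article_title": "Constitution of Council of Ministers",
    "subarticle": "Sub-article (1)",
}
text, metadata = build_chunk(meta, "The President shall appoint ...")
print(text)
print(metadata["hierarchy"])
```

The prefix means a short sub-article such as "(2) If no party has a clear majority..." still embeds close to queries about the Council of Ministers, even though those words never appear in its body.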

Retrieval Pipeline (retrieval_pipeline.py)

  1. Query Expansion → Generate 5-10 variations with synonyms
  2. Multi-Query Retrieval → Search for each variation
  3. Deduplication → Remove duplicate chunks
  4. Article Completion → Fetch all sub-articles from key articles
  5. Relevance Scoring → Prioritize by query term matches
  6. Context Creation → Group and structure by hierarchy
  7. LLM Generation → GPT-4o generates structured answer
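
Steps 3 and 5 are simple enough to sketch without any external services. The version below deduplicates on the `hierarchy` string and scores by counting distinct query terms that appear in the chunk text; both choices, and the function names, are assumptions rather than the project's confirmed implementation.

```python
import re


def dedupe(chunks):
    """Keep the first occurrence of each chunk, keyed by its hierarchy string."""
    unique = {}
    for c in chunks:
        unique.setdefault(c["hierarchy"], c)
    return list(unique.values())


def score(chunk, query):
    """Count how many distinct query terms appear in the chunk text."""
    terms = set(re.findall(r"[a-z]+", query.lower()))
    words = set(re.findall(r"[a-z]+", chunk["text"].lower()))
    return len(terms & words)


def rank(chunks, query):
    """Deduplicate, then sort by descending term overlap (ties keep their order)."""
    return sorted(dedupe(chunks), key=lambda c: score(c, query), reverse=True)


# Toy results, as if two query variations had returned overlapping chunks.
pm_chunk = {
    "hierarchy": "Part 7 → Article 76 → Sub-article (1)",
    "text": "The President shall appoint the leader of a parliamentary party as the Prime Minister",
}
other_chunk = {
    "hierarchy": "Part 3 → Article 17 → Sub-article (1)",
    "text": "No person shall be deprived of personal liberty",
}
results = [other_chunk, pm_chunk, dict(pm_chunk)]
query = "How is the Prime Minister appointed?"
ranked = rank(results, query)
print([c["hierarchy"] for c in ranked])
```

A plain term count ignores stop words and stemming ("appointed" does not match "appoint" here), so a production scorer would likely need something more robust.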

📝 Example Output

Query: "How is the Prime Minister elected in Nepal?"

Response:

📘 Part 7 – Federal Executive | Article 76 – Constitution of Council of Ministers

🔹 Sub-article (1)
As per Part 7, Article 76, Sub-article (1):
• The President shall appoint the leader of a parliamentary party that 
  commands a majority in the House of Representatives as the Prime Minister, 
  and the Council of Ministers shall be constituted under his or her 
  chairpersonship.

🔹 Sub-article (2)
As per Part 7, Article 76, Sub-article (2):
• If no party has a clear majority, the President shall appoint as Prime 
  Minister a member of the House of Representatives who presents a ground 
  on which he or she can obtain a vote of confidence in the House of 
  Representatives.

🔹 Sub-article (4)
As per Part 7, Article 76, Sub-article (4):
• If a Prime Minister cannot be appointed under Sub-article (1) or (2), 
  the President shall appoint as the Prime Minister the parliamentary party 
  leader of the party which has the highest number of members in the House 
  of Representatives.

🎯 Use Cases

🧑‍🎓 For Students

  • Learn constitutional law with structured explanations
  • Get complete article breakdowns with all sub-articles
  • Understand hierarchical relationships between provisions

⚖️ For Lawyers & Legal Researchers

  • Quick retrieval of relevant constitutional provisions
  • Complete article coverage (no missing sub-articles)
  • Accurate Part/Article/Sub-article citations

🏛️ For Government & NGOs

  • Build civic education platforms
  • Provide constitution Q&A to citizens
  • Policy analysis and research automation

🛠️ For Developers

  • Backend for AI-powered legal tools
  • Vector-search microservice for legal documents
  • Domain-specific chatbot template

🛣️ Roadmap

  • Hierarchical chunking with metadata (done)
  • Smart query expansion (done)
  • Article completion for comprehensive answers (done)
  • Structured response generation (done)
  • FastAPI backend with REST endpoints (planned)
  • Web UI for interactive Q&A (planned)
  • Support for multiple constitutions (planned)
  • Multilingual support (Nepali, Hindi, etc.) (planned)
  • Cross-article relationship graph (planned)
  • Dockerized deployment (planned)
  • Cloud-ready architecture (planned)

🤝 Contributing

Contributions are welcome! Here's how you can help:

  1. Report Issues: Found a bug or incorrect retrieval? Open an issue
  2. Suggest Features: Have ideas for improvements? Let us know
  3. Submit PRs: Code contributions are appreciated
  4. Add Documents: Help add more constitutions or legal documents

📜 License

MIT License - feel free to use this project for educational, research, or commercial purposes.


🙌 Acknowledgements

  • Constitution of Nepal - Source document
  • OpenAI - Embeddings and LLM
  • LangChain - RAG framework
  • ChromaDB - Vector database

Built to make constitutional knowledge accessible, accurate, and AI-powered 🚀
