Skip to content

Latest commit

 

History

History
167 lines (128 loc) · 5.2 KB

File metadata and controls

167 lines (128 loc) · 5.2 KB

Enhanced Legal RAG System - Complex Question Handling

🚀 What's New:

English-Only System

  • Removed all Arabic language preferences
  • Optimized for English legal documents
  • Better English embedding model (all-MiniLM-L6-v2)

Enhanced for Complex Legal Questions

  • Smart Chunking: Splits by articles/sections instead of arbitrary word counts
  • Context-Aware Prompting: Different prompts for different question types
  • Multi-Step Reasoning: Breaks down complex legal scenarios
  • Source Attribution: Shows which articles/pages were used

🎯 Handling Complex Questions Like:

"If I did X case then what article should be implemented on me?"

The system now:

  1. Detects Case Scenarios - Recognizes "if", "case", "scenario" keywords
  2. Identifies Article Requests - Looks for "article", "section", "law" terms
  3. Applies Legal Reasoning - Uses specialized prompt for case analysis
  4. Provides Step-by-Step Analysis:
    • Identifies relevant articles
    • Explains conditions that trigger each article
    • Determines which specific article applies
    • Provides reasoning for the conclusion

🔧 Key Improvements:

1. Smart Document Chunking

# Old way: Split by word count
chunks = text.split()[:500]

# New way: Split by legal structure
article_pattern = r'(Article\s+\d+|Section\s+\d+|Chapter\s+\d+)'
sections = re.split(article_pattern, text)

2. Enhanced Prompting

# For case scenarios:
"""
For this case scenario question:
1. First identify the relevant articles/sections that apply
2. Explain what conditions or circumstances trigger each article
3. Determine which specific article(s) would be implemented
4. Provide the reasoning for your conclusion
"""

3. Better Context Retrieval

  • Retrieves 7 chunks instead of 5
  • Includes metadata (page numbers, sections)
  • Scores relevance more accurately

📋 Requirements for Better Performance:

1. Ollama Model Recommendations:

For Legal Documents:

# Best for legal reasoning
ollama pull llama3.1:70b  # Most capable for complex legal analysis

# Good alternatives:
ollama pull qwen2.5:32b   # Good for structured reasoning
ollama pull mixtral:8x7b  # Excellent for multi-step analysis
ollama pull codellama:34b # Good for rule-based reasoning

For English Documents:

ollama pull llama3.2:8b   # Current default, good balance
ollama pull mistral:7b    # Fast and accurate for English

2. Hardware Recommendations:

Minimum:

  • 16GB RAM
  • 8GB VRAM (for larger models)
  • SSD storage

Optimal:

  • 32GB RAM
  • 16GB VRAM
  • NVME SSD

3. Model Configuration:

Update your Ollama model in the code:

# For better legal reasoning
response = ollama.chat(
    model="llama3.1:70b",  # or mixtral:8x7b
    messages=[{'role': 'user', 'content': prompt}],
    options={
        'temperature': 0.1,    # More precise
        'top_p': 0.9,         # More focused
        'num_ctx': 4096       # Longer context
    }
)

4. Document Preprocessing:

For Legal Documents:

  • Convert Arabic to English if needed
  • Ensure clear article/section numbering
  • Remove headers/footers that might confuse the system
  • Use clean, well-formatted PDFs

🎯 Example Complex Questions the System Can Handle:

  1. Case-Based Questions:

    • "If someone violates Article 5, what penalties apply?"
    • "In a governance dispute scenario, which articles are relevant?"
    • "What happens if I don't comply with the basic law?"
  2. Implementation Questions:

    • "How should Article 10 be implemented in practice?"
    • "What procedures must be followed according to Section 3?"
    • "What are the enforcement mechanisms for this law?"
  3. Comparative Questions:

    • "What's the difference between Article 2 and Article 5?"
    • "Which article takes precedence in case of conflict?"
    • "How do these rules interact with each other?"

🚀 How to Use:

cd "d:\INTERNSHIP\pdf_analyzer"
python enhanced_legal_rag.py

Then ask complex questions like:

❓ Question: If I violate the governance rules, what article should be implemented on me?

📝 Answer:
Based on the document analysis:

1. **Relevant Articles Identified**: Article 15 (Violations) and Article 23 (Enforcement)

2. **Conditions that Trigger**: 
   - Article 15 applies when there is a clear violation of governance rules
   - Article 23 applies for the enforcement mechanism

3. **Specific Implementation**: Article 15 should be implemented as it specifically addresses violations of governance rules with appropriate penalties

4. **Reasoning**: The document establishes Article 15 as the primary article for handling violations, with Article 23 providing the enforcement framework.

Sources: Page 8 (Section 3), Page 12 (Section 5), Page 15 (Section 7)

📊 Performance Tips:

  1. Use specific questions - The more specific, the better the answer
  2. Include context - Mention the type of case or situation
  3. Ask follow-up questions - Build on previous answers
  4. Use legal terminology - The system understands legal language better

The enhanced system is now much better equipped to handle complex legal reasoning and case-based questions!