- Removed all Arabic language preferences
- Optimized for English legal documents
- Better English embedding model (
all-MiniLM-L6-v2)
- Smart Chunking: Splits by articles/sections instead of arbitrary word counts
- Context-Aware Prompting: Different prompts for different question types
- Multi-Step Reasoning: Breaks down complex legal scenarios
- Source Attribution: Shows which articles/pages were used
The system now:
- Detects Case Scenarios - Recognizes "if", "case", "scenario" keywords
- Identifies Article Requests - Looks for "article", "section", "law" terms
- Applies Legal Reasoning - Uses specialized prompt for case analysis
- Provides Step-by-Step Analysis:
- Identifies relevant articles
- Explains conditions that trigger each article
- Determines which specific article applies
- Provides reasoning for the conclusion
# Old way: Split by word count
chunks = text.split()[:500]
# New way: Split by legal structure
article_pattern = r'(Article\s+\d+|Section\s+\d+|Chapter\s+\d+)'
sections = re.split(article_pattern, text)# For case scenarios:
"""
For this case scenario question:
1. First identify the relevant articles/sections that apply
2. Explain what conditions or circumstances trigger each article
3. Determine which specific article(s) would be implemented
4. Provide the reasoning for your conclusion
"""- Retrieves 7 chunks instead of 5
- Includes metadata (page numbers, sections)
- Scores relevance more accurately
For Legal Documents:
# Best for legal reasoning
ollama pull llama3.1:70b # Most capable for complex legal analysis
# Good alternatives:
ollama pull qwen2.5:32b # Good for structured reasoning
ollama pull mixtral:8x7b # Excellent for multi-step analysis
ollama pull codellama:34b # Good for rule-based reasoningFor English Documents:
ollama pull llama3.2:8b # Current default, good balance
ollama pull mistral:7b # Fast and accurate for EnglishMinimum:
- 16GB RAM
- 8GB VRAM (for larger models)
- SSD storage
Optimal:
- 32GB RAM
- 16GB VRAM
- NVME SSD
Update your Ollama model in the code:
# For better legal reasoning
response = ollama.chat(
model="llama3.1:70b", # or mixtral:8x7b
messages=[{'role': 'user', 'content': prompt}],
options={
'temperature': 0.1, # More precise
'top_p': 0.9, # More focused
'num_ctx': 4096 # Longer context
}
)For Legal Documents:
- Convert Arabic to English if needed
- Ensure clear article/section numbering
- Remove headers/footers that might confuse the system
- Use clean, well-formatted PDFs
-
Case-Based Questions:
- "If someone violates Article 5, what penalties apply?"
- "In a governance dispute scenario, which articles are relevant?"
- "What happens if I don't comply with the basic law?"
-
Implementation Questions:
- "How should Article 10 be implemented in practice?"
- "What procedures must be followed according to Section 3?"
- "What are the enforcement mechanisms for this law?"
-
Comparative Questions:
- "What's the difference between Article 2 and Article 5?"
- "Which article takes precedence in case of conflict?"
- "How do these rules interact with each other?"
cd "d:\INTERNSHIP\pdf_analyzer"
python enhanced_legal_rag.pyThen ask complex questions like:
❓ Question: If I violate the governance rules, what article should be implemented on me?
📝 Answer:
Based on the document analysis:
1. **Relevant Articles Identified**: Article 15 (Violations) and Article 23 (Enforcement)
2. **Conditions that Trigger**:
- Article 15 applies when there is a clear violation of governance rules
- Article 23 applies for the enforcement mechanism
3. **Specific Implementation**: Article 15 should be implemented as it specifically addresses violations of governance rules with appropriate penalties
4. **Reasoning**: The document establishes Article 15 as the primary article for handling violations, with Article 23 providing the enforcement framework.
Sources: Page 8 (Section 3), Page 12 (Section 5), Page 15 (Section 7)
- Use specific questions - The more specific, the better the answer
- Include context - Mention the type of case or situation
- Ask follow-up questions - Build on previous answers
- Use legal terminology - The system understands legal language better
The enhanced system is now much better equipped to handle complex legal reasoning and case-based questions!