nithamitabh/AI-Powered-Regulatory-Compliance-Checker-for-Contracts

📋 Contract Compliance Checker

🔒 An AI-powered GDPR compliance verification system for legal contracts

🌟 Overview

The Contract Compliance Checker is an intelligent web application that automates the analysis of legal contracts for GDPR compliance. It uses AI models to detect document types, extract clauses, and compare them against regulatory standards.

✨ Features

  • πŸ“„ Automated Document Classification - Identifies 5 types of GDPR-related agreements
  • πŸ” Intelligent Clause Extraction - Extracts and summarizes contract clauses using AI
  • βš–οΈ Compliance Analysis - Compares uploaded contracts against standard templates
  • 🎯 Risk Assessment - Assigns risk scores (0-100) for compliance gaps
  • πŸ“Š Detailed Reporting - Provides missing clauses, risks, and recommendations
  • πŸ”„ Auto-Update System - Scheduled scraping to keep templates up-to-date
  • 🎨 User-Friendly Interface - Built with Streamlit for easy interaction

πŸ—‚οΈ Supported Document Types

The system can analyze the following contract types:

  1. πŸ“‘ Data Processing Agreement (DPA)
  2. 🀝 Joint Controller Agreement (JCA)
  3. πŸ”— Controller-to-Controller Agreement (C2C)
  4. πŸ”„ Processor-to-Subprocessor Agreement (PSA)
  5. πŸ“œ Standard Contractual Clauses (SCC)

🏗️ Project Structure

project/
├── main.py                          # 🚀 Streamlit application entry point
├── agreement_comparision.py         # 🔍 Document classification & comparison
├── data_extraction.py               # 📝 Clause extraction with AI
├── scrapping.py                     # 🕷️ Template scraping & updates
├── pipeline.py                      # ⚙️ Automated processing pipeline
├── notification.py                  # 📧 Email notification module (SMTP)
├── slack_notification.py            # 💬 Slack notification module (Webhooks)
├── requirements.txt                 # 📦 Python dependencies
├── .env                             # 🔐 Environment variables (not in git)
├── .env.example                     # 📋 Template for environment variables
├── json/                            # 💾 Template standards
│   ├── DPA.json
│   ├── JCA.json
│   ├── CCA.json
│   ├── PSA.json
│   └── SCC.json
└── templates/                       # 📚 Reference documents
    ├── (DPA) appendix-gdpr-eea-uk-4-27-21.pdf
    ├── (JCA) model-joint-controllership-agreement.pdf
    ├── (C2C) 2-Controller-to-controller-data-privacy-addendum.pdf
    ├── (SCCs) Standard Contractual Clauses.pdf
    └── (PSA) Personal-Data-Sub-Processor-Agreement-2024-01-24.pdf

🚀 Getting Started

Prerequisites

  • 🐍 Python 3.8 or higher
  • 🔑 Google Gemini API key

Installation

  1. Clone the repository

    git clone <repository-url>
    cd project
  2. Create virtual environment

    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Set up environment variables

    Create a .env file in the project root (copy from .env.example):

    # API Keys
    GEMINI_API_KEY=your_gemini_api_key_here
    GROQ_API_KEY=your_groq_api_key_here
    
    # Email Configuration (for notifications)
    SMTP_SENDER_EMAIL=your_email@gmail.com
    SMTP_PASSWORD=your_app_password_here
    SMTP_RECEIVER_EMAIL=receiver_email@gmail.com
    SMTP_SERVER=smtp.gmail.com
    SMTP_PORT=587
    
    # Slack Configuration (optional)
    SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL

    Note for Gmail Users: Use an App Password, not your regular password. Enable 2-Step Verification first, then generate an app password from Google Account Security settings.

  5. Prepare template standards (if not already present)

    Run the pipeline to generate JSON templates:

    python pipeline.py
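
To illustrate step 4, the settings in .env could be read and validated at startup roughly as follows. This is a minimal sketch using only the standard library: `Settings` and `load_settings` are hypothetical names that do not come from the project, the defaults mirror the example values above, and treating GROQ_API_KEY and SLACK_WEBHOOK_URL as optional is an assumption. The project lists python-dotenv, which would typically load the .env file into the process environment before code like this runs.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    gemini_api_key: str
    groq_api_key: Optional[str]       # assumed optional
    smtp_sender_email: str
    smtp_password: str
    smtp_receiver_email: str
    smtp_server: str
    smtp_port: int
    slack_webhook_url: Optional[str]  # Slack is described as optional


REQUIRED = (
    "GEMINI_API_KEY",
    "SMTP_SENDER_EMAIL",
    "SMTP_PASSWORD",
    "SMTP_RECEIVER_EMAIL",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping and fail early if a required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        groq_api_key=env.get("GROQ_API_KEY") or None,
        smtp_sender_email=env["SMTP_SENDER_EMAIL"],
        smtp_password=env["SMTP_PASSWORD"],
        smtp_receiver_email=env["SMTP_RECEIVER_EMAIL"],
        smtp_server=env.get("SMTP_SERVER", "smtp.gmail.com"),
        smtp_port=int(env.get("SMTP_PORT", "587")),
        slack_webhook_url=env.get("SLACK_WEBHOOK_URL") or None,
    )
```

Failing at startup with the full list of missing names tends to be easier to debug than an authentication error deep inside a scheduled job.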

🎯 Running the Application

Launch the Streamlit web application:

streamlit run main.py

The application will open in your default browser at http://localhost:8501

📖 Usage

  1. Upload Contract 📤

    • Click "Browse files" or drag & drop a PDF contract
    • Supported format: PDF only
  2. Automatic Analysis 🤖

    • The system detects the document type
    • Extracts all clauses automatically
    • Compares against GDPR standard templates
  3. Review Results 📊

    • View missing or altered clauses
    • Check compliance risks
    • Review risk score and recommendations
    • Get actionable amendments

🛠️ Core Modules

main.py

  • 🎨 Streamlit web interface
  • 📤 File upload handling
  • 🔄 Background scheduler for auto-updates
  • 📊 Results visualization

agreement_comparision.py

  • 🔍 Document type detection using AI
  • ⚖️ Clause-by-clause comparison
  • 🎯 Risk scoring and analysis
  • 💡 Compliance recommendations

data_extraction.py

  • 📄 PDF text extraction
  • 🤖 AI-powered clause extraction
  • 📝 Summarization (for large documents)
  • 💾 JSON output generation

scrapping.py

  • 🕷️ Automated template scraping from web sources
  • 🔄 Scheduled updates (daily at 12:00 AM)
  • 📥 Downloads latest standard agreements
  • 🔍 Detects changes using file hash comparison
  • 📧 Sends email notifications when templates are updated
  • 🛡️ Error handling and reporting
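
The hash-based change detection listed above can be sketched as follows. This is an illustration only, not the code in scrapping.py: `file_md5` and `classify_change` are hypothetical names, and the three outcome labels are assumptions modelled on the "created" and "updated" notifications described later in this README. MD5 is used because the release notes mention it; it is adequate for detecting changes, though not for security purposes.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_change(new_file: Path, existing_file: Path) -> str:
    """Compare a freshly downloaded template with the stored copy.

    Returns "created" if no stored copy exists, "updated" if the digests
    differ, and "unchanged" otherwise.
    """
    if not existing_file.exists():
        return "created"
    if file_md5(new_file) != file_md5(existing_file):
        return "updated"
    return "unchanged"
```

Only the "created" and "updated" outcomes would need to trigger a notification; "unchanged" lets the daily run finish silently.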

pipeline.py

  • ⚙️ Orchestrates the entire workflow
  • 🏗️ Builds template library
  • 🔄 Runs end-to-end comparison pipeline

notification.py

  • 📧 Email notification system using SMTP
  • 🔐 Secure credential management via environment variables
  • ✅ Configurable sender, receiver, and message content
  • 🛡️ Error handling and validation
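
For readers unfamiliar with SMTP over STARTTLS (port 587, as in the .env example), the flow can be sketched with the standard library alone. The functions `build_message` and `send_message` below are hypothetical and are not the module's actual `send_email()` implementation; the basic address check is likewise an assumption about what "validation" might involve.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, receiver: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text email after a very basic sanity check on the receiver."""
    if not receiver or "@" not in receiver:
        raise ValueError(f"Invalid receiver address: {receiver!r}")
    message = EmailMessage()
    message["From"] = sender
    message["To"] = receiver
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_message(message: EmailMessage, server: str, port: int, password: str) -> None:
    """Send the message over SMTP, upgrading the connection with STARTTLS first."""
    with smtplib.SMTP(server, port, timeout=30) as smtp:
        smtp.starttls()                        # encrypt before sending credentials
        smtp.login(message["From"], password)  # for Gmail, an App Password
        smtp.send_message(message)
```

Keeping message construction separate from delivery makes the former easy to test without a mail server.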

slack_notification.py

  • 💬 Slack notification system using webhooks
  • 📊 Rich formatted messages with blocks
  • 🎯 Compliance report formatting
  • 🔄 Template update notifications
  • 🔐 Secure webhook URL management
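
A Slack incoming webhook accepts a JSON payload that may contain Block Kit blocks. The sketch below shows one plausible shape for a template-update message; `build_update_payload` and `post_to_slack` are hypothetical names, the message wording is an assumption, and the real slack_notification.py may be structured differently.

```python
import json
import urllib.request
from typing import Any, Dict, List

TITLE = "GDPR Template Update Notification"


def build_update_payload(changes: List[str]) -> Dict[str, Any]:
    """Build a webhook payload with a header block and a bulleted section."""
    lines = "\n".join(f"• {change}" for change in changes) or "No changes detected."
    return {
        "text": TITLE,  # plain-text fallback shown in notifications
        "blocks": [
            {"type": "header", "text": {"type": "plain_text", "text": TITLE}},
            {"type": "section", "text": {"type": "mrkdwn", "text": lines}},
        ],
    }


def post_to_slack(webhook_url: str, payload: Dict[str, Any], timeout: float = 10.0) -> int:
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The webhook URL would come from SLACK_WEBHOOK_URL in the environment, so it never needs to appear in source code.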

🧪 Example Workflows

Document Analysis Pipeline

# Run pipeline for a new document
from pipeline import run_pipeline

result = run_pipeline("your-contract.pdf")
print(result)

Email Notifications

# Send email notification
from notification import send_email

# Use defaults from .env
send_email()

# Send custom notification
send_email(
    subject="Compliance Report Ready",
    body="Your GDPR compliance analysis is complete. Risk Score: 45/100",
    receiver="team@company.com"
)

Test Notification Module

python notification.py

Test Slack Notification

python slack_notification.py

Test Scraping with Notifications

python test_scraping_notification.py

This will manually trigger the scraping process and send notifications if changes are detected.

🔐 Security & Privacy

  • 🔒 Temporary files are automatically cleaned up
  • 🗑️ Uploaded files are deleted after processing
  • 🔑 API keys and credentials stored securely in .env file
  • 🚫 No data is stored permanently on the server
  • 🔐 .env file excluded from version control via .gitignore
  • 🛡️ Gmail App Passwords used instead of regular passwords
  • ✅ Environment variables for all sensitive configuration

🤖 Technology Stack

  • Frontend: Streamlit
  • AI Model: Google Gemini 2.5 Flash
  • PDF Processing: PyPDF2, pypdf
  • Data Validation: Pydantic
  • Scheduling: schedule
  • Environment: python-dotenv

📊 Risk Score Interpretation

  • 0-25: ✅ Low Risk - Minor issues
  • 26-50: ⚠️ Medium Risk - Attention needed
  • 51-75: 🔶 High Risk - Significant gaps
  • 76-100: 🔴 Critical Risk - Major compliance issues
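
The bands above translate directly into a small lookup. The function name `risk_band` is hypothetical and the sketch assumes integer scores; it is not taken from the project code.

```python
def risk_band(score: int) -> str:
    """Map a 0-100 risk score to the band names used in this README."""
    if not 0 <= score <= 100:
        raise ValueError(f"Risk score must be between 0 and 100, got {score}")
    if score <= 25:
        return "Low Risk"
    if score <= 50:
        return "Medium Risk"
    if score <= 75:
        return "High Risk"
    return "Critical Risk"
```

For example, the score of 45 used in the email example elsewhere in this README falls in the "Medium Risk" band.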

🔄 Auto-Update System

The application includes a background scheduler that:

  • 🕐 Runs daily at midnight (00:00)
  • 📥 Scrapes latest templates from official sources
  • 🔍 Compares new templates with existing ones using hash verification
  • 📧 Sends email notifications when changes are detected
  • 💬 Sends Slack notifications (if configured)
  • ✅ Ensures compliance checks use current standards
  • ⚠️ Reports any errors encountered during updates

What Gets Notified?

  • ✅ New templates created
  • ✅ Existing templates updated with new clauses
  • ⚠️ Download or processing errors

Notifications are sent via:

  • 📧 Email (SMTP)
  • 💬 Slack (Webhooks - if configured)

Notification Example

When templates are updated, you'll receive an email like:

Subject: 🔄 GDPR Template Update Notification

📝 CHANGES DETECTED:
  • Data Processing Agreement: Template updated with new clauses
  • Standard Contractual Clauses: New template created
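
A message of this shape could be assembled from a list of detected changes along the following lines. This is a sketch under stated assumptions: `format_update_email` is a hypothetical helper, the emoji are omitted for simplicity, and the "ERRORS:" section is an invented layout, since the README says errors are reported but does not show how.

```python
from typing import Iterable, Tuple


def format_update_email(
    changes: Iterable[Tuple[str, str]],
    errors: Iterable[str] = (),
) -> Tuple[str, str]:
    """Return (subject, body) for a template-update notification."""
    subject = "GDPR Template Update Notification"
    lines = []
    change_lines = [f"  • {name}: {detail}" for name, detail in changes]
    if change_lines:
        lines.append("CHANGES DETECTED:")
        lines.extend(change_lines)
    error_lines = [f"  • {message}" for message in errors]
    if error_lines:
        if lines:
            lines.append("")  # blank line between the two sections
        lines.append("ERRORS:")  # assumed heading, not shown in the README
        lines.extend(error_lines)
    return subject, "\n".join(lines) or "No changes detected."
```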

See SCRAPING_NOTIFICATION_GUIDE.md for detailed information.

🀝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

πŸ“ License

This project is licensed under the MIT License.

πŸ‘₯ Support

For questions or support, please contact the development team.

🎯 Future Enhancements

  • 🌍 Multi-language support
  • πŸ“§ Email notifications for compliance reports βœ…
  • πŸ“ˆ Historical comparison tracking
  • πŸ”— Integration with document management systems
  • πŸ“± Mobile-responsive interface
  • 🎨 Custom template creation
  • πŸ“Š Advanced analytics dashboard
  • πŸ”” Webhook support for real-time notifications
  • πŸ“± SMS notifications

📝 Recent Updates

October 2025 - Security & Notification Enhancements

📧 Scraping Notification System

  • Automatic change detection for template updates
  • Sends email notifications when GDPR templates are updated
  • Uses MD5 hash comparison for accurate change detection
  • Reports both successful updates and errors
  • Fixed JSON file paths (json_files/ → json/)
  • Added comprehensive logging and error handling

✅ Notification Module Refactored

  • notification.py completely rewritten with security best practices
  • Moved all hardcoded credentials to .env file
  • Added reusable send_email() function with flexible parameters
  • Implemented proper error handling and input validation
  • Added comprehensive documentation

🔐 Security Improvements

  • All sensitive credentials now in .env file:
    • Email credentials (SMTP sender, password, receiver)
    • API keys (Gemini, Groq)
    • Server configuration
  • Created .env.example as a safe template for team members
  • Verified .gitignore excludes .env from version control

🎯 Code Quality

  • Eliminated security risks of hardcoded credentials
  • Improved code maintainability and reusability
  • Added detailed inline documentation
  • Fixed typos and improved code structure
  • Follows Python PEP 8 standards

📚 Documentation

  • Updated README.md with notification module usage
  • Added Gmail App Password setup instructions
  • Included environment variable configuration guide
  • Provided example workflows for email notifications

Made with ❤️ for GDPR Compliance | Powered by 🤖 Google Gemini AI
