DABB.ai: Intelligent Contract Risk Analysis

Team: DABB.ai

Members:

Birajit Saikia
Devaansh Kathuria
Abhey Dua
Bhavya Jain

1) Overview

DABB.ai is a contract risk analysis system that combines a classical clause classifier with a grounded assistant layer for structured legal-style reporting.

Milestone 2 keeps the Milestone 1 machine-learning baseline intact and extends it with:

Clause-level assistant explanations backed by a local guidance corpus
PDF report export for submission and demo use
Multi-contract comparison for spotting repeated risk patterns
Deployment-ready Streamlit entrypoints for public hosting

2) What the App Does

The application accepts PDF or TXT contracts and returns:

Clause segmentation and clause IDs
Predicted clause type
Risk severity: Low, Medium, or High
Risk score from 0 to 100
Highlighted high-risk clauses
CSV and JSON exports
Structured legal assistance report
Optional PDF report download
Multi-contract comparison summary

3) Legal Disclaimer

This tool is informational only and is not legal advice. Always consult a qualified legal professional before making legal decisions.

4) Architecture

flowchart TD
    A[Upload PDF/TXT] --> B[Text Extraction]
    B --> C[Preprocess + Clause Segmentation]
    C --> D[TF-IDF Vectorization]
    D --> E[Clause Classifier]
    E --> F[Risk Mapping Table]
    F --> G[Severity + Risk Score]
    G --> H[Streamlit UI]
    H --> I[Assistant Report + Clause Drill-down]
    I --> J[PDF Export + Multi-Contract Comparison]
    J --> K[CSV / JSON Export]

    L[Bundled Legal Guidance Corpus] --> M[Local Retrieval Index]
    M --> I

    N[Training CSV] --> O[Train / Refresh Model]
    O --> P[models/model.joblib]
    P --> E

5) Project Layout

DABB.ai/
├── app.py
├── streamlit_app.py
├── data/
├── docs/
├── models/
├── reports/
├── scripts/
├── src/contract_risk/
└── tests/

6) Local Setup

Prerequisites

Python 3.10+

Install

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the app

streamlit run streamlit_app.py

streamlit_app.py is the recommended hosted entrypoint. app.py remains the reusable application module for local development and tests.

Train or refresh the model

PYTHONPATH=src python -m contract_risk.cli train --csv data/raw/legal_docs_modified.csv

If data/raw/legal_docs_modified.csv is unavailable, the app falls back to data/demo/sample_training.csv.

Evaluate locally

PYTHONPATH=src python -m contract_risk.cli eval --csv data/raw/legal_docs_modified.csv --reports-dir reports

7) Usage Flow

Upload a PDF or TXT contract, or enable the bundled demo contract.
Review the clause table, severity filters, and highlighted sections.
Generate the legal assistance report for clause explanations and mitigation guidance.
Export CSV, JSON, or PDF artifacts as needed.
Upload multiple contracts to compare repeated risk patterns.

8) Hosted Deployment

Streamlit Community Cloud

Create a new app and point the main file at streamlit_app.py.
Keep requirements.txt at the repository root.
Use the default public deployment settings from .streamlit/config.toml.

Hugging Face Spaces

Create a new Space with the Streamlit SDK.
Set the app entrypoint to streamlit_app.py.
Push this repository and allow the Space to install dependencies from requirements.txt.

Render

Use the included render.yaml or Procfile.
Deploy the repository as a Python web service.
Start the app with streamlit run streamlit_app.py.

Environment overrides

The app supports the following optional environment variables:

DABB_MODEL_PATH
DABB_REPORTS_DIR
DABB_TRAINING_CSV
DABB_FALLBACK_TRAINING_CSV

9) Testing

python3 -m compileall app.py streamlit_app.py src tests
python3 -m pytest -q

The repository also includes GitHub Actions CI under .github/workflows/ci.yml.

10) Limitations

Clause labels still depend on training data quality and class balance.
Risk scores are rule-based and should be treated as screening guidance.
Scanned or image-only PDFs may not extract clean text.
The assistant report is grounded in the bundled corpus, but it is still informational only.
Multi-contract comparison surfaces repeated patterns rather than making legal judgments.

11) Submission Notes

reports/ contains the milestone report artifacts and presentation material.
docs/deployment.md documents the public-host startup path and smoke checks.
streamlit_app.py is the submission-ready public entrypoint.

12) Quick Commands

PYTHONPATH=src python -m contract_risk.cli train
PYTHONPATH=src python -m contract_risk.cli eval
streamlit run streamlit_app.py
python3 -m pytest -q

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DABB.ai: Intelligent Contract Risk Analysis

1) Overview

2) What the App Does

3) Legal Disclaimer

4) Architecture

5) Project Layout

6) Local Setup

Prerequisites

Install

Run the app

Train or refresh the model

Evaluate locally

7) Usage Flow

8) Hosted Deployment

Streamlit Community Cloud

Hugging Face Spaces

Render

Environment overrides

9) Testing

10) Limitations

11) Submission Notes

12) Quick Commands

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
.streamlit		.streamlit
data		data
docs		docs
models		models
reports		reports
scripts		scripts
src/contract_risk		src/contract_risk
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
Procfile		Procfile
README.md		README.md
app.py		app.py
render.yaml		render.yaml
requirements.txt		requirements.txt
streamlit_app.py		streamlit_app.py

Folders and files

Latest commit

History

Repository files navigation

DABB.ai: Intelligent Contract Risk Analysis

1) Overview

2) What the App Does

3) Legal Disclaimer

4) Architecture

5) Project Layout

6) Local Setup

Prerequisites

Install

Run the app

Train or refresh the model

Evaluate locally

7) Usage Flow

8) Hosted Deployment

Streamlit Community Cloud

Hugging Face Spaces

Render

Environment overrides

9) Testing

10) Limitations

11) Submission Notes

12) Quick Commands

About

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages