Utkarsh-rwt/PeerToPeerPlagrismDetector

Peer-to-Peer Plagiarism Detector

Python Flask HTML5 CSS3 JavaScript scikit-learn NumPy PyPDF2 Google OAuth Gunicorn Render

A Flask-based web application that detects peer-to-peer plagiarism between student submissions using NLP techniques.

The system compares uploaded content and computes plagiarism scores with TF-IDF vectorization and cosine similarity, helping educators identify suspiciously similar submissions efficiently.

Live Demo

Application Link:
https://peertopeerplagrismdetector.onrender.com

Project Overview

This project focuses on peer-to-peer similarity detection in classroom submissions.

Unlike internet-wide plagiarism tools, it is designed to detect copying between students in the same course workflow. It supports PDF text extraction and can be extended with Google Classroom for automated assignment retrieval.
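As a rough illustration of the PDF text extraction mentioned above, here is a minimal sketch. The function names `pages_to_text` and `extract_pdf_text` are hypothetical and are not taken from `app.py`; the sketch assumes a PyPDF2 release that provides `PdfReader` (2.0 or later).

```python
from typing import Iterable


def pages_to_text(pages: Iterable) -> str:
    """Join the text of each page; pages without extractable text contribute ''."""
    parts = []
    for page in pages:
        # extract_text() can come back empty (or None) for scanned, image-only pages.
        parts.append(page.extract_text() or "")
    return "\n".join(parts).strip()


def extract_pdf_text(path: str) -> str:
    """Read a PDF from disk and return its text (assumes PyPDF2 >= 2.0)."""
    # Imported lazily so pages_to_text() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    return pages_to_text(reader.pages)
```

Scanned PDFs usually yield little or no text this way, so an empty result is worth treating as "could not read" rather than "0% similar".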

Features

  • Upload and compare multiple documents
  • Automatic text extraction from PDF files
  • Text cleaning and preprocessing pipeline
  • TF-IDF based feature extraction
  • Cosine similarity score calculation
  • Plagiarism percentage display
  • Clean, responsive teacher dashboard
  • Optional Google Classroom integration
  • Cloud deployment support via Render

Tech Stack

Backend

  • Python
  • Flask

Frontend

  • HTML
  • CSS
  • JavaScript

Data Processing & NLP

  • scikit-learn (TF-IDF, cosine similarity)
  • NumPy
  • PyPDF2

Integrations

  • Google OAuth 2.0
  • Google Classroom API
  • Google Drive API

Deployment

  • Gunicorn
  • Render

Project Structure

PeerToPeerPlagrismDetector/
├── app.py
├── requirements.txt
├── Procfile
├── templates/
├── static/
└── README.md

Installation

Prerequisites

  • Python 3.x
  • pip

1) Clone the repository

git clone https://github.com/Utkarsh-rwt/PeerToPeerPlagrismDetector.git
cd PeerToPeerPlagrismDetector

2) Create and activate a virtual environment

python -m venv venv

Windows

venv\Scripts\activate

macOS/Linux

source venv/bin/activate

3) Install dependencies

pip install -r requirements.txt

4) Run the application

python app.py

Open in browser:

http://127.0.0.1:5000

Usage

  1. Open the app home page.
  2. Sign in through the Google login flow (needed only for the Classroom-based workflow).
  3. Select course and assignment (when using Classroom integration).
  4. Fetch submissions and run similarity analysis.
  5. Review generated plagiarism percentages and matched student pairs.
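To show what "matched student pairs" could look like in code, the sketch below turns a square similarity matrix into a ranked list of flagged pairs. The function name `flag_pairs` and the `0.8` threshold are assumptions made for illustration; the app's actual cutoff and output format may differ.

```python
from itertools import combinations


def flag_pairs(names, similarity, threshold=0.8):
    """Return (name_a, name_b, percent) for pairs at or above threshold, highest first."""
    flagged = []
    # combinations() visits each unordered pair once and skips self-comparisons.
    for i, j in combinations(range(len(names)), 2):
        score = similarity[i][j]
        if score >= threshold:
            flagged.append((names[i], names[j], round(score * 100, 1)))
    return sorted(flagged, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    students = ["asha", "ben", "chen"]  # hypothetical sample data
    matrix = [
        [1.0, 0.9, 0.1],
        [0.9, 1.0, 0.85],
        [0.1, 0.85, 1.0],
    ]
    for a, b, pct in flag_pairs(students, matrix):
        print(f"{a} <-> {b}: {pct}%")
```

A high score is a prompt for manual review, not proof of copying: shared assignment templates or quoted source material also raise similarity.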

Deployment on Render

Step 1 — Prepare required files

Ensure these files exist in the repository root:

  • requirements.txt
  • Procfile with:
web: gunicorn app:app
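In the Procfile entry, `app:app` follows Gunicorn's `module:variable` form: the part before the colon is the Python module (`app.py`), and the part after it is the WSGI application object inside that module. The stand-in below is not the project's `app.py`; it is a dependency-free WSGI callable that shows the interface Gunicorn expects. A Flask application object satisfies the same interface.

```python
# app.py (illustrative stand-in, not the project's actual file)


def app(environ, start_response):
    """A bare WSGI callable: receives the request environ, returns the body."""
    body = b"ok"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

If the Flask object were named differently, for example `application`, the Procfile entry would need to change to match (`web: gunicorn app:application`).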

Step 2 — Push the repository

git add .
git commit -m "deploy app"
git push

Step 3 — Deploy on Render

  1. Go to Render and sign in with GitHub.
  2. Create a New Web Service.
  3. Select this repository.
  4. Set build command:
pip install -r requirements.txt
  5. Set start command:
gunicorn app:app
  6. Click Deploy.

Google Classroom Integration (Optional)

For OAuth integration, add your deployed app's redirect URI to the OAuth client's authorized redirect URIs in Google Cloud Console:

https://your-app-name.onrender.com/oauth2callback

Also ensure client_secret.json is configured correctly for your Google Cloud project.
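A redirect URI mismatch is a common cause of OAuth failures after deployment. The sketch below is a quick local sanity check, assuming `client_secret.json` uses the usual downloaded layout with a top-level `"web"` (or `"installed"`) object containing a `redirect_uris` list. The function name `redirect_uri_registered` is hypothetical.

```python
import json


def redirect_uri_registered(client_secret_json: str, redirect_uri: str) -> bool:
    """Check whether redirect_uri appears in the client's redirect_uris list."""
    config = json.loads(client_secret_json)
    # Assumed layout: web-app credentials sit under "web", desktop ones under "installed".
    client = config.get("web") or config.get("installed") or {}
    return redirect_uri in client.get("redirect_uris", [])


if __name__ == "__main__":
    with open("client_secret.json", encoding="utf-8") as fh:
        ok = redirect_uri_registered(
            fh.read(), "https://your-app-name.onrender.com/oauth2callback"
        )
    print("redirect URI listed" if ok else "redirect URI missing from client_secret.json")
```

The downloaded file is a snapshot taken at download time, so the Google Cloud Console remains the source of truth; re-download the file after changing redirect URIs there.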

Working Principle

The plagiarism detection workflow:

  1. Extract text from uploaded documents
  2. Clean and normalize text
  3. Convert text into TF-IDF vectors
  4. Compute cosine similarity
  5. Generate plagiarism percentage scores
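Steps 2 and 3 can be sketched without any dependencies as shown below. This is not the project's code, which uses scikit-learn; the sketch only imitates what are, to the best of my understanding, scikit-learn's default `TfidfVectorizer` settings (lowercasing, tokens of two or more word characters, smoothed IDF, L2 normalization). The names `tokenize` and `tfidf_vectors` are hypothetical.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"\b\w\w+\b")  # words of two or more characters


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN.findall(text.lower())


def tfidf_vectors(texts):
    """Return one {term: weight} dict per text, each scaled to unit length."""
    counts = [Counter(tokenize(t)) for t in texts]
    n_docs = len(counts)
    # Document frequency: in how many texts each term appears at least once.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in c.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors
```

Terms that appear in every submission (for example, words from the assignment prompt) get a low IDF weight, so they contribute less to the similarity score than rarer wording.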

Similarity formula:

cos(θ) = (A · B) / (||A|| ||B||)
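As a worked example of the formula, take A = (1, 2, 0) and B = (2, 1, 1). Then A · B = 4, ||A|| = √5 and ||B|| = √6, so cos(θ) = 4 / √30 ≈ 0.7303, which would be reported as roughly 73%. The sketch below computes the same thing in plain Python; the helper names and the rule that an all-zero vector scores 0 are assumptions, not the project's behavior.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (||A|| ||B||) for two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: an empty document matches nothing
    return dot / (norm_a * norm_b)


def as_percentage(score):
    """Convert a similarity in [0, 1] to a percentage with two decimals."""
    return round(score * 100, 2)


if __name__ == "__main__":
    print(as_percentage(cosine_similarity([1, 2, 0], [2, 1, 1])))
```

Because TF-IDF weights are never negative, the score for two documents falls between 0 (no shared terms) and 1 (identical term distribution).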

Future Improvements

  • AI-based semantic similarity detection
  • Teacher analytics dashboard
  • Report export system
  • Classroom-wide live sync
  • Historical plagiarism database
  • Student submission insights

Contributing

Pull requests are welcome.

For major changes, please open an issue first to discuss your proposed improvements.

License

This project is intended for educational and academic use.

A dedicated open-source license file is not currently included in the repository.
