This project classifies a job posting as real or fake (fraudulent) using Natural Language Processing (NLP) and Machine Learning.
- Takes raw job description text as input
- Cleans and processes text using NLP
- Uses TF-IDF features + Logistic Regression
- Trained on the Kaggle Real/Fake Job Posting Prediction dataset
- Simple Streamlit web app interface
- Python 3
- Pandas, NumPy
- Scikit-learn (TF-IDF, Logistic Regression)
- Streamlit (Web App)
- Joblib (Model saving)
- Kaggle dataset: fake_job_postings.csv
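The clean, vectorize, and classify flow listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training script: `clean_text` and `predict_label` are hypothetical helpers, the four example texts are made up so the snippet runs on its own, and the real script would presumably read the text and labels from the Kaggle CSV instead.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def clean_text(text: str) -> str:
    """Hypothetical cleaner: lowercase, drop URLs and non-letters, collapse spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace


# Made-up toy examples, only to keep the sketch runnable (1 = fake, 0 = real).
texts = [
    "Earn $5000 a week from home!!! No experience needed, send bank details",
    "Quick money, pay a small registration fee to start today",
    "Software engineer to build backend services with Python and PostgreSQL",
    "Registered nurse for a hospital ward, valid licence required",
]
labels = [1, 1, 0, 0]

# Turn the cleaned text into TF-IDF features, then fit the classifier.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform([clean_text(t) for t in texts])

model = LogisticRegression(max_iter=1000)
model.fit(X, labels)


def predict_label(description: str) -> str:
    """Classify one raw job description as 'FAKE' or 'REAL'."""
    features = vectorizer.transform([clean_text(description)])
    return "FAKE" if model.predict(features)[0] == 1 else "REAL"


print(predict_label("Send a registration fee and earn money from home"))
```

Note that the same fitted vectorizer is reused at prediction time, so new text is mapped into the vocabulary learned during training.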
fake-job-detector/
├── app_streamlit.py        # Streamlit web app
├── train_model.py          # Model training script
├── nlp_utils.py            # Text cleaning utilities
├── data/
│   └── fake_job_postings.csv
├── Models/
│   ├── fake_job_model.pkl
│   └── tfidf_vectorizer.pkl
├── requirements.txt
└── .gitignore
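The two files under `Models/` suggest that the classifier and the TF-IDF vectorizer are saved separately with Joblib and loaded again by the web app. A minimal sketch of that round trip is shown below, assuming both objects are stored with `joblib.dump`. The toy training data is made up, and a temporary directory stands in for the real `Models/` folder so the snippet leaves nothing behind.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up toy data, only to have fitted objects to save (1 = fake, 0 = real).
texts = [
    "pay a registration fee to start working from home",
    "earn quick money with no experience, send bank details",
    "software engineer for backend services in python",
    "registered nurse for a hospital ward",
]
labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer()
model = LogisticRegression(max_iter=1000).fit(vectorizer.fit_transform(texts), labels)

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the repository's Models/ directory.
    models_dir = Path(tmp) / "Models"
    models_dir.mkdir()

    # Save both artifacts: the model is useless without the matching vectorizer.
    joblib.dump(model, models_dir / "fake_job_model.pkl")
    joblib.dump(vectorizer, models_dir / "tfidf_vectorizer.pkl")
    saved_files = sorted(p.name for p in models_dir.iterdir())

    # Load them back, as an app would do at startup.
    loaded_model = joblib.load(models_dir / "fake_job_model.pkl")
    loaded_vectorizer = joblib.load(models_dir / "tfidf_vectorizer.pkl")

# The reloaded pair should reproduce the original predictions.
sample = ["pay a fee to start working from home"]
same = bool(
    (
        loaded_model.predict(loaded_vectorizer.transform(sample))
        == model.predict(vectorizer.transform(sample))
    ).all()
)
print(saved_files, same)
```

Keeping the vectorizer next to the model matters because the classifier's coefficients only make sense for the vocabulary and IDF weights it was trained with.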