An intelligent resume-job matching application that analyzes the compatibility between resumes and job descriptions using machine learning. Upload your resume and get an AI-powered matching score!
- 📄 Resume Upload: Support for PDF, TXT, and DOCX files
- 💼 Job Description Analysis: Paste job descriptions for instant matching
- 🎯 AI-Powered Matching: Combines TF-IDF text similarity with a Random Forest regression model to produce a matching score from 0 to 100
- 📊 Detailed Analysis: Get text similarity scores and recommendations
- 🎨 Modern UI: Clean, intuitive Streamlit interface
This project uses the "AI-Powered Resume Screening Dataset 2025" from Kaggle, which contains:
- 1,000+ resume records with skills, experience, education, and certifications
- Job roles and AI matching scores
- The dataset is used to train a regression model that predicts resume-job matching scores
- Text Processing: Extracts and processes text from resumes and job descriptions
- Similarity Calculation: Uses TF-IDF vectorization and cosine similarity
- Feature Engineering: Combines text similarity with experience, salary, and project data
- Machine Learning: Random Forest regression model predicts matching scores (0-100)
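
The similarity and feature-engineering steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code in `src/processing.py`: the function names `text_similarity` and `build_features`, the feature order, and the use of English stop words are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def text_similarity(resume_text: str, job_text: str) -> float:
    """Return the TF-IDF cosine similarity between two texts (0.0 to 1.0).

    Hypothetical helper: the project's own implementation may differ.
    Raises ValueError if both texts contain only stop words.
    """
    # Fit on both documents so they share one vocabulary and IDF weighting.
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform([resume_text, job_text])
    # Row 0 is the resume, row 1 is the job description.
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])


def build_features(resume_text, job_text, years_experience,
                   salary_expectation, project_count):
    """Combine text similarity with the numeric inputs into one feature row.

    The feature order here is an assumption chosen for illustration.
    """
    return [
        text_similarity(resume_text, job_text),
        float(years_experience),
        float(salary_expectation),
        float(project_count),
    ]


features = build_features(
    "Python developer with SQL and machine learning experience",
    "Looking for a Python engineer with machine learning skills",
    years_experience=3,
    salary_expectation=50000,
    project_count=4,
)
print(features)
```

A row like `features` is what a regression model would receive as input; identical texts yield a similarity of 1.0 and texts with no shared terms yield 0.0.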
- Clone the repository:

      git clone <repository-url>
      cd AI_POWERED_RESUME_SCREENER

- Install dependencies:

      pip install -r requirements.txt

- Run the application:

      streamlit run app.py

- Upload Resume: Choose between file upload (PDF/TXT/DOCX) or paste text
- Enter Details: Provide years of experience, salary expectation, and project count
- Paste Job Description: Copy and paste the job posting
- Get Results: Click "Calculate Matching Score" for instant analysis
- 80-100: 🌟 Excellent Match - Strong recommendation to apply
- 60-79: 👍 Good Match - Consider tailoring resume further
- 0-59: ⚠️ Low Match - Significant gaps detected
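
The score bands above can be expressed as a small lookup. This is a sketch only: `interpret_score` is a hypothetical name, and treating 80 and 60 as inclusive lower bounds for fractional scores (so 79.5 counts as a Good Match) is an assumption.

```python
def interpret_score(score: float) -> str:
    """Map a 0-100 matching score to the label used in the results view."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "Excellent Match"
    if score >= 60:
        return "Good Match"
    return "Low Match"


for value in (92, 60, 41.5):
    print(value, "->", interpret_score(value))
```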
    AI_POWERED_RESUME_SCREENER/
    ├── app.py                        # Main Streamlit application
    ├── train_models.py               # Model training script
    ├── requirements.txt              # Python dependencies
    ├── README.md                     # This file
    ├── data/
    │   └── AI_Resume_Screening.csv   # Training dataset
    ├── src/
    │   ├── processing.py             # Data preprocessing and feature engineering
    │   └── util.py                   # Model utilities and file processing
    ├── models/                       # Trained model files (generated)
    └── assets/                       # Static assets
To train the model manually:

    python train_models.py

The script will:
- Load and preprocess the dataset
- Calculate text similarities
- Train a Random Forest regression model
- Evaluate performance (R², MAE, RMSE)
- Save the trained model
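
The training steps listed above might look something like the sketch below. It is not the contents of `train_models.py`: the synthetic data stands in for `data/AI_Resume_Screening.csv`, and the 80/20 split, the hyperparameters, the feature order, the use of `joblib`, and the file name `resume_matcher.joblib` are all assumptions for illustration.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(X, y, model_path):
    """Train a Random Forest regressor, score it on a held-out split, save it."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    predictions = model.predict(X_test)
    metrics = {
        "r2": r2_score(y_test, predictions),
        "mae": mean_absolute_error(y_test, predictions),
        "rmse": float(np.sqrt(mean_squared_error(y_test, predictions))),
    }
    joblib.dump(model, model_path)
    return model, metrics


# Synthetic stand-in data (assumption): similarity, experience, salary, projects.
rng = np.random.default_rng(0)
n_rows = 300
X = np.column_stack([
    rng.uniform(0, 1, n_rows),               # text similarity
    rng.integers(0, 15, n_rows),             # years of experience
    rng.uniform(40_000, 120_000, n_rows),    # salary expectation
    rng.integers(0, 10, n_rows),             # project count
])
# Made-up target so the example has something learnable; clipped to 0-100.
y = np.clip(60 * X[:, 0] + 2 * X[:, 1] + 1.5 * X[:, 3], 0, 100)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "resume_matcher.joblib"  # hypothetical file name
    model, metrics = train_and_evaluate(X, y, model_path)
    reloaded = joblib.load(model_path)

print(metrics)
```

Fixing `random_state` keeps the split and the forest reproducible between runs, which makes metric changes easier to attribute to data or feature changes.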
- Streamlit: Web application framework
- Scikit-learn: Machine learning and NLP
- Pandas: Data manipulation
- PyPDF2: PDF text extraction
- python-docx: Word document processing
- NumPy: Numerical computing
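The training script reports how well the regression model predicts resume-job matching scores, using these metrics:
- R² Score: Proportion of the variance in matching scores explained by the model (1.0 is a perfect fit)
- MAE: Mean Absolute Error, the average absolute difference between predicted and actual scores
- RMSE: Root Mean Squared Error, which penalizes large errors more heavily than MAE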
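
For reference, the three metrics can be computed by hand as in the sketch below. The scores are made-up numbers for illustration, not results from this project, and `regression_metrics` is a hypothetical helper (scikit-learn provides equivalent functions).

```python
import math


def regression_metrics(actual, predicted):
    """Return R², MAE, and RMSE for two equal-length sequences of scores."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    r2 = 1 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse}


# Illustrative values only.
actual = [50, 60, 70, 80]
predicted = [52, 58, 73, 79]
print(regression_metrics(actual, predicted))
# MAE = 2.0, RMSE = sqrt(4.5) ≈ 2.121, R² = 1 - 18/500 = 0.964
```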
- PDF: Portable Document Format (.pdf)
- TXT: Plain text files (.txt)
- DOCX: Microsoft Word documents (.docx)
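
One way to handle the three formats is to dispatch on the file extension, as sketched below. This is an illustration rather than the code in `src/util.py`: `extract_text` is a hypothetical name, and the PDF branch assumes a PyPDF2 version that provides `PdfReader` (2.0 or later). The third-party imports are deferred so that plain-text files work without them.

```python
from pathlib import Path


def extract_text(path) -> str:
    """Return the text content of a .txt, .pdf, or .docx file."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".txt":
        return path.read_text(encoding="utf-8", errors="ignore")

    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # imported lazily; needs PyPDF2 >= 2.0

        reader = PdfReader(str(path))
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    if suffix == ".docx":
        from docx import Document  # provided by the python-docx package

        document = Document(str(path))
        return "\n".join(paragraph.text for paragraph in document.paragraphs)

    raise ValueError(f"Unsupported file type: {suffix!r}")
```

Scanned PDFs that contain only images will come back as empty text with this approach, since no OCR is involved.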
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Support for more file formats (DOC, RTF)
- Advanced NLP models (BERT, transformers)
- Skills gap analysis
- Resume improvement suggestions
- Batch processing for multiple resumes
This project is for educational purposes. Please check the dataset license on Kaggle for commercial use.
- Dataset source: Kaggle AI-Powered Resume Screening Dataset 2025
- Built with ❤️ for better job matching