- Application Architecture Overview
- Backend Startup Process
- Login Page
- Registration Page
- Resume Ranker Page
- Upload CV Page
- Explore Stored Resumes Page
- Data Visualization Page
- Project Setup and Usage Guide
Below is an architecture diagram that illustrates the overall structure and interaction between the frontend, backend, and supporting services of the SmartRecruit application:
- Frontend: The user interacts with the application through a web browser, which is supported by the Streamlit framework for the UI.
- Backend: The backend is built using FastAPI to provide RESTful web services. It handles requests from the frontend, processes data, and communicates with external services.
- Elasticsearch: Used for indexing and searching resumes, running within a Docker container to manage scalable search operations.
- MongoDB Atlas: A cloud-based database where user and resume data are stored. The backend interacts with MongoDB using the `pymongo` library.
- Docker: Manages the containerization of services like Elasticsearch to ensure a consistent and scalable environment.
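As a concrete illustration of the data flowing from the backend into MongoDB, a stored resume record might be shaped as follows. The field names are assumptions inferred from the pages described later (category, email, phone, file path, skills, upload date); the real schema lives in the backend, which would insert such documents with pymongo's `insert_one`:

```python
from datetime import date

def make_resume_record(name, email, phone, category, skills, file_path):
    """Build a resume document as it might be stored in MongoDB.

    Field names are illustrative, not the app's actual schema.
    """
    return {
        "name": name,
        "email": email,
        "phone": phone,
        "category": category,
        "skills": skills,
        "file_path": file_path,
        "upload_date": date.today().isoformat(),
    }
```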
- Matcher: Loads patterns into spaCy's `PhraseMatcher` for extracting skills.
- SVM Model: Loads the pre-trained SVM model for classifying resumes.
- TF-IDF Vectorizer: Loads the TF-IDF vectorizer used to transform text into numerical vectors.
- OCR Tool: Loads the OCR (Optical Character Recognition) tool to extract text from scanned PDFs.
- spaCy NLP Model: Loads spaCy's NLP model for various natural language processing tasks such as tokenization, dependency parsing, and named entity recognition (NER).
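The Matcher component above can be sketched as follows. This is a minimal, self-contained illustration that uses a blank English pipeline (tokenizer only, no model download) and a hypothetical skill list; the real app loads a full spaCy model and its own pattern set:

```python
import spacy
from spacy.matcher import PhraseMatcher

# Blank pipeline: tokenizer only, sufficient for phrase matching in this sketch.
nlp = spacy.blank("en")

# Hypothetical skill terms; the real app loads its patterns at startup.
SKILL_TERMS = ["python", "sql", "machine learning"]

matcher = PhraseMatcher(nlp.vocab, attr="LOWER")  # case-insensitive matching
matcher.add("SKILLS", [nlp.make_doc(term) for term in SKILL_TERMS])

def extract_skills(text):
    """Return the set of known skills mentioned in the text."""
    doc = nlp(text)
    return {doc[start:end].text.lower() for _, start, end in matcher(doc)}
```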
The "Resume Ranker" page allows users to input a job description and rank stored resumes based on their relevance to the provided job description. Users can choose to include only today's resumes and specify the number of results to return. The results are displayed with scores indicating how well each resume matches the job criteria.
- Job Description Input: Users can enter detailed job descriptions to tailor the ranking process.
- Filter Option: Check the box to include only resumes uploaded on the current day.
- Results Limiting: A slider allows the user to set the number of resumes to display.
- Ranked Resumes: Each resume is shown with a score and upload date to help identify the most relevant candidates.
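The scoring behind the ranking can be pictured as a TF-IDF cosine-similarity comparison between the job description and each resume. This is a simplified stand-in using scikit-learn; the deployed backend also involves Elasticsearch and the SVM classifier, and its exact scoring may differ:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_resumes(job_description, resumes, top_k=5):
    """Rank resume texts by TF-IDF cosine similarity to a job description.

    Returns (resume, score) pairs, best match first.
    """
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the job description plus all resumes so they share one vocabulary.
    matrix = vectorizer.fit_transform([job_description] + list(resumes))
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    ranked = sorted(zip(resumes, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]
```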
The "Upload CV" page enables users to upload their resumes to the system. Users can drag and drop files or use the "Browse files" button to upload their documents. The page provides a status update upon successful upload.
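Server-side, handling the upload boils down to persisting the received bytes before indexing. A minimal stdlib sketch of that step (the directory name and function are illustrative, not the app's actual API):

```python
from pathlib import Path

UPLOAD_DIR = Path("uploaded_resumes")  # hypothetical storage folder

def save_resume(filename: str, data: bytes) -> Path:
    """Persist an uploaded resume and return its storage path."""
    UPLOAD_DIR.mkdir(exist_ok=True)
    # Keep only the base name to avoid path traversal from user-supplied names.
    target = UPLOAD_DIR / Path(filename).name
    target.write_bytes(data)
    return target
```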
The "Explore Stored Resumes" page allows users to browse and search through previously uploaded resumes. Users can expand each resume entry to view details such as category, contact information, and the file path. The page also includes options to show the candidate's skills or select a resume for deletion. Users can download individual resumes directly from this page.
- Search Functionality: Users can filter resumes based on skills.
- Detailed Resume View: Expanding a resume shows detailed information including category, email, file path, and phone number.
- Resume Management: Users can choose to show skills, mark a resume for deletion, or download it.
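The skill-based filtering can be pictured as a simple case-insensitive filter over the stored records. A sketch under the assumption that each record carries a `skills` list (the real search runs in the backend):

```python
def filter_by_skill(resumes, skill):
    """Return the resumes whose skill list contains the given skill."""
    wanted = skill.lower()
    return [
        resume for resume in resumes
        if wanted in (s.lower() for s in resume.get("skills", []))
    ]
```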
The "Data Visualization" page provides graphical insights into the distribution of stored resumes based on their categories. Users can choose between different chart types, such as histograms or pie charts, to view the distribution. This helps to analyze the resume data visually and understand the trends in candidate submissions.
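The aggregation feeding those charts is a count of resumes per category. A stdlib sketch of the numbers behind the histogram or pie chart (how the page actually renders them, presumably with Streamlit's charting widgets, is an assumption):

```python
from collections import Counter

def category_distribution(resumes):
    """Return each category's share of the stored resumes, in percent."""
    counts = Counter(resume["category"] for resume in resumes)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}
```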
Follow these steps to set up and run the project.
- Docker (for running the Elasticsearch container)
- MongoDB
- Tesseract OCR engine and the Pytesseract wrapper (for OCR functionality)
- Open `config.py` in `backend/app`.
- Locate the MongoDB URL configuration.
- Replace the existing URL with your MongoDB connection string.

```python
# Example in config.py
mongodb_url: str = "Add your MongoDB URL here"
```
To run Elasticsearch in a Docker container, execute the following commands:
```shell
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.17.21
# Create the network referenced by --net below (required before the first run)
docker network create elastic
docker run --name <your-container-name> --net elastic -p 127.0.0.1:9200:9200 -p 127.0.0.1:9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.17.21
docker start <your-container-name>
```

This setup runs an Elasticsearch container that listens on 127.0.0.1 on ports 9200 and 9300.
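Once the container is up, the backend can query it on port 9200. A hedged sketch of the kind of query body the resume ranker might send, including the "today only" filter; the field names `resume_text` and `upload_date` are assumptions, not the app's actual index mapping:

```python
def build_rank_query(job_description, today_only=False, size=10):
    """Build an Elasticsearch query body for ranking resumes.

    Field names are illustrative; adjust them to the real index mapping.
    """
    query = {"bool": {"must": [{"match": {"resume_text": job_description}}]}}
    if today_only:
        # Elasticsearch date math: "now/d" rounds down to the start of today.
        query["bool"]["filter"] = [{"range": {"upload_date": {"gte": "now/d"}}}]
    return {"size": size, "query": query}
```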
Open `ressource/nlp_loader.py`.

Set the path of pytesseract so the OCR functionality works correctly:

```python
pytesseract.pytesseract.tesseract_cmd = r"path_to_your_tesseract_executable"
```

Replace `path_to_your_tesseract_executable` with the actual path to the Tesseract executable on your system.
Navigate to the backend directory:
```shell
cd .\backend\app\
pip install -r requirements.txt
uvicorn main:app --reload
```

The last command runs the server in development mode with auto-reloading enabled.
Navigate to the frontend directory:
```shell
cd .\frontend\
pip install -r requirements.txt
streamlit run main.py
```