# HireIntel

HireIntel is an intelligent recruitment system specifically designed for hiring software engineers. The system acts as an AI-powered recruiter that streamlines the technical hiring process through automated resume processing, candidate research, and interview management.
- Overview
- Features
- System Requirements
- Installation
- Configuration
- Architecture
- API Documentation
- Email System
- Pipeline Architecture
- Real-Time Monitoring
- Security
- Deployment
- Troubleshooting
- Contributing
- License
## Overview

HireIntel automates and enhances the technical recruitment process through:
- Automated resume parsing and analysis
- Multi-source candidate research (GitHub, LinkedIn, Google)
- AI-powered profile creation
- Automated interview scheduling
- Real-time pipeline monitoring
## Features

- Advanced resume parsing using AI
- GitHub repository analysis
- LinkedIn profile integration
- Google presence analysis
- Intelligent candidate-job matching
- Automated email communications
- Real-time monitoring dashboard
- Interview scheduling system
- Continuous background processing
- Status-based candidate progression
- Multi-stage data enrichment
- Automated profile creation
- Real-time status updates
## System Requirements

- Python 3.8 or higher
- SQLite database
- Poppler PDF library (for PDF processing)
- SMTP server access
- Required API access tokens
- Minimum 4GB RAM recommended
- Storage space for document processing
## Architecture

Project structure:

```
HireIntel/
├── src/
│   ├── config/
│   │   ├── AppSettings.py
│   │   ├── Config.yaml
│   │   └── DBModelsConfig.py
│   ├── Controllers/
│   │   ├── AdminController.py
│   │   ├── AuthController.py
│   │   └── ScheduleMonitorController.py
│   ├── Modules/
│   │   ├── Auth/
│   │   ├── Candidate/
│   │   ├── Jobs/
│   │   ├── Interviews/
│   │   └── PipeLineData/
│   ├── PipeLines/
│   │   ├── Integration/
│   │   ├── PipeLineManagement/
│   │   └── Profiling/
│   └── Static/
│       ├── EmailTemplates/
│       └── Resume/
├── instance/
└── email_attachments/
```
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/kudzaiprichard/hireIntel.api
  cd HireIntel
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Install Poppler:
  - Windows: download from the Poppler releases page
  - Linux: `sudo apt-get install poppler-utils`
  - macOS: `brew install poppler`
## Configuration

- GitHub Token
  - Visit GitHub Developer Settings
  - Create a new token with:
    - `repo` scope (repository access)
    - `user` scope (user data access)
    - `read:org` scope (organization access)
  - Add to config:

    ```yaml
    profiler:
      github_token: "your_token"
    ```

- Google AI (Gemini) API
  - Visit Google AI Studio
  - Sign in and enable the API
  - Create a new API key
  - Add to config:

    ```yaml
    llm:
      genai_token: "your_token"
      poppler_path: "C:\\Program Files\\poppler-24.08.0\\Library\\bin"
    ```

- RapidAPI (LinkedIn)
  - Create an account at RapidAPI
  - Subscribe to the LinkedIn Profile & Company Data API
  - Copy the API key to config:

    ```yaml
    profiler:
      rapid_api_key: "your_key"
    ```

- Gmail Configuration
  - Enable 2-Step Verification
  - Generate an App Password:
    - Go to Security → App Passwords
    - Select "Mail" and "Other (Custom name)"
    - Name it "HireIntel"
    - Copy the 16-character password
  - Add to config:

    ```yaml
    email:
      from: "hire"
      username: "your.email@gmail.com"
      password: "your_app_password"
      smtp_host: "smtp.gmail.com"
      smtp_port: 465
      imap_host: "imap.gmail.com"
      imap_port: 993
    ```

A complete `Config.yaml` example:

```yaml
server:
  ip: "0.0.0.0"
  port: 12345
  debug: true
  ssl: false

database:
  uri: "sqlite:///hire.db"
  track_modifications: false

jwt:
  secret_key: "your_jwt_secret"

assets:
  resume: "./src/Static/Resume/Documents"
  json_resume: "./src/Static/Resume/Json"

profiler:
  github_token: "ghp_xxxxxxxxxxxx"
  google_api_key: "your_google_api_key"
  rapid_api_key: "xxxxxxxxxxxxxxxx"
  batch_size: 5
  intervals:
    linkedin_scraping: 1
    text_extraction: 1
    github_scraping: 1
    google_scraping: 1
    profile_creation: 1
  scoring:
    weights:
      technical: 0.4
      experience: 0.35
      github: 0.25
    min_passing_score: 70.0

watcher:
  watcher_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/watcher_folder"
  failed_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/failed_folder"
  check_interval: 1

email_pipe_line:
  batch_size: 10
  check_interval: 1
  folder: "INBOX"
  allowed_attachments: [".pdf", ".doc", ".docx"]
```

## Pipeline Architecture

Candidate applications must be submitted in XML format:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<candidate>
    <email>example@email.com</email>
    <first_name>John</first_name>
    <last_name>Doe</last_name>
    <job_id>93f14b11-da25-4a9a-8bb2-4ac8509ddac0</job_id>
    <phone>+1234567890</phone>
    <current_company>Company Name</current_company>
    <current_position>Current Role</current_position>
    <years_of_experience>5</years_of_experience>
    <documents>
        <document name="resume.pdf" type="resume">resume.pdf</document>
    </documents>
</candidate>
```

Required fields:

- `email`
- `first_name`
- `last_name`
- `job_id` (valid UUID)
- `documents` (with resume)
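For illustration, a payload like the one above could be checked with the standard library alone. This is a sketch, not the project's actual validator: the field names come from the example, and `validate_candidate_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import List

REQUIRED_FIELDS = ("email", "first_name", "last_name", "job_id")

def validate_candidate_xml(xml_text: str) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"malformed XML: {exc}"]
    if root.tag != "candidate":
        return [f"unexpected root element: {root.tag}"]
    errors = []
    for field in REQUIRED_FIELDS:
        node = root.find(field)
        if node is None or not (node.text or "").strip():
            errors.append(f"missing required field: {field}")
    # The <documents> block must include at least one resume entry
    if root.find("documents/document[@type='resume']") is None:
        errors.append("missing resume document")
    return errors
```

Malformed XML, a wrong root element, empty required fields, and a missing resume attachment each produce a distinct error message, which maps naturally onto the error-response emails described later.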
- Input Detection:
  - Monitors `watcher_folder` for new XML files
  - Validates XML structure and schema
  - Checks for associated resume document

- Document Processing:
  - Moves resume to document storage
  - Generates unique document identifiers
  - Maintains document associations

- Candidate Creation:
  - Creates new candidate record
  - Sets initial pipeline status to `XML`
  - Triggers pipeline processing

```yaml
watcher:
  watcher_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/watcher_folder"
  failed_folder: "./src/PipeLines/Integration/FileWatcher/Watcher/failed_folder"
  check_interval: 1  # minutes

assets:
  resume: "./src/Static/Resume/Documents"
  json_resume: "./src/Static/Resume/Json"
```

Each pipeline operates as a daemon thread, continuously monitoring for candidates in specific states:
```
Pipeline Threads (All Running Continuously):
├── File Watcher Thread
│   ├── Monitors folder for new XML files
│   └── Creates candidates with XML status
│
├── Email Watcher Thread
│   ├── Monitors email inbox
│   └── Converts to XML and triggers File Watcher
│
├── Text Extraction Thread
│   ├── Watches for status: XML
│   ├── Processes resume text
│   └── Updates to: EXTRACT_TEXT
│
├── Google Scraping Thread
│   ├── Watches for status: EXTRACT_TEXT
│   ├── Gathers web presence
│   └── Updates to: GOOGLE_SCRAPE
│
├── LinkedIn Scraping Thread
│   ├── Watches for status: GOOGLE_SCRAPE
│   ├── Fetches LinkedIn data
│   └── Updates to: LINKEDIN_SCRAPE
│
├── GitHub Scraping Thread
│   ├── Watches for status: LINKEDIN_SCRAPE
│   ├── Analyzes GitHub activity
│   └── Updates to: GITHUB_SCRAPE
│
└── Profile Creation Thread
    ├── Watches for status: GITHUB_SCRAPE
    ├── Creates final profile
    └── Updates to: PROFILE_CREATED
```
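A minimal sketch of this daemon-thread pattern is shown below. The class and method names are illustrative, not the project's actual API; each stage's processing logic would be passed in as the `run_once` callable.

```python
import threading

class PipelineThread:
    """Illustrative wrapper: runs one pipeline stage on a daemon thread."""

    def __init__(self, name, run_once, interval_seconds=60):
        self.name = name
        self.run_once = run_once          # callable that processes one batch
        self.interval = interval_seconds
        self.stop_flag = threading.Event()
        self.thread = threading.Thread(target=self._loop, daemon=True, name=name)

    def _loop(self):
        while not self.stop_flag.is_set():
            self.run_once()
            # Sleep between polls, but wake immediately if stop() is called
            self.stop_flag.wait(self.interval)

    def start(self):
        self.thread.start()

    def stop(self):
        self.stop_flag.set()
        self.thread.join()
```

Using a `threading.Event` instead of `time.sleep` lets `stop()` interrupt the wait immediately, so shutdown does not have to wait for a full polling interval.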
Each pipeline uses an infinite loop for continuous processing:

```python
def _run_pipeline(self):
    with self.app.app_context():
        while not self.stop_flag.is_set():
            try:
                # Get candidates in specific status
                candidates = self.get_input_data()
                # Process batch if found
                if candidates:
                    self.process_batch()
                # Wait for next interval
                self.stop_flag.wait(self.config.process_interval)
            except Exception as e:
                self.handle_error(e)
```

- Continuous Polling:
  - Each thread continuously polls the database
  - Looks for candidates in its input status
  - Processes in configurable batch sizes

- Status-Based Processing:

  `XML → EXTRACT_TEXT → GOOGLE_SCRAPE → LINKEDIN_SCRAPE → GITHUB_SCRAPE → PROFILE_CREATION → PROFILE_CREATED`

- Thread Safety:
  - Isolated database transactions
  - Atomic status updates
  - Pipeline-specific state management
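The "atomic status updates" point can be illustrated with a compare-and-swap style claim over plain `sqlite3`. This is a sketch only: the `candidates` table layout and the `claim_candidates` helper are illustrative, and the project itself appears to access SQLite through an ORM rather than raw SQL.

```python
import sqlite3

def claim_candidates(conn, input_status, working_status, batch_size):
    """Atomically move up to batch_size candidates from input_status to
    working_status, so two pipeline threads never process the same row."""
    with conn:  # one transaction: SELECT and UPDATEs commit together
        rows = conn.execute(
            "SELECT id FROM candidates WHERE pipeline_status = ? LIMIT ?",
            (input_status, batch_size),
        ).fetchall()
        claimed = []
        for (cid,) in rows:
            cur = conn.execute(
                "UPDATE candidates SET pipeline_status = ? "
                "WHERE id = ? AND pipeline_status = ?",  # guard: only if unchanged
                (working_status, cid, input_status),
            )
            if cur.rowcount == 1:  # 0 means another thread claimed it first
                claimed.append(cid)
    return claimed
```

The `WHERE ... AND pipeline_status = ?` guard is what makes the update a compare-and-swap: a row that was claimed by another thread between the SELECT and the UPDATE is simply skipped.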
```yaml
profiler:
  batch_size: 5   # Number of candidates per batch
  intervals:      # Polling intervals in minutes
    linkedin_scraping: 1
    text_extraction: 1
    github_scraping: 1
    google_scraping: 1
    profile_creation: 1
```

```python
# Each pipeline continuously:
while not stop_flag:
    # Find candidates in input status
    candidates = find_candidates_in_status(INPUT_STATUS)
    if candidates:
        try:
            # Process candidates
            process_candidates(candidates)
            # Update to next status
            update_status(candidates, OUTPUT_STATUS)
        except Exception:
            # Mark as failed
            update_status(candidates, FAILED_STATUS)
    # Wait for next interval
    wait(process_interval)
```

Example status progressions:

```
Candidate A: XML → EXTRACT_TEXT → GOOGLE_SCRAPE → ...
Candidate B: XML → EXTRACT_TEXT → GOOGLE_SCRAPE_FAILED
Candidate C: XML → EXTRACT_TEXT_FAILED
```
- Failed states don't block pipeline
- Detailed error logging
- Automatic retry mechanism
- Status-based error tracking
- Error notification system
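The retry mechanism listed above could be sketched as follows. This is illustrative only: the README does not specify the actual retry policy, and `with_retries` and its parameters are hypothetical.

```python
import time

def with_retries(task, attempts=3, base_delay=1.0, on_failure=None):
    """Run task(); retry with exponential backoff; after the last
    attempt, invoke on_failure (e.g. to set a *_FAILED status) and re-raise."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == attempts:
                if on_failure:
                    on_failure(exc)  # e.g. mark candidate failed, notify admins
                raise
            # 1.0s, 2.0s, 4.0s, ... between attempts
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Because failed candidates end up in a `*_FAILED` status rather than blocking the queue, a retry wrapper like this can be applied per candidate without stalling the rest of the batch.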
Pipeline states for each candidate:

```python
from enum import Enum

class CandidatePipelineStatus(Enum):
    XML = "xml"
    EXTRACT_TEXT = "extract_text"
    GOOGLE_SCRAPE = "google_scrape"
    LINKEDIN_SCRAPE = "linkedin_scrape"
    GITHUB_SCRAPE = "github_scrape"
    PROFILE_CREATION = "profile_creation"
    PROFILE_CREATED = "profile_created"

    # Failed states
    XML_FAILED = "xml_failed"
    EXTRACT_TEXT_FAILED = "extract_text_failed"
    GOOGLE_SCRAPE_FAILED = "google_scrape_failed"
    LINKEDIN_SCRAPE_FAILED = "linkedin_scrape_failed"
    GITHUB_SCRAPE_FAILED = "github_scrape_failed"
    PROFILE_CREATION_FAILED = "profile_creation_failed"
```

## Email System

Email applications must follow this format:
```
Applying for [position name] position. Please find below attached resume and documents for your reference.

First Name: [Required]
Middle Name: [Optional]
Last Name: [Required]
Job Id: [Required UUID]
```
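Extracting the labeled fields from such an email body could be sketched with a regular expression. This is illustrative; the project's actual parser and its error handling may differ.

```python
import re

# Matches lines of the form "Label: value" for the known labels
FIELD_PATTERN = re.compile(
    r"^(First Name|Middle Name|Last Name|Job Id):\s*(.+?)\s*$",
    re.MULTILINE,
)

def parse_application_email(body: str) -> dict:
    """Extract labeled fields from the email body into a dict;
    raise ValueError listing any missing required fields."""
    fields = {label: value for label, value in FIELD_PATTERN.findall(body)}
    missing = [f for f in ("First Name", "Last Name", "Job Id") if f not in fields]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return fields
```

A `ValueError` here corresponds to the "Missing Information" auto-reply shown below, which lists the absent fields back to the applicant.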
- Application Received:

  ```
  Subject: Application Received - [Position]

  Dear [First Name],
  Your application for [Position] has been received...
  ```

- Invalid Job ID:

  ```
  Subject: Application Error - Invalid Job ID

  Dear [First Name],
  The Job ID [Job ID] is not valid...
  ```

- Missing Fields:

  ```
  Subject: Application Error - Missing Information

  Dear Applicant,
  The following required fields are missing:
  [Missing Fields List]
  ```

## API Documentation

Authentication Endpoints:
```
├── POST /register
├── POST /login
├── POST /logout
├── POST /refresh/tokens
└── GET  /user/fetch
```

Protected Endpoints:

```
├── Jobs Management
│   ├── GET  /jobs
│   ├── POST /jobs
│   └── PUT  /jobs/<id>
└── Interview Management
    ├── POST /interviews/schedule
    └── GET  /interviews/schedules
```
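A client session against these endpoints might look like the following standard-library sketch. The JSON login payload and the `access_token` response field are assumptions, since the request/response bodies are not documented here; the base URL matches the `server` block in `Config.yaml`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:12345"  # server.ip / server.port from Config.yaml

def login(email, password, opener=urllib.request.urlopen):
    """POST credentials to /login and return the JWT access token.
    Assumes a JSON response containing an "access_token" field."""
    req = urllib.request.Request(
        f"{BASE_URL}/login",
        data=json.dumps({"email": email, "password": password}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(req) as resp:
        return json.loads(resp.read())["access_token"]

def list_jobs(token, opener=urllib.request.urlopen):
    """GET /jobs with the Bearer token attached."""
    req = urllib.request.Request(
        f"{BASE_URL}/jobs",
        headers={"Authorization": f"Bearer {token}"},
    )
    with opener(req) as resp:
        return json.loads(resp.read())
```

The injectable `opener` parameter is only there so the sketch can be exercised without a running server; real callers would rely on the default `urllib.request.urlopen`.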
## Real-Time Monitoring

Real-time Endpoints:

```
├── GET /api/monitor/status
└── GET /api/monitor/status/stream
```
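As a consumer-side example, the snapshot from `/api/monitor/status` could be scanned for pipelines that have stopped updating. This is a sketch that assumes the response shape shown in the samples below; `stalled_pipelines` and its threshold are hypothetical.

```python
from datetime import datetime, timezone

def stalled_pipelines(status_payload, max_age_seconds=300, now=None):
    """Return names of pipelines whose last_updated timestamp is older
    than max_age_seconds relative to `now` (UTC)."""
    now = now or datetime.now(timezone.utc)
    stalled = []
    for name, info in status_payload.get("pipelines", {}).items():
        # Timestamps end in "Z"; normalize so fromisoformat() accepts them
        last = datetime.fromisoformat(info["last_updated"].replace("Z", "+00:00"))
        if (now - last).total_seconds() > max_age_seconds:
            stalled.append(name)
    return stalled
```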
- Pipeline Monitor:

  ```json
  {
    "timestamp": "2025-02-12T10:00:00Z",
    "pipelines": {
      "text_extraction": {
        "status": "PROCESSING",
        "last_updated": "2025-02-12T09:59:55Z"
      }
    }
  }
  ```

- Candidate Monitor:

  ```json
  {
    "data": {
      "candidates": [...],
      "pagination": {
        "total": 100,
        "page": 1,
        "per_page": 10
      }
    }
  }
  ```

## Security

- JWT-based authentication
- Role-based access control
- API rate limiting
- Secure password storage
- Email validation
- Input sanitization
## Deployment

- Set up environment:
  - Configure API keys
  - Set up email server
  - Configure database
- Install dependencies
- Initialize database
- Start application:

  ```bash
  python app.py
  ```

## Troubleshooting

- Pipeline Failures:
  - Check API quotas
  - Verify credentials
  - Check network connectivity

- Email Issues:
  - Verify SMTP settings
  - Check email templates
  - Validate email format

- Database Issues:
  - Check connections
  - Verify permissions
  - Monitor disk space
## Contributing

- Fork the repository
- Create a feature branch
- Submit a pull request

## License

MIT License