CareerScope is a unified intelligence platform for the tech ecosystem, focused on AI, ML, and Data Science. It aggregates and analyzes company data, job openings, events, courses, research papers, and tech news to provide actionable career intelligence — guiding users to the right opportunities, resources, and insights in one place.
Company Insights
- Careers page URL, open positions, tech stack, benefits, basic info (founded year, size, location, tier)
- Live monitoring of active hiring status
Jobs
- Aggregated from company career pages (Workday/Greenhouse/Lever) + external providers (Adzuna, Jooble, Reddit, Eventbrite)
- Normalized and scored by freshness; global jobs API with filters (q, role, tech, location, work type, source)
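The README does not say how freshness is computed. One common approach is exponential decay on posting age; the sketch below assumes a hypothetical `freshness_score` helper and a 7-day half-life, neither of which is taken from the CareerScope code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a 7-day half-life. The real scoring rule is not documented here.
HALF_LIFE_DAYS = 7.0


def freshness_score(posted_at: datetime, now: datetime) -> float:
    """Return a score in (0, 1]: 1.0 for a new posting, halving every HALF_LIFE_DAYS."""
    # Clamp negative ages (postings dated in the future) to zero.
    age_days = max((now - posted_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
fresh = freshness_score(now, now)
week_old = freshness_score(now - timedelta(days=7), now)
```

A decaying score like this lets the jobs API sort by freshness without a hard cutoff date.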
Events
- Hackathons, workshops, and conferences
- Filterable by domain, location, and dates
News
- Tech & hiring news relevant to AI, ML, and Data Science (e.g., Industry Dive)
- Ranked by relevance and recency; exposed via /api/news/
Courses
- Recommended learning paths & courses from major platforms (Coursera, Udemy, edX)
- Linked to skill gaps and job opportunities
Research Papers
- Aggregates top papers from ArXiv, OpenReview, and Semantic Scholar
- Metadata: authors, citations, abstracts, domains
Intelligence Layer
- Matchmaking between user skills, jobs, events, and learning opportunities
- Freshness scoring & dynamic alerts
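How the matchmaking works is not described in this README. As a rough illustration only, skill-to-job matching can be sketched as set overlap: score each job by the share of its required skills the user already has, and treat the remainder as the skill gap that courses could fill. The function names and the job dictionary shape below are hypothetical.

```python
def skill_match(user_skills: set[str], required_skills: set[str]) -> float:
    """Fraction of a job's required skills that the user already has (0.0 to 1.0)."""
    required = {s.lower() for s in required_skills}
    if not required:
        return 0.0
    have = {s.lower() for s in user_skills}
    return len(have & required) / len(required)


def skill_gap(user_skills: set[str], required_skills: set[str]) -> set[str]:
    """Required skills the user is missing, lowercased."""
    return {s.lower() for s in required_skills} - {s.lower() for s in user_skills}


def rank_jobs(user_skills: set[str], jobs: list[dict]) -> list[dict]:
    """Sort jobs by skill overlap, best match first."""
    return sorted(jobs, key=lambda job: skill_match(user_skills, job["skills"]), reverse=True)


user = {"Python", "PyTorch", "SQL"}
jobs = [
    {"title": "Data Engineer", "skills": {"SQL", "Spark", "Airflow"}},
    {"title": "ML Engineer", "skills": {"python", "pytorch"}},
]
ranked = rank_jobs(user, jobs)
gap = skill_gap(user, jobs[0]["skills"])
```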
Automation & Scheduling
- Celery + Redis for background ingestion tasks
- Automatic reprocessing for missing or outdated data
- Periodic jobs for news, job postings, and company enrichment
Architecture
- Backend: Django + DRF (APIs) with Celery workers for ingestion and enrichment
- Frontend: React (Vite)
- Ingestion: DataIngestion/* modules by domain (Companies, Jobs, News)
- Storage: PostgreSQL
- Schedulers: Celery Beat (every 10 minutes for jobs/news collectors, nightly rollups)
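A schedule like the one described could be declared roughly as follows. This is a sketch, not the project's actual settings: it assumes the Celery app reads Django settings with the `CELERY` namespace, and the entry names and task module paths are hypothetical (only `collect_all_external` and the `ingestion` queue are named in this README).

```python
from datetime import timedelta

# Celery Beat accepts a number of seconds, a timedelta, or a crontab as "schedule".
CELERY_BEAT_SCHEDULE = {
    "collect-jobs": {
        "task": "DataIngestion.Jobs.tasks.collect_jobs",  # assumed task path
        "schedule": timedelta(minutes=10),
        "options": {"queue": "ingestion"},
    },
    "collect-news": {
        "task": "DataIngestion.News.tasks.collect_news",  # assumed task path
        "schedule": timedelta(minutes=10),
        "options": {"queue": "ingestion"},
    },
    "nightly-rollup": {
        # Task name comes from the README; the module path is assumed.
        "task": "DataIngestion.Jobs.tasks.collect_all_external",
        # A timedelta only fixes the interval. To pin a clock time, use
        # celery.schedules.crontab(hour=..., minute=...) instead.
        "schedule": timedelta(days=1),
        "options": {"queue": "ingestion"},
    },
}
```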
High-level data flow:
- Schedulers trigger collectors (Jobs, News, Companies enrichment)
- Providers fetch and normalize data
- Upsert into database models (Jobs, Companies, NewsArticle)
- DRF APIs expose filtered/paginated endpoints for the frontend
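The fetch, normalize, upsert steps above can be sketched in plain Python. The `NormalizedJob` shape, the raw payload field names, and the `(source, external_id)` key are assumptions for illustration; a dictionary stands in for the Jobs table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedJob:
    source: str
    external_id: str
    title: str
    location: str


def normalize(source: str, raw: dict) -> NormalizedJob:
    """Map one provider payload onto the shared shape (field names are assumed)."""
    return NormalizedJob(
        source=source,
        external_id=str(raw["id"]),
        title=raw.get("title", "").strip(),
        location=raw.get("location", "").strip(),
    )


def upsert(store: dict, job: NormalizedJob) -> bool:
    """Insert or overwrite by (source, external_id). Returns True if the row was new."""
    key = (job.source, job.external_id)
    created = key not in store
    store[key] = job
    return created


store = {}  # stands in for the Jobs table
first = upsert(store, normalize("adzuna", {"id": 42, "title": " ML Engineer ", "location": "Berlin"}))
again = upsert(store, normalize("adzuna", {"id": 42, "title": "Senior ML Engineer", "location": "Berlin"}))
```

Keying on the provider plus its own identifier keeps repeated 10-minute runs idempotent: a re-fetched posting updates the existing row instead of creating a duplicate. With the Django ORM, `Model.objects.update_or_create(...)` gives the same insert-or-update behavior.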
REST endpoints exist for companies, global jobs, and news. The frontend consumes these directly. See the code under backend/Companies/api, backend/Jobs/api, and backend/News/api for details.
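As a hedged illustration of calling these endpoints, the helper below builds request URLs against a local dev server. Only `/api/news/` is documented here; the `/api/jobs/` path is a placeholder, and the exact query parameter spellings (`q`, `location`, `source`) are assumed from the filter names listed for the jobs API.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8000"  # Django's default runserver address


def build_url(path: str, **filters) -> str:
    """Build an endpoint URL, dropping filters that are unset (None or empty)."""
    params = {k: v for k, v in filters.items() if v not in (None, "")}
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


news_url = build_url("/api/news/")
# "/api/jobs/" is a placeholder path; check backend/Jobs/api for the real route.
jobs_url = build_url("/api/jobs/", q="nlp", location="remote", source=None)
```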
- Jobs collectors (every 10m): Adzuna, Jooble, Reddit (heuristics), Eventbrite
- Company ATS ingestion (scheduled): Workday/Greenhouse/Lever per hiring company
- News (every 10m): Industry Dive via DataIngestion/News
- Nightly: rollup collectors (collect_all_external)
Run-time services:
- Celery Beat scheduler
- Celery Worker (ingestion queue)
Create backend/.env and set:
- Django: SECRET_KEY, DEBUG, DATABASE_URL, DJANGO_ALLOWED_HOSTS
- Celery/Redis: CELERY_BROKER_URL, CELERY_RESULT_BACKEND
- Tavily: TAVILY_API_KEY
- Jobs APIs: ADZUNA_APP_ID, ADZUNA_API_KEY, JOOBLE_API_KEY, REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, EVENTBRITE_API_KEY
- News: INDUSTRY_DIVE_API_KEY, optional INDUSTRY_DIVE_BASE_URL
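A small startup check can catch a missing variable before a collector fails at run time. This sketch assumes every variable listed above except `INDUSTRY_DIVE_BASE_URL` is required and that the `.env` file has already been loaded into the process environment; how CareerScope loads it is not stated here, and `missing_vars` is a hypothetical helper.

```python
import os

REQUIRED_VARS = [
    "SECRET_KEY", "DEBUG", "DATABASE_URL", "DJANGO_ALLOWED_HOSTS",
    "CELERY_BROKER_URL", "CELERY_RESULT_BACKEND",
    "TAVILY_API_KEY",
    "ADZUNA_APP_ID", "ADZUNA_API_KEY", "JOOBLE_API_KEY",
    "REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "EVENTBRITE_API_KEY",
    "INDUSTRY_DIVE_API_KEY",
]  # INDUSTRY_DIVE_BASE_URL is optional, so it is not listed


def missing_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_vars()` with no argument checks the live environment; passing a dictionary makes the check easy to test.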
- Clone
    git clone <repository-url>
    cd CareerScope
- Backend
    cd backend
    python3 -m venv env
    source env/bin/activate
    pip install -r requirements.txt
    cp .env.example .env # or create your .env with the variables above
    python manage.py migrate
    python manage.py runserver
- Workers (separate terminals)
    # Terminal A: Celery Beat
    celery -A backend beat -l info
    # Terminal B: Celery Worker (ingestion queue)
    celery -A backend worker -Q ingestion --loglevel INFO
- Frontend (optional)
    cd ../frontend
    npm install
    npm run dev
Examples omitted for brevity. Use the running backend (http://127.0.0.1:8000) and inspect the API modules or your frontend network calls.
- Review ingestion quality (developer tool):
    python manage.py review_jobs_ingestion --days 7 --limit 50
    python manage.py review_jobs_ingestion --source adzuna --limit 30
- Adjust keyword/heuristics in DataIngestion/Jobs/filters.py
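The contents of `filters.py` are not shown in this README. As an illustration of what keyword heuristics of this kind often look like, the sketch below uses made-up include and exclude lists and a hypothetical `is_relevant` function with whole-word matching.

```python
import re

# Illustrative keywords only; the real lists live in DataIngestion/Jobs/filters.py.
INCLUDE = ("machine learning", "data scientist", "ml engineer", "nlp")
EXCLUDE = ("unpaid", "commission only")


def _has(term: str, text: str) -> bool:
    """Whole-word match, so 'ml engineer' does not fire on 'html engineer'."""
    return re.search(rf"\b{re.escape(term)}\b", text) is not None


def is_relevant(title: str, description: str = "") -> bool:
    """Keep a posting if it hits an include keyword and no exclude keyword."""
    text = f"{title} {description}".lower()
    if any(_has(term, text) for term in EXCLUDE):
        return False
    return any(_has(term, text) for term in INCLUDE)
```

After changing the lists, the `review_jobs_ingestion` command is the natural way to check the effect on recent postings.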
- Django, DRF, Celery, Redis, PostgreSQL
- React (Vite)
- Providers: Adzuna, Jooble, Reddit, Eventbrite, Industry Dive, Tavily
Private internal development. Not licensed for public distribution.