musiliandrew/CareerScoper

CareerScope

CareerScope is a unified intelligence platform for the tech ecosystem, focused on AI, ML, and Data Science. It aggregates and analyzes company data, job openings, events, courses, research papers, and tech news, so that users can find relevant opportunities, resources, and insights in one place.


🌟 Key Features

  • Company Insights

    • Careers page URL, open positions, tech stack, benefits, basic info (founded year, size, location, tier)
    • Live monitoring of active hiring status
  • Jobs

    • Aggregated from company career pages (Workday/Greenhouse/Lever) + external providers (Adzuna, Jooble, Reddit, Eventbrite)
    • Normalized and scored by freshness; global jobs API with filters (q, role, tech, location, work type, source)
  • Events

    • Hackathons, workshops, and conferences
    • Filterable by domain, location, and dates
  • News

    • Tech & hiring news relevant to AI, ML, and Data Science (e.g., Industry Dive)
    • Ranked by relevance and recency; exposed via /api/news/
  • Courses

    • Recommended learning paths & courses from major platforms (Coursera, Udemy, edX)
    • Linked to skill gaps and job opportunities
  • Research Papers

    • Aggregates top papers from ArXiv, OpenReview, and Semantic Scholar
    • Metadata: authors, citations, abstracts, domains
  • Intelligence Layer

    • Matchmaking between user skills, jobs, events, and learning opportunities
    • Freshness scoring & dynamic alerts
  • Automation & Scheduling

    • Celery + Redis for background ingestion tasks
    • Automatic reprocessing for missing or outdated data
    • Periodic jobs for news, job postings, and company enrichment
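
The feature list mentions freshness scoring but does not say how it is computed. As a rough illustration only, the sketch below assumes an exponential decay with a 7-day half-life; the function name `freshness_score` and the `half_life_days` parameter are hypothetical and not taken from the codebase.

```python
from datetime import datetime, timedelta, timezone


def freshness_score(posted_at: datetime, now: datetime, half_life_days: float = 7.0) -> float:
    """Return a score between 0 and 1 that halves every `half_life_days`.

    Hypothetical formula for illustration; the project's real scoring may differ.
    """
    # Postings dated in the future are treated as brand new (age 0).
    age_seconds = max((now - posted_at).total_seconds(), 0.0)
    age_days = age_seconds / 86400.0
    return 0.5 ** (age_days / half_life_days)


if __name__ == "__main__":
    now = datetime(2024, 1, 15, tzinfo=timezone.utc)
    for days in (0, 7, 14):
        print(days, freshness_score(now - timedelta(days=days), now))
```

A decay curve like this keeps new postings near the top without a hard cutoff; the half-life would need tuning per content type (jobs versus news, for example).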

🏗 Architecture Overview

  • Backend: Django + DRF (APIs) with Celery workers for ingestion and enrichment
  • Frontend: React (Vite)
  • Ingestion: DataIngestion/* modules by domain (Companies, Jobs, News)
  • Storage: PostgreSQL
  • Schedulers: Celery Beat (every 10 minutes for jobs/news collectors, nightly rollups)

High-level data flow:

  1. Schedulers trigger collectors (Jobs, News, Companies enrichment)
  2. Providers fetch and normalize data
  3. Upsert into database models (Jobs, Companies, NewsArticle)
  4. DRF APIs expose filtered/paginated endpoints for the frontend
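
Steps 2 and 3 of the flow above (normalize, then upsert) can be sketched without Django, using an in-memory dict in place of the database. The `Job` fields, the raw payload keys, and the `(source, external_id)` key are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Job:
    source: str
    external_id: str
    title: str
    location: str


def normalize(source: str, raw: dict) -> Job:
    """Map a provider payload to a common shape (payload keys are assumed)."""
    return Job(
        source=source,
        external_id=str(raw["id"]),
        title=raw.get("title", "").strip(),
        location=raw.get("location", "").strip() or "Unknown",
    )


def upsert(store: Dict[Tuple[str, str], Job], jobs: Iterable[Job]) -> Tuple[int, int]:
    """Insert new jobs or overwrite existing ones, keyed by (source, external_id).

    Returns a (created, updated) pair of counts.
    """
    created = updated = 0
    for job in jobs:
        key = (job.source, job.external_id)
        if key in store:
            updated += 1
        else:
            created += 1
        store[key] = job
    return created, updated
```

In a Django codebase the same idea is commonly expressed with `Model.objects.update_or_create(...)`, using the provider name and external ID as lookup fields; whether CareerScope does exactly that is not stated here.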

🔌 Backend APIs

REST endpoints exist for companies, global jobs, and news. The frontend consumes these directly. See the code under backend/Companies/api, backend/Jobs/api, and backend/News/api for details.


⏱ Ingestion & Scheduling

  • Jobs collectors (every 10m): Adzuna, Jooble, Reddit (heuristics), Eventbrite
  • Company ATS ingestion (scheduled): Workday/Greenhouse/Lever per hiring company
  • News (every 10m): Industry Dive via DataIngestion/News
  • Nightly: rollup collectors (collect_all_external)

Run-time services:

  • Celery Beat scheduler
  • Celery Worker (ingestion queue)
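
The cadence described above might translate into a Celery Beat schedule along these lines. This is a sketch under assumptions: the entry names and dotted task paths are hypothetical (only `collect_all_external` and the `ingestion` queue appear in this README), and the dict would still need to be assigned to `app.conf.beat_schedule` in the project's Celery setup.

```python
from datetime import timedelta

# Hypothetical schedule; task module paths are assumed, not taken from the codebase.
BEAT_SCHEDULE = {
    "collect-jobs-every-10m": {
        "task": "DataIngestion.Jobs.tasks.collect_jobs",  # assumed task path
        "schedule": timedelta(minutes=10),
        "options": {"queue": "ingestion"},
    },
    "collect-news-every-10m": {
        "task": "DataIngestion.News.tasks.collect_news",  # assumed task path
        "schedule": timedelta(minutes=10),
        "options": {"queue": "ingestion"},
    },
    "nightly-rollup": {
        # `collect_all_external` is named in this README; its module path is assumed.
        "task": "DataIngestion.tasks.collect_all_external",
        # A plain interval fires every 24h counted from when beat starts.
        # celery.schedules.crontab(hour=2, minute=0) would pin it to a clock time instead.
        "schedule": timedelta(days=1),
        "options": {"queue": "ingestion"},
    },
}
```

Routing every entry to the `ingestion` queue matches the worker command in the Quick Start, which only consumes from that queue.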

🔐 Environment Variables (excerpt)

Create backend/.env and set:

  • Django: SECRET_KEY, DEBUG, DATABASE_URL, DJANGO_ALLOWED_HOSTS
  • Celery/Redis: CELERY_BROKER_URL, CELERY_RESULT_BACKEND
  • Tavily: TAVILY_API_KEY
  • Jobs APIs: ADZUNA_APP_ID, ADZUNA_API_KEY, JOOBLE_API_KEY, REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, EVENTBRITE_API_KEY
  • News: INDUSTRY_DIVE_API_KEY, optional INDUSTRY_DIVE_BASE_URL
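
A small startup check can catch a missing variable before a worker fails mid-run. The sketch below is illustrative: which variables are strictly required is an assumption, and `missing_vars` and `require_env` are hypothetical helpers, not part of the project.

```python
import os
from typing import Iterable, List, Mapping

# Assumed minimum set; provider keys may only be needed when that collector is enabled.
REQUIRED_VARS = ("SECRET_KEY", "DATABASE_URL", "CELERY_BROKER_URL", "CELERY_RESULT_BACKEND")


def missing_vars(env: Mapping[str, str], required: Iterable[str] = REQUIRED_VARS) -> List[str]:
    """Return the required names that are absent or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


def require_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise RuntimeError listing every missing variable at once."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` early (for example from a settings module) reports all gaps in one message rather than one failure at a time.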

🚀 Quick Start

  1. Clone
git clone <repository-url>
cd CareerScope
  2. Backend
cd backend
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
cp .env.example .env  # or create your .env with the variables above
python manage.py migrate
python manage.py runserver
  3. Workers (separate terminals)
# Terminal A: Celery Beat
celery -A backend beat -l info

# Terminal B: Celery Worker (ingestion queue)
celery -A backend worker -Q ingestion --loglevel INFO
  4. Frontend (optional)
cd ../frontend
npm install
npm run dev

🔎 API Examples

Full examples are omitted for brevity. With the backend running (http://127.0.0.1:8000), inspect the API modules or the frontend's network calls for request and response shapes.
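
As a starting point for exploring the API, the sketch below builds request URLs with the standard library. Only `/api/news/` is confirmed by this README; the `/api/jobs/` path and the exact query parameter names are assumptions based on the filter list under Key Features, so check the API modules before relying on them.

```python
import json
from typing import Mapping, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"


def build_url(path: str, params: Optional[Mapping[str, str]] = None, base: str = BASE_URL) -> str:
    """Join base, path, and an optional URL-encoded query string."""
    url = base.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return url


def get_json(url: str, timeout: float = 10.0):
    """Fetch a URL and decode its JSON body (requires the backend to be running)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


news_url = build_url("/api/news/")
# Hypothetical jobs endpoint and parameter names, shown for illustration only.
jobs_url = build_url("/api/jobs/", {"q": "ml engineer", "location": "remote"})
```

With the dev server up, `get_json(news_url)` would return the decoded response; the shape of that payload depends on the DRF serializers and pagination settings.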


🧪 QA & Tuning

  • Review ingestion quality (developer tool):
python manage.py review_jobs_ingestion --days 7 --limit 50
python manage.py review_jobs_ingestion --source adzuna --limit 30
  • Adjust keywords and heuristics in DataIngestion/Jobs/filters.py
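
The contents of DataIngestion/Jobs/filters.py are not shown here. The sketch below only illustrates the general shape a keyword heuristic could take; the keyword list and the `compile_keywords` and `is_relevant` helpers are hypothetical.

```python
import re
from typing import Iterable

# Illustrative keywords only; the project's real list is in DataIngestion/Jobs/filters.py.
KEYWORDS = ("machine learning", "data scientist", "ml engineer", "deep learning", "nlp")


def compile_keywords(keywords: Iterable[str]) -> "re.Pattern[str]":
    """Build one case-insensitive pattern matching any keyword on word boundaries."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def is_relevant(title: str, pattern: "re.Pattern[str]") -> bool:
    """Return True if the title contains at least one keyword."""
    return pattern.search(title) is not None
```

Word-boundary matching avoids false positives from substrings (for instance, "ml" inside "HTML"), which tends to matter for noisy sources such as Reddit posts.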

📦 Tech Stack

  • Django, DRF, Celery, Redis, PostgreSQL
  • React (Vite)
  • Providers: Adzuna, Jooble, Reddit, Eventbrite, Industry Dive, Tavily

License

Private internal development. Not licensed for public distribution.
