Internship Project: built for the Endee.io AI Engineering Internship Evaluation.
Demonstrates a production-grade RAG pipeline architecture using the Endee Vector Database.
A user types a career goal like "I want to become a Machine Learning Engineer" and the system:
- Embeds the query into a 384-dimensional vector using sentence-transformers
- Retrieves semantically relevant documents from 5 Endee collections in parallel
- Augments an LLM prompt with the retrieved context (RAG)
- Generates a structured career intelligence report containing:
  - Step-by-step learning roadmap (0–12 months)
  - Required skills ranked by relevance score
  - Real job roles with salary ranges and demand signals
  - Salary trends and market outlook
  - Recommended portfolio projects
  - Career timeline milestones
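The report described above is returned as a structured object. The following is a hypothetical sketch of its shape, for illustration only; the project's actual `CareerInsight` dataclass may use different field names:

```python
from dataclasses import dataclass, field, asdict

# Illustrative shape of the career report (field names are assumptions,
# not the project's real schema).
@dataclass
class CareerInsight:
    career_goal: str
    roadmap: list[str] = field(default_factory=list)    # 0-12 month steps
    skills: list[dict] = field(default_factory=list)    # name + relevance score
    jobs: list[dict] = field(default_factory=list)      # title, salary, demand
    salary_outlook: str = ""
    projects: list[str] = field(default_factory=list)
    milestones: list[str] = field(default_factory=list)

insight = CareerInsight(
    career_goal="Machine Learning Engineer",
    roadmap=["Month 0-3: Python + math foundations"],
)
# asdict() is what makes the dataclass trivially JSON-serializable.
print(asdict(insight)["career_goal"])  # Machine Learning Engineer
```

A dataclass-to-JSON boundary like this keeps the LLM output validated before it reaches the frontend.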
```
User Input
    │
    ▼
┌──────────────────────────────────────────────────────────────────┐
│                         FastAPI Backend                          │
│                                                                  │
│  POST /api/v1/career/analyze                                     │
│            │                                                     │
│            ▼                                                     │
│  EmbeddingEngine (sentence-transformers/all-MiniLM-L6-v2)        │
│      query → 384-dim float vector                                │
│            │                                                     │
│            ▼                                                     │
│  ┌───────────────────── Endee Vector DB ──────────────────────┐  │
│  │  Parallel cosine-similarity search across 5 collections    │  │
│  │                                                            │  │
│  │  job_roles       → top-5 semantically relevant jobs        │  │
│  │  skill_taxonomy  → top-5 relevant skills                   │  │
│  │  learning_paths  → top-5 resources                         │  │
│  │  salary_insights → top-5 salary records                    │  │
│  │  projects        → top-5 portfolio projects                │  │
│  └────────────────────────────────────────────────────────────┘  │
│      Retrieved SearchResult objects (id, score, metadata)        │
│            │                                                     │
│            ▼                                                     │
│  RAGPipeline.build_prompt(query + retrieved_context)             │
│            │                                                     │
│            ▼                                                     │
│  LLM Client (OpenAI / Groq / Ollama)                             │
│      system: expert career counselor                             │
│      user: career goal + retrieved context                       │
│            │                                                     │
│            ▼                                                     │
│  CareerInsight (structured Python dataclass → JSON)              │
└──────────────────────────────────────────────────────────────────┘
    │
    ▼
HTML/JS Frontend (renders roadmap, skills, jobs, salary, projects)
```
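The prompt-construction step in the middle of the pipeline can be sketched roughly as follows. The function name matches the diagram, but the internals and prompt wording here are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch of RAGPipeline.build_prompt (the real implementation
# lives in core/rag_pipeline.py and may differ).
def build_prompt(query: str, retrieved: dict[str, list[dict]]) -> list[dict]:
    # Flatten the per-collection hits into one context string.
    context_lines = []
    for collection, hits in retrieved.items():
        for hit in hits:
            context_lines.append(f"[{collection}] {hit['metadata']['content']}")
    context = "\n".join(context_lines)
    return [
        {"role": "system",
         "content": "You are an expert career counselor. Answer only from "
                    "the provided context."},
        {"role": "user",
         "content": f"Career goal: {query}\n\nContext:\n{context}"},
    ]

messages = build_prompt(
    "I want to become a Machine Learning Engineer",
    {"job_roles": [{"metadata": {"content": "ML Engineer builds ML systems"}}]},
)
print(messages[0]["role"], len(messages))  # system 2
```

Grounding the user message in retrieved context is what makes this retrieval-augmented rather than a plain LLM call.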
```
ai-career-engine/
│
├── backend/
│   ├── main.py                  # FastAPI app, startup, routing
│   ├── core/
│   │   ├── vector_store.py      # Endee SDK integration + InMemoryFallback
│   │   ├── embeddings.py        # sentence-transformers wrapper + LRU cache
│   │   └── rag_pipeline.py      # RAG orchestration + LLM client + prompts
│   └── api/
│       ├── career_routes.py     # /career/analyze, /search/semantic, /stats
│       └── health_routes.py     # /health
│
├── frontend/
│   └── templates/
│       └── index.html           # Single-file frontend (HTML + CSS + JS)
│
├── scripts/
│   └── seed_data.py             # One-time dataset loader → Endee collections
│
├── .env.example                 # Environment variable template
├── requirements.txt             # All Python dependencies
└── README.md
```
| Layer | Technology | Purpose |
|---|---|---|
| Vector DB | Endee | Store & search 384-dim embeddings |
| Embeddings | sentence-transformers (MiniLM) | Convert text to dense vectors |
| RAG | Custom pipeline | Retrieval-augmented prompt construction |
| LLM | OpenAI / Groq / Ollama | Career insight generation from context |
| Backend | FastAPI + Python 3.11 | REST API, async, Pydantic validation |
| Frontend | Vanilla HTML/CSS/JS | Zero-dependency terminal-aesthetic UI |
Endee serves as the semantic memory of the system, replacing traditional keyword search with meaning-aware retrieval.
| Collection | Documents | Content |
|---|---|---|
| `job_roles` | 10 | Job titles, salaries, required skills, demand |
| `skill_taxonomy` | 15 | Skills, categories, importance, learning time |
| `learning_paths` | 10 | Courses, books, platforms with metadata |
| `salary_insights` | 5 | Compensation by role, location, YoY growth |
| `projects` | 6 | Portfolio projects with tech stacks |
```python
from core.vector_store import VectorStore
from core.embeddings import EmbeddingEngine

engine = EmbeddingEngine()
store = VectorStore()
await store.initialize()

# Embed a document and upsert it into the job_roles collection
vector = engine.embed_single("Machine Learning Engineer requires Python and PyTorch")
await store.upsert_vectors("job_roles", [{
    "id": "job_001",
    "values": vector,  # 384-dim float list
    "metadata": {
        "job_title": "Machine Learning Engineer",
        "salary_range": "$130k–$180k",
        "demand": "Very High",
        "key_skills": ["Python", "PyTorch", "MLOps"],
        "content": "Machine Learning Engineer builds production ML systems..."
    }
}])

# Search the collection semantically
query_vector = engine.embed_single("I want a career in deep learning")
results = await store.semantic_search(
    collection="job_roles",
    query_vector=query_vector,
    top_k=5
)
for r in results:
    print(f"[{r.score:.3f}] {r.metadata['job_title']} – {r.metadata['salary_range']}")
```

- Python 3.11+
- Endee API key (sign up at endee.io)
- OpenAI or Groq API key (or run Ollama locally for free)
```bash
git clone https://github.com/yourusername/ai-career-engine.git
cd ai-career-engine
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

```bash
cp .env.example .env
# Edit .env and fill in:
# ENDEE_API_KEY=your_key_here
# OPENAI_API_KEY=your_key_here   (or GROQ_API_KEY for free tier)
```

```bash
python scripts/seed_data.py
```

This embeds 46 career documents and upserts them into 5 Endee collections.

```bash
cd backend
uvicorn main:app --reload --port 8000
```

Navigate to http://localhost:8000 in your browser.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/career/analyze` | Full RAG career analysis |
| GET | `/api/v1/career/skills?role=` | Ranked skills for a role |
| GET | `/api/v1/career/jobs?goal=` | Job roles matching a goal |
| POST | `/api/v1/search/semantic` | Raw semantic search in any collection |
| GET | `/api/v1/stats` | Vector DB collection counts + cache stats |
| GET | `/api/v1/health` | Health check |
```bash
curl -X POST http://localhost:8000/api/v1/career/analyze \
  -H "Content-Type: application/json" \
  -d '{"career_goal": "I want to become a Machine Learning Engineer", "top_k": 5}'
```

| Option | Cost | Setup |
|---|---|---|
| OpenAI GPT-4o-mini | ~$0.01/query | `OPENAI_API_KEY=sk-...` |
| Groq Llama-3 | Free tier | `GROQ_API_KEY=gsk_...` + update `.env` |
| Ollama (local) | Free | `ollama pull llama3.2` + update `.env` |
No LLM key? The system still works: it falls back to rule-based generation using the retrieved Endee data.
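The fallback path can be sketched like this; the names `generate_insight` and `generate_with_llm` are illustrative assumptions, not the project's actual API:

```python
# Hypothetical sketch of graceful degradation: with no LLM key, the report
# is assembled directly from the metadata Endee already returned.
def generate_insight(career_goal, retrieved, llm_key=None):
    if llm_key:
        return generate_with_llm(career_goal, retrieved, llm_key)
    # Rule-based fallback: no generation, just structured retrieved facts.
    return {
        "goal": career_goal,
        "top_jobs": [r["metadata"]["job_title"] for r in retrieved],
        "note": "rule-based fallback (no LLM key configured)",
    }

def generate_with_llm(goal, retrieved, key):
    raise NotImplementedError  # placeholder for the real LLM call

retrieved = [{"metadata": {"job_title": "ML Engineer"}}]
report = generate_insight("become an ML engineer", retrieved)
print(report["top_jobs"])  # ['ML Engineer']
```

Because retrieval quality does most of the work in a RAG system, even this degraded path returns something useful.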
```bash
pytest tests/ -v
```

**Why Endee over SQL/keyword search?**
A user asking about a "deep learning career" and one asking for a "neural network engineer path" mean the same thing, yet the phrasings share no keywords, so SQL `LIKE` queries fail here. Endee's cosine similarity returns the same top results for both because the two queries are close in embedding space.
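A toy illustration of the difference (the 4-dim vectors below are made-up stand-ins for real 384-dim embeddings, chosen only to show the idea):

```python
import math

def cosine(a, b):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

q1 = "deep learning career"
q2 = "neural network engineer path"

# Keyword matching: the two queries share zero words, so LIKE/BM25-style
# overlap scores them as unrelated.
shared = set(q1.split()) & set(q2.split())
print(shared)  # set()

# Dense retrieval: an encoder maps both phrases to nearby points, so their
# cosine similarity is high (toy values, not real model output).
v1 = [0.81, 0.52, 0.11, 0.05]
v2 = [0.78, 0.57, 0.15, 0.02]
print(round(cosine(v1, v2), 3))  # close to 1.0 → "same topic"
```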
**Why sentence-transformers (MiniLM)?**
Fast (14k tokens/sec on CPU), small (22 MB), good quality for retrieval tasks. Outperforms TF-IDF and BM25 on semantic similarity benchmarks.
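The LRU cache noted for `core/embeddings.py` can be sketched as below; `_encode` is a stub standing in for the real `SentenceTransformer.encode` call so the example runs without the model installed, and the internals are assumptions about the wrapper, not its actual code:

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the "model" actually runs

def _encode(text: str) -> tuple[float, ...]:
    # Stand-in for SentenceTransformer("all-MiniLM-L6-v2").encode(text);
    # returns a fake fixed-size vector so the sketch is self-contained.
    CALLS["count"] += 1
    return tuple(float(ord(c) % 7) for c in text[:8])

@lru_cache(maxsize=1024)
def embed_single(text: str) -> tuple[float, ...]:
    # Tuples (not lists) so results are hashable and cacheable.
    return _encode(text)

embed_single("deep learning career")
embed_single("deep learning career")  # served from the cache
print(CALLS["count"])  # 1 - the encoder ran only once
```

Repeated or popular queries then skip the encoder entirely, which matters most under CPU-only deployment.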
**Why async throughout?**
The 5 Endee collections are queried concurrently with asyncio; done sequentially, retrieval would take roughly 5x as long.
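A minimal sketch of that concurrency win, using a sleep as a stand-in for network I/O (the search function is a stub, not the project's `semantic_search`):

```python
import asyncio
import time

COLLECTIONS = ["job_roles", "skill_taxonomy", "learning_paths",
               "salary_insights", "projects"]

async def search_collection(name: str) -> str:
    # Stand-in for an Endee search; simulates ~0.1 s of network latency.
    await asyncio.sleep(0.1)
    return f"top-5 hits from {name}"

async def retrieve_all() -> list[str]:
    # gather() awaits all five searches concurrently, so total latency is
    # roughly one call's latency rather than the sum of five.
    return await asyncio.gather(*(search_collection(c) for c in COLLECTIONS))

start = time.perf_counter()
results = asyncio.run(retrieve_all())
elapsed = time.perf_counter() - start
print(len(results), f"{elapsed:.2f}s")  # 5 results in ~0.1 s, not ~0.5 s
```

`asyncio.gather` also preserves input order, so each result can be matched back to its collection by index.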
**Why a fallback for empty collections?**
Graceful degradation: the system still produces useful output even before the DB is seeded, making it easier to demo.
Apoorva Deep Singh, B.Tech CSE
Built as part of the Endee.io AI Internship Evaluation
This project demonstrates production-level RAG system design: vector database integration, semantic retrieval, LLM prompt engineering, async FastAPI, and clean modular architecture.