Brief Overview:
Uses PyMuPDF to extract text and layout metadata (font size, style, position) from PDFs, then applies heuristics to that metadata to identify headings (H1–H3) and build a structured JSON outline.
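
A minimal sketch of one way such a heading heuristic could work, not necessarily the rule this project uses. It assumes the body font is the size that covers the most characters and that the three largest sizes above it map to H1 through H3. The function names `collect_spans` and `build_outline` and the output field names are hypothetical.

```python
import json
from collections import Counter


def collect_spans(pdf_path):
    """Read text spans with size, bold flag, page, and vertical position."""
    import fitz  # PyMuPDF, imported lazily so build_outline works without it

    spans = []
    with fitz.open(pdf_path) as doc:
        for page_no, page in enumerate(doc, start=1):
            for block in page.get_text("dict")["blocks"]:
                # Image blocks carry no "lines" key, so default to empty.
                for line in block.get("lines", []):
                    for span in line["spans"]:
                        text = span["text"].strip()
                        if text:
                            spans.append({
                                "text": text,
                                "size": round(span["size"], 1),
                                "bold": bool(span["flags"] & 16),  # bit 4 marks bold
                                "page": page_no,
                                "y": span["bbox"][1],
                            })
    return spans


def build_outline(spans, max_levels=3):
    """Assumed heuristic: sizes larger than the body size become H1..H3."""
    if not spans:
        return []
    # Weight each font size by how many characters use it.
    weight = Counter()
    for s in spans:
        weight[s["size"]] += len(s["text"])
    body_size = weight.most_common(1)[0][0]
    heading_sizes = sorted(
        {s["size"] for s in spans if s["size"] > body_size}, reverse=True
    )[:max_levels]
    level_of = {size: f"H{i}" for i, size in enumerate(heading_sizes, start=1)}
    return [
        {"level": level_of[s["size"]], "text": s["text"], "page": s["page"]}
        for s in spans
        if s["size"] in level_of
    ]


def outline_to_json(spans):
    """Serialize the outline as a JSON string."""
    return json.dumps({"outline": build_outline(spans)}, indent=2)
```

A size-only rule is easy to reason about but misses headings set in the body size. The `bold` and `y` fields are collected so that a fuller heuristic could also weigh style and position.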
Brief Overview:
Uses spaCy embeddings and cosine similarity to rank sections by relevance to the persona and task, then extracts 1–3 key sentences from each top-ranked section. Runs fully offline.
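
The ranking step could look roughly like the sketch below, which is illustrative and not the project's actual code. The embedding function is passed in, so the scoring logic stays independent of spaCy. `spacy_embed` assumes the `en_core_web_md` model is installed locally, and the names `rank_sections`, `top_sentences`, and the `text` field are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def spacy_embed(model_name="en_core_web_md"):
    """Return a text -> vector function backed by a locally installed spaCy model."""
    import spacy  # imported lazily; the model must already be on disk

    nlp = spacy.load(model_name)
    return lambda text: nlp(text).vector


def rank_sections(query, sections, embed, top_k=3):
    """Score each section against the persona-task query and keep the best top_k."""
    q = embed(query)
    scored = [(cosine(q, embed(s["text"])), s) for s in sections]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def top_sentences(query, sentences, embed, n=3):
    """Return up to n sentences most similar to the query."""
    q = embed(query)
    ranked = sorted(sentences, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:n]
```

With spaCy available, `embed = spacy_embed()` followed by `rank_sections("travel planner: plan a four-day trip", sections, embed)` would return `(score, section)` pairs. The sentences for `top_sentences` could come from `nlp(text).sents`.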