Feature Request: Extract clickable URLs from PDF text #152

### Problem
Currently, `PDFHandler` extracts only the visible text of a resume.
When a link appears in the PDF as clickable text (e.g., "GitHub"), the underlying URL is not captured.
As a result, the JSON resume omits these URLs from the "profiles" section.

### Proposed Solution
Enhance `to_markdown` (or `PDFHandler`) to:
1. Extract link annotations from each page (e.g., `link['uri']` from PyMuPDF's `page.get_links()`).
2. Append the extracted URLs to the text passed to the LLM prompt.
3. Update the LLM prompt so that it uses these URLs when filling in the JSON output.
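
A rough sketch of steps 1 and 2, assuming PyMuPDF is available. The helper names `extract_links` and `append_links_to_text` are hypothetical and not part of the current codebase, and the "Links found in the PDF" footer format is only one possible way to hand the URLs to the LLM:

```python
def extract_links(pdf_path):
    """Return (label, uri) pairs for every URI link annotation in the PDF.

    Hypothetical helper; PyMuPDF is imported lazily so the rest of the
    module still loads when it is not installed.
    """
    import fitz  # PyMuPDF

    links, seen = [], set()
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for link in page.get_links():
                uri = link.get("uri")
                if not uri or uri in seen:
                    continue  # skip internal links and duplicates
                seen.add(uri)
                # link["from"] is the clickable rectangle; the text inside
                # it is usually the visible label (e.g. "GitHub").
                label = page.get_textbox(link["from"]).strip()
                links.append((label, uri))
    return links


def append_links_to_text(text, links):
    """Append the extracted links to the text that goes into the LLM prompt."""
    if not links:
        return text
    lines = ["", "Links found in the PDF:"]
    for label, uri in links:
        lines.append(f"- {label}: {uri}" if label else f"- {uri}")
    return text.rstrip() + "\n" + "\n".join(lines) + "\n"
```

Keeping the visible label next to each URL should give the LLM a better chance of mapping a link to the right profile entry, though that would need to be checked against real resumes.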

### Benefits
- Improves accuracy of profile extraction (GitHub, LinkedIn, portfolio links).  
- Ensures that clickable links in resumes are not lost.  
- Makes the system more robust for real-world resumes.


