@AkshayJKulkarni

This patch improves the get_page_output() function in pymupdf_rag.py:

  • Tries multi-column extraction using extract_text_multicolumn() first.
  • Falls back to the original column_boxes() if too few blocks are found.
  • Works for both single-column and multi-column PDFs.
  • Addresses issue #81 (Feature: Add Multi-format Resume Support).
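
The try-first, fall-back-second behaviour described above could be sketched roughly as follows. This is only an illustration, not the code in the patch: the helper name `choose_blocks`, the `min_blocks` threshold and its default of 2, and the idea of passing the two extractors in as callables are all assumptions made so the snippet runs on its own. In the actual change the two extractors would be extract_text_multicolumn() and column_boxes(), whose real signatures are not shown here.

```python
from typing import Any, Callable, List, Sequence


def choose_blocks(
    page: Any,
    multicolumn_extractor: Callable[[Any], Sequence[Any]],
    fallback_extractor: Callable[[Any], Sequence[Any]],
    min_blocks: int = 2,  # hypothetical threshold for "too few blocks"
) -> List[Any]:
    """Return blocks from the multi-column extractor, or fall back.

    Both extractors are hypothetical stand-ins that take a page and
    return a sequence of blocks (for example, bounding boxes).
    """
    # Try the multi-column extraction first.
    blocks = list(multicolumn_extractor(page))
    if len(blocks) >= min_blocks:
        return blocks
    # Too few blocks found: use the original extractor instead.
    return list(fallback_extractor(page))
```

Because both paths return the same kind of list, the caller in get_page_output() would not need to know which extractor produced the result, which is what lets the same code path handle single-column and multi-column PDFs.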
