feat: Added support for .docx resume parsing #127

diyapratheep · 2025-10-08T16:48:22Z

Closes #126

Description

This pull request adds support for parsing and evaluating resumes in .docx format, making the hiring agent more versatile. It also includes a bug fix to make the evaluation model more robust.

Changes Made

New Dependency: Added python-docx to requirements.txt.
File Handling: Modified score.py to check the input file's extension and call the appropriate parsing function (.pdf or .docx).
Caching: Fixed the caching logic in score.py to correctly generate cache filenames for any supported file type.
DOCX Parsing: Created a new extract_json_from_docx method in pdf.py that leverages the existing LLM pipeline to ensure consistent, structured data output.
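
The file handling and caching changes above can be sketched roughly as follows. This is a minimal illustration, not the actual code in score.py: the names `SUPPORTED_EXTENSIONS`, `detect_format`, and `cache_path_for`, the `cache` directory, and the cache filename scheme are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a parser label.
# The PR supports exactly these two formats.
SUPPORTED_EXTENSIONS = {".pdf": "pdf", ".docx": "docx"}


def detect_format(resume_path: str) -> str:
    """Return "pdf" or "docx" based on the file extension (case-insensitive)."""
    suffix = Path(resume_path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported resume format: {suffix or '(none)'}")
    return SUPPORTED_EXTENSIONS[suffix]


def cache_path_for(resume_path: str, cache_dir: str = "cache") -> Path:
    """Build a cache filename that works for any supported file type.

    The extension is kept in the name (an assumed scheme) so that
    resume.pdf and resume.docx do not share one cache entry.
    """
    file_format = detect_format(resume_path)  # rejects unsupported types
    stem = Path(resume_path).stem
    return Path(cache_dir) / f"{stem}_{file_format}.json"
```

With this scheme, `cache_path_for("resume.docx")` and `cache_path_for("resume.pdf")` resolve to different files, which is one way to avoid a stale PDF cache entry being reused for a .docx input.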

How to Test

Ensure the new dependency is installed: pip install python-docx
Run the agent with a .docx file: python score.py /path/to/your/resume.docx
Verify that the evaluation completes successfully.

Introduces the capability to parse and evaluate resumes in .docx format, expanding the agent's functionality beyond just PDFs.

- Added the `python-docx` library to handle .docx file parsing.
- Modified `score.py` to dynamically detect the file extension (.pdf or .docx) and route it to the appropriate parsing function.
- Updated the caching mechanism in `score.py` to generate correct filenames for both file types.
- Refactored `pdf.py` by creating a new `extract_json_from_docx` method. This method reuses the existing core LLM logic to convert the extracted text into the structured JSONResume format, ensuring consistency.
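
As a rough sketch of how a .docx parser can reuse an existing text-to-JSON step, the example below reads paragraph text with `python-docx` and hands it to a caller-supplied function. The signature shown here is an assumption for illustration and is not taken from pdf.py: `text_to_json` stands in for the existing LLM pipeline, and `read_paragraphs` is a hypothetical hook that lets the function be exercised without a real file.

```python
from typing import Callable, List


def read_docx_paragraphs(docx_path: str) -> List[str]:
    """Read body paragraph text with python-docx.

    Note: Document.paragraphs covers body paragraphs only; text inside
    tables is not included and would need separate handling.
    """
    from docx import Document  # provided by the python-docx package

    return [paragraph.text for paragraph in Document(docx_path).paragraphs]


def extract_json_from_docx(
    docx_path: str,
    text_to_json: Callable[[str], dict],
    read_paragraphs: Callable[[str], List[str]] = read_docx_paragraphs,
) -> dict:
    """Extract text from a .docx file and convert it to structured data.

    text_to_json is a placeholder for the existing LLM step (assumed
    interface: plain text in, JSONResume-style dict out).
    """
    paragraphs = read_paragraphs(docx_path)
    # Drop empty paragraphs and surrounding whitespace before joining.
    text = "\n".join(line.strip() for line in paragraphs if line.strip())
    if not text:
        raise ValueError(f"No text could be extracted from {docx_path}")
    return text_to_json(text)
```

Passing the conversion step in as a callable keeps the .docx path and the PDF path on the same downstream logic, which is the consistency goal described above.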

diyapratheep · 2025-10-08T16:52:05Z

@sp2hari @anxkhn-hacker This PR is ready for review. It adds support for .docx files.
FYI, I used a dummy resume for testing, so the low evaluation score is expected and can be ignored. The core parsing works.

Let me know if any changes are needed! If not, I'd like to request that the issue and PR be closed soon. Thank you!

diyapratheep mentioned this pull request Oct 8, 2025

Feature Request: Add support for .docx resume files #126

Open
