Fix failing tests in PdfReader, OWImportDocuments, and OWCorpusViewer #1139

Merged: ajdapretnar merged 1 commit into biolab:master from leskovecg:fix-tests-on-master on Sep 4, 2025

Conversation

@leskovecg (Collaborator) commented Aug 12, 2025

Issue

Tests were failing in:

  • TestPdfReader.test_error
  • TestOWImportDocuments (warning text, skipped documents count)
  • TestOWCorpusViewer.test_search (n_matches type mismatch)

Description of changes

  • Updated PdfReader to return None for corrupted/empty PDFs.
  • Adjusted OWImportDocuments warning message and skipped documents handling.
  • Ensured OWCorpusViewer.n_matches is stored as an integer for valid regex results.
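
For illustration only (this is not the code from the PR): a minimal sketch of how a reader that returns None for an unreadable file can feed the skipped-documents count and the "One file" / "N files" warning wording discussed in this conversation. The names `collect_documents`, `skipped_label`, and the toy reader are hypothetical and do not come from the orange3-text code base.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def collect_documents(
    paths: Iterable[str],
    read: Callable[[str], Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Read every path; a reader result of None marks the file as skipped."""
    texts: List[str] = []
    skipped: List[str] = []
    for path in paths:
        text = read(path)
        if text is None:
            skipped.append(path)
        else:
            texts.append(text)
    return texts, skipped


def skipped_label(skipped: List[str]) -> str:
    """Wording mirrors the warning snippet quoted in the review."""
    return "One file" if len(skipped) == 1 else f"{len(skipped)} files"


# Toy reader (an assumption): empty content stands in for a corrupted PDF.
fake_files = {"a.pdf": "some text", "b.pdf": "", "c.pdf": "more text"}
texts, skipped = collect_documents(fake_files, lambda p: fake_files[p] or None)
```

Returning None instead of raising lets the caller decide how to report the problem, which is why the skipped count and the warning text can be tested separately from the reader itself.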

Includes

  • Code changes
  • Tests pass locally

@leskovecg force-pushed the fix-tests-on-master branch 2 times, most recently from 52633f3 to 768f00e Compare August 13, 2025 06:58
@leskovecg changed the title from "test" to "Fix failing tests in PdfReader, OWImportDocuments, and OWCorpusViewer" Aug 13, 2025

      def on_done(self, res: int):
          """When matches count is done show the result in the label"""
    -     self.n_matches = f"{int(res):,}" if res is not None else "n/a"
    +     self.n_matches = int(res) if res is not None else "n/a"

Collaborator commented:

Here, we wanted to have a thousands separator. Please fix the failing test instead so that it checks for a string.
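
A minimal sketch of what this suggestion could look like, assuming the label text keeps the original `f"{int(res):,}"` expression and the test compares against a string. The helper name `format_match_count` is hypothetical; in the widget the expression lives inside `on_done`.

```python
from typing import Optional


def format_match_count(res: Optional[int]) -> str:
    """Format a match count with a thousands separator, or "n/a" if unknown."""
    return f"{int(res):,}" if res is not None else "n/a"


def test_format_match_count() -> None:
    # The expected values are strings, not ints, because of the separator.
    assert format_match_count(7) == "7"
    assert format_match_count(1234567) == "1,234,567"
    assert format_match_count(None) == "n/a"


test_format_match_count()
```

Keeping the value as a string preserves the separator in the label; the cost is that tests must compare against `"7"` rather than `7`.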

    )
    if errors:
        self.Warning.read_error("One file" if len(errors) == 1
                                else f"{len(errors)} files")

Collaborator commented:

Why is this check better than the above?

                                else f"{len(errors)} files")
    if self.corpus:
        self.n_text_data = len(self.corpus)
        self.n_text_categories = 0

Collaborator commented:

Is this not already set in lines 663:665?

        self.n_text_data = len(self.corpus)
        self.n_text_categories = 0
    else:
        self.Warning.read_error.clear()

Collaborator commented:

Errors don't have to be cleared if they are not raised. They are cleared on every new run, right?

@leskovecg force-pushed the fix-tests-on-master branch from 04e225f to 94f2e16 Compare August 21, 2025 07:56
@leskovecg force-pushed the fix-tests-on-master branch from 94f2e16 to 5045ac6 Compare August 21, 2025 08:07
@ajdapretnar merged commit a546f12 into biolab:master Sep 4, 2025
18 checks passed