Searching for tens of thousands of hits yields incomplete results #6772

@matthias-ronge

Searching the index for a keyword returns at most ten thousand results, not all of them. Even if all results were available, the list of IDs would then have to be passed to the database to filter by further conditions. This is not a full-text search for users, so a search engine is the wrong tool for the job. The approach is inefficient in both performance and memory usage, even on powerful servers.

Some metadata is used as a filter keyword for selection criteria such as service provider, scanner, or subproject. Such a keyword can be assigned to tens of thousands of processes, and the search must then retrieve all hits, especially for statistical purposes.
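To illustrate the effect on statistics, here is a minimal, self-contained sketch. It does not use Kitodo.Production code or any search engine API: `searchIndex`, `RESULT_CAP`, and the figure of 25,000 matching processes are hypothetical names and numbers chosen only to simulate a lookup that is cut off at ten thousand hits.

```java
import java.util.ArrayList;
import java.util.List;

public class CappedSearchSketch {

    // Hypothetical cap on the hits returned per query (the issue reports ten thousand).
    static final int RESULT_CAP = 10_000;

    /**
     * Simulates an index lookup: returns the IDs of the matching processes,
     * truncated once the cap is reached. Not a real search engine call.
     */
    static List<Integer> searchIndex(int matchingProcesses, int cap) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= matchingProcesses && ids.size() < cap; id++) {
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        int matching = 25_000; // made-up number of processes sharing one metadata value
        List<Integer> hits = searchIndex(matching, RESULT_CAP);

        System.out.println("matching processes: " + matching);
        System.out.println("hits returned:      " + hits.size());
        System.out.println("missing from stats: " + (matching - hits.size()));
    }
}
```

With these made-up numbers, any statistic computed from the returned ID list covers only 10,000 of the 25,000 matching processes, so counts and sums are silently too low.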

To Reproduce

Steps to reproduce the behavior:

  1. Create a database containing tens of thousands of processes, more than ten thousand of which share the same metadata value.
  2. Search for that metadata value.
  3. The results list is incomplete.

Expected behavior
The query result must be complete.

Release
Kitodo.Production 3.9

Cf. #6549 #6719

    Labels

    searchsearch, filter
