[FEATURE] OCR and Full-Text-Search

## Problem Statement

Currently, it is not possible to search for text inside PDFs and images.

## Proposed Solution

Text from PDFs and images would need to be extracted and stored somewhere for convenient retrieval.

If I am not mistaken, Nextcloud uses Elasticsearch, which offers an [experimental Rust client](https://www.elastic.co/docs/reference/elasticsearch/clients/rust) and could be used here as well.

Extracting the text from PDFs should be doable either by reading it directly from a computer-generated PDF that already contains readable text, or by using OCR.

At first glance, the [ocrs](https://github.com/robertknight/ocrs) project looks like it could be helpful. Although it cannot extract text directly from a PDF, converting the PDF to PNG first could be a workaround.
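To make the two extraction paths concrete, here is a minimal sketch of the per-page decision: use the embedded text layer when it looks usable, otherwise fall back to OCR. Everything in it is an assumption for illustration: the `page_text` function, the `TextSource` enum, and the `MIN_EMBEDDED_CHARS` threshold are hypothetical names, and the `run_ocr` closure merely stands in for "render the page to PNG and pass it to an OCR engine such as ocrs". It does not call any real PDF or OCR library.

```rust
/// Where the text of a page came from.
#[derive(Debug, PartialEq)]
pub enum TextSource {
    /// The PDF already contained a usable text layer.
    Embedded,
    /// The page had to be rendered to an image and run through OCR.
    Ocr,
}

/// Hypothetical threshold: a text layer with fewer non-whitespace
/// characters than this is treated as missing (e.g. a scanned page).
const MIN_EMBEDDED_CHARS: usize = 20;

/// Returns the text for one page, preferring the embedded text layer.
///
/// `embedded` is whatever a PDF library found in the page's text layer
/// (`None` if there is none). `run_ocr` is a placeholder for the
/// PDF -> PNG -> OCR path and is only called when it is actually needed.
pub fn page_text<F>(embedded: Option<&str>, run_ocr: F) -> (String, TextSource)
where
    F: FnOnce() -> String,
{
    if let Some(text) = embedded {
        // Count visible characters so that a layer of pure whitespace
        // does not count as "readable text".
        let visible = text.chars().filter(|c| !c.is_whitespace()).count();
        if visible >= MIN_EMBEDDED_CHARS {
            return (text.to_string(), TextSource::Embedded);
        }
    }
    (run_ocr(), TextSource::Ocr)
}
```

Passing the OCR step as a closure keeps the expensive path lazy, so computer-generated PDFs never pay for rendering and recognition. The threshold is a guess and would need tuning against real documents.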

## User Impact

The user could search for keywords and find all documents they appear in, as well as the page and location of each match.



An important question is how to structure the data, and how to store, retrieve, and display it efficiently.
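As one possible starting point for that discussion, the sketch below models the data as an inverted index: each normalized word maps to the list of places it occurs, and each place records the document, the page, and a bounding box. This is an in-memory stand-in using only the standard library, not Elasticsearch. The names `Hit`, `SearchIndex`, `add_word`, `search`, and `normalize`, as well as the pixel-based `bbox` layout, are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Location of one recognized word. Field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub document_id: u64,
    pub page: u32,
    /// Bounding box on the page as (x, y, width, height), assumed to be in pixels.
    pub bbox: (u32, u32, u32, u32),
}

/// Maps a normalized word to every place it occurs.
#[derive(Default)]
pub struct SearchIndex {
    postings: HashMap<String, Vec<Hit>>,
}

impl SearchIndex {
    /// Records one occurrence of `word` at the given location.
    pub fn add_word(&mut self, word: &str, hit: Hit) {
        self.postings.entry(normalize(word)).or_default().push(hit);
    }

    /// Returns all recorded occurrences of `keyword`, in insertion order.
    /// An unknown keyword yields an empty slice.
    pub fn search(&self, keyword: &str) -> &[Hit] {
        self.postings
            .get(&normalize(keyword))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}

/// Strips leading and trailing punctuation and lowercases the word, so that
/// "Invoice," and "invoice" end up under the same key.
fn normalize(word: &str) -> String {
    word.trim_matches(|c: char| !c.is_alphanumeric())
        .to_lowercase()
}
```

With this shape, indexing the word "Invoice," from page 2 of document 1 and then searching for "INVOICE" returns that hit, including the box needed to highlight the match in a viewer. A real search backend would add stemming, phrase queries, and persistence, but the per-hit record (document, page, location) would likely stay similar.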

I could take a deeper look into actually implementing this, though time-wise it would probably be a while before I get to it.
