
Fix regex bugs with replaced code. Simplified implementation as well. #8


Open
wants to merge 4 commits into
base: main


@jaxley jaxley commented Sep 24, 2024

The regex did not correctly match links that contain special characters, so the extracted path was only a truncated part of the image filename. When that truncated path happened to match an existing image file, the OCR text for that (wrong) image was inserted into the note. As a result, many files ended up with the same (wrong) OCR text.
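To illustrate the failure mode, here is a minimal sketch. The pattern below is hypothetical and is not the plugin's original regex (which is not shown in this description); it only demonstrates how a restrictive character class can truncate a filename into the name of a different, existing file.

```typescript
// Hypothetical pattern, NOT the plugin's original regex: the character class
// allows only word characters, dots, slashes, and hyphens in the link target.
const naiveEmbed = /!\[\[([\w.\/-]+)/;

// Extract the link target the way a regex-based scanner might.
function naiveTarget(line: string): string | null {
  const m = naiveEmbed.exec(line);
  return m ? m[1] : null;
}

// A plain filename is captured in full.
console.log(naiveTarget("![[photo.png]]")); // "photo.png"

// A filename with a space and parentheses is cut at the first disallowed
// character, and the truncated result is the name of a different file.
console.log(naiveTarget("![[photo.png (edited).png]]")); // "photo.png"
```

Both embeds yield `photo.png`, so OCR text for `photo.png` would be attached to both.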

I changed how images are identified: they now come from each file's embeds list, so no regex is needed to find them. This lets the plugin use the API to locate the file path for each embed name, relying on Obsidian's own logic for resolving file names. An image path parameter is no longer needed to decide what is or is not an image. That was a rigid design that did not work for many Obsidian configurations, and it left many images without OCR text because of where they were stored.
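The approach described above can be sketched roughly as follows. This is not the code from this PR. `FileLike`, `EmbedLike`, `Resolver`, `IMAGE_EXTENSIONS`, and `resolveImageEmbeds` are hypothetical names, with the two interfaces standing in for the fields of Obsidian's `TFile` and `EmbedCache` that the sketch uses. The list of image extensions is also an assumption.

```typescript
// Minimal stand-ins for the Obsidian shapes this sketch relies on.
interface FileLike { path: string; extension: string; }
interface EmbedLike { link: string; original: string; }

// Resolves a link name, relative to the note that contains it, to a file.
type Resolver = (linkpath: string, sourcePath: string) => FileLike | null;

// Assumed set of extensions treated as images; not taken from the plugin.
const IMAGE_EXTENSIONS = new Set(["png", "jpg", "jpeg", "gif", "bmp", "webp"]);

interface ResolvedEmbed { embed: EmbedLike; file: FileLike; }

// Keep only the embeds that resolve to an existing image file, wherever
// that file lives in the vault.
function resolveImageEmbeds(
  embeds: EmbedLike[],
  sourcePath: string,
  resolve: Resolver,
): ResolvedEmbed[] {
  const out: ResolvedEmbed[] = [];
  for (const embed of embeds) {
    const file = resolve(embed.link, sourcePath);
    if (file && IMAGE_EXTENSIONS.has(file.extension.toLowerCase())) {
      out.push({ embed, file });
    }
  }
  return out;
}
```

Inside a plugin, the embeds would presumably come from `app.metadataCache.getFileCache(note)?.embeds ?? []`, and the resolver could be `(l, s) => app.metadataCache.getFirstLinkpathDest(l, s)`. Because the decision is based on the resolved file's extension, no configured image folder is involved.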

With the list of embeds in hand, the location of each link in the file can be found directly using the exact link text. And because each embed is associated with its corresponding TFile object, its path can be read from that object rather than being re-created (incorrectly). This resolves the original bug.
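A minimal sketch of locating an embed by its exact text, assuming the OCR text is placed on a new line directly after the embed. The actual output format of the plugin is not stated here, and `EmbedOcr` and `insertOcrText` are hypothetical names.

```typescript
// Hypothetical shape: the exact embed text as written in the note, paired
// with the OCR text produced for the file that embed resolved to.
interface EmbedOcr { original: string; ocrText: string; }

// Insert each OCR text right after the first occurrence of its embed's exact
// text. This is a plain string search, so special characters in the filename
// need no escaping and cannot truncate the match.
function insertOcrText(content: string, items: EmbedOcr[]): string {
  let result = content;
  for (const { original, ocrText } of items) {
    const at = result.indexOf(original);
    if (at === -1) continue; // embed text not present; skip it
    const end = at + original.length;
    result = result.slice(0, end) + "\n" + ocrText + result.slice(end);
  }
  return result;
}
```

This sketch handles only the first occurrence of each embed and does not check whether OCR text was already inserted; a real implementation would need to cover repeated embeds and repeated runs.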
