Fix regex bugs with replaced code. Simplified implementation as well. #8
The regex did not correctly match links containing special characters, so the extracted path was a truncated fragment of the image filename. When that truncated path happened to match an existing image file, that file had OCR text inserted into it. As a result, many files ended up with the same (wrong) OCR text.
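To make the failure mode concrete, here is a minimal sketch of how this kind of truncation can happen. The pattern and file names below are hypothetical and are not the plugin's original regex; they only illustrate a character class that stops at a space or parenthesis.

```typescript
// Hypothetical pattern, NOT the plugin's original regex: the character class
// covers word characters, ".", "/" and "-" but not spaces or parentheses.
const naivePattern = /!\[\[([\w.\/-]+)/;

// Returns the captured link text, or null if the line holds no embed.
function naiveExtract(line: string): string | null {
  const match = naivePattern.exec(line);
  return match ? match[1] : null;
}

// Hypothetical embed whose file name contains a space and parentheses.
const line = "![[chart.png (edited).png]]";
const extracted = naiveExtract(line);

// The capture stops at the first space, leaving "chart.png".
console.log(extracted);
```

If a separate `chart.png` exists in the vault, the truncated capture points at that file, which matches the symptom described above.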
I changed image identification to pull from each file's embeds list, so no regex is needed to find them. This lets the plugin use the API to locate the file path matching each name, relying on Obsidian's own logic for resolving file names. The image path parameter is no longer needed to decide what is or is not an image. That was a rigid design that did not work for many Obsidian configurations and left many images without OCR text because of their location.
With the list of embeds in hand, the location of each link in the file can be found directly using the exact link text. And by associating each embed with its corresponding TFile object, we get its path without re-creating it (incorrectly). This resolves the original bug.
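The approach can be sketched roughly as follows. This is not the code in this PR: `FileLike`, `EmbedLike`, `Resolver`, `locateImageEmbeds`, and the extension list are hypothetical stand-ins so the example runs outside Obsidian. In a plugin, the embeds would presumably come from `app.metadataCache.getFileCache(file)?.embeds` and the resolver would wrap `app.metadataCache.getFirstLinkpathDest(link, sourcePath)`.

```typescript
// Minimal stand-ins for the shapes Obsidian exposes (assumptions for this sketch).
interface FileLike { path: string; extension: string; }
interface EmbedLike { link: string; original: string; }
type Resolver = (linkpath: string, sourcePath: string) => FileLike | null;

interface LocatedEmbed { file: FileLike; original: string; index: number; }

// Assumed set of image extensions; adjust to whatever the OCR step supports.
const IMAGE_EXTENSIONS = new Set(["png", "jpg", "jpeg", "gif", "bmp", "webp"]);

function locateImageEmbeds(
  content: string,
  sourcePath: string,
  embeds: EmbedLike[],
  resolve: Resolver,
): LocatedEmbed[] {
  const located: LocatedEmbed[] = [];
  let searchFrom = 0; // assumes embeds are listed in document order
  for (const embed of embeds) {
    // Let the resolver decide which file the link points to.
    const file = resolve(embed.link, sourcePath);
    if (!file || !IMAGE_EXTENSIONS.has(file.extension.toLowerCase())) continue;
    // Find the embed by its exact original text instead of a regex.
    const index = content.indexOf(embed.original, searchFrom);
    if (index === -1) continue;
    located.push({ file, original: embed.original, index });
    searchFrom = index + embed.original.length;
  }
  return located;
}

// Usage with a hypothetical note and an in-memory "vault".
const note = "Intro\n![[chart.png (edited).png]]\n![[notes.md]]\n";
const embeds: EmbedLike[] = [
  { link: "chart.png (edited).png", original: "![[chart.png (edited).png]]" },
  { link: "notes.md", original: "![[notes.md]]" },
];
const vault: Record<string, FileLike> = {
  "chart.png (edited).png": { path: "attachments/chart.png (edited).png", extension: "png" },
  "notes.md": { path: "notes.md", extension: "md" },
};
const resolveFromVault: Resolver = (linkpath) => vault[linkpath] ?? null;

const result = locateImageEmbeds(note, "daily.md", embeds, resolveFromVault);
console.log(result);
```

Because the path comes from the resolved file object and the position from the exact embed text, special characters in the file name never pass through a hand-written pattern.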