Description
I tried to run OCR on an image containing ¥ symbols, and the engine was unable to recognize any of them. It usually output "\ く" instead.
The ¥ symbols stood in for the backslashes of a UNC path (on Japanese Windows, every \ is displayed as ¥).
Also, all the digits (except 0) were converted to their circled versions.
For example:
1 -> ①
2 -> ②
3 -> ③
...
Japanese does use circled numbers sometimes (more often than the rest of the world), but not that often. Plain digits should still be recognized as plain digits.
I think these issues come from the training data: it probably did not include ¥, and it probably paired plain digits in the input with circled digits in the expected output.
I am quite new to ML and do not know whether, or how, I can extract the source data from a traineddata file.
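As a stopgap until the model itself is fixed, the circled digits could be mapped back after OCR. The following is only a sketch of that idea, not something the engine provides: `uncircle` is a hypothetical helper name, and it assumes the output uses the Unicode code points U+2460 to U+2473 (① to ⑳). It uses a targeted table rather than full NFKC normalization so that other Japanese characters are left untouched.

```python
# Hypothetical post-processing helper: not part of any OCR engine.
# Maps the circled numbers ① (U+2460) through ⑳ (U+2473) to plain digits.
CIRCLED_TO_PLAIN = {0x2460 + i: str(i + 1) for i in range(20)}


def uncircle(text: str) -> str:
    """Replace circled numbers 1 to 20 with their plain decimal form."""
    return text.translate(CIRCLED_TO_PLAIN)


if __name__ == "__main__":
    print(uncircle("①②③"))
```

This does not help when the image really does contain circled numbers, so it only hides the symptom.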