Japanese trained data requires an update #119

I ran the OCR on an image containing ¥ symbols, and the engine failed to recognize any of them. It usually output them as "\ く".
The ¥ characters were the "\" separators of a UNC path (on Japanese Windows, every \ is displayed as ¥).

Also, all the digits (except 0) were recognized as their circled versions.
For example:
1 -> ①
2 -> ②
3 -> ③
...
Japanese text does use circled numbers sometimes (more often than the rest of the world), but not that often. Digits should still be recognized in their plain form.
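
Until the trained data is fixed, the circled digits could be mapped back to plain ones in post-processing. Below is a minimal sketch, assuming the OCR output is available as a Python string. `uncircle_digits` is a hypothetical helper name, not part of any OCR library, and the range it covers (① to ⑳) is my own choice.

```python
import unicodedata

# Circled numbers ① (U+2460) through ⑳ (U+2473); the range is an assumption
# about which characters the engine emits.
CIRCLED_FIRST = 0x2460
CIRCLED_LAST = 0x2473


def uncircle_digits(text: str) -> str:
    """Replace circled numbers with plain digits, leaving other characters alone."""
    out = []
    for ch in text:
        if CIRCLED_FIRST <= ord(ch) <= CIRCLED_LAST:
            # NFKC maps a circled number to its plain form, e.g. ① to 1 and ⑩ to 10.
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            # Everything else (kana, kanji, full-width characters) is kept as-is.
            out.append(ch)
    return "".join(out)


print(uncircle_digits("①②③ 0 ⑩"))  # prints "123 0 10"
```

This is only a workaround: it would also rewrite circled numbers that really are circled in the source image, so it does not replace fixing the training data.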

I think these issues come from the training data, which did not include ¥ and whose expected output contained circled numbers where the input had plain digits.

I am quite new to ML, and I do not know whether, or how, I can extract the source data from a traineddata file.
