Request: GUI-based app to train data on fonts and different languages #183

Open
@Farazioo

Description

  • The problem is that data trained by Tesseract does not work on the regular fonts used by some languages. Take Urdu: the most common Urdu font looks like this:
    [image: customfnturdu]
    while the font in the trained data looks very different:
    [image: trd]

As I understand it, the current process is:
picture > box data and ground truth > other steps > trained model
What I think the process should be:

lvl 1
--- Input ---

  • language
  • language text (a body of sample text)
  • language fonts
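
As a rough sketch of what the lvl 1 inputs could look like in code, the three items can be collected in one object and checked before anything else runs. The class name `TrainingInput`, its fields, and the `problems` method are hypothetical names chosen for illustration, not part of any existing tool; `urd` is the Tesseract language code for Urdu.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

# File extensions treated as fonts in this sketch (an assumption).
FONT_SUFFIXES = {".ttf", ".otf"}


@dataclass
class TrainingInput:
    """Hypothetical container for the three lvl 1 inputs."""
    language: str                                   # e.g. "urd" for Urdu
    text: str                                       # sample text in that language
    font_files: List[Path] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return readable problems; an empty list means the input is usable."""
        found = []
        if not self.language.strip():
            found.append("language is empty")
        if not self.text.strip():
            found.append("text is empty")
        if not self.font_files:
            found.append("no font files given")
        for font in self.font_files:
            if font.suffix.lower() not in FONT_SUFFIXES:
                found.append(f"not a font file: {font.name}")
        return found
```

A GUI could show the returned list to the user and only enable the next step when it is empty.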

lvl 2
--- Training Data Generation ---

  • Automated process
    • divide the text into small pieces
    • load the font files
    • render each text piece with each font in a preview window
    • optionally apply different color combinations for more precise training (may not be necessary)
    • automatically capture the previews as images; the text of each piece is saved as ground truth and can also be used for the box files
      All of these steps can be automated.
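
A minimal sketch of how the lvl 2 steps above might be scripted, under some assumptions: the text is split by word count, each piece is written to a `.gt.txt` ground-truth file (the naming tesstrain uses), and rendering is delegated to Tesseract's `text2image` training tool instead of a preview window. The function names, the piece size, and the file prefix are hypothetical. The code only builds the command; it does not run it.

```python
from pathlib import Path
from typing import List


def split_text(text: str, words_per_piece: int = 8) -> List[str]:
    """Divide the text into small pieces of at most words_per_piece words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_piece])
        for i in range(0, len(words), words_per_piece)
    ]


def write_ground_truth(pieces: List[str], out_dir: Path, prefix: str) -> List[Path]:
    """Save each piece as <prefix>_<index>.gt.txt and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, piece in enumerate(pieces):
        path = out_dir / f"{prefix}_{index:04d}.gt.txt"
        path.write_text(piece + "\n", encoding="utf-8")
        paths.append(path)
    return paths


def text2image_command(gt_file: Path, font_name: str, fonts_dir: Path) -> List[str]:
    """Build a text2image call that renders one ground-truth file in one font.

    text2image writes <outputbase>.tif and <outputbase>.box.
    """
    outputbase = str(gt_file)[: -len(".gt.txt")]
    return [
        "text2image",
        f"--text={gt_file}",
        f"--outputbase={outputbase}",
        f"--font={font_name}",
        f"--fonts_dir={fonts_dir}",
    ]
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)` on a machine where the Tesseract training tools are installed. Whether a given Urdu font renders correctly this way would still need to be checked by looking at the generated images.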

lvl 3
--- Final Output ---

  • With the data generated in lvl 2, we just run the training
  • The final trained model is then ready
  • We could add another lvl for testing the model on new text
  • I tried following YouTube tutorials, but for an inexperienced person like me the process is very frustrating and produces too many errors

The user provides the inputs in lvl 1; lvl 2 and lvl 3 can then run automatically, and the result is the trained model.
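
For the optional testing lvl mentioned above, one possible check is the character error rate: the edit distance between the known text and what the trained model recognizes, divided by the length of the known text. This is only a sketch of that idea; the function names are hypothetical, and running the model to get the recognized text is left out.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def character_error_rate(expected: str, recognized: str) -> float:
    """Edit distance divided by the length of the expected text."""
    if not expected:
        return 0.0 if not recognized else 1.0
    return edit_distance(expected, recognized) / len(expected)
```

A lower value means the model output is closer to the ground truth; 0.0 means the two strings are identical.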
