Tesseract Custom Font

This repo is heavily inspired by Gabriel Garcia's Tesseract Tutorial.

I recreated it to get a better understanding of how tesstrain works. I've also included training.sh to help with the training.

While building this project, I trained a model to recognize the Minecraft font on top of the English model, which is why MODEL_NAME in training.sh is set to mc.

Usage

Make sure you clone tesseract and tesstrain before running the code.

If you want to train on top of an existing model, e.g. the English one, place that model from the tessdata repository into tesseract/tessdata.

Training text

You can either use the langdata from Tesseract's langdata repository, or run generate-training-text.py, which takes a list of symbols and generates random text from them.
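To illustrate the idea of generating random text from a list of symbols, here is a minimal sketch. It is not the code in generate-training-text.py; the function name, its parameters, and the word and line lengths are all assumptions made for the example.

```python
import random


def generate_training_text(symbols, num_lines=100, words_per_line=8,
                           max_word_len=8, seed=None):
    """Build random "words" from the given symbols, one sentence per line.

    Hypothetical helper: the names and defaults here are illustrative only.
    """
    rng = random.Random(seed)  # a seed makes the output reproducible
    lines = []
    for _ in range(num_lines):
        words = []
        for _ in range(words_per_line):
            length = rng.randint(1, max_word_len)
            words.append("".join(rng.choice(symbols) for _ in range(length)))
        lines.append(" ".join(words))
    return "\n".join(lines)
```

The result could then be written to the training_text file in the data folder. The symbol list should cover every character the model is expected to recognize, without whitespace, since spaces are added between words.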

Ground truth

Run generate-ground-truth.py and follow the instructions printed in the terminal. The script reads the training_text file in the data folder.
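For context, tesstrain expects ground truth as pairs of a line image and a matching .gt.txt transcription. The sketch below shows one possible way to produce those pairs: write one .gt.txt file per line of training text and build a text2image command that would render it. It is an assumption about the general approach, not a copy of generate-ground-truth.py; the function name, the file naming scheme, the font name "Minecraft" and the fonts directory are all hypothetical.

```python
from pathlib import Path


def write_ground_truth(training_text, output_dir, model_name="mc",
                       font="Minecraft", fonts_dir="fonts"):
    """Write one .gt.txt per non-empty line and return text2image commands.

    Hypothetical helper: font, fonts_dir and the naming scheme are assumptions.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    commands = []
    lines = [ln.strip() for ln in training_text.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        stem = f"{model_name}_{i}"
        gt_path = out / f"{stem}.gt.txt"
        gt_path.write_text(line, encoding="utf-8")
        # text2image ships with Tesseract's training tools; it renders the
        # text file to <outputbase>.tif (plus a .box file).
        commands.append([
            "text2image",
            f"--font={font}",
            f"--fonts_dir={fonts_dir}",
            f"--text={gt_path}",
            f"--outputbase={out / stem}",
        ])
    return commands
```

Each returned command can be run with subprocess.run(cmd, check=True) once text2image is installed and the font is available in the fonts directory.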

Training

If you're on Windows, you may need to use WSL.

Adjust training.sh to your needs before running it. The most important variable to change is MODEL_NAME.
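As an illustration of what such a training script usually boils down to, the sketch below assembles a tesstrain make invocation. MODEL_NAME, START_MODEL, TESSDATA and MAX_ITERATIONS are variables of the tesstrain Makefile, but whether training.sh sets exactly these is an assumption, as are the function name and the example values.

```python
def build_training_command(model_name, start_model=None, tessdata_dir=None,
                           max_iterations=None):
    """Return the argument list for a tesstrain `make training` call.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    cmd = ["make", "training", f"MODEL_NAME={model_name}"]
    if start_model is not None:
        # Fine-tune on top of an existing model, e.g. "eng".
        cmd.append(f"START_MODEL={start_model}")
    if tessdata_dir is not None:
        # Directory that contains <start_model>.traineddata.
        cmd.append(f"TESSDATA={tessdata_dir}")
    if max_iterations is not None:
        cmd.append(f"MAX_ITERATIONS={max_iterations}")
    return cmd
```

For example, build_training_command("mc", start_model="eng", tessdata_dir="../tesseract/tessdata") gives a command that could be passed to subprocess.run(cmd, cwd="tesstrain", check=True), assuming tesstrain is cloned next to tesseract.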
