Hola Fediverse! Does anyone know of a really good FOSS OCR tool?

I'm using an online paid tool, onlineocr.net, because none of the FOSS ones I've found give very good results. They all detect the words with pretty much 100% accuracy, but the problem is that only the paid tools seem to be able to reproduce the layout properly.

@vrsmd That's not FOSS; I already have great paid solutions.

@vrsmd @OCRbot Sorry, thought you meant ocrbot.net, will check it out, thanks 😃

@vrsmd No, that's not what I'm looking for, but thanks anyway.

@aran
This is the engine behind OCRbot, which is what I really meant for you to check out.

github.com/tesseract-ocr/tesse

@vrsmd Most of the Linux tools I've looked at use the Tesseract engine, but so far none of them stack up to the good paid ones I've found in terms of font-matching and reproducing the original layout.

@vrsmd Here's an example comparing gImageReader (a Tesseract-based Linux tool) with OnlineOcr.net, a paid service. gImageReader's result doesn't get the layout sized and positioned correctly, and the text is not continuous and selectable. It also doesn't match the serif and sans-serif sections or any of the bold/italics in the content the way the paid one does.

@aran
Thanks for the example, I understand better now and I hope it helps you get an answer that will help you!
