wiki:OCR_evaluation

Version 4 (modified by Klaus Thoden, 13 years ago)

--

The workflow will be adapted to accept OCRed text as input. OCRopus will be used as the OCR engine.

Tutorial video and other videos

The documents from the previous workflows were assessed for how well they are likely to perform when OCRed.

Command overview

The following commands (taken from the video above) recognize English text:

  1. ocropus-binarize 035.jpg
  2. ocropus-pseg book/????.png
  3. ocropus-lattices -m OCRopus/ocropy/2m2-reject.cmodel book/0001/??????.png
  4. ocropus-align -l OCRopus/ocropy/data/default.fst book/0001/??????.fst
  5. ocropus-hocr book/
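
The five steps above can be chained in a small wrapper script so that a failure in one step stops the run. The following is only a sketch: the function names `run` and `ocr_pipeline` and the `DRY_RUN` switch are hypothetical helpers, not part of OCRopus. The commands, paths and glob patterns are copied unchanged from the list, and the per-step comments are inferred from the command names.

```shell
#!/bin/sh
# Sketch of a wrapper around the five OCRopus commands listed above.
# Stop at the first failing step or unset variable.
set -eu

# Hypothetical helper: with DRY_RUN=1 the command is only printed,
# otherwise it is executed as given.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: runs the steps in the order given on this page.
# $1 is the scanned page image, e.g. 035.jpg.
ocr_pipeline() {
    page="$1"
    model="OCRopus/ocropy/2m2-reject.cmodel"   # character model, path as above
    lm="OCRopus/ocropy/data/default.fst"       # language model, path as above

    run ocropus-binarize "$page"                          # 1. binarize the scan
    run ocropus-pseg book/????.png                        # 2. segment the pages
    run ocropus-lattices -m "$model" book/0001/??????.png # 3. recognition lattices
    run ocropus-align -l "$lm" book/0001/??????.fst       # 4. align with language model
    run ocropus-hocr book/                                # 5. write hOCR output
}
```

Calling `DRY_RUN=1 ocr_pipeline 035.jpg` prints the five commands without running them. Calling `ocr_pipeline 035.jpg` executes them and assumes the OCRopus tools are on the PATH. As in the list above, steps 3 and 4 only cover book/0001.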
