This project is now part of the project OCR Tools and Fulltextsearch
Tools for creating Lucene indices
This project aims to develop tools that create full-text indices for documents OCRed with Tesseract and OCRopus. These tools cover the following features:
- Integrating Donatus Language Technologies for creating and searching a Lucene Index
- Indexing OCRopus-generated documents so that hits can be displayed on the original image using digilib.
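The idea behind the second feature can be illustrated with a minimal sketch. It is not the project's code and does not use Lucene: it assumes the OCR output is available as hOCR-style markup in which each text line carries a `bbox` in its `title` attribute, and it builds a plain in-memory index from terms to bounding boxes. The sample markup and the names `SAMPLE_HOCR`, `build_index` and `search` are hypothetical. How the boxes are then passed to digilib for display is not shown.

```python
import re
from collections import defaultdict

# Hypothetical hOCR-style fragment; real OCR output will differ in detail.
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 1000 1400'>
<span class='ocr_line' title='bbox 100 200 900 240'>Tools for creating indices</span>
<span class='ocr_line' title='bbox 100 260 900 300'>Indices for searching documents</span>
</div>
"""

# Matches one text line and captures its bounding box and its text.
LINE_RE = re.compile(
    r"<span class='ocr_line' title='bbox (\d+) (\d+) (\d+) (\d+)'>(.*?)</span>",
    re.DOTALL,
)


def build_index(hocr):
    """Map each lower-cased term to the boxes of the lines containing it."""
    index = defaultdict(list)
    for x0, y0, x1, y1, text in LINE_RE.findall(hocr):
        box = (int(x0), int(y0), int(x1), int(y1))
        # Use a set so a term repeated within one line is recorded once.
        for term in set(re.findall(r"\w+", text.lower())):
            index[term].append(box)
    return dict(index)


def search(index, term):
    """Return the bounding boxes of all lines that contain the term."""
    return index.get(term.lower(), [])


if __name__ == "__main__":
    idx = build_index(SAMPLE_HOCR)
    for hit in search(idx, "indices"):
        print(hit)
```

Keeping the pixel coordinates next to each indexed term is what allows a search hit to be mapped back to a region of the page image; a real index would store the same information as fields of the indexed document.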
Tools for creating OCR from folders in our repository will be collected at: https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/pythonOcropusTools
Zope and Python tools to access these indices can be found at https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/luceneToolsPython
Examples for the use of the Zope product can be found at http://itgroup.mpiwg-berlin.mpg.de/experimental/Searching/OCRfulltext.
Last modified on Jun 25, 2010, 3:03:33 PM