= Test of RLP (Rosette Linguistics Platform) =

== RLP ==
 * RLP: version 6.5.2 (platform dependent)
 * RLP-Lucene: version 6.0.0 (Java library: platform independent)

== Document base ==
 * 113 documents, each between 1 KB and 18 MB in size
 * languages: Latin, Italian, English, German, French, Dutch, Greek, Arabic, Chinese

== Hardware, operating system ==
 * Mac Pro, Dual Core Intel Xeon 2.66 GHz, 4 GB RAM
 * Mac OS X 10.5.4

== Indexing ==
 * done on [http://exist-db.org/ eXist with Lucene (eXist 1.3dev)]
 * took 83 minutes (about 1.4 hours)
 * CPU load was at 100% for most of the run
 * low RAM consumption (< 500 MB)

== Result / Quality of indexing (random samples) ==
 * application: see [http://xserve07.mpiwg-berlin.mpg.de:30010/mpdl/query.xql MPDL prototype with RLP analyzer (access only within MPIWG network)]
 * online example: RLP base form reduction (morphological index lookup in a document) [http://xserve07.mpiwg-berlin.mpg.de:30010/mpdl/page-query-result.xql?document=/archimedes/la/delfi_fluxu_024_la_1559.xml&pn=1&mode=text&query-type=ftIndexMorph&query=a&query-result-pn=1 for "a" in Delfino, Federico. De fluxu et refluxu aquae maris. Venice, 1559]
 * base form reduction: comparison of RLP and Donatus
  * Latin: Delfino, Federico. De fluxu et refluxu aquae maris. Venice, 1559. Morphological index for "a"
   * RLP: 234 base forms
   * Donatus: 149 base forms
  * Italian: Borro, Girolamo. Del flusso e reflusso del mare. Lucca, 1561. Morphological index for "e"
   * RLP: 221 base forms
   * Donatus: 132 base forms
  * English: Alberti, Leone Battista. Architecture. London, 1755. Morphological index for "b"
   * RLP: 592 base forms
   * Donatus: 367 base forms
  * German: Johann Grunert. Mathematik und Physik. 1920. Morphological index for "f"
   * RLP: 25 base forms
   * Donatus: 16 base forms
  * French: Alberti, Leone Battista. Architecture. London, 1755. Morphological index for "b"
   * RLP: 592 base forms
   * Donatus: 367 base forms
  * Dutch: Stevin, Simon. De Beghinselen der Weegconst. Leyden, 1586. Morphological index for "d"
   * RLP: 159 base forms
   * Donatus: 142 base forms
  * overall: in base form reduction, RLP misses xx % compared with Donatus
 * double entries: variants of the same word lead to different base forms. Examples:
  * babylonian, babylonians
  * backdoors, back-doors, back-door
  * fleisse, fleissigen, fleiß, fleißig
 * orthographic normalization: error examples
  * f., fisi-
  * single characters: a, b
 * hit counts: error examples
  * fotografie: 10 hits reported (actually 5 hits)
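
The per-language counts in the RLP/Donatus comparison above can be put side by side with a few lines of Python. This is only a minimal sketch: the function names `relative_difference` and `summarize` are illustrative and not part of RLP, Donatus, or eXist. It reports how far the two base form counts differ for each sample. It does not yield the overall "xx %" figure, because the counts alone do not say which base forms one tool finds and the other misses.

```python
# Base form counts copied from the comparison list: language -> (RLP, Donatus).
COUNTS = {
    "latin": (234, 149),
    "italian": (221, 132),
    "english": (592, 367),
    "german": (25, 16),
    "french": (592, 367),  # same figures as the English entry in the list
    "dutch": (159, 142),
}


def relative_difference(rlp, donatus):
    """Return by how many percent the RLP count differs from the Donatus count."""
    return 100.0 * (rlp - donatus) / donatus


def summarize(counts):
    """Build one (language, rlp, donatus, difference in percent) row per sample."""
    rows = []
    for lang, (rlp, donatus) in counts.items():
        rows.append((lang, rlp, donatus, relative_difference(rlp, donatus)))
    return rows


if __name__ == "__main__":
    for lang, rlp, donatus, diff in summarize(COUNTS):
        print(f"{lang:8} RLP={rlp:4} Donatus={donatus:4} difference={diff:+.1f} %")
```

A comparison of the actual base form lists (for example as sets) would be needed to measure what each tool misses; the counts only show the size of each index.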