Changes between Version 10 and Version 11 of RLP-Test
Timestamp: Sep 16, 2009, 11:39:15 AM
RLP-Test
--- v10
+++ v11
@@ -3,5 +3,5 @@
 == RLP ==
- * RLP: version 6.5.2 (platform dependant)
- * RLP-Lucene: version 6.0.0 (Java library: platform independant)
+ * RLP: version 6.5.2 (platform dependent)
+ * RLP-Lucene: version 6.0.0 (Java library: platform independent)
 == Document base ==
  * 113 documents, sized each 1 KB - 18 MB
@@ -22,27 +22,54 @@
    * RLP: 234 base forms
    * Donatus: 149 base forms
+   * RLP misses: 36%
  * italian: Borro, Girolamo. Del flusso e reflusso del mare. Lucca, 1561. Morphological index for "e"
    * RLP: 221 base forms
    * Donatus: 132 base forms
+   * RLP misses: 40%
  * english: Alberti, Leone Battista. Architecture. London, 1755. Morphological index for "b"
    * RLP: 592 base forms
    * Donatus: 367 base forms
+   * RLP misses: 38%
  * german: Johann Grunert. Mathematik und Physik. 1920. Morphological index for "f"
    * RLP: 25 base forms
    * Donatus: 16 base forms
- * french: Alberti, Leone Battista. Architecture. London, 1755. Morphological index for "b"
-   * RLP: 592 base forms
-   * Donatus: 367 base forms
+   * RLP misses: 36%
+ * french: Galilei, Galileo. Les méchaniques. Paris, 1634. Morphological index for "g"
+   * RLP: 71 base forms
+   * Donatus: 60 base forms
+   * RLP misses: 15%
  * dutch: Stevin, Simon. De Beghinselen der Weegconst. Leyden, 1586. Morphological index for "d"
    * RLP: 159 base forms
    * Donatus: 142 base forms
+   * RLP misses: 11%
+ * greek: Epicurus. Varia. Leipzig, 1887. Morphological index for "s"
+   * RLP: 253 base forms
+   * Donatus: 241 base forms
+   * RLP misses: 5%
+ * arabic: Heron Alexandrinus. Mechanica. Leipzig, 1900. Morphological index for "a"
+   * RLP: 330 base forms
+   * Donatus: 325 base forms
+   * RLP misses: 2%
+ * chinese: no base form reduction
  * overall: RLP misses xx % in base form reduction in contrast to Donatus
+ * base form reduction of latin "sunt": comparison of RLP and Donatus (in Benedetti, Giovanni Battista de. Diversarum Speculationum mathematicum, & physicarum liber. 1585.)
+   * RLP: 259 sentence hits
+   * Donatus: 1655 sentence hits (with all forms: ens, entibus, entis, eram, eramus, erant, erantque, erat, eratque, erimus, eris, erit, eritin, eritque, eritqueue, ero, erunt, erunt., eruntque, es, esne, esse, essemus, essent, esseque, esset, est, estis, esto, estque, fore, forem, forent, fores, foret, fuam, fuat, fueram, fueramus, fuerant, fueras, fuerat, fuere, fuerim, fuerimus, fuerin, fuerint, fuerintque, fueris, fuerit, fueritne, fueritque, fuero, fuerunt, fui, fuimus, fuisse, fuissent, fuisset, fuit, fuitque, futura, futuram, futurarum, futuras, futuri, futuris, futuro, futurorum, futuros, futurum, futurumst, futurus, sient, siet, sim, simus, sint, sintque, sis, sit, sitis, sitque, sum, sumus, sunt, sunto, suntque)
+   * RLP misses: 84%
  * double entries: same word forms leads to different base forms: examples
    * babylonian, babylonians
    * backdoors, back-doors, back-door
    * fleisse, fleissigen, fleiß, fleißig
- * orthographic normalization: error examples
-   * f., fisi-,
-   * single characters: a, b,
- * count hits: error examples
+ * orthographic normalization: error base forms (examples)
+   * f., fisi-, e@@et, e@t,
+   * c.a, c.b, c.d, c.e, c.f, ..., c.sit, ..., c.y, d.c.sit, d-ui, e-tago, fa-cere, face-re
+   * ca-liditatem, ca-lor, ca-lorem, ...
+   * single characters: a, b, c, ...
+ * count hits: errors (examples)
    * fotografie: 10 hits (actually 5 hits)
+   * 編 : 1 hit (actually 15 hits)
+ * overall
+   * RLP produces many errors (much more errors as Donatus)
+   * it is not platform independent
+   * is not open software
+   * it costs much money
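
The page does not say how the "RLP misses" percentages were derived. They appear consistent with the gap between the two counts taken as a share of the larger count, rounded to a whole percent. The sketch below only illustrates that assumed formula; the function name `miss_rate` and the dictionary layout are hypothetical, and the counts are copied from the entries above. The "overall" figure is left out because the page itself gives it only as "xx %".

```python
def miss_rate(rlp: int, donatus: int) -> int:
    """Gap between the two counts as a whole percent of the larger count.

    Assumed formula, inferred from the figures on this page; it is not
    documented by the page itself.
    """
    larger = max(rlp, donatus)
    return round(100 * abs(rlp - donatus) / larger)


# (RLP count, Donatus count) as listed on the page.
counts = {
    "first listed entry (language elided)": (234, 149),
    "italian": (221, 132),
    "english": (592, 367),
    "german": (25, 16),
    "french": (71, 60),
    "dutch": (159, 142),
    "greek": (253, 241),
    "arabic": (330, 325),
    'latin "sunt" (sentence hits)': (259, 1655),
}

for name, (rlp, donatus) in counts.items():
    print(f"{name}: RLP misses {miss_rate(rlp, donatus)}%")
```

Under this assumption the base form entries divide by the RLP count (RLP leaves more distinct base forms than Donatus), while the "sunt" comparison divides by the Donatus count (Donatus finds more sentence hits).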