| 142 | |
| 143 | |
| 144 | == Text enrichment Tools == |
| 145 | |
| 146 | = Lemmatisation = |
| 147 | |
| 148 | This section describes the integration of the MPIWG Lemmatisator into Pundit. |
| 149 | |
| 150 | A lemma is the canonical form of a set of words. |
| 151 | Lemmatisation refers to the morphological analysis of words that aims to find the lemma of a word by removing inflectional endings and returning the base or dictionary form of a word. |
| 152 | |
| 153 | = MPIWG Lemmatisator = |
| 154 | |
| 155 | The MPIWG Lemmatisator is a web service that is part of the Language technology services (Mpdl), hosted at [http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/]. |
| 156 | This web service does not lemmatise words itself. Instead, it queries several other web services (such as http://www.perseus.tufts.edu/hopper/) that provide dictionaries of words and their lemmas. The MPIWG Lemmatisator is responsible only for sending the word queries and for merging the responses from the other services into a single response. |
| 157 | |
| 158 | The lemmatisator supports the following languages: Arabic, Chinese, Dutch, English, French, German, Ancient Greek, Italian and Latin. |
| 159 | |
| 160 | The following link shows the lemmatisation response for the query “multa”: |
| 161 | [http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/lt/GetLemmas?query=multa&language=lat&outputFormat=html] |
| 162 | |
| 163 | |
| 164 | = Pundit Integration = |
| 165 | |
| 166 | Pundit is an annotation tool based on semantic web technologies. To allow the use of the Mpdl in Pundit, the response of the lemmatisator has to be transformed into RDF. A web service (temporarily hosted at |
| 167 | [https://openmind-ismi-dev.mpiwg-berlin.mpg.de/lemmatisator]) addresses this by transforming the response from the MPIWG Lemmatisator into RDF triples. |
| 168 | |
| 169 | The triples returned by this service use the Gold Ontology (see: [http://linguistics-ontology.org/gold/]). For example, querying the word “multa” in Latin returns: “multus is the lemma of multa”. Using the Gold Ontology, this statement would be expressed as follows: |
| 170 | |
| 171 | http://mpiwg.de/ontologies/ont.owl/lemma#multus writtenRealization http://mpiwg.de/ontologies/ont.owl/word#multa |
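
The example query and the example triple above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `build_lemma_query_url` and `lemma_triple` are hypothetical and not part of either web service, the parameter names (`query`, `language`, `outputFormat`) are taken from the example link, and the two namespaces are copied from the example triple. The predicate is kept as the bare name `writtenRealization`, exactly as written above, because its full URI is not given in this section.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are copied from the example link in this section.
BASE_URL = "http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/lt/GetLemmas"

# Namespaces are copied from the example triple in this section.
LEMMA_NS = "http://mpiwg.de/ontologies/ont.owl/lemma#"
WORD_NS = "http://mpiwg.de/ontologies/ont.owl/word#"


def build_lemma_query_url(word, language="lat", output_format="html"):
    """Hypothetical helper: build a GetLemmas URL for a single word."""
    params = {"query": word, "language": language, "outputFormat": output_format}
    return BASE_URL + "?" + urlencode(params)


def lemma_triple(lemma, word):
    """Hypothetical helper: return (subject, predicate, object) for a lemma and a word form."""
    # The predicate is left as the bare name used in the example triple.
    return (LEMMA_NS + lemma, "writtenRealization", WORD_NS + word)


if __name__ == "__main__":
    print(build_lemma_query_url("multa"))
    print(" ".join(lemma_triple("multus", "multa")))
```

Run as a script, it prints the query URL for “multa” and the triple that links the lemma “multus” to the word form “multa”, without contacting either service.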