Changes between Version 10 and Version 11 of digi-tools-doku


Timestamp:
Jan 30, 2015, 10:27:11 AM (9 years ago)
Author:
jurzua
Comment:

--

  • digi-tools-doku

    v10 v11  
    140140
    141141Of course, the source has also been given to [http://europeana.eu/portal/record/2048607/data_item_mpiwg_rara_MPIWG_UK75HAUV.html Europeana]
     142
     143
     144== Text enrichment Tools ==
     145
     146= Lemmatisation =
     147
     148This section addresses the integration of the MPIWG Lemmatisation in Pundit.
     149
     150A lemma is the canonical form of a set of words.
     151Lemmatisation is the morphological analysis that finds the lemma of a word by removing its inflectional endings and returning its base or dictionary form.
     152
     153= MPIWG Lemmatisator =
     154
     155The MPIWG Lemmatisator is a web service that is part of the Language Technology Services (Mpdl), hosted at [http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/].
     156This web service does not lemmatise words itself. Instead, it queries several other web services (such as http://www.perseus.tufts.edu/hopper/) that provide dictionaries of words and their lemmas. The MPIWG Lemmatisator is responsible only for sending the word query and for merging the responses from the other services into a single response.
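
The merging step described above can be sketched as follows. This is a minimal illustration only: the function name `merge_lemma_responses`, the dictionary layout (word mapped to a list of lemmas) and the sample data are assumptions, not the actual data model or output of the MPIWG Lemmatisator.

```python
def merge_lemma_responses(responses):
    """Merge per-service results into one response.

    Each item in `responses` is assumed (hypothetically) to map a queried
    word to the list of lemmas that one backend service returned for it.
    """
    merged = {}
    for response in responses:
        for word, lemmas in response.items():
            bucket = merged.setdefault(word, [])
            for lemma in lemmas:
                # Keep the first occurrence only, so duplicate lemmas
                # reported by several services appear once.
                if lemma not in bucket:
                    bucket.append(lemma)
    return merged


# Illustrative data, not real service output.
service_a = {"multa": ["multus"]}
service_b = {"multa": ["multus", "multa"]}
print(merge_lemma_responses([service_a, service_b]))
```

Keeping insertion order makes the merged response deterministic, which may be convenient when the result is later serialised.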
     157
     158The lemmatisator supports the following languages: Arabic, Chinese, Dutch, English, French, German, Ancient Greek, Italian and Latin.
     159
     160The following link shows the lemmatisation response for the Latin query “multa”:
     161[http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/lt/GetLemmas?query=multa&language=lat&outputFormat=html]
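
A query URL like the one above could be assembled programmatically as in the sketch below. The endpoint and the parameter names `query`, `language` and `outputFormat` are taken from the link; the helper name `build_lemma_query` is hypothetical, and whether the service accepts output formats other than `html` is not stated here.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the example link.
BASE_URL = "http://mpdl-service.mpiwg-berlin.mpg.de/mpiwg-mpdl-lt-web/lt/GetLemmas"


def build_lemma_query(word, language, output_format="html"):
    """Return the GetLemmas URL for one word (hypothetical helper)."""
    params = {
        "query": word,
        "language": language,
        "outputFormat": output_format,
    }
    # urlencode percent-encodes the values, which matters for
    # non-ASCII input such as Greek or Arabic words.
    return BASE_URL + "?" + urlencode(params)


print(build_lemma_query("multa", "lat"))
```

The sketch only builds the URL and does not contact the service.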
     162
     163
     164= Pundit Integration =
     165
     166Pundit is an annotation tool based on semantic web technologies. To make the Mpdl usable in Pundit, the lemmatisator must be able to deliver its response as RDF. The web service (temporarily hosted at
     167[https://openmind-ismi-dev.mpiwg-berlin.mpg.de/lemmatisator]) addresses this by transforming the response of the MPIWG Lemmatisator into RDF triples.
     168
     169The triples returned by this service implement the Gold Ontology (see: [http://linguistics-ontology.org/gold/]). For example, querying the word “multa” in Latin returns: “multus is the lemma of multa”. Using the Gold Ontology, this statement would be expressed as follows:
     170
     171http://mpiwg.de/ontologies/ont.owl/lemma#multus writtenRealization http://mpiwg.de/ontologies/ont.owl/word#multa
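
As a rough sketch, a triple of this shape could be serialised as an N-Triples line as shown below. The lemma and word namespaces are copied from the example above; the full IRI used for `writtenRealization` (the `GOLD_NS` value) and the helper name `lemma_triple` are assumptions and may differ from what the service actually emits.

```python
from urllib.parse import quote

# Namespaces taken from the example triple.
LEMMA_NS = "http://mpiwg.de/ontologies/ont.owl/lemma#"
WORD_NS = "http://mpiwg.de/ontologies/ont.owl/word#"
# Assumption: namespace of the Gold Ontology properties.
GOLD_NS = "http://purl.org/linguistics/gold/"


def lemma_triple(lemma, word):
    """Return one N-Triples line: <lemma> writtenRealization <word>."""
    subject = LEMMA_NS + quote(lemma, safe="")
    predicate = GOLD_NS + "writtenRealization"
    obj = WORD_NS + quote(word, safe="")
    return "<{}> <{}> <{}> .".format(subject, predicate, obj)


print(lemma_triple("multus", "multa"))
```

Percent-encoding the local names is a design choice made here to keep the IRIs ASCII-only; the original service may handle non-ASCII words differently.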