Changes between Version 1 and Version 2 of mpdl2.0-design


Timestamp:
Sep 5, 2011, 2:53:42 PM
Author:
jwillenborg
Comment:

--

  • mpdl2.0-design

    v1 v2  
    11= MPDL 2.0 =
    22
    3 The MPDL 1.0 software is tightly coupled with the XML system eXist. The next MPDL release 2.0 will be redesigned so that many functions (language technology, some basic XML functions) are usable as open services independently of the eXist software. The new functions will be designed in a layer architecture so that they could be used in different workflows and in a more standardized way (API and XML standard output format). All main functions are available as servlets and are fully implemented in Java.
     3MPDL release 2.0 has been redesigned so that all important functions (language technology, XML functions) are usable as web applications independent of the eXist software. They are available as HTTP servlets and are fully implemented in Java.
    44
    55== Language technology ==
    66
    7 === Word recognition ===
     7The language technology module consists of:
     8* language technology data (XML data files, Java Berkeley DBs)
     9  * morphology data (ara, eng, fre, ger, gre, ita, lat, nld, zho)
     10  * dictionary data
     11* Java source code
     12* Java libraries used
     13* web application configuration file (web.xml)
    814
    9 Input:
    10   * text (URL)
    11     * unstructured text
    12     * XML fragment/document
    13   * language (ISO 639-3 specifier)
     15It is available as the web archive file "mpiwg-mpdl-lt.war".
    1416
    15 Output
    16   * list of word tokens
    17     * words separated by a blank
    18     * XML format
     17The following servlets are available:
    1918
    2019=== Morphology ===
    2120
     21* TokenizeServlet
     22  * URL: /mpdl/tokenize
     23  * Request parameters:
     24    * srcUrl
     25      * source URL of fulltext
     26        * unstructured text
     27        * XML fragment/document
     28    * language
     29      * ISO 639-3 specifier
     30  * Response output:
     31    * word tokens
     32      * word tokens (XML)
     33
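A request to the TokenizeServlet could be assembled as in the following minimal sketch. It assumes the servlet accepts HTTP GET with URL-encoded query parameters and that the web archive is deployed under the context path /mpiwg-mpdl-lt on localhost:8080. Neither is stated above, and the class name TokenizeRequest and the example source URL are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class TokenizeRequest {

    /** Builds a GET URL for /mpdl/tokenize; baseUrl is a hypothetical deployment root. */
    static String build(String baseUrl, String srcUrl, String language) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("srcUrl", srcUrl);       // source URL of the fulltext
        params.put("language", language);   // ISO 639-3 specifier, e.g. "lat"

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "/mpdl/tokenize?" + query;
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8080/mpiwg-mpdl-lt",
                "http://example.org/text.xml", "lat"));
    }
}
```

The value of srcUrl is itself a URL, so it has to be percent-encoded before it is embedded in the query string.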
     34* LemmaServlet
     35  * URL: /mpdl/getLemmas
     36  * Request parameters:
     37    * forms
     38      * one word form (string)
     39      * list of word forms (XML)
     40    * language
     41      * ISO 639-3 specifier
     42  * Response output:
     43    * lemmas
     44      * one lemma
     45      * list of lemmas (XML)
     46
     47* FormServlet
     48  * URL: /mpdl/getForms
     49  * Request parameters:
     50    * lemmas
     51      * one lemma (string)
     52      * list of lemmas (XML)
     53    * language
     54      * ISO 639-3 specifier
     55  * Response output:
     56    * forms
     57      * list of forms (XML)
     58
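The morphology servlets could be called from Java roughly as sketched below for the LemmaServlet with a single word form. This assumes HTTP GET, a deployment at a base URL supplied by the caller, and Java 11 or later for java.net.http. The class LemmaClient and its method names are hypothetical, and the response is returned as an unparsed string because its exact format is not specified here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class LemmaClient {
    private final String baseUrl; // hypothetical deployment root of mpiwg-mpdl-lt.war

    LemmaClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Builds (but does not send) a GET request for /mpdl/getLemmas. */
    HttpRequest lemmaRequest(String form, String language) {
        String query = "forms=" + URLEncoder.encode(form, StandardCharsets.UTF_8)
                + "&language=" + URLEncoder.encode(language, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/mpdl/getLemmas?" + query))
                .GET()
                .build();
    }

    /** Sends the request and returns the raw response body. */
    String fetchLemmas(String form, String language) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(lemmaRequest(form, language), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

A FormServlet client would presumably look the same with the path /mpdl/getForms and the parameter name lemmas in place of forms.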
    2259=== Dictionary ===
    2360
     61* WordServlet
     62  * URL: /mpdl/getDictionaryEntries
     63  * Request parameters:
     64    * forms
     65      * one form (string)
     66      * list of forms (XML)
     67    * language
     68      * ISO 639-3 specifier
     69    * type
     70      * full, compact
     71  * Response output:
     72    * dictionary entries
     73      * dictionary entries (XML)
    2474
    25 == XML functions ==
     75* DictionaryEnrichServlet
     76  * URL: /mpdl/enrichByDictionary
     77  * Request parameters:
     78    * srcUrl
     79      * source URL of XML fragment/document
     80  * Response output:
     81    * enriched XML fragment/document
     82      * words of document are extended by links to dictionaries
     83
     84=== Other functions ===
     85
     86* NormalizeServlet
     87  * URL: /mpdl/normalize
     88  * Request parameters:
     89    * srcUrl
     90      * source URL of XML fragment/document
     91    * method
     92      * method of normalization (e.g. "reg", "norm", "reg norm")
     93    * type
     94      * type of normalization (e.g. "display", "dictionary", "search")
     95  * Response output:
     96    * normalized XML fragment/document
     97
     98* TranscodeServlet
     99  * URL: /mpdl/transcode
     100  * Request parameters:
     101    * text
     102      * text to be transcoded (string)
     103    * srcEncoding
     104      * source encoding (e.g. betacode, buckwalter, unicode)
     105    * destEncoding
     106      * destination encoding (e.g. betacode, buckwalter, unicode)
     107  * Response output:
     108    * transcoded text
     109
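For the TranscodeServlet, encoding the text parameter matters more than usual, because Beta Code uses characters such as "=" (circumflex) and "/" (acute) that also have a meaning in URLs. The sketch below assumes HTTP GET with query parameters and a hypothetical base URL. The class name TranscodeRequest is illustrative only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class TranscodeRequest {

    /** Builds a GET URL for /mpdl/transcode; baseUrl is a hypothetical deployment root. */
    static String build(String baseUrl, String text, String srcEncoding, String destEncoding) {
        return baseUrl + "/mpdl/transcode"
                + "?text=" + enc(text)
                + "&srcEncoding=" + enc(srcEncoding)
                + "&destEncoding=" + enc(destEncoding);
    }

    // Percent-encodes a single parameter value as UTF-8.
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "mh=nin" is Beta Code; the "=" must not be read as a key/value separator.
        System.out.println(build("http://localhost:8080/mpiwg-mpdl-lt",
                "mh=nin", "betacode", "unicode"));
    }
}
```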
     110== XML technology ==
     111
     112The XML technology module consists of:
     113* Java source code
     114* Java libraries used
     115* web application configuration file (web.xml)
     116
     117It is available as the web archive file "mpiwg-mpdl-xml.war".
     118
     119The following servlets are available:
    26120
    27121=== XPath/XQuery ===
    28122
    29 === get fragment ===
     123* TransformServlet
     124  * URL: /mpdl/transform
     125  * Request parameters:
     126    * srcUrl
     127      * source URL of XML document
     128    * xslUrl
     129      * URL of XSL document
     130  * Response output:
     131    * transformed document (HTML, XML, etc.)
    30132
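What the TransformServlet does with its two inputs can be illustrated locally with the standard javax.xml.transform API. This is only a sketch of the operation (XML source plus XSL stylesheet in, transformed document out). It is not taken from the MPDL code, it works on in-memory strings rather than on srcUrl and xslUrl, and the class name TransformSketch is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TransformSketch {

    /** Applies the stylesheet xsl to the document xml and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><title>De motu</title></doc>";
        // Stylesheet that emits the title as plain text.
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\"><xsl:value-of select=\"/doc/title\"/></xsl:template>"
                + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```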
     133* RenderServlet
     134  * URL: /mpdl/render
     135  * Request parameters:
     136    * srcUrl
     137      * source URL of XML document
     138  * Response output:
     139    * rendered document (PDF)
     140
     141* XPathServlet
     142  * URL: /mpdl/xpath
     143  * Request parameters:
     144    * srcUrl
     145      * source URL of XML document
     146    * xpath
     147      * XPath expression
     148  * Response output:
     149    * XPath result for that document
     150
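The kind of result an XPath request yields can be tried out locally with the standard javax.xml.xpath API, as in the following sketch. It evaluates an expression against an in-memory document instead of a srcUrl, it is not the servlet's implementation, and the class name XPathSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathSketch {

    /** Evaluates an XPath 1.0 expression against xml and returns the string value of the result. */
    static String evaluate(String xml, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<text><pb/><p>a</p><pb/></text>";
        System.out.println(evaluate(xml, "/text/p"));     // text content of the paragraph
        System.out.println(evaluate(xml, "count(//pb)")); // number of page-break milestones
    }
}
```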
     151* XQueryServlet
     152  * URL: /mpdl/xquery
     153  * Request parameters:
     154    * srcUrl
     155      * source URL of XML document
     156    * xquery
     157      * XQuery source code
     158  * Response output:
     159    * XQuery result for that document
     160
     161* GetFragmentServlet
     162  * URL: /mpdl/getFragment
     163  * Request parameters:
     164    * srcUrl
     165      * source URL of XML document
     166    * ms1Name
     167      * first milestone name, e.g. "pb"
     168    * ms1Position
     169      * first milestone position, e.g. 1
     170    * ms2Name
     171      * second milestone name, e.g. "pb"
     172    * ms2Position
     173      * second milestone position, e.g. 2
     174  * Response output:
     175    * XML fragment
     176
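The milestone parameters can be read as "the n-th occurrence of an empty marker element such as pb". The following sketch illustrates that reading on a plain string. It assumes that the fragment starts at the first milestone and ends just before the second one, which the list above does not specify, and it uses simple string search instead of an XML parser, so it does not repair unbalanced tags in the result. The class name MilestoneFragment is hypothetical.

```java
final class MilestoneFragment {

    /** Index of the position-th (1-based) start tag of element name, or -1 if absent. */
    static int indexOfMilestone(String xml, String name, int position) {
        String open = "<" + name;
        int idx = -1;
        int found = 0;
        while (found < position) {
            idx = xml.indexOf(open, idx + 1);
            if (idx < 0) {
                return -1;
            }
            int next = idx + open.length();
            char c = next < xml.length() ? xml.charAt(next) : '>';
            // Count only real matches: "<pb" must not match "<pbx".
            if (c == '>' || c == '/' || Character.isWhitespace(c)) {
                found++;
            }
        }
        return idx;
    }

    /** Returns the text from the first milestone up to (not including) the second one. */
    static String between(String xml, String ms1Name, int ms1Position,
                          String ms2Name, int ms2Position) {
        int start = indexOfMilestone(xml, ms1Name, ms1Position);
        int end = indexOfMilestone(xml, ms2Name, ms2Position);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("milestone not found or out of order");
        }
        return xml.substring(start, end);
    }

    public static void main(String[] args) {
        String xml = "<text><pb n=\"1\"/>first page<pb n=\"2\"/>second page</text>";
        System.out.println(between(xml, "pb", 1, "pb", 2));
    }
}
```

With ms1Name and ms2Name both set to "pb" and positions 1 and 2, this reading corresponds to fetching the first page of a document.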