Version 2 (modified by jwillenborg, 13 years ago)

MPDL 2.0

MPDL release 2.0 is redesigned so that all important functions (language technology, XML functions) can be used as web applications independently of the eXist software. They are available as HTTP servlets and are fully implemented in Java.

Language technology

The language technology module consists of:

  • language technology data (XML data files, Java Berkeley DB databases)
    • morphology data (ara, eng, fre, ger, gre, ita, lat, nld, zho)
    • dictionary data
  • Java source code
  • Java libraries used
  • web application configuration file (web.xml)

It is available as the web archive file "mpiwg-mpdl-lt.war".

The following servlets are available:

Morphology

  • TokenizeServlet
    • URL: /mpdl/tokenize
    • Request parameters:
      • srcUrl
        • source URL of fulltext
          • unstructured text
          • XML fragment/document
      • language
        • ISO 639-3 specifier
    • Response output:
      • word tokens
        • word tokens (XML)
  • LemmaServlet
    • URL: /mpdl/getLemmas
    • Request parameters:
      • forms
        • one word form (string)
        • list of word forms (XML)
      • language
        • ISO 639-3 specifier
    • Response output:
      • lemmas
        • one lemma
        • list of lemmas (XML)
  • FormServlet
    • URL: /mpdl/getForms
    • Request parameters:
      • lemmas
        • one lemma (string)
        • list of lemmas (XML)
      • language
        • ISO 639-3 specifier
    • Response output:
      • forms
        • list of forms (XML)
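
The request parameters above can be passed to a servlet as a query string. The following is a minimal sketch of building such a request URL for /mpdl/getLemmas. It assumes that the servlets accept their parameters in an HTTP GET query string; the host, port and context path ("http://localhost:8080/mpiwg-mpdl-lt") and the class name MpdlUrlBuilder are hypothetical and not taken from this page.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class MpdlUrlBuilder {

    /** Joins a base URL, a servlet path and percent-encoded query parameters. */
    static String build(String baseUrl, String servletPath, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            // Encode names and values so that spaces, '&', '=' etc. survive transport.
            query.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + servletPath + (params.isEmpty() ? "" : "?" + query);
    }

    public static void main(String[] args) {
        // Hypothetical deployment location; the page does not name a host or context path.
        String base = "http://localhost:8080/mpiwg-mpdl-lt";
        Map<String, String> params = new LinkedHashMap<>();
        params.put("forms", "amavit");   // one word form (string)
        params.put("language", "lat");   // ISO 639-3 specifier
        System.out.println(build(base, "/mpdl/getLemmas", params));
    }
}
```

The same helper would work for /mpdl/tokenize and /mpdl/getForms by changing the servlet path and the parameter names (srcUrl, lemmas).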

Dictionary

  • WordServlet
    • URL: /mpdl/getDictionaryEntries
    • Request parameters:
      • forms
        • one form (string)
        • list of forms (XML)
      • language
        • ISO 639-3 specifier
      • type
        • full, compact
    • Response output:
      • dictionary entries
        • dictionary entries (XML)
  • DictionaryEnrichServlet
    • URL: /mpdl/enrichByDictionary
    • Request parameters:
      • srcUrl
        • source URL of XML fragment/document
    • Response output:
      • enriched XML fragment/document
        • words in the document are extended with links to dictionaries
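
A client could issue a dictionary lookup with the JDK's java.net.http API. The sketch below only constructs the request for /mpdl/getDictionaryEntries and models the two listed "type" values as an enum. It assumes GET with query parameters; the base URL and the names DictionaryRequestSketch and EntryType are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class DictionaryRequestSketch {

    /** The two values this page lists for the "type" parameter. */
    enum EntryType { FULL, COMPACT }

    /** Builds (but does not send) a GET request for one word form. */
    static HttpRequest dictionaryEntries(String baseUrl, String form, String language, EntryType type) {
        String query = "forms=" + URLEncoder.encode(form, StandardCharsets.UTF_8)
                + "&language=" + URLEncoder.encode(language, StandardCharsets.UTF_8)
                + "&type=" + type.name().toLowerCase(Locale.ROOT);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/mpdl/getDictionaryEntries?" + query))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical base URL; adjust to the actual deployment.
        HttpRequest request = dictionaryEntries(
                "http://localhost:8080/mpiwg-mpdl-lt", "amor", "lat", EntryType.COMPACT);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, it can be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the response body should then contain the dictionary entries as XML, as described above.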

Other functions

  • NormalizeServlet
    • URL: /mpdl/normalize
    • Request parameters:
      • srcUrl
        • source URL of XML fragment/document
      • method
        • method of normalization (e.g. "reg", "norm", "reg norm")
      • type
        • type of normalization (e.g. "display", "dictionary", "search")
    • Response output:
      • normalized XML fragment/document
  • TranscodeServlet
    • URL: /mpdl/transcode
    • Request parameters:
      • text
        • text to be transcoded (string)
      • srcEncoding
        • source encoding (e.g. betacode, buckwalter, unicode)
      • destEncoding
        • destination encoding (e.g. betacode, buckwalter, unicode)
    • Response output:
      • transcoded text
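
Transliteration schemes use characters that are significant in a query string: Buckwalter, for example, uses "&" for waw with hamza above. The sketch below illustrates why the "text" value should be percent-encoded before it is sent to /mpdl/transcode. It assumes the parameters travel in a query string; the class name and the simple parse() helper, which only approximates how a servlet container splits a query, are hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class TranscodeQuerySketch {

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Builds the query string for /mpdl/transcode with every value percent-encoded. */
    static String encodedQuery(String text, String srcEncoding, String destEncoding) {
        return "text=" + enc(text)
                + "&srcEncoding=" + enc(srcEncoding)
                + "&destEncoding=" + enc(destEncoding);
    }

    /** Splits a query on '&' and '=' and decodes each part (simplified illustration). */
    static Map<String, String> parse(String query) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            out.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "mu&min"; // Buckwalter-style input containing '&'
        String safe = encodedQuery(text, "buckwalter", "unicode");
        String naive = "text=" + text + "&srcEncoding=buckwalter&destEncoding=unicode";
        System.out.println(parse(safe).get("text"));  // the full input text
        System.out.println(parse(naive).get("text")); // truncated at the raw '&'
    }
}
```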

XML technology

The XML technology module consists of:

  • Java source code
  • Java libraries used
  • web application configuration file (web.xml)

It is available as the web archive file "mpiwg-mpdl-xml.war".

The following servlets are available:

XPath/XQuery

  • TransformServlet
    • URL: /mpdl/transform
    • Request parameters:
      • srcUrl
        • source URL of XML document
      • xslUrl
        • URL of XSL document
    • Response output:
      • transformed document (HTML, XML, etc.)
  • RenderServlet
    • URL: /mpdl/render
    • Request parameters:
      • srcUrl
        • source URL of XML document
    • Response output:
      • rendered document (PDF)
  • XPathServlet
    • URL: /mpdl/xpath
    • Request parameters:
      • srcUrl
        • source URL of XML document
      • xpath
        • XPath expression
    • Response output:
      • XPath result for that document
  • XQueryServlet
    • URL: /mpdl/xquery
    • Request parameters:
      • srcUrl
        • source URL of XML document
      • xquery
        • XQuery source code
    • Response output:
      • XQuery result for that document
  • GetFragmentServlet
    • URL: /mpdl/getFragment
    • Request parameters:
      • srcUrl
        • source URL of XML document
      • ms1Name
        • first milestone name, e.g. "pb"
      • ms1Position
        • first milestone position, e.g. 1
      • ms2Name
        • second milestone name, e.g. "pb"
      • ms2Position
        • second milestone position, e.g. 2
    • Response output:
      • XML fragment
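
The milestone parameters of /mpdl/getFragment can be illustrated with a local sketch. It is not the servlet's implementation: it only collects the text found between two milestone elements, whereas the servlet returns an XML fragment. It assumes that positions are 1-based and count occurrences of the milestone element in document order; the class name MilestoneTextSketch and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class MilestoneTextSketch {
    private final String ms1Name;
    private final int ms1Position;
    private final String ms2Name;
    private final int ms2Position;
    private int seen1;          // occurrences of ms1Name seen so far
    private int seen2;          // occurrences of ms2Name seen so far
    private boolean inside;     // true between the two milestones
    private boolean done;       // true once the second milestone is reached
    private final StringBuilder text = new StringBuilder();

    private MilestoneTextSketch(String ms1Name, int ms1Position, String ms2Name, int ms2Position) {
        this.ms1Name = ms1Name;
        this.ms1Position = ms1Position;
        this.ms2Name = ms2Name;
        this.ms2Position = ms2Position;
    }

    /** Returns the text between the ms1Position-th ms1Name and the ms2Position-th ms2Name element. */
    static String textBetween(String xml, String ms1Name, int ms1Position,
                              String ms2Name, int ms2Position) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            MilestoneTextSketch walker =
                    new MilestoneTextSketch(ms1Name, ms1Position, ms2Name, ms2Position);
            walker.visit(doc.getDocumentElement());
            return walker.text.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Depth-first walk in document order, switching collection on and off at the milestones. */
    private void visit(Node node) {
        if (done) {
            return;
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            if (name.equals(ms1Name) && ++seen1 == ms1Position) {
                inside = true;
            }
            if (name.equals(ms2Name) && ++seen2 == ms2Position) {
                inside = false;
                done = true;
                return;
            }
        } else if (node.getNodeType() == Node.TEXT_NODE && inside) {
            text.append(node.getNodeValue());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            visit(child);
        }
    }

    public static void main(String[] args) {
        String xml = "<text><pb/>first page <s>text</s><pb/>second page<pb/>third</text>";
        // Same values as the examples above: ms1Name=pb, ms1Position=1, ms2Name=pb, ms2Position=2.
        System.out.println(textBetween(xml, "pb", 1, "pb", 2)); // prints "first page text"
    }
}
```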