= MPDL 2.0: software selection =

We evaluate several content management systems (CMS) by their main functions and features against our extended user requirements for web-based access to XML documents. The main new requirements are:
* state-of-the-art web GUI
* datastore / repository functionality (user management, versioning of documents, history function)
* user notes / annotations within documents
* easy document uploads
* scalability
* simple architecture
* system-independent application design
* easy administration

== Software candidates ==

We limit our software selection to the four systems Magnolia ^[#note1 [1]]^, mediaWiki, eXist and eSciDoc, plus an in-house CMS development:

|| ||Magnolia||mediaWiki||eXist||eSciDoc||own development ^[#note2 [2]]^||
||Webpage||  [http://www.magnolia-cms.com/ here]  ||  [http://www.mediawiki.org/wiki/MediaWiki here]  ||  [http://exist.sourceforge.net/ here]  ||  [https://www.escidoc.org/ here]  ||  -  ||
||MPIWG installation||  [http://mediathek.mpiwg-berlin.mpg.de/mediathekPublic/versionEins Mediathek]  ||  -  ||  [http://mpdl-proto.mpiwg-berlin.mpg.de/mpdl/query.xql MPDL system]  ||  [http://pubman.mpiwg-berlin.mpg.de/pubman/ Pubman]  ||  -  ||

References: [[br]]
[=#note1 1.] Magnolia contains Apache Jackrabbit; a similar system is [http://www.alfresco.com Alfresco] [[br]]
[=#note2 2.] in-house development with Servlets, Lucene for indexing and querying, and Git for versioning documents

== System architectures ==

=== Old system MPDL 1.0: eXist ===
  [[Image(architectureMpdl1.0-eXist.jpg, 600px)]]

=== MPDL 2.0: Language/XML technology ===
  [[Image(architectureMpdl2.0-xml-lt.jpg, 600px)]]

=== MPDL 2.0 - CMS: Magnolia ===
  [[Image(architectureMpdl2.0-cms-magnolia.jpg, 600px)]]

=== MPDL 2.0 - CMS: mediaWiki ===
  [[Image(architectureMpdl2.0-cms-mediawiki.jpg, 600px)]]

=== MPDL 2.0 - CMS: eXist ===
  [[Image(architectureMpdl2.0-cms-eXist.jpg, 600px)]]

=== MPDL 2.0 - CMS: eSciDoc ===
  [[Image(architectureMpdl2.0-cms-eSciDoc.jpg, 600px)]]

=== MPDL 2.0 - CMS: own development ===
  No image available yet.

== Software design ==

See the MPDL 2.0 software design [/wiki/mpdl2.0-design here].


== Comparison of the main system features ==

=== Basic ===

|| ||Magnolia||mediaWiki||eXist||eSciDoc||own development||
||Scalable||++||++||- ^[#note1 [1]]^||++||++||
||Simple architecture||+||+||-||-||+||
||Stable||++||++||- ^[#note2 [2]]^||+||-||
||Performant||++||++||++||+||+||
||Customizable||++||++||++||++||++||
||Many installations||++||++||+||-||-||
||In use at MPIWG||+||-||+||+||-||
||Easy administration||+||+||+||-||+||
||Free software||+||++||++||++||++||

References: [[br]]
[=#note1 1.] does not scale well to many documents (> 1000 XML documents of 1 MB each) [[br]]
[=#note2 2.] many system crashes on Mac OS X servers at startup time (up to 2 days); occasional crashes when a document is uploaded [[br]]

[[br]]
=== Datastore / Repository ===

|| ||Magnolia||mediaWiki||eXist||eSciDoc||own development||
||User management||++||++||-||+||+ ^[#note1 [1]]^||
||Version control system||++||++||-||+||++||
||History presentation||++||++||-||-||++||
||Index / Fulltext query system||++||++||++||++||++||
||Many document formats (xml, pdf, doc, html)||++||++||+||++||+||
||Multimedia support||++||++||-||+||-||
||Discussions / Blogs||++||++||-||-||-||
||RDBMS support||++||++||-||++||-||
||Wiki support||++||++||-||-||-||
||JCR support||++||-||-||-||-||

[[br]]
[=#note1 1.] the user management of Git is used [[br]]

=== Extensions / Development ===

|| ||Magnolia||mediaWiki||eXist||eSciDoc||own development||
||Predefined extensions / templates||++||++ ^[#note1 [1]]^ ^[#note2 [2]]^||-||-||-||
||Powerful programming||++ ^[#note3 [3]]^||++ ^[#note4 [4]]^||++ ^[#note5 [5]]^||+ ^[#note6 [6]]^||+||
||Easy application development||+||+||+||-||+||
||Index / Query system||++||++||++||++||++||
||Java Servlet / JSP support||++||-||++||++||++||
||XQuery / XPath support||+ ^[#note7 [7]]^||+ ^[#note7 [7]]^||++||+ ^[#note7 [7]]^||+||
||XML / XSL support||+||+ ^[#note7 [7]]^||++||+ ^[#note7 [7]]^||+||
||Notes / Annotations ^[#note8 [8]]^||-||-||-||-||-||
||Web Development / Web page editor||++||++||-||-||-||


References: [[br]]
[=#note1 1.] footnotes: [http://meta.wikimedia.org/wiki/Help:Footnotes internal], [http://www.mediawiki.org/wiki/Extension:Cite Cite] [[br]]
[=#note2 2.] [http://wikisource.org presentation of old books] [[br]]
[=#note3 3.] Servlet / JSP, XSL / CSS, Freemarker, limited XPath (also works in JCR 2.0?) [[br]]
[=#note4 4.] PHP, Java via !JavaBridge [[br]]
[=#note5 5.] Java, Servlet / JSP, XQuery / XPath, XSL / CSS [[br]]
[=#note6 6.] Java, Servlet / JSP [[br]]
[=#note7 7.] has to be implemented in Java (relatively easy) [[br]]
[=#note8 8.] anchored at a point in the (XML) document: after/before an element, after/before a word, at the 10th character after the beginning, etc. (see [/wiki/schema/xpointer XPointer]) [[br]]
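
The anchoring described in note 8 can be illustrated with a minimal sketch. It assumes that an annotation stores a path to an element plus a character offset into that element's text. The `Annotation` class, its field names and the sample document are hypothetical, and Python's `xml.etree.ElementTree` path syntax stands in for a full pointer scheme, which none of the candidate systems provides out of the box.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical annotation record: where it points and what it says."""
    element_path: str  # path to the anchor element (ElementTree path syntax)
    char_offset: int   # number of characters before the anchor point
    note: str          # the user's note


def resolve(root, annotation):
    """Return the element text split at the annotated character position."""
    element = root.find(annotation.element_path)
    if element is None:
        raise LookupError("no element at " + annotation.element_path)
    # Join the text of the element and all of its descendants.
    text = "".join(element.itertext())
    if not 0 <= annotation.char_offset <= len(text):
        raise ValueError("offset lies outside the element text")
    return text[:annotation.char_offset], text[annotation.char_offset:]


# Sample document, made up for illustration.
doc = ET.fromstring(
    "<text><p>First paragraph.</p>"
    "<p>Second paragraph of the document.</p></text>"
)

# Anchor a note after the 10th character of the second paragraph.
annotation = Annotation("./p[2]", 10, "check this word")
before, after = resolve(doc, annotation)
print(repr(before), repr(after))
```

Storing the anchor outside the document keeps the XML source unchanged, but the path and offset have to be re-validated whenever a new document version is uploaded.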

[[br]]
=== MPDL software ===

|| ||Magnolia||mediaWiki||eXist||eSciDoc||own development||
||Get XML page fragment||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document web viewer||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document upload||++||++||+ ^[#note1 [1]]^ ^[#note2 [2]]^||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document page dictionary view||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Browse lexicons and morph. database||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Morphological fulltext search over all documents||+ ^[#note3 [3]]^ ^[#note4 [4]]^||+ ^[#note4 [4]]^||++||+ ^[#note4 [4]]^||+ ^[#note4 [4]]^||
||Morphological fulltext search in one document||+ ^[#note5 [5]]^||+ ^[#note5 [5]]^||++||+ ^[#note5 [5]]^||+ ^[#note5 [5]]^||

References: [[br]]
[=#note1 1.] already implemented but not stable enough [[br]]
[=#note2 2.] has to be implemented (relatively easy) [[br]]
[=#note3 3.] a custom Lucene analyzer class could be set in SearchIndex; an extractor could also extract XML nodes into properties, which could then be searched by analyzers assigned to them [[br]]
[=#note4 4.] has to be implemented (Lucene implementation, effort not yet known) [[br]]
[=#note5 5.] could be implemented with XSL, language technology and the regular expression function "matches" (relatively easy and performant) [[br]]
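
The regular expression approach to morphological search within one document can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: the `MORPHOLOGY` table is a hypothetical stand-in for the lexicon and morphology database, and Python's `re` module takes the place of the "matches" function that would be called from XSL.

```python
import re

# Hypothetical stand-in for the morphology database: lemma -> word forms.
MORPHOLOGY = {
    "move": ["move", "moves", "moved", "moving"],
}


def morphological_pattern(lemma):
    """Build one regular expression matching every known form of a lemma."""
    forms = MORPHOLOGY.get(lemma, [lemma])
    # Escape each form and put longer forms first in the alternation.
    ordered = sorted(forms, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in ordered)
    # Word boundaries restrict the hits to whole words.
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def search_document(text, lemma):
    """Return (character position, matched form) for every hit in the text."""
    pattern = morphological_pattern(lemma)
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


# Sample text, made up for illustration.
text = "The body moves when it is moved; moving bodies keep on moving."
hits = search_document(text, "move")
print(hits)
```

The word forms are expanded once per query, so the search itself is a single pass over the document text; this is the reason the approach is expected to perform well for one document, while a search over all documents needs an index.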