MPDL 2.0: software selection

We evaluate several content management systems (CMS) by their main functions and features, measured against our extended user requirements for web-based access to XML documents. The main new requirements are:

  • state-of-the-art web GUI
  • datastore / repository functionality (user management, versioning of documents, history function)
  • user notes / annotations within documents
  • easy document uploads
  • scalability
  • simple architecture
  • system-independent application design
  • easy administration

Software candidates

We limit the selection to four existing systems, Magnolia [1], mediaWiki, eXist and eSciDoc, and to a CMS developed in-house:

System | Magnolia | mediaWiki | eXist | eSciDoc | own development [2]
Webpage | here | here | here | here | -
MPIWG installation | Mediathek | - | MPDL system | Pubman | -

1. Magnolia contains Apache Jackrabbit; Alfresco is a similar system
2. in-house development with Servlets, Lucene for indexing and querying, and Git for versioning documents

System architectures

Old system MPDL 1.0: eXist

MPDL 2.0: Language/XML technology

MPDL 2.0 - CMS: Magnolia

MPDL 2.0 - CMS: mediaWiki

MPDL 2.0 - CMS: eXist

MPDL 2.0 - CMS: eSciDoc

MPDL 2.0 - CMS: own development

no image available so far

Software design

See the MPDL 2.0 software design here.

Comparison of the main system features


Feature | Magnolia | mediaWiki | eXist | eSciDoc | own development
Scalable | ++ | ++ | - [1] | ++ | ++
Simple architecture | + | + | - | - | +
Stable | ++ | ++ | - [2] | + | -
Many installations | +++++--
In use at MPIWG | + | - | + | + | -
Easy administration | + | + | + | - | +
Free software | +++++++++

1. does not scale well to many documents (more than 1000 XML documents of 1 MB each)
2. frequent system crashes at startup on Mac OS X servers (up to 2 days); occasional crashes when a document is uploaded

Datastore / Repository

Feature | Magnolia | mediaWiki | eXist | eSciDoc | own development
User management | ++++-++ [1]
Version control system | ++++-+++
History presentation | ++ | ++ | - | - | ++
Index / Fulltext query system | ++ | ++ | ++ | ++ | ++
Many document formats (xml, pdf, doc, html) | ++++++++
Multimedia support | ++ | ++ | - | + | -
Discussions / Blogs | ++ | ++ | - | - | -
RDBMS support | ++ | ++ | - | ++ | -
Wiki support | ++ | ++ | - | - | -
JCR support | ++ | - | - | - | -

1. the user management of Git is used

Extensions / Development

Feature | Magnolia | mediaWiki | eXist | eSciDoc | own development
Predefined extensions / templates | ++ | ++ [1] [2] | - | - | -
Powerful programming | ++ [3] | ++ [4] | ++ [5] | + [6] | +
Easy application development | + | + | + | - | +
Index / Query system | ++ | ++ | ++ | ++ | ++
Java Servlet / JSP support | ++ | - | ++ | ++ | ++
XQuery / XPath support | + [7] | + [7] | ++ | + [7] | +
XML / XSL support | + | + [7] | ++ | + [7] | +
Notes / Annotations [8] | - | - | - | - | -
Web Development / Web page editor | ++ | ++ | - | - | -

1. footnotes: internal, Cite
2. presentation of old books
3. Servlet / JSP, XSL / CSS, Freemarker, limited XPath (does it also work in JCR 2.0?)
4. PHP, Java via JavaBridge
5. Java, Servlet / JSP, XQuery / XPath, XSL / CSS
6. Java, Servlet / JSP
7. has to be implemented in Java (relatively easy)
8. anchored at a point in the XML document: before or after an element, before or after a word, at the 10th character from the beginning, etc. (see XPointer)
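
Footnote 8 describes annotations that are anchored at a point in the document. The following Python sketch (standard library only) illustrates one possible way to represent such an anchor. The Anchor class, its field names and the sample document are hypothetical and belong to none of the evaluated systems; the scheme (an element path plus a character offset) is far simpler than full XPointer.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Anchor:
    """Hypothetical annotation anchor: element path plus character offset."""
    element_path: str  # ElementTree path expression, e.g. "./p[2]"
    char_offset: int   # 0-based offset into the element's text content
    note: str          # the annotation text itself


def resolve(root, anchor):
    """Return the text before and after the anchored point."""
    element = root.find(anchor.element_path)
    if element is None:
        raise LookupError(f"no element at {anchor.element_path}")
    text = "".join(element.itertext())
    if not 0 <= anchor.char_offset <= len(text):
        raise ValueError("offset lies outside the element text")
    return text[:anchor.char_offset], text[anchor.char_offset:]


# Sample document (an assumption, for illustration only).
doc = ET.fromstring(
    "<text><p>First paragraph.</p><p>Second paragraph of the book.</p></text>"
)
anchor = Anchor("./p[2]", 10, "check this reading")
before, after = resolve(doc, anchor)
print(repr(before), "|", repr(after))
```

Here the note is attached after the 10th character of the second paragraph. Storing the anchor separately from the document keeps the XML source unchanged, at the cost of anchors becoming stale when the document is edited.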

MPDL software

MagnoliamediaWikieXisteSciDocown development
Get XML page fragment | + [2] | + [2] | ++ | + [2] | + [2]
Document web viewer | + [2] | + [2] | ++ | + [2] | + [2]
Document upload | ++ | ++ | + [1] [2] | + [2] | + [2]
Document page dictionary view | + [2] | + [2] | ++ | + [2] | + [2]
Browse lexicons and morph. database | + [2] | + [2] | ++ | + [2] | + [2]
Morphological fulltext search over all documents | + [3][4] | + [4] | ++ | + [4] | + [4]
Morphological fulltext search in one document | + [5] | + [5] | ++ | + [5] | + [5]

1. already implemented but not stable enough
2. has to be implemented (relatively easy)
3. a custom Lucene analyzer class could be set in SearchIndex; in addition, an extractor could copy XML nodes into properties, which could then be searched with analyzers assigned to them
4. has to be implemented (Lucene implementation, effort not yet known)
5. could be implemented with XSL, language technology and the regular expression function "matches" (relatively easy and performant)
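
The idea in footnote 5 can be sketched as follows. This is only an illustration in Python: it uses Python's re module in place of the XSL / XPath "matches" function, and the FORMS dictionary is a hypothetical stand-in for the morphological database, so none of the names below come from the MPDL software.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the morphological database: lemma -> word forms.
FORMS = {"move": ["move", "moves", "moved", "moving"]}


def find_forms(xml_text, lemma):
    """Return (element tag, matched form) pairs for all forms of a lemma.

    Only the direct text of each element is searched (not tail text),
    which is enough for this flat sample document.
    """
    forms = sorted(FORMS.get(lemma, [lemma]), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for element in ET.fromstring(xml_text).iter():
        for match in pattern.finditer(element.text or ""):
            hits.append((element.tag, match.group(0)))
    return hits


# Sample document (an assumption, for illustration only).
sample = (
    "<text><p>The body moves.</p>"
    "<p>It moved because something is moving it.</p></text>"
)
hits = find_forms(sample, "move")
print(hits)
```

A search for the lemma "move" finds the forms "moves", "moved" and "moving" in the sample. Expanding the lemma into its forms before matching keeps the search itself a plain regular expression match, which is why this variant is comparatively cheap for a single document.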

Last modified on Feb 1, 2012 4:16:09 PM
