= MPDL 2.0: software selection =

We evaluate the candidate content management systems (CMS) by their main functions and features against our extended user requirements for web-based access to XML documents. The main new requirements are:
 * state-of-the-art web GUI
 * datastore / repository functionality (user management, versioning of documents, history function)
 * user notes / annotations within documents
 * easy document uploads
 * scalability
 * simple architecture
 * system-independent application design
 * easy administration

== Software candidates ==

We limit our software selection to four existing systems (Magnolia ^[#note1 [1]]^, MediaWiki, eXist and eSciDoc) and a CMS of our own development:

|| ||Magnolia||MediaWiki||eXist||eSciDoc||own development ^[#note2 [2]]^||
||Webpage|| [http://www.magnolia-cms.com/ here] || [http://www.mediawiki.org/wiki/MediaWiki here] || [http://exist.sourceforge.net/ here] || [https://www.escidoc.org/ here] || - ||
||MPIWG installation|| [http://mediathek.mpiwg-berlin.mpg.de/mediathekPublic/versionEins Mediathek] || - || [http://mpdl-proto.mpiwg-berlin.mpg.de/mpdl/query.xql MPDL system] || [http://pubman.mpiwg-berlin.mpg.de/pubman/ Pubman] || - ||

References: [[br]]
[=#note1 1.] Magnolia contains Apache Jackrabbit; a similar system is [http://www.alfresco.com Alfresco] [[br]]
[=#note2 2.]
own development with Servlets, Lucene for indexing/querying, and Git for versioning documents

== System architectures ==

=== Old system MPDL 1.0: eXist ===
[[Image(architectureMpdl1.0-eXist.jpg, 600px)]]

=== MPDL 2.0: Language/XML technology ===
[[Image(architectureMpdl2.0-xml-lt.jpg, 600px)]]

=== MPDL 2.0 - CMS: Magnolia ===
[[Image(architectureMpdl2.0-cms-magnolia.jpg, 600px)]]

=== MPDL 2.0 - CMS: MediaWiki ===
[[Image(architectureMpdl2.0-cms-mediawiki.jpg, 600px)]]

=== MPDL 2.0 - CMS: eXist ===
[[Image(architectureMpdl2.0-cms-eXist.jpg, 600px)]]

=== MPDL 2.0 - CMS: eSciDoc ===
[[Image(architectureMpdl2.0-cms-eSciDoc.jpg, 600px)]]

=== MPDL 2.0 - CMS: own development ===
No image available so far.

== Software design ==

See the MPDL 2.0 software design [/wiki/mpdl2.0-design here].

== Comparison of the main system features ==

=== Basic ===

|| ||Magnolia||MediaWiki||eXist||eSciDoc||own development||
||Scalable||++||++||- ^[#note1 [1]]^||++||++||
||Simple architecture||+||+||-||-||+||
||Stable||++||++||- ^[#note2 [2]]^||+||-||
||Performant||++||++||++||+||+||
||Customizable||++||++||++||++||++||
||Many installations||++||++||+||-||-||
||In use at MPIWG||+||-||+||+||-||
||Easy administration||+||+||+||-||+||
||Free software||+||++||++||++||++||

References: [[br]]
[=#note1 1.] not really scalable for many documents (> 1000 XML documents of 1 MB each) [[br]]
[=#note2 2.]
many system crashes on Mac OS X servers at startup time (up to 2 days); occasional system crashes when a document is uploaded [[br]]
[[br]]

=== Datastore / Repository ===

|| ||Magnolia||MediaWiki||eXist||eSciDoc||own development||
||User management||++||++||-||+||+ ^[#note1 [1]]^||
||Version control system||++||++||-||+||++||
||History presentation||++||++||-||-||++||
||Index / Fulltext query system||++||++||++||++||++||
||Many document formats (xml, pdf, doc, html)||++||++||+||++||+||
||Multimedia support||++||++||-||+||-||
||Discussions / Blogs||++||++||-||-||-||
||RDBMS support||++||++||-||++||-||
||Wiki support||++||++||-||-||-||
||JCR support||++||-||-||-||-||

References: [[br]]
[=#note1 1.] the user management of Git is used [[br]]

=== Extensions / Development ===

|| ||Magnolia||MediaWiki||eXist||eSciDoc||own development||
||Predefined extensions / templates||++||++ ^[#note1 [1]]^ ^[#note2 [2]]^||-||-||-||
||Powerful programming||++ ^[#note3 [3]]^||++ ^[#note4 [4]]^||++ ^[#note5 [5]]^||+ ^[#note6 [6]]^||+||
||Easy application development||+||+||+||-||+||
||Index / Query system||++||++||++||++||++||
||Java Servlet / JSP support||++||-||++||++||++||
||XQuery / XPath support||+ ^[#note7 [7]]^||+ ^[#note7 [7]]^||++||+ ^[#note7 [7]]^||+||
||XML / XSL support||+||+ ^[#note7 [7]]^||++||+ ^[#note7 [7]]^||+||
||Notes / Annotations ^[#note8 [8]]^||-||-||-||-||-||
||Web Development / Web page editor||++||++||-||-||-||

References: [[br]]
[=#note1 1.] footnotes: [http://meta.wikimedia.org/wiki/Help:Footnotes internal], [http://www.mediawiki.org/wiki/Extension:Cite Cite] [[br]]
[=#note2 2.] [http://wikisource.org presentation of old books] [[br]]
[=#note3 3.] Servlet / JSP, XSL / CSS, Freemarker, limited XPath (does this also work in JCR 2.0?) [[br]]
[=#note4 4.] PHP, Java via !JavaBridge [[br]]
[=#note5 5.] Java, Servlet / JSP, XQuery / XPath, XSL / CSS [[br]]
[=#note6 6.] Java, Servlet / JSP [[br]]
[=#note7 7.] has to be implemented in Java (relatively easy) [[br]]
[=#note8 8.]
at a point in the (XML) document: after/before an element, after/before a word, at the 10th character after the beginning, etc. (see [/wiki/schema/xpointer XPointer]) [[br]]
[[br]]

=== MPDL software ===

|| ||Magnolia||MediaWiki||eXist||eSciDoc||own development||
||Get XML page fragment||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document web viewer||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document upload||++||++||+ ^[#note1 [1]]^ ^[#note2 [2]]^||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Document page dictionary view||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Browse lexicons and morph. database||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||++||+ ^[#note2 [2]]^||+ ^[#note2 [2]]^||
||Morphological fulltext search over all documents||+ ^[#note3 [3]]^ ^[#note4 [4]]^||+ ^[#note4 [4]]^||++||+ ^[#note4 [4]]^||+ ^[#note4 [4]]^||
||Morphological fulltext search in one document||+ ^[#note5 [5]]^||+ ^[#note5 [5]]^||++||+ ^[#note5 [5]]^||+ ^[#note5 [5]]^||

References: [[br]]
[=#note1 1.] already implemented but not stable enough [[br]]
[=#note2 2.] has to be implemented (relatively easy) [[br]]
[=#note3 3.] a custom Lucene analyzer class could be set in SearchIndex; in addition, an extractor could extract XML nodes into properties, which could then be searched with analyzers assigned specifically to them [[br]]
[=#note4 4.] has to be implemented (Lucene implementation, effort not yet known) [[br]]
[=#note5 5.] could be implemented with XSL, language technology and the regular expression function "matches" (relatively easy and performant) [[br]]
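
The last footnote above proposes implementing the morphological search within one document as a regular-expression "matches" test over the word forms of a lemma. The following is only a minimal sketch of that idea, written in Python instead of XSL for brevity. The form table `FORMS_BY_LEMMA`, the sample document, the `p` element with its `n` attribute, and both function names are assumptions made for illustration; none of them is part of the MPDL software, and a real system would take the word forms from the morphological database.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical morphology table: lemma -> known inflected forms.
# Assumption for illustration only; not taken from the MPDL lexicons.
FORMS_BY_LEMMA = {
    "movere": ["moveo", "movet", "movetur", "motus", "motum"],
}

# Hypothetical sample document; element and attribute names are assumptions.
SAMPLE = (
    "<text>"
    "<p n='1'>Corpus <emph>movetur</emph> a motore.</p>"
    "<p n='2'>Quies est privatio motus.</p>"
    "</text>"
)


def build_form_pattern(lemma, forms_by_lemma=FORMS_BY_LEMMA):
    """Compile one regex that matches any known form of the lemma as a whole word."""
    forms = forms_by_lemma.get(lemma, [lemma])
    alternatives = "|".join(re.escape(form) for form in forms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def search_document(xml_text, lemma, unit_tag="p"):
    """Return (unit number, matched form) pairs for every hit in one document."""
    pattern = build_form_pattern(lemma)
    root = ET.fromstring(xml_text)
    hits = []
    for unit in root.iter(unit_tag):
        # Join all nested text so that inline markup does not hide words.
        text = "".join(unit.itertext())
        for match in pattern.finditer(text):
            hits.append((unit.get("n"), match.group(0)))
    return hits


if __name__ == "__main__":
    for number, form in search_document(SAMPLE, "movere"):
        print(number, form)
```

In this sketch the word "motore" is not reported, because only forms listed for the lemma are matched as whole words. The same pattern could be passed to the XPath 2.0 "matches" function inside an XSL stylesheet, which is closer to what the footnote describes.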