= MPIWG-MPDL Content Project =

This is the wiki for the XML Workflow Service subproject within the [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/WikiStart/MPDL_project_desc.pdf cooperative project between the MPIWG and MPDL]. At present, two working groups are active, "Data Entry Specs WG" and "Document Schema WG". For protocols, check the ProtocolIndex.

The other subproject is the [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-software Software Development Project].

== Data Entry Specs WG ==

=== Existing data entry specs ===

Some preliminary notes on character issues can be found on the page CharacterIssues.

Some old DE specs may be found under LegacySpecs.

Raw data entry versions of Archimedes texts can be accessed through the [http://archimedes.mpiwg-berlin.mpg.de/cvs-web/read/cvswebread.cgi/texts/archimedes/raw/ WebCVS interface].

Malcolm's [wiki:HighLevelRequirements high level requirements] from the large team meeting on 2008-09-11.

=== Sample texts to examine ===

Some [wiki:SampleTexts Sample texts] from the ECHO collection.

For problems with and requests for ECHO, see [wiki:EchoRemarks here].

["Provisional list"] of books to be digitized.

=== References on encoding ===

 * [http://www.cs.tut.fi/~jkorpela/chars.html A tutorial on character code issues] (by J. Korpela)
 * [http://www.unicode.org/ Unicode Home Page]
 * [http://www.unicode.org/charts/ Code Charts By Script (Unicode 5.1)]
 * [http://proquest.safaribooksonline.com/9780596102425/dedication Fonts & Encodings] (O'Reilly book by Y. Haralambous, English ed.)
 * [http://www.w3.org/2003/entities/iso8879doc/overview.html ISO 8879 entities] (from W3C)
 * [http://www.tei-c.org The Text Encoding Initiative (TEI)]
 * [http://www.tlg.uci.edu/BetaCode.html Beta Code]

=== Additional resources ===

Some material about [wiki:GreekLigatures Greek Ligatures].

Our [wiki:BookRecommendations book recommendations].

=== Completed specs ===

See [wiki:DataEntrySpecs here].

== Document Schema WG ==

Malcolm's [wiki:SchemaHighLevelRequirements Schema high level requirements] from the large team meeting on 2008-09-18.

Two documents that will serve as starting points for the Document Schema can be found [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/WikiStart/echo_V1.xml here] and [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/WikiStart/ECHO00001A2B3CX_V2.xml here].

=== References ===

 * Relax NG
   * [http://proquest.safaribooksonline.com/0596004214/relax-PREFACE-2 Relax NG] (O'Reilly book by E. van der Vlist)
   * [http://www.thaiopensource.com/relaxng/trang.html trang] (open source schema converter written in Java)
   * [http://relaxng.org/compact-tutorial-20030326.html RELAX NG Compact Syntax Tutorial]
   * a [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/WikiStart/schema.tar.gz zipped tarball] with schemas for some well-known document types
   * [http://enlil.museum.upenn.edu/cdl/doc/XDF XDF]: XML Documentation Format (literate programming with Relax NG)
 * [http://dublincore.org/ Dublin Core (Metadata)]
 * [http://www.loc.gov/standards/iso639-2/ ISO 639-2] (codes for natural languages)
 * [http://www.w3.org/TR/NOTE-datetime ISO 8601] (date and time formats; brief reference from W3C)

=== Tools ===

 * [http://xmlstar.sourceforge.net/ XMLStarlet] Command Line XML Toolkit
 * [http://www.xmlsoft.org/ libxml2] contains ''xmllint'', a command line tool for validating
 * [http://www.id.cbs.dk/~dh/corpus/tools/MXTERMINATOR.html MXTerminator] A tool for sentence boundary detection

== Documentation about trac ==

 * WikiFormatting -- detailed description of available Wiki formatting commands
 * TracGuide -- Built-in Documentation
 * [http://trac.edgewall.org/ The Trac project] -- Trac Open Source Project
 * [http://trac.edgewall.org/wiki/TracFaq Trac FAQ] -- Frequently Asked Questions
 * TracSupport -- Trac Support

For a complete list of local wiki pages, see TitleIndex.