MPIWG-MPDL Content Project
This is the wiki for the XML Workflow Service subproject within the cooperative project between the MPIWG and MPDL. The other subproject is the Software Development Project.
At present, two working groups are active, "Data Entry Specs WG" and "Document Schema WG". For protocols, check the ProtocolIndex.
1. Data Entry Specs WG
Existing data entry specs
Some preliminary notes on character issues can be found on the CharacterIssues page. Older data entry (DE) specs are collected under LegacySpecs. Raw data entry versions of the Archimedes texts can be accessed through the WebCVS interface.
Malcolm's high-level requirements from the large team meeting on 2008-09-11.
Sample texts to examine
Some sample texts from the ECHO collection. For problems with and requests for ECHO, see here.
Provisional list of books to be digitized. First evaluation of the work sample.
References on encoding
- A tutorial on character code issues (by J. Korpela)
- Unicode Home Page
- Fonts & Encodings (O'Reilly book by Y. Haralambous, English ed.)
- ISO 8879 entities (from W3C)
- The Text Encoding Initiative (TEI)
- Beta Code
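Since Beta Code comes up in the data entry discussion for Greek texts, the following minimal sketch shows how a small subset of it maps to Unicode. It is an illustration only, not the project's conversion tool: the function name `beta_to_unicode` and the tables are hypothetical, only the basic letters, the `*` capital marker, and the common diacritics are handled, and the final-sigma rule is a simple heuristic.

```python
import re
import unicodedata

# Basic Beta Code letters (case-insensitive) mapped to lowercase Greek.
LETTERS = dict(zip("abgdezhqiklmncoprstufxyw", "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code diacritic signs mapped to Unicode combining characters.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}


def beta_to_unicode(text):
    """Convert a subset of Beta Code to precomposed (NFC) Unicode Greek."""
    out = []
    pending = None  # diacritics collected after '*', which precede the capital
    for ch in text:
        low = ch.lower()
        if ch == "*":
            pending = []
        elif ch in DIACRITICS:
            if pending is not None:
                pending.append(DIACRITICS[ch])
            else:
                out.append(DIACRITICS[ch])
        elif low in LETTERS:
            if pending is not None:
                out.append(LETTERS[low].upper() + "".join(pending))
                pending = None
            else:
                out.append(LETTERS[low])
        else:
            out.append(ch)  # pass through spaces, punctuation, digits
    result = unicodedata.normalize("NFC", "".join(out))
    # Heuristic: a sigma not followed by a word character is word-final.
    return re.sub(r"σ(?!\w)", "ς", result)


if __name__ == "__main__":
    print(beta_to_unicode("*)arximh/dhs"))
```

Normalizing to NFC keeps the output in precomposed form, which is usually easier to compare and search than sequences of base letters and combining marks.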
Additional resources
Some material about Greek ligatures. Our book recommendations.
On abbreviations: Lexicon abbreviaturarum by Adriano Cappelli
Completed specs
See here.
2. Document Schema WG
Malcolm's high-level schema requirements from the large team meeting on 2008-09-18.
Two documents that will serve as starting points for the Document Schema can be found here and here.
References
- Relax NG
- Relax NG (O'Reilly book by E. van der Vlist)
- The GFDL release of this book, along with updates (html)
- trang (open source schema converter written in Java)
- RELAX NG Compact Syntax Tutorial
- a zipped tarball with schemas for some well-known document types
- XDF : XML Documentation Format (literate programming with Relax NG)
- Dublin Core (Metadata)
- ISO 639-2 (codes for natural languages)
- ISO 8601 (date and time formats; brief reference from W3C)
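To illustrate how the metadata standards listed above might be applied together, here is a minimal sketch that checks a Dublin Core style record for an ISO 639-2 language code and an ISO 8601 calendar date. The record layout, the function name `check_metadata`, and the small language table are assumptions made for this example; they are not part of the project's schema, and the table covers only a handful of the ISO 639-2 codes.

```python
from datetime import date

# Illustrative subset of ISO 639-2 codes (assumption: a real check would
# load the complete code list).
KNOWN_LANGUAGES = {
    "lat": "Latin",
    "grc": "Ancient Greek",
    "ita": "Italian",
    "deu": "German",
    "fra": "French",
    "eng": "English",
}


def check_metadata(record):
    """Return a list of problems found in a dict of Dublin Core style fields."""
    problems = []

    lang = record.get("language")
    if lang not in KNOWN_LANGUAGES:
        problems.append(f"unknown language code: {lang!r}")

    raw_date = str(record.get("date", ""))
    try:
        # Accepts ISO 8601 calendar dates such as 2008-09-11.
        date.fromisoformat(raw_date)
    except ValueError:
        problems.append(f"date is not ISO 8601 (YYYY-MM-DD): {raw_date!r}")

    return problems


if __name__ == "__main__":
    record = {"title": "Sample text", "language": "lat", "date": "2008-09-11"}
    print(check_metadata(record))
```

Returning a list of problems rather than raising on the first one lets a data entry reviewer see every issue in a record at once.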
Tools
- XMLStarlet Command Line XML Toolkit
- libxml2, which includes xmllint, a command-line tool for parsing and validating XML documents
- MXTerminator, a tool for sentence boundary detection
Documentation about trac
- WikiFormatting -- detailed description of available Wiki formatting commands
- TracGuide -- Built-in Documentation
- The Trac project -- Trac Open Source Project
- Trac FAQ -- Frequently Asked Questions
- TracSupport -- Trac Support
For a complete list of local wiki pages, see TitleIndex.
Attachments (7)
- MPDL_project_desc.pdf (210.8 KB): MPIWG proposal within the MPDL framework
- ECHO-DE-draft.oo3.zip (5.0 KB): DE draft guidelines (OmniOutliner file, zipped)
- transcr.pdf (129.8 KB): old Archimedes Project transcription workflow (B. Fuchs)
- archimedes.pen (7.6 KB): special entities used in Archimedes documents
- echo_V1.xml (6.0 KB)
- ECHO00001A2B3CX_V2.xml (387.1 KB)
- schema.tar.gz (53.1 KB): sample Relax NG schemas for well-known document types