Changes between Version 2 and Version 3 of despecs


Timestamp:
Jun 15, 2010, 9:50:08 AM
Author:
Wolfgang Schmidle
Comment:

--

  • despecs

    v2 v3  
    33[[PageOutline(2-4,,pullout)]]
    44
    5 This page describes the ''Data Entry Specifications'' (DESpecs). (TO DO: transfer the Usage Guide from LaTeX)
     5This page describes the ''Data Entry Specifications'' (DESpecs). (TO DO: description and rationale)
    66
    77The current versions of the DESpecs are available [http://pythia.mpiwg-berlin.mpg.de/department1/mpdl/despecs here].
     
    1010== The DESpecs and the ECHO schema ==
    1111
    12 The DESpecs simply describe a standard for marking up the text. The goal is to mark up structural features of the text with reasonably simple rules that balance cost against benefit. In addition to the DESpecs, we have created an XML format ("ECHO") that the transcriptions should conform to in the end, as well as a workflow for transforming the transcriptions from the DESpecs format into the ECHO format.
     12The DESpecs simply describe a standard for marking up the text. The goal is to mark up structural features of the text with reasonably simple rules that balance cost against benefit. In addition to the DESpecs, we have created an XML format ("[wiki:echo-schema ECHO]") that the transcriptions should conform to in the end, as well as a workflow for transforming the transcriptions from the DESpecs format into the ECHO format.
    1313
    1414We make a clear conceptual distinction between the DESpecs and the ECHO format. In particular, the ECHO format is not simply a well-formed version of the DESpecs format. The DESpecs format just happens to resemble XML. We would not gain much if we asked Formax to send us well-formed XML, since we have to post-process the text anyway. For example, it makes little sense to have them type a character variant such as <獘V> as well-formed XML, e.g. <V>獘</V>, let alone as the full <reg norm="獘" type="simple"><image xlink:href="symbols/chinese/⿱敝大.svg"/></reg> in the ECHO format. In this example, we might want to change "simple" to e.g. "ids-list" without having to change the DESpecs.
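The character-variant example can be sketched as a small post-processing step. This is only an illustration, not the project's actual script: the lookup table `IDS_BY_CHARACTER`, the function name `variant_to_echo`, and the assumption that a variant is always written as a single character followed by `V` are hypothetical. Only the one mapping and the output shape are taken from the example above.

```python
import re

# Hypothetical lookup table: variant character -> description used in the
# image file name. Only the single entry from the example is included.
IDS_BY_CHARACTER = {"獘": "⿱敝大"}

# Assumed DESpecs notation: one character followed by "V" in angle brackets.
VARIANT_PATTERN = re.compile(r"<(.)V>")


def variant_to_echo(text, reg_type="simple"):
    """Replace <獘V>-style variants with an ECHO-style <reg> element."""

    def replace(match):
        char = match.group(1)
        ids = IDS_BY_CHARACTER.get(char)
        if ids is None:
            # Unknown variant: leave it untouched for manual review.
            return match.group(0)
        return (
            f'<reg norm="{char}" type="{reg_type}">'
            f'<image xlink:href="symbols/chinese/{ids}.svg"/></reg>'
        )

    return VARIANT_PATTERN.sub(replace, text)


print(variant_to_echo("<獘V>"))
print(variant_to_echo("<獘V>", reg_type="ids-list"))
```

Because the `type` value is a parameter of the conversion step, switching from "simple" to "ids-list" changes only the generated ECHO markup; the transcription itself stays as typed.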
    1515
    16 The workflow consists of a series of scripts and is designed to require as little human intervention as possible (see https://it-dev.mpiwg-berlin.mpg.de/tracs/mpdl-project-content/wiki/workflow, in German). Turning the transcription into well-formed XML is only a small and relatively straightforward part of this workflow: resolve some reserved characters, make attributes well-formed, add "/" in empty elements, change the names of some elements, etc. (Some parts are trickier, for instance the example above or conflicting XML hierarchies of e.g. paragraphs and columns of text.) It is true that some mistakes in the XML markup, copy/paste mistakes in particular, could be avoided if the transcriptions were required to be well-formed XML, but mistakes of this kind are relatively rare and easy to spot.
     16The workflow consists of a series of scripts and is designed to require as little human intervention as possible (see [wiki:workflow], in German). Turning the transcription into well-formed XML is only a small and relatively straightforward part of this workflow: resolve some reserved characters, make attributes well-formed, add "/" in empty elements, change the names of some elements, etc. (Some parts are trickier, for instance the example above or conflicting XML hierarchies of e.g. paragraphs and columns of text.) It is true that some mistakes in the XML markup, copy/paste mistakes in particular, could be avoided if the transcriptions were required to be well-formed XML, but mistakes of this kind are relatively rare and easy to spot.
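The straightforward steps listed above might look roughly like the following. This is a minimal sketch under stated assumptions, not the workflow's real code: the element names in `EMPTY_ELEMENTS` and `RENAMED_ELEMENTS` are hypothetical placeholders, only the ampersand is treated as a reserved character, and the tricky cases (character variants, conflicting hierarchies) are deliberately left out.

```python
import re

# Hypothetical placeholders, not taken from the DESpecs or the ECHO schema.
EMPTY_ELEMENTS = {"pb"}            # elements assumed to have no content
RENAMED_ELEMENTS = {"h": "head"}   # assumed DESpecs name -> ECHO name

# A tag whose name starts with an ASCII letter; other "<...>" is left alone.
TAG = re.compile(r"<(/?)([A-Za-z][\w-]*)([^<>]*)>")
# An attribute with a quoted or an unquoted value.
ATTR = re.compile(r'''([\w:-]+)=("[^"]*"|'[^']*'|[^\s"'>/]+)''')
# An ampersand that does not already start an entity or character reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z]+|#\d+|#x[0-9A-Fa-f]+);)")


def quote_attr(match):
    name, value = match.groups()
    if value[0] in "\"'":
        return match.group(0)          # already quoted
    return f'{name}="{value}"'


def fix_tag(match):
    closing, name, rest = match.groups()
    name = RENAMED_ELEMENTS.get(name, name)      # change element names
    rest = ATTR.sub(quote_attr, rest)            # make attributes well-formed
    if not closing and name in EMPTY_ELEMENTS and not rest.rstrip().endswith("/"):
        rest = rest.rstrip() + "/"               # add "/" in empty elements
    return f"<{closing}{name}{rest}>"


def to_well_formed(text):
    # Resolve reserved characters first, then repair each tag.
    return TAG.sub(fix_tag, BARE_AMP.sub("&amp;", text))


print(to_well_formed("<h>A & B</h><pb n=3>"))
# prints: <head>A &amp; B</head><pb n="3"/>
```

Each step is a plain text substitution, which fits the point that this part of the workflow needs little human intervention. The sketch does not check that the result is actually well-formed; a real pipeline would presumably run an XML parser over the output as a final check.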