{{{ #!html

DE Specs Working Group Meeting

Klaus Thoden

08 September 2008

1 Legacy Specs

Two data entry specifications were distributed, one for texts in the Latin alphabet and one for Chinese. For both writing systems, ECHO pages served as examples.

After a few remarks on a Chinese text in relation to the data entry specifications, a text in the Latin alphabet was discussed with regard to its idiosyncrasies, e.g. ligatures. A diplomatic transcription of the characters is planned; however, most ligatures are to be resolved, except for the most common ones (e.g. æ and œ).
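The ligature rule above could be sketched roughly as follows. This is only an illustration: the table of ligatures to resolve and the function name `resolve_ligatures` are assumptions, not a list agreed on in the meeting; only æ and œ are named in the minutes as ligatures to keep.

```python
# Hypothetical table of typographic ligatures to resolve into plain letters.
# Which ligatures actually get resolved is an assumption for this sketch.
RESOLVE = {
    "\ufb00": "ff",   # LATIN SMALL LIGATURE FF
    "\ufb01": "fi",   # LATIN SMALL LIGATURE FI
    "\ufb02": "fl",   # LATIN SMALL LIGATURE FL
    "\ufb03": "ffi",  # LATIN SMALL LIGATURE FFI
    "\ufb04": "ffl",  # LATIN SMALL LIGATURE FFL
}

# Common ligatures that are kept as typed (ae and oe, lower and upper case).
KEEP = {"\u00e6", "\u0153", "\u00c6", "\u0152"}


def resolve_ligatures(text: str) -> str:
    """Replace ligatures listed in RESOLVE; leave KEEP and all other characters alone."""
    # KEEP is checked explicitly to make the exception visible,
    # even though these characters are not in RESOLVE anyway.
    return "".join(ch if ch in KEEP else RESOLVE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(resolve_ligatures("\ufb01lius c\u00e6li"))
```

Keeping the exceptions in a separate set makes it easy to adjust the list once the specification settles which ligatures count as "most common".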

2 Things to deal with

2.1 Character encoding

Characters should preferably be typed directly as Unicode characters rather than encoded in XML markup, i.e. as entities.

Unknown characters are to be numbered by the digitizers. All instances of the same unknown character receive the same number, so that they can be resolved together later.
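Consistent numbering means that one decision resolves every occurrence of a character. A minimal sketch, assuming a placeholder notation of the form `{unknown:N}` (the notation, the function name `resolve_unknown` and the example mapping are all hypothetical; the minutes do not define how the numbers are written):

```python
import re

# Hypothetical placeholder notation: {unknown:N}, where N is the digitizer's number.
PLACEHOLDER = re.compile(r"\{unknown:(\d+)\}")


def resolve_unknown(text: str, resolutions: dict[int, str]) -> str:
    """Replace numbered placeholders by their resolved characters.

    Placeholders whose number is not yet in `resolutions` are left untouched,
    so the text can be resolved step by step.
    """
    def replace(match: re.Match) -> str:
        return resolutions.get(int(match.group(1)), match.group(0))

    return PLACEHOLDER.sub(replace, text)


if __name__ == "__main__":
    # Example mapping: character number 1 has been identified as U+A751.
    print(resolve_unknown("a{unknown:1}b{unknown:2}c{unknown:1}", {1: "\ua751"}))
```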

2.2 Conversion to XML

The conversion to XML will be based on XML schemata rather than DTDs. RELAX NG is the proposed language for writing these schemata.

3 Organisation

The Trac pages were introduced (see footnote 1). They provide a timeline, a roadmap, version control, a source viewer and a wiki. An introduction to using Trac will follow soon.

4 Next steps

1 https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content

2 Links to these documents are on the first page of the wiki. Firefox is the recommended browser here.

}}}