Version 2 (modified by Klaus Thoden, 12 years ago)

--

Typing conventions

Setting up the document

For resources held at the MPIWG (those you can view in the ECHO viewing environment), a web application can create a template XML file that contains the metadata and a body consisting only of page breaks, one for each image in the resource. This lets you begin the transcription right away.

The web application is available on the toolbox server. You need to know the server path of the resource. Please refer to the online documentation in the toolbox (the red icons with the white question mark).

Connection to the ECHO Schema and Autocompletion

All documents that are to be displayed in the ECHO document viewer have to adhere to a standard. This standard is defined by the ECHO Schema for XML texts. It ensures that all elements in a document can be processed and displayed correctly.

To make it easier to produce a valid document, XML editors can help you enter the appropriate tags in the right places. The XML file contains a link to the schema, so the editor can check the document against it, tell you when you have made a mistake, and suggest the right tags.

Usage in XeMeL

We built a stripped-down version of the Eclipse IDE, called XeMeL. It is free, platform-independent software for working with XML documents. You can find a version of it on the institute's software server.

There is a version of the ECHO XML Schema in XSD format which can be used to get autocompletion when working with the XML editor inside XeMeL. This helps keep the structure of the document valid while you type.

To enable it, the attribute

xsi:schemaLocation="http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/ harriot_xsd/echo.xsd "

has to be added to the echo element of the XML file, together with the xmlns:xsi namespace declaration, e.g.

<echo
    xmlns="http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/"
    xmlns:de="http://www.mpiwg-berlin.mpg.de/ns/de/1.0/"
    xmlns:dcterms="http://purl.org/dc/terms"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xml="http://www.w3.org/XML/1998/namespace"
    xsi:schemaLocation="http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/ harriot_xsd/echo.xsd "
    version="1.0RC">

Usage in Emacs

The nxml-mode in Emacs supports validation against the RELAX NG version of the schema. You can download an archive of the stable version and extract it. To make Emacs use the schema, switch to the buffer containing the XML document, type C-c C-s C-f (rng-set-schema-file-and-validate) and enter the path to the echo.rnc file. You will be asked whether this association should be saved to a file. If you answer yes, a file called schemas.xml is created that records the link between the XML document and the schema.

Please see the documentation of the nxml-mode for additional information, e.g. keyboard shortcuts.
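For orientation, a schemas.xml file written by nxml-mode typically looks like the following sketch. The file name mydocument.xml and the relative path to echo.rnc are placeholders chosen for illustration; your file will contain the names you actually used.

```xml
<?xml version="1.0"?>
<!-- Locating rules file read by nxml-mode from the document's directory. -->
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- "mydocument.xml" and "schema/echo.rnc" are hypothetical names. -->
  <uri resource="mydocument.xml" uri="schema/echo.rnc"/>
</locatingRules>
```

When this file sits next to the XML document, Emacs picks up the schema automatically the next time the document is opened.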

Some basic rules for XML

There are some rules which have to be observed when working with XML texts. A good collection can be found on the web pages of w3schools.
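As a brief illustration of the most important well-formedness rules, here is a minimal sketch. The element and attribute names (root, item, empty, id, note) are made up for this example and are not part of the ECHO Schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Exactly one root element encloses everything else. -->
<root>
  <!-- Every opening tag needs a matching closing tag; names are case-sensitive. -->
  <item id="1">Elements must be properly nested.</item>
  <!-- An empty element is closed with a trailing slash. -->
  <empty/>
  <!-- Attribute values are always enclosed in quotes. -->
  <item id="2" note="quoted value">Text</item>
</root>
```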

Paragraphs and semantic units

Text cannot be entered directly between page breaks; it has to be wrapped in additional elements.

A headline is enclosed in <head> tags. The main text goes into a paragraph (<p>), and inside the paragraph it has to be placed in a semantic unit (<s>).
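The nesting described above might look like the following fragment. The text content is invented, and the fragment is assumed to sit inside the body of an ECHO document where the default ECHO namespace is already declared.

```xml
<head>On the motion of bodies</head>
<p>
  <s>This is the first sentence of the paragraph.</s>
  <s>This is the second sentence of the same paragraph.</s>
</p>
```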

Line breaks

Type a <lb/> at the end of each line. Be careful to include a space between the last word and the tag. If the last word is hyphenated across the line break, however, do not type a space.

Strictly speaking, line breaks are optional. If they are missing, lines are broken according to the width of your browser window.
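A short sketch of both cases follows. The sentence is invented, and it is assumed here that the hyphen is transcribed as it appears in the source.

```xml
<p>
  <s>The first line ends with a complete word <lb/>
and the second line ends with a hyphen-<lb/>
ated word.</s>
</p>
```

Note the space before the first <lb/> and its absence before the second one.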

XML entities

Some characters cannot be typed directly because they have a special meaning in XML. They have to be escaped like this:

Character Entity
& &amp;
< &lt;
> &gt;
" &quot;
' &apos;
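For example, a sentence containing an ampersand and a less-than sign would be entered as follows (the sentence itself is invented for illustration):

```xml
<p>
  <s>Smith &amp; Sons state that 3 &lt; 5.</s>
</p>
```

The viewer then displays the text as: Smith & Sons state that 3 < 5.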

Language change

Paragraphs and semantic units can be marked as being in another language by giving them an attribute such as xml:lang="lat". Please use the three-letter code of the appropriate language. This ensures that the right dictionary is used in the display environment.

By using the <foreign> tag, you can also change the language of single words.
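The following fragment sketches both uses. The sample text and the code "eng" are illustrative, and it is an assumption here that <foreign> takes the same xml:lang attribute as paragraphs and semantic units; check the schema if your editor reports an error.

```xml
<p xml:lang="lat">
  <s>Cogito, ergo sum.</s>
</p>
<p xml:lang="eng">
  <!-- Assumption: foreign carries xml:lang like p and s do. -->
  <s>He described the mind as a <foreign xml:lang="lat">tabula rasa</foreign>.</s>
</p>
```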

Emphasis styles

style=".." Meaning
it Italics
bf Bold
bf it Bold italics
sc Small caps
sc it Small caps italics
sub Subscript
super Superscript
red Red
fr Fraktur
rom Antiqua
sp Letter-spacing (German: Sperrung)
ol Overline
ul Underline
st Struck through
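The table does not say which element carries the style attribute. The sketch below assumes an element named emph, which is not confirmed by this page; consult the schema or your editor's autocompletion for the actual element name.

```xml
<p>
  <!-- Assumption: "emph" is a placeholder for the element that takes style. -->
  <s>A word in <emph style="it">italics</emph>, one in
  <emph style="bf it">bold italics</emph> and one
  <emph style="st">struck through</emph>.</s>
</p>
```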

Generating a table of contents with divisions

The text should be subdivided not only by headings, but also by <div> tags. The div structure, along with the headings inside it, is used to generate a table of contents in the ECHO document viewer. Note that this only happens if the type attribute of the div is either "chapter" or "section".

Otherwise, you are free to use any name for the type of a division, e.g. "recipes".
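A possible division structure is sketched below. The headings are invented, and nesting a "section" inside a "chapter" is an assumption about how the two types are meant to be combined.

```xml
<div type="chapter">
  <head>First chapter</head>
  <!-- Assumption: sections may be nested inside chapters. -->
  <div type="section">
    <head>First section</head>
    <p>
      <s>Text of the first section.</s>
    </p>
  </div>
  <!-- A division with a free type name; it does not appear in the table of contents. -->
  <div type="recipes">
    <head>A recipe</head>
    <p>
      <s>Text of the recipe.</s>
    </p>
  </div>
</div>
```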