
Version 3 (modified by jwillenborg, 13 years ago)

--

Schema support

The MPDL document storage and query system supports the Archimedes and Echo document schemas. A subset of TEI Lite will be supported in the near future.
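As a rough illustration of how the two supported schemas can be told apart, the following Python sketch inspects the root element of a document. It relies only on what the examples on this page show: an Archimedes document has the root element "archimedes" without a namespace, and an Echo document has the root element "echo" in the Echo namespace. The function name `detect_schema` is hypothetical and not part of the MPDL system.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the Echo example on this page.
ECHO_NS = "http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/"


def detect_schema(xml_text):
    """Return 'archimedes', 'echo' or 'unknown' based on the root element.

    Hypothetical helper for illustration only; not part of the MPDL system.
    """
    root = ET.fromstring(xml_text)
    if root.tag == "archimedes":
        return "archimedes"
    # ElementTree writes namespaced tags as '{namespace-uri}local-name'.
    if root.tag == f"{{{ECHO_NS}}}echo":
        return "echo"
    return "unknown"


archimedes_doc = "<archimedes><info/><text/></archimedes>"
echo_doc = f'<echo xmlns="{ECHO_NS}"><metadata/><text/></echo>'

print(detect_schema(archimedes_doc))
print(detect_schema(echo_doc))
```

A real check would validate the document against the schema itself; looking at the root element is only a quick first test.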

Archimedes

Schema

The Archimedes schema was developed by the Archimedes project and can be found here.

Example

In the following simple example document, the metadata part consists of 4 elements („author“, „title“, „lang“, „date“) and the text part consists of 2 pages („pb“), each with 2 paragraphs („p“), each of which contains 3 sentences („s“).

<?xml version="1.0" encoding="UTF-8"?>
<archimedes xmlns:xlink="http://www.w3.org/1999/xlink">
  <info>
    <author>Your last name, your first name</author>
    <title>Your title</title>
    <lang>en</lang>
    <date>1789</date>
  </info>
  <text>
    <pb xlink:href="0001.jpg"/>
    <p>
      <s>This is the first sentence of the first paragraph.</s>
      <s>This is the second sentence of the first paragraph.</s>
      <s>This is the third sentence of the first paragraph.</s>
    </p>
    <p>
      <s>This is the first sentence of the second paragraph.</s>
      <s>This is the second sentence of the second paragraph.</s>
      <s>This is the third sentence of the second paragraph.</s>
    </p>
    <pb xlink:href="0002.jpg"/>
    <p>
      <s>This is the first sentence of the first paragraph with line <lb/>break.</s><lb/>
      <s>This is the second sentence of the first paragraph with line <lb/>break.</s><lb/>
      <s>This is the third sentence of the first paragraph with line <lb/>break.</s><lb/>
    </p>
    <p>
      <s>This is the first sentence of the second paragraph.</s>
      <s>This is the second sentence of the second paragraph.</s>
      <s>This is the third sentence of the second paragraph.</s>
    </p>
  </text>
</archimedes>
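The structure described above can be checked with a short Python sketch that reads the metadata and groups paragraphs by page. It uses a shortened variant of the example and assumes that a page runs from one „pb“ element to the next, which the schema may define differently. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Shortened variant of the example document above.
SAMPLE = """<archimedes xmlns:xlink="http://www.w3.org/1999/xlink">
  <info>
    <author>Your name</author>
    <title>Your title</title>
    <lang>en</lang>
    <date>1789</date>
  </info>
  <text>
    <pb xlink:href="0001.jpg"/>
    <p><s>First sentence.</s></p>
    <p><s>Second sentence.</s></p>
    <pb xlink:href="0002.jpg"/>
    <p><s>Sentence with line <lb/>break.</s><lb/></p>
  </text>
</archimedes>"""


def summarize(xml_text):
    """Return (metadata dict, list of pages) for an Archimedes document."""
    root = ET.fromstring(xml_text)
    info = {child.tag: child.text for child in root.find("info")}
    pages = []  # one entry per <pb>: image reference plus its paragraphs
    for node in root.find("text"):
        if node.tag == "pb":
            # Assumption: each <pb> starts a new page.
            image = node.get(f"{{{XLINK_NS}}}href")
            pages.append({"image": image, "paragraphs": []})
        elif node.tag == "p" and pages:
            # itertext() joins the text on both sides of an inline <lb/>.
            sentences = ["".join(s.itertext()) for s in node.findall("s")]
            pages[-1]["paragraphs"].append(sentences)
    return info, pages


info, pages = summarize(SAMPLE)
print(info)
for page in pages:
    print(page["image"], len(page["paragraphs"]), "paragraph(s)")
```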

Echo

Schema

The MPDL Echo schema is developed by the schema group of this project and can be found here.

Elements

An Echo document (element „echo“ in the namespace „echo“) consists of a metadata part (element „metadata“), which contains the Dublin Core metadata of the document, and a fulltext part (element „text“), which contains the content of the document.

Dublin Core metadata elements (namespace dcterms):

  • identifier
  • creator
  • title
  • date
  • rights
  • license
  • accessRights

Fulltext elements:

  • text flow elements: head, div (type, level, style), p (style), lb, cb, gap (extent)
  • text structure elements: s
  • figure elements: figure, image (file), caption (style), description (style), variables (style), handwritten (xlink:href)
  • note elements: note (xlink:label), anchor (type, xlink:href)
  • quotation elements: q, quote, blockquote, set-off
  • translation elements: foreign (lang, xml:lang), reg (orig)
  • mathematical elements: var (type), num, mml:*
  • geographical elements: place, event, time
  • person elements: person
  • xhtml elements: xhtml:* (e.g. table, ul)
  • other elements: expan, emph (class), ref
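For tooling, the fulltext element list above could be transcribed into a lookup table, sketched here in Python. The table only mirrors this page: the wildcard entries (mml:*, xhtml:*) are left out, and the list may not be exhaustive, since „pb“ is used in the example below but is not listed. The schema itself remains the authoritative source. The names `ECHO_FULLTEXT_ATTRIBUTES` and `unlisted_attributes` are hypothetical.

```python
# Attribute names per element, transcribed from the list on this page.
# Elements listed without attributes map to an empty set.
ECHO_FULLTEXT_ATTRIBUTES = {
    # text flow elements
    "head": set(), "div": {"type", "level", "style"}, "p": {"style"},
    "lb": set(), "cb": set(), "gap": {"extent"},
    # text structure elements
    "s": set(),
    # figure elements
    "figure": set(), "image": {"file"}, "caption": {"style"},
    "description": {"style"}, "variables": {"style"},
    "handwritten": {"xlink:href"},
    # note elements
    "note": {"xlink:label"}, "anchor": {"type", "xlink:href"},
    # quotation elements
    "q": set(), "quote": set(), "blockquote": set(), "set-off": set(),
    # translation elements
    "foreign": {"lang", "xml:lang"}, "reg": {"orig"},
    # mathematical elements (mml:* omitted)
    "var": {"type"}, "num": set(),
    # geographical and person elements
    "place": set(), "event": set(), "time": set(), "person": set(),
    # other elements
    "expan": set(), "emph": {"class"}, "ref": set(),
}


def unlisted_attributes(element, attributes):
    """Return, sorted, the attributes that this page does not list for element."""
    if element not in ECHO_FULLTEXT_ATTRIBUTES:
        raise KeyError(f"element not in the list on this page: {element}")
    return sorted(set(attributes) - ECHO_FULLTEXT_ATTRIBUTES[element])


print(unlisted_attributes("div", ["type", "level"]))
print(unlisted_attributes("p", ["style", "id"]))
```

An attribute reported here is not necessarily invalid; it is only absent from the list on this page.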

Example

In the following simple example document, the metadata part consists of 4 Dublin Core elements („creator“, „title“, „language“, „date“) and the text part consists of 2 pages („pb“), each with 2 paragraphs („p“), each of which contains 3 sentences („s“).

<?xml version="1.0" encoding="UTF-8"?>
<echo xmlns="http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/" xmlns:dcterms="http://purl.org/dc/terms/">
  <metadata>
    <dcterms:creator>Your last name, your first name</dcterms:creator>
    <dcterms:title>Your title</dcterms:title>
    <dcterms:language>en</dcterms:language>
    <dcterms:date>1789</dcterms:date>
  </metadata>
  <text>
    <pb file="0001"/>
    <p>
      <s>This is the first sentence of the first paragraph.</s>
      <s>This is the second sentence of the first paragraph.</s>
      <s>This is the third sentence of the first paragraph.</s>
    </p>
    <p>
      <s>This is the first sentence of the second paragraph.</s>
      <s>This is the second sentence of the second paragraph.</s>
      <s>This is the third sentence of the second paragraph.</s>
    </p>
    <pb file="0002"/>
    <p>
      <s>This is the first sentence of the first paragraph with line <lb/>break.</s><lb/>
      <s>This is the second sentence of the first paragraph with line <lb/>break.</s><lb/>
      <s>This is the third sentence of the first paragraph with line <lb/>break.</s><lb/>
    </p>
    <p>
      <s>This is the first sentence of the second paragraph.</s>
      <s>This is the second sentence of the second paragraph.</s>
      <s>This is the third sentence of the second paragraph.</s>
    </p>
  </text>
</echo>
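Because all Echo elements live in a default namespace, reading such a document takes slightly more care than in the Archimedes case. The following Python sketch, using a shortened variant of the example, reads the Dublin Core metadata and counts the sentences per page. It matches metadata elements by local name, so it does not depend on the exact dcterms namespace URI, and it assumes that a page runs from one „pb“ element to the next. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Echo namespace URI as it appears in the example above.
ECHO_NS = "http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/"
NS = {"echo": ECHO_NS}

# Shortened variant of the example document above.
SAMPLE = """<echo xmlns="http://www.mpiwg-berlin.mpg.de/ns/echo/1.0/"
      xmlns:dcterms="http://purl.org/dc/terms/">
  <metadata>
    <dcterms:creator>Your name</dcterms:creator>
    <dcterms:title>Your title</dcterms:title>
    <dcterms:language>en</dcterms:language>
    <dcterms:date>1789</dcterms:date>
  </metadata>
  <text>
    <pb file="0001"/>
    <p><s>First sentence.</s><s>Second sentence.</s></p>
    <pb file="0002"/>
    <p><s>Sentence with line <lb/>break.</s><lb/></p>
  </text>
</echo>"""


def local_name(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def read_echo(xml_text):
    """Return (metadata dict, sentence count per page) for an Echo document."""
    root = ET.fromstring(xml_text)
    metadata = {local_name(el.tag): el.text
                for el in root.find("echo:metadata", NS)}
    sentences_per_page = {}
    current = None
    for node in root.find("echo:text", NS):
        name = local_name(node.tag)
        if name == "pb":
            # Assumption: each <pb> starts a new page, keyed by its file attribute.
            current = node.get("file")
            sentences_per_page[current] = 0
        elif name == "p" and current is not None:
            sentences_per_page[current] += len(node.findall("echo:s", NS))
    return metadata, sentences_per_page


metadata, sentences_per_page = read_echo(SAMPLE)
print(metadata)
print(sentences_per_page)
```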

TEI Lite

A subset of TEI Lite will be supported in the near future.

Schema

The TEI Lite schema is developed by the Text Encoding Initiative and can be found here.