Version 10 (modified by jwillenborg, 13 years ago)

See XML Linking Language (XLink) Version 1.0.

XLink can be used in any element. Example:

  • <p>The best German punk band is <div xlink:href="http://slime.de/">Slime</div>.</p>

XPointer

See XML Pointer Language (XPointer). XPointer can be used in any URI, especially in those provided by XLink.

Examples:

  • <p>This is discussed in <div xlink:href="example.xml#xpointer((//p)[1])">the first paragraph of the example document</div>.</p>
  • <p>This is discussed in <div xlink:href="example.xml#xpointer(id('4711')/div[1])">the first division of the example document</div>.</p>
  • <p>This is discussed in <div xlink:href="example.xml#element(/1/2)">the second element of the first element</div>.</p>
  • <p>Einstein wrote in his diary that he did not want any further delay of his voyage to South America (see <note xlink:href="http://mpdl.mpiwg-berlin.mpg.de/physics/einstein/diary.xml#xpointer(id('page53')/echo[1]/text[1]/body[1]/chap[1]/p[1]/s[2])">page 53, sentence 2</note>).</p>
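
The element() scheme used in the third example addresses a node purely by counting child elements. The following is a minimal sketch of how such a child sequence could be resolved, assuming Python's standard xml.etree.ElementTree; the function name resolve_child_sequence and the sample document are made up for illustration and are not part of the MPDL code.

```python
import xml.etree.ElementTree as ET


def resolve_child_sequence(root, pointer):
    """Resolve an element() child sequence such as '/1/2' against root.

    Each step is a 1-based index into the child elements of the current
    node; the first step selects the document element itself.
    Returns None if the sequence does not lead to an element.
    """
    steps = [int(step) for step in pointer.strip("/").split("/")]
    if steps[0] != 1:
        return None  # a well-formed document has exactly one document element
    node = root
    for index in steps[1:]:
        children = list(node)  # child elements in document order
        if index < 1 or index > len(children):
            return None
        node = children[index - 1]
    return node


# Hypothetical sample document, for illustration only.
SAMPLE = "<doc><p>first</p><p>second</p></doc>"
root = ET.fromstring(SAMPLE)
target = resolve_child_sequence(root, "/1/2")
print(target.tag, target.text)  # prints: p second
```

Here "/1/2" selects the second child element of the document element, which matches the wording "the second element of the first element" above.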

XPoints

  • Examples
    • point(1.0) is just inside the beginning of the p element.
    • point(1.2) is between the end of the em element and the following text node (which contains "world.").
    • point(.0) immediately precedes the root node.
    • point(1/2/1.1) is immediately after the "b" in the middle text node.

Range

  • xpointer(id("chap1")/range-to(id("chap2"))) (the range from the start point of the element with ID "chap1" to the end point of the element with ID "chap2")
  • string-range(title,"Thomas Pynchon")[17] (the 17th of those "Thomas Pynchon" strings appearing in a title element)
  • <p>See the <note xlink:href="http://mpdl.mpiwg-berlin.mpg.de/physics/einstein/diary.xml#xpointer(id('page53')/echo[1]/text[1]/body[1]/chap[1]/p[1]/s[2]/range(1.3, 1.10))">text passage on page 53, sentence 2, characters 3 to 10</note>.</p>

XLink will be supported in all elements of the Echo and TEI Lite schemas (in the near future).

Support of XPointer

The MPDL project places a special focus on the presentation of document pages. An important requirement for MPDL XPointers is therefore support for pointers relative to document pages. A further requirement is to point not only to elements on a page but also to text portions within elements (a point or a range). As with XPointer in general, these pointers can be used in any URI, especially in those provided by XLink. The MPDL project will support the following subset of XPointer (in the near future):

  • XPointer to a page in an XML document. The result of the XPointer URL is the whole page. Example:
    • <p>... (see <seg xlink:href="http://mpdl.mpiwg-berlin.mpg.de/music/ramones/ramones_2004.xml#xpointer(id('page6'))">Dick Porter: Ramones – The Complete Twisted History, London: Plexus, 2004. Page 6. ISBN 0859653269</seg>)</p>
    • If the document contains no pages (no <pb/> elements), the page number does not have to be specified (internally, all elements are on the first page).
  • XPointer to an element on a page in an XML document. The result of the XPointer URL is the page and the requested element is highlighted. Example:
    • <p>Joey Ramone said: "When we started up in March of ’74, it was because the bands we loved, the rock ’n’ roll that we knew, had disappeared. We were playing music for ourselves." (see <seg xlink:href="http://mpdl.mpiwg-berlin.mpg.de/music/ramones/ramones_2004.xml#xpointer(id('page6')/tei[1]/text[1]/body[1]/chap[1]/p[1])">page 6, paragraph 1</seg>).</p>
  • XPointer to a text portion of an element on a page in an XML document. The result of the XPointer URL is the page and the requested text portion of the element is highlighted. Example:
    • <p>The Ramones started up in March 1974 (see <seg xlink:href="http://mpdl.mpiwg-berlin.mpg.de/music/ramones/ramones_2004.xml#xpointer(id('page6')/tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1]/range(1.22, 1.34))">page 6, sentence 1, characters 22 to 34</seg>).</p>
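
How a range such as range(1.22, 1.34) might be turned into a highlighted text portion can be sketched as follows. This is not the MPDL implementation. The sketch assumes that each point has the form node.offset, that both points lie in the same text node, and that offsets count positions between characters starting at 0. The function names and the [[ ]] markers are hypothetical.

```python
import re

# Matches expressions of the form range(N.O, N.O), e.g. range(1.22, 1.34).
RANGE_RE = re.compile(r"range\(\s*(\d+)\.(\d+)\s*,\s*(\d+)\.(\d+)\s*\)")


def parse_range(expr):
    """Split a range expression into ((node, offset), (node, offset))."""
    match = RANGE_RE.fullmatch(expr.strip())
    if match is None:
        raise ValueError(f"not a range expression: {expr!r}")
    start_node, start_off, end_node, end_off = map(int, match.groups())
    return (start_node, start_off), (end_node, end_off)


def highlight(text, expr, open_mark="[[", close_mark="]]"):
    """Wrap the addressed characters of a single text node in markers.

    Assumption: offsets are 0-based positions between characters, so
    range(1.4, 1.9) covers text[4:9].
    """
    (start_node, start), (end_node, end) = parse_range(expr)
    if start_node != end_node:
        raise ValueError("this sketch only handles ranges within one text node")
    if not 0 <= start <= end <= len(text):
        raise ValueError("range lies outside the text")
    return text[:start] + open_mark + text[start:end] + close_mark + text[end:]


sentence = "The quick brown fox jumps over the lazy dog."
print(highlight(sentence, "range(1.4, 1.9)"))
# prints: The [[quick]] brown fox jumps over the lazy dog.
```

Whether the real offsets are 0-based or 1-based, and whether the end offset is inclusive, would have to be checked against the actual MPDL behaviour.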

External user annotations of documents are stored relative to document pages: XPointer page points/ranges are mapped to an internal identifier that combines the document identifier, the page number, an element xpath expression, and a point/range expression. Examples:

  • before the first sentence on page 6: /music/ramones/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], point(.0)
  • after the first sentence on page 6: /music/ramones/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], point(.1)
  • from character 22 to 34 in the first sentence of page 6: /music/ramones/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], range(1.22, 1.34)
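
The mapping from an XPointer URL to this internal identifier can be sketched as a small parser. It is only a sketch under assumptions: the fragment is expected in the balanced form xpointer(id('pageN')/path/range(...)), the document identifier is taken to be the URL path, and the names PageAnchor and to_anchor are hypothetical.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class PageAnchor(NamedTuple):
    document: str            # document identifier (here: the URL path)
    page: str                # page identifier, e.g. 'page6'
    element_path: str        # positional xpath below the page, may be ''
    location: Optional[str]  # point(...) or range(...), if present


FRAGMENT_RE = re.compile(
    r"xpointer\(id\('(?P<page>[^']+)'\)"          # page identifier
    r"(?P<path>(?:/[A-Za-z_][\w.-]*\[\d+\])*)"    # positional element steps
    r"(?:/(?P<loc>(?:point|range)\([^)]*\)))?"    # optional point or range
    r"\)"
)


def to_anchor(url):
    """Map an XPointer URL to (document, page, element path, location)."""
    parts = urlsplit(url)
    match = FRAGMENT_RE.fullmatch(parts.fragment)
    if match is None:
        raise ValueError(f"unsupported XPointer fragment: {parts.fragment!r}")
    return PageAnchor(parts.path, match["page"], match["path"], match["loc"])


anchor = to_anchor(
    "http://mpdl.mpiwg-berlin.mpg.de/music/ramones/ramones_2004.xml"
    "#xpointer(id('page6')/tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1]"
    "/range(1.22, 1.34))"
)
print(anchor.document)      # /music/ramones/ramones_2004.xml
print(anchor.page)          # page6
print(anchor.element_path)  # /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1]
print(anchor.location)      # range(1.22, 1.34)
```

A page-only pointer such as xpointer(id('page6')) yields an empty element path and no location.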

Discussion

  • document identifier: a persistent identifier of a document
  • page number
  • element identifier
    • xpath expression: /tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2]
      • Advantages
        • can be generated dynamically for the XML page
        • intuitive
        • fully compatible with XLink/XPointer
        • easy to retrieve
        • implementation is easy and consistent (through saxon:path)
      • Disadvantages
        • if elements are later inserted, updated or deleted in the XML document, many old XPointer links break
        • relatively long string
    • node id: 1.1.1.3.1.2
      • Advantages
        • can be generated dynamically for the XML page
        • intuitive
        • compatible with XLink/XPointer
        • easy to retrieve
      • Disadvantages
        • if elements are later inserted, updated or deleted in the XML document, many old XPointer links break
    • id attribute in XML document: <s id="47114711">...</s>
      • Advantages
        • if elements are later inserted, updated or deleted in the XML document, only old XPointer links to deleted elements break
        • compatible with XLink/XPointer
      • Disadvantages
        • the document has to be modified during the upload process: an id attribute has to be generated for each element in the XML document, which also makes the implementation more complex
        • if a user already uses id attributes in the document, but not consistently, these id attributes have to be replaced by new consistent ids (not easy to implement)
        • the document becomes larger because of all these id attribute values
        • the id is not intuitive
    • special id attribute in XML document: <s xmlNodeId="47114711">...</s>
      • Advantages
        • users can use their own id attributes in the document as they like
        • if elements are later inserted, updated or deleted in the XML document, only old XPointer links to deleted elements break
        • compatible with XLink/XPointer
      • Disadvantages
        • the document has to be modified during the upload process: an xmlNodeId attribute has to be generated for each element in the XML document, which also makes the implementation more complex
        • the document becomes larger because of all these id attribute values
        • the id is not intuitive
  • point/range expression
    • point(.0) or point(.1) or range(1.22, 1.34)
      • Advantages
        • pointers can point to a text portion
        • pointers can point to the place before or after an element
        • is already partly implemented
      • Disadvantages
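
The positional xpath expressions and dotted node ids discussed above can both be generated in a single traversal. The following is a minimal sketch that uses Python's xml.etree.ElementTree rather than saxon:path. It assumes that the node id counts positions among all sibling elements while the xpath index counts only siblings of the same name; the function name positional_paths and the sample document are made up for illustration. It also demonstrates the listed disadvantage: inserting an element shifts the identifiers of later siblings.

```python
import xml.etree.ElementTree as ET


def positional_paths(root):
    """Yield (element, xpath, node_id) for root and all its descendants."""
    def walk(elem, path, node_id):
        yield elem, path, node_id
        same_name_counts = {}
        for position, child in enumerate(elem, start=1):
            count = same_name_counts.get(child.tag, 0) + 1
            same_name_counts[child.tag] = count
            yield from walk(
                child,
                f"{path}/{child.tag}[{count}]",  # index among same-name siblings
                f"{node_id}.{position}",         # index among all sibling elements
            )

    yield from walk(root, f"/{root.tag}[1]", "1")


def sentence_ids(root):
    """Map each sentence text to its (xpath, node_id) pair."""
    return {e.text: (p, n) for e, p, n in positional_paths(root) if e.tag == "s"}


# Hypothetical sample document, for illustration only.
doc = ET.fromstring(
    "<tei><text><body><chap><p><s>One.</s><s>Two.</s></p></chap></body></text></tei>"
)
before = sentence_ids(doc)

# Insert a new sentence in front of the existing ones.
new_sentence = ET.Element("s")
new_sentence.text = "Zero."
doc.find("./text/body/chap/p").insert(0, new_sentence)
after = sentence_ids(doc)

print(before["Two."][0])  # /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[2]
print(after["Two."][0])   # /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[3]
```

A link stored as .../s[2] before the insertion now resolves to a different sentence, which is exactly the breakage that id attributes would avoid.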

Selection / Solution

  • document identifier
  • page number
  • element identifier (chosen from the two best candidate solutions)
    • xpath expression: /tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2]
      • Reasons
        • the original document is not modified; the identifier is intuitive, compatible with XPointer and can be generated dynamically, and the implementation is already mostly done
        • an automatic repair mechanism for links into elements that break after document updates is not planned: nobody would expect such a complex solution, and it is normally not needed
        • a broken link is redirected to the older version of the document if a versioning system is running; otherwise an error message reports that the document has changed
  • point/range expression
    • point(.0) or point(.1) or range(1.22, 1.34)
      • Reasons
        • compatible with XPointer; it is the only alternative considered

Implementation

  • new presentation mode with dynamically generated xpath identifiers for all elements
    • modes: text, pureXml
      • mostly already implemented: .../interface/page-fragment.xql?...&options=withXmlNodeId
  • user annotations
    • insert/update/delete user annotations: dynamic storing of annotations with an XPointer to the XML document (with user login)
      • partly implemented (login not yet taken into account)
    • read user annotations: dynamic presentation of the annotations in the document presentation
      • partly implemented
  • XPointer URIs
    • implementation of a REST-like solution via controller.xql: not implemented yet
      • translation of the XPointer URI
      • highlighting of the requested element or text portion in the element
      • different presentation modes: text, textPollux, gis, xml, pureXml
      • in .../interface/page-fragment.xql and .../page-query-result.xql
  • XLink URIs
    • support xlink:href attribute in all Echo and TEI elements: not implemented yet
  • user objects such as user queries
    • insert/update/delete user objects: dynamic storing of user queries (with user login)
      • partly implemented (login not yet taken into account)
    • read user objects: show user queries after user login
      • partly implemented
