
Version 15 (modified by jwillenborg, 14 years ago)


XPointer

See XML Pointer Language (XPointer). XPointer can be used in all URI attributes.

Examples:

  • <p>This is discussed in <div xlink:href="example.xml#xpointer((//p)[1])">the first paragraph of the example document</div>.</p>
  • <p>This is discussed in <div xlink:href="example.xml#xpointer(id('4711')/div[1])">the first division of the example document</div>.</p>
  • <p>This is discussed in <div xlink:href="example.xml#element(/1/2)">the second element of the first element</div>.</p>
  • <p>Einstein wrote in his diary that he does not want his ship voyage to South America to be delayed any further (see <note xlink:href="http://mpdl.mpiwg-berlin.mpg.de/physics/einstein/diary.xml#xpointer(id('page53')/echo[1]/text[1]/body[1]/chap[1]/p[1]/s[2])">page 53, sentence 2</note>).</p>
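
The href values in the examples above all have the same shape: a document URI, a "#", and a scheme-based pointer such as xpointer(...) or element(...). As a minimal sketch (not the MPDL implementation), the following splits such a value into its parts. The function name split_pointer is hypothetical, and the sketch assumes the fragment holds exactly one pointer part; the XPointer framework also allows several parts in sequence, which is not handled here.

```python
import re

# One scheme-based pointer part: a scheme name followed by parenthesized data.
POINTER_PART = re.compile(r"^(?P<scheme>[A-Za-z_][\w.-]*)\((?P<data>.*)\)$", re.S)


def split_pointer(uri):
    """Return (document_uri, scheme, scheme_data) for a URI with one pointer part.

    For a bare-name fragment or a missing fragment, scheme is None and the
    third item is the raw fragment text.
    """
    document_uri, _, fragment = uri.partition("#")
    match = POINTER_PART.match(fragment)
    if match is None:
        return document_uri, None, fragment
    return document_uri, match.group("scheme"), match.group("data")


if __name__ == "__main__":
    print(split_pointer("example.xml#xpointer((//p)[1])"))
    print(split_pointer("example.xml#element(/1/2)"))
```

The scheme data is returned unparsed; evaluating it (as an XPath expression for xpointer, or as a child sequence for element) is a separate step.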

XPoints

  • Examples
    • point(1.0) is just inside the beginning of the p element.
    • point(1.2) is between the end of the em element and the following text node (which contains "world.").
    • point(.0) immediately precedes the root node.
    • point(1/2/1.1) immediately follows the "b" in the middle text node.

Range

  • xpointer(id("chap1")/range-to(id("chap2"))) (the range from the start point of the element with ID "chap1" to the end point of the element with ID "chap2")
  • string-range(title,"Thomas Pynchon")[17] (the 17th of those "Thomas Pynchon" strings appearing in a title element)
  • <p>See the <note xlink:href="http://mpdl.mpiwg-berlin.mpg.de/physics/einstein/diary.xml#xpointer(id('page53')/echo[1]/text[1]/body[1]/chap[1]/p[1]/s[2]/range(1.3, 1.10))">text passage on page 53, sentence 2, characters 3 to 10</note>.</p>
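
The href in the last example above can be read as four pieces joined together: the document URI, an id(...) step for the page, a positional path to the element, and an optional range step. The sketch below assembles such a URI from those pieces. The function name build_pointer_uri and its parameter names are hypothetical, and the layout is inferred only from the example href, not from a published MPDL rule.

```python
def build_pointer_uri(document_uri, page_id, element_path, locator=None):
    """Assemble an xpointer() URI shaped like the diary example.

    document_uri -- URI of the XML document
    page_id      -- value used in the id('...') step, e.g. "page53"
    element_path -- positional path below the page, starting with "/"
    locator      -- optional trailing step such as "range(1.3, 1.10)"
    """
    pointer = f"id('{page_id}'){element_path}"
    if locator:
        pointer += f"/{locator}"
    return f"{document_uri}#xpointer({pointer})"


if __name__ == "__main__":
    print(build_pointer_uri(
        "http://mpdl.mpiwg-berlin.mpg.de/physics/einstein/diary.xml",
        "page53",
        "/echo[1]/text[1]/body[1]/chap[1]/p[1]/s[2]",
        "range(1.3, 1.10)",
    ))
```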

Support of XPointer

The MPDL project places a special focus on the presentation of document pages. An important requirement for MPDL XPointers is therefore support for pointers relative to document pages. A second requirement is to point not only to elements on a page but also to text portions within elements (a point or a range). XPointer can be used in all Echo and TEI URI attributes. The MPDL project will support the following subset of XPointer in the near future:

External user annotations of documents are stored relative to document pages: XPointer page points/ranges are mapped to an internal identifier that combines a document identifier, a page number, an element xpath expression, and a point/range expression. Examples:

  • before the first sentence on page 6: /tei/en/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], point(.0)
  • after the first sentence on page 6: /tei/en/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], point(.1)
  • from character 22 to character 34 in the first sentence on page 6: /tei/en/ramones_2004.xml, page6, /tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], range(1.22, 1.34)
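
The four-part internal identifier shown in the examples above can be modelled as a small record. This is a minimal sketch only: the names AnnotationTarget, parse_target and range_bounds are hypothetical, and the comma-separated text form is taken from the examples rather than from a documented storage format. For range(a.b, c.d), the number after the dot is read as the character position, which matches "character 22 to character 34"; the number before the dot is kept as-is because its meaning is not spelled out here.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AnnotationTarget:
    document: str  # e.g. "/tei/en/ramones_2004.xml"
    page: str      # e.g. "page6"
    element: str   # positional xpath expression of the element
    locator: str   # "point(.0)", "point(.1)" or "range(1.22, 1.34)"


def parse_target(text: str) -> AnnotationTarget:
    """Split 'document, page, element, locator' into its four parts."""
    # maxsplit=3 keeps the comma inside range(...) in the last part.
    document, page, element, locator = [p.strip() for p in text.split(",", 3)]
    return AnnotationTarget(document, page, element, locator)


RANGE = re.compile(r"^range\(\s*(\d+)\.(\d+)\s*,\s*(\d+)\.(\d+)\s*\)$")


def range_bounds(locator: str) -> Optional[Tuple[Tuple[int, int], Tuple[int, int]]]:
    """Return ((a, b), (c, d)) for 'range(a.b, c.d)', or None for other locators."""
    match = RANGE.match(locator)
    if match is None:
        return None
    a, b, c, d = (int(g) for g in match.groups())
    return (a, b), (c, d)


if __name__ == "__main__":
    target = parse_target(
        "/tei/en/ramones_2004.xml, page6, "
        "/tei[1]/text[1]/body[1]/chap[1]/p[1]/s[1], range(1.22, 1.34)"
    )
    print(target)
    print(range_bounds(target.locator))
```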

See XML Linking Language (XLink) Version 1.0. Example: <p>The best German punk band is <bla xlink:href="http://slime.de/">Slime</bla>.</p>

XLink is not yet supported in TEI, so the MPDL project will support XLink once it becomes part of TEI.

Discussion

  • document identifier: a persistent identifier of a document
  • page number
  • element identifier
    • xpath expression: /tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2]
      • Advantages
        • can be generated dynamically for the XML page
        • intuitive
        • fully compatible with XLink/XPointer
        • easy to retrieve
        • implementation is easy and consistent (through saxon:path)
      • Disadvantages
        • if elements in the XML document are later inserted, updated or deleted, many old XPointer links break
        • relatively long string
    • node id: 1.1.1.3.1.2
      • Advantages
        • can be generated dynamically for the XML page
        • intuitive
        • compatible with XLink/XPointer
        • easy to retrieve
      • Disadvantages
        • if elements in the XML document are later inserted, updated or deleted, many old XPointer links break
    • id attribute in XML document: <s id="47114711">...</s>
      • Advantages
        • if elements in the XML document are later inserted, updated or deleted, only old XPointer links to deleted elements break
        • compatible with XLink/XPointer
      • Disadvantages
        • the document has to be modified during upload: an id attribute has to be generated for each element in the XML document, which also makes the implementation more complex
        • if a user's document already uses id attributes, but not consistently, these have to be replaced by new, consistent ids (not easy to implement)
        • the document grows in size because of all the id attribute values
        • the id is not intuitive
    • special id attribute in XML document: <s xmlNodeId="47114711">...</s>
      • Advantages
        • users can use their own id attributes in the document as they like
        • if elements in the XML document are later inserted, updated or deleted, only old XPointer links to deleted elements break
        • compatible with XLink/XPointer
      • Disadvantages
        • the document has to be modified during upload: an xmlNodeId attribute has to be generated for each element in the XML document, which also makes the implementation more complex
        • the document grows in size because of all the xmlNodeId attribute values
        • the id is not intuitive
  • point/range expression
    • point(.0) or point(.1) or range(1.22, 1.34)
      • Advantages
        • pointers can point to a text portion
        • pointers can point to the place before or after an element
        • is already partly implemented
      • Disadvantages
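
The xpath-expression and node-id forms of the element identifier discussed above appear to carry the same positional information: /tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2] and 1.1.1.3.1.2 list the same positions step by step. A minimal sketch of that conversion follows. The function name xpath_to_node_id is hypothetical, and the sketch assumes that the node id simply repeats the per-step positions of the xpath expression. If node ids instead count all child elements rather than same-named siblings, the conversion would need the document itself.

```python
import re

# One location step of the form name[position], e.g. "chap[3]".
STEP = re.compile(r"^[^/\[\]]+\[(\d+)\]$")


def xpath_to_node_id(xpath):
    """Turn '/tei[1]/text[1]/...' into a dotted list of the step positions."""
    positions = []
    for step in xpath.strip("/").split("/"):
        match = STEP.match(step)
        if match is None:
            # Steps without an explicit [n] predicate are outside this sketch.
            raise ValueError(f"not a positional step: {step!r}")
        positions.append(match.group(1))
    return ".".join(positions)


if __name__ == "__main__":
    print(xpath_to_node_id("/tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2]"))
```

The reverse direction is not possible from the node id alone, since the element names are dropped; this is one way to see why the node id is shorter but carries less information than the xpath expression.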

Selection / Solution

  • document identifier
  • page number
  • element identifier (choice between the two best possible solutions)
    • xpath expression: /tei[1]/text[1]/body[1]/chap[3]/p[1]/s[2]
      • Reasons
        • no modification of the original document, intuitive, compatible with XPointer, can be generated dynamically, implementation is already mostly done
        • no automatic repair mechanism for links into elements that break after document updates: nobody would expect such a complex solution, and it is normally not needed
        • a broken link is redirected to the older version of the document if a versioning system is running; otherwise an error message states that the document has changed
  • point/range expression
    • point(.0) or point(.1) or range(1.22, 1.34)
      • Reasons
        • compatible with XPointer; this is the only alternative

Implementation

  • new presentation mode with dynamically generated xpath identifiers for all elements
    • modes: text, pureXml
      • mostly already implemented: .../interface/page-fragment.xql?...&options=withXmlNodeId
  • user annotations
    • insert/update/delete user annotations: annotations are stored dynamically together with an XPointer into the XML document (with user login)
      • partly implemented (login not yet taken into account)
    • read user annotations: dynamic presentation of the annotations in the document presentation
      • partly implemented
  • XPointer URIs
    • implementation of a REST-like URI via controller.xql: partly implemented
      • translation of the XPointer URI
      • highlighting of the requested element or text portion in the element
      • different presentation modes: text, textPollux, gis, xml, pureXml
      • in .../interface/page-fragment.xql and .../page-query-result.xql
  • TEI URIs
    • support URI attributes (e.g. ref(target), ptr(target)) in Echo and TEI elements: implemented
  • user objects such as user queries
    • insert/update/delete user objects: user queries are stored dynamically (with user login)
      • partly implemented (login not yet taken into account)
    • read user objects: show user queries after user login
      • partly implemented
