Version 23 (modified by Klaus Thoden, 15 years ago)

Fixed link to metadata module.

Metadata

DFG Digitisation Guidelines

The "DFG Practical Guidelines on Digitisation" (German, English) discuss some metadata issues.

Dublin Core

Dublin Core Metadata Initiative (links with a specified date point to a specific document version, links without a date point to the most recent version):

Some findings: Within the original 15 elements, creator now refines contributor, and source refines relation (i.e. a search for a contributor should also find a creator, and a search for a relation should also find a source). One can still use dc: for the original 15 elements, but is encouraged to use dcterms: instead. dcterms:date has range Literal; dcterms:language has range LinguisticSystem (DCTERMS). DateScheme and LanguageScheme were never officially declared and are no longer used; see ISO639-3 and W3CDTF instead (REVISIONS).
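As a rough illustration of the namespace point above, the same title statement can be written with the legacy dc: prefix or with the recommended dcterms: prefix. This is only a sketch: the resource URI and the title text are made-up placeholders, and the RDF/XML wrapper is assumed for illustration rather than taken from our schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <!-- Placeholder resource URI, not a real document. -->
  <rdf:Description rdf:about="http://example.org/docs/1">
    <!-- Legacy form: still permitted for the original 15 elements. -->
    <dc:title>An Example Title</dc:title>
    <!-- Encouraged form: the same element in the dcterms namespace. -->
    <dcterms:title>An Example Title</dcterms:title>
  </rdf:Description>
</rdf:RDF>
```

In practice one would pick one of the two forms; both appear here only to show the difference in namespace.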

Dublin Core Metadata Generator at Stanford.

Metadata in our Schema

The metadata module in our Relax NG compact schema.

Some questions:

  1. The definitions
    dc.language = element dcterms:language { element rdf:Description { element dcq:languageScheme { "ISO 639-2" }, rdf.value } }
    dc.date = element dcterms:date { element rdf:Description { element dcq:dateScheme { "ISO 8601" }, rdf.value } }
    

seem to be obsolete. The dcq namespace in particular seems obsolete (see NAMESPACE). I have replaced them with something that I believe is in accordance with Recommendation 7 in the 2003 XML Guidelines:

dc.language = element dcterms:language { attribute xsi:type { "dcterms:ISO639-3" }, text }
dc.date = element dcterms:date { attribute xsi:type { "dcterms:W3CDTF" }, text } 

(Maybe I should have left <rdf:value> instead of "text"? But then, consequently ALL "text" in dcterms definitions should in fact be <rdf:value>.)
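To make the replacement definitions concrete, an instance fragment matching them might look roughly like the following. This is a sketch under assumptions: the wrapping metadata element is shown without a namespace, and the values "deu" and "2005-05-05" are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Wrapper element name and its lack of a namespace are assumptions. -->
<metadata xmlns:dcterms="http://purl.org/dc/terms/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Language as an ISO 639-3 code ("deu" is the code for German). -->
  <dcterms:language xsi:type="dcterms:ISO639-3">deu</dcterms:language>
  <!-- Date in W3CDTF form (YYYY-MM-DD). -->
  <dcterms:date xsi:type="dcterms:W3CDTF">2005-05-05</dcterms:date>
</metadata>
```

Because the xsi:type value is a qualified name, the dcterms prefix has to be declared on the element or one of its ancestors even where no dcterms element is otherwise in scope.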

However, information is sparse. The syntax encoding schemes dcterms:W3CDTF and dcterms:ISO639-3 do exist according to DCTERMS, but an up-to-date opinion on how to use syntax encoding schemes is difficult to obtain. The proposed recommendation (2008 XML Guidelines) uses dcterms:date only in these two forms:

B.17:  <dcterms:date>2005-05-05</dcterms:date>
B.18:  <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>

Each example in the 2008 XML Guidelines comes in three versions. The three versions of example 18 are:

DC-DS-XML (section 4.5.2.1):

<?xml version="1.0" encoding="UTF-8" ?>
<dcds:descriptionSet
  xmlns:dcds="http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/">
  <dcds:description
    dcds:resourceURI="http://dublincore.org/pages/home">
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
      <dcds:literalValueString>DCMI Home Page</dcds:literalValueString>
    </dcds:statement>
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/publisher"
                   dcds:valueURI="http://example.org/agents/DCMI">
      <dcds:valueString>Dublin Core Metadata Initiative</dcds:valueString>
    </dcds:statement>
    <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/date">
                       <!-- syntax encoding scheme URI -->
      <dcds:literalValueString dcds:sesURI="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>

DC-Text syntax (appendix A.18):

@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( <http://dublincore.org/pages/home> )
    Statement (
      PropertyURI ( dcterms:title )
      LiteralValueString ( "DCMI Home Page"
      )
    )
    Statement (
      PropertyURI ( dcterms:publisher )
      ValueURI ( <http://example.org/agents/DCMI> )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
    Statement (
      PropertyURI ( dcterms:date )
      ValueString ( "2005-05-05"
        SyntaxEncodingSchemeURI ( <http://www.w3.org/2001/XMLSchema#date> )
      )
    )
  )
)

RDF/XML syntax (appendix B.18):

<?xml version="1.0" encoding="UTF-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dcterms="http://purl.org/dc/terms/" >
  <rdf:Description rdf:about="http://dublincore.org/pages/home">
    <dcterms:title>DCMI Home Page</dcterms:title>
    <dcterms:publisher>
      <rdf:Description rdf:about="http://example.org/agents/DCMI">
        <rdf:value>Dublin Core Metadata Initiative</rdf:value>
      </rdf:Description>
    </dcterms:publisher>
    <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
  </rdf:Description>
</rdf:RDF>

Now what?

  1. Should we rename our <metadata> element? In all examples it is <rdf:Description>. However, in most examples it seems to mean the repository version of the metadata, not the version in the XML file itself.
  2. Do we need something like <foaf:Person> for creators, contributors, etc.? (I guess not; it doesn't seem to have any particular advantage.)
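For comparison, a creator typed as foaf:Person could look roughly like this in RDF/XML. This is a sketch for discussion only: the resource URI and the person's name are made-up placeholders, and nothing here is part of our current schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <!-- Placeholder resource URI, not a real document. -->
  <rdf:Description rdf:about="http://example.org/docs/1">
    <dcterms:creator>
      <!-- Typed node: the creator is described as a person, not as a plain string. -->
      <foaf:Person>
        <foaf:name>Jane Doe</foaf:name>
      </foaf:Person>
    </dcterms:creator>
  </rdf:Description>
</rdf:RDF>
```

The extra nesting buys an explicit type for the agent; whether that is worth the added markup is exactly the open question above.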