Metadata
DFG Digitisation Guidelines
The "DFG Practical Guidelines on Digitisation" (German, English) discuss some metadata issues.
Dublin Core
Dublin Core Metadata Initiative (links with a date point to a specific document version; links without a date point to the most recent version):
- homepage, overview
- DCMI Metadata Terms: DCTERMS (main document), REVISIONS, DOMAINS
- DCMI Namespace Policy: NAMESPACE
- DCMI Abstract Model: DCAM (aka ABSTRACT-MODEL)
- DC and RDF: DC-RDF, DC-RDF-NOTES
- DC and XML: 2003 XML Guidelines, may soon be superseded by the proposed recommendation 2008 XML Guidelines, 2008 additional notes (there is also a 2006 working draft)
- software (mainly online tools), FAQ
Some findings: Within the original 15 elements, creator now refines contributor, and source refines relation (i.e. searching for a contributor should also find a creator, etc.); one can still use dc: for the original 15 elements, but is encouraged to use dcterms: instead. dcterms:date has range Literal, dcterms:language has range LinguisticSystem (DCTERMS).
DateScheme and LanguageScheme were never officially declared and are no longer used; the encoding schemes now listed are ISO-639-3 and W3CDTF (REVISIONS).
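The refinement finding above (a search for a contributor should also find a creator) can be sketched as a small lookup. This is only an illustration: the table covers just the two refinements mentioned here, and the function names and record layout are hypothetical, not part of any DCMI tool.

```python
# Hypothetical refinement table, limited to the two cases noted above:
# creator refines contributor, source refines relation.
REFINES = {
    "creator": "contributor",
    "source": "relation",
}


def matches(record_property, query_property):
    """Return True if record_property is query_property or refines it."""
    current = record_property
    while current is not None:
        if current == query_property:
            return True
        current = REFINES.get(current)  # walk up one refinement step
    return False


def search(records, query_property, value):
    """Return the records holding `value` under query_property or a refinement."""
    return [
        record
        for record in records
        if any(matches(prop, query_property) and val == value
               for prop, val in record["statements"])
    ]
```

With a record whose only statement is ("creator", "A. Author"), a search on contributor finds it, while a search on creator does not find a record that only has a contributor statement.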
Dublin Core Metadata Generator at Stanford.
Metadata in our Schema
The metadata module in our Relax NG compact schema.
Some questions:
- The definitions
dc.language = element dcterms:language { element rdf:Description { element dcq:languageScheme { "ISO 639-2" }, rdf.value } }
dc.date = element dcterms:date { element rdf:Description { element dcq:dateScheme { "ISO 8601" }, rdf.value } }
seem to be obsolete, the dcq namespace in particular (see NAMESPACE). I have replaced them with something which I believe is in accordance with Recommendation 7 in the 2003 XML Guidelines:
dc.language = element dcterms:language { attribute xsi:type { "dcterms:ISO639-3" }, text }
dc.date = element dcterms:date { attribute xsi:type { "dcterms:W3CDTF" }, text }
(Maybe I should have left <rdf:value> instead of "text"? But then, consequently ALL "text" in dcterms definitions should in fact be <rdf:value>.)
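To make the effect of the two xsi:type values concrete, here is a rough sketch of how element content could be checked against the named scheme. It assumes the W3CDTF profile (YYYY, YYYY-MM, YYYY-MM-DD, or a full date-time with time zone designator) and treats ISO 639-3 codes as three lowercase letters. It checks shape only, not calendar validity or whether a code is actually assigned, and the names SCHEMES and conforms are made up for this example.

```python
import re

# Shape-only pattern for the W3CDTF profile of ISO 8601.
W3CDTF = re.compile(
    r"\d{4}"                            # YYYY
    r"(-\d{2}"                          # -MM
    r"(-\d{2}"                          # -DD
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?"   # Thh:mm, optional :ss and fraction
    r"(Z|[+-]\d{2}:\d{2})"              # time zone designator, required with a time
    r")?)?)?"
)

# Maps the xsi:type values used in the schema to a pattern (hypothetical table).
SCHEMES = {
    "dcterms:W3CDTF": W3CDTF,
    "dcterms:ISO639-3": re.compile(r"[a-z]{3}"),
}


def conforms(xsi_type, text):
    """Return True if text has the shape expected by the scheme in xsi_type."""
    pattern = SCHEMES.get(xsi_type)
    if pattern is None:
        raise ValueError("unknown encoding scheme: " + xsi_type)
    return pattern.fullmatch(text) is not None
```

Relax NG itself would not enforce any of this with a plain "text" pattern; a check of this kind would have to run as a separate step.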
However, information is sparse. The syntax encoding schemes dcterms:W3CDTF and dcterms:ISO639-3 do exist according to DCTERMS, but an up-to-date opinion about how to use syntax encoding schemes is difficult to obtain. The proposed recommendation 2008 XML Guidelines uses dcterms:date only like this:
B.17: <dcterms:date>2005-05-05</dcterms:date>
B.18: <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
Each example in 2008 XML Guidelines comes in three versions. The three versions of example 18 are:
DC-DS-XML (section 4.5.2.1):
<?xml version="1.0" encoding="UTF-8" ?>
<dcds:descriptionSet
xmlns:dcds="http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/">
<dcds:description
dcds:resourceURI="http://dublincore.org/pages/home">
<dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
<dcds:literalValueString>DCMI Home Page</dcds:literalValueString>
</dcds:statement>
<dcds:statement dcds:propertyURI="http://purl.org/dc/terms/publisher"
dcds:valueURI="http://example.org/agents/DCMI">
<dcds:valueString>Dublin Core Metadata Initiative</dcds:valueString>
</dcds:statement>
<dcds:statement dcds:propertyURI="http://purl.org/dc/terms/date">
<!-- syntax encoding scheme URI -->
<dcds:literalValueString dcds:sesURI="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcds:literalValueString>
</dcds:statement>
</dcds:description>
</dcds:descriptionSet>
DC-Text syntax (appendix A.18):
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
Description (
ResourceURI ( <http://dublincore.org/pages/home> )
Statement (
PropertyURI ( dcterms:title )
LiteralValueString ( "DCMI Home Page" )
)
Statement (
PropertyURI ( dcterms:publisher )
ValueURI ( <http://example.org/agents/DCMI> )
ValueString ( "Dublin Core Metadata Initiative" )
)
Statement (
PropertyURI ( dcterms:date )
ValueString ( "2005-05-05"
SyntaxEncodingSchemeURI ( <http://www.w3.org/2001/XMLSchema#date> )
)
)
)
)
RDF/XML syntax (appendix B.18):
<?xml version="1.0" encoding="UTF-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/" >
<rdf:Description rdf:about="http://dublincore.org/pages/home">
<dcterms:title>DCMI Home Page</dcterms:title>
<dcterms:publisher>
<rdf:Description rdf:about="http://example.org/agents/DCMI">
<rdf:value>Dublin Core Metadata Initiative</rdf:value>
</rdf:Description>
</dcterms:publisher>
<dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
</rdf:Description>
</rdf:RDF>
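As a quick check of what a consumer actually sees in the RDF/XML version, the following sketch reads the dcterms:date statements and their rdf:datatype with Python's standard ElementTree. The embedded document is the B.18 example from above; the function name read_dates is made up for this illustration, and this is plain XML parsing, not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"

# Example B.18, copied from the listing above.
EXAMPLE_B18 = b"""<?xml version="1.0" encoding="UTF-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/" >
  <rdf:Description rdf:about="http://dublincore.org/pages/home">
    <dcterms:title>DCMI Home Page</dcterms:title>
    <dcterms:publisher>
      <rdf:Description rdf:about="http://example.org/agents/DCMI">
        <rdf:value>Dublin Core Metadata Initiative</rdf:value>
      </rdf:Description>
    </dcterms:publisher>
    <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
  </rdf:Description>
</rdf:RDF>
"""


def read_dates(xml_bytes):
    """Return (value, datatype) for every dcterms:date element in the document."""
    root = ET.fromstring(xml_bytes)
    return [
        (element.text, element.get("{%s}datatype" % RDF))  # None if no datatype
        for element in root.iter("{%s}date" % DCTERMS)
    ]
```

The datatype arrives as an ordinary attribute value, so a consumer of this form needs no knowledge of xsi:type.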
Now what?
- Should we rename our <metadata> element? In all examples it is <rdf:Description>. However, in most examples it seems to mean the repository version of the metadata, not the version in the XML file itself.
- Do we need something like <foaf:Person> for creators, contributors, etc.? (I guess not; it doesn't seem to have any particular advantage.)