Metadata
DFG Digitisation Guidelines
The "DFG Practical Guidelines on Digitisation" (German, English) discuss some metadata issues.
Dublin Core
Dublin Core Metadata Initiative (links with a specified date point to a specific document version, links without a date point to the most recent version):
- homepage, overview
- DCMI Metadata Terms: DCTERMS (main document), REVISIONS, DOMAINS
- DCMI Namespace Policy: NAMESPACE
- DCMI Abstract Model: DCAM (aka ABSTRACT-MODEL)
- DC and RDF: DC-RDF, DC-RDF-NOTES
- DC and XML: 2003 XML Guidelines, may soon be superseded by the proposed recommendation 2008 XML Guidelines, 2008 additional notes (there is also a 2006 working draft)
- software (mainly online tools), FAQ
Some findings: Within the original 15 elements, creator now refines contributor, and source refines relation (i.e. searching for a contributor should also find a creator, etc.); one can still use dc: for the original 15 elements, but is encouraged to use dcterms: instead. dcterms:date has range Literal, dcterms:language has range LinguisticSystem (DCTERMS). DateScheme and LanguageScheme were never officially declared and are no longer used; ISO-639-3, W3CDTF (REVISIONS).
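The refinement finding above (a search for a contributor should also find a creator) can be illustrated with a minimal sketch. The REFINES table, the record layout and the function names are assumptions made up for this illustration; they are not part of any DCMI document or of our schema.

```python
# Hypothetical refinement table: each property maps to the property it refines.
# Only the two refinements mentioned in the findings are listed.
REFINES = {
    "creator": "contributor",
    "source": "relation",
}


def matches(query_property, record_property):
    """Return True if record_property is query_property or refines it."""
    current = record_property
    while current is not None:
        if current == query_property:
            return True
        # Walk up the refinement chain; None ends the loop.
        current = REFINES.get(current)
    return False


def search(records, query_property, value):
    """Return records having a matching property with exactly this value."""
    return [
        record
        for record in records
        if any(
            matches(query_property, prop) and val == value
            for prop, val in record["statements"]
        )
    ]


# Made-up sample records, for illustration only.
records = [
    {"id": "a", "statements": [("creator", "Alice")]},
    {"id": "b", "statements": [("contributor", "Alice")]},
    {"id": "c", "statements": [("source", "Alice")]},
]

contributor_hits = [r["id"] for r in search(records, "contributor", "Alice")]
creator_hits = [r["id"] for r in search(records, "creator", "Alice")]
```

Searching for contributor returns both the creator record and the contributor record, while searching for creator returns only the creator record, because refinement only works in one direction.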
Dublin Core Metadata Generator at Stanford.
Metadata in our Schema
The metadata module in our Relax NG compact schema.
Some questions:
- The definitions
    dc.language = element dcterms:language {
      element rdf:Description {
        element dcq:languageScheme { "ISO 639-2" },
        rdf.value
      }
    }
    dc.date = element dcterms:date {
      element rdf:Description {
        element dcq:dateScheme { "ISO 8601" },
        rdf.value
      }
    }
seem to be obsolete. In particular, the dcq prefix seems obsolete (see NAMESPACE). I have replaced them with something that I believe is in accordance with Recommendation 7 in the 2003 XML Guidelines:
    dc.language = element dcterms:language {
      attribute xsi:type { "dcterms:ISO639-3" },
      text
    }
    dc.date = element dcterms:date {
      attribute xsi:type { "dcterms:W3CDTF" },
      text
    }
(Maybe I should have left <rdf:value> instead of "text"? But then, consequently, ALL "text" in dcterms definitions should in fact be <rdf:value>.)
However, information is sparse. The syntax encoding schemes dcterms:W3CDTF and dcterms:ISO639-3 do exist according to DCTERMS, but an up-to-date opinion on how to use syntax encoding schemes is difficult to obtain. The proposed recommendation 2008 XML Guidelines uses dcterms:date only like this:
    B.17: <dcterms:date>2005-05-05</dcterms:date>
    B.18: <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
Each example in 2008 XML Guidelines comes in three versions. The three versions of example 18 are:
DC-DS-XML (section 4.5.2.1):
    <?xml version="1.0" encoding="UTF-8" ?>
    <dcds:descriptionSet xmlns:dcds="http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/">
      <dcds:description dcds:resourceURI="http://dublincore.org/pages/home">
        <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/title">
          <dcds:literalValueString>DCMI Home Page</dcds:literalValueString>
        </dcds:statement>
        <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/publisher"
                        dcds:valueURI="http://example.org/agents/DCMI">
          <dcds:valueString>Dublin Core Metadata Initiative</dcds:valueString>
        </dcds:statement>
        <dcds:statement dcds:propertyURI="http://purl.org/dc/terms/date">
          <!-- syntax encoding scheme URI -->
          <dcds:literalValueString dcds:sesURI="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcds:literalValueString>
        </dcds:statement>
      </dcds:description>
    </dcds:descriptionSet>
DC-Text syntax (appendix A.18):
    @prefix dcterms: <http://purl.org/dc/terms/> .
    DescriptionSet (
      Description (
        ResourceURI ( <http://dublincore.org/pages/home> )
        Statement (
          PropertyURI ( dcterms:title )
          LiteralValueString ( "DCMI Home Page" )
        )
        Statement (
          PropertyURI ( dcterms:publisher )
          ValueURI ( <http://example.org/agents/DCMI> )
          ValueString ( "Dublin Core Metadata Initiative" )
        )
        Statement (
          PropertyURI ( dcterms:date )
          ValueString ( "2005-05-05"
            SyntaxEncodingSchemeURI ( <http://www.w3.org/2001/XMLSchema#date> )
          )
        )
      )
    )
RDF/XML syntax (appendix B.18):
    <?xml version="1.0" encoding="UTF-8" ?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dcterms="http://purl.org/dc/terms/">
      <rdf:Description rdf:about="http://dublincore.org/pages/home">
        <dcterms:title>DCMI Home Page</dcterms:title>
        <dcterms:publisher>
          <rdf:Description rdf:about="http://example.org/agents/DCMI">
            <rdf:value>Dublin Core Metadata Initiative</rdf:value>
          </rdf:Description>
        </dcterms:publisher>
        <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
      </rdf:Description>
    </rdf:RDF>
Now what?
- Should we rename our <metadata> element? In all examples it is <rdf:Description>. However, in most examples it seems to denote the repository version of the metadata, not the version in the XML file itself.
- Do we need something like <foaf:Person> for creators, contributors, etc.? (I guess not; it doesn't seem to have any particular advantage.)
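To get a feeling for what a consumer of the RDF/XML form would have to do, here is a minimal sketch that reads the dcterms:date statement and its rdf:datatype from the RDF/XML version of example 18 quoted above. It uses only Python's standard xml.etree.ElementTree module; the function name read_dates is an assumption for this illustration, and the XML declaration is left out of the embedded sample for simplicity.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

# RDF/XML version of example 18, without the XML declaration.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dublincore.org/pages/home">
    <dcterms:title>DCMI Home Page</dcterms:title>
    <dcterms:publisher>
      <rdf:Description rdf:about="http://example.org/agents/DCMI">
        <rdf:value>Dublin Core Metadata Initiative</rdf:value>
      </rdf:Description>
    </dcterms:publisher>
    <dcterms:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2005-05-05</dcterms:date>
  </rdf:Description>
</rdf:RDF>"""


def read_dates(rdf_xml):
    """Return (value, datatype URI or None) for every dcterms:date element."""
    root = ET.fromstring(rdf_xml)
    results = []
    # ElementTree addresses namespaced names as "{namespace-uri}localname".
    for date in root.iter(f"{{{DCTERMS_NS}}}date"):
        datatype = date.get(f"{{{RDF_NS}}}datatype")
        results.append((date.text, datatype))
    return results


dates = read_dates(RDF_XML)
```

A date written as in B.17, without rdf:datatype, would come back with None as its datatype, so a reader of such files has to decide for itself how to interpret untyped values.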