Annotation of storage/meta/meta-format.tex, revision 1.20
1.1 casties 1: \documentclass[a4paper]{article}
2:
3: \usepackage[latin1]{inputenc}
4: \usepackage[T1]{fontenc}
5: \usepackage{ae}
6: %\usepackage{times}
7: %\usepackage{courier}
8:
9: % create in-text links black (with PDF)
1.6 casties 10: \usepackage[colorlinks=true,linkcolor=black]{hyperref}
1.1 casties 11: % Format URLs nicely (without PDF)
1.6 casties 12: %\usepackage{url}
1.1 casties 13:
14:
15: \title{A simple metadata format for resource bundles}
16:
1.4 casties 17: \author{Robert Casties, Dirk Wintergrün, Hans-Christoph Liess}
1.1 casties 18:
1.20 ! casties 19: \date{V1.3.4 of 24.7.2008}
1.1 casties 20:
21: \begin{document}
22:
23: \maketitle
24:
25: \tableofcontents
26:
27:
28: \section{File and directory names}
29: \label{sec:file-directory-names}
30:
31: File and directory names should not contain spaces. Allowed characters
32: in filenames are only the alphanumeric set a-z, A-Z, 0-9, hyphen
33: ``-'', underscore ``\_'' and dot ``.''.
34:
1.12 casties 35: Files and directories with names that contain illegal characters must
36: be transformed to allowed names. A proposed simple transformation
37: rule is:
38:
39: \begin{itemize}
40: \item whitespace characters (e.g. blank, tab, cr, lf) are replaced by
41: hyphens ``-''
42:
43: \item other illegal characters are replaced by underscores ``\_''.
44: \end{itemize}
45:
46: This rule does not provide a reversible mapping to the original
47: illegal file name and it does not provide a collision-free mapping,
48: i.e. two different illegal file names might be mapped to the same
49: allowed file name. Additional precautions for these cases must be
50: taken.
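
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the format: the function names \texttt{sanitize\_name} and \texttt{sanitize\_all} are hypothetical, and reporting collisions instead of resolving them is only one possible precaution.

```python
import re

# Characters allowed by the rule above: a-z, A-Z, 0-9, hyphen, underscore, dot.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]")


def sanitize_name(name: str) -> str:
    """Map an arbitrary file name to an allowed one (not reversible)."""
    out = []
    for ch in name:
        if _ALLOWED.fullmatch(ch):
            out.append(ch)
        elif ch.isspace():
            out.append("-")  # whitespace -> hyphen
        else:
            out.append("_")  # any other illegal character -> underscore
    return "".join(out)


def sanitize_all(names):
    """Sanitize several names and report collisions instead of hiding them."""
    mapping = {}     # allowed name -> first original name that produced it
    collisions = {}  # allowed name -> all original names that map to it
    for original in names:
        safe = sanitize_name(original)
        if safe in mapping and mapping[safe] != original:
            collisions.setdefault(safe, [mapping[safe]]).append(original)
        else:
            mapping[safe] = original
    return mapping, collisions
```

Keeping the original name next to the allowed one (for example in the \texttt{original-name} element described later) is one way to compensate for the mapping not being reversible.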
1.1 casties 51:
1.4 casties 52:
53: \section{Metadata files}
54: \label{sec:metadata-files}
55:
56: The metadata information is stored in the XML format documented below
57: in special files in the resource directory. Two forms of metadata
58: files are possible:
59: \begin{itemize}
60: \item a file named \texttt{index.meta} in a directory.
61:
1.16 casties 62: \item a file with the same name as the data file it describes and an
1.4 casties 63: additional extension \texttt{.meta}. For example, the metadata for the
1.16 casties 64: file \texttt{p0001.tif} would be in a file \texttt{p0001.tif.meta}.
1.4 casties 65: \end{itemize}
66:
67: The resource directory must contain an \texttt{index.meta} file with
1.16 casties 68: information about the resource as a whole. Subdirectories can
69: contain additional \texttt{index.meta} files.
1.4 casties 70:
71: Additional information about single data files that are part of the
72: resource can either be put in \texttt{file} tags in the
73: \texttt{index.meta} file or in separate \emph{filename}\texttt{.meta}
74: files for each data file. Information from the directory-level file is
1.16 casties 75: inherited at the file level unless it is overridden there.
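
As an illustration of the lookup order this implies, the following Python sketch lists the metadata files that may describe a data file, most specific first. The function name \texttt{metadata\_candidates} is hypothetical, and the assumption that \texttt{index.meta} files of all parent directories up to the resource root are consulted is ours; the text above only states inheritance from the directory level to the file level.

```python
from pathlib import Path


def metadata_candidates(data_file, resource_root):
    """Return metadata files that may describe data_file, most specific first."""
    data_file = Path(data_file)
    resource_root = Path(resource_root)
    # 1. The per-file form: <filename>.meta next to the data file.
    candidates = [data_file.with_name(data_file.name + ".meta")]
    # 2. index.meta files, walking from the file's directory up to the root.
    directory = data_file.parent
    while True:
        candidates.append(directory / "index.meta")
        if directory == resource_root or directory == directory.parent:
            break
        directory = directory.parent
    return candidates
```

A reader that merges metadata would apply the list in reverse, so that values from more specific files override inherited ones.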
1.4 casties 76:
77:
1.1 casties 78: \section{Resource format}
79: \label{sec:mpiwg-doc}
80:
81: In this description elements marked ``optional'' need not be supplied
82: by the provider of the resource and may be absent in all versions of
83: the metadata file. Elements marked ``required'' must be supplied by
84: the provider of the resource. Elements marked ``deduced'' can be
85: supplied by the provider of the resource but can also be provided by
1.4 casties 86: automatic scripts later in the process; these elements must be present
1.1 casties 87: in the final file.
88:
1.12 casties 89: File and directory paths in the metadata file use the conventional
90: Unix file separator slash ``/''.
91:
1.11 casties 92: The outer container element is \texttt{resource}. It has the following
93: \textbf{attributes}:
94:
95: \begin{description}
1.12 casties 96: \item[type] sub-type of resource (e.g. ``ECHO'', ``MPIWG'') --
97: optional.
1.11 casties 98:
1.16 casties 99: \item[version] version number of metadata format (currently 1.2) --
1.11 casties 100: required.
101: \end{description}
102:
103: \noindent The allowed \textbf{elements} inside \texttt{resource} are:
1.1 casties 104:
105: \begin{description}
1.14 casties 106: \item[description] An informal textual description of the resource --
107: optional\footnote{At least one description of the resource's content
108: is required. The description can be an informal
109: \texttt{description} element or a descriptive element (like
110: \texttt{bib}) in a \texttt{meta} container.}.
1.1 casties 111:
112: \item[name] The filename of the resource (name of the directory this
113: file is contained in) -- required.
114:
115: \item[creator] The name of the project or person that created the
116: resource -- optional.
1.4 casties 117:
118: \item[archive-creation-date] The time and date the archive collection
119: was created -- deduced.
1.1 casties 120:
1.4 casties 121: \item[archive-storage-date] The time and date the archive was written
122: to permanent storage -- deduced (must not be set by the user).
1.1 casties 123:
124: \item[archive-path] The full path to the resource directory inside the
1.5 casties 125: whole archive collection, including the resource directory -- deduced.
1.12 casties 126:
127: \item[archive-id] The ID for this document in the archive --
1.16 casties 128: optional.
1.1 casties 129:
130: \item[derived-from] Container for the description of the original
131: resource if this resource is a modified version of another resource
132: -- optional.
133:
134: \begin{description}
1.12 casties 135: \item[archive-id] The ID of the original resource
1.16 casties 136: -- required (or archive-path).
1.12 casties 137:
1.1 casties 138: \item[archive-path] The full path to the original resource
1.16 casties 139: -- required (or archive-id).
140:
141: \item[description] An informal textual description of the relation
142: of this resource to the original resource -- optional.
143: \end{description}
144:
145: \item[used-by] Container for the description of modified resources
146: if this resource is the source of another resource
147: -- optional.
148:
149: \begin{description}
150: \item[archive-id] The ID of the derived resource
151: -- required (or archive-path).
152:
153: \item[archive-path] The full path to the derived resource
154: -- required (or archive-id).
1.1 casties 155:
156: \item[description] An informal textual description of the relation
157: of this resource to the derived resource -- optional.
158: \end{description}
159:
160: \item[linked-with] Container for the description of another
161: resource when this resource is a linked copy of another resource
162: -- optional.
163:
164: \begin{description}
1.12 casties 165: \item[archive-id] The ID of the linked resource
1.16 casties 166: -- required (or archive-path).
1.12 casties 167:
1.1 casties 168: \item[archive-path] The full path to the linked resource
1.16 casties 169: -- required (or archive-id).
1.1 casties 170:
171: \item[description] An informal textual description of the relation
172: of this resource to the linked resource -- optional.
173: \end{description}
174:
1.12 casties 175: \item[media-type] \label{tag-media-type} The main media type of this
176: resource -- required.\\ The main media type can be overridden by
177: \texttt{media-type}s in subdirectories. Possible types are
178: \begin{itemize}
179: \item \texttt{image}
180:
181: \item \texttt{text}
182:
183: \item \texttt{audio}
184:
185: \item \texttt{video}
186:
187: \item \texttt{data} for other types of data
188: \end{itemize}
1.1 casties 189:
190: \item[meta] Additional metadata information about the resource --
191: optional.\\ For a description of additional metadata see below.
192:
193: \item[dir] Container for the description of a subdirectory -- required
194: (when there are subdirectories).\\ \texttt{dir} tags should not be
195: nested. Directories at lower levels are identified by their
196: \texttt{path}.
197:
198: \begin{description}
199: \item[description] An informal textual description of the
200: subdirectory -- optional.
201:
202: \item[name] The name of the subdirectory -- required.
203:
1.12 casties 204: \item[original-name] A text string associated with the directory as
205: original name -- optional. (E.g. if the data in this directory
206: came from an external source and had a name that had to be changed
207: according to section~\ref{sec:file-directory-names} but it should
208: be possible to reference the original name.)
209:
1.1 casties 210: \item[path] The directory path of this subdirectory relative to the
1.5 casties 211: resource's root directory (excluding the directory itself) --
212: required (may be empty or omitted if the directory is a direct
213: child of the resource's root directory).
1.1 casties 214:
215: \item[meta] Additional metadata information about the directory --
216: optional.\\ For a description of additional metadata see below.
217: \end{description}
218:
219: \item[file] Container for the description of a file -- deduced.\\
220: \texttt{file} tags should not be nested in \texttt{dir} tags. Files
221: at lower directory levels are identified by their \texttt{path}.
222:
223: \begin{description}
224: \item[description] An informal textual description of the
225: file -- optional.
226:
227: \item[name] The name of the file -- required.
228:
1.12 casties 229: \item[original-name] A text string associated with the file as
1.16 casties 230: original name -- optional. (e.g. if this file came from an
1.12 casties 231: external source and had a name that had to be changed according to
1.16 casties 232: section~\ref{sec:file-directory-names} it is possible
233: to preserve the original name.)
1.12 casties 234:
1.1 casties 235: \item[path] The directory path of this file relative to the
1.5 casties 236: resource's root directory (excluding the file itself) -- required
237: (may be empty or omitted if the file is in the resource's root
238: directory).
1.7 casties 239:
240: \item[date] The file's modification or creation date\footnote{The
241: preferred time and date format is ``YYYY/MM/DD HH:MM:SS''},
242: whichever is more recent -- optional.
1.1 casties 243:
244: \item[modification-date] The file's modification date -- optional.
245:
246: \item[creation-date] The file's creation date -- optional.
1.7 casties 247:
1.1 casties 248: \item[size] The file size -- deduced.
249:
250: \item[mime-type] The file's mime-type -- optional.
251:
252: \item[md5cs] MD5 checksum of the file content -- optional.
253:
254: \item[meta] Additional metadata information about the file --
255: optional. For a description of additional metadata see below.
256: \end{description}
257:
258: \end{description}
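
The ``deduced'' entries of a \texttt{file} element are the kind of information an automatic script can fill in. The Python sketch below shows one way to do this; it is not the script used in the actual workflow. The function name \texttt{file\_element} is hypothetical, and the size in bytes, the hexadecimal checksum notation and the use of local time are assumptions, since the text does not fix them.

```python
import hashlib
import os
import posixpath
import time
import xml.etree.ElementTree as ET


def file_element(resource_root: str, relative_file: str) -> ET.Element:
    """Build a <file> element for one data file.

    relative_file uses "/" as separator and is relative to the resource root.
    """
    full_path = os.path.join(resource_root, *relative_file.split("/"))
    directory, name = posixpath.split(relative_file)

    elem = ET.Element("file")
    ET.SubElement(elem, "name").text = name
    if directory:  # path may be omitted for files in the root directory
        ET.SubElement(elem, "path").text = directory
    # Assumption: size is given in bytes.
    ET.SubElement(elem, "size").text = str(os.path.getsize(full_path))
    # Preferred date format from the footnote: YYYY/MM/DD HH:MM:SS
    mtime = time.localtime(os.path.getmtime(full_path))
    ET.SubElement(elem, "modification-date").text = time.strftime(
        "%Y/%m/%d %H:%M:%S", mtime
    )
    # Assumption: the checksum is written as a hexadecimal digest.
    with open(full_path, "rb") as fh:
        ET.SubElement(elem, "md5cs").text = hashlib.md5(fh.read()).hexdigest()
    return elem
```

The result can be appended to the \texttt{resource} element of an \texttt{index.meta} tree and serialised with \texttt{ET.tostring}.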
259:
260:
261:
262: \section{Additional metadata}
263: \label{sec:additional-metadata}
264:
265: All elements with \texttt{meta} tags can contain an arbitrary number
1.12 casties 266: of the following additional metadata elements.
267:
1.16 casties 268: \subsection{Workflow state}
1.12 casties 269: \label{sec:workflow-state}
270:
271: All additional metadata elements can have a \texttt{workflow-state}
272: \textbf{attribute}. This attribute reflects the state of the
273: corresponding metadata element. The possible values for the
274: \texttt{workflow-state} attribute are
275: \begin{itemize}
276: \item \texttt{preliminary} this information is preliminary. It must
277: be checked in further workflow steps.
278:
279: \item \texttt{inwork}
280:
281: \item \texttt{final}
282: \end{itemize}
283:
284: Workflow states other than \texttt{preliminary} are part of the
285: workflow handling of the respective projects.
286:
287: Metadata elements can appear multiple times with different
288: \texttt{workflow-state} attributes. This enables metadata versioning.
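
When an element appears in several versions, a reader has to choose one. The Python sketch below picks the most advanced version under an assumed preference order (\texttt{final} before \texttt{inwork} before \texttt{preliminary}, elements without the attribute last). This order and the function name \texttt{pick\_version} are assumptions for illustration; the format itself only defines the meaning of \texttt{preliminary}.

```python
import xml.etree.ElementTree as ET

# Assumed preference order, most advanced state first.
PREFERENCE = ["final", "inwork", "preliminary"]


def pick_version(meta: ET.Element, tag: str):
    """Return the child of meta named tag with the most advanced state."""

    def rank(elem):
        state = elem.get("workflow-state")
        # Unknown or missing states sort after all known ones.
        return PREFERENCE.index(state) if state in PREFERENCE else len(PREFERENCE)

    candidates = meta.findall(tag)
    return min(candidates, key=rank) if candidates else None
```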
289:
290:
291:
292: \subsection{Content type}
293: \label{sec:content-type}
294:
295: \begin{description}
296: \item[content-type] \label{tag-content-type} The content type of this
297: resource -- required.\\
298: The content type enables the choice of tools to manipulate and
299: display the resource. There should be a common list of content
300: types. For digital documents (books, manuscripts) this would be
301: ``scanned document'', for other image data ``scanned
302: images''.\footnote{The criterion for documents is an ordered
303: succession of image files (pages) and equal image size and
304: resolution throughout the images of a resource.}
305: \end{description}
306:
307:
1.1 casties 308:
1.4 casties 309: \subsection{Language}
310: \label{sec:lang}
311:
312: The language of a resource (e.g. a text) can be specified with a
313: \texttt{lang} tag. Languages have to be described using the
314: international codes for the representation of names of languages
315: either in two-letter form (ISO 639-1) or in three-letter form (ISO
316: 639-2). The entire catalogue of languages is documented on the page
317:
318: \url{http://www.loc.gov/standards/iso639-2/englangn.html}
319:
1.1 casties 320:
321: \subsection{DRI}
322: \label{sec:dri}
323:
324: The \emph{digital resource identifier} for the resource is specified
1.4 casties 325: in a \texttt{dri} element. Digital resource identifiers are documented
1.1 casties 326: on the page
327:
328: \url{http://pythia.mpiwg-berlin.mpg.de/projects/standards/dri}.
329:
330:
1.4 casties 331:
332: \subsection{Collection context}
333: \label{sec:collection-context}
334:
1.15 casties 335: The context of a resource as part of a collection or part of a project
336: can be specified in the \texttt{context} element. The context element
337: can appear multiple times if the resource is part of multiple
338: collections or projects.
1.4 casties 339:
340: \begin{description}
1.5 casties 341: \item[context] information on collection or project context.
1.4 casties 342:
1.5 casties 343: \begin{description}
1.15 casties 344: \item[link] URL to additional context information -- optional.
1.5 casties 345:
1.15 casties 346: \item[name] Textual description of project or collection -- optional.
347:
348: \item[meta-datalink] description of external sources of canonical meta
349: information -- optional
350: \begin{description}
351: \item[db] \textbf{attribute} to identify different sets of meta data
352: links to the same resource -- optional
353:
354: \item[object] \textbf{attribute} to identify different objects or
355: parts of the same resource -- optional
356:
357: \item[label] textual label for the link -- optional
358:
359: \item[url] URL to present to the client -- optional
360:
361: \item[metadata-url] URL to an external server to be queried -- optional
362: \end{description}
363:
364: \item[meta-baselink] description of external server for canonical meta
365: information -- optional
366: \begin{description}
367: \item[db] \textbf{attribute} to identify different sets of meta data
368: links to the same resource -- optional
369:
370: \item[label] textual label for the link -- optional
371:
372: \item[url] URL to present to the client -- optional
373:
374: \item[metadata-url] URL to an external server to be queried --
375: required (the parameter \texttt{object=} with an object id has
376: to be appended to this URL)
377: \end{description}
1.5 casties 378: \end{description}
1.4 casties 379: \end{description}
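
For a \texttt{meta-baselink}, the query URL is formed by appending the \texttt{object=} parameter to the \texttt{metadata-url}. A minimal Python sketch, assuming that the object id should be URL-encoded and that the base URL may or may not already carry a query string (the function name \texttt{baselink\_query\_url} and the example URLs in the usage below are hypothetical):

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def baselink_query_url(metadata_url: str, object_id: str) -> str:
    """Append object=<object_id> to the metadata-url of a meta-baselink."""
    parts = urlsplit(metadata_url)
    extra = urlencode({"object": object_id})
    # Keep an existing query string and avoid a doubled "&".
    existing = parts.query.rstrip("&")
    query = f"{existing}&{extra}" if existing else extra
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )
```

For example, \texttt{baselink\_query\_url("http://example.org/meta?db=lib", "42")} yields \texttt{http://example.org/meta?db=lib\&object=42}.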
1.5 casties 380:
1.4 casties 381:
382:
383:
1.1 casties 384: \subsection{Bibliographic information}
385: \label{sec:bibliographic-data}
386:
1.5 casties 387: Bibliographic information is presented in a \texttt{bib} container with
1.1 casties 388: a \texttt{type} parameter, giving the type of bibliographic resource.
1.4 casties 389: The \texttt{type} field can be repeated as a tag in the container.
390:
1.5 casties 391: The format is based on the ECHO scheme for bibliographic data (cf.
392: content workflow), the MPIWG ``Projektbibliografie'' and the format of
393: the commonly used program ``EndNote''.
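
A \texttt{bib} container can be read generically, independent of its type. The following Python sketch returns the \texttt{type} attribute and a dictionary of the contained fields. The function name \texttt{read\_bib} is hypothetical, and collecting repeated tags into lists is an assumption; the text does not say which fields may repeat.

```python
import xml.etree.ElementTree as ET


def read_bib(bib: ET.Element):
    """Return (type attribute, {tag: [values, ...]}) for a bib container."""
    fields = {}
    for child in bib:
        # Repeated tags are collected in document order.
        fields.setdefault(child.tag, []).append((child.text or "").strip())
    # The attribute is kept apart from the fields because some types
    # (e.g. report, correspondence) also define a "type" tag.
    return bib.get("type"), fields
```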
394:
1.4 casties 395:
396: \subsubsection{Book}
397:
398: \begin{description}
399:
400: \item [bib type="book"] a published book.
401:
402: \begin{description}
403: \item [author] The author of the book.
404: \item [year] The year of publication.
405: \item [title] Title of the book.
406: \item [series-editor] Name of the series editor, if the book appears
407: in a series.
408: \item [series-title] Title of the series, if the book appears in a
409: series.
410: \item [series-volume] Volume number, if the book appears in a
411: series.
412: \item [number-of-pages] Number of pages of the entire book.
413: \item [city] City where the book was published.
414: \item [publisher] Name of the publishing company
415: \item [edition] Edition of the book (e.g. third edition)
416: \item [number-of-volumes] Number of volumes, if the book is
417: published in multiple volumes.
418: \item [translator] Name of the translator.
419: \item [isbn-issn]
1.18 casties 420: \item[call-number] Call number in holding library
421: \item[holding-library] Holding library
1.4 casties 422: \end{description}
423: \end{description}
424:
425: \subsubsection{In Book}
426:
427: \begin{description}
428: \item [bib type="inbook"] an article as part of a book.
429:
430: \begin{description}
431: \item [author] The author of the article.
432: \item [year] The year of publication.
433: \item [title] Title of the article.
434: \item [editor] Name of the book's editor.
435: \item [book-title] Title of the book.
436: \item [series-volume] Volume number, if the book appears in a
437: series.
438: \item [pages] Number of pages of the article.
439: \item [city] City where the book was published.
440: \item [publisher] Name of the publishing company
441: \item [edition] Edition of the book (e.g. third edition)
442: \item [series-author] Name of the series editor, if the book appears
443: in a series.
444: \item [series-title] Title of the series, if the book appears in a
445: series.
446: \item [number-of-volumes] Number of volumes, if the book is
447: published in multiple volumes.
448: \item [translator] Name of the translator
449: \item [isbn-issn]
1.18 casties 450: \item[call-number] Call number in holding library
451: \item[holding-library] Holding library
1.4 casties 452: \end{description}
453: \end{description}
454:
455: \subsubsection{Proceedings}
456:
457: \begin{description}
458: \item [bib type="proceedings"] a conference proceedings publication.
459:
460: \begin{description}
461: \item [author] The author of the article.
462: \item [year] The year of publication.
463: \item [title] Title of the article.
464: \item [editor] Name of the book's editor.
465: \item [conference-name] Name of the conference the proceedings are
466: related to.
467: \item [volume] Volume number.
468: \item [pages] Number of pages of the article.
469: \item [date] Date of the conference the proceedings are related to.
470: \item [conference-location] City where the conference was held.
471: \item [publisher] Name of the publishing company
472: \item [edition] Edition of the book (e.g. third edition)
473: \item [series-editor] Name of the series editor, if the book appears
474: in a series.
475: \item [series-title] Title of the series, if the book appears in a
476: series.
477: \item [number-of-volumes] Number of volumes, if the book is
478: published as multiple volumes.
479: \item [isbn-issn]
1.18 casties 480: \item[call-number] Call number in holding library
481: \item[holding-library] Holding library
1.4 casties 482: \end{description}
483: \end{description}
484:
485: \subsubsection{Edited Book}
486:
487: \begin{description}
488: \item[bib type="edited-book"] a book that is the edition of another
489: work.
490:
491: \begin{description}
492: \item [editor] Name of the editor of the book.
493: \item [year] The year of publication.
494: \item [title] Title of the book.
495: \item [series-editor] Name of the editor of the series the book is
496: part of.
497: \item [series-title] Title of the series, if the book is part of a
498: series.
499: \item [series-volume] Volume number, if the book appears in a series.
500: \item [number-of-pages] Number of pages of the book.
501: \item [city] City where the book was published.
502: \item [publisher] Name of the publishing company
503: \item [edition] Information about the edition (e.g. ``Repr. of the London ed. 1652'')
504: \item [number-of-volumes] Number of volumes, if the book is
505: published as multiple volumes.
506: \item [isbn-issn]
1.18 casties 507: \item[call-number] Call number in holding library
508: \item[holding-library] Holding library
1.4 casties 509: \end{description}
510: \end{description}
511:
1.17 casties 512: \subsubsection{Journal Volume}
513:
514: \begin{description}
515: \item [bib type="journal-volume"] a volume of a scientific journal.
516: \begin{description}
517: \item [title] Name of the journal.
518: \item [editor] The editor of the journal.
519: \item [publisher] Name of the publishing company.
520: \item [city] City where the journal is published.
521: \item [year] The year of publication.
522: \item [volume] Volume number.
523: \item [number-of-pages] Number of pages of the volume.
524: \item [isbn-issn]
1.18 casties 525: \item[call-number] Call number in holding library
526: \item[holding-library] Holding library
1.17 casties 527: \end{description}
528: \end{description}
529:
1.4 casties 530: \subsubsection{Journal Article}
531:
532: \begin{description}
533: \item [bib type="journal-article"] an article in a scientific journal.
534: \begin{description}
535: \item [author] The author of the article.
536: \item [year] The year of publication.
537: \item [title] Title of the article.
538: \item [journal] Name of the journal.
539: \item [volume] Volume number, if the journal appears in a series.
540: \item [issue] Number of the issue the article is part of.
541: \item [pages] Number of pages of the article.
542: \item [alternate-journal] Alternate Journal
543: \item [isbn-issn]
1.18 casties 544: \item[call-number] Call number in holding library
545: \item[holding-library] Holding library
1.4 casties 546: \end{description}
547: \end{description}
548:
549: \subsubsection{Magazine Article}
550:
551: \begin{description}
552: \item [bib type="magazine-article"] an article in a popular magazine.
553: \begin{description}
554: \item [author] The author of the article.
555: \item [year] The year of publication.
556: \item [title] Title of the article.
557: \item [magazine] Name of the magazine.
558: \item [volume] Volume number, if the book appears in a series.
559: \item [issue-number] Number of the issue the article is part of.
560: \item [pages] Number of pages of the article.
561: \item [date] Date when the article appeared.
1.18 casties 562: \item[call-number] Call number in holding library
563: \item[holding-library] Holding library
1.4 casties 564: \end{description}
565: \end{description}
566:
567: \subsubsection{Newspaper Article}
568:
569: \begin{description}
570: \item [bib type="newspaper-article"] an article in a newspaper.
571: \begin{description}
572: \item [author] The author of the article.
573: \item [year] The year of publication.
574: \item [title] Title of the article.
575: \item [Newspaper] Name of the newspaper the article appeared in.
576: \item [pages] Number of pages of the article.
577: \item [issue-date] Date of the issue the article is part of.
578: \item [city] City of the newspaper.
1.18 casties 579: \item[call-number] Call number in holding library
580: \item[holding-library] Holding library
1.4 casties 581: \end{description}
582: \end{description}
583:
584: \subsubsection{Thesis}
585:
586: \begin{description}
587: \item [bib type="thesis"] a master/doctorate/etc. thesis.
588: \begin{description}
589: \item [author] The author of the thesis.
590: \item [year] The year of publication.
591: \item [title] Title of the thesis.
592: \item [academic-department] Name of the academic department where
593: the thesis was handed in.
594: \item [number-of-pages] Number of pages of the thesis.
595: \item [city] City where the thesis was published.
596: \item [University] Name of the university where the thesis was
597: handed in.
598: \item [isbn-issn]
1.18 casties 599: \item[call-number] Call number in holding library
600: \item[holding-library] Holding library
1.4 casties 601: \end{description}
602: \end{description}
603:
604: \subsubsection{Report}
605:
606: \begin{description}
607: \item [bib type="report"] a scientific report.
608: \begin{description}
609: \item [author] The author of the report.
610: \item [year] The year of publication.
611: \item [title] Title of the report.
612: \item [pages] Number of pages of the report.
613: \item [date] Date when the report appeared.
614: \item [city] City where the report was published.
615: \item [institution] Institution where the report was produced.
616: \item [type] Type of report.
617: \item [report-number] Report number.
1.18 casties 618: \item[call-number] Call number in holding library
619: \item[holding-library] Holding library
1.4 casties 620: \end{description}
621: \end{description}
622:
1.5 casties 623: \subsubsection{Manuscript}
624:
625: \begin{description}
626: \item [bib type="manuscript"] a handwritten/typewritten manuscript.
627:
628: \begin{description}
629: \item [title] Title of the manuscript.
630: \item [author] The author of the text.
631: \item [location] Name of the library where the manuscript is
632: currently located.
633: \item [year] The year or century of publication.
634: \item [pages] Number of pages of the manuscript.
635: \item [signature] Signature of the manuscript.
636: \item [editorial-remarks] Remarks related to the online
637: publication of the manuscript. This could be notes about
638: annotations etc.
639: \item [description] This can be any kind of description.
640: \item [keywords] Keywords related to the manuscript.
1.18 casties 641: \item[call-number] Call number in holding library
642: \item[holding-library] Holding library
1.5 casties 643: \end{description}
644: \end{description}
645:
646:
1.19 casties 647: \subsubsection{Correspondence}
648:
649: \begin{description}
650: \item [bib type="correspondence"] a piece of correspondence e.g. letter, telegram, in the following called ``letter''
651:
652: \begin{description}
653: \item[type] The type of correspondence, e.g. ``letter'', ``postcard'', ``telegram'', ``letter draft''
654: \item [author] The author/sender of the letter.
655: \item [recipient] The recipient of the letter.
656: \item [date] normalised date of the letter.
657: \item [date-range-end] end of range of uncertain dating -- optional.
658: \item [date-original] the date in its original form as noted on the letter -- optional.
659: \item [place] place where the letter was written/sent.
660: \item [title] Title of the letter -- optional.
661: \item[incipit] The opening phrase of the letter -- optional.
662: \item[excipit] The closing phrase of the letter -- optional.
663: \item [pages] Number of pages of the manuscript.
664: \item [signature] Canonical signature/call number of the manuscript.
665: \item [description] This can be any kind of description.
666: \item [keywords] Keywords related to the manuscript.
667: \item[call-number] Call number in the current holding library
668: \item[holding-library] current holding library
669: \end{description}
670: \end{description}
671:
672:
1.4 casties 673: \subsubsection{Generic}
674:
675: \begin{description}
676: \item [bib type="generic"] a generic bibliographic type. This type
677: should only be used in rare cases.
678: \begin{description}
679: \item [author]
680: \item [year]
681: \item [title]
682: \item [secondary-author]
683: \item [secondary-title]
684: \item [volume]
685: \item [number]
686: \item [pages]
687: \item [date]
688: \item [place-published]
689: \item [publisher]
690: \item [edition]
691: \item [tertiary-author]
692: \item [tertiary-title]
693: \item [number-of-volumes]
694: \item [type-of-work]
695: \item [subsidiary-author]
696: \item [alternate-title]
697: \item [isbn-issn]
698: \item [call-number]
699: \item [label]
700: \item [keywords]
701: \item [abstract]
702: \item [notes]
703: \item [url]
1.5 casties 704: \end{description}
1.4 casties 705: \end{description}
706:
707:
708: \subsection{Architectural drawings}
709: \label{sec:doc}
710:
711: Specific information for architectural drawings is presented in a
1.5 casties 712: \texttt{doc} container with an additional \texttt{type} attribute
713: giving the type of drawing. All elements inside the container can
714: appear multiple times.
1.4 casties 715:
716: \begin{description}
1.5 casties 717:
718: \item[doc type="Architectural Drawing"] architectural drawing.
719:
720: \begin{description}
721: \item [person] Last name and first name of a person, separated by a
722: comma. A further common name for the person can be put in front,
723: separated by a semicolon.
724: \item [location] Name of a place in its common notation. This can be
725: a city or an institution.
726: \item [date] This can be a year (or several years, separated by
727: commas) or a period (1706-1714). Years are noted with four digits.
728: \item [object] Short description of an object or signatures.
729: \item [keywords] Keywords related to the object.
730: \end{description}
1.4 casties 731: \end{description}
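
The \texttt{date} notation described above (four-digit years, several years separated by commas, or a period such as 1706-1714) can be parsed mechanically. A minimal Python sketch follows. The function name \texttt{parse\_doc\_date} is hypothetical, and accepting a mix of single years and periods in one value is an assumption that goes slightly beyond the text.

```python
import re

_YEAR = r"\d{4}"  # years are noted with four digits


def parse_doc_date(text: str):
    """Return a list of (first_year, last_year) tuples for a date value."""
    spans = []
    for part in text.split(","):
        part = part.strip()
        match = re.fullmatch(rf"({_YEAR})(?:\s*-\s*({_YEAR}))?", part)
        if match is None:
            raise ValueError(f"not a four-digit year or period: {part!r}")
        first = int(match.group(1))
        # A single year is represented as a period of length one.
        last = int(match.group(2)) if match.group(2) else first
        spans.append((first, last))
    return spans
```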
1.1 casties 732:
733:
1.10 casties 734: \subsection{Document structure (table of contents)}
1.1 casties 735: \label{sec:toc}
736:
1.4 casties 737: Information on the structure of a document like the division into
738: parts and chapters in the way of a table of contents is presented in a
739: \texttt{toc} container.
740:
741: The scheme allows multiple logical pages on a single page image,
742: as is often the case with scanned books or manuscripts. The scheme
743: also allows for ``loose'' numbering schemes with Roman, Arabic or
744: other page numbers, used consecutively or mixed, and for changes in the
745: numbering within the document.
746:
747: The flexibility comes from the fact that no additional assumptions
748: about the mapping between logical pages and page images are made in
749: the format. All mapping information is specified by the user.
750:
751: The logical page numbering or naming that can be presented to the user
752: is specified in the \texttt{name} tags while the physical numbering of
753: the page images is specified in the \texttt{index} or \texttt{url}
754: tags.
1.1 casties 755:
1.4 casties 756: \begin{description}
1.5 casties 757: \item[toc] container for document structure
758:
1.4 casties 759: \begin{description}
1.5 casties 760: \item[page] describes a single logical page
761:
762: \begin{description}
763: \item[name] the ``name'' of the logical page. This can be any string
764: like a page number (arabic, roman, etc.) or a special designation
765: like ``Table 5''.
766:
767: \item[index] the \texttt{digilib} index number\footnote{The index
768: number for digilib is the index in the alphabetical order of the
769: scan file names.} of the scan image of the page.
770:
771: \item[url] as an alternative to the \texttt{digilib} index number, the
772: full URL of the scan image of the page can be used.
773: \end{description}
1.4 casties 774:
1.5 casties 775: \item[chapter] describes a section or chapter of the text.
776: \texttt{chapter} elements can be nested.
1.1 casties 777:
1.4 casties 778: \begin{description}
1.5 casties 779: \item[name] the title of the chapter or section.
780:
781: \item[start] the beginning of a page range (usually the first page
782: of the chapter). The \texttt{start} element has an optional
783: \texttt{increment} attribute to indicate the number of logical
784: pages on a scan image.\footnote{This information is only needed by
785: additional tools that try to generate lists of all page and
786: image numbers.}
787:
788: \begin{description}
789: \item[name] the ``name'' of the first page (see \texttt{page}).
790:
791: \item[index] the index of the first page (see \texttt{page}).
792:
793: \item[url] the URL of the first page (see \texttt{page}).
794: \end{description}
795:
796: \item[end] the end of a page range (usually the last page of the
797: chapter).
798:
799: \begin{description}
800: \item[name] the ``name'' of the last page (see \texttt{page}).
801:
802: \item[index] the index of the last page (see \texttt{page}).
803:
804: \item[url] the URL of the last page (see \texttt{page}).
805: \end{description}
806:
807: \item[page] as an alternative (or in addition) to
808: \texttt{start}/\texttt{end} page ranges, single \texttt{page}
809: elements can be used inside \texttt{chapter}.
1.4 casties 810: \end{description}
811: \end{description}
812: \end{description}
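
As an illustration, a \texttt{toc} container for a short book might look
like the following sketch. The page names, chapter titles and index
numbers are invented for this example; only the element names are taken
from the description above.

\begin{small}
\begin{verbatim}
<toc>
  <page>
    <name>Title page</name>
    <index>1</index>
  </page>
  <chapter>
    <name>Preface</name>
    <start>
      <name>i</name>
      <index>3</index>
    </start>
    <end>
      <name>viii</name>
      <index>10</index>
    </end>
  </chapter>
  <chapter>
    <name>Part One</name>
    <start>
      <name>1</name>
      <index>11</index>
    </start>
    <end>
      <name>24</name>
      <index>34</index>
    </end>
    <page>
      <name>Table 5</name>
      <index>35</index>
    </page>
  </chapter>
</toc>
\end{verbatim}
\end{small}

Here the logical names (``i'' to ``viii'', ``1'' to ``24'') are what a
viewer would show to the user, while the \texttt{index} values select
the scan images.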
813:
814: %%\url{http://pythia.mpiwg-berlin.mpg.de/toolserver/TS_lise}
1.1 casties 815:
816:
1.12 casties 817: \subsection{Digital images}
1.1 casties 818: \label{sec:inform-scann-imag}
819:
820: Image files representing scanned images can have an \texttt{img}
821: container tag with information about the scan resolution and the size
822: of the original image. This information is used by the
823: \texttt{digilib} image viewing tool.
824:
825: One of the following three sets of tags is required:
826:
827: \begin{description}
1.5 casties 828: \item[img] digital image information.
1.1 casties 829:
1.5 casties 830: \begin{description}
1.12 casties 831: \item[original-size-x] The width of the original
832: image -- required. \\
833: The unit of measure can be given in the \texttt{unit} attribute;
834: the default is meter ``m''. The width to be considered is the
835: total width of the scanned area.
1.5 casties 836:
1.12 casties 837: \item[original-size-y] The height of the original image -- required.
1.5 casties 838:
1.12 casties 839: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
1.5 casties 840:
1.12 casties 841: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 842: \end{description}
1.1 casties 843: \end{description}
844:
845: or
846:
847: \begin{description}
1.5 casties 848: \item[img] digital image information.
849:
850: \begin{description}
851: \item[original-dpi-x] The resolution of the hi-res scan in its width
1.12 casties 852: in pixels per inch -- required.
1.1 casties 853:
1.5 casties 854: \item[original-dpi-y] The resolution of the hi-res scan in its height
1.12 casties 855: in pixels per inch -- required.
856:
857: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
858:
859: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 860: \end{description}
1.1 casties 861: \end{description}
862:
863: or
864:
865: \begin{description}
1.5 casties 866: \item[img] digital image information.
867:
868: \begin{description}
869: \item[original-dpi] The resolution of the hi-res scan in pixels per
1.12 casties 870: inch if the resolutions in width and height are the same -- required.
871:
872: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
873:
874: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 875: \end{description}
1.1 casties 876: \end{description}
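
The first two variants could be written as in the following sketch. The
numeric values are invented, and two points are assumptions not stated
above: that ``cm'' is an accepted value for \texttt{unit}, and that
\texttt{original-size-y} takes the same \texttt{unit} attribute as
\texttt{original-size-x}. The tags marked as deduced are left out
because they are not entered by the user.

\begin{small}
\begin{verbatim}
<!-- variant 1: size of the original (unit "cm" is an assumption) -->
<img>
  <original-size-x unit="cm">21.0</original-size-x>
  <original-size-y unit="cm">29.7</original-size-y>
</img>

<!-- variant 2: separate resolutions for width and height -->
<img>
  <original-dpi-x>300</original-dpi-x>
  <original-dpi-y>600</original-dpi-y>
</img>
\end{verbatim}
\end{small}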
1.7 casties 877:
878:
1.10 casties 879:
1.12 casties 880: \subsection{Digital image acquisition}
1.10 casties 881: \label{sec:inform-about-image}
882:
883: A description of the technology used in the process of producing a
884: digital image.
885:
886: \begin{description}
887: \item[image-acquisition] description of the image production process
888: \begin{description}
1.12 casties 889: \item[device] acquisition device (e.g. ``flatbed scanner'')
1.10 casties 890:
1.12 casties 891: \item[image-type] type and color-depth of the image -- required (e.g. ``RGB 24
1.10 casties 892: bit'')
893:
894: \item[production-comment] additional textual information about the
895: production process
896: \end{description}
897: \end{description}
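
A minimal sketch of such a container, reusing the example values given
above; the text of the \texttt{production-comment} is invented.

\begin{small}
\begin{verbatim}
<image-acquisition>
  <device>flatbed scanner</device>
  <image-type>RGB 24 bit</image-type>
  <production-comment>scanned from a bound volume</production-comment>
</image-acquisition>
\end{verbatim}
\end{small}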
898:
899:
1.12 casties 900:
1.7 casties 901: \subsection{Full text with images}
902: \label{sec:full-text-with}
903:
1.12 casties 904: Full text in an XML format should be specified with a
905: \texttt{content-type}\footnote{See section~\ref{tag-content-type}
906: on page~\pageref{tag-content-type}.} ``fulltext''.
1.8 casties 907:
908: The relation between the full text and optional images of
909: whole pages or parts of pages must be specified in a
1.20 ! casties 910: \texttt{texttool} container.
1.8 casties 911:
912: \begin{description}
1.20 ! casties 913: \item[texttool] representation of full text with images
! 914:
1.8 casties 915: \begin{description}
1.20 ! casties 916: \item[text] the file name of the full text file (with path
1.8 casties 917: inside document directory)
1.12 casties 918:
1.20 ! casties 919: \item[text-url-path] a characteristic part of the URL with which the
! 920: full text can be retrieved (the form and content of this element
! 921: is dependent on the specific text retrieval mechanism)
! 922:
! 923: \item[image] the directory name of the directory containig the
1.12 casties 924: page image files (with path inside document directory)
1.8 casties 925:
1.20 ! casties 926: \item[xslt] the file name of an additional XSL transformation
1.8 casties 927: file
928:
1.20 ! casties 929: \item[pagebreak] the name of the element that indicates page breaks
! 930: (default ``pb'')
1.8 casties 931: \end{description}
932: \end{description}
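
A \texttt{texttool} container might then look like the following
sketch. All file and directory names are hypothetical. The
\texttt{text-url-path} element is omitted because its content depends on
the text retrieval mechanism in use.

\begin{small}
\begin{verbatim}
<texttool>
  <text>fulltext/fleck.xml</text>
  <image>img</image>
  <xslt>fulltext/display.xsl</xslt>
  <pagebreak>pb</pagebreak>
</texttool>
\end{verbatim}
\end{small}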
1.7 casties 933:
1.1 casties 934:
935:
1.12 casties 936: \subsection{Copyright and access conditions}
937: \label{sec:access-conditions}
938:
939: If access to a resource is subject to conditions for technical or legal
940: reasons, the conditions can be put in an \texttt{access-conditions}
1.16 casties 941: container. Other usage conditions like copyright can also be
1.12 casties 942: documented in this container.
943:
944: \begin{description}
945: \item[access-conditions] legal and technical conditions for access to
946: this resource
947:
948: \begin{description}
949: \item[attribution] The name or institution this resource should be
950: attributed to when it's publicly presented
951:
952: \begin{description}
953: \item[name] a name (free text)
954:
955: \item[url] a URL (with an optional \texttt{label} attribute to show
956: as text)
1.18 casties 957:
958: \item[description] more information (free text, e.g. holding
959: library call number)
1.12 casties 960: \end{description}
961:
1.16 casties 962: \item[copyright] the copyright holder and its conditions
1.12 casties 963: \begin{description}
1.16 casties 964: \item[owner] the name of the copyright holder
1.12 casties 965: \begin{description}
966: \item[name] a name (free text)
967:
968: \item[url] a URL (with an optional \texttt{label} attribute to show
969: as text)
970: \end{description}
971:
972: \item[date] the date when the copyright was issued
973:
1.16 casties 974: \item[duration] the duration of the copyright term (if known)
1.12 casties 975:
976: \item[description] free-text field for special or additional
977: conditions
978: \end{description}
1.14 casties 979:
980:
981: \item[publish-metadata] metadata about this resource can be made
1.16 casties 982: freely available when this tag is present (otherwise metadata has
983: the same access conditions as the rest of the resource). Access to
984: the resource itself is regulated separately by the \texttt{access}
985: element.
1.12 casties 986:
1.16 casties 987: \item[access] conditions of access to this resource. Different
988: access types are specified by a \texttt{type} attribute:
1.12 casties 989: \begin{description}
1.16 casties 990: \item[type=group] access restricted to the members of this named
991: group. The method to identify a user belonging to a named group
992: is not specified in this document.
993: \begin{description}
994: \item[name] name of the group.
995:
996: \item[only-before] the access condition is only valid before the
997: given date (format: ``YYYY/MM/DD'').
998:
999: \item[only-after] the access condition is only valid after the
1000: given date (format: ``YYYY/MM/DD'').
1001: \end{description}
1002:
1003: \item[type=institution] access restricted to the members of this
1004: institution. The method to identify a user belonging to the
1005: institution is not specified in this document.
1.12 casties 1006: \begin{description}
1.16 casties 1007: \item[name] name of the institution.
1008:
1009: \item[only-before] the access condition is only valid before the
1010: given date (format: ``YYYY/MM/DD'').
1011:
1012: \item[only-after] the access condition is only valid after the
1013: given date (format: ``YYYY/MM/DD'').
1014: \end{description}
1015:
1016:
1017: \item[type=subnet] access restricted to all computers with an
1018: IP-address in this subnet.
1019: \begin{description}
1020: \item[range] subnet range defined in
1021: truncated-quad (e.g. ``141.14''), network-netmask
1022: (e.g. ``141.14.0.0/255.255.0.0''), or network-range
1023: (e.g. ``141.14.0.0/16'') notation.
1024:
1025: \item[only-before] the access condition is only valid before the
1026: given date (format: ``YYYY/MM/DD'').
1027:
1028: \item[only-after] the access condition is only valid after the
1029: given date (format: ``YYYY/MM/DD'').
1030: \end{description}
1031:
1.12 casties 1032:
1.16 casties 1033: \item[type=scientific] access to this resource should be restricted to
1034: scientific work
1035: \begin{description}
1036: \item[only-before] the access condition is only valid before the
1037: given date (format: ``YYYY/MM/DD'').
1038:
1039: \item[only-after] the access condition is only valid after the
1040: given date (format: ``YYYY/MM/DD'').
1.12 casties 1041: \end{description}
1.16 casties 1042:
1.12 casties 1043:
1.16 casties 1044: \item[type=free] access to this resource is not restricted
1045: \begin{description}
1046: \item[only-before] the access condition is only valid before the
1047: given date (format: ``YYYY/MM/DD'').
1.12 casties 1048:
1.16 casties 1049: \item[only-after] the access condition is only valid after the
1050: given date (format: ``YYYY/MM/DD'').
1051: \end{description}
1052:
1.12 casties 1053:
1.16 casties 1054: \item[type=special] if none of the above conditions seems appropriate,
1.12 casties 1055: a free-form text can be specified here.
1.16 casties 1056: \begin{description}
1057: \item[description] description of special access conditions.
1058:
1059: \item[only-before] the access condition is only valid before the
1060: given date (format: ``YYYY/MM/DD'').
1061:
1062: \item[only-after] the access condition is only valid after the
1063: given date (format: ``YYYY/MM/DD'').
1064: \end{description}
1065:
1.12 casties 1066: \end{description}
1067: \end{description}
1068: \end{description}
1069:
1070: \noindent
1.16 casties 1071: It should be noted that control over access to the resource has to be
1072: provided by additional technical measures. Access conditions in the
1073: metadata file only state that conditions \emph{should} be observed; it
1074: is not implied that they \emph{are} necessarily observed, as the
1075: enforcement of conditions depends on additional measures.
1.12 casties 1076:
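The following sketch shows how an \texttt{access-conditions} container
could combine these elements. All names, URLs and dates are invented.
Three points are assumptions that the description above leaves open:
that \texttt{publish-metadata} is written as an empty element, that
several \texttt{access} elements may appear side by side, and that the
copyright \texttt{date} is given as a plain year.

\begin{small}
\begin{verbatim}
<access-conditions>
  <attribution>
    <name>Example Library</name>
    <url label="Example Library">http://library.example.org/</url>
    <description>Call number Ms. 123</description>
  </attribution>
  <copyright>
    <owner>
      <name>Example Library</name>
    </owner>
    <date>1998</date>
    <description>Reproduction only with written permission.</description>
  </copyright>
  <publish-metadata/>
  <access type="subnet">
    <range>141.14.0.0/16</range>
    <only-before>2010/01/01</only-before>
  </access>
  <access type="free">
    <only-after>2010/01/01</only-after>
  </access>
</access-conditions>
\end{verbatim}
\end{small}

In this reading, the resource is limited to one subnet until the given
date and unrestricted afterwards, while its metadata may be published
at any time.
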
1077:
1078:
1079: \subsection{Acquisition of raw data}
1080: \label{sec:acqu-inform}
1081:
1082: Information about the acquisition source for raw data resources can be
1083: provided in an \texttt{acquisition} container.
1084:
1085: \begin{description}
1086: \item[acquisition] the acquisition source of this resource -- required
1087: for raw data.
1088: \begin{description}
1089: \item[provider] where this resource came from -- required
1090: \begin{description}
1091: \item[name] free-text name of the provider (institution or
1092: individual)
1093:
1094: \item[address] address of the provider
1095:
1096: \item[contact] contact person at the provider (i.e. name and email)
1097:
1098: \item[url] URL related to the provider
1.13 casties 1099:
1100: \item[provider-id] id of the provider (internally used) -- deduced
1.12 casties 1101: \end{description}
1102:
1103: \item[date] date of acquisition -- required
1104:
1105: \item[description] free-text description of the acquisition source or
1106: additional information
1107: \end{description}
1108: \end{description}
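
A sketch of an \texttt{acquisition} container with invented provider
data follows. The date format is an assumption borrowed from the
``YYYY/MM/DD'' format used for access conditions; no format is
prescribed for this element. The \texttt{provider-id} is left out
because it is deduced.

\begin{small}
\begin{verbatim}
<acquisition>
  <provider>
    <name>Example University Library</name>
    <address>1 Example Street, Exampletown</address>
    <contact>Jane Doe, jane.doe@example.org</contact>
    <url>http://library.example.org/</url>
  </provider>
  <date>2003/05/12</date>
  <description>Delivered on two external hard disks.</description>
</acquisition>
\end{verbatim}
\end{small}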
1109:
1110:
1111:
1112: \subsection{Documentary films}
1113: \label{sec:documentary-films}
1114:
1115: Documentary films can be described using a \texttt{film-acquisition}
1116: container.
1117:
1118: \begin{description}
1119: \item[film-acquisition] description of a (documentary) film --
1120: required for documentary film
1121: \begin{description}
1122: \item[recording] specification of the recording process
1123: \begin{description}
1124: \item[author] the person or persons doing the recording
1125:
1126: \item[date] the date or time span when the film was recorded
1127:
1128: \item[location] the place where the film was recorded
1129:
1130: \item[device] recording device used (e.g. ``Sony CP-DV8 Camcorder'')
1131:
1132: \item[format] format of the recorded film -- required (e.g. ``DV
1133: 720x524 25fps interlaced'')
1134: \end{description}
1135:
1136: \item[description] free-form description of the recording and the
1137: content of the film
1138: \end{description}
1139: \end{description}
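
For illustration, a \texttt{film-acquisition} container could look like
this sketch. The device and format strings are the examples given
above; author, date, location and description are invented.

\begin{small}
\begin{verbatim}
<film-acquisition>
  <recording>
    <author>Jane Doe</author>
    <date>2002</date>
    <location>Berlin</location>
    <device>Sony CP-DV8 Camcorder</device>
    <format>DV 720x524 25fps interlaced</format>
  </recording>
  <description>Interview recorded in the institute library.</description>
</film-acquisition>
\end{verbatim}
\end{small}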
1140:
1141: (More information about the digitization step could be added in a
1142: \texttt{digitization} tag similar to the \texttt{recording} tag.)
1143:
1.1 casties 1144:
1145:
1146:
1.4 casties 1147: \section{Sample metadata files for ECHO resources}
1.1 casties 1148:
1.5 casties 1149: The following is a sample metadata index file for a directory containing a
1150: scanned document.
1151:
1152: \begin{small}
1.1 casties 1153: \begin{verbatim}
1.11 casties 1154: <resource type="ECHO" version="1.0">
1.5 casties 1155: <description>Fleck, 1980</description>
1156: <name>fleck.1980</name>
1157: <creator>University of Bern</creator>
1158: <archive-path>ubern/wiss-theorie</archive-path>
1159: <content-type>scanned images</content-type>
1160: <meta>
1161: <dri>echo23a45e2329x</dri>
1162: <lang>ger</lang>
1163: <bib type="book">
1164: <author>Fleck, Ludwik</author>
1165: <year>1980</year>
1166: <title>Entstehung und Entwicklung einer
1167: wissenschaftlichen Tatsache</title>
1168: <series-editor></series-editor>
1169: <series-title></series-title>
1170: <series-volume></series-volume>
1171: <number-of-pages></number-of-pages>
1172: <city>Frankfurt am Main</city>
1173: <publisher>Suhrkamp</publisher>
1174: <edition></edition>
1175: <number-of-volumes></number-of-volumes>
1176: <translator></translator>
1177: <isbn-issn></isbn-issn>
1178: <keywords>Wissenschaftstheorie, Fleck, Tatsache</keywords>
1179: <abstract></abstract>
1180: </bib>
1181: </meta>
1182: <dir>
1183: <description>Scanned images (300dpi)</description>
1184: <name>img</name>
1185: </dir>
1.4 casties 1186: </resource>
1187: \end{verbatim}
1.5 casties 1188: \end{small}
1.4 casties 1189:
1.5 casties 1190: The following is a sample metadata file for a single image of an
1191: architectural drawing.
1.4 casties 1192:
1.5 casties 1193: \begin{small}
1.4 casties 1194: \begin{verbatim}
1.11 casties 1195: <resource type="ECHO" version="1.0">
1.5 casties 1196: <creator>Bibliotheca Hertziana</creator>
1197: <content-type>scanned images</content-type>
1198: <file>
1199: <name>00000271-asl-160-r-full.tif</name>
1200: <meta>
1201: <img>
1202: <original-dpi>315</original-dpi>
1203: </img>
1204: <dri>echo45a67bc4367d</dri>
1205: <lang>ita</lang>
1206: <doc type="Architectural Drawing">
1207: <person>Ciolli, Giacomo</person>
1208: <person>Urban VIII; Barberini, Maffeo</person>
1209: <location>Accademia di San Luca</location>
1210: <location>Roma</location>
1211: <date>1706</date>
1212: <object>Concorso Clementino</object>
1213: <object>Fontana Pubblica</object>
1214: <object>Brunnen</object>
1215: <object>ASL 160</object>
1216: <keywords></keywords>
1217: </doc>
1218: <context>
1219: <url>http://colosseum.biblhertz.it:8080/Lineamenta/
1220: 1033478408.39/1035196181.35/1035196204.09/1035394121.83
1221: </url>
1222: </context>
1223: </meta>
1224: </file>
1.2 casties 1225: </resource>
1.1 casties 1226: \end{verbatim}
1.5 casties 1227: \end{small}
1.1 casties 1228:
1229: \end{document}
1230:
1231: %%% Local Variables:
1232: %%% mode: latex
1233: %%% TeX-master: t
1234: %%% End: