Annotation of storage/meta/meta-format.tex, revision 1.15
1.1 casties 1: \documentclass[a4paper]{article}
2:
3: \usepackage[latin1]{inputenc}
4: \usepackage[T1]{fontenc}
5: \usepackage{ae}
6: %\usepackage{times}
7: %\usepackage{courier}
8:
9: % create in-text links black (with PDF)
1.6 casties 10: \usepackage[colorlinks=true,linkcolor=black]{hyperref}
1.1 casties 11: % Format URLs nicely (without PDF)
1.6 casties 12: %\usepackage{url}
1.1 casties 13:
14:
15: \title{A simple metadata format for resource bundles}
16:
1.4 casties 17: \author{Robert Casties, Dirk Wintergrün, Hans-Christoph Liess}
1.1 casties 18:
1.15 ! casties 19: \date{V1.2 of 16.7.2004}
1.1 casties 20:
21: \begin{document}
22:
23: \maketitle
24:
25: \tableofcontents
26:
27:
28: \section{File and directory names}
29: \label{sec:file-directory-names}
30:
31: File and directory names should not contain spaces. Allowed characters
32: in filenames are only the alphanumeric set a-z, A-Z, 0-9, hyphen
33: ``-'', underscore ``\_'' and dot ``.''.
34:
1.12 casties 35: Files and directories with names that contain illegal characters must
36: be transformed to allowed names. A proposed simple transformation
37: rule is:
38:
39: \begin{itemize}
40: \item whitespace characters (e.g. blank, tab, cr, lf) are replaced by
41: hyphens ``-''
42:
43: \item other illegal characters are replaced by underscores ``\_''.
44: \end{itemize}
45:
46: This rule does not provide a reversible mapping to the original
47: illegal file name and it does not provide a collision-free mapping,
48: i.e. two different illegal file names might be mapped to the same
49: allowed file name. Additional precautions for these cases must be
50: taken.
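
\noindent As an illustration of the proposed rule, the following
mappings result for a few file names that are invented for this
example (\texttt{<TAB>} stands for a tab character):

\begin{verbatim}
my file (1).tif   ->  my-file-_1_.tif
Ueberblick#2.txt  ->  Ueberblick_2.txt
a b.tif           ->  a-b.tif
a<TAB>b.tif       ->  a-b.tif
\end{verbatim}

\noindent The last two lines show a collision: two different original
names are mapped to the same allowed name. The original name can be
recorded in an \texttt{original-name} element (see
section~\ref{sec:mpiwg-doc}).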
1.1 casties 51:
1.4 casties 52:
53: \section{Metadata files}
54: \label{sec:metadata-files}
55:
56: The metadata information is stored in the XML format documented below
57: in special files in the resource directory. Two forms of metadata
58: files are possible:
59: \begin{itemize}
60: \item a file named \texttt{index.meta} in a directory.
61:
62: \item a file named like the data file it describes with an
63: additional extension \texttt{.meta}. For example, metadata for the
64: file \texttt{0001.tif} would be in a file \texttt{0001.tif.meta}.
65: \end{itemize}
66:
67: The resource directory must contain an \texttt{index.meta} file with
68: information about the resource as a whole. Other directories can
69: contain \texttt{index.meta} files.
70:
71: Additional information about single data files that are part of the
72: resource can either be put in \texttt{file} tags in the
73: \texttt{index.meta} file or in separate \emph{filename}\texttt{.meta}
74: files for each data file. Information from the directory level file is
75: inherited at the file level.
76:
77:
1.1 casties 78: \section{Resource format}
79: \label{sec:mpiwg-doc}
80:
81: In this description elements marked ``optional'' need not be supplied
82: by the provider of the resource and may be absent in all versions of
83: the metadata file. Elements marked ``required'' must be supplied by
84: the provider of the resource. Elements marked ``deduced'' can be
85: supplied by the provider of the resource but can also be provided by
1.4 casties 86: automatic scripts later in the process; these elements must be present
1.1 casties 87: in the final file.
88:
1.12 casties 89: File and directory paths in the metadata file use the conventional
90: Unix file separator slash ``/''.
91:
1.11 casties 92: The outer container element is \texttt{resource}. It has the following
93: \textbf{attributes}:
94:
95: \begin{description}
1.12 casties 96: \item[type] sub-type of resource (e.g. ``ECHO'', ``MPIWG'') --
97: optional.
1.11 casties 98:
1.12 casties 99: \item[version] version number of metadata format (currently 1.1) --
1.11 casties 100: required.
101: \end{description}
102:
103: \noindent The allowed \textbf{elements} inside \texttt{resource} are:
1.1 casties 104:
105: \begin{description}
1.14 casties 106: \item[description] An informal textual description of the resource --
107: optional\footnote{At least one description of the resource's content
108: is required. The description can be an informal
109: \texttt{description} element or a descriptive element (like
110: \texttt{bib}) in a \texttt{meta} container.}.
1.1 casties 111:
112: \item[name] The filename of the resource (name of the directory this
113: file is contained in) -- required.
114:
115: \item[creator] The name of the project or person that created the
116: resource -- optional.
1.4 casties 117:
118: \item[archive-creation-date] The time and date the archive collection
119: was created -- deduced.
1.1 casties 120:
1.4 casties 121: \item[archive-storage-date] The time and date the archive was written
122: to permanent storage -- deduced (must not be set by the user).
1.1 casties 123:
124: \item[archive-path] The full path to the resource directory inside the
1.5 casties 125: whole archive collection, including the resource directory -- deduced.
1.12 casties 126:
127: \item[archive-id] The ID for this document in the archive --
128: required.
1.1 casties 129:
130: \item[derived-from] Container for the description of the original
131: resource if this resource is a modified version of another resource
132: -- optional.
133:
134: \begin{description}
1.12 casties 135: \item[archive-id] The ID of the original resource
136: -- required.
137:
1.1 casties 138: \item[archive-path] The full path to the original resource
1.12 casties 139: -- deduced.
1.1 casties 140:
141: \item[description] An informal textual description of the relation
142: of this resource to the original resource -- optional.
143: \end{description}
144:
145: \item[linked-with] Container for the description of another
146: resource when this resource is a linked copy of another resource
147: -- optional.
148:
149: \begin{description}
1.12 casties 150: \item[archive-id] The ID of the linked resource
151: -- required.
152:
1.1 casties 153: \item[archive-path] The full path to the linked resource
1.12 casties 154: -- deduced.
1.1 casties 155:
156: \item[description] An informal textual description of the relation
157: of this resource to the linked resource -- optional.
158: \end{description}
159:
1.12 casties 160: \item[media-type] \label{tag-media-type} The main media type of this
161: resource -- required.\\ The main media type can be overridden by
162: \texttt{media-type}s in subdirectories. Possible types are
163: \begin{itemize}
164: \item \texttt{image}
165:
166: \item \texttt{text}
167:
168: \item \texttt{audio}
169:
170: \item \texttt{video}
171:
172: \item \texttt{data} for other types of data
173: \end{itemize}
1.1 casties 174:
175: \item[meta] Additional metadata information about the resource --
176: optional.\\ For a description of additional metadata see below.
177:
178: \item[dir] Container for the description of a subdirectory -- required
179: (when there are subdirectories).\\ \texttt{dir} tags should not be
180: nested. Directories at lower levels are identified by their
181: \texttt{path}.
182:
183: \begin{description}
184: \item[description] An informal textual description of the
185: subdirectory -- optional.
186:
187: \item[name] The name of the subdirectory -- required.
188:
1.12 casties 189: \item[original-name] A text string associated with the directory as
190: original name -- optional. (E.g. if the data in this directory
191: came from an external source and had a name that had to be changed
192: according to section~\ref{sec:file-directory-names} but it should
193: be possible to reference the original name.)
194:
1.1 casties 195: \item[path] The directory path of this subdirectory relative to the
1.5 casties 196: resource's root directory (excluding the directory itself) --
197: required (may be empty or omitted if the directory is a direct
198: child of the resource's root directory).
1.1 casties 199:
200: \item[meta] Additional metadata information about the directory --
201: optional.\\ For a description of additional metadata see below.
202: \end{description}
203:
204: \item[file] Container for the description of a file -- deduced.\\
205: \texttt{file} tags should not be nested in \texttt{dir} tags. Files
206: at lower directory levels are identified by their \texttt{path}.
207:
208: \begin{description}
209: \item[description] An informal textual description of the
210: file -- optional.
211:
212: \item[name] The name of the file -- required.
213:
1.12 casties 214: \item[original-name] A text string associated with the file as
215: original name -- optional. (E.g. if this file came from an
216: external source and had a name that had to be changed according to
217: section~\ref{sec:file-directory-names} but it should be possible
218: to reference the original name.)
219:
1.1 casties 220: \item[path] The directory path of this file relative to the
1.5 casties 221: resource's root directory (excluding the file itself) -- required
222: (may be empty or omitted if the file is in the resource's root
223: directory).
1.7 casties 224:
225: \item[date] The file's modification or creation date\footnote{The
226: preferred time and date format is ``YYYY/MM/DD HH:MM:SS''},
227: whichever is more recent -- optional.
1.1 casties 228:
229: \item[modification-date] The file's modification date -- optional.
230:
231: \item[creation-date] The file's creation date -- optional.
1.7 casties 232:
1.1 casties 233: \item[size] The file size -- deduced.
234:
235: \item[mime-type] The file's mime-type -- optional.
236:
237: \item[md5cs] MD5 checksum of the file content -- optional.
238:
239: \item[meta] Additional metadata information about the file --
240: optional. For a description of additional metadata see below.
241: \end{description}
242:
243: \end{description}
244:
245:
246:
247: \section{Additional metadata}
248: \label{sec:additional-metadata}
249:
250: All elements with \texttt{meta} tags can contain an arbitrary number
1.12 casties 251: of the following additional metadata elements.
252:
253: \subsection{workflow state}
254: \label{sec:workflow-state}
255:
256: All additional metadata elements can have a \texttt{workflow-state}
257: \textbf{attribute}. This attribute reflects the state of the
258: corresponding metadata element. The possible values for the
259: \texttt{workflow-state} attribute are
260: \begin{itemize}
261: \item \texttt{preliminary} this information is preliminary. It must
262: be checked in further workflow steps.
263:
264: \item \texttt{inwork}
265:
266: \item \texttt{final}
267: \end{itemize}
268:
269: Workflow states other than \texttt{preliminary} are part of the
270: workflow handling of the respective projects.
271:
272: Metadata elements can appear multiple times with different
273: \texttt{workflow-state} attributes. This enables metadata versioning.
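
\noindent As a sketch, the same metadata element could appear twice
with different workflow states. The \texttt{lang} element (see
section~\ref{sec:lang}) and its values are chosen only for
illustration:

\begin{verbatim}
<meta>
  <lang workflow-state="preliminary">la</lang>
  <lang workflow-state="final">it</lang>
</meta>
\end{verbatim}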
274:
275:
276:
277: \subsection{Content type}
278: \label{sec:content-type}
279:
280: \begin{description}
281: \item[content-type] \label{tag-content-type} The content type of this
282: resource -- required.\\
283: The content type enables the choice of tools to manipulate and
284: display the resource. There should be a common list of content
285: types. For digital documents (books, manuscripts) this would be
286: ``scanned document'', for other image data ``scanned
287: images''.\footnote{The criterion for documents is an ordered
288: succession of image files (pages) and equal image size and
289: resolution throughout the images of a resource.}
290: \end{description}
291:
292:
1.1 casties 293:
1.4 casties 294: \subsection{Language}
295: \label{sec:lang}
296:
297: The language of a resource (e.g. a text) can be specified with a
298: \texttt{lang} tag. Languages have to be described using the
299: international codes for the representation of names of languages
300: either in two-letter form (ISO 639-1) or in three-letter form (ISO
301: 639-2). The entire catalogue of languages is documented on the page
302:
303: \url{http://www.loc.gov/standards/iso639-2/englangn.html}
304:
1.1 casties 305:
306: \subsection{DRI}
307: \label{sec:dri}
308:
309: The \emph{digital resource identifier} for the resource is specified
1.4 casties 310: in a \texttt{dri} element. Digital resource identifiers are documented
1.1 casties 311: on the page
312:
313: \url{http://pythia.mpiwg-berlin.mpg.de/projects/standards/dri}.
314:
315:
1.4 casties 316:
317: \subsection{Collection context}
318: \label{sec:collection-context}
319:
1.15 ! casties 320: The context of a resource as part of a collection or part of a project
! 321: can be specified in the \texttt{context} element. The context element
! 322: can appear multiple times if the resource is part of multiple
! 323: collections or projects.
1.4 casties 324:
325: \begin{description}
1.5 casties 326: \item[context] information on collection or project context.
1.4 casties 327:
1.5 casties 328: \begin{description}
1.15 ! casties 329: \item[link] URL to additional context information -- optional.
1.5 casties 330:
1.15 ! casties 331: \item[name] Textual description of project or collection -- optional.
! 332:
! 333: \item[meta-datalink] description of external sources of canonical meta
! 334: information -- optional
! 335: \begin{description}
! 336: \item[db] \textbf{attribute} to identify different sets of meta data
! 337: links to the same resource -- optional
! 338:
! 339: \item[object] \textbf{attribute} to identify different objects or
! 340: parts of the same resource -- optional
! 341:
! 342: \item[label] textual label for the link -- optional
! 343:
! 344: \item[url] URL to present to the client -- optional
! 345:
! 346: \item[metadata-url] URL to an external server to be queried -- optional
! 347: \end{description}
! 348:
! 349: \item[meta-baselink] description of external server for canonical meta
! 350: information -- optional
! 351: \begin{description}
! 352: \item[db] \textbf{attribute} to identify different sets of meta data
! 353: links to the same resource -- optional
! 354:
! 355: \item[label] textual label for the link -- optional
! 356:
! 357: \item[url] URL to present to the client -- optional
! 358:
! 359: \item[metadata-url] URL to an external server to be queried --
! 360: required (the parameter \texttt{object=} with an object id has
! 361: to be appended to this URL)
! 362: \end{description}
1.5 casties 363: \end{description}
1.4 casties 364: \end{description}
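
\noindent A \texttt{context} element with a \texttt{meta-baselink}
might be written as in the following sketch. The server name, the
\texttt{db} value and the labels are invented for illustration.

\begin{verbatim}
<context>
  <link>http://www.example.org/project/</link>
  <name>Example Project</name>
  <meta-baselink db="catalogue">
    <label>Catalogue entry</label>
    <metadata-url>http://www.example.org/query?</metadata-url>
  </meta-baselink>
</context>
\end{verbatim}

\noindent For a hypothetical object id \texttt{obj42}, appending the
\texttt{object=} parameter to the \texttt{metadata-url} would give
\texttt{http://www.example.org/query?object=obj42}.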
1.5 casties 365:
1.4 casties 366:
367:
368:
1.1 casties 369: \subsection{Bibliographic information}
370: \label{sec:bibliographic-data}
371:
1.5 casties 372: Bibliographic information is presented in a \texttt{bib} container with
1.1 casties 373: a \texttt{type} parameter, giving the type of bibliographic resource.
1.4 casties 374: The \texttt{type} field can be repeated as a tag in the container.
375:
1.5 casties 376: The format is based on the ECHO scheme for bibliographic data (cf.
377: content workflow), the MPIWG ``Projektbibliografie'' and the format of
378: the commonly used program ``EndNote''.
379:
1.4 casties 380:
381: \subsubsection{Book}
382:
383: \begin{description}
384:
385: \item [bib type="book"] a published book.
386:
387: \begin{description}
388: \item [author] The author of the book.
389: \item [year] The year of publication.
390: \item [title] Title of the book.
391: \item [series-editor] Name of the series editor, if the book appears
392: in a series.
393: \item [series-title] Title of the series, if the book appears in a
394: series.
395: \item [series-volume] Volume number, if the book appears in a
396: series.
397: \item [number-of-pages] Number of pages of the entire book.
398: \item [city] City where the book was published.
399: \item [publisher] Name of the publishing company
400: \item [edition] Edition of the book (e.g. third edition)
401: \item [number-of-volumes] Number of volumes, if the book is
402: published in multiple volumes.
403: \item [translator] Name of the translator.
404: \item [isbn-issn]
405: \end{description}
406: \end{description}
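
\noindent As a sketch, a \texttt{bib} container for a book inside a
\texttt{meta} element could look like this. The bibliographic values
are invented placeholders, not a real publication.

\begin{verbatim}
<meta>
  <bib type="book">
    <author>Example, Anna</author>
    <year>1900</year>
    <title>An example title</title>
    <city>Berlin</city>
    <publisher>Example Press</publisher>
    <number-of-pages>250</number-of-pages>
  </bib>
</meta>
\end{verbatim}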
407:
408: \subsubsection{In Book}
409:
410: \begin{description}
411: \item [bib type="inbook"] an article as part of a book.
412:
413: \begin{description}
414: \item [author] The author of the article.
415: \item [year] The year of publication.
416: \item [title] Title of the article.
417: \item [editor] Name of the book's editor.
418: \item [book-title] Title of the book.
419: \item [series-volume] Volume number, if the book appears in a
420: series.
421: \item [pages] Number of pages of the article.
422: \item [city] City where the book was published.
423: \item [publisher] Name of the publishing company
424: \item [edition] Edition of the book (e. g. third edition)
425: \item [series-author] Name of the series editor, if the book appears
426: in a series.
427: \item [series-title] Title of the series, if the book appears in a
428: series.
429: \item [number-of-volumes] Number of volumes, if the book is
430: published in multiple volumes.
431: \item [translator] Name of the translator
432: \item [isbn-issn]
433: \end{description}
434: \end{description}
435:
436: \subsubsection{Proceedings}
437:
438: \begin{description}
439: \item [bib type="proceedings"] a conference proceedings publication.
440:
441: \begin{description}
442: \item [author] The author of the article.
443: \item [year] The year of publication.
444: \item [title] Title of the article.
445: \item [editor] Name of the book's editor.
446: \item [conference-name] Name of the conference the proceedings are
447: related to.
448: \item [volume] Volume number.
449: \item [pages] Number of pages of the article.
450: \item [date] Date of the conference the proceedings are related to.
451: \item [conference-location] City where the conference was held.
452: \item [publisher] Name of the publishing company
453: \item [edition] Edition of the book (e. g. third edition)
454: \item [series-editor] Name of the series editor, if the book appears
455: in a series.
456: \item [series-title] Title of the series, if the book appears in a
457: series.
458: \item [number-of-volumes] Number of volumes, if the book is
459: published as multiple volumes.
460: \item [isbn-issn]
461: \end{description}
462: \end{description}
463:
464: \subsubsection{Edited Book}
465:
466: \begin{description}
467: \item[bib type="edited-book"] a book that is the edition of another
468: work.
469:
470: \begin{description}
471: \item [editor] Name of the editor of the book.
472: \item [year] The year of publication.
473: \item [title] Title of the book.
474: \item [series-editor] Name of the editor of the series the book is
475: part of.
476: \item [series-title] Title of the series, if the book is part of a
477: series.
478: \item [series-volume] Volume number, if the book appears in a series.
479: \item [number-of-pages] Number of pages of the book.
480: \item [city] City where the book was published.
481: \item [publisher] Name of the publishing company
482: \item [edition] Information about the edition (e.g. ``Repr. of the London ed. 1652'')
483: \item [number-of-volumes] Number of volumes, if the book is
484: published as multiple volumes.
485: \item [isbn-issn]
486: \end{description}
487: \end{description}
488:
489: \subsubsection{Journal Article}
490:
491: \begin{description}
492: \item [bib type="journal-article"] an article in a scientific journal.
493: \begin{description}
494: \item [author] The author of the article.
495: \item [year] The year of publication.
496: \item [title] Title of the article.
497: \item [journal] Name of the journal.
498: \item [volume] Volume number, if the journal appears in a series.
499: \item [issue] Number of the issue the article is part of.
500: \item [pages] Number of pages of the article.
501: \item [alternate-journal] Alternate Journal
502: \item [isbn-issn]
503: \end{description}
504: \end{description}
505:
506: \subsubsection{Magazine Article}
507:
508: \begin{description}
509: \item [bib type="magazine-article"] an article in a popular magazine.
510: \begin{description}
511: \item [author] The author of the article.
512: \item [year] The year of publication.
513: \item [title] Title of the article.
514: \item [magazine] Name of the magazine.
515: \item [volume] Volume number, if the book appears in a series.
516: \item [issue-number] Number of the issue the article is part of.
517: \item [pages] Number of pages of the article.
518: \item [date] Date when the article appeared.
519: \end{description}
520: \end{description}
521:
522: \subsubsection{Newspaper Article}
523:
524: \begin{description}
525: \item [bib type="newspaper-article"] an article in a newspaper.
526: \begin{description}
527: \item [author] The author of the article.
528: \item [year] The year of publication.
529: \item [title] Title of the article.
530: \item [Newspaper] Name of the newspaper the article appeared in.
531: \item [pages] Number of pages of the article.
532: \item [issue-date] Date of the issue the article is part of.
533: \item [city] City of the newspaper.
534: \end{description}
535: \end{description}
536:
537: \subsubsection{Thesis}
538:
539: \begin{description}
540: \item [bib type="thesis"] a master/doctorate/etc. thesis.
541: \begin{description}
542: \item [author] The author of the thesis.
543: \item [year] The year of publication.
544: \item [title] Title of the thesis.
545: \item [academic-department] Name of the academic department where
546: the thesis was handed in.
547: \item [number-of-pages] Number of pages of the thesis.
548: \item [city] City where the thesis was published.
549: \item [University] Name of the university where the thesis was
550: handed in.
551: \item [isbn-issn]
552: \end{description}
553: \end{description}
554:
555: \subsubsection{Report}
556:
557: \begin{description}
558: \item [bib type="report"] a scientific report.
559: \begin{description}
560: \item [author] The author of the report.
561: \item [year] The year of publication.
562: \item [title] Title of the report.
563: \item [pages] Number of pages of the report.
564: \item [date] Date when the report appeared.
565: \item [city] City where the report was published.
566: \item [institution] Institution where the report was produced.
567: \item [type] Type of report.
568: \item [report-number] Report number.
569: \end{description}
570: \end{description}
571:
1.5 casties 572: \subsubsection{Manuscript}
573:
574: \begin{description}
575: \item [bib type="manuscript"] a handwritten/typewritten manuscript.
576:
577: \begin{description}
578: \item [title] Title of the manuscript.
579: \item [author] The author of the text.
580: \item [location] Name of the library where the manuscript is
581: currently located.
582: \item [year] The year or century of publication.
583: \item [pages] Number of pages of the manuscript.
584: \item [signature] Signature of the manuscript.
585: \item [editorial-remarks] Remarks related to the online
586: publication of the manuscript. This could be notes about
587: annotations etc.
588: \item [description] This can be any kind of description.
589: \item [keywords] Keywords related to the manuscript.
590: \end{description}
591: \end{description}
592:
593:
1.4 casties 594: \subsubsection{Generic}
595:
596: \begin{description}
597: \item [bib type="generic"] a generic bibliographic type. This type
598: should only be used in rare cases.
599: \begin{description}
600: \item [author]
601: \item [year]
602: \item [title]
603: \item [secondary-author]
604: \item [secondary-title]
605: \item [volume]
606: \item [number]
607: \item [pages]
608: \item [date]
609: \item [place-published]
610: \item [publisher]
611: \item [edition]
612: \item [tertiary-author]
613: \item [tertiary-title]
614: \item [number-of-volumes]
615: \item [type-of-work]
616: \item [subsidiary-author]
617: \item [alternate-title]
618: \item [isbn-issn]
619: \item [call-number]
620: \item [label]
621: \item [keywords]
622: \item [abstract]
623: \item [notes]
624: \item [url]
1.5 casties 625: \end{description}
1.4 casties 626: \end{description}
627:
628:
629: \subsection{Architectural drawings}
630: \label{sec:doc}
631:
632: Specific information for architectural drawings is presented in a
1.5 casties 633: \texttt{doc} container with an additional \texttt{type} attribute
634: giving the type of drawing. All elements inside the container can
635: appear multiple times.
1.4 casties 636:
637: \begin{description}
1.5 casties 638:
639: \item[doc type="Architectural Drawing"] architectural drawing.
640:
641: \begin{description}
642: \item [person] last name and first name of a person, separated by a
643: comma. A further common name for the person can be put in front,
644: separated by a semicolon.
645: \item [location] Name of a place in its common notation. This can be
646: a city or an institution.
647: \item [date] This can be a year (or several years, separated by
648: commas) or a period (1706-1714). Years are noted with four digits.
649: \item [object] Short description of an object or signatures.
650: \item [keywords] Keywords related to the object.
651: \end{description}
1.4 casties 652: \end{description}
1.1 casties 653:
654:
1.10 casties 655: \subsection{Document structure (table of contents)}
1.1 casties 656: \label{sec:toc}
657:
1.4 casties 658: Information on the structure of a document like the division into
659: parts and chapters in the way of a table of contents is presented in a
660: \texttt{toc} container.
661:
662: The scheme allows multiple logical pages on a single page image
663: as is often the case with scanned books or manuscripts. The scheme
664: also allows for ``loose'' numbering schemes with roman, arabic or
665: other page numbers consecutively or mixed and changes in the numbering
666: within the document.
667:
668: The flexibility comes from the fact that no additional assumptions
669: about the mapping between logical pages and page images are made in
670: the format. All mapping information is specified by the user.
671:
672: The logical page numbering or naming that can be presented to the user
673: is specified in the \texttt{name} tags while the physical numbering of
674: the page images is specified in the \texttt{index} or \texttt{url}
675: tags.
1.1 casties 676:
1.4 casties 677: \begin{description}
1.5 casties 678: \item[toc] container for document structure
679:
1.4 casties 680: \begin{description}
1.5 casties 681: \item[page] describes a single logical page
682:
683: \begin{description}
684: \item[name] the ``name'' of the logical page. This can be any string
685: like a page number (arabic, roman, etc.) or a special designation
686: like ``Table 5''.
687:
688: \item[index] the \texttt{digilib} index number\footnote{The index
689: number for digilib is the index in the alphabetical order of the
690: scan file names.} of the scan image of the page.
691:
692: \item[url] as an alternative to the \texttt{digilib} index number, the
693: full URL of the scan image of the page can be used.
694: \end{description}
1.4 casties 695:
1.5 casties 696: \item[chapter] describes a section or chapter of the text.
697: \texttt{chapter} elements can be nested.
1.1 casties 698:
1.4 casties 699: \begin{description}
1.5 casties 700: \item[name] the title of the chapter or section.
701:
702: \item[start] the beginning of a page range (usually the first page
703: of the chapter). The \texttt{start} element has an optional
704: \texttt{increment} attribute to indicate the number of logical
705: pages on a scan image.\footnote{This information is only needed by
706: additional tools that try to generate lists of all page and
707: image numbers.}
708:
709: \begin{description}
710: \item[name] the ``name'' of the first page (see \texttt{page}).
711:
712: \item[index] the index of the first page (see \texttt{page}).
713:
714: \item[url] the URL of the first page (see \texttt{page}).
715: \end{description}
716:
717: \item[end] the end of a page range (usually the last page of the
718: chapter).
719:
720: \begin{description}
721: \item[name] the ``name'' of the last page (see \texttt{page}).
722:
723: \item[index] the index of the last page (see \texttt{page}).
724:
725: \item[url] the URL of the last page (see \texttt{page}).
726: \end{description}
727:
728: \item[page] as an alternative (or in addition) to
729: \texttt{start}/\texttt{end} page ranges, single \texttt{page}
730: elements can be used inside \texttt{chapter}.
1.4 casties 731: \end{description}
732: \end{description}
733: \end{description}
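
\noindent The following sketch shows how a \texttt{toc} container
could describe a small document. The page names and index numbers are
invented for illustration. The second chapter assumes two logical
pages on each scan image, so that pages 1 to 20 occupy the ten images
with index 6 to 15.

\begin{verbatim}
<toc>
  <page>
    <name>Frontispiece</name>
    <index>1</index>
  </page>
  <chapter>
    <name>Preface</name>
    <start><name>i</name><index>2</index></start>
    <end><name>iv</name><index>5</index></end>
  </chapter>
  <chapter>
    <name>Chapter 1</name>
    <start increment="2"><name>1</name><index>6</index></start>
    <end><name>20</name><index>15</index></end>
    <page><name>Table 5</name><index>16</index></page>
  </chapter>
</toc>
\end{verbatim}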
734:
735: %%\url{http://pythia.mpiwg-berlin.mpg.de/toolserver/TS_lise}
1.1 casties 736:
737:
1.12 casties 738: \subsection{Digital images}
1.1 casties 739: \label{sec:inform-scann-imag}
740:
741: Image files representing scanned images can have an \texttt{img}
742: container tag with information about the scan resolution and the size
743: of the original image. This information is used by the
744: \texttt{digilib} image viewing tool.
745:
746: One of three possible sets of tags is required:
747:
748: \begin{description}
1.5 casties 749: \item[img] digital image information.
1.1 casties 750:
1.5 casties 751: \begin{description}
1.12 casties 752: \item[original-size-x] The width of the original
753: image -- required. \\
754: The unit of measure can be given in the parameter \texttt{unit};
755: the default is meter ``m''. The width to be considered is the
756: total width of the scanned area.
1.5 casties 757:
1.12 casties 758: \item[original-size-y] The height of the original image -- required.
1.5 casties 759:
1.12 casties 760: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
1.5 casties 761:
1.12 casties 762: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 763: \end{description}
1.1 casties 764: \end{description}
765:
766: or
767:
768: \begin{description}
1.5 casties 769: \item[img] digital image information.
770:
771: \begin{description}
772: \item[original-dpi-x] The resolution of the hi-res scan in its width
1.12 casties 773: in pixels per inch -- required.
1.1 casties 774:
1.5 casties 775: \item[original-dpi-y] The resolution of the hi-res scan in its height
1.12 casties 776: in pixels per inch -- required.
777:
778: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
779:
780: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 781: \end{description}
1.1 casties 782: \end{description}
783:
784: or
785:
786: \begin{description}
1.5 casties 787: \item[img] digital image information.
788:
789: \begin{description}
790: \item[original-dpi] The resolution of the hi-res scan in pixels per
1.12 casties 791: inch if the resolutions in width and height are the same -- required.
792:
793: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
794:
795: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 796: \end{description}
1.1 casties 797: \end{description}
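
\noindent As a worked sketch with invented values, the third form for
a scan with the same resolution in both directions could be written
as:

\begin{verbatim}
<img>
  <original-dpi>300</original-dpi>
  <original-pixel-x>2480</original-pixel-x>
  <original-pixel-y>3508</original-pixel-y>
</img>
\end{verbatim}

\noindent Since a resolution in pixels per inch relates pixel counts
to physical size, and one inch is 0.0254~m, the size of the original
follows as
\[
  \frac{2480}{300} \times 0.0254\,\mbox{m} \approx 0.210\,\mbox{m},
  \qquad
  \frac{3508}{300} \times 0.0254\,\mbox{m} \approx 0.297\,\mbox{m}.
\]
The same scan could therefore be described in the first form. The
sketch assumes that the \texttt{unit} parameter is written as an XML
attribute and that it may also be given on \texttt{original-size-y}:

\begin{verbatim}
<img>
  <original-size-x unit="m">0.210</original-size-x>
  <original-size-y unit="m">0.297</original-size-y>
  <original-pixel-x>2480</original-pixel-x>
  <original-pixel-y>3508</original-pixel-y>
</img>
\end{verbatim}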
1.7 casties 798:
799:
1.10 casties 800:
1.12 casties 801: \subsection{Digital image acquisition}
1.10 casties 802: \label{sec:inform-about-image}
803:
804: A description of the technology used in the process of producing a
805: digital image.
806:
807: \begin{description}
808: \item[image-acquisition] description of the image production process
809: \begin{description}
1.12 casties 810: \item[device] acquisition device (e.g. ``flatbed scanner'')
1.10 casties 811:
1.12 casties 812: \item[image-type] type and color-depth of the image -- required (e.g. ``RGB 24
1.10 casties 813: bit'')
814:
815: \item[production-comment] additional textual information about the
816: production process
817: \end{description}
818: \end{description}
819:
820:
1.12 casties 821:
1.7 casties 822: \subsection{Full text with images}
823: \label{sec:full-text-with}
824:
1.12 casties 825: Full text in an XML format should be specified with a
826: \texttt{content-type}\footnote{See section~\ref{tag-content-type}
827: on page~\pageref{tag-content-type}.} ``fulltext''.
1.8 casties 828:
829: The relation between the full text and optional images of
830: whole pages or parts of pages must be specified in a
831: \texttt{text-tool} container.
832:
833: \begin{description}
834: \item[text-tool] representation of full text with images
835:
836: \begin{description}
837: \item[text-file] the file name of the full text file (with path
838: inside document directory)
1.12 casties 839:
1.8 casties 840: \item[page-images] the name of the directory containing the
1.12 casties 841: page image files (with path inside document directory)
1.8 casties 842:
843: \item[xslt-file] the file name of an additional XSL transformation
844: file
845:
846: \item[text-config] container for configuration options
1.10 casties 847: \begin{description}
848: \item[container-tag] the name of the text root element (default
849: ``text'')
850:
851: \item[ref-element-tag] the name of the element that is used as
852: unit of reference when results are presented
1.8 casties 853:
1.10 casties 854: \item[pagebreak-tag] the name of the element that indicates page
855: breaks (default ``pb'')
856: \end{description}
1.8 casties 857: \end{description}
858: \end{description}
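
\noindent
The structure above could be written as in the following sketch. All
file and directory names, and the choice of \texttt{p} as reference
element, are assumptions made for this example; \texttt{text} and
\texttt{pb} are the defaults named above.

\begin{small}
\begin{verbatim}
<text-tool>
  <!-- hypothetical paths inside the document directory -->
  <text-file>fulltext/fleck.xml</text-file>
  <page-images>pageimg</page-images>
  <xslt-file>fulltext/display.xsl</xslt-file>
  <text-config>
    <container-tag>text</container-tag>
    <!-- "p" is an assumed reference element -->
    <ref-element-tag>p</ref-element-tag>
    <pagebreak-tag>pb</pagebreak-tag>
  </text-config>
</text-tool>
\end{verbatim}
\end{small}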
1.7 casties 859:
1.1 casties 860:
861:
1.12 casties 862: \subsection{Copyright and access conditions}
863: \label{sec:access-conditions}
864:
865: If access to a resource is bound to conditions for technical or legal
866: reasons, the conditions can be put in an \texttt{access-conditions}
867: container. Other access rights conditions like copyright can also be
868: documented in this container.
869:
870: \begin{description}
871: \item[access-conditions] legal and technical conditions for access to
872: this resource
873:
874: \begin{description}
875: \item[attribution] The person or institution this resource should be
876: attributed to when it is publicly presented
877:
878: \begin{description}
879: \item[name] a name (free text)
880:
881: \item[url] a URL (with an optional \texttt{label} attribute to show
882: as text)
883: \end{description}
884:
885: \item[copyright] the copyright owner and its conditions
886: \begin{description}
887: \item[owner] the name of the copyright owner
888: \begin{description}
889: \item[name] a name (free text)
890:
891: \item[url] a URL (with an optional \texttt{label} attribute to show
892: as text)
893: \end{description}
894:
895: \item[date] the date when the copyright was issued
896:
897: \item[duration] the duration of the copyright (if known)
898:
899: \item[description] free-text field for special or additional
900: conditions
901: \end{description}
1.14 casties 902:
903:
904: \item[publish-metadata] metadata about this resource can be made
905: freely available when this tag is present. Access to the resource
906: itself is regulated separately by the \texttt{access} element.
1.12 casties 907:
908: \item[access] conditions of access to this resource
909: \begin{description}
910: \item[internal] access should be restricted to a group of users. The
911: type of group is defined by one of the following
912: \begin{description}
913: \item[institution] the members of this institution. The method
914: for identifying a user as a member of the institution is not
915: specified in this document.
916:
917: \item[subnet] all computers with an IP-address in this subnet. The
918: subnet is defined in ``truncated-quad'' (e.g. ``141.14'') or
919: ``address/netmask'' (e.g. ``141.14.0.0/255.255.0.0'') notation.
920:
921: \item[group] the members of this named group. The method for
922: identifying a user as a member of a named group is not specified in
923: this document.
924: \end{description}
925:
926: \item[scientific] access to this resource should be restricted to
927: scientific work
928:
929: \item[free] access to this resource is not restricted
930:
931: \item[special] if none of the above conditions seems appropriate,
932: a free-form text can be specified here.
933: \end{description}
934: \end{description}
935: \end{description}
936:
937: \noindent
938: It should be noted that access conditions in the metadata file only
939: state that conditions \emph{should} be observed, not that they
940: \emph{are} necessarily observed. Actual control over access to the
941: resource has to be enforced by additional technical measures outside
942: the metadata file.
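
As an illustration, an \texttt{access-conditions} container restricting
access to a subnet might look like the following sketch. It assumes
that the nested items, including the condition \texttt{internal}, are
written as child elements and that \texttt{publish-metadata} is an
empty tag. All names, URLs and dates are invented; the subnet is the
example value given above.

\begin{small}
\begin{verbatim}
<access-conditions>
  <!-- hypothetical names, URLs and dates -->
  <attribution>
    <name>Example Library</name>
    <url label="Example Library">http://www.example.org/</url>
  </attribution>
  <copyright>
    <owner>
      <name>Example Library</name>
    </owner>
    <date>2003</date>
    <description>Reproduction only with written permission.</description>
  </copyright>
  <publish-metadata/>
  <access>
    <internal>
      <subnet>141.14.0.0/255.255.0.0</subnet>
    </internal>
  </access>
</access-conditions>
\end{verbatim}
\end{small}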
943:
944:
945:
946: \subsection{Acquisition of raw-data}
947: \label{sec:acqu-inform}
948:
949: Information about the acquisition source for raw data resources can be
950: provided in an \texttt{acquisition} container.
951:
952: \begin{description}
953: \item[acquisition] the acquisition source of this resource -- required
954: for raw data.
955: \begin{description}
956: \item[provider] where this resource came from -- required
957: \begin{description}
958: \item[name] free-text name of the provider (institution or
959: individual)
960:
961: \item[address] address of the provider
962:
963: \item[contact] contact person at the provider (i.e. name and email)
964:
965: \item[url] URL related to the provider
1.13 casties 966:
967: \item[provider-id] id of the provider (internally used) -- deduced
1.12 casties 968: \end{description}
969:
970: \item[date] date of acquisition -- required
971:
972: \item[description] free-text description of the acquisition source or
973: additional information
974: \end{description}
975: \end{description}
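
\noindent
A sketch of an \texttt{acquisition} container with the required
\texttt{provider} and \texttt{date} elements. The provider details are
invented, the date format is an assumption, and the deduced
\texttt{provider-id} is left out.

\begin{small}
\begin{verbatim}
<acquisition>
  <!-- hypothetical provider, for illustration only -->
  <provider>
    <name>Example Institute</name>
    <address>1 Example Street, Example City</address>
    <contact>Jane Doe, jane.doe@example.org</contact>
    <url>http://www.example.org/</url>
  </provider>
  <date>2004</date>
  <description>Scans delivered on DVD.</description>
</acquisition>
\end{verbatim}
\end{small}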
976:
977:
978:
979: \subsection{Documentary Films}
980: \label{sec:documentary-films}
981:
982: Documentary films can be described using a \texttt{film-acquisition}
983: container.
984:
985: \begin{description}
986: \item[film-acquisition] description of a (documentary) film --
987: required for documentary film
988: \begin{description}
989: \item[recording] specification of the recording process
990: \begin{description}
991: \item[author] the person or persons doing the recording
992:
993: \item[date] the date or time span when the film was recorded
994:
995: \item[location] the place where the film was recorded
996:
997: \item[device] recording device used (e.g. ``Sony CP-DV8 Camcorder'')
998:
999: \item[format] format of the recorded film -- required (e.g. ``DV
1000: 720x524 25fps interlaced'')
1001: \end{description}
1002:
1003: \item[description] free-form description of the recording and the
1004: content of the film
1005: \end{description}
1006: \end{description}
1007:
1008: (More information about the digitization step could be added in a
1009: \texttt{digitization} tag similar to the \texttt{recording} tag.)
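
\noindent
The \texttt{film-acquisition} container described above could be
written as in the following sketch. The device and format are the
examples given above; author, date, location and description are
invented, and the date format is an assumption.

\begin{small}
\begin{verbatim}
<film-acquisition>
  <recording>
    <!-- hypothetical author, date and location -->
    <author>Jane Doe</author>
    <date>2003</date>
    <location>Berlin</location>
    <device>Sony CP-DV8 Camcorder</device>
    <format>DV 720x524 25fps interlaced</format>
  </recording>
  <description>Interview recorded for the project archive.</description>
</film-acquisition>
\end{verbatim}
\end{small}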
1010:
1.1 casties 1011:
1012:
1013:
1.4 casties 1014: \section{Sample metadata files for ECHO resources}
1.1 casties 1015:
1.5 casties 1016: The following is a sample metadata index file for a directory containing a
1017: scanned document.
1018:
1019: \begin{small}
1.1 casties 1020: \begin{verbatim}
1.11 casties 1021: <resource type="ECHO" version="1.0">
1.5 casties 1022: <description>Fleck, 1980</description>
1023: <name>fleck.1980</name>
1024: <creator>University of Bern</creator>
1025: <archive-path>ubern/wiss-theorie</archive-path>
1026: <content-type>scanned images</content-type>
1027: <meta>
1028: <dri>echo23a45e2329x</dri>
1029: <lang>ger</lang>
1030: <bib type="book">
1031: <author>Fleck, Ludwik</author>
1032: <year>1980</year>
1033: <title>Entstehung und Entwicklung einer
1034: wissenschaftlichen Tatsache</title>
1035: <series-editor></series-editor>
1036: <series-title></series-title>
1037: <series-volume></series-volume>
1038: <number-of-pages></number-of-pages>
1039: <city>Frankfurt am Main</city>
1040: <publisher>Suhrkamp</publisher>
1041: <edition></edition>
1042: <number-of-volumes></number-of-volumes>
1043: <translator></translator>
1044: <isbn-issn></isbn-issn>
1045: <keywords>Wissenschaftstheorie, Fleck, Tatsache</keywords>
1046: <abstract></abstract>
1047: </bib>
1048: </meta>
1049: <dir>
1050: <description>Scanned images (300dpi)</description>
1051: <name>img</name>
1052: </dir>
1.4 casties 1053: </resource>
1054: \end{verbatim}
1.5 casties 1055: \end{small}
1.4 casties 1056:
1.5 casties 1057: The following is a sample metadata file for a single image of an
1058: architectural drawing.
1.4 casties 1059:
1.5 casties 1060: \begin{small}
1.4 casties 1061: \begin{verbatim}
1.11 casties 1062: <resource type="ECHO" version="1.0">
1.5 casties 1063: <creator>Bibliotheca Hertziana</creator>
1064: <content-type>scanned images</content-type>
1065: <file>
1066: <name>00000271-asl-160-r-full.tif</name>
1067: <meta>
1068: <img>
1069: <original-dpi>315</original-dpi>
1070: </img>
1071: <dri>echo45a67bc4367d</dri>
1072: <lang>ita</lang>
1073: <doc type="Architectural Drawing">
1074: <person>Ciolli, Giacomo</person>
1075: <person>Urban VIII; Barberini, Maffeo</person>
1076: <location>Accademia di San Luca</location>
1077: <location>Roma</location>
1078: <date>1706</date>
1079: <object>Concorso Clementino</object>
1080: <object>Fontana Pubblica</object>
1081: <object>Brunnen</object>
1082: <object>ASL 160</object>
1083: <keywords></keywords>
1084: </doc>
1085: <context>
1086: <url>http://colosseum.biblhertz.it:8080/Lineamenta/
1087: 1033478408.39/1035196181.35/1035196204.09/1035394121.83
1088: </url>
1089: </context>
1090: </meta>
1091: </file>
1.2 casties 1092: </resource>
1.1 casties 1093: \end{verbatim}
1.5 casties 1094: \end{small}
1.1 casties 1095:
1096: \end{document}
1097:
1098: %%% Local Variables:
1099: %%% mode: latex
1100: %%% TeX-master: t
1101: %%% End: