Annotation of storage/meta/meta-format.tex, revision 1.14
1.1 casties 1: \documentclass[a4paper]{article}
2:
3: \usepackage[latin1]{inputenc}
4: \usepackage[T1]{fontenc}
5: \usepackage{ae}
6: %\usepackage{times}
7: %\usepackage{courier}
8:
9: % create in-text links black (with PDF)
1.6 casties 10: \usepackage[colorlinks=true,linkcolor=black]{hyperref}
1.1 casties 11: % Format URLs nicely (without PDF)
1.6 casties 12: %\usepackage{url}
1.1 casties 13:
14:
15: \title{A simple metadata format for resource bundles}
16:
1.4 casties 17: \author{Robert Casties, Dirk Wintergrün, Hans-Christoph Liess}
1.1 casties 18:
1.14 ! casties 19: \date{V1.1.1 of 18.5.2004}
1.1 casties 20:
21: \begin{document}
22:
23: \maketitle
24:
25: \tableofcontents
26:
27:
28: \section{File and directory names}
29: \label{sec:file-directory-names}
30:
31: File and directory names should not contain spaces. Allowed characters
32: in filenames are only the alphanumeric set a-z, A-Z, 0-9, hyphen
33: ``-'', underscore ``\_'' and dot ``.''.
34:
1.12 casties 35: Files and directories with names that contain illegal characters must
36: be transformed to allowed names. A proposed simple transformation
37: rule is:
38:
39: \begin{itemize}
40: \item whitespace characters (e.g. blank, tab, cr, lf) are replaced by
41: hyphens ``-''
42:
43: \item other illegal characters are replaced by underscores ``\_''.
44: \end{itemize}
45:
46: This rule does not provide a reversible mapping to the original
47: illegal file name and it does not provide a collision-free mapping,
48: i.e. two different illegal file names might be mapped to the same
49: allowed file name. Additional precautions for these cases must be
50: taken.
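
As a worked illustration of the proposed rule, the following table
applies it to a few file names. The names are invented for this example;
\texttt{<TAB>} stands for a tab character.

\begin{verbatim}
original name             transformed name
"my scan (v2).tif"    ->  my-scan-_v2_.tif
"page 1.tif"          ->  page-1.tif
"page<TAB>1.tif"      ->  page-1.tif      (collision)
"plate#3.tif"         ->  plate_3.tif
"plate?3.tif"         ->  plate_3.tif     (collision)
\end{verbatim}

The two collisions show the kind of case that needs additional
precautions. One option is to keep the original name in an
\texttt{original-name} element (see section~\ref{sec:mpiwg-doc}).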
1.1 casties 51:
1.4 casties 52:
53: \section{Metadata files}
54: \label{sec:metadata-files}
55:
56: The metadata information is stored in the XML format documented below
57: in special files in the resource directory. Two forms of metadata
58: files are possible:
59: \begin{itemize}
60: \item a file named \texttt{index.meta} in a directory.
61:
62: \item a file named like the data file it describes with an
63: additional extension \texttt{.meta}. For example metadata for the
64: file \texttt{0001.tif} would be in a file \texttt{0001.tif.meta}.
65: \end{itemize}
66:
67: The resource directory must contain an \texttt{index.meta} file with
68: information about the resource as a whole. Other directories can
69: contain \texttt{index.meta} files.
70:
71: Additional information about single data files that are part of the
72: resource can either be put in \texttt{file} tags in the
73: \texttt{index.meta} file or in separate \emph{filename}\texttt{.meta}
74: files for each data file. Information from the directory level file is
75: inherited at the file level.
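
A possible layout of a resource directory is sketched below. The
directory and file names are hypothetical and only illustrate the two
forms of metadata files.

\begin{verbatim}
example-resource/          resource directory
  index.meta               required, describes the whole resource
  pageimg/
    index.meta             optional directory-level metadata
    0001.tif
    0001.tif.meta          optional metadata for 0001.tif only
    0002.tif
\end{verbatim}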
76:
77:
1.1 casties 78: \section{Resource format}
79: \label{sec:mpiwg-doc}
80:
81: In this description elements marked ``optional'' need not be supplied
82: by the provider of the resource and may be absent in all versions of
83: the metadata file. Elements marked ``required'' must be supplied by
84: the provider of the resource. Elements marked ``deduced'' can be
85: supplied by the provider of the resource but can also be provided by
1.4 casties 86: automatic scripts later in the process; these elements must be present
1.1 casties 87: in the final file.
88:
1.12 casties 89: File and directory paths in the metadata file use the conventional
90: Unix file separator slash ``/''.
91:
1.11 casties 92: The outer container element is \texttt{resource}. It has the following
93: \textbf{attributes}:
94:
95: \begin{description}
1.12 casties 96: \item[type] sub-type of resource (e.g. ``ECHO'', ``MPIWG'') --
97: optional.
1.11 casties 98:
1.12 casties 99: \item[version] version number of metadata format (currently 1.1) --
1.11 casties 100: required.
101: \end{description}
102:
103: \noindent The allowed \textbf{elements} inside \texttt{resource} are:
1.1 casties 104:
105: \begin{description}
1.14 ! casties 106: \item[description] An informal textual description of the resource --
! 107: optional\footnote{At least one description of the resource's content
! 108: is required. The description can be an informal
! 109: \texttt{description} element or a descriptive element (like
! 110: \texttt{bib}) in a \texttt{meta} container.}.
1.1 casties 111:
112: \item[name] The filename of the resource (name of the directory this
113: file is contained in) -- required.
114:
115: \item[creator] The name of the project or person that created the
116: resource -- optional.
1.4 casties 117:
118: \item[archive-creation-date] The time and date the archive collection
119: was created -- deduced.
1.1 casties 120:
1.4 casties 121: \item[archive-storage-date] The time and date the archive was written
122: to permanent storage -- deduced (must not be set by the user).
1.1 casties 123:
124: \item[archive-path] The full path to the resource directory inside the
1.5 casties 125: whole archive collection, including the resource directory -- deduced.
1.12 casties 126:
127: \item[archive-id] The ID for this document in the archive --
128: required.
1.1 casties 129:
130: \item[derived-from] Container for the description of the original
131: resource if this resource is a modified version of another resource
132: -- optional.
133:
134: \begin{description}
1.12 casties 135: \item[archive-id] The ID of the original resource
136: -- required.
137:
1.1 casties 138: \item[archive-path] The full path to the original resource
1.12 casties 139: -- deduced.
1.1 casties 140:
141: \item[description] An informal textual description of the relation
142: of this resource to the original resource -- optional.
143: \end{description}
144:
145: \item[linked-with] Container for the description of another
146: resource when this resource is a linked copy of another resource
147: -- optional.
148:
149: \begin{description}
1.12 casties 150: \item[archive-id] The ID of the linked resource
151: -- required.
152:
1.1 casties 153: \item[archive-path] The full path to the linked resource
1.12 casties 154: -- deduced.
1.1 casties 155:
156: \item[description] An informal textual description of the relation
157: of this resource to the linked resource -- optional.
158: \end{description}
159:
1.12 casties 160: \item[media-type] \label{tag-media-type} The main media type of this
161: resource -- required.\\ The main media type can be overridden by
162: \texttt{media-type}s in subdirectories. Possible types are
163: \begin{itemize}
164: \item \texttt{image}
165:
166: \item \texttt{text}
167:
168: \item \texttt{audio}
169:
170: \item \texttt{video}
171:
172: \item \texttt{data} for other types of data
173: \end{itemize}
1.1 casties 174:
175: \item[meta] Additional metadata information about the resource --
176: optional.\\ For a description of additional metadata see below.
177:
178: \item[dir] Container for the description of a subdirectory -- required
179: (when there are subdirectories).\\ \texttt{dir} tags should not be
180: nested. Directories at lower levels are identified by their
181: \texttt{path}.
182:
183: \begin{description}
184: \item[description] An informal textual description of the
185: subdirectory -- optional.
186:
187: \item[name] The name of the subdirectory -- required.
188:
1.12 casties 189: \item[original-name] A text string associated with the directory as
190: original name -- optional. (E.g. if the data in this directory
191: came from an external source and had a name that had to be changed
192: according to section~\ref{sec:file-directory-names} but it should
193: be possible to reference the original name.)
194:
1.1 casties 195: \item[path] The directory path of this subdirectory relative to the
1.5 casties 196: resource's root directory (excluding the directory itself) --
197: required (may be empty or omitted if the directory is a direct
198: child of the resource's root directory).
1.1 casties 199:
200: \item[meta] Additional metadata information about the directory --
201: optional.\\ For a description of additional metadata see below.
202: \end{description}
203:
204: \item[file] Container for the description of a file -- deduced.\\
205: \texttt{file} tags should not be nested in \texttt{dir} tags. Files
206: at lower directory levels are identified by their \texttt{path}.
207:
208: \begin{description}
209: \item[description] An informal textual description of the
210: file -- optional.
211:
212: \item[name] The name of the file -- required.
213:
1.12 casties 214: \item[original-name] A text string associated with the file as
215: original name -- optional. (E.g. if this file came from an
216: external source and had a name that had to be changed according to
217: section~\ref{sec:file-directory-names} but it should be possible
218: to reference the original name.)
219:
1.1 casties 220: \item[path] The directory path of this file relative to the
1.5 casties 221: resource's root directory (excluding the file itself) -- required
222: (may be empty or omitted if the file is in the resource's root
223: directory).
1.7 casties 224:
225: \item[date] The file's modification or creation date\footnote{The
226: preferred time and date format is ``YYYY/MM/DD HH:MM:SS''},
227: whichever is more recent -- optional.
1.1 casties 228:
229: \item[modification-date] The file's modification date -- optional.
230:
231: \item[creation-date] The file's creation date -- optional.
1.7 casties 232:
1.1 casties 233: \item[size] The file size -- deduced.
234:
235: \item[mime-type] The file's mime-type -- optional.
236:
237: \item[md5cs] MD5 checksum of the file content -- optional.
238:
239: \item[meta] Additional metadata information about the file --
240: optional. For a description of additional metadata see below.
241: \end{description}
242:
243: \end{description}
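
A minimal sketch of an \texttt{index.meta} file as it might be supplied
by a provider is shown below. All values (names, ID, paths, dates) are
invented for illustration. The example assumes that \texttt{size} is
given in bytes, which this document does not specify. The deduced
elements \texttt{archive-creation-date}, \texttt{archive-storage-date}
and \texttt{archive-path} are left out here because they can be added
by scripts later.

\begin{verbatim}
<resource type="MPIWG" version="1.1">
  <description>Scanned pages of an example book</description>
  <name>example-resource</name>
  <creator>Example Project</creator>
  <archive-id>EXAMPLE-ID-0001</archive-id>
  <media-type>image</media-type>
  <dir>
    <name>pageimg</name>
    <description>page images</description>
  </dir>
  <dir>
    <name>thumbs</name>
    <path>pageimg</path>
  </dir>
  <file>
    <name>0001.tif</name>
    <path>pageimg</path>
    <date>2004/05/18 12:00:00</date>
    <size>1048576</size>
    <mime-type>image/tiff</mime-type>
  </file>
</resource>
\end{verbatim}

The second \texttt{dir} element is not nested in the first one. Its
position below \texttt{pageimg} is expressed only by its \texttt{path}.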
244:
245:
246:
247: \section{Additional metadata}
248: \label{sec:additional-metadata}
249:
250: All elements with \texttt{meta} tags can contain an arbitrary number
1.12 casties 251: of the following additional metadata elements.
252:
253: \subsection{workflow state}
254: \label{sec:workflow-state}
255:
256: All additional metadata elements can have a \texttt{workflow-state}
257: \textbf{attribute}. This attribute reflects the state of the
258: corresponding metadata element. The possible values for the
259: \texttt{workflow-state} attribute are
260: \begin{itemize}
261: \item \texttt{preliminary} this information is preliminary. It must
262: be checked in further workflow steps.
263:
264: \item \texttt{inwork}
265:
266: \item \texttt{final}
267: \end{itemize}
268:
269: Workflow states other than \texttt{preliminary} are part of the
270: workflow handling of the respective projects.
271:
272: Metadata elements can appear multiple times with different
273: \texttt{workflow-state} attributes. This enables metadata versioning.
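
A sketch of how two versions of the same metadata element could sit
side by side is shown below. The \texttt{bib} element is described in
section~\ref{sec:bibliographic-data}; the titles are invented.

\begin{verbatim}
<meta>
  <bib type="book" workflow-state="preliminary">
    <title>Exampel titel</title>
  </bib>
  <bib type="book" workflow-state="final">
    <title>Example title</title>
  </bib>
</meta>
\end{verbatim}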
274:
275:
276:
277: \subsection{Content type}
278: \label{sec:content-type}
279:
280: \begin{description}
281: \item[content-type] \label{tag-content-type} The content type of this
282: resource -- required.\\
283: The content type enables the choice of tools to manipulate and
284: display the resource. There should be a common list of content
285: types. For digital documents (books, manuscripts) this would be
286: ``scanned document'', for other image data ``scanned
287: images''.\footnote{The criterion for documents is an ordered
288: succession of image files (pages) and equal image size and
289: resolution throughout the images of a resource.}
290: \end{description}
291:
292:
1.1 casties 293:
1.4 casties 294: \subsection{Language}
295: \label{sec:lang}
296:
297: The language of a resource (e.g. a text) can be specified with a
298: \texttt{lang} tag. Languages have to be described using the
299: international codes for the representation of names of languages
300: either in two-letter form (ISO 639-1) or in three-letter form (ISO
301: 639-2). The entire catalogue of languages is documented on the page
302:
303: \url{http://www.loc.gov/standards/iso639-2/englangn.html}
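
As an illustration, a \texttt{meta} container for a scanned Latin book
might combine the content type and the language as follows. The code
``la'' is the ISO 639-1 code for Latin; the three-letter ISO 639-2 form
would be ``lat''.

\begin{verbatim}
<meta>
  <content-type>scanned document</content-type>
  <lang>la</lang>
</meta>
\end{verbatim}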
304:
1.1 casties 305:
306: \subsection{DRI}
307: \label{sec:dri}
308:
309: The \emph{digital resource identifier} for the resource is specified
1.4 casties 310: in a \texttt{dri} element. Digital resource identifiers are documented
1.1 casties 311: on the page
312:
313: \url{http://pythia.mpiwg-berlin.mpg.de/projects/standards/dri}.
314:
315:
1.4 casties 316:
317: \subsection{Collection context}
318: \label{sec:collection-context}
319:
320: The context of a resource as part of a collection or part of a project can be
1.5 casties 321: specified in the \texttt{context} element. All elements in the
322: container can appear multiple times.
1.4 casties 323:
324: \begin{description}
1.5 casties 325: \item[context] information on collection or project context.
1.4 casties 326:
1.5 casties 327: \begin{description}
328: \item[link] URL to additional context information.
329:
330: \item[name] Textual description of project or collection.
331: \end{description}
1.4 casties 332: \end{description}
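
A minimal example of a \texttt{context} element is given below. The
URL and the collection name are placeholders.

\begin{verbatim}
<meta>
  <context>
    <link>http://www.example.org/example-collection</link>
    <name>Example Collection</name>
  </context>
</meta>
\end{verbatim}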
1.5 casties 333:
1.4 casties 334:
335:
336:
1.1 casties 337: \subsection{Bibliographic information}
338: \label{sec:bibliographic-data}
339:
1.5 casties 340: Bibliographic information is presented in a \texttt{bib} container with
1.1 casties 341: a \texttt{type} parameter, giving the type of bibliographic resource.
1.4 casties 342: The \texttt{type} field can be repeated as a tag in the container.
343:
1.5 casties 344: The format is based on the ECHO scheme for bibliographic data (cf.
345: content workflow), the MPIWG ``Projektbibliografie'' and the format of
346: the commonly used program ``EndNote''.
347:
1.4 casties 348:
349: \subsubsection{Book}
350:
351: \begin{description}
352:
353: \item [bib type="book"] a published book.
354:
355: \begin{description}
356: \item [author] The author of the book.
357: \item [year] The year of publication.
358: \item [title] Title of the book.
359: \item [series-editor] Name of the series editor, if the book appears
360: in a series.
361: \item [series-title] Title of the series, if the book appears in a
362: series.
363: \item [series-volume] Volume number, if the book appears in a
364: series.
365: \item [number-of-pages] Number of pages of the entire book.
366: \item [city] City where the book was published.
367: \item [publisher] Name of the publishing company.
368: \item [edition] Edition of the book (e.g. third edition).
369: \item [number-of-volumes] Number of volumes, if the book is
370: published in multiple volumes.
371: \item [translator] Name of the translator.
372: \item [isbn-issn]
373: \end{description}
374: \end{description}
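
A sketch of a \texttt{bib} container for a book is shown below. The
bibliographic values are invented. This document does not prescribe a
format for author names, so the ``last name, first name'' form is only
an assumption of the example.

\begin{verbatim}
<meta>
  <bib type="book">
    <author>Example, Author</author>
    <year>1652</year>
    <title>An example title</title>
    <city>London</city>
    <publisher>Example Press</publisher>
    <number-of-pages>320</number-of-pages>
  </bib>
</meta>
\end{verbatim}

The other bibliographic types follow the same pattern with their own
\texttt{type} value and element set.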
375:
376: \subsubsection{In Book}
377:
378: \begin{description}
379: \item [bib type="inbook"] an article as part of a book.
380:
381: \begin{description}
382: \item [author] The author of the article.
383: \item [year] The year of publication.
384: \item [title] Title of the article.
385: \item [editor] Name of the book's editor.
386: \item [book-title] Title of the book.
387: \item [series-volume] Volume number, if the book appears in a
388: series.
389: \item [pages] Number of pages of the article.
390: \item [city] City where the book was published.
391: \item [publisher] Name of the publishing company.
392: \item [edition] Edition of the book (e.g. third edition).
393: \item [series-author] Name of the series editor, if the book appears
394: in a series.
395: \item [series-title] Title of the series, if the book appears in a
396: series.
397: \item [number-of-volumes] Number of volumes, if the book is
398: published in multiple volumes.
399: \item [translator] Name of the translator.
400: \item [isbn-issn]
401: \end{description}
402: \end{description}
403:
404: \subsubsection{Proceedings}
405:
406: \begin{description}
407: \item [bib type="proceedings"] a conference proceedings publication.
408:
409: \begin{description}
410: \item [author] The author of the article.
411: \item [year] The year of publication.
412: \item [title] Title of the article.
413: \item [editor] Name of the book's editor.
414: \item [conference-name] Name of the conference the proceedings are
415: related to.
416: \item [volume] Volume number.
417: \item [pages] Number of pages of the article.
418: \item [date] Date of the conference the proceedings are related to.
419: \item [conference-location] City where the conference was held.
420: \item [publisher] Name of the publishing company.
421: \item [edition] Edition of the book (e.g. third edition).
422: \item [series-editor] Name of the series editor, if the book appears
423: in a series.
424: \item [series-title] Title of the series, if the book appears in a
425: series.
426: \item [number-of-volumes] Number of volumes, if the book is
427: published as multiple volumes.
428: \item [isbn-issn]
429: \end{description}
430: \end{description}
431:
432: \subsubsection{Edited Book}
433:
434: \begin{description}
435: \item[bib type="edited-book"] a book that is the edition of another
436: work.
437:
438: \begin{description}
439: \item [editor] Name of the editor of the book.
440: \item [year] The year of publication.
441: \item [title] Title of the book.
442: \item [series-editor] Name of the editor of the series the book is
443: part of.
444: \item [series-title] Title of the series, if the book is part of a
445: series.
446: \item [series-volume] Volume number, if the book appears in a series.
447: \item [number-of-pages] Number of pages of the book.
448: \item [city] City where the book was published.
449: \item [publisher] Name of the publishing company.
450: \item [edition] Information about the edition (e.g. ``Repr. of the London ed. 1652'').
451: \item [number-of-volumes] Number of volumes, if the book is
452: published as multiple volumes.
453: \item [isbn-issn]
454: \end{description}
455: \end{description}
456:
457: \subsubsection{Journal Article}
458:
459: \begin{description}
460: \item [bib type="journal-article"] an article in a scientific journal.
461: \begin{description}
462: \item [author] The author of the article.
463: \item [year] The year of publication.
464: \item [title] Title of the article.
465: \item [journal] Name of the journal.
466: \item [volume] Volume number, if the journal appears in a series.
467: \item [issue] Number of the issue the article is part of.
468: \item [pages] Number of pages of the article.
469: \item [alternate-journal] Alternate Journal
470: \item [isbn-issn]
471: \end{description}
472: \end{description}
473:
474: \subsubsection{Magazine Article}
475:
476: \begin{description}
477: \item [bib type="magazine-article"] an article in a popular magazine.
478: \begin{description}
479: \item [author] The author of the article.
480: \item [year] The year of publication.
481: \item [title] Title of the article.
482: \item [magazine] Name of the magazine.
483: \item [volume] Volume number, if the book appears in a series.
484: \item [issue-number] Number of the issue the article is part of.
485: \item [pages] Number of pages of the article.
486: \item [date] Date when the article appeared.
487: \end{description}
488: \end{description}
489:
490: \subsubsection{Newspaper Article}
491:
492: \begin{description}
493: \item [bib type="newspaper-article"] an article in a newspaper.
494: \begin{description}
495: \item [author] The author of the article.
496: \item [year] The year of publication.
497: \item [title] Title of the article.
498: \item [Newspaper] Name of the newspaper the article appeared in.
499: \item [pages] Number of pages of the article.
500: \item [issue-date] Date of the issue the article is part of.
501: \item [city] City of the newspaper.
502: \end{description}
503: \end{description}
504:
505: \subsubsection{Thesis}
506:
507: \begin{description}
508: \item [bib type="thesis"] a master/doctorate/etc. thesis.
509: \begin{description}
510: \item [author] The author of the thesis.
511: \item [year] The year of publication.
512: \item [title] Title of the thesis.
513: \item [academic-department] Name of the academic department where
514: the thesis was handed in.
515: \item [number-of-pages] Number of pages of the thesis.
516: \item [city] City where the thesis was published.
517: \item [University] Name of the university where the thesis was
518: handed in.
519: \item [isbn-issn]
520: \end{description}
521: \end{description}
522:
523: \subsubsection{Report}
524:
525: \begin{description}
526: \item [bib type="report"] a scientific report.
527: \begin{description}
528: \item [author] The author of the report.
529: \item [year] The year of publication.
530: \item [title] Title of the report.
531: \item [pages] Number of pages of the report.
532: \item [date] Date when the report appeared.
533: \item [city] City where the report was published.
534: \item [institution] Institution where the report was produced.
535: \item [type] Type of report.
536: \item [report-number] Report number.
537: \end{description}
538: \end{description}
539:
1.5 casties 540: \subsubsection{Manuscript}
541:
542: \begin{description}
543: \item [bib type="manuscript"] a handwritten/typewritten manuscript.
544:
545: \begin{description}
546: \item [title] Title of the manuscript.
547: \item [author] The author of the text.
548: \item [location] Name of the library where the manuscript is
549: currently located.
550: \item [year] The year or century of publication.
551: \item [pages] Number of pages of the manuscript.
552: \item [signature] Signature of the manuscript.
553: \item [editorial-remarks] Remarks related to the online
554: publication of the manuscript. This could be notes about
555: annotations etc.
556: \item [description] This can be any kind of description.
557: \item [keywords] Keywords related to the manuscript.
558: \end{description}
559: \end{description}
560:
561:
1.4 casties 562: \subsubsection{Generic}
563:
564: \begin{description}
565: \item [bib type="generic"] a generic bibliographic type. This type
566: should only be used in rare cases.
567: \begin{description}
568: \item [author]
569: \item [year]
570: \item [title]
571: \item [secondary-author]
572: \item [secondary-title]
573: \item [volume]
574: \item [number]
575: \item [pages]
576: \item [date]
577: \item [place-published]
578: \item [publisher]
579: \item [edition]
580: \item [tertiary-author]
581: \item [tertiary-title]
582: \item [number-of-volumes]
583: \item [type-of-work]
584: \item [subsidiary-author]
585: \item [alternate-title]
586: \item [isbn-issn]
587: \item [call-number]
588: \item [label]
589: \item [keywords]
590: \item [abstract]
591: \item [notes]
592: \item [url]
1.5 casties 593: \end{description}
1.4 casties 594: \end{description}
595:
596:
597: \subsection{Architectural drawings}
598: \label{sec:doc}
599:
600: Specific information for architectural drawings is presented in a
1.5 casties 601: \texttt{doc} container with an additional \texttt{type} attribute
602: giving the type of drawing. All elements inside the container can
603: appear multiple times.
1.4 casties 604:
605: \begin{description}
1.5 casties 606:
607: \item[doc type="Architectural Drawing"] architectural drawing.
608:
609: \begin{description}
610: \item [person] last name and first name of a person, separated by a
611: comma. A further common name for the person can be put in front,
612: separated by a semicolon.
613: \item [location] Name of a place in its common notation. This can be
614: a city or an institution.
615: \item [date] This can be a year (or several years, separated by
616: commas) or a period (1706-1714). Years are noted with four digits.
617: \item [object] Short description of an object or signatures.
618: \item [keywords] Keywords related to the object.
619: \end{description}
1.4 casties 620: \end{description}
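
The following sketch shows a \texttt{doc} container with invented
values. The second \texttt{person} element puts a common name in front
of the last and first name, separated by a semicolon, and the
\texttt{date} element uses the period notation.

\begin{verbatim}
<meta>
  <doc type="Architectural Drawing">
    <person>Example, Anna</person>
    <person>Master of the Example; Sample, Johann</person>
    <location>Berlin</location>
    <date>1706-1714</date>
    <object>Ground plan of an example building</object>
    <keywords>ground plan</keywords>
  </doc>
</meta>
\end{verbatim}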
1.1 casties 621:
622:
1.10 casties 623: \subsection{Document structure (table of contents)}
1.1 casties 624: \label{sec:toc}
625:
1.4 casties 626: Information on the structure of a document like the division into
627: parts and chapters in the way of a table of contents is presented in a
628: \texttt{toc} container.
629:
630: The scheme allows multiple logical pages on a single page image,
631: as is often the case with scanned books or manuscripts. The scheme
632: also allows for ``loose'' numbering schemes with roman, arabic or
633: other page numbers consecutively or mixed and changes in the numbering
634: within the document.
635:
636: The flexibility comes from the fact that no additional assumptions
637: about the mapping between logical pages and page images are made in
638: the format. All mapping information is specified by the user.
639:
640: The logical page numbering or naming that can be presented to the user
641: is specified in the \texttt{name} tags while the physical numbering of
642: the page images is specified in the \texttt{index} or \texttt{url}
643: tags.
1.1 casties 644:
1.4 casties 645: \begin{description}
1.5 casties 646: \item[toc] container for document structure
647:
1.4 casties 648: \begin{description}
1.5 casties 649: \item[page] describes a single logical page
650:
651: \begin{description}
652: \item[name] the ``name'' of the logical page. This can be any string
653: like a page number (arabic, roman, etc.) or a special designation
654: like ``Table 5''.
655:
656: \item[index] the \texttt{digilib} index number\footnote{The index
657: number for digilib is the index in the alphabetical order of the
658: scan file names.} of the scan image of the page.
659:
660: \item[url] as an alternative to the \texttt{digilib} index number, the
661: full URL of the scan image of the page can be used.
662: \end{description}
1.4 casties 663:
1.5 casties 664: \item[chapter] describes a section or chapter of the text.
665: \texttt{chapter} elements can be nested.
1.1 casties 666:
1.4 casties 667: \begin{description}
1.5 casties 668: \item[name] the title of the chapter or section.
669:
670: \item[start] the beginning of a page range (usually the first page
671: of the chapter). The \texttt{start} element has an optional
672: \texttt{increment} attribute to indicate the number of logical
673: pages on a scan image.\footnote{This information is only needed by
674: additional tools that try to generate lists of all page and
675: image numbers.}
676:
677: \begin{description}
678: \item[name] the ``name'' of the first page (see \texttt{page}).
679:
680: \item[index] the index of the first page (see \texttt{page}).
681:
682: \item[url] the URL of the first page (see \texttt{page}).
683: \end{description}
684:
685: \item[end] the end of a page range (usually the last page of the
686: chapter).
687:
688: \begin{description}
689: \item[name] the ``name'' of the last page (see \texttt{page}).
690:
691: \item[index] the index of the last page (see \texttt{page}).
692:
693: \item[url] the URL of the last page (see \texttt{page}).
694: \end{description}
695:
696: \item[page] as an alternative (or in addition) to
697: \texttt{start}/\texttt{end} page ranges, single \texttt{page}
698: elements can be used inside \texttt{chapter}.
1.4 casties 699: \end{description}
700: \end{description}
701: \end{description}
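
The structure described above can be sketched as follows. All names
and numbers are invented, and the example assumes that index numbers
count from 1, which this document does not state. The preface uses
roman page names with one logical page per image (images 3 to 6). The
first book uses arabic page names with two logical pages per image, so
pages 1 to 20 occupy the ten images 7 to 16.

\begin{verbatim}
<meta>
  <toc>
    <page>
      <name>Title page</name>
      <index>1</index>
    </page>
    <chapter>
      <name>Preface</name>
      <start>
        <name>i</name>
        <index>3</index>
      </start>
      <end>
        <name>iv</name>
        <index>6</index>
      </end>
    </chapter>
    <chapter>
      <name>Book I</name>
      <start increment="2">
        <name>1</name>
        <index>7</index>
      </start>
      <end>
        <name>20</name>
        <index>16</index>
      </end>
      <page>
        <name>Table 5</name>
        <index>12</index>
      </page>
    </chapter>
  </toc>
</meta>
\end{verbatim}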
702:
703: %%\url{http://pythia.mpiwg-berlin.mpg.de/toolserver/TS_lise}
1.1 casties 704:
705:
1.12 casties 706: \subsection{Digital images}
1.1 casties 707: \label{sec:inform-scann-imag}
708:
709: Image files representing scanned images can have an \texttt{img}
710: container tag with information about the scan resolution and the size
711: of the original image. This information is used by the
712: \texttt{digilib} image viewing tool.
713:
714: One of the following three sets of tags is required:
715:
716: \begin{description}
1.5 casties 717: \item[img] digital image information.
1.1 casties 718:
1.5 casties 719: \begin{description}
1.12 casties 720: \item[original-size-x] The width of the original
721: image -- required. \\
722: The unit of measure can be given in the attribute \texttt{unit};
723: the default is meter ``m''. The width to be considered is the
724: total width of the scanned area.
1.5 casties 725:
1.12 casties 726: \item[original-size-y] The height of the original image -- required.
1.5 casties 727:
1.12 casties 728: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
1.5 casties 729:
1.12 casties 730: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 731: \end{description}
1.1 casties 732: \end{description}
733:
734: or
735:
736: \begin{description}
1.5 casties 737: \item[img] digital image information.
738:
739: \begin{description}
740: \item[original-dpi-x] The resolution of the hi-res scan in its width
1.12 casties 741: in pixels per inch -- required.
1.1 casties 742:
1.5 casties 743: \item[original-dpi-y] The resolution of the hi-res scan in its height
1.12 casties 744: in pixels per inch -- required.
745:
746: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
747:
748: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 749: \end{description}
1.1 casties 750: \end{description}
751:
752: or
753:
754: \begin{description}
1.5 casties 755: \item[img] digital image information.
756:
757: \begin{description}
758: \item[original-dpi] The resolution of the hi-res scan in pixels per
1.12 casties 759: inch if the resolutions in width and height are the same -- required.
760:
761: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
762:
763: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 764: \end{description}
1.1 casties 765: \end{description}
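
The three sets describe the same relation between the size of the
original, the pixel size of the scan and the resolution. With the size
given in meters and one inch equal to 0.0254\,m, the relation for the
width is
\[
  \mbox{original-pixel-x} =
  \frac{\mbox{original-size-x}}{0.0254} \times \mbox{original-dpi-x}
\]
and the height follows in the same way. As a worked example with
invented values, an original of 0.210\,m by 0.297\,m scanned at 300
pixels per inch gives about 2480 by 3508 pixels. The first and the
third set of tags could then look like this:

\begin{verbatim}
<!-- first set: physical size of the original -->
<img>
  <original-size-x unit="m">0.210</original-size-x>
  <original-size-y>0.297</original-size-y>
  <original-pixel-x>2480</original-pixel-x>
  <original-pixel-y>3508</original-pixel-y>
</img>

<!-- third set: one resolution for width and height -->
<img>
  <original-dpi>300</original-dpi>
  <original-pixel-x>2480</original-pixel-x>
  <original-pixel-y>3508</original-pixel-y>
</img>
\end{verbatim}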
1.7 casties 766:
767:
1.10 casties 768:
1.12 casties 769: \subsection{Digital image acquisition}
1.10 casties 770: \label{sec:inform-about-image}
771:
772: A description of the technology used in the process of producing a
773: digital image.
774:
775: \begin{description}
776: \item[image-acquisition] description of the image production process
777: \begin{description}
1.12 casties 778: \item[device] acquisition device (e.g. ``flatbed scanner'')
1.10 casties 779:
1.12 casties 780: \item[image-type] type and color-depth of the image -- required (e.g. ``RGB 24
1.10 casties 781: bit'')
782:
783: \item[production-comment] additional textual information about the
784: production process
785: \end{description}
786: \end{description}
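
A short example with the sample values mentioned above is shown below;
the text of the comment is invented.

\begin{verbatim}
<meta>
  <image-acquisition>
    <device>flatbed scanner</device>
    <image-type>RGB 24 bit</image-type>
    <production-comment>scanned from the bound
      volume</production-comment>
  </image-acquisition>
</meta>
\end{verbatim}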
787:
788:
1.12 casties 789:
1.7 casties 790: \subsection{Full text with images}
791: \label{sec:full-text-with}
792:
1.12 casties 793: Full text in an XML format should be specified with a
794: \texttt{content-type}\footnote{see section~\ref{tag-content-type}
795: on page~\pageref{tag-content-type}} ``fulltext''.
1.8 casties 796:
797: The relation between the full text and optional images of
798: whole pages or parts of pages must be specified in a
799: \texttt{text-tool} container.
800:
801: \begin{description}
802: \item[text-tool] representation of full text with images
803:
804: \begin{description}
805: \item[text-file] the file name of the full text file (with path
806: inside document directory)
1.12 casties 807:
1.8 casties 808: \item[page-images] the name of the directory containing the
1.12 casties 809: page image files (with path inside document directory)
1.8 casties 810:
811: \item[xslt-file] the file name of an additional XSL transformation
812: file
813:
814: \item[text-config] container for configuration options
1.10 casties 815: \begin{description}
816: \item[container-tag] the name of the text root element (default
817: ``text'')
818:
819: \item[ref-element-tag] the name of the element that is used as
820: unit of reference when results are presented
1.8 casties 821:
1.10 casties 822: \item[pagebreak-tag] the name of the element that indicates page
823: breaks (default ``pb'')
824: \end{description}
1.8 casties 825: \end{description}
826: \end{description}
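
A sketch of a \texttt{text-tool} container together with the
corresponding content type is given below. The file and directory
names and the value of \texttt{ref-element-tag} are hypothetical; the
values of \texttt{container-tag} and \texttt{pagebreak-tag} are the
stated defaults and could be left out.

\begin{verbatim}
<meta>
  <content-type>fulltext</content-type>
  <text-tool>
    <text-file>fulltext/example.xml</text-file>
    <page-images>pageimg</page-images>
    <xslt-file>example.xsl</xslt-file>
    <text-config>
      <container-tag>text</container-tag>
      <ref-element-tag>p</ref-element-tag>
      <pagebreak-tag>pb</pagebreak-tag>
    </text-config>
  </text-tool>
</meta>
\end{verbatim}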
1.7 casties 827:
1.1 casties 828:
829:
1.12 casties 830: \subsection{Copyright and access conditions}
831: \label{sec:access-conditions}
832:
833: If access to a resource is bound to conditions for technical or legal
834: reasons, then the conditions can be put in an \texttt{access-conditions}
835: container. Other access rights conditions like copyright can also be
836: documented in this container.
837:
838: \begin{description}
839: \item[access-conditions] legal and technical conditions for access to
840: this resource
841:
842: \begin{description}
843: \item[attribution] The name or institution this resource should be
844: attributed to when it is publicly presented
845:
846: \begin{description}
847: \item[name] a name (free text)
848:
849: \item[url] a URL (with an optional \texttt{label} attribute to show
850: as text)
851: \end{description}
852:
853: \item[copyright] the copyright owner and its conditions
854: \begin{description}
855: \item[owner] the name of the copyright owner
856: \begin{description}
857: \item[name] a name (free text)
858:
859: \item[url] a URL (with an optional \texttt{label} attribute to show
860: as text)
861: \end{description}
862:
863: \item[date] the date when the copyright was issued
864:
865: \item[duration] the duration of the copyright (if known)
866:
867: \item[description] free-text field for special or additional
868: conditions
869: \end{description}
1.14 ! casties 870:
! 871:
! 872: \item[publish-metadata] metadata about this resource can be made
! 873: freely available when this tag is present. Access to the resource
! 874: itself is regulated separately by the \texttt{access} element.
1.12 casties 875:
876: \item[access] conditions of access to this resource
877: \begin{description}
878: \item[internal] access should be restricted to a group of users. The
879: type of group is defined by one of the following
880: \begin{description}
881: \item[institution] the members of this institution. How a user
882: is identified as a member of the institution is not
883: specified in this document.
884:
885: \item[subnet] all computers with an IP address in this subnet. The
886: subnet is defined in ``truncated-quad'' (e.g. ``141.14'') or
887: ``address/netmask'' (e.g. ``141.14.0.0/255.255.0.0'') notation.
888:
889: \item[group] the members of this named group. How a user is
890: identified as a member of a named group is not specified in
891: this document.
892: \end{description}
893:
894: \item[scientific] access to this resource should be restricted to
895: scientific work
896:
897: \item[free] access to this resource is not restricted
898:
899: \item[special] if none of the above conditions seems appropriate,
900: a free-form text can be specified here.
901: \end{description}
902: \end{description}
903: \end{description}
904:
905: \noindent
906: Note that the metadata file only documents access conditions; it
907: does not enforce them. The conditions state what \emph{should} be
908: observed, not what \emph{is} necessarily observed. Actual control
909: over access to the resource has to be provided by additional
910: technical measures.
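
\noindent
A container following the list above might look like the sketch
below. It assumes that every entry of the list, including the kind of
access restriction, is written as a nested element; the actual
encoding may differ. The institution name, the URL, the date format
and the free-text condition are invented for this example; the subnet
value is the one used in the list above.

\begin{small}
\begin{verbatim}
<access-conditions>
  <attribution>
    <!-- hypothetical name and URL -->
    <name>Example Institute</name>
    <url label="Example Institute">http://www.example.org/</url>
  </attribution>
  <copyright>
    <owner>
      <name>Example Institute</name>
    </owner>
    <!-- date format is an assumption -->
    <date>2004</date>
    <description>Reproduction only with written
      permission.</description>
  </copyright>
  <!-- presence of this tag: metadata may be published -->
  <publish-metadata/>
  <access>
    <internal>
      <subnet>141.14</subnet>
    </internal>
  </access>
</access-conditions>
\end{verbatim}
\end{small}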
911:
912:
913:
914: \subsection{Acquisition of raw data}
915: \label{sec:acqu-inform}
916:
917: Information about the acquisition source for raw data resources can be
918: provided in an \texttt{acquisition} container.
919:
920: \begin{description}
921: \item[acquisition] the acquisition source of this resource -- required
922: for raw data.
923: \begin{description}
924: \item[provider] where this resource came from -- required
925: \begin{description}
926: \item[name] free-text name of the provider (institution or
927: individual)
928:
929: \item[address] address of the provider
930:
931: \item[contact] contact person at the provider (i.e. name and email)
932:
933: \item[url] URL related to the provider
1.13 casties 934:
935: \item[provider-id] id of the provider (internally used) -- deduced
1.12 casties 936: \end{description}
937:
938: \item[date] date of acquisition -- required
939:
940: \item[description] free-text description of the acquisition source or
941: additional information
942: \end{description}
943: \end{description}
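
\noindent
A minimal sketch of an \texttt{acquisition} container with the
required \texttt{provider} and \texttt{date} elements could look as
follows. All values, including the date format, are invented for this
example. The \texttt{provider-id} element is left out because it is
marked as deduced.

\begin{small}
\begin{verbatim}
<acquisition>
  <provider>
    <!-- hypothetical provider data -->
    <name>Example Library</name>
    <address>1 Example Street, Example City</address>
    <contact>Jane Doe, jane.doe@example.org</contact>
    <url>http://www.example.org/</url>
  </provider>
  <!-- date format is an assumption -->
  <date>2004</date>
  <description>Scans delivered on CD-ROM.</description>
</acquisition>
\end{verbatim}
\end{small}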
944:
945:
946:
947: \subsection{Documentary Films}
948: \label{sec:documentary-films}
949:
950: Documentary films can be described using a \texttt{film-acquisition}
951: container.
952:
953: \begin{description}
954: \item[film-acquisition] description of a (documentary) film --
955: required for documentary film
956: \begin{description}
957: \item[recording] specification of the recording process
958: \begin{description}
959: \item[author] the person or persons doing the recording
960:
961: \item[date] the date or time span when the film was recorded
962:
963: \item[location] the place where the film was recorded
964:
965: \item[device] recording device used (e.g. ``Sony CP-DV8 Camcorder'')
966:
967: \item[format] format of the recorded film -- required (e.g. ``DV
968: 720x524 25fps interlaced'')
969: \end{description}
970:
971: \item[description] free-form description of the recording and the
972: content of the film
973: \end{description}
974: \end{description}
975:
976: (More information about the digitization step could be added in a
977: \texttt{digitization} tag similar to the \texttt{recording} tag.)
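
\noindent
For illustration, a \texttt{film-acquisition} container might be
filled in as in the sketch below. The values for \texttt{device} and
\texttt{format} are the examples given in the list above; the author,
date, location and description are invented, and the date format is
an assumption.

\begin{small}
\begin{verbatim}
<film-acquisition>
  <recording>
    <!-- hypothetical author, date and location -->
    <author>Doe, Jane</author>
    <date>2003</date>
    <location>Berlin</location>
    <device>Sony CP-DV8 Camcorder</device>
    <format>DV 720x524 25fps interlaced</format>
  </recording>
  <description>Interview recorded in a single
    session.</description>
</film-acquisition>
\end{verbatim}
\end{small}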
978:
1.1 casties 979:
980:
981:
1.4 casties 982: \section{Sample metadata files for ECHO resources}
1.1 casties 983:
1.5 casties 984: The following is a sample metadata index file for a directory containing a
985: scanned document.
986:
987: \begin{small}
1.1 casties 988: \begin{verbatim}
1.11 casties 989: <resource type="ECHO" version="1.0">
1.5 casties 990: <description>Fleck, 1980</description>
991: <name>fleck.1980</name>
992: <creator>University of Bern</creator>
993: <archive-path>ubern/wiss-theorie</archive-path>
994: <content-type>scanned images</content-type>
995: <meta>
996: <dri>echo23a45e2329x</dri>
997: <lang>ger</lang>
998: <bib type="book">
999: <author>Fleck, Ludwik</author>
1000: <year>1980</year>
1001: <title>Entstehung und Entwicklung einer
1002: wissenschaftlichen Tatsache</title>
1003: <series-editor></series-editor>
1004: <series-title></series-title>
1005: <series-volume></series-volume>
1006: <number-of-pages></number-of-pages>
1007: <city>Frankfurt am Main</city>
1008: <publisher>Suhrkamp</publisher>
1009: <edition></edition>
1010: <number-of-volumes></number-of-volumes>
1011: <translator></translator>
1012: <isbn-issn></isbn-issn>
1013: <keywords>Wissenschaftstheorie, Fleck, Tatsache</keywords>
1014: <abstract></abstract>
1015: </bib>
1016: </meta>
1017: <dir>
1018: <description>Scanned images (300dpi)</description>
1019: <name>img</name>
1020: </dir>
1.4 casties 1021: </resource>
1022: \end{verbatim}
1.5 casties 1023: \end{small}
1.4 casties 1024:
1.5 casties 1025: The following is a sample metadata file for a single image of an
1026: architectural drawing.
1.4 casties 1027:
1.5 casties 1028: \begin{small}
1.4 casties 1029: \begin{verbatim}
1.11 casties 1030: <resource type="ECHO" version="1.0">
1.5 casties 1031: <creator>Bibliotheca Hertziana</creator>
1032: <content-type>scanned images</content-type>
1033: <file>
1034: <name>00000271-asl-160-r-full.tif</name>
1035: <meta>
1036: <img>
1037: <original-dpi>315</original-dpi>
1038: </img>
1039: <dri>echo45a67bc4367d</dri>
1040: <lang>ita</lang>
1041: <doc type="Architectural Drawing">
1042: <person>Ciolli, Giacomo</person>
1043: <person>Urban VIII; Barberini, Maffeo</person>
1044: <location>Accademia di San Luca</location>
1045: <location>Roma</location>
1046: <date>1706</date>
1047: <object>Concorso Clementino</object>
1048: <object>Fontana Pubblica</object>
1049: <object>Brunnen</object>
1050: <object>ASL 160</object>
1051: <keywords></keywords>
1052: </doc>
1053: <context>
1054: <url>http://colosseum.biblhertz.it:8080/Lineamenta/
1055: 1033478408.39/1035196181.35/1035196204.09/1035394121.83
1056: </url>
1057: </context>
1058: </meta>
1059: </file>
1.2 casties 1060: </resource>
1.1 casties 1061: \end{verbatim}
1.5 casties 1062: \end{small}
1.1 casties 1063:
1064: \end{document}
1065:
1066: %%% Local Variables:
1067: %%% mode: latex
1068: %%% TeX-master: t
1069: %%% End: