Annotation of storage/meta/meta-format.tex, revision 1.13
1.1 casties 1: \documentclass[a4paper]{article}
2:
3: \usepackage[latin1]{inputenc}
4: \usepackage[T1]{fontenc}
5: \usepackage{ae}
6: %\usepackage{times}
7: %\usepackage{courier}
8:
9: % create in-text links black (with PDF)
1.6 casties 10: \usepackage[colorlinks=true,linkcolor=black]{hyperref}
1.1 casties 11: % Format URLs nicely (without PDF)
1.6 casties 12: %\usepackage{url}
1.1 casties 13:
14:
15: \title{A simple metadata format for resource bundles}
16:
1.4 casties 17: \author{Robert Casties, Dirk Wintergrün, Hans-Christoph Liess}
1.1 casties 18:
1.13 ! casties 19: \date{V1.1.0 of 9.12.2003}
1.1 casties 20:
21: \begin{document}
22:
23: \maketitle
24:
25: \tableofcontents
26:
27:
28: \section{File and directory names}
29: \label{sec:file-directory-names}
30:
31: File and directory names should not contain spaces. Allowed characters
32: in filenames are only the alphanumeric set a-z, A-Z, 0-9, hyphen
33: ``-'', underscore ``\_'' and dot ``.''.
34:
1.12 casties 35: Files and directories with names that contain illegal characters must
36: be transformed to allowed names. A proposed simple
37: transformation rule is:
38:
39: \begin{itemize}
40: \item whitespace characters (e.g. blank, tab, cr, lf) are replaced by
41: hyphens ``-''
42:
43: \item other illegal characters are replaced by underscores ``\_''.
44: \end{itemize}
45:
46: This rule does not provide a reversible mapping to the original
47: illegal file name and it does not provide a collision-free mapping,
48: i.e. two different illegal file names might be mapped to the same
49: allowed file name. Additional precautions for these cases must be
50: taken.
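
As an illustration of the proposed rule, the following table shows how
a few file names would be transformed. The file names are invented for
this example. The last two lines show a collision: two different
original names are mapped to the same allowed name.

\begin{verbatim}
original name        transformed name
My File (1).tif      My-File-_1_.tif
page 12&13.jpg       page-12_13.jpg
a+b.txt              a_b.txt
a&b.txt              a_b.txt
\end{verbatim}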
1.1 casties 51:
1.4 casties 52:
53: \section{Metadata files}
54: \label{sec:metadata-files}
55:
56: The metadata information is stored in the XML format documented below
57: in special files in the resource directory. Two forms of metadata
58: files are possible:
59: \begin{itemize}
60: \item a file named \texttt{index.meta} in a directory.
61:
62: \item a file named like the data file it describes with an
63: additional extension \texttt{.meta}. For example metadata for the
64: file \texttt{0001.tif} would be in a file \texttt{0001.tif.meta}.
65: \end{itemize}
66:
67: The resource directory must contain an \texttt{index.meta} file with
68: information about the resource as a whole. Other directories can
69: contain \texttt{index.meta} files.
70:
71: Additional information about single data files that are part of the
72: resource can either be put in \texttt{file} tags in the
73: \texttt{index.meta} file or in separate \emph{filename}\texttt{.meta}
74: files for each data file. Information from the directory level file is
75: inherited at the file level.
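
A resource directory following these rules might look like the sketch
below. The directory and data file names are hypothetical; only the
name \texttt{index.meta} and the extension \texttt{.meta} are
prescribed by this document.

\begin{verbatim}
my-resource/
    index.meta          (required, describes the whole resource)
    pageimg/
        index.meta      (optional, describes this directory)
        0001.tif
        0001.tif.meta   (optional, describes 0001.tif)
        0002.tif
\end{verbatim}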
76:
77:
1.1 casties 78: \section{Resource format}
79: \label{sec:mpiwg-doc}
80:
81: In this description elements marked ``optional'' need not be supplied
82: by the provider of the resource and may be absent in all versions of
83: the metadata file. Elements marked ``required'' must be supplied by
84: the provider of the resource. Elements marked ``deduced'' can be
85: supplied by the provider of the resource but can also be provided by
1.4 casties 86: automatic scripts later in the process; these elements must be present
1.1 casties 87: in the final file.
88:
1.12 casties 89: File and directory paths in the metadata file use the conventional
90: Unix file separator slash ``/''.
91:
1.11 casties 92: The outer container element is \texttt{resource}. It has the following
93: \textbf{attributes}:
94:
95: \begin{description}
1.12 casties 96: \item[type] sub-type of resource (e.g. ``ECHO'', ``MPIWG'') --
97: optional.
1.11 casties 98:
1.12 casties 99: \item[version] version number of metadata format (currently 1.1) --
1.11 casties 100: required.
101: \end{description}
102:
103: \noindent The allowed \textbf{elements} inside \texttt{resource} are:
1.1 casties 104:
105: \begin{description}
106: \item[description] An informal textual description of the
107: resource -- optional.
108:
109: \item[name] The filename of the resource (name of the directory this
110: file is contained in) -- required.
111:
112: \item[creator] The name of the project or person that created the
113: resource -- optional.
1.4 casties 114:
115: \item[archive-creation-date] The time and date the archive collection
116: was created -- deduced.
1.1 casties 117:
1.4 casties 118: \item[archive-storage-date] The time and date the archive was written
119: to permanent storage -- deduced (must not be set by the user).
1.1 casties 120:
121: \item[archive-path] The full path to the resource directory inside the
1.5 casties 122: whole archive collection, including the resource directory -- deduced.
1.12 casties 123:
124: \item[archive-id] The ID for this document in the archive --
125: required.
1.1 casties 126:
127: \item[derived-from] Container for the description of the original
128: resource if this resource is a modified version of another resource
129: -- optional.
130:
131: \begin{description}
1.12 casties 132: \item[archive-id] The ID of the original resource
133: -- required.
134:
1.1 casties 135: \item[archive-path] The full path to the original resource
1.12 casties 136: -- deduced.
1.1 casties 137:
138: \item[description] An informal textual description of the relation
139: of this resource to the original resource -- optional.
140: \end{description}
141:
142: \item[linked-with] Container for the description of another
143: resource when this resource is a linked copy of another resource
144: -- optional.
145:
146: \begin{description}
1.12 casties 147: \item[archive-id] The ID of the linked resource
148: -- required.
149:
1.1 casties 150: \item[archive-path] The full path to the linked resource
1.12 casties 151: -- deduced.
1.1 casties 152:
153: \item[description] An informal textual description of the relation
154: of this resource to the linked resource -- optional.
155: \end{description}
156:
1.12 casties 157: \item[media-type] \label{tag-media-type} The main media type of this
158: resource -- required.\\ The main media type can be overridden by
159: \texttt{media-type}s in subdirectories. Possible types are
160: \begin{itemize}
161: \item \texttt{image}
162:
163: \item \texttt{text}
164:
165: \item \texttt{audio}
166:
167: \item \texttt{video}
168:
169: \item \texttt{data} for other types of data
170: \end{itemize}
1.1 casties 171:
172: \item[meta] Additional metadata information about the resource --
173: optional.\\ For a description of additional metadata see below.
174:
175: \item[dir] Container for the description of a subdirectory -- required
176: (when there are subdirectories).\\ \texttt{dir} tags should not be
177: nested. Directories at lower levels are identified by their
178: \texttt{path}.
179:
180: \begin{description}
181: \item[description] An informal textual description of the
182: subdirectory -- optional.
183:
184: \item[name] The name of the subdirectory -- required.
185:
1.12 casties 186: \item[original-name] A text string associated with the directory as
187: original name -- optional. (E.g. if the data in this directory
188: came from an external source and had a name that had to be changed
189: according to section~\ref{sec:file-directory-names} but it should
190: be possible to reference the original name.)
191:
1.1 casties 192: \item[path] The directory path of this subdirectory relative to the
1.5 casties 193: resource's root directory (excluding the directory itself) --
194: required (may be empty or omitted if the directory is a direct
195: child of the resource's root directory).
1.1 casties 196:
197: \item[meta] Additional metadata information about the directory --
198: optional.\\ For a description of additional metadata see below.
199: \end{description}
200:
201: \item[file] Container for the description of a file -- deduced.\\
202: \texttt{file} tags should not be nested in \texttt{dir} tags. Files
203: at lower directory levels are identified by their \texttt{path}.
204:
205: \begin{description}
206: \item[description] An informal textual description of the
207: file -- optional.
208:
209: \item[name] The name of the file -- required.
210:
1.12 casties 211: \item[original-name] A text string associated with the file as
212: original name -- optional. (E.g. if this file came from an
213: external source and had a name that had to be changed according to
214: section~\ref{sec:file-directory-names} but it should be possible
215: to reference the original name.)
216:
1.1 casties 217: \item[path] The directory path of this file relative to the
1.5 casties 218: resource's root directory (excluding the file itself) -- required
219: (may be empty or omitted if the file is in the resource's root
220: directory).
1.7 casties 221:
222: \item[date] The file's modification or creation date\footnote{The
223: preferred time and date format is ``YYYY/MM/DD HH:MM:SS''},
224: whichever is more recent -- optional.
1.1 casties 225:
226: \item[modification-date] The file's modification date -- optional.
227:
228: \item[creation-date] The file's creation date -- optional.
1.7 casties 229:
1.1 casties 230: \item[size] The file size -- deduced.
231:
232: \item[mime-type] The file's mime-type -- optional.
233:
234: \item[md5cs] MD5 checksum of the file content -- optional.
235:
236: \item[meta] Additional metadata information about the file --
237: optional. For a description of additional metadata see below.
238: \end{description}
239:
240: \end{description}
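
A minimal sketch of an \texttt{index.meta} file built from the
elements described above could look as follows. All values (names,
ID, path, date, size) are invented for illustration. The format of the
\texttt{archive-id} and the unit of \texttt{size} (assumed here to be
bytes) are not specified in this document.

\begin{verbatim}
<resource type="MPIWG" version="1.1">
  <name>my-resource</name>
  <description>Scanned pages of a printed book</description>
  <creator>Example project</creator>
  <archive-id>EXAMPLE-ID-0001</archive-id>
  <archive-path>/archive/example/my-resource</archive-path>
  <media-type>image</media-type>
  <dir>
    <name>pageimg</name>
    <description>High resolution page images</description>
  </dir>
  <file>
    <name>0001.tif</name>
    <path>pageimg</path>
    <date>2003/12/09 12:00:00</date>
    <size>1234567</size>
    <mime-type>image/tiff</mime-type>
  </file>
</resource>
\end{verbatim}

The \texttt{path} of the \texttt{dir} element is omitted because the
directory is a direct child of the resource's root directory, while
the \texttt{file} element uses \texttt{path} to locate the file in the
subdirectory without nesting it inside \texttt{dir}.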
241:
242:
243:
244: \section{Additional metadata}
245: \label{sec:additional-metadata}
246:
247: All elements with \texttt{meta} tags can contain an arbitrary number
1.12 casties 248: of the following additional metadata elements.
249:
250: \subsection{Workflow state}
251: \label{sec:workflow-state}
252:
253: All additional metadata elements can have a \texttt{workflow-state}
254: \textbf{attribute}. This attribute reflects the state of the
255: corresponding metadata element. The possible values for the
256: \texttt{workflow-state} attribute are
257: \begin{itemize}
258: \item \texttt{preliminary} this information is preliminary. It must
259: be checked in further workflow steps.
260:
261: \item \texttt{inwork}
262:
263: \item \texttt{final}
264: \end{itemize}
265:
266: Workflow states other than \texttt{preliminary} are part of the
267: workflow handling of the respective projects.
268:
269: Metadata elements can appear multiple times with different
270: \texttt{workflow-state} attributes. This enables metadata versioning.
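
As a sketch of this kind of versioning, the same metadata element can
be given twice with different states. The example uses the
\texttt{lang} element with invented values: a preliminary guess and a
checked final value.

\begin{verbatim}
<meta>
  <lang workflow-state="preliminary">de</lang>
  <lang workflow-state="final">la</lang>
</meta>
\end{verbatim}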
271:
272:
273:
274: \subsection{Content type}
275: \label{sec:content-type}
276:
277: \begin{description}
278: \item[content-type] \label{tag-content-type} The content type of this
279: resource -- required.\\
280: The content type enables the choice of tools to manipulate and
281: display the resource. There should be a common list of content
282: types. For digital documents (books, manuscripts) this would be
283: ``scanned document'', for other image data ``scanned
284: images''.\footnote{The criterion for documents is an ordered
285: succession of image files (pages) and equal image size and
286: resolution throughout the images of a resource.}
287: \end{description}
288:
289:
1.1 casties 290:
1.4 casties 291: \subsection{Language}
292: \label{sec:lang}
293:
294: The language of a resource (e.g. a text) can be specified with a
295: \texttt{lang} tag. Languages have to be described using the
296: international codes for the representation of names of languages
297: either in two-letter form (ISO 639-1) or in three-letter form (ISO
298: 639-2). The entire catalogue of languages is documented on the page
299:
300: \url{http://www.loc.gov/standards/iso639-2/englangn.html}
301:
1.1 casties 302:
303: \subsection{DRI}
304: \label{sec:dri}
305:
306: The \emph{digital resource identifier} for the resource is specified
1.4 casties 307: in a \texttt{dri} element. Digital resource identifiers are documented
1.1 casties 308: on the page
309:
310: \url{http://pythia.mpiwg-berlin.mpg.de/projects/standards/dri}.
311:
312:
1.4 casties 313:
314: \subsection{Collection context}
315: \label{sec:collection-context}
316:
317: The context of a resource as part of a collection or part of a project can be
1.5 casties 318: specified in the \texttt{context} element. All elements in the
319: container can appear multiple times.
1.4 casties 320:
321: \begin{description}
1.5 casties 322: \item[context] information on collection or project context.
1.4 casties 323:
1.5 casties 324: \begin{description}
325: \item[link] URL to additional context information.
326:
327: \item[name] Textual description of project or collection.
328: \end{description}
1.4 casties 329: \end{description}
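
A possible \texttt{context} entry is sketched below. The URL and the
project name are placeholders, not real references.

\begin{verbatim}
<meta>
  <context>
    <link>http://www.example.org/my-project</link>
    <name>Example digitisation project</name>
  </context>
</meta>
\end{verbatim}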
1.5 casties 330:
1.4 casties 331:
332:
333:
1.1 casties 334: \subsection{Bibliographic information}
335: \label{sec:bibliographic-data}
336:
1.5 casties 337: Bibliographic information is presented in a \texttt{bib} container with
1.1 casties 338: a \texttt{type} parameter, giving the type of bibliographic resource.
1.4 casties 339: The \texttt{type} field can be repeated as a tag in the container.
340:
1.5 casties 341: The format is based on the ECHO scheme for bibliographic data (cf.
342: content workflow), the MPIWG ``Projektbibliografie'' and the format of
343: the commonly used program ``EndNote''.
344:
1.4 casties 345:
346: \subsubsection{Book}
347:
348: \begin{description}
349:
350: \item [bib type="book"] a published book.
351:
352: \begin{description}
353: \item [author] The author of the book.
354: \item [year] The year of publication.
355: \item [title] Title of the book.
356: \item [series-editor] Name of the series editor, if the book appears
357: in a series.
358: \item [series-title] Title of the series, if the book appears in a
359: series.
360: \item [series-volume] Volume number, if the book appears in a
361: series.
362: \item [number-of-pages] Number of pages of the entire book.
363: \item [city] City where the book was published.
364: \item [publisher] Name of the publishing company
365: \item [edition] Edition of the book (e.g. third edition)
366: \item [number-of-volumes] Number of volumes, if the book is
367: published in multiple volumes.
368: \item [translator] Name of the translator.
369: \item [isbn-issn]
370: \end{description}
371: \end{description}
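
A short sketch of a \texttt{bib} container for a book is given below.
Only some of the elements are used. The notation of the author's name
(last name, first name) is an assumption; this document does not
prescribe a name format for bibliographic entries.

\begin{verbatim}
<meta>
  <bib type="book">
    <author>Newton, Isaac</author>
    <year>1687</year>
    <title>Philosophiae naturalis principia mathematica</title>
    <city>London</city>
  </bib>
</meta>
\end{verbatim}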
372:
373: \subsubsection{In Book}
374:
375: \begin{description}
376: \item [bib type="inbook"] an article as part of a book.
377:
378: \begin{description}
379: \item [author] The author of the article.
380: \item [year] The year of publication.
381: \item [title] Title of the article.
382: \item [editor] Name of the book's editor.
383: \item [book-title] Title of the book.
384: \item [series-volume] Volume number, if the book appears in a
385: series.
386: \item [pages] Number of pages of the article.
387: \item [city] City where the book was published.
388: \item [publisher] Name of the publishing company
389: \item [edition] Edition of the book (e.g. third edition)
390: \item [series-author] Name of the series editor, if the book appears
391: in a series.
392: \item [series-title] Title of the series, if the book appears in a
393: series.
394: \item [number-of-volumes] Number of volumes, if the book is
395: published in multiple volumes.
396: \item [translator] Name of the translator
397: \item [isbn-issn]
398: \end{description}
399: \end{description}
400:
401: \subsubsection{Proceedings}
402:
403: \begin{description}
404: \item [bib type="proceedings"] a conference proceedings publication.
405:
406: \begin{description}
407: \item [author] The author of the article.
408: \item [year] The year of publication.
409: \item [title] Title of the article.
410: \item [editor] Name of the book's editor.
411: \item [conference-name] Name of the conference the proceedings are
412: related to.
413: \item [volume] Volume number.
414: \item [pages] Number of pages of the article.
415: \item [date] Date of the conference the proceedings are related to.
416: \item [conference-location] City where the conference was held.
417: \item [publisher] Name of the publishing company
418: \item [edition] Edition of the book (e.g. third edition)
419: \item [series-editor] Name of the series editor, if the book appears
420: in a series.
421: \item [series-title] Title of the series, if the book appears in a
422: series.
423: \item [number-of-volumes] Number of volumes, if the book is
424: published as multiple volumes.
425: \item [isbn-issn]
426: \end{description}
427: \end{description}
428:
429: \subsubsection{Edited Book}
430:
431: \begin{description}
432: \item[bib type="edited-book"] a book that is the edition of another
433: work.
434:
435: \begin{description}
436: \item [editor] Name of the editor of the book.
437: \item [year] The year of publication.
438: \item [title] Title of the book.
439: \item [series-editor] Name of the editor of the series the book is
440: part of.
441: \item [series-title] Title of the series, if the book is part of a
442: series.
443: \item [series-volume] Volume number, if the book appears in a series.
444: \item [number-of-pages] Number of pages of the book.
445: \item [city] City where the book was published.
446: \item [publisher] Name of the publishing company
447: \item [edition] Information about the edition (e.g. ``Repr. of the London ed. 1652'')
448: \item [number-of-volumes] Number of volumes, if the book is
449: published as multiple volumes.
450: \item [isbn-issn]
451: \end{description}
452: \end{description}
453:
454: \subsubsection{Journal Article}
455:
456: \begin{description}
457: \item [bib type="journal-article"] an article in a scientific journal.
458: \begin{description}
459: \item [author] The author of the article.
460: \item [year] The year of publication.
461: \item [title] Title of the article.
462: \item [journal] Name of the journal.
463: \item [volume] Volume number, if the journal appears in a series.
464: \item [issue] Number of the issue the article is part of.
465: \item [pages] Number of pages of the article.
466: \item [alternate-journal] Alternate Journal
467: \item [isbn-issn]
468: \end{description}
469: \end{description}
470:
471: \subsubsection{Magazine Article}
472:
473: \begin{description}
474: \item [bib type="magazine-article"] an article in a popular magazine.
475: \begin{description}
476: \item [author] The author of the article.
477: \item [year] The year of publication.
478: \item [title] Title of the article.
479: \item [magazine] Name of the magazine.
480: \item [volume] Volume number, if the book appears in a series.
481: \item [issue-number] Number of the issue the article is part of.
482: \item [pages] Number of pages of the article.
483: \item [date] Date when the article appeared.
484: \end{description}
485: \end{description}
486:
487: \subsubsection{Newspaper Article}
488:
489: \begin{description}
490: \item [bib type="newspaper-article"] an article in a newspaper.
491: \begin{description}
492: \item [author] The author of the article.
493: \item [year] The year of publication.
494: \item [title] Title of the article.
495: \item [newspaper] Name of the newspaper the article appeared in.
496: \item [pages] Number of pages of the article.
497: \item [issue-date] Date of the issue the article is part of.
498: \item [city] City of the newspaper.
499: \end{description}
500: \end{description}
501:
502: \subsubsection{Thesis}
503:
504: \begin{description}
505: \item [bib type="thesis"] a master/doctorate/etc. thesis.
506: \begin{description}
507: \item [author] The author of the thesis.
508: \item [year] The year of publication.
509: \item [title] Title of the thesis.
510: \item [academic-department] Name of the academic department where
511: the thesis was handed in.
512: \item [number-of-pages] Number of pages of the thesis.
513: \item [city] City where the thesis was published.
514: \item [university] Name of the university where the thesis was
515: handed in.
516: \item [isbn-issn]
517: \end{description}
518: \end{description}
519:
520: \subsubsection{Report}
521:
522: \begin{description}
523: \item [bib type="report"] a scientific report.
524: \begin{description}
525: \item [author] The author of the report.
526: \item [year] The year of publication.
527: \item [title] Title of the report.
528: \item [pages] Number of pages of the report.
529: \item [date] Date when the report appeared.
530: \item [city] City where the report was published.
531: \item [institution] Institution where the report was produced.
532: \item [type] Type of report.
533: \item [report-number] Report number.
534: \end{description}
535: \end{description}
536:
1.5 casties 537: \subsubsection{Manuscript}
538:
539: \begin{description}
540: \item [bib type="manuscript"] a handwritten/typewritten manuscript.
541:
542: \begin{description}
543: \item [title] Title of the manuscript.
544: \item [author] The author of the text.
545: \item [location] Name of the library where the manuscript is
546: currently located.
547: \item [year] The year or century of publication.
548: \item [pages] Number of pages of the manuscript.
549: \item [signature] Signature of the manuscript.
550: \item [editorial-remarks] Remarks related to the online
551: publication of the manuscript. This could be notes about
552: annotations etc.
553: \item [description] This can be any kind of description.
554: \item [keywords] Keywords related to the manuscript.
555: \end{description}
556: \end{description}
557:
558:
1.4 casties 559: \subsubsection{Generic}
560:
561: \begin{description}
562: \item [bib type="generic"] a generic bibliographic type. This type
563: should only be used in rare cases.
564: \begin{description}
565: \item [author]
566: \item [year]
567: \item [title]
568: \item [secondary-author]
569: \item [secondary-title]
570: \item [volume]
571: \item [number]
572: \item [pages]
573: \item [date]
574: \item [place-published]
575: \item [publisher]
576: \item [edition]
577: \item [tertiary-author]
578: \item [tertiary-title]
579: \item [number-of-volumes]
580: \item [type-of-work]
581: \item [subsidiary-author]
582: \item [alternate-title]
583: \item [isbn-issn]
584: \item [call-number]
585: \item [label]
586: \item [keywords]
587: \item [abstract]
588: \item [notes]
589: \item [url]
1.5 casties 590: \end{description}
1.4 casties 591: \end{description}
592:
593:
594: \subsection{Architectural drawings}
595: \label{sec:doc}
596:
597: Specific information for architectural drawings is presented in a
1.5 casties 598: \texttt{doc} container with an additional \texttt{type} attribute
599: giving the type of drawing. All elements inside the container can
600: appear multiple times.
1.4 casties 601:
602: \begin{description}
1.5 casties 603:
604: \item[doc type="Architectural Drawing"] architectural drawing.
605:
606: \begin{description}
607: \item [person] last name and first name of a person, separated by a
608: comma. A further common name for the person can be put in front,
609: separated by a semicolon.
610: \item [location] Name of a place in its common notation. This can be
611: a city or an institution.
612: \item [date] This can be a year (or several years, separated by
613: commas) or a period (1706-1714). Years are noted with four digits.
614: \item [object] Short description of an object or signatures.
615: \item [keywords] Keywords related to the object.
616: \end{description}
1.4 casties 617: \end{description}
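
The notation rules above can be illustrated with the following sketch.
The person, place and object are invented. The \texttt{person} element
shows a common name in front, followed by last name and first name,
and the two \texttt{date} elements show a period and a list of years.

\begin{verbatim}
<meta>
  <doc type="Architectural Drawing">
    <person>Il Maestro; Rossi, Giovanni</person>
    <location>Rome</location>
    <date>1706-1714</date>
    <date>1720, 1722</date>
    <object>Ground plan of a chapel</object>
    <keywords>chapel, ground plan</keywords>
  </doc>
</meta>
\end{verbatim}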
1.1 casties 618:
619:
1.10 casties 620: \subsection{Document structure (table of contents)}
1.1 casties 621: \label{sec:toc}
622:
1.4 casties 623: Information on the structure of a document like the division into
624: parts and chapters in the way of a table of contents is presented in a
625: \texttt{toc} container.
626:
627: The scheme allows multiple logical pages on a single page image
628: as it is often the case with scanned books or manuscripts. The scheme
629: also allows for ``loose'' numbering schemes with roman, arabic or
630: other page numbers consecutively or mixed and changes in the numbering
631: within the document.
632:
633: The flexibility comes from the fact that no additional assumptions
634: about the mapping between logical pages and page images are made in
635: the format. All mapping information is specified by the user.
636:
637: The logical page numbering or naming that can be presented to the user
638: is specified in the \texttt{name} tags while the physical numbering of
639: the page images is specified in the \texttt{index} or \texttt{url}
640: tags.
1.1 casties 641:
1.4 casties 642: \begin{description}
1.5 casties 643: \item[toc] container for document structure
644:
1.4 casties 645: \begin{description}
1.5 casties 646: \item[page] describes a single logical page
647:
648: \begin{description}
649: \item[name] the ``name'' of the logical page. This can be any string
650: like a page number (arabic, roman, etc.) or a special designation
651: like ``Table 5''.
652:
653: \item[index] the \texttt{digilib} index number\footnote{The index
654: number for digilib is the index in the alphabetical order of the
655: scan file names.} of the scan image of the page.
656:
657: \item[url] as an alternative to the \texttt{digilib} index number, the
658: full URL of the scan image of the page can be used.
659: \end{description}
1.4 casties 660:
1.5 casties 661: \item[chapter] describes a section or chapter of the text.
662: \texttt{chapter} elements can be nested.
1.1 casties 663:
1.4 casties 664: \begin{description}
1.5 casties 665: \item[name] the title of the chapter or section.
666:
667: \item[start] the beginning of a page range (usually the first page
668: of the chapter). The \texttt{start} element has an optional
669: \texttt{increment} attribute to indicate the number of logical
670: pages on a scan image.\footnote{This information is only needed by
671: additional tools that try to generate lists of all page and
672: image numbers.}
673:
674: \begin{description}
675: \item[name] the ``name'' of the first page (see \texttt{page}).
676:
677: \item[index] the index of the first page (see \texttt{page}).
678:
679: \item[url] the URL of the first page (see \texttt{page}).
680: \end{description}
681:
682: \item[end] the end of a page range (usually the last page of the
683: chapter).
684:
685: \begin{description}
686: \item[name] the ``name'' of the last page (see \texttt{page}).
687:
688: \item[index] the index of the last page (see \texttt{page}).
689:
690: \item[url] the URL of the last page (see \texttt{page}).
691: \end{description}
692:
693: \item[page] as an alternative (or in addition) to
694: \texttt{start}/\texttt{end} page ranges, single \texttt{page}
695: elements can be used inside \texttt{chapter}.
1.4 casties 696: \end{description}
697: \end{description}
698: \end{description}
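
The following sketch shows how the mapping between logical pages and
scan images could be written down. The book structure is invented, and
the example assumes that the \texttt{digilib} index starts at 1. The
preface has one logical page per scan image (pages i to iv on images 3
to 6). Chapter 1 has two logical pages per scan image, which is
indicated by \texttt{increment="2"}: pages 1 and 2 are on image 7,
pages 3 and 4 on image 8, and so on, so page 20 is on image 16.

\begin{verbatim}
<meta>
  <toc>
    <page>
      <name>Title page</name>
      <index>1</index>
    </page>
    <chapter>
      <name>Preface</name>
      <start><name>i</name><index>3</index></start>
      <end><name>iv</name><index>6</index></end>
    </chapter>
    <chapter>
      <name>Chapter 1</name>
      <start increment="2"><name>1</name><index>7</index></start>
      <end><name>20</name><index>16</index></end>
      <chapter>
        <name>Section 1.1</name>
        <start increment="2"><name>1</name><index>7</index></start>
        <end><name>8</name><index>10</index></end>
      </chapter>
    </chapter>
  </toc>
</meta>
\end{verbatim}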
699:
700: %%\url{http://pythia.mpiwg-berlin.mpg.de/toolserver/TS_lise}
1.1 casties 701:
702:
1.12 casties 703: \subsection{Digital images}
1.1 casties 704: \label{sec:inform-scann-imag}
705:
706: Image files representing scanned images can have an \texttt{img}
707: container tag with information about the scan resolution and the size
708: of the original image. This information is used by the
709: \texttt{digilib} image viewing tool.
710:
711: One of three possible sets of tags is required:
712:
713: \begin{description}
1.5 casties 714: \item[img] digital image information.
1.1 casties 715:
1.5 casties 716: \begin{description}
1.12 casties 717: \item[original-size-x] The width of the original
718: image -- required. \\
719: The unit of measure can be given in the attribute \texttt{unit},
720: the default is meter ``m''. The width to be considered is the
721: total width of the scanned area.
1.5 casties 722:
1.12 casties 723: \item[original-size-y] The height of the original image -- required.
1.5 casties 724:
1.12 casties 725: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
1.5 casties 726:
1.12 casties 727: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 728: \end{description}
1.1 casties 729: \end{description}
730:
731: or
732:
733: \begin{description}
1.5 casties 734: \item[img] digital image information.
735:
736: \begin{description}
737: \item[original-dpi-x] The resolution of the hi-res scan in its width
1.12 casties 738: in pixels per inch -- required.
1.1 casties 739:
1.5 casties 740: \item[original-dpi-y] The resolution of the hi-res scan in its height
1.12 casties 741: in pixels per inch -- required.
742:
743: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
744:
745: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 746: \end{description}
1.1 casties 747: \end{description}
748:
749: or
750:
751: \begin{description}
1.5 casties 752: \item[img] digital image information.
753:
754: \begin{description}
755: \item[original-dpi] The resolution of the hi-res scan in pixels per
1.12 casties 756: inch if the resolutions in width and height are the same -- required.
757:
758: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
759:
760: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 761: \end{description}
1.1 casties 762: \end{description}
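
The three variants carry the same information in different forms. As a
worked example with invented values, take a scan of 2480 by 3508
pixels at 300 pixels per inch in both directions. The width of the
original is then $2480 / 300 \approx 8.27$ inch, or $8.27 \times
0.0254 \approx 0.21$ m, and the height is $3508 / 300 \approx 11.69$
inch, or about $0.297$ m. The first and the third variant could be
written as shown below. Using the \texttt{unit} attribute on
\texttt{original-size-y} as well is an assumption; this document only
mentions it for \texttt{original-size-x}.

\begin{verbatim}
<meta>
  <img>
    <original-size-x unit="m">0.21</original-size-x>
    <original-size-y unit="m">0.297</original-size-y>
    <original-pixel-x>2480</original-pixel-x>
    <original-pixel-y>3508</original-pixel-y>
  </img>
</meta>

<meta>
  <img>
    <original-dpi>300</original-dpi>
    <original-pixel-x>2480</original-pixel-x>
    <original-pixel-y>3508</original-pixel-y>
  </img>
</meta>
\end{verbatim}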
1.7 casties 763:
764:
1.10 casties 765:
1.12 casties 766: \subsection{Digital image acquisition}
1.10 casties 767: \label{sec:inform-about-image}
768:
769: A description of the technology used in the process of producing a
770: digital image.
771:
772: \begin{description}
773: \item[image-acquisition] description of the image production process
774: \begin{description}
1.12 casties 775: \item[device] acquisition device (e.g. ``flatbed scanner'')
1.10 casties 776:
1.12 casties 777: \item[image-type] type and color-depth of the image -- required (e.g. ``RGB 24
1.10 casties 778: bit'')
779:
780: \item[production-comment] additional textual information about the
781: production process
782: \end{description}
783: \end{description}
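
A sketch of an \texttt{image-acquisition} container, using the example
values given above and an invented comment:

\begin{verbatim}
<meta>
  <image-acquisition>
    <device>flatbed scanner</device>
    <image-type>RGB 24 bit</image-type>
    <production-comment>Scanned from the bound volume</production-comment>
  </image-acquisition>
</meta>
\end{verbatim}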
784:
785:
1.12 casties 786:
1.7 casties 787: \subsection{Full text with images}
788: \label{sec:full-text-with}
789:
1.12 casties 790: Full text in an XML format should be specified with a
791: \texttt{content-type}\footnote{see section~\ref{tag-content-type}
792: on page~\pageref{tag-content-type}} ``fulltext''.
1.8 casties 793:
794: The relation between the full text and optional images of
795: whole pages or parts of pages must be specified in a
796: \texttt{text-tool} container.
797:
798: \begin{description}
799: \item[text-tool] representation of full text with images
800:
801: \begin{description}
802: \item[text-file] the file name of the full text file (with path
803: inside document directory)
1.12 casties 804:
1.8 casties 805: \item[page-images] the directory name of the directory containing the
1.12 casties 806: page image files (with path inside document directory)
1.8 casties 807:
808: \item[xslt-file] the file name of an additional XSL transformation
809: file
810:
811: \item[text-config] container for configuration options
1.10 casties 812: \begin{description}
813: \item[container-tag] the name of the text root element (default
814: ``text'')
815:
816: \item[ref-element-tag] the name of the element that is used as
817: unit of reference when results are presented
1.8 casties 818:
1.10 casties 819: \item[pagebreak-tag] the name of the element that indicates page
820: breaks (default ``pb'')
821: \end{description}
1.8 casties 822: \end{description}
823: \end{description}
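
A \texttt{text-tool} container for a full text with page images might
look like the sketch below. The file and directory names and the
reference element \texttt{p} are hypothetical; \texttt{text} and
\texttt{pb} are the default values named above.

\begin{verbatim}
<meta>
  <content-type>fulltext</content-type>
  <text-tool>
    <text-file>fulltext/text.xml</text-file>
    <page-images>pageimg</page-images>
    <xslt-file>fulltext/display.xsl</xslt-file>
    <text-config>
      <container-tag>text</container-tag>
      <ref-element-tag>p</ref-element-tag>
      <pagebreak-tag>pb</pagebreak-tag>
    </text-config>
  </text-tool>
</meta>
\end{verbatim}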
1.7 casties 824:
1.1 casties 825:
826:
1.12 casties 827: \subsection{Copyright and access conditions}
828: \label{sec:access-conditions}
829:
830: If the access to a resource is bound to conditions for technical or legal
831: reasons then the conditions can be put in an \texttt{access-conditions}
832: container. Other access rights conditions like copyright can also be
833: documented in this container.
834:
835: \begin{description}
836: \item[access-conditions] legal and technical conditions for access to
837: this resource
838:
839: \begin{description}
840: \item[attribution] The name or institution this resource should be
841: attributed to when it's publicly presented
842:
843: \begin{description}
844: \item[name] a name (free text)
845:
846: \item[url] a URL (with an optional \texttt{label} attribute to show
847: as text)
848: \end{description}
849:
850: \item[copyright] the copyright owner and its conditions
851: \begin{description}
852: \item[owner] the name of the copyright owner
853: \begin{description}
854: \item[name] a name (free text)
855:
856: \item[url] a URL (with an optional \texttt{label} attribute to show
857: as text)
858: \end{description}
859:
860: \item[date] the date when the copyright was issued
861:
862: \item[duration] the duration of the copyright (if known)
863:
864: \item[description] free-text field for special or additional
865: conditions
866: \end{description}
867:
868: \item[access] conditions of access to this resource
869: \begin{description}
870: \item[internal] access should be restricted to a group of users. The
871: type of group is defined by one of the following
872: \begin{description}
873: \item[institution] the members of this institution. How a user
874: is identified as a member of the institution is not specified
875: in this document.
876:
877: \item[subnet] all computers with an IP address in this subnet. The
878: subnet is defined in ``truncated-quad'' (e.g. ``141.14'') or
879: ``address/netmask'' (e.g. ``141.14.0.0/255.255.0.0'') notation.
880:
881: \item[group] the members of this named group. How a user is
882: identified as a member of a named group is not specified in
883: this document.
884: \end{description}
885:
886: \item[scientific] access to this resource should be restricted to
887: scientific work
888:
889: \item[free] access to this resource is not restricted
890:
891: \item[special] if none of the above conditions seems appropriate,
892: a free-form text can be specified here.
893: \end{description}
894: \end{description}
895: \end{description}
896:
897: \noindent
898: Note that the access conditions in the metadata file only
899: state which conditions \emph{should} be observed, not that they
900: \emph{are} observed. Actual control over access to the
901: resource is not part of the metadata and has to be enforced by
902: additional technical measures.
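
An \texttt{access-conditions} container for a resource that is
restricted to one subnet could look like the following sketch. All
names, URLs, dates and texts are made up for illustration. The sketch
assumes that the nested items of the list above map to nested elements
in the same way as in the sample files of this document, and it does
not assume any particular date or duration format.

\begin{small}
\begin{verbatim}
<access-conditions>
  <attribution>
    <!-- hypothetical institution -->
    <name>Example Library</name>
    <url label="Example Library">http://www.example.org/</url>
  </attribution>
  <copyright>
    <owner>
      <name>Example Library</name>
      <url>http://www.example.org/</url>
    </owner>
    <date>2003</date>
    <duration>70 years</duration>
    <description>Reproduction only with written
      permission.</description>
  </copyright>
  <access>
    <internal>
      <!-- address/netmask notation -->
      <subnet>141.14.0.0/255.255.0.0</subnet>
    </internal>
  </access>
</access-conditions>
\end{verbatim}
\end{small}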
903:
904:
905:
906: \subsection{Acquisition of raw data}
907: \label{sec:acqu-inform}
908:
909: Information about the acquisition source for raw data resources can be
910: provided in an \texttt{acquisition} container.
911:
912: \begin{description}
913: \item[acquisition] the acquisition source of this resource -- required
914: for raw data.
915: \begin{description}
916: \item[provider] where this resource came from -- required
917: \begin{description}
918: \item[name] free-text name of the provider (institution or
919: individual)
920:
921: \item[address] address of the provider
922:
923: \item[contact] contact person at the provider (i.e. name and email)
924:
925: \item[url] URL related to the provider
1.13 ! casties 926:
! 927: \item[provider-id] id of the provider (internally used) -- deduced
1.12 casties 928: \end{description}
929:
930: \item[date] date of acquisition -- required
931:
932: \item[description] free-text description of the acquisition source or
933: additional information
934: \end{description}
935: \end{description}
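
A minimal \texttt{acquisition} container might look like the following
sketch. The provider details, the date and the description are
hypothetical, and the date format is an assumption since it is not
defined in this section. The \texttt{provider-id} tag is left out
because it is marked as deduced.

\begin{small}
\begin{verbatim}
<acquisition>
  <provider>
    <!-- hypothetical provider -->
    <name>Example University Library</name>
    <address>1 Example Street, 12345 Example City</address>
    <contact>Jane Doe, jane.doe@example.org</contact>
    <url>http://www.example.org/</url>
  </provider>
  <!-- assumed date format -->
  <date>2003-12-09</date>
  <description>Scanned from microfilm.</description>
</acquisition>
\end{verbatim}
\end{small}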
936:
937:
938:
939: \subsection{Documentary Films}
940: \label{sec:documentary-films}
941:
942: Documentary films can be described using a \texttt{film-acquisition}
943: container.
944:
945: \begin{description}
946: \item[film-acquisition] description of a (documentary) film --
947: required for documentary film
948: \begin{description}
949: \item[recording] specification of the recording process
950: \begin{description}
951: \item[author] the person or persons doing the recording
952:
953: \item[date] the date or time span when the film was recorded
954:
955: \item[location] the place where the film was recorded
956:
957: \item[device] recording device used (e.g. ``Sony CP-DV8 Camcorder'')
958:
959: \item[format] format of the recorded film -- required (e.g. ``DV
960: 720x524 25fps interlaced'')
961: \end{description}
962:
963: \item[description] free-form description of the recording and the
964: content of the film
965: \end{description}
966: \end{description}
967:
968: (More information about the digitization step could be added in a
969: \texttt{digitization} tag similar to the \texttt{recording} tag.)
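
A \texttt{film-acquisition} container could be filled in as in the
following sketch. The author, date, location and description are
hypothetical; the device and format values are the examples given in
the list above.

\begin{small}
\begin{verbatim}
<film-acquisition>
  <recording>
    <!-- hypothetical author, date and location -->
    <author>Jane Doe</author>
    <date>2003</date>
    <location>Berlin</location>
    <device>Sony CP-DV8 Camcorder</device>
    <format>DV 720x524 25fps interlaced</format>
  </recording>
  <description>Interview recorded for an oral history
    project.</description>
</film-acquisition>
\end{verbatim}
\end{small}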
970:
1.1 casties 971:
972:
973:
1.4 casties 974: \section{Sample metadata files for ECHO resources}
1.1 casties 975:
1.5 casties 976: The following is a sample metadata index file for a directory containing a
977: scanned document.
978:
979: \begin{small}
1.1 casties 980: \begin{verbatim}
1.11 casties 981: <resource type="ECHO" version="1.0">
1.5 casties 982: <description>Fleck, 1980</description>
983: <name>fleck.1980</name>
984: <creator>University of Bern</creator>
985: <archive-path>ubern/wiss-theorie</archive-path>
986: <content-type>scanned images</content-type>
987: <meta>
988: <dri>echo23a45e2329x</dri>
989: <lang>ger</lang>
990: <bib type="book">
991: <author>Fleck, Ludwik</author>
992: <year>1980</year>
993: <title>Entstehung und Entwicklung einer
994: wissenschaftlichen Tatsache</title>
995: <series-editor></series-editor>
996: <series-title></series-title>
997: <series-volume></series-volume>
998: <number-of-pages></number-of-pages>
999: <city>Frankfurt am Main</city>
1000: <publisher>Suhrkamp</publisher>
1001: <edition></edition>
1002: <number-of-volumes></number-of-volumes>
1003: <translator></translator>
1004: <isbn-issn></isbn-issn>
1005: <keywords>Wissenschaftstheorie, Fleck, Tatsache</keywords>
1006: <abstract></abstract>
1007: </bib>
1008: </meta>
1009: <dir>
1010: <description>Scanned images (300dpi)</description>
1011: <name>img</name>
1012: </dir>
1.4 casties 1013: </resource>
1014: \end{verbatim}
1.5 casties 1015: \end{small}
1.4 casties 1016:
1.5 casties 1017: The following is a sample metadata file for a single image of an
1018: architectural drawing.
1.4 casties 1019:
1.5 casties 1020: \begin{small}
1.4 casties 1021: \begin{verbatim}
1.11 casties 1022: <resource type="ECHO" version="1.0">
1.5 casties 1023: <creator>Bibliotheca Hertziana</creator>
1024: <content-type>scanned images</content-type>
1025: <file>
1026: <name>00000271-asl-160-r-full.tif</name>
1027: <meta>
1028: <img>
1029: <original-dpi>315</original-dpi>
1030: </img>
1031: <dri>echo45a67bc4367d</dri>
1032: <lang>ita</lang>
1033: <doc type="Architectural Drawing">
1034: <person>Ciolli, Giacomo</person>
1035: <person>Urban VIII; Barberini, Maffeo</person>
1036: <location>Accademia di San Luca</location>
1037: <location>Roma</location>
1038: <date>1706</date>
1039: <object>Concorso Clementino</object>
1040: <object>Fontana Pubblica</object>
1041: <object>Brunnen</object>
1042: <object>ASL 160</object>
1043: <keywords></keywords>
1044: </doc>
1045: <context>
1046: <url>http://colosseum.biblhertz.it:8080/Lineamenta/
1047: 1033478408.39/1035196181.35/1035196204.09/1035394121.83
1048: </url>
1049: </context>
1050: </meta>
1051: </file>
1.2 casties 1052: </resource>
1.1 casties 1053: \end{verbatim}
1.5 casties 1054: \end{small}
1.1 casties 1055:
1056: \end{document}
1057:
1058: %%% Local Variables:
1059: %%% mode: latex
1060: %%% TeX-master: t
1061: %%% End: