Annotation of storage/meta/meta-format.tex, revision 1.27
1.1 casties 1: \documentclass[a4paper]{article}
2:
3: \usepackage[latin1]{inputenc}
4: \usepackage[T1]{fontenc}
5: \usepackage{ae}
6: %\usepackage{times}
7: %\usepackage{courier}
8:
9: % create in-text links black (with PDF)
1.6 casties 10: \usepackage[colorlinks=true,linkcolor=black]{hyperref}
1.1 casties 11: % Format URLs nicely (without PDF)
1.6 casties 12: %\usepackage{url}
1.1 casties 13:
14:
15: \title{A simple metadata format for resource bundles}
16:
1.4 casties 17: \author{Robert Casties, Dirk Wintergrün, Hans-Christoph Liess}
1.1 casties 18:
1.24 casties 19: \date{V1.3.8 of 30.8.2010}
1.1 casties 20:
21: \begin{document}
22:
23: \maketitle
24:
25: \tableofcontents
26:
27:
28: \section{File and directory names}
29: \label{sec:file-directory-names}
30:
31: File and directory names should not contain spaces. Allowed characters
32: in filenames are only the alphanumeric set a-z, A-Z, 0-9, hyphen
33: ``-'', underscore ``\_'' and dot ``.''.
34:
1.12 casties 35: Files and directories with names that contain illegal characters must
36: be transformed to allowed names. A proposed simple
37: transformation rule is:
38:
39: \begin{itemize}
40: \item whitespace characters (e.g. blank, tab, cr, lf) are replaced by
41: hyphens ``-''
42:
43: \item other illegal characters are replaced by underscores ``\_''.
44: \end{itemize}
45:
46: This rule does not provide a reversible mapping to the original
47: illegal file name and it does not provide a collision-free mapping,
48: i.e. two different illegal file names might be mapped to the same
49: allowed file name. Additional precautions for these cases must be
50: taken.
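
\noindent The proposed rule can be sketched in Python as follows. The function
name \texttt{sanitize\_name} is hypothetical, and replacing each illegal
character individually (rather than collapsing runs) is an assumption, since
the rule does not say. The sketch shares the limitations stated above: it is
neither reversible nor collision-free.

```python
import re

# Allowed characters per this section: a-z, A-Z, 0-9, hyphen, underscore, dot.
_WHITESPACE = re.compile(r"\s")
_ILLEGAL = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_name(name: str) -> str:
    """Map a file or directory name onto the allowed character set."""
    # Whitespace (blank, tab, cr, lf) becomes a hyphen, one per character.
    name = _WHITESPACE.sub("-", name)
    # Every remaining illegal character becomes an underscore.
    return _ILLEGAL.sub("_", name)
```

For example, \texttt{sanitize\_name("a?b")} and \texttt{sanitize\_name("a*b")}
both yield \texttt{a\_b}, which illustrates why additional precautions against
collisions are needed.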
1.1 casties 51:
1.4 casties 52:
53: \section{Metadata files}
54: \label{sec:metadata-files}
55:
56: The metadata information is stored in the XML format documented below
57: in special files in the resource directory. Two forms of metadata
58: files are possible:
59: \begin{itemize}
60: \item a file named \texttt{index.meta} in a directory.
61:
1.16 casties 62: \item a file with the same name as the data file it describes and an
1.4 casties 63: additional extension \texttt{.meta}. For example, metadata for the
1.16 casties 64: file \texttt{p0001.tif} would be in a file \texttt{p0001.tif.meta}.
1.4 casties 65: \end{itemize}
66:
67: The resource directory must contain an \texttt{index.meta} file with
1.16 casties 68: information about the resource as a whole. Subdirectories can
69: contain additional \texttt{index.meta} files.
1.4 casties 70:
71: Additional information about single data files that are part of the
72: resource can either be put in \texttt{file} tags in the
73: \texttt{index.meta} file or in separate \emph{filename}\texttt{.meta}
74: files for each data file. Information from the directory level file is
1.16 casties 75: inherited at the file level when it is not overwritten.
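
\noindent As an illustration only, the following Python sketch lists the
metadata files that may describe a given data file and models inheritance as a
simple override. The function names and the use of flat dictionaries for
metadata values are assumptions made for this example, not part of the format.

```python
from pathlib import PurePosixPath


def metadata_file_candidates(data_file: str) -> list:
    """Return the metadata files that may describe data_file.

    data_file is a path relative to the resource directory, using "/".
    """
    path = PurePosixPath(data_file)
    # Per-file form: same name plus the additional extension ".meta".
    per_file = path.with_name(path.name + ".meta")
    # Per-directory form: index.meta in the directory holding the file.
    per_dir = path.parent / "index.meta"
    return [str(per_file), str(per_dir)]


def inherit(dir_level: dict, file_level: dict) -> dict:
    """Directory-level values apply unless the file level overwrites them."""
    return {**dir_level, **file_level}
```

The \texttt{index.meta} file of the resource directory itself can also describe
the file through a \texttt{file} tag; the sketch does not cover that case.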
1.4 casties 76:
77:
1.1 casties 78: \section{Resource format}
79: \label{sec:mpiwg-doc}
80:
81: In this description elements marked ``optional'' need not be supplied
82: by the provider of the resource and may be absent in all versions of
83: the metadata file. Elements marked ``required'' must be supplied by
84: the provider of the resource. Elements marked ``deduced'' can be
85: supplied by the provider of the resource but can also be provided by
1.4 casties 86: automatic scripts later in the process; these elements must be present
1.1 casties 87: in the final file.
88:
1.12 casties 89: File and directory paths in the metadata file use the conventional
90: Unix file separator slash ``/''.
91:
1.11 casties 92: The outer container element is \texttt{resource}. It has the following
93: \textbf{attributes}:
94:
95: \begin{description}
1.12 casties 96: \item[type] sub-type of resource (e.g. ``ECHO'', ``MPIWG'') --
97: optional.
1.11 casties 98:
1.16 casties 99: \item[version] version number of metadata format (currently 1.2) --
1.11 casties 100: required.
101: \end{description}
102:
103: \noindent The allowed \textbf{elements} inside \texttt{resource} are:
1.1 casties 104:
105: \begin{description}
1.14 casties 106: \item[description] An informal textual description of the resource --
107: optional\footnote{At least one description of the resource's content
108: is required. The description can be an informal
109: \texttt{description} element or a descriptive element (like
110: \texttt{bib}) in a \texttt{meta} container.}.
1.1 casties 111:
112: \item[name] The filename of the resource (name of the directory this
113: file is contained in) -- required.
114:
115: \item[creator] The name of the project or person that created the
116: resource -- optional.
1.4 casties 117:
118: \item[archive-creation-date] The time and date the archive collection
119: was created -- deduced.
1.1 casties 120:
1.4 casties 121: \item[archive-storage-date] The time and date the archive was written
122: to permanent storage -- deduced (must not be set by the user).
1.1 casties 123:
124: \item[archive-path] The full path to the resource directory inside the
1.5 casties 125: whole archive collection, including the resource directory -- deduced.
1.12 casties 126:
127: \item[archive-id] The ID for this document in the archive --
1.16 casties 128: optional.
1.1 casties 129:
130: \item[derived-from] Container for the description of the original
131: resource if this resource is a modified version of another resource
132: -- optional.
133:
134: \begin{description}
1.12 casties 135: \item[archive-id] The ID of the original resource
1.16 casties 136: -- required (or archive-path).
1.12 casties 137:
1.1 casties 138: \item[archive-path] The full path to the original resource
1.16 casties 139: -- required (or archive-id).
140:
141: \item[description] An informal textual description of the relation
142: of this resource to the original resource -- optional.
143: \end{description}
144:
145: \item[used-by] Container for the description of modified resources
146: if this resource is the source of another resource
147: -- optional.
148:
149: \begin{description}
150: \item[archive-id] The ID of the derived resource
151: -- required (or archive-path).
152:
153: \item[archive-path] The full path to the derived resource
154: -- required (or archive-id).
1.1 casties 155:
156: \item[description] An informal textual description of the relation
157: of this resource to the derived resource -- optional.
158: \end{description}
159:
160: \item[linked-with] Container for the description of another
161: resource when this resource is a linked copy of another resource
162: -- optional.
163:
164: \begin{description}
1.12 casties 165: \item[archive-id] The ID of the linked resource
1.16 casties 166: -- required (or archive-path).
1.12 casties 167:
1.1 casties 168: \item[archive-path] The full path to the linked resource
1.16 casties 169: -- required (or archive-id).
1.1 casties 170:
171: \item[description] An informal textual description of the relation
172: of this resource to the linked resource -- optional.
173: \end{description}
174:
1.24 casties 175: \item[is-part-of] Container for the description of another resource if this
176: resource is a part of the other resource -- optional. It can have a
177: \texttt{type} attribute describing the type of relation, e.g. ``manuscript-codex''.
178:
179: \begin{description}
180: \item[archive-id] The ID of the original resource
181: -- required (or archive-path).
182:
183: \item[archive-path] The full path to the original resource
184: -- required (or archive-id).
185:
186: \item[description] An informal textual description of the relation
187: of this resource to the original resource -- optional.
188: \end{description}
189:
1.12 casties 190: \item[media-type] \label{tag-media-type} The main media type of this
191: resource -- required.\\ The main media type can be overridden by
192: \texttt{media-type}s in subdirectories. Possible types are
193: \begin{itemize}
194: \item \texttt{image}
195:
196: \item \texttt{text}
197:
198: \item \texttt{audio}
199:
200: \item \texttt{video}
201:
202: \item \texttt{data} for other types of data
203: \end{itemize}
1.1 casties 204:
205: \item[meta] Additional metadata information about the resource --
206: optional.\\ For a description of additional metadata see below.
207:
208: \item[dir] Container for the description of a subdirectory -- required
209: (when there are subdirectories).\\ \texttt{dir} tags should not be
210: nested. Directories at lower levels are identified by their
211: \texttt{path}.
212:
213: \begin{description}
214: \item[description] An informal textual description of the
215: subdirectory -- optional.
216:
217: \item[name] The name of the subdirectory -- required.
218:
1.12 casties 219: \item[original-name] A text string associated with the directory as
220: original name -- optional. (E.g. if the data in this directory
221: came from an external source and had a name that had to be changed
222: according to section~\ref{sec:file-directory-names} but it should
223: be possible to reference the original name.)
224:
1.1 casties 225: \item[path] The directory path of this subdirectory relative to the
1.5 casties 226: resource's root directory (excluding the directory itself) --
227: required (may be empty or omitted if the directory is a direct
228: child of the resource's root directory).
1.1 casties 229:
230: \item[meta] Additional metadata information about the directory --
231: optional.\\ For a description of additional metadata see below.
232: \end{description}
233:
234: \item[file] Container for the description of a file -- deduced.\\
235: \texttt{file} tags should not be nested in \texttt{dir} tags. Files
236: at lower directory levels are identified by their \texttt{path}.
237:
238: \begin{description}
239: \item[description] An informal textual description of the
240: file -- optional.
241:
242: \item[name] The name of the file -- required.
243:
1.12 casties 244: \item[original-name] A text string associated with the file as
1.16 casties 245: original name -- optional. (e.g. if this file came from an
1.12 casties 246: external source and had a name that had to be changed according to
1.16 casties 247: section~\ref{sec:file-directory-names} it is possible
248: to preserve the original name.)
1.12 casties 249:
1.1 casties 250: \item[path] The directory path of this file relative to the
1.5 casties 251: resource's root directory (excluding the file itself) -- required
252: (may be empty or omitted if the file is in the resource's root
253: directory).
1.7 casties 254:
255: \item[date] The file's modification or creation date\footnote{The
256: preferred time and date format is ``YYYY/MM/DD HH:MM:SS''},
257: whichever is more recent -- optional.
1.1 casties 258:
259: \item[modification-date] The file's modification date -- optional.
260:
261: \item[creation-date] The file's creation date -- optional.
1.7 casties 262:
1.1 casties 263: \item[size] The file size -- deduced.
264:
265: \item[mime-type] The file's mime-type -- optional.
266:
267: \item[md5cs] MD5 checksum of the file content -- optional.
268:
269: \item[meta] Additional metadata information about the file --
270: optional. For a description of additional metadata see below.
271: \end{description}
272:
273: \end{description}
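
\noindent The following Python sketch builds a minimal provider-side
\texttt{index.meta} tree from the elements described above. All values
(\texttt{example\_resource}, \texttt{pageimg}, \texttt{p0001.tif}, the date)
are hypothetical, and the ``deduced'' elements that must appear in the final
file (for instance \texttt{archive-path} or \texttt{size}) are left out on
purpose.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Preferred time and date format named in this section.
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"


def build_index_meta() -> ET.Element:
    """Build a provider-side index.meta tree with hypothetical example values."""
    resource = ET.Element("resource", {"version": "1.2", "type": "MPIWG"})
    ET.SubElement(resource, "name").text = "example_resource"
    ET.SubElement(resource, "description").text = "Page scans of an example book."
    ET.SubElement(resource, "media-type").text = "image"

    # dir and file entries are siblings; neither is nested in the other.
    subdir = ET.SubElement(resource, "dir")
    ET.SubElement(subdir, "name").text = "pageimg"
    # path is omitted: pageimg is a direct child of the resource root.

    page = ET.SubElement(resource, "file")
    ET.SubElement(page, "name").text = "p0001.tif"
    ET.SubElement(page, "path").text = "pageimg"
    ET.SubElement(page, "date").text = datetime(2010, 8, 30, 12, 0, 0).strftime(DATE_FORMAT)
    return resource


if __name__ == "__main__":
    print(ET.tostring(build_index_meta(), encoding="unicode"))
```

The file in the subdirectory is identified by its \texttt{path} element rather
than by nesting the \texttt{file} tag inside the \texttt{dir} tag.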
274:
275:
276:
277: \section{Additional metadata}
278: \label{sec:additional-metadata}
279:
280: All elements with \texttt{meta} tags can contain an arbitrary number
1.12 casties 281: of the following additional metadata elements.
282:
1.16 casties 283: \subsection{Workflow state}
1.12 casties 284: \label{sec:workflow-state}
285:
286: All additional metadata elements can have a \texttt{workflow-state}
287: \textbf{attribute}. This attribute reflects the state of the
288: corresponding metadata element. The possible values for the
289: \texttt{workflow-state} attribute are
290: \begin{itemize}
291: \item \texttt{preliminary} this information is preliminary. It must
292: be checked in further workflow steps.
293:
294: \item \texttt{inwork}
295:
296: \item \texttt{final}
297: \end{itemize}
298:
299: Workflow states other than \texttt{preliminary} are part of the
300: workflow handling of the respective projects.
301:
302: Metadata elements can appear multiple times with different
303: \texttt{workflow-state} attributes. This enables metadata versioning.
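
\noindent One way a reader of the metadata might choose among several versions
of the same element is sketched below in Python. The preference order
(\texttt{final} before \texttt{inwork} before \texttt{preliminary}), the
ranking of elements without the attribute last, and the function name are all
assumptions; the format itself does not prescribe a selection rule.

```python
import xml.etree.ElementTree as ET

# Assumed preference order, most advanced state first.
PREFERENCE = ("final", "inwork", "preliminary")


def pick_by_workflow_state(meta: ET.Element, tag: str):
    """Return the child of meta named tag with the most advanced state.

    Returns None when meta has no child with that tag.
    """
    def rank(element: ET.Element) -> int:
        state = element.get("workflow-state")
        # Unknown or missing states sort after all known ones (assumption).
        return PREFERENCE.index(state) if state in PREFERENCE else len(PREFERENCE)

    return min(meta.findall(tag), key=rank, default=None)
```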
304:
305:
306:
307: \subsection{Content type}
308: \label{sec:content-type}
309:
310: \begin{description}
311: \item[content-type] \label{tag-content-type} The content type of this
312: resource -- required.\\
313: The content type enables the choice of tools to manipulate and
314: display the resource. There should be a common list of content
315: types. For digital documents (books, manuscripts) this would be
1.24 casties 316: ``scanned document'', for other image data ``scanned
317: images''.\footnote{The criterion for documents is an ordered
1.12 casties 318: succession of image files (pages) and equal image size and
319: resolution throughout the images of a resource.}
320: \end{description}
321:
322:
1.1 casties 323:
1.4 casties 324: \subsection{Language}
325: \label{sec:lang}
326:
327: The language of a resource (e.g. a text) can be specified with a
328: \texttt{lang} tag. Languages have to be described using the
329: international codes for the representation of names of languages
330: either in two-letter form (ISO 639-1) or in three-letter form (ISO
331: 639-2). The entire catalogue of languages is documented on the page
332:
333: \url{http://www.loc.gov/standards/iso639-2/englangn.html}
334:
1.1 casties 335:
336: \subsection{DRI}
337: \label{sec:dri}
338:
339: The \emph{digital resource identifier} for the resource is specified
1.4 casties 340: in a \texttt{dri} element. Digital resource identifiers are documented
1.1 casties 341: on the page
342:
343: \url{http://pythia.mpiwg-berlin.mpg.de/projects/standards/dri}.
344:
345:
1.4 casties 346:
347: \subsection{Collection context}
348: \label{sec:collection-context}
349:
1.15 casties 350: The context of a resource as part of a collection or part of a project
351: can be specified in the \texttt{context} element. The context element
352: can appear multiple times if the resource is part of multiple
353: collections or projects.
1.4 casties 354:
355: \begin{description}
1.5 casties 356: \item[context] information on collection or project context.
1.4 casties 357:
1.5 casties 358: \begin{description}
1.15 casties 359: \item[link] URL to additional context information -- optional.
1.5 casties 360:
1.15 casties 361: \item[name] Textual description of project or collection -- optional.
362:
363: \item[meta-datalink] description of external sources of canonical meta
364: information -- optional
365: \begin{description}
366: \item[db] \textbf{attribute} to identify different sets of meta data
367: links to the same resource -- optional
368:
369: \item[object] \textbf{attribute} to identify different objects or
370: parts of the same resource -- optional
371:
372: \item[label] textual label for the link -- optional
373:
374: \item[url] URL to present to the client -- optional
375:
376: \item[metadata-url] URL to an external server to be queried -- optional
377: \end{description}
378:
379: \item[meta-baselink] description of external server for canonical meta
380: information -- optional
381: \begin{description}
382: \item[db] \textbf{attribute} to identify different sets of meta data
383: links to the same resource -- optional
384:
385: \item[label] textual label for the link -- optional
386:
387: \item[url] URL to present to the client -- optional
388:
389: \item[metadata-url] URL to an external server to be queried --
390: required (the parameter \texttt{object=} with an object id has
391: to be appended to this URL)
392: \end{description}
1.5 casties 393: \end{description}
1.4 casties 394: \end{description}
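
\noindent For \texttt{meta-baselink}, the query URL is formed by appending the
\texttt{object=} parameter to \texttt{metadata-url}. A minimal Python sketch
follows; the function name is hypothetical, and the choice between ``?'' and
``\&'' as separator as well as the percent-encoding of the object id are
assumptions not stated by the format.

```python
from urllib.parse import urlencode


def baselink_query_url(metadata_url: str, object_id: str) -> str:
    """Append the object= parameter to a meta-baselink metadata-url."""
    # Assumption: start a query string unless the URL already has one.
    separator = "&" if "?" in metadata_url else "?"
    return metadata_url + separator + urlencode({"object": object_id})
```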
1.5 casties 395:
1.4 casties 396:
397:
398:
1.1 casties 399: \subsection{Bibliographic information}
400: \label{sec:bibliographic-data}
401:
1.5 casties 402: Bibliographic information is presented in a \texttt{bib} container with
1.1 casties 403: a \texttt{type} attribute giving the type of bibliographic resource.
1.4 casties 404: The \texttt{type} field can be repeated as a tag in the container.
405:
1.5 casties 406: The format is based on the ECHO scheme for bibliographic data (cf.
407: content workflow), the MPIWG ``Projektbibliografie'' and the format of
408: the commonly used program ``EndNote''.
409:
1.4 casties 410:
411: \subsubsection{Book}
412:
413: \begin{description}
414:
415: \item [bib type="book"] a published book.
416:
417: \begin{description}
418: \item [author] The author of the book.
419: \item [year] The year of publication.
420: \item [title] Title of the book.
421: \item [series-editor] Name of the series editor, if the book appears
422: in a series.
423: \item [series-title] Title of the series, if the book appears in a
424: series.
425: \item [series-volume] Volume number, if the book appears in a
426: series.
427: \item [number-of-pages] Number of pages of the entire book.
428: \item [city] City where the book was published.
429: \item [publisher] Name of the publishing company
430: \item [edition] Edition of the book (e.g. third edition)
431: \item [number-of-volumes] Number of volumes, if the book is
432: published in multiple volumes.
433: \item [translator] Name of the translator.
434: \item [isbn-issn]
1.18 casties 435: \item[call-number] Call number in holding library
436: \item[holding-library] Holding library
1.4 casties 437: \end{description}
438: \end{description}
439:
440: \subsubsection{In Book}
441:
442: \begin{description}
443: \item [bib type="inbook"] an article as part of a book.
444:
445: \begin{description}
446: \item [author] The author of the article.
447: \item [year] The year of publication.
448: \item [title] Title of the article.
449: \item [editor] Name of the book's editor.
450: \item [book-title] Title of the book.
451: \item [series-volume] Volume number, if the book appears in a
452: series.
453: \item [pages] Number of pages of the article.
454: \item [city] City where the book was published.
455: \item [publisher] Name of the publishing company
456: \item [edition] Edition of the book (e.g. third edition)
457: \item [series-author] Name of the series editor, if the book appears
458: in a series.
459: \item [series-title] Title of the series, if the book appears in a
460: series.
461: \item [number-of-volumes] Number of volumes, if the book is
462: published in multiple volumes.
463: \item [translator] Name of the translator
464: \item [isbn-issn]
1.18 casties 465: \item[call-number] Call number in holding library
466: \item[holding-library] Holding library
1.4 casties 467: \end{description}
468: \end{description}
469:
470: \subsubsection{Proceedings}
471:
472: \begin{description}
473: \item [bib type="proceedings"] a conference proceedings publication.
474:
475: \begin{description}
476: \item [author] The author of the article.
477: \item [year] The year of publication.
478: \item [title] Title of the article.
479: \item [editor] Name of the book's editor.
480: \item [conference-name] Name of the conference the proceedings are
481: related to.
482: \item [volume] Volume number.
483: \item [pages] Number of pages of the article.
484: \item [date] Date of the conference the proceedings are related to.
485: \item [conference-location] City where the conference was held.
486: \item [publisher] Name of the publishing company
487: \item [edition] Edition of the book (e.g. third edition)
488: \item [series-editor] Name of the series editor, if the book appears
489: in a series.
490: \item [series-title] Title of the series, if the book appears in a
491: series.
492: \item [number-of-volumes] Number of volumes, if the book is
493: published as multiple volumes.
494: \item [isbn-issn]
1.18 casties 495: \item[call-number] Call number in holding library
496: \item[holding-library] Holding library
1.4 casties 497: \end{description}
498: \end{description}
499:
500: \subsubsection{Edited Book}
501:
502: \begin{description}
503: \item[bib type="edited-book"] a book that is the edition of another
504: work.
505:
506: \begin{description}
507: \item [editor] Name of the editor of the book.
508: \item [year] The year of publication.
509: \item [title] Title of the book.
510: \item [series-editor] Name of the editor of the series the book is
511: part of.
512: \item [series-title] Title of the series, if the book is part of a
513: series.
514: \item [series-volume] Volume number, if the book appears in a series.
515: \item [number-of-pages] Number of pages of the book.
516: \item [city] City where the book was published.
517: \item [publisher] Name of the publishing company
518: \item [edition] Information about the edition (e.g. ``Repr. of the London ed. 1652'')
519: \item [number-of-volumes] Number of volumes, if the book is
520: published as multiple volumes.
521: \item [isbn-issn]
1.18 casties 522: \item[call-number] Call number in holding library
523: \item[holding-library] Holding library
1.4 casties 524: \end{description}
525: \end{description}
526:
1.17 casties 527: \subsubsection{Journal Volume}
528:
529: \begin{description}
530: \item [bib type="journal-volume"] a volume of a scientific journal.
531: \begin{description}
532: \item [title] Name of the journal.
533: \item [editor] The editor of the journal.
534: \item [publisher] Name of the publishing company.
535: \item [city] City where the journal is published.
536: \item [year] The year of publication.
537: \item [volume] Volume number.
538: \item [number-of-pages] Number of pages of the volume.
539: \item [isbn-issn]
1.18 casties 540: \item[call-number] Call number in holding library
541: \item[holding-library] Holding library
1.17 casties 542: \end{description}
543: \end{description}
544:
1.4 casties 545: \subsubsection{Journal Article}
546:
547: \begin{description}
548: \item [bib type="journal-article"] an article in a scientific journal.
549: \begin{description}
550: \item [author] The author of the article.
551: \item [year] The year of publication.
552: \item [title] Title of the article.
553: \item [journal] Name of the journal.
554: \item [volume] Volume number, if the journal appears in a series.
555: \item [issue] Number of the issue the article is part of.
556: \item [pages] Number of pages of the article.
557: \item [alternate-journal] Alternate Journal
558: \item [isbn-issn]
1.18 casties 559: \item[call-number] Call number in holding library
560: \item[holding-library] Holding library
1.4 casties 561: \end{description}
562: \end{description}
563:
564: \subsubsection{Magazine Article}
565:
566: \begin{description}
567: \item [bib type="magazine-article"] an article in a popular magazine.
568: \begin{description}
569: \item [author] The author of the article.
570: \item [year] The year of publication.
571: \item [title] Title of the article.
572: \item [magazine] Name of the magazine.
573: \item [volume] Volume number, if the book appears in a series.
574: \item [issue-number] Number of the issue the article is part of.
575: \item [pages] Number of pages of the article.
576: \item [date] Date when the article appeared.
1.18 casties 577: \item[call-number] Call number in holding library
578: \item[holding-library] Holding library
1.4 casties 579: \end{description}
580: \end{description}
581:
582: \subsubsection{Newspaper Article}
583:
584: \begin{description}
585: \item [bib type="newspaper-article"] an article in a newspaper.
586: \begin{description}
587: \item [author] The author of the article.
588: \item [year] The year of publication.
589: \item [title] Title of the article.
590: \item [Newspaper] Name of the newspaper the article appeared in.
591: \item [pages] Number of pages of the article.
592: \item [issue-date] Date of the issue the article is part of.
593: \item [city] City of the newspaper.
1.18 casties 594: \item[call-number] Call number in holding library
595: \item[holding-library] Holding library
1.4 casties 596: \end{description}
597: \end{description}
598:
599: \subsubsection{Thesis}
600:
601: \begin{description}
602: \item [bib type="thesis"] a master/doctorate/etc. thesis.
603: \begin{description}
604: \item [author] The author of the thesis.
605: \item [year] The year of publication.
606: \item [title] Title of the thesis.
607: \item [academic-department] Name of the academic department where
608: the thesis was handed in.
609: \item [number-of-pages] Number of pages of the thesis.
610: \item [city] City where the thesis was published.
611: \item [University] Name of the university where the thesis was
612: handed in.
613: \item [isbn-issn]
1.18 casties 614: \item[call-number] Call number in holding library
615: \item[holding-library] Holding library
1.4 casties 616: \end{description}
617: \end{description}
618:
619: \subsubsection{Report}
620:
621: \begin{description}
622: \item [bib type="report"] a scientific report.
623: \begin{description}
624: \item [author] The author of the report.
625: \item [year] The year of publication.
626: \item [title] Title of the report.
627: \item [pages] Number of pages of the report.
628: \item [date] Date when the report appeared.
629: \item [city] City where the report was published.
630: \item [institution] Institution where the report was produced.
631: \item [type] Type of report.
632: \item [report-number] Report number.
1.18 casties 633: \item[call-number] Call number in holding library
634: \item[holding-library] Holding library
1.4 casties 635: \end{description}
636: \end{description}
637:
1.5 casties 638: \subsubsection{Manuscript}
639:
640: \begin{description}
641: \item [bib type="manuscript"] a handwritten/typewritten manuscript.
642:
643: \begin{description}
644: \item [title] Title of the manuscript.
645: \item [author] The author of the text.
646: \item [location] Name of the library where the manuscript is
647: currently located.
648: \item [year] The year or century of publication.
649: \item [pages] Number of pages of the manuscript.
650: \item [signature] Signature of the manuscript.
651: \item [editorial-remarks] Remarks related to the online
652: publication of the manuscript. This could be notes about
653: annotations etc.
654: \item [description] This can be any kind of description.
655: \item [keywords] Keywords related to the manuscript.
1.18 casties 656: \item[call-number] Call number in holding library
657: \item[holding-library] Holding library
1.5 casties 658: \end{description}
659: \end{description}
660:
1.23 dwinter 661: \subsubsection{Extended Manuscript}
662:
663: \begin{description}
664: \item [bib type="extended-manuscript"] a handwritten/typewritten manuscript
665: with detailed information about the manuscript's appearance.
666:
667: \begin{description}
668: \item [title] Title of the manuscript.
669: \item [author] The author of the text.
1.24 casties 670: \item[holding-library] Holding library.
671: \item[call-number] Call number/Shelf mark in holding library.
1.23 dwinter 672: \item[location] Place/City/Country where the manuscript is
673: currently located.
674: \item[date calendar="type"] The date of publication, with an attribute stating which
675: calendar is used. If no attribute is given, CE is the default. The date can also be
676: descriptive.
1.24 casties 677: \item[year calendar="type"] Approximate year or century.
1.23 dwinter 678: \item[number-of-folios] Number of folios/pages of the manuscript.
679: \item[signature] Signature(s) of the manuscript, under which a manuscript is
680: known.
1.24 casties 681: \item[abstract] Interpretative abstract of the text's content.
682: \item[incipit] Incipit (beginning of text).
683: \item[explicit] Explicit (end of text).
1.23 dwinter 684: \item[contents] Formal description of the text structure (e.g. table of
1.24 casties 685: contents).
686: \item[writing-surface] material of the writing surface (e.g. ``non-European
687: paper'', ``palm leaf'',\ldots)
688: \item[foliation] Text giving list or range of folios.
689: \item[page-dimensions] height and width in cm.
690: \item[written-area-dimensions] height and width in cm.
691: \item[lines-per-page] number of lines and columns.
692: \item[catchwords] Quire signatures and catchwords.
693: \item[scripts] Description of the script and the ink used.
694: \item[copyist] Copyist.
695: \item[collation-corrections] Notes on collation and corrections.
696: \item[binding] Description of binding.
697: \item[notes-on-ownership] Notes on ownership.
698: \item[notes] Additional notes.
1.23 dwinter 699: \item[secondary-literature] Notes on secondary literature related to the
1.24 casties 700: manuscript
1.23 dwinter 701: \item [editorial-remarks] Remarks related to the online
702: publication of the manuscript.
703: \item [keywords] Keywords related to the manuscript.
704: \end{description}
705: \end{description}
706:
707: \subsubsection{Codex}
708:
709: \begin{description}
1.24 casties 710: \item [bib type="codex"] a codex, i.e. a bound collection of one or more manuscripts.
1.23 dwinter 711:
712: \begin{description}
1.24 casties 713: \item[holding-library] Holding library.
714: \item[call-number] Call number/Shelf mark in holding library.
1.23 dwinter 715: \item[location] Place/City/Country where the codex is
716: currently located.
717: \item[date calendar="type"] Date of the collation of the codex.
1.24 casties 718: \item[year calendar="type"] Approximate year or century.
1.23 dwinter 719: \item[number-of-folios] Number of folios/pages of the manuscript.
1.24 casties 720: \item[foliation] Text giving list or range of folios.
1.23 dwinter 721: \item[signature] Signature(s) of the manuscript, under which a manuscript is
722: known.
723: \item[contents] Formal description of the text structure (e.g. table of
1.24 casties 724: contents).
725: \item[dimensions] height + width in cm.
726: \item[binding] Description of binding.
727: \item[notes] Additional notes.
728: \item[notes-on-ownership] Notes on ownership.
729: \end{description}
1.23 dwinter 730: \end{description}
731:
1.5 casties 732:
1.19 casties 733: \subsubsection{Correspondence}
734:
735: \begin{description}
736: \item [bib type="correspondence"] a piece of correspondence (e.g. letter, telegram), called ``letter'' in the following.
737:
738: \begin{description}
739: \item[type] The type of correspondence, e.g. ``letter'', ``postcard'', ``telegram'', ``letter draft''
740: \item [author] The author/sender of the letter.
741: \item [recipient] The recipient of the letter.
742: \item [date] normalised date of the letter.
743: \item [date-range-end] end of range of uncertain dating -- optional.
744: \item [date-original] the date in its original form as noted on the letter -- optional.
745: \item [place] place where the letter was written/sent.
746: \item [title] Title of the letter -- optional.
747: \item[incipit] The opening phrase of the letter -- optional.
748: \item[excipit] The closing phrase of the letter -- optional.
749: \item [pages] Number of pages of the manuscript.
750: \item [signature] Canonical signature/call number of the manuscript.
751: \item [description] This can be any kind of description.
752: \item [keywords] Keywords related to the manuscript.
753: \item[call-number] Call number in the current holding library
754: \item[holding-library] current holding library
755: \end{description}
756: \end{description}
757:
758:
1.4 casties 759: \subsubsection{Generic}
760:
761: \begin{description}
762: \item [bib type="generic"] a generic bibliographic type. This type
763: should only be used in rare cases.
764: \begin{description}
765: \item [author]
766: \item [year]
767: \item [title]
768: \item [secondary-author]
769: \item [secondary-title]
770: \item [volume]
771: \item [number]
772: \item [pages]
773: \item [date]
774: \item [place-published]
775: \item [publisher]
776: \item [edition]
777: \item [tertiary-author]
778: \item [tertiary-title]
779: \item [number-of-volumes]
780: \item [type-of-work]
781: \item [subsidiary-author]
782: \item [alternate-title]
783: \item [isbn-issn]
784: \item [call-number]
785: \item [label]
786: \item [keywords]
787: \item [abstract]
788: \item [notes]
789: \item [url]
1.5 casties 790: \end{description}
1.4 casties 791: \end{description}
792:
793:
794: \subsection{Architectural drawings}
795: \label{sec:doc}
796:
797: Specific information for architectural drawings is presented in a
1.5 casties 798: \texttt{doc} container with an additional \texttt{type} attribute
799: giving the type of drawing. All elements inside the container can
800: appear multiple times.
1.4 casties 801:
802: \begin{description}
1.5 casties 803:
804: \item[doc type="Architectural Drawing"] architectural drawing.
805:
806: \begin{description}
807: \item [person] last name and first name of a person, separated by a
808: comma. A further common name for the person can be put in front,
809: separated by a semicolon.
810: \item [location] Name of a place in its common notation. This can be
811: a city or an institution.
812: \item [date] This can be a year (or several years, separated by
813: commas) or a period (1706-1714). Years are noted with four digits.
814: \item [object] Short description of an object or signatures.
815: \item [keywords] Keywords related to the object.
816: \end{description}
1.4 casties 817: \end{description}
1.1 casties 818:
819:
1.10 casties 820: \subsection{Document structure (table of contents)}
1.1 casties 821: \label{sec:toc}
822:
1.4 casties 823: Information on the structure of a document, such as its division into
824: parts and chapters in the manner of a table of contents, is presented in a
825: \texttt{toc} container.
826:
827: The scheme allows multiple logical pages on a single page image,
828: as is often the case with scanned books or manuscripts. The scheme
829: also allows for ``loose'' numbering schemes with roman, arabic or
830: other page numbers consecutively or mixed and changes in the numbering
831: within the document.
832:
833: The flexibility comes from the fact that no additional assumptions
834: about the mapping between logical pages and page images are made in
835: the format. All mapping information is specified by the user.
836:
837: The logical page numbering or naming that can be presented to the user
838: is specified in the \texttt{name} tags while the physical numbering of
839: the page images is specified in the \texttt{index} or \texttt{url}
840: tags.
1.1 casties 841:
1.4 casties 842: \begin{description}
1.5 casties 843: \item[toc] container for document structure
844:
1.4 casties 845: \begin{description}
1.5 casties 846: \item[page] describes a single logical page
847:
848: \begin{description}
849: \item[name] the ``name'' of the logical page. This can be any string
850: like a page number (arabic, roman, etc.) or a special designation
851: like ``Table 5''.
852:
853: \item[index] the \texttt{digilib} index number\footnote{The index
854: number for digilib is the index in the alphabetical order of the
855: scan file names.} of the scan image of the page.
856:
857: \item[url] as an alternative to the \texttt{digilib} index number, the
858: full URL of the scan image of the page can be used.
859: \end{description}
1.4 casties 860:
1.5 casties 861: \item[chapter] describes a section or chapter of the text.
862: \texttt{chapter} elements can be nested.
1.1 casties 863:
1.4 casties 864: \begin{description}
1.5 casties 865: \item[name] the title of the chapter or section.
866:
867: \item[start] the beginning of a page range (usually the first page
868: of the chapter). The \texttt{start} element has an optional
869: \texttt{increment} attribute to indicate the number of logical
870: pages on a scan image.\footnote{This information is only needed by
871: additional tools that try to generate lists of all page and
872: image numbers.}
873:
874: \begin{description}
875: \item[name] the ``name'' of the first page (see \texttt{page}).
876:
877: \item[index] the index of the first page (see \texttt{page}).
878:
879: \item[url] the URL of the first page (see \texttt{page}).
880: \end{description}
881:
882: \item[end] the end of a page range (usually the last page of the
883: chapter).
884:
885: \begin{description}
886: \item[name] the ``name'' of the last page (see \texttt{page}).
887:
888: \item[index] the index of the last page (see \texttt{page}).
889:
890: \item[url] the URL of the last page (see \texttt{page}).
891: \end{description}
892:
893: \item[page] as an alternative (or in addition) to
894: \texttt{start}/\texttt{end} page ranges, single \texttt{page}
895: elements can be used inside \texttt{chapter}.
1.4 casties 896: \end{description}
897: \end{description}
898: \end{description}
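
The nesting described above can be sketched as follows. This is only
an illustrative fragment: the chapter titles, page names, index
numbers and the URL are invented. In the first chapter two logical
pages share one scan image (\texttt{increment="2"}), so the eight
logical pages i to viii cover the four images 2 to 5.

\begin{small}
\begin{verbatim}
<toc>
  <page>
    <name>Title page</name>
    <index>1</index>
  </page>
  <chapter>
    <name>Introduction</name>
    <start increment="2">
      <name>i</name>
      <index>2</index>
    </start>
    <end>
      <name>viii</name>
      <index>5</index>
    </end>
  </chapter>
  <chapter>
    <name>Part One</name>
    <chapter>
      <name>Chapter 1</name>
      <start>
        <name>1</name>
        <index>6</index>
      </start>
      <end>
        <name>24</name>
        <index>29</index>
      </end>
    </chapter>
    <page>
      <name>Table 5</name>
      <url>http://www.example.org/scans/table5.jpg</url>
    </page>
  </chapter>
</toc>
\end{verbatim}
\end{small}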
899:
900: %%\url{http://pythia.mpiwg-berlin.mpg.de/toolserver/TS_lise}
1.1 casties 901:
902:
1.12 casties 903: \subsection{Digital images}
1.1 casties 904: \label{sec:inform-scann-imag}
905:
906: Image files representing scanned images can have an \texttt{img}
907: container tag with information about the scan resolution and the size
908: of the original image. This information is used by the
909: \texttt{digilib} image viewing tool.
910:
911: One of three possible sets of tags is required:
912:
913: \begin{description}
1.5 casties 914: \item[img] digital image information.
1.1 casties 915:
1.5 casties 916: \begin{description}
1.12 casties 917: \item[original-size-x] The width of the original
918: image -- required. \\
919: The unit of measure can be given in a \texttt{unit} attribute;
920: the default is meter ``m''. The width to be considered is the
921: total width of the scanned area.
1.5 casties 922:
1.12 casties 923: \item[original-size-y] The height of the original image -- required.
1.5 casties 924:
1.12 casties 925: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
1.5 casties 926:
1.12 casties 927: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 928: \end{description}
1.1 casties 929: \end{description}
930:
931: or
932:
933: \begin{description}
1.5 casties 934: \item[img] digital image information.
935:
936: \begin{description}
937: \item[original-dpi-x] The resolution of the hi-res scan in its width
1.12 casties 938: in pixels per inch -- required.
1.1 casties 939:
1.5 casties 940: \item[original-dpi-y] The resolution of the hi-res scan in its height
1.12 casties 941: in pixels per inch -- required.
942:
943: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
944:
945: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 946: \end{description}
1.1 casties 947: \end{description}
948:
949: or
950:
951: \begin{description}
1.5 casties 952: \item[img] digital image information.
953:
954: \begin{description}
955: \item[original-dpi] The resolution of the hi-res scan in pixels per
1.12 casties 956: inch if the resolutions in width and height are the same -- required.
957:
958: \item[original-pixel-x] The width of the hi-res scan in pixels -- deduced.
959:
960: \item[original-pixel-y] The height of the hi-res scan in pixels -- deduced.
1.5 casties 961: \end{description}
1.1 casties 962: \end{description}
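
As a sketch of the first variant, the fragment below gives the size of
an original that is 21 cm wide and 29.7 cm high, expressed in the
default unit meter. The values are invented for illustration. It is
assumed here that the \texttt{unit} attribute may be written out
explicitly even for the default unit and that the height uses the same
default. The pixel elements are left out because they are marked as
deduced.

\begin{small}
\begin{verbatim}
<img>
  <original-size-x unit="m">0.21</original-size-x>
  <original-size-y>0.297</original-size-y>
</img>
\end{verbatim}
\end{small}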
1.7 casties 963:
964:
1.10 casties 965:
1.12 casties 966: \subsection{Digital image acquisition}
1.10 casties 967: \label{sec:inform-about-image}
968:
969: A description of the technology used in the process of producing a
970: digital image.
971:
972: \begin{description}
973: \item[image-acquisition] description of the image production process
974: \begin{description}
1.12 casties 975: \item[device] acquisition device (e.g. ``flatbed scanner'')
1.10 casties 976:
1.12 casties 977: \item[image-type] type and color-depth of the image -- required (e.g. ``RGB 24
1.10 casties 978: bit'')
979:
980: \item[production-comment] additional textual information about the
981: production process
982: \end{description}
983: \end{description}
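
A minimal example of such a container could look like this. The
device and image type are the examples given above; the comment text
is invented.

\begin{small}
\begin{verbatim}
<image-acquisition>
  <device>flatbed scanner</device>
  <image-type>RGB 24 bit</image-type>
  <production-comment>pages scanned individually,
    no colour correction applied</production-comment>
</image-acquisition>
\end{verbatim}
\end{small}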
984:
985:
1.12 casties 986:
1.7 casties 987: \subsection{Full text with images}
988: \label{sec:full-text-with}
989:
1.12 casties 990: Full text in an XML format should be specified with a
991: \texttt{content-type}\footnote{see section~\ref{tag-content-type}
992: on page~\pageref{tag-content-type}} ``fulltext''.
1.8 casties 993:
994: The relation between the full text and optional images of
995: whole pages or parts of pages must be specified in a
1.20 casties 996: \texttt{texttool} container.
1.8 casties 997:
998: \begin{description}
1.20 casties 999: \item[texttool] representation of full text with images
1000:
1.8 casties 1001: \begin{description}
1.22 casties 1002: \item[text] the file name of the full text file (path
1.8 casties 1003: inside document directory)
1.12 casties 1004:
1.20 casties 1005: \item[text-url-path] a characteristic part of the URL with which the
1006: full text can be retrieved (the form and content of this element
1007: is dependent on the specific text retrieval mechanism)
1008:
1009: \item[image] the name of the directory containing the
1.22 casties 1010: page image files (path inside document directory)
1011:
1012: \item[figure] the name of the directory containing the
1013: in-page figure image files (path inside document directory)
1.8 casties 1014:
1.20 casties 1015: \item[xslt] the file name of an additional XSL transformation
1.8 casties 1016: file
1017:
1.20 casties 1018: \item[pagebreak] the name of the element that indicates page breaks
1019: (default ``pb'')
1.8 casties 1020: \end{description}
1021: \end{description}
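
A \texttt{texttool} container might then look like the following
sketch. The file and directory names are hypothetical; only the
\texttt{pagebreak} value ``pb'' is the default named above. The
\texttt{text-url-path} and \texttt{xslt} elements are omitted because
their content depends on the local setup.

\begin{small}
\begin{verbatim}
<texttool>
  <text>fulltext/fleck.xml</text>
  <image>pageimg</image>
  <figure>figures</figure>
  <pagebreak>pb</pagebreak>
</texttool>
\end{verbatim}
\end{small}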
1.7 casties 1022:
1.1 casties 1023:
1024:
1.12 casties 1025: \subsection{Copyright and access conditions}
1026: \label{sec:access-conditions}
1027:
1028: If access to a resource is bound to conditions for technical or legal
1029: reasons, then the conditions can be put in an \texttt{access-conditions}
1.16 casties 1030: container. Other usage conditions like copyright can also be
1.12 casties 1031: documented in this container.
1032:
1.25 casties 1033: % attribute for type 'original', 'digital-image', 'text'
1034: % tags can be repeated
1035: % CC license short-cut
1036:
1037:
1.12 casties 1038: \begin{description}
1039: \item[access-conditions] legal and technical conditions for access to
1040: this resource
1041:
1042: \begin{description}
1043: \item[attribution] The name or institution this resource should be
1.26 casties 1044: attributed to when it's publicly presented. \\
1045: The type of resource this condition applies to can be specified with a
1046: \texttt{type} attribute with the values ``original'' (the physical object
1047: that was scanned), ``digital-image'' (the scanned images), ``text''
1048: (the textual transcript).
1.12 casties 1049:
1050: \begin{description}
1051: \item[name] a name (free text)
1052:
1053: \item[url] a URL (with an optional \texttt{label} attribute to show
1054: as text)
1.18 casties 1055:
1056: \item[description] more information (free text, e.g. holding
1057: library call number)
1.12 casties 1058: \end{description}
1059:
1.26 casties 1060: \item[copyright] the copyright holder and the copyright conditions. \\
1061: The type of resource this condition applies to can be specified with a
1062: \texttt{type} attribute with the values ``original'' (the physical object
1063: that was scanned), ``digital-image'' (the scanned images), ``text''
1064: (the textual transcript).
1065:
1.12 casties 1066: \begin{description}
1.16 casties 1067: \item[owner] the name of the copyright holder
1.12 casties 1068: \begin{description}
1069: \item[name] a name (free text)
1070:
1071: \item[url] a URL (with an optional \texttt{label} attribute to show
1.26 casties 1072: as text) identifying the copyright holder
1.12 casties 1073: \end{description}
1074:
1075: \item[date] the date when the copyright was issued
1076:
1.16 casties 1077: \item[duration] the duration of the copyright term (if known)
1.12 casties 1078:
1079: \item[description] free-text field for special or additional
1080: conditions
1.27 ! casties 1081: \item[license] the type of license if it is a standardised license, e.g. Creative Commons
! 1082: \begin{description}
! 1083: \item[url] a URL representing the license e.g. \url{http://creativecommons.org/licenses/by/3.0/}
! 1084: \end{description}
! 1085:
1.12 casties 1086: \end{description}
1.14 casties 1087:
1088:
1089: \item[publish-metadata] metadata about this resource can be made
1.16 casties 1090: freely available when this tag is present (otherwise metadata has
1091: the same access conditions as the rest of the resource). Access to
1092: the resource itself is regulated separately by the \texttt{access}
1093: element.
1.12 casties 1094:
1.16 casties 1095: \item[access] conditions of access to this resource. Different
1096: access types are specified by a \texttt{type} attribute:
1.12 casties 1097: \begin{description}
1.16 casties 1098: \item[type=group] access restricted to the members of this named
1099: group. The method to identify a user belonging to a named group
1100: is not specified in this document.
1101: \begin{description}
1102: \item[name] name of the group.
1103:
1104: \item[only-before] the access condition is only valid before the
1105: given date (format: ``YYYY/MM/DD'').
1106:
1107: \item[only-after] the access condition is only valid after the
1108: given date (format: ``YYYY/MM/DD'').
1109: \end{description}
1110:
1111: \item[type=institution] access restricted to the members of this
1112: institution. The method to identify a user belonging to the
1113: institution is not specified in this document.
1.12 casties 1114: \begin{description}
1.16 casties 1115: \item[name] name of the institution.
1116:
1117: \item[only-before] the access condition is only valid before the
1118: given date (format: ``YYYY/MM/DD'').
1119:
1120: \item[only-after] the access condition is only valid after the
1121: given date (format: ``YYYY/MM/DD'').
1122: \end{description}
1123:
1124:
1125: \item[type=subnet] access restricted to all computers with an
1126: IP-address in this subnet.
1127: \begin{description}
1128: \item[range] subnet range defined in
1129: truncated-quad (e.g. ``141.14''), network-netmask
1130: (e.g. ``141.14.0.0/255.255.0.0''), or network-range
1131: (e.g. ``141.14.0.0/16'') notation.
1132:
1133: \item[only-before] the access condition is only valid before the
1134: given date (format: ``YYYY/MM/DD'').
1135:
1136: \item[only-after] the access condition is only valid after the
1137: given date (format: ``YYYY/MM/DD'').
1138: \end{description}
1139:
1.12 casties 1140:
1.16 casties 1141: \item[type=scientific] access to this resource should be restricted to
1142: scientific work
1143: \begin{description}
1144: \item[only-before] the access condition is only valid before the
1145: given date (format: ``YYYY/MM/DD'').
1146:
1147: \item[only-after] the access condition is only valid after the
1148: given date (format: ``YYYY/MM/DD'').
1.12 casties 1149: \end{description}
1.16 casties 1150:
1.12 casties 1151:
1.16 casties 1152: \item[type=free] access to this resource is not restricted
1153: \begin{description}
1154: \item[only-before] the access condition is only valid before the
1155: given date (format: ``YYYY/MM/DD'').
1.12 casties 1156:
1.16 casties 1157: \item[only-after] the access condition is only valid after the
1158: given date (format: ``YYYY/MM/DD'').
1159: \end{description}
1160:
1.12 casties 1161:
1.16 casties 1162: \item[type=special] if none of the above conditions seems appropriate,
1.12 casties 1163: a free-form text can be specified here.
1.16 casties 1164: \begin{description}
1165: \item[description] description of special access conditions.
1166:
1167: \item[only-before] the access condition is only valid before the
1168: given date (format: ``YYYY/MM/DD'').
1169:
1170: \item[only-after] the access condition is only valid after the
1171: given date (format: ``YYYY/MM/DD'').
1172: \end{description}
1173:
1.12 casties 1174: \end{description}
1175: \end{description}
1176: \end{description}
1177:
1178: \noindent
1.16 casties 1179: It should be noted that control over access to the resource has to be
1180: provided by additional technical measures. Access conditions in the
1181: metadata file only state that conditions \emph{should} be observed; it
1182: is not implied that they \emph{are} necessarily observed, as the
1183: enforcement of conditions depends on additional measures.
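
The following fragment sketches how the elements of the
\texttt{access-conditions} container could be combined. The
institution names, URLs, call number and dates are invented; the
subnet range and the license URL are the examples given above. Writing
\texttt{publish-metadata} as an empty element and the notation of the
copyright \texttt{date} are assumptions of this sketch.

\begin{small}
\begin{verbatim}
<access-conditions>
  <attribution type="original">
    <name>Example Library</name>
    <url label="Example Library">http://www.example.org/</url>
    <description>Call number MS 123</description>
  </attribution>
  <copyright type="digital-image">
    <owner>
      <name>Example Institute</name>
      <url>http://www.example.org/institute</url>
    </owner>
    <date>2010</date>
    <license>
      <url>http://creativecommons.org/licenses/by/3.0/</url>
    </license>
  </copyright>
  <publish-metadata/>
  <access type="subnet">
    <range>141.14.0.0/16</range>
    <only-before>2012/12/31</only-before>
  </access>
</access-conditions>
\end{verbatim}
\end{small}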
1.12 casties 1184:
1185:
1186:
1187: \subsection{Acquisition of raw-data}
1188: \label{sec:acqu-inform}
1189:
1190: Information about the acquisition source for raw data resources can be
1191: provided in an \texttt{acquisition} container.
1192:
1193: \begin{description}
1194: \item[acquisition] the acquisition source of this resource -- required
1195: for raw data.
1196: \begin{description}
1197: \item[provider] where this resource came from -- required
1198: \begin{description}
1199: \item[name] free-text name of the provider (institution or
1200: individual)
1201:
1202: \item[address] address of the provider
1203:
1204: \item[contact] contact person at the provider (i.e. name and email)
1205:
1206: \item[url] URL related to the provider
1.13 casties 1207:
1208: \item[provider-id] id of the provider (internally used) -- deduced
1.12 casties 1209: \end{description}
1210:
1211: \item[date] date of acquisition -- required
1212:
1213: \item[description] free-text description of the acquisition source or
1214: additional information
1215: \end{description}
1216: \end{description}
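
An \texttt{acquisition} container could, for example, be filled in as
follows. The provider, contact details and date are invented, and the
notation of the \texttt{date} is an assumption. The
\texttt{provider-id} element is omitted because it is marked as
deduced.

\begin{small}
\begin{verbatim}
<acquisition>
  <provider>
    <name>Example University Library</name>
    <address>1 Example Street, Exampletown</address>
    <contact>Jane Doe, jane.doe@example.org</contact>
    <url>http://www.example.org/library</url>
  </provider>
  <date>2003</date>
  <description>Scans delivered on DVD</description>
</acquisition>
\end{verbatim}
\end{small}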
1217:
1218:
1219:
1220: \subsection{Documentary Films}
1221: \label{sec:documentary-films}
1222:
1223: Documentary films can be described using a \texttt{film-acquisition}
1224: container.
1225:
1226: \begin{description}
1227: \item[film-acquisition] description of a (documentary) film --
1228: required for documentary film
1229: \begin{description}
1230: \item[recording] specification of the recording process
1231: \begin{description}
1232: \item[author] the person or persons doing the recording
1233:
1234: \item[date] the date or time span when the film was recorded
1235:
1236: \item[location] the place where the film was recorded
1237:
1238: \item[device] recording device used (e.g. ``Sony CP-DV8 Camcorder'')
1239:
1240: \item[format] format of the recorded film -- required (e.g. ``DV
1241: 720x524 25fps interlaced'')
1242: \end{description}
1243:
1244: \item[description] free-form description of the recording and the
1245: content of the film
1246: \end{description}
1247: \end{description}
1248:
1249: (More information about the digitization step could be added in a
1250: \texttt{digitization} tag similar to the \texttt{recording} tag.)
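
A \texttt{film-acquisition} container might look like the following
sketch. The device and format are the examples given above; the
author, date, location and description are invented, and the notation
of the \texttt{date} is an assumption.

\begin{small}
\begin{verbatim}
<film-acquisition>
  <recording>
    <author>Doe, Jane</author>
    <date>2002</date>
    <location>Berlin</location>
    <device>Sony CP-DV8 Camcorder</device>
    <format>DV 720x524 25fps interlaced</format>
  </recording>
  <description>Interview recorded in a workshop,
    one continuous take</description>
</film-acquisition>
\end{verbatim}
\end{small}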
1251:
1.1 casties 1252:
1253:
1254:
1.4 casties 1255: \section{Sample metadata files for ECHO resources}
1.1 casties 1256:
1.5 casties 1257: The following is a sample metadata index file for a directory containing a
1258: scanned document.
1259:
1260: \begin{small}
1.1 casties 1261: \begin{verbatim}
1.11 casties 1262: <resource type="ECHO" version="1.0">
1.5 casties 1263: <description>Fleck, 1980</description>
1264: <name>fleck.1980</name>
1265: <creator>University of Bern</creator>
1266: <archive-path>ubern/wiss-theorie</archive-path>
1267: <content-type>scanned images</content-type>
1268: <meta>
1269: <dri>echo23a45e2329x</dri>
1270: <lang>ger</lang>
1271: <bib type="book">
1272: <author>Fleck, Ludwik</author>
1273: <year>1980</year>
1274: <title>Entstehung und Entwicklung einer
1275: wissenschaftlichen Tatsache</title>
1276: <series-editor></series-editor>
1277: <series-title></series-title>
1278: <series-volume></series-volume>
1279: <number-of-pages></number-of-pages>
1280: <city>Frankfurt am Main</city>
1281: <publisher>Suhrkamp</publisher>
1282: <edition></edition>
1283: <number-of-volumes></number-of-volumes>
1284: <translator></translator>
1285: <isbn-issn></isbn-issn>
1286: <keywords>Wissenschaftstheorie, Fleck, Tatsache</keywords>
1287: <abstract></abstract>
1288: </bib>
1289: </meta>
1290: <dir>
1291: <description>Scanned images (300dpi)</description>
1292: <name>img</name>
1293: </dir>
1.4 casties 1294: </resource>
1295: \end{verbatim}
1.5 casties 1296: \end{small}
1.4 casties 1297:
1.5 casties 1298: The following is a sample metadata file for a single image of an
1299: architectural drawing.
1.4 casties 1300:
1.5 casties 1301: \begin{small}
1.4 casties 1302: \begin{verbatim}
1.11 casties 1303: <resource type="ECHO" version="1.0">
1.5 casties 1304: <creator>Bibliotheca Hertziana</creator>
1305: <content-type>scanned images</content-type>
1306: <file>
1307: <name>00000271-asl-160-r-full.tif</name>
1308: <meta>
1309: <img>
1310: <original-dpi>315</original-dpi>
1311: </img>
1312: <dri>echo45a67bc4367d</dri>
1313: <lang>ita</lang>
1314: <doc type="Architectural Drawing">
1315: <person>Ciolli, Giacomo</person>
1316: <person>Urban VIII; Barberini, Maffeo</person>
1317: <location>Accademia di San Luca</location>
1318: <location>Roma</location>
1319: <date>1706</date>
1320: <object>Concorso Clementino</object>
1321: <object>Fontana Pubblica</object>
1322: <object>Brunnen</object>
1323: <object>ASL 160</object>
1324: <keywords></keywords>
1325: </doc>
1326: <context>
1327: <url>http://colosseum.biblhertz.it:8080/Lineamenta/
1328: 1033478408.39/1035196181.35/1035196204.09/1035394121.83
1329: </url>
1330: </context>
1331: </meta>
1332: </file>
1.2 casties 1333: </resource>
1.1 casties 1334: \end{verbatim}
1.5 casties 1335: \end{small}
1.1 casties 1336:
1337: \end{document}
1338:
1339: %%% Local Variables:
1340: %%% mode: latex
1341: %%% TeX-master: t
1342: %%% End: