Changes between Initial Version and Version 1 of HighLevelRequirements


Timestamp:
Sep 19, 2008, 10:26:08 AM
Author:
Wolfgang Schmidle
Comment:

--

  • HighLevelRequirements

    v1 v1  
     1= High level requirements for DE Specs =
     2
     3== general principles ==
     4  * must specify a standard file encoding and unambiguous conventions for entry of non-ASCII characters
     5  * a convention is needed for DE personnel to indicate and record unknown characters
     6  * character entry conventions must be ergonomic and within capabilities of DE firm
     7  * DE output must be plain text, but will not be well-formed XML
     8  * DE markup should be concise and unambiguous
     9  * DE markup should facilitate conversion to target structured XML document
     10
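
The character-level principles above could be checked mechanically on delivered files. The following is a minimal sketch only: it assumes UTF-8 as the standard encoding and a hypothetical placeholder `[?]` for characters the DE personnel could not identify. Neither choice is fixed by these requirements, and the function name is illustrative.

```python
import unicodedata

# Assumption: "[?]" is a hypothetical marker for an unknown character.
UNKNOWN_MARK = "[?]"


def check_transcription(raw: bytes, encoding: str = "utf-8") -> dict:
    """Decode DE output and report basic character-level facts.

    Raises UnicodeDecodeError if the bytes are not valid in the
    assumed standard encoding.
    """
    text = raw.decode(encoding)
    # Collect the distinct non-ASCII characters for manual review.
    non_ascii = sorted({ch for ch in text if ord(ch) > 127})
    return {
        "unknown_count": text.count(UNKNOWN_MARK),
        "non_ascii": non_ascii,
        # NFC check: flags text that mixes precomposed and combining forms.
        "is_nfc": unicodedata.is_normalized("NFC", text),
    }
```

A report of this kind would let a reviewer see at a glance how many characters were left unresolved and which non-ASCII characters actually occur in a delivery.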
     11== required structural features ==
     12  * conventions are needed for standard line, paragraph, and page-level structure
     13  * markup needs to indicate not only where a feature starts, but also where it ends, unless automatic inference of the end location is trivial
     14  * must address headers/footers, notes (marginal, foot- and end-), tables, lists, and figures
     15  * must support multi-column layouts
     16  * must indicate relation of text to commentary, where these are presented on the page together
     17  * must indicate emphasis (e.g. italics)
     18  * must indicate change of typestyle, where this is semantically significant
     19  * must define conventions for abbreviations
     20
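
The requirement that markup indicate where a feature ends as well as where it starts can be verified with a simple stack. This is a sketch under assumptions: the marker syntax `<name>` / `</name>` is hypothetical (the actual DE markup is not defined here), and the check covers only explicitly closed features, not those whose end is inferred automatically.

```python
import re
from typing import List

# Assumption: features are delimited by hypothetical <name> ... </name> markers.
TAG = re.compile(r"<(/?)([a-z]+)>")


def unclosed_features(text: str) -> List[str]:
    """Return a list of problems: mismatched end markers and unclosed starts."""
    stack: List[str] = []
    problems: List[str] = []
    for match in TAG.finditer(text):
        is_end = match.group(1) == "/"
        name = match.group(2)
        if not is_end:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            # End marker that does not match the innermost open feature.
            problems.append(f"unexpected </{name}>")
    # Anything still on the stack was started but never ended.
    problems.extend(f"unclosed <{name}>" for name in stack)
    return problems
```

For example, `unclosed_features("<it>emphasis</it> plain")` returns an empty list, while `unclosed_features("<it>emphasis")` returns `["unclosed <it>"]`.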
     21== expository aspects ==
     22  * conventions should be indicated in numbered sections
     23  * language needs to be kept simple and readable for Chinese employees
     24  * complex structural features should be illustrated with an example (or examples) from actual texts and desired transcription
     25
     26== coverage ==
     27  * DE is not appropriate where OCR would be more cost-effective
     28  * material needed by the Institute's scientists in the near future should be accommodated
     29  * version targets
     30    * DE Specs 1.0 should cover printed European books up to the nineteenth century
     31    * DE Specs 1.1 should add support for Chinese books
     32    * DE Specs 2.0 should also cover transcriptions of annotated matter or manuscripts made by students or other personnel
     33  * out of scope for DE Specs 1.0-2.0
     34    * specialized document types such as dictionaries
     35    * dramatic and verse literature
     36    * complex formal language content (e.g. mathematics, chemical formulae, musical notation)
     37    * documents such as notebooks, personal letters, and financial documents
     38    * twentieth-century material (perhaps with certain exceptions)