LegacySpecs: Xuanji-yishu-SPECS.txt

File Xuanji-yishu-SPECS.txt, 2.4 KB (added by hyman, 16 years ago)
DATA ENTRY SPECIFICATIONS FOR XUANJI YISHU

GENERAL

1. ENCODING. The text shall be encoded as Unicode

2. PAGE BREAKS. Indicate the end of each page by typing <br>

3. COLUMN BREAKS. Use one input line for each column. A return should
be typed after each column
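
As an illustration of items 2 and 3, the following is a minimal sketch of
how a finished transcription could be split back into pages and columns.
The function name split_pages and the sample text are hypothetical, and
the sketch assumes that blank input lines carry no meaning; the
specification does not say whether <br> sits on a line of its own.

```python
def split_pages(transcription: str) -> list[list[str]]:
    """Split a transcription into pages, each a list of column lines.

    Assumption: <br> ends a page (item 2) and every non-blank input
    line is one column (item 3).
    """
    pages = []
    for chunk in transcription.split("<br>"):
        # Keep each column as typed; leading spaces may mark a heading.
        columns = [line for line in chunk.splitlines() if line.strip()]
        if columns:
            pages.append(columns)
    return pages


# Hypothetical sample: two columns on the first page, one on the second.
sample = "第一欄\n第二欄\n<br>\n第三欄\n<br>\n"
pages = split_pages(sample)
```

Each column is kept exactly as typed, so the indentation that marks a
second level heading (item 4) survives the split.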

4. HEADINGS. This text contains two levels of heading. All headings
occur in a column by themselves. The first level heading always ends
with "juan4" and a number or the character "mo4". These headings
should be tagged with a <h1> at the beginning and </h1> at the
end. The second level heading is always indented by one or two
spaces. These headings should be tagged with a <h2> at the beginning
and </h2> at the end

5. INDENTED PARAGRAPHS. The text contains commentaries where each
column is indented at the top. Where these occur, a tag <ind> should
be put at the beginning and </ind> at the end

6. SMALL CHARACTERS. Some parts of the text are written in small
characters. Sequences of small characters should be tagged with a
<small> at the beginning and </small> at the end of the sequence

7. ILLEGIBLE CHARACTERS. If a character is illegible because of bad
printing, indicate it as <x> -- use one <x> for each illegible character

8. UNKNOWN CHARACTERS. If a character is not recognized or cannot be
encoded in Unicode, indicate it by a numeric tag, so that the first
unrecognized character is indicated as <01>, the second as <02> and so
on; keep a list of these characters and reuse the same numeric code if
the character recurs later in the text. Please provide a list
showing the numeric codes used and a reproduction of the character for
which they stand
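
The bookkeeping described in item 8 can be sketched as a small registry.
This is only an illustration: the class name UnknownCharRegistry and the
free-text labels used to identify a character are hypothetical, and the
reproduction of each character still has to be supplied separately.

```python
class UnknownCharRegistry:
    """Assign numeric tags to unknown characters in order of first use."""

    def __init__(self) -> None:
        # Maps a keyer-chosen label for the character to its numeric code.
        self._codes: dict[str, int] = {}

    def tag_for(self, label: str) -> str:
        """Return the tag for a character, reusing its code if seen before."""
        if label not in self._codes:
            self._codes[label] = len(self._codes) + 1
        # Two digits as in the specification's <01>, <02>; codes above 99
        # simply grow to three digits, which the specification leaves open.
        return f"<{self._codes[label]:02d}>"

    def listing(self) -> list[tuple[str, str]]:
        """Return (tag, label) pairs in order of assignment."""
        return [(f"<{code:02d}>", label) for label, code in self._codes.items()]


registry = UnknownCharRegistry()
first = registry.tag_for("first seen p.14, col.3")    # hypothetical label
second = registry.tag_for("first seen p.15, col.1")   # hypothetical label
again = registry.tag_for("first seen p.14, col.3")    # same character again
```

The listing() output corresponds to the list of numeric codes that the
specification asks to be delivered with the transcription.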

9. PAGE NUMBERS. Page numbers should be typed with <pn> at the
beginning and </pn> at the end. Other information in the middle column
(book title, edition) does not need to be typed for this text

10. TABLE OF CONTENTS. Items in the table of contents are separated by
vertical space. Make sure that one or more IDEOGRAPHIC SPACES (Unicode
character U+3000) are typed between the items
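
To show why item 10 insists on U+3000, here is a minimal sketch of how
the items in one table-of-contents column could be recovered afterwards.
The function name split_toc_items and the sample entries are
hypothetical.

```python
import re

IDEOGRAPHIC_SPACE = "\u3000"


def split_toc_items(column: str) -> list[str]:
    """Split one table-of-contents column on runs of U+3000."""
    parts = re.split(f"{IDEOGRAPHIC_SPACE}+", column)
    # Drop empty strings produced by leading or trailing spaces.
    return [part for part in parts if part]


# Hypothetical column with two spaces after the first item, one after the second.
toc_items = split_toc_items("天文\u3000\u3000地理\u3000曆法")
```

An ordinary ASCII space typed in place of U+3000 would not be treated as
a separator by this sketch, so the items on either side would stay
joined.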

11. TITLES OF FIGURES. These should be typed, and a tag <fig> used to
indicate the location of the figure
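
The tags introduced in items 2 to 11 can be checked mechanically. The
following is a rough sketch, not a definitive validator: the function
name check_tags is hypothetical, and it assumes that <br>, <x>, <fig>
and the numeric tags stand alone while the other tags come in properly
nested pairs. The specification does not state how tags may nest.

```python
import re

PAIRED = {"h1", "h2", "ind", "small", "pn"}   # need a closing tag
STANDALONE = {"br", "x", "fig"}               # assumed to have no closing tag
TAG = re.compile(r"<(/?)([^<>]+)>")


def check_tags(text: str) -> list[str]:
    """Return a list of tagging problems; an empty list means none found."""
    problems = []
    stack = []  # currently open paired tags
    for match in TAG.finditer(text):
        closing = match.group(1) == "/"
        name = match.group(2)
        is_numeric = name.isascii() and name.isdigit()  # <01>, <02>, ...
        if name in STANDALONE or is_numeric:
            if closing:
                problems.append(f"unexpected closing tag </{name}>")
        elif name not in PAIRED:
            problems.append(f"unknown tag <{'/' if closing else ''}{name}>")
        elif not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unmatched closing tag </{name}>")
    problems.extend(f"unclosed tag <{name}>" for name in stack)
    return problems


# Hypothetical page fragment using the tags from the specification.
good = "<h1>卷一</h1>\n<ind>註<small>小字</small><x><01></ind>\n<pn>十四</pn>\n<br>"
problems_good = check_tags(good)
problems_bad = check_tags("<small>小字")
```

A check of this kind only covers tag syntax; whether a heading or a
small-character sequence was tagged in the right place still needs a
comparison against the scans.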

WORK SAMPLE AND ESTIMATE

1. Please transcribe a sample of 2 scans (there are two pages per scan)
starting from page 14 according to the specifications given above

2. We will provide the physical text to FORMAX for scanning

3. Please provide a cost estimate for transcription of the entire text
(except for the first thirteen pages)