= Junqi zaji =

Junqi zaji, [http://libcoll.mpiwg-berlin.mpg.de/libview?mode=imagepath&url=/mpiwg/online/permanent/library/EP585DF1/pageimg ECHO]

Part of [wiki:"Chinese Work Orders#ZhongYiWorkOrderB" Chin WO B].

Sent with DESpecs for Chinese text version 1.2 (see [attachment:wiki:DataEntrySpecs:DESpecs_1_2_chinese.pdf here])

Sent: yes/2009-02-20. Returned: yes/2009-05-04 [http://pythia.mpiwg-berlin.mpg.de/department1/mpdl/raw-texts/Chin_I_Junqi_zaji.txt/Chin_I_Junqi_zaji_V1.txt link]

[attachment:JQZJ_Code.pdf List of unknown characters] in this document.

== 1. First Analysis ==

=== Difficulties ===

=== Special Instructions ===

p.3: circled characters

"Mark circled characters by ( ), e.g. (甲)."

(It seems that this Special Instruction got lost, however. It re-appears as the answer to one question, see below.)

== 2. Questions From Formax ==

Q1. If the three books contain out-dented paragraphs, could you please give us a sample about ?
A: In these three books there are indeed no outdented paragraphs. (However, there are indented paragraphs, for example Junqi zaji p.0009, line 3. In the Euclid text Jihe yuanben 幾何原本 there were outdented paragraphs.)
About Junqi zaji
Q2. For this book, ics will be only used for the text in page 0001.jpg, i.e.
(二)純鋼造者
此種子彈
07-10: According to our Specs, this is correct. However, we would appreciate it if you could put all variables in a single tag, just as in the Euclid text Jihe yuanben. It would then look like this:
{{{
}}}
such as the kind of paragraphs in 0003.jpg, 0006.jpg, but if they are indented further than the normal paragraphs, we will mark them as . Is that right? Please confirm the markup below.
Markup samples for
{{{
0003.jpg
(甲)小粒黑藥 (乙)大粒黑藥 (甲)開花彈 (乙)子母彈 (甲)拉火 (乙)擊火 (丙)電火
}}}
instead of , or for all these lines (or , of course).
Q7. Please see the attached Codes.pdf: column 2 shows the source characters and column 3 the corresponding characters that we want to key.
(1) Could you please confirm whether lines 1-8 and 10-12 are correct?
(2) For line 9, should we key this character as 隷, as 隸, or as an unknown character, i.e. <001>?
A: How to proceed with character variants:
As always, we would like you to provide us with plain text files in Unicode UTF-8 encoding. We wish the texts to be transcribed making use of the full character repertoire of Unicode 5.1. That means, if a variant is encoded as a separate Unicode character (with a unique Unicode codepoint), we wish the variant to be encoded in the transcribed text by the corresponding Unicode character.
If Unicode 5.1 does not provide a distinct codepoint for a variant character, please assign an unknown character code and provide us with the standard variant in the list of unknown characters.
In an e-mail about the Euclid text Jihe yuanben 幾何原本, you said that you cannot type the Unicode character U+2F88D in the CJK Compatibility Ideographs Supplement block (U+2F800 - U+2FA1F) and used <002> instead. The font Sun-ExtB should cover this Unicode block, but some applications may have problems with Unicode characters above U+FFFF. Please tell us if your problems persist.
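Characters above U+FFFF can be located in a transcription before it is opened in an application that may mishandle them. The following is a minimal sketch, not part of the DESpecs; the function names are hypothetical and the sample string is made up for illustration.

```python
# Hypothetical helpers for spotting characters outside the Basic Multilingual Plane.
# The block boundaries are the ones quoted in the text above (U+2F800 - U+2FA1F).
SUPPLEMENT_START = 0x2F800
SUPPLEMENT_END = 0x2FA1F


def non_bmp_characters(text):
    """Return (index, character, 'U+XXXXX') for every code point above U+FFFF."""
    return [
        (index, ch, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ord(ch) > 0xFFFF
    ]


def in_compatibility_supplement(ch):
    """True if ch lies in the CJK Compatibility Ideographs Supplement block."""
    return SUPPLEMENT_START <= ord(ch) <= SUPPLEMENT_END


# Made-up sample: one supplementary character between two ASCII letters.
sample = "a\U0002F88Db"
found = non_bmp_characters(sample)
utf8_length = len("\U0002F88D".encode("utf-8"))  # such characters need 4 bytes in UTF-8
```

Any position reported by `non_bmp_characters` is a place where an editor or font without support for supplementary planes might display a placeholder instead of the intended glyph.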
Taken together, we want you to do this:
1. Please use Sun-ExtA and Sun-ExtB if possible.
2a. If a character variant exists as a reference glyph with unique codepoint in Unicode 5.1, type it.
2b. If the character variation does not exist in Unicode 5.1, assign an unknown character code and provide us with the standard variant in the list of unknown characters.
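Rules 2a and 2b can be sketched as a small routine. This is only an illustration under stated assumptions: the class and function names are hypothetical, the glyph identifiers (`g1`, `g2`, ...) are invented labels for distinct printed forms, and whether a glyph has its own codepoint is assumed to have been decided by the person keying the text.

```python
class UnknownCharacterList:
    """Hands out <001>, <002>, ... codes and remembers the standard variant for each."""

    def __init__(self):
        self._codes = {}  # glyph id -> (code, standard variant)

    def code_for(self, glyph_id, standard_variant):
        # The same printed form always receives the same code.
        if glyph_id not in self._codes:
            code = f"<{len(self._codes) + 1:03d}>"
            self._codes[glyph_id] = (code, standard_variant)
        return self._codes[glyph_id][0]

    def entries(self):
        """Lines for the list of unknown characters, in order of first use."""
        return [f"{code} ({variant})" for code, variant in self._codes.values()]


def transcribe(glyphs, unknowns):
    """glyphs: iterable of (glyph_id, unicode_char_or_None, standard_variant_or_None)."""
    pieces = []
    for glyph_id, unicode_char, standard_variant in glyphs:
        if unicode_char is not None:
            pieces.append(unicode_char)  # rule 2a: the variant has its own codepoint
        else:
            pieces.append(unknowns.code_for(glyph_id, standard_variant))  # rule 2b
    return "".join(pieces)


# Illustrative input only; the glyph ids are invented.
unknowns = UnknownCharacterList()
glyphs = [
    ("g1", None, "\u7832"),        # no codepoint of its own, standard variant 砲
    ("g2", "\U00028F7B", None),    # variant with a unique codepoint
    ("g1", None, "\u7832"),        # the same unknown form again
    ("g3", None, "\u98FE"),        # another unknown form, standard variant 飾
]
text = transcribe(glyphs, unknowns)
unknown_list = unknowns.entries()
```

Keeping the code assignment in one place means a repeated variant form gets the same code throughout the text, and the list of unknown characters falls out of the same data.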
Regarding the characters in Codes.pdf:
1. OK
2. <001> unknown characters list: (砲)
3. OK (assuming that it is a slip of the pen)
4. <002> unknown characters list: (飾)
5. <003> unknown characters list: (絨)
6. <004> unknown characters list: (墺 U+58BA)
7. <005> unknown characters list: (曜)
8. <006> unknown characters list: (紀)
9. 𨽻 (U+28F7B; it's a variant of 隸, but has a unique codepoint)
10. 痲 (U+75F2), not 痳 (U+75F3)
11. 神 (U+FA19), not 神 (U+795E)
12. We cannot identify the character from the image. Please provide us either with a better image or with the name of the text (Junqi zaji?), the page number and the line number.
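Items such as 10 and 11 depend on the exact codepoint, which cannot always be judged by eye. A short check with Python's standard `unicodedata` module can help; the sketch below is an illustration only and its function names are hypothetical. It also shows one pitfall: Unicode normalization (NFC) replaces the compatibility ideograph U+FA19 with the unified ideograph U+795E, so a file that has passed through a normalizing tool loses exactly the distinction asked for in item 11.

```python
import unicodedata


def codepoint(ch):
    """Format a single character as 'U+XXXX'."""
    return f"U+{ord(ch):04X}"


def changed_by_nfc(text):
    """List (original, normalized) codepoints for characters that NFC would alter."""
    changes = []
    for ch in text:
        normalized = unicodedata.normalize("NFC", ch)
        if normalized != ch:
            changes.append(
                (codepoint(ch), " ".join(codepoint(c) for c in normalized))
            )
    return changes


compat = "\uFA19"    # compatibility ideograph requested in item 11
unified = "\u795E"   # unified ideograph that must not be used there

# The two characters are different codepoints, but NFC maps one onto the other.
nfc_of_compat = unicodedata.normalize("NFC", compat)

# U+75F2 (item 10) is an ordinary unified ideograph and is left untouched.
report = changed_by_nfc("\u75F2" + compat + unified)
```

Running `changed_by_nfc` over a returned text file before any further processing indicates which characters would be silently replaced if the file were normalized.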
=== Additional Notes from Formax ===
1. We used > in the data for some unsure characters.
1. The text on 0033.jpg and the text on 0034.jpg should be interchanged. The order in which we keyed the two pages is:
After keying 0032.jpg, we keyed
1. the text on 0034.jpg,
1. the figure on 0033.jpg,
1. the text on 0033.jpg,
1. the figure on 0034.jpg.
Also, there is a missing page before the text on 0033.jpg. If you need us to key the missing page, please send it to us.
== 3. Analysis of the Result ==
=== Findings ===
=== Recommendation ===
A: Some lines beginning with circled characters could indeed be interpreted as list items. However, since most of these lines are relatively long, we would like you to use