wiki:First evaluation

Version 18 (modified by Wolfgang Schmidle, 16 years ago)

The first few pages of Diversae "Conimbricenses In Universam dialecticam" (1606), Benedetti, Giovanni Battista de "Diversarvm specvlationvm mathematicarum, et physicarum liber" (1585) and Euclid "Elementorum Libri XV" (1607) were digitized and sent back for evaluation. In general, the results are very good.

Unfortunately, the work sample does not contain a page of the Conimbricenses where the Special Instructions apply.

PDF versions of the work samples are attached. In these PDF versions, the font is Helvetica 12pt (10pt for Benedetti), blank lines have been inserted before <pb> tags, and < > { } _ are in bold face.

Offsets between ECHO page numbers and the page numbers in the book: Diversae 2, Benedetti 12

What does work

  • Letters with swashes are recognized, except for one instance of Quod, which was transcribed as Luod. Character recognition accuracy is surprisingly high, e. g. Conimbricenses, p. 3.
  • The list of unknown characters is used (two characters so far), and unreadable text is marked up accurately.
  • Multiline headings are recognized, possibly because of punctuation.
  • Both methods of marking up italics in headings are used:
    <h it>TRACTATVS QVI IN HOC
    volumine continentur.</h>
    (Benedetti, p. 6)

    <h>_Theoremata Arithmetica._</h>
    (Benedetti, p. 13)

  • Library stamps are either typed:
    <h>MAX-PLANCK-INB<?>TITUT
    $<?>UR WISS<?>ENSCN<?>AF<?>T@@@@CHICHTE</h>
    <h>Bibliothek</h>

    or coded as <fig>:
    <h><red>E SOCIETATE IESV,</r></h>
    <h>_IN VNIVERSAMDIA_
    _Iecticam Ari$totelis Stagiritæ_</h>
    <fig>
    <fig>
  • Parentheses work well. There is only one example with spaces inside the parentheses (Benedetti, p. 9), and there the original has spaces as well.
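Since italic headings arrive in two forms, any later processing has to accept both. The following is only a rough sketch of how that could be checked, assuming that headings are always wrapped in <h>...</h> and that the _ _ form wraps each heading line separately, as in the samples above. The function name italic_headings is hypothetical and not part of the DESpecs.

```python
import re

# Matches <h>...</h> and <h it>...</h>; headings may span several lines.
HEADING_RE = re.compile(r"<h(\s+it)?>(.*?)</h>", re.DOTALL)


def italic_headings(text):
    """Return a list of (heading_text, is_italic) for every heading in text."""
    results = []
    for match in HEADING_RE.finditer(text):
        has_it_argument = bool(match.group(1))
        lines = [ln.strip() for ln in match.group(2).splitlines() if ln.strip()]
        # Assumption: in the _ _ form every line of the heading is wrapped.
        underscored = bool(lines) and all(
            len(ln) >= 2 and ln.startswith("_") and ln.endswith("_")
            for ln in lines
        )
        if underscored:
            lines = [ln.strip("_") for ln in lines]
        results.append((" ".join(lines), has_it_argument or underscored))
    return results


if __name__ == "__main__":
    sample = (
        "<h it>TRACTATVS QVI IN HOC\nvolumine continentur.</h>\n"
        "<h>_Theoremata Arithmetica._</h>\n"
        "<h>Bibliothek</h>"
    )
    for heading, is_italic in italic_headings(sample):
        print(is_italic, heading)
```

Both Benedetti headings come out as italic with this check, while a plain heading such as the library stamp does not.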

What does not work

  • The <red> tag is always closed by the </r> tag, so the opening and closing tags do not match.
  • Some ornamental figures are not tagged.
  • Various mistypings occur frequently:
  • The number 10 becomes <sc>IO</sc> in Euclid, p. 13. A date on the same page is recognized correctly.
  • Greek ligatures
    • The letter variant of τ was recognized, but τ itself (in the same word!) was typed as T, as in άγεωμέΤρητ@ (Euclid, p. 9). It was typed correctly in the next word.
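Tag problems such as <red> being closed by </r> could be found mechanically. The sketch below is one possible check, not part of the delivered workflow. It assumes that a closing tag repeats the name of its opening tag (as in <sc>IO</sc>) and that <fig>, <pb> and <?> never take a closing tag. The function name find_mismatched_tags and the STANDALONE set are hypothetical.

```python
import re

# Assumption: these tags stand alone in the samples and are never closed.
STANDALONE = {"fig", "pb", "?"}

# Captures an optional slash, the tag name, and ignores arguments like "it".
TAG_RE = re.compile(r"<(/?)([A-Za-z?]+)[^<>]*>")


def find_mismatched_tags(text):
    """Return (opened, closed) name pairs where a closing tag does not match.

    opened is None when a closing tag appears with nothing left open.
    """
    stack = []
    problems = []
    for match in TAG_RE.finditer(text):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if name in STANDALONE:
            continue
        if not is_closing:
            stack.append(name)
        elif stack:
            opened = stack.pop()
            if opened != name:
                problems.append((opened, name))
        else:
            problems.append((None, name))
    return problems


if __name__ == "__main__":
    print(find_mismatched_tags("<h><red>E SOCIETATE IESV,</r></h>"))
```

Run on the Conimbricenses heading quoted above, this reports the red/r pair and accepts the surrounding <h>...</h>.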

Adjustments to be made

  • DESpecs 1.1.2 does not state that the <mg{l|r}> tag may contain the it argument, so the _ _ markup is used consistently there instead. The Specs should allow the it argument in this tag.
