Changes between Version 24 and Version 25 of First evaluation


Timestamp:
Jul 12, 2010, 10:30:18 AM
Author:
Wolfgang Schmidle
Comment:

--

  • First evaluation

    v24 v25  
    11The first fifty pages of [http://libcoll.mpiwg-berlin.mpg.de/libview?mode=imagepath&url=/mpiwg/online/permanent/library/D9V0Q862/pageimg Diversae "Conimbricenses In Universam dialecticam" (1606)], [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?mode=imagepath&url=/mpiwg/online/permanent/library/163127KK/pageimg Benedetti, Giovanni Battista de "Diversarvm specvlationvm mathematicarum, et physicarum liber" (1585)] and [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?mode=imagepath&url=/mpiwg/online/permanent/library/2QTVUHDT/pageimg Euclid  "Elementorum Libri XV" (1607)] were digitized and sent back for evaluation. In general, the results are very good.
    22
    3 Unfortunately, the work sample does not contain a page of the [http://libcoll.mpiwg-berlin.mpg.de/libview?mode=imagepath&url=/mpiwg/online/permanent/library/D9V0Q862/pageimg Conimbricenses] where the [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/DataEntrySpecs/DESpecs_special_Conimbricenses.pdf Special Instructions] apply.
     3Unfortunately, the work sample does not contain a page of the [http://libcoll.mpiwg-berlin.mpg.de/libview?mode=imagepath&url=/mpiwg/online/permanent/library/D9V0Q862/pageimg Conimbricenses] where the [attachment:wiki:DataEntrySpecs:DESpecs_special_Conimbricenses.pdf Special Instructions] apply.
    44
    55PDF versions of the work samples are attached. In these PDF versions, the font is Helvetica 12pt (10pt for Benedetti), blank lines have been inserted before <pb> tags, and < > { } _ are in bold face.
     
    1010= What does work =
    1111 * Letters with swashes are recognized, except for this [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=15&ws=1&ww=1&wh=1&mk=0.1666/0.5916&mode=imagepath&url=/mpiwg/online/permanent/library/2QTVUHDT/pageimg Quod], which was transcribed as Luod. Character recognition accuracy is surprisingly high, e.g. [http://libcoll.mpiwg-berlin.mpg.de/libcoll_zogilib?fn=/permanent/library/D9V0Q862/pageimg&pn=3&ws=1&wx=0.7282&wy=0.5497&ww=0.2504&wh=0.1974&mk=0.9353/0.6498 Conimbricenses, p. 3].
    12  * [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/First%20evaluation/Unknown%20Characters%20List.pdf List of unknown characters] is used (two characters so far); unreadable text is marked up accurately.
     12 * [attachment:"Unknown Characters List.pdf" List of unknown characters] is used (two characters so far); unreadable text is marked up accurately.
    1313 * Multiline headings are recognized, possibly because of punctuation
    1414 * Both methods of marking up italics in headings are used:
     
    3939}}}
    4040 * Parentheses work well. There is only one example with spaces within parentheses ([http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=9&ws=1&wx=0.0784&wy=0.2861&ww=0.7511&wh=0.1039&mk=0.185/0.3401&mode=imagepath&url=%2Fmpiwg%2Fonline%2Fpermanent%2Flibrary%2F163127KK%2Fpageimg Benedetti, p. 9]), and there the original has spaces as well.
    41  * The [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/First%20evaluation/Unknown%20Characters%20List.pdf List of unknown characters] works well and is obviously updated frequently. Unknown character <010>, however, is represented by the wrong image. The unknown character in question is [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=26&ws=1&wx=0.8&wy=0.3859&ww=0.1642&wh=0.1205&mk=0.8942/0.4563&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg this one]. Unknown characters <006> and <011> do not occur in the work samples. Characters [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=27&ws=1&wx=0.5074&wy=0.7187&ww=0.1477&wh=0.0473&mk=0.5726/0.7516&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg <012>] and [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=32&ws=1&wx=0.6519&wy=0.723&ww=0.2599&wh=0.0789&mk=0.7771/0.7772&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg <014>] occur in the text but are not on the list (yet?).
     41 * The [attachment:"Unknown Characters List.pdf" List of unknown characters] works well and is obviously updated frequently. Unknown character <010>, however, is represented by the wrong image. The unknown character in question is [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=26&ws=1&wx=0.8&wy=0.3859&ww=0.1642&wh=0.1205&mk=0.8942/0.4563&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg this one]. Unknown characters <006> and <011> do not occur in the work samples. Characters [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=27&ws=1&wx=0.5074&wy=0.7187&ww=0.1477&wh=0.0473&mk=0.5726/0.7516&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg <012>] and [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=32&ws=1&wx=0.6519&wy=0.723&ww=0.2599&wh=0.0789&mk=0.7771/0.7772&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg <014>] occur in the text but are not on the list (yet?).
    4242
    4343  Small problem with the list: there is only one list for all documents. Was this intended?
     
    5959 * What happens with spaces in the text like [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView/ECHOzogiLib?pn=24&ws=1&wx=0.1613&wy=0.8292&ww=0.7855&wh=0.0984&mk=0.5068/0.8679&mode=imagepath&url=/mpiwg/online/permanent/library/YS05QMU8/pageimg this one]? Are they meaningful?
    6060= Adjustments to be made =
    61  * The [https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content/attachment/wiki/DataEntrySpecs/DESpecs_1_1_2_overview.pdf DESpecs 1.1.2] do not state that the <mg{l|r}> tag may contain the ''it'' argument. Thus, the _ _ markup is used consistently. The Specs should explicitly allow the ''it'' argument.
     61 * The [attachment:wiki:DataEntrySpecs:DESpecs_1_1_2_overview.pdf DESpecs 1.1.2] do not state that the <mg{l|r}> tag may contain the ''it'' argument. Thus, the _ _ markup is used consistently. The Specs should explicitly allow the ''it'' argument.