= Instructions for cutting out images =

 1. Cut the figures out of the TIFF images rather than the compressed JPG images in {{{online_permanent/library}}} on foxridge. The Digigroup should know where the relevant images are.
 1. Do not cut out drop caps or embellishments, except for decorative images on the title page.
 1. If there is already an {{{xml}}} version of the text, it is handy to extract a list of all figures that are to be cut out. You can do this, for example, by using XQuery in the display system: {{{//echo:figure}}} or {{{//echo:image}}}, depending on how much context you want (with or without caption). A list of all figures in ECHO has been extracted and converted to HTML ([https://it-dev.mpiwg-berlin.mpg.de/tracs/mpdl-project-software/attachment/wiki/WikiStart/echo-figures.html download]); it contains links to each page containing figures. Note that a link might take you to the page after the picture (this happens if the link is in a float-div).
 1. Now, in the viewing environment, mark the images using digilib's "zoom area" tool. Be careful not to include surrounding text, such as the catchword at the bottom of the page. Captions, however, are to be cut out together with the figure. It is sometimes advisable to first cut out a bigger section around the figure. Some small images are only there for decorative purposes. The policy is not to cut these out. Later on, they have to be deleted from the XML file.
 1. Save the URLs of the pages with the zoomed area into a text file. Keyboard shortcuts come in handy here: {{{cmd-l-c-w}}} copies the link in the address bar and closes the tab, {{{cmd-TAB}}} switches to a text editor, {{{cmd-v}}} inserts the link. To note in the file which images have to be removed from the XML, copy the URL to the text file, but insert a {{{#}}} before it. You can also write other comments into this file, but be sure to begin the line with a {{{#}}}.
    The resulting list is to be saved in a new directory on the same level as the {{{raw}}} and the {{{xml}}} directory (see [source:/trunk/texts/WO_1/Stevin_1605] as an example). When trained, the average speed for cutting out figures is 2.5 figures per minute (Stevin_1605 was completed in 2 hours).
 1. On the basis of this text file, the Python script [source:/trunk/schema/scripts/cut_figures/cut_figures.py cut_figures.py] cuts the images out of the original TIFF files by calling ImageMagick's commands {{{identify}}} and {{{convert}}}, and saves them in the desired format {{{page-imagenumber}}} (e.g., if page image {{{0056.tif}}} has three figures, these figures will be saved as {{{0056-01.tif}}}, {{{0056-02.tif}}} and {{{0056-03.tif}}}). They are stored in a folder called {{{figures}}}.
 1. The following error might occur:
    {{{
    convert: AnErrorHasOccurredReadingFromFile `/Volumes/online_permanent/archimedes_repository/large/stevi_stati_527_la_1605/527-01-pageimg/527.01.139.jpg': Bad file descriptor @ constitute.c/ReadImage/575.
    convert: missing an image filename `/Volumes/online_permanent/archimedes_repository/large/stevi_stati_527_la_1605/figures/527.01.139-02.jpg' @ convert.c/ConvertImageCommand/2775.
    }}}
 1. In that case, rename the directory on foxridge, run the script once more and merge the directories. Repeat this until the number of figures the script reports as extracted equals the number of files in the {{{figures}}} directory.
 1. Make sure that the excluded images are removed from both the raw text and the XML file.

== Discussion ==

 1. Should an [http://mpdl-dev.mpiwg-berlin.mpg.de/ECHOdocuView?url=/mpiwg/online/permanent/library/9NN63YC9&pn=2&viewMode=images Ex libris] be cut out? Answer: No.
 1. Should this be treated as one image: [http://echo.mpiwg-berlin.mpg.de/ECHOdocuView?pn=9&ws=1&wx=0.022&wy=0.0628&ww=0.8453&wh=0.4184&url=/mpiwg/online/permanent/library/PUBSU9QD&viewMode=images&tocMode=thumbs&tocPN=1&searchPN=1&characterNormalization=regPlusNorm Apian 1550]? Otherwise, spatial information might get lost (see the text version of that page).
 1. Data entry tagged every single figure [http://mpdl-dev.mpiwg-berlin.mpg.de/ECHOdocuView?url=/mpiwg/online/permanent/library/S7ECRGW8&pn=295&viewMode=images on this page]. Should this be preserved? Probably yes, as some figures have variables attached (see the text version).
 1. What about ornamental material like [http://mpdl-dev.mpiwg-berlin.mpg.de/ECHOdocuView?pn=7&ws=1&wx=0.1512&wy=0.4966&ww=0.5304&wh=0.348&url=/mpiwg/online/permanent/library/WCWY69V2&viewMode=images this]? Maybe the ornament on the title page should always be cut out for eye candy reasons (because at some point in the future the viewer may display this page as the default first page). However, it is missing in Benedetti 1585.
 1. Hypothesis to be checked: before the data entry (DE) firm was supposed to type variables, [http://mpdl-dev.mpiwg-berlin.mpg.de/ECHOdocuView?url=/mpiwg/online/permanent/library/16HBZHF5&pn=337&viewMode=images multifigure pages] were not divided into parts.

== Number of figures per document ==

(NB: the numbers are based on the {{{figure}}} tags in the XML document, which in turn are based on the {{{fig}}} tags typed in by the data entry firm. Deciding what counts as a figure is an intellectual process and cannot be left to the data entry firm. Thus, the number of figures per document can differ slightly.)

||= Book =||= Figures =||
|| /echo/de/Adams_1785_S7ECRGW8.xml || 108 ||
|| /echo/de/Bernstein_1897_01-05_GGAGCX1B.xml || 102 ||
|| /echo/de/Bernstein_1897_06-11_PWVX6XFT.xml || 117 ||
|| /echo/de/Bernstein_1897_12-16_X323E11C.xml || 68 ||
|| /echo/de/Bernstein_1897_17-21_HQ8URX9E.xml || 126 ||
|| /echo/de/Bion_1765_TGXUZC1H.xml || 434 ||
|| /echo/de/Boskovic_1765_YPS3EYQ2.xml || 29 ||
|| /echo/de/Lehmann-Brockhaus_1983.xml || 415 ||
|| /echo/de/Specklin_1599_SSM0YQED.xml || 130 ||
|| /echo/en/Apollonius_1771_FDWQ9FD5.xml || 27 ||
|| /echo/en/Bacon_1670_WX8HY2V2.xml || 19 ||
|| /echo/en/Gravesande_1724_N1TU6UZF.xml || 82 ||
|| /echo/en/Wilkins_1684_TG3ZW27M.xml || 17 ||
|| /echo/fr/Belidor_1754_M1R3K3S6.xml || 68 ||
|| /echo/fr/Belidor_1757_R04RNX9Y.xml || 68 ||
|| /echo/fr/Berzelius_1819_WCWY69V2.xml || 2 ||
|| /echo/fr/Mersenne_1635_508_fr.xml || 61 ||
|| /echo/fr/Papin_1682_A8SP3HCB.xml || 25 ||
|| /echo/fr/Ufano_1628_QXRZU2BV.xml || 68 ||
|| /echo/fr/Varignon_1687_TP04WPNS.xml || 62 ||
|| /echo/fr/Vitruvius_1618_3XFC5KGV.xml || 125 ||
|| /echo/fr/Voltaire_1738_1FP6HWGK.xml || 123 ||
|| /echo/it/Alberti_1565_5PPYB69C.xml || 105 ||
|| /echo/it/Angeli_1668a.xml || 27 ||
|| /echo/it/Angeli_1668b.xml || 13 ||
|| /echo/it/Angeli_1671.xml || 37 ||
|| /echo/it/Benedetti_1579_507_it.xml || 1 ||
|| /echo/it/Bianconi_1746.xml || 5 ||
|| /echo/it/Casati_1685_1YZKBTHR.xml || 75 ||
|| /echo/it/Cataneo_1567_DSDY9XH0.xml || 69 ||
|| /echo/it/Cataneo_1572_ZBAS6ZM1.xml || 129 ||
|| /echo/it/Cavalieri_1632_CE3XGS5P.xml || 30 ||
|| /echo/it/Gallaccini_1767_D09WWP72.xml || 224 ||
|| /echo/it/Heron_1601_M5C8103Y.xml || 34 ||
|| /echo/it/Vitruvius_1524_ZFRVKXMF.xml || 135 ||
|| /echo/it/Vitruvius_1556_XYTWCGV1.xml || 168 ||
|| /echo/it/Vitruvius_1747_Y1G1TRCW.xml || 14 ||
|| /echo/it/Zanotti_1752_16HBZHF5.xml || 2 ||
|| /echo/it/Zonca_1656_UR271U6Y.xml || 55 ||
|| /echo/la/Angeli_1659.xml || 92 ||
|| /echo/la/Apian_1541_9TE6563P.xml || 12 ||
|| /echo/la/Apian_1550_PUBSU9QD.xml || 69 ||
|| /echo/la/Apollonius_1661_1X8T70WB.xml || 526 ||
|| /echo/la/Archimedes_1565.xml || 151 ||
|| /echo/la/Archimedes_1565_YS05QMU8.xml || 38 ||
|| /echo/la/Aristoteles_1547.xml || 13 ||
|| /echo/la/Aristoteles_1548_9NN63YC9.xml || 1 ||
|| /echo/la/Barrow_1674.xml || 40 ||
|| /echo/la/Benedetti_1585.xml || 444 ||
|| /echo/la/Bernoulli_1738_AZ870BWE.xml || 28 ||
|| /echo/la/Biancani_1635_GWS4WXH4.xml || 147 ||
|| /echo/la/Casati_1686_UEY6QQZ7.xml || 18 ||
|| /echo/la/Cataneo_1600.xml || 117 ||
|| /echo/la/Cavalieri_1653.xml || 370 ||
|| /echo/la/Clavius_1581_MXTKM8TF.xml || 432 ||
|| /echo/la/Clavius_1586.xml || 372 ||
|| /echo/la/Clavius_1591_DP9UZA52.xml || 104 ||
|| /echo/la/Clavius_1606_FBVYV7EH.xml || 294 ||
|| /echo/la/Ghetaldi_1603_FQPFR8XP.xml || 23 ||
|| /echo/la/Gravesande_1721_KN9XTZRQ.xml || 98 ||
|| /echo/la/Heron_1680_C3M2XK8N.xml || 98 ||
|| /echo/la/Huygens_1724_1_BYCAB3V6.xml || 168 ||
|| /echo/la/Huygens_1724_2_Y97ESDAP.xml || 241 ||
|| /echo/la/Musschenbroek_1729_H9ZYCGQ0.xml || 32 ||
|| /echo/la/Stevin_1605_527_la.xml || 251 ||
|| /echo/la/Vitruvius_1490_4YSU4X91.xml || 1 ||
|| /echo/la/Vitruvius_1511_XS9KA6WS.xml || 136 ||
|| /echo/la/Vitruvius_1543_T05R2RPS.xml || 91 ||
|| /echo/la/Vitruvius_1544_2UZM8E2N.xml || 64 ||
|| /echo/la/Vitruvius_1552_UNCWSTHE.xml || 107 ||
|| /echo/la/Vitruvius_1567_514_la.xml || 174 ||
|| /echo/la/Vitruvius_1800_V82APKX9.xml || 1 ||
|| /echo/la/Viviani_1659.xml || 272 ||
|| /echo/la/Weidler_1726_NSW2F6PF.xml || 17 ||
|| /echo/la/Zubler_1607_DNYGYWGH.xml || 20 ||
|| /echo/la/alvarus_1509.xml || 14 ||
|| /echo/zh/SongYingxing_1637.xml || 160 ||
|| '''Total:''' || '''8635''' ||
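
The counts in the table are derived from the {{{figure}}} tags in each XML document. The following is a minimal Python sketch of how such a count could be obtained; it is not the script that produced the table. The function names {{{local_name}}} and {{{count_figures}}} and the sample document are hypothetical, and the namespace URI in the sample is a placeholder. Because the real ECHO namespace URI is not given here, the sketch matches elements by local name only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def count_figures(xml_text, element="figure"):
    """Count elements with the given local name, ignoring namespaces.

    Assumption: every figure in the document is marked up with an element
    whose local name is 'figure' (as in the XQuery //echo:figure).
    """
    root = ET.fromstring(xml_text)
    return sum(
        1
        for node in root.iter()
        if isinstance(node.tag, str) and local_name(node.tag) == element
    )


# Hypothetical miniature document; the namespace URI is a placeholder,
# not the real ECHO namespace.
SAMPLE = """<echo:echo xmlns:echo="http://example.org/echo-placeholder">
  <echo:text>
    <echo:figure><echo:image file="0056-01"/></echo:figure>
    <echo:figure><echo:image file="0056-02"/></echo:figure>
  </echo:text>
</echo:echo>"""

if __name__ == "__main__":
    print(count_figures(SAMPLE))
```

For a file on disk, the text could be read first and passed to {{{count_figures}}}. Passing {{{element="image"}}} counts the {{{image}}} elements instead, which corresponds to the second XQuery expression mentioned in the instructions.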