| 1 | XML meeting, November 23rd, 2009 |
| 2 | * Present: JB, RC, PD, JK, FJK, MS, WS, KT, JW, DW |
| 3 | * Protocol: KT |
| 4 | * Language technology |
| 5 | * Greek |
| 6 | * JW: customized two scripts for Greek to transcode from Betacode to Unicode |
| 7 | * Macron problem still there |
| 8 | * WS: Capital letters problem solved |
| 9 | * Arabic |
| 10 | * JW: first used Aramorph system |
| 11 | * found an Extended Buckwalter on the net |
| 12 | * WS: Extended Buckwalter is for a Qur'an project; not sure if it is applicable |
| 13 | * Hyphen problem |
| 14 | * MS: hyphens are Buckwalter artefacts: should not be displayed (but not deleted either) |
| 15 | * PD: hyphens come from the FileMaker file (used by the transcriber to represent blanks) |
| 16 | * JB: are there other characters to be taken care of? |
| 17 | * JW: yes, is working on it with MS |
| 18 | * DW: Writing direction can be set in XHTML |
| 19 | * Normalizing |
| 20 | * RC: please document normalizing steps |
| 21 | * DW: make overview of the architecture |
| 22 | * Work on Liddell Scott Jones |
| 23 | * WS: final sigma is wrong in some places |
| 24 | * possible error source: new Lex script |
| 25 | * Problems in transcoding should have been solved already by Perseus |
| 26 | * Priorities |
| 27 | * 1. edo/ sum-problem |
| 28 | * 2. Validation |
| 29 | * 3. role of eXist repository |
| 30 | * 4. new eXist version |
| 31 | * 5. eSciDoc |
| 32 | * 6. Transcoding issues |
| 33 | * DTD/ RNG |
| 34 | * WS: implicit validation only through Schema and DTD |
| 35 | * explicit validation through RNG in eXist using Jing |
| 36 | * How to treat the DTD fragment |
| 37 | * DTD fragment supports validation and saves 10% of the documents' size |
| 38 | * Possible solution: separate validation and display of XML |
| 39 | * DW: in general, entities are difficult in XML, it is better to resolve them |
| 40 | * DTD fragment generates errors in a large number of tools |
| 41 | * WS: resolving makes the XML file harder to read and edit |
| 42 | * to have the XML file in eXist without the fragment, a simple XSLT script is sufficient |
| 43 | * WS: nice to have conversion back to version with fragment |
| 44 | * can be done via script |
| 45 | * Disambiguation problem |
| 46 | * Highest priority |
| 47 | * Possible solution: introduce hyperlemma |
| 48 | * Pollux: change to new dictionaries? |
| 49 | * JB: updates are not only corrected but also have a different structure |
| 50 | * dictionaries would have to be transformed |
| 51 | * DW: why do we not use the LSJ hosted at Perseus? |
| 52 | * JW: we do not know if the corresponding entries are there |
| 53 | * Perseus is slow |
| 54 | * Abbreviations |
| 55 | * WS: two different kinds of abbreviations |
| 56 | * book specific and commonly used ones |
| 57 | * should offer service |
| 58 | * PD: enhance the morphological server so that it also offers an abbreviation resolver |
| 59 | * similar to docspecs |
| 60 | * schedule a meeting for that |
| 61 | * WS collects material |
| 62 | * PD: WS solves these problems for the third time |
| 63 | * another task for the IT archaeological meeting |
| 64 | * Also missing: MDH's article on linguistic middleware |
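
Note on the final sigma item (Work on Liddell Scott Jones): a minimal sketch of a post-transcoding repair pass, assuming the error is a medial sigma (σ, U+03C3) left at the end of a word, or a final sigma (ς, U+03C2) left inside a word. The function name `fix_final_sigma` is hypothetical and not part of the Lex script or of any Perseus tool. The sketch does not handle elided forms followed by an apostrophe or a lone sigma used as a numeral or siglum.

```python
import re

# Assumption: "end of word" means the sigma is not followed by a
# letter, digit or underscore (Python's Unicode-aware \w).
_MEDIAL_AT_END = re.compile(r"σ(?!\w)")   # σ (U+03C3) at a word end
_FINAL_INSIDE = re.compile(r"ς(?=\w)")    # ς (U+03C2) before another letter


def fix_final_sigma(text: str) -> str:
    """Return text with sigma forms matched to their position in the word."""
    text = _MEDIAL_AT_END.sub("ς", text)   # word-final σ becomes ς
    return _FINAL_INSIDE.sub("σ", text)    # word-internal ς becomes σ


if __name__ == "__main__":
    print(fix_final_sigma("λόγοσ καὶ σοφόσ."))
```

Running such a pass over the transcoded dictionary and diffing the result against the input would also show how many entries are affected, which may help to decide whether the new Lex script is the error source.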