Changes between Version 12 and Version 13 of normalization/4


Timestamp:
Jun 9, 2011, 8:12:54 AM (13 years ago)
Author:
Wolfgang Schmidle
Comment:

--

  • normalization/4

= 4. Overview of Regularization and Normalization =

We try to approximate our texts using standard Unicode, but we allow only what can reasonably be expected to display properly in a web browser. For example, we use
* combining characters, even though many fonts still struggle with them
We do not use
* Zero Width Joiners, as in "q ZWJ ꝫ", because they do more harm than good
* codepoints in the Private Use Area, even if they are standard MUFI codepoints
* ideographic description sequences in Chinese text (an example of "official Unicode" that we still avoid)
In particular, we do not attempt to display ligatures at all costs.
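
The character policy above can be sketched as a small checker. This is only an illustration, not part of the project's tooling: the function name `disallowed_characters` is hypothetical, and treating the whole Ideographic Description Characters block (U+2FF0 to U+2FFF) as excluded is an assumption about how the rule would be applied.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def disallowed_characters(text):
    """Return (index, codepoint, reason) for each character the policy excludes.

    Hypothetical helper: the three rules mirror the "We do not use" list,
    but the exact ranges are assumptions made for this sketch.
    """
    problems = []
    for index, char in enumerate(text):
        codepoint = ord(char)
        label = f"U+{codepoint:04X}"
        if char == ZWJ:
            problems.append((index, label, "zero width joiner"))
        elif unicodedata.category(char) == "Co":
            # General category "Co" covers all Private Use codepoints,
            # including those that MUFI assigns.
            problems.append((index, label, "private use area"))
        elif 0x2FF0 <= codepoint <= 0x2FFF:
            # Assumed range: the Ideographic Description Characters block.
            problems.append((index, label, "ideographic description character"))
    return problems


if __name__ == "__main__":
    # Combining characters are allowed, so this reports nothing.
    print(disallowed_characters("teq\u0301\ua76b"))
    # A Zero Width Joiner between "q" and U+A76B is reported.
    print(disallowed_characters("q\u200d\ua76b"))
```

Combining characters pass the check unchanged, which matches the decision to use them despite uneven font support.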

We use a <reg> tag for all additional information, e.g. for resolving abbreviations (and also for ideographic description sequences). On the other hand, we do not regularize features such as "superfluous" Renaissance accents in our texts; instead we rely on our display system to create the word form that can be found in a dictionary.

For example, we would write
 <reg norm="teq́ue" faithful="té" type="simple">teq́ꝫ</reg>
which would display as
* teq́ꝫ in display mode "Original" (the user should have installed a font that contains "ꝫ")
* té in display mode "Original" with the "faithful" box checked (the user should have installed a MUFI font and use it for displaying the text)
* teq́ue in display mode "Regularized" (this is the default mode)
* teque in display mode "Normalized" (which is created on the fly by the display system)
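
The four display variants can be sketched as follows. This is a minimal illustration under stated assumptions, not the display system itself: the functions `render` and `strip_accents` are hypothetical, and the page does not say how the "Normalized" form is computed, so removing combining marks from the `norm` attribute is only a guess that happens to reproduce the example.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The faithful value is copied as it appears on this page.
FAITHFUL = "té"
SAMPLE = (
    '<reg norm="teq\u0301ue" faithful="' + FAITHFUL + '" type="simple">'
    "teq\u0301\ua76b</reg>"
)


def strip_accents(word):
    """Drop nonspacing combining marks (assumed stand-in for normalization)."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
    return unicodedata.normalize("NFC", kept)


def render(reg_xml, mode="Regularized", faithful=False):
    """Return the text a <reg> element would show in the given display mode."""
    reg = ET.fromstring(reg_xml)
    if mode == "Original":
        # With the "faithful" box checked, prefer the faithful attribute.
        if faithful and reg.get("faithful") is not None:
            return reg.get("faithful")
        return reg.text
    if mode == "Regularized":
        return reg.get("norm")
    if mode == "Normalized":
        return strip_accents(reg.get("norm"))
    raise ValueError(f"unknown display mode: {mode}")


if __name__ == "__main__":
    for mode, box in [("Original", False), ("Original", True),
                      ("Regularized", False), ("Normalized", False)]:
        print(mode, box, render(SAMPLE, mode, faithful=box))
```

Keeping the original spelling as element content and the derived forms as attributes means the transcription is never overwritten; each display mode only selects or derives a different view of it.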

== Processing steps

The following table shows, for a few words, the processing steps from the raw text via the XML to the display system.