wiki:CharacterIssues

Version 3 (modified by hyman, 16 years ago)


Characters to be typed directly

ASCII characters should be used with their normal values, except as indicated below. Note that tilde (by itself) should be entered directly as ~.

The following characters with diacritics are to be typed directly:

Characters with acute accent

áéíóú ÁÉÍÓÚ

Characters with grave accent

àèìòù ÀÈÌÒÙ

Characters with circumflex accent

âêîôû ÂÊÎÔÛ

Characters with umlaut/diaeresis

äëïöüÿ ÄËÏÖÜŸ

Characters with tilde

ãõñ ÃÕÑ

Characters with cedilla

ç Ç

Common ligatures

æ œ Æ Œ
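
As a rough aid for proofreading, the lists above can be turned into a simple check that flags any non-ASCII character not named on this page. This is only a sketch: the function name `unexpected_characters` and the decision to normalize input to NFC first (so that a letter followed by a combining accent counts as the precomposed character) are assumptions, not part of the guidelines.

```python
import unicodedata

# Characters this page lists as "typed directly" (copied from the lists above).
DIRECT = set(
    "áéíóúÁÉÍÓÚ"      # acute
    "àèìòùÀÈÌÒÙ"      # grave
    "âêîôûÂÊÎÔÛ"      # circumflex
    "äëïöüÿÄËÏÖÜŸ"    # umlaut/diaeresis
    "ãõñÃÕÑ"          # tilde
    "çÇ"              # cedilla
    "æœÆŒ"            # ligatures
)


def unexpected_characters(text):
    """Return, sorted by code point, the non-ASCII characters not in DIRECT.

    Assumption: the text is normalized to NFC before checking, so a base
    letter plus combining accent is treated like its precomposed form.
    """
    normalized = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in normalized if ord(ch) > 127 and ch not in DIRECT})
```

An empty result means every character is either ASCII or one of the characters listed above; anything returned is a candidate for entity notation.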

Special conventions

Owing to the high frequency of “long s” <ʃ>, this character should be typed as $. (We do not expect to digitize works that contain both the currency symbol and the s variant.)
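
For illustration, the $ convention could be undone at display or export time with a one-line substitution. The sketch below is not part of the project's tooling; the function names are hypothetical, and the choice of U+017F (LATIN SMALL LETTER LONG S) as the output code point is an assumption, which is why it is left as a parameter.

```python
# Assumption: the long s is rendered as U+017F LATIN SMALL LETTER LONG S.
# Pass a different value for long_s if another code point is preferred.
LONG_S = "\u017f"


def restore_long_s(text, long_s=LONG_S):
    """Replace every typed '$' with the long s character."""
    return text.replace("$", long_s)


def encode_long_s(text, long_s=LONG_S):
    """Inverse operation: replace the long s character with '$'."""
    return text.replace(long_s, "$")
```

The substitution is safe only because of the stated expectation that no digitized work contains both the currency symbol and the s variant.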

XML Entity notation

Characters that cannot conveniently be typed may be indicated by means of XML entities. The entities specified in ISO 8879 (see especially isolat1 and isolat2) should be used. More generally, some characters may be entered using the conventions specified below:

For the accents enumerated above, if the text input method cannot compose the accent with a given character, an entity may be used instead. For instance, operating systems typically make no provision for entering a modified <q>, yet such characters are frequent in Latin materials. They may be typed as (e.g.):

&qacute; &qgrave; &quml; &qtilde;
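
One way such entities might later be resolved is to read each name as a base letter followed by an accent name and emit the letter plus the corresponding Unicode combining mark. The following is a sketch under several assumptions: the function name is hypothetical; the suffixes `circ` and `cedil` are not given on this page and are inferred from the ISO 8879 names (e.g. `acirc`, `ccedil`); and the output is normalized to NFC so that a precomposed character is used where one exists.

```python
import re
import unicodedata

# Accent name -> Unicode combining mark.
# acute, grave, uml, tilde appear in the example above; circ and cedil are
# assumptions inferred from the ISO 8879 naming pattern.
COMBINING = {
    "acute": "\u0301",
    "grave": "\u0300",
    "circ": "\u0302",
    "uml": "\u0308",
    "tilde": "\u0303",
    "cedil": "\u0327",
}

# One base letter followed by one accent name, e.g. &qacute;
ENTITY = re.compile(r"&([A-Za-z])(acute|grave|circ|uml|tilde|cedil);")


def expand_accent_entities(text):
    """Replace letter+accent entities with the letter and a combining mark."""

    def repl(match):
        return match.group(1) + COMBINING[match.group(2)]

    # NFC composes pairs such as e + U+0301 into the precomposed character;
    # pairs without a precomposed form (q + U+0301) are left as two code points.
    return unicodedata.normalize("NFC", ENTITY.sub(repl, text))
```

Entities that do not fit the letter-plus-accent pattern, including `&amp;`, pass through unchanged and would need a separate lookup against the ISO 8879 sets.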

The ampersand (&) must be entered as the entity &amp;, to avoid confusion with its use in entities.
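
A typed file can be checked for stray ampersands mechanically. The sketch below is one possible approach, not an established project tool: it treats any & that is not followed by a name and a semicolon as a bare ampersand and rewrites it as &amp;. The function name and the definition of "looks like an entity" (a letter, then letters or digits, then `;`) are assumptions.

```python
import re

# An ampersand NOT followed by something shaped like an entity name and ';'.
# Assumption: entity names start with a letter and contain only letters/digits.
BARE_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;)")


def escape_bare_ampersands(text):
    """Rewrite every bare '&' as '&amp;', leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", text)
```

Because the check is purely syntactic, it cannot tell whether a well-formed name such as `&foo;` is actually a defined entity; validating names against the ISO 8879 sets would be a separate step.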