Changes between Version 4 and Version 5 of normalize_arabic_translit


Timestamp:
May 8, 2015, 2:24:05 PM (9 years ago)
Author:
casties
Comment:

--

  • normalize_arabic_translit

    v4 v5
     5  5 Algorithm for normalizing the existing transliterated arabic (_translit fields) in the database.
     6  6
     7    Currently: source:OpenMind/src/main/java/org/mpi/openmind/repository/utils/NormalizerUtils.java
        7 === New ===
     8  8
     9    === 1. replace letter combinations ===
        9 ==== 1. replace letter combinations ====
       10
       11 Replace the following letter combinations with a single letter.
       12
       13 || dj, ch || j ||
       14 || th || t ||
       15 || kh || h ||
       16 || dh || d ||
       17 || sh || s ||
       18 || gh || g ||
       19
       20 Replace at the end of a word:
       21
       22 || aẗ\b, at\b, ah\b || a ||
       23
       24 Replace letters:
       25
       26 || ỳ || a ||
       27
       28 ==== 2. replace letters with diacritics ====
       29
       30 Replace all(?) letters with diacritics with the letter without diacritics.
       31
       32 Remove all apostrophes.
       33
       34
       35
       36 === Currently ===
       37
       38 source:OpenMind/src/main/java/org/mpi/openmind/repository/utils/NormalizerUtils.java
       39
       40 ==== 1. replace letter combinations ====
    10 41
    11 42 Replace the following letter combinations with a single letter.
     …  …
    19 50 || ỳ || a ||
    20 51
    21    === 2. replace letters with diacritics ===
       52 ==== 2. replace letters with diacritics ====
    22 53
    23    Replace all letters with diacritics with the letter without diacritics.
       54 Replace all(?) letters with diacritics with the letter without diacritics.
    24 55
    25 56 Remove all apostrophes.
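
The "New" rules in version 5 can be sketched in a few lines of Python. This is only an illustration of the rules as written on this page, not the code in NormalizerUtils.java. The function name `normalize_translit`, the order in which the rules are applied, and the set of characters treated as apostrophes are all assumptions; the page specifies none of them. The "all(?)" in step 2 is read here as "every letter that Unicode can decompose into a base letter plus combining marks".

```python
import re
import unicodedata

# Step 1: letter combinations from the table, each mapped to a single letter.
COMBINATIONS = {
    "dj": "j", "ch": "j", "th": "t", "kh": "h",
    "dh": "d", "sh": "s", "gh": "g",
}
# A single left-to-right pass, so one replacement never feeds another.
COMBINATION_RE = re.compile("|".join(COMBINATIONS))

# Step 1, end of word: "a" + U+1E97 (t with diaeresis), "at", "ah" become "a".
WORD_END_RE = re.compile("(?:a\u1e97|at|ah)\\b")

# Assumption: the page does not say which characters count as apostrophes.
# This set covers ASCII ' and `, curly quotes, and the modifier letters
# commonly used for hamza and ayn.
APOSTROPHE_TABLE = str.maketrans("", "", "'`\u2018\u2019\u02bc\u02be\u02bf")


def normalize_translit(text: str) -> str:
    """Apply the 'New' normalization rules in the order the page lists them."""
    # Compose first so decomposed input still matches the precomposed letters.
    text = unicodedata.normalize("NFC", text)
    # 1. Replace letter combinations.
    text = COMBINATION_RE.sub(lambda m: COMBINATIONS[m.group()], text)
    # 1. Replace at the end of a word.
    text = WORD_END_RE.sub("a", text)
    # 1. Replace letters: U+1EF3 (y with grave) becomes "a".
    text = text.replace("\u1ef3", "a")
    # 2. Drop diacritics: decompose, then remove every combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    text = "".join(
        c for c in decomposed if not unicodedata.category(c).startswith("M")
    )
    # 2. Remove apostrophes.
    return text.translate(APOSTROPHE_TABLE)
```

With this sketch, `normalize_translit("shaykh")` returns `"sayh"` and `normalize_translit("madīnaẗ")` returns `"madina"`. The rule order matters: because the letter combinations are replaced before the end-of-word rule runs, a word ending in "akh" first becomes "ah" and then "a". Whether that is intended is not stated on this page. Upper-case combinations such as "Sh" are left untouched, since the tables list lower-case letters only.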