Changes between Version 4 and Version 5 of normalize_arabic_translit
- Timestamp: May 8, 2015, 2:24:05 PM
normalize_arabic_translit
--- v4
+++ v5
@@ -5,7 +5,38 @@
 Algorithm for normalizing the existing transliterated arabic (_translit fields) in the database.
 
-Currently: source:OpenMind/src/main/java/org/mpi/openmind/repository/utils/NormalizerUtils.java
+=== New ===
 
-=== 1. replace letter combinations ===
+==== 1. replace letter combinations ====
+
+Replace the following letter combinations with a single letter.
+
+|| dj, ch || j ||
+|| th || t ||
+|| kh || h ||
+|| dh || d ||
+|| sh || s ||
+|| gh || g ||
+
+Replace at the end of a word:
+
+|| aẗ\b, at\b, ah\b || a ||
+
+Replace letters:
+
+|| ỳ || a ||
+
+==== 2. replace letters with diacritics ====
+
+Replace all(?) letters with diacritics with the letter without diacritics.
+
+Remove all apostrophes.
+
+
+
+=== Currently ===
+
+source:OpenMind/src/main/java/org/mpi/openmind/repository/utils/NormalizerUtils.java
+
+==== 1. replace letter combinations ====
 
 Replace the following letter combinations with a single letter.
@@ -19,7 +50,7 @@
 || ỳ || a ||
 
-=== 2. replace letters with diacritics===
+==== 2. replace letters with diacritics ====
 
-Replace all letters with diacritics with the letter without diacritics.
+Replace all(?) letters with diacritics with the letter without diacritics.
 
 Remove all apostrophes.
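
The rules in the "New" section can be sketched in a few lines of Python. This is only an illustration of the steps as written on this page, not the code in NormalizerUtils.java. The function name `normalize_translit`, the set of characters treated as apostrophes, and the decision to strip every combining mark (the page itself says "all(?)") are assumptions. Case handling is not specified on the page, so the sketch only matches the lowercase combinations listed in the table.

```python
import re
import unicodedata

# Step 1a: letter combinations from the table, each mapped to a single letter.
COMBINATIONS = {
    "dj": "j", "ch": "j", "th": "t", "kh": "h",
    "dh": "d", "sh": "s", "gh": "g",
}
COMBINATION_RE = re.compile("|".join(COMBINATIONS))

# Step 1b: word-final endings that collapse to "a" (a + U+1E97, "at", "ah").
WORD_FINAL_RE = re.compile(r"(?:a\u1e97|at|ah)\b")

# Assumption: the page does not say which characters count as apostrophes.
# Here: ASCII ', typographic quotes, and the modifier letters used for hamza/ayn.
APOSTROPHES = "'\u2018\u2019\u02bc\u02be\u02bf"
APOSTROPHE_TABLE = {ord(c): None for c in APOSTROPHES}


def normalize_translit(text: str) -> str:
    """Hypothetical helper that applies the 'New' rules in the order listed."""
    # Compose first so that U+1E97 and U+1EF3 are single code points.
    text = unicodedata.normalize("NFC", text)
    # 1a. Replace letter combinations with a single letter.
    text = COMBINATION_RE.sub(lambda m: COMBINATIONS[m.group(0)], text)
    # 1b. Replace the listed endings at the end of a word.
    text = WORD_FINAL_RE.sub("a", text)
    # 1c. Replace y-with-grave (U+1EF3) with "a".
    text = text.replace("\u1ef3", "a")
    # 2a. Strip diacritics: decompose, then drop all combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 2b. Remove all apostrophes.
    return text.translate(APOSTROPHE_TABLE)
```

With these assumptions, `normalize_translit("ta'rikh")` returns `tarih` and `normalize_translit("djabal")` returns `jabal`. The order matters: the word-final rule runs before diacritics are stripped, so an ending such as `āḥ` is not affected by the `ah` rule and only loses its diacritics in step 2.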