Version 3 (modified 9 years ago)
Normalizing Arabic transliterations
Algorithm for normalizing the existing transliterated Arabic (the _translit fields) in the database.
Current implementation: source:OpenMind/src/main/java/org/mpi/openmind/repository/utils/NormalizerUtils.java
1. Replace letter combinations
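Replace each of the following letter combinations with the shorter form on the right. The quoted entries include a trailing space, so they match only where a space follows.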
Combination | Replacement |
---|---|
th | t |
kh | h |
dh | d |
sh | s |
gh | g |
"aẗ ", "at ", "ah " | "a " |
ỳ | a |
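
The table can be read as a list of literal string replacements. The following is a minimal sketch of that reading, not the code in NormalizerUtils.java: the class name `TranslitCombinationsSketch`, the method `replaceCombinations`, and the choice to apply the rows in table order are all assumptions made for illustration.

```java
// Hypothetical sketch; class and method names are illustrative assumptions.
final class TranslitCombinationsSketch {

    // Pairs of {combination, replacement}, listed in the same order as the table.
    // Assumption: the rows are applied one after another in this order.
    private static final String[][] REPLACEMENTS = {
        {"th", "t"},
        {"kh", "h"},
        {"dh", "d"},
        {"sh", "s"},
        {"gh", "g"},
        {"a\u1E97 ", "a "}, // a, then t with diaeresis, then a space
        {"at ", "a "},
        {"ah ", "a "},
        {"\u1EF3", "a"},    // y with grave accent
    };

    // Applies every replacement to the whole string, one row at a time.
    static String replaceCombinations(String input) {
        String result = input;
        for (String[] pair : REPLACEMENTS) {
            result = result.replace(pair[0], pair[1]);
        }
        return result;
    }

    private TranslitCombinationsSketch() {
    }
}
```

With this sketch, `replaceCombinations("khalifah ibn")` yields `"halifa ibn"`. Because the quoted entries end in a space, a word at the very end of the string keeps its ending: `"khalifah"` becomes `"halifah"`. Whether the real implementation handles that case differently is not stated here.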
2. Replace letters with diacritics
Replace every letter that carries a diacritic with its base letter (for example, ā becomes a).
Remove all apostrophes.
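
One common way to strip diacritics in Java is to decompose the string with `java.text.Normalizer` (form NFD) and then delete the combining marks. The sketch below takes that approach; it is an illustration under stated assumptions, not the code in NormalizerUtils.java. The class name `TranslitDiacriticsSketch`, its method names, and the set of characters treated as apostrophes (the ASCII apostrophe, the two curly single quotes, and the half-ring signs often used for ayn and hamza) are assumptions. Step 1 has to run before this step, because otherwise ỳ would be reduced to y instead of a.

```java
import java.text.Normalizer;

// Hypothetical sketch; class and method names are illustrative assumptions.
final class TranslitDiacriticsSketch {

    // Decomposes each letter into base letter plus combining marks (NFD),
    // then removes the combining marks so only the base letter remains.
    static String stripDiacritics(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Assumption: "apostrophes" covers U+0027, U+2018, U+2019, U+02BE and U+02BF.
    static String removeApostrophes(String input) {
        return input.replaceAll("['\u2018\u2019\u02BE\u02BF]", "");
    }

    // Step 2 as a whole: strip diacritics first, then remove apostrophes.
    static String normalizeLetters(String input) {
        return removeApostrophes(stripDiacritics(input));
    }

    private TranslitDiacriticsSketch() {
    }
}
```

For example, `normalizeLetters("Qur'\u0101n")` returns `"Quran"`. Letters that Unicode does not decompose into a base letter plus a mark would pass through unchanged and would need an explicit mapping.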