“…Classic Vector Space and Probabilistic models (Manning et al, 2008) are the first options. However, the very special and noisy nature of the Egyptian writing system and the application context may suggest other approaches: the use of standard character n-grams as the working unit, a solution successfully applied both in noisy contexts (Vilares et al, 2011) and in languages whose writing systems share characteristics with Egyptian, such as Japanese (Ogawa and Matsuda, 1999), Chinese (Foo and Li, 2004), Korean (Lee and Ahn, 1996) or Arabic (Mustafa and Al-Radaideh, 2004); the use of so-called character s-grams (Järvelin et al, 2008), a generalization of the n-gram concept that allows skips during the matching process; the application of locality-based models (de Kretser and Moffat, 1999); or phonetic matching (Yasukawa et al, 2012). Closer to the NLP field, the development of conflation mechanisms based on lemmatization or morphological analysis (Piotrowski, 2012, Ch.…”
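
To make the distinction between character n-grams and s-grams concrete, the following is a minimal Python sketch and not the implementation used in any of the cited works. It assumes a simplified reading of s-grams as character pairs separated by a fixed number of skipped characters, so that a skip of 0 reduces to ordinary bigrams. The function names (`char_ngrams`, `char_sgrams`, `dice`), the choice of the Dice coefficient for comparison, and the example strings are all illustrative assumptions.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of `text`, in order."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def char_sgrams(text: str, skip: int) -> list[str]:
    """Return character pairs with `skip` characters skipped between them.

    Simplifying assumption: only two-character s-grams are built.
    With skip=0 the result equals the ordinary character bigrams.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    step = skip + 1
    return [text[i] + text[i + step] for i in range(len(text) - step)]


def dice(a: list[str], b: list[str]) -> float:
    """Dice coefficient between the sets of units in `a` and `b`."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


if __name__ == "__main__":
    # Hypothetical example: a vocalized form and a vowel-less transliteration.
    full, bare = "nefer", "nfr"

    # Plain bigrams share nothing between the two forms.
    print(dice(char_ngrams(full, 2), char_ngrams(bare, 2)))

    # Adding pairs with one skipped character recovers some overlap.
    full_units = char_sgrams(full, 0) + char_sgrams(full, 1)
    bare_units = char_sgrams(bare, 0) + char_sgrams(bare, 1)
    print(dice(full_units, bare_units))
```

In this toy case the two forms have no bigram in common, whereas allowing one skipped character lets pairs such as "nf" and "fr" match, which illustrates why skip-tolerant units can help when spellings vary or characters are missing.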