I’m a Senior Research SDE at Microsoft Research. My main focus areas are Natural Language Processing (NLP), conversational systems, and cloud-based solutions. I joined the Microsoft Advanced Technology Lab in Cairo (ATL-Cairo) in September 2011 and have worked on R&D for multiple NLP tasks, such as unsupervised named entity recognition, open information extraction, language understanding, and, most recently, conversational systems. Before joining Microsoft, I worked for around four years at Sakhr Software on R&D for an enterprise Arabic-English machine translation system. My MSc was in ontology-based NLP, specifically the automatic inference of an Arabic lexical ontology from different Arabic/English language resources.

Arabic Parser

Established: April 22, 2014

Definition: The Arabic Parser determines the grammatical structure of Arabic sentences, such as which groups of words combine to form phrases and which words are the subject or the object of a verb. The Parser relies heavily on the Part-of-Speech (POS) Tagger to identify the correct part of speech for each word in an input sentence, and on the Named-Entity Recognizer to identify named entities, after the sentence has been corrected using the Auto-Corrector.
Features: Part-of-Speech disambiguation…

Colloquial to Arabic Converter

Established: April 22, 2014

Definition: The colloquial converter translates Egyptian colloquial text into Modern Standard Arabic and provides rich mapping information.
Features:
Translation of colloquial and mixed text: The colloquial converter translates Egyptian colloquial text into standard Arabic text. It also handles mixed text that combines Modern Standard Arabic and colloquial text by translating only the colloquial words and selecting the best translation based on context.
Colloquial morphological analysis: The colloquial converter provides rich morphological information…

Part of Speech Tagger

Established: January 29, 2014

Definition: The POS Tagger identifies the correct part of speech for each word. It resolves ambiguity at both the stem and the case-ending levels.
Features:
Detailed tag set: The POS Tagger has a detailed tag set of more than 3,000 tags, which reflect the most important features of each word.
Stem-level disambiguation: The POS Tagger resolves the stem-level ambiguity of most Arabic words by selecting the analysis that best matches each word, based on its context.
Case-ending…

Diacritizer


Established: January 29, 2014

Definition: Restores missing diacritics (short vowels) in Arabic text. Handles both stem and case-ending vowels.
Features:
Diacritization of Arabic text: The Arabic diacritizer provides accurate diacritization for Arabic text. Although the Arabic alphabet does have consonants and vowels, people usually write Arabic text without vowels.
Case-ending diacritization: The component’s automatic diacritization handles both the missing vowels of the stem and the case ending.
Handling common Arabic mistakes: The Diacritizer automatically corrects common Arabic…

Named Entity Recognizer (NER)

Established: January 29, 2014

Definition: Detects named entities and classifies them into person, location, and organization categories.
Features:
Arabic named entities detection and classification: The Arabic Named Entity Recognizer (NER) extracts named entities from standard Arabic text and classifies them into three main types: proper names, locations, and organizations. Arabic NER can extract foreign and Arabic names; location entities such as cities, countries, streets, and squares; and organizations such as sports teams, political parties, companies, and ministries.
Arabic text preprocessing…

Speller


Established: January 29, 2014

Definition: Detects and corrects misspelled words. Provides correction candidates. Improves the accuracy of Arabic text processing components.
Features:
Mistake detection: The Speller enhances the quality of written Arabic text by detecting erroneous words in standard Arabic text.
Automatic correction: The Speller automatically corrects common Arabic mistakes with no user effort. This can also be used to normalize Arabic text.
Spell checking: It also provides more than one candidate for correcting erroneous words and…

SARF (Morphological Analyzer)

Established: January 29, 2014

Features:
Morphological analysis: Sarf provides all possible morphological analyses for an input Arabic word. Each analysis consists of the diacritized word and the morphological breakdown of the analysis in terms of prefixes, stem, and suffixes. The stem is further decomposed into its root and morphological pattern. Moreover, each analysis carries the part of speech and a set of morphosyntactic features such as gender and number. The analyses are ranked to reflect the actual language…
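The shape of a single analysis, as described in this entry, can be sketched as a small record type. This is a minimal illustration in Python, not Sarf's actual API: the class `MorphAnalysis`, its field names, the `rank` helper, the numeric `score`, and the Buckwalter-style transliterated sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MorphAnalysis:
    """One candidate analysis of an input word (hypothetical field names)."""
    diacritized: str                 # fully vowelled form of the word
    prefixes: list[str]              # morphological breakdown: prefixes
    stem: str                        # morphological breakdown: stem
    suffixes: list[str]              # morphological breakdown: suffixes
    root: str                        # root that the stem derives from
    pattern: str                     # morphological pattern of the stem
    pos: str                         # part of speech
    features: dict[str, str] = field(default_factory=dict)  # e.g. gender, number
    score: float = 0.0               # hypothetical ranking score (higher = more likely)


def rank(analyses: list[MorphAnalysis]) -> list[MorphAnalysis]:
    """Order candidate analyses from most to least likely by score."""
    return sorted(analyses, key=lambda a: a.score, reverse=True)


# Two illustrative analyses of the undiacritized input "wktb"
# (transliterated; the scores are made up for the example).
candidates = [
    MorphAnalysis("wakutub", ["wa"], "kutub", [], "ktb", "fuEul", "noun",
                  {"number": "plural"}, score=0.3),
    MorphAnalysis("wakataba", ["wa"], "katab", ["a"], "ktb", "faEal", "verb",
                  {"gender": "masculine", "number": "singular"}, score=0.7),
]

for analysis in rank(candidates):
    print(analysis.diacritized, analysis.pos, analysis.root, analysis.pattern)
```

Keeping the root and pattern as separate fields mirrors the decomposition of the stem described in the entry, and sorting on a single score is only a stand-in for whatever ranking the component actually applies.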

Transliterator


Established: January 29, 2014

Definition: The Transliterator converts text from Romanized Arabic (Arabic written in English characters) to native Arabic script and vice versa. A common example is the transliteration of named entities.
Features:
Translation of named entities: It converts proper names from English characters to Arabic and vice versa.
Transliteration of text: Converting Romanized Arabic to Arabic makes it easier to write in Arabic script, even without an Arabic keyboard.
Candidate generation: The…

Arabic Toolkit Service (ATKS)

Established: December 12, 2013

Natural Language Processing (NLP) is foundational infrastructure for processing written text. This processing revolves around text analysis and understanding. NLP serves a multitude of sophisticated tasks such as text search, document management, automatic translation, proofreading, text summarization, and many more. The Advanced Technology Lab in Cairo has developed the Arabic Toolkit Service (ATKS) as a set of NLP components targeting the Arabic language.
ATKS Components: The component suite includes a full-fledged morphological analyzer (SARF), a spell-checker, an auto…






Mind Maps Automation System
Mohamed Elhoseiny, Asmaa Hamdy, Radwa El Sahn, Sara Samier, , Eslam Kamal, in Proceedings of the 2009 International Conference on Semantic Web & Web Services (SWWS 2009), Las Vegas, Nevada, USA, July 1, 2009