Eman Saleh Alnagi

Master's Abstract

Title: English-Arabic Machine Translation Using Auto-Encoder Model

Keywords: Arabic Machine Translation, Arabic NMT, Auto-Encoder

Abstract: Natural Language Processing (NLP) has become one of the most popular research areas. Combined with Machine Learning (ML) algorithms, NLP research and implementation have reached high levels of effectiveness, especially for widely used languages such as English. Arabic, as a rich natural language, has its share of NLP research, but to date research on many tasks remains immature where Arabic is involved. Machine translation is one of the hardest tasks for most natural languages and is considered especially hard for Arabic. In this thesis, an English-Arabic machine translation model is proposed. The model is based on a variation of the Deep Neural Network (DNN) that has proved efficient when dealing with sequences of data (words): an auto-encoder based on Recurrent Neural Networks (RNNs). The proposed auto-encoder consists of two RNNs, where the input is an English sentence and the output is its Arabic translation. Two well-known English/Arabic datasets were selected to train the proposed model, and the results were compared with Google Translate. Several experiments were conducted, alternating the RNN architecture between Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Both architectures add gating layers to the RNN and improve the translation of long sentences of more than 10 words. A global attention mechanism was added to the model to relate the target sentence to the source sentence. In selected experiments, an additional pre-processing step was applied: re-ordering the English sentences to match Arabic word order. Two evaluation metrics were applied: Bilingual Evaluation Understudy (BLEU) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE). In all conducted experiments, the proposed model achieved better BLEU and ROUGE scores than Google Translate. Finally, several directions for future work are suggested that might further improve machine translation into Arabic.
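As an illustration of the BLEU metric named in the abstract, the following is a minimal sketch of an unsmoothed, single-reference, sentence-level BLEU score. It is not the evaluation code used in the thesis: the function names, the whitespace tokenization, and the toy sentences are assumptions made for this example, and published scores are normally computed at corpus level with an established toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Modified n-gram precision: each candidate n-gram count is
    clipped to the number of times it appears in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU for one candidate against one reference
    (hypothetical helper, uniform weights over 1..max_n grams)."""
    if not candidate:
        return 0.0
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    # Without smoothing, a single zero precision makes the score zero.
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


if __name__ == "__main__":
    # Toy tokens, invented for illustration only.
    reference = "the cat sat on the mat".split()
    candidate = "the cat sat on a mat".split()
    print(sentence_bleu(candidate, reference))
    print(sentence_bleu(reference, reference))
```

The clipping step is what stops a system from scoring well by repeating a common word, and the brevity penalty discourages very short outputs. Because short sentences often have no matching 4-grams, sentence-level scores are usually smoothed in practice; this sketch omits smoothing for clarity.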

