Neural machine translation has surpassed statistical machine translation as the leading approach. It uses an encoder-decoder model with attention to learn translation directly from large parallel corpora. Recent developments include incorporating monolingual data through language models, improving attention mechanisms, and directly optimizing evaluation metrics like BLEU during training rather than