What it does

This English to French translator takes an English sentence as input and outputs the French translation.

How we built it

Starting from an English-French dataset from Kaggle, I first preprocessed the data by "cleaning" the sentences: converting everything to lowercase, removing punctuation, etc. Then, using the Keras library, I tokenized the French and English sentences and built word embeddings. I encoded the sentences as vectors and fed these vectors into a single-layer, unidirectional LSTM encoder-decoder, also built with Keras. I used this model to predict French translations from English sentences.
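The cleaning and encoding steps described above can be sketched in plain Python. This is only an illustration under assumptions: the helper names (`clean_sentence`, `build_vocab`, `encode`) are hypothetical, the project itself used the Keras tokenizer, and the exact cleaning rules beyond lowercasing and punctuation removal are not stated here.

```python
import string

# Translation table that deletes ASCII punctuation (assumed cleaning rule).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_sentence(sentence):
    """Lowercase, strip ASCII punctuation, and collapse repeated whitespace."""
    sentence = sentence.lower().translate(_PUNCT_TABLE)
    return " ".join(sentence.split())


def build_vocab(sentences):
    """Map each word to an integer id; id 0 is reserved for padding."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode(sentence, vocab, max_len):
    """Turn a cleaned sentence into a fixed-length list of word ids.

    Words missing from the vocabulary are skipped (a simplification);
    short sentences are right-padded with 0.
    """
    ids = [vocab[w] for w in sentence.split() if w in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


english = [clean_sentence(s) for s in ["Hello,   World!", "How are you?"]]
vocab = build_vocab(english)
encoded = [encode(s, vocab, max_len=4) for s in english]
```

The fixed-length integer sequences produced this way are the kind of input an embedding layer in front of an LSTM encoder expects.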

Challenges we ran into

The model was evaluated using BLEU scores (BLEU-1 to BLEU-4). These scores were not great (BLEU-1: 0.480895, BLEU-2: 0.370989, BLEU-3: 0.304639, BLEU-4: 0.174916). I tried to improve them by making the model multi-layered and bidirectional, but the scores either stayed the same or decreased.
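For readers unfamiliar with the metric, here is a minimal sketch of cumulative corpus-level BLEU-n. It assumes one reference translation per sentence, uniform n-gram weights, and no smoothing; the function name `corpus_bleu_n` is hypothetical, and the scores above may have been computed with a library whose settings differ.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu_n(references, hypotheses, max_n):
    """Cumulative BLEU up to order max_n over a corpus of token lists."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp, n)
            ref_counts = ngram_counts(ref, n)
            # Clipped matches: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions (uniform weights).
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty discourages hypotheses shorter than the references.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


refs = [["le", "chat", "dort"], ["il", "mange"]]
hyps = [["le", "chat", "dort"], ["il", "boit"]]
bleu1 = corpus_bleu_n(refs, hyps, 1)
bleu2 = corpus_bleu_n(refs, hyps, 2)
```

Because higher orders require longer exact matches, cumulative scores normally fall from BLEU-1 to BLEU-4, which is the pattern in the numbers reported above.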

Future Directions

Future directions include improving the model's performance, potentially by preprocessing the data differently, using a larger training set, and trying a different model structure.
