I built this project as part of an assignment for the NLP specialisation and wanted to turn it into an interactive web application.

What it does

You can type in a sentence and optionally provide arguments such as the first character and the value of the smoothing factor. The app then predicts the next word and returns the corresponding probability.
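As a rough illustration of the idea, here is a minimal sketch of next-word prediction with add-k smoothing. It assumes a bigram model and a tiny toy corpus; the function names (`count_ngrams`, `estimate_probability`, `suggest_next_word`) and the `start_with` parameter are hypothetical and not taken from the project's code.

```python
from collections import Counter

def count_ngrams(sentences, n):
    """Count n-grams in tokenized sentences, padded with <s> and </s>."""
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] * n + list(tokens) + ["</s>"]
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts

def estimate_probability(word, context, n_counts, n_plus1_counts, vocab_size, k=1.0):
    """Add-k smoothed estimate of P(word | context)."""
    context = tuple(context)
    numerator = n_plus1_counts[context + (word,)] + k
    denominator = n_counts[context] + k * vocab_size
    return numerator / denominator

def suggest_next_word(previous_tokens, n_counts, n_plus1_counts, vocabulary,
                      k=1.0, start_with=None):
    """Return (word, probability) for the most likely next word.

    start_with optionally restricts candidates to words with that prefix.
    """
    n = len(next(iter(n_counts)))  # context length used by the model
    context = (["<s>"] * n + list(previous_tokens))[-n:]
    vocab_size = len(vocabulary) + 1  # +1 for the end-of-sentence token
    candidates = [w for w in sorted(vocabulary)
                  if start_with is None or w.startswith(start_with)]
    if not candidates:
        return None, 0.0
    scored = [(w, estimate_probability(w, context, n_counts, n_plus1_counts,
                                       vocab_size, k)) for w in candidates]
    return max(scored, key=lambda pair: pair[1])

# Toy corpus (an assumption for this example, not the project's dataset).
sentences = [["i", "like", "a", "cat"],
             ["this", "dog", "is", "like", "a", "cat"]]
vocabulary = {w for s in sentences for w in s}
unigram_counts = count_ngrams(sentences, 1)
bigram_counts = count_ngrams(sentences, 2)

best = suggest_next_word(["i", "like"], unigram_counts, bigram_counts, vocabulary)
print(best)  # ('a', 0.3)
with_prefix = suggest_next_word(["i", "like"], unigram_counts, bigram_counts,
                                vocabulary, start_with="c")
print(with_prefix)
```

A larger smoothing factor `k` pulls all probabilities towards the uniform distribution, which is why the app exposes it as an optional argument.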

How I built it

I pre-processed and exported the data using Kaggle Kernels. I put all the functions in one file and invoked them from the main Streamlit application. I then deployed the application to Heroku using automatic deploys.

Challenges I ran into

I ran into a lot of issues with storing and fetching files. Towards the end, I figured out that the best way was to prune the dataset, export the resulting files, and use those.

Accomplishments that I'm proud of

I figured out how to properly use the @st.cache decorator of the Streamlit library.

What I learned

I learnt about Git Large File Storage (Git LFS) and about the performance and storage issues of working with Heroku and Streamlit applications.

What's next for Auto-Completion using N-Gram Models

Next up, I plan to compute the perplexity for different values of the smoothing factor and then compare performance.
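The planned comparison could look roughly like the sketch below. It assumes a bigram model with add-k smoothing and a toy corpus; the helper names (`count_ngrams`, `perplexity`) and the chosen values of `k` are hypothetical, and nothing here reports real results from the project.

```python
import math
from collections import Counter

def count_ngrams(sentences, n):
    """Count n-grams in tokenized sentences, padded with <s> and </s>."""
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] * n + list(tokens) + ["</s>"]
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts

def perplexity(tokens, n_counts, n_plus1_counts, vocab_size, k=1.0):
    """Perplexity of one sentence under an add-k smoothed n-gram model."""
    n = len(next(iter(n_counts)))  # context length used by the model
    padded = ["<s>"] * n + list(tokens) + ["</s>"]
    log_prob_sum = 0.0
    for i in range(n, len(padded)):
        context = tuple(padded[i - n:i])
        word = padded[i]
        prob = ((n_plus1_counts[context + (word,)] + k)
                / (n_counts[context] + k * vocab_size))
        log_prob_sum += math.log(prob)  # log space avoids underflow
    num_predictions = len(padded) - n
    return math.exp(-log_prob_sum / num_predictions)

# Toy corpus (an assumption for this example, not the project's dataset).
sentences = [["i", "like", "a", "cat"],
             ["this", "dog", "is", "like", "a", "cat"]]
vocab_size = len({w for s in sentences for w in s}) + 1  # +1 for </s>
unigram_counts = count_ngrams(sentences, 1)
bigram_counts = count_ngrams(sentences, 2)

# Compare perplexity on one sentence for several smoothing factors.
results = {k: perplexity(["i", "like", "a", "cat"], unigram_counts,
                         bigram_counts, vocab_size, k)
           for k in (0.01, 0.1, 1.0)}
for k, value in results.items():
    print(f"k={k}: perplexity={value:.3f}")
```

Lower perplexity means the model finds the sentence less surprising. A sentence taken from the training data, as here, favours small `k`, so a fair comparison would use held-out text.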
