Inspiration

I wanted to help automate part of teachers' jobs using machine learning. ML has received a lot of hype recently and has been applied successfully in many areas, mostly in business operations. Education is one area with a lot of untapped potential for ML.

What it does

This notebook verifies authorship: given a set of a writer's previous pieces, it checks whether a new piece was written by the same person. It does so using two separate methods. One is a feed-forward neural net that works on global features extracted from the writing. The other is an LSTM model that takes a Word2Vec embedding of the words as input.
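To give a feel for what "global features" can mean, here is a minimal sketch of a feature extractor. The specific features (average word length, average sentence length, vocabulary richness, punctuation rate) and the function name `global_features` are assumptions for illustration, not necessarily the ones used in the notebook.

```python
import re
import string


def global_features(text):
    """Return a few whole-document style features (illustrative choices only)."""
    # Words: runs of letters/apostrophes, lowercased so "The" and "the" match.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Sentences: split on ., ! or ? and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_sentences = max(len(sentences), 1)
    n_chars = max(len(text), 1)

    return {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / n_sentences,
        "type_token_ratio": len(set(words)) / n_words,
        "punctuation_rate": sum(ch in string.punctuation for ch in text) / n_chars,
    }
```

Calling `global_features(essay_text)` on each piece gives a fixed-length vector of numbers, which is the kind of input a feed-forward net expects regardless of how long the essay is.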

How I built it

It is written in Python in a Jupyter notebook using Keras (TensorFlow backend), and the data is stored in Google Cloud.
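As a rough illustration of how an LSTM over Word2Vec embeddings might be set up in Keras, here is a minimal sketch. The layer sizes, the frozen embedding layer, and the function name `build_lstm_verifier` are assumptions; the notebook's actual architecture may differ. The `embedding_matrix` argument stands in for a pretrained Word2Vec matrix with one row per vocabulary word.

```python
import numpy as np
from tensorflow import keras


def build_lstm_verifier(embedding_matrix, lstm_units=64):
    """Binary classifier: was this token sequence written by the target author?

    embedding_matrix: array of shape (vocab_size, embed_dim), assumed to hold
    pretrained Word2Vec vectors (hypothetical input, not from the notebook).
    """
    vocab_size, embed_dim = embedding_matrix.shape
    model = keras.Sequential([
        # Variable-length sequences of integer word indices.
        keras.layers.Input(shape=(None,), dtype="int32"),
        # Look up each word's vector; kept frozen so the small dataset
        # does not overwrite the pretrained embeddings.
        keras.layers.Embedding(
            vocab_size,
            embed_dim,
            embeddings_initializer=keras.initializers.Constant(embedding_matrix),
            trainable=False,
        ),
        keras.layers.LSTM(lstm_units),
        # Single sigmoid output: estimated probability of same authorship.
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Freezing the embedding layer is one reasonable choice when training data is scarce; setting `trainable=True` would instead let the vectors adapt to the task.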

Accomplishments that I'm proud of

I'm very proud that I reached a high level of accuracy even though I had little data and no powerful hardware for training. With further work (more data, parameter fine-tuning, etc.), I believe this model would be excellent at identifying authorship. It could allow teachers to automate the plagiarism-checking process. With some more work on interpreting the models' output, it could also help show how a student's writing style changes (hopefully for the better) over time.
