Inspiration

Law enforcement has analyzed the speech patterns of criminal manifestos (sometimes submitted anonymously to newspapers, etc.) to identify perpetrators. Online harassment can be similarly anonymous, so the ability to attribute a piece of writing to its author has several intelligence use cases.

What it does

Our project uses a trained model to predict the author of a text based on their writing style.

How we built it

We used Angular.js to create a frontend that communicates with our Node.js server and Flask backend. The backend runs the text a user submits through our trained neural network, which returns an array of confidences, one per candidate author, for each tokenized sentence. We average these confidences across the sentences and use the result to predict the author of the entire submission.
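The averaging step can be sketched roughly as follows. This is a minimal illustration, not our actual backend code: `split_sentences`, `predict_author`, and the `score_sentence` callback are hypothetical names, the regex splitter stands in for whatever sentence tokenizer is used, and `score_sentence` stands in for a call to the trained network.

```python
import re
from typing import Callable, List, Sequence, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (hypothetical stand-in for a real tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def predict_author(
    text: str,
    authors: Sequence[str],
    score_sentence: Callable[[str], Sequence[float]],
) -> Tuple[str, List[float]]:
    """Average per-sentence confidences and pick the most likely author.

    score_sentence is assumed to return one confidence per entry in
    authors, in the same order (a stand-in for the trained model).
    """
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("no sentences found in input")

    totals = [0.0] * len(authors)
    for sentence in sentences:
        confidences = score_sentence(sentence)
        if len(confidences) != len(authors):
            raise ValueError("expected one confidence per author")
        for i, confidence in enumerate(confidences):
            totals[i] += confidence

    # Mean confidence per author across all sentences.
    averages = [total / len(sentences) for total in totals]
    best = max(range(len(authors)), key=averages.__getitem__)
    return authors[best], averages
```

Averaging gives every sentence an equal vote, so a single sentence with an unusual style is less likely to decide the prediction for the whole submission.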

Challenges we ran into

Two teammates' versions of Windows were incompatible with TensorFlow, so coding was bottlenecked on one computer. Our long short-term memory (LSTM) model required too much GPU power and time for training to be viable, and multiple CUDA toolkit files apparently self-deleted early Sunday morning. (We worked around this by training our model on the CPU and optimizing our algorithms for this adjustment.)

Accomplishments that we're proud of

With relatively short training times, our model reached an accuracy of around 90%. (That's an A in our books.)

What we learned

Because natural language processing was the foundation of our project, we all gained a deeper understanding of certain NLP concepts. Although the LSTM was ultimately removed from the code, working with it was good practice. Building our website was also a good exercise in full-stack web application development.

What's next for hackumbc18

We all intend to keep working on this as a personal project to practice newer artificial intelligence concepts, seeking other real-world applications along the way.
