Inspiration
Law enforcement has previously identified perpetrators by analyzing the writing style of criminal manifestos, some of which were submitted anonymously to newspapers. Online harassment is often similarly anonymous, so the ability to attribute a piece of writing to its author has several intelligence use cases.
What it does
Our project uses a trained neural network to predict the author of a text from its writing style.
How we built it
We used Angular.js to create a frontend that communicates with our Node.js server and Flask backend. The backend tokenizes the text a user submits into sentences and runs each one through our trained neural network, which returns an array of confidences, one per candidate author, for every sentence. We average these confidences across the sentences and use the result to predict the author of the entire submission.
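The averaging step described above can be sketched roughly as follows. This is a minimal illustration, not our actual backend code: the author labels, the regex sentence splitter, and the `score_sentence` callback (which stands in for the trained network) are all hypothetical, and picking the highest averaged confidence is an assumption about how the final prediction is made.

```python
import re
from typing import Callable, List, Sequence, Tuple

# Hypothetical author labels; the real label set is not part of this write-up.
AUTHORS = ["author_a", "author_b", "author_c"]


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (assumption: stands in for the real tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def average_confidences(per_sentence: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-author confidences across all sentences."""
    if not per_sentence:
        raise ValueError("no sentences to score")
    n = len(per_sentence)
    width = len(per_sentence[0])
    return [sum(row[i] for row in per_sentence) / n for i in range(width)]


def predict_author(
    text: str,
    score_sentence: Callable[[str], Sequence[float]],
    authors: Sequence[str] = AUTHORS,
) -> Tuple[str, List[float]]:
    """Score each sentence, average the scores, and pick the top author.

    `score_sentence` is a placeholder for the trained model: it takes one
    sentence and returns one confidence per entry in `authors`.
    """
    scores = [score_sentence(s) for s in split_sentences(text)]
    mean = average_confidences(scores)
    # Assumption: the predicted author is the one with the highest mean.
    best = max(range(len(mean)), key=mean.__getitem__)
    return authors[best], mean
```

In a setup like ours, `score_sentence` would wrap the model's prediction call; any function that returns one confidence per author works for trying the sketch out.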
Challenges we ran into
Two teammates' versions of Windows were incompatible with TensorFlow, so coding was bottlenecked on one computer. Our long short-term memory (LSTM) model required too much GPU power and time for training to be viable, and multiple CUDA toolkit files apparently deleted themselves early Sunday morning. (We worked around this by training our model on the CPU and optimizing our algorithms for that change.)
Accomplishments that we're proud of
We achieved an accuracy of around 90% with relatively short training times. (That's an A in our books.)
What we learned
We all gained a deeper understanding of the natural language processing concepts that form the foundation of our project. Although the LSTM was ultimately removed from the code, working with it was good practice. Building the website was also a good exercise in developing a full-stack web application.
What's next for hackumbc18
We all intend to keep working on this as a personal project to practice newer artificial intelligence concepts and to look for other real-world applications along the way.
Built With
- angular.js
- cuda
- flask
- keras
- node.js
- python
- tensorflow