As social media continues to gain importance in the world of politics, we realized that the massive trove of data available online holds the potential to provide meaningful analysis of opinions on legislation.
What it does
Our program analyzes a bill to identify its keywords, pulls Tweets from a range of influential political figures, filters and analyzes them, and feeds the resulting data into a trained neural network that outputs a confidence measure of the bill's strength.
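The keyword-identification step can be sketched as a simple frequency count over the bill's text. This is a minimal illustration, not the project's exact method; the stop-word list and scoring rule here are assumptions.

```python
import re
from collections import Counter

# Illustrative stop-word list; the real pipeline's filtering is an assumption here.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "for",
              "shall", "be", "is", "this", "that", "by", "or"}

def extract_keywords(bill_text: str, top_n: int = 5) -> list:
    """Return the top_n most frequent non-stop-word terms in a bill's text."""
    words = re.findall(r"[a-z]+", bill_text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]
```

These keywords then drive the Tweet search and filtering described below.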
How we built it
Using a variety of scraping and parsing tools, we identified the handles of politically important Twitter users. Then, we experimented with Twitter's API and Microsoft Azure to pull Tweets and filter out irrelevant ones by keyword. Finally, we processed the Tweets from each user with Python's Natural Language Toolkit and used TensorFlow to create and train a neural network that takes in the output from the Toolkit.
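The keyword-filtering step over pulled Tweets could look like the sketch below. The Tweet dictionary shape and the "mentions at least one keyword" relevance rule are illustrative assumptions; the actual system did this filtering via Twitter's API and Azure.

```python
def is_relevant(tweet_text: str, keywords: list) -> bool:
    """Keep a Tweet if it mentions at least one bill keyword (case-insensitive)."""
    text = tweet_text.lower()
    return any(kw.lower() in text for kw in keywords)

def filter_tweets(tweets: list, keywords: list) -> list:
    # Each Tweet is assumed to be a dict with "user" and "text" fields.
    return [t for t in tweets if is_relevant(t["text"], keywords)]
```

The surviving Tweets per user are then handed to NLTK for language processing before reaching the TensorFlow model.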
Challenges we ran into
Time and time again, we were faced with scalability challenges as we pulled and analyzed hundreds of thousands of Tweets. Although we were often held back by API rate limits, we managed to optimize our code to produce analysis capable of updating in real time.
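One generic way to work within API rate limits is exponential backoff with a capped delay. This sketch is an assumption for illustration, not the exact optimization we used; the injectable `sleep` parameter just makes the retry behavior easy to test.

```python
import time

class RateLimitError(Exception):
    """Raised when the API reports it is rate-limiting us (e.g. HTTP 429)."""

def fetch_with_backoff(fetch, max_retries: int = 5, base_delay: float = 1.0,
                       sleep=time.sleep):
    """Retry fetch() on RateLimitError, doubling the wait each attempt (capped at 60s)."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except RateLimitError:
            delay = min(base_delay * 2 ** attempt, 60.0)
            sleep(delay)
    raise RuntimeError("rate limit: retries exhausted")
```

Batching requests and caching already-pulled Tweets are complementary tactics when a hard per-window quota makes backoff alone insufficient.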
Accomplishments that we're proud of
With just a little prior experience in ML and none with Twitter's API, we were able to design, code, and train two neural networks and to identify the most important Tweets.
What we learned
We definitely learned that it's important to optimize code early on, because problems with scale can become almost impossible to solve otherwise.
What's next for PredictaBill
We hope to gather more data to train a more advanced, accurate neural network and to keep optimizing the code that currently limits how fast we can parse data, eventually providing a valuable new kind of analysis for the political sphere.