Inspiration
One of our team members had been talking about an app he was working on that used the open data of the City of Brampton. We decided to take that idea and, using the technologies provided by the sponsors at this hackathon, create a social-improvement hack. We wanted to connect the heavy use of social media among young students with their lack of awareness of community events. Our data-scraping, machine-learning hack allowed us to create an innovative solution that bolsters civic engagement.
What it does
Our application has many moving parts. First, we used the requests library to scrape the .csv data from 'City of Waterloo Open Data'. Then we used Cython (C-optimized Python) to scrape the tweets of a given Twitter user using custom code and Twitter's API. Next, we used Indico's personality analysis machine-learning API to characterize both the user's tweets and the descriptions of each event in Waterloo's open data on public events. We determined the most frequently recurring personality trait in the tweets and paired it with a set of Waterloo events whose descriptions indicated the same trait. This lets the user easily find out about events in their city in a way that is individualized to their interests.
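The pairing step described above can be sketched roughly as follows. This is a minimal illustration and not our actual code: the trait names, the score dictionaries, the event names, and the helper functions `top_trait`, `dominant_trait`, and `matching_events` are all assumptions standing in for whatever the personality API returns and however the events were stored.

```python
from collections import Counter


def top_trait(scores):
    """Return the highest-scoring trait name in one score dictionary."""
    return max(scores, key=scores.get)


def dominant_trait(tweet_scores):
    """Return the trait that is most often the top trait across all tweets."""
    counts = Counter(top_trait(scores) for scores in tweet_scores)
    return counts.most_common(1)[0][0]


def matching_events(events, trait):
    """Keep the names of events whose description scores highest on the trait."""
    return [name for name, scores in events if top_trait(scores) == trait]


# Hypothetical per-tweet scores (stand-ins for real API output).
tweet_scores = [
    {"openness": 0.6, "extraversion": 0.3, "agreeableness": 0.1},
    {"openness": 0.2, "extraversion": 0.7, "agreeableness": 0.1},
    {"openness": 0.5, "extraversion": 0.4, "agreeableness": 0.1},
]

# Hypothetical events, each with scores for its description.
events = [
    ("Art walk", {"openness": 0.8, "extraversion": 0.1, "agreeableness": 0.1}),
    ("Street party", {"openness": 0.2, "extraversion": 0.7, "agreeableness": 0.1}),
]

trait = dominant_trait(tweet_scores)
recommended = matching_events(events, trait)
print(trait, recommended)
```

Counting the top trait per tweet, instead of averaging raw scores, keeps a single strongly worded tweet from dominating the result. Either approach would be a reasonable choice here.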
How we built it
We used Flask for our backend and custom coded the front end (a small REST application). We used the cythonize Python module to C-optimize our code, and we used Indico for the machine-learning analysis of the person's tweets and the events' descriptions. We also implemented multithreading so that the program could take advantage of our multi-core system during the real-time analysis of the client's tweets, allowing the page to load faster.
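As a rough sketch of the multithreading idea, the snippet below fans a batch of texts out over a thread pool using Python's standard `concurrent.futures` module. The `analyze` function is a hypothetical placeholder for one analysis call, and `analyze_all` and the `max_workers` value are assumptions made for illustration, not our actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(text):
    """Hypothetical placeholder for one analysis call on a single text."""
    return {"length": len(text)}


def analyze_all(texts, max_workers=8):
    """Run analyze over all texts on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze, texts))


results = analyze_all(["first tweet", "second", "third one"])
print(results)
```

Threads tend to help most when each call spends its time waiting on a network response, as a remote API request does, because the waits can overlap.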
Challenges we ran into
We had a lot of problems modifying the .csv files because of the typing constraints some of our algorithms had. Implementing the functions we had written in a web app was also quite difficult at first, as we are Node developers and had no experience with Flask. This was also our first time C-optimizing an application, so there was a bit of a learning curve, but it turned out well. Finally, we had some trouble with the web scraping, but in the end it worked out.
Accomplishments that we're proud of
We are proud of how clean our web application turned out. We were also surprised that it runs relatively fast even when analyzing thousands of tweets.
What we learned
We learned a lot about C/C++, web hosting, machine learning, and web scraping.
What's next for Socialoo
If there is real interest in our application, we plan to develop it further so that it covers many cities, with the user's city determined by scraping their online profile. We also plan to branch out from scraping only Twitter to scraping other platforms such as Facebook or Medium. This was a very enjoyable hackathon that allowed us to step out of our comfort zones, even though we only had 2 people on our team!
Built With
- bootstrap
- city-of-waterloo-open-data
- css3
- cython
- flask
- heroku
- html5
- indico-api
- javascript
- jquery
- machine-learning
- python
- requests-module
- threading
- web-scraping