We did this project as part of an introductory data science course. The topic was interesting and, at best, held out the possibility of making millions, or at least of learning data science and making them later =)
How it works
We got sanitized tweet data from Infochimps and ran several kinds of analysis on it (TF-IDF, sentiment analysis, etc.). We then built a dictionary that associates values with words and used it to try to determine the sentiment expressed in a tweet.
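As a rough illustration of the two ideas above (not the project's actual code, which was written in R), here is a minimal Python sketch. The lexicon entries and their values, the function names, and the sample tweets are all hypothetical, and the TF-IDF variant shown (term frequency times log(N / document frequency)) is just one common weighting, assumed here for simplicity.

```python
import math
import re
from collections import Counter

# Hypothetical word-to-value dictionary; these values are invented for illustration.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty tweets
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def sentiment_score(text, lexicon=LEXICON):
    """Sum the dictionary values of the words in a tweet; unknown words count as 0."""
    return sum(lexicon.get(tok, 0.0) for tok in tokenize(text))


def sentiment_label(score):
    """Map a numeric score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    tweets = ["Great day, good vibes", "Terrible service, bad food"]
    for tweet, weights in zip(tweets, tf_idf(tweets)):
        score = sentiment_score(tweet)
        print(tweet, "->", sentiment_label(score), weights)
```

A plain sum like this ignores negation and sarcasm ("not bad" still scores negative), which is one reason a dictionary approach is hard to make reliable on short, incomplete tweets.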
Challenges we ran into
The tweets were sanitized and incomplete, so at times it was difficult even for us, as human readers, to be certain that we were interpreting a tweet's meaning correctly. Another challenge was building a reliable dictionary in an empirical way.
Accomplishments that we are proud of
We were able to explore different procedures for determining the sentiment of a tweet. Our end results were consistent with the most recent published study on the same subject.
What we learned
How to use different packages and functions in R.