Inspiration
We noticed a rising number of farmed reviews on Amazon, as well as 5-star products whose recent reviews were all negative, and saw an opportunity to investigate. This tool grew out of that curiosity.
What it does
Spamazon is a data collection and graphing utility made specifically for searching Amazon. With Spamazon you can produce detailed graphs that show how a product's reviews fluctuate over time.
How we built it
We coded Spamazon in three parts: Collection, Cleaning, and Plotting. We worked on each step collaboratively. The tools we used included Beautiful Soup, Splash, and Matplotlib (for plotting), along with a great deal of regex and string manipulation.
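As an illustration of the Cleaning step, here is a minimal sketch of how regex and string handling can turn scraped review text into values ready for plotting. The input string formats, the function names (`clean_review`, `monthly_average`), and the monthly grouping are assumptions made for this example, not Spamazon's actual code.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed raw formats, e.g. "4.0 out of 5 stars" and
# "Reviewed in the United States on March 3, 2021".
RATING_RE = re.compile(r"(\d(?:\.\d)?) out of 5 stars")
DATE_RE = re.compile(r"on (\w+ \d{1,2}, \d{4})")


def clean_review(raw_rating, raw_date):
    """Return (date, rating), or None if either string cannot be parsed."""
    rating_match = RATING_RE.search(raw_rating)
    date_match = DATE_RE.search(raw_date)
    if not rating_match or not date_match:
        return None
    rating = float(rating_match.group(1))
    # %B expects an English month name under the default locale.
    date = datetime.strptime(date_match.group(1), "%B %d, %Y")
    return date, rating


def monthly_average(cleaned):
    """Group (date, rating) pairs by (year, month) and average each group."""
    buckets = defaultdict(list)
    for date, rating in cleaned:
        buckets[(date.year, date.month)].append(rating)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


raw = [
    ("5.0 out of 5 stars", "Reviewed in the United States on March 3, 2021"),
    ("1.0 out of 5 stars", "Reviewed in the United States on March 20, 2021"),
    ("4.0 out of 5 stars", "Reviewed in the United States on April 1, 2021"),
]
cleaned = [c for c in (clean_review(r, d) for r, d in raw) if c is not None]
averages = monthly_average(cleaned)
```

The resulting dictionary maps each month to an average rating, which is the kind of series a Matplotlib line plot could take as input.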
Challenges we ran into
Many of the challenges we ran into were the result of data being sorted or collected incorrectly. At one point we were collecting duplicate reviews. To debug this we started printing the review titles, which ended up being a nice 'loading screen' feature.
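One way to guard against duplicate reviews while keeping the title printout is sketched below. The dictionary keys (`title`, `body`), the function name `dedupe_reviews`, and the choice to compare on normalized title plus body are assumptions for illustration. The write-up does not say how the duplicates were actually fixed.

```python
def dedupe_reviews(reviews, show_progress=True):
    """Keep the first copy of each review, printing titles as progress."""
    seen = set()
    unique = []
    for review in reviews:
        # Normalize case and whitespace so trivially different copies match.
        key = (review["title"].strip().lower(), review["body"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        unique.append(review)
        if show_progress:
            print(f"Collected: {review['title']}")
    return unique


sample = [
    {"title": "Great product", "body": "Works as described."},
    {"title": "great product ", "body": "Works as described."},
    {"title": "Broke in a week", "body": "Stopped charging."},
]
unique_reviews = dedupe_reviews(sample)
```

Printing only the reviews that survive the check means a repeated title on screen points to a real collection bug rather than a harmless re-fetch.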
Accomplishments that we're proud of
We initially planned to plot only 5-star reviews, but by combining our coding abilities we were able to plot charts for all five star ratings!
What we learned
Most of us knew only rudimentary Python and very little about web scraping, so we learned our tools as we programmed.
What's next for Spamazon
We expect to keep updating Spamazon over time, adding a UI and a save-as-JSON option.
Built With
- beautiful-soup
- docker
- matplotlib
- python
- splash