USELESS FRESHMAN! One of our esteemed group members is a member of Twenty First Tech, a website that aims to educate consumers about phones. He had the idea of a tool that would pick out the phone best suited to a user, based on the user's description of the phone they want and how important each feature is to them.
We planned to do this by comparing, across a set of criteria, the scores of the user's ideal phone with the scores of the phones on the market. The data for the phones we would suggest was to be gathered by crawling websites such as GSMArena and collecting the relevant aspects of comparison for each phone.
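The comparison described above could be sketched roughly as follows. This is only an illustration under our own assumptions: the criteria names, the 0 to 10 score scale, the weighted-absolute-difference formula and the sample phones are all made up for the example, and are not the scoring we actually shipped.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    name: str
    scores: dict  # criterion name -> score on an assumed 0-10 scale


def mismatch(ideal, weights, phone):
    """Weighted sum of absolute differences between the ideal and a real phone.

    A criterion the phone lacks is treated as a score of 0, and a criterion
    without an explicit weight gets weight 1 (both are assumptions).
    """
    return sum(
        weights.get(criterion, 1) * abs(wanted - phone.scores.get(criterion, 0))
        for criterion, wanted in ideal.items()
    )


def best_match(ideal, weights, phones):
    """Return the phone with the smallest weighted mismatch."""
    return min(phones, key=lambda phone: mismatch(ideal, weights, phone))


# Hypothetical, hard-coded sample data.
PHONES = [
    Phone("Phone A", {"battery": 9, "camera": 5, "screen": 6}),
    Phone("Phone B", {"battery": 5, "camera": 9, "screen": 7}),
]

ideal = {"battery": 6, "camera": 9, "screen": 7}   # what the user wants
weights = {"battery": 1, "camera": 3, "screen": 1}  # how much each feature matters

print(best_match(ideal, weights, PHONES).name)
```

Here the user cares most about the camera, so a phone that misses the ideal camera score is penalised three times as heavily as one that misses on battery or screen.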
We managed to get the crawler up and running, which was no mean feat, as our head programmer had never written a crawler before, and it worked during our first test. Sadly, GSMArena then banned us, presumably because they thought we were DDoS-ing them with many requests in a short span of time. Even though we followed all the rules GSMArena sets in robots.txt, our IP was banned and we were unable to crawl anything further. Since it was only a test, the crawler had no cache functionality, so we were unable to salvage any of the data we had mined from the site. Knowing that it might work under a different IP, we routed all traffic from our crawling computer through an Ultrasurf proxy, and it worked until we got banned again. We wanted to use Tor instead, but there were no coherent tutorials for Windows systems.
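In hindsight, the two things our test crawler lacked were a pause between requests and a cache. A minimal sketch of both is below. It is not our actual crawler: the class name `CachedCrawler`, the five-second default delay and the cache layout are assumptions for illustration, and we do not know what request rate GSMArena would actually tolerate.

```python
import hashlib
import time
import urllib.request
from pathlib import Path


def fetch_url(url):
    """Default fetcher: download a page over the network."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


class CachedCrawler:
    """Fetch pages slowly and keep every page fetched on disk."""

    def __init__(self, cache_dir, delay_seconds=5.0, fetch=fetch_url, sleep=time.sleep):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.delay_seconds = delay_seconds  # assumed value, tune per site
        self.fetch = fetch
        self.sleep = sleep
        self._last_request = None  # time of the previous network request

    def _cache_path(self, url):
        # Hash the URL so it is always a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / (digest + ".html")

    def get(self, url):
        path = self._cache_path(url)
        if path.exists():
            # Cache hit: no network traffic at all.
            return path.read_bytes()
        if self._last_request is not None:
            # Wait out the remainder of the delay since the last request.
            wait = self.delay_seconds - (time.monotonic() - self._last_request)
            if wait > 0:
                self.sleep(wait)
        body = self.fetch(url)
        self._last_request = time.monotonic()
        path.write_bytes(body)  # save immediately so a later ban loses nothing
        return body
```

With something like this, every page is written to disk the moment it arrives, so a ban halfway through a crawl would still have left us with the data collected up to that point, and re-running the crawler would not hit the site again for pages already saved.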
Undeterred by the fetters of this atrocious setback, which stymied our project and vastly hindered its potential to be useful, we decided to hard-code a few details about some phones, which allowed us to make the rest of the programme work.