API for getting job updates
We used Selenium and Beautiful Soup to scrape job listings and stored the data in a database. We then query the database to reply with matching jobs.
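The scrape, store, and query flow can be sketched roughly as below. This is a minimal illustration, not our actual code. It uses the standard library's `html.parser` in place of Beautiful Soup so it runs without dependencies, and it assumes an in-memory SQLite database. The `jobs` table, the helper names, `ExampleCo`, and the sample HTML are all hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def store_jobs(conn, company, html):
    """Parse a careers page and insert each titled link as a job row."""
    parser = LinkCollector()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs (company TEXT, title TEXT, url TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [(company, title, href) for href, title in parser.links if title],
    )
    conn.commit()


def find_jobs(conn, keyword):
    """Return (company, title, url) rows whose title contains the keyword."""
    return conn.execute(
        "SELECT company, title, url FROM jobs WHERE title LIKE ?",
        (f"%{keyword}%",),
    ).fetchall()


# Hypothetical careers page, standing in for HTML fetched by the scraper.
SAMPLE_HTML = """
<ul>
  <li><a href="/jobs/1">Data Engineer</a></li>
  <li><a href="/jobs/2">Office Manager</a></li>
</ul>
"""

conn = sqlite3.connect(":memory:")
store_jobs(conn, "ExampleCo", SAMPLE_HTML)
print(find_jobs(conn, "engineer"))
```

In a real run the HTML would come from the rendered page (for example Selenium's `driver.page_source`), and the title filter would likely need more than a substring match.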
Challenges we ran into
Scraping is difficult because companies structure their jobs pages differently (different web technologies and keywords). The internet connection at HackZurich was also quite slow on Saturday, and because we were making a lot of requests, this slowed down our data gathering.
Accomplishments that we're proud of
We found a way to get a list of all the companies in Switzerland, and overnight we collected links for over 6000 of them (names and URLs). We also developed quick heuristics that helped filter out irrelevant companies with no website or career page.
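A filter of this kind might look like the sketch below. It is an illustration under assumptions, not the heuristics we actually used: the keyword list, the company dictionary shape, and the function names are all hypothetical.

```python
from urllib.parse import urlparse

# Assumed keyword list; a real one would need tuning per language and region.
CAREER_KEYWORDS = ("career", "jobs", "karriere", "stellen", "emploi")


def has_website(company):
    """True if the company record carries a usable http(s) URL."""
    url = (company.get("url") or "").strip()
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def find_career_links(links):
    """Keep only links whose URL contains one of the career keywords."""
    return [
        link for link in links
        if any(keyword in link.lower() for keyword in CAREER_KEYWORDS)
    ]


def is_relevant(company, links):
    """A company is relevant if it has a website and at least one career link."""
    return has_website(company) and bool(find_career_links(links))


# Hypothetical records: one company with a careers page, one without a website.
with_site = {"name": "A AG", "url": "https://a.example"}
without_site = {"name": "B GmbH", "url": ""}

print(is_relevant(with_site, ["https://a.example/about", "https://a.example/Karriere"]))
print(is_relevant(without_site, []))
```

Matching on the URL alone is cheap, which matters when checking thousands of sites, but it will miss career pages hidden behind generic paths.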
What we learned
- About 50% of all companies in Switzerland have websites. Of those, about 2% have career pages.
- Once an index of relevant companies is built, only new companies need to be checked.
- Relevant companies can then be crawled regularly to extract job postings.
- Web scraping can be a very interesting and effective solution, but its legal aspects must be taken into consideration.
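The index idea from the points above can be sketched as follows. This is a minimal illustration with assumed data shapes: the index is a hypothetical dictionary mapping a company URL to a relevance flag, and the function names are made up for the example.

```python
def companies_to_check(index, latest_listing):
    """Return companies from the latest listing that are not yet in the index."""
    return [company for company in latest_listing if company["url"] not in index]


def update_index(index, checked):
    """Record each checked company together with its relevance flag."""
    for company, relevant in checked:
        index[company["url"]] = relevant
    return index


def crawl_targets(index):
    """URLs of indexed companies flagged as relevant, for the regular crawl."""
    return sorted(url for url, relevant in index.items() if relevant)


# Hypothetical index built earlier: A is relevant, B has no career page.
index = {"https://a.example": True, "https://b.example": False}
latest = [
    {"name": "A AG", "url": "https://a.example"},
    {"name": "C SA", "url": "https://c.example"},
]

new_companies = companies_to_check(index, latest)  # only C SA is new
update_index(index, [(new_companies[0], True)])    # assume C SA has a career page
print(crawl_targets(index))
```

Keeping irrelevant companies in the index with a `False` flag means they are not checked again on the next update.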
What's next for EasyJobMatch
This problem has the potential to become a complete service of its own. We look forward to developing the end-to-end service. We also plan to fine-tune the links by adding multiple verification mechanisms.