We chose the OpenFood challenge because we believe in the ideals it carries. With an open database, anyone can get valuable information about the products they eat without having to decipher packaging labels.
What it does
It retrieves various information, including barcodes, about every edible product listed in Migros' online catalog. For each barcode found, it checks whether the product is already in the OpenFood database. The retrieved data, stored in JSON files, can later be added to the OpenFood database.
How we built it
Three steps:
- Retrieve all of Migros' internal product IDs.
- Use those IDs to retrieve complete information about each product, including barcodes.
- Check whether each barcode already exists in the OpenFood database.
We implemented these steps as a set of simple Bash and Python scripts.
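The three steps above can be sketched roughly as follows. This is only an illustration, not our actual scripts: the function `run_pipeline`, its three callable parameters and the `"barcode"` field name are all hypothetical, and the real Migros and OpenFood calls are left out on purpose.

```python
def run_pipeline(list_ids, fetch_product, barcode_exists):
    """Chain the three steps; all three callables are hypothetical stand-ins.

    list_ids()           -> iterable of internal product IDs (step 1)
    fetch_product(id)    -> dict describing one product (step 2)
    barcode_exists(code) -> True if OpenFood already knows the barcode (step 3)
    """
    report = []
    for product_id in list_ids():
        product = fetch_product(product_id)
        # "barcode" is an assumed field name, not the real Migros response key.
        barcode = product.get("barcode")
        if barcode is None:
            continue  # nothing to look up for products without a barcode
        report.append(
            {
                "id": product_id,
                "barcode": barcode,
                "in_openfood": barcode_exists(barcode),
            }
        )
    return report
```

Passing the network calls in as parameters keeps the sketch runnable without access to either API.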
Challenges we ran into
The Migros APIs were not public. There were two different APIs with different specifications, and we had to reverse engineer them by searching for particular items and studying the structure of the requests and responses. We couldn't retrieve all the information we needed with a single request, so we had to make thousands of requests, and we feared an IP ban from the Migros server. That is why we chose to send requests one by one rather than in parallel batches, which are more likely to be spotted by Migros' rate-limiting algorithms.
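A minimal sketch of the one-by-one approach, assuming a fixed pause between requests (the write-up does not say whether our scripts paused at all, so the delay is an assumption). The names `fetch_sequentially`, `fetch` and `delay_seconds` are hypothetical, and `fetch` stands in for whatever performs the actual HTTP request.

```python
import time


def fetch_sequentially(ids, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(id) for each ID in order, pausing between calls.

    Only one request is in flight at a time, which keeps the request rate
    low and predictable compared with parallel batches.
    """
    results = {}
    for index, product_id in enumerate(ids):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results[product_id] = fetch(product_id)
    return results
```

For example, `fetch_sequentially(ids, fetch, delay_seconds=2.0)` would wait two seconds between consecutive requests. The `sleep` parameter exists only so the pause can be replaced in tests.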
Accomplishments that we're proud of
We managed to retrieve every single edible product (except for the vacuum cleaners category): full descriptions of about 8'000 products in total, including, where Migros provides them, name, ingredients and nutritional values.
What's next for openfood-migros-import
It shouldn't be complicated to design a tool that automatically imports missing products from the Migros online catalog into the OpenFood database, using the retrieved JSON dumps.
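As a starting point, such a tool could first select the products that OpenFood does not know yet. The sketch below assumes one JSON file per product with a `"barcode"` field. Both the file layout and the field name are assumptions for illustration, not the actual format of our dumps, and the upload to OpenFood itself is not shown.

```python
import json
from pathlib import Path


def load_dumps(dump_dir):
    """Load every *.json file in dump_dir as one product dictionary."""
    products = []
    for path in sorted(Path(dump_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            products.append(json.load(handle))
    return products


def select_missing(products, known_barcodes):
    """Keep products whose barcode is not yet known to OpenFood.

    Products without a barcode are skipped, since they cannot be matched.
    """
    known = set(known_barcodes)
    return [
        product
        for product in products
        if product.get("barcode") and product["barcode"] not in known
    ]
```

The result of `select_missing(load_dumps("dumps"), known_barcodes)` would be the list of candidates to import, where `"dumps"` and `known_barcodes` are placeholders for the dump directory and the barcodes already present in OpenFood.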