Inspiration A global UN body approached us because it needed to aggregate patient-level COVID-19 data from different countries. Its initial process was manual: staff visited websites and copied the data by hand, and processed data arriving in emails, PDFs, spreadsheets and images. Some of the data was in languages the staff could not read. The whole process was tedious, costly, unreliable, and a diversion from those people's regular roles. That is why we built an AI-powered automated scraping mechanism.
What it does We have Azure Functions that scrape data from multiple sources, apply NLP to extract the case details, translate them, and load them into a SQL database.
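As a rough illustration of the extract-and-store steps, here is a minimal Python sketch. It is not the project's code: a regular expression stands in for the NLP extraction, an in-memory SQLite database stands in for the SQL database, the translation step is left out, and the names `CASE_PATTERN`, `extract_cases`, `store_cases` and the bulletin wording are all hypothetical.

```python
import re
import sqlite3

# Hypothetical pattern for one English-language bulletin style.
# Real sources vary widely; the project uses NLP rather than a single regex.
CASE_PATTERN = re.compile(
    r"Case\s+(?P<case_id>\d+)\D+?(?P<age>\d+)[- ]year[- ]old\s+(?P<sex>male|female)",
    re.IGNORECASE,
)


def extract_cases(text, country):
    """Return one dict per case mention found in the text."""
    return [
        {
            "country": country,
            "case_id": int(m.group("case_id")),
            "age": int(m.group("age")),
            "sex": m.group("sex").lower(),
        }
        for m in CASE_PATTERN.finditer(text)
    ]


def store_cases(conn, cases):
    """Insert or update the extracted cases (assumed schema, for illustration)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "country TEXT, case_id INTEGER, age INTEGER, sex TEXT, "
        "PRIMARY KEY (country, case_id))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO cases VALUES (:country, :case_id, :age, :sex)",
        cases,
    )
    conn.commit()


# Made-up bulletin text, not real patient data.
bulletin = "Case 101 is a 34-year-old male. Case 102 is a 58 year old Female."
conn = sqlite3.connect(":memory:")
cases = extract_cases(bulletin, "Singapore")
store_cases(conn, cases)
rows = conn.execute(
    "SELECT case_id, age, sex FROM cases ORDER BY case_id"
).fetchall()
print(rows)
```

Keying the table on country and case number means that re-scraping the same bulletin updates existing rows instead of duplicating them, which is one plausible way to keep a scheduled scraper idempotent.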
How we built it We have covered this in the repo.
Challenges we ran into Scraping PDFs can be challenging, especially when tables contain merged cells or when characters sit very close to a cell border.
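One common way to cope with merged cells, sketched below, is to copy the value from the row above into cells that came out empty. This is an assumption about how the problem could be handled, not a description of what the project does: it presumes the table extractor returns the merged value only in the first row and `None` or an empty string in the rows below, and the function name `fill_merged_cells` and the sample rows are hypothetical.

```python
def fill_merged_cells(rows, merged_columns):
    """Copy the value from the row above into empty cells of the given columns.

    Assumes an empty cell (None or "") in one of `merged_columns` means the
    cell was covered by a vertical merge. Other columns are left untouched,
    so cells that are genuinely empty stay empty.
    """
    filled = []
    previous = None
    for row in rows:
        current = list(row)
        if previous is not None:
            for i in merged_columns:
                if i < len(current) and i < len(previous) and current[i] in (None, ""):
                    current[i] = previous[i]
        filled.append(current)
        previous = current
    return filled


# Made-up rows as a table extractor might return them: the date cell is
# merged across the first two rows, so the second row has no date.
extracted = [
    ["2020-03-01", "Case 1", "Imported"],
    [None, "Case 2", "Local"],
    ["2020-03-02", "Case 3", ""],
]
cleaned = fill_merged_cells(extracted, merged_columns={0})
print(cleaned)
```

Restricting the fill to columns known to be merged avoids inventing values for fields that were simply not reported.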
Accomplishments that I am proud of A fully operational search UI in which you can search for any COVID-19 case from Singapore, the Philippines, Hong Kong, Vietnam, New Zealand and South Korea.
What's next for Covid19 Monitoring We are building accelerators for different types of source formats, to speed up the process of building scrapers for new formats and countries.