COVID-19 Statistics web scraper


Inspiration

I found it a minor inconvenience that every day I had to look up the COVID-19 stats by hand. It was a step out of the way of what I wanted to do. So I thought it would be a good idea to create a Python application that scrapes the data from a website and shows the stats instantly, without ever typing a search. I also thought it would be a great personal development project for learning computer science and becoming more employable at any engineering firm or other place that needs coding experience in the future.

What it does

The program scrapes worldometers.info for the worldwide coronavirus cases, deaths, recovered cases and active cases. It also takes the information from the large table on the website, puts it into neat lists, and then shows it in an easygui graphical interface for easy access to every country's statistics. These statistics include country, total cases, new cases today, recovered cases and deaths. It is an easier alternative to looking up coronavirus statistics every day. Just boot up the program and it gives you real-time numbers straight from the website.
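As a rough illustration of the "neat lists" step, here is a minimal sketch of how scraped cell text could be turned into numbers and grouped per country. The helper names (`parse_count`, `build_record`), the column order, and the sample row are all assumptions made for this example; the row holds made-up placeholder values, not real statistics.

```python
# Field order is an assumption that mirrors the statistics listed above.
FIELDS = ["country", "total_cases", "new_cases", "recovered", "deaths"]


def parse_count(text):
    """Turn a cell such as '1,234' or '+56' into an int; return None if blank or non-numeric."""
    cleaned = text.strip().replace(",", "").lstrip("+")
    if not cleaned.isdigit():
        return None
    return int(cleaned)


def build_record(cells):
    """Pair one table row's cell strings with the field names above."""
    country, *counts = cells
    values = [country.strip()] + [parse_count(c) for c in counts]
    return dict(zip(FIELDS, values))


# Made-up placeholder row, not real statistics.
row = ["Exampleland", "12,345", "+67", "8,900", "210"]
record = build_record(row)
print(record)
```

Keeping each country as a dictionary means the interface code can look values up by name instead of by column position.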

How I built it

I started in the Atom editor because I find it a clean, easy-to-use environment, and I had used it before in my classes at school. I soon found that I needed more Python libraries: requests for fetching pages from the internet, Beautiful Soup 4 (bs4) for parsing those pages into readable information, and lxml as the HTML parser behind bs4. I used these tools to scrape worldometers.info for the real-time COVID-19 stats. Then I used easygui to build a usable, easy-to-read graphical interface for all of the info I collected.
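The easygui part of a tool like this could be sketched as follows. This is not the project's actual code: the function names, the record layout, and the sample records are assumptions for illustration, and the numbers are made-up placeholders. It relies only on easygui's `choicebox` and `msgbox` dialogs.

```python
def format_count(value):
    """Show a number with thousands separators, or 'n/a' when it is missing."""
    return "n/a" if value is None else f"{value:,}"


def format_country_summary(record):
    """Build the text shown for one country."""
    return "\n".join([
        f"Country: {record['country']}",
        f"Total cases: {format_count(record['total_cases'])}",
        f"New cases today: {format_count(record['new_cases'])}",
        f"Recovered: {format_count(record['recovered'])}",
        f"Deaths: {format_count(record['deaths'])}",
    ])


def show_country_picker(records):
    """Let the user pick a country, then show its statistics in a message box."""
    # Imported here so the formatting helpers can be tried without a display.
    import easygui

    by_name = {r["country"]: r for r in records}
    choice = easygui.choicebox("Pick a country", "COVID-19 statistics", sorted(by_name))
    if choice is not None:  # None means the user cancelled the dialog
        easygui.msgbox(format_country_summary(by_name[choice]), title=choice)


# Made-up placeholder records, not real statistics.
records = [
    {"country": "Exampleland", "total_cases": 12345, "new_cases": 67,
     "recovered": 8900, "deaths": 210},
    {"country": "Samplestan", "total_cases": 9876, "new_cases": None,
     "recovered": 9000, "deaths": 150},
]
summary = format_country_summary(records[0])
print(summary)
# To open the dialogs on a machine with a display: show_country_picker(records)
```

Separating the text formatting from the dialog calls keeps the display logic easy to check without opening a window.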

Challenges I ran into

Firstly, it had been a while since I last coded in Python, so I was very rusty at the start of the project. To combat this, I went back through the Python documentation and watched videos to rebuild my understanding.

Furthermore, I had a very hard time working out how to go through the large table on the worldometers.info website. With my limited HTML experience, it was very challenging to pull the info I actually needed out of a huge, difficult-to-read HTML document. However, I watched some videos on how to use the bs4 and requests libraries, and in the end I was able to work out what the HTML meant and how to find the attributes that correspond to the numbers and words I was looking for.
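For readers facing the same problem, here is a minimal sketch of walking an HTML table with requests and bs4. It runs against a small inline sample rather than the live site; the table id `stats_table`, the function names, and the sample rows are hypothetical, and the real page's markup may differ. The sketch uses Python's built-in `html.parser` backend so it has one fewer dependency; passing `"lxml"` instead, as in this project, should work the same way here.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for a real page; the id and the values are made up.
SAMPLE_HTML = """
<table id="stats_table">
  <thead><tr><th>Country</th><th>Total Cases</th><th>New Cases</th></tr></thead>
  <tbody>
    <tr><td>Exampleland</td><td>12,345</td><td>+67</td></tr>
    <tr><td>Samplestan</td><td>9,876</td><td></td></tr>
  </tbody>
</table>
"""


def fetch_html(url):
    """Download a page and return its HTML text."""
    # Imported here so the parsing code below can be tried offline.
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def table_rows(html, table_id):
    """Return each data row of the table with the given id as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows contain only <th> cells, so they are skipped
            rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML, "stats_table")
print(rows)
```

Looking the table up by its `id` attribute is one way to find the right element; inspecting the page in a browser's developer tools shows which attribute is available on a given site.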

Accomplishments that I'm proud of

I am very proud of how much my Python improved over the course of the hackathon. Even though the project may seem barebones compared to others' progress, I did not expect my product to look as polished and as good as it does. I am proud that I was able to learn website scraping in such a short time. I am also proud that I now have more valuable skills for future jobs.

What I learned

I learned a ton about how to scrape websites with the requests and Beautiful Soup libraries. I now feel able to scrape a website and pull information out of an HTML document, however difficult the HTML is to read.

What's next for COVID-19 Statistics web scraper

I hope to one day add an algorithm to the stats app that predicts how many new cases each country will report the following day. I feel that it would be a fun addition to the project and a useful skill for later job applications.
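No prediction method has been chosen yet. As one possible starting point, a naive baseline could predict tomorrow's new cases as the average of the last few days. The function name, the `window` parameter, and the sample history below are assumptions for illustration, and the numbers are made-up placeholders.

```python
def predict_next_day(daily_new_cases, window=7):
    """Naive baseline: average the most recent `window` days of new cases."""
    if not daily_new_cases:
        raise ValueError("need at least one day of data")
    recent = daily_new_cases[-window:]  # uses every day if fewer than `window` exist
    return round(sum(recent) / len(recent))


# Made-up placeholder history of daily new cases for one country.
history = [100, 120, 110, 130, 125, 140, 135, 150]
prediction = predict_next_day(history, window=3)
print(prediction)
```

A moving average cannot anticipate a sudden rise or fall, but it gives a simple reference point for judging any more advanced model.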

Built With

atom
beautiful-soup
easygui
lxml
python
requests

Updates

Gavin Mesz started this project — Jan 17, 2021 12:23 AM EST

Leave feedback in the comments!
