Objective
The goal of this mini project is to analyze the global spread and impact of COVID-19 using publicly available datasets. We'll use the Pandas library in Python to clean, manipulate, and analyze the data, and present key insights through visualizations and summary statistics.
Scope
Data cleaning and preprocessing
Trend analysis: Confirmed cases, recoveries, and deaths
Country-wise comparisons
Monthly trend aggregation
Identifying top/bottom affected countries
Basic visualizations (matplotlib/seaborn)
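As a rough illustration of the cleaning and ranking items in the scope, here is a minimal sketch. The sample rows, the column names (`Country`, `Date`, `Confirmed`, `Deaths`) and the helper names `clean` and `latest_totals` are all hypothetical; a real dataset may use a different layout. The choice to forward-fill missing cumulative counts is also an assumption, not a requirement of the project.

```python
import pandas as pd

# Hypothetical sample data; names and numbers are illustrative only.
raw = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C"],
    "Date": ["2020-03-01", "2020-03-02", "2020-03-01",
             "2020-03-02", "2020-03-01", "2020-03-02"],
    "Confirmed": [10, 25, 5, None, 100, 140],  # cumulative counts, one gap
    "Deaths": [0, 1, 0, 0, 2, 3],
})


def clean(df):
    """Parse dates, sort, and fill gaps in the cumulative counts."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.sort_values(["Country", "Date"])
    # Assumption: a missing cumulative value equals the previous day's value
    # for the same country; anything still missing becomes 0.
    out["Confirmed"] = (
        out.groupby("Country")["Confirmed"].ffill().fillna(0).astype(int)
    )
    return out.reset_index(drop=True)


def latest_totals(df):
    """Return the most recent cumulative counts per country."""
    # Rows are sorted by date, so the last row per country is the latest.
    return df.groupby("Country")[["Confirmed", "Deaths"]].last()


cleaned = clean(raw)
totals = latest_totals(cleaned)
top = totals.nlargest(2, "Confirmed")      # most affected countries
bottom = totals.nsmallest(2, "Confirmed")  # least affected countries
```

The `totals` frame can then feed a bar chart in matplotlib or seaborn for the country-wise comparison.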
Target Audience
Data enthusiasts
Healthcare researchers
Policy makers
General public interested in COVID-19 trends
📅 Project Plan (1-2 Days)
| Day | Task | Tools |
| --- | --- | --- |
| Day 1: Data Handling & EDA | | |
| Morning | Load dataset, inspect schema, handle missing values | Pandas |
| Afternoon | Clean data (date format, groupby, aggregations) | Pandas |
| Evening | Exploratory Data Analysis (EDA) | Pandas, matplotlib/seaborn |
| Day 2: Analysis & Reporting | | |
| Morning | Time series trends (daily, monthly cases) | Pandas, matplotlib |
| Afternoon | Country-wise comparison and rankings | Pandas |
| Evening | Visualizations and final dashboard | Seaborn, Matplotlib |
| Final | Documentation, summary, and GitHub push | Markdown, GitHub |
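The Day 2 time series step (daily and monthly cases) could look roughly like the sketch below. It assumes the dataset reports cumulative confirmed counts, so daily new cases come from differencing. The dates and numbers are made up, and clipping negative differences to zero is an assumed way to handle downward revisions, not something the plan prescribes.

```python
import pandas as pd

# Hypothetical cumulative confirmed counts for a single country.
dates = pd.to_datetime(["2020-01-30", "2020-01-31", "2020-02-01", "2020-02-02"])
cumulative = pd.Series([2, 5, 9, 10], index=dates, name="Confirmed")

# Daily new cases are the day-over-day difference of the cumulative series.
# The first day has no predecessor, so its cumulative value is used as is.
# Assumption: negative differences (data revisions) are clipped to zero.
daily_new = (
    cumulative.diff()
    .fillna(cumulative.iloc[0])
    .clip(lower=0)
    .astype(int)
)

# Monthly totals: group the daily values by calendar month.
monthly_new = daily_new.groupby(daily_new.index.to_period("M")).sum()
```

Grouping by `to_period("M")` keeps the month labels readable (`2020-01`, `2020-02`) when the result is printed or plotted.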
📦 Dataset
Use the Johns Hopkins COVID-19 Dataset or Kaggle datasets such as:
COVID-19 World Dataset
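If the Johns Hopkins global time series files are used, they come in a wide layout: one row per region and one column per date. The sketch below reshapes that layout into long form for analysis. The inline CSV is a stand-in with made-up numbers so the example runs without a download; the column names follow the Johns Hopkins layout as commonly published, which should be checked against the actual file.

```python
import io

import pandas as pd

# Inline stand-in for a wide-format time series file; values are made up.
csv_text = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
,Italy,41.9,12.6,0,2
Hubei,China,30.9,112.3,10,15
Beijing,China,40.2,116.4,1,3
"""

wide = pd.read_csv(io.StringIO(csv_text))

# Turn the per-date columns into rows: one row per (region, date).
tidy = wide.melt(
    id_vars=["Province/State", "Country/Region", "Lat", "Long"],
    var_name="Date",
    value_name="Confirmed",
)
tidy["Date"] = pd.to_datetime(tidy["Date"], format="%m/%d/%y")

# Sum provinces so each country has a single row per date.
by_country = tidy.groupby(
    ["Country/Region", "Date"], as_index=False
)["Confirmed"].sum()
```

For a real file, `io.StringIO(csv_text)` would be replaced by the path to the downloaded CSV.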
Built With
- scratch