Inspiration

According to the New York Times (2020), at least 1,000 cases were reported at more than 85 U.S. colleges, and at least 100 cases at more than 680 colleges. The severity of Covid-19 at U.S. colleges reached a new peak when students returned to campus last fall. This report first explores the severity of Covid-19 at the Big Ten Conference (B1G) colleges between August and December 2020: Illinois, Northwestern, Indiana, Iowa, Maryland, Michigan, Michigan State, Minnesota, Nebraska, Rutgers, Ohio State, Penn State, UW Madison, and Purdue.

Many studies have explored the important determinants of local Covid-19 severity. As explained by Andersen, Martin et al. (2020), when colleges reopened for face-to-face teaching, mobility on campus increased and Covid-19 incidence in the county rose on average by a statistically significant 0.024 per thousand residents. This reminded our team that the Big Ten football season started in late October last year.

Furthermore, in April 2020 the federal government planned to allocate money to many U.S. colleges for student emergency grants. According to the Lantern (2020), Penn State, Rutgers, and Ohio State received larger allocations than the other B1G colleges. Other colleges may be interested in the factors that determine how much stimulus funding they could receive.

That is why our project focuses on how the Big Ten Conference colleges have performed during Covid-19 since August 2020 and on the stimulus funding they received.

What it does

First, we compare the Big Ten colleges on testing and confirmed cases after the massive campus shutdown. Did the universities really commit to controlling Covid-19 among their students in Fall 2020? How does the surrounding county influence the university?

Finally, we would like to understand the amount of stimulus funding each university received. What might be the leading factors affecting that amount?

How we built it

We hand-built two datasets covering Big Ten student enrollment, international student counts, and the stimulus funding each college received. We also used two open-source datasets: one with Big Ten tests conducted and cases confirmed between August and December 2020, and one with county-level confirmed cases.

We conducted exploratory data analysis to explain the Covid-19 trends and patterns at these universities and how successful they were in controlling the situation. We then compared each university with its county. Five indexes are considered:

  • Number of New Tests Conducted
  • Number of Cases Confirmed
  • Confirmed Cases out of Conducted Tests (%)
  • University Confirmed Cases out of County Confirmed Cases (%)
  • Ratio of University Confirmed Cases out of County Confirmed Cases (%) to University Population out of County Population (%)
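
As a minimal sketch of how the last three indexes could be computed from the first two plus population figures, consider the Python below. The `PeriodRecord` type, its field names, and the example numbers are all hypothetical and chosen only for illustration; they are not taken from our datasets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodRecord:
    """One hypothetical reporting period for a university and its county."""
    new_tests: int          # tests conducted by the university
    univ_cases: int         # cases confirmed at the university
    county_cases: int       # cases confirmed in the surrounding county
    univ_population: int    # university enrollment
    county_population: int  # county residents


def positivity_pct(r: PeriodRecord) -> Optional[float]:
    """Confirmed cases out of conducted tests (%)."""
    if r.new_tests == 0:
        return None  # undefined when no tests were conducted
    return 100.0 * r.univ_cases / r.new_tests


def case_share_pct(r: PeriodRecord) -> Optional[float]:
    """University confirmed cases out of county confirmed cases (%)."""
    if r.county_cases == 0:
        return None  # undefined when the county reported no cases
    return 100.0 * r.univ_cases / r.county_cases


def population_share_pct(r: PeriodRecord) -> float:
    """University population out of county population (%)."""
    return 100.0 * r.univ_population / r.county_population


def share_ratio(r: PeriodRecord) -> Optional[float]:
    """Case share divided by population share.

    A value above 1 means the university accounts for more of the
    county's cases than its share of the population would suggest.
    """
    share = case_share_pct(r)
    if share is None:
        return None
    return share / population_share_pct(r)


# Made-up numbers, for illustration only.
example = PeriodRecord(
    new_tests=10_000,
    univ_cases=200,
    county_cases=800,
    univ_population=50_000,
    county_population=200_000,
)
print(positivity_pct(example), case_share_pct(example), share_ratio(example))
```

Dividing the case share by the population share makes universities of very different sizes comparable: the ratio stays near 1 when campus cases simply track the university's weight in the county.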

Furthermore, for stimulus funding, we used both statistical methods and machine learning models. We used Pearson and Spearman correlation to identify potential relationships between the funding amount and several features, including enrollment, international students, tests, and confirmed cases. Linear Regression, SVM (Support Vector Machine), and XGBoost were then applied to explore the ranking of these factors in more depth.
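
To illustrate the difference between the two correlation measures, here is a small dependency-free Python sketch. It is not our project code: the helper names are hypothetical and the toy data is made up. Pearson measures linear association, while Spearman is Pearson applied to ranks, so it captures any monotonic relationship.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up toy data: y grows monotonically but not linearly with x.
toy_x = [1, 2, 3, 4, 5]
toy_y = [1, 4, 9, 16, 25]
print(pearson(toy_x, toy_y), spearman(toy_x, toy_y))
```

With only 14 universities in the conference, a rank-based measure such as Spearman is less sensitive to a single very large school than Pearson is, which is one reason to report both.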

Challenges we ran into

Data collection and the shortage of time were the toughest parts of this two-day datathon. Because Covid-19 spread so suddenly, some data sources are not fully reliable or up to date. For general university information, features other than enrollment and international student counts, such as gap percentage and employment percentage, are hard to quantify and to collect from open sources. It is really hard to assemble fully reliable data in such a short time.

Accomplishments that we're proud of

We managed to find some reliable data sources within the time limit and to provide reasonable EDA and modelling with the data we found.

What's next for Covid-19 in Big Ten Conference Colleges

Data accuracy is a key factor that greatly affects model performance. Our team will keep looking for solid related datasets and for ways to quantify university performance.
