AcademicGatorAid

Inspiration

We were inspired to promote equity for students whose socioeconomic conditions negatively affect their academic performance. To disregard the socioeconomic advantages certain students have is to disregard the wellbeing of students who lack access to the same privileges. It is therefore misguided to quantify a student's potential from metrics such as GPA and test scores without accounting for conditions such as parental income. To address this issue, we created AcademicGatorAid, a framework that encourages academic administrators to provide extra support to the students who need it most.

What it does

AcademicGatorAid provides academic administrators with tools to better understand how socioeconomic factors contribute to a student's performance and to identify students who require additional support.

The Data Exploration notebook quantifies and visualizes how certain factors impact academic performance. The R script prediction model classifies students by how likely they are to benefit from additional support, based on their socioeconomic conditions and on metrics used to measure academic performance.

How we built it

The Data Exploration notebook uses KMeans clustering to partition a set of students into two groups. This shows the user how differences in academic performance can be explained by conditions such as the highest level of education attained by a student's parents. Furthermore, t-SNE, a technique that projects high-dimensional data into a low-dimensional space for visualization, is applied to demonstrate that the groups formed by KMeans clustering are relatively distinct.
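The two-group partition described above can be illustrated with a minimal k-means sketch. This is not the notebook's code: it is a standard-library-only illustration of the assignment and update steps, and the `students` scores, the function names, and the `seed` parameter are all hypothetical. A library implementation would normally be used in practice, and the t-SNE projection is not covered here.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k=2, iterations=100, seed=0):
    """Partition points into k clusters; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:  # no change, so the clustering converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: (math score, reading score) for six students.
students = [
    (45.0, 50.0), (50.0, 48.0), (48.0, 55.0),
    (85.0, 90.0), (88.0, 84.0), (92.0, 95.0),
]
labels, centroids = kmeans(students, k=2)
for student, label in zip(students, labels):
    print(student, "-> cluster", label)
```

The cluster numbers themselves are arbitrary. What matters is which students end up together, which can then be compared against attributes such as parental education.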

After obtaining descriptive statistics and visual insight into the clustering from our Data Exploration notebook, we moved to an R script to build our prediction model. We used multi-group Quadratic Discriminant Analysis (QDA), a supervised machine learning algorithm that classifies data into discrete groups, to build a classification model that uses all predictors to assign students to one of three groups labeled 0, 1, and 2. These groups indicate varying priority, with group 2 containing the students least at risk of struggling. We trained our model on 70% of our data and tested it on the remaining 30%, obtaining a testing error rate of 20%; that is, our model predicted the correct class for 80% of the students in our testing set.
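To make the train/test procedure concrete, here is a small Python sketch of the same idea. It is an illustration under stated assumptions, not our R script: it uses a single made-up feature instead of all predictors, synthetic data drawn from three hypothetical Gaussian groups, and hand-written helper functions (`fit_qda_1d`, `predict_qda_1d`). With one feature, QDA reduces to fitting a separate mean and variance per class and picking the class with the highest discriminant score.

```python
import math
import random

def fit_qda_1d(xs, ys):
    """Estimate (prior, mean, variance) per class for a single feature."""
    params = {}
    for label in sorted(set(ys)):
        values = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
        params[label] = (len(values) / len(xs), mean, variance)
    return params

def predict_qda_1d(params, x):
    """Return the class whose quadratic discriminant score is highest."""
    def score(label):
        prior, mean, variance = params[label]
        return (math.log(prior)
                - 0.5 * math.log(variance)
                - (x - mean) ** 2 / (2 * variance))
    return max(params, key=score)

# Synthetic, hypothetical data: one score per student, three priority groups.
rng = random.Random(42)
group_settings = {0: (40.0, 6.0), 1: (65.0, 4.0), 2: (90.0, 3.0)}  # mean, std
data = [
    (rng.gauss(mean, std), label)
    for label, (mean, std) in group_settings.items()
    for _ in range(50)
]

# 70/30 split: shuffle, then cut at 70% of the rows.
rng.shuffle(data)
cut = len(data) * 7 // 10
train, test = data[:cut], data[cut:]

params = fit_qda_1d([x for x, _ in train], [y for _, y in train])

# Testing error rate = share of held-out students classified incorrectly.
errors = sum(1 for x, y in test if predict_qda_1d(params, x) != y)
error_rate = errors / len(test)
accuracy = 1 - error_rate
print(f"test error rate: {error_rate:.2%}, accuracy: {accuracy:.2%}")
```

The error rate and accuracy always sum to one, which is why a 20% testing error rate corresponds to 80% accuracy. The numbers printed by this sketch come from the synthetic data and say nothing about the performance of the actual model.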

Challenges we ran into

We found it challenging to narrow down an issue that we could feasibly solve given our prior experience and the time constraint. Furthermore, after we decided on our area of focus, we struggled to find a dataset that would allow us to accomplish our goal of demonstrating the effect of socioeconomic conditions on academic outcomes.

Accomplishments that we're proud of

We are proud of our ability to collaborate and learn from each other during the brainstorming and development process. Furthermore, we are proud of our determination and perseverance at times when we felt like giving up. Thanks to our teamwork and perseverance, we built a model that classified 80% of our testing set correctly. We were also able to apply machine learning methods to visualize and group the data.

What we learned

Through our discussions and research during the brainstorming process, we discovered many ways in which AI and machine learning hold promise for a better future when used for social good. We also learned many teamwork skills, such as how to combine individual strengths to solve a problem. As for technical skills, we learned how to visualize high-dimensional data using t-SNE and how to use Pandas.

What's next for AcademicGatorAid

Due to data limitations, our dataset does not encompass many of the socioeconomic conditions we wish to consider. Thus, we hope to collect a more comprehensive dataset to improve the prediction model and account for more conditions. Additionally, in future versions we wish to expand our data exploration and visualization methods and create a better UI.