Hackathon 2021 Data Science Challenge

Inspiration

We wanted to take on the data science challenge to uncover which variables predict how teams place in the college basketball postseason.

What it does

Our project first visualizes the data. By looking at frequencies, correlations, and model predictions, we can begin to understand how the variables relate to one another. We start with the relationships between conferences and our features of interest, then compare public and private universities to see whether any inequalities appear. After that, we fit neural network, random forest, and XGBoost models to find the one that best predicts tournament placement.
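As a rough illustration of that workflow, here is a minimal sketch that summarizes features by conference, computes correlations, and fits a random forest. The data is synthetic, and the column names (`conference`, `win_pct`, `off_eff`, `def_eff`, `placement`) are hypothetical stand-ins rather than the fields from the challenge data set.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; every column name here is an assumption.
rng = np.random.default_rng(0)
n = 200
teams = pd.DataFrame({
    "conference": rng.choice(["A", "B", "C"], size=n),
    "win_pct": rng.uniform(0.2, 0.9, size=n),
    "off_eff": rng.normal(105, 8, size=n),
    "def_eff": rng.normal(100, 8, size=n),
})

# Toy binary label: 1 for teams whose made-up score is above the median.
score = teams["win_pct"] * 10 + (teams["off_eff"] - teams["def_eff"]) / 10
teams["placement"] = (score > score.median()).astype(int)

features = ["win_pct", "off_eff", "def_eff"]

# Per-conference feature averages and pairwise correlations,
# the kind of tables one might pass to a seaborn bar plot or heatmap.
conference_means = teams.groupby("conference")[features].mean()
correlations = teams[features + ["placement"]].corr()

# Hold out a test split, fit a random forest, and score it.
X_train, X_test, y_train, y_test = train_test_split(
    teams[features], teams["placement"],
    test_size=0.25, random_state=0, stratify=teams["placement"],
)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Which features the forest leaned on most.
importances = pd.Series(model.feature_importances_, index=features)
```

The feature importances give a first hint about which variables matter, which can help when domain knowledge is limited.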

How we built it

We used a combination of Python libraries, including NumPy, pandas, seaborn, Keras, scikit-learn, and XGBoost, to build and visualize our project.

Challenges we ran into

We ran into problems preprocessing our data to fit our visualization and machine learning methods. We also lacked domain knowledge, so it was hard to make judgment calls about which features to include in our model and which questions to ask.

Accomplishments that we're proud of

We are proud of challenging ourselves and building working machine learning models, an area in which we previously lacked experience.

What we learned

We learned current visualization and machine learning methods and how to apply them to a toy data set.

What's next for Hackathon 2021 Data Science Challenge

Next, we would continue to explore the neural network and XGBoost models, tuning their hyperparameters to improve model accuracy.
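One possible way to approach that tuning is a cross-validated grid search. The sketch below is only an assumption about how it could look: it uses synthetic data and scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost so that it needs a single dependency, and the grid values are arbitrary examples.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic classification data standing in for the tournament data set.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)

# Small, arbitrary grid; a real search would likely cover wider ranges.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

# 3-fold cross-validated search over every combination in the grid.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_  # best combination found
best_score = search.best_score_    # its mean cross-validated accuracy
```

Because XGBoost's `XGBClassifier` follows the scikit-learn estimator interface, it could be swapped in for the stand-in model with the same grid keys.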

Built With

python

Updates

Dillon Lloyd started this project — Mar 20, 2021 11:57 PM EDT
