Inspiration
We built this project at a hackathon organised by GeeksForGeeks (GFG). GFG provided the main dataset, and we supplemented it with external data from Kaggle. The analysis draws on Olympic data from the past 120 years.
What it does
We extracted multiple insights and correlations related to the number of medals a country wins. These insights can help a country improve its chances at the Olympics.
How I built it
We used several data-science tools to clean and preprocess the data. Most of the work went into cleaning, because the data had many null or missing values. We also used RandomForestRegressor from scikit-learn to build a small machine learning model. It is not very accurate, but it gives a basic idea based on only two factors: the country's location (continent) and its average population. Alongside the given dataset we used a separate dataset of country populations; we cleaned that dataset, merged the two, and then trained the model. Before training, we visualized the data thoroughly and drew out multiple insights.
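The clean, merge, and train steps described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual notebook: the column names (`country`, `continent`, `avg_population`, `medals`), the toy values, and the model settings are all assumptions made up for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Toy stand-ins for the two datasets; every value here is made up.
medals = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "continent": ["Europe", "Europe", "Africa", "Asia", "Africa", None],
    "medals": [120, 80, 5, 60, 8, 30],
})
population = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "avg_population": [60e6, 45e6, 30e6, None, 20e6, 10e6],
})

# Merge on the shared country key, then drop rows with missing values.
data = medals.merge(population, on="country", how="inner").dropna()

# One-hot encode the continent so the model receives numeric features only.
X = pd.get_dummies(
    data[["continent", "avg_population"]], columns=["continent"], dtype=float
)
y = data["medals"]

# Hypothetical settings; random_state only makes the run repeatable.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)
predictions = model.predict(X)
```

With only two features and predictions made on the training rows, a sketch like this says nothing about real accuracy; a held-out test split would be needed to measure that.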
Challenges I ran into
Like any other project, this one came with a few problems. Because the data covers the past 120 years of Olympic records, it includes several countries, such as the Soviet Union, Bohemia, and the West Indies Federation, that no longer exist, so their data should be removed. Another issue came up when I wanted to calculate the athletes' ages in order to find their average age. One of my team members helped me fix it: he used GCP (Google Cloud Platform) to scrape the athletes' birth years from the internet.
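Both fixes can be illustrated with a short pandas sketch. The column names (`team`, `games_year`, `birth_year`), the sample rows, and the exact contents of `DEFUNCT_TEAMS` are hypothetical; the real dataset and the scraped birth years may be laid out differently.

```python
import pandas as pd

# Hypothetical list of teams that no longer exist; extend as needed.
DEFUNCT_TEAMS = {"Soviet Union", "Bohemia", "West Indies Federation"}

# Made-up athlete rows; a missing birth year is represented as None.
athletes = pd.DataFrame({
    "name": ["P", "Q", "R", "S"],
    "team": ["Soviet Union", "India", "Bohemia", "Kenya"],
    "games_year": [1980, 2016, 1912, 2012],
    "birth_year": [1955, 1992, 1890, None],
})

# Keep only athletes whose team still exists today.
current = athletes[~athletes["team"].isin(DEFUNCT_TEAMS)].copy()

# Age at the Games is the Games year minus the birth year (NaN if unknown).
current["age"] = current["games_year"] - current["birth_year"]

# mean() skips missing ages by default.
average_age = current["age"].mean()
```

Subtracting years ignores the athlete's birthday within the year, so each age can be off by one; for an overall average that is usually an acceptable approximation.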
Accomplishments that I'm proud of
I am proud of the useful insights I found in this project. For example, a country in Europe has a higher chance of winning a medal, probably for historical reasons, while a country in Africa has a lower chance. We also found the average age of the athletes to be 26. We also spent a good amount of time building a website to display the insights, visualizations, and code.
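A comparison by continent like the one above could be computed along these lines. The rows are invented purely to show the mechanics and do not reproduce the project's findings; `continent` and `medal` are assumed column names.

```python
import pandas as pd

# Made-up entries: one row per athlete entry, medal is None if none was won.
entries = pd.DataFrame({
    "continent": ["Europe", "Europe", "Europe", "Africa", "Africa", "Asia"],
    "medal": ["Gold", None, "Bronze", None, None, "Silver"],
})

# Flag the entries that ended with a medal.
entries["won_medal"] = entries["medal"].notna()

# Share of medal-winning entries per continent, highest first.
medal_rate = (
    entries.groupby("continent")["won_medal"].mean().sort_values(ascending=False)
)
```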
What I learned
I learned many parts of front-end development. As team captain, once I had completed my own work (data analysis), I wanted to help one of my team members with hers (front-end development). This helped me learn about frameworks like Tailwind CSS for building a website in a short time. I also learned some JavaScript, which I didn't know much of because I generally use Python and C++. Most important of all was teamwork: this was my first collaborative project, and teamwork was the most important skill I learned from it.
What's next for OlympicsDataAnalysis
I have several ideas for improving this project. I will make the machine learning model more accurate by adding features such as the average age of citizens and the number of athletes in a country. I will also add an interactive data demonstration to our project webpage soon, which would give users the freedom to play with the data and find, or even predict, a country's medals.
Built With
- css3
- data-science-toolkit
- html5
- javascript
- jupyter-notebook
- kaggle
- pandas
- python
- sklearn
- tailwind