What it does

It predicts Sales (Domestic Ultimate Total USD) using a random forest regression model.

How we built it

Through exploration and analysis of the data, we found certain relationships between the features. Based on these, we chose a few features and built three models to compare their performance. We revised the set of features several times along the way to improve our model, reducing the AIC from a relatively high value to 11.
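
As a rough illustration of comparing models by AIC, here is a minimal sketch. It assumes the common least-squares form AIC = n ln(RSS / n) + 2k, which holds for Gaussian errors up to an additive constant; the write-up does not state which formula or library was actually used. The helper name `aic_least_squares`, the model names, the toy numbers, and the parameter counts are all hypothetical.

```python
import math


def aic_least_squares(y_true, y_pred, num_params):
    """AIC for a least-squares fit, assuming Gaussian errors.

    Uses AIC = n * ln(RSS / n) + 2k, dropping the additive constant,
    so values are only comparable across models fitted to the same data.
    """
    n = len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    if rss == 0:
        raise ValueError("RSS is zero; the log term is undefined")
    return n * math.log(rss / n) + 2 * num_params


# Hypothetical toy targets and predictions from two candidate models.
y_true = [10.0, 12.0, 15.0, 19.0, 24.0]
candidates = {
    # name: (predictions, number of fitted parameters) -- illustrative only
    "model_a": ([11.0, 12.5, 14.0, 18.0, 25.0], 2),
    "model_b": ([10.2, 11.9, 15.1, 18.8, 24.1], 4),
}

scores = {
    name: aic_least_squares(y_true, preds, k)
    for name, (preds, k) in candidates.items()
}
best = min(scores, key=scores.get)  # lower AIC is preferred
print(scores, best)
```

Lower AIC is better, and the 2k term penalises extra parameters, so a model with more features only wins if it reduces the residual error enough to offset the penalty.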

Challenges we ran into

We did not know one another before this Datathon and were not able to meet in person because of our packed timetables. Many of us also had no experience in building a machine learning model. As a result, it was difficult at first to find a time to discuss and split the work, which slowed our progress and made it feel as though there was not enough time to complete the project.

Accomplishments that we're proud of

We managed to come up with a model and to understand the dataset in various ways. We also made friends with one another along the way, learnt new things, and gained perspective from one another.

What we learned

We learnt to use a range of functions and packages, such as pandas and sklearn. Along the way we also learnt about different models, including neural networks, random forests, linear and multiple linear regression, and gradient boosting, as well as several ways of evaluating a regression model.
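
For readers new to regression evaluation, the sketch below computes three widely used metrics (MAE, RMSE, and R²) in plain Python. It is only an illustration: the write-up does not say which metrics were used besides AIC, and the function name `regression_metrics` and the sample numbers are hypothetical. In practice, `sklearn.metrics` provides `mean_absolute_error`, `mean_squared_error`, and `r2_score` for the same purpose.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    rmse = math.sqrt(ss_res / n)               # root mean squared error

    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot                   # coefficient of determination

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

MAE and RMSE are in the same units as the target (here, USD of sales), with RMSE weighting large errors more heavily, while R² is unitless and reports the share of variance explained.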

The requirements for running the notebooks are not listed in a separate txt file because they are stated at the start of the notebook.
