Inspiration
We are a competitive team, and the fact that the challenge featured a real-time leaderboard motivated us to take part. Our previous knowledge of machine learning techniques inspired us to research such methods and come up with a solution to the problem.
What it does
Our approach uses state-of-the-art machine learning techniques to forecast next season's demand for every product planned for sale, and from that forecast it estimates how many units to produce.
How we built it
We built an ensemble of several state-of-the-art models for this task: XGBoost, LightGBM and a feedforward neural network (FNN). Each model was trained on one subset of the provided training data and validated on another, and we took to the test set only the models with the best parameter combinations. A training parameter controls a crucial trade-off, slightly favoring over-production over under-production. Items in the test set that the models have never seen before can still be forecast because we use image embeddings in training, which lets the models base those forecasts on what they know about similar past products. The ensemble itself is simple: for each sample, it averages the predictions of the individual models.
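The averaging step can be sketched in a few lines. This is only a minimal illustration of a per-sample mean, not our actual pipeline: the function name `ensemble_average`, the dictionary keys and the forecast numbers are all made up for the example.

```python
from statistics import fmean


def ensemble_average(predictions_by_model):
    """Return the per-sample mean of the predictions of several models.

    `predictions_by_model` maps a model name to its list of forecasts,
    one forecast per sample, with all lists in the same sample order.
    """
    model_predictions = list(predictions_by_model.values())
    if not model_predictions:
        raise ValueError("at least one model is required")
    n_samples = len(model_predictions[0])
    if any(len(p) != n_samples for p in model_predictions):
        raise ValueError("all models must predict the same number of samples")
    # zip(*...) regroups the forecasts so each tuple holds one sample's
    # predictions from every model; fmean averages that tuple.
    return [fmean(sample) for sample in zip(*model_predictions)]


# Hypothetical demand forecasts for three products (illustrative numbers only).
predictions = {
    "xgboost": [100.0, 40.0, 10.0],
    "lightgbm": [110.0, 35.0, 12.0],
    "fnn": [90.0, 45.0, 8.0],
}
blended = ensemble_average(predictions)
```

A plain mean gives every model the same weight, so when one model overshoots a sample and another undershoots it, part of the error cancels out.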
Challenges we ran into
We ran a lot of experiments with different methods, and it was a real struggle to reach the final combination that gave the best results. Raising our score from 54% to 55% was the most difficult part. The key to achieving it was an ensemble, which lets the errors of the individual models counteract one another.
Accomplishments that we're proud of
We came into the challenge without much experience or knowledge of time series and forecasting. A lot of research was necessary, and we learned a lot from it. It all came out really well, since we ended up in first position on the leaderboard.
What we learned
We learned a number of state-of-the-art methods and models used for forecasting and tabular data. XGBoost was key: it was our starting point for a high-performance model, and we improved on it incrementally by trying other methods such as LightGBM. We also had to design in-house customizations to the system, for example to favor over-production.
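One common way to favor over-production is an asymmetric loss that penalizes under-prediction more heavily than over-prediction. The sketch below is an assumption for illustration, not necessarily the customization we used: the function name and the `under_weight` parameter are hypothetical. It returns the gradient and Hessian of a weighted squared error, which is the pair of quantities that gradient-boosting libraries expect from a custom objective.

```python
def asymmetric_squared_error(y_true, y_pred, under_weight=2.0):
    """Gradient and Hessian of an asymmetric squared error.

    Per-sample loss: w * (y_pred - y_true) ** 2, where w = under_weight
    when the model under-predicts (y_pred < y_true) and w = 1 otherwise.
    With under_weight > 1, a shortfall costs more than a surplus.
    """
    grads, hessians = [], []
    for target, prediction in zip(y_true, y_pred):
        residual = prediction - target
        weight = under_weight if residual < 0 else 1.0
        grads.append(2.0 * weight * residual)  # first derivative w.r.t. prediction
        hessians.append(2.0 * weight)          # second derivative w.r.t. prediction
    return grads, hessians


# Illustrative numbers: the same absolute error of 2 units in each direction.
grads, hessians = asymmetric_squared_error([10.0, 10.0], [8.0, 12.0])
```

With `under_weight=2.0`, the under-predicted sample produces a gradient twice as large in magnitude as the over-predicted one, so training pushes forecasts upward more strongly than downward.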
What's next for mas_diez_por_ciento
There are still some experiments we would like to try in order to improve the test accuracy, especially on unseen products.