Our entire team enjoys video games, data analysis, and gambling.
What it does
Our project provides a formal description of the mechanisms governing a player's loot box rewards in Genshin Impact. We examine how the stated probability distribution of legendary rewards differs from the one that we observed, and we construct a mixture model that closely aligns with the experimental results.
How we built it
We built this project in Python using a variety of common data science packages and techniques. We began with exploratory data analysis, where we immediately noticed discrepancies between the stated in-game mechanisms and what we observed. Next, we considered which models could fit this data without overfitting. We found that a mixture of a geometric distribution and a Gaussian distribution fit the data remarkably well.
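To make the idea of a geometric-plus-Gaussian mixture concrete, here is a minimal sketch of how such a probability mass function over draw counts could be written. It is not our fitted model: the function names, the cap of 90 draws, and every parameter value are assumptions chosen only for illustration.

```python
import math

def geometric_pmf(k, p):
    """Probability that the first success lands on draw k, given per-draw chance p."""
    return p * (1.0 - p) ** (k - 1)

def gaussian_weight(k, mu, sigma):
    """Unnormalized Gaussian density evaluated at the integer draw count k."""
    return math.exp(-0.5 * ((k - mu) / sigma) ** 2)

def mixture_pmf(k_max, w, p, mu, sigma):
    """Return probabilities for draw counts 1..k_max as a list.

    w is the share of the geometric component. Each component is
    renormalized on the finite support before the two are mixed.
    """
    ks = range(1, k_max + 1)
    geo = [geometric_pmf(k, p) for k in ks]
    gau = [gaussian_weight(k, mu, sigma) for k in ks]
    geo_total, gau_total = sum(geo), sum(gau)
    return [w * g / geo_total + (1.0 - w) * h / gau_total
            for g, h in zip(geo, gau)]

# Hypothetical values for illustration only, not the parameters we fitted.
pmf = mixture_pmf(k_max=90, w=0.4, p=0.006, mu=77.0, sigma=3.0)
```

The geometric component models a constant per-draw chance, while the Gaussian component captures a localized "bump" of rewards at higher draw counts. Renormalizing each component on the finite support keeps the mixture a valid distribution.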
Challenges we ran into
We spent a significant amount of time finding good model parameters, largely because fitting a mixture of two very different distributions poses unique challenges. Additionally, because the problem was so open-ended, we had to research contemporary theories and philosophies surrounding Genshin Impact to find a meaningful question to explore.
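One standard way to estimate parameters for a mixture like this is expectation-maximization (EM). The sketch below is an illustration of that general technique, not a record of the procedure we used. The starting values, the floor on sigma, and the synthetic data are all assumptions, and the sketch ignores any cap on the number of draws.

```python
import math
import random

def em_fit(data, n_iter=100):
    """Estimate (w, p, mu, sigma) for a geometric + Gaussian mixture via EM.

    w is the share of the geometric component.
    """
    n = len(data)
    ordered = sorted(data)
    # Crude starting values: assumptions for this sketch only.
    w, p = 0.5, 1.0 / (sum(data) / n)
    mu, sigma = float(ordered[n // 2]), 5.0
    for _ in range(n_iter):
        # E-step: responsibility of the geometric component for each draw count.
        resp = []
        for k in data:
            geo = w * p * (1.0 - p) ** (k - 1)
            gau = ((1.0 - w) * math.exp(-0.5 * ((k - mu) / sigma) ** 2)
                   / (sigma * math.sqrt(2.0 * math.pi)))
            resp.append(geo / (geo + gau))
        # M-step: weighted maximum-likelihood updates for each component.
        r_geo = sum(resp)
        r_gau = n - r_geo
        w = r_geo / n
        p = r_geo / sum(r * k for r, k in zip(resp, data))
        mu = sum((1.0 - r) * k for r, k in zip(resp, data)) / r_gau
        var = sum((1.0 - r) * (k - mu) ** 2 for r, k in zip(resp, data)) / r_gau
        sigma = max(math.sqrt(var), 0.5)  # floor keeps the bump from collapsing
    return w, p, mu, sigma

# Synthetic data with made-up parameters, used only to exercise the sketch.
random.seed(0)
data = []
for _ in range(5000):
    if random.random() < 0.4:
        u = random.random()
        # Inverse-transform sample from a geometric distribution with p = 0.02.
        data.append(int(math.log(1.0 - u) / math.log(1.0 - 0.02)) + 1)
    else:
        data.append(max(1, round(random.gauss(77.0, 3.0))))

w_hat, p_hat, mu_hat, sigma_hat = em_fit(data)
```

Because the two components have very different shapes, the result can be sensitive to the starting values. Seeding the Gaussian mean at the sample median is one simple way to start it near the bump.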
Accomplishments that we're proud of
We are proud of how accurately our model explains in-game behaviors.
What we learned
We developed a thorough understanding of the gacha system in Genshin Impact and learned how data science techniques can provide actionable insight to players of the game.
What's next for Mystery Boxes and Mixture Models
We could certainly improve the model score: the Gaussian assumption was not perfectly correct, since the "bump" distribution was not quite symmetric. Furthermore, we made little use of the data on draws per player per box. For example, we could have analyzed the relationship between draw time and how frequently a player accesses a particular box.