Our entire team enjoys video games, data analysis, and gambling.

What it does

Our project provides a formal description of the mechanisms governing a player's loot box rewards in Genshin Impact. We examine how the stated probability distribution of legendary rewards differs from the one that we observed, and we construct a mixture model that closely aligns with the experimental results.

How we built it

We built this project in Python using a variety of common data science packages and techniques. We began with exploratory data analysis, where we immediately noticed discrepancies between the stated in-game mechanisms and what we observed. Next, we considered which kinds of models could fit this data without overfitting. We found that a mixture of a geometric distribution and a Gaussian distribution fits the data remarkably well.
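To make the shape of such a model concrete, here is a minimal sketch of a geometric-plus-Gaussian mixture over the number of draws needed to get a legendary reward. It is an illustration under stated assumptions, not our fitted model: the function names, the cap `K_MAX = 90`, and every parameter value are hypothetical, and both components are renormalized over `1..K_MAX` so the mixture sums to one.

```python
import math

def geometric_pmf(k, p):
    """P(first success on draw k) for a constant per-draw success rate p."""
    return p * (1.0 - p) ** (k - 1)

def discretized_gaussian_pmf(k, mu, sigma, k_max):
    """Gaussian density evaluated at integer k, renormalized over 1..k_max."""
    def density(x):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    total = sum(density(j) for j in range(1, k_max + 1))
    return density(k) / total

def mixture_pmf(k, w, p, mu, sigma, k_max):
    """Weight w on the truncated geometric part, 1 - w on the Gaussian 'bump'."""
    geom_mass = 1.0 - (1.0 - p) ** k_max  # geometric mass that falls on 1..k_max
    geom = geometric_pmf(k, p) / geom_mass
    bump = discretized_gaussian_pmf(k, mu, sigma, k_max)
    return w * geom + (1.0 - w) * bump

# Hypothetical values chosen only for illustration (not fitted parameters).
K_MAX = 90
pmf = [
    mixture_pmf(k, w=0.4, p=0.006, mu=77.0, sigma=4.0, k_max=K_MAX)
    for k in range(1, K_MAX + 1)
]
print(f"total probability: {sum(pmf):.6f}")
```

The geometric part models a constant chance on every draw, while the Gaussian part captures a cluster of rewards concentrated around a particular draw count.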

Challenges we ran into

We spent a significant amount of time finding good model parameters, in large part because of the unique challenges that arise when fitting a mixture of two very different distributions. Additionally, because the problem was so open-ended, we had to research contemporary theories and philosophies surrounding Genshin Impact to find a meaningful question to explore.
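One standard way to estimate the parameters of a mixture like this is expectation-maximization (EM). The sketch below is not necessarily the procedure we used; it is a minimal illustration run on synthetic data, and the helper names, the initialization from the lower and upper halves of the sorted data, and the "true" values used to generate samples are all assumptions. It also treats the Gaussian density at integer draw counts as an approximation to a discretized Gaussian.

```python
import math
import random

def geometric_pmf(k, p):
    return p * (1.0 - p) ** (k - 1)

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_mixture_em(data, n_iter=100):
    """EM for w * Geometric(p) + (1 - w) * Normal(mu, sigma) on draw counts >= 1."""
    # Crude initialization (an assumption): lower half ~ geometric, upper half ~ bump.
    ordered = sorted(data)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[half:]
    w = 0.5
    p = 1.0 / (sum(lower) / len(lower))
    mu = sum(upper) / len(upper)
    sigma = max(math.sqrt(sum((x - mu) ** 2 for x in upper) / len(upper)), 1.0)

    for _ in range(n_iter):
        # E-step: probability that each observation came from the geometric part.
        resp = []
        for k in data:
            g = w * geometric_pmf(k, p)
            n = (1.0 - w) * gaussian_pdf(k, mu, sigma)
            resp.append(g / (g + n))
        # M-step: weighted maximum-likelihood updates for each component.
        r_sum = sum(resp)
        n_sum = len(data) - r_sum
        w = r_sum / len(data)
        p = r_sum / sum(r * k for r, k in zip(resp, data))
        mu = sum((1.0 - r) * k for r, k in zip(resp, data)) / n_sum
        var = sum((1.0 - r) * (k - mu) ** 2 for r, k in zip(resp, data)) / n_sum
        sigma = max(math.sqrt(var), 1e-3)  # floor keeps the bump from collapsing
    return w, p, mu, sigma

# Synthetic data with made-up parameters, used only to exercise the sketch.
rng = random.Random(0)

def sample_geometric(p):
    # Inverse-transform sampling of a geometric variable on 1, 2, 3, ...
    return int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1

data = [sample_geometric(0.1) for _ in range(1000)]
data += [max(1, round(rng.gauss(77.0, 3.0))) for _ in range(1000)]

w_hat, p_hat, mu_hat, sigma_hat = fit_mixture_em(data)
print(f"w={w_hat:.3f} p={p_hat:.3f} mu={mu_hat:.2f} sigma={sigma_hat:.2f}")
```

The synthetic components here are well separated, which makes the fit easy. When the two components overlap heavily or the bump is asymmetric, the result depends much more on initialization, which is the kind of difficulty described above.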

Accomplishments that we're proud of

We are proud of how accurately our model explains in-game behavior.

What we learned

We developed a thorough understanding of the gacha system in Genshin Impact and learned how data science techniques can provide actionable insight to players of the game.

What's next for Mystery Boxes and Mixture Models

We could improve the model's score: the Gaussian assumption was not quite correct, since the "bump" in the distribution was not symmetric. Furthermore, we made little use of the per-player, per-box draw data. For example, we could have analyzed the relationship between draw time and how often a player accesses a particular box.
