Inspiration

The 2026 FIFA World Cup is shaping up to be unlike any before. Forty-eight teams will compete across three host countries, in environments that range from high altitude to extreme heat and humidity. As a soccer fan, I noticed that most previews and analyses focus on the same questions every cycle: Who are the strongest teams? Who will win the tournament?

What interested me much more was how the changes in format and context might reshape the game itself. How does expanded participation change competitive balance? How much does environment matter when teams play in unfamiliar conditions? How fragile are some teams once those variables are introduced?

That curiosity became the foundation of this project. Instead of building a traditional analytics dashboard or a single prediction model, I wanted to create a simulation sandbox that lets users explore possibilities. Something closer to a game, where users can think like a national team manager, a scout, or just a curious fan and ask their own questions. They can pit teams they consider strong contenders against each other and simulate the result, or even put a historical team on stage and let it compete in one of the stadiums for the 2026 World Cup.

What it does

FIFA 2026 Matches Sandbox is an interactive analysis and simulation app built on more than 150 years of international soccer data.

The app allows users to explore national teams across eras, understand how host cities and environmental conditions influence performance, analyze squad identity and momentum, and simulate matchups under custom scenarios.

It supports realistic exploration. For example, the sandbox can be used to preview plausible 2026 World Cup fixtures by simulating how teams might perform against expected opponents in specific host cities. It also encourages playful experimentation beyond real tournament brackets, through unlikely or near-impossible matchups between teams that would rarely meet in an official tournament, such as a historically dominant team facing a modern rising squad under unfamiliar conditions. As long as a team is in the dataset, you can create any matchup you want.

How I built it

FIFA 2026 Matches Sandbox was built end to end in Hex, using an integrated workflow that combines SQL for data preparation, Python for feature engineering and simulation, and interactive visualizations for interpretation. The foundation of the project is more than 150 years of international soccer data, covering nearly 49,000 matches across more than 330 national teams.

A major portion of the work focused on data preparation and standardization. Historical country names were unified across multiple political eras so performance history is attributed consistently. This required systematic cleanup across 34 renamed or restructured national entities, validation of tournament records, and manual verification of geographic coordinates to support accurate global mapping.
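As a rough illustration of the name unification step, here is a minimal Python sketch. The mapping entries, the field names `home_team` and `away_team`, and the helper names are assumptions for the example; the project's actual list of 34 entities and its SQL cleanup logic are not reproduced here.

```python
# Hypothetical mapping from historical team names to one canonical name.
# These three entries are illustrative only, not the project's real list.
CANONICAL_NAMES = {
    "Zaire": "DR Congo",
    "Burma": "Myanmar",
    "Swaziland": "Eswatini",
}


def canonical_name(name: str) -> str:
    """Return the canonical team name; unknown names pass through unchanged."""
    cleaned = name.strip()
    return CANONICAL_NAMES.get(cleaned, cleaned)


def normalize_match(match: dict) -> dict:
    """Return a copy of a match record with both team names canonicalized."""
    return {
        **match,
        "home_team": canonical_name(match["home_team"]),
        "away_team": canonical_name(match["away_team"]),
    }


# Made-up record, used only to show the transformation.
example = {"home_team": " Zaire ", "away_team": "Burma", "tournament": "Friendly"}
print(normalize_match(example))
```

Keeping the mapping in one table makes each attribution decision explicit and easy to review, which matters when a rename is a judgment call rather than a simple relabeling.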

To make results comparable across eras, I designed a tournament importance weighting system that categorizes roughly 190 distinct competition types into six tiers. Competitive matches such as World Cups and continental championships influence momentum and team strength more than friendlies. Match-centric records were then transformed into team-centric features, including rolling form, weighted momentum, and recent team identity based on each team’s most recent 30 matches.
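The weighting idea can be sketched as a tier-weighted average of points over a team's last 30 matches. The tier weights, the tournament-to-tier assignments, the default tier, and the use of 3/1/0 points are all assumptions chosen for illustration, not the values used in the app.

```python
# Assumed weights for the six tiers (tier 1 = most important).
TIER_WEIGHTS = {1: 1.0, 2: 0.85, 3: 0.7, 4: 0.55, 5: 0.4, 6: 0.25}

# Hypothetical tier assignments for a few competition types.
TOURNAMENT_TIER = {"FIFA World Cup": 1, "UEFA Euro": 2, "Friendly": 6}
DEFAULT_TIER = 5  # assumed fallback for competitions not listed above
WINDOW = 30       # most recent matches considered, as described in the text


def match_points(goals_for: int, goals_against: int) -> int:
    """Standard 3/1/0 points for a win, draw, or loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


def weighted_momentum(team_matches: list) -> float:
    """Tier-weighted average points per match over the last WINDOW matches.

    team_matches must be sorted oldest to newest; each item needs the keys
    'tournament', 'goals_for', and 'goals_against'. Result lies in [0, 3].
    """
    total_weight = 0.0
    total_points = 0.0
    for match in team_matches[-WINDOW:]:
        tier = TOURNAMENT_TIER.get(match["tournament"], DEFAULT_TIER)
        weight = TIER_WEIGHTS[tier]
        total_weight += weight
        total_points += weight * match_points(match["goals_for"], match["goals_against"])
    return total_points / total_weight if total_weight else 0.0
```

With these assumed weights, a World Cup win followed by a friendly loss scores 3.0 / 1.25 = 2.4, well above the unweighted average of 1.5, which is the intended effect of letting competitive matches count for more.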

To explicitly model uncertainty, I engineered features that feed directly into the simulator. A Fragility Index measures how dependent a team’s scoring output is on a single player, using weighted goal concentration with minimum thresholds. Highly fragile teams are treated as higher-variance squads during simulation. I also created a Clutch Resilience Score that combines shootout history with performance in one-goal matches to capture execution under pressure.
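One plausible reading of these two features, written as a small Python sketch. The threshold of 10 goals, the top-scorer-share definition of concentration, the 50/50 blend, and the neutral 0.5 fallback are assumptions for illustration; the exact formulas used in the app are not stated in this write-up.

```python
from collections import defaultdict

MIN_TEAM_GOALS = 10  # assumed minimum sample size before the index is reported


def fragility_index(goals, min_goals=MIN_TEAM_GOALS):
    """Share of a team's weighted goals scored by its top scorer, in [0, 1].

    goals is a list of (scorer, weight) pairs, where weight could be a
    tournament-tier weight. Returns None when the sample is too small.
    """
    if len(goals) < min_goals:
        return None
    weighted_by_scorer = defaultdict(float)
    for scorer, weight in goals:
        weighted_by_scorer[scorer] += weight
    total = sum(weighted_by_scorer.values())
    if total == 0:
        return None
    return max(weighted_by_scorer.values()) / total


def clutch_resilience(shootout_wins, shootouts, one_goal_wins, one_goal_matches,
                      shootout_weight=0.5):
    """Blend shootout win rate with win rate in matches decided by one goal.

    Teams with no history in a component get a neutral 0.5 for that component
    (an assumption made here so that missing data is neither rewarded nor punished).
    """
    shootout_rate = shootout_wins / shootouts if shootouts else 0.5
    one_goal_rate = one_goal_wins / one_goal_matches if one_goal_matches else 0.5
    return shootout_weight * shootout_rate + (1 - shootout_weight) * one_goal_rate
```

Returning `None` below the threshold, rather than a number, keeps a team with three recorded goals from looking maximally fragile just because one player happened to score two of them.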

The match simulator combines team strength, momentum, fragility, resilience, and venue effects into a Monte Carlo framework that produces probability distributions rather than single predictions. Environmental modeling for all 16 host cities accounts for altitude, heat, humidity, and stadium climate control so the same matchup can behave differently by location. Every result is paired with visual explanations such as momentum trends, team DNA plots, score probability heatmaps, and expected goal breakdowns, allowing users to see not only what the model predicts, but why.
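To make the Monte Carlo idea concrete, here is a simplified sketch using only the Python standard library. It assumes goals follow a Poisson distribution, that team strength and momentum have already been folded into each team's expected-goals rate, that fragility widens the spread of that rate, and that venue conditions act as a multiplier on it. The coefficients in `venue_factor`, the lognormal jitter, and all function names are hypothetical, and resilience is left out for brevity.

```python
import math
import random
from collections import Counter


def venue_factor(altitude_m, temperature_c, humidity_pct, climate_controlled):
    """Assumed scoring-rate multiplier for an unacclimatized team (0.5 to 1.0)."""
    penalty = 0.0
    if altitude_m > 1500:
        penalty += 0.05 * (altitude_m - 1500) / 1000
    if not climate_controlled and temperature_c > 28:
        penalty += 0.01 * (temperature_c - 28) * (1 + humidity_pct / 100)
    return max(0.5, 1.0 - penalty)


def sample_poisson(rng, lam):
    """Draw from a Poisson distribution (Knuth's method, fine for small rates)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def noisy_rate(rng, base_rate, fragility):
    """Jitter the scoring rate; higher fragility means a wider spread.

    The lognormal is centered so its mean stays at base_rate.
    """
    sigma = 0.5 * fragility  # assumed scaling
    return base_rate * rng.lognormvariate(-sigma ** 2 / 2, sigma)


def simulate_matchup(rate_a, rate_b, fragility_a=0.0, fragility_b=0.0,
                     n_sims=10_000, seed=0):
    """Return outcome probabilities and the five most common scorelines."""
    rng = random.Random(seed)
    outcomes, scorelines = Counter(), Counter()
    for _ in range(n_sims):
        goals_a = sample_poisson(rng, noisy_rate(rng, rate_a, fragility_a))
        goals_b = sample_poisson(rng, noisy_rate(rng, rate_b, fragility_b))
        scorelines[(goals_a, goals_b)] += 1
        if goals_a > goals_b:
            outcomes["win_a"] += 1
        elif goals_a == goals_b:
            outcomes["draw"] += 1
        else:
            outcomes["win_b"] += 1
    return {
        "win_a": outcomes["win_a"] / n_sims,
        "draw": outcomes["draw"] / n_sims,
        "win_b": outcomes["win_b"] / n_sims,
        "scorelines": {s: c / n_sims for s, c in scorelines.most_common(5)},
    }


# Example: a made-up high-altitude venue and made-up scoring rates.
factor = venue_factor(altitude_m=2200, temperature_c=24, humidity_pct=40,
                      climate_controlled=False)
result = simulate_matchup(rate_a=1.6 * factor, rate_b=1.1 * factor, fragility_a=0.7)
print(result["win_a"], result["draw"], result["win_b"])
```

Because the output is a set of probabilities and a scoreline distribution rather than one predicted score, the same matchup can be rerun with a different venue factor or fragility value to see how much the distribution shifts.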

Challenges I ran into

One of the biggest challenges was translating interpretation into visualization. It is easy to produce charts, but much harder to design visuals that clearly communicate ideas such as tactical identity, fragility, and environmental impact without overwhelming the user. Many visualizations went through multiple iterations before they felt intuitive and explainable.

Data preparation was another hurdle. Countries change names, teams represent different political entities across time periods, and geographic information is often inconsistent. Standardizing country names across decades of matches and verifying accurate coordinates required repeated validation and careful cross-checking.

Feature engineering was also a major challenge, especially deciding how to weight and combine different signals. Choices such as how much importance to give tournament tiers, how to balance recent matches against long-term history, and how to trade off environmental penalties against team strength required constant iteration. This process also shaped the Fragility Index, which relies on minimum thresholds and recent-match constraints to avoid short-term noise. Small changes in these weights or thresholds could noticeably alter simulation behavior, so I spent a significant amount of time stress-testing assumptions and refining them so that outcomes felt plausible and expressive rather than merely mechanically correct.

Finally, there was a design challenge in making all components of the app feel cohesive. Aligning maps, charts, controls, and simulations into a single flowing experience took far more effort than expected, but was necessary for the sandbox to feel usable rather than fragmented.

Accomplishments that I'm proud of

Completing and submitting this project as my first hackathon is something I am genuinely proud of. I learned Hex from scratch and used it to build an end-to-end application around a topic I am interested in.

What's next for FIFA 2026 Matches Sandbox

If I continue working on this project, the next step would be to expand the sandbox beyond single-match simulations into group-stage and full-tournament simulations, allowing users to explore how early results cascade through later rounds. I would also like to introduce lineup-level or player-availability scenarios to deepen the fragility modeling and better reflect real-world uncertainty.

From a technical perspective, improving load time and performance is a key priority. This includes optimizing data loading and visualization processing. On the design side, I want to continue refining the interaction flow so users can move more seamlessly between scouting teams, understanding context, and running simulations.

Ultimately, the goal is to evolve this into a living system where fans and analysts can explore the World Cup as a dynamic process, not just a fixed set of predictions.
