Inspiration

I've watched F1 for years and always wanted to analyze the data to see what actually contributes to a championship win. I used to assume it simply went to whoever had the fastest car or the most natural talent. Then the 2025 season happened. Oscar Piastri led the championship for 15 rounds. Max Verstappen clawed back a massive 104-point deficit in the second half of the year. Yet it was Lando Norris who took the title by just two points in a nail-biting, three-way finale in Abu Dhabi. That didn't track on pace alone, which finally gave me the perfect excuse to dive into the numbers and see what the data actually said.

What it does

Pitwall is a multi-agent AI pipeline that pulls 12 years of F1 data — every race, every round, 2014 to 2025 — and generates a data-backed, journalist-style story from the findings.

  • Agent 1 (Fetcher): Pulls race results and lap telemetry exclusively from FastF1 using Backboard's tool-calling loop. No reasoning, just structured retrieval.
  • Agent 2 (Analyst): Computes three metrics most F1 coverage ignores. The core one is finishing position standard deviation:

$$ \sigma_{pos} = \sqrt{\frac{1}{n} \sum_{i=1}^{n}(x_i - \bar{x})^2} $$

Lower variance means a more consistent driver. The second is a pressure trap metric — points per race before a driver takes the championship lead versus after. For Piastri and Norris in 2025, that number fluctuates sharply. The analyst runs on Backboard with memory="Auto", so follow-up queries build on prior findings rather than recomputing from scratch.

  • Agent 3 (Narrator): Powered by Gemini 1.5 Flash. It gets the structured findings plus hardcoded editorial context — the Qatar strategy blunders, the sprint format points gap, stewards' decision patterns — and writes a three-paragraph story. No jargon. Reads like a sports article, not a statistics report.
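The consistency and pressure-trap metrics above can be sketched in a few lines of pandas. This is a minimal illustration, not the actual analyst code — the column names (`driver`, `position`, `round`, `points`) and function names are assumptions:

```python
import pandas as pd

def position_consistency(results: pd.DataFrame) -> pd.Series:
    """Population standard deviation of finishing position per driver.
    ddof=0 matches the 1/n formula above. Lower = more consistent."""
    return results.groupby("driver")["position"].std(ddof=0)

def pressure_split(driver_results: pd.DataFrame, lead_round: int) -> pd.Series:
    """Points per race before vs. after the round a driver first takes
    the championship lead (the 'pressure trap' metric)."""
    before = driver_results[driver_results["round"] < lead_round]["points"].mean()
    after = driver_results[driver_results["round"] >= lead_round]["points"].mean()
    return pd.Series({"before": before, "after": after})
```

In the real pipeline these frames come out of the fetcher agent's FastF1 results; here they could be any tidy per-race table.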

How I built it

9 hours. Solo. MacBook Air M1.

I built the entire pipeline around FastF1, getting metrics 1 and 2 working within 90 minutes. I scoped the tool strictly to lap time telemetry and final results. I never loaded full telemetry channels — too slow, not needed.
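The scoping described above amounts to a narrow FastF1 call, roughly like this sketch (not the actual fetcher agent; the cache directory name is an assumption, and the cache folder must already exist):

```python
def fetch_round(year: int, rnd: int):
    """Pull only final results and lap times for one race -- no full
    telemetry channels, which is what keeps the fetch fast."""
    import fastf1  # deferred import: heavy module, assumed installed

    fastf1.Cache.enable_cache("f1_cache")  # assumed pre-created cache dir
    session = fastf1.get_session(year, rnd, "R")
    # Telemetry, weather, and race-control messages stay off;
    # laps=True is enough for lap-time analysis.
    session.load(laps=True, telemetry=False, weather=False, messages=False)
    return session.results, session.laps
```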

The Backboard integration took longer than it should have because I kept making the fetcher agent smarter than necessary. It doesn't need to reason. It needs to fetch. Figuring that out earlier would have saved real time.

Streamlit was last and stayed minimal on purpose. Two charts, the story, a run button.

Challenges I ran into

FastF1 needs 20 to 40 minutes to warm its cache for 12 seasons. I ran the warmer in a background terminal from the first minute and built everything else while it ran.
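The background warmer was essentially a loop over every season and round. A sketch under the same assumptions as above (exact round counts vary by season, so it simply tries each round and skips failures):

```python
def warm_cache(first_year: int = 2014, last_year: int = 2025, max_rounds: int = 24):
    """Touch every race session once so FastF1 populates its local cache."""
    import fastf1  # deferred import: assumed installed

    fastf1.Cache.enable_cache("f1_cache")  # same assumed cache directory
    for year in range(first_year, last_year + 1):
        for rnd in range(1, max_rounds + 1):
            try:
                session = fastf1.get_session(year, rnd, "R")
                session.load(laps=True, telemetry=False,
                             weather=False, messages=False)
            except Exception:
                # Season had fewer rounds, or the fetch failed -- move on.
                continue
```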

Getting Gemini to write without leaking statistical jargon into the output took more prompt iterations than expected. "Standard deviation" kept appearing. The final narrator prompt explicitly substitutes plain-English descriptions for every technical term. Small thing, but it's what makes the story readable to someone who's never watched a race.
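The substitution itself is plain string work applied to the findings before they reach the narrator prompt. The term map below is illustrative, not the full list from the project:

```python
# Illustrative subset of the plain-English term map (assumed phrasings).
PLAIN_ENGLISH = {
    "standard deviation": "how much a driver's finishing positions swing race to race",
    "variance": "inconsistency",
    "Pearson correlation": "how tightly two things move together",
}

def de_jargon(text: str) -> str:
    """Swap statistical terms for plain-English phrasing before prompting."""
    for term, plain in PLAIN_ENGLISH.items():
        text = text.replace(term, plain)
    return text
```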

The stewards' decision variance data was the hardest cut. It's the most narratively powerful variable in the whole dataset — same incident, different penalty depending on which four stewards are rostered that weekend. I couldn't compute it cleanly in the time available, so it went into the Gemini prompt as editorial context instead. It's still in the story. Just not as a number.

Accomplishments that I'm proud of

  • Speed of Execution: Going from zero to a fully functioning multi-agent AI pipeline in just 9 hours, solo, on a MacBook Air.
  • Finding Real Signal in the Noise: Proving statistically that consistency beats raw pace. Finding a Pearson correlation of $r = -0.73$ across the hybrid era confirmed the initial hypothesis wasn't just noise.
  • Taming Scope Creep: Successfully filtering 80+ candidate variables down to just the three metrics and four narrative facts that actually mattered for the final story.
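The correlation check is a single SciPy call. The numbers below are synthetic stand-ins, not the real per-driver values; they just illustrate the shape of the test:

```python
from scipy.stats import pearsonr

# Hypothetical per-driver values: finishing-position std-dev vs. final points.
sigma_pos = [1.2, 2.8, 4.1, 5.5, 6.3]
final_points = [410, 350, 240, 150, 90]

r, p_value = pearsonr(sigma_pos, final_points)
# A strongly negative r here means: less variance in finishing
# position goes with more championship points.
```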

What I learned

The fastest car wins qualifying. The most consistent driver wins the championship. That's what 12 seasons of data says when you correlate finishing position standard deviation against final standings rank.

I also learned that the hardest part of a data pipeline under time pressure isn't the code — it's deciding what to cut. Scope creep would have cost the demo without improving the story.

Backboard's agents and persistent memory were genuinely useful in helping me figure this stuff out on the fly. Because the analyst agent's thread carries context between runs, I could query it mid-build to test theories. It answered relative to the full hybrid era findings it had already processed, which is how I navigated the noise of all those variables to find the actual story without writing custom scripts for every single question.

What's next for Pitwall

The immediate next step is tackling the stewards' decision variance data. Since it's the most narratively powerful variable in the dataset, the goal is to build out the backend logic to compute it cleanly, replacing the hardcoded editorial context with dynamic, pipeline-generated data.
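One plausible shape for that computation, assuming a table of penalty decisions could be assembled — the schema here (`incident_type`, `penalty_seconds`) is entirely hypothetical:

```python
import pandas as pd

def penalty_variance(decisions: pd.DataFrame) -> pd.Series:
    """For each incident type, how much the penalty (in seconds) varies
    across different steward panels. High spread = inconsistent calls."""
    return decisions.groupby("incident_type")["penalty_seconds"].std(ddof=0)
```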

Built With

  • backboard
  • fastf1
  • gemini
  • pandas
  • python
  • react
  • scipy
  • streamlit
  • vite