Team USA Hometown Success Engine

Inspiration & Impact

We asked two questions: Where do America's Olympic and Paralympic athletes come from? And which sports might I excel at?

This dual question addresses real needs:

  • For Team USA: Understanding athlete distribution enables data-driven talent development—identifying emerging regions, allocating resources strategically, and building new athletic hubs
  • For fans & communities: Celebrates Olympic heritage, shows where hometown heroes originated, and helps individuals discover sports that match their physical profile
  • For aspiring athletes: Inspires them by showing that elite talent comes from communities just like theirs

How We Built It

Full-stack application on Google Cloud, combining geographic intelligence with personalized data-driven matching:

Architecture

  • Frontend: React + TypeScript with interactive Google Maps, three exploration modes (Map View, Explore, Compare), parallel Olympic and Paralympic analysis
  • Backend: Python Flask on Cloud Run (serverless, auto-scaling)
  • Data Engine: BigQuery with 10,000+ athletes across 86 sports, spanning historical and modern Olympics/Paralympics
  • AI Layer: Gemini powers regional storytelling, sport distribution analysis (with regional athlete breakdowns), and sport visual descriptions
  • Analytics Layer: Data-driven similarity matching pairs users with sports by finding the athletes most similar to their physical profile (height, weight) and location

Core Feature

Hometown Success Engine (Geographic Discovery)

  • Interactive map showing where Team USA talent concentrates
  • Filter by Olympic/Paralympic, explore by sport, or by state/city
  • Discover regional sports traditions and talent hubs
  • AI-generated regional insights via Gemini

Additional Feature

Find Your Matched Sports (Personalized Exploration)

  • Optional: Enter height, weight, and hometown to discover personalized sport matches
  • Algorithmic matching based on real athlete body characteristics in your region
  • Adds a personalization layer to complement geographic insights

What We Learned

1. Agentic AI & Generative AI Integration

  • Combining structured data queries (BigQuery) with generative storytelling (Gemini) creates powerful user experiences
  • Prompt engineering is critical for reducing hallucinations and maintaining factual accuracy
  • AI agents work best when given clear constraints and specific data context
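
One way to give the model clear constraints and specific data context is to build the prompt directly from query results. The following is a minimal sketch, not the project's actual prompt: the helper name `build_region_prompt` and the row fields `sport` and `athlete_count` are hypothetical, and the call to Gemini itself is left out.

```python
def build_region_prompt(region: str, rows: list[dict]) -> str:
    """Build a prompt that restricts the model to the supplied counts.

    `rows` is assumed to look like [{"sport": "Swimming", "athlete_count": 12}, ...];
    both field names are hypothetical.
    """
    # Put the most common sports first so they lead the context.
    ordered = sorted(rows, key=lambda r: r["athlete_count"], reverse=True)
    facts = "\n".join(
        f"- {r['sport']}: {r['athlete_count']} athletes" for r in ordered
    )
    return (
        f"You are writing a short regional sports summary for {region}.\n"
        "Use ONLY the facts listed below. Do not add names, medals, or numbers "
        "that are not in the list. If the list is empty, say no data is available.\n"
        f"Facts:\n{facts if facts else '- (none)'}"
    )
```

Keeping the facts in a plain list makes it easy to check the generated story against the numbers that were actually sent.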

2. Data Quality & Normalization

  • Geographic inconsistencies required careful mapping (spelling variations, historical name changes)
  • Sport classification across 80+ Olympic disciplines demanded meticulous data cleaning
  • Handling missing values (height/weight) required intelligent fallback logic
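
The fallback for missing height and weight could take several forms. As a hedged illustration only, the sketch below fills a missing value with the median for the athlete's sport and falls back to the overall median when the sport has no known values. The fallback order, the function name `fill_missing`, and the field names are assumptions, not a description of the deployed logic.

```python
from statistics import median


def fill_missing(athletes: list[dict], field: str) -> list[dict]:
    """Return copies of the records with `field` filled where it was None.

    Assumed fallback order: median for the same sport, then the median over
    all athletes. A value stays None if no athlete has that field at all.
    """
    known = [a[field] for a in athletes if a[field] is not None]
    overall = median(known) if known else None

    # Group the known values by sport.
    by_sport: dict[str, list[float]] = {}
    for a in athletes:
        if a[field] is not None:
            by_sport.setdefault(a["sport"], []).append(a[field])

    filled = []
    for a in athletes:
        row = dict(a)  # copy so the input records are not modified
        if row[field] is None:
            values = by_sport.get(row["sport"])
            row[field] = median(values) if values else overall
        filled.append(row)
    return filled
```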

3. Multiple Perspectives, Multiple Needs

  • Map View (geographic exploration) + Compare Mode (regional benchmarking) + Personalized Matching (individual discovery) serve complementary user needs
  • Different users engage differently—some want exploration, others want comparison, others want personalization

4. Google Cloud Integration

  • A seamless connection between BigQuery (data), Gemini (AI storytelling), Google Maps (visualization), and Cloud Run (deployment) made this scale possible
  • BigQuery's analytical power makes querying 10,000+ records fast and efficient
  • Google Maps API enables interactive geographic exploration at scale
  • Cloud Run's serverless nature eliminates infrastructure overhead

Technical Challenges & How We Solved Them

Challenge 1: Geographic Data Inconsistencies

  • Athlete hometown data had spelling variations, abbreviations, and missing values
  • Solution: Normalized city/state naming, dropped records with missing sport classification, validated against known US geography
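
The normalization step can be sketched roughly as follows. The abbreviation table is a tiny illustrative subset, the function name `normalize_hometown` is hypothetical, and returning `None` for records that cannot be validated is an assumption about how such rows would be handled.

```python
import re

# Illustrative subset only; a real table would cover every state and territory.
STATE_ABBREVIATIONS = {
    "california": "CA",
    "calif": "CA",
    "texas": "TX",
    "tex": "TX",
    "new york": "NY",
}
VALID_STATES = set(STATE_ABBREVIATIONS.values())


def normalize_hometown(city, state):
    """Return (city, state_code), or None when the record cannot be validated."""
    if not city or not state:
        return None
    # Drop periods, collapse whitespace, and lower-case before the lookup.
    key = re.sub(r"\s+", " ", state.replace(".", "")).strip().lower()
    # Unknown keys are assumed to already be two-letter codes.
    code = STATE_ABBREVIATIONS.get(key, key.upper())
    if code not in VALID_STATES:
        return None
    clean_city = re.sub(r"\s+", " ", city).strip().title()
    return clean_city, code
```

Note that `str.title()` is a simplification: it would turn a name such as "McAllen" into "Mcallen", so a production version would need a lookup of known city spellings.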

Challenge 2: Balancing Olympic & Paralympic Data

  • Paralympic data is sparse compared to Olympic records
  • Solution: Separate Olympic and Paralympic data streams allow users to explore each independently

Challenge 3: Low Prediction Accuracy with Naive Approach

  • Initial attempt at direct sport classification (Random Forest) achieved only 23.7% validation accuracy
  • Solution: Pivoted to similarity matching instead—find athletes most similar to the user, return their sports
  • This is more interpretable, more honest about data limitations, and more engaging for users
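
The similarity approach can be illustrated with a small nearest-neighbour sketch: find the athletes closest to the user in height and weight, then return their sports. The range scaling, the default `k`, and the field names `height_cm`, `weight_kg`, and `sport` are assumptions made for this example rather than details of the deployed algorithm.

```python
import math
from collections import Counter


def match_sports(user_height, user_weight, athletes, k=5):
    """Return the sports of the k most similar athletes, most frequent first."""
    if not athletes:
        return []
    heights = [a["height_cm"] for a in athletes]
    weights = [a["weight_kg"] for a in athletes]
    # Scale each feature by its range so centimetres and kilograms are comparable.
    h_range = (max(heights) - min(heights)) or 1.0
    w_range = (max(weights) - min(weights)) or 1.0

    def distance(a):
        # Euclidean distance in the scaled (height, weight) space.
        return math.hypot(
            (a["height_cm"] - user_height) / h_range,
            (a["weight_kg"] - user_weight) / w_range,
        )

    nearest = sorted(athletes, key=distance)[:k]
    counts = Counter(a["sport"] for a in nearest)
    return [sport for sport, _ in counts.most_common()]
```

Because the result is just the sports of real neighbouring athletes, each suggestion can be explained by pointing at the athletes behind it, which is what makes this approach easier to interpret than a classifier score.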

Challenge 4: Handling Incomplete User Input

  • Users might only provide height, or only city, or nothing at all
  • Solution: Implemented tiered matching logic:
    • Height + weight + city: Combined Euclidean distance + geographic weighting
    • Only city/state: Random sample from that region
    • City not found: Expand to state level
    • No input: Random from all sports
  • This graceful degradation ensures the feature works regardless of how much info users provide
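
The tiers above can be sketched as a single dispatcher that picks the most specific strategy the input allows. The geographic weights (0.5 for the same city, 0.75 for the same state), the tier labels, the function name `pick_sports`, and the record fields are all illustrative assumptions. The sketch also assumes that a profile with only height or only weight falls through to the location tiers.

```python
import math
import random


def pick_sports(athletes, height=None, weight=None, city=None, state=None,
                n=3, rng=random):
    """Return (tier_name, sports) using the most specific tier available."""
    local = [a for a in athletes
             if city and a["city"] == city and a["state"] == state]
    regional = [a for a in athletes if state and a["state"] == state]

    def unique_sports(rows):
        return sorted({a["sport"] for a in rows})

    def sample(sports):
        return rng.sample(sports, min(n, len(sports)))

    if height is not None and weight is not None:
        # Tier 1: Euclidean distance, discounted for nearby athletes.
        def score(a):
            d = math.hypot(a["height_cm"] - height, a["weight_kg"] - weight)
            if city and a["city"] == city and a["state"] == state:
                return d * 0.5   # same city (assumed weight)
            if state and a["state"] == state:
                return d * 0.75  # same state (assumed weight)
            return d

        sports = []
        for a in sorted(athletes, key=score):
            if a["sport"] not in sports:
                sports.append(a["sport"])
        return "physical+geo", sports[:n]
    if local:
        # Tier 2: random sample from the user's city.
        return "city", sample(unique_sports(local))
    if regional:
        # Tier 3: city not found, so widen to the state.
        return "state", sample(unique_sports(regional))
    # Tier 4: no usable input, so pick from all sports.
    return "random", sample(unique_sports(athletes))
```

Returning the tier name alongside the sports lets the frontend tell users how their matches were produced.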

Achievements

✅ Production-ready platform serving dual stakeholders
✅ Combines geographic analysis (10,000+ athletes in BigQuery) + personalized exploration (similarity algorithm) + AI storytelling (Gemini)
✅ Supports 86 sports across 10,000+ athletes and 3,000+ cities
✅ Graceful degradation: Sport matching works with any level of user input (full profile, partial, or none)
✅ Deployed on Google Cloud with auto-scaling capability

Team USA gains strategic intelligence for athlete development; community members discover their connection to Olympic legacy and find sports aligned with their potential.

Future Improvements

  1. Richer Athlete Profiles: Add age, weight class, sport-specific metrics (jumping height, throwing distance) to improve matching accuracy
  2. Advanced Analytics: Correlation analysis between elevation/climate and sport success; identify regional talent pipelines
  3. Competitive Intelligence: Compare talent distribution with other Olympic nations; benchmark regional performance
  4. Community Engagement: Historical Olympic medals by hometown; predict future medal prospects by region
  5. Broader Data Integration: Include youth Olympic records, national championships, training facility locations
  6. Business Development:
    • Venue booking & facility integration
    • Local coaching & training marketplace
    • Sports community, events, and pickup game discovery