Inspiration

Prediction markets have exploded in popularity, becoming an unexpected source of truth in our information landscape. From presidential elections to AI model performance, people are literally betting money on their beliefs—creating a fascinating dataset of collective intelligence.

But here's the problem: with thousands of prediction markets operating simultaneously, which ones actually matter? Which markets are liquid enough to trust? Which ones move quickly enough to serve as real-time sentiment signals?

We were inspired by the question: Can we build an AI-powered system that cuts through the noise and identifies which prediction markets truly reflect what people think? The Sentiment Signal was born from this challenge.

What it does

The Sentiment Signal is a comprehensive prediction market analytics platform built entirely in Hex that transforms raw betting data into actionable intelligence.

Core Features:

  1. Market Influence Ranking System

    • Analyzes markets using a proprietary scoring algorithm based on volume (40%), liquidity (30%), and open interest (30%)
    • Identifies the top prediction markets that drive real sentiment and engagement
    • Tracks market efficiency through bid-ask spread analysis
  2. Interactive Analytics Dashboard

    • Real-time filtering by date range, market category, and metrics
    • Dynamic visualizations showing price movements, volume trends, and correlation patterns
    • Single-value KPI cards highlighting key statistics at a glance
  3. Machine Learning Price Predictor

    • Random Forest model trained on market dynamics (volume, liquidity, volatility)
    • Predicts price movements from these features, with accuracy checked through cross-validation
    • Feature importance analysis reveals what drives market sentiment
  4. Market Efficiency Metrics

    • Liquidity scoring system
    • Volatility tracking
    • Bid-ask spread analysis for market health assessment
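
To make the 40/30/30 weighting concrete, here is a minimal sketch of how such a composite score could be computed. Only the weights come from the description above. The min-max normalization, the function names, and the sample rows are assumptions made for illustration, not the project's actual implementation.

```python
# Weights stated in the write-up; the min-max normalization is an assumption.
WEIGHTS = {"volume": 0.4, "liquidity": 0.3, "open_interest": 0.3}


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def influence_scores(markets):
    """Return {ticker: score} as a weighted sum of normalized metrics."""
    normalized = {
        metric: min_max([m[metric] for m in markets]) for metric in WEIGHTS
    }
    scores = {}
    for i, market in enumerate(markets):
        scores[market["ticker"]] = sum(
            WEIGHTS[metric] * normalized[metric][i] for metric in WEIGHTS
        )
    return scores


# Made-up example rows, not real Kalshi data.
markets = [
    {"ticker": "A", "volume": 1000, "liquidity": 500, "open_interest": 200},
    {"ticker": "B", "volume": 400, "liquidity": 900, "open_interest": 100},
    {"ticker": "C", "volume": 100, "liquidity": 100, "open_interest": 300},
]
scores = influence_scores(markets)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Normalizing each metric before weighting keeps a large raw volume figure from swamping the other two components. Any other scaling, such as percentile ranks, would slot into the same structure.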

The platform answers critical questions: Which markets should traders watch? How quickly do markets react to news? Can we predict sentiment shifts before they happen?

How we built it

Technology Stack:

  • Platform: Hex (notebook, semantic modeling, and Threads)
  • Data Source: Kalshi prediction market data (VIRTUAL_HACKATHON.KALSHI.TOP_LLM_PREDICTIONS)
  • Languages: SQL for data extraction, Python for analytics
  • ML Framework: scikit-learn
  • Visualization: Hex native charts with interactive inputs

Development Process:

Phase 1: Data Engineering

  • Wrote SQL queries to extract and flatten nested JSON fields (PRICE, YES_ASK, YES_BID)
  • Calculated derived metrics: bid-ask spread, price volatility, liquidity scores
  • Created clean, analysis-ready datasets with proper data types
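
As a rough illustration of the derived metrics mentioned above, the sketch below computes a bid-ask spread and a simple volatility figure in plain Python. The exact definitions the project used are not stated, so treating volatility as the standard deviation of consecutive price changes is an assumption, as are the function names and sample quotes.

```python
from statistics import pstdev


def bid_ask_spread(yes_bid, yes_ask):
    """Spread in the same units as the quotes; None if either side is missing."""
    if yes_bid is None or yes_ask is None:
        return None
    return yes_ask - yes_bid


def price_volatility(prices):
    """Population standard deviation of consecutive price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    if len(changes) < 2:
        return 0.0
    return pstdev(changes)


# Made-up quotes in cents, for illustration only.
spread = bid_ask_spread(yes_bid=42, yes_ask=45)
missing = bid_ask_spread(yes_bid=None, yes_ask=45)
volatility = price_volatility([40, 42, 41, 45])
```

Returning None for a missing quote, rather than zero, keeps one-sided markets from looking perfectly efficient in later aggregations.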

Phase 2: Feature Engineering & Analytics

  • Developed the influence scoring algorithm through iterative testing
  • Built correlation matrices to identify market relationships
  • Performed time-series analysis to detect price movement patterns
  • Created market efficiency metrics combining multiple dimensions

Phase 3: Machine Learning

  • Feature selection based on domain knowledge and correlation analysis
  • Trained Random Forest model with cross-validation
  • Tuned hyperparameters for optimal prediction accuracy
  • Generated feature importance visualizations to explain model decisions
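
The training and explanation steps above might look roughly like the following scikit-learn sketch. It runs on synthetic data, and the choice of a regressor, the hyperparameter values, and the target definition are all assumptions. The write-up does not specify them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the market features named in the write-up.
rng = np.random.default_rng(0)
n = 300
feature_names = ["volume", "liquidity", "volatility"]
X = rng.random((n, 3))
# Synthetic target that depends mostly on liquidity, plus noise.
y = 0.2 * X[:, 0] + 0.7 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.05, n)

# Hyperparameter values here are illustrative, not the tuned ones.
model = RandomForestRegressor(n_estimators=50, max_depth=5, random_state=0)

# 5-fold cross-validated R^2, one score per fold.
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")

# Fit on all rows to read off impurity-based feature importances.
model.fit(X, y)
importances = dict(zip(feature_names, model.feature_importances_))
```

Because the synthetic target is built mostly from the liquidity column, that feature should dominate `importances`. On real data, the same dictionary is what a feature importance chart would plot.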

Phase 4: Interactive Dashboard

  • Designed user-friendly layout with logical flow
  • Implemented 5+ interactive input parameters (date ranges, market selectors, metric toggles)
  • Used Jinja templating for dynamic SQL queries
  • Created calculation cells for real-time KPI computation

Key Hex Features Leveraged:

  • SQL cells for data extraction
  • Python cells for analysis and ML
  • Chart cells (line, scatter, bar, heatmap)
  • Input cells (dropdown, multiselect, date range, slider)
  • Calculation cells for dynamic metrics
  • Jinja for parameterized queries
  • Semantic modeling for AI layer
  • Threads for conversational interface
  • Notebook Agent for code assistance

Challenges we ran into

1. Nested JSON Data Structure

  • The PRICE, YES_ASK, and YES_BID fields were stored as JSON objects
  • Solution: Used Snowflake's JSON parsing syntax (:field::TYPE) to extract and cast values properly
  • Learned to handle null values in nested structures gracefully
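
The same null-safe extraction idea can be sketched in Python for cases where the JSON reaches a Python cell unparsed. The field name `value` is hypothetical, since the write-up does not list the keys inside these objects, and the helper is illustrative rather than the project's code.

```python
import json


def extract_number(raw, field):
    """Return raw[field] as a float, or None if anything is missing or malformed."""
    if raw is None:
        return None
    try:
        obj = json.loads(raw) if isinstance(raw, str) else raw
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    value = obj.get(field)
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# "value" is a hypothetical key used only for this example.
parsed = extract_number('{"value": "0.42"}', "value")
null_field = extract_number('{"value": null}', "value")
null_column = extract_number(None, "value")
bad_json = extract_number("not json", "value")
```

Every failure mode collapses to None, so downstream calculations only need to handle a single kind of missing value.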

2. Defining "Influence" Objectively

  • Initially struggled to quantify what makes a market "influential"
  • Solution: Researched financial market metrics and adapted concepts like liquidity and efficiency
  • Developed a composite scoring system that balanced multiple factors
  • Validated results against domain knowledge (e.g., major company markets ranking higher)

3. Making Interactive Inputs Work Seamlessly

  • Jinja syntax for dynamic SQL queries took time to master
  • Ensuring that all visualizations updated correctly when inputs changed required careful testing
  • Solution: Studied Hex's Jinja documentation thoroughly
  • Used the inclause filter for multi-select parameters
  • Tested all input combinations to ensure no edge case errors
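
For readers unfamiliar with in-clause style filters, the sketch below shows the underlying idea in plain Python: a list of selected values is expanded into bound placeholders instead of being pasted into the SQL string. This is not Hex's implementation. The helper name, the table, and the use of SQLite as a stand-in for the warehouse are all assumptions.

```python
import sqlite3


def in_clause(values):
    """Return a '(?, ?, ...)' placeholder string and the matching parameter list."""
    values = list(values)
    if not values:
        # An empty IN () is invalid SQL; IN (NULL) matches no rows instead.
        return "(NULL)", []
    return "(" + ", ".join("?" for _ in values) + ")", values


# In-memory table with made-up rows, standing in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE markets (ticker TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO markets VALUES (?, ?)",
    [("A", "politics"), ("B", "ai"), ("C", "sports")],
)

selected = ["politics", "ai"]  # stand-in for a multiselect input's value
placeholders, params = in_clause(selected)
query = f"SELECT ticker FROM markets WHERE category IN {placeholders} ORDER BY ticker"
tickers = [row[0] for row in conn.execute(query, params)]
```

The empty-selection branch is the kind of edge case worth testing explicitly, since a multiselect with nothing chosen would otherwise produce invalid SQL.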

Each challenge taught us more about data engineering, Hex's capabilities, and effective analytics communication.

Accomplishments that we're proud of

🎯 Technical Achievements:

  1. Full-Stack Data Solution in One Platform

    • Built everything from raw data extraction to conversational AI in Hex
    • Demonstrates the true power of Hex's unified analytics environment
    • Zero context-switching between tools
  2. Comprehensive Hex Feature Utilization

    • Used all three AI agents (Notebook, Semantic, Thread)
    • Implemented 8+ different cell types
    • Created a truly interactive experience with dynamic filtering
    • Built a production-ready semantic model
  3. Discovered Market Efficiency Patterns

    • Identified that liquidity varies by 10x across similar markets
    • Found correlation patterns between different prediction categories
    • Uncovered the relationship between volume spikes and price volatility
  4. Made Complex Data Accessible

    • Non-technical users can now ask questions in plain English
    • Dashboards provide immediate visual insights without analysis expertise
    • Reduced the barrier to understanding prediction market dynamics
  5. Zero Errors in Production

    • Thoroughly tested all code paths
    • Handled edge cases and null values properly
    • Project runs reliably from start to finish

We're most proud that this isn't just a data visualization project. It's a complete analytics application that delivers real value to anyone trying to understand prediction markets.

What we learned

About Hex:

  • Hex's tight integration between SQL, Python, and visualizations eliminates so much friction
  • Being able to reference SQL results directly in Python (as sql_result) is game-changing
  • The ability to build end-to-end solutions without leaving one environment dramatically speeds development

About Data Science:

  • Spending time creating meaningful features (liquidity score, volatility) improved results more than tuning hyperparameters
  • Domain knowledge (understanding market mechanics) was crucial for feature selection
  • Simple, interpretable features often outperform complex engineered ones
  • Better to deeply analyze clean, relevant data than superficially examine massive datasets
  • Handling JSON parsing and null values properly prevented downstream errors
  • Validation against expected patterns caught data quality issues early

About Prediction Markets:

  • Not all prediction markets are created equal
  • Liquidity is a better indicator of reliability than price alone
  • Bid-ask spreads reveal which markets traders actually trust
  • Markets react to information systematically, not randomly
  • Patterns in volume and volatility provide predictive signals
  • High-influence markets lead, lower-tier markets follow

What's next for The Sentiment Signal

Our goal is to transform The Sentiment Signal from a hackathon project into the go-to platform for understanding prediction markets, making the "wisdom of crowds" accessible, analyzable, and actionable for traders, researchers, journalists, and curious minds alike.

We believe prediction markets are becoming critical infrastructure for collective decision-making in the AI era. The Sentiment Signal will help people navigate this new landscape with confidence and clarity.
