Inspiration

We were inspired to build this dashboard by the rapid growth of AI and the increasing number of policies being introduced in recent years. We wanted to explore how these policies relate to public opinion and provide a way to visualize and predict the public response to AI regulations. The goal is to increase transparency in AI policymaking by helping policymakers understand how proposed bills are likely to be received by the public.

How we built it

We constructed the AI Civic Alignment System as a multi-stage Python machine learning pipeline. First, we integrated and cleaned real-world data from the Stanford 2025 AI Index Report and the OECD AI Policy Observatory. We used an NLP framework (Hugging Face, with a local fallback) to parse and classify nearly 2,000 global AI policies into core domains (e.g., Privacy, Technology, Economy). We then engineered a ContextualFeatureEngine to map three country-level structural proxies to these records: AI job postings (Economic Pressure), CS graduate demographics (Education Readiness), and policy-passing velocity (Policy Action).
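As a rough illustration of the mapping step, attaching country-level proxies to policy records can be expressed as a left join. This is a minimal sketch assuming pandas; the function name `attach_country_features`, the column names, and all values are hypothetical placeholders, not the project's actual ContextualFeatureEngine or data.

```python
import pandas as pd


def attach_country_features(policies: pd.DataFrame, country_stats: pd.DataFrame) -> pd.DataFrame:
    """Left-join country-level proxies onto each policy record.

    Policies from countries missing in `country_stats` keep NaN features,
    so gaps stay visible instead of being silently dropped.
    """
    return policies.merge(country_stats, on="country", how="left", validate="many_to_one")


# Illustrative placeholder rows, not real data.
policies = pd.DataFrame(
    {
        "policy_id": [1, 2, 3],
        "country": ["A", "B", "C"],
        "domain": ["Privacy", "Economy", "Technology"],
    }
)
country_stats = pd.DataFrame(
    {
        "country": ["A", "B"],
        "economic_pressure": [0.8, 0.3],    # proxy: AI job postings
        "education_readiness": [0.6, 0.4],  # proxy: CS graduate demographics
        "policy_action": [0.5, 0.9],        # proxy: policy-passing velocity
    }
)

enriched = attach_country_features(policies, country_stats)
print(enriched)
```

A left join keeps every policy record even when a country has no proxy data, which leaves the decision about how to handle the missing values to a later, explicit step.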

At the core of the engine, we trained a Random Forest Classifier to estimate policymaker and institutional support. Because only a small set of policies had explicit polling data, we validated the model with Leave-One-Out Cross-Validation (LOOCV) to keep the results statistically defensible. Finally, we developed an Alignment Engine that calculates critical "Gap Metrics" (Citizen vs. Institutional Action) and uses K-Means Clustering and Isolation Forests to identify regulatory archetypes and anomalies, with all results displayed in interactive terminal dashboards.
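The validation step described above can be sketched with scikit-learn. This is only an illustration under stated assumptions: the features and labels are synthetic, the binary "supported / not supported" target is hypothetical, and the hyperparameters are not the ones used in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in for the small polled set: 20 policies, 3 features.
X = rng.normal(size=(20, 3))
# Hypothetical binary label: 1 = supported, 0 = not supported.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# Each sample is predicted by a model trained on the other 19 samples.
predictions = cross_val_predict(model, X, y, cv=LeaveOneOut())
loocv_accuracy = float((predictions == y).mean())
print(f"LOOCV accuracy: {loocv_accuracy:.2f}")
```

LOOCV trains one model per sample, which is expensive on large datasets but affordable on a small polled set, and it avoids giving up any of the scarce labeled rows to a fixed hold-out split.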

Challenges we ran into

A major challenge was extracting relevant, directly comparable information from vast amounts of global public opinion data. Policies vary significantly in scope, terminology, and enforcement, making standardization difficult. Early in the project, we realized that relying on hardcoded heuristics to predict public response was analytically weak, because sentiment is dynamic and highly dependent on exogenous cultural, political, and economic factors.

Another significant hurdle was ensuring fairness and consistency when comparing survey data from different countries, each with distinct methodologies and sample sizes. We had to carefully wrangle the Stanford datasets to calculate baseline averages and impute realistic median values so that our machine learning models would not break on NaNs, while ensuring we didn't artificially inflate confidence statistics.
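One common way to impute medians without hiding the fact that a value was filled in is to keep a companion flag column. The sketch below assumes pandas; the helper `impute_with_flags`, the column `support_pct`, and the values are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd


def impute_with_flags(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Fill NaNs with each column's median and record which values were filled."""
    out = df.copy()
    for col in columns:
        # Record missingness before filling so the flag reflects the raw data.
        out[f"{col}_imputed"] = out[col].isna()
        out[col] = out[col].fillna(out[col].median())
    return out


# Illustrative placeholder values, not real survey results.
surveys = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D"],
        "support_pct": [60.0, np.nan, 40.0, 50.0],
    }
)

cleaned = impute_with_flags(surveys, ["support_pct"])
print(cleaned)
```

The flag column lets downstream code exclude or down-weight imputed rows when reporting confidence. When imputation is combined with cross-validation, computing the median on the training fold only avoids leaking information from the held-out sample.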

Accomplishments that we're proud of

We are proud to have moved this project from a heuristic, rules-based concept to a defensible, data-driven machine learning pipeline. We built a working Random Forest model that generates realistic support predictions and validated it with LOOCV.

We are also proud of developing an NLP classifier that can interpret and categorize completely new policy proposals on the fly. By actively visualizing the "Alignment Gaps" between raw Citizen Appetite, Policymaker Support, and actual Legislative Action, we have created a tool that clearly identifies when a government is moving out of sync with its people. Most importantly, we transformed complex, disjointed global datasets into a series of cohesive, interactive data layers ready for downstream visualizations.
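To make the gap and clustering idea concrete, here is a minimal sketch assuming NumPy and scikit-learn. The gap definition (simple differences between scores), the sign convention, the cluster count, and the contamination rate are all assumptions for illustration, and the scores are synthetic rather than project data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic scores in [0, 1] for 30 hypothetical countries.
citizen_appetite = rng.uniform(size=30)
policymaker_support = rng.uniform(size=30)
legislative_action = rng.uniform(size=30)

# Assumed convention: a positive gap means citizens want more than institutions deliver.
gaps = np.column_stack(
    [
        citizen_appetite - policymaker_support,
        citizen_appetite - legislative_action,
    ]
)

# Group countries into candidate "archetypes" by their gap profile.
archetypes = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(gaps)

# Flag unusual gap profiles; fit_predict returns -1 for anomalies and 1 otherwise.
anomaly_flags = IsolationForest(contamination=0.1, random_state=0).fit_predict(gaps)

print(archetypes)
print(anomaly_flags)
```

Clustering and anomaly detection answer different questions here: K-Means groups countries with similar gap profiles, while the Isolation Forest highlights the few whose profiles are easy to isolate from the rest.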

What we learned

Through this project, we learned how complex and interconnected AI policymaking is on a global scale. We gained hands-on experience with advanced data cleaning, normalization across international datasets, rigorous feature engineering, and applying machine learning models (Random Forests, K-Means) to real-world behavioral policy problems.

More importantly, we learned that predictive modeling in public policy requires careful ethical awareness. Models can easily "memorize" biases if not properly checked. Numbers can guide decisions, but they must honestly reflect their statistical limits, which is why we intentionally frame our outputs as heuristic "Momentum Indicators" rather than definitive "Forecasts". This project ultimately strengthened our understanding of both the power and the limitations of using AI to support human governance.

What's next for AI Policy Sentiment Analysis

If we had more time, we would dramatically expand our training dataset by scraping and integrating real-time legislative trackers (like Congress.gov or EUR-Lex) alongside live polling data from sources such as Gallup or Pew Research. We would also love to turn our terminal-based dashboards into a fully interactive, web-based React application where policymakers can drag and drop policy text and immediately visualize the projected alignment impacts on a global heat map. Finally, we would love to experiment with more advanced Large Language Models that read the entire text of bills rather than just summaries, extracting nuanced enforcement clauses to improve the model's predictive accuracy.
