Inspiration
Public policy affects billions of lives, yet most citizens and even many policymakers struggle to grasp the full implications of proposed changes. Policy documents are dense, jargon-heavy, and static. What if we could make them interactive?
The inspiration struck during a local town hall meeting where a new housing policy was being debated. Passionate arguments flew back and forth, but nobody could really quantify the trade-offs: "If we increase affordable housing by 20%, what happens to property taxes? To local schools? To environmental impact?"
We realized AI could bridge this gap. With Google Gemini 3's improved reasoning capabilities, we could build a tool that doesn't just summarize policies; it simulates futures.
What We Learned
Technical Insights
Prompt Engineering is an Art: Crafting prompts that extract structured data from unstructured policy text required extensive iteration. We learned to use system personas and few-shot examples to guide Gemini 3's reasoning.
Causal Modeling with LLMs: Traditional simulation requires explicit mathematical models like: $$\Delta Y = \sum_{i=1}^{n} \beta_i \Delta X_i + \epsilon$$
But LLMs can approximate causal reasoning through learned patterns, generating plausible scenarios based on historical policy outcomes encoded in training data.
Context Window Management: Policy documents can exceed 50,000 tokens. We learned to chunk intelligently, preserving semantic coherence while staying within Gemini 3's context limits.
Structured Output Extraction: Getting consistent JSON from generative models requires careful schema design and validation layers.
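As a concrete illustration of the chunking step, here is a minimal sketch that splits on paragraph boundaries so semantic units stay intact. The 8,000-token budget and the 4-characters-per-token heuristic are illustrative assumptions, not the values we shipped.

```typescript
// Illustrative chunking sketch: keep whole paragraphs together while
// staying under a token budget. Budget and chars-per-token are assumptions.
const MAX_TOKENS = 8000;
const CHARS_PER_TOKEN = 4; // rough heuristic for English text

function chunkBySection(text: string): string[] {
  const budget = MAX_TOKENS * CHARS_PER_TOKEN;
  const chunks: string[] = [];
  let current = "";
  for (const para of text.split(/\n{2,}/)) {
    // Start a new chunk when adding this paragraph would exceed the budget
    if (current && current.length + para.length > budget) {
      chunks.push(current);
      current = "";
    }
    current = current ? `${current}\n\n${para}` : para;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Note that a single paragraph larger than the budget is kept whole here; a production version would need a secondary sentence-level split.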
Domain Knowledge
- Policy Analysis Frameworks: We studied cost-benefit analysis, multi-criteria decision analysis (MCDA), and systems thinking approaches.
- UN SDG Mapping: Understanding how local policies cascade to global sustainability goals.
- Stakeholder Theory: Recognizing that every policy has winners and losers; simulation must surface these distributional effects.
How We Built It
Architecture Overview
```
┌─────────────┐
│    User     │
│  Interface  │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────┐
│     Next.js App Router      │
│   (Frontend + API Routes)   │
└──────┬──────────────┬───────┘
       │              │
       ▼              ▼
┌─────────────┐  ┌─────────────┐
│  Gemini 3   │  │  Vector DB  │
│ (OpenRouter │  │  (Future)   │
│    API)     │  │             │
└─────────────┘  └─────────────┘
```
Technology Stack
Frontend
- Next.js 15 with App Router for server-side rendering and API routes
- TypeScript for type safety across the codebase
- Tailwind CSS + Framer Motion for responsive, animated UI
- Radix UI primitives for accessible components
Backend
- Next.js API Routes handling analysis and simulation requests
- OpenRouter as the gateway to Google Gemini 3 models
- Zod for runtime schema validation
AI Layer
- Gemini 3 Flash for quick document analysis (low latency)
- Gemini 3 Pro for deep causal reasoning (high accuracy)
- Custom prompt templates with role-based personas
Key Implementation Details
1. Policy Ingestion Pipeline
```typescript
// Extract structured metadata from raw policy text
const analysisPrompt = `
You are a policy analyst. Extract:
- Goals: What outcomes does this policy aim to achieve?
- Levers: What parameters can be adjusted? (funding, timelines, etc.)
- Constraints: Legal, budgetary, or political limits
- Stakeholders: Who is affected and how?
Return JSON format.
`;
```
The ingestion uses a two-stage process:
- Chunking: Split documents into semantic sections
- Extraction: Parallel Gemini 3 calls for each section, then merge results
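The two-stage flow above can be sketched like this; the `PolicyFacts` shape and the injected `extract` callback are illustrative placeholders standing in for the real Gemini call.

```typescript
// Sketch of the two-stage ingestion: extract each chunk in parallel,
// then merge and de-duplicate the fields. Shapes here are assumptions.
interface PolicyFacts {
  goals: string[];
  levers: string[];
  constraints: string[];
  stakeholders: string[];
}

async function ingest(
  chunks: string[],
  extract: (chunk: string) => Promise<PolicyFacts>
): Promise<PolicyFacts> {
  // Stage 2: one extraction call per chunk, issued concurrently
  const parts = await Promise.all(chunks.map(extract));
  // Merge: concatenate each field across chunks, dropping duplicates
  const merge = (key: keyof PolicyFacts): string[] =>
    Array.from(new Set(parts.map((p) => p[key]).reduce((a, b) => a.concat(b), [])));
  return {
    goals: merge("goals"),
    levers: merge("levers"),
    constraints: merge("constraints"),
    stakeholders: merge("stakeholders"),
  };
}
```

Injecting `extract` also makes the pipeline trivial to unit-test with a stub in place of the live API.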
2. Scenario Simulation Engine
The core innovation is using LLMs for counterfactual reasoning:
```typescript
// Generate outcome predictions based on lever adjustments
const simulationPrompt = `
Given baseline policy: ${baseline}
User adjusted: ${leverChanges}
Simulate impacts across:
- Economic (GDP, employment, tax revenue)
- Social (equity, health, education)
- Environmental (emissions, resource use)
Use second-order causal reasoning. For each change, consider:
- Direct effects (Δ₁)
- Indirect effects (Δ₂)
- Feedback loops
Return structured outcomes with confidence intervals.
`;
```
For example, if a user increases renewable energy subsidies by 30%:
- Direct effect: More solar installations → reduced emissions
- Indirect effect: Job creation in green sector → multiplier effects on local economy
- Feedback loop: Lower energy costs → increased manufacturing → potential rebound in energy use
3. SDG Alignment Algorithm
We map policy outcomes to UN Sustainable Development Goals using cosine similarity:
$$\text{SDG\_Score}_i = \frac{\vec{O} \cdot \vec{S}_i}{\lVert\vec{O}\rVert \, \lVert\vec{S}_i\rVert}$$
Where:
- $\vec{O}$ = outcome embedding vector (from Gemini)
- $\vec{S}_i$ = SDG $i$ embedding vector
- Threshold: 0.7 for "strong alignment"
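In code, the scoring reduces to a plain cosine similarity plus the 0.7 threshold. The vectors below are plain number arrays standing in for the Gemini embeddings.

```typescript
// Cosine similarity between an outcome embedding and an SDG embedding
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const STRONG_ALIGNMENT = 0.7;

// Returns the zero-based indices of SDGs whose similarity clears the threshold
function alignedSdgs(outcome: number[], sdgVectors: number[][]): number[] {
  return sdgVectors
    .map((s, i) => ({ i, score: cosine(outcome, s) }))
    .filter((x) => x.score >= STRONG_ALIGNMENT)
    .map((x) => x.i);
}
```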
4. UI State Management
We built custom React hooks for managing simulation state:
```typescript
const useSimulation = (policyId: string) => {
  const [levers, setLevers] = useState<PolicyLever[]>([]);
  const [outcomes, setOutcomes] = useState<Outcome[]>([]);
  const [isSimulating, setIsSimulating] = useState(false);

  const simulate = useCallback(async () => {
    setIsSimulating(true);
    try {
      const result = await fetch('/api/simulate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ policyId, levers })
      });
      if (!result.ok) throw new Error(`Simulation failed: ${result.status}`);
      setOutcomes(await result.json());
    } finally {
      // Always clear the flag, even if the request throws
      setIsSimulating(false);
    }
  }, [policyId, levers]);

  return { levers, setLevers, outcomes, simulate, isSimulating };
};
```
Challenges We Faced
1. Hallucination Control
Problem: LLMs can generate plausible-sounding but factually incorrect predictions.
Solution:
- Implemented confidence scoring: Gemini 3 rates its own certainty (0-100%)
- Cross-validation: Generate 3 scenarios independently, flag divergences
- Grounding prompts: Instruct model to cite reasoning chains
```typescript
if (outcome.confidence < 60) {
  return {
    ...outcome,
    warning: "Low confidence - treat as exploratory scenario only"
  };
}
```
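The cross-validation check can be sketched as follows. The `Outcome` shape is simplified, and "divergence" here means the independent runs disagree on the direction of change, which is an illustrative criterion rather than the exact one we used.

```typescript
// Flag metrics where independently generated runs disagree on the
// sign of the projected change. Shapes and criterion are assumptions.
interface Outcome {
  metric: string;
  baselineValue: number;
  projectedValue: number;
}

function flagDivergent(runs: Outcome[][]): string[] {
  const byMetric = new Map<string, number[]>();
  for (const run of runs) {
    for (const o of run) {
      const deltas = byMetric.get(o.metric) ?? [];
      deltas.push(o.projectedValue - o.baselineValue);
      byMetric.set(o.metric, deltas);
    }
  }
  // A metric diverges if some runs project an increase and others a decrease
  return Array.from(byMetric.entries())
    .filter(([, deltas]) => deltas.some((d) => d > 0) && deltas.some((d) => d < 0))
    .map(([metric]) => metric);
}
```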
2. Latency vs. Depth Trade-off
Problem: Deep causal analysis takes 20-30 seconds per simulation, too slow for interactive use.
Solution: Hybrid approach
- Fast path: Gemini 3 Flash with cached policy context → 2-3 second response
- Deep path: Optional "detailed analysis" button triggers Gemini 3 Pro
- Optimistic UI updates: Show loading skeleton with partial results
3. Prompt Consistency
Problem: Same prompt yielded different JSON structures across runs.
Solution:
- Added strict JSON schema in system prompt
- Implemented Zod validators that retry on parse failures
- Few-shot examples with exact expected output format
```typescript
import { z } from "zod";

const outcomeSchema = z.object({
  dimension: z.enum(['economic', 'social', 'environmental']),
  metric: z.string(),
  baselineValue: z.number(),
  projectedValue: z.number(),
  confidence: z.number().min(0).max(100),
  reasoning: z.string()
});
```
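The retry wrapper around that schema looks roughly like this. `callModel` is an illustrative placeholder for the OpenRouter request; any throwing validator (such as a Zod schema's `parse`) plugs in.

```typescript
// Retry-on-validation-failure loop: call the model, parse the JSON,
// validate the shape; on either failure, re-prompt up to maxAttempts.
// `callModel` is a placeholder for the real OpenRouter request.
async function extractWithRetry<T>(
  validate: (data: unknown) => T, // e.g. outcomeSchema.parse
  callModel: () => Promise<string>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel();
    try {
      return validate(JSON.parse(raw)); // JSON.parse or validate may throw
    } catch (err) {
      lastError = err; // malformed JSON or schema mismatch: try again
    }
  }
  throw lastError;
}
```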
4. Policy Document Variability
Problem: Policies come in wildly different formats (PDFs, Word docs, HTML, plain text).
Solution:
- Built a unified document parser using `pdf-parse` and `mammoth`
- Preprocessing step: convert everything to clean markdown
- Gemini 3 handles remaining noise gracefully due to robust training
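A minimal sketch of the dispatch logic: route on file extension and normalize everything to text. The binary parsers (`pdf-parse` and `mammoth` in the project) are injected so the sketch stays self-contained, and the HTML tag strip is deliberately crude, for illustration only.

```typescript
// Unified parser sketch: dispatch on extension, normalize to plain text.
type BinaryParser = (buf: Buffer) => Promise<string>;

async function toPlainText(
  filename: string,
  buf: Buffer,
  parsers: { pdf: BinaryParser; docx: BinaryParser }
): Promise<string> {
  const ext = filename.split(".").pop()?.toLowerCase();
  switch (ext) {
    case "pdf":
      return parsers.pdf(buf);  // e.g. pdf-parse in the real pipeline
    case "docx":
      return parsers.docx(buf); // e.g. mammoth in the real pipeline
    case "html":
    case "htm":
      return buf.toString("utf8").replace(/<[^>]+>/g, " "); // crude tag strip
    default:
      return buf.toString("utf8"); // assume plain text / markdown
  }
}
```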
5. Ethical Considerations
Problem: Simulation results could be misinterpreted as predictions or endorsements.
Solution:
- Prominent disclaimers: "AI-based exploration tool, not professional advice"
- Uncertainty visualization: All metrics show confidence intervals
- Multi-scenario view: Present 3 alternative futures, not single "answer"
- Bias warning: Acknowledge LLM training data biases
6. Scaling Concerns
Problem: As users upload policies, how do we handle storage and context retrieval?
Future Solution (not yet implemented):
- Vector database (Pinecone/Weaviate) for semantic policy search
- Embeddings for "similar policy" recommendations
- User authentication for saving personal simulations
Technical Deep Dive: The Simulation Algorithm
The core logic uses a modified Markov chain approach where states are policy configurations:
$$P(S_{t+1} | S_t, L) = f_{\text{Gemini}}(S_t, L, \theta)$$
Where:
- $S_t$ = state at time $t$ (economic, social, environmental metrics)
- $L$ = lever adjustments (user inputs)
- $\theta$ = LLM parameters (temperature, top-p, etc.)
- $f_{\text{Gemini}}$ = black-box state transition function
We discretize time into policy cycles (typically annual budgets) and simulate forward:
```python
# Pseudocode
def simulate(initial_state, lever_changes, horizon=5):
    states = [initial_state]
    for year in range(1, horizon + 1):
        prompt = f"""
        Current state: {states[-1]}
        Policy changes: {lever_changes}
        Year: {year}
        Project state for year {year} considering:
        - Momentum effects from previous years
        - Exogenous shocks (economic cycles, etc.)
        - Policy implementation lags
        """
        next_state = gemini.predict(prompt)
        states.append(next_state)
    return states
```
Results & Impact
Hackathon Demo
We tested the tool on three real policies:
- San Francisco Affordable Housing Initiative (2023)
- EU Green Deal Carbon Pricing (2021)
- Singapore Smart Nation Budget (2024)
Users could adjust levers like funding levels, implementation speed, and enforcement strictness. The simulation revealed non-obvious trade-offs (e.g., aggressive carbon pricing → industrial relocation → job losses in manufacturing).
Performance Metrics
- Analysis time: 8-15 seconds for 30-page policy document
- Simulation latency: 3 seconds (Flash) to 25 seconds (Pro)
- Accuracy: Informal validation against expert policy briefs showed ~70% alignment in predicted directional impacts
Future Roadmap
- Historical Validation: Compare predictions against actual policy outcomes (2010-2025 data)
- Collaborative Scenarios: Multi-user mode where stakeholders debate lever settings
- Temporal Dynamics: Multi-year simulations with path dependency
- Adversarial Testing: Red team mode that finds policy blind spots
- API for Researchers: Enable academic validation studies
Conclusion
Living Policy Simulator demonstrates that AI can democratize policy analysis. We transformed an idea sketched on a napkin into a working prototype in 72 hours, thanks to Gemini 3's powerful reasoning and Google's accessible API.
The real breakthrough isn't the code; it's the paradigm shift. Policy should be explored, not just read. Decisions should be simulated, not just debated. And AI can be the bridge between expert analysis and public understanding.
This hackathon taught us that the future of civic engagement is interactive, data-driven, and accessible to everyone. We're excited to keep building.
Built with ❤️ for the Gemini 3 Hackathon
*Team: Seamless AI | Stack: Next.js + Gemini 3 + TypeScript*