Inspiration

College Park is a city of 34,000 people, half of them students, a third of them non-native English speakers, nearly all of them unaware of what their city council decided last Tuesday night.

We looked at the city's website and found 3-hour YouTube videos of council meetings, 80-page PDF budget ordinances, and agenda documents buried three clicks deep. The information is technically public. But "technically public" and "actually accessible" are two completely different things.

The gap isn't transparency; the city publishes everything. The gap is comprehension. A UMD international student paying $870/year in city taxes has no idea what that buys. A Spanish-speaking family in North College Park can't follow the zoning ordinance that might affect their neighborhood. A resident who wants to comment on a ballot measure before the deadline has no way to trace it back to the council debates that produced it six months ago.

We built Civic Lens because democracy only works when people can actually understand what their government is doing.


What it does

Civic Lens is a multilingual civic intelligence platform that turns the raw machinery of local government (council meeting videos, ordinances, budget documents, ballot measures) into something any resident can understand and act on.

The budget dashboard breaks College Park's $29.6M FY2026 operating budget into plain-language department cards with interactive filtering. Click "Public Works" and every council item, capital project, and spending change related to that department filters in sync across the entire page, bidirectionally.

The council decoder presents recent and upcoming council items in plain English with accordion detail panels, status badges (Upcoming / Decided / Under Study), and direct links to official agendas. The budget and council sections are cross-linked: selecting a budget category instantly filters council items, and clicking a department tag on a council item highlights the corresponding budget slice.

The bill explainer (Phase 2) takes any ordinance and produces a plain-language summary, the strongest version of the argument for and against it, fiscal impact, and a timeline tracing it back to the specific council meetings that produced it, with timestamped video links.

The meeting decoder (Phase 2) transcribes council videos with speaker diarization, generates per-member vote breakdowns, and delivers morning-after push notifications: "Last night your council voted on 4 things. Here's what passed and what's coming next."

The RAG Q&A interface (Phase 2) lets residents ask questions like "does this bond measure raise my property taxes?" in Spanish and get back a cited answer in Spanish, pulled from the actual staff report, not generated from the model's training data.

On-chain source verification anchors every summarized document to an immutable Polygon blockchain record. Anyone can re-hash a downloaded PDF and confirm it matches what we summarized, eliminating the "trust the platform" problem that has plagued civic tech before us.

Everything runs in English, Español, and 中文, not translated from English output, but generated natively in each language through the full pipeline.


How we built it

We started with a static HTML prototype of the College Park budget dashboard, the kind of thing you'd build in an afternoon to prove a concept. Then we rebuilt it properly.

Frontend: Next.js 16 App Router with TypeScript. The original HTML's Chart.js doughnut and accordion interactions became proper React components with shared state, so clicking a budget slice and filtering the council items below it is a single useState call, not a spaghetti of DOM event listeners. Tailwind CSS handles styling; Fraunces and Source Serif 4 (variable fonts) give the product an editorial identity that signals credibility, rather than the look of a generic SaaS dashboard.
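The cross-filtering above reduces to deriving both views from one piece of shared state. A minimal sketch of that selection logic, with a hypothetical item shape (in the real app this lives inside React components and a useState hook):

```typescript
interface CouncilItem {
  title: string;
  department: string;
}

// Derive the visible council items from the currently selected
// budget department; null means "no filter applied".
function filterByDepartment(
  items: CouncilItem[],
  selected: string | null,
): CouncilItem[] {
  if (selected === null) return items;
  return items.filter((item) => item.department === selected);
}

// Illustrative data only.
const items: CouncilItem[] = [
  { title: "Complete Streets bid award", department: "Public Works" },
  { title: "ADU ordinance first reading", department: "Planning" },
];
```

Because both the chart highlight and the list filter read from the same selected value, the two views can never drift out of sync.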

Backend: Supabase handles Postgres, pgvector, auth, and row-level security in one service, a massive time-saver. The schema covers documents, meetings, bills, chunks (with vector(1536) embeddings for semantic search), profiles, council_members, and votes. A match_chunks SQL function runs cosine similarity search directly in Postgres using ivfflat indexing.
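For intuition, the quantity match_chunks ranks on is plain cosine similarity; pgvector's cosine distance operator is 1 minus this value. A small TypeScript sketch of the math (the production query runs in Postgres over vector(1536) columns):

```typescript
// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), in [-1, 1], where 1 means identical direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The ivfflat index makes this search approximate but fast: instead of scanning every chunk, Postgres probes only the nearest inverted lists.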

AI layer: Claude Sonnet via the Anthropic API handles summarization and Q&A. The summarization prompt runs once per language (English, Spanish, Mandarin) for every new meeting or bill ingested. We generate natively rather than translate, which produces dramatically better results. The Q&A pipeline retrieves the top-5 semantically similar chunks from pgvector and passes them to Claude with a strict citation instruction: "If you are not certain about a fact, say so. Do not invent vote counts, dates, or names. Cite the specific document section for every claim."
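A sketch of how the retrieved chunks and the citation constraint come together into one prompt. Names and the chunk shape here are illustrative, not the production prompt, which is tuned per language:

```typescript
interface Chunk {
  documentTitle: string;
  section: string;
  text: string;
}

// Assemble numbered source excerpts plus the anti-hallucination
// instructions into a single prompt string for the model.
function buildRagPrompt(question: string, chunks: Chunk[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.documentTitle}, ${c.section}:\n${c.text}`)
    .join("\n\n");
  return [
    "Answer using only the sources below.",
    "If you are not certain about a fact, say so.",
    "Do not invent vote counts, dates, or names.",
    "Cite the specific document section for every claim.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The numbered excerpts let the model cite sources as [1], [2], and so on, which the UI can then link back to the original documents.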

Transcription: AssemblyAI with speaker diarization turns council meeting videos into attributed transcripts. You need diarization specifically so you can say "Council Member X voted yes" rather than just "someone voted yes."

Blockchain provenance: When a document is ingested, we compute its SHA-256, upload it to IPFS via Pinata, and write (hash, source_url, ipfs_cid, timestamp) to a minimal DocumentRegistry Solidity contract on Polygon Amoy testnet using viem. The /verify page lets users drop a PDF and confirm the hash matches the on-chain record in-browser, no backend call required for the verification itself.
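The integrity check itself is just a hash comparison. A minimal sketch of what the /verify page does once it has the on-chain record (the viem contract read is omitted; the function name is ours):

```typescript
import { createHash } from "node:crypto";

// Recompute a document's SHA-256 and compare it to the hash recorded
// on-chain, tolerating an optional 0x prefix and mixed case.
function verifyDocument(pdfBytes: Uint8Array, onChainHash: string): boolean {
  const digest = createHash("sha256").update(pdfBytes).digest("hex");
  return digest === onChainHash.toLowerCase().replace(/^0x/, "");
}
```

In the browser the same digest comes from the Web Crypto API (crypto.subtle.digest), so no server ever sees the user's file.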

i18n: next-intl routes locale from a cookie through the entire pipeline. UI strings, AI-generated content, and notifications all resolve in the user's chosen language.
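The core of that routing is a tiny decision: read the cookie, validate it against the supported set, fall back to English. A sketch assuming next-intl's default NEXT_LOCALE cookie name:

```typescript
const SUPPORTED = ["en", "es", "zh"] as const;
type Locale = (typeof SUPPORTED)[number];

// Resolve the active locale from a raw Cookie header, falling back
// to English when the cookie is absent or names an unsupported locale.
function resolveLocale(cookieHeader: string | undefined): Locale {
  const match = cookieHeader?.match(/(?:^|;\s*)NEXT_LOCALE=([^;]+)/);
  const value = match?.[1];
  return (SUPPORTED as readonly string[]).includes(value ?? "")
    ? (value as Locale)
    : "en";
}
```

Adding a fourth language means extending SUPPORTED and adding a messages file; everything downstream, including the AI generation prompts, keys off the same locale value.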


Challenges we ran into

The "technically public" data problem. College Park doesn't have an API. Meeting videos live on YouTube, agendas are PDFs on a Document Center, and ordinances are scanned documents with no machine-readable structure. Building the ingestion pipeline means writing scrapers and dealing with inconsistent formatting across years of archives; every document is slightly different.

Blockchain without the buzzword trap. Judges and users are increasingly skeptical of blockchain in civic tech, and rightfully so. Most uses are solutions looking for problems. We spent real time stress-testing whether on-chain provenance was genuinely necessary here or just a talking point. The answer is yes, but only for one specific thing: proving that the document we summarized is byte-for-byte identical to what the city published. We used it for exactly that and nothing else. No tokens, no NFTs, no "voting on the blockchain."

Multilingual AI that doesn't feel like Google Translate. Early experiments translated English summaries into Spanish and Mandarin. The results were technically correct but felt wrong: bureaucratic, stiff, obviously translated. Switching to native-language generation (prompting Claude in the target language from the start) produced summaries that actually read like something a fluent speaker would write.

React 19 and ecosystem compatibility. The component library we originally planned (@tremor/react) hadn't shipped React 19 support yet. We swapped to Recharts mid-build without losing a sprint, but it's a reminder that the "latest stable" stack at the start of a hackathon can bite you on dependency trees.

Keeping AI honest on civic facts. LLMs confidently hallucinate vote counts, dates, and names: exactly the things that matter most in civic tech and that residents would rely on. The fix is RAG with strict citation enforcement, but getting the prompt constraints right without making the output robotic took iteration. The current system prompt includes an explicit instruction to express uncertainty rather than guess.


Accomplishments that we're proud of

The budget and council cross-link. It sounds simple, but no civic transparency tool we found does it: clicking a budget department filters the council agenda below it in real time, and clicking a department tag on a council item jumps back up and highlights the budget slice. It makes the connection between money and decisions visceral in a way that static pages never could.

A complete, working schema for the hardest part. The Supabase schema (with pgvector embeddings, RLS policies, a match_chunks cosine similarity function, and typed TypeScript interfaces generated from it) is the kind of thing that usually takes two days to get right. It's done, it's correct, and it's ready for the ingestion pipeline.

Trilingual at the architecture level, not the afterthought level. The i18n setup (next-intl plus native-language LLM generation) means adding a fourth language is a one-line change to the locale resolver and a new messages JSON file. Spanish and Mandarin weren't bolted on; they're first-class citizens of the data model.

A genuinely defensible blockchain use case. The on-chain provenance system solves a real trust problem ("how do I know the AI didn't summarize a doctored document?") in a way that's verifiable by any resident with a PDF and a browser. We can explain it to a skeptical judge in one sentence and have it hold up to scrutiny.

Building on a real city's real data. This isn't a demo with made-up numbers. Every figure ($29.6M budget, 33.5¢/$100 tax rate, the Complete Streets bid, the ADU ordinance, the PGPD contract renewal) comes from actual College Park FY2026 documents and actual April 2026 council agendas. It's deployable today.


What we learned

Local government data is shockingly inaccessible even when it's "open." Transparency without comprehension is just noise; the real civic tech problem isn't FOIA, it's UX.

LLMs are genuinely good at civic summarization when you constrain them correctly. The combination of RAG retrieval, strict citation prompting, and native-language generation produces output that's useful and trustworthy, not just impressive-sounding.

Blockchain is a tool, not a thesis. Using it for one specific, provable problem (document integrity) made our architecture cleaner and our pitch sharper. Resisting the urge to expand it kept us from building something embarrassing.

The hardest engineering problem in civic tech isn't the AI; it's the ingestion pipeline. Getting reliable, structured data out of inconsistent government documents is where most civic tech projects stall. We've designed the architecture to handle it, and the scraper work is real and ongoing.


What's next for Civic Lens

Phase 2: Ingestion pipeline. Live scraping of College Park's YouTube channel, Document Center, and the Maryland General Assembly's bill API. Whisper/AssemblyAI transcription running automatically when new videos are posted. Every new document hashed, IPFS-stored, and registered on-chain within minutes of publication.

Phase 3: Full frontend. The bill explainer page with steel-manned for/against arguments. The meeting decoder with per-member vote breakdowns and timestamped video links. The RAG Q&A interface for residents to ask questions in any language. The accountability dashboard with council voting records, attendance trends, and time-to-vote analytics.

Expand to neighboring jurisdictions. Prince George's County Council, the Maryland General Assembly, and eventually any municipality willing to share structured agenda data. The architecture is jurisdiction-agnostic; College Park is just the first instance.

Verified nonprofit and press access. The accountability dashboard data (voting records, attendance, time from introduction to vote) is exactly what journalists and civic researchers need. We want to build an API tier for verified press access with higher rate limits and bulk export.

The morning-after notification. Push alerts the day after every council meeting: what passed, who voted how, what's on the next agenda, in your language. This is the feature residents ask for first and the one most likely to bring non-engaged residents into the civic process.
