Project (3 to 4 sentences). SimCivics is a civic collaboration platform where any U.S. voter, or potential future voter, can propose, test, and vote on simulated U.S. policy alongside fellow Americans of all backgrounds, in the spirit of a true direct democracy. Users pick a state to represent, submit a policy idea in plain English, and see it simulated against that state's real population before sharing it nationally on SimCivics' National Pulse. Policies that earn community support through voting are then applied to a shared national simulation, revealing how a decision that helps one state can burden another, and how policies scale at the state versus the national level.

AI usage disclosure (2 to 3 sentences). We first drafted the write-up responses as bullet points, then fed those into Claude Sonnet 4.6 to generate cohesive paragraphs, which we rewrote to better clarify and emphasize our ideas. Claude Opus 4.7 was used for technical specification design and UI/UX implementation, while all parameters were gathered and estimated from real data (ACS, BLS, BEA, FBI-CDE, presidential election results, CDC, YCOM). GPT-4.1-mini is used to evaluate the results of policies, and also to detect and flag any harmful comments submitted on our social platform.

Ethical Categories (3 to 5 sentences each category).
Transparency & Explainability: Every numerical indicator change in the simulation comes with an explanation and the causal chain the model followed. Once published to the national feed, simulation outputs, AI reasoning, and user comments are visually separated, so readers always know what came from the model and what came from a person. The LLM does not drive the simulation: it only proposes small, bounded shocks to an environmental state vector, phased across 3 temporal offsets; most of what users see over many turns comes from the deterministic and stochastic rules of the math engine. Because each change involves an inferential leap, the policy compiler states the assumptions, risks, and rationale that make that leap inspectable, and users can view the reasoning and assumptions the LLM made for any simulation result. A persistent disclaimer on every page states that outputs are estimates for civic learning purposes, not predictions of real-world outcomes.

Bias & Fairness: Statistical priors for the simulation are estimated from 7 different datasets by taking a weighted average of at least 4 unique indicators for each parameter, giving the model an empirical foundation. A fairness test for nonpartisanship found that GPT assigned liberal-coded policies a mean score of 3.96 against 3.20 for Republican-coded policies, suggesting the model inherently weights mainstream liberal frameworks more highly. We also tested whether policy phrasing affected simulation outputs by submitting identical policies written in layman and expert language: indicator changes differed by 21.84% before we explicitly instructed gpt-5.1-mini to evaluate content over framing, after which the gap dropped to 5.57%. Refer to Appendix 1 for results. In its current form, SimCivics is inaccessible to non-English speakers, users without reliable internet or digital literacy, and those whose political frameworks fall outside U.S. state-based federalism.

Limitations:

  1. The simulation cannot capture cultural dynamics, historical context, or local implementation capacity that fall outside what census and labor statistics measure, meaning a policy that scores well may still fail in practice for reasons the model structurally cannot see. Every result page carries a permanent disclaimer to this effect, and the community comment layer exists explicitly for users to surface real-world context the simulation missed.
  2. The moderation classifier may flag charged but substantive civic language (e.g., "this policy historically harms Black communities") as hostile even when it represents a legitimate political argument. When this happens, users are shown exactly why their comment was flagged and given the opportunity to edit and repost, ensuring nothing is silently dropped.
  3. SimCivics deliberately models real implementation constraints by state, reflecting how federalism actually works, but a user may misread a weak simulation result in a low-capacity state like West Virginia as evidence that the policy is flawed, when the output actually describes the state's starting conditions rather than the policy's merit. To address this, the result page displays the state's baseline indicators prominently alongside the outcome, with a note clarifying that results reflect both policy design and state-level implementation capacity.

Track-specific concern (1 paragraph, 3-4 sentences). Weaponization for Manipulation or Suppressing Dissent: Our biggest concern is that coordinated groups could exploit the community layer to suppress ideas they disagree with before those ideas get a fair hearing. To prevent this, policies are ordered chronologically rather than by vote count, so no policy can be buried through downvoting. Second, all accounts require email verification (not implemented yet), reducing the risk of coordinated attacks. Third, the moderation system applies aggressive filtering of inappropriate content through a GPT filter; critically, when a comment is rejected, the user sees exactly why, so the moderation is transparent.
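The bounded-shock design described under Transparency & Explainability can be sketched in a few lines. This is a hypothetical illustration, not the actual SimCivics engine: the names (`SHOCK_CAP`, `PHASE_WEIGHTS`, `apply_turn`) and the cap and decay values are assumptions chosen to show the idea that LLM-proposed shocks are clamped, spread across 3 temporal offsets, and then applied by a separate deterministic/stochastic engine.

```python
import random

# Assumed, illustrative constants -- not the real SimCivics parameters.
SHOCK_CAP = 0.05                 # per-indicator bound on any LLM-proposed shock
PHASE_WEIGHTS = [0.5, 0.3, 0.2]  # shock phased across 3 temporal offsets

def clamp_shocks(proposed: dict[str, float]) -> dict[str, float]:
    """Bound every LLM-proposed shock so the LLM cannot dominate the engine."""
    return {k: max(-SHOCK_CAP, min(SHOCK_CAP, v)) for k, v in proposed.items()}

def phase_shocks(shocks: dict[str, float]) -> list[dict[str, float]]:
    """Split each bounded shock across three turns (temporal offsets)."""
    return [{k: v * w for k, v in shocks.items()} for w in PHASE_WEIGHTS]

def apply_turn(state: dict[str, float], shock: dict[str, float],
               rng: random.Random) -> dict[str, float]:
    """One engine turn: deterministic drift plus a small stochastic term;
    the phased shock enters only as a bounded additive nudge."""
    return {k: v * 0.99 + shock.get(k, 0.0) + rng.gauss(0, 0.001)
            for k, v in state.items()}
```

Under this structure, even an adversarial or hallucinated LLM proposal can move an indicator by at most `SHOCK_CAP` per turn, which is the property the write-up relies on when it says most long-run behavior comes from the math engine.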

What you’d do with another week (3 to 4 sentences). Given another week, we would collect and scrape substantially more data to make our priors more informed and stable. We would also build out the backend and middleware for this system, including critical components for user safety and ethics such as authentication. Furthermore, we would research better dynamical models of economies and societies to ground our simulation in more empirically backed research. Lastly, we would test various LLMs for fairness and bias, as we did with GPT, and choose the one that acted with the least prejudice.
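The fairness tests described above (and the multi-LLM comparison we would run with more time) reduce to two simple metrics: the gap in mean scores between ideologically coded policies, and the relative difference in indicator changes between layman and expert phrasings of identical policies. A minimal harness might look like the following; the function names and the sample numbers in the test are illustrative stand-ins, not our Appendix 1 data.

```python
def mean_score(scores: list[float]) -> float:
    return sum(scores) / len(scores)

def partisan_gap(liberal_scores: list[float],
                 conservative_scores: list[float]) -> float:
    """Absolute gap in mean model score between ideologically coded policies
    (the 3.96 vs 3.20 comparison reported above used this shape of metric)."""
    return abs(mean_score(liberal_scores) - mean_score(conservative_scores))

def phrasing_gap(layman: list[float], expert: list[float]) -> float:
    """Mean relative difference (%) in indicator changes for identical
    policies written in layman vs expert language, paired element-wise."""
    diffs = [abs(l - e) / max(abs(e), 1e-9) for l, e in zip(layman, expert)]
    return 100.0 * mean_score(diffs)
```

Running the same two functions over outputs from each candidate LLM would give a directly comparable prejudice score per model, which is the selection criterion described above.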

Built With

  • ai