PocketPact

Inspiration

As students in Cambridge, Massachusetts, where voting sometimes feels like it might not change the outcome, we wanted to identify whether our opinions could shape policy outside regional bubbles. We realized that strategic political donations can amplify our impact across other jurisdictions, but there was no systematic way to identify where our dollars would have the most leverage. Should we donate to a competitive race that’s already saturated with money? Or a less competitive race where our contribution could be the difference?

This question inspired us to build a data-driven system that calculates a donation leverage score for every election race, from federal to local, tailored to a donor’s specific policy interests, helping donors maximize their impact on democracy.

What We Learned

Civic data is very unevenly distributed across levels of government. Federal races have comprehensive FEC campaign finance data, while state and local races often have sparse, hard-to-interpret financial records. This gap creates an information asymmetry that disadvantages smaller, local races.

Local and state campaign finances aren’t easy to access or navigate. While the FEC provides well-structured APIs for federal races, state and local campaign finance data is scattered across hundreds of different state and county websites, each with different formats and access methods.

We discovered that election rules vary dramatically across the country. Different states have different campaign finance reporting requirements, different election cycles, and different levels of transparency. This fragmentation makes it very difficult to compare races across jurisdictions using a single methodology.

Leverage Calculations

$$ \text{Leverage Score} = \text{Competitiveness} \times \text{Saturation} $$

To estimate competitiveness, we developed a three-tier approach that adapts to data availability:

Primary Source: Real-time prediction markets (Kalshi API) that reflect market sentiment about race competitiveness
Secondary Source: Historical election results from Civic Engine API, analyzing party alternation patterns
Fallback Source: County-level party affiliation data (NANDA) for state-level competitiveness estimates
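
One way to read the three tiers above is as a simple fallback chain: take the first source that produced an estimate and remember which tier it came from. The sketch below assumes each source has already been reduced to a competitiveness value between 0 and 1 (or `None` when unavailable). The function name and source labels are hypothetical, not taken from our codebase.

```python
from typing import Optional, Tuple


def pick_competitiveness(
    market: Optional[float],
    historical: Optional[float],
    county: Optional[float],
) -> Tuple[Optional[float], str]:
    """Return the first available estimate and a label for its tier."""
    tiers = [
        ("prediction_market", market),       # primary: Kalshi
        ("historical_results", historical),  # secondary: Civic Engine
        ("county_affiliation", county),      # fallback: NANDA
    ]
    for source, value in tiers:
        # Compare against None so a legitimate 0.0 estimate is not skipped.
        if value is not None:
            return value, source
    return None, "unavailable"
```

Returning the tier label alongside the value makes it straightforward to surface a data quality indicator later.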

For primary elections, we use entropy-based calculations: $\text{Entropy} = -\sum_{i} p_i \log(p_i)$, where $p_i$ is the probability of candidate $i$ winning. This captures the reality that more evenly distributed probabilities indicate higher competitiveness.
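
As a minimal sketch of the entropy formula above, the code below computes Shannon entropy over candidate win probabilities. Two choices here are assumptions rather than part of the formula: the input is renormalized so that it sums to 1, and `normalized_entropy` divides by $\log(n)$ so that fields of different sizes land on a common 0 to 1 scale.

```python
import math
from typing import Sequence


def entropy(probabilities: Sequence[float]) -> float:
    """Shannon entropy (natural log) of candidate win probabilities."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    h = 0.0
    for p in probabilities:
        q = p / total  # assumption: renormalize so the inputs sum to 1
        if q > 0:      # a zero-probability candidate contributes nothing
            h -= q * math.log(q)
    return h


def normalized_entropy(probabilities: Sequence[float]) -> float:
    """Entropy divided by its maximum, log(n); 0.0 for a one-candidate field."""
    n = len(probabilities)
    if n < 2:
        return 0.0
    return entropy(probabilities) / math.log(n)
```

An even four-way race gives a normalized value of 1, while a race with one overwhelming favorite approaches 0.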

Saturation Calculation

We use different methods depending on race type:

Federal Races: FEC API provides actual campaign finance receipts. We calculate saturation using: $\text{Saturation} = \frac{1}{\log(1 + \text{total\_receipts})}$
State Races: Since FEC data isn’t available, we devised a proxy method using Kalshi market volume and bid-ask spread as indicators of market attention (and by extension, fundraising saturation)
Local Races: With no data sources available, we set saturation to neutral (1.0) and let competitiveness drive the score—transparently acknowledging data limitations rather than penalizing races unfairly
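
The federal and local cases can be sketched as follows, together with the final leverage product. This assumes the natural logarithm (the formula does not fix a base) and, as a further assumption, falls back to the neutral value when receipts are zero, since the formula is undefined there. The state-level Kalshi proxy is left out because its exact form is not specified here. All function names are illustrative.

```python
import math
from typing import Optional

NEUTRAL_SATURATION = 1.0  # used when no finance data exists (local races)


def federal_saturation(total_receipts: float) -> float:
    """Saturation = 1 / log(1 + total_receipts), using the natural log."""
    if total_receipts < 0:
        raise ValueError("total_receipts cannot be negative")
    denominator = math.log1p(total_receipts)  # log(1 + x)
    if denominator == 0:
        # Assumption: zero receipts makes the formula undefined,
        # so treat the race as neutral instead of dividing by zero.
        return NEUTRAL_SATURATION
    return 1.0 / denominator


def leverage_score(competitiveness: float, saturation: Optional[float] = None) -> float:
    """Leverage = competitiveness x saturation; neutral saturation if missing."""
    if saturation is None:
        saturation = NEUTRAL_SATURATION
    return competitiveness * saturation
```

Because the logarithm grows slowly, a race with $1,000,000 in receipts is penalized only modestly more than one with $10,000, so saturation dampens the score without overwhelming competitiveness.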

Data Integration

We built a system that:

Validates data quality: Every Kalshi market is checked against the target race’s state, district, office type, and year
Handles missing data gracefully: Uses weighted averaging when some data sources are unavailable
Adapts to race types: Different methodologies for federal, state, and local races
Provides transparency: Data quality indicators and warnings when data is incomplete
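
A minimal sketch of weighted averaging with missing inputs: drop the unavailable sources and renormalize the remaining weights so they still sum to 1. The source names and weights used in the usage note are hypothetical placeholders, not our tuned values.

```python
from typing import Dict, Optional


def weighted_average(
    values: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> Optional[float]:
    """Average the available values, renormalizing weights over them."""
    available = {
        name: value
        for name, value in values.items()
        if value is not None and weights.get(name, 0.0) > 0
    }
    if not available:
        return None  # caller should flag the race as lacking data
    total_weight = sum(weights[name] for name in available)
    return sum(weights[name] * value for name, value in available.items()) / total_weight
```

For example, with weights `{"market": 0.6, "historical": 0.3, "county": 0.1}` and no market value, the historical and county estimates are combined with effective weights of 0.75 and 0.25.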

Challenges We Faced

API Rate Limiting

The FEC API has strict rate limits, and we encountered frequent 429 Too Many Requests errors. We implemented exponential backoff retry logic, but this slowed down processing significantly. We also had to be strategic about which races to prioritize when testing.
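
The retry logic can be sketched along these lines. This is an illustration, not our exact implementation: `fetch` is a hypothetical callable that performs one request and returns a `(status_code, body)` pair, and the delay parameters and 10% jitter are assumptions.

```python
import random
import time
from typing import Any, Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, Any]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call fetch(), retrying on HTTP 429 with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Small random jitter avoids retrying in lockstep with other clients.
        sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. With the defaults, a request that never succeeds waits roughly 31 seconds in total before giving up, which is the kind of slowdown described above.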

Data Source Validation

Kalshi markets don’t always perfectly match the races we’re analyzing. We built a comprehensive validation system that scores market matches based on state, office type, district, and year. Poor matches are downweighted rather than discarded, ensuring we use available data while maintaining accuracy.
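
A simplified sketch of match scoring: compare the four fields, weight each one, and use the resulting score to scale how much the market contributes downstream. The `RaceKey` type and the field weights are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RaceKey:
    state: str
    office: str
    year: int
    district: Optional[str] = None


# Hypothetical integer weights: a state mismatch costs more than a year mismatch.
FIELD_WEIGHTS = {"state": 4, "office": 3, "district": 2, "year": 1}


def match_score(race: RaceKey, market: RaceKey) -> float:
    """Fraction of weighted fields on which the market matches the race (0 to 1)."""
    matched = sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if getattr(race, field) == getattr(market, field)
    )
    return matched / sum(FIELD_WEIGHTS.values())


def effective_weight(base_weight: float, score: float) -> float:
    """Downweight a source by its match score instead of discarding it."""
    return base_weight * score
```

A market that matches on everything except the year keeps 90% of its weight under these example values, while one from the wrong state keeps at most 60%.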

Missing Data for Local Races

The biggest challenge was handling local races (city council, county offices) where no comprehensive data exists. Rather than excluding these races or using unreliable proxies, we designed the system to be transparent about limitations—setting saturation to neutral and clearly warning users when data is unavailable.

GraphQL Query Complexity

Civic Engine’s GraphQL API required complex queries to extract historical election results. We had to learn the schema, handle pagination, and deal with nested data structures. For city races, we discovered that position names vary significantly, requiring flexible matching logic.
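
One possible form of flexible position matching, sketched with the standard library only: normalize case and punctuation, sort the words so that word order does not matter, then compare with `difflib.SequenceMatcher`. The 0.8 threshold and the function names are assumptions, and this does not reflect Civic Engine's schema.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, Optional


def normalize_position(name: str) -> str:
    """Lowercase, strip punctuation, and sort words so order is ignored."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    return " ".join(sorted(words))


def best_position_match(
    target: str,
    candidates: Iterable[str],
    threshold: float = 0.8,  # assumption: tune against real position names
) -> Optional[str]:
    """Return the candidate most similar to target, or None if none is close."""
    target_norm = normalize_position(target)
    best_name, best_ratio = None, 0.0
    for candidate in candidates:
        ratio = SequenceMatcher(None, target_norm, normalize_position(candidate)).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = candidate, ratio
    return best_name if best_ratio >= threshold else None
```

Sorting the words means "City Council, Mount Vernon" and "Mount Vernon City Council" normalize to the same string and match exactly.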

State Extraction for Local Races

For local races like “Mount Vernon City Council,” the state isn’t in the race name—it’s in the election name (“New York General Election”). We had to enhance our parsing logic to extract state information from multiple sources, ensuring NANDA data and historical results could be properly matched.
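
The multi-source lookup can be sketched as: try each text field in order (race name first, then election name) and return the first state found. The state table below is deliberately truncated to four entries for illustration; a real version would list every state. The function name is hypothetical.

```python
import re
from typing import Optional

# Truncated for illustration; a full table would cover all states.
STATE_CODES = {
    "Massachusetts": "MA",
    "New York": "NY",
    "Virginia": "VA",
    "West Virginia": "WV",
}

# Longer names first so "West Virginia" is preferred over "Virginia".
_STATE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(n) for n in sorted(STATE_CODES, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def extract_state(*texts: Optional[str]) -> Optional[str]:
    """Return the postal code of the first state named in any of the texts."""
    for text in texts:
        if not text:
            continue
        match = _STATE_PATTERN.search(text)
        if match:
            # .title() maps "new york" or "NEW YORK" back to the table key.
            return STATE_CODES[match.group(1).title()]
    return None
```

For example, `extract_state("Mount Vernon City Council", "New York General Election")` finds nothing in the race name and falls through to the election name. Place names that coincide with state names would still need separate handling.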

Year Handling in NANDA Data

NANDA data is only available for certain years (e.g., 2022), but we’re analyzing 2025 elections. We implemented logic to automatically use the most recent available year when the target year isn’t available, with fallback to multiple ratio fields (presidential and senatorial) to maximize data coverage.
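
A sketch of the year and field fallback, under two assumptions: "most recent available year" is read as the latest year not after the target, and the ratio field names `pres_ratio` and `sen_ratio` are placeholders rather than actual NANDA column names.

```python
from typing import Dict, Optional, Tuple

# Hypothetical field names, tried in this order.
RATIO_FIELDS = ("pres_ratio", "sen_ratio")

YearRecords = Dict[int, Dict[str, Optional[float]]]


def lookup_ratio(
    records: YearRecords,
    target_year: int,
) -> Optional[Tuple[int, str, float]]:
    """Return (year, field, value) from the latest usable year <= target_year."""
    usable_years = sorted((y for y in records if y <= target_year), reverse=True)
    for year in usable_years:
        for field in RATIO_FIELDS:
            value = records[year].get(field)
            if value is not None:
                return year, field, value
    return None
```

Returning the year and field that were actually used lets the caller warn users when a 2025 race is being scored with, say, 2022 senatorial data.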

Impact

Our system democratizes access to donation leverage analysis that was previously only available to large political organizations with dedicated data teams. By combining multiple data sources and adapting to data availability, we’ve created a tool that works across all levels of government—from presidential races to city council elections.

We believe that when donors can identify where their dollars will have maximum impact, we can level the playing field and ensure that competitive races at all levels get the attention they deserve.