Synthetic Research Agent

Inspiration

Consumer research costs $15K+ and takes weeks. Most teams just skip it. We wanted to build something where you ask a question and get a full study back in minutes. Not a summary, but structured research with real personas, segmented analysis, and grounded recommendations.

We also wanted it to compound. Most AI tools treat each interaction as isolated. This agent learns from its own studies, and the personas themselves develop memory and consistency over time. This is continual learning in text space: the model weights stay frozen, but capability still accumulates.

What it does

You give it a research question in plain English. The agent plans the study, generates a panel of 12 demographically grounded personas (sampled from Census distributions), designs typed questions, simulates all personas responding in character in parallel, and delivers segmented analysis with recommendations.

The personas have persistent memory. After each study, the system extracts their concerns, preferences, and sentiment patterns. Future studies inject that history so they stay consistent and get more realistic over time.
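To make the memory loop concrete, here is a minimal sketch of how extracted memories might be stored in SQLite and injected into a later persona prompt. The table name `persona_memory`, its columns, and the helpers `init_db`, `remember`, and `build_persona_prompt` are all hypothetical names chosen for illustration; the project's actual schema and prompt format are not shown in this writeup.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per memory extracted after a study.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS persona_memory (
               persona_id TEXT NOT NULL,
               study_id   INTEGER NOT NULL,
               kind       TEXT NOT NULL,
               content    TEXT NOT NULL
           )"""
    )


def remember(conn, persona_id, study_id, kind, content):
    # kind is assumed to be 'concern', 'preference', or 'sentiment'.
    conn.execute(
        "INSERT INTO persona_memory VALUES (?, ?, ?, ?)",
        (persona_id, study_id, kind, content),
    )
    conn.commit()


def build_persona_prompt(conn, persona_id, profile, question, max_items=10):
    # Fetch the most recent memories, capped so the prompt stays bounded.
    rows = conn.execute(
        "SELECT study_id, kind, content FROM persona_memory "
        "WHERE persona_id = ? ORDER BY study_id DESC, rowid DESC LIMIT ?",
        (persona_id, max_items),
    ).fetchall()
    # Present the kept memories oldest-first so the history reads chronologically.
    history = "\n".join(
        f"- (study {study}, {kind}) {content}"
        for study, kind, content in reversed(rows)
    )
    return (
        f"You are this person: {profile}\n"
        f"What you said and felt in earlier studies:\n"
        f"{history or '- (no prior studies)'}\n"
        f"Stay consistent with that history when answering.\n"
        f"Question: {question}"
    )
```

The `max_items` cap is one simple way to keep the injected history from growing without bound as a persona takes part in more studies.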

The agent also self-assembles its toolkit. In study 1 it builds everything from scratch. By study 5 it is reusing most tools, and the reflection engine starts consolidating them into composite pipelines.

How we built it

Python, single SQLite database, Gemini 2.5 Flash for all LLM work, Gemini Embedding 001 for semantic search. The simulation runs 6 concurrent persona calls via asyncio, so a 12 persona x 7 question study fires 84 API calls and finishes in ~40 seconds. Persona memory, tool registry, and reflections all live in the same SQLite DB. Dashboard is FastAPI + vanilla JS, no framework.
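The bounded fan-out described above can be sketched with an `asyncio.Semaphore`. This is a sketch under assumptions, not the project's code: `fake_llm_call` is a hypothetical stand-in for the real Gemini request, `run_study` is an illustrative name, and the sketch assumes the limit of 6 applies to individual in-flight requests.

```python
import asyncio

MAX_CONCURRENT = 6  # concurrency limit quoted in the writeup


async def fake_llm_call(persona, question):
    # Hypothetical stand-in for the real model request.
    await asyncio.sleep(0.01)
    return f"{persona} answers {question}"


async def run_study(personas, questions, call=fake_llm_call, limit=MAX_CONCURRENT):
    # The semaphore lets at most `limit` calls be in flight at once.
    sem = asyncio.Semaphore(limit)

    async def bounded(persona, question):
        async with sem:
            return await call(persona, question)

    # One task per (persona, question) pair: 12 x 7 = 84 calls in total.
    tasks = [bounded(p, q) for p in personas for q in questions]
    # gather returns results in the same order the tasks were listed.
    return await asyncio.gather(*tasks)
```

Calling `asyncio.run(run_study(personas, questions))` from a plain script schedules every pair up front and lets the semaphore do the throttling, so no manual batching is needed.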

Challenges we ran into

Running the async simulation inside FastAPI's event loop required creating fresh loops to avoid collisions.

Persona demographics were initially sampled from general Census distributions regardless of the target audience, so a "Gen Z 18-24" query returned personas aged 27-76. We had to restructure sampling and prompting to enforce audience matching.

Gemini's JSON output was consistently malformed (truncated, trailing commas, markdown fences), so we built a recovery function to handle these cases.
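As an illustration of what a JSON recovery step can look like, here is a best-effort sketch that targets the three failure modes named above: markdown fences, trailing commas, and truncation. The names `recover_json` and `_close_truncated` are hypothetical, and this is not the project's actual function. It will still fail on some inputs, for example output cut off in the middle of an object key.

```python
import json
import re

# Matches a comma that is directly followed by a closing brace or bracket.
# Naive: it would also alter such a sequence inside a string value.
TRAILING_COMMA = re.compile(r",\s*([}\]])")


def _close_truncated(text):
    # Track open brackets and string state, then append whatever is missing.
    stack = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    if in_string:
        text += '"'  # close a string that was cut off mid-value
    text = text.rstrip().rstrip(",")
    return text + "".join(reversed(stack))


def recover_json(raw):
    text = raw.strip()
    # Drop a leading ```json (or bare ```) fence and a trailing ``` fence.
    text = re.sub(r"^```[a-zA-Z]*\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    # Skip any prose before the first object or array.
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    if not starts:
        raise ValueError("no JSON object or array found")
    text = text[min(starts):]

    # Try progressively more aggressive repairs.
    no_commas = TRAILING_COMMA.sub(r"\1", text)
    closed = TRAILING_COMMA.sub(r"\1", _close_truncated(no_commas))
    for candidate in (text, no_commas, closed):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not recover JSON")
```

Trying the untouched text first means well-formed responses are never modified; the repairs only run when strict parsing has already failed.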

Accomplishments that we're proud of

The self-learning loop genuinely works. The agent starts with zero tools on study 1 and has a full toolkit by study 3, with the reflection engine autonomously building new tools. Persona memory lets synthetic consumers develop persistent identity across studies. The whole thing runs on a single SQLite database with no external infrastructure. That was a design choice, not a constraint.

What we learned

Text-space learning gets you surprisingly far. The stability-plasticity tradeoff from continual learning research doesn't apply the same way when what is learned is stored as structured artifacts in a database and composed into prompts. The model never changes, but effective capability grows with every study.

The evaluation step is everything. The self-improvement loop works because the reflection engine evaluates concrete outcomes (demographic accuracy, question quality, analysis usefulness), not fuzzy proxies. Context window management is the real bottleneck as personas accumulate history.

What's next for Intuit

Longitudinal studies with the same panel over time (concept test week 1, pricing week 2, messaging week 3). Cross-panel meta-analysis to find which persona archetypes are most predictive. Code tool generation for statistical analysis (Van Westendorp, conjoint) so the agent can build and deploy its own analytical functions.

Built With

anthropic
claude
gcp
gemini
next
python
vector

Updates

William Starr started this project — Feb 21, 2026 04:14 PM EST
