Inspiration

MCP agents are everywhere at hackathons, but we don't see them in production much yet. One obstacle is making sure AI responses are reliable along metrics like bias, completeness, and prompt relevance. So we built a realistic agentic workflow and monitored it with another agent.

What it does

It's an agent that compares two stocks using Yahoo Finance and calculator tools, then emails you an analyst report. A second agent reviews the output and scores it on bias, completeness, and prompt relevance.
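The monitoring half of the workflow boils down to: take the prompt and the generated report, score the report on each metric, and flag anything below threshold. Here's a minimal TypeScript sketch of that loop. The heuristics (the `REQUIRED_SECTIONS` rubric, the keyword-overlap relevance check, the mention-count bias check) are hypothetical stand-ins for illustration, not Mastra Evals' actual metrics, which are LLM-graded.

```typescript
interface EvalScores {
  bias: number;            // 0 (one-sided) .. 1 (balanced coverage of both tickers)
  completeness: number;    // 0 .. 1, fraction of rubric sections present
  promptRelevance: number; // 0 .. 1, fraction of prompt terms echoed in the report
}

// Hypothetical rubric: sections an analyst report should cover.
const REQUIRED_SECTIONS = ["price", "valuation", "recommendation"];

function evaluateReport(prompt: string, report: string, tickers: [string, string]): EvalScores {
  const text = report.toLowerCase();

  // Completeness: how many rubric sections does the report mention?
  const covered = REQUIRED_SECTIONS.filter((s) => text.includes(s)).length;
  const completeness = covered / REQUIRED_SECTIONS.length;

  // Prompt relevance: crude keyword overlap between prompt and report.
  const terms = prompt.toLowerCase().split(/\W+/).filter((t) => t.length > 3);
  const hits = terms.filter((t) => text.includes(t)).length;
  const promptRelevance = terms.length ? hits / terms.length : 0;

  // Bias: do both stocks get comparable attention (mention counts)?
  const count = (t: string) => (text.match(new RegExp(t.toLowerCase(), "g")) ?? []).length;
  const [a, b] = [count(tickers[0]), count(tickers[1])];
  const bias = a + b === 0 ? 0 : Math.min(a, b) / Math.max(a, b);

  return { bias, completeness, promptRelevance };
}

const scores = evaluateReport(
  "Compare AAPL and MSFT stocks",
  "AAPL price is up; MSFT price is flat. Valuation favors MSFT. Recommendation: hold both AAPL and MSFT.",
  ["AAPL", "MSFT"]
);
console.log(scores);
```

In the real system, each of these heuristics is replaced by an LLM judge call, but the shape is the same: the evaluator agent is a pure function from (prompt, output) to scores, so it can sit beside any worker agent without touching its logic.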

How we built it

Bright Data for web scraping, Google for the models, and Mastra plus Mastra Evals for the agents and evaluations.

Challenges we ran into

Mastra is super flexible, and there are a lot of ways to integrate it with Next.js (standalone service or monolith), so settling on the right setup took some trial and error.

Accomplishments that we're proud of

It works end to end, and the eval scores look sick (thanks, Claude).

What we learned

Multi-agent monitoring systems are probably necessary for scalable production deployments of agents.

What's next for STEVE

More evaluation metrics, for a start.
