Inspiration

In the modern world, we are drowning in "administrative friction." Whether it is a busy parent trying to contest a late fee, a student navigating a complex insurance claim, or an elderly person struggling to find reliable home repairs, these "micro-stresses" steal hours of our lives and drain our mental energy. The launch of Gemini 3 signaled the start of the Action Era. I was inspired by the shift from static chatbots to autonomous agents that don't just talk, but act. I asked myself: What if everyone had a personal logistics officer in their pocket to handle the chores they hate most? Thus The Invisible Hand was born: a cognitive accessibility agent designed to handle the "boring chores" of everyday life.

How we built it

As a person with a strong vision but no prior coding experience, I leveraged the Vibe Engineering track. I used Google AI Studio’s Build tab to transform natural language requirements into a functional React application. The core of the project is the Gemini 3 Pro multimodal reasoning engine. I integrated:

  • Multimodal Vision: Allowing the agent to "see" and analyze physical bills or complex web forms.
  • Thought Signatures & Thinking Levels: To maintain continuity and self-correction during multi-step tasks like negotiating with a customer service bot.
  • Agentic Workflows: Moving beyond single prompts to create an orchestrator that plans out a multi-step solution for the user.
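
The orchestrator idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code (which was generated as a React application): `Step`, `run_plan`, and the two example steps are hypothetical names, and the steps are plain functions standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration; not taken from the project's code.
Context = dict[str, str]


@dataclass
class Step:
    name: str
    run: Callable[[Context], Context]  # reads the context, returns fields to merge


def run_plan(steps: list[Step], context: Context) -> Context:
    """Run each step in order, merging its output into a shared context."""
    state = dict(context)
    for step in steps:
        state.update(step.run(state))
    return state


def extract_fee(ctx: Context) -> Context:
    # Stand-in for a vision call that would read the fee from a bill photo.
    return {"fee": ctx["bill_text"].split("late fee: ")[1]}


def draft_request(ctx: Context) -> Context:
    # Stand-in for a model call that would draft the message to the provider.
    return {"draft": f"Please waive the late fee of {ctx['fee']}."}


plan = [Step("extract_fee", extract_fee), Step("draft_request", draft_request)]
result = run_plan(plan, {"bill_text": "Account 123, late fee: $25"})
print(result["draft"])
```

Each step only sees the shared context, so later steps can build on what earlier steps produced, which is the property that separates a plan from a single prompt.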

Challenges we ran into

The greatest challenge was ensuring the agent could handle spatial-temporal understanding, recognizing the cause and effect within a complex customer service interaction or a long-running "Marathon" task. I had to refine the system prompts to move away from "Baseline RAG" (simple data retrieval) and toward true reasoning over document sets. Ensuring the agent stayed within safety boundaries, specifically avoiding prohibited medical or mental health advice, required strict architectural guardrails.
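
One possible shape for such a guardrail is a check that runs outside the model, before any request reaches it. The sketch below is an assumption about how this could look, not a description of the project's implementation: the pattern list, the `guard` function, and the refusal text are all hypothetical, and keyword matching is far cruder than a real safety policy would need to be.

```python
import re
from typing import Optional

# Hypothetical blocklist for illustration only; a production system would need
# a much more careful policy than keyword matching.
PROHIBITED_PATTERNS = [
    r"\bdiagnos(e|is)\b",
    r"\bdosage\b",
    r"\bmedication\b",
    r"\btherapy\b",
]

REFUSAL = "I can't help with medical or mental health advice."


def guard(user_request: str) -> Optional[str]:
    """Return a refusal if the request matches a prohibited pattern, else None."""
    lowered = user_request.lower()
    for pattern in PROHIBITED_PATTERNS:
        if re.search(pattern, lowered):
            return REFUSAL
    return None


print(guard("What dosage of ibuprofen should I take?"))
print(guard("Please contest this late fee for me."))
```

Placing the check in ordinary code rather than only in the system prompt means the boundary holds even when the prompt is changed.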

Accomplishments that we're proud of

  • Zero-to-One Development: I am incredibly proud of successfully building a functional, high-fidelity application with zero prior coding experience. By utilizing Vibe Engineering through Google AI Studio, I showed that a strong vision and strategic prompting can now overcome traditional engineering barriers.
  • True Agentic Workflow: I moved beyond a "simple prompt" or "chatbot" interface to create a legitimate orchestrator. My agent doesn't just answer questions; it autonomously plans and prepares complex tasks like multi-step bill negotiations.
  • Leveraging Multimodal Depth: I successfully implemented multimodal vision to analyze messy, real-world physical documents. The agent’s ability to extract specific data from a photo of a bill and immediately apply it to a negotiation strategy demonstrates the "spatial-temporal" understanding requested for the Action Era.
  • Transparent Reasoning: By integrating Thought Signatures, I made the "black box" of AI transparent. I am proud to show users exactly how the agent arrives at its strategy, ensuring human trust in autonomous systems.

What we learned

I learned that the barrier to entry for AI innovation is far lower than I expected. I discovered that Gemini 3’s 1M token context window can stand in for a complex retrieval system when the source material fits in the window, allowing me to feed entire store policies or legal documents directly into the prompt for the model to reason over. Most importantly, I learned that the "Wow Factor" in AI today isn't just about generation; it's about delegation.
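
Whether a set of documents fits in a 1M token window can be estimated before sending anything. The sketch below is a rough budget check under loud assumptions: the figure of 4 characters per token is a common rule of thumb rather than a property of any particular tokenizer, and `estimate_tokens`, `fits_in_context`, and the reserved output budget are hypothetical names and values.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # the 1M token figure quoted in the text
CHARS_PER_TOKEN = 4  # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_output: int = 50_000) -> bool:
    """True if the estimated total still leaves the reserved output budget."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS


policy = "x" * 400_000      # about 100,000 estimated tokens
contract = "y" * 2_000_000  # about 500,000 estimated tokens

print(fits_in_context([policy, contract]))    # 650,000 with reserve: fits
print(fits_in_context([contract, contract]))  # 1,050,000 with reserve: too large
```

When the check fails, the material no longer fits in a single prompt and some form of selection or retrieval is needed again.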

What's next for The Invisible Hand

  • Integration with Gemini 3 Auto-Browse: My next major milestone is moving from "drafting" actions to "executing" them directly on the web. I plan to integrate with Auto-Browse capabilities to allow the agent to securely navigate support portals and submit refund requests on the user's behalf.
  • Gemini Live Voice Support: I aim to implement the Gemini Live API so users can speak to their agent in real time. Imagine being able to tell your phone, "Hey, the internet is down again, can you call the provider and get me a credit?" and having the agent handle the entire audio-based support queue for you.
  • The Marathon Framework: I will evolve the agent into a true Marathon Agent, capable of monitoring long-running disputes such as insurance claims or building permits that span weeks, checking status updates and self-correcting its strategy without any human intervention.
  • Expanding Accessibility: I see "The Invisible Hand" as a vital tool for Cognitive Accessibility, helping those with ADHD or executive dysfunction manage the "administrative tax" of modern life. I plan to build a mobile-first interface to ensure this power is accessible to everyone, everywhere.
