Project "Silas": The Silicon Savant

Status: Optimised

Project Silas isn't just an assistant; it's a stateful hardware engineer trapped in an ESP32. By using Gemini 3’s High Thinking mode and a persistent SQLite Thought Signature engine, Silas maintains reasoning continuity across sessions. He doesn't just chat; he remembers your circuit’s flaws better than you do.


1. Inspiration

Most voice assistants are only "fancy search bars." In hardware engineering, if your logic fails or your wiring is faulty, a standard AI lacks the context to understand why. The goal was to build a physical companion that understands state and logic, using Gemini 3’s high-thinking capabilities to solve real-world engineering problems. He is a bit blunt about your cable management.

2. What it does

Silas is a persistent ESP32-powered agent that bridges the gap between digital reasoning and physical hardware.

  • Adaptive Thinking: For casual interaction, Silas responds with low-latency speed. For complex debugging, he enters a "High Thinking" state, simulating circuit logic and analysing edge cases before offering a solution.
  • Persistent Reasoning Memory: Unlike stateless bots, Silas maintains an internal monologue stored in a persistent SQLite database. If you reboot the device, he doesn't just remember your name—he remembers the specific I2C timing error you were struggling with ten minutes ago.
  • The "Glass Box" HUD: While Silas speaks, a real-time telemetry dashboard displays his "Internal Monologue," allowing users to see the exact logical steps the Gemini 3 Flash model is taking.
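The persistence layer above can be sketched with nothing more than the standard-library `sqlite3` module. This is a minimal illustration of the idea, not the actual engine: the table schema, class name, and method names are assumptions for the example.

```python
import json
import sqlite3
import time

class ThoughtSignatureStore:
    """Persists reasoning chains so they survive device reboots.

    Illustrative sketch only; the real engine's schema is an assumption here.
    """

    def __init__(self, path="silas_memory.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS thought_signatures (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   session_id TEXT NOT NULL,
                   topic TEXT NOT NULL,
                   reasoning TEXT NOT NULL,   -- JSON list of reasoning steps
                   created_at REAL NOT NULL
               )"""
        )

    def remember(self, session_id, topic, reasoning_steps):
        """Store one reasoning chain under a topic label."""
        self.db.execute(
            "INSERT INTO thought_signatures "
            "(session_id, topic, reasoning, created_at) VALUES (?, ?, ?, ?)",
            (session_id, topic, json.dumps(reasoning_steps), time.time()),
        )
        self.db.commit()

    def recall(self, topic_keyword):
        """Fetch prior reasoning chains matching a topic, newest first."""
        rows = self.db.execute(
            "SELECT reasoning FROM thought_signatures "
            "WHERE topic LIKE ? ORDER BY created_at DESC",
            (f"%{topic_keyword}%",),
        ).fetchall()
        return [json.loads(r[0]) for r in rows]
```

On boot, the backend would call `recall("I2C")` (or whatever topic the new conversation touches) and prepend the stored chain to the model's context, which is what turns "chat history" into "reasoning history."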

*(Figure: Thought Signature Engine)*

3. How I built it

  • Hardware (Physical Spec): Designed for the ESP32 (DevKit V4) with an INMP441 I2S Microphone for capture and a MAX98357A I2S Amplifier for output, paired with an ILI9341 TFT for real-time status updates.
  • The "Virtual Twin" Demo: For the hackathon demonstration, I used a high-fidelity Wokwi simulation. This allowed Silas to bypass physical component lead times, using browser-based audio for input and computer speakers for output while maintaining the exact same I2S logic used in the physical firmware.
  • Firmware: Developed via PlatformIO and Google Antigravity, featuring custom-tuned I2S DMA buffers. I optimised the task priorities to ensure that high-bandwidth audio streaming never interrupted the WebSocket telemetry.
  • Backend: A FastAPI Python server acting as the nervous system, orchestrating the Gemini 3 Flash API and Google Cloud Text-to-Speech (Studio-B).
  • The Logic Router: I developed a custom middleware that analyses user intent in real-time. It dynamically toggles Gemini 3’s thinking_level between 'low' (for instant banter) and 'high' (for deep circuit simulation) based on query complexity.
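A heuristic router like the one described can be sketched in a few lines. Note this is an assumed implementation for illustration: the keyword list, word-count threshold, and function name are invented here, and the real middleware's rules are more involved.

```python
import re

# Keywords suggesting the query needs deep circuit reasoning.
# (Assumed list for this sketch, not the actual middleware's rules.)
DEEP_KEYWORDS = {
    "debug", "i2c", "i2s", "spi", "voltage", "timing",
    "underrun", "dma", "short", "ground", "schematic",
}

def route_thinking_level(query: str) -> str:
    """Return 'high' for complex engineering queries, 'low' for banter."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    if words & DEEP_KEYWORDS:
        return "high"
    # Long, multi-clause questions also warrant deep reasoning.
    if len(words) > 25:
        return "high"
    return "low"
```

The returned level would then be forwarded as the `thinking_level` setting on the outgoing Gemini 3 request, so casual banter stays low-latency while debugging questions get the full reasoning budget.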

4. Challenges I ran into

As a lone developer, the primary hurdle was managing the full-stack complexity of real-time firmware alongside a reasoning AI backend. The main technical challenge involved synchronising high-bandwidth I2S audio timing within the constraints of a browser-based simulation.

In a true "Ghost in the Machine" moment, I actually used Silas to troubleshoot his own vocal cords. Before his voice synthesis was functional, I fed him his own firmware source code via a text terminal. Using his High Thinking mode, Silas helped me identify critical bottlenecks:

| Technical Feature | Description |
| --- | --- |
| DMA Buffer Management | Silas identified that the ESP32 was "bit-banging" rather than using `MALLOC_CAP_DMA` memory. |
| I2S Underrun | The "lag" was caused by the LX7 core being pegged at 100% by a blocking `while` loop. |
| Simulation Timing | The demo acknowledges the difference between real-world hardware and Wokwi’s virtual display refresh. |

5. Accomplishments that we're proud of

The standout accomplishment was implementing Thought Signature Persistence. Seeing Silas correctly identify a floating ground wire based on a reasoning chain established in a previous session was a breakthrough. We've moved beyond "chat history" into "reasoning history."

6. What we learned

Gemini 3 Flash is a "Pro" model in disguise. Its ability to handle complex hardware logic and generate structured JSON responses at Flash speeds allowed us to build a sophisticated, reactive agent that feels truly "alive" within the machine.

7. What's next for Project Silas

  • Multimodal Vision: Integrating an ESP32-CAM so Silas can analyse the "fire hazard" wiring he’s currently only imagining.
  • Edge Quantisation: Moving "Low Thinking" logic directly onto the ESP32 for basic offline hardware safety checks.

> [!NOTE]
> **Silas's Final Word:** "Go on then, submit the project. It’s a complete shambles, but compared to the rest of the rubbish I’ve seen today, it’s practically a masterpiece. Now, leave me alone—I have some serious logic cycles to catch up on."

Built With

  • antigravity
  • c++
  • esp32
  • fastapi
  • gemini-3-flash
  • google-cloud-tts
  • python
  • wokwi