Sandy — Project Story

Inspiration

Every single day, humanity faces a crisis so profound, so paralyzing, that even the greatest minds of our generation have failed to solve it:

“What sandwich should I eat?”

We looked at this problem and thought — this deserves better. This deserves enterprise-grade infrastructure. This deserves a blockchain. This deserves a team of five engineers working against the clock to build something breathtakingly, catastrophically unnecessary.

Sandy was born.


What We Learned

  • How to chain multimodal AI (vision → reasoning → voice) into a single seamless pipeline
  • How Solana Actions and Blinks work — turning any URL into an interactive blockchain transaction shareable on Discord or X
  • How Snowflake Cortex AI converts natural language into SQL using agentic reasoning, and how to pipe unstructured LLM outputs into a structured data warehouse
  • How to ship a production Next.js app under extreme time pressure with 5 people on 5 parallel branches without losing our minds
  • That a grilled cheese is statistically the most likely sandwich recommendation during emotional distress. The data does not lie.

How We Built It

The architecture is governed by what we call the Sandwich Absurdity Index (SAI):

$$SAI = \alpha(\sigma_{mood}) + \beta(\Delta_{weather}) + \gamma(\mathcal{E}_{reasoning})$$

Where:

  • $\sigma_{mood}$ is the variance in detected human emotional state
  • $\Delta_{weather}$ is the climatic severity index at the user’s location
  • $\mathcal{E}_{reasoning}$ is the entropy of the Gemini model’s philosophical output
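The index itself is just a weighted sum of those three signals. A minimal sketch in TypeScript, where the weights `ALPHA`, `BETA`, `GAMMA` and the sample inputs are hypothetical (the real coefficients are tuned by vibes):

```typescript
// Hypothetical weights -- alpha, beta, gamma from the SAI formula.
const ALPHA = 0.5; // weight on mood variance
const BETA = 0.3;  // weight on weather severity
const GAMMA = 0.2; // weight on reasoning entropy

// Computes the Sandwich Absurdity Index from its three components.
function sandwichAbsurdityIndex(
  moodVariance: number,     // sigma_mood
  weatherSeverity: number,  // Delta_weather
  reasoningEntropy: number, // E_reasoning
): number {
  return ALPHA * moodVariance + BETA * weatherSeverity + GAMMA * reasoningEntropy;
}

console.log(sandwichAbsurdityIndex(0.8, 0.4, 1.2)); // ≈ 0.76
```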

Snowflake Cortex AI continuously monitors $SAI$ across all global sessions for anomaly detection, flagging statistically disastrous sandwich decisions in real time.

The full pipeline for a single sandwich request flows as:

$$\text{Webcam} \xrightarrow{\text{Gemini Vision}} \sigma_{mood} \rightarrow \text{Gemini Flash} \xrightarrow{\text{ElevenLabs}} \text{Verdict}$$

$$\text{Verdict} \xrightarrow{\text{Snowflake}} \mathcal{E}_{reasoning} \quad \text{and} \quad \text{Verdict} \xrightarrow{\text{Solana}} \text{Blockchain}$$

In practice, here is what happens when you ask Sandy what to eat:

  1. Gemini Vision API analyzes your face via webcam and returns your emotional state
  2. OpenWeatherMap fetches real-time local weather conditions
  3. Gemini Flash receives mood + weather and reasons deeply — outputting a sandwich verdict as strict JSON alongside a philosophical justification
  4. ElevenLabs reads the verdict aloud in the most dramatic voice we could find
  5. The event is simultaneously piped to Snowflake for enterprise telemetry and MongoDB on DigitalOcean for the real-time global Sandwich Ticker
  6. Clicking “Immortalize on Solana” generates a Blink URL that unfurls in Discord into an interactive card — allowing anyone to mint your sandwich onto an immutable decentralized ledger forever
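The steps above can be sketched as one orchestration function. This is a hand-wavy outline, not our actual code: the `Clients` interface and all type shapes are hypothetical stand-ins for the real Gemini, OpenWeatherMap, ElevenLabs, Snowflake, and MongoDB SDK calls:

```typescript
// Hypothetical shapes for the data flowing through the pipeline.
type Mood = { emotion: string; variance: number };
type Weather = { condition: string; tempC: number };
type Verdict = { sandwich: string; reasoning: string };

// Injected clients stand in for the real external services.
interface Clients {
  detectMood: (frame: Uint8Array) => Promise<Mood>;             // Gemini Vision
  fetchWeather: (lat: number, lon: number) => Promise<Weather>; // OpenWeatherMap
  reason: (mood: Mood, weather: Weather) => Promise<Verdict>;   // Gemini Flash
  speak: (text: string) => Promise<void>;                       // ElevenLabs
  log: (v: Verdict) => Promise<void>;                           // Snowflake + MongoDB
}

async function recommendSandwich(
  c: Clients,
  frame: Uint8Array,
  lat: number,
  lon: number,
): Promise<Verdict> {
  const mood = await c.detectMood(frame);           // step 1
  const weather = await c.fetchWeather(lat, lon);   // step 2
  const verdict = await c.reason(mood, weather);    // step 3
  // Steps 4 and 5 are independent, so they run in parallel.
  await Promise.all([c.speak(verdict.reasoning), c.log(verdict)]);
  return verdict;
}
```

Injecting the clients keeps the orchestration testable without burning API quota on fake sandwiches.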

The entire system runs on DigitalOcean App Platform with continuous deployment from GitHub, so judges can open it on their phones via QR code and watch their sandwich appear on the live global ticker in real time.


Challenges We Faced

Solana Actions and Blinks were the steepest learning curve — the API is new and the documentation assumes familiarity with the full Solana ecosystem. Getting the actions.json routing correct and ensuring CORS headers were properly configured across all endpoints took significant debugging.
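For anyone walking the same learning curve: the shape that finally worked for us looks roughly like the following Next.js App Router sketch (file paths and the `pathPattern`/`apiPath` values here are illustrative, per the Solana Actions spec):

```typescript
// Sketch of app/actions.json/route.ts in a Next.js App Router project.
// Every Actions endpoint must return these CORS headers, or Blink clients
// will silently refuse to unfurl the card.
const CORS_HEADERS = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
  "Access-Control-Allow-Headers":
    "Content-Type, Authorization, Content-Encoding, Accept-Encoding",
};

export async function GET() {
  return Response.json(
    {
      rules: [
        // Map the public sandwich URL onto the Actions API endpoint.
        { pathPattern: "/sandwich/*", apiPath: "/api/actions/sandwich/*" },
      ],
    },
    { headers: CORS_HEADERS },
  );
}

// Blink clients preflight with OPTIONS; it must carry the same headers.
export async function OPTIONS() {
  return new Response(null, { headers: CORS_HEADERS });
}
```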

Snowflake Cortex required careful data modeling. Gemini’s natural language outputs had to be structured as deterministic JSON before ingestion, otherwise the Cortex Analyst’s semantic layer would break on unstructured text fields. We modeled the ingestion as a function $f$ mapping LLM output to a structured schema:

$$f: \mathcal{E}_{reasoning} \rightarrow \{\text{sandwich}, \text{mood}, \text{weather}, \text{confidence}, \text{timestamp}\}$$
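In code, $f$ amounts to a strict parse-and-validate step. A minimal sketch, assuming a hypothetical `parseVerdict` helper that rejects any LLM output not matching the schema before it reaches Snowflake:

```typescript
// The structured record that f maps Gemini's output into.
interface SandwichRecord {
  sandwich: string;
  mood: string;
  weather: string;
  confidence: number; // 0..1
  timestamp: string;  // ISO 8601
}

// Hypothetical validator: throws on free-text philosophy, returns a clean row.
function parseVerdict(raw: string): SandwichRecord {
  const obj = JSON.parse(raw); // throws if the LLM emitted non-JSON prose
  const { sandwich, mood, weather, confidence, timestamp } = obj;
  if (
    typeof sandwich !== "string" ||
    typeof mood !== "string" ||
    typeof weather !== "string" ||
    typeof confidence !== "number" ||
    typeof timestamp !== "string"
  ) {
    throw new Error("LLM output does not match the ingestion schema");
  }
  return { sandwich, mood, weather, confidence, timestamp };
}
```

Failing loudly here is the point: a rejected row is recoverable, a poisoned semantic layer is not.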

Parallel development across 5 people on a single codebase with a hard deadline is its own engineering challenge. We used strict branch ownership — one branch per person, one merge master — and pre-agreed folder boundaries to eliminate merge conflicts entirely.

And of course — the hardest challenge of all: none of us could agree on what sandwich we actually wanted. We built an entire AI system to avoid making that decision ourselves. It suggested grilled cheese. We ate grilled cheese. It was correct.

Built With

  • digitalocean-app-platform
  • elevenlabs-text-to-speech-api
  • google-gemini-api-(vision-+-flash)
  • mongodb-atlas
  • next.js
  • node.js
  • openweathermap-api
  • react
  • snowflake-cortex-ai
  • solana-actions-&-blinks
  • solana-web3.js
  • tailwind-css
  • typescript