Inspiration

Five of us packed into a car and drove from Chicago to San Francisco to attend the hackathon. Somewhere in the Nevada desert (no towns, no cell signal, just flat highway stretching to the horizon), our tire blew out.

We had a spare. We had tools. What we didn't have was any idea how to use them properly. None of us remembered the lug nut torque sequence. Nobody knew if our car had a full-size spare or a donut with a speed limit. We couldn't look any of it up. We sat stranded on the side of the road for hours.

Thirty seconds on Google would have answered every question. But we had no Google. That gap — between having a smartphone and having actual help — is exactly what Glovebox is built to close.

What it does

Glovebox is an offline AI copilot for stranded drivers. Open the app, describe your problem in plain English, and it walks you through it — step by step, in a conversational chat — with zero internet required.

It's not just a generic how-to guide. Glovebox uses RAG (retrieval-augmented generation) over your vehicle's owner's manual, so when you ask "where's the jack storage on my car," it finds the right answer for your specific model and year — not a generic YouTube video that might not match your setup.

How we built it

Glovebox is a React Native app for iOS built in TypeScript. The core is on-device LLM inference via llama.rn, running a quantized GGUF model (Llama 3.2-1B-Instruct or Gemma-2-2B at Q4_K_M — under 1.5GB). All inference happens locally on the iPhone; no API calls, no backend.

On top of the LLM, I built a lightweight keyword-based RAG pipeline that chunks and indexes the vehicle owner's manual and retrieves the most relevant sections to include in the model's context window for each query.
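
A minimal sketch of what keyword-based retrieval along these lines can look like. This is not the app's actual code: the names (ManualChunk, chunkManual, retrieve, buildPrompt) are hypothetical, the sample manual text is invented for illustration, and both the paragraph-level chunking and the scoring rule (count of distinct query keywords found in a chunk) are assumptions.

```typescript
// Hypothetical shape for illustration; not the app's actual type.
interface ManualChunk {
  id: number;
  text: string;
}

// Small, assumed stop-word list so filler words do not count as matches.
const STOP_WORDS = new Set([
  "the", "a", "an", "is", "are", "on", "my", "of", "to", "in",
  "for", "and", "where", "how", "do", "i", "s",
]);

// Lowercase, split on non-alphanumerics, drop empty tokens and stop words.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && !STOP_WORDS.has(t));
}

// Chunk the manual on blank lines: one chunk per paragraph.
function chunkManual(manual: string): ManualChunk[] {
  return manual
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((text, id) => ({ id, text }));
}

// Rank chunks by how many distinct query keywords they contain,
// breaking ties by original order, and keep the top K non-zero scores.
function retrieve(query: string, chunks: ManualChunk[], topK = 2): ManualChunk[] {
  const keywords = new Set(tokenize(query));
  return chunks
    .map((chunk) => {
      const words = new Set(tokenize(chunk.text));
      let score = 0;
      keywords.forEach((k) => {
        if (words.has(k)) score++;
      });
      return { chunk, score };
    })
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score || a.chunk.id - b.chunk.id)
    .slice(0, topK)
    .map((s) => s.chunk);
}

// Place the retrieved sections ahead of the question in the model's context.
function buildPrompt(query: string, chunks: ManualChunk[]): string {
  const context = chunks.map((c) => c.text).join("\n\n");
  return `Answer using only this owner's manual excerpt.\n\n${context}\n\nQuestion: ${query}`;
}

// Invented sample text standing in for a real owner's manual.
const manual = [
  "Jack storage: the jack and lug wrench are stored under the cargo floor panel.",
  "Spare tire: the compact spare tire is for temporary use only.",
  "Engine oil: check the oil level with the dipstick.",
].join("\n\n");

const chunks = chunkManual(manual);
const query = "where's the jack storage on my car";
const hits = retrieve(query, chunks);
const prompt = buildPrompt(query, hits);
```

Keyword overlap misses paraphrases ("flat" vs. "puncture"), which is the usual motivation for moving to embeddings later, but it needs no extra model on the device.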

Offline state is detected via React Native's NetInfo API and surfaced clearly in the UI — the "Offline mode: ON" indicator reflects real network state, not a hardcoded flag.
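
As a rough sketch of how a network state can be mapped to that indicator: the NetworkSnapshot type below is a local stand-in for the two fields I assume are read from NetInfo's state (isConnected and isInternetReachable), and isOffline and offlineLabel are hypothetical helper names, not the app's actual code.

```typescript
// Local stand-in for the NetInfo state fields assumed to be used here.
// Both can be null while the platform has not determined them yet.
interface NetworkSnapshot {
  isConnected: boolean | null;
  isInternetReachable: boolean | null;
}

// Assumed policy: treat "unknown connection" as offline, and treat a
// connected network that cannot reach the internet as offline too.
function isOffline(state: NetworkSnapshot): boolean {
  if (state.isConnected !== true) return true;
  return state.isInternetReachable === false;
}

// Text for the UI indicator, derived from real state rather than a constant.
function offlineLabel(state: NetworkSnapshot): string {
  return isOffline(state) ? "Offline mode: ON" : "Offline mode: OFF";
}

// In the app, this would presumably be driven by a NetInfo subscription,
// for example: NetInfo.addEventListener((state) => setLabel(offlineLabel(state)));
// where setLabel is a hypothetical React state setter.
const desertHighway: NetworkSnapshot = { isConnected: false, isInternetReachable: false };
const label = offlineLabel(desertHighway);
```

Keeping the mapping in a pure function makes the indicator easy to test without a device or a network.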

Challenges we ran into

Getting a quantized LLM to run acceptably fast on-device was the biggest challenge: model selection and quantization level directly affect both response latency and accuracy. Bundling the GGUF weights as a native resource in Xcode and wiring them into the React Native bridge via llama.rn took significant trial and error.

Building this as a solo developer under hackathon time pressure meant making deliberate tradeoffs: iOS only for now, keyword-based RAG rather than vector embeddings, a curated set of emergency procedures rather than an exhaustive manual library.

Accomplishments that we're proud of

Apart from the technical accomplishments, I'm proud that this solves a real problem I actually lived. The five of us stranded in Nevada weren't edge cases. There are 46 million roadside breakdowns in the US every year, and a huge chunk of them happen where there's no signal. Glovebox would have gotten us back on the road in 20 minutes instead of 3 hours.

What we learned

I learned that on-device AI is genuinely viable in 2025 — but model selection is everything. The difference between a 1B and a 3B parameter model isn't just size; it's whether the responses are actually useful under the constraints of a glove compartment emergency. Quantization level (Q4_K_M vs Q8) has a real effect on both latency and coherence, and finding that sweet spot took most of my first night.

Building this solo also taught me to be brutal about scope. No Android, no vector DB, no fancy UI: just one thing that works completely offline, answers car questions accurately, and launches in under three seconds. Constraints forced clarity.

What's next for Glovebox

  • Android support via the same llama.rn bridge
  • Vector-based semantic RAG for more accurate retrieval
  • An offline mesh network that lets nearby users connect with each other