Inspiration

Every year, over 500,000 truck crashes occur in the United States, with 3,300 deaths attributed to distracted driving alone. A large portion of that distraction comes from drivers switching between apps, taking dispatcher phone calls, and manually checking GPS while moving at highway speeds.

We talked to fleet managers at JY Carriers and heard the same frustration repeatedly: dispatchers drowning in spreadsheets, drivers context-switching between 4 different tools, and critical decisions — which driver has enough HOS? Who's closest to the load? Is that route truck-legal? — taking minutes of manual lookup that should take seconds.

The insight was simple: the interface itself is the problem. Trucks are loud, hands are on the wheel, eyes are on the road. The right interface is voice.


What it does

Trucky is a voice-first, AI-native fleet operating system with two sides that share live state:

Dispatcher Co-Pilot — Dispatchers speak naturally into a TMS dashboard: "Who can make it to Chicago by 6pm Tuesday?" Trucky checks live HOS clocks across all 49 drivers, calculates route times with mandatory rest stops under FMCSA regulations, and answers in under 2 seconds. Load creation, driver assignment, fleet alerts — all hands-free.

Driver Co-Pilot — An always-on voice assistant in the cab. Drivers ask about their loads, get truck-safe routing (bridge heights, weight limits), flag fatigue, and report issues — without touching their phone. A fatigue alert from the driver instantly surfaces on the dispatcher dashboard in real time.

Both sides connect to live Samsara ELD data — 49 real drivers, real HOS clocks, real GPS, real fuel levels.


How we built it

Layer       Technology
Voice AI    Gemini 2.5 Flash Native Audio (Live API)
SDK         Google GenAI Python SDK
Backend     FastAPI on Google Cloud Run
CI/CD       Google Cloud Build
Fleet Data  Samsara ELD (49 live drivers)
Routing     OSRM + Nominatim (truck-aware)
Frontend    Next.js on Vercel

The core is a persistent WebSocket between the browser and Cloud Run. PCM16 audio streams bidirectionally at 16kHz (mic) and 24kHz (playback). We chose Gemini Native Audio specifically to eliminate the ASR → LLM → TTS round-trip — barge-in works naturally, latency stays low, and the voice feels conversational.
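As a rough illustration of the audio framing described above, here is a minimal Python sketch that packs samples as PCM16 and splits them into fixed-duration frames ready to send over a socket. The 20 ms frame size and the function names are assumptions made for this example, not details of Trucky's code; only the 16 kHz and 24 kHz rates come from the write-up.

```python
import math
import struct

MIC_RATE_HZ = 16_000       # mic sample rate quoted in the write-up
PLAYBACK_RATE_HZ = 24_000  # playback sample rate quoted in the write-up
CHUNK_MS = 20              # assumed frame duration, not taken from the project


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_pcm16(pcm, sample_rate_hz, chunk_ms=CHUNK_MS):
    """Split a PCM16 byte string into fixed-duration frames."""
    bytes_per_chunk = sample_rate_hz * chunk_ms // 1000 * 2  # 2 bytes per sample
    return [pcm[i:i + bytes_per_chunk] for i in range(0, len(pcm), bytes_per_chunk)]


# 100 ms of a 440 Hz test tone at the mic rate
tone = [math.sin(2 * math.pi * 440 * n / MIC_RATE_HZ) for n in range(MIC_RATE_HZ // 10)]
frames = chunk_pcm16(float_to_pcm16(tone), MIC_RATE_HZ)
```

Small frames keep per-frame latency low, which matters for barge-in; the right size in practice depends on the transport and on the API's limits.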

Gemini's function calling handles every tool: HOS lookup, load creation, route planning, driver messaging, and fleet alerts — all grounded in live Samsara data.
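One way to picture the tool layer is a small registry that maps a function-call name to a Python handler. The sketch below is SDK-agnostic and makes no claim about how the Google GenAI SDK delivers calls; the tool name `get_hos`, the `FAKE_HOS` data, and the `dispatch` helper are all hypothetical stand-ins for the live Samsara-backed tools.

```python
from typing import Any, Callable, Dict

# Registry of tool handlers keyed by the name the model would call.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def tool(fn):
    """Register a function as a callable tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


# Stand-in data; the real system reads live ELD data instead.
FAKE_HOS = {"driver-1": 6.5, "driver-2": 1.0}


@tool
def get_hos(driver_id: str) -> Dict[str, Any]:
    """Hypothetical HOS lookup returning remaining drive hours."""
    if driver_id not in FAKE_HOS:
        return {"error": f"unknown driver {driver_id}"}
    return {"driver_id": driver_id, "drive_hours_left": FAKE_HOS[driver_id]}


def dispatch(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Route a model-issued call to its handler, returning errors as data."""
    handler = TOOLS.get(name)
    if handler is None:
        return {"error": f"unknown tool {name}"}
    try:
        return handler(**args)
    except TypeError as exc:  # missing or unexpected arguments
        return {"error": f"bad arguments for {name}: {exc}"}
```

Returning errors as plain dictionaries, rather than raising, lets the model hear about a failed lookup and recover within the conversation.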


Challenges we ran into

Transcript deduplication was unexpectedly complex. Gemini's Live API fires output_transcription events per-word and turn_complete multiple times per conversation round. We went through several iterations before landing on the right solution: accumulate on the backend, send once per turn, deduplicate on the frontend with an exact-match suppression window.
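The two halves of that approach can be sketched as follows. This is an illustration under assumptions, not the project's code: the class names, the 3-second window, and the choice to join fragments without adding spaces (assuming each fragment carries its own whitespace) are all made up for the example, and the frontend half is written in Python here only to keep the examples in one language.

```python
class TurnAccumulator:
    """Backend side: collect per-word fragments and emit once per turn."""

    def __init__(self):
        self._parts = []

    def add_fragment(self, text):
        if text:
            self._parts.append(text)

    def complete_turn(self):
        """Return the joined transcript, or None for a repeated or empty completion."""
        if not self._parts:
            return None
        transcript = "".join(self._parts).strip()
        self._parts = []
        return transcript or None


class SuppressionWindow:
    """Frontend side: drop exact-match repeats seen within window_s seconds."""

    def __init__(self, window_s=3.0):
        self.window_s = window_s
        self._last_seen = {}

    def accept(self, text, now):
        last = self._last_seen.get(text)
        self._last_seen[text] = now  # every sighting restarts the window
        return last is None or (now - last) > self.window_s
```

Passing `now` explicitly keeps the window testable; a caller would typically supply `time.monotonic()`.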

FMCSA compliance inside the AI layer required Gemini's tool calls to calculate not just distance but mandatory 10-hour rest stops, break-due windows, and cycle resets before committing a driver to a load. Getting this right with real HOS data took careful prompt engineering and tool design.
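To make the rest-stop arithmetic concrete, here is a deliberately simplified sketch. It models only the 11-hour driving limit and the 10-hour off-duty reset for property-carrying drivers; the 14-hour on-duty window, the 30-minute break rule, and the 60/70-hour cycle are omitted, and the function names are hypothetical rather than Trucky's actual tools.

```python
import math

MAX_DRIVE_HOURS = 11.0  # driving limit per shift (property-carrying drivers)
REST_HOURS = 10.0       # consecutive off-duty hours that reset the limit


def elapsed_trip_hours(drive_hours_needed, drive_hours_left):
    """Wall-clock hours to finish a trip, inserting 10-hour rests as needed.

    Simplified model: ignores the 14-hour window, 30-minute breaks,
    and the weekly cycle.
    """
    if drive_hours_needed < 0 or not 0 <= drive_hours_left <= MAX_DRIVE_HOURS:
        raise ValueError("hours out of range")
    if drive_hours_needed <= drive_hours_left:
        return drive_hours_needed  # fits in the current shift
    remaining = drive_hours_needed - drive_hours_left
    rests = math.ceil(remaining / MAX_DRIVE_HOURS)  # one rest before each later shift
    return drive_hours_needed + rests * REST_HOURS


def can_make_deadline(drive_hours_needed, drive_hours_left, hours_until_deadline):
    """True if the driver can arrive in time under the simplified model."""
    return elapsed_trip_hours(drive_hours_needed, drive_hours_left) <= hours_until_deadline
```

For example, a 14-hour drive for a driver with 6 hours left takes 24 hours of wall-clock time in this model: 6 hours of driving, a 10-hour rest, then the remaining 8 hours.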

Zero-touch driver UX meant every single feature had to work purely by voice, including edge cases like bridge height warnings (the Merritt Parkway has a 12'6" clearance that consumer GPS won't flag), fuel stop recommendations based on live fuel telemetry, and fatigue detection that escalates to the dispatcher automatically.
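The bridge-height and weight checks can be illustrated with a small route filter. This is a sketch under assumptions: the `Segment` and `Truck` types, the 0.25 ft safety margin, and the sample route are invented for the example, and it does not reflect how OSRM or Nominatim expose restriction data. The 12.5 ft figure is the 12'6" clearance quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    name: str
    clearance_ft: Optional[float] = None     # None means no known posted limit
    weight_limit_lb: Optional[float] = None  # None means no known posted limit


@dataclass(frozen=True)
class Truck:
    height_ft: float
    gross_weight_lb: float


def violations(route: List[Segment], truck: Truck, margin_ft: float = 0.25) -> List[str]:
    """List every segment the truck cannot legally or safely use."""
    problems = []
    for seg in route:
        if seg.clearance_ft is not None and truck.height_ft + margin_ft > seg.clearance_ft:
            problems.append(f"{seg.name}: clearance {seg.clearance_ft} ft")
        if seg.weight_limit_lb is not None and truck.gross_weight_lb > seg.weight_limit_lb:
            problems.append(f"{seg.name}: weight limit {seg.weight_limit_lb} lb")
    return problems


# Hypothetical route and truck for illustration
route = [Segment("I-95"), Segment("Merritt Parkway", clearance_ft=12.5)]
truck = Truck(height_ft=13.5, gross_weight_lb=70_000)
problems = violations(route, truck)
```

A route with a non-empty `problems` list would be rejected and an alternative requested; a voice assistant can read the list back to the driver as the reason.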


Accomplishments that we're proud of

  • Live fleet data, not mock data. 49 real drivers from JY Carriers with live HOS clocks, GPS positions, fuel levels, and engine state, all accessible to Trucky in real time.
  • Sub-2-second voice responses on complex queries that require multi-step tool calls against live APIs.
  • True bidirectional coordination: a driver flagging fatigue by voice immediately creates a priority alert on the dispatcher's dashboard without any manual step.
  • Truck-aware routing that correctly rejects routes with bridge height or weight restrictions — something consumer GPS consistently fails at.
  • Built and deployed end-to-end on Google Cloud in under two weeks.

What we learned

Gemini Native Audio isn't just a faster TTS pipeline; it fundamentally changes what voice AI can do in high-stakes, hands-occupied environments. The ability to interrupt mid-sentence, maintain full context across a dispatch shift, and execute real tool calls against live data makes this genuinely useful, not just a demo.

Fleet operations turned out to be a perfect domain for agentic AI: decisions are time-sensitive, data is structured, and the cost of a wrong answer (a bridge strike, an HOS violation, a missed delivery window) is concrete and measurable. That clarity made building the right tool boundaries surprisingly straightforward.


What's next for Trucky-AI

  • Predictive delay alerts: proactively surface HOS + traffic conflicts before they become violations
  • Multi-driver broadcast: the dispatcher speaks once and the relevant drivers are notified automatically
  • Load matching optimization: Trucky recommends the best driver across the full fleet, not just the fastest
  • ELD compliance reporting: auto-generate FMCSA-ready logs from voice session history
  • Expansion to other fleets: the Samsara integration is fleet-agnostic; any carrier can plug in

Built With

  • fastapi
  • google-cloud-build
  • google-cloud-run
  • google-container-registry
  • google-gemini-2.5-flash-native-audio
  • google-genai-python-sdk
  • next.js
  • nominatim
  • osrm
  • pcm16-audio
  • python
  • react
  • samsara-eld-api
  • typescript
  • vercel
  • websocket