Aequo — Project Story

Team: UPCTeam (Barcelona)


Inspiration

Small, everyday tasks can become real barriers for people with limited mobility or those who prefer voice interactions. Aequo was created to make it simple to ask for practical help using voice, receive concise guidance, and escalate to human volunteers when a situation needs it. The prototype prioritizes accessibility, clarity, and reliable demo behavior.

What it does

  • Voice-first input: Accepts spoken or typed requests and returns clear, actionable guidance.
  • Photo-aware assistance: Prompts for photos when visuals improve the suggestion.
  • Volunteer escalation: Suggests and publishes local help requests and provides a volunteer dashboard to accept and complete tasks.
  • Multilingual: Supports English, Spanish, and Slovak.
  • Spoken replies: Uses on-device TTS and supports richer remote voices (such as ElevenLabs) when available.
  • Volunteer incentives (experimental): A prototype design for tokenized volunteer incentives using Solana (SPL); repository diagrams describe an NFC handshake flow and a fee relayer that can sponsor transaction fees.

How we built it

Aequo is a single Flutter codebase with a cloud-backed persistence layer. The app combines speech recognition, text-to-speech, image attachments, and optional LLM-based responses.
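
As a concrete illustration of the cloud-backed persistence layer and the volunteer escalation flow, the sketch below publishes a help request and lets a volunteer accept it through the Supabase client. The `help_requests` table and its column names are assumptions for illustration; the actual schema may differ.

```dart
import 'package:supabase_flutter/supabase_flutter.dart';

/// Publishes a help request so nearby volunteers can see it.
/// The `help_requests` table and its columns are assumed here
/// for illustration; the real schema may differ.
Future<Map<String, dynamic>> publishHelpRequest({
  required String userId,
  required String description,
  String? photoUrl,
}) async {
  final client = Supabase.instance.client;
  final row = await client
      .from('help_requests')
      .insert({
        'requester_id': userId,
        'description': description,
        'photo_url': photoUrl,
        'status': 'open',
      })
      .select()
      .single();
  return row;
}

/// Marks a request as accepted by a volunteer (dashboard action).
Future<void> acceptRequest(String requestId, String volunteerId) async {
  await Supabase.instance.client
      .from('help_requests')
      .update({'status': 'accepted', 'volunteer_id': volunteerId})
      .eq('id', requestId);
}
```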

One of the most technically interesting parts of the project is on-device AI inference. We integrated a model from the Gemma family, running directly on the device with no network round-trip. This enables real-time LLM responses with low latency, even on mid-range hardware. Beyond speed, the approach is inherently private (user data never leaves the device) and works fully offline, which is especially valuable for accessibility use cases where connectivity may be limited or unreliable.
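
A minimal sketch of how on-device inference could be wired into the app. The `OnDeviceLlm` interface below is a hypothetical wrapper, not a real plugin API: the exact calls depend on the runtime binding used to load the Gemma model.

```dart
// Hypothetical wrapper around an on-device LLM runtime (for example,
// a MediaPipe LLM Inference binding). Names here are illustrative,
// not a real plugin API.
abstract class OnDeviceLlm {
  /// Loads a quantized, mobile-optimized model file from local storage.
  Future<void> load(String modelPath);

  /// Generates a response as a token stream for a responsive UI.
  Stream<String> generate(String prompt);
}

class AssistantEngine {
  AssistantEngine(this._llm);
  final OnDeviceLlm _llm;

  /// Streams a concise, spoken-friendly answer entirely on-device:
  /// no network round-trip, so it also works offline.
  Stream<String> answer(String userUtterance) {
    final prompt = 'You are an accessibility assistant. '
        'Reply in short, actionable steps.\n\nUser: $userUtterance';
    return _llm.generate(prompt);
  }
}
```

Streaming tokens rather than waiting for the full completion keeps the UI responsive and lets spoken replies begin before generation finishes.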

Challenges

  • Deploying a capable LLM locally on a mobile device while maintaining real-time inference performance.
  • Handling media safely and keeping image-centered prompts robust.
  • Tuning the heuristics that decide when to request a photo or escalate to human help (see the sketch after this list).
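
To make that concrete, here is a minimal rule-based sketch of the decision; the cue lists and routing are illustrative, not the tuned heuristics from the app.

```dart
/// Possible outcomes for an incoming help request.
enum NextStep { answerDirectly, requestPhoto, escalateToVolunteer }

// Illustrative keyword cues only; the real heuristics are tuned.
const _visualCues = ['label', 'screen', 'button', 'color', 'which one'];
const _physicalCues = ['carry', 'lift', 'reach', 'pick up', 'come help'];

NextStep decideNextStep(String request, {bool hasPhoto = false}) {
  final text = request.toLowerCase();
  // Physical tasks cannot be solved with advice alone: escalate.
  if (_physicalCues.any(text.contains)) return NextStep.escalateToVolunteer;
  // Visually grounded questions benefit from a photo if we lack one.
  if (!hasPhoto && _visualCues.any(text.contains)) return NextStep.requestPhoto;
  return NextStep.answerDirectly;
}
```
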
Accomplishments

  • A demo-first UX that remains functional without cloud credentials.
  • End-to-end multimodal flows combining voice, images, and volunteer escalation.
  • Accessibility-focused interactions: hands-free modes and concise spoken replies.
  • On-device AI inference: Real-time LLM responses running locally using a mobile-optimized Gemma model, with no cloud dependency, low latency, and full privacy.

What we learned

  • Rule-based heuristics are useful initially; labeled data will improve intent detection.
  • Running a capable LLM directly on a mobile device is not only feasible but surprisingly fast with the right model family — on-device inference is a practical option, not just a theoretical one.

Built With

  • android
  • dart
  • elevenlabs
  • flutter
  • flutter-dotenv
  • flutter-riverpod
  • flutter-tts
  • gemma
  • go-router
  • google-gemini
  • http
  • image-picker
  • intl
  • ios
  • just-audio
  • mime
  • permission-handler
  • postgresql
  • shared-preferences
  • speech-to-text
  • supabase
  • uuid