BeyondMaps

The APK will be available through a GitHub release. We are still preparing that release, so for now we are sharing the GitHub repository link: https://github.com/Ahadke/BeyondMaps

Inspiration

Most travel apps assume that travelers always have reliable internet, but that assumption breaks down in subways, airports, rural areas, on international trips, and anywhere roaming is expensive. We wanted to build a travel assistant that could still answer useful questions when the phone is completely offline.

BeyondMaps started from a simple idea: what if your phone could carry a private travel assistant that can understand your surroundings, answer local questions, translate signs and menus, and help in unfamiliar places without sending your location, images, or questions to the cloud?

Our focus was to make travel help feel immediate, private, and practical: the kind of assistant you can use when you are lost, reading a foreign menu, trying to understand a sign, or asking quick questions without depending on internet access.

What it does

BeyondMaps is an offline Android travel companion powered by on-device AI. It helps travelers ask questions, translate text, and understand their surroundings without relying on internet access or cloud APIs.

Travel Guide

The Travel Guide opens a chat experience where users can ask natural-language travel questions. Instead of sending the query online, BeyondMaps retrieves relevant information from a local RAG knowledge pack and sends that context to the on-device LLM.

Users can ask about:

Restaurants, cafés, and local food options
Attractions and things to see
Public transport and tickets
Emergency and safety guidance
Useful local phrases
Distances between landmarks when coordinate data is available

The goal is not just to run an LLM locally, but to ground the model in downloaded travel knowledge so answers are practical, place-aware, and usable offline.

Offline Translator

BeyondMaps includes an offline translation flow for common travel situations. Users can type or speak, and the app uses on-device AI to translate without calling external APIs.

This is useful for quick real-world communication, such as asking for help, understanding directions, ordering food, or translating short travel-related phrases.

Camera Understanding

The camera feature helps users understand the physical world around them. BeyondMaps can read text from menus, signs, and travel documents using OCR, then pass the extracted context to the local LLM for translation or explanation.

The broader vision is to extend this into richer on-device visual understanding, where the app can help identify landmarks, museum boards, trail signs, plants, birds, animals, or other objects using FastVLM-style vision-language models.

How we built it

We built BeyondMaps as a native Android app designed around offline-first AI execution.

The app uses:

Kotlin, Jetpack Compose, Material 3, and MVVM for the Android interface and app structure
Google AI Edge LiteRT-LM to run the language model directly on the device
Local RAG retrieval using a downloaded JSON knowledge pack
Knowledge chunks, metadata, coordinates, categories, and embeddings stored outside the APK for flexible testing and deployment
Cosine-similarity search to retrieve relevant local context before calling the LLM
Category-aware ranking to improve results for restaurants, transit, attractions, safety, and phrases
Coordinate-based distance logic for landmark-to-landmark distance questions when location data is available
ML Kit OCR to extract text from menus, signs, and documents
Prompt pipelines for travel Q&A, translation, OCR explanation, and camera-based understanding
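The coordinate-based distance logic listed above can be illustrated with a great-circle (haversine) calculation. This is a minimal sketch, not the app's actual code: the `GeoPoint` type, the function names, and the 5 km/h default walking speed are assumptions made for illustration.

```kotlin
import kotlin.math.asin
import kotlin.math.cos
import kotlin.math.pow
import kotlin.math.sin
import kotlin.math.sqrt

// Hypothetical coordinate type; the real knowledge pack schema may differ.
class GeoPoint(val lat: Double, val lon: Double)

// Mean Earth radius in kilometers.
val earthRadiusKm = 6371.0

// Great-circle distance between two points using the haversine formula.
fun haversineKm(a: GeoPoint, b: GeoPoint): Double {
    val dLat = Math.toRadians(b.lat - a.lat)
    val dLon = Math.toRadians(b.lon - a.lon)
    val h = sin(dLat / 2).pow(2) +
        cos(Math.toRadians(a.lat)) * cos(Math.toRadians(b.lat)) * sin(dLon / 2).pow(2)
    return 2 * earthRadiusKm * asin(sqrt(h))
}

// Rough walking time in minutes; the default speed is an assumed average.
fun walkingMinutes(distanceKm: Double, speedKmh: Double = 5.0): Double =
    distanceKm / speedKmh * 60.0
```

A straight-line distance like this underestimates real walking routes, so a result is best presented to the LLM as approximate context rather than as turn-by-turn guidance.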

At runtime, the app loads the local travel data pack from device storage, retrieves the most relevant chunks for the user’s question, builds a grounded prompt, and sends that prompt to LiteRT-LM. This lets the assistant answer using local travel knowledge without needing internet access or cloud APIs.
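The retrieval step in this flow can be sketched as a brute-force cosine-similarity search over the chunk embeddings. The `Chunk` class and the `topK` helper below are hypothetical names chosen for illustration, and the sketch assumes the query has already been embedded into a vector of the same dimension as the chunks.

```kotlin
import kotlin.math.sqrt

// Hypothetical chunk shape; the real pack also carries metadata and coordinates.
class Chunk(val id: String, val text: String, val embedding: FloatArray)

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Embedding dimensions must match" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denom = sqrt(normA) * sqrt(normB)
    return if (denom == 0f) 0f else dot / denom
}

// Scores every chunk against the query and keeps the k best matches.
fun topK(query: FloatArray, chunks: List<Chunk>, k: Int): List<Pair<Chunk, Float>> =
    chunks.map { it to cosineSimilarity(query, it.embedding) }
        .sortedByDescending { it.second }
        .take(k)
```

A linear scan like this is usually adequate for a single city pack. Whether the app uses a scan or an index is not stated in the write-up.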

Challenges we ran into

Building BeyondMaps was not just about running an LLM on a phone. The hardest part was making the full offline AI pipeline work reliably with real travel data.

Getting RAG to work fully on-device

We had to load a large local knowledge pack, connect chunks with embeddings, retrieve useful context, and send a grounded prompt to the LLM. Early versions either loaded no chunks, returned irrelevant results, or gave generic answers instead of travel-specific responses.

Managing large model and data files on mobile

The LLM model and travel pack were too large to treat like normal app assets. We had to store them outside the APK, push them to device storage, debug Android file paths, handle read access, and avoid memory issues while loading large JSON files.

Improving retrieval quality

Basic vector search was not enough. Some queries returned generic overview chunks or unrelated places. We added category-aware retrieval so restaurant, transit, attraction, phrase, and emergency questions could pull more relevant local context.
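One way to picture category-aware retrieval is a keyword-based intent check followed by a score boost for chunks in the matching category. The sketch below is an assumption-heavy illustration: the `Category` values mirror the categories named above, but the keyword lists, the `Candidate` class, and the 0.2 boost are hypothetical and not taken from the app.

```kotlin
// Categories named in the write-up; the keyword lists are illustrative guesses.
enum class Category { RESTAURANT, TRANSIT, ATTRACTION, PHRASE, EMERGENCY }

val categoryKeywords: Map<Category, List<String>> = mapOf(
    Category.RESTAURANT to listOf("eat", "restaurant", "food", "cafe", "dinner"),
    Category.TRANSIT to listOf("bus", "train", "tram", "ticket", "metro"),
    Category.ATTRACTION to listOf("museum", "see", "visit", "landmark"),
    Category.PHRASE to listOf("say", "phrase", "translate"),
    Category.EMERGENCY to listOf("police", "hospital", "emergency", "help")
)

// Picks the category with the most keyword hits, or null if nothing matches.
fun detectCategory(query: String): Category? {
    val words = query.lowercase().split(Regex("[^\\p{L}]+")).toSet()
    val best = categoryKeywords.maxByOrNull { entry -> entry.value.count { it in words } }
    return best?.takeIf { entry -> entry.value.any { it in words } }?.key
}

// Hypothetical retrieval candidate carrying its vector-search similarity.
class Candidate(val id: String, val category: Category, val similarity: Float)

// Re-sorts candidates, adding a fixed boost to those matching the detected intent.
fun rerank(query: String, candidates: List<Candidate>, boost: Float = 0.2f): List<Candidate> {
    val intent = detectCategory(query)
        ?: return candidates.sortedByDescending { it.similarity }
    return candidates.sortedByDescending { c ->
        if (c.category == intent) c.similarity + boost else c.similarity
    }
}
```

With a boost rather than a hard filter, a strongly matching chunk from another category can still surface, which may help when the intent guess is wrong.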

Making the LLM use the retrieved context

Even when retrieval worked, the model sometimes ignored the context or said it did not have enough information. We had to refine the prompt so the LLM treated the retrieved local data as the source of truth and answered only from that context.
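A grounded prompt of the kind described here might be assembled as follows. This is only a sketch: the instruction wording, the `RetrievedChunk` class, and the `buildGroundedPrompt` function are assumptions, and the prompt actually used in the app is not shown in this write-up.

```kotlin
// Hypothetical shape for a retrieved chunk passed into the prompt.
class RetrievedChunk(val source: String, val text: String)

// Builds a prompt that places numbered context before the question and
// instructs the model to answer from that context only.
fun buildGroundedPrompt(question: String, chunks: List<RetrievedChunk>): String = buildString {
    appendLine("You are an offline travel assistant.")
    appendLine("Answer using only the context below. If the answer is not in the context, say so.")
    appendLine()
    appendLine("Context:")
    chunks.forEachIndexed { i, c ->
        appendLine("[${i + 1}] (${c.source}) ${c.text}")
    }
    appendLine()
    appendLine("Question: $question")
    append("Answer:")
}
```

The resulting string would then be passed to the on-device model. Numbering the chunks makes it easier to inspect in debug logs which piece of context an answer drew on.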

Accomplishments that we’re proud of

Built an offline Android AI assistant that combines chat, RAG, translation, and camera-based understanding
Ran LiteRT-LM locally on a real mobile device with hardware-aware inference
Connected local retrieval to the LLM so answers are grounded in downloaded travel data
Added offline translation support for travel communication without relying on cloud APIs
Built early image/video understanding support for menus, signs, landmarks, boards, and travel scenes

What we learned

On-device AI is a full systems problem, not just a model problem
RAG quality depends heavily on data structure, retrieval logic, and prompt design
Offline translation needs more than direct word conversion; context and explanation make it useful
Visual AI becomes much more helpful when OCR, image understanding, and the LLM work together
Large local files require careful handling on mobile devices

What’s next for BeyondMaps

Improve retrieval using stronger on-device embedding models
Expand offline translation for speech, signs, menus, and real travel conversations
Add richer picture and video support for landmarks, animals, birds, plants, museum boards, and outdoor scenes
Improve camera understanding by combining OCR, visual reasoning, and local travel context
Build an ETL pipeline to generate cleaner travel knowledge packs from open datasets

Built With

android
computer-vision
edge-ai
gradle
jetpack-compose
kotlin
litert
ml-kit
multimodal
npu
ocr
offline
on-device-ai
qualcomm
rag
speech-recognition
text-to-speech
travel-assistant
vector-search

Submitted to

Qualcomm x LiteRT Developer Hackathon

Created by

Built BeyondMaps’ offline AI travel brain by integrating a local RAG pipeline with LiteRT-LM on Android. I improved retrieval quality with intent-aware filtering (restaurants, transit, phrases, emergency), source/keyword scoring, and landmark-aware ranking so answers are grounded in trusted Florence city-pack data instead of generic LLM output. I also added geospatial support (lat/lon/lng fixes, distance scoring, landmark-to-landmark distance/walking-time context) and strengthened chatbot flow/debugging to ensure the app always sends RAG-built prompts and produces reliable, context-based responses fully on-device.

Prasannadatta Kawadkar
Aayusha Hadke
Celine John Philip
Bismanpal Singh Anand