The APK will be available through a GitHub release. We are currently preparing the release, so for now we are sharing the GitHub repository link: https://github.com/Ahadke/BeyondMaps
Inspiration
Most travel apps assume that travelers always have reliable internet, but that assumption breaks down in subways, airports, and rural areas, on international trips, and anywhere roaming is expensive. We wanted to build a travel assistant that could still answer useful questions when the phone is completely offline.
BeyondMaps started from a simple idea: what if your phone could carry a private travel assistant that can understand your surroundings, answer local questions, translate signs and menus, and help in unfamiliar places without sending your location, images, or questions to the cloud?
Our focus was to make travel help feel immediate, private, and practical: the kind of assistant you can use when you are lost, reading a foreign menu, trying to understand a sign, or asking quick questions without depending on internet access.
What it does
BeyondMaps is an offline Android travel companion powered by on-device AI. It helps travelers ask questions, translate text, and understand their surroundings without relying on internet access or cloud APIs.
Travel Guide
The Travel Guide opens a chat experience where users can ask natural-language travel questions. Instead of sending the query online, BeyondMaps retrieves relevant information from a local RAG knowledge pack and sends that context to the on-device LLM.
Users can ask about:
- Restaurants, cafés, and local food options
- Attractions and things to see
- Public transport and tickets
- Emergency and safety guidance
- Useful local phrases
- Distances between landmarks when coordinate data is available
The goal is not just to run an LLM locally, but to ground the model in downloaded travel knowledge so answers are practical, place-aware, and usable offline.
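For the landmark distance questions mentioned in the list above, one common way to turn two coordinate pairs into a distance is the haversine formula. The sketch below is only an illustration: this write-up does not say which formula the app uses, and the function name `haversineKm` and the constant `EARTH_RADIUS_KM` are assumptions made for the example.

```kotlin
import kotlin.math.asin
import kotlin.math.cos
import kotlin.math.pow
import kotlin.math.sin
import kotlin.math.sqrt

// Mean Earth radius in kilometres (a common approximation; assumed here).
const val EARTH_RADIUS_KM = 6371.0

// Great-circle distance in kilometres between two points given in decimal degrees.
// Hypothetical helper, not taken from the BeyondMaps code base.
fun haversineKm(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double {
    val dLat = Math.toRadians(lat2 - lat1)
    val dLon = Math.toRadians(lon2 - lon1)
    val a = sin(dLat / 2).pow(2) +
        cos(Math.toRadians(lat1)) * cos(Math.toRadians(lat2)) * sin(dLon / 2).pow(2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))
}
```

The result is a straight-line distance over the Earth's surface, so it is a lower bound on walking or driving distance rather than a route length.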
Offline Translator
BeyondMaps includes an offline translation flow for common travel situations. Users can type or speak, and the app uses on-device AI to translate without calling external APIs.
This is useful for quick real-world communication, such as asking for help, understanding directions, ordering food, or translating short travel-related phrases.
Camera Understanding
The camera feature helps users understand the physical world around them. BeyondMaps can read text from menus, signs, and travel documents using OCR, then pass the extracted context to the local LLM for translation or explanation.
The broader vision is to extend this into richer on-device visual understanding, where the app can help identify landmarks, museum boards, trail signs, plants, birds, animals, or other objects using FastVLM-style vision-language models.
How we built it
We built BeyondMaps as a native Android app designed around offline-first AI execution.
The app uses:
- Kotlin, Jetpack Compose, Material 3, and MVVM for the Android interface and app structure
- Google AI Edge LiteRT-LM to run the language model directly on the device
- Local RAG retrieval using a downloaded JSON knowledge pack
- Knowledge chunks, metadata, coordinates, categories, and embeddings stored outside the APK for flexible testing and deployment
- Cosine-similarity search to retrieve relevant local context before calling the LLM
- Category-aware ranking to improve results for restaurants, transit, attractions, safety, and phrases
- Coordinate-based distance logic for landmark-to-landmark distance questions when location data is available
- ML Kit OCR to extract text from menus, signs, and documents
- Prompt pipelines for travel Q&A, translation, OCR explanation, and camera-based understanding
At runtime, the app loads the local travel data pack from device storage, retrieves the most relevant chunks for the user’s question, builds a grounded prompt, and sends that prompt to LiteRT-LM. This lets the assistant answer using local travel knowledge without needing internet access or cloud APIs.
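The retrieval step described above can be sketched in a few lines of Kotlin. This is a minimal illustration, not the app's actual code: the `Chunk` class, its field names, and the `topK` helper are hypothetical, and the query embedding is assumed to be computed elsewhere.

```kotlin
import kotlin.math.sqrt

// Hypothetical chunk shape; the real pack's field names are not given in this write-up.
class Chunk(val id: String, val text: String, val category: String, val embedding: FloatArray)

// Cosine similarity between two vectors of equal length.
// Returns 0.0 when either vector has zero norm, to avoid dividing by zero.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Embedding dimensions must match" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    if (normA == 0.0 || normB == 0.0) return 0.0
    return dot / (sqrt(normA) * sqrt(normB))
}

// Returns the k chunks most similar to the query embedding, best match first.
fun topK(query: FloatArray, chunks: List<Chunk>, k: Int): List<Chunk> =
    chunks.sortedByDescending { cosineSimilarity(query, it.embedding) }.take(k)
```

A brute-force scan like this is linear in the number of chunks, which is usually acceptable for a single downloaded travel pack; a larger corpus would likely need an index.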
Challenges we ran into
Building BeyondMaps was not just about running an LLM on a phone. The hardest part was making the full offline AI pipeline work reliably with real travel data.
Getting RAG to work fully on-device
We had to load a large local knowledge pack, connect chunks with embeddings, retrieve useful context, and send a grounded prompt to the LLM. Early versions either loaded no chunks, returned irrelevant results, or gave generic answers instead of travel-specific responses.
Managing large model and data files on mobile
The language model and travel pack were too large to treat like normal app assets. We had to store them outside the APK, push them to device storage, debug Android file paths, handle read access, and avoid memory issues while loading large JSON files.
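A small pre-flight check can make path and read-access problems easier to diagnose before any heavy loading starts. The sketch below uses only `java.io.File`; the `PackStatus` type and `checkPackFile` function are hypothetical names for illustration, and the path passed in is whatever location the files were pushed to.

```kotlin
import java.io.File

// Possible states of a large file that lives outside the APK (illustrative type).
sealed class PackStatus {
    object Missing : PackStatus()
    object NotReadable : PackStatus()
    data class Ready(val sizeBytes: Long) : PackStatus()
}

// Reports whether the file exists, is readable, and how large it is,
// so the caller can show a precise error instead of failing mid-load.
fun checkPackFile(path: String): PackStatus {
    val file = File(path)
    return when {
        !file.isFile -> PackStatus.Missing
        !file.canRead() -> PackStatus.NotReadable
        else -> PackStatus.Ready(file.length())
    }
}
```

Knowing the size up front also lets the app decide whether to parse the JSON in a streaming fashion rather than reading it into memory in one piece.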
Improving retrieval quality
Basic vector search was not enough. Some queries returned generic overview chunks or unrelated places. We added category-aware retrieval so restaurant, transit, attraction, phrase, and emergency questions could pull more relevant local context.
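One simple way to make retrieval category-aware is to guess the query's category and add a bonus to matching chunks before sorting. The following is a sketch under assumptions: the `Candidate` type, the keyword map, and the `boost` value of 0.2 are all invented for illustration and are not the app's actual ranking logic.

```kotlin
// Hypothetical result type: a chunk id, its category label, and its vector-search score.
data class Candidate(val id: String, val category: String, val similarity: Double)

// Illustrative keyword map; a real app would likely need a richer intent detector.
val categoryKeywords = mapOf(
    "restaurant" to listOf("eat", "food", "restaurant", "cafe"),
    "transit" to listOf("metro", "bus", "ticket", "train"),
    "emergency" to listOf("police", "hospital", "emergency")
)

// Guesses the query's category from whole-word keyword matches, or null when nothing matches.
fun detectCategory(query: String): String? {
    val words = query.lowercase().split(Regex("[^a-z]+"))
    return categoryKeywords.entries.firstOrNull { (_, keys) -> keys.any { it in words } }?.key
}

// Adds a fixed bonus to candidates whose category matches the detected one, then sorts best first.
// Falls back to plain similarity order when no category is detected.
fun rerank(query: String, candidates: List<Candidate>, boost: Double = 0.2): List<Candidate> {
    val wanted = detectCategory(query) ?: return candidates.sortedByDescending { it.similarity }
    return candidates.sortedByDescending { it.similarity + if (it.category == wanted) boost else 0.0 }
}
```

With a boost like this, a generic overview chunk with a slightly higher similarity score no longer outranks a restaurant chunk when the question is clearly about food.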
Making the LLM use the retrieved context
Even when retrieval worked, the model sometimes ignored the context or said it did not have enough information. We had to refine the prompt so the LLM treated the retrieved local data as the source of truth and answered only from that context.
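A grounded prompt of the kind described here might be assembled as follows. The instruction wording and the `buildGroundedPrompt` function are illustrative assumptions, not the prompt the app actually ships.

```kotlin
// Builds a prompt that tells the model to answer only from the retrieved chunks.
// Chunks are numbered so the instructions can refer to them as a distinct block.
fun buildGroundedPrompt(question: String, contextChunks: List<String>): String = buildString {
    appendLine("You are an offline travel assistant.")
    appendLine("Answer using ONLY the local context below. Treat it as the source of truth.")
    appendLine("If the context does not contain the answer, say so briefly.")
    appendLine()
    appendLine("Context:")
    if (contextChunks.isEmpty()) {
        appendLine("(no local context found)")
    } else {
        contextChunks.forEachIndexed { i, chunk -> appendLine("[${i + 1}] ${chunk.trim()}") }
    }
    appendLine()
    appendLine("Question: ${question.trim()}")
    append("Answer:")
}
```

Ending the prompt with an explicit answer cue and keeping the context in a clearly labelled block are common ways to discourage a small model from ignoring the retrieved text.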
Accomplishments that we’re proud of
- Built an offline Android AI assistant that combines chat, RAG, translation, and camera-based understanding
- Ran LiteRT-LM locally on a real mobile device with hardware-aware inference
- Connected local retrieval to the LLM so answers are grounded in downloaded travel data
- Added offline translation support for travel communication without relying on cloud APIs
- Built early image/video understanding support for menus, signs, landmarks, boards, and travel scenes
What we learned
- On-device AI is a full systems problem, not just a model problem
- RAG quality depends heavily on data structure, retrieval logic, and prompt design
- Offline translation needs more than direct word conversion; context and explanation make it useful
- Visual AI becomes much more helpful when OCR, image understanding, and the LLM work together
- Large local files require careful handling on mobile devices
What’s next for BeyondMaps
- Improve retrieval using stronger on-device embedding models
- Expand offline translation for speech, signs, menus, and real travel conversations
- Add richer picture and video support for landmarks, animals, birds, plants, museum boards, and outdoor scenes
- Improve camera understanding by combining OCR, visual reasoning, and local travel context
- Build an ETL pipeline to generate cleaner travel knowledge packs from open datasets
