Below is a brief overview of the project. For the detailed write-up, please refer to the link below; the write-up is also uploaded to the GitHub repo.
https://docs.google.com/document/d/19-UBi9iZIb9cWE3gT9FsHgYNicJntjMURD3jfLs586k/edit?usp=sharing

Inspiration

Lumen was inspired by the everyday challenges faced by visually challenged individuals when trying to understand their surroundings independently. Many assistive solutions rely on cloud-based processing, which introduces latency, connectivity issues, and privacy concerns. The goal behind Lumen was to explore how far on-device AI could go in delivering reliable, real-time assistance without requiring an internet connection.

What it does

Lumen is a fully offline Android application that provides four assistive modes: real-time object detection, live text reading, document understanding with question answering, and scene description. Using on-device machine learning models and text-to-speech feedback, the app helps users interpret their environment, read printed content, understand documents, and get concise descriptions of scenes around them.
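As a rough illustration of how the four modes fit together (the names below are our own, not Lumen's actual identifiers), each mode drives a dedicated pipeline whose output ends up in the same place: spoken feedback.

```kotlin
import android.speech.tts.TextToSpeech

// Illustrative sketch only: the enum and function names are assumptions,
// not Lumen's actual code.
enum class AssistMode { OBJECT_DETECTION, TEXT_READING, DOCUMENT_QA, SCENE_DESCRIPTION }

// Whatever a mode's pipeline produces, it is ultimately voiced to the user.
fun announce(tts: TextToSpeech, mode: AssistMode, result: String) {
    tts.speak(result, TextToSpeech.QUEUE_ADD, null, /* utteranceId = */ mode.name)
}
```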

How we built it

The app was built as a native Android application using Kotlin and the Android SDK, with CameraX handling live camera input. All inference runs locally using TensorFlow Lite. EfficientDet-Lite0 is used for object detection, ML Kit on-device OCR for text recognition, MobileBERT for document question answering, and a custom InceptionV3 + LSTM pipeline for scene captioning. Performance was improved using NNAPI acceleration, multi-threaded execution, and lightweight preprocessing, ensuring smooth operation on Arm-based devices.
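As a concrete sketch of the detection path, the snippet below shows how an EfficientDet-Lite0 model can be loaded through the TFLite Task Library with NNAPI delegation and multi-threading enabled. The asset file name, thread count, and thresholds are our assumptions, not the project's exact configuration.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.core.BaseOptions
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Sketch of the detection path described above; values are illustrative.
fun createDetector(context: Context): ObjectDetector {
    val baseOptions = BaseOptions.builder()
        .useNnapi()       // hand inference to NNAPI-capable hardware
        .setNumThreads(4) // multi-threaded execution on the CPU fallback path
        .build()
    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setBaseOptions(baseOptions)
        .setMaxResults(5)        // keep spoken feedback short
        .setScoreThreshold(0.5f) // drop low-confidence boxes before they are voiced
        .build()
    return ObjectDetector.createFromFileAndOptions(
        context, "efficientdet_lite0.tflite", options // assumed asset name
    )
}

// Called per CameraX frame, e.g. from an ImageAnalysis analyzer after
// converting the ImageProxy to a Bitmap.
fun detectLabels(detector: ObjectDetector, frame: Bitmap): List<String> =
    detector.detect(TensorImage.fromBitmap(frame))
        .flatMap { it.categories }
        .map { it.label }
```

Pairing an analyzer like this with CameraX's STRATEGY_KEEP_ONLY_LATEST backpressure mode keeps stale frames from queuing up behind slow inferences.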

Challenges we ran into

One major challenge was managing multiple ML pipelines while maintaining real-time performance on mobile hardware. Balancing accuracy, latency, and spoken feedback required careful throttling, filtering, and cooldown logic. Ensuring consistent behavior across different devices, all while keeping the app completely offline, also required thoughtful model selection and optimization.
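To make the cooldown idea concrete, here is a minimal sketch of per-label throttling so the same object is not announced repeatedly; the class name and the 3-second window are illustrative assumptions, not the app's actual implementation.

```kotlin
// Minimal sketch of per-label cooldown logic; names and window length
// are illustrative assumptions.
class SpeechThrottler(private val cooldownMs: Long = 3_000L) {
    private val lastSpokenAt = HashMap<String, Long>()

    // True only if the label hasn't been voiced within the cooldown window;
    // records the timestamp as a side effect.
    fun shouldSpeak(label: String, now: Long = System.currentTimeMillis()): Boolean {
        val last = lastSpokenAt[label] ?: 0L
        if (now - last < cooldownMs) return false
        lastSpokenAt[label] = now
        return true
    }
}
```

Only detections that survive both the score threshold and a gate like this reach TextToSpeech, which keeps the audio channel from flooding.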

Accomplishments that we're proud of

We are proud of building a complete assistive application that runs entirely on-device without any cloud dependency. Successfully integrating multiple ML models into a single, cohesive experience while maintaining low latency and clear audio feedback was a key achievement. The project demonstrates that edge AI can be both practical and impactful for accessibility-focused applications.

What we learned

This project provided hands-on experience with deploying and optimizing on-device machine learning models on mobile hardware. We gained deeper insight into TensorFlow Lite, NNAPI, and performance tuning for Arm architectures, as well as designing user-centric accessibility features that prioritize clarity, privacy, and reliability.

What's next for Lumen: An Offline AI Assistant for the Visually Challenged

Future work includes improving model accuracy, expanding language support for OCR and speech, and enhancing scene understanding with richer contextual descriptions. We also plan to explore better hardware acceleration and further optimizations to make the app accessible on a wider range of devices.

Built With

  • Languages: Kotlin, XML
  • Frameworks & SDKs: Android SDK, CameraX, RecyclerView, TextToSpeech, SpeechRecognizer
  • ML Libraries: TensorFlow Lite, TFLite Task Library, ML Kit on-device text recognition
  • Models: EfficientDet-Lite0, MobileBERT QA, InceptionV3, LSTM decoder, custom vocabulary file
  • Platforms: Android (Arm-based devices)
  • APIs & Tools: MediaStore, PDFRenderer, FileManager, adb, Android Studio, GitHub