🌟 Inspiration

AbleAI was inspired by a single question:

How much AI can run fully offline on an ARM-powered device without relying on the cloud?

Most AI tools today depend on internet access or cloud APIs, making them slow, unreliable, or unusable for people in low-connectivity environments. Many visually impaired users and students also need tools that work instantly and privately.

AbleAI is my attempt to prove that powerful, multimodal AI can run entirely offline on everyday ARM-based devices like phones, tablets, Raspberry Pi, and M-series Macs.


🧠 What It Does

AbleAI consists of three fully offline AI modules, each demonstrating a core capability of edge AI:


📖 Module 1 - OCR Reader

Reads printed text aloud.

  • Captures images using the camera
  • Preprocesses images for clarity
  • Performs OCR using Tesseract locally
  • Speaks text using offline TTS

Key files:

  • app.py - main controller + camera flow
  • ocr_reader.py - preprocessing + OCR
  • speaker.py - offline TTS
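
The preprocessing and OCR steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in ocr_reader.py: the function names (preprocess, read_text, clean_text), the blur kernel size, and the threshold parameters are all assumptions. It assumes opencv-python and pytesseract are installed and the Tesseract binary is on the PATH.

```python
import re


def preprocess(image_bgr):
    """Grayscale, denoise, and binarize a BGR camera frame for OCR."""
    import cv2  # imported lazily so clean_text works without OpenCV installed

    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Adaptive thresholding copes with uneven lighting better than one global cutoff.
    # Block size 31 and offset 10 are illustrative values, not tuned settings.
    return cv2.adaptiveThreshold(
        blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 10
    )


def clean_text(raw):
    """Collapse whitespace and drop empty lines so TTS reads smoothly."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw.splitlines())
    return " ".join(line for line in lines if line)


def read_text(image_bgr, lang="eng"):
    """Run local Tesseract on the preprocessed frame and return cleaned text."""
    import pytesseract  # imported lazily for the same reason as cv2

    raw = pytesseract.image_to_string(preprocess(image_bgr), lang=lang)
    return clean_text(raw)
```

A frame returned by cv2.VideoCapture(0).read() could be passed straight to read_text, and the resulting string handed to the TTS module.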

🎯 Module 2 - On-Device Object Detection

Detects objects in real time using YOLOv8 models running directly on an ARM CPU.

  • Works fully offline
  • Uses pretrained YOLOv8 weights: yolov8n.pt (nano) for speed, yolov8m.pt (medium) for higher accuracy
  • Detects objects and speaks the results

This demonstrates computer vision running entirely on the device, with no server round trip.
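
As a rough sketch of how detection and spoken output might fit together, the snippet below uses the ultralytics package for CPU inference and a small pure-Python helper to build the sentence passed to TTS. The function names (load_model, detect_labels, describe), the 0.5 confidence cutoff, and the sentence wording are assumptions for illustration, not the project's actual code. Note that ultralytics tries to download a weights file that is not found locally, so the .pt file needs to be on disk beforehand for the run to stay offline.

```python
from collections import Counter


def load_model(weights="yolov8n.pt"):
    """Load YOLOv8 weights once; the file must already be on disk to stay offline."""
    from ultralytics import YOLO  # imported lazily: heavy dependency

    return YOLO(weights)


def detect_labels(model, frame, min_conf=0.5):
    """Run CPU inference on one frame and return the detected class names."""
    result = model.predict(frame, device="cpu", conf=min_conf, verbose=False)[0]
    return [result.names[int(cls_id)] for cls_id in result.boxes.cls.tolist()]


def describe(labels):
    """Turn a list of class names into one short sentence for TTS."""
    if not labels:
        return "I don't see anything I recognize."
    # most_common() puts the most frequent class first; ties keep detection order.
    names = [name for name, _ in Counter(labels).most_common()]
    return "I see " + ", ".join(names) + "."
```

Loading the model once and reusing it across frames avoids paying the weight-loading cost on every detection.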


🎙 Module 3 - Voice Assistant

A simple offline assistant that listens and performs actions.

  • Offline speech-to-text
  • Can run OCR or object detection when asked
  • Responds using offline TTS
  • Combines speech + vision + logic
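
The "speech + vision + logic" combination can be illustrated with a small keyword router that maps a transcript to an action. This is a sketch under assumptions: the keyword sets, the intent names, and the route and handle functions are hypothetical and only show the general shape of such a dispatcher, not how AbleAI implements it.

```python
import re

# Hypothetical trigger words; a real assistant would likely need a richer vocabulary.
OCR_WORDS = {"read", "text", "ocr"}
DETECT_WORDS = {"detect", "object", "objects", "see", "look"}


def route(transcript):
    """Map a transcript to 'ocr', 'detect', or 'unknown' by keyword match."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    if words & OCR_WORDS:
        return "ocr"
    if words & DETECT_WORDS:
        return "detect"
    return "unknown"


def handle(transcript, actions, speak):
    """Run the action for the recognized intent and speak its result.

    `actions` maps intent names to zero-argument callables returning text;
    `speak` is any callable that takes a string (for example a TTS function).
    """
    intent = route(transcript)
    action = actions.get(intent)
    if action is None:
        speak("Sorry, I did not understand.")
    else:
        speak(action())
    return intent
```

Passing the OCR and detection functions in through `actions` keeps the router independent of the vision code, so each module can still run on its own.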

🛠 How I Built It

Computer Vision

  • OpenCV
  • Image preprocessing (blur, resize, thresholding)
  • YOLOv8 inference on CPU

OCR

  • Tesseract OCR via pytesseract
  • Adaptive thresholding for noisy cases

Speech

  • pyttsx3 offline TTS
  • macOS “say” fallback
  • SpeechRecognition for ASR
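
One way to arrange the pyttsx3 and macOS "say" fallback is to try each backend in order until one succeeds. The sketch below assumes that ordering; the function names are hypothetical, and the project's speaker.py may structure this differently.

```python
import subprocess


def speak_pyttsx3(text):
    """Speak with pyttsx3; raises if the package or an audio driver is missing."""
    import pyttsx3  # imported lazily so a missing package triggers the fallback

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def speak_macos_say(text):
    """Speak with the macOS `say` command; raises on systems without it."""
    subprocess.run(["say", text], check=True)


def speak(text, backends=(speak_pyttsx3, speak_macos_say)):
    """Try each backend in order and return the name of the one that worked."""
    errors = []
    for backend in backends:
        try:
            backend(text)
            return backend.__name__
        except Exception as exc:  # missing package, no audio device, no `say` binary
            errors.append(f"{backend.__name__}: {exc}")
    raise RuntimeError("no TTS backend available: " + "; ".join(errors))
```

Because the backends are plain callables, the fallback order can be changed or tested without any audio hardware.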

Optimized for ARM

  • All code runs on CPU
  • No cloud or GPU required
  • Lightweight, efficient AI logic
  • Tested on ARM hardware

🧩 Built With

  • Python
  • OpenCV
  • Tesseract OCR
  • YOLOv8
  • pyttsx3
  • SpeechRecognition
  • NumPy
  • Pillow

🚧 Challenges I Faced

  1. Running YOLOv8 efficiently on CPU
  2. Making OCR accurate on low-light / skewed images
  3. Integrating TTS + STT + CV without lag
  4. Ensuring everything stays 100% offline
  5. Designing three modules that work independently and together

📚 What I Learned

  • How to optimize AI for edge devices
  • How preprocessing changes OCR performance
  • How to design multimodal assistants
  • ARM-specific performance considerations
  • The challenges of real-time audio + vision

🔗 Try It Out

GitHub Repository:
https://github.com/DiyaMenon/AbleAI

