Biki — An Arm-Powered Edge AI Nutrition Assistant
Inspiration
After a malnutrition diagnosis, I needed a practical, real-time way to track nutrition and manage my pantry. Existing apps didn't combine inventory management, nutrition tracking, and AI-powered recommendations. I built Biki to solve this — and I intentionally designed it to run locally on Arm-based devices, leveraging their efficiency, low power consumption, and suitability for on-device AI.
What I Learned
Computer Vision & Edge AI on Arm
- Trained a custom YOLOv5 model for grocery item detection
- Optimized and deployed it on the Arm Cortex-A76 CPU of the Raspberry Pi 5, paired with the Hailo-8L accelerator
- Learned quantization, ONNX conversion, ARM64 deployment, and hardware-aware model optimization
- Implemented all preprocessing and postprocessing pipelines directly on Arm CPU cores, leveraging NEON-accelerated operations for speed
Raspberry Pi Development on Arm Architecture
- Deployed a production FastAPI backend on the Armv8-A architecture of the Raspberry Pi 5
- Managed systemd services, CPU/memory constraints, and long-running Arm-native processes
- Designed the system specifically for low-power, always-on AI inference, which is a core strength of Arm hardware
AI Inference on Arm at Scale
- Integrated TinyLlama, running entirely on the Pi's Arm Cortex-A76 CPU cores
- Balanced quality, memory usage, and latency within the constraints of ARM64 hardware
- Built a unified inference pipeline (CV + LLM) optimized for on-device performance on Arm silicon
Full-Stack Mobile + Arm Edge Architecture
- Developed a React Native/Expo mobile frontend
- Built a Python FastAPI backend running fully on Arm hardware
- Implemented real-time sync, push notifications, offline mode, and edge inference
How I Built It
Architecture Overview
Biki uses a three-tier architecture centered around Arm-powered edge inference:
1. Mobile App — React Native/Expo
- Barcode scanning
- Nutrition tracking
- Inventory management
- Chat interface to the AI running locally on the Pi
- Syncs with the Arm-powered backend running on the Raspberry Pi
2. Backend — FastAPI on Raspberry Pi 5
- ARM64 REST API with SQLite
- Scheduled jobs for recipes, weekly analysis, and notifications
- Local AI inference on the Pi: TinyLlama on the Arm CPU cores, YOLOv5 offloaded to the Hailo-8L
3. Computer Vision Pipeline (Arm + Hailo)
- Custom YOLOv5 model detecting avocados, tomatoes, milk
- Pre/post-processing executed on Arm Cortex-A76 cores
- Real-time inference using the Hailo-8L accelerator working alongside the Arm CPU
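CPU-side postprocessing for a detector like this usually ends with confidence filtering and non-maximum suppression (NMS). The following is a minimal pure-Python sketch of that step, not Biki's actual code: the box format `(x1, y1, x2, y2)`, the thresholds, and the function names `iou` and `nms` are assumptions for illustration, and the sketch is class-agnostic for brevity.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels (assumed format)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        score_thresh: float = 0.25, iou_thresh: float = 0.45) -> List[int]:
    """Return indices of boxes kept after confidence filtering and NMS."""
    # Drop low-confidence candidates, then visit the rest best-first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```

A production pipeline would normally run this per class and vectorize it with NumPy; the loop form is shown only because it makes the logic easy to follow.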
Tunneling System
Because the Raspberry Pi sits behind a NAT/firewall, I built a cost-efficient reverse SSH tunnel:
- The Pi establishes a persistent SSH tunnel → AWS EC2
- Nginx proxies:
https://tsiandanetsianda.com/biki/* → localhost:8001
- A custom discovery service:
- Registers the Pi
- Sends heartbeats every 240 seconds
- The mobile app connects via a VPS proxy with fallback logic
This provides a stable external API endpoint without opening ports or increasing cost.
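To illustrate the heartbeat side of the discovery service, here is a minimal sketch of how a registry could decide whether the Pi is reachable. Only the 240-second interval comes from the description above; the class name `DeviceRegistry`, the in-memory storage, and the rule of tolerating two missed intervals are assumptions.

```python
import time
from typing import Callable, Dict

HEARTBEAT_INTERVAL_S = 240  # interval stated in the write-up
MISSED_BEATS_ALLOWED = 2    # assumption: tolerate two missed intervals


class DeviceRegistry:
    """In-memory record of when each device last sent a heartbeat."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, device_id: str) -> None:
        """Register the device (first call) or refresh its timestamp."""
        self._last_seen[device_id] = self._clock()

    def is_online(self, device_id: str) -> bool:
        """A device is online if it was seen within the grace window."""
        last = self._last_seen.get(device_id)
        if last is None:
            return False
        age = self._clock() - last
        return age <= HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED
```

A client could consult `is_online` before routing a request through the tunnel and fall back to cached data when it returns `False`.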
Key Features Implemented
- Barcode scanning with OpenFoodFacts
- Real-time nutrition tracking with daily/weekly goals
- Inventory management + expiry alerts
- AI-powered recipe generation (ingredient-aware)
- Weekly nutrition analysis via TinyLlama
- Push notifications for tips, expiry, and insights
- All inference executed on Arm CPU cores + Hailo accelerator
Challenges Faced
YOLOv5 → Hailo-8L Conversion
This was the most technically demanding part.
Problems & Solutions:
- Unsupported ops: Hailo didn't support YOLO detection heads → Used truncated model + Python postprocessing
- Calibration formatting: Needed NHWC tensors on ARM64 → Rebuilt dataset to shape (100, 640, 640, 3)
- Limited calibration images: Only 100 (vs 1024 recommended) → Reduced optimization level
- Profiling: All debugging, profiling, and validation done directly on the Pi's Arm Cortex-A76 cores
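The calibration fix above comes down to tensor layout: NHWC means each image is stored as height × width × channels. The sketch below shows that reordering for a single image, assuming the source data was channels-first (CHW), which the write-up does not state. The helper names `chw_to_hwc` and `shape` are hypothetical, and with NumPy the same step over a whole batch would be `np.transpose(batch, (0, 2, 3, 1))`.

```python
from typing import List, Tuple

Image = List[List[List[float]]]


def chw_to_hwc(img: Image) -> Image:
    """Convert one image from [C][H][W] (channels-first) to [H][W][C]."""
    c, h, w = len(img), len(img[0]), len(img[0][0])
    return [[[img[k][y][x] for k in range(c)] for x in range(w)]
            for y in range(h)]


def shape(batch: List[Image]) -> Tuple[int, int, int, int]:
    """Shape of a batch of equally sized images as a 4-tuple."""
    first = batch[0]
    return (len(batch), len(first), len(first[0]), len(first[0][0]))


# Tiny stand-in for the real data: 2 images, 3 channels, 2x2 pixels.
nchw_batch = [[[[float(c)] * 2 for _ in range(2)] for c in range(3)]
              for _ in range(2)]
nhwc_batch = [chw_to_hwc(img) for img in nchw_batch]
```

For the real calibration set, the same check should report (100, 640, 640, 3).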
Final result:
- 15 MB HEF model
- 3 detection contexts
- 30-60 FPS real-time inference on the Arm + Hailo combo
Timezone Handling & Daily Resets
- Unified timezone system across Python, SQLite, and frontend
- Midnight SAST (UTC+2) resets for nutrition goals
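One way to get a consistent midnight reset is to store timestamps in UTC and compute the bounds of the current SAST day on demand. This is a minimal sketch under that assumption; the function name `current_day_window_utc` is hypothetical. A fixed UTC+2 offset is used because South Africa does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

SAST = timezone(timedelta(hours=2), name="SAST")  # fixed offset, no DST


def current_day_window_utc(now_utc: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the current SAST calendar day, in UTC.

    `now_utc` must be timezone-aware; a naive datetime would be
    interpreted in the machine's local zone by astimezone().
    """
    local = now_utc.astimezone(SAST)
    start_local = local.replace(hour=0, minute=0, second=0, microsecond=0)
    end_local = start_local + timedelta(days=1)
    return (start_local.astimezone(timezone.utc),
            end_local.astimezone(timezone.utc))
```

Daily totals can then be queried with `start <= logged_at < end`, so entries logged at 23:30 UTC count toward the next SAST day, as expected.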
Reliable Notification System
- 9 scheduled jobs (recipes, tips, health checks, etc.)
- Implemented graceful fallback logic when TinyLlama is unavailable
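The fallback idea can be sketched as a wrapper that tries the model first and returns canned text if the call fails or comes back empty. Everything named here is hypothetical: `generate` stands in for whatever function calls TinyLlama, and the tips in `FALLBACK_TIPS` are placeholders, not Biki's real content.

```python
import random
from typing import Callable, Sequence

# Placeholder messages used when the LLM cannot produce one.
FALLBACK_TIPS: Sequence[str] = (
    "Check your inventory for items expiring this week.",
    "Log your meals today to keep your weekly summary accurate.",
)


def tip_with_fallback(generate: Callable[[], str],
                      fallback: Sequence[str] = FALLBACK_TIPS) -> str:
    """Return LLM output if available, otherwise a canned tip."""
    try:
        text = generate().strip()
        if text:
            return text
    except Exception:
        # Broad catch on purpose: a scheduled job should still send
        # something if the model is down, slow, or out of memory.
        pass
    return random.choice(list(fallback))
```

In a real service the exception would also be logged so that repeated failures show up in health checks.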
Database Migrations & Schema Evolution
- Added features (daily recipes, meal logs, etc.)
- Used careful incremental migrations to maintain stability on the low-resource Arm system
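A lightweight way to do incremental migrations on SQLite is to track the schema version in `PRAGMA user_version` and apply only the steps that have not run yet. The sketch below assumes that approach; the write-up does not say which mechanism Biki uses, and the table names and columns are made up for illustration.

```python
import sqlite3

# Ordered migration steps; position N (1-based) is schema version N.
# Table and column names are hypothetical.
MIGRATIONS = [
    "CREATE TABLE inventory (id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
    " expires_on TEXT)",
    "CREATE TABLE meal_logs (id INTEGER PRIMARY KEY, eaten_at TEXT NOT NULL,"
    " calories REAL)",
    "CREATE TABLE daily_recipes (id INTEGER PRIMARY KEY, day TEXT NOT NULL,"
    " body TEXT NOT NULL)",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending migrations one at a time; return the final version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    pending = MIGRATIONS[current:]
    for version, statement in enumerate(pending, start=current + 1):
        try:
            # Explicit transaction so the schema change and the version
            # bump are committed (or rolled back) together.
            conn.execute("BEGIN")
            conn.execute(statement)
            # PRAGMA does not accept bound parameters; version is an int.
            conn.execute(f"PRAGMA user_version = {version}")
            conn.commit()
        except Exception:
            conn.rollback()
            raise
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Calling `migrate` at service startup is idempotent: a database that is already current is left untouched.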
Running Everything on a Single Arm Device
Operating FastAPI + YOLOv5 pipelines + LLM inference + scheduled jobs on an Arm device required:
- Query optimization
- Memory-efficient inference
- ARM64 process management
- Systemd supervision for resilience
Impact
After months of daily use, Biki has significantly improved my nutrition habits and reduced food waste.
Running the entire stack on Arm hardware made the system:
- Low cost
- Energy efficient
- Fast
- Private (all AI is on-device)
This aligns directly with the goals of the Arm AI Developer Challenge: real-world AI running locally on Arm architecture.
Technical Highlights
- Arm-Optimized Edge Deployment: TinyLlama runs natively on the Pi's Arm Cortex-A76 CPU; YOLOv5 runs on the Hailo-8L with pre/post-processing on the same CPU cores
- Hybrid Arm + Accelerator AI Pipeline: Arm handles orchestration & postprocessing; Hailo performs matrix ops
- Cost-Efficient Arm Infrastructure: Raspberry Pi backend + low-cost EC2 tunneling
- Production-Ready on Arm: systemd, health monitoring, watchdogs, error handling
- Strong AI Integration: TinyLlama + RAG for contextual recommendations
What's Next for Biki
- Expand object detection dataset
- Add macronutrient prediction from images
- Support multi-user households
- Advanced analytics (budget tracking, waste forecasting)
- Deploy lightweight Vision Transformers on Arm + Hailo