Inspiration
253 million people worldwide are visually impaired. They struggle daily with identifying money (risking fraud), reading medicine labels (risking poisoning), understanding documents, choosing clothing, and calling for help. I built SCOUT using Google Gemini 3 API to provide independence, safety, and dignity.
What it does
SCOUT is an AI accessibility assistant powered by Gemini 3 API with 6 features:
💰 Money Reader - Identifies currency, detects counterfeits, checks condition
💊 Medicine Safety - Reads labels, checks expiry dates, alerts if unsafe
📄 Document Reader - Extracts text from bills, letters, newspapers
👔 Clothing Helper - Describes colors, suggests matching outfits
📸 General Camera - Identifies any object or scene
🚨 Emergency SOS - Sends GPS location to contacts instantly
All features use Gemini 3's vision and reasoning with voice output for complete accessibility.
How we built it
Core Technology: Google Gemini 3 API
SCOUT uses the gemini-2.5-flash model from Gemini 3 family for:
- Ultra-fast vision analysis (3-5 seconds)
- Advanced OCR from complex layouts
- Contextual reasoning (e.g., comparing expiry dates)
- Natural language generation
Tech Stack:
- Frontend: React Native + Expo
- Backend: Node.js + Express on Vercel
- AI: Google Gemini 3 API (
gemini-2.5-flash) - Voice: Expo Speech (TTS)
- Location: Expo Location (GPS)
Architecture:
Mobile App → Vercel Backend → Gemini 3 API → Response → Voice Output
Key Implementation:
// Gemini 3 API Integration
fetch('https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent', {
body: JSON.stringify({
contents: [{ parts: [{ text: prompt }, { inline_data: { data: image } }] }]
})
})
Category-specific prompts optimize Gemini 3 for each feature (medicine safety, money verification, document reading).
Challenges we ran into
1. Gemini 3 Prompt Engineering - Medicine labels have tiny text, currency needs authenticity checks. Solution: Category-specific prompts leveraging Gemini 3's reasoning.
2. Speed vs Accuracy - Chose gemini-2.5-flash over Pro for 3-second responses critical for accessibility.
3. True Accessibility - Every element needs screen reader labels, voice output, and large touch targets. Validated with blindfolded testing.
4. Emergency Location - GPS accuracy varies. Implemented fallback to last-known location.
5. Error Handling - Gemini API rate limits require user-friendly messages and retry logic.
Accomplishments that we're proud of
✅ Gemini 3 API powers all 6 features
✅ 3-5 second response time enables real-time accessibility
✅ Medicine safety alerts prevent poisoning using Gemini's date reasoning
✅ Currency verification detects fraud using Gemini's vision
✅ Blindfolded testing validated real-world usability
✅ 253 million potential users worldwide
What we learned
About Gemini 3:
- Multimodal understanding is transformative - understands context, not just objects
- Prompt engineering is critical - 10x accuracy improvement with specific prompts
gemini-2.5-flashperfect balance of speed + accuracy for accessibility- Vision + reasoning enables safety features (expiry date comparison, counterfeit detection)
About Accessibility:
- Screen reader compatibility requires intentional design
- Voice feedback must be immediate
- Blindfolded testing reveals issues invisible in normal development
- Independence = dignity for users
What's next for SCOUT
Short-term:
- Upgrade to
gemini-2.5-profor complex medicine analysis - Multi-language support using Gemini 3
- Medicine Reminders with prescription OCR
- Function calling for structured data
- object finder to identify the missing items
- navigation - to identify the danger items on the way. Long-term:
- Live Navigation with Gemini 3 video understanding
- Conversational AI for follow-up questions
- Smart Object Finder using spatial reasoning
- iOS version
Vision: Make SCOUT the AI companion for every visually impaired person worldwide, powered by Gemini 3.
Built With
- asyncstorage
- eas-build
- expo-location
- expo-speech-api
- express.js
- google-gemini-2.5-flash-api
- mobile-first
- node.js
- react-native-(expo-sdk-54)
- rest-api
- typescript
- vercel-(backend-deployment)
Log in or sign up for Devpost to join the conversation.