💡 Inspiration: The Human-Layer Breach
The inspiration for Aegis came from a disturbing trend in cybercrime: the shift from hacking software to hacking humans. In India, "Digital Arrest" scams have reached epidemic proportions. Scammers impersonate police or CBI officers on video calls, creating a simulated high-pressure environment to extort victims' life savings. Traditional security software, such as firewalls and antivirus, is useless here because the user is "consenting" to the interaction under duress. We decided to build a "Zero-Trust" guardian that monitors the interaction layer itself.
🛠️ How we built it
Aegis is built with Compose Multiplatform on Kotlin Multiplatform (KMP) and uses the Android Accessibility Service as our "eye" into the user's digital experience. The core logic combines local heuristics with Gemini's multimodal reasoning:
- Local Heuristics: Lightweight regex and pattern matching for immediate filtering.
- Multimodal Vision (Gemini 3 Flash): High-speed analysis of video call frames to detect simulated police stations or fake uniforms.
- Deep Reasoning (Gemini 3 Pro): Analysis of the psychological markers in chat messages.

To maintain performance, we implemented state-based deduplication so that we only invoke the AI when the content meaningfully changes. When nothing has changed, we suppress analysis to save tokens and battery.
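A minimal sketch of how the local pre-filter and the deduplication gate could fit together. The patterns and the names `SCAM_PATTERNS`, `looksSuspicious`, and `AnalysisGate` are hypothetical, chosen only for illustration; they are not the rules Aegis actually ships.

```kotlin
// Illustrative patterns only (assumption): a real rule set would be broader and tuned.
val SCAM_PATTERNS: List<Regex> = listOf(
    Regex("""\bdigital\s+arrest\b""", RegexOption.IGNORE_CASE),
    Regex("""\b(cbi|police|customs)\b.*\b(warrant|arrest|case)\b""", RegexOption.IGNORE_CASE),
    Regex("""\bbail\b.*\b(payment|portal|transfer)\b""", RegexOption.IGNORE_CASE),
)

/** Cheap local check: true if any pattern matches the on-screen text. */
fun looksSuspicious(text: String): Boolean =
    SCAM_PATTERNS.any { it.containsMatchIn(text) }

/** Remembers the last analysed content so unchanged screens are not re-sent. */
class AnalysisGate {
    private var lastFingerprint: Int? = null

    /** True only when the text is suspicious AND differs from the last analysed text. */
    fun shouldInvokeModel(text: String): Boolean {
        if (!looksSuspicious(text)) return false            // benign content never reaches the model
        val fingerprint = text.trim().lowercase().hashCode()
        if (fingerprint == lastFingerprint) return false    // same state as before: suppress
        lastFingerprint = fingerprint
        return true
    }
}

fun main() {
    val gate = AnalysisGate()
    val message = "This is CBI. There is an arrest warrant in your name."
    println(gate.shouldInvokeModel(message))        // first sighting of suspicious text
    println(gate.shouldInvokeModel(message))        // unchanged content, suppressed
    println(gate.shouldInvokeModel("See you at 5")) // no pattern matches
}
```

Running the heuristic before the fingerprint comparison means benign screens cost one regex pass and never overwrite the stored state.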
🧠 What we learned
The biggest revelation was the power of multimodal context. A link to a "bail payment portal" is suspicious on its own, but when that link is sent during a video call where Gemini detects a fake police uniform, the response escalates from a "warning" to an "emergency shutdown." LLMs are not just for chat; they are incredible real-time reasoning engines for complex security environments.
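The escalation described above can be sketched as a small decision function. This is an assumption about how the signals might be combined, not the project's actual scoring logic; `Signals`, `ThreatLevel`, and `assess` are hypothetical names.

```kotlin
enum class ThreatLevel { NONE, WARNING, EMERGENCY }

/** Signals observed at one moment; every field name here is hypothetical. */
data class Signals(
    val suspiciousLinkInChat: Boolean,
    val videoCallActive: Boolean,
    val fakeUniformDetected: Boolean,
)

/**
 * A single signal yields a warning; a suspicious link that arrives during a
 * video call with a detected fake uniform escalates to an emergency.
 */
fun assess(s: Signals): ThreatLevel = when {
    s.suspiciousLinkInChat && s.videoCallActive && s.fakeUniformDetected -> ThreatLevel.EMERGENCY
    s.suspiciousLinkInChat || s.fakeUniformDetected -> ThreatLevel.WARNING
    else -> ThreatLevel.NONE
}

fun main() {
    println(assess(Signals(suspiciousLinkInChat = true, videoCallActive = false, fakeUniformDetected = false)))
    println(assess(Signals(suspiciousLinkInChat = true, videoCallActive = true, fakeUniformDetected = true)))
}
```

The point of the sketch is that neither signal alone justifies a shutdown; only their co-occurrence does.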
🌋 Challenges we faced
- The Latency-Privacy Tradeoff: Processing video frames in real time requires balancing frame rates with data privacy. We solved this by implementing a "Local-First" filter that only sends frames to Gemini when a video call is active and specific triggers are met.
- Context Windowing in UI: Accessibility events are noisy. Filtering out dynamic timestamps and notification badges to create a "stable hash" for Gemini was a significant engineering hurdle.
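Both challenges can be illustrated with a short sketch, assuming the screen content has already been flattened to a string. The regexes for volatile fragments and the names `normalize`, `stableHash`, and `shouldUploadFrame` are hypothetical; a real accessibility tree would likely need more rules than these.

```kotlin
import java.security.MessageDigest

// Hypothetical patterns for volatile UI fragments (assumption, not the project's rule set).
private val CLOCK_TIME = Regex("""\b\d{1,2}:\d{2}(?::\d{2})?(?:\s?[AaPp][Mm]\b)?""")
private val RELATIVE_TIME = Regex("""\b\d+\s?(?:s|sec|m|min|h|hr)s?\s+ago\b""", RegexOption.IGNORE_CASE)
private val BADGE_COUNT = Regex("""\(\d+\+?\)""")
private val WHITESPACE = Regex("""\s+""")

/** Strips timestamps and badge counters, then collapses whitespace. */
fun normalize(screenText: String): String =
    screenText
        .replace(CLOCK_TIME, "")
        .replace(RELATIVE_TIME, "")
        .replace(BADGE_COUNT, "")
        .replace(WHITESPACE, " ")
        .trim()

/** SHA-256 of the normalized text, hex-encoded; stable across clock ticks and badge changes. */
fun stableHash(screenText: String): String {
    val digest = MessageDigest.getInstance("SHA-256")
        .digest(normalize(screenText).toByteArray(Charsets.UTF_8))
    return digest.joinToString("") { "%02x".format(it) }
}

/** Local-first gate: a frame leaves the device only during a call and after a local trigger. */
fun shouldUploadFrame(videoCallActive: Boolean, localTriggerFired: Boolean): Boolean =
    videoCallActive && localTriggerFired

fun main() {
    val before = "Officer Sharma 10:41 AM Chats (3) Pay now"
    val after = "Officer Sharma 10:42 AM Chats (4) Pay now"
    println(stableHash(before) == stableHash(after)) // only volatile parts differ
    println(shouldUploadFrame(videoCallActive = true, localTriggerFired = false))
}
```

Hashing the normalized text rather than the raw event stream means a ticking clock or an incrementing badge does not register as a content change.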
Future Roadmap
- Pre-emptive Notification Guard: Intercepting suspicious notifications before the user even opens the app.
- Privacy Kill-Switch & Data Cloaking: Automated "Camera Off" and dynamic blurring of sensitive banking data during verified attacks.
- "The Time-Waster" (AI Honeypot): Autonomous AI chat responses that engage and frustrate scammers while collecting forensic evidence.
- Local-First AI (Gemini Nano): Moving detection on-device to maximize privacy and minimize latency.
- Zero-Data Monetization: Subscriptions via Device ID (no login or signup) to maintain the Zero-Trust architecture.
Built With
- gemini3
- kotlinmultiplatform