Inspiration
The inspiration for VoxGuard was born from a chilling reality: AI tools can now clone a human voice from only a few seconds of sample audio. In India, we are seeing a surge in "Emergency" UPI scams in which attackers use deepfaked voices of family members to create false urgency and override the victim's judgment. We realized that while laptops have firewalls, our ears do not. We built VoxGuard to be the "Digital Shield" that restores trust in communication.
What it does
VoxGuard is a dual-engine security dashboard that analyzes suspicious voice notes for signs of both synthetic cloning and psychological manipulation. It takes a raw audio file, extracts a transcript using Azure Speech Services, and then passes that transcript to Google Gemini 2.5 Flash. The system returns a real-time Risk Score (0-100%), highlights linguistic red flags (such as isolation tactics or financial urgency), and gives the user an actionable threat report.
How we built it
We architected a multi-layered pipeline using:
- Frontend: A high-performance, "Cyberpunk-themed" dashboard built with Streamlit.
- The Ears: Azure Speech Services (part of Azure Cognitive Services) for high-accuracy Speech-to-Text extraction.
- The Brain: Google Gemini 2.5 Flash for intent analysis and scam detection.
- Integration: Orchestrated via Python and deployed through Streamlit Cloud for instant accessibility.
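The flow through these layers can be sketched in Python roughly as follows. This is a minimal sketch, not VoxGuard's actual code: `scan_voice_note`, `ThreatReport`, and the two stand-in functions are hypothetical names, and the real Azure and Gemini calls are replaced by plain callables so the flow can be run without credentials.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ThreatReport:
    transcript: str
    risk_score: int  # clamped to 0-100
    red_flags: List[str] = field(default_factory=list)


def scan_voice_note(
    audio_bytes: bytes,
    transcribe: Callable[[bytes], str],
    analyze: Callable[[str], dict],
) -> ThreatReport:
    """Run speech-to-text, then intent analysis, and bundle the result."""
    transcript = transcribe(audio_bytes).strip()
    if not transcript:
        # Assumption: with no intelligible speech, skip the LLM call entirely.
        return ThreatReport(transcript="", risk_score=0)
    verdict = analyze(transcript)
    # Clamp the score so the dashboard meter never leaves its 0-100 range.
    score = max(0, min(100, int(verdict.get("risk_score", 0))))
    return ThreatReport(transcript, score, list(verdict.get("red_flags", [])))


# Hypothetical stand-ins for the real Azure (ears) and Gemini (brain) calls.
def fake_transcribe(audio: bytes) -> str:
    return "Beta, I am in trouble, send money on UPI right now and tell no one."


def fake_analyze(text: str) -> dict:
    flags = [kw for kw in ("right now", "tell no one") if kw in text.lower()]
    return {"risk_score": 40 * len(flags), "red_flags": flags}


report = scan_voice_note(b"...", fake_transcribe, fake_analyze)
print(report.risk_score, report.red_flags)
```

Passing the two engines in as callables keeps the orchestration layer independent of either vendor's SDK, so each side can be swapped or mocked on its own.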
Challenges we ran into
One of the biggest hurdles was managing Streamlit's ScriptRunContext and keeping the environment in sync between our local Windows development setup and the cloud deployment. We also struggled with HTML/CSS injection within Streamlit: our custom neon UI had to render correctly without breaking the Markdown engine. Additionally, we had to fine-tune the Gemini prompt so that it returned a structured, JSON-like format that our risk meters could read accurately.
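Even with a tuned prompt, a model reply may wrap its JSON in extra prose, so a defensive parser helps. The sketch below shows one way to do this; the function name `parse_risk_reply` and the field names `risk_score` and `red_flags` are assumptions made for illustration, not the schema VoxGuard actually uses.

```python
import json
import re


def parse_risk_reply(reply: str) -> dict:
    """Extract the JSON object from a model reply and normalize its fields."""
    # Grab everything from the first "{" to the last "}" so that any
    # surrounding prose or code-fence markers are ignored.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))
    score = float(data.get("risk_score", 0))
    flags = data.get("red_flags", [])
    return {
        # Round and clamp so the risk meter always receives an int in 0-100.
        "risk_score": max(0, min(100, round(score))),
        "red_flags": [str(flag) for flag in flags],
    }


reply = (
    "Sure! Here is the analysis:\n"
    '{"risk_score": 87.6, "red_flags": ["financial urgency", "isolation"]}\n'
    "Stay safe."
)
parsed = parse_risk_reply(reply)
print(parsed)
```

If the reply contains no JSON object, or the JSON is malformed, the function raises a `ValueError` (`json.JSONDecodeError` is a subclass), which the dashboard could catch and show as an "analysis unavailable" state instead of a misleading score.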
Accomplishments that we're proud of
We are incredibly proud of building a working Zero-Trust Voice Architecture. Integrating services from two competing AI giants, Azure and Google, into a single security pipeline was a major technical win. We also managed to build a UI that matches the "Hacker-Defense" aesthetic, making a complex security tool feel intuitive and engaging for the end user.
What we learned
This project taught us the importance of Defense in Depth. We learned that technical deepfake detection is only half the battle; analyzing the psychology of social engineering is equally critical. On the technical side, we mastered API orchestration, environment variable management for security, and advanced CSS customization in Streamlit to create a professional-grade product.
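As an illustration of the environment variable management mentioned above, the sketch below reads API credentials from the environment and fails fast when one is missing. The variable names (`AZURE_SPEECH_KEY`, `AZURE_SPEECH_REGION`, `GEMINI_API_KEY`) and the helper `load_secrets` are hypothetical and may differ from what the project uses.

```python
import os
from typing import Mapping

# Hypothetical variable names; adjust to match the real deployment.
REQUIRED_KEYS = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION", "GEMINI_API_KEY")


def load_secrets(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required credentials, or raise if any are missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        # Report only the names of missing variables, never their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Usage with an explicit mapping; in the app, load_secrets() reads os.environ.
secrets = load_secrets(
    {"AZURE_SPEECH_KEY": "a", "AZURE_SPEECH_REGION": "centralindia", "GEMINI_API_KEY": "g"}
)
```

Keeping keys out of the source tree means the same code can run locally and in the cloud, with each environment supplying its own credentials.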
What's next for VoxGuard
The future of VoxGuard is Proactive Defense. Our next steps include:
- Telegram Bot Integration: Allowing users to forward voice notes for instant scanning.
- Biometric Voiceprints: Verifying callers against a private "Trusted Contact" database.
- Real-Time Mobile Overlay: Developing an app that provides a live "Risk Meter" during active phone calls.