Inspiration
AI-powered voice cloning has made deepfake scams far more convincing, tricking people and organizations into handing over sensitive data or money. In one case, a UK energy firm lost $243,000 to a caller using a deepfake of its CEO's voice. In another, a mother was nearly tricked by a cloned version of her daughter's voice claiming she had been kidnapped. These scams are financially and emotionally devastating, which motivated us to build CallerIDK, a tool that uses AI to detect voice deepfakes and proactively protect sensitive data.
What It Does
CallerIDK is an AI-driven voice fraud detection system built for the finance sector. It identifies deepfake voices in real time during live calls or in recorded audio files, helping confirm that the speaker is authentic before any sensitive financial action takes place. This is crucial for services like banking, where customers frequently call to open credit cards, transfer funds, or manage accounts.
Once a deepfake is detected, CallerIDK activates protective measures, such as:
- Obfuscating sensitive data (e.g., account numbers, PINs) to prevent future exploitation.
- Masking the user's spoken responses, making it harder for attackers to clone their voice.
- Flagging and logging suspicious calls, providing a layer of defense against scams targeting financial institutions.
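As an illustration of the first measure, here is a minimal sketch of masking account numbers and PINs in a call transcript. It assumes the call has already been transcribed to text; the `mask_sensitive` helper and the digit-run pattern are hypothetical and are not taken from the CallerIDK codebase.

```python
import re

# Assumption: any run of 4 or more digits (optionally separated by single
# spaces or hyphens) is treated as sensitive, e.g. a PIN or account number.
DIGIT_RUN = re.compile(r"\d(?:[ -]?\d){3,}")


def mask_sensitive(text: str, keep_last: int = 0) -> str:
    """Replace digits in long digit runs with '*', optionally keeping the last few."""

    def _mask(match: re.Match) -> str:
        run = match.group(0)
        total = sum(ch.isdigit() for ch in run)
        seen = 0
        out = []
        for ch in run:
            if ch.isdigit():
                seen += 1
                # Keep only the final `keep_last` digits of the run visible.
                out.append(ch if seen > total - keep_last else "*")
            else:
                out.append(ch)  # preserve separators such as '-' or ' '
        return "".join(out)

    return DIGIT_RUN.sub(_mask, text)


print(mask_sensitive("My PIN is 4821"))
print(mask_sensitive("Account 1234-5678-9012", keep_last=4))
```

Keeping the last few digits visible is a common convention for letting an agent confirm an account without exposing the full number; whether that is appropriate here depends on the institution's policy.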
We also built additional add-ons to enhance the tool's utility in real-world applications:
- Hidden Ultrasonic Messages: Our obfuscation system adds ultrasonic frequencies that alter the voice signal without being audible to the human ear. This makes it much harder for deepfake models to accurately clone the voice in the future.
- Background Object Detection: We added functionality to detect objects and sounds in the background of a call, offering further context to validate the call's authenticity, especially in suspicious scenarios.
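To make the ultrasonic idea concrete, the sketch below mixes a quiet high-frequency tone into a list of audio samples. It is only an illustration under stated assumptions: samples are floats in [-1.0, 1.0], the audio is sampled at 48 kHz, and the carrier frequency, amplitude, and the `add_ultrasonic_tone` name are all hypothetical. The write-up does not describe the actual signal CallerIDK injects. The one hard constraint shown is the Nyquist limit: the carrier must be below half the sample rate to be representable at all.

```python
import math


def add_ultrasonic_tone(samples, sample_rate=48_000, carrier_hz=20_000.0, amplitude=0.01):
    """Return a copy of `samples` with a low-amplitude sine tone mixed in.

    All parameter defaults are illustrative assumptions, not project values.
    """
    if carrier_hz >= sample_rate / 2:
        raise ValueError("carrier must be below the Nyquist frequency (sample_rate / 2)")

    mixed = []
    for n, sample in enumerate(samples):
        tone = amplitude * math.sin(2 * math.pi * carrier_hz * n / sample_rate)
        # Clip to the valid range so the added tone never causes overflow.
        mixed.append(max(-1.0, min(1.0, sample + tone)))
    return mixed


silence = [0.0] * 480  # 10 ms of silence at 48 kHz
marked = add_ultrasonic_tone(silence)
print(len(marked), max(abs(x) for x in marked) <= 0.01)
```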
How We Built It
- Voice Deepfake Detection: We trained a machine learning model with real and synthetic voice datasets to detect acoustic patterns and waveform inconsistencies in calls.
- Obfuscation System: We implemented a voice obfuscation layer that subtly alters the speech waveform, making it difficult for AI to clone the voice in future scams. This protects sensitive information during calls.
- Hidden Ultrasonic Messages: The obfuscation layer uses ultrasonic frequencies to add hidden information, preventing AI models from successfully reproducing the original voice.
- Background Object Detection: We added detection of sounds and objects in the background of a call, enhancing security by providing additional contextual information about the caller's environment.
- Frontend Interface: A React-based dashboard displays detection results, flags suspicious calls, and logs details in real time.
- Backend Infrastructure: We used Flask for backend API logic and deployed the app on Vercel for quick hosting.
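The write-up does not say which acoustic features the detection model consumes. As a rough sketch of the kind of frame-level signal statistics a voice classifier could be fed, the example below computes RMS energy and zero-crossing rate per frame. The `frame_features` name, the frame size, and the hop length are assumptions for illustration only.

```python
import math


def frame_features(samples, frame_size=400, hop=200):
    """Return a list of (rms, zero_crossing_rate) tuples, one per full frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Root-mean-square energy of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Fraction of adjacent sample pairs where the sign changes.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcr = crossings / (frame_size - 1)
        features.append((rms, zcr))
    return features


# A 100 Hz sine wave sampled at 16 kHz for 0.1 seconds.
sine = [math.sin(2 * math.pi * 100 * n / 16_000) for n in range(1_600)]
for rms, zcr in frame_features(sine)[:2]:
    print(round(rms, 3), round(zcr, 4))
```

Per-frame features like these would then be passed to whatever classifier is trained on the real and synthetic voice datasets; a production system would more likely use spectral features from a dedicated audio library.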
Challenges
- Finding high-quality, varied datasets for voice deepfakes.
- Achieving low-latency, real-time detection for finance-related use cases.
- Balancing detection accuracy with a low false-positive rate, so that legitimate calls aren't flagged.
- Integrating obfuscation and hidden messages without compromising user experience or call quality.
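One common way to handle the accuracy versus false-positive trade-off is to choose the decision threshold on a validation set so that the false-positive rate stays under a fixed cap. The sketch below shows that idea; the `pick_threshold` helper, the 5% default cap, and the toy scores are hypothetical and do not reflect how CallerIDK was actually tuned.

```python
def pick_threshold(scores, labels, max_fpr=0.05):
    """Pick the score threshold with the best detection rate under an FPR cap.

    scores: model outputs, higher means "more likely deepfake".
    labels: 1 for deepfake, 0 for genuine.
    Returns (threshold, true_positive_rate, false_positive_rate), or None
    if no threshold satisfies the cap.
    """
    negatives = labels.count(0)
    positives = labels.count(1)
    if negatives == 0 or positives == 0:
        raise ValueError("need at least one genuine and one deepfake example")

    best = None
    for threshold in sorted(set(scores)):
        flagged = [s >= threshold for s in scores]
        fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        fpr = fp / negatives
        tpr = tp / positives
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (threshold, tpr, fpr)
    return best


scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 0, 1]
print(pick_threshold(scores, labels, max_fpr=0.0))   # strict: no false alarms
print(pick_threshold(scores, labels, max_fpr=0.25))  # looser cap, more detections
```

Tightening the cap protects legitimate callers at the cost of missing more deepfakes, which is the balance described in the challenge above.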
Accomplishments
- Achieved 80% accuracy in detecting deepfake voices on our test set.
- Built a real-time voice monitoring system with minimal lag, essential for financial transactions.
- Developed a full-stack working prototype within the hackathon window, including our obfuscator, ultrasonic filter, and background detection system.
What We Learned
- Modern deepfake voice models are surprisingly sophisticated.
- AI should be built to protect people, not just to impress them.
- Building both detection and prevention systems in parallel is challenging but rewarding.
What's Next for CallerIDK
- Integrate into VoIP systems and customer service software in the finance sector.
- Add multilingual support to cater to global users in financial services.
- Train with more diverse datasets, including new voice generation models like GPT-4 Voice.
- Package the obfuscation module as an API for banking apps and financial institutions.
- Develop a browser extension for real-time scam detection on platforms like WhatsApp and Zoom, especially for financial calls.