AI Voice Detector

Inspiration

With the rapid growth of AI-generated voices and voice cloning tools, it is becoming harder to tell whether an audio recording is spoken by a real human or created by a machine.
This can lead to problems such as voice impersonation, fake audio messages, and misinformation.

We built AI Voice Detector to provide a simple, accessible way to check whether a voice sample is authentic, using the Gemini API for the analysis.

What it does

AI Voice Detector is a web application that:

- Lets users select a language
- Lets users upload an audio file (MP3)
- Sends the audio to the Gemini API for analysis
- Detects whether the voice is:
  - AI Generated
  - Human
- Returns:
  - Classification
  - Confidence score
  - Reasoning
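
The returned result can be pictured as a small typed object. The sketch below is illustrative only: the field names, the two label strings, and the 0 to 1 confidence range are assumptions, not the project's actual schema.

```typescript
// Hypothetical shape of the result returned to the frontend.
// Field names, label strings, and the 0..1 range are assumptions.
type Classification = "AI_GENERATED" | "HUMAN";

interface DetectionResult {
  classification: Classification;
  confidence: number; // assumed to lie in [0, 1]
  reasoning: string;
}

// Runtime check that an unknown value matches the assumed shape.
function isDetectionResult(value: unknown): value is DetectionResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v.classification === "AI_GENERATED" || v.classification === "HUMAN") &&
    typeof v.confidence === "number" &&
    v.confidence >= 0 &&
    v.confidence <= 1 &&
    typeof v.reasoning === "string"
  );
}
```

A guard like this lets the frontend refuse to render a response that is missing a field or carries an out-of-range confidence.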

How we built it

The project was built as a full-stack web application.

Frontend

- React (Vite)
- Tailwind CSS
- Axios

The frontend provides:

- A clean user interface
- Language selection
- Drag-and-drop audio upload
- Result display for classification and confidence

Backend

- Node.js
- Express.js
- Multer (for file uploads)
- Gemini API

The backend:

- Receives the uploaded audio file
- Converts it to Base64
- Sends it to the Gemini API with a structured prompt
- Parses the response
- Returns a JSON result to the frontend
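
The encode-and-send steps above can be sketched roughly as follows. This is a minimal sketch, not the project's code: the function names and prompt wording are hypothetical, and the request body follows the `contents` / `parts` / `inline_data` layout from the public Gemini REST examples as we understand them.

```typescript
// One part of a Gemini request: either prompt text or inline binary data.
type Part =
  | { text: string }
  | { inline_data: { mime_type: string; data: string } };

interface GeminiRequest {
  contents: { parts: Part[] }[];
}

// Encode raw bytes as Base64. In Node.js, Buffer.from(bytes).toString("base64")
// produces the same string; btoa is used here to keep the sketch portable.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build a request body with one text part (the prompt) and one inline-data
// part (the audio). The prompt wording is a placeholder assumption.
function buildGeminiRequest(audio: Uint8Array, language: string): GeminiRequest {
  const prompt =
    `The speaker's language is ${language}. ` +
    "Decide whether this voice is AI generated or human and reply with JSON only.";
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: "audio/mp3", data: toBase64(audio) } },
        ],
      },
    ],
  };
}
```

With Multer's memory storage, the uploaded bytes would be available as `req.file.buffer`, which can be passed in directly because a Node.js `Buffer` is a `Uint8Array`.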

Challenges we ran into

- Handling large audio files while keeping performance stable
- Validating uploaded files and their format correctly
- Designing a prompt that makes Gemini return structured JSON output
- Avoiding hardcoded results so that each confidence value comes from the model's analysis
- Making the UI responsive and easy to use
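
For the file-size and format challenges, a validation step along these lines is one option. It is a sketch under stated assumptions: the 10 MB limit and the accepted MIME types are made up for illustration, and only the `originalname`, `mimetype`, and `size` fields that Multer attaches to an uploaded file are used.

```typescript
// Subset of the fields Multer attaches to an uploaded file.
interface UploadedFile {
  originalname: string;
  mimetype: string;
  size: number; // bytes
}

// Assumed limits; the project's real values are not stated.
const MAX_BYTES = 10 * 1024 * 1024;
const ALLOWED_MIME_TYPES = ["audio/mpeg", "audio/mp3"];

// Returns an error message, or null when the file looks acceptable.
function validateUpload(file: UploadedFile | undefined): string | null {
  if (!file) return "No file uploaded.";
  const hasMp3Extension = /\.mp3$/i.test(file.originalname);
  if (!hasMp3Extension || ALLOWED_MIME_TYPES.indexOf(file.mimetype) === -1) {
    return "Only MP3 audio files are accepted.";
  }
  if (file.size === 0) return "The file is empty.";
  if (file.size > MAX_BYTES) return "The file exceeds the size limit.";
  return null;
}
```

Multer's own `limits.fileSize` and `fileFilter` options can enforce similar rules before the file reaches the route handler; a separate function like this mainly makes the rules easy to unit-test.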

Accomplishments that we're proud of

- Successfully integrated the Gemini API for real-time audio analysis
- Built a complete end-to-end system from audio upload to result display
- Implemented language selection for better contextual analysis
- Created a clean and modern user interface
- Designed a modular backend architecture with controllers and services

What we learned

- How to process and transmit audio data using Base64 encoding
- How to work with the Gemini API for non-text inputs
- Prompt engineering for structured JSON responses
- Full-stack integration between React and Node.js
- Error handling and validation for file uploads
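
The prompt-engineering and error-handling points can be illustrated together. In the sketch below, the prompt text, the JSON keys, and the function names are all hypothetical; the parser simply assumes the model may wrap its JSON in extra text or code fences and extracts the outermost object before validating it.

```typescript
interface DetectionResult {
  classification: "AI_GENERATED" | "HUMAN";
  confidence: number; // assumed 0..1
  reasoning: string;
}

// Hypothetical prompt asking for JSON only; the real wording is not published.
function buildPrompt(language: string): string {
  return [
    `The audio is spoken in ${language}.`,
    "Classify the voice as AI_GENERATED or HUMAN.",
    'Reply with JSON only: {"classification": "...", "confidence": 0.0, "reasoning": "..."}',
    "confidence must be a number between 0 and 1.",
  ].join("\n");
}

// Parse the model's reply, tolerating surrounding prose or code fences.
function parseModelReply(reply: string): DetectionResult {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model reply.");
  }
  const obj = JSON.parse(reply.slice(start, end + 1)) as Record<string, unknown>;

  const label =
    obj.classification === "AI_GENERATED" ? "AI_GENERATED"
    : obj.classification === "HUMAN" ? "HUMAN"
    : null;
  if (label === null) throw new Error("Unexpected classification label.");

  const confidence = obj.confidence;
  if (typeof confidence !== "number" || !isFinite(confidence)) {
    throw new Error("Confidence is not a finite number.");
  }
  const reasoning = obj.reasoning;
  if (typeof reasoning !== "string") throw new Error("Reasoning is missing.");

  // Clamp into [0, 1] so the UI never shows an impossible percentage.
  return {
    classification: label,
    confidence: Math.min(1, Math.max(0, confidence)),
    reasoning,
  };
}
```

Throwing on a malformed reply, rather than falling back to a default result, keeps hardcoded values out of the response and lets the route handler return a clear error instead.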

What's next for AI Voice Detector

Future improvements include:

- Supporting more languages
- Improving detection accuracy
- Adding user authentication
- Storing past analysis results
- Deploying the system to a cloud platform
- Making the UI mobile-friendly

Built With

axios
express.js
gemini
node.js
react
tailwindcss
vite

Updates

Akash Rawat started this project — Feb 02, 2026 05:13 PM EST
