Inspiration
Our inspiration for this application came from the rise of deepfake content on the internet, and specifically live deepfake video. In a world where online impersonation is so easy, it becomes increasingly difficult to know whether you are talking with a real person.
What it does
The application watches your window and produces a score from 0 to 100 that rates the overall likelihood that you are actually speaking to a human. It scans every 5 seconds for forensic and behavioral information and updates the score. A lower score means a higher likelihood that the person you are speaking to is not real: we treat a score below 50 as AI and a score above 80 as human. Additional information is shown below the score, along with a graph of the score.
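The score bands described above can be sketched as a small function. This is only an illustrative sketch, not our actual code: the function name `classify_score` and the label strings are hypothetical, and the "inconclusive" label for the 50 to 80 range is an assumption, since we only define the two outer bands.

```python
def classify_score(score: float) -> str:
    """Map a 0-100 trust score to a verdict label.

    Thresholds follow the write-up: below 50 is treated as AI,
    above 80 as human. The middle band label is an assumption.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 50:
        return "likely_ai"
    if score > 80:
        return "likely_human"
    # Assumption: scores from 50 to 80 are reported as undecided.
    return "inconclusive"


if __name__ == "__main__":
    for s in (12, 65, 93):
        print(s, classify_score(s))
```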
How we built it
- Forensic Reasoning Layer: This layer acts as a high-resolution digital microscope, analyzing the "DNA" of the image to find the mathematical footprints of generative AI. Multimodal Pixel Inspection: We leverage Gemini 2.5 Flash to perform deep-tissue scans of the image, specifically hunting for "generative artifacts" that are often invisible to the naked eye. Artifact Checklist: The system inspects for Perimeter Blending (blurred halos around hair and ears), Skin Topography anomalies (artificial grain or "waxy" over-smoothing), and Temporal Jitter (disconnected jawlines). The Critical Threshold: Following a "Zero-Trust" policy, if the model detects even a single forensic artifact, such as inconsistent catch-lights in the pupils, the True Trust Score is automatically downgraded below 50.
- Volumetric Liveness Check (MediaPipe 3D): The Implementation: We integrated a "Profile Integrity" check using the MediaPipe Face Landmarker. The Logic: Most deepfakes are 2D projections optimized for front-facing views; they often "warp" or "melt" during lateral rotation. Technical Depth: Our system tracks 478 3D landmarks to calculate volumetric facial depth. If the distance between the nose and the ear "snaps" or displays a non-biological jump during a 90-degree turn, the code flags a Face-Adhesion Failure.
- Behavioral Reasoning Layer: This layer shifts the focus from pixels to people, analyzing the unconscious biological rhythms that AI currently struggles to replicate perfectly. Ocular Rhythm Analysis: We monitor for "Fixed Dead-Eye Stares" or missing micro-blinks. If the subject displays glassy, vacant textures or overly plastic reflections, it triggers an immediate behavioral warning. Emotional Sync & Adaptors: The layer detects mismatched facial expressions (e.g., exaggerated smiles that don't reach the eyes) and a lack of unconscious micro-movements like fidgeting or scratching. Biological Consistency: By checking for Emotional Incoherence and gaps in micro-expressions, this layer ensures the subject is a living, reacting human rather than a static digital projection.
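Two of the rules above, the "Zero-Trust" downgrade and the nose-to-ear "snap" check, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not our production code: the function names, the cap value of 49, the `max_jump` threshold of 0.05, and the landmark indices passed in are all hypothetical. It takes landmark coordinates as plain `(x, y, z)` tuples and does not call the MediaPipe API itself.

```python
import math
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def apply_zero_trust(score: float, artifacts: Sequence[str],
                     cap: float = 49.0) -> float:
    """Cap the score below 50 if any forensic artifact was reported.

    `cap` is an assumed value; the write-up only says "below 50".
    """
    return min(score, cap) if artifacts else score


def nose_ear_distances(frames: Sequence[Sequence[Point3D]],
                       nose_idx: int, ear_idx: int) -> List[float]:
    """Euclidean nose-to-ear distance in each frame.

    Each frame is a list of (x, y, z) landmarks; the indices are
    supplied by the caller and are placeholders here.
    """
    return [math.dist(f[nose_idx], f[ear_idx]) for f in frames]


def has_adhesion_failure(distances: Sequence[float],
                         max_jump: float = 0.05) -> bool:
    """Flag a jump between consecutive frames larger than max_jump.

    `max_jump` is an assumed threshold in normalized landmark units.
    """
    return any(abs(b - a) > max_jump
               for a, b in zip(distances, distances[1:]))


if __name__ == "__main__":
    smooth = [0.30, 0.29, 0.28, 0.27]   # gradual change during a turn
    snapped = [0.30, 0.29, 0.10, 0.27]  # sudden non-biological jump
    print(has_adhesion_failure(smooth), has_adhesion_failure(snapped))
    print(apply_zero_trust(92.0, ["inconsistent catch-lights"]))
```

Comparing consecutive frames rather than absolute distances keeps the check independent of how far the subject sits from the camera, though a real threshold would need tuning against recorded head turns.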
What we learned
We learned just how important good project planning is. We had many ideas that we could not implement due to time constraints, mostly because of issues caused by our unfamiliarity with certain technologies. We also learned how difficult a fast-paced environment can be. We had two new members, and it was an entirely new experience for us to ship a product this quickly.
What's next for DeepFake Mask Breaker
- Integrating real-time audio analysis into our existing Gemini 2.5 Flash pipeline.
- Storing every forensic verdict in a Snowflake Data Warehouse for long-term security monitoring.
- Simple "Thumbs Up/Down" buttons to validate AI predictions, using real-world user feedback to refine our detection models in real time.
- Explaining in the app what the rating means: below 50 = DeepFake video, above 80 = human/real video.
Built With
- fastapi
- gemini
- javascript
- mediapipe
- python
- react