Inspiration

🌍 The Genesis: Bridging the "Human Gap" in Cybersecurity

"The gap in cybersecurity isn't technology; it's the human reaction to panic."

CYBE was founded by Sakib and Rawa, two engineers united by a critical observation: The modern cybersecurity landscape is a paradox. While our technical defenses—firewalls, antivirus, and encryption—are incredibly effective against code-based threats like malware, they are utterly defenseless against manipulation-based attacks, commonly known as social engineering or fraud.

The core of the problem is the Human Vulnerability State: When targeted by a scam—a call from a fake bank agent, a sudden email threat, or a compromised QR code—users don't default to logic; they panic. In this state of high stress, critical thinking shuts down. People rush to follow instructions to "fix" the problem, inadvertently giving up their credentials or approving a fraudulent transaction. They are not just being scammed; they are being engineered into a moment of crisis where they don't know who to trust or what to do next. This is the moment traditional security tools fail, as they only watch the device, not the user's state of mind.

Coming from emerging economies and conflict-affected regions, we recognized that for millions of users, especially those using mobile finance as a lifeline, a digital error is not merely a financial inconvenience—it is a devastating threat to livelihood and survival.

The Context: Two Worlds, One Critical Vulnerability

Our diverse backgrounds vividly illustrated this universal flaw in current security models.

The Digital Velocity Gap (Bangladesh Perspective): As Sakib witnessed, the velocity of digital adoption in Bangladesh, where mobile financial services like bKash are ubiquitous, has completely outpaced digital literacy. Families are targeted not by complex viruses but by "contextual fraud": fake customer support agents exploiting trust, or physical QR-code tampering at local merchants. The infrastructure has technical security (encryption), but it lacks user safety (intent verification). When a user panics and is pressured into transferring money, the system sees a legitimate transaction: a disastrous disconnect.

The High-Stakes Environment (Yemen Perspective): For Rawa, coming from a conflict zone, digital finance is a fragile lifeline. Remittances keep families alive, and if a student falls for a phishing scam, there are no robust 24/7 fraud hotlines or protective institutions to reverse the transaction. Existing security tools assume users are calm, educated, and protected by strong institutions. In reality, millions operate under systemic fragility, making them uniquely vulnerable to panic-induced, high-stakes errors.

We built CYBE to bridge this gap. We moved beyond device protection to Human Resilience, creating an AI that acts as a Bodyguard (actively preventing the moment of panic and fraud) and a Medic (providing immediate, step-by-step recovery support when the user is already in crisis). Our solution watches the user's behavior and context, intervening at the critical moment of manipulation, before the fraudulent transaction is complete.

What it does

CYBE is an advanced, AI-powered, voice-first assistant designed to shield users from digital scams before they occur and provide stable, immediate support if they do. Leveraging the Gemini family of models, CYBE moves beyond simple technical checks to deep Intent Verification.

Mode A: The Guardian (Vibe Check)

This mode focuses on real-time, proactive prevention, acting as a constant "vibe check" on digital interactions.

Contextual QR & Manual Link Analysis:

  • Proactive Scanning: Unlike basic scanners, CYBE analyzes the intent of the destination. For a QR code or manually typed URL, CYBE checks the redirect for typical scam indicators.
  • Behavioral Red Flagging: If a link scanned at a public venue redirects to a personal wallet or an unexpected payment gateway, CYBE interjects with a reassuring yet firm voice alert: "This link is technically safe, but it's routing payment to 'John Doe'. Is that the person you intend to pay right now? Please confirm the identity."
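
The behavioral red-flag logic above can be sketched as a plain context check. This is a minimal illustration, not CYBE's actual API: the names (`contextRedFlag`, `ScanContext`, `LinkAnalysis`) and the field layout are assumptions for the sake of the example.

```typescript
// Illustrative sketch of the contextual red-flag check. All type and
// function names here are hypothetical, not CYBE's production code.

interface ScanContext {
  venueType: "merchant" | "personal" | "unknown"; // where the QR was scanned
  expectedPayee?: string;                          // who the user meant to pay
}

interface LinkAnalysis {
  technicallySafe: boolean;        // passed blacklist / malware checks
  payee: string;                   // destination account name, if resolvable
  payeeType: "merchant" | "personal";
}

function contextRedFlag(link: LinkAnalysis, ctx: ScanContext): string | null {
  if (!link.technicallySafe) {
    return "This link failed technical safety checks.";
  }
  // A merchant-venue scan routing to a personal wallet is a classic mismatch.
  if (ctx.venueType === "merchant" && link.payeeType === "personal") {
    return `This link is technically safe, but it's routing payment to '${link.payee}'. Is that who you intend to pay?`;
  }
  // The payee resolved from the link differs from the payee the user expected.
  if (ctx.expectedPayee && ctx.expectedPayee !== link.payee) {
    return `Payment is addressed to '${link.payee}', not '${ctx.expectedPayee}'. Please confirm the identity.`;
  }
  return null; // no red flag: context and destination agree
}
```

The point of the sketch is that the verdict depends on the scan context, not only on the link itself: the same "technically safe" link yields a warning in one context and silence in another.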

Multimodal Intent Analysis:

  • Screenshot Upload & Analysis: Users can upload screenshots of suspicious texts, emails, or web pages. CYBE's multimodal AI analyzes the image in real time for visual cues of social engineering: unprofessional logos, pixelation, a forced sense of urgency, or mismatched branding.
  • Psychological Threat Detection: The AI scans incoming communications (SMS, email content) for "Urgency Patterns" (e.g., "Act now or account deleted," "Immediate payment required") and manipulative language designed to induce panic, flagging psychological traps that traditional spam filters miss entirely. A gentle notification provides an analysis: "This message contains high-urgency language often used in phishing attempts. Please take a moment and verify the sender's details before clicking anything."
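
To make the "Urgency Patterns" idea concrete, here is a deliberately simplified keyword sketch. In CYBE the scoring is done by Gemini's language understanding, not a fixed list; the `urgencyScore` function and pattern set below are illustrative assumptions only.

```typescript
// Hypothetical keyword-based sketch of urgency-pattern screening.
// CYBE's real scoring uses an LLM; this shows the kind of signal involved.

const URGENCY_PATTERNS: RegExp[] = [
  /act now/i,
  /immediate(ly)? (payment|action) required/i,
  /account (will be )?(deleted|suspended|locked)/i,
  /within 24 hours/i,
  /verify (your )?(identity|account) now/i,
];

function urgencyScore(message: string): number {
  // Fraction of known urgency patterns present, in [0, 1].
  const hits = URGENCY_PATTERNS.filter((p) => p.test(message)).length;
  return hits / URGENCY_PATTERNS.length;
}
```

A static pattern list like this is exactly what attackers route around by rephrasing, which is why the document argues for model-based scoring of intent rather than keyword matching.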

Mode B: The Medic (Panic Room)

When a scam occurs or is narrowly avoided, this mode activates a calm, structured "Panic Room" environment focused on damage control and recovery.

Crisis Mode Activation & Personalized Response: The user can instantly trigger this mode by tapping the prominent "SOS" button to activate the Panic Room, or by using a voice command like "I've been scammed." The user interface immediately shifts to a high-contrast, streamlined, and calm design to reduce cognitive load and panic. The app automatically determines the user's geolocation (country and city) to tailor all subsequent advice to local laws, banking protocols, and reporting requirements.

Interactive, Localized, Structured Action Sequencing: The AI adopts a slow, clear, and immensely reassuring voice, actively listening to the user's brief description of the incident. It then systematically guides the user through the critical first minutes of incident response with country- and city-specific, step-by-step instructions that comply with local rules and laws. This includes:

  • Immediate Financial Countermeasures: Step-by-step guidance on instantly freezing exposed financial accounts and cards based on local bank procedures.
  • Digital Security Restoration: Clear guidance on changing compromised passwords across critical platforms and securing two-factor authentication.
  • Identity Restoration: Localized steps for securing identity and addressing potential identity theft.

The AI's primary goal is to provide a clear, non-panic-inducing path forward, continuously reminding the user to breathe and follow the simple steps on the screen.

Automated Incident Packet Generation & Department Communication: CYBE instantly collates all relevant data (the original suspicious text or email, the incident timeline, the URL, transaction details, and the user's action sequence) into a structured, police-ready and bank-claim-ready PDF report.

Direct Communication Assistance: The app provides direct, localized contact information (phone numbers, email addresses, or online portal links) for the correct national or local police cybercrime unit, bank fraud department, and consumer protection agencies relevant to the user's geolocation. The Incident Packet and integrated communication assistance let the user file a police report or bank fraud claim instantly and accurately, even while still recovering from the shock of the event, ensuring they reach the proper department for reporting and countermeasures.
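
The Incident Packet described above can be sketched as a simple data model plus a text renderer. The field names (`IncidentPacket`, `renderReport`) are assumptions for illustration; the real app collects these fields and renders them into a PDF, which is out of scope here.

```typescript
// Hypothetical shape of the Incident Packet; field names are illustrative.

interface IncidentPacket {
  generatedAt: string;                         // ISO timestamp of report generation
  location: { country: string; city: string }; // drives localized reporting advice
  originalMessage: string;                     // the suspicious text/email, PII-redacted
  suspiciousUrl?: string;
  transactionDetails?: { amount: number; currency: string; recipient: string };
  timeline: { time: string; event: string }[]; // user-described incident sequence
  actionsTaken: string[];                      // countermeasures already completed
}

function renderReport(p: IncidentPacket): string {
  // Plain-text rendering; the production app renders this into a PDF.
  return [
    `INCIDENT REPORT (${p.generatedAt})`,
    `Location: ${p.location.city}, ${p.location.country}`,
    `Original message: ${p.originalMessage}`,
    ...(p.suspiciousUrl ? [`URL: ${p.suspiciousUrl}`] : []),
    "Timeline:",
    ...p.timeline.map((t) => `  ${t.time} - ${t.event}`),
    "Actions taken:",
    ...p.actionsTaken.map((a) => `  - ${a}`),
  ].join("\n");
}
```

Keeping the packet as one structured record is what makes it reusable: the same object can feed the police report, the bank claim, and the in-app summary.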

How we built it

We architected CYBE to be privacy-first and low-latency, leveraging the cutting-edge power of Google Gemini 3.

The Intelligence Layer (Gemini 3): We chose the Gemini 3 model for its multimodal reasoning and incredible speed. In a voice conversation, latency kills trust. We feed the model raw text from SMS, URL metadata, and, crucially, image inputs (screenshots and QR codes). We use a custom prompt chain and multimodal analysis to score "Psychological Urgency" and detect transactional threat vectors.

Key detection capabilities leveraging Gemini 3:

  • QR Code & Screenshot Detection: Gemini 3 analyzes screen captures and images for suspicious QR codes, payment requests, or sensitive information being shared.
  • Link Detection and Analysis: The model processes raw text and URL metadata for immediate phishing and malicious-link identification.
  • Psychological Urgency Scoring: Instead of scoring only technical threats, we use Gemini 3's advanced NLP to score panic-inducing language.

We implemented a Risk Scoring Algorithm using a weighted heuristic model:

$$R_{\text{total}} = \alpha \cdot T_{\text{threat}} + \beta \cdot (I_{\text{urgency}} + C_{\text{context}})$$

Where:

  • $T_{\text{threat}}$ is the binary threat status derived from multimodal analysis (Gemini 3) and external APIs (VirusTotal/Safe Browsing).
  • $I_{\text{urgency}}$ is the NLP score derived from Gemini detecting panic-inducing language.
  • $C_{\text{context}}$ is the mismatch score (e.g., Merchant QR $\ne$ Personal Wallet).

The Voice Interface: We utilized Google Speech-to-Text (STT) and the Gemini Live Voice API, which provides extremely low-latency, natural, and contextual voice responses. "Medic Mood" Voice: We fine-tuned the voice persona and used a contextual system prompt ("Medic Mood") to guide Gemini's responses. This prompt instructs the model to lower its pitch and speaking speed during high-risk scenarios ("Crisis Mode") to psychologically induce calm and provide clear, reassuring instructions to the user.

Threat Detection Stack:

  • Gemini 3 Multimodal Analysis: Core engine for image, text, and context-based threat and urgency scoring.
  • VirusTotal API: For file hash verification.
  • Google Safe Browsing API: For real-time phishing blacklists.
  • Flutter/React Native: For a seamless cross-platform mobile experience.
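
The weighted risk formula can be implemented in a few lines. The weights below ($\alpha = 0.6$, $\beta = 0.2$) and the `totalRisk` helper are illustrative assumptions, not CYBE's tuned values.

```typescript
// Direct reading of R_total = α·T_threat + β·(I_urgency + C_context).
// The weights here are example values, not the production-tuned ones.

interface RiskInputs {
  threat: 0 | 1;   // T_threat: binary flag from Gemini + VirusTotal/Safe Browsing
  urgency: number; // I_urgency: panic-language score in [0, 1]
  context: number; // C_context: payee/context mismatch score in [0, 1]
}

const ALPHA = 0.6; // weight on the hard technical-threat signal
const BETA = 0.2;  // shared weight on the soft psychological/contextual signals

function totalRisk({ threat, urgency, context }: RiskInputs): number {
  return ALPHA * threat + BETA * (urgency + context);
}
```

With these example weights, a confirmed technical threat alone scores 0.6, while urgency and context mismatch together can add up to 0.4; a message can therefore cross an alert threshold on psychological signals even when no blacklist fires.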

Challenges we ran into

The "Context" Problem: Moving to Interrogative Security: We realized that a simple "Safe" or "Unsafe" classification was inadequate for modern phishing. A link to a legitimate bank's login page is "safe" in a vacuum, but if the user didn't initiate the transaction or doesn't expect the link, it becomes a high-risk security event. To address this, we shifted from binary classification to Interrogative Security. The AI was trained to act less like a security filter and more like a human-centric security assistant, prompting the user with a critical, context-aware question: "Is this who you expect? Did you just request a password reset?" This required complex prompt engineering to teach the model to analyze the surrounding conversational context (the user's previous messages and recent actions) and generate a response that wasn't just technically correct, but also addressed the user's current intent and emotional state.
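
One way to picture the prompt-engineering shift is a prompt builder that injects the user's recent context alongside the link under review. The exact prompts CYBE uses are not published; the `buildInterrogativePrompt` function and its wording below are assumptions sketching the shape.

```typescript
// Hypothetical prompt construction for "Interrogative Security".
// The wording is illustrative; CYBE's actual prompt chain is not shown here.

interface InteractionContext {
  userInitiated: boolean;      // did the user just request this action?
  recentUserActions: string[]; // e.g. ["opened banking app", "requested reset"]
  channel: string;             // "sms", "email", ...
}

function buildInterrogativePrompt(linkDescription: string, ctx: InteractionContext): string {
  return [
    "You are a calm, human-centric security assistant.",
    "Do not answer only 'safe' or 'unsafe'. Instead, ask the user one short,",
    "context-aware question that tests whether they expected this interaction.",
    `Link under review: ${linkDescription}`,
    `User initiated the flow: ${ctx.userInitiated ? "yes" : "no"}`,
    `Recent user actions: ${ctx.recentUserActions.join("; ") || "none"}`,
    `Channel: ${ctx.channel}`,
    "If the user did not initiate it, treat even a technically safe link as high risk.",
  ].join("\n");
}
```

The key design choice is that the model's output is a question to the user, not a verdict about the link: the same bank login URL produces "Did you just request a password reset?" when unexpected, and no interruption when the user initiated the flow.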

Latency in Crisis: Prioritizing User Experience: In a moment of potential crisis, such as receiving a suspicious message, every millisecond counts. We observed that a user's panic is directly amplified by system delays. A standard 3-second API call, while technically fast, felt like an unacceptable eternity to a panicked user waiting for confirmation. Our solution was to architect our backend calls to run asynchronously. This allowed the Voice Assistant to immediately acknowledge the user and begin speaking with preliminary, reassuring context (e.g., "I'm scanning that now, hold on a moment...") while the technical scan and analysis were still finalizing in the background. This optimization prioritized the human element, reducing anxiety and providing immediate feedback, over waiting for a complete, all-at-once technical verdict.

Privacy vs. Utility: We wanted to analyze messages without sending PII (Personally Identifiable Information) to the cloud. We built a local regex pre-filter to redact names and credit card numbers before sending data to Gemini for analysis.
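
A local regex pre-filter of the kind described can be sketched as an ordered list of pattern/label pairs. The specific patterns below are simplified examples, not CYBE's production rule set; in particular, real name detection needs far more than a regex.

```typescript
// Simplified sketch of a local PII pre-filter run before any cloud call.
// Patterns are illustrative examples, not the production rule set.

const REDACTIONS: [RegExp, string][] = [
  // 13-19 digit runs (optionally spaced/dashed) as candidate card numbers
  [/\b(?:\d[ -]?){13,19}\b/g, "[CARD]"],
  // simple email addresses
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "[EMAIL]"],
  // naive "First Last" title-case pattern; real name detection needs
  // much more than a regex (this will both miss and over-match)
  [/\b[A-Z][a-z]+ [A-Z][a-z]+\b/g, "[NAME]"],
];

function redactPII(text: string): string {
  // Apply each redaction in order; card numbers first so digit runs
  // never survive into later passes.
  return REDACTIONS.reduce((t, [pattern, label]) => t.replace(pattern, label), text);
}
```

Because the labels preserve the message's structure ("pay [CARD] owned by [NAME]"), the cloud model can still reason about intent and urgency without ever seeing the raw identifiers.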

Accomplishments that we're proud of

The "Human" UX: Real-Time Crisis De-escalation: We built a "Crisis Mode" that (conceptually) lowers the user's heart rate by simplifying the screen and slowing down the voice. Deeper point: perfectly detecting every social-engineering or emotional-fraud attempt is impossible. Our achievement lies in creating an immediate psychological countermeasure that actively calms the user in the moment of panic, a hacker's greatest ally. This is an essential layer of defense for proactive prevention and cybersecurity awareness.

The Incident Packet Generator: Automating the Counterattack: We automated the most tedious part of being scammed: filling out police reports. CYBE generates a professional PDF in seconds. Deeper point: post-scam recovery is often abandoned due to bureaucratic friction. By automating evidence collection and official reporting, we turn a tedious chore into a swift, actionable countermeasure, ensuring victims can take proper legal steps toward restitution while contributing to broader fraud awareness.

Contextual QR Verification: AI's Eye on Hidden Fraud: We demonstrated that real-time AI (powered by a Gemini integration for Social Good) can catch scams hidden behind technically valid QR codes simply by analyzing the metadata of the payment destination. Deeper point: in the new era of sophisticated, AI-driven attacks, simple validation is insufficient. Fraud is becoming more pervasive and personalized. Our breakthrough is using Gemini's contextual understanding to go beyond surface-level checks, providing an essential, proactive safeguard against next-generation financial fraud.

AI-Driven Comprehensive Threat Intelligence: We integrated real-time AI, leveraging a Gemini integration for Social Good, to establish a multi-layered defense against increasingly sophisticated financial fraud. Deeper point: this system is engineered to identify, respond to, and prevent diverse attack vectors:

  • Social Engineering & Phishing: The AI analyzes tone, urgency, and atypical requests in communication patterns (SMS, email, and live interaction transcripts) to flag emotional-manipulation tactics often missed by traditional filters.
  • Pattern Detection: Using advanced machine learning, the system continuously monitors transaction history, geographic location data, and device behavior to establish a unique user-risk baseline. Any significant deviation, such as a sudden, high-value transfer immediately following a suspicious SMS, triggers an immediate alert and automated "Crisis Mode" activation.
  • Universal Protection: Recognizing that AI-powered attacks and fraud are only becoming more prevalent, this comprehensive, context-aware defense system provides an essential layer of cybersecurity for everyone, moving beyond simple detection to active, real-time prevention and heightened public awareness.

What we learned

Panic is the Enemy & Voice Creates Trust:

  • Teamwork for Empathy: Developing an empathetic security tool requires diverse skill sets. A team combining security experts (who understand threats), UX designers (who prioritize user experience), and psychologists (who understand emotional response) is crucial for creating messages that are both informative and calming.
  • Vibe Coding for Trust: "Vibe Coding" in this context means crafting the security tool's personality. The voice, tone, and pacing of warnings should be deliberately designed to project calm authority rather than alarm. This applies to both text and generated voice.
  • Voice AI API Integration: A Voice AI API can deliver these empathetic warnings. Instead of a jarring siren or generic computer voice, the API can generate speech with a pre-selected calm, human-like cadence explaining the risk ("Hold on a moment, Nazmus. That link looks like it's trying to get you to enter your password on a non-secure site. Let's close that down.").

Intent Matters:

  • Cybersecurity Awareness (Intent Focus): Modern cybersecurity awareness must shift from "don't click bad links" (since the links are often technically "good") to "always question the ask." Awareness campaigns need to train users to recognize malicious intent, like an urgent request for a password change via an unusual channel, even if the link leads to a valid Google Form.
  • Voice AI API for Social Good (Phishing Education): The same Voice AI technology used for real-time defense can be deployed for education. It can simulate high-quality phishing attacks (voice or text-based) in a safe environment, allowing users to practice recognizing malicious intent without real-world risk. For instance, a simulated call from an "urgent bank representative" with a realistic yet unnerving tone could train users better than a static email example.

What's next for CYBE

  • Deepfake Detection: Utilizing Gemini's multimodal vision and audio analysis to scrutinize video calls in real time. The system would actively look for inconsistencies in facial movements, lip-syncing errors, and unnatural voice characteristics, hallmarks of AI-generated content, specifically targeting their use in "Grandparent Scams," where an imposter impersonates a family member in distress.
  • Direct Bank APIs: Establishing secure, real-time partnerships and API integrations with major fintech applications and banks. This would allow a user's voice command, such as "Freeze Card," to directly trigger card cancellation or suspension within the bank's system via API call, minimizing the critical time window for fraudulent transactions after a scam attempt is detected.
  • Offline Heuristics: Developing a compact, efficient, on-device machine learning model that uses a simplified set of linguistic and behavioral indicators (heuristics) to identify potential scam messages or calls locally, ensuring continuous protection even in areas with poor or no internet connectivity.

Built With

  • google-gemini-2.5-flash
  • google-gemini-3
  • google-genai-sdk
  • google-neural-text-to-speech
  • google-speech-to-text-(stt)
  • lucid-react
  • react
  • tailwind-css
  • typescript
  • websockets