Inspiration
India is bringing the "Next Billion Users" online, but digital literacy has not kept pace with digital adoption. We observed a critical vulnerability in the ecosystem: while enterprises have sophisticated firewalls and Security Operation Centers (SOCs), the average mobile-first user is defenseless against social engineering.
The inspiration came from analyzing the "Digital Trust Deficit." Sophisticated phishing campaigns, predatory loan apps, and "urgency-based" scams are designed specifically to bypass human judgment, not just software firewalls. We wanted to build a Zero-Trust Security Layer for the common citizen—an automated system that applies forensic-level analysis to everyday communication without requiring the user to have technical expertise.
What it does
DesiCheck functions as an intelligent barrier between the user and potential cyber threats. It is a Progressive Web App (PWA) that provides three layers of defense:
- Multimodal Vector Analysis: Users can upload screenshots of questionable emails, WhatsApp messages, or SMS. The system leverages Gemini 1.5 Flash to perform OCR and visual anomaly detection simultaneously, identifying high-pressure tactics (e.g., "Your electricity will be cut") and fake branding elements.
- Heuristic Link Forensics: Instead of simple blacklist matching, the app analyzes URLs for entropy and typosquatting patterns (e.g., detecting sbi-kyc-update.com vs sbi.co.in) to flag malicious domains before a connection is established.
- Adversarial Response Generation (ARG): If a threat is confirmed, the system generates a context-aware, dialect-specific (Hinglish) counter-script. This allows the user to engage the attacker safely, wasting the scammer's time and resources while protecting their own identity.
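To illustrate the link-forensics idea, here is a minimal sketch rather than DesiCheck's actual code. It assumes a small hypothetical allow-list (`TRUSTED_DOMAINS`), an illustrative entropy cutoff (`ENTROPY_THRESHOLD`), and three simple signals: a trusted brand name embedded in an untrusted hostname, a hostname within two edits of a trusted domain, and an unusually random-looking hostname.

```javascript
// Hypothetical allow-list; a real deployment would need a much larger, maintained list.
const TRUSTED_DOMAINS = ["sbi.co.in", "hdfcbank.com", "irctc.co.in"];
// Assumed cutoff in bits per character; it would need tuning on real data.
const ENTROPY_THRESHOLD = 4.0;

// Shannon entropy in bits per character; random-looking strings score higher.
function shannonEntropy(text) {
  const counts = {};
  for (const ch of text) counts[ch] = (counts[ch] || 0) + 1;
  let entropy = 0;
  for (const n of Object.values(counts)) {
    const p = n / text.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Levenshtein edit distance, computed row by row.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the hostname, a risky flag, and the reasons behind the flag.
function analyzeUrl(rawUrl) {
  const host = new URL(rawUrl).hostname.toLowerCase().replace(/^www\./, "");
  const reasons = [];
  const isTrusted = TRUSTED_DOMAINS.some(
    (d) => host === d || host.endsWith("." + d)
  );
  if (!isTrusted) {
    const tokens = host.split(/[.-]/);
    for (const domain of TRUSTED_DOMAINS) {
      const brand = domain.split(".")[0];
      if (tokens.includes(brand)) {
        reasons.push(`embeds trusted brand "${brand}"`);
      } else if (levenshtein(host, domain) <= 2) {
        reasons.push(`near-miss of ${domain}`);
      }
    }
    if (shannonEntropy(host) > ENTROPY_THRESHOLD) {
      reasons.push("high-entropy hostname");
    }
  }
  return { host, risky: reasons.length > 0, reasons };
}
```

Under these assumptions, `analyzeUrl("https://sbi-kyc-update.com/login")` is flagged because the token `sbi` appears in a hostname that is not `sbi.co.in` or one of its subdomains, while `analyzeUrl("https://www.sbi.co.in/")` passes.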
How we built it
We adopted a Serverless-First architecture to ensure high availability and low operational overhead:
- Core Intelligence: We integrated the Google Gemini 1.5 Flash API. Its multimodal capabilities allowed us to replace multiple separate models (OCR + NLP + Image Classification) with a single, efficient inference call.
- Frontend Engineering: We built the client using Vanilla JavaScript and HTML5. We deliberately avoided heavy frontend frameworks to ensure the application remains lightweight (~150KB) and loads instantly on Tier-2/3 network conditions (3G/4G).
- Development Environment: The entire codebase was architected within Project IDX, utilizing its AI-assisted coding features to rapidly prototype the backend logic.
- Infrastructure: The application is containerized using Docker and deployed on Google Cloud Run. This ensures the platform is stateless and auto-scales based on demand, handling traffic spikes without manual intervention.
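The single-inference-call idea can be sketched as follows. This is an illustration, not our production code: the prompt wording, the `buildGeminiRequest` and `analyzeScreenshot` helper names, and the choice to call the public REST `generateContent` endpoint directly are all assumptions. In a real deployment the API key would sit behind a backend proxy rather than in client code.

```javascript
// Hypothetical system prompt; the real prompt was iterated on extensively.
const SYSTEM_PROMPT =
  "You are a fraud analyst for Indian users. Read the screenshot, extract its " +
  "text, and reply with a verdict (Safe or Risky) plus a one-line reason.";

// Builds one multimodal request body: an instruction, a text part, and the image.
function buildGeminiRequest(base64Image, mimeType) {
  return {
    system_instruction: { parts: [{ text: SYSTEM_PROMPT }] },
    contents: [
      {
        role: "user",
        parts: [
          { text: "Analyze this message for scam indicators." },
          { inline_data: { mime_type: mimeType, data: base64Image } },
        ],
      },
    ],
  };
}

// Sends the request and returns the model's text reply.
async function analyzeScreenshot(apiKey, base64Image, mimeType) {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    "gemini-1.5-flash:generateContent?key=" + encodeURIComponent(apiKey);
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGeminiRequest(base64Image, mimeType)),
  });
  if (!response.ok) throw new Error(`Gemini request failed: ${response.status}`);
  const result = await response.json();
  return result.candidates[0].content.parts[0].text;
}
```

Because the text instruction and the image travel in the same `parts` array, OCR, language understanding, and visual inspection happen in one round trip instead of three.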
Challenges we ran into
- Contextual Nuance in NLP: Standard models often miss the specific flavor of Indian cyber-fraud, which frequently mixes Hindi and English. We iterated repeatedly on our System Prompts to teach the AI to recognize terms like "KYC Update," "Lottery," and "Challan" in a colloquial context.
- Payload Optimization: Sending high-resolution screenshots to the API introduced latency. We engineered a client-side image compression pipeline that reduces image size by roughly 60% before transmission, significantly improving the "Time-to-Verdict."
- Containerization Hurdles: Migrating a static vanilla JS application to a dynamic Cloud Run environment required specific Docker configurations to handle port binding correctly, ensuring the health checks passed in a production environment.
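A client-side compression step along these lines can be sketched with the browser's canvas API. The `MAX_EDGE_PX` and `JPEG_QUALITY` values and both function names are illustrative assumptions, not the settings we shipped, and the actual size reduction depends on the source image.

```javascript
// Assumed limits; tune them against OCR accuracy on real screenshots.
const MAX_EDGE_PX = 1280;
const JPEG_QUALITY = 0.7;

// Scales dimensions down so the longest edge fits maxEdge; never upscales.
function fitWithin(width, height, maxEdge) {
  const scale = Math.min(1, maxEdge / Math.max(width, height));
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

// Browser-only: redraws the image on a smaller canvas and re-encodes it as JPEG.
async function compressScreenshot(file) {
  const bitmap = await createImageBitmap(file);
  const { width, height } = fitWithin(bitmap.width, bitmap.height, MAX_EDGE_PX);
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  canvas.getContext("2d").drawImage(bitmap, 0, 0, width, height);
  return new Promise((resolve, reject) => {
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error("Encoding failed"))),
      "image/jpeg",
      JPEG_QUALITY
    );
  });
}
```

With these assumed limits, a 1080x2400 phone screenshot is redrawn at 576x1280 before upload, and smaller images keep their original dimensions.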
Accomplishments that we're proud of
- Seamless Multimodality: We successfully consolidated text and visual analysis into a single, user-friendly workflow that feels instantaneous.
- Production-Grade Security: We deployed a fully SSL/TLS-encrypted web app on scalable public cloud infrastructure (Cloud Run) rather than only running it locally.
- Democratizing Forensics: We created a tool that translates complex security headers and metadata into a simple "Safe vs. Risky" verdict that anyone can understand.
What we learned
- The Power of System Prompting: We learned that the effectiveness of an AI Agent is almost entirely dependent on the quality of the "Persona" defined in the system instructions.
- UX is Security: If a security tool is difficult to use, it is useless. Optimizing for speed and simplicity is just as important as the detection algorithm itself.
- Cloud-Native Patterns: We gained a deeper understanding of Docker container lifecycles and the benefits of stateless architecture for handling unpredictable user loads.
What's next for DesiCheck - The AI Scam Shield
- Audio Interface Integration: We plan to implement Speech-to-Text pipelines to support users with low literacy, allowing them to simply "ask" the app if a message is safe.
- Vernacular Expansion: Extending our NLP capabilities to natively support regional languages like Bengali, Tamil, and Telugu to cover a wider demographic.
- Decentralized Threat Database: Building a community-driven repository where confirmed scams are hashed and stored, allowing for near-instant detection of viral scam campaigns across the user base.
Built With
- css3
- docker
- gemini-3-pro
- google-cloud
- google-cloud-run
- html5
- javascript
- project-idx