CyberGuard URL Scanner

💡 Inspiration

As digital communication expands, cybercriminals keep evolving the ways they deceive users through phishing, credential harvesting, and malware distribution campaigns. We realized that traditional "black-box" security tools that simply label a link as "safe" or "unsafe" are no longer enough. End users and security analysts need transparent context: they need to know exactly why a link is dangerous. That inspired us to build CyberGuard to democratize threat intelligence, combining multiple detection vectors into a single, clean, and educational interface that flags a dangerous link before a user ever clicks it.

⚙️ How we built it

CyberGuard was engineered with a modern, fully decoupled architecture to ensure high performance and seamless deployment:

The Backend Engine: Built in Python on the Flask framework, our REST API (/api/scan) acts as the brain. It runs a custom heuristic engine that analyzes the lexical structure of the URL, cross-references offline blocklists (such as URLhaus), and performs live HTTP probes, hardened against Server-Side Request Forgery (SSRF), to detect malicious redirects and suspicious server headers.
Dynamic Threat Injection: The backend can pull in external APIs on the fly, querying VirusTotal for crowd-sourced historical reputation data and using an OpenAI model for context-aware behavioral threat modeling.
The Frontend: The UI was designed for aesthetics and speed. Hosted on Netlify, it uses plain HTML, CSS, and vanilla JavaScript with the fetch() API to build a fast, dynamic application without the overhead of a heavy framework or a Webpack-style build pipeline.
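
The offline blocklist lookup mentioned for the backend can be sketched roughly as follows. This is a minimal illustration, not CyberGuard's actual code: the function names `load_blocked_hosts` and `is_blocklisted` are hypothetical, and the input format (one URL per line, with `#` comment lines) is an assumption about how a plain-text dump such as URLhaus's might be stored locally.

```python
from urllib.parse import urlsplit


def load_blocked_hosts(lines):
    """Build a set of lowercase hostnames from blocklist lines.

    Assumed format: one URL per line; blank lines and lines starting
    with '#' are ignored.
    """
    hosts = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            host = urlsplit(line).hostname
        except ValueError:
            continue  # skip malformed entries instead of failing the load
        if host:
            hosts.add(host.lower())
    return hosts


def is_blocklisted(url, blocked_hosts):
    """Return True if the URL's hostname appears in the blocklist set."""
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return False
    return bool(host) and host.lower() in blocked_hosts


# Hypothetical sample data standing in for a real blocklist file.
sample = [
    "# example blocklist",
    "",
    "http://bad.example/payload.exe",
    "https://Evil.example:8080/x",
]
blocked = load_blocked_hosts(sample)
print(is_blocklisted("http://BAD.example/other", blocked))
```

Keeping the hostnames in a set makes each lookup constant-time on average, which matters when the list holds many entries. Matching on hostname alone is a simplification; a real scanner might also match full URLs or parent domains.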

The Risk Scoring Algorithm (Math Model)

To aggregate our findings into an actionable intelligence metric, we modeled a custom Risk Score ($R$). The core engine evaluates $n$ distinct lexical and behavioral features, assigning a weighted severity ($w_i$) to each triggered finding ($x_i \in \{0, 1\}$).

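To keep the score from growing without bound as findings accumulate, and to cap it on an intuitive 100-point scale, the base Risk Index is calculated with a saturating exponential, where $\lambda > 0$ is a tuning constant that controls how quickly the score approaches 100: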

$$ R_{base} = 100 \times \left( 1 - e^{-\lambda \sum_{i=1}^{n} w_i x_i} \right) $$

If third-party threat intelligence (such as VirusTotal or AI modeling) flags the domain as explicitly malicious ($T_{flag} \in \{0, 1\}$), the score is raised to at least 90:

$$ R_{final} = \max \Big( R_{base}, \ 90 \times T_{flag} \Big) $$

This ensures that confirmed threats override heuristic subtleties, while heuristic indicators alone can still flag novel, previously unseen malicious domains.
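
The two formulas above translate almost directly into code. The sketch below is illustrative only: the function name `risk_score`, the example weights, and the default $\lambda = 0.05$ are assumptions chosen for the demonstration, not values taken from CyberGuard.

```python
import math


def risk_score(weights, triggered, lam=0.05, threat_flag=False):
    """Compute R_final on a 0-100 scale.

    weights     -- severity w_i for each feature
    triggered   -- x_i in {0, 1}, whether each feature fired
    lam         -- assumed tuning constant lambda (> 0)
    threat_flag -- T_flag, True if third-party intelligence flags the URL
    """
    if len(weights) != len(triggered):
        raise ValueError("weights and triggered must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, triggered))
    # R_base = 100 * (1 - exp(-lambda * sum(w_i * x_i)))
    r_base = 100.0 * (1.0 - math.exp(-lam * weighted_sum))
    # R_final = max(R_base, 90 * T_flag)
    return max(r_base, 90.0 if threat_flag else 0.0)


# Hypothetical weights for three findings.
weights = [10, 20, 30]
print(round(risk_score(weights, [1, 0, 1]), 2))                    # heuristics only
print(round(risk_score(weights, [1, 0, 1], threat_flag=True), 2))  # flagged by a third party
```

With these assumed numbers, two triggered findings give a weighted sum of 40 and a base score of about 86.47; the third-party flag lifts that to 90. Once heuristics alone exceed 90, the flag no longer changes the result.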

🧱 Challenges we ran into

Transitioning from a tightly coupled monolithic architecture (using server-side-rendered Jinja templates) to a decoupled API model was a significant hurdle. We had to rewrite the entire presentation layer, moving it from Jinja template logic in Python to asynchronous JavaScript DOM manipulation.

We also ran into CORS (Cross-Origin Resource Sharing) errors when connecting our Netlify-hosted static frontend to our Render-hosted Flask backend. Adding the flask-cors middleware and making sure our Fetch API requests carried the correct JSON headers was a valuable lesson in web security. Finally, designing the "Active HTTP Probe" required careful engineering to prevent SSRF vulnerabilities, so that an attacker couldn't use our own scanner to reach internal network devices.
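
One common way to guard a live probe against SSRF is to resolve the target hostname first and refuse to connect unless every resolved address is publicly routable. The sketch below shows that idea with Python's standard library. The function names `resolve_host` and `is_probe_allowed` are hypothetical, and this is not necessarily how CyberGuard's probe is implemented.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_host(host):
    """Resolve a hostname to the set of IP address strings it maps to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_probe_allowed(url, resolver=resolve_host):
    """Return True only for http(s) URLs whose host resolves solely to
    globally routable addresses (no loopback, private, or link-local)."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:
        return False  # DNS failure: nothing safe to probe
    if not addresses:
        return False
    for addr in addresses:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False
        if not ip.is_global:
            return False  # one internal address is enough to reject
    return True


# A fake resolver keeps the example deterministic and offline.
print(is_probe_allowed("http://intranet.example/", resolver=lambda h: {"10.0.0.5"}))
```

A check like this is only part of the defense. The probe should also connect to the address that was validated (to limit DNS rebinding), and every redirect target should pass through the same check before it is followed.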

🧠 What we learned

We gained real insight into the anatomy of phishing URLs, specifically brand typosquatting, extreme subdomain stacking, and deceptive keywords in domain names and paths (e.g., update-billing-secure.com).
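
The three patterns above can be approximated with simple lexical checks. The following is a rough sketch under stated assumptions: the brand list, keyword list, label threshold, and similarity cutoff are all hypothetical values picked for illustration, and `lexical_findings` is not CyberGuard's actual heuristic engine.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

# Hypothetical reference data for the illustration.
KNOWN_BRANDS = ("paypal", "google", "microsoft")
SUSPICIOUS_KEYWORDS = ("login", "secure", "update", "billing", "verify")


def looks_like_brand(label, similarity):
    """True if a host label is close to, but not equal to, a known brand."""
    return any(
        label != brand
        and SequenceMatcher(None, label, brand).ratio() >= similarity
        for brand in KNOWN_BRANDS
    )


def lexical_findings(url, max_labels=4, similarity=0.8):
    """Return a list of lexical warning tags for a URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    labels = [label for label in host.split(".") if label]
    findings = []
    # Subdomain stacking: unusually many dot-separated labels.
    if len(labels) > max_labels:
        findings.append("subdomain_stacking")
    # Deceptive keywords anywhere in the host or path.
    text = host + parts.path.lower()
    if any(keyword in text for keyword in SUSPICIOUS_KEYWORDS):
        findings.append("deceptive_keyword")
    # Typosquatting: a label that nearly matches a brand name.
    if any(looks_like_brand(label, similarity) for label in labels):
        findings.append("typosquatting")
    return findings


print(lexical_findings("http://update-billing-secure.com/"))
print(lexical_findings("http://paypa1.com/"))
```

String similarity is a blunt instrument: it misses homoglyph tricks and can flag legitimate names that happen to resemble a brand, so findings like these work best as weighted inputs to a score rather than as verdicts on their own.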

From a software engineering perspective, we learned the immense value of decoupling the frontend from the backend. By separating them, we unlocked the ability to scale our Python analytical engine independently from our UI, paving the way to potentially release our API as a standalone B2B integration in the future.

Updates

Sujay G started this project — Mar 28, 2026 03:51 AM EDT
