Inspiration : In modern web infrastructure, a simple HTTP 200 OK status is insufficient to guarantee application reliability. A server might be healthy, but the user interface could be completely broken due to a bad CSS deployment, or an SSL certificate might be expiring silently. The inspiration for this project was to step beyond standard CRUD applications and build an enterprise-grade Site Reliability Engineering (SRE) tool. We wanted to engineer a unified diagnostic dashboard that tracks not just backend uptime, but absolute end-to-end digital health—bridging the gap between server status and visual UI stability.

What it does : SiteSnap Pro is a full-stack site reliability microservice that acts as a multi-pronged diagnostic probe for any target web application.

Visual Regression Engine: Automatically captures high-fidelity screenshots of the rendered page to detect silent front-end rendering failures.

Security Probes: Performs live TCP and TLS handshakes to validate SSL/TLS X.509 certificates and flags certificates that are close to expiring.

Uptime Diagnostics: Tracks global network latency in real time using ICMP and HTTP probes.

Sustainability Tracking: Calculates the estimated carbon footprint of a web page's network data transfer.
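To make the certificate check concrete, here is a minimal sketch of an expiry probe using only Node.js's built-in `tls` module. It is an illustration, not SiteSnap Pro's actual code: the function names, the `warnDays` threshold, and the shape of the result object are all assumptions. `rejectUnauthorized: false` is set only so that an expired or otherwise invalid certificate can still be inspected and reported instead of aborting the handshake.

```javascript
const tls = require('node:tls');

// Pure helper: whole days between `now` and the certificate's expiry date.
// A negative result means the certificate has already expired.
function daysUntilExpiry(validTo, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Opens a TLS connection, reads the peer certificate, and reports its expiry.
// `warnDays` is a hypothetical alert threshold, not a value from the project.
function probeCertificate(host, { port = 443, timeoutMs = 5000, warnDays = 14 } = {}) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect(
      // rejectUnauthorized: false lets the probe inspect invalid certificates;
      // the validation outcome is still reported through socket.authorized.
      { host, port, servername: host, rejectUnauthorized: false },
      () => {
        const cert = socket.getPeerCertificate();
        const daysLeft = daysUntilExpiry(cert.valid_to);
        resolve({
          host,
          authorized: socket.authorized,
          authorizationError: socket.authorizationError || null,
          validTo: cert.valid_to,
          daysLeft,
          expiringSoon: daysLeft <= warnDays,
        });
        socket.end();
      }
    );
    // Fail the probe if the connection stays idle for too long.
    socket.setTimeout(timeoutMs, () => {
      socket.destroy(new Error(`TLS probe to ${host} timed out`));
    });
    socket.once('error', reject);
  });
}

// Usage (performs a real network request, so it is left commented out):
// probeCertificate('example.com').then(console.log, console.error);
```

Keeping the date arithmetic in a separate pure function means the expiry logic can be unit-tested without opening a socket.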

How we built it : The platform was built using a distributed microservice architecture to separate lightweight routing from heavy background computation.

Client Tier: A responsive React.js SPA styled with Tailwind CSS and deployed on Vercel.

API Gateway & Security: A Node.js/Express backend protected by Google OAuth 2.0, JWT validation, and strict rate limiting.

Diagnostic Engines: Instead of relying entirely on heavy third-party packages, we used Node.js native modules (tls.connect(), net.Socket) to build the TCP and SSL probes from scratch.

CO2 Analytics: We implemented sustainable software engineering metrics by calculating the carbon footprint of network data transfer. The mathematical model applied is:

$$CO_2 = \text{Data}_{GB} \times E_{intensity} \times C_{intensity}$$

(Where $\text{Data}_{GB}$ is the amount of data transferred in gigabytes, $E_{intensity}$ represents the energy required per gigabyte transferred, in $kWh/GB$, and $C_{intensity}$ is the grid carbon intensity measured in $gCO_2/kWh$. The result is in grams of $CO_2$.)
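The carbon formula translates directly into a few lines of JavaScript. The sketch below is a worked example under stated assumptions: the function name is hypothetical, a gigabyte is taken as 10^9 bytes, and the two intensity constants are illustrative placeholder values rather than the ones SiteSnap Pro uses.

```javascript
const BYTES_PER_GB = 1e9; // assumption: decimal gigabytes

// CO2 (grams) = Data_GB * E_intensity (kWh/GB) * C_intensity (gCO2/kWh)
function estimateCo2Grams(bytesTransferred, energyKwhPerGb, carbonGramsPerKwh) {
  if (bytesTransferred < 0) {
    throw new RangeError('bytesTransferred must not be negative');
  }
  const dataGb = bytesTransferred / BYTES_PER_GB;
  return dataGb * energyKwhPerGb * carbonGramsPerKwh;
}

// Illustrative placeholder inputs (assumptions, not project data):
const ASSUMED_ENERGY_KWH_PER_GB = 0.81;
const ASSUMED_CARBON_G_PER_KWH = 442;
const pageBytes = 2.5e6; // a hypothetical 2.5 MB page load

const grams = estimateCo2Grams(
  pageBytes,
  ASSUMED_ENERGY_KWH_PER_GB,
  ASSUMED_CARBON_G_PER_KWH
);
console.log(`Estimated footprint: ${grams.toFixed(3)} g CO2 per page load`);
```

With these placeholder inputs, a 2.5 MB page works out to roughly 0.9 g of CO2 per load. The estimate scales linearly with each factor, so halving the page weight halves the result.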

Challenges we ran into : The most significant infrastructure bottleneck occurred within the visual regression engine. Initially, running full headless Chromium instances directly on the Node.js backend consumed over 200MB of RAM per request. On a cloud server with a strict 512MB memory limit, three concurrent requests were enough to exceed the limit on their own, so the process was killed by Out-Of-Memory (OOM) crashes almost immediately. To solve this memory starvation, we had to strip the heavy browser binaries out of the monolith. We pivoted to a distributed approach by adopting puppeteer-core, which ships without a bundled browser, and establishing a bi-directional WebSocket (wss://) tunnel to a remote cloud provider (Browserless.io).

```javascript
// Example of the WebSocket tunnel implementation preventing local OOM crashes
// (runs inside an async function; API_KEY holds the Browserless.io token)
const puppeteer = require('puppeteer-core');

const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://chrome.browserless.io?token=${API_KEY}`,
});
```
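For context, here is a hedged sketch of what a full screenshot request over such a tunnel might look like, including the cleanup step that releases the remote browser session. The helper names, viewport size, and timeout are assumptions for illustration. `puppeteer-core` is required lazily inside the function so that the module still loads in environments where the package is not installed.

```javascript
// Builds the remote browser endpoint. The default host is the one named in
// the write-up; the token is URL-encoded in case it has reserved characters.
function buildEndpoint(token, host = 'chrome.browserless.io') {
  return `wss://${host}?token=${encodeURIComponent(token)}`;
}

// Connects to the remote browser, captures a full-page PNG, and always
// closes the remote session, even when navigation fails.
async function captureScreenshot(url, token) {
  const puppeteer = require('puppeteer-core'); // lazy require (assumption)
  const browser = await puppeteer.connect({
    browserWSEndpoint: buildEndpoint(token),
  });
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: 1280, height: 800 }); // assumed viewport
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
    // Resolves with the binary image data for the rendered page.
    return await page.screenshot({ type: 'png', fullPage: true });
  } finally {
    await browser.close();
  }
}

// Usage (needs puppeteer-core and a valid token, so it is commented out):
// captureScreenshot('https://example.com', process.env.API_KEY)
//   .then((png) => console.log(`Captured ${png.length} bytes`));
```

The `try`/`finally` block matters here: without it, a failed navigation would leave the remote session open until the provider times it out.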

Accomplishments that we're proud of : By offloading the heavy DOM rendering through our WebSocket architecture, we reduced the backend server's RAM footprint by over 80%. We are incredibly proud that we took a server that was crashing on every single visual diagnostic request and engineered it into a horizontally scalable microservice that no longer runs out of memory on those requests.

What we learned : Building SiteSnap Pro was a masterclass in backend system architecture. We learned the critical performance differences between standard REST APIs and persistent WebSocket connections. We gained a much deeper understanding of the OSI model—specifically navigating between Application Layer HTTP requests and Transport Layer TCP handshakes. Most importantly, we learned how to manage strict server memory limits and profile Node.js buffer streams when handling raw binary image data.

What's next for SiteSnap Pro : The immediate next step for scaling the architecture is implementing a dedicated message queue (such as RabbitMQ, or BullMQ backed by Redis). Currently, the API processes each diagnostic request synchronously, within the HTTP request that triggered it. By introducing a worker-node pool and an asynchronous job queue, SiteSnap Pro will be able to accept thousands of concurrent diagnostic requests without blocking the Node.js event loop that Express runs on.
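The idea behind the planned queue can be sketched with a small in-memory stand-in. This is not RabbitMQ or BullMQ, and it is not the project's implementation: the `JobQueue` class and its method names are hypothetical, and a real deployment would use a broker so that jobs survive restarts and can be picked up by separate worker processes. The sketch only shows the core behaviour, which is accepting jobs immediately while capping how many run at once.

```javascript
// Minimal in-memory job queue with bounded concurrency (illustrative only).
class JobQueue {
  constructor(worker, concurrency = 2) {
    this.worker = worker;           // async function that processes one payload
    this.concurrency = concurrency; // maximum number of jobs running at once
    this.pending = [];              // jobs waiting for a free slot
    this.active = 0;                // jobs currently running
  }

  // Accepts a job right away and returns a promise for its eventual result.
  enqueue(payload) {
    return new Promise((resolve, reject) => {
      this.pending.push({ payload, resolve, reject });
      this._drain();
    });
  }

  // Starts pending jobs until the concurrency limit is reached.
  _drain() {
    while (this.active < this.concurrency && this.pending.length > 0) {
      const job = this.pending.shift();
      this.active += 1;
      Promise.resolve()
        .then(() => this.worker(job.payload))
        .then(job.resolve, job.reject)
        .finally(() => {
          this.active -= 1;
          this._drain(); // a slot freed up, so start the next job if any
        });
    }
  }
}

// Usage: five hypothetical diagnostic jobs, at most two running at a time.
const demoQueue = new JobQueue(
  (target) => new Promise((done) => setTimeout(() => done(`checked ${target}`), 10)),
  2
);
const demoResults = ['a', 'b', 'c', 'd', 'e'].map((t) => demoQueue.enqueue(t));
Promise.all(demoResults).then((results) => console.log(results));
```

In an Express handler, the equivalent pattern would be to enqueue the job, respond right away with a job identifier, and let the client fetch the result later, so the request never waits on the diagnostic itself.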
