Inspiration

The inspiration for this project came from the need for a straightforward, real-time server monitoring solution. While many powerful tools exist, they can often be complex to set up or costly for personal projects. We wanted to create a lightweight, open-source platform that provides essential metrics and log viewing at a glance, built on a modern, high-performance technology stack. This project also served as a deep dive into full-stack development, real-time data streaming with WebSockets, and end-to-end automated deployment on a cloud platform.

What it does

The Server Metrics and Log Monitoring Platform is a comprehensive tool designed to give developers and system administrators real-time insight into their server's health and activity.

Real-Time Metrics Dashboard: A clean, intuitive web interface displays live data for CPU usage, memory consumption, and disk space utilization.
Live Log Streaming: The platform captures and streams server logs directly to the dashboard, allowing for immediate analysis and debugging.
Secure User Authentication: The dashboard is protected by a robust authentication system using JWT, ensuring that only authorized users can view server data.
Centralized Monitoring: Provides a single pane of glass to observe critical server statistics without needing to SSH into the machine.

How we built it

This project is a full-stack application composed of a distinct backend service and a frontend dashboard.

Backend (FastAPI & Python):

We chose FastAPI for its high performance, asynchronous capabilities, and automatic API documentation.
WebSockets are used for pushing live metrics and log data from the server to connected dashboard clients in real time.
PostgreSQL serves as the database for storing user credentials and application data, managed via Cloud SQL.
SQLAlchemy is our ORM for interacting with the database, with Alembic handling all database schema migrations to ensure the schema is always in sync with the application code.

Frontend (React & TypeScript):

The dashboard is built with React and Vite, providing a fast and modern development experience.
TypeScript is used throughout the frontend for enhanced code quality and type safety.
Tailwind CSS allowed for rapid development of a clean and responsive user interface.
The application state, including authentication, is managed using React's Context API.

Deployment & DevOps (Google Cloud Platform):

Both the backend and frontend are containerized using Docker.
Cloud Build provides a complete CI/CD pipeline, automatically building, testing, and deploying the applications whenever code is updated.
Cloud Run hosts the containerized applications, providing a scalable, serverless environment.
Artifact Registry securely stores our built Docker images.
Google Secret Manager is used to securely store sensitive information like database passwords and JWT secret keys.

Challenges we ran into

The most significant challenge was architecting a robust and secure CI/CD pipeline on Google Cloud.

Database Migrations in CI/CD: Establishing a secure connection from the ephemeral Cloud Build environment to the Cloud SQL database for running Alembic migrations was complex. This required correctly configuring the Cloud SQL Auth Proxy (exec-wrapper), managing IAM permissions for the Cloud Build service account, and debugging a series of connection errors (Connection refused, 403 notAuthorized).
Environment Variable Management: Differentiating between how secrets are handled during the temporary build/migration step (raw password strings) versus the final Cloud Run deployment (Secret Manager resource names) was a major learning point.
Password Special Characters: We discovered that special characters (like @) in database passwords can break connection strings, because @ is also the delimiter between the credentials and the host. The solution was to URL-encode the password within our database connection logic, making the system more resilient.

Accomplishments that we're proud of

Fully Automated Deployment: We successfully built a zero-touch CI/CD pipeline. A git push is all that's needed to build, migrate the database, and deploy a new version of the backend to production.
End-to-End Real-Time System: We created a seamless, real-time data flow from the server, through the FastAPI WebSocket backend, to live rendering on the React frontend.
Secure and Scalable Architecture: By leveraging Cloud Run, Cloud SQL, and Secret Manager, we built a platform that is secure and can scale automatically with demand.

What we learned

This project was a tremendous learning experience in modern cloud-native application development. We gained practical knowledge in containerization with Docker, building automated CI/CD pipelines with Cloud Build, and deploying serverless applications on Cloud Run. Most importantly, we learned the intricacies of cloud security, from managing IAM roles to securely handling secrets and connecting services within the GCP ecosystem.
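
The password special characters fix described under the challenges above could look roughly like the following. This is a minimal sketch, not the project's actual connection logic: the function name `build_database_url`, the sample credentials, and the choice of `urllib.parse.quote` with `safe=""` are assumptions made for illustration.

```python
from urllib.parse import quote


def build_database_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a PostgreSQL URL with percent-encoded credentials.

    Hypothetical helper: encoding the user and password keeps characters
    such as '@', ':' and '/' from being read as URL delimiters.
    """
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql://{safe_user}:{safe_password}@{host}:{port}/{database}"


# Sample values only; real credentials would come from the environment
# or a secret store rather than being hard-coded.
url = build_database_url("monitor", "p@ss:w/rd", "127.0.0.1", 5432, "metrics")
print(url)  # postgresql://monitor:p%40ss%3Aw%2Frd@127.0.0.1:5432/metrics
```

Encoding each credential separately, before the URL is assembled, means the only literal @ left in the string is the one that separates the credentials from the host.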

What's next for Server Metrics and Log Monitoring Platform

Historical Data & Charting: Implement functionality to query and visualize metrics over specific time periods (e.g., last hour, last 24 hours).
Advanced Alerting: Enhance the alert system to allow users to define custom thresholds (e.g., "alert me if CPU usage is over 90% for 5 minutes") and configure notification channels like email or Slack.
Multi-Server Support: Refactor the platform to allow a single user to register and monitor multiple servers from one dashboard.
Log Searching & Filtering: Add powerful search and filter capabilities to the log viewer to make debugging easier.
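
As an illustration of the custom threshold idea above ("alert me if CPU usage is over 90% for 5 minutes"), one possible shape for such a rule is sketched below. Nothing here exists in the platform yet: the class name `SustainedThresholdRule`, its fields, and the timestamp-in-seconds convention are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThresholdRule:
    """Hypothetical rule: fire once a metric stays above a threshold long enough."""

    threshold: float        # e.g. 90.0 for 90% CPU usage
    window_seconds: float   # e.g. 300 for 5 minutes
    _breach_started: Optional[float] = None  # timestamp of the first sample over threshold

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True if the breach has lasted the full window."""
        if value <= self.threshold:
            # Any sample at or below the threshold resets the breach timer.
            self._breach_started = None
            return False
        if self._breach_started is None:
            self._breach_started = timestamp
        return timestamp - self._breach_started >= self.window_seconds


rule = SustainedThresholdRule(threshold=90.0, window_seconds=300)
print(rule.observe(0, 95.0))    # False: breach has only just started
print(rule.observe(300, 97.0))  # True: above 90% for the full 5 minutes
```

Tracking only the start of the current breach keeps the rule constant-memory; a notification channel such as email or Slack would be triggered by the caller when `observe` returns True.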
