MonitShark

[GIFs: Dashboard live metrics | Audit + chat | Agent answer streaming]

## Inspiration

Three real problems converge for anyone running their own Linux server:

- The expertise gap. Linux server hardening, package patching, firewall management, and cron orchestration are things every team needs, but only senior engineers do them confidently. Misconfiguration is consistently cited as a leading root cause of breaches at this tier.
- Tool fragmentation. Existing admin tools each cover one slice (state inspection, metrics, container logs, firewall management) and they don't talk to each other. Sysadmins context-switch across half a dozen tools to do a single job.
- AI agents that talk vs. agents that act. Most "AI assistants" describe what an admin should do. Almost none can do it safely on a real host with auditability and consent.

I wanted one console that addresses all three.

## What it does

MonitShark is a self-hosted web app you run with one Docker command. It gives you:

- Live dashboard: CPU, memory, disk, and network streaming over WebSockets, top processes, open alerts
- System page: per-core CPU, per-disk I/O, per-NIC throughput, sensors (temps, fans, battery), kernel modules, listening ports with PID
- Services: every systemd unit, start / stop / restart with confirmation
- Docker: containers grouped by Compose project, live log streaming over WebSocket, lifecycle actions
- Cron: per-user tabs + system crontab, full CRUD + run-now
- Scripts: bash editor under /opt/cockpit/scripts/, run with timeout, install as systemd service, schedule via cron
- Audit: 4 security audits (SSH, users, permissions, packages) with ~20 checks ranked by severity, one-click fix
- Firewall: host firewall rules, add / delete / enable / disable with action, port, protocol, source filters, comments
- Updates: distro-aware (apt/dnf), with security-only updates listed separately
- Permissions: file browser scoped to /etc /opt /var/log /home /root, chmod + chown
- Logs: tail any file under /var/log, regex search, "Ask the agent to analyze" handoff to chat
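The path scoping used by the file browser can be illustrated with a small allowlist check. This is only a sketch: the root directories come from the Permissions item above, but the helper name `is_allowed` and the purely lexical normalization are assumptions, not MonitShark's actual code. A real check would also need to resolve symlinks before comparing.

```python
import posixpath
from pathlib import PurePosixPath

# Roots copied from the Permissions item; everything else here is hypothetical.
ALLOWED_ROOTS = ("/etc", "/opt", "/var/log", "/home", "/root")


def is_allowed(path: str) -> bool:
    """Return True if the lexically normalized path lies inside an allowed root."""
    if not path.startswith("/"):
        return False  # reject relative paths outright
    # normpath collapses "." and ".." segments without touching the filesystem.
    candidate = PurePosixPath(posixpath.normpath(path))
    for root in ALLOWED_ROOTS:
        root_path = PurePosixPath(root)
        # Accept the root itself or anything that has the root as an ancestor.
        if candidate == root_path or root_path in candidate.parents:
            return True
    return False
```

Comparing path components rather than string prefixes matters here: a plain `startswith("/root")` would also admit `/rootkit`.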

The differentiator is a chat agent (Groq + LangGraph) with 51 tools, 21 of them gated by an explicit confirmation card. The agent calls the same Python modules the REST API uses. When it wants to do something destructive, the LangGraph state machine pauses on langgraph.types.interrupt(), surfaces a confirmation card to the React drawer, and resumes only after the user clicks Allow. Confirmation lives in the graph topology, not in a prompt the LLM could ignore.
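The pause-then-resume shape of that gate can be shown without LangGraph at all. The sketch below is a standard-library analogue built on a generator, not the project's code and not the LangGraph API: `gated_call`, `drive`, and `fake_restart` are hypothetical names. The point it illustrates is that the destructive action sits after the pause in the control flow, so it cannot run until something outside the generator sends an approval.

```python
from typing import Callable, Generator, List


def gated_call(tool_name: str, action: Callable[[], str]) -> Generator[dict, bool, str]:
    """Pause before a destructive action; run it only if resumed with True."""
    # Execution stops here until the caller sends the user's decision.
    approved = yield {"type": "confirm", "tool": tool_name}
    if not approved:
        return f"{tool_name}: denied by user"
    return action()


def drive(gen: Generator[dict, bool, str], answer: bool) -> str:
    """Play the role of the UI: show the request, then resume with the answer."""
    request = next(gen)  # runs up to the yield; the action has not executed yet
    assert request["type"] == "confirm"
    try:
        gen.send(answer)
    except StopIteration as stop:
        return stop.value  # the generator's return value
    raise RuntimeError("expected the gated call to finish after one confirmation")


# Stand-in for a destructive tool, recording whether it actually ran.
calls: List[str] = []


def fake_restart() -> str:
    calls.append("restart")
    return "nginx restarted"
```

Calling `drive(gated_call("restart_service", fake_restart), False)` leaves `calls` empty, while passing `True` runs the action exactly once.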

## How we built it

- Backend: Python 3.11, FastAPI, uvicorn, LangGraph 0.2, langchain-groq, psutil, pystemd, python-crontab, aiosqlite, PyJWT, passlib[bcrypt], docker SDK, distro
- Frontend: React 18, Vite, TypeScript (strict), Tailwind CSS, shadcn-style primitives, TanStack Query, Recharts, axios, react-markdown, sonner
- Reverse proxy / TLS: Caddy 2.8 with tls internal (local CA, self-signed)
- Distribution: docker-compose, single host

The backend container runs with --privileged and --pid=host, and with the host root filesystem bind-mounted read-write at /host (/:/host:rw). Mutating commands use nsenter --target 1 to execute in the host's namespaces; reads go through the bind-mount.
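As a rough illustration of the nsenter step, a mutating command can be wrapped in an argv list before it is handed to the subprocess gate. The helper name `host_argv` is hypothetical, and the set of namespaces joined here (mount, UTS, IPC, network, PID) is an assumption; the write-up only states that `nsenter --target 1` is used.

```python
from typing import List, Sequence


def host_argv(command: Sequence[str]) -> List[str]:
    """Prefix a command so it runs in the namespaces of PID 1 on the host."""
    if not command:
        raise ValueError("command must not be empty")
    return [
        "nsenter",
        "--target", "1",  # PID 1 is the host's init because of --pid=host
        "--mount", "--uts", "--ipc", "--net", "--pid",  # assumed namespace set
        "--",  # end of nsenter options; everything after is the command
        *command,
    ]
```

Building a list instead of a shell string keeps arguments from being re-parsed by a shell, which fits the input-validation approach described in this section.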

Safety invariants:

- One subprocess gate (app/util/sh.run): a CI test fails if any other module imports subprocess
- Path allowlists for log files, scripts, file browser, cron paths
- Pydantic + regex validation on every tool input before it reaches the host
- All 21 destructive tools run through the confirmation gate
- JWT auth (HS256, users.yml + bcrypt) on every REST + WebSocket endpoint
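The first invariant lends itself to a short static check. The following is a minimal sketch of how such a test might look, assuming the gate lives in a file named `app/util/sh.py`; the function names are hypothetical and the project's real test may differ. It parses each module with `ast` and reports any file other than the gate that imports `subprocess`.

```python
import ast
from typing import Dict, List

# Assumed file path for the app.util.sh module mentioned above.
ALLOWED_MODULE = "app/util/sh.py"


def imports_subprocess(source: str) -> bool:
    """Return True if the source contains `import subprocess` or `from subprocess import ...`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == "subprocess" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means an absolute import, not `from .subprocess import x`.
            if node.level == 0 and (node.module or "").split(".")[0] == "subprocess":
                return True
    return False


def offenders(sources: Dict[str, str]) -> List[str]:
    """Given {path: source}, list the files that break the single-gate rule."""
    return sorted(
        path
        for path, source in sources.items()
        if path != ALLOWED_MODULE and imports_subprocess(source)
    )
```

In a real test the `sources` mapping would be filled by reading every `.py` file under the backend package, and the test would assert that `offenders(...)` is empty. A check like this does not catch dynamic imports or `os.system`, so it complements input validation rather than replacing it.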

## Challenges we ran into

- LLM tool-call format quirks. Some completions emit malformed function-call syntax. We worked around this with retries and a graceful fallback to a friendly user-facing error.
- Free-tier rate limits. 51 bound tools cost ~6-8k tokens per request. We switched the default model to llama-3.3-70b-versatile (higher TPM ceiling) and added a 2.5s throttle between outgoing requests.
- WebSocket interrupt resumption. Getting the LangGraph interrupt() payload to surface as a React confirmation card, and the user's response to resume the graph, required handling astream(stream_mode="updates") and the special __interrupt__ chunk correctly.
- Self-signed cert + Docker networking. Caddy on the bridge network couldn't reach a backend on host networking. We moved the backend to the bridge network with expose: 8000.
- Bcrypt $ escaping. Compose interprets $ as variable expansion, so admin password hashes need $$ doubling, which is documented loudly in .env.example.
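The 2.5s throttle from the rate-limit item can be sketched as a small helper that enforces a minimum gap between outgoing requests. This is an illustrative version, not the project's implementation: the class name `Throttle` and the injectable `clock` and `sleep` parameters are assumptions, added so the behaviour can be exercised without actually waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float = 2.5,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request was released

    def wait(self) -> float:
        """Sleep until min_interval has passed since the last release; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` immediately before each LLM request spaces requests at least `min_interval` seconds apart; an async backend would need an `asyncio.sleep` based variant instead.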

## Accomplishments that we're proud of

- The confirmation gate works end-to-end: agent proposes → user clicks → host changes. Verified live on Ubuntu 24.04.
- 51 tools, 11 management surfaces, 200 source files, all in one cohesive build with consistent design tokens (HSL Tailwind variables, light/dark themes, amber accent).
- Genuine cross-distro support: apt/dnf detection, nsenter for host-namespace execution.
- Real safety architecture, not security theatre: the single subprocess gate enforced by a unit test is the kind of invariant production codebases should have.
- One-command bootstrap: ./start.sh generates a JWT secret, creates config/ from config.example/, and brings up the entire stack.

## What we learned

- LangGraph's interrupt() is a remarkably elegant primitive for human-in-the-loop AI. Putting consent in the graph topology (not in a prompt or a tool wrapper) means the LLM cannot bypass it by ignoring an instruction.
- Aggressive path allowlists plus a single subprocess gate (with a CI test enforcing it) catch more issues than dozens of ad-hoc validations scattered across modules.
- For a chat agent running on a rate-limited LLM, every tool passed to bind_tools() adds a fixed token cost to each request. Trimming tool docstrings is the cheapest win.
- Self-hosted privileged tools shouldn't be hosted publicly, even for a demo. A recorded video is the right "live link" for this category.

## What's next for MonitShark

- Comprehensive audit expansion: kernel sysctls, mount options, firewall posture, failed-login bursts, CIS benchmark mapping
- Multi-host fleet management: run a MonitShark hub that coordinates agents across N machines
- Open-ended audit mode: let the LLM plan checks dynamically rather than calling a fixed audit set
- Code-splitting the frontend bundle: the current 1.2 MB of JS could be halved with route-level splitting
- Custom dashboard layouts: drag-resize panels, saved views per user
Custom dashboard layouts — drag-resize panels, saved views per user

## Built With

caddy
docker
docker-compose
fastapi
groq
javascript
langchain
langgraph
linux
node.js
playwright
psutil
python
react
recharts
shadcn
sqlite
tailwindcss
typescript
uvicorn
vite

Updates

Spyros-Panagiotis Lefkaditis started this project — May 01, 2026 08:27 PM EDT
