Inspiration
The inspiration came from the daily frustration of network engineers facing "Zombie Nodes": devices that appear "Green" (Online) on traditional monitors but are "Dead" to the end user. We observed that the disconnect between the status of the physical infrastructure (ISPs) and the application layer creates a diagnostic vacuum that costs thousands in operational losses and lengthens mean time to repair (MTTR). We wanted to create a "watchdog" that not only checks whether the radio is on, but also understands whether the service is actually being delivered.
What it does
Picket Officer NOC is a closed-loop AIOps (Artificial Intelligence for IT Operations) platform. It monitors raw telemetry collected via SNMP/ICMP through Zabbix and correlates it with application health in real time. The system uses a Cognitive AI Agent (Gemini 2.5 Flash) to reason about failures, automatically differentiating network problems from software errors. Beyond diagnosis, it assesses risk levels and executes Self-Healing protocols via DevNet, requesting human approval only in high-risk or unprecedented scenarios.
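To make the "Zombie Node" idea concrete, here is a minimal sketch of the correlation step. It is not the project's actual code: the `NodeSnapshot` type, its fields, and the `classify` function are hypothetical names chosen for illustration, and the real system hands this reasoning to the AI Agent rather than to fixed rules.

```python
from dataclasses import dataclass


@dataclass
class NodeSnapshot:
    """Hypothetical view of one device at one polling instant."""
    icmp_reachable: bool  # network layer: did the device answer ping/SNMP?
    app_healthy: bool     # application layer: is the service responding?


def classify(snapshot: NodeSnapshot) -> str:
    """Correlate network status with application health."""
    if not snapshot.icmp_reachable:
        return "down"      # dead at the network layer
    if not snapshot.app_healthy:
        return "zombie"    # "Green" on the monitor, "Dead" to the user
    return "healthy"


print(classify(NodeSnapshot(icmp_reachable=True, app_healthy=False)))
```

A traditional monitor only sees the first field, which is why a zombie node looks healthy to it.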
How we built it
We built the system using an MVC (Model-View-Controller) architecture.
The backend was developed in Python with FastAPI, acting as an orchestrator.
The Data Layer consumes the Zabbix 7.0 API for real telemetry from MikroTik devices.
The Brain is powered by Google Gemini 2.5 Flash, processing diagnostics through advanced Prompt Engineering.
The Interface was created with Tailwind CSS and Chart.js, focused on high data density and full responsiveness.
For the predictive engine, we implemented Z-Score-based statistical anomaly detection.
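As an illustration of the Z-Score approach, the sketch below flags a new reading whose distance from the historical mean exceeds a threshold measured in standard deviations. The function names, the sample values, and the threshold of 3.0 are assumptions for this example, not values taken from the project.

```python
from statistics import mean, pstdev


def z_score(history: list[float], value: float) -> float:
    """Number of standard deviations between `value` and the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Flat history: the score is undefined, so this sketch reports 0.0.
        # A production system might prefer to treat any change as anomalous.
        return 0.0
    return (value - mu) / sigma


def is_anomaly(history: list[float], value: float, threshold: float = 3.0) -> bool:
    """Flag readings more than `threshold` standard deviations from the mean."""
    return abs(z_score(history, value)) > threshold


latency_ms = [10, 12, 11, 9, 10, 11, 10, 12]  # made-up sample window
print(is_anomaly(latency_ms, 30))  # far outside the usual range
print(is_anomaly(latency_ms, 11))  # within the usual range
```

The method needs no training phase, only a rolling window of recent samples per metric.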
Challenges we ran into
The biggest challenge was Data Sanitization for the AI. Initially, the AI suffered from "hallucinations" when it read cached data from offline devices. We had to create a treatment layer that forces "absolute zero" in vital signs when the link goes down, so that the Agent reasons from the device's actual state instead of stale readings. Another challenge was synchronizing the SNMP polling time with the Dashboard reactivity, which we solved by optimizing the query intervals in the Zabbix Master Item.
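The treatment layer can be sketched roughly as follows. The metric keys (`cpu_load`, `rx_bps`, `tx_bps`) and the `sanitize` function are hypothetical; the point is only that cached values are overwritten with zero before they reach the Agent whenever the link is down.

```python
# Hypothetical names for the "vital signs" forced to zero when the link is down.
VITAL_KEYS = ("cpu_load", "rx_bps", "tx_bps")


def sanitize(sample: dict, link_up: bool) -> dict:
    """Return a copy of `sample` that is safe to pass to the AI Agent."""
    cleaned = dict(sample)  # never mutate the caller's data
    if not link_up:
        for key in VITAL_KEYS:
            if key in cleaned:
                cleaned[key] = 0  # discard the stale cached reading
    cleaned["link_up"] = link_up  # make the link state explicit in the prompt data
    return cleaned


cached = {"host": "mikrotik-01", "cpu_load": 37, "rx_bps": 120000, "tx_bps": 80000}
print(sanitize(cached, link_up=False))
```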
Accomplishments that we're proud of
We are proud to have created a system that reasons about the network instead of just following if/else rules. In one case the AI Agent correctly identifies that a device has lost power and recommends a physical action; in another it identifies a script error and recommends a logical action. Seeing both demonstrates the effectiveness of our context logic. We managed to close the complete loop: Detection -> Prediction -> Diagnosis -> Resolution.
What we learned
We learned that AI Governance is vital in critical infrastructure. It is not enough to give autonomy to AI; it is necessary to create a risk matrix in which the human remains in control of high-impact decisions. We also deepened our knowledge of network automation (DevNet) and of how to transform raw, unintelligible logs into clear, executive-level action plans.
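A minimal sketch of such an approval gate is shown below, assuming a three-level risk scale and a flag for whether a similar incident has been resolved before. The `Risk` enum and `requires_approval` function are illustrative names, not the project's real risk matrix.

```python
from enum import Enum


class Risk(Enum):
    """Assumed three-level risk scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def requires_approval(risk: Risk, seen_before: bool) -> bool:
    """A human must approve high-risk or unprecedented remediation actions."""
    return risk is Risk.HIGH or not seen_before


# A routine, previously seen fix can run autonomously.
print(requires_approval(Risk.LOW, seen_before=True))
# A high-impact action always waits for the engineer.
print(requires_approval(Risk.HIGH, seen_before=True))
```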
What's next for Picket Officer NOC
The next step is to transform Picket Officer into a multi-vendor system, expanding support to Cisco, Huawei, and Ubiquiti. We plan to implement "Fleet Memory," allowing the agent to learn from each resolved incident and refine its risk accuracy. We also intend to integrate notifications via Telegram/WhatsApp, so that the engineer can give the "AUTHORIZE" command directly from a mobile phone, from anywhere in the world.
Built With
- fastapi
- google-gemini-ai
- machine-learning
- mikrotik
- python
- snmp
- tailwindcss
- virtualbox
- zabbix-api