I build commercial HVAC automation systems — HVAC controllers, boiler sequencers, building management platforms. Our edge controllers (Raspberry Pi-based NexusEdge devices) run control logic in real time, but we kept hitting the same wall: deploying ML models to these constrained devices was a nightmare of fragmented tooling, manual conversion steps, and zero visibility into what was actually running in the field.

Every existing ML platform assumed you had a GPU cluster and a team of MLOps engineers. We needed something that could take a dataset, produce a model, convert it to run on a Hailo-8 accelerator or bare-metal ARM, and push it to a controller in a warehouse — all without leaving the browser.

So we built Prometheus.

## What it does

Prometheus is an end-to-end AI/ML platform purpose-built for edge deployment:

  1. Ingest — Upload datasets or connect live data sources (InfluxDB, PostgreSQL, CSV)
  2. Create — Define .axonml model architectures using the AxonML framework
  3. Train — Launch training runs with real-time loss curves, metric tracking, and queue management
  4. Evaluate — Run gradient evaluations, compare model versions, analyze performance
  5. Convert — Export to ONNX for broad compatibility or HEF for Hailo neural accelerators
  6. Deploy — Push models to edge devices with rollback support and fleet management
  7. Monitor — Track inference performance, prediction drift, and device health from the mobile app
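To make the flow concrete, here is a minimal sketch of how that lifecycle might be modeled in Rust (the type and field names are illustrative, not actual Prometheus APIs):

```rust
// Illustrative only: a hypothetical model of the lifecycle stages above.
// These are not the real Prometheus types.
#[derive(Debug, Clone, PartialEq)]
enum ExportTarget {
    Onnx, // broad runtime compatibility
    Hef,  // Hailo-8 neural accelerator
}

#[derive(Debug, Clone, PartialEq)]
enum Stage {
    Ingest,
    Create,
    Train,
    Evaluate,
    Convert { target: ExportTarget },
    Deploy { device_id: String },
    Monitor,
}

// Recording every transition is one straightforward way to back an audit
// trail of how a deployed artifact was produced.
struct ModelLifecycle {
    model_id: String,
    history: Vec<Stage>,
}

impl ModelLifecycle {
    fn advance(&mut self, next: Stage) {
        self.history.push(next);
    }
}
```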

## How we built it

The entire backend is Rust — no Python microservices, no Node glue code. One language, one binary, maximum performance.

  • Prometheus Server — Axum-based API server handling auth, training orchestration, model management, and deployment pipelines
  • Aegis-DB — Our custom Rust database for user management, model metadata, training history, and audit logs
  • AxonML — Our ML framework for defining, training, and serializing models in the .axonml format
  • Prometheus Shield — Built-in security engine with threat scoring and request analysis
  • Prometheus UI — Leptos frontend compiled to WebAssembly and served as static files — no JavaScript framework, no Node runtime
  • Mobile App — React Native (Expo) for iOS and Android, real-time training run monitoring with live loss charts

The server, database, and frontend compile to a single deployment: one binary + one WASM bundle. The entire platform runs on a $12/month DigitalOcean droplet.
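For a sense of what that looks like in code, here is a minimal sketch of one Axum process serving both the API and the Trunk-built WASM bundle (assuming axum 0.7 and tower-http with the fs feature enabled; routes and paths are illustrative):

```rust
// Sketch: one binary serving the JSON API and the static WASM bundle.
// Routes and the "dist" path are illustrative, not the real layout.
use axum::{routing::get, Router};
use tower_http::services::ServeDir;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let api = Router::new().route("/health", get(|| async { "ok" }));
    // ...auth, training, and deployment routes would hang off `api` here.

    let app = Router::new()
        .nest("/api", api)
        // Anything that is not an API route falls through to the static
        // files produced by `trunk build` (index.html, *.wasm, *.js).
        .fallback_service(ServeDir::new("dist"));

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
    axum::serve(listener, app).await
}
```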

## Challenges

WASM frontend in a Rust workspace — Trunk (the WASM build tool) doesn't play well with Cargo workspaces out of the box. We had to structure the UI crate with its own index.html entry point and build from within the crate directory rather than the workspace root.

Model conversion pipeline — Going from a trained .axonml model to an optimized HEF binary for Hailo-8 involves multiple intermediate representations. Getting the quantization right without destroying model accuracy on edge hardware took significant iteration.
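As a rough way to picture it, the pipeline can be modeled as a chain of artifacts plus a quantization configuration. This is a purely structural sketch (the real HEF compilation happens in Hailo's toolchain, and every name here is hypothetical):

```rust
// Hypothetical sketch of the conversion stages. The actual HEF build is
// done by Hailo's own tooling; this only models the flow and the knobs
// that matter for accuracy.
#[derive(Debug, Clone)]
enum Artifact {
    AxonMl { path: String }, // trained .axonml model
    Onnx { path: String },   // portable intermediate representation
    Hef { path: String },    // binary for the Hailo-8 accelerator
}

#[derive(Debug, Clone)]
struct QuantizationConfig {
    // Post-training quantization needs representative calibration data;
    // the sample count and coverage strongly affect on-device accuracy.
    calibration_samples: usize,
    // Per-channel quantization generally preserves accuracy better than
    // per-tensor for convolutional layers.
    per_channel: bool,
}

fn conversion_plan(model: Artifact, quant: QuantizationConfig) -> Vec<String> {
    // Placeholder: in practice each step produces the next Artifact.
    vec![
        format!("export {model:?} to ONNX"),
        format!("quantize with {quant:?}"),
        "compile ONNX to HEF".to_string(),
    ]
}
```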

Real-time training metrics — We stream training metrics over WebSockets from long-running Rust training loops to the Leptos reactive frontend, and mirror the same data to the mobile app via polling. Getting the data flow right across all three surfaces (training engine → server → web UI / mobile) was the hardest architectural challenge.
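A simplified version of the web half of that flow, assuming axum's ws feature and a tokio broadcast channel (the payload shape and handler names are illustrative):

```rust
// Sketch: the training loop publishes metrics into a broadcast channel,
// and each connected web client gets them pushed over a WebSocket.
use axum::{
    extract::{
        ws::{Message, WebSocket, WebSocketUpgrade},
        State,
    },
    response::IntoResponse,
};
use serde::Serialize;
use tokio::sync::broadcast;

#[derive(Clone, Serialize)]
struct TrainingMetric {
    run_id: String,
    epoch: u32,
    loss: f64,
}

// Mounted on the router with `.with_state(tx)` where `tx` is the
// broadcast sender the training loop writes into.
async fn metrics_ws(
    ws: WebSocketUpgrade,
    State(tx): State<broadcast::Sender<TrainingMetric>>,
) -> impl IntoResponse {
    ws.on_upgrade(move |socket| stream_metrics(socket, tx.subscribe()))
}

async fn stream_metrics(mut socket: WebSocket, mut rx: broadcast::Receiver<TrainingMetric>) {
    while let Ok(metric) = rx.recv().await {
        let json = match serde_json::to_string(&metric) {
            Ok(j) => j,
            Err(_) => continue,
        };
        // If the client has disconnected, stop forwarding.
        if socket.send(Message::Text(json.into())).await.is_err() {
            break;
        }
    }
}
```

A broadcast channel of this shape lets a slow or disconnected browser lag behind without blocking the training loop, and a polled REST endpoint for the mobile app can read from the same underlying data.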

Edge deployment over unreliable networks — Our controllers sit in mechanical rooms with spotty connectivity. The deployment system had to handle partial transfers, verification checksums, and automatic rollback if a model fails health checks after deployment.
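A hedged sketch of the device-side acceptance step, using the sha2 and hex crates (the install/promote/rollback helpers are placeholders for the real file-swap logic on the controller):

```rust
// Sketch of device-side acceptance: verify the transferred artifact
// against the checksum shipped with the deployment, activate it, and
// fall back to the previous model if the health check fails.
use sha2::{Digest, Sha256};

fn verify_checksum(artifact: &[u8], expected_hex: &str) -> bool {
    let digest = Sha256::digest(artifact);
    hex::encode(digest) == expected_hex.to_lowercase()
}

fn install_candidate(_artifact: &[u8]) {
    // Write the new model to a staging path on the controller.
}

fn promote_candidate() {
    // Swap the staged model into the active slot.
}

fn rollback_to_previous() {
    // Restore the last known-good model.
}

fn apply_deployment(
    artifact: &[u8],
    expected_hex: &str,
    health_check: impl Fn() -> bool,
) -> Result<(), String> {
    if !verify_checksum(artifact, expected_hex) {
        // A partial or corrupted transfer never replaces the running model.
        return Err("checksum mismatch, keeping current model".into());
    }
    install_candidate(artifact);
    if health_check() {
        promote_candidate();
        Ok(())
    } else {
        rollback_to_previous();
        Err("health check failed, rolled back".into())
    }
}
```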

## What we learned

Rust is ready for full-stack product development. The type system catches entire categories of bugs at compile time, and the performance means we can run training orchestration, an API server, and a security engine, and serve a WASM frontend — all from one process on minimal hardware. The tradeoff is longer compile times and a steeper learning curve, but for infrastructure that needs to be reliable, it's worth it.
