
\documentclass[12pt]{article} \usepackage{amsmath} \usepackage{hyperref} \usepackage{geometry} \geometry{margin=1in} \usepackage{enumitem}

\title{404 Sentinel: Air Quality Trust Layer} \author{Team based in Sarajevo, Bosnia and Herzegovina} \date{}

\begin{document} \maketitle

\section*{Inspiration} Air quality maps feel objective: ``green is good,'' ``red is bad.'' In reality, every point is a sensor: hardware that can drift, freeze, pick up electrical noise, or, in adversarial settings, be misused. We’re based in Sarajevo, Bosnia and Herzegovina, where PM2.5 affects daily life, yet trust in the measurement often gets less attention than the visualization.

We asked: Can we treat air-quality streams more like monitored infrastructure—assume readings might be wrong, cross-check them, and output a \textbf{trust score} alongside the value? That became \textbf{404 Sentinel}: a layer that scores how reliable a feed looks, not only what number it reports.

\section*{What it does} 404 Sentinel adds a cybersecurity-style trust layer on top of environmental time series, using six PurpleAir sensors across Sarajevo as the live context. It runs multiple detection passes (cross-sensor, statistical, temporal, and pattern checks), then aggregates them into a single trust score on $[0, 100]$ with per-layer explanations—so you see why trust changed, not only that it changed.

It also supports \textbf{attack simulation} (injected drift, noise, flatlines, spikes, coordinated scenarios) and CSV upload for blind analysis of external datasets using the same detection pipeline, so demos and judging can compare ``known injection'' vs.\ ``unknown file'' fairly.
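
As a rough illustration of what such injections can look like, the sketch below applies drift, a flatline, and a spike to a short list of PM2.5 readings. The function names (\texttt{inject\_drift}, \texttt{inject\_flatline}, \texttt{inject\_spike}) and their parameters are hypothetical and are not taken from the 404 Sentinel codebase; the real simulator may model these faults differently.

```python
from typing import List


def inject_drift(values: List[float], start: int, slope: float) -> List[float]:
    """Add a linearly growing offset to every sample from index `start` onward."""
    return [
        v + slope * (i - start + 1) if i >= start else v
        for i, v in enumerate(values)
    ]


def inject_flatline(values: List[float], start: int, length: int) -> List[float]:
    """Freeze the signal at the value seen at `start` for up to `length` samples."""
    out = list(values)
    if start >= len(out):
        return out
    frozen = out[start]
    for i in range(start, min(start + length, len(out))):
        out[i] = frozen
    return out


def inject_spike(values: List[float], index: int, magnitude: float) -> List[float]:
    """Add a one-sample jump of size `magnitude` at `index`."""
    out = list(values)
    out[index] += magnitude
    return out


# Illustrative readings only (not real sensor data).
clean = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0]
drifted = inject_drift(clean, start=3, slope=2.0)
flat = inject_flatline(clean, start=1, length=3)
spiked = inject_spike(clean, index=2, magnitude=50.0)
```

Each helper returns a new list and leaves the clean series untouched, which makes it easy to compare the detector's output on the original and the tampered signal side by side.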

\section*{How we built it} We split the system into a \textbf{FastAPI backend} (data fetch, CSV parsing, detection orchestration, APIs) and a \textbf{Next.js frontend} (dashboard, simulator, CSV analysis, charts). The UI stays a typed client; all scoring logic lives server-side.

\subsection*{Detection pipeline} Seven analyzers score severity in $[0, 1]$ (higher = worse) for things like neighbor correlation, z-score behavior, drift, flatlines, temporal jumps, clustering vs neighbors, and pattern regularity. An eighth block aggregates layer scores with weights and adds data freshness from how recent the last sample is. Conceptually, each layer contributes ``goodness'' $(1 - s_k)$ scaled to $[0, 100]$; the composite is a weighted sum, e.g.:

\[ T := \sum_{k=1}^{7} w_k \cdot (1 - s_k) \cdot 100 + w_{\mathrm{fr}} \cdot F, \]

where $F$ is freshness on $[0,100]$, and the weights satisfy $\sum_k w_k + w_{\mathrm{fr}} = 1$, so with nonnegative weights $T$ stays in $[0,100]$. (The exact weights are defined in the implementation’s config.)
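
The weighted sum can be sketched in a few lines of Python. This is a minimal illustration of the formula for $T$, not the project's actual aggregator: the function name \texttt{trust\_score}, the layer labels, and the equal weights of $0.12$ per layer with $0.16$ for freshness are all assumptions chosen only so that the weights sum to one.

```python
from typing import Dict


def trust_score(
    severities: Dict[str, float],
    weights: Dict[str, float],
    freshness: float,
    freshness_weight: float,
) -> float:
    """Compute T = sum_k w_k * (1 - s_k) * 100 + w_fr * F."""
    if set(severities) != set(weights):
        raise ValueError("severities and weights must cover the same layers")
    if abs(sum(weights.values()) + freshness_weight - 1.0) > 1e-9:
        raise ValueError("layer weights plus freshness weight must sum to 1")
    if any(not 0.0 <= s <= 1.0 for s in severities.values()):
        raise ValueError("each severity must lie in [0, 1]")
    if not 0.0 <= freshness <= 100.0:
        raise ValueError("freshness must lie in [0, 100]")
    layer_part = sum(weights[k] * (1.0 - severities[k]) * 100.0 for k in severities)
    return layer_part + freshness_weight * freshness


# Hypothetical layer labels and weights, for illustration only.
layers = [
    "neighbor_correlation", "z_score", "drift", "flatline",
    "temporal_jump", "clustering", "pattern_regularity",
]
weights = {name: 0.12 for name in layers}
healthy = {name: 0.0 for name in layers}
one_bad = dict(healthy, flatline=1.0)  # a single layer at maximum severity

score_healthy = trust_score(healthy, weights, freshness=100.0, freshness_weight=0.16)
score_one_bad = trust_score(one_bad, weights, freshness=100.0, freshness_weight=0.16)
```

Under these assumed weights, a fully healthy and fresh feed scores about 100, and a single layer at maximum severity removes that layer's share, giving about 88. Keeping the per-layer terms separate before summing is also what would make per-layer explanations possible.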

\subsection*{Consistency} Injected simulator data and uploaded CSVs are both routed through the same analysis path so behavior matches across input types. CSV responses include downsampled time series for charts, analogous to preview data in the simulator.
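
One simple way to downsample a series for charting is to average consecutive buckets of samples. The sketch below assumes this bucket-mean approach and a hypothetical function name \texttt{downsample\_mean}; the text above does not say which downsampling method 404 Sentinel actually uses.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def downsample_mean(series: List[Point], max_points: int) -> List[Point]:
    """Reduce a series to at most `max_points` by averaging consecutive buckets.

    Each output point keeps the timestamp of the first sample in its bucket.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(series)
    if n <= max_points:
        return list(series)
    bucket = -(-n // max_points)  # ceiling division
    out: List[Point] = []
    for start in range(0, n, bucket):
        chunk = series[start:start + bucket]
        mean = sum(value for _, value in chunk) / len(chunk)
        out.append((chunk[0][0], mean))
    return out


# Ten synthetic samples reduced to at most four chart points.
series = [(float(i), float(i)) for i in range(10)]
preview = downsample_mean(series, max_points=4)
```

Averaging smooths out short spikes, so a chart built this way can hide exactly the anomalies a detector flags. A min/max-per-bucket variant is a common alternative when spikes must stay visible.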

\subsection*{Reliability} We use automated tests (e.g., pytest) for parsing, APIs, and cross-path checks so changes don’t silently break parity between simulation and upload.
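
A parity check of this kind could look roughly like the sketch below. Everything in it is a stand-in: \texttt{analyze} is a toy placeholder for the shared pipeline, and \texttt{analyze\_simulated}, \texttt{analyze\_csv}, and the \texttt{pm25} column name are hypothetical. The point is only the shape of the test, which feeds the same values through both entry points and asserts identical results.

```python
import csv
import io
from typing import Dict, List


def analyze(values: List[float]) -> Dict[str, object]:
    """Toy stand-in for the shared pipeline: flags a flatline if all values match."""
    flat = len(values) > 1 and len(set(values)) == 1
    return {"n": len(values), "flatline": flat}


def analyze_simulated(values: List[float]) -> Dict[str, object]:
    """Entry point for simulator data; delegates to the shared path."""
    return analyze(list(values))


def analyze_csv(text: str, column: str) -> Dict[str, object]:
    """Entry point for uploaded CSV text; parses one column, then delegates."""
    reader = csv.DictReader(io.StringIO(text))
    return analyze([float(row[column]) for row in reader])


def test_simulation_and_csv_agree() -> None:
    """pytest-style test: the same values must give the same result on both paths."""
    values = [12.0, 12.0, 12.0]
    text = "pm25\n" + "\n".join(str(v) for v in values) + "\n"
    assert analyze_simulated(values) == analyze_csv(text, "pm25")
```

Because the function name starts with \texttt{test\_}, pytest would collect it automatically when the file itself follows pytest's naming convention.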

\section*{Challenges we faced} \begin{itemize}[leftmargin=*, label={--}] \item \textbf{Parity:} The attack simulator and CSV upload had to produce comparable trust and layer results; small mismatches (e.g., freshness treated as ``unknown'' for files) would break blind evaluation. We aligned semantics—e.g., deriving recency from the last timestamp in the dataset where appropriate—and exposed series data for visualization parity.

\item \textbf{Thin context:} Single-column CSVs have no neighbors; neighbor-heavy layers must still run without pretending there is cross-sensor agreement. Explaining that honestly in the UI and scores mattered.

\item \textbf{Ops, not only algorithms:} Deployment, CORS, env-based API URLs, and multipart uploads are part of the product; they cost real time next to the math.

\item \textbf{Explainability:} A single score is easy; eight explanations are harder—we focused on layer-level breakdowns so the story stays legible.

\end{itemize}
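
The freshness alignment mentioned under \textbf{Parity} can be sketched as follows. This assumes a linear decay from 100 to 0 over a one-hour window and a caller-supplied reference time; the function name \texttt{freshness\_score}, the decay shape, and the window length are illustrative assumptions, not the project's actual rule.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def freshness_score(
    timestamps: List[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=1),  # assumed window, not from the project
) -> float:
    """Map the age of the most recent sample to a freshness value on [0, 100].

    Freshness is 100 for a sample taken at `now` (or later) and decays
    linearly to 0 once the sample is `max_age` old or older.
    """
    if not timestamps:
        raise ValueError("at least one timestamp is required")
    age_seconds = (now - max(timestamps)).total_seconds()
    if age_seconds <= 0:
        return 100.0
    fraction = min(age_seconds / max_age.total_seconds(), 1.0)
    return 100.0 * (1.0 - fraction)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stamps = [now - timedelta(minutes=90), now - timedelta(minutes=30)]
half_fresh = freshness_score(stamps, now)
```

Passing \texttt{now} explicitly lets the live feed and an uploaded file share one function: the caller chooses the reference time that fits each input type instead of the function reading the system clock.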

\section*{What we learned} \begin{itemize}[leftmargin=*, label={--}] \item Environmental integrity overlaps with security thinking: anomalies, drift, and spoofing all show up as mis-trustable time series; the same scoring framework helps debug sensors and surface attacks.

\item One pipeline, many doors: If simulation and CSV upload diverge, you cannot trust benchmarks—consistency is a feature.

\item Tests and docs are part of the demo: Reproducible behavior beats a one-off impressive screenshot.

\end{itemize}

\section*{Accomplishments we’re proud of} \begin{itemize}[leftmargin=*, label={--}] \item A transparent trust model: numeric score plus per-layer evidence. \item Sarajevo-grounded deployment: real PurpleAir context, not a toy dataset. \item Two evaluation modes: controlled simulation and blind CSV analysis on the same engine. \end{itemize}

\section*{What’s next for 404 Sentinel} \begin{itemize}[leftmargin=*, label={--}] \item Richer provenance (device metadata, maintenance history) in the trust model. \item Threshold alerts when trust drops for a neighborhood or route. \item Public benchmarks so others can compare methods on labeled faults. \item Exports (PDF/CSV reports) for cities and NGOs, beyond the live UI. \end{itemize}

\end{document}