## Inspiration
Campus Wi-Fi is mission-critical, but the tools most people see are either too technical or too slow to answer simple questions like “Which AP is struggling right now?” We wanted something anyone at UAB could use—admins, support, even researchers—to explore the network visually, ask questions in plain language (Catalan/Spanish/English), and even peek a bit into the near future.
## What it does
Andarax is an AI-powered assistant for UAB’s Wi-Fi infrastructure:
- Interactive maps & time series: Explore APs across campus, filter by status/CPU/clients, and drill into trends.
- Chat assistant (RAG): Ask natural-language questions like “Show APs in CIEN with CPU > 50%” or “Which APs will be busiest tomorrow morning?”
- ML predictions: A Random Forest forecasts client counts per AP for specific dates/times, available both via UI and via chat.
- Multilingual: Works in Catalan, Spanish, and English.
- Local-first: Runs locally with the AINA model (FLOR-6.3B), keeping data private.
## How we built it
- Frontend & UX: Streamlit app (Python 3.12) with Folium maps and responsive panels for chat, trends, and predictions.
- RAG pipeline:
  - A router decides between structured queries (filters on AP/building/metrics) and semantic queries.
  - BGE-M3 embeddings with a FAISS index power semantic search over AP/context docs.
  - A context builder enriches hits with full AP metadata.
- LLM: AINA FLOR-6.3B via Hugging Face Transformers (4-bit quantization via bitsandbytes) for fast, local responses.
- Prediction engine: Random Forest with 13 features (time encodings, hour/day flags, per-AP stats). Caching with joblib makes inference snappy.
- Data layer: 2,333 timestamped AP snapshots (15-min intervals), plus geo data; lightweight cleaners and a consistent schema for the UI and models.
- Testing: Scripts for prediction sanity checks, AINA model readiness, and CLI chat runs.
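The router's structured-vs-semantic split can be sketched in a few lines. This is an illustrative toy, not the project's code: `route_query` and `semantic_search` are hypothetical names, and a bag-of-words vector stands in for the real BGE-M3 embeddings and FAISS index.

```python
import math
import re

# Queries with an explicit metric comparison (e.g. "CPU > 50%") go down
# the structured path; everything else falls through to semantic search.
METRIC_FILTER = re.compile(
    r"\b(cpu|clients|memory)\b\s*(>|<|>=|<=|=)\s*(\d+)", re.IGNORECASE
)

def route_query(query: str) -> str:
    return "structured" if METRIC_FILTER.search(query) else "semantic"

def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy bag-of-words embedding (stand-in for a real encoder)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query: str, docs: list[str], vocab: list[str]) -> str:
    """Return the document closest to the query in embedding space."""
    qv = embed(query, vocab)
    return max(docs, key=lambda d: cosine(qv, embed(d, vocab)))

vocab = ["overloaded", "lunch", "cpu", "quiet", "library"]
docs = ["AP-CIEN-12 overloaded around lunch", "AP-BIB-03 quiet library corner"]

assert route_query("Show APs in CIEN with CPU > 50%") == "structured"
assert route_query("which APs feel overloaded around lunch?") == "semantic"
assert semantic_search("overloaded at lunch", docs, vocab) == docs[0]
```

The same two-way decision generalizes: a structured hit bypasses the vector index entirely, which keeps simple filter queries fast and deterministic.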
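The kind of feature vector the Random Forest consumes can be sketched as follows. This shows a subset of the 13 features with hypothetical names, assuming cyclical time encodings and per-AP priors as described above; the actual feature set may differ.

```python
import math
from datetime import datetime
from statistics import mean, pstdev

def make_features(ts: datetime, ap_history: list[int]) -> dict[str, float]:
    """Build one feature row for a given timestamp and one AP's history."""
    hour, dow = ts.hour, ts.weekday()
    return {
        # Cyclical encodings keep 23:00 close to 00:00 and Sunday to Monday.
        "hour_sin": math.sin(2 * math.pi * hour / 24),
        "hour_cos": math.cos(2 * math.pi * hour / 24),
        "dow_sin": math.sin(2 * math.pi * dow / 7),
        "dow_cos": math.cos(2 * math.pi * dow / 7),
        "is_weekend": float(dow >= 5),
        "is_work_hours": float(8 <= hour < 20),
        # Per-AP priors (mean/std/max) stabilise forecasts for sparse APs.
        "ap_mean": mean(ap_history),
        "ap_std": pstdev(ap_history),
        "ap_max": float(max(ap_history)),
    }

history = [0, 3, 12, 25, 18, 7, 0]  # toy client counts for one AP
feats = make_features(datetime(2025, 4, 7, 9, 0), history)  # a Monday, 09:00
assert feats["is_weekend"] == 0.0 and feats["is_work_hours"] == 1.0
assert feats["ap_max"] == 25.0
```

Cyclical sin/cos pairs matter here because a raw hour column would make the model treat 23:00 and 00:00 as maximally distant.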
## Challenges we ran into
- Messy real-world JSON: Optional/missing fields (e.g., radio/band fields like `band_6_channels`) broke naive loaders; fixed with robust schema validation and defaults.
- Data sparsity: Many zero-client readings made learning tricky. We added per-AP priors (mean/std/max) and careful time encodings to stabilize forecasts.
- Query routing edge cases: Natural language often mixes structure (“CPU > 50%”) with semantics (“overloaded around lunch”). We iterated on a hybrid strategy and improved entity extraction for AP names/buildings.
- GPU constraints: Even quantized LLMs can be heavy. We tuned model loading (4-bit NF4), reduced context sizes, and ensured CPU fallback.
- Date parsing in the wild: “Next Monday morning” vs. specific timestamps required a predictable, timezone-aware parser for the prediction tool.
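The defaults-based loading fix can be sketched like this. A minimal, hypothetical version: field names other than `band_6_channels` are invented for illustration, and the real schema validation is certainly richer.

```python
# Defaults for optional fields so missing or null values never raise
# KeyError downstream; only truly required fields are enforced.
AP_DEFAULTS = {
    "band_6_channels": [],
    "band_5_channels": [],
    "cpu": 0.0,
    "clients": 0,
}

def load_ap(raw: dict) -> dict:
    """Normalise one raw AP snapshot into the consistent internal schema."""
    if "name" not in raw:
        raise ValueError("AP snapshot missing required 'name' field")
    ap = dict(AP_DEFAULTS)
    # Nulls are treated the same as missing fields.
    ap.update({k: v for k, v in raw.items() if v is not None})
    return ap

ap = load_ap({"name": "AP-CIEN-12", "cpu": 61.5, "band_6_channels": None})
assert ap["band_6_channels"] == []   # null normalised to the default
assert ap["clients"] == 0            # missing field filled in
assert ap["cpu"] == 61.5
```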
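The predictable, timezone-aware resolution of phrases like "next Monday morning" can be sketched as below. All names here are hypothetical, and mapping "morning" to 09:00 is an assumed convention; the real parser handles many more phrase shapes.

```python
from datetime import datetime, timedelta, timezone

DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday",
        "saturday", "sunday"]
# Assumed convention: parts of day resolve to fixed representative hours.
PART_OF_DAY = {"morning": 9, "afternoon": 15, "evening": 20}

def parse_relative(phrase: str, now: datetime) -> datetime:
    """Resolve a relative phrase against a timezone-aware reference time."""
    words = phrase.lower().split()
    target_day = next(DAYS.index(w) for w in words if w in DAYS)
    hour = next((PART_OF_DAY[w] for w in words if w in PART_OF_DAY), 0)
    ahead = (target_day - now.weekday()) % 7
    if ahead == 0:  # "next Monday" said on a Monday means a week later
        ahead = 7
    day = (now + timedelta(days=ahead)).date()
    return datetime(day.year, day.month, day.day, hour, tzinfo=now.tzinfo)

now = datetime(2025, 4, 4, 17, 30, tzinfo=timezone.utc)  # a Friday evening
when = parse_relative("next monday morning", now)
assert when.weekday() == 0 and when.hour == 9
assert when.tzinfo is timezone.utc
```

Anchoring every calculation to an explicit `now` makes the parser deterministic and easy to unit-test, which is what "predictable" demands.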
## Accomplishments that we're proud of
- A single interface that feels simple for non-experts but gives power users filters, search, and predictions.
- Multilingual assistant that understands UAB context and AP naming conventions.
- Fast local inference with a serious Catalan-capable model—no external API calls.
- Realistic forecasts that align with daily/weekly rhythms and per-AP behavior.
- Clean architecture (router → vector search → context → LLM) that is easy to extend.
## What we learned
- Good defaults beat perfect models when the UI invites exploration and the assistant speaks the user’s language.
- Hybrid search is key: combining strict filters with embeddings yields useful answers to messy questions.
- Observability matters: small diagnostics (counters, caches, schema checks) save hours during hackathon crunch.
- Local models are viable for campus tools when quantization and prompt discipline are applied.
## What's next for 27 - Piratas Baviera
- More training data: extend beyond the April 2025 window to strengthen seasonality and rare events.
- Confidence intervals & anomalies: show uncertainty bands and flag unusual behavior.
- Alerts & reports: scheduled notifications and PDF/CSV exports for admins.
- Real-time streaming: live tiles for status and client counts.
- Broader integrations: tie into network management systems; multi-campus support.
- Model upgrades: experiment with XGBoost/LightGBM and compact instruction-tuned LLMs for even faster on-device chat.
## Built With
- ai
- datavisualization
- llm
- ml
- natural-language-processing
- ollama
- python