Project Story — About the Project
Inspiration
In our department building, HVAC, lighting, and lab machines often stay on after hours. Cutting power by floor reduces waste but breaks experiments, overnight computations, and early classes. That small pain reflects a national need: reducing building energy use without blunt shutdowns. This motivated SmartWatt, an AI system that forecasts short-term load and triages anomalies in a way operators can trust.
SmartWatt — Few‑shot Load Forecasting & Calibrated Triage
What it does
- SmartWatt delivers short‑term (24‑hour) building‑load forecasts from 168 hours of context, with few‑shot transfer on Indian buildings.
- Uses pretrained IBM Granite Time Series – TinyTimeMixer (TTM) backbones (both r1 and r2 checkpoints), fine‑tuned on the windowed building data.
- Produces robust submissions via seed ensembling inside each family and cross‑family blending (R1+R2) for better generalization to the hidden leaderboard.
- Keeps the model simple & stable for hackathon runtime: encoder frozen, decoder+head trained, RPT disabled, batch=32 (GA=2).
Approach
Data → Features → Windows
- Construct 192‑step windows per building/region with roles (input: 168 / target: 24).
- Impute missing metadata values with XGBoost.
- Add calendar features: hour, dayofweek, month, hour_sin, hour_cos.
- Engineer around 25 features, including isweekend, area per person, people per sqft, fans per room, deviation from region area, is area missing, and is inverter missing.
- One‑hot region using train‑only categories (prevents leakage); median‑impute numeric controls.
- Enforce float32 everywhere to avoid object → torch dtype issues.
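The windowing, calendar, and leak-safe encoding steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names, the stride of 24 hours between windows, and the all-zero encoding for regions unseen in training are assumptions.

```python
import math

CONTEXT, HORIZON = 168, 24  # input / target roles from the text


def make_windows(series, context=CONTEXT, horizon=HORIZON, stride=HORIZON):
    """Slice one building's hourly series into (input, target) pairs."""
    size = context + horizon  # 192 steps per window
    return [
        (series[s:s + context], series[s + context:s + size])
        for s in range(0, len(series) - size + 1, stride)
    ]


def calendar_features(ts):
    """Calendar features for one datetime: raw fields plus a cyclical hour."""
    angle = 2.0 * math.pi * ts.hour / 24.0
    return {
        "hour": ts.hour,
        "dayofweek": ts.weekday(),  # Monday = 0
        "month": ts.month,
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
        "isweekend": int(ts.weekday() >= 5),
    }


def fit_region_encoder(train_regions):
    """Fix the one-hot columns from training rows only, so no test leakage."""
    categories = sorted(set(train_regions))

    def encode(region):
        # Assumption: a region unseen in training maps to an all-zero row.
        return [1.0 if region == c else 0.0 for c in categories]

    return categories, encode
```

The sine/cosine pair keeps hour 23 adjacent to hour 0, which the raw hour value does not.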
Preprocessing
- Use TimeSeriesPreprocessor to format series; rely on TTM’s built‑in normalization (no manual global scaling).
- Collator pads/crops to checkpoint CL = 512, pads masks; RPT kept off after ablation.
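As a rough sketch of the pad/crop step, the helper below brings any history to the checkpoint context length and returns a matching observed mask. The name pad_or_crop, the zero pad value, and the convention that True marks an observed step are assumptions for illustration; the real collator works on batched tensors.

```python
def pad_or_crop(values, context_length=512, pad_value=0.0):
    """Return (window, observed_mask), both of length context_length.

    Longer histories keep their most recent steps; shorter ones are
    left-padded so the latest observation stays in the final position.
    """
    recent = list(values[-context_length:])
    n_pad = context_length - len(recent)
    window = [pad_value] * n_pad + recent
    observed_mask = [False] * n_pad + [True] * len(recent)
    return window, observed_mask
```

With the 168-step context used here, each window gets 344 padded positions on the left, and the mask lets the model ignore them.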
Modeling
- Load TinyTimeMixerForPrediction from ibm-granite/granite-timeseries-ttm-r1 and …-r2.
- Align input channels only; prune head from checkpoint FL → 24 without touching backbone geometry.
Two‑phase fine‑tuning
- Phase‑1 (head‑only): lr_head = 1e-3, ~8 epochs, early‑stop on val MSE.
- Phase‑2 (decoder+head): lr_head = 8e-4 (also tried 6e-4), lr_dec = 2e-4 (also 1.5e-4), 20–30 epochs, cosine + 7–10% warmup, head dropout = 0.2.
- Encoder frozen (optional “micro‑unfreeze” A/B for 3–5 epochs at very low LR).
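The Phase‑2 schedule (cosine with warmup, separate learning rates for head and decoder) can be illustrated with a small stand-alone function. This is a sketch under assumptions: the warmup is taken to be linear, the cosine is taken to decay to zero, and the names lr_at and phase2_lrs are hypothetical. In training, the same shape would come from an optimizer with two parameter groups and a scheduler.

```python
import math


def lr_at(step, total_steps, base_lr, warmup_frac=0.1):
    """Linear warmup to base_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Peak Phase-2 learning rates quoted in the text, one per parameter group.
PHASE2_LR = {"head": 8e-4, "decoder": 2e-4}


def phase2_lrs(step, total_steps, warmup_frac=0.1):
    """Learning rate of each trainable group at a given optimizer step."""
    return {
        name: lr_at(step, total_steps, peak, warmup_frac)
        for name, peak in PHASE2_LR.items()
    }
```

Both groups follow the same curve and differ only in peak value, so the decoder always moves at a quarter of the head's rate.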
Validation & Ensembling
- Save per‑seed validation predictions and test submissions.
- Convex MSE blending across seeds within each family (weights on simplex; weak seeds get down‑weighted automatically).
- R1↔R2 family blend via convex MSE on validation, then apply the learned weights to test.
- We evaluated horizon/region calibration; it helped NLL offline but did not improve MSE LB, so the final pipeline is MSE‑only blending.
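One way to fit convex blend weights is projected gradient descent on validation MSE, sketched below in plain Python. The solver choice, the step count, and all function names are assumptions; the write-up only states that weights live on the simplex and minimize validation MSE. Predictions are flattened to one list per ensemble member.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(sorted(v, reverse=True), start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def blend(preds, weights):
    """Weighted average of several prediction vectors."""
    return [sum(w * p[i] for w, p in zip(weights, preds))
            for i in range(len(preds[0]))]


def fit_convex_weights(val_preds, y_val, steps=500):
    """Minimise validation MSE of the blend over the probability simplex."""
    k, n = len(val_preds), len(y_val)
    # Weights sum to 1, so blend error = weighted sum of member errors.
    errors = [[p - t for p, t in zip(pred, y_val)] for pred in val_preds]
    lipschitz = 2.0 / n * sum(e * e for err in errors for e in err)
    weights = [1.0 / k] * k
    if lipschitz == 0.0:  # every member is already perfect
        return weights
    lr = 1.0 / lipschitz  # conservative step from a trace bound
    for _ in range(steps):
        resid = [sum(w * err[i] for w, err in zip(weights, errors))
                 for i in range(n)]
        grad = [2.0 / n * sum(r * e for r, e in zip(resid, err))
                for err in errors]
        weights = project_to_simplex(
            [w - lr * g for w, g in zip(weights, grad)])
    return weights
```

The same routine covers both stages: fit weights across seeds within R1 and within R2, blend each family, then fit a second set of weights across the two family blends and apply them to the test predictions with blend.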
Inference
- CL‑aware pad/crop; predict; inverse‑scale via preprocessor stats; clip ≥ 0.
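The last two inference steps can be sketched as follows, assuming the preprocessor stores a per-series mean and standard deviation (the write-up only says "preprocessor stats", so the exact scaler is an assumption, as is the name postprocess).

```python
def postprocess(pred_scaled, mean, std, horizon=24):
    """Map scaled model outputs back to load units and floor them at zero."""
    restored = [p * std + mean for p in pred_scaled[:horizon]]
    return [max(0.0, v) for v in restored]
```

Clipping at zero reflects that building load cannot be negative, so any negative forecast is a scaling artifact.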
Challenges
- Per‑window vs global scaling: solved by using TTM’s internal scaler + strict dtype control.
- Shape & context mismatches: handled with left‑pad to 512 and observed‑masking in the collator.
- Seed stability & leakage: region dummies fit on train only; early stopping on eval MSE.
- LB sensitivity: NLL‑oriented calibration hurt MSE on LB → removed.
Expected Impact
- Operational: tighter day‑ahead forecasts for chiller scheduling/shiftable loads; fewer false alerts.
- ₹/CO₂: accuracy translates into tariff‑weighted scheduling and avoided peaks—foundation for ₹/day and kg‑CO₂/day reporting.
- Scalable rollout: small trainable head/decoder; fast few‑shot onboarding across buildings/regions.
What we learned / Novelty
- Seed + family blending matters: R1 and R2 learn complementary errors; convex blending consistently beats either family alone.
- RPT off > on for this dataset/splits.
- Strict preprocessing hygiene (float32, no object dtypes, leak‑safe region dummies) prevents silent degradations.
- Simple beats clever on MSE LB: per‑horizon/region calibration that helps NLL didn’t help MSE → removed.
What’s next
- Depth of ensemble (low‑risk, likely lift): expand to 5 seeds per family (e.g., R1: 40/42/50/17/73; R2: 40/42/44/19/79), convex‑blend per family → blend families; also vary data_seed to decorrelate loaders.
- Targeted micro‑unfreeze (optional A/B): unfreeze last encoder block for 3–5 epochs with tiny LR (1e-5 → 5e-5) and ES patience = 2–3; stop if no gain.
- Residual stacking (low‑medium risk): train a tiny residual regressor on validation residuals using exogenous features (hour/region) and apply to test. Freeze if public LB drops.
- Model diversity: add PatchTSMixer (from tsfm_public) as a third family for a 3‑way blend (R1/R2/PTM).
- Repro pack: one‑click script to train seeds → save val/test → run convex blends → emit final CSV.