Inspiration

Every resident of Patna has a photo on their phone of a broken road, an open drain, or a streetlight that has been dark for months. They take the photo, they WhatsApp it to a neighbour — and then nothing happens.

The problem isn't that citizens don't care. It's that the path from complaint to action is completely broken. You don't know which of dozens of government departments to contact. You don't know the right form, the right address, or the right law to cite. And even if you do file a complaint, you have no idea if anyone else has reported the same pothole eight times before you.

"Reported 6 times. Never fixed." — This is the neglect pattern NeighbourWatch was built to surface and escalate.


What it does

NeighbourWatch is a citizen-driven civic reporting platform for Indian cities. A citizen:

  1. Photographs a civic issue — pothole, open drain, broken streetlight, garbage, waterlogging
  2. Records a voice note in Hindi, Bhojpuri, or English describing the problem
  3. Submits — GPS is captured automatically

From there, a single AI pipeline handles everything:

  • Classifies the issue type from the photo
  • Transcribes and translates the voice note
  • Scores severity 1–10
  • Routes to the correct government department (PWD, BSPHCL, Patna Municipal Corporation, etc.)
  • Drafts a legally formatted RTI letter under Section 6, Right to Information Act 2005

The issue appears instantly on a live city map colour-coded by severity. A PostGIS spatial engine clusters nearby reports and calculates a neglect score — how many times this exact problem has been reported and ignored.


How we built it

The 1-call master agent architecture

Our original design called for 7 sequential Gemini API calls — vision → transcription → severity → clustering → pattern detection → department routing → RTI generation. Under free-tier quotas (15 requests/minute), that would have broken the demo after 20 submissions.

Instead, we designed a single monolithic multimodal prompt that accepts a base64 image + GPS coordinates + voice description simultaneously, and returns one unified JSON payload containing all outputs. This reduced token consumption by ~85%.

Input:  [ base64 image ] + [ GPS coords ] + [ voice/text description ]
           ↓
   Gemini 2.0 Flash (1 multimodal call)
           ↓
Output: { issue_type, severity, translation, department, rti_letter }

All Gemini calls go through a single orchestrator (src/lib/gemini.ts) that rotates across multiple API keys and cascades from gemini-2.0-flash-lite to gemini-2.0-flash on quota errors.
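
A minimal sketch of that rotate-then-cascade behaviour, not the actual contents of src/lib/gemini.ts. The names callWithFallback, isQuotaError and callModel are illustrative assumptions, as is the idea that a quota error surfaces as a thrown object with status 429. The real Gemini request is left as an injected callback, so nothing here depends on a specific SDK. The sketch tries every key on one model before moving to the next model; the real ordering may differ.

```typescript
// The actual Gemini request is injected, so this sketch assumes nothing about the SDK.
type CallModel<T> = (apiKey: string, model: string) => Promise<T>;

// Assumption: the client surfaces quota exhaustion as a thrown object with status 429.
function isQuotaError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 429
  );
}

// Try each API key on the current model, then cascade to the next model.
// Only quota errors trigger a retry; any other error is rethrown immediately.
async function callWithFallback<T>(
  apiKeys: string[],
  models: string[],
  callModel: CallModel<T>,
): Promise<T> {
  for (const model of models) {
    for (const apiKey of apiKeys) {
      try {
        return await callModel(apiKey, model);
      } catch (err) {
        if (!isQuotaError(err)) throw err;
        // Quota exhausted for this key/model pair: move on to the next pair.
      }
    }
  }
  throw new Error("All API keys and models are out of quota");
}
```

A caller would pass the key list, the model list in cascade order, and a function that performs the single multimodal request.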

PostGIS for geo-intelligence

Geographical clustering and neglect pattern detection were moved entirely into PostgreSQL + PostGIS — eliminating 2 AI calls and making the spatial reasoning deterministic and fast.

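-- Find every report within 500 metres of a point.
-- lng and lat are the caller-supplied coordinates; ST_MakePoint takes x (longitude) first.
SELECT * FROM reports
WHERE ST_DWithin(
  gps_location,
  ST_SetSRID(ST_MakePoint(lng, lat), 4326)::geography,
  500  -- 500 metre radius (metres, because the comparison uses geography)
);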
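
For unit tests that run without a database, the same radius filter can be approximated in TypeScript. This is a rough sketch, not the project's code: the Report shape, the resolved flag and the neglectCount name are assumptions, and the haversine formula treats the Earth as a sphere, so distances will differ slightly from what PostGIS computes on geography. The write-up does not give the real neglect score formula; the sketch simply counts unresolved reports of the same issue type inside the radius.

```typescript
// Hypothetical row shape; only issue_type is named in the pipeline output.
interface Report {
  issue_type: string;
  lat: number;
  lng: number;
  resolved: boolean; // assumed status flag
}

const EARTH_RADIUS_M = 6371000; // mean Earth radius in metres

// Great-circle distance between two points, in metres (spherical approximation).
function haversineMetres(lat1: number, lng1: number, lat2: number, lng2: number): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const sinLat = Math.sin(toRad(lat2 - lat1) / 2);
  const sinLng = Math.sin(toRad(lng2 - lng1) / 2);
  const a =
    sinLat * sinLat +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * sinLng * sinLng;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Count unresolved reports of the same issue type within radiusM of a point.
function neglectCount(
  reports: Report[],
  issueType: string,
  lat: number,
  lng: number,
  radiusM: number = 500,
): number {
  return reports.filter(
    (r) =>
      r.issue_type === issueType &&
      !r.resolved &&
      haversineMetres(lat, lng, r.lat, r.lng) <= radiusM,
  ).length;
}
```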

Real-time map

Supabase's real-time subscriptions push new reports to the Leaflet.js map instantly — no page refresh. Marker colour encodes severity. A heatmap overlay shows civic hot zones across the city.
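
As an illustration of the severity colour coding, a marker colour could be derived from the 1–10 score along these lines. The thresholds, the hex values and the severityColour name are assumptions made for this sketch; the write-up does not state the actual palette.

```typescript
// Map a 1-10 severity score to a marker colour.
// Thresholds and hex values are illustrative assumptions, not the project's palette.
function severityColour(severity: number): string {
  if (!Number.isFinite(severity)) {
    throw new RangeError("severity must be a finite number");
  }
  // Round and clamp so out-of-range model output still yields a valid colour.
  const s = Math.min(10, Math.max(1, Math.round(severity)));
  if (s >= 8) return "#d73027"; // red: critical
  if (s >= 5) return "#fc8d59"; // orange: serious
  if (s >= 3) return "#fee08b"; // yellow: moderate
  return "#1a9850"; // green: minor
}
```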


Challenges we ran into

API quota exhaustion hit us within 2 hours of testing. Our test logs tell the story: gemini-1.5-flash-8b returned a 404 (model deprecated mid-hackathon), and gemini-2.0-flash returned 429s asking for a 45-second retry. We redesigned the entire pipeline around 1-call batching and implemented key rotation as a safeguard.

Multimodal prompt engineering — getting one prompt to reliably return valid structured JSON across image + audio + text inputs took significant iteration. Early attempts returned markdown-wrapped JSON, inconsistent field names, or hallucinated department addresses. We solved this with an explicit response schema contract in the system prompt.
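
A schema contract in the prompt is usually paired with a defensive check on the way out. The sketch below is one possible shape for that check, assuming the five fields shown in the pipeline output; parsePipelineResult and the exact validation rules (for example, integer severity) are illustrative and not taken from the project. It strips a markdown fence if the model adds one, parses the JSON, and rejects payloads with missing or mistyped fields.

```typescript
// Field names follow the unified payload described earlier; the rules are assumptions.
interface PipelineResult {
  issue_type: string;
  severity: number;
  translation: string;
  department: string;
  rti_letter: string;
}

const STRING_FIELDS = ["issue_type", "translation", "department", "rti_letter"] as const;
const FENCE = "`".repeat(3); // three backticks, built indirectly to keep this listing fence-safe

function parsePipelineResult(raw: string): PipelineResult {
  let text = raw.trim();
  if (text.startsWith(FENCE)) {
    // Drop the opening fence line (which may carry a "json" tag) and the closing fence.
    text = text.slice(text.indexOf("\n") + 1).trim();
    if (text.endsWith(FENCE)) text = text.slice(0, -FENCE.length).trim();
  }
  const data: unknown = JSON.parse(text); // throws on malformed JSON
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Expected a JSON object");
  }
  const obj = data as Record<string, unknown>;
  for (const field of STRING_FIELDS) {
    if (typeof obj[field] !== "string") {
      throw new Error("Missing or non-string field: " + field);
    }
  }
  const severity = obj["severity"];
  if (typeof severity !== "number" || !Number.isInteger(severity) || severity < 1 || severity > 10) {
    throw new Error("severity must be an integer from 1 to 10");
  }
  return {
    issue_type: obj["issue_type"] as string,
    severity,
    translation: obj["translation"] as string,
    department: obj["department"] as string,
    rti_letter: obj["rti_letter"] as string,
  };
}
```

Failing fast here keeps a malformed response from reaching the map or the RTI letter; note that a type check cannot catch a hallucinated address inside an otherwise valid string.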

PostGIS setup — the geography vs geometry type distinction caused several silent failures before our get_nearby_reports RPC function returned correct distances in metres.

Leaflet in Next.js App Router — Leaflet throws on server-side rendering. Required dynamic imports with ssr: false and careful map instance lifecycle management.


What we learned

The biggest lesson: the real intelligence wasn't in the number of agents — it was in knowing when to push work out of the AI and into the database.

Seven agents sounds impressive. One multimodal prompt that does everything costs 85% less and is more reliable. PostGIS handles spatial pattern detection better than Gemini's long context ever could, because it's deterministic. Keep AI for understanding unstructured human input. Keep the database for finding patterns in structured records.

Also: always rotate your API keys from day one. The 429 error at 2am is not the time to redesign your Gemini client.


What's next

  • WhatsApp integration — send a photo to a number, receive the RTI letter back in the chat
  • Department dashboard — a portal for PMC/PWD officers to acknowledge and update issue status
  • Expansion to Muzaffarpur, Gaya, Bhagalpur with city-specific department routing
  • Predictive maintenance — flag infrastructure likely to fail before it is reported
