DRISHTI - Eye with AI

Inspiration

The inspiration behind DRISHTI is the need for a more efficient and responsive urban management system. The platform is designed to tackle a wide array of civic issues, from traffic management and infrastructure maintenance to public safety and citizen engagement. By integrating diverse data sources, including multimodal reports, social media, and IoT sensors, it works toward a "smart city" in which data-driven insights lead to proactive problem-solving. The emphasis on "reasoning, summarisation, dynamic response generation" by the Gemini Models reflects the ambition to create a system that not only monitors the urban environment but also interacts with it intelligently and improves it.

What it Does

DRISHTI functions as a centralized intelligence platform for urban governance. At its core, it ingests a vast amount of data from various sources through its DRISHTI DATA WAREHOUSE. This data includes everything from traffic and weather APIs to public transport feeds and citizen reports. The system then processes and analyzes this information using a sophisticated framework of AI agents.

How We Built It: Kiro - An Agentic Development Journey

DRISHTI's complexity demanded a novel development approach. Instead of traditional line-by-line coding, we employed Kiro, an agentic AI code editor, to orchestrate the entire creation process from concept to deployment. This allowed us to operate with the speed and precision of a much larger team, focusing our human efforts on refining the high-level logic and AI models. The process followed four distinct, automated phases:

Phase 1: Requirement Ingestion

The journey began by providing Kiro with our comprehensive requirements.md. The AI agent ingested and parsed the 10 core user stories and their acceptance criteria, building a deep, contextual understanding of the project's goals, from the Citizen App's multimedia reporting to the Authority Portal's mission-critical communication (MCP) integration.

Phase 2: Autonomous Architecture Design

Acting as an AI system architect, Kiro synthesized these requirements into a detailed design.md. It autonomously selected the optimal tech stack (React Native for cross-platform, Node.js for scalable backend services, Python for AI agents), defined the entire microservices architecture, and generated detailed data models (PostgreSQL/PostGIS, MongoDB) and API contracts, providing a complete blueprint for development.

Phase 3: Full-Stack Code Generation

With the architectural blueprint established, Kiro transitioned into a code generation role. It scaffolded the complete monorepo, generating the entire directory structure and boilerplate files for the React Native frontend, Node.js backend services, and Python AI agent microservices. This included all necessary configurations, from package.json to tsconfig.json and Dockerfiles.

Phase 4: Granular Task Execution

Finally, Kiro created and executed a 25-step tasks.md implementation plan. It methodically wrote the functional code for our most complex components, autonomously implementing the computer vision module, building the React Native map interface with Google Maps integration, and coding the MCP service for secure communications, turning the scaffold into a functional application.

This agent-driven protocol yielded significant advantages: Unprecedented Velocity, compressing months of development into a much shorter timeframe; Architectural Consistency, ensuring the final code mirrored the initial design; and Complexity Abstraction, allowing us to build a multi-agent system while focusing on innovation rather than boilerplate.

Key Functionalities:

Reporting and Analysis: The Reporting Agent analyzes incoming reports, converting audio to text, analyzing images and videos, summarizing reports, and detecting duplicate or fake information.

Mapping and Visualization: The Map Agent provides geospatial plotting of incidents, identifies hotspots for issues like potholes or crime, generates "mood maps" based on social media sentiment, and determines the best routes for emergency services.

Predictive Analytics: The Predictive Agent forecasts future events and patterns, such as traffic congestion, potential hazards, and anomalies in city services.

Logistics and Resource Management: The Logistics Agent triages and prioritizes issues, allocates resources, and integrates with CRM and GIS systems for efficient dispatch and management.

Citizen Engagement: A User Engagement & Chatbot agent manages interactions with the public, providing a conversational history, a gamification engine to encourage reporting, and a notification dispatch system.

The outputs of this complex processing are intelligent maps, comprehensive dashboards and analytics, and actionable alerts and external actions, including chat-based interactions.
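As an illustration of the duplicate detection mentioned for the Reporting Agent, here is a minimal sketch. It assumes two reports describe the same incident when they are geographically close and share most of their wording. The `Report` class, the thresholds, and the word-overlap measure are all hypothetical choices made for this example; the real agent may well rely on embeddings or image comparison instead.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    # Hypothetical minimal shape of a citizen report.
    report_id: str
    text: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def jaccard(a, b):
    """Word-level Jaccard similarity; a crude stand-in for semantic matching."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def find_duplicate(new, existing, max_distance_m=100.0, min_similarity=0.5):
    """Return the first existing report that looks like the same incident, else None."""
    for old in existing:
        close = haversine_m(new.lat, new.lon, old.lat, old.lon) <= max_distance_m
        similar = jaccard(new.text, old.text) >= min_similarity
        if close and similar:
            return old
    return None
```

Requiring both conditions keeps two different problems on the same street corner from being merged, and keeps identically worded reports from different neighbourhoods apart.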

How We Built It

The architecture of DRISHTI, as detailed in the provided diagram, is modular and built around a central MCP (Master Context Protocol) Server. This server acts as the hub for state management, context, and the coordination of the "agentic workforce."

Key Technological Components:

Data Ingestion: The DRISHTI DATA WAREHOUSE aggregates data from a multitude of sources.

AI and Machine Learning: The system heavily relies on Gemini Models for advanced reasoning and response generation, as well as Fine-Tuned Models for specific civic tasks like pothole detection.

Agent-Based Architecture: The core of the system is the LLM Orchestrator Agent which decomposes tasks and delegates them to specialized "Primary Agents" (Reporting, Map, Predictive, Logistics, and User Engagement). These primary agents are further broken down into specific sub-agents that perform granular tasks.

Memory and Context: A Vector DB (RAG) and a State DB provide the system with both long-term memory and real-time context, enabling it to learn and adapt.

Development Framework: The entire system is built upon an Agent Development Kit (ADK) Framework, suggesting a structured and scalable approach to building and managing the AI agents.

Integration with Android and Firebase: The inclusion of Android and Firebase logos suggests that the citizen-facing component of the platform is likely a mobile application.
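The delegation from the LLM Orchestrator Agent to the Primary Agents described above can be sketched as a small dispatcher. This is only an illustration: the `Orchestrator` class, its method names, and the request fields are hypothetical, and the rule-based `decompose` method stands in for the decomposition that the LLM performs in the real system.

```python
from typing import Callable, Dict, List

# Hypothetical handler signature: a primary agent takes a payload and returns a result dict.
AgentHandler = Callable[[dict], dict]


class Orchestrator:
    """Routes sub-tasks to registered primary agents (illustrative only)."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentHandler] = {}

    def register(self, name: str, handler: AgentHandler) -> None:
        self._agents[name] = handler

    def decompose(self, request: dict) -> List[dict]:
        # Rule-based stand-in for LLM task decomposition.
        subtasks = [{"agent": "reporting", "payload": request}]
        if "location" in request:
            subtasks.append({"agent": "map", "payload": request})
        if request.get("needs_dispatch"):
            subtasks.append({"agent": "logistics", "payload": request})
        return subtasks

    def handle(self, request: dict) -> Dict[str, dict]:
        results: Dict[str, dict] = {}
        for task in self.decompose(request):
            handler = self._agents.get(task["agent"])
            if handler is None:
                # Record the gap instead of failing the whole request.
                results[task["agent"]] = {"error": "no agent registered"}
                continue
            results[task["agent"]] = handler(task["payload"])
        return results
```

Keeping agents behind a name-to-handler registry means a new primary agent can be added without touching the routing loop.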

Challenges We Ran Into

While the provided document doesn't explicitly list challenges, the complexity of the system itself points to several potential hurdles that the development team likely faced:

Data Integration and Standardization: Integrating a wide variety of data sources with different formats and levels of quality is a significant technical challenge.

Real-time Processing: The need to process and act upon real-time data from sources like traffic feeds and IoT sensors requires a robust and highly available infrastructure.

Model Accuracy and Bias: Ensuring the accuracy and fairness of the AI models, especially those used for predictive and logistical tasks, is crucial to avoid negative societal impacts.

Scalability: Building a system that can scale to handle the data volume and processing demands of a large urban area is a major engineering feat.

User Adoption and Trust: Encouraging citizens to actively use the platform and trust the information and actions generated by an AI system would be a key challenge.

Core Foursquare APIs Implemented

Our application is built upon a strategic integration of Foursquare's core APIs to deliver a seamless and powerful user experience.

Places API: The Foundation of Our Data

Functionality: This is the cornerstone of DRISHTI, providing access to Foursquare's vast and detailed database of over 100 million global points of interest.

Our Implementation: We use it to fetch rich, structured data for specific venues, including names, formatted addresses, contact information, operating hours, user ratings, and photos. This allows us to populate the detailed information cards for every civic and emergency service displayed in the app.

Impact: Ensures that citizens have access to accurate and comprehensive information for essential services like hospitals, police stations, and government offices.

Search API: Enabling Location-Aware Discovery

Functionality: Provides a powerful engine to search for places based on a variety of parameters, including keywords, categories, and proximity to a user's location.

Our Implementation: When a user searches for a term like "hospital" or "library," the Search API returns a list of the most relevant venues, sorted by distance or relevance. We heavily utilize Foursquare's detailed category hierarchy to specifically query for civic-related establishments.

Impact: Makes it incredibly easy for users to find exactly what they need, when they need it, turning the app into a vital directory for local services.
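To make the location-aware search concrete, the sketch below builds (but does not send) a place search request. The endpoint URL, the parameter names (`query`, `ll`, `radius`, `limit`, `sort`), and the plain API key in the `Authorization` header are assumptions based on the v3 style of the Foursquare Places API; they should be checked against the documentation for the API version your key targets. The function name `build_search_request` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: v3-style endpoint; newer API versions may use a different host and auth scheme.
SEARCH_URL = "https://api.foursquare.com/v3/places/search"


def build_search_request(api_key, query, lat, lon, radius_m=2000, limit=10, sort="DISTANCE"):
    """Return (url, headers) for a nearby place search without performing any network call."""
    params = {
        "query": query,           # free-text term such as "hospital" or "library"
        "ll": f"{lat},{lon}",     # user's position as "latitude,longitude"
        "radius": radius_m,       # search radius in metres
        "limit": limit,           # maximum number of results requested
        "sort": sort,             # assumed values: "DISTANCE" or "RELEVANCE"
    }
    headers = {"Authorization": api_key, "Accept": "application/json"}
    return f"{SEARCH_URL}?{urlencode(params)}", headers
```

The returned pair can be passed to any HTTP client; separating request construction from sending also makes the query logic easy to unit test.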

The Elastic Stack: Powering Real-Time Search, Observability, and Generative AI

While BigQuery serves as our powerhouse for large-scale analytics and model training, the Elastic Stack provides the critical real-time capabilities essential for an operational platform like DRISHTI. It bridges the gap between historical data analysis and instant, actionable intelligence.

Elasticsearch: The Real-Time Data and AI Engine

Instant Search: Powers the sub-second, full-text search capabilities within the mobile app and for our AI agents. All incoming reports, social media mentions, and sensor data are indexed in near real-time, allowing users to instantly find relevant information with natural language queries.

Advanced Geospatial Queries: Goes beyond simple mapping by enabling complex, layered geospatial searches. For instance, an operator can ask, "Show me all road quality reports within a 500-meter radius of all hospitals," combining structured data, text, and location in a single, fast query.

Vector Search for RAG: This is the core of our "Vector DB (RAG)" component. We convert all municipal documents, such as Standard Operating Procedures (SOPs), emergency protocols, and historical incident reports, into vector embeddings and store them in Elasticsearch. When a user or agent asks a complex question (e.g., "What is the official procedure for reporting a fallen tree blocking a major road?"), Elasticsearch performs a semantic vector search to retrieve the most relevant documents. This context is then fed to the Gemini LLM, allowing it to generate an answer grounded in official city knowledge, which reduces the risk of AI hallucinations.
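The geospatial and vector searches above can be sketched as Elasticsearch request bodies built in Python. The `geo_distance` filter and the top-level `knn` search option are part of the Elasticsearch query DSL, but the field names (`description`, `location`, `embedding`, `text`), the function names, and the prompt wording are assumptions made for this example. The geospatial sketch also assumes the hospital coordinates were fetched in an earlier step.

```python
def reports_near_points(points, text, distance="500m", geo_field="location"):
    """Full-text match on reports, filtered to within `distance` of any given point."""
    near_any = [
        {"geo_distance": {"distance": distance, geo_field: {"lat": lat, "lon": lon}}}
        for lat, lon in points
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"description": text}}],
                # At least one of the per-point distance filters must match.
                "filter": [{"bool": {"should": near_any, "minimum_should_match": 1}}],
            }
        }
    }


def knn_search_body(query_vector, k=5, num_candidates=50, vector_field="embedding"):
    """Approximate nearest-neighbour search body for retrieving RAG context."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": vector_field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["title", "text"],
    }


def build_prompt(question, hits):
    """Join retrieved passages into a grounded prompt for the LLM."""
    context = "\n\n".join(hit["_source"]["text"] for hit in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Each body would be sent to the `_search` endpoint of the relevant index; the hits from the kNN search are then passed to `build_prompt` before calling the model.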

Kibana: The City's Live Command Center

Operational Dashboards: Provides city officials and department heads with live, interactive dashboards. They can visualize incoming incident reports on a map in real-time, monitor key performance indicators (KPIs) like average response time, and track resource allocation across the city.

Hotspot and Anomaly Detection: Kibana is used to create heatmaps that instantly identify emerging hotspots for specific issues (e.g., a cluster of water pipe bursts). Its machine learning features can also detect anomalies in data streams, alerting officials to unusual patterns that may require investigation.
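The idea behind hotspot heatmaps can be shown with a standalone sketch that buckets incident coordinates into a fixed grid and reports the busiest cells. This is not Kibana's implementation; the function names, the 0.005-degree cell size (roughly 500 m), and the count threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.005):
    """Map a coordinate to an integer grid cell of side `cell_deg` degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def find_hotspots(incidents, min_count=3, cell_deg=0.005):
    """Return (cell, count) pairs with at least `min_count` incidents, busiest first."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in incidents)
    return [(cell, n) for cell, n in counts.most_common() if n >= min_count]
```

For example, three water pipe burst reports within a few dozen metres of each other fall into one cell and are returned as a single hotspot, while an isolated report elsewhere is ignored.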

Beats & Logstash: The Unified Data Pipeline

Data Ingestion: The Elastic Stack provides a robust pipeline to collect, process, and ingest data from our diverse and distributed sources. Beats agents are deployed on various services to collect logs and metrics, which are then processed by Logstash and reliably sent to Elasticsearch, ensuring that all operational data is centralized, searchable, and observable.

What's Next for DRISHTI - EYE WITH AI

The future of DRISHTI will likely focus on expanding its capabilities and reach. Potential next steps could include:

Integration of More Data Sources: Incorporating data from additional public and private sector partners to create an even more comprehensive view of the city.

Enhanced Predictive Capabilities: Improving the accuracy and scope of the predictive models to anticipate and prevent a wider range of urban issues.

Greater Automation: Increasing the level of automation in resource allocation and incident response to improve efficiency.

Deployment in More Cities: Scaling the platform to be deployed in other urban centers, potentially with customizations for local needs.

Open Data Initiatives: Making anonymized data available to researchers and the public to foster further innovation and transparency.
