Inspiration
My family lives in Dubai. I study at Penn State. With escalating tensions across the Middle East, the information landscape in the region has become increasingly unreliable. When incidents occur, official channels are slow to respond. Social media fills the vacuum with a mixture of legitimate reports, recycled footage from unrelated events, and outright fabrication. Information can be delayed, fragmented, or difficult to verify, and what does surface is often hard to trust in real time.
The problem is not just distance. My family, living in Dubai, faces the same uncertainty. They receive the same unverified forwards, the same repurposed videos, the same conflicting accounts. In an environment where information is tightly controlled and the consequences of misinformation are real, there is no accessible tool that allows ordinary residents to distinguish verified reports from noise.
Veriti was built to solve that problem. It is a platform that ingests anonymous crisis signals from the public, cross-validates them against multiple data points using AI, and produces a confidence-scored assessment of what is actually happening. The core design principle: verify the content, never the identity. In a region where reporting publicly can carry personal risk, the system is architected to minimize re-identification risk by stripping metadata, coarsening location, avoiding user accounts, and deleting raw media after processing.
What it does
Veriti is a real-time crisis signal verification platform that operates across three components: a privacy-preserving Android application for anonymous field reporting, a FastAPI backend powered by Google Gemini for AI-driven verification and clustering, and a Next.js live operations dashboard for situational awareness.
Anonymous Submission Pipeline: A resident witnesses an incident and opens the Android app. There is no account creation, no login, and no persistent identifier of any kind. They capture a photo or select one from their gallery, optionally add a text description, select an incident category, and submit. Before anything leaves the device, the app strips EXIF metadata from the media, coarsens the location to a grid cell, and redacts PII from the text.
AI-Powered Cross-Validation: The backend receives the sanitized submission and runs it through Google Gemini 2.5 Flash's multimodal capabilities. The system performs three-signal cross-validation: it analyzes the uploaded image, compares it against the user's text caption, and evaluates both against the claimed geographic location. Gemini Vision produces a structured assessment including detected incident type, severity estimate, visible landmark identification, plausibility rating, and a trust modifier ranging from -0.3 to +0.3 based on signal consistency. A photo showing smoke at an airport submitted with the caption "fire at DXB" from coordinates matching Dubai International Airport receives a positive trust modifier. The same photo with a contradictory caption or mismatched coordinates receives a negative modifier.
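As a rough illustration of how signal consistency could map onto the -0.3 to +0.3 range, here is a minimal sketch. In Veriti the modifier comes from Gemini's structured assessment; the pairwise-agreement rule, the 0.1 step, and the field names below are assumptions made for illustration only.

```python
from dataclasses import dataclass

TRUST_MIN, TRUST_MAX = -0.3, 0.3  # range stated in the write-up


@dataclass(frozen=True)
class MediaAssessment:
    """Hypothetical container for the structured assessment fields."""
    incident_type: str
    severity: str
    landmarks: tuple[str, ...]
    plausibility: str
    trust_modifier: float


def clamp_trust(value: float) -> float:
    """Keep a modifier inside the allowed range, whatever the model returned."""
    return max(TRUST_MIN, min(TRUST_MAX, value))


def consistency_modifier(image_matches_caption: bool,
                         image_matches_location: bool,
                         caption_matches_location: bool) -> float:
    """Assumed rule: +0.1 for each agreeing signal pair, -0.1 for each conflict."""
    pairs = (image_matches_caption, image_matches_location, caption_matches_location)
    raw = sum(0.1 if agrees else -0.1 for agrees in pairs)
    return clamp_trust(round(raw, 2))
```

Under this assumed rule, the "fire at DXB" photo with a matching caption and matching coordinates lands at the top of the range, while the same photo with a contradictory caption and mismatched coordinates drops below zero.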
Clustering and Confidence Scoring: Independent submissions are grouped by geographic grid cell and time window into incidents. The confidence scoring engine evaluates report count, media analysis trust modifiers, cross-validation consistency, and official source overlap to assign one of four tiers: unverified (single weak signal), plausible (some corroborating evidence), corroborated (multiple independent reports align), or official (matches a verified government source). Confidence escalates automatically as independent reports accumulate.
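The grouping and tiering steps can be sketched roughly as follows. The grid size, the time window, the score weights, and the thresholds are all assumptions chosen for illustration; the write-up does not state the values Veriti uses. Fixed time buckets are also a simplification, since two reports on either side of a bucket boundary would land in different incidents.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

GRID_DEG = 0.01       # assumed cell size in degrees (roughly 1 km north-south)
WINDOW_SEC = 30 * 60  # assumed clustering window


@dataclass(frozen=True)
class Report:
    lat: float
    lon: float
    timestamp: float       # Unix seconds
    trust_modifier: float  # from media analysis, -0.3 to +0.3


def cluster_key(report: Report) -> tuple[int, int, int]:
    """Bucket a report by grid cell and time window."""
    return (math.floor(report.lat / GRID_DEG),
            math.floor(report.lon / GRID_DEG),
            int(report.timestamp // WINDOW_SEC))


def cluster(reports: list[Report]) -> dict[tuple[int, int, int], list[Report]]:
    """Group reports that share a grid cell and time window into incidents."""
    incidents = defaultdict(list)
    for report in reports:
        incidents[cluster_key(report)].append(report)
    return dict(incidents)


def confidence_tier(reports: list[Report], official_overlap: bool) -> str:
    """Assign a tier from report count, average trust, and official overlap."""
    if official_overlap:
        return "official"
    avg_trust = sum(r.trust_modifier for r in reports) / len(reports)
    score = min(len(reports), 5) * 0.2 + avg_trust  # hypothetical weighting
    if score >= 0.6:
        return "corroborated"
    if score >= 0.3:
        return "plausible"
    return "unverified"
```

Calling `confidence_tier` only after `cluster` has absorbed the newest report keeps the score in step with the current incident state.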
Official Source Ingestion: Operators can paste raw text from official sources such as Dubai Civil Defence or NCEMA statements. Gemini parses the unstructured text, extracts the incident type, geographic location (resolved to coordinates using a built-in Dubai landmark reference list), and severity, and generates a structured summary. The system then cross-references the official alert against existing public incidents. Any public reports within the same or adjacent grid cells automatically have their official_overlap flag set to true and their confidence scores recomputed, visibly upgrading them on the dashboard.
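The "same or adjacent grid cell" check might look something like this. The `Incident` class and function names are hypothetical; only the `official_overlap` flag comes from the description above.

```python
from dataclasses import dataclass

Cell = tuple[int, int]  # (row, col) index of a geographic grid cell


@dataclass
class Incident:
    incident_id: int
    cell: Cell
    official_overlap: bool = False


def same_or_adjacent(a: Cell, b: Cell) -> bool:
    """True when b is a itself or one of its eight surrounding cells."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def apply_official_alert(alert_cell: Cell, incidents: list[Incident]) -> list[Incident]:
    """Flag overlapping incidents and return those whose score needs recomputing."""
    updated = []
    for incident in incidents:
        if not incident.official_overlap and same_or_adjacent(alert_cell, incident.cell):
            incident.official_overlap = True
            updated.append(incident)
    return updated
```

Returning only the newly flagged incidents lets the caller recompute confidence for exactly those records instead of rescoring everything.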
Live Operations Dashboard: A Next.js 14 dashboard displays all active incidents on a Leaflet map of Dubai with confidence-tier-coded markers. A sidebar provides incident details including report count, AI-generated summaries, confidence explanations, and timestamp information. The dashboard auto-refreshes on a polling interval.
Audio Situational Briefing: An integrated audio briefing system generates spoken summaries of all active incidents. The backend uses Gemini to compose a natural-language briefing script, then sends it to the ElevenLabs text-to-speech API using the Turbo v2.5 model for low-latency generation. The resulting audio streams directly to the browser. This provides accessibility for users who cannot view a screen during a crisis situation.
How we built it
Backend — Python / FastAPI / SQLite: The API server is built on FastAPI with SQLAlchemy ORM and SQLite for persistence. Pydantic v2 handles request validation and response serialization. The backend is structured as a service-oriented architecture with dedicated modules for ingestion, verification, clustering, scoring, and AI services. Background processing runs the full verification pipeline asynchronously after submission receipt.
AI Pipeline — Google Gemini 2.5 Flash: All AI capabilities run through the Google google-genai Python SDK. Gemini serves multiple distinct functions in the pipeline:
- Multimodal cross-validation — image + text + location analysis via generate_content with Part.from_bytes for image data
- Incident type classification — constrained classification into 7 canonical types
- Natural language summary generation — context-aware incident summaries incorporating report count, confidence tier, media analysis, and cross-validation results
- Confidence explanation generation — human-readable explanations of why an incident received its current rating
- Official source parsing — unstructured text extraction into structured incident data with coordinate resolution
- Three-signal cross-validation — image/video evidence + user caption + claimed location analysis
- Audio briefing script generation — natural-language spoken briefing generation for active incidents
Duplicate Detection — Perceptual Hashing: The imagehash library generates perceptual hashes (pHash) for all submitted images. These fingerprints enable duplicate and near-duplicate detection across submissions without retaining the original media. Copies of the same image that have been resized, recompressed, or lightly edited produce similar hashes, allowing the system to identify recycled footage.
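Near-duplicate matching on stored fingerprints comes down to a Hamming distance between hashes. The sketch below works on hex strings, which is the form `str(imagehash.phash(image))` produces; the threshold of 8 differing bits and the function names are assumptions, not values taken from Veriti.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two hex-encoded perceptual hashes."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def find_near_duplicates(new_hash: str, known_hashes: list[str],
                         max_distance: int = 8) -> list[str]:
    """Return stored hashes within max_distance bits of the new one."""
    return [h for h in known_hashes if hamming_distance(new_hash, h) <= max_distance]
```

Because only the hex fingerprint is compared, this check keeps working after the raw image has been deleted.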
Mobile — Kotlin / Jetpack Compose: The Android application is built entirely in Kotlin using Jetpack Compose for the UI layer. The on-device privacy pipeline is implemented in a dedicated LocalPipeline module that executes before any network transmission. Device integrity scoring uses Android SDK APIs exclusively with no external dependencies: Build fingerprint analysis for emulator detection, filesystem checks for root detection, Debug.isDebuggerConnected() plus the app debuggable flag for debugger detection, PackageManager.getInstallSourceInfo() for install source verification, Settings.canDrawOverlays() plus installed overlay-permission checks for overlay detection, and Settings.Secure.ALLOW_MOCK_LOCATION plus installed mock-location-permission checks for mock location detection. The app communicates with the backend over HTTP using multipart form data, connected during development via ADB reverse port forwarding.
Frontend — Next.js 14 / Leaflet / Tailwind CSS: The operations dashboard is a Next.js 14 application with React-Leaflet for map rendering and Tailwind CSS for styling. Leaflet CircleMarkers represent incidents with dynamic radius and color based on confidence tier. Map popups display incident details with emoji-coded type indicators. The dashboard implements automatic polling for real-time updates.
Text-to-Speech — ElevenLabs API: Audio briefings use the ElevenLabs REST API via httpx. The integration uses the eleven_flash_v2_5 model. Audio is generated as MP3 and streamed to the browser as an inline audio response.
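A text-to-speech request along these lines can be assembled before handing it to httpx. This is a sketch only: `build_tts_request` is a hypothetical helper, the voice ID and API key are placeholders, and the model ID is passed in rather than fixed.

```python
def build_tts_request(script: str, voice_id: str, api_key: str, model_id: str) -> dict:
    """Assemble the URL, headers, and JSON body for an ElevenLabs TTS call."""
    return {
        "url": f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        "headers": {"xi-api-key": api_key, "Accept": "audio/mpeg"},
        "json": {"text": script, "model_id": model_id},
    }


# Usage with httpx (not executed here, since it needs a real key and network):
#   req = build_tts_request(script, voice_id, api_key, "eleven_flash_v2_5")
#   response = httpx.post(req["url"], headers=req["headers"], json=req["json"])
#   mp3_bytes = response.content
```

Keeping request construction separate from the network call makes the payload easy to check without spending API credits.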
Privacy Infrastructure: The privacy system spans both client and server. On-device: EXIF stripping via Android ExifInterface, coordinate coarsening to configurable grid cells, regex-based PII redaction, and local heuristic device integrity scoring. On-server: video metadata removal via ffmpeg -map_metadata -1 -c copy, raw media deletion in a finally block after pipeline completion, integrity token sanitization to status-only strings for public submissions, in-memory rate limiting with automatic TTL-based expiry, and zero persistent user identifiers in the database schema.
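Two of the server-side guarantees, deletion in a finally block and TTL-based rate limiting, can be sketched with the standard library alone. The class, function, and parameter names are hypothetical, and what the rate-limit key actually is in Veriti is not specified here.

```python
import os
import time


class TTLRateLimiter:
    """In-memory limiter: at most `limit` hits per key within `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self._hits: dict[str, list[float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        # Expire old timestamps and forget keys with no live hits,
        # so nothing lingers in memory past the window.
        for k in list(self._hits):
            self._hits[k] = [t for t in self._hits[k] if now - t < self.window]
            if not self._hits[k]:
                del self._hits[k]
        hits = self._hits.setdefault(key, [])
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def process_media(path: str, pipeline) -> dict:
    """Run the verification pipeline, then delete the raw file no matter what."""
    try:
        return pipeline(path)
    finally:
        # Runs on success and on failure, so raw media never outlives processing.
        if os.path.exists(path):
            os.remove(path)
```

Since the limiter state lives only in process memory, a restart clears it entirely, which matches the goal of keeping rate-limit state ephemeral.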
Challenges we ran into
The most significant technical challenge was a silent failure in the Gemini Vision pipeline. The analyze_and_cross_validate function wrapped the API call in a broad except Exception block that returned a fallback response with type: "unknown" and trust_modifier: 0.0. Every submission appeared to process successfully, but no image analysis was actually reaching the final incident record. Diagnosing this required tracing the full pipeline execution path across four service modules. The root cause involved exception handling that obscured failures and schema normalization issues that pushed valid Gemini outputs into fallback categories such as "unknown."
The confidence escalation system initially had an ordering bug where score updates did not always reflect the latest clustered submission state. The incident would cluster correctly, but the score would reflect the previous state. This was particularly difficult to identify because the system appeared functional: incidents were created and updated, but the confidence tier never progressed beyond "unverified."
Integrating the Android app with a local development backend required ADB reverse port forwarding (adb reverse tcp:8000 tcp:8000) and careful handling of network security configuration to allow cleartext HTTP during development. Connection timeouts and request failures during early testing were caused by the phone's network stack attempting to route to a public address rather than the tunneled localhost.
The Leaflet and Tailwind CSS integration produced a rendering conflict where CSS transforms applied by Tailwind's animation utilities caused map markers to detach from their geographic coordinates during zoom operations. The fix required isolating the pulse animation to use only opacity changes rather than scale transforms.
Accomplishments that we're proud of
The privacy architecture withstands scrutiny. A full audit of the codebase confirmed: raw media is deleted post-processing, integrity tokens are reduced to status indicators, text notes are PII-scrubbed, video metadata is stripped, rate limiter state is ephemeral, no IP addresses are persisted, and no field in the database schema can be used to link a submission to an individual. The system is designed so that even with database access, the retained data offers minimal leverage for re-identifying a reporter. In the context this platform is built for, that property is not optional.
The three-signal cross-validation pipeline represents a novel approach to anonymous content verification. Rather than relying on user reputation or identity-based trust, the system derives trust from internal consistency across independent data channels. This allows confidence scoring to function in a fully anonymous environment where traditional trust signals are unavailable by design.
The official source corroboration flow demonstrates the platform's core value proposition in a single interaction. An unverified public report exists on the map. An operator pastes an official government statement. Gemini parses it. A verified incident appears. The public report's confidence automatically escalates. Two independent information streams, anonymous public signals and official government communications, converge through the same trust engine.
What we learned
Exception handling strategy has outsized impact on system debuggability. Broad exception suppression with silent fallbacks creates systems that appear functional while producing incorrect results. Every external service call, particularly to AI APIs with variable response formats, requires explicit error logging with full context before any fallback logic executes.
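One way to keep a fallback without hiding the failure is to log the exception with context first and mark the result as degraded. This is a minimal sketch; the logger name, the `analysis_failed` flag, and the function name are assumptions, while the fallback values mirror the ones described earlier.

```python
import logging

logger = logging.getLogger("veriti.vision")  # hypothetical logger name

FALLBACK = {"type": "unknown", "trust_modifier": 0.0}


def analyze_with_logged_fallback(call_model, submission_id: str) -> dict:
    """Call the model; on failure, log the traceback before returning a fallback."""
    try:
        result = call_model()
    except Exception:
        # logger.exception records the full traceback alongside the message.
        logger.exception("vision analysis failed for submission %s", submission_id)
        return {**FALLBACK, "analysis_failed": True}
    return {**result, "analysis_failed": False}
```

The explicit `analysis_failed` flag lets downstream code tell a genuine "unknown" classification apart from an analysis that never ran.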
Prompt engineering for structured JSON output from large language models requires iterative refinement. Gemini's responses varied in format, field naming, and value ranges across calls. Achieving consistent, parseable output required explicit JSON schema specification in the prompt, enumerated value constraints, and a normalization layer that handles edge cases in the response.
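A normalization layer of this kind might look roughly like the sketch below. The allowed types and alias table are placeholders rather than Veriti's actual canonical list, and the alternate field name `incident_type` is an assumed example of the naming drift described above.

```python
import json
import re

ALLOWED_TYPES = {"fire", "flood", "explosion"}       # placeholder subset
ALIASES = {"blaze": "fire", "flooding": "flood"}      # hypothetical synonyms


def extract_json(text: str) -> dict:
    """Parse a model reply that may wrap its JSON object in extra text or a fence."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object in model reply")
    return json.loads(match.group(0))


def normalize(raw: dict) -> dict:
    """Map variable field names and values onto one fixed schema."""
    incident_type = str(raw.get("type") or raw.get("incident_type") or "").strip().lower()
    incident_type = ALIASES.get(incident_type, incident_type)
    if incident_type not in ALLOWED_TYPES:
        incident_type = "unknown"
    try:
        modifier = float(raw.get("trust_modifier", 0.0))
    except (TypeError, ValueError):
        modifier = 0.0
    # Clamp to the documented -0.3 to +0.3 range.
    return {"type": incident_type, "trust_modifier": max(-0.3, min(0.3, modifier))}
```

Mapping known synonyms before falling back to "unknown" addresses the failure mode where valid model output was pushed into the fallback category.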
Privacy-first architecture is a design constraint that must be applied from the foundation, not retrofitted. Every data flow, every stored field, every log statement must be evaluated against the question: "Does this create a vector for re-identification?" This constraint produced a more disciplined architecture overall, but it requires continuous auditing as new features are added.
The most effective hackathon projects solve problems the builder personally understands. Every architectural decision in Veriti was informed by a concrete question: would this help my family understand what is actually happening around them? That clarity of purpose eliminated ambiguity in prioritization and kept the scope focused on what matters.
What's next for Veriti
Automated source ingestion. The official alerts endpoint is designed as a generic intake interface. Connecting automated scrapers, RSS pollers, or webhook receivers to this endpoint would enable continuous ingestion from official government feeds, news wire services, and verified social media accounts without architectural changes.
Real-time push notifications. The audio briefing infrastructure provides the foundation for targeted alerting. Extending the system with area-of-interest subscriptions and WebSocket-based push delivery would enable residents to receive immediate spoken or visual alerts when incidents are detected in their vicinity.
Multi-city deployment. The clustering engine, confidence scoring system, and AI pipeline are location-agnostic. The Dubai-specific components (the landmark coordinate database and the Gemini prompt context) are isolated and configurable. Adapting the platform to other cities experiencing similar information challenges requires only reference data substitution.
Community verification layer. Enabling dashboard and app users to confirm or deny existing incidents would add a crowd-consensus signal to the trust engine. Confirmations from independent device contexts would function as additional corroborating reports. Denials would apply negative trust modifiers. This creates a feedback loop that improves scoring accuracy over time without compromising anonymity.
Built With
- adb
- android
- elevenlabs-api
- elevenlabs-turbo-v2.5
- fastapi
- ffmpeg
- gemini-2.5-flash
- google-genai-sdk
- httpx
- imagehash
- jetpack-compose
- kotlin
- leaflet.js
- next.js-14
- pillow
- pydantic
- python
- react
- sqlalchemy
- sqlite
- tailwind-css