About the Project
Inspiration
The 24‑hour news cycle produces millions of headlines that are hard to parse in isolation. We wanted a single, visual dashboard that surfaces where things are happening and why they matter—without forcing users to read hundreds of articles. GlobaLens was born from that need: turning raw, real‑time data into instant global awareness.
What it does
- Streams the GDELT public dataset hourly and plots each event on a 3D globe.
- Summarises every article with Vertex AI, distilling paragraphs into bite‑sized insights.
- Searches semantically with MongoDB Atlas Vector Search, so a query like “election protests” returns relevant results even when the articles never contain that exact phrase.
- Filters the timeline with a Select Date Range option to explore events within custom time windows.
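
To illustrate why semantic search can match articles that never contain the query phrase, here is a minimal sketch of ranking by cosine similarity between embeddings. The three-dimensional vectors, the titles, and the helper names (`cosine_similarity`, `rank_by_similarity`) are made up for illustration; the real embeddings come from the Sentence‑Transformers model and the ranking is done by Atlas Vector Search, not by this code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, docs, k=3):
    """Return the k docs whose 'embedding' is most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, d["embedding"]), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Toy embeddings, invented for this example only.
docs = [
    {"title": "Voters rally after disputed ballot count", "embedding": [0.9, 0.8, 0.1]},
    {"title": "Central bank holds interest rates", "embedding": [0.1, 0.2, 0.9]},
    {"title": "Storm closes coastal highways", "embedding": [0.0, 0.1, 0.3]},
]

# Stand-in for the embedding of the query "election protests".
query = [1.0, 0.7, 0.0]
top = rank_by_similarity(query, docs, k=1)
```

In this toy data the first headline ranks highest even though it shares no words with the query, because its vector points in nearly the same direction.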
How we built it
- Ingestion Pipeline – BigQuery + Cloud Functions fetch GDELT CSVs, enrich them, and store the results in Google Cloud Storage (GCS).
- NLP Enrichment – Vertex AI generates summaries and sentiment, while a Sentence‑Transformers model creates embeddings.
- Storage Layer – MongoDB Atlas stores JSON docs plus a vector index for K‑NN search.
- Backend – Flask provides REST endpoints and vector‑search queries.
- Frontend – Vite/React renders an interactive globe (react‑globe.gl) with Tailwind styling and a chat‑style search panel.
- CI/CD – GitHub Actions builds Docker images and deploys Cloud Functions via Terraform.
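
As a rough sketch of how the backend could combine vector search with the date-range filter, the function below builds an Atlas `$vectorSearch` aggregation pipeline. The index name `event_vector_index` and the field names `embedding`, `event_date`, `title`, and `summary` are assumptions, not the project's actual schema, and the date filter only works if `event_date` is declared as a filter field in the vector index definition.

```python
from datetime import datetime


def build_vector_search_pipeline(query_vector, start, end, limit=10):
    """Build an aggregation pipeline for a date-filtered k-NN search.

    All index and field names here are hypothetical placeholders.
    """
    return [
        {
            "$vectorSearch": {
                "index": "event_vector_index",  # assumed index name
                "path": "embedding",            # assumed embedding field
                "queryVector": query_vector,
                # Consider more candidates than we return to improve recall.
                "numCandidates": limit * 10,
                "limit": limit,
                # Pre-filter on an assumed date field indexed as a filter.
                "filter": {"event_date": {"$gte": start, "$lte": end}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "summary": 1,
                # Expose the similarity score computed by Atlas.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline(
    query_vector=[0.12, -0.03, 0.44],  # would come from the embedding model
    start=datetime(2024, 1, 1),
    end=datetime(2024, 1, 31),
    limit=5,
)
```

A Flask endpoint could then pass this list to a PyMongo collection's `aggregate()` method and return the resulting documents as JSON.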
Challenges we ran into
- Real‑time scale – GDELT emits more than 100 K events per day; batching and indexing had to stay under free‑tier limits.
- Unreliable data source – Fetching CSVs directly from GDELT became unreliable due to intermittent site outages.
- Mixed‑language content – Ensuring summaries worked across 65+ languages required translation fallbacks.
- Frontend performance – Rendering tens of thousands of points crashed browsers until we implemented dynamic level‑of‑detail and WebGL instancing.
- Cold starts – Cloud Functions sometimes exceeded latency targets; we mitigated with min‑instances and caching.
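
One common way to implement level‑of‑detail for point clouds is grid aggregation: when the camera is far away, nearby events are merged into a single marker carrying a count. The sketch below shows that idea under stated assumptions; it is not necessarily how GlobaLens does it, the altitude thresholds are arbitrary, and in practice this logic might live in the React frontend rather than in Python.

```python
from collections import defaultdict


def aggregate_points(points, cell_deg):
    """Merge (lat, lon) points into one marker per cell_deg x cell_deg cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        # Floor division assigns each point to a grid cell index.
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))

    markers = []
    for members in cells.values():
        n = len(members)
        markers.append({
            "lat": sum(p[0] for p in members) / n,  # centroid of the cell's points
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return markers


def cell_size_for_altitude(altitude):
    """Hypothetical mapping from camera altitude to grid cell size in degrees."""
    if altitude > 2.0:
        return 10.0   # zoomed far out: coarse cells
    if altitude > 0.5:
        return 2.0
    return 0.25       # zoomed in: nearly individual events


points = [(48.85, 2.35), (48.86, 2.36), (-33.9, 151.2)]
markers = aggregate_points(points, cell_size_for_altitude(3.0))
```

With the coarse 10-degree grid, the two nearby points collapse into one marker with a count of 2, so the renderer draws far fewer objects at global zoom levels.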
Accomplishments that we’re proud of
- Shipped an end‑to‑end pipeline in 48 hours (hackathon deadline!).
- Achieved sub‑second semantic search over 30 K+ events.
- Visualised linked protests across three continents—insights not obvious from headlines alone.
- Maintained a zero‑ops serverless stack (no VMs to babysit).
What we learned
- Vector databases turn search into discovery—you don’t know what you’re missing until embeddings connect the dots.
- Geospatial + NLP is a powerful combo for storytelling.
- Spending time on DX (dev experience)—pre‑commit hooks, Docker Compose—saves hours when teammates join late.
What’s next for GlobaLens
- Event clustering & heat‑maps to highlight hotspots.
- User alerts (email/SMS) for custom triggers like “earthquake > 6.5”.
- Collaborative annotations so journalists can attach notes and share filtered views.
- PWA & offline mode for low‑bandwidth regions.
- Multi‑lingual UI with on‑device translation for privacy.
