DreamStreets: GPT-OSS Powered Geospatial Insights

DreamStreets leverages GPT-OSS-120b to:

  • Generate and execute NetworkX graph algorithms for network analysis
  • Create spatial DuckDB SQL queries for POI and facility analysis
  • Perform chain-of-thought reasoning to break down complex urban planning questions
  • Output actionable insights for both urban planning and humanitarian contexts

Inspiration

During my UNICEF internship, I built Python tools to map healthcare accessibility using OpenStreetMap and NetworkX, calculating distances between schools and health facilities across developing nations. While technically successful, these tools required users to write graph algorithms and SQL queries for each type of question, manually translating every geospatial question into a graph problem: a barrier to entry for urban planners and humanitarian specialists who lack programming backgrounds.

The release of GPT-OSS-120b offered a solution: a 120-billion parameter model that could translate natural language directly into sophisticated geospatial analysis while running completely offline. This offline capability is crucial not just for areas with unreliable internet, but also for sensitive humanitarian data that must remain on local systems due to privacy regulations and security concerns.

DreamStreets bridges the gap between complex GIS capabilities and non-technical users by transforming questions like, "Which areas would be isolated if this road floods?" into NetworkX algorithms and spatial queries automatically. It democratizes access to the same network analysis I coded manually at UNICEF, making geographic intelligence accessible without technical prerequisites.

It extends my previous project AskStreets with GPT-OSS-120b's chain-of-thought reasoning; richer analytical output written in the persona of a geospatial analyst; offline operation; DuckDB as a fast in-memory database; built-in schema discovery; and generated visualization code in place of templating.

What It Does

DreamStreets transforms natural language queries into advanced street network analysis using GPT-OSS-120b. Users ask questions like, "Which intersections would isolate communities if flooded?" and the system automatically generates NetworkX graph algorithms, spatial SQL queries, and interactive maps. Once a user brings in their own geospatial datasets, DreamStreets discovers the schema of their DuckDB database and graph network and uses it as the basis for its geospatial computations.
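As a hedged sketch of the kind of NetworkX code the system generates for a question like "Which areas would be isolated if this road floods?" (node names and the toy network here are illustrative, not from the real datasets):

```python
# Illustrative sketch: flood-isolation analysis of the sort DreamStreets
# generates. Node names are made up; real graphs come from OSMnx.
import networkx as nx

# Toy street network: nodes are intersections, edges are road segments.
G = nx.Graph()
G.add_edges_from([
    ("hospital", "bridge"), ("bridge", "village"),
    ("hospital", "market"), ("market", "school"),
])

# Simulate the flood by removing the affected road segment.
flooded = G.copy()
flooded.remove_edge("bridge", "village")

# Any node no longer reachable from the hospital is isolated.
reachable = nx.node_connected_component(flooded, "hospital")
isolated = set(flooded.nodes) - reachable
print(sorted(isolated))  # → ['village']
```

The same pattern scales to OSMnx-loaded city graphs, where the removed edge is the flooded road and the reachability source is an emergency facility.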

The proof-of-concept presented in the Jupyter Notebook submission (dreamstreets.ipynb) demonstrates analysis of real street networks and spatial databases built from OpenStreetMap, from NYC's Chinatown to Rohingya refugee camps in Bangladesh. The system identifies optimal emergency service locations, finds infrastructure vulnerabilities, and visualizes results on interactive maps, all without requiring users to write a single line of code or make online API calls.

Follow the steps in quickstart.md if you'd like to try it out for yourself. The notebook prepare_data.ipynb demonstrates dataset creation for the DuckDB database and the graph network (this step must be run online) and can be adapted to other locations.

How We Built It

I integrated GPT-OSS-120b with a LangGraph ReAct agent architecture, creating three specialized tools for the model to orchestrate: a network analyst (generates NetworkX algorithms), a database analyst (writes spatial SQL for DuckDB), and a map visualizer (creates Folium visualizations). The model's chain-of-thought reasoning decomposes complex queries into tool calls, combining LangGraph's agent orchestration with GPT-OSS's native tool-calling capabilities via Ollama.
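A minimal, stdlib-only sketch of the three-tool pattern described above (in the real system, LangGraph's ReAct agent lets the model itself decide which tool to call; here a hard-coded plan and fake results stand in for that reasoning loop, and all names are illustrative):

```python
# Stdlib-only sketch of the three-tool orchestration pattern. The function
# names mirror the DreamStreets tools, but bodies and results are fakes.

def network_analyst(query: str) -> str:
    """Would generate and run a NetworkX algorithm for `query`."""
    return "closeness centrality computed for 4 intersections"

def database_analyst(query: str) -> str:
    """Would write and execute spatial DuckDB SQL for `query`."""
    return "3 health facilities found within 1 km"

def map_visualizer(query: str) -> str:
    """Would emit a Folium map visualizing the results."""
    return "map.html written"

TOOLS = {
    "network_analyst": network_analyst,
    "database_analyst": database_analyst,
    "map_visualizer": map_visualizer,
}

# A decomposed plan, as the model's chain-of-thought might produce it.
plan = ["database_analyst", "network_analyst", "map_visualizer"]
results = [TOOLS[step]("Which clinics are hardest to reach?") for step in plan]
print(results[-1])  # → map.html written
```

The registry-plus-plan shape is what the agent framework provides for real: the LLM emits tool calls, and LangGraph routes them to the matching Python function.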

The choice of the 120-billion parameter model was driven by exploratory curiosity. Having never worked with such a large model on high-end GPUs before, I wanted to test its limits and see what level of complex, zero-shot reasoning was possible. The entire system runs locally via Ollama: street networks are loaded from OpenStreetMap using OSMnx and stored in DuckDB with spatial extensions, ensuring true offline capability once the geospatial data is in place. As a bonus, I enabled GPU acceleration for NetworkX via the nx-cugraph library for faster graph algorithm performance.

I deployed DreamStreets on a RunPod pod with the RAPIDS AI 25.10a CUDA 12.9 Docker template, installing Ollama and the additional Python requirements within the pod and pulling in the GPT-OSS model. While installing these requirements required online access, reusing the same network storage volume let me move the pre-installed libraries and the model between pods without pulling them in again. The RunPod pod represents the sort of production environment DreamStreets might run in.

Challenges We Ran Into

The project was constrained by limited time (I joined the hackathon with only 12 days remaining to go from concept to MVP) and a limited budget for GPU resources, relying on RunPod for access to an Nvidia A100, so each iteration and test had to be carefully considered. There was much trial and error in identifying the optimal RunPod template (I settled on RAPIDS, which is well established for scientific computing), choosing an LLM serving framework (I landed on Ollama after issues with GPT-OSS native tool calling in vLLM), and picking the right database (I weighed PostGIS against DuckDB and opted for the latter for its reduced complexity and setup time).

I intentionally left some of the "messy bits" in the notebook, such as instances where the agent would loop or the LLM would generate faulty syntax (in one case, misaligning a map visualization). I wanted to be as transparent as possible about the current limitations of LLM agents and the current state of DreamStreets.

The ReAct agent occasionally entered exploration loops, continuing to query the database even after finding the necessary information. In addition, GPT-OSS-120b would sometimes generate syntax errors, whether by garbling Unicode in strings, causing execution failures, or by showing a strong bias toward PostGIS syntax over the required DuckDB functions; careful prompt engineering was needed to correct both.

Accomplishments That We're Proud Of

I achieved fast response times for complex geospatial queries on a 120B-parameter model, generating sophisticated reports that answer questions across several problem domains. The system successfully identifies critical humanitarian insights, such as optimal evacuation routes in Cox's Bazar refugee camps and emergency service coverage gaps in Chinatown.

GPT-OSS-120b reliably generates sophisticated NetworkX algorithms (closeness centrality, multi-source Dijkstra) from natural language, without users needing to know these concepts exist. The model also demonstrates robust self-correction. When encountering an error like a non-existent node ID, it automatically retries with a different approach, such as finding the nearest valid node in the network. This self-recovery mechanism ensures that queries can complete even when initial assumptions are flawed.
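The nearest-valid-node fallback described above can be sketched as follows; the helper name, coordinates, and tiny graph are illustrative (OSMnx-style graphs store node positions in 'x'/'y' attributes, which is assumed here):

```python
# Sketch of the self-correction step: if a requested node ID is missing,
# snap to the nearest valid node instead of failing. All values illustrative.
import math
import networkx as nx

G = nx.Graph()
G.add_node(1, x=0.0, y=0.0)
G.add_node(2, x=1.0, y=0.0)
G.add_edge(1, 2, length=1.0)

def resolve_node(G, node_id, x, y):
    """Return node_id if present, else the node nearest to (x, y)."""
    if node_id in G:
        return node_id
    return min(
        G.nodes(data=True),
        key=lambda n: math.hypot(n[1]["x"] - x, n[1]["y"] - y),
    )[0]

# Node 99 does not exist, so the query snaps to the closest intersection.
print(resolve_node(G, 99, 0.9, 0.1))  # → 2
```

In the agent, this kind of retry happens at the reasoning level rather than in a fixed helper: the model sees the error, revises its plan, and re-issues the tool call.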

Most importantly, the system works: real queries produce actionable intelligence for business and humanitarian planning.

What We Learned

GPT-OSS-120b's reasoning capabilities far exceed simple code generation: it understands spatial relationships, decomposes complex problems, and self-corrects. The humanitarian sector needs accessible GIS tools, and this project proves that the gap between available data and technical expertise can be bridged by large language models. The importance of prompt engineering cannot be overstated: small changes in tool descriptions dramatically affected success rates.

What's Next for DreamStreets

Future work will focus heavily on scalability and accessibility: adapting the system to run on lower-end machines, optimizing it to handle much larger datasets (such as entire cities or regions, aided by GPU acceleration), and collaborating with data scientists to streamline integration with their GIS workflows.

This includes but is not limited to:

  • Fine-tuning a smaller, more efficient model (such as GPT-OSS-20b) specifically on humanitarian GIS terminology, NetworkX/OSMnx code, and DuckDB spatial functions
  • Integrating with other spatial databases such as PostGIS
  • Expanding data sources to include health facilities, schools, and other critical infrastructure
  • Building partnerships with UNHCR and urban planning departments for field testing
  • Developing mobile deployment strategies for rugged devices in disaster zones
  • Open-sourcing prompt templates and tool architectures to help others build accessible GIS solutions

Built With

  • duckdb
  • gpt-oss
  • jupyter
  • langchain
  • langgraph
  • networkx
  • osmnx
  • rapids