A geocoder is a service for matching addresses to geographic locations. Geocoders use both geospatial queries and full text search to resolve incoming data to addresses and locations from a validated set of addresses.

For example, if a developer wants to resolve TIMES SQ MANHATTAN to a full address with coordinates, they may make a request against a forward geocoding API. This API will likely apply a full text search algorithm against a known database of addresses and return a list of potential matches (e.g. TIMES SQ MANHATTAN -> ["5 TIMES SQUARE MANHATTAN 10036", "42 TIMES SQUARE MANHATTAN 10036"]).
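
To make the forward case concrete, here is a minimal in-memory sketch of full text matching. It is not the service's code: the abbreviation table, the function names, and the "every token must match" rule are all assumptions chosen for illustration, and a real full text engine does far more (stemming, scoring, fuzzy matching).

```python
from collections import defaultdict

# Hypothetical abbreviation table; a real geocoder would need a much larger one.
ABBREVIATIONS = {"SQ": "SQUARE", "CIR": "CIRCLE", "ST": "STREET", "AVE": "AVENUE"}


def tokenize(text: str) -> list[str]:
    """Uppercase, split on whitespace, and expand known abbreviations."""
    return [ABBREVIATIONS.get(tok, tok) for tok in text.upper().split()]


def build_index(addresses: list[str]) -> dict[str, set[int]]:
    """Map each token to the positions of the addresses that contain it."""
    index = defaultdict(set)
    for i, address in enumerate(addresses):
        for tok in tokenize(address):
            index[tok].add(i)
    return index


def forward_geocode(query: str, addresses: list[str], index: dict[str, set[int]]) -> list[str]:
    """Return every address that contains all of the query's tokens."""
    matches = None
    for tok in tokenize(query):
        hits = index.get(tok, set())
        # Intersect with the running result; "&" builds a new set each time.
        matches = hits if matches is None else matches & hits
    return [addresses[i] for i in sorted(matches or set())]


ADDRESSES = [
    "5 TIMES SQUARE MANHATTAN 10036",
    "42 TIMES SQUARE MANHATTAN 10036",
    "2 COLUMBUS CIR MANHATTAN 10019",
]
INDEX = build_index(ADDRESSES)

print(forward_geocode("TIMES SQ MANHATTAN", ADDRESSES, INDEX))
# ['5 TIMES SQUARE MANHATTAN 10036', '42 TIMES SQUARE MANHATTAN 10036']
```

Expanding SQ to SQUARE before the lookup is what lets the abbreviated query match the validated addresses.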

Alternatively, if a developer wants to resolve (40.768044, -73.982372), they can use a reverse geocoding API. A reverse geocoder uses geospatial search to return validated locations near the requested point (e.g. (40.768044, -73.982372) -> 2 COLUMBUS CIR MANHATTAN 10019).
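
The reverse case can be sketched as a radius search using the haversine great-circle distance. This is a brute-force illustration under assumptions: the record coordinates are rough placeholders rather than validated data, the 100 m default radius is arbitrary, and a real geospatial index would avoid scanning every record.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def reverse_geocode(lat: float, lon: float, records: dict, radius_m: float = 100.0) -> list:
    """Return (distance_m, address) pairs within radius_m, nearest first."""
    hits = []
    for address, (rec_lat, rec_lon) in records.items():
        distance = haversine_m(lat, lon, rec_lat, rec_lon)
        if distance <= radius_m:
            hits.append((distance, address))
    return sorted(hits)


# Placeholder coordinates for illustration only (not taken from the dataset).
RECORDS = {
    "2 COLUMBUS CIR MANHATTAN 10019": (40.767900, -73.982000),
    "5 TIMES SQUARE MANHATTAN 10036": (40.755600, -73.987400),
}

for distance, address in reverse_geocode(40.768044, -73.982372, RECORDS):
    print(f"{address} ({distance:.0f} m away)")
```

With these placeholder points, only the Columbus Circle record falls inside the radius; the Times Square record is more than a kilometre away.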

Over the last few days, I built out a geocoder that uses Redis Search to implement both forward and reverse geocoding against approximately 1 million New York City addresses.
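
For readers unfamiliar with Redis Search, the sketch below shows roughly what the index and the two query types could look like as raw commands. The index name, key prefix, field names, and helper functions are my assumptions for illustration, not the project's actual schema. The helpers only build argument lists, so they run without a Redis server; with redis-py, each list could be sent via `execute_command(*cmd)`. Escaping of special query characters is omitted.

```python
INDEX_NAME = "idx:addresses"  # hypothetical index name
KEY_PREFIX = "address:"       # hypothetical key prefix for address hashes


def create_index_command() -> list[str]:
    """FT.CREATE over hashes with a TEXT address field and a GEO location field."""
    return ["FT.CREATE", INDEX_NAME, "ON", "HASH", "PREFIX", "1", KEY_PREFIX,
            "SCHEMA", "address", "TEXT", "location", "GEO"]


def store_command(record_id: str, address: str, lat: float, lon: float) -> list[str]:
    """HSET one record; GEO fields are stored as a "longitude,latitude" string."""
    return ["HSET", f"{KEY_PREFIX}{record_id}",
            "address", address, "location", f"{lon},{lat}"]


def forward_search_command(text: str, limit: int = 5) -> list[str]:
    """Full text query: all terms must match, each treated as a prefix."""
    terms = text.lower().split()
    if not terms:
        raise ValueError("query text must contain at least one term")
    # Redis Search's default minimum prefix length is 2, so one-character
    # terms are left as exact matches.
    parts = [t + "*" if len(t) >= 2 else t for t in terms]
    query = "@address:(" + " ".join(parts) + ")"
    return ["FT.SEARCH", INDEX_NAME, query, "LIMIT", "0", str(limit)]


def reverse_search_command(lat: float, lon: float, radius_m: int = 100, limit: int = 5) -> list[str]:
    """Geo filter query: note the longitude-first order inside the brackets."""
    query = f"@location:[{lon} {lat} {radius_m} m]"
    return ["FT.SEARCH", INDEX_NAME, query, "LIMIT", "0", str(limit)]


print(forward_search_command("TIMES SQ MANHATTAN"))
print(reverse_search_command(40.768044, -73.982372))
```

Prefix matching is one simple way to let SQ match SQUARE; a synonym list would be another option.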

Then, using Redis Pub/Sub, I extended this service to provide a batch address resolution endpoint. With this batch service, developers can make geocoding requests and easily share the resolved addresses. In the video linked below, I go into detail about the system architecture and how Redis enabled this service.
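
As a rough picture of how a publish/subscribe flow can share batch results, here is a toy sketch. It swaps Redis Pub/Sub for a small in-memory broker so it runs anywhere, and the channel naming (`batch:<id>`), the message shape, and the `resolve` placeholder are all assumptions of mine rather than the service's actual design.

```python
import json
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for Redis Pub/Sub: fans each message out to current subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, handler):
        self._subscribers[channel].append(handler)

    def publish(self, channel, message):
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)  # number of subscribers that received the message


def resolve(query):
    """Placeholder geocoder; a real worker would call the forward geocoder here."""
    known = {"TIMES SQ MANHATTAN": "5 TIMES SQUARE MANHATTAN 10036"}
    return known.get(query)


def run_batch(broker, batch_id, queries):
    """Resolve each query and publish the result on the batch's channel."""
    channel = f"batch:{batch_id}"
    for query in queries:
        payload = {"query": query, "match": resolve(query)}
        broker.publish(channel, json.dumps(payload))
    broker.publish(channel, json.dumps({"done": True}))


broker = InMemoryBroker()
first_client, second_client = [], []
broker.subscribe("batch:demo", lambda m: first_client.append(json.loads(m)))
broker.subscribe("batch:demo", lambda m: second_client.append(json.loads(m)))

run_batch(broker, "demo", ["TIMES SQ MANHATTAN", "UNKNOWN PLACE"])
print(first_client == second_client)  # both subscribers saw the same results
```

Anyone who knows the batch's channel name can subscribe and receive the same results. One caveat that carries over to real Redis Pub/Sub: delivery is fire-and-forget, so a subscriber that joins late misses earlier messages.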

In the future I'd like to address some of the rough edges in the system design and expand the target address dataset. The service has some weaknesses right now; the following changes would improve the user experience, performance, etc.

  • Set Memory Limits - In the current configuration, different instances have different resource requirements. The application's deployment would be safer if these were scoped and explicitly defined.

  • Back-pressure / Retries - In the current application, there's minimal logic for retries, timed back-offs, etc. In short, the properties you'd write into a resilient distributed system aren't present.

  • OpenAPI Specification - The API's behavior is not well documented publicly. OpenAPI is a specification that makes documenting services easier, but I haven't written the spec yet.

  • Datasets - The service is designed to handle any dataset with an id, a location, and an address. There are quite a few national datasets available that could be interesting, e.g. the US National Address Database.
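
To illustrate the back-pressure / retries item in the list above, here is one common pattern: retrying with capped exponential backoff and full jitter. It is a generic sketch, not code from the project; the function name, the default limits, and the choice of retryable exceptions are assumptions.

```python
import random
import time


def retry_with_backoff(operation, attempts=5, base_delay=0.1, max_delay=2.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(); on a retryable error, wait and try again.

    The wait before retry n is a random value between 0 and
    min(max_delay, base_delay * 2**n), which spreads out retries from
    many clients. The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


# Usage: an operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "2 COLUMBUS CIR MANHATTAN 10019"


waits = []
result = retry_with_backoff(flaky_lookup, sleep=waits.append)  # record waits instead of sleeping
print(result, "after", calls["count"], "calls")
```

Passing `sleep` as a parameter keeps the helper easy to test, since the waits can be recorded instead of actually slept.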

Source code is available here.
