After talking to Richard White from EDR, we decided that using a graph database to search for patterns was an interesting problem to solve. Although the problem was narrow, we were able to solve it efficiently with existing platforms.
What It Does
Graph.srch allows the user to specify parts of a mailing address and retrieve all possible matches from the graph database. In addition, the user can search for two different address parts and see the similarities and differences between the datasets returned.
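The comparison of two searches can be sketched as a set operation over the returned rows. This is a minimal illustration, not our actual implementation: the function name `compare_results` and the field names `city` and `zip` are assumptions made up for the example.

```python
def compare_results(first, second):
    """Split two lists of address records into shared and unique rows."""

    def key(record):
        # Records are dicts; build a hashable key that ignores field order.
        return tuple(sorted(record.items()))

    first_by_key = {key(r): r for r in first}
    second_by_key = {key(r): r for r in second}

    shared = [r for k, r in first_by_key.items() if k in second_by_key]
    only_first = [r for k, r in first_by_key.items() if k not in second_by_key]
    only_second = [r for k, r in second_by_key.items() if k not in first_by_key]
    return {"shared": shared, "only_first": only_first, "only_second": only_second}


if __name__ == "__main__":
    # Hypothetical result sets for two searches.
    by_city = [{"city": "Austin", "zip": "78701"}, {"city": "Austin", "zip": "78702"}]
    by_zip = [{"city": "Austin", "zip": "78701"}, {"city": "Dallas", "zip": "75201"}]
    print(compare_results(by_city, by_zip))
```

Keying on the full record means two rows count as the same only when every field matches; comparing on a single identifier would be a reasonable alternative if the data has one.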
How We Built It
The frontend is written in React.js, and the backend is written in Python with Flask and a Neo4j Bolt driver. For the graph database we used Neo4j with its Cypher query language. The Neo4j database is hosted on Google Cloud.
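As a rough sketch of how the backend might turn the address parts a user supplies into a Cypher query, the helper below builds a parameterized `MATCH`. The `Address` label, the property names in `ALLOWED_FIELDS`, and the function name `build_address_query` are all assumptions for illustration, since the real schema is not described here.

```python
# Hypothetical property names; adjust to the actual graph schema.
ALLOWED_FIELDS = ("street", "city", "state", "zip")


def build_address_query(parts, limit=25):
    """Return (cypher, params) matching Address nodes on the supplied parts."""
    # Keep only known fields with non-empty values, so property names
    # never come straight from user input.
    filters = {f: parts[f] for f in ALLOWED_FIELDS if parts.get(f)}

    cypher = "MATCH (a:Address)"
    where = " AND ".join(f"a.{f} = ${f}" for f in filters)
    if where:
        cypher += f" WHERE {where}"
    cypher += " RETURN a LIMIT $limit"

    # Values travel as query parameters rather than being spliced into the text.
    return cypher, {**filters, "limit": limit}


if __name__ == "__main__":
    query, params = build_address_query({"city": "Austin", "zip": "78701"})
    print(query)
    print(params)
```

With the official Neo4j Python driver, the resulting pair could be passed to `session.run(query, params)` inside a Flask route. Passing values as parameters also lets Neo4j reuse the query plan across searches.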
We initially intended to host the database on Google Cloud Bigtable, but setting it up proved more complicated than expected. Some addresses in the datasets we were given had no street address, which we had to work around because Cypher's MERGE does not accept null property values. We also struggled to process and query a dataset as large as the one we were tasked with handling. The frontend proved difficult to complete and integrate with the backend because of formatting issues and pagination.
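One possible workaround for the null street addresses, combined with batching for the large dataset, is sketched below. It is not necessarily what we shipped: the `UNKNOWN` placeholder, the batch size, the helper names, and the property names in `MERGE_QUERY` are all assumptions for the example.

```python
def prepare_rows(records, placeholder="UNKNOWN"):
    """Replace missing street values so MERGE never sees a null key."""
    rows = []
    for record in records:
        row = dict(record)  # copy so the caller's data is left untouched
        if not row.get("street"):
            row["street"] = placeholder
        rows.append(row)
    return rows


def batches(rows, size=1000):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Each batch would be sent as the $rows parameter of a single query,
# which avoids one round trip per address.
MERGE_QUERY = """
UNWIND $rows AS row
MERGE (a:Address {street: row.street, city: row.city, zip: row.zip})
"""


if __name__ == "__main__":
    raw = [
        {"street": None, "city": "Austin", "zip": "78701"},
        {"street": "1 Main St", "city": "Austin", "zip": "78701"},
    ]
    for chunk in batches(prepare_rows(raw), size=1):
        print(chunk)
```

An alternative is to MERGE only on the fields that are always present and attach the optional ones with `SET`, which avoids storing a placeholder value in the graph.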
If We Had More Time
We'd like to deploy Graph.srch on Heroku, Netlify, or a similar platform. Other plans include showing a Google Maps view of addresses for user selection and including a visual of the graph data returned. We'd also like to support fuzzy searching with Levenshtein distance or Lucene.
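To give a sense of what Levenshtein-based fuzzy searching could look like, here is a small sketch that filters candidate strings by edit distance. It runs in application code rather than inside Neo4j, and the names `levenshtein` and `fuzzy_filter` and the default threshold of 2 are assumptions for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(query, candidates, max_distance=2):
    """Keep candidates within max_distance edits of the query, ignoring case."""
    query = query.lower()
    return [c for c in candidates if levenshtein(query, c.lower()) <= max_distance]


if __name__ == "__main__":
    print(fuzzy_filter("Mian St", ["Main St", "Elm St"]))
```

Scanning every candidate this way is only practical for small result sets; for the full dataset, an index-backed approach such as Lucene would likely be the better fit.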