I collaborated with my teammate Terri on the data engineering workstream. Together we profiled all three source datasets to understand their structure, quality, and relevance to the challenge. We then designed and implemented the bronze-to-silver ETL pipeline in Databricks notebooks, covering data cleaning, JSON parsing, coordinate validation, and geographic lookup table construction. I also explored a Genie Space backend architecture, with full instruction sets, example SQL, and Unity Catalog column comments, as a potential natural-language query layer. Although the Genie approach was not used in the final submission, the silver tables Terri and I built served as the data foundation that the team's app ran on.