Team Merlion

This repository contains Team Merlion's submission for the CDP Hackathon. To reproduce the results, follow the instructions below.

The project is written mainly in Julia. The NLP processing is done in Python, which Julia can call directly.

Project Initialization

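This project uses DrWatson's @quickactivate macro to activate its environment. Start Julia from inside the project folder and run:

using DrWatson
@quickactivate "cdphackathon"
] instantiate

The ] key switches the Julia REPL to package mode. The instantiate command installs all dependencies and creates a Manifest.toml if one does not already exist.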

Installing spaCy

This project requires the Python package spaCy and its large English model. Activate the Conda environment that Julia uses (run conda env list to see the available environments), then run:

conda install -c conda-forge spacy
python -m spacy download en_core_web_lg
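
To confirm that both commands succeeded, it can help to check that spaCy and the model are importable from the active environment. The helper below is a minimal sketch; spacy_ready is a name introduced here for illustration and is not part of the project or of spaCy.

```python
from importlib.util import find_spec


def spacy_ready(model="en_core_web_lg"):
    """Return True if both spaCy and the named model package can be imported.

    Models downloaded with `python -m spacy download` are installed as
    ordinary Python packages, so a plain import lookup is enough.
    """
    return find_spec("spacy") is not None and find_spec(model) is not None


if __name__ == "__main__":
    print("spaCy and model available:", spacy_ready())
```

If this reports False, the most likely cause is that the commands were run in a different Conda environment from the one Julia uses.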

Obtaining Cleaned City List

The list of cities that CDP collected data from has been cleaned and compiled by functions in citiesresort.jl. To regenerate it, uncomment the following three lines in that file:

resortUScitylist("cities/uscitylist.csv")
resortCAcitylist("cities/cacitylist.csv")
cdpcitycsv("Cities_Data_2017-2019_mb2.csv")

Note that this automated resorting finds details for only about two-thirds to three-quarters of the cities on the list. The longitude and latitude of the remaining 60 cities were found manually.
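
The split between automatically matched cities and those left for manual lookup can be pictured with the sketch below. It is written in Python for illustration and is not the project's Julia code; match_cities, the (city, country) key, and the normalization rule are all assumptions.

```python
def match_cities(cdp_cities, reference):
    """Split CDP cities into matched coordinates and unmatched names.

    `cdp_cities` is a list of (city, country) pairs.
    `reference` maps (city, country) to (latitude, longitude).
    Names are compared case-insensitively with surrounding spaces removed.
    """
    def key(city, country):
        return (city.strip().lower(), country.strip().lower())

    lookup = {key(c, k): coords for (c, k), coords in reference.items()}

    matched, unmatched = {}, []
    for city, country in cdp_cities:
        coords = lookup.get(key(city, country))
        if coords is None:
            unmatched.append((city, country))  # left for manual lookup
        else:
            matched[(city, country)] = coords
    return matched, unmatched
```

The unmatched list corresponds to the cities whose coordinates had to be filled in by hand.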

Keywords

Using the NLP package spaCy, we identify the most common keywords in the responses and count how often each one occurs. This is done in regionresort.jl.
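
The exact filtering used in regionresort.jl is not reproduced here. As an illustration only, keyword counting with spaCy might look like the sketch below. The function names, the part-of-speech filter, and the top_n default are assumptions rather than the project's settings.

```python
from collections import Counter

# Assumed filter: treat nouns, proper nouns and adjectives as keywords.
KEYWORD_POS = {"NOUN", "PROPN", "ADJ"}


def keyword_frequencies(tokens, top_n=10):
    """Count lowercased lemmas of content words and return the most common.

    `tokens` is any iterable of spaCy tokens (for example a Doc), or objects
    exposing the same `lemma_`, `pos_`, `is_stop` and `is_alpha` attributes.
    """
    counts = Counter(
        tok.lemma_.lower()
        for tok in tokens
        if tok.is_alpha and not tok.is_stop and tok.pos_ in KEYWORD_POS
    )
    return counts.most_common(top_n)


def keywords_from_text(text, model="en_core_web_lg", top_n=10):
    """Run the spaCy pipeline on `text` and return its top keywords."""
    import spacy  # imported lazily so the counting helper has no hard dependency

    nlp = spacy.load(model)
    return keyword_frequencies(nlp(text), top_n)
```

Calling keywords_from_text on a free-text response returns a list of (keyword, count) pairs, most frequent first. Counting lemmas rather than raw words groups forms such as "flood" and "floods" together.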

Tableau Dashboard

The raw Tableau workbook is in the submission folder. A public version of the dashboard can be found [here].
