Smart Parks and Rec

Studying the functional uses of urban greeneries helps in understanding their mental and physical health benefits for the community. Our project is developing an algorithm that classifies and predicts the function of an urban greenery from images of it.

Introduction: We are addressing a new problem: given pictures of an urban greenery, have a network categorize its functional use and predict a label for it, such as “park”, “garden”, or “playground”. It is therefore a classification problem. The public health/urban planning paper "Value of urban green spaces in promoting healthy living and wellbeing: prospects for planning" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4556255/) suggests that the health benefits a green space brings to its neighborhood relate to the functionality of the space rather than to its qualities. However, conventional studies have looked more at the qualities and less at the functionalities, that is, the activities undertaken there.

Data: The datasets that would best fit our goals link park images to park activities, so that we can better predict health outcomes of different neighborhoods. Below are a few options we are considering:

  • Recreational facilities in New York City Department of Parks & Recreation properties (805 items): https://data.world/city-of-ny/e4ej-j6hn
  • Assessment of town and park characteristics related to physical activity in the Lower Mississippi Delta (616 columns): https://data.world/us-usda-gov/83455c6f-4b84-4dcc-9ac5-f018e00eca40
  • NYC Parks Events Listing database, used to store event information displayed on the Parks website, nyc.gov/parks (50,199 items): https://data.world/city-of-ny/6eti-k994

Methodology: The model will be a convolutional neural network (CNN) with three convolutional layers, each followed by a normalization layer, a ReLU activation, and a max-pooling layer, and then a single dense layer. We chose this design because of the image-to-label nature of the dataset and because CNNs are well established for image classification tasks. If this architecture does not perform well, we will consider adding one more dense layer, increasing or decreasing the number of convolutional layers, trying different pooling functions and nonlinear activations, and tuning the hyperparameters. We plan to train with a batch size of 256 for 10 epochs, but these values are also subject to tuning.
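To make the proposed architecture concrete, the sketch below traces tensor shapes and trainable parameter counts through three conv + normalization + ReLU + max-pool blocks and one dense layer. The write-up does not state the input size, filter counts, kernel size, or number of classes, so the 128x128 RGB input, 3x3 kernels with "same" padding, 16/32/64 filters, 2x2 pooling, batch-norm-style normalization, and 3 classes are all assumptions chosen only for illustration.

```python
# Illustrative shape/parameter trace for the proposed CNN (not the actual model).
# Assumed values: 128x128x3 input, 3x3 kernels, "same" padding, 2x2 max pooling,
# 16/32/64 filters, per-channel scale/shift normalization, 3 output classes.

def conv_block(shape, filters, kernel=3, pool=2):
    """Return (output shape, trainable params) for conv + norm + ReLU + max pool."""
    height, width, channels = shape
    conv_params = kernel * kernel * channels * filters + filters  # weights + biases
    norm_params = 2 * filters  # one scale and one shift per channel
    # "same" padding keeps height and width; pooling divides them by `pool`.
    out_shape = (height // pool, width // pool, filters)
    return out_shape, conv_params + norm_params


def trace_cnn(input_shape=(128, 128, 3), filters=(16, 32, 64), num_classes=3):
    """Trace three conv blocks followed by a single dense layer."""
    shape, total, trace = input_shape, 0, []
    for f in filters:
        shape, params = conv_block(shape, f)
        total += params
        trace.append((shape, params))
    flat = shape[0] * shape[1] * shape[2]  # flatten before the dense layer
    dense_params = flat * num_classes + num_classes
    total += dense_params
    trace.append(((num_classes,), dense_params))
    return trace, total


trace, total_params = trace_cnn()
for shape, params in trace:
    print(shape, params)
print("total trainable parameters:", total_params)
```

Under these assumptions the dense layer holds most of the parameters, which is one reason the number of pooling stages and the input size are worth tuning alongside the batch size and epoch count.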

Metrics: A first measure of success would be to take a park image and attribute the correct park activity to it. A further step would be to assign each park activity a score between 0 and 1 that expresses the activity’s relationship with health, and thus the park’s relationship with health. Finally, applying these scores to a map (should we find a dataset linking maps to images, which is likely) would allow us to visualize the makeup of healthy green spaces in a city. A “stretch” model would assign different health values to different parks and green areas in urban spaces. Ultimately, this would serve as a tool for urban planners to understand how to better organize green spaces, and would enable further analysis of how healthiness varies across neighborhoods. The model could even suggest, given a set of constraints such as budget and zoning codes, where to add new green spaces to the city and what they would be for. For the classification task itself, accuracy is an appropriate measure of success, as it is a typical image classification problem.

We will use a softmax (cross-entropy) loss function, as well as a prediction accuracy function that we apply when testing the model on the 20% of our dataset allocated for testing.
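The evaluation setup described here can be sketched in plain Python as follows. This is a minimal illustration, not the project's code: the function names, the example logits, and the shuffling seed are hypothetical, and it assumes "softmax loss" means softmax followed by cross-entropy on a single correct class.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Softmax cross-entropy loss for one example with an integer class label."""
    return -math.log(softmax(logits)[label])


def accuracy(batch_logits, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    correct = sum(1 for lg, y in zip(batch_logits, labels) if lg.index(max(lg)) == y)
    return correct / len(labels)


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle and hold out `test_fraction` of the items for testing."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


# Made-up scores for three images over three classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]
mean_loss = sum(cross_entropy(lg, y) for lg, y in zip(logits, labels)) / len(labels)
test_accuracy = accuracy(logits, labels)
train_items, test_items = train_test_split(range(100))
print(mean_loss, test_accuracy, len(train_items), len(test_items))
```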

Ethics: What broader societal issues are relevant to your chosen problem space? Recent studies have shown that a lack of green space in disadvantaged neighborhoods has severe health repercussions. Green spaces are essential for fitness and for physical and mental health, and as the climate warms they also provide cooling. Replanning green spaces to better serve all neighborhoods is essential to community living. Our model will be one additional tool that urban planners can use to identify current problems in green space allocation.

Why is Deep Learning a good approach to this problem? Many factors are involved in studying the relationship between urban greeneries and health outcomes in the community, because urban systems are complex. It would be hard for an urban planner to manually map out the functional use of each urban greenery, so deep learning algorithms can help by quickly categorizing the greeneries and making the relationship clearer.

Division of labor: Elise: preprocessing data, and potentially aligning activities with health scores. Viola: assembling the model, and potentially adjusting the loss function to attribute several labels (for several activities) to one input (one park).

Built With

  • cnn

Updates


Introduction:

Studying the functional uses of urban greeneries helps in understanding their mental and physical health benefits for the community. Our project is developing an algorithm that classifies the function of an urban green space and predicts the activities taking place in it. Given pictures of the green space, our algorithm can predict multiple labels, such as [“art”, “kid”] if it is a good place for kids’ art events.
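As a rough illustration of multi-label prediction, the sketch below turns per-label probabilities into a list of labels by thresholding each one independently. Only "art" and "kid" come from the write-up; the other label names, the probabilities, and the 0.5 threshold are assumptions.

```python
# Hypothetical label vocabulary; only "art" and "kid" appear in the write-up.
LABELS = ["art", "kid", "sport", "nature"]


def decode_labels(probabilities, threshold=0.5):
    """Return every label whose predicted probability reaches the threshold."""
    return [name for name, p in zip(LABELS, probabilities) if p >= threshold]


# Made-up model outputs for one green space, one probability per label.
predicted = decode_labels([0.91, 0.77, 0.12, 0.30])
print(predicted)
```

Because each label is thresholded on its own, a single image can receive zero, one, or several labels, unlike a softmax classifier that picks exactly one class.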

Challenges & Insights: What has been the hardest part of the project you’ve encountered so far? Are there any concrete results you can show at this point?

We have been working on preprocessing the data. The data we found online had multiple issues: file format (csv), duplicate images, different image sizes, mismatches between images and labels, unreasonable label categorizations, etc. We had to write code to fetch the images from their URLs, pad them to the same size, match the images with their labels, sort the labels into broader, more general categories, and then save the images and labels to numpy arrays. This is now done, so our data is ready to be passed into a CNN model.
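Some of these preprocessing steps can be sketched in dependency-free Python. Everything named here is a hypothetical placeholder: the example URLs, the fine-to-coarse category mapping, and the choice to pad at the bottom and right with zeros are assumptions, and the real pipeline stores NumPy arrays, where `numpy.pad` would be the natural counterpart of `pad_image`.

```python
# Hypothetical mapping from fine-grained event labels to broader categories.
COARSE = {
    "painting class": "art",
    "kids crafts": "kid",
    "story time": "kid",
    "yoga": "fitness",
}
CATEGORIES = sorted(set(COARSE.values()))  # fixed label order for encoding


def group_labels(rows):
    """Merge duplicate image URLs and collect their coarse labels."""
    grouped = {}
    for url, fine_label in rows:
        coarse = COARSE.get(fine_label)
        if coarse is None:
            continue  # assumption: rows with unmapped labels are dropped
        grouped.setdefault(url, set()).add(coarse)
    return grouped


def multi_hot(labels):
    """Encode a set of coarse labels as a 0/1 vector over CATEGORIES."""
    return [1 if c in labels else 0 for c in CATEGORIES]


def pad_image(image, height, width, fill=0):
    """Pad a 2D grayscale image (list of rows) at the bottom and right."""
    padded = [row + [fill] * (width - len(row)) for row in image]
    padded += [[fill] * width for _ in range(height - len(padded))]
    return padded


rows = [
    ("https://example.org/a.jpg", "painting class"),
    ("https://example.org/a.jpg", "kids crafts"),
    ("https://example.org/b.jpg", "yoga"),
    ("https://example.org/a.jpg", "painting class"),  # duplicate row
]
dataset = {url: multi_hot(labels) for url, labels in group_labels(rows).items()}
small = pad_image([[1, 2], [3, 4]], height=3, width=4)
print(CATEGORIES, dataset, small)
```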

Plan: Are you on track with your project? What do you need to dedicate more time to? What are you thinking of changing, if anything?

We believe we are on track, since we anticipated that preprocessing would take up a large share of the time. Next, we are excited to implement our CNN model. Because of the structure of the data, our loss function will be quite different from what we implemented in class: it will work with [input, [multiple labels]] pairs.
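One common choice for a loss over [input, [multiple labels]] pairs is a sigmoid on each output followed by binary cross-entropy averaged across labels. The write-up does not say which loss will be used, so the sketch below is only one plausible option, with hypothetical function names and made-up logits.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability for one label."""
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_loss(logits, targets, eps=1e-12):
    """Mean binary cross-entropy across labels for one example.

    `targets` is a multi-hot vector: 1 where the label applies, 0 otherwise.
    """
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)


# A park tagged with the first two of three labels.
targets = [1, 1, 0]
good = multilabel_loss([4.0, 3.0, -5.0], targets)  # scores agree with the tags
bad = multilabel_loss([-4.0, -3.0, 5.0], targets)  # scores contradict the tags
print(good, bad)
```

Each label contributes its own independent term, so the loss does not force the label probabilities to sum to one, which is what allows several activities to be attributed to the same park.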
