Problem Students at universities should feel safe when they walk around campus, particularly at night. Unfortunately, it is hard to stay alert at all times while walking around in public. When we looked at USC's daily crime logs, we noticed that some areas suffered from repeated harassment crimes. For example, the logs record four harassment incidents clustered around 28th and 29th streets, on 12/12/19, 1/9/20, 1/22/20, and 1/30/20.
What it does This app seeks to mitigate the problem by softly vibrating the user's mobile device and showing a push notification when the user comes within 0.0006 degrees of latitude and longitude of a location where a physical harassment crime has occurred in the past three months.
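The proximity rule above can be sketched in a few lines of Python. This is only an illustration, not the app's actual code: it assumes the 0.0006 threshold is applied to latitude and longitude separately (a small box around each incident), and the function names and the dictionary layout of an incident are hypothetical.

```python
THRESHOLD_DEG = 0.0006  # threshold quoted in the description above


def is_near(user_lat, user_lng, incident_lat, incident_lng, threshold=THRESHOLD_DEG):
    """Return True when both coordinate differences are within the threshold.

    Assumption: the threshold is checked per axis, which gives a box rather
    than a circle around the incident.
    """
    return (
        abs(user_lat - incident_lat) <= threshold
        and abs(user_lng - incident_lng) <= threshold
    )


def nearby_incidents(user_lat, user_lng, incidents):
    """Filter a list of incidents (hypothetical dicts with 'lat' and 'lng' keys)."""
    return [
        inc for inc in incidents
        if is_near(user_lat, user_lng, inc["lat"], inc["lng"])
    ]
```

Since one degree of latitude is roughly 111 km, 0.0006 degrees is on the order of 65 meters north to south; the east-west distance is somewhat shorter at USC's latitude.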
How we built it USC's crime logs are published as PDFs, and no API is provided for reading the crime reports more easily. This meant that the only way to read the crime incidents was to download all the PDFs and parse them. Our program scans the HTML of the USC daily crime log page, uses regular expressions to match the download links of all the PDFs, and then downloads each PDF.
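A minimal sketch of the link-matching step follows. The regular expression, the function name, and the example base URL are assumptions made for illustration; the pattern the project actually used is not given here.

```python
import re
from urllib.parse import urljoin

# Assumed pattern: any href attribute whose value ends in ".pdf" (case-insensitive).
PDF_LINK_RE = re.compile(r'href=["\']([^"\']+\.pdf)["\']', re.IGNORECASE)


def find_pdf_links(html, base_url):
    """Return absolute URLs for every PDF link found in the page source."""
    return [urljoin(base_url, href) for href in PDF_LINK_RE.findall(html)]
```

For example, `find_pdf_links(page_source, "https://example.edu/logs/")` would resolve relative links against that placeholder base URL. Each returned URL could then be fetched and saved to disk before parsing.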
Reading the PDFs was a challenge in itself due to the unstructured nature of these documents (compared to regular text documents). We dug through many Python libraries to find a good way to read text from a PDF. Several libraries that we looked at, such as PyPDF2, were unsuitable for our project because they omitted too many lines of text. Eventually, we settled on tika, which was accurate. Unfortunately, tika returned the lines of text for each incident out of order, but the lines were shuffled around in the exact same way for each incident, so we made a function to reorder these lines of text. Then, from these organized incidents, we filtered out the non-harassment ones and kept the rest as objects.
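Because the lines came back shuffled in the same way for every incident, the reordering can be expressed as a fixed permutation. The sketch below assumes a four-line incident and an invented ordering; the real number of lines, their order, the field names, and the keyword used for filtering are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical permutation: field i of the record is taken from extracted line ORDER[i].
ORDER = [2, 0, 3, 1]


@dataclass
class Incident:
    date: str
    category: str
    location: str
    summary: str


def reorder(lines, order=ORDER):
    """Undo the fixed shuffle by picking the lines in their intended order."""
    return [lines[i] for i in order]


def parse_incident(lines):
    """Build an Incident from one incident's shuffled lines."""
    return Incident(*reorder(lines))


def is_harassment(incident):
    """Assumed filter: keep incidents whose category mentions harassment."""
    return "HARASS" in incident.category.upper()
```

A list of parsed incidents could then be narrowed with `[i for i in incidents if is_harassment(i)]`.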
Next, we had to determine the exact geographical coordinates of the locations where these crimes occurred. For example, we had to determine the latitude and longitude of the location "28TH ST & HOOVER ST". The easiest way to do this, we discovered, was Google's Geocoding API. We used one of the $50 Google Cloud credit codes we were given to call this API and convert location names into latitude and longitude coordinates. Sometimes the returned coordinates were inaccurate and hundreds of miles away from USC, because similar location names exist in other parts of the state and country. We fixed this by adding a "bounds" parameter to each API request, which limits the search area to Los Angeles County.
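The request described above might be built as in the sketch below, which only constructs the URL and makes no network call. The endpoint and the `address`, `bounds`, and `key` parameters belong to Google's Geocoding API; the bounding-box numbers are a rough approximation of Los Angeles County chosen for illustration, and the function names are hypothetical. Google documents `bounds` as biasing results toward the box rather than strictly limiting them, so the sketch also includes a check on the returned coordinates.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

# Approximate box as (south, west, north, east); these values are an assumption.
LA_BOX = (33.70, -118.95, 34.83, -117.64)


def build_geocode_url(address, api_key, box=LA_BOX):
    """Build a Geocoding API request URL biased toward the given box."""
    south, west, north, east = box
    params = {
        "address": address,
        # bounds format: southwest corner | northeast corner
        "bounds": f"{south},{west}|{north},{east}",
        "key": api_key,
    }
    return GEOCODE_URL + "?" + urlencode(params)


def in_box(lat, lng, box=LA_BOX):
    """Sanity check: discard results that still fall outside the box."""
    south, west, north, east = box
    return south <= lat <= north and west <= lng <= east
```

Caching the coordinates for each distinct location string would be one way to keep the number of billable requests low.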
Challenges we ran into In implementing the app, we were stuck, because none of us had any experience at all in app development. The most we were able to get our app to do was to return the user's current latitude and longitude coordinates, shown here. We were confused as to how to program the app to make it send a vibration and a push notification to the user's device. If we had a greater knowledge of React Native, we would most likely have gotten past this issue.
Accomplishments that we're proud of We are very happy to have successfully parsed and extracted useful information from completely unstructured PDF documents. Perhaps this can be the beginning of an API for USC's daily crime logs. In addition, this was our first time using a pay-per-use API (Geocode), so we are proud of making our program as efficient as possible to avoid excessive use of it.
What's next for PocketAngel The most reasonable next step is to improve the app, but another avenue we could go down is to make an API that can structure each incident in USC's daily crime reports.