A recent Dartmouth survey showed that 1 out of 3 female students has experienced sexual harassment, and more than 76% of them never report the incident to any agency. In addition, it takes harassment survivors 11 months on average to make their first report. Our focus was to create a safe and secure environment in which survivors can make a record of what happened and receive social support feedback, while keeping both the content and the user completely anonymous.
What it does
Users can simply visit the website and start jotting down what happened, or record their thoughts and emotions. As they write, they can "Analyze" their entry to see how many others have had similar experiences, which gives them feedback such as "25 other people have had a similar experience". All comparisons are done on hashed data, so the stored data is de-identified and the original content cannot be read back from it.
How we built it
When a user submits an entry (a record of their experience), the text is run through the Google Cloud Natural Language API to extract the entities mentioned in it. This lets us infer the names of perpetrators and the locations related to the event, which are hashed and saved to our database. The hashed keys can later be compared by Jaccard similarity using MinHash, which lets us estimate similarity across hashed documents without storing any identifiable data. From this comparison we tell users how many people have had similar experiences.
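A minimal sketch of the hash-then-compare idea, using only the Python standard library. This is an illustration and not the project's actual code: the entity list is assumed to come from the entity-extraction step, and the function names, the 64 hash functions, and the 0.5 similarity threshold are all hypothetical choices.

```python
import hashlib

NUM_PERM = 64  # hypothetical number of hash functions in a signature


def hash_entity(entity: str) -> str:
    """One-way SHA-256 hash of a normalized entity name."""
    return hashlib.sha256(entity.strip().lower().encode("utf-8")).hexdigest()


def minhash_signature(hashed_entities, num_perm=NUM_PERM):
    """Build a MinHash signature from already-hashed entities."""
    items = list(hashed_entities)
    if not items:
        raise ValueError("need at least one entity to build a signature")
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "big")
        # Salting with the seed simulates a different hash function per slot;
        # each slot keeps the minimum value seen over the whole set.
        signature.append(min(
            int.from_bytes(
                hashlib.sha256(salt + h.encode("ascii")).digest()[:8], "big")
            for h in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching slots, which estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def count_similar(new_sig, stored_sigs, threshold=0.5):
    """Count stored signatures at or above the (assumed) threshold."""
    return sum(estimate_jaccard(new_sig, s) >= threshold for s in stored_sigs)
```

Only the signatures need to be stored to answer "how many people have had a similar experience". Note that hashes of short, guessable names can be matched by anyone who hashes a list of candidate names, so a deployment would likely want a secret salt or keyed hash on top of this.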
When a user submits, the text is also encrypted with a randomly generated key that only the user sees and can save to retrieve the record at a later time. This encryption step lets us store the record for the user while keeping the information safe, anonymous, and secure.
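The store-and-retrieve flow could look roughly like the sketch below. Everything here is an assumption made for illustration: the in-memory `STORE` stands in for the database, the names are hypothetical, and the SHAKE-256 XOR keystream is a standard-library stand-in chosen only so the example runs without extra packages. It provides no tamper detection, and a real deployment should use a vetted authenticated cipher from a maintained cryptography library.

```python
import hashlib
import secrets

STORE = {}  # hypothetical in-memory stand-in for the database


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHAKE-256 keystream (illustrative only, unauthenticated)."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def save_entry(text: str):
    """Encrypt and store an entry; return the two values the user keeps."""
    record_id = secrets.token_urlsafe(16)  # says where the record is stored
    key = secrets.token_bytes(32)          # never stored on the server
    nonce = secrets.token_bytes(16)        # stored next to the ciphertext
    ciphertext = _keystream_xor(key, nonce, text.encode("utf-8"))
    STORE[record_id] = (nonce, ciphertext)
    return record_id, key.hex()


def load_entry(record_id: str, key_hex: str) -> str:
    """Look up a record by its identifier and decrypt it with the user's key."""
    nonce, ciphertext = STORE[record_id]
    plaintext = _keystream_xor(bytes.fromhex(key_hex), nonce, ciphertext)
    return plaintext.decode("utf-8")
```

Because the key is returned to the user and never written to `STORE`, the server holds only ciphertext; losing the key means the record cannot be recovered.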
On the back end we use a combination of Python Flask, the Google Cloud Natural Language API, and an mLab database to store the data. On the front end we render the page dynamically using Vue.js.
Challenges we ran into
We needed to figure out a way to compare similarity across documents without having to store any real content. We achieved this by applying Jaccard similarity using MinHash while storing everything hashed, so that the true content is irretrievable. We also wanted users to have the option of retrieving saved content, which required "encryption" of the data using a unique key for each user. We achieve this by giving users two unique identifiers at submission: one that indicates where the data is stored, and the key to "decrypt" the data.
Accomplishments that we're proud of
We were new to the topic of data security and learned a lot about hashing, encryption, and decryption in the past 24 hours. We also worked efficiently as a team, with each person focusing on one of front-end development, back-end server development, or algorithm development.
What we learned
Hashing is cool.
What's next for CoHeal
We want to send an email to Dartmouth students twice a term directing them to the website, to encourage them to talk about harassment or other difficulties they experience. With such a low barrier to talking about how you feel and what you have gone through, while receiving feedback about others who have been through the same experience, we believe that we can greatly improve the mental health of many survivors.