Mind's Eye


Inspiration

Mind's Eye refers to a mental image that one conceives. Our mind is enclosed and only occasionally glimpsed through medical imaging, so many of its functional processes need to be conveyed in some other way in order to be better understood. A while back, I had the idea of using natural language processing (NLP) to investigate mental illnesses. Understanding the mental particularities of people living with chronic mental illnesses in a more automated fashion would help health care providers address the concerns of the community. In June of this year, I found that a group of researchers at Emory and Harvard had the same idea: using linguistic analysis as a novel means of identifying Reddit users who self-label as having schizophrenia. They have since published their paper in npj Schizophrenia.

What it does

Mind's Eye can operate either as a self-management tool or as a tool for exploratory analysis. With the aid of a certified counsellor, psychologist, or psychiatrist, it could even be incorporated into future therapeutic use. Right now, Mind's Eye uses Reddit's API to query subreddits for posts from various user groups. The two subreddits used in this demo are r/MentalHealth and r/CasualConversation. In the future, a user of this web app may be able to log their own thoughts for self-management purposes. For this demo, however, I opted to use data that was already available to illustrate the potential analytics and insights.
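As a rough illustration of the subreddit query step, here is a minimal sketch. It assumes `reddit` behaves like a `praw.Reddit` instance (created with your own `client_id`, `client_secret`, and `user_agent`), and it assumes the "hot" listing, since this write-up does not say which listing was used. The function name `fetch_posts` and the returned dictionary keys are hypothetical.

```python
from typing import Any, Dict, List


def fetch_posts(reddit: Any, subreddit_name: str, limit: int = 25) -> List[Dict[str, str]]:
    """Collect the title and body text of posts from one subreddit.

    `reddit` is expected to behave like a `praw.Reddit` instance; it is
    passed in so that this function does not need credentials itself.
    """
    posts = []
    # Assumption: the "hot" listing. The project may have used another one.
    for submission in reddit.subreddit(subreddit_name).hot(limit=limit):
        posts.append(
            {
                "subreddit": subreddit_name,
                "title": submission.title,
                "text": submission.selftext,  # empty string for link posts
            }
        )
    return posts
```

Calling `fetch_posts(reddit, "MentalHealth")` and `fetch_posts(reddit, "CasualConversation")` would give two comparable lists of posts to analyze.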

How I built it

Mind's Eye was built entirely in Python in the PyCharm IDE, using Flask as the web framework. HTML templating was used for the visuals. The graphs are interactive: new data is queried, added, and computed upon pagination. Hovering over a data point reveals a tool tip with its value. Data points can also be selected and deselected in order to emphasize or isolate certain entries. The dates, however, are pseudo-generated for demo purposes.
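The pseudo-generated dates could be produced along the following lines. This is only a sketch of one plausible approach (random day gaps counted backwards from an end date), not the method actually used in the demo; `pseudo_dates` and its parameters are hypothetical names.

```python
import random
from datetime import date, timedelta
from typing import List, Optional


def pseudo_dates(
    n: int, end: date, max_gap_days: int = 3, seed: Optional[int] = None
) -> List[date]:
    """Return n strictly ascending dates whose last entry is `end`.

    Consecutive dates are separated by a random gap of 1..max_gap_days days.
    """
    if n <= 0:
        return []
    rng = random.Random(seed)  # a seed makes the demo data reproducible
    dates = [end]
    for _ in range(n - 1):
        gap = timedelta(days=rng.randint(1, max_gap_days))
        dates.append(dates[-1] - gap)  # walk backwards from the end date
    return dates[::-1]  # oldest first, ready to use as x-axis labels
```

Each generated date can then be paired with one post's score so the chart has a plausible time axis.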

Challenges I ran into

PRAW (the Python Reddit API Wrapper) was rather finicky to work with: queries were slow, and the Reddit submission structure was often incomplete due to deleted comments and the like. I also thought about including a profanity filter for demo purposes because of the graphic language found in a few users' posts, but that would have increased computation time and might have slowed down page rendering.
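One way to cope with deleted content is to filter it out before analysis. The sketch below assumes comments have already been converted to plain dictionaries with a `body` key (a hypothetical shape, not PRAW's own objects) and relies on the `[deleted]` and `[removed]` placeholders that Reddit shows in place of the original text.

```python
from typing import Dict, Iterable, List, Optional

# Placeholder bodies Reddit displays for deleted or moderator-removed content.
PLACEHOLDERS = {"[deleted]", "[removed]"}


def usable_text(body: Optional[str]) -> bool:
    """True if the body holds real text rather than nothing or a placeholder."""
    if body is None:
        return False
    stripped = body.strip()
    return stripped != "" and stripped not in PLACEHOLDERS


def drop_unusable(
    comments: Iterable[Dict[str, Optional[str]]]
) -> List[Dict[str, Optional[str]]]:
    """Keep only the comments whose `body` can actually be analyzed."""
    return [c for c in comments if usable_text(c.get("body"))]
```

Filtering early keeps the sentiment and tagging steps from being skewed by placeholder strings.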

Accomplishments that I'm proud of

This was an original project idea that came from trying to understand how data scientists can leverage publicly available data by ethical means. It was also my first foray into natural language processing, and it made me realize that I had another potential toolkit for addressing modern healthcare concerns. This was also my first time using Flask, and I absolutely loved its ease of use for rapid prototyping. PyGal was another fantastic Python package that I came across today for graphing and visualization. All in all, much of this project was a 'first' for me.

What I learned

I learned that even when information is abundant, one still has to look at it very critically and ask careful questions. During the part-of-speech (POS) tagging analysis, I had originally aggregated the various sub-tags into broader categories, but found that doing so made it harder to differentiate between the two subreddits (r/CasualConversation and r/MentalHealth) than it was in the sentiment analysis. When I focused on just the verb tenses and modifiers, it appeared that users in r/MentalHealth were less expressive in certain time orientations.
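To make the aggregation issue concrete, here is a small sketch that compares coarse and fine-grained tag shares. It assumes the text has already been tagged with Penn Treebank tags (the kind produced by NLTK's `pos_tag`, for example); the write-up does not name the tagger or tag set, so that choice, the function names, and the toy samples are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Tagged = List[Tuple[str, str]]  # (word, Penn Treebank tag) pairs

VERB_TAGS = ("MD", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ")
MODIFIER_TAGS = ("JJ", "JJR", "JJS", "RB", "RBR", "RBS")


def fine_shares(tagged: Tagged) -> Dict[str, float]:
    """Share of tokens carrying each individual verb or modifier tag."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values())
    keep = VERB_TAGS + MODIFIER_TAGS
    if total == 0:
        return {tag: 0.0 for tag in keep}
    return {tag: counts[tag] / total for tag in keep}


def coarse_shares(tagged: Tagged) -> Dict[str, float]:
    """Share of tokens per broad category, with the sub-tags merged."""
    counts: Counter = Counter()
    for _, tag in tagged:
        if tag in VERB_TAGS:
            counts["VERB"] += 1
        elif tag in MODIFIER_TAGS:
            counts["MODIFIER"] += 1
        else:
            counts["OTHER"] += 1
    total = sum(counts.values())
    categories = ("VERB", "MODIFIER", "OTHER")
    if total == 0:
        return {c: 0.0 for c in categories}
    return {c: counts[c] / total for c in categories}


# Toy samples: same sentence shape, different verb tense.
past = [("I", "PRP"), ("walked", "VBD"), ("home", "NN")]
present = [("I", "PRP"), ("walk", "VBP"), ("home", "NN")]
```

For these toy samples, `coarse_shares` returns identical results while `fine_shares` differs in the `VBD` and `VBP` entries, which is the kind of tense signal that merging sub-tags hides.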

What's next for Mind's Eye

Halfway through this event, my laptop started warning me that I was critically low on disk space! I would have loved to use WordNet for a more nuanced analysis of the lexicon, in order to classify affective processes such as anger, anxiety, and sadness, and to identify categories of personal concerns such as work, leisure, home, finance, or death. The next step would also be to build a machine learning classifier to predict who is likely to self-identify as having a mental illness, and to suggest or conduct a plausible course of therapy via remote counselling.

Built With

Updates

Y H started this project — Sep 29, 2019 05:25 AM EDT
