Inspiration

Depression is increasingly common in a fast-paced society marked by stress, demanding work lives, and busy schedules. Much of the working population and many young people experience depression in some form due to poor mental health management. A major problem in a depressed community is people's reluctance to admit to having it. Many treat expressing a weakness such as depression as a social taboo. Instead of treating it as an illness, they keep it to themselves, which can lead to extreme outcomes such as suicide and toxic relationships. One major advantage of the modern world is social media: even though people are reluctant to share their problems with each other in person, they are more comfortable expressing their views online, preferably anonymously or in a cryptic way. Twitter, for example, is an important platform where people express their state of mind and views. Helping people with depression using a machine learning detection tool (since people usually don't admit to it) is our motivation for this project.

What it does

We have implemented a Natural Language Processing based machine learning solution. It uses tweets from previous users to predict whether the current user is depressed, based on their tweet.

How we built it

We leveraged the IBM Z LinuxONE cloud platform to create an image and run JupyterLab on port 38888, and used it to train various ML and NLP models and extract the necessary features. We trained several models for classifying tweets and developed a front-end webpage that lets users test our models interactively (live demo). We created API endpoints to return responses to the front end, used Docker as the deployment platform, and integrated the model-based API endpoints with the front end built in Next.js. This lets users try out models such as GloVe and Naive Bayes live and see the results for themselves.

Challenges we ran into

Accomplishments that we're proud of

Usage of multiple vectorizers to extract features

Initially we had to extract the necessary features from the text-based tweets. Choosing an appropriate vectorizer was essential, so we tried different ones such as CountVectorizer and TfidfVectorizer.
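The comparison above can be sketched with scikit-learn's two vectorizers; the sample tweets below are illustrative placeholders, not data from the project:

```python
# Minimal sketch: turning raw tweet text into feature matrices with
# CountVectorizer (raw term counts) vs. TfidfVectorizer (tf-idf weights).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

tweets = [
    "i feel so empty and tired of everything",
    "great day at the beach with friends",
    "nothing matters anymore i just want to sleep",
    "excited about the new project launch",
]

count_vec = CountVectorizer(stop_words="english")
tfidf_vec = TfidfVectorizer(stop_words="english")

# Both produce a sparse (n_tweets x vocabulary_size) matrix; only the
# weighting of each term differs.
X_count = count_vec.fit_transform(tweets)
X_tfidf = tfidf_vec.fit_transform(tweets)

print("count shape:", X_count.shape)
print("tfidf shape:", X_tfidf.shape)
```

Because both vectorizers are fitted on the same corpus with the same stop-word list, they share a vocabulary and yield matrices of the same shape; only the cell values differ.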

Training multiple models and tuning the number of features to increase accuracy

We used MultinomialNB, GloVe, GaussianNB, a sequential RNN, SVC, and KNN to find the best accuracy.
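A minimal sketch of fitting several of these classifiers on the same feature matrix follows. The GloVe and RNN variants need embedding layers and are omitted here; the tweets and labels are dummy data for illustration only:

```python
# Sketch: training MultinomialNB, SVC, and KNN on tf-idf features and
# comparing their training accuracy (toy data, not the project dataset).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

tweets = [
    "i feel hopeless and alone every single day",
    "had an amazing time hiking with friends",
    "cant sleep cant eat nothing feels right",
    "so proud of finishing my first marathon",
    "everything is pointless and i am exhausted",
    "celebrating a big win at work today",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = depressed, 0 = not depressed

X = TfidfVectorizer().fit_transform(tweets)

models = {
    "MultinomialNB": MultinomialNB(),
    "SVC": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=3),
}
scores = {}
for name, model in models.items():
    model.fit(X, labels)
    scores[name] = model.score(X, labels)
    print(f"{name}: train accuracy {scores[name]:.2f}")
```

In practice, a held-out test split (e.g. `train_test_split`) rather than training accuracy would decide which model to keep.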

Creation of an interactive front end and Docker for deployment

We created an interactive front end to run our models with a UI, built using Next.js and deployed with Docker. For preprocessing and prediction, we used a Python Flask backend that analyses the input tweet text, makes a prediction, and returns the result to the front end. We also hosted the Flask backend in a Docker container (on localhost).
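A Flask endpoint of this kind might look like the sketch below. The route name, payload fields, and the tiny inline training set are all assumptions for illustration, not the project's actual API or data:

```python
# Sketch of a Flask prediction endpoint: the front end POSTs a tweet,
# the backend vectorizes it and returns the model's prediction as JSON.
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny illustrative training set (placeholder for the real tweet dataset).
tweets = [
    "i feel hopeless and alone",
    "loving this sunny weekend",
    "cant stop crying everything hurts",
    "had a wonderful dinner tonight",
]
labels = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(tweets), labels)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Hypothetical payload shape: {"tweet": "..."}
    tweet = request.get_json().get("tweet", "")
    label = int(model.predict(vectorizer.transform([tweet]))[0])
    return jsonify({"depressed": bool(label)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The Next.js front end would then call this endpoint with `fetch` and render the returned JSON.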

IBM Z platform

The fun part of working in this datathon was leveraging the IBM Z platform: we used it to train various complex models remotely and to deploy additional Docker containers for our front end and backend Flask API.

What we learned

We learned that building a project proceeds in phases: ideation, collecting the dataset, designing, coding, testing, and deployment. Each phase has its own challenges, and time management is the biggest of them all. We could only finish this project thanks to our teamwork; efficient task splitting and tag-team coding helped a lot. On the technical side, we got comfortable with various modern technologies: running cloud instances (IBM Z), building and running our own Docker containers, training various complex ML and NLP models, and building an integrated UI in a short time frame for the demo and dockerizing it. This has boosted our confidence in building a full-stack application and working on real-life social problems.

What's next for Depression Detection using Tweets

1. The future scope of this project involves a subtle yet important architectural change. Right now all our models are trained to test whether a user is depressed based on a single tweet. Though this works with high accuracy, it still raises some false alarms. An extension of this project would be user-based tweet analysis, where we train our models on a series of tweets from a user and predict whether they are depressed. We believe this will lead to improved accuracy and user personalization.
2. We have trained various models with different accuracies and architectures. Leveraging just the best one is not always the best approach. We would like to build a hybrid model that makes a decision based on the collective prediction from all our models. This would decrease false positives and give better accuracy.
3. We would also like to improve our intuitive UI and integrate more models and vectorizers with it.
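The proposed hybrid model could be prototyped with scikit-learn's `VotingClassifier`, which combines several fitted classifiers by majority vote. The data below is illustrative dummy data:

```python
# Sketch of the planned hybrid model: a majority-vote ensemble over
# several of the individually trained classifiers.
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

tweets = [
    "i feel hopeless and alone every single day",
    "had an amazing time hiking with friends",
    "cant sleep cant eat nothing feels right",
    "so proud of finishing my first marathon",
    "everything is pointless and i am exhausted",
    "celebrating a big win at work today",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = depressed, 0 = not depressed

X = TfidfVectorizer().fit_transform(tweets)

ensemble = VotingClassifier(
    estimators=[
        ("nb", MultinomialNB()),
        ("svc", SVC()),
        ("knn", KNeighborsClassifier(n_neighbors=3)),
    ],
    voting="hard",  # each model casts one vote; the majority label wins
)
ensemble.fit(X, labels)
preds = ensemble.predict(X)
print(preds)
```

Because a tweet is only flagged when most models agree, a single over-eager classifier can no longer trigger a false alarm on its own, which is the intuition behind the expected drop in false positives.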