Inspiration
We were inspired by recent news stories about online communities helping people recognize unexpected COVID-19 symptoms. Meddit promotes collective knowledge and communal experience, so that we can find medical stories that resonate with us and better understand the particular ways health issues present in each of us. We were also inspired by the notion of personalized information (a precursor to personalized medicine). While WebMD and Google searches provide some information, many people find medical literature too technical or general symptom lists too vague. Some may even find it hard to describe their symptoms in a way that lets Google return accurate and helpful results. A large collection of experiences with various diagnoses may benefit not just patients but also health care providers. Only through observation do we learn how comorbidities and demographics affect the way illnesses manifest. With a platform like Meddit, doctors may be able to learn the full scope of the patient experience.
What it does
Meddit is a web app that uses natural language processing to connect users to medical stories similar to their own. By allowing people to describe symptoms and experiences in their own words, we hope to provide personalized information. The home page greets users with a “Tell us how you feel” prompt, where they enter their general symptoms and experiences. A list of related posts then appears below, allowing them to read about experiences similar to theirs. We hope that some form of “verified diagnoses” can be attached to these posts to prevent the spread of misinformation.
How I built it
Meddit currently comes in two parts: the web application, built with the MERN stack (MongoDB, Express, React, and Node.js), and the natural language processing script, written in Python. In the web application, posts by patients are stored in and served from the MongoDB database. The NLP component was built with the spaCy library, which allowed us to tokenize paragraphs and categorize words based on parts of speech, context, and labels. Ideally, we would have built our own named entity recognition model, which would have let us create categories such as symptoms, illnesses, and medications and helped us find similarities between posts. Due to time constraints, we instead built a smaller-scale version of this idea: our own “dictionary” of medical categories, which we called bins. To screen the posts for common ideas, we calculated similarity scores between a post and each bin using spaCy's token similarity function. In our use of spaCy, identical words received scores of 1, synonyms or lexically similar words scored very near 1, and contextually similar words scored between 0.5 and 1. We used these scores to sort each post into bins. When a query is submitted to Meddit, we work out which bins it belongs in and then calculate similarity scores against the posts in those bins.
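The bin-then-rank flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual script: the bin names and keywords, the 0.5 threshold, the way word scores are aggregated, and all function names are assumptions made for the example. To keep it runnable without a language model, the similarity function is a parameter and defaults to a simple exact-match stand-in.

```python
from typing import Callable, Dict, List, Set, Tuple

Similarity = Callable[[str, str], float]

# Hypothetical bins for illustration; the real dictionary is not shown here.
BINS: Dict[str, List[str]] = {
    "respiratory": ["cough", "breath", "wheezing"],
    "digestive": ["nausea", "stomach", "vomiting"],
}


def exact_match(a: str, b: str) -> float:
    """Stand-in similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


def bin_scores(text: str, sim: Similarity = exact_match) -> Dict[str, float]:
    """Score a text against each bin as its best word-to-keyword similarity."""
    words = tokenize(text)
    return {
        name: max((sim(w, k) for w in words for k in keywords), default=0.0)
        for name, keywords in BINS.items()
    }


def assign_bins(text: str, sim: Similarity = exact_match,
                threshold: float = 0.5) -> Set[str]:
    """Return the bins whose score meets the (assumed) threshold."""
    return {name for name, s in bin_scores(text, sim).items() if s >= threshold}


def text_similarity(a: str, b: str, sim: Similarity) -> float:
    """Average, over the words of a, of each word's best match in b."""
    wa, wb = tokenize(a), tokenize(b)
    if not wa or not wb:
        return 0.0
    return sum(max(sim(x, y) for y in wb) for x in wa) / len(wa)


def related_posts(query: str, posts: List[str], sim: Similarity = exact_match,
                  threshold: float = 0.5) -> List[Tuple[float, str]]:
    """Keep posts that share a bin with the query, ranked by similarity."""
    query_bins = assign_bins(query, sim, threshold)
    candidates = [p for p in posts if assign_bins(p, sim, threshold) & query_bins]
    return sorted(((text_similarity(query, p, sim), p) for p in candidates),
                  reverse=True)
```

With spaCy installed and a model that ships word vectors loaded as `nlp`, one could pass something like `sim=lambda a, b: nlp(a).similarity(nlp(b))` in place of the stand-in, so that synonyms and related words also contribute to the scores.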
Challenges I ran into
While we have both halves of the project, we ran out of time before we could stitch them together. Currently, one can post on the web application and store that information, but the Python script that returns relevant posts is not yet connected to it. On the NLP side, we currently mimic the web app’s server with a text file of posts. The script can calculate bin scores for each post and, given a query, return a list of similar posts.
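As a rough idea of how a text file can stand in for the server, the sketch below loads posts from a file. The file layout (posts separated by blank lines) and the `load_posts` name are assumptions for illustration, not a description of our actual file.

```python
from pathlib import Path
from typing import List


def load_posts(path: Path) -> List[str]:
    """Read posts from a text file, treating blank lines as separators (assumed layout)."""
    raw = path.read_text(encoding="utf-8").replace("\r\n", "\n")
    blocks = raw.split("\n\n")
    # Collapse line breaks inside each post and drop empty blocks.
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

Once the two halves are connected, a function like this would be replaced by a query against the MongoDB collection that the web application writes to.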
What's next for meddit
We would like to refine our NLP algorithm by training our own medically focused named entity recognition (NER) model. This would let us better recognize medically similar words (e.g., by categorizing words into symptoms, illnesses, and medications).