Traced: An Approach to Crohn's Disease Prediction

[Figure: Basic user interface of the web application]

Inspiration

The inspiration to create an algorithm to predict Crohn's disease came from reading news articles about the millions of underprivileged people suffering from malnutrition and poverty in our home country of India. Over a third of the children in the world who suffer from malnutrition are from India, and only the accident of birth separates us from them. We realized how fortunate we were to have access to better healthcare and resources, and we felt it was only fitting to do something to make the world a better place. Although our tool is still in its early stages, it has the potential to ease the diagnostic challenges faced by people with limited access to quality healthcare. Reducing misdiagnoses, speeding up identification of the disease, and ultimately improving the quality of life for people who did not have the same advantages as us were the driving motivations behind this project.

What our project is

For patients with pre-existing IBD, hair samples are taken over a period of time and analyzed for their metal and mineral content. From these measurements, we estimate the patient's level of malnutrition and, in turn, the risk that the patient has developed Crohn's disease. Crohn's disease is most likely to show up as some degree of malnutrition, because inflammation of the gut mucosa (a direct result of Crohn's) impairs the patient's ability to absorb nutrients. Because samples are collected over time, the project's algorithm can also support longitudinal studies that help clinicians administer an adaptive and effective treatment plan.

How we built it

We built the project using data on micronutrient concentrations in hair samples. From these samples, we trained a logistic regression model, a machine learning classifier, to predict whether a patient is at risk for Crohn's disease. The model uses five variables to indicate a patient's risk: the concentrations of magnesium, sulfur, iron, zinc, and calcium. We created the logistic regression model in Python and then developed the application with Streamlit.
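As a rough illustration of the modeling step, here is a minimal sketch of a logistic regression over the five micronutrient variables. It is not our actual training code: the data is randomly generated, the assumption that at-risk samples trend lower on every nutrient is made up for the demo, and the names `make_synthetic_data`, `fit_logistic_regression`, and `predict_proba` are hypothetical. It uses only the Python standard library so it runs anywhere.

```python
import math
import random

FEATURES = ["magnesium", "sulfur", "iron", "zinc", "calcium"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_data(n, seed=0):
    """Fabricated, standardized values for illustration only (not real data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        at_risk = rng.random() < 0.5
        # Assumption for the demo: at-risk samples trend lower on every nutrient.
        shift = -1.0 if at_risk else 1.0
        rows.append([rng.gauss(shift, 1.0) for _ in FEATURES])
        labels.append(1 if at_risk else 0)
    return rows, labels


def fit_logistic_regression(X, y, lr=0.1, epochs=300):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that a sample with feature vector x is at risk."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


X, y = make_synthetic_data(200)
weights, bias = fit_logistic_regression(X, y)
accuracy = sum(
    (predict_proba(weights, bias, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y)
) / len(y)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The printed accuracy only says the sketch fits its own synthetic data; it says nothing about real patients. A library such as scikit-learn offers an equivalent `LogisticRegression` estimator that would replace the hand-written training loop.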

Challenges we ran into

Our biggest challenge in developing the algorithm was not the code itself but the probabilities and math behind it. Finding a detailed and consistent dataset that matched our demographic and regional specifications proved difficult, so we combined multiple datasets and worked out which deficiencies were reported most consistently across studies. Measuring the deficit also proved somewhat subjective, because different experiments used different criteria for what constituted a "normal amount" of vitamins and minerals in the body. The task was made harder still because the hallmarks of Crohn's-specific malnutrition resemble those of standard malnutrition, apart from a few defining characteristics that, once again, varied from study to study.
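To make the cross-study comparison concrete, the sketch below shows one way such a tally could work: each measurement is judged against its own study's definition of "normal", and the deficiencies are then counted across studies. Every study name and number here is a made-up placeholder in arbitrary units, and the helper names `deficient_nutrients` and `rank_deficiencies` are hypothetical; this is not the calculation we actually ran.

```python
from collections import Counter

# Made-up placeholder numbers in arbitrary units, NOT data from any real study.
STUDIES = [
    {
        "name": "study_a",
        "lower_limit": {"iron": 10.0, "zinc": 150.0, "calcium": 300.0},
        "patient_mean": {"iron": 8.0, "zinc": 120.0, "calcium": 350.0},
    },
    {
        "name": "study_b",
        "lower_limit": {"iron": 12.0, "zinc": 140.0, "magnesium": 20.0},
        "patient_mean": {"iron": 11.0, "zinc": 145.0, "magnesium": 15.0},
    },
    {
        "name": "study_c",
        "lower_limit": {"iron": 9.0, "zinc": 160.0, "calcium": 280.0},
        "patient_mean": {"iron": 9.5, "zinc": 130.0, "calcium": 250.0},
    },
]


def deficient_nutrients(study):
    """Nutrients whose patient mean falls below that study's own lower limit."""
    return sorted(
        nutrient
        for nutrient, value in study["patient_mean"].items()
        if value < study["lower_limit"][nutrient]
    )


def rank_deficiencies(studies):
    """Count in how many studies each nutrient was deficient, most common first."""
    counts = Counter()
    for study in studies:
        counts.update(deficient_nutrients(study))
    return counts.most_common()


for nutrient, n_studies in rank_deficiencies(STUDIES):
    print(f"{nutrient}: deficient in {n_studies} of {len(STUDIES)} studies")
```

Comparing each value against its own study's threshold, rather than one global cutoff, sidesteps the disagreement over what counts as a "normal amount", although it cannot fix differences in how the studies took their measurements.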

Accomplishments that we're proud of

First, we defined a fixed list of independent variables and used it as the stable foundation for the algorithm, which reflected a clear understanding of the problem and made it easier to develop an effective solution. Second, we learned and successfully implemented an entirely new Python library in a short time frame, which showed how quickly we could grow technically.

What we learned

Three of us were students from the Engineering and Computer Science school and, as such, were not well versed in the subtleties of pathogenesis and disease. Working on a solution to this problem, however, meant we quickly picked up new and exciting details. For example, we learned about the effect that different vitamins and minerals have on physical health and how a disease's impact can vary with income level, demographics, and region. We also learned Streamlit, an open-source framework with an intuitive interface that integrates with various data science tools, and used it to create a web application. We wrote the application code in a single Python file and used widgets to create interactive elements for data analysis and machine learning tasks. In addition, we were able to serve our Streamlit app from a local web server (localhost). Overall, the project was a success: we were able to use Streamlit to tackle a complex problem and produce a functional application.
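The widget-based approach described above can be sketched roughly as follows. This is a simplified illustration rather than our actual application: the coefficients are hypothetical placeholders, not our trained values, the inputs are assumed to be standardized levels, and `risk_probability` is a name invented for this example.

```python
import math

try:
    import streamlit as st
except ImportError:
    # Assumption for this sketch: allow the scoring function to be used
    # (and tested) on machines where Streamlit is not installed.
    st = None

NUTRIENTS = ["Magnesium", "Sulfur", "Iron", "Zinc", "Calcium"]

# Hypothetical placeholder coefficients, NOT the project's trained values.
INTERCEPT = 0.0
WEIGHTS = {name: -0.8 for name in NUTRIENTS}


def risk_probability(levels):
    """Logistic-regression score for a dict of standardized nutrient levels."""
    z = INTERCEPT + sum(WEIGHTS[name] * levels[name] for name in NUTRIENTS)
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def main():
    """Render one numeric input per nutrient and show the score on demand."""
    st.title("Traced: Crohn's Disease Risk Estimate")
    st.write("Enter standardized hair micronutrient levels (0 = reference average).")
    levels = {
        name: st.number_input(name, value=0.0, step=0.1) for name in NUTRIENTS
    }
    if st.button("Estimate risk"):
        st.metric("Estimated risk", f"{risk_probability(levels):.0%}")


if st is not None:
    main()
```

Saved under a name such as `app.py` (a hypothetical filename), the script can be started with `streamlit run app.py`, which serves the page on localhost. Keeping the scoring function separate from the widgets means the model logic can be checked without launching the UI.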

What's next for Traced: An Approach to Crohn's Disease Prediction

In the future, we plan to make several changes to improve accuracy and optimize the user experience. These include increasing the sample size across different demographic groups and geographic regions, conducting further validation studies to assess the algorithm's accuracy, and acquiring more accurate data on diagnostic markers for Crohn's disease so that those markers can be added to the model as additional variables.