Inspiration
I chose to take part in the TAMU Datathon because I thought it would be a great place to learn about the world of machine learning and data science. As this was my first time attending a datathon or hackathon, I chose to work on the beginner challenge, "TD Hospital Exploration."
What Does It Do
The model predicts patient survival from medical information in the patient data, using a set of features I selected because they appear to correlate with survival.
How Did You Build It
Having been given starter code and a spreadsheet of patient data, I spent the majority of my time looking over the data. Based on the hospital's descriptions of the features and how clean each feature's data was, I eventually selected the features that the model would use to predict patient survival. Features that were missing too many data points (such as urine) or that had no bearing on patient survival (such as dose, which is the same for all patients) were judged unhelpful and left out. The features I kept were those that made sense, from a medical standpoint, to relate to patient survival. I was also told that giving the model too many features could confuse it, since it would not know how to handle every patient situation, so I limited the number of features to keep its predictions as accurate as possible.
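The two filtering rules described above (drop features with too many missing values, drop features that never vary) can be sketched in a few lines of Python. This is only an illustration, not the code I submitted: the function name select_features, the 30% missing-value cutoff, the "age" and "blood" columns, and all sample values are assumptions made up for the example. Only urine and dose come from the actual data.

```python
def select_features(rows, candidates, max_missing=0.3):
    """Keep candidate columns that are mostly present and not constant.

    rows is a list of dicts, one per patient. max_missing is an assumed
    cutoff (30% here), not a value taken from the challenge.
    """
    if not rows:
        return []
    kept = []
    for name in candidates:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v not in (None, "")]
        missing_fraction = 1 - len(present) / len(values)
        if missing_fraction > max_missing:
            continue  # too many gaps, like the urine column
        if len(set(present)) <= 1:
            continue  # same value for every patient, like dose
        kept.append(name)
    return kept


# Tiny made-up sample; the values are illustrative, not from the real data.
rows = [
    {"age": 71, "urine": None, "dose": 1, "blood": 9.1},
    {"age": 54, "urine": None, "dose": 1, "blood": 7.4},
    {"age": 63, "urine": 2100, "dose": 1, "blood": None},
    {"age": 80, "urine": None, "dose": 1, "blood": 8.8},
]
print(select_features(rows, ["age", "urine", "dose", "blood"]))
```

A filter like this only removes columns that are clearly unusable. Deciding which of the remaining features make medical sense was still a manual judgment call.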
What Were Some Challenges You Faced
One challenge I ran into was sorting through the data spreadsheet. With 40-some features and over 7,000 data points, sifting through the data and determining what was and was not relevant was very difficult. Cleaning data that was either unreadable by the model or missing too many data points was another issue I faced while working on this project.
What Made You Proud
As this was my first datathon, I came in with very few expectations and did not know how I would do, considering I had never worked with machine learning or data modeling. Regardless, I felt proud after submitting to the leaderboard, as I was able to come up with something to submit.
What Did You Learn
By the end, my understanding of medical conditions and their correlation with the likelihood of death was confirmed: many of the features that seemed to correlate strongly with patient survival were those that made sense from a medical perspective.