Inspiration
This project was inspired by the fact that HBV (hepatitis B virus) is a leading cause of hepatocellular carcinoma (HCC) worldwide, and early detection saves lives. I wanted to combine multi-modal data, clinical features together with gene expression, into a tool that can help predict HBV-related HCC risk and identify the most predictive gene signatures. The goal was to design something that could accurately detect early signs of HBV-related HCC.
What it does
Our project trains models on a multi-cohort dataset and outputs risk predictions for HBV-HCC based on user-provided clinical and gene expression data.
How we built it
The project was built on the TCGA-LIHC, ICGC LIRI-JP, and GEO HBV-HCC cohorts. The models were logistic regression and random forests, each trained on clinical-only, expression-only, and combined data. Feature selection used univariate analysis, recursive feature elimination (RFE), and random forest importance. Cross-cohort validation was used to check generalizability.
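As a rough illustration of the cross-cohort validation described above, here is a minimal sketch using scikit-learn. It is not our actual pipeline: the data is randomly generated, the cohort labels are only placeholders borrowed from the dataset names, and the helper `cross_cohort_auc`, the model settings, and the leave-one-cohort-out scheme are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): NOT the real TCGA/ICGC/GEO cohorts.
rng = np.random.default_rng(0)
n_samples, n_clinical, n_genes = 300, 4, 30
X_clinical = rng.normal(size=(n_samples, n_clinical))
X_expr = rng.normal(size=(n_samples, n_genes))
# The label depends on one clinical feature and two gene features, plus noise.
signal = 1.5 * X_clinical[:, 0] + X_expr[:, 0] - X_expr[:, 3]
y = (signal + rng.normal(scale=0.5, size=n_samples) > 0).astype(int)
# One cohort label per sample; names are placeholders for the three sources.
cohort = np.repeat(["TCGA-LIHC", "ICGC-LIRI-JP", "GEO"], n_samples // 3)

feature_sets = {
    "clinical": X_clinical,
    "expression": X_expr,
    "combined": np.hstack([X_clinical, X_expr]),
}
# Factories so that every fold starts from a fresh, unfitted model.
models = {
    "logreg": lambda: make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "rf": lambda: RandomForestClassifier(n_estimators=100, random_state=0),
}

def cross_cohort_auc(make_model, X, y, groups):
    """Train on all cohorts but one, test on the held-out cohort, return AUC per cohort."""
    scores = {}
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        model = make_model().fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        scores[str(groups[test_idx][0])] = roc_auc_score(y[test_idx], proba)
    return scores

results = {
    (model_name, set_name): cross_cohort_auc(make_model, X, y, cohort)
    for model_name, make_model in models.items()
    for set_name, X in feature_sets.items()
}
for key, scores in results.items():
    print(key, {c: round(s, 3) for c, s in scores.items()})
```

Holding out a whole cohort at a time, rather than a random split, means each score reflects data from a source the model never saw during training, which is the point of checking generalizability.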
Challenges we ran into
The main challenges were finding suitable labeled datasets, producing the plots and graphs, and hosting the Streamlit app locally. (The app was not tested; for some reason, localhost would not load.)
Accomplishments that we're proud of
We're proud of the graphs we produced, the per-cohort performance, and the overall results. The top 10 predictive features/genes were:
1. Gene_0000
2. Gene_0003
3. Gene_0006
4. Gene_0010
5. Gene_0012
6. Gene_0015
7. Gene_0018
8. Gene_0021
9. Gene_0024
10. Gene_0027
ROC-AUC = 0.991, F1 score = 0.896
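For readers wondering how a top-10 list like this can come out of the three feature selection methods mentioned earlier (univariate analysis, RFE, and random forest importance), here is a minimal sketch. It runs on randomly generated data with placeholder gene names, and the majority-vote step that merges the three lists is an assumption for illustration, not necessarily how our ranking was produced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic expression matrix (assumption): placeholder genes, not real data.
rng = np.random.default_rng(42)
n_samples, n_genes, top_k = 200, 40, 10
gene_names = np.array([f"Gene_{i:04d}" for i in range(n_genes)])
X = rng.normal(size=(n_samples, n_genes))
# Only the first three genes carry signal in this toy example.
y = (X[:, 0] + X[:, 1] - X[:, 2] + rng.normal(scale=0.5, size=n_samples) > 0).astype(int)

# 1. Univariate analysis: keep the top_k genes by ANOVA F-score.
univariate = SelectKBest(f_classif, k=top_k).fit(X, y)
top_univariate = set(gene_names[univariate.get_support()].tolist())

# 2. Recursive feature elimination around a logistic regression.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=top_k).fit(X, y)
top_rfe = set(gene_names[rfe.support_].tolist())

# 3. Random forest importance: keep the top_k genes by impurity-based importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
order = np.argsort(forest.feature_importances_)[::-1][:top_k]
top_forest = set(gene_names[order].tolist())

# Hypothetical merge step: keep genes chosen by at least two of the three methods.
selections = (top_univariate, top_rfe, top_forest)
votes = {g: sum(g in s for s in selections) for g in gene_names.tolist()}
consensus = sorted((g for g, v in votes.items() if v >= 2), key=lambda g: (-votes[g], g))
print(consensus)
```

Genes that all three methods agree on sort to the front of `consensus`; the methods can disagree on the noisier genes, so the merged list may end up shorter or longer than 10.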
What we learned
From this project, we learned better ways of finding and accessing datasets, making graphs, and performing cross-validation and external validation. I also learned how to use Colab for the very first time.
What's next for Untitled
In the future, I hope this project can serve as a guide for larger projects, helping with validation, graphs, and calculations.