More people will die if policy and clinical decisions are not based on solid science and if researchers follow the wrong leads. But there are already tens of thousands of articles with important findings related to COVID-19, and new ones appear every day. No one can read all of this, especially not in an urgent crisis. And even if someone could, no human can keep all those scattered pieces of information in their head.
We want to help researchers, health care staff and policy makers make evidence-based decisions even if they cannot spend their time reading thousands of scientific articles. We also want to bring annotated scientific evidence to everyone else who needs it: journalists, public health authorities, vaccine program funders and YOU! For this, we are building a simple, customizable web tool with a powerful text mining engine under the hood. Simply choose a pre-made profile that fits you, or upload a keyword list or example articles from your area of interest. You will then get a ranked list of annotated key sentences from the entire medical literature that fit your interests, with an option to filter by annotation class (e.g. drugs, symptoms, risk factors, genes). And if you find something of high relevance, one click will take you to the full abstract or article with highlighted annotations. For you, it could not be simpler. For us, it means putting together a powerful ensemble of several state-of-the-art natural language processing models and other BioNLP tools such as expert-curated keyword dictionaries.
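To make the workflow described above concrete, here is a minimal sketch of ranking annotated sentences against a keyword profile and filtering them by annotation class. It is an illustration only, not our actual engine: the `Sentence` class, the `rank_sentences` function, the article ids and the example sentences are all made up for this example, and the score is simply the number of distinct profile keywords found in a sentence.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Sentence:
    text: str
    source: str  # hypothetical article id
    # annotation class -> annotated terms, e.g. {"symptom": ["fever"]}
    annotations: dict = field(default_factory=dict)


def rank_sentences(sentences, keywords, annotation_class=None):
    """Rank sentences by how many distinct profile keywords they contain.

    If annotation_class is given, keep only sentences that carry at least
    one annotation of that class. Sentences without any keyword are dropped.
    """
    profile = {k.lower() for k in keywords}
    scored = []
    for s in sentences:
        if annotation_class is not None and not s.annotations.get(annotation_class):
            continue
        tokens = set(re.findall(r"[a-z0-9-]+", s.text.lower()))
        hits = len(profile & tokens)
        if hits:
            scored.append((hits, s))
    # Stable sort: sentences with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]


# Made-up toy data, not taken from real articles.
sentences = [
    Sentence("Fever was the most commonly reported symptom.",
             "article-1", {"symptom": ["fever"]}),
    Sentence("DrugX was tested in patients with fever and cough.",
             "article-2", {"drug": ["DrugX"], "symptom": ["fever", "cough"]}),
    Sentence("The study protocol was approved in advance.",
             "article-3", {}),
]
keywords = ["fever", "cough", "drugx"]

for s in rank_sentences(sentences, keywords, annotation_class="drug"):
    print(s.source, "|", s.text)
```

A real profile would be matched with far more robust methods than exact token overlap, but the interface stays the same: a profile goes in, a ranked and filterable list of sentences comes out.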
What we have done during the weekend
Welcomed new team members and expanded the text mining pipeline.
Due to the urgency of this crisis, the normal strategy, in which small expert groups weigh the evidence and then summarize it for everyone else, is too slow. As a result, many decisions are made without weighing all the available scientific knowledge. Obviously, this is not a good situation. Our tool can change that, and it can even be customized for specific questions and organizations. Knowledge is not only power; knowledge saves lives.
To continue with this project, we need volunteers, a home for our web tool and eventually a little bit of funding to pay those working on it full time.
Value after this crisis
After the crisis is before the crisis. Climate change, cancer, obesity, malaria, tuberculosis... COVID-19 is not the only global health emergency, and with almost 30 million medical publications, the need to find the knowledge in the haystack will not go away. Because our tool provides customizable text mining for medical texts of any kind, it can live on to help us with all the other problems our society faces.
How we built it
We combine a variety of natural language processing approaches (model-, dictionary- and rule-based) with other BioNLP tools made by us and by other researchers. This work has grown out of a large academic research project at Lund University, which was switched over to a COVID-19 focus a few weeks ago. Thanks to the tireless work of our team, collaborators and volunteers, we are close to a full prototype.
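As a rough illustration of how dictionary-based and rule-based annotation can be combined, here is a minimal sketch. It is not our actual pipeline: the tiny `DICTIONARY`, the `DOSE_RULE` pattern and all function names are assumptions made up for this example, and the model-based part of the ensemble is left out entirely.

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int   # character offset where the match begins
    end: int     # character offset where the match ends
    text: str    # matched text
    label: str   # annotation class, e.g. "symptom"
    method: str  # which annotator produced it


# Hypothetical toy dictionary; a real one would be expert-curated.
DICTIONARY = {"fever": "symptom", "cough": "symptom", "ace2": "gene"}

# Hypothetical rule: a number followed by a unit is treated as a dose.
DOSE_RULE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|ml|g)\b", re.IGNORECASE)


def annotate_with_dictionary(text):
    found = []
    for term, label in DICTIONARY.items():
        pattern = rf"\b{re.escape(term)}\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            found.append(Annotation(m.start(), m.end(), m.group(), label, "dictionary"))
    return found


def annotate_with_rules(text):
    return [
        Annotation(m.start(), m.end(), m.group(), "dose", "rule")
        for m in DOSE_RULE.finditer(text)
    ]


def annotate(text):
    """Run both annotators and merge their output, sorted by position."""
    merged = annotate_with_dictionary(text) + annotate_with_rules(text)
    return sorted(set(merged))


example_text = "Patients with fever and cough received 200 mg daily."
for a in annotate(example_text):
    print(a.start, a.end, a.text, a.label, a.method)
```

Keeping a `method` field on every annotation makes it possible to see which annotator produced a hit, which helps when the outputs of several approaches disagree.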
What remains is more work on putting together the different parts of the text mining ensemble, developing the webpage and building a custom version for our collaborators at CEPI. We also plan an option to input other types of text, such as social media posts, in order to annotate them for various research projects. Help and input are always welcome!