We have the responsibility and the technological tools to design and plan action strategies in the face of the current health crisis. Now, more than ever, our role is to provide the system with tools that facilitate decision-making and the generation of knowledge for the health of all citizens, forming the first line of action in this crisis.
What is the real problem that we want to solve
- Clinical data is researchers' greatest ally in fighting this pandemic and future crises. Currently, the information collected in patients' medical reports is not being intelligently extracted with the available technologies, researchers cannot easily share their clinical databases with others to advance faster, and clinical studies are not homogeneous in the variables they use. Often, different research groups even work toward the same objective without being able to join forces and reach results sooner.
What it does
We propose a database exchange system. Using natural language processing, doctors and researchers will be able to extract the variables of interest from the anonymised medical reports of patients affected by Covid-19 and other pathologies, create their own databases, and share them with colleagues or even with other research groups in different communities or countries. This promotes free access to data among physicians, allows research studies to be carried out faster and with larger sample sizes, and creates a communication network between scientists that can help promote new prevention, diagnosis and treatment measures.
The solution’s impact on the crisis
- Our platform will allow closer collaboration between researchers by providing a tool to share ideas, projects and data. They will be able to create their own databases by extracting data from anonymised reports. It will also allow them to incorporate and unify existing, publicly accessible databases. All this is intended to accelerate research and to enable prediction tools based on artificial intelligence or other statistical methods (which require large data samples) to fight and defeat Covid-19.
How can this idea be replicable in other contexts
Extracting data from clinical reports through natural language processing and building a database from it could be applied to the study and investigation of all medical and surgical pathologies, although at the moment our primary focus is Covid-19.
Medical reports can be analyzed in different languages, with subsequent coding using SNOMED CT and ICD-10, allowing health professionals to represent the information and share it appropriately, accurately and unequivocally even between different countries.
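As a rough illustration of why coding matters, the sketch below maps terms found in Spanish or English report text to the same ICD-10 codes, so that two reports written in different languages yield the same coded result. The lookup table `ICD10_BY_TERM` and the function `code_report` are hypothetical names introduced here; a real deployment would rely on the official ICD-10 and SNOMED CT releases or a terminology service rather than a hand-written dictionary and simple substring matching.

```python
from typing import List

# Illustrative, hand-written lookup table (an assumption, not the project's data).
# Terms in two languages point at the same ICD-10 code.
ICD10_BY_TERM = {
    "covid-19": "U07.1",        # COVID-19, virus identified
    "hypertension": "I10",      # Essential (primary) hypertension
    "hipertensión": "I10",
    "type 2 diabetes": "E11",   # Type 2 diabetes mellitus
    "diabetes tipo 2": "E11",
}


def code_report(report_text: str) -> List[str]:
    """Return the sorted, de-duplicated ICD-10 codes whose terms appear in the text."""
    lowered = report_text.lower()
    codes = {code for term, code in ICD10_BY_TERM.items() if term in lowered}
    return sorted(codes)


spanish = code_report("Paciente con COVID-19 e hipertensión arterial.")
english = code_report("Patient with COVID-19 and hypertension.")
print(spanish == english)  # both reports map to the same codes
```

Naive substring matching like this would miscode phrases such as negated findings, which is one reason the coding step belongs after proper language processing.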
Could it be taken to other environments
Data extraction will be performed by exporting anonymised medical reports to a website, which can be installed on the system's intranet, accessed online or even run as an offline web app, where clinical variables are extracted and stored in its own database. Hence, the tool would not necessarily be tied exclusively to hospitals.
The tool could be used across all health-related disciplines: medicine, nursing, health administration, preclinical sciences, the pharmaceutical industry, health organizations, and prevention and epidemiology institutions.
How can we capitalize on the available resources currently linked to our idea
Our value proposition is a network platform that allows the direct extraction of medical variables, the storage of researchers' own databases and the exchange of data between researchers, and that lets researchers group together around common objectives and join efforts.
What kind of collaborations we should generate to carry this process forward
- The support of research institutes, innovation units and hospitals is essential for us to validate our proposal and create value.
What is our target user
- Direct: physicians, nurses, clinical researchers, data managers and associates.
- Indirect: research institutions, public and private hospitals, the pharmaceutical industry, health organizations.
What is the impact to society if the idea is implemented at scale
- Collaborative work is the fastest way for science to advance. Nowadays researchers specialize in increasingly complex topics for which it is difficult to obtain their own data. A platform that enables communication and collaboration between researchers, promotes free access to data, and makes studies homogeneous by coding the variables so that we all speak the same language is sorely needed. It could connect researchers from communities around the world on an issue as important as health, where the search for solutions never stops.
What is our source of income
- We want all researchers to be able to use our tool without any economic barrier. A valuable basic service will therefore be offered free of charge, with subscriptions for access to the most advanced content. Maintenance and support services will also be offered to institutions that want to deploy our software within their own IT systems.
What is our funding strategy
We are currently at the seed funding stage. The investment will support our creation phase until we generate our own cash flow or are ready for a new investment round. Options include friends-and-family funding, seed venture capital funds, angel funding and crowdfunding.
How we built it
- NLP service: main code in Python, using the NLTK library
- Database service: MongoDB
- REST APIs
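The source does not show the NLP service itself, so the following is only a minimal sketch of the kind of variable extraction it performs, using regular expressions from the standard library as a stand-in for the NLTK-based pipeline. The variable names, the patterns in `VARIABLES` and the function `extract_variables` are all assumptions made for illustration.

```python
import re

# Hypothetical variables and patterns for Spanish-language reports.
# Each entry maps a variable name to (pattern, type used to convert the match).
VARIABLES = {
    "age": (re.compile(r"(\d{1,3})\s*años"), int),
    "temperature_c": (re.compile(r"temperatura\D{0,15}(\d{2}(?:[.,]\d)?)"), float),
    "spo2_percent": (re.compile(r"saturaci[oó]n\D{0,20}(\d{2,3})\s*%"), int),
}


def extract_variables(report: str) -> dict:
    """Return a flat record with every variable found in the report text."""
    text = report.lower()
    record = {}
    for name, (pattern, cast) in VARIABLES.items():
        match = pattern.search(text)
        if match:
            # Spanish reports often use a decimal comma ("38,5").
            record[name] = cast(match.group(1).replace(",", "."))
    return record


record = extract_variables(
    "Varón de 67 años. Temperatura de 38,5 ºC. Saturación de oxígeno 91 %."
)
print(record)  # {'age': 67, 'temperature_c': 38.5, 'spo2_percent': 91}
```

A record shaped like this plain dictionary could then be stored as one document in the MongoDB service, for example with pymongo's `insert_one`, and served to other researchers through the REST APIs.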
What are our main competitors
- COVID-19 Data Portal
It allows researchers to upload, access and analyze COVID-19 laboratory data and specialized data sets in a common database. However, it does not include patients' clinical variables, does not allow data to be extracted directly from reports and, being a single shared database, does not let researchers set up collaborations or create their own databases.
- COVID-19 research database
This is a collection of data sets made freely available to public health and policy researchers to extract insights. Like the previous one, however, it consists of common databases: it does not allow users to add data, nor does it allow collaborations between researchers to be created.
The idea arose two years ago when we started collaborating with a Madrid hospital to analyze data from patients in the coronary unit. Since then, we have built a functional prototype made up of services: the web service, the database and the natural language processor are already developed.
How fast can the prototype be turned into a ready to use product
- Currently the prototype processes reports in Spanish, so it can be used not only in Spain but throughout Latin America. It could easily be adapted to reports in English; for the other languages of the European Union we would need the collaboration of developers from different countries. With this help and the necessary financial support, and given that the web service and the database service are already available, it would take us a minimum of three months to have a functional platform for collaboration between researchers from different communities.
What measures should we take into account to ensure personal protection of the health data contained in the Medical Records
The GDPR defines health data as data relating to the mental or physical health of a person, that is, data that reveal information about their health status. The general rule under Article 9 of the GDPR is that the processing of health-related data is prohibited. But the GDPR also establishes exceptions, among them "a general public interest".
For its part, Article 9 of the LOPDGDD also regulates the processing of health data, expressly referring to Article 9.2(g) of the GDPR. This processing is also covered by other laws, such as the General Health Law, the Biomedical Research Law and the General Law of Public Health, and is considered justified for reasons of "public interest".
In this way, we can conclude that:
- We can process health data, and our legal basis will be Article 9.2(g) in accordance with Article 6.1(e).
- To populate our database, we need access to medical records containing health data; that is, we must request a transfer of health data from the corresponding institutions, taking the following precautions:
a) Dissociation: provided that the clinical data can be separated from the identifying data, so that they cannot be associated with an identified or identifiable person, they may be transferred without further requirements. Once the identifying data are separated from the clinical-care data, we are no longer transferring "personal data". The duty of confidentiality does not apply when the information cannot be associated with a person.
b) Consent: if dissociation is not possible, the transfer of data will be legitimate as long as the patient has given their consent. The consent must be in writing, and it is advisable to file it in the medical record itself.
c) Legal authorization: in the absence of the patient's consent, health data can only be transferred to a third party when a law so provides.
To make sure that all clinical reports are anonymized, the tool automatically performs this process.
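The source does not describe how the automatic anonymisation works. As a hedged sketch only, a first pass could replace obvious direct identifiers with placeholders before any variable is extracted. The patterns in `REDACTIONS` (a DNI-like national ID, a dd/mm/yyyy date, a nine-digit Spanish phone number) and the function `anonymize` are assumptions; names and addresses are not covered and would need named-entity recognition or dictionary matching on top.

```python
import re

# Hypothetical redaction rules: (pattern, placeholder).
# IDs are replaced first so that later rules never see their digits.
REDACTIONS = [
    (re.compile(r"\b\d{8}[A-Za-z]\b"), "[ID]"),            # DNI-like: 8 digits + letter
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),  # dd/mm/yyyy
    (re.compile(r"\b[6789]\d{8}\b"), "[PHONE]"),           # 9-digit phone number
]


def anonymize(report: str) -> str:
    """Return the report with every matching identifier replaced by a placeholder."""
    for pattern, placeholder in REDACTIONS:
        report = pattern.sub(placeholder, report)
    return report


print(anonymize("DNI 12345678Z, ingreso 03/04/2020, tel. 612345678."))
# DNI [ID], ingreso [DATE], tel. [PHONE].
```

Replacing dates wholesale also removes clinically useful timing information, so a real tool might shift dates or keep only intervals instead.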
What we have done during the weekend
- Front-end development of the web service
- Database service completion
- New collaborators
- Presentation and refinement of the idea
- Improved data processing
- Complete study on data protection regulations
Challenges we ran into
First, we must continue expanding our local collaborations so that at a later stage we can start in other communities, prioritizing the processing of texts in other languages.
These data must be handled in the most secure, private and transparent way possible, with the aim of passing the usefulness of these data on to citizens.
Accomplishments that we're proud of
What drives all of this is the aim of providing useful resources that contribute to improving health care. Data from patients with SARS-CoV-2 can be of great help in identifying, monitoring and predicting the evolution of the pandemic, where allocating resources well is a priority when designing strategies to contain the disease.
What we learned
Our idea is to use this tool so that healthcare teams can extract all the necessary clinical data from patients with COVID-19 to advance knowledge of the virus and, through artificial intelligence or the statistical methods of their choice, create predictive models of evolution, epidemiological models, information on response to treatments, and sociodemographic data on the impact on the population and its behavior. If the researchers wish, these databases may be shared with members of the same team or with other collaborators.
Necessities in order to continue the project
To be successful and implement the platform, we will have to find several private or public centers with which to run a use case and validate the platform's function and utility, in order to scale it to more centers. Deploying it in hospitals of a certain size would produce impact quickly. For this we will need the following:
- NLP developers
Who are the founders
- Carmen Arquero Domínguez: Medicine, CMO, COO
- Fadel Hamed: Telecommunications engineering, IT Manager
- Javier González Bodas: Computer engineering, CIO
- José Ángel Álvarez Vázquez: Medicine, COO
- José de la Mata: Biology, Data Manager
- Óscar Otero Martínez: Telecommunications engineering, CIO
- Pablo Vaquero Martínez: Medicine, CEO
- Víctor Vaquero Martínez: Data scientist, CTO
- Yasna Vanessa Bastidas Cid: Law and data protection, Legal advisor