POLYHEDRONvsCOVID: Multilingual interaction on scientific issues

POLYHEDRON vs COVID (Multilingual interaction on high-priority scientific questions)

The problem to be solved

Today there are datasets that represent the most extensive machine-readable coronavirus literature collection available for data and text mining to date, with over 57,000 articles, more than 45,000 of which have full text (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge). This project helps the scientific community answer high-priority scientific questions related to COVID-19 and interact effectively with each other in different languages while answering these questions.

Our solution

We use semantic-linguistic analysis of large volumes of unstructured information to structure it, establish contextual links between the documents being processed, produce forecasts, and support rational decision-making by building web-oriented information-analytical solutions.
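To illustrate what "establishing contextual links between documents" can look like, here is a minimal sketch that links documents whose TF-IDF vectors are similar. It is an assumption for illustration only, not the project's actual method; the function names and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_documents(docs, threshold=0.05):
    """Return (i, j, score) for every document pair at or above the threshold."""
    vectors = tfidf_vectors(docs)
    links = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                links.append((i, j, score))
    return links
```

Calling `link_documents` on a list of abstracts returns the index pairs that share enough distinctive vocabulary; a real system would likely add stemming, multilingual normalization, and a tuned threshold.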

What you have done during the weekend (hackathon)

We created a technical prototype to solve the problem above. We used datasets from https://connect.medrxiv.org/relate/feed/181 and https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge and provided answers to the questions listed at https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/tasks?taskId=568

Who is doing what in the team:

Dmytro Filistieiev @Dmytro works as product manager and is responsible for the project architecture.

Maksym Nadutenko @maxkrb works on the back end in C# and programmed corpus-based text-mining techniques based on lexicographical theory. This includes semantic indexing and ontology-based fuzzy search.

Vitalyi Pryhodnuk @tangens91 works on most of the front end and a large part of the back end, including many text-processing submodules.

Vjacheslav Gorborukov @slavon07 works on back-end, including MCDA (Multiple-criteria decision analysis) modules.

Oleksandr Stryzhak @sae953 is the scientific advisor of the project.
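The ontology-based fuzzy search mentioned in the team description can be pictured with the following minimal sketch. It is an assumption for illustration, written in Python rather than the project's C#; the mini-ontology, the function name and the cutoff are hypothetical, and string similarity via the standard `difflib` module stands in for whatever matching the real back end uses.

```python
import difflib

# Hypothetical mini-ontology: concept -> labels (synonyms) in several languages.
ONTOLOGY = {
    "coronavirus": ["coronavirus", "covid-19", "sars-cov-2", "коронавірус"],
    "transmission": ["transmission", "spread", "передача"],
    "incubation": ["incubation period", "incubation", "інкубаційний період"],
}


def fuzzy_concepts(query, ontology=ONTOLOGY, cutoff=0.75):
    """Map a possibly misspelled query term to matching ontology concepts.

    Returns (concept, similarity) pairs sorted from best to worst match.
    """
    query = query.lower().strip()
    scored = []
    for concept, labels in ontology.items():
        # A concept scores as well as its best-matching label.
        best = max(
            difflib.SequenceMatcher(None, query, label.lower()).ratio()
            for label in labels
        )
        if best >= cutoff:
            scored.append((concept, best))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because every concept carries labels in more than one language, a query such as `fuzzy_concepts("коронавірус")` and a misspelled English query both resolve to the same concept, which is one simple way to support multilingual lookup.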

The solution’s impact to the crisis

This project will enable scientists and healthcare professionals around the world to reduce the time needed to retrieve specific scientific COVID-19 data and to interact effectively with each other in different languages (especially in Ukraine), streamlining data and knowledge on COVID-19 issues from around the world. As a result, a more rapid response to COVID-19 and similar cases becomes more likely.

The necessities in order to continue the project

To continue the project, we need logistical support to host it in the cloud, funding to maintain a team of programmers, and a team to support the project until its completion. The estimated cost for one year is approximately 100,000 to 1,900,000 euros.

The value of your solution(s) after the crisis

After the COVID-19 crisis, the project can be applied in any field that needs to process large amounts of information, for example research in ecology, the consumption of goods and services, the standardization of products and processes, marketing research, etc.

The URL to the prototype

https://covid19tdm.stemua.science

The URL to the pitch video

https://youtu.be/uYfUY_nQrQs

Submitted to

The European Commission's EUvsVirus Hackathon

Created by

Maxkrb Krb
PhD in applied and computational linguistics
I worked on the back end in C# and programmed corpus-based text-mining techniques based on lexicographical theory. This includes semantic indexing and ontology-based fuzzy search.

Vitalik 1700
I worked on most of the front end and a large part of the back end, including many text-processing submodules.

Viacheslav Gorborukov
I worked on the back end, including the MCDA modules.

Dmytro Filistieiev
I worked as product manager, responsible for the product lifecycle.

Updates

Dmytro Filistieiev posted an update — Apr 25, 2020 05:38 AM EDT

Current trends in information and technological development are shaped by the drive to build innovative, knowledge-oriented solutions for the consolidated management of national security processes. Researchers and developers pay particular attention to the intelligent processing of huge volumes of data (Big Data), which actively influence the development of society and the national security of the state. In technologically advanced countries (USA, China, Japan, South Korea, the European Union, etc.), Big Data is processed through the convergence of data science and cognitive technologies, which are key to the design of artificial intelligence systems. At the same time, the problem of artificial intelligence is closely tied to natural language, because intelligence in general is a "form of systems with a linguistic status individualization".

With the rapid development of the networked civilization, the problem of using previously accumulated knowledge, whose volume keeps growing, is becoming increasingly acute, which in turn gives rise to a number of complex tasks. According to forecasts by leading analytics firms, the total global volume of human-generated and replicated data will exceed 3.6 zettabytes (36 trillion GB) by the end of 2020. However, there are reasonable grounds for believing that such volumes of data can be processed effectively and, above all, adequately perceived and properly understood by their recipients. The urgent need to answer this global challenge stimulates new approaches to handling oversized data arrays, which is what gave rise to the concept of Big Data. These features of the current stage of network development make it urgent to build intelligent analytical tools able to "take over" at least part of the functions along a person's basic cognitive path.
This functionality is fully provided by the cognitive platform «POLYHEDRON» (also written "POLIEDR"), which supports the full cycle of analyzing large volumes of information resources regardless of the format in which they were created. All content of the information resources fed into the system is treated as a single narrative. The cognitive services structure and classify the information, synthesize the required documents on the basis of semantic analysis, identify the properties of information processes, and support the selection and adoption of optimal decisions as well as forecasting. The platform's cognitive services make it possible to build personal information-analytical platforms that can promptly and comprehensively process all information generated in spatially distributed network sources. In particular, this applies to the coronavirus pandemic. The platform ensures the integrated use of information that is continuously retrieved and updated by all relevant specialists, which creates the technological conditions for effective interaction between the agencies and structures that analyze information on the spread and impact of coronavirus infection and make decisions. Practice shows that the platform's services allow thematically oriented information and analytical platforms to be put into production within 3-5 days, at an information load of up to 1,000,000 documents of about 100 pages each.
In general, the cognitive services of the POLYHEDRON platform can be quickly applied in the following main directions:
• building full-scale logistics solutions to improve the capacity of medical and other institutions involved in countering the spread of the coronavirus pandemic;
• identifying and defining ways to reduce the risk of the coronavirus pandemic spreading;
• analyzing the technological and technical capabilities of medical and other institutions involved in countering the spread of the coronavirus pandemic, including the organization of resource support, accounting, control and decision-making in these fields;
• analyzing the legislative and regulatory framework for protecting the population of Ukraine in the emergency conditions of the coronavirus pandemic;
• operational analysis of information circulating in the Internet resources of Ukraine and other countries;
• forming forecast conclusions based on information received from the State Statistics Service of Ukraine, medical institutions, various internet sources, etc.;
• researching the fundamental causes of crisis situations (including pandemics) in the world and formulating proposals for the NSDCU on the foundations of a rational strategy for Ukraine under hard resource constraints;
• developing objectively justified short-term budgets to finance measures to counter the spread of the coronavirus pandemic;
• providing information and consultation services to the population.
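The decision-support role described in this update, together with the MCDA modules mentioned in the team section, can be illustrated with a weighted sum model, one of the simplest multiple-criteria decision analysis techniques. This is a minimal sketch under stated assumptions: the project does not say which MCDA method it uses, and the function name, criteria and data layout are hypothetical.

```python
def weighted_sum_ranking(alternatives, weights, benefit):
    """Rank alternatives with the weighted sum model (WSM).

    alternatives: dict name -> dict criterion -> raw score
    weights:      dict criterion -> non-negative weight
    benefit:      dict criterion -> True if higher is better, False for costs
    Returns (name, score) pairs sorted from best to worst.
    """
    total_weight = sum(weights.values())
    scores = {name: 0.0 for name in alternatives}
    for criterion, weight in weights.items():
        values = [alt[criterion] for alt in alternatives.values()]
        low, high = min(values), max(values)
        span = high - low
        if span == 0:
            # All alternatives are equal on this criterion; it cannot rank them.
            continue
        for name, alt in alternatives.items():
            # Min-max normalization to [0, 1], flipped for cost criteria.
            norm = (alt[criterion] - low) / span
            if not benefit[criterion]:
                norm = 1.0 - norm
            scores[name] += (weight / total_weight) * norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: three response options scored on cost and coverage.
options = {
    "A": {"cost": 100, "coverage": 60},
    "B": {"cost": 200, "coverage": 90},
    "C": {"cost": 120, "coverage": 80},
}
ranking = weighted_sum_ranking(
    options,
    weights={"cost": 0.5, "coverage": 0.5},
    benefit={"cost": False, "coverage": True},
)
```

With equal weights, option C ranks first here because it is nearly as cheap as A while covering almost as much as B; changing the weights shifts the ranking, which is the point of making the criteria explicit.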

Dmytro Filistieiev started this project — Apr 24, 2020 02:52 PM EDT
