Problem to solve

A novel coronavirus has caused a pandemic (COVID-19). The deadly disease is spreading rapidly across the globe, forcing governments to lock down populations and put businesses on hold to reduce mortality and manage the burden on healthcare systems. The virus is invisible and spreads rapidly from human to human, even before symptoms occur or while they are still very mild. Large-scale testing, which is needed to detect infectious individuals as early as possible, is not available in the required quantities. Lockdowns therefore cannot be lifted without a high risk of increased mortality. Identifying and isolating individuals before they spread the disease requires regular testing of asymptomatic individuals combined with rapid communication of results. This makes high-throughput testing key to monitoring the outbreak while businesses reopen, and to preventing future waves of the pandemic.

Proposed solution

Establish a complete workflow that uses pre-barcoded sample collection devices to perform large-scale SARS-CoV-2 testing. Testing is performed in an established clinical laboratory using Next Generation Sequencing (NGS), and rapid data analysis is combined with a smartphone app to securely manage the data flow between healthcare providers and citizens. Communication is integrated into the process from sample collection all the way to receipt of the digital result. Scaling up SARS-CoV-2 testing capacity to meet the demands of healthcare systems and to test large groups of asymptomatic individuals, e.g. at large companies or schools, is necessary to ease lockdown measures and relax social distancing. The need for such a solution differs across countries in the EU. Some focus on increasing clinical testing capacity, while others are preparing to test large numbers of asymptomatic individuals. The proposed solution is flexible and can be applied in both settings to address the needs of healthcare systems and public health programs. The key components of the solution are:

Wetlab

We are developing a massively parallel NGS approach to identify SARS-CoV-2 in each infected sample. Samples from thousands of individually barcoded patient swabs can be sequenced and analyzed in parallel. Once the collected samples have reached the laboratory and been registered by their barcode, they enter the laboratory process immediately. The samples are transferred into a lysis buffer and forwarded to an automated RNA extraction workflow. After cDNA synthesis, several validated SARS-CoV-2-specific virus markers are enriched in each sample. A human control amplicon serves as the process control. All samples are then submitted to a highly scalable, massively parallel next generation sequencing platform. The sequencing barcodes that identify individual samples consist of 8 nucleotides; their combination is unique for each sample in one run. Ultra-high-throughput SARS-CoV-2 sequencing is used to increase the overall test capacity. Currently this enables the parallel analysis of around 3,000 samples in one run using a small sequencer. The solution scales by using more machines of the same type or larger machines with higher throughput. Such devices for massively parallel sequencing are widely adopted in research and diagnostics across many countries and can be used to establish ultra-high-throughput testing capabilities.

Data analytics

Using established cloud-based or local bioinformatics solutions, the sequencing output from thousands of samples in a single run is assigned to the individual samples (demultiplexing into individual FASTQ files). The reads are then aligned to a reference consisting of the entire SARS-CoV-2 genome and a human control gene (paired-end read alignment). After quality filtering, the reads aligning specifically to the virus and to the control target are counted. Sample-level QC, which confirms that sample collection and the wet lab assay performed as expected, is done by comparing the viral read counts to the read counts for the human control RNA (RNase P gene). For samples passing QC, virus detection is then done by comparing the total viral read count to a threshold established in calibration studies. Processing of an entire sequencing run with thousands of samples is completed in under one hour. APIs have been defined to transmit results to the laboratory information system as well as to the RavenC2 server, which communicates with the smartphone app.
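The QC and detection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual implementation: the threshold values, the names `SampleCounts` and `call_sample`, and the specific QC rule (a sample passes if either the control or the virus is clearly detected) are assumptions made for the example. Real thresholds would come from the calibration studies.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; real values would be
# derived from calibration studies.
MIN_CONTROL_READS = 50
MIN_VIRAL_READS = 10


@dataclass
class SampleCounts:
    barcode: str        # sample barcode from demultiplexing
    viral_reads: int    # quality-filtered reads aligned to SARS-CoV-2
    control_reads: int  # quality-filtered reads aligned to the human control


def passes_qc(counts, min_control=MIN_CONTROL_READS, min_viral=MIN_VIRAL_READS):
    # Assumed QC rule: the sample is usable if the human control was detected,
    # or if viral reads are abundant enough to show the assay worked.
    return counts.control_reads >= min_control or counts.viral_reads >= min_viral


def call_sample(counts, min_control=MIN_CONTROL_READS, min_viral=MIN_VIRAL_READS):
    """Return 'positive', 'negative' or 'invalid' for one sample."""
    if not passes_qc(counts, min_control, min_viral):
        return "invalid"
    return "positive" if counts.viral_reads >= min_viral else "negative"
```

A sample with neither enough control reads nor enough viral reads is reported as invalid rather than negative, so a failed swab or extraction is not mistaken for a clean result.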

Smartphone app

The Raven App provides a decentralized and secure digital solution for direct return of Covid-19 results to individuals tested on the high-throughput RavenC2 laboratory platform. With thousands of patient samples being tested simultaneously, the current practice of transferring test results verbally by phone will not be feasible. Therefore, the goal of the RavenC2 application and infrastructure is to digitally transmit Covid-19 test results directly from the laboratory to each individual at scale, while minimizing the processing of personal data. Individuals using the RavenC2 Application will be able to scan the unique barcode printed on each test kit, which is later entered into the Laboratory Information Management System (LIMS). Once sequenced, each result can then be transmitted back to the individual based on a unique identifier. The use of temporary access tokens ensures that only users who have scanned the code on a test kit can access the test result assigned to that test kit.
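The token mechanism can be illustrated with a small in-memory sketch. Everything here is hypothetical and does not describe the published RavenC2 code: the class `TokenRegistry`, its method names, the one-hour lifetime and the choice to store only a hash of the token are assumptions made for the example.

```python
import hashlib
import hmac
import secrets
import time


class TokenRegistry:
    """In-memory stand-in for the server side; all names are hypothetical."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._tokens = {}   # kit_id -> (sha256 hex digest of token, expiry time)
        self._results = {}  # kit_id -> result string

    def register_scan(self, kit_id, now=None):
        """Issue a random token the first time a kit barcode is scanned."""
        now = time.time() if now is None else now
        if kit_id in self._tokens:
            raise ValueError("kit already registered")
        token = secrets.token_urlsafe(32)
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._tokens[kit_id] = (digest, now + self.ttl_seconds)
        return token

    def store_result(self, kit_id, result):
        """Called by the laboratory side once the result is available."""
        self._results[kit_id] = result

    def fetch_result(self, kit_id, token, now=None):
        """Return the result only for a valid, unexpired token; else None."""
        now = time.time() if now is None else now
        entry = self._tokens.get(kit_id)
        if entry is None:
            return None
        digest, expiry = entry
        presented = hashlib.sha256(token.encode()).hexdigest()
        if now > expiry or not hmac.compare_digest(digest, presented):
            return None
        return self._results.get(kit_id)
```

In this sketch the server keeps only the kit identifier, a token hash and the result, which matches the stated aim of minimizing the processing of personal data.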

Our infrastructure complies with the highest standards of data protection

The Raven infrastructure processes only the unique identifier from the test kit and the laboratory result. No personally identifiable information, such as name or date of birth, is processed. In addition, each test result is deleted from the RavenC2 infrastructure after 14 days and is then available only on the individual's smartphone.
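The 14-day retention rule can be sketched as a periodic purge. This is an illustration under assumptions: the function name `purge_expired`, the record layout and the choice to count the 14 days from the time the result was stored are not taken from the RavenC2 code.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # retention period stated in the text


def purge_expired(results, now):
    """Keep only results younger than RETENTION.

    `results` maps a kit identifier to (result, stored_at); this layout is
    hypothetical. No personal data is part of the record.
    """
    return {
        kit_id: (result, stored_at)
        for kit_id, (result, stored_at) in results.items()
        if now - stored_at < RETENTION
    }
```

Running such a purge on a schedule means the server holds each result only for the retention window, while the copy already delivered to the smartphone is unaffected.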

Maximum Data Security and Scalability

A secure connection between the sequencer and the RavenC2 infrastructure is established over the common HTTPS protocol using certificates, IP whitelisting, client keys and secrets. The infrastructure is operated centrally in an Amazon Web Services (AWS) data center in Frankfurt in a high-availability setup. The RavenC2 App is a secure, highly scalable digital solution for returning COVID-19 test results from high-throughput RavenC2 laboratory platforms to tested individuals.

We believe in Open Source

The Raven infrastructure is based exclusively on open source software components. The code is published on GitHub under the MIT license.

What have we accomplished so far?

  • Identified partners with the know-how and capability to develop a technology solution with almost unlimited upscaling capacity.

  • Created a complete prototype workflow from sample collection to a yes/no answer about the presence of SARS-CoV-2.

  • Established a proof-of-principle wet lab workflow; a first sequencing run using clinical samples successfully confirmed the proof of concept.

  • Built a powerful data analysis pipeline based on commercially available products. The analysis counts sequencing reads that match parts of the virus cDNA and compares those numbers to reads overlapping the human genome.

  • Developed the RavenC2 server and app to seamlessly manage digital data and show the lab result directly to the citizen via the smartphone app.

  • Defined programming interfaces (APIs) to connect the digital systems.

  • Turnaround times can be reduced to 24 hours after sample delivery.

  • The estimated cost is below 10 euros per test.

Impact on crisis

An ultra-high-throughput testing approach as presented here can improve infection control by supporting the early identification of infected individuals on a large scale. Early detection improves preventive healthcare measures and can therefore help relax social distancing (ease lockdowns, reopen the economy, enable social contacts).

Outlook

Implementing this workflow across the EU or even worldwide would multiply the number of tests that can be performed each day. Since modern sequencing platforms can produce terabytes of data and the virus amplicons need only a few megabytes, the number of samples analysed concurrently is limited mostly by the number of unique barcodes.

With the RavenC2 approach, testing 3,000 to 12,000 samples per day on a single instrument is achievable. This can be scaled along several axes:

· Multiple instruments: many thousands of sequencing instruments of this type are installed globally.

· Higher-capacity instruments: newer sequencing instruments produce terabases of data, and the virus amplicons need only a few megabases. Theoretically, a single run on the latest sequencing instruments could handle millions of samples in a single day; however, the ability to identify each sample individually would need significant advances.

· Increasing the barcode length and/or using unique dual indexing: both allow more samples to be combined in a single run and so increase the number of samples per day. Redesigning the barcodes to a length of 10 or 12 bp and using higher-capacity instruments would theoretically increase the capacity of each sequencing run to more than 1,000,000 samples.
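The barcode arithmetic behind the last point can be checked with a short sketch. The function name `max_barcodes` is hypothetical. The figures are theoretical upper bounds that count every possible sequence of the four nucleotides; practical barcode sets are smaller, because barcodes must differ from one another in several positions to tolerate sequencing errors.

```python
def max_barcodes(length, dual=False):
    """Upper bound on distinct barcodes of `length` nucleotides (A, C, G, T).

    With dual indexing, two independent barcodes of the same length are
    combined, so the bound is squared.
    """
    single = 4 ** length
    return single * single if dual else single


for bp in (8, 10, 12):
    print(f"{bp} bp: {max_barcodes(bp):,} single, {max_barcodes(bp, dual=True):,} dual")
```

A 10 bp barcode already allows 4^10 = 1,048,576 sequences in principle, which is consistent with the figure of more than 1,000,000 samples per run given above.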

However, other rate-limiting steps, such as collecting this many samples and processing them via automation, could limit the theoretical capacity. The solution is more likely to be limited by sample and preanalytical logistics than by sequencing throughput.

To keep track of the spread of the virus and its paths of infection, the law requires that positive laboratory test results be reported to the health authorities. This process can be automated and accelerated.

Connections to other applications that support infection control, such as proximity tracking apps, can be established via API integration.

The value after the crisis

An ultra-high-throughput testing solution remains valuable beyond the acute phase of the outbreak. The infrastructure can be used in many ways, e.g.:

  • to advance precision medicine

  • to support preparedness for additional waves or new outbreaks

  • to support ongoing global surveillance by sequencing full virus genomes to prevent future epidemics

  • to increase the degree of digital healthcare communication

  • to overcome fears in the population about misuse of digital healthcare data by demonstrating data security in a real application

  • to provide general utility for public healthcare issues

What we need to continue

  • Reimbursement for tests inside and outside the healthcare setting, depending on the country. Individual testing of symptomatic patients is primarily a matter for doctors and insurers, while testing of asymptomatic individuals is a matter of public health and needs to be financed by governments.

  • Partners for a pilot project to prove the value of the solution
