COVID-19 is the most significant challenge the world has faced in 75 years. At the time of writing, 20 million people have been infected, of whom 730,000 have died [ 1 ]. In addition, the lockdown measures designed to mitigate the spread of the disease are predicted to reduce worldwide economic growth by up to 6% [ 2 ], corresponding to $5.4 trillion in lost GDP in 2020 alone. This is in addition to the untold future consequences of higher-order effects such as mass unemployment, social isolation, and postponed unrelated medical care, to name just a few.
Ultimately, the solution to COVID-19 will likely involve one or more medicinal therapeutics such as vaccines. Unfortunately, it is unclear whether such therapeutics can be created in a reasonable time frame (if at all), when they might become widely available, or whether a sufficient proportion of the population would volunteer for such untested treatments. And while the efficacy of face masks is well established [ 3 ], so is the propensity for much of the world’s population to avoid wearing them [ 4 ].
Now and for the foreseeable future, the world’s most effective and reliable defence against the spread of COVID-19 is contact tracing [ 5 ]. This is a manual process that involves public health authorities interviewing known infected cases, determining to whom they may have been exposed during their infectious period, and contacting those exposed individuals so that they can isolate, thereby preventing further spread of the virus. While effective when done correctly, contact tracing is a highly labour-intensive process that relies on infected individuals being able to retrace their steps precisely over time spans of several weeks. Not only do infected cases need to recall with whom they interacted during this period, but they also need to know how to contact those individuals. This is often unfeasible, especially in dense urban environments [ 6 ].
Multiple solutions have been proposed to meet this challenge. Typically these require a majority of the population to install an "exposure notification" application onto their mobile phones [ 7 ], or to carry a dedicated piece of hardware such as a wristband. Unfortunately, public health authorities agree that these approaches are ineffective [ 8 ] due to the low specificity of the technologies upon which they are built, such as GPS, Bluetooth, and ultrasound. For example, these tools are unable to differentiate between unmasked individuals having a conversation less than a foot apart (high degree of exposure), and masked individuals dozens of feet apart and/or separated by a physical barrier such as a wall (zero exposure). In addition, they fail to leverage the empathy and persuasiveness of human personnel, which are critical for the success of contact tracing operations [ 9 ].
Some of the most successful contact tracing efforts have been in South Korea and Singapore. One of the key methods employed by public health authorities in these countries that others have so far largely ignored is the systematic review of security camera footage [ 10 , 11 ]. By watching videos recorded by standard CCTV cameras that are often ubiquitous in private and public spaces alike, contact tracers in these countries are able to pinpoint which individuals were exposed to known COVID-19 cases, and can do so without having to rely on fallible human memory.
Why, then, has the rest of the world not followed suit? The answer may have to do with cultural differences, particularly with respect to privacy. While citizens of South Korea and Singapore may be used to the idea of being recorded, much of the rest of the world (especially in the West) finds this idea highly unsettling, despite the fact that CCTV cameras are already present in many Western cities to an equal or greater extent than in their Asian counterparts [ 12 ]. In addition, the idea that a nation’s government can know the precise movements and activities of that nation’s citizens at any time is often seen as antithetical to free democratic societies.
Addressing these concerns is the motivation behind Contact Tracing AI: a software system that lets organizations leverage their existing technology infrastructure (e.g. security cameras) to prevent COVID-19 outbreaks efficiently, accurately, and at population scale with the help of state-of-the-art computer vision.
When a customer, employee, or visitor is confirmed to have COVID-19, organizations are often forced to shut down to prevent further spread, costing millions of dollars in lost revenue and other expenses. With Contact Tracing AI, organizations can quickly and automatically determine a) who may have been exposed, and b) how to contact them.
What it does
After signing up at www.ContactTracingAI.com, organizations can either manually upload video files or connect their Video Management Systems (e.g. Genetec Omnicast) for automated analysis. Contact information is determined via an opt-in QR code system (no app required), or via automated integrations with Point-of-Sale (e.g. Lightspeed Retail) and Access Control (e.g. Genetec Security Center) systems.
The submission for this hackathon is a small part of the complete system. It contains drag-and-drop video upload, keypoint detection, and tracking.
How we built it
In the following, refer to:
Consider an arbitrary floor plan (representative of a typical retail store, factory, or office building). Inside there may be one or more Cameras (C), Subjects (S), and external pieces of Equipment (E) such as Access Control systems, Point-of-Sale systems, and/or QR codes.
Cameras generate Videos (V), which are composed of Images (I). A ten-second video recorded at 30 frames per second contains 300 Images.
Using Deep Neural Networks, we extract various Features (F) from these images, including:
- Person detection keypoints (e.g. mouth, hands, feet)
- Actions (e.g. coughing, speaking)
- Personal Protective Equipment (PPE) (e.g. masks)
Features of the same person between Images are linked together to form Tracks (T). Because humans don't change in their appearance, position, or velocity very quickly between subsequent Images (i.e. during 1/30th of a second), Tracks are selected so as to minimize the change in these attributes across consecutive Images.
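The linking step described above can be sketched as follows. This is a minimal illustration rather than the project's implementation: the dictionary layout (`pos`, `vel`, `feat`), the weight `w_app`, the `max_cost` cutoff, and the greedy matching strategy are all assumptions made for the example. An optimal assignment method (such as the Hungarian algorithm) could replace the greedy loop.

```python
import math

def link_cost(prev, det, dt=1 / 30, w_app=1.0):
    """Cost of linking a track's last state to a new detection (lower is better)."""
    # Predict where the track should be now from its last position and velocity.
    pred_x = prev["pos"][0] + prev["vel"][0] * dt
    pred_y = prev["pos"][1] + prev["vel"][1] * dt
    motion = math.hypot(det["pos"][0] - pred_x, det["pos"][1] - pred_y)
    # Appearance change: Euclidean distance between feature vectors.
    appearance = math.dist(prev["feat"], det["feat"])
    return motion + w_app * appearance

def link_detections(tracks, detections, max_cost=50.0):
    """Greedily pair tracks with detections in order of increasing cost.

    Returns a dict mapping track index -> detection index. Pairs whose cost
    exceeds max_cost (a hypothetical cutoff) are left unlinked.
    """
    pairs = sorted(
        (link_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_tracks, used_dets, links = set(), set(), {}
    for cost, ti, di in pairs:
        if cost > max_cost:
            break  # all remaining pairs are even more expensive
        if ti in used_tracks or di in used_dets:
            continue
        links[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return links

# Toy example: two tracks, two detections listed in the opposite order.
tracks = [
    {"pos": (0.0, 0.0), "vel": (30.0, 0.0), "feat": [1.0, 0.0]},
    {"pos": (100.0, 100.0), "vel": (0.0, 0.0), "feat": [0.0, 1.0]},
]
detections = [
    {"pos": (100.0, 101.0), "feat": [0.0, 1.0]},
    {"pos": (1.0, 0.0), "feat": [1.0, 0.0]},
]
links = link_detections(tracks, detections)
```

In the toy example, each track is linked to the detection that best matches its predicted position and appearance, regardless of the order in which detections arrive.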
Tracks corresponding to the same person in different Videos are linked together to form Subjects (S). Because humans don't significantly change in their appearance between Cameras, Subjects can be selected so as to minimize the change in appearance across Cameras. (Note that "appearance" here refers to shape, size, and clothing, not biometric measures like face or gait.)
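As a rough sketch of this cross-camera step, Tracks can be grouped by comparing appearance embeddings. The example assumes each Track has already been summarised as a fixed-length vector by a re-identification model; the track ids, the cosine-similarity measure, the `threshold` value, and the first-member-as-reference grouping rule are illustrative assumptions, not details taken from the system.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign_subjects(track_embeddings, threshold=0.8):
    """Assign each track a subject index.

    A track joins the existing subject whose reference track (its first member)
    is most similar, provided the similarity reaches the threshold; otherwise
    it starts a new subject.
    """
    subjects = []  # each entry is a list of track ids
    labels = {}
    for track_id, emb in track_embeddings.items():
        best, best_sim = None, threshold
        for sid, members in enumerate(subjects):
            sim = cosine_similarity(emb, track_embeddings[members[0]])
            if sim >= best_sim:
                best, best_sim = sid, sim
        if best is None:
            subjects.append([track_id])
            best = len(subjects) - 1
        else:
            subjects[best].append(track_id)
        labels[track_id] = best
    return labels

# Hypothetical embeddings for tracks seen by two cameras.
embeddings = {
    "camA/t1": [1.0, 0.0, 0.1],
    "camB/t7": [0.9, 0.1, 0.1],
    "camB/t8": [0.0, 1.0, 0.0],
}
labels = assign_subjects(embeddings)
```

Here `camA/t1` and `camB/t7` end up with the same subject index because their embeddings are nearly parallel, while `camB/t8` starts a new subject.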
Exposure (X) between two Subjects is a function of the Features. This is defined according to the standard public health definition: 15 minutes or more of conversation within two metres (six feet) of each other, without wearing face masks [ 13 ].
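One way to turn this definition into a per-pair check is to count the frames in which all conditions hold and convert that count to seconds. The sketch below assumes per-frame observations are already available for a pair of Subjects; the `PairObservation` fields are hypothetical, and treating a frame as unprotected whenever at least one of the two is unmasked is an interpretation made for the example, not something the definition above spells out.

```python
from dataclasses import dataclass

@dataclass
class PairObservation:
    """Per-frame observation for one pair of Subjects (hypothetical structure)."""
    distance_m: float   # estimated distance between the two Subjects
    conversing: bool    # e.g. derived from a "speaking" Action feature
    a_masked: bool
    b_masked: bool

def exposure_seconds(observations, fps=30.0, max_distance_m=2.0):
    """Total time (seconds) during which the pair was close, conversing, and not both masked."""
    qualifying = sum(
        1
        for o in observations
        if o.distance_m <= max_distance_m
        and o.conversing
        and not (o.a_masked and o.b_masked)  # assumption: one unmasked person suffices
    )
    return qualifying / fps

def is_exposed(observations, fps=30.0, min_seconds=15 * 60):
    """True if the cumulative qualifying time reaches the 15-minute threshold."""
    return exposure_seconds(observations, fps) >= min_seconds

# 15 minutes at 30 frames per second is 27,000 frames.
close_unmasked = PairObservation(distance_m=1.5, conversing=True, a_masked=False, b_masked=False)
exposed = is_exposed([close_unmasked] * 27000)
```

The sketch accumulates time across the whole recording; whether the 15 minutes must be continuous is a policy choice the definition leaves open.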
Whenever Subjects interact with Equipment, they generate Events: Swipes for Access Control, Payments for Point-of-Sale, and Scans for QR codes. This is how we are able to determine Identities, including Names and Contact information.
In order to associate Events with Subjects, synchronization between the Cameras and the Equipment is required in both time (digital clocks) and space (location in field-of-view). This is accomplished during onboarding via a single click for each Camera/Equipment pair. Events are time-aligned with Tracks using the Viterbi algorithm.
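The text does not say how the Viterbi algorithm is set up here, so the following is only a generic sketch under stated assumptions. It treats each Event as one step, each candidate clock offset between the Equipment and the Camera as a hidden state, the emission cost as how poorly the nearest Track matches the Equipment location at the shifted time, and the transition cost as a penalty for the offset changing between Events. All of these modelling choices, and the toy numbers, are hypothetical.

```python
def viterbi(emission_costs, transition_cost):
    """Return the minimum-cost state sequence.

    emission_costs[t][s] is the cost of being in state s at step t.
    transition_cost(p, s) is the cost of moving from state p to state s.
    """
    n_states = len(emission_costs[0])
    best = list(emission_costs[0])  # best cumulative cost ending in each state
    back = []                       # back-pointers for each step after the first
    for costs in emission_costs[1:]:
        pointers, new_best = [], []
        for s in range(n_states):
            # Cheapest predecessor for state s at this step.
            p = min(range(n_states), key=lambda q: best[q] + transition_cost(q, s))
            pointers.append(p)
            new_best.append(best[p] + transition_cost(p, s) + costs[s])
        back.append(pointers)
        best = new_best
    # Trace back from the cheapest final state.
    state = min(range(n_states), key=lambda q: best[q])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Toy data: three Events, three candidate offsets (state indices 0, 1, 2).
# Each row holds a mismatch cost per candidate offset; the second Event is
# noisy and, taken alone, slightly prefers offset index 2.
costs = [
    [3.0, 0.2, 2.5],
    [2.8, 0.3, 0.1],
    [3.1, 0.1, 2.9],
]
smooth_path = viterbi(costs, lambda p, s: abs(p - s))  # [1, 1, 1]
```

With the penalty on offset changes, the noisy second Event no longer pulls the estimate away from the offset supported by its neighbours; with a zero transition cost, each Event would simply pick its own cheapest offset.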
Infection Risk (R) for a particular Subject is a function of their Exposure to all other Subjects and their respective Identities (including infection status).
Infection status is provided in one of two ways:
- By the user (e.g. an employer), after being informed either directly by the infected Subject (e.g. an employee) or by a public health organization
- By the infected Subject directly, via text message (only available if they previously opted in via a QR code)
The system is built in Python with PyTorch and RQ. Keypoint detection is implemented with Detectron2, and re-identification is implemented with Torchreid.
Challenges we ran into
- Choosing implementations of keypoint detection, tracking, and subject re-identification that balance ease of implementation, accuracy, and performance
- Determining which components to open source
Accomplishments that we're proud of
- It works! (We are continuously iterating and improving.)
- We are official Genetec technology partners, which means this technology will be integrated into their products
What we learned
- Tracking is hard
- Re-identification is even harder
- PyTorch is awesome!
What's next for Contact Tracing AI
- Deploying on GPUs for improved throughput
- Integrating human-in-the-loop feedback using Amazon Mechanical Turk
- Improving on detection, tracking, and re-identification