An audio fingerprinting based solution to stop robocalls

Please see attached "FTC Robocall.zip" for a Microsoft Word or Adobe PDF file for more details. Thanks.

Introduction

This paper introduces an audio fingerprinting based solution to stop robocalls. It consists of four parts: a central server (or a group of distributed servers) that provides access to a constantly updated database of robocall audio fingerprints; an Anti-robocall Device that consumers use to screen robocalls and to collect robocall data; an Anti-robocall App that runs on mobile or VoIP devices to do the same; and robocall traps that collect robocall data.

The audio fingerprint "is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database."[1] The solution uses the audio fingerprints of robocalls to identify them within seconds. Audio fingerprinting algorithms are mature and efficient, and many commercial and open source products already use them to identify music and other recordings.

The FTC database of robocall audio fingerprints collects data from several sources: the Anti-robocall Device, the Anti-robocall App running on mobile or VoIP devices, robocall traps (see below), and online uploads by consumers.

The Anti-robocall Device sits between the PSTN and the telephone, and also connects to the Internet to synchronize its local database of robocall audio fingerprints with the FTC central database. It uses Caller ID to pass through calls that are on a whitelist, and compares the audio fingerprint of every other call against the robocall database, disconnecting the call on a match. It also lets the consumer flag a call as a robocall and sends the robocall data to the FTC central database.

The Anti-robocall App works on mobile and VoIP devices. It passes through calls that are on a whitelist or in the phone's contact list, and compares the audio fingerprint of every other call against the robocall database, disconnecting the call on a match. It also lets the consumer flag a call as a robocall and sends the data to the FTC central database. It synchronizes its local database of robocall audio fingerprints with the FTC central database.

Robocall traps can be set up by the FTC and the United States Postal Service to collect robocall data.

Method

a. Audio fingerprinting

The audio fingerprint "is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database."[1] The algorithm is now mature and very efficient. It should only take seconds to match the audio fingerprint of a single robocall to a database of audio fingerprints of robocalls (presumably with tens of thousands of records).
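As a rough illustration of the idea, the sketch below builds a toy fingerprint and matches it against a small database. It is a simplified stand-in, not the algorithm used by commercial products: the fingerprint is one bit per frame (whether the frame is louder than the previous one), and two fingerprints match when their bit error rate is under a threshold. The names `fingerprint`, `bit_error_rate`, `best_match` and `make_signal`, the frame size of 256 samples, and the 0.25 threshold are all hypothetical choices made for this example.

```python
import math
import random


def fingerprint(samples, frame_size=256):
    """Toy fingerprint: bit i is 1 when frame i+1 has more energy than frame i."""
    energies = [
        sum(s * s for s in samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 1.0
    return sum(a != b for a, b in zip(fp_a, fp_b)) / n


def best_match(query_fp, database, max_ber=0.25):
    """Return the label of the closest stored fingerprint within max_ber, else None."""
    best_label, best_ber = None, max_ber
    for label, stored_fp in database.items():
        ber = bit_error_rate(query_fp, stored_fp)
        if ber <= best_ber:
            best_label, best_ber = label, ber
    return best_label


def make_signal(envelope, noise=0.0, seed=0, frame_size=256):
    """Synthetic audio for the demo: a tone whose loudness follows `envelope`."""
    rng = random.Random(seed)
    return [
        level * math.sin(2 * math.pi * n / 32) + rng.uniform(-noise, noise)
        for level in envelope
        for n in range(frame_size)
    ]


# Hypothetical database holding two known recordings.
database = {
    "robocall_A": fingerprint(make_signal([1, 2, 1, 3, 2, 4, 1, 2, 3, 1])),
    "robocall_B": fingerprint(make_signal([4, 3, 2, 1, 2, 3, 4, 3, 2, 1])),
}

# A noisy copy of recording A still matches; an unrelated signal does not.
noisy_copy = make_signal([1, 2, 1, 3, 2, 4, 1, 2, 3, 1], noise=0.05, seed=1)
unrelated = make_signal([1, 2, 3, 4, 3, 2, 1, 2, 3, 4])
print(best_match(fingerprint(noisy_copy), database))
print(best_match(fingerprint(unrelated), database))
```

A production fingerprint would use spectral features rather than raw loudness, but the matching step (nearest stored fingerprint within an error tolerance) has the same shape.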

b. FTC central server or a group of distributed servers

The central server, or group of distributed servers, has to be provided and maintained by an authority such as the FTC. A phone company could also provide such a server, but on its own it would not collect as many robocall audio fingerprints as a central authority could, and it would need to exchange data with other phone companies. A central server or group of distributed servers managed by the FTC is therefore the better solution. The server stores the audio data of robocalls reported by Anti-robocall Devices and Apps and uploaded through the online portal. Algorithms are needed to guard against false reporting, e.g. a robocall must be reported by a certain number of consumers and from a certain number of physical locations before it is accepted. The FTC central server then processes the audio data of the reported robocalls, computes their audio fingerprints, stores them in the database, and provides access to Anti-robocall Devices and Apps.
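The false-reporting rule can be sketched as a simple threshold check. This is a minimal sketch under stated assumptions: the function name `confirmed_robocalls`, the report tuple layout, and the default thresholds of three reporters and two locations are hypothetical and are not specified in the proposal.

```python
from collections import defaultdict


def confirmed_robocalls(reports, min_reporters=3, min_locations=2):
    """Return fingerprint ids reported by enough distinct consumers and locations.

    `reports` is an iterable of (fingerprint_id, reporter_id, location) tuples.
    """
    reporters = defaultdict(set)
    locations = defaultdict(set)
    for fp_id, reporter_id, location in reports:
        reporters[fp_id].add(reporter_id)
        locations[fp_id].add(location)
    return {
        fp_id
        for fp_id in reporters
        if len(reporters[fp_id]) >= min_reporters
        and len(locations[fp_id]) >= min_locations
    }


# Example reports (made-up ids and places).
example_reports = [
    ("fp1", "u1", "Boston"), ("fp1", "u2", "Austin"), ("fp1", "u3", "Boston"),
    ("fp2", "u1", "Boston"), ("fp2", "u1", "Austin"), ("fp2", "u1", "Denver"),
    ("fp3", "u4", "Miami"), ("fp3", "u5", "Miami"), ("fp3", "u6", "Miami"),
]
print(confirmed_robocalls(example_reports))
```

Counting distinct reporters and distinct locations separately means that one consumer reporting repeatedly (`fp2`) or a cluster of reports from one place (`fp3`) is not enough on its own.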

c. Anti-robocall Device

The Anti-robocall Device is a small box with one RJ11 LINE port that connects to the land-line or to the telephone-out port of any VoIP device, one PHONE port that connects to the telephone, and one Ethernet port and/or wireless antenna that connects to the Internet (router, cable modem, etc.). See below for a flowchart of how the Anti-robocall Device works.

The Anti-robocall Device has a whitelist of caller IDs that can be maintained through a web interface. It supports adding phone numbers from the call history and importing contact lists from Microsoft Outlook, Google Contacts, Facebook, and similar services. The Anti-robocall Device passes through any call originating from a number on the whitelist.

The Anti-robocall Device also supports an optional Junk Voicemail that records all matched robocalls. Consumers can use a web interface to listen to the Junk Voicemail and mark messages as Not Junk (such a voicemail is sent to the FTC central server for further analysis).
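The screening decision described so far (whitelist first, then fingerprint match, with Junk Voicemail as an option) can be sketched as follows. The names `Action`, `normalize_number` and `screen_call` are hypothetical, the ten-digit normalization assumes North American numbers, and the exact set lookup on fingerprints is a simplification: a real device would need tolerant matching.

```python
from enum import Enum


class Action(Enum):
    RING_THROUGH = "ring_through"
    JUNK_VOICEMAIL = "junk_voicemail"
    DISCONNECT = "disconnect"


def normalize_number(number):
    """Keep only digits and compare on the last ten (assumed NANP numbers)."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return digits[-10:]


def screen_call(caller_id, call_fingerprint, whitelist, robocall_fingerprints,
                junk_voicemail_enabled=False):
    """Decide what to do with an incoming call.

    `whitelist` holds normalized numbers; `caller_id` may be None when the
    caller's number is not available.
    """
    # Whitelisted callers bypass fingerprinting entirely.
    if caller_id and normalize_number(caller_id) in whitelist:
        return Action.RING_THROUGH
    # Every other call is checked against the known robocall fingerprints.
    if call_fingerprint in robocall_fingerprints:
        if junk_voicemail_enabled:
            return Action.JUNK_VOICEMAIL
        return Action.DISCONNECT
    return Action.RING_THROUGH


# Example with made-up numbers and fingerprint ids.
whitelist = {normalize_number("+1 (555) 010-0199")}
known_robocalls = {"fp-robocall-1"}
print(screen_call("555-010-0123", "fp-robocall-1", whitelist, known_robocalls))
```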

The Anti-robocall Device also has a button for flagging a robocall, and it can detect keys pressed on the telephone, e.g. "#" or "*", as a signal to flag the current call as a robocall.
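Key presses on a touch-tone telephone are sent as DTMF tones, each a pair of standard frequencies, so one way the device could catch "#" or "*" is with the Goertzel algorithm. The sketch below is an illustration only: the function names, the 8000 Hz sample rate and the 50 ms tone length are assumptions, and it always returns the best-scoring key, so a real device would also need a power threshold to reject silence and speech.

```python
import math

# Standard DTMF frequency grid: one row tone plus one column tone per key.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Signal power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Return the key whose row and column tones carry the most power."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], sample_rate))
    return KEYPAD[row][col]


def dtmf_tone(key, duration=0.05, sample_rate=8000):
    """Generate a clean DTMF tone for `key` (used here only to exercise detect_key)."""
    for row, keys in enumerate(KEYPAD):
        if key in keys:
            col = keys.index(key)
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = round(duration * sample_rate)
    return [
        math.sin(2 * math.pi * ROW_FREQS[row] * t / sample_rate)
        + math.sin(2 * math.pi * COL_FREQS[col] * t / sample_rate)
        for t in range(n)
    ]


# The device would flag the current call when it hears "#" or "*".
FLAG_KEYS = {"#", "*"}
print(detect_key(dtmf_tone("#")) in FLAG_KEYS)
```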

The Anti-robocall Device automatically synchronizes its local database of robocall audio fingerprints with the FTC central database. Audio fingerprints are relatively small (they can be as small as hundreds of kilobytes), so the database should be small enough to fit in local storage. If the database grows too large, the Anti-robocall Device can store the fingerprints of the most common robocalls locally and query the rest of the database over the Internet. Performance should be acceptable on a broadband connection.
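The local-first lookup with an optional remote fallback could look like the sketch below. Everything named here is hypothetical: `select_local_subset`, `FingerprintStore` and the `remote_lookup` callable stand in for whatever interface the FTC server would actually expose, and fingerprints are treated as opaque ids with exact matching.

```python
from collections import Counter


def select_local_subset(report_counts, capacity):
    """Pick the `capacity` most frequently reported fingerprints for local storage."""
    return {fp for fp, _ in Counter(report_counts).most_common(capacity)}


class FingerprintStore:
    """Check the local database first, then fall back to the central server."""

    def __init__(self, local_fingerprints, remote_lookup=None):
        self.local = set(local_fingerprints)
        # remote_lookup is a callable fp -> bool, or None when offline.
        self.remote_lookup = remote_lookup

    def is_robocall(self, fp):
        if fp in self.local:
            return True
        if self.remote_lookup is None:
            return False
        try:
            return bool(self.remote_lookup(fp))
        except OSError:
            # Network failure: behave as if only the local database exists.
            return False


# Example with made-up report counts.
counts = {"fp_a": 50, "fp_b": 5, "fp_c": 20}
store = FingerprintStore(select_local_subset(counts, capacity=2),
                         remote_lookup=lambda fp: fp in counts)
print(store.is_robocall("fp_b"))
```

Treating a network error the same as having no remote server keeps the device usable when the connection drops.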

NOTE: The Anti-robocall Device can work without an Internet connection, matching calls against only its local database of robocall audio fingerprints.

d. Anti-robocall Apps

The Anti-robocall App works much like the Anti-robocall Device. The only difference is that the App is software rather than a physical device. NOTE: a VoIP system can use either the Anti-robocall App or the same Anti-robocall Device used for traditional phone lines (see above).

e. Robocall Trap

To collect more robocall audio fingerprints, the FTC should work with the United States Postal Service to create fake new addresses with fake new phone numbers, and to publicize those phone numbers routinely. Telemarketers value new addresses and phone numbers and routinely search public records for them. The FTC should then record all calls to those numbers and add them to the central database of robocalls (calls that are clearly wrong numbers should be skipped).

f. Audio Fingerprint Whitelist

The FTC should also maintain a whitelist of audio fingerprints for permitted robocall messages, e.g. political calls and calls from certain health care providers. Those calls have to be registered with the FTC, and their contents provided to the FTC in advance so that their audio fingerprints can be computed.

Preliminary Cost Analysis

The Anti-robocall Device can be made at very low cost; e.g. a stripped-down version of the Raspberry Pi[3] should be powerful enough to process the audio and match audio fingerprints. Open-source operating systems such as Embedded Linux or Android can be used. Storage should not be a problem either, since audio fingerprints can be as small as hundreds of kilobytes.

The Anti-robocall Server will need a major investment by the FTC to handle possibly tens of thousands of consumers concurrently. A distributed solution is recommended.

Conclusion

a. Pros and Cons

The audio fingerprinting based solution in this paper addresses the robocall problem by using audio fingerprints rather than Caller ID, which is easy to spoof, constantly changing, and unreliable. The key to identifying a robocall is that "Many different companies use the same or very similar recorded messages."[2] The solution forces telemarketers and robocallers to randomize their prerecorded messages, which increases their costs and may not work either (see below). It requires minimal changes to existing systems: only a small device or an App is added. Because it uses the Internet to synchronize data collected from the general public and from robocall traps, it can keep up with ever-changing telemarketers and robocallers. As the collection of robocall audio fingerprints becomes more comprehensive, the time needed to match a single call against the database should be greatly reduced.

The solution requires Caller ID service in order to pass through calls from the whitelist. For a call that is not on the whitelist and has not yet been identified as a robocall, the solution automatically connects the call for a short period (several seconds) and responds with a "Please hold" message while the call audio is checked. This may be inconvenient for some users and may add costs for users who pay for each connected call (for example, mobile phone users).

A mobile device requires an Internet connection to synchronize with the FTC central database of robocall audio fingerprints, though it can rely on the local database alone to avoid using the Internet connection.

The Anti-robocall App may be very difficult to develop for proprietary systems, e.g. Apple iOS.

b. Possible ways to counter the scheme

Telemarketers and robocallers may try to make their prerecorded messages more random. If they record human voices to do so, it will be very time-consuming and not cost-effective. If they use computer-generated voices instead, the messages can be highly varied, but the characteristics of a computer-generated voice are actually easier to recognize and identify through acoustic fingerprinting.

Telemarketers and robocallers may also add random noise to their prerecorded messages to make audio fingerprinting harder. Identifying such messages will require further improvement of the audio fingerprinting algorithm, but the added noise also makes the messages harder for consumers to understand, and thus less effective.

Telemarketers and robocallers may even try to hack the Anti-robocall Device to obtain audio fingerprint data. The encryption and security of the Anti-robocall Device need to be designed carefully to prevent that.

The FTC central server, or group of distributed servers, must have the highest security measures to fend off hacking attempts by telemarketers and robocallers.

[1] ISO/IEC TR 21000-11 (2004), Multimedia framework (MPEG-21) -- Part 11: Evaluation Tools for Persistent Association Technologies

[2] http://www.consumer.ftc.gov/articles/0259-robocalls

[3] http://www.raspberrypi.org/

Updates

enge chall started this project — Mar 21, 2014 06:34 PM EDT
