A large-scale, human-powered data transfer network built on Raspberry Pis and backed by Twilio, DialogFlow, and GCP.
According to the UN Broadband Commission for Sustainable Development, nearly 4 billion people do not have any internet access (1). As the world moves further into the information age, the benefits of advances in AI, big data, high-performance computing, and other data-reliant fields are available only to those with the bandwidth.
However, according to the GSMA Intelligence Agency, over 5 billion people have access to mobile phones with a non-data cellular connection. Our project takes advantage of this fact to connect the next billion people to the information that we consider so essential to our everyday lives. We do this by combining the limited bandwidth available through SMS with existing person-to-person commerce networks, allowing people in remote areas to request essential data.
What it does
Datanium is a network of Raspberry Pis that allows for human-powered transfer of data between places with internet access and places without it. In the Datanium ecosystem, there are two types of users: consumers and couriers. Consumers request data, and Datanium ensures that couriers move the requested data to the Datanium Node closest to the consuming user.
For consumers, the flow is similar to the following:
- First, the consumer visits their local Datanium Node. These local nodes host a local Wi-Fi network where users can browse through a list of available software, offline internet archives, and other media. Users then text the Datanium service to request the data that they want to receive.
- Then, the Datanium Network coordinates couriers to bring the data from a Datanium Node connected to the internet to the Datanium Node closest to the requester's location.
- Once the data arrives, the user receives a text informing them that it is available at their local Datanium Node and that they can download it using the one-time security token provided to them over text.
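The one-time security token in the last step could be handled along the following lines. This is only a minimal sketch under assumptions: the `TokenStore` class, the six-digit token format, and the in-memory storage are all hypothetical and are not taken from the Datanium code.

```python
import hashlib
import hmac
import secrets


class TokenStore:
    """Issues and redeems single-use download tokens (hypothetical helper)."""

    def __init__(self):
        # Maps request id -> SHA-256 digest of the token.
        # The raw token is never kept on the node.
        self._pending = {}

    def issue(self, request_id):
        """Create a short numeric token that is easy to send over SMS."""
        token = "{:06d}".format(secrets.randbelow(10 ** 6))
        self._pending[request_id] = hashlib.sha256(token.encode()).hexdigest()
        return token

    def redeem(self, request_id, token):
        """Return True once for the correct token, then invalidate it."""
        expected = self._pending.get(request_id)
        if expected is None:
            return False
        supplied = hashlib.sha256(token.encode()).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected, supplied):
            return False
        del self._pending[request_id]
        return True
```

With this sketch, a second call to `redeem` with the same token returns `False`, which is the property that makes the token single-use. A real deployment would also want expiry and rate limiting, which are left out here.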
For couriers, the flow is similar to the following:
- The courier registers their intent to travel between town A and town B. They can also indicate the amount of money they want to be paid to make that trip.
- The courier receives a message letting them know that there is data at the Datanium Node in town A which needs to be taken to the Datanium Node in town B.
- The courier goes to the Datanium Node in town A and downloads the data to their device using a code sent to them over SMS. For security purposes, couriers never know what this data is, just that it exists.
- The courier takes the data to town B and connects to the local Datanium Node. The courier uploads the data to that node and receives a one-time passcode from it. The courier then sends the passcode by SMS to the Datanium server, where the transaction is validated and the courier's Datanium account is credited.
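The courier steps above can be sketched as a small matching and crediting routine on the coordinating server. Everything here is an assumption made for illustration: the `CourierOffer` fields, the cheapest-offer rule, and the in-memory `balances` dictionary are hypothetical, and the write-up does not say how Datanium actually selects couriers.

```python
import hmac
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CourierOffer:
    courier: str      # courier identifier, e.g. a phone number (hypothetical)
    origin: str       # town the courier starts from
    destination: str  # town the courier is travelling to
    price: float      # amount the courier asks for the trip


def pick_courier(offers: List[CourierOffer], origin: str,
                 destination: str) -> Optional[CourierOffer]:
    """Return the cheapest offer covering origin -> destination, or None."""
    matching = [o for o in offers
                if o.origin == origin and o.destination == destination]
    return min(matching, key=lambda o: o.price) if matching else None


def credit_on_delivery(balances: Dict[str, float], offer: CourierOffer,
                       supplied_code: str, expected_code: str) -> bool:
    """Credit the courier only if the texted passcode matches the node's.

    Codes are assumed to be ASCII strings.
    """
    if not hmac.compare_digest(supplied_code, expected_code):
        return False
    balances[offer.courier] = balances.get(offer.courier, 0.0) + offer.price
    return True
```

Choosing the cheapest matching offer is just one plausible policy. Reliability or expected travel time could equally be used as the sort key.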
How we built it
Our solution's architecture is built with highly volatile and sporadic network conditions in mind. Generally, the architecture can be modeled as a graph in which Datanium Nodes are the vertices and courier trips between them are the edges.
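One way to read the graph model is to treat each Datanium Node as a vertex and each registered courier trip as a weighted, directed edge. The sketch below makes that assumption and uses Dijkstra's algorithm to find the cheapest chain of trips. The function name, the edge format, and the use of price as the weight are all hypothetical.

```python
import heapq


def cheapest_route(edges, source, target):
    """Dijkstra's algorithm over directed courier trips.

    edges maps a node name to a list of (neighbor, cost) pairs.
    Returns (total_cost, [node, ...]) or None if target is unreachable.
    Costs are assumed to be non-negative.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, step_cost in edges.get(node, []):
            if neighbor not in visited:
                heapq.heappush(
                    queue, (cost + step_cost, neighbor, path + [neighbor]))
    return None


# Example: relaying through town_a (4 + 3) beats the direct trip (10).
trips = {
    "city": [("town_a", 4), ("town_b", 10)],
    "town_a": [("town_b", 3)],
}
print(cheapest_route(trips, "city", "town_b"))
# prints (7, ['city', 'town_a', 'town_b'])
```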
There are three main parts of the architecture: the Datanium Node hardware, the Datanium Node software, and the coordinating server software.
Inside the Datanium Node, the backing hardware is a Raspberry Pi controlling the Wi-Fi hotspot and an Arduino controlling the visual feedback indicators, including the flag, LED strip, and LCD screen.
The Datanium Node software is written in Python and provides a web interface for users on the local network. The frontend is written using Bootstrap and some of it is procedurally generated server-side to facilitate dynamic content without using excessive bandwidth. The backend is a Python web server running the Bottle framework. Effectively, this server provides an easy UI for people to transmit data between their devices and the Datanium Nodes.
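To illustrate what generating part of the frontend server-side can look like, here is a framework-agnostic sketch that builds a Bootstrap-styled list of available files as a plain HTML string. The `render_catalog` function and the `(title, size_mb)` item format are hypothetical. In a setup like the one described, the returned string could be the body of a response from a Bottle route.

```python
from html import escape


def render_catalog(items):
    """Build an HTML list from (title, size_mb) pairs.

    Titles are escaped so that file names containing markup
    cannot inject HTML into the page.
    """
    rows = []
    for title, size_mb in items:
        rows.append(
            '<li class="list-group-item">{} '
            '<span class="badge">{} MB</span></li>'.format(
                escape(title), size_mb))
    return '<ul class="list-group">\n{}\n</ul>'.format("\n".join(rows))
```

Rendering on the node keeps the page to a single small response, which fits the goal of avoiding excessive bandwidth on the local network.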
The coordinating server is written in Python and responds to requests from Twilio. When texts are received by the coordinating server, we use DialogFlow to parse the message so that we can take advantage of the powerful natural language processing available through that platform.
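The request and reply cycle with Twilio can be sketched as follows. Twilio delivers incoming texts to a webhook as form-encoded fields such as `From` and `Body`, and accepts a TwiML `<Response><Message>` document in return. Everything else is an assumption: `parse_intent` is a keyword-matching stand-in for the DialogFlow call (not the DialogFlow API), and the intent name and reply wording are hypothetical.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def parse_intent(text):
    """Keyword stand-in for DialogFlow; returns (intent, argument)."""
    words = text.strip().split(maxsplit=1)
    if len(words) == 2 and words[0].lower() == "request":
        return ("request_data", words[1])
    return ("unknown", "")


def handle_sms(form_body):
    """Turn a form-encoded webhook body into a TwiML reply string."""
    fields = parse_qs(form_body)
    sender = fields.get("From", [""])[0]
    text = fields.get("Body", [""])[0]
    intent, argument = parse_intent(text)
    if intent == "request_data":
        reply = "Request received for {}. We will text {} when it arrives.".format(
            argument, sender)
    else:
        reply = "Sorry, we did not understand. Try: request <item name>"
    # Escape the reply so characters like < and & stay valid XML.
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            "<Response><Message>{}</Message></Response>".format(escape(reply)))
```

A natural language service replaces the rigid `request <item name>` syntax with free-form phrasing, which matters for users who are texting the service for the first time.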
Challenges we ran into
Some of the main challenges we ran into include:
Integrating dialog interaction with the rest of the application. One of the most confusing parts of integrating DialogFlow was understanding how dialog state is persisted. Once we understood the paradigm, DialogFlow proved to be a powerful tool for language processing.
Metadata management. One of the most difficult architectural problems in this application was managing file metadata. In transit, each file must carry metadata about its destination along with information about the file itself.
Python version differences. As always, many of our issues arose from differences between the production and development environments. One of the primary problems was that our laptops run Arch Linux, so we were developing in Python 3.7, whereas the Raspberry Pi ran Python 3.5. As a result, some of the newest Python features were not available to us.
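The metadata and Python version challenges meet in code like the following sketch, which describes a file in transit using only features that also exist in Python 3.5 (no f-strings, which arrived in 3.6, and no dataclasses, which arrived in 3.7). The field names and the JSON encoding are assumptions for illustration, not the project's actual metadata format.

```python
import hashlib
import json


def build_metadata(filename, payload, destination_node):
    """Describe a file in transit; payload is the file content as bytes."""
    return {
        "filename": filename,
        "destination": destination_node,
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def dump_metadata(metadata):
    """Serialize with sorted keys so the output is stable across runs."""
    return json.dumps(metadata, sort_keys=True)


def verify_payload(metadata_json, payload):
    """Check that a delivered payload matches its recorded size and hash."""
    metadata = json.loads(metadata_json)
    return (metadata["size_bytes"] == len(payload)
            and metadata["sha256"] == hashlib.sha256(payload).hexdigest())
```

Storing a checksum next to the destination lets the receiving node detect a corrupted or swapped file before telling the consumer that the data has arrived.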
Accomplishments that we're proud of
We are extremely proud of our project because we implemented a sophisticated network that could transform people's access to data in the underdeveloped world.
What's next for Datanium
In the future, we would like to extend Datanium by creating a native Android application to replace our prototype web application. We would also like to harden the authentication protocol for our project. We have designed the system in such a way that adding extra security at the authentication protocol layer is simple, but for the sake of time and a proof-of-concept, we neglected to flesh out that aspect of our project.
One major extension to our idea that we are interested in is creating an easy way to connect content sellers to people in emerging markets through our network. This will likely take the form of a content partner structure where content sellers can sell their content through our platform.