One of our team members, Nicky, has significant trouble hearing subtle noises! The name of our application, Tricone, was first coined because of the triangular, hat-like shape of our hardware contraption, which resembled a tricorn. Later, we changed the name to Tricone because of the three types of cones that we have in our retinas (red, green, and blue), which represent the colors in our world.
What it does
Tricone is an AR mobile application that uses the direction and location of sounds to provide real-time visualization, helping people who have trouble hearing detect what is happening in their surroundings. The application overlays dots on the camera view to represent the location and intensity of nearby sounds, and the dots update along with the camera feed as the user moves around.
How we built it
We began by installing Android Studio on our laptops and then downloading the Flutter SDK and the Dart language plugin for the IDE. Once we had fully developed our idea and process, we rented an Arduino 101, 15 Digi-Key components (jumper wires, sound sensors, and a soldering kit and iron), and an Adafruit Bluefruit BLE (Bluetooth Low Energy) breakout board. The next day, we wired our components to the Arduino so that the sound sensors formed an equilateral triangle with 20 cm sides, spacing the sensors 120° apart around its center, and so that we could establish connectivity between the Arduino and the mobile app.
Our mission was to translate sound waves into identifiable objects based on their location and direction. We determined that we would need hardware components, such as a microcontroller with sensors whose microphones were powerful enough to distinguish between nearby sounds. Then we worked on implementing Bluetooth to connect with our Flutter-based mobile application, which would receive the data from the three sound sensors and convert it into graphics on the screen of the mobile app. Using augmented reality, the mobile application would display the location and intensity of the sounds according to the direction the camera is facing.
Theoretical research and findings behind sound triangulation
In general, localizing a sound source is a non-trivial topic to grasp, let alone implement in the short amount of time allotted in a hackathon. At first, I was trying to understand how such a process could be replicated and found a plethora of research papers that were insightful and related to this difficult problem. The first approach I found performed sound localization with a single microphone: monaural capturing. Another used two microphones, but both approaches left the direction of the sound source ambiguous, since it could be anywhere in the 2D plane. That is why we settled on three microphones for our hackathon project: a third microphone removes the ambiguity of direction.
Essentially, we decided to localize sound with three microphones, placing each microphone at a vertex of an equilateral triangle centered at the origin with a radius of 20. The key here is that the microphones are non-collinear, as a linear placement would still leave ambiguity about a sound that could be behind the mics. Each mic captures the sound pressure from the source and quantifies it so that the location of the source can be determined later on. We used the sound pressure at each mic because there is an inverse relationship between sound pressure and distance from the source, which lets us estimate each mic's distance to the source. Since the mic locations are already known, we treated each mic as the center of a circle whose radius is that mic's estimated distance to the source. Subtracting the circle equations from one another gives a linear system, and we used Gaussian elimination to reduce its coefficient matrix to the identity matrix and read off the solution as the source's location. This is how we triangulated the location of the sound source, assuming that there is only one point where the three circles can intersect and that the mics always stay in a triangular formation. We chose this formulation because of the limitations of the available hardware and our limited knowledge of higher-level algorithms.
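A minimal Python sketch of the steps above, for illustration only; it is not the code that ran on the Arduino or in the app. The function names, the mic angles (90°, 210°, 330°), and the calibration constant k in the assumed model pressure = k / distance are all hypothetical choices made for this example.

```python
import math


def mic_positions(radius=20.0):
    """Mics at the vertices of an equilateral triangle centered at the origin.

    The angles are an assumption; any three angles 120 degrees apart work.
    """
    return [
        (radius * math.cos(math.radians(a)), radius * math.sin(math.radians(a)))
        for a in (90.0, 210.0, 330.0)
    ]


def pressure_to_distance(pressure, k=1.0):
    """Assumed inverse model: pressure = k / distance (k is a made-up calibration constant)."""
    return k / pressure


def solve_2x2(a, b):
    """Gaussian elimination on a 2x2 system, reducing [a | b] to [I | solution]."""
    m = [[a[0][0], a[0][1], b[0]], [a[1][0], a[1][1], b[1]]]
    # Partial pivoting: put the row with the larger first coefficient on top.
    if abs(m[1][0]) > abs(m[0][0]):
        m[0], m[1] = m[1], m[0]
    if abs(m[0][0]) < 1e-12:
        raise ValueError("singular system (are the mics collinear?)")
    pivot = m[0][0]
    m[0] = [v / pivot for v in m[0]]
    # Eliminate the first column from the second row.
    factor = m[1][0]
    m[1] = [v1 - factor * v0 for v0, v1 in zip(m[0], m[1])]
    if abs(m[1][1]) < 1e-12:
        raise ValueError("singular system (are the mics collinear?)")
    pivot = m[1][1]
    m[1] = [v / pivot for v in m[1]]
    # Eliminate the second column from the first row.
    factor = m[0][1]
    m[0] = [v0 - factor * v1 for v0, v1 in zip(m[0], m[1])]
    return m[0][2], m[1][2]


def locate(mics, distances):
    """Subtract circle 1 from circles 2 and 3 to get two linear equations in (x, y)."""
    (x1, y1), (x2, y2), (x3, y3) = mics
    r1, r2, r3 = distances
    a = [
        [2 * (x2 - x1), 2 * (y2 - y1)],
        [2 * (x3 - x1), 2 * (y3 - y1)],
    ]
    b = [
        r1**2 - r2**2 + x2**2 + y2**2 - x1**2 - y1**2,
        r1**2 - r3**2 + x3**2 + y3**2 - x1**2 - y1**2,
    ]
    return solve_2x2(a, b)
```

For example, feeding in pressures generated from a known source position through the same assumed inverse model should return that position. With real sensor readings the three circles rarely meet at a single point, so the result is only an estimate.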
Another way of visualizing the intersection of the three circles is geometric, using radical lines: each pair of circles defines a radical line, and the point where all three lines meet is the radical center. In this specific case, given the previous assumption of a single intersection point and a triangular positioning around the origin, the radical center is simply that intersection point. The figure below generalizes this description.
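In equation form, as a sketch with notation introduced here for illustration: let mic i sit at (x_i, y_i) and let r_i be its estimated distance to the source. Subtracting the circle equation of mic j from that of mic i cancels the quadratic terms in x and y and leaves the radical line of that pair of circles.

```latex
\begin{aligned}
(x - x_i)^2 + (y - y_i)^2 &= r_i^2, \qquad i = 1, 2, 3 \\
2(x_j - x_i)\,x + 2(y_j - y_i)\,y &= r_i^2 - r_j^2 + (x_j^2 + y_j^2) - (x_i^2 + y_i^2)
\end{aligned}
```

When all three mics are the same distance from the origin, as in the triangular layout described earlier, the last two terms cancel and each radical line reduces to 2(x_j - x_i) x + 2(y_j - y_i) y = r_i^2 - r_j^2. Any two of the three lines already determine the radical center.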
Challenges we ran into
A significant chunk of time was spent dealing with technical hurdles: many of us didn't come in with much experience in Flutter and Dart, so we dealt with minor software issues and program bugs. We also had to read a lot of documentation and plenty of Stack Overflow, both to understand the science behind our complex idea of detecting the direction and distance of sound with our hardware and to solve issues we ran into or simply learn how to implement things. Integrating our mobile application with the hardware provided was tricky given the limited range of plugins that Flutter supported, and towards the end we decided to pivot to a web application.
We also faced more mundane problems, such as not being able to compile our Flutter app for several hours due to Gradle synchronization problems within Android Studio, and other problems with the connectivity between the Arduino BLE and our mobile application.
As an alternative, we created a web application that processes HTTP requests as a substitute for Bluetooth connectivity. Hosted through Google Hosting, it would make web API calls and serve a PWA (Progressive Web App) that is still compatible with mobile usage.
Accomplishments that we're proud of
We are proud of coming up with and following through on a multifaceted project idea! We divvied up the work to focus on four key areas: hardware, mobile app AR functionality, network connectivity, and front-end design. Our team as a whole worked incredibly hard on making this a success. Some of our most memorable milestones were: 1) successfully getting a smartphone to connect to the Arduino via Bluetooth, and 2) finalizing a theoretical formula for sound triangulation based on mathematical research!
What we learned
Because all of us had little to no prior experience in at least one of the technologies we used, we all learned how to connect software with hardware and how to conceptualize the complex algorithms that make the technology possible. Additionally, we learned the importance of pinpointing and outlining the technologies we would use for the hackathon project before jumping into them, as we determined midway through the day that we would have had more resources if we had selected other frameworks.
However, we all had a pleasant experience taking on a major challenge at HackHarvard, and this learning experience was extremely exciting in terms of what we were able to do within the weekend and the complexity of combining technologies together for widespread applications.
What's next for TRICONE
Our application and hardware connectivity have significant room to grow; initially, the idea was to have a standalone mobile application that could easily be used as a handheld. At our current prototyping stage, we rely substantially on hardware to produce accurate results. We believe that a mobile application or AR apparatus (e.g., HoloLens) is still the end goal, albeit one that requires a significant upfront budget for technology research and funding.
In future work, the method of localization can be improved by increasing the number of microphones and by using higher-level algorithms, such as beamforming methods or Multiple Signal Classification (MUSIC), to refine the estimate of the source location. Additionally, research in this area often uses fast Fourier transforms, which turn captured sound into the frequency domain, along with differences in time delays between microphones; it would be interesting to substitute these for the comparatively primitive method used originally in this project. We would also like to implement an outlier removal algorithm that excludes unrelated sound, so that localization can still be determined without interruption. Retrospectively, we learned that math is strongly connected to real-world situations and that it can quantify and represent sound, which is invisible to the naked eye.
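As a rough illustration of the time-delay idea mentioned above, the sketch below estimates how many samples one microphone's signal lags behind another's. To stay short it uses a direct time-domain cross-correlation rather than an FFT, so it is a stand-in for the technique, not an FFT-based implementation. The function names, the sample rate, and the speed of sound (about 343 m/s in air at room temperature) are assumptions for this example, and nothing like this ran in our prototype.

```python
import math


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) that best aligns sig_b with sig_a.

    A positive result means sig_b is a delayed copy of sig_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        # Keep the lag with the highest correlation score.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_path_difference(lag_samples, sample_rate_hz, speed_of_sound=343.0):
    """Convert a sample lag into the extra distance (in meters) the sound traveled."""
    return lag_samples / sample_rate_hz * speed_of_sound


def make_burst(length=64, center=20, width=3.0):
    """Synthetic test signal: a short sine burst under a Gaussian envelope."""
    return [
        math.exp(-((i - center) ** 2) / (2 * width**2)) * math.sin(0.9 * i)
        for i in range(length)
    ]
```

With a pair of microphones, the path difference constrains the source to a curve; combining the path differences from several microphone pairs is what narrows it down to a location.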