Inspiration

Funny story, actually: the spark of inspiration came from the dawning of another COVID-19 lockdown. For some context, before the winter break school was hosted in person, and in general it was just easier to pay attention in a work environment than in the comfort of your own house. After the break, however, school went completely virtual, which meant paying attention to class while in pajamas, with our phones right next to us and every video game we play just two clicks away. There is no real way to monitor whether a student is paying attention, which makes it all the harder to stay attentive through an entire lesson. So we would often find ourselves asking each other "hey, did you hear what she said?" or "wait, so when is our test again?" Vital information gets missed, which means asking someone what we missed, and missing more information in the process: a rabbit hole of a mess. Little did we know that one day, after complaining to each other about this particular rabbit hole, it would become the basis of what is now Auricular.ai.

How it Works

Auricular.ai is a handy virtual assistant for online conferences where you are looking to pick up key information. It has a wide variety of potential applications (touched upon in the Future Plans section), but it is currently aimed at students. It picks up on useful information you might miss, so you can look back at the collected data and see which due dates, mark schemes, and so on you didn't hear mentioned. The project uses the wit.ai API and the Google Speech-to-Text API in tandem with the application we designed to personalize the data entries from your lessons and skim through them for key information.

The two APIs work in beautiful harmony, complementing one another perfectly. First, an audio file is sent to the Google Speech-to-Text API, which runs through the audio, converts it all into text, and stores the sentences in a data field. Each transcription also comes with an associated confidence level, since unclear audio can sometimes produce a false sentence. Code shown below.

from google.cloud import speech

client = speech.SpeechClient(credentials=credentials)
audio = speech.RecognitionAudio(uri=gcs_uri)
config = speech.RecognitionConfig(sample_rate_hertz=frameRate,
                                  language_code='en-US',
                                  enable_automatic_punctuation=True)
# Transcription of longer audio files stored in Google Cloud Storage runs asynchronously
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=90)
data = []
for result in response.results:
    # Split each transcript into sentences, dropping empty strings
    data.append(list(filter(None, result.alternatives[0].transcript.split("."))))
return data

After that, the sentences are sent to wit.ai, processed, and sent back to the data field to hold the final products, which is quite a cool process. Essentially, the AI takes each sentence through two stages of detection: 1. keywords, and 2. intentions. The keyword stage checks whether a sentence contains important words such as "quiz" or "test", and the intention stage checks whether the sentence's context actually makes the information relevant. If the sentence is, for example, "the test was so easy", the AI detects that it is not relevant and does not send it back. We trained this manually by sending inputs and letting the AI learn which sentences it should and should not catch as intentions.
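To give a feel for this, here is a minimal sketch of the two-stage check; the token, keyword list, and confidence threshold are illustrative placeholders, not our exact values.

import requests

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # placeholder for the wit.ai server token
KEYWORDS = {"test", "quiz", "assignment", "due"}  # illustrative keyword list

def is_relevant(sentence):
    # Stage 1: keyword detection -- skip sentences with no important words
    if not any(word in sentence.lower() for word in KEYWORDS):
        return False
    # Stage 2: intention detection -- ask wit.ai whether the context is relevant
    resp = requests.get("https://api.wit.ai/message",
                        headers={"Authorization": "Bearer " + WIT_TOKEN},
                        params={"q": sentence})
    intents = resp.json().get("intents", [])
    # Keep the sentence only if the trained model is confident in an intent
    return bool(intents) and intents[0]["confidence"] > 0.8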

Finally, the user interface compiles the relevant data sent back into viewable information, with each entry's associated keyword, timestamp, and sentence, making it very easy for the student to access and use. The interface can also reopen past lesson recordings to look at old data, acting as a very useful virtual agenda.
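As a rough illustration, the results screen can be laid out with a few Tkinter labels like this; the entry data here is made up for the example.

import tkinter as tk

# Illustrative data: each entry is (keyword, timestamp, sentence)
entries = [("test", "00:12:04", "The unit test will be next Friday."),
           ("quiz", "00:31:47", "Your quiz on chapter 5 is due Monday.")]

root = tk.Tk()
root.title("Auricular.ai - Lesson Summary")
for keyword, timestamp, sentence in entries:
    # One row per detected piece of key information
    tk.Label(root, text="[%s] %s: %s" % (timestamp, keyword.upper(), sentence),
             anchor="w").pack(fill="x", padx=10, pady=2)
root.mainloop()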

Construction

We built our project in Python, mainly in the Atom editor, working in tandem with the wit.ai API and the Google Speech-to-Text API. We coded all the graphics and GUI structure ourselves using Tkinter's built-in widgets.

Obstacles Faced

The biggest obstacle for us was the time constraint. Going into this event we knew our idea was ambitious, but we seriously underestimated how long it would take. We had to learn a new programming language, figure out how to maneuver two separate APIs, and learn how to write backend code, all of which sounded quite simple in our planning phase. To our surprise, the information and syntax we had to absorb got progressively harder as we went along, and although we got through it, it was a tiring process. As grueling as it was, though, the end product and the collaboration along the way made for a very enjoyable and satisfying project, especially when all of our work came together.

Personal Achievements

In our opinion, this subsection is the most valuable to us. We have attended a decent number of hackathons, but this project was the most enjoyable and informative of them all. We each had set goals in mind and were able to achieve them, both independently and while working together, making this the most personal hackathon experience yet. Here are our individual personal achievements:

Aryan: I’ve always been intimidated by backend coding, as the complexity of the code at surface level looked too confusing for me to register what was going on. I’ve always wanted to conquer this fear, and I can proudly say that after this hackathon, backend coding will not haunt my dreams anymore. I also made some cool graphics with our group :)

Brian: Heading into this hackathon, I had zero knowledge about APIs: what they were, what they are used for, and how to use them. However, after attending the Intro to APIs workshop, I managed to grasp how APIs work and how to use them. For our Hack the North project, I put that newfound knowledge to use, and I am really proud of what we created. I feel accomplished having learned a new concept and successfully applied it right away.

Ishan: As the main backend coder going into Hack the North, my personal achievements would have to be the following: learning how to make GET and POST requests to APIs, and learning how to run a backend in Python. Beyond the purely informational accolades, I am also very proud of our group's efficiency in building such a large-scale project in the limited time given. Honestly, the entire project could be my personal achievement.

Min: I've generally surmounted most of the challenges coding has posed to me through a lot of effort, and this time around was no different. Having learned the majority of Tkinter, Python GUI, and Python graphics syntax in less than a day, I'm quite content with the amount of knowledge I gained from this experience. It also showed me that I am capable of more than I realize, which makes me proud of myself.

What we Learned

All of our group members have a solid background in programming, so we assumed the knowledge we already had would be enough for this project. Looking back, we did not realize how many new skills we would have to learn and apply during this hackathon. We predominantly use Java, but the APIs we wanted to use required Python. Two of our group members had never used Python, so learning an entirely new programming language was a huge hurdle for them, as each language has its own formatting, keywords, and, most importantly, syntax. Even the simplest of concepts required research to learn the new language's syntax. The group members with previous Python experience learned something new as well, since we needed to create graphics and an interface for users to interact with; for this, we all had to learn the Tkinter graphical user interface package.

Another thing our group learned was APIs and how to send and receive information with them. Before this hackathon we did not really know about APIs, but the Gear Up Intro to APIs Workshop helped us understand what an API is, how it works, and how we can use it to our advantage in projects. The workshop gave us a solid foundation of knowledge, and we branched off from it to make our idea a reality. Overall, Hack the North 2020++ gave us a great opportunity to push ourselves and expand our horizons. Throughout the creation of our project we learned many new things and have grown as hackers.

Future Plans

Although we have not thought far ahead, Auricular.ai has plenty of room to expand and its future is looking bright. The most crucial thing we can improve is the training of our wit.ai app. With the limited timeframe a hackathon provides, we could only supply so much data for the API to work from. We have not been able to refine it on a scenario-by-scenario basis, which is why Auricular.ai currently handles only the most important and frequently occurring scenario: the date and time of a specific "auricle" (e.g. tests, quizzes, and assignments). With more time, we will be able to train our application on different scenarios, and who knows, maybe it will even be able to distinguish the speaker, so that key information is only extracted from the teacher. Another feature we can add to Auricular.ai is integration with the Google Calendar API: users could link their Google account with our application, and important information such as test dates would be automatically added to their Google Calendar (a rough sketch of what that could look like is below). With seamless integration into other apps, it will be easy to incentivize newcomers to start using Auricular.ai. Our journey has only just started and we still have a long way to go. In short, the possibilities for Auricular.ai's future are limitless, and we will be working hard to achieve them.
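We have not built this yet, but with Google's Python client library the integration could look something like this sketch; the function name and event format are just illustrative.

from googleapiclient.discovery import build

def add_auricle_to_calendar(creds, summary, date):
    # creds: OAuth2 credentials obtained when the user links their Google account
    service = build("calendar", "v3", credentials=creds)
    event = {
        "summary": summary,        # e.g. "Math quiz"
        "start": {"date": date},   # all-day event, e.g. "2021-03-01"
        "end": {"date": date},
    }
    service.events().insert(calendarId="primary", body=event).execute()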

Built With

Python, Tkinter, wit.ai, Google Speech-to-Text