According to Canada's 2016 census, about 22.9% of Canadians have a non-official language as their mother tongue, and about 1.9% cannot hold a proper conversation in either English or French. It is crucial that these people, along with other non-native English speakers around the world, can access the same working and education opportunities as everyone else. Furthermore, the COVID-19 pandemic has left us with far fewer opportunities to communicate with others, and many people are still struggling to land jobs or internships, further emphasizing the need to maintain superb communication skills. The best way to build and maintain these skills is undeniably practice, PRACTICE, PRACTICE, and that is exactly what our project helps with.

What it does

CogniTalk analyzes recordings and audio files of verbal responses to common interview questions and stores past recordings in a log to track progress. After logging in with their Google or email account, the user chooses a random interview question and responds by either recording themselves in the web app or uploading an audio file of their own recording. The audio data is sent to our Flask backend server, where a voice-analysis library analyzes it and returns the results to the Svelte frontend. The results are also stored in a Firestore database so the user can keep a log of their past recordings. This log is displayed in a table that highlights areas of strength and areas for improvement.
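The backend side of this flow can be sketched as a single Flask endpoint. The route name, form field, and `analyze_audio()` helper below are illustrative assumptions, and the standard-library `wave` module stands in for the actual voice-analysis library, which the writeup does not name:

```python
# Hypothetical sketch of the analysis endpoint; route and field names are
# assumptions, and wave stands in for the real voice-analysis library.
import io
import wave

from flask import Flask, jsonify, request

app = Flask(__name__)

def analyze_audio(wav_bytes: bytes) -> dict:
    """Placeholder analysis: read duration and sample rate from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
    return {"duration_sec": frames / rate, "sample_rate": rate}

@app.route("/analyze", methods=["POST"])
def analyze():
    # The frontend uploads the recording as multipart form data.
    audio = request.files["audio"]
    results = analyze_audio(audio.read())
    # The frontend displays these results and also writes them to Firestore.
    return jsonify(results)
```

In practice the real library would return speech metrics rather than header fields, but the request/response shape stays the same.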

How we built it

Our frontend is built with SvelteJS and our backend with Flask. An existing audio-analysis library runs in the backend to analyze the audio data. Finally, Firebase Authentication handles logging in and signing up users, and Firebase Firestore serves as the database that stores the results of past recordings.
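One way to picture the Firestore side is the document each recording produces. The field names and collection path below are hypothetical, since the writeup does not specify the schema:

```python
# Hypothetical shape of a recording-log entry before it is written to
# Firestore. All field names here are illustrative assumptions.
from datetime import datetime, timezone

def build_log_entry(question: str, results: dict) -> dict:
    """Bundle an interview question with its analysis results for storage."""
    return {
        "question": question,
        "results": results,  # output of the voice-analysis step
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

# With the Firebase Admin SDK, the entry could then be stored per user, e.g.:
#   db.collection("users").document(uid).collection("recordings").add(entry)
entry = build_log_entry("Tell me about yourself.", {"duration_sec": 42.0})
```

Keeping one document per recording is what lets the frontend render the log as a table of past attempts.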

Challenges we ran into

As a team of only two, we faced a major struggle in accomplishing all of our desired tasks in time. One of us (Eric) worked on the entire frontend and the other (Borna) on the entire backend, meaning each of us had a significant portion of the project to carry. One major challenge in our code was getting the backend to analyze audio when given only a URL as input (the case where the user records directly in the web app). We spent countless hours figuring out how to use that URL to store the audio data in a .wav file. Another major issue was returning the analysis output to the frontend and storing it in Firestore: with our minimal knowledge of web requests, we struggled to link the frontend and backend and to work with the Firestore API to store data in a complex format. However, we managed to work through each of our bugs and finish with a successful project.
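The URL-to-.wav step we describe can be sketched with the standard library alone. The function name and paths are illustrative, and a production version would also want error handling and content-type checks:

```python
# A minimal sketch, assuming the backend receives only a URL pointing at the
# recording (the in-browser recording case). urllib fetches the bytes and
# writes them to a local .wav file the analysis library can open.
import urllib.request

def save_recording(url: str, dest_path: str) -> str:
    """Download the audio at `url` and store it as a local .wav file."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    with open(dest_path, "wb") as f:
        f.write(data)
    return dest_path
```

Once the bytes are on disk, the analysis step can treat browser recordings and user-uploaded files identically.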

Accomplishments that we're proud of

As mentioned before, we are proud to have overcome nearly all of our bugs and to end the hackathon with an almost fully functional web app that others can use right now. We also picked up a couple of new skills, such as working with audio files, and greatly improved existing ones, such as making web requests, during our hacking period.

What we learned

Apart from the new technical skills we gained while hacking, both of us learned about the advantages and disadvantages of being a duo rather than a full team of four. Most notably, we learned that although a duo demands more work and code from each teammate, duo teammates have much more control over their hacking decisions, and each can take more satisfaction from the project and gain more hacking experience.

What's next for CogniTalk

One of our future goals for CogniTalk is to develop a much more rigorous algorithm for scoring a recording's overall performance and to surface this deeper analysis in the user's recording log. We have also thought about training our own speech-analysis model with machine learning for better accuracy on the speech features that matter most.
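As a sketch of what such a scoring algorithm might look like, the function below combines a few common speech metrics into a single 0-100 score. The metrics, ideal pace, and weights are entirely hypothetical, not CogniTalk's actual formula:

```python
# Hypothetical overall-performance score; all weights and thresholds are
# illustrative assumptions, not CogniTalk's real algorithm.
def overall_score(wpm: float, filler_ratio: float, pause_ratio: float) -> float:
    """Score pacing (ideal ~140 wpm), filler-word ratio, and long-pause ratio."""
    pacing = max(0.0, 1.0 - abs(wpm - 140) / 140)   # 1.0 at the ideal pace
    fluency = max(0.0, 1.0 - 5 * filler_ratio)      # penalize filler words
    flow = max(0.0, 1.0 - 3 * pause_ratio)          # penalize dead air
    return round(100 * (0.4 * pacing + 0.3 * fluency + 0.3 * flow), 1)
```

A weighted sum like this is easy to display per-component in the recording log, so the table can highlight which metric dragged a score down.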

Built With

SvelteJS, Flask, Firebase Authentication, Firebase Firestore
