Inspiration

The cycle continues: students enter an education system that often contradicts the very purpose of education, which is learning from first principles in a way that maximizes learning outcomes. Unfortunately, education systems are designed to reward completion over learning, which restricts opportunities for growth and meaningful engagement with class material. One example our group kept returning to is the struggle to take organized, comprehensible notes during lectures, meetings, and other attention-demanding situations. Taking inspiration from this common experience in schools, we created a speech-to-analysis platform that uses recent advances in large language models and deep learning to generate insights from varied speech. This augments learning environments by freeing students from having to multi-task in demanding situations, and it also provides inspiration to drive thinking and learning.

What it does

Our project analyzes speech with large language models so that students can get usable notes while still engaging with class material. This helps students stay focused and spares them from needlessly multi-tasking.

How we built it

During development, we discovered how few resources exist for turning speech into analysis to improve note-taking in academic settings. By drastically improving note-taking, we believe students will be able to engage more closely with class material and get more out of their learning and productivity. To accomplish our goal, we built the project from the ground up in Visual Studio using Streamlit, CohereAI, and OpenAI’s Whisper model. We chose Streamlit and CohereAI together because both platforms were easy to navigate, and Cohere’s API suited our project because it already included text summarization and generation based on user input. Since all members of our group were most familiar with Python, we opted for Streamlit, which aids in web development and lowers the barrier to importing models. To produce a functional speech-to-analysis platform, we first used Whisper to transcribe large audio files, then used CohereAI to generate different results depending on user preference. Unlike competing tools, our speech-to-analysis platform can generate effective summaries and practice questions from hours' worth of material in a computationally efficient manner, helping guide the learning process and enriching the academic environment.
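The transcribe-then-generate flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our actual code: `chunk_words`, `analyze`, `PROMPTS`, and the 500-word chunk size are hypothetical names and values chosen for the example, and `generate` stands in for whatever function wraps the Cohere API call. Only the `whisper.load_model` and `model.transcribe` calls come from the open-source `openai-whisper` package.

```python
from typing import Callable, List


def whisper_transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with the open-source openai-whisper package."""
    import whisper  # imported lazily so the rest of the sketch runs without it

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


def chunk_words(text: str, max_words: int = 500) -> List[str]:
    """Split a long transcript into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


# Hypothetical prompt templates, one per user preference.
PROMPTS = {
    "summary": "Summarize the following lecture excerpt in a few bullet points:\n\n{chunk}",
    "questions": "Write three practice questions about the following lecture excerpt:\n\n{chunk}",
}


def analyze(
    transcript: str,
    mode: str,
    generate: Callable[[str], str],
    max_words: int = 500,
) -> List[str]:
    """Run one text-generation call per transcript chunk and collect the results.

    `generate` is an assumption: any callable that takes a prompt string and
    returns generated text (for example, a thin wrapper around a Cohere call).
    """
    template = PROMPTS[mode]
    return [generate(template.format(chunk=chunk)) for chunk in chunk_words(transcript, max_words)]
```

Passing `generate` in as a parameter keeps the sketch independent of any particular SDK version and makes it easy to swap in a stub while building the Streamlit front end.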

Challenges we ran into

While building, we faced some technical difficulties, as we were all new to these platforms. For example, Streamlit was far more limited in resources (especially UI components) than React or Flask, and this was compounded by how few online tutorials and demonstrations were available for implementing certain features. Additionally, our CohereAI API key permitted only 5 API calls per minute, which prevented real-time, interactive analysis, as we did not have the license needed to execute more ambitious ideas. To work around these issues, we quickly pivoted by embedding custom CSS and HTML in Streamlit and by using Whisper in parallel to reduce API calls and cost.
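One common way to stay under a cap like 5 calls per minute is a small client-side sliding-window limiter. The sketch below is an illustration of that general technique rather than what we shipped; `RateLimiter` and its parameters are hypothetical names, and the `clock` and `sleep` arguments exist only so the behavior can be checked without actually waiting.

```python
import time
from collections import deque
from typing import Callable, Deque


class RateLimiter:
    """Block until a call is allowed under a max_calls-per-period budget."""

    def __init__(
        self,
        max_calls: int = 5,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # start times of recent calls

    def wait(self) -> None:
        """Call this right before each API request."""
        now = self._clock()
        # Forget calls that have already left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest call in the window expires, then drop it.
            self._sleep(self.period - (now - self._stamps[0]))
            self._stamps.popleft()
            now = self._clock()
        self._stamps.append(now)
```

In use, a single `limiter = RateLimiter()` would be created once and `limiter.wait()` called immediately before every request to the text-generation API.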

Accomplishments that we're proud of

We are most proud of building a functional project that not only leverages the Cohere API, but also works in tandem with other language models to provide insightful speech analysis.

What we learned

We learned how to integrate complex AI models into web development and how to use imperative programming to increase transparency in the building process.

What's next for Duck.it.we.ball

Next, we want to learn how different frameworks can improve the usability of our website and help us polish our work into a final product.
