OpenAI Reasoning & Multimodality track

Best under 21 track

Inspiration

Our team members are all first-generation immigrants, so we know how challenging it is to improve English, especially spoken English. Good English is critical for us and our families to find opportunities, make local friends, and integrate into Irish society.

What it does

We give multimodal AI agents the ability to mimic real-life human conversations, and we supply learners with detailed real-time feedback, suggestions, and online resources to improve their English.

How we built it

We used OpenAI's latest Realtime API with two agents: one handles the real-time conversation, and the other takes the transcript of that conversation, produces structured outputs, and uses its reasoning ability to generate detailed feedback in categories such as grammar, clarity, and coherence.
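As a rough illustration of the feedback agent's structured output, the sketch below builds a prompt from a transcript and validates the JSON reply before it is shown to a learner. The three category names come from the description above; everything else (the `FeedbackReport` shape, the 1 to 5 score scale, and the function names `buildFeedbackPrompt` and `parseFeedback`) is an assumption made for illustration, not our actual implementation. The model call itself is left out.

```typescript
// Hypothetical feedback shape: the category names come from the write-up,
// the field names and the 1-5 score scale are assumptions for illustration.
type Category = "grammar" | "clarity" | "coherence";

interface CategoryFeedback {
  score: number; // assumed scale: 1 (weak) to 5 (strong)
  comments: string[]; // concrete suggestions for the learner
}

type FeedbackReport = Record<Category, CategoryFeedback>;

const CATEGORIES: Category[] = ["grammar", "clarity", "coherence"];

// Build the prompt the feedback agent would receive for one transcript.
function buildFeedbackPrompt(transcript: string[]): string {
  return [
    "Review the learner's turns in this conversation transcript.",
    `Reply with JSON containing the keys: ${CATEGORIES.join(", ")}.`,
    "Each key maps to { score: 1-5, comments: string[] }.",
    "Transcript:",
    ...transcript.map((line, i) => `${i + 1}. ${line}`),
  ].join("\n");
}

// Validate the model's raw JSON reply; throws if any category is missing or malformed.
function parseFeedback(raw: string): FeedbackReport {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("feedback must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const report = {} as FeedbackReport;
  for (const category of CATEGORIES) {
    const entry = record[category] as Partial<CategoryFeedback> | undefined;
    if (
      !entry ||
      typeof entry.score !== "number" ||
      entry.score < 1 ||
      entry.score > 5 ||
      !Array.isArray(entry.comments) ||
      !entry.comments.every((c) => typeof c === "string")
    ) {
      throw new Error(`missing or malformed feedback for "${category}"`);
    }
    report[category] = { score: entry.score, comments: entry.comments };
  }
  return report;
}
```

Validating the reply on our side means a malformed response fails loudly instead of reaching the learner as a half-empty report.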

Challenges we ran into

Following advice from an OpenAI mentor, we built our product on top of a Realtime API client written in TypeScript and Node.js, both of which were totally new to all of us. To make better use of the models' reasoning ability, we decided to implement a multi-agent AI system, in which communication between agents was our biggest issue.
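One simple way to picture the communication between agents is a small typed message bus: the conversation side publishes each turn, and the feedback side collects the learner's turns until the session ends. This is only a minimal sketch; `AgentMessage`, `MessageBus`, and `collectLearnerTurns` are hypothetical names, and it does not describe how our agents are actually wired together.

```typescript
// Hypothetical message types; the names are assumptions for illustration.
type AgentMessage =
  | { kind: "turn"; speaker: "learner" | "assistant"; text: string }
  | { kind: "session_end" };

type Handler = (message: AgentMessage) => void;

// A synchronous in-process bus: every subscriber sees every published message.
class MessageBus {
  private handlers: Handler[] = [];

  subscribe(handler: Handler): () => void {
    this.handlers.push(handler);
    // Return an unsubscribe function so an agent can detach cleanly.
    return () => {
      this.handlers = this.handlers.filter((h) => h !== handler);
    };
  }

  publish(message: AgentMessage): void {
    // Iterate over a copy so a handler that unsubscribes mid-dispatch is safe.
    for (const handler of [...this.handlers]) handler(message);
  }
}

// The feedback side buffers learner turns and releases them when the session ends.
function collectLearnerTurns(
  bus: MessageBus,
  onComplete: (turns: string[]) => void,
): void {
  const turns: string[] = [];
  const unsubscribe = bus.subscribe((message) => {
    if (message.kind === "turn") {
      if (message.speaker === "learner") turns.push(message.text);
    } else {
      unsubscribe();
      onComplete(turns);
    }
  });
}
```

Giving every message an explicit `kind` lets the compiler check that each agent handles all message types, which helps when several agents share one channel.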

Accomplishments that we're proud of

  • We made creative use of the Whisper API to generate transcripts, so that messages are conveyed more clearly.
  • We successfully navigated a full-stack TypeScript repository that follows best practices, and complied with its design rules to build our own product on top of it.
  • We used agile methodology and scrums to take an iterative approach and improve our efficiency.

What we learned

How frontend and backend systems work with each other, and how to design and implement an agentic system.

What's next for Team39

Construct a better agentic system with better tools and collaboration patterns. Expand the system to languages other than English.
