Inspiration
We have all felt privileged to receive education from TAs and professors from around the world, but we noticed barriers in language and writing style. We wanted to make bridging that gap effortless, so we built EchoBoard.
What it does
EchoBoard transcribes heavy accents into live captions and makes handwriting easier to read by projecting the lecturer's whiteboard content onto students' screens and refining the handwriting.
How we built it
We built EchoBoard using Gemini and ElevenLabs. We send frames from a camera to Gemini's API every 2 seconds; Gemini interprets each image, and we lay out its output in CSS columns and render it as both LaTeX and Markdown. At the same time, we stream audio from our microphone to ElevenLabs over a WebSocket. The student sees both transcriptions on their screen.
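The 2-second pacing above can be sketched as a small throttle around the capture loop. This is a minimal illustration, not our actual code: `FrameThrottle` and the clock parameter are names we invented here, and the real loop would pass the captured frame to the Gemini API.

```python
import time

FRAME_INTERVAL = 2.0  # seconds between frames sent to Gemini


class FrameThrottle:
    """Decide whether enough time has passed to send the next frame.

    The clock is injectable (defaults to time.monotonic) so the
    pacing logic can be exercised without real waiting.
    """

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self._last_sent = None

    def ready(self):
        now = self.clock()
        if self._last_sent is None or now - self._last_sent >= self.interval:
            self._last_sent = now
            return True
        return False


# In the capture loop, a frame is sent only when the throttle fires:
#
#   throttle = FrameThrottle(FRAME_INTERVAL)
#   while capturing:
#       frame = read_frame()           # hypothetical camera read
#       if throttle.ready():
#           send_to_gemini(frame)      # hypothetical API call
```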
Challenges we ran into
Gemini 2.5 Flash takes a significant amount of time to process single frames, so there was some latency initially. We reduced this latency slightly by keeping our camera open instead of closing it every time we sent a picture to Gemini.
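The difference between the two camera strategies can be shown with a tiny stand-in for a camera backend (e.g. OpenCV's `VideoCapture`); the `Camera` class and both capture functions here are illustrative names, not EchoBoard's real implementation:

```python
class Camera:
    """Minimal stand-in for a camera backend; counts open() calls
    so the two strategies can be compared."""

    def __init__(self):
        self.opens = 0
        self.is_open = False

    def open(self):
        self.opens += 1
        self.is_open = True

    def read(self):
        assert self.is_open, "camera must be open to read a frame"
        return "frame"

    def release(self):
        self.is_open = False


def capture_reopen_each_time(cam, n):
    # Slow path: the camera is opened and released around every frame,
    # paying the device-initialization cost n times.
    frames = []
    for _ in range(n):
        cam.open()
        frames.append(cam.read())
        cam.release()
    return frames


def capture_persistent(cam, n):
    # Faster path: open once, read n frames, release at the end.
    cam.open()
    frames = [cam.read() for _ in range(n)]
    cam.release()
    return frames
```

Keeping the device open amortizes initialization across all frames, which is where the latency savings came from.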
Accomplishments that we're proud of
We like that we were able to incorporate both LaTeX and Markdown into our whiteboard projection.
What we learned
We learned more about how WebSockets work and how they keep a persistent connection so data flows fluidly between server and client. We also learned how different AI APIs work.
What's next for EchoBoard
We want to improve formatting and add graph interpretation, and we also need to support external microphones. Once we have scaled, we want to add authentication so that TAs and professors can invite users to their class sessions.
Built With
- elevenlabs
- fastapi
- gemini
- javascript
- python