Inspiration
I was inspired by Gemini 3's ability to take multimodal input, especially video. I tested it with a few videos and found the Flash model good enough.
What it does
It takes a video, analyzes it, and reports issues at specific timestamps.
How we built it
I am using gemini-3-flash-preview with Go on the backend and a simple React frontend.
Challenges we ran into
There is some latency between uploading or recording a video and getting a response back, but that could be improved by moving to asynchronous processing.
Accomplishments that we're proud of
I hadn't integrated video into a multimodal LLM workflow before, so I am proud that this project makes that simpler.
What we learned
I learned about uploading and managing videos. I feel I could learn about storing videos too, but for now this is it.
What's next for Gemini Coach
A better interface and prompts for different scenarios.
Built With
- docker
- go
- postgresql
- railway
- react
- redis