Inspiration
I was inspired by having this problem myself. I had a lot of meeting recordings, videos, and lectures that I wanted transcribed and summarized. AI tools and vibe coding are also very interesting right now, so I wanted to try them myself and build something useful.
What it does
It takes a video or audio file and produces a transcription, a summary, and speaker diarization for it. Afterwards, the results can be downloaded as text, Markdown, or PDF.
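As a rough illustration of the export step, here is a minimal sketch of how diarized transcript segments could be turned into a Markdown document. It is not the project's actual code: the `Segment` class, the `SPEAKER_00`-style labels, and the helper names `merge_turns` and `to_markdown` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of speech (hypothetical structure)."""
    speaker: str   # assumed speaker label, e.g. "SPEAKER_00"
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def merge_turns(segments):
    """Join consecutive segments from the same speaker into one turn."""
    turns = []
    for seg in segments:
        if turns and turns[-1].speaker == seg.speaker:
            last = turns[-1]
            # Build a new Segment so the input objects are not mutated.
            turns[-1] = Segment(last.speaker, last.start, seg.end,
                                f"{last.text} {seg.text}")
        else:
            turns.append(seg)
    return turns


def to_markdown(title, summary, segments):
    """Render a title, a summary, and the speaker turns as Markdown text."""
    lines = [f"# {title}", "", "## Summary", "", summary, "",
             "## Transcript", ""]
    for turn in merge_turns(segments):
        stamp = format_timestamp(turn.start)
        lines.append(f"**{turn.speaker}** [{stamp}]: {turn.text}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    demo = [
        Segment("SPEAKER_00", 0.0, 2.5, "Hello."),
        Segment("SPEAKER_00", 2.5, 4.0, "Welcome."),
        Segment("SPEAKER_01", 65.0, 70.0, "Thanks."),
    ]
    print(to_markdown("Meeting", "A short greeting.", demo))
```

Merging consecutive segments from the same speaker keeps the exported transcript readable, since speech recognition tends to emit many short chunks per speaker turn.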
How we built it
I started by using bolt.new to create the outline and frontend of the project. Then I refined it myself on the backend side using Claude and VS Code.
Challenges we ran into
The first annoying challenge was that, after some point, bolt.new started complaining that the project was too big. Then it started hallucinating and was not able to add the Bolt badge, so I had to do it myself. Overall, generative AI generates a lot of code; sometimes it is garbage and sometimes it breaks something, so using git was crucial.
Accomplishments that we're proud of
It actually works and is deployed! It is unbelievable how much I was able to achieve on my own in such a short period of time. This technology is very promising, and I had a lot of fun.
What we learned
How to work better with generative AI tools. I learned the bolt.new platform and Netlify for deployment. I also realized that software engineers will still be needed in the near future, and probably more of them, because there have to be people who will support all this code generated by AI :D
What's next for NurgaVoice
I am planning to keep using this project and develop it further. There is still a lot of work to do on optimizing the models to make them faster and less resource-hungry, as well as adding more features and making it more stable and actually helpful to people.
Built With
- bolt
- bootstrap
- celery
- fastapi
- gemma
- javascript
- llama-cpp
- netlify
- pyannote
- python
- whisper
- whisperx