Inspiration
It's the first time in weeks you have had time to go out for lunch. It's a lovely spring day when your colleague asks: "You ready for the all hands?" You panic. Of course you are not ready: between endless days on calls, you haven't had time to produce another edit of your well-rounded slide deck. That was me. Endless days on calls with no time to do any work. But what if you could do that work whilst you were talking? In other words, convert those audio instructions into actual work, starting with presentations!
What it does
SlideMakr removes the tedious process of editing slides, which wastes hours of your time, by giving you the power to edit your slides using your voice. Everyone will help you make a first draft, but let's face it, editing is where you spend hours of your time: adjusting flow charts, responding to a superior's feedback. In my opinion that time is wasted!
How we built it
It starts off with simply creating slides. This is very simple: take the spoken input, transcribe it, and then translate that into Google Slides API requests. We then leveraged the live bidi connection for the editing. Same concept: take the instructions, translate them into Google Slides API requests, execute!
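As a rough illustration of the "translate into Google Slides API requests" step, here is a minimal sketch in Python. It only builds the JSON body for a `presentations.batchUpdate` call and does not contact Google. The function names, the `slide_001` ID and the example title are hypothetical and are not taken from the SlideMakr code; the request shapes (`createSlide`, `insertText`, `updatePageProperties`) follow the public Google Slides API.

```python
import json
from typing import Any

Request = dict[str, Any]


def create_title_slide_requests(slide_id: str, title: str) -> list[Request]:
    """Build requests that add a TITLE_ONLY slide and fill in its title."""
    title_id = f"{slide_id}_title"  # hypothetical naming scheme for the title box
    return [
        {
            "createSlide": {
                "objectId": slide_id,
                "slideLayoutReference": {"predefinedLayout": "TITLE_ONLY"},
                # Give the layout's title placeholder a known ID so that
                # the next request can write text into it.
                "placeholderIdMappings": [
                    {
                        "layoutPlaceholder": {"type": "TITLE", "index": 0},
                        "objectId": title_id,
                    }
                ],
            }
        },
        {"insertText": {"objectId": title_id, "text": title, "insertionIndex": 0}},
    ]


def set_background_request(
    slide_id: str, red: float, green: float, blue: float
) -> Request:
    """Build a request that sets a solid background colour (channels in 0.0-1.0)."""
    return {
        "updatePageProperties": {
            "objectId": slide_id,
            "pageProperties": {
                "pageBackgroundFill": {
                    "solidFill": {
                        "color": {
                            "rgbColor": {"red": red, "green": green, "blue": blue}
                        }
                    }
                }
            },
            # Field mask: only the background colour is updated.
            "fields": "pageBackgroundFill.solidFill.color",
        }
    }


def build_batch_body(requests: list[Request]) -> dict[str, list[Request]]:
    """Wrap individual requests in the body that batchUpdate expects."""
    return {"requests": requests}


if __name__ == "__main__":
    # "Create a slide titled Q2 roadmap, then make the background red."
    requests = create_title_slide_requests("slide_001", "Q2 roadmap")
    requests.append(set_background_request("slide_001", 1.0, 0.0, 0.0))
    print(json.dumps(build_batch_body(requests), indent=2))
```

With the official Python client, a body like this would be passed to `service.presentations().batchUpdate(presentationId=..., body=...).execute()`. In this sketch the hard part, going from a transcribed instruction to the right list of requests, is left out entirely.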
Challenges we ran into
Oh gosh! So many:
- The experience is not as fast as you would like it to be. I will work on this optimization; I want the slide creation to feel seamless.
- Translating slides into code was an interesting exploration: first it was just generated by code, then I tried RAG and intent, and then that got complicated. I spent a lot more time here than I would have thought.
- Lots of issues came up before slide creation, which is actually where the Gemini models made the difference: dealing with interruptions, having the screen moved, and losing the context. All terrible user experiences that happened even before slide creation.
Accomplishments that we're proud of
- The moment when I first asked the agent to edit a slide and create a red background, and it just popped up!
What we learned
- I want to optimize how I plan and execute projects like this one. I want to get better at setting context and then allowing the LLM to explore ways to achieve the best results, rather than starting with the prescription. I was genuinely impressed with the live voice agent.
What's next for SlideMakr
- Make it quicker
- Enable a connection to Google Drive so that you can edit existing presentations
- Improve the user experience: I don't really want users to be in a new app. It has to be seamless, so let's work where they are
- Continue making the outputs nicer and nicer, and capable of creating longer decks