Inspiration
This project grew out of the friction of working through multiple emails during production work. As developers, we wanted to spend as much of our time as possible coding. After learning about multi-modal inputs, we decided this was the project to build.
What it does
This is a pure JavaScript Chrome extension. The user enters a Google Gemini API key, and a popup then appears on either of the two supported platforms, Gmail or Outlook. The extension is multi-modal: it takes all available attachments into account to draft the best reply. It can also compose new emails. It is optimized specifically for writing emails, so the user does not need to write any prompts. It also has a summary/analysis tool tailored for professional use. Lastly, a translation tool makes communication between individuals even smoother.
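As a rough illustration of how a content script might decide whether to show the popup, here is a minimal sketch of platform detection by hostname. The host lists and the names `SUPPORTED_HOSTS` and `detectPlatform` are assumptions made for this example; the project description only states that Gmail and Outlook are supported, not how the extension recognizes them.

```javascript
// Hypothetical host lists: the write-up names Gmail and Outlook only,
// so the exact hostnames below are assumptions for illustration.
const SUPPORTED_HOSTS = {
  gmail: ["mail.google.com"],
  outlook: ["outlook.live.com", "outlook.office.com", "outlook.office365.com"],
};

// Returns "gmail", "outlook", or null when the page is not a supported mail client.
function detectPlatform(hostname) {
  const host = String(hostname).toLowerCase();
  for (const [platform, hosts] of Object.entries(SUPPORTED_HOSTS)) {
    if (hosts.includes(host)) return platform;
  }
  return null;
}
```

In a content script this could be called as `detectPlatform(window.location.hostname)`, showing the popup only when the result is not `null`.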
How we built it
The app was built with JavaScript, HTML, and CSS. The user can switch between Google's various Gemini models to test which one works best for their use case. It uses Gemini 1.5's ability to parse files, audio, and images to increase the amount of information in the context window.
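To make the multi-modal step concrete, here is a minimal sketch of how a request to the Gemini REST API's `generateContent` endpoint could combine a text prompt with attachments sent as base64 `inline_data` parts. This is not the extension's actual code: the function names `buildGeminiRequest` and `generateEmail`, the attachment shape (`mimeType`, `base64`), and the example model ID are assumptions for illustration.

```javascript
const GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models";

// Builds the URL and fetch options for a generateContent call.
// `attachments` is a hypothetical array of { mimeType, base64 } objects.
function buildGeminiRequest({ model, apiKey, prompt, attachments = [] }) {
  // The text prompt goes first, followed by one inline_data part per attachment.
  const parts = [{ text: prompt }];
  for (const file of attachments) {
    parts.push({ inline_data: { mime_type: file.mimeType, data: file.base64 } });
  }
  return {
    url: `${GEMINI_BASE_URL}/${encodeURIComponent(model)}:generateContent`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ contents: [{ parts }] }),
    },
  };
}

// Sends the request and returns the first text part of the first candidate.
async function generateEmail(request) {
  const response = await fetch(request.url, request.options);
  if (!response.ok) throw new Error(`Gemini request failed: ${response.status}`);
  const json = await response.json();
  return json.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Because the model ID is just a parameter, letting the user try different Gemini models comes down to passing a different `model` string, for example `"gemini-1.5-flash"`.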
Challenges we ran into
We ran into several UI/UX issues. We started with a design embedded directly in Gmail's compose bar, but eventually moved to an overlay-style extension, which made things much clearer for the user. We also had to iterate on our prompts to get the best results.
Accomplishments that we're proud of
We're extremely proud of the extension's multi-modal functionality. Giving the model more information gets the user better results, leading to more accurate and professional emails.
What we learned
- Gemini integration
- Chrome extension building
- Multimodal LLM usage
What's next for - A Professional Email Writer
There are still several improvements we could make. We believe our prompts could be refined further for even better accuracy. Another feature that could help is a suggestion tool that draws on the user's previous emails with a given recipient. By building on their past tone and diction, it could create more human, lifelike interactions.
Built With
- claude
- css
- gemini
- html
- javascript