Inspiration
I've long been fascinated by Siri: it isn't a full AI model, yet it has AI capabilities. I also wanted to replicate the experience of saying "Hey Jarvis" and having something happen on your laptop through voice alone.
What it does
Archer uses the Java Sound API to capture voice commands and shows a live transcription in its JavaFX GUI. Speech is transcribed to text with the Vosk API and cleaned up with NLP techniques. The resulting string is passed to a CommandProcessor layer, which uses the Gemini API to interpret the command and decide whether the user wants a local system operation. If not, Archer speaks the response with a FreeTTS voice and shows it as live transcription in the GUI. If so, Archer processes the command with the Gemini API in a separate ActionExecutor layer, which uses built-in Java libraries to carry out the system operation, and then speaks a confirmation that the operation has been completed.
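To make the branching above concrete, here is a minimal sketch of how the routing step could look. It is not taken from Archer's codebase: the interfaces IntentClassifier, ChatResponder and Speaker, the Intent enum, and the confirmation wording are all hypothetical stand-ins, and the Gemini, FreeTTS and system calls are replaced by simple lambdas so the example runs on its own.

```java
// Illustrative sketch only. Every type here is a hypothetical stand-in;
// none of them is Archer's real class or a real Gemini/FreeTTS API.
final class CommandRoutingSketch {

    /** The two outcomes described above: talk back, or act on the system. */
    enum Intent { CONVERSATION, SYSTEM_ACTION }

    /** Stand-in for the Gemini-backed step that interprets a command. */
    interface IntentClassifier { Intent classify(String command); }

    /** Stand-in for generating a spoken answer to a conversational command. */
    interface ChatResponder { String reply(String command); }

    /** Stand-in for the layer that performs the local system operation. */
    interface ActionExecutor { void execute(String command); }

    /** Stand-in for the text-to-speech voice plus GUI transcription. */
    interface Speaker { void say(String text); }

    static final class CommandProcessor {
        private final IntentClassifier classifier;
        private final ChatResponder responder;
        private final ActionExecutor executor;
        private final Speaker speaker;

        CommandProcessor(IntentClassifier classifier, ChatResponder responder,
                         ActionExecutor executor, Speaker speaker) {
            this.classifier = classifier;
            this.responder = responder;
            this.executor = executor;
            this.speaker = speaker;
        }

        /** Routes one transcribed command; blank transcripts are ignored. */
        void handle(String transcript) {
            if (transcript == null || transcript.trim().isEmpty()) {
                return;
            }
            String command = transcript.trim();
            if (classifier.classify(command) == Intent.SYSTEM_ACTION) {
                executor.execute(command);
                speaker.say("Operation completed."); // assumed confirmation text
            } else {
                speaker.say(responder.reply(command));
            }
        }
    }

    public static void main(String[] args) {
        // A keyword check stands in for the model call, purely for the demo.
        CommandProcessor processor = new CommandProcessor(
            command -> command.toLowerCase().startsWith("open ")
                ? Intent.SYSTEM_ACTION : Intent.CONVERSATION,
            command -> "You said: " + command,
            command -> System.out.println("[executing] " + command),
            text -> System.out.println("[speaking] " + text));

        processor.handle("open notepad");
        processor.handle("how are you");
    }
}
```

Keeping the classifier, executor and speaker behind small interfaces means each one can be swapped for a stub, which makes the routing logic testable without a microphone or network access.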
How we built it
I built it in Java, using the Gemini API together with FreeTTS, Vosk, and Java's built-in sound libraries; the external libraries are declared as dependencies in my Maven project. I used ChatGPT and Gemini to help translate my ideas into code, to learn the syntax of those libraries, and to understand what is going on under the hood.
Challenges we ran into
The project failed to compile many times, and getting the voice transcription to work reliably was a major issue. I also ran into Maven dependency problems repeatedly.
Accomplishments that we're proud of
I ended up with a mostly complete codebase that, despite its bugs, I can revisit in the future. I also finally carried a project through from start to finish. As a chronic procrastinator, I'm very proud that I went from nothing to mostly complete in around 14 hours.
What we learned
I learned a lot about API usage and integration in Java, how Maven projects work, event-driven operations in Java, and how to use LLMs effectively in software engineering. Most of all, I gained experience optimizing my code for low latency and speeding up concurrent operations with threads.
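As one illustration of the threading point, the sketch below hands each transcribed command to a dedicated worker thread so that a slow handler (for example a network call) does not block the thread that captures audio or updates the GUI. This is an assumed pattern, not Archer's actual implementation; the class name BackgroundCommandQueue and the thread name "command-worker" are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Illustrative sketch only; the class and thread names are hypothetical.
final class BackgroundCommandQueue implements AutoCloseable {

    // One daemon worker thread: commands run in submission order, off the caller's thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "command-worker");
        thread.setDaemon(true);
        return thread;
    });

    private final Function<String, String> handler;

    /** The handler stands in for the slow part, such as an API call. */
    BackgroundCommandQueue(Function<String, String> handler) {
        this.handler = handler;
    }

    /** Returns immediately; the result arrives later through the future. */
    CompletableFuture<String> submit(String transcript) {
        return CompletableFuture.supplyAsync(() -> handler.apply(transcript), worker);
    }

    /** Stops accepting new work; already queued commands still run. */
    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        try (BackgroundCommandQueue queue =
                 new BackgroundCommandQueue(text -> "handled: " + text)) {
            CompletableFuture<String> pending = queue.submit("open notepad");
            // The calling thread is free to keep listening here.
            System.out.println(pending.join());
        }
    }
}
```

A single worker keeps commands in order; a larger pool would trade that ordering for throughput, which may or may not suit a voice assistant.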
What's next for Archer
I plan to revisit it to fix bugs and perhaps move to a cleaner-looking UI, and I may package it as an .exe file for public download.
Due to time constraints and technical challenges, the project is not fully functional at this stage. However, the full codebase is available in the provided GitHub link, which demonstrates the architecture, API usage, and implementation plan.
As such, I don't have a video demo. Sorry.
