About the Project

Inspiration

One of the biggest drawbacks we noticed with traditional LLM-based systems is the need to constantly re-prompt, wait, and then check if the model understood the request correctly. This back-and-forth slows down creativity, especially when building visual elements like websites.

We wanted a solution where you could speak normally, without crafting perfect prompts, and immediately see the website take shape in real time. That inspired Ducky: a voice-powered assistant that lets you build and customize websites through natural conversation, with no delays and no friction.

What We Learned

How to integrate real-time speech recognition using the OpenAI API and natural language understanding with Google Gemini.
How to process messy, casual human speech into structured, reliable web-building commands.
How to create dynamic, real-time UI transformations in React, including advanced state management and live DOM updates.
How to coordinate a multi-service architecture across FastAPI, Express.js, and Next.js.
The importance of keeping backend services lightweight to support ultra-fast feedback loops.

How We Built It

Frontend: Built with React, Next.js, Tailwind CSS, and Shadcn UI, using the Web Speech API and OpenAI API for capturing and transcribing voice commands instantly.
Backend NLP: Developed a FastAPI server that uses Google Gemini API to analyze natural language and generate structured design instructions.
Real-Time Server: Created a Node.js Express server to manage live sessions, while Next.js server functions helped efficiently fetch and relay API data.
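
As a rough illustration of what a "structured design instruction" could look like when it arrives from the NLP backend, here is a minimal TypeScript sketch. The instruction shape (target, property, value), the allowed property list, and the parseInstruction helper are all assumptions made for this example; the write-up does not describe Ducky's actual schema. The idea is simply that model output is treated as untrusted text and checked before it reaches the UI.

```typescript
// Hypothetical instruction shape; the real Ducky schema is not given in this write-up.
type StyleProperty = "color" | "backgroundColor" | "fontSize" | "padding" | "textAlign";

interface DesignInstruction {
  target: string; // e.g. "header" or "footer" (illustrative names)
  property: StyleProperty;
  value: string;
}

const ALLOWED_PROPERTIES: readonly string[] = [
  "color",
  "backgroundColor",
  "fontSize",
  "padding",
  "textAlign",
];

function isStyleProperty(x: unknown): x is StyleProperty {
  return typeof x === "string" && ALLOWED_PROPERTIES.indexOf(x) !== -1;
}

// Parse the raw text returned by the language model.
// Returns null for anything that is not a well-formed instruction.
function parseInstruction(raw: string): DesignInstruction | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const rec = data as Record<string, unknown>;
  if (typeof rec.target !== "string" || rec.target.length === 0) return null;
  if (!isStyleProperty(rec.property)) return null;
  if (typeof rec.value !== "string") return null;

  return { target: rec.target, property: rec.property, value: rec.value };
}
```

Rejecting malformed output at this boundary means a garbled model response is dropped rather than partially applied to the page.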

At the core of Ducky is the Ducky Engine, a custom-built system that listens for incoming design instructions, interprets user intent, and updates the React DOM in real time. The engine dynamically adjusts page state across a wide range of styling attributes (layout, colors, spacing, typography, alignment, and more) while applying smooth animations. As users speak, the site evolves live before their eyes, offering a natural, hands-free website creation experience.
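
One way to picture the "adjusts page state" step is as a pure reducer that folds each instruction into an immutable style map. The sketch below is an assumption-laden illustration, not the Ducky Engine itself: PageState, StyleInstruction, and applyInstruction are hypothetical names, and animations are left out.

```typescript
// Hypothetical state model: element name -> CSS-like property -> value.
type StyleMap = Readonly<Record<string, string>>;
type PageState = Readonly<Record<string, StyleMap>>;

interface StyleInstruction {
  target: string;
  property: string;
  value: string;
}

// Pure reducer: returns a new state object when something changes,
// and the same reference when the instruction is a no-op.
function applyInstruction(state: PageState, ins: StyleInstruction): PageState {
  const current: StyleMap = state[ins.target] ?? {};
  if (current[ins.property] === ins.value) {
    return state; // nothing changed, so keep the reference
  }
  return {
    ...state,
    [ins.target]: { ...current, [ins.property]: ins.value },
  };
}
```

A function with this (state, action) shape can be passed to React's useReducer. Returning the unchanged reference for no-op instructions lets React skip the re-render, which matters when instructions arrive continuously while the user is speaking.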

Challenges

Understanding messy speech: Real users speak casually with filler words and unclear references. Making sure Ducky could still correctly understand and execute commands required strong NLP tuning.
Maintaining real-time performance: Updates had to happen instantly with no noticeable lag, which meant optimizing React rendering and backend response times.
Balancing customization and simplicity: We had to design Ducky to be powerful enough for detailed website building, but simple enough for anyone to use without memorizing strict commands.
Coordinating multiple servers: Keeping the FastAPI, Express, and Next.js services in sync while handling live voice streams was a major architectural challenge.
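
To make the "messy speech" challenge concrete, here is a small TypeScript sketch of a pre-processing pass that strips filler words and resolves a vague reference such as "it" to the most recently edited element. This is a simple heuristic written for illustration only. The filler list, the KNOWN_TARGETS names, and both helper functions are assumptions, and the project describes relying on NLP tuning with Gemini rather than on rules like these.

```typescript
// Hypothetical filler list; kept deliberately small to avoid removing real words.
const FILLERS = new Set(["um", "uh", "erm", "hmm"]);

// Lowercase the transcript, drop punctuation, and remove filler words.
function cleanTranscript(transcript: string): string {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9#'\s]/g, " ") // keep letters, digits, '#' and apostrophes
    .split(/\s+/)
    .filter((w) => w.length > 0 && !FILLERS.has(w))
    .join(" ");
}

// Hypothetical element names a command might refer to.
const KNOWN_TARGETS = ["header", "footer", "button", "title"];

// Pick an explicit target if one is named; otherwise fall back to the
// last edited element when the user says "it" or "that".
function resolveTarget(cleaned: string, lastTarget: string | null): string | null {
  const words = cleaned.split(" ");
  for (const w of words) {
    if (KNOWN_TARGETS.indexOf(w) !== -1) return w;
  }
  if (words.indexOf("it") !== -1 || words.indexOf("that") !== -1) {
    return lastTarget;
  }
  return null;
}
```

For example, cleanTranscript("Um, make the, uh, header blue!") yields "make the header blue", and a follow-up of "make it bigger" resolves to "header" when that was the last element edited.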

Built With

express.js
fastapi
geminiapi
next.js
node.js
openai
react.js

Submitted to

MorganHacks 2025
- Winner MLH: Best Use of Gemini API

Created by

I worked on the backend, building the quack engine logic and optimizing the system prompt for fast response times.

Claudesaul Belizaire
I worked on the real-time voice streaming, focusing on converting arbitrary requests into structured data for use in component rendering.

Blake Baker

Updates

Claudesaul Belizaire started this project — Apr 27, 2025 11:02 AM EDT
