About the Project
Inspiration
One of the biggest drawbacks we noticed with traditional LLM-based systems is the need to constantly re-prompt, wait, and then check whether the model understood the request. This back-and-forth slows down creative work, especially when building something visual like a website.
We wanted a solution where you could speak normally, without crafting perfect prompts, and immediately see the website take shape in real time. That inspired Ducky: a voice-powered assistant that lets you build and customize websites through natural conversation, without the delays and friction of re-prompting.
What We Learned
- How to integrate real-time speech recognition through the OpenAI API and natural language understanding through Google Gemini.
- How to process messy, casual human speech into structured, reliable web-building commands.
- How to create dynamic, real-time UI transformations in React, including advanced state management and live DOM updates.
- How to coordinate a multi-service architecture across FastAPI, Express.js, and Next.js.
- The importance of keeping backend services lightweight to support ultra-fast feedback loops.
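As a rough illustration of the "messy speech to structured commands" point above, the first step could be a small transcript normalizer. This is only a sketch under stated assumptions: the filler list and the `normalize_transcript` name are hypothetical, not taken from Ducky's codebase, and a real system would tune both against actual transcripts.

```python
import re

# Hypothetical filler list; phrases such as "kind of" can carry meaning in some
# sentences, so a production list would need tuning against real transcripts.
FILLERS = ("um", "uh", "erm", "you know", "i mean", "kind of", "sort of")

# Match any filler as a whole word or phrase, longest alternatives first.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in sorted(FILLERS, key=len, reverse=True)) + r")\b",
    flags=re.IGNORECASE,
)


def normalize_transcript(text: str) -> str:
    """Lowercase the text, drop filler phrases, and collapse punctuation and spaces."""
    cleaned = _FILLER_RE.sub(" ", text.lower())
    # Keep word characters plus '#', '%' and '-' so values like "#fff" or "50%" survive.
    cleaned = re.sub(r"[^\w\s#%-]", " ", cleaned)
    return re.sub(r"\s+", " ", cleaned).strip()
```

For example, `normalize_transcript("Um, make the header, uh, kind of blue!")` yields `"make the header blue"`. The cleanup is deliberately crude: decimal values such as "2.5rem" would lose their dot and need extra handling.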
How We Built It
- Frontend: Built with React, Next.js, Tailwind CSS, and Shadcn UI, using the Web Speech API and the OpenAI API to capture and transcribe voice commands as they are spoken.
- Backend NLP: Developed a FastAPI server that uses the Google Gemini API to analyze natural language and generate structured design instructions.
- Real-Time Server: Created a Node.js Express server to manage live sessions, while Next.js server functions helped efficiently fetch and relay API data.
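To make "structured design instructions" more concrete, here is a minimal sketch of how a backend might validate a model's JSON reply before forwarding it. The field names (`target`, `property`, `value`), the allowed-property set, and the `DesignInstruction` type are all assumptions for illustration; the project does not publish its actual schema, and no Gemini call is shown.

```python
import json
from dataclasses import dataclass

# Hypothetical vocabulary of supported style properties (an assumption).
ALLOWED_PROPERTIES = {"color", "background", "padding", "margin", "font_size", "text_align"}


@dataclass(frozen=True)
class DesignInstruction:
    target: str  # element the user referred to, e.g. "header"
    prop: str    # one of ALLOWED_PROPERTIES
    value: str   # CSS-like value, e.g. "blue" or "2rem"


def parse_instruction(raw_reply: str) -> DesignInstruction:
    """Turn a model's JSON reply into a validated instruction, or raise ValueError."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = {"target", "property", "value"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    prop = data["property"]
    if not isinstance(prop, str) or prop not in ALLOWED_PROPERTIES:
        raise ValueError(f"unsupported property: {prop!r}")
    return DesignInstruction(str(data["target"]), prop, str(data["value"]))
```

Rejecting malformed or unsupported replies at this boundary keeps the downstream services from having to guess what the model meant.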
At the core of Ducky is the Ducky Engine, a custom-built system that listens for incoming design instructions, interprets user intent, and updates the React DOM in real time. The engine dynamically adjusts page state across a wide range of styling attributes, including layout, colors, spacing, typography, and alignment, while applying smooth animations. As users speak, the site evolves live before their eyes, offering a natural, hands-free website creation experience.
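The state-update idea behind the engine can be sketched as a pure function that takes the current page state and one instruction and returns a new state. This is not the Ducky Engine itself, which runs in React; it is a language-neutral illustration written in Python, and the `PageState` shape and function names are assumptions.

```python
from typing import Dict, Iterable, Tuple

# Assumed shape: element name -> style property -> value.
PageState = Dict[str, Dict[str, str]]


def apply_instruction(state: PageState, target: str, prop: str, value: str) -> PageState:
    """Return a new page state with one style changed; the input is left untouched."""
    next_state = {name: dict(styles) for name, styles in state.items()}
    next_state.setdefault(target, {})[prop] = value
    return next_state


def replay(state: PageState, instructions: Iterable[Tuple[str, str, str]]) -> PageState:
    """Apply (target, prop, value) instructions in the order they were spoken."""
    for target, prop, value in instructions:
        state = apply_instruction(state, target, prop, value)
    return state
```

Returning a fresh object instead of mutating in place mirrors how React state updates work: React detects a change by comparing references, so an in-place mutation would not trigger a re-render.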
Challenges
- Understanding messy speech: Real users speak casually, with filler words and unclear references. Getting Ducky to still understand and execute commands correctly required substantial NLP tuning.
- Maintaining real-time performance: Updates had to happen instantly with no noticeable lag, which meant optimizing React rendering and backend response times.
- Balancing customization and simplicity: We had to design Ducky to be powerful enough for detailed website building, but simple enough for anyone to use without memorizing strict commands.
- Coordinating multiple servers: Keeping the FastAPI, Express, and Next.js services in sync while handling live voice streams was a major architectural challenge.
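One simple way to approach the "unclear references" problem from the first challenge is to remember the last element the user named and substitute it when a later command says only "it" or "that". The sketch below is an assumption about how this could be handled, not a description of Ducky's implementation; the `ReferenceResolver` class and its pronoun list are hypothetical.

```python
from typing import Optional

# Hypothetical set of words treated as references to a previously named element.
PRONOUNS = {"it", "that", "this", "them"}


class ReferenceResolver:
    """Remembers the most recently named element so later pronouns can be resolved."""

    def __init__(self) -> None:
        self._last_target: Optional[str] = None

    def resolve(self, target: str) -> Optional[str]:
        key = target.strip().lower()
        if key in PRONOUNS:
            # None signals that nothing has been named yet and the user should clarify.
            return self._last_target
        self._last_target = key
        return key
```

With this in place, "make the header blue" followed by "make it bigger" resolves "it" to "header", while a pronoun with no prior context returns `None` so the caller can ask for clarification.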
Built With
- express.js
- fastapi
- geminiapi
- next.js
- node.js
- openai
- react.js