Inspiration

Cooking as a college student without a family by your side to help you can be a lonely and quite a hassle. We lack the instant, hands-free guidance that a family member - like a gentle grandma (lola in Tagalog!) - provides. It's a huge hassle to stop stirring to look up the next step on a phone. We wanted to close that gap and bring that warm, immediate support into the kitchen.

What it does

cookingLola is a fully conversational, multi-modal voice agent that acts as your hands-free kitchen assistant.

-Listens & Adapts: User speaks commands (eg. "Give me a cookie recipe" or "make it spicier") and Lola will provide the recipe and update it instantly based on the conversation's context.

-Sees: Lola is able to interpret images. Show her an ingredient and she will identify it and tell you what it is and how to use it.

-Speaks: Lola responds in the sweet, comforting "Grandma" voice.

How we built it

Front end: We used tailwind and Next.js for the UI/UX.

Back end: We used fish-audio to convert user's speech to text which is then sent to Claude. Claude will then generate a recipe based on user's command and fish-audio will read out the recipe to the user using text-to-speech.

Challenges we ran into

  • We ran into problems with integrating Claude with Fish Audio.
  • Making the conversation flow "naturally"

Accomplishments that we're proud of

-Successfully implemented a complex three-step, dual-API voice pipeline. -Created a high-fidelity conversational agent capable of handling dynamic user intent.

What we learned

We learned how to implement 2 AI models in one program.

What's next for cookingLola

We want to make a mobile version of cookingLola, implement different voice options, and make it completely hands-free without the need of a button.

Built With

  • anthropic
  • claude
  • fish-audio
  • nextjs
  • python
  • tailwind
Share this project:

Updates