Doable helps you move forward with your tasks, even when you are overwhelmed. Inspired by the concept of body doubling, Doable is a personalised AI conversational agent that acts as your Voice Body Double, helping you start, work on, and finish whatever tasks you set for yourself.


Why we built it

In a world where everybody is overwhelmed (76% of employees have struggled at least once, with 28% reporting that they feel burned out “very often”; 5% of the world is neurodiverse; 98% of new parents are sleep-deprived in the first 3 years of their baby’s life), it’s time to combine the early successes of personalised digital coaching (improving employee focus and productivity by up to 22%, with 60% higher employee satisfaction thanks to personalised experiences) with the well-known productivity technique of body doubling.

Doable provides an AI Voice Body Double whose personality is balanced against your own, helping you initiate the task, discuss the best way to start, stay focused, and finish as planned, while staying supportive and knowledgeable about the limitations of executive function (the ability to plan, start, work on, and finish tasks). You can use Doable on any kind of task, even when you are away from your keyboard.

We expect Doable to contribute to the 35% increase in workplace productivity that structured support can bring for neurodiverse individuals, and to the 20% increase in task completion attributed to AI-assisted interventions.

More importantly, we expect it to create a sense of control, helping individuals build resilience and self-efficacy (the belief in their ability to succeed), which is critical for people experiencing burnout and for neurodiverse individuals.


How we built it

NB! Please note that we haven’t finished integrating the timer into the “cool” interface, so you’ll see two interfaces in our demo video.

Doable is an AI-built web app that aims to meaningfully leverage the full sponsorship stack of this hackathon (Lovable, ElevenLabs, Fal.ai, Make.com) with a minimal number of human-written lines of code:

  • ElevenLabs Conversational AI agent provides a tailored voice companion depending on the user’s personality and the type of task chosen: a) configuration and overrides (voice ID, first message, prompts, context) are retrieved from Supabase; b) any new session context is stored back to Supabase; c) tools and the knowledge base are configured on the agent via the ElevenLabs interface.
  • Lovable.dev-fuelled front-end, with Supabase natively integrated as the back-end data store.
  • Make.com is used to GET AI prompts (describing both the feel and colour scheme of the images, including examples) and to generate videos in Fal.ai for each of the eight personality types. Generated video URLs are POSTed back to Supabase to be visualised by the front-end.
  • Fal.ai covers the video and audio generation features, picking up inputs from Supabase and returning URLs for the front-end.
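As an illustration of step (a) above, here is a minimal TypeScript sketch of how per-user agent overrides might be assembled from a Supabase row before starting an ElevenLabs session. The table and field names (`agent_configs`, `voice_id`, `first_message`, `system_prompt`) and the overrides shape are hypothetical, not the project's actual schema or the exact SDK types.

```typescript
// Hypothetical shape of a row in a Supabase `agent_configs` table.
interface AgentConfigRow {
  voice_id: string;
  first_message: string;
  system_prompt: string;
  context: Record<string, string>;
}

// Overrides passed when starting a conversational session (modelled loosely
// on ElevenLabs' overrides concept; not the exact SDK type).
interface SessionOverrides {
  tts: { voiceId: string };
  agent: { firstMessage: string; prompt: { prompt: string } };
  dynamicVariables: Record<string, string>;
}

// Pure mapping from a stored config row to session overrides.
function buildOverrides(row: AgentConfigRow, userName: string): SessionOverrides {
  return {
    tts: { voiceId: row.voice_id },
    agent: {
      // Personalise the first message with the user's name.
      firstMessage: row.first_message.replace("{name}", userName),
      prompt: { prompt: row.system_prompt },
    },
    dynamicVariables: { ...row.context, user_name: userName },
  };
}

const row: AgentConfigRow = {
  voice_id: "voice-123",
  first_message: "Hi {name}, ready to start?",
  system_prompt: "You are a supportive body double.",
  context: { task_type: "writing" },
};

const overrides = buildOverrides(row, "Anya");
console.log(overrides.agent.firstMessage); // "Hi Anya, ready to start?"
```

Keeping this mapping pure (row in, overrides out) keeps the Supabase fetch and the session start independently testable.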

Architecture diagram:

As users open the app, we profile them in a non-cognitively-demanding way with videos (1, 2), clarify their task, assign a matching Voice Double personality (3), and generate a productivity-supporting soundscape (4), with the Voice Double supporting the person throughout the session (5, 6).
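The profiling step above can be sketched as a mapping from the user's video choices to one of the eight personalities. The personality names and the scoring scheme below are invented for illustration; the real app's assignment logic may differ.

```typescript
// Illustrative only: eight hypothetical Voice Double personalities.
const PERSONALITIES = [
  "Calm Mentor", "Energetic Coach", "Quiet Companion", "Playful Friend",
  "Focused Strategist", "Warm Encourager", "Pragmatic Planner", "Curious Explorer",
] as const;

type Personality = (typeof PERSONALITIES)[number];

// Three binary video choices (0 or 1 each) index into the eight
// personalities, keeping the profiling step non-cognitively demanding.
function assignPersonality(choices: [number, number, number]): Personality {
  const index = choices[0] * 4 + choices[1] * 2 + choices[2];
  return PERSONALITIES[index];
}

console.log(assignPersonality([0, 0, 0])); // "Calm Mentor"
console.log(assignPersonality([1, 1, 1])); // "Curious Explorer"
```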


Accomplishments that we're proud of

Our MVP goal was to profile the user and then provide a conversational experience with a default configuration, yet thanks to leveraging the stack above, we progressed much further:

  • personalised voices and configuration of the AI conversational agent, including tools and client-side interactions.

While none of us had worked with ElevenLabs before, it was easy to grasp the main concepts and prototype away! Our favourite part of Conversational AI is how easily an agent can be tailored with overrides of first messages and tuned prompts, and how powerful variables are. Developing client-side tools was a blast too, as it was a unique mix of coding and prompting simultaneously. We saw first-hand how agentic AI shines when a situation calls for thinking outside the box (sadly, the test where Mark, our agent, proposed sending a letter to the organisers requesting an extension of the submission review timeline wasn't recorded).
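A minimal sketch of what that coding-and-prompting mix looks like on the client side: the agent requests a tool by name with JSON parameters, and the client runs local code and returns a string result for the agent to narrate. The tool name (`start_timer`) and the dispatch shape are illustrative, not the real ElevenLabs SDK interface.

```typescript
// A client tool takes JSON-ish parameters and returns a string result.
type ClientTool = (params: Record<string, unknown>) => string;

const clientTools: Record<string, ClientTool> = {
  // Hypothetical tool: start a focus timer for the session.
  start_timer: (params) => {
    const minutes = Number(params.minutes ?? 25);
    // In the real app this would start a visible countdown in the UI.
    return `Timer started for ${minutes} minutes.`;
  },
};

// Dispatch a tool call coming from the agent to local client code.
function dispatchTool(name: string, params: Record<string, unknown>): string {
  const tool = clientTools[name];
  if (!tool) return `Unknown tool: ${name}`;
  return tool(params);
}

console.log(dispatchTool("start_timer", { minutes: 15 })); // "Timer started for 15 minutes."
```

The "prompting" half is teaching the agent when to call `start_timer`; the "coding" half is the handler body above.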

  • videos are used for personality profiling instead of images, providing a more dynamic experience that taps into “newness” to support executive function while remaining non-cognitively demanding.

With Fal.ai, we were able to test several models and approaches to video generation, which enabled quick iteration on the prompts and the overall feel of the app; it was truly rapid prototyping.

  • a productivity-supporting soundscape augments the user experience, tailored to the user’s personality and the type of task (not yet integrated into the main branch due to time constraints).
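Since (as noted in the challenges section) we reused the visual generative prompts for audio, the soundscape prompt can be sketched as a simple derivation from the personality's visual prompt plus the task type. The template wording here is invented for illustration.

```typescript
// Sketch: derive an audio-generation prompt from the personality's visual
// prompt and the task type. Wording is illustrative, not the actual prompt.
function soundscapePrompt(visualPrompt: string, taskType: string): string {
  return (
    `Ambient soundscape matching this mood: ${visualPrompt}. ` +
    `Gentle, loopable, no vocals, suited to sustained ${taskType} work.`
  );
}

const prompt = soundscapePrompt("soft dawn colours, slow drifting clouds", "writing");
console.log(prompt);
```

Reusing the visual prompt this way keeps the video and audio moods consistent without maintaining a second prompt library.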

What's next for Doable

We’ve consciously limited ourselves in some project areas to deliver an end-to-end solution by the end of the hackathon, yet our vision for Doable is much bigger: a supportive performance coach suitable for all types of executive dysfunction:

  • nuanced profiling based on more inputs: emotion, weather and user surroundings to produce one-of-a-kind soundscapes and agent personalities;
  • a feedback loop (if the user agrees) using screen pointer movements, keyboard strokes, etc. as an input to assess the user’s state;
  • more agent personalities including historical figures, with users favouriting their best body doubles to reuse;
  • robust analysis of what worked for the user and regular summaries to promote greater self-awareness (Posthog), fed back to the user to promote new ways of thinking;
  • a group-session offering alongside one-on-one virtual body double support.

Challenges we dealt with

The biggest challenge for us was integrating the different parts of the architecture. While integrating Lovable with Supabase was nearly seamless (though we stumbled on RLS policies on the Supabase side), we spent a considerable amount of time integrating ElevenLabs with Lovable, and Make.com with Fal.ai.

As we had set ourselves the goal of creating an end-to-end app without writing much code, integrating Lovable with ElevenLabs took two cycles of iteration: after the first attempt stumbled, Anya developed a primer in code to verify the logic, while Dmitry re-prompted Lovable to get a refined foundation for the integration, which we then completed, though only partially. Our lesson learned here is to focus Lovable on getting the bare bones of the app out ASAP (that was true magic, and we got there so fast!), then gradually take more control when a custom integration is nuanced and requires a lot of scenario-based back-and-forth between different APIs.

Similarly, when integrating Make.com and Fal.ai, Alexey bumped into discrepancies in the documentation and undocumented API behaviour, especially when dealing with sound generation. We employed Python code integrations to get the sounds out and unblock the path, though it would have been cleaner to use the dedicated blocks.
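The shape of that workaround, sketched here in TypeScript for consistency with the rest of this writeup (the actual workaround was in Python): build the sound-generation request ourselves and POST it directly instead of relying on a Make.com block. The endpoint URL and payload fields are placeholders, not Fal.ai's documented API.

```typescript
// Placeholder request shape; not Fal.ai's documented payload.
interface SoundRequest {
  prompt: string;
  duration_seconds: number;
}

function buildSoundRequest(prompt: string, durationSeconds: number): SoundRequest {
  // Clamp duration to a sane range so a bad input can't request hours of audio.
  const clamped = Math.min(Math.max(durationSeconds, 5), 300);
  return { prompt, duration_seconds: clamped };
}

// Not invoked here: the network call itself, with a placeholder URL and auth.
async function generateSound(req: SoundRequest, apiKey: string): Promise<string> {
  const res = await fetch("https://example.invalid/sound-generation", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Key ${apiKey}` },
    body: JSON.stringify(req),
  });
  const data = (await res.json()) as { audio_url: string };
  return data.audio_url;
}

const req = buildSoundRequest("rain on leaves", 1000);
console.log(req.duration_seconds); // 300
```

Separating payload construction from the network call made the discrepancies between documentation and observed behaviour easier to isolate.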

Another group of challenges was related to moving the project forward while still iterating on our understanding of it, keeping an eye on the end-to-end setup: we manually mapped the voices to the personalities, reused the visual AI generative prompts for audio generation, and didn’t spend any time optimising what goes into prompts versus the knowledge base. **Our lesson learned here is to re-group often and stay on the critical path to delivering the minimal viable experience.**

Overall, we had great fun with the tech (sadly, we didn’t get a chance to integrate PostHog!) and with seeing our idea come to fruition. And remember: it’s definitely doable with Doable!
