Introduction

Meet Lumi, the groundbreaking AI companion that's about to revolutionize the way you interact with technology. Lumi doesn't just respond to your queries - it anticipates your needs, learns your preferences, and evolves alongside you. Whether you're scheduling meetings, brainstorming ideas, or simply need a witty conversation partner, Lumi is always one step ahead. It's not just an assistant; it's your personal gateway to a world of limitless possibilities.

Inspiration

Lumi is designed to help individuals, professionals, and businesses streamline workflows, automate repetitive tasks, and gain actionable insights without requiring deep technical expertise. Whether it’s a developer running scripts on the go, a financial analyst tracking stock trends in real time, or a content creator summarizing YouTube videos, Lumi acts as a reliable assistant that enhances productivity and decision-making.

What it does

Lumi is a voice-first AI assistant that processes natural language commands to perform real-world tasks including:

  • Generating video notes from YouTube URLs 🎥 - Helping students, researchers, and professionals extract key insights from long-form video content without manual note-taking.
  • Executing Python scripts & creating files 🐍 - Assisting developers and engineers in automating workflows and testing code without switching between multiple tools.
  • Drafting professional emails/LinkedIn posts ✉️ - Enabling professionals to create well-structured emails and posts effortlessly, enhancing communication and personal branding.
  • Real-time stock analysis with interactive charts 📈 - Providing financial analysts, traders, and investors with up-to-date stock insights for better market decisions.
  • Web searches with visual results 🔍 - Enhancing research capabilities by fetching structured and enriched search results in real time.
  • Image generation from text prompts 🖼 - Supporting content creators, marketers, and designers in generating high-quality visuals instantly.

How we built it

Core Stack:

  • Python + Chainlit (Voice UI)
  • Speechmatics Flow (Real-time audio processing)
  • SQLite (Session/transcript storage)
  • LangChain (AI orchestration)
  • Modular tool system (Stock/YouTube/Email tools as independent modules)
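
To make the SQLite session/transcript storage concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table name, column names, and the helper functions `open_store`, `add_turn`, and `get_transcript` are assumptions made for illustration; they are not taken from Lumi's codebase.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the transcript database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS transcripts (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               role TEXT NOT NULL,
               text TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def add_turn(conn, session_id, role, text):
    """Append one utterance (user or assistant) to a session's transcript."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO transcripts (session_id, role, text) VALUES (?, ?, ?)",
            (session_id, role, text),
        )


def get_transcript(conn, session_id):
    """Return the session's turns as (role, text) tuples in insertion order."""
    rows = conn.execute(
        "SELECT role, text FROM transcripts WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()
```

Keying every row by `session_id` lets several voice sessions share one database file while keeping their transcripts separate.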

APIs Used:

  • Speechmatics API – For real-time speech-to-text processing.
  • Tavily API – For executing internet searches and retrieving web-based content.
  • Groq API – Used for optimized AI model inference.
  • OpenAI API – For DALL-E Image Generation.

Challenges we ran into

  1. Real-time Audio Sync - Keeping playback nearly jitter-free while ASR results were still streaming in.
  2. Tool Collision - Multiple async tools modifying session state at the same time.
  3. Parallel Processes - Deciding which processes had to run synchronously and which could run in parallel.
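
One common way to handle the tool-collision problem is to guard shared session state with an `asyncio.Lock`, so each read-modify-write completes before the next tool touches the same key. The sketch below illustrates that idea only; the `SessionState` class and `run_tools` helper are hypothetical names, not Lumi's actual implementation.

```python
import asyncio


class SessionState:
    """Shared per-session state that serializes concurrent updates."""

    def __init__(self):
        self._data = {}
        self._lock = asyncio.Lock()

    async def update(self, key, fn, default=None):
        """Apply fn to the current value of key while holding the lock."""
        async with self._lock:
            current = self._data.get(key, default)
            # Stand-in for awaited work inside the update; without the lock,
            # another tool could read the same stale value at this point.
            await asyncio.sleep(0)
            self._data[key] = fn(current)
            return self._data[key]

    async def get(self, key, default=None):
        async with self._lock:
            return self._data.get(key, default)


async def run_tools(n=50):
    """Simulate n tools incrementing the same counter concurrently."""
    state = SessionState()
    await asyncio.gather(
        *(state.update("calls", lambda v: v + 1, 0) for _ in range(n))
    )
    return await state.get("calls")
```

Calling `asyncio.run(run_tools(50))` returns 50 because every increment sees the result of the previous one. Work that does not touch shared state can stay outside the lock and run in parallel.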

Accomplishments that we're proud of

  • Implemented real-time voice interaction. This was a significant achievement as it required synchronizing voice input with AI-generated responses in a way that felt seamless and natural to users. Ensuring low latency and high accuracy in speech recognition was a major technical challenge, and overcoming it allowed us to create a truly interactive experience.

  • Integrated an array of tools. Successfully integrating diverse tools while maintaining a smooth user experience showcased our ability to design a scalable and extensible system.

  • Designed a user-friendly interface with Chainlit. We prioritized ease of use, ensuring that users could interact with Lumi naturally. By leveraging Chainlit, we crafted an intuitive interface that allows users to seamlessly transition between voice and text inputs, making AI more accessible and efficient.

  • Developed a modular architecture for easy expansion. One of the core principles behind Lumi was flexibility. By structuring it with independent modules for different functionalities, we made it easy to add new features in the future without disrupting the existing workflow. This modularity ensures long-term scalability and maintainability.
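
As an illustration of the modular design, the sketch below registers each tool under a name and dispatches requests through a single lookup table, so adding a tool means adding one decorated function. It is plain Python rather than LangChain's own tool API, and the `tool` decorator, `TOOLS` registry, and placeholder handlers are assumptions for the example.

```python
from typing import Callable, Dict

# Registry mapping a tool name to its handler function.
TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Decorator that registers a handler under the given tool name."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        if name in TOOLS:
            raise ValueError(f"tool {name!r} already registered")
        TOOLS[name] = fn
        return fn
    return decorator


@tool("email")
def draft_email(prompt: str) -> str:
    # Placeholder body; a real module would call the language model here.
    return f"[email draft for: {prompt}]"


@tool("stock")
def stock_summary(prompt: str) -> str:
    # Placeholder body; a real module would fetch market data here.
    return f"[stock summary for: {prompt}]"


def dispatch(name: str, prompt: str) -> str:
    """Route a request to the named tool, failing clearly if it is unknown."""
    try:
        handler = TOOLS[name]
    except KeyError:
        raise KeyError(f"unknown tool: {name}") from None
    return handler(prompt)
```

Because `dispatch` only knows about the registry, a new tool module can be added without editing any existing handler.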

How it can impact others

Lumi has the potential to transform various domains by providing AI-driven automation tailored to different user needs:

  • For professionals – Lumi can assist in drafting high-quality emails, scheduling tasks, and providing real-time stock market insights, saving time and effort.
  • For developers – By enabling script execution via voice commands, Lumi streamlines debugging and automation workflows, reducing repetitive tasks.
  • For students and researchers – The ability to generate structured notes from YouTube videos and search results can enhance learning and information retrieval.
  • For content creators and designers – With AI-driven image generation and content drafting, creators can produce high-quality visuals and social media posts effortlessly.
  • For businesses – Companies can leverage Lumi for real-time data retrieval, analytics, and task automation, improving operational efficiency.

By integrating AI into everyday workflows, Lumi enhances accessibility and efficiency, making sophisticated AI capabilities more practical for a broader audience.

What we learned

Throughout the development of Lumi, we gained valuable insights into optimizing real-time audio synchronization, managing multiple asynchronous operations efficiently, and structuring AI-driven interactions more effectively. We also refined our ability to design modular architectures that allow for easy expansion and seamless integration of various tools. With more time, we would have improved how we handle audio chunks and given the UI/UX a more structured response format.

What's next for Lumi - Personal AI Assistant

  • Enhanced Security: JWT validation for tool access, browser tool sandboxing via WASM.
  • Advanced Features: Multi-modal input (voice + text + images), team collaboration features.
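
Since JWT validation for tool access is still planned, the following is only a sketch of what such a check could look like, using the standard library and the HS256 algorithm. The `tools` claim, the function names, and the use of a shared secret are all assumptions; a production setup would more likely rely on a maintained JWT library and the identity provider's signing keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(part: str) -> bytes:
    # JWTs strip base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Create a signed token (used here only to exercise the verifier)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, required_tool: str, now=None) -> dict:
    """Return the claims if the token is valid and grants required_tool."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        sig = _b64url_decode(sig_b64)
    except (ValueError, TypeError):
        raise PermissionError("malformed token") from None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")
    if header.get("alg") != "HS256":
        raise PermissionError("unexpected signing algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        raise PermissionError("bad signature")
    current = time.time() if now is None else now
    if not isinstance(claims.get("exp"), (int, float)) or claims["exp"] <= current:
        raise PermissionError("token expired or missing exp")
    # "tools" is a hypothetical custom claim listing the tools a user may call.
    if required_tool not in claims.get("tools", []):
        raise PermissionError(f"token does not grant {required_tool!r}")
    return claims
```

The verifier pins the algorithm to HS256 and checks the signature before trusting any claim, so a tampered payload is rejected before the expiry or tool checks run.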

Built With

  • auth0
  • chainlit
  • dalle
  • langchain
  • openai
  • plotly
  • python
  • speechmatics
  • tavilyapi
  • yfinance