Inspiration

People with disabilities and the elderly can face significant challenges in performing everyday tasks independently. Many rely heavily on caregivers, which can be limiting and stressful. Caregivers can also be hard to find, as demand often far exceeds supply, leaving many without reliable support. There are countless stories of someone falling or having some other mishap and being stuck for hours with no way to call for help. This inspired us to create a supportive robot friend that empowers users to regain independence through easy-to-use technology.

What it does

CareBot provides voice-controlled robotic assistance with daily tasks such as cooking, laundry, and, in our simulated demonstration, cleaning up. By combining real-time voice transcription with Sonnet's language understanding and precise robot arm control, CareBot enables custom object manipulation without requiring specific keywords or commands.

How we built it

We combined voice transcription with a two-step hierarchical LLM pipeline to control a 7-DOF robot arm in a custom-built simulation environment. We used React Native for the speech-to-text UI, a prompt-engineered Sonnet analyzer and a Sonnet planner for our LLM pipeline, and a Python-based MCP (Model Context Protocol) server for robot commands. The result is an intuitive system that listens, understands, and executes tasks efficiently.
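The two-step flow described above (transcript, then analyzer, then planner, then robot commands) can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the function names, the `RobotCommand` type, and the one-command-per-line text format are all assumptions made for the example, and the LLM calls are replaced by plain stub functions so the sketch runs without any API access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Any text-in, text-out callable; a real system would wrap an LLM API call here.
LLMCall = Callable[[str], str]


@dataclass
class RobotCommand:
    """Hypothetical command format: a name plus numeric keyword arguments."""
    name: str
    args: Dict[str, float] = field(default_factory=dict)


def analyze(transcript: str, llm: LLMCall) -> str:
    """Stage 1: condense free-form speech into a short task description."""
    return llm("ANALYZE: " + transcript).strip()


def plan(task: str, llm: LLMCall) -> List[RobotCommand]:
    """Stage 2: parse planner output of the assumed form 'name key=value ...'."""
    commands = []
    for line in llm("PLAN: " + task).splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines in the planner output
        args = {k: float(v) for k, v in (p.split("=", 1) for p in parts[1:])}
        commands.append(RobotCommand(parts[0], args))
    return commands


def run_pipeline(transcript: str, analyzer_llm: LLMCall, planner_llm: LLMCall,
                 execute: Callable[[RobotCommand], None]) -> List[RobotCommand]:
    """Run both stages, then hand each command to the executor in order."""
    task = analyze(transcript, analyzer_llm)
    commands = plan(task, planner_llm)
    for command in commands:
        execute(command)
    return commands


# Stub models standing in for the analyzer and planner (assumed outputs).
def fake_analyzer(prompt: str) -> str:
    return "pick up the cup"


def fake_planner(prompt: str) -> str:
    return "move_to x=0.4 y=0.1 z=0.3\nclose_gripper"


if __name__ == "__main__":
    run_pipeline("could you grab that cup for me", fake_analyzer, fake_planner, print)
```

Keeping the analyzer and planner behind the same callable interface means either stage can be swapped or tested on its own with a stub.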

Challenges we ran into

  1. Managing precise robot movements while avoiding collisions with objects.
  2. Engineering strong prompts for our LLM analyzer and planner.
    • Constructing heuristics for task completion (e.g., move straight up before and after an object pick).
    • Feeding the LLM the bare-minimum information required for determining gripper pose (position and orientation).
  3. Integrating a multi-step pipeline: speech-to-text, LLM, MCP, PyBullet simulation.
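As an illustration of the "move straight up before and after a pick" heuristic and of the gripper pose (position and orientation) mentioned above, here is one way such waypoints could be generated. This is a sketch under assumptions: the `GripperPose` type, the `pick_waypoints` function, the clearance value, and the top-down orientation quaternion (written in x, y, z, w order) are all hypothetical choices for this example, not a description of our exact code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed order: (x, y, z, w)

# Assumed default: a 180-degree rotation about the x axis, i.e. gripper pointing down.
TOP_DOWN: Quat = (1.0, 0.0, 0.0, 0.0)


@dataclass(frozen=True)
class GripperPose:
    """Position plus orientation: the minimum needed to place the gripper."""
    position: Vec3
    orientation: Quat = TOP_DOWN


def pick_waypoints(start: Vec3, obj: Vec3, clearance: float = 0.15) -> List[GripperPose]:
    """Build a pick path whose final approach and retreat are purely vertical.

    The arm first rises to a safe height, travels horizontally at that height,
    descends straight down onto the object, then lifts straight back up.
    """
    safe_z = max(start[2], obj[2]) + clearance
    above_start = (start[0], start[1], safe_z)
    above_obj = (obj[0], obj[1], safe_z)
    return [
        GripperPose(above_start),  # rise straight up from the current position
        GripperPose(above_obj),    # move horizontally until directly above the object
        GripperPose(obj),          # descend vertically to the grasp position
        GripperPose(above_obj),    # lift straight up after grasping
    ]


if __name__ == "__main__":
    for pose in pick_waypoints(start=(0.0, 0.0, 0.2), obj=(0.4, 0.1, 0.05)):
        print(pose)
```

Vertical approach and retreat segments reduce the chance of sweeping the gripper sideways through neighboring objects, which is the collision problem listed in item 1.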

Accomplishments that we're proud of

  1. Developing a modular robot control API and an MCP server that allows easy integration of new robotic skills.
  2. Building a Sonnet 4 prompt-processing pipeline with strong built-in language understanding.
  3. Building a PyBullet simulation environment with custom objects (it was our first time using PyBullet, or any robot simulator for that matter!).
  4. Demonstrating real-time task execution with voice commands in a simulated 3D environment.
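To show what "easy integration of new robotic skills" can look like in a modular control API, here is a small registry sketch. It is plain Python and deliberately does not use any MCP SDK calls: the `skill` decorator, the `SKILLS` table, `dispatch`, and the two example skills are hypothetical names for this illustration. In a setup like ours, each registered function would additionally be exposed as a tool by the MCP server.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a skill name to the function that performs it.
SKILLS: Dict[str, Callable[..., Any]] = {}


def skill(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator that registers a function as a robot skill under its own name."""
    SKILLS[fn.__name__] = fn
    return fn


@skill
def open_gripper() -> str:
    # A real skill would command the simulator or hardware here.
    return "gripper opened"


@skill
def move_to(x: float, y: float, z: float) -> str:
    # Placeholder for a motion command to a Cartesian target.
    return f"moved to ({x:.2f}, {y:.2f}, {z:.2f})"


def dispatch(name: str, **kwargs: Any) -> Any:
    """Look up a skill by name and call it with the given arguments."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name](**kwargs)


if __name__ == "__main__":
    print(dispatch("move_to", x=0.4, y=0.1, z=0.3))
    print(dispatch("open_gripper"))
```

With this pattern, adding a skill is a matter of writing one decorated function; the dispatcher and anything built on top of it need no changes.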

What we learned

We gained valuable insights into robot motion planning, robot simulation, and LLM integration with an external API. The project highlighted the reasoning capabilities of state-of-the-art LLMs in a robotics context.

What's next for CareBot

  • Integrating a computer vision API to dynamically obtain object positions before each pick.
  • Deploying on physical robotic hardware to assist in real-world environments.
  • Enhancing natural language understanding for greater task breadth.
  • Attempting more complex tasks: tool use, cooking, folding laundry.
