Inspiration

We wanted to enhance the interactivity of the HiWonder AiNex humanoid robot and the Muto HexaPod by integrating them with OpenMind's OM1 platform, enabling responsive, video-driven communication powered by AI.
What it does

The system enables AiNex to detect human presence via its onboard camera, interpret visual input using a Large Language Model (LLM), and respond with physical gestures or text. It supports real-time video streaming and remote operation.
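The detect-then-respond loop described above can be sketched in a few lines. This is an illustrative outline only: the real pipeline runs inside OM1, and the detector and LLM client below are hypothetical stand-ins, not OM1 APIs.

```python
# Minimal sketch of the detect -> interpret -> respond loop.
# detect_person and ask_llm are placeholders, not OM1 functions.

from dataclasses import dataclass


@dataclass
class Frame:
    has_person: bool  # stand-in for a real camera frame


def detect_person(frame: Frame) -> bool:
    """Placeholder for an onboard vision model (e.g. a face detector)."""
    return frame.has_person


def ask_llm(prompt: str) -> str:
    """Placeholder for the LLM call; a real system would query an API."""
    # A trivial rule stands in for the model's decision.
    return "wave" if "person" in prompt else "idle"


def step(frame: Frame) -> str:
    """One control-loop tick: map a camera frame to a gesture command."""
    if detect_person(frame):
        return ask_llm("A person is in view. Choose a gesture.")
    return "idle"


print(step(Frame(has_person=True)))   # wave
print(step(Frame(has_person=False)))  # idle
```

In the real system, the gesture string would be dispatched to the robot's motion stack instead of printed.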
How we built it

We set up a persistent SSH connection using Tailscale, installed OM1 in a Docker container on the Raspberry Pi of each robot (AiNex and Muto HexaPod), configured each robot with a JSON5 config file, and launched the full ROS stack on boot to initialize sensors, the camera, and the app logic.
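In outline, the setup steps look roughly like the commands below. Image names, file paths, and the service name are illustrative assumptions, not OM1's actual layout.

```shell
# Hedged sketch of the setup; names and paths are hypothetical.

# 1. Join the Pi to the tailnet so SSH has a stable address.
sudo tailscale up

# 2. Run OM1 in Docker (image name is a placeholder), mounting a
#    JSON5 config and granting access to the onboard camera.
docker run -d --name om1 \
  --device /dev/video0 \
  -v /home/pi/om1.json5:/app/config.json5 \
  --restart unless-stopped \
  openmind/om1:latest

# 3. Enable a systemd unit so the ROS stack launches on boot.
sudo systemctl enable robot-stack.service
```

The `--restart unless-stopped` flag keeps the container alive across reboots, which matters for a robot that must come up unattended.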
Challenges we ran into

School network restrictions blocked our connections, so we worked around them with a mobile hotspot. Docker containers could not reach the internet until we configured DNS manually. We also had to align the ROS1 and ROS2 environments for full system compatibility.
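The Docker DNS fix mentioned above typically means pointing the Docker daemon at explicit resolvers. A minimal sketch, assuming public resolvers are reachable from the network:

```shell
# Give Docker explicit DNS servers so containers can resolve hostnames.
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "dns": ["8.8.8.8", "1.1.1.1"]
}
EOF
sudo systemctl restart docker

# Verify name resolution from inside a container.
docker run --rm busybox nslookup example.com
```

If the local network intercepts DNS, substituting the network's own resolver addresses may be necessary instead.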
Accomplishments that we're proud of

We successfully integrated a multimodal AI runtime on a humanoid robot, enabled real-time, camera-based interactions using LLMs, and automated the robot's full boot and interaction stack.
What we learned

We deepened our understanding of ROS architecture, containerized AI systems, VPN-based remote networking, and real-time robotic perception powered by AI language models.
What's next for OpenMind OM1

We plan to expand the AiNex and Muto HexaPod's capabilities with more advanced gestures, voice integration, and multi-agent collaboration, and to deploy OM1 to other robots beyond these two platforms.