Inspiration
We have always wanted a robotic assistant at home. Just as Grok can transform your workflow online, Grok Junior can help you get things done at home.
What it does
Grok Junior listens to your spoken request, responds, and moves the robot to help. It can pick up objects, shine a light, move to specific coordinates, and even do a little dance. Behind the scenes, Grok Junior interprets the user's input, plans the motion, and controls the arm in real time.
How we built it
We use Grok as a VLA (vision-language-action model) by giving it tools that interface with the robot, including joint control, Cartesian coordinate moves, gripper control, and real-time feedback. We integrated a webcam-based object detection pipeline so the robot can detect objects on the table. For physical interaction, we designed and 3D printed a custom gripper that makes it easier to pick up objects.
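As a rough illustration of the tool-based approach described above, here is a minimal sketch of how tool calls from a model might be routed to robot actions. Everything in it is an assumption made for illustration: the tool names (`move_to`, `set_gripper`), the OpenAI-style function schema layout, and the `FakeArm` class, which stands in for the real robot so the example runs without hardware. It is not the project's actual code or the robot's real API.

```python
import json


class FakeArm:
    """Hypothetical stand-in for the real arm so the sketch runs offline."""

    def __init__(self):
        self.position = (200.0, 0.0, 100.0)  # x, y, z in mm (made-up start pose)
        self.gripper_open = True

    def move_to(self, x, y, z):
        self.position = (float(x), float(y), float(z))
        return {"ok": True, "position": list(self.position)}

    def set_gripper(self, is_open):
        self.gripper_open = bool(is_open)
        return {"ok": True, "gripper_open": self.gripper_open}


# Tool descriptions the model would see (schema layout is an assumption).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "move_to",
            "description": "Move the gripper to Cartesian coordinates in mm.",
            "parameters": {
                "type": "object",
                "properties": {
                    "x": {"type": "number"},
                    "y": {"type": "number"},
                    "z": {"type": "number"},
                },
                "required": ["x", "y", "z"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_gripper",
            "description": "Open or close the gripper.",
            "parameters": {
                "type": "object",
                "properties": {"is_open": {"type": "boolean"}},
                "required": ["is_open"],
            },
        },
    },
]


def dispatch(arm, name, arguments_json):
    """Run one tool call and return a JSON-serializable result as feedback."""
    handlers = {"move_to": arm.move_to, "set_gripper": arm.set_gripper}
    if name not in handlers:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        args = json.loads(arguments_json)
        return handlers[name](**args)
    except (TypeError, ValueError) as exc:
        # Bad JSON or wrong arguments: report back instead of crashing.
        return {"ok": False, "error": str(exc)}


if __name__ == "__main__":
    arm = FakeArm()
    print(dispatch(arm, "move_to", '{"x": 150, "y": 20, "z": 50}'))
    print(dispatch(arm, "set_gripper", '{"is_open": false}'))
```

Returning errors as data rather than raising them lets the result be passed back to the model as feedback, so it can retry or adjust its plan.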
Challenges we ran into
Network limitations restricted how we could connect to the robot. A firmware issue on the robot side caused a startup error that left us unable to move the robot, and we lost a significant amount of time debugging, unable to test our software until the issue was resolved. We also had trouble calibrating the homography for our computer vision pipeline because of the webcam's limited resolution and lighting changes caused by the weather and daylight outside. Mapping vision coordinates to robot coordinates required significant trial and error.
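To make the homography step concrete, here is a minimal sketch of mapping webcam pixels to table-plane coordinates from four calibration points. The function names, the calibration values, and the coordinate frame are all made up for illustration; the write-up does not say how the team computed their homography. The sketch uses only the standard library, whereas in practice a routine such as OpenCV's `cv2.findHomography` would be the more usual choice.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate calibration points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_homography(pixels, table_xy):
    """Fit a 3x3 homography (bottom-right entry fixed to 1) from exactly
    four pixel -> table-plane correspondences."""
    if len(pixels) != 4 or len(table_xy) != 4:
        raise ValueError("this sketch expects exactly four point pairs")
    a, b = [], []
    for (u, v), (x, y) in zip(pixels, table_xy):
        # x = (h1*u + h2*v + h3) / (h7*u + h8*v + 1), likewise for y.
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_table(H, u, v):
    """Apply the homography to one pixel and return table-plane (x, y)."""
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    x = (H[0][0] * u + H[0][1] * v + H[0][2]) / w
    y = (H[1][0] * u + H[1][1] * v + H[1][2]) / w
    return x, y


# Hypothetical calibration: image corners of a 640x480 frame matched to
# table positions in mm, measured in the robot's base frame.
PIXELS = [(0, 0), (640, 0), (640, 480), (0, 480)]
TABLE_MM = [(300, -160), (300, 160), (60, 160), (60, -160)]

if __name__ == "__main__":
    H = fit_homography(PIXELS, TABLE_MM)
    print(pixel_to_table(H, 320, 240))
```

With only four points, any error in a measured correspondence goes straight into the mapping, which is one reason low webcam resolution makes calibration sensitive. Using more points with a least-squares fit would average some of that error out.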
Accomplishments that we're proud of
We are very happy with how well the Grok reasoning model understood natural language commands and translated them into robot movement commands. We didn't want the hardware to be complicated, so we purchased a commercial robot that anyone can buy on Amazon (Waveshare RoArm M2), showing that powerful robotic agents can still be built with accessible hardware. This let us focus on building custom tools for Grok to call, so that it could plan movements autonomously in a way that felt intuitive rather than scripted.
What we learned
Hardware is hard. Understanding your hardware well and keeping things as simple as possible makes debugging much easier the further you get into the project. We also learned how powerful reasoning models can be when connected to real-world actions. The robot became much more capable once we unlocked the ability to communicate with it in plain language.
What's next for grok junior - robotic arm helper
We are excited about robotics and would like to continue exploring this space. If we continue developing this agent, we will upgrade its hardware capabilities, with the goal of making it a helpful hardware assistant!
