Inspiration

We were inspired by the "Tower of Babel" problem in industrial automation. Factories and warehouses often use robots from different manufacturers (like FANUC, ABB, KUKA), each speaking its own complex, proprietary programming language (KAREL, RAPID, KRL). This makes programming slow, requires expensive specialists for each brand, and hinders the flexibility needed in modern manufacturing and logistics – sectors critical to transportation efficiency. We wanted to build a universal translator to bridge this gap.

What it does

Aura Bridge AI turns a human hand demonstration into ready-to-refine robot programs. An operator simply moves their hand: a ZED depth camera on an NVIDIA Jetson tracks the wrist path in 3D and interprets pinch gestures as grasp and release commands. The captured path is streamed to AWS, mapped into the robot's coordinate frame, validated against the robot's workspace, and compiled into equivalent programs in two vendor dialects (FANUC KAREL and ABB RAPID), with Amazon Bedrock (Claude) refining the generated code. One demonstration, multiple robot brands.

How we built it

Gesture Capture (Edge - Jetson + ZED): We used Python with the ZED SDK (pyzed.sl) on an NVIDIA Jetson Orin to capture 3D hand motions. The script tracks the right wrist's position (in millimeters) using the BODY_38 model. We implemented pinch detection (thumb tip near pinky knuckle) with a consistency buffer (deque) to register "grasp" and "release" events reliably. To optimize the path, we only recorded points when the wrist moved beyond a minimum distance (MIN_RECORDING_DISTANCE_MM). OpenCV was used for real-time visualization of the camera feed and status.
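
The debounce and path-thinning logic can be sketched as below. This is a simplified stand-in for the edge script: `PINCH_THRESHOLD_MM`, `BUFFER_LEN`, and the exact value of `MIN_RECORDING_DISTANCE_MM` are illustrative assumptions, and the real version reads keypoints from the ZED `BODY_38` skeleton rather than taking tuples.

```python
from collections import deque
import math

MIN_RECORDING_DISTANCE_MM = 15.0   # minimum wrist travel before recording a point (value assumed)
PINCH_THRESHOLD_MM = 40.0          # assumed thumb-tip-to-pinky-knuckle distance for a pinch
BUFFER_LEN = 5                     # consecutive agreeing frames required to flip state

pinch_buffer = deque(maxlen=BUFFER_LEN)

def distance_mm(a, b):
    """Euclidean distance between two 3D points given in millimeters."""
    return math.dist(a, b)

def update_pinch(thumb_tip, pinky_knuckle, grasping):
    """Debounced grasp/release detection: only flip state when the last
    BUFFER_LEN frames agree, filtering single-frame tracking glitches.
    Returns (new_state, event) where event is 'grasp', 'release', or None."""
    pinch_buffer.append(distance_mm(thumb_tip, pinky_knuckle) < PINCH_THRESHOLD_MM)
    if len(pinch_buffer) == BUFFER_LEN and all(pinch_buffer) and not grasping:
        return True, "grasp"
    if len(pinch_buffer) == BUFFER_LEN and not any(pinch_buffer) and grasping:
        return False, "release"
    return grasping, None

def maybe_record(path, wrist):
    """Append the wrist position only if it moved far enough, thinning the path."""
    if not path or distance_mm(path[-1], wrist) >= MIN_RECORDING_DISTANCE_MM:
        path.append(wrist)
```

The consistency buffer trades a few frames of latency for far fewer spurious grasp events, which matters because each event becomes a gripper command downstream.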

Cloud Ingest (AWS IoT Core): The final path data (smoothed points, grasp events, fixture locations) was formatted into a JSON payload and published securely via MQTT to a specific topic (auraBridge/gestures/path) in AWS IoT Core using the AWSIoTPythonSDK. We set up an IoT Thing, Certificates, and an IAM Policy to manage device identity and permissions.
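
A minimal sketch of the publish step, assuming field names for the payload schema (the real schema lives in the edge script). The `AWSIoTMQTTClient` calls mirror the AWSIoTPythonSDK API; the publish itself is not run here since it needs the device certificates provisioned in IoT Core.

```python
import json
import time

TOPIC = "auraBridge/gestures/path"

def build_payload(points, grasp_events, fixtures):
    """Assemble the JSON document published to AWS IoT Core.
    Field names are illustrative assumptions."""
    return json.dumps({
        "timestamp": int(time.time()),
        "points_mm": points,            # smoothed wrist path, millimeters
        "grasp_events": grasp_events,   # path indices where grasp/release occurred
        "fixtures": fixtures,           # fixture locations captured during demo
    })

def publish(payload, endpoint, cert, key, root_ca):
    """Publish over mutual-TLS MQTT using the device's X.509 identity.
    Not executed here; requires the IoT Thing's certificates."""
    from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient
    client = AWSIoTMQTTClient("auraBridgeEdge")
    client.configureEndpoint(endpoint, 8883)
    client.configureCredentials(root_ca, key, cert)
    client.connect()
    client.publish(TOPIC, payload, 1)   # QoS 1: at-least-once delivery
    client.disconnect()
```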

Backend Processing (AWS Lambda - Node.js):

An IoT Rule triggers our primary Lambda function upon receiving a message.

The Lambda parses the incoming JSON payload.

Coordinate Mapping: It applies an estimated translation offset and an axis swap (Camera ZXY -> Robot XYZ) to transform the captured points from the camera's coordinate frame to the FANUC robot's base frame.
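
The mapping amounts to a few lines; the offset values below are illustrative placeholders for our hand-estimated ones, and the whole function stands in for a proper hand-eye calibration.

```python
# Camera origin expressed in the robot base frame; values are illustrative,
# the real offsets were estimated by hand during the hackathon.
OFFSET_MM = (800.0, 0.0, -350.0)

def camera_to_robot(p):
    """Map a camera-frame point (x, y, z) to the robot base frame.
    Axis swap: camera Z -> robot X, camera X -> robot Y, camera Y -> robot Z,
    followed by a fixed translation."""
    cx, cy, cz = p
    return (cz + OFFSET_MM[0], cx + OFFSET_MM[1], cy + OFFSET_MM[2])
```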

Path Validation: The transformed path points are validated against geometric constraints (max reach, base dead zone) defined for the target robot model (FANUC-R2000iC).
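
The constraint check can be sketched as follows; both limits are illustrative (the reach figure approximates the R-2000iC family, and the dead-zone radius is our assumption).

```python
MAX_REACH_MM = 2655.0         # approximate FANUC R-2000iC reach; illustrative value
DEAD_ZONE_RADIUS_MM = 300.0   # assumed cylinder around the base the arm cannot enter

def validate_path(points):
    """Reject any point outside the sphere of reach or inside the base
    dead zone. Returns (is_valid, first_bad_index)."""
    for i, (x, y, z) in enumerate(points):
        if (x * x + y * y + z * z) ** 0.5 > MAX_REACH_MM:
            return False, i          # beyond maximum reach
        if (x * x + y * y) ** 0.5 < DEAD_ZONE_RADIUS_MM:
            return False, i          # inside the base dead zone
    return True, None
```

Failing fast with the index of the first offending point lets the Lambda record a useful `INVALID` status rather than a bare rejection.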

Code Generation: If valid, the function generates basic code strings in both FANUC KAREL (.LS format) and ABB RAPID (.mod format), translating the transformed points into motion commands (L P[] / MoveL) and inserting gripper actions (DOUT/SetDO) based on the grasp_events.
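
A simplified sketch of the two emitters (the Lambda is Node.js; Python is used here for consistency with the edge code). Speeds, signal names like `doGripper`, and the fixed `[1,0,0,0]` orientation quaternion are illustrative choices, not the exact generated output.

```python
def gen_fanuc_ls(points, grasp_events):
    """Emit FANUC .LS-style motion lines (L P[n] ...) with DO[] gripper
    toggles inserted at grasp/release indices. Format is simplified."""
    events = {e["index"]: e["type"] for e in grasp_events}
    lines = []
    for i, _ in enumerate(points, start=1):
        lines.append(f"  L P[{i}] 250mm/sec FINE ;")
        ev = events.get(i - 1)
        if ev:
            lines.append(f"  DO[1]={'ON' if ev == 'grasp' else 'OFF'} ;")
    return "\n".join(lines)

def gen_abb_rapid(points, grasp_events):
    """Emit ABB RAPID MoveL lines with SetDO gripper commands. The robtarget
    uses a fixed default orientation, as in the generated code."""
    events = {e["index"]: e["type"] for e in grasp_events}
    lines = []
    for i, (x, y, z) in enumerate(points):
        lines.append(
            f"    MoveL [[{x:.1f},{y:.1f},{z:.1f}],[1,0,0,0],[0,0,0,0],"
            f"[9E9,9E9,9E9,9E9,9E9,9E9]], v200, fine, tool0;"
        )
        ev = events.get(i)
        if ev:
            lines.append("    SetDO doGripper, 1;" if ev == "grasp"
                         else "    SetDO doGripper, 0;")
    return "\n".join(lines)
```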

AI Refinement: Using the AWS SDK v3 (@aws-sdk/client-bedrock-runtime), the basic code is sent to Amazon Bedrock (specifically Anthropic Claude) with a detailed prompt asking it to refine the code for production readiness – adding comments, suggesting motion optimizations (CNT/z10), ensuring proper syntax, and incorporating robot-specific context.
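
The refinement call has roughly this shape. The project used the Node.js SDK (`@aws-sdk/client-bedrock-runtime`); the sketch below swaps in Python's boto3 `bedrock-runtime` client for consistency with the rest of this writeup. The model ID and prompt text are condensed assumptions, and `refine` is not executed here since it needs AWS credentials and Bedrock model access.

```python
import json

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"  # assumed model ID

def build_prompt(basic_code, dialect):
    """Condensed prompt skeleton; the real prompt was longer and was
    iterated on heavily to keep the output syntactically valid."""
    return (
        f"Refine the following {dialect} program for production use. "
        "Add comments, use blended motion (CNT/z10) where safe, keep the "
        "exact point data, and return only valid code.\n\n" + basic_code
    )

def refine(basic_code, dialect):
    """Send the basic code to Amazon Bedrock (Anthropic messages API)
    and return the refined program text."""
    import boto3
    client = boto3.client("bedrock-runtime")
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{"role": "user",
                      "content": build_prompt(basic_code, dialect)}],
    })
    resp = client.invoke_model(modelId=MODEL_ID, body=body)
    return json.loads(resp["body"].read())["content"][0]["text"]
```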

Artifact Storage: The refined KAREL and RAPID code, along with the raw payload and transformed path points, are stored in an S3 bucket using the AWS SDK v3 (@aws-sdk/client-s3).

Status Tracking: Job details, status (VALID, INVALID, COMPLETED, etc.), and S3 artifact locations are recorded in a DynamoDB table using AWS SDK v3 (@aws-sdk/client-dynamodb).
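
The job record has roughly the shape below, written in the low-level DynamoDB attribute format (the Lambda uses `@aws-sdk/client-dynamodb`; attribute names here are illustrative assumptions, not the exact table schema).

```python
import time
import uuid

def build_job_item(status, s3_keys):
    """Build a DynamoDB item (low-level attribute format) recording a job's
    status and where its generated artifacts landed in S3."""
    return {
        "jobId": {"S": str(uuid.uuid4())},
        "status": {"S": status},                  # e.g. VALID, INVALID, COMPLETED
        "createdAt": {"N": str(int(time.time()))},
        "artifacts": {"SS": s3_keys} if s3_keys else {"NULL": True},
    }
```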

Frontend (Web UI): We built a simple web interface that (theoretically) communicates with the backend via API Gateway. It displays the validation status, the generated KAREL and RAPID code returned by the Lambda, and uses Three.js to render the 3D path points received from the Lambda response.

Challenges we ran into

Coordinate Frame Mapping: Accurately transforming points from the camera's 3D space to the robot's base frame is complex and ideally requires calibration. We used manual estimation and axis swapping as a practical hackathon simplification.

Orientation: Capturing stable 3D tool orientation from hand gestures proved too complex for the timeframe, so we used a fixed default orientation in the generated code.

Simulator Integration: We initially aimed to simulate the output. FANUC ROBOGUIDE rejected our KAREL with an ASCII-to-binary compilation error that required specific compiler options to resolve. We then pivoted to ABB RobotStudio but faced several RAPID syntax and semantic errors (ambiguous names, type mismatches, unknown references) that demanded careful debugging of declarations and I/O signals. Ultimately, we focused the demo on the successful code generation rather than live simulation.

Bedrock Prompt Engineering: Getting Bedrock to consistently output perfectly structured and syntactically correct robot code required iterating on the prompt, adding more specific instructions about motion types, comments, and language conventions.

AWS Setup: Correctly configuring the AWS IoT Core certificates, policies, endpoints, and ensuring the Lambda had the right IAM permissions for Bedrock, S3, and DynamoDB required careful attention to detail.

What we learned

Industrial Robot Programming: Gained a much deeper understanding of the specific syntax, data structures (robtarget), and execution requirements of KAREL and RAPID.

Importance of Coordinate Systems: Realized how critical precise coordinate transformations are for any real-world robotics application involving external sensors.

AI for Code Generation: Saw the potential (and challenges) of using large language models like Claude via Bedrock to assist in generating and refining specialized code, highlighting the need for good prompt design.

Cloud Robotics Pipeline: Successfully built an end-to-end pipeline integrating an edge device (Jetson) with multiple AWS cloud services for processing and artifact generation.

Debugging Across Systems: Practiced debugging issues spanning hardware (ZED), edge software (Python), cloud services (AWS), and simulation environments (RobotStudio).

What's next for Aura Bridge AI

Built With

python, zed-sdk, opencv, nvidia-jetson, aws-iot-core, aws-lambda, node.js, amazon-bedrock, amazon-s3, amazon-dynamodb, api-gateway, three.js
