Cognitive recognition of Pepper robot

object identification
pose recognition
demo
simulator

Inspiration

Tackling artificial intelligence problems is a big challenge for the future of technology. It's great that we can work on, test, and improve one component of it.

Humanoid robots have smooth, human-like movements, can recognize emotions in other people, and can take part in simple conversation. What we think is really missing for Pepper to fit better into a human environment is its perception of the world. With better world perception, Pepper would have a broader base for further development and could be more easily engaged in human tasks.

Object detection

Pose detection

What it does

The Pepper robot has a Python API that lets us capture images and send them to our back-end service for further analysis. After the analysis is performed, the robot receives the output data, which gives it a perception of the world.
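The round trip described above can be sketched roughly as follows. This is a minimal illustration, not our actual client: the endpoint URL, the JSON field names (`detections`, `label`, `confidence`) and the function names are all assumptions made for the example, and capturing the frame on the robot is left out. It uses only the Python 3 standard library.

```python
import json
import urllib.request

# Hypothetical endpoint; the real back-end address is not given in this writeup.
BACKEND_URL = "http://localhost:8000/objects"


def build_request(image_bytes, url=BACKEND_URL):
    """Wrap raw JPEG bytes in an HTTP POST request for the back-end."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )


def parse_labels(response_body, min_confidence=0.5):
    """Extract object labels from an assumed JSON response shape:
    {"detections": [{"label": "cup", "confidence": 0.9}, ...]}
    """
    payload = json.loads(response_body)
    return [
        d["label"]
        for d in payload.get("detections", [])
        if d.get("confidence", 0.0) >= min_confidence
    ]


def analyze(image_bytes, url=BACKEND_URL, timeout=5.0):
    """Send one image to the back-end and return the recognized labels."""
    request = build_request(image_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_labels(response.read().decode("utf-8"))
```

On the robot, the list returned by `analyze` would then be handed to whatever speech or behavior step announces the objects.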

As a proof of concept, the robot can currently mimic human movement after recording it, and can recognize and name several objects that it sees.

With further development, the robot could interact with objects around it and better understand the human motion it is able to see. Also, in a scenario where the robot needs to learn a predefined human-like motion, the provided workflow is much more intuitive and easier to use.

How we built it

We used a standalone web server for image processing. It currently hosts OpenPose (a library for estimating human pose from images) and Darknet (a library for object classification), alongside a Python server that exposes the functionality of these libraries as a web service.
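To illustrate the idea of exposing the two libraries as a web service, here is a small sketch built on Python's standard `http.server`. The paths `/pose` and `/objects`, the response shapes and the two analyzer functions are placeholders invented for this example; the real server would call into OpenPose and Darknet at those points, and may well be structured differently.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def estimate_pose(image_bytes):
    """Placeholder for the OpenPose call; returns no keypoints."""
    return {"keypoints": []}


def detect_objects(image_bytes):
    """Placeholder for the Darknet call; returns no detections."""
    return {"detections": []}


# Hypothetical URL paths mapped to the analysis they trigger.
ROUTES = {"/pose": estimate_pose, "/objects": detect_objects}


def handle(path, image_bytes):
    """Pick the analyzer for a path and return (HTTP status, JSON payload)."""
    analyzer = ROUTES.get(path)
    if analyzer is None:
        return 404, {"error": "unknown endpoint"}
    if not image_bytes:
        return 400, {"error": "empty image"}
    return 200, analyzer(image_bytes)


class AnalysisHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the posted image, run the analysis, reply with JSON.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle(self.path, self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the service until interrupted."""
    HTTPServer(("0.0.0.0", port), AnalysisHandler).serve_forever()
```

Calling `serve()` starts the service. Keeping the routing in the plain function `handle` means it can be exercised without opening a socket.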

The Pepper robot has its own behavioral flow and uses the aforementioned web services to provide it with world perception. Pepper captures images, then uses the results the web services return to mimic human motion and point out some of the objects in its field of view.
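One way the mimicking step could work is to turn pose keypoints into joint angles. The sketch below computes how far an elbow is bent from three 2D keypoints (shoulder, elbow, wrist) and clamps the result to a joint range. It is an illustration under stated assumptions, not our actual mapping: the function names are made up, the keypoints are assumed to be `(x, y)` pixel pairs, and the joint limits are placeholder values that would have to be taken from the robot's documentation.

```python
import math


def elbow_bend(shoulder, elbow, wrist):
    """Return the elbow bend in radians from three (x, y) keypoints.

    0 means the arm is straight; pi/2 means the forearm is at a right
    angle to the upper arm.
    """
    upper = (elbow[0] - shoulder[0], elbow[1] - shoulder[1])
    fore = (wrist[0] - elbow[0], wrist[1] - elbow[1])
    cross = upper[0] * fore[1] - upper[1] * fore[0]
    dot = upper[0] * fore[0] + upper[1] * fore[1]
    # atan2 of |cross| and dot gives the unsigned angle between the vectors.
    return math.atan2(abs(cross), dot)


def clamp(angle, lower, upper):
    """Keep an angle inside the range a joint can actually reach."""
    return max(lower, min(upper, angle))


# Placeholder limits in radians; assumed values, not taken from the robot's spec.
ELBOW_MIN, ELBOW_MAX = 0.0, 1.5

target = clamp(elbow_bend((0, 0), (1, 0), (1, 1)), ELBOW_MIN, ELBOW_MAX)
```

A 2D image only gives the projected angle, so this works best when the arm moves roughly parallel to the camera plane.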

Challenges we ran into

We ran into lots of smaller problems while provisioning our backend server for image analysis. We also learned a lot about Pepper: what it can do, which parts of the API reference are missing, which features do not perform that well, and which work perfectly fine.

Accomplishments that we're proud of

We managed to try out several machine learning techniques and optimize them to run as close to real time as we could. We also built a proof of concept for the problem we are trying to solve, with a fully functioning code flow from starting the application on Pepper, through our backend services, to Pepper expressing its perception of the world and of humans.