Robotic handling of compliant food objects by robust learning from demonstration

Today, robots lack the visual, tactile, and cognitive intelligence of humans needed to perform complex handling and processing tasks. We show how to endow robots with these abilities and how to teach them to handle compliant food objects by Learning from Demonstration.

The robotic handling of compliant and deformable food raw materials, characterized by high biological variation, complex 3D geometries, and varying mechanical structure and texture, is currently in high demand in the ocean space, agricultural, and food industries.

Many tasks in these industries are performed manually by human operators who, due to the laborious and tedious nature of the work, exhibit high variability in execution and, consequently, in outcomes. Introducing robotic automation for the most complex processing tasks has been challenging with current robot learning policies, which are based either on learning from demonstration or on self-exploration.

Learning how to grasp

Most robotic solutions today rely purely on visual information and focus on handling rigid objects. Compliant objects pose a major challenge for robotic handling because they deform on contact with the robot. Humans, by contrast, coordinate visual and tactile information when they grasp or handle compliant food objects. Fusion of visual and tactile information is therefore a prerequisite for robots to handle food objects without quality degradation and to track and adjust to deformation during handling. Developing novel learning strategies that use both visual and force/tactile information in a single control scheme is equally crucial, so that the robot can learn new complex tasks and perform them autonomously.

When we humans reach to grasp an apple, we use visual sensing to guide our hand toward the apple, adjusting its path as we move so that the hand arrives at the “correct” grasping pose for picking up the apple.

When we make contact with the apple, our tactile sensing through the fingers is used to regulate the forces we exert on the apple so that we can grasp, lift, and place it. We would use different forces if, instead of an apple, we had to grasp a strawberry. The goal is to endow robots with the same capability, so that they can use both visual and tactile sensing to learn new and complex tasks.
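To make this two-phase idea concrete, the sketch below shows a controller that first uses visual feedback to servo the hand toward the grasp pose and then switches to tactile feedback to regulate the contact force. All robot and sensor interfaces, gains, and tolerances here are hypothetical placeholders for illustration; this is not the controller used in the work described.

```python
import numpy as np

# Minimal sketch of the vision-then-touch grasping idea described above.
# The robot interface (get_grasp_pose_error, get_fingertip_force, move_hand,
# close_fingers, lift_and_place) is a hypothetical placeholder, not a real API.

def grasp_compliant_object(robot, target_force, kp_visual=0.5, kp_force=0.01):
    # Phase 1: visual servoing -- adjust the hand path until the grasp pose is reached.
    while True:
        pose_error = robot.get_grasp_pose_error()    # 6-D error to the desired grasp pose
        if np.linalg.norm(pose_error) < 1e-3:
            break
        robot.move_hand(kp_visual * pose_error)       # proportional correction of the path

    # Phase 2: tactile regulation -- close the fingers until the contact force matches
    # the force appropriate for the object (lower for a strawberry than for an apple).
    while True:
        force_error = target_force - robot.get_fingertip_force()
        if abs(force_error) < 0.05:                    # within tolerance: grasp is stable
            break
        robot.close_fingers(kp_force * force_error)    # gentle proportional closing

    robot.lift_and_place()
```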

The resulting approach enables the robot to correctly combine visual and finger tactile sensing to estimate the grasping pose, the gripper's correct finger configuration, and the forces to be exerted on the object in order to achieve a successful grasp of compliant objects.
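As an illustration of what such visuo-tactile fusion could look like, the sketch below maps a visual feature vector and a tactile feature vector to a grasp pose, a finger configuration, and a grip force. The network structure, feature dimensions, and the assumption of a three-finger gripper are illustrative choices only, not the learning architecture used in this work.

```python
import torch
import torch.nn as nn

# Illustrative sketch: one plausible way to fuse visual and tactile features into
# grasp pose, finger configuration, and contact force predictions. Layer sizes and
# output dimensions are assumptions made for the example.

class VisuoTactileGraspNet(nn.Module):
    def __init__(self, visual_dim=128, tactile_dim=16, hidden_dim=64):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(visual_dim + tactile_dim, hidden_dim),
            nn.ReLU(),
        )
        self.grasp_pose = nn.Linear(hidden_dim, 6)     # position + orientation of the grasp
        self.finger_config = nn.Linear(hidden_dim, 3)  # joint angles of an assumed three-finger gripper
        self.grip_force = nn.Linear(hidden_dim, 1)     # force to exert on the compliant object

    def forward(self, visual_feat, tactile_feat):
        h = self.fusion(torch.cat([visual_feat, tactile_feat], dim=-1))
        return self.grasp_pose(h), self.finger_config(h), self.grip_force(h)
```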

Teaching the robot to learn

The learning, the ‘brain’ of the robot, is based on supervised learning in the form of learning from demonstration: humans demonstrate the task to the robot, and based on the learned policy the robot infers how to reproduce the task for various compliant objects. Since the robot can be taught by different human operators, whose demonstrations may be inconsistent, we also developed a robot learning approach that learns only from consistent demonstrations and automatically rejects inconsistent ones.
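The sketch below illustrates one simple way such a consistency filter could work: demonstrations that deviate strongly from the mean trajectory are rejected before a policy is learned from the rest. The deviation-based criterion and the trivial mean-trajectory policy are assumptions made for illustration, not the method developed in this work.

```python
import numpy as np

# Hedged sketch of learning only from consistent demonstrations. "Consistency" is
# measured here as deviation from the mean trajectory; the actual criterion used
# in the work may differ.

def select_consistent_demonstrations(demos, threshold=2.0):
    """demos: array of shape (n_demos, n_timesteps, n_dims), time-aligned trajectories."""
    demos = np.asarray(demos)
    mean_traj = demos.mean(axis=0)
    # Mean per-timestep distance of each demonstration from the mean trajectory.
    deviations = np.linalg.norm(demos - mean_traj, axis=-1).mean(axis=1)
    # Keep demonstrations whose deviation is within `threshold` standard deviations.
    cutoff = deviations.mean() + threshold * deviations.std()
    return demos[deviations <= cutoff]

def learn_policy(demos):
    consistent = select_consistent_demonstrations(demos)
    # Deliberately simple policy for the sketch: reproduce the mean of the
    # consistent demonstrations.
    return consistent.mean(axis=0)
```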

This approach to human-inspired robotic grasping and robot learning enables the learner (robot) to act more consistently and with less variance than the teacher (human). The proposed approach has a vast range of applications in the robotic automation of tasks in the ocean space, agriculture, and food industries, where the manual nature of tasks and processes leads to high variation in how skilled human operators perform complex processing and handling tasks.

Going forward

Future work will focus on more complex and challenging tasks for robot learning, where the human teacher has greater difficulty providing accurate demonstrations, and on the use of self-exploration and intermittent learning to refine the learned behaviour based on visual and force/tactile sensing.

The gripping sequence shown in RGB (top row) and depth (bottom row) images, based on our trained LfD policy: a) an initial image is acquired and the visual state of the lettuce is computed; b) the robot places a grasp on the lettuce according to the action derived from the visual state; c) the robot moves and releases the lettuce at a predefined target point; d) the robot moves out of the way, enabling visual confirmation of whether the grasping sequence succeeded.

By Ekrem Misimi