Article

Robot Grasp Planning: A Learning from Demonstration-Based Approach †

1 FANUC Advanced Research Laboratory, FANUC America Corporation, Union City, CA 94587, USA
2 Department of Precision Engineering, The University of Tokyo, Tokyo 113-8654, Japan
* Author to whom correspondence should be addressed.
This paper is an extended version of the conference paper: Wang, K.; Fan, Y.; Sakuma, I. Robot Grasp Planning from Human Demonstration. In Proceedings of the 2023 15th International Conference on Computer and Automation Engineering (ICCAE), Sydney, Australia, 3–5 March 2023.
Submission received: 3 December 2023 / Revised: 31 December 2023 / Accepted: 15 January 2024 / Published: 18 January 2024
(This article belongs to the Special Issue Novel Sensors and Algorithms for Outdoor Mobile Robot)

Abstract

Robot grasping constitutes an essential capability in fulfilling the complexities of advanced industrial operations. This field has been extensively investigated to address a range of practical applications. However, the generation of a stable grasp remains challenging, principally due to the constraints imposed by object geometries and the diverse objectives of the tasks. In this work, we propose a novel learning from demonstration-based grasp-planning framework. This framework is designed to extract crucial human grasp skills, namely the contact region and approach direction, from a single demonstration. Then, it formulates an optimization problem that integrates the extracted skills to generate a stable grasp. Distinct from conventional methods that rely on learning implicit synergies through human demonstration or on mapping the dissimilar kinematics between human hands and robot grippers, our approach focuses on learning the intuitive human intent that involves the potential contact regions and the grasping approach direction. Furthermore, our optimization formulation is capable of identifying the optimal grasp by minimizing the surface fitting error between the demonstrated contact regions on the object and the gripper finger surface and imposing a penalty for any misalignment between the demonstrated and the gripper’s approach directions. A series of experiments is conducted to verify the effectiveness of the proposed algorithm through both simulations and real-world scenarios.

1. Introduction

In recent years, there has been a notable escalation in the demand for industrial robots, driven by the need to enhance productivity and boost product quality. With the continuous advancements in robotics technology, the trend toward increased robot installations has been significantly pronounced in sectors such as logistics and assembly production. Among the broad spectrum of robotics research, the capability of robot grasping stands out as a fundamental skill that is crucial for the execution of complex tasks in industrial environments [1]. Despite being a subject of research for numerous decades, robot grasp planning in the context of industrial automation presents persistent challenges due to the lack of understanding of the constraints that are inherent to the downstream processes in industrial operations [2].
To address this challenge, learning from demonstration (LfD) has emerged as a promising approach that synthesizes robot grasping by observing human demonstrations. LfD offers an intuitive and straightforward method for non-experts in robotics to program a new task on a robot. Despite significant research efforts dedicated to efficiently transferring human grasping skills to robots, substantial challenges remain due to the low complexity and dexterity of the robot gripper compared to the human hand, as well as the diverse array of object shapes encountered in practical applications. Recent advancements in data-driven methods have made remarkable progress in the extraction of grasping skills by leveraging deep learning techniques. However, these methods typically require a large amount of data for training, leading to time-consuming and labor-intensive processes [3]. On the other hand, traditional analytic approaches formulate grasp planning as an optimization problem, focusing on achieving a stable grasp. The combination of human intention with optimization techniques enables robots to execute functional grasps while being mindful of constraints emanating from downstream processes. Existing methods typically represent human grasping skills through a contact model and then search for an optimal grasp that aligns with the corresponding contact model based on prior knowledge of object shapes and gripper configurations [4]. However, a notable limitation of many current approaches is their reliance on pre-defined grasping taxonomies, which aim to bridge the gap in configuration differences between human hands and robot grippers. This pre-defined grasping taxonomy hinders the ability to generalize across a broad spectrum of industrial tasks.
In this work, we propose a novel LfD-based grasp-planning approach, which incorporates human grasp skills into an optimization framework to synthesize a stable grasp. This method revolves around extracting key elements of human grasp skills, namely the contact region and the approach direction, from a single human demonstration. The key idea of our approach is to account for high-level task knowledge while addressing the difference in configuration between human hands and robot grippers. More specifically, the proposed method utilizes contact regions as a foundational element and applies optimization to reduce the surface fitting error between the robot gripper fingers and the object. To achieve a functional grasp, both the contact region and the approach direction are crucial, as often observed in human grasp skills [5]. To extract this information from human demonstrations, we propose an intuitive and straightforward method that involves passing a plane formed by the thumb and index finger through the object. The obtained grasp skills are then fed into an optimization algorithm to generate a grasp that closely approximates the human demonstration. In summary, our main contributions are:
  • We propose a novel LfD-based grasp-planning framework that utilizes both the contact regions and approach direction as key skills derived from a single human demonstration. This approach effectively harmonizes human intention with the environmental constraints of both the objects and the robot gripper.
  • We develop an intuitive and straightforward method for detecting the contact regions and approach direction by employing a specific hand plane formed by the thumb and index fingers using a single RGB-D camera.
  • We integrate human grasp skills, encompassing both the contact regions and approach direction, into an optimization problem. This integration aims to generate a stable functional grasp that aligns with human intention while considering environmental constraints.
We note that a short conference version of this work was presented in [6]. Our initial conference paper did not delve into specific challenges such as collision avoidance and internal grasp complexities. This manuscript addresses these critical aspects, supplemented by additional analytical experiments to provide deeper insights.

2. Related Works

Data-driven methods: The data-driven approach in robot grasping has gained considerable traction due to its ability to learn grasp features from a large amount of data by leveraging deep learning techniques [7]. Benefiting from robustness to perception errors and the capacity to generalize to unseen objects, this approach has been successfully deployed in various applications, notably in bin-picking scenarios [8]. However, the collection of large volumes of training data is a costly and resource-intensive endeavor. Additionally, generating a feasible functional grasp in complex industrial settings often requires an understanding of environmental constraints and downstream procedures, an aspect that data-driven methods may struggle to incorporate effectively. In response to the challenges associated with data collection, Reinforcement Learning (RL) methods have emerged as a potential solution. RL methods enable robots to automatically learn grasping policies through self-supervised interaction with the environment [9]. The advent of advanced computer graphics has further facilitated this process, allowing researchers to train grasping policies in simulation environments and then transfer them to real-world applications [10]. However, setting up training simulations can be daunting for novice users. Moreover, due to the difficulty in acquiring general skills applicable across a variety of tasks, these methods often require extensive retraining for different task configurations. In contrast, the proposed method synthesizes the functional grasp by observing human demonstration once, without requiring extensive data collection and time-consuming training processes.
Analytic methods: Analytic approaches in robot grasping search for a stable grasp by formulating and solving an optimization problem, utilizing prior knowledge of object shapes and gripper configurations [11]. These methods are suitable for well-structured industrial environments since they are low data-dependent and eliminate the need for tedious supervised or unsupervised training. Some analytic strategies represent a grasp as a set of contact points, sampling gripper poses in the configuration space to seek an optimal grasp. This process often employs various metrics, such as contact surface matching [12] or gripper configuration [13]. However, these approaches do not always guarantee that the generated grasps can successfully perform specific post-grasp motions in line with the downstream processes. Alternative approaches focus on developing a taxonomy of grasp types tailored for different manipulation purposes. This type of approach can initially assign a feasible functional grasp and enhance sample efficiency by reducing the complexity of the search space [14]. An in-depth analysis of grasp taxonomy for continuum robots was presented in [15]. Based on this grasp taxonomy, the study in [16] introduced an analytical grasp synthesis approach, allowing continuum robots to adapt grasping strategies to diverse objects and tasks. In [17], a comprehensive taxonomy of human grasp types was presented through an analysis of the data recorded from 40 healthy subjects performing 20 unique hand grasps. Extending this concept, ref. [18] proposed an approach to predict the best grasp type from a taxonomy of 33 grasp classes in multi-object scenarios using just a single RGB image. However, the taxonomy-based approach is heavily dependent on predefined shape priors of both grippers and objects, which can limit its generalization capabilities for unseen objects. In contrast, our approach directly transfers human grasp skills to the robot through observations of human demonstrations, without the need for a predefined taxonomy of grasp types, which enables a more flexible and adaptive approach to robot grasping.
Grasp synthesis by learning from demonstration: The remarkable grasping skills of humans, capable of handling a diverse range of objects in complex task contexts, have sparked significant interest in the field of robotics, particularly in the transfer of these skills to robots, as surveyed in [19]. To recognize the human grasping pose, many different methods have been discussed. Early research in the field predominantly relied on a data glove for the precise detection of human finger positions, which were then replicated by robotic grippers. A teleoperation system utilizing a data glove to control a multi-fingered robotic hand was developed in [20]. The integration of tactile sensors into this glove-based framework plays a crucial role in understanding how the hand interacts with different objects. In [21], a data glove equipped with multiple Inertial Measurement Units (IMUs) and tactile sensors was designed to capture the complex dynamics of hand–object interactions in real time. However, the complexity of data gloves, especially in integrating various sensors, presents significant challenges in terms of cost and maintenance. To simplify the system of human grasping demonstration, the study in [22] proposed the use of thermal sensors to trace contact areas. With the advent of deep learning technologies, more recent studies have begun to employ simple cameras to track 3D hand poses [23]. A learning from demonstration pipeline designed to infer the poses of both the human hand and the object and then transfer these relative poses into robot grasping instructions is discussed in [24]. The integration of human demonstrations into the initialization of grasping policies has been shown to significantly improve sample efficiency for both reinforcement learning and heuristic-based methods. This efficiency is achieved by limiting the search space to a more constrained domain. A critical element in learning from demonstration (LfD) approaches is the contact model between the human hand and the object, which was extensively explored in [25]. Some researchers measure accurate contact points between the fingertips and objects through the use of a specific data glove and then map the contact points onto the corresponding robot gripper fingers [26,27]. Although these methods provide an intuitive way to transfer human grasping poses to robots, the reliance on single-point approximations imposes rigid mapping constraints and sacrifices flexibility in accounting for the configuration differences between human hands and robot grippers. To relax the mapping constraints, recent research has shifted its focus toward contact regions, recognizing that human grasps typically involve surface matching between the hand and the object [28]. However, most of these approaches employ specialized equipment (e.g., thermal cameras) or manual labeling to identify the contact regions, often overlooking the importance of the grasping approach direction, which could further narrow the search space for optimal grasping. In contrast, our method proposes a novel approach to detect both the contact regions and approach direction using a single RGB-D camera, a common device in industrial settings. 
In addition, the proposed optimization method formulates the surface fitting issue, aiming to minimize the distance between the demonstrated contact regions and gripper finger surface, as well as the misalignment between the demonstrated approach direction and the robot gripper, as a least-squares problem to accelerate the computation required for grasp searching.

3. Proposed Approach

The proposed method for robot grasp synthesis is structured into two stages: grasp skill recognition from human demonstration and grasp synthesis with the optimization method, as shown in Figure 1. The process starts with the identification of contact regions and approach directions from a human demonstration. This identification involves defining a hand plane formed by the thumb and one of the other four fingers [29]. The approach direction is determined by combining the directions of the thumb and index finger. The contact region on the object is then determined by intersecting this hand plane with the object’s surface and selecting proximate surface points. Following the extraction of the contact region and approach direction, the grasp optimization process, which iteratively fits the surface of the gripper fingers to the previously identified contact region and aligns the direction of the gripper approach with the human demonstration, is employed to determine the best grasping pose. The optimization is mathematically formulated as a least-squares problem, taking into account the grasp skills extracted from the human demonstration. Once a grasp pose is generated, it undergoes a critical evaluation to ensure it is collision-free. If a collision is detected, the gripper is moved to a safe pose, and the grasp optimization process is repeated to generate a new collision-free pose.

3.1. Grasp Skill Recognition

During the human demonstration phase, the movements of the human hand are tracked with a commercial camera by leveraging deep learning techniques [30]. The research in [31] has proven that the majority of human grasp poses are formed by the thumb and some other opposing fingers. This finding has significant implications for robot gripper design, and many grippers mimic this fundamental grasp characteristic by incorporating a thumb finger and a set of opposing fingers. Therefore, in this work, the hand plane is defined by the thumb and one of the other four fingers. As shown in Figure 2, a single hand plane is defined by the thumb and index fingers in the case of a parallel gripper, whereas two hand planes are defined for a three-finger gripper: one formed by the thumb and index fingers and another formed by the thumb and middle fingers. The coordinate system of the hand plane is calculated based on the 3D positions of specified hand joints on the thumb and index fingers, as shown in Figure 3. The midpoint between the root joints of the thumb and index fingers is defined as the origin of the hand-plane coordinate system. The normal vector of the hand plane is defined as the X-axis, whereas the Y-axis is determined by the vector connecting the thumb and index root joints. The Z-axis is then established following the right-hand rule.
The approach direction, a crucial feature for functional grasp, indicates grasp reachability. Upon determining the hand plane, the Z-axis direction is selected as the approach direction, a concept similarly explored in [5]. Note that the approach direction is uniformly calculated using the hand plane formed by the thumb and index fingers, regardless of the number of fingers on the gripper.
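To make the frame construction concrete, the following sketch (our illustration, not the paper's implementation) builds the hand-plane coordinate system and the approach direction from four tracked joints, namely the root and tip positions of the thumb and index fingers. It assumes the joint positions come from an external hand tracker and approximates the plane normal from the two finger direction vectors, whereas the paper fits the plane to the detected hand joints (Figure 3).

```python
import numpy as np

def hand_plane_frame(thumb_root, index_root, thumb_tip, index_tip):
    """Approximate the hand-plane frame from four tracked 3D hand joints.

    Returns the frame origin, a 3x3 rotation whose columns are the X/Y/Z axes,
    and the approach direction (the Z-axis), following the definition in the text.
    """
    thumb_root = np.asarray(thumb_root, dtype=float)
    index_root = np.asarray(index_root, dtype=float)
    thumb_tip = np.asarray(thumb_tip, dtype=float)
    index_tip = np.asarray(index_tip, dtype=float)

    # Origin: midpoint between the thumb and index root joints.
    origin = 0.5 * (thumb_root + index_root)

    # X-axis: normal of the plane spanned by the two finger direction vectors.
    x_axis = np.cross(thumb_tip - thumb_root, index_tip - index_root)
    x_axis /= np.linalg.norm(x_axis)

    # Y-axis: vector connecting the two root joints, projected into the plane.
    y_axis = index_root - thumb_root
    y_axis -= np.dot(y_axis, x_axis) * x_axis
    y_axis /= np.linalg.norm(y_axis)

    # Z-axis: right-hand rule; this is also the grasp approach direction.
    z_axis = np.cross(x_axis, y_axis)

    R = np.column_stack([x_axis, y_axis, z_axis])
    return origin, R, z_axis
```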
Our method adopts a novel approach to determining the contact region for robot grasping. Instead of striving for pinpoint accuracy in detecting the positions where human fingertips contact the object surface, the hand plane is utilized to estimate a broader contact region on the object. This strategy not only simplifies the process but also enhances the robustness of our method, particularly in optimizing the grasping pose and accommodating the configuration differences between the human hand and the robot gripper.
Figure 4 exemplifies the process of identifying the contact region on an object. To calculate this contact region, we employ a two-step procedure. The first step involves passing the hand plane through the point cloud of an object. During this phase, all points on the object surface that fall within a specified tolerance distance, denoted as $d_1$, from the hand plane are selected using Equation (1). These points serve as preliminary candidates for the contact region.
$$\frac{|A O_x + B O_y + C O_z + D|}{\sqrt{A^2 + B^2 + C^2}} \leq d_1 \tag{1}$$
where $A$, $B$, $C$, and $D$ are the coefficients defining the hand plane $Ax + By + Cz + D = 0$. Each point within the object's point cloud is denoted by its 3D coordinates $(O_x, O_y, O_z)$. The parameter $d_1$ is the first tolerance distance, which represents the permissible distance from any point in the object's point cloud to the hand plane. This tolerance distance is determined based on the width of the gripper finger and subsequently guides the selection of the contact area. The results of the first step are shown in Figure 4a. The selected volume contains points from the object's point cloud that fall within the tolerance distance $d_1$ from the hand plane.
In the second step, we refine the selection of potential contact points on the object that were initially identified in the first step. This refinement involves selecting only those points that are within a second tolerance distance, denoted as $d_2$, from the origin of the hand plane using Equation (2). The purpose of this second tolerance distance is to ensure that the chosen contact regions on the object are not too far away from the demonstrated grasp pose.
$$\sqrt{(O_x - x_0)^2 + (O_y - y_0)^2 + (O_z - z_0)^2} \leq d_2 \tag{2}$$
where $(x_0, y_0, z_0)$ denotes the origin of the hand-plane coordinate system, and the parameter $d_2$ is the second tolerance distance, which is determined based on the length of the gripper finger. The results of the second step are shown in Figure 4b. Only the points within the object's point cloud whose distance to the origin of the hand-plane frame does not exceed the specified threshold $d_2$ are retained.
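A minimal sketch of the two filtering steps is shown below. It assumes the object point cloud is an (N, 3) NumPy array, that the hand plane is given by its coefficients (A, B, C, D) and frame origin, and that the tolerances d1 and d2 are chosen from the gripper finger width and length as described in Section 4; the function name is ours.

```python
import numpy as np

def extract_contact_region(points, plane_abcd, plane_origin, d1, d2):
    """Select the demonstrated contact region from an object point cloud.

    Step 1 (Equation (1)): keep points within distance d1 of the hand plane.
    Step 2 (Equation (2)): keep points within distance d2 of the plane origin.
    """
    points = np.asarray(points, dtype=float)
    A, B, C, D = plane_abcd
    normal = np.array([A, B, C], dtype=float)

    # Point-to-plane distance test against the hand plane.
    dist_plane = np.abs(points @ normal + D) / np.linalg.norm(normal)
    near_plane = dist_plane <= d1

    # Euclidean distance test against the hand-plane origin.
    dist_origin = np.linalg.norm(points - np.asarray(plane_origin, dtype=float), axis=1)
    near_origin = dist_origin <= d2

    return points[near_plane & near_origin]
```

With the parallel gripper used in Section 4, this would be called with d1 = 3 cm and d2 = 15 cm (0.03 and 0.15 if the point cloud is in meters).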

3.2. Grasp Synthesis

3.2.1. Grasp Optimization

The contact regions on the object that align with the human demonstration are identified and then employed in an optimization computation. It is generally observed that a larger contact surface between the gripper and the object results in greater friction, which, in turn, facilitates the formation of a more robust grasp. Leveraging this principle, an iterative surface fitting (ISF) optimization method, adapted from the Iterative Closest Point (ICP) algorithm, was proposed in [32].
Compared with the conventional ICP approach, the ISF method deforms the gripper fingertip surfaces while obeying kinematic constraints and concurrently aligns these modified surfaces with the target object’s surface. This method is more precisely formulated in Equation (3).
$$\begin{aligned}
\min_{R, t, \delta d} \;& E_p + \lambda E_n \\
\text{s.t.} \quad E_p(R, t, \delta d) &= \sum_{i=1}^{m} \sum_{j=1}^{2} \left( (p_{ij} - q_{ij})^T n^q_{ij} \right)^2 \\
E_n(R) &= \sum_{i=1}^{m} \left( (R\, n^p_i)^T n^q_i + 1 \right)^2
\end{aligned} \tag{3}$$
where the rotation $R$ and translation $t$ define the transformation from the robot gripper to the object, and $\delta d$ is the displacement between the gripper fingers. The variables $p_i$, $q_i$, $n^p_i$, and $n^q_i$ represent a point on the robot gripper surface, the corresponding point on the object surface, the normal vector at point $p_i$ on the gripper surface, and the normal vector at point $q_i$ on the object surface, respectively. The variable $m$ is the total number of points, and $j = 1, 2$ indexes the two fingers. According to [32], Equation (3) can be further formulated as a least-squares problem $\min_{R, t, \delta d} \| A x - b \|^2$.
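For reference, the sketch below evaluates the two error terms of Equation (3) for a given candidate transformation, assuming the point correspondences between the gripper finger surface and the object (and the current finger displacement) have already been established elsewhere; it is an illustration of the cost being minimized, not the full ISF solver.

```python
import numpy as np

def fitting_energy(p, n_p, q, n_q, R, t, lam=1.0):
    """Surface-fitting energy E_p + lambda * E_n of Equation (3).

    p, n_p: (m, 3) gripper-finger surface points and normals (gripper frame).
    q, n_q: (m, 3) corresponding object surface points and normals (object frame).
    R, t:   candidate gripper-to-object rotation (3x3) and translation (3,).
    """
    p, n_p = np.asarray(p, dtype=float), np.asarray(n_p, dtype=float)
    q, n_q = np.asarray(q, dtype=float), np.asarray(n_q, dtype=float)

    p_obj = p @ R.T + t      # gripper points expressed in the object frame
    n_obj = n_p @ R.T        # rotated gripper normals

    E_p = np.sum(np.einsum("ij,ij->i", p_obj - q, n_q) ** 2)      # point-to-plane error
    E_n = np.sum((np.einsum("ij,ij->i", n_obj, n_q) + 1.0) ** 2)  # normal anti-alignment
    return E_p + lam * E_n
```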
The core objective of the ISF method is to search for an optimal transformation (rotation $R$ and translation $t$), as well as the displacement $\delta d$ of the gripper fingers, by minimizing the surface fitting error. In practical terms, the most intuitive approach involves initializing $R, t$ in Equation (3) with the pose corresponding to the human grasp. This initialization forms the baseline from which the ISF algorithm iteratively computes the errors $E_p$ and $E_n$ to refine the fit to the contact surface. However, this method may diverge from the initial human grasp pose due to the difference in fingertip shape between human hands and robot grippers. To address this, we incorporate the approach direction derived from the human demonstration into the optimization process as a constraint in Equation (4). This incorporation ensures a more accurate and feasible replication of human-like grasping by the robot.
$$\min_{R, t, \delta d} \| A x - b \|^2 \quad \text{s.t.} \quad R\, n_z = n_{app} \tag{4}$$
where $n_z$ denotes the Z-axis direction of the gripper, and $n_{app}$ represents the approach direction determined from the human demonstration. The rotation matrix $R$ is designed to align $n_z$ with $n_{app}$.
To effectively solve this optimization problem while maintaining the integrity of the approach direction constraint, as defined by the human demonstration, Equation (4) is rewritten as follows:
$$\min_{R, t, \delta d} \| A x - b \|^2 + \omega^2 \left\| (R\, n_z) \times n_{app} \right\|^2 \tag{5}$$
where $\omega^2$ is a weight parameter that balances the surface matching accuracy and the alignment of the approach direction. Under the assumption that the rotation angle involved in each iterative step is small, it becomes feasible to approximate the rotation matrix $R$ as follows:
$$R \approx \begin{bmatrix} 1 & -\delta\psi & \delta\theta \\ \delta\psi & 1 & -\delta\phi \\ -\delta\theta & \delta\phi & 1 \end{bmatrix} = I + \widehat{\delta r} \tag{6}$$
where $\widehat{\delta r} \in so(3)$ is the skew-symmetric matrix form of the rotation vector $\delta r = [\delta\phi, \delta\theta, \delta\psi]^T$, which contains the small rotational changes in three dimensions: roll ($\delta\phi$), pitch ($\delta\theta$), and yaw ($\delta\psi$). By substituting Equation (6), which characterizes the skew-symmetric matrix $\widehat{\delta r}$, into Equation (5), the second term of Equation (5), which concerns the alignment of the approach direction, can be rewritten as follows:
$$\omega^2 \left\| \widehat{n}_{app}\, \widehat{n}_z\, \delta r + \widehat{n}_z\, n_{app} \right\|^2 = \omega^2 \left\| E\, \delta r - f \right\|^2, \quad \text{where } E = \widehat{n}_{app}\, \widehat{n}_z, \quad f = -\widehat{n}_z\, n_{app} \tag{7}$$
Consequently, by integrating Equation (7) into Equation (5), the optimal grasp pose can be calculated as
$$[\delta r,\, t] = \left( A^T A + \omega\, G^T G \right)^{-1} \left( A^T b + \omega\, G^T F \right), \quad \text{where } G = [E^T,\, 0_{3\times 3}^T]^T, \quad F = [f^T,\, 0_3^T]^T \tag{8}$$
Using Equation (8), the optimization problem is formulated into a standard least-squares problem that can be solved efficiently. The optimal finger relative displacement is calculated using Equation (9), given by [32].
$$\delta d = \frac{\sum_{i=1}^{m} \sum_{j=1}^{2} K_{ij} H_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{2} K_{ij}^2} \quad \text{s.t.} \quad \delta d \in [d_{min}, d_{max}] \tag{9}$$
where $K_{ij} = 0.5\,(-1)^{j-1} (R v)^T n^q_{ij}$ and $H_{ij} = (R\, p_{ij} + t - q_{ij})^T n^q_{ij}$. $d_{min}$ and $d_{max}$ indicate the minimum and maximum distances between the two fingers, and $v$ is the direction vector of the finger movement.
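The two closed-form updates can be sketched as follows. The matrix A and vector b stacking the linearized surface-fitting terms are assumed to be assembled beforehand following [32] (their construction is not shown), the penalty block G is arranged here so that it acts only on the rotational part of the update, and `weight` plays the role of the approach-direction weight $\omega^2$ in Equation (5).

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix such that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def pose_update(A, b, n_z, n_app, weight):
    """Closed-form pose update of Equation (8): returns (delta_r, t).

    A: (m, 6) stacked linearized surface-fitting terms for x = [delta_r, t].
    b: (m,)   corresponding residuals.
    """
    n_z, n_app = np.asarray(n_z, dtype=float), np.asarray(n_app, dtype=float)
    E = skew(n_app) @ skew(n_z)            # Equation (7)
    f = -skew(n_z) @ n_app
    G = np.hstack([E, np.zeros((3, 3))])   # penalty acts on delta_r only
    F = f

    lhs = A.T @ A + weight * (G.T @ G)
    rhs = A.T @ b + weight * (G.T @ F)
    x = np.linalg.solve(lhs, rhs)
    return x[:3], x[3:]                    # small rotation vector and translation

def finger_displacement(K, H, d_min, d_max):
    """Closed-form finger displacement of Equation (9), clipped to the finger stroke."""
    K, H = np.asarray(K, dtype=float), np.asarray(H, dtype=float)
    delta_d = np.sum(K * H) / np.sum(K ** 2)
    return float(np.clip(delta_d, d_min, d_max))
```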
The pseudo-code of the proposed LfD-based iterative surface fitting is summarized in Algorithm 1. The pose optimization determines the gripper transformation $R, t$ with $\delta d$ fixed; concurrently, the finger optimization determines the optimal finger displacement $\delta d$ while keeping $R, t$ fixed.
Algorithm 1 Learning from demonstration-based iterative surface fitting
Input: gripper finger surface points and normals $\{p_i, n^p_i\}$; object contact-region points and normals $\{q_i, n^q_i\}$
Output: $R, t, \delta d$
Init: $R \leftarrow R_{handplane}$, $t \leftarrow t_{handplane}$, $\delta d \leftarrow 0$, $err \leftarrow \infty$
while $err - E(R, t, \delta d) > \epsilon$ do
    $(R, t) \leftarrow \arg\min_{R, t} E(R, t, \delta d)$ using Equation (8)
    $\delta d \leftarrow \arg\min_{\delta d} E(R, t, \delta d)$ using Equation (9)
    $err \leftarrow E(R, t, \delta d)$ using Equation (3)
end while
return $R, t, \delta d$
Here, $\epsilon$ denotes the convergence threshold on the decrease in the fitting error.

3.2.2. Collision Avoidance

In the iterative surface fitting phase of robot grasp planning, it was observed that while the gripper fingers tend to align toward collision-free grasps, there remains a non-negligible risk of the gripper base colliding with the target object. To mitigate this issue and ensure that the entire gripper body avoids collision, we incorporate a collision avoidance mechanism within the grasp optimization framework.
An existing method in this domain is the Signed Distance Function (SDF), which computes the distance between each point in a three-dimensional space and the surface of its nearest obstacle. This function offers a differentiable representation of the environment, which proves advantageous for algorithms based on gradient optimization. However, a critical limitation of the SDF approach is its inability to accurately determine the exact penetration vector in scenarios involving deep collisions, as shown in Figure 5. To address collision detection and avoidance, the Gilbert–Johnson–Keerthi (GJK) algorithm coupled with the Expanding Polytope Algorithm (EPA) [33] is commonly employed. The GJK algorithm is utilized to rapidly determine whether a collision occurs through the concept of the Minkowski difference. Subsequent to this, the EPA is tasked with calculating the most efficient vector for collision escape, based on the findings of the GJK algorithm. Despite their utility, these algorithms exhibit limitations, particularly in handling objects with concave geometries. In light of these challenges, particularly prevalent in internal grasp scenarios, we propose a new approach to deep collision avoidance for grasp planning. This approach is designed to effectively navigate the intricacies of deep collision scenarios, thereby enhancing the robustness and reliability of robot grasping.
The proposed approach for determining the shortest escape vector in robot grasping scenarios, as depicted in Figure 6, is structured into three steps.
The first step involves representing both the object and the gripper as multiple spheres, as shown in Figure 7. This step enables the decomposition of any non-convex object into convex shapes, facilitating the efficient calculation of the Minkowski difference for spherical forms.
Next, the Minkowski difference between each pair of spheres—one from the object and one from the gripper—is computed. This calculation, guided by Equation (10), results in the formation of a union of spheres, collectively representing the overlapping regions between the object and the gripper.
$$P \ominus G = \{ A_i \ominus B_j \mid A_i \in P,\; B_j \in G \} \tag{10}$$
The object, denoted as $P$, is represented as $P = \bigcup_{i=1,\dots,M} A_i$, where each $A_i$ is an individual sphere on the object and $M$ is the total number of spheres that constitute the object. Similarly, the gripper, denoted as $G$, is represented as $G = \bigcup_{j=1,\dots,N} B_j$, where each $B_j$ is a sphere on the gripper and $N$ is the total number of spheres that constitute the gripper. The Minkowski difference between two spheres is calculated using Equation (11).
$$A_i \ominus B_j = \mathrm{Sphere}\left( c_{A_i} - c_{B_j},\; r_{A_i} + r_{B_j} \right) \tag{11}$$
where $c_{A_i}$ is the center position and $r_{A_i}$ the radius of the sphere $A_i$ constituting the object $P$. Similarly, $c_{B_j}$ is the center position and $r_{B_j}$ the radius of the sphere $B_j$ on the gripper $G$.
The final step involves calculating the boundary of the union of these spheres. The shortest distance from this boundary to the origin is identified as the escape vector. To streamline this process, we introduce a grid-based method to reduce computational complexity. Each grid point is evaluated to determine whether it is inside a sphere, on the sphere’s surface, or unoccupied, according to Equation (12). The shortest escape vector is then determined by identifying the shortest distance between a point on the surface of the combined spherical union and the origin. The vector from this point to the origin represents the shortest possible path to disengage the gripper from the object without collision.
$$d_{ij} = |p_k - c_{ij}| - r_{ij}, \qquad \begin{cases} d_{ij} < 0, & \text{inside ball} \\ d_{ij} = 0, & \text{on ball surface} \\ d_{ij} > 0, & \text{not occupied} \end{cases} \tag{12}$$
The position of the $k$th grid point in the space is denoted by $p_k$. $c_{ij} = c_{A_i} - c_{B_j}$ is the center of the union ball $A_i \ominus B_j$, calculated as the difference between the centers of $A_i$ and $B_j$, and $r_{ij} = r_{A_i} + r_{B_j}$ is its radius, given by the sum of the radii of $A_i$ and $B_j$.
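A simplified sketch of this grid-based search is given below. It assumes the object and gripper sphere decompositions are provided as lists of (center, radius) pairs; the grid extent and resolution are illustrative values, and whether the returned vector is applied to the gripper or to the object depends on the sign convention adopted for the Minkowski difference.

```python
import numpy as np

def shortest_escape_vector(obj_spheres, grip_spheres,
                           grid_half_extent=0.1, grid_step=0.005):
    """Grid-based search for the shortest escape vector (Equations (10)-(12))."""
    # Pairwise Minkowski differences of spheres are themselves spheres (Equation (11)).
    centers = [np.asarray(ca, dtype=float) - np.asarray(cb, dtype=float)
               for ca, _ in obj_spheres for cb, _ in grip_spheres]
    radii = [ra + rb for _, ra in obj_spheres for _, rb in grip_spheres]

    # Regular grid around the origin of the Minkowski-difference space.
    axis = np.arange(-grid_half_extent, grid_half_extent + grid_step, grid_step)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

    # Signed distance of every grid point to the union of spheres (Equation (12)):
    # the union's signed distance is the minimum over the individual spheres.
    d_union = np.full(len(grid), np.inf)
    for c, r in zip(centers, radii):
        d_union = np.minimum(d_union, np.linalg.norm(grid - c, axis=1) - r)

    boundary = grid[np.abs(d_union) <= grid_step]   # grid points near the union boundary
    if boundary.size == 0:
        return np.zeros(3)                          # no overlap region found on the grid
    closest = boundary[np.argmin(np.linalg.norm(boundary, axis=1))]
    return closest   # shortest escape translation (up to the sign convention above)
```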

4. Experiments

In this section, both simulations and real-world experiments are conducted to verify the effectiveness of the proposed LfD-based grasp-planning algorithm. The desktop computer we used was equipped with a 3.7 GHz CPU and 32 GB of memory. An SMC parallel gripper was used in the experiments; its finger was 2 cm wide and 10 cm long. To incorporate a margin of flexibility in the grasp planning, a scale factor $\alpha = 1.5$ was introduced, yielding the tolerance distances $d_1 = \alpha \cdot 2 = 3$ cm and $d_2 = \alpha \cdot 10 = 15$ cm.

4.1. Experiments through Simulations

In the simulations, the contact regions and the approach direction were specified by the human operator through the use of a mouse, as shown in Figure 8. The method employed for evaluation incorporated three datasets: HomebrewedDB [34], YCB-Video [35], and T-less [36]. To ensure a comprehensive and varied analysis, nine objects were picked from each dataset, resulting in a diverse range of 27 objects. To evaluate the proposed method, the original iterative surface fitting (ISF) method was employed as a benchmark. The evaluation metric was the surface fitting error $E_{total} = E_p + \lambda E_n$, as defined by Equation (3). This metric quantifies the average distance, in millimeters, between the corresponding points on the surfaces of the gripper tips and the object, thus providing a measure of the accuracy of the fit. In addition, the computation time for grasp planning, measured in seconds, was recorded for each method. The simulated results for the various objects are presented in Figure 9, Figure 10 and Figure 11. Furthermore, as detailed in Table 1, the simulation results revealed certain limitations of the original ISF method, particularly its tendency to converge to local optima, requiring the resampling of grasp poses. This resampling process subsequently resulted in increased computation times. In contrast, the proposed method, incorporating contact regions and approach directions specified by human operators, demonstrated improved performance. Specifically, it achieved reduced fitting errors and shorter computation times for determining the optimal grasp poses owing to the employment of closed-form solutions for grasp planning. The experimental results underscore the enhanced efficiency and accuracy of the proposed method in grasp-planning applications.
We further evaluated the effectiveness of the proposed collision avoidance method in scenarios involving the internal grasping of four distinct objects. The results are visualized in Figure 12. The process involved randomly sampling an initial grasp pose, followed by the generation of collision-free grasp poses using the surface fitting method. The evaluation metric was the success rate of collision-free optimal grasps based on a total of 20 samples, comparing scenarios with and without the implementation of collision avoidance. The comparison results are detailed in Table 2. It is noteworthy that internal grasping presented a significant challenge due to the necessity of locating a feasible grasp within a narrow space, a task that is markedly more complex in the absence of a collision avoidance mechanism. Our findings demonstrate that integrating the collision avoidance function significantly enhanced success rates across all four objects.

4.2. Experiments with a Real Robot

The proposed approach was further validated on a FANUC LR-Mate 200iD (FANUC America Corporation, Rochester Hills, MI, USA), a 6-degree-of-freedom industrial robot, tested with various objects. For detecting human hand movements, an Intel RealSense D435 camera (Intel Corporation, Santa Clara, CA, USA) was utilized in these experiments. Compared with other sensors, such as data gloves or thermal cameras, the advantage of using a 3D camera is that it allows the same sensor to be used for human hand tracking during the demonstration and for object detection during robot execution, since the demonstrator and the robot share the same workspace. The well-known Iterative Closest Point (ICP) method was employed for object localization. After detecting both the human hand pose and the target object, the contact region and approach direction were calculated within the camera coordinate system. Once the optimal grasp was determined in camera coordinates, the result was transformed into the robot coordinate system for execution by the robot.
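As a small illustration of this last step, the sketch below packs an optimized grasp pose (expressed in camera coordinates) into a homogeneous transform and maps it into the robot base frame. It assumes a known hand-eye calibration T_base_camera, which is treated as given here and is not described in the paper.

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(R, dtype=float)
    T[:3, 3] = np.asarray(t, dtype=float)
    return T

def grasp_in_robot_frame(R_grasp_cam, t_grasp_cam, T_base_camera):
    """Express a grasp pose computed in camera coordinates in the robot base frame."""
    return T_base_camera @ to_homogeneous(R_grasp_cam, t_grasp_cam)
```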
The proposed LfD-based grasp planning was demonstrated on four different objects: a water pipe, water valve, converter, and toy rabbit. The results are shown in Figure 13. The top row of the figure illustrates the human demonstration. The second row shows the hand plane; the identified contact region, marked in red; and the approach direction, marked in yellow. The third row shows the generated optimal grasp, whereas the fourth row shows the robot successfully executing the grasp. Finally, the bottom row illustrates the ability of the proposed method to rotate the object by 90 degrees relative to the ground, simulating a post-grasp motion. The experimental videos are available in the Supplementary Materials.
We also evaluated the proposed method to encompass scenarios involving the internal grasping of four distinct objects. The results, as illustrated in Figure 14, affirm that our method is capable of identifying feasible grasping poses, even within restricted spatial confines. This is primarily attributed to the integrated collision avoidance function, which effectively identifies an escape vector to facilitate the generation of a collision-free grasp.
To quantitatively evaluate the proposed method, we conducted a series of experimental comparisons: Method 1 employed only the contact region; Method 2 employed only the approach direction; and Method 3 utilized both the contact region and approach direction. The evaluation metric was the success rate, indicating the robot’s ability to reach the grasp pose generated from human demonstrations without colliding with the surrounding environment. The results of these experiments are comprehensively detailed in Table 3. Note that only Method 1 and Method 3 were subjected to comparative analysis and testing in the case of internal grasp since it was necessary to specify the contact region for generating an internal grasp. The analysis revealed that Method 1 exhibited the lowest success rates. This outcome can be attributed to the absence of approach direction information, leading to frequent collisions with the ground during grasp attempts. This indicates the critical importance of approach direction in facilitating a functional grasp. Method 2 demonstrated better success rates compared to Method 1. Notably, the integration of the approach direction reduced the frequency of collisions with the surrounding environment. However, the lack of contact area information often resulted in grasping poses tending toward an unstable position, hindering the successful lifting of the object. In contrast, Method 3 exhibited the highest success rates, suggesting that the inclusion of both the contact region and approach direction was effective. It was observed that the incorporation of a good initial contact region and approach direction derived from human demonstrations significantly expedited the convergence of the optimization algorithm to a feasible solution.
These experimental results not only validate the proposed method but also highlight its practical applicability in real-world grasp-planning scenarios. Despite the diverse shapes of the target objects, the proposed LfD-based approach demonstrated its efficiency in synthesizing human-like grasps. A key strength of this approach lies in its ability to adeptly avoid collisions, a critical aspect in robotic grasping scenarios. Furthermore, the method ensures a firm grasp of the object, which is essential for successfully executing post-grasp motions. This adaptability to various object geometries while maintaining a reliable and collision-free grasp underscores the robustness of the proposed method. The effectiveness of replicating human-like grasping strategies enhances the practical applicability of robotic systems in diverse environments.

5. Conclusions

This work introduces a learning from demonstration-based grasp-planning approach that effectively generates stable grasps by integrating grasp skills derived from human demonstrations. We demonstrate that the proposed method can intuitively acquire critical grasp skills, namely the contact region and approach direction, from a single human demonstration using a standard RGB-D camera. This contrasts with, and offers advantages over, approaches that rely on more specialized equipment such as data gloves or thermal cameras. In addition, the optimal grasp is solved using a closed-form solution, which incorporates the nuances of human grasp skills. To validate the effectiveness of this approach, a series of experiments was conducted using a FANUC industrial robot. The results from these experiments demonstrate that the proposed method not only achieves stable grasping but also effectively aligns with human intentions for post-grasp motions.
In the future, we plan to extend the algorithm to accommodate multi-finger grippers while considering the kinematic constraints of such grippers. In addition, there is an interest in exploring how general grasp skills can be extracted from human demonstrations across a variety of object shapes. This exploration is expected to leverage deep learning techniques, potentially leading to more versatile and adaptable robot grasping capabilities.

Supplementary Materials

The following supporting information can be downloaded at: https://0-www-mdpi-com.brum.beds.ac.uk/article/10.3390/s24020618/s1, Video S1: Experimental Videos for Learning from Demonstration-Based Grasp Planning.

Author Contributions

Conceptualization, K.W.; methodology, K.W. and Y.F.; software, K.W.; validation, K.W. and Y.F.; formal analysis, K.W.; investigation, K.W.; writing—original draft preparation, K.W.; writing—review and editing, K.W., Y.F. and I.S.; visualization, K.W.; supervision, I.S.; project administration, K.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, H.; Tang, J.; Sun, S.; Lan, X. Robotic Grasping from Classical to Modern: A Survey. arXiv 2022, arXiv:2202.03631.
  2. Saito, D.; Sasabuchi, K.; Wake, N.; Takamatsu, J.; Koike, H.; Ikeuchi, K. Task-grasping from human demonstration. arXiv 2022, arXiv:2203.00733.
  3. Mandikal, P.; Grauman, K. Learning dexterous grasping with object-centric visual affordances. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 6169–6176.
  4. Lin, Y.; Sun, Y. Robot grasp planning based on demonstrated grasp strategies. Int. J. Robot. Res. 2015, 34, 26–42.
  5. Geng, T.; Lee, M.; Hülse, M. Transferring human grasping synergies to a robot. Mechatronics 2011, 21, 272–284.
  6. Wang, K.; Fan, Y.; Sakuma, I. Robot Grasp Planning from Human Demonstration. In Proceedings of the 2023 15th International Conference on Computer and Automation Engineering (ICCAE), Sydney, Australia, 3–5 March 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 421–426.
  7. Pinto, L.; Gupta, A. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3406–3413.
  8. Song, S.; Zeng, A.; Lee, J.; Funkhouser, T. Grasping in the wild: Learning 6dof closed-loop grasping from low-cost demonstrations. IEEE Robot. Autom. Lett. 2020, 5, 4978–4985.
  9. Deng, Y.; Guo, X.; Wei, Y.; Lu, K.; Fang, B.; Guo, D.; Liu, H.; Sun, F. Deep reinforcement learning for robotic pushing and picking in cluttered environment. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 4–8 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 619–626.
  10. Zhao, W.; Queralta, J.P.; Westerlund, T. Sim-to-real transfer in deep reinforcement learning for robotics: A survey. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 737–744.
  11. Bicchi, A.; Kumar, V. Robotic grasping and contact: A review. In Proceedings of the 2000 ICRA, Millennium Conference, IEEE International Conference on Robotics and Automation, San Francisco, CA, USA, 24–28 April 2000; Symposia Proceedings (Cat. No. 00CH37065). IEEE: Piscataway, NJ, USA, 2000; Volume 1, pp. 348–353.
  12. Ciocarlie, M.; Goldfeder, C.; Allen, P. Dexterous grasping via eigengrasps: A low-dimensional approach to a high-complexity problem. In Proceedings of the Robotics: Science and Systems Manipulation Workshop-Sensing and Adapting to the Real World, Atlanta, GA, USA, 30 June 2007.
  13. Fan, Y.; Tomizuka, M. Efficient grasp planning and execution with multifingered hands by surface fitting. IEEE Robot. Autom. Lett. 2019, 4, 3995–4002.
  14. Dai, W.; Sun, Y.; Qian, X. Functional analysis of grasping motion. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 3507–3513.
  15. Mehrkish, A.; Janabi-Sharifi, F. A comprehensive grasp taxonomy of continuum robots. Robot. Auton. Syst. 2021, 145, 103860.
  16. Mehrkish, A.; Janabi-Sharifi, F. Grasp synthesis of continuum robots. Mech. Mach. Theory 2022, 168, 104575.
  17. Feix, T.; Romero, J.; Schmiedmayer, H.B.; Dollar, A.M.; Kragic, D. The grasp taxonomy of human grasp types. IEEE Trans. Hum.-Mach. Syst. 2015, 46, 66–77.
  18. Corona, E.; Pumarola, A.; Alenya, G.; Moreno-Noguer, F.; Rogez, G. Ganhand: Predicting human grasp affordances in multi-object scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5031–5041.
  19. Kleeberger, K.; Bormann, R.; Kraus, W.; Huber, M.F. A survey on learning-based robotic grasping. Curr. Robot. Rep. 2020, 1, 239–249.
  20. Ozawa, R.; Ueda, N. Supervisory control of a multi-fingered robotic hand system with data glove. In Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 1606–1611.
  21. Liu, H.; Xie, X.; Millar, M.; Edmonds, M.; Gao, F.; Zhu, Y.; Santos, V.J.; Rothrock, B.; Zhu, S.C. A glove-based system for studying hand-object manipulation via joint pose and force sensing. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 6617–6624.
  22. Lakshmipathy, A.; Bauer, D.; Bauer, C.; Pollard, N.S. Contact transfer: A direct, user-driven method for human to robot transfer of grasps and manipulations. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 6195–6201.
  23. Karunratanakul, K.; Yang, J.; Zhang, Y.; Black, M.J.; Muandet, K.; Tang, S. Grasping field: Learning implicit representations for human grasps. In Proceedings of the 2020 International Conference on 3D Vision (3DV), Fukuoka, Japan, 25–28 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 333–344.
  24. Wang, P.; Manhardt, F.; Minciullo, L.; Garattoni, L.; Meier, S.; Navab, N.; Busam, B. DemoGrasp: Few-shot learning for robotic grasping with human demonstration. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 5733–5740.
  25. Rosales, C.; Ros, L.; Porta, J.M.; Suárez, R. Synthesizing grasp configurations with specified contact regions. Int. J. Robot. Res. 2011, 30, 431–443.
  26. Ekvall, S.; Kragic, D. Learning and evaluation of the approach vector for automatic grasp generation and planning. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, Roma, Italy, 10–14 April 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 4715–4720.
  27. Hillenbrand, U.; Roa, M.A. Transferring functional grasps through contact warping and local replanning. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal, 7–12 October 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 2963–2970.
  28. Brahmbhatt, S.; Handa, A.; Hays, J.; Fox, D. Contactgrasp: Functional multi-finger grasp synthesis from contact. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 4–8 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 2386–2393.
  29. Wang, K.; Fan, Y.; Sakuma, I. Robot Programming from a Single Demonstration for High Precision Industrial Insertion. Sensors 2023, 23, 2514.
  30. Wang, K.; Tang, T. Robot programming by demonstration with a monocular RGB camera. Ind. Robot. Int. J. Robot. Res. Appl. 2023, 50, 234–245.
  31. Cutkosky, M.R. On grasp choice, grasp models, and the design of hands for manufacturing tasks. IEEE Trans. Robot. Autom. 1989, 5, 269–279.
  32. Fan, Y.; Lin, H.C.; Tang, T.; Tomizuka, M. Grasp planning for customized grippers by iterative surface fitting. In Proceedings of the 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), Munich, Germany, 20–24 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 28–34.
  33. Van Den Bergen, G. Proximity queries and penetration depth computation on 3d game objects. In Proceedings of the Game Developers Conference, San Jose, CA, USA, 22–24 March 2001; Volume 170, p. 209.
  34. Kaskman, R.; Zakharov, S.; Shugurov, I.; Ilic, S. Homebreweddb: Rgb-d dataset for 6d pose estimation of 3d objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019.
  35. Xiang, Y.; Schmidt, T.; Narayanan, V.; Fox, D. Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes. arXiv 2017, arXiv:1711.00199.
  36. Hodan, T.; Haluza, P.; Obdržálek, Š.; Matas, J.; Lourakis, M.; Zabulis, X. T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 880–888.
Figure 1. The pipeline of the proposed approach consists of two stages: grasp skill recognition and grasp synthesis. First, the contact regions and approach directions are recognized from the human demonstration. Then, grasp optimization, which iteratively fits the surface between the gripper finger surface and the contact region, is employed to synthesize the grasp corresponding to the human demonstration. The generated grasp pose is evaluated to ensure it is collision-free. If a collision is detected, the gripper is moved to a safe pose, and the grasp optimization process is repeated to generate a new pose.
Figure 2. The hand-plane definitions in the two- and three-finger Robotiq gripper cases.
Figure 3. The hand plane is formed by fitting hand joints on the thumb finger and index finger.
Figure 4. The contact region is determined by passing the hand plane through the object’s point cloud (blue). (a) First, select the points (red) within the distance $d_1$ of the hand plane. (b) Second, select the points (red) from the first step within the distance $d_2$ of the origin of the hand plane.
Figure 5. (a) In instances of minor collisions, where the interaction between the gripper and the object is relatively superficial, escaping collisions is typically straightforward. (b) In situations involving deep collisions, the complexity of avoiding collisions escalates significantly.
Figure 6. (a) Approximate both the object and the gripper into spherical components. (b) Calculate the Minkowski difference to formulate a union of spheres. (c) Determine the boundary of the union of spheres. (d) Find the shortest escape vector. (e) The result of the collision-free grasp.
Figure 7. Examples of approximation into spherical components.
Figure 8. The red area is the contact region, and the yellow arrow is the approach direction.
Figure 9. The simulation results on different objects. The nine objects were selected from the HomebrewedDB dataset.
Figure 10. The simulation results on different objects. The nine objects were selected from the YCB-Video dataset.
Figure 11. The simulation results on different objects. The nine objects were selected from the T-less dataset.
Figure 12. The simulation results of the internal grasping of different objects.
Figure 13. The results using a real robot. After lifting the objects, the robot rotates the objects 90 degrees to simulate a post-grasp motion.
Figure 14. The experimental results of the internal grasp case using a real robot.
Table 1. Surface fitting errors and computation times on different objects.

Dataset | Object Name | ISF $E_{total}$ (mm) | ISF $t_{total}$ (s) | LfD-Based ISF $E_{total}$ (mm) | LfD-Based ISF $t_{total}$ (s)
HomeDB | Jig | 13.005 | 1.191 | 5.326 | 0.798
HomeDB | Dryer | 14.600 | 1.071 | 6.974 | 0.747
HomeDB | Cap | 9.609 | 1.990 | 5.459 | 0.975
HomeDB | Converter | 11.387 | 2.561 | 6.404 | 0.519
HomeDB | Motocycle | 12.032 | 2.439 | 7.285 | 0.661
HomeDB | Dinosaur | 17.219 | 1.744 | 9.427 | 0.895
HomeDB | Clamp | 9.439 | 1.240 | 7.395 | 0.614
HomeDB | Cup | 10.871 | 1.867 | 8.068 | 0.771
HomeDB | Bear | 12.758 | 1.043 | 7.215 | 0.558
HomeDB | Average | 12.324 | 1.683 | 7.061 | 0.726
YCB-V | Box | 3.457 | 0.404 | 3.560 | 0.369
YCB-V | Can | 4.193 | 0.420 | 2.551 | 0.406
YCB-V | Mustard | 8.720 | 0.832 | 6.174 | 0.559
YCB-V | Spam | 8.740 | 0.532 | 4.283 | 0.370
YCB-V | Banana | 3.145 | 0.434 | 3.031 | 0.287
YCB-V | Bowl | 8.924 | 1.238 | 4.778 | 0.862
YCB-V | Scissor | 5.808 | 0.679 | 3.662 | 0.277
YCB-V | Magic pen | 4.806 | 0.523 | 4.092 | 0.317
YCB-V | Clip | 6.836 | 0.379 | 4.681 | 0.347
YCB-V | Average | 6.070 | 0.604 | 4.090 | 0.422
T-less | Bulb | 5.198 | 0.506 | 2.600 | 0.327
T-less | S-connector | 8.772 | 0.872 | 3.588 | 0.334
T-less | C-plug | 6.968 | 1.004 | 6.525 | 0.532
T-less | R-connector | 11.648 | 1.216 | 7.858 | 0.524
T-less | C-connector | 10.593 | 1.562 | 6.623 | 0.639
T-less | Switch | 5.159 | 0.915 | 3.409 | 0.359
T-less | Box | 2.405 | 0.439 | 2.536 | 0.412
T-less | Plug | 13.058 | 1.282 | 5.171 | 0.865
T-less | Circle box | 7.784 | 0.526 | 5.417 | 0.329
T-less | Average | 7.954 | 0.925 | 4.859 | 0.480
All | Total average | 8.783 | 1.071 | 5.337 | 0.543
Table 2. Success rates (collision-free grasps/total samples), with and without the implementation of collision avoidance.

Object Name | Gear | Water Valve | Water Pipe | Jig
Without collision avoidance | 11/20 | 13/20 | 10/20 | 7/20
With collision avoidance | 20/20 | 18/20 | 19/20 | 20/20
Table 3. Success rates (collision-free grasps/total samples).

Methods | Water Pipe (external) | Water Valve (external) | Converter (external) | Toy Rabbit (external) | Gear (internal) | Water Valve (internal) | Water Pipe (internal) | Jig (internal)
Method 1 | 11/20 | 6/20 | 2/20 | 13/20 | 11/20 | 7/20 | 17/20 | 12/20
Method 2 | 15/20 | 10/20 | 13/20 | 13/20 | - | - | - | -
Method 3 | 18/20 | 20/20 | 20/20 | 19/20 | 20/20 | 20/20 | 20/20 | 20/20
