Peer-Review Record

RGB-D-Based Robotic Grasping in Fusion Application Environments

Appl. Sci. 2022, 12(15), 7573; https://0-doi-org.brum.beds.ac.uk/10.3390/app12157573

by Ruochen Yin^1,2,3,*, Huapeng Wu^3, Ming Li^3, Yong Cheng^1,4, Yuntao Song^1,* and Heikki Handroos^3

Reviewer 1: Anonymous

Reviewer 2: Daniel Costa Ramos

Reviewer 3: Anonymous

Submission received: 30 June 2022 / Revised: 23 July 2022 / Accepted: 25 July 2022 / Published: 27 July 2022

(This article belongs to the Section Robotics and Automation)

Round 1

Reviewer 1 Report

The subject of the paper is definitely interesting: the simple methodology you propose looks very promising, on the basis of the performance reported in Table 1.

On the other hand, the quality of the language is sometimes very poor, as if the editing had been rushed or skipped. You should also check the mathematical formulas and definitions, which may lack precision. Lastly, the bibliography focuses on a few authors and journals, and some parts of the methodology description may be unnecessary (e.g., standard algorithms like DBSCAN).

In the following I include a detailed list of changes/questions.

-------------------------------------------------------------------------------------------------------------------

Everywhere: use either "RGB-D" or "RGBD"

Abstract: it is not clear why "the black-box nature of neural networks" should prevent them "from meeting the stringent requirements of many industrial scenarios"

Keywords: add Deep Neural Networks

line 14: the word "fusion" may have different meanings: provide disambiguation

line 24: "on reviewing learning-based approaches" ---> "on learning-based approaches" (this is not a review paper)

lines 32-34: is Lidar technology suitable for this kind of problem as an alternative to a depth camera?

line 37: "Morrison D. et al." ---> "Morrison et al."

line 44: "Tremblay J. et al." ---> "Tremblay et al."

line 51: "B. Wen et al." ---> "Wen et al."

line 64: what is a "peg" in this context?

lines 68-73: again, it is not clear how the black-box nature of NNs (which we all agree about) can be responsible for uncertainty and, in that case, how it can be fixed by making training sets larger (actually, this is the epistemic uncertainty, which does not depend on the black-box nature of NNs; then there is aleatoric uncertainty, but I would attribute this to uncertainty in the observed data used as inputs, rather than to the black-box nature of the NN itself)

line 71-76: this is a general and very interesting point for deep learning training. Are there really insurmountable problems which prevent the acquisition of large-scale training datasets for your application? In which sense is it "expensive"? Are the fusion vacuum vessels inaccessible even at the design, pre-production stage? I have to face this question very often with deep learning applications, while I believe that data acquisition - when possible - should be part of the workflow.

line 82: "Y. Xiang et al." ---> "Xiang et al."

Figure 2: what happens if condition One is negated?

Figure 2 caption: "This is the overall processing" ---> "The overall processing"; use all verbs in simple present tense, not future

line 88: "Figue2" ---> "Figure 2"

line 88-91: use all verbs in simple present tense, not future

line 95: "approached by Y. Xiang et al. [13]" ---> "approached in [13]"

Algorithm 1 DBSCAN: if this is the standard DBSCAN, consider whether there is a real need to write out the algorithm explicitly or whether providing a reference is enough.

Algorithm 1 DBSCAN, lines 5-6: "d(i,n)" ---> "d(i,m)", I guess

Algorithm 1 DBSCAN, line 18: how could Omega_cur be equal to the empty space, after the step defined at line 17?

Algorithm 1 DBSCAN, line 21: is this action correctly placed here or should it be before "back to line 14"? What is k at line 21?

end of page 4 (numeration is missing): "As shown in Figure 2. After we get the instance segmentation from the output of the NN.": check language style and explain better

Algorithm 2 RANSAC Plane Extraction: if this is the algorithm as described in [15], consider if there is a real need to explicitly write it or, conversely, providing reference [15] is enough

before line 104 (numeration is missing): "the number of categories matches the actual situation". What does "actual situation" mean? Elaborate on that.

before line 104 (numeration is missing): "we will abandon" ---> "we abandon" ; "Then, We project" ---> "Then, we project"

line 105: explain why there are two focal lengths, f_x and f_y (rectangular pixels?)

line 106: "it is vital to inform that" ---> "it is vital that"

line 116-119: check *carefully* language, grammar, mathematical symbols

text between line 119 and equation (2) (line numeration missing): check language, contents, mathematical symbols

line 120: "Where" ---> "where" ; "W is the distance between P_a and P_b": it does not seem so from Figure 4.

equation (3): is it correct?

line 121: "Our proposed method is called as a Hybrid method combines UOIS with DBSCAN and RANSAC (HUDR)." ---> "The method we propose (called HUDR) is a hybrid one combining UOIS with DBSCAN and RANSAC."

line 130: "Tsai-lenz" ---> "Tsai-Lenz"

line 131: "UR_rtde package" ---> "Universal Robots RTDE package"

line 133: "RGD-B" ---> "RGB-D"

lines 142-143 and Figure 6: how do the planes shown in the lower figure match the photographs in the upper figure? They seem to have a different orientation.

line 142: "independently is in the lower" ---> "independently in the lower"

line 146: "CR-ConvNet" ---> "GR-ConvNet"

line 154: "We will perform" ---> "We performed"

line 155: "then calculate" ---> "then calculated"

line 160: "performances" ---> "performance"

lines 162-164: "will become" ---> "becomes"

line 166: "means the GR-Convent method could" ---> "means it could"

line 170: "performances" ---> "performance"

line 180: "challenge" ---> "challenging"

References, everywhere: use a uniform capitalization rule for titles. For instance, [12] and [13] are capitalized, while other titles are not. Similarly, always write either "The International journal of robotics research" or "The International Journal of Robotics Research"

line 220: make the notation uniform. Compare with the same conference in [1] and [10]. At least capitalize "International"

line 224: "In Proceedings of the Proceedings" ---> "In Proceedings"

line 228: "graspnet" ---> "GraspNet" ; "dof" ---> "DoF"

line 233: "Conference on Robot Learning" ---> "4th Conference on Robot Learning"

line 238: "of the kdd" ---> "of the KDD-96"
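
For context on the question about line 105 (two focal lengths, f_x and f_y): in the standard pinhole camera model the focal length is expressed in pixel units separately along each image axis, so the two values differ whenever the pixels are not exactly square. The sketch below shows how both enter the back-projection of a depth pixel to a 3D point. It is only an illustration of the general model; the function name and the intrinsic values are hypothetical and are not taken from the manuscript.

```python
from typing import Tuple


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Map pixel (u, v) with a depth reading to a 3D point in the camera frame.

    fx, fy: focal length in pixels along the horizontal and vertical axes.
    cx, cy: principal point in pixels.
    """
    x = (u - cx) * depth / fx  # horizontal offset scaled by fx
    y = (v - cy) * depth / fy  # vertical offset scaled by fy
    return (x, y, depth)


# Hypothetical intrinsics (assumed values, for illustration only).
FX, FY, CX, CY = 600.0, 610.0, 320.0, 240.0

# A pixel 60 px right of and 61 px below the principal point, 2 m away.
point = backproject(380.0, 301.0, 2.0, FX, FY, CX, CY)
```

With these assumed intrinsics, equal metric offsets in x and y correspond to slightly different pixel offsets, which is the practical effect of f_x differing from f_y.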

Author Response

Thanks for the valuable comments and suggestions. Considerable modifications have been carried out based on the comments. The major modifications are highlighted in the paper. The modifications and explanations are summarized in the Word file.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper describes a grasping task technique for a challenging environment in which the problem is decomposed into object recognition based on an instance segmentation network and grasping pose computation based on plane extraction and clustering algorithm.
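
To make the plane-extraction step mentioned above concrete, the following is a minimal sketch of generic RANSAC plane fitting on a point cloud, written in pure Python. It is not the authors' Algorithm 2; the function names, the distance threshold, the iteration count, and the synthetic point cloud are all assumptions chosen for illustration.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d): a*x + b*y + c*z + d = 0, unit normal


def plane_from_points(p: Point, q: Point, r: Point) -> Optional[Plane]:
    """Return the plane through three points, or None if they are collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The normal is the cross product of the two edge vectors.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = nx / norm, ny / norm, nz / norm
    return (a, b, c, -(a * p[0] + b * p[1] + c * p[2]))


def ransac_plane(points: List[Point], threshold: float = 0.01,
                 iterations: int = 200, seed: int = 0) -> Tuple[Optional[Plane], List[int]]:
    """Keep the candidate plane with the most points within `threshold` of it."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_plane: Optional[Plane] = None
    best_inliers: List[int] = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic cloud (assumed data): a 5x5 grid on the plane z = 0.5 plus three outliers.
cloud: List[Point] = [(0.1 * i, 0.1 * j, 0.5) for i in range(5) for j in range(5)]
cloud += [(0.05, 0.05, 0.9), (0.25, 0.15, 1.3), (0.35, 0.05, 0.1)]

plane, inliers = ransac_plane(cloud)
```

The returned inlier indices identify the points belonging to the dominant plane; the remaining points could then be passed to a clustering step.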

I did not find any major problems with the paper. It is objective, well described, and has a good experimental design.

I will only point out some minor corrections:

1. Please move all figures to right after their first citation.

2. Figure 2 caption is too long. Please consider using the text as a paragraph and replace the caption by a smaller one.

3. Title of Section 4 may be changed to Conclusion, it fits better.

4. A careful text revision is advised. I found some typos in the text: L88 - "Figue2"; in the paragraphs around L103 there are some extra dots, and the same in L116; L176 - "dose".

5. L102 - "we add some other algorithms as bellow" is too vague, maybe a description of it would be better.

Author Response

Author Response File: Author Response.pdf

Reviewer 3 Report

The article addresses a very relevant subject but is written at a middling level. There are some points that need to be corrected, so I recommend a major revision of the article. The title of the article completely corresponds to its content. However, the article has a few points that need to be corrected:

• The literature review is very limited and needs a deeper analysis.

• Some characters are poorly rendered in the text (for example, line 86). The authors must replace them so that the symbols are displayed correctly in the text.

• The main drawback of the article is that the authors focus on image recognition and do not describe the grasping process, possible errors, etc. at all. Additionally, there is no selection of contact points for the three-finger gripper shown in the figure. How does the selection of these points affect the success of the grasp? Is the shift of the center of mass for metal objects analyzed during grasping?

• In view of the previous remark, the authors should either revise the article in the direction of computer vision (changing the title and concept) or address the questions listed above regarding the algorithm for determining the contact position and other parameters that affect the grasping process. All these parameters need to be tested and demonstrated for different positions and orientations of the manipulated object.

Author Response

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

The authors have significantly improved the article. However, I would recommend adding a picture with an explanation to the article that was provided in the reviewer's response (Or a video of the gripping process and the error that occurs from the link to it). Therefore, I recommend a minor revision of the article.

Author Response

Point 1: The authors have significantly improved the article. However, I would recommend adding a picture with an explanation to the article that was provided in the reviewer's response (Or a video of the gripping process and the error that occurs from the link to it). Therefore, I recommend a minor revision of the article.

Response 1: Thank you for the comment.

In the paper, we explained that this grasping task is only one part of a larger task. We recorded a video that shows the complete process of the larger task, so we have taken a part of that video and uploaded it to the Internet.

The link to the video: https://drive.google.com/file/d/1d9Qt-Ew0utBOCk6pxRhhGpsPirPiEwM_/view?usp=sharing

The identification of the centre of the hole and the phase of approaching the hole centre belong to the task in the next stage, which is the peg-in-hole assembly.
