Article
Peer-Review Record

Improved YOLOv5-Based Lightweight Object Detection Algorithm for People with Visual Impairment to Detect Buses

by Rio Arifando *, Shinji Eto and Chikamune Wada
Submission received: 18 April 2023 / Revised: 2 May 2023 / Accepted: 3 May 2023 / Published: 8 May 2023
(This article belongs to the Topic AI Enhanced Civil Infrastructure Safety)

Round 1

Reviewer 1 Report

Thank you for the invitation to review this article. In my opinion, this is a good and interesting scientific study, performed at a high level and with a high-quality presentation.

Figure 1, Figure 5:

The text refers to the head, but the figures indicate only the Prediction, not the head. I think the figures need to match the text.

In the discussion:

Could you compare the results with other object detection models (for buses), such as Mask R-CNN and Faster R-CNN, in terms of accuracy and other criteria?

Conclusions:

The work deals with bus recognition. However, it seems to me that in this case (for people with low vision) it will also be important to recognize the bus route. Are there plans to address this issue in future research?

Author Response

Dear Reviewer,

Thank you for your valuable feedback and constructive comments on our research paper. We greatly appreciate your input and have made the necessary revisions accordingly.

Regarding the issue with Figure 1 and Figure 5, we have updated the figures to ensure that they match the text by indicating the head.

Unfortunately, we were not able to compare the proposed model with other object detection models such as Mask R-CNN and Faster R-CNN due to time constraints. Applying different models and conducting the necessary training would require a significant amount of time and resources, which was not feasible within the revision period.

For this reason, we would like to clarify our choice of using different versions of the YOLO model as a basis for comparison in the "Performance Comparison with different models" section. We compared versions 3, 4, 5, and 8 of YOLO because each version uses a significantly different architecture and has different authors. We felt that this comparison would provide a more comprehensive understanding of the strengths and weaknesses of each version.

We completely agree with your suggestion to address the issue of recognizing the bus route in future research. While our current study focuses on bus recognition for people with low vision, recognizing the bus route would be an important aspect to consider for practical use in real-world scenarios. We plan to address this issue in our future research.

Once again, we thank you for your valuable feedback and for taking the time to review our paper. We hope that our revised paper meets your expectations and look forward to hearing back from you.

Best regards,
Rio Arifando

Reviewer 2 Report

This paper proposes a lightweight bus detection model based on the YOLOv5 architecture to address the significant challenges that visually impaired individuals face while waiting for buses. The proposed model integrates GhostConv and C3Ghost modules into the YOLOv5 network and replaces SPPF in the backbone with SimSPPF for increased computational efficiency and accurate object detection.
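For context, the core idea of the Ghost design is to generate half of the output channels with a standard convolution and the other half with a cheap depthwise convolution. A minimal PyTorch-style sketch follows; the 5x5 cheap kernel, SiLU activation, and half/half channel split are common choices and are assumptions here, not necessarily the exact configuration used in the paper.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution sketch: a standard conv produces half the output
    channels; a cheap depthwise conv generates the remaining "ghost" maps."""

    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_hidden = c_out // 2
        # Primary (expensive) convolution
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_hidden),
            nn.SiLU(),
        )
        # Cheap operation: depthwise 5x5 conv over the primary features
        self.cheap = nn.Sequential(
            nn.Conv2d(c_hidden, c_hidden, 5, 1, 2, groups=c_hidden, bias=False),
            nn.BatchNorm2d(c_hidden),
            nn.SiLU(),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

# Example: 64 -> 128 channels on an 80x80 feature map
print(GhostConv(64, 128)(torch.randn(1, 64, 80, 80)).shape)  # [1, 128, 80, 80]
```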

However, there are some issues that need to be addressed.

(1) The abstract should be improved and simplified to better highlight the contribution of the paper.

(2) This paper selected 1202 images as the training set to train the proposed method and achieved good results. How does the sample size of the training set affect the proposed method? Would reducing the number of training samples result in a significant decrease in the performance of the proposed method?

(3) If possible, several state-of-the-art deep learning methods could be reviewed, such as 10.1016/j.measurement.2022.111997, 10.3390/app122211821, and 10.1504/IJHM.2022.10044141.

Author Response

Dear Reviewer,

Thank you for taking the time to review our research paper and for your insightful comments. We appreciate your input and have made the necessary revisions accordingly.

Regarding your first comment on the abstract, we understand the importance of having a clear and concise summary of our work. As such, we have simplified and improved the abstract to better highlight the contribution of the paper.

In response to your second comment, we agree that the sample size of the training set can significantly affect the performance of the proposed method. Our study used 1202 images as the training set and achieved good results. However, reducing the number of training samples may lead to a decrease in performance as the algorithm may not be able to capture the necessary features and patterns of the object classes accurately. This is a valuable point that we have addressed in the revised version of the paper.

Regarding your third comment, we appreciate your suggestion to review state-of-the-art deep learning methods such as 10.1016/j.measurement.2022.111997, 10.3390/app122211821, and 10.1504/IJHM.2022.10044141. However, due to time constraints during the revision process, we were unable to include a detailed review of these methods in our paper. Applying different models and conducting the necessary training would require a significant amount of time and resources, which was not feasible within the revision period.

We would like to clarify our choice of using different versions of the YOLO model as a basis for comparison in the "Performance Comparison with different models" section. We compared versions 3, 4, 5, and 8 of YOLO because each version uses a significantly different architecture and has different authors. We felt that this comparison would provide a more comprehensive understanding of the strengths and weaknesses of each version.

Finally, we would like to thank you for your interest in our research and for your valuable feedback. We hope that our revised paper meets your expectations and we look forward to hearing your thoughts.

Sincerely,
Rio Arifando


Reviewer 3 Report

The paper is of high quality and is presented well, but the authors need to describe the nature of the dataset in more detail.

Does your model work on real-time videos?

Can your model predict the location and the next position of the bus?

Did you apply it to videos? If so, how were the results?

The modified version of YOLOv5 enhances the results, but only slightly. Is there a reasonable explanation for that?

Could you compare your achievements with the literature?

Can your model be developed to include more subjects?

Author Response

Dear Reviewer,

Thank you for your feedback on our paper. We appreciate your comments and suggestions. Regarding the dataset, we agree that a more detailed description would be helpful and we will include this information in the revised manuscript.

The processing speed of our model on real-time video may vary depending on the hardware specifications. However, we have tested our model on images and measured its performance in terms of inference time per image (ms), a metric that can equally be applied to video frames. We acknowledge that testing on videos would provide a more comprehensive evaluation of our model's performance in real-time scenarios, and we will consider this for future work.
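As an illustration of how the per-frame inference time could be measured on video, a rough sketch is shown below; the video file name is a placeholder, and the standard YOLOv5s hub model stands in for our modified network, whose weights could be loaded the same way.

```python
import time
import cv2
import torch

# Placeholder model: custom weights could be loaded instead via
# torch.hub.load("ultralytics/yolov5", "custom", path="best.pt").
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
model.eval()

cap = cv2.VideoCapture("bus_stop.mp4")  # hypothetical test video
times_ms = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    start = time.perf_counter()
    _ = model(rgb)  # detection on one frame
    times_ms.append((time.perf_counter() - start) * 1000.0)
cap.release()

if times_ms:
    mean_ms = sum(times_ms) / len(times_ms)
    print(f"mean inference: {mean_ms:.1f} ms/frame ({1000.0 / mean_ms:.1f} FPS)")
```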

Regarding the question about whether our model can predict the location and the next position of the bus, we would like to clarify that our model is designed to detect and classify buses in images; predicting their future positions is not within the scope of this paper. However, it is possible to incorporate such predictions in future work by using additional data and training the model accordingly. Nonetheless, it is worth noting that predicting the next position of a moving object can be challenging and requires additional techniques beyond object detection and classification. Therefore, it would require further research and development to achieve accurate predictions.
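Purely as an illustration of the kind of extension we have in mind, and not something implemented in this paper, a naive constant-velocity extrapolation over two consecutive detections could be prototyped as sketched below; in practice a tracker such as a Kalman filter would be more robust.

```python
def predict_next_center(prev_box, curr_box):
    """Constant-velocity extrapolation of a bounding-box center one frame
    ahead, given (x1, y1, x2, y2) pixel boxes from two consecutive frames."""
    px, py = (prev_box[0] + prev_box[2]) / 2, (prev_box[1] + prev_box[3]) / 2
    cx, cy = (curr_box[0] + curr_box[2]) / 2, (curr_box[1] + curr_box[3]) / 2
    return (2 * cx - px, 2 * cy - py)

# Example: a bus moving roughly 20 px to the right per frame
print(predict_next_center((100, 200, 300, 400), (120, 200, 320, 400)))  # (240.0, 300.0)
```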

In our study, we observed a slight improvement in results after making partial modifications to the YOLOv5 architecture. However, it is worth noting that the degree of improvement can be influenced by several factors. The complexity and diversity of the dataset used for training can have a significant impact on the performance of the model. In addition, the choice of hyperparameters, training techniques, and testing metrics can also affect the model's performance. Moreover, the extent of the modification to the architecture can also play a role in determining the degree of improvement achieved. In our case, since we only made partial modifications, we might not have fully leveraged the potential of the architecture.

Regarding the comparison of our achievements with the literature, our study compared our results with several models using four different metrics. However, we did not provide a comparison with similar studies in the literature, as there were no comparable studies on the same subject available for comparison. Nonetheless, we believe our study provides valuable insights into the performance of our model, which can contribute to the development of similar studies in the future.

We would like to express our sincere gratitude for your interest in our research and for providing us with your valuable feedback. We have carefully considered your comments and have incorporated them into our revised paper. We hope that the updated version meets your expectations and we are eager to receive your feedback.

Best regards,
Rio Arifando
