Artificial Intelligence for Computer Vision

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (15 March 2021) | Viewed by 16538

Special Issue Editor


Prof. Dr. Hee-Deok Yang
Guest Editor
Dept. of Computer Engineering, Chosun University, 309 Pilmun-daero, Dong-gu, Gwangju 501-759, Korea
Interests: machine learning; machine vision; computational imaging; signal processing; LIDAR data processing

Special Issue Information

Dear Colleagues,

Computer vision systems are an integral part of modern security, manufacturing, and industrial processes. They are widely used in applications such as automotive navigation, intelligent surveillance, robot guidance, human-assistive systems, product classification, and defect inspection. These systems face a number of challenges, including object segmentation, object recognition, object tracking, image enhancement, LIDAR data processing, and 3D scene reconstruction.

Recently, Artificial Intelligence (AI)-based computer vision systems have played a crucial role in many applications, and AI is expected to be the main approach of the next generation of computer vision research. The explosive growth in the number of AI algorithms and the increasing computational power of modern computers have significantly extended the range of potential applications for computer vision.

This Special Issue of the journal Applied Sciences on “Artificial Intelligence for Computer Vision” focuses on computer vision research based on AI. It solicits state-of-the-art research findings from both academia and industry, with a particular emphasis on novel techniques that demonstrate the impact of AI on computer vision research and its related applications. Topics of interest include, but are not limited to, the following:

  • Theoretical foundations of artificial intelligence and computer vision;
  • RGB-D vision;
  • Computational imaging;
  • Object tracking, detection, segmentation, and recognition;
  • Deep learning for computer vision;
  • Big data analysis for computer vision;
  • User experience for computer vision systems;
  • 3D scene reconstruction.

Prof. Dr. Hee-Deok Yang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Artificial Intelligence
  • Deep learning
  • Computer vision
  • Image analysis
  • Video analysis
  • Visual intelligence
  • 3D scene reconstruction
  • Signal analysis

Published Papers (6 papers)


Research

14 pages, 2100 KiB  
Article
Skin Lesion Segmentation by U-Net with Adaptive Skip Connection and Structural Awareness
by Tran-Dac-Thinh Phan, Soo-Hyung Kim, Hyung-Jeong Yang and Guee-Sang Lee
Appl. Sci. 2021, 11(10), 4528; https://doi.org/10.3390/app11104528 - 15 May 2021
Cited by 24 | Viewed by 2572
Abstract
Skin lesion segmentation is one of the pivotal stages in the diagnosis of melanoma. Many methods have been proposed, but to date this remains a challenging task. Variations in size and color, fuzzy boundaries, and the low contrast between lesion and normal skin lead to deficient or excessive delineation of lesions, or even to inaccurate lesion localization. In this paper, to counter these problems, we introduce a deep learning method based on the U-Net architecture that performs three tasks: lesion segmentation, boundary distance map regression, and contour detection. The two auxiliary tasks give the main encoder an awareness of boundary and shape, which improves object localization and pixel-wise classification in the transition region from lesion tissue to healthy tissue. Moreover, to handle the large variation in size, Selective Kernel modules placed in the skip connections transfer multi-receptive-field features from the encoder to the decoder. Our method is evaluated on three publicly available datasets: ISBI 2016, ISBI 2017, and PH2. Extensive experimental results show the effectiveness of the proposed method for skin lesion segmentation.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
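The article's code is not reproduced on this page; as a rough illustration of the kind of module the abstract describes, the following is a minimal PyTorch sketch of a Selective Kernel-style block that could sit on a U-Net skip connection and fuse features from two receptive fields. The branch design, channel sizes, and names are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SelectiveKernelBlock(nn.Module):
    """Fuse two receptive fields with learned, input-dependent weights.

    Illustrative SK-style block as it might be placed on a U-Net skip
    connection; channel sizes and branch choices are arbitrary.
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Branch 1: standard 3x3 receptive field.
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        # Branch 2: dilated 3x3 conv approximating a 5x5 receptive field.
        self.branch5 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        hidden = max(channels // reduction, 8)
        self.squeeze = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(inplace=True))
        self.attn = nn.Linear(hidden, channels * 2)   # one score per branch and channel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                # global average pooling
        a = self.attn(self.squeeze(s))                # (B, 2*C)
        a = a.view(x.size(0), 2, -1).softmax(dim=1)   # softmax over the two branches
        w3, w5 = a[:, 0, :, None, None], a[:, 1, :, None, None]
        return w3 * u3 + w5 * u5                      # multi-receptive-field fusion

if __name__ == "__main__":
    feat = torch.randn(1, 64, 56, 56)                 # a hypothetical encoder feature map
    print(SelectiveKernelBlock(64)(feat).shape)       # torch.Size([1, 64, 56, 56])
```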

19 pages, 7278 KiB  
Article
Multi-Task Learning for Medical Image Inpainting Based on Organ Boundary Awareness
by Minh-Trieu Tran, Soo-Hyung Kim, Hyung-Jeong Yang and Guee-Sang Lee
Appl. Sci. 2021, 11(9), 4247; https://doi.org/10.3390/app11094247 - 07 May 2021
Cited by 7 | Viewed by 3512
Abstract
Distorted medical images can significantly hamper medical diagnosis, notably in the analysis of Computed Tomography (CT) images and in organ segmentation. Therefore, improving the accuracy of diagnostic imagery and reconstructing damaged portions are important for medical diagnosis. Recently, these issues have been studied extensively in the field of medical image inpainting. Inpainting techniques are emerging in medical image analysis because local deformations are common in medical modalities due to factors such as metallic implants, foreign objects, or specular reflections during image capture. The completion of such missing or distorted regions is important for enhancing post-processing tasks such as segmentation or classification. In this paper, a novel framework for medical image inpainting is presented: a multi-task learning model for CT images that learns the shape and structure of the organs of interest. This is accomplished by training the prediction of edges and organ boundaries jointly with the image inpainting, whereas state-of-the-art methods focus only on the inpainted area without considering the global structure of the target organ. As a result, our model reproduces medical images with sharp contours and exact organ locations, and generates more realistic and believable images than other approaches. In quantitative evaluation, the proposed method achieved the best results in the literature so far: a PSNR of 43.44 dB and an SSIM of 0.9818 for square-shaped regions, and a PSNR of 38.06 dB and an SSIM of 0.9746 for arbitrary-shaped regions. By learning the detailed structure of organs, the proposed model generates sharp and clear inpainted images, showing the promise of the method for medical image analysis, where the completion of missing or distorted regions remains a challenging task.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
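As a hedged illustration of the multi-task training idea described above (inpainting trained jointly with edge and organ-boundary prediction), the sketch below combines three losses with fixed weights. The specific loss functions, weights, and tensor shapes are assumptions for illustration only, not the authors' published configuration.

```python
import torch
import torch.nn.functional as F

def multi_task_inpainting_loss(pred_image, pred_edges, pred_boundary,
                               gt_image, gt_edges, gt_boundary,
                               w_rec=1.0, w_edge=0.1, w_bnd=0.1):
    """Hypothetical weighted sum of the three objectives named in the abstract:
    image reconstruction plus edge and organ-boundary prediction."""
    rec = F.l1_loss(pred_image, gt_image)                             # inpainting reconstruction
    edge = F.binary_cross_entropy_with_logits(pred_edges, gt_edges)   # edge map prediction
    bnd = F.binary_cross_entropy_with_logits(pred_boundary, gt_boundary)  # organ boundary prediction
    return w_rec * rec + w_edge * edge + w_bnd * bnd

if __name__ == "__main__":
    img = torch.rand(2, 1, 128, 128)                                  # dummy CT slices
    loss = multi_task_inpainting_loss(
        img, torch.randn(2, 1, 128, 128), torch.randn(2, 1, 128, 128),
        img, torch.rand(2, 1, 128, 128).round(), torch.rand(2, 1, 128, 128).round())
    print(loss.item())
```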

12 pages, 11457 KiB  
Article
Finding the Differences in Capillaries of Taste Buds between Smokers and Non-Smokers Using the Convolutional Neural Networks
by Hang Nguyen Thi Phuong, Choon-Sung Shin and Hie-Yong Jeong
Appl. Sci. 2021, 11(8), 3460; https://doi.org/10.3390/app11083460 - 12 Apr 2021
Viewed by 2096
Abstract
Taste function may serve as a tool for demonstrating, with an objectively measured effect, the impact of smoking on a subject's own body, because smokers exhibit significantly lower taste sensitivity than non-smokers. This study proposed a visual method for measuring the capillaries of taste buds with capillaroscopy and classified the difference between smokers and non-smokers using convolutional neural networks (CNNs). The dataset was collected from 26 human subjects (13 smokers and 13 non-smokers) through capillaroscopy at low and high magnification, and consisted of 2600 images. Gradient-weighted class activation mapping (Grad-CAM) enabled us to understand the difference in the capillaries of taste buds between smokers and non-smokers. The CNNs achieved a good performance of 79% accuracy. In contrast, conventional methods such as the structural similarity index (SSIM) and the scale-invariant feature transform (SIFT) extracted too few features to classify the two groups reliably.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
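Grad-CAM, mentioned in the abstract, is a standard technique; the following minimal PyTorch sketch shows the general recipe of weighting a convolutional layer's activations by the spatial mean of their gradients with respect to the target class score. The resnet18 backbone and the chosen layer are placeholders, not the network used in the study.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def grad_cam(model, layer, image, class_idx=None):
    """Minimal Grad-CAM sketch: channel weights come from gradient averages,
    and the weighted activations form a class-discriminative heatmap."""
    acts, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    logits = model(image)
    idx = logits.argmax(dim=1) if class_idx is None else torch.tensor([class_idx])
    model.zero_grad()
    logits[0, idx].backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)          # per-channel importance
    cam = F.relu((weights * acts[0]).sum(dim=1, keepdim=True)) # weighted activation map
    return F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)

if __name__ == "__main__":
    model = resnet18(weights=None).eval()                      # placeholder backbone
    heat = grad_cam(model, model.layer4, torch.randn(1, 3, 224, 224))
    print(heat.shape)                                          # (1, 1, 224, 224)
```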

14 pages, 1928 KiB  
Article
Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
by Huy Manh Nguyen, Tomo Miyazaki, Yoshihiro Sugaya and Shinichiro Omachi
Appl. Sci. 2021, 11(7), 3214; https://doi.org/10.3390/app11073214 - 03 Apr 2021
Cited by 1 | Viewed by 1741
Abstract
Visual-semantic embedding aims to learn a joint embedding space in which related video and sentence instances are located close to each other. Most existing methods place instances in a single embedding space. However, they struggle to embed instances because of the difficulty of matching the visual dynamics in videos to the textual features in sentences; a single space is not enough to accommodate various videos and sentences. In this paper, we propose a novel framework that maps instances into multiple individual embedding spaces so that multiple relationships between instances can be captured, leading to compelling video retrieval. The final similarity between instances is produced by fusing the similarities measured in each embedding space with a weighted sum, where the weights are determined from the sentence, allowing an embedding space to be emphasized flexibly. We conducted sentence-to-video retrieval experiments on a benchmark dataset. The proposed method achieved superior performance, with results competitive with state-of-the-art methods, demonstrating the effectiveness of the multiple embedding approach compared to existing methods.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
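A minimal sketch of the fusion idea described in the abstract: similarities computed in several embedding spaces are combined with sentence-dependent weights. All dimensions, the projection layers, and the weighting network are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiSpaceSimilarity(nn.Module):
    """Project video and sentence features into K separate embedding spaces,
    compute a cosine similarity in each, and fuse the similarities with
    weights predicted from the sentence (a weighted-sum strategy)."""

    def __init__(self, video_dim=2048, text_dim=768, embed_dim=512, num_spaces=3):
        super().__init__()
        self.video_proj = nn.ModuleList(nn.Linear(video_dim, embed_dim) for _ in range(num_spaces))
        self.text_proj = nn.ModuleList(nn.Linear(text_dim, embed_dim) for _ in range(num_spaces))
        self.weight_net = nn.Linear(text_dim, num_spaces)       # sentence-dependent fusion weights

    def forward(self, video_feat, text_feat):
        sims = []
        for vp, tp in zip(self.video_proj, self.text_proj):
            v = F.normalize(vp(video_feat), dim=-1)
            t = F.normalize(tp(text_feat), dim=-1)
            sims.append((v * t).sum(dim=-1))                    # cosine similarity per space
        sims = torch.stack(sims, dim=-1)                        # (B, K)
        weights = self.weight_net(text_feat).softmax(dim=-1)    # (B, K), determined by the sentence
        return (weights * sims).sum(dim=-1)                     # fused similarity

if __name__ == "__main__":
    sim = MultiSpaceSimilarity()(torch.randn(4, 2048), torch.randn(4, 768))
    print(sim.shape)                                            # torch.Size([4])
```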

16 pages, 3315 KiB  
Article
A New Real-Time Detection and Tracking Method in Videos for Small Target Traffic Signs
by Shaojian Song, Yuanchao Li, Qingbao Huang and Gang Li
Appl. Sci. 2021, 11(7), 3061; https://doi.org/10.3390/app11073061 - 30 Mar 2021
Cited by 18 | Viewed by 2808
Abstract
Finding a trade-off between real-time performance and high accuracy in detection, recognition, and tracking in videos is a challenging task for self-driving vehicles in real-world traffic scenarios. This issue is addressed in this paper with an improved YOLOv3 (You Only Look Once) and a multi-object tracking algorithm (Deep-SORT). First, data augmentation is employed for small-sample traffic signs to address the extremely unbalanced distribution of samples in the dataset. Second, a new YOLOv3 architecture is proposed to make it more suitable for detecting small targets: (1) the output feature map corresponding to 32-times subsampling of the input image in the original YOLOv3 structure is removed to reduce computational cost and improve real-time performance; (2) an output feature map with 4-times subsampling is added to improve the detection of small traffic signs; (3) Deep-SORT is integrated into the detection method to improve the precision and robustness of multi-object detection and the tracking ability in videos. Finally, our method demonstrated better detection capability than state-of-the-art approaches, with a precision, recall, and mAP of 91%, 90%, and 84.76%, respectively.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
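The following PyTorch sketch only illustrates the head-level change the abstract describes, namely dropping the stride-32 output and adding a stride-4 output for small targets; the backbone is omitted, the channel and class counts are assumptions, and the Deep-SORT integration is not shown.

```python
import torch
import torch.nn as nn

class SmallTargetHeads(nn.Module):
    """Prediction heads for a YOLOv3-like detector with the output scales
    described in the abstract: strides 16 and 8 are kept, the stride-32 head
    is removed, and a stride-4 head is added for small traffic signs."""

    def __init__(self, num_classes=45, anchors_per_scale=3):
        super().__init__()
        out_ch = anchors_per_scale * (5 + num_classes)          # boxes + objectness + classes
        self.heads = nn.ModuleDict({
            "stride16": nn.Conv2d(512, out_ch, 1),
            "stride8":  nn.Conv2d(256, out_ch, 1),
            "stride4":  nn.Conv2d(128, out_ch, 1),              # added for small targets
        })

    def forward(self, feats):
        # feats: dict of backbone/FPN feature maps keyed by stride.
        return {k: head(feats[k]) for k, head in self.heads.items()}

if __name__ == "__main__":
    feats = {"stride16": torch.randn(1, 512, 40, 40),
             "stride8":  torch.randn(1, 256, 80, 80),
             "stride4":  torch.randn(1, 128, 160, 160)}         # assumes a 640x640 input
    outs = SmallTargetHeads()(feats)
    print({k: tuple(v.shape) for k, v in outs.items()})
```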

16 pages, 3132 KiB  
Article
Wheel Hub Defects Image Recognition Based on Zero-Shot Learning
by Xiaohong Sun, Jinan Gu, Meimei Wang, Yanhua Meng and Huichao Shi
Appl. Sci. 2021, 11(4), 1529; https://doi.org/10.3390/app11041529 - 08 Feb 2021
Cited by 5 | Viewed by 2092
Abstract
In the wheel hub industry, quality control of the product surface determines the subsequent processing, and it can be realized through hub defect image recognition based on deep learning. Although existing deep learning methods have reached human-level performance, they rely on large-scale training sets and are unable to cope with classes for which no samples are available. Therefore, in this paper, a generalized zero-shot learning framework for hub defect image recognition was built. First, a reverse mapping strategy was adopted to reduce the hubness problem; then, a domain adaptation measure was employed to alleviate the projection domain shift problem; and finally, a scaling calibration strategy was used to avoid a recognition preference for seen defects. The proposed model was validated on two datasets, VOC2007 and a self-built hub defect dataset, and the results showed that the method performs better than current popular methods.
(This article belongs to the Special Issue Artificial Intelligence for Computer Vision)
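As a hedged illustration of the "scaling calibration" idea, the snippet below scales down the scores of seen classes before prediction, in the spirit of calibrated stacking in generalized zero-shot learning; the scaling factor and score matrix are made up, and this is not the authors' exact procedure.

```python
import numpy as np

def calibrated_scores(compat_scores, seen_mask, gamma=0.7):
    """Scale down the compatibility scores of seen defect classes so that
    unseen classes are not systematically out-ranked at prediction time."""
    scores = compat_scores.copy()
    scores[:, seen_mask] *= gamma          # penalize seen classes only
    return scores.argmax(axis=1)           # predicted class per sample

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.random((5, 6))            # 5 samples, 6 classes (4 seen, 2 unseen)
    seen = np.array([True, True, True, True, False, False])
    print(calibrated_scores(scores, seen))
```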
