Advanced Intelligent Imaging Technology Ⅱ

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (18 March 2021) | Viewed by 81863

Special Issue Editor


Prof. Joonki Paik
Guest Editor
Department of Image, Graduate School of Advanced Imaging Science, Chung-Ang University, Seoul 06974, Korea
Interests: image enhancement and restoration; computational imaging; intelligent surveillance systems

Special Issue Information

Dear Colleagues,

A general pipeline of visual information processing includes: (i) image sensing and acquisition, (ii) pre-processing, (iii) feature detection or metric estimation, and (iv) high-level decisions. State-of-the-art artificial intelligence technology has brought a quantum leap in performance to each step of this pipeline. In this context, deep learning-based image processing and computer vision algorithms have recently been developed and now actively lead the visual information processing field.

Artificial intelligence-based image signal processing (ISP) technology can drastically enhance acquired digital images through demosaicing, denoising, deblurring, super-resolution, and wide-dynamic-range imaging using deep neural networks. Feature detection and image analysis are the most popular application areas of artificial intelligence. An intelligent imaging system can solve various problems that are unsolvable without intelligence or learning.

An objective of this Special Issue is to highlight innovative developments in intelligent imaging technology related to state-of-the-art image acquisition, preprocessing, feature detection, and image analysis using machine learning and artificial intelligence. In addition, applications that combine two or more intelligent imaging methods constitute another important research area. Topics include but are not limited to:

  • Computational photography for intelligent imaging;
  • Visual inspection using machine learning and artificial intelligence;
  • Depth estimation and three-dimensional analysis;
  • Image processing and computer vision algorithms for advanced driver assistance systems (ADAS);
  • Wide-area intelligent surveillance systems using multiple-camera networks;
  • Advanced image signal processor (ISP) based on artificial intelligence;
  • Deep neural networks for inverse imaging problems;
  • Multiple camera collaboration based on reinforcement learning;
  • Fusion of hybrid sensors for intelligent imaging systems;
  • Deep learning architectures for intelligent image processing and computer vision;
  • Learning-based multimodal image processing;
  • Remote sensing and UAV image processing.

Prof. Joonki Paik
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers are published continuously in the journal (as soon as accepted) and listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep neural network (DNN);
  • Artificial neural network (ANN);
  • Intelligent surveillance systems;
  • Computational photography;
  • Computational imaging;
  • Image signal processor (ISP);
  • Camera network;
  • Visual inspection;
  • Multimodal imaging.


Published Papers (24 papers)


Research


15 pages, 6788 KiB  
Article
Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions
by Bonwoo Gu and Yunsick Sung
Appl. Sci. 2021, 11(3), 1291; https://0-doi-org.brum.beds.ac.uk/10.3390/app11031291 - 01 Feb 2021
Cited by 21 | Viewed by 3831
Abstract
Gomoku is a two-player board game that originated in ancient China. There are various cases of developing Gomoku using artificial intelligence, such as genetic algorithms and tree search algorithms. Alpha-Gomoku, a Gomoku AI built with AlphaGo's algorithm, defines all possible situations on the Gomoku board using Monte-Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated Gomoku board situations. However, in the tree search algorithm, the accuracy drops because the classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based high-level decision approach using convolutional neural networks (CNN). The proposed algorithm expresses each state as a one-hot-encoded vector and determines the state of the Gomoku board by combining similar one-hot-encoded vectors. Thus, in a case where the stone chosen by the CNN has already been placed or cannot be placed, we suggest a method for selecting an alternative. We verify the proposed Gomoku AI in GuPyEngine, a Python-based 3D simulation platform.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
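
To make the state representation concrete, here is a minimal sketch of one-hot encoding a Gomoku board, assuming a standard 15 × 15 board with three cell states (array shapes and names are illustrative, not the paper's implementation):

```python
import numpy as np

def one_hot_board(board):
    """Encode a 15x15 Gomoku board (0 = empty, 1 = black, 2 = white)
    as a flat vector of per-cell one-hot triples."""
    eye = np.eye(3, dtype=np.float32)          # one-hot basis for the 3 cell states
    return eye[board.reshape(-1)].reshape(-1)  # shape: (15 * 15 * 3,)

board = np.zeros((15, 15), dtype=np.int64)
board[7, 7] = 1                    # black stone in the center
state = one_hot_board(board)       # input vector for a CNN/MLP decision network
```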

22 pages, 32604 KiB  
Article
Object-Wise Video Editing
by Ashraf Siddique and Seungkyu Lee
Appl. Sci. 2021, 11(2), 671; https://0-doi-org.brum.beds.ac.uk/10.3390/app11020671 - 12 Jan 2021
Viewed by 2283
Abstract
Beyond time-frame editing of video data, object-level video editing, such as object removal or viewpoint change, is a challenging task. These tasks involve dynamic object segmentation, novel view video synthesis, and background inpainting. Background inpainting is the task of reconstructing unseen regions revealed by object removal or viewpoint change. In this paper, we propose a video editing method comprising foreground object removal, background inpainting, and novel view video synthesis under challenging conditions such as complex visual patterns, occlusion, overlaid clutter, and variation of depth with a moving camera. Our proposed method calculates a weighted confidence score on the basis of the normalized difference between the observed depth and the predicted distance in 3D space. A set of potential points from epipolar lines in neighboring frames is collected, refined, and weighted to select a small number of highly qualified observations to fill the desired region of interest in the current frame. Based on the background inpainting method, novel view video synthesis is conducted with arbitrary viewpoints. Our method is evaluated on both a public dataset and our own video clips and compared with multiple state-of-the-art methods, showing superior performance.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
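
A toy version of the confidence weighting described above, assuming a Gaussian falloff on the normalized depth difference (the function name, sigma, and exact weighting are assumptions, not the paper's formulation):

```python
import numpy as np

def confidence_weights(observed_depth, predicted_dist, sigma=0.05):
    """Weight candidate observations by how well the observed depth
    matches the distance predicted in 3D space."""
    diff = np.abs(observed_depth - predicted_dist) / np.maximum(predicted_dist, 1e-6)
    return np.exp(-diff / sigma)  # high weight when observation matches prediction

# candidate pixels sampled along epipolar lines in neighboring frames
obs = np.array([1.02, 1.50, 0.98])
pred = np.array([1.00, 1.00, 1.00])
w = confidence_weights(obs, pred)  # keep only the highest-weighted observations
```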

21 pages, 10826 KiB  
Article
Enhancement of Multi-Target Tracking Performance via Image Restoration and Face Embedding in Dynamic Environments
by Ji Seong Kim, Doo Soo Chang and Yong Suk Choi
Appl. Sci. 2021, 11(2), 649; https://0-doi-org.brum.beds.ac.uk/10.3390/app11020649 - 11 Jan 2021
Cited by 2 | Viewed by 1881
Abstract
In this paper, we propose several methods to improve the performance of multiple object tracking (MOT), especially for humans, in dynamic environments such as robots and autonomous vehicles. The first method restores and re-detects unreliable results to improve detection. The second restores noisy regions in the image before the tracking association to improve identification. To implement the image restoration function used in these two methods, an image inference model based on SRGAN (super-resolution generative adversarial networks) is used. Finally, the third method is an association method using face features to reduce failures in the tracking association. Three distance measurements are designed so that this method can be applied to various environments. To validate the effectiveness of our proposed methods, we select two baseline trackers for comparative experiments and construct a robotic environment that interacts with real people and provides services. Experimental results demonstrate that the proposed methods efficiently overcome dynamic situations and show favorable performance in general situations.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
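
As an illustration of face-feature association, a minimal sketch using cosine distance between embeddings (one plausible choice; the paper's three distance measurements are not reproduced here, and the threshold is assumed):

```python
import numpy as np

def face_distance(track_feat, det_feat):
    """Cosine distance between L2-normalized face embeddings."""
    a = track_feat / np.linalg.norm(track_feat)
    b = det_feat / np.linalg.norm(det_feat)
    return 1.0 - float(a @ b)

def associate(tracks, det_feat, thresh=0.4):
    """Assign a detection to the nearest track, or to no track at all
    if the best distance exceeds the threshold."""
    dists = {tid: face_distance(f, det_feat) for tid, f in tracks.items()}
    tid, d = min(dists.items(), key=lambda kv: kv[1])
    return tid if d < thresh else None
```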

18 pages, 35463 KiB  
Article
A Sample Weight and AdaBoost CNN-Based Coarse to Fine Classification of Fruit and Vegetables at a Supermarket Self-Checkout
by Khurram Hameed, Douglas Chai and Alexander Rassau
Appl. Sci. 2020, 10(23), 8667; https://0-doi-org.brum.beds.ac.uk/10.3390/app10238667 - 03 Dec 2020
Cited by 23 | Viewed by 4063
Abstract
The physical features of fruit and vegetables make the task of vision-based classification challenging. Classification at a supermarket self-checkout poses even more challenges due to variable lighting conditions and human factors arising from customer interactions with the system, along with the challenges associated with the colour, texture, shape, and size of a fruit or vegetable. Considering this complex application, we have proposed a progressive coarse-to-fine classification technique to classify fruit and vegetables at supermarket checkouts. The image and weight of fruit and vegetables were obtained using a prototype designed to simulate the supermarket environment, including the lighting conditions. The weight information is used to reduce the coarse classification from 15 classes down to three, which are further used in AdaBoost-based Convolutional Neural Network (CNN) optimisation for fine classification. The training samples for each coarse class are weighted based on AdaBoost optimisation and updated on each iteration of the training phase. The multi-class likelihood distribution obtained by the fine classification stage is used to estimate the final classification with a softmax classifier. GoogleNet, MobileNet, and a custom CNN have been used for AdaBoost optimisation, with promising classification results.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
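
A minimal sketch of the AdaBoost-style sample reweighting idea, in its textbook form (the paper's exact update rule for CNN training may differ):

```python
import numpy as np

def adaboost_reweight(weights, correct, eps=1e-9):
    """One reweighting pass: misclassified samples gain weight so the
    next training iteration focuses on them."""
    err = np.sum(weights[~correct]) / (np.sum(weights) + eps)
    alpha = 0.5 * np.log((1.0 - err + eps) / (err + eps))
    weights = weights * np.exp(alpha * np.where(correct, -1.0, 1.0))
    return weights / weights.sum(), alpha

w = np.full(8, 1 / 8)                                     # uniform initial weights
correct = np.array([1, 1, 0, 1, 0, 1, 1, 1], dtype=bool)  # per-sample results
w, alpha = adaboost_reweight(w, correct)                  # use w in the next epoch
```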

20 pages, 14166 KiB  
Article
Iterative Refinement of Uniformly Focused Image Set for Accurate Depth from Focus
by Sherzod Salokhiddinov and Seungkyu Lee
Appl. Sci. 2020, 10(23), 8522; https://0-doi-org.brum.beds.ac.uk/10.3390/app10238522 - 28 Nov 2020
Cited by 4 | Viewed by 2554
Abstract
Estimating the 3D shape of a scene from a differently focused set of images has been a practical approach for 3D reconstruction with color cameras. However, depth reconstructed with existing depth from focus (DFF) methods still suffers from poor quality in textureless and object-boundary regions. In this paper, we propose an improved depth-from-focus estimation that iteratively refines the 3D shape from a uniformly focused image set (UFIS). We investigated the appearance changes in the spatial and frequency domains in an iterative manner. To achieve sub-frame accuracy in depth estimation, the optimal location of the focused frame in DFF is estimated by fitting a polynomial curve to the dissimilarity measurements. To avoid wrong depth values in textureless regions, we propose building a confidence map and using it to identify erroneous depth estimates. We evaluated our method on public and our own datasets obtained from different types of devices, such as smartphones and medical and normal color cameras. Quantitative and qualitative evaluations on various test image sets show the promising performance of the proposed method.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
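
The sub-frame focus estimation can be illustrated by fitting a parabola to the dissimilarity curve around its minimum; a minimal sketch (window size and names are assumptions):

```python
import numpy as np

def subframe_focus(dissimilarity):
    """Fit a parabola around the best-focused frame and return the vertex,
    giving a fractional (sub-frame) focus location."""
    k = int(np.argmin(dissimilarity))            # frame with minimum dissimilarity
    lo, hi = max(k - 2, 0), min(k + 3, len(dissimilarity))
    x = np.arange(lo, hi, dtype=np.float64)
    a, b, _ = np.polyfit(x, dissimilarity[lo:hi], 2)
    return -b / (2 * a) if a > 0 else float(k)   # vertex of the fitted parabola

d = np.array([0.9, 0.5, 0.2, 0.25, 0.7, 0.95])  # per-pixel dissimilarity per frame
z = subframe_focus(d)                            # fractional frame index ~ depth
```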

16 pages, 643 KiB  
Article
Using Common Spatial Patterns to Select Relevant Pixels for Video Activity Recognition
by Itsaso Rodríguez-Moreno, José María Martínez-Otzeta, Basilio Sierra, Itziar Irigoien, Igor Rodriguez-Rodriguez and Izaro Goienetxea
Appl. Sci. 2020, 10(22), 8075; https://0-doi-org.brum.beds.ac.uk/10.3390/app10228075 - 14 Nov 2020
Cited by 2 | Viewed by 1702
Abstract
Video activity recognition, despite being an emerging task, has been the subject of important research because of its many everyday applications. Video camera surveillance could benefit greatly from advances in this field. In the area of robotics, the tasks of autonomous navigation or social interaction could also take advantage of the knowledge extracted from live video recordings. In this paper, a new approach for video action recognition is presented. The new technique introduces a method usually used in Brain-Computer Interface (BCI) for electroencephalography (EEG) systems and adapts it to this problem. After describing the technique, the achieved results are shown and a comparison with another method is carried out to analyze the performance of our new approach.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
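
The borrowed BCI method is Common Spatial Patterns (CSP); here is a compact sketch of its standard two-class formulation, with video-derived channel signals standing in for EEG channels (shapes and names are illustrative):

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X1, X2, n_filters=4):
    """CSP via the generalized eigenproblem C1 w = lambda (C1 + C2) w.
    X1, X2: (channels, samples) signals for the two activity classes."""
    C1, C2 = np.cov(X1), np.cov(X2)
    vals, vecs = eigh(C1, C1 + C2)        # generalized eigendecomposition
    order = np.argsort(vals)              # extreme eigenvalues discriminate best
    pick = np.r_[order[:n_filters // 2], order[-n_filters // 2:]]
    return vecs[:, pick].T                # spatial filters, one per row
```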

18 pages, 4451 KiB  
Article
Sparse Representation Graph for Hyperspectral Image Classification Assisted by Class Adjusted Spatial Distance
by Wanghao Xu, Siqi Luo, Yunfei Wang, Youqiang Zhang and Guo Cao
Appl. Sci. 2020, 10(21), 7740; https://0-doi-org.brum.beds.ac.uk/10.3390/app10217740 - 01 Nov 2020
Viewed by 2384
Abstract
In the past few years, sparse representation (SR) graph-based semi-supervised learning (SSL) has drawn a lot of attention for its impressive performance in hyperspectral image classification with small numbers of training samples. Among these methods, the probabilistic class structure regularized sparse representation (PCSSR) approach, which introduces the probabilistic relationship between samples into the SR process, has shown its superiority over state-of-the-art approaches. However, this category of classification methods only applies another SR process to generate the probabilistic relationship, which focuses only on the spectral information and fails to utilize the spatial information. In this paper, we propose using the class adjusted spatial distance (CASD) to measure the distance between each pair of samples. We incorporate the proposed CASD-based distance information into PCSSR to further increase the discriminability of the original PCSSR approach. The proposed method considers not only the spectral information but also the spatial information of the hyperspectral data, consequently leading to significant performance improvement. Experimental results on different datasets demonstrate that, compared with state-of-the-art classification models, the proposed method achieves the highest overall accuracies of 99.71%, 97.13%, and 97.07% on the Botswana (BOT), Kennedy Space Center (KSC) and truncated Indian Pines (PINE) datasets, respectively, with a small number of training samples selected from each class.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)

21 pages, 13320 KiB  
Article
Toward Scalable Video Analytics Using Compressed-Domain Features at the Edge
by Dien Van Nguyen and Jaehyuk Choi
Appl. Sci. 2020, 10(18), 6391; https://0-doi-org.brum.beds.ac.uk/10.3390/app10186391 - 14 Sep 2020
Cited by 4 | Viewed by 3144
Abstract
Intelligent video analytics systems have come to play an essential role in many fields, including public safety, transportation safety, and many other industrial areas, as automated tools for extracting data from and analyzing huge datasets such as multiple live video streams transmitted from a large number of cameras. A key characteristic of such systems is that real-time analytics is critical in order to provide timely, actionable alerts on various tasks, activities, and conditions. Due to the computation-intensive and bandwidth-intensive nature of these operations, however, video analytics servers may not fulfill the requirements when serving a large number of cameras simultaneously. To handle these challenges, we present an edge computing-based system that minimizes the transfer of video data from the surveillance camera feeds to a cloud video analytics server. Based on a novel approach of utilizing the information in the encoded bitstream, the edge can achieve low processing complexity for object tracking in surveillance videos and filter non-motion frames from the data that will be forwarded to the cloud server. To demonstrate the effectiveness of our approach, we implemented a video surveillance prototype consisting of edge devices with low computational capacity and a GPU-enabled server. The evaluation results show that our method can efficiently capture the characteristics of each frame and is compatible with the edge-to-cloud platform in terms of accuracy and delay sensitivity. The average processing time of this method is approximately 39 ms/frame for high-definition video, which outperforms most state-of-the-art methods. In addition, the method helps the cloud server reduce the GPU load by 49%, the CPU load by 49%, and the network traffic by 55% while maintaining the accuracy of video analytics event detection.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
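
The frame-filtering step can be pictured as thresholding motion-vector activity parsed from the encoded bitstream; a rough sketch (both thresholds are assumptions, not the paper's tuned values):

```python
import numpy as np

def keep_frame(motion_vectors, mag_thresh=1.0, ratio_thresh=0.02):
    """Forward a frame to the cloud only if enough macroblocks are moving."""
    mags = np.linalg.norm(motion_vectors, axis=-1)  # per-macroblock MV magnitude
    moving = np.mean(mags > mag_thresh)             # fraction of moving blocks
    return moving > ratio_thresh                    # drop near-static frames

mvs = np.random.randn(45, 80, 2) * 0.1              # a mostly static frame
print(keep_frame(mvs))                              # likely False -> not forwarded
```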

24 pages, 4000 KiB  
Article
A Low-Cost Automated Digital Microscopy Platform for Automatic Identification of Diatoms
by Jesús Salido, Carlos Sánchez, Jesús Ruiz-Santaquiteria, Gabriel Cristóbal, Saul Blanco and Gloria Bueno
Appl. Sci. 2020, 10(17), 6033; https://0-doi-org.brum.beds.ac.uk/10.3390/app10176033 - 31 Aug 2020
Cited by 27 | Viewed by 6150
Abstract
Currently, microalgae (i.e., diatoms) constitute a generally accepted bioindicator of water quality and therefore provide an index of the status of biological ecosystems. Diatom detection for specimen counting and sample classification are two difficult, time-consuming tasks for the few existing expert diatomists. To mitigate this challenge, in this work, we propose a fully operative low-cost automated microscope, integrating algorithms for: (1) stage and focus control, (2) image acquisition (slide scanning, stitching, contrast enhancement), and (3) diatom detection and prospective specimen classification (among 80 taxa). Deep learning algorithms have been applied to overcome the difficult selection of image descriptors imposed by classical machine learning strategies. The best results were obtained by deep neural networks, with a maximum precision of 86% for detection (with the YOLO network) and 99.51% for classification among 80 different species (with the AlexNet network). All the developed operational modules are integrated and controlled by the user from the developed graphical user interface running on the main controller. With this operative platform, this work provides a quite useful toolbox for phycologists in their daily challenging tasks of identifying and classifying diatoms.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)

22 pages, 3490 KiB  
Article
Hybrid Learning of Hand-Crafted and Deep-Activated Features Using Particle Swarm Optimization and Optimized Support Vector Machine for Tuberculosis Screening
by Khin Yadanar Win, Noppadol Maneerat, Kazuhiko Hamamoto and Syna Sreng
Appl. Sci. 2020, 10(17), 5749; https://0-doi-org.brum.beds.ac.uk/10.3390/app10175749 - 20 Aug 2020
Cited by 20 | Viewed by 2786
Abstract
Tuberculosis (TB) is a leading infectious killer, especially for people with Human Immunodeficiency Virus (HIV) and Acquired Immunodeficiency Syndrome (AIDS). Early diagnosis of TB is crucial for disease treatment and control. Radiology is a fundamental diagnostic tool used to screen or triage TB. Automated chest X-ray analysis can facilitate and expedite TB screening with fast and accurate reports of radiological findings, can rapidly screen large populations, and can alleviate the shortage of skilled experts in remote areas. We describe a hybrid feature-learning algorithm for automatic screening of TB in chest X-rays. It first segments the lung regions using the DeepLabv3+ model. Then, six sets of hand-crafted features from statistical textures, local binary patterns, GIST, histogram of oriented gradients (HOG), pyramid histogram of oriented gradients, and bags of visual words (BoVW), and nine sets of deep-activated features from AlexNet, GoogLeNet, InceptionV3, XceptionNet, ResNet-50, SqueezeNet, ShuffleNet, MobileNet, and DenseNet are extracted. The dominant features of each feature set are selected using particle swarm optimization and then separately input to an optimized support vector machine classifier to label 'normal' and 'TB' X-rays. GIST, HOG, and BoVW among the hand-crafted features, and MobileNet and DenseNet among the deep-activated features, performed better than the others. Finally, we combined these five best-performing feature sets to build a hybrid-learning algorithm. Using the Montgomery County (MC) and Shenzhen datasets, we found that the hybrid features of GIST, HOG, BoVW, MobileNet, and DenseNet performed best, achieving an accuracy of 92.5% for the MC dataset and 95.5% for the Shenzhen dataset.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
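
The fusion step amounts to concatenating the selected feature sets and training an SVM; a minimal sketch with stand-in feature arrays (PSO selection and SVM parameter optimization are omitted):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
gist = rng.random((100, 64))            # stand-ins for the selected feature sets
hog, bovw = rng.random((100, 36)), rng.random((100, 50))
deep = rng.random((100, 128))           # e.g., MobileNet/DenseNet activations
X = np.hstack([gist, hog, bovw, deep])  # fused hybrid feature matrix
y = rng.integers(0, 2, 100)             # 0 = normal, 1 = TB

clf = SVC(kernel='rbf', C=1.0).fit(X, y)  # the optimized-SVM stage, simplified
print(clf.score(X, y))
```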

17 pages, 1819 KiB  
Article
Spatial Domain-Based Nonlinear Residual Feature Extraction for Identification of Image Operations
by Xiaochen Yuan and Tian Huang
Appl. Sci. 2020, 10(16), 5582; https://0-doi-org.brum.beds.ac.uk/10.3390/app10165582 - 12 Aug 2020
Cited by 2 | Viewed by 1571
Abstract
In this paper, a novel approach that uses a deep learning technique is proposed to detect and identify a variety of image operations. First, we propose the spatial domain-based nonlinear residual (SDNR) feature extraction method, which constructs residual values from locally supported filters in the spatial domain. By applying minimum and maximum operators, diversity and nonlinearity are introduced; moreover, this construction brings asymmetry to the distribution of SDNR samples. Then, we propose applying a deep learning technique to the extracted SDNR features to detect and classify a variety of image operations. Many experiments have been conducted to verify the performance of the proposed approach, and the results indicate that it performs well in detecting and identifying the common image postprocessing operations. Furthermore, comparisons between the proposed approach and existing methods show the superiority of the proposed approach.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
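
In the spirit of the SDNR construction, a sketch of nonlinear residuals built from locally supported filters with min/max combination (the filter kernels here are guesses for illustration; the paper's filters are not reproduced):

```python
import numpy as np
from scipy.ndimage import convolve

def sdnr_like_features(img):
    """Residuals from two local prediction filters, combined with
    minimum/maximum operators to introduce nonlinearity and asymmetry."""
    kernels = [np.array([[0, 0, 0], [1, -1, 0], [0, 0, 0]], float),  # left neighbor
               np.array([[0, 1, 0], [0, -1, 0], [0, 0, 0]], float)]  # top neighbor
    residuals = np.stack([convolve(img, k, mode='nearest') for k in kernels])
    return residuals.min(axis=0), residuals.max(axis=0)
```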

29 pages, 12197 KiB  
Article
An Advanced Vehicle Body Part Inspection Scheme Based on Scattered Point Cloud Data
by Yang Yang, Ming Li and Xie Ma
Appl. Sci. 2020, 10(15), 5379; https://0-doi-org.brum.beds.ac.uk/10.3390/app10155379 - 04 Aug 2020
Cited by 2 | Viewed by 2684
Abstract
To further improve the efficiency and accuracy of the vehicle part inspection process, this paper designs an accurate and efficient vehicle body part inspection framework based on scattered point cloud data (PCD). Firstly, a hybrid filtering algorithm for point cloud denoising is designed to address the many noise points in the original point cloud measurement data. Secondly, a point cloud simplification algorithm based on Fuzzy C-Means (FCM) is designed to address the large amount of data and the many redundant points in the PCD. Thirdly, a point cloud fine registration algorithm based on the Teaching-Learning-Based Optimization (TLBO) algorithm is designed to solve the problem that the initial point cloud measurement data cannot be located properly. Finally, the deviation distance between the PCD and the Computer-Aided Design (CAD) model is calculated by the K-Nearest Neighbor (KNN) algorithm to inspect and analyze the preprocessed point cloud. Using the designed algorithms, four groups containing measurement data for eight vehicle body parts are analyzed, and the results prove the effectiveness of the algorithm, which is well suited to the inspection of vehicle body parts.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
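
The KNN deviation step is straightforward to sketch with a k-d tree over the CAD model points (library choice and tolerance handling are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def deviation_map(measured_pcd, cad_pcd):
    """Per-point deviation of the measured cloud from the CAD reference,
    via nearest-neighbor distance."""
    tree = cKDTree(cad_pcd)                   # index the CAD model points
    dists, _ = tree.query(measured_pcd, k=1)  # distance to the closest CAD point
    return dists

cad = np.random.rand(1000, 3)
scan = cad[:200] + np.random.normal(0, 0.002, (200, 3))  # slightly deformed scan
print(deviation_map(scan, cad).max())  # flag parts whose deviation exceeds tolerance
```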

19 pages, 4171 KiB  
Article
Deep Learning for Optic Disc Segmentation and Glaucoma Diagnosis on Retinal Images
by Syna Sreng, Noppadol Maneerat, Kazuhiko Hamamoto and Khin Yadanar Win
Appl. Sci. 2020, 10(14), 4916; https://0-doi-org.brum.beds.ac.uk/10.3390/app10144916 - 17 Jul 2020
Cited by 99 | Viewed by 8939
Abstract
Glaucoma is a major global cause of blindness. Because the symptoms of glaucoma appear only when the disease reaches an advanced stage, proper screening of glaucoma in the early stages is challenging. Therefore, regular glaucoma screening is essential and recommended. However, eye screening is currently subjective, time-consuming, and labor-intensive, and there are insufficient eye specialists available. We present an automatic two-stage glaucoma screening system to reduce the workload of ophthalmologists. The system first segments the optic disc region using a DeepLabv3+ architecture, but substitutes the encoder module with multiple deep convolutional neural networks. For the classification stage, we used pretrained deep convolutional neural networks in three proposals: (1) transfer learning, (2) learning feature descriptors with a support vector machine, and (3) building an ensemble of the methods in (1) and (2). We evaluated our methods on five available datasets containing 2787 retinal images and found that the best option for optic disc segmentation is a combination of DeepLabv3+ and MobileNet. For glaucoma classification, the ensemble of methods performed better than the conventional methods on the RIM-ONE, ORIGA, DRISHTI-GS1, and ACRIMA datasets, with accuracies of 97.37%, 90.00%, 86.84%, and 99.53% and Areas Under the Curve (AUC) of 100%, 92.06%, 91.67%, and 99.98%, respectively, and performed comparably with CUHKMED, the top team in the REFUGE challenge, on the REFUGE dataset, with an accuracy of 95.59% and an AUC of 95.10%.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
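
The third proposal, an ensemble of (1) and (2), can be sketched as soft voting over the two models' class probabilities (the averaging weights are an assumption):

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Weighted average of per-model class probabilities."""
    P = np.stack(prob_list)                      # (n_models, n_samples, n_classes)
    w = np.ones(len(prob_list)) if weights is None else np.asarray(weights)
    return np.tensordot(w / w.sum(), P, axes=1)  # weighted mean probabilities

p_cnn = np.array([[0.2, 0.8], [0.9, 0.1]])       # transfer-learning CNN
p_svm = np.array([[0.3, 0.7], [0.8, 0.2]])       # CNN features + SVM
pred = soft_vote([p_cnn, p_svm]).argmax(axis=1)  # final glaucoma/normal decision
```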

18 pages, 5799 KiB  
Article
NICE: Superpixel Segmentation Using Non-Iterative Clustering with Efficiency
by Cheng Li, Baolong Guo, Geng Wang, Yan Zheng, Yang Liu and Wangpeng He
Appl. Sci. 2020, 10(12), 4415; https://0-doi-org.brum.beds.ac.uk/10.3390/app10124415 - 26 Jun 2020
Cited by 8 | Viewed by 3276
Abstract
Superpixels intuitively over-segment an image into small, compact, homogeneous regions. Owing to their outstanding performance in region description, superpixels have been widely used in various computer vision tasks as a substitute for pixels. Efficient algorithms for generating superpixels therefore remain important for advanced visual tasks. In this work, two strategies are presented on the conventional simple non-iterative clustering (SNIC) framework, aiming to improve computational efficiency as well as segmentation performance. First, inter-pixel correlation is introduced to eliminate the redundant inspection of neighboring elements. In addition, it strengthens the color identity in complicated texture regions, thus providing a desirable trade-off between runtime and accuracy. As a result, superpixel centroids evolve more efficiently and accurately. To further accelerate the framework, a recursive batch-processing strategy is proposed to eliminate unnecessary sorting operations, so that a large number of neighboring elements can be assigned directly. Finally, the two strategies result in a novel synergetic non-iterative clustering with efficiency (NICE) method based on SNIC. Experimental results verify that it runs 40% faster than the conventional framework while generating superpixels that are comparable, and sometimes better, on several quantitative metrics.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)

18 pages, 10670 KiB  
Article
Video Description Model Based on Temporal-Spatial and Channel Multi-Attention Mechanisms
by Jie Xu, Haoliang Wei, Linke Li, Qiuru Fu and Jinhong Guo
Appl. Sci. 2020, 10(12), 4312; https://0-doi-org.brum.beds.ac.uk/10.3390/app10124312 - 23 Jun 2020
Cited by 7 | Viewed by 2082
Abstract
Video description plays an important role in the field of intelligent imaging technology. Attention mechanisms are extensively applied in video description models based on deep learning. Most existing models use a temporal-spatial attention mechanism to enhance model accuracy. Temporal attention mechanisms can obtain the global features of a video, whereas spatial attention mechanisms obtain local features. Nevertheless, because each channel of the convolutional neural network (CNN) feature maps carries certain spatial semantic information, it is insufficient to merely divide the CNN features into regions and then apply a spatial attention mechanism. In this paper, we propose a temporal-spatial and channel attention mechanism that enables the model to take advantage of various video features and ensures the consistency of visual features between sentence descriptions, enhancing the effectiveness of the model. Meanwhile, to prove the effectiveness of the attention mechanism, this paper proposes a video visualization model based on the video description. Experimental results show that our model achieves good performance on the Microsoft Video Description (MSVD) dataset and a certain improvement on the Microsoft Research-Video to Text (MSR-VTT) dataset.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
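
A generic sketch of a channel attention branch in the squeeze-and-excitation style, to make the idea of reweighting CNN feature channels concrete (the paper's exact architecture may differ):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweight CNN feature channels by globally pooled statistics."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                # x: (batch, channels, h, w)
        w = self.fc(x.mean(dim=(2, 3)))  # squeeze spatial dims -> channel weights
        return x * w[:, :, None, None]   # emphasize semantically useful channels

feat = torch.randn(2, 512, 7, 7)         # CNN feature maps of one video frame
out = ChannelAttention(512)(feat)
```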

13 pages, 3300 KiB  
Article
Computer-Aided Bacillus Detection in Whole-Slide Pathological Images Using a Deep Convolutional Neural Network
by Chung-Ming Lo, Yu-Hung Wu, Yu-Chuan (Jack) Li and Chieh-Chi Lee
Appl. Sci. 2020, 10(12), 4059; https://0-doi-org.brum.beds.ac.uk/10.3390/app10124059 - 12 Jun 2020
Cited by 13 | Viewed by 2263
Abstract
Mycobacterial infections continue to greatly affect global health, and their histopathological examination remains challenging; using digital whole-slide images (WSIs), histopathological methods could be made more convenient. However, screening for stained bacilli is a highly laborious task for pathologists due to the microscopic and inconsistent appearance of bacilli. This study proposed a computer-aided detection (CAD) system based on deep learning to automatically detect acid-fast stained mycobacteria. A total of 613 bacillus-positive image blocks and 1202 negative image blocks (each approximately 20 × 20 pixels) were cropped from WSIs and divided into training and testing samples. After randomly selecting 80% of the samples as the training set and the remaining 20% as the testing set, a transfer learning mechanism based on a deep convolutional neural network (DCNN) was applied, with a pretrained AlexNet, to the target bacillus image blocks. The transferred DCNN model generated the probability that each image block contained a bacillus; a probability higher than 0.5 was regarded as positive. Consequently, the DCNN model achieved an accuracy of 95.3%, a sensitivity of 93.5%, and a specificity of 96.3%. For samples without color information, the performance was an accuracy of 73.8%, a sensitivity of 70.7%, and a specificity of 75.4%. The proposed DCNN model successfully distinguished bacilli from other tissues with promising accuracy, and the contribution of color information was revealed. This will help pathologists establish a more efficient diagnostic procedure.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
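
The transfer-learning setup described above can be sketched as replacing the final layer of a pretrained AlexNet with a two-way classifier and thresholding the output probability at 0.5 (input resizing and the training loop are omitted):

```python
import torch
import torchvision

model = torchvision.models.alexnet(weights='IMAGENET1K_V1')  # pretrained AlexNet
model.classifier[6] = torch.nn.Linear(4096, 2)  # new bacillus/non-bacillus head

x = torch.rand(8, 3, 224, 224)               # image blocks resized for AlexNet
prob = torch.softmax(model(x), dim=1)[:, 1]  # P(bacillus) per block
pred = prob > 0.5                            # the 0.5 decision threshold
```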

26 pages, 12664 KiB  
Article
CMOS Fixed Pattern Noise Removal Based on Low Rank Sparse Variational Method
by Tao Zhang, Xinyang Li, Jianfeng Li and Zhi Xu
Appl. Sci. 2020, 10(11), 3694; https://0-doi-org.brum.beds.ac.uk/10.3390/app10113694 - 27 May 2020
Cited by 6 | Viewed by 4870
Abstract
Fixed pattern noise (FPN) has always been an important factor affecting the imaging quality of CMOS image sensors (CIS). However, current scene-based FPN removal methods mostly focus on the image itself and seldom consider the structure of the FPN, resulting in various undesirable noise removal effects. This paper presents a scene-based FPN correction method, the low-rank sparse variational method (LRSUTV), which combines not only the continuity of the image itself but also the structural and statistical characteristics of the stripes. At the same time, the low-frequency information of the image is used to adaptively adjust some of the parameters, which simplifies parameter tuning to a certain extent. With the help of this adaptive parameter adjustment strategy, LRSUTV performs well under different intensities of stripe noise and shows high robustness.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
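
The method's name suggests a low-rank plus sparse decomposition with a unidirectional total variation term. A generic template from this family of stripe-removal objectives, shown only to make the terms concrete (not the paper's exact formulation), is

$$\min_{U,\,S}\ \|S\|_{*}+\lambda_{1}\|S\|_{1}+\lambda_{2}\|\nabla_{x}U\|_{1}\quad\text{s.t.}\ \ I=U+S,$$

where $I$ is the observed image, $U$ the clean image, and $S$ the stripe FPN; the nuclear norm captures the low-rank structure of stripes, the $\ell_1$ term their sparsity, and the unidirectional total variation the smoothness of the scene across the stripe direction.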

22 pages, 6085 KiB  
Article
rStaple: A Robust Complementary Learning Method for Real-Time Object Tracking
by Wangpeng He, Heyi Li, Wei Liu, Cheng Li and Baolong Guo
Appl. Sci. 2020, 10(9), 3021; https://0-doi-org.brum.beds.ac.uk/10.3390/app10093021 - 26 Apr 2020
Cited by 3 | Viewed by 1995
Abstract
Object tracking is a challenging research task because of drastic appearance changes of the target and a lack of training samples. Most online learning trackers are hampered by complications, e.g., the drifting problem under occlusion, being out of view, or fast motion. In this paper, a real-time object tracking algorithm termed "robust sum of template and pixel-wise learners" (rStaple) is proposed to address these problems. It combines multi-feature correlation filters with a color histogram. Firstly, we extract a combination of specific features from the search area around the target and then merge the feature channels to train a translation correlation filter online. Secondly, the target state is determined by a discriminating mechanism, wherein the model update procedure stops when the target is occluded or out of view and re-activates when the target re-appears. In addition, the score map is significantly enhanced by calculating the color histogram score in the search area. The target position can be estimated by combining the enhanced color histogram score with the correlation filter response map. Finally, a scale filter is trained for multi-scale detection to obtain the final tracking result. Extensive experimental results on a large benchmark dataset demonstrate that the proposed rStaple is superior to several state-of-the-art algorithms in terms of accuracy and efficiency.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
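
The merging of the correlation filter response with the color histogram score can be sketched as a convex combination, as in Staple-style trackers (the merge factor here is an assumption):

```python
import numpy as np

def merge_response(cf_response, hist_score, alpha=0.3):
    """Combine the translation-filter response with the normalized
    (enhanced) color histogram score; alpha is the merge factor."""
    h = (hist_score - hist_score.min()) / (np.ptp(hist_score) + 1e-9)
    return (1.0 - alpha) * cf_response + alpha * h

resp = np.random.rand(64, 64)   # correlation filter response map
hist = np.random.rand(64, 64)   # per-pixel color histogram score
y, x = np.unravel_index(np.argmax(merge_response(resp, hist)), (64, 64))
```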

18 pages, 8326 KiB  
Article
Virtual Grounding Point Concept for Detecting Abnormal and Normal Events in Home Care Monitoring Systems
by Swe Nwe Nwe Htun, Thi Thi Zin and Hiromitsu Hama
Appl. Sci. 2020, 10(9), 3005; https://0-doi-org.brum.beds.ac.uk/10.3390/app10093005 - 25 Apr 2020
Cited by 5 | Viewed by 2459
Abstract
In this paper, an innovative home care video monitoring system for detecting abnormal and normal events is proposed by introducing a virtual grounding point (VGP) concept. Specifically, the proposed system is composed of four main image processing components: (1) visual object detection, (2) feature extraction, (3) abnormal and normal event analysis, and (4) the decision-making process. In the object detection component, background subtraction is first achieved using a specific mixture of Gaussians (MoG) to model the foreground in the form of a low-rank matrix factorization; a graph cut is then applied to refine the foreground. In the feature extraction component, the position and posture of the detected person are estimated using a combination of the virtual grounding point and its related centroid, area, and aspect ratios. In analyzing the abnormal and normal events, the moving averages (MA) of the extracted features are calculated, and a new curve analysis is computed using the modified difference (MD): the local maximum (lmax), local minimum (lmin), and half-width value (vhw) are determined on the observed MD curve. In the decision-making component, the support vector machine (SVM) method is applied to detect abnormal and normal events. In addition, a new concept called period detection (PD) is proposed to robustly detect abnormal events. Experimental results on the Le2i fall detection dataset confirm the reliability of the proposed method, which achieved a high detection rate.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
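
A rough sketch of the curve-analysis step, pairing a moving average with local extrema of a simple difference curve (the window size and the exact modified-difference definition are assumptions):

```python
import numpy as np
from scipy.signal import argrelextrema

def curve_features(signal, window=5):
    """Moving average (MA) of a feature trajectory, a stand-in modified
    difference (MD), and the local extrema of the MD curve."""
    ma = np.convolve(signal, np.ones(window) / window, mode='same')
    md = np.abs(np.diff(ma, prepend=ma[0]))  # simple stand-in for the MD
    lmax = argrelextrema(md, np.greater)[0]  # local maxima indices
    lmin = argrelextrema(md, np.less)[0]     # local minima indices
    return ma, md, lmax, lmin
```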

8 pages, 918 KiB  
Article
Improving Classification Performance of Softmax Loss Function Based on Scalable Batch-Normalization
by Qiuyu Zhu, Zikuang He, Tao Zhang and Wennan Cui
Appl. Sci. 2020, 10(8), 2950; https://0-doi-org.brum.beds.ac.uk/10.3390/app10082950 - 24 Apr 2020
Cited by 33 | Viewed by 5059
Abstract
Convolutional neural networks (CNNs) have made great achievements in computer vision tasks, especially image classification. With improvements in network structures and loss functions, the performance of image classification keeps increasing. The classic Softmax + cross-entropy loss, calculated from the output probability of the ground-truth class, has been the norm for training neural networks for years; the network's weights are updated by gradient calculation of the loss. However, after several epochs of training, the back-propagation errors usually become almost negligible. To address this, we propose adding batch normalization with an adjustable scale after the network output to alleviate the vanishing gradient problem in deep learning. The experimental results show that our method can significantly improve the final classification accuracy on different network structures and is also better than many other improved classification losses.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
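
A minimal PyTorch sketch of the idea: batch-normalize the logits and rescale them with a learnable factor before Softmax + cross-entropy (the initial scale value is an assumption):

```python
import torch
import torch.nn as nn

class ScaledBNHead(nn.Module):
    """Batch normalization with an adjustable scale applied to the logits."""
    def __init__(self, num_classes, init_scale=8.0):
        super().__init__()
        self.bn = nn.BatchNorm1d(num_classes, affine=False)  # normalize logits
        self.scale = nn.Parameter(torch.tensor(init_scale))  # learnable scale

    def forward(self, logits):
        return self.scale * self.bn(logits)  # rescaled logits keep gradients alive

logits = torch.randn(32, 10)
target = torch.randint(0, 10, (32,))
loss = nn.CrossEntropyLoss()(ScaledBNHead(10)(logits), target)
```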

18 pages, 7887 KiB  
Article
Instance Hard Triplet Loss for In-video Person Re-identification
by Xing Fan, Wei Jiang, Hao Luo, Weijie Mao and Hongyan Yu
Appl. Sci. 2020, 10(6), 2198; https://0-doi-org.brum.beds.ac.uk/10.3390/app10062198 - 24 Mar 2020
Cited by 7 | Viewed by 3526
Abstract
Traditional Person Re-identification (ReID) methods mainly focus on cross-camera scenarios, while identifying a person in the same video/camera from adjacent subsequent frames is also an important question, for example, in human tracking and pose tracking. We address this unexplored in-video ReID problem with a new large-scale video-based ReID dataset called PoseTrack-ReID, with full images available, and a new network structure called ReID-Head, which can extract multi-person features efficiently in real time and can be integrated with both one-stage and two-stage human or pose detectors. A new loss function is also required for this new in-video problem. Hence, we propose a triplet-based loss function with online hard example mining designed to distinguish persons in the same video/group, called instance hard triplet loss, which can be applied in both cross-camera ReID and in-video ReID. Compared with the widely used batch-hard triplet loss, our proposed loss achieves competitive performance and saves more than 30% of the training time. We also propose an automatic reciprocal identity association method, so we can train our model in an unsupervised way, further extending the potential applications of in-video ReID. The PoseTrack-ReID dataset and code will be publicly released.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
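
For reference, a compact implementation of the widely used batch-hard triplet loss that the paper compares against (the proposed instance hard triplet loss changes the mining to persons within the same video/group, which is not reproduced here):

```python
import torch

def batch_hard_triplet_loss(feats, labels, margin=0.3):
    """Mine the hardest positive and hardest negative for each anchor
    within the batch, then apply a margin ranking loss."""
    d = torch.cdist(feats, feats)                      # pairwise distances
    same = labels[:, None] == labels[None, :]
    pos = d.masked_fill(~same, float('-inf')).amax(1)  # hardest positive
    neg = d.masked_fill(same, float('inf')).amin(1)    # hardest negative
    return torch.relu(pos - neg + margin).mean()

feats = torch.nn.functional.normalize(torch.randn(16, 128), dim=1)
labels = torch.randint(0, 4, (16,))
loss = batch_hard_triplet_loss(feats, labels)
```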

13 pages, 4385 KiB  
Article
Fast and Robust Object Tracking Using Tracking Failure Detection in Kernelized Correlation Filter
by Jungsup Shin, Heegwang Kim, Dohun Kim and Joonki Paik
Appl. Sci. 2020, 10(2), 713; https://0-doi-org.brum.beds.ac.uk/10.3390/app10020713 - 20 Jan 2020
Cited by 31 | Viewed by 3962
Abstract
Object tracking has long been an active research topic in image processing and computer vision, with various application areas. For practical applications, an object tracking technique should be not only accurate but also fast under real-time streaming conditions. Recently, deep feature-based trackers have been proposed to achieve higher accuracy, but they are not suitable for real-time tracking because of their extremely slow processing speed. The slow speed is a major factor in degraded tracking accuracy under a real-time streaming condition, since the processing delay forces frames to be skipped. To increase the tracking accuracy while preserving the processing speed, this paper presents an improved kernelized correlation filter (KCF)-based tracking method that integrates three functional modules: (i) tracking failure detection, (ii) re-tracking using multiple search windows, and (iii) motion vector analysis to decide a preferred search window. Under a real-time streaming condition, the proposed method yields better results than the original KCF in terms of tracking accuracy, and when a target moves very fast, the proposed method outperforms a deep learning-based tracker such as the multi-domain convolutional neural network (MDNet).
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
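
Tracking failure detection from the KCF response map can be sketched with a peak threshold plus the average peak-to-correlation energy (APCE) criterion; this particular criterion and both thresholds are assumptions for illustration, not the paper's exact test:

```python
import numpy as np

def is_tracking_failure(response, peak_thresh=0.25, apce_thresh=20.0):
    """Flag failure when the response peak is weak or the map is noisy;
    on failure, re-track with multiple search windows placed along the
    recent motion vector."""
    peak, floor = response.max(), response.min()
    apce = (peak - floor) ** 2 / (np.mean((response - floor) ** 2) + 1e-9)
    return peak < peak_thresh or apce < apce_thresh
```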

Review


34 pages, 9446 KiB  
Review
Advanced Biological Imaging for Intracellular Micromanipulation: Methods and Applications
by Wendi Gao, Libo Zhao, Zhuangde Jiang and Dong Sun
Appl. Sci. 2020, 10(20), 7308; https://0-doi-org.brum.beds.ac.uk/10.3390/app10207308 - 19 Oct 2020
Cited by 6 | Viewed by 3213
Abstract
Intracellular micromanipulation assisted by robotic systems has valuable applications in biomedical research, such as genetic diagnosis and genome-editing tasks. However, current studies suffer from a low success rate and large operation damage because of insufficient information about the targeted specimens; the complexity of the intracellular environment makes it difficult to visualize manipulation tools and specimens. This review summarizes and analyzes the current development of advanced biological imaging sampling and computational processing methods in intracellular micromanipulation applications. It also discusses the related limitations and future extensions, providing an important reference for this field.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)

18 pages, 4315 KiB  
Review
Inverse Halftoning Methods Based on Deep Learning and Their Evaluation Metrics: A Review
by Mei Li, Erhu Zhang, Yutong Wang, Jinghong Duan and Cuining Jing
Appl. Sci. 2020, 10(4), 1521; https://0-doi-org.brum.beds.ac.uk/10.3390/app10041521 - 23 Feb 2020
Cited by 8 | Viewed by 3574
Abstract
Inverse halftoning is an ill-posed problem that refers to restoring continuous-tone images from their halftone versions. Although much progress has been achieved over the last decades, the restored images still suffer from detail loss and visual artifacts. Recent studies show that inverse halftoning methods based on deep learning are superior to traditional methods, and thus this paper systematically reviews deep learning-based inverse halftoning methods to provide a reference for their development. We first propose a classification of inverse halftoning methods on the basis of the source of the halftone images. Then, two types of inverse halftoning methods, for digital halftone images and scanned halftone images, are investigated in terms of network architecture, loss functions, and training strategies. Furthermore, we study existing image quality evaluation, including subjective and objective evaluation, by experiments. The evaluation results demonstrate that methods based on multiple subnetworks and methods based on multi-stage strategies are superior to other methods, and that perceptual loss and gradient loss help improve the quality of the restored images. Finally, we give future research directions by analyzing the shortcomings of existing inverse halftoning methods.
(This article belongs to the Special Issue Advanced Intelligent Imaging Technology Ⅱ)
