Deep Learning Applied to Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 March 2022) | Viewed by 34320

Special Issue Editors


Prof. Dr. Jose Santamaria Lopez
Guest Editor

Prof. Dr. Zong Woo Geem
Co-Guest Editor

Special Issue Information

Dear Colleagues,

Deep Learning (DL) is enabling many new applications across broad areas of science, particularly in the domain of Image Processing (IP). Innovative applications of DL to complex IP systems have increased markedly in recent years. Specifically, this Special Issue focuses on research that addresses IP problems through novel DL approaches and tools.

The purpose of this Special Issue is therefore to bring the DL and IP communities together, providing a forum in which researchers and practitioners in this rapidly developing field can share novel and original research on Deep Learning applied to Image Processing. Survey papers addressing relevant topics of DL&IP are also welcome. Topics of interest include, but are not limited to:

  • Medical imaging;
  • Image restoration;
  • Deep adversarial learning for IP;
  • Image registration;
  • Image segmentation;
  • Nature-inspired and metaheuristic algorithms for DL&IP;
  • Theoretical analysis of DL models for DL&IP.

Prof. Dr. Jose Santamaria Lopez
Prof. Dr. Zong Woo Geem
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (8 papers)


Research

20 pages, 3087 KiB  
Article
LogoNet: A Robust Layer-Aggregated Dual-Attention Anchorfree Logo Detection Framework with an Adversarial Domain Adaptation Approach
by Rahul Kumar Jain, Taro Watasue, Tomohiro Nakagawa, Takahiro Sato, Yutaro Iwamoto, Xiang Ruan and Yen-Wei Chen
Appl. Sci. 2021, 11(20), 9622; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209622 - 15 Oct 2021
Cited by 6 | Viewed by 2057
Abstract
The task of logo detection is desirable and important for various fields. However, it is challenging to identify logos in complex scenarios, as a logo can appear in different styles and on different platforms. Logo images involve diverse contexts, sizes, projective transformations, resolutions, illumination conditions and fonts, which makes logo detection more difficult. To address these issues, we previously presented a deep learning-based algorithm for logo detection called LogoNet. It includes an hourglass-like top-down/bottom-up feature extraction network, a spatial attention module and an anchor-free detection head similar to CenterNet. To improve performance, in this paper an extended version of LogoNet, called Dual-Attention LogoNet, is proposed that exploits different attention mechanisms more efficiently. The incorporated channel-wise and spatial attention modules refine and generate robust and balanced feature maps to predict visual and semantic information more accurately. In addition, we propose a lightweight architecture for both LogoNet and Dual-Attention LogoNet for practical applications. The proposed lightweight architecture significantly reduces the number of network parameters and improves inference time, addressing real-time performance while maintaining accuracy. Furthermore, to address the domain-shift problem in practical applications, we also propose an adversarial-learning-based domain adaptation approach, which is easily adaptable to any anchor-free detector. Our attention-based method shows a 1.8% improvement in accuracy over the state-of-the-art detection network on the FlickrLogos-32 dataset. Our proposed domain adaptation approach significantly improves performance, by 1.3% mAP compared to direct transfer on the target domain, without increasing labeling cost or network parameters.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
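As a quick illustration of the channel-wise plus spatial attention described in the abstract, the PyTorch sketch below shows a CBAM-style dual-attention block; the module structure, reduction ratio and names are illustrative assumptions and do not reproduce the authors' implementation.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Sketch of channel-wise attention followed by spatial attention (CBAM-style)."""
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Channel attention: squeeze spatial dimensions, re-weight the channels.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: pool across channels, produce an H x W mask.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=spatial_kernel, padding=spatial_kernel // 2),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)                    # channel re-weighting
        avg_map = x.mean(dim=1, keepdim=True)           # average over channels
        max_map, _ = x.max(dim=1, keepdim=True)         # max over channels
        mask = self.spatial_gate(torch.cat([avg_map, max_map], dim=1))
        return x * mask                                 # spatial re-weighting

features = torch.randn(2, 256, 64, 64)                 # e.g. a backbone feature map
refined = DualAttention(256)(features)                 # same shape, attention-refined
```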

21 pages, 16696 KiB  
Article
Automatic Processing of Historical Japanese Mathematics (Wasan) Documents
by Yago Diez, Toya Suzuki, Marius Vila and Katsushi Waki
Appl. Sci. 2021, 11(17), 8050; https://0-doi-org.brum.beds.ac.uk/10.3390/app11178050 - 30 Aug 2021
Viewed by 2623
Abstract
“Wasan” is the collective name given to a set of mathematical texts written in Japan in the Edo period (1603–1867). These documents represent a unique type of mathematics and amalgamate the mathematical knowledge of a time and place where major advances were reached. For these reasons, Wasan documents are considered to be of great historical and cultural significance. This paper presents a fully automatic algorithmic process to first detect the kanji characters in Wasan documents and subsequently classify them using deep learning networks. We pay special attention to the results concerning one particular kanji character, the “ima” kanji, as it is of special importance for the interpretation of Wasan documents. As our database is made up of manual scans of real historical documents, it presents scanning artifacts in the form of image noise and page misalignment. First, we use two preprocessing steps to ameliorate these artifacts. Then we use three different blob detector algorithms to determine which parts of each image belong to kanji characters. Finally, we use five deep learning networks to classify the detected kanji. All steps of the pipeline are thoroughly evaluated, and several options are compared for the kanji detection and classification steps. As ancient kanji databases are rare and often include relatively few images, we explore the possibility of using modern kanji databases for kanji classification. Experiments are run on a dataset containing 100 Wasan book pages. We compare the performance of three blob detector algorithms for kanji detection, obtaining a 79.60% success rate with 7.88% false positive detections. Furthermore, we study the performance of five well-known deep learning networks and obtain 99.75% classification accuracy for modern kanji and 90.4% for classical kanji. Finally, our full pipeline obtains 95% correct detection and classification of the “ima” kanji with 3% false positives.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
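As a rough sketch of the detection stage described above, the snippet below extracts kanji candidate regions with OpenCV's SimpleBlobDetector and crops them for a CNN classifier; the detector choice, thresholds and crop size are assumptions for illustration and do not reproduce the three detectors compared in the paper.

```python
import cv2
import numpy as np

def detect_kanji_blobs(page_path: str):
    """Return candidate kanji regions as (x, y, size) tuples for one scanned page."""
    gray = cv2.imread(page_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.medianBlur(gray, 3)           # light denoising of scanning artifacts

    params = cv2.SimpleBlobDetector_Params()
    params.filterByArea = True
    params.minArea = 50                      # tune to the expected kanji size in pixels
    params.filterByColor = True
    params.blobColor = 0                     # dark ink strokes on light paper
    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(gray)        # the detector thresholds internally
    return [(int(k.pt[0]), int(k.pt[1]), int(k.size)) for k in keypoints]

def crop_candidates(page_path: str, blobs, pad: int = 8):
    """Cut square crops around each blob centre, ready for a CNN classifier."""
    img = cv2.imread(page_path, cv2.IMREAD_GRAYSCALE)
    crops = []
    for x, y, size in blobs:
        r = size // 2 + pad
        crop = img[max(0, y - r):y + r, max(0, x - r):x + r]
        crops.append(cv2.resize(crop, (64, 64)))
    return np.stack(crops) if crops else np.empty((0, 64, 64), dtype=np.uint8)
```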

13 pages, 1889 KiB  
Article
PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets
by Luisa F. Sánchez-Peralta, J. Blas Pagador, Artzai Picón, Ángel José Calderón, Francisco Polo, Nagore Andraka, Roberto Bilbao, Ben Glover, Cristina L. Saratxaga and Francisco M. Sánchez-Margallo
Appl. Sci. 2020, 10(23), 8501; https://0-doi-org.brum.beds.ac.uk/10.3390/app10238501 - 28 Nov 2020
Cited by 37 | Viewed by 4689
Abstract
Colorectal cancer is one of the world's leading causes of death. Fortunately, an early diagnosis allows for effective treatment, increasing the survival rate. Deep learning techniques have shown their utility for increasing the adenoma detection rate at colonoscopy, but a dataset is usually required so that the model can automatically learn the features that characterize the polyps. In this work, we present the PICCOLO dataset, which comprises 3433 manually annotated images (2131 white-light images and 1302 narrow-band images) originating from 76 lesions in 40 patients, distributed into training (2203), validation (897) and test (333) sets while ensuring patient independence between sets. Furthermore, clinical metadata are also provided for each lesion. Four different models, obtained by combining two backbones and two encoder–decoder architectures, are trained with the PICCOLO dataset and two other publicly available datasets for comparison. Results are provided for the test set of each dataset. Models trained with the PICCOLO dataset show better generalization capacity, as they perform more uniformly across the test sets of all datasets rather than obtaining the best results only on their own test set. The dataset is available at the website of the Basque Biobank, so it is expected to contribute to the further development of deep learning methods for polyp detection, localisation and classification, which would eventually result in a better and earlier diagnosis of colorectal cancer, hence improving patient outcomes.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
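A central point in the abstract is patient independence between the training, validation and test sets. The Python sketch below shows one way to enforce a patient-level split; the annotation format, split fractions and function names are assumptions rather than the dataset's actual partitioning code.

```python
import random
from collections import defaultdict

def patient_level_split(annotations, val_frac=0.2, test_frac=0.1, seed=42):
    """Split annotated frames into train/val/test so that all images of a given
    patient end up in exactly one subset (patient independence between sets).

    `annotations` is assumed to be a list of dicts with at least
    {'image': path, 'mask': path, 'patient_id': str}."""
    by_patient = defaultdict(list)
    for ann in annotations:
        by_patient[ann["patient_id"]].append(ann)

    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_test = max(1, int(len(patients) * test_frac))
    n_val = max(1, int(len(patients) * val_frac))
    test_p = patients[:n_test]
    val_p = patients[n_test:n_test + n_val]
    train_p = patients[n_test + n_val:]

    pick = lambda ps: [ann for p in ps for ann in by_patient[p]]
    return pick(train_p), pick(val_p), pick(test_p)

# annotations = load_polyp_annotations(...)          # hypothetical loader
# train, val, test = patient_level_split(annotations)
```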

16 pages, 17244 KiB  
Article
Single-Shot Object Detection with Split and Combine Blocks
by Hongwei Wang, Dahua Li, Yu Song, Qiang Gao, Zhaoyang Wang and Chunping Liu
Appl. Sci. 2020, 10(18), 6382; https://0-doi-org.brum.beds.ac.uk/10.3390/app10186382 - 13 Sep 2020
Cited by 2 | Viewed by 2889
Abstract
Feature fusion is widely used in various neural-network-based visual recognition tasks, such as object detection, to enhance the quality of feature representation. It is common practice for both one-stage and two-stage object detectors to implement feature fusion in feature pyramid networks (FPN) to enhance the capacity to detect objects of different scales. In this work, we propose a novel and efficient feature fusion unit, referred to as the Split and Combine (SC) block, which splits the input feature maps into several parts, processes these sub-feature maps with different emphasis, and finally gradually concatenates the outputs one by one. The SC block implicitly encourages the network to focus on features that are more important to the task, thus improving network efficiency and reducing inference computations. To validate our analysis and conclusions, a backbone network and an FPN employing this technique are assembled into a one-stage detector and evaluated on the MS COCO dataset. With the newly introduced SC block and other novel training tricks, our detector achieves a good speed-accuracy trade-off on the COCO test-dev set, with 37.1% AP (average precision) at 51 FPS and 38.9% AP at 40 FPS.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
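The PyTorch sketch below is one possible reading of the split–process–combine idea: the input channels are split into parts, each part is processed with a different emphasis (here, different dilation rates), and the branch outputs are concatenated and fused. The branch design and residual fusion are illustrative assumptions, not the authors' exact SC block.

```python
import torch
import torch.nn as nn

class SplitCombineBlock(nn.Module):
    """Rough sketch of a split-and-combine fusion unit: split the channels,
    process each part with a different emphasis, then recombine the outputs."""
    def __init__(self, channels: int, splits: int = 4):
        super().__init__()
        assert channels % splits == 0
        c = channels // splits
        # Each branch sees one channel slice; later branches use larger dilations,
        # so different parts of the feature map get different receptive fields.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c, c, kernel_size=3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(c),
                nn.ReLU(inplace=True),
            )
            for d in range(1, splits + 1)
        ])
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        parts = torch.chunk(x, len(self.branches), dim=1)    # split along channels
        outs = [branch(part) for part, branch in zip(parts, self.branches)]
        return self.fuse(torch.cat(outs, dim=1)) + x          # combine + residual

x = torch.randn(1, 256, 32, 32)
y = SplitCombineBlock(256)(x)     # same shape as the input
```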

12 pages, 1919 KiB  
Article
A Convolutional Neural Network for Anterior Intra-Arterial Thrombus Detection and Segmentation on Non-Contrast Computed Tomography of Patients with Acute Ischemic Stroke
by Manon L. Tolhuisen, Elena Ponomareva, Anne M. M. Boers, Ivo G. H. Jansen, Miou S. Koopman, Renan Sales Barros, Olvert A. Berkhemer, Wim H. van Zwam, Aad van der Lugt, Charles B. L. M. Majoie and Henk A. Marquering
Appl. Sci. 2020, 10(14), 4861; https://0-doi-org.brum.beds.ac.uk/10.3390/app10144861 - 15 Jul 2020
Cited by 12 | Viewed by 3286
Abstract
The aim of this study was to develop a convolutional neural network (CNN) that automatically detects and segments intra-arterial thrombi on baseline non-contrast computed tomography (NCCT) scans. We retrospectively collected computed tomography (CT) scans of patients with an anterior circulation large vessel occlusion (LVO) from the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in the Netherlands trial, both for training (n = 86) and validation (n = 43). For testing, we included patients with (n = 58) and without (n = 45) an LVO from our comprehensive stroke center. Ground truth was established by consensus between two experts using both CT angiography and NCCT. We evaluated the CNN for correct identification of a thrombus, its location and its segmentation, and compared these with the results of a neurologist in training and an expert neuroradiologist. Sensitivity of the CNN thrombus detection was 0.86, vs. 0.95 and 0.79 for the two observers. Specificity was 0.65 for the network, vs. 0.58 and 0.82 for the two observers. The CNN correctly identified the location of the thrombus in 79% of the cases, compared to 81% and 77% for the two observers. The sensitivity and specificity for thrombus identification and the rate of correct thrombus location assessment by the CNN were similar to those of expert neuroradiologists.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
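The per-scan sensitivity and specificity values quoted above follow the standard definitions; a minimal sketch of how they can be computed from binary detection decisions is shown below (the example labels are made up).

```python
def sensitivity_specificity(y_true, y_pred):
    """Per-scan detection metrics: y_true / y_pred are binary labels
    (1 = thrombus present, 0 = absent), one entry per NCCT scan."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy example only; the study evaluates 58 LVO-positive and 45 LVO-negative scans.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```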

15 pages, 1444 KiB  
Article
Improvement of Learning Stability of Generative Adversarial Network Using Variational Learning
by Je-Yeol Lee and Sang-Il Choi 
Appl. Sci. 2020, 10(13), 4528; https://0-doi-org.brum.beds.ac.uk/10.3390/app10134528 - 30 Jun 2020
Cited by 4 | Viewed by 2552
Abstract
In this paper, we propose a new network model that uses variational learning to improve the learning stability of generative adversarial networks (GAN). The proposed method can be easily applied to improve the learning stability of GAN-based models developed for various purposes, since the variational autoencoder (VAE) is used as a secondary network while the basic GAN structure is maintained. When the gradient of the generator vanishes during GAN training, the proposed method receives gradient information from the decoder of the VAE, which maintains a stable gradient, so that the learning processes of the generator and discriminator are not halted. Experimental results on the MNIST and CelebA datasets verify that the proposed method improves the learning stability of the networks by overcoming the vanishing gradient problem of the generator, while maintaining the excellent data quality of conventional GAN-based generative models.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
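One way to read this mechanism is that the generator also acts as the VAE decoder, so the reconstruction and KL terms keep feeding gradients to the generator even when the adversarial loss saturates. The training-step sketch below follows that reading; the component wiring, loss weighting and names are assumptions, not the authors' published code.

```python
import torch
import torch.nn.functional as F

def generator_step(G, E, D, real, opt_g, z_dim=100, beta=1.0):
    """One generator update. Assumed components: G maps latents to images and
    doubles as the VAE decoder, E(real) returns (mu, logvar), D is the GAN
    discriminator, and opt_g optimizes the parameters of both G and E."""
    opt_g.zero_grad()
    # Adversarial branch: standard non-saturating GAN generator loss.
    z = torch.randn(real.size(0), z_dim, device=real.device)
    logits_fake = D(G(z))
    adv_loss = F.binary_cross_entropy_with_logits(logits_fake, torch.ones_like(logits_fake))
    # Variational branch: G reconstructs the input from the encoder's posterior.
    mu, logvar = E(real)
    z_post = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
    recon_loss = F.mse_loss(G(z_post), real)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Even when adv_loss yields (near-)zero gradients, the VAE terms still
    # back-propagate useful gradient information into G's parameters.
    (adv_loss + beta * (recon_loss + kld)).backward()
    opt_g.step()
```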

21 pages, 47435 KiB  
Article
Towards a Better Understanding of Transfer Learning for Medical Imaging: A Case Study
by Laith Alzubaidi, Mohammed A. Fadhel, Omran Al-Shamma, Jinglan Zhang, J. Santamaría, Ye Duan and Sameer R. Oleiwi
Appl. Sci. 2020, 10(13), 4523; https://0-doi-org.brum.beds.ac.uk/10.3390/app10134523 - 29 Jun 2020
Cited by 135 | Viewed by 10436
Abstract
One of the main challenges of employing deep learning models in the field of medicine is the lack of training data, owing to the difficulty of collecting and labeling data, which must be performed by experts. To overcome this drawback, transfer learning (TL) has been utilized to solve several medical imaging tasks using state-of-the-art models pre-trained on the ImageNet dataset. However, there are primary divergences in data features, sizes, and task characteristics between natural image classification and the targeted medical imaging tasks. Therefore, TL may only slightly improve performance when the source domain is completely different from the target domain. In this paper, we explore the benefit of TL from the same and from different domains of the target tasks. To do so, we designed a deep convolutional neural network (DCNN) model that integrates three ideas: traditional and parallel convolutional layers and residual connections, along with global average pooling. We trained the proposed model under several scenarios, utilizing same-domain and different-domain TL with the diabetic foot ulcer (DFU) classification task and with an animal classification task. We show empirically that TL from the same domain can significantly improve performance, even with a reduced number of images in the same domain as the target dataset. The proposed model achieved an F1-score of 86.6% on the DFU dataset when trained from scratch, 89.4% with TL from a different domain than the target dataset, and 97.6% with TL from the same domain as the target dataset.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
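The three training scenarios compared above (from scratch, different-domain TL, same-domain TL) can be set up along the lines of the sketch below, which uses a torchvision ResNet-50 (torchvision ≥ 0.13) purely for illustration; the paper trains its own DCNN, and the same-domain checkpoint path here is hypothetical.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes: int, init: str = "same_domain"):
    """Illustrative setup of the three scenarios: training from scratch,
    ImageNet (different-domain) TL, and same-domain TL."""
    if init == "scratch":
        model = models.resnet50(weights=None)
    elif init == "imagenet":
        model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    elif init == "same_domain":
        model = models.resnet50(weights=None)
        # Weights pre-trained on a related same-domain image corpus (assumed file).
        state = torch.load("same_domain_pretrained.pt", map_location="cpu")
        model.load_state_dict(state, strict=False)
    else:
        raise ValueError(init)
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head
    return model

dfu_model = build_classifier(num_classes=2, init="same_domain")
```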

13 pages, 2716 KiB  
Article
Defect Detection on Rolling Element Surface Scans Using Neural Image Segmentation
by Nico Prappacher, Markus Bullmann, Gunther Bohn, Frank Deinzer and Andreas Linke
Appl. Sci. 2020, 10(9), 3290; https://0-doi-org.brum.beds.ac.uk/10.3390/app10093290 - 9 May 2020
Cited by 28 | Viewed by 3141
Abstract
The surface inspection of steel parts such as rolling elements for roller bearings is an essential component of the quality assurance process in their production. Existing inspection systems require high maintenance costs and allow little flexibility. In this paper, we propose the use of a rapidly retrainable convolutional neural network. Our approach reduces development and maintenance costs compared with a manually programmed classification system for steel surface defect detection. One of the main disadvantages of neural network approaches is their high demand for labeled training data. To bypass this, we propose the use of simulated defects. In the production of rolling elements, real defects are a rarity; collecting a balanced dataset therefore costs considerable time and resources. Simulating defects reduces the time required for data collection and also allows us to label the dataset automatically, which further eases the data collection process compared with existing approaches. Combined, this allows us to train our system faster and more cheaply than existing systems. We show that our system can be retrained in a matter of minutes, minimizing production downtime, while still achieving high accuracy in defect detection.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
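The simulated-defect idea above can be prototyped as in the NumPy sketch below, which pastes a synthetic blemish onto a defect-free scan and returns the automatically generated label mask; the defect shape, contrast range and noise model are illustrative assumptions, not the paper's simulation procedure.

```python
import numpy as np

def add_simulated_defect(scan: np.ndarray, rng: np.random.Generator):
    """Paste a synthetic dark elliptical blemish onto a defect-free surface scan
    and return the image together with its automatically generated label mask."""
    h, w = scan.shape
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    ry, rx = rng.integers(3, 15), rng.integers(3, 15)     # defect radii in pixels
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    defect_img = scan.astype(np.float32).copy()
    defect_img[mask] *= rng.uniform(0.3, 0.7)             # darken the defect area
    defect_img += rng.normal(0, 2.0, scan.shape)          # mild sensor-like noise
    return np.clip(defect_img, 0, 255).astype(np.uint8), mask.astype(np.uint8)

rng = np.random.default_rng(0)
clean = np.full((128, 128), 200, dtype=np.uint8)          # stand-in for a real scan
image, label = add_simulated_defect(clean, rng)           # training pair, no manual labeling
```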
