Recent Advances in Deep Learning for Image Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (31 October 2022) | Viewed by 37894

Special Issue Editors

Prof. Dr. Tan-Hsu Tan
Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Interests: wireless communications; telecare; deep learning; optimization algorithms

Prof. Dr. Mohammad Alkhaleefah
Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Interests: artificial intelligence; deep learning; computer vision; image processing; remote sensing

Prof. Dr. Yang-Lang Chang
Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Interests: remote sensing; deep learning; pattern recognition; image processing; high-performance computing

Special Issue Information

Dear Colleagues,

Recently, deep learning algorithms have been used for a wide range of computer vision and image analysis tasks, such as image classification, graphics recognition, object detection, and image segmentation. However, deep learning still poses some challenges in regard to training time, network size, accuracy, computing power, and overfitting. These challenges need to be addressed in order to provide reliable and efficient deep learning networks. Hence, the aim of this Special Issue is to cover novel, optimized, high-performance, and hybrid deep-learning-based approaches for image analysis to address the aforementioned challenges in a variety of applications. Topics of interest include, but are not limited to, the following:

  • Deep-learning-based image analysis in various disciplines (remote sensing, medicine, biology, etc.).
  • Image analysis (classification, segmentation, recognition, and detection) using deep learning.
  • Effective augmentation methods for deep-learning-based image analysis.
  • Hybrid machine learning and deep learning methods for image analysis.
  • Efficient deep learning architectures for image analysis.
  • Deep learning models on mobile and embedded devices for image analysis.
  • Transfer learning, domain adaptation, and knowledge distillation for image analysis.

Prof. Dr. Tan-Hsu Tan
Prof. Dr. Mohammad Alkhaleefah
Prof. Dr. Yang-Lang Chang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • machine learning
  • deep learning
  • image analysis
  • image classification
  • image detection
  • image segmentation
  • graphics recognition
  • remote sensing image analysis
  • biomedical image analysis
  • medical image analysis
  • natural image analysis
  • transfer learning
  • domain adaptation
  • knowledge distillation
  • efficient deep learning models
  • high-performance computing
  • hybrid approaches

Published Papers (14 papers)


Research

20 pages, 4326 KiB  
Article
A Transfer Learning and Optimized CNN Based Maritime Vessel Classification System
by Mostafa Hamdy Salem, Yujian Li, Zhaoying Liu and Ahmed M. AbdelTawab
Appl. Sci. 2023, 13(3), 1912; https://doi.org/10.3390/app13031912 - 01 Feb 2023
Cited by 6 | Viewed by 1914
Abstract
Deep learning has been used to improve intelligent transportation systems (ITS) by classifying ship targets in interior waterways. Researchers have created numerous classification methods, but these have low accuracy and misclassify other ship targets. As a result, more research into ship classification is required to avoid inland waterway collisions. We present a new convolutional neural network (CNN) classification method for inland waterways that can classify the five major ship types: cargo, military, carrier, cruise, and tanker; it can also be extended to other ship classes. The proposed method consists of four phases for boosting the classification accuracy of CNN-based ITS: an efficient augmentation method, a hyper-parameter optimization (HPO) technique for optimal CNN model parameter selection, transfer learning, and ensemble learning. All experiments used Kaggle’s public Game of Deep Learning Ship dataset. The proposed ship classification achieved a 98.38% detection rate and a 97.43% F1 score. Our classification technique was also evaluated on the MARVEL dataset, which includes 10,000 image samples per class across 26 ship types, to test generalization. The suggested method again delivered excellent performance compared to other algorithms, with an accuracy of 97.04%, a precision of 96.1%, a recall of 95.92%, a specificity of 96.55%, and an F1 score of 96.31%. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
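
As a rough illustration of the transfer-learning and ensemble phases described in the abstract, the sketch below freezes a pretrained backbone, retrains only a new five-class head, and soft-votes across several such models. The ResNet-50 backbone, frozen-feature strategy, and equal-weight voting are our illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # cargo, military, carrier, cruise, tanker

def build_transfer_model() -> nn.Module:
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    for p in backbone.parameters():      # freeze the pretrained features
        p.requires_grad = False
    backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)
    return backbone

def ensemble_predict(members, x):
    """Soft voting: average the softmax outputs of all ensemble members."""
    with torch.no_grad():
        probs = [torch.softmax(m.eval()(x), dim=1) for m in members]
    return torch.stack(probs).mean(dim=0).argmax(dim=1)

members = [build_transfer_model() for _ in range(3)]
print(ensemble_predict(members, torch.randn(2, 3, 224, 224)))
```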

13 pages, 2276 KiB  
Article
Perturbation-Based Explainable AI for ECG Sensor Data
by Ján Paralič, Michal Kolárik, Zuzana Paraličová, Oliver Lohaj and Adam Jozefík
Appl. Sci. 2023, 13(3), 1805; https://doi.org/10.3390/app13031805 - 31 Jan 2023
Viewed by 1675
Abstract
Deep neural network models have produced significant results in solving various challenging tasks, including medical diagnostics. To increase the credibility of these black-box models in the eyes of doctors, it is necessary to focus on their explainability. Several papers have been published combining deep learning methods with selected types of explainability methods, usually aimed at analyzing medical image data, including ECG images. The ECG is specific in that its image representation is only a secondary visualization of stream data from sensors. However, explainability methods for stream data are rarely investigated. Therefore, in this article we focus on the explainability of black-box models for stream data from 12-lead ECG. We designed and implemented a perturbation explainability method and verified it in a user study on a group of final-year medical students with experience in ECG tagging. The results demonstrate the suitability of the proposed method, as well as the importance of including multiple data sources in the diagnostic process. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
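
A minimal sketch of perturbation-based explanation for 12-lead ECG streams, under our own assumptions: one window of one lead is zeroed out at a time, and the drop in the model's score for the target class is taken as that segment's importance. The window size, occlusion value, and toy model are illustrative, not the authors' implementation.

```python
import torch

def perturbation_importance(model, ecg, target, win=50):
    """ecg: (leads, length) tensor. Returns (leads, length // win) scores."""
    model.eval()
    leads, length = ecg.shape
    scores = torch.zeros(leads, length // win)
    with torch.no_grad():
        base = model(ecg.unsqueeze(0))[0, target].item()
        for lead in range(leads):
            for w in range(length // win):
                x = ecg.clone()
                x[lead, w * win:(w + 1) * win] = 0.0   # occlude one segment
                scores[lead, w] = base - model(x.unsqueeze(0))[0, target].item()
    return scores  # positive score = segment supported the prediction

toy = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(12 * 500, 2))
print(perturbation_importance(toy, torch.randn(12, 500), target=0).shape)
```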

13 pages, 1507 KiB  
Article
Applying Convolutional Neural Network in Automatic Assessment of Bone Age Using Multi-Stage and Cross-Category Strategy
by Ching-Tung Peng, Yung-Kuan Chan, Yeong-Seng Yuh and Shyr-Shen Yu
Appl. Sci. 2022, 12(24), 12798; https://doi.org/10.3390/app122412798 - 13 Dec 2022
Cited by 1 | Viewed by 1073
Abstract
Bone age is a common indicator of children’s growth. However, traditional bone age assessment methods usually take a long time and are jeopardized by human error. To address this problem, we propose an automatic bone age assessment system based on the convolutional neural network (CNN) framework. Bone age assessment is generally applied to children aged 0–18 years. To reduce the variation involved in building a single regression model over this range, our system consists of two steps: we first build a classifier to identify the maturity stage, and then build a regression model for each maturity stage. In this way, assessing bone age through several independent regression models reduces the variation and makes the assessment more accurate. Some bone sections are particularly useful for distinguishing certain maturity stages but may not be effective for other stages; thus, we first perform a rough classification to broadly distinguish the maturity stage, and then undertake fine classification. Because the skeleton grows continuously during development, it is not easy to obtain a clear decision boundary between the various stages of maturation. Therefore, we propose a cross-stage class strategy for this problem. In addition, because fewer children undergo X-rays in the early and late stages, the data are imbalanced; the cross-stage class strategy also alleviates this problem. In our proposed framework, we utilize an MSCS-CNN (Multi-Step and Cross-Stage CNN). Experiments on our dataset show that the accuracy of the MSCS-CNN in identifying both female and male maturity stages is above 0.96. After determining the maturity stage during bone age assessment, we obtain MAEs (mean absolute errors) of 0.532 for females and 0.56 for males. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
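
The two-step design (classify the maturity stage first, then route to a stage-specific regressor) might be sketched as below. The tiny feature extractor and the number of stages are placeholders of ours, not the paper's MSCS-CNN.

```python
import torch
import torch.nn as nn

N_STAGES = 5  # assumed number of maturity stages, for illustration only

class StageThenRegress(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.stage_head = nn.Linear(16, N_STAGES)        # maturity classifier
        self.regressors = nn.ModuleList(
            [nn.Linear(16, 1) for _ in range(N_STAGES)]) # one model per stage

    def forward(self, x):
        f = self.features(x)
        stage = self.stage_head(f).argmax(dim=1)
        # route each sample to the regressor of its predicted stage
        age = torch.stack([self.regressors[int(s)](f[i])
                           for i, s in enumerate(stage)])
        return stage, age.squeeze(-1)

stage, age = StageThenRegress()(torch.randn(4, 1, 64, 64))
```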

25 pages, 10341 KiB  
Article
Computer-Aided Detection of Hypertensive Retinopathy Using Depth-Wise Separable CNN
by Imran Qureshi, Qaisar Abbas, Junhua Yan, Ayyaz Hussain, Kashif Shaheed and Abdul Rauf Baig
Appl. Sci. 2022, 12(23), 12086; https://doi.org/10.3390/app122312086 - 25 Nov 2022
Cited by 8 | Viewed by 1995
Abstract
Hypertensive retinopathy (HR) is a retinal disorder linked to high blood pressure. The incidence of HR-eye illness is directly related to the severity and duration of hypertension. It is critical to identify and analyze HR at an early stage to avoid blindness. There are presently only a few computer-aided diagnosis (CADx) systems designed to recognize HR, and those systems have concentrated on collecting features from many retinopathy-related HR lesions and then classifying them using traditional machine learning algorithms. Consequently, they required complicated image processing methods and domain-expert knowledge. To address these issues, a new CAD-HR system is proposed that combines a depth-wise separable CNN (DSC) with residual connections and a linear support vector machine (LSVM). Initially, data augmentation is applied to the retinal images to enlarge the datasets. The DSC is then applied to the retinal images to extract robust features. Finally, the retinal samples are classified as either HR or non-HR using the LSVM classifier. A statistical investigation of 9500 retinograph images from two publicly available sources and one private source is undertaken to assess the accuracy. Several experimental results demonstrate that the CAD-HR model requires less computational time and fewer parameters to categorize HR. On average, the CAD-HR achieved a sensitivity (SE) of 94%, specificity (SP) of 96%, accuracy (ACC) of 95%, and area under the receiver operating curve (AUC) of 0.96. This confirms that the CAD-HR system can be used to correctly diagnose HR. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
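
A minimal sketch of the two building blocks named in the abstract, with our own layer sizes: a depth-wise separable convolution extracts features, and a linear SVM makes the final HR / non-HR decision. The residual connections are omitted here, and the data are random stand-ins.

```python
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC

class DepthwiseSeparable(nn.Module):
    def __init__(self, cin, cout):
        super().__init__()
        self.depthwise = nn.Conv2d(cin, cin, 3, padding=1, groups=cin)
        self.pointwise = nn.Conv2d(cin, cout, 1)   # 1x1 channel mixing

    def forward(self, x):
        return torch.relu(self.pointwise(self.depthwise(x)))

extractor = nn.Sequential(DepthwiseSeparable(3, 32),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
with torch.no_grad():
    feats = extractor(torch.randn(20, 3, 64, 64)).numpy()
labels = [0, 1] * 10                       # 0 = non-HR, 1 = HR (synthetic)
svm = LinearSVC().fit(feats, labels)
```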

21 pages, 7344 KiB  
Article
Integration of Object-Based Image Analysis and Convolutional Neural Network for the Classification of High-Resolution Satellite Image: A Comparative Assessment
by Omer Saud Azeez, Helmi Z. M. Shafri, Aidi Hizami Alias and Nuzul A. B. Haron
Appl. Sci. 2022, 12(21), 10890; https://doi.org/10.3390/app122110890 - 27 Oct 2022
Cited by 1 | Viewed by 2315
Abstract
During the past decade, deep learning-based classification methods (e.g., convolutional neural networks—CNNs) have demonstrated great success in a variety of vision tasks, including satellite image classification. Deep learning methods, on the other hand, do not preserve the precise edges of the targets of interest and do not extract geometric features such as shape and area. Previous research has attempted to address such issues by combining deep learning with methods such as object-based image analysis (OBIA). Nonetheless, the question of how to integrate those methods into a single framework so that the benefits of each complement the other remains open. To that end, this study compared four integration frameworks in terms of accuracy: OBIA with an artificial neural network (OBIA ANN), feature fusion, decision fusion, and patch filtering. According to the results, patch filtering achieved 0.917 overall accuracy (OA), whereas decision fusion and feature fusion achieved 0.862 OA and 0.860 OA, respectively. The integration of CNN and OBIA can improve classification accuracy; however, the integration framework plays a significant role in this. Future research should focus on optimizing the existing CNN and OBIA frameworks in terms of architecture, as well as investigating how CNN models should use OBIA outputs for feature extraction and classification of remotely sensed images. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
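
Of the four frameworks compared, decision fusion is the easiest to illustrate: per-pixel class probabilities from a CNN are combined with per-object probabilities from an OBIA classifier, broadcast over the segment map. The weighted average, equal weight, and random inputs below are our own formulation, not the study's.

```python
import numpy as np

def decision_fusion(cnn_probs, obia_probs, segments, w=0.5):
    """cnn_probs: (H, W, C); obia_probs: (n_objects, C);
    segments: (H, W) map of object ids. Returns an (H, W) label map."""
    fused = w * cnn_probs + (1 - w) * obia_probs[segments]
    return fused.argmax(axis=-1)

H, W, C, n_obj = 4, 4, 3, 2
segments = np.random.randint(0, n_obj, (H, W))
labels = decision_fusion(np.random.rand(H, W, C),
                         np.random.rand(n_obj, C), segments)
```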

18 pages, 5154 KiB  
Article
Conditional Generative Adversarial Networks with Total Variation and Color Correction for Generating Indonesian Face Photo from Sketch
by Mia Rizkinia, Nathaniel Faustine and Masahiro Okuda
Appl. Sci. 2022, 12(19), 10006; https://doi.org/10.3390/app121910006 - 05 Oct 2022
Cited by 5 | Viewed by 2352
Abstract
Historically, hand-drawn face sketches have been commonly used by Indonesia’s police force, especially to quickly describe a person’s facial features in searching for fugitives based on eyewitness testimony. Several studies have been performed, aiming to increase the effectiveness of the method, such as comparing the facial sketch with the all-points bulletin (DPO in Indonesian terminology) or generating a facial composite. However, making facial composites using an application takes quite a long time. Moreover, when these composites are directly compared to the DPO, the accuracy is insufficient, and thus, the technique requires further development. This study applies a conditional generative adversarial network (cGAN) to convert a face sketch image into a color face photo with an additional Total Variation (TV) term in the loss function to improve the visual quality of the resulting image. Furthermore, we apply a color correction to adjust the resulting skin tone similar to that of the ground truth. The face image dataset was collected from various sources matching Indonesian skin tone and facial features. We aim to provide a method for Indonesian face sketch-to-photo generation to visualize the facial features more accurately than the conventional method. This approach produces visually realistic photos from face sketches, as well as true skin tones. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
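
The Total Variation term added to the generator loss can be written compactly, as in the sketch below. The pix2pix-style adversarial-plus-L1 base loss and all weights are our assumptions; the abstract only states that a TV term is added.

```python
import torch
import torch.nn.functional as F

def total_variation(img):
    """Sum of absolute differences between neighboring pixels (NCHW)."""
    return (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().sum() + \
           (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().sum()

def generator_loss(fake_logits, fake_img, real_img,
                   lambda_l1=100.0, lambda_tv=1e-4):
    adv = F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))  # fool the discriminator
    l1 = F.l1_loss(fake_img, real_img)              # stay close to ground truth
    return adv + lambda_l1 * l1 + lambda_tv * total_variation(fake_img)

loss = generator_loss(torch.randn(2, 1),
                      torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64))
```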

17 pages, 2189 KiB  
Article
Multiple-Stage Knowledge Distillation
by Chuanyun Xu, Nanlan Bai, Wenjian Gao, Tian Li, Mengwei Li, Gang Li and Yang Zhang
Appl. Sci. 2022, 12(19), 9453; https://doi.org/10.3390/app12199453 - 21 Sep 2022
Viewed by 1644
Abstract
Knowledge distillation (KD) is a method in which a teacher network guides the learning of a student network, thereby resulting in an improvement in the performance of the student network. Recent research in this area has concentrated on developing effective definitions of knowledge and efficient methods of knowledge transfer while ignoring the learning ability of the student network. To fully utilize this potential learning ability and improve learning efficiency, this study proposes a multiple-stage KD (MSKD) method that allows students to learn the knowledge delivered by the teacher network in multiple stages. The student network in this method consists of a multi-exit architecture, and the students imitate the output of the teacher network at each exit. The final classification by the student network is achieved through ensemble learning. However, because this results in an unreasonable gap between the number of parameters in the student branch network and those in the teacher branch network, as well as a mismatch in learning capacity between these two networks, we extend the MSKD method to a one-to-one multiple-stage KD method. The experimental results reveal that the proposed method applied to the CIFAR100 and Tiny ImageNet datasets exhibits good performance gains. The proposed method of enhancing KD by changing the style of student learning provides new insight into KD. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
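
A hedged sketch of the multi-exit idea as we read it: every student exit imitates the teacher's softened output and also fits the labels, and the exits are ensembled for the final classification. The temperature and equal loss weighting are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Hinton-style KD: KL divergence between softened distributions."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T

def multi_stage_kd_loss(exit_logits, teacher_logits, labels):
    per_exit = [kd_loss(l, teacher_logits) + F.cross_entropy(l, labels)
                for l in exit_logits]                # one KD term per exit
    return sum(per_exit) / len(per_exit)

def ensemble_exits(exit_logits):
    return torch.stack([F.softmax(l, dim=1) for l in exit_logits]).mean(dim=0)

exits = [torch.randn(8, 100) for _ in range(3)]      # e.g. CIFAR100 logits
print(multi_stage_kd_loss(exits, torch.randn(8, 100),
                          torch.randint(0, 100, (8,))))
```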

26 pages, 8669 KiB  
Article
Content-Based Video Big Data Retrieval with Extensive Features and Deep Learning
by Thuong-Cang Phan, Anh-Cang Phan, Hung-Phi Cao and Thanh-Ngoan Trieu
Appl. Sci. 2022, 12(13), 6753; https://doi.org/10.3390/app12136753 - 03 Jul 2022
Cited by 12 | Viewed by 3743
Abstract
In the era of digital media, the rapidly increasing volume and complexity of multimedia data cause many problems in storing, processing, and querying information in a reasonable time. Feature extraction and processing time play an extremely important role in large-scale video retrieval systems and currently receive much attention from researchers. We, therefore, propose an efficient approach to feature extraction on big video datasets using deep learning techniques. It focuses on the main features, including subtitles, speech, and objects in video frames, by using a combination of three techniques: optical character recognition (OCR), automatic speech recognition (ASR), and object identification with deep learning. We provide three network models developed from Faster R-CNN ResNet, Faster R-CNN Inception ResNet V2, and Single Shot Detector MobileNet V2. The approach is implemented in Spark, the next-generation parallel and distributed computing environment, which reduces the time and space costs of the feature extraction process. Experimental results show that our proposal achieves an accuracy of 96% and a processing time reduction of 50%. This demonstrates the feasibility of the approach for content-based video retrieval systems in a big data context. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
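
The distributed extraction step might look roughly like the PySpark sketch below, where each frame is mapped in parallel to subtitle (OCR), speech (ASR), and object features. The three extractor functions are trivial stand-ins for the recognition models described in the abstract, which are not reproduced here.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("video-feature-extraction").getOrCreate()

def run_ocr(path):        return f"subtitle-text-from-{path}"   # stand-in
def run_asr(path):        return f"speech-text-from-{path}"     # stand-in
def detect_objects(path): return ["object-label"]               # stand-in

frames = spark.sparkContext.parallelize(
    [f"frame_{i:05d}.jpg" for i in range(1000)])
features = frames.map(
    lambda p: {"frame": p, "ocr": run_ocr(p),
               "asr": run_asr(p), "objects": detect_objects(p)})
print(features.take(2))
```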

16 pages, 4364 KiB  
Article
Facial Expressions Based Automatic Pain Assessment System
by Thoria Alghamdi and Gita Alaghband
Appl. Sci. 2022, 12(13), 6423; https://doi.org/10.3390/app12136423 - 24 Jun 2022
Cited by 6 | Viewed by 2361
Abstract
Pain assessment is used to improve patients’ treatment outcomes. Human observers may be influenced by personal factors, such as inexperience, and medical organizations face a shortage of experts. In this study, we developed a facial expressions-based automatic pain assessment system (FEAPAS) to notify medical staff when a patient suffers pain by activating an alarm and recording the incident and pain level with the date and time. The model consists of two identical concurrent subsystems, each of which takes one of the two inputs of the model, i.e., the full face and the upper half of the same face. The subsystems extract the relevant input features via two pre-trained convolutional neural networks (CNNs), using either VGG16, InceptionV3, ResNet50, or ResNeXt50, while freezing all convolutional blocks and replacing the classifier layer with a shallow CNN. The concatenated outputs of this stage are then sent to the model’s classifier. This approach mimics the human observer method and gives more importance to the upper part of the face, similar to the Prkachin and Solomon pain intensity (PSPI) measure. Additionally, we further optimized our models by applying four optimizers (SGD/ADAM/RMSprop/RAdam) to each model and testing them on the UNBC-McMaster shoulder pain expression archive dataset to find the optimal combination, InceptionV3-SGD. The optimal model showed an accuracy of 99.10% on 10-fold cross-validation, thus outperforming the state-of-the-art model on the UNBC-McMaster database. It also scored 90.56% on unseen subject data. To speed up the system response time and reduce unnecessary alarms associated with temporary facial expressions, a select but effective subset of frames was inspected and classified. Two frame-selection criteria were reported. Classifying only two frames at the middle of a 30-frame sequence was optimal, with an average reaction time of at most 6.49 s and the ability to avoid unnecessary alarms. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
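
A simplified sketch of the two-subsystem design: one frozen pre-trained backbone processes the full face, an identical one processes the upper half, and the concatenated features feed a small classifier. VGG16 is one of the backbones named in the abstract; the crop, pooling head, and number of output pain levels are our assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

def frozen_backbone():
    m = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
    for p in m.parameters():
        p.requires_grad = False          # freeze all convolutional blocks
    return nn.Sequential(m, nn.AdaptiveAvgPool2d(1), nn.Flatten())

class TwoStreamPain(nn.Module):
    def __init__(self, n_levels=4):      # assumed number of pain levels
        super().__init__()
        self.full, self.upper = frozen_backbone(), frozen_backbone()
        self.classifier = nn.Linear(512 * 2, n_levels)

    def forward(self, face):
        upper_half = face[:, :, : face.shape[2] // 2, :]  # top-half crop
        feats = torch.cat([self.full(face), self.upper(upper_half)], dim=1)
        return self.classifier(feats)

out = TwoStreamPain()(torch.randn(2, 3, 224, 224))
```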

12 pages, 7688 KiB  
Article
Grouping Bilinear Pooling for Fine-Grained Image Classification
by Rui Zeng and Jingsong He
Appl. Sci. 2022, 12(10), 5063; https://doi.org/10.3390/app12105063 - 17 May 2022
Cited by 2 | Viewed by 1976
Abstract
Fine-grained image classification is a challenging computer vision task due to small inter-class variations and large intra-class variations. Extracting expressive feature representations is an effective way to improve the accuracy of fine-grained image classification. Bilinear pooling is a simple and effective high-order feature interaction method. Compared with common pooling methods, bilinear pooling can obtain better feature representations by capturing complex associations between high-order features. However, the dimensionality of the bilinear representation is often up to hundreds of thousands or even millions. To obtain a compact bilinear representation, we propose grouping bilinear pooling (GBP) for fine-grained image classification in this paper. The feature channels are first divided into different groups, and then intra-group or inter-group bilinear pooling is carried out. The representation captured by GBP can achieve the same accuracy with less than 0.4% of the parameters of the full bilinear representation when using the same backbone. This extremely compact representation largely overcomes the high redundancy, computational cost, and storage consumption of the full bilinear representation. Moreover, because GBP compresses the bilinear representation so drastically, it can be used as a plug-and-play module with more powerful backbones. The effectiveness of GBP is demonstrated by experiments on the widely used fine-grained recognition datasets CUB and Stanford Cars. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
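
Intra-group bilinear pooling, as we understand the abstract, can be sketched in a few lines: the channels are split into G groups, and bilinear pooling (a sum of outer products over spatial locations) is computed inside each group, shrinking the C×C descriptor to G blocks of (C/G)×(C/G). The signed square root and l2 normalization are standard bilinear-pooling post-processing, assumed here rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def grouping_bilinear_pool(feat, groups):
    """feat: (N, C, H, W) with C divisible by groups."""
    n, c, h, w = feat.shape
    x = feat.reshape(n, groups, c // groups, h * w)
    # per-group bilinear pooling: outer products summed over locations
    bp = torch.einsum('ngcl,ngdl->ngcd', x, x) / (h * w)
    bp = bp.reshape(n, -1)
    bp = torch.sign(bp) * torch.sqrt(bp.abs() + 1e-8)   # signed sqrt
    return F.normalize(bp, dim=1)                       # l2 normalization

desc = grouping_bilinear_pool(torch.randn(2, 64, 7, 7), groups=8)
print(desc.shape)   # (2, 512) rather than (2, 4096) for full bilinear
```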

27 pages, 6242 KiB  
Article
Neuroplasticity-Based Pruning Method for Deep Convolutional Neural Networks
by Jose David Camacho, Carlos Villaseñor, Carlos Lopez-Franco and Nancy Arana-Daniel
Appl. Sci. 2022, 12(10), 4945; https://doi.org/10.3390/app12104945 - 13 May 2022
Viewed by 1276
Abstract
In this paper, a new pruning strategy based on the neuroplasticity of biological neural networks is presented. The novel pruning algorithm is inspired by the brain's ability to remap knowledge after injuries to the cerebral cortex. Thus, we propose to simulate induced injuries in the network by pruning full convolutional layers or entire blocks, assuming that the knowledge from the removed segments may be remapped and compressed during the recovery (retraining) process. To reconnect the remaining segments of the network, a translator block is introduced, composed of a pooling layer and a convolutional layer. The pooling layer is optional and is placed to ensure that the spatial dimensions of the feature maps match across the pruned segments. After that, a convolutional layer (simulating the intact cortex) is placed to ensure that the depths of the feature maps match; it is used to remap the removed knowledge. As a result, lightweight, efficient, and accurate sub-networks are created from the base models. Comparative analysis shows that, in contrast to other pruning methods, our approach does not require defining a threshold or metric as the pruning criterion; instead, only the origin and destination of the prune and reconnection points must be determined for the translator connection. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
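
The translator block described above, an optional pooling layer for spatial matching followed by a convolution for depth matching, might be sketched as follows; the layer sizes in the example are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class Translator(nn.Module):
    def __init__(self, cin, cout, spatial_ratio=1):
        super().__init__()
        # optional pooling to match spatial dims across the pruned gap
        self.pool = (nn.MaxPool2d(spatial_ratio)
                     if spatial_ratio > 1 else nn.Identity())
        self.conv = nn.Conv2d(cin, cout, 1)  # "intact cortex": depth remap

    def forward(self, x):
        return torch.relu(self.conv(self.pool(x)))

# reconnect a segment emitting (N, 64, 56, 56) to one expecting (N, 256, 28, 28)
translator = Translator(cin=64, cout=256, spatial_ratio=2)
out = translator(torch.randn(1, 64, 56, 56))   # -> (1, 256, 28, 28)
```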

12 pages, 3036 KiB  
Article
Artificial Intelligence Mortality Prediction Model for Gastric Cancer Surgery Based on Body Morphometry, Nutritional, and Surgical Information: Feasibility Study
by Yousun Ko, Hooyoung Shin, Juneseuk Shin, Hoon Hur, Jimi Huh, Taeyong Park, Kyung Won Kim and In-Seob Lee
Appl. Sci. 2022, 12(8), 3873; https://doi.org/10.3390/app12083873 - 12 Apr 2022
Cited by 3 | Viewed by 1407
Abstract
The objective of this study is to develop a mortality prediction model for patients undergoing gastric cancer surgery based on body morphometry, nutritional, and surgical information. Using a prospectively built gastric surgery registry from the Asan Medical Center (AMC), 621 gastric cancer patients, who were treated with surgery with no recurrence of cancer, were selected for the development of the prediction model. Input features (i.e., body morphometry, nutritional, surgical, and clinicopathologic information) were selected from the collected data based on the XGBoost analysis results and experts’ opinions. A convolutional neural network (CNN) framework was developed to predict the mortality of patients undergoing gastric cancer surgery. Internal validation was performed on split datasets of the AMC, whereas external validation was performed on patients from the Ajou University Hospital. Fifteen features were selected for the prediction of survival probability based on the XGBoost analysis results and experts’ suggestions. The accuracy, F1 score, and area under the curve of our CNN model were 0.900, 0.909, and 0.900 in the internal validation set and 0.879, 0.882, and 0.881 in the external validation set, respectively. Our developed CNN model was published on a website where anyone can predict mortality using individual patients’ data. Our CNN model provides substantially good performance in predicting mortality in patients undergoing surgery for gastric cancer, mainly based on body morphometry, nutritional, and surgical information. Using the web application, clinicians and gastric cancer patients will be able to efficiently manage mortality risk factors. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
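
The XGBoost-based feature-selection step can be illustrated with synthetic stand-in data: a model is fit, features are ranked by importance, and the top 15 are kept, matching the count in the abstract (the experts' input is not modeled here).

```python
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))            # 40 candidate clinical features
y = rng.integers(0, 2, 200)               # survival outcome (synthetic)

model = XGBClassifier(n_estimators=50, max_depth=3, eval_metric="logloss")
model.fit(X, y)
top15 = np.argsort(model.feature_importances_)[::-1][:15]
X_selected = X[:, top15]                   # input for the downstream model
```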

26 pages, 1324 KiB  
Article
Texture and Materials Image Classification Based on Wavelet Pooling Layer in CNN
by Juan Manuel Fortuna-Cervantes, Marco Tulio Ramírez-Torres, Marcela Mejía-Carlos, José Salomé Murguía, José Martinez-Carranza, Carlos Soubervielle-Montalvo and César Arturo Guerra-García
Appl. Sci. 2022, 12(7), 3592; https://doi.org/10.3390/app12073592 - 01 Apr 2022
Cited by 6 | Viewed by 2623
Abstract
Convolutional Neural Networks (CNNs) have recently been proposed as a solution for texture and material classification in computer vision. However, the internal pooling layers of CNNs often cause a loss of information that is detrimental to learning. Moreover, for images with repetitive and essential patterns, the loss of this information affects the performance of subsequent stages, such as feature extraction and analysis. In this paper, to solve this problem, we propose a classification system with a new pooling method called Discrete Wavelet Transform Pooling (DWTP). This method is based on decomposing the image into sub-bands, with the first-level sub-bands taken as the output. The objective is to obtain both approximation and detail information, which can then be concatenated in different combinations. In addition, wavelet pooling uses wavelets to reduce the size of the feature map. Combining these methods provides acceptable classification performance on three databases (CIFAR-10, DTD, and FMD). We argue that this helps eliminate overfitting, and the learning curves indicate that the models generalize across the datasets. Therefore, our results indicate that our method based on wavelet analysis is feasible for texture and material classification, and in some cases it outperforms traditional methods. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
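
As a concrete, hedged illustration of wavelet pooling, the sketch below performs one level of a 2D Haar transform and keeps the first-level sub-bands, halving spatial resolution like ordinary pooling. The paper's DWTP may use other wavelets and sub-band combinations.

```python
import torch

def haar_dwt_pool(x, keep_details=False):
    """x: (N, C, H, W) with even H and W."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2            # approximation sub-band
    if not keep_details:
        return ll
    lh = (a - b + c - d) / 2            # detail sub-bands
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, lh, hl, hh], dim=1)

out = haar_dwt_pool(torch.randn(1, 16, 32, 32), keep_details=True)
print(out.shape)   # (1, 64, 16, 16): four sub-bands, half the resolution
```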

11 pages, 6932 KiB  
Article
Data Augmentation Based on Generative Adversarial Networks to Improve Stage Classification of Chronic Kidney Disease
by Yun-Te Liao, Chien-Hung Lee, Kuo-Su Chen, Chie-Pein Chen and Tun-Wen Pai
Appl. Sci. 2022, 12(1), 352; https://doi.org/10.3390/app12010352 - 30 Dec 2021
Cited by 7 | Viewed by 8034
Abstract
The prevalence of chronic kidney disease (CKD) is estimated to be 13.4% worldwide and 15% in the United States. CKD has been recognized as a leading public health problem worldwide. Unfortunately, as many as 90% of CKD patients do not know that they already have CKD. Ultrasonography is usually the first and most commonly used imaging diagnostic tool for patients at risk of CKD. To provide a consistent assessment of CKD stage classification, this study proposes an auxiliary diagnosis system based on deep learning approaches for renal ultrasound images. The system uses the ACWGAN-GP model together with a pre-trained MobileNetV2 model: the images generated by ACWGAN-GP and the original images are simultaneously input into MobileNetV2 for training. This classification system achieved an accuracy of 81.9% over the four stages of CKD classification. If the prediction results allowed a higher-stage tolerance, the accuracy improved to as much as 90.1%. The proposed deep learning method solves the problem of imbalanced and insufficient data samples during training for an automatic classification system and also improves the prediction accuracy of CKD stage diagnosis. Full article
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)
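
The augmentation strategy, feeding GAN-generated and original images into the classifier together, reduces to concatenating datasets before training, as sketched below. The generator here is a random stand-in, not the ACWGAN-GP model, and the shapes are illustrative.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

real_x = torch.randn(100, 3, 64, 64)
real_y = torch.randint(0, 4, (100,))            # four CKD stages

def fake_generator(n, stage):
    """Stand-in for a conditional GAN sampling n images of one CKD stage."""
    return torch.randn(n, 3, 64, 64), torch.full((n,), stage, dtype=torch.long)

gen_x, gen_y = fake_generator(50, stage=2)      # top up a rare stage
train_set = ConcatDataset([TensorDataset(real_x, real_y),
                           TensorDataset(gen_x, gen_y)])
loader = DataLoader(train_set, batch_size=16, shuffle=True)
```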
