Computer Vision in the Era of Deep Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (28 February 2022) | Viewed by 14,622

Special Issue Editors


Dr. Daniel Paternain
Guest Editor
Institute of Smart Cities, Public University of Navarre, 31006 Pamplona, Spain
Interests: image processing; computer vision; machine learning; deep learning; fuzzy theory

Dr. Mikel Galar
Guest Editor
Institute of Smart Cities, Public University of Navarre, 31006 Pamplona, Spain
Interests: image processing; computer vision; machine learning; deep learning; fuzzy theory

Special Issue Information

Dear Colleagues,

Computer vision is experiencing a revolution in the era of deep learning. Many computer vision tasks that were previously very hard or even impossible to handle are now tractable thanks to the development of new architectures, optimization techniques, computational resources, etc.

This Special Issue intends to compile new developments in the field of computer vision and deep learning, encompassing both theoretical and applied papers. The topics may cover new theoretical results about optimization or regularization techniques, among others, as well as new applications, architectures, or methodologies.

Dr. Daniel Paternain
Dr. Mikel Galar
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • computer vision
  • machine learning

Published Papers (5 papers)


Research

15 pages, 739 KiB  
Article
Optimized Parameter Search Approach for Weight Modification Attack Targeting Deep Learning Models
by Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Raul Orduna-Urrutia and Iñigo Mendialdua
Appl. Sci. 2022, 12(8), 3725; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083725 - 07 Apr 2022
Cited by 1 | Viewed by 1249
Abstract
Deep neural network models have been developed in different fields, bringing many advances in several tasks. However, they have also started to be incorporated into tasks with critical risks, which has led researchers to study possible attacks on these models, uncovering a long list of threats from which every model should be defended. Among these, the weight modification attack has been presented and analyzed by researchers in several versions. It focuses on detecting multiple vulnerable weights which, once modified, misclassify the desired input data. Therefore, analyzing the different approaches to this attack helps in understanding how to defend against such a vulnerability. This work presents a new version of the weight modification attack. Our approach is based on three processes: input data clusterization, weight selection, and modification of the weights. Data clusterization allows a directed attack on a selected class. Weight selection uses the gradient given by the input data to identify the most-vulnerable parameters. The modifications are incorporated in each step via limited noise. Finally, this paper shows how this new version of the fault injection attack is capable of completely misclassifying the desired cluster, reducing the targeted cluster's accuracy from 100% to 0–2.7%, while the rest of the data continues to be well-classified. Therefore, it demonstrates that this attack is a real threat to neural networks. Full article
(This article belongs to the Special Issue Computer Vision in the Era of Deep Learning)
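The three-step attack described in the abstract (cluster the inputs, select the most-vulnerable weights by gradient, perturb them with bounded noise) can be illustrated with a minimal sketch. This is not the authors' code: the toy logistic-regression "model", the noise bound eps, and the step count are illustrative assumptions.

```python
import math

def predict(w, x):
    # Toy one-layer model: logistic regression over a feature vector.
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def grad_loss(w, cluster, label):
    # Gradient of the cross-entropy loss summed over the targeted cluster.
    g = [0.0] * len(w)
    for x in cluster:
        err = predict(w, x) - label
        for i, xi in enumerate(x):
            g[i] += err * xi
    return g

def attack(w, cluster, label, k=2, eps=0.5, steps=20):
    # Repeatedly pick the k most-vulnerable weights (largest |gradient|)
    # and nudge each by a noise step, clipped to [-eps, eps], in the
    # direction that increases the loss on the targeted cluster.
    w = list(w)
    for _ in range(steps):
        g = grad_loss(w, cluster, label)
        worst = sorted(range(len(w)), key=lambda i: -abs(g[i]))[:k]
        for i in worst:
            w[i] += max(-eps, min(eps, g[i]))  # gradient ascent on the loss
    return w

# Clean model classifies the targeted cluster as class 1...
w0 = [1.0, 1.0]
cluster = [[1.0, 0.5], [0.8, 0.9], [1.2, 0.7]]
assert all(predict(w0, x) > 0.5 for x in cluster)

# ...and after the weight modification the whole cluster flips to class 0.
w_bad = attack(w0, cluster, label=1.0)
assert all(predict(w_bad, x) < 0.5 for x in cluster)
```

In a real deep network the same idea applies per-layer, with the gradient taken through backpropagation rather than the closed form used here.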

16 pages, 14621 KiB  
Article
A New Deep Learning Model for the Classification of Poisonous and Edible Mushrooms Based on Improved AlexNet Convolutional Neural Network
by Wacharaphol Ketwongsa, Sophon Boonlue and Urachart Kokaew
Appl. Sci. 2022, 12(7), 3409; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073409 - 27 Mar 2022
Cited by 8 | Viewed by 4878
Abstract
The difficulty involved in distinguishing between edible and poisonous mushrooms stems from their similar appearances. In this study, we attempted to classify five common species of poisonous and edible mushrooms found in Thailand, Inocybe rimosa, Amanita phalloides, Amanita citrina, Russula delica, and Phaeogyroporus portentosus, using the convolutional neural network (CNN) and region convolutional neural network (R-CNN). This study was motivated by the yearly death toll from eating poisonous mushrooms in Thailand. In this research, a method for the classification of edible and poisonous mushrooms was proposed and the testing time and accuracy of three pretrained models, AlexNet, ResNet-50, and GoogLeNet, were compared. The proposed model was found to reduce the duration required for training and testing while retaining a high level of accuracy. In the mushroom classification experiments using CNN and R-CNN, the proposed model demonstrated accuracy levels of 98.50% and 95.50%, respectively. Full article
(This article belongs to the Special Issue Computer Vision in the Era of Deep Learning)

14 pages, 1004 KiB  
Article
Transformers in Pedestrian Image Retrieval and Person Re-Identification in a Multi-Camera Surveillance System
by Muhammad Tahir and Saeed Anwar
Appl. Sci. 2021, 11(19), 9197; https://0-doi-org.brum.beds.ac.uk/10.3390/app11199197 - 02 Oct 2021
Cited by 5 | Viewed by 2885
Abstract
Person Re-Identification is an essential task in computer vision, particularly in surveillance applications. The aim is to identify a person based on an input image from surveillance photographs in various scenarios. Most person re-ID techniques utilize Convolutional Neural Networks (CNNs); however, Vision Transformers are replacing pure CNNs in various computer vision tasks such as object recognition, classification, etc. Vision transformers capture information about local regions of the image, and current techniques take advantage of this to improve the accuracy of the task at hand. We propose to use vision transformers in conjunction with vanilla CNN models to investigate the true strength of transformers in person re-identification. We employ three backbones with different combinations of vision transformers on two benchmark datasets. The overall performance of the backbones increased, showing the importance of vision transformers. We provide ablation studies and show the importance of various components of the vision transformers in re-identification tasks. Full article
(This article belongs to the Special Issue Computer Vision in the Era of Deep Learning)

18 pages, 3934 KiB  
Article
Multi-Class Strategies for Joint Building Footprint and Road Detection in Remote Sensing
by Christian Ayala, Carlos Aranda and Mikel Galar
Appl. Sci. 2021, 11(18), 8340; https://0-doi-org.brum.beds.ac.uk/10.3390/app11188340 - 08 Sep 2021
Cited by 3 | Viewed by 2173
Abstract
Building footprints and road networks are important inputs for a great deal of services. For instance, building maps are useful for urban planning, whereas road maps are essential for disaster response services. Traditionally, building and road maps are manually generated by remote sensing experts or land surveying, occasionally assisted by semi-automatic tools. In the last decade, deep learning-based approaches have demonstrated their capabilities to extract these elements automatically and accurately from remote sensing imagery. The building footprint and road network detection problem can be considered a multi-class semantic segmentation task, that is, a single model performs a pixel-wise classification on multiple classes, optimizing the overall performance. However, depending on the spatial resolution of the imagery used, both classes may coexist within the same pixel, drastically reducing their separability. In this regard, binary decomposition techniques, which have been widely studied in the machine learning literature, have proved useful for addressing multi-class problems. Accordingly, the multi-class problem can be split into multiple binary semantic segmentation sub-problems, specializing different models for each class. Nevertheless, in these cases, an aggregation step is required to obtain the final output labels. Additionally, other novel approaches, such as multi-task learning, may come in handy to further increase the performance of the binary semantic segmentation models. Since there is no certainty as to which strategy should be carried out to accurately tackle a multi-class remote sensing semantic segmentation problem, this paper performs an in-depth study to shed light on the issue. For this purpose, open-access Sentinel-1 and Sentinel-2 imagery (at 10 m) are considered for extracting buildings and roads, making use of the well-known U-Net convolutional neural network. It is worth stressing that building and road classes may coexist within the same pixel when working at such a low spatial resolution, setting a challenging problem scheme. Accordingly, a robust experimental study is developed to assess the benefits of the decomposition strategies and their combination with a multi-task learning scheme. The obtained results demonstrate that decomposing the considered multi-class remote sensing semantic segmentation problem into multiple binary ones using a One-vs.-All binary decomposition technique leads to better results than the standard direct multi-class approach. Additionally, the benefits of using a multi-task learning scheme for pushing the performance of binary segmentation models are also shown. Full article
(This article belongs to the Special Issue Computer Vision in the Era of Deep Learning)
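The One-vs.-All decomposition and its aggregation step can be sketched in a few lines. This is an illustrative stand-in, not the paper's pipeline: each binary "model" is represented only by its per-pixel probability map, and the aggregation assigns each pixel to the class whose binary model scores highest.

```python
# Hypothetical class set for a building/road extraction problem.
CLASSES = ["background", "building", "road"]

def aggregate_ova(score_maps):
    """score_maps: dict mapping class name -> 2D list of binary
    probabilities (one map per specialized binary model).
    Returns a 2D list of per-pixel class labels (argmax aggregation)."""
    first = next(iter(score_maps.values()))
    h, w = len(first), len(first[0])
    labels = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            labels[i][j] = max(CLASSES, key=lambda c: score_maps[c][i][j])
    return labels

# Hypothetical 2x2 outputs of three independent binary segmentation models.
scores = {
    "background": [[0.9, 0.2], [0.1, 0.6]],
    "building":   [[0.3, 0.8], [0.2, 0.1]],
    "road":       [[0.1, 0.4], [0.7, 0.5]],
}
labels = aggregate_ova(scores)
assert labels == [["background", "building"], ["road", "background"]]
```

In practice each score map would come from a separate U-Net trained on a binary class-vs.-rest mask, and the argmax would be applied over full-resolution probability rasters.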

20 pages, 605 KiB  
Article
A Study of OWA Operators Learned in Convolutional Neural Networks
by Iris Dominguez-Catena, Daniel Paternain and Mikel Galar
Appl. Sci. 2021, 11(16), 7195; https://0-doi-org.brum.beds.ac.uk/10.3390/app11167195 - 04 Aug 2021
Cited by 5 | Viewed by 1830
Abstract
Ordered Weighted Averaging (OWA) operators have been integrated in Convolutional Neural Networks (CNNs) for image classification through the OWA layer. This layer lets the CNN integrate global information about the image in the early stages, where most CNN architectures only allow for the exploitation of local information. As a side effect of this integration, the OWA layer becomes a practical method for the determination of OWA operator weights, which is usually a difficult task that complicates the integration of these operators in other fields. In this paper, we explore the weights learned for the OWA operators inside the OWA layer, characterizing them through their basic properties of orness and dispersion. We also compare them to some families of OWA operators, namely the Binomial OWA operator, the Stancu OWA operator and the exponential RIM OWA operator, finding examples that are currently impossible to generalize through these parameterizations. Full article
(This article belongs to the Special Issue Computer Vision in the Era of Deep Learning)
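The two measures used to characterize the learned weights, orness and dispersion, have standard closed forms. The sketch below assumes Yager's usual definitions for an OWA weight vector w = (w_1, ..., w_n) with non-negative entries summing to 1, applied to inputs sorted in decreasing order: orness(w) = (1/(n-1)) * sum_i (n-i) * w_i, and dispersion(w) = -sum_i w_i * ln(w_i).

```python
import math

def orness(w):
    # Orness: 1 for the maximum operator, 0 for the minimum,
    # 0.5 for the arithmetic mean.
    n = len(w)
    return sum((n - i) * wi for i, wi in enumerate(w, start=1)) / (n - 1)

def dispersion(w):
    # Dispersion (entropy of the weights): how evenly the operator
    # spreads attention across the sorted inputs.
    return -sum(wi * math.log(wi) for wi in w if wi > 0)

# The arithmetic mean has orness 0.5 and maximal dispersion ln(n);
# the maximum operator w = (1, 0, ..., 0) has orness 1 and dispersion 0.
mean4 = [0.25] * 4
assert abs(orness(mean4) - 0.5) < 1e-12
assert abs(dispersion(mean4) - math.log(4)) < 1e-12
assert orness([1.0, 0.0, 0.0, 0.0]) == 1.0
assert dispersion([1.0, 0.0, 0.0, 0.0]) == 0.0
```

Applied to the weight vectors learned inside an OWA layer, these two numbers place each learned operator on the orness/dispersion plane, where it can be compared against parametric families such as the Binomial or Stancu OWA operators.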
