Recent Advances in Algorithms for Computer Vision Applications

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Combinatorial Optimization, Graph, and Network Algorithms".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 14174

Special Issue Editor

Dr. Guanqiu Qi
Department of Computer Information Systems, State University of New York at Buffalo State, Buffalo, NY 14222, USA
Interests: computer vision; image processing; pattern recognition; machine learning

Special Issue Information

Dear Colleagues,

Multi-source visual information fusion and quality improvement can enhance our ability to perceive the real world. Image fusion combines multi-source images from multiple sensors into a single synthesized image that provides a more comprehensive or reliable description of the scene. Quality-improvement techniques can be used to address the challenges of low-quality image analysis.

Many brain-inspired solutions have been proposed to accomplish these two tasks, and the artificial neural network, one of the most popular techniques, is widely used in image fusion and quality improvement. As this is an active research field, many interesting issues remain to be explored, such as deep few-shot learning, unsupervised learning, the application of embodied neural systems, and industrial applications.

Potential topics of interest for this Special Issue include (but are not limited to) the following areas:

  • Image acquisition;
  • Image quality analysis;
  • Image filtering, restoration and enhancement;
  • Image segmentation;
  • Biomedical image processing;
  • Color image processing;
  • Multispectral image processing;
  • Hardware and architectures for image processing;
  • Image databases;
  • Image retrieval and indexing;
  • Image compression;
  • Low-level and high-level image description;
  • Mathematical methods in image processing, analysis and representation;
  • Artificial intelligence tools in image analysis;
  • Pattern recognition algorithms applied for images;
  • Practical applications of image processing, analysis and recognition algorithms in medicine, surveillance, biometrics, document analysis, multimedia, intelligent transportation systems, stereo vision, remote sensing, computer vision, robotics and other fields.

Dr. Guanqiu Qi
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (10 papers)


Research

12 pages, 2334 KiB  
Article
CentralBark Image Dataset and Tree Species Classification Using Deep Learning
by Charles Warner, Fanyou Wu, Rado Gazo, Bedrich Benes, Nicole Kong and Songlin Fei
Algorithms 2024, 17(5), 179; https://0-doi-org.brum.beds.ac.uk/10.3390/a17050179 - 27 Apr 2024
Viewed by 426
Abstract
The task of tree species classification through deep learning has been challenging for the forestry community, and the lack of standardized datasets has hindered further progress. Our work presents a solution in the form of a large bark image dataset called CentralBark, which enhances deep learning-based tree species classification. Additionally, we have laid out an efficient and repeatable data collection protocol to assist future works in an organized manner. The dataset contains images of 25 central hardwood and Appalachian region tree species, with over 19,000 images captured under varying diameter, light, and moisture conditions. We tested 25 species: elm, oak, American basswood, American beech, American elm, American sycamore, bitternut hickory, black cherry, black locust, black oak, black walnut, eastern cottonwood, hackberry, honey locust, northern red oak, Ohio buckeye, Osage-orange, pignut hickory, sassafras, shagbark hickory, silver maple, slippery elm, sugar maple, sweetgum, white ash, white oak, and yellow poplar. Our experiment involved testing three different models to assess the feasibility of species classification using unaltered and uncropped images during the species-classification training process. We achieved an overall accuracy of 83.21% using the EfficientNet-b3 model, which was the best of the three models (EfficientNet-b3, ResNet-50, and MobileNet-V3-small), and an average accuracy of 80.23%.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
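
As a rough illustration of the training setup this abstract describes, the sketch below fine-tunes a pretrained EfficientNet-b3 on a folder of bark images. The directory layout, hyperparameters, and training loop are placeholder assumptions, not the authors' code:

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_SPECIES = 25  # species count reported in the abstract

# Standard ImageNet preprocessing at EfficientNet-b3's native 300x300 input.
transform = transforms.Compose([
    transforms.Resize((300, 300)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# Assumed layout: bark_images/<species_name>/<image>.jpg
dataset = datasets.ImageFolder("bark_images", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

model = models.efficientnet_b3(weights="IMAGENET1K_V1")
# Swap the ImageNet head for a 25-way species classifier.
model.classifier[1] = nn.Linear(model.classifier[1].in_features, NUM_SPECIES)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
model.train()
for images, labels in loader:  # one epoch, for brevity
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Swapping in ResNet-50 or MobileNet-V3-small, the other two models compared in the paper, would only change the backbone and head lines.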

17 pages, 1872 KiB  
Article
A Multi-Stage Method for Logo Detection in Scanned Official Documents Based on Image Processing
by María Guijarro, Juan Bayon, Daniel Martín-Carabias and Joaquín Recas
Algorithms 2024, 17(4), 170; https://0-doi-org.brum.beds.ac.uk/10.3390/a17040170 - 22 Apr 2024
Viewed by 340
Abstract
A logotype is a rectangular region defined by a set of characteristics, derived from pixel information and region shape, that differ from those of text. In this paper, a new method for automatic logo detection is proposed and tested using the public Tobacco800 database. Our method outputs a set of regions from an official document with a high probability of containing a logo, using a new approach based on a variation of the feature-rectangles method available in the literature. Candidate regions were computed using the longest increasing run algorithm over the indices of the document's blank lines. Those regions were further refined by using a feature-rectangle-expansion method with forward checking, where the rectangle expansion can occur in parallel in each region. Finally, a C4.5 decision tree was trained and tested against a set of 1291 official documents to evaluate its performance. The strategic combination of the three previous steps offers a precision and recall for logo detection of 98.9% and 89.9%, respectively, while also being robust to noise and low-quality documents. The method is also able to reduce the processing area of the document while maintaining a low percentage of false negatives.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
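
The candidate-region step above hinges on runs over blank-line indices. A hedged reconstruction of a longest-consecutive-run helper follows; it is an assumption about the intended primitive, not the authors' implementation:

```python
# Find the longest run of consecutive blank-line indices, which would bound
# the largest non-text band of the page (a candidate logo region boundary).
def longest_consecutive_run(indices):
    """Return (start, end) of the longest run of consecutive integers."""
    best = cur = (indices[0], indices[0])
    for prev, nxt in zip(indices, indices[1:]):
        if nxt == prev + 1:
            cur = (cur[0], nxt)   # extend the current run
        else:
            cur = (nxt, nxt)      # start a new run
        if cur[1] - cur[0] > best[1] - best[0]:
            best = cur
    return best

# Example: blank rows 3-5 form the longest band, suggesting a region break.
print(longest_consecutive_run([0, 3, 4, 5, 9, 10]))  # -> (3, 5)
```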

13 pages, 4268 KiB  
Article
Effect of the Light Environment on Image-Based SPAD Value Prediction of Radish Leaves
by Yuto Kamiwaki and Shinji Fukuda
Algorithms 2024, 17(1), 16; https://0-doi-org.brum.beds.ac.uk/10.3390/a17010016 - 29 Dec 2023
Viewed by 1212
Abstract
This study aims to clarify the influence of photographic environments under different light sources on image-based SPAD value prediction. The input variables for the SPAD value prediction using Random Forests, XGBoost, and LightGBM were RGB values, HSL values, HSV values, light color temperature (LCT), and illuminance (ILL). Model performance was assessed using Pearson's correlation coefficient (COR), Nash–Sutcliffe efficiency (NSE), and root mean squared error (RMSE). In particular, SPAD value prediction with Random Forests achieved high accuracy in a stable light environment; the COR values for the RGB+ILL+LCT and HSL+ILL+LCT feature sets were 0.929 and 0.922, respectively. Image-based SPAD value prediction was effective under halogen light with a similar color temperature at dusk; the COR values for RGB+ILL and HSL+ILL were 0.895 and 0.876, respectively. The HSL values under LED light could be used to predict the SPAD value with high accuracy across all performance measures. The results support the applicability of SPAD value prediction using Random Forests under a wide range of lighting conditions, such as dusk, by training a model on data collected under different illuminance conditions and various light sources. Further studies are required to examine this method under outdoor conditions in spatiotemporally dynamic light environments.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
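
A minimal sketch of the kind of prediction pipeline described above, using scikit-learn's RandomForestRegressor and the Pearson COR metric; the features here are synthetic stand-ins, whereas the study used measured color, illuminance, and color-temperature values:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Columns stand in for R, G, B, illuminance (ILL), color temperature (LCT).
X = rng.uniform(size=(200, 5))
y = 40 + 10 * X[:, 1] - 5 * X[:, 3] + rng.normal(scale=1.0, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
cor, _ = pearsonr(y_te, model.predict(X_te))  # COR, as reported in the paper
print(f"Pearson COR: {cor:.3f}")
```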

15 pages, 8312 KiB  
Article
A Lightweight Graph Neural Network Algorithm for Action Recognition Based on Self-Distillation
by Miao Feng and Jean Meunier
Algorithms 2023, 16(12), 552; https://0-doi-org.brum.beds.ac.uk/10.3390/a16120552 - 01 Dec 2023
Viewed by 1399
Abstract
Recognizing human actions can help in numerous ways, such as health monitoring, intelligent surveillance, virtual reality and human–computer interaction. A quick and accurate detection algorithm is required for daily real-time detection. This paper first proposes to generate a lightweight graph neural network by [...] Read more.
Recognizing human actions can help in numerous ways, such as health monitoring, intelligent surveillance, virtual reality and human–computer interaction. A quick and accurate detection algorithm is required for daily real-time detection. This paper first proposes to generate a lightweight graph neural network by self-distillation for human action recognition tasks. The lightweight graph neural network was evaluated on the NTU-RGB+D dataset. The results demonstrate that, with competitive accuracy, the heavyweight graph neural network can be compressed by up to 80%. Furthermore, the learned representations have denser clusters, estimated by the Davies–Bouldin index, the Dunn index and silhouette coefficients. The ideal input data and algorithm capacity are also discussed. Full article
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
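
Distillation-based compression of the kind described above typically rests on a soft-target loss. A minimal sketch of such a loss follows; the temperature, weighting, and T-squared scaling are standard choices (Hinton et al.) assumed for illustration, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend soft teacher targets with the hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In self-distillation the teacher and student share the same architecture family, with the student here being the pruned, lightweight graph network.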

13 pages, 8724 KiB  
Article
Cloud Detection and Tracking Based on Object Detection with Convolutional Neural Networks
by Jose Antonio Carballo, Javier Bonilla, Jesús Fernández-Reche, Bijan Nouri, Antonio Avila-Marin, Yann Fabel and Diego-César Alarcón-Padilla
Algorithms 2023, 16(10), 487; https://0-doi-org.brum.beds.ac.uk/10.3390/a16100487 - 19 Oct 2023
Cited by 3 | Viewed by 1481
Abstract
Because solar renewable technologies need advance knowledge of solar resource availability, this paper presents a new methodology based on computer vision and object detection with convolutional neural networks (the EfficientDet-D2 model) to detect clouds in image series. The methodology also calculates the speed and direction of cloud motion, which allows the prediction of transients in the available solar radiation due to clouds. The convolutional neural network model retraining and validation process finished successfully and gave accurate cloud detection results in the test. During the test, the estimation of the remaining time before a cloud-induced transient was also accurate, mainly due to the precise cloud detection and the accuracy of the remaining-time algorithm.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
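
Given per-frame detections from the network, speed and direction follow from centroid displacement between frames. A hedged sketch, with the pixel-to-metre scale treated as an assumed calibration parameter:

```python
import math

def cloud_motion(c0, c1, dt, metres_per_pixel=1.0):
    """Speed (m/s) and bearing (degrees) from centroid c0 to c1 over dt seconds."""
    dx, dy = c1[0] - c0[0], c1[1] - c0[1]
    speed = math.hypot(dx, dy) * metres_per_pixel / dt
    direction = math.degrees(math.atan2(dy, dx)) % 360
    return speed, direction

# Example: a centroid moved 30 px right and 40 px down over 10 s.
print(cloud_motion((100, 100), (130, 140), dt=10.0, metres_per_pixel=2.0))
# -> (10.0, 53.13...)
```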

15 pages, 11652 KiB  
Article
Ascertaining the Ideality of Photometric Stereo Datasets under Unknown Lighting
by Elisa Crabu, Federica Pes, Giuseppe Rodriguez and Giuseppa Tanda
Algorithms 2023, 16(8), 375; https://0-doi-org.brum.beds.ac.uk/10.3390/a16080375 - 05 Aug 2023
Cited by 1 | Viewed by 984
Abstract
The standard photometric stereo model makes several assumptions that are rarely verified in experimental datasets. In particular, the observed object should behave as a Lambertian reflector, and the light sources should be positioned at an infinite distance from it, along a known direction. Even when Lambert's law is approximately fulfilled, an accurate assessment of the relative position between the light source and the target is often unavailable in real situations. The Hayakawa procedure is a computational method for estimating such information directly from the image data. It occasionally breaks down when some of the available images deviate excessively from ideality. This is generally due to observing a non-Lambertian surface, illuminating it from a close distance, or both. Indeed, in narrow shooting scenarios, typical of, e.g., archaeological excavation sites, it is impossible to position a flashlight at a sufficient distance from the observed surface. It is then necessary to understand whether a given dataset is reliable and which images should be selected to better reconstruct the target. In this paper, we propose some algorithms to perform this task and explore their effectiveness.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
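
Under the ideal model named above (Lambertian surface, distant point lights), the matrix of stacked images factors through the rank-3 lighting-times-normals product that Hayakawa-style methods exploit. One plausible ideality check, offered as an assumption rather than as the paper's actual algorithm, gauges deviation via singular-value decay:

```python
import numpy as np

def ideality_score(images):
    """images: array of shape (num_images, num_pixels), num_images >= 4.

    For ideal Lambertian data under distant lights the stack is near rank 3,
    so the ratio of the 4th to the 3rd singular value is close to zero.
    """
    s = np.linalg.svd(images, compute_uv=False)
    return s[3] / s[2]

def looks_ideal(images, tol=0.05):  # tolerance is an assumed threshold
    return ideality_score(images) < tol
```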

12 pages, 2788 KiB  
Article
Vessel Velocity Estimation and Docking Analysis: A Computer Vision Approach
by João V. R. de Andrade, Bruno J. T. Fernandes, André R. L. C. Izídio, Nilson M. da Silva Filho and Francisco Cruz
Algorithms 2023, 16(7), 326; https://0-doi-org.brum.beds.ac.uk/10.3390/a16070326 - 30 Jun 2023
Cited by 1 | Viewed by 1300
Abstract
The opportunities for leveraging technology to enhance the efficiency of vessel port activities are vast. Applying video analytics to model and optimize certain processes offers a remarkable way to improve overall operations. Within the realm of vessel port activities, two crucial processes are vessel approximation and docking. This work focuses on developing a vessel velocity estimation model and a docking mooring analytical system using a computer vision approach. The study introduces algorithms for speed estimation and mooring bitt detection, leveraging techniques such as the Structural Similarity Index (SSIM) for precise image comparison. The obtained results highlight the effectiveness of the proposed algorithms, demonstrating satisfactory speed estimation capabilities and successful identification of tied cables on the mooring bitts. These advancements pave the way for enhanced safety and efficiency in vessel docking procedures. However, further research is necessary to address challenges related to occlusions and illumination variations and to explore additional techniques that enhance the models' performance and applicability in real-world scenarios.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
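
A minimal sketch of SSIM-based patch comparison, the similarity measure named in the abstract; here it flags whether a mooring-bitt image patch changed between frames (e.g., a cable appearing). The file paths and decision threshold are placeholders:

```python
import cv2
from skimage.metrics import structural_similarity as ssim

frame_a = cv2.imread("bitt_t0.png", cv2.IMREAD_GRAYSCALE)
frame_b = cv2.imread("bitt_t1.png", cv2.IMREAD_GRAYSCALE)

score, diff = ssim(frame_a, frame_b, full=True)  # 1.0 means identical patches
if score < 0.8:  # assumed threshold
    print(f"Patch changed (SSIM={score:.2f}): possible cable on the bitt")
```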

21 pages, 6194 KiB  
Article
Fusion of CCTV Video and Spatial Information for Automated Crowd Congestion Monitoring in Public Urban Spaces
by Vivian W. H. Wong and Kincho H. Law
Algorithms 2023, 16(3), 154; https://0-doi-org.brum.beds.ac.uk/10.3390/a16030154 - 10 Mar 2023
Cited by 2 | Viewed by 2126
Abstract
Crowd congestion is one of the main causes of modern public safety issues such as stampedes. Conventional crowd congestion monitoring using closed-circuit television (CCTV) video surveillance relies on manual observation, which is tedious and often error-prone in public urban spaces where crowds are dense and occlusions are prominent. With the aim of managing crowded spaces safely, this study proposes a framework that combines spatial and temporal information to automatically map the trajectories of individual occupants, as well as to assist in real-time congestion monitoring and prediction. By exploiting both features from CCTV footage and spatial information about the public space, the framework fuses raw CCTV video and floor plan information to create visual aids for crowd monitoring, as well as a sequence of crowd mobility graphs (CMGraphs) to store spatiotemporal features. The framework uses deep learning-based computer vision models, geometric transformations, and Kalman filter-based tracking algorithms to automate the retrieval of crowd congestion data, specifically the spatiotemporal distribution of individuals and the overall crowd flow. The resulting collective crowd movement data are then stored in the CMGraphs, which are designed to facilitate congestion forecasting at key exit/entry regions. We demonstrate our framework on two video datasets, one from a public train station dataset and the other recorded at a stadium following a crowded football game. Using both qualitative and quantitative insights from the experiments, we demonstrate that the suggested framework can help urban planners and infrastructure operators manage congestion hazards.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
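
Fusing CCTV detections with a floor plan, as described above, is commonly done through a homography. A hedged sketch of that projection step, with made-up calibration correspondences a user would measure per camera:

```python
import numpy as np
import cv2

# Corresponding points: (pixel in CCTV frame) -> (x, y on the floor plan).
img_pts = np.float32([[100, 400], [500, 400], [550, 100], [80, 120]])
plan_pts = np.float32([[0, 0], [10, 0], [10, 20], [0, 20]])
H, _ = cv2.findHomography(img_pts, plan_pts)

# Project tracked foot positions (e.g., bottom-centre of person boxes).
feet = np.float32([[[300, 380]], [[420, 350]]])  # shape (N, 1, 2)
plan_xy = cv2.perspectiveTransform(feet, H)
print(plan_xy.reshape(-1, 2))  # floor-plan coordinates of each person
```

Projected positions of this kind could then feed a Kalman-filter tracker and the per-region counts stored in the CMGraphs.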

31 pages, 29405 KiB  
Article
Assessing the Mass Transfer Coefficient in Jet Bioreactors with Classical Computer Vision Methods and Neural Networks Algorithms
by Irina Nizovtseva, Vladimir Palmin, Ivan Simkin, Ilya Starodumov, Pavel Mikushin, Alexander Nozik, Timur Hamitov, Sergey Ivanov, Sergey Vikharev, Alexei Zinovev, Vladislav Svitich, Matvey Mogilev, Margarita Nikishina, Simon Kraev, Stanislav Yurchenko, Timofey Mityashin, Dmitrii Chernushkin, Anna Kalyuzhnaya and Felix Blyakhman
Algorithms 2023, 16(3), 125; https://0-doi-org.brum.beds.ac.uk/10.3390/a16030125 - 21 Feb 2023
Cited by 2 | Viewed by 1826
Abstract
Development of energy-efficient and high-performance bioreactors requires progress in methods for assessing the key parameters of the biosynthesis process. Despite the wide variety of approaches for determining the phase contact area in gas–liquid flows, the question of obtaining an accurate quantitative estimate remains open. Particularly challenging are the issues of obtaining information about mass transfer coefficients instantly, as well as developing predictive capabilities for effective flow control in continuous fermentation at both laboratory and industrial scales. Motivated by the opportunity to apply classical and non-classical computer vision methods to high-precision video recordings of bubble flows obtained during experiments in the bioreactor vessel, we obtained the results presented in this paper. Characteristics of the bioreactor's bubble flow were first estimated by classical computer vision (CCV) methods, including an elliptic regression approach for selecting and clustering single-bubble boundaries, image transformation through a set of filters, and an algorithm for separating overlapping bubbles. Applying the developed method to the entire video recording makes it possible to obtain parameter distributions and set dropout thresholds, yielding better estimates through averaging. The developed CCV methodology was also tested and verified on a manually collected and labeled dataset. A deep neural network (NN) approach was then applied to the segmentation task and demonstrated advantages in segmentation resolution, while the classical approach tends to be faster. Thus, the current manuscript discusses both the advantages and disadvantages of the classical computer vision (CCV) and neural network (NN) approaches based on an evaluation of the number of bubbles and their areas. An approach to estimating the mass transfer coefficient from the obtained results is also presented.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
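
A minimal sketch of the elliptic-fitting step in the CCV pipeline described above: threshold a frame, extract contours, fit ellipses, and accumulate bubble areas. The thresholds are assumptions, and the overlapping-bubble separation the authors describe is not handled here:

```python
import cv2
import numpy as np

frame = cv2.imread("bubble_frame.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

areas = []
for c in contours:
    if len(c) >= 5:  # cv2.fitEllipse requires at least 5 contour points
        (cx, cy), (w, h), angle = cv2.fitEllipse(c)
        areas.append(np.pi * (w / 2) * (h / 2))  # ellipse area in px^2

print(f"{len(areas)} bubbles, mean area {np.mean(areas):.1f} px^2")
```

Per-frame area distributions of this kind are the raw material for the phase-contact-area and mass-transfer-coefficient estimates the abstract mentions.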

17 pages, 495 KiB  
Article
Audiovisual Biometric Network with Deep Feature Fusion for Identification and Text Prompted Verification
by Juan Carlos Atenco, Juan Carlos Moreno and Juan Manuel Ramirez
Algorithms 2023, 16(2), 66; https://0-doi-org.brum.beds.ac.uk/10.3390/a16020066 - 19 Jan 2023
Cited by 2 | Viewed by 1811
Abstract
In this work we present a bimodal multitask network for audiovisual biometric recognition. The proposed network fuses features extracted from face and speech data through a weighted sum to jointly optimize the contribution of each modality, aiming at the identification of a client. The extracted speech features are simultaneously used in a speech recognition task with random digit sequences. Text-prompted verification is performed by fusing the scores obtained from matching the bimodal embeddings with the Word Error Rate (WER) metric calculated from the accuracy of the transcriptions. The score fusion outputs a value that can be compared with a threshold to accept or reject the identity of a client. Training and evaluation were carried out using our proprietary database BIOMEX-DB and the VidTIMIT audiovisual database. Our network achieved an accuracy of 100% and an Equal Error Rate (EER) of 0.44% for identification and verification, respectively, in the best case. To the best of our knowledge, this is the first system that combines the mutually related tasks previously described for biometric recognition.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
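
A minimal sketch of a learnable weighted-sum fusion of face and speech embeddings, in the spirit of the abstract; the embedding dimension, the single scalar weight, and the sigmoid gating are assumptions for illustration:

```python
import torch
import torch.nn as nn

class WeightedSumFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(0.5))  # modality weight, learned

    def forward(self, face_emb, speech_emb):
        a = torch.sigmoid(self.w)  # keep the weight in (0, 1)
        return a * face_emb + (1 - a) * speech_emb

fusion = WeightedSumFusion()
fused = fusion(torch.randn(8, 256), torch.randn(8, 256))
print(fused.shape)  # torch.Size([8, 256])
```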
