Advances in Computer Vision, Pattern Recognition, Machine Learning and Symmetry

A special issue of Symmetry (ISSN 2073-8994). This special issue belongs to the section "Computer".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 50066

Special Issue Editors


Dr. João Ruivo Paulo
Guest Editor
Institute of Systems and Robotics, University of Coimbra, 3004-531 Coimbra, Portugal
Interests: rehabilitation robotics; assistive robotics; medical engineering; applied machine learning

Dr. Cristina P. Santos
Guest Editor
Center for MicroElectroMechanical Systems (CMEMS), University of Minho, 4710-057 Braga, Portugal
Interests: human motion; human locomotion; human–robot interactions and collaboration; medical devices; neuro-rehabilitation of patients suffering from motor problems by means of bio-inspired robotics and neuroscience technologies

Dr. Gabriel Pires
Guest Editor
1. Institute of Systems and Robotics, University of Coimbra, 3030-290 Coimbra, Portugal
2. Polytechnic Institute of Tomar, 2300-313 Tomar, Portugal
Interests: human-machine interface; brain-computer interface; biosignal processing; assistive robotics

Special Issue Information

Dear Colleagues,

Machine intelligence is no longer a science-fiction utopia but a very present reality, and it continues to evolve rapidly within the fields of computer vision, pattern recognition, machine learning, and symmetry. Keeping up with the abundance of new publications presenting the most recent advances in each field is a daunting task. This Special Issue is therefore dedicated to presenting and aggregating recent advances in these research fields, across a wide range of applications, such as industry, medicine, robotics, biotechnology, and mechanical engineering, as well as in fundamental and theoretical forms.

Please note that all submitted papers must be within the general scope of the Symmetry journal.

Dr. João Ruivo Paulo
Dr. Cristina P. Santos
Dr. Gabriel Pires
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • pattern recognition
  • machine learning
  • symmetry
  • machine intelligence applied in industry, medicine, and biotechnology
  • intelligence in biomedical engineering
  • intelligent robotic systems
  • autonomous driving systems
  • data mining

Published Papers (35 papers)


Research

15 pages, 4396 KiB  
Article
YOLO-RDP: Lightweight Steel Defect Detection through Improved YOLOv7-Tiny and Model Pruning
by Guiheng Zhang, Shuxian Liu, Shuaiqi Nie and Libo Yun
Symmetry 2024, 16(4), 458; https://doi.org/10.3390/sym16040458 - 10 Apr 2024
Viewed by 347
Abstract
During steel manufacturing, surface defects such as scratches, scale, and oxidation can compromise product quality and safety. Detecting these defects accurately is critical for production efficiency and product integrity. However, current target detection algorithms are often too resource-intensive for deployment on edge devices with limited computing resources. To address this challenge, we propose YOLO-RDP, an enhanced YOLOv7-tiny model. YOLO-RDP integrates RexNet, a lightweight network, for feature extraction, and employs GSConv and VOV-GSCSP modules to enhance the network’s neck layer, reducing parameter count and computational complexity. Additionally, we designed a dual-headed object detection head called DdyHead with a symmetric structure, composed of two complementary object detection heads, greatly enhancing the model’s ability to recognize minor defects. Further model optimization through pruning achieves additional lightweighting. Experimental results demonstrate the superiority of our model, with improvements in mAP values of 3.7% and 3.5% on the NEU-DET and GC10-DET datasets, respectively, alongside reductions in parameter count and computation by 40% and 30%, and 25% and 24%, respectively. Full article
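
For readers unfamiliar with GSConv, the block pairs a standard convolution with a depthwise convolution and mixes the two halves with a channel shuffle. The PyTorch sketch below illustrates that general idea only; it is not the YOLO-RDP implementation, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class GSConvSketch(nn.Module):
    """Illustrative GSConv-style block: a standard conv produces half the
    output channels, a depthwise conv produces the other half, and a
    channel shuffle mixes the two groups."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        c_half = c_out // 2
        self.conv = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.dwconv = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y1 = self.conv(x)
        y2 = self.dwconv(y1)
        y = torch.cat((y1, y2), dim=1)                   # (N, c_out, H, W)
        n, c, h, w = y.shape
        y = y.view(n, 2, c // 2, h, w).transpose(1, 2)   # channel shuffle
        return y.reshape(n, c, h, w)

x = torch.randn(1, 64, 80, 80)
print(GSConvSketch(64, 128)(x).shape)                    # torch.Size([1, 128, 80, 80])
```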

21 pages, 1406 KiB  
Article
Improved Weed Detection in Cotton Fields Using Enhanced YOLOv8s with Modified Feature Extraction Modules
by Doudou Ren, Wenzhong Yang, Zhifeng Lu, Danny Chen and Houwang Shi
Symmetry 2024, 16(4), 450; https://doi.org/10.3390/sym16040450 - 07 Apr 2024
Viewed by 456
Abstract
Weed detection plays a crucial role in enhancing cotton agricultural productivity. However, the detection process is subject to challenges such as target scale diversity and loss of leaf symmetry due to leaf shading. Hence, this research presents an enhanced model, EY8-MFEM, for detecting weeds in cotton fields. Firstly, the ALGA module is proposed, which combines the local and global information of feature maps through weighting operations to better focus on their spatial information. Following this, the C2F-ALGA module is developed to augment the feature extraction capability of the underlying backbone network. Secondly, the MDPM module is proposed to generate attention matrices by capturing the horizontal and vertical information of feature maps, reducing duplicate information in the feature maps. Finally, we replace the upsampling module of YOLOv8 with the CARAFE module to provide better upsampling performance. Extensive experiments on two publicly available datasets showed that the F1, mAP50, and mAP75 metrics improved by 1.2%, 5.1%, and 2.9% on one dataset and by 3.8%, 1.3%, and 2.2% on the other, compared to the baseline model. This study showcases the algorithm’s potential for practical applications in weed detection within cotton fields, promoting the significant development of artificial intelligence in the field of agriculture. Full article

20 pages, 7775 KiB  
Article
LW-YOLO: Lightweight Deep Learning Model for Fast and Precise Defect Detection in Printed Circuit Boards
by Zhaohui Yuan, Xiangyang Tang, Hao Ning and Zhengzhe Yang
Symmetry 2024, 16(4), 418; https://doi.org/10.3390/sym16040418 - 03 Apr 2024
Viewed by 424
Abstract
Printed circuit board (PCB) manufacturing processes are becoming increasingly complex, where even minor defects can impair product performance and yield rates. Precisely identifying PCB defects is critical but remains challenging. Traditional PCB defect detection methods, such as visual inspection and automated technologies, have limitations. While defects can be readily identified based on symmetry, the operational aspect proves to be quite challenging. Deep learning has shown promise in defect detection; however, current deep learning models for PCB defect detection still face issues like large model size, slow detection speed, and suboptimal accuracy. This paper proposes a lightweight YOLOv8 (You Only Look Once version 8)-based model called LW-YOLO (Lightweight You Only Look Once) to address these limitations. Specifically, LW-YOLO incorporates a bidirectional feature pyramid network for multiscale feature fusion, a Partial Convolution module to reduce redundant calculations, and a Minimum Point Distance Intersection over Union loss function to simplify optimization and improve accuracy. Based on the experimental data, LW-YOLO achieved an mAP0.5 of 96.4%, which is 2.2 percentage points higher than YOLOv8; the precision reached 97.1%, surpassing YOLOv8 by 1.7 percentage points; and at the same time, LW-YOLO achieved an FPS of 141.5. The proposed strategies effectively enhance efficiency and accuracy for deep-learning-based PCB defect detection. Full article
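
The Minimum Point Distance IoU mentioned above augments plain IoU with the squared distances between corresponding box corners, normalized by the image size. The sketch below follows that published formulation; the box coordinates and image size are illustrative, and this is not the LW-YOLO code.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """MPDIoU loss for axis-aligned boxes given as (x1, y1, x2, y2)."""
    # plain IoU
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + 1e-9)
    # squared distances between the top-left and bottom-right corner pairs,
    # normalized by the squared image diagonal
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou - d1 / norm - d2 / norm)

print(mpdiou_loss((10, 10, 50, 50), (12, 8, 48, 52), 640, 640))
```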

19 pages, 4238 KiB  
Article
Symmetry Breaking in the U-Net: Hybrid Deep-Learning Multi-Class Segmentation of HeLa Cells in Reflected Light Microscopy Images
by Ali Ghaznavi, Renata Rychtáriková, Petr Císař, Mohammad Mehdi Ziaei and Dalibor Štys
Symmetry 2024, 16(2), 227; https://doi.org/10.3390/sym16020227 - 13 Feb 2024
Viewed by 710
Abstract
Multi-class segmentation of unlabelled living cells in time-lapse light microscopy images is challenging due to the temporal behaviour and changes in cell life cycles and the complexity of these images. The deep-learning-based methods achieved promising outcomes and remarkable success in single- and multi-class medical and microscopy image segmentation. The main objective of this study is to develop a hybrid deep-learning-based categorical segmentation and classification method for living HeLa cells in reflected light microscopy images. A symmetric simple U-Net and three asymmetric hybrid convolution neural networks—VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net—were proposed and mutually compared to find the most suitable architecture for multi-class segmentation of our datasets. The inception module in the Inception-U-Net contained kernels with different sizes within the same layer to extract all feature descriptors. The series of residual blocks with the skip connections in each ResNet34-U-Net’s level alleviated the gradient vanishing problem and improved the generalisation ability. The m-IoU scores of multi-class segmentation for our datasets reached 0.7062, 0.7178, 0.7907, and 0.8067 for the simple U-Net, VGG19-U-Net, Inception-U-Net, and ResNet34-U-Net, respectively. For each class and the mean value across all classes, the most accurate multi-class semantic segmentation was achieved using the ResNet34-U-Net architecture (evaluated as the m-IoU and Dice metrics). Full article
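
The m-IoU scores quoted above are per-class intersection-over-union values averaged over the classes present. A minimal NumPy sketch of that metric for integer label maps follows; it is illustrative only and not the authors' evaluation script.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for two integer label
    maps of equal shape."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:            # class absent from both masks: skip it
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 3, (256, 256))
target = np.random.randint(0, 3, (256, 256))
print(mean_iou(pred, target, num_classes=3))
```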

22 pages, 10385 KiB  
Article
Single-View 3D Reconstruction via Differentiable Rendering and Inverse Procedural Modeling
by Albert Garifullin, Nikolay Maiorov, Vladimir Frolov and Alexey Voloboy
Symmetry 2024, 16(2), 184; https://doi.org/10.3390/sym16020184 - 04 Feb 2024
Viewed by 1150
Abstract
Three-dimensional models, reconstructed from real-life objects, are extensively used in virtual and mixed reality technologies. In this paper we propose an approach to 3D model reconstruction via inverse procedural modeling and describe two variants of this approach. The first option is to fit a set of input parameters using a genetic algorithm. The second option allows us to significantly improve precision by using gradients within the memetic algorithm, differentiable rendering, and differentiable procedural generators. We demonstrate the results of our work on different models, including trees, which are complex objects that most existing methods cannot reconstruct. In our work, we see two main contributions. First, we propose a method to join differentiable rendering and inverse procedural modeling. This gives us the ability to reconstruct 3D models more accurately than existing approaches when few input images are available, even for a single image. Second, we combine both differentiable and non-differentiable procedural generators into a single framework that allows us to apply inverse procedural modeling to fairly complex generators. We show that both variants of our approach can be useful: the differentiable one is more precise but puts limitations on the procedural generator, while the one based on genetic algorithms can be used with any existing generator. The proposed approach uses information about the symmetry and structure of the object to achieve high-quality reconstruction from a single image. Full article
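
The genetic-algorithm variant can be pictured as a black-box search over the procedural generator's parameter vector, scored by how closely the rendered candidate matches the input photo. The sketch below shows only that generic loop, with a toy quadratic loss standing in for the real render-and-compare objective; nothing here is taken from the authors' code.

```python
import numpy as np

def fit_parameters(loss_fn, dim, pop=32, gens=100, sigma=0.1, seed=0):
    """Tiny genetic-algorithm-style search over a parameter vector in [0, 1]^dim:
    keep the best quarter of the population, refill by mutating the survivors."""
    rng = np.random.default_rng(seed)
    population = rng.uniform(0, 1, (pop, dim))
    for _ in range(gens):
        scores = np.array([loss_fn(p) for p in population])
        parents = population[np.argsort(scores)[: pop // 4]]              # selection
        children = parents[rng.integers(0, len(parents), pop - len(parents))]
        children = np.clip(children + rng.normal(0, sigma, children.shape), 0, 1)
        population = np.vstack([parents, children])                        # mutation
    return population[np.argmin([loss_fn(p) for p in population])]

# toy stand-in for "difference between the rendered candidate and the photo"
target = np.array([0.3, 0.7, 0.5])
best = fit_parameters(lambda p: np.sum((p - target) ** 2), dim=3)
print(best)
```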

19 pages, 7376 KiB  
Article
Siamese Tracking Network with Spatial-Semantic-Aware Attention and Flexible Spatiotemporal Constraint
by Huanlong Zhang, Panyun Wang, Jie Zhang, Fengxian Wang, Xiaohui Song and Hebin Zhou
Symmetry 2024, 16(1), 61; https://doi.org/10.3390/sym16010061 - 03 Jan 2024
Viewed by 810
Abstract
Siamese trackers based on classification and regression have drawn extensive attention due to their appropriate balance between accuracy and efficiency. However, most of them are prone to failure in the face of abrupt motion or appearance changes. This paper proposes a Siamese-based tracker that incorporates spatial-semantic-aware attention and flexible spatiotemporal constraint. First, we develop a spatial-semantic-aware attention model, which identifies the importance of each feature region and channel to target representation through the single convolution attention network with a loss function and increases the corresponding weights in the spatial and channel dimensions to reinforce the target region and semantic information on the target feature map. Secondly, considering that the traditional method unreasonably weights the target response in abrupt motion, we design a flexible spatiotemporal constraint. This constraint adaptively adjusts the constraint weights on the response map by evaluating the tracking result. Finally, we propose a new template updating strategy. This strategy adaptively adjusts the contribution weights of the tracking result to the new template using depth correlation assessment criteria, thereby enhancing the reliability of the template. The Siamese network used in this paper is a symmetric neural network with dual input branches sharing weights. The experimental results on five challenging datasets show that our method outperformed other advanced algorithms. Full article

26 pages, 21717 KiB  
Article
Simple Hybrid Camera-Based System Using Two Views for Three-Dimensional Body Measurements
by Mohammad Montazerian and Frederic Fol Leymarie
Symmetry 2024, 16(1), 49; https://doi.org/10.3390/sym16010049 - 29 Dec 2023
Viewed by 979
Abstract
Using a single RGB camera to obtain accurate body dimensions, rather than measuring these manually or via more complex multicamera systems or more expensive 3D scanners, has a high application potential for the apparel industry. We present a system that estimates upper human body measurements using a hybrid set of techniques from both classic computer vision and recent machine learning. The main steps involve (1) using a camera to obtain two views (frontal and side); (2) isolating in the image pair a set of main body parts; (3) improving the image quality; (4) extracting body contours and features from the images of body parts; (5) indicating markers on these images; (6) performing a calibration step; and (7) producing refined final 3D measurements. We favour a unique geometric shape, that of an ellipse, to approximate human body main horizontal cross-sections. We focus on the more challenging parts of the body, i.e., the upper body from the head to the hips, which, we show, can be well represented by varying an ellipse’s eccentricity for each individual. Then, evaluating each fitted ellipse’s perimeter allows us to obtain better results than the current state-of-the-art methods for use in the fashion and online retail industry. In our study, we selected a set of two equations, out of many other possible choices, to best estimate upper human body section circumferences. We experimented with the system on a diverse sample of 78 female participants. The results for the upper human body measurements in comparison to the traditional manual method of tape measurements, when used as a reference, show ±1 cm average differences, which are sufficient for many applications, including online retail. Full article
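
Since each horizontal cross-section is approximated by an ellipse whose axes come from the frontal and side views, a circumference estimate follows directly from an ellipse-perimeter formula. The sketch below uses Ramanujan's first approximation purely as an illustration; the paper's calibrated equations may differ.

```python
import math

def ellipse_circumference(width, depth):
    """Ramanujan's first approximation to an ellipse perimeter, using the
    frontal width and the side-view depth of a body cross-section as the
    two axes (both in the same unit, e.g. centimetres)."""
    a, b = width / 2.0, depth / 2.0
    h = ((a - b) / (a + b)) ** 2
    return math.pi * (a + b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))

# e.g. a waist seen as 30 cm wide from the front and 22 cm deep from the side
print(round(ellipse_circumference(30, 22), 1), "cm")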

16 pages, 884 KiB  
Article
Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
by Man Zhang, Xing Li and Qianhan Wu
Symmetry 2023, 15(12), 2177; https://doi.org/10.3390/sym15122177 - 08 Dec 2023
Viewed by 772
Abstract
Human action recognition (HAR) as the most representative human-centred computer vision task is critical in human resource management (HRM), especially in human resource recruitment, performance appraisal, and employee training. Currently, prevailing approaches to human action recognition primarily emphasize either temporal or spatial features while overlooking the intricate interplay between these two dimensions. This oversight leads to less precise and robust action classification within complex human resource recruitment environments. In this paper, we propose a novel human action recognition methodology for human resource recruitment environments, which aims at symmetrically harnessing temporal and spatial information to enhance the performance of human action recognition. Specifically, we compute Depth Motion Maps (DMM) and Depth Temporal Maps (DTM) from depth video sequences as space and time descriptors, respectively. Subsequently, a novel feature fusion technique named Center Boundary Collaborative Canonical Correlation Analysis (CBCCCA) is designed to enhance the fusion of space and time features by collaboratively learning the center and boundary information of feature class space. We then introduce a spatio-temporal information filtration module to remove redundant information introduced by spatio-temporal fusion and retain discriminative details. Finally, a Support Vector Machine (SVM) is employed for human action recognition. Extensive experiments demonstrate that the proposed method has the ability to significantly improve human action recognition performance. Full article
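
The overall pipeline can be pictured as canonical-correlation fusion of the spatial (DMM) and temporal (DTM) descriptors followed by an SVM. The sketch below uses plain CCA from scikit-learn on random stand-in features; the paper's CBCCCA adds class center and boundary terms that are not reproduced here.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import SVC

# random stand-ins for DMM (spatial) and DTM (temporal) descriptors
rng = np.random.default_rng(0)
X_space, X_time = rng.normal(size=(200, 64)), rng.normal(size=(200, 48))
y = rng.integers(0, 5, 200)

# plain CCA fusion + SVM classifier
cca = CCA(n_components=16).fit(X_space, X_time)
Zs, Zt = cca.transform(X_space, X_time)
clf = SVC(kernel="rbf").fit(np.hstack([Zs, Zt]), y)
print(clf.score(np.hstack([Zs, Zt]), y))
```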

22 pages, 4021 KiB  
Article
A Novel Lightweight Object Detection Network with Attention Modules and Hierarchical Feature Pyramid
by Shengying Yang, Linfeng Chen, Junxia Wang, Wuyin Jin and Yunxiang Yu
Symmetry 2023, 15(11), 2080; https://doi.org/10.3390/sym15112080 - 17 Nov 2023
Viewed by 845
Abstract
Object detection methods based on deep learning typically require devices with ample computing capabilities, which limits their deployment in restricted environments such as those with embedded devices. To address this challenge, we propose Mini-YOLOv4, a lightweight real-time object detection network that achieves an excellent trade-off between speed and accuracy. Based on CSPDarknet-Tiny as the backbone network, we enhance the detection performance of the network in three ways. We use a multibranch structure embedded in an attention module for simultaneous spatial and channel attention calibration. We design a group self-attention block with a symmetric structure consisting of a pair of complementary self-attention modules to mine contextual information, thereby ensuring that the detection accuracy is improved without increasing the computational cost. Finally, we introduce a hierarchical feature pyramid network to fully exploit multiscale feature maps and promote the extraction of fine-grained features. The experimental results demonstrate that Mini-YOLOv4 requires only 4.7 M parameters and has a billion floating point operations (BFLOPs) value of 3.1. Compared with YOLOv4-Tiny, our approach achieves a 3.2% improvement in mean average precision (mAP) for the PASCAL VOC dataset and obtains a significant improvement of 3.5% in overall detection accuracy for the MS COCO dataset. In testing with an embedded platform, Mini-YOLOv4 achieves a real-time detection speed of 25.6 FPS on the NVIDIA Jetson Nano, thus meeting the demand for real-time detection in computationally limited devices. Full article

13 pages, 2759 KiB  
Article
Action Recognition Based on GCN with Adjacency Matrix Generation Module and Time Domain Attention Mechanism
by Rong Yang, Junyu Niu, Ying Xu, Yun Wang and Li Qiu
Symmetry 2023, 15(10), 1954; https://doi.org/10.3390/sym15101954 - 23 Oct 2023
Viewed by 999
Abstract
Different from other computer vision tasks, action recognition needs to process larger-scale video data. How to extract and analyze the effective parts from a huge amount of video information is the main difficulty of action recognition technology. In recent years, due to the outstanding performance of Graph Convolutional Networks (GCN) in many fields, a new class of solutions for action recognition has emerged. However, in current GCN models, the constant physical adjacency matrix makes it difficult to mine synergistic relationships between key points that are not directly connected in physical space. Additionally, a simple time connection of skeleton data from different frames makes each frame in the video contribute equally to the recognition results, which increases the difficulty of distinguishing action stages. In this paper, the information extraction ability of the model has been optimized in the space domain and time domain, respectively. In the space domain, an Adjacency Matrix Generation (AMG) module, which can pre-analyze node sets and generate an adaptive adjacency matrix, has been proposed. The adaptive adjacency matrix can help the graph convolution model to extract the synergistic information between the key points that are crucial for recognition. In the time domain, the Time Domain Attention (TDA) mechanism has been designed to calculate the time-domain weight vector through double pooling channels and weight the key point sequences accordingly. Furthermore, the performance of the improved TDA-AMG-GCN model has been verified on the NTU-RGB+D dataset. Its detection accuracy on the CS and CV splits reached 84.5% and 89.8%, respectively, higher on average than other commonly used detection methods. Full article
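
The idea of an adaptive adjacency matrix can be illustrated by adding a learnable term to the fixed skeleton adjacency inside a graph convolution, so that joints which are not physically connected can still exchange information. The PyTorch layer below is a generic sketch of this kind, not the paper's AMG module.

```python
import torch
import torch.nn as nn

class AdaptiveGraphConv(nn.Module):
    """Graph convolution over skeleton joints where a learnable matrix is
    added to the fixed physical adjacency before row-wise normalization."""
    def __init__(self, in_ch, out_ch, adjacency):
        super().__init__()
        self.register_buffer("A_phys", adjacency)            # fixed skeleton links
        self.A_learn = nn.Parameter(torch.zeros_like(adjacency))
        self.fc = nn.Linear(in_ch, out_ch)

    def forward(self, x):                 # x: (batch, joints, in_ch)
        A = torch.softmax(self.A_phys + self.A_learn, dim=-1)
        return torch.relu(self.fc(A @ x))

A = torch.eye(25) + torch.rand(25, 25).round()   # toy 25-joint adjacency
layer = AdaptiveGraphConv(3, 64, A)
print(layer(torch.randn(8, 25, 3)).shape)        # torch.Size([8, 25, 64])
```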

17 pages, 1621 KiB  
Article
Symmetric Graph-Based Visual Question Answering Using Neuro-Symbolic Approach
by Jiyoun Moon
Symmetry 2023, 15(9), 1713; https://doi.org/10.3390/sym15091713 - 07 Sep 2023
Viewed by 812
Abstract
As the applications of robots expand across a wide variety of areas, high-level task planning considering human–robot interactions is emerging as a critical issue. Various elements that facilitate flexible responses to humans in an ever-changing environment, such as scene understanding, natural language processing, and task planning, are thus being researched extensively. In this study, a visual question answering (VQA) task was examined in detail from among an array of technologies. By further developing conventional neuro-symbolic approaches, environmental information is stored and utilized in a symmetric graph format, which enables more flexible and complex high-level task planning. We construct a symmetric graph composed of information such as color, size, and position for the objects constituting the environmental scene. VQA, using graphs, largely consists of a part expressing a scene as a graph, a part converting a question into SPARQL, and a part reasoning the answer. The proposed method was verified using a public dataset, CLEVR, with which it successfully performed VQA. We were able to directly confirm the process of inferring answers using SPARQL queries converted from the original queries and environmental symmetric graph information, which is distinct from existing methods that make it difficult to trace the path to finding answers. Full article
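
To make the graph-plus-SPARQL pipeline concrete, the sketch below builds a toy CLEVR-style scene graph with rdflib and answers a question with a hand-written SPARQL query. The namespace, attribute names, and query are illustrative stand-ins for the representations the paper generates automatically.

```python
from rdflib import Graph, Literal, Namespace

# toy scene graph: two objects with color, size, and a spatial relation
EX = Namespace("http://example.org/scene/")
g = Graph()
g.add((EX.obj1, EX.color, Literal("red")))
g.add((EX.obj1, EX.size, Literal("large")))
g.add((EX.obj2, EX.color, Literal("blue")))
g.add((EX.obj2, EX.size, Literal("small")))
g.add((EX.obj1, EX.left_of, EX.obj2))

# "What color is the object to the left of the small object?"
q = """
SELECT ?color WHERE {
    ?x ex:left_of ?y .
    ?y ex:size "small" .
    ?x ex:color ?color .
}"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.color)          # -> red
```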

22 pages, 8553 KiB  
Article
Efficient DCNN-LSTM Model for Fault Diagnosis of Raw Vibration Signals: Applications to Variable Speed Rotating Machines and Diverse Fault Depths Datasets
by Muhammad Ahsan and Mostafa M. Salah
Symmetry 2023, 15(7), 1413; https://doi.org/10.3390/sym15071413 - 14 Jul 2023
Cited by 1 | Viewed by 969
Abstract
Bearings are the backbone of industrial machines; a fault in them can shut down or damage an entire process. Therefore, health diagnosis and fault identification in the bearings are essential to avoid a sudden shutdown. Vibration signals from the rotating bearings are extensively used to diagnose the health of industrial machines as well as to analyze their symmetrical behavior. When a fault occurs in the bearings, deviations from their symmetrical behavior can be indicative of potential faults. However, fault identification is challenging when (1) the vibration signals are recorded from variable speeds compared to the constant speed and (2) the vibration signals have diverse fault depths. In this work, we have proposed a highly accurate Deep Convolution Neural Network (DCNN)–Long Short-Term Memory (LSTM) model with a SoftMax classifier. The proposed model offers an innovative approach to fault diagnosis, as it obviates the need for preprocessing and digital signal processing techniques for feature computation. It demonstrates remarkable efficiency in accurately diagnosing fault conditions across variable speed vibration datasets encompassing diverse fault conditions, including but not limited to outer race fault, inner race fault, ball fault, and mixed faults, as well as constant speed datasets with varying fault depths. The proposed method extracts features automatically from these vibration signals and is therefore well suited to enhancing the performance and efficiency of machine health diagnosis. For the experimental study, two different datasets—the constant speed with different fault depths and variable speed rotating machines—are considered to validate the performance of the proposed method. The accuracy achieved for the variable speed rotating machine dataset is 99.40%, while for the diverse fault dataset, the accuracy reaches 99.87%. Furthermore, the experimental results of the proposed method are compared with the existing methods in the literature as well as the artificial neural network (ANN) model. Full article
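
Architecturally, models of the kind described stack a 1-D convolutional feature extractor on raw vibration windows, feed the resulting sequence to an LSTM, and classify with a softmax head. The PyTorch sketch below shows one generic configuration; the layer sizes and class count are assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class DCNNLSTMSketch(nn.Module):
    """1-D CNN feature extractor over raw vibration samples, followed by an
    LSTM and a linear classification head (softmax applied in the loss)."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2))
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                 # x: (batch, 1, samples)
        f = self.cnn(x).permute(0, 2, 1)  # (batch, time, channels)
        _, (h, _) = self.lstm(f)
        return self.head(h[-1])           # class logits

print(DCNNLSTMSketch()(torch.randn(2, 1, 2048)).shape)  # torch.Size([2, 4])
```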

17 pages, 8814 KiB  
Article
A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application
by Yiyi Liu, Yuxin Wang and Hongjian Shi
Symmetry 2023, 15(4), 849; https://doi.org/10.3390/sym15040849 - 02 Apr 2023
Cited by 5 | Viewed by 2581
Abstract
Optical character recognition (OCR) is the process of acquiring text and layout information through analysis and recognition of text data image files. It is also a process to identify the geometric location and orientation of the texts and their symmetrical behavior. It usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, license plates, etc. Unlike traditional document category photographs, it is a challenging task to use computer technology to locate and read text information in natural scenes. Imaging sequence recognition is a longstanding subject of research in the field of computer vision. Great progress has been made in this field; however, most models struggle to recognize text in images of complex scenes with high accuracy. This paper proposes a new text recognition approach based on the convolutional recurrent neural network (CRNN) to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, a text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed an experimental analysis of the proposed algorithm and carried out simulations on complex scene image data from the existing literature, as well as on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that our proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. This model overcomes the original CRNN’s inability to identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model with higher accuracy across a wider variety of application scenarios. Full article

15 pages, 7820 KiB  
Article
Cross-Correlation Fusion Graph Convolution-Based Object Tracking
by Liuyi Fan, Wei Chen and Xiaoyan Jiang
Symmetry 2023, 15(3), 771; https://doi.org/10.3390/sym15030771 - 21 Mar 2023
Viewed by 1343
Abstract
Most popular graph attention networks treat pixels of a feature map as individual nodes, which makes the feature embedding extracted by the graph convolution lack the integrity of the object. Moreover, matching between a template graph and a search graph using only part-level information usually causes tracking errors, especially in occlusion and similarity situations. To address these problems, we propose a novel end-to-end graph attention tracking framework that has high symmetry, combining traditional cross-correlation operations directly. By utilizing cross-correlation operations, we effectively compensate for the dispersion of graph nodes and enhance the representation of features. Additionally, our graph attention fusion model performs both part-to-part matching and global matching, allowing for more accurate information embedding in the template and search regions. Furthermore, we optimize the information embedding between the template and search branches to achieve better single-object tracking results, particularly in occlusion and similarity scenarios. The flexibility of graph nodes and the comprehensiveness of information embedding have brought significant performance improvements in our framework. Extensive experiments on three challenging public datasets (LaSOT, GOT-10k, and VOT2016) show that our tracker outperforms other state-of-the-art trackers. Full article

20 pages, 2383 KiB  
Article
MSG-Point-GAN: Multi-Scale Gradient Point GAN for Point Cloud Generation
by Bingxu Wang, Jinhui Lan and Jiangjiang Gao
Symmetry 2023, 15(3), 730; https://doi.org/10.3390/sym15030730 - 15 Mar 2023
Cited by 3 | Viewed by 1570
Abstract
The generative adversarial network (GAN) has recently emerged as a promising generative model. Its application in the image field has been extensive, but there has been little research concerning point clouds. The combination of a GAN and a graph convolutional network has been the state-of-the-art method for generating point clouds. However, there is a significant gap between the generated point cloud and the point cloud used for training. In order to improve the quality of the generated point cloud, this study proposed multi-scale gradient point GAN (MSG-Point-GAN). The training of the GAN is a dynamic game process, and we expected the generation and discrimination capabilities to be symmetric, so that the network training would be more stable. Based on the concept of progressive growth, this method used the network structure of a multi-scale gradient GAN (MSG-GAN) to stabilize the training process. The discriminator of this method used part of the PointNet structure to resolve the problem of the disorder and rotation of the point cloud. The discriminator could effectively determine the authenticity of the generated point cloud. This study also analyzed the optimization process of the objective function of the MSG-Point-GAN. The experimental results showed that the training process of the MSG-Point-GAN was stable, and the point cloud quality was superior to other methods in subjective vision. From the perspective of performance metrics, the gap between the point cloud generated by the proposed method and the real point cloud was significantly smaller than that generated by other methods. In a practical test, using the point clouds generated by the proposed method to train a point-cloud classification network improved accuracy by about 0.2% compared to the original network. The proposed method provided a stable training framework for point cloud generation. It can effectively promote the development of point-cloud-generation technology. Full article

18 pages, 3127 KiB  
Article
Breast Cancer Diagnosis in Thermography Using Pre-Trained VGG16 with Deep Attention Mechanisms
by Alia Alshehri and Duaa AlSaeed
Symmetry 2023, 15(3), 582; https://doi.org/10.3390/sym15030582 - 23 Feb 2023
Cited by 2 | Viewed by 2459
Abstract
One of the most prevalent cancers in women is breast cancer. The mortality rate related to this disease can be decreased by early, accurate diagnosis to increase the chance of survival. Infrared thermal imaging is one of the breast imaging modalities in which the temperature of the breast tissue is measured using a screening tool. Previous studies did not use pre-trained deep learning (DL) models with deep attention mechanisms (AMs) on thermographic images for breast cancer diagnosis. Using thermal images from the Database for Mastology Research with Infrared Image (DMR-IR), this study investigates whether a pre-trained Visual Geometry Group network with 16 layers (VGG16) combined with AMs can produce good diagnostic performance on thermal images of breast cancer. The symmetry of the three models, resulting from the combination of VGG16 with three types of AMs, is evident at every stage of the methodology. The models were compared to state-of-the-art breast cancer diagnosis approaches and tested for accuracy, sensitivity, specificity, precision, F1-score, AUC score, and Cohen’s kappa. The test accuracy rates for the AMs using the VGG16 model on the breast thermal dataset were encouraging, at 99.80%, 99.49%, and 99.32%. Test accuracy for VGG16 without AMs was 99.18%, whereas test accuracy for VGG16 with AMs improved by 0.62%. The proposed approaches also performed better than previous approaches examined in the related studies. Full article

28 pages, 5242 KiB  
Article
MODeLING.Vis: A Graphical User Interface Toolbox Developed for Machine Learning and Pattern Recognition of Biomolecular Data
by Jorge Emanuel Martins, Davide D’Alimonte, Joana Simões, Sara Sousa, Eduardo Esteves, Nuno Rosa, Maria José Correia, Mário Simões and Marlene Barros
Symmetry 2023, 15(1), 42; https://doi.org/10.3390/sym15010042 - 23 Dec 2022
Viewed by 1433
Abstract
Many scientific publications that affect machine learning have set the basis for pattern recognition and symmetry. In this paper, we revisit the concept of “Mind-life continuity” published by the authors, testing the symmetry between cognitive and electrophoretic strata. We opted for machine learning to analyze and understand the total protein profile of neurotypical subjects acquired by capillary electrophoresis. Capillary electrophoresis offers a cost-effective solution but lacks the discriminative and quantification power of modern proteomic techniques. To compensate for this limitation, we developed tools for better data visualization and exploration in this work. These tools allowed us to better examine the total protein profile of 92 healthy young adults, aged 19 to 25, university students at the University of Lisbon, with no serious, uncontrolled, or chronic diseases affecting the nervous system. As a result, we created a graphical user interface toolbox named MODeLING.Vis, which showed specific expected protein profiles present in saliva in our neurotypical sample. The developed toolbox permitted data exploration and hypothesis testing of the biomolecular data. In conclusion, this analysis offered the data mining of the acquired neuroproteomics data in the molecular weight range from 9.1 to 30 kDa. This molecular weight range, obtained by pattern recognition of our dataset, is characteristic of the small neuroimmune molecules and neuropeptides. Consequently, MODeLING.Vis offers a machine-learning solution for probing into the neurocognitive response. Full article

16 pages, 2780 KiB  
Article
Image Virtual Viewpoint Generation Method under Hole Pixel Information Update
by Ling Leng, Changlun Gao, Fangren Zhang, Dan Li, Weijie Zhang, Ting Gao, Zhiheng Zeng, Luxin Tang, Qing Luo and Yuxin Duan
Symmetry 2023, 15(1), 34; https://doi.org/10.3390/sym15010034 - 23 Dec 2022
Viewed by 1234
Abstract
A virtual viewpoint generation method is proposed to address the problem of low fidelity in the generation of virtual viewpoints for images with overlapping pixel points. Virtual viewpoint generation factors such as overlaps, holes, cracks, and artifacts are analyzed and preprocessed. When the background of the hole is a simple texture, pheromone information around the hole is used as the support, a pixel at the edge of the hole is detected, and the hole is predicted at the same time, so that the hole area is filled in blocks. When the hole background has a relatively complex texture, the depth information of the hole pixels is updated with the inverse 3D transformation method, and the updated area pheromone is projected onto the auxiliary plane and compared with the known plane pixel auxiliary parameters. The hole filling is performed according to the symmetry of the pixel position of the auxiliary reference viewpoint plane to obtain the virtual viewpoint after optimization. The proposed method was validated using image quality metrics and objective evaluation metrics such as PSNR. The experimental results show that the proposed method could generate virtual viewpoints with high fidelity, excellent quality, and a short image-processing time, which effectively enhanced the virtual viewpoint generation performance. Full article

15 pages, 1414 KiB  
Article
An Augmented Model of Rutting Data Based on Radial Basis Neural Network
by Zhuoxuan Li, Meng Tao, Jinde Cao, Xinli Shi, Tao Ma and Wei Huang
Symmetry 2023, 15(1), 33; https://doi.org/10.3390/sym15010033 - 23 Dec 2022
Cited by 3 | Viewed by 1312
Abstract
The rutting depth is an important index to evaluate the damage degree of the pavement. Therefore, establishing an accurate rutting depth prediction model can guide pavement design and provide the necessary basis for pavement maintenance. However, the sample size of pavement rutting depth data is small, and the sampling is not standardized, which makes it hard to establish a prediction model with high accuracy. Based on the data of RIOHTrack’s asphalt pavement structure, this study builds a reliable data-augmented model. In this paper, different asphalt rutting data augmented models based on Gaussian radial basis neural networks are constructed with the temperature and loading of asphalt pavements as the main features. Experimental results show that the method outperforms classical machine learning methods in data augmentation, with an average root mean square error of 3.95 and an average R-square of 0.957. Finally, the augmented data of rutting depth is constructed for training, and multiple neural network models are used for prediction. Compared with unaugmented data, the prediction accuracy is increased by 50%. Full article
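
A Gaussian radial-basis-function model of the kind described can be fitted in two steps: choose the centers (here with k-means) and solve a linear least-squares problem for the output weights. The sketch below uses synthetic stand-ins for the temperature and loading features and is illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_rbf(X, y, n_centers=20, gamma=1.0):
    """Gaussian RBF network: k-means picks the centers, then the output
    weights come from a linear least-squares solve."""
    centers = KMeans(n_clusters=n_centers, n_init=10, random_state=0).fit(X).cluster_centers_
    def phi(Z):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)                     # Gaussian basis activations
    w, *_ = np.linalg.lstsq(phi(X), y, rcond=None)
    return lambda Znew: phi(Znew) @ w

# synthetic stand-ins for pavement temperature and cumulative loading
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 2))
y = 3 * X[:, 0] + np.sin(6 * X[:, 1]) + rng.normal(0, 0.05, 200)
model = fit_rbf(X, y)
print(model(X[:5]))
```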

16 pages, 2391 KiB  
Article
Big Data Clustering Using Chemical Reaction Optimization Technique: A Computational Symmetry Paradigm for Location-Aware Decision Support in Geospatial Query Processing
by Ali Fahem Neamah, Hussein Khudhur Ibrahim, Saad Mohamed Darwish and Oday Ali Hassen
Symmetry 2022, 14(12), 2637; https://doi.org/10.3390/sym14122637 - 13 Dec 2022
Viewed by 1446
Abstract
The emergence of geospatial big data has opened up new avenues for identifying urban environments. Although both geographic information systems (GIS) and expert systems (ES) have been useful in resolving geographical decision issues, they are not without their own shortcomings. The combination of GIS and ES has gained popularity due to the necessity of boosting the effectiveness of these tools in resolving very difficult spatial decision-making problems. The clustering method generates the functional effects necessary to apply spatial analysis techniques. In a symmetric clustering system, two or more nodes run applications and monitor each other simultaneously. This system is more efficient than an asymmetric system since it utilizes all available hardware and does not maintain a node in a hot standby state. However, it is still a major issue to figure out how to expand and speed up clustering algorithms without sacrificing efficiency. The work presented in this paper introduces an optimized hierarchical distributed k-medoid symmetric clustering algorithm for big data spatial query processing. To increase the k-medoid method’s efficiency and create more precise clusters, a hybrid approach combining the k-medoid and Chemical Reaction Optimization (CRO) techniques is presented. CRO is used in this approach to broaden the scope of the optimal medoid and improve clustering by obtaining more accurate data. The suggested paradigm solves the current technique’s issue of predicting the accurate clusters’ number. The suggested approach includes two phases: in the first phase, the local clusters are built using Apache Spark’s parallelism paradigm based on their portion of the whole dataset. In the second phase, the local clusters are merged to create condensed and reliable final clusters. The suggested approach condenses the data provided during aggregation and creates the ideal clusters’ number automatically based on the dataset’s structures. The suggested approach is robust and delivers high-quality results for spatial query analysis, as shown by experimental results. The proposed model reduces average query latency by 23%. Full article
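
At the core of the local stage is k-medoid clustering, which keeps actual data points as cluster centers. The sketch below is a plain alternating k-medoids loop on toy 2-D data; the paper's CRO-based medoid optimization and Spark parallelism are not shown.

```python
import numpy as np

def k_medoids(X, k, iters=20, seed=0):
    """Alternate between assigning points to the nearest medoid and moving
    each medoid to the member that minimizes total intra-cluster distance."""
    rng = np.random.default_rng(seed)
    medoids = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - medoids[None], axis=-1)   # (n, k)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members) == 0:
                continue
            pairwise = np.linalg.norm(members[:, None] - members[None], axis=-1)
            medoids[c] = members[pairwise.sum(axis=1).argmin()]
    return medoids, labels

X = np.random.default_rng(1).normal(size=(300, 2))
centers, labels = k_medoids(X, k=3)
print(centers)
```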

12 pages, 1550 KiB  
Article
Hypernetwork Representation Learning Based on Hyperedge Modeling
by Yu Zhu, Haixing Zhao, Xiaoying Wang and Jianqiang Huang
Symmetry 2022, 14(12), 2584; https://doi.org/10.3390/sym14122584 - 07 Dec 2022
Viewed by 1085
Abstract
Most network representation learning approaches only consider the pairwise relationships between the nodes in ordinary networks but do not consider the tuple relationships, namely the hyperedges, among the nodes in the hypernetworks. Therefore, to solve the above issue, a hypernetwork representation learning approach based on hyperedge modeling, abbreviated as HRHM, is proposed, which fully considers the hyperedges to obtain ideal node representation vectors that are applied to downstream machine learning tasks such as node classification, link prediction, community detection, and so on. Experimental results on the hypernetwork datasets show that, with regard to the node classification task, the mean node classification accuracy of the HRHM approach exceeds that of the best baseline approach by about 1% on MovieLens and wordnet, and with regard to the link prediction task, excluding the HPHG approach, the mean AUC value of the HRHM approach surpasses that of the other baseline approaches by about 17%, 18%, and 6% on the GPS, drug, and wordnet datasets, respectively. The mean AUC value of the HRHM approach is very close to that of the best baseline approach on MovieLens. Full article

13 pages, 2872 KiB  
Article
Interactive Image Segmentation Based on Feature-Aware Attention
by Jinsheng Sun, Xiaojuan Ban, Bing Han, Xueyuan Yang and Chao Yao
Symmetry 2022, 14(11), 2396; https://doi.org/10.3390/sym14112396 - 12 Nov 2022
Viewed by 1818
Abstract
Interactive segmentation is a technique for picking objects of interest in images according to users’ input interactions. Some recent works take the users’ interactive input to guide the deep neural network training, where the users’ click information is utilized as weak-supervised information. However, limited by the learning capability of the model, this structure does not accurately represent the user’s interaction intention. In this work, we propose a multi-click interactive segmentation solution for employing human intention to refine the segmentation results. We propose a coarse segmentation network to extract semantic information and generate rough results. Then, we designed a feature-aware attention module according to the symmetry of user intention and image semantic information. Finally, we establish a refinement module to combine the feature-aware results with coarse masks to generate precise intentional segmentation. Furthermore, the feature-aware module is trained as a plug-and-play tool, which can be embedded into most deep image segmentation models for exploiting users’ click information in the training process. We conduct experiments on five common datasets (SBD, GrabCut, DAVIS, Berkeley, MS COCO) and the results prove our attention module can improve the performance of image segmentation networks. Full article

14 pages, 5070 KiB  
Article
Fabric Surface Defect Detection Using SE-SSDNet
by Hanqing Zhao and Tuanshan Zhang
Symmetry 2022, 14(11), 2373; https://doi.org/10.3390/sym14112373 - 10 Nov 2022
Cited by 3 | Viewed by 2126
Abstract
For fabric defect detection, the crucial issue is that large defects can be detected but not small ones, and vice versa, and this symmetric contradiction cannot be solved by a single method, especially for colored fabrics. In this paper, we propose a method based on a combination of two networks, SE and SSD, namely the SE-SSD Net method. The model is based on the SSD network and adds a squeeze-and-excitation (SE) module after its convolution operations, which is used to increase the weight the model assigns to feature channels containing defect information while preserving the original network’s extraction of feature maps at different scales for detection. The global features are then subjected to the Excitation operation to obtain the weights of the different channels, which are multiplied by the original features to form the final features, so that the model can pay more attention to the channel features with a large amount of information. In this way, large-scale feature maps can be used to detect small defects, while small-scale feature maps are used to detect relatively large defects, thus solving the asymmetry problem in detection. The experimental results show that our proposed algorithm can detect six different defects in colored fabrics, which basically meets the practical needs. Full article
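
The squeeze-and-excitation mechanism described above is a standard block: global average pooling squeezes each channel to a single value, two small fully connected layers produce per-channel weights, and the feature map is rescaled by them. A textbook PyTorch version (not the SE-SSDNet source) looks like this:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: pool each channel to one value, pass through
    a small bottleneck MLP, and rescale the input channels by the result."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                          # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))            # squeeze + excitation
        return x * w.view(x.size(0), -1, 1, 1)     # channel re-weighting

feat = torch.randn(2, 256, 38, 38)                 # an SSD-style feature map
print(SEBlock(256)(feat).shape)                    # torch.Size([2, 256, 38, 38])
```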

16 pages, 5724 KiB  
Article
PSG-Yolov5: A Paradigm for Traffic Sign Detection and Recognition Algorithm Based on Deep Learning
by Jie Hu, Zhanbin Wang, Minjie Chang, Lihao Xie, Wencai Xu and Nan Chen
Symmetry 2022, 14(11), 2262; https://doi.org/10.3390/sym14112262 - 28 Oct 2022
Cited by 14 | Viewed by 2968
Abstract
With the gradual popularization of autonomous driving technology, how to obtain traffic sign information efficiently and accurately is very important for subsequent decision-making and planning tasks. Traffic sign detection and recognition (TSDR) algorithms include color-based, shape-based, and machine-learning-based approaches. However, the algorithms mentioned above are insufficient for traffic sign detection tasks in complex environments. In this paper, we propose a traffic sign detection and recognition paradigm based on deep learning algorithms. First, to solve the problem of insufficient spatial information in high-level features of small traffic signs, the parallel deformable convolution module (PDCM) is proposed in this paper. PDCM adaptively acquires the corresponding receptive field, preserving the integrity of the abstract information through symmetrical branches and thereby improving the feature extraction capability. Simultaneously, we propose a sub-pixel convolution attention module (SCAM) based on the attention mechanism to alleviate the influence of scale distribution. Distinguishing itself from other feature fusion methods, our proposed approach can better focus on scale distribution information through the attention module. Finally, we introduce GSConv to further reduce the computational complexity of the proposed algorithm, making it better suited to industrial application. Experimental results demonstrate that our proposed methods can effectively improve performance, both in detection accuracy and mAP@0.5. Specifically, when the proposed PDCM, SCAM, and GSConv are applied to Yolov5, it achieves 89.2% mAP@0.5 on TT100K, which exceeds the benchmark network by 4.9%. Full article

19 pages, 18553 KiB  
Article
Remaining Useful Life Prediction of Milling Cutters Based on CNN-BiLSTM and Attention Mechanism
by Lei Nie, Lvfan Zhang, Shiyi Xu, Wentao Cai and Haoming Yang
Symmetry 2022, 14(11), 2243; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14112243 - 25 Oct 2022
Cited by 3 | Viewed by 1396
Abstract
Machining tools are a critical component in machine manufacturing, and their life cycle is an asymmetrical process. Extracting and modeling the tool life variation features is very significant for accurately predicting the tool's remaining useful life (RUL), and it is vital to ensuring product reliability. In this study, a tool wear evolution and RUL prediction method combining a convolutional neural network (CNN), a bidirectional long short-term memory (BiLSTM) network, and an attention mechanism is proposed. The CNN directly processes the sensor-monitored data and extracts local feature information; the BiLSTM network adaptively extracts temporal features; and the attention mechanism selectively focuses on the important degradation features and extracts the tool wear status information. To evaluate the performance and generalization ability of the proposed method under different working conditions, experiments were conducted on two datasets, and the proposed method outperforms the traditional method in terms of prediction accuracy. Full article
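A minimal sketch of a CNN-BiLSTM-attention regressor of this kind is shown below; the layer sizes, pooling, and additive attention are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    """1-D CNN feature extractor, BiLSTM temporal model, attention over time, RUL regression head."""
    def __init__(self, in_channels: int, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)     # scores each time step
        self.head = nn.Linear(2 * hidden, 1)     # regresses the remaining useful life

    def forward(self, x):                        # x: (batch, channels, time)
        h = self.cnn(x).transpose(1, 2)          # -> (batch, time, features)
        h, _ = self.bilstm(h)
        w = torch.softmax(self.attn(h), dim=1)   # attention weights over the time axis
        context = (w * h).sum(dim=1)             # weighted summary of the degradation sequence
        return self.head(context).squeeze(-1)
```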

16 pages, 539 KiB  
Article
Crowd Density Estimation in Spatial and Temporal Distortion Environment Using Parallel Multi-Size Receptive Fields and Stack Ensemble Meta-Learning
by Addis Abebe Assefa, Wenhong Tian, Negalign Wake Hundera and Muhammad Umar Aftab
Symmetry 2022, 14(10), 2159; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14102159 - 15 Oct 2022
Cited by 1 | Viewed by 1709
Abstract
The estimation of crowd density is crucial for applications such as autonomous driving, visual surveillance, crowd control, public space planning, and warning visually distracted drivers prior to an accident. Models for estimating crowd density that have strong translational, reflective, and scale symmetry yield encouraging results. However, dynamic scenes with perspective distortions and rapidly changing spatial and temporal domains still present obstacles. The main reasons for this are the dynamic nature of a scene and the difficulty of representing and incorporating the feature space of objects of varying sizes into a prediction model. To overcome these issues, this paper proposes a parallel multi-size receptive field units framework that leverages most of the CNN layers' features, allowing the features of objects of all sizes to be represented and to participate in the model's prediction. The proposed method utilizes features generated from lower to higher layers, so different object scales can be handled at different framework depths and various environmental densities can be estimated. However, including the vast majority of layer features in the prediction model has several negative effects on the prediction outcome. Asymmetric non-local attention and a feature-map channel weighting module are therefore proposed: the former handles noise and background details, while the latter re-weights each channel to make the model more sensitive to important features while ignoring irrelevant ones. While the output predictions of some layers have high bias and low variance, those of other layers have low bias and high variance. Using stack ensemble meta-learning, we combine the individual predictions made with lower-layer and higher-layer features to improve the prediction while balancing the tradeoff between bias and variance. Extensive tests were performed on the UCF CC 50 and ShanghaiTech datasets. The experimental results indicate that the proposed method is effective for dense distributions and objects of various sizes. Full article
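The parallel multi-size receptive-field idea can be sketched as a block of parallel convolutions with different kernel sizes whose outputs are concatenated and fused; the kernel sizes and fusion below are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class MultiSizeReceptiveField(nn.Module):
    """Parallel branches with different kernel sizes, so objects of several scales
    contribute features to the density prediction (kernel set is an assumption)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, kernel_size=1)   # 1x1 fusion of all branches

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1))
```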

15 pages, 793 KiB  
Article
Prediction of COVID-19 Cases Using Constructed Features by Grammatical Evolution
by Ioannis G. Tsoulos, Alexandros T. Tzallas and Dimitrios Tsalikakis
Symmetry 2022, 14(10), 2149; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14102149 - 14 Oct 2022
Cited by 1 | Viewed by 1074
Abstract
A widely used method that constructs features through so-called grammatical evolution is proposed here to predict COVID-19 cases as well as the mortality rate. The method creates new artificial features from the original ones using a genetic algorithm guided by a BNF grammar. After the artificial features are generated, the original data set is modified based on these features, an artificial neural network is applied to the modified data, and the results are reported. The comparative experiments show that feature construction has an advantage over other machine-learning methods for predicting pandemic-related quantities. Full article
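A toy sketch of grammar-guided feature construction is given below; the BNF grammar, codon scheme, and depth limit are assumptions chosen for illustration, not the grammar used in the paper.

```python
import random

# Tiny illustrative grammar mapping integer codons to feature expressions over inputs x0..x2.
GRAMMAR = {
    "<expr>": [["(", "<expr>", "<op>", "<expr>", ")"], ["<func>", "(", "<var>", ")"], ["<var>"]],
    "<op>":   [["+"], ["-"], ["*"]],
    "<func>": [["sin"], ["cos"], ["log1p"]],
    "<var>":  [["x0"], ["x1"], ["x2"]],
}

def decode(genome, max_depth=8):
    """Consume codons left to right to expand <expr> into a feature-expression string."""
    pointer = 0

    def expand(symbol, depth):
        nonlocal pointer
        if symbol not in GRAMMAR:              # terminal symbol: emit it as-is
            return symbol
        rules = GRAMMAR[symbol]
        if depth > max_depth:                  # force the terminal-most rule to guarantee termination
            choice = rules[-1]
        else:
            choice = rules[genome[pointer % len(genome)] % len(rules)]
            pointer += 1
        return "".join(expand(s, depth + 1) for s in choice)

    return expand("<expr>", 0)

genome = [random.randint(0, 255) for _ in range(32)]
print(decode(genome))                          # e.g. "(sin(x1)*x2)" -- one candidate constructed feature
```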

18 pages, 16513 KiB  
Article
A Novel Driver Abnormal Behavior Recognition and Analysis Strategy and Its Application in a Practical Vehicle
by Shida Liu, Xuyun Wang, Honghai Ji, Li Wang and Zhongsheng Hou
Symmetry 2022, 14(10), 1956; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14101956 - 20 Sep 2022
Cited by 1 | Viewed by 1463
Abstract
In this work, a novel driver abnormal behavior analysis system based on practical facial landmark detection (PFLD) and you only look once version 5 (YOLOv5) was developed to recognize and analyze abnormal driver behaviors. First, a library for analyzing the abnormal behavior of vehicle drivers was designed, in which the factors that cause abnormal driver behavior were divided into three categories according to their behavioral characteristics: natural behavioral factors, unnatural behavioral factors, and passive behavioral factors. Then, different neural network models were established to represent the actual scenes of the three behavior types. Specifically, abnormal driver behavior caused by natural behavioral factors was identified by a PFLD neural network model based on facial key-point detection, while abnormal driver behavior caused by unnatural and passive behavioral factors was identified by a YOLOv5 neural network model based on target detection. In addition, in a test of the driver abnormal behavior analysis system in an actual vehicle, the precision rate was greater than 95%, which meets the requirements of practical application. Full article
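For the facial-landmark branch, a common landmark-based cue for eye closure is the eye aspect ratio; the sketch below is an illustrative assumption about how such landmarks might be used, not the paper's stated decision rule.

```python
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """EAR from six eye landmarks (rows p1..p6 as (x, y)); a persistently low value
    suggests closed eyes. This heuristic is assumed here for illustration only."""
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical landmark distances
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal landmark distance
    return (v1 + v2) / (2.0 * h)

# Example decision: flag a frame as "eyes closed" if EAR drops below a tuned threshold.
EAR_THRESHOLD = 0.2                        # hypothetical value, would need calibration
```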

20 pages, 5957 KiB  
Article
An Intelligent Vision-Based Tracking Method for Underground Human Using Infrared Videos
by Xiaoyu Li, Shuai Wang, Wei Chen, Zhi Weng, Weiqiang Fan and Zijian Tian
Symmetry 2022, 14(8), 1750; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14081750 - 22 Aug 2022
Cited by 1 | Viewed by 1238
Abstract
The underground mine environment is dangerous and harsh, so tracking and detecting humans with computer vision is of great significance for mine safety monitoring; it also greatly facilitates identifying humans using the symmetrical image features of human organs. However, existing methods have difficulty distinguishing humans from the background accurately, handling unstable human appearance characteristics, and coping with humans that are occluded or lost. For these reasons, an improved aberrance repressed correlation filter (IARCF) tracker for human tracking in underground mines based on infrared videos is proposed. Firstly, the preprocessing operations of edge sharpening, contrast adjustment, and denoising are used to enhance the image features of the original videos. Secondly, the response map characteristics of peak shape and peak-to-sidelobe ratio (PSLR) are analyzed to identify abnormal human locations in each frame, and the human is accurately relocated by calculating image similarity over generated virtual tracking boxes. Finally, using the value of the PSLR and the highest peak point of the response map, the appearance model is adaptively updated to further improve the robustness of the tracker. Experimental results show that the average precision and success rate of the IARCF tracker in the five underground scenarios reach 0.8985 and 0.7183, respectively, and the improvement in human tracking in difficult scenes is substantial. The IARCF tracker can effectively track underground human targets, especially occluded humans in complex scenes. Full article
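The peak-to-sidelobe ratio used to flag abnormal locations can be computed from a correlation response map as in the sketch below; the exclusion-window size is an assumed parameter.

```python
import numpy as np

def peak_to_sidelobe_ratio(response: np.ndarray, exclude: int = 5) -> float:
    """PSLR of a correlation response map: (peak - sidelobe mean) / sidelobe std.
    A low value typically indicates the target is occluded or lost."""
    peak = response.max()
    py, px = np.unravel_index(response.argmax(), response.shape)
    mask = np.ones_like(response, dtype=bool)
    # exclude a small window around the peak so only the sidelobe region is measured
    mask[max(0, py - exclude):py + exclude + 1, max(0, px - exclude):px + exclude + 1] = False
    sidelobe = response[mask]
    return float((peak - sidelobe.mean()) / (sidelobe.std() + 1e-8))
```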

15 pages, 1588 KiB  
Article
Hypernetwork Representation Learning with Common Constraints of the Set and Translation
by Yu Zhu, Haixing Zhao, Jianqiang Huang and Xiaoying Wang
Symmetry 2022, 14(8), 1745; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14081745 - 22 Aug 2022
Viewed by 994
Abstract
Unlike conventional networks with only pairwise relationships among the nodes, a hypernetwork also contains complex tuple relationships among nodes, namely hyperedges. However, most existing network representation learning methods cannot effectively capture these complex tuple relationships. To resolve this challenge, this paper proposes a hypernetwork representation learning method with common constraints of the set and translation, abbreviated as HRST. The method incorporates into the representation learning process both the hyperedge set associated with the nodes and the hyperedge regarded, through the translation mechanism, as the interaction relation among the nodes, in order to obtain node representation vectors rich in hypernetwork topology structure and hyperedge information. Experimental results on four hypernetwork datasets demonstrate that, for the node classification task, our method outperforms the best baseline methods by about 1%. For the link prediction task, our method is almost entirely superior to the other baseline methods. Full article
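The translation mechanism can be illustrated with a TransE-style score, in which a hyperedge treated as a relation should translate one node embedding onto another; this is a generic sketch under that assumption, not the paper's exact hyperedge formulation.

```python
import torch
import torch.nn as nn

def translation_score(head, relation, tail, p: int = 1):
    """Translation constraint: a plausible interaction should satisfy head + relation = tail
    (approximately), so a lower residual norm means a better fit."""
    return torch.norm(head + relation - tail, p=p, dim=-1)

# Margin-based ranking between an observed tuple and a corrupted one (illustrative only).
emb = nn.Embedding(100, 32)                         # hypothetical node/relation embedding table
h, r, t, t_neg = (emb(torch.tensor([i])) for i in (1, 2, 3, 7))
loss = torch.relu(1.0 + translation_score(h, r, t) - translation_score(h, r, t_neg)).mean()
```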

14 pages, 3683 KiB  
Article
Multi-Type Object Tracking Based on Residual Neural Network Model
by Tao Jiang, Qiuyan Zhang, Jianying Yuan, Changyou Wang and Chen Li
Symmetry 2022, 14(8), 1689; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14081689 - 15 Aug 2022
Cited by 7 | Viewed by 1456
Abstract
In this paper, a tracking algorithm based on a residual neural network model and machine learning is proposed. Compared with the widely used VGG network, the residual neural network has deeper feature layers and a special additional-layer (shortcut) structure, which breaks the symmetry of the network and reduces the degradation of the neural network. The additional layers and convolution layers are used for feature fusion to represent the target. The developed algorithm captures multiple features of the object, so tracking accuracy can be improved in some complex scenarios. In addition, we define a new measure to calculate the similarity of different image regions and find the optimally matched region. The search area is delimited according to the continuity of the target motion, which improves the real-time performance of tracking. The experimental results illustrate that the proposed algorithm achieves higher accuracy while retaining real-time performance, especially in complex scenarios such as deformation, rotation changes, and background clutter, in comparison with the Multi-Domain Network (MDNet) algorithm based on a convolutional neural network. Full article
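The shortcut-plus-convolution structure referred to above can be illustrated with a basic residual block; the layer sizes below are illustrative and do not reflect the tracker's actual backbone configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: the identity shortcut (the 'additional layer') is added to the
    convolutional path, which eases the optimisation of deeper feature extractors."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.conv(x))   # shortcut + convolutional features
```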

15 pages, 23894 KiB  
Article
Internal Similarity Network for Rejoining Oracle Bone Fragment Images
by Zhan Zhang, An Guo and Bang Li
Symmetry 2022, 14(7), 1464; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14071464 - 18 Jul 2022
Cited by 4 | Viewed by 1335
Abstract
Rejoining oracle bone fragments plays an important role in studying the history and culture of the Shang dynasty through its inscribed characters. However, current computer vision techniques have low accuracy in judging whether the textures of a pair of oracle bone fragment images can be put back together. When fragment images are rejoined, the coordinate sequence and texture features of the edge pixels from the original and target fragment images form a continuous symmetrical structure, so we put forward an internal similarity network (ISN) to rejoin fragment images automatically. Firstly, an edge equidistant matching (EEM) algorithm was given to search for similar coordinate sequences of edge segment pairs on the fragment image contours and to locally match the edge coordinate sequence of an oracle bone fragment image. Then, a target mask-based method was designed to merge the two images into a whole and to crop a local region image along the locally matched edge. Next, we calculated a convolution feature gradient map (CFGM) of the local region image texture, and an internal similarity pooling (ISP) layer was proposed to compute the internal similarity of the convolution feature gradient map. Finally, the ISN was constructed to evaluate a similarity score of the local region image texture and to determine whether two fragment images form a coherent whole. The experiments show that the correct judgement probability of the ISN is higher than 90% in actual rejoining work and that our method found 37 pairs of correctly rejoined oracle bone fragment images that had not previously been discovered by archaeologists. Full article
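A rough sketch of matching edge coordinate sequences by equidistant resampling is given below; it is only a plausible simplification of the EEM idea (rotation alignment is omitted), with the sample count chosen arbitrarily.

```python
import numpy as np

def resample_edge(points: np.ndarray, n: int = 64) -> np.ndarray:
    """Resample an edge segment, given as an (m, 2) array of pixel coordinates,
    to n points spaced equidistantly along its arc length."""
    d = np.cumsum(np.r_[0.0, np.linalg.norm(np.diff(points, axis=0), axis=1)])
    t = np.linspace(0.0, d[-1], n)
    return np.stack([np.interp(t, d, points[:, 0]), np.interp(t, d, points[:, 1])], axis=1)

def segment_distance(seg_a: np.ndarray, seg_b: np.ndarray) -> float:
    """Mean point-to-point distance between two resampled segments after removing translation;
    the second segment is reversed because matching break edges run in opposite directions."""
    a = resample_edge(seg_a)
    b = resample_edge(seg_b)[::-1]
    a = a - a.mean(axis=0)                  # remove translation
    b = b - b.mean(axis=0)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))   # lower means more similar edge shapes
```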

23 pages, 5857 KiB  
Article
CLHF-Net: A Channel-Level Hierarchical Feature Fusion Network for Remote Sensing Image Change Detection
by Jinming Ma, Di Lu, Yanxiang Li and Gang Shi
Symmetry 2022, 14(6), 1138; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061138 - 01 Jun 2022
Cited by 3 | Viewed by 1655
Abstract
Remote sensing (RS) image change detection (CD) is the procedure of detecting the change regions that occur in the same area in different time periods. Much research has extracted deep features and fused multi-scale features with convolutional neural networks and attention mechanisms to achieve better CD performance, but these methods do not fuse same-scale feature pairs or features from different layers well. To solve this problem, a novel CD network with a symmetric structure, called the channel-level hierarchical feature fusion network (CLHF-Net), is proposed. First, a channel-split feature fusion module (CSFM) with a symmetric structure, consisting of three branches, is proposed. The CSFM integrates the feature information of same-scale feature pairs more adequately and effectively solves the problem of insufficient communication between feature pairs. Second, an interaction guidance fusion module (IGFM) is designed to fuse the feature information of different layers more effectively. The IGFM introduces detailed information from shallow features into deep features and deep semantic information into shallow features, so the fused features carry more complete feature information of the change regions and clearer edge information. Compared with other methods, CLHF-Net improves the F1 scores by 1.03%, 2.50%, and 3.03% on three publicly available benchmark datasets: the season-varying, WHU-CD, and LEVIR-CD datasets, respectively. Experimental results show that the proposed CLHF-Net performs better than the other comparative methods. Full article
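One way to picture channel-level fusion of a bi-temporal feature pair is sketched below; the split-and-interleave scheme and the 1 × 1 fusion are assumptions for illustration and not the authors' three-branch CSFM.

```python
import torch
import torch.nn as nn

class ChannelSplitFusion(nn.Module):
    """Illustrative channel-level fusion of a bi-temporal feature pair: split each feature map
    along channels, interleave the halves from the two dates, then fuse with a 1x1 conv.
    Assumes an even channel count."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_t1, f_t2):
        a1, a2 = torch.chunk(f_t1, 2, dim=1)          # halves of the first-date features
        b1, b2 = torch.chunk(f_t2, 2, dim=1)          # halves of the second-date features
        mixed = torch.cat([a1, b1, a2, b2], dim=1)    # channel-level interleaving of the pair
        return self.fuse(mixed)
```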

21 pages, 4668 KiB  
Article
Research on Prediction Method of Gear Pump Remaining Useful Life Based on DCAE and Bi-LSTM
by Chenyang Wang, Wanlu Jiang, Yi Yue and Shuqing Zhang
Symmetry 2022, 14(6), 1111; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14061111 - 28 May 2022
Cited by 9 | Viewed by 1968
Abstract
As the hydraulic pump is the power source of a hydraulic system, predicting its remaining useful life (RUL) can effectively improve the operating efficiency of the hydraulic system and reduce the incidence of failure. This paper presents a scheme for predicting the RUL of a hydraulic pump (gear pump) through a combination of a deep convolutional autoencoder (DCAE) and a bidirectional long short-term memory (Bi-LSTM) network. The vibration data were characterized by the DCAE, and a health indicator (HI) was constructed and modeled to determine the degradation state of the gear pump. The DCAE is a typical symmetric neural network, which effectively extracts characteristics from the data by exploiting the symmetry of the encoding and decoding networks. After the original vibration data segments were processed, the health indicator was entered as a label into the RUL prediction model based on the Bi-LSTM network, and the model was trained to predict the RUL of the gear pump. To verify the validity of the methodology, a gear pump accelerated life experiment was carried out and whole-life-cycle data were obtained. The results show that the constructed HI effectively characterizes the degradation state of the gear pump and that the proposed method effectively predicts its degradation trend. Full article
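A minimal symmetric convolutional autoencoder, whose decoder mirrors the encoder and whose bottleneck code can feed a health indicator, might look like the sketch below; the 1-D layout and layer sizes are assumptions, not the paper's DCAE.

```python
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Symmetric 1-D convolutional autoencoder: the decoder mirrors the encoder, and the
    bottleneck code is used as the compressed feature for health-indicator construction."""
    def __init__(self, in_ch: int = 1, code_ch: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(in_ch, 8, 9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(8, code_ch, 9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(code_ch, 8, 9, stride=2, padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, in_ch, 9, stride=2, padding=4, output_padding=1),
        )

    def forward(self, x):                   # x: (batch, channels, signal length)
        code = self.encoder(x)              # compressed representation of the vibration segment
        return self.decoder(code), code     # reconstruction for training, code for the HI
```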

13 pages, 2824 KiB  
Article
A Semi-Supervised Semantic Segmentation Method for Blast-Hole Detection
by Zeyu Zhang, Honggui Deng, Yang Liu, Qiguo Xu and Gang Liu
Symmetry 2022, 14(4), 653; https://0-doi-org.brum.beds.ac.uk/10.3390/sym14040653 - 23 Mar 2022
Cited by 6 | Viewed by 2094
Abstract
The goal of blast-hole detection is to help place charge explosives into blast-holes. This process is full of challenges, because it requires the ability to extract sample features in complex environments and to detect a wide variety of blast-holes. Detection techniques based on deep learning with RGB-D semantic segmentation have emerged in recent years and achieved good results. However, implementing semantic segmentation based on deep learning usually requires a large amount of labeled data, which places a heavy burden on dataset production. To address the dilemma that very little training data are available for blast-hole detection by explosive charging equipment, this paper extends the core idea of semi-supervised learning to RGB-D semantic segmentation and devises an ERF-AC-PSPNet model based on a symmetric encoder–decoder structure. The model adds a residual connection layer and a dilated convolution layer for down-sampling, followed by an attention complementary module to acquire the feature maps, and uses a pyramid scene parsing network to achieve hole segmentation during decoding. A new semi-supervised learning method based on pseudo-labeling and self-training is proposed to train the model for intelligent detection of blast-holes. The designed pseudo-labeling is based on the HOG algorithm and depth data, and proved effective in experiments. To verify the validity of the method, we carried out experiments on images of blast-holes collected at a mine site. Compared to previous segmentation methods, our method is less dependent on labeled data and achieved IoU values of 0.810, 0.867, 0.923, and 0.945 at labeling ratios of 1/8, 1/4, 1/2, and 1, respectively. Full article
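A bare-bones pseudo-label self-training step for segmentation is sketched below; the confidence threshold, loaders, and loss weighting are placeholders and do not reflect the paper's HOG- and depth-based pseudo-labeling pipeline.

```python
import torch
import torch.nn.functional as F

def self_training_round(model, labeled_loader, unlabeled_loader, optimizer, threshold=0.9):
    """One pass of supervised loss on labeled data plus a pseudo-label loss on confident
    predictions for unlabeled data (generic self-training sketch)."""
    model.train()
    for (x_l, y_l), (x_u, _) in zip(labeled_loader, unlabeled_loader):
        with torch.no_grad():
            probs = torch.softmax(model(x_u), dim=1)    # per-pixel class probabilities
            conf, pseudo = probs.max(dim=1)             # pseudo-labels and their confidence
        loss = F.cross_entropy(model(x_l), y_l)         # supervised term
        mask = conf > threshold                         # keep only confident pixels
        if mask.any():
            loss = loss + F.cross_entropy(model(x_u), pseudo, reduction="none")[mask].mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```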
