Topic Editors

Prof. Dr. Andrea Prati
Department of Engineering and Architecture, University of Parma, Parco Area delle Scienze, 181/A, 43124 Parma, Italy
Prof. Dr. Luis Javier García Villalba
Department of Software Engineering and Artificial Intelligence (DISIA), Faculty of Computer Science and Engineering, Office 431, Universidad Complutense de Madrid (UCM), 28040 Madrid, Spain
Prof. Dr. Vincent A. Cicirello
Computer Science, Stockton University, Galloway, NJ 08205, USA

Machine and Deep Learning

Abstract submission deadline: closed (31 December 2022)
Manuscript submission deadline: closed (31 March 2023)
Viewed by 663,066

Topic Information

Dear Colleagues,

Our society is facing a new era of automation, not only in industry but also in our daily lives. Computers are everywhere, and their use is no longer confined to industry and work but extends to entertainment and leisure. Computing and artificial intelligence are not simply scientific lab experiments for publishing papers in major journals and conferences but opportunities to make our lives better.

Among the different fields of artificial intelligence, machine learning is certainly one of the most studied in recent years. There has been a gigantic shift in the last few decades due to the birth of deep learning, which has opened unprecedented theoretical and applied opportunities.

In this context, advances in machine and deep learning are made on a daily basis, but much remains to be learned. For instance, the functioning of deep learning architectures is still only partially understood, and explaining it will foster new applications, algorithms and architectures. While deep learning is considered the hottest topic in artificial intelligence today, “traditional” machine learning still attracts considerable interest, especially in (but not limited to) new learning paradigms, scalability to big data applications, and optimization.

Even more widespread are the new applications of machine and deep learning to finance, healthcare, sustainability, climate science, and neuroscience, to name a few. Continuing and improving research in machine and deep learning will not only be a chance for surprising new discoveries but also a way to contribute to our wellbeing and economic growth.

Prof. Dr. Andrea Prati
Prof. Dr. Luis Javier García Villalba
Prof. Dr. Vincent A. Cicirello
Topic Editors

Keywords

  • machine learning
  • deep learning
  • natural language processing
  • text mining
  • active learning
  • clustering
  • regression
  • data mining
  • web mining
  • online learning
  • ranking in machine learning
  • reinforcement learning
  • transfer learning
  • semi-supervised learning
  • zero- and few-shot learning
  • time series analysis
  • unsupervised learning
  • deep learning architectures
  • generative models
  • deep reinforcement learning
  • learning theory (bandits, game theory, statistical learning theory, etc.)
  • optimization (convex and non-convex optimization, matrix/tensor methods, sparsity, etc.)
  • probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
  • probabilistic inference (Bayesian methods, graphical models, Monte Carlo methods, etc.)
  • evolution-based methods
  • explanation-based learning
  • multi-agent learning
  • neuroscience and cognitive science (e.g., neural coding, brain–computer interfaces)
  • trustworthy machine learning (accountability, causality, fairness, privacy, robustness, etc.)
  • applications (e.g., speech processing, computational biology, computer vision, NLP)

Participating Journals

Journal Name (abbreviation) | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Applied Sciences (applsci) | 2.7 | 4.5 | 2011 | 16.9 days | CHF 2400
Big Data and Cognitive Computing (BDCC) | 3.7 | 4.9 | 2017 | 18.2 days | CHF 1800
Mathematics (mathematics) | 2.4 | 3.5 | 2013 | 16.9 days | CHF 2600
Electronics (electronics) | 2.9 | 4.7 | 2012 | 15.6 days | CHF 2400
Entropy (entropy) | 2.7 | 4.7 | 1999 | 20.8 days | CHF 2600

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to take advantage of the following benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (269 papers)

21 pages, 773 KiB  
Article
Parkinson’s Disease Detection Using Hybrid LSTM-GRU Deep Learning Model
by Amjad Rehman, Tanzila Saba, Muhammad Mujahid, Faten S. Alamri and Narmine ElHakim
Electronics 2023, 12(13), 2856; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12132856 - 28 Jun 2023
Cited by 8 | Viewed by 2916
Abstract
Parkinson’s disease is the second-most common cause of death and disability as well as the most prevalent neurological disorder. In the last 15 years, the number of cases of PD has doubled. The accurate detection of PD in the early stages is one of the most challenging tasks to ensure individuals can continue to live with as little interference as possible. Yet there are not enough trained neurologists around the world to detect Parkinson’s disease in its early stages. Machine learning methods based on artificial intelligence have acquired a lot of popularity over the past few decades in medical disease detection. However, these methods do not provide an accurate and timely diagnosis; the overall detection accuracy of machine-learning-related models is inadequate. This study collected data from 31 male and female patients, comprising 195 voice recordings. Approximately six recordings were created per patient, with the length of each recording ranging from 1 to 36 s. These voices were recorded in a soundproof studio using an Industrial Acoustics Company (IAC) AKG-C420 head-mounted microphone. The dataset was collected to investigate the diagnostic significance of speech and voice abnormalities caused by Parkinson’s disease. An imbalanced dataset, in which one class holds the majority of samples and the other a minority, is a main contributor to model overfitting and generalization errors; this problem is addressed in this study by utilizing three sampling techniques. After balancing the datasets, each class has the same number of samples, which proved valuable in improving the model’s performance and reducing the overfitting problem. Four performance metrics, namely accuracy, precision, recall, and F1 score, are used to evaluate the effectiveness of the proposed hybrid model. Experiments demonstrated that the proposed model achieved 100% accuracy, recall, and F1 score using the balanced dataset with the random oversampling technique, and 100% precision, 97% recall, 99% AUC score, and 91% F1 score with the SMOTE technique. Full article
(This article belongs to the Topic Machine and Deep Learning)
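For readers who want the core recipe, the sketch below (not the authors' code; feature dimensions, layer sizes, and training settings are assumptions) pairs SMOTE class balancing with a hybrid LSTM-GRU classifier in Keras, mirroring the two ingredients the abstract combines.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense

# Placeholder data: 195 voice recordings x 22 acoustic features, binary labels.
X = np.random.rand(195, 22)
y = np.random.randint(0, 2, 195)

# Balance the classes before training, as the study does (SMOTE variant).
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

# Recurrent layers expect (samples, timesteps, features); each recording is
# treated here as a length-1 sequence of its feature vector.
X_seq = X_bal.reshape(-1, 1, X_bal.shape[1])

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(1, X_bal.shape[1])),
    GRU(32),                         # GRU consumes the LSTM's sequence output
    Dense(1, activation="sigmoid"),  # binary PD-vs-healthy decision
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_seq, y_bal, epochs=20, batch_size=16, validation_split=0.2)
```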

10 pages, 868 KiB  
Article
Efficient Meta-Learning through Task-Specific Pseudo Labelling
by Sanghyuk Lee, Seunghyun Lee and Byung Cheol Song
Electronics 2023, 12(13), 2757; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12132757 - 21 Jun 2023
Viewed by 984
Abstract
Meta-learning is attracting attention as a crucial tool for few-shot learning tasks. Meta-learning involves the establishment and acquisition of “meta-knowledge”, enabling the ability to adapt to a novel field using only limited data. Transductive meta-learning has garnered increasing attention as a solution to the sample bias problem arising from meta-learning’s reliance on a limited support set for adaptation. This approach surpasses the traditional inductive learning perspective, aiming to address this issue effectively. Transductive meta-learning infers the class of each instance at test time by considering the relations among instances in the test set. In order to enhance the effectiveness of transductive meta-learning, this paper introduces a novel technique called task-specific pseudo labelling. The main idea is to produce synthetic labels for unannotated query sets by propagating labels from annotated support sets. This approach allows the utilization of the supervised setting as is, while incorporating the unannotated query set into the adjustment procedure. Consequently, our approach enables handling a larger number of examples during adaptation compared to inductive approaches, leading to improved classification performance of the model. Notably, this approach represents the first instance of employing task adaptation within the context of pseudo labelling. Based on the experimental outcomes in the evaluation configurations of few-shot learning, specifically in the 5-way 1-shot setup, the proposed method demonstrates noteworthy enhancements over two existing meta-learning algorithms, with improvements of 6.75% and 5.03%, respectively. Consequently, the proposed method establishes a new state-of-the-art performance in the realm of transductive meta-learning. Full article
(This article belongs to the Topic Machine and Deep Learning)
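The label-propagation step at the heart of task-specific pseudo labelling can be illustrated in a few lines. The sketch below is a hedged approximation, not the paper's algorithm: it assigns soft pseudo-labels to an unannotated query set from class prototypes built on the annotated support set; the function name, shapes, and temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def pseudo_label_query(support_feats, support_labels, query_feats, n_way, temperature=0.1):
    """support_feats: (N_s, D); support_labels: (N_s,); query_feats: (N_q, D)."""
    # Class prototypes: mean support embedding per class.
    prototypes = torch.stack([support_feats[support_labels == c].mean(0)
                              for c in range(n_way)])
    # Cosine similarity between each query example and each class prototype.
    sims = F.cosine_similarity(query_feats.unsqueeze(1), prototypes.unsqueeze(0), dim=-1)
    probs = F.softmax(sims / temperature, dim=-1)   # soft pseudo-labels
    return probs.argmax(-1), probs                  # hard labels + confidences

# During adaptation, the pseudo-labelled query set can be appended to the
# support set, enlarging the number of examples the inner loop can use.
```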

12 pages, 1158 KiB  
Article
Machine Learning and Cochlear Implantation: Predicting the Post-Operative Electrode Impedances
by Yousef A. Alohali, Mahmoud Samir Fayed, Yassin Abdelsamad, Fida Almuhawas, Asma Alahmadi, Tamer Mesallam and Abdulrahman Hagr
Electronics 2023, 12(12), 2720; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12122720 - 18 Jun 2023
Cited by 2 | Viewed by 2801
Abstract
Cochlear implantation is the common treatment for severe to profound sensorineural hearing loss if there is no benefit from hearing aids. Measuring the electrode impedance along the electrode array at different time points after surgery is crucial in verifying the electrodes’ status, determining the compliance levels, and helping to identify the electric dynamic range. Increased impedance values without proper reprogramming can affect the patient’s performance. The prediction of acceptable levels of electrode impedance at different time points after the surgery could help clinicians during the fitting sessions through a comparison of the predicted with the measured levels. Accordingly, clinicians can decide if the measured levels are within the predicted normal range or not. In this work, we used a dataset of 80 pediatric patients who had received cochlear implants with the MED-EL FLEX 28 electrode array. We predicted the impedance of the electrode arrays in each channel at different time points: at one month, three months, six months, and one year after the date of surgery. We used different machine learning algorithms such as linear regression, Bayesian linear regression, decision forest regression, boosted decision tree regression, and neural networks. The used features include the patient’s age and the intra-operative electrode impedance at different electrodes. Our results indicated that the best algorithm varies depending on the channel, while the Bayesian linear regression and neural networks provide the best results for 75% of the channels. Furthermore, the accuracy level ranges between 83% and 100% in half of the channels one year after the surgery, when an error range between 0 and 3 KΩ is defined as an acceptable threshold. Moreover, the use of the patient’s age alone can provide the best prediction results for 50% of the channels at six months or one year after surgery. This reflects that the patient’s age could be a predictor of the electrode impedance after the surgery. Full article
(This article belongs to the Topic Machine and Deep Learning)
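As an illustration of the per-channel regression setup the abstract describes, the sketch below fits one Bayesian linear regression per electrode from patient age plus the intra-operative impedances, and scores predictions against the 3 kΩ acceptability threshold. All data arrays, value ranges, and the 12-channel count (typical of the FLEX28 array) are placeholders or assumptions, not the study's data.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import train_test_split

n_patients, n_channels = 80, 12
age = np.random.uniform(1, 10, (n_patients, 1))                # years (assumed range)
intra_op = np.random.uniform(2, 12, (n_patients, n_channels))  # kOhm (assumed range)
one_year = intra_op + np.random.normal(0, 1, intra_op.shape)   # placeholder targets

X = np.hstack([age, intra_op])  # features: age + intra-operative impedances
models = []
for ch in range(n_channels):
    X_tr, X_te, y_tr, y_te = train_test_split(X, one_year[:, ch], random_state=0)
    m = BayesianRidge().fit(X_tr, y_tr)
    # Count a prediction as acceptable when it falls within 3 kOhm of the
    # measurement, matching the error threshold the abstract mentions.
    acc = np.mean(np.abs(m.predict(X_te) - y_te) <= 3.0)
    models.append((m, acc))
```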

15 pages, 719 KiB  
Article
KNN-Based Machine Learning Classifier Used on Deep Learned Spatial Motion Features for Human Action Recognition
by Kalaivani Paramasivam, Mohamed Mansoor Roomi Sindha and Sathya Bama Balakrishnan
Entropy 2023, 25(6), 844; https://0-doi-org.brum.beds.ac.uk/10.3390/e25060844 - 25 May 2023
Cited by 3 | Viewed by 1698
Abstract
Human action recognition is an essential process in surveillance video analysis, which is used to understand the behavior of people to ensure safety. Most of the existing methods for HAR use computationally heavy networks such as 3D CNN and two-stream networks. To alleviate the challenges in the implementation and training of 3D deep learning networks, which have more parameters, a customized lightweight directed acyclic graph-based residual 2D CNN with fewer parameters was designed from scratch and named HARNet. A novel pipeline for the construction of spatial motion data from raw video input is presented for the latent representation learning of human actions. The constructed input is fed to the network for simultaneous operation over spatial and motion information in a single stream, and the latent representation learned at the fully connected layer is extracted and fed to the conventional machine learning classifiers for action recognition. The proposed work was empirically verified, and the experimental results were compared with those for existing methods. The results show that the proposed method outperforms state-of-the-art (SOTA) methods with a percentage improvement of 2.75% on UCF101, 10.94% on HMDB51, and 0.18% on the KTH dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)
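The two-stage pipeline (deep spatial-motion features, then a conventional classifier) can be sketched as follows. The backbone here is a stand-in ResNet-18, not HARNet, and the input shapes are assumed; the point is simply extracting the latent representation at the fully connected layer and handing it to a KNN classifier.

```python
import torch
import torch.nn as nn
from torchvision import models
from sklearn.neighbors import KNeighborsClassifier

backbone = models.resnet18(weights=None)
backbone.fc = nn.Identity()            # expose the penultimate (latent) features
backbone.eval()

@torch.no_grad()
def extract(batch):                    # batch: (N, 3, H, W) spatial-motion inputs
    return backbone(batch).numpy()

# Placeholders for the constructed spatial-motion images and action labels.
train_feats = extract(torch.randn(100, 3, 224, 224))
test_feats = extract(torch.randn(20, 3, 224, 224))
train_y = torch.randint(0, 6, (100,)).numpy()

knn = KNeighborsClassifier(n_neighbors=5).fit(train_feats, train_y)
pred = knn.predict(test_feats)         # action classes for the test clips
```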

17 pages, 2957 KiB  
Article
Autonomous Driving Decision Control Based on Improved Proximal Policy Optimization Algorithm
by Qingpeng Song, Yuansheng Liu, Ming Lu, Jun Zhang, Han Qi, Ziyu Wang and Zijian Liu
Appl. Sci. 2023, 13(11), 6400; https://0-doi-org.brum.beds.ac.uk/10.3390/app13116400 - 24 May 2023
Cited by 1 | Viewed by 1237
Abstract
The decision-making control of autonomous driving in complex urban road environments is a difficult problem in autonomous driving research. In order to solve the problems of a high-dimensional state space and sparse rewards in autonomous driving decision control in this environment, this paper proposes Coordinated Convolution Multi-Reward Proximal Policy Optimization (CCMR-PPO). This method reduces the dimension of the bird’s-eye view data through the coordinated convolution network and then fuses the processed data with the vehicle state data as the input of the algorithm to optimize the state space. The control commands acc (acc represents throttle and brake) and steer of the vehicle are used as the output of the algorithm. Comprehensively considering the lateral error, safety distance, speed, and other factors of the vehicle, a multi-objective reward mechanism was designed to alleviate the sparse reward. Experiments on the CARLA simulation platform show that the proposed method can effectively increase performance: compared with the PPO algorithm, the number of times the line was crossed is reduced by 24%, and the number of tasks completed is increased by 54%. Full article
(This article belongs to the Topic Machine and Deep Learning)

22 pages, 4078 KiB  
Article
Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning
by Ilkay Yelmen, Ali Gunes and Metin Zontul
Appl. Sci. 2023, 13(10), 6139; https://0-doi-org.brum.beds.ac.uk/10.3390/app13106139 - 17 May 2023
Cited by 2 | Viewed by 1670
Abstract
With the recent growth of the Internet, the volume of data has also increased. In particular, the increase in the amount of unstructured data makes it difficult to manage data. Classification is also needed in order to be able to use the data for various purposes. Since it is difficult to manually classify the ever-increasing volume of data for various types of analysis and evaluation, automatic classification methods are needed. In addition, imbalanced and multi-class classification is a challenging task: as the number of classes increases, so does the number of decision boundaries a learning algorithm has to solve. Therefore, in this paper, an improved model is proposed using the WordNet lexical ontology and BERT to perform deeper learning on textual features, thereby improving the classification performance of the model. It was observed that classification success increased when using the 11 general WordNet lexicographer files based on synonym sets (synsets), syntactic categories, and logical groupings. WordNet was used for feature dimension reduction. In the experimental studies, word embedding methods were first used without dimension reduction, and Random Forest (RF), Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) algorithms were employed to perform classification. These studies were then repeated with dimension reduction performed by WordNet. In addition to the machine learning models, experiments were also conducted with the pretrained BERT model with and without WordNet. The experimental results showed that, on an unstructured, seven-class, imbalanced dataset, the highest accuracy value of 93.77% was obtained with the proposed model. Full article
(This article belongs to the Topic Machine and Deep Learning)
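The WordNet dimension-reduction idea can be illustrated with NLTK: map each token to the lexicographer file (lexname) of one of its synsets, collapsing the vocabulary into a small set of coarse semantic categories. This is a hedged illustration of the idea, not the paper's exact procedure (for instance, sense selection here naively takes the first synset).

```python
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

def lexname_features(tokens):
    """Replace each token by the lexicographer file of its first synset."""
    feats = []
    for tok in tokens:
        synsets = wn.synsets(tok)
        feats.append(synsets[0].lexname() if synsets else "none")
    return feats

print(lexname_features(["bank", "run", "beautiful"]))
# Prints coarse categories such as 'noun.object' or 'adj.all'; these compact
# category features stand in for raw tokens before the RF/SVM/MLP or BERT stage.
```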

19 pages, 19289 KiB  
Article
Improving Graphite Ore Grade Identification with a Novel FRCNN-PGR Method Based on Deep Learning
by Junchen Xiang, Haoyu Shi, Xueyu Huang and Daogui Chen
Appl. Sci. 2023, 13(8), 5179; https://0-doi-org.brum.beds.ac.uk/10.3390/app13085179 - 21 Apr 2023
Cited by 2 | Viewed by 1344
Abstract
Graphite is widely used in various industries, including the refractory, battery-making, steel-making, expanded-graphite, brake-pad, casting-coating, and lubricant industries. For the mineral processing industry, an effective and accurate diagnostic method based on FRCNN-PGR is proposed and evaluated. It involves cutting images to expand the dataset, combining high- and low-level feature layers in the Faster R-CNN model, and adding a global attention mechanism, the Relation-Aware Global Attention Network (RGA), to extract features of interest along both the spatial and channel dimensions. The proposed model outperforms the original Faster R-CNN model with 80.21% mAP and 87.61% recall on the split graphite mine dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)

26 pages, 1249 KiB  
Article
KHGCN: Knowledge-Enhanced Recommendation with Hierarchical Graph Capsule Network
by Fukun Chen, Guisheng Yin, Yuxin Dong, Gesu Li and Weiqi Zhang
Entropy 2023, 25(4), 697; https://0-doi-org.brum.beds.ac.uk/10.3390/e25040697 - 20 Apr 2023
Cited by 4 | Viewed by 2707
Abstract
Knowledge graphs as external information have become one of the mainstream directions of current recommendation systems. Various knowledge-graph-representation methods have been proposed to promote the development of knowledge graphs in related fields. Knowledge-graph-embedding methods can learn entity information and complex relationships between the entities in knowledge graphs. Furthermore, recently proposed graph neural networks can learn higher-order representations of entities and relationships in knowledge graphs. Therefore, the complete representation in the knowledge graph enriches the item information and alleviates the cold-start and data-sparsity problems of the recommendation process. However, representing the knowledge graph’s entire set of entities and relations in personalized recommendation tasks introduces unnecessary noise for different users. To learn the entity-relationship representation in the knowledge graph while effectively removing noise, we propose a model named the knowledge-enhanced hierarchical graph capsule network (KHGCN), which can extract node embeddings in graphs while learning the hierarchical structure of graphs. Our model eliminates noisy entity and relationship representations in the knowledge graph by entity disentangling for the recommendation and introduces an attentive mechanism to strengthen knowledge-graph aggregation. Our model learns the representation of entity relationships by an original graph capsule network. The capsule neural networks represent the structured information between the entities more completely. We validate the proposed model on real-world datasets, and the validation results demonstrate the model’s effectiveness. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 21723 KiB  
Article
Multi-Mode Data Generation and Fault Diagnosis of Bearings Based on STFT-SACGAN
by Hongxing Wang, Hua Zhu and Huafeng Li
Electronics 2023, 12(8), 1910; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12081910 - 18 Apr 2023
Cited by 4 | Viewed by 1153
Abstract
To achieve multi-mode fault sample generation and fault diagnosis of bearings in a complex operating environment with scarce labeled data, this paper combines a semi-supervised generative adversarial network (SGAN) and an auxiliary classifier generative adversarial network (ACGAN) to construct a semi-supervised auxiliary classifier generative adversarial network (SACGAN), improving both the network structure and the loss function. A fault diagnosis method based on STFT-SACGAN is also proposed. The method uses a short-time Fourier transform (STFT) to convert one-dimensional time-domain vibration signals of bearings into two-dimensional time-frequency images, which are used as the input of SACGAN. Two multi-mode fault data generation and intelligent diagnosis cases for bearings are studied. The experimental results show that the proposed method generates high-quality multi-mode fault samples with high fault diagnosis accuracy, generalization, and stability. Full article
(This article belongs to the Topic Machine and Deep Learning)
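The STFT preprocessing step is standard and easy to reproduce. The sketch below (sampling rate, window length, and the synthetic signal are assumed values, not the paper's settings) turns a 1-D vibration signal into a normalized 2-D log-magnitude time-frequency image of the kind SACGAN would consume.

```python
import numpy as np
from scipy.signal import stft

fs = 12_000                                # Hz, assumed sampling rate
t = np.arange(fs) / fs
# Placeholder "vibration": a tone plus noise standing in for a bearing signal.
signal = np.sin(2 * np.pi * 157 * t) + 0.3 * np.random.randn(fs)

f, times, Z = stft(signal, fs=fs, nperseg=256, noverlap=128)
image = np.log1p(np.abs(Z))                # log-magnitude spectrogram
image = (image - image.min()) / (image.max() - image.min())  # scale to [0, 1]
# `image` (frequency bins x time frames) is the 2-D input the GAN would see.
```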

35 pages, 2329 KiB  
Review
A Survey on Deep Learning Based Segmentation, Detection and Classification for 3D Point Clouds
by Prasoon Kumar Vinodkumar, Dogus Karabulut, Egils Avots, Cagri Ozcinar and Gholamreza Anbarjafari
Entropy 2023, 25(4), 635; https://0-doi-org.brum.beds.ac.uk/10.3390/e25040635 - 10 Apr 2023
Cited by 7 | Viewed by 4073
Abstract
The computer vision, graphics, and machine learning research groups have given a significant amount of focus to 3D object recognition (segmentation, detection, and classification). Deep learning approaches have lately emerged as the preferred method for 3D segmentation problems as a result of their outstanding performance in 2D computer vision. As a result, many innovative approaches have been proposed and validated on multiple benchmark datasets. This study offers an in-depth assessment of the latest developments in deep learning-based 3D object recognition. We discuss the most well-known 3D object recognition models, along with evaluations of their distinctive qualities. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 3251 KiB  
Article
CGA-MGAN: Metric GAN Based on Convolution-Augmented Gated Attention for Speech Enhancement
by Haozhe Chen and Xiaojuan Zhang
Entropy 2023, 25(4), 628; https://0-doi-org.brum.beds.ac.uk/10.3390/e25040628 - 06 Apr 2023
Viewed by 1651
Abstract
In recent years, neural networks based on attention mechanisms have seen increasing use in speech recognition, separation, and enhancement, as well as other fields. In particular, the convolution-augmented transformer has performed well, as it can combine the advantages of convolution and self-attention. Recently, the gated attention unit (GAU) was proposed; compared with traditional multi-head self-attention, approaches with GAU are effective and computationally efficient. In this paper, we propose a network for speech enhancement called CGA-MGAN, a MetricGAN based on convolution-augmented gated attention. CGA-MGAN captures local and global correlations in speech signals at the same time by fusing convolution and gated attention units. Experiments on Voice Bank + DEMAND show that our proposed CGA-MGAN model achieves excellent performance (3.47 PESQ, 0.96 STOI, and 11.09 dB SSNR) with a relatively small model size (1.14 M). Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 503 KiB  
Article
A Unified Approach to Nested and Non-Nested Slots for Spoken Language Understanding
by Xue Wan, Wensheng Zhang, Mengxing Huang, Siling Feng and Yuanyuan Wu
Electronics 2023, 12(7), 1748; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12071748 - 06 Apr 2023
Cited by 1 | Viewed by 1130
Abstract
As chatbots become more popular, multi-intent spoken language understanding (SLU) has received unprecedented attention. Multi-intent SLU, which primarily comprises the two subtasks of multiple intent detection (ID) and slot filling (SF), has the potential for widespread implementation. The two primary issues with the current approaches are as follows: (1) They cannot solve the problem of slot nesting; (2) The performance and inference rate of the model are not high enough. To address these issues, we suggest a multi-intent joint model based on global pointers to handle nested and non-nested slots. Firstly, we constructed a multi-dimensional type-slot label interaction network (MTLN) for subsequent intent decoding to enhance the implicit correlation between intents and slots, which allows for more adequate information about each other. Secondly, the global pointer network (GP) was introduced, which not only deals with nested and non-nested slots and slot incoherence but also has a faster inference rate and better performance than the baseline model. On two multi-intent datasets, the proposed model achieves state-of-the-art results on MixATIS with 1.6% improvement of intent Acc, 0.1% improvement of slot F1 values, 3.1% improvement of sentence Acc values, and 1.2%, 1.1% and 4.5% performance improvements on MixSNIPS, respectively. Meanwhile, the inference rate is also improved. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 2378 KiB  
Article
Automated Segmentation to Make Hidden Trigger Backdoor Attacks Robust against Deep Neural Networks
by Saqib Ali, Sana Ashraf, Muhammad Sohaib Yousaf, Shazia Riaz and Guojun Wang
Appl. Sci. 2023, 13(7), 4599; https://0-doi-org.brum.beds.ac.uk/10.3390/app13074599 - 05 Apr 2023
Cited by 1 | Viewed by 1635
Abstract
The successful outcomes of deep learning (DL) algorithms in diverse fields have prompted researchers to consider backdoor attacks on DL models in order to defend them in practical applications. Adversarial examples could deceive a safety-critical system, which could lead to hazardous situations. To cope with this, we suggest a segmentation technique that makes hidden-trigger backdoor attacks more robust. The tiny trigger patterns are conventionally established by a series of parameters encompassing their size, location, color, shape, and other defining attributes. From the original triggers, alternate triggers are generated to control the backdoor patterns by a third party in addition to their original designer, which can produce a higher success rate than the original triggers. However, the significant downside of these approaches is the lack of automation in the scene segmentation phase, which results in the poor optimization of the threat model. We developed a novel technique that automatically generates alternate triggers to increase their effectiveness. Image denoising is performed for this purpose, followed by scene segmentation techniques to make the poisoned classifier more robust. The experimental results demonstrated that our proposed technique achieved 99% to 100% accuracy and helped reduce the vulnerabilities of DL models by exposing their loopholes. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 4060 KiB  
Article
Multi-Modal Fake News Detection via Bridging the Gap between Modals
by Peng Liu, Wenhua Qian, Dan Xu, Bingling Ren and Jinde Cao
Entropy 2023, 25(4), 614; https://0-doi-org.brum.beds.ac.uk/10.3390/e25040614 - 04 Apr 2023
Cited by 4 | Viewed by 2131
Abstract
Multi-modal fake news detection aims to identify fake information through text and corresponding images. The current methods purely combine images and text scenarios by a vanilla attention module but there exists a semantic gap between different scenarios. To address this issue, we introduce an image caption-based method to enhance the model’s ability to capture semantic information from images. Formally, we integrate image description information into the text to bridge the semantic gap between text and images. Moreover, to optimize image utilization and enhance the semantic interaction between images and text, we combine global and object features from the images for the final representation. Finally, we leverage a transformer to fuse the above multi-modal content. We carried out extensive experiments on two publicly available datasets, and the results show that our proposed method significantly improves performance compared to other existing methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 2796 KiB  
Article
Utility Analysis about Log Data Anomaly Detection Based on Federated Learning
by Tae-Ho Shin and Soo-Hyung Kim
Appl. Sci. 2023, 13(7), 4495; https://0-doi-org.brum.beds.ac.uk/10.3390/app13074495 - 01 Apr 2023
Cited by 1 | Viewed by 1343
Abstract
Logs that record system information are managed for anomaly detection, and as logs have increased in complexity and scale, more efficient anomaly detection methods have been proposed. Accordingly, deep learning models that automatically detect system anomalies through log data learning have been proposed. However, in existing log anomaly detection models, user logs are collected on a central server system, exposing the data collection process to the risk of leaking sensitive information. Federated learning, a distributed learning method, has been proposed for artificial intelligence learning on sensitive information because it guarantees the anonymity of the collected user data: the central server collects only the weights learned by each local server. In this paper, we executed an experiment on system log anomaly detection using federated learning. The results demonstrate the feasibility of applying federated learning to deep-learning-based system-log anomaly detection compared to the existing centralized learning method. Moreover, we present an efficient deep-learning model based on federated learning for system log anomaly detection. Full article
(This article belongs to the Topic Machine and Deep Learning)
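The privacy argument rests on a FedAvg-style loop: clients train locally on their own logs and the server averages weights only. Below is a minimal sketch, assuming a simple binary anomaly classifier without integer buffers; the model, loaders, and hyperparameters are placeholders rather than the paper's configuration.

```python
import copy
import torch
import torch.nn as nn

def fedavg(global_model, client_loaders, rounds=5, local_epochs=1, lr=1e-3):
    for _ in range(rounds):
        client_states = []
        for loader in client_loaders:            # one loader per local server
            local = copy.deepcopy(global_model)  # start from the global weights
            opt = torch.optim.Adam(local.parameters(), lr=lr)
            loss_fn = nn.BCEWithLogitsLoss()     # anomaly / normal labels
            for _ in range(local_epochs):
                for x, y in loader:
                    opt.zero_grad()
                    loss_fn(local(x).squeeze(-1), y.float()).backward()
                    opt.step()
            client_states.append(local.state_dict())
        # The central server aggregates weights only -- raw logs never leave
        # the clients. Equal weighting assumes similarly sized client datasets.
        avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
               for k in client_states[0]}
        global_model.load_state_dict(avg)
    return global_model
```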

14 pages, 3955 KiB  
Article
TS-CGANet: A Two-Stage Complex and Real Dual-Path Sub-Band Fusion Network for Full-Band Speech Enhancement
by Haozhe Chen and Xiaojuan Zhang
Appl. Sci. 2023, 13(7), 4431; https://0-doi-org.brum.beds.ac.uk/10.3390/app13074431 - 31 Mar 2023
Viewed by 1315
Abstract
Speech enhancement based on deep neural networks faces difficulties, as modeling more frequency bands can lead to a decrease in the resolution of low-frequency bands and increase the computational complexity. Previously, we proposed a convolution-augmented gated attention unit (CGAU), which captured local and global correlation in speech signals through the fusion of the convolution and gated attention unit. In this paper, we further improved the CGAU, and proposed a two-stage complex and real dual-path sub-band fusion network for full-band speech enhancement called TS-CGANet. Specifically, we proposed a dual-path CGA network to enhance low-band (0–8 kHz) speech signals. In the medium band (8–16 kHz) and high band (16–24 kHz), noise suppression is only performed in the magnitude domain. The Voice Bank+DEMAND dataset was used to conduct experiments on the proposed TS-CGANet, which consistently outperformed state-of-the-art full-band baselines, as evidenced by the results. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 4334 KiB  
Article
Energy Dispatch for CCHP System in Summer Based on Deep Reinforcement Learning
by Wenzhong Gao and Yifan Lin
Entropy 2023, 25(3), 544; https://0-doi-org.brum.beds.ac.uk/10.3390/e25030544 - 21 Mar 2023
Cited by 2 | Viewed by 1276
Abstract
Combined cooling, heating, and power (CCHP) systems are an effective solution to energy and environmental problems. However, due to demand-side load uncertainty, load-prediction error, environmental change, and demand charges, the energy dispatch optimization of a CCHP system is a tough challenge. In view of this, this paper proposes a dispatch method based on the deep reinforcement learning (DRL) algorithm DoubleDQN to generate an optimal dispatch strategy for the CCHP system in the summer. By integrating DRL, this method does not require any prediction information and can adapt to the load uncertainty. The simulation results show that, compared with strategies based on benchmark policies and DQN, the proposed dispatch strategy not only preserves thermal comfort well, but also reduces the total intra-month cost by 0.13~31.32%, of which the demand charge is reduced by 2.19~46.57%. In addition, tests under extended scenarios show that this method has the potential to be applied in the real world. Full article
(This article belongs to the Topic Machine and Deep Learning)
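The DoubleDQN update that the dispatch method builds on is compact enough to show directly. In the hedged sketch below (the networks and replay-buffer batch are placeholders), the online network selects the next action while the target network evaluates it, which curbs the Q-value over-estimation of vanilla DQN.

```python
import torch
import torch.nn.functional as F

def double_dqn_loss(online_net, target_net, batch, gamma=0.99):
    s, a, r, s_next, done = batch                 # tensors from a replay buffer
    # Q-value of the action actually taken, from the online network.
    q_sa = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        a_star = online_net(s_next).argmax(dim=1)           # selection: online net
        q_next = target_net(s_next).gather(                 # evaluation: target net
            1, a_star.unsqueeze(1)).squeeze(1)
        target = r + gamma * (1.0 - done.float()) * q_next  # bootstrap target
    return F.mse_loss(q_sa, target)
```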

18 pages, 6162 KiB  
Article
Extraction of Interconnect Parasitic Capacitance Matrix Based on Deep Neural Network
by Yaoyao Ma, Xiaoyu Xu, Shuai Yan, Yaxing Zhou, Tianyu Zheng, Zhuoxiang Ren and Lan Chen
Electronics 2023, 12(6), 1440; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12061440 - 17 Mar 2023
Cited by 1 | Viewed by 1951
Abstract
Interconnect parasitic capacitance extraction is crucial in analyzing VLSI circuits’ delay and crosstalk. This paper uses the deep neural network (DNN) to predict the parasitic capacitance matrix of a two-dimensional pattern. To save the DNN training time, the neural network’s output includes only coupling capacitances in the matrix, and total capacitances are obtained by summing corresponding predicted coupling capacitances. In this way, we can obtain coupling and total capacitances simultaneously using a single neural network. Moreover, we introduce a mirror flip method to augment the datasets computed by the finite element method (FEM), which doubles the dataset size and reduces data preparation efforts. Then, we compare the prediction accuracy of DNN with another neural network ResNet. The result shows that DNN performs better in this case. Moreover, to verify our method’s efficiency, the total capacitances calculated from the trained DNN are compared with the network (named DNN-2) that takes the total capacitance as an extra output. The results show that the prediction accuracy of the two methods is very close, indicating that our method is reliable and can save the training workload for the total capacitance. Finally, a solving efficiency comparison shows that the average computation time of the trained DNN for one case is not more than 2% of that of FEM. Full article
(This article belongs to the Topic Machine and Deep Learning)
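Two of the abstract's ideas are simple enough to sketch: recovering each total capacitance as the sum of the predicted couplings in its row, and mirror-flipping a 2-D pattern to double the FEM dataset. The permutation logic below assumes conductors are indexed left to right; it is an illustration, not the authors' code.

```python
import numpy as np

def total_from_couplings(coupling_row):
    """Total capacitance of a conductor = sum of its couplings to all others,
    so the network only has to predict the coupling terms."""
    return np.sum(coupling_row)

def mirror_flip_sample(pattern, coupling_matrix):
    """Flip the 2-D layout left-right to create a second training sample.
    If conductors are indexed left to right, their order reverses, so the
    coupling matrix must be permuted consistently."""
    flipped = pattern[:, ::-1]
    perm = np.arange(coupling_matrix.shape[0])[::-1]
    return flipped, coupling_matrix[np.ix_(perm, perm)]
```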

17 pages, 6183 KiB  
Article
Feature Fusion and Metric Learning Network for Zero-Shot Sketch-Based Image Retrieval
by Honggang Zhao, Mingyue Liu and Mingyong Li
Entropy 2023, 25(3), 502; https://0-doi-org.brum.beds.ac.uk/10.3390/e25030502 - 14 Mar 2023
Cited by 1 | Viewed by 1321
Abstract
Zero-shot sketch-based image retrieval (ZS-SBIR) is an important computer vision problem. The image categories in the test phase are new categories that were not visible in the training stage. Because sketches are extremely abstract, the commonly used backbone networks (such as VGG-16 and ResNet-50) cannot handle both sketches and photos. Semantic similarities between the same features in photos and sketches are difficult to reflect in deep models without textual assistance. To solve this problem, we propose a novel and effective feature embedding model called Attention Map Feature Fusion (AMFF). The AMFF model combines the excellent feature extraction capability of the ResNet-50 network with the excellent representation ability of the attention network. By processing the residuals of the ResNet-50 network, the attention map is finally obtained without introducing external semantic knowledge. Most previous approaches treat the ZS-SBIR problem as a classification problem, which ignores the huge domain gap between sketches and photos. This paper proposes an effective method to optimize the entire network, called domain-aware triplets (DAT). Domain feature discrimination and semantic feature embedding can be learned through DAT. In this paper, we also use the classification loss function to stabilize the training process to avoid getting trapped in a local optimum. Compared with the state-of-the-art methods, our method shows a superior performance. For example, on the TU-Berlin dataset, we achieved 61.2 ± 1.2% Prec@200. On the Sketchy_c100 dataset, we achieved 62.3 ± 3.3% mAP@all and 75.5 ± 1.5% Prec@100. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 10865 KiB  
Article
Rock Image Classification Based on EfficientNet and Triplet Attention Mechanism
by Zhihao Huang, Lumei Su, Jiajun Wu and Yuhan Chen
Appl. Sci. 2023, 13(5), 3180; https://0-doi-org.brum.beds.ac.uk/10.3390/app13053180 - 01 Mar 2023
Cited by 9 | Viewed by 2889
Abstract
Rock image classification is a fundamental and crucial task in the creation of geological surveys. Traditional rock image classification methods mainly rely on manual operation, resulting in high costs and unstable accuracy. While existing methods based on deep learning models have overcome the limitations of traditional methods and achieved intelligent image classification, they still suffer from low accuracy due to suboptimal network structures. In this study, a rock image classification model based on EfficientNet and a triplet attention mechanism is proposed to achieve accurate end-to-end classification. The model was built on EfficientNet, which boasts an efficient network structure thanks to NAS technology and a compound model scaling method, thus achieving high accuracy for rock image classification. Additionally, the triplet attention mechanism was introduced to address the shortcoming of EfficientNet in feature expression and enable the model to fully capture the channel and spatial attention information of rock images, further improving accuracy. During network training, transfer learning was employed by loading pre-trained model parameters into the classification model, which accelerated convergence and reduced training time. The results show that the classification model with transfer learning achieved 92.6% accuracy in the training set and 93.2% Top-1 accuracy in the test set, outperforming other mainstream models and demonstrating strong robustness and generalization ability. Full article
(This article belongs to the Topic Machine and Deep Learning)
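The transfer-learning setup described above follows a standard pattern: load ImageNet-pretrained EfficientNet weights, replace the classifier head with one sized for the rock classes, and fine-tune. Below is a minimal torchvision sketch, with the class count assumed and the paper's triplet attention module omitted (it would sit inside the backbone itself).

```python
import torch.nn as nn
from torchvision import models

NUM_ROCK_CLASSES = 7  # assumed; set to the dataset's actual class count

# Load pretrained weights so convergence is faster, as the abstract notes.
model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)

# Swap the ImageNet head (1000 classes) for a rock-classification head.
in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, NUM_ROCK_CLASSES)

# Fine-tune all layers (or freeze early blocks) with an ordinary
# cross-entropy training loop on the rock image dataset.
```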

9 pages, 543 KiB  
Communication
Detecting Phishing Accounts on Ethereum Based on Transaction Records and EGAT
by Xuanchen Zhou, Wenzhong Yang and Xiaodan Tian
Electronics 2023, 12(4), 993; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12040993 - 16 Feb 2023
Cited by 5 | Viewed by 1784
Abstract
In recent years, the losses caused by scams on Ethereum have reached a level that cannot be ignored. As one of the most rampant crimes, phishing scams have caused huge economic losses to blockchain platforms and users. Under these circumstances, to address the threat to the financial security of blockchain, an Edge Aggregated Graph Attention Network (EGAT) based on the static subgraph representation of the transaction network is proposed. This study intends to detect Ethereum phishing accounts through the classification of transaction network subgraphs with the following procedure. Firstly, the accounts are used as nodes and the flow of transaction funds is used as directed edges to construct the transaction network graph. Secondly, the transaction records of phishing accounts publicly available on Ethereum are analyzed, and statistical features of the Value, Gas, and Timestamp fields are manually constructed as node and edge features of the graph. Finally, the features are extracted and classified using the EGAT network. According to the experimental results, the recall of the proposed method is 99.3% on the dataset of phishing accounts. As demonstrated, EGAT is more efficient and accurate than Graph2Vec and DeepWalk, and graph structure features can express semantics better than manual features and simple transaction networks, which effectively improves the performance of phishing account detection. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 365 KiB  
Article
TKRM: Learning a Transfer Kernel Regression Model for Cross-Database Micro-Expression Recognition
by Zixuan Chen, Cheng Lu, Feng Zhou and Yuan Zong
Mathematics 2023, 11(4), 918; https://0-doi-org.brum.beds.ac.uk/10.3390/math11040918 - 11 Feb 2023
Viewed by 995
Abstract
Cross-database micro-expression recognition (MER) is a more challenging task than the conventional one because its labeled training (source) and unlabeled testing (target) micro-expression (ME) samples are from different databases. In this circumstance, a large feature-distribution gap may exist between the source and target ME samples due to the different sample sources, which decreases the recognition performance of existing MER methods. In this paper, we focus on this challenging task by proposing a simple yet effective method called the transfer kernel regression model (TKRM). The basic idea of TKRM is to find an ME-discriminative, database-invariant and common reproduced kernel Hilbert space (RKHS) to bridge MEs belonging to different databases. For this purpose, TKRM has the ME discriminative ability of learning a kernel mapping operator to generate an RKHS and build the relationship between the kernelized ME features and labels in such RKHS. Meanwhile, an additional novel regularization term called target sample reconstruction (TSR) is also designed to benefit kernel mapping operator learning by improving the database-invariant ability of TKRM while preserving the ME-discriminative one. To evaluate the proposed TKRM method, we carried out extensive cross-database MER experiments on widely used micro-expression databases, including CASME II and SMIC. Experimental results obtained proved that the proposed TKRM method is indeed superior to recent state-of-the-art domain adaptation methods for cross-database MER. Full article
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 2300 KiB  
Article
FAD: Fine-Grained Adversarial Detection by Perturbation Intensity Classification
by Jin-Tao Yang, Hao Jiang, Hao Li, Dong-Sheng Ye and Wei Jiang
Entropy 2023, 25(2), 335; https://0-doi-org.brum.beds.ac.uk/10.3390/e25020335 - 11 Feb 2023
Cited by 1 | Viewed by 1342
Abstract
Adversarial examples present a severe threat to deep neural networks’ application in safety-critical domains such as autonomous driving. Although there are numerous defensive solutions, they all have some flaws, such as the fact that they can only defend against adversarial attacks with a limited range of adversarial intensities. Therefore, there is a need for a detection method that can distinguish the adversarial intensity in a fine-grained manner so that subsequent tasks can perform different defense processing against perturbations of various intensities. Based on the fact that adversarial attack samples of different intensities are significantly different in the high-frequency region, this paper proposes a method to amplify the high-frequency component of the image and input it into a deep neural network based on the residual block structure. To the best of our knowledge, the proposed method is the first to classify adversarial intensities at a fine-grained level, thus providing an attack detection component for a general AI firewall. Experimental results show that our proposed method not only achieves advanced performance in AutoAttack detection by perturbation intensity classification, but can also effectively detect examples of unseen adversarial attack methods. Full article
(This article belongs to the Topic Machine and Deep Learning)
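The high-frequency amplification step can be approximated with a radial mask in the 2-D Fourier domain. The cutoff and gain below are assumed values and the exact filter the paper uses may differ; this is only a sketch of the idea of boosting the spectral region where perturbations of different intensities separate.

```python
import numpy as np

def amplify_high_freq(img, cutoff=0.1, gain=4.0):
    """img: 2-D grayscale array; cutoff: fraction of the spectrum treated as
    'low frequency'; gain: amplification applied outside that band (assumed)."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h / 2, xx - w / 2)       # radial distance from DC
    high_mask = dist > cutoff * min(h, w)         # everything outside the low band
    F_amp = np.where(high_mask, F * gain, F)      # boost high frequencies only
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_amp)))
```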

26 pages, 9501 KiB  
Article
A Score-Based Approach for Training Schrödinger Bridges for Data Modelling
by Ludwig Winkler, Cesar Ojeda and Manfred Opper
Entropy 2023, 25(2), 316; https://0-doi-org.brum.beds.ac.uk/10.3390/e25020316 - 08 Feb 2023
Viewed by 2445
Abstract
A Schrödinger bridge is a stochastic process connecting two given probability distributions over time. It has recently been applied as an approach for generative data modelling. The computational training of such bridges requires the repeated estimation of the drift function for a time-reversed stochastic process using samples generated by the corresponding forward process. We introduce a modified score-function-based method for computing such reverse drifts, which can be efficiently implemented by a feed-forward neural network. We applied our approach to artificial datasets of increasing complexity. Finally, we evaluated its performance on genetic data, where Schrödinger bridges can be used to model the time evolution of single-cell RNA measurements. Full article
(This article belongs to the Topic Machine and Deep Learning)
(This article belongs to the Section Information Theory, Probability and Statistics)

17 pages, 343 KiB  
Article
A Reasonable Effectiveness of Features in Modeling Visual Perception of User Interfaces
by Maxim Bakaev, Sebastian Heil and Martin Gaedke
Big Data Cogn. Comput. 2023, 7(1), 30; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc7010030 - 08 Feb 2023
Viewed by 1504
Abstract
Training data for user behavior models that predict subjective dimensions of visual perception are often too scarce for deep learning methods to be applicable. With the typical datasets in HCI limited to thousands or even hundreds of records, feature-based approaches are still widely used in the visual analysis of graphical user interfaces (UIs). In our paper, we benchmarked the predictive accuracy of two types of neural network (NN) models and explored the effects of the number of features and the dataset volume. To this end, we used two datasets that comprised over 4000 webpage screenshots, assessed by 233 subjects on the subjective dimensions of Complexity, Aesthetics and Orderliness. With the experimental data, we constructed and trained 1908 models. The feature-based NNs demonstrated a 16.2% lower mean squared error (MSE) than the convolutional NNs (a modified GoogLeNet architecture); however, the CNNs’ accuracy improved with a larger dataset volume, whereas the feature-based NNs’ did not. Therefore, provided that the effect of more data on the models’ error improvement is linear, the CNNs should become superior at dataset sizes over 3000 UIs. Unexpectedly, adding more features to the NN models caused the MSE to increase slightly, by 1.23%: although the difference was not significant, this confirms the importance of careful feature engineering. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 7944 KiB  
Article
Gradient Agreement Hinders the Memorization of Noisy Labels
by Shaotian Yan, Xiang Tian, Rongxin Jiang and Yaowu Chen
Appl. Sci. 2023, 13(3), 1823; https://0-doi-org.brum.beds.ac.uk/10.3390/app13031823 - 31 Jan 2023
Viewed by 1089
Abstract
The performance of deep neural networks (DNNs) critically relies on high-quality annotations, while training DNNs with noisy labels remains challenging owing to their incredible capacity to memorize the entire training set. In this work, we use two synchronously trained networks to reveal that noisy labels may result in more divergent gradients when updating the parameters. To overcome this, we propose a novel co-training framework named gradient agreement learning (GAL). By dynamically evaluating the gradient agreement coefficient of every pair of parameters from two identical DNNs to determine whether to update them in the training process, GAL can effectively hinder the memorization of noisy labels. Furthermore, we utilize the pseudo-labels produced by the two DNNs as the supervision for the training of another network, thereby gaining further improvement by correcting some noisy labels while overcoming confirmation bias. Extensive experiments on various benchmark datasets demonstrate the superiority of the proposed GAL. Full article
(This article belongs to the Topic Machine and Deep Learning)
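A minimal sketch of the gradient-agreement idea from the abstract above, assuming that sign agreement between the per-parameter gradients of the two synchronously trained networks gates each update; the paper's actual agreement coefficient may be defined differently, and `masked_sgd_step` with its arguments is purely illustrative:

```python
import torch

def masked_sgd_step(net_a, net_b, loss_a, loss_b, lr=0.1):
    """One SGD step that only updates parameters whose gradients from the
    two synchronously trained networks agree in sign; a simplified stand-in
    for the paper's gradient agreement coefficient."""
    loss_a.backward()
    loss_b.backward()
    with torch.no_grad():
        for p_a, p_b in zip(net_a.parameters(), net_b.parameters()):
            if p_a.grad is None or p_b.grad is None:
                continue
            agree = (p_a.grad.sign() == p_b.grad.sign()).float()
            p_a -= lr * agree * p_a.grad   # masked update: disagreeing
            p_b -= lr * agree * p_b.grad   # coordinates are frozen
            p_a.grad.zero_()
            p_b.grad.zero_()
```

The intuition follows the abstract: gradients that diverge between the two networks are more likely to be driven by noisy labels, so freezing those coordinates slows memorization.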

27 pages, 2357 KiB  
Article
Technical Study of Deep Learning in Cloud Computing for Accurate Workload Prediction
by Zaakki Ahamed, Maher Khemakhem, Fathy Eassa, Fawaz Alsolami and Abdullah S. Al-Malaise Al-Ghamdi
Electronics 2023, 12(3), 650; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12030650 - 28 Jan 2023
Cited by 3 | Viewed by 2221
Abstract
Proactive resource management in cloud services not only maximizes cost effectiveness but also helps overcome issues such as Service Level Agreement (SLA) violations and inefficient resource provisioning. Workload prediction using Deep Learning (DL) is a popular method for inferring the complicated multidimensional data of cloud environments to meet this requirement. The overall quality of the model depends on the quality of the data as much as on the architecture; therefore, the data used to train the model must be of good quality. However, existing works in this domain have either used a single data source or have not taken into account the importance of uniformity for unbiased and accurate analysis, and the efficacy of DL models suffers as a result. In this paper, we provide a technical analysis of using DL models such as Recurrent Neural Networks (RNN), Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN) to exploit the time series characteristics of real-world workloads from the Parallel Workloads Archive in the Standard Workload Format (SWF), with the aim of conducting an unbiased analysis. The robustness of these models is evaluated using the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics. The findings highlight that the LSTM model exhibits the best performance compared to the other models. Additionally, to the best of our knowledge, insights into DL for workload prediction in cloud computing environments are insufficient in the literature. To address these challenges, we provide a comprehensive background on resource management and load prediction using DL. Then, we break down the models, error metrics, and data sources across different bodies of work. Full article
(This article belongs to the Topic Machine and Deep Learning)
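As context for the compared architectures, a minimal sliding-window LSTM regressor of the kind benchmarked above might look as follows; the layer sizes, window length, and synthetic tensors are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn

class WorkloadLSTM(nn.Module):
    """Sliding-window LSTM regressor: the last w workload readings
    predict the next one (hyperparameters are illustrative)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # predict the next value

model = WorkloadLSTM()
x, target = torch.randn(32, 24, 1), torch.randn(32, 1)  # 32 windows of 24 steps
pred = model(x)
mae = nn.L1Loss()(pred, target)                   # Mean Absolute Error
rmse = torch.sqrt(nn.MSELoss()(pred, target))     # Root Mean Squared Error
```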

13 pages, 867 KiB  
Article
Maximum Entropy Exploration in Contextual Bandits with Neural Networks and Energy Based Models
by Adam Elwood, Marco Leonardi, Ashraf Mohamed and Alessandro Rozza
Entropy 2023, 25(2), 188; https://0-doi-org.brum.beds.ac.uk/10.3390/e25020188 - 18 Jan 2023
Cited by 1 | Viewed by 1543
Abstract
Contextual bandits can solve a huge range of real-world problems. However, current popular algorithms to solve them either rely on linear models or on unreliable uncertainty estimation in non-linear models, which are required to deal with the exploration–exploitation trade-off. Inspired by theories of human cognition, we introduce novel techniques that use maximum entropy exploration, relying on neural networks to find optimal policies in settings with both continuous and discrete action spaces. We present two classes of models: one with neural networks as reward estimators, and the other with energy-based models, which model the probability of obtaining an optimal reward given an action. We evaluate the performance of these models in static and dynamic contextual bandit simulation environments. We show that both techniques outperform standard baseline algorithms, such as NN HMC, NN Discrete, Upper Confidence Bound, and Thompson Sampling, with the energy-based models achieving the best overall performance. This provides practitioners with new techniques that perform well in static and dynamic settings, and that are particularly well suited to non-linear scenarios with continuous action spaces. Full article
(This article belongs to the Topic Machine and Deep Learning)
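The maximum-entropy exploration idea can be illustrated with a Boltzmann (softmax) policy over estimated rewards for discrete actions; `alpha` plays the role of an exploration temperature, and the function is a generic sketch rather than the paper's exact algorithm:

```python
import numpy as np

def softmax_policy(reward_estimates, alpha=1.0):
    """Maximum-entropy action selection over discrete actions: sample
    from a Boltzmann distribution instead of taking a pure argmax.
    High alpha favors exploration, low alpha favors exploitation."""
    z = reward_estimates / alpha
    z = z - z.max()                        # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return np.random.choice(len(p), p=p), p

action, probs = softmax_policy(np.array([0.2, 0.5, 0.1]), alpha=0.3)
```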

18 pages, 3117 KiB  
Article
Long-Range Dependence Involutional Network for Logo Detection
by Xingzhuo Li, Sujuan Hou, Baisong Zhang, Jing Wang, Weikuan Jia and Yuanjie Zheng
Entropy 2023, 25(1), 174; https://0-doi-org.brum.beds.ac.uk/10.3390/e25010174 - 15 Jan 2023
Cited by 5 | Viewed by 2050
Abstract
Logo detection is one of the crucial branches of computer vision owing to its various real-world applications, such as automatic logo detection and recognition, intelligent transportation, and trademark infringement detection. Compared with traditional handcrafted-feature-based methods, deep learning-based convolutional neural networks (CNNs) can learn both low-level and high-level image features. Recent decades have witnessed the great feature representation capabilities of deep CNNs and their variants, which have been very good at discovering intricate structures in high-dimensional data and are thereby applicable to many domains, including logo detection. However, logo detection remains challenging, as existing detection methods cannot handle well the problems of multiple scales and large aspect ratios. In this paper, we tackle these challenges by developing a novel long-range dependence involutional network (LDI-Net). Specifically, we design long-range dependence involution (LD involution), a strategy that combines a new operator with a self-attention mechanism by rethinking the intrinsic principle of convolution, to alleviate the detection difficulties caused by large aspect ratios. We also introduce a multilevel representation neural architecture search (MRNAS) to detect multiscale logo objects by constructing a novel multipath topology. In addition, we implement an adaptive RoI pooling module (ARM) to improve detection efficiency by addressing the problem of logo deformation. Comprehensive experiments on four benchmark logo datasets demonstrate the effectiveness and efficiency of the proposed approach. Full article
(This article belongs to the Topic Machine and Deep Learning)

24 pages, 1042 KiB  
Article
Constructing Traceability Links between Software Requirements and Source Code Based on Neural Networks
by Peng Dai, Li Yang, Yawen Wang, Dahai Jin and Yunzhan Gong
Mathematics 2023, 11(2), 315; https://0-doi-org.brum.beds.ac.uk/10.3390/math11020315 - 07 Jan 2023
Viewed by 2182
Abstract
Software requirement changes, code changes, software reuse, and testing are important activities in software engineering that involve the traceability links between software requirements and code. Software requirement documents, design documents, code documents, and test case documents are the intermediate products of software development. The lack of interrelationships between these documents can make it extremely difficult to change and maintain the software. Frequent requirement and code changes are inevitable in software development. Software reuse, change impact analysis, and testing also require the relationship between software requirements and code. Using these traceability links can improve the efficiency and quality of the related software activities. Existing methods for constructing these links fall short in automation and accuracy. To address these problems, we propose to embed software requirements and source code into feature vectors containing their semantic information based on four neural networks (NBOW, RNN, CNN, and self-attention). Accurate traceability links from requirements to code are established by comparing the similarity between these vectors. We develop a prototype tool, RCT, based on this method. The performance of these four networks in constructing links is explored on 18 open-source projects. The experimental results show that the self-attention network performs best, with an average Recall@50 value of 0.687 on the 18 projects, which is higher than the other three neural network models and much higher than previous approaches using information retrieval and machine learning. Full article
(This article belongs to the Topic Machine and Deep Learning)
(This article belongs to the Section Network Science)
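The core retrieval step, ranking code artifacts for each requirement by vector similarity and scoring Recall@50, can be sketched as follows; the cosine-similarity ranking is generic, and the data layout with `links` as sets of true code indices is an assumption:

```python
import numpy as np

def recall_at_k(req_vecs, code_vecs, links, k=50):
    """Rank code artifacts for each requirement by cosine similarity and
    measure Recall@k. links[i] is the set of true code indices for
    requirement i (an illustrative data layout)."""
    r = req_vecs / np.linalg.norm(req_vecs, axis=1, keepdims=True)
    c = code_vecs / np.linalg.norm(code_vecs, axis=1, keepdims=True)
    sims = r @ c.T                          # (n_req, n_code) similarity matrix
    hits, total = 0, 0
    for i, true_idx in enumerate(links):
        top_k = set(np.argsort(-sims[i])[:k])
        hits += len(top_k & true_idx)
        total += len(true_idx)
    return hits / total
```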

16 pages, 3697 KiB  
Article
FCKDNet: A Feature Condensation Knowledge Distillation Network for Semantic Segmentation
by Wenhao Yuan, Xiaoyan Lu, Rongfen Zhang and Yuhong Liu
Entropy 2023, 25(1), 125; https://0-doi-org.brum.beds.ac.uk/10.3390/e25010125 - 07 Jan 2023
Viewed by 1977
Abstract
As a popular research subject in the field of computer vision, knowledge distillation (KD) is widely used in semantic segmentation (SS). However, under the learning paradigm of the teacher–student model, the poor quality of teacher network feature knowledge still hinders the development of KD technology. In this paper, we investigate the output features of the teacher–student network and propose a feature condensation-based KD network (FCKDNet), which reduces pseudo-knowledge transfer in the teacher–student network. First, based on a pixel-wise information entropy calculation rule, we design a feature condensation method to separate the foreground feature knowledge from the background noise of the teacher network outputs. Then, the obtained feature condensation matrix is applied to the original outputs of the teacher and student networks to improve the feature representation capability. In addition, after performing feature condensation on the teacher network, we propose a soft feature enhancement method based on the spatial and channel dimensions to improve the dependency of pixels in the feature maps. Finally, we divide the outputs of the teacher network into spatial condensation features and channel condensation features and compute the distillation loss with the student network separately, which helps the student network converge faster. Extensive experiments on the public datasets Pascal VOC and Cityscapes demonstrate that our proposed method improves the baseline by 3.16% and 2.98% in terms of mAcc, and by 2.03% and 2.30% in terms of mIoU, respectively, and has better segmentation performance and robustness than mainstream methods. Full article
(This article belongs to the Topic Machine and Deep Learning)
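A rough sketch of the entropy-based separation of foreground knowledge from background noise, assuming low-entropy (confident) teacher pixels are kept for distillation; the threshold `tau` and the exact rule are assumptions, not the paper's formula:

```python
import torch
import torch.nn.functional as F

def condensation_mask(teacher_logits, tau=0.5):
    """Separate informative foreground pixels from background noise using
    the per-pixel entropy of the teacher's class distribution. Pixels with
    normalized entropy below tau are kept for distillation."""
    p = F.softmax(teacher_logits, dim=1)               # (B, C, H, W)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1)    # (B, H, W)
    max_ent = torch.log(torch.tensor(float(teacher_logits.shape[1])))
    return (entropy / max_ent < tau).unsqueeze(1).float()  # (B, 1, H, W)
```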

10 pages, 962 KiB  
Article
Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation
by Hongliang Fu, Zhihao Zhuang, Yang Wang, Chen Huang and Wenzhuo Duan
Entropy 2023, 25(1), 124; https://0-doi-org.brum.beds.ac.uk/10.3390/e25010124 - 07 Jan 2023
Cited by 3 | Viewed by 1865
Abstract
To solve the problem of feature distribution discrepancy in cross-corpus speech emotion recognition tasks, this paper proposes an emotion recognition model based on multi-task learning and subdomain adaptation, which alleviates the impact of that discrepancy on emotion recognition. Existing methods have shortcomings in speech feature representation and cross-corpus feature distribution alignment. The proposed model uses a deep denoising auto-encoder as a shared feature extraction network for multi-task learning, and a fully connected layer and a softmax layer are added before each recognition task as task-specific layers. Subsequently, a subdomain adaptation algorithm for emotion and gender features is added to the shared network to obtain the shared emotion and gender features of the source and target domains, respectively. Multi-task learning effectively enhances the representation ability of the features, while the subdomain adaptation algorithm promotes their transferability and effectively alleviates the impact of feature distribution differences on the emotional features. Averaged over six cross-corpus speech emotion recognition experiments, the weighted average recall rate is increased by 1.89–10.07% compared with other models; these results verify the validity of the proposed model. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 597 KiB  
Article
Survey of Reinforcement-Learning-Based MAC Protocols for Wireless Ad Hoc Networks with a MAC Reference Model
by Zhichao Zheng, Shengming Jiang, Ruoyu Feng, Lige Ge and Chongchong Gu
Entropy 2023, 25(1), 101; https://0-doi-org.brum.beds.ac.uk/10.3390/e25010101 - 03 Jan 2023
Cited by 10 | Viewed by 3119
Abstract
In this paper, we conduct a survey of the literature on reinforcement learning (RL)-based medium access control (MAC) protocols. As the scale of the wireless ad hoc network (WANET) increases, traditional MAC solutions are becoming obsolete. Dynamic topology, resource allocation, interference management, limited bandwidth and energy constraints are crucial problems that need to be resolved when designing modern WANET architectures. For future MAC protocols to overcome the current limitations in frequently changing WANETs, more intelligence needs to be deployed to maintain efficient communications. After introducing some classic RL schemes, we investigate the existing state-of-the-art MAC protocols and related solutions for WANETs according to the MAC reference model and discuss how each proposed protocol works and the challenging issues concerning the related MAC model components. Finally, this paper discusses future research directions on how RL can be used to enable high-performance MAC protocols. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1478 KiB  
Article
Deep Interest Network Based on Knowledge Graph Embedding
by Dehai Zhang, Haoxing Wang, Xiaobo Yang, Yu Ma, Jiashu Liang and Anquan Ren
Appl. Sci. 2023, 13(1), 357; https://0-doi-org.brum.beds.ac.uk/10.3390/app13010357 - 27 Dec 2022
Cited by 1 | Viewed by 1453
Abstract
Recommendation systems based on knowledge graphs often obtain user preferences through the user’s click matrix. However, the click matrix represents static data and cannot capture the dynamic preferences of users over time. Therefore, we propose DINK, a knowledge graph-based deep interest exploration network, to extract users’ dynamic interests. DINK consists of a knowledge graph embedding layer, an interest exploration layer, and a recommendation layer. The embedding layer expands the receptive field of the user’s click sequence through the knowledge graph, the interest exploration layer combines a GRU with an attention mechanism to explore the user’s dynamic interests, and the recommendation layer completes the prediction task. We demonstrate the effectiveness of DINK by conducting extensive experiments on three public datasets. Full article
(This article belongs to the Topic Machine and Deep Learning)
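The interest exploration layer, a GRU followed by attention pooling over the knowledge-graph-enriched click sequence, might be sketched as below; the dimensions and the additive scoring function are illustrative:

```python
import torch
import torch.nn as nn

class InterestLayer(nn.Module):
    """GRU over a (KG-enriched) click sequence followed by attention
    pooling; a simplified reading of DINK's interest exploration layer
    with illustrative dimensions."""
    def __init__(self, dim=32):
        super().__init__()
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, seq):                       # seq: (batch, steps, dim)
        h, _ = self.gru(seq)
        w = torch.softmax(self.score(h), dim=1)   # attention weights per step
        return (w * h).sum(dim=1)                 # dynamic interest vector
```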

11 pages, 1977 KiB  
Article
A Study on the Prediction of Electrical Energy in Food Storage Using Machine Learning
by Sangoh Kim
Appl. Sci. 2023, 13(1), 346; https://0-doi-org.brum.beds.ac.uk/10.3390/app13010346 - 27 Dec 2022
Cited by 3 | Viewed by 2217
Abstract
This study discusses methods for improving the sustainability of the freezers used in frozen storage, a common long-term food storage method. Freezing preserves the quality of food for a long time, but the method inevitably relies on freezers that consume a large amount of electricity. To maintain food quality, lower temperatures are required, and therefore more electrical energy must be used. In this study, machine learning was performed using data obtained through a freezer test, and an optimal inference model was obtained from these data. Applying the inference model to the selection of freezer control parameters shows that optimal food storage is possible while using less electrical energy. This paper describes a method for obtaining a dataset for machine learning from a deep freezer and the process of training single-layer perceptron (SLP) and multilayer perceptron (MLP) models on the obtained dataset. In addition, a method for finding the optimal efficiency is presented by comparing the performance of the inference models obtained with each approach. Applying such a development method can reduce electrical energy consumption in the food manufacturing equipment industry and, accordingly, contribute to carbon emission reductions. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 2635 KiB  
Brief Report
A Machine Learning Approach for the Forecasting of Computing Resource Requirements in Integrated Circuit Simulation
by Yue Wu, Hua Chen, Min Zhou and Faxin Yu
Electronics 2023, 12(1), 95; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12010095 - 26 Dec 2022
Viewed by 1357
Abstract
For the iterative development of a chip, ensuring that simulation is completed in the shortest time is critical. To meet this demand, the common practice is to reduce simulation time by providing more computing resources. However, this acceleration method has an upper limit: once it is reached, providing more CPUs no longer shortens the simulation time but instead wastes a large amount of computing resources. Unfortunately, the values recommended by existing commercial tools are often higher than this upper limit. To better match this limit, a machine learning optimization algorithm trained with a custom loss function is proposed. Experimental results demonstrate that the proposed algorithm is superior to commercial tools in terms of both accuracy and stability. In addition, simulations using the resources predicted by the proposed model maintain the same simulation completion time while reducing core-hour consumption by approximately 30%. Full article
(This article belongs to the Topic Machine and Deep Learning)
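The abstract does not give the custom loss, but an asymmetric penalty of the following kind would match the stated motivation (over-provisioning wastes core hours); the weights and the functional form are assumptions, not the paper's loss:

```python
import torch

def asymmetric_loss(pred_cpus, best_cpus, over_w=2.0, under_w=1.0):
    """Illustrative custom training loss: over-predicting the CPU count
    (wasted core hours) is penalized more heavily than under-predicting.
    The weights and the exact form are assumptions."""
    diff = pred_cpus - best_cpus
    return torch.where(diff > 0, over_w * diff, -under_w * diff).mean()
```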

20 pages, 13022 KiB  
Article
Citrus Tree Crown Segmentation of Orchard Spraying Robot Based on RGB-D Image and Improved Mask R-CNN
by Peichao Cong, Jiachao Zhou, Shanda Li, Kunfeng Lv and Hao Feng
Appl. Sci. 2023, 13(1), 164; https://0-doi-org.brum.beds.ac.uk/10.3390/app13010164 - 23 Dec 2022
Cited by 4 | Viewed by 2033
Abstract
Orchard spraying robots must visually obtain citrus tree crown growth information to meet variable, growth-stage-based spraying requirements. However, the complex environments and growth characteristics of fruit trees affect the accuracy of crown segmentation. Therefore, we propose a feature-map-based squeeze-and-excitation UNet++ (MSEU) region-based convolutional neural network (R-CNN) method for citrus tree crown segmentation that takes as input red–green–blue-depth (RGB-D) images that are pixel-aligned and visual-distance-adjusted to eliminate noise. Our MSEU R-CNN achieves accurate crown segmentation using squeeze-and-excitation (SE) and UNet++. To fully fuse the feature map information, the SE block correlates image features and recalibrates their channel weights, and the UNet++ semantic segmentation branch replaces the original mask structure to maximize the interconnectivity between feature layers, achieving a near-real-time detection speed of 5 fps. Its bounding box (bbox) and segmentation (seg) AP50 scores are 96.6 and 96.2%, respectively, and the bbox average recall and F1-score are 73.0 and 69.4%, which are 3.4, 2.4, 4.9, and 3.5% higher than the original model, respectively. Compared with the bounding box instance segmentation (BoxInst) and conditional convolution (CondInst) frameworks, the MSEU R-CNN provides better seg accuracy and speed than the previous-best Mask R-CNN. These results provide the means to accurately employ autonomous spraying robots. Full article
(This article belongs to the Topic Machine and Deep Learning)
(This article belongs to the Section Agricultural Science and Technology)
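The SE recalibration step mentioned above follows the standard squeeze-and-excitation pattern, which can be sketched as follows (the reduction ratio is illustrative):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block, as used to recalibrate the
    channel weights of a feature map (reduction ratio is illustrative)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                    # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))               # squeeze: global average pool
        w = self.fc(s).unsqueeze(-1).unsqueeze(-1)
        return x * w                         # excite: reweight channels
```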

14 pages, 483 KiB  
Article
An Efficient Hidden Markov Model with Periodic Recurrent Neural Network Observer for Music Beat Tracking
by Guangxiao Song and Zhijie Wang
Electronics 2022, 11(24), 4186; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11244186 - 14 Dec 2022
Cited by 4 | Viewed by 1574
Abstract
In music information retrieval (MIR), beat tracking is one of the most fundamental tasks. To extract this critical component from rhythmic music signals, a previous beat tracking system combined a hidden Markov model (HMM) with a recurrent neural network (RNN) observer. Although the frequency of a music beat is quite stable, existing HMM-based methods do not take this feature into account; accordingly, most of the hidden states in these methods are redundant, which hurts time efficiency. In this paper, we propose an efficient HMM that uses fewer hidden states by exploiting the frequency content of the neural network’s observations via the Fourier transform, which greatly reduces the computational complexity. Observers used in previous works, such as the bi-directional recurrent neural network (Bi-RNN) and the temporal convolutional network (TCN), cannot perceive the frequency of the music beat. To obtain more reliable frequencies from music, a periodic recurrent neural network (PRNN) based on an attention mechanism is proposed as well and used as the observer in the HMM. Experimental results on open-source music datasets, such as GTZAN, Hainsworth, SMC, and Ballroom, show that our efficient HMM with the PRNN is competitive with state-of-the-art methods and has lower computational cost. Full article
(This article belongs to the Topic Machine and Deep Learning)
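One simplified reading of the state-reduction idea: take the Fourier transform of the observer's beat-activation curve and keep only tempi near its dominant frequencies, so the HMM needs states only around those candidates. The function below is a generic sketch; the frequency band and `fps` are assumptions:

```python
import numpy as np

def dominant_tempi(activation, fps=100, top=3):
    """Estimate candidate beat frequencies from a beat-activation curve
    with the FFT, so an HMM only needs states near these tempi (a
    simplified reading of the state-reduction idea)."""
    spectrum = np.abs(np.fft.rfft(activation - activation.mean()))
    freqs = np.fft.rfftfreq(len(activation), d=1.0 / fps)  # in Hz
    valid = (freqs > 0.5) & (freqs < 5.0)   # roughly 30-300 BPM
    idx = np.argsort(-spectrum[valid])[:top]
    return freqs[valid][idx] * 60.0          # beats per minute
```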

21 pages, 5165 KiB  
Article
Remaining Useful Life Prediction Using Dual-Channel LSTM with Time Feature and Its Difference
by Cheng Peng, Jiaqi Wu, Qilong Wang, Weihua Gui and Zhaohui Tang
Entropy 2022, 24(12), 1818; https://0-doi-org.brum.beds.ac.uk/10.3390/e24121818 - 13 Dec 2022
Cited by 10 | Viewed by 2051
Abstract
At present, research on the prediction of the remaining useful life (RUL) of machinery mainly focuses on multi-sensor feature extraction, using the extracted features to predict RUL. In complex operations and abnormal environments, the impact of noise may result in increased model complexity and decreased accuracy of RUL predictions. How to exploit the temporal characteristics of the sensors is also an open problem. To overcome these issues, this paper proposes a dual-channel long short-term memory (LSTM) neural network model. Compared with existing methods, the advantage of this method is that it adaptively selects the time feature, applies first-order differencing to the time feature values, and uses LSTM to extract both the time feature and the first-order time feature information. As the RUL curve predicted by the neural network is jagged, we designed a momentum-smoothing module to smooth the predicted RUL curve and improve the prediction accuracy. Experimental verification on the commercial modular aero-propulsion system simulation (C-MAPSS) dataset proves the effectiveness and stability of the proposed method. Full article
(This article belongs to the Topic Machine and Deep Learning)
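The momentum-smoothing module is not specified in detail in the abstract, but a plain exponential moving average over the jagged RUL predictions illustrates the idea; `beta` is an assumed hyperparameter:

```python
import numpy as np

def momentum_smooth(rul_pred, beta=0.9):
    """Smooth a jagged predicted RUL curve with an exponential moving
    average ("momentum"); the paper's momentum-smoothing module may use
    a different rule."""
    out = np.empty_like(rul_pred, dtype=float)
    out[0] = rul_pred[0]
    for t in range(1, len(rul_pred)):
        out[t] = beta * out[t - 1] + (1 - beta) * rul_pred[t]
    return out
```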

13 pages, 543 KiB  
Article
Convolution Based Graph Representation Learning from the Perspective of High Order Node Similarities
by Xing Li, Qingsong Li, Wei Wei and Zhiming Zheng
Mathematics 2022, 10(23), 4586; https://0-doi-org.brum.beds.ac.uk/10.3390/math10234586 - 03 Dec 2022
Viewed by 1023
Abstract
Nowadays, graph representation learning methods, in particular graph neural network methods, have attracted great attention and performed well in many downstream tasks. However, most graph neural network methods adopt a single perspective, since they start from the edges (or adjacency matrix) of a graph and ignore its mesoscopic structure (high-order local structure). In this paper, we introduce HS-GCN (High-order Node Similarity Graph Convolutional Network), which can mine the potential structural features of graphs from different perspectives by combining multiple high-order node similarity methods. We analyze HS-GCN theoretically and show that it is a generalization of convolution-based graph neural network methods from different normalization perspectives. A series of experiments show that, by combining high-order node similarities, our method can capture and utilize the high-order structural information of the graph more effectively, leading to better performance. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 6928 KiB  
Article
Image Fundus Classification System for Diabetic Retinopathy Stage Detection Using Hybrid CNN-DELM
by Dian Candra Rini Novitasari, Fatmawati Fatmawati, Rimuljo Hendradi, Hetty Rohayani, Rinda Nariswari, Arnita Arnita, Moch Irfan Hadi, Rizal Amegia Saputra and Ardhin Primadewi
Big Data Cogn. Comput. 2022, 6(4), 146; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6040146 - 01 Dec 2022
Cited by 7 | Viewed by 2061
Abstract
Diabetic retinopathy (DR) is the leading cause of blindness among working-age adults. The increase in the population diagnosed with DR can be prevented by screening and early treatment of eye damage. This screening process can be conducted by utilizing deep learning techniques. In this study, the detection of DR severity was carried out using the hybrid CNN-DELM method (CDELM). The CNN architectures used were ResNet-18, ResNet-50, ResNet-101, GoogLeNet, and DenseNet, and the learned features were further classified using the DELM algorithm. The comparison of CNN architectures aimed to find the best architecture for fundus image feature extraction. This research also compared the effect of the kernel function on the performance of DELM in fundus image classification. All experiments using CDELM showed strong results, with an accuracy of 100% on the DRIVE data and the two-class MESSIDOR data, while the best accuracy obtained on the four-class MESSIDOR data reached 98.20%. The advantage of the DELM method compared to the conventional CNN method is that the training time is much shorter: CNN takes an average of 30 min for training, while the CDELM method takes only an average of 2.5 min. Based on accuracy and training time, the CDELM method performed better than the conventional CNN method. Full article
(This article belongs to the Topic Machine and Deep Learning)
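The closed-form training that underlies the reported speedup can be illustrated with a generic ELM-style head on frozen CNN features: a random hidden layer plus a pseudoinverse solve. This is a plain ELM sketch, not the paper's exact DELM variant:

```python
import numpy as np

def train_elm(features, labels, hidden=1000, seed=0):
    """ELM-style head on frozen CNN features: random hidden layer,
    closed-form output weights via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(features.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = 1.0 / (1.0 + np.exp(-(features @ W + b)))   # sigmoid activations
    Y = np.eye(labels.max() + 1)[labels]            # one-hot targets
    beta = np.linalg.pinv(H) @ Y                    # closed-form solve
    return W, b, beta

def elm_predict(features, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(features @ W + b)))
    return (H @ beta).argmax(axis=1)
```

Because the output weights are solved in closed form rather than by gradient descent, training time drops from minutes of backpropagation to a single linear-algebra step, which is consistent with the speedup reported above.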

23 pages, 513 KiB  
Article
A Double Penalty Model for Ensemble Learning
by Wenjia Wang and Yi-Hui Zhou
Mathematics 2022, 10(23), 4532; https://0-doi-org.brum.beds.ac.uk/10.3390/math10234532 - 30 Nov 2022
Viewed by 1156
Abstract
Modern statistical learning techniques often include learning ensembles, for which the combination of multiple separate prediction procedures (ensemble components) can improve prediction accuracy. Although ensemble approaches are widely used, work remains to improve our understanding of their theoretical underpinnings, in aspects such as the identifiability and relative convergence rates of the ensemble components. By considering ensemble learning for two ensemble components as a double penalty model, we provide a framework to better understand the relative convergence and identifiability of the two components. In addition, under appropriate conditions, the framework provides convergence guarantees for a form of residual stacking that iterates between the two components as a cyclic coordinate ascent procedure. We conduct numerical experiments on three synthetic simulations and two real-world datasets to illustrate the performance of our approach and justify our theory. Full article
(This article belongs to the Topic Machine and Deep Learning)
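The residual-stacking procedure with two ensemble components can be sketched as a cyclic refit on each other's residuals; the component choices (`Ridge`, `RandomForestRegressor`) and the iteration count are illustrative:

```python
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

def residual_stack(X, y, iters=5):
    """Cyclic coordinate-ascent residual stacking with two ensemble
    components: each component is refit on the residuals left by the
    other, iterating until (approximate) convergence."""
    linear = Ridge(alpha=1.0)
    forest = RandomForestRegressor(n_estimators=100)
    f_lin = 0.0
    for _ in range(iters):
        forest.fit(X, y - f_lin)        # fit forest on linear residuals
        f_for = forest.predict(X)
        linear.fit(X, y - f_for)        # fit linear part on forest residuals
        f_lin = linear.predict(X)
    return linear, forest
```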

26 pages, 10480 KiB  
Article
Joint Deep Reinforcement Learning and Unsupervised Learning for Channel Selection and Power Control in D2D Networks
by Ming Sun, Yanhui Jin, Shumei Wang and Erzhuang Mei
Entropy 2022, 24(12), 1722; https://0-doi-org.brum.beds.ac.uk/10.3390/e24121722 - 24 Nov 2022
Cited by 2 | Viewed by 1582
Abstract
Device-to-device (D2D) technology enables direct communication between devices, which can effectively solve the problem of insufficient spectrum resources in 5G communication technology. Since channels are shared among multiple D2D user pairs, serious interference may arise between them. In order to reduce interference, effectively increase network capacity, and improve wireless spectrum utilization, this paper proposes a distributed resource allocation algorithm that combines a deep Q-network (DQN) with an unsupervised learning network. First, a DQN algorithm is constructed to solve channel allocation in a distributed manner in a dynamic, unknown environment. Then, a deep power control neural network with an unsupervised learning strategy is constructed to output an optimized channel power control scheme that maximizes the spectrum transmit sum-rate through the corresponding constraint processing. As opposed to traditional centralized approaches that require the collection of instantaneous global network information, the proposed algorithm uses each transmitter as a learning agent that makes channel selection and power control decisions from a small amount of locally collected state information. The simulation results show that the proposed algorithm is more effective in increasing the convergence speed and maximizing the transmit sum-rate than other traditional centralized and distributed algorithms. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 622 KiB  
Article
Taxonomy-Aware Prototypical Network for Few-Shot Relation Extraction
by Mengru Wang, Jianming Zheng and Honghui Chen
Mathematics 2022, 10(22), 4378; https://0-doi-org.brum.beds.ac.uk/10.3390/math10224378 - 21 Nov 2022
Cited by 1 | Viewed by 1260
Abstract
Relation extraction aims to predict the relation triple formed by the head entity and tail entity in a given text. A large body of work adopts meta-learning to address the few-shot issue faced by relation extraction, where each relation category contains only a few labeled instances for demonstration. Despite the promising results achieved by existing meta-learning methods, these methods still struggle to distinguish the subtle differences between relations with similar expressions. We argue this is largely because these methods cannot capture unbiased and discriminative features in the very-few-shot scenario. To alleviate the above problems, we propose a taxonomy-aware prototype network, which consists of a category-aware calibration module and a task-aware training strategy module. The former implicitly and explicitly calibrates the prototype representation to be sufficiently unbiased and discriminative. The latter balances the weight between easy and hard instances, which enables our proposal to focus on data with more information during the training stage. Finally, comprehensive experiments are conducted on four typical meta tasks. Our proposal shows superiority over competitive baselines, with an improvement of 3.30% in average accuracy. Full article
(This article belongs to the Topic Machine and Deep Learning)
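For reference, the vanilla prototypical-network step that the proposed calibration modules build on computes class prototypes as mean support embeddings and scores queries by negative distance; the calibration itself is omitted in this sketch:

```python
import torch

def prototype_logits(support, support_labels, query, n_way):
    """Vanilla prototypical-network step: class prototypes are mean
    support embeddings; queries are scored by negative Euclidean
    distance to each prototype."""
    protos = torch.stack([support[support_labels == c].mean(0)
                          for c in range(n_way)])      # (n_way, dim)
    return -torch.cdist(query, protos)                 # (n_query, n_way)
```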

20 pages, 2833 KiB  
Article
A Novel Drinking Category Detection Method Based on Wireless Signals and Artificial Neural Network
by Jie Zhang, Zhongmin Wang, Kexin Zhou and Ruohan Bai
Entropy 2022, 24(11), 1700; https://0-doi-org.brum.beds.ac.uk/10.3390/e24111700 - 21 Nov 2022
Viewed by 1531
Abstract
With the continuous improvement of people’s health awareness and the continuous progress of scientific research, consumers have higher requirements for the quality of what they drink. Compared with high-sugar concentrated juice, consumers are more willing to accept healthy and original Not From Concentrate (NFC) juice and packaged drinking water. At the same time, drink category detection can be used for vending machine self-checkout. However, current drink category detection systems rely on special equipment that requires professional operation, as well as on signals that are not widely deployed, such as radar. This paper introduces a novel drink category detection method based on wireless signals and an artificial neural network (ANN). Unlike past work, our design relies on the WiFi signals that are already widely used in daily life. The intuition is that when wireless signals propagate through the detected target, they arrive at the receiver through multiple paths, and different drink categories result in distinct multipath propagation, which can be leveraged to detect the drink category. We capture the WiFi signals around the detected drink using wireless devices; then, we calculate the channel state information (CSI), perform noise removal and feature extraction, and apply an ANN for drink category detection. The results demonstrate that our design achieves high accuracy in detecting the drink category. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 1073 KiB  
Article
Adaptive Dynamic Search for Multi-Task Learning
by Eunwoo Kim
Appl. Sci. 2022, 12(22), 11836; https://0-doi-org.brum.beds.ac.uk/10.3390/app122211836 - 21 Nov 2022
Viewed by 1302
Abstract
Multi-task learning (MTL) is a learning strategy for solving multiple tasks simultaneously while exploiting commonalities and differences between tasks for improved learning efficiency and prediction performance. Despite its potential, several major challenges remain to be addressed. First of all, task performance degrades when the number of tasks to solve increases or the tasks are less related. In addition, finding the prediction model for each task is typically laborious and can be suboptimal, and the need to manually design architectures further aggravates the problem when solving multiple tasks under different computational budgets. In this work, we propose a novel MTL approach to address these issues. The proposed method learns to search in a finely modularized base network dynamically and to discover an optimal prediction model for each instance of a task on the fly, while taking the computational costs of the discovered models into account. We evaluate our learning framework on a diverse set of MTL scenarios comprising standard benchmark datasets and achieve significant improvements in performance for all tested cases compared with existing MTL alternatives. Full article
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 2567 KiB  
Article
Video Action Recognition Using Motion and Multi-View Excitation with Temporal Aggregation
by Yuri Yudhaswana Joefrie and Masaki Aono
Entropy 2022, 24(11), 1663; https://0-doi-org.brum.beds.ac.uk/10.3390/e24111663 - 15 Nov 2022
Cited by 1 | Viewed by 1634
Abstract
Spatiotemporal and motion feature representations are the key to video action recognition. Typical previous approaches utilize 3D CNNs to cope with both spatial and temporal features, but they suffer from huge computational costs. Other approaches utilize (1+2)D CNNs to learn spatial and temporal features efficiently, but they neglect the importance of motion representations. To overcome the problems of previous approaches, we propose a novel block that captures spatial and temporal features more faithfully and learns motion features efficiently. The proposed block includes Motion Excitation (ME), Multi-view Excitation (MvE), and Densely Connected Temporal Aggregation (DCTA). The purpose of ME is to encode feature-level frame differences; MvE is designed to enrich spatiotemporal features with multiple view representations adaptively; and DCTA models long-range temporal dependencies. We inject the proposed building block, which we refer to as the META block (or simply “META”), into 2D ResNet-50. Through extensive experiments, we demonstrate that our proposed architecture outperforms previous CNN-based methods in terms of the “Val Top-1 %” measure on the Something-Something v1 and Jester datasets, while META yields competitive results on the Moments-in-Time Mini dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)
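The ME component, encoding feature-level frame differences, can be sketched roughly as follows; the depthwise convolution and the zero-padding of the last step are assumptions about the exact design:

```python
import torch
import torch.nn as nn

class MotionExcitation(nn.Module):
    """Feature-level frame difference in the spirit of the ME part of the
    META block: the (t+1)-th feature map, after a channel-wise conv,
    minus the t-th. Dimensions are illustrative."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x):                     # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        nxt = self.conv(x[:, 1:].reshape(-1, c, h, w)).reshape(b, t - 1, c, h, w)
        diff = nxt - x[:, :-1]                # feature-level frame difference
        return torch.cat([diff, torch.zeros_like(x[:, :1])], dim=1)  # pad to T
```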

13 pages, 2145 KiB  
Article
Feature-Enhanced Document-Level Relation Extraction in Threat Intelligence with Knowledge Distillation
by Yongfei Li, Yuanbo Guo, Chen Fang, Yongjin Hu, Yingze Liu and Qingli Chen
Electronics 2022, 11(22), 3715; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11223715 - 13 Nov 2022
Viewed by 1232
Abstract
Relation extraction in the threat intelligence domain plays an important role in mining the internal associations between crucial threat elements and constructing a knowledge graph (KG). This study designs a novel document-level relation extraction model, FEDRE-KD, integrating additional features to take full advantage of the information in documents. The study also introduces a teacher–student model, realizing knowledge distillation, to further improve performance. Additionally, a threat intelligence ontology was constructed to standardize the entities and their relationships. To address the lack of publicly available datasets for threat intelligence, manual annotation was carried out on documents collected from social blogs, vendor bulletins, and hacking forums. After training the model, we constructed a threat intelligence knowledge graph in Neo4j. Experimental results indicate the effectiveness of the additional features and knowledge distillation: compared with the mainstream models SSAN, GAIN, and ATLOP, FEDRE-KD improves the F1-score by 22.07, 20.06, and 22.38, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
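The teacher–student distillation step can be illustrated with the standard soft-target objective; the temperature `T`, the blend weight `alpha`, and the exact weighting used by FEDRE-KD are assumptions:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard teacher-student distillation objective: soft targets from
    the teacher (temperature T) blended with the hard-label loss."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```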

14 pages, 3710 KiB  
Article
Prediction of Prospecting Target Based on ResNet Convolutional Neural Network
by Le Gao, Yongjie Huang, Xin Zhang, Qiyuan Liu and Zequn Chen
Appl. Sci. 2022, 12(22), 11433; https://0-doi-org.brum.beds.ac.uk/10.3390/app122211433 - 11 Nov 2022
Cited by 8 | Viewed by 1659
Abstract
In recent years, as geological prospecting has shifted from shallow ore to deep and hidden ore, prospecting has become increasingly difficult, so the application of computer technology and new methods to geological and mineral exploration has received more and more attention. The mining and prediction of geological prospecting information based on deep learning have become a frontier field of earth science. However, deep learning still faces many unsolved problems in the big data mining and prediction of geological prospecting, such as the small number of training samples of geological and mineral images, the difficulty of building deep learning network models, and the general applicability of deep learning models. In this paper, training samples and convolutional neural network models suitable for geochemical element data mining are constructed to solve these problems, and the model is successfully applied to the prediction of gold, silver, lead and zinc polymetallic metallogenic areas in South China. Taking the Pangxidong research area in the west of Guangdong Province as an example, this paper carries out prospecting target prediction based on original data from a 1:50,000 stream sediment survey. First, a support vector machine (SVM) model and statistical methods were used to determine the ore-related geochemical element assemblage. Second, the experimental geochemical element data were augmented and a dataset was established. Finally, the ResNet-50 neural network model was used for training and prediction. The experimental results show that the areas numbered 9, 29, 38, 40, 95, 111, 114, 124 and 144 have great metallogenic potential, and this method is a promising tool for metallogenic prediction. Applying the ResNet-50 neural network to metallogenic prediction provides a new approach for the future exploration of mineral resources. To verify the generality of the research method, we conducted experimental tests on the geochemical dataset of area B, another deposit research area in South China. The results show that 100% of the prediction area obtained using the proposed method covers the known ore deposit area, and the model also provides methodological support for further delineating the prospecting target area in study area B. Full article
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 6014 KiB  
Case Report
Comparative Study of Mortality Rate Prediction Using Data-Driven Recurrent Neural Networks and the Lee–Carter Model
by Yuan Chen and Abdul Q. M. Khaliq
Big Data Cogn. Comput. 2022, 6(4), 134; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6040134 - 10 Nov 2022
Cited by 4 | Viewed by 2311
Abstract
The Lee–Carter model is one of the most important stochastic mortality prediction models. With the recent developments in machine learning and deep learning, many studies have applied deep learning approaches to time series mortality rate prediction, but most of them focus only on comparisons between Long Short-Term Memory networks and traditional models. In this study, three different recurrent neural networks, Long Short-Term Memory, Bidirectional Long Short-Term Memory, and the Gated Recurrent Unit, are proposed for the task of mortality rate prediction. Departing from the standard country-level mortality rate comparison, this study compares the three deep learning models and the classic Lee–Carter model on nine divisions’ yearly mortality data by gender from 1966 to 2015 in the United States. In out-of-sample testing, we found that the Gated Recurrent Unit model showed better average MAE and RMSE values than the Lee–Carter model on 72.2% (13/18) and 66.7% (12/18) of the datasets, respectively, while the same measures for the Long Short-Term Memory model and the Bidirectional Long Short-Term Memory model are 50%/38.9% (MAE/RMSE) and 61.1%/61.1% (MAE/RMSE), respectively. Considering forecasting accuracy, computing expense, and interpretability together, the Lee–Carter model with ARIMA exhibits the best overall performance, but the recurrent neural networks could also be good candidates for mortality forecasting for divisions of the United States. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 2137 KiB  
Article
A Transfer Learning for Line-Based Portrait Sketch
by Hyungbum Kim, Junyoung Oh and Heekyung Yang
Mathematics 2022, 10(20), 3869; https://0-doi-org.brum.beds.ac.uk/10.3390/math10203869 - 18 Oct 2022
Cited by 2 | Viewed by 2746
Abstract
This paper presents a transfer learning-based framework that produces line-based portrait sketch images from portraits. The proposed framework produces sketch images using a GAN architecture trained on a pseudo-sketch image dataset. The pseudo-sketch image dataset is constructed from a single artist-created portrait sketch using a style transfer model with a series of postprocessing schemes. The proposed framework successfully produces portrait sketch images for portraits with various poses, expressions and illuminations, and its effectiveness is demonstrated by comparing the produced results with those of existing works. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 8722 KiB  
Article
A GAN-Based Face Rotation for Artistic Portraits
by Handong Kim, Junho Kim and Heekyung Yang
Mathematics 2022, 10(20), 3860; https://0-doi-org.brum.beds.ac.uk/10.3390/math10203860 - 18 Oct 2022
Cited by 1 | Viewed by 4694
Abstract
We present a GAN-based model that rotates the faces in artistic portraits to various angles. We build a dataset of artistic portraits for training our GAN-based model by applying a 3D face model to the artistic portraits. We also devise appropriate loss functions that preserve the styles of the artistic portraits while rotating the faces in the portraits to the desired angles. These approaches enable us to construct a GAN-based face rotation model. We apply this model to various artistic portraits, including photorealistic oil-paint portraits, watercolor portraits, well-known portrait artworks and banknote portraits, and produce convincing rotated faces. Finally, we show that our model produces improved results compared with existing models by evaluating the similarity and the angles of the rotated faces through evaluation schemes including FID estimation, recognition ratio estimation, pose estimation and a user study. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 3464 KiB  
Article
Hydrogen Storage Prediction in Dibenzyltoluene as Liquid Organic Hydrogen Carrier Empowered with Weighted Federated Machine Learning
by Ahsan Ali, Muhammad Adnan Khan and Hoimyung Choi
Mathematics 2022, 10(20), 3846; https://0-doi-org.brum.beds.ac.uk/10.3390/math10203846 - 17 Oct 2022
Cited by 7 | Viewed by 1783
Abstract
Hydrogen storage in liquid organic hydrogen carriers (LOHCs) offers a safe and convenient hydrogen storage system. Dibenzyltoluene (DBT), due to its low flammability, liquid nature and high hydrogen storage capacity, is an efficient LOHC system. It is imperative to identify the optimal reaction conditions for achieving the theoretical hydrogen storage density. Hence, a Hydrogen Storage Prediction System empowered with Weighted Federated Machine Learning (HSPS-WFML) is proposed in this study. The dataset was divided into three classes, i.e., low, medium and high, and the performance of the proposed HSPS-WFML was investigated. The accuracy of the medium class (99.90%) is higher than that of the other classes, while the accuracy of the low and high classes is 96.50% and 96.40%, respectively. Moreover, the overall accuracy and miss rate of the proposed HSPS-WFML are 96.40% and 3.60%, respectively. Our proposed model is compared with existing studies related to hydrogen storage prediction, and its accuracy is found to be in agreement with these studies. Therefore, the proposed HSPS-WFML is an efficient model for hydrogen storage prediction. Full article
(This article belongs to the Topic Machine and Deep Learning)
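The weighted federated aggregation at the heart of WFML can be illustrated with one round of size-weighted averaging of client parameters; the exact weighting scheme of HSPS-WFML is an assumption here:

```python
import numpy as np

def weighted_fed_avg(client_weights, client_sizes):
    """One round of weighted federated averaging: each client's model
    parameters (a list of arrays per client) contribute in proportion
    to that client's local data size."""
    total = float(sum(client_sizes))
    return [np.sum([w[i] * (n / total)
                    for w, n in zip(client_weights, client_sizes)], axis=0)
            for i in range(len(client_weights[0]))]
```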

29 pages, 3781 KiB  
Article
Reservoir Prediction Model via the Fusion of Optimized Long Short-Term Memory Network (LSTM) and Bidirectional Random Vector Functional Link (RVFL)
by Guodong Li, Yongke Pan and Pu Lan
Electronics 2022, 11(20), 3343; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11203343 - 17 Oct 2022
Viewed by 1098
Abstract
An accurate and stable reservoir prediction model is essential for oil location and production. We propose a predictive hybrid model, ILSTM-BRVFL, based on an improved long short-term memory network (IAOS-LSTM) and a bidirectional random vector functional link (Bidirectional-RVFL) for this problem. First, the Atomic Orbit Search algorithm (AOS) is used to optimize the parameters collectively, improving the stability and accuracy of the LSTM model for high-dimensional feature extraction; since there is still room to improve the optimization capability of the AOS, an improvement scheme is proposed to enhance it further. Then, the LSTM-extracted high-dimensional features are fed into the random vector functional link (RVFL) network, which is modified into a bidirectional RVFL to improve the prediction of these features. In the experiments, the proposed ILSTM-BRVFL (IAOS) model achieves an average prediction accuracy of 95.28%, and its accuracy, recall and F1 values also show good performance, with the prediction ability achieving the expected results. The comparative analysis of the improvements in the model results shows that the high-dimensional feature extraction of the input data by the LSTM contributes most to the gain in prediction accuracy, followed by the double-ended IAOS mechanism for the parameter search of the LSTM and RVFL. Full article
(This article belongs to the Topic Machine and Deep Learning)
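For readers unfamiliar with the RVFL readout that the LSTM features are fed into, the following is a bare-bones, single-direction RVFL in Python: a fixed random hidden expansion plus a direct input link, with output weights solved in closed form by ridge regression. The bidirectional variant and the AOS parameter search from the paper are omitted; shapes, seeds, and hyperparameters are illustrative assumptions.

```python
# Plain (single-direction) random vector functional link sketch.
import numpy as np

rng = np.random.default_rng(0)

def rvfl_fit(X, y, n_hidden=64, ridge=1e-3):
    # Random, untrained hidden expansion.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Direct link: concatenate raw inputs with hidden features.
    D = np.hstack([X, H])
    # Closed-form ridge regression for the output weights.
    beta = np.linalg.solve(D.T @ D + ridge * np.eye(D.shape[1]), D.T @ y)
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    D = np.hstack([X, np.tanh(X @ W + b)])
    return D @ beta
```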

16 pages, 1297 KiB  
Article
Enhanced Sample Self-Revised Network for Cross-Dataset Facial Expression Recognition
by Xiaolin Xu, Yuan Zong, Cheng Lu and Xingxun Jiang
Entropy 2022, 24(10), 1475; https://0-doi-org.brum.beds.ac.uk/10.3390/e24101475 - 17 Oct 2022
Cited by 1 | Viewed by 1392
Abstract
Recently, cross-dataset facial expression recognition (FER) has obtained wide attention from researchers. Thanks to the emergence of large-scale facial expression datasets, cross-dataset FER has made great progress. Nevertheless, facial images in large-scale datasets with low quality, subjective annotation, severe occlusion, and rare subject identities can lead to outlier samples in facial expression datasets. These outlier samples are usually far from the clustering center of the dataset in the feature space, resulting in considerable differences in feature distribution that severely restrict the performance of most cross-dataset FER methods. To eliminate the influence of outlier samples on cross-dataset FER, we propose the enhanced sample self-revised network (ESSRN) with a novel outlier-handling mechanism, whose aim is first to seek out these outlier samples and then suppress them when dealing with cross-dataset FER. To evaluate the proposed ESSRN, we conduct extensive cross-dataset experiments across the RAF-DB, JAFFE, CK+, and FER2013 datasets. Experimental results demonstrate that the proposed outlier-handling mechanism effectively reduces the negative impact of outlier samples on cross-dataset FER and that our ESSRN outperforms classic deep unsupervised domain adaptation (UDA) methods and recent state-of-the-art cross-dataset FER results. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 915 KiB  
Article
Lipreading Using Liquid State Machine with STDP-Tuning
by Xuhu Yu, Zhong Wan, Zehao Shi and Lei Wang
Appl. Sci. 2022, 12(20), 10484; https://0-doi-org.brum.beds.ac.uk/10.3390/app122010484 - 17 Oct 2022
Cited by 1 | Viewed by 1622
Abstract
Lipreading refers to the task of decoding the text content of a speaker based on visual information about the movement of the speaker’s lips. With the development of deep learning in recent years, lipreading has attracted extensive research. However, deep learning methods require substantial computing resources, which is not conducive to migrating the system to edge devices. Inspired by the work of Spiking Neural Networks (SNNs) in recognizing human actions and gestures, we propose a lipreading system based on SNNs. Specifically, we construct the front-end feature extractor of the system using a Liquid State Machine (LSM), and a heuristic algorithm is used to select appropriate parameters for the back-end classifier. On small-scale lipreading datasets, our system achieves good recognition accuracy. Compared to other networks, it performs better in terms of accuracy relative to the number of learned parameters and has clear advantages in network complexity and training cost. On the AVLetters dataset, our model achieves a 5% improvement in accuracy over traditional methods and a 90% reduction in parameters over the state-of-the-art. Full article
(This article belongs to the Topic Machine and Deep Learning)
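The STDP tuning in the title follows the classic pair-based rule: a synapse is strengthened when a presynaptic spike precedes the postsynaptic one and weakened otherwise, with an exponential dependence on the spike-time difference. The toy Python sketch below implements only that textbook rule; the paper's actual time constants and learning rates are not given here, so these values are assumptions.

```python
# Pair-based STDP weight update (illustrative constants, not the paper's settings).
import numpy as np

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:    # pre fires before post -> potentiation
        return a_plus * np.exp(-dt / tau)
    elif dt < 0:  # post fires before pre -> depression
        return -a_minus * np.exp(dt / tau)
    return 0.0

print(stdp_dw(10.0, 15.0))  # small positive change: causal pairing
```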

24 pages, 5704 KiB  
Article
PN-BBN: A Petri Net-Based Bayesian Network for Anomalous Behavior Detection
by Ke Lu, Xianwen Fang and Na Fang
Mathematics 2022, 10(20), 3790; https://0-doi-org.brum.beds.ac.uk/10.3390/math10203790 - 14 Oct 2022
Cited by 3 | Viewed by 1343
Abstract
Business process anomalous behavior detection reveals unexpected cases from event logs to ensure the trusted operation of information systems. Anomalous behavior is mainly identified through log-to-model alignment analysis or numerical outlier detection. However, both approaches ignore the influence of probability distributions and activity relationships in process activities. To address this concern, this paper incorporates the behavioral relationships characterized by the process model and the joint probability distribution of nodes related to suspected anomalous behaviors. Moreover, a Petri Net-Based Bayesian Network (PN-BBN) is proposed to detect anomalous behaviors based on the probabilistic inference of behavioral contexts. First, the process model is filtered based on the process structure of the process activities to identify the key regions where the suspected anomalous behaviors are located. Then, the behavioral profile of the activities is used to prune the model and locate the unavoidable paths that trigger these activities. Further, the model is used as the architecture for parameter learning to construct the PN-BBN. Based on this, anomaly scores are inferred from the joint probabilities of activities related to suspected anomalous behaviors for anomaly detection under the constraints of control flow and probability distributions. Finally, PN-BBN is implemented based on the open-source frameworks PM4Py and pgmpy and evaluated on multiple metrics with synthetic and real process data. The experimental results demonstrate that PN-BBN effectively identifies anomalous process behaviors and improves the reliability of information systems. Full article
(This article belongs to the Topic Machine and Deep Learning)
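To make the inference step concrete, the sketch below scores a pair of activities by their joint probability with pgmpy, the Bayesian network library named above; a low joint probability indicates a suspected anomaly. The two-node structure and all CPD values are invented for illustration and are not the paper's learned model (note that recent pgmpy releases rename `BayesianNetwork` to `DiscreteBayesianNetwork`).

```python
# Toy anomaly scoring with a two-activity Bayesian network (illustrative only).
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("A", "B")])  # activity A enables activity B
model.add_cpds(
    TabularCPD("A", 2, [[0.9], [0.1]]),                 # P(A)
    TabularCPD("B", 2, [[0.95, 0.3], [0.05, 0.7]],
               evidence=["A"], evidence_card=[2]),      # P(B | A)
)
infer = VariableElimination(model)
p_b_given_a = infer.query(["B"], evidence={"A": 1}).values[1]
joint = 0.1 * p_b_given_a            # P(A=1, B=1) = P(A=1) * P(B=1 | A=1)
print("anomaly score:", 1 - joint)   # rare joint behavior -> high score
```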

17 pages, 4844 KiB  
Article
MobileNetV2 Combined with Fast Spectral Kurtosis Analysis for Bearing Fault Diagnosis
by Tian Xue, Huaiguang Wang and Dinghai Wu
Electronics 2022, 11(19), 3176; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11193176 - 03 Oct 2022
Cited by 3 | Viewed by 1535
Abstract
Bearings are an important component in mechanical equipment, and their health monitoring and fault diagnosis are of great significance. To meet the speed and recognition accuracy requirements of bearing fault diagnosis, this paper uses the lightweight MobileNetV2 network combined with fast spectral kurtosis. On the basis of the original MobileNetV2 network, a progressive classifier compresses the feature information layer by layer along the network structure to achieve high-precision and rapid identification and classification, and a cross-local connection structure is added to the network to increase the extracted feature information and improve accuracy. At the same time, the original fault signal of a bearing is a one-dimensional vibration signal containing a large amount of non-Gaussian noise and incidental shocks. To extract fault features more efficiently, this paper uses the fast spectral kurtosis algorithm to process the signal, extract the center frequency of the original signal, and calculate the spectral kurtosis value. The kurtosis map generated by this preprocessing is used as the input of the MobileNetV2 network for fault classification. To verify the effectiveness and generality of the proposed method, experiments were conducted on the XJTU-SY bearing fault dataset and the CWRU bearing dataset. Through data preprocessing, such as data augmentation for the different fault types in the original datasets, input data meeting the experimental requirements were generated and fault diagnosis experiments were carried out. Comparisons with other typical classification networks show that the proposed method has significant advantages in accuracy, model size, and training speed, demonstrating the effectiveness and generality of the proposed network model in the field of fault diagnosis. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 5104 KiB  
Article
Automatic Medical Face Mask Detection Based on Cross-Stage Partial Network to Combat COVID-19
by Christine Dewi and Rung-Ching Chen
Big Data Cogn. Comput. 2022, 6(4), 106; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6040106 - 30 Sep 2022
Cited by 9 | Viewed by 1985
Abstract
According to the World Health Organization (WHO), the COVID-19 coronavirus pandemic has resulted in a worldwide public health crisis. One effective method of protection is to wear a mask in public places. Recent advances in object detection based on deep learning models have yielded promising results in finding objects in images. The aim of this paper is to annotate and detect medical face mask objects in real-life images. While in public places, people can be protected from transmitting COVID-19 to one another by wearing medical masks made of medical materials. Our work employs Yolo V4 CSP SPP to identify medical masks. Our experiments combined the Face Mask Dataset (FMD) and the Medical Mask Dataset (MMD) into one dataset for investigation in this study. The proposed model improves the detection performance of a previous research study with the FMD and MMD datasets from 81% to 99.26%. We show that our proposed Yolo V4 CSP SPP model scheme is an accurate mechanism for identifying medically masked faces. For each algorithm, we conduct a comprehensive analysis and provide a detailed description of the benefits of using Cross Stage Partial (CSP) connections and Spatial Pyramid Pooling (SPP). Furthermore, a comparison between our findings and those of similar works is provided. In terms of accuracy and precision, the suggested detector surpasses earlier works. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 1262 KiB  
Article
An Asymmetric Contrastive Loss for Handling Imbalanced Datasets
by Valentino Vito and Lim Yohanes Stefanus
Entropy 2022, 24(9), 1303; https://0-doi-org.brum.beds.ac.uk/10.3390/e24091303 - 15 Sep 2022
Cited by 2 | Viewed by 2024
Abstract
Contrastive learning is a representation learning method performed by contrasting a sample with other similar samples so that they are brought close together, forming clusters in the feature space. The learning process is typically conducted using a two-stage training architecture and utilizes the contrastive loss (CL) for feature learning. Contrastive learning has been shown to be quite successful in handling imbalanced datasets, in which some classes are overrepresented while others are underrepresented. However, previous studies have not specifically modified CL for imbalanced datasets. In this work, we introduce an asymmetric version of CL, referred to as ACL, to directly address the problem of class imbalance. In addition, we propose the asymmetric focal contrastive loss (AFCL) as a further generalization of both ACL and the focal contrastive loss (FCL). Results on the imbalanced FMNIST and ISIC 2018 datasets show that AFCL is capable of outperforming CL and FCL in terms of both weighted and unweighted classification accuracy. Full article
(This article belongs to the Topic Machine and Deep Learning)
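For context, the standard pairwise contrastive loss that ACL and AFCL generalize pulls similar pairs together and pushes dissimilar pairs apart beyond a margin. The PyTorch sketch below shows only that baseline CL; the paper's asymmetric and focal modifications are not reproduced here, and the margin value is an illustrative assumption.

```python
# Classic margin-based pairwise contrastive loss (baseline CL, not ACL/AFCL).
import torch

def contrastive_loss(d, y, margin=1.0):
    """d: Euclidean distances between pairs; y: 1 for similar pairs, 0 otherwise."""
    pos = y * d.pow(2)                                   # pull similar pairs together
    neg = (1 - y) * torch.clamp(margin - d, min=0).pow(2)  # push dissimilar pairs apart
    return 0.5 * (pos + neg).mean()

d = torch.tensor([0.3, 1.5, 0.9])
y = torch.tensor([1.0, 0.0, 0.0])
print(contrastive_loss(d, y))
```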

17 pages, 4251 KiB  
Article
STSM: Spatio-Temporal Shift Module for Efficient Action Recognition
by Zhaoqilin Yang, Gaoyun An and Ruichen Zhang
Mathematics 2022, 10(18), 3290; https://0-doi-org.brum.beds.ac.uk/10.3390/math10183290 - 10 Sep 2022
Cited by 4 | Viewed by 1598
Abstract
The modeling, computational complexity, and accuracy of spatio-temporal models are the three major foci in the field of video action recognition. Traditional 2D convolution has low computational complexity, but it cannot capture temporal relationships. Although 3D convolution can obtain good performance, it has both high computational complexity and a large number of parameters. In this paper, we propose a plug-and-play Spatio-Temporal Shift Module (STSM), which is both effective and high-performing. STSM can be easily inserted into other networks to add or enhance the ability to learn spatio-temporal features, effectively improving performance without increasing the number of parameters or the computational complexity. In particular, when 2D CNNs and STSM are integrated, the new network can learn spatio-temporal features and outperform networks based on 3D convolutions. We revisit the shift operation from the perspective of matrix algebra: the spatio-temporal shift operation is a convolution with a sparse convolution kernel. Furthermore, we extensively evaluate the proposed module on the Kinetics-400 and Something-Something V2 datasets. The experimental results show the effectiveness of the proposed STSM, and the proposed action recognition networks also achieve state-of-the-art results on the two action recognition benchmarks. Full article
(This article belongs to the Topic Machine and Deep Learning)
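The shift idea is easiest to see in code. The sketch below implements a TSM-style temporal shift in PyTorch, moving a fraction of the channels one step forward or backward in time at no cost beyond the copy; STSM, as described, additionally shifts along the spatial dimensions, which is omitted here, and the fold ratio is an assumption.

```python
# TSM-style temporal shift; STSM also shifts spatially (not shown).
import torch

def temporal_shift(x, fold_div=8):
    """x: (N, T, C, H, W). Shift 1/fold_div of the channels forward in time,
    another 1/fold_div backward, and leave the rest unshifted."""
    n, t, c, h, w = x.shape
    fold = c // fold_div  # assumes C is divisible by fold_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # shift backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # untouched channels
    return out

x = torch.randn(2, 8, 16, 7, 7)
print(temporal_shift(x).shape)  # torch.Size([2, 8, 16, 7, 7])
```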

14 pages, 3010 KiB  
Article
Micro-Expression Recognition Using Uncertainty-Aware Magnification-Robust Networks
by Mengting Wei, Yuan Zong, Xingxun Jiang, Cheng Lu and Jiateng Liu
Entropy 2022, 24(9), 1271; https://0-doi-org.brum.beds.ac.uk/10.3390/e24091271 - 09 Sep 2022
Cited by 2 | Viewed by 1486
Abstract
A micro-expression (ME) is a kind of involuntary facial expression that commonly occurs with subtle intensity. Accurately recognizing MEs, a.k.a. micro-expression recognition (MER), has a number of potential applications, e.g., interrogation and clinical diagnosis, and the subject has therefore received a high level of attention from researchers in the affective computing and pattern recognition communities. In this paper, we propose a straightforward and effective deep learning method called uncertainty-aware magnification-robust networks (UAMRN) for MER, which attempts to address two key issues in MER: the low intensity of MEs and the imbalance of ME samples. Specifically, to better distinguish subtle ME movements, we reconstruct a new sequence by magnifying the ME intensity. Furthermore, a sparse self-attention (SSA) block is implemented, which rectifies standard self-attention with locality-sensitive hashing (LSH), suppressing the artefacts generated during magnification. For the class imbalance problem, on the other hand, we guide the network optimization based on the confidence of the estimation, through which samples from rare classes are allotted greater uncertainty and thus trained more carefully. We conducted experiments on three public ME databases, i.e., CASME II, SAMM and SMIC-HS, and the results demonstrate improvement over recent state-of-the-art MER methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 699 KiB  
Article
Adapting Multiple Distributions for Bridging Emotions from Different Speech Corpora
by Yuan Zong, Hailun Lian, Hongli Chang, Cheng Lu and Chuangao Tang
Entropy 2022, 24(9), 1250; https://0-doi-org.brum.beds.ac.uk/10.3390/e24091250 - 05 Sep 2022
Cited by 1 | Viewed by 1136
Abstract
In this paper, we focus on a challenging, but interesting, task in speech emotion recognition (SER), i.e., cross-corpus SER. Unlike conventional SER, a feature distribution mismatch may exist between the labeled source (training) and target (testing) speech samples in cross-corpus SER because they come from different speech emotion corpora, which degrades the performance of most well-performing SER methods. To address this issue, we propose a novel transfer subspace learning method called multiple distribution-adapted regression (MDAR) to bridge the gap between speech samples from different corpora. Specifically, MDAR aims to learn a projection matrix to build the relationship between the source speech features and emotion labels. A novel regularization term called multiple distribution adaption (MDA), consisting of a marginal and two conditional distribution-adapted operations, is designed to collaboratively enable such a discriminative projection matrix to be applicable to the target speech samples, regardless of speech corpus variance. Consequently, by resorting to the learned projection matrix, we are able to predict the emotion labels of target speech samples when only the source label information is given. To evaluate the proposed MDAR method, extensive cross-corpus SER tasks based on three different speech emotion corpora, i.e., EmoDB, eNTERFACE, and CASIA, were designed. Experimental results showed that the proposed MDAR outperformed most recent state-of-the-art transfer subspace learning methods and even performed better than several well-performing deep transfer learning methods in dealing with cross-corpus SER tasks. Full article
(This article belongs to the Topic Machine and Deep Learning)
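A common way to implement the marginal part of a distribution-adaptation regularizer such as MDA is the maximum mean discrepancy (MMD) between source and target features. The numpy sketch below shows the simplest linear-kernel MMD estimate; the paper's full regularizer also contains two conditional terms and operates on the learned projection, none of which is reproduced here, and the feature dimensions are illustrative.

```python
# Linear-kernel MMD: squared distance between source and target feature means.
import numpy as np

def mmd_linear(Xs, Xt):
    """Squared MMD between source and target feature matrices (rows = samples)."""
    return float(np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2))

# Hypothetical acoustic features from two emotion corpora.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (100, 384))
Xt = rng.normal(0.2, 1.0, (120, 384))
print(mmd_linear(Xs, Xt))  # shrinking this aligns the marginal distributions
```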

13 pages, 2734 KiB  
Article
Implicitly Aligning Joint Distributions for Cross-Corpus Speech Emotion Recognition
by Cheng Lu, Yuan Zong, Chuangao Tang, Hailun Lian, Hongli Chang, Jie Zhu, Sunan Li and Yan Zhao
Electronics 2022, 11(17), 2745; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11172745 - 31 Aug 2022
Cited by 2 | Viewed by 1250
Abstract
In this paper, we investigate the problem of cross-corpus speech emotion recognition (SER), in which the training (source) and testing (target) speech samples belong to different corpora. This case thus leads to a feature distribution mismatch between the source and target speech samples. Hence, the performance of most existing SER methods drops sharply. To solve this problem, we propose a simple yet effective transfer subspace learning method called joint distribution implicitly aligned subspace learning (JIASL). The basic idea of JIASL is very straightforward, i.e., building an emotion discriminative and corpus invariant linear regression model under an implicit distribution alignment strategy. Following this idea, we first make use of the source speech features and emotion labels to endow such a regression model with emotion-discriminative ability. Then, a well-designed reconstruction regularization term, jointly considering the marginal and conditional distribution alignments between the speech samples in both corpora, is adopted to implicitly enable the regression model to predict the emotion labels of target speech samples. To evaluate the performance of our proposed JIASL, extensive cross-corpus SER experiments are carried out, and the results demonstrate the promising performance of the proposed JIASL in coping with the tasks of cross-corpus SER. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 4009 KiB  
Article
Application of Machine Learning to Express Measurement Uncertainty
by Vladimir Polužanski, Uros Kovacevic, Nebojsa Bacanin, Tarik A. Rashid, Sasa Stojanovic and Bosko Nikolic
Appl. Sci. 2022, 12(17), 8581; https://0-doi-org.brum.beds.ac.uk/10.3390/app12178581 - 27 Aug 2022
Cited by 4 | Viewed by 2537
Abstract
The continuing increase in the data processing power of modern devices and the availability of vast amounts of data via the internet and the internet of things (sensors, monitoring systems, financial records, health records, social media, etc.) have enabled the accelerated development of machine learning techniques. However, the collected data can be inconsistent, incomplete, and noisy, leading to decreased confidence in data analysis. This paper proposes a novel “judgmental” approach to evaluating the measurement uncertainty of a machine learning model that implements the dropout additive regression trees algorithm. The considered method uses the procedure for expressing type B measurement uncertainty and the maximal value of the empirical absolute loss function of the model. The application concerns the testing and monitoring of power equipment and the determination of partial discharge location by the non-iterative, all-acoustic method. The example uses a dataset representing the correlation between the mean distance of partial discharge and acoustic sensors and the temperature coefficient of the sensitivity of the non-iterative algorithm. The dropout additive regression trees algorithm achieved the best performance based on the highest coefficient of determination value. Most of the model’s predictions (>97%) fell into the proposed standard measurement uncertainty interval for both “seen” and “unseen” data. Full article
(This article belongs to the Topic Machine and Deep Learning)
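As a reminder of the building block involved, type B standard uncertainty is evaluated from an assumed probability distribution rather than from repeated observations; for an error bounded by ±a with a rectangular (uniform) distribution, the GUM gives u = a/√3. The snippet below computes this for an illustrative bound; the paper's actual interval construction additionally uses the model's maximal empirical absolute loss, which is not reproduced here.

```python
# Type B standard uncertainty for a rectangular (uniform) error distribution.
import math

a = 0.05                   # assumed half-width of the error interval (illustrative)
u_b = a / math.sqrt(3)     # GUM rule for a rectangular distribution
print(f"u_B = {u_b:.4f}")
```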

13 pages, 859 KiB  
Article
StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic
by Omama Hamad, Ali Hamdi, Sayed Hamdi and Khaled Shaban
Big Data Cogn. Comput. 2022, 6(3), 88; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6030088 - 22 Aug 2022
Cited by 2 | Viewed by 2550
Abstract
In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manually annotated into the classes agree, disagree, or neutral. We performed benchmarking on the dataset using state-of-the-art and traditional machine learning models. Specifically, we trained deep learning models—bidirectional encoder representations from transformers, long short-term memory, convolutional neural networks, attention-based BiLSTM and Naive Bayes SVM—in addition to naive Bayes, logistic regression, support vector machines, decision trees, K-nearest neighbor and random forest. The average accuracy in the 10-fold cross-validation of these models ranged from 75% to 84.8% for binary stance classification and from 52.6% to 68% for multi-class stance classification. Performance was affected by high vocabulary overlap between classes and by unreliable transfer learning when using deep models pre-trained on general texts for specific domains such as COVID-19 and distance education. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 679 KiB  
Article
Efficient Privacy-Preserving K-Means Clustering from Secret-Sharing-Based Secure Three-Party Computation
by Weiming Wei, Chunming Tang and Yucheng Chen
Entropy 2022, 24(8), 1145; https://0-doi-org.brum.beds.ac.uk/10.3390/e24081145 - 18 Aug 2022
Cited by 5 | Viewed by 1982
Abstract
Privacy-preserving machine learning has become an important field of study due to privacy concerns and policies. However, an efficiency gap between plain-text algorithms and their privacy-preserving versions still exists. In this paper, we focus on designing a novel secret-sharing-based K-means clustering algorithm. In particular, we present an efficient privacy-preserving K-means clustering algorithm based on replicated secret sharing with an honest majority in the semi-honest model. More concretely, the clustering task is outsourced to three semi-honest computing servers. Theoretically, the proposed privacy-preserving scheme can be proven to provide full data privacy. Furthermore, the experimental results demonstrate that our privacy-preserving version reaches the same accuracy as the plain-text algorithm. Compared to the existing privacy-preserving scheme, our proposed protocol achieves about 16.5×–25.2× faster computation and 63.8×–68.0× lower communication. Consequently, the proposed privacy-preserving scheme is suitable for secret-sharing-based secure outsourced computation. Full article
(This article belongs to the Topic Machine and Deep Learning)
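To give a feel for the primitive underneath such protocols, the sketch below splits a secret into additive shares over a prime field and reconstructs it; any subset short of all the shares reveals nothing about the value. The paper uses three-party replicated secret sharing, which distributes overlapping share pairs, so this plain additive variant is a simplified stand-in, and the modulus is an illustrative choice.

```python
# Toy additive secret sharing over a prime field (simplified, not replicated).
import secrets

P = 2**61 - 1  # a Mersenne prime as the field modulus (illustrative choice)

def share(x, n=3):
    """Split x into n additive shares modulo P."""
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((x - sum(parts)) % P)
    return parts

def reconstruct(parts):
    return sum(parts) % P

shares = share(42)
assert reconstruct(shares) == 42  # all shares together recover the secret
```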

13 pages, 1776 KiB  
Article
Forward Warping-Based Video Frame Interpolation Using a Motion Selective Network
by Jeonghwan Heo and Jechang Jeong
Electronics 2022, 11(16), 2553; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11162553 - 15 Aug 2022
Cited by 1 | Viewed by 2338
Abstract
Recently, deep neural networks have shown surprising results in solving most traditional image processing problems. However, video frame interpolation has benefited comparatively little because the receptive field must cover a vast spatio-temporal range. To reduce the computational complexity, most frame interpolation studies first calculate motion with the optical flow and then generate interpolated frames through backward warping. However, while the backward warping process is simple to implement, the interpolated images contain mixed-motion and ghosting defects. Therefore, we propose a new network that avoids backward warping by using the proposed max-min warping. Since max-min warping generates clear warped images in advance according to the size of the motion, and the network is configured to select a warping result per warped layer, the proposed method can optimize the computational complexity while selecting a contextually appropriate image. The video interpolation method using the proposed approach achieved 34.847 PSNR on the Vimeo90k dataset, a 0.13 PSNR improvement over the Quadratic Video Interpolation method, showing that it is an efficient self-supervised learning method for frame interpolation. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 38420 KiB  
Article
Combination of Deep Cross-Stage Partial Network and Spatial Pyramid Pooling for Automatic Hand Detection
by Christine Dewi and Henoch Juli Christanto
Big Data Cogn. Comput. 2022, 6(3), 85; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6030085 - 09 Aug 2022
Cited by 10 | Viewed by 3328
Abstract
The human hand is involved in many computer vision tasks, such as hand posture estimation, hand movement identification, human activity analysis, and other similar tasks, in which hand detection is an important preprocessing step. It is still difficult to correctly recognize some hands in cluttered environments because of the complex appearance variations of agile human hands and their wide range of motion. In this study, we provide a brief assessment of CNN-based object detection algorithms, specifically Densenet Yolo V2, Densenet Yolo V2 CSP, Densenet Yolo V2 CSP SPP, Resnet 50 Yolo V2, Resnet 50 CSP, Resnet 50 CSP SPP, Yolo V4 SPP, Yolo V4 CSP SPP, and Yolo V5. The advantages of CSP and SPP are thoroughly examined and described in detail for each algorithm. We show in our experiments that Yolo V4 CSP SPP provides the best precision, and the experimental results show that the CSP and SPP layers help improve the accuracy of CNN model testing performance. Our model leverages the advantages of CSP and SPP. Our proposed Yolo V4 CSP SPP method outperformed previous research results by an average of 8.88%, with an improvement from 87.6% to 96.48%. Full article
(This article belongs to the Topic Machine and Deep Learning)

26 pages, 5309 KiB  
Article
RSS-Based Wireless LAN Indoor Localization and Tracking Using Deep Architectures
by Muhammed Zahid Karakusak, Hasan Kivrak, Hasan Fehmi Ates and Mehmet Kemal Ozdemir
Big Data Cogn. Comput. 2022, 6(3), 84; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6030084 - 08 Aug 2022
Cited by 8 | Viewed by 3079
Abstract
Wireless Local Area Network (WLAN) positioning is a challenging task indoors due to environmental constraints and the unpredictable behavior of signal propagation, even at a fixed location. The aim of this work is to develop deep learning-based approaches for indoor localization and tracking by utilizing Received Signal Strength (RSS). The study proposes Multi-Layer Perceptron (MLP), One- and Two-Dimensional Convolutional Neural Network (1D CNN and 2D CNN), and Long Short-Term Memory (LSTM) deep network architectures for WLAN indoor positioning based on data obtained from actual RSS measurements on an existing WLAN infrastructure in a mobile-user scenario. Results using the different deep architectures, including MLPs, CNNs, and LSTMs, are presented alongside existing WLAN algorithms, with the Root Mean Square Error (RMSE) as the assessment criterion. The proposed LSTM Model 2 achieved a dynamic positioning RMSE of 1.73 m, outperforming probabilistic WLAN algorithms such as Memoryless Positioning (RMSE: 10.35 m) and the Nonparametric Information (NI) filter with variable acceleration (RMSE: 5.2 m) under the same experimental environment. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 947 KiB  
Article
Progressively Discriminative Transfer Network for Cross-Corpus Speech Emotion Recognition
by Cheng Lu, Chuangao Tang, Jiacheng Zhang and Yuan Zong
Entropy 2022, 24(8), 1046; https://0-doi-org.brum.beds.ac.uk/10.3390/e24081046 - 29 Jul 2022
Cited by 5 | Viewed by 1767
Abstract
Cross-corpus speech emotion recognition (SER) is a challenging task whose difficulty lies in the mismatch between the feature distributions of the training (source domain) and testing (target domain) data, leading to performance degradation when the model deals with new domain data. Previous works explore utilizing domain adaptation (DA) to eliminate the domain shift between the source and target domains and have achieved promising performance in SER. However, these methods mainly treat cross-corpus tasks simply as a DA problem, directly aligning the distributions across domains in a common feature space. In this case, excessively narrowing the domain distance will impair the emotion discrimination of speech features, since it is difficult to maintain the completeness of the emotion space with an emotion classifier alone. To overcome this issue, we propose a progressively discriminative transfer network (PDTN) for cross-corpus SER in this paper, which can enhance the emotion discrimination ability of speech features while eliminating the mismatch between the source and target corpora. In detail, we design two special losses in the feature layers of PDTN, i.e., an emotion discriminant loss Ld and a distribution alignment loss La. By incorporating prior knowledge of speech emotion into feature learning (i.e., high- and low-valence speech emotion features have their respective cluster centers), we integrate a valence-aware center loss Lv and an emotion-aware center loss Lc as the Ld to guarantee the discriminative learning of speech emotions beyond the emotion classifier. Furthermore, a multi-layer distribution alignment loss La is adopted to more precisely eliminate the discrepancy of feature distributions between the source and target domains. Finally, by optimizing PDTN with a combination of three losses, i.e., the cross-entropy loss Le, Ld, and La, we gradually eliminate the domain mismatch between the source and target corpora while maintaining the emotion discrimination of speech features. Extensive experimental results on six cross-corpus tasks over three datasets, i.e., Emo-DB, eNTERFACE, and CASIA, reveal that our proposed PDTN outperforms the state-of-the-art methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 8847 KiB  
Article
Questioning the Anisotropy of Pedestrian Dynamics: An Empirical Analysis with Artificial Neural Networks
by Rudina Subaih, Mohammed Maree, Antoine Tordeux and Mohcine Chraibi
Appl. Sci. 2022, 12(15), 7563; https://0-doi-org.brum.beds.ac.uk/10.3390/app12157563 - 27 Jul 2022
Cited by 3 | Viewed by 1350
Abstract
Identifying the factors that control the dynamics of pedestrians is a crucial step towards modeling and building various pedestrian-oriented simulation systems. In this article, we empirically explore the influential factors that control the single-file movement of pedestrians and their impact. Our goal in this context is to apply feed-forward neural networks to predict and understand the individual speeds for different densities of pedestrians. With artificial neural networks, we can approximate the fitting function that describes pedestrians’ movement without having modeling bias. Our analysis is focused on the distances and range of interactions across neighboring pedestrians. As indicated by previous research, we find that the speed of pedestrians depends on the distance to the predecessor. Yet, in contrast to classical purely anisotropic approaches—which are based on vision fields and assume that the interaction mainly depends on the distance in front—our results demonstrate that the distance to the follower also significantly influences movement. Using the distance to the follower combined with the subject pedestrian’s headway distance to predict the speed improves the estimation by 18% compared to the prediction using the space in front alone. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 3803 KiB  
Article
Deep Compressive Sensing on ECG Signals with Modified Inception Block and LSTM
by Jing Hua, Jue Rao, Yingqiong Peng, Jizhong Liu and Jianjun Tang
Entropy 2022, 24(8), 1024; https://0-doi-org.brum.beds.ac.uk/10.3390/e24081024 - 25 Jul 2022
Cited by 7 | Viewed by 1685
Abstract
In practical electrocardiogram (ECG) monitoring, reducing the data burden and energy costs is challenging. Therefore, compressed sensing (CS), which can conduct under-sampling and reconstruction at the same time, is adopted in ECG monitoring applications. Recently, deep learning has improved the reconstruction performance of CS methods significantly and can remove some of the constraints of traditional CS. In this paper, we propose a deep compressive-sensing scheme for ECG signals based on a modified Inception block and long short-term memory (LSTM). The framework comprises four modules: preprocessing, compression, initial reconstruction, and final reconstruction. We adaptively compressed the normalized ECG signals using three sequential convolutional layers and reconstructed the signals with a modified Inception block and LSTM. We conducted our experiments on the MIT-BIH Arrhythmia Database and the Non-Invasive Fetal ECG Arrhythmia Database to validate the robustness of our model, adopting the Signal-to-Noise Ratio (SNR) and percentage Root-mean-square Difference (PRD) as the evaluation metrics. The PRD of our scheme was the lowest and the SNR the highest at all sensing rates in our experiments on both databases, and when the sensing rate was higher than 0.5, the PRD was lower than 2%, showing a significant improvement in reconstruction performance compared to the comparative methods. Our method also showed good recovery quality on noisy data. Full article
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 998 KiB  
Article
EEG-Based Schizophrenia Diagnosis through Time Series Image Conversion and Deep Learning
by Dong-Woo Ko and Jung-Jin Yang
Electronics 2022, 11(14), 2265; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11142265 - 20 Jul 2022
Cited by 15 | Viewed by 3034
Abstract
Schizophrenia, a mental disorder experienced by more than 20 million people worldwide, is emerging as a serious societal issue. Currently, the diagnosis of schizophrenia is based only on a mental disorder assessment by a psychiatrist or mental health professional using DSM-5, the diagnostic and statistical manual of mental disorders. Furthermore, it is difficult to diagnose schizophrenia in patients in countries with insufficient access to healthcare, and early diagnosis is even more problematic. While various studies are being conducted to address the challenges of schizophrenia diagnosis, the methodology remains limited and diagnostic accuracy needs to improve. In this study, a new approach using EEG data and deep learning is proposed to increase the objectivity and efficiency of schizophrenia diagnosis. Existing deep learning studies classify schizophrenic patients and healthy subjects by learning EEG in the form of graphs or tables. In this study, however, EEG, a form of time series data, was converted into images to improve classification accuracy, and these images were then studied with deep learning models. We used EEG data from 81 people, for which the difference in N100 EEG between schizophrenic patients and healthy subjects had been analyzed in prior research. The EEGs were converted into images using the time series image conversion algorithms Recurrence Plot (RP) and Gramian Angular Field (GAF), and the converted EEG images were learned with Convolutional Neural Network (CNN) models built on VGGNet. When the trained deep learning model was applied to the same data from prior research, classification accuracy improved compared to previous studies. Of the two conversion algorithms, the deep learning model trained on GAF images showed significantly higher classification accuracy. The results of this study suggest that using GAF and CNN models on EEG data can be an effective way to increase the objectivity and efficiency of diagnosing various mental disorders, including schizophrenia. Full article
(This article belongs to the Topic Machine and Deep Learning)
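Of the two conversions, the Gramian Angular Field has a particularly compact definition: rescale the series to [-1, 1], read each value as a polar angle, and fill a matrix with cosines of angle sums. The numpy sketch below implements the summation variant (GASF) under those standard conventions; the paper's exact normalization, and whether it uses the summation or difference field, are assumptions here.

```python
# Gramian Angular Summation Field (GASF) conversion of a 1D series to an image.
import numpy as np

def gasf(x):
    # Rescale the series to [-1, 1] and take the polar-angle encoding.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(x)
    # GASF(i, j) = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

img = gasf(np.sin(np.linspace(0, 4 * np.pi, 64)))  # toy stand-in for an EEG window
print(img.shape)  # (64, 64) image ready for a CNN
```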

10 pages, 2518 KiB  
Article
Machine Learning Sorting Method of Bauxite Based on SE-Enhanced Network
by Pengfei Zhao, Zhengjie Luo, Jiansu Li, Yujun Liu and Baocheng Zhang
Appl. Sci. 2022, 12(14), 7178; https://0-doi-org.brum.beds.ac.uk/10.3390/app12147178 - 16 Jul 2022
Cited by 1 | Viewed by 1512
Abstract
A fast and accurate bauxite recognition method combining an attention module and a clustering algorithm is proposed in this paper. By introducing the K-means clustering algorithm into the YOLOv4 network and embedding the SE attention module, we calculate appropriate anchor box values, enhance the network’s ability to learn bauxite features, automatically learn the importance of different channel features, and improve the accuracy of bauxite target detection. In the experiment, 2189 bauxite photos were taken and screened as the target detection dataset, and the targets were divided into four categories: No. 55, No. 65, No. 70, and Nos. 72–73. By training on a class-balanced dataset for 7000 iterations, the optimal YOLOv4 network model was obtained, reaching an average bauxite sorting accuracy of 99% with an inference time under 0.05 s. Realizing high-speed and high-precision sorting of bauxite greatly improves the mining efficiency and accuracy of the bauxite industry. At the same time, the model provides key technical support for the practical application of similar engineering tasks. Full article
(This article belongs to the Topic Machine and Deep Learning)
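The anchor box step mentioned above is typically done by clustering the ground truth box shapes with K-means under a 1 − IoU distance, as popularized by YOLO. The numpy sketch below shows that generic recipe; the paper's cluster count, data, and any implementation details beyond "K-means for anchor values" are assumptions.

```python
# Generic K-means anchor selection with an IoU-based distance (YOLO-style).
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (n, 2) boxes and (k, 2) anchors given as (width, height)."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + \
            anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest = highest IoU
        anchors = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                            else anchors[i] for i in range(k)])  # keep empty clusters
    return anchors

boxes = np.abs(np.random.default_rng(1).normal(50, 20, (500, 2)))  # synthetic (w, h)
print(kmeans_anchors(boxes, k=4))
```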

15 pages, 3852 KiB  
Article
Multiscale Dense U-Net: A Fast Correction Method for Thermal Drift Artifacts in Laboratory NanoCT Scans of Semi-Conductor Chips
by Mengnan Liu, Yu Han, Xiaoqi Xi, Linlin Zhu, Shuangzhan Yang, Siyu Tan, Jian Chen, Lei Li and Bin Yan
Entropy 2022, 24(7), 967; https://0-doi-org.brum.beds.ac.uk/10.3390/e24070967 - 13 Jul 2022
Cited by 2 | Viewed by 1471
Abstract
The resolution of 3D structures reconstructed by laboratory nanoCT is often affected by changes in ambient temperature. Although correction methods based on projection alignment have been widely used, they are time-consuming and complex. Especially for piecewise samples (e.g., chips), the existing methods are only semi-automatic because the projections lose attenuation information at some rotation angles. Herein, we propose a fast correction method that directly processes the reconstructed slices, thereby addressing the limitations of the existing methods. The method, named multiscale dense U-Net (MD-Unet), is based on MIMO-Unet and achieves state-of-the-art artifact correction performance in nanoCT. Experiments show that MD-Unet can significantly boost correction performance (e.g., with three orders of magnitude improvement in correction speed compared with traditional methods), and MD-Unet+ improves upon MIMO-Unet by 0.92 dB on the chip dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 1439 KiB  
Article
Domain Adaptation with Data Uncertainty Measure Based on Evidence Theory
by Ying Lv, Bofeng Zhang, Guobing Zou, Xiaodong Yue, Zhikang Xu and Haiyan Li
Entropy 2022, 24(7), 966; https://0-doi-org.brum.beds.ac.uk/10.3390/e24070966 - 13 Jul 2022
Cited by 2 | Viewed by 1812
Abstract
Domain adaptation aims to learn a classifier for a target domain task by using related labeled data from the source domain. Because the source domain data and the target domain task may be mismatched, there is uncertainty in the source domain data with respect to the target domain task. Ignoring this uncertainty may lead to models with unreliable and suboptimal classification results for the target domain task. However, most previous works focus on reducing the gap in data distribution between the source and target domains; they do not consider the uncertainty of source domain data about the target domain task and cannot use this uncertainty to learn an adaptive classifier. To address this problem, we revisit domain adaptation from the perspective of source domain data uncertainty based on evidence theory and thereby devise an adaptive classifier with an uncertainty measure. Based on evidence theory, we first design an evidence net to estimate the uncertainty of source domain data about the target domain task. Second, we design a general loss function with the uncertainty measure for the adaptive classifier and extend the loss function to support vector machines. Finally, numerical experiments on simulated datasets and real-world applications comprehensively demonstrate the effectiveness of the adaptive classifier with the uncertainty measure. Full article
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 3069 KiB  
Article
Attention-Shared Multi-Agent Actor–Critic-Based Deep Reinforcement Learning Approach for Mobile Charging Dynamic Scheduling in Wireless Rechargeable Sensor Networks
by Chengpeng Jiang, Ziyang Wang, Shuai Chen, Jinglin Li, Haoran Wang, Jinwei Xiang and Wendong Xiao
Entropy 2022, 24(7), 965; https://0-doi-org.brum.beds.ac.uk/10.3390/e24070965 - 12 Jul 2022
Cited by 8 | Viewed by 1733
Abstract
The breakthrough of wireless energy transmission (WET) technology has greatly promoted the development of wireless rechargeable sensor networks (WRSNs). A promising way to overcome the energy constraint problem in WRSNs is mobile charging, which employs a mobile charger to charge sensors via WET. Recently, more and more studies have been conducted on mobile charging scheduling under dynamic charging environments, but they ignore the joint optimal design of charging sequence scheduling and charging ratio control (JSSRC). This paper proposes a novel attention-shared multi-agent actor–critic-based deep reinforcement learning approach for JSSRC (AMADRL-JSSRC). In AMADRL-JSSRC, we employ two heterogeneous agents, a charging sequence scheduler and a charging ratio controller, each with an independent actor network and critic network, and we design their reward functions by considering the tour length and the number of dead sensors, respectively. AMADRL-JSSRC trains decentralized policies in multi-agent environments using a centralized critic network that shares an attention mechanism and selects relevant policy information for each agent at every charging decision. Simulation results demonstrate that the proposed AMADRL-JSSRC can efficiently prolong the lifetime of the network and reduce the number of dead sensors compared with the baseline algorithms. Full article
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 565 KiB  
Article
Multi-Level Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
by Lei Feng, Yuxuan Xie, Bing Liu and Shuyan Wang
Appl. Sci. 2022, 12(14), 6938; https://0-doi-org.brum.beds.ac.uk/10.3390/app12146938 - 08 Jul 2022
Cited by 6 | Viewed by 1965
Abstract
Multi-agent reinforcement learning (MARL) has become more and more popular over recent decades, and the need for high-level cooperation is increasing every day because of the complexity of real-world environments. However, the multi-agent credit assignment problem, the main obstacle to high-level coordination, is still not addressed properly. Though many methods have been proposed, none of them performs credit assignment across multiple levels. In this paper, we aim to learn a better credit assignment scheme through credit assignment across multiple levels. First, we propose a hierarchical model that consists of a manager level and a worker level: the manager level incorporates a dilated Gated Recurrent Unit (GRU) to focus on high-level plans, and the worker level uses a GRU to execute primitive actions conditioned on those plans. Then, one centralized critic is designed for each level to learn that level’s credit assignment scheme. From this, we construct a novel hierarchical MARL algorithm, named MLCA, which achieves multi-level credit assignment. We conduct experiments on three classical and challenging tasks to compare the proposed algorithm against three baseline methods. The results show that our method gains great performance improvement across all maps that require high-level cooperation. Full article
(This article belongs to the Topic Machine and Deep Learning)

22 pages, 1108 KiB  
Article
We Know You Are Living in Bali: Location Prediction of Twitter Users Using BERT Language Model
by Lihardo Faisal Simanjuntak, Rahmad Mahendra and Evi Yulianti
Big Data Cogn. Comput. 2022, 6(3), 77; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6030077 - 07 Jul 2022
Cited by 15 | Viewed by 3607
Abstract
Twitter user location data provide essential information that can be used for various purposes. However, user location is not easy to identify because many profiles omit this information, or users enter data that do not correspond to their actual locations. Several related works have attempted to predict location from English-language tweets. In this study, we attempt to predict the location of Indonesian tweets. We utilize machine learning approaches, i.e., long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT), to infer Twitter users’ home locations using the display name in the profile, the user description, and the user’s tweets. By concatenating display name, description, and aggregated tweets, the model achieved the best accuracy of 0.77. The IndoBERT model outperformed several baseline models. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 30605 KiB  
Article
Optical Flow-Aware-Based Multi-Modal Fusion Network for Violence Detection
by Yang Xiao, Guxue Gao, Liejun Wang and Huicheng Lai
Entropy 2022, 24(7), 939; https://0-doi-org.brum.beds.ac.uk/10.3390/e24070939 - 06 Jul 2022
Cited by 3 | Viewed by 2019
Abstract
Violence detection aims to locate violent content in video frames, and improving its accuracy is of great importance for security. However, current methods do not make full use of multi-modal vision and audio information, which limits detection accuracy. We found that the violence detection accuracy for different kinds of videos is related to changes in optical flow. With this in mind, we propose an optical flow-aware-based multi-modal fusion network (OAMFN) for violence detection. Specifically, we use three different fusion strategies to fully integrate multi-modal features. First, the main branch concatenates RGB features and audio features, while the optical flow branch concatenates optical flow features with RGB features and audio features, respectively. Then, the cross-modal information fusion module integrates the features of the different combinations and applies weights to them to capture cross-modal information in audio and video. After that, the channel attention module extracts valuable information by weighting the integrated features. Furthermore, an optical flow-aware-based score fusion strategy is introduced to fuse features of different modalities from the two branches. On the XD-Violence dataset, our multi-modal fusion network achieves an AP of 83.09% in offline detection, 1.4% higher than the state-of-the-art methods, and 78.09% in online detection, 4.42% higher than the state-of-the-art methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 3778 KiB  
Article
Automatic Rice Disease Detection and Assistance Framework Using Deep Learning and a Chatbot
by Siddhi Jain, Rahul Sahni, Tuneer Khargonkar, Himanshu Gupta, Om Prakash Verma, Tarun Kumar Sharma, Tushar Bhardwaj, Saurabh Agarwal and Hyunsung Kim
Electronics 2022, 11(14), 2110; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11142110 - 06 Jul 2022
Cited by 14 | Viewed by 3660
Abstract
Agriculture not only supplies food but is also a source of income for a vast population of the world. Paddy plants produce seed in a brown husk; after de-husking and processing, the seed yields edible rice, a major cereal crop and staple food, and thus a cornerstone of food security for half the world’s people. However, with climate change and global warming, rice quality and production are severely degraded by common bacterial and fungal diseases of rice plants (such as sheath rot, leaf blast, leaf smut, brown spot, and bacterial blight). Accurate recognition and classification of these diseases at an early stage is therefore in urgent demand. Hence, the present work proposes an automatic system in the form of a smartphone application (E-crop doctor) that detects diseases from paddy leaves and can also suggest pesticides to farmers. The application also has a chatbot named “docCrop” which provides 24 × 7 support to farmers. The efficiency of the two most popular object detection algorithms for smartphone applications (YOLOv3 tiny and YOLOv4 tiny) was analysed for the detection of three diseases: brown spot, leaf blast, and hispa. The results reveal that YOLOv4 tiny achieved a mAP of 97.36%, which is 17.59% higher than that of YOLOv3 tiny. Hence, YOLOv4 tiny was deployed for the development of the mobile application.
(This article belongs to the Topic Machine and Deep Learning)
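A hedged sketch of how a trained YOLOv4-tiny detector of this kind could be run on a leaf image with OpenCV’s DNN module; the cfg/weights/image file names are placeholders, and the three class names follow the abstract.

```python
# Sketch only: YOLOv4-tiny inference for paddy-leaf disease detection.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

classes = ["brown spot", "leaf blast", "hispa"]  # the three detected diseases
img = cv2.imread("leaf.jpg")
class_ids, confidences, boxes = model.detect(img, confThreshold=0.25, nmsThreshold=0.4)
for cid, conf, box in zip(class_ids, confidences, boxes):
    print(classes[int(cid)], float(conf), box)   # box is (x, y, w, h)
```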

11 pages, 496 KiB  
Article
Research on Computer-Aided Diagnosis Method Based on Symptom Filtering and Weighted Network
by Xiaoxi Huang and Haoxin Wang
Entropy 2022, 24(7), 931; https://0-doi-org.brum.beds.ac.uk/10.3390/e24070931 - 05 Jul 2022
Viewed by 1168
Abstract
In the process of disease identification, as the number of diseases increases, the collection of both diseases and symptoms becomes larger. However, existing computer-aided diagnosis systems do not fully address the curse of dimensionality caused by these growing data sets. To address this problem, we propose symptom filtering and a weighted network, with the goal of deeper processing of the collected symptom information. Symptom filtering is similar to a filter in signal transmission: it filters the collected symptom information, further reduces the dimensional space of the system, and makes the important symptoms more prominent. The weighted network, on the other hand, mines deeper disease information by modeling the channels of symptom information, amplifying important information and suppressing unimportant information. Compared with existing hierarchical reinforcement learning models, the feature extraction methods proposed in this paper improve the accuracy of existing models by more than 10%.
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 695 KiB  
Article
Topological Data Analysis Helps to Improve Accuracy of Deep Learning Models for Fake News Detection Trained on Very Small Training Sets
by Ran Deng and Fedor Duzhin
Big Data Cogn. Comput. 2022, 6(3), 74; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6030074 - 05 Jul 2022
Cited by 6 | Viewed by 4192
Abstract
Topological data analysis has recently found applications in various areas of science, such as computer vision and the understanding of protein folding. However, applications of topological data analysis to natural language processing remain under-researched. This study applies topological data analysis to a particular natural language processing task: fake news detection. We found that deep learning models are more accurate in this task than topological data analysis alone. However, combining a deep learning model with topological data analysis significantly improves the model’s accuracy if the available training set is very small.
(This article belongs to the Topic Machine and Deep Learning)
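To make the “TDA features + classifier” assembly concrete, here is a minimal sketch that extracts simple persistence-diagram statistics with the ripser library and feeds them to a logistic regression. The choice of features and the use of per-document word-embedding point clouds are illustrative assumptions, not the paper’s exact pipeline.

```python
# Sketch only: topological features from a document's embedding point cloud.
import numpy as np
from ripser import ripser
from sklearn.linear_model import LogisticRegression

def tda_features(word_vectors: np.ndarray) -> np.ndarray:
    dgms = ripser(word_vectors, maxdim=1)["dgms"]   # H0 and H1 persistence diagrams
    feats = []
    for dgm in dgms:
        finite = dgm[np.isfinite(dgm[:, 1])]        # drop the infinite H0 bar
        lifetimes = finite[:, 1] - finite[:, 0]
        feats += [lifetimes.sum(),
                  lifetimes.max() if lifetimes.size else 0.0,
                  float(lifetimes.size)]
    return np.array(feats)

# X_docs: list of (n_words, dim) embedding arrays; y: 0 = real, 1 = fake
# X = np.stack([tda_features(doc) for doc in X_docs])
# clf = LogisticRegression().fit(X, y)   # or concatenate X with DL features
```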

16 pages, 616 KiB  
Article
Joint Entity and Relation Extraction Network with Enhanced Explicit and Implicit Semantic Information
by Huiyan Wu and Jun Huang
Appl. Sci. 2022, 12(12), 6231; https://0-doi-org.brum.beds.ac.uk/10.3390/app12126231 - 19 Jun 2022
Cited by 3 | Viewed by 2400
Abstract
The main purpose of joint entity and relation extraction is to extract entities from unstructured texts and, at the same time, extract the relations between the labeled entities. At present, most existing joint entity and relation extraction networks ignore explicit semantic information and exploit implicit semantic information insufficiently. In this paper, we propose a Joint Entity and Relation Extraction Network with Enhanced Explicit and Implicit Semantic Information (EINET). First, on top of a pre-trained model, we introduce explicit semantics from Semantic Role Labeling (SRL), which contains rich semantic features about entity types and the relations between entities. Then, to enhance the implicit semantic information and extract richer features of the entities and local context, we adopt separate Bi-directional Long Short-Term Memory (Bi-LSTM) networks to encode entities and local contexts, respectively. In addition, we propose to integrate global semantic information and a local context length representation in relation extraction to further improve model performance. Our model achieves competitive results on three publicly available datasets. Compared with the baseline model on CoNLL04, EINET obtains improvements of 2.37% in F1 for named entity recognition and 3.43% in F1 for relation extraction.
(This article belongs to the Topic Machine and Deep Learning)
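A minimal sketch of the idea of encoding entity spans and local contexts with separate Bi-LSTMs and concatenating the pooled vectors for relation classification; all dimensions are illustrative, and this is not the EINET architecture itself.

```python
# Sketch only: separate Bi-LSTM encoders for entity spans and local contexts.
import torch
import torch.nn as nn

class SpanEncoder(nn.Module):
    def __init__(self, in_dim=768, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):              # x: (batch, span_len, in_dim)
        out, _ = self.lstm(x)          # (batch, span_len, 2 * hidden)
        return out.max(dim=1).values   # max-pool each span into one vector

entity_enc, context_enc = SpanEncoder(), SpanEncoder()
entity_vec = entity_enc(torch.randn(4, 5, 768))      # e.g., 5-token entity spans
context_vec = context_enc(torch.randn(4, 20, 768))   # e.g., 20-token local contexts
pair = torch.cat([entity_vec, context_vec], dim=-1)  # input to a relation classifier
```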

17 pages, 2393 KiB  
Article
An fMRI Sequence Representation Learning Framework for Attention Deficit Hyperactivity Disorder Classification
by Jin Xie, Zhiyong Huo, Xianru Liu and Zhishun Wang
Appl. Sci. 2022, 12(12), 6211; https://0-doi-org.brum.beds.ac.uk/10.3390/app12126211 - 18 Jun 2022
Cited by 3 | Viewed by 1688
Abstract
For attention deficit hyperactivity disorder (ADHD), a common neurological disease, accurate identification is the basis for treatment. In this paper, a novel end-to-end representation learning framework for ADHD classification of functional magnetic resonance imaging (fMRI) sequences is proposed. With this framework, the complexity of the sequence representation learning neural network decreases, the overfitting problem of deep learning in small-sample cases is solved effectively, and superior classification performance is achieved. Specifically, a data conversion module was designed to convert a two-dimensional sequence into a three-dimensional image, which expands the modeling area and greatly reduces the computational complexity. A transfer learning method was utilized to freeze or fine-tune the parameters of the pre-trained neural network to reduce the risk of overfitting in small-sample cases. Hierarchical feature extraction is performed automatically by combining the sequence representation learning modules with a weighted cross-entropy loss. Experiments were conducted both on individual imaging sites and on their combination, and the results showed that the average classification accuracies with the proposed framework were 73.73% and 72.02%, respectively, which are much higher than those of existing methods.
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 2429 KiB  
Article
A Multi-Lingual Speech Recognition-Based Framework to Human-Drone Interaction
by Kheireddine Choutri, Mohand Lagha, Souham Meshoul, Mohamed Batouche, Yasmine Kacel and Nihad Mebarkia
Electronics 2022, 11(12), 1829; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11121829 - 09 Jun 2022
Cited by 2 | Viewed by 2812
Abstract
In recent years, human–drone interaction has received increasing interest from the scientific community. When interacting with a drone, humans assume a variety of roles, the nature of which is determined by the drone’s application and degree of autonomy. Common methods of controlling drone movements include RF remote controls and ground control stations. These devices are often difficult to manipulate and may even require some training. An alternative is to use innovative methods called natural user interfaces that allow users to interact with drones intuitively, using speech. However, supporting only one language of interaction may limit the number of users, especially if different languages are spoken in the same region. Moreover, environmental and propeller noise make speech recognition a complicated task. The goal of this work is to use a multilingual speech recognition system that includes English, Arabic, and Amazigh to control the movement of drones. These languages were selected because they are widely spoken in many regions, particularly in the Middle East and North Africa (MENA) zone. To achieve this goal, a two-stage approach is proposed. In the first stage, a deep learning-based model for multilingual speech recognition is designed. Then, the developed model is deployed in real settings using a quadrotor UAV. The network was trained on 38,850 recordings including commands and unknown words mixed with noise to improve robustness. An average class accuracy of more than 93% was achieved. Experiments were then conducted with 16 participants giving voice commands in order to test the efficiency of the designed system. The achieved accuracy is about 93.76% for English and 88.55% and 82.31% for Arabic and Amazigh, respectively. Finally, the designed system was implemented in hardware on a quadrotor UAV. Real-time tests have shown that the approach is very promising as an alternative form of human–drone interaction while offering the benefit of control simplicity.
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 4151 KiB  
Article
P2P Lending Default Prediction Based on AI and Statistical Models
by Po-Chang Ko, Ping-Chen Lin, Hoang-Thu Do and You-Fu Huang
Entropy 2022, 24(6), 801; https://0-doi-org.brum.beds.ac.uk/10.3390/e24060801 - 08 Jun 2022
Cited by 3 | Viewed by 3866
Abstract
Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not yet tightly governed by relevant laws, as their development speed has far exceeded that of regulations. Therefore, P2P lending operations are still subject to risks. This paper proposes prediction models to mitigate the risks of default and asymmetric information on P2P lending platforms. Specifically, we designed sophisticated procedures to pre-process mass data extracted from Lending Club for 2018 Q3–2019 Q2. After that, three statistical models, namely Logistic Regression, Bayesian Classifier, and Linear Discriminant Analysis (LDA), and five AI models, namely Decision Tree, Random Forest, LightGBM, Artificial Neural Network (ANN), and Convolutional Neural Network (CNN), were utilized for data analysis. The loan statuses of Lending Club’s customers were rationally classified. To evaluate the models, we adopted confusion-matrix-based metrics, the AUC-ROC curve, the Kolmogorov–Smirnov chart (KS), and Student’s t-test. Empirical studies show that LightGBM produces the best performance and is 2.91% more accurate than the other models, resulting in a revenue improvement of nearly USD 24 million for Lending Club. Student’s t-test proves that the differences between models are statistically significant.
(This article belongs to the Topic Machine and Deep Learning)
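For readers unfamiliar with the metric suite, the following sketch trains a LightGBM classifier and computes AUC together with the Kolmogorov–Smirnov (KS) statistic on synthetic stand-in data; it mirrors the evaluation style, not the paper’s pre-processing.

```python
# Sketch only: LightGBM default-prediction baseline evaluated with AUC and KS.
import numpy as np
from lightgbm import LGBMClassifier
from scipy.stats import ks_2samp
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X = np.random.rand(5000, 20)                                   # stand-in loan features
y = (X[:, 0] + 0.3 * np.random.rand(5000) > 0.8).astype(int)   # stand-in labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LGBMClassifier(n_estimators=300, learning_rate=0.05).fit(X_tr, y_tr)
p = clf.predict_proba(X_te)[:, 1]
ks = ks_2samp(p[y_te == 1], p[y_te == 0]).statistic   # class-separation statistic
print(f"AUC={roc_auc_score(y_te, p):.3f}  KS={ks:.3f}")
```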

13 pages, 370 KiB  
Article
ConAs-GRNs: Sentiment Classification with Construction-Assisted Multi-Scale Graph Reasoning Networks
by Bo Chen, Weiming Peng and Jihua Song
Electronics 2022, 11(12), 1825; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11121825 - 08 Jun 2022
Viewed by 1306
Abstract
Traditional neural networks have limited capabilities in modeling the refined global and contextual semantics of emotional texts and usually ignore the dependencies between different emotional words. To address this limitation, this paper proposes a construction-assisted multi-scale graph reasoning network (ConAs-GRNs), which explores the details of contextual semantics as well as the emotional dependencies within emotional texts from multiple aspects by focusing on salient emotional information. In this network, an emotional construction-based multi-scale topological graph is used to describe multiple aspects of emotional dependency, and a sentence dependency tree is utilized to construct a relationship graph based on emotional words and texts. Then, transfer learning and pooling learning are performed on the topological graph. In our case, a weighted edge-reduction strategy is used to aggregate adjacency information, which enables the internal transfer of semantic information within a single graph. Moreover, to implement the inter-graph transfer of semantic information, we rely on the construction structure to coordinate heterogeneous graph information. Extensive experiments conducted on two baseline datasets, SemEval 2014 and ACL-14, demonstrate that the proposed ConAs-GRNs can effectively coordinate and integrate heterogeneous information from within constructions.
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 1324 KiB  
Article
An Active Learning Algorithm Based on the Distribution Principle of Bhattacharyya Distance
by He Xu, Chunyue Ding, Peng Li and Yimu Ji
Mathematics 2022, 10(11), 1927; https://0-doi-org.brum.beds.ac.uk/10.3390/math10111927 - 04 Jun 2022
Cited by 1 | Viewed by 1433
Abstract
Active learning is a method that actively selects the most informative examples from a large number of unlabeled samples to be labeled by experts, so as to obtain a high-precision classifier with a small number of samples. Most current research optimizes the classifier at each iteration using basic principles, but the batch queried with the largest amount of information in each round does not necessarily represent the overall distribution of the samples; that is, the method may fall into partial optimization and ignore the whole, which may reduce its accuracy. To solve this problem, a special distance measure, the Bhattacharyya distance, is used in this paper. By using this distance and designing a new set of query decision logic, we can improve the accuracy of the model. Our method queries the samples that are both most representative of the distribution and most informative, realizing the classification task with a small number of samples. We provide theoretical proofs and experimental analysis. Finally, we use different data sets and compare against other classification algorithms to evaluate the performance and efficiency of our algorithm.
(This article belongs to the Topic Machine and Deep Learning)
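For reference, the Bhattacharyya distance between two discrete distributions p and q is D_B(p, q) = -ln(Σᵢ √(pᵢ qᵢ)); a minimal NumPy version follows, with the normalization made explicit.

```python
# Sketch only: Bhattacharyya distance between two discrete distributions.
import numpy as np

def bhattacharyya_distance(p: np.ndarray, q: np.ndarray) -> float:
    p = p / p.sum()               # normalize to valid probability distributions
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))   # Bhattacharyya coefficient, in (0, 1]
    return -np.log(bc)            # 0 iff the distributions are identical

print(bhattacharyya_distance(np.array([0.5, 0.3, 0.2]),
                             np.array([0.4, 0.4, 0.2])))  # small value, ~0.006
```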

16 pages, 3596 KiB  
Article
A List-Ranking Framework Based on Linear and Non-Linear Fusion for Recommendation from Implicit Feedback
by Buchen Wu and Jiwei Qin
Entropy 2022, 24(6), 778; https://0-doi-org.brum.beds.ac.uk/10.3390/e24060778 - 31 May 2022
Cited by 1 | Viewed by 1505
Abstract
Although most list-ranking frameworks are based on multilayer perceptrons (MLPs), they still face two limitations in the field of recommender systems: (1) MLPs suffer from overfitting when dealing with sparse vectors, and the model tends to learn in-depth features of user–item interaction behavior while ignoring the low-rank, shallow information present in the matrix. (2) Existing ranking methods cannot effectively handle ranking between items with the same rating value or the problem of inconsistent independence in reality. We propose a list-ranking framework based on linear and non-linear fusion for recommendation from implicit feedback, named RBLF. First, the model uses dense vectors to represent users and items through one-hot encoding and embedding. Second, to jointly learn shallow and deep user–item interactions, we use an interaction-grabbing layer to capture user–item interaction behavior through the dense vectors of users and items. Finally, RBLF uses Bayesian collaborative ranking to better fit the characteristics of implicit feedback. Experiments show that RBLF achieves a significant performance improvement.
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 3499 KiB  
Article
A Fast Multi-Scale Generative Adversarial Network for Image Compressed Sensing
by Wenzong Li, Aichun Zhu, Yonggang Xu, Hongsheng Yin and Gang Hua
Entropy 2022, 24(6), 775; https://0-doi-org.brum.beds.ac.uk/10.3390/e24060775 - 31 May 2022
Cited by 3 | Viewed by 4898
Abstract
Recently, deep neural network-based image compressed sensing methods have achieved impressive success in reconstruction quality. However, these methods (1) have limitations in their sampling patterns and (2) usually suffer from high computational complexity. To this end, a fast multi-scale generative adversarial network (FMSGAN) is implemented in this paper. Specifically, (1) an effective multi-scale sampling structure is proposed. It contains four kernels of different sizes to decompose and sample images effectively, capturing different levels of spatial features at multiple scales. (2) An efficient lightweight multi-scale residual structure for deep image reconstruction is proposed to balance receptive field size and computational complexity. The key idea is to apply smaller convolution kernel sizes in the multi-scale residual structure to reduce the number of operations while maintaining the receptive field. Meanwhile, a channel attention structure is employed to enrich useful information. Moreover, perceptual loss is combined with MSE loss and adversarial loss as the optimization function to recover finer images. Numerous experiments show that our FMSGAN achieves state-of-the-art image reconstruction quality with low computational complexity.
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1880 KiB  
Article
Automatic Classification of 15 Leads ECG Signal of Myocardial Infarction Using One Dimension Convolutional Neural Network
by Ahmad Haidar Mirza, Siti Nurmaini and Radiyati Umi Partan
Appl. Sci. 2022, 12(11), 5603; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115603 - 31 May 2022
Cited by 6 | Viewed by 2031
Abstract
Impaired blood flow caused by coronary artery occlusion due to thrombus can damage the heart muscle, a condition often called myocardial infarction (MI). To avoid complications of MI such as heart failure or arrhythmias that can cause death, early diagnosis and detection are necessary. An electrocardiogram (ECG) signal is a diagnostic medium that can be used to detect acute MI, and diagnostics aided by data science are very useful for detecting MI in ECG signals. The purpose of this study is to propose an automatic classification framework for MI using 15-lead ECG signals consisting of 12 standard leads and 3 Frank leads. This research contributes an improvement in classification performance for 10 MI classes and the normal class. The PTB dataset trained with the proposed 1D-CNN architecture produced average accuracy, sensitivity, specificity, precision, and F1-score of 99.98%, 99.91%, 99.99%, 99.91%, and 99.91%, respectively. From the evaluation results, it can be concluded that the proposed 1D-CNN architecture provides excellent performance in detecting MI.
(This article belongs to the Topic Machine and Deep Learning)
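A minimal sketch of a 1D-CNN over 15-lead ECG segments, where the leads are the input channels and the output covers 10 MI classes plus the normal class; layer sizes and the window length are illustrative, not the paper’s architecture.

```python
# Sketch only: compact 1D-CNN for 15-lead ECG classification (channels = leads).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(15, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
    nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(128, 11),               # 10 MI classes + 1 normal class
)
x = torch.randn(8, 15, 1000)          # a batch of 1000-sample 15-lead segments
print(model(x).shape)                 # torch.Size([8, 11])
```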

13 pages, 1369 KiB  
Article
Specific Emitter Identification Based on Ensemble Neural Network and Signal Graph
by Chenjie Xing, Yuan Zhou, Yinan Peng, Jieke Hao and Shuoshi Li
Appl. Sci. 2022, 12(11), 5496; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115496 - 28 May 2022
Cited by 2 | Viewed by 1696
Abstract
Specific emitter identification (SEI) is a technology for extracting fingerprint features from a signal and identifying its emitter. In this paper, the authors propose an SEI method based on ensemble neural networks (ENN) and signal graphs, with the following innovations: First, a signal graph is used to represent signal data in a non-Euclidean space; that is, sequence signal data are constructed into a signal graph to transform the sequence signal from a Euclidean space to a non-Euclidean space. Hence, the graph feature (the feature of the non-Euclidean space) of the signal can be extracted from the signal graph. Second, the ensemble neural network integrates a graph feature extractor and a sequence feature extractor, making it possible to extract graph and sequence features simultaneously. This ensemble neural network also fuses graph features with sequence features, obtaining an ensemble feature that covers both the Euclidean and the non-Euclidean space. Therefore, the ensemble feature contains more effective information for identifying the emitter. The study results demonstrate that this SEI method has higher accuracy and robustness than traditional machine learning methods and common deep learning methods.
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 2829 KiB  
Article
GTAD: Graph and Temporal Neural Network for Multivariate Time Series Anomaly Detection
by Siwei Guan, Binjie Zhao, Zhekang Dong, Mingyu Gao and Zhiwei He
Entropy 2022, 24(6), 759; https://0-doi-org.brum.beds.ac.uk/10.3390/e24060759 - 27 May 2022
Cited by 16 | Viewed by 4860
Abstract
The rapid development of smart factories, combined with the increasing complexity of production equipment, has resulted in a large number of multivariate time series that can be recorded using sensors during the manufacturing process. The anomalous patterns of industrial production may be hidden in these time series. Previous LSTM-based and machine-learning-based approaches have made fruitful progress in anomaly detection. However, these multivariate time series anomaly detection algorithms do not take into account the correlations and time dependencies between the sequences. In this study, we propose a new algorithmic framework, namely the graph attention network and temporal convolutional network for multivariate time series anomaly detection (GTAD), to address this problem. Specifically, we first utilize temporal convolutional networks, including causal convolution and dilated convolution, to capture temporal dependencies, and then use graph neural networks to obtain correlations between sensors. Finally, we conducted sufficient experiments on three public benchmark datasets, and the results showed that the proposed method outperforms the baseline methods, achieving detection results with F1 scores higher than 95% on all datasets.
(This article belongs to the Topic Machine and Deep Learning)
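A minimal sketch of the causal, dilated convolution building block mentioned above: left-padding the sequence by (kernel_size - 1) * dilation ensures each output position only sees the past.

```python
# Sketch only: a causal, dilated 1D convolution block of the kind used in TCNs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                    # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))          # pad the past only, never the future
        return self.conv(x)

block = CausalConv1d(channels=16)
print(block(torch.randn(4, 16, 100)).shape)  # torch.Size([4, 16, 100])
```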

18 pages, 2219 KiB  
Article
A Novel Hierarchical Adaptive Feature Fusion Method for Meta-Learning
by Enjie Ding, Xu Chu, Zhongyu Liu, Kai Zhang and Qiankun Yu
Appl. Sci. 2022, 12(11), 5458; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115458 - 27 May 2022
Cited by 3 | Viewed by 1396
Abstract
Meta-learning aims to teach the machine how to learn. Embedding-model-based meta-learning performs well in solving the few-shot problem. Such methods use an embedding model, usually a convolutional neural network, to extract features from samples and use a classifier to measure the features extracted at a particular stage of the embedding model. However, features at the low stages of the embedding model contain richer visual information, while features at the high stages contain richer semantic information. Existing methods fail to consider the impact of the information carried by features at different stages on classifier performance. Therefore, we propose a meta-learning method based on adaptive feature fusion and weight optimization. Its main innovations are as follows: firstly, a feature fusion strategy fuses the features from each stage of the embedding model according to certain weights, effectively utilizing the information carried by different stage features. Secondly, a particle swarm optimization algorithm optimizes the fusion weights, determining each stage feature’s weight in the fusion process. The method outperforms current mainstream baseline methods on multiple few-shot image recognition benchmarks.
(This article belongs to the Topic Machine and Deep Learning)

22 pages, 5746 KiB  
Article
Surrogate Model-Based Parameter Tuning of Simulated Annealing Algorithm for the Shape Optimization of Automotive Rubber Bumpers
by Dávid Huri and Tamás Mankovits
Appl. Sci. 2022, 12(11), 5451; https://0-doi-org.brum.beds.ac.uk/10.3390/app12115451 - 27 May 2022
Cited by 3 | Viewed by 2166
Abstract
A design engineer has to deal with increasingly complex design tasks on a daily basis, while the available design time is shrinking. Market competitiveness can be improved by using optimization if the design process can be automated. If there is limited information about the behavior of the objective function, global search methods such as simulated annealing (SA) should be used. This algorithm requires the selection of a number of parameters based on the task. A procedure for reducing the time spent tuning the SA algorithm for computationally expensive, simulation-driven optimization tasks was developed. The applicability of the method was demonstrated by solving a shape optimization problem for a rubber bumper built into the air spring structures of lorries. Due to the time-consuming objective function calls, a support vector regression (SVR) surrogate model was used to test the performance of the optimization algorithm. For the SVR training, samples were taken using a maximin Latin hypercube design. The SA algorithm was implemented with an adaptive search space and different cooling schedules. Subsequently, the SA parameters were fine-tuned using the trained SVR surrogate model. An optimal design was found using the adapted SA algorithm with negligible error from a technical standpoint.
(This article belongs to the Topic Machine and Deep Learning)
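A condensed sketch of the surrogate loop: sample the design space with a Latin hypercube, fit an SVR to the expensive objective, then run SA against the cheap surrogate. The quadratic objective stands in for the finite element simulation, and plain LHS is used where the paper uses a maximin design.

```python
# Sketch only: SVR surrogate + simulated annealing with geometric cooling.
import numpy as np
from scipy.stats import qmc
from sklearn.svm import SVR

def expensive_objective(x):            # stand-in for the FE bumper simulation
    return np.sum((x - 0.3) ** 2, axis=-1)

X = qmc.LatinHypercube(d=2, seed=0).random(60)   # space-filling samples in [0, 1]^2
surrogate = SVR(C=10.0, epsilon=0.01).fit(X, expensive_objective(X))

rng = np.random.default_rng(0)
x = np.full(2, 0.5)
fx = float(surrogate.predict(x[None])[0])
T = 1.0
for _ in range(2000):                  # SA runs on the cheap surrogate, not the FE model
    cand = np.clip(x + rng.normal(0.0, 0.1, size=2), 0.0, 1.0)
    fc = float(surrogate.predict(cand[None])[0])
    if fc < fx or rng.random() < np.exp((fx - fc) / T):
        x, fx = cand, fc
    T *= 0.995                         # geometric cooling schedule
print(x, fx)                           # approaches the optimum near (0.3, 0.3)
```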

18 pages, 405 KiB  
Article
Bayesian Network Model Averaging Classifiers by Subbagging
by Shouta Sugahara, Itsuki Aomi and Maomi Ueno
Entropy 2022, 24(5), 743; https://0-doi-org.brum.beds.ac.uk/10.3390/e24050743 - 23 May 2022
Cited by 2 | Viewed by 2126
Abstract
When applied to classification problems, Bayesian networks are often used to infer a class variable given feature variables. Earlier reports have described that the classification accuracy of Bayesian network structures achieved by maximizing the marginal likelihood (ML) is lower than that achieved by maximizing the conditional log-likelihood (CLL) of a class variable given the feature variables. Nevertheless, because ML has asymptotic consistency, the performance of Bayesian network structures achieved by maximizing ML is not necessarily worse than that achieved by maximizing CLL for large data. However, the error of structures learned by maximizing the ML becomes much larger for small sample sizes, and that large error degrades the classification accuracy. As a method to resolve this shortcoming, model averaging has been proposed to marginalize the class variable posterior over all structures. However, the posterior standard error of each structure in the model averaging becomes large as the sample size becomes small, which subsequently degrades the classification accuracy. The main idea of this study is to improve the classification accuracy using subbagging, a modification of bagging that uses random sampling without replacement, to reduce the posterior standard error of each structure in model averaging. Moreover, to guarantee asymptotic consistency, we use the K-best method with the ML score. The experimentally obtained results demonstrate that our proposed method provides more accurate classification than earlier Bayesian network classifier (BNC) methods and other state-of-the-art ensemble methods.
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1833 KiB  
Article
Classification of Defective Fabrics Using Capsule Networks
by Yavuz Kahraman and Alptekin Durmuşoğlu
Appl. Sci. 2022, 12(10), 5285; https://0-doi-org.brum.beds.ac.uk/10.3390/app12105285 - 23 May 2022
Cited by 6 | Viewed by 1969
Abstract
Fabric quality plays an important role in the textile sector. Fabric defects, which strongly influence fabric quality, are a factor that researchers try to minimize. Due to the limited capacity of human resources, human-based defect detection yields low performance and a significant loss of time. To overcome this limitation, computer vision-based methods have emerged, and with successive additions to these methods over time, fabric defect detection methods have begun to show nearly one hundred percent performance. Convolutional Neural Networks (CNNs) play a leading role in this success. However, CNNs lose information in the pooling process, and Capsule Networks are a useful technique for minimizing this information loss. This paper proposes using Capsule Networks, a new-generation method that represents an alternative to CNNs for deep learning tasks. The TILDA dataset is employed as source data for the training and testing phases. The model is trained for 100, 200, and 270 epochs, and model performance is evaluated using the accuracy, recall, and precision metrics. Compared to mainstream deep learning algorithms, this method offers improved accuracy, achieving a performance of 98.7% under different circumstances. The main contributions of this study are the use of Capsule Networks in the fabric defect detection domain and the significant performance obtained.
(This article belongs to the Topic Machine and Deep Learning)
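One concrete piece of the capsule machinery is the squash non-linearity, which keeps a capsule’s orientation while mapping its length into [0, 1); a minimal sketch:

```python
# Sketch only: the capsule "squash" non-linearity. Short vectors shrink toward
# zero; long vectors approach unit length; orientation is preserved.
import torch

def squash(s: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    scale = sq_norm / (1.0 + sq_norm)                # squashing factor in [0, 1)
    return scale * s / torch.sqrt(sq_norm + eps)     # rescaled capsule vector

v = squash(torch.randn(32, 10, 16))  # e.g., 10 class capsules of dimension 16
print(v.norm(dim=-1).max())          # all capsule lengths stay below 1
```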

31 pages, 586 KiB  
Review
Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications
by Ahmed Mahany, Heba Khaled, Nouh Sabri Elmitwally, Naif Aljohani and Said Ghoniemy
Appl. Sci. 2022, 12(10), 5209; https://0-doi-org.brum.beds.ac.uk/10.3390/app12105209 - 21 May 2022
Cited by 6 | Viewed by 3763
Abstract
Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negated and speculative content. Many English corpora for various domains are now annotated with negation and speculation, and the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in languages with limited resources; the use of cross-lingual models and translation from well-resourced languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of existing techniques, such as cue ambiguity and the detection of discontinuous scopes. In some NLP applications, the inclusion of a negation- and speculation-aware system improves performance, yet this aspect is still not addressed or considered an essential step.
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 4096 KiB  
Article
A Discriminative-Based Geometric Deep Learning Model for Cross Domain Recommender Systems
by John Kingsley Arthur, Conghua Zhou, Eric Appiah Mantey, Jeremiah Osei-Kwakye and Yaru Chen
Appl. Sci. 2022, 12(10), 5202; https://0-doi-org.brum.beds.ac.uk/10.3390/app12105202 - 20 May 2022
Cited by 5 | Viewed by 1819
Abstract
Recommender systems (RS) have been widely deployed in many real-world applications but usually suffer from the long-standing user/item cold-start problem. As a promising approach, cross-domain recommendation (CDR), which has attracted a surge of interest, aims to transfer the user preferences observed in a source domain to make recommendations in a target domain. Traditional machine learning and deep learning methods are not designed to learn from complex data representations such as graphs, manifolds, and 3D objects, yet current trends in data generation include exactly these complex representations. In addition, existing research does not consider the complex dimensions and locality structure of items, which contain discriminative information essential for improving the accuracy of the recommender system. Furthermore, the similarity between test samples and their neighboring training data constrained in the kernel space is not fully exploited for recommended objects belonging to the same category, so the embedded discriminative information is not captured effectively. These challenges leave the sparsity and item/user cold-start problems unsolved and hence impede the performance of the cross-domain recommender system, causing it to suggest less relevant and undistinguished items to the user. To handle these challenges, we propose a novel deep learning (DL) method, Discriminative Geometric Deep Learning (D-GDL), for cross-domain recommender systems. In the proposed D-GDL, a discriminative function based on sparse local sensitivity is introduced into the structure of the DL network: a local representation learning module (a local sensitivity-based deep convolutional belief network) effectively captures the local geometric and visual information from the structure of the recommended 3D objects, and a kernel-based module (a local sensitivity deep belief network) maps the complex structure of recommended objects into a high-dimensional feature space to achieve effective recognition. An improved kernel density estimator is created to serve as a weighting function in building the high-dimensional feature space, which makes it more resistant to geometric noise and improves computational performance. The experimental results show that the proposed D-GDL significantly outperforms state-of-the-art methods in both sparse and dense settings for cross-domain recommendation tasks.
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 5104 KiB  
Article
Novel Exploit Feature-Map-Based Detection of Adversarial Attacks
by Ali Saeed Almuflih, Dhairya Vyas, Viral V. Kapdia, Mohamed Rafik Noor Mohamed Qureshi, Karishma Mohamed Rafik Qureshi and Elaf Abdullah Makkawi
Appl. Sci. 2022, 12(10), 5161; https://0-doi-org.brum.beds.ac.uk/10.3390/app12105161 - 20 May 2022
Cited by 5 | Viewed by 1741
Abstract
In machine learning (ML), an adversarial attack (targeted or untargeted) in the presence of noise disturbs the model’s prediction. This research suggests that adversarial perturbations on images lead to noise in the features constructed by any network; as a result, adversarial attacks against image classification systems present both obstacles and opportunities for studying convolutional neural networks (CNNs). Motivated by this observation of adversarial perturbations on image pixels, we developed a novel exploit feature map that characterizes adversarial attacks through the visual description of individual objects’ feature maps. Specifically, a novel detection algorithm calculates each object’s class activation map weights and builds a combined activation map. When evaluated with different networks, such as VGGNet19 and ResNet50, in both white-box and black-box attack situations, the exploit feature map significantly improves on the state-of-the-art in adversarial resilience. Further, it clearly exposes attacks on ImageNet under various algorithms, such as the Fast Gradient Sign Method (FGSM), DeepFool, Projected Gradient Descent (PGD), and Backward Pass Differentiable Approximation (BPDA).
(This article belongs to the Topic Machine and Deep Learning)
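For context, one of the attacks listed above, FGSM, perturbs an input along the sign of the loss gradient: x_adv = x + eps * sign(grad_x loss(model(x), y)). A minimal sketch with a placeholder model:

```python
# Sketch only: generating an FGSM adversarial example in PyTorch.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()                                  # gradient w.r.t. the input
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Placeholder model standing in for VGGNet19/ResNet50:
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x_adv = fgsm(model, torch.rand(4, 3, 32, 32), torch.tensor([0, 1, 2, 3]))
```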

13 pages, 4614 KiB  
Article
Random Noise vs. State-of-the-Art Probabilistic Forecasting Methods: A Case Study on CRPS-Sum Discrimination Ability
by Alireza Koochali, Peter Schichtel, Andreas Dengel and Sheraz Ahmed
Appl. Sci. 2022, 12(10), 5104; https://0-doi-org.brum.beds.ac.uk/10.3390/app12105104 - 19 May 2022
Cited by 6 | Viewed by 1509
Abstract
The recent developments in the machine-learning domain have enabled the development of complex multivariate probabilistic forecasting models. To evaluate the predictive power of these complex methods, it is pivotal to have a precise evaluation metric that gauges their performance. Several evaluation metrics have been proposed in the past (such as the energy score, the Dawid–Sebastiani score, and the variogram score); however, these cannot reliably measure the performance of a probabilistic forecaster. Recently, CRPS-Sum has gained prominence as a reliable metric for multivariate probabilistic forecasting. This paper presents a systematic evaluation of CRPS-Sum to understand its discrimination ability. We show that the statistical properties of the target data affect the discrimination ability of CRPS-Sum. Furthermore, we highlight that the CRPS-Sum calculation overlooks the model’s performance on each dimension. These flaws can lead to an incorrect assessment of model performance. Finally, with experiments on real-world datasets, we demonstrate that the shortcomings of CRPS-Sum give a misleading indication of a method’s probabilistic forecasting performance. We illustrate that a dummy model resembling random noise can easily obtain a better CRPS-Sum than a state-of-the-art method.
(This article belongs to the Topic Machine and Deep Learning)
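For concreteness, a common sample-based CRPS estimator is CRPS(F, y) ≈ mean|X - y| - 0.5 * mean|X - X'| over forecast samples X, X', and CRPS-Sum applies it to the sum across dimensions; the sketch below shows both, including why summing first can hide per-dimension errors.

```python
# Sketch only: sample-based CRPS and CRPS-Sum.
import numpy as np

def crps_samples(samples: np.ndarray, y: float) -> float:
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

def crps_sum(samples: np.ndarray, y: np.ndarray) -> float:
    # samples: (n_samples, n_dims); y: (n_dims,). Aggregating before scoring
    # lets per-dimension errors cancel, the weakness the paper examines.
    return crps_samples(samples.sum(axis=1), y.sum())

rng = np.random.default_rng(0)
print(crps_sum(rng.normal(size=(200, 5)), np.zeros(5)))
```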

17 pages, 1100 KiB  
Article
A Triple Relation Network for Joint Entity and Relation Extraction
by Zixiang Wang, Liqun Yang, Jian Yang, Tongliang Li, Longtao He and Zhoujun Li
Electronics 2022, 11(10), 1535; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11101535 - 11 May 2022
Cited by 4 | Viewed by 1915
Abstract
Recent methods for extracting relational triples mainly focus on the overlapping problem and achieve considerable performance. Most previous approaches extract triples conditioned solely on context words, ignoring the potential relations among the extracted entities, which causes incompleteness in subsequent Knowledge Graph (KG) construction. Since relevant triples give a clue for establishing implicit connections among entities, we propose a Triple Relation Network (Trn) to jointly extract triples, especially the implicit ones. Specifically, we design an attention-based entity pair encoding module to identify all normal entity pairs directly. To construct implicit connections among the entities extracted in these triples, we utilize a triple reasoning module to calculate the relevance between two triples. Then, we select the top-K relevant triple pairs and transform them into implicit entity pairs to predict the corresponding implicit relations. We utilize a bipartite matching objective to match normal triples and implicit triples with their corresponding labels. Extensive experiments demonstrate the effectiveness of the proposed method on two public benchmarks, on which our model significantly outperforms previous strong baselines.
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 3357 KiB  
Article
A Novel Two-Stage Deep Learning Structure for Network Flow Anomaly Detection
by Ming-Tsung Kao, Dian-Ye Sung, Shang-Juh Kao and Fu-Min Chang
Electronics 2022, 11(10), 1531; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11101531 - 11 May 2022
Cited by 9 | Viewed by 1906
Abstract
Unknown cyber-attacks appear constantly. Several anomaly detection techniques based on semi-supervised learning have been proposed to detect them; among these, the Denoising Auto-Encoder (DAE) scheme performs better than others in accuracy but is not good enough in precision. This paper proposes a novel two-stage deep learning structure for network flow anomaly detection that combines a Gated Recurrent Unit (GRU) model and a DAE. By using supervised anomaly detection with a selection mechanism to assist semi-supervised anomaly detection, the precision and accuracy of the anomaly detection system are improved. In the proposed structure, we first use the GRU model to analyze the network flow and then take the outcome of the Softmax function as a confidence score. When the score is greater than or equal to a predefined confidence threshold, the GRU model outputs the flow as a positive result, whether the flow is classified as normal or abnormal. When the score is less than the confidence threshold, the GRU model outputs the flow as a negative result and passes the flow to the DAE model for classification. The DAE then determines a reconstruction error threshold by learning the pattern of normal flows; accordingly, the flow is classified as normal or abnormal depending on whether its error is under or over the reconstruction error threshold. A comparative experiment was performed using the NSL-KDD dataset as a benchmark. The results revealed that the precision of the proposed scheme is 0.83% better than that of the DAE, and the accuracy of the proposed approach is 90.21%, which is better than Random Forest, Naïve Bayes, a One-Dimensional Convolutional Neural Network, a two-stage Auto-Encoder, etc. In addition, the proposed approach was also applied in a software-defined network (SDN) environment, where it significantly improved precision and F-measure.
(This article belongs to the Topic Machine and Deep Learning)
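The routing logic of the two stages can be summarized in a few lines; in this minimal sketch, the GRU classifier, the DAE, and both thresholds are placeholders rather than the paper’s trained components.

```python
# Sketch only: two-stage routing between a supervised GRU and a DAE.
import numpy as np

CONFIDENCE_T = 0.9  # predefined softmax confidence threshold
RECON_T = 0.05      # reconstruction error threshold learned from normal flows

def detect(flow, gru_softmax, dae_reconstruct):
    probs = gru_softmax(flow)                   # stage 1: supervised GRU classifier
    if probs.max() >= CONFIDENCE_T:             # confident: accept the GRU's verdict
        return "abnormal" if probs.argmax() == 1 else "normal"
    err = np.mean((dae_reconstruct(flow) - flow) ** 2)  # stage 2: semi-supervised DAE
    return "abnormal" if err > RECON_T else "normal"

# A low-confidence flow falls through to the DAE stage:
print(detect(np.zeros(10), lambda f: np.array([0.6, 0.4]), lambda f: f))  # "normal"
```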

16 pages, 9199 KiB  
Article
STAGCN: Spatial–Temporal Attention Graph Convolution Network for Traffic Forecasting
by Yafeng Gu and Li Deng
Mathematics 2022, 10(9), 1599; https://0-doi-org.brum.beds.ac.uk/10.3390/math10091599 - 08 May 2022
Cited by 5 | Viewed by 4566
Abstract
Traffic forecasting plays an important role in intelligent transportation systems. However, the prediction task is highly challenging due to the mixture of global and local spatiotemporal dependencies in traffic data. Existing graph neural networks (GNNs) typically capture spatial dependencies with a predefined or learnable static graph structure, ignoring the hidden dynamic patterns in traffic networks. Meanwhile, most recurrent neural networks (RNNs) and convolutional neural networks (CNNs) cannot effectively capture temporal correlations, especially long-term temporal dependencies. In this paper, we propose a spatial–temporal attention graph convolution network (STAGCN), which acquires a static graph and a dynamic graph from data without any prior knowledge. The static graph models global spatial adaptability, while the dynamic graph captures local dynamics in the traffic network. A gated temporal attention module is further introduced for long-term temporal dependencies, in which a causal-trend attention mechanism is proposed to increase awareness of causality and local trends in time series. Extensive experiments on four real-world traffic flow datasets demonstrate that STAGCN achieves an outstanding prediction accuracy improvement over existing solutions.
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1861 KiB  
Article
A Novel Anti-Risk Method for Portfolio Trading Using Deep Reinforcement Learning
by Han Yue, Jiapeng Liu, Dongmei Tian and Qin Zhang
Electronics 2022, 11(9), 1506; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091506 - 07 May 2022
Cited by 2 | Viewed by 2458
Abstract
In the past decade, the application of deep reinforcement learning (DRL) to portfolio management has attracted extensive attention. However, most classical RL algorithms do not consider the exogeneity and noise of financial time series data, which may lead to treacherous trading decisions. To address this issue, we propose a novel anti-risk portfolio trading method based on DRL. It consists of a stacked sparse denoising autoencoder (SSDAE) network and an actor–critic based reinforcement learning (RL) agent. The SSDAE is first trained off-line, and the decoder is then used for on-line feature extraction in each state; the SSDAE network thus provides noise-resistant representations of the financial data. The actor–critic algorithm we use is advantage actor–critic (A2C), which consists of two networks: the actor network learns and implements an investment policy, which is then evaluated by the critic network to determine the best action plan, continuously redistributing the portfolio’s assets with the Sharpe ratio as the optimization function. Through extensive experiments, the results show that our proposed method is effective and superior to the Dow Jones Industrial Average index (DJIA), several variants of our proposed method, and a state-of-the-art (SOTA) method.
(This article belongs to the Topic Machine and Deep Learning)
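The optimization target is the Sharpe ratio; below is a minimal sketch of an annualized version computed from daily returns, assuming a zero risk-free rate (an assumption for illustration, not necessarily the paper’s exact formulation).

```python
# Sketch only: annualized Sharpe ratio from a series of daily portfolio returns.
import numpy as np

def sharpe_ratio(daily_returns: np.ndarray, periods_per_year: int = 252) -> float:
    excess = daily_returns  # subtract a risk-free rate here if one is assumed
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

rng = np.random.default_rng(0)
print(sharpe_ratio(rng.normal(5e-4, 1e-2, size=252)))  # one simulated trading year
```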

14 pages, 5680 KiB  
Article
Multi-Scale Upsampling GAN Based Hole-Filling Framework for High-Quality 3D Cultural Heritage Artifacts
by Yong Ren, Tong Chu, Yifei Jiao, Mingquan Zhou, Guohua Geng, Kang Li and Xin Cao
Appl. Sci. 2022, 12(9), 4581; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094581 - 30 Apr 2022
Cited by 3 | Viewed by 2270
Abstract
With the rapid development of 3D scanners, cultural heritage artifacts can be stored as point clouds and displayed through the Internet. However, due to natural and human factors, many cultural relics have surface damage when excavated, and the holes caused by this damage persist in the generated point cloud models. This work proposes a multi-scale upsampling GAN (MU-GAN) based framework for completing these holes. Firstly, a 3D mesh model is reconstructed from the original point cloud, and a method for detecting holes is presented. Secondly, a point cloud patch containing the hole regions is extracted from the point cloud. The patch is then input into the MU-GAN to generate a high-quality dense point cloud. Finally, the empty areas of the original point cloud are filled with the generated dense point cloud patches. A series of experiments are conducted on real scan data to demonstrate that the proposed framework can fill the holes of 3D heritage models with fine-grained details. We hope that our work can provide a useful tool for cultural heritage protection.
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 2936 KiB  
Article
Visual and Phonological Feature Enhanced Siamese BERT for Chinese Spelling Error Correction
by Yujia Liu, Hongliang Guo, Shuai Wang and Tiejun Wang
Appl. Sci. 2022, 12(9), 4578; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094578 - 30 Apr 2022
Viewed by 1735
Abstract
Chinese Spelling Check (CSC) aims to detect and correct spelling errors in Chinese. Most CSC models rely on human-defined confusion sets to narrow the search space and therefore fail to resolve errors outside the confusion set. Moreover, most spelling errors in current benchmark datasets are character pairs with similar pronunciations; errors with similar shapes and errors that are visually and phonologically irrelevant are not considered. Furthermore, the widely used, automatically generated training data in CSC tasks lead to label leakage and unfair comparison between different methods. In this work, we propose a feature (visual and phonological) enhanced siamese BERT to (1) correct spelling errors without using confusion sets; (2) integrate phonological and visual features for CSC through a glyph graph; and (3) improve performance on unseen spelling errors. To evaluate CSC methods fairly and comprehensively, we build a large-scale CSC dataset in which the number of samples of each error type is the same. The experimental results show that the proposed approach achieves better performance than previous state-of-the-art methods on three benchmark datasets and on the new error-type balanced dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)

27 pages, 4060 KiB  
Review
A Systematic Literature Review of Learning-Based Traffic Accident Prediction Models Based on Heterogeneous Sources
by Pablo Marcillo, Ángel Leonardo Valdivieso Caraguay and Myriam Hernández-Álvarez
Appl. Sci. 2022, 12(9), 4529; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094529 - 29 Apr 2022
Cited by 7 | Viewed by 4488
Abstract
Statistics affirm that almost half of the people killed in traffic accidents were vulnerable road users, such as pedestrians, cyclists, and motorcyclists. Despite the efforts in technological infrastructure and traffic policies, the number of victims remains high and beyond expectation. Recent research establishes that determining the causes of traffic accidents is not an easy task because their occurrence depends on one or many factors. Traffic accidents can be caused by, for instance, mechanical problems, adverse weather conditions, mental and physical fatigue, negligence, or potholes in the road. At present, the use of learning-based prediction models as mechanisms to reduce the number of traffic accidents is a reality, and the success of such prediction models depends mainly on how data from different sources can be integrated and correlated. This study reports the models, algorithms, data sources, attributes, data collection services, driving simulators, evaluation metrics, percentages of data for training/validation/testing, and other aspects of the reviewed works. We found that the performance of a prediction model depends mainly on the quality of its data and a proper data split configuration. The use of real data predominates over data generated by simulators. This work made it possible to determine that future research should aim at developing traffic accident prediction models that use deep learning. It must also focus on exploring and using data sources such as driver data and light conditions, and on solving issues of this type of solution, such as high dimensionality and information imbalance in the data. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 3643 KiB  
Article
Credit Card Fraud Detection Using a New Hybrid Machine Learning Architecture
by Esraa Faisal Malik, Khai Wah Khaw, Bahari Belaton, Wai Peng Wong and XinYing Chew
Mathematics 2022, 10(9), 1480; https://0-doi-org.brum.beds.ac.uk/10.3390/math10091480 - 28 Apr 2022
Cited by 32 | Viewed by 6799
Abstract
The negative effect of financial crimes on financial institutions has grown dramatically over the years. To detect crimes such as credit card fraud, several single and hybrid machine learning approaches have been used. However, these approaches have a significant limitation: different hybrid algorithms have not been further investigated for a given dataset. This research proposes and investigates seven hybrid machine learning models to detect fraudulent activities on a real-world dataset. The developed hybrid models consisted of two phases: state-of-the-art machine learning algorithms were first used to detect credit card fraud; then, hybrid methods were constructed based on the best single algorithm from the first phase. Our findings indicated that the hybrid model AdaBoost + LGBM is the champion model, as it displayed the highest performance. Future studies should focus on studying different types of hybridization and algorithms in the credit card domain. Full article
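The abstract does not spell out how AdaBoost and LightGBM are combined; one plausible reading, sketched below as an assumption rather than the authors' architecture, is to boost a LightGBM learner inside AdaBoost. The `estimator` argument assumes scikit-learn 1.2 or newer, and all hyper-parameters are illustrative.

```python
# A hypothetical "AdaBoost + LGBM" hybrid: LightGBM as the boosted base learner.
from sklearn.ensemble import AdaBoostClassifier
from lightgbm import LGBMClassifier

hybrid = AdaBoostClassifier(
    estimator=LGBMClassifier(n_estimators=50, num_leaves=15),  # base learner
    n_estimators=10,
    learning_rate=0.5,
)
# Hypothetical usage with an assumed fraud dataset split:
# hybrid.fit(X_train, y_train); y_pred = hybrid.predict(X_test)
```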
(This article belongs to the Topic Machine and Deep Learning)

26 pages, 1336 KiB  
Article
Belief Entropy Tree and Random Forest: Learning from Data with Continuous Attributes and Evidential Labels
by Kangkai Gao, Yong Wang and Liyao Ma
Entropy 2022, 24(5), 605; https://0-doi-org.brum.beds.ac.uk/10.3390/e24050605 - 26 Apr 2022
Cited by 6 | Viewed by 2405
Abstract
As well-known machine learning methods, decision trees are widely applied to classification and recognition problems. In this paper, with the uncertainty of labels handled by belief functions, a new decision tree method based on belief entropy is proposed and then extended to a random forest. Thanks to a Gaussian mixture model, this tree method is able to deal with continuous attribute values directly, without a discretization pretreatment. Specifically, the tree method adopts belief entropy, an uncertainty measure based on the basic belief assignment, as a new attribute-selection tool. To improve the classification performance, we construct a random forest from the basic trees and discuss different prediction combination strategies. Numerical experiments on UCI machine learning datasets indicate the good classification accuracy of the proposed method in different situations, especially on data with high uncertainty. Full article
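For readers unfamiliar with the attribute-selection measure: belief entropy is commonly the Deng entropy of a basic belief assignment m, E_d(m) = -Σ_A m(A) log2(m(A) / (2^|A| − 1)). The sketch below computes it for a toy assignment; the example masses are made up.

```python
# Minimal sketch of belief (Deng) entropy over a basic belief assignment (BBA).
import math

def belief_entropy(bba):
    """bba maps focal elements (frozensets) to masses that sum to 1."""
    return -sum(m * math.log2(m / (2 ** len(A) - 1))
                for A, m in bba.items() if m > 0)

# Hypothetical example: mass split between a singleton and an uncertain pair.
m = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
print(belief_entropy(m))
```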
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 2573 KiB  
Article
Research on Modulation Signal Recognition Based on CLDNN Network
by Binghang Zou, Xiaodong Zeng and Faquan Wang
Electronics 2022, 11(9), 1379; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091379 - 26 Apr 2022
Cited by 11 | Viewed by 2003
Abstract
Modulated signal recognition and classification occupy an important position in electronic information warfare, intelligent wireless communication, and fast modulation and demodulation. To address the shortcomings of existing recognition methods, such as heavy manual involvement, few recognized signal types, and a low recognition rate under a low signal-to-noise ratio, we propose an attention-mechanism short-link convolution long short-term memory deep neural network (ASCLDNN) recognition model. The network is optimized for modulated signal recognition and incorporates an attention mechanism that achieves higher accuracy by adding weights to important signals. The experimental results show that ASCLDNN can recognize 11 signal modulations with high accuracy at a low signal-to-noise ratio and without confusing specific signals. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 7358 KiB  
Communication
Learning Local Distribution for Extremely Efficient Single-Image Super-Resolution
by Wei Wu, Wen Xu, Bolun Zheng, Aiai Huang and Chenggang Yan
Electronics 2022, 11(9), 1348; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091348 - 24 Apr 2022
Cited by 2 | Viewed by 1351
Abstract
Achieving a balance between efficiency and performance is a key problem for convolutional neural network (CNN)-based single-image super-resolution (SISR) algorithms. Existing methods tend to directly output high-resolution (HR) pixels or residuals to reconstruct the HR image and focus much of their attention on designing powerful CNN backbones. However, this reconstruction approach requires the CNN backbone to fit the mapping function from low-resolution (LR) pixels to HR pixels well, which certainly holds these methods back from achieving extreme efficiency and from working in embedded environments. In this work, we propose a novel distribution learning architecture that estimates the local distribution and reconstructs HR pixels by sampling the local distribution with the corresponding 2D coordinates. We also improve the backbone structure to better support the proposed distribution learning architecture. The experimental results demonstrate that the proposed method achieves state-of-the-art performance for extremely efficient SISR and exhibits a good balance between efficiency and performance. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 2105 KiB  
Article
Investigating How Reproducibility and Geometrical Representation in UMAP Dimensionality Reduction Impact the Stratification of Breast Cancer Tumors
by Jordy Bollon, Michela Assale, Andrea Cina, Stefano Marangoni, Matteo Calabrese, Chiara Beatrice Salvemini, Jean Marc Christille, Stefano Gustincich and Andrea Cavalli
Appl. Sci. 2022, 12(9), 4247; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094247 - 22 Apr 2022
Cited by 3 | Viewed by 2022
Abstract
Advances in next-generation sequencing have provided high-dimensional RNA-seq datasets, allowing the stratification of some tumor patients based on their transcriptomic profiles. Machine learning methods have been used to reduce and cluster high-dimensional data. Recently, uniform manifold approximation and projection (UMAP) was applied to project genomic datasets into a low-dimensional Euclidean latent space. Here, we evaluated how different representations of the UMAP embedding can impact the analysis of breast cancer (BC) stratification. We projected BC RNA-seq data onto Euclidean, spherical, and hyperbolic spaces, and stratified BC patients via clustering algorithms. We also proposed a pipeline to yield more reproducible clustering outputs. The results show how the selection of the latent space can affect downstream stratification results and suggest that exploring different geometrical representations is advisable for probing the data structure and the relationships between samples. Full article
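A minimal sketch of the kind of reproducibility check the pipeline addresses: embed the same matrix with different UMAP seeds, cluster each embedding, and compare the partitions with the adjusted Rand index. The random data, cluster count, and seed set are assumptions, not the paper's setup.

```python
# Sketch: quantify run-to-run stability of UMAP + k-means clustering.
import numpy as np
import umap  # umap-learn package
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X = np.random.rand(200, 50)  # stand-in for an RNA-seq expression matrix
labels = []
for seed in (0, 1, 2):
    emb = umap.UMAP(n_components=2, random_state=seed).fit_transform(X)
    labels.append(KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(emb))

# Pairwise agreement between runs; low ARI flags unstable stratification.
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(i, j, adjusted_rand_score(labels[i], labels[j]))
```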
(This article belongs to the Topic Machine and Deep Learning)

29 pages, 6517 KiB  
Article
Semi-Supervised Cross-Subject Emotion Recognition Based on Stacked Denoising Autoencoder Architecture Using a Fusion of Multi-Modal Physiological Signals
by Junhai Luo, Yuxin Tian, Hang Yu, Yu Chen and Man Wu
Entropy 2022, 24(5), 577; https://0-doi-org.brum.beds.ac.uk/10.3390/e24050577 - 20 Apr 2022
Cited by 7 | Viewed by 2242
Abstract
In recent decades, emotion recognition has received considerable attention. As enthusiasm has shifted to physiological patterns, a wide range of elaborate physiological features for emotion data have emerged and been combined with various classification models to detect emotional states. To circumvent the labor of hand-designing features, we propose to acquire affective and robust representations automatically through a Stacked Denoising Autoencoder (SDA) architecture with unsupervised pre-training, followed by supervised fine-tuning. In this paper, we compare the performance of different features and models through three binary classification tasks based on the Valence-Arousal-Dominance (VAD) affection model. Decision fusion and feature fusion of electroencephalogram (EEG) and peripheral signals are performed on hand-engineered features; data-level fusion is performed for the deep-learning methods. It turns out that the fused data perform better than either modality alone. To take advantage of deep-learning algorithms, we augment the original data and feed them directly into our training model. We use two deep architectures and another generative stacked semi-supervised architecture as references for comparison to test the method’s practical effects. The results reveal that our scheme slightly outperforms the other three deep feature extractors and surpasses the state of the art of hand-engineered features. Full article
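For intuition, here is a compact sketch of one denoising-autoencoder layer of the kind stacked in an SDA: corrupt the input, then reconstruct the clean signal. The layer sizes and the Gaussian corruption level are assumptions.

```python
# Sketch of a single denoising-autoencoder layer (PyTorch).
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, in_dim=128, hid_dim=64, noise_std=0.1):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.decoder = nn.Linear(hid_dim, in_dim)

    def forward(self, x):
        corrupted = x + self.noise_std * torch.randn_like(x)  # inject noise
        return self.decoder(self.encoder(corrupted))          # reconstruct

# Unsupervised pre-training step: minimise reconstruction error vs. clean x.
ae = DenoisingAE()
x = torch.randn(32, 128)
loss = nn.functional.mse_loss(ae(x), x)
loss.backward()
```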
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 11588 KiB  
Article
Arabic Language Opinion Mining Based on Long Short-Term Memory (LSTM)
by Arief Setyanto, Arif Laksito, Fawaz Alarfaj, Mohammed Alreshoodi, Kusrini, Irwan Oyong, Mardhiya Hayaty, Abdullah Alomair, Naif Almusallam and Lilis Kurniasari
Appl. Sci. 2022, 12(9), 4140; https://0-doi-org.brum.beds.ac.uk/10.3390/app12094140 - 20 Apr 2022
Cited by 14 | Viewed by 3109
Abstract
Arabic is one of the official languages recognized by the United Nations (UN) and is widely used in the Middle East and in parts of Asia, Africa, and other regions. Social media activity currently dominates textual communication on the Internet and potentially represents people’s views about specific issues. Opinion mining is an important task for understanding the polarity of public opinion on an issue, and understanding public opinion leads to better decisions in many fields, such as public services and business. Language background plays a vital role in understanding opinion polarity; variation arises not only from vocabulary but also from cultural background. A sentence is a sequential signal, so word order contributes significantly to the meaning of a text. A recurrent neural network (RNN) is a deep learning architecture in which this sequence is taken into account. Long short-term memory (LSTM) is an implementation of the RNN with particular gates to keep or ignore specific word signals during a sequence of inputs. Text is unstructured data and cannot be processed by a machine unless an algorithm transforms it into a machine-readable representation, i.e., a vector of numerical values. Transformation algorithms range from the Term Frequency–Inverse Document Frequency (TF-IDF) transform to advanced word embeddings such as GloVe, word2vec, BERT, and fastText. This research experimented with these algorithms to perform vector transformation of an Arabic text dataset. This study implements and compares the GloVe and fastText word embedding algorithms with long short-term memory (LSTM) networks in single-, double-, and triple-layer architectures, and compares their accuracy for opinion mining on an Arabic dataset. It evaluates the proposed algorithms on the ASAD dataset of 55,000 annotated tweets in three classes; the dataset was augmented to achieve equal proportions of positive, negative, and neutral classes. According to the evaluation results, the triple-layer LSTM with fastText word embedding achieved the best testing accuracy, at 90.9%, surpassing all other experimental scenarios. Full article
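As an illustration of the best-performing configuration, a triple-layer LSTM over pretrained word vectors, here is a minimal PyTorch sketch; the vocabulary size, hidden width, class count, and the step of loading fastText vectors into the embedding matrix are assumptions.

```python
# Sketch of a triple-layer LSTM sentiment classifier over word embeddings.
import torch
import torch.nn as nn

class TripleLSTM(nn.Module):
    def __init__(self, vocab_size=50000, emb_dim=300, hidden=128, classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)  # load fastText vectors here
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=3, batch_first=True)
        self.fc = nn.Linear(hidden, classes)  # positive / negative / neutral

    def forward(self, token_ids):
        _, (h, _) = self.lstm(self.embed(token_ids))
        return self.fc(h[-1])  # logits from the top layer's final hidden state

logits = TripleLSTM()(torch.randint(0, 50000, (8, 40)))  # batch of 8 tweets
```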
(This article belongs to the Topic Machine and Deep Learning)

10 pages, 11163 KiB  
Article
BTENet: Back-Fat Thickness Estimation Network for Automated Grading of the Korean Commercial Pig
by Hyo-Jun Lee, Jong-Hyeon Baek, Young-Kuk Kim, Jun Heon Lee, Myungjae Lee, Wooju Park, Seung Hwan Lee and Yeong Jun Koh
Electronics 2022, 11(9), 1296; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11091296 - 19 Apr 2022
Cited by 1 | Viewed by 2135
Abstract
For the automated grading of the Korean commercial pig, we propose a deep neural network called the back-fat thickness estimation network (BTENet). The proposed BTENet contains segmentation and thickness estimation modules that simultaneously perform back-fat area segmentation and thickness estimation. The segmentation module estimates a back-fat area mask from an input image. From both the input image and the estimated back-fat mask, the thickness estimation module predicts the real back-fat thickness in millimeters by effectively analyzing the back-fat area. To train BTENet, we also build a large-scale pig image dataset called PigBT. Experimental results validate that the proposed BTENet achieves reliable thickness estimation (Pearson’s correlation coefficient: 0.915; mean absolute error: 1.275 mm; mean absolute percentage error: 6.4%). We therefore expect BTENet to accelerate a new phase for automated grading systems for the Korean commercial pig. Full article
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 3683 KiB  
Article
Multiview Clustering of Adaptive Sparse Representation Based on Coupled P Systems
by Xiaoling Zhang and Xiyu Liu
Entropy 2022, 24(4), 568; https://0-doi-org.brum.beds.ac.uk/10.3390/e24040568 - 18 Apr 2022
Cited by 3 | Viewed by 2314
Abstract
Multiview clustering (MVC) has been a significant technique for tackling data mining problems. Most existing studies on this topic adopt a fixed number of neighbors when constructing the similarity matrix of each view, as in single-view clustering. However, this may reduce the clustering effect due to the diversity of multiview data sources. Moreover, most MVC methods utilize iterative optimization to obtain clustering results, which consumes a significant amount of time. Therefore, this paper proposes a multiview clustering of adaptive sparse representation based on coupled P systems (MVCS-CP) that requires no iteration. The whole algorithm runs within the coupled P system. Firstly, a parameter-free natural neighbor search algorithm automatically determines the number of neighbors of each view. In turn, manifold learning and sparse representation are employed to construct the similarity matrix, which preserves the internal geometry of the views. Next, a soft thresholding operator is introduced to form the unified graph from which the clustering results are obtained. The experimental results on nine real datasets indicate that MVCS-CP outperforms other state-of-the-art comparison algorithms. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 5813 KiB  
Article
Fault Diagnosis of Induction Motors with Imbalanced Data Using Deep Convolutional Generative Adversarial Network
by Hong-Chan Chang, Yi-Che Wang, Yu-Yang Shih and Cheng-Chien Kuo
Appl. Sci. 2022, 12(8), 4080; https://0-doi-org.brum.beds.ac.uk/10.3390/app12084080 - 18 Apr 2022
Cited by 10 | Viewed by 2229
Abstract
A homemade defective model of an induction motor was created by the laboratory team to acquire the vibration acceleration signals of five operating states of an induction motor under different loads. Two major learning models, namely a deep convolutional generative adversarial network (DCGAN) and a convolutional neural network, were applied to fault diagnosis of the induction motor under the problem of an imbalanced training dataset. Two cases were studied and analyzed: a sufficient and balanced training dataset, and insufficient and imbalanced training data. When the training datasets were adequate and balanced, time–frequency analysis was advantageous for fault diagnosis at different loads, with the diagnostic accuracy reaching 95.06% and 96.38%. For the insufficient and imbalanced training dataset, regardless of the signal preprocessing method, the more imbalanced the training dataset, the lower the diagnostic accuracy on the testing dataset. Samples generated by the DCGAN were found, through comparison, to exhibit 80% similarity with the actual data. By oversampling the imbalanced dataset, the DCGAN achieved a 90% diagnostic accuracy, close to the accuracy achieved using a balanced dataset. Among all oversampling techniques, the pro-balanced method yielded the optimal result. The diagnostic accuracy reached 85% in the cross-load test, indicating that the generated data had successfully learned the different fault features, which validates the DCGAN’s ability to learn parts of the input signals. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 1627 KiB  
Article
MSPNet: Multi-Scale Strip Pooling Network for Road Extraction from Remote Sensing Images
by Shenming Qu, Huafei Zhou, Bo Zhang and Shengbin Liang
Appl. Sci. 2022, 12(8), 4068; https://0-doi-org.brum.beds.ac.uk/10.3390/app12084068 - 18 Apr 2022
Cited by 2 | Viewed by 1792
Abstract
Extracting roads from remote sensing images can support a range of geo-information applications. However, it is challenging due to factors such as the complex distribution of ground objects and occlusion by buildings, trees, shadows, etc. Pixel-wise classification often fails to predict road connectivity and thus produces fragmented road segments. In this paper, we propose a multi-scale strip pooling network (MSPNet) to learn the linear features of roads. Motivated by the fact that strip pooling is better aligned with the long-span, narrow shape of roads, we develop a multi-scale strip pooling (MSP) module that utilizes pooling layers with long but narrow kernel shapes to capture multi-scale long-range context along the horizontal and vertical directions. The proposed MSP module focuses on establishing relationships along the road region to guarantee road connectivity. Considering the complex distribution of ground objects, spatial pyramid pooling is applied to enhance the learning of complex features in different sub-regions. In addition, to alleviate the problem caused by the imbalanced distribution of road and non-road pixels, we use binary cross-entropy and dice-coefficient loss functions jointly to train our proposed deep learning model, and we perform ablation experiments to adjust the loss contributions to suit the road extraction task. Comparative experiments on the popular DeepGlobe benchmark demonstrate that our proposed MSPNet establishes new competitive results in both IoU and F1-score. Full article
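To make the long-but-narrow pooling idea tangible, here is a minimal PyTorch sketch in the spirit of strip pooling: average the feature map along full rows and full columns, refine each directional context with a 1D-shaped convolution, and gate the input with their fused map. It is a generic illustration under an assumed channel count and fusion choice, not MSPNet's exact module.

```python
# Sketch of a strip-pooling block that captures horizontal/vertical context.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0))
        self.conv_w = nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1))

    def forward(self, x):
        n, c, h, w = x.shape
        ph = F.adaptive_avg_pool2d(x, (h, 1))      # vertical strip: h x 1
        pw = F.adaptive_avg_pool2d(x, (1, w))      # horizontal strip: 1 x w
        ph = self.conv_h(ph).expand(-1, -1, h, w)  # broadcast back to h x w
        pw = self.conv_w(pw).expand(-1, -1, h, w)
        return x * torch.sigmoid(ph + pw)          # gate input with fused context

out = StripPooling()(torch.randn(2, 64, 32, 32))
```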
(This article belongs to the Topic Machine and Deep Learning)

10 pages, 2225 KiB  
Article
Amodal Segmentation Just Like Doing a Jigsaw
by Xunli Zeng, Xiaoli Liu and Jianqin Yin
Appl. Sci. 2022, 12(8), 4061; https://0-doi-org.brum.beds.ac.uk/10.3390/app12084061 - 17 Apr 2022
Cited by 3 | Viewed by 1843
Abstract
Amodal segmentation is a new direction of instance segmentation that considers the segmentation of both the visible and occluded parts of an instance. The existing state-of-the-art method uses multi-task branches to predict the amodal part and the visible part separately and subtracts the visible part from the amodal part to obtain the occluded part. However, since the amodal part contains the visible information, this separated prediction scheme generates duplicate information. Unlike this method, we propose an amodal segmentation method based on the idea of a jigsaw. It uses multi-task branches to predict the two naturally decoupled parts, visible and occluded, which is like obtaining two matching jigsaw pieces; the two pieces are then put together to obtain the amodal part. This lets each branch focus on modeling its own part of the object. Furthermore, we believe that occlusion relationships in the real world follow certain rules, a kind of occlusion context information; the jigsaw method can better model occlusion relationships and exploit this occlusion context, which is important for amodal segmentation. Experiments on two widely used amodally annotated datasets prove that our method exceeds the existing state-of-the-art methods. In particular, on the amodal mask metric, our method outperforms the baseline by 5 percentage points on the COCOA cls dataset and by 2 percentage points on the KINS dataset. The source code of this work will be made public soon. Full article
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 1246 KiB  
Article
Automatic Classification of Normal–Abnormal Heart Sounds Using Convolution Neural Network and Long-Short Term Memory
by Ding Chen, Weipeng Xuan, Yexing Gu, Fuhai Liu, Jinkai Chen, Shudong Xia, Hao Jin, Shurong Dong and Jikui Luo
Electronics 2022, 11(8), 1246; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11081246 - 14 Apr 2022
Cited by 11 | Viewed by 2402
Abstract
The phonocardiogram (PCG) is an important analysis method for the diagnosis of cardiovascular disease, which is usually performed by experienced medical experts. Due to the high ratio of patients to doctors, there is a pressing need for a real-time automated phonocardiogram classification system for the diagnosis of cardiovascular disease. This paper proposes a deep neural-network structure based on a one-dimensional convolutional neural network (1D-CNN) and a long short-term memory network (LSTM), which can directly classify unsegmented PCG to identify abnormal signals. The PCG data were filtered and fed into the model for analysis. A total of 3099 heart-sound recordings were used, while heart-sound data from another 100 patients, collected by our group and diagnosed by doctors, were used to test and verify the model. Results show that the CNN-LSTM model provided a good overall balanced accuracy of 0.86 ± 0.01, with a sensitivity of 0.87 ± 0.02 and a specificity of 0.89 ± 0.02. The F1-score was 0.91 ± 0.01, and the receiver-operating characteristic (ROC) plot produced an area under the curve (AUC) value of 0.92 ± 0.01. The sensitivity, specificity, and accuracy on the 100 patients’ data were 0.83 ± 0.02, 0.80 ± 0.02 and 0.85 ± 0.03, respectively. The proposed model requires neither feature engineering nor heart-sound segmentation, performs reliably in classifying abnormal PCG, and is fast and suitable for real-time diagnosis applications. Full article
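The abstract names the two building blocks but not their exact arrangement; a common pattern, sketched below as an assumption, is a 1D-CNN front end that downsamples the waveform into a feature sequence consumed by an LSTM whose final state feeds a two-class head. All layer sizes and the input length are illustrative.

```python
# Sketch of a 1D-CNN + LSTM classifier for unsegmented heart-sound waveforms.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=15, stride=2, padding=7), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.fc = nn.Linear(64, 2)  # normal vs. abnormal heart sound

    def forward(self, wave):                    # wave: (batch, 1, samples)
        feats = self.cnn(wave).transpose(1, 2)  # -> (batch, time, 32)
        _, (h, _) = self.lstm(feats)
        return self.fc(h[-1])

logits = CNNLSTM()(torch.randn(4, 1, 2000))  # hypothetical 4-recording batch
```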
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 1778 KiB  
Article
Partial Atrous Cascade R-CNN
by Mofan Cheng, Cien Fan, Liqiong Chen, Lian Zou, Jiale Wang, Yifeng Liu and Hu Yu
Electronics 2022, 11(8), 1241; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11081241 - 14 Apr 2022
Viewed by 1334
Abstract
Deep-learning-based segmentation methods have achieved excellent results. As two main tasks in computer vision, instance segmentation and semantic segmentation are closely related and mutually beneficial, and spatial context information from the semantic features can improve the accuracy of instance segmentation. Inspired by this, we propose a novel instance segmentation framework named partial atrous cascade R-CNN (PAC), which effectively improves the accuracy of the segmentation boundary. The proposed network innovates in two aspects: (1) A semantic branch with a partial atrous spatial pyramid extraction (PASPE) module is proposed in this paper. The module consists of atrous convolution layers with multiple dilation rates; by expanding the receptive field of the convolutional layers, multi-scale semantic features are greatly enriched. Experiments show that the new branch obtains more accurate segmentation contours. (2) The proposed mask quality (MQ) module scores the intersection over union (IoU) between the predicted mask and the ground-truth mask. Benefiting from the modified mask quality score, the quality of the segmentation results can be judged reliably. Our proposed network is trained and tested on the MS COCO dataset. Compared with the benchmark, it brings consistent and noticeable improvements when using the same backbone. Full article
(This article belongs to the Topic Machine and Deep Learning)

0 pages, 9296 KiB  
Article
Detecting Deepfake Voice Using Explainable Deep Learning Techniques
by Suk-Young Lim, Dong-Kyu Chae and Sang-Chul Lee
Appl. Sci. 2022, 12(8), 3926; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083926 - 13 Apr 2022
Cited by 12 | Viewed by 8418
Abstract
Fake media, generated by methods such as deepfakes, have become indistinguishable from real media, but their detection has not improved at the same pace. Furthermore, the absence of interpretability on deepfake detection models makes their reliability questionable. In this paper, we present a human perception level of interpretability for deepfake audio detection. Based on their characteristics, we implement several explainable artificial intelligence (XAI) methods used for image classification on an audio-related task. In addition, by examining the human cognitive process of XAI on image classification, we suggest the use of a corresponding data format for providing interpretability. Using this novel concept, a fresh interpretation using attribution scores can be provided. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 910 KiB  
Article
Express Construction for GANs from Latent Representation to Data Distribution
by Minghui Liu, Jiali Deng, Meiyi Yang, Xuan Cheng, Tianshu Xie, Pan Deng, Xiaomin Wang and Ming Liu
Appl. Sci. 2022, 12(8), 3910; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083910 - 13 Apr 2022
Cited by 1 | Viewed by 1593
Abstract
Generative Adversarial Networks (GANs) are powerful generative models for numerous tasks and datasets. However, most of the existing models suffer from mode collapse. The most recent research indicates the reason: the optimal transportation map from random noise to the data distribution is discontinuous, but deep neural networks (DNNs) can only approximate continuous maps. The latent representation is a better raw material than random noise for constructing a transportation map to the data distribution: because it is a low-dimensional mapping related to the data distribution, the construction procedure is more like an expansion than starting over from scratch, and it also lets us search for more transportation maps with smoother transformations. Thus, in this paper we propose a new training methodology for GANs, named Express Construction, that searches for more transportation maps and speeds up training. The key idea is to train GANs in two independent phases that successively yield the latent representation and the data distribution. To this end, an Auto-Encoder is trained to map the real data into the latent space, and two pairs of generators and discriminators are used to produce them. To the best of our knowledge, we are the first to decompose the training procedure of GAN models into two simpler phases, thereby tackling the mode collapse problem without much additional computational cost. We also provide theoretical steps toward understanding the training dynamics of this procedure and prove our assumptions. No extra hyper-parameters are used in the proposed method, which indicates that Express Construction can be used to train any GAN model. Extensive experiments are conducted to verify the performance of realistic image generation and the resistance to mode collapse. The results show that the proposed method is lightweight, effective, and less prone to mode collapse. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1797 KiB  
Article
Breast and Lung Anticancer Peptides Classification Using N-Grams and Ensemble Learning Techniques
by Ayad Rodhan Abbas, Bashar Saadoon Mahdi and Osamah Younus Fadhil
Big Data Cogn. Comput. 2022, 6(2), 40; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6020040 - 12 Apr 2022
Cited by 2 | Viewed by 3260
Abstract
Anticancer peptides (ACPs) are short protein sequences; they perform functions like some hormones and enzymes inside the body. The role of any protein or peptide is related to its structure and to the sequence of amino acids that make it up. There are 20 types of amino acids in humans, and each of them has particular characteristics according to its chemical structure. Current machine and deep learning models have been used to classify ACPs, but these models have neglected Amino Acid Repeats (AARs), which play an essential role in the function and structure of peptides. Therefore, this paper pursues a promising route to novel anticancer peptides by extracting AARs based on N-grams and k-mers from two peptide datasets. These datasets, covering breast and lung cancer cells, were assembled and curated manually from the Cancer Peptide and Protein Database (CancerPPD). Each dataset consists of peptide sequences together with their synthesis and anticancer activity on breast and lung cancer cell lines. Five different feature selection methods were used in this paper to improve classification performance and reduce the experimental costs. After that, ACPs were classified using four classifiers, namely AdaBoost, Random Forest Tree (RFT), Multi-class Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP), and evaluated with five well-known evaluation metrics. Experimental results showed that the breast and lung ACP classification reached accuracies of 89.25% and 92.56%, respectively, and AUCs of 95.35% and 96.92%, respectively. The proposed classifiers performed comparably in AUC, accuracy, precision, F-measure, and recall, except for multi-class SVM-based feature selection, which showed superior performance. As a result, this paper significantly improved the predictive performance for distinguishing ACPs as virtually inactive, experimentally inactive, moderately active, and very active. Full article
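k-mer (character n-gram) counting over peptide strings is straightforward to reproduce; the sketch below uses scikit-learn's CountVectorizer with assumed k values of 2 and 3, and the example sequences are made up for illustration.

```python
# Sketch: k-mer feature extraction from peptide sequences.
from sklearn.feature_extraction.text import CountVectorizer

peptides = ["FLPLLAGLAANFLPKIFCKITRK", "GLFDIIKKIAESF"]  # hypothetical sequences
vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 3))  # 2- and 3-mers
X = vectorizer.fit_transform(peptides)  # sparse count matrix: peptides x k-mers
print(X.shape, vectorizer.get_feature_names_out()[:5])
```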
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 2767 KiB  
Article
Criteria Selection Using Machine Learning (ML) for Communication Technology Solution of Electrical Distribution Substations
by Nayli Adriana Azhar, Nurul Asyikin Mohamed Radzi, Kaiyisah Hanis Mohd Azmi, Faris Syahmi Samidi and Alisadikin Muhammad Zainal
Appl. Sci. 2022, 12(8), 3878; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083878 - 12 Apr 2022
Cited by 9 | Viewed by 1952
Abstract
In the future, as populations grow and more end-user applications become available, the current traditional electrical distribution substation will not be able to fully accommodate new applications that may arise. Consequently, there will be numerous difficulties, including network congestion, latency, jitter, and, in the worst-case scenario, network failure, among other things. Thus, the purpose of this study is to assist decision makers in selecting the most appropriate communication technologies for an electrical distribution substation through an examination of the criteria's influence on the selection process. In this study, nine technical criteria were selected and processed using the machine learning (ML) software RapidMiner to find the most optimal technical criteria. Several ML techniques were studied, and Naïve Bayes was chosen, as it showed the highest performance among them. From this study, the criteria were ranked from most to least important based on the average value obtained from the output. As a result, seven technical criteria were identified as important and should be evaluated to determine the most appropriate communication technology solution for an electrical distribution substation. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 443 KiB  
Article
Entropy-Enhanced Attention Model for Explanation Recommendation
by Yongjie Yan, Guang Yu and Xiangbin Yan
Entropy 2022, 24(4), 535; https://0-doi-org.brum.beds.ac.uk/10.3390/e24040535 - 11 Apr 2022
Cited by 1 | Viewed by 2243
Abstract
Most existing recommendation systems using deep learning are based on RNNs (Recurrent Neural Networks). However, due to some inherent defects of RNNs, recommendation systems based on them are not only very time consuming but also unable to capture long-range dependencies between user comments. Through sentiment analysis of user comments, we can better capture the characteristics of user interest. Information entropy can reduce the adverse impact of noise words on the construction of user interests: it is used to analyze the information content of users and to filter out users with low information entropy, thereby filtering noisy data. We propose a self-attention recommendation model based on entropy regularization to analyze the emotional polarity of the dataset. Specifically, to model the mixed interactions in user comments, a multi-head self-attention network is introduced, and the loss function of the model is used to realize the interpretability of the recommendation system. The experiment results show that our model outperforms the baseline methods in terms of MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain) on several datasets, and it achieves good interpretability. Full article
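The entropy filter can be stated in a few lines: compute the Shannon entropy of each user's token distribution and drop users below a threshold. The sketch below is a toy illustration; the threshold and the example comments are assumptions.

```python
# Sketch: filter out low-entropy (repetitive, noisy) users by comment entropy.
import math
from collections import Counter

def comment_entropy(tokens):
    """Shannon entropy (bits) of a user's token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

users = {"u1": "good good good good".split(),
         "u2": "crisp screen fast battery but weak speaker".split()}
kept = {u for u, toks in users.items() if comment_entropy(toks) >= 1.0}
print(kept)  # u1's zero entropy flags that user's comments as noise
```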
(This article belongs to the Topic Machine and Deep Learning)

28 pages, 1469 KiB  
Article
On the Black-Box Challenge for Fraud Detection Using Machine Learning (II): Nonlinear Analysis through Interpretable Autoencoders
by Jacobo Chaquet-Ulldemolins, Francisco-Javier Gimeno-Blanes, Santiago Moral-Rubio, Sergio Muñoz-Romero and José-Luis Rojo-Álvarez
Appl. Sci. 2022, 12(8), 3856; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083856 - 11 Apr 2022
Cited by 10 | Viewed by 3028
Abstract
The use of artificial intelligence (AI) has recently intensified in the global economy due to the great competence that it has demonstrated for analysis and modeling in many disciplines. This situation is accelerating the shift towards a more automated society, where these new techniques can be consolidated as a valid tool to face the difficult challenge of credit fraud detection (CFD). However, tight regulations do not make it easy for financial entities to comply with them while using modern techniques. From a methodological perspective, autoencoders have demonstrated their effectiveness in discovering nonlinear features across several problem domains. However, autoencoders are opaque and often seen as black boxes. In this work, we propose an interpretable and agnostic methodology for CFD. This type of approach has a double advantage: on the one hand, it can be applied together with any machine learning (ML) technique, and on the other hand, it offers the necessary traceability between inputs and outputs, hence escaping the black-box model. We first applied the state-of-the-art feature selection technique defined in the companion paper. Second, we proposed a novel technique, based on autoencoders, capable of evaluating the relationship between the inputs and outputs of a sophisticated ML model for each and every sample submitted for analysis, through a single transaction-level explanation (STE) approach. This technique allows each instance to be analyzed individually by applying small fluctuations to the input space and evaluating how they are reflected in the output, thereby shedding light on the underlying dynamics of the model. Based on this, an individualized transaction ranking (ITR) can be formulated, leveraging the contributions of each feature through STE. These rankings represent a close estimate of the most important features playing a role in the decision process. The results obtained in this work were consistent with previously published papers and showed that certain features, such as living beyond one's means, a lack or absence of a transaction trail, and car loans, have a strong influence on the model outcome. Additionally, this proposal using the latent space outperformed, in terms of accuracy, our previous results, which had already improved on prior published papers, by 5.5% and 1.5% for the datasets under study, from baselines of 76% and 93%. The contribution of this paper is twofold: a new, outperforming CFD classification model is presented, and at the same time we develop a novel methodology, applicable across classification techniques, that allows black-box models to be breached, erasing the dependencies and, eventually, undesirable biases. We conclude that it is possible to develop an effective, individualized, unbiased, and traceable ML technique, not only to comply with regulations, but also to cope with transaction-level inquiries from clients and authorities. Full article
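A minimal sketch of the perturb-and-observe idea behind STE, assuming a fitted scikit-learn-style classifier that exposes predict_proba; the relative step size and the ranking rule are illustrative choices, not the paper's exact procedure.

```python
# Sketch: rank features of one transaction by output sensitivity to small
# input fluctuations (a perturb-and-observe explanation).
import numpy as np

def ste_ranking(model, x, eps=0.05):
    """x: 1-D feature vector of one transaction; model exposes predict_proba."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]  # fraud probability
    deltas = []
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps * (abs(x[i]) + 1.0)  # small relative fluctuation
        deltas.append(abs(model.predict_proba(xp.reshape(1, -1))[0, 1] - base))
    return np.argsort(deltas)[::-1]       # most influential feature first

# Hypothetical usage: ranking = ste_ranking(fitted_classifier, transaction)
```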
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 2575 KiB  
Article
Enhance Domain-Invariant Transferability of Adversarial Examples via Distance Metric Attack
by Jin Zhang, Wenyu Peng, Ruxin Wang, Yu Lin, Wei Zhou and Ge Lan
Mathematics 2022, 10(8), 1249; https://0-doi-org.brum.beds.ac.uk/10.3390/math10081249 - 11 Apr 2022
Cited by 2 | Viewed by 1579
Abstract
A general foundation for fooling a neural network without knowing its details (i.e., a black-box attack) is the transferability of adversarial examples across different models. Many works have been devoted to enhancing the task-specific transferability of adversarial examples, whereas cross-task transferability has remained nearly outside the research scope. In this paper, to enhance both types of transferability, we are the first to regard the transferability issue as a heterogeneous domain generalisation problem, which can be addressed by a general pipeline based on a domain-invariant feature extractor pre-trained on ImageNet. Specifically, we propose a distance metric attack (DMA) method that aims to increase the latent-layer distance between the adversarial example and the benign example along the opposite direction guided by the cross-entropy loss. With the help of this simple loss, DMA can effectively enhance the domain-invariant transferability of adversarial examples, in both the task-specific and the cross-task cases. Additionally, DMA can be used to measure the robustness of the latent layers in a deep model. We empirically find that models with similar structures have consistent robustness at depth-similar layers, which reveals that model robustness is closely related to model structure. Extensive experiments on image classification, object detection, and semantic segmentation demonstrate that DMA improves the success rate of black-box attacks by more than 10% for task-specific attacks and by more than 5% for cross-task attacks. Full article
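One plausible form of the core update, sketched as an assumption rather than the paper's exact algorithm: a signed-gradient step on the input that increases the L2 distance between its latent features and those of the benign example, optionally projected back into an eps-ball around the original image.

```python
# Sketch of a single feature-distance attack step (PGD-style, PyTorch).
import torch

def dma_step(feature_extractor, x_adv, x_benign, alpha=2/255, eps=8/255, x0=None):
    x_adv = x_adv.clone().detach().requires_grad_(True)
    # Distance between adversarial and benign features at a chosen latent layer.
    dist = torch.norm(feature_extractor(x_adv) - feature_extractor(x_benign).detach())
    dist.backward()
    x_next = x_adv + alpha * x_adv.grad.sign()       # ascend the feature distance
    if x0 is not None:                               # stay within the eps-ball
        x_next = x0 + torch.clamp(x_next - x0, -eps, eps)
    return x_next.clamp(0, 1).detach()               # keep a valid image range
```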
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 4887 KiB  
Article
Research on Identification Technology of Field Pests with Protective Color Characteristics
by Zhengfang Hu, Yang Xiang, Yajun Li, Zhenhuan Long, Anwen Liu, Xiufeng Dai, Xiangming Lei and Zhenhui Tang
Appl. Sci. 2022, 12(8), 3810; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083810 - 10 Apr 2022
Cited by 8 | Viewed by 2013
Abstract
Accurate identification of field pests has crucial decision-making significance for integrated pest control. Most current research focuses on identifying pests on sticky cards or in cases with great differences between the target and the background; there is little research on identifying field pests with protective color characteristics. Aiming at the difficulty of identifying such pests in the complex field environment, a field pest identification method based on near-infrared imaging technology and YOLOv5 is proposed in this paper. Firstly, an appropriate infrared filter and ring light source were selected to build an image acquisition system, based on the wavelength at which the spectral reflectance curves of the pest (Pieris rapae) and its host plant (cabbage) differ the most. Then, field pest images were collected to construct a dataset, on which YOLOv5 was trained and tested. Experimental results demonstrate that the average time required to detect one pest image is 0.56 s and the mAP reaches 99.7%. Full article
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 3643 KiB  
Article
MULDASA: Multifactor Lexical Sentiment Analysis of Social-Media Content in Nonstandard Arabic Social Media
by Ghadah Alwakid, Taha Osman, Mahmoud El Haj, Saad Alanazi, Mamoona Humayun and Najm Us Sama
Appl. Sci. 2022, 12(8), 3806; https://0-doi-org.brum.beds.ac.uk/10.3390/app12083806 - 09 Apr 2022
Cited by 9 | Viewed by 2167
Abstract
The semantically complicated Arabic natural vocabulary and the shortage of available techniques and skills to capture Arabic emotions from text hinder Arabic sentiment analysis (ASA). Evaluating Arabic idioms that do not follow a conventional linguistic framework, such as modern standard Arabic (MSA), makes an already difficult procedure even more complicated. Here, we define a novel lexical sentiment analysis approach for studying Arabic-language tweets (TTs) from specialized digital media platforms. Many elements comprising emoji, intensifiers, negations, and other nonstandard expressions, such as supplications, proverbs, and interjections, are incorporated into the MULDASA algorithm to enhance the precision of opinion classification. Root words in the multidialectal sentiment lexicon are associated with emotions found in the content under study via a simple stemming procedure. Furthermore, a feature–sentiment correlation procedure is incorporated into the proposed technique to exclude viewpoints that are irrelevant to the area of concern. As part of our research into Saudi Arabian employability, we compiled a large sample of TTs in six different Arabic dialects. This research shows that this sentiment categorization method is useful and that using all of the characteristics listed earlier improves the ability to accurately classify people’s feelings. The classification accuracy of the proposed algorithm improved from 83.84% to 89.80%. Our approach also outperformed two existing research projects that employed a lexical approach to the sentiment analysis of Saudi dialects. Full article
(This article belongs to the Topic Machine and Deep Learning)

33 pages, 1141 KiB  
Review
Predicting Stock Price Changes Based on the Limit Order Book: A Survey
by Ilia Zaznov, Julian Kunkel, Alfonso Dufour and Atta Badii
Mathematics 2022, 10(8), 1234; https://0-doi-org.brum.beds.ac.uk/10.3390/math10081234 - 09 Apr 2022
Cited by 6 | Viewed by 11964
Abstract
This survey starts with a general overview of the strategies for stock price change prediction based on market data, and in particular Limit Order Book (LOB) data. The main discussion is devoted to the systematic analysis, comparison, and critical evaluation of the state-of-the-art studies in the research area of stock price movement prediction based on LOB data. LOB and Order Flow data are two of the most valuable information sources available to traders on the stock markets. Academic researchers are actively exploring the application of different quantitative methods and algorithms to this type of data to predict stock price movements. With the advancements in machine learning, and subsequently in deep learning, the complexity and computational intensity of these models has grown, as has the claimed predictive power. Some researchers claim stock price movement prediction accuracy well in excess of 80%. These models are now commonly employed by automated market-making programs to set bid and ask quotes. If these results were also applicable to arbitrage trading strategies, then those algorithms could make a fortune for their developers. Thus, the open question is whether these results could be used to generate buy and sell signals that could be exploited with active trading. This survey is therefore intended to answer this question by reviewing these results and scrutinising their reliability. The ultimate conclusion from this analysis is that, although considerable progress has been achieved in this direction, even the state-of-the-art models cannot guarantee a consistent profit in active trading. Taking this into account, several suggestions for future research in this area were formulated along three dimensions: input data, model architecture, and experimental setup. From the input data perspective, it is critical that the dataset is properly processed and up-to-date and that its size is sufficient for training the particular model. From the model architecture perspective, even though deep learning models demonstrate stronger performance than classical models, they are also more prone to over-fitting. To avoid over-fitting, it is suggested to optimize the feature space as well as the number of layers and neurons, and to apply dropout. The over-fitting problem can also be addressed by optimising the experimental setup in several ways: introducing an early stopping mechanism; saving the best weights of the model achieved during training; and testing the model on out-of-sample data, which should be separated from the validation and training samples. Finally, it is suggested to always conduct trading simulations under realistic market conditions, considering transaction costs, bid–ask spreads, and market impact. Full article
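The recommended data hygiene boils down to a strictly chronological split: the test window must come after validation, which must come after training. The sketch below shows one way to carve the windows; the 70/15/15 ratios are assumptions.

```python
# Sketch: chronological train/validation/out-of-sample split for LOB data.
def chronological_split(snapshots, train_frac=0.7, val_frac=0.15):
    """snapshots: a time-ordered sequence of LOB states."""
    n = len(snapshots)
    i, j = int(n * train_frac), int(n * (train_frac + val_frac))
    # Test data lie strictly after validation, which lies after training.
    return snapshots[:i], snapshots[i:j], snapshots[j:]

train_set, val_set, test_set = chronological_split(list(range(1000)))
print(len(train_set), len(val_set), len(test_set))  # 700 150 150
```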
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
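A minimal sketch of two of the over-fitting safeguards the survey above recommends: early stopping and checkpointing the best weights against a validation set held out from training. The `model`, `train_loader`, and `val_loader` objects and the patience value are illustrative assumptions.

```python
# Early stopping with best-weight checkpointing, as suggested in the survey.
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader,
                              loss_fn, optimizer, max_epochs=100, patience=5):
    best_state, best_val, stale = None, float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best_val:                       # save the best weights so far
            best_val, stale = val, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            stale += 1
            if stale >= patience:                # stop when no improvement
                break
    model.load_state_dict(best_state)            # restore the best checkpoint
    return model
```

Final performance should then be reported on a third, out-of-sample split that neither training nor the stopping criterion ever saw, as the survey stresses.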

18 pages, 4124 KiB  
Article
MDA-Unet: A Multi-Scale Dilated Attention U-Net for Medical Image Segmentation
by Alyaa Amer, Tryphon Lambrou and Xujiong Ye
Appl. Sci. 2022, 12(7), 3676; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073676 - 06 Apr 2022
Cited by 18 | Viewed by 5420
Abstract
The advanced development of deep learning methods has recently made significant improvements in medical image segmentation. Encoder–decoder networks, such as U-Net, have addressed some of the challenges in medical image segmentation with an outstanding performance, which has promoted them to be the most [...] Read more.
The advanced development of deep learning methods has recently brought significant improvements to medical image segmentation. Encoder–decoder networks, such as U-Net, have addressed some of the challenges in medical image segmentation with outstanding performance, which has promoted them to the most dominant deep learning architecture in this domain. Despite their outstanding performance, we argue that they still have some shortcomings. First, there is an incompatibility in U-Net’s skip connection between the encoder and decoder features due to the semantic gap between the low-processed encoder features and the highly processed decoder features, which adversely affects the final prediction. Second, U-Net fails to capture multi-scale context information and ignores the contribution of all semantic information throughout the segmentation process. Therefore, we propose MDA-Unet, a novel multi-scale deep learning segmentation model. MDA-Unet improves upon U-Net and enhances its performance in segmenting medical images with variability in the shape and size of the region of interest. The model integrates a multi-scale spatial attention module, where spatial attention maps are derived from a hybrid hierarchical dilated convolution module that captures multi-scale context information. To ease the training process and reduce gradient vanishing, residual blocks are deployed instead of the basic U-Net blocks. Through a channel attention mechanism, the high-level decoder features are used to guide the low-level encoder features to promote the selection of meaningful context information, thus ensuring effective fusion. We evaluated our model on two different datasets: a lung dataset of 2628 axial CT images and an echocardiographic dataset of 2000 images, each with its own challenges. Our model achieved a significant gain in performance with a slight increase in the number of trainable parameters compared with the basic U-Net model, providing a Dice score of 98.3% on the lung dataset and 96.7% on the echocardiographic dataset, where the basic U-Net achieved 94.2% and 93.9%, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
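A minimal sketch of a hierarchical dilated-convolution block of the kind described above: parallel 3×3 convolutions with growing dilation rates capture multi-scale context, and their fused output serves as a spatial attention map. The channel sizes and dilation rates are illustrative assumptions, not the authors' exact configuration.

```python
# Multi-scale dilated convolution producing a spatial attention map.
import torch
import torch.nn as nn

class DilatedMultiScaleBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)              # same spatial size
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        multi_scale = torch.cat([b(x) for b in self.branches], dim=1)
        attention = torch.sigmoid(self.fuse(multi_scale))  # spatial attention map
        return x * attention                               # reweight features

feats = torch.randn(1, 64, 32, 32)
print(DilatedMultiScaleBlock(64)(feats).shape)  # torch.Size([1, 64, 32, 32])
```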

32 pages, 9387 KiB  
Article
Fully Automatic Segmentation, Identification and Preoperative Planning for Nasal Surgery of Sinuses Using Semi-Supervised Learning and Volumetric Reconstruction
by Chung-Feng Jeffrey Kuo and Shao-Cheng Liu
Mathematics 2022, 10(7), 1189; https://0-doi-org.brum.beds.ac.uk/10.3390/math10071189 - 06 Apr 2022
Cited by 3 | Viewed by 2337
Abstract
The aim of this study is to develop an automatic segmentation algorithm based on paranasal sinus CT images, which realizes automatic identification and segmentation of the sinus boundary and its inflamed proportions, as well as the reconstruction of normal sinus and inflamed site [...] Read more.
The aim of this study is to develop an automatic segmentation algorithm for paranasal sinus CT images that identifies and segments the sinus boundary and its inflamed portions, and reconstructs the volumes of the normal sinus and the inflamed sites. Our goal is to overcome the current clinical dilemma of manually calculating the inflammatory sinus volume, which is subjective and inefficient. A semi-supervised learning algorithm using pseudo-labels for self-training was proposed to train convolutional neural networks, which consisted of SENet, MobileNet, and ResNet. An aggregate of 175 CT sets was analyzed, 50 of which were from patients who subsequently underwent sinus surgery. A 3D view and a volume-based modified Lund–Mackay score were determined and compared with the traditional scores. Compared to state-of-the-art networks, our modifications achieved significant improvements in both sinus segmentation and classification, with an average pixel accuracy of 99.67%, an MIoU of 89.75%, and a Dice coefficient of 90.79%. The fully automatic nasal sinus volume reconstruction system successfully obtained the relevant detailed information by accurately acquiring the nasal sinus contour edges in the CT images. The accuracy of our algorithm has been validated, and the results can be effectively applied in clinical medicine and forensic research. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
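A minimal sketch of the pseudo-label self-training loop described above: a model trained on labeled data assigns labels to unlabeled samples, and only high-confidence predictions join the training set for the next round. The logistic-regression classifier and the 0.95 confidence threshold are illustrative stand-ins for the paper's CNN ensemble.

```python
# Pseudo-label self-training with a confidence threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, rounds=3, threshold=0.95):
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        keep = proba.max(axis=1) >= threshold      # confident pseudo-labels only
        if not keep.any():
            break
        X = np.vstack([X, X_unlab[keep]])
        y = np.concatenate([y, clf.classes_[proba[keep].argmax(axis=1)]])
        X_unlab = X_unlab[~keep]                   # remove the newly labeled rows
    return clf
```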

12 pages, 453 KiB  
Article
Affection Enhanced Relational Graph Attention Network for Sarcasm Detection
by Guowei Li, Fuqiang Lin, Wangqun Chen and Bo Liu
Appl. Sci. 2022, 12(7), 3639; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073639 - 04 Apr 2022
Cited by 5 | Viewed by 1700
Abstract
Sarcasm detection remains a challenge for numerous Natural Language Processing (NLP) tasks, such as sentiment classification or stance prediction. Existing sarcasm detection studies attempt to capture the subtle semantic incongruity patterns by using contextual information and graph information through Graph Convolutional Networks (GCN). [...] Read more.
Sarcasm detection remains a challenge for numerous Natural Language Processing (NLP) tasks, such as sentiment classification or stance prediction. Existing sarcasm detection studies attempt to capture the subtle semantic incongruity patterns by using contextual information and graph information through Graph Convolutional Networks (GCN). However, the direct application of dependency information may introduce noise and performs poorly in modeling long-distance or disconnected words in the dependency tree. To better learn the sentiment inconsistencies between terms, we propose an Affection Enhanced Relational Graph Attention network (ARGAT) that jointly considers affective information and dependency information. Specifically, we use Relational Graph Attention Networks (RGAT) to integrate relation information guided by a trainable matrix of relation types and synchronously use GCNs to integrate affective information explicitly denoted by affective adjacency matrices. The use of RGAT facilitates information interaction between structurally relevant word pairs over long distances. With the enhancement of affective information, the proposed model can capture complex forms of sarcastic expression. Experimental results on six benchmark datasets show that our proposed approach outperforms state-of-the-art sarcasm detection methods, with best-case improvements in accuracy and F1 of 4.19% and 4.33%, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
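A minimal sketch of the affective graph-convolution idea: word features are propagated over an affective adjacency matrix (here built from a toy per-word polarity contrast) rather than a purely syntactic one. The adjacency construction and all dimensions are illustrative assumptions, not the ARGAT architecture itself.

```python
# One GCN layer over an affective adjacency matrix.
import torch
import torch.nn as nn

class AffectiveGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # Row-normalize so each word averages over its affective neighbors.
        adj = adj / adj.sum(dim=-1, keepdim=True).clamp(min=1e-9)
        return torch.relu(self.linear(adj @ h))

words, dim = 5, 16
h = torch.randn(words, dim)                       # word representations
polarity = torch.tensor([0.8, 0.0, -0.6, 0.1, -0.9])
# Words with contrasting affect get stronger edges (sentiment inconsistency).
adj = (polarity[:, None] - polarity[None, :]).abs() + torch.eye(words)
print(AffectiveGCNLayer(dim, dim)(h, adj).shape)  # torch.Size([5, 16])
```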

15 pages, 674 KiB  
Article
Boosting Social Spam Detection via Attention Mechanisms on Twitter
by Hua Shen, Xinyue Liu and Xianchao Zhang
Electronics 2022, 11(7), 1129; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11071129 - 02 Apr 2022
Cited by 7 | Viewed by 2038
Abstract
Twitter is one of the largest social networking platforms, which allows users to make friends, read the latest news, share personal ideas, and discuss social issues. The huge popularity of Twitter means it attracts a lot of online spammers. Traditional spam detection approaches [...] Read more.
Twitter is one of the largest social networking platforms, allowing users to make friends, read the latest news, share personal ideas, and discuss social issues. The huge popularity of Twitter means it attracts a lot of online spammers. Traditional spam detection approaches have shown their effectiveness in identifying Twitter spammers by extracting handcrafted features and training machine learning models, but such models require knowledge from domain experts. Moreover, the behaviors of spammers can change according to the defense strategies of Twitter. These factors render traditional feature-based approaches ineffective. Although deep-learning-based approaches have been proposed for detecting Twitter spammers, they all treat each tweet equally and ignore the differences among them. To solve these issues, in this paper we propose a new attention-based deep learning model to detect social spammers on Twitter. In particular, we first introduce the state-of-the-art pretraining model BERTweet for learning the representation of each tweet, and then use a novel attention-based mechanism to learn the user representations by distinguishing the differences among tweets posted by each user. Moreover, we take social interactions into consideration and use a graph attention network to update the learned user representations, further improving the accuracy of identifying spammers. Experiments on a publicly available, real-world Twitter dataset show the effectiveness of the proposed model, which is able to significantly enhance performance. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
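A minimal sketch of the attention pooling step described above: per-tweet embeddings (e.g., from BERTweet) are weighted by a learned attention score before being summed into a single user representation. All dimensions are illustrative assumptions.

```python
# Attention pooling of tweet embeddings into one user representation.
import torch
import torch.nn as nn

class TweetAttentionPool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)           # one attention logit per tweet

    def forward(self, tweet_embs):                # (num_tweets, dim)
        weights = torch.softmax(self.scorer(tweet_embs), dim=0)
        return (weights * tweet_embs).sum(dim=0)  # (dim,) user representation

tweets = torch.randn(20, 768)                     # 20 tweet embeddings per user
print(TweetAttentionPool(768)(tweets).shape)      # torch.Size([768])
```

The resulting user vectors would then be refined over the follower graph by a graph attention network, per the paper's description.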

21 pages, 749 KiB  
Article
Ad Creative Discontinuation Prediction with Multi-Modal Multi-Task Neural Survival Networks
by Shunsuke Kitada, Hitoshi Iyatomi and Yoshifumi Seki
Appl. Sci. 2022, 12(7), 3594; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073594 - 01 Apr 2022
Viewed by 4359
Abstract
Discontinuing ad creatives at an appropriate time is one of the most important ad operations that can have a significant impact on sales. Such operational support for ineffective ads has been less explored than that for effective ads. After pre-analyzing 1,000,000 real-world ad [...] Read more.
Discontinuing ad creatives at an appropriate time is one of the most important ad operations and can have a significant impact on sales. Such operational support for ineffective ads has been less explored than that for effective ads. After pre-analyzing 1,000,000 real-world ad creatives, we found that there are two types of discontinuation: short-term (i.e., cut-out) and long-term (i.e., wear-out). In this paper, we propose a practical prediction framework for the discontinuation of ad creatives with a hazard-function-based loss function inspired by survival analysis. Our framework predicts the discontinuations with a multi-modal deep neural network that takes the ad creative as input (e.g., text, categorical, image, and numerical features). To improve the prediction performance for the two different types of discontinuation and for the ad creatives that contribute to sales, we introduce two new techniques: (1) a two-term estimation technique with multi-task learning and (2) a click-through-rate-weighting technique for the loss function. We evaluated our framework using a large-scale ad creative dataset including impressions on the scale of 10 billion. In terms of the concordance index (short: 0.896, long: 0.939, and overall: 0.792), our framework achieved significantly better performance than the conventional method (0.531). Additionally, we confirmed that our framework (i) demonstrated the same degree of discontinuation effect as manual operations for short-term cases, and (ii) accurately predicted the ad discontinuation order, which is important for long-running ad creatives, for long-term cases. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Graphical abstract
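A minimal sketch of a discrete-time hazard formulation in the spirit of the survival-analysis loss described above: the network outputs a per-interval hazard, survival is the product of one minus the hazards, and the negative log-likelihood handles censored ads (still running) naturally. The `hazard_nll` helper and all shapes are illustrative assumptions, not the paper's exact loss.

```python
# Negative log-likelihood for a discrete-time hazard model.
import torch

def hazard_nll(hazard_logits, event_time, observed):
    """hazard_logits: (T,) logits for intervals 0..T-1;
    event_time: interval of discontinuation (or last observation);
    observed: 1.0 if the ad was actually discontinued, 0.0 if censored."""
    h = torch.sigmoid(hazard_logits)
    log_survive = torch.log1p(-h[:event_time]).sum()   # survived intervals
    log_event = torch.log(h[event_time]) * observed    # event term (if observed)
    return -(log_survive + log_event)

logits = torch.zeros(10, requires_grad=True)           # toy network output
loss = hazard_nll(logits, event_time=4, observed=1.0)
loss.backward()
print(float(loss))                                     # ~3.47 for uniform hazards
```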

15 pages, 6228 KiB  
Article
Automatic Search Dense Connection Module for Super-Resolution
by Huaijuan Zang, Guoan Cheng, Zhipeng Duan, Ying Zhao and Shu Zhan
Entropy 2022, 24(4), 489; https://0-doi-org.brum.beds.ac.uk/10.3390/e24040489 - 31 Mar 2022
Cited by 1 | Viewed by 1574
Abstract
The development of display technology has continuously increased the requirements for image resolution. However, the imaging systems of many cameras are limited by their physical conditions, and the image resolution is often restrictive. Recently, several models based on deep convolutional neural network (CNN) [...] Read more.
The development of display technology has continuously increased the requirements for image resolution. However, the imaging systems of many cameras are limited by their physical conditions, and the image resolution is often restricted. Recently, several models based on deep convolutional neural networks (CNN) have achieved significant performance gains for image super-resolution (SR), but extensive memory consumption and computation overhead hinder practical applications. For this purpose, we present a lightweight network that automatically searches dense connections (ASDCN) for image SR, which effectively reduces redundancy in dense connections and focuses on more valuable features. We employ neural architecture search (NAS) to model the search for dense connections. Qualitative and quantitative experiments on five public datasets show that our derived model achieves superior performance over state-of-the-art models. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

16 pages, 2139 KiB  
Article
Gaussian Perturbations in ReLU Networks and the Arrangement of Activation Regions
by Bálint Daróczy
Mathematics 2022, 10(7), 1123; https://0-doi-org.brum.beds.ac.uk/10.3390/math10071123 - 31 Mar 2022
Cited by 1 | Viewed by 1736
Abstract
Recent articles indicate that deep neural networks are efficient models for various learning problems. However, they are often highly sensitive to various changes that cannot be detected by an independent observer. As our understanding of deep neural networks with traditional generalisation bounds still [...] Read more.
Recent articles indicate that deep neural networks are efficient models for various learning problems. However, they are often highly sensitive to various changes that cannot be detected by an independent observer. As our understanding of deep neural networks with traditional generalisation bounds still remains incomplete, there are several measures which capture the behaviour of the model in case of small changes at a specific state. In this paper we consider Gaussian perturbations in the tangent space and suggest tangent sensitivity in order to characterise the stability of gradient updates. We focus on a particular kind of stability with respect to changes in parameters that are induced by individual examples without known labels. We derive several easily computable bounds and empirical measures for feed-forward fully connected ReLU (Rectified Linear Unit) networks and connect tangent sensitivity to the distribution of the activation regions in the input space realised by the network. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

14 pages, 3435 KiB  
Article
Application of Convolutional Neural Network for Fingerprint-Based Prediction of Gender, Finger Position, and Height
by Chung-Ting Hsiao, Chun-Yi Lin, Po-Shan Wang and Yu-Te Wu
Entropy 2022, 24(4), 475; https://0-doi-org.brum.beds.ac.uk/10.3390/e24040475 - 29 Mar 2022
Cited by 7 | Viewed by 2978
Abstract
Fingerprints are the most common personal identification feature and key evidence for crime scene investigators. The prediction of fingerprint features, including gender, height range (tall or short), left or right hand, and finger position, can effectively narrow down the list of suspects, increase [...] Read more.
Fingerprints are the most common personal identification feature and key evidence for crime scene investigators. The prediction of fingerprint features, including gender, height range (tall or short), left or right hand, and finger position, can effectively narrow down the list of suspects, increase the speed of comparison, and greatly improve the effectiveness of criminal investigations. In this study, we used three commonly used CNNs (VGG16, Inception-v3, and ResNet50) to perform biometric prediction on 1000 samples, and the results showed that VGG16 achieved the highest accuracy in identifying gender (79.2%), left- and right-hand fingerprints (94.4%), finger position (84.8%), and height range (69.8%, using the ring finger of male participants). In addition, we visualized the basis of the CNN classification with the Grad-CAM technique and compared the results with those predicted by experts. We found that the CNN model outperformed experts in terms of classification accuracy and speed, and provided a good reference for fingerprints that were difficult to determine manually. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
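A minimal Grad-CAM sketch, as used above to visualize classification evidence: gradients of the target class score with respect to a convolutional feature map provide channel weights for a class activation heatmap. The choice of torchvision VGG16 and its last convolutional layer is an illustrative assumption; any fingerprint-trained CNN would be substituted in practice.

```python
# Grad-CAM over the last conv layer of VGG16.
import torch
from torchvision.models import vgg16

model = vgg16(weights=None).eval()
feats, grads = {}, {}
layer = model.features[28]                        # last conv layer of VGG16
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)                   # stand-in fingerprint image
score = model(x)[0].max()                         # top-class score
score.backward()

weights = grads["a"].mean(dim=(2, 3), keepdim=True)   # channel importance
cam = torch.relu((weights * feats["a"]).sum(dim=1))   # class activation map
cam = cam / cam.max().clamp(min=1e-9)                 # normalize to [0, 1]
print(cam.shape)                                      # torch.Size([1, 14, 14])
```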

15 pages, 5053 KiB  
Article
HMD-Net: A Vehicle Hazmat Marker Detection Benchmark
by Lei Jia, Jianzhu Wang, Tianyuan Wang, Xiaobao Li, Haomin Yu and Qingyong Li
Entropy 2022, 24(4), 466; https://0-doi-org.brum.beds.ac.uk/10.3390/e24040466 - 28 Mar 2022
Cited by 2 | Viewed by 1866
Abstract
Vehicles carrying hazardous material (hazmat) are severe threats to the safety of highway transportation, and a model that can automatically recognize hazmat markers installed or attached on vehicles is essential for intelligent management systems. However, there is still no public dataset for benchmarking [...] Read more.
Vehicles carrying hazardous material (hazmat) are severe threats to the safety of highway transportation, and a model that can automatically recognize hazmat markers installed or attached on vehicles is essential for intelligent management systems. However, there is still no public dataset for benchmarking the task of hazmat marker detection. To this end, this paper releases a large-scale vehicle hazmat marker dataset named VisInt-VHM, which includes 10,000 images with a total of 20,023 hazmat markers captured under different environmental conditions on a real-world highway. Meanwhile, we provide a compact hazmat marker detection network named HMD-Net, which utilizes a revised lightweight backbone and is further compressed by channel pruning. As a consequence, the trained model can be efficiently deployed on a resource-restricted edge device. Experimental results demonstrate that, compared with established methods such as YOLOv3 and YOLOv4, their lightweight versions, and popular lightweight models, HMD-Net achieves a better trade-off between detection accuracy and inference speed. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Graphical abstract
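A minimal sketch of one common channel-pruning criterion consistent with the compression step mentioned above: rank channels by the magnitude of their BatchNorm scale factors and keep the strongest ones. The layer and keep-ratio are illustrative assumptions, not HMD-Net's actual recipe.

```python
# Selecting channels to keep by BatchNorm scale magnitude.
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(64)                           # follows some conv layer
keep_ratio = 0.5
gamma = bn.weight.detach().abs()                  # per-channel importance
num_keep = int(len(gamma) * keep_ratio)
keep_idx = torch.argsort(gamma, descending=True)[:num_keep]
print(sorted(keep_idx.tolist()))                  # channel indices to retain

# In a full pipeline, the surrounding conv layers are then rebuilt with
# only these channels and the network is fine-tuned to recover accuracy.
```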

10 pages, 567 KiB  
Article
Text Data Augmentation for the Korean Language
by Dang Thanh Vu, Gwanghyun Yu, Chilwoo Lee and Jinyoung Kim
Appl. Sci. 2022, 12(7), 3425; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073425 - 28 Mar 2022
Cited by 7 | Viewed by 2537
Abstract
Data augmentation (DA) is a universal technique to reduce overfitting and improve the robustness of machine learning models by increasing the quantity and variety of the training dataset. Although data augmentation is essential in vision tasks, it is rarely applied to text datasets [...] Read more.
Data augmentation (DA) is a universal technique to reduce overfitting and improve the robustness of machine learning models by increasing the quantity and variety of the training dataset. Although data augmentation is essential in vision tasks, it is rarely applied to text datasets since it is less straightforward. Some studies have addressed text data augmentation, but most of them target majority languages, such as English or French. There have been only a few studies on data augmentation for minority languages, e.g., Korean. This study fills the gap by evaluating several common data augmentation methods on Korean corpora with pre-trained language models. In short, we evaluate the performance of two text data augmentation approaches, known as text transformation and back translation. We compare these augmentations on Korean corpora across four downstream tasks: semantic textual similarity (STS), natural language inference (NLI), question duplication verification (QDV), and sentiment classification (STC). Compared to cases without augmentation, the performance gains when applying text data augmentation are 2.24%, 2.19%, 0.66%, and 0.08% on the STS, NLI, QDV, and STC tasks, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
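A minimal sketch of the "text transformation" family of augmentations evaluated above (random swap and random deletion, in the style of EDA). These operate at the token level and are language-agnostic; the toy Korean tokens and rates are illustrative, and back translation would additionally require a translation model.

```python
# Token-level text transformations: random swap and random deletion.
import random

def random_swap(tokens, n=1):
    tokens = tokens.copy()
    for _ in range(n):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1):
    kept = [t for t in tokens if random.random() > p]
    return kept or [random.choice(tokens)]        # never return an empty text

sentence = "나는 오늘 학교 에 갔다".split()         # toy Korean tokens
print(random_swap(sentence), random_deletion(sentence))
```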

26 pages, 1301 KiB  
Article
On the Black-Box Challenge for Fraud Detection Using Machine Learning (I): Linear Models and Informative Feature Selection
by Jacobo Chaquet-Ulldemolins, Francisco-Javier Gimeno-Blanes, Santiago Moral-Rubio, Sergio Muñoz-Romero and José-Luis Rojo-Álvarez
Appl. Sci. 2022, 12(7), 3328; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073328 - 25 Mar 2022
Cited by 8 | Viewed by 2547
Abstract
Artificial intelligence (AI) is rapidly shaping the global financial market and its services due to the great competence that it has shown for analysis and modeling in many disciplines. What is especially remarkable is the potential that these techniques could offer to the [...] Read more.
Artificial intelligence (AI) is rapidly shaping the global financial market and its services due to the great competence that it has shown for analysis and modeling in many disciplines. What is especially remarkable is the potential that these techniques could offer to the challenging reality of credit fraud detection (CFD); but it is not easy, even for financial institutions, to keep in strict compliance with non-discriminatory and data protection regulations while extracting all the potential that these powerful new tools can provide to them. This reality effectively restricts nearly all possible AI applications to simple and easy-to-trace neural networks, preventing more advanced and modern techniques from being applied. The aim of this work was to create a reliable, unbiased, and interpretable methodology to automatically evaluate CFD risk. Therefore, we propose a novel methodology to address the mentioned complexity when applying machine learning (ML) to the CFD problem, one that uses state-of-the-art algorithms capable of quantifying the information of the variables and their relationships. This approach offers a new form of interpretability to cope with this multifaceted situation. First, a recently published feature selection technique, the informative variable identifier (IVI), is applied; it is capable of distinguishing among informative, redundant, and noisy variables. Second, a set of innovative recurrent filters defined in this work is applied, which aim to minimize the training-data bias, namely, the recurrent feature filter (RFF) and the maximally-informative feature filter (MIFF). Finally, the output is classified using compelling ML techniques, such as gradient boosting, support vector machines, linear discriminant analysis, and linear regression. These models were applied first to a synthetic database, for better descriptive modeling and fine tuning, and then to a real database. Our results confirm that our proposal yields valuable interpretability by identifying the informative features’ weights that link the original variables with the final objectives. The informative features were living beyond one’s means, the lack or absence of a transaction trail, and unexpected overdrafts, which is consistent with other published works. Furthermore, we obtained 76% accuracy in CFD, which represents an improvement of more than 4% on the real databases compared to other published works. We conclude that with the presented methodology, we not only reduce dimensionality, but also improve the accuracy and trace relationships among input and output features, bringing transparency to the ML reasoning process. The results obtained here were used as a starting point for the companion paper, which extends the interpretability to nonlinear ML architectures. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

20 pages, 3290 KiB  
Article
A Novel Feature Set Extraction Based on Accelerometer Sensor Data for Improving the Fall Detection System
by Hong-Lam Le, Duc-Nhan Nguyen, Thi-Hau Nguyen and Ha-Nam Nguyen
Electronics 2022, 11(7), 1030; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11071030 - 25 Mar 2022
Cited by 10 | Viewed by 2686
Abstract
Because falls are the second leading cause of injury deaths, especially in the elderly according to WHO statistics, there have been a lot of studies on developing a fall detection and warning system. Many approaches based on wearable sensors, cameras, Infrared sensors, radar, [...] Read more.
Because falls are the second leading cause of injury deaths, especially in the elderly according to WHO statistics, there have been many studies on developing fall detection and warning systems. Many approaches based on wearable sensors, cameras, infrared sensors, radar, etc., have been proposed to detect falls efficiently. However, fall detection still faces many challenges due to noise and the lack of a clear definition of fall activities. To deal with this, this paper proposes a new way to extract 44 features based on the time domain, the frequency domain, and Hjorth parameters. The effect of the proposed feature set has been evaluated with several classification algorithms, such as SVM, k-NN, ANN, J48, and RF. Our method achieves relatively high performance (F1-score) in detecting fall and non-fall activities, i.e., 95.23% (falls) and 99.11% (non-falls) on the MobileAct 2.0 dataset, and 96.16% (falls) and 99.90% (non-falls) on the UP-Fall dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Graphical abstract
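The three Hjorth parameters used above have simple closed forms; a minimal NumPy sketch for one accelerometer window follows (the window length is an illustrative assumption).

```python
# Hjorth parameters: activity (variance), mobility, and complexity.
import numpy as np

def hjorth(x):
    dx = np.diff(x)                   # first derivative
    ddx = np.diff(dx)                 # second derivative
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

window = np.random.randn(256)         # toy 256-sample accelerometer window
print(hjorth(window))
```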

17 pages, 3932 KiB  
Article
TdmTracker: Multi-Object Tracker Guided by Trajectory Distribution Map
by Yuxuan Gao, Xiaohui Gu, Qiang Gao, Runmin Hou and Yuanlong Hou
Electronics 2022, 11(7), 1010; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11071010 - 24 Mar 2022
Cited by 2 | Viewed by 1451
Abstract
With the great progress of object detection, some detection-based multiple object tracking (MOT) paradigms begin to emerge, including tracking-by-detection, joint detection and tracking, and attention mechanism-based MOT. Due to the separately executed detection, embedding, and data association, tracking-by-detection-based methods are much less efficient [...] Read more.
With the great progress of object detection, several detection-based multiple object tracking (MOT) paradigms have emerged, including tracking-by-detection, joint detection and tracking, and attention-mechanism-based MOT. Due to the separately executed detection, embedding, and data association, tracking-by-detection-based methods are much less efficient than end-to-end MOT methods. Therefore, recent works are devoted to integrating these separate processes into an end-to-end paradigm. Some transformer-based end-to-end methods that introduce track queries to detect targets have achieved good results, and their use of self-attention and track queries inspired our design. Moreover, we adopt an optimized class query, instead of a statically learned object query, to detect newly appearing objects of the target category. In this work, we present TdmTracker, a novel anchor-free, attention-mechanism-based end-to-end model, in which we propose a trajectory distribution map to guide position prediction and introduce an adaptive query embedding set and a query-key attention mechanism to detect tracked objects in the current frame. The experimental results on the MOT17 dataset show that TdmTracker achieves a good speed–accuracy trade-off compared with other state-of-the-art methods. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

21 pages, 1737 KiB  
Article
New Metric for Evaluation of Deep Neural Network Applied in Vision-Based Systems
by Fateme Bakhshande, Daniel Adofo Ameyaw, Neelu Madan and Dirk Söffker
Appl. Sci. 2022, 12(7), 3251; https://0-doi-org.brum.beds.ac.uk/10.3390/app12073251 - 23 Mar 2022
Cited by 1 | Viewed by 1770
Abstract
Vision-based object detection plays a crucial role for the complete functionality of many engineering systems. Typically, detectors or classifiers are used to detect objects or to distinguish different targets. This contribution presents a new evaluation of CNN classifiers in image detection using a [...] Read more.
Vision-based object detection plays a crucial role in the complete functionality of many engineering systems. Typically, detectors or classifiers are used to detect objects or to distinguish different targets. This contribution presents a new evaluation of CNN classifiers in image detection using a modified Probability of Detection (POD) reliability measure. The proposed method allows the evaluation of further image parameters affecting the classification results. The proposed evaluation method is implemented on images, and comparisons are made for the parameters with the best detection capability. A typical certification standard (90/95), denoting a 90% probability of detection at a 95% reliability level, is adapted and successfully applied. Using the 90/95 standard, comparisons are made between different image parameters. A noise analysis procedure is introduced, permitting a trade-off among the detection rate, false alarms, and process parameters. The advantage of the novel approach is experimentally evaluated for vision-based classification results of CNNs considering different image parameters. With this new POD evaluation, classifiers can become a more trustworthy part of vision systems. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

17 pages, 887 KiB  
Article
Task’s Choice: Pruning-Based Feature Sharing (PBFS) for Multi-Task Learning
by Ying Chen, Jiong Yu, Yutong Zhao, Jiaying Chen and Xusheng Du
Entropy 2022, 24(3), 432; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030432 - 21 Mar 2022
Cited by 6 | Viewed by 2255
Abstract
In most of the existing multi-task learning (MTL) models, multiple tasks’ public information is learned by sharing parameters across hidden layers, such as hard sharing, soft sharing, and hierarchical sharing. One promising approach is to introduce model pruning into information learning, such as [...] Read more.
In most existing multi-task learning (MTL) models, the public information of multiple tasks is learned by sharing parameters across hidden layers, for example through hard sharing, soft sharing, or hierarchical sharing. One promising approach is to introduce model pruning into information learning, as in sparse sharing, which is regarded as outstanding for knowledge transfer. However, the above methods perform inefficiently on conflicting tasks, learning the tasks’ private information inadequately or suffering from negative transfer. In this paper, we propose a multi-task learning model (Pruning-Based Feature Sharing, PBFS) that merges a soft parameter sharing structure with model pruning and adds a prunable shared network among the different task-specific subnets. In this way, each task can select parameters in the shared subnet according to its requirements. Experiments are conducted on three public benchmark datasets and one synthetic dataset, and the impact of the subnets’ sparsity and the tasks’ correlations on model performance is analyzed. Results show that the proposed model’s information sharing strategy is helpful for transfer learning and superior to several comparison models. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

16 pages, 285 KiB  
Article
Using Feature Selection with Machine Learning for Generation of Insurance Insights
by Ayman Taha, Bernard Cosgrave and Susan Mckeever
Appl. Sci. 2022, 12(6), 3209; https://0-doi-org.brum.beds.ac.uk/10.3390/app12063209 - 21 Mar 2022
Cited by 17 | Viewed by 3766
Abstract
Insurance is a data-rich sector, hosting large volumes of customer data that is analysed to evaluate risk. Machine learning techniques are increasingly used in the effective management of insurance risk. Insurance datasets by their nature, however, are often of poor quality with noisy [...] Read more.
Insurance is a data-rich sector, hosting large volumes of customer data that are analysed to evaluate risk. Machine learning techniques are increasingly used in the effective management of insurance risk. Insurance datasets, by their nature, however, are often of poor quality, with noisy subsets of data (or features). Choosing the right features is a significant pre-processing step in the creation of machine learning models. The inclusion of irrelevant and redundant features has been demonstrated to affect the performance of learning models. In this article, we propose a framework for improving predictive machine learning techniques in the insurance sector via the selection of relevant features. The experimental results, based on five publicly available real insurance datasets, show the importance of applying feature selection to remove noisy features before applying machine learning techniques, allowing the algorithms to focus on influential features. An additional business benefit is the revelation of the most and least important features in the datasets. These insights can prove useful for decision making and strategy development in areas/business problems that are not limited to the direct target of the downstream algorithms. In our experiments, machine learning techniques based on the sets of features suggested by feature selection algorithms outperformed the full feature sets. Specifically, subsets of 20% and 50% of the features in our five datasets improved downstream clustering and classification performance compared to the whole datasets. This indicates the potential of feature selection in the insurance sector both to improve model performance and to highlight influential features for business insights. Full article
(This article belongs to the Topic Machine and Deep Learning)
19 pages, 3117 KiB  
Article
DrawnNet: Offline Hand-Drawn Diagram Recognition Based on Keypoint Prediction of Aggregating Geometric Characteristics
by Jiaqi Fang, Zhen Feng and Bo Cai
Entropy 2022, 24(3), 425; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030425 - 19 Mar 2022
Cited by 4 | Viewed by 5736
Abstract
Offline hand-drawn diagram recognition is concerned with digitizing diagrams sketched on paper or whiteboard to enable further editing. Some existing models can identify the individual objects like arrows and symbols, but they become involved in the dilemma of being unable to understand a [...] Read more.
Offline hand-drawn diagram recognition is concerned with digitizing diagrams sketched on paper or whiteboards to enable further editing. Some existing models can identify individual objects such as arrows and symbols, but they are unable to understand a diagram’s structure. Such a shortcoming hinders the digitization or reconstruction of a diagram from its hand-drawn version. Other methods can accomplish this goal, but they rely on temporal stroke information and time-consuming post-processing, which hinders their practicability. Recently, Convolutional Neural Networks (CNN) have been shown to achieve state-of-the-art performance across many visual tasks. In this paper, we propose DrawnNet, a unified CNN-based keypoint detector for recognizing individual symbols and understanding the structure of offline hand-drawn diagrams. DrawnNet is designed upon CornerNet with two extensions: novel keypoint pooling modules, which extract and aggregate the geometric characteristics of polygonal contours such as rectangles, squares, and diamonds within hand-drawn diagrams, and an arrow orientation prediction branch, which predicts the direction an arrow points to by predicting arrow keypoints. We conducted extensive experiments on public diagram benchmarks to evaluate our proposed method. Results show that DrawnNet achieves recognition rate improvements of 2.4%, 2.3%, and 1.7% compared with state-of-the-art methods on the FC-A, FC-B, and FA benchmarks, respectively, outperforming existing diagram recognition systems on each metric. An ablation study reveals that our proposed method can effectively enable hand-drawn diagram recognition. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

13 pages, 3129 KiB  
Article
Constructing Condition Monitoring Model of Wind Turbine Blades
by Jong-Yih Kuo, Shang-Yi You, Hui-Chi Lin, Chao-Yang Hsu and Baiying Lei
Mathematics 2022, 10(6), 972; https://0-doi-org.brum.beds.ac.uk/10.3390/math10060972 - 18 Mar 2022
Cited by 7 | Viewed by 1629
Abstract
Wind power has become an indispensable part of renewable energy development in various countries. Due to the high cost and complex structure of wind turbines, it is important to design a method that can quickly and effectively determine the structural health of the [...] Read more.
Wind power has become an indispensable part of renewable energy development in various countries. Due to the high cost and complex structure of wind turbines, it is important to design a method that can quickly and effectively determine the structural health of a generator set. This research proposes a model that monitors the sound of wind turbine blades to detect structural damage or weaknesses in the blades at an early stage, reducing the labor required, lowering the frequency of regular maintenance, and enabling rapid repair of the damage. This study used the operating sounds of normal and abnormal blades as a dataset. The model used the discrete wavelet transform (DWT) to decompose the sound into different frequency components, extracted statistical features, and combined them with the outlier exposure technique to train a deep neural network model that captures abnormal values deviating from the normal samples. In addition, the monitoring model also outperformed the anomaly detection models proposed in other papers on the MIMII dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
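A minimal sketch of the DWT feature extraction described above: decompose a blade-sound frame into wavelet sub-bands with PyWavelets and summarize each band with simple statistics. The wavelet family, decomposition level, and chosen statistics are illustrative assumptions.

```python
# DWT sub-band decomposition with per-band statistical features.
import numpy as np
import pywt

def dwt_features(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for band in coeffs:               # one approximation + four detail bands
        feats += [np.mean(band), np.std(band), np.sum(band ** 2)]
    return np.array(feats)

frame = np.random.randn(4096)         # toy audio frame
print(dwt_features(frame).shape)      # (15,) = 5 bands x 3 statistics
```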

14 pages, 5067 KiB  
Article
CASI-Net: A Novel and Effective Steel Surface Defect Classification Method Based on Coordinate Attention and Self-Interaction Mechanism
by Zhong Li, Chen Wu, Qi Han, Mingyang Hou, Guorong Chen and Tengfei Weng
Mathematics 2022, 10(6), 963; https://0-doi-org.brum.beds.ac.uk/10.3390/math10060963 - 17 Mar 2022
Cited by 12 | Viewed by 1877
Abstract
The surface defects of a hot-rolled strip will adversely affect the appearance and quality of industrial products. Therefore, the timely identification of hot-rolled strip surface defects is of great significance. In order to improve the efficiency and accuracy of surface defect detection, a [...] Read more.
The surface defects of a hot-rolled strip adversely affect the appearance and quality of industrial products. Therefore, the timely identification of hot-rolled strip surface defects is of great significance. In order to improve the efficiency and accuracy of surface defect detection, a lightweight network based on coordinate attention and self-interaction (CASI-Net), which integrates channel-domain and spatial information with a self-interaction module, is proposed to automatically identify six kinds of hot-rolled steel strip surface defects. We use coordinate attention to embed location information into channel attention, which enables CASI-Net to locate the regions of defects more accurately, thus contributing to better recognition and classification. In addition, features from horizontal and vertical directional attention are aggregated. Furthermore, a self-interaction module is proposed to interactively fuse the extracted feature information to improve the classification accuracy. The experimental results show that CASI-Net achieves accurate defect classification with reduced parameters and computation. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
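A minimal sketch of coordinate attention as used above: features are pooled along the height and width axes separately, so the attention weights retain positional information in each direction. The reduction ratio and dimensions are illustrative assumptions, not the CASI-Net configuration.

```python
# Coordinate attention: direction-aware channel attention.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 4)
        self.shared = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU())
        self.to_h = nn.Conv2d(mid, channels, 1)
        self.to_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                # (n, c, h, 1)
        pooled_w = x.mean(dim=2, keepdim=True)                # (n, c, 1, w)
        y = torch.cat([pooled_h, pooled_w.transpose(2, 3)], dim=2)
        y = self.shared(y)                                    # joint encoding
        y_h, y_w = y.split([h, w], dim=2)
        a_h = torch.sigmoid(self.to_h(y_h))                   # (n, c, h, 1)
        a_w = torch.sigmoid(self.to_w(y_w.transpose(2, 3)))   # (n, c, 1, w)
        return x * a_h * a_w                                  # reweight features

print(CoordinateAttention(64)(torch.randn(2, 64, 32, 32)).shape)
```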

15 pages, 4388 KiB  
Article
aSGD: Stochastic Gradient Descent with Adaptive Batch Size for Every Parameter
by Haoze Shi, Naisen Yang, Hong Tang and Xin Yang
Mathematics 2022, 10(6), 863; https://0-doi-org.brum.beds.ac.uk/10.3390/math10060863 - 09 Mar 2022
Cited by 5 | Viewed by 3394
Abstract
In recent years, deep neural networks (DNN) have been widely used in many fields. Lots of effort has been put into training due to their numerous parameters in a deep network. Some complex optimizers with many hyperparameters have been utilized to accelerate the [...] Read more.
In recent years, deep neural networks (DNN) have been widely used in many fields. Due to their numerous parameters, a lot of effort has been put into training deep networks. Complex optimizers with many hyperparameters have been utilized to accelerate network training and improve generalization ability, and tuning these hyperparameters is often a trial-and-error process. In this paper, we visually analyze the roles of different training samples in a parameter update and find that each training sample contributes differently to the update. Furthermore, we present a variant of batch stochastic gradient descent for neural networks using the ReLU as the activation function in the hidden layers, called adaptive stochastic gradient descent (aSGD). Different from existing methods, it calculates an adaptive batch size for each parameter in the model and uses the mean effective gradient as the actual gradient for parameter updates. Experimental results on MNIST show that aSGD can speed up the optimization of DNNs and achieve higher accuracy without extra hyperparameters. Experimental results on synthetic datasets show that it can find redundant nodes effectively, which is helpful for model compression. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

19 pages, 4528 KiB  
Article
Improved Multiple Vector Representations of Images and Robust Dictionary Learning
by Chengchang Pan, Yongjun Zhang, Zewei Wang and Zhongwei Cui
Electronics 2022, 11(6), 847; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11060847 - 08 Mar 2022
Viewed by 1299
Abstract
Each sparse representation classifier has different classification accuracy for different samples. It is difficult to achieve good performance with a single feature classification model. In order to balance the large-scale information and global features of images, a robust dictionary learning method based on [...] Read more.
Each sparse representation classifier has different classification accuracy for different samples, and it is difficult to achieve good performance with a single-feature classification model. In order to balance the large-scale information and global features of images, a robust dictionary learning method based on a multi-vector representation of images is proposed in this paper. First, the proposed method generates a reasonable virtual image for each original image and obtains the multi-vector representation of all images. Second, the same dictionary learning algorithm is applied to each vector representation to obtain multiple sets of image features. The proposed multi-vector representation provides a good global understanding of the whole image contour and enriches the content available for dictionary learning. Last, a weighted fusion algorithm is used to classify the test samples. The introduction of influencing factors and the automatic adjustment of each classifier’s weight in the final decision have a significant effect on extracting better image features. We conducted experiments with the proposed algorithm on a number of widely used image databases. A large number of experimental results show that it effectively improves the accuracy of image classification. Moreover, fully exploiting the possible diversity of representations may be a promising way to capture the varied appearances of an image and achieve high classification accuracy. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

15 pages, 3525 KiB  
Article
Adaptive Motion Skill Learning of Quadruped Robot on Slopes Based on Augmented Random Search Algorithm
by Xiaoqing Zhu, Mingchao Wang, Xiaogang Ruan, Lu Chen, Tingdong Ji and Xinyuan Liu
Electronics 2022, 11(6), 842; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11060842 - 08 Mar 2022
Cited by 8 | Viewed by 1966
Abstract
To deal with the problem of stable walking of quadruped robots on slopes, a gait planning algorithm framework for quadruped robots facing unknown slopes is proposed. We estimated the terrain slope by the attitude information measured by the inertial measurement unit (IMU) without [...] Read more.
To deal with the problem of stable walking of quadruped robots on slopes, a gait planning algorithm framework for quadruped robots facing unknown slopes is proposed. We estimate the terrain slope from the attitude information measured by the inertial measurement unit (IMU), without relying on robot vision. The crawl gait is adopted, and the center-of-gravity trajectory is planned based on the zero-moment point (ZMP) stability criterion. Then, the augmented random search (ARS) algorithm is used to modulate the parameters of a Bezier curve to plan the robot’s foot trajectory. Additionally, the robot adjusts its posture in real time to follow the desired joint angles, which realizes adaptive posture adjustment during slope locomotion. Simulation results show that the proposed slope gait planning algorithm can adaptively adjust the robot’s attitude and allow it to pass stably through a slope environment even when the slope is unknown. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
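A minimal sketch of one Augmented Random Search (ARS) update as used above: the policy parameters are perturbed in random directions, each direction is scored by the reward difference of +/- rollouts, and the parameters move along the best directions. The `rollout_reward` hook is an assumed stand-in for a simulator rollout, and the hyperparameters are illustrative.

```python
# One step of a basic ARS update (simplified from Mania et al.'s scheme).
import numpy as np

def ars_step(theta, rollout_reward, n_dirs=8, n_top=4, step=0.02, noise=0.03):
    deltas = np.random.randn(n_dirs, theta.size)
    r_plus = np.array([rollout_reward(theta + noise * d) for d in deltas])
    r_minus = np.array([rollout_reward(theta - noise * d) for d in deltas])
    top = np.argsort(np.maximum(r_plus, r_minus))[-n_top:]   # best directions
    sigma = np.concatenate([r_plus[top], r_minus[top]]).std() + 1e-8
    grad = ((r_plus[top] - r_minus[top])[:, None] * deltas[top]).mean(axis=0)
    return theta + step / sigma * grad             # reward-scaled ascent step

theta = np.zeros(6)                                # toy policy parameters
reward = lambda p: -np.sum((p - 1.0) ** 2)         # stand-in for a rollout
for _ in range(200):
    theta = ars_step(theta, reward)
print(theta.round(2))                              # moves toward [1, 1, ...]
```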

11 pages, 353 KiB  
Article
Bayes in Wonderland! Predictive Supervised Classification Inference Hits Unpredictability
by Ali Amiryousefi, Ville Kinnula and Jing Tang
Mathematics 2022, 10(5), 828; https://0-doi-org.brum.beds.ac.uk/10.3390/math10050828 - 05 Mar 2022
Cited by 1 | Viewed by 1435
Abstract
The marginal Bayesian predictive classifiers (mBpc), as opposed to the simultaneous Bayesian predictive classifiers (sBpc), handle each data separately and, hence, tacitly assume the independence of the observations. Due to saturation in learning of generative model parameters, the adverse effect of this false [...] Read more.
The marginal Bayesian predictive classifiers (mBpc), as opposed to the simultaneous Bayesian predictive classifiers (sBpc), handle each data point separately and, hence, tacitly assume the independence of the observations. Due to saturation in the learning of the generative model parameters, the adverse effect of this false assumption on the accuracy of mBpc tends to wear off as the amount of training data increases, guaranteeing the convergence of these two classifiers under the de Finetti type of exchangeability. This result, however, is far from trivial for sequences generated under Partition Exchangeability (PE), where even an enormous amount of training data does not rule out the possibility of an unobserved outcome (Wonderland!). We provide a computational scheme that allows the generation of sequences under PE. Based on that, with a controlled increase in the training data, we show the convergence of the sBpc and mBpc. This underlies the use of the simpler yet computationally more efficient marginal classifiers instead of the simultaneous ones. We also provide a parameter estimation method for the generative model giving rise to the partition-exchangeable sequences, as well as a testing paradigm for the equality of this parameter across different samples. The package for Bayesian predictive supervised classification, parameter estimation, and hypothesis testing of the Ewens sampling formula generative model is deposited on CRAN as the PEkit package. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
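A minimal sketch of generating a partition-exchangeable sequence from the Ewens sampling formula via the Chinese-restaurant-style construction: item n+1 joins an existing category with probability proportional to its count, or opens a brand-new category with probability proportional to a dispersion parameter psi. This is a generic illustration of such a generative scheme, not the PEkit implementation.

```python
# Sequential sampling from the Ewens sampling formula generative model.
import numpy as np

def ewens_sequence(n, psi, rng=np.random.default_rng(0)):
    labels, counts = [], []
    for i in range(n):
        probs = np.array(counts + [psi], dtype=float)
        probs /= i + psi                     # normalizing constant at step i
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):                 # unseen outcome: new category
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels

print(ewens_sequence(15, psi=1.5))           # e.g. [0, 1, 1, 2, 0, ...]
```

Note that no matter how long the sequence grows, the probability psi/(n+psi) of a new category never vanishes, which is exactly the "unobserved outcome" phenomenon the abstract highlights.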

18 pages, 1375 KiB  
Article
Deep Sentiment Analysis Using CNN-LSTM Architecture of English and Roman Urdu Text Shared in Social Media
by Lal Khan, Ammar Amjad, Kanwar Muhammad Afaq and Hsien-Tsung Chang
Appl. Sci. 2022, 12(5), 2694; https://0-doi-org.brum.beds.ac.uk/10.3390/app12052694 - 04 Mar 2022
Cited by 51 | Viewed by 8196
Abstract
Sentiment analysis (SA) has been an active research subject in the domain of natural language processing due to its important functions in interpreting people’s perspectives and drawing successful opinion-based judgments. On social media, Roman Urdu is one of the most extensively utilized dialects. [...] Read more.
Sentiment analysis (SA) has been an active research subject in the domain of natural language processing due to its important functions in interpreting people’s perspectives and drawing successful opinion-based judgments. On social media, Roman Urdu is one of the most extensively utilized dialects. Sentiment analysis of Roman Urdu is difficult due to its morphological complexities and varied dialects. The purpose of this paper is to evaluate the performance of various word embeddings for Roman Urdu and English dialects using a CNN-LSTM architecture with traditional machine learning classifiers. We introduce a novel deep learning architecture for Roman Urdu and English dialect SA based on two layers: an LSTM for long-term dependency preservation and a one-layer CNN model for local feature extraction. To obtain the final classification, the feature maps learned by the CNN and LSTM are fed to several machine learning classifiers. Various word embedding models support this concept. Extensive tests on four corpora show that the proposed model performs exceptionally well on Roman Urdu and English text sentiment classification, with accuracies of 0.904, 0.841, 0.740, and 0.748 on the MDPI, RUSA, RUSA-19, and UCL datasets, respectively. The results show that the SVM classifier and the Word2Vec CBOW (Continuous Bag of Words) model are the more beneficial options for Roman Urdu sentiment analysis, while BERT word embeddings, a two-layer LSTM, and an SVM as the classifier are more suitable for English-language sentiment analysis. The suggested model outperforms existing well-known advanced models on the relevant corpora, improving the accuracy by up to 5%. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
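A minimal sketch of the two-layer idea described above: an LSTM over word embeddings preserves long-term dependencies, a 1-D convolution extracts local features, and the pooled feature map is handed to an external classifier such as an SVM. All vocabulary and dimension choices are illustrative assumptions.

```python
# CNN-LSTM feature extractor whose output feeds a separate classifier.
import torch
import torch.nn as nn

class CNNLSTMFeatures(nn.Module):
    def __init__(self, vocab=10000, emb=128, hidden=64, filters=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.conv = nn.Conv1d(hidden, filters, kernel_size=3, padding=1)

    def forward(self, token_ids):                 # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))   # (batch, seq_len, hidden)
        c = torch.relu(self.conv(h.transpose(1, 2)))
        return c.max(dim=2).values                # (batch, filters) for the SVM

feats = CNNLSTMFeatures()(torch.randint(0, 10000, (4, 50)))
print(feats.shape)                                # torch.Size([4, 32])
```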

33 pages, 487 KiB  
Article
Correcting Diacritics and Typos with a ByT5 Transformer Model
by Lukas Stankevičius, Mantas Lukoševičius, Jurgita Kapočiūtė-Dzikienė, Monika Briedienė and Tomas Krilavičius
Appl. Sci. 2022, 12(5), 2636; https://0-doi-org.brum.beds.ac.uk/10.3390/app12052636 - 03 Mar 2022
Cited by 10 | Viewed by 3624
Abstract
Due to the fast pace of life and online communications, and the prevalence of English and the QWERTY keyboard, people tend to forgo using diacritics and make typographical errors (typos) when typing in other languages. Restoring diacritics and correcting spelling is important for proper [...] Read more.
Due to the fast pace of life and online communications, and the prevalence of English and the QWERTY keyboard, people tend to forgo using diacritics and make typographical errors (typos) when typing in other languages. Restoring diacritics and correcting spelling is important for proper language use and the disambiguation of texts for both humans and downstream algorithms. However, both of these problems are typically addressed separately: the state-of-the-art diacritics restoration methods do not tolerate other typos, while classical spellcheckers cannot deal adequately with all the missing diacritics. In this work, we tackle both problems at once by employing the newly developed universal ByT5 byte-level seq2seq transformer model, which requires no language-specific model structures. For comparison, we perform diacritics restoration on benchmark datasets of 12 languages, with the addition of Lithuanian. The experimental investigation proves that our approach is able to achieve results (>98%) comparable to the previous state-of-the-art, despite being trained less and on less data. Our approach is also able to restore diacritics in words not seen during training with >76% accuracy. Our simultaneous diacritics restoration and typo correction approach reaches >94% alpha-word accuracy on the 13 languages. It has no direct competitors and strongly outperforms classical spell-checking and dictionary-based approaches. We also demonstrate that all the accuracies further improve with more training. Taken together, this shows the great real-world potential of applying our suggested methods to more data, languages, and error classes. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1
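A minimal sketch of loading a ByT5 checkpoint with the Hugging Face transformers library for a correction task like the one above. The weights here are Google's generic byt5-small; actually restoring diacritics requires fine-tuning on corrupted/clean text pairs first, so the generated output is not meaningful until then.

```python
# Byte-level seq2seq inference with a ByT5 checkpoint.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

text = "zalia zole"                                # Lithuanian minus diacritics
inputs = tokenizer(text, return_tensors="pt")      # byte-level: no OOV tokens
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Working on raw bytes is what makes the approach language-agnostic: there is no vocabulary to rebuild per language, which matches the paper's motivation for choosing ByT5.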

18 pages, 789 KiB  
Article
Mining the Frequent Patterns of Named Entities for Long Document Classification
by Bohan Wang, Rui Qi, Jinhua Gao, Jianwei Zhang, Xiaoguang Yuan and Wenjun Ke
Appl. Sci. 2022, 12(5), 2544; https://0-doi-org.brum.beds.ac.uk/10.3390/app12052544 - 28 Feb 2022
Cited by 1 | Viewed by 1671
Abstract
Nowadays, a large amount of information is stored as text, and numerous text mining techniques have been developed for various applications, such as event detection, news topic classification, public opinion detection, and sentiment analysis. Although significant progress has been achieved for short text [...] Read more.
Nowadays, a large amount of information is stored as text, and numerous text mining techniques have been developed for various applications, such as event detection, news topic classification, public opinion detection, and sentiment analysis. Although significant progress has been achieved for short text classification, document-level text classification requires further exploration. Long documents often contain irrelevant, noisy information that obscures indicative features, limiting the interpretability of classification results. To alleviate this problem, a model called MIPELD (mining the frequent patterns of named entities for long document classification) is demonstrated, which uses the frequent patterns of named entities as features. The discovered patterns allow semantic generalization across documents and provide clues for verifying the results. Experiments on several datasets resulted in good accuracy and macro-F1 values, meeting the requirements for practical application. Further analysis validated the effectiveness of MIPELD in mining interpretable information for text classification. Full article
(This article belongs to the Topic Machine and Deep Learning)
Show Figures

Figure 1

12 pages, 1679 KiB  
Article
Mutual Information between Order Book Layers
by Daniel Libman, Gil Ariel, Mary Schaps and Simi Haber
Entropy 2022, 24(3), 343; https://0-doi-org.brum.beds.ac.uk/10.3390/e24030343 - 27 Feb 2022
Cited by 1 | Viewed by 2763
Abstract
The order book is a list of all current buy or sell orders for a given financial security. The rise of electronic stock exchanges introduced a debate about the relevance of the information it encapsulates of the activity of traders. Here, we approach [...] Read more.
The order book is a list of all current buy and sell orders for a given financial security. The rise of electronic stock exchanges introduced a debate about the relevance of the information it encapsulates about the activity of traders. Here, we approach this topic from a theoretical perspective, estimating the amount of mutual information between order book layers, i.e., different buy/sell layers, which aggregate buy/sell orders. We show that (i) layers are not independent (in the sense that the mutual information is statistically larger than zero), (ii) the mutual information between layers is small (compared to the joint entropy), and (iii) the mutual information between layers increases when comparing the uppermost layers to the deepest layers analyzed (i.e., those further away from the market price). Our findings, and our method for estimating mutual information, are relevant to developing trading strategies that attempt to utilize the information content of the limit order book. Full article
(This article belongs to the Topic Machine and Deep Learning)
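The quantity being estimated is the classical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y). Below is a plug-in histogram estimate for two discretized layer observations; the paper's actual estimator, binning, and data are assumptions, and the synthetic series merely illustrate a weak dependence.

```python
# Hedged sketch: plug-in mutual-information estimate between two
# order-book layers from discretized volume observations.
import numpy as np

def entropy(counts):
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=16):
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    hx = entropy(joint.sum(axis=1))   # marginal entropy H(X)
    hy = entropy(joint.sum(axis=0))   # marginal entropy H(Y)
    hxy = entropy(joint.ravel())      # joint entropy H(X, Y)
    return hx + hy - hxy              # in bits

rng = np.random.default_rng(0)
layer1 = rng.normal(size=10_000)                   # hypothetical top-layer volumes
layer2 = 0.2 * layer1 + rng.normal(size=10_000)    # a weakly dependent deeper layer
print(f"I(layer1; layer2) ~= {mutual_information(layer1, layer2):.3f} bits")
```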

15 pages, 1965 KiB  
Article
LinkNet-B7: Noise Removal and Lesion Segmentation in Images of Skin Cancer
by Cihan Akyel and Nursal Arıcı
Mathematics 2022, 10(5), 736; https://0-doi-org.brum.beds.ac.uk/10.3390/math10050736 - 25 Feb 2022
Cited by 19 | Viewed by 3734
Abstract
Skin cancer is common nowadays. Early diagnosis of skin cancer is essential to increase patients’ survival rate. In addition to traditional methods, computer-aided diagnosis is used in the diagnosis of skin cancer; one of its benefits is that it eliminates human error in cancer diagnosis. In addition to the lesion, skin images may contain noise such as hair, ink spots, and rulers, so noise removal is required. This phase is very important for the correct segmentation of the lesions. One of the most critical problems in such automated methods is inaccurate cancer diagnosis when noise removal and segmentation cannot be performed effectively. We have created a noise dataset (hair, rulers, ink spots, etc.) that includes 2500 images and masks; no such noise dataset previously existed in the literature. We used this dataset for noise removal in skin cancer images. Two datasets, from the International Skin Imaging Collaboration (ISIC) and PH2, were used in this study. We present a new approach called LinkNet-B7 for noise removal and segmentation of skin cancer images: a LinkNet-based network that uses EfficientNetB7 as the encoder. We used images with 16 slices, so fewer pixel values are lost. LinkNet-B7 achieves a 6% higher success rate than LinkNet with the same dataset and parameters. Training accuracy for noise removal and lesion segmentation was calculated to be 95.72% and 97.80%, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
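The LinkNet-decoder-plus-EfficientNetB7-encoder combination can be expressed compactly with the segmentation_models_pytorch library, as sketched below; the authors' own implementation, input resolution, and training details may differ, so this is only an interface-level illustration.

```python
# Hedged sketch: a LinkNet decoder paired with an EfficientNetB7 encoder,
# i.e., the combination the paper calls LinkNet-B7.
import torch
import segmentation_models_pytorch as smp

model = smp.Linknet(
    encoder_name="efficientnet-b7",   # EfficientNetB7 backbone as the encoder
    encoder_weights="imagenet",       # pre-trained initialization
    in_channels=3,
    classes=1,                        # binary mask: noise (or lesion) vs. background
)
model.eval()

x = torch.randn(1, 3, 512, 512)       # stand-in for a dermoscopic image tile
with torch.no_grad():
    mask_logits = model(x)            # (1, 1, 512, 512) segmentation logits
print(mask_logits.shape)
```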

17 pages, 3395 KiB  
Article
A Deep Learning-Based Password Security Evaluation Model
by Ki Hyeon Hong and Byung Mun Lee
Appl. Sci. 2022, 12(5), 2404; https://0-doi-org.brum.beds.ac.uk/10.3390/app12052404 - 25 Feb 2022
Cited by 3 | Viewed by 3516
Abstract
It is very important to consider whether a password has been leaked, because security can no longer be guaranteed for passwords exposed to attackers. However, most existing password security evaluation methods do not consider leakage. Even when leakage is considered, a process of collecting, storing, and verifying a huge number of leaked passwords is required, which is not practical on low-performance devices such as IoT devices. Therefore, we propose another approach in this paper using a deep learning model. A password list was made for the proposed model by randomly extracting 133,447 words from a total of seven dictionaries, including Wikipedia and Korean-language dictionaries. A deep learning model was then created using three pieces of feature data extracted from the password list, together with a label for leakage. The evaluation model is saved as a lightweight file, so it can be stored on a low-performance device and used to predict and evaluate the security strength of a password on that device. To check the performance of the model, an accuracy evaluation experiment was conducted to predict the possibility of leakage. As a result, a prediction accuracy of 95.74% was verified for the proposed model. Full article
(This article belongs to the Topic Machine and Deep Learning)
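A sketch of the general shape of such a model — a small dense network mapping three password features to a leak probability, saved as a small file. The specific features below (length, character-class count, digit ratio) and the architecture are illustrative assumptions; the paper does not publish them here.

```python
# Hedged sketch: a lightweight leak-prediction model over three
# hand-crafted password features (feature choice is an assumption).
import numpy as np
from tensorflow import keras

def features(pw: str) -> list[float]:
    classes = sum(any(f(c) for c in pw)
                  for f in (str.islower, str.isupper, str.isdigit))
    digits = sum(c.isdigit() for c in pw) / max(len(pw), 1)
    return [len(pw), classes, digits]

# X: feature triples; y: 1 if the password appeared in a leak corpus (toy labels).
X = np.array([features(p) for p in ["password", "Tr0ub4dor&3", "123456"]])
y = np.array([1, 0, 1])

model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),   # probability of leakage
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
model.save("leak_model.keras")   # small file, deployable on low-power devices
```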

13 pages, 3415 KiB  
Article
Pulse-Shape Discrimination of SiPM Array-Coupled CLYC Detector Using Convolutional Neural Network
by Jing Lu, Xianguo Tuo, Hongchao Yang, Yushi Luo, Haolin Liu, Chao Deng and Qibiao Wang
Appl. Sci. 2022, 12(5), 2400; https://0-doi-org.brum.beds.ac.uk/10.3390/app12052400 - 25 Feb 2022
Cited by 5 | Viewed by 2242
Abstract
Cs2LiYCl6:Ce3+ (CLYC) is a dual-mode gamma-neutron scintillator with a medium gamma-ray resolution and pulse-shape discrimination (PSD) capability. The PSD performance of CLYC is greatly weakened when coupled with silicon photomultipliers (SiPMs) because of SiPMs’ low detection efficiency for the ultrafast Core-Valence-Luminescence (CVL) component under gamma excitation. In our previous work, the PSD Figure-of-Merit (FoM) value was optimized to 2.45 at the gamma-equivalent energy region of the thermal neutron by using the charge comparison method. However, this value was reduced to 1.37 at the lower gamma-equivalent energy region of more than 325 keV, where neutrons were difficult to distinguish from gamma rays. Hence, new algorithms should be studied to improve the PSD performance at low gamma-equivalent energy regions. Convolutional Neural Networks (CNNs) have excellent image recognition capabilities, so neutron and gamma-ray waveforms can be discriminated by their characteristics through a known training set. In this study, neutron and gamma-ray waveforms were measured with a 137Cs source and a moderated 252Cf source via an SiPM array-coupled CLYC detector and divided into two groups: training and PSD testing. The CNN training set comprised 137Cs characteristic gamma-ray waveforms and thermal neutron waveforms discriminated by the charge comparison method from the training group. A CNN with two convolution-pooling layers was designed to accomplish PSD on the test group. The PSD FoM value of the CNN method was calculated to be 37.20 at the gamma-equivalent energy region of more than 325 keV. This result was much higher than that of the charge comparison method, indicating that neutrons and gamma rays can be better distinguished with the CNN method, especially at low gamma-equivalent energy regions. Full article
(This article belongs to the Topic Machine and Deep Learning)
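For context, the PSD figure of merit is conventionally FoM = |μ_n − μ_γ| / (FWHM_n + FWHM_γ) over the discrimination-parameter distributions. Below is a sketch of the kind of two-convolution-pooling-layer network the abstract describes, applied to raw 1D waveforms; the layer sizes and waveform length are assumptions.

```python
# Hedged sketch: a CNN with two convolution-pooling stages that classifies
# digitized detector pulses as neutron or gamma.
import torch
import torch.nn as nn

class PSDNet(nn.Module):
    def __init__(self, n_samples: int = 512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(8, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(16 * (n_samples // 16), 2)  # neutron / gamma

    def forward(self, x):            # x: (batch, 1, n_samples) waveforms
        z = self.features(x)
        return self.classifier(z.flatten(1))

net = PSDNet()
waveforms = torch.randn(4, 1, 512)   # stand-in for measured CLYC pulses
print(net(waveforms).shape)          # (4, 2) class logits
```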

14 pages, 1832 KiB  
Article
DFM-GCN: A Multi-Task Learning Recommendation Based on a Deep Graph Neural Network
by Yan Xiao, Congdong Li and Vincenzo Liu
Mathematics 2022, 10(5), 721; https://0-doi-org.brum.beds.ac.uk/10.3390/math10050721 - 24 Feb 2022
Cited by 6 | Viewed by 4033
Abstract
Among the inherent problems in recommendation systems are data sparseness and cold starts; the solutions to these lie in the introduction of knowledge graphs to improve the performance of recommendation systems. Previous research, however, suffers from problems such as data compression, information damage, and insufficient learning. Therefore, a DeepFM Graph Convolutional Network (DFM-GCN) model is proposed to alleviate the above issues. Click-through rate (CTR) prediction is critical in recommendation systems, where the task is to estimate the probability that a user will click on a recommended item; in many recommendation systems the goal is to maximize the number of clicks, so the items returned to a user can be ranked by estimated CTR. The DFM-GCN model consists of three parts: the left part, DeepFM, captures the interactive information between users and items; the middle part is a deep neural network that models the left and right parts; and the right part obtains a better item representation vector via the GCN. To verify the validity and precision of the model built in this research, a performance comparison experiment was designed on the public datasets ml1m-kg20m and ml1m-kg1m, using multiple comparison models, including the MKR and FM_MKR algorithms, against the DFM-GCN algorithm constructed in this paper. Achieving state-of-the-art performance, the AUC and F1 values for CTR prediction, as well as the accuracy, recall, and F1 values for top-k recommendation, show that the proposed approach is effective compared with different recommendation algorithms. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 2137 KiB  
Article
SFINet: Shuffle–and–Fusion Interaction Networks for Wind Power Forecasting
by Xu Zhang, Cheng Xiao and Tieling Zhang
Appl. Sci. 2022, 12(4), 2253; https://0-doi-org.brum.beds.ac.uk/10.3390/app12042253 - 21 Feb 2022
Cited by 1 | Viewed by 1993
Abstract
Wind energy is one of the most important renewable energy sources in the world. Accurate wind power prediction is of great significance for achieving reliable and economical power system operation and control. For this purpose, this paper focuses on wind power prediction based on a newly proposed shuffle–and–fusion interaction network (SFINet). First, a channel shuffle is employed to promote the interaction between timing features. Second, an attention block is proposed to fuse the original features and shuffled features, further increasing the model’s sequential modeling capability. Finally, the developed SFINet model is tested using real-world wind power production data. The results verify that the proposed SFINet model achieves better performance than other baseline methods and can be easily implemented in the field without requiring additional hardware or software. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 32733 KiB  
Article
Extending Contrastive Learning to Unsupervised Redundancy Identification
by Jeongwoo Ju, Heechul Jung and Junmo Kim
Appl. Sci. 2022, 12(4), 2201; https://0-doi-org.brum.beds.ac.uk/10.3390/app12042201 - 20 Feb 2022
Viewed by 1916
Abstract
Modern deep neural network (DNN)-based approaches have delivered great performance for computer vision tasks; however, they require a massive annotation cost due to their data-hungry nature. Hence, given a fixed budget and unlabeled examples, improving the quality of the examples to be annotated is a clever step toward good generalization of DNNs. One of the key issues that can hurt the quality of examples is the presence of redundancy, in which most examples exhibit a similar visual context (e.g., the same background). Redundant examples barely contribute to the performance but still incur annotation cost. Hence, prior to the annotation process, identifying redundancy is a key step to avoiding unnecessary cost. In this work, we show that a coreset score based on cosine similarity (cossim) is effective for identifying redundant examples: the collective magnitude of the gradient over redundant examples is large compared to the others, so contrastive learning first attempts to reduce the loss over the redundant set. Consequently, cossim for the redundant set exhibits a high value (a low coreset score). In other words, we view redundancy identification through the lens of gradient magnitude. In this way, we effectively removed redundant examples from two datasets (KITTI, BDD10K), resulting in better performance in terms of detection and semantic segmentation. Full article
(This article belongs to the Topic Machine and Deep Learning)
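A sketch of the cosine-similarity idea in its simplest form: score each example by its nearest-neighbour cosine similarity in embedding space and drop the most redundant ones. The paper's actual coreset-score definition over contrastive features may differ, so treat this as an assumption-laden illustration.

```python
# Hedged sketch: flagging redundant examples via pairwise cosine similarity
# of their embeddings (e.g., from a contrastively trained encoder).
import numpy as np

def redundancy_scores(emb: np.ndarray) -> np.ndarray:
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    np.fill_diagonal(sim, -1.0)         # ignore self-similarity
    return sim.max(axis=1)              # high score -> likely redundant

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 128))
near_dupes = base[:20] + 0.01 * rng.normal(size=(20, 128))  # planted redundancy
emb = np.vstack([base, near_dupes])

scores = redundancy_scores(emb)
keep = np.argsort(scores)[:100]          # drop the 20 most redundant examples
print(f"max similarity among kept examples: {scores[keep].max():.3f}")
```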

26 pages, 76035 KiB  
Article
Single-Image Super-Resolution Neural Network via Hybrid Multi-Scale Features
by Wenfeng Huang, Xiangyun Liao, Lei Zhu, Mingqiang Wei and Qiong Wang
Mathematics 2022, 10(4), 653; https://0-doi-org.brum.beds.ac.uk/10.3390/math10040653 - 19 Feb 2022
Cited by 6 | Viewed by 2242
Abstract
In this paper, we propose an end-to-end single-image super-resolution neural network that leverages hybrid multi-scale features of images. Different from most existing convolutional neural network (CNN)-based solutions, our proposed network builds on the observation that image features extracted by a CNN contain hybrid multi-scale features: both multi-scale local texture features and global structural features. By effectively exploiting these multi-scale and local–global features, our network involves far fewer parameters, leading to a large decrease in memory usage and computation during inference. Our network benefits from three key modules: (1) an efficient and lightweight feature extraction module (EFblock); (2) a hybrid multi-scale feature enhancement module (HMblock); and (3) a reconstruction–restoration module (DRblock). Experiments on five popular benchmarks demonstrate that our super-resolution approach achieves better performance with fewer parameters and less memory consumption, compared to more than 20 state-of-the-art methods. In summary, we propose a novel multi-scale super-resolution neural network (HMSF) that is more lightweight, has fewer parameters, and requires less execution time, yet performs better than the state-of-the-art methods. HMSF is thus more practical and better suited to running on constrained devices, such as PCs and mobile devices, without the need for a high-performance server. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 3213 KiB  
Article
Cross-Day EEG-Based Emotion Recognition Using Transfer Component Analysis
by Zhongyang He, Ning Zhuang, Guangcheng Bao, Ying Zeng and Bin Yan
Electronics 2022, 11(4), 651; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11040651 - 19 Feb 2022
Cited by 11 | Viewed by 2915
Abstract
EEG-based emotion recognition can help achieve more natural human–computer interaction, but the temporal non-stationarity of EEG signals affects the robustness of EEG-based emotion recognition models. Most existing studies use emotional EEG data collected in the same trial to train and test models; once such a model is applied to data collected from the same subject at different times, its recognition accuracy decreases significantly. To address the problem of EEG-based cross-day emotion recognition, this paper constructs a database of emotional EEG signals collected over six days for each subject, using stimulus materials from the Chinese Affective Video System and a self-built video library; this is the largest number of recording days per subject so far. To study the cross-day neural patterns of emotions based on EEG signals, the brain topography was analyzed, which shows that there is a stable cross-day neural pattern of emotions. Then, the Transfer Component Analysis (TCA) algorithm is used to adaptively determine the optimal dimensionality of the TCA transformation and to match the domains of the best-correlated emotion features across multiple time domains, using EEG signals from different days. The experimental results show that the TCA-based domain adaptation strategy can effectively improve the accuracy of cross-day emotion recognition, by 3.55% and 2.34% in the classification of joy-sadness and joy-anger emotions, respectively. The emotion recognition model and brain topography in this paper verify that the database can provide a reliable data basis for emotion recognition across different time domains. This EEG database will be open to more researchers to promote the practical application of emotion recognition. Full article
(This article belongs to the Topic Machine and Deep Learning)
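A minimal linear-kernel sketch of Transfer Component Analysis, following the standard TCA formulation (Pan et al.): find components that minimize the maximum mean discrepancy between the two domains while preserving data variance, via a generalized eigenproblem. The paper's adaptive dimensionality selection and feature matching are not reproduced; the data below are synthetic stand-ins for day-to-day EEG features.

```python
# Hedged sketch: standard TCA with a linear kernel for cross-day adaptation.
import numpy as np
import scipy.linalg

def tca(Xs, Xt, dim=10, mu=1.0):
    X = np.vstack([Xs, Xt])
    ns, nt, n = len(Xs), len(Xt), len(Xs) + len(Xt)
    K = X @ X.T                                   # linear kernel
    e = np.vstack([np.full((ns, 1), 1 / ns), np.full((nt, 1), -1 / nt)])
    L = e @ e.T                                   # MMD coefficient matrix
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    # Transfer components: largest generalized eigenvectors of
    # (K H K) W = lambda (K L K + mu I) W.
    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    vals, vecs = scipy.linalg.eigh(B, A)
    W = vecs[:, np.argsort(-vals)[:dim]]
    Z = K @ W                                     # embedded source + target data
    return Z[:ns], Z[ns:]

rng = np.random.default_rng(0)
day1 = rng.normal(size=(80, 30))                  # EEG features, day 1
day2 = rng.normal(loc=0.5, size=(80, 30))         # shifted features, day 2
Zs, Zt = tca(day1, day2, dim=10)
print(Zs.shape, Zt.shape)                         # aligned 10-D representations
```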

17 pages, 2701 KiB  
Article
Exploring and Selecting Features to Predict the Next Outcomes of MLB Games
by Shu-Fen Li, Mei-Ling Huang and Yun-Zhi Li
Entropy 2022, 24(2), 288; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020288 - 17 Feb 2022
Cited by 6 | Viewed by 2338
Abstract
(1) Background and Objective: Major League Baseball (MLB) is one of the most popular international sports events worldwide. Many people are very interested in the related activities, and they are also curious about the outcome of the next game. Many factors affect the outcome of a baseball game, and it is very difficult to predict precisely. At present, relevant research predicts the outcome of the next game with accuracies between 55% and 62%. (2) Methods: This research collected MLB game data from 2015 to 2019 and organized a total of 30 datasets, one for each team, to predict the outcome of the next game. The prediction methods include a one-dimensional convolutional neural network (1DCNN) and three machine-learning methods, namely an artificial neural network (ANN), a support vector machine (SVM), and logistic regression (LR). (3) Results: Among the four prediction models, SVM obtains the highest prediction accuracies of 64.25% and 65.75% without and with feature selection, respectively; the best AUCs are 0.6495 and 0.6501, respectively. (4) Conclusions: This study used feature selection and optimized parameter combinations to increase the prediction performance to around 65%, which surpasses the prediction accuracies of the state-of-the-art works in the literature. Full article
(This article belongs to the Topic Machine and Deep Learning)
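The best-performing pipeline — feature selection followed by an SVM — has a compact scikit-learn expression, sketched below with synthetic data; the actual features, selector, and hyperparameters used by the authors are assumptions here.

```python
# Hedged sketch: feature selection + SVM for next-game outcome prediction.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 40))        # stand-in for per-game team statistics
y = rng.integers(0, 2, size=800)      # 1 = win, 0 = loss

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=15),     # keep the 15 most informative features
    SVC(kernel="rbf", C=1.0, probability=True),
)
clf.fit(X_tr, y_tr)
print(f"accuracy: {clf.score(X_te, y_te):.3f}")
```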

11 pages, 5834 KiB  
Article
Automatic Screening of Bolts with Anti-Loosening Coating Using Grad-CAM and Transfer Learning with Deep Convolutional Neural Networks
by Eunsol Noh and Seokmoo Hong
Appl. Sci. 2022, 12(4), 2029; https://0-doi-org.brum.beds.ac.uk/10.3390/app12042029 - 15 Feb 2022
Viewed by 2068
Abstract
Most electronic and automotive parts are affixed by bolts. To prevent such bolts from loosening under shock and vibration, an anti-loosening coating is applied to their threads. However, various defects can occur during the coating process. Since the quality of the anti-loosening coating is critical for the fastening force, bolts are consequently inspected optically and manually. It is difficult, however, to accurately screen coating defects owing to their various shapes and sizes. In this study, we applied deep learning to assess the coating quality of bolts with anti-loosening coating. Among the various convolutional neural network (CNN) methods, the VGG16 structure was employed. Furthermore, the gradient-weighted class activation mapping (Grad-CAM) visualization method was used to evaluate the training model, because a CNN, owing to its structure, cannot by itself reveal the classification criteria or the defect location. The results confirmed that external factors influence the classification. We therefore applied the region-of-interest method to classify the bolt thread only, and subsequently retrained the algorithm. Moreover, to reduce the learning time and improve the model performance, transfer learning and fine-tuning were employed. The proposed method for screening coating defects was applied to a screening device equipped with an actual conveyor belt, and the Modbus TCP protocol was used to transmit signals between a programmable logic controller and a personal computer. Using the proposed method, we were able to automatically detect coating defects that were missed by optical sorters. Full article
(This article belongs to the Topic Machine and Deep Learning)
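A sketch of Grad-CAM over a VGG16 backbone, the visualization step the abstract describes: average the gradients of the target class over the last convolutional feature map, weight the activations, and upsample the resulting heatmap. The bolt-image training and ROI cropping are not reproduced; ImageNet weights and a random input stand in.

```python
# Hedged sketch: Grad-CAM on VGG16 via forward/backward hooks.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

model = vgg16(weights="IMAGENET1K_V1").eval()
acts, grads = {}, {}

layer = model.features[28]            # last conv layer of VGG16
layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

img = torch.randn(1, 3, 224, 224)     # stand-in for a bolt-thread image
scores = model(img)
scores[0, scores.argmax()].backward()  # gradient of the predicted class

weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # global-average gradients
cam = F.relu((weights * acts["a"]).sum(dim=1))        # weighted activation map
cam = F.interpolate(cam.unsqueeze(1), size=(224, 224),
                    mode="bilinear", align_corners=False)
print(cam.shape)   # (1, 1, 224, 224) heatmap highlighting decision regions
```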

17 pages, 1818 KiB  
Article
TB-BCG: Topic-Based BART Counterfeit Generator for Fake News Detection
by Andrea Stevens Karnyoto, Chengjie Sun, Bingquan Liu and Xiaolong Wang
Mathematics 2022, 10(4), 585; https://0-doi-org.brum.beds.ac.uk/10.3390/math10040585 - 14 Feb 2022
Cited by 4 | Viewed by 2312
Abstract
Fake news has been spreading intentionally, misleading society into believing unconfirmed information; this phenomenon makes it challenging to identify fake news based on shared content. Fake news circulation is not only a current issue: it has been disseminated for centuries. Dealing with fake news is challenging because it spreads massively, so automatic fake news detection is urgently needed. We introduce TB-BCG, a Topic-Based BART Counterfeit Generator, to increase detection accuracy using deep learning. This approach plays an essential role in selecting impacted data rows and adding more training data. Our research implemented Latent Dirichlet Allocation (topic-based), Bidirectional and Auto-Regressive Transformers (BART), and cosine document similarity as the main tools for the Constraint@AAAI2021-COVID19 Fake News Detection shared-task dataset. This paper sets forth a simple yet powerful idea: select subsets of the dataset based on topic and sort them by distinctiveness, generate counterfeit training data using BART, and compare the counterfeit-generated text to the source text using cosine similarity. If the similarity between the counterfeit-generated text and the source text is more than 95%, the counterfeit-generated text is added to the dataset. To prove the stability of precision and the robustness under various amounts of training data, we used 30%, 50%, 80%, and 100% of the total dataset and trained it using a simple Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN). Compared to the baseline, our method improved the testing performance for both LSTM and CNN, and the yields differ only slightly. Full article
(This article belongs to the Topic Machine and Deep Learning)
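The generate-then-filter step can be sketched as follows. The public facebook/bart-large-cnn summarizer stands in for the paper's BART generator (an assumption: their fine-tuned model is not available here), and TF-IDF cosine similarity stands in for their cosine document similarity; the 0.95 acceptance threshold comes from the abstract.

```python
# Hedged sketch: BART-generated counterfeit text kept only if its cosine
# similarity to the source exceeds 0.95.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A generic text-to-text BART model as a stand-in generator.
generator = pipeline("summarization", model="facebook/bart-large-cnn")

source = ("Health officials confirmed that the new vaccine batch passed "
          "all safety checks before distribution began this week.")
candidate = generator(source, max_length=40, min_length=10)[0]["summary_text"]

vec = TfidfVectorizer().fit([source, candidate])
sim = cosine_similarity(vec.transform([source]), vec.transform([candidate]))[0, 0]

if sim > 0.95:                       # the paper's acceptance threshold
    print("accepted as augmentation:", candidate)
else:
    print(f"rejected (similarity {sim:.2f})")
```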

13 pages, 1909 KiB  
Article
Noise Modeling to Build Training Sets for Robust Speech Enhancement
by Yahui Wang, Wenxi Zhang, Zhou Wu, Xinxin Kong, Yongbiao Wang and Hongxin Zhang
Appl. Sci. 2022, 12(4), 1905; https://0-doi-org.brum.beds.ac.uk/10.3390/app12041905 - 11 Feb 2022
Cited by 1 | Viewed by 1328
Abstract
DNN-based speech enhancement (SE) models suffer from significant performance degradation on real recordings due to the mismatch between the synthetic datasets employed for training and real test sets. To solve this problem, we propose a new Generative Adversarial Network framework for Noise Modeling (NM-GAN) that creates realistic paired training sets by imitating the real noise distribution. The framework combines a novel 7-layer U-Net with two bidirectional long short-term memory (LSTM) layers that act as a generator to construct complex noise. Through adversarial and alternate training, NM-GAN generates samples with sufficient recall (diversity) and precision (noise quality), effectively simulating real noise, which is then utilized to compose realistic paired training sets. Extensive experiments employing various qualitative and quantitative evaluation metrics verify the effectiveness of the generated noise samples and training sets, demonstrating our framework’s capabilities. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 1106 KiB  
Article
PrimeNet: Adaptive Multi-Layer Deep Neural Structure for Enhanced Feature Selection in Early Convolution Stage
by Farhat Ullah Khan and Izzatdin Aziz
Appl. Sci. 2022, 12(4), 1842; https://0-doi-org.brum.beds.ac.uk/10.3390/app12041842 - 10 Feb 2022
Viewed by 1801
Abstract
The colossal depth of deep neural networks sometimes suffers from ineffective backpropagation of the gradients through all the layers, whereas the strong performance of shallower multilayer neural structures proves their ability to increase the gradient signals in the early stages of training, which are easily backpropagated for global loss corrections. Shallow neural structures are always a good starting point for encouraging sturdy feature characteristics of the input. In this research, a shallow deep neural structure called PrimeNet is proposed. PrimeNet aims to dynamically identify and encourage quality visual indicators from the input to be used by the subsequent deep network layers, and to increase the gradient signals in the lower stages of the training pipeline. In addition, layer-wise training is performed with the help of locally generated errors, which means the gradient is not backpropagated to previous layers and the hidden layer weights are updated during the forward pass, making this structure a backpropagation-free variant. PrimeNet has obtained state-of-the-art results on various image datasets, attaining the dual objective of (1) a compact dynamic deep neural structure that (2) eliminates the problem of backward locking. The PrimeNet unit is proposed as an alternative to traditional convolution and dense blocks for faster and memory-efficient training, outperforming previously reported results aimed at adaptive methods for parallel and multilayer deep neural systems. Full article
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 7683 KiB  
Article
Optimization of Intrusion Detection Systems Determined by Ameliorated HNADAM-SGD Algorithm
by Shyla Shyla, Vishal Bhatnagar, Vikram Bali and Shivani Bali
Electronics 2022, 11(4), 507; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11040507 - 09 Feb 2022
Cited by 9 | Viewed by 2139
Abstract
Information security is of pivotal concern for consistently streaming information over the widespread internetwork. The bottleneck flow of incoming and outgoing data traffic introduces the issue of malicious activities carried out by intruders, hackers and attackers in the form of authenticity obstruction, gridlocking data traffic, vandalizing data and crashing the established network. Emerging suspicious activities are managed by the domain of Intrusion Detection Systems (IDS). An IDS consistently monitors the network to identify suspicious activities and generates alarms and indications in the presence of malicious threats and worms. The performance of IDS is improved by using different machine learning algorithms. In this paper, the Nesterov-Accelerated Adaptive Moment Estimation–Stochastic Gradient Descent (HNADAM-SGD) algorithm is proposed to improve the performance of IDS. The algorithm optimizes IDS by hybridization and tuning of hyperparameters. The performance of the algorithm is compared with other classification algorithms, such as logistic regression, ridge classifier and ensemble algorithms, where the experimental analysis and computations show improved accuracy of 99.8%, sensitivity of 99.7%, and specificity of 99.5%. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 3098 KiB  
Article
Person Re-Identification via Pyramid Multipart Features and Multi-Attention Framework
by Randa Mohamed Bayoumi, Elsayed E. Hemayed, Mohammad Ehab Ragab and Magda B. Fayek
Big Data Cogn. Comput. 2022, 6(1), 20; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc6010020 - 09 Feb 2022
Cited by 2 | Viewed by 2994
Abstract
Video-based person re-identification has become quite attractive due to its importance in many vision surveillance problems. It is a challenging topic due to the inter/intra changes, occlusion, and pose variations involved. In this paper, we propose a pyramid-attentive framework that relies on multi-part features and multiple attention mechanisms to aggregate features at multiple levels and learn attention-based representations of persons through various aspects. Self-attention is used to strengthen the most discriminative features in the spatial and channel domains and hence capture robust global information. We propose the use of part-relation attention between different multi-granularities of feature representation to focus on learning appropriate local features. Temporal attention is used to aggregate temporal features. We integrate the most robust features in the global and multi-level views to build an effective convolutional neural network (CNN) model. The proposed model outperforms the previous state-of-the-art models on three datasets. Notably, it achieves 98.9% top-1 accuracy (a relative improvement of 2.7% over GRL) and 99.3% mAP on PRID2011, and 92.8% top-1 accuracy (a relative improvement of 2.4% over GRL) on iLIDS-VID. We also explore the generalization ability of our model in a cross-dataset setting. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 52311 KiB  
Article
Self-Supervised Noise Reduction in Low-Dose Cone Beam Computed Tomography (CBCT) Using the Randomly Dropped Projection Strategy
by Young-Joo Han and Ha-Jin Yu
Appl. Sci. 2022, 12(3), 1714; https://0-doi-org.brum.beds.ac.uk/10.3390/app12031714 - 07 Feb 2022
Cited by 1 | Viewed by 2249
Abstract
Deep learning-based denoising methods have proved efficient for medical imaging. Obtaining a three-dimensional representation of a scanned object, as in a computed tomography (CT) system, is essential. A sufficient radiation dose needs to be delivered to a scanned object to obtain a high-quality image. However, the radiation dose is insufficient in many cases due to hardware limitations or health care concerns. A deep learning-based denoising method can be a solution for obtaining good images even when the radiation dose is insufficient. However, most existing deep learning-based denoising methods require numerous paired low-dose CT (LDCT) and normal-dose CT (NDCT) images, and it is almost impossible to obtain numerous well-paired LDCT and NDCT images. Self-supervised denoising methods were proposed to train a denoising neural network on noisy images only. These methods can be applied to the projection domain in LDCT. However, denoising in the projection image domain is a challenging task, because the projection images for LDCT have extremely weak signals. To solve this problem, we propose a noise reduction method based on a dropped-projection strategy. The proposed method first reconstructs the 3D image from degraded versions of the projection images generated by Bernoulli sampling. Subsequently, the denoising neural network is trained to restore the signal dropped out by Bernoulli sampling in the projection image domain. As such, the proposed method solves the over-smoothing problem of previous methods and can be trained with a small amount of data. We verified the performance of our proposed method on the SPARE challenge dataset and an in-house lithium polymer dataset. The experiments on the two datasets show that the proposed method outperforms conventional denoising methods by at least 4.47 dB in PSNR. Full article
(This article belongs to the Topic Machine and Deep Learning)
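The self-supervised pair construction at the heart of the dropped-projection idea is easy to sketch: mask each projection with Bernoulli samples and supervise the network only on the dropped pixels. The reconstruction and the denoising network itself are out of scope here, and the keep probability is an assumption.

```python
# Hedged sketch: Bernoulli-dropped projections as self-supervised pairs.
import numpy as np

def bernoulli_pairs(projection: np.ndarray, keep_prob: float = 0.9):
    """Return (input, target, loss_mask): the net sees the degraded projection
    and is supervised only on the pixels that were dropped out."""
    mask = np.random.binomial(1, keep_prob, size=projection.shape)
    degraded = projection * mask / keep_prob   # unbiased in expectation
    return degraded, projection, 1 - mask

proj = np.random.poisson(lam=50, size=(64, 64)).astype(np.float64)  # noisy projection
x, target, loss_mask = bernoulli_pairs(proj)
# Training would minimize ((net(x) - target) * loss_mask) ** 2 over pixels.
print(x.shape, loss_mask.mean())
```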

24 pages, 10359 KiB  
Article
A New Hybrid Based on Long Short-Term Memory Network with Spotted Hyena Optimization Algorithm for Multi-Label Text Classification
by Hamed Khataei Maragheh, Farhad Soleimanian Gharehchopogh, Kambiz Majidzadeh and Amin Babazadeh Sangar
Mathematics 2022, 10(3), 488; https://0-doi-org.brum.beds.ac.uk/10.3390/math10030488 - 02 Feb 2022
Cited by 36 | Viewed by 4479
Abstract
An essential task in natural language processing is Multi-Label Text Classification (MLTC), whose purpose is to assign multiple labels to each document. Traditional text classification methods, such as machine learning, usually involve data scattering and fail to discover relationships between data. With the development of deep learning algorithms, many authors have used deep learning in MLTC. In this paper, a novel model called Spotted Hyena Optimizer–Long Short-Term Memory (SHO-LSTM) for MLTC, based on an LSTM network and the SHO algorithm, is proposed. In the LSTM network, the Skip-gram method is used to embed words into the vector space. The new model uses the SHO algorithm to optimize the initial weights of the LSTM network. Adjusting the weight matrix in LSTM is a major challenge: the more accurate the neuron weights, the higher the accuracy of the output. The SHO algorithm is a population-based meta-heuristic algorithm that works based on the mass hunting behavior of spotted hyenas. In this algorithm, each solution of the problem is coded as a hyena, and the hyenas then approach the optimal answer by following the leader hyena. Four datasets are used (RCV1-v2, EUR-Lex, Reuters-21578, and Bookmarks) to evaluate the proposed model. The assessments demonstrate that the proposed model has a higher accuracy rate than LSTM, Genetic Algorithm-LSTM (GA-LSTM), Particle Swarm Optimization-LSTM (PSO-LSTM), Artificial Bee Colony-LSTM (ABC-LSTM), Harmony Algorithm Search-LSTM (HAS-LSTM), and Differential Evolution-LSTM (DE-LSTM). The improvement in SHO-LSTM accuracy over LSTM on the four datasets is 7.52%, 7.12%, 1.92%, and 4.90%, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 1580 KiB  
Article
Remaining Useful Life Estimation of Aircraft Engines Using Differentiable Architecture Search
by Pengli Mao, Yan Lin, Song Xue and Baochang Zhang
Mathematics 2022, 10(3), 352; https://0-doi-org.brum.beds.ac.uk/10.3390/math10030352 - 24 Jan 2022
Cited by 3 | Viewed by 2317
Abstract
Prognostics and health management (PHM) applications can prevent engines from suffering potentially serious accidents by predicting the remaining useful life (RUL). Recently, data-driven methods have been widely used to solve RUL problems. The network architecture has a crucial impact on the empirical performance. However, most network architectures are designed manually, based on human experience and at a large cost of time. To address these challenges, we propose a neural architecture search (NAS) method based on gradient descent. In this study, we construct the search space with a directed acyclic graph (DAG), where a subgraph represents a network architecture. By using softmax relaxation, the search space becomes continuous and differentiable, so gradient descent can be used for optimization. Moreover, a partial channel connection method is introduced to accelerate the search efficiency. The experiment is conducted on the C-MAPSS dataset. In the data processing step, a fault detection method is proposed based on the k-means algorithm, which drops a large amount of valueless data and improves the estimation performance. The experimental results show that our method achieves superior performance, with the highest estimation accuracy compared with other popular studies. Full article
(This article belongs to the Topic Machine and Deep Learning)
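The k-means preprocessing step can be sketched as follows: cluster one engine's sensor trajectory, treat the cluster that dominates the final cycles as the faulty regime, and keep only the degradation portion for RUL training. Two clusters and the cluster-ordering heuristic are assumptions, and random data stands in for C-MAPSS sensors.

```python
# Hedged sketch: k-means-based fault detection to drop "healthy" cycles.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 0.1, size=(150, 14))          # stable early cycles
degrading = rng.normal(0.0, 0.1, size=(80, 14)) + \
    np.linspace(0, 2, 80)[:, None]                       # drifting sensor values
run = np.vstack([healthy, degrading])                    # one engine trajectory

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(run)
# Treat the cluster dominating the final cycles as the faulty regime.
fault_label = np.bincount(labels[-20:]).argmax()
fault_start = np.argmax(labels == fault_label)
print(f"estimated fault onset at cycle {fault_start}; "
      f"{len(run) - fault_start} cycles kept for RUL estimation")
```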

13 pages, 396 KiB  
Article
A Word-Granular Adversarial Attacks Framework for Causal Event Extraction
by Yu Zhao, Wanli Zuo, Shining Liang, Xiaosong Yuan, Yijia Zhang and Xianglin Zuo
Entropy 2022, 24(2), 169; https://doi.org/10.3390/e24020169 - 24 Jan 2022
Cited by 2 | Viewed by 2293
Abstract
As a data augmentation method, word masking is commonly used in many natural language processing tasks. However, most masking methods are based on rules and are not related to downstream tasks. In this paper, we propose a novel masked-word generator, named the Actor-Critic Mask Model (ACMM), which can adaptively adjust the masking strategy according to the performance of downstream tasks. To demonstrate the effectiveness of the method, we conducted experiments on two causal event extraction datasets. The results show that, compared with various rule-based masking methods, the masked sentences generated by our proposed method significantly enhance the generalization of the model and improve model performance. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 4654 KiB  
Article
Pedestrian Detection with Multi-View Convolution Fusion Algorithm
by Yuhong Liu, Chunyan Han, Lin Zhang and Xin Gao
Entropy 2022, 24(2), 165; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020165 - 22 Jan 2022
Cited by 5 | Viewed by 2749
Abstract
In recent years, pedestrian detection from a single 2D image has improved dramatically. However, when the scene becomes very crowded, detection performance deteriorates seriously and cannot meet the requirements of autonomous driving perception. With the introduction of multi-view methods, pedestrian detection in crowded or blurry scenes has improved significantly, and such methods are now widely used in autonomous driving. In this paper, we construct a double-branch feature fusion structure: the first branch adopts a lightweight structure, while the second branch further extracts features and obtains the feature map from each layer. At the same time, the receptive field is enlarged by dilated convolution. To improve the speed of the model, a keypoint is used instead of the entire object for regression, without an NMS post-processing operation, and the whole model can be learned end to end. Even in the presence of many people, the method still performs well in both accuracy and speed. On the standard Wildtrack and MultiviewX datasets, both the accuracy and the running speed are better than those of state-of-the-art models, which has great practical significance in the autonomous driving field. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 8363 KiB  
Article
Revisiting Local Descriptors via Frequent Pattern Mining for Fine-Grained Image Retrieval
by Min Zheng, Yangliao Geng and Qingyong Li
Entropy 2022, 24(2), 156; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020156 - 20 Jan 2022
Viewed by 2385
Abstract
Fine-grained image retrieval aims at searching for relevant images among fine-grained classes given a query. The main difficulty of this task derives from the small interclass distinction and the large intraclass variance of fine-grained images, posing severe challenges to methods that resort only to global or local features. In this paper, we propose a novel fine-grained image retrieval method that learns a global–local aware feature representation. Specifically, the global feature is extracted by selecting the most relevant deep descriptors. Meanwhile, we explore the intrinsic relationship of different parts via frequent pattern mining, thus obtaining a representative local feature. Further, an aggregation feature that learns the global–local aware feature representation is designed. Consequently, the discriminative ability among different fine-grained classes is enhanced. We evaluate the proposed method on five popular fine-grained datasets. Extensive experimental results demonstrate that the performance of fine-grained image retrieval is improved with the proposed global–local aware representation. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 325 KiB  
Article
Maximum Correntropy Criterion with Distributed Method
by Fan Xie, Ting Hu, Shixu Wang and Baobin Wang
Mathematics 2022, 10(3), 304; https://0-doi-org.brum.beds.ac.uk/10.3390/math10030304 - 19 Jan 2022
Cited by 2 | Viewed by 1147
Abstract
The Maximum Correntropy Criterion (MCC) has recently triggered enormous research activity in the engineering and machine learning communities, since it is robust when faced with heavy-tailed noise or outliers in practice. This work is interested in distributed MCC algorithms, based on a divide-and-conquer strategy, which can deal with big data efficiently. By establishing minimax-optimal error bounds, our results show that the averaged output function of this distributed algorithm can achieve convergence rates comparable to an algorithm processing the total data on one single machine. Full article
(This article belongs to the Topic Machine and Deep Learning)
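A sketch of the divide-and-conquer scheme: each machine fits its own MCC estimator on a disjoint data chunk and the estimators are averaged. The paper's analysis concerns kernel-based regression; a linear model fitted by iteratively reweighted least squares with a Gaussian (correntropy) weight keeps the sketch short, and the chunk count, bandwidth, and heavy-tailed noise model are assumptions.

```python
# Hedged sketch: distributed MCC regression via divide-and-conquer averaging.
import numpy as np

def mcc_linear_fit(X, y, sigma=1.0, iters=20):
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # least-squares start
    for _ in range(iters):
        r = y - X @ w
        a = np.exp(-r**2 / (2 * sigma**2))        # correntropy weights downweight outliers
        Xw = X * a[:, None]
        w = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(X.shape[1]), Xw.T @ y)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.standard_t(df=2, size=4000)

# Fit on m disjoint chunks, then average the estimators (for a linear model,
# averaging weights equals averaging the output functions).
m = 8
w_avg = np.mean([mcc_linear_fit(Xc, yc)
                 for Xc, yc in zip(np.array_split(X, m), np.array_split(y, m))],
                axis=0)
print(np.round(w_avg, 2))
```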
15 pages, 2020 KiB  
Article
Representation Learning for Dynamic Functional Connectivities via Variational Dynamic Graph Latent Variable Models
by Yicong Huang and Zhuliang Yu
Entropy 2022, 24(2), 152; https://0-doi-org.brum.beds.ac.uk/10.3390/e24020152 - 19 Jan 2022
Cited by 1 | Viewed by 2344
Abstract
Latent variable models (LVMs) for neural population spikes have revealed informative low-dimensional dynamics in neural data and have become powerful tools for analyzing and interpreting neural activity. However, these approaches are unable to determine the neurophysiological meaning of the inferred latent dynamics. On the other hand, emerging evidence suggests that dynamic functional connectivities (DFC) may be responsible for the neural activity patterns underlying cognition or behavior. We are interested in studying how DFC are associated with the low-dimensional structure of neural activities. Most existing LVMs are based on a point process and fail to model evolving relationships. In this work, we introduce a dynamic graph as the latent variable and develop a Variational Dynamic Graph Latent Variable Model (VDGLVM), a representation learning model based on the variational information bottleneck framework. VDGLVM utilizes a graph generative model and a graph neural network to capture dynamic communication between nodes that is not accessible from the observed data. The proposed computational model provides guaranteed behavior-decoding performance and improves LVMs by associating the inferred latent dynamics with probable DFC. Full article
(This article belongs to the Topic Machine and Deep Learning)

10 pages, 1181 KiB  
Article
The Status and Trend of Chinese News Forecast Based on Graph Convolutional Network Pooling Algorithm
by Xiao Han, Jing Peng, Tailai Peng, Rui Chen, Boyuan Hou, Xinran Xie and Zhe Cui
Appl. Sci. 2022, 12(2), 900; https://0-doi-org.brum.beds.ac.uk/10.3390/app12020900 - 17 Jan 2022
Viewed by 1395
Abstract
Predicting the development trend of news using pre-trained language models and graph neural networks has always been a hot issue in the intelligence analysis field. However, there are several problems in the existing research: (1) there are few Chinese datasets on this subject in academia and industry; and (2) existing pre-trained language models and graph classification algorithms cannot achieve satisfactory results. The method described in this paper addresses these problems. (1) We built a Chinese news database of more than 9000 news items annotated with time trends, filling the gap left by the absence of such databases. (2) We designed an improved method based on a pre-trained language model and a graph neural network pooling algorithm; in the graph pooling algorithm, the Graph U-Nets pooling method and self-attention are combined, which better supports forecasting the development trend of news events. The experimental results show that this method improves on baseline graph classification algorithms and also overcomes the inability of pre-trained language models to handle very long texts. Therefore, it can be concluded that our research has strong processing capabilities for analyzing and predicting the development trend of Chinese news events. Full article
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 6206 KiB  
Article
A Hybrid Driver Fatigue and Distraction Detection Model Using AlexNet Based on Facial Features
by Salma Anber, Wafaa Alsaggaf and Wafaa Shalash
Electronics 2022, 11(2), 285; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11020285 - 17 Jan 2022
Cited by 11 | Viewed by 3349
Abstract
Modern cities have imposed a fast-paced lifestyle in which more drivers on the road suffer from fatigue and sleep deprivation. Consequently, road accidents have increased, becoming one of the leading causes of injury and death among young adults and children. These accidents can be prevented if fatigue symptoms are diagnosed and detected sufficiently early. For this reason, we propose and compare two AlexNet CNN-based models to detect drivers’ fatigue behaviors, relying on head position and mouth movements as behavioral measures. We used two different approaches. The first approach is transfer learning, specifically fine-tuning AlexNet, which allowed us to take advantage of what the model had already learned without developing it from scratch; the newly trained model was able to predict drivers’ drowsiness behaviors. The second approach uses AlexNet to extract features by training the top layers of the network; these features were reduced using non-negative matrix factorization (NMF) and classified with a support vector machine (SVM) classifier. The experiments showed that our proposed transfer learning model achieved an accuracy of 95.7%, while the feature-extraction SVM-based model performed better, with an accuracy of 99.65%. Both models were trained on the simulated NTHU Driver Drowsiness Detection dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)
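A sketch of the second pipeline in outline — deep features reduced with non-negative matrix factorization (NMF) and classified with an SVM. Random non-negative data stands in for AlexNet activations, and the component count and SVM settings are assumptions.

```python
# Hedged sketch: NMF feature reduction followed by SVM classification.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.svm import SVC

rng = np.random.default_rng(0)
feats = rng.random((200, 4096))        # stand-in for non-negative CNN features
labels = rng.integers(0, 2, size=200)  # 1 = drowsy, 0 = alert

reduced = NMF(n_components=50, init="nndsvda", max_iter=400).fit_transform(feats)
clf = SVC(kernel="rbf").fit(reduced[:150], labels[:150])
print(f"held-out accuracy: {clf.score(reduced[150:], labels[150:]):.2f}")
```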

15 pages, 1641 KiB  
Article
Understanding Dilated Mathematical Relationship between Image Features and the Convolutional Neural Network’s Learnt Parameters
by Eyad Alsaghir, Xiyu Shi, Varuna De Silva and Ahmet Kondoz
Entropy 2022, 24(1), 132; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010132 - 16 Jan 2022
Cited by 1 | Viewed by 1628
Abstract
Deep learning, in general, is built on input data transformation and presentation, model training with parameter tuning, and recognition of new observations using the trained model. However, this comes with a high computation cost due to the extensive input database and the length of time required for training. Although the model learns its parameters from the transformed input data, no direct research has been conducted to investigate the mathematical relationship between the transformed information (i.e., features, excitation) and the model’s learnt parameters (i.e., weights). This research aims to explore the mathematical relationship between the input excitations and the weights of a trained convolutional neural network. The objective is to investigate three aspects of this assumed feature–weight relationship: (1) the mathematical relationship between the training input images’ features and the model’s learnt parameters, (2) the mathematical relationship between the image features of a separate test dataset and a trained model’s learnt parameters, and (3) the mathematical relationship between the difference of the training and testing image features and the model’s learnt parameters with a separate test dataset. The paper empirically demonstrates the existence of this mathematical relationship between the test image features and the model’s learnt weights through ANOVA analysis. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 675 KiB  
Article
A Lightweight Learning Method for Stochastic Configuration Networks Using Non-Inverse Solution
by Jing Nan, Zhonghua Jian, Chuanfeng Ning and Wei Dai
Electronics 2022, 11(2), 262; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11020262 - 14 Jan 2022
Viewed by 1314
Abstract
Stochastic configuration networks (SCNs) face time-consuming issues when dealing with complex modeling tasks that usually require a mass of hidden nodes to build an enormous network. An important reason behind this issue is that SCNs always employ the Moore–Penrose generalized inverse method, which has high complexity, to update the output weights in each increment. To tackle this problem, this paper proposes lightweight SCNs, called L-SCNs. First, to avoid using the Moore–Penrose generalized inverse method, a positive definite equation is proposed to replace the over-determined equation, and the consistency of their solutions is proved. Then, to reduce the complexity of calculating the output weights, a low-complexity method based on Cholesky decomposition is proposed. Experimental results on both benchmark function approximation and real-world problems, including regression and classification applications, show that L-SCNs are sufficiently lightweight. Full article
(This article belongs to the Topic Machine and Deep Learning)
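The core computational idea — replacing the pseudoinverse of the over-determined system with a positive-definite system solved by Cholesky decomposition — can be sketched directly. Here H is a stand-in hidden-layer output matrix; the small ridge term is an assumption to keep the normal-equation matrix positive definite.

```python
# Hedged sketch: Cholesky solve of the normal equations instead of pinv.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
H = np.tanh(rng.normal(size=(500, 60)))   # hidden outputs (n samples x L nodes)
y = rng.normal(size=500)                  # targets

# Over-determined system H beta ~= y replaced by the normal equations
# (H^T H + eps I) beta = H^T y, a positive-definite L x L system.
eps = 1e-8
A = H.T @ H + eps * np.eye(H.shape[1])
c, low = cho_factor(A)                    # O(L^3 / 3), cheaper than pinv(H)
beta = cho_solve((c, low), H.T @ y)

# Same solution as the pseudoinverse route, at lower cost:
beta_pinv = np.linalg.pinv(H) @ y
print(np.allclose(beta, beta_pinv, atol=1e-6))
```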

16 pages, 8083 KiB  
Article
MFAN: Multi-Level Features Attention Network for Fake Certificate Image Detection
by Yu Sun, Rongrong Ni and Yao Zhao
Entropy 2022, 24(1), 118; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010118 - 13 Jan 2022
Cited by 5 | Viewed by 2491
Abstract
Up to now, most forensics methods have paid more attention to natural content images. To expand the application of image forensics technology, forgery detection for certificate images, which can directly represent people’s rights and interests, is investigated in this paper. Variable tampered-region scales and diverse manipulation types are two typical characteristics of fake certificate images. To tackle this task, a novel method called Multi-level Feature Attention Network (MFAN) is proposed. MFAN is built following the encoder–decoder network structure. To extract features with rich scale information in the encoder, on the one hand, we employ Atrous Spatial Pyramid Pooling (ASPP) on the final layer of a pre-trained residual network to capture the contextual information at different scales; on the other hand, low-level features are concatenated to ensure sensitivity to small targets. Furthermore, the resulting multi-level features are recalibrated on channels to suppress irrelevant information and enhance the tampered regions, guiding the MFAN to adapt to diverse manipulation traces. In the decoder module, the attentive feature maps are convolved and upsampled to effectively generate the prediction mask. Experimental results indicate that the proposed method outperforms some state-of-the-art forensics methods. Full article
(This article belongs to the Topic Machine and Deep Learning)
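ASPP, the multi-scale component highlighted in the abstract, applies several dilated convolutions in parallel and merges their outputs. Below is an illustrative PyTorch sketch of such a block; the dilation rates and channel sizes are assumptions rather than the paper's exact configuration.

```python
# Sketch: an Atrous Spatial Pyramid Pooling (ASPP) block.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        # Parallel 3x3 convolutions with increasing dilation (receptive field)
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]        # same spatial size per branch
        return self.project(torch.cat(feats, dim=1)) # merge multi-scale context

aspp = ASPP(256, 64)
print(aspp(torch.randn(1, 256, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```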

29 pages, 5463 KiB  
Article
Novel Semantic-Based Probabilistic Context Aware Approach for Situations Enrichment and Adaptation
by Abderrahim Lakehal, Adel Alti and Philippe Roose
Appl. Sci. 2022, 12(2), 732; https://0-doi-org.brum.beds.ac.uk/10.3390/app12020732 - 12 Jan 2022
Cited by 2 | Viewed by 1196
Abstract
This paper aims at ensuring efficient recommendation. It proposes a new context-aware, semantic-based probabilistic situation injection and adaptation approach using an ontology and a Bayesian classifier. The idea is to predict the relevant situations for recommending the right services. Indeed, situations are correlated with the user’s context, which can therefore be considered in designing a recommendation approach to enhance relevancy while reducing execution time. In the proposed solution, four probability-based context-rule situation items (the user’s location and time, the user’s role, and their preferences and experiences) are chosen as inputs to predict the user’s situations. Subsequently, a weighted linear combination is applied to calculate the similarity of rule items. The highest-scoring items are used to identify the relevant user situations. Three context parameters (CPU speed, sensor availability, and RAM size) of the current devices are used to ensure adaptive service recommendation. Experimental results show that the proposed approach improves the accuracy rate when the number of situation rules is high. A comparison with existing recommendation approaches shows that the proposed approach is more efficient and decreases the execution time. Full article
(This article belongs to the Topic Machine and Deep Learning)

10 pages, 2240 KiB  
Article
Reducing Parameters of Neural Networks via Recursive Tensor Approximation
by Kyuahn Kwon and Jaeyong Chung
Electronics 2022, 11(2), 214; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11020214 - 11 Jan 2022
Cited by 2 | Viewed by 1400
Abstract
Large-scale neural networks have attracted much attention for surprising results in various cognitive tasks such as object detection and image classification. However, the large number of weight parameters in the complex networks can be problematic when the models are deployed to embedded systems. In addition, the problems are exacerbated in emerging neuromorphic computers, where each weight parameter is stored within a synapse, the primary computational resource of the bio-inspired computers. We describe an effective way of reducing the parameters by a recursive tensor factorization method. Applying the singular value decomposition in a recursive manner decomposes a tensor that represents the weight parameters. Then, the tensor is approximated by algorithms minimizing the approximation error and the number of parameters. This process factorizes a given network, yielding a deeper, less dense, and weight-shared network with good initial weights, which can be fine-tuned by gradient descent. Full article
(This article belongs to the Topic Machine and Deep Learning)
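The basic step that such a recursive factorization applies repeatedly, approximating a weight matrix by a truncated SVD so that one dense layer becomes two smaller ones, can be sketched as follows; the matrix size and retained rank are illustrative assumptions. A recursive scheme would factor the resulting factors again (or reshape them into higher-order tensors), trading approximation error against parameter count before fine-tuning.

```python
# Sketch: truncated-SVD factorization of a dense weight matrix.
import numpy as np

W = np.random.default_rng(1).standard_normal((512, 512))  # stand-in weights
U, s, Vt = np.linalg.svd(W, full_matrices=False)

r = 64                              # retained rank (hypothetical choice)
W1 = U[:, :r] * s[:r]               # first factor: 512 x r
W2 = Vt[:r, :]                      # second factor: r x 512

params_before = W.size
params_after = W1.size + W2.size    # 4x fewer parameters at rank 64
rel_err = np.linalg.norm(W - W1 @ W2) / np.linalg.norm(W)
print(params_before, params_after, round(rel_err, 3))
```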

16 pages, 2334 KiB  
Article
Multipointer Coattention Recommendation with Gated Neural Fusion between ID Embedding and Reviews
by Jianjie Shao, Jiwei Qin, Wei Zeng and Jiong Zheng
Appl. Sci. 2022, 12(2), 594; https://0-doi-org.brum.beds.ac.uk/10.3390/app12020594 - 08 Jan 2022
Cited by 2 | Viewed by 1515
Abstract
Recently, the interaction information from reviews has been modeled to acquire representations of users and items and to alleviate the sparsity problem in recommendation systems. Reviews are a rich source of information about users’ preferences for the different aspects and attributes of items. However, how to better construct the representations of users (items) still needs further research. Inspired by the interaction information from reviews, auxiliary ID embedding information is used to further enrich the word-level representation in the proposed model, named MPCAR. In this paper, first, a multipointer learning scheme is adopted to extract the most informative reviews from user and item reviews and to represent users (items) in a word-by-word manner. Then, users and items are embedded to extract the ID embeddings that can reveal the identity of users (items). Finally, the review features and ID embeddings are input to a gated neural network for effective fusion to obtain richer representations of users and items. We randomly select ten subcategory datasets from the Amazon dataset to evaluate our algorithm. The experimental results show that our algorithm achieves the best results compared to other recommendation approaches. Full article
(This article belongs to the Topic Machine and Deep Learning)
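The gated fusion of review features with ID embeddings can be pictured with a small PyTorch module; the sketch below shows one common gating formulation under assumed dimensions, not necessarily the exact MPCAR design.

```python
# Sketch: gated fusion of two feature vectors (review features, ID embedding).
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)  # gate computed from both inputs

    def forward(self, review_feat, id_embed):
        g = torch.sigmoid(self.gate(torch.cat([review_feat, id_embed], dim=-1)))
        return g * review_feat + (1 - g) * id_embed  # element-wise gated mix

fusion = GatedFusion(64)
out = fusion(torch.randn(8, 64), torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 64])
```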

16 pages, 6863 KiB  
Article
Domain-Specific On-Device Object Detection Method
by Seongju Kang, Jaegi Hwang and Kwangsue Chung
Entropy 2022, 24(1), 77; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010077 - 01 Jan 2022
Cited by 1 | Viewed by 2023
Abstract
Object detection is a significant task in computer vision, and various approaches have been proposed to detect varied objects using deep neural networks (DNNs). However, because DNNs are computation-intensive, it is difficult to apply them to resource-constrained devices. Here, we propose an on-device object detection method using domain-specific models. In the proposed method, we define object of interest (OOI) groups that contain objects with a high frequency of appearance in specific domains. Compared with the existing DNN model, the layers of the domain-specific models are shallower and narrower, reducing the number of trainable parameters and thus speeding up object detection. To ensure a lightweight network design, we combine various network structures to obtain the best-performing lightweight detection model. The experimental results reveal that the size of the proposed lightweight model is 21.7 MB, which is 91.35% and 36.98% smaller than those of YOLOv3-SPP and Tiny-YOLO, respectively. The f-measures achieved on the MS COCO 2017 dataset were 18.3%, 11.9%, and 20.3% higher than those of YOLOv3-SPP, Tiny-YOLO, and YOLO-Nano, respectively. The results demonstrate that the lightweight model achieves higher efficiency and better performance on non-GPU devices, such as mobile devices and embedded boards, than conventional models. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 3682 KiB  
Article
SD-UNet: A Novel Segmentation Framework for CT Images of Lung Infections
by Shuangcai Yin, Hongmin Deng, Zelin Xu, Qilin Zhu and Junfeng Cheng
Electronics 2022, 11(1), 130; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11010130 - 01 Jan 2022
Cited by 29 | Viewed by 3538
Abstract
Due to the outbreak of lung infections caused by the coronavirus disease (COVID-19), humans are facing an unprecedented and devastating global health crisis. Since chest computed tomography (CT) images of COVID-19 patients contain abundant pathological features closely related to this disease, rapid detection and diagnosis based on CT images are of great significance for the treatment of patients and for blocking the spread of the disease. In particular, segmentation of the COVID-19 CT lung-infected area can quantify and evaluate the severity of the disease. However, due to the blurred boundaries and low contrast between the infected and non-infected areas in COVID-19 CT images, manual segmentation of the COVID-19 lesion is laborious and places high demands on the operator. Quick and accurate segmentation of COVID-19 lesions from CT images based on deep learning has drawn increasing attention. To effectively improve the segmentation of COVID-19 lung infections, a modified UNet network that combines the squeeze-and-attention (SA) and dense atrous spatial pyramid pooling (Dense ASPP) modules (SD-UNet) is proposed, fusing global context and multi-scale information. Specifically, the SA module is introduced to strengthen the attention of pixel grouping and fully exploit global context information, allowing the network to better mine the differences and connections between pixels. The Dense ASPP module is utilized to capture the multi-scale information of COVID-19 lesions. Moreover, to eliminate the interference of background noise outside the lungs and highlight the texture features of the lung lesion area, we extract the lung area from the CT images in advance during the pre-processing stage. Finally, we evaluate our method using the binary-class and multi-class COVID-19 lung infection segmentation datasets. The experimental results show that the Sensitivity, Dice Similarity Coefficient, Accuracy, Specificity, and Jaccard Similarity are 0.8988 (0.6169), 0.8696 (0.5936), 0.9906 (0.9821), 0.9932 (0.9907), and 0.7702 (0.4788), respectively, for the binary-class (multi-class) segmentation task with the proposed SD-UNet. The COVID-19 lung infection area segmented by SD-UNet is closer to the ground truth than that of several existing models, such as CE-Net, DeepLab v3+, UNet++, and other models, which further demonstrates that our method achieves more accurate segmentation. It has the potential to assist doctors in making a more accurate and rapid diagnosis and a quantitative assessment of COVID-19. Full article
(This article belongs to the Topic Machine and Deep Learning)

26 pages, 4072 KiB  
Article
Realtime Emotional Reflective User Interface Based on Deep Convolutional Neural Networks and Generative Adversarial Networks
by Holly Burrows, Javad Zarrin, Lakshmi Babu-Saheer and Mahdi Maktab-Dar-Oghaz
Electronics 2022, 11(1), 118; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11010118 - 31 Dec 2021
Cited by 5 | Viewed by 2556
Abstract
It is becoming increasingly apparent that a significant proportion of the population suffers from mental health problems, such as stress, depression, and anxiety. These issues result from a vast range of factors, such as genetic conditions, social circumstances, and lifestyle influences. A key cause, or contributor, for many people is their work; a poor mental state can be exacerbated by jobs and a person’s working environment. Additionally, as the information age continues to burgeon, people are increasingly sedentary in their working lives, spending more of their days seated and less time moving around. It is well known that a decrease in physical activity is detrimental to mental well-being; therefore, innovative research and development to combat negativity early is required. Implementing solutions using Artificial Intelligence has great potential in this field of research. This work proposes a solution to this problem domain, utilising two concepts of Artificial Intelligence, namely Convolutional Neural Networks and Generative Adversarial Networks. A CNN is trained to accurately predict when an individual is experiencing negative emotions, achieving a top accuracy of 80.38% with a loss of 0.42. A GAN is trained to synthesise images from an input domain that can be attributed to evoking positive emotions. A Graphical User Interface is created to display the generated media to users in order to boost mood and reduce feelings of stress. The work demonstrates the capability of using Deep Learning to identify stress and negative mood, and the strategies that can be implemented to reduce them. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 3146 KiB  
Article
Log Sequence Anomaly Detection Method Based on Contrastive Adversarial Training and Dual Feature Extraction
by Qiaozheng Wang, Xiuguo Zhang, Xuejie Wang and Zhiying Cao
Entropy 2022, 24(1), 69; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010069 - 30 Dec 2021
Cited by 8 | Viewed by 2240
Abstract
The log messages generated in a system reflect the state of the system at all times. Autonomous detection of abnormalities in log messages can help operators find abnormalities in time and provides a basis for analyzing their causes. First, this paper proposes a log sequence anomaly detection method based on contrastive adversarial training and dual feature extraction. This method uses BERT (Bidirectional Encoder Representations from Transformers) and a VAE (Variational Auto-Encoder) to extract the semantic and statistical features of the log sequence, respectively, and the dual features are combined to perform anomaly detection on the log sequence, with a novel contrastive adversarial training method also used to train the model. In addition, this paper introduces the method of obtaining the statistical features of a log sequence and the method of combining semantic features with statistical features. Furthermore, the specific process of contrastive adversarial training is described. Finally, an experimental comparison is carried out, and the results show that the method in this paper outperforms the compared log sequence anomaly detection methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 828 KiB  
Article
RALR: Random Amplify Learning Rates for Training Neural Networks
by Jiali Deng, Haigang Gong, Minghui Liu, Tianshu Xie, Xuan Cheng, Xiaomin Wang and Ming Liu
Appl. Sci. 2022, 12(1), 268; https://0-doi-org.brum.beds.ac.uk/10.3390/app12010268 - 28 Dec 2021
Viewed by 1700
Abstract
It has been shown that the learning rate is one of the most critical hyper-parameters for the overall performance of deep neural networks. In this paper, we propose a new method for setting the global learning rate, named random amplify learning rates (RALR), to improve the performance of any optimizer in training deep neural networks. Instead of monotonically decreasing the learning rate, we aim to escape saddle points or local minima by amplifying the learning rate between reasonable boundary values with a given probability. Training with RALR rather than a conventionally decreasing learning rate achieves further improvements in network performance without extra cost. Remarkably, RALR is complementary to state-of-the-art data augmentation and regularization methods. In addition, we empirically study its performance on image classification tasks, fine-grained classification tasks, object detection tasks, and machine translation tasks. Experiments demonstrate that RALR can bring a notable improvement while preventing overfitting when training deep neural networks. For example, the classification accuracy of ResNet-110 trained on the CIFAR-100 dataset using RALR achieves a 1.34% gain compared with ResNet-110 trained traditionally. Full article
(This article belongs to the Topic Machine and Deep Learning)
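The schedule described above, occasionally amplifying rather than monotonically decaying the learning rate, might look roughly like the following sketch; the probability, amplification range, and decay factor are hypothetical values, not the paper's settings.

```python
# Sketch: a decayed learning-rate schedule with random amplification.
import random

def ralr_step(base_lr, epoch, p=0.1, amp_range=(2.0, 5.0), decay=0.95):
    lr = base_lr * (decay ** epoch)          # conventional decayed schedule
    if random.random() < p:                  # occasionally amplify, aiming to
        lr *= random.uniform(*amp_range)     # escape saddle points / local minima
    return lr

for epoch in range(5):
    print(epoch, round(ralr_step(0.1, epoch), 5))
```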

16 pages, 2429 KiB  
Article
Simultaneously Improve Transferability and Discriminability for Adversarial Domain Adaptation
by Ting Xiao, Cangning Fan, Peng Liu and Hongwei Liu
Entropy 2022, 24(1), 44; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010044 - 27 Dec 2021
Cited by 3 | Viewed by 2477
Abstract
Although adversarial domain adaptation enhances feature transferability, the feature discriminability will be degraded in the process of adversarial learning. Moreover, most domain adaptation methods only focus on distribution matching in the feature space; however, shifts in the joint distributions of input features and output labels linger in the network, and thus, the transferability is not fully exploited. In this paper, we propose a matrix rank embedding (MRE) method to enhance feature discriminability and transferability simultaneously. MRE restores a low-rank structure for data in the same class and enforces a maximum separation structure for data in different classes. In this manner, the variations within the subspace are reduced, and the separation between the subspaces is increased, resulting in improved discriminability. In addition to statistically aligning the class-conditional distribution in the feature space, MRE forces the data of the same class in different domains to exhibit an approximate low-rank structure, thereby aligning the class-conditional distribution in the label space, resulting in improved transferability. MRE is computationally efficient and can be used as a plug-and-play term for other adversarial domain adaptation networks. Comprehensive experiments demonstrate that MRE can advance state-of-the-art domain adaptation methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

26 pages, 9175 KiB  
Article
Reliable Fault Diagnosis of Bearings Using an Optimized Stacked Variational Denoising Auto-Encoder
by Xiaoan Yan, Yadong Xu, Daoming She and Wan Zhang
Entropy 2022, 24(1), 36; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010036 - 24 Dec 2021
Cited by 18 | Viewed by 3598
Abstract
Variational auto-encoders (VAEs) have recently been successfully applied in the intelligent fault diagnosis of rolling bearings due to their self-learning ability and robustness. However, the hyper-parameters of VAEs depend, to a significant extent, on artificial settings, which is regarded as a common and key problem in existing deep learning models. Additionally, their anti-noise capability may decline when VAEs are used to analyze bearing vibration data under strong environmental noise. Therefore, in order to improve the anti-noise performance of the VAE model and adaptively select its parameters, this paper proposes an optimized stacked variational denoising autoencoder (OSVDAE) for the reliable fault diagnosis of bearings. Within the proposed method, a robust network, named the variational denoising auto-encoder (VDAE), is first designed by integrating a VAE and a denoising auto-encoder (DAE). Subsequently, a stacked variational denoising auto-encoder (SVDAE) architecture is constructed to extract robust and discriminative latent fault features by stacking VDAE networks layer by layer, wherein the important parameters of the SVDAE model are automatically determined by employing a novel meta-heuristic intelligent optimizer known as the seagull optimization algorithm (SOA). Finally, the extracted latent features are fed into a softmax classifier to obtain the fault recognition results for rolling bearings. Experiments are conducted to validate the effectiveness of the proposed method. The results indicate that the proposed method not only achieves high identification accuracy for different bearing health conditions, but also outperforms some representative deep learning methods. Full article
(This article belongs to the Topic Machine and Deep Learning)

28 pages, 4149 KiB  
Article
Sparse Density Estimation with Measurement Errors
by Xiaowei Yang, Huiming Zhang, Haoyu Wei and Shouzheng Zhang
Entropy 2022, 24(1), 30; https://0-doi-org.brum.beds.ac.uk/10.3390/e24010030 - 24 Dec 2021
Viewed by 2140
Abstract
This paper aims to estimate an unknown density of data with measurement errors as a linear combination of functions from a dictionary. The main novelty is the proposal and investigation of the corrected sparse density estimator (CSDE). Inspired by the penalization approach, we propose the weighted Elastic-net penalized minimal L2-distance method for sparse coefficient estimation, where the adaptive weights come from sharp concentration inequalities. The optimal weighted tuning parameters are obtained from first-order conditions that hold with high probability. Under local coherence or minimal eigenvalue assumptions, non-asymptotic oracle inequalities are derived. These theoretical results are transposed to obtain support recovery with high probability. Numerical experiments for discrete and continuous distributions confirm the significant improvement obtained by our procedure when compared with other conventional approaches. Finally, an application is performed on a meteorology dataset. It shows that our method is effective and superior in detecting multi-modal density shapes compared with other conventional approaches. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 582 KiB  
Article
Diagnostic Evaluation of Policy-Gradient-Based Ranking
by Hai-Tao Yu, Degen Huang, Fuji Ren and Lishuang Li
Electronics 2022, 11(1), 37; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11010037 - 23 Dec 2021
Cited by 3 | Viewed by 2247
Abstract
Learning-to-rank has been intensively studied and has shown significant value in a wide range of domains, such as web search, recommender systems, dialogue systems, machine translation, and even computational biology, to name a few. In light of recent advances in neural networks, there has been a strong and continuing interest in exploring how to deploy popular techniques, such as reinforcement learning and adversarial learning, to solve ranking problems. However, armed with the aforesaid popular techniques, most studies tend to show how effective a new method is. A comprehensive comparison between techniques and an in-depth analysis of their deficiencies are somehow overlooked. This paper is motivated by the observation that recent ranking methods based on either reinforcement learning or adversarial learning boil down to policy-gradient-based optimization. Based on widely used benchmark collections with complete information (where relevance labels are known for all items), such as MSLRWEB30K and Yahoo-Set1, we thoroughly investigate the extent to which policy-gradient-based ranking methods are effective. On the one hand, we analytically identify the pitfalls of policy-gradient-based ranking. On the other hand, we experimentally compare a wide range of representative methods. The experimental results echo our analysis and show that policy-gradient-based ranking methods are, by a large margin, inferior to many conventional ranking methods. Regardless of whether we use reinforcement learning or adversarial learning, the failures are largely attributable to gradient estimation based on sampled rankings, which diverge significantly from ideal rankings. In particular, the larger the number of documents per query and the more fine-grained the ground-truth labels, the greater the degradation policy-gradient-based ranking suffers. Careful examination of this weakness is highly recommended for developing enhanced methods based on policy gradients. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 7370 KiB  
Article
Image Preprocessing Method in Radiographic Inspection for Automatic Detection of Ship Welding Defects
by Gwang-ho Yun, Sang-jin Oh and Sung-chul Shin
Appl. Sci. 2022, 12(1), 123; https://0-doi-org.brum.beds.ac.uk/10.3390/app12010123 - 23 Dec 2021
Cited by 7 | Viewed by 3023
Abstract
Welding defects must be inspected to verify that the welds meet the requirements of ship welded joints, and among the nondestructive inspection methods for welding defects, radiographic inspection is widely applied during the production process. To perform nondestructive inspection, the completed weldment must be transported to the nondestructive inspection station, which is expensive; consequently, automation of welding defect detection is required. Recently, continuous attempts have been made at several companies’ processing sites to incorporate deep learning to detect defects more accurately. Preprocessing of welding defects in radiographic inspection images should be prioritized in order to automatically detect welding defects using deep learning during radiographic nondestructive inspection. In this study, by analyzing the pixel values, we developed an image preprocessing method that can integrate the defect features. After maximizing the contrast between the defect and the background in radiographic images through CLAHE (contrast-limited adaptive histogram equalization), denoising (noise removal), thresholding (threshold processing), and concatenation were performed sequentially. The improvement in detection performance due to preprocessing was verified by comparing the results of applying the algorithm to raw images, typically preprocessed images, and images preprocessed with the proposed method. The mAP values for the training and test data were 84.9% and 51.2% for the model trained on images preprocessed with the proposed method, 82.0% and 43.5% for the model trained on typically preprocessed images, and 78.0% and 40.8% for the model trained on raw images. Object detection algorithm technology improves every year, with mAP gains of approximately 3% to 10%; this study achieved a comparable performance improvement through data preprocessing alone. Full article
(This article belongs to the Topic Machine and Deep Learning)
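The preprocessing chain named in the abstract (CLAHE, then denoising, thresholding, and concatenation) can be approximated with OpenCV as below; the parameter values and the synthetic input image are illustrative assumptions, not the paper's settings.

```python
# Sketch: CLAHE -> denoising -> Otsu thresholding -> channel concatenation.
import cv2
import numpy as np

# Synthetic stand-in for a grayscale radiograph (uint8, 256 x 256).
img = np.random.default_rng(0).integers(0, 256, (256, 256), dtype=np.uint8)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)                        # boost defect/background contrast
denoised = cv2.fastNlMeansDenoising(enhanced, h=10)
_, binary = cv2.threshold(denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

stacked = np.dstack([enhanced, denoised, binary])  # concatenated detector input
print(stacked.shape)  # (256, 256, 3)
```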

11 pages, 406 KiB  
Article
You Don’t Need Labeled Data for Open-Book Question Answering
by Sia Gholami and Mehdi Noori
Appl. Sci. 2022, 12(1), 111; https://0-doi-org.brum.beds.ac.uk/10.3390/app12010111 - 23 Dec 2021
Cited by 6 | Viewed by 2049
Abstract
Open-book question answering is a subset of question answering (QA) tasks in which the system aims to find answers in a given set of documents (the open book) and in common knowledge about a topic. This article proposes a solution for answering natural language questions from a corpus of Amazon Web Services (AWS) technical documents with no domain-specific labeled data (zero-shot). These questions have a yes–no–none answer and a text answer, which can be short (a few words) or long (a few sentences). We present a two-step, retriever–extractor architecture in which a retriever finds the right documents and an extractor finds the answers in the retrieved documents. To test our solution, we introduce a new dataset for open-book QA based on real customer questions on AWS technical documentation. In this paper, we conducted experiments on several information retrieval systems and extractive language models, attempting to find the yes–no–none answers and text answers in the same pass. Our custom-built extractor model is created from a pretrained language model and fine-tuned on the Stanford Question Answering Dataset (SQuAD) and Natural Questions datasets. We were able to achieve 42% F1 and a 39% exact match (EM) score end-to-end with no domain-specific training. Full article
(This article belongs to the Topic Machine and Deep Learning)
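A two-step retriever–extractor pipeline of this general kind can be prototyped quickly; the sketch below uses a simple TF-IDF retriever over toy documents, with the extractor step left as a comment since it would require a fine-tuned QA model. All document and question strings are invented stand-ins.

```python
# Sketch: retriever step of a retriever-extractor QA pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["S3 buckets are private by default.",
        "EC2 instances run virtual servers."]        # toy document corpus
question = "Are S3 buckets public by default?"

vec = TfidfVectorizer().fit(docs + [question])
scores = cosine_similarity(vec.transform([question]), vec.transform(docs))[0]
best_doc = docs[scores.argmax()]                      # retriever picks a document
print(best_doc)
# Extractor step: a fine-tuned extractive QA model would read `best_doc`
# and return the yes-no-none answer plus a text span.
```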

15 pages, 4542 KiB  
Article
Robust Pedestrian Detection Based on Multi-Spectral Image Fusion and Convolutional Neural Networks
by Xu Chen, Lei Liu and Xin Tan
Electronics 2022, 11(1), 1; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics11010001 - 21 Dec 2021
Cited by 18 | Viewed by 2544
Abstract
Nowadays, pedestrian detection is widely used in fields such as driving assistance and video surveillance as technology progresses. However, although research on single-modal visible-light pedestrian detection is mature, it is still not sufficient to meet the demand for pedestrian detection at all times. Thus, a multi-spectral pedestrian detection method via image fusion and convolutional neural networks is proposed in this paper. The infrared intensity distribution and visible appearance features are retained with a total variation model based on local structure transfer, and pedestrian detection is realized with the multi-spectral fusion results and the target detection network YOLOv3. The detection performance of the proposed method is evaluated and compared with detection methods based on four other pixel-level fusion algorithms and two fusion network architectures. The results attest that our method has superior detection performance, robustly detecting pedestrian targets even under harsh illumination conditions and in cluttered backgrounds. Full article
(This article belongs to the Topic Machine and Deep Learning)

25 pages, 406 KiB  
Article
Exact Learning Augmented Naive Bayes Classifier
by Shouta Sugahara and Maomi Ueno
Entropy 2021, 23(12), 1703; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121703 - 20 Dec 2021
Cited by 18 | Viewed by 2707
Abstract
Earlier studies have shown that classification accuracies of Bayesian networks (BNs) obtained by maximizing the conditional log likelihood (CLL) of a class variable, given the feature variables, were higher than those obtained by maximizing the marginal likelihood (ML). However, differences between the performances of the two scores in the earlier studies may be attributed to the fact that they used approximate learning algorithms, not exact ones. This paper compares the classification accuracies of BNs with approximate learning using CLL to those with exact learning using ML. The results demonstrate that the classification accuracies of BNs obtained by maximizing the ML are higher than those obtained by maximizing the CLL for large data. However, the results also demonstrate that the classification accuracies of exact learning BNs using the ML are much worse than those of other methods when the sample size is small and the class variable has numerous parents. To resolve the problem, we propose an exact learning augmented naive Bayes classifier (ANB), which ensures a class variable with no parents. The proposed method is guaranteed to asymptotically estimate the identical class posterior to that of the exactly learned BN. Comparison experiments demonstrated the superior performance of the proposed method. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 1689 KiB  
Article
Continuous Viewpoint Planning in Conjunction with Dynamic Exploration for Active Object Recognition
by Haibo Sun, Feng Zhu, Yanzi Kong, Jianyu Wang and Pengfei Zhao
Entropy 2021, 23(12), 1702; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121702 - 20 Dec 2021
Cited by 2 | Viewed by 2114
Abstract
Active object recognition (AOR) aims at collecting additional information to improve recognition performance by purposefully adjusting the viewpoint of an agent. How to determine the next best viewpoint of the agent, i.e., viewpoint planning (VP), is a research focus. Most existing VP methods perform viewpoint exploration in a discrete viewpoint space, which requires sampling the viewpoint space and may introduce significant quantization error. To address this challenge, a continuous VP approach for AOR based on reinforcement learning is proposed. Specifically, we use two separate neural networks to model the VP policy as a parameterized Gaussian distribution and resort to the proximal policy optimization framework to learn the policy. Furthermore, an adaptive entropy regularization based dynamic exploration scheme is presented to automatically adjust the viewpoint exploration ability during learning. In the end, experimental results on the public dataset GERMS well demonstrate the superiority of our proposed VP method. Full article
(This article belongs to the Topic Machine and Deep Learning)
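Modeling a continuous viewpoint policy as a parameterized Gaussian with two separate networks, one for the mean and one for the (log) standard deviation, might look like the PyTorch sketch below; the observation and action dimensions are assumptions, not the paper's setup.

```python
# Sketch: a Gaussian policy head for continuous viewpoint planning.
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        # Two separate networks parameterize the Gaussian action distribution
        self.mu = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
        self.log_std = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

    def forward(self, obs):
        dist = torch.distributions.Normal(self.mu(obs), self.log_std(obs).exp())
        action = dist.sample()                        # continuous viewpoint change
        return action, dist.log_prob(action).sum(-1), dist.entropy().sum(-1)

policy = GaussianPolicy(obs_dim=128, act_dim=2)
a, logp, ent = policy(torch.randn(4, 128))
print(a.shape, logp.shape)  # torch.Size([4, 2]) torch.Size([4])
# `logp` feeds the PPO objective; `ent` supports entropy regularization.
```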

16 pages, 690 KiB  
Article
Improving Multi-Label Learning by Correlation Embedding
by Jun Huang, Qian Xu, Xiwen Qu, Yaojin Lin and Xiao Zheng
Appl. Sci. 2021, 11(24), 12145; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412145 - 20 Dec 2021
Cited by 4 | Viewed by 2250
Abstract
In multi-label learning, each object is represented by a single instance and is associated with more than one class label, where the labels might be correlated with each other. It is well known that exploiting label correlations can improve the performance of a multi-label classification model. Existing methods mainly model label correlations in an indirect way, i.e., adding extra constraints on the coefficients or outputs of a model based on a pre-learned label correlation graph. Meanwhile, the high dimension of the feature space also poses great challenges to multi-label learning, such as high time and memory costs. To solve the above-mentioned issues, in this paper, we propose a new approach for Multi-Label Learning by Correlation Embedding, namely MLLCE, where feature space dimension reduction and multi-label classification are integrated into a unified framework. Specifically, we project the original high-dimensional feature space to a low-dimensional latent space via a mapping matrix. To model label correlations, we learn an embedding matrix from the pre-defined label correlation graph by graph embedding. Then, we construct a multi-label classifier from the low-dimensional latent feature space to the label space, where the embedding matrix is utilized as the model coefficients. Finally, we extend the proposed method MLLCE to a nonlinear version, i.e., NL-MLLCE. A comparison with state-of-the-art approaches shows that the proposed method MLLCE achieves competitive performance in multi-label learning. Full article
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 1525 KiB  
Article
A Universal Malicious Documents Static Detection Framework Based on Feature Generalization
by Xiaofeng Lu, Fei Wang, Cheng Jiang and Pietro Lio
Appl. Sci. 2021, 11(24), 12134; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412134 - 20 Dec 2021
Cited by 6 | Viewed by 3003
Abstract
In this study, Portable Document Format (PDF), Word, Excel, Rich Text Format (RTF) and image documents are taken as the research objects to study a static and fast method by which to detect malicious documents. Malicious PDF and Word document features are abstracted and extended, so that they can be used to detect other types of documents. A universal static detection framework for malicious documents based on feature generalization is then proposed. The generalized features include specification check errors, the structure path, code keywords, and the number of objects. The proposed method is verified on two datasets and is compared with the Kaspersky, NOD32, and McAfee antivirus software. The experimental results demonstrate that the proposed method achieves good performance in terms of detection accuracy, runtime, and scalability. The average F1-score across all types of documents is found to be 0.99, and the average detection time for a document is 0.5926 s, which is on the same level as the compared antivirus software. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 2363 KiB  
Article
A Hybrid Online Classifier System for Internet Traffic Based on Statistical Machine Learning Approach and Flow Port Number
by Hamza Awad Hamza Ibrahim, Omer Radhi A. L. Zuobi, Awad M. Abaker and Musab B. Alzghoul
Appl. Sci. 2021, 11(24), 12113; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412113 - 20 Dec 2021
Cited by 3 | Viewed by 2399
Abstract
Internet traffic classification is a beneficial technique for intrusion detection and network monitoring. After several years of research, there are still many open problems in Internet traffic classification. A hybrid classifier combines more than one classification method to identify Internet traffic. Using only one method to classify Internet traffic poses many risks. In addition, an online classifier is very important in order to manage threats to traffic such as denial-of-service and flooding attacks. Therefore, this paper provides some information to differentiate between real and live Internet traffic. In addition, this paper proposes a hybrid online classifier (HOC) system. HOC is based on two common classification methods, port-based and ML-based. HOC is able to perform online classification since it can identify live Internet traffic at the same time as it is generated. HOC was used to classify three common Internet application classes, namely web, WhatsApp, and Twitter. HOC produces more than 90% accuracy, which is higher than that of any individual classifier. Full article
(This article belongs to the Topic Machine and Deep Learning)
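A hybrid port-based plus ML-based classifier of the general kind described can be sketched as a fast port-table lookup with a statistical fallback; the port table, toy flow statistics, and decision-tree model below are all illustrative assumptions, not the paper's implementation.

```python
# Sketch: hybrid traffic classifier (port rule first, ML model as fallback).
from sklearn.tree import DecisionTreeClassifier

WELL_KNOWN = {80: "web", 443: "web"}  # hypothetical port table

def classify_flow(flow, ml_model):
    label = WELL_KNOWN.get(flow["dst_port"])
    if label is not None:                         # fast path: port-based rule
        return label
    return ml_model.predict([flow["stats"]])[0]   # fallback: statistical model

# Toy model trained on flow statistics (mean packet size, mean inter-arrival).
model = DecisionTreeClassifier().fit([[1200, 0.1], [150, 0.9]], ["web", "whatsapp"])
print(classify_flow({"dst_port": 5222, "stats": [150, 0.9]}, model))  # whatsapp
```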

19 pages, 3735 KiB  
Article
Human Activity Recognition Using Cell Phone-Based Accelerometer and Convolutional Neural Network
by Ashwani Prasad, Amit Kumar Tyagi, Maha M. Althobaiti, Ahmed Almulihi, Romany F. Mansour and Ayman M. Mahmoud
Appl. Sci. 2021, 11(24), 12099; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412099 - 19 Dec 2021
Cited by 15 | Viewed by 3156
Abstract
Human Activity Recognition (HAR) has become an active field of research in the computer vision community. Recognizing the basic activities of human beings with the help of computers and mobile sensors can be beneficial for numerous real-life applications. The main objective of this paper is to recognize six basic human activities: jogging, sitting, standing, walking, and going upstairs or downstairs. This paper focuses on predicting the activities using a deep learning technique called the Convolutional Neural Network (CNN) and the accelerometer present in smartphones. Furthermore, the methodology proposed in this paper focuses on grouping the data in the form of nodes and dividing the nodes into three major layers of the CNN, after which the outcome is predicted in the output layer. This work also covers the evaluation of the training and testing of the two-dimensional CNN model. Finally, it was observed that the model was able to give a good prediction of the activities, with an average accuracy of 89.67%. Considering that the dataset used in this research work was built with the aid of smartphones, coming up with an efficient model for such datasets and some futuristic ideas pose open challenges in the research community. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 2889 KiB  
Article
Detection of Pitt–Hopkins Syndrome Based on Morphological Facial Features
by Elena D’Amato, Constantino Carlos Reyes-Aldasoro, Arianna Consiglio, Gabriele D’Amato, Maria Felicia Faienza and Marcella Zollino
Appl. Sci. 2021, 11(24), 12086; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412086 - 18 Dec 2021
Viewed by 2549
Abstract
This work describes a non-invasive, automated software framework to discriminate between individuals with a genetic disorder, Pitt–Hopkins syndrome (PTHS), and healthy individuals through the identification of morphological facial features. The input data consist of frontal facial photographs in which faces are located using histograms of oriented gradients feature descriptors. Pre-processing steps include color normalization and enhancement, scaling down, rotation, and cropping of pictures to produce a series of images of faces with consistent dimensions. Sixty-eight facial landmarks are automatically located on each face through a cascade of regression functions learnt via gradient boosting to estimate the shape from an initial approximation. The intensities of a sparse set of pixels indexed relative to this initial estimate are used to determine the landmarks. A set of carefully selected geometric features, for example, the relative width of the mouth or angle of the nose, is extracted from the landmarks. The features are used to investigate the statistical differences between the two populations of PTHS and healthy controls. The methodology was tested on 71 individuals with PTHS and 55 healthy controls. The software was able to classify individuals with an accuracy rate of 91%, while pediatricians achieved a recognition rate of 74%. Two geometric features related to the nose and mouth showed significant statistical difference between the two populations. Full article
(This article belongs to the Topic Machine and Deep Learning)
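A geometric feature of the kind mentioned above, such as the relative width of the mouth, can be computed directly from a 68-point landmark array; in the sketch below, the indices follow the common 68-landmark layout and the coordinates are a random stand-in, not real data.

```python
# Sketch: a scale-invariant geometric feature from 68 facial landmarks.
import numpy as np

landmarks = np.random.default_rng(3).uniform(0, 200, size=(68, 2))  # stand-in

face_width = np.linalg.norm(landmarks[16] - landmarks[0])    # jaw endpoints
mouth_width = np.linalg.norm(landmarks[54] - landmarks[48])  # mouth corners
print(round(mouth_width / face_width, 3))  # relative mouth width
```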

14 pages, 1409 KiB  
Article
MFF-Net: Deepfake Detection Network Based on Multi-Feature Fusion
by Lei Zhao, Mingcheng Zhang, Hongwei Ding and Xiaohui Cui
Entropy 2021, 23(12), 1692; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121692 - 17 Dec 2021
Cited by 14 | Viewed by 3744
Abstract
Significant progress has been made in generating counterfeit images and videos. Forged videos generated by deepfake techniques have spread widely and have caused severe societal impacts, stirring up public concern about automatic deepfake detection technology. Recently, many deepfake detection methods based on forged features have been proposed. Among the popular forged features, textural features are widely used. However, most of the current texture-based detection methods extract textures directly from RGB images, ignoring mature spectral analysis methods. Therefore, this research proposes a deepfake detection network fusing RGB features and textural information extracted by neural networks and signal processing methods, namely MFF-Net. Specifically, it consists of four key components: (1) a feature extraction module to further extract textural and frequency information using Gabor convolution and residual attention blocks; (2) a texture enhancement module to zoom into the subtle textural features in shallow layers; (3) an attention module to force the classifier to focus on the forged part; and (4) two instances of feature fusion, firstly to fuse textural features from the shallow RGB branch and the feature extraction module, and then to fuse the textural features and semantic information. Moreover, we introduce a new diversity loss to force the feature extraction module to learn features of different scales and directions. The experimental results show that MFF-Net has excellent generalization and has achieved state-of-the-art performance on various deepfake datasets. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 4017 KiB  
Article
Extracting Weld Bead Shapes from Radiographic Testing Images with U-Net
by Gang-soo Jin, Sang-jin Oh, Yeon-seung Lee and Sung-chul Shin
Appl. Sci. 2021, 11(24), 12051; https://0-doi-org.brum.beds.ac.uk/10.3390/app112412051 - 17 Dec 2021
Cited by 10 | Viewed by 2605
Abstract
Metals created by melting base metal and welding rods in welding operations are referred to as weld beads. The weld bead shape allows the observation of pores and defects such as cracks in the weld zone. Radiographic testing images are used to determine the quality of the weld zone. Extracting only the weld bead to determine its generative pattern can help efficiently locate defects in the weld zone. However, manual extraction of the weld bead from weld images is neither time- nor cost-effective. Efficient and rapid welding quality inspection can be conducted by automating weld bead extraction through deep learning. As a result, objectivity can be secured in the quality inspection and assessment of the weld zone in the shipbuilding and offshore plant industry. This study presents a method for detecting the weld bead shape and location from the weld zone image using image preprocessing and deep learning models, and for extracting the weld bead through image post-processing. In addition, to diversify the data and improve deep learning performance, data augmentation was performed to artificially expand the image data. Contrast-limited adaptive histogram equalization (CLAHE) is used as the image preprocessing method, and the bead is extracted using U-Net, a pixel-based deep learning model. Consequently, the mean intersection over union (mIoU) values are found to be 90.58% and 85.44% in the train and test experiments, respectively. Successful extraction of the bead from the radiographic testing image through post-processing is achieved. Full article
(This article belongs to the Topic Machine and Deep Learning)
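The reported mIoU metric averages the intersection over union across classes; for a binary bead mask, the per-class computation reduces to the sketch below, with toy masks standing in for real predictions.

```python
# Sketch: intersection over union for a binary segmentation mask.
import numpy as np

def iou(pred, target):
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0  # two empty masks count as a match

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)  # toy prediction
gt   = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)  # toy ground truth
print(round(iou(pred, gt), 3))  # 0.5; averaging over classes gives mIoU
```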

20 pages, 2671 KiB  
Article
RF Signal-Based UAV Detection and Mode Classification: A Joint Feature Engineering Generator and Multi-Channel Deep Neural Network Approach
by Shubo Yang, Yang Luo, Wang Miao, Changhao Ge, Wenjian Sun and Chunbo Luo
Entropy 2021, 23(12), 1678; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121678 - 14 Dec 2021
Cited by 9 | Viewed by 3519
Abstract
With the proliferation of Unmanned Aerial Vehicles (UAVs) providing diverse critical services, such as surveillance, disaster management, and medicine delivery, the accurate detection of these small devices and the efficient classification of their flight modes are of paramount importance to guarantee their safe operation in our sky. Among the existing approaches, Radio Frequency (RF) based methods are less affected by complex environmental factors. The similarities between UAV RF signals and the diversity of their frequency components make accurate detection and classification a particularly difficult task. To bridge this gap, we propose a joint Feature Engineering Generator (FEG) and Multi-Channel Deep Neural Network (MC-DNN) approach. Specifically, in the FEG, data truncation and normalization separate different frequency components, a moving average filter reduces the outliers in the RF signal, and concatenation fully exploits the details of the dataset. In addition, the multi-channel input in the MC-DNN separates multiple frequency components and reduces the interference between them. A novel dataset that contains ten categories of RF signals from three types of UAVs is used to verify the effectiveness of the approach. Experiments show that the proposed method outperforms state-of-the-art UAV detection and classification approaches, achieving an accuracy of 98.4% and an F1 score of 98.3%. Full article
(This article belongs to the Topic Machine and Deep Learning)
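The feature-engineering steps named in the abstract, truncation, normalization, and moving-average filtering, are simple signal operations; the following NumPy sketch illustrates them on a random stand-in for an RF capture, with the segment length and window size as assumed parameters.

```python
# Sketch: truncate, normalize, and smooth a raw RF signal segment.
import numpy as np

def feature_engineering(signal, length=1024, window=5):
    x = signal[:length]                           # truncate to a fixed length
    x = (x - x.mean()) / (x.std() + 1e-8)         # zero-mean, unit-variance
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")    # moving-average smoothing

rf = np.random.default_rng(2).standard_normal(4096)  # stand-in RF capture
print(feature_engineering(rf).shape)  # (1024,)
```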

13 pages, 13759 KiB  
Article
Online Monitoring of Power Converter Degradation Using Deep Neural Network
by Jiayi Fan, Janghyeon Lee, Insu Jung and Yongkeun Lee
Appl. Sci. 2021, 11(24), 11796; https://0-doi-org.brum.beds.ac.uk/10.3390/app112411796 - 12 Dec 2021
Cited by 1 | Viewed by 2166
Abstract
Power semiconductor devices in the power converters used for motor drives are susceptible to wear-out and failure, especially when operated in harsh environments. Therefore, detection of degradation of power devices is crucial for ensuring the reliable performance of power converters. In this paper, a deep learning approach for online classification of the health states of the snubber resistors in the Insulated Gate Bipolar Transistors (IGBTs) in a three-phase Brushless DC (BLDC) motor drive is proposed. The method can locate one out of the six IGBTs experiencing a snubber resistor degradation problem by measuring the voltage waveforms of the three shunt resistors using voltage sensors. The range of the degradation of the snubber resistors for successful classification is also investigated. The off-the-shelf deep Convolutional Neural Network (CNN) architecture ResNet50 is used for transfer learning to determine which snubber resistor has degraded. The dataset for evaluating the above classification scheme of IGBT degradation is obtained by measuring the shunt voltage waveforms with varying snubber resistance and reference current. Then, the three-phase voltage waveforms are converted into greyscale images and RGB spectrogram images, which are later fed into the deep CNN. Experiments are carried out on the greyscale image dataset and the spectrogram image dataset using four-fold cross-validation. The results show that the proposed scheme can classify seven classes (one class for normal condition and six classes for abnormal condition in one of the six IGBTs in a three-phase BLDC drive) with over 95% average accuracy within a specific range of snubber resistance. Using grayscale images and using spectrogram-based RGB images yields similar accuracy. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 390 KiB  
Article
A Mixed Strategy of Higher-Order Structure for Link Prediction Problem on Bipartite Graphs
by Chao Li, Qiming Yang, Bowen Pang, Tiance Chen, Qian Cheng and Jiaomin Liu
Mathematics 2021, 9(24), 3195; https://0-doi-org.brum.beds.ac.uk/10.3390/math9243195 - 10 Dec 2021
Cited by 4 | Viewed by 2209
Abstract
Link prediction tasks have extremely high research value in both academic and commercial fields. As a special case, link prediction in bipartite graphs has been receiving more and more attention thanks to the great success of recommender systems in application fields such as product recommendation in E-commerce and movie recommendation on video sites. However, the differences between bipartite and unipartite graphs make some methods designed for the latter inapplicable to the former, so it is quite important to study link prediction methods specifically for bipartite graphs. In this paper, with the aim of better measuring the similarity between two nodes in a bipartite graph and improving link prediction performance on that basis, we propose a motif-based similarity index specifically for application to bipartite graphs. Our index can be regarded as a high-order evaluation of a graph’s local structure, which concerns mainly two kinds of typical 4-motifs related to bipartite graphs. After constructing our index, we integrate it into a commonly used method to measure the connection potential between every unconnected node pair. Some of these node pairs are originally unconnected, while the others are pairs whose edges we deliberately delete for subsequent testing. We conduct experiments on six public network datasets, and the results imply that mixing our index with the traditional method obtains better prediction performance w.r.t. precision, recall, and AUC in most cases. This is strong proof of the effectiveness of our exploration of motif structure. Our work also points out an interesting direction for the exploration of key graph structures in the field of link prediction. Full article
(This article belongs to the Topic Machine and Deep Learning)
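To make the notion of a higher-order local-structure score concrete, here is a minimal stand-in index: the number of length-3 paths between a user node and an item node, a simple 4-node count often used in bipartite link prediction. It is not the paper's exact two-motif index, only a sketch of the general idea.

```python
# Illustrative higher-order similarity for bipartite link prediction:
# score(u, v) = number of length-3 walks between u and v. For an
# unconnected pair in a bipartite graph these walks are exactly the
# u-x-y-v paths, a simple 4-node local-structure count.
import networkx as nx
import numpy as np

B = nx.Graph()
B.add_edges_from([("u1", "i1"), ("u1", "i2"), ("u2", "i1"),
                  ("u2", "i3"), ("u3", "i2"), ("u3", "i3")])
nodes = ["u1", "u2", "u3", "i1", "i2", "i3"]

A = nx.to_numpy_array(B, nodelist=nodes)
P3 = np.linalg.matrix_power(A, 3)  # (A^3)[u, v] counts length-3 walks

def score(u, v):
    return P3[nodes.index(u), nodes.index(v)]

print(score("u1", "i3"))  # higher score -> more likely future link
```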

19 pages, 2940 KiB  
Article
Screening of Potential Indonesia Herbal Compounds Based on Multi-Label Classification for 2019 Coronavirus Disease
by Aulia Fadli, Wisnu Ananta Kusuma, Annisa, Irmanida Batubara and Rudi Heryanto
Big Data Cogn. Comput. 2021, 5(4), 75; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5040075 - 09 Dec 2021
Cited by 5 | Viewed by 4063
Abstract
The coronavirus disease 2019 pandemic spread rapidly and requires an acceleration of the drug discovery process. Drug repurposing can help accelerate drug discovery by identifying new efficacy for approved drugs, and it is considered an efficient and economical approach. Research in drug repurposing can be done by observing the interactions of drug compounds with proteins related to a disease (drug–target interactions, DTI), and then predicting new drug–target interactions. This study conducted multi-label DTI prediction using the stacked autoencoder–deep neural network (SAE-DNN) algorithm. Compound features were extracted using the PubChem fingerprint, daylight fingerprint, MACCS fingerprint, and circular fingerprint. The results showed that the SAE-DNN model was able to predict DTI in COVID-19 cases with good performance. The SAE-DNN model with the circular fingerprint dataset produced the best average metrics, with an accuracy of 0.831, recall of 0.918, precision of 0.888, and F-measure of 0.89. Prediction with the SAE-DNN model on the circular, daylight, and PubChem fingerprint datasets identified 92, 65, and 79 herbal compounds, respectively, that are contained in Indonesian herbal plants. Full article
(This article belongs to the Topic Machine and Deep Learning)
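A minimal sketch of the SAE-DNN idea follows: unsupervised autoencoder pretraining on fingerprint vectors, then supervised fine-tuning with sigmoid outputs, one per target protein. The dimensions, random stand-in data, and training settings are illustrative assumptions.

```python
# Sketch of a stacked-autoencoder + DNN (SAE-DNN) multi-label classifier.
import numpy as np
from tensorflow.keras import layers, models

N_FEATURES, N_LABELS = 1024, 8  # e.g., fingerprint bits, protein targets
X = np.random.randint(0, 2, size=(256, N_FEATURES)).astype("float32")
Y = np.random.randint(0, 2, size=(256, N_LABELS)).astype("float32")

# 1) Unsupervised pretraining of the encoder.
inp = layers.Input(shape=(N_FEATURES,))
h = layers.Dense(256, activation="relu")(inp)
code = layers.Dense(64, activation="relu")(h)
out = layers.Dense(N_FEATURES, activation="sigmoid")(
    layers.Dense(256, activation="relu")(code))
autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(X, X, epochs=3, batch_size=32, verbose=0)

# 2) Supervised fine-tuning: pretrained encoder + multi-label head.
encoder = models.Model(inp, code)
clf_out = layers.Dense(N_LABELS, activation="sigmoid")(encoder.output)
clf = models.Model(inp, clf_out)
clf.compile(optimizer="adam", loss="binary_crossentropy",
            metrics=["binary_accuracy"])
clf.fit(X, Y, epochs=3, batch_size=32, verbose=0)
```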

23 pages, 3061 KiB  
Article
Probabilistic Autoencoder Using Fisher Information
by Johannes Zacherl, Philipp Frank and Torsten A. Enßlin
Entropy 2021, 23(12), 1640; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121640 - 06 Dec 2021
Cited by 1 | Viewed by 2464
Abstract
Neural networks play a growing role in many scientific disciplines, including physics. Variational autoencoders (VAEs) are neural networks that are able to represent the essential information of a high-dimensional data set in a low-dimensional latent space that has a probabilistic interpretation. In particular, the so-called encoder network, the first part of the VAE, which maps its input onto a position in latent space, additionally provides uncertainty information in terms of the variance around this position. In this work, an extension of the autoencoder architecture is introduced, the FisherNet. In this architecture, the latent space uncertainty is not generated using an additional information channel in the encoder but is derived from the decoder by means of the Fisher information metric. This architecture has advantages from a theoretical point of view as it provides a direct uncertainty quantification derived from the model and also accounts for uncertainty cross-correlations. We show experimentally that the FisherNet produces more accurate data reconstructions than a comparable VAE and that its learning performance also scales better with the number of latent space dimensions. Full article
(This article belongs to the Topic Machine and Deep Learning)

23 pages, 21099 KiB  
Article
Time-Aware and Feature Similarity Self-Attention in Vessel Fuel Consumption Prediction
by Hyun Joon Park, Min Seok Lee, Dong Il Park and Sung Won Han
Appl. Sci. 2021, 11(23), 11514; https://0-doi-org.brum.beds.ac.uk/10.3390/app112311514 - 04 Dec 2021
Cited by 4 | Viewed by 2206
Abstract
An accurate vessel fuel consumption prediction is essential for constructing a ship route network and for vessel management, leading to efficient sailings. In addition, ship data from monitoring and sensing systems accelerate fuel consumption prediction research. However, ship data have three properties, namely sequential structure, irregular time intervals, and varying feature importance, which make the prediction problem challenging. In this paper, we propose Time-aware Attention (TA) and Feature-similarity Attention (FA) applied to a bi-directional Long Short-Term Memory (LSTM) network. TA acquires time importance from the irregular time intervals in each sequence through a nonlinear function and emphasizes data according to that importance. FA emphasizes data based on the similarities of features in the sequence by estimating feature importance with learnable parameters. Finally, we propose an ensemble of the TA- and FA-based BiLSTM models. The ensemble model, which consists of fully connected layers, is capable of simultaneously capturing different properties of the ship data. The experimental results on ship data show that the proposed model improves fuel consumption prediction performance. In addition to model performance, visualization of the attention maps and feature importance helps in understanding the data properties and model characteristics. Full article
(This article belongs to the Topic Machine and Deep Learning)
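The PyTorch sketch below illustrates only the time-aware attention idea: irregular time gaps pass through a small nonlinear function to produce per-step weights over BiLSTM outputs. Layer sizes and the gap function are assumptions, not the authors' exact TA/FA design.

```python
# Minimal time-aware attention over BiLSTM outputs: time gaps are mapped
# through a small MLP to per-step weights that emphasize or suppress
# each step of the sequence.
import torch
import torch.nn as nn

class TimeAwareBiLSTM(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        self.time_mlp = nn.Sequential(nn.Linear(1, 16), nn.Tanh(),
                                      nn.Linear(16, 1))

    def forward(self, x, dt):
        # x: (batch, seq, n_features); dt: (batch, seq, 1) time gaps
        h, _ = self.lstm(x)                          # (batch, seq, 2*hidden)
        w = torch.softmax(self.time_mlp(dt), dim=1)  # time-importance weights
        return (w * h).sum(dim=1)                    # weighted summary vector

model = TimeAwareBiLSTM(n_features=5)
x = torch.randn(4, 10, 5)
dt = torch.rand(4, 10, 1)          # irregular intervals between records
print(model(x, dt).shape)          # torch.Size([4, 64])
```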

31 pages, 23936 KiB  
Article
An Unsupervised Machine Learning-Based Framework for Transferring Local Factories into Supply Chain Networks
by Mohd Fahmi Bin Mad Ali, Mohd Khairol Anuar Bin Mohd Ariffin, Faizal Bin Mustapha and Eris Elianddy Bin Supeni
Mathematics 2021, 9(23), 3114; https://0-doi-org.brum.beds.ac.uk/10.3390/math9233114 - 03 Dec 2021
Viewed by 2043
Abstract
Transferring a local manufacturing company to a nationwide supply chain network with wholesalers and retailers is a significant problem in manufacturing systems. In this research, a hybrid PCA-K-means approach is used to transfer a local chocolate manufacturing firm near Kuala Lumpur into a nationwide supply chain. For this purpose, the appropriate locations of the wholesalers’ center points were found according to the geographical and population features of the markets in Malaysia. To this end, four wholesalers were identified on the left island of Malaysia, located in the north, right, middle, and south areas. Similarly, two wholesalers were identified on the right island, in Sarawak and WP Labuan. In order to evaluate the performance of the proposed method, its outcomes were compared with those of other unsupervised learning methods such as the WARD and CLINK methods. The outcomes indicated that K-means could successfully determine the best locations for the wholesalers in the supply chain network with a higher score (0.812). Full article
(This article belongs to the Topic Machine and Deep Learning)
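A compact sketch of the hybrid PCA + K-means pipeline on synthetic market features follows; the silhouette coefficient is shown as one plausible reading of the reported clustering score, since the abstract does not name the score explicitly.

```python
# PCA reduces geographic/population features of candidate markets, then
# K-means clusters them; cluster centers stand in for wholesaler sites.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
markets = rng.normal(size=(120, 6))  # e.g., lat, lon, population, demand...

z = PCA(n_components=2).fit_transform(markets)
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(z)

print("silhouette:", silhouette_score(z, km.labels_).round(3))
print("wholesaler coordinates (PCA space):\n", km.cluster_centers_)
```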

21 pages, 2533 KiB  
Article
Multi-Label Feature Selection Combining Three Types of Conditional Relevance
by Lingbo Gao, Yiqiang Wang, Yonghao Li, Ping Zhang and Liang Hu
Entropy 2021, 23(12), 1617; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121617 - 01 Dec 2021
Cited by 1 | Viewed by 1840
Abstract
With the rapid growth of the Internet, the curse of dimensionality caused by massive multi-label data has attracted extensive attention. Feature selection plays an indispensable role in dimensionality reduction processing, and many researchers have approached this subject using information theory. Here, to evaluate feature relevance, a novel feature relevance term (FR) is designed that employs three incremental information terms to comprehensively consider three key aspects: candidate features, selected features, and label correlations. A thorough examination of these three key aspects of FR is more favorable to capturing the optimal features. Moreover, we employ label-related feature redundancy as the label-related feature redundancy term (LR) to reduce unnecessary redundancy. Therefore, a multi-label feature selection method that integrates FR with LR is proposed, namely, Feature Selection combining three types of Conditional Relevance (TCRFS). Numerous experiments indicate that TCRFS outperforms six other state-of-the-art multi-label approaches on 13 multi-label benchmark data sets from 4 domains. Full article
(This article belongs to the Topic Machine and Deep Learning)
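The relevance-minus-redundancy flavor of such information-theoretic selection can be sketched generically, as below. This single-label greedy selector is only a stand-in: the paper's FR and LR terms additionally involve label correlations and incremental information terms that are not reproduced here.

```python
# Generic greedy information-theoretic feature selection: score each
# candidate by mutual information with the label minus its average
# mutual information with already-selected features.
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(300, 10))   # discretised features
y = (X[:, 2] + X[:, 5]) % 4              # label depends on features 2 and 5

selected = []
for _ in range(3):
    best, best_score = None, -np.inf
    for j in range(X.shape[1]):
        if j in selected:
            continue
        relevance = mutual_info_score(X[:, j], y)
        redundancy = (np.mean([mutual_info_score(X[:, j], X[:, s])
                               for s in selected]) if selected else 0.0)
        if relevance - redundancy > best_score:
            best, best_score = j, relevance - redundancy
    selected.append(best)

print("selected features:", selected)    # should typically include 2 and 5
```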

14 pages, 2053 KiB  
Article
Quantum-Inspired Classification Algorithm from DBSCAN–Deutsch–Jozsa Support Vectors and Ising Prediction Model
by Kodai Shiba, Chih-Chieh Chen, Masaru Sogabe, Katsuyoshi Sakamoto and Tomah Sogabe
Appl. Sci. 2021, 11(23), 11386; https://0-doi-org.brum.beds.ac.uk/10.3390/app112311386 - 01 Dec 2021
Cited by 3 | Viewed by 1724
Abstract
Quantum computing has been suggested as a new tool to deal with large data sets for machine learning applications. However, many quantum algorithms are too expensive to fit into the small-scale quantum hardware available today, and the loading of big classical data into small quantum memory is still an unsolved obstacle. These difficulties motivate the study of quantum-inspired techniques using classical computation. In this work, we propose a new classification method based on support vectors from a DBSCAN–Deutsch–Jozsa ranking and an Ising prediction model. The proposed algorithm has an advantage over a standard classical SVM in its scaling with the number of training data points during the training phase. The method can be executed on a purely classical computer and can be accelerated in a hybrid quantum–classical computing environment. We demonstrate the applicability of the proposed algorithm with simulations and theory. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 2652 KiB  
Article
Ensemble Learning for Threat Classification in Network Intrusion Detection on a Security Monitoring System for Renewable Energy
by Hsiao-Chung Lin, Ping Wang, Kuo-Ming Chao, Wen-Hui Lin and Zong-Yu Yang
Appl. Sci. 2021, 11(23), 11283; https://0-doi-org.brum.beds.ac.uk/10.3390/app112311283 - 29 Nov 2021
Cited by 7 | Viewed by 1985
Abstract
Most approaches for detecting network attacks involve threat analyses that match the attack to potential malicious profiles, using behavioral analysis techniques in conjunction with packet collection, filtering, and feature comparison. Experts in information security are often required to study these threats, and judging new types of threats accurately in real time is often impossible. Detecting legitimate or malicious connections using protocol analysis is difficult; therefore, machine learning-based function modules can be added to intrusion detection systems to assist experts in accurately judging threat categories by analyzing each threat and learning its characteristics. In this paper, an ensemble learning scheme based on a revised random forest algorithm is proposed for a security monitoring system in the domain of renewable energy, to categorize network threats in a network intrusion detection system. To reduce the classification error for minority classes of the experimental data in model training, the synthetic minority oversampling technique (SMOTE) was applied to re-balance the original data sets by increasing the number of data points in the minority classes. The classification performance of the proposed classifier on unbalanced data was experimentally verified in terms of accuracy, precision, recall, and F1-score on the UNSW-NB15 and CSE-CIC-IDS 2018 data sets. A cross-validation scheme featuring support vector machines was used to compare classification accuracies. Full article
(This article belongs to the Topic Machine and Deep Learning)
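A baseline version of the re-balancing step, SMOTE followed by a standard random forest on synthetic imbalanced data, might look as follows; the paper's revised random forest and its UNSW-NB15/CSE-CIC-IDS 2018 pipelines are not reproduced.

```python
# SMOTE oversamples minority attack classes before training the forest.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6,
                           weights=[0.8, 0.15, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

print("before SMOTE:", Counter(y_tr))
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
print("after SMOTE: ", Counter(y_bal))

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te)))
```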

13 pages, 2404 KiB  
Article
PneumoniaNet: Automated Detection and Classification of Pediatric Pneumonia Using Chest X-ray Images and CNN Approach
by Roaa Alsharif, Yazan Al-Issa, Ali Mohammad Alqudah, Isam Abu Qasmieh, Wan Azani Mustafa and Hiam Alquran
Electronics 2021, 10(23), 2949; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10232949 - 26 Nov 2021
Cited by 23 | Viewed by 6548
Abstract
Pneumonia is an inflammation of the lung parenchyma that is caused by a variety of infectious microorganisms and non-infective agents. All age groups can be affected; however, in most cases, fragile groups are more susceptible than others. Radiological images such as Chest X-ray (CXR) images enable early detection and prompt action; a typical CXR for such a disease is characterized by a radiopaque appearance, or a seemingly solid segment, at the affected parts of the lung, due to inflammatory exudate replacing the air in the alveoli. The early and accurate detection of pneumonia is crucial to avoid fatal ramifications, particularly in children and seniors. In this paper, we propose a novel 50-layer Convolutional Neural Network (CNN)-based architecture that outperforms state-of-the-art models. The suggested framework is trained using 5852 CXR images and statistically tested using five-fold cross-validation. The model can distinguish between three classes, viral, bacterial, and normal, with 99.7 ± 0.2% accuracy, 99.74 ± 0.1% sensitivity, and an Area Under the Curve (AUC) of 0.9812. The results are promising, and the new architecture can be used to recognize pneumonia early, cost-effectively, and with high accuracy, especially in remote areas that lack proper access to expert radiologists, and can therefore reduce pneumonia-caused mortality rates. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 5745 KiB  
Article
Automatic Representative View Selection of a 3D Cultural Relic Using Depth Variation Entropy and Depth Distribution Entropy
by Sheng Zeng, Guohua Geng and Mingquan Zhou
Entropy 2021, 23(12), 1561; https://0-doi-org.brum.beds.ac.uk/10.3390/e23121561 - 23 Nov 2021
Cited by 3 | Viewed by 1621
Abstract
Automatically selecting a set of representative views of a 3D virtual cultural relic is crucial for constructing wisdom museums. There is no consensus regarding the definition of a good view in computer graphics; the same is true of multiple views. View-based methods play an important role in the field of 3D shape retrieval and classification. However, it is still difficult to select views that not only conform to subjective human preferences but also provide a good feature description. In this study, we define two novel measures based on information entropy, named depth variation entropy and depth distribution entropy. These measures quantify the amount of information about the depth variations and the distribution of depth values in each view. First, a canonical-pose 3D cultural relic was generated using principal component analysis. A set of depth maps was then captured by orthographic cameras placed on the dense vertices of a geodesic unit sphere obtained by subdividing the regular unit octahedron. Afterwards, the two measures were calculated separately on the depth maps obtained from the vertices, and the results on each one-eighth sphere form a group. The views with maximum depth variation entropy and depth distribution entropy were selected, and further scattered viewpoints were then chosen. Finally, the threshold word histogram derived from the vector quantization of salient local descriptors on the selected depth maps represented the 3D cultural relic. The viewpoints obtained by the proposed method are consistent regardless of the pose of the 3D model, which eliminates the steps of manually adjusting the model’s pose and provides acceptable display views for people. In addition, it was verified on several datasets that the proposed method, using the Bag-of-Words mechanism and a deep convolutional neural network, also performs well in retrieval and classification when dealing with only four views. Full article
(This article belongs to the Topic Machine and Deep Learning)
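As a toy illustration of an entropy-of-depth measure in the same spirit (not the paper's exact definitions), one can compute the Shannon entropy of a view's depth histogram; views whose depth values spread over many levels score higher.

```python
# Shannon entropy of the foreground depth-value histogram of a view.
import numpy as np

def depth_distribution_entropy(depth_map, n_bins=64):
    """Entropy of the histogram of non-background depth values."""
    d = depth_map[depth_map > 0]               # ignore background pixels
    hist, _ = np.histogram(d, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
flat_view = np.full((64, 64), 0.5)             # near-constant depth
varied_view = rng.uniform(0.1, 1.0, (64, 64))  # widely spread depths
print(depth_distribution_entropy(flat_view))    # low entropy
print(depth_distribution_entropy(varied_view))  # high entropy
```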

24 pages, 7273 KiB  
Article
SAEP: A Surrounding-Aware Individual Emotion Prediction Model Combined with T-LSTM and Memory Attention Mechanism
by Yakun Wang, Yajun Du, Jinrong Hu, Xianyong Li and Xiaoliang Chen
Appl. Sci. 2021, 11(23), 11111; https://0-doi-org.brum.beds.ac.uk/10.3390/app112311111 - 23 Nov 2021
Cited by 4 | Viewed by 1989
Abstract
The prediction of users’ future emotions on social media has been attracting increasing attention from academics. Previous studies on predicting future emotion have focused on the characteristics of individuals’ emotion changes; however, the role of an individual’s neighbors has not yet been thoroughly researched. To fill this gap, a surrounding-aware individual emotion prediction model (SAEP) based on a deep encoder–decoder architecture is proposed to predict individuals’ future emotions. In particular, two memory-based attention networks are constructed: the time-evolving attention network and the surrounding attention network, which extract the features of the emotional changes of users and neighbors, respectively. These features are then incorporated into the emotion prediction task. In addition, a novel LSTM variant is introduced as the encoder of the proposed model, which can effectively extract complex patterns of users’ emotional changes from irregular time series. Extensive experimental results show that the proposed approach outperforms five alternative methods, improving micro-F1 by approximately 4.21–14.84% on a dataset built from Twitter and by 7.30–13.41% on a dataset built from Microblog. Further analyses validate the effectiveness of the proposed time-evolving context and surrounding context, as well as the factors that may affect the prediction results. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 12595 KiB  
Article
Attention Enhanced Serial Unet++ Network for Removing Unevenly Distributed Haze
by Wenxuan Zhao, Yaqin Zhao, Liqi Feng and Jiaxi Tang
Electronics 2021, 10(22), 2868; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10222868 - 22 Nov 2021
Cited by 6 | Viewed by 2280
Abstract
The purpose of image dehazing is to reduce the image degradation caused by suspended particles in order to support high-level visual tasks. Besides the atmospheric scattering model, convolutional neural networks (CNNs) have been used for image dehazing. However, existing image dehazing algorithms are limited in the face of unevenly distributed haze and dense haze in real-world scenes. In this paper, we propose a novel end-to-end convolutional neural network called the attention enhanced serial Unet++ dehazing network (AESUnet) for single image dehazing. We build a serial Unet++ structure that adopts a serial strategy of two pruned Unet++ blocks based on residual connections. Compared with the simple Encoder–Decoder structure, the serial Unet++ module can better use the features extracted by the encoders and promote contextual information fusion across different resolutions. In addition, we apply several improvements to the Unet++ module, such as pruning, introducing a convolutional module with a ResNet structure, and a residual learning strategy. Thus, the serial Unet++ module can generate more realistic images with less color distortion. Furthermore, following the serial Unet++ blocks, an attention mechanism is introduced to pay different attention to haze regions of different concentrations by learning weights in the spatial and channel domains. Experiments are conducted on two representative datasets: the large-scale synthetic dataset RESIDE and the small-scale real-world datasets I-HAZY and O-HAZY. The experimental results show that the proposed dehazing network is not only comparable to state-of-the-art methods on the RESIDE synthetic dataset, but also surpasses them by a very large margin on the I-HAZY and O-HAZY real-world datasets. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 1647 KiB  
Article
Multi-Method Analysis of Medical Records and MRI Images for Early Diagnosis of Dementia and Alzheimer’s Disease Based on Deep Learning and Hybrid Methods
by Badiea Abdulkarem Mohammed, Ebrahim Mohammed Senan, Taha H. Rassem, Nasrin M. Makbol, Adwan Alownie Alanazi, Zeyad Ghaleb Al-Mekhlafi, Tariq S. Almurayziq and Fuad A. Ghaleb
Electronics 2021, 10(22), 2860; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10222860 - 20 Nov 2021
Cited by 60 | Viewed by 4987
Abstract
Dementia and Alzheimer’s disease are caused by neurodegeneration and poor communication between neurons in the brain. So far, no effective medications have been discovered for dementia and Alzheimer’s disease; thus, early diagnosis is necessary to avoid the development of these diseases. In this study, efficient machine learning algorithms were assessed on the Open Access Series of Imaging Studies (OASIS) dataset for dementia diagnosis. Two CNN models (AlexNet and ResNet-50) and hybrid techniques combining deep learning and machine learning (AlexNet+SVM and ResNet-50+SVM) were also evaluated for the diagnosis of Alzheimer’s disease. For the OASIS dataset, we balanced the dataset, replaced the missing values, and applied the t-Distributed Stochastic Neighbour Embedding (t-SNE) algorithm to represent the high-dimensional data in a low-dimensional space. All of the machine learning algorithms, namely, Support Vector Machine (SVM), Decision Tree, Random Forest and K-Nearest Neighbours (KNN), achieved high performance in diagnosing dementia. The random forest algorithm achieved an overall accuracy of 94%, with precision, recall and F1 scores of 93%, 98% and 96%, respectively. The second dataset, the MRI image dataset, was evaluated with the AlexNet and ResNet-50 models and the AlexNet+SVM and ResNet-50+SVM hybrid techniques. All models achieved high performance, but the hybrid deep learning–machine learning methods performed better than the pure deep learning models. The AlexNet+SVM hybrid model achieved accuracy, sensitivity, specificity and AUC scores of 94.8%, 93%, 97.75% and 99.70%, respectively. Full article
(This article belongs to the Topic Machine and Deep Learning)
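The hybrid pattern (a CNN as feature extractor, an SVM as classifier) can be sketched as follows; random tensors stand in for preprocessed MRI slices, and the first run downloads pretrained AlexNet weights. This is a generic sketch, not the authors' exact training procedure.

```python
# Pretrained AlexNet body extracts 4096-D deep features; a linear SVM
# then classifies them (the CNN+SVM hybrid pattern).
import torch
import torchvision.models as tvm
from sklearn.svm import SVC

alexnet = tvm.alexnet(weights=tvm.AlexNet_Weights.DEFAULT)
alexnet.classifier = alexnet.classifier[:-1]  # drop the final FC layer
alexnet.eval()

with torch.no_grad():
    images = torch.randn(16, 3, 224, 224)   # stand-in for MRI slices
    feats = alexnet(images).numpy()          # deep feature vectors

labels = [0, 1] * 8                          # stand-in diagnosis labels
svm = SVC(kernel="linear").fit(feats, labels)
print(svm.predict(feats[:4]))
```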

12 pages, 1075 KiB  
Article
Fast Hyperparameter Calibration of Sparsity Enforcing Penalties in Total Generalised Variation Penalised Reconstruction Methods for XCT Using a Planted Virtual Reference Image
by Stéphane Chrétien, Camille Giampiccolo, Wenjuan Sun and Jessica Talbott
Mathematics 2021, 9(22), 2960; https://0-doi-org.brum.beds.ac.uk/10.3390/math9222960 - 19 Nov 2021
Cited by 1 | Viewed by 1587
Abstract
The reconstruction problem in X-ray computed tomography (XCT) is notoriously difficult when only a small number of measurements are made. Based on the recently discovered Compressed Sensing paradigm, many methods have been proposed to address the reconstruction problem by leveraging the inherent sparsity of the object’s decomposition in various appropriate bases or dictionaries. In practice, reconstruction is usually achieved by incorporating weighted sparsity-enforcing penalisation functionals into the least-squares objective of the associated optimisation problem. One such penalisation functional is the Total Variation (TV) norm, which has been successfully employed since the early days of Compressed Sensing. Total Generalised Variation (TGV) is a recent improvement of this approach. One of the main advantages of such penalisation-based approaches is that the resulting optimisation problem is convex and, as such, cannot be affected by the possible existence of spurious solutions. Using the TGV penalisation nevertheless comes with the drawback of having to tune the two hyperparameters governing the TGV semi-norms. In this short note, we provide a simple and efficient recipe for fast hyperparameter tuning, based on the simple idea of virtually planting a mock image into the model. The proposed trick potentially applies to all linear inverse problems, under the assumption that relevant prior information is available about the sought-for solution, whilst being very different from the Bayesian method. Full article
(This article belongs to the Topic Machine and Deep Learning)
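The calibration trick can be conveyed with a toy version that uses plain TV denoising from scikit-image instead of full TGV-penalised XCT: plant a known mock image, corrupt it like the real data, and keep the penalty weight that best recovers the mock. The real method plants the mock inside the tomographic model; this sketch only conveys the idea.

```python
# Toy "planted reference" calibration with TV denoising: the penalty
# weight is chosen by measuring recovery error on a known mock image.
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle

rng = np.random.default_rng(0)
mock = img_as_float(data.camera())              # known planted reference
noisy = mock + rng.normal(scale=0.1, size=mock.shape)

best_w, best_err = None, np.inf
for w in [0.02, 0.05, 0.1, 0.2, 0.4]:
    rec = denoise_tv_chambolle(noisy, weight=w)
    err = np.mean((rec - mock) ** 2)            # error on the known mock
    if err < best_err:
        best_w, best_err = w, err

print(f"selected weight: {best_w}, MSE on mock: {best_err:.4f}")
```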

15 pages, 4623 KiB  
Article
Effect of Probabilistic Similarity Measure on Metric-Based Few-Shot Classification
by Youngjae Lee and Hyeyoung Park
Appl. Sci. 2021, 11(22), 10977; https://0-doi-org.brum.beds.ac.uk/10.3390/app112210977 - 19 Nov 2021
Cited by 1 | Viewed by 1343
Abstract
In developing a few-shot classification model using deep networks, the limited number of samples in each class makes it difficult to utilize the statistical characteristics of the class distributions. In this paper, we propose a method that addresses this difficulty by combining a probabilistic similarity based on intra-class statistics with a metric-based few-shot classification model. Noting that the probabilistic similarity estimated from intra-class statistics and the classifier of conventional few-shot classification models share a common assumption about the class distributions, we propose applying the probabilistic similarity both to obtain the loss value for episodic learning of the embedding network and to classify unseen test data. By defining the probabilistic similarity as the probability density of the difference vectors between two samples with the same class label, it is possible to obtain a more reliable estimate of the similarity, especially in the case of a large number of classes. Through experiments on various benchmark data, we confirm that the probabilistic similarity can improve the classification performance, especially when the number of classes is large. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 3846 KiB  
Article
Truck Driver Fatigue Detection Based on Video Sequences in Open-Pit Mines
by Yi Wang, Zhengxiang He and Liguan Wang
Mathematics 2021, 9(22), 2908; https://0-doi-org.brum.beds.ac.uk/10.3390/math9222908 - 15 Nov 2021
Cited by 3 | Viewed by 1610
Abstract
Due to complex background interference and weak space–time connections, traditional driver fatigue detection methods perform poorly for open-pit truck drivers. To address these issues, this paper presents a driver fatigue detection method based on Libfacedetection and an LRCN. The method consists of three stages: (1) using a face detection module with a tracking method to quickly extract the ROI of the face; (2) extracting and coding the features; and (3) combining the coding model to build a spatiotemporal classification network. The innovation of the method is to utilize the spatiotemporal features of the image sequence to build a spatiotemporal classification model suitable for this task. Meanwhile, a tracking method is added to the face detection stage to reduce the time cost. As a result, the average face detection speed on video with the tracking method is increased by 74% compared with the method without tracking. Our best model adopts a DHLSTM and feature-level frame aggregation, and achieves a high accuracy of 99.30% on the self-built dataset. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 5292 KiB  
Article
Detecting Aggressiveness in Tweets: A Hybrid Model for Detecting Cyberbullying in the Spanish Language
by Manuel Lepe-Faúndez, Alejandra Segura-Navarrete, Christian Vidal-Castro, Claudia Martínez-Araneda and Clemente Rubio-Manzano
Appl. Sci. 2021, 11(22), 10706; https://0-doi-org.brum.beds.ac.uk/10.3390/app112210706 - 12 Nov 2021
Cited by 10 | Viewed by 2350
Abstract
In recent years, the use of social networks has increased exponentially, which has led to a significant increase in cyberbullying. Currently, in the field of Computer Science, research has been conducted on detecting aggressiveness in texts, which is a prelude to detecting cyberbullying. In this field, most work has been done for English-language texts, mainly using Machine Learning (ML) approaches, Lexicon approaches to a lesser extent, and very few works using hybrid approaches. In the hybrid approaches, Lexicons and Machine Learning algorithms are combined, for example by counting the number of bad words in a sentence using a Lexicon of bad words, which then serves as an input feature for classification algorithms. This research aims to contribute to detecting aggressiveness in Spanish-language texts by creating different models that combine the Lexicon and ML approaches. Twenty-two models that combine techniques and algorithms from both approaches are proposed, and for their application, certain hyperparameters are tuned on the training datasets of the corpora to obtain the best results on the test datasets. Three Spanish-language corpora are used in the evaluation: Chilean, Mexican, and Chilean-Mexican corpora. The results indicate that the hybrid models obtain the best results on all three corpora, outperforming the implemented models that do not use Lexicons, which shows that mixing approaches improves aggressiveness detection. Finally, a web application is developed that gives applicability to each model by classifying tweets, allowing the performance of the models to be evaluated on external corpora and providing feedback on each model’s predictions for future research. In addition, an API is available that can be integrated into technological tools for parental control, online plugins for writing analysis in social networks, and educational tools, among others. Full article
(This article belongs to the Topic Machine and Deep Learning)
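A minimal hybrid Lexicon + ML classifier of the kind described, TF-IDF features augmented with a lexicon-based bad-word count, can be sketched as below. The tiny corpus and lexicon are placeholders, not the Chilean or Mexican corpora or any of the paper's twenty-two models.

```python
# Hybrid features: TF-IDF text vectors plus a lexicon bad-word count.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

LEXICON = {"idiota", "estupido", "odio"}        # placeholder bad words

tweets = ["te odio idiota", "que lindo dia", "eres estupido",
          "buen trabajo amigo"]
labels = [1, 0, 1, 0]                           # 1 = aggressive

def lexicon_count(text):
    return sum(tok in LEXICON for tok in text.lower().split())

tfidf = TfidfVectorizer()
X_text = tfidf.fit_transform(tweets)
X_lex = csr_matrix(np.array([[lexicon_count(t)] for t in tweets]))
X = hstack([X_text, X_lex])                     # hybrid feature matrix

clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```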

18 pages, 604 KiB  
Concept Paper
Kano Model Integration with Data Mining to Predict Customer Satisfaction
by Khaled Al Rabaiei, Fady Alnajjar and Amir Ahmad
Big Data Cogn. Comput. 2021, 5(4), 66; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5040066 - 11 Nov 2021
Cited by 8 | Viewed by 5007
Abstract
The Kano model is one of the models that help determine which features must be included in a product or service to improve customer satisfaction. The model focuses on highlighting the most relevant attributes of a product or service, along with customers’ estimations of how the presence of these attributes can be used to predict satisfaction with specific services or products. This research aims to develop a method that integrates the Kano model and data mining approaches to select the relevant attributes that drive customer satisfaction, with a specific focus on higher education. The significant contribution of this research is to solve the problem of selecting features that are not methodically correlated with customer satisfaction, which could reduce the risk of investing in features that are ultimately irrelevant to enhancing customer satisfaction. Questionnaire data were collected from 646 students of UAE University. The experiments suggest that XGBoost Regression and Decision Tree Regression produce the best results for this kind of problem. Based on the integration of the Kano model and the feature selection method, the number of features used to predict customer satisfaction is reduced to four. It was found that integrating the ANOVA feature selection model with the Kano model gives higher Pearson correlation coefficients and higher R2 values. Full article
(This article belongs to the Topic Machine and Deep Learning)
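The integration step can be sketched with scikit-learn: ANOVA (f_regression) scores rank the questionnaire attributes, the top four are kept, and a tree-based regressor predicts satisfaction. Synthetic data replace the UAE University questionnaire.

```python
# ANOVA-based feature selection down to four attributes, then a
# decision-tree regressor scored by cross-validated R^2.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(646, 20))                  # 20 candidate attributes
y = X[:, 0] * 2 + X[:, 3] - X[:, 7] + rng.normal(scale=0.5, size=646)

selector = SelectKBest(score_func=f_regression, k=4).fit(X, y)
X_sel = selector.transform(X)
print("selected feature indices:", selector.get_support(indices=True))

reg = DecisionTreeRegressor(max_depth=4, random_state=0)
print("R^2:", cross_val_score(reg, X_sel, y, cv=5).mean().round(3))
```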

19 pages, 1129 KiB  
Article
Entanglement-Structured LSTM Boosts Chaotic Time Series Forecasting
by Xiangyi Meng and Tong Yang
Entropy 2021, 23(11), 1491; https://0-doi-org.brum.beds.ac.uk/10.3390/e23111491 - 11 Nov 2021
Cited by 4 | Viewed by 1985
Abstract
Traditional machine-learning methods are inefficient in capturing chaos in nonlinear dynamical systems, especially when the time difference Δt between consecutive steps is so large that the extracted time series looks apparently random. Here, we introduce a new long-short-term-memory (LSTM)-based recurrent architecture by tensorizing the cell-state-to-state propagation therein, maintaining the long-term memory feature of LSTM, while simultaneously enhancing the learning of short-term nonlinear complexity. We stress that the global minima of training can be most efficiently reached by our tensor structure where all nonlinear terms, up to some polynomial order, are treated explicitly and weighted equally. The efficiency and generality of our architecture are systematically investigated and tested through theoretical analysis and experimental examinations. In our design, we have explicitly used two different many-body entanglement structures—matrix product states (MPS) and the multiscale entanglement renormalization ansatz (MERA)—as physics-inspired tensor decomposition techniques, from which we find that MERA generally performs better than MPS, hence conjecturing that the learnability of chaos is determined not only by the number of free parameters but also the tensor complexity—recognized as how entanglement entropy scales with varying matricization of the tensor. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 2125 KiB  
Article
Temporal Degree-Degree and Closeness-Closeness: A New Centrality Metrics for Social Network Analysis
by Mahmoud Elmezain, Ebtesam A. Othman and Hani M. Ibrahim
Mathematics 2021, 9(22), 2850; https://0-doi-org.brum.beds.ac.uk/10.3390/math9222850 - 10 Nov 2021
Cited by 12 | Viewed by 2187
Abstract
In the area of network analysis, centrality metrics play an important role in defining the “most important” actors in a social network. However, nowadays, most types of networks are dynamic, meaning their topology changes over time. The connection weights and the strengths of social links between nodes are an important concept in a social network. New centrality measures are proposed for weighted networks that rely on a time-ordered weighted graph model and generalize temporal degree and closeness centrality. Furthermore, two measures—Temporal Degree-Degree and Temporal Closeness-Closeness—are employed to better understand the significance of nodes in weighted dynamic networks. Our study is carried out on a real dynamic weighted network dataset of a university-based karate club. Through extensive experiments and discussion of the proposed metrics, our analysis demonstrates their effectiveness in capturing the impact of each node throughout the social network. Full article
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 2888 KiB  
Article
MemBox: Shared Memory Device for Memory-Centric Computing Applicable to Deep Learning Problems
by Yongseok Choi, Eunji Lim, Jaekwon Shin and Cheol-Hoon Lee
Electronics 2021, 10(21), 2720; https://0-doi-org.brum.beds.ac.uk/10.3390/electronics10212720 - 08 Nov 2021
Viewed by 1901
Abstract
Large-scale computational problems that need to be addressed by modern computers, such as deep learning or big data analysis, cannot be solved on a single computer but can be solved with distributed computer systems. Since most distributed computing systems, consisting of a large number of networked computers, must propagate their computational results to each other, they can suffer from increasing overhead, resulting in lower computational efficiency. To solve these problems, we propose an architecture for a distributed system that uses shared memory that is simultaneously accessible by multiple computers. Our architecture is intended to be implemented in an FPGA or ASIC. Using an FPGA board that implements our architecture, we configured an actual distributed system and showed the feasibility of our system. We compared the results of a deep learning application test using our architecture with those using Google TensorFlow’s parameter server mechanism. We showed improvements of our architecture over TensorFlow’s parameter server mechanism and determined the future direction of research by identifying the expected problems. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 422 KiB  
Article
Brain Activity Recognition Method Based on Attention-Based RNN Mode
by Song Zhou and Tianhan Gao
Appl. Sci. 2021, 11(21), 10425; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110425 - 05 Nov 2021
Cited by 5 | Viewed by 1489
Abstract
Brain activity recognition based on electroencephalography (EEG) marks a major research direction in intelligent medicine, especially in human intention prediction, human–computer control and neurological diagnosis. The literature mainly focuses on the recognition of single-person binary brain activity, which is limited in more extensive and complex scenarios. Therefore, brain activity recognition in multi-person and multi-objective scenarios has been attracting increasing attention. Another challenge is the reduction in recognition accuracy caused by the interference of external noise as well as EEG’s low signal-to-noise ratio. In addition, traditional EEG feature analysis proves to be time-intensive and relies heavily on mature experience. This paper proposes a novel EEG recognition method to address the above issues. The basic features of EEG are first analyzed according to the EEG frequency bands. An attention-based RNN model is then adopted to eliminate the interference and achieve automatic recognition of the original EEG signal. Finally, we evaluate the proposed method with public and local EEG data sets and perform extensive tests to investigate how various factors affect the recognition results. As shown by the test results, compared with some typical EEG recognition methods, the proposed method offers better recognition accuracy and suitability in multi-objective task scenarios. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 1456 KiB  
Article
MovieDIRec: Drafted-Input-Based Recommendation System for Movies
by Hyeonwoo An, Daeyeol Kim, Kwangkee Lee and Nammee Moon
Appl. Sci. 2021, 11(21), 10412; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110412 - 05 Nov 2021
Cited by 2 | Viewed by 1690
Abstract
In a DNN-based recommendation system, the selection of the model’s input and the design of an appropriate input are very important for accuracy and for reflecting complex user preferences. Since what the layers learn toward the model’s goal depends on the input, the more closely the input is related to the goal, the less unnecessary information the model needs to learn. In this context, the term Drafted-Input, defined in this paper, denotes input data that have been appropriately selected and processed to meet the goals of the system, and that are updated to continuously reflect user preferences along with the learning of the model parameters. In this paper, the effects of properly designed and generated inputs on accuracy and usability are verified using the proposed systems. Furthermore, the proposed method and its user–item interaction handling are compared with state-of-the-art systems that use simple embedding data as input, and a model suitable for a practical client–server environment is also proposed. Full article
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 7867 KiB  
Article
Legal Judgment Prediction Based on Machine Learning: Predicting the Discretionary Damages of Mental Suffering in Fatal Car Accident Cases
by Decheng Hsieh, Lieuhen Chen and Taiping Sun
Appl. Sci. 2021, 11(21), 10361; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110361 - 04 Nov 2021
Cited by 3 | Viewed by 1937
Abstract
The discretionary damages for mental suffering in fatal car accident cases in Taiwan are subjective, uncertain, and unpredictable; thus, plaintiffs, defendants, and their lawyers find it difficult to judge whether spending much of their money and time on a lawsuit is worthwhile, and which legal factors judges will consider important and dominant when assessing the mental suffering damages. To address these problems, we propose k-nearest neighbor, classification and regression trees, and random forests as learning algorithms for regression to build optimal predictive models. In addition, we reveal the importance ranking of the legal factors via permutation feature importance. The experimental results show that the random forest model outperformed the other models and achieved good performance, and that “the mental suffering damages that the plaintiff claims” and “the age of the victim” play important roles in the assessment of mental suffering damages in fatal car accident cases in Taiwan. Therefore, litigants and their lawyers can predict the discretionary damages for mental suffering in advance, wisely decide whether to litigate, focus on the crucial legal factors, and develop the best litigation strategy. Full article
(This article belongs to the Topic Machine and Deep Learning)
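A sketch of the random forest regression plus permutation feature importance pipeline follows, with synthetic stand-ins for the legal factors (claimed damages, victim age, and others); scikit-learn's permutation_importance shuffles one feature at a time and measures the resulting drop in score.

```python
# Random forest regression with permutation feature importance.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
claimed = rng.uniform(0.5, 5.0, n)      # stand-in: claimed damages
victim_age = rng.uniform(1, 90, n)      # stand-in: age of the victim
other = rng.normal(size=(n, 3))         # other case factors
X = np.column_stack([claimed, victim_age, other])
y = 0.6 * claimed - 0.01 * victim_age + rng.normal(scale=0.2, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

imp = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
for name, m in zip(["claimed", "victim_age", "f2", "f3", "f4"],
                   imp.importances_mean):
    print(f"{name:12s} {m:.3f}")
```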

19 pages, 4521 KiB  
Article
An Attention-Based Generative Adversarial Network for Producing Illustrative Sketches
by Jihyeon Yeom, Heekyung Yang and Kyungha Min
Mathematics 2021, 9(21), 2791; https://0-doi-org.brum.beds.ac.uk/10.3390/math9212791 - 03 Nov 2021
Cited by 3 | Viewed by 1828
Abstract
An illustrative sketch style expresses the important shapes and regions of objects and scenes with salient lines and dark tones, while abstracting less important shapes and regions as vacant spaces. We present a framework that produces illustrative sketch styles from various photographs. Our framework is designed as a generative adversarial network (GAN) comprising four modules: a style extraction module, a generator module, a discriminator module and an RCCL module. We devise two key ideas to effectively extract illustrative sketch styles from sample artworks and apply them to input photographs. The first is an attention map that extracts the required style features from the important shapes and regions of the sample illustrative sketch styles. This attention map is used in the generator module of our framework for the effective production of illustrative sketch styles. The second is a relaxed cycle consistency loss that evaluates the quality of the produced illustrative sketch styles by comparing the images reconstructed from them with the input photographs. This relaxed cycle consistency loss focuses the comparison on important shapes and regions for an effective evaluation of the quality of the produced illustrative sketch styles. Our GAN-based framework with an attention map and a relaxed cycle consistency loss effectively produces illustrative sketch styles for various target photographs, including portraits, landscapes, and still lifes. We demonstrate the effectiveness of our framework through a human study, an ablation study, and a Fréchet Inception Distance evaluation. Full article
(This article belongs to the Topic Machine and Deep Learning)

21 pages, 517 KiB  
Article
A Zeroth-Order Adaptive Learning Rate Method to Reduce Cost of Hyperparameter Tuning for Deep Learning
by Yanan Li, Xuebin Ren, Fangyuan Zhao and Shusen Yang
Appl. Sci. 2021, 11(21), 10184; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110184 - 30 Oct 2021
Cited by 7 | Viewed by 1936
Abstract
Due to its powerful data representation ability, deep learning has dramatically improved the state-of-the-art in many practical applications. However, its utility highly depends on the fine-tuning of hyper-parameters, including the learning rate, batch size, and network initialization. Although many first-order adaptive methods (e.g., Adam, Adagrad) have been proposed to adjust the learning rate based on gradients, they are susceptible to the initial learning rate and network architecture. Therefore, the main challenge of using deep learning in practice is how to reduce the cost of tuning hyper-parameters. To address this, we propose a heuristic zeroth-order learning rate method, Adacomp, which adaptively adjusts the learning rate based only on the values of the loss function. The main idea is that Adacomp penalizes large learning rates to ensure convergence and compensates small learning rates to accelerate the training process. Therefore, Adacomp is robust to the initial learning rate. Extensive experiments, including comparisons with six typical adaptive methods (Momentum, Adagrad, RMSprop, Adadelta, Adam, and Adamax) on several benchmark datasets for image classification tasks (MNIST, KMNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100), were conducted. The experimental results show that Adacomp is robust not only to the initial learning rate but also to the network architecture, network initialization, and batch size. Full article
(This article belongs to the Topic Machine and Deep Learning)
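The zeroth-order idea, adjusting the learning rate from loss values alone, can be conveyed with a toy rule: shrink the rate after a loss increase, grow it after a decrease. The multiplicative factors below are arbitrary illustrations, not the published Adacomp update.

```python
# Toy zeroth-order learning-rate adaptation on a quadratic objective:
# the step size reacts only to observed loss values, not to gradients
# of the rate itself, so a bad initial rate is quickly corrected.
import numpy as np

def train(lr0=1.0, steps=50):
    w = np.array([5.0, -3.0])            # parameters of f(w) = ||w||^2
    lr, prev_loss = lr0, np.inf
    for _ in range(steps):
        loss = float(w @ w)
        lr *= 0.5 if loss > prev_loss else 1.1   # loss-only adaptation
        prev_loss = loss
        grad = 2 * w                     # gradient of the toy objective
        w = w - lr * grad
    return prev_loss

# Robust to a badly chosen initial learning rate:
for lr0 in [10.0, 1.0, 0.001]:
    print(f"lr0={lr0:>6}: final loss {train(lr0):.2e}")
```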

21 pages, 8243 KiB  
Article
Deep Learning-Based Method to Recognize Line Objects and Flow Arrows from Image-Format Piping and Instrumentation Diagrams for Digitization
by Yoochan Moon, Jinwon Lee, Duhwan Mun and Seungeun Lim
Appl. Sci. 2021, 11(21), 10054; https://0-doi-org.brum.beds.ac.uk/10.3390/app112110054 - 27 Oct 2021
Cited by 12 | Viewed by 5521
Abstract
As part of research on technology for the automatic conversion of image-format piping and instrumentation diagrams (P&IDs) into digital P&IDs, the present study proposes a method for recognizing various types of lines and flow arrows in image-format P&IDs. The proposed method consists of three steps. In the first step, preprocessing, the outer border and title box in the diagram are removed. In the second step, detection, continuous lines are detected, and then line signs and flow arrows indicating the flow direction are detected. In the third step, post-processing, the results of line sign detection are used to determine which continuous lines require a change of line type, and the line types are adjusted accordingly. Then, the recognized lines are merged with the flow arrows. To verify the proposed method, a prototype system was used to conduct a line recognition experiment. For the nine test P&IDs, the average precision and recall were 96.14% and 89.59%, respectively, showing high recognition performance. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 6237 KiB  
Article
High Inclusiveness and Accuracy Motion Blur Real-Time Gesture Recognition Based on YOLOv4 Model Combined Attention Mechanism and DeblurGanv2
by Hongchao Zhuang, Yilu Xia, Ning Wang and Lei Dong
Appl. Sci. 2021, 11(21), 9982; https://0-doi-org.brum.beds.ac.uk/10.3390/app11219982 - 25 Oct 2021
Cited by 9 | Viewed by 2536
Abstract
The combination of gesture recognition and aerospace exploration robots can realize efficient non-contact control of the robots. In the harsh aerospace environment, the captured gesture images are inevitably blurred and damaged. Motion-blurred images not only lose part of the transmitted information but also degrade neural network training in later stages. To improve the speed and accuracy of motion-blurred gesture recognition, the YOLOv4 (You Only Look Once, version 4) algorithm is studied from two aspects: motion-blurred image processing and model optimization. DeblurGanv2 is employed to remove the motion blur from the gestures in the YOLOv4 network’s input pictures. In terms of model structure, the K-means++ algorithm is used to cluster the prior boxes to obtain more appropriate size parameters for them. The CBAM attention mechanism and an SPP (spatial pyramid pooling) structure are added to the YOLOv4 model to improve the efficiency of network learning. The dataset for network training is designed for human–computer interaction in aerospace environments. To reduce the redundant features of the captured images and enhance the effect of model training, Wiener and bilateral filters are applied to the blurred images in the dataset to partially remove the motion blur. The model is augmented by imitating different environments. A YOLOv4-gesture model is built that incorporates the K-means++ algorithm and the CBAM and SPP mechanisms, and a DeblurGanv2 model is built to process the input images for YOLOv4 target recognition. The YOLOv4-motion-blur-gesture model is composed of the YOLOv4-gesture and DeblurGanv2 models. The augmented and enhanced gesture data set is used for model training. The experimental results demonstrate that the YOLOv4-motion-blur-gesture model performs comparatively well: it achieves highly inclusive and accurate recognition in the real-time interaction of motion-blurred gestures, improving the network training speed by 30%, the target detection accuracy by 10%, and the mAP value by about 10%. The constructed YOLOv4-motion-blur-gesture model has stable performance. It can not only support real-time human–computer interaction in aerospace environments under complex real-time conditions, but can also be applied to other environments with complex backgrounds requiring real-time detection. Full article
(This article belongs to the Topic Machine and Deep Learning)

17 pages, 683 KiB  
Article
MSGCN: Multi-Subgraph Based Heterogeneous Graph Convolution Network Embedding
by Junhui Chen, Feihu Huang and Jian Peng
Appl. Sci. 2021, 11(21), 9832; https://0-doi-org.brum.beds.ac.uk/10.3390/app11219832 - 21 Oct 2021
Cited by 5 | Viewed by 2465
Abstract
Heterogeneous graph embedding has become a hot topic in network embedding in recent years and is widely used in many practical scenarios. However, most existing heterogeneous graph embedding methods cannot make full use of all the auxiliary information. We therefore propose a new method called Multi-Subgraph based Graph Convolution Network (MSGCN), which uses topology information, semantic information, and node feature information to learn node embedding vectors. In MSGCN, the graph is first decomposed into multiple subgraphs according to edge type. A convolution operation is then applied to each subgraph to obtain the node representations of that subgraph. Finally, the node embeddings are obtained by aggregating the representation vectors of nodes across the subgraphs. Furthermore, we discuss the application of MSGCN to transductive and inductive learning tasks, respectively, and propose a node sampling method for inductive learning tasks to obtain representations of new nodes. This sampling method uses an attention mechanism to find important nodes and then assigns different weights to different nodes during aggregation. We conducted experiments on three datasets. The experimental results indicate that MSGCN outperforms state-of-the-art methods on multi-class node classification tasks. Full article
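The per-edge-type decomposition can be sketched in a few lines of PyTorch: one graph convolution per subgraph adjacency, followed by aggregation (a mean here) across subgraphs. This is a schematic reading of the idea, not the authors' released code.

```python
import torch
import torch.nn as nn

class SubgraphGCNLayer(nn.Module):
    """Schematic multi-subgraph layer: one graph convolution per
    edge-type subgraph, outputs aggregated by averaging."""
    def __init__(self, in_dim, out_dim, num_edge_types):
        super().__init__()
        self.weights = nn.ModuleList(
            nn.Linear(in_dim, out_dim) for _ in range(num_edge_types))

    def forward(self, x, adjs):  # adjs: list of normalized (N, N) adjacencies
        per_subgraph = [torch.relu(adj @ w(x)) for adj, w in zip(adjs, self.weights)]
        return torch.stack(per_subgraph).mean(dim=0)  # aggregate across subgraphs

# Toy usage: 5 nodes, 8 features, 2 edge types.
x = torch.randn(5, 8)
adjs = [torch.eye(5), torch.eye(5)]  # stand-ins for real adjacency matrices
layer = SubgraphGCNLayer(8, 16, num_edge_types=2)
print(layer(x, adjs).shape)  # torch.Size([5, 16])
```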
(This article belongs to the Topic Machine and Deep Learning)

37 pages, 1546 KiB  
Article
Self-Tuning Lam Annealing: Learning Hyperparameters While Problem Solving
by Vincent A. Cicirello
Appl. Sci. 2021, 11(21), 9828; https://0-doi-org.brum.beds.ac.uk/10.3390/app11219828 - 21 Oct 2021
Cited by 1 | Viewed by 1714
Abstract
The runtime behavior of Simulated Annealing (SA), similar to other metaheuristics, is controlled by hyperparameters. For SA, hyperparameters affect how “temperature” varies over time, and “temperature” in turn affects SA’s decisions on whether or not to transition to neighboring states. It is typically necessary to tune the hyperparameters ahead of time. However, there are adaptive annealing schedules that use search feedback to evolve the “temperature” during the search. A classic and generally effective adaptive annealing schedule is the Modified Lam. Although effective, the Modified Lam can be sensitive to the scale of the cost function, and is sometimes slow to converge to its target behavior. In this paper, we present a novel variation of the Modified Lam that we call Self-Tuning Lam, which uses early search feedback to auto-adjust its self-adaptive behavior. Using a variety of discrete and continuous optimization problems, we demonstrate the ability of the Self-Tuning Lam to nearly instantaneously converge to its target behavior independent of the scale of the cost function, as well as its run length. Our implementation is integrated into Chips-n-Salsa, an open-source Java library for parallel and self-adaptive local search. Full article
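The Modified Lam's target acceptance-rate track can be written down compactly. The sketch below reproduces the classic piecewise schedule (the constants 0.44, 560, and 440 are the conventional ones) together with a toy feedback loop that nudges the temperature toward it; the update constants and the dummy cost deltas are illustrative, and the paper's self-tuning refinements are not shown.

```python
import math
import random

def modified_lam_target(i, run_length):
    """Classic Modified Lam target acceptance rate at iteration i:
    exponential decay to 0.44 over the first 15% of the run, a 0.44
    plateau until 65%, then exponential decay to the end."""
    frac = i / run_length
    if frac < 0.15:
        return 0.44 + 0.56 * 560 ** (-frac / 0.15)
    if frac < 0.65:
        return 0.44
    return 0.44 * 440 ** (-(frac - 0.65) / 0.35)

# Toy feedback loop: raise temperature when the running acceptance
# rate lags the target, lower it when acceptance runs hot.
temperature, accept_rate = 1.0, 0.5
for i in range(1, 1001):
    delta = abs(random.gauss(0, 1))          # dummy uphill cost delta
    accepted = random.random() < math.exp(-delta / temperature)
    accept_rate = 0.998 * accept_rate + 0.002 * (1.0 if accepted else 0.0)
    temperature *= 0.999 if accept_rate > modified_lam_target(i, 1000) else 1.001
```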
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 4526 KiB  
Article
Applying Precision Agriculture to Artificial Waterfowl Hatching, Using the Black Muscovy Duck as an Example
by Shun-Chieh Chang, Chih-Hsiang Cheng and Yen-Hung Chen
Appl. Sci. 2021, 11(20), 9763; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209763 - 19 Oct 2021
Viewed by 2234
Abstract
(1) Background: agricultural practices adopt homogenized farming processes to enhance product characteristics, with lower costs, standardization, mass production, and production efficiency. (2) Problem: conventional agricultural practices eliminate products whenever they deviate slightly from the expected status in any phase of the lifecycle due to the changing natural environment and climate. This elimination could be avoided if products received customized care toward the expected development path, via a universal prediction model that quantitatively describes how biomass changes with time and environment, together with the corresponding automatic environmental controls. (3) Methods: in this study, we built a prediction model to quantitatively predict the hatching rate of each egg by observing the biomass development path along the waterfowl production lifecycle and the corresponding environment settings. (4) Results: two experiments using black Muscovy duck hatching as a case study were executed. The first experiment identified the key characteristics among 25 candidate characteristics and built a prediction model to quantitatively predict the survivability of each black Muscovy duck egg. The second experiment validated the effectiveness of our prediction model; the hatching rate rose from 47% in the first experiment to 62% in the second without any intervention from experienced farmers. (5) Contributions: this research delivers an AI-based precision agriculture system prototype as a reference for waterfowl research. The results show that our proposed model can decrease training costs and enhance the product qualification rate for individual agricultural products. Full article
(This article belongs to the Topic Machine and Deep Learning)

14 pages, 493 KiB  
Article
BIBC: A Chinese Named Entity Recognition Model for Diabetes Research
by Lei Yang, Yufan Fu and Yu Dai
Appl. Sci. 2021, 11(20), 9653; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209653 - 16 Oct 2021
Cited by 6 | Viewed by 1835
Abstract
In the medical field, extracting medical entities from text with Named Entity Recognition (NER) has become one of the research hotspots. This paper takes chapter-level diabetes literature as its research object and uses a deep learning method to extract the medical entities it contains. Built on a deep bidirectional transformer structure, the pre-trained language model BERT can resolve the representation of polysemous words and supplement features using large-scale unlabeled data, while a BiLSTM-CRF model extracts long-distance sentence features. On this basis, considering that such a model cannot focus on local information in the sentence, leading to insufficient feature extraction, and that Chinese data are mainly word-based, this paper proposes a Named Entity Recognition method called BIBC. The method incorporates an Iterated Dilated CNN so that the model can take global and local features into account simultaneously, and uses the BERT-WWM model, based on whole-word masking, to further extract semantic information from Chinese data. In an experiment on diabetes entity recognition with Ruijin Hospital data, the precision, recall, and F1-score improved to 79.58%, 80.21%, and 79.89%, respectively, surpassing the evaluation indexes of existing studies. This indicates that the method can extract the semantic information of diabetes texts more accurately and obtain good entity recognition results, meeting the requirements of practical applications. Full article
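The Iterated Dilated CNN component is the most mechanical part to illustrate: stacked 1-D convolutions with growing dilation widen the receptive field over the token sequence without pooling. Below is a minimal PyTorch sketch of such a block; the dimensions and dilation rates are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DilatedConvBlock(nn.Module):
    """Minimal Iterated Dilated CNN block over token embeddings.
    Stacked 1-D convolutions with dilations 1, 2, 4 grow the receptive
    field so local and longer-range context are both captured."""
    def __init__(self, dim, kernel_size=3, dilations=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size, dilation=d,
                      padding=d * (kernel_size - 1) // 2)  # keeps seq length
            for d in dilations)

    def forward(self, x):          # x: (batch, seq_len, dim), e.g. BERT output
        h = x.transpose(1, 2)      # Conv1d expects (batch, dim, seq_len)
        for conv in self.convs:
            h = torch.relu(conv(h))
        return h.transpose(1, 2)   # back to (batch, seq_len, dim)

tokens = torch.randn(2, 50, 768)   # stand-in for BERT-WWM encodings
print(DilatedConvBlock(768)(tokens).shape)  # torch.Size([2, 50, 768])
```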
(This article belongs to the Topic Machine and Deep Learning)

31 pages, 1980 KiB  
Review
Eliciting Auxiliary Information for Cold Start User Recommendation: A Survey
by Nor Aniza Abdullah, Rasheed Abubakar Rasheed, Mohd Hairul Nizam Md. Nasir and Md Mujibur Rahman
Appl. Sci. 2021, 11(20), 9608; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209608 - 15 Oct 2021
Cited by 11 | Viewed by 2871
Abstract
Recommender systems suggest items of interest to users based on their preferences. These preferences are typically generated from user ratings of the items. If there are no ratings for a certain user or item, there is said to be a cold start problem, which leads to unreliable recommendations. Existing studies that reviewed and examined cold start in recommender systems have not explained the process of deriving and obtaining the auxiliary information needed for cold start recommendation. This study surveys the existing literature in order to explain the various approaches and techniques employed by researchers, and the challenges associated with deriving and obtaining the auxiliary information necessary for cold start recommendation. Results show that auxiliary information for cold start recommendation is obtained by adapting traditional filtering and matrix factorization algorithms, typically with machine learning algorithms, to build learning prediction models. Knowledge of similar or connected user profiles can serve as auxiliary information for building a cold start user profile to enable similar recommendations in social networks. Similar users are clustered into sub-groups, so that a cold start user can be assigned to a sub-group with similar profiles for recommendations. The key challenges of obtaining the auxiliary information are: (1) two separate recommendation processes are needed to convert from pure cold start to warm start before eliciting the auxiliary information; and (2) the obtained implicit auxiliary information is usually ranked and sieved to select the top-rated and most reliable auxiliary information for recommendation. This study also found that cold start user recommendation has frequently been researched in the entertainment domain, typically using music and movie data, while little research has been carried out in educational institutions and academia, or on cold start for mobile applications. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 1558 KiB  
Article
Dwell Time Estimation of Import Containers as an Ordinal Regression Problem
by Laidy De Armas Jacomino, Miguel Angel Medina-Pérez, Raúl Monroy, Danilo Valdes-Ramirez, Carlos Morell-Pérez and Rafael Bello
Appl. Sci. 2021, 11(20), 9380; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209380 - 09 Oct 2021
Cited by 3 | Viewed by 2386
Abstract
The optimal stacking of import containers in a terminal reduces reshuffles during unloading operations. Knowing the departure date of each container is critical for optimal stacking. However, such a date is rarely known because it depends on various attributes. Therefore, some authors have proposed estimation algorithms using supervised classification. Although supervised classifiers can estimate this dwell time, the variable “dwell time” takes ordered values in this problem, suggesting the use of ordinal regression algorithms. Thus, we compared an ordinal regression algorithm (selected from 15) against two supervised classifiers (selected from 30). We set up two datasets with data collected in a container terminal, and extracted and evaluated 35 attributes related to the dwell time. Additionally, we ran 21 experiments to evaluate both approaches with respect to the modified mean absolute error and the number of reshuffles. As a result, we found that the ordinal regression algorithm outperforms the supervised classifiers, reaching the lowest modified mean absolute error in 15 (71%) and the fewest reshuffles in 14 (67%) of the experiments. Full article
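A common way to turn an off-the-shelf classifier into an ordinal regressor is the Frank-and-Hall decomposition: train K−1 binary models for "dwell time > k" and read the ordinal label off the cumulative probabilities. The sketch below illustrates that general reduction on synthetic data; it does not name any specific algorithm from the paper's comparison.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Frank & Hall reduction: an ordinal target with K levels becomes
# K-1 binary "y > k" classifiers; predict by summing P(y > k).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                   # stand-in container features
y = np.clip((X[:, 0] + rng.normal(size=500)).round().astype(int) + 2, 0, 4)

K = 5                                            # ordered dwell-time buckets
models = [LogisticRegression().fit(X, (y > k).astype(int))
          for k in range(K - 1)]

def predict_ordinal(X_new):
    # Expected number of exceeded thresholds, rounded to a level.
    p_greater = np.column_stack([m.predict_proba(X_new)[:, 1] for m in models])
    return np.rint(p_greater.sum(axis=1)).astype(int)

print(predict_ordinal(X[:5]), y[:5])
```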
(This article belongs to the Topic Machine and Deep Learning)

30 pages, 17992 KiB  
Article
Exploring the Knowledge Embedded in Class Visualizations and Their Application in Dataset and Extreme Model Compression
by José Ricardo Abreu-Pederzini, Guillermo Arturo Martínez-Mascorro, José Carlos Ortíz-Bayliss and Hugo Terashima-Marín
Appl. Sci. 2021, 11(20), 9374; https://0-doi-org.brum.beds.ac.uk/10.3390/app11209374 - 09 Oct 2021
Viewed by 1599
Abstract
Artificial neural networks are efficient learning algorithms that are considered to be universal approximators for solving numerous real-world problems in areas such as computer vision, language processing, or reinforcement learning. To approximate any given function, neural networks train a large number of parameters—up to millions, or even billions in some cases. The large number of parameters and hidden layers in neural networks make them hard to interpret, which is why they are often referred to as black boxes. In the quest to make artificial neural networks interpretable in the field of computer vision, feature visualization stands out as one of the most developed and promising research directions. While feature visualizations are a valuable tool to gain insights about the underlying function learned by the network, they are still considered to be simple visual aids requiring human interpretation. In this paper, we propose that feature visualizations—class visualizations in particular—are analogous to mental imagery in humans, resembling the experience of seeing or perceiving the actual training data. Therefore, we propose that class visualizations contain embedded knowledge that can be exploited in a more automated manner. We present a series of experiments that shed light on the nature of class visualizations and demonstrate that class visualizations can be considered a conceptual compression of the data used to train the underlying model. Finally, we show that class visualizations can be regarded as convolutional filters and experimentally show their potential for extreme model compression purposes. Full article
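Class visualizations of the kind discussed here are typically produced by activation maximization: starting from noise, gradient-ascend the input so a chosen class logit grows. Below is a bare-bones PyTorch sketch with a stand-in model and no regularizers; visualizations like the paper's would involve considerably more machinery.

```python
import torch
import torch.nn as nn

# Bare-bones activation maximization: optimize an input image so the
# logit of a target class grows. The tiny model is a stand-in; real
# class visualizations add regularizers (jitter, blur, decorrelation).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

target_class = 3
image = torch.randn(1, 3, 64, 64, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    loss = -model(image)[0, target_class]   # ascend the class logit
    loss.backward()
    optimizer.step()

print("final logit:", model(image)[0, target_class].item())
```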
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1997 KiB  
Article
Woodworking Tool Wear Condition Monitoring during Milling Based on Power Signals and a Particle Swarm Optimization-Back Propagation Neural Network
by Weihang Dong, Xianqing Xiong, Ying Ma and Xinyi Yue
Appl. Sci. 2021, 11(19), 9026; https://0-doi-org.brum.beds.ac.uk/10.3390/app11199026 - 28 Sep 2021
Cited by 11 | Viewed by 1756
Abstract
In the intelligent manufacturing of furniture, the power signal has the advantages of low cost and high accuracy and is often used as the signal for tool wear condition monitoring. However, the power signal is not very sensitive to tool wear conditions. The present work addresses this issue by proposing a novel woodworking tool wear condition monitoring method that employs a limiting arithmetic average filtering method and a particle swarm optimization (PSO)-back propagation (BP) neural network algorithm. The limiting arithmetic average filtering method is used to process the power signal and extract features indicative of the woodworking tool wear conditions. The spindle speed, milling depth, extracted features, and tool wear conditions are used as sample vectors. The PSO-BP neural network algorithm is then used to establish the monitoring model of the woodworking tool wear condition. Experiments show that the proposed limiting arithmetic average filtering method and PSO-BP neural network algorithm can accurately monitor woodworking tool wear conditions under different milling parameters. Full article
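One plausible reading of a "limiting arithmetic average" filter combines amplitude limiting (reject a sample that jumps more than a threshold from its predecessor) with a moving average. A small NumPy sketch under that reading, with an assumed threshold and window size:

```python
import numpy as np

def limiting_arithmetic_average(signal, limit, window):
    """Amplitude-limit the signal, then smooth with a moving average.
    A sample that jumps more than `limit` from the previous accepted
    value is replaced by that value; the cleaned signal is then
    averaged over `window` samples. Both parameters are illustrative."""
    clipped = np.empty_like(signal, dtype=float)
    clipped[0] = signal[0]
    for i in range(1, len(signal)):
        prev = clipped[i - 1]
        clipped[i] = signal[i] if abs(signal[i] - prev) <= limit else prev
    kernel = np.ones(window) / window
    return np.convolve(clipped, kernel, mode="same")

power = np.array([2.0, 2.1, 9.0, 2.2, 2.3, 2.2, 2.4])  # spike at index 2
print(limiting_arithmetic_average(power, limit=1.0, window=3))
```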
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 1492 KiB  
Article
Improving the Accuracy of Predicting Bank Depositor’s Behavior Using a Decision Tree
by Fereshteh Safarkhani and Sérgio Moro
Appl. Sci. 2021, 11(19), 9016; https://0-doi-org.brum.beds.ac.uk/10.3390/app11199016 - 28 Sep 2021
Cited by 8 | Viewed by 3449
Abstract
Telemarketing is a widely adopted direct marketing technique in banks. Since few customers respond positively, data prediction models can help select the prospective customers most likely to subscribe. We aim to develop an accurate classifier to predict which customers will subscribe to a long-term deposit proposed by a bank. Accordingly, this paper focuses on combining resampling, to reduce data imbalance, with feature selection, to reduce the complexity of the computation and the dimensionality of inefficient data modeling. These operations improved the performance of the classification algorithm in terms of accuracy. The experiments were run on a real bank dataset, and the J48 decision tree achieved 94.39% prediction accuracy, with 0.975 sensitivity and 0.709 specificity, showing better results than other approaches reported in the existing literature, such as logistic regression (91.79% accuracy; 0.975 sensitivity; 0.495 specificity) and the Naive Bayes classifier (90.82% accuracy; 0.961 sensitivity; 0.507 specificity). Furthermore, our resampling and feature selection approach resulted in improved accuracy (94.39%) compared to a state-of-the-art approach based on a fuzzy algorithm (92.89%). Full article
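A sketch of the pipeline shape described: resample the minority class, select features, then fit a decision tree. Scikit-learn's CART stands in for WEKA's J48, and synthetic data replaces the bank dataset; all parameters are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

# Imbalanced synthetic stand-in for the bank telemarketing data.
X, y = make_classification(n_samples=4000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Resampling: upsample the minority class to balance the training set.
minority = y_tr == 1
X_min_up, y_min_up = resample(X_tr[minority], y_tr[minority],
                              n_samples=int((~minority).sum()), random_state=0)
X_bal = np.vstack([X_tr[~minority], X_min_up])
y_bal = np.concatenate([y_tr[~minority], y_min_up])

# Feature selection, then a CART tree (stand-in for J48).
selector = SelectKBest(f_classif, k=10).fit(X_bal, y_bal)
tree = DecisionTreeClassifier(max_depth=6, random_state=0)
tree.fit(selector.transform(X_bal), y_bal)
print("test accuracy:", tree.score(selector.transform(X_te), y_te))
```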
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 2447 KiB  
Article
AUDD: Audio Urdu Digits Dataset for Automatic Audio Urdu Digit Recognition
by Aisha Chandio, Yao Shen, Malika Bendechache, Irum Inayat and Teerath Kumar
Appl. Sci. 2021, 11(19), 8842; https://0-doi-org.brum.beds.ac.uk/10.3390/app11198842 - 23 Sep 2021
Cited by 19 | Viewed by 2951
Abstract
The ongoing development of audio datasets for numerous languages has spurred research activities towards designing smart speech recognition systems. A typical speech recognition system can be applied in many emerging applications, such as smartphone dialing, airline reservations, and automatic wheelchairs, among others. Urdu is the national language of Pakistan and is also widely spoken in many other South Asian countries (e.g., India, Afghanistan). Therefore, we present a comprehensive dataset of spoken Urdu digits ranging from 0 to 9. Our dataset contains 25,518 sound samples collected from 740 participants. To test the proposed dataset, we apply several existing classification algorithms to it, including the Support Vector Machine (SVM), the Multilayer Perceptron (MLP), and variants of EfficientNet. These algorithms serve as baselines. Furthermore, we propose a convolutional neural network (CNN) for audio digit classification. We conduct experiments with these networks, and the results show that the proposed CNN is efficient and outperforms the baseline algorithms in terms of classification accuracy. Full article
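A minimal version of the CNN-baseline idea is a small 2-D convolutional stack over fixed-size spectrogram-like inputs with ten output classes for the digits. The architecture and input shape below are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

# Minimal digit-audio classifier sketch: a small CNN over a fixed-size
# spectrogram (e.g., 64 mel bands x 64 frames). Architecture and input
# shape are illustrative assumptions.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
    nn.Linear(128, 10),            # ten digit classes, 0-9
)

spectrogram = torch.randn(8, 1, 64, 64)   # batch of 8 stand-in inputs
logits = model(spectrogram)
print(logits.shape)                        # torch.Size([8, 10])
```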
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 634 KiB  
Article
Auxiliary Information-Enhanced Recommendations
by Shoujin Wang, Wanggen Wan, Tong Qu and Yanqiu Dong
Appl. Sci. 2021, 11(19), 8830; https://0-doi-org.brum.beds.ac.uk/10.3390/app11198830 - 23 Sep 2021
Cited by 1 | Viewed by 1309
Abstract
Sequential recommendations have attracted increasing attention from both academia and industry in recent years. They predict a given user’s next choice of items mainly by modeling the sequential relations over a sequence of the user’s interactions with the items. However, most existing sequential recommendation algorithms focus on the sequential dependencies between item IDs within sequences, while ignoring the rich and complex relations embedded in auxiliary information, such as items’ image and textual information. Such complex relations can help us better understand users’ preferences towards items and thus benefit the recommendations. To bridge this gap, we propose an auxiliary information-enhanced sequential recommendation algorithm called memory fusion network for recommendation (MFN4Rec), which incorporates both items’ image and textual information. Accordingly, item IDs, item image information, and item textual information are regarded as three modalities. By comprehensively modelling the sequential relations within modalities and the interaction relations across modalities, MFN4Rec can learn a more informative representation of users’ preferences for more accurate recommendations. Extensive experiments on two real-world datasets demonstrate the superiority of MFN4Rec over state-of-the-art sequential recommendation algorithms. Full article
(This article belongs to the Topic Machine and Deep Learning)

19 pages, 601 KiB  
Article
A Nested Chinese Restaurant Topic Model for Short Texts with Document Embeddings
by Yue Niu, Hongjie Zhang and Jing Li
Appl. Sci. 2021, 11(18), 8708; https://0-doi-org.brum.beds.ac.uk/10.3390/app11188708 - 18 Sep 2021
Cited by 5 | Viewed by 2155
Abstract
In recent years, short texts have become a prevalent form of text on the internet. Due to the short length of each text, conventional topic models for short texts suffer from the sparsity of word co-occurrence information. Researchers have proposed various customized topic models for short texts that provide additional word co-occurrence information. However, these models cannot incorporate sufficient semantic word co-occurrence information and may introduce additional noise. To address these issues, we propose a self-aggregated topic model that incorporates document embeddings. Aggregating short texts into long documents according to document embeddings can provide sufficient word co-occurrence information and avoid incorporating non-semantic word co-occurrence information. However, the document embeddings of short texts contain a lot of noise resulting from the sparsity of word co-occurrence information. We therefore discard this noise by converting the document embeddings into global and local semantic information: the global semantic information is the similarity probability distribution over the entire dataset, and the local semantic information is the distances between similar short texts. We then adopt a nested Chinese restaurant process to incorporate these two kinds of information. Finally, we compare our model to several state-of-the-art models on four real-world short text corpora. The experimental results show that our model achieves better performance in terms of topic coherence and classification accuracy. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 441 KiB  
Article
Context-Based Geodesic Dissimilarity Measure for Clustering Categorical Data
by Changki Lee and Uk Jung
Appl. Sci. 2021, 11(18), 8416; https://0-doi-org.brum.beds.ac.uk/10.3390/app11188416 - 10 Sep 2021
Cited by 1 | Viewed by 1636
Abstract
Measuring the dissimilarity between two observations is the basis of many data mining and machine learning algorithms, and its effectiveness has a significant impact on learning outcomes. The dissimilarity or distance computation has been a manageable problem for continuous data because many numerical operations can be successfully applied. However, unlike continuous data, defining a dissimilarity between pairs of observations with categorical variables is not straightforward. This study proposes a new method to measure the dissimilarity between two categorical observations, called a context-based geodesic dissimilarity measure, for the categorical data clustering problem. The proposed method considers the relationships between categorical variables and discovers the implicit topological structures in categorical data. In other words, it can effectively reflect the nonlinear patterns of arbitrarily shaped categorical data clusters. Our experimental results confirm that the proposed measure that considers both nonlinear data patterns and relationships among the categorical variables yields better clustering performance than other distance measures. Full article
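The geodesic half of the idea can be illustrated directly: connect each observation to its nearest neighbours under some base dissimilarity, then take shortest-path distances through the graph so that distances follow the data manifold. The base metric below is plain Hamming over integer category codes, which omits the paper's context weighting; the data are random stand-ins.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

# Geodesic dissimilarity sketch: build a kNN graph under a base metric,
# then take shortest-path (graph geodesic) distances. Hamming over
# category codes stands in for a context-based base measure.
rng = np.random.default_rng(0)
data = rng.integers(0, 4, size=(40, 6)).astype(float)  # encoded categories

knn = kneighbors_graph(data, n_neighbors=5, mode="distance",
                       metric="hamming")
geodesic = shortest_path(knn, method="D", directed=False)
print(geodesic[:3, :3])  # entries may be inf if the graph is disconnected
```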
(This article belongs to the Topic Machine and Deep Learning)

22 pages, 19389 KiB  
Article
Application of Machine Learning Methods for Pallet Loading Problem
by Batin Latif Aylak, Murat İnce, Okan Oral, Gürsel Süer, Najat Almasarwah, Manjeet Singh and Bashir Salah
Appl. Sci. 2021, 11(18), 8304; https://0-doi-org.brum.beds.ac.uk/10.3390/app11188304 - 07 Sep 2021
Cited by 4 | Viewed by 5125
Abstract
Because of continuous competition in the corporate industrial sector, numerous companies are always looking for strategies to ensure timely product delivery in order to survive against their competitors. For this reason, logistics play a significant role in the warehousing, shipment, and transportation of products, and high resource utilization can improve profit margins and reduce unnecessary storage or shipping costs. One significant issue in shipments is the Pallet Loading Problem (PLP), which is generally solved by maximizing the total number of boxes to be loaded on a pallet. In many previous studies, various solutions to the PLP have been suggested in the context of logistics and shipment delivery systems. In this paper, a novel two-phase approach utilizing a number of Machine Learning (ML) models is presented to tackle the PLP. The dataset utilized in this study was obtained from the DHL supply chain system. Based on the training and testing of various ML models, our results show that a very high (>85%) Pallet Utilization Volume (PUV) was obtained, and an accuracy of >89% was achieved in predicting an accurate loading arrangement of boxes on a suitable pallet. Furthermore, a comprehensive analysis of all the results, comparing several ML models, is provided to show the efficacy of the proposed methodology. Full article
(This article belongs to the Topic Machine and Deep Learning)

24 pages, 3765 KiB  
Article
On Image Classification in Video Analysis of Omnidirectional Apis Mellifera Traffic: Random Reinforced Forests vs. Shallow Convolutional Networks
by Vladimir Kulyukin, Nikhil Ganta and Anastasiia Tkachenko
Appl. Sci. 2021, 11(17), 8141; https://0-doi-org.brum.beds.ac.uk/10.3390/app11178141 - 02 Sep 2021
Cited by 3 | Viewed by 2180
Abstract
Omnidirectional honeybee traffic is the number of bees moving in arbitrary directions in close proximity to the landing pad of a beehive over a period of time. Automated video analysis of such traffic is critical for continuous colony health assessment. In our previous research, we proposed a two-tier algorithm to measure omnidirectional bee traffic in videos. Our algorithm combines motion detection with image classification: in tier 1, motion detection functions as class-agnostic object localization to generate regions that may contain objects; in tier 2, each region from tier 1 is classified by a class-specific classifier. In this article, we present an empirical and theoretical comparison of random reinforced forests and shallow convolutional networks as tier-2 classifiers. A random reinforced forest is a random forest trained on a dataset with reinforcement learning. We present several methods of training random reinforced forests and compare their performance with shallow convolutional networks on seven image datasets. We develop a theoretical framework to assess the complexity of image classification by an image classifier, and we formulate and prove three theorems on finding optimal random reinforced forests. Our conclusion is that, despite their limitations, random reinforced forests are a reasonable alternative to convolutional networks when memory footprint, classification efficiency, and energy efficiency are important factors. We outline several ways in which the performance of random reinforced forests may be improved. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 39898 KiB  
Article
Birds Eye View Look-Up Table Estimation with Semantic Segmentation
by Dongkyu Lee, Wee Peng Tay and Seok-Cheol Kee
Appl. Sci. 2021, 11(17), 8047; https://0-doi-org.brum.beds.ac.uk/10.3390/app11178047 - 30 Aug 2021
Cited by 1 | Viewed by 1855
Abstract
In this work, a study was carried out to estimate a look-up table (LUT) that converts a camera image plane to a birds eye view (BEV) plane using a single camera. Traditional camera pose estimation methods involve high research and manufacturing costs for future autonomous vehicles and may require pre-configured infrastructure. This paper proposes a camera calibration system for autonomous driving that is low-cost and requires little infrastructure: a network that, using a single camera, estimates the camera pose under urban road driving conditions and outputs an LUT that converts the image into a BEV. We collected synthetic data using a simulator, constructed BEVs and LUTs as ground truth, and used the proposed network and ground truth to train the pose estimation function. In the process, the network predicts the pose by interpreting semantic segmentation features, and its performance is increased by attaching a layer that handles the overall direction of the network. The network outputs the camera angles (roll/pitch/yaw) in the 3D coordinate system so that the user can monitor learning. Since the network’s output is an LUT, no additional calculation is needed, which improves real-time performance. Full article
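What a pixel LUT buys at run time is worth seeing: once per-pixel source coordinates exist, producing the BEV is a single remap with no further geometry. A sketch with a dummy LUT (a horizontal flip) in place of a learned camera-to-BEV table:

```python
import cv2
import numpy as np

# Once a LUT of source coordinates exists, BEV generation is one cheap
# remap call. The flip LUT here is a dummy stand-in for a learned table.
h, w = 480, 640
image = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)

map_x, map_y = np.meshgrid(np.arange(w, dtype=np.float32),
                           np.arange(h, dtype=np.float32))
map_x = w - 1 - map_x            # dummy LUT: horizontal flip

bev = cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)
print(bev.shape)                 # (480, 640, 3)
```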
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 360 KiB  
Article
A Multiple-Choice Machine Reading Comprehension Model with Multi-Granularity Semantic Reasoning
by Yu Dai, Yufan Fu and Lei Yang
Appl. Sci. 2021, 11(17), 7945; https://0-doi-org.brum.beds.ac.uk/10.3390/app11177945 - 27 Aug 2021
Cited by 4 | Viewed by 2863
Abstract
To address the poor semantic reasoning of models in multiple-choice Chinese machine reading comprehension (MRC), this paper proposes an MRC model incorporating multi-granularity semantic reasoning. In this work, we first encode the articles, questions, and candidate answers to extract global reasoning information; second, we apply multiple convolution kernels of different sizes, with max pooling, to the BERT-encoded articles, questions, and candidates to extract local semantic reasoning information at different granularities; we then fuse the global information with the local multi-granularity information and use the result to select an answer. The proposed model can combine the learned multi-granularity semantic information for reasoning, addressing the model’s otherwise poor semantic reasoning ability and thus improving the reasoning ability of machine reading comprehension. The experiments show that the proposed model achieves better performance in semantic reasoning on the C3 dataset than the benchmark model, which verifies its effectiveness. Full article
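The multi-granularity step corresponds to the familiar TextCNN pattern: parallel 1-D convolutions with different kernel widths over the encoder output, each max-pooled, then concatenated. A schematic PyTorch rendering with assumed sizes, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class MultiGranularityPool(nn.Module):
    """TextCNN-style block: parallel convolutions with different kernel
    sizes over encoder outputs, each max-pooled, then concatenated.
    Kernel sizes and dimensions are assumed for illustration."""
    def __init__(self, dim=768, n_filters=64, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, n_filters, k) for k in kernel_sizes)

    def forward(self, x):               # x: (batch, seq_len, dim)
        h = x.transpose(1, 2)           # -> (batch, dim, seq_len)
        pooled = [torch.relu(c(h)).max(dim=2).values for c in self.convs]
        return torch.cat(pooled, dim=1) # local features, all granularities

encoded = torch.randn(4, 128, 768)      # stand-in for BERT output
print(MultiGranularityPool()(encoded).shape)  # torch.Size([4, 192])
```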
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 12343 KiB  
Article
Principal Component Analysis and Machine Learning Approaches for Photovoltaic Power Prediction: A Comparative Study
by Souhaila Chahboun and Mohamed Maaroufi
Appl. Sci. 2021, 11(17), 7943; https://0-doi-org.brum.beds.ac.uk/10.3390/app11177943 - 27 Aug 2021
Cited by 16 | Viewed by 2426
Abstract
Nowadays, in the context of Industry 4.0, considerable volumes of data are being generated continuously from intelligent sensors and connected objects. The proper understanding and use of these data are crucial levers of performance and innovation. Machine learning is the technology that allows the full potential of big datasets to be exploited. As a branch of artificial intelligence, it enables us to discover patterns and make predictions from data based on statistics, data mining, and predictive analysis. The key goal of this study was to use machine learning approaches to forecast the hourly power produced by photovoltaic panels. A comparative analysis of various predictive models, including elastic net, support vector regression, random forest, and Bayesian regularized neural networks, was carried out to identify the models providing the best predictions. The principal component analysis used to reduce the dimensionality of the input data revealed six main factor components that together explain up to 91.95% of the variation in all variables. Finally, performance metrics demonstrated that Bayesian regularized neural networks achieved the best results, with an accuracy of R2 = 99.99% and RMSE = 0.002 kW. Full article
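The dimensionality-reduction step reported here is directly reproducible in outline: fit PCA, then read off how many components are needed to reach a chosen share of the variance. Random correlated data stands in for the plant measurements.

```python
import numpy as np
from sklearn.decomposition import PCA

# PCA outline: fit, then count the components needed to explain a
# chosen share of the variance (the paper reports six for ~91.95%).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12)) @ rng.normal(size=(12, 12))  # correlated

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_components = int(np.searchsorted(cumulative, 0.92) + 1)
print("components for 92% variance:", n_components)
```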
(This article belongs to the Topic Machine and Deep Learning)

18 pages, 3684 KiB  
Article
Solution to Solid Wood Board Cutting Stock Problem
by Min Tang, Ying Liu, Fenglong Ding and Zhengguang Wang
Appl. Sci. 2021, 11(17), 7790; https://0-doi-org.brum.beds.ac.uk/10.3390/app11177790 - 24 Aug 2021
Cited by 7 | Viewed by 2600
Abstract
In the production process for wooden furniture, raw material costs account for more than 50% of furniture costs, and the utilization rate of raw materials depends mainly on the layout scheme. A reasonable layout is therefore an important measure for reducing furniture costs. This paper investigates the solid wood board cutting stock problem (CSP) and establishes an optimization model whose goal is the highest possible utilization rate of the original boards. An ant colony-immune genetic algorithm (AC-IGA) is designed to solve this model. The solutions of the ant colony algorithm are used as the initial population of the immune genetic algorithm, and the optimal solutions obtained by the immune genetic algorithm after multiple iterations are transformed into accumulated global pheromones, which improves the search ability and ensures solution quality. The layout process of the solid wood boards is abstracted as the construction process of a solution. At the same time, to prevent premature convergence, several improved mechanisms, such as a hybrid global pheromone update and adaptive crossover probability, are proposed. Comparative experiments were designed to verify the feasibility and effectiveness of the AC-IGA, and the experimental results show that the AC-IGA has better solution precision and global search ability than the ant colony algorithm (ACA), genetic algorithm (GA), grey wolf optimizer (GWO), and polar bear optimization (PBO). The utilization rate increased by more than 2.308%, providing effective theoretical and methodological support for furniture enterprises seeking to improve economic benefits. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 5223 KiB  
Data Descriptor
Do the European Data Portal Datasets in the Categories Government and Public Sector, Transport and Education, Culture and Sport Meet the Data on the Web Best Practices?
by Morgana Carneiro Andrade, Rafaela Oliveira da Cunha, Jorge Figueiredo and Ana Alice Baptista
Data 2021, 6(8), 94; https://0-doi-org.brum.beds.ac.uk/10.3390/data6080094 - 19 Aug 2021
Cited by 1 | Viewed by 2284
Abstract
The European Data Portal is one of the worldwide initiatives that aggregate and make available open data. This is a case study with a qualitative approach that aims to determine to what extent the datasets from the Government and Public Sector, Transport, and Education, Culture and Sport categories published on the portal meet the Data on the Web Best Practices (W3C). With the datasets sorted by last modified and filtered by the ratings Excellent and Good+, we analyzed 50 different datasets from each category. The analysis revealed that the Government and Public Sector category has the best-rated datasets, followed by Transport and, lastly, Education, Culture and Sport. The most observed best practices (BPs) were BP1, BP2, BP4, BP5, BP10, BP11, BP12, BP13C, BP16, BP17, BP19, BP29, and BP34, while the least observed were BP3, BP7H, BP7C, BP13H, BP14, BP15, BP21, BP32, and BP35. These results fill a gap in the literature on the quality of the data made available through this portal and provide insights for European data managers into which best practices are most observed and which need more attention. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 899 KiB  
Article
Strong Influence of Responses in Training Dialogue Response Generator
by So-Eon Kim, Yeon-Soo Lim and Seong-Bae Park
Appl. Sci. 2021, 11(16), 7415; https://0-doi-org.brum.beds.ac.uk/10.3390/app11167415 - 12 Aug 2021
Cited by 2 | Viewed by 1890
Abstract
The sequence-to-sequence model is widely used for dialogue response generators, but it tends to generate safe responses for most input queries. Since safe responses are unattractive and boring, a number of efforts have been made to make generators produce diverse responses, but generating diverse responses remains an open problem. As a solution, this paper proposes a novel response generator, the Response Generator with Response Weight (RGRW). The proposed generator is a transformer-based sequence-to-sequence model whose encoder is a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and whose decoder is a variant of GPT-2. Since attention on the response is not sufficiently reflected in a transformer-based sequence-to-sequence model, the proposed generator enhances the influence of the response through a response weight, which determines the importance of each token in a query with respect to the response. The decoder then processes the response weight, as well as the query encoding, to generate a diverse response. The effectiveness of RGRW is proven by showing that it generates more diverse and informative responses than the baseline response generator by focusing more on the tokens that are important for generating the response. Additionally, the proposed model outperforms the Commonsense Knowledge-Aware Dialogue generation model (ConKADI), a state-of-the-art model. Full article
(This article belongs to the Topic Machine and Deep Learning)

11 pages, 5601 KiB  
Article
A NEAT Based Two Stage Neural Network Approach to Generate a Control Algorithm for a Pultrusion System
by Christian Pommer, Michael Sinapius, Marco Brysch and Naser Al Natsheh
AI 2021, 2(3), 355-365; https://0-doi-org.brum.beds.ac.uk/10.3390/ai2030022 - 05 Aug 2021
Cited by 1 | Viewed by 3431
Abstract
Controlling complex systems with traditional control systems can lead to sub-optimal results, since mathematical models often do not completely describe physical processes. An alternative approach is the use of a neural network based control algorithm. Neural networks can approximate any function and as such are able to control even the most complex systems. One challenge of this approach is the need for a high-speed training loop that permits enough training rounds in a reasonable time frame to generate a viable control network. This paper overcomes this problem by employing a second neural network to approximate the output of a relatively slow 3D-FE-Pultrusion-Model. This approximation is orders of magnitude faster than the original model, with only minor deviations from the original model's behaviour. The new model is then employed in a training loop to successfully train a NEAT based genetic control algorithm. Full article
(This article belongs to the Topic Machine and Deep Learning)

15 pages, 359 KiB  
Article
A Payload Based Malicious HTTP Traffic Detection Method Using Transfer Semi-Supervised Learning
by Tieming Chen, Yunpeng Chen, Mingqi Lv, Gongxun He, Tiantian Zhu, Ting Wang and Zhengqiu Weng
Appl. Sci. 2021, 11(16), 7188; https://0-doi-org.brum.beds.ac.uk/10.3390/app11167188 - 04 Aug 2021
Cited by 10 | Viewed by 4187
Abstract
Malicious HTTP traffic detection plays an important role in web application security. Most existing work applies machine learning and deep learning techniques to build malicious HTTP traffic detection models. However, these still suffer from high training data collection costs and low cross-dataset generalization ability. Aiming at these problems, this paper proposes DeepPTSD, a deep learning method for payload based malicious HTTP traffic detection. First, it treats malicious HTTP traffic detection as a text classification problem and trains an initial detection model using TextCNN on a public dataset, then adapts the initial detection model to the target dataset with a transfer learning algorithm. Second, in the transfer learning procedure, it uses a semi-supervised learning algorithm to accomplish the model adaptation task. The semi-supervised learning algorithm enhances the target dataset with an HTTP payload data augmentation mechanism in order to exploit both the labeled and unlabeled data. We evaluate DeepPTSD on two real HTTP traffic datasets. The results show that DeepPTSD has competitive performance under small-data conditions. Full article
(This article belongs to the Topic Machine and Deep Learning)

12 pages, 39307 KiB  
Article
Zernike Coefficient Prediction Technique for Interference Based on Generation Adversarial Network
by Allen Jong-Woei Whang, Yi-Yung Chen, Tsai-Hsien Yang, Cheng-Tse Lin, Zhi-Jia Jian and Chun-Han Chou
Appl. Sci. 2021, 11(15), 6933; https://0-doi-org.brum.beds.ac.uk/10.3390/app11156933 - 28 Jul 2021
Cited by 2 | Viewed by 2644
Abstract
In this paper, we propose a novel technique to predict Zernike coefficients from interference fringes based on a Generative Adversarial Network (GAN). In general, the task of a GAN is image-to-image translation, but we design our GAN for image-to-number translation. In the GAN model, the Generator's input is the interference fringe image, and its output is a mosaic image in which each piece encodes the value of one Zernike coefficient. Root Mean Square Error (RMSE) is our criterion for quantifying the difference between the ground truth and the predicted coefficients. After training the GAN model, we evaluate it with two different methods: the analytic formula (ideal images) and an optics simulation (simulated images). As a result, the RMSE is about 0.0182 ± 0.0035λ for the ideal image case and about 0.101 ± 0.0263λ for the simulated image case. Since the outcome in the simulated image case is poor, we use transfer learning to improve the RMSE to about 0.0586 ± 0.0035λ. The prediction technique applies not only to the ideal case but also to an actual interferometer. In addition, the novel technique predicts Zernike coefficients more accurately than our previous research. Full article
(This article belongs to the Topic Machine and Deep Learning)

20 pages, 19928 KiB  
Article
Credit Card Fraud Detection in Card-Not-Present Transactions: Where to Invest?
by Igor Mekterović, Mladen Karan, Damir Pintar and Ljiljana Brkić
Appl. Sci. 2021, 11(15), 6766; https://0-doi-org.brum.beds.ac.uk/10.3390/app11156766 - 23 Jul 2021
Cited by 9 | Viewed by 7773
Abstract
Online shopping, already on a steady rise, was propelled even further by the advent of the COVID-19 pandemic. Credit cards are, of course, a dominant way of doing business online, and the credit card fraud detection problem has become more relevant than ever as losses due to fraud accumulate. Most research on this topic takes an isolated, focused view of the problem, typically concentrating on tuning the data mining models. We noticed a significant gap between the academic research findings and the rightfully conservative businesses, which are careful when adopting new, especially black-box, models. In this paper, we took a broader perspective and considered this problem from both the academic and the business angle: we detected challenges in the fraud detection problem, such as feature engineering and unbalanced datasets, and distinguished between more and less lucrative areas to invest in when upgrading fraud detection systems. Our findings are based on real-world data of CNP (card not present) fraud transactions, which are a dominant type of fraud transaction. Data were provided by our industrial partner, an international card-processing company. We tested different data mining models and approaches to the outlined challenges and compared them to existing production systems in order to trace a cost-effective fraud detection system upgrade path. Full article
(This article belongs to the Topic Machine and Deep Learning)

13 pages, 702 KiB  
Article
Multiscale Content-Independent Feature Fusion Network for Source Camera Identification
by Changhui You, Hong Zheng, Zhongyuan Guo, Tianyu Wang and Xiongbin Wu
Appl. Sci. 2021, 11(15), 6752; https://0-doi-org.brum.beds.ac.uk/10.3390/app11156752 - 22 Jul 2021
Cited by 9 | Viewed by 1716
Abstract
In recent years, source camera identification has become a research hotspot in the field of image forensics and has received increasing attention. It has high application value in combating the spread of pornographic photos, copyright authentication of art photos, image tampering forensics, and so on. Although existing algorithms have greatly advanced research on source camera identification, they still cannot effectively reduce the interference of image content with image forensics. To suppress the influence of image content on source camera identification, a multiscale content-independent feature fusion network (MCIFFN) is proposed. MCIFFN is composed of three parallel branch networks. Before an image is sent to the first two branches, an adaptive filtering module filters out the image content and extracts noise features, which are then sent to the corresponding convolutional neural networks (CNNs). To retain information related to image color, the third branch applies no preprocessing and sends the image data directly to its CNN. Finally, the content-independent features of different scales extracted from the three branches are fused, and the fused features are used for image source identification. The CNN feature extraction network in MCIFFN is a shallow network embedded with a squeeze-and-excitation (SE) structure, called SE-SCINet. The experimental results show that the proposed MCIFFN is effective and robust, and its classification accuracy is approximately 2% higher than that of the SE-SCINet network alone. Full article
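The squeeze-and-excitation structure mentioned is compact enough to show whole: global-average-pool each channel to a scalar, pass through a small two-layer bottleneck, and use sigmoid gates to rescale the channels. A standard rendering in PyTorch follows; the reduction ratio of 16 is the conventional default, not necessarily the paper's setting.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block: squeeze each channel to a
    scalar, excite through a two-layer bottleneck, rescale channels.
    The reduction ratio of 16 is the conventional default."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (batch, C, H, W)
        scale = x.mean(dim=(2, 3))             # squeeze: global average pool
        scale = self.fc(scale)                 # excite: channel gates in (0, 1)
        return x * scale[:, :, None, None]     # rescale feature maps

feats = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(feats).shape)                # torch.Size([2, 64, 32, 32])
```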
(This article belongs to the Topic Machine and Deep Learning)