Mach. Learn. Knowl. Extr., Volume 2, Issue 3 (September 2020) – 12 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open it.
18 pages, 6572 KiB  
Article
Semi-Supervised Adversarial Variational Autoencoder
by Ryad Zemouri
Mach. Learn. Knowl. Extr. 2020, 2(3), 361-378; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030020 - 06 Sep 2020
Cited by 16 | Viewed by 4977
Abstract
We present a method to improve the reconstruction and generation performance of a variational autoencoder (VAE) by injecting adversarial learning. Instead of comparing the reconstructed data with the original data to calculate the reconstruction loss, we use a consistency principle for deep features. The main contributions are threefold. Firstly, our approach effectively combines the two models, i.e., GAN and VAE, and thus improves the generation and reconstruction performance of the VAE. Secondly, the VAE training is done in two steps, which allows us to dissociate the constraints used for the construction of the latent space, on the one hand, from those used for the training of the decoder, on the other. Through this two-step learning process, our method can be applied more widely beyond image processing. While training the encoder, the label information is integrated to better structure the latent space in a supervised way. The third contribution is to use the trained encoder for the consistency principle on deep features extracted from its hidden layers. We present experimental results showing that our method performs better than the original VAE. The results demonstrate that the adversarial constraints allow the decoder to generate images that are more authentic and realistic than those of the conventional VAE.
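A minimal PyTorch sketch of the deep-feature consistency idea from the abstract; the encoder architecture, layer sizes, and random inputs are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

# Hypothetical encoder exposing a hidden layer; all shapes are illustrative.
class Encoder(nn.Module):
    def __init__(self, in_dim=784, hid=256, z_dim=32):
        super().__init__()
        self.h1 = nn.Linear(in_dim, hid)
        self.mu = nn.Linear(hid, z_dim)

    def forward(self, x):
        f = torch.relu(self.h1(x))   # deep feature used for consistency
        return self.mu(f), f

def feature_consistency_loss(encoder, x, x_rec):
    # Compare deep features of the original and the reconstruction
    # instead of a pixel-wise reconstruction loss.
    _, f_orig = encoder(x)
    _, f_rec = encoder(x_rec)
    return torch.mean((f_orig - f_rec) ** 2)

enc = Encoder()
x = torch.rand(8, 784)
x_rec = torch.rand(8, 784)           # stand-in for a decoder output
print(feature_consistency_loss(enc, x, x_rec).item())
```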
14 pages, 914 KiB  
Article
Exploring the Eating Disorder Examination Questionnaire, Clinical Impairment Assessment, and Autism Quotient to Identify Eating Disorder Vulnerability: A Cluster Analysis
by Natalia Stewart Rosenfield and Erik Linstead
Mach. Learn. Knowl. Extr. 2020, 2(3), 347-360; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030019 - 02 Sep 2020
Cited by 1 | Viewed by 3032
Abstract
Eating disorders are very complicated, and many factors play a role in their manifestation. Furthermore, due to the variability in diagnosis and symptoms, treatment for an eating disorder is unique to the individual. As a result, there are numerous assessment tools available, ranging from brief survey questionnaires to in-depth interviews conducted by a professional. One of the many benefits of using machine learning is that it offers researchers new insight into datasets, particularly when compared to traditional statistical methods. The aim of this paper was to employ k-means clustering to explore the Eating Disorder Examination Questionnaire, Clinical Impairment Assessment, and Autism Quotient scores. The goal is to identify prevalent cluster topologies in the data, using the truth data to validate the identified groupings. Our results show that a model with k = 2 performs best and clusters the dataset most appropriately. The clustering matches our truth-data group labels, with a model accuracy of 78.125%, indicating that the model performs well. We see that the Eating Disorder Examination Questionnaire (EDE-Q) and Clinical Impairment Assessment (CIA) scores are, in fact, important discriminators of eating disorder behavior.
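A brief scikit-learn sketch of the clustering setup described above; the synthetic scores and two-group structure are stand-ins for the study's data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical scores: columns are EDE-Q, CIA, and AQ totals per participant.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([1.0, 10, 20], 1.0, (32, 3)),   # low-risk-like group
               rng.normal([4.0, 30, 25], 1.0, (32, 3))])  # high-risk-like group
y_true = np.array([0] * 32 + [1] * 32)                    # "truth" labels

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X))

# Cluster IDs are arbitrary, so score both possible label assignments.
acc = max(np.mean(labels == y_true), np.mean(labels != y_true))
print(f"clustering accuracy vs. truth labels: {acc:.3f}")
```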
20 pages, 3246 KiB  
Article
Beyond Cross-Validation—Accuracy Estimation for Incremental and Active Learning Models
by Christian Limberg, Heiko Wersing and Helge Ritter
Mach. Learn. Knowl. Extr. 2020, 2(3), 327-346; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030018 - 01 Sep 2020
Cited by 5 | Viewed by 2964
Abstract
For incremental machine-learning applications, it is often important to robustly estimate system accuracy during training, especially if humans perform the supervised teaching. Cross-validation and the interleaved test/train error are the standard supervised approaches here. We propose a novel semi-supervised accuracy estimation approach that clearly outperforms these two methods. We introduce the Configram Estimation (CGEM) approach to predict the accuracy of any classifier that delivers confidences. By calculating classification confidences for unseen samples, it is possible to train an offline regression model capable of predicting the classifier's accuracy on novel data in a semi-supervised fashion. We evaluate our method with several diverse classifiers and on analytical and real-world benchmark data sets for both incremental and active learning. The results show that our novel method improves accuracy estimation over standard methods and requires less supervised training data after deployment of the model. We demonstrate the application of our approach to a challenging robot object-recognition task, where the human teacher can use our method to judge when training is sufficient.
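A loose approximation of the confidence-based estimation idea; the histogram features, simulated batches, and regressor choice are assumptions, not the CGEM specification:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Summarize a batch of classifier confidences as a histogram and regress
# the batch accuracy on it (a sketch of the general idea).
def configram(confidences, bins=10):
    hist, _ = np.histogram(confidences, bins=bins, range=(0.0, 1.0))
    return hist / len(confidences)

rng = np.random.default_rng(1)
X_feat, y_acc = [], []
for _ in range(200):                        # simulated labeled batches
    conf = rng.beta(a=rng.uniform(1, 8), b=2, size=100)
    correct = rng.random(100) < conf        # higher confidence -> more correct
    X_feat.append(configram(conf))
    y_acc.append(correct.mean())

reg = RandomForestRegressor(random_state=0).fit(X_feat, y_acc)

# At deployment: estimate accuracy on an unlabeled batch from confidences alone.
new_conf = rng.beta(a=5, b=2, size=100)
print(f"estimated accuracy: {reg.predict([configram(new_conf)])[0]:.3f}")
```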
20 pages, 3916 KiB  
Article
Semantic Predictive Coding with Arbitrated Generative Adversarial Networks
by Radamanthys Stivaktakis, Grigorios Tsagkatakis and Panagiotis Tsakalides
Mach. Learn. Knowl. Extr. 2020, 2(3), 307-326; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030017 - 25 Aug 2020
Cited by 2 | Viewed by 2953
Abstract
In spatio-temporal predictive coding problems, like next-frame prediction in video, determining the content of plausible future frames is primarily based on the image dynamics of previous frames. We establish an alternative approach based on the underlying semantic information when considering data that do not necessarily incorporate a temporal aspect but instead comply with some form of associative ordering. In this work, we introduce the notion of semantic predictive coding by proposing a novel generative adversarial modeling framework that incorporates an arbiter classifier as a new component. While the generator is primarily tasked with anticipating possible next frames, the arbiter's principal role is to assess their credibility. Taking into account that the denotative meaning of each forthcoming element can be encapsulated in a generic label descriptive of its content, a classification loss is introduced along with the adversarial loss. As supported by our experimental findings in a next-digit and a next-letter scenario, the utilization of the arbiter not only results in enhanced GAN performance, but also broadens the network's creative capabilities in terms of the diversity of the generated symbols.
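An illustrative PyTorch sketch of augmenting a generator objective with an arbiter classification term; the loss weighting, tensor shapes, and symbol count are assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

# The usual adversarial loss is augmented with a classification loss
# from a separate "arbiter" classifier that judges the generated frame.
bce = nn.BCEWithLogitsLoss()
ce = nn.CrossEntropyLoss()

def generator_loss(disc_logits, arbiter_logits, target_labels, lam=1.0):
    adv = bce(disc_logits, torch.ones_like(disc_logits))  # fool discriminator
    sem = ce(arbiter_logits, target_labels)               # match intended label
    return adv + lam * sem

d = torch.randn(8, 1)              # discriminator outputs for generated frames
a = torch.randn(8, 10)             # arbiter class logits (10 symbols assumed)
y = torch.randint(0, 10, (8,))     # labels of the intended next symbols
print(generator_loss(d, a, y).item())
```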
24 pages, 52749 KiB  
Article
A Hybrid Artificial Neural Network to Estimate Soil Moisture Using SWAT+ and SMAP Data
by Katherine H. Breen, Scott C. James, Joseph D. White, Peter M. Allen and Jeffery G. Arnold
Mach. Learn. Knowl. Extr. 2020, 2(3), 283-306; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030016 - 21 Aug 2020
Cited by 7 | Viewed by 3379
Abstract
In this work, we developed a data-driven framework to predict near-surface (0–5 cm) soil moisture (SM) by mapping inputs from the Soil & Water Assessment Tool to SM time series from NASA's Soil Moisture Active Passive (SMAP) satellite for the period 1 January 2016–31 December 2018. We developed a hybrid artificial neural network (ANN) combining long short-term memory and multilayer perceptron networks to incorporate dynamic weather and static spatial data, respectively, into the training algorithm. We evaluated the generalizability of the hybrid ANN using training datasets comprising several watersheds with different environmental conditions, examined the effects of standard and physics-guided loss functions, and experimented with feature augmentation. Our model could estimate SM on par with the accuracy of SMAP. We demonstrated that the most critical learning of the physical processes governing SM variability came from the meteorological time series, and that additional physical context supported model performance when test data were not fully encapsulated by the variability of the training data. Additionally, we found that when forecasting SM from trends learned during the earlier training period, the models captured the seasonal trends.
(This article belongs to the Special Issue Explainable Machine Learning)
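A minimal PyTorch sketch of the hybrid LSTM/MLP branching described above; all layer sizes, feature counts, and window lengths are illustrative assumptions:

```python
import torch
import torch.nn as nn

# An LSTM branch consumes dynamic weather series; an MLP branch consumes
# static spatial attributes; the two are merged before the SM head.
class HybridSM(nn.Module):
    def __init__(self, n_weather=5, n_static=8, hid=32):
        super().__init__()
        self.lstm = nn.LSTM(n_weather, hid, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(n_static, hid), nn.ReLU())
        self.head = nn.Linear(2 * hid, 1)    # near-surface SM estimate

    def forward(self, weather_seq, static_feats):
        _, (h, _) = self.lstm(weather_seq)   # final hidden state of sequence
        merged = torch.cat([h[-1], self.mlp(static_feats)], dim=1)
        return self.head(merged)

model = HybridSM()
out = model(torch.rand(4, 30, 5), torch.rand(4, 8))  # 30-day weather windows
print(out.shape)  # torch.Size([4, 1])
```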
12 pages, 2828 KiB  
Article
Digit Recognition Based on Specialization, Decomposition and Holistic Processing
by Michael Joseph and Khaled Elleithy
Mach. Learn. Knowl. Extr. 2020, 2(3), 271-282; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030015 - 18 Aug 2020
Viewed by 2838
Abstract
With the introduction of the Convolutional Neural Network (CNN) and other classical algorithms, facial and object recognition have made significant progress. However, in situations where there are few labeled examples or the environment is not ideal (for example, poor lighting conditions or unusual orientations), performance is disappointing. Various methods, such as data augmentation and image registration, have been used in an effort to improve accuracy; nonetheless, performance remains far from human efficiency. Advances in cognitive science have provided valuable insight into how humans achieve high accuracy in identifying and discriminating between different faces and objects. This research helps us understand how the brain uses facial features to form a holistic representation and subsequently uses it to discriminate between faces. Our objective and contribution in this paper is to introduce a computational model that leverages these techniques, used by the brain, to improve robustness and recognition accuracy. The hypothesis is that the biological model, our brain, achieves such high efficiency in face recognition because it uses a two-step process. We therefore postulate that, in the case of handwritten digits, it will be easier for a learning model to learn invariant features and generate a holistic representation than to perform classification directly. The model uses a variational autoencoder to generate holistic representations of handwritten digits and a Neural Network (NN) to classify them. The results obtained in this research show the effectiveness of decomposing the recognition task into two specialized sub-tasks: a generator and a classifier.
(This article belongs to the Section Network)
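A compact PyTorch sketch of the two-step decomposition; the architectures and dimensions are illustrative assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn

# Step 1: a (pre-trained) VAE encoder maps a digit image to a holistic
# latent code. Step 2: a small NN classifies the code.
class VAEEncoder(nn.Module):
    def __init__(self, z_dim=16):
        super().__init__()
        self.body = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)
        self.logvar = nn.Linear(128, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

classifier = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))

x = torch.rand(4, 1, 28, 28)       # batch of digit images
mu, _ = VAEEncoder()(x)            # step 1: holistic representation
logits = classifier(mu)            # step 2: specialized classification
print(logits.argmax(dim=1))
```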
15 pages, 4574 KiB  
Article
Impact of Uncertainty in the Input Variables and Model Parameters on Predictions of a Long Short Term Memory (LSTM) Based Sales Forecasting Model
by Shakti Goel and Rahul Bajpai
Mach. Learn. Knowl. Extr. 2020, 2(3), 256-270; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030014 - 15 Aug 2020
Cited by 8 | Viewed by 3595
Abstract
A Long Short Term Memory (LSTM) based sales model has been developed to forecast the global sales of the hotel business of Travel Boutique Online Holidays (TBO Holidays). The LSTM model is a multivariate model; input to the model includes several independent variables in addition to a dependent variable, viz., sales from the previous step. One of the input variables, "number of active bookers per day", must be estimated for the same day as the sales. This requires a second LSTM model to predict the number of active bookers per day, whose prediction is then used as an input to the sales forecasting model. Using a predicted variable as an input to another model increases the chance of uncertainty entering the system. This paper discusses the variability observed in sales predictions under various uncertainties, or noise, arising from the estimation of the number of active bookers. For the purposes of this study, different noise distributions, such as the normal, uniform, and logistic distributions, are used, among others. Analyses of the predictions demonstrate that adding uncertainty to the number of active bookers via dropouts, as well as to the lagged sales variables, leads to model predictions that are close to the observations. The least-squared error between observations and predictions is higher for uncertainties modeled using the other distributions (without dropouts), with the worst predictions arising from the Gumbel noise distribution. Gaussian noise added directly to the weight matrix yields the best results (minimum prediction errors). One explanation for this uncertainty could be that the global minimum of the least-squared objective function with respect to the model weight matrix is not reached, and therefore the model parameters are not optimal. The two LSTM models used in series are also used to study the impact of the coronavirus on global sales. By introducing a new variable called the coronavirus impact variable, the LSTM models can predict corona-affected sales within five percent (5%) of the actuals. The research discussed in this paper finds LSTM models to be effective tools for the travel industry, as they successfully model the trends in sales and can also be reliably used to simulate various hypothetical scenarios.
(This article belongs to the Section Network)
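A small PyTorch sketch of the weight-noise experiment described above, i.e., perturbing a model's weight matrices with Gaussian noise and measuring the spread of its predictions; the model (an untrained stand-in here), noise scale, and inputs are all assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.LSTM(input_size=6, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
x = torch.rand(1, 14, 6)           # 14 days of sales-related inputs

preds = []
for _ in range(50):
    with torch.no_grad():
        # Copy the model with Gaussian noise added to every parameter tensor.
        noisy = nn.LSTM(6, 16, batch_first=True)
        noisy.load_state_dict({k: v + 0.01 * torch.randn_like(v)
                               for k, v in model.state_dict().items()})
        out, _ = noisy(x)
        preds.append(head(out[:, -1]).item())

preds = torch.tensor(preds)
print(f"prediction mean={preds.mean():.3f}  spread (std)={preds.std():.3f}")
```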
23 pages, 1169 KiB  
Article
Attributed Relational SIFT-Based Regions Graph: Concepts and Applications
by Mario Manzo
Mach. Learn. Knowl. Extr. 2020, 2(3), 233-255; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030013 - 06 Aug 2020
Cited by 7 | Viewed by 2867
Abstract
In the real world, structured data are increasingly represented by graphs. The applications concern the most varied fields, and the data need to be represented in terms of local and spatial connections. In this scenario, the goal is to provide a structure for the representation of a digital image, called the Attributed Relational SIFT-based Regions Graph (ARSRG), introduced in earlier work. The ARSRG has not previously been described in detail, so it is important to explore its lesser-known aspects. In this regard, the goal is twofold: first, theoretical, providing formal definitions, not specified previously, that clarify its structural configuration; second, experimental, providing key elements about its adaptability and flexibility across different applications. The combination of the theoretical and experimental views highlights how the ARSRG adapts to the representation of images with varied contents.
(This article belongs to the Section Learning)
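A loose sketch of an attributed region graph in the spirit of the ARSRG (not its exact construction), using SLIC superpixels, OpenCV SIFT keypoints, and a networkx graph; the stand-in image and all parameters are assumptions:

```python
import numpy as np
import cv2
import networkx as nx
from skimage.segmentation import slic

# Segment the image into regions, attach SIFT keypoints to the region
# containing them, and connect spatially adjacent regions with edges.
img = (np.random.rand(128, 128, 3) * 255).astype(np.uint8)  # stand-in image
regions = slic(img, n_segments=20, start_label=0)

g = nx.Graph()
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
kps, _ = cv2.SIFT_create().detectAndCompute(gray, None)
for kp in kps:
    c, r = int(kp.pt[0]), int(kp.pt[1])
    rid = int(regions[r, c])
    g.add_node(rid)
    g.nodes[rid].setdefault("keypoints", []).append(kp.pt)

# Adjacency: regions that touch horizontally or vertically share an edge.
for a, b in zip(regions[:, :-1].ravel(), regions[:, 1:].ravel()):
    if a != b:
        g.add_edge(int(a), int(b))
for a, b in zip(regions[:-1, :].ravel(), regions[1:, :].ravel()):
    if a != b:
        g.add_edge(int(a), int(b))

print(g.number_of_nodes(), "regions,", g.number_of_edges(), "adjacencies")
```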
17 pages, 2310 KiB  
Article
Hierarchy-Based File Fragment Classification
by Manish Bhatt, Avdesh Mishra, Md Wasi Ul Kabir, S. E. Blake-Gatto, Rishav Rajendra, Md Tamjidul Hoque and Irfan Ahmed
Mach. Learn. Knowl. Extr. 2020, 2(3), 216-232; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030012 - 03 Aug 2020
Cited by 10 | Viewed by 4040
Abstract
File fragment classification is an essential problem in digital forensics. Although several attempts have been made to solve this challenging problem, a general solution has not been found. In this work, we propose a hierarchical machine-learning-based approach with optimized support vector machines (SVM) as the base classifiers for file fragment classification. This approach consists of more general classifiers at the top level and more specialized fine-grained classifiers at the lower levels of the hierarchy. We also propose a primitive taxonomy for file types that can be used to perform hierarchical classification. We evaluate our model on a dataset of 14 file types, with 1000 fragments of 512 bytes from each file type, derived from a subset of the publicly available Digital Corpora govdocs1 corpus. Our experiment shows results comparable to the present literature, with an average accuracy of 67.78% and an F1-measure of 65% using 10-fold cross-validation. We then improve the hierarchy and obtain better results, with a 1% increase in the F1-measure. Finally, we present our assessment and observations, then conclude by discussing the scope of future research.
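A toy scikit-learn sketch of the hierarchical scheme, with a coarse top-level SVM routing fragments to per-group fine-grained SVMs; the taxonomy, byte-frequency features, and synthetic fragments are all invented assumptions:

```python
import numpy as np
from sklearn.svm import SVC

# Byte-frequency histogram as the feature vector for a 512-byte fragment.
def byte_hist(fragment):
    counts = np.bincount(np.frombuffer(fragment, dtype=np.uint8), minlength=256)
    return counts / len(fragment)

rng = np.random.default_rng(0)
def fake_fragment(bias):              # toy 512-byte fragments per file type
    return rng.integers(0, bias, 512, dtype=np.uint8).tobytes()

types = {"txt": 96, "csv": 64, "png": 256, "jpg": 224}   # toy byte ranges
group_of = {"txt": "text", "csv": "text", "png": "binary", "jpg": "binary"}

X, y_group, y_type = [], [], []
for t, bias in types.items():
    for _ in range(50):
        X.append(byte_hist(fake_fragment(bias)))
        y_group.append(group_of[t])
        y_type.append(t)
X = np.array(X)

top = SVC().fit(X, y_group)           # coarse, top-level classifier
fine = {g: SVC().fit(X[[gg == g for gg in y_group]],
                     [t for t, gg in zip(y_type, y_group) if gg == g])
        for g in set(y_group)}        # one fine-grained classifier per group

frag = byte_hist(fake_fragment(64)).reshape(1, -1)
print(fine[top.predict(frag)[0]].predict(frag)[0])
```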
24 pages, 1106 KiB  
Article
Monitoring Users’ Behavior: Anti-Immigration Speech Detection on Twitter
by Nikolaos Pitropakis, Kamil Kokot, Dimitra Gkatzia, Robert Ludwiniak, Alexios Mylonas and Miltiadis Kandias
Mach. Learn. Knowl. Extr. 2020, 2(3), 192-215; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030011 - 03 Aug 2020
Cited by 12 | Viewed by 5105
Abstract
The proliferation of social media platforms has changed the way people interact online. However, engagement with social media comes at a price: the users' privacy. Breaches of users' privacy, such as the Cambridge Analytica scandal, reveal how users' data can be weaponized in political campaigns, which often trigger hate speech and anti-immigration views. Hate speech detection is a challenging task due to the different sources of hate, which can affect the language used, as well as the lack of relevant annotated data. To tackle this, we collected and manually annotated an immigration-related dataset of publicly available Tweets in UK, US, and Canadian English. In an empirical study, we explored anti-immigration speech detection utilizing various language features (word n-grams, character n-grams) and measured their impact on a number of trained classifiers. Our work demonstrates that using word n-grams results in higher precision, recall, and F-score than character n-grams. Finally, we discuss the implications of these results for future work on hate-speech detection and social media data analysis in general.
(This article belongs to the Section Privacy)
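A short scikit-learn sketch comparing word and character n-gram features; the annotated tweet corpus is not reproduced here, so the texts and labels below are invented placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Toy stand-in data: 1 = anti-immigration, 0 = neutral/positive.
texts = ["they should all go back", "welcome to our community",
         "send them home now", "glad refugees are safe here"] * 25
labels = [1, 0, 1, 0] * 25

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, random_state=0)
for name, vec in [("word 1-2 grams", TfidfVectorizer(ngram_range=(1, 2))),
                  ("char 2-4 grams", TfidfVectorizer(analyzer="char",
                                                     ngram_range=(2, 4)))]:
    clf = LogisticRegression().fit(vec.fit_transform(X_tr), y_tr)
    print(name, f1_score(y_te, clf.predict(vec.transform(X_te))))
```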
20 pages, 7785 KiB  
Article
Focal Liver Lesion Detection in Ultrasound Image Using Deep Feature Fusions and Super Resolution
by Rafid Mostafiz, Mohammad Motiur Rahman, A. K. M. Kamrul Islam and Saeid Belkasim
Mach. Learn. Knowl. Extr. 2020, 2(3), 172-191; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030010 - 09 Jul 2020
Cited by 17 | Viewed by 3615
Abstract
This research presents a machine vision approach to detecting lesions in liver ultrasound while also addressing common ultrasound issues such as artifacts, speckle noise, and blurring. Anisotropic diffusion is modified using edge-preservation conditions, which were found to perform better than traditional methods in quantitative evaluation. To extract more potential information, a learnable super-resolution (SR) module is embedded into the deep CNN. Features are fused using the Gabor Wavelet Transform (GWT) and Local Binary Patterns (LBP) with a pre-trained deep CNN model. Moreover, we propose a Bayes-rule-based informative patch selection approach to reduce processing time by operating on selected image patches, and design an algorithm to mark the lesion region in identified ultrasound image patches. The model is trained on standard data with reliable resolution, while the testing phase uses more general data with varying resolution to assess performance. Exploring cross-validation, we find that a 5-fold strategy successfully mitigates overfitting. Experimental data were collected from 298 consecutive ultrasounds comprising 15,296 image patches. The proposed feature fusion technique achieves satisfactory performance compared to current relevant work, with an accuracy of 98.40%.
(This article belongs to the Section Learning)
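A brief sketch of the feature-fusion step, concatenating an LBP histogram, Gabor responses, and stand-in CNN features into a single vector a classifier could consume; the patch data and parameters are illustrative:

```python
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.filters import gabor

patch = np.random.rand(64, 64)                  # stand-in ultrasound patch

# LBP texture histogram.
lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=10, density=True)

# Gabor responses at four orientations, summarized by mean and variance.
gabor_feats = []
for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
    real, _ = gabor(patch, frequency=0.3, theta=theta)
    gabor_feats += [real.mean(), real.var()]

# Placeholder for features from a pre-trained deep CNN.
cnn_feats = np.random.rand(128)

fused = np.concatenate([lbp_hist, gabor_feats, cnn_feats])
print(fused.shape)                              # one fused feature vector
```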
25 pages, 1083 KiB  
Article
Claim Consistency Checking Using Soft Logic
by Nouf Bindris, Nello Cristianini and Jonathan Lawry
Mach. Learn. Knowl. Extr. 2020, 2(3), 147-171; https://0-doi-org.brum.beds.ac.uk/10.3390/make2030009 - 06 Jul 2020
Viewed by 3190
Abstract
Increasing concerns about the prevalence of false information and fake news have led to calls for automated fact-checking systems that are capable of verifying the truthfulness of statements, especially on the internet. Most previous automated fact-checking systems have focused only on grammar rules for determining the properties of the language used in statements. Here, we demonstrate a novel approach to fact-checking natural language text that combines the following techniques: knowledge extraction to establish a knowledge base; logical inference to check claims not explicitly mentioned in the text, by verifying the consistency of a set of beliefs against established trusted knowledge; and a re-querying approach that enables continuous learning. This approach addresses the limitations of existing automated fact-checking systems: it investigates the consistency of presented facts or claims using probabilistic soft logic and a knowledge base that is continuously updated through continuous-learning strategies. We demonstrate the approach on the task of checking facts about family-tree relationships against a corpus of web resources concerned with the UK Royal Family.
(This article belongs to the Section Data)
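A minimal sketch in the spirit of probabilistic soft logic (not the authors' full system), where truth values lie in [0, 1] and rule satisfaction follows Lukasiewicz logic; the beliefs and rule below are toy examples:

```python
# "A -> B" is satisfied to degree min(1, 1 - A + B) in Lukasiewicz logic.
def implies(a, b):
    return min(1.0, 1.0 - a + b)

# Toy beliefs about family-tree facts (values are illustrative).
beliefs = {
    ("parent", "Elizabeth", "Charles"): 0.95,
    ("parent", "Charles", "William"): 0.90,
    ("grandparent", "Elizabeth", "William"): 0.30,   # suspect claim
}

# Rule: parent(X,Y) & parent(Y,Z) -> grandparent(X,Z),
# with conjunction max(0, a + b - 1) (the Lukasiewicz t-norm).
body = max(0.0, beliefs[("parent", "Elizabeth", "Charles")]
                + beliefs[("parent", "Charles", "William")] - 1.0)
head = beliefs[("grandparent", "Elizabeth", "William")]

sat = implies(body, head)
print(f"rule satisfaction: {sat:.2f}")   # a low value flags an inconsistency
```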
