Theory and Applications of Information Theoretic Machine Learning

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (15 December 2020) | Viewed by 166651

Special Issue Editors

Assist. Prof. Sotiris Kotsiantis
Department of Mathematics, University of Patras, GR 265-00 Patras, Greece
Interests: machine learning; data mining; knowledge discovery; data science

Assoc. Prof. Dimitris Kalles
School of Science and Technology, Hellenic Open University, Patra, Greece
Interests: machine learning; artificial intelligence; educational intelligence; educational technology

Assoc. Prof. Christos Makris
Department of Computer Engineering and Informatics, University of Patras, 26504 Patras, Greece
Interests: data structures; information retrieval; data mining; bioinformatics; string algorithmics; computational geometry; multimedia databases; internet technologies

Special Issue Information

Dear Colleagues,

At present, the software industry worldwide is looking for ways to apply the principles of data science and data analytics to difficult problems in many fields. Machine learning and data analytics principles, methods, and techniques can help address new problems and uncover improved solutions. This Special Issue aims to bring together applications of machine learning in various interdisciplinary domains and areas of interest, such as data mining, data analytics, and data science, covering a wide landscape of methods, methodologies, and techniques that can be applied to obtain productive results. The aims of this Special Issue are: (1) to present state-of-the-art research on data mining and machine learning; and (2) to provide a forum for researchers to discuss the latest progress, new research methodologies, and potential research topics. All submissions should explain the role of entropy or information theory in the work presented. Topics of interest include, but are not limited to, classification, regression and prediction, clustering, kernel methods, data mining, web mining, information retrieval, natural language processing, deep learning, probabilistic models and methods, vision and speech perception, bioinformatics, streaming data, and industrial, financial, and educational applications. Papers will be evaluated on their originality, presentation, relevance, and contribution, as well as on their suitability and quality in terms of both technical content and writing. Submitted papers must be written in English and describe original research that has not been published and is not currently under review by any other journal or conference.

Assist. Prof. Sotiris Kotsiantis
Assoc. Prof. Dimitris Kalles
Assoc. Prof. Christos Makris
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • data mining
  • computational intelligence
  • learning analytics
  • artificial intelligence
  • educational intelligence
  • educational technology
  • information retrieval
  • bioinformatics

Published Papers (14 papers)

Research

19 pages, 2944 KiB  
Article
Predicting Fraud Victimization Using Classical Machine Learning
by Mark Lokanan and Susan Liu
Entropy 2021, 23(3), 300; https://0-doi-org.brum.beds.ac.uk/10.3390/e23030300 - 03 Mar 2021
Cited by 11 | Viewed by 3152
Abstract
Protecting financial consumers from investment fraud has been a recurring problem in Canada. The purpose of this paper is to predict the demographic characteristics of investors who are likely to be victims of investment fraud. Data for this paper came from the Investment Industry Regulatory Organization of Canada’s (IIROC) database between January of 2009 and December of 2019. In total, 4575 investors were coded as victims of investment fraud. The study employed a machine-learning algorithm to predict the probability of fraud victimization. The machine learning model deployed in this paper predicted the typical demographic profile of fraud victims as investors who classify as female, have poor financial knowledge, know the advisor from the past, and are retired. Investors who are characterized as having limited financial literacy but a long-time relationship with their advisor have reduced probabilities of being victimized. However, male investors with low or moderate-level investment knowledge were more likely to be preyed upon by their investment advisors. While not statistically significant, older adults, in general, are at greater risk of being victimized. The findings from this paper can be used by Canadian self-regulatory organizations and securities commissions to inform their investors’ protection mandates. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
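
As a rough, hypothetical illustration of the classical machine-learning setup described in this abstract, the sketch below fits a single scikit-learn classifier to synthetic demographic features; the column names, data, and choice of logistic regression are assumptions for illustration, not the authors' pipeline.

```python
# Hypothetical sketch: a classical classifier for fraud victimization on synthetic
# demographic features (column names and data are illustrative assumptions).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "female": rng.integers(0, 2, n),
    "retired": rng.integers(0, 2, n),
    "knows_advisor": rng.integers(0, 2, n),          # investor knew the advisor beforehand
    "financial_knowledge": rng.integers(0, 3, n),    # 0 = poor, 1 = moderate, 2 = good
    "victim": rng.integers(0, 2, n),                 # 1 = coded as a fraud victim
})

X, y = df.drop(columns="victim"), df["victim"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Coefficient signs indicate which demographic traits raise the predicted victimization risk.
print(dict(zip(X.columns, clf.coef_[0].round(3))))
print(classification_report(y_te, clf.predict(X_te)))
```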

12 pages, 626 KiB  
Article
Malicious URL Detection Based on Associative Classification
by Sandra Kumi, ChaeHo Lim and Sang-Gon Lee
Entropy 2021, 23(2), 182; https://0-doi-org.brum.beds.ac.uk/10.3390/e23020182 - 31 Jan 2021
Cited by 21 | Viewed by 4753
Abstract
Cybercriminals use malicious URLs as distribution channels to propagate malware over the web. Attackers exploit vulnerabilities in browsers to install malware to have access to the victim’s computer remotely. The purpose of most malware is to gain access to a network, ex-filtrate sensitive information, and secretly monitor targeted computer systems. In this paper, a data mining approach known as classification based on association (CBA) to detect malicious URLs using URL and webpage content features is presented. The CBA algorithm uses a training dataset of URLs as historical data to discover association rules to build an accurate classifier. The experimental results show that CBA gives comparable performance against benchmark classification algorithms, achieving 95.8% accuracy with low false positive and negative rates. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
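
The sketch below illustrates the general CBA idea named in this abstract, under toy data and assumed feature names: class association rules are mined with mlxtend, ranked by confidence, and applied as a first-matching-rule classifier. It is not the paper's implementation.

```python
# Illustrative CBA-style classifier: mine class association rules, rank them by
# confidence and support, and classify a URL by the first rule whose antecedent it satisfies.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Boolean URL/page features plus one-hot class columns (all names are hypothetical).
data = pd.DataFrame({
    "has_ip_host":     [1, 1, 0, 0, 1, 0],
    "long_url":        [1, 0, 0, 1, 1, 0],
    "has_iframe":      [1, 1, 0, 0, 0, 0],
    "class=malicious": [1, 1, 0, 0, 1, 0],
    "class=benign":    [0, 0, 1, 1, 0, 1],
}).astype(bool)

itemsets = apriori(data, min_support=0.3, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.8)

# Keep only class association rules (the consequent is exactly one class label).
cars = rules[rules["consequents"].apply(
    lambda c: len(c) == 1 and next(iter(c)).startswith("class="))]
cars = cars.sort_values(["confidence", "support"], ascending=False)

def classify(url_features: set) -> str:
    """Return the class of the highest-ranked rule matching the URL's feature set."""
    for _, r in cars.iterrows():
        if set(r["antecedents"]) <= url_features:
            return next(iter(r["consequents"])).split("=", 1)[1]
    return "benign"  # default class when no rule fires

print(classify({"has_ip_host", "has_iframe"}))
```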

12 pages, 489 KiB  
Article
Who Will Score? A Machine Learning Approach to Supporting Football Team Building and Transfers
by Bartosz Ćwiklinski, Agata Giełczyk and Michał Choraś
Entropy 2021, 23(1), 90; https://0-doi-org.brum.beds.ac.uk/10.3390/e23010090 - 10 Jan 2021
Cited by 11 | Viewed by 5578
Abstract
Background: the machine learning (ML) techniques have been implemented in numerous applications, including health-care, security, entertainment, and sports. In this article, we present how the ML can be used for building a professional football team and planning player transfers. Methods: in this research, we defined numerous parameters for player assessment, and three definitions of a successful transfer. We used the Random Forest, Naive Bayes, and AdaBoost algorithms in order to predict the player transfer success. We used realistic, publicly available data in order to train and test the classifiers. Results: in the article, we present numerous experiments; they differ in the weights of parameters, the successful transfer definitions, and other factors. We report promising results (accuracy = 0.82, precision = 0.84, recall = 0.82, and F1-score = 0.83). Conclusion: the presented research proves that machine learning can be helpful in professional football team building. The proposed algorithm will be developed in the future and it may be implemented as a professional tool for football talent scouts. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
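
A minimal, hypothetical sketch of the comparison this abstract describes, running the three named scikit-learn classifiers on synthetic stand-ins for the player-assessment features; the feature meanings and the success label are placeholders.

```python
# Sketch: compare Random Forest, Naive Bayes, and AdaBoost on a synthetic
# "successful transfer" prediction task (features and label are placeholders).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Hypothetical per-player features: goals, assists, minutes played, age, market value.
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

for name, clf in [("Random Forest", RandomForestClassifier(random_state=0)),
                  ("Naive Bayes", GaussianNB()),
                  ("AdaBoost", AdaBoostClassifier(random_state=0))]:
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: mean F1 = {f1:.2f}")
```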

18 pages, 687 KiB  
Article
Machine Learning Algorithms for Prediction of the Quality of Transmission in Optical Networks
by Stanisław Kozdrowski, Paweł Cichosz, Piotr Paziewski and Sławomir Sujecki
Entropy 2021, 23(1), 7; https://0-doi-org.brum.beds.ac.uk/10.3390/e23010007 - 22 Dec 2020
Cited by 12 | Viewed by 2815
Abstract
Increasing demand in the backbone Dense Wavelength Division (DWDM) Multiplexing network traffic prompts an introduction of new solutions that allow increasing the transmission speed without significant increase of the service cost. In order to achieve this objective simpler and faster, DWDM network reconfiguration procedures are needed. A key problem that is intrinsically related to network reconfiguration is that of the quality of transmission assessment. Thus, in this contribution a Machine Learning (ML) based method for an assessment of the quality of transmission is proposed. The proposed ML methods use a database, which was created only on the basis of information that is available to a DWDM network operator via the DWDM network control plane. Several types of ML classifiers are proposed and their performance is tested and compared for two real DWDM network topologies. The results obtained are promising and motivate further research. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)

17 pages, 406 KiB  
Article
Monitoring Volatility Change for Time Series Based on Support Vector Regression
by Sangyeol Lee, Chang Kyeom Kim and Dongwuk Kim
Entropy 2020, 22(11), 1312; https://0-doi-org.brum.beds.ac.uk/10.3390/e22111312 - 17 Nov 2020
Cited by 11 | Viewed by 1859
Abstract
This paper considers monitoring an anomaly from sequentially observed time series with heteroscedastic conditional volatilities based on the cumulative sum (CUSUM) method combined with support vector regression (SVR). The proposed online monitoring process is designed to detect a significant change in volatility of financial time series. The tuning parameters are optimally chosen using particle swarm optimization (PSO). We conduct Monte Carlo simulation experiments to illustrate the validity of the proposed method. A real data analysis with the S&P 500 index, Korea Composite Stock Price Index (KOSPI), and the stock price of Microsoft Corporation is presented to demonstrate the versatility of our model. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
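
A toy sketch of the general idea, not the paper's exact procedure: an SVR predicts one-step conditional volatility and a CUSUM statistic on its residuals flags a change. The simulated series, lag length, reference window, and threshold rule are illustrative assumptions; in the paper, the SVR hyperparameters are tuned by PSO.

```python
# Toy sketch: SVR for one-step conditional volatility, then a CUSUM statistic on the
# monitoring residuals (simulated data; threshold and windows are illustrative choices).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
returns = np.concatenate([rng.normal(0, 1.0, 500), rng.normal(0, 2.0, 200)])  # variance change
sq = returns ** 2                                   # squared returns as a volatility proxy

lag = 5                                             # predict today's value from the last 5
X = np.column_stack([sq[i:len(sq) - lag + i] for i in range(lag)])
y = sq[lag:]

n_train = 400                                       # historical (in-control) training window
svr = SVR(C=10.0, epsilon=0.1).fit(X[:n_train], y[:n_train])  # paper tunes C, epsilon by PSO

resid = y[n_train:] - svr.predict(X[n_train:])      # sequential monitoring residuals
ref = resid[:50]                                    # reference segment to centre the CUSUM
cusum = np.cumsum(resid - ref.mean())
bound = 5 * ref.std() * np.sqrt(np.arange(1, len(cusum) + 1))

hits = np.abs(cusum) > bound
print("change alarm at monitoring step:", int(np.argmax(hits)) if hits.any() else None)
```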

27 pages, 936 KiB  
Article
Bayesian3 Active Learning for the Gaussian Process Emulator Using Information Theory
by Sergey Oladyshkin, Farid Mohammadi, Ilja Kroeker and Wolfgang Nowak
Entropy 2020, 22(8), 890; https://0-doi-org.brum.beds.ac.uk/10.3390/e22080890 - 13 Aug 2020
Cited by 15 | Viewed by 3999
Abstract
Gaussian process emulators (GPE) are a machine learning approach that replicates computational demanding models using training runs of that model. Constructing such a surrogate is very challenging and, in the context of Bayesian inference, the training runs should be well invested. The current paper offers a fully Bayesian view on GPEs for Bayesian inference accompanied by Bayesian active learning (BAL). We introduce three BAL strategies that adaptively identify training sets for the GPE using information-theoretic arguments. The first strategy relies on Bayesian model evidence that indicates the GPE’s quality of matching the measurement data, the second strategy is based on relative entropy that indicates the relative information gain for the GPE, and the third is founded on information entropy that indicates the missing information in the GPE. We illustrate the performance of our three strategies using analytical- and carbon-dioxide benchmarks. The paper shows evidence of convergence against a reference solution and demonstrates quantification of post-calibration uncertainty by comparing the introduced three strategies. We conclude that Bayesian model evidence-based and relative entropy-based strategies outperform the entropy-based strategy because the latter can be misleading during the BAL. The relative entropy-based strategy demonstrates superior performance to the Bayesian model evidence-based strategy. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
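
The sketch below illustrates only the simplest of the three strategies, information-entropy-based selection, where the next model run is the candidate with the largest predictive variance (hence the largest Gaussian predictive entropy) of the GP emulator; the toy model, kernel, and design are assumptions, and the Bayesian-model-evidence and relative-entropy criteria, which also use the measurement likelihood, are not shown.

```python
# Sketch of an entropy-driven active-learning loop for a GP emulator: at each step,
# run the expensive model at the candidate with the largest predictive uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_model(x):                      # stand-in for the computationally demanding model
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(1)
X_train = rng.uniform(0, 3, (4, 1))          # small initial design
y_train = expensive_model(X_train).ravel()
candidates = np.linspace(0, 3, 200).reshape(-1, 1)

for _ in range(10):                          # Bayesian active learning iterations
    gpe = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X_train, y_train)
    _, std = gpe.predict(candidates, return_std=True)
    # Entropy of a Gaussian is 0.5*log(2*pi*e*sigma^2), so maximizing it = maximizing std.
    x_next = candidates[np.argmax(std)].reshape(1, -1)
    X_train = np.vstack([X_train, x_next])
    y_train = np.append(y_train, expensive_model(x_next).ravel())

print("selected training set size:", len(X_train))
```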

23 pages, 833 KiB  
Article
Social Network Analysis and Churn Prediction in Telecommunications Using Graph Theory
by Stefan M. Kostić, Mirjana I. Simić and Miroljub V. Kostić
Entropy 2020, 22(7), 753; https://0-doi-org.brum.beds.ac.uk/10.3390/e22070753 - 09 Jul 2020
Cited by 19 | Viewed by 5551
Abstract
Due to telecommunications market saturation, it is very important for telco operators to always have fresh insights into their customer’s dynamics. In that regard, social network analytics and its application with graph theory can be very useful. In this paper we analyze a social network that is represented by a large telco network graph and perform clustering of its nodes by studying a broad set of metrics, e.g., node in/out degree, first and second order influence, eigenvector, authority and hub values. This paper demonstrates that it is possible to identify some important nodes in our social network (graph) that are vital regarding churn prediction. We show that if such a node leaves a monitored telco operator, customers that frequently interact with that specific node will be more prone to leave the monitored telco operator network as well; thus, by analyzing existing churn and previous call patterns, we proactively predict new customers that will probably churn. The churn prediction results are quantified by using top decile lift metrics. The proposed method is general enough to be readily adopted in any field where homophilic or friendship connections can be assumed as a potential churn driver. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
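
A small, made-up illustration of the graph metrics listed in this abstract and of flagging customers who frequently call a churned, influential node; the call data, weights, and threshold are hypothetical.

```python
# Toy churn-propagation sketch on a made-up call graph using networkx.
import networkx as nx

# Hypothetical call-detail records: (caller, callee, number of calls).
calls = [("a", "b", 30), ("a", "c", 12), ("b", "c", 4), ("d", "a", 25),
         ("e", "a", 18), ("c", "a", 7), ("e", "b", 2)]
G = nx.DiGraph()
G.add_weighted_edges_from(calls)

in_degree = dict(G.in_degree(weight="weight"))
eigen = nx.eigenvector_centrality(G.to_undirected(), weight="weight", max_iter=1000)
hubs, authorities = nx.hits(G, max_iter=1000)

churned = {"a"}                                # subscribers known to have left the operator
# Customers who frequently call a churned, influential node are flagged as churn risks.
at_risk = {u for u, v, w in G.edges(data="weight")
           if v in churned and w >= 15 and u not in churned}
print("influence of churned node a:", round(eigen["a"], 3), round(authorities["a"], 3))
print("customers at elevated churn risk:", at_risk)
```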

18 pages, 3921 KiB  
Article
A Method Based on GA-CNN-LSTM for Daily Tourist Flow Prediction at Scenic Spots
by Wenxing Lu, Haidong Rui, Changyong Liang, Li Jiang, Shuping Zhao and Keqing Li
Entropy 2020, 22(3), 261; https://0-doi-org.brum.beds.ac.uk/10.3390/e22030261 - 25 Feb 2020
Cited by 32 | Viewed by 5162
Abstract
Accurate tourist flow prediction is key to ensuring the normal operation of popular scenic spots. However, one single model cannot effectively grasp the characteristics of the data and make accurate predictions because of the strong nonlinear characteristics of daily tourist flow data. Accordingly, this study predicts daily tourist flow in Huangshan Scenic Spot in China. A prediction method (GA-CNN-LSTM) which combines convolutional neural network (CNN) and long-short-term memory network (LSTM) and optimized by genetic algorithm (GA) is established. First, network search data, meteorological data, and other data are constructed into continuous feature maps. Then, feature vectors are extracted by convolutional neural network (CNN). Finally, the feature vectors are input into long-short-term memory network (LSTM) in time series for prediction. Moreover, GA is used to scientifically select the number of neurons in the CNN-LSTM model. Data is preprocessed and normalized before prediction. The accuracy of GA-CNN-LSTM is evaluated using mean absolute percentage error (MAPE), mean absolute error (MAE), Pearson correlation coefficient and index of agreement (IA). For a fair comparison, GA-CNN-LSTM model is compared with CNN-LSTM, LSTM, CNN and the back propagation neural network (BP). The experimental results show that GA-CNN-LSTM model is approximately 8.22% higher than CNN-LSTM on the performance of MAPE. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
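
A Keras skeleton of a CNN-LSTM forecaster of the kind described above; the layer sizes that the paper selects with a genetic algorithm are fixed placeholders here, and the input features are random dummies that only demonstrate the expected shapes.

```python
# Skeleton of a CNN-LSTM daily-flow forecaster (a sketch, not the authors' exact network).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features = 14, 6        # e.g. 14 days of search-index, weather, holiday features
model = keras.Sequential([
    layers.Input(shape=(timesteps, n_features)),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),   # CNN feature extraction
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(64),                                               # temporal modelling
    layers.Dense(1),                                               # next-day tourist flow
])
model.compile(optimizer="adam", loss="mae")

# Dummy normalized data just to show the expected shapes; the GA would tune the unit counts.
X = np.random.rand(100, timesteps, n_features)
y = np.random.rand(100, 1)
model.fit(X, y, epochs=2, batch_size=16, verbose=0)
print(model.predict(X[:1], verbose=0))
```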

27 pages, 5044 KiB  
Article
Nonlinear Canonical Correlation Analysis: A Compressed Representation Approach
by Amichai Painsky, Meir Feder and Naftali Tishby
Entropy 2020, 22(2), 208; https://0-doi-org.brum.beds.ac.uk/10.3390/e22020208 - 12 Feb 2020
Cited by 2 | Viewed by 4266
Abstract
Canonical Correlation Analysis (CCA) is a linear representation learning method that seeks maximally correlated variables in multi-view data. Nonlinear CCA extends this notion to a broader family of transformations, which are more powerful in many real-world applications. Given the joint probability, the Alternating Conditional Expectation (ACE) algorithm provides an optimal solution to the nonlinear CCA problem. However, it suffers from limited performance and an increasing computational burden when only a finite number of samples is available. In this work, we introduce an information-theoretic compressed representation framework for the nonlinear CCA problem (CRCCA), which extends the classical ACE approach. Our suggested framework seeks compact representations of the data that allow a maximal level of correlation. This way, we control the trade-off between the flexibility and the complexity of the model. CRCCA provides theoretical bounds and optimality conditions, as we establish fundamental connections to rate-distortion theory, the information bottleneck and remote source coding. In addition, it allows a soft dimensionality reduction, as the compression level is determined by the mutual information between the original noisy data and the extracted signals. Finally, we introduce a simple implementation of the CRCCA framework, based on lattice quantization. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)

24 pages, 3065 KiB  
Article
Mining Educational Data to Predict Students’ Performance through Procrastination Behavior
by Danial Hooshyar, Margus Pedaste and Yeongwook Yang
Entropy 2020, 22(1), 12; https://0-doi-org.brum.beds.ac.uk/10.3390/e22010012 - 20 Dec 2019
Cited by 86 | Viewed by 9044
Abstract
A significant amount of research has indicated that students’ procrastination tendencies are an important factor influencing the performance of students in online learning. It is, therefore, vital for educators to be aware of the presence of such behavior trends as students with lower procrastination tendencies usually achieve better than those with higher procrastination. In the present study, we propose a novel algorithm—using student’s assignment submission behavior—to predict the performance of students with learning difficulties through procrastination behavior (called PPP). Unlike many existing works, PPP not only considers late or non-submissions, but also investigates students’ behavioral patterns before the due date of assignments. PPP firstly builds feature vectors representing the submission behavior of students for each assignment, then applies a clustering method to the feature vectors for labelling students as a procrastinator, procrastination candidate, or non-procrastinator, and finally employs and compares several classification methods to best classify students. To evaluate the effectiveness of PPP, we use a course including 242 students from the University of Tartu in Estonia. The results reveal that PPP could successfully predict students’ performance through their procrastination behaviors with an accuracy of 96%. Linear support vector machine appears to be the best classifier among others in terms of continuous features, and neural network in categorical features, where categorical features tend to perform slightly better than continuous. Finally, we found that the predictive power of all classification methods is lowered by an increment in class numbers formed by clustering. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
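
A hedged sketch of the three PPP steps outlined in this abstract: build per-student submission-behavior features, cluster students into three groups, and train a classifier on the cluster labels. The feature definitions and synthetic data are assumptions, not the study's actual variables.

```python
# Sketch of a PPP-style pipeline: submission-behavior features -> clustering labels -> classifier.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Hypothetical per-student features for 242 students.
features = np.column_stack([
    rng.normal(48, 24, 242),      # average submission time before the due date (hours)
    rng.uniform(0, 0.5, 242),     # share of late submissions
    rng.uniform(0, 0.3, 242),     # share of non-submissions
])

# Step 1: cluster students into procrastinator / candidate / non-procrastinator groups.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)

# Step 2: train and evaluate a linear SVM on the cluster labels.
clf = make_pipeline(StandardScaler(), LinearSVC())
print("CV accuracy:", cross_val_score(clf, features, labels, cv=5).mean())
```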

21 pages, 6846 KiB  
Article
Electricity Load and Price Forecasting Using Jaya-Long Short Term Memory (JLSTM) in Smart Grids
by Rabiya Khalid, Nadeem Javaid, Fahad A. Al-zahrani, Khursheed Aurangzeb, Emad-ul-Haq Qazi and Tehreem Ashfaq
Entropy 2020, 22(1), 10; https://0-doi-org.brum.beds.ac.uk/10.3390/e22010010 - 19 Dec 2019
Cited by 56 | Viewed by 4757
Abstract
In the smart grid (SG) environment, consumers are enabled to alter electricity consumption patterns in response to electricity prices and incentives. This results in prices that may differ from the initial price pattern. Electricity price and demand forecasting play a vital role in the reliability and sustainability of SG. Forecasting using big data has become a new hot research topic as a massive amount of data is being generated and stored in the SG environment. Electricity users, having advanced knowledge of prices and demand of electricity, can manage their load efficiently. In this paper, a recurrent neural network (RNN), long short term memory (LSTM), is used for electricity price and demand forecasting using big data. Researchers are working actively to propose new models of forecasting. These models contain a single input variable as well as multiple variables. From the literature, we observed that the use of multiple variables enhances the forecasting accuracy. Hence, our proposed model uses multiple variables as input and forecasts the future values of electricity demand and price. The hyperparameters of this algorithm are tuned using the Jaya optimization algorithm to improve the forecasting ability and increase the training mechanism of the model. Parameter tuning is necessary because the performance of a forecasting model depends on the values of these parameters. Selection of inappropriate values can result in inaccurate forecasting. So, integration of an optimization method improves the forecasting accuracy with minimum user efforts. For efficient forecasting, data is preprocessed and cleaned from missing values and outliers, using the z-score method. Furthermore, data is normalized before forecasting. The forecasting accuracy of the proposed model is evaluated using the root mean square error (RMSE) and mean absolute error (MAE). For a fair comparison, the proposed forecasting model is compared with univariate LSTM and support vector machine (SVM). The values of the performance metrics depict that the proposed model has higher accuracy than SVM and univariate LSTM. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
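
A sketch of the preprocessing and multivariate LSTM forecaster described above; the z-score rule handles outliers, the Jaya tuning step is omitted, and the window length, layer size, and input columns are placeholders.

```python
# Sketch of a multivariate LSTM price/load forecaster with z-score preprocessing.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

raw = np.random.rand(1000, 3)                  # hypothetical columns: price, load, temperature
z = (raw - raw.mean(axis=0)) / raw.std(axis=0)
z = np.clip(z, -3, 3)                          # z-score rule: treat |z| > 3 as outliers

def make_windows(series, lookback=24):
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:, :2]                  # next-step price and load
    return X, y

X, y = make_windows(z)
model = keras.Sequential([
    layers.Input(shape=X.shape[1:]),
    layers.LSTM(64),                           # the unit count would be tuned by Jaya in the paper
    layers.Dense(2),                           # joint price and demand forecast
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```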

16 pages, 3020 KiB  
Article
Predicting Student Performance and Deficiency in Mastering Knowledge Points in MOOCs Using Multi-Task Learning
by Shaojie Qu, Kan Li, Bo Wu, Xuri Zhang and Kaihao Zhu
Entropy 2019, 21(12), 1216; https://0-doi-org.brum.beds.ac.uk/10.3390/e21121216 - 12 Dec 2019
Cited by 14 | Viewed by 3404
Abstract
Massive open online courses (MOOCs), which have been deemed a revolutionary teaching mode, are increasingly being used in higher education. However, there remain deficiencies in understanding the relationship between online behavior of students and their performance, and in verifying how well a student comprehends learning material. Therefore, we propose a method for predicting student performance and mastery of knowledge points in MOOCs based on assignment-related online behavior; this allows for those providing academic support to intervene and improve learning outcomes of students facing difficulties. The proposed method was developed while using data from 1528 participants in a C Programming course, from which we extracted assignment-related features. We first applied a multi-task multi-layer long short-term memory-based student performance predicting method with cross-entropy as the loss function to predict students’ overall performance and mastery of each knowledge point. Our method incorporates the attention mechanism, which might better reflect students’ learning behavior and performance. Our method achieves an accuracy of 92.52% for predicting students’ performance and a recall rate of 94.68%. Students’ actions, such as submission times and plagiarism, were related to their performance in the MOOC, and the results demonstrate that our method predicts the overall performance and knowledge points that students cannot master well. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
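
A minimal multi-task Keras model in the spirit of this abstract: one shared LSTM encoder with a performance head and a knowledge-point mastery head, both trained with cross-entropy. The attention mechanism is omitted, and all dimensions and data are hypothetical.

```python
# Multi-task sketch: shared LSTM over per-assignment behavior, two cross-entropy heads.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_assignments, n_feat, n_knowledge_points = 10, 8, 5
inputs = keras.Input(shape=(n_assignments, n_feat))        # per-assignment behavior features
h = layers.LSTM(64)(inputs)

performance = layers.Dense(1, activation="sigmoid", name="performance")(h)      # pass/fail
mastery = layers.Dense(n_knowledge_points, activation="sigmoid", name="mastery")(h)

model = keras.Model(inputs, [performance, mastery])
model.compile(optimizer="adam",
              loss={"performance": "binary_crossentropy", "mastery": "binary_crossentropy"})

# Dummy data illustrating the expected shapes.
X = np.random.rand(128, n_assignments, n_feat)
y_perf = np.random.randint(0, 2, (128, 1))
y_mast = np.random.randint(0, 2, (128, n_knowledge_points))
model.fit(X, {"performance": y_perf, "mastery": y_mast}, epochs=2, verbose=0)
```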

28 pages, 3394 KiB  
Article
Combination of Active Learning and Semi-Supervised Learning under a Self-Training Scheme
by Nikos Fazakis, Vasileios G. Kanas, Christos K. Aridas, Stamatis Karlos and Sotiris Kotsiantis
Entropy 2019, 21(10), 988; https://0-doi-org.brum.beds.ac.uk/10.3390/e21100988 - 10 Oct 2019
Cited by 16 | Viewed by 4803
Abstract
One of the major aspects affecting the performance of the classification algorithms is the amount of labeled data which is available during the training phase. It is widely accepted that the labeling procedure of vast amounts of data is both expensive and time-consuming since it requires the employment of human expertise. For a wide variety of scientific fields, unlabeled examples are easy to collect but hard to handle in a useful manner, thus improving the contained information for a subject dataset. In this context, a variety of learning methods have been studied in the literature aiming to efficiently utilize the vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by individually applying active learning or semi-supervised learning methods. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to efficiently utilize the available unlabeled data. The effective and robust metrics of the entropy and the distribution of probabilities of the unlabeled set, to select the most sufficient unlabeled examples for the augmentation of the initial labeled set, are used. The superiority of the proposed scheme is validated by comparing it against the base approaches of supervised, semi-supervised, and active learning in the wide range of fifty-five benchmark datasets. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
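
A compact sketch of a combined scheme of this kind: in each self-training round, the lowest-entropy unlabeled examples are pseudo-labelled by the model itself, while the highest-entropy ones are sent to the oracle. The base learner, batch sizes, and synthetic data are assumptions, not the paper's exact algorithm.

```python
# Sketch of combined active + semi-supervised self-training driven by predictive entropy.
import numpy as np
from scipy.stats import entropy
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_informative=5, random_state=0)
labeled = np.zeros(len(X), dtype=bool)
labeled[:30] = True                                  # small initial labeled set

clf = LogisticRegression(max_iter=1000)
for _ in range(5):
    clf.fit(X[labeled], y[labeled])
    pool = np.where(~labeled)[0]
    H = entropy(clf.predict_proba(X[pool]).T)        # predictive entropy per pool example

    query = pool[np.argsort(H)[-10:]]                # active learning: ask the oracle (true labels)
    confident = pool[np.argsort(H)[:20]]             # semi-supervised: trust own predictions
    y[confident] = clf.predict(X[confident])         # pseudo-labels augment the labeled set
    labeled[np.concatenate([query, confident])] = True

print("final labeled-set size:", labeled.sum())
```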

Review

45 pages, 1086 KiB  
Review
Explainable AI: A Review of Machine Learning Interpretability Methods
by Pantelis Linardatos, Vasilis Papastefanopoulos and Sotiris Kotsiantis
Entropy 2021, 23(1), 18; https://0-doi-org.brum.beds.ac.uk/10.3390/e23010018 - 25 Dec 2020
Cited by 982 | Viewed by 106010
Abstract
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance, has often been achieved through increased model complexity, turning such systems into “black box” approaches and causing uncertainty regarding the way they operate and, ultimately, the way that they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains, where their value could be immense, such as healthcare. As a result, scientific interest in the field of Explainable Artificial Intelligence (XAI), a field that is concerned with the development of new methods that explain and interpret machine learning models, has been tremendously reignited over recent years. This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations, in the hope that this survey would serve as a reference point for both theorists and practitioners. Full article
(This article belongs to the Special Issue Theory and Applications of Information Theoretic Machine Learning)
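
As one concrete instance of the model-agnostic interpretability methods such a review catalogues, the sketch below computes permutation feature importance with scikit-learn; it is purely illustrative and not drawn from the paper.

```python
# Example of a model-agnostic interpretability method: permutation feature importance.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time and measure the drop in test accuracy: a large drop
# means the "black box" relies heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = result.importances_mean.argsort()[::-1][:5]
for i in top:
    print(f"{X.columns[i]}: {result.importances_mean[i]:.4f}")
```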
