Deep Artificial Neural Networks Meet Information Theory

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (30 August 2020) | Viewed by 13032

Special Issue Editor


Dr. Friedhelm Schwenker
Guest Editor
Institute of Neural Information Processing, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
Interests: artificial neural networks; pattern recognition; cluster analysis; statistical learning theory; data mining; multiple classifier systems; sensor fusion; affective computing

Special Issue Information

Dear Colleagues,

Deep neural networks (DNNs) constitute a rapidly growing research field with a proven record of success in recent years across various applications, e.g., computer vision, speech processing, pattern recognition, and reinforcement learning. Despite this great success, the theoretical understanding of DNNs is still limited. Recently, information-theoretic principles have been considered useful for a deeper understanding of DNNs. The purpose of this Special Issue is to highlight the state of the art of learning in DNNs in the context of information theory.

This Special Issue welcomes original research papers concerned with learning in DNNs based on information-theoretic methods. Review articles describing the current state of the art of DNNs in the context of information theory are highly encouraged. All submissions to this Special Issue must address substantial aspects of both DNNs and information theory.

Possible topics include but are not limited to the following:

  • Information-theoretic principles in machine learning, especially in DNNs;
  • Information-theoretic cost functions and constraints in DNNs;
  • Sampling and feature learning based on information-theoretic principles;
  • Analyzing learning in DNNs using information-theoretic methods;
  • Information bottleneck approaches in DNNs (see the objective noted after this list);
  • Applications of DNNs based on information-theoretic principles.
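
For the information bottleneck topic above, the objective in question is the standard textbook formulation (stated here for orientation only, not as a requirement of this Special Issue): for an input X, a target Y, and a learned stochastic representation T of X,

    min_{p(t|x)}  I(X;T) − β · I(T;Y),

where the multiplier β trades off compression of the input against preservation of information relevant to the target.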

Dr. Friedhelm Schwenker
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep neural networks (DNN)
  • machine learning
  • information theory
  • pattern recognition

Published Papers (4 papers)


Research

28 pages, 468 KiB  
Article
Neural Estimator of Information for Time-Series Data with Dependency
by Sina Molavipour, Hamid Ghourchian, Germán Bassi and Mikael Skoglund
Entropy 2021, 23(6), 641; https://0-doi-org.brum.beds.ac.uk/10.3390/e23060641 - 21 May 2021
Cited by 3 | Viewed by 2012
Abstract
Novel approaches to estimating information measures using neural networks have been well celebrated in recent years in both the information theory and machine learning communities. These neural-based estimators have been shown to converge to the true values when estimating mutual information and conditional mutual information using independent samples. However, if the samples in the dataset are not independent, the consistency of these estimators requires further investigation. This is of particular interest for a more complex measure such as the directed information, which is pivotal in characterizing causality and is meaningful over time-dependent variables. The extension of the convergence proof to such cases is not trivial and demands further assumptions on the data. In this paper, we show that our neural estimator for conditional mutual information is consistent when the dataset is generated from a stationary and ergodic source. In other words, we show that our information estimator using neural networks converges asymptotically to the true value with probability one. Besides the universal function approximation property of neural networks, a core lemma used to show the convergence is Birkhoff's ergodic theorem. Additionally, we use the technique to estimate directed information and demonstrate the effectiveness of our approach in simulations.
(This article belongs to the Special Issue Deep Artificial Neural Networks Meet Information Theory)
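
For readers unfamiliar with neural information estimation, the following is a minimal, self-contained sketch of a Donsker–Varadhan (MINE-style) mutual information estimator on i.i.d. toy data; it is not the authors' implementation, whose contribution is precisely the consistency analysis for dependent (stationary and ergodic) samples and for conditional and directed information. The network `StatisticsNet` and all hyperparameters below are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

class StatisticsNet(nn.Module):
    """Small MLP T_theta(x, y) used inside the Donsker-Varadhan bound."""
    def __init__(self, dim_x, dim_y, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1))

def dv_lower_bound(T, x, y):
    """Donsker-Varadhan bound: I(X;Y) >= E_p(x,y)[T] - log E_p(x)p(y)[exp(T)]."""
    joint_term = T(x, y).mean()
    # Shuffling y approximates sampling from the product of the marginals.
    y_shuffled = y[torch.randperm(y.shape[0])]
    marginal_term = torch.logsumexp(T(x, y_shuffled), dim=0).squeeze() - math.log(y.shape[0])
    return joint_term - marginal_term

# Toy data: correlated Gaussians with a known ground-truth mutual information.
torch.manual_seed(0)
n, rho = 4000, 0.8
x = torch.randn(n, 1)
y = rho * x + (1.0 - rho ** 2) ** 0.5 * torch.randn(n, 1)

T = StatisticsNet(1, 1)
optimizer = torch.optim.Adam(T.parameters(), lr=1e-3)
for step in range(2000):
    optimizer.zero_grad()
    loss = -dv_lower_bound(T, x, y)  # maximize the lower bound
    loss.backward()
    optimizer.step()

# Analytic value for bivariate Gaussians: -0.5 * ln(1 - rho^2) ≈ 0.51 nats.
print("estimated MI (nats):", dv_lower_bound(T, x, y).item())
```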

17 pages, 958 KiB  
Article
Discovering Higher-Order Interactions Through Neural Information Decomposition
by Kyle Reing, Greg Ver Steeg and Aram Galstyan
Entropy 2021, 23(1), 79; https://0-doi-org.brum.beds.ac.uk/10.3390/e23010079 - 07 Jan 2021
Cited by 2 | Viewed by 2604
Abstract
If regularity in data takes the form of higher-order functions among groups of variables, models which are biased towards lower-order functions may easily mistake the data for noise. To distinguish whether this is the case, one must be able to quantify the contribution of different orders of dependence to the total information. Recent work in information theory attempts to do this through measures of multivariate mutual information (MMI) and information decomposition (ID). Despite substantial theoretical progress, practical issues related to tractability and learnability of higher-order functions are still largely unaddressed. In this work, we introduce a new approach to information decomposition—termed Neural Information Decomposition (NID)—which is both theoretically grounded, and can be efficiently estimated in practice using neural networks. We show on synthetic data that NID can learn to distinguish higher-order functions from noise, while many unsupervised probability models cannot. Additionally, we demonstrate the usefulness of this framework as a tool for exploring biological and artificial neural networks.
(This article belongs to the Special Issue Deep Artificial Neural Networks Meet Information Theory)
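
As a small, self-contained illustration of why lower-order statistics can mistake higher-order structure for noise (in the spirit of the synthetic experiments described above, not the NID method itself): for Z = X XOR Y with independent fair coins, every pairwise mutual information involving Z is essentially zero, yet X and Y together determine Z exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(0, 2, n)
y = rng.integers(0, 2, n)
z = x ^ y  # parity: a purely higher-order (third-order) dependency

def entropy(*cols):
    """Plug-in joint entropy in bits of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mi(a, b):
    """Plug-in mutual information I(A;B) in bits."""
    return entropy(a) + entropy(b) - entropy(a, b)

print("I(X;Z)   =", round(mi(x, z), 4))  # ~0 bits: pairwise views see only noise
print("I(Y;Z)   =", round(mi(y, z), 4))  # ~0 bits
print("I(X,Y;Z) =", round(entropy(x, y) + entropy(z) - entropy(x, y, z), 4))  # ~1 bit
```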

15 pages, 687 KiB  
Article
Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks
by Shrihari Vasudevan
Entropy 2020, 22(5), 560; https://0-doi-org.brum.beds.ac.uk/10.3390/e22050560 - 17 May 2020
Cited by 8 | Viewed by 3818
Abstract
This paper demonstrates a novel approach to training deep neural networks using a Mutual Information (MI)-driven, decaying Learning Rate (LR), Stochastic Gradient Descent (SGD) algorithm. MI between the output of the neural network and true outcomes is used to adaptively set the LR for the network in every epoch of the training cycle. This idea is extended to the layer-wise setting of LR, as MI naturally provides a layer-wise performance metric. An LR range test determining the operating LR range is also proposed. Experiments compared this approach with popular alternatives such as gradient-based adaptive LR algorithms like Adam, RMSprop, and LARS. Accuracy outcomes that are competitive with or better than those alternatives, obtained in competitive or better time, demonstrate the feasibility of the metric and approach.
(This article belongs to the Special Issue Deep Artificial Neural Networks Meet Information Theory)
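
A rough, hypothetical sketch of the general idea (not the paper's exact algorithm or its layer-wise variant): estimate, once per epoch, a plug-in mutual information between the network's predicted classes and the true labels, and map it to a learning rate that decays as the predictions become more informative. The mapping `mi_decayed_lr` and its bounds are illustrative assumptions.

```python
import numpy as np

def discrete_mi(pred, true, num_classes):
    """Plug-in mutual information (nats) between predicted and true class labels."""
    joint = np.zeros((num_classes, num_classes))
    for p, t in zip(pred, true):
        joint[p, t] += 1.0
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of predictions
    py = joint.sum(axis=0, keepdims=True)   # marginal of labels
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mi_decayed_lr(pred, true, num_classes, lr_min=1e-4, lr_max=1e-1):
    """Map MI in [0, log(num_classes)] to a learning rate in [lr_min, lr_max]."""
    h_max = np.log(num_classes)               # upper bound on I(pred; true)
    frac = np.clip(discrete_mi(pred, true, num_classes) / h_max, 0.0, 1.0)
    return lr_max - frac * (lr_max - lr_min)  # more informative outputs -> smaller LR

# Per-epoch usage inside an SGD training loop (framework-agnostic):
#   lr = mi_decayed_lr(epoch_predictions, epoch_labels, num_classes)
#   for group in optimizer.param_groups: group["lr"] = lr
```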

17 pages, 8944 KiB  
Article
A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory
by Yingying Ma, Youlong Wu and Chengqiang Lu
Entropy 2020, 22(4), 416; https://0-doi-org.brum.beds.ac.uk/10.3390/e22040416 - 07 Apr 2020
Cited by 8 | Viewed by 3614
Abstract
Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval, and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author name disambiguation task is designed to divide documents related to an author name reference into several parts, with each part associated with a real-life person. Existing methods usually use either the attributes of documents or the relationships between documents and co-authors. However, methods of feature extraction using attributes cause inflexibility of models, while solutions based on relationship graph networks ignore the information contained in the features. In this paper, we propose a novel name disambiguation model based on representation learning which incorporates both attributes and relationships. Experiments on a public real-world dataset demonstrate the effectiveness of our model and show that our solution is superior to several state-of-the-art graph-based methods. We also increase the interpretability of our method through information theory and show that this analysis can be helpful for model selection and for monitoring training progress.
(This article belongs to the Special Issue Deep Artificial Neural Networks Meet Information Theory)
