Latest Trends Related to Imbalanced Classification Problems in Data Mining: New Approaches and Applications

Special Issue Editor

Dr. José Antonio Sáez

Guest Editor

Department of Computer Science and Automatics, University of Salamanca, 37008 Salamanca, Spain
Interests: data science; data mining; machine learning; classification; regression; data preprocessing; noisy data; imbalanced learning

Special Issue Information

Dear Colleagues,

As you know, many real-world classification problems are characterized by a highly imbalanced distribution of samples among the classes. In these problems, one class (the minority class) contains far fewer samples than the other classes (the majority classes). Class imbalance is a difficulty for most learning algorithms, which assume an approximately balanced class distribution and are therefore biased toward learning and recognizing the majority classes. As a result, minority class samples, which are often the most interesting from an application point of view, tend to be misclassified.

This Special Issue is focused on papers dealing with the imbalanced classification problem in data mining. Research topics can include but are not limited to:

1) New approaches to deal with imbalanced classification problems;

2) Applications of existing or new methods in the imbalanced classification framework;

3) Studies on class imbalance combined with other problems affecting the data: overlapping, noisy samples, presence of small disjuncts, etc.;

4) Theoretical/experimental reviews of classic and recent approaches in imbalanced classification;

5) Negative and confirmatory results of existing scientific publications related to imbalanced classification.

We cordially welcome concise, comprehensive research papers and review articles on the topics above. Papers will undergo peer review to ensure the scientific soundness of their content. The review process will also aim for fast and wide dissemination of the authors' research results.

Dr. José Antonio Sáez
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. J is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

imbalanced learning
unbalanced learning
classification
data preprocessing
data mining

Published Papers (1 paper)

Research

20 pages, 1132 KiB

Open Access Article

Filtering-Based Instance Selection Method for Overlapping Problem in Imbalanced Datasets

by Marcio Rubbo and Leandro A. Silva

J 2021, 4(3), 308-327; https://0-doi-org.brum.beds.ac.uk/10.3390/j4030024 - 09 Jul 2021

Cited by 3 | Viewed by 2305

Abstract

The overlapping problem occurs when a region of the data space is shared in similar proportions by different classes. It affects a classifier's performance because the classes are difficult to separate correctly. Further, a dataset is imbalanced when one class has more instances than another, which is another aspect that affects a classifier's performance. In general, these two problems are treated separately. On the other hand, Prototype Selection (PS) approaches are employed as strategies for selecting appropriate instances from a dataset by filtering out redundant and noisy data, which can cause misclassification. In this paper, we introduce Filtering-based Instance Selection (FIS), which is based on the Self-Organizing Map (SOM) neural network and information entropy. Specifically, the SOM is trained with a dataset, and then the instances of the training set are mapped to the nearest prototype (SOM neuron). An entropy analysis is conducted in each prototype region. Based on a threshold, we propose three decision methods: filtering the majority class (H-FIS (High Filter IS)), the minority class (L-FIS (Low Filter IS)), and both classes (B-FIS). The experiments using artificial and real datasets showed that the proposed methods, in combination with 1NN, improved the accuracy, F-Score, and G-mean values when compared with the 1NN classifier without the filter methods. The FIS approach is also compatible with the approaches mentioned in the relevant literature.

(This article belongs to the Special Issue Latest Trends Related to Imbalanced Classification Problems in Data Mining: New Approaches and Applications)
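
The pipeline outlined in the abstract (map each instance to its nearest prototype, measure class entropy per prototype region, then filter by a threshold) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The function names, the default threshold of 0.9, and the rule that regions with entropy above the threshold are the ones filtered are all assumptions. The prototypes are passed in directly as a stand-in for trained SOM neurons, so no SOM training is shown.

```python
import numpy as np

def region_entropy(labels):
    """Shannon entropy (in bits) of the class labels inside one region."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def filter_instances(X, y, prototypes, threshold=0.9, mode="H"):
    """Return a boolean mask of the instances to keep.

    Assumed rule (not taken from the paper): in regions whose entropy
    exceeds `threshold`, drop the majority class (mode "H"), the
    minority class(es) (mode "L"), or every instance (mode "B").
    """
    classes, counts = np.unique(y, return_counts=True)
    majority = classes[np.argmax(counts)]

    # Assign every instance to its nearest prototype (Euclidean distance).
    dists = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    region = dists.argmin(axis=1)

    keep = np.ones(len(y), dtype=bool)
    for r in np.unique(region):
        idx = np.flatnonzero(region == r)
        if region_entropy(y[idx]) <= threshold:
            continue  # region is dominated by one class; leave it alone
        if mode == "H":
            drop = idx[y[idx] == majority]
        elif mode == "L":
            drop = idx[y[idx] != majority]
        else:  # mode "B"
            drop = idx
        keep[drop] = False
    return keep

# Toy data: a pure majority region near (0, 0) and an overlapping region near (1, 1).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])  # hypothetical stand-ins for SOM neurons

mask = filter_instances(X, y, prototypes, mode="H")
X_filtered, y_filtered = X[mask], y[mask]
```

In this toy case only the overlapping region is touched, and the filtered set could then be handed to a 1NN classifier, as in the experiments the abstract mentions.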
