Article

Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures

by Francisco J. Camacho-Urriolagoitia 1, Yenny Villuendas-Rey 1,*, Itzamá López-Yáñez 1, Oscar Camacho-Nieto 1 and Cornelio Yáñez-Márquez 2,*

1 Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, Av. Juan de Dios Bátiz s/n, Nueva Industrial Vallejo, GAM, Mexico City 07700, Mexico
2 Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz s/n, Nueva Industrial Vallejo, GAM, Mexico City 07738, Mexico
* Authors to whom correspondence should be addressed.
Submission received: 15 February 2022 / Revised: 9 April 2022 / Accepted: 18 April 2022 / Published: 26 April 2022

Abstract:
One of the four basic machine learning tasks is pattern classification. Selecting the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers derived from Alpha-Beta models applied to the financial field. In this paper, the behavior of four associative classifiers was studied: the One-Hot version of the Hybrid Associative Classifier with Translation (CHAT-OHM), the Extended Gamma (EG), the Naïve Associative Classifier (NAC), and the Assisted Classification for Imbalanced Datasets (ACID). To establish performance, we used the area under the ROC curve (AUC), the F-score, and the geometric mean. The four classifiers were applied to 11 datasets from the financial area, and the performance of each was analyzed in terms of its correlation with data complexity measures belonging to six categories based on specific aspects of the datasets: feature, linearity, neighborhood, network, dimensionality, and class imbalance measures. The correlations between the complexity measures of the datasets and the performance measures of the associative classifiers were established and are expressed through Spearman's Rho coefficient. The experimental results reveal significant correlations between data complexity measures and the performance of the associative classifiers.

1. Introduction

Numerous classification models have been developed in the scientific literature, with different theoretical foundations. One of the pioneers is the k-nearest neighbors (kNN) classifier [1], which is based on neighborhood criteria. Other well-known classifiers are based on probabilistic approaches, such as naïve Bayes [2], whereas others use information theory as part of their foundation (for example, ID3 [3] and C4.5 [4]).
From another perspective and focusing on finding hyperspaces where decision classes are separated, support vector machine (SVM) models have been developed [5]. Models of the human nervous system have also been used for classification tasks through the proposal of artificial neural networks (ANN) [6].
On the other hand, in Mexico, the associative approach has been developed, with the Alpha-Beta associative models [7] and their derivatives, for classification tasks. Among these derived models are the Smallest Normalized Distance Associative Memory (SNDAM) method [8] and the Hybrid Associative Classifier with Translation (HACT) method [9], also known as CHAT for its Spanish acronym, and its derivative, the CHAT-OHM classifier [10].
Another classification model within this approach is the Gamma classifier [11], with its versions of weight adjustments with differential evolution (CAG-ED) [12], its version for handling mixed and incomplete data (the Extended Gamma associative classifier, EG) [13], and the version of the latter which uses weight adjustment with differential evolution (Modified Associative Gamma Classifier, CAGM-ED) [14].
In addition to the previous models, the Naïve Associative Classifier (NAC) [15] stands out, with its disambiguation variant NACe [16], the NACe-kVotes and NACe-kMajority variants [17], and its variant for the classification of data streams (Naïve Associative Classifier for Online Data, NACOD) [18].
Alpha-Beta associative models and their derivatives have shown satisfactory performance in numerous classification tasks from different fields. Among these, the medical [19], financial [20], social [14], productive [21], educational [22,23], and environmental [24] fields can be highlighted.
However, with so many different models available, how can one know which model will offer the best classification for a specific problem? The focus of our research is on the behavior of associative classifiers in the financial field. Several studies indicate that associative-based classifiers have shown good performance for finance-related datasets, outperforming several of the well-known supervised classifiers such as support vector machines, neural networks, decision trees, statistical-based classifiers, and others [15,16,25,26]. This is where the topic of interest for this paper appears and why we decided to focus on associative-based classifiers.
The algorithm selection problem (ASP) is a challenging learning task focused on selecting proper classification algorithms for a given problem [27]. ASP has been the focus of various studies in the machine learning community, and for the supervised classification domain, meta-learning has achieved ample success in solving the ASP. However, even though numerous studies using diverse classifiers have been conducted in the literature, to the knowledge of the authors, no comparative study has critically analyzed, summarized, and evaluated the performance of the associative classifiers derived from Alpha-Beta models in a way that identifies the data complexity measures that influence the good or bad performance of these models.
This paper addresses a problem of which the solution would be beneficial for researchers in artificial intelligence, pattern recognition, and related areas: How can we know, in advance, in which dataset(s) a specific classifier will exhibit good (or bad) performance? The innovative and scientific contribution of this paper consists in the analysis of the performance of associative classifiers, considering their correlation with the measures of data complexity, corresponding to six categories based on specific aspects of the datasets.
In this study, we make use of different complexity measures to help understand how some of the mentioned associative-based classifiers behave in the financial field, based on the data of the problem to be solved. The experimental results reveal the existing correlations between complexity measures and the performance of the associative classifiers.

2. Background

Ho and Basu attribute the complexity of supervised classification problems to a combination of three main factors: the ambiguity of the classes, the scarcity and dimensionality of the data, and the complexity of the boundary that separates the classes [28].
To determine the presence of such factors, several data descriptors have been introduced, based mainly on the geometric and statistical properties of the data. The data complexity measures proposed by Ho and Basu [28] are dedicated to computing the complexity of the boundary needed to separate binary classification problems (they were later extended to multiclass problems). Ho and Basu split their measures into three categories: individual feature value overlap measures; class separability measures; and measures of the geometry, topology, and density of manifolds.
Ho and Basu’s work was instrumental in determining the complexity of a given dataset using data descriptors. Furthermore, Mansilla and Ho [29] studied the competence domains of the XCS classification system (the learning classifier system type) in the area of complexity measures and found that difficult problems for this system are characterized by a large number of samples located in the decision boundaries, the presence of a high number of adherence subsets, a high dispersion of class samples, and a high level of nonlinearity in the shapes of classes and boundaries.
A couple of years later, Sánchez et al. [30] analyzed how the complexity of the training data affects the kNN classifier and showed that the performance of this classifier is very susceptible to class overlapping and class density; in addition, they found that high dimensionality also affects the performance of kNN. Sánchez et al. also pointed out that the measures are domain-dependent; that is, the performance of the classifier depends on the characteristics of the problem.
In their studies, Luengo and Herrera presented two rules that are "simple, interpretable, and precise" to describe the good and bad performance of the FH-GBML method [31]. Two years later, they conducted another study [32], in which they computed five data complexity and class separability measures to obtain ranges of such metrics for which the method performed significantly well or poorly.
The competence domains of the semi-naïve Bayesian network classifiers (BNC) have also been studied by Flores, Gámez, and Martínez [33]. They defined the general attributes of ideal datasets for classification with BNCs and provided an automatic method to find the best BNC (in terms of prediction errors) for a particular dataset.
One year later, Luengo and Herrera [34] proposed a method of automatically extracting the competence domains for classifiers using measures of data complexity. The objective was to recognize whether the classification problem was good or bad for each classifier and obtain intervals of good performance for the classifiers based on data complexity measures. Their results showed that the domains of competence obtained were general enough to find the right datasets for the classifiers.
Years later, Morán-Fernández et al. [35] studied micro-array data, showing that the classification performance can be predicted by complexity measures and observed that attribute selection can reduce the complexity of the data.
In 2018, Barella and collaborators [36] studied data complexity measures for imbalanced classification tasks. They showed that several of the existing data complexity measures do not consider the difficulty of imbalanced classification problems. Therefore, they adapted some data complexity measures to assess the classes one by one and to focus on the minority class. However, they also showed that data complexity measures mainly approximate the difficulty of the majority class, which can be misleading, especially for imbalanced tasks with high class overlap. In addition, Barella et al. suggested evaluating the difficulty of both classes to understand the full classification problem, although the difficulty of the minority class is often the main challenge in imbalanced tasks.
A year later, Lorena et al. [37] reviewed the main measures of data complexity in the literature and concluded that these measures allow one to characterize the complexity of a classification problem considering a geometrical approach and the distribution of data within or between classes. In 2020, Khan and collaborators [38] produced a survey on meta-learning for classifier selection. Using this approach, they deeply analyzed the problem of recommending a classification algorithm based on meta-learning. Their in-depth analysis showed that meta-learning was successful in the automatic algorithm recommendation process.
A recent study by Maillo et al. [39] analyzed redundancy and complexity measures for big data. The authors emphasized that it is possible to obtain similar or better results in several big data tasks by using a small set of quality data.
Since no classifier can regularly obtain the best performance for each classification problem, an effect proved in the no free lunch theorems [40], the analysis of data complexity measures allows us to understand the situations in which a certain classifier is successful and in which it fails.

3. Materials and Methods

This section describes the most important data complexity measures and the associative classifiers derived from the Alpha-Beta associative models that are tested on different datasets in this study.

3.1. Data Complexity Measures

Some of the most widely used data complexity measures are those proposed by Ho and Basu [28] and later extended in [30,41] to deal with multiclass classification problems. Ho and Basu [28] split the data complexity measures into three groups: (1) individual characteristic value overlap measures; (2) class separability measures; and (3) geometry, topology, and manifold density measurements. Likewise, Sotoca et al. [42] divided complexity measures into the following groups: (1) overlap measures, (2) class separability measures, and (3) geometry and density measurements.
Subsequently, Lorena et al. [37] classified complexity measures into six categories:
  • Feature-based measures;
  • Linearity measures;
  • Neighborhood measures;
  • Network measures;
  • Dimensionality measures; and
  • Class imbalance measures.
In this work, we use five of the six types of measures mentioned above. To compute the measures, we assume that we have a learning dataset $T$ that contains pairs of instances $(x_i, y_i)$, where $x_i = (x_{i1}, \ldots, x_{im})$ and $y_i \in \{1, \ldots, n_c\}$. That is, each instance $x_i$ is described by $m$ attributes and has a class label $y_i$ taking one of the $n_c$ class values.
Most data complexity measures assume that the attributes are numeric and complete (no missing values). Categorical values need to be codified into numeric ones, and missing values must be deleted or imputed. An additional assumption is that linearly separable problems are simpler than nonlinear classification problems.
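As an illustration of this preprocessing step, the following sketch (using the scikit-learn library; the column indices and imputation strategies are our assumptions, not a prescription from the complexity-measure literature) converts a mixed, incomplete dataset into the numeric, complete matrix the measures expect:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical column indices: which attributes are numeric vs. categorical.
numeric_cols = [0, 1, 2]
categorical_cols = [3, 4]

preprocess = ColumnTransformer([
    # Numeric attributes: impute missing values with the feature mean.
    ("num", SimpleImputer(strategy="mean"), numeric_cols),
    # Categorical attributes: impute with the mode, then one-hot encode.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

# X_raw would be an (n x m) array with np.nan marking missing entries;
# preprocess.fit_transform(X_raw) yields a numeric, complete matrix.
```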
Table 1 presents the data complexity measures used in this study: three feature-based measures (F1, F2, and F3), three linearity measures (L1, L2, and L3), four neighborhood-based measures (N1, N2, N3, and N4), one network measure (T1), and one dimensionality measure (T2). We did not consider class-imbalance-related measures in this study. All the data complexity measures used here were presented in the review by Lorena et al. [37].
In addition, the complexity measures presented here are available for computation in KEEL software [43], which is an open software environment designed for experimentation with supervised classification.
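To make the definitions of Table 1 concrete, the following minimal sketch (ours, not the KEEL implementation) computes two of the measures for a binary, fully numeric dataset: the complementary form of F1, using the two-class form of the discriminant ratio, and the nearest neighbor error rate N3 estimated by leave-one-out:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def f1_complementary(X, y):
    """Maximum Fisher's discriminant ratio (F1), complementary form."""
    c1, c2 = X[y == 0], X[y == 1]
    # Two-class discriminant ratio per feature:
    # (difference of class means)^2 / (sum of class variances).
    num = (c1.mean(axis=0) - c2.mean(axis=0)) ** 2
    den = c1.var(axis=0) + c2.var(axis=0) + 1e-12
    return 1.0 / (1.0 + np.max(num / den))

def n3_nn_error(X, y):
    """Error rate of the 1-NN classifier (N3), leave-one-out estimate."""
    accuracy = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                               X, y, cv=LeaveOneOut())
    return 1.0 - accuracy.mean()
```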

3.2. Associative Classifiers

3.2.1. CHAT-OHM

The Hybrid Associative Classifier with Translation (HACT, or CHAT by its Spanish acronym) is a classification algorithm proposed by Santiago-Montero [9], which has shown good results in supervised classification [25]. It has two phases: association (in which the classifier is trained) and recovery (in which the classes of the instances to be classified are obtained). This classifier assumes that the dataset contains no missing information and that all attributes are numeric. Furthermore, it assumes that the class values are consecutive integers.
This classification algorithm was refined in 2014, with the development of a "One-Hot" version (CHAT-OHM) [10]. The first modification arises in the training phase and consists of a new way of coding the classes. In this case, the classes are represented by binary vectors of size $p$ (instead of vectors of size $m$ as in HACT), whose components have the form $y_i^\mu = \begin{cases} 1 & \text{if } i = \mu \\ 0 & \text{otherwise} \end{cases}$.
The second modification consists of using the majority vote rule in the classification process. To accomplish this, the CHAT-OHM classification phase has three steps:
  • Translate the instance to classify $o$, considering the mean of the training instances, as $\hat{o} = o - \bar{x}$.
  • For each class $k \in \{1, 2, \ldots, m\}$, create a masking vector $mv^k$ of size $p$, whose components are $mv_\mu^k = \begin{cases} 1 & \text{if } \hat{x}_\mu^k \neq 0 \\ 0 & \text{otherwise} \end{cases}$, and obtain the output pattern $z^o$ as $z^o = \left[ \alpha \sum_{\mu=1}^{p} y^\mu (\hat{x}^\mu)^T \right] \hat{o}$. Then, for each class, obtain a counter vector $cv^k = z^o \odot mv^k$.
  • Compute the components of the output vector (class) for the instance to classify $\hat{o}$ as $y_i^o = \begin{cases} 1 & \text{if } \sum_{j=1}^{p} cv_j^i = \max_{h=1,\ldots,m} \left[ \sum_{j=1}^{p} cv_j^h \right] \\ 0 & \text{otherwise} \end{cases}$. Thus, the class of the input instance $x^\mu$ is returned if and only if the obtained vector has a one in the $\mu$-th component and zeros in the remaining components.
Although HACT does not handle mixed or incomplete data, it has achieved good performance in the financial field [25].
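The following sketch illustrates our reading of the CHAT-OHM flow described above (mean translation, one-hot class codes, and a maximum vote). It replaces the Alpha-Beta operator algebra with ordinary linear algebra, so it should be read as a schematic of the three steps rather than as the published classifier:

```python
import numpy as np

def chat_ohm_fit(X, y, n_classes):
    # Translation: move the axis origin to the mean of the training set.
    x_mean = X.mean(axis=0)
    X_hat = X - x_mean
    # One-hot class codes: the mu-th code has a 1 only in component mu.
    Y = np.eye(n_classes)[y]
    # Association matrix built from (class code, translated pattern) pairs.
    M = Y.T @ X_hat
    return M, x_mean

def chat_ohm_predict(o, M, x_mean):
    o_hat = o - x_mean             # step 1: translate the query instance
    votes = M @ o_hat              # steps 2-3 collapsed: one score per class
    return int(np.argmax(votes))   # the class with the maximum vote wins
```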

3.2.2. Extended Gamma

López-Yáñez introduced the Gamma associative classifier as a model designed to predict time series [44]. It has been applied to several supervised classification tasks [11,21,24]. However, like HACT, it assumes that the dataset is numeric and complete. To address this drawback, the Extended Gamma (EG) classifier was proposed [13].
Let $X$ and $P$ be the training and test sets, respectively, taken from a universe of data $U$, where each instance $x \in X$, $p \in P$ is described by a set of features $A = \{A_1, A_2, \ldots, A_m\}$, and each attribute $A_i$ has an associated definition domain $dom(A_i)$, which can be numeric or categorical. If the value of an attribute $A_i$ in an instance $x$ is missing, it is denoted as $x_i = \text{'?'}$. Extended Gamma considers the data to belong to a set of classes $K = \{K_1, \ldots, K_c\}$.
EG has the same training phase as its predecessor, and in the classification phase, it replaces the gamma similarity operator with the extended gamma similarity operator, $\gamma_{ext}$. To determine the class value of an instance, its average similarity with respect to each class is computed as:

$$c_{k_l} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} w_j \, \gamma_{ext}(x_j^i, p_j)}{n}$$

where $x_j^i$ is the value of the $j$-th attribute in the $i$-th instance belonging to the class $k_l$, $w_j$ is the weight associated with the $j$-th attribute, and $n$ is the number of instances in the training set having $k_l$ as the class value. If there is a unique maximum among the values of $c_{k_l}$, the process ends. Otherwise, an iterative process is carried out [13].
The definition of the extended gamma similarity operator allows for the handling of mixed continuous and categorical attributes and missing values. When the value of a feature is unknown, it is assumed that the values are similar, and in the case of categorical features, the patterns will only be similar if the values of the features are equal. For the case of continuous attributes, a definition equivalent to the original operator definition is used.
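A minimal sketch of this per-attribute behavior is given below; the numeric tolerance parameter theta is our stand-in for the original operator's similarity threshold:

```python
def gamma_ext(x, p, theta=0.0):
    """Extended gamma similarity for one attribute value pair (a sketch).

    Returns 1.0 when the values are considered similar, 0.0 otherwise.
    """
    if x == "?" or p == "?":
        return 1.0                      # missing values are assumed similar
    if isinstance(x, str) or isinstance(p, str):
        return 1.0 if x == p else 0.0   # categorical: similar iff equal
    return 1.0 if abs(x - p) <= theta else 0.0  # numeric: within tolerance

def class_similarity(class_instances, p, w):
    """Average weighted similarity of instance p to one class (c_kl above)."""
    n = len(class_instances)
    return sum(w[j] * gamma_ext(x[j], p[j])
               for x in class_instances
               for j in range(len(w))) / n
```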
Extended Gamma has been applied effectively to solve problems of a social nature with mixed and incomplete data, such as electoral predictions [14].

3.2.3. Naïve Associative Classifier

Another associative classifier that is able to deal with mixed and incomplete data is the Naïve Associative Classifier (NAC), which has also been applied to solve finance-related problems [15,26]. This classifier has the properties of being transparent and transportable [15].
The training of NAC includes the computation of the standard deviation of each numeric feature. Such values will be used below for classification. To do so, NAC uses a mixed and incomplete similarity operator to compute the similarity between patterns. This operator can deal with both continuous and categorical attribute values and incomplete information in the data. As was the case for Extended Gamma, NAC computes the average similarities to each class k l for the instance to classify p. If there is a unique maximum, the corresponding class is returned. Otherwise, ties are broken randomly.
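A sketch of this idea follows; the similarity rule for numeric features (similar when the absolute difference does not exceed the feature's training standard deviation) is our reading of the role of the stored deviations, and the exact operator in [15] may differ in detail:

```python
import numpy as np

def nac_fit(X_num, X_cat, y):
    """Training: keep the data and the std of each numeric feature."""
    return X_num, X_cat, y, X_num.std(axis=0)

def nac_predict(model, p_num, p_cat):
    X_num, X_cat, y, stds = model
    # Per-feature similarities: numeric within one std, categorical iff equal.
    sim_num = np.abs(X_num - p_num) <= stds           # (n, m_num) booleans
    sim_cat = X_cat == p_cat                          # (n, m_cat) booleans
    sim = np.hstack([sim_num, sim_cat]).mean(axis=1)  # average per instance
    # Average similarity to each class; the unique maximum wins
    # (the original NAC breaks ties randomly).
    classes = np.unique(y)
    scores = [sim[y == k].mean() for k in classes]
    return classes[int(np.argmax(scores))]
```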
NAC has several variants: NACe, which includes a procedure to disambiguate classes [16]; NACe-kVotes and NACe-kMajority [17], which use neighborhoods in the disambiguation process; and NACOD, which allows the handling of data streams [18]. NAC has shown good performance in solving finance-related problems and has been found to be robust to data imbalance [20].

3.2.4. Assisted Classification for Imbalanced Datasets

The Assisted Classification for Imbalanced Datasets (ACID) classifier was also designed [19] for mixed and incomplete classification problems. In the training phase, ACID includes an optional procedure to compute attribute weights via differential evolution. ACID also deals with class disjuncts, by performing clustering, and with class imbalance. Additionally, it uses data aggregation to diminish the influence of class overlapping and noisy or mislabeled instances [19].
In this section, we have detailed the data complexity measures most frequently explored in the literature and the associative classifiers under study. We found no works that analyzed how data complexity, beyond the analysis of data imbalance, affects the performance of the associative classifiers of the Alpha-Beta approach.

4. Results and Discussion

This section describes the datasets used in this study to obtain the complexity measures described in the previous section, as well as the performance measures obtained for the described classifiers.

4.1. Datasets

Several datasets containing credit data were used. All of these are related to some phase of the credit process: granting, promotion, and recovery. Some of the datasets were obtained from the UCI Machine Learning Repository [45]. The datasets used in the experiments are described in Table 2.
These datasets included numerical and categorical attributes as well as missing data (four are mixed and eight have missing values). They also exhibited class imbalance, with 8 of the 11 datasets having an imbalance ratio (IR) greater than 1.5. The number of attributes ranged between six and 64, whereas the number of instances ranged between 250 and 20,000. All datasets distinguish good clients from defaulting clients; that is, they contain only two classes.
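The imbalance ratio (IR) reported in Table 2 is the size of the majority class divided by the size of the minority class; a quick sketch:

```python
from collections import Counter

def imbalance_ratio(y):
    """IR = majority class size / minority class size."""
    counts = Counter(y)
    return max(counts.values()) / min(counts.values())

# Example: the Australian dataset has 383 instances of one class and 307
# of the other, so imbalance_ratio([0] * 383 + [1] * 307) gives ~1.25.
```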

4.2. Performance Measures

In the following, we address the performance measures [47] we used in this study, which have been proven to be robust for imbalanced problems [48]. The measures were based on a two-class confusion matrix, as shown in Figure 1.
Three basic performance measures for two-class problems are precision, recall (also known as sensitivity and the true positive rate), and the true negative rate. Other measures, such as the F-score, geometric mean, and the area under the ROC curve, are based on those measures. Table 3 summarizes the performance measures used in this study.
In this study, the F-score, the geometric mean, and the area under the ROC curve were considered in order to evaluate the performance of the classifiers.
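The formulas of Table 3 translate directly from the confusion matrix entries (TP, FP, FN, TN) of Figure 1; a small sketch:

```python
import math

def performance(tp, fp, fn, tn):
    """Performance measures of Table 3 from a two-class confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate (TPR)
    tnr = tn / (tn + fp)               # true negative rate
    f_score = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(precision * recall)
    auc = (recall + tnr) / 2           # two-class AUC at a single threshold
    return f_score, g_mean, auc
```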

4.3. Complexity Measures in the Selected Datasets

Table 4 shows the results of the complexity measures computed for the selected datasets. We used KEEL software [43] to obtain these measures. KEEL automatically stores all data in numeric arrays; therefore, it includes the imputation of the missing values and the conversion of the categorical values of the datasets into numeric values. In addition, KEEL uses the original definition of the F1 and F3 measures [28]; therefore, in Table 4, small values of F1 and F3 denote complex problems (an inverse relation). All remaining complexity measures are directly related to complexity; that is, small values correspond to less complex problems.
As can be seen, for measure F1 the simplest dataset was the qualitative dataset, and the most difficult ones were the Polish datasets. The F2 measure did not offer much information, since it assigned 0.0 to all datasets except the Iranian (0.64), German (0.01), and PAKDD (0.05) datasets. For measures L1 and L2, the most complex dataset was the German one. The L3 measure assigned a high complexity to the Japanese, PAKDD, and Polish datasets, all of which had values of 0.5.
Regarding measures N1, N2, and N3, again, the German dataset offered the greatest complexity. According to measure N4, the most complex datasets were Polish_year2 and Polish_year3. The T1 measure did not offer much information, since it assigned complexities equal to 1.00 for all datasets except the qualitative one. Finally, according to the T2 measure, the most complex dataset was the PAKDD set.

4.4. Performance of the Classifiers

To establish the performance of the associative classifiers under study, we used a five-fold cross-validation procedure because some datasets were imbalanced. As for performance measures, we used the AUC, F-score, and the geometric mean, as described in Section 4.2. Table 5, Table 6 and Table 7 present the performance results of each classifier according to the measures mentioned above.
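For reference, the sketch below shows the shape of such an evaluation protocol; the stratification of the folds is our assumption (the text states only that five-fold cross-validation was chosen because of the imbalance), and make_classifier and score_fn are placeholders:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def five_fold_score(make_classifier, score_fn, X, y, seed=0):
    """Average a performance measure over five cross-validation folds."""
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        clf = make_classifier()                  # fresh classifier per fold
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        scores.append(score_fn(y[test_idx], y_pred))
    return float(np.mean(scores))
```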
According to the AUC results, the best-performing classifier was EG, which performed the best in seven of the 11 datasets. The ACID and NAC classifiers showed good behavior for the Australian, Japanese, German, and qualitative datasets. Nevertheless, their performance for the remaining datasets was poor, except for the PAKDD dataset, in which ACID showed the best performance. On the other hand, CHAT-OHM demonstrated relatively good performance for the qualitative and German datasets. Equivalent results were shown for the F-score measure.
As for geometric mean, EG was the best in eight of the compared datasets. ACID remained the best in the Australian, Japanese, and PAKDD datasets. In addition, for the qualitative datasets, it displayed satisfactory results. However, the performance was poor for the remaining datasets because the classifier was biased toward one class, with geometric mean values ranging from 0.26 to 0.57.
NAC exhibited a behavior similar to ACID (except for the German dataset, in which NAC performed well), showing bias towards the majority class, with values ranging from 0.0 (Iranian) to 0.55 (Polish_year5). CHAT-OHM obtained average results for most of the compared datasets.
In summary, the ACID classifier showed a good performance for most of the databases, and we observed that it exceeded 0.5 for almost all the measures, except for some databases with a greater number of instances. On the other hand, the CHAT-OHM classifier achieved excellent classification performance for the “qualitative-bankruptcy” database. Nevertheless, in the other datasets it failed to obtain performance percentages greater than 0.7, which made it one of the classifiers with the lowest ranking average for this set of databases.
EG was another classifier with good performance (mostly exceeding percentages of 0.6), making it the classifier with the best results in this study. The EG classifier had an almost perfect classification performance for the qualitative database and relatively good performance for the Australian and Japanese databases.
Finally, in terms of the classification results, the NAC classifier, similarly to the EG, exhibited an almost perfect classification performance with the qualitative-bankruptcy database and a very good performance with the Australian and Japanese databases. However, we can also observe that it showed a certain weakness in the case of databases such as the Polish datasets, in which it classified practically an entire class incorrectly. Moreover, its performance was slightly less efficient compared to the EG classifier.

4.5. Correlation Analysis

Another aim of this study was to analyze in depth and establish the relationship between complexity measures and associative classifiers. This aspect can be investigated in detail by calculating the correlations that arise between the complexity measures of the datasets and the performance measures of the associative classifiers. To accomplish this, intensive use of Spearman's Rho coefficient was made. Spearman's Rho is a nonparametric measure of dependence in which the mean rank of the observations is calculated [49], and the differences are squared and incorporated into the formula. In other words, a rank is assigned to the observations of each variable, and the dependency relationship between two given variables is studied. The interpretation of Spearman's coefficient depends on its value: it ranges between −1 and +1, indicating negative or positive associations, respectively, and values close to zero indicate no correlation (but not independence). To carry out the correlation tests, two hypotheses were used in all cases:
H0: There is no correlation between the measures of data complexity and the performance of the analyzed classifiers.
H1: There is a correlation between the measures of data complexity and the performance of the analyzed classifiers.
We set a significance value $\alpha = 0.05$, for a 95% confidence level. Table 8 shows the correlations obtained. In these results, "cc" represents the value of the correlation coefficient, whereas "p" is the probability value of the test. If $p \le \alpha$, we reject the null hypothesis H0 and accept the alternative hypothesis H1; that is, in these scenarios, there is a correlation between the data complexity measures and the performance achieved by the classifiers. These results are shown in bold in the tables. In bold and italics are the results for which the null hypothesis was rejected at a 90% confidence level instead of 95%.
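Each cell of Table 8 comes from one such test over the 11 datasets; a sketch with SciPy, pairing the F1 column of Table 4 with the ACID column of Table 5 (both listed in the same dataset order):

```python
from scipy.stats import spearmanr

# F1 complexity values (Table 4) and ACID AUC values (Table 5), ordered as:
# Australian, German, Iranian, Japanese, Qualitative, PAKDD, Polish 1-5.
complexity = [2.28, 2.45, 0.36, 0.35, 9.51, 0.11, 0.03, 0.01, 0.04, 0.06, 0.18]
performance = [0.86, 0.57, 0.56, 0.85, 0.93, 0.77, 0.58, 0.56, 0.56, 0.54, 0.65]

cc, p = spearmanr(complexity, performance)
alpha = 0.05
if p <= alpha:
    print(f"Reject H0: cc = {cc:.3f}, p = {p:.3f}")   # significant correlation
else:
    print(f"Fail to reject H0 (p = {p:.3f})")
```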
The results obtained for the ACID classifier allowed us to reject the null hypothesis for the measures F1, N2, N4, L1, L3, and T1. The measures F1, N4, and L3 significantly correlated with the performance measures of the F-score, geometric mean, and AUC for the ACID classifier using Spearman’s rho coefficient, with 95% confidence. On the other hand, the L1 measure showed a significant correlation with the F-score performance measure, with the geometric mean with 95% confidence, and with the AUC measure with 90%. The N2 measure showed significant correlations with the F-score performance measure with 90% confidence, and T1 showed a significant correlation with the F-score performance measure, with 95% confidence.
The results obtained for the CHAT-OHM classifier allowed us to reject the null hypothesis for the measurements F1, N4, L1, L3, T1, and T2. Furthermore, the measures F1, N4, L1, L3, and T2 significantly correlated with the performance measures of the F-score, geometric mean, and AUC for the CHAT-OHM classifier using Spearman’s rho coefficient, with 95% confidence.
On the other hand, the T1 measure showed a significant correlation with the performance measures of the F-score, geometric mean, and AUC with 90% confidence.
The results obtained for the EG allowed us to reject the null hypothesis for the measures F1, F3, N4, L3, and T2. The N4 and L3 measures showed a significant correlation with the F-score, geometric mean, and AUC performance measures for the EG classifier using Spearman's rho coefficient, with 95% confidence. On the other hand, the F1 measure showed a significant correlation with the F-score performance measure with 95% confidence; the F3 measure showed a significant correlation with the geometric mean and AUC measures with 90% confidence; and finally, the T2 measure showed a significant correlation with the F-score measure with 90% confidence.
Finally, the results obtained for the NAC allowed us to reject the null hypothesis for the measures F1, N4, L1, and L3. The measures F1, N4, L1, and L3 showed a significant correlation with the performance measures of the F-score, geometric mean, and AUC for the NAC classifier using Spearman’s rho coefficient, with 95% confidence.
It is important to note that F1 showed a stronger correlation than the other feature-based complexity measures. Finding the reasons behind this behavior requires deeper research, including numerical simulations using synthetic data, to bound the factors that may intervene. However, in our opinion, this may be due to the formulation of the feature discriminant ratio: less complex problems have features that can by themselves separate the classes, whereas in complex problems the classes overlap.
Similarly, N4 and L3 showed stronger correlations than the other neighborhood-based and linearity measures, respectively. Again, further research is needed to fully understand the reason behind the high correlation of these measures. Our intuition is that N4 is highly correlated because it considers the nonlinearity of the nearest neighbor classifier, focusing on class overlapping, and that L3 likewise uses the notion of nonlinearity, for a linear classifier. In our future work, we intend to address this behavior in depth.
The experimental results indicated the existing correlations between complexity measures and the performance of the associative classifiers. The comparison was evaluated and studied, yielding significant correlation coefficients between different complexity measures and the performance of each associative classifier.
The comparative results indicate that these correlations depend entirely on the complexity of the problems to be solved and on how each associative classifier was created, which returns us to the no-free-lunch (NFL) theorem: the ACID, CHAT-OHM, and EG classifiers presented the highest correlations between performance measures and complexity measures in the selected datasets, whereas the NAC classifier presented the lowest correlation between its performance and the complexity measures.

4.6. Comparison with Other Supervised Classifiers

In this subsection, we present the comparison of the associative-based classifiers with respect to other supervised classifiers in the literature. For this purpose, we again used the area under the ROC curve.
To execute the literature algorithms, with the exception of the nearest neighbor approach, we again used the KEEL environment because it enabled the implementation of naïve Bayes (NB), multilayer perceptron with backpropagation (MLP), and support vector machines (SVM). It is important to mention that all supervised classifiers (associative–based and others) used the same partitions of the five-fold cross-validation procedure, and no algorithm was particularly tuned. No weighing scheme was performed for associative classifiers, and for the literature classifiers, we used the default parameters offered by KEEL. However, those default parameters corresponded to the best values reported in previous studies, and therefore, we can consider the algorithms from the literature to be somewhat optimized. For the nearest neighbor (NN) classifier, we used EPIC software [50,51] with the HEOM dissimilarity function [52]. For the associative-based classifiers, we also used EPIC software. Table 9 shows the results regarding the AUC for the compared algorithms. Best results for each dataset are highlighted in bold.
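For completeness, a sketch of the HEOM dissimilarity [52] used with the NN classifier: distance 1 for missing values, the overlap metric for nominal attributes, and a range-normalized difference for numeric ones, combined Euclidean-style (the attribute ranges are precomputed from the training data):

```python
import math

def heom(a, b, ranges, categorical):
    """Heterogeneous Euclidean-Overlap Metric [52] between two instances."""
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if x == "?" or y == "?":
            d = 1.0                        # missing values: maximal distance
        elif j in categorical:
            d = 0.0 if x == y else 1.0     # overlap metric for nominal values
        else:
            d = abs(x - y) / ranges[j]     # range-normalized numeric difference
        total += d * d
    return math.sqrt(total)

# Example with one numeric attribute (range 60) and one nominal attribute:
# heom([30, "owner"], [45, "tenant"], ranges={0: 60}, categorical={1})
```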
We did not include classifier ensembles such as the well-known random forest [53] in the comparison for two main reasons. First, we considered that, since associative classifiers are base (single) classifiers, it would be unfair to compare the algorithms with respect to classifier committees composed of several supervised classifiers. Second, neither KEEL nor EPIC software includes classifier committees in their selection of supervised classifiers.
The results presented in Table 9 showed that the best-performing algorithm was Extended Gamma (EG), which exhibited the best results in five of the datasets, followed by ACID and SVM (which showed the best results in two datasets). The poor results of CHAT-OHM and NN resulted from the class overlapping of the datasets, and those of NAC resulted from the lack of feature weighting. It was shown in [26] that automatic feature weighting improves the performance of NAC. In addition, the ACID classifier also includes an optional feature weighting procedure [19], which was not applied here.
Regarding the interpretability of the supervised classifiers compared in this study, in our opinion, both the Extended Gamma (EG) and the Naïve Associative Classifier (NAC) are highly interpretable [15]. ACID is also interpretable [19], although not as straightforwardly as EG and NAC. These three classifiers are transparent and transportable. On the other hand, the CHAT-OHM classifier obtains an association matrix, which is not easy to interpret. Regarding the other supervised classifiers compared, the nearest neighbor classifier is interpretable, naïve Bayes is somewhat interpretable, and MLP and SVM are not easy to interpret due to their training processes.

5. Conclusions

In this paper, we used meta-learning, in the context of ASP, to analyze the correlations that arise between the measures of complexity of the datasets and the measures of performance of the associative classifiers under study. We observed correlations between complexity measures and the performance of the associative classifiers.
The comparison was evaluated and studied, yielding significant correlation coefficients between different complexity measures and the performance of each associative classifier. The comparative results indicate that these correlations depend entirely on the complexity of the problems to be solved and on the way in which each associative classifier was created, which returns us to the no-free-lunch (NFL) theorem: the ACID, CHAT-OHM, and EG classifiers presented the highest correlations between performance measures and complexity measures in the selected datasets, whereas the NAC classifier presented the lowest correlation between its performance and the complexity measures.
In future work, we intend to explore why some complexity measures showed a stronger correlation than others for the compared associative classifiers.

Author Contributions

Conceptualization, Y.V.-R. and C.Y.-M.; methodology, O.C.-N. and I.L.-Y.; software, F.J.C.-U.; validation, Y.V.-R. and C.Y.-M.; formal analysis, Y.V.-R.; investigation, F.J.C.-U.; writing—original draft preparation, F.J.C.-U.; writing—review and editing, Y.V.-R. and O.C.-N.; visualization, I.L.-Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All datasets used in this study are available at the Machine Learning Repository of the University of California at https://archive.ics.uci.edu/ml/datasets.php (accessed on 15 June 2021), except for the Iranian dataset, which is available upon request to the authors of [46].

Acknowledgments

The authors would like to thank the Instituto Politécnico Nacional (Secretaría Académica, Comisión de Operación y Fomento de Actividades Académicas, Secretaría de Investigación y Posgrado, Centro de Innovación y Desarrollo Tecnológico en Cómputo and Centro de Investigación en Computación), the Consejo Nacional de Ciencia y Tecnología, and Sistema Nacional de Investigadores for their economic support in developing this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
  2. John, G.H.; Langley, P. Estimating continuous distributions in Bayesian classifiers. arXiv 2013, arXiv:1302.4964.
  3. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
  4. Salzberg, S.L. C4.5: Programs for Machine Learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994.
  5. Platt, J. Sequential minimal optimization: A fast algorithm for training support vector machines. MSRTR 1998, 3, 88–95.
  6. Widrow, B.; Lehr, M.A. 30 years of adaptive neural networks: Perceptron, madaline, and backpropagation. Proc. IEEE 1990, 78, 1415–1442.
  7. Yáñez-Márquez, C.; López-Yáñez, I.; Aldape-Pérez, M.; Camacho-Nieto, O.; Argüelles-Cruz, A.J.; Villuendas-Rey, Y. Theoretical foundations for the alpha-beta associative memories: 10 years of derived extensions, models, and applications. Neural Process. Lett. 2018, 48, 811–847.
  8. Ramírez-Rubio, R.; Aldape-Pérez, M.; Yáñez-Márquez, C.; López-Yáñez, I.; Camacho-Nieto, O. Pattern classification using smallest normalized difference associative memory. Pattern Recognit. Lett. 2017, 93, 104–112.
  9. Santiago-Montero, R. Hybrid Associative Pattern Classifier with Translation. Master's Thesis, Centro de Investigación en Computación, Instituto Politécnico Nacional, México City, México, 2003.
  10. Uriarte-Arcia, A.V.; López-Yáñez, I.; Yáñez-Márquez, C. One-hot vector hybrid associative classifier for medical data classification. PLoS ONE 2014, 9, e95715.
  11. López-Yáñez, I.; Argüelles-Cruz, A.J.; Camacho-Nieto, O.; Yáñez-Márquez, C. Pollutants time-series prediction using the Gamma classifier. Int. J. Comput. Intell. Syst. 2011, 4, 680–711.
  12. Ramirez, A.; Lopez, I.; Villuendas, Y.; Yanez, C. Evolutive improvement of parameters in an associative classifier. IEEE Lat. Am. Trans. 2015, 13, 1550–1555.
  13. Villuendas-Rey, Y.; Yáñez-Márquez, C.; Anton-Vargas, J.A.; López-Yáñez, I. An extension of the gamma associative classifier for dealing with hybrid data. IEEE Access 2019, 7, 64198–64205.
  14. Sonia, O.-Á.; Yenny, V.-R.; Cornelio, Y.-M.; Itzamá, L.-Y.; Oscar, C.-N. Determining electoral preferences in Mexican voters by computational intelligence algorithms. IEEE Lat. Am. Trans. 2020, 18, 704–713.
  15. Villuendas-Rey, Y.; Rey-Benguría, C.F.; Ferreira-Santiago, Á.; Camacho-Nieto, O.; Yáñez-Márquez, C. The naïve associative classifier (NAC): A novel, simple, transparent, and accurate classification model evaluated on financial data. Neurocomputing 2017, 265, 105–115.
  16. De La Vega, A.R.-D.; Villuendas-Rey, Y.; Yanez-Marquez, C.; Camacho-Nieto, O. The Naïve Associative Classifier with Epsilon Disambiguation. IEEE Access 2020, 8, 51862–51870.
  17. Camacho-Urriolagoitia, O. Intelligent Data Science Analysis for Individual Finance. Master's Thesis, Centro de Innovación y Desarrollo Tecnológico en Cómputo, Instituto Politécnico Nacional, México City, México, 2020.
  18. Villuendas-Rey, Y.; Hernández-Castaño, J.A.; Camacho-Nieto, O.; Yáñez-Márquez, C.; López-Yañez, I. NACOD: A naïve associative classifier for online data. IEEE Access 2019, 7, 117761–117767.
  19. Villuendas-Rey, Y.; Alanis-Tamez, M.D.; Rey-Benguría, C.F.; Yáñez-Márquez, C.; Nieto, O.C. Medical diagnosis of chronic diseases based on a novel computational intelligence algorithm. J. Univers. Comput. Sci. 2018, 24, 775–796.
  20. Villuendas-Rey, Y.; Yáñez-Márquez, C.; Camacho-Nieto, O.; López-Yáñez, I. Impact of imbalanced datasets preprocessing in the performance of associative classifiers. Appl. Sci. 2020, 10, 2779.
  21. López-Martín, C.; López-Yáñez, I.; Yáñez-Márquez, C. Application of Gamma classifier to development effort prediction of software projects. Appl. Math. 2012, 6, 411–418.
  22. López-Yáñez, I.; Yáñez-Márquez, C.; Camacho-Nieto, O.; Aldape-Pérez, M.; Argüelles-Cruz, A.-J. Collaborative learning in postgraduate level courses. Comput. Hum. Behav. 2015, 51, 938–944.
  23. Calvo, H.; Gelbukh, A. Improving prepositional phrase attachment disambiguation using the web as corpus. In Proceedings of the Iberoamerican Congress on Pattern Recognition, Havana, Cuba, 26–29 November 2003; pp. 604–610.
  24. López-Yáñez, I.; Sheremetov, L.; Yáñez-Márquez, C. A novel associative model for time series data mining. Pattern Recognit. Lett. 2014, 41, 23–33.
  25. Cleofas-Sánchez, L.; García, V.; Marqués, A.; Sánchez, J.S. Financial distress prediction using the hybrid associative memory with translation. Appl. Soft Comput. 2016, 44, 144–152.
  26. Serrano-Silva, Y.O.; Villuendas-Rey, Y.; Yáñez-Márquez, C. Automatic feature weighting for improving financial Decision Support Systems. Decis. Support Syst. 2018, 107, 78–87.
  27. Rice, J.R. The algorithm selection problem. In Advances in Computers; Elsevier: Amsterdam, The Netherlands, 1976; Volume 15, pp. 65–118.
  28. Ho, T.K.; Basu, M. Complexity measures of supervised classification problems. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 289–300.
  29. Bernadó-Mansilla, E.; Ho, T.K. Domain of competence of XCS classifier system in complexity measurement space. IEEE Trans. Evol. Comput. 2005, 9, 82–104.
  30. Sánchez, J.S.; Mollineda, R.A.; Sotoca, J.M. An analysis of how training data complexity affects the nearest neighbor classifiers. Pattern Anal. Appl. 2007, 10, 189–201.
  31. Luengo, J.; Herrera, F. Domains of competence of fuzzy rule based classification systems with data complexity measures: A case of study using a fuzzy hybrid genetic based machine learning method. Fuzzy Sets Syst. 2010, 161, 3–19.
  32. Luengo, J.; Herrera, F. Shared domains of competence of approximate learning models using measures of separability of classes. Inf. Sci. 2012, 185, 43–65.
  33. Flores, M.J.; Gámez, J.A.; Martínez, A.M. Domains of competence of the semi-naive Bayesian network classifiers. Inf. Sci. 2014, 260, 120–148.
  34. Luengo, J.; Herrera, F. An automatic extraction method of the domains of competence for learning classifiers using data complexity measures. Knowl. Inf. Syst. 2015, 42, 147–180.
  35. Morán-Fernández, L.; Bolón-Canedo, V.; Alonso-Betanzos, A. Can classification performance be predicted by complexity measures? A study using microarray data. Knowl. Inf. Syst. 2017, 51, 1067–1090.
  36. Barella, V.H.; Garcia, L.P.; de Souto, M.P.; Lorena, A.C.; de Carvalho, A. Data complexity measures for imbalanced classification tasks. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8.
  37. Lorena, A.C.; Garcia, L.P.; Lehmann, J.; Souto, M.C.; Ho, T.K. How complex is your classification problem? A survey on measuring classification complexity. ACM Comput. Surv. 2019, 52, 1–34.
  38. Khan, I.; Zhang, X.; Rehman, M.; Ali, R. A literature survey and empirical study of meta-learning for classifier selection. IEEE Access 2020, 8, 10262–10281.
  39. Maillo, J.; Triguero, I.; Herrera, F. Redundancy and complexity metrics for big data classification: Towards smart data. IEEE Access 2020, 8, 87918–87928.
  40. Wolpert, D.H. The supervised learning no-free-lunch theorems. In Soft Computing and Industry; Roy, R., Köppen, M., Ovaska, S., Furuhashi, T., Hoffmann, F., Eds.; Springer: London, UK, 2002.
  41. Ho, T.K.; Basu, M.; Law, M.H.C. Measures of geometrical complexity in classification problems. In Data Complexity in Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1–23.
  42. Sotoca, J.M.; Sánchez, J.; Mollineda, R.A. A review of data complexity measures and their applicability to pattern classification problems. In Actas del III Taller Nacional de Minería de Datos y Aprendizaje; TAMIDA: Granada, Spain, 2005; pp. 77–83.
  43. Triguero, I.; González, S.; Moyano, J.M.; García, S.; Alcalá-Fdez, J.; Luengo, J.; Fernández, A.; del Jesús, M.J.; Sánchez, L.; Herrera, F. KEEL 3.0: An open source software for multi-stage analysis in data mining. Int. J. Comput. Intell. Syst. 2017, 10, 1238–1249.
  44. López-Yáñez, I. Theory and Applications of the Gamma Associative Classifier. Ph.D. Thesis, Centro de Investigación en Computación, Instituto Politécnico Nacional, México City, México, 2011.
  45. Dua, D.; Graff, C. UCI Machine Learning Repository. Available online: http://archive.ics.uci.edu/ml (accessed on 15 June 2021).
  46. Sabzevari, H.; Soleymani, M.; Noorbakhsh, E. A comparison between statistical and data mining methods for credit scoring in case of limited available data. In Proceedings of the 3rd CRC Credit Scoring Conference, Edinburgh, UK, 4 November 2007; pp. 1–5.
  47. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
  48. López, V.; Fernández, A.; García, S.; Palade, V.; Herrera, F. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Inf. Sci. 2013, 250, 113–141.
  49. Spearman, C. "General Intelligence," Objectively Determined and Measured. J. Psychol. 1961, 15, 201–292.
  50. Hernández-Castaño, J.A.; Villuendas-Rey, Y.; Camacho-Nieto, O.; Yáñez-Márquez, C. Experimental platform for intelligent computing (EPIC). Comput. y Sist. 2018, 22, 245–253.
  51. Hernández-Castaño, J.A.; Villuendas-Rey, Y.; Nieto, O.C.; Rey-Benguría, C.F. A new experimentation module for the EPIC software. Res. Comput. Sci. 2018, 147, 243–252.
  52. Wilson, D.R.; Martinez, T.R. Improved heterogeneous distance functions. J. Artif. Intell. Res. 1997, 6, 1–34.
  53. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Figure 1. Confusion matrix for a two-class problem.
Table 1. Complexity measures used.

| Measure | Equation |
|---|---|
| Maximum Fisher's Discriminant Ratio (F1), complementary formulation [37] | $F1 = \frac{1}{1 + \max_{i=1}^{m} r_{f_i}}$, where $r_{f_i}$ is the discriminant ratio of feature $f_i$ |
| Volume of Overlap Region (F2) | $F2 = \prod_{i=1}^{m} \frac{\mathrm{overlap}(f_i)}{\mathrm{range}(f_i)} = \prod_{i=1}^{m} \frac{\max\{0,\ \mathrm{minmax}(f_i) - \mathrm{maxmin}(f_i)\}}{\mathrm{maxmax}(f_i) - \mathrm{minmin}(f_i)}$, where $\mathrm{minmax}(f_i) = \min(\max(f_i^{c_1}), \max(f_i^{c_2}))$, $\mathrm{maxmin}(f_i) = \max(\min(f_i^{c_1}), \min(f_i^{c_2}))$, $\mathrm{maxmax}(f_i) = \max(\max(f_i^{c_1}), \max(f_i^{c_2}))$, $\mathrm{minmin}(f_i) = \min(\min(f_i^{c_1}), \min(f_i^{c_2}))$, and $\max(f_i^{c_j})$ and $\min(f_i^{c_j})$ are the maximum and minimum values of each feature in the class $c_j \in \{1, 2\}$, respectively |
| Maximum Individual Feature Efficiency (F3), complementary formulation [37] | $F3 = \min_{i=1}^{m} \frac{n_o(f_i)}{n}$, where $n_o(f_i) = \sum_{j=1}^{n} I\left(x_{ji} > \mathrm{maxmin}(f_i) \wedge x_{ji} < \mathrm{minmax}(f_i)\right)$ |
| Sum of the Error Distance by Linear Programming (L1) | $L1 = 1 - \frac{1}{1 + SumErrorDist} = \frac{SumErrorDist}{1 + SumErrorDist}$, where $SumErrorDist = \frac{1}{n} \sum_{i=1}^{n} \varepsilon_i$; the $\varepsilon_i$ values are determined by optimizing an SVM |
| Error Rate of a Linear Classifier (L2) | $L2 = \frac{\sum_{i=1}^{n} I(h(x_i) \neq y_i)}{n}$, where $h(x)$ is the linear classifier |
| Nonlinearity of a Linear Classifier (L3) | $L3 = \frac{1}{l} \sum_{i=1}^{l} I(h_T(x'_i) \neq y'_i)$, where $h_T(x)$ is the linear classifier induced using the original dataset $T$, $l$ is the number of interpolated examples, and $x'_i$ and $y'_i$ are the interpolated examples and their corresponding labels |
| Fraction of Borderline Points (N1) | $N1 = \frac{1}{n} \sum_{i=1}^{n} I\left((x_i, x_j) \in MST \wedge y_i \neq y_j\right)$, where $MST$ is a minimum spanning tree obtained over the data |
| Ratio of Intra/Extra Class Nearest Neighbor Distance (N2) | $N2 = 1 - \frac{1}{1 + intra\_extra} = \frac{intra\_extra}{1 + intra\_extra}$, where $intra\_extra = \frac{\sum_{i=1}^{n} d(x_i,\, NN(x_i) \in y_i)}{\sum_{i=1}^{n} d(x_i,\, NN(x_i) \in y_j \neq y_i)}$ |
| Error Rate of the Nearest Neighbor Classifier (N3) | $N3 = \frac{\sum_{i=1}^{n} I(NN(x_i) \neq y_i)}{n}$ |
| Nonlinearity of the Nearest Neighbor Classifier (N4) | $N4 = \frac{1}{l} \sum_{i=1}^{l} I(NN_T(x'_i) \neq y'_i)$ |
| Fraction of Hyperspheres Covering Data (T1) | $T1 = \frac{\#Hyperspheres(T)}{n}$, where $\#Hyperspheres(T)$ is the number of hyperspheres needed to cover the dataset, considering that only points of the same class can lie inside the same hypersphere |
| Average Number of Points per Dimension (T2) | $T2 = \frac{n}{m}$ |
Table 2. Description of the datasets used.

| Dataset | Instances | Continuous Attributes | Categorical Attributes | Missing Values | Imbalance Ratio |
|---|---|---|---|---|---|
| Australian | 690 | 8 | 6 | No | 1.25 |
| German | 1000 | 7 | 13 | No | 2.33 |
| Iranian | 1002 | 28 | 0 | Yes | 19.04 |
| Japanese | 690 | 6 | 9 | Yes | 1.21 |
| Polish_year1 | 7027 | 64 | 0 | Yes | 24.93 |
| Polish_year2 | 10,173 | 64 | 0 | Yes | 24.43 |
| Polish_year3 | 10,503 | 64 | 0 | Yes | 20.22 |
| Polish_year4 | 9792 | 64 | 0 | Yes | 18.01 |
| Polish_year5 | 5910 | 64 | 0 | Yes | 13.41 |
| Qualitative | 250 | 0 | 6 | No | 1.34 |
| The PAKDD | 20,000 | 10 | 9 | Yes | 4.12 |
Table 3. Performance measures used.

| Measure | Equation |
|---|---|
| Precision | $P = \frac{TP}{TP + FP}$ |
| Recall | $R = TPR = \frac{TP}{TP + FN}$ |
| True Negative Rate | $TNR = \frac{TN}{TN + FP}$ |
| F-score | $F = \frac{2PR}{P + R}$ |
| Geometric Mean | $GM = \sqrt{P \cdot R}$ |
| Area under the ROC curve | $AUC = \frac{TPR + TNR}{2}$ |
Table 4. Complexity measure values for the datasets used.

| Dataset | F1 | F2 | F3 | L1 | L2 | L3 | N1 | N2 | N3 | N4 | T1 | T2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Australian | 2.28 | 0.00 | 0.03 | 0.30 | 0.14 | 0.14 | 0.32 | 0.56 | 0.20 | 0.16 | 1.00 | 39.43 |
| German | 2.45 | 0.01 | 0.03 | 0.28 | 0.14 | 0.13 | 0.32 | 0.57 | 0.20 | 0.17 | 1.00 | 36.48 |
| Iranian | 0.36 | 0.64 | 0.01 | 0.83 | 0.24 | 0.33 | 0.46 | 0.85 | 0.31 | 0.30 | 1.00 | 40.00 |
| Japanese | 0.35 | 0.00 | 0.21 | 0.10 | 0.05 | 0.50 | 0.12 | 0.31 | 0.07 | 0.37 | 1.00 | 28.57 |
| Qualitative | 9.51 | 0.00 | 0.78 | 0.60 | 0.00 | 0.00 | 0.03 | 0.10 | 0.01 | 0.00 | 0.95 | 33.33 |
| The PAKDD | 0.11 | 0.05 | 0.00 | 0.39 | 0.20 | 0.50 | 0.26 | 0.15 | 0.06 | 0.45 | 1.00 | 842.11 |
| Polish_year1 | 0.03 | 0.00 | 0.03 | 0.08 | 0.04 | 0.50 | 0.09 | 0.57 | 0.06 | 0.45 | 1.00 | 87.84 |
| Polish_year2 | 0.01 | 0.00 | 0.02 | 0.08 | 0.04 | 0.50 | 0.11 | 0.63 | 0.08 | 0.47 | 1.00 | 127.16 |
| Polish_year3 | 0.04 | 0.00 | 0.03 | 0.09 | 0.05 | 0.50 | 0.12 | 0.64 | 0.08 | 0.47 | 1.00 | 131.29 |
| Polish_year4 | 0.06 | 0.00 | 0.02 | 0.11 | 0.05 | 0.50 | 0.13 | 0.63 | 0.09 | 0.45 | 1.00 | 122.40 |
| Polish_year5 | 0.18 | 0.00 | 0.04 | 0.14 | 0.07 | 0.50 | 0.15 | 0.71 | 0.10 | 0.39 | 1.00 | 73.88 |
Table 5. Area under the ROC curve of the compared classifiers. The best results are indicated in bold.

| Dataset | ACID | CHAT-OHM | EG | NAC |
|---|---|---|---|---|
| Australian | **0.86** | 0.59 | 0.84 | 0.83 |
| Japanese | **0.85** | 0.66 | 0.82 | 0.83 |
| German | 0.57 | 0.64 | 0.68 | **0.69** |
| Iranian | 0.56 | 0.64 | **0.68** | 0.50 |
| Qualitative | 0.93 | 0.95 | **0.99** | **0.99** |
| The PAKDD | **0.77** | 0.58 | 0.61 | 0.61 |
| Polish_year1 | 0.58 | 0.52 | **0.76** | 0.50 |
| Polish_year2 | 0.56 | 0.50 | **0.71** | 0.50 |
| Polish_year3 | 0.56 | 0.53 | **0.74** | 0.50 |
| Polish_year4 | 0.54 | 0.55 | **0.75** | 0.53 |
| Polish_year5 | 0.65 | 0.62 | **0.79** | 0.62 |
Table 6. F-score of the compared classifiers. The best results are indicated in bold.

| Dataset | ACID | CHAT-OHM | EG | NAC |
|---|---|---|---|---|
| Australian | **0.86** | 0.59 | 0.84 | 0.83 |
| Japanese | **0.86** | 0.66 | 0.82 | 0.84 |
| German | 0.58 | 0.63 | **0.67** | **0.67** |
| Iranian | 0.60 | 0.58 | **0.63** | 0.49 |
| Qualitative | 0.93 | 0.95 | **0.99** | **0.99** |
| The PAKDD | **0.78** | 0.57 | 0.59 | 0.59 |
| Polish_year1 | 0.58 | 0.51 | **0.64** | 0.50 |
| Polish_year2 | 0.55 | 0.50 | **0.62** | 0.53 |
| Polish_year3 | 0.57 | 0.52 | **0.64** | 0.51 |
| Polish_year4 | 0.55 | 0.53 | **0.64** | 0.54 |
| Polish_year5 | 0.66 | 0.57 | **0.68** | 0.61 |
Table 7. Geometric mean of the compared classifiers. The best results are indicated in bold.

| Dataset | ACID | CHAT-OHM | EG | NAC |
|---|---|---|---|---|
| Australian | **0.86** | 0.59 | 0.83 | 0.83 |
| Japanese | **0.85** | 0.65 | 0.82 | 0.83 |
| German | 0.49 | 0.63 | **0.68** | **0.68** |
| Iranian | 0.26 | **0.63** | **0.63** | 0.00 |
| Qualitative | 0.93 | 0.95 | **0.99** | **0.99** |
| The PAKDD | **0.75** | 0.57 | 0.61 | 0.61 |
| Polish_year1 | 0.42 | 0.51 | **0.76** | 0.05 |
| Polish_year2 | 0.39 | 0.50 | **0.71** | 0.04 |
| Polish_year3 | 0.39 | 0.53 | **0.74** | 0.10 |
| Polish_year4 | 0.32 | 0.55 | **0.75** | 0.20 |
| Polish_year5 | 0.57 | 0.61 | **0.79** | 0.55 |
Table 8. Correlation analysis of the data complexity measures and the performance measures. For each complexity measure, "cc" is Spearman's correlation coefficient and "p" is the probability value of the test; column groups correspond to the F-score (F), geometric mean (GM), and AUC performance measures.

| Complexity Measure | Rho | ACID (F) | CHAT-OHM (F) | EG (F) | NAC (F) | ACID (GM) | CHAT-OHM (GM) | EG (GM) | NAC (GM) | ACID (AUC) | CHAT-OHM (AUC) | EG (AUC) | NAC (AUC) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | cc | 0.864 | 0.991 | 0.764 | 0.782 | 0.682 | 0.945 | 0.482 | 0.800 | 0.655 | 0.936 | 0.482 | 0.800 |
| F1 | p | 0.001 | 0.000 | 0.006 | 0.004 | 0.021 | 0.000 | 0.133 | 0.003 | 0.029 | 0.000 | 0.133 | 0.003 |
| F2 | cc | 0.282 | 0.391 | −0.018 | 0.227 | 0.164 | 0.364 | −0.364 | 0.291 | 0.109 | 0.327 | −0.364 | 0.245 |
| F2 | p | 0.401 | 0.235 | 0.958 | 0.502 | 0.631 | 0.272 | 0.272 | 0.385 | 0.750 | 0.326 | 0.272 | 0.467 |
| F3 | cc | 0.418 | 0.336 | 0.436 | 0.055 | 0.173 | 0.400 | 0.564 | 0.027 | 0.291 | 0.473 | 0.564 | 0.064 |
| F3 | p | 0.201 | 0.312 | 0.180 | 0.873 | 0.612 | 0.223 | 0.071 | 0.937 | 0.385 | 0.142 | 0.071 | 0.853 |
| N1 | cc | 0.236 | 0.364 | 0.218 | 0.409 | 0.273 | 0.282 | −0.109 | 0.427 | 0.173 | 0.218 | −0.109 | 0.427 |
| N1 | p | 0.484 | 0.272 | 0.519 | 0.212 | 0.417 | 0.401 | 0.750 | 0.190 | 0.612 | 0.519 | 0.750 | 0.190 |
| N2 | cc | −0.564 | −0.246 | 0.036 | −0.100 | −0.345 | −0.200 | −0.127 | −0.173 | −0.445 | −0.273 | −0.127 | −0.082 |
| N2 | p | 0.071 | 0.467 | 0.915 | 0.770 | 0.298 | 0.555 | 0.709 | 0.612 | 0.170 | 0.417 | 0.709 | 0.811 |
| N3 | cc | −0.036 | 0.246 | 0.409 | 0.309 | 0.055 | 0.173 | 0.145 | 0.245 | −0.045 | 0.109 | 0.145 | 0.318 |
| N3 | p | 0.915 | 0.467 | 0.212 | 0.355 | 0.873 | 0.612 | 0.670 | 0.467 | 0.894 | 0.750 | 0.670 | 0.340 |
| N4 | cc | −0.809 | −0.909 | −0.818 | −0.691 | −0.645 | −0.845 | −0.618 | −0.709 | −0.645 | −0.836 | −0.618 | −0.700 |
| N4 | p | 0.003 | 0.000 | 0.002 | 0.019 | 0.032 | 0.001 | 0.043 | 0.015 | 0.032 | 0.001 | 0.043 | 0.016 |
| L1 | cc | 0.682 | 0.809 | 0.509 | 0.800 | 0.673 | 0.755 | 0.136 | 0.827 | 0.573 | 0.691 | 0.136 | 0.809 |
| L1 | p | 0.021 | 0.003 | 0.110 | 0.003 | 0.023 | 0.007 | 0.689 | 0.002 | 0.066 | 0.019 | 0.689 | 0.003 |
| L2 | cc | 0.236 | 0.327 | 0.045 | 0.327 | 0.218 | 0.273 | −0.291 | 0.355 | 0.127 | 0.218 | −0.291 | 0.336 |
| L2 | p | 0.484 | 0.326 | 0.894 | 0.326 | 0.519 | 0.417 | 0.385 | 0.285 | 0.709 | 0.519 | 0.385 | 0.312 |
| L3 | cc | −0.727 | −0.853 | −0.811 | −0.863 | −0.769 | −0.748 | −0.642 | −0.863 | −0.727 | −0.705 | −0.642 | −0.863 |
| L3 | p | 0.011 | 0.001 | 0.002 | 0.001 | 0.006 | 0.008 | 0.033 | 0.001 | 0.011 | 0.015 | 0.033 | 0.001 |
| T1 | cc | −0.670 | −0.563 | −0.326 | −0.251 | −0.363 | −0.530 | −0.381 | −0.326 | −0.451 | −0.600 | −0.381 | −0.251 |
| T1 | p | 0.024 | 0.071 | 0.328 | 0.456 | 0.273 | 0.093 | 0.247 | 0.328 | 0.164 | 0.051 | 0.247 | 0.456 |
| T2 | cc | −0.573 | −0.764 | −0.600 | −0.355 | −0.255 | −0.773 | −0.436 | −0.318 | −0.300 | −0.809 | −0.436 | −0.345 |
| T2 | p | 0.066 | 0.006 | 0.051 | 0.285 | 0.450 | 0.005 | 0.180 | 0.340 | 0.370 | 0.003 | 0.180 | 0.298 |
Table 9. Area under the ROC curve for associative-based and other supervised classifiers. The best results for each dataset are highlighted in bold.

| Dataset | ACID | CHAT-OHM | EG | NAC | NB | NN | MLP | SVM |
|---|---|---|---|---|---|---|---|---|
| Australian | **0.86** | 0.59 | 0.84 | 0.83 | 0.85 | 0.71 | 0.85 | 0.85 |
| Japanese | 0.85 | 0.66 | 0.82 | 0.83 | **0.86** | 0.72 | **0.86** | **0.86** |
| German | 0.57 | 0.64 | 0.68 | **0.69** | 0.68 | 0.52 | 0.53 | 0.66 |
| Iranian | 0.56 | 0.64 | **0.68** | 0.50 | 0.61 | 0.64 | 0.58 | 0.50 |
| Polish_year1 | 0.58 | 0.52 | **0.76** | 0.50 | 0.61 | 0.54 | 0.50 | 0.50 |
| Polish_year2 | 0.56 | 0.50 | **0.71** | 0.50 | 0.55 | 0.50 | 0.50 | 0.50 |
| Polish_year3 | 0.56 | 0.53 | **0.74** | 0.50 | 0.61 | 0.52 | 0.50 | 0.50 |
| Polish_year4 | 0.54 | 0.55 | **0.75** | 0.53 | 0.63 | 0.54 | 0.50 | 0.50 |
| Polish_year5 | 0.65 | 0.62 | **0.79** | 0.62 | 0.76 | 0.55 | 0.53 | 0.50 |
| Qualitative | 0.93 | 0.95 | 0.99 | 0.99 | 0.98 | **1.00** | 0.99 | **1.00** |
| The PAKDD | **0.77** | 0.58 | 0.61 | 0.61 | 0.51 | 0.53 | 0.52 | 0.50 |
| Times Best | 2 | 0 | 6 | 1 | 1 | 0 | 1 | 2 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
