Article

Discarding–Recovering and Co-Evolution Mechanisms Based Evolutionary Algorithm for Hyperspectral Feature Selection

1 School of Computer Science, Hubei University of Technology, Wuhan 430068, China
2 Institute of Geological Survey, China University of Geosciences, Wuhan 430074, China
3 School of Geosciences, Yangtze University, Wuhan 430100, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(15), 3788; https://0-doi-org.brum.beds.ac.uk/10.3390/rs15153788
Submission received: 9 June 2023 / Revised: 17 July 2023 / Accepted: 27 July 2023 / Published: 30 July 2023

Abstract

With the improvement of spectral resolution, the redundant information in hyperspectral imaging (HSI) datasets brings computational, analytical, and storage complexities. Feature selection is a combinatorial optimization problem that selects a subset of feasible features to reduce the dimensionality of the data and suppress noise. In recent years, evolutionary algorithms (EAs) have been widely used in feature selection, but the population often lacks diversity among agents, which leads to premature convergence. In this paper, a feature selection method based on discarding–recovering and co-evolution mechanisms is proposed with the aim of obtaining an effective feature combination in HSI datasets. The feature discarding mechanism is introduced to remove redundant information by roughly filtering the feature space. To further enhance the agents’ diversity, reliable information interaction is designed into the co-evolution mechanism, and if stagnation is detected, a subset of discarded features is recovered using adaptive weights. Experimental results demonstrate that the proposed method performs well on three public datasets, achieving overall accuracies of 92.07%, 92.36%, and 98.01%, respectively, while selecting only 15–25% of the total features.

1. Introduction

The advancement of hyperspectral remote sensing has led to its widespread use in scanning continuous, narrow spectral bands, as it enables the acquisition of information on the reflection or radiation spectrum of objects at various wavelengths [1,2,3]. The digital number (DN) or reflectance value of each band is taken as its feature value and represented in a feature vector. However, hyperspectral imaging (HSI), which covers the visible and infrared portions of the electromagnetic spectrum, collects a large amount of redundant information, resulting in high dimensionality [4]. In essence, data dimensionality reduction helps to trim the redundancy and noise [5] and improves classification accuracy, which has become an important topic in the processing of HSI datasets.
Generally, there are two approaches to removing redundancy from a dataset: feature extraction and feature selection. Feature extraction involves the linear or nonlinear transformation of the original high-dimensional features, such as combining different features into a new feature set [6], where the features lose their original physical meaning. Feature selection involves selecting the most representative feature combination from the dataset; it detects representative features and removes redundant information or noise from the data, which improves classification accuracy and enhances comprehensibility [7]. Because the features produced by feature extraction are difficult to interpret, feature selection is widely used in the processing of HSI datasets.
There are three feature selection strategies based on the search rule, namely, filter, wrapper, and embedded [8]. The filter strategy analyzes each feature using a proxy measure [9] and selects a combination with a specified number of features based on the score ranking. However, the score only reflects the correlation with the labels and ignores feature interactivity: a feature with a low correlation to the labels may, in combination with others, provide a greater performance improvement than one with a high correlation to the labels. The wrapper strategy wraps the feature selection process around a learning model (agent) to identify an appropriate combination of features. However, this strategy requires repeated evaluation of feature combinations, resulting in high computational complexity and inadequate generalization ability [10]. The embedded strategy selects features during the learning process [11,12] and incorporates feature selection into training, avoiding the overfitting that may occur in other strategies by adjusting the weights of features. The embedded strategy is usually combined with an iterative search, where the weights are used to guide the next iteration.
Evolutionary algorithms (EAs), which mimic the adaptation and survival of the fittest observed in living organisms in nature, use the heuristic information gathered during the search as guidance; the genetic material of promising combinations is assembled to create new offspring, and the process is repeated over many generations to allow the population to evolve towards better solutions [13]. Traditional EAs update the genetic material through mutation and crossover operations, which are then passed to the next generation. However, these algorithms only consider the stochasticity between agents, not the similarity between them, which leads to premature convergence and overfitting [14,15]. To address these limitations, distance-based EAs have been proposed; these algorithms calculate the distance between agents to determine their similarity and select some of them for crossover and mutation based on this similarity [16,17,18,19,20], thereby helping to maintain diversity in the population and prevent premature convergence. Nonetheless, due to the absence of competition or collaboration, the information interaction between agents is insufficient, making it difficult for the EA to overcome local optima and leading to stagnation in the iterative process.
The co-evolution mechanism is a means of enhancing information interaction. Due to its robustness, this mechanism has received extensive attention and has been widely used in various fields, including natural language processing and image retrieval [21,22]. Combined with an EA, the co-evolution mechanism improves the search efficiency of the EA in feature selection to some extent [23,24,25]. It divides the original feature set into many subsets; subpopulations are formed from the agents generated by these subsets, and the mechanism then enhances diversity through information interaction between the agents in different subpopulations. However, the current information interaction only exchanges weakly representative solution encodings, leading to low agent diversity and subpopulation imbalance, where some agents obtain better combinations after searching than others most of the time. Therefore, a co-evolution mechanism with prominent reliability needs further investigation to fully realize its potential.
In this paper, a feature selection method based on discarding–recovering and co-evolution mechanisms is proposed to obtain a reduced feature combination of the HSI datasets with adequate accuracy. The feature discarding mechanism is introduced to filter out redundant features from the original dataset. Moreover, the co-evolution mechanism is combined with an EA to enhance the diversity of agents, and a reliable information interaction is used to enable collaborative search between agents and help the EA jump out of local optima. To avoid the erroneous discarding of interactive features that have a low correlation with the labels and to improve the generalization ability, feature recovery is introduced to raise the selection probability of discarded features. The purpose of this work is to propose a feature selection method that selects an effective feature combination and decreases the redundant information in HSI datasets. The co-evolution mechanism is utilized to promote the subpopulations of the EA consistently. Moreover, the feature discarding and recovering mechanisms are used to avoid meaningless searching and enhance the generalization ability. The main contributions of this work are listed as follows:
(1)
The discarding–recovering mechanism is designed to enhance the generalization ability and decrease the computational load, which filters the original feature space and recovers some features into the population.
(2)
The co-evolution mechanism is combined with EA, which divides two subpopulations to co-evolve and utilizes reliable information interaction to enhance the diversity of agents in subpopulations.
(3)
A feature selection method based on discarding–recovering and co-evolution mechanisms is proposed to obtain an effective feature combination, which has a prominent performance in HSI datasets.
The rest of this paper is structured as follows: Section 2 provides the background information; Section 3 details the proposed feature selection method; Section 4 presents the experimental results from different perspectives; Section 5 discusses the proposed method; and Section 6 outlines the conclusions.

2. Related Work

2.1. The Feature Selection Method Based on Distance-Based EA

The feature selection method based on distance-based EA has received much attention for its effectiveness in data dimensionality reduction, as it iteratively uses heuristic information to guide the next iteration. Wu et al. [26] developed the particle swarm optimizer (PSO) to reduce the dimensionality of the HSI dataset, where a chaotic sequence was used to initialize the feature space, helping PSO jump out of local optima. Su et al. [27] proposed a novel feature selection method based on the improved firefly algorithm (FA), which largely outperformed the conventional covariance method. Xie et al. [28] proposed a comprehensive feature selection method based on the artificial bee colony algorithm (ABC) and subspace division, achieving prominent overall classification accuracy (OA) while reducing a small amount of redundant information. Wang et al. [29] presented an optimized feature selection method based on the grey wolf optimizer (GWO) for the HSI dataset, which uses adaptive weights to regulate the balance between optimal individuals and a chaos operation to set the correlative parameters. Tschannerl et al. [30] proposed an unsupervised feature selection method based on information theory and a modified discrete gravitational search algorithm (GSA), obtaining a more informative subset of features. However, as data dimensionality increases, the ability of EAs to further reduce dimensionality gradually decreases because the agents become monotonous; the selected feature combinations remain redundant to some extent, and distinguishing between similar classes becomes difficult.

2.2. The Co-Evolution Mechanism of Feature Selection

The co-evolution mechanism uses the “divide and conquer” approach to divide the population, identify the current optimal subsets in the feature space, and eventually join them together into a global subset. Song et al. [31] proposed an adaptive subpopulation size adjustment mechanism based on co-evolution and a feature-importance-oriented spatial partition strategy, decreasing the particle evaluation time and providing a competitive solution for the feature selection of high-dimensional data. Zhao et al. [32] proposed a multiple-population co-evolution mechanism and a multi-stage orthogonal learning (OL) mechanism to fully search the prospective features in the stagnant state and increase the possibility of jumping out of local optima. Zhou et al. [33] proposed a feature selection method based on a cooperative co-evolution mechanism (CC-DFS). This method used a heterogeneous model to search for feature combinations with cut-off points and feature combinations without cut-off points, resulting in improved performance and generalization ability. Rashid et al. [34] proposed a feature selection method based on a cooperative co-evolution mechanism and random feature grouping (CC-RFG). Three ways were introduced to decompose the feature set dynamically to ensure that interactive features were divided into the same subpopulation. However, the above co-evolution mechanisms for feature selection only exchange weakly representative feature combinations, making it difficult to regulate those features.

2.3. Motivation

To tackle the problems that EAs face in data dimensionality reduction due to the large feature space and redundant information, preliminary filtering of the original feature set is required, which helps to decrease the redundant information in the dataset. To further enhance the performance and effectiveness of the EA, it is important to speed up the search process, increase the diversity of agents, and facilitate effective information interaction to improve the quality of the selected features.
Regarding the co-evolution mechanism, when agents from different subpopulations interact, they exchange information that is likely to improve the OA or decrease the number of selected features searched by agents. However, if weak features are not considered, an imbalance problem arises. To overcome these limitations, it is necessary to increase the probability of selecting the weak features and to promote diverse information interaction between agents. In this way, the co-evolution mechanism achieves a balanced and effective optimization process, leading to prominent results on HSI datasets.
In summary, to improve search efficiency, it is necessary to remove the redundant features in the original feature set while recovering some of these features when update stagnation is detected. Additionally, the co-evolution mechanism is introduced to enhance the diversity of agents in the corresponding subpopulations, given that interaction with diverse information is required to maintain the balance between subpopulations. All these measures help improve the performance and stability of agents in feature selection, making them more effective for real-world applications.

3. The Proposed Method

There is a certain amount of redundant information in the HSI dataset, and the performance of EAs in reducing the data dimensionality still has room for improvement. As a result, a feature discarding mechanism is implemented that uses measure criteria to roughly filter the feature space, and the co-evolution mechanism is utilized to divide the population and perform reliable information interaction between agents to enhance the generalization ability. During the iteration process, if a stagnation phenomenon is detected, it is likely caused by the previous erroneous discarding of interactive features, so the feature recovering mechanism is triggered to increase the selection probability of weak features through adaptive weights, and some of them are recovered into the subpopulations.

3.1. The Feature Discarding Mechanism

Given the high proportion of redundant features in the original dataset, removing them on a large scale is necessary. This eliminates the need for a thorough analysis of each feature and allows a fast return of selected features. The evaluation measure for each feature is defined as follows:
$S_n = \sum_{K=1}^{n} X_n^K, \qquad n = 1, 2, 3, \ldots, m$ (1)
$X_n^K = 2\theta_k^{T} H \theta_k - \theta_k^{T} H^{(t)} \theta_k$ (2)
where $H \in \mathbb{R}^{n \times n}$ with $H_{i,j} = \mathrm{kernel}(x_i, x_j)$, and $H^{(t)}_{i,j} = \mathrm{kernel}(x_i^{(t)}, x_j^{(t)})$, where the superscript $(t)$ indicates that the $t$-th feature is discarded. Note that $\mathrm{kernel}(x_i, x_j)$ is the kernel function mapped to a high-dimensional space, and $\theta$ represents the optimized parameters obtained from the SVM-based classifier [35].
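To make the scoring concrete, the following is a minimal sketch of a per-feature score in the spirit of Equations (1) and (2), assuming an RBF kernel, a dual-coefficient vector obtained from a trained SVM, and the difference reading of Equation (2); the function and parameter names (feature_scores, theta, gamma) are illustrative assumptions rather than the paper's implementation.

```python
# Illustrative sketch only: scores each band by how much removing it changes
# the kernel alignment theta^T H theta (difference reading of Eq. (2)).
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def feature_scores(X, theta, gamma=1.0):
    """X: (n_samples, n_bands); theta: (n_samples,) SVM dual parameters."""
    H = rbf_kernel(X, X, gamma=gamma)              # full kernel matrix H
    base = theta @ H @ theta                       # theta^T H theta
    scores = np.empty(X.shape[1])
    for t in range(X.shape[1]):
        X_t = np.delete(X, t, axis=1)              # drop the t-th band
        H_t = rbf_kernel(X_t, X_t, gamma=gamma)    # kernel without band t
        scores[t] = base - theta @ H_t @ theta     # change caused by removing band t
    return scores                                  # low scores are discarding candidates
```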
The feature discarding mechanism, which is based on forward filtering and reverse learning, is implemented to obtain the ranking of feature scores using Equations (1) and (2), drops the specified number of features, and recovers groups of features with low score ranking through reverse learning [36]. In addition, to improve the generalization ability, the feature discarding mechanism calculates the compromise value of recovery groups [37]. The mathematical model is defined as follows:
$U_i = \max_{1 \le k \le m} w_k \dfrac{o_k^* - x_{ik}}{o_k^* - o_k}, \qquad R_i = \sum_{k=1}^{m} w_k \dfrac{o_k^* - x_{ik}}{o_k^* - o_k}$ (3)
$Q_i = v \dfrac{R_i - R^*}{R^- - R^*} + (1 - v) \dfrac{U_i - U^*}{U^- - U^*}$ (4)
where $R_i$ and $U_i$ denote the utility measure and the regret measure over the $m$ features, respectively, $w$ is the weight vector, $o_k^*$ is the maximum value of the $k$-th feature of the decision matrix, and $o_k$ is the minimum value of the $k$-th feature of the decision matrix. $R^*$ and $U^*$ are the maximum values of $R$ and $U$, respectively, and $R^-$ and $U^-$ are the minimum values of $R$ and $U$, respectively. $Q_i$ represents the compromise value for each sample. The feature discarding mechanism obtains the compromise value of the feature groups using Equations (3) and (4); the group with the smallest compromise value is selected as the original feature set.
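As a concrete illustration, the following sketch computes a compromise value for candidate recovery groups in the manner of Equations (3) and (4), following the standard VIKOR scheme; the normalization convention and the names (compromise_values, decision_matrix, weights, v) are assumptions for illustration rather than the paper's exact implementation.

```python
# Illustrative VIKOR-style compromise ranking over candidate feature groups.
import numpy as np

def compromise_values(decision_matrix, weights, v=0.5):
    """decision_matrix: (n_groups, n_criteria); weights: (n_criteria,)."""
    best = decision_matrix.max(axis=0)                        # o_k^* (maximum per criterion)
    worst = decision_matrix.min(axis=0)                       # o_k   (minimum per criterion)
    gap = weights * (best - decision_matrix) / (best - worst + 1e-12)
    R = gap.sum(axis=1)                                       # utility measure (sum)
    U = gap.max(axis=1)                                       # regret measure (max)
    norm = lambda a: (a - a.min()) / (a.max() - a.min() + 1e-12)
    return v * norm(R) + (1 - v) * norm(U)                    # compromise value Q_i

# The candidate group with the smallest Q would be kept as the filtered feature set:
# best_group = int(np.argmin(compromise_values(D, w)))
```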

3.2. The EA-Based Co-Evolution Mechanism

After feature discarding, the original feature set still contains a high proportion of redundant features, necessitating a further reduction. The EA-based co-evolution mechanism can effectively search the remaining features. Specifically, it divides the population into subpopulations and uses information interaction to achieve a balance between them.

3.2.1. The Population Division Based on Feature Correlation

Generally, the population division involves partitioning the original feature set into multiple clusters (i.e., feature subsets) and initializing the agents generated in subpopulations based on these clusters. In addition, agents only search for features within their corresponding subsets while obtaining the rest via information interaction. Ideally, the population division considers the correlation between features, or between features and labels, so as to minimize the correlation between features and maximize the correlation between features and labels [38]. However, when interactive features are partitioned into different subsets, subpopulations may fall into local traps that are not local optima of the original feature set but rather local optima resulting from the incorrect division. Therefore, the population division should ensure that the feature subsets corresponding to subpopulations are sufficiently different and that interactive features are partitioned together as much as possible, with the correlation between features taken into account.
Furthermore, generating many subsets requires an equal number of subpopulations to match them, leading to a large computation load. Additionally, interactive features may be divided into different subsets, resulting in premature convergence. To minimize the redundant features of the entire dataset, the population division decomposes the original feature set into two subsets, generating agents to form subpopulations within them. Figure 1 shows an example of the population division. The original feature set is partitioned according to the correlation between features; assuming that it has $m$ features waiting for selection, two subsets are formed after the population division, and the number of features in each is $q$. To maintain the balance between subpopulations, $q$ is equal to $\lfloor m/2 \rfloor$, where $\lfloor \cdot \rfloor$ is the integer-value (floor) function.
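A minimal sketch of such a two-way division is given below; the greedy pairing rule that keeps highly correlated features in the same subset is an illustrative assumption rather than the paper's exact procedure, and the function name divide_features is hypothetical.

```python
# Illustrative two-way feature division: strongly correlated bands are kept
# together, and the two subsets stay close to floor(m/2) features each.
import numpy as np

def divide_features(X):
    corr = np.abs(np.corrcoef(X, rowvar=False))          # |correlation| between bands
    np.fill_diagonal(corr, 0.0)
    subset1, subset2, assigned = [], [], set()
    for f in np.argsort(-corr.max(axis=1)):              # visit strongly correlated bands first
        if f in assigned:
            continue
        group = subset1 if len(subset1) <= len(subset2) else subset2
        for g in (int(f), int(np.argmax(corr[f]))):      # the band and its closest partner
            if g not in assigned:
                group.append(g)
                assigned.add(g)
    return subset1, subset2
```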

3.2.2. The Reliable Information Interaction

The agents in different subpopulations are designed to exchange the information in parallel to facilitate interaction. If a feature does not belong to the current feature subset, it is searched with a probability of 0. Moreover, subpopulations should be provided with representative information to keep balance. Features with unsatisfactory scores may be the result of not finding other interactive features [39]. With the representative information, the features’ performance will be boosted. The representative information is defined as the best and worst combinations searched by agents in the corresponding subpopulation, and one of them is selected as the interaction object to enhance the reliability of the co-evolution mechanism. Figure 2 illustrates the reliable information interaction between subpopulations. It can be seen that during the interaction, each subpopulation receives representative information from the other. The agent then combines this information to make an overall evaluation after conducting a search. By following this process, features will be fully searched to obtain a prominent classification accuracy through the support vector machine (SVM)-based classifier on the testing set.
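A sketch of this interaction is given below, assuming each subpopulation exposes its best and worst feature combinations as boolean masks and that a receiving agent keeps a shared combination only if it improves its own fitness; the data layout and the names (interact, evaluate, mask) are illustrative assumptions, not the paper's implementation.

```python
# Illustrative reliable information interaction between two subpopulations.
import random

def interact(subpop_a, subpop_b, evaluate):
    """Each subpop: {'best': mask, 'worst': mask, 'agents': [{'mask', 'fitness'}, ...]}."""
    for sender, receiver in ((subpop_a, subpop_b), (subpop_b, subpop_a)):
        shared = random.choice([sender['best'], sender['worst']])   # representative information
        for agent in receiver['agents']:
            candidate = agent['mask'] | shared        # complete the agent's partial combination
            score = evaluate(candidate)               # e.g., SVM accuracy on the testing set
            if score > agent['fitness']:
                agent['mask'], agent['fitness'] = candidate, score
```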
In subpopulations, the position of the next iteration of agents, $X(iter+1)$, is updated based on the distance $d_{IS}$ between the current agent’s position $X(iter)$ and the optimal position. Here, the positions of the current agents are updated based on the optimal agent obtained using Equation (5).
$X(iter+1) = X(iter) - L \cdot d_{IS}$ (5)
where $d_{IS}$ refers to the distance between the current agent and the global optimum, while $L$ denotes the social status of the optimum, with a random value selected from the range [−1, 1].
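The update can be sketched as follows, assuming continuous agent positions that are later thresholded into a feature mask; the function and variable names (update_position, x_best, rng) are illustrative assumptions.

```python
# Illustrative distance-based position update following Eq. (5).
import numpy as np

def update_position(x, x_best, rng=None):
    rng = rng or np.random.default_rng()
    d_is = np.abs(x_best - x)          # distance to the current global optimum
    L = rng.uniform(-1.0, 1.0)         # social status factor, random in [-1, 1]
    return x - L * d_is                # X(iter + 1) = X(iter) - L * d_IS
```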

3.3. The Feature Recovering Mechanism

After the feature discarding, only the features with a high ranking are selected, and the interactive features are not considered, which may lead to stagnation [40]. Consequently, the performance of the feature subsets searched by agents may fall into local optima. Moreover, the EA generally operates within the original feature set, and it is difficult to recycle the discarded features. To address these limitations, the feature recovering mechanism is applied to incorporate recycled discarded features into a recovery subset, thereby increasing their probability of selection. The feature recovering mechanism has two stages. The first stage is reverse learning, which increases the probability of selecting features with a low score ranking. The second stage is triggered if training stagnation is detected, indicating that the subpopulation has not improved over successive iterations; in this case, some of the discarded features are recycled, which allows agents to fully search these features later.
More attention should be paid to the features with low scores in the evaluative measures when recovering features. However, the low score does not necessarily mean that corresponding features should be simply discarded. Therefore, the lower-ranked features will receive higher weights. Assuming the dimension of input data is   m , the calculation for weight is described below:
$W_m = 1 - \dfrac{S_m}{\sum_{i=1}^{m} S_i^m}$ (6)
where $W_m$ denotes the feature weight, $S_m$ represents the feature score set obtained from feature discarding, and $S_i^m$ represents the $i$-th feature score. Higher weights obtained through Equation (6) are assigned to weak features, thus increasing their chances of being selected. As illustrated in Figure 3, after the features recovered through weighted screening are added to the corresponding subpopulation’s feature subset, a new feature space is generated for agents.
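A short sketch of this weighted recovery is given below; the number of recovered features k, the sampling rule, and the function name recover_features are illustrative assumptions rather than the paper's exact procedure.

```python
# Illustrative weighted recovery of discarded features using Eq. (6).
import numpy as np

def recover_features(discarded, scores, k, rng=None):
    """discarded: indices of dropped bands; scores: their (non-negative) Eq. (1) scores."""
    rng = rng or np.random.default_rng()
    weights = 1.0 - scores / scores.sum()            # Eq. (6): lower score -> higher weight
    p = weights / weights.sum()                      # normalize into sampling probabilities
    k = min(k, len(discarded))
    return rng.choice(discarded, size=k, replace=False, p=p)
```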

3.4. The Objective Function

The main target of feature selection is to obtain a representative feature combination from the original feature set that maximizes the OA [41], which is an important evaluation criterion; however, decreasing the number of selected features is also a crucial target of feature selection. In this paper, the objective function is used to evaluate the feature combination searched by agents [42]; it is described in Equation (7).
$fitness = \alpha \cdot OA + (1 - \alpha) \cdot \lg \dfrac{n_c}{n_s}$ (7)
where $fitness$ represents the fitness value of the feature combination searched by agents, and $OA$ represents the overall classification accuracy obtained by the SVM. Note that $n_c$ and $n_s$ are the total number of features in the dataset and the number of selected features, respectively. $\alpha$ is a weight factor balancing the $OA$ against the number of selected features; $\alpha = 0.9$ is used in this paper.
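For reference, the following is a minimal sketch of this objective, training an SVM on the selected bands with alpha = 0.9; scikit-learn's SVC is used as an assumed stand-in for the paper's SVM-based classifier, and the function name fitness and the boolean-mask encoding are illustrative.

```python
# Illustrative fitness of Eq. (7): alpha * OA + (1 - alpha) * lg(n_c / n_s).
import numpy as np
from sklearn.svm import SVC

def fitness(mask, X_train, y_train, X_test, y_test, alpha=0.9):
    """mask: boolean vector over all bands (True = selected)."""
    n_c, n_s = mask.size, int(mask.sum())
    if n_s == 0:
        return 0.0                                     # no band selected -> worst case
    clf = SVC().fit(X_train[:, mask], y_train)
    oa = clf.score(X_test[:, mask], y_test)            # overall classification accuracy
    return alpha * oa + (1 - alpha) * np.log10(n_c / n_s)
```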

3.5. Implementation of the Proposed Method

The proposed feature selection method updates the agent based on distance, and its key process involves the information interaction between agents. Moreover, in the event of stalling, it recycles some of the discarded features, thereby improving the probability of features with low score ranking. The proposed feature selection method is described as follows (Algorithm 1):
Algorithm 1: Discarding–recovering and co-evolution mechanisms for HSI feature selection
Input: the n × m dataset D, the agent size Agesize, the number of feature groups M by reverse learning, and the maximum number of iterations Maxiter
Output: the effective feature combination selected by agents
  •  Undergo the feature discarding process through the feature discarding mechanism and obtain the SS and DS using Equations (1)–(4) by reverse learning M feature groups
  •  for i in SS do
  •     W_i^m ← 1 − S_i^m / Σ_{k=1}^{m} S_k^m
  •  end for
  •  The SS is divided into two subsets SS1 and SS2 based on the correlation
  •  Selectgroup1 ← SS1, Selectgroup2 ← SS2
  •  Two subpopulations are generated in Selectgroup1 and Selectgroup2 and Agesize agents are obtained
  •  t ← 0
  •  while t < Maxiter do
  •     Update the location of each agent by Equation (5)
  •     Update the fitness value of each agent by Equation (7)
  •     if the optimal solution has been updated then
  •        Exchange the information through interaction
  •     end if
  •     if one of the subpopulations has stalled then
  •        h ← Recover(subset)
  •        SS ← Add(SS, h), DS ← Sub(DS, h), Selectgroup1,2 ← Add(Selectgroup1,2, h), S ← Add(S, h)
  •     end if
  •     t ← t + 1
  •  end while
  •  return the OA of the effective feature combination
In the beginning, feature evaluation is performed on all features to discard those with a low score ranking, resulting in a selected set (SS) of features and a discarded set (DS) of features. The SS is then divided into two subsets: Selectgroup1 and Selectgroup2. Representative information is exchanged between these subsets when the optimal agent is updated. Moreover, if the stagnation phenomenon is detected, the feature recovering mechanism is triggered to recycle a certain number of discarded features into a recovery subset based on the adaptive weight W. These features are added to the SS and removed from the DS. The iteration process continues until the maximum number of iterations is reached.

4. Experimental Results

The proposed feature selection method is implemented using Python 3.8 on a personal computer with a 2.30 GHz CPU, 8.00 GB RAM, and the Windows 8 operating system. To evaluate the performance of the proposed method, three HSI datasets, namely KSC (176 bands), Salinas (204 bands), and Longkou (270 bands), are used in the study. The experimental results are compared with EA-based, co-evolution-based, and other feature selection methods, and each experiment is independently repeated 30 times, with 50 iterations per run.

4.1. Dataset Description

The first dataset was acquired by NASA at the Kennedy Space Center (KSC) in Florida. It was obtained from a distance of approximately 20 km and contained 224 bands with a spatial resolution of 18 m. After removing bands with water absorbance and low signal-to-noise ratio, 176 bands were used for verification. The image consists of 512 × 614 pixels.
The second HSI dataset, named Salinas, was obtained by an AVIRIS sensor in the Salinas Valley of California, USA. It consists of 204 bands with a spatial resolution of 3.7 m and a pixel size of 512 × 217. The spectral range of the dataset spans from 0.4 to 2.5 μm, and the spectral resolution is 10 nanometers.
The third dataset was obtained in Longkou Town, Jingzhou City, Hubei Province, China, and includes six classes in an agrarian context. The UAV flew at an altitude of 500 m, and the spatial resolution of the airborne hyperspectral image is approximately 0.463 m. The image size is 550 × 400, with 270 bands ranging from 400 to 1000 nm.
The class names and corresponding sample numbers of three HSI datasets are described in Table 1. The image scene and ground truth of them are shown in Figure 4.

4.2. Parameters Setting of EAs

Before running, some parameters of the EAs should be set for the heuristic search. The performance of the effective feature combination depends on the parameter settings to some extent. In this paper, several EA-based feature selection methods, including PSO [43], FA [44], GWO [45], and GSA [46], are adopted to provide an intuitive performance comparison with the proposed method. Table 2 lists the parameter settings of these EAs.

4.3. Experiments for the Search Ability

Table 3 presents the OA and Kappa coefficient after 30 independent runs, while WTL is the win/tie/loss indicator of the fitness value. Table 4 shows the number of selected features (Num) and CPU time (Time) over the 30 independent runs. To demonstrate the prominent OA of the effective feature combination achieved in each iteration, the average number of features and OA are obtained for each iteration, as shown in Figure 5, and the fitness value is shown in Figure 6.
According to Table 3, the proposed method outperforms PSO, FA, GWO, and GSA in search capability, achieving a prominent OA that surpasses PSO, FA, GWO, and GSA by 1.1%, 1.81%, 1.15%, and 1.36%, respectively. These experimental results exhibit the superior search ability of the proposed method. Moreover, it enhances the development potential of local search by using the feature recovery mechanism. Its winning frequency is higher than 27, and in Longkou it even reaches 30. These results demonstrate the prominent stability of the proposed method and its superior exploration ability.
According to Table 4, the proposed method exhibits significantly higher reduction efficiency than the other EA-based feature selection methods. Specifically, it selects less than 20% of the features from the HSI datasets, resulting in the selection of only 42 features out of a total of 176 bands in the KSC dataset while achieving a prominent OA. On the Salinas dataset, it selects approximately half as many features as GSA yet achieves a better OA. On average, the other methods select 58.7 features, whereas the proposed method selects only 43.1 features, indicating superior performance. In addition, the feature discarding mechanism substantially reduces redundant features, thereby shrinking the feature space and reducing the computation time, especially on the Longkou dataset.
As shown in Figure 5, after feature discarding, the number of selected features searched by agents is still high; the number of features decreases after the heuristic search, while the OA shows little to no fluctuation, demonstrating the prominent stability of the proposed method. Furthermore, the feature recovering mechanism effectively updates agents before the iteration ends, indicating that the proposed feature selection method possesses a prominent ability to escape from local optima. According to Figure 6, the fitness value is visualized to comprehensively evaluate the searching ability of each algorithm; the proposed feature selection method achieves promising results on the three HSI datasets and ranks 1st in terms of average fitness value, followed by GWO, FA, PSO, and GSA. Moreover, the proposed method achieves the optimal fitness value on the three HSI datasets compared with the other EA-based methods, proving that it has a prominent search ability for feature selection.

4.4. Comparison with Other Feature Selection Methods

To assess the impact on each class, some feature selection methods for HSI datasets are compared in the experiment: minimum redundancy maximum relevance (MRMR) [47], joint mutual information with class correlation (JOMIC) [48], joint mutual information maximization (JMIM) [49], conditional mutual information maximization (CMIM) [50], and shallow-to-deep feature enhancement (SDFE) [51]. The experiments are performed on 10% to 25% of the total features. The accuracy for each class and the Kappa coefficient are shown in Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 14, Table 15 and Table 16.

4.4.1. The Result of the KSC Dataset

Table 5, Table 6, Table 7 and Table 8 show the OA and Kappa coefficients for the KSC dataset using 10–25% of the total number of features.
Based on Table 5, Table 6, Table 7 and Table 8, it is concluded that the proposed feature selection method outperforms MRMR, JOMIC, JMIM, CMIM, and SDFE in terms of the OA for different numbers of features, with an improvement of over 0.7%. Furthermore, when using 20% of the total number of features, the Kappa coefficient reaches 0.9, demonstrating that its predictions are highly consistent with the labels. For 25% of the total number of features, the other feature selection methods have an OA below 91.2%, while the proposed method achieves an OA and Kappa coefficient exceeding 92.8% and 0.916, respectively. Moreover, the proposed method achieves an OA of over 97% for five classes, with Willow swamp, Cattail marsh, and Mud flats even reaching 98%. In summary, it is a practical feature selection method for the KSC dataset.

4.4.2. The Result of the Salinas Dataset

Table 9, Table 10, Table 11 and Table 12 present the OA and Kappa coefficients for the Salinas datasets using a fixed number of features.
The experimental results demonstrate that the proposed method outperforms other commonly used feature selection methods, achieving an OA of over 92% while selecting less than 20% of the total number of features. Moreover, the Kappa coefficient for 25% of the total number of features is 0.2 higher than that of the other methods, and the OA is higher for each class, with an OA of over 96% for all 14 classes. Notably, the samples of Brocoli_green_weeds_1 are all correctly identified. These results indicate that the proposed method achieves a prominent OA and Kappa coefficient for each class of the Salinas dataset, demonstrating its superiority.

4.4.3. The Result of the Longkou Dataset

Table 13, Table 14, Table 15 and Table 16 present the OA and Kappa coefficients for the Longkou datasets using a fixed number of features.
It is evident from Table 13, Table 14, Table 15 and Table 16 that the proposed method obtains prominent OA and Kappa coefficients for the Longkou dataset, and it maintains a clear advantage in classification with a small number of features. In the experimental comparison using 10% of the total number of features, MRMR, JOMIC, and SDFE achieve an OA below 97%, while the proposed method achieves an OA as high as 97.1%, which is 1.6%, 1.1%, and 0.2% higher than MRMR, JOMIC, and SDFE, respectively. The OA of JMIM and CMIM is lower than 89%. The Kappa coefficient also demonstrates an overall advantage for the proposed method. These results indicate that it is a robust and feasible feature selection method for the Longkou dataset.

5. Discussion

5.1. Design Analysis of the Proposed Method

EA is an effective strategy for obtaining a feature combination of HSI datasets with a preferable OA in a limited time: the OA obtained on the three HSI datasets exceeds 90%, and in some cases even reaches 98%. However, it is prone to stagnation during iteration due to the insufficient interactivity of agents. Co-evolution is a prominent mechanism for improving the agents’ diversity: the original feature set is divided into subsets, and agents generated from these subsets form subpopulations. Moreover, information interaction exchanges the optimal feature combination searched by agents to maintain the balance of subpopulations, but solely exchanging the optimal feature combination reduces the selection probability of interactive features. The proposed method incorporates reliable information interaction and a series of feature-oriented mechanisms to address this. The trajectory of the OA over iterations indicates the stability of the proposed method: the OA decreases by less than 0.5% as the feature space condenses, and the computational time is reduced by an average of 15%.
The proposed method achieves a prominent OA in most of the classes and even reaches 100% for Brocoli_green_weeds_1 in the Salinas dataset and Water in the Longkou dataset. Although it is lower than other feature selection methods in a few classes, the difference is not apparent in the classes with small sample sizes. Although other methods based on measure criteria stand out in terms of efficiency, it is difficult for them to distinguish interactive features as the number of instances increases. Feature discarding is an effective mechanism for eliminating redundant information and reducing the computational load. As with other feature selection methods, the OA is negatively impacted by improper discarding. To counterbalance this effect, the feature recovering mechanism is employed to improve the generalization ability while maintaining the OA at a high level. Experimental results indicate that the OA of the proposed method surpasses other feature selection methods by an average of 3%, and important features are adequately restored by the feature recovery mechanism, thereby improving the performance and reliability of the proposed method.

5.2. Discussion for the Training Size

In Section 4.1, three HSI datasets, namely, KSC, Salinas, and Longkou, are introduced to validate the performance of the proposed method. The OA of the effective feature combination and the computation time are influenced by the size of the training set. Because of the small-sample learning properties of these datasets, several tests are conducted with the proportion of the training set ranging from 5% to 25% to determine its appropriate size. The change curves of the number of features and the OA for different training sets are shown in Figure 7.
The experimental results indicate that the increasing size of the training set from 5% to 10% leads to a significant improvement in the OA. However, further increasing the proportion from 10% to 25% only results in a minimal increase, while the computation time also decreases to some extent. Additionally, the number of selected features does not show significant fluctuations, so the size of the training set is designated as 10%. This size strikes a balance between the OA and computational load, making it a practical and effective choice for feature selection in HSI datasets.

5.3. Comparison with Other Co-Evolution Mechanisms

To verify the search efficiency of the co-evolution mechanism in the proposed method, it is compared with other co-evolution mechanisms named CC-DFS [33] and CC-RFG [34]; the average fitness value of each iteration on three HSI datasets is shown in Figure 8.
At the beginning, the fitness value of the proposed method is higher than that of CC-DFS and CC-RFG on the three HSI datasets, which demonstrates that the feature discarding mechanism effectively removes redundant features. With further iterations, the fitness trajectories of CC-DFS and CC-RFG gradually stabilize, while that of the proposed method keeps rising. This indicates that the co-evolution mechanism enhances the search efficiency of agents and suggests a prominent ability to escape from local optima. As a result, the reliable co-evolution mechanism effectively exchanges more representative information, largely avoiding the occurrence of stagnation.

6. Conclusions

A feature selection method based on discarding–recovering and co-evolution mechanisms is proposed in this study with the aim of obtaining effective feature combinations in HSI datasets. According to the experimental results, the proposed method outperforms other EA-based feature selection methods, including PSO, FA, GWO, and GSA, in terms of optimization ability and search speed in the feature space. It achieves a prominent OA with a small number of selected features, outperforming other feature selection methods in this regard, and exhibits satisfactory stability. In addition, the comparison with other co-evolution mechanisms shows that the reliable co-evolution mechanism exchanges more representative information between agents, allowing them to improve continuously, as reflected in the fitness trajectory. The performance limitations caused by feature discarding are mitigated through the recovery of dropped features, which guarantees the generalization ability and decreases the computational load.
Furthermore, the proposed method outperforms MRMR, JOMIC, JMIM, CMIM, and SDFE in terms of the OA with varying numbers of features, and the reliable information interaction ensures a more balanced learning process, which maintains a positive balance between classification accuracy and the number of selected features, making it a suitable choice for feature selection. In future studies, more representative criteria will be synthesized into the information interaction to further improve the diversity of agents. Moreover, it would be interesting to use feature clustering to perform the population division and further avoid population imbalance.

Author Contributions

Conceptualization, B.L. and M.W.; methodology, B.L. and Y.L.; software, M.W. and Y.L.; validation, B.L. and W.L.; formal analysis, Y.L.; investigation, B.L. and M.W.; resources, B.L. and X.G.; writing—original draft preparation, B.L.; writing—review and editing, B.L. and M.W.; visualization, X.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China under Grant No. 41901296 and the Key Laboratory for National Geographic Census and Monitoring, National Administration of Surveying, Mapping and Geoinformation under Grant No. 2018NGCM06.

Data Availability Statement

The datasets presented in this paper can be obtained through https://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gwon, Y.; Kim, D.; You, H.; Nam, S.-H.; Kim, Y.D. A Standardized Procedure to Build a Spectral Library for Hazardous Chemicals Mixed in River Flow Using Hyperspectral Image. Remote Sens. 2023, 15, 477. [Google Scholar] [CrossRef]
  2. Liu, J.; Li, Y.; Zhao, F.; Liu, Y. Hyperspectral Remote Sensing Images Feature Extraction Based on Spectral Fractional Differentiation. Remote Sens. 2023, 15, 2879. [Google Scholar] [CrossRef]
  3. Wei, X.; Xiao, J.; Gong, Y. Blind Hyperspectral Image Denoising with Degradation Information Learning. Remote Sens. 2023, 15, 490. [Google Scholar] [CrossRef]
  4. Wang, J.; Mao, X.; Wang, Y.; Tao, X.; Chu, J.; Li, Q. Automatic generation of pathological benchmark dataset from hyperspectral images of double stained tissues. Opt. Laser Technol. 2023, 163, 109331. [Google Scholar] [CrossRef]
  5. Tang, C.; Liu, X.; Li, M.; Wang, P.; Chen, J.; Wang, L.; Li, W. Robust unsupervised feature selection via dual self-representation and manifold regularization. Knowl. Based Syst. 2018, 145, 109–120. [Google Scholar] [CrossRef]
  6. Ruan, W.; Sun, L. Robust latent discriminative adaptive graph preserving learning for image feature extraction. Knowl. Based Syst. 2023, 268, 110487. [Google Scholar] [CrossRef]
  7. Ba, J.; Wang, P.; Yang, X.; Yu, H.; Yu, D. Glee: A granularity filter for feature selection. Eng. Appl. Artif. Intell. 2023, 122, 106080. [Google Scholar] [CrossRef]
  8. Ma, W.; Zhou, X.; Zhu, H.; Li, L.; Jiao, L. A two-stage hybrid ant colony optimization for high-dimensional feature selection. Pattern Recognit. 2021, 116, 107933. [Google Scholar] [CrossRef]
  9. Cekik, R.; Uysal, A.K. A novel filter feature selection method using rough set for short text data. Expert Syst. Appl. 2020, 160, 113691. [Google Scholar] [CrossRef]
  10. Cilia, N.D.; D’alessandro, T.; De Stefano, C.; Fontanella, F.; di Freca, A.S. Comparing filter and wrapper approaches for feature selection in handwritten character recognition. Pattern Recognit. Lett. 2023, 168, 39–46. [Google Scholar] [CrossRef]
  11. Deng, T.; Huang, Y.; Yang, G.; Wang, C. Pointwise mutual information sparsely embedded feature selection. Int. J. Approx. Reason. 2022, 151, 251–270. [Google Scholar] [CrossRef]
  12. Paja, W. Generational Feature Elimination to Find All Relevant Feature Subset. Syst. Technol. 2017, 72, 140–148. [Google Scholar] [CrossRef]
  13. Aranha, C.; Villalón, C.L.C.; Campelo, F.; Dorigo, M.; Ruiz, R.; Sevaux, M.; Sörensen, K.; Stützle, T. Metaphor-based metaheuristics, a call for action: The elephant in the room. Swarm Intell. 2022, 16, 1–6. [Google Scholar] [CrossRef]
  14. Qin, Y.; Li, Z.; Ding, J.; Zhao, F.; Meng, M. Automatic optimization model of transmission line based on GIS and genetic algo-rithm. Array 2023, 17, 100266. [Google Scholar] [CrossRef]
  15. Zheng, K.; Zhang, Q.; Peng, L.; Zeng, S. Adaptive memetic differential evolution-back propagation-fuzzy neural network algo-rithm for robot control. Inf. Sci. 2023, 637, 118940. [Google Scholar] [CrossRef]
  16. Ong, P.; Zainuddin, Z. An optimized wavelet neural networks using cuckoo search algorithm for function approximation and chaotic time series prediction. Decis. Anal. J. 2023, 6, 100188. [Google Scholar] [CrossRef]
  17. Qu, L.; He, W.; Li, J.; Zhang, H.; Yang, C.; Xie, B. Explicit and size-adaptive PSO-based feature selection for classification. Swarm Evol. Comput. 2023, 77, 101249. [Google Scholar] [CrossRef]
  18. Su, F.; Duan, C.; Wang, R. Analysis and improvement of GSA’s optimization process. Appl. Soft Comput. 2021, 107, 107367. [Google Scholar] [CrossRef]
  19. Al-Tashi, Q.; Md Rais, H.; Abdulkadir, S.-J.; Mirjalili, S.; Alhussian, H. A Review of Grey Wolf Optimizer-Based Feature Se-lection Methods for Classification. Evol. Mach. Learn. Tech. Algorithms Intell. Syst. 2017, 12, 273–286. [Google Scholar]
  20. Chary, V.; Rosalina, K. Analysis of transmission line modeling routines by using offsets measured least squares regression ant lion optimizer. ORPD and ELD problems. Heliyon 2023, 9, 13387. [Google Scholar] [CrossRef]
  21. Bezginov, A.; Clark, G.W.; Charlebois, R.L.; Dar, V.-U.; Tillier, E.R. Coevolution Reveals a Network of Human Proteins Originating with Multicellularity. Mol. Biol. Evol. 2013, 30, 332–346. [Google Scholar] [CrossRef] [Green Version]
  22. Qi, S.; Wang, R.; Zhang, T.; Dong, N. Cooperative coevolutionary competition swarm optimizer with perturbation for high-dimensional multi-objective optimization. Inf. Sci. 2023, 644, 119253. [Google Scholar] [CrossRef]
  23. Zhong, R.; Zhang, E.; Munetomo, M. Cooperative coevolutionary differential evolution with linkage measurement minimization for large-scale optimization problems in noisy environments. Complex Intell. Syst. 2023, 573, 1–18. [Google Scholar] [CrossRef]
  24. Tian, J.; Li, M.; Chen, F. Dual-population based coevolutionary algorithm for designing RBFNN with feature selection. Expert Syst. Appl. 2010, 37, 6904–6918. [Google Scholar] [CrossRef]
  25. Too, J.; Abdullah, A.R.; Saad, N.M. A New Co-Evolution Binary Particle Swarm Optimization with Multiple Inertia Weight Strategy for Feature Selection. Informatics 2019, 6, 21. [Google Scholar] [CrossRef] [Green Version]
  26. Wu, Y.; Xue, W.; Xu, L.; Guo, X.; Xue, D.; Yao, Y.; Zhao, S.; Li, N. Optimized least-squares support vector machine for predicting aero-optic imaging deviation based on chaotic particle swarm optimization. Optik 2020, 206, 163215. [Google Scholar] [CrossRef]
  27. Su, H.; Li, Q.; Du, P. Hyperspectral band selection using firefly algorithm. In Proceedings of the 2014 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, Lausanne, Switzerland, 24-27 June 2014; pp. 1–4. [Google Scholar]
  28. Xie, F.; Li, F.; Lei, C.; Yang, J.; Zhang, Y. Unsupervised band selection based on artificial bee colony algorithm for hyperspectral image classification. Appl. Soft Comput. 2019, 75, 428–440. [Google Scholar] [CrossRef]
  29. Wang, M.; Liu, W.; Chen, M.; Huang, X.; Han, W. A band selection approach based on a modified gray wolf optimizer and weight updating of bands for hyperspectral image. Appl. Soft Comput. 2021, 112, 107805. [Google Scholar] [CrossRef]
  30. Tschannerl, J.; Ren, J.; Yuen, P.; Sun, G.; Zhao, H.; Yang, Z.; Wang, Z.; Marshall, S. MIMR-DGSA: Unsupervised hyperspectral band selection based on information theory and a modified discrete gravitational search algorithm. Inf. Fusion 2019, 51, 189–200. [Google Scholar] [CrossRef] [Green Version]
  31. Song, X.-F.; Zhang, Y.; Guo, Y.-N.; Sun, X.-Y.; Wang, Y.-L. Variable-Size Cooperative Coevolutionary Particle Swarm Optimization for Feature Selection on High-Dimensional Data. IEEE Trans. Evol. Comput. 2020, 24, 882–895. [Google Scholar] [CrossRef]
  32. Zhao, F.; Bao, H.; Wang, L.; Cao, J.; Tang, J. A multipopulation cooperative coevolutionary whale optimization algorithm with a two-stage orthogonal learning mechanism. Knowl. Based Syst. 2022, 246, 108664. [Google Scholar] [CrossRef]
  33. Zhou, Y.; Kang, J.; Zhang, X. A Cooperative Coevolutionary Approach to Discretization-Based Feature Selection for High-Dimensional Data. Entropy 2020, 22, 613. [Google Scholar] [CrossRef] [PubMed]
  34. Rashid, A.N.M.B.; Ahmed, M.; Sikos, L.F.; Haskell-Dowland, P. Cooperative co-evolution for feature selection in Big Data with random feature grouping. J. Big Data 2020, 7, 107. [Google Scholar] [CrossRef]
  35. Fernández, D.; Adermann, E.; Pizzolato, M.; Pechenkin, R.; Rodríguez, C.G.; Taravat, A. Comparative Analysis of Machine Learning Algorithms for Soil Erosion Modelling Based on Remotely Sensed Data. Remote Sens. 2023, 15, 482. [Google Scholar] [CrossRef]
  36. Guo, Y.; Zhang, Z.; Tang, F. Feature selection with kernelized multi-class support vector machine. Pattern Recognit. 2021, 117, 107988. [Google Scholar] [CrossRef]
  37. Chen, T.-Y. An evolved VIKOR method for multiple-criteria compromise ranking modeling under T-spherical fuzzy uncertainty. Adv. Eng. Inform. 2022, 54, 101802. [Google Scholar] [CrossRef]
  38. Yan, X.; Jia, M. Intelligent fault diagnosis of rotating machinery using improved multiscale dispersion entropy and mRMR feature selection. Knowl. Based Syst. 2019, 163, 450–471. [Google Scholar] [CrossRef]
  39. Maldonado, J.; Riff, M.C.; Neveu, B. A review of recent approaches on wrapper feature selection for intrusion detection. Expert Syst. Appl. 2022, 198, 116822. [Google Scholar] [CrossRef]
  40. Liu, Z.; Yang, J.; Wang, L.; Chang, Y. A novel relation aware wrapper method for feature selection. Pattern Recognit. 2023, 140, 109566. [Google Scholar] [CrossRef]
  41. Maha, N.; Ghaith, M.; Ouajdi, K. Advances in nature-inspired metaheuristic optimization for feature selection problem: A comprehensive survey. Comput. Sci. Rev. 2023, 49, 100559. [Google Scholar]
  42. Zhuang, Z.; Pan, J.-S.; Li, J.; Chu, S.-C. Parallel binary arithmetic optimization algorithm and its application for feature selection. Knowl. Based Syst. 2023, 275, 110640. [Google Scholar] [CrossRef]
  43. Du, W.; Ma, J.; Yin, W. Orderly charging strategy of electric vehicle based on improved PSO algorithm. Energy 2023, 271, 127088. [Google Scholar] [CrossRef]
  44. Kumar, V.; Kumar, D. A Systematic Review on Firefly Algorithm: Past, Present, and Future. Arch. Comput. Methods Eng. 2021, 28, 3269–3291. [Google Scholar] [CrossRef]
  45. Achom, A.; Das, R.; Pakray, P. An improved Fuzzy based GWO algorithm for predicting the potential host receptor of COVID-19 infection. Comput. Biol. Med. 2022, 151, 106050. [Google Scholar] [CrossRef] [PubMed]
  46. Biabani, F.; Shojaee, S.; Hamzehei-Javaran, S. A new insight into metaheuristic optimization method using a hybrid of PSO, GSA, and GWO. Structures 2022, 44, 1168–1189. [Google Scholar] [CrossRef]
  47. Esmaeili, A.; Hamidi, J.K.; Mousavi, A. Determination of sublevel stoping layout using a network flow algorithm and the MRMR classification system. Resour. Policy 2023, 80, 103265. [Google Scholar] [CrossRef]
  48. Robindro, K.; Clinton, U.B.; Hoque, N.; Bhattacharyya, D.K. JoMIC: A joint MI-based filter feature selection method. J. Comput. Math. Data Sci. 2023, 6, 100075. [Google Scholar] [CrossRef]
  49. Kumar, C.; Chatterjee, S.; Oommen, T.; Guha, A. Automated lithological mapping by integrating spectral enhancement tech-niques and machine learning algorithms using AVIRIS-NG hyperspectral data in Gold-bearing granite-greenstone rocks in Hutti, India. Int. J. Appl. Earth Obs. Geoinf. 2020, 86, 102006. [Google Scholar]
  50. Souza, F.; Premebida, C.; Araújo, R. High-order conditional mutual information maximization for dealing with high-order dependencies in feature selection. Pattern Recognit. 2022, 131, 108895. [Google Scholar] [CrossRef]
  51. Zhou, L.; Ma, X.; Wang, X.; Hao, S.; Ye, Y.; Zhao, K. Shallow-to-Deep Spatial–Spectral Feature Enhancement for Hyperspectral Image Classification. Remote Sens. 2023, 15, 261. [Google Scholar] [CrossRef]
Figure 1. The process of population division.
Figure 2. The process of reliable information interaction between subpopulations.
Figure 3. The process of feature recovery.
Figure 4. Image scene and ground truth of three HSI datasets: (a) KSC image (b) KSC ground truth (c) Salinas image (d) Salinas ground truth (e) Longkou image (f) Longkou ground truth.
Figure 5. The trajectory of the OA and number of features on three HSI datasets: (a) KSC, (b) Salinas, (c) Longkou.
Figure 6. The fitness values of the EA-based and the proposed feature selection method on three HSI datasets. The bar chart shows the fitness values on three HSI datasets, and the curve shows the mean fitness value.
Figure 7. The average number of selected features and the OA under the different training sizes of three HSI datasets: (a) KSC, (b) Salinas, (c) Longkou.
Figure 8. The average fitness value of different co-evolution mechanisms on three HSI datasets: (a) KSC, (b) Salinas, (c) Longkou.
Table 1. The land-cover classes of three HSI datasets.

| Class Number | Class Name (KSC) | Sample Number | Class Name (Salinas) | Sample Number | Class Name (Longkou) | Sample Number |
| 1 | Scrub | 761 | Brocoli_green_weeds_1 | 2009 | Corn | 34,511 |
| 2 | Willow swamp | 243 | Brocoli_green_weeds_2 | 3726 | Cotton | 8374 |
| 3 | Cabbage palm hammock | 256 | Fallow | 1976 | Sesame | 3031 |
| 4 | Cabbage palm/Oak hammock | 252 | Fallow_rough plow | 1394 | Broad-leaf soybean | 63,212 |
| 5 | Slash pine | 161 | Fallow_smooth | 2678 | Narrow-leaf soybean | 4151 |
| 6 | Oak/Broadleaf hammock | 229 | Stubble | 3959 | Rice | 11,854 |
| 7 | Hardwood swamp | 105 | Celery | 3579 | Water | 67,056 |
| 8 | Graminoid marsh | 431 | Grapes_Untrained | 11,271 | Roads and houses | 7124 |
| 9 | Spartina marsh | 520 | Soil vineyard develop | 6203 | Mixed weed | 5229 |
| 10 | Cattail marsh | 404 | Corn_senesced_green_weeds | 3278 | - | - |
| 11 | Salt marsh | 419 | Lettuce_romaine 4 wk | 1068 | - | - |
| 12 | Mud flats | 503 | Lettuce_romaine 5 wk | 1927 | - | - |
| 13 | Water | 927 | Lettuce_romaine 6 wk | 916 | - | - |
| 14 | - | - | Lettuce_romaine 7 wk | 1070 | - | - |
| 15 | - | - | Vinyard_untrained | 7268 | - | - |
| 16 | - | - | Vinyard_vertical_trellis | 1087 | - | - |
| - | Total | 5211 | Total | 54,129 | Total | 204,542 |
Table 2. Parameter settings of each algorithm.
Parameters | Values
Size of agents | 15
Dimension | Number of features
Number of iterations per algorithm | 50
Acceleration constants c1, c2 in PSO | 2
Min–max inertia weight ω_min, ω_max in PSO | 0.2, 0.9
Light intensity absorption coefficient I in FA | 1
Step factor α in FA | 0.97
Min–max attraction β_min, β_max in FA | 0.2, 1
Correlation coefficient α in GWO | [2, 0]
Initial universal gravitational constant in GSA | 100
Number of subpopulations in the co-evolution mechanism | 2
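For reproduction, the settings in Table 2 can be collected into a single configuration object. The dictionary below simply mirrors the table; the name EA_SETTINGS and keys such as "a_range" and "G0" are illustrative choices, not code from the paper.

```python
# Settings transcribed from Table 2; the layout is only one way to organize them.
EA_SETTINGS = {
    "common": {
        "population_size": 15,
        "dimension": "number of spectral bands",
        "iterations": 50,
    },
    "PSO": {"c1": 2, "c2": 2, "omega_min": 0.2, "omega_max": 0.9},
    "FA": {"absorption": 1, "step_factor": 0.97, "beta_min": 0.2, "beta_max": 1},
    "GWO": {"a_range": [2, 0]},          # linearly decreased over the iterations
    "GSA": {"G0": 100},                  # initial universal gravitational constant
    "co_evolution": {"num_subpopulations": 2},
}
```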
Table 3. The OA and Kappa coefficient of EA-based feature selection methods.
Datasets | Metrics | PSO | FA | GWO | GSA | Proposed
KSC | OA (%) | 90.97 ± 0.33 | 90.26 ± 0.24 | 90.92 ± 0.31 | 90.71 ± 0.38 | 92.07 ± 0.12
KSC | Kappa | 0.904 ± 0.002 | 0.887 ± 0.0016 | 0.897 ± 0.001 | 0.892 ± 0.002 | 0.917 ± 0.001
Salinas | OA (%) | 91.77 ± 0.15 | 91.63 ± 0.23 | 91.65 ± 0.17 | 91.82 ± 0.26 | 92.36 ± 0.21
Salinas | Kappa | 0.907 ± 0.001 | 0.908 ± 0.0014 | 0.905 ± 0.001 | 0.909 ± 0.0018 | 0.915 ± 0.0013
Longkou | OA (%) | 97.36 ± 0.11 | 96.89 ± 0.27 | 97.12 ± 0.41 | 97.74 ± 0.33 | 98.01 ± 0.14
Longkou | Kappa | 0.964 ± 0.0002 | 0.956 ± 0.001 | 0.960 ± 0.0024 | 0.969 ± 0.006 | 0.979 ± 0.002
W/T/L: KSC +++ 28/0/2; Salinas ++ 27/1/2; Longkou +++ 30/0/0.
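Table 3 reports overall accuracy (OA) and the Kappa coefficient, both of which follow directly from the classifier's confusion matrix. The short sketch below uses the standard definitions so the reported numbers are unambiguous; the helper name oa_and_kappa is ours.

```python
import numpy as np

def oa_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    po = np.trace(confusion) / total                              # observed agreement (OA)
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total**2  # chance agreement
    kappa = (po - pe) / (1.0 - pe)
    return po, kappa
```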
Table 4. Number of selected features and CPU time of EA-based feature selection methods.
Datasets | Metrics | PSO | FA | GWO | GSA | Proposed
KSC | Num | 49.6 | 50.3 | 50.7 | 55.3 | 42.2
KSC | Time | 190.1176 | 152.4127 | 204.1769 | 195.3568 | 150.6149
Salinas | Num | 67.5 | 55.3 | 56.7 | 55.6 | 43.1
Salinas | Time | 3600.5043 | 2714.3658 | 3536.9347 | 3742.5928 | 2427.6280
Longkou | Num | 95.7 | 68.7 | 76 | 95.4 | 62.5
Longkou | Time | 1564.3620 | 1269.3795 | 1514.6489 | 1892.3481 | 1204.7301
Table 5. The results for the KSC dataset using 10% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 92.5714 | 86.6579 | 88.4363 | 81.4724 | 92.3717 | 92.4264
2 | 86.2559 | 86.2222 | 93.2990 | 86.7841 | 86.1751 | 89.3519
3 | 63.4675 | 69.9670 | 74.7170 | 83.8542 | 85.4077 | 77.4074
4 | 52.9070 | 60.1990 | 52.1886 | 51.9231 | 55.5556 | 57.7670
5 | 62.5000 | 73.1183 | 75.0000 | 70.3704 | 68.3761 | 65.0407
6 | 60.8434 | 60.9929 | 70.4918 | 64.8936 | 83.4711 | 71.0059
7 | 67.9487 | 78.4091 | 71.4286 | 83.3333 | 74.7826 | 74.0741
8 | 76.5661 | 79.4304 | 76.7442 | 68.6981 | 87.6238 | 93.9058
9 | 82.6430 | 86.4341 | 84.4530 | 90.9692 | 90.2390 | 89.9606
10 | 87.6081 | 100.0000 | 100.0000 | 99.1562 | 99.8727 | 99.9618
11 | 95.5959 | 95.4054 | 98.8950 | 99.1892 | 98.5836 | 93.6869
12 | 91.3551 | 79.9213 | 76.8642 | 74.0876 | 92.7039 | 97.9499
13 | 99.5175 | 99.9482 | 99.2894 | 99.8801 | 99.6247 | 99.9915
OA (%) | 85.0320 | 86.0128 | 85.7143 | 84.5203 | 90.2132 | 90.3624
Kappa | 0.8406 | 0.8543 | 0.8418 | 0.8370 | 0.8908 | 0.8917
Table 6. The results for the KSC dataset using 15% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 91.8733 | 92.5104 | 88.2108 | 88.9655 | 92.8870 | 92.2006
2 | 89.6226 | 90.4762 | 72.0755 | 82.0961 | 88.7324 | 90.6977
3 | 70.7071 | 69.2557 | 77.7328 | 84.6154 | 80.2469 | 83.4711
4 | 48.4733 | 56.6845 | 56.7742 | 58.2996 | 57.9336 | 60.2996
5 | 82.0513 | 79.4872 | 73.3333 | 52.9032 | 75.2381 | 86.6667
6 | 78.4314 | 66.6567 | 71.5789 | 66.6667 | 65.1007 | 73.6842
7 | 70.1923 | 73.4513 | 81.5789 | 72.4771 | 74.2268 | 70.5357
8 | 80.9645 | 87.7612 | 85.0785 | 72.2343 | 89.5408 | 88.5856
9 | 82.7778 | 78.1570 | 87.1401 | 89.1304 | 91.6667 | 89.9408
10 | 87.8873 | 91.9162 | 100.0000 | 100.0000 | 99.7183 | 96.8927
11 | 98.3562 | 96.3824 | 98.4000 | 99.4350 | 97.6378 | 95.1157
12 | 92.9440 | 93.7811 | 94.4056 | 93.1925 | 98.1352 | 97.6526
13 | 98.4597 | 98.1065 | 99.9919 | 66.6667 | 100.0000 | 100.0000
OA (%) | 86.5885 | 87.0362 | 87.9104 | 87.0362 | 90.5970 | 90.6397
Kappa | 0.8514 | 0.8560 | 0.8644 | 0.8581 | 0.8925 | 0.8936
Table 7. The results for the KSC dataset using 20% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 92.9178 | 90.7609 | 87.2679 | 86.2924 | 92.0833 | 93.1642
2 | 91.7476 | 86.1472 | 88.8889 | 88.4793 | 95.5801 | 88.5416
3 | 73.6462 | 72.5632 | 85.8537 | 90.7407 | 85.7759 | 79.4165
4 | 60.5634 | 48.8189 | 56.2500 | 59.3156 | 58.0000 | 68.2564
5 | 65.9091 | 79.6875 | 72.7273 | 58.2677 | 78.5047 | 78.2196
6 | 65.1613 | 77.0833 | 69.3548 | 67.7419 | 78.8618 | 72.2519
7 | 77.1429 | 80.6818 | 74.5763 | 72.0339 | 92.0930 | 85.6429
8 | 82.9268 | 88.8298 | 81.7043 | 82.4818 | 86.8106 | 93.4579
9 | 88.5177 | 88.2828 | 88.3268 | 88.7574 | 92.6295 | 90.1245
10 | 91.3793 | 92.6686 | 99.9948 | 99.4444 | 97.7077 | 99.9094
11 | 97.6501 | 95.3728 | 99.9859 | 98.1283 | 98.9333 | 99.9654
12 | 92.2902 | 92.4242 | 95.7143 | 97.2772 | 97.3872 | 95.6413
13 | 99.6407 | 99.7599 | 100.0000 | 99.9618 | 100.0000 | 100.0000
OA (%) | 88.3156 | 88.2942 | 88.7846 | 88.8699 | 90.9595 | 91.7057
Kappa | 0.8712 | 0.8628 | 0.8750 | 0.8735 | 0.8992 | 0.9047
Table 8. The results for the KSC dataset using 25% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 94.5428 | 93.5897 | 90.2878 | 89.9015 | 91.5395 | 94.2413
2 | 89.6714 | 91.9431 | 86.5169 | 92.3729 | 88.5572 | 98.5618
3 | 69.4444 | 73.6301 | 77.7778 | 88.0000 | 87.1111 | 87.1594
4 | 56.3636 | 54.1176 | 60.3960 | 60.5442 | 61.9910 | 69.5432
5 | 83.3333 | 85.5072 | 75.0000 | 72.7273 | 75.8929 | 84.5621
6 | 62.4309 | 73.4463 | 71.4286 | 68.2353 | 68.0203 | 77.8654
7 | 69.5238 | 78.2178 | 91.0448 | 79.3103 | 75.2137 | 72.5613
8 | 86.9674 | 91.1111 | 82.1306 | 83.3333 | 90.0990 | 93.6421
9 | 88.1764 | 88.5375 | 88.0299 | 95.1807 | 92.0892 | 92.6578
10 | 98.8338 | 94.9861 | 100.0000 | 99.9185 | 99.7175 | 99.9153
11 | 97.9058 | 98.5836 | 99.9103 | 97.1963 | 97.7901 | 97.2541
12 | 97.5717 | 94.0552 | 90.5983 | 92.6923 | 97.5501 | 98.4623
13 | 99.9826 | 99.5910 | 99.8459 | 100.0000 | 100.0000 | 99.8917
OA (%) | 89.4883 | 89.8507 | 89.1721 | 90.3300 | 91.1514 | 92.8144
Kappa | 0.8742 | 0.8819 | 0.8893 | 0.8991 | 0.9014 | 0.9164
Table 9. The results for the Salinas dataset using 10% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 99.5343 | 98.8636 | 100.0000 | 100.0000 | 99.8862 | 99.9654
2 | 97.2044 | 98.1765 | 97.9009 | 98.0724 | 98.5891 | 98.1319
3 | 92.7253 | 94.6309 | 89.8361 | 92.1729 | 94.4625 | 91.2998
4 | 98.0614 | 97.9032 | 99.1948 | 97.9592 | 97.1919 | 98.8606
5 | 97.8351 | 98.0897 | 95.3859 | 92.2832 | 98.4127 | 99.1625
6 | 99.4410 | 99.9438 | 99.3292 | 99.6081 | 99.7199 | 99.9438
7 | 97.7411 | 97.6190 | 99.9373 | 98.9480 | 99.0081 | 99.4410
8 | 74.1664 | 73.8863 | 70.0681 | 71.0198 | 74.8498 | 74.2189
9 | 97.9937 | 98.0613 | 96.5974 | 93.7715 | 99.1081 | 98.1501
10 | 94.0709 | 90.5249 | 86.0244 | 87.9890 | 94.5692 | 92.3497
11 | 94.6352 | 97.6852 | 76.1431 | 73.6961 | 95.5032 | 87.8981
12 | 91.0464 | 93.7778 | 93.0586 | 93.3839 | 95.7589 | 96.8362
13 | 93.1193 | 96.4200 | 90.2004 | 93.3180 | 93.7500 | 94.8598
14 | 98.6957 | 98.7179 | 95.7356 | 98.4305 | 98.0088 | 98.4749
15 | 77.0670 | 78.4153 | 77.4679 | 75.3187 | 80.4813 | 91.4327
16 | 98.5255 | 98.8032 | 99.6269 | 99.5019 | 99.6193 | 99.7484
OA (%) | 89.3674 | 89.5685 | 87.3106 | 87.2573 | 90.4799 | 90.1081
Kappa | 0.8797 | 0.8816 | 0.8657 | 0.8659 | 0.8936 | 0.8905
Table 10. The results for the Salinas dataset using 15% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 99.1982 | 99.9989 | 100.0000 | 100.0000 | 99.7517 | 100.0000
2 | 97.8947 | 98.2456 | 98.5832 | 98.3510 | 99.3765 | 99.5247
3 | 93.9427 | 91.5418 | 90.2646 | 94.9324 | 93.1071 | 94.8578
4 | 94.4870 | 97.7707 | 98.7302 | 97.5000 | 97.9528 | 99.0476
5 | 98.1387 | 97.6840 | 97.7215 | 97.1193 | 98.8215 | 98.1148
6 | 99.9987 | 100.0000 | 99.9438 | 99.9846 | 99.9438 | 100.0000
7 | 98.3446 | 99.0087 | 99.8118 | 99.0093 | 99.3808 | 99.2560
8 | 74.3518 | 74.5124 | 75.0246 | 75.4254 | 76.8367 | 78.0508
9 | 99.1783 | 99.0046 | 96.3605 | 97.5506 | 99.3585 | 99.2479
10 | 91.3971 | 93.1818 | 86.8185 | 86.8385 | 93.8272 | 95.9028
11 | 92.8421 | 97.1111 | 87.4157 | 87.6518 | 96.9365 | 96.0526
12 | 92.8414 | 94.9381 | 96.9629 | 96.4206 | 95.7589 | 96.2963
13 | 94.6262 | 96.1814 | 90.8072 | 92.8899 | 98.0583 | 94.6009
14 | 98.9035 | 98.2684 | 97.3154 | 96.3907 | 96.9199 | 94.6030
15 | 77.8520 | 78.5059 | 82.9680 | 82.0026 | 78.9930 | 81.5315
16 | 99.0826 | 98.9717 | 98.5258 | 99.5031 | 99.4987 | 99.0111
OA (%) | 89.6342 | 90.0324 | 89.7656 | 90.0283 | 91.0142 | 91.7074
Kappa | 0.8825 | 0.8876 | 0.8815 | 0.8936 | 0.9013 | 0.9105
Table 11. The results for the Salinas dataset using 20% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 99.4420 | 99.7738 | 100.0000 | 99.9843 | 99.4562 | 99.9913
2 | 99.4069 | 98.8797 | 98.9399 | 99.5258 | 98.9399 | 99.5848
3 | 95.0166 | 94.3844 | 92.3409 | 91.9181 | 94.1748 | 94.8408
4 | 97.3101 | 95.8333 | 97.8056 | 98.4227 | 99.3620 | 98.1132
5 | 98.5062 | 98.8235 | 98.3165 | 97.3177 | 97.9525 | 99.2437
6 | 100.0000 | 99.7205 | 99.9437 | 99.9439 | 99.8426 | 99.9439
7 | 99.4420 | 99.2551 | 99.9958 | 100.0000 | 99.6278 | 99.4406
8 | 75.6456 | 74.8615 | 75.5213 | 75.4221 | 78.1847 | 78.7730
9 | 98.6116 | 98.9015 | 97.1339 | 98.2715 | 99.2855 | 99.4298
10 | 93.5461 | 94.0072 | 88.2393 | 91.5014 | 96.0114 | 97.5300
11 | 91.0537 | 94.7253 | 91.1700 | 85.3061 | 95.2174 | 93.8819
12 | 95.3933 | 95.2009 | 97.0688 | 95.9866 | 96.8433 | 95.2486
13 | 95.2381 | 94.1452 | 95.0588 | 95.0588 | 96.6746 | 97.8208
14 | 97.2574 | 98.4881 | 98.0728 | 97.0402 | 96.5092 | 98.3368
15 | 77.3140 | 79.7858 | 83.1199 | 83.0685 | 78.8020 | 83.4154
16 | 98.3668 | 98.0964 | 99.3797 | 99.2583 | 97.4421 | 99.3750
OA (%) | 90.3198 | 90.3321 | 90.3937 | 90.4347 | 91.3294 | 92.2411
Kappa | 0.8907 | 0.8925 | 0.8901 | 0.8968 | 0.9047 | 0.9200
Table 12. The results for the Salinas dataset using 25% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 99.7743 | 99.4388 | 100.0000 | 99.9846 | 100.0000 | 100.0000
2 | 98.9971 | 99.1124 | 98.6382 | 98.4174 | 98.9393 | 99.8322
3 | 96.5324 | 96.8750 | 91.7559 | 92.9638 | 93.9297 | 96.1354
4 | 97.4724 | 97.9528 | 99.0461 | 98.4202 | 98.8906 | 99.3782
5 | 97.9305 | 99.0033 | 97.0906 | 98.5012 | 99.6619 | 99.5342
6 | 100.0000 | 99.9439 | 99.9438 | 99.8315 | 99.9438 | 99.9473
7 | 99.3180 | 99.4389 | 98.9467 | 99.7492 | 99.5644 | 99.9544
8 | 77.0343 | 75.3308 | 76.4241 | 75.4392 | 77.2064 | 80.7261
9 | 98.9362 | 99.0747 | 98.0524 | 97.1100 | 99.4294 | 99.4388
10 | 92.7785 | 93.4540 | 91.5855 | 93.6324 | 98.3138 | 97.7492
11 | 93.9759 | 96.4835 | 92.4406 | 92.7602 | 95.5789 | 98.9252
12 | 95.8567 | 95.8520 | 94.9283 | 96.0894 | 96.4126 | 98.7631
13 | 93.1034 | 95.6627 | 94.7743 | 96.8750 | 97.1429 | 99.3150
14 | 98.6813 | 95.8763 | 95.3586 | 97.3029 | 97.2973 | 98.5960
15 | 76.6654 | 80.1782 | 84.4185 | 84.4747 | 82.6695 | 84.2496
16 | 98.6076 | 98.2478 | 99.6264 | 99.5037 | 98.4029 | 99.5761
OA (%) | 90.6893 | 90.7098 | 90.7960 | 90.7591 | 91.7607 | 93.2614
Kappa | 0.8845 | 0.8914 | 0.8946 | 0.8935 | 0.9079 | 0.9251
Table 13. The results for the Longkou dataset using 10% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 96.3905 | 98.3974 | 73.6477 | 72.1819 | 98.6880 | 98.0647
2 | 88.8722 | 80.9339 | 38.6952 | 56.6265 | 86.0244 | 89.4598
3 | 95.2000 | 96.5035 | 0.0000 | 0.0000 | 96.8468 | 90.7834
4 | 94.9534 | 95.8103 | 72.4373 | 73.0363 | 97.4190 | 96.8578
5 | 74.8555 | 79.7403 | 0.0000 | 16.6667 | 75.0000 | 83.3333
6 | 95.7386 | 93.5909 | 0.0000 | 0.0000 | 93.8721 | 95.0276
7 | 99.9834 | 99.9834 | 84.6953 | 85.1456 | 100.0000 | 100.0000
8 | 87.3580 | 92.5262 | 0.0000 | 48.6330 | 91.6667 | 94.0458
9 | 70.6941 | 74.0634 | 38.3442 | 36.0714 | 86.9010 | 90.5724
OA (%) | 95.4965 | 95.9963 | 72.9248 | 74.4785 | 96.9090 | 97.1208
Kappa | 0.9731 | 0.9733 | 0.6845 | 0.7055 | 0.9793 | 0.9815
Table 14. The results for the Longkou dataset using 15% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 97.7496 | 97.6906 | 73.8964 | 74.6577 | 99.2295 | 99.3867
2 | 90.7609 | 91.1444 | 38.6128 | 58.0488 | 82.5495 | 89.7638
3 | 95.4802 | 98.8571 | 0.0000 | 0.0000 | 96.5665 | 95.7219
4 | 95.8955 | 96.1853 | 73.4588 | 75.8225 | 97.8005 | 96.4383
5 | 82.0442 | 81.6216 | 0.0000 | 25.0000 | 86.4407 | 85.7520
6 | 94.1784 | 93.5574 | 0.0000 | 0.0000 | 95.3447 | 96.2829
7 | 99.9834 | 99.9834 | 85.3745 | 87.0789 | 99.9503 | 99.9834
8 | 90.0000 | 90.3226 | 100.0000 | 50.1294 | 92.2272 | 94.4109
9 | 84.7458 | 83.3333 | 41.9118 | 38.4615 | 94.6844 | 93.0818
OA (%) | 96.5721 | 96.6319 | 73.5984 | 76.5754 | 97.4413 | 97.4522
Kappa | 0.9552 | 0.9560 | 0.6456 | 0.7110 | 0.9863 | 0.9867
Table 15. The results for the Longkou dataset using 20% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 96.3905 | 98.4955 | 86.5508 | 89.1432 | 98.8821 | 99.2585
2 | 88.8722 | 89.0357 | 56.8080 | 55.0672 | 85.9626 | 88.7850
3 | 95.2000 | 96.9231 | 0.0000 | 0.0000 | 99.9917 | 94.2731
4 | 94.9534 | 96.1932 | 83.9454 | 86.5783 | 97.4028 | 98.1774
5 | 74.8555 | 80.6283 | 3.8462 | 0.0000 | 82.4289 | 84.6753
6 | 95.7386 | 96.1825 | 0.0000 | 0.0000 | 96.3470 | 95.0673
7 | 99.9834 | 99.9834 | 92.4528 | 90.3115 | 99.9503 | 100.0000
8 | 87.3580 | 90.1306 | 60.3004 | 55.6107 | 93.2635 | 93.8160
9 | 70.6941 | 90.9091 | 73.4375 | 96.5217 | 95.3642 | 96.2865
OA (%) | 95.4965 | 96.9035 | 84.5719 | 85.2890 | 97.4957 | 97.8596
Kappa | 0.9731 | 0.9767 | 0.8060 | 0.8114 | 0.9849 | 0.9894
Table 16. The results for the Longkou dataset using 25% of the total number of features.
Class Number | MRMR | JOMIC | JMIM | CMIM | SDFE | Proposed
1 | 99.4819 | 99.3237 | 91.1993 | 90.2739 | 99.6143 | 99.1987
2 | 92.2865 | 87.6963 | 60.2791 | 60.7280 | 87.4667 | 87.3533
3 | 97.8723 | 99.4382 | 99.2701 | 97.7778 | 99.0868 | 97.7376
4 | 96.0814 | 96.7295 | 94.9780 | 95.6297 | 97.3448 | 98.3615
5 | 81.3602 | 79.3367 | 31.2500 | 14.0351 | 83.3333 | 86.5169
6 | 96.4674 | 96.3504 | 85.2273 | 69.6970 | 96.0253 | 97.7860
7 | 99.9669 | 99.9668 | 87.4238 | 87.3244 | 99.9669 | 99.9503
8 | 92.5150 | 92.3767 | 83.2187 | 83.4019 | 92.5595 | 91.8999
9 | 92.1311 | 86.7069 | 89.6970 | 78.4504 | 92.9012 | 97.5430
OA (%) | 97.3055 | 97.1480 | 88.8309 | 88.3909 | 97.5771 | 98.0389
Kappa | 0.9715 | 0.9852 | 0.8583 | 0.8535 | 0.9880 | 0.9910
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
