Article

Hyperspectral Band Selection via Optimal Combination Strategy

1 School of Automation, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
2 Northwest Institute of Nuclear Technology, Xi’an 710024, China
3 School of Computer Science and the School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi’an 710072, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(12), 2858; https://0-doi-org.brum.beds.ac.uk/10.3390/rs14122858
Submission received: 21 May 2022 / Revised: 9 June 2022 / Accepted: 13 June 2022 / Published: 15 June 2022
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

Band selection is one of the main approaches to reducing the dimensionality of a hyperspectral image. Recently, various methods have been proposed to address this issue. However, these methods usually obtain the band subset from the perspective of a locally optimal solution. To achieve an optimal solution from a global perspective, this paper develops a novel method for hyperspectral band selection via an optimal combination strategy (OCS). The main contributions are as follows: (1) a subspace partitioning approach is proposed that can accurately obtain the partitioning points of the subspaces, which ensures that similar bands are divided into the same subspace; (2) two candidate representative bands with a large amount of information and high similarity are chosen from each subspace, which can fully represent all bands in the subspace; and (3) an optimal combination strategy is designed to acquire the optimal band subset, which achieves an optimal solution from a global perspective. The results on four public datasets illustrate that the proposed method achieves satisfactory performance compared with other methods.

1. Introduction

A hyperspectral image (HSI) is obtained by an imaging spectrometer, which captures the target area in spectral bands ranging from visible light to the near infrared. It provides numerous narrow bands for each pixel, producing a complete and continuous spectral response curve [1]. According to the spectral response curve, the substances in an HSI can be classified. Therefore, HSI is often used in crop analysis [2], mineral exploration [3], military surveillance [4], etc. Nevertheless, an HSI has dozens to hundreds of spectral bands, and adjacent bands exhibit a certain redundancy. This results in high computational complexity, which is known as the curse of dimensionality [5].
Dimensionality reduction is the common means of addressing this problem. Existing dimensionality reduction methods for HSI mainly include feature extraction [6,7,8] and band (feature) selection [9,10,11,12]. Feature extraction projects the feature information of an HSI from a high-dimensional space to a low-dimensional space and extracts the required features. Classical feature extraction methods include principal component analysis [6], independent component analysis [7], and locally linear embedding [8]. In recent years, new feature extraction methods have achieved substantial performance improvements, such as SLGDA [13], kernel SLGDA [14], tensor LPP [15], tensor low-rank discriminant embedding [16], etc. Band selection, in contrast, aims to select the most representative band subset. Compared with feature extraction, band selection retains the original features and physical meaning of the HSI data. Therefore, band selection is generally preferable to feature extraction.
Since labeling HSI samples is extremely costly, supervised and semi-supervised methods are often impractical in real applications. Researchers thus mainly focus on unsupervised models. Currently, unsupervised band selection methods are mainly divided into ranking methods [17,18,19,20,21] and clustering methods [22,23,24,25,26]. Ranking methods quantify the importance of each band according to predefined criteria and choose the bands with the highest importance. Clustering methods cluster bands by their similarity and choose the most representative band in each cluster. Although these methods achieve satisfactory results, there are still many deficiencies. Ranking methods usually do not take into account the redundancy between bands, which results in high similarity among the selected bands. Clustering methods treat the bands as unordered, so the spectral-order information is neglected.
Considering the disadvantages of clustering and ranking methods, subspace partitioning methods [27,28,29,30] combine them to address these issues. This type of method treats all bands as ordered and contains two main steps. First, similar bands are divided into the same subspace by defining the subspace partitioning points. Second, various search strategies are designed to find the important band of each subspace. For instance, Sun et al. [28] proposed constrained-target band selection with subspace partition (CTSPBS). The forward minimum variance strategy and the backward maximum variance strategy are utilized to choose the band that is the most similar to the other bands of the subspace. Wang et al. [30] designed a ranking-on-clusters strategy (RCS), which applies ranking algorithms including MVPCA [31], E-FDPC [32], and IE [33]; after obtaining the clustering results, the top-ranked band in each cluster is selected. Recently, Wang et al. [29] developed a fast neighborhood grouping method for hyperspectral band selection (FNGBS), which acquires, in each subspace, the band with the largest product of information entropy and local density. Although these methods bring some improvements, they ignore band selection from a global perspective.
To address this issue, this paper proposes a novel approach for hyperspectral band selection via an optimal combination strategy (OCS). In contrast to other methods, we design the approach from a global perspective. As shown in Figure 1, the globally optimal solution is the ideal point in the upper right corner. For the most part, the ideal point cannot be reached. Therefore, the objective of the proposed method is to obtain the solution that is closest to the ideal point within the feasible region. Instead of simply selecting one band in each subspace, multiple bands are chosen from each subspace to generate candidate bands. Subsequently, the optimal band subset is acquired by the optimal combination strategy. The main contributions of this paper are:
(1) A novel subspace partitioning method is developed to accurately divide the subspaces by local density. Compared with existing subspace partitioning methods, this approach can accurately obtain the subspace partitioning points, and the evaluation criterion for partitioning points is not affected by subspace changes.
(2) Two bands with a large amount of information and high similarity are obtained from each subspace using Euclidean distance and information entropy. In contrast with other methods that select only one band in each group, this provides a richer basis for building improved band subsets.
(3) An optimal band combination is obtained with a forward search strategy, which satisfies the requirements of minimum redundancy and maximum information. In addition, the result of the strategy is considered from the perspective of a globally optimal solution.
The rest of this paper is organized as follows. Section 2 introduces several unsupervised band selection methods. Section 3 describes the proposed subspace partitioning method and the optimal combination strategy. Section 4 compares the performance against other band selection methods on four public datasets. Section 5 summarizes the proposed method.

2. Related Work

Previous unsupervised band selection methods are mainly divided into clustering methods and ranking methods. In recent years, some researchers have proposed subspace partitioning methods for band selection, which have shown superior performance. In this section, we review some typical algorithms of these three types.

2.1. Clustering Method

Clustering methods first construct a similarity matrix and utilize an objective function to cluster the bands. Then, the band subset is formed from one band selected from each cluster; for the most part, this band is the cluster center or the top-ranked band according to some statistic.
Typical clustering methods include affinity propagation (AP) [34,35], fast density peak-based clustering (FDPC) [36], enhanced fast density peak-based clustering (E-FDPC) [32], etc. AP regards all bands as potential clustering centers. The similarity matrix is built by calculating the similarity between bands with K-L divergence [37] and a Gaussian peak [38]. Additionally, the attraction of each band and its suitability as the clustering center are determined to evaluate the ability of the band. As for FDPC, it is a clustering method combined with a ranking method. All bands are first clustered by their similarities, and are then ranked by calculating the local density [39] and intraclass distance. Nevertheless, the hyperparameter threshold in FDPC is fixed, which affects the discrimination of local density. In addition, local density and intraclass distance are of different orders of magnitude. To solve these issues, E-FDPC first normalizes the local density and intraclass distance. Then, it adjusts the cut-off threshold of a different number of bands through exponential-based learning rules. These steps greatly improve the performance of FDPC.
Although clustering methods take into account the correlation between bands, the information content of the selected bands is ignored. Therefore, some bands with rich information and strong discriminative power for the target may not be selected.

2.2. Ranking Method

In contrast to clustering methods, ranking methods generally rank all bands in descending order by maximizing data statistics such as relative entropy [40], information entropy [41], mutual information [42], etc.
Maximum variance principal components analysis (MVPCA) [31] and linearly constrained minimum variance-based constrained band selection (LCMV-CBS) [21] are classical ranking methods. MVPCA obtains the band subset in two steps. First, it constructs the covariance matrix of the bands. Second, it performs eigendecomposition of this matrix to construct the loading factor matrix, and all bands are sorted according to the loading factor matrix. With regard to LCMV-CBS, it designs a filter for each band and minimizes the average least squares filter output. The similarity between a specific band and the other bands can be measured by the filter, and all bands are then sorted accordingly.
The disadvantage of ranking methods is that the correlation between bands is not considered. This leads to band subsets with high similarity among the selected bands. Therefore, ranking methods usually achieve poorer performance than clustering methods.

2.3. Subspace Partitioning Method

The subspace partitioning method is a combination of a clustering method and a ranking method. There are two main steps in subspace partitioning. First, all bands are evenly divided into multiple subspaces according to the spectral order, and the partitioning points of the subspaces are moved according to the similarity between bands. This ensures that similar bands are divided into the same subspace. Second, diverse search strategies are utilized to search for the required bands in each subspace.
Representative subspace partitioning methods include ASPS [27], CTSPBS [28], FNGBS [29], etc. ASPS divides the subspaces by maximizing the interclass distance-to-intraclass distance ratio of adjacent subspaces. Then, the band with the smallest noise is selected in each subspace. CTSPBS applies the same subspace partitioning approach as ASPS, but adopts the forward minimum variance search strategy and the backward maximum variance search strategy [43] to choose the band in each subspace. As for FNGBS, it first labels the bands with the subspace order. The subspace center is defined by the bands that share the same label, and the bands are then relabeled according to the closest subspace center. Through multiple iterations, the subspaces become stable. In addition, the local density and information entropy are utilized to obtain the most appropriate band in each subspace.
The performance of subspace partitioning methods is significantly improved with a clustering method and ranking method. Nevertheless, these methods only select one band that meets the search strategy in each subspace, which neglects the issue of global optimization. As a result, their solution may not be satisfactory.

3. Optimal Combination Strategy

This section introduces the proposed method (OCS) in detail; its flowchart is shown in Figure 2. As mentioned above, subspace partitioning methods have great advantages in band selection. Therefore, we adopt the same idea and group all bands using a coarse–fine grouping strategy. Subsequently, two candidate representative bands with high information entropy and small Euclidean distance are searched for in each subspace. This ensures that each candidate band contains sufficient information and is representative. Finally, we design a forward search strategy to obtain the optimal band subset from these candidates. These three parts are described in detail in the following sections.

3.1. Subspace Grouping

In order to obtain a reasonable grouping of bands, a coarse–fine subspace grouping strategy is introduced to partition the bands. In our approach, the coarse grouping divides all of the bands into m subspaces, which facilitates the subsequent movement of the subspace partitioning points. The fine grouping then utilizes local density to determine the appropriate positions of the subspace partitioning points. Through these two steps, the similarity of the bands within the same subspace is high, while bands in different subspaces differ.

3.1.1. Coarse Grouping

Suppose that $X \in \mathbb{R}^{W \times H \times L} = \{X_1, X_2, X_3, \dots, X_L\}$ is an HSI, where W and H denote the width and height of the HSI and L is the number of spectral bands. The coarse grouping partitions all bands evenly into m groups, where m is also the number of bands to be selected. In order to ensure that all bands are divided into m subspaces, we formulate different coarse grouping criteria for two situations, $\mathrm{mod}(L/m) = 0$ and $\mathrm{mod}(L/m) \neq 0$, where $\mathrm{mod}(\cdot)$ denotes the modulo operation that returns the remainder. The criterion for the condition $\mathrm{mod}(L/m) = 0$ is designed as follows:
$$P_g = \left[ X_{\frac{L(g-1)}{m}+1}, \; X_{\frac{Lg}{m}} \right], \quad g = 1, 2, 3, \dots, m, \qquad (1)$$
where $P_g$ represents the g-th subspace. For the condition $\mathrm{mod}(L/m) \neq 0$, the criterion for the subspaces is constructed as follows:
$$P_g = \begin{cases} \left[ X_{\mathrm{floor}\left(\frac{L(g-1)}{m}\right)+1}, \; X_{\mathrm{floor}\left(\frac{Lg}{m}\right)} \right], & g = 1, \dots, m-1 \\ \left[ X_{L-\mathrm{mod}(L/m)+1}, \; X_L \right], & g = m, \end{cases} \qquad (2)$$
where $\mathrm{floor}(\cdot)$ denotes the round-down operation.
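To make the grouping rule concrete, the following NumPy sketch (not the authors' code; zero-based indices and a uniform floor(L·g/m) boundary rule are simplifying assumptions) splits L ordered bands into m contiguous subspaces, with later subspaces absorbing the remainder when mod(L/m) ≠ 0:

```python
import numpy as np

def coarse_grouping(L, m):
    """Evenly partition band indices 0..L-1 into m ordered subspaces.

    Follows the spirit of Equations (1)-(2): boundaries are floor(L*g/m),
    so when mod(L/m) != 0 the later subspaces absorb the extra bands.
    """
    bounds = [int(np.floor(L * g / m)) for g in range(m + 1)]
    return [list(range(bounds[g], bounds[g + 1])) for g in range(m)]

# Example: 204 Salinas bands split into 5 subspaces.
print([len(p) for p in coarse_grouping(204, 5)])  # [40, 41, 41, 41, 41]
```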

3.1.2. Fine Grouping

The partitioning points obtained by coarse grouping cannot effectively separate bands with great differences. As a result, the partitioning points of the subspaces are moved by fine grouping. Traditional subspace partitioning methods usually maximize the interclass distance-to-intraclass distance ratio to partition the subspaces. However, this ratio is affected by subspace movement and requires heavy computation. Since the local density [39] clearly reflects the similarity among all bands, it can accurately locate the most appropriate subspace partitioning points. Therefore, this paper proposes a novel fine subspace partitioning approach via local density.
Local density is a metric that measures the similarity of bands. It is calculated from the number of bands within a certain range, which is usually defined by a threshold. In order to acquire the threshold $\sigma_i$, we first need to determine the Euclidean distance [44]. The Euclidean distance $S_{ij}$ between the i-th band and the j-th band is as follows:
$$S_{ij} = \sqrt{(X_i - X_j)^2}, \quad i, j = 1, 2, 3, \dots, L. \qquad (3)$$
Subsequently, each column of the similarity matrix S is arranged in ascending order to construct the matrix A. The formula for the threshold $\sigma_i$ is as follows:
$$\sigma_i = \frac{\sum_{j=m+1}^{L} A_{ij}}{L - m}. \qquad (4)$$
Let $n_i$ be the number of bands whose distance from the i-th band is less than the threshold $\sigma_i$. The local density $\mu_i$ of each band is then as follows:
$$\mu_i = \frac{n_i}{L}. \qquad (5)$$
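The threshold and local density can be computed directly from the band-wise distance matrix. Below is an illustrative NumPy sketch (not the reference implementation; the X_flat layout, one flattened band per row, is an assumption):

```python
import numpy as np

def local_density(X_flat, m):
    """Local density of each band, following Equations (3)-(5).

    X_flat: (L, N) array with one flattened band per row.
    Returns mu, shape (L,): the fraction of bands lying closer to
    band i than the per-band threshold sigma_i.
    """
    L = X_flat.shape[0]
    # Pairwise Euclidean distances between bands, Equation (3).
    sq = (X_flat ** 2).sum(axis=1)
    S = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X_flat @ X_flat.T, 0.0))
    # Sort each column in ascending order and average the L - m largest
    # distances to obtain the threshold for each band, Equation (4).
    A = np.sort(S, axis=0)
    sigma = A[m:, :].sum(axis=0) / (L - m)
    # Count, for each band, how many bands lie closer than its threshold.
    n = (S < sigma[:, None]).sum(axis=1)
    return n / L  # Equation (5)
```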
Coarse grouping neglects the connection between the partitioning points and the nearby subspaces. Therefore, we employ fine grouping to tackle this issue. Theoretically, the movement of a partitioning point usually occurs in the middle region of a pair of neighboring subspaces. Therefore, the undetermined region $[\alpha, \beta]$ is defined as follows:
$$\alpha = \frac{Z_g + Z_{g+1}}{2}, \quad g = 1, 2, 3, \dots, m-1, \qquad (6)$$
$$\beta = \begin{cases} \dfrac{Z_{g+1} + Z_{g+2}}{2}, & g = 1, 2, 3, \dots, m-2 \\ \dfrac{Z_{g+1} + L}{2}, & g = m-1, \end{cases} \qquad (7)$$
where $Z_g$ is the first band of the g-th subspace.
The band with the lowest local density in the region $[\alpha, \beta]$ is chosen as the new partitioning point of the pair of neighboring subspaces. All partitioning points are moved in this way in the order of the subspaces; when the partitioning point of the last pair of neighboring subspaces has been moved, one round of iteration is complete. Nevertheless, a single round of iteration is not enough to construct ideal subspaces. As a result, we set the number of iterations to t. When the maximum number of iterations is reached or the partitioning points no longer move, the fine grouping is complete. An example of fine grouping is displayed in Figure 3.
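A sketch of the fine grouping iteration under the same assumptions follows; the helper fine_grouping is hypothetical and takes the local densities mu from the previous sketch together with the zero-based start indices of the subspaces produced by coarse grouping:

```python
import numpy as np

def fine_grouping(mu, starts, t=5):
    """Move subspace partitioning points to low-density bands (Eqs. (6)-(7)).

    mu: (L,) local densities; starts: first-band index of each of the m
    subspaces (starts[0] == 0). Stops after t rounds or when no point moves.
    """
    L = len(mu)
    Z = list(starts)
    m = len(Z)
    for _ in range(t):
        moved = False
        for g in range(m - 1):
            # Undetermined region [alpha, beta] between subspaces g and g+1.
            alpha = (Z[g] + Z[g + 1]) // 2
            beta = (Z[g + 1] + Z[g + 2]) // 2 if g < m - 2 else (Z[g + 1] + L) // 2
            # New partitioning point: the lowest local density inside the region.
            new_point = alpha + int(np.argmin(mu[alpha:beta + 1]))
            if new_point != Z[g + 1]:
                Z[g + 1] = new_point
                moved = True
        if not moved:  # partitioning points have stabilized
            break
    return Z
```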

3.2. Candidate Representative Bands

Traditional clustering methods and subspace partitioning methods usually select one band in each group. This band is usually the center of the cluster or the top band according to some evaluation criterion within a subspace, and its relationship with the other bands in the group is not taken into account. Therefore, we select multiple bands instead of a single band in each subspace to avoid a suboptimal solution. All candidate representative bands are then used to obtain the optimal band subset.
The number of candidate representative bands in each subspace is an issue to be considered. First, the number must be greater than one, which ensures that the candidate representative bands have sufficient diversity and representativeness. Second, since the connection between bands needs to be taken into account, the evaluation of the candidate representative bands becomes complicated when the number is greater than two. Considering these points, we select two candidate bands in each subspace.
Based on the above considerations, the two bands need to have a large amount of information and high similarity. In this paper, information entropy [37] and Euclidean distance are adopted as the evaluation standards. Information entropy is a statistic that describes the degree of disorder of information and can measure the amount of information contained in a band:
$$H_i = -\sum_{i \in L} \rho(X_i) \log(\rho(X_i)), \qquad (8)$$
where $\rho(X_i)$ is the probability of $X_i$ appearing in the image. The information entropy and the Euclidean distance need to be normalized for the subsequent calculations since they are not of the same order of magnitude. The normalized results are as follows:
$$H_i = \frac{H_i - \min(H)}{\max(H) - \min(H)}, \qquad (9)$$
$$S_{ij} = \frac{S_{ij} - \min(S_i)}{\max(S_i) - \min(S_i)}, \qquad (10)$$
where $\min(\cdot)$ and $\max(\cdot)$, respectively, denote the minimum and maximum value.
Based on the description of the candidate bands, the evaluation function $r_{ij}$ is as follows:
$$r_{ij} = \frac{H_i \times H_j}{S_{ij}}, \quad i \neq j. \qquad (11)$$
It should be noted that the i-th band and the j-th band need to be in the same subspace. The two bands with the highest value of $r_{ij}$ in a subspace are its candidate bands. The two candidate representative bands in the g-th subspace are denoted $K_g^1$ and $K_g^2$, and K is constructed from all candidate bands as follows:
$$K = \left\{ K_1^1, K_1^2, K_2^1, K_2^2, \dots, K_m^1, K_m^2 \right\}. \qquad (12)$$
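The candidate-band search of Equations (8)-(12) can be sketched as follows (illustrative only, not the reference implementation; the histogram-based entropy estimate assumes non-negative integer-valued bands, and S is the distance matrix from the local-density step):

```python
import numpy as np

def candidate_bands(X_flat, S, groups):
    """Select two candidate representative bands per subspace.

    X_flat: (L, N) non-negative integer-valued bands; S: (L, L) Euclidean
    distances; groups: list of band-index lists, one per subspace.
    Returns a list of (K_g^1, K_g^2) index pairs.
    """
    L = X_flat.shape[0]
    # Information entropy of each band, Equation (8).
    H = np.empty(L)
    for i, band in enumerate(X_flat):
        p = np.bincount(band.astype(np.int64).ravel()) / band.size
        p = p[p > 0]
        H[i] = -(p * np.log(p)).sum()
    # Min-max normalization, Equations (9)-(10).
    Hn = (H - H.min()) / (H.max() - H.min())
    Smin = S.min(axis=1, keepdims=True)
    Smax = S.max(axis=1, keepdims=True)
    Sn = (S - Smin) / (Smax - Smin)
    K = []
    for g in groups:
        best, pair = -np.inf, (g[0], g[-1])
        for a in range(len(g)):
            for b in range(a + 1, len(g)):
                i, j = g[a], g[b]
                # Evaluation function r_ij, Equation (11): high entropy, small distance.
                r = Hn[i] * Hn[j] / (Sn[i, j] + 1e-12)
                if r > best:
                    best, pair = r, (i, j)
        K.append(pair)
    return K
```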

3.3. Optimal Combination Strategy

In our work, two candidate representative bands are obtained in each subspace, which facilitates obtaining the desired result. The next issue is how to obtain the optimal combination. To address this issue, we use a forward search strategy [43] to construct different combinations and achieve the optimal combination from a global perspective.
The forward search strategy constructs different combinations in each subspace to improve the construction of the band subset. The main process of the forward search strategy is described below. In each subspace, two local combinations are constructed from the representative bands of the previous subspaces and the two candidate representative bands of the current subspace. We denote the two combinations by $R^t$, where $t \in \{1, 2\}$. The form of $R^t$ in the different subspaces is as follows:
$$R^t = \begin{cases} K_1^t, & n = 1 \\ \{R_1, K_n^t\}, & n = 2 \\ \{R_1, R_2, \dots, R_{n-1}, K_n^t\}, & n = 3, 4, \dots, m, \end{cases} \qquad (13)$$
where $R_g$ is the representative band of the g-th subspace, which is acquired by the evaluation function. The optimal band subset needs to meet the requirements of large information content and low correlation, so information entropy and Euclidean distance are used as the standards for evaluating combinations. In addition, the candidate representative bands are ordered by subspace along the spectral dimension. As a result, the correlation of a band combination can be obtained by calculating only the Euclidean distance between bands in adjacent subspaces. The evaluation function $\psi$ is as follows:
$$\psi = \begin{cases} H_{K_n^t}, & n = 1 \\ \dfrac{H_{R_1} + \dots + H_{K_n^t}}{n} \times \dfrac{S_{R_1, R_2} + \dots + S_{R_{n-1}, K_n^t}}{n-1}, & n = 2, 3, 4, \dots, m. \end{cases} \qquad (14)$$
Following the order of the subspaces, the two local combinations are produced by Formula (13), and the $\psi$ of each is calculated with Formula (14). The $R^t$ with the higher $\psi$ is the optimal local combination, and its $K_n^t$ becomes the representative band of the n-th subspace. Finally, the optimal combination $R_{\mathrm{optimal}}$ is acquired after m rounds of iteration. A detailed description of the optimal combination strategy is displayed in Algorithm 1.
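A compact sketch of the forward search defined by Formulas (13) and (14) is given below (illustrative, not the authors' implementation; Hn and Sn denote the normalized entropies and distances, and K is the list of candidate pairs from the previous step):

```python
import numpy as np

def optimal_combination(Hn, Sn, K):
    """Forward search over candidate pairs, following Equations (13)-(14).

    For each subspace, the candidate whose local combination has the
    larger psi (mean entropy times mean distance between bands of
    adjacent subspaces) becomes the representative band.
    """
    R = []  # representative bands selected so far, in subspace order
    for n, (k1, k2) in enumerate(K, start=1):
        best_psi, best_band = -np.inf, k1
        for cand in (k1, k2):
            combo = R + [cand]
            if n == 1:
                psi = Hn[cand]
            else:
                info = sum(Hn[b] for b in combo) / n
                corr = sum(Sn[combo[i], combo[i + 1]] for i in range(n - 1)) / (n - 1)
                psi = info * corr
            if psi > best_psi:
                best_psi, best_band = psi, cand
        R.append(best_band)
    return R  # R_optimal: one representative band per subspace
```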
OCS is divided into three main parts. The first part is subspace grouping, which runs for t rounds of iteration and moves the partitioning points of the m subspaces; its complexity is $O(tm)$. The second part selects two candidate representative bands in each subspace. The number of bands in each subspace is not fixed in advance; suppose that the g-th subspace has $B_g$ bands with $\sum_{g=1}^{m} B_g = L$. The complexity of the second part is then $O(\sum_{g=1}^{m} B_g^2)$. The third part uses the forward search strategy to generate two local combinations in each of the m subspaces and keep the locally optimal one; its complexity is $O(m^2)$. Therefore, the overall complexity of OCS is $O(tm + \sum_{g=1}^{m} B_g^2 + m^2)$.
Algorithm 1: Optimal combination strategy

4. Experiments

In this section, we conduct experiments to test the band selection capability of the proposed method. The four public datasets are first introduced in detail. Then, the experimental setup is described, which includes the classifiers, the comparison methods, and the number of selected bands. Finally, the effectiveness of the proposed method is verified from various aspects.

4.1. Datasets

4.1.1. Indian Pines

This scene was acquired by the AVIRIS sensor over the Indian Pines test site. It originally contains 220 bands, 20 of which are water absorption bands and are removed; the remaining 200 bands are used as the research object. The resolution of the dataset is 145 × 145 × 200.

4.1.2. Pavia University

This scene was obtained by the ROSIS sensor over Pavia, northern Italy. It contains 115 bands, where the size of each band is 610 × 340. Since 12 bands are affected by noise, they were removed in our experiment, leaving 103 bands.

4.1.3. Salinas

This scene was captured by the 224-band AVIRIS sensor over Salinas Valley, California. After discarding some pixels with no information, an image of size 512 × 217 is used. There are 204 bands after removing 20 water absorption bands.

4.1.4. Botswana

This scene was obtained by the NASA EO-1 satellite over the Okavango Delta, Botswana. There are a total of 145 bands after removing 97 bands affected by noise, and each band has a size of 1476 × 256.

4.2. Experimental Setup

4.2.1. Classifier Setting

In order to test the performance of the different band selection methods, k-nearest neighbor (KNN) and support vector machine (SVM) classifiers were adopted to classify the four public datasets. The same parameters are used throughout to ensure a fair comparison. For SVM, the penalty factor is 1 × 10^4, gamma is 0.5, and the RBF kernel is utilized. For the KNN classifier, the number of neighbors is fixed at 5. In the experiments, 10% of the samples were utilized as training samples. As the selection of training samples is random, the classification result is unstable; to address this, each algorithm is run 5 times and the average value is taken as the final result. We apply overall accuracy (OA) and average overall accuracy (AOA) as evaluation metrics.
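For reference, a minimal evaluation sketch using scikit-learn is shown below (an assumption rather than the authors' code; the stratified split and the evaluate_band_subset helper are illustrative). It mirrors the settings above: an RBF-kernel SVM with penalty factor 1 × 10^4 and gamma 0.5, KNN with 5 neighbors, 10% training samples, and the average OA over 5 runs.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def evaluate_band_subset(X_pixels, y, selected_bands, runs=5, train_ratio=0.10):
    """Average overall accuracy (OA) over repeated random 10% training splits."""
    Xs = X_pixels[:, selected_bands]          # keep only the selected bands
    scores = {"KNN": [], "SVM": []}
    for seed in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            Xs, y, train_size=train_ratio, stratify=y, random_state=seed)
        knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
        svm = SVC(C=1e4, kernel="rbf", gamma=0.5).fit(X_tr, y_tr)
        scores["KNN"].append(knn.score(X_te, y_te))
        scores["SVM"].append(svm.score(X_te, y_te))
    return {name: float(np.mean(vals)) for name, vals in scores.items()}
```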

4.2.2. Comparison Methods

To measure the superiority of the proposed method, four algorithms were adopted for comparison, the details of which are introduced as follows:
(1) MVPCA [31]: The covariance matrix of the bands is eigendecomposed to construct a loading factor matrix, and all bands are sorted according to the loading factor matrix.
(2) E-FDPC [32]: After all bands are clustered by similarity matrix, the local density and intraclass distance of each band are computed and normalized to rank the bands.
(3) FNGBS [29]: The bands are grouped by coarse–fine division strategy. Then, the band with the largest product is gained by local density and information entropy in each group.
(4) CDSP_MinV [28]: The bands are divided by maximizing the interclass distance-to-intraclass distance ratio in adjacent subspaces. In addition, it adopts the forward minimum variance search strategy to construct the band subset.

4.2.3. Number of Selected Bands

The number of bands that need to be selected is unknown in advance. Generally, different numbers of bands are utilized to analyze the performance of an algorithm. Therefore, we set the range to [5, 50] in our experiments.

4.3. Results

This section analyzes the performance of the proposed algorithm in detail from four aspects: the analysis of the parameter t, the study of the candidate representative bands, the effectiveness of the search strategy, and the classification performance.

4.3.1. Parameter Analysis

In subspace grouping, we set the number of iterations to t, which we investigate here. For the number of iterations t, we need to consider the most complex condition. Salinas is the dataset with the largest number of bands among the four public datasets. On this dataset, we divide all bands into 50 subspaces and observe the changes in the subspaces. The results are shown in Table 1.
In this most complex case, the subspaces change over the first few iterations, and in the sixth iteration there is no further change. This means that five iterations are enough to complete the subspace grouping in all cases. Therefore, we set t to 5: fine grouping stops when the number of iterations reaches 5 or when the partitioning points no longer move.

4.3.2. Study of Candidate Representative Bands

In our work, two bands with high information entropy and high similarity are chosen in each subspace. In order to explore the viability of this strategy, we compare it with two alternatives: selecting two bands with only high information entropy, and selecting two bands with only high similarity, in each subspace. These three groups are combined with the two classifiers to classify the Salinas dataset. The experimental results are shown in Table 2, Table 3 and Table 4.
Table 2 and Table 3 show the OA of the KNN and SVM classifiers, respectively, for the different groups. As seen from these tables, the two bands with high information entropy have an advantage when 40 or 50 bands are selected. The two bands with high similarity achieve higher accuracy than the two bands with high information entropy and similarity when 5, 40, or 50 bands are selected. For the other numbers of bands, the two bands with high information entropy and similarity show an obvious performance advantage. Table 4 shows the AOA of the two classifiers on the Salinas dataset. Overall, the AOA of the two bands with high information entropy and similarity is higher than that of the other groups; in addition, the AOA of the high-similarity group is higher than that of the high-information-entropy group. Generally, the two compared groups perform worse than the group with high information entropy and similarity, which verifies the effectiveness of the proposed selection strategy.

4.3.3. Effectiveness of Search Strategy

FNGBS has two parts, namely the coarse–fine division and the search strategy. This structure makes it convenient to verify the effectiveness of the search strategy in the proposed method. To do so, the search strategy of FNGBS is replaced with ours, and the resulting method is named FNGBS-OCS. The experiments are conducted in two aspects: (1) the OA for various numbers of bands; and (2) the AOA for numbers of bands within [5, 50].
Figure 4 displays the OA curves of the different classifiers on the two datasets, and Table 5 shows the AOA of the two classifiers on the two datasets. Both Figure 4 and Table 5 reveal that FNGBS-OCS obtains better results than FNGBS. The improvement is especially obvious when a small number of bands is selected. The reason is that FNGBS ignores band selection in terms of global optimality. This further demonstrates that the search strategy of the proposed method is effective at obtaining an optimal solution.

4.3.4. Comparison of Classification Performance

In order to demonstrate the performance of the proposed method, E-FDPC, FNGBS, MVPCA, and CDSP_MinV are compared with our OCS on the four public datasets.
Figure 5 and Figure 6 show the OA curves on the four datasets using the KNN and SVM classifiers. The OA of the proposed OCS is better than that of the other methods on all four datasets. Specifically, the OA of OCS is superior to the other algorithms when the number of bands is small. Since FNGBS does not perform band selection from a global perspective, it obtains a lower OA than OCS. FNGBS combines clustering and ranking; therefore, it acquires better results than E-FDPC and MVPCA. Although CDSP_MinV also adopts a forward search strategy, it focuses only on the variance of the bands without considering their information content; therefore, its performance is not satisfactory. E-FDPC achieves poor results on the Indian Pines and Botswana datasets, mainly because it rarely takes into account the similarity between bands. As for MVPCA, there is a substantial gap between it and the other competitors in the accuracy curves, which verifies that ignoring the redundancy of the selected bands results in unsatisfactory performance. Table 6 and Table 7 display the AOA on the four datasets. In contrast to the OA results, the AOA of OCS does not attain superior performance on some datasets, because the classification accuracy decreases when the number of selected bands is large. Nevertheless, the proposed method has advantages in most situations; for instance, OCS obtains a 9% higher AOA than E-FDPC on the Indian Pines dataset.
In summary, the proposed method takes into account both the information content and the similarity of bands, and it obtains the band subset from the perspective of a globally optimal solution. Therefore, OCS obtains better results than the other methods.

5. Conclusions

Existing band selection methods merely select, in each group, the single band with the highest score according to some criterion, which ignores band selection from a global perspective. To address this issue, this paper proposes a new method from the perspective of global optimization. Inspired by subspace partitioning methods, the neighboring subspaces are divided by local density; compared with other subspace partitioning methods, the evaluation criterion for partitioning points is not affected by changes of the subspaces. In addition, two bands with sufficient information and high similarity are searched for in each subspace as the candidate representative bands. The forward search strategy uses these candidate bands to construct a large number of combinations, and the evaluation function is then used to obtain the optimal solution. In this way, we obtain the band subset with the largest information content and the least correlation, which meets the requirements of band selection. Extensive experiments on four public datasets demonstrate that our method achieves a large performance improvement compared with other methods.
In future work, we will extend the proposed method in two aspects: first, the number of candidate representative bands in each subspace should be made adaptive to the number of bands to be selected; and second, we will seek to acquire the optimal band combination with greater speed and accuracy.

Author Contributions

Conceptualization, S.L. and B.P.; methodology, B.P. and Q.L.; software, B.P. and Q.L.; validation, B.P., L.F. and S.L.; investigation, S.L. and L.F.; writing original draft preparation, S.L. and B.P.; writing review and editing, L.F. and Q.L.; supervision, L.F. and Q.L.; project administration, S.L. and B.P.; All authors contributed to the results analysis and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The Indian Pines, University of Pavia and Salinas datasets are available online at http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes, accessed on 20 May 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Meng, Z.; Li, L.; Jiao, L.; Liang, M. Fully dense multiscale fusion network for hyperspectral image classification. Remote Sens. 2019, 11, 2718. [Google Scholar] [CrossRef] [Green Version]
  2. Thenkabail, P.S.; Mariotto, I.M.; Gumma, K.; Middleton, E.M.; Landis, D.R.; Huemmrich, K.F. Selection of hyperspectral narrowbands (HNBs) and composition of hyperspectral twoband vegetation indices (HVIs) for Biophysical Characterization and Discrimination of Crop Types Using Field Reflectance and Hyperion/EO-1 Data. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2013, 6, 427–439. [Google Scholar] [CrossRef] [Green Version]
  3. Saralıoğlu, E.; Görmüş, E.T.; Güngör, O. Mineral exploration with hyperspectral image fusion. In Proceedings of the 2016 24th Signal Processing and Communication Application Conference, Zonguldak, Turkey, 16–19 May 2016; pp. 1281–1284. [Google Scholar]
  4. Kwan, C.; Choi, J.; Chan, S.; Zhou, J.; Budavari, B. A super-resolution and fusion approach to enhancing hyperspectral images. Remote Sens. 2018, 10, 1416. [Google Scholar] [CrossRef] [Green Version]
  5. Zhou, Y.; Peng, J.; Chen, C.L.P. Dimension reduction using spatial and spectral regularized local discriminant embedding for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1082–1095. [Google Scholar] [CrossRef]
  6. Chin, T.-J.; Suter, D. Incremental kernel principal component analysis. IEEE Trans. Image. Process. 2007, 16, 1662–1674. [Google Scholar] [CrossRef]
  7. Wang, J.; Chang, C. Independent component analysis-based dimensionality reduction with applications in hyperspectral image analysis. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1586–1600. [Google Scholar] [CrossRef]
  8. Quan, Y.; Tong, Y.; Feng, W.; Dauphin, G.; Xing, M. Relative total variation structure analysis-based fusion method for hyperspectral and LiDAR data classification. Remote Sens. 2021, 13, 1143. [Google Scholar] [CrossRef]
  9. Martínez-Usó, A.; Pla, F.; Sotoca, J.M.; García-Sevilla, P. Clustering-based hyperspectral band selection using information measures. IEEE Trans. Geosci. Remote Sens. 2007, 45, 4158–4171. [Google Scholar] [CrossRef]
  10. Su, H.; Du, Q.; Chen, G.; Du, P. Optimized hyperspectral band selection using particle swarm optimization. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2014, 7, 2659–2670. [Google Scholar] [CrossRef]
  11. Feng, J.; Jiao, L.C.; Zhang, X.; Sun, T. Hyperspectral band selection based on trivariate mutual information and clonal selection. IEEE Trans. Geosci. Remote Sens. 2014, 52, 4092–4105. [Google Scholar] [CrossRef]
  12. Yang, H.; Du, Q.; Chen, G. Unsupervised hyperspectral band selection using graphics processing units. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2011, 4, 660–668. [Google Scholar] [CrossRef]
  13. Li, W.; Liu, J.; Du, Q. Sparse and low-rank graph for discriminant analysis of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4094–4105. [Google Scholar] [CrossRef]
  14. Pan, L.; Li, H.-C.; Li, W.; Chen, X.-D.; Wu, G.-N.; Du, Q. Discriminant analysis of hyperspectral imagery using fast kernel sparse and low-rank graph. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6085–6098. [Google Scholar] [CrossRef]
  15. Deng, Y.-J.; Li, H.-C.; Pan, L.; Shao, L.-Y.; Du, Q.; Emery, W.J. Modified tensor locality preserving projection for dimensionality reduction of hyperspectral images. IEEE Trans. Geosci. Remote Sens. Lett. 2018, 15, 277–281. [Google Scholar] [CrossRef]
  16. Deng, Y.; Li, H.; Fu, K.; Du, Q.; Emery, W.J. Tensor low-rank discriminant embedding for hyperspectral image dimensionality reduction. IEEE Trans. Geosci. Remote Sens. 2018, 56, 7183–7194. [Google Scholar] [CrossRef]
  17. Chang, C.-I.; Wang, S. Constrained band selection for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1575–1585. [Google Scholar] [CrossRef]
  18. Li, H.C.; Chang, C.-I.; Li, Y.; Wang, L. Constrained multiple band selection for hyperspectral imagery. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium—IGARSS, Beijing, China, 11–15 July 2016; pp. 6149–6152. [Google Scholar]
  19. Wang, Y.; Wang, L.; Yu, C.; Zhao, E.; Song, M.; Wen, C.H.; Chang, C.I. Constrained-target band selection for multiple-target detection. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6079–6103. [Google Scholar] [CrossRef]
  20. Sellami, A.; Farah, M.; Farah, I.R.; Solaiman, B. Hyperspectral imagery semantic interpretation based on adaptive constrained band selection and knowledge extraction techniques. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2018, 11, 1337–1347. [Google Scholar] [CrossRef]
  21. Liao, D.; Qian, Y.; Tang, Y.Y. Constrained manifold learning for hyperspectral imagery visualization. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2018, 11, 1213–1226. [Google Scholar] [CrossRef] [Green Version]
  22. Datta, A.; Ghosh, S.; Ghosh, A. Combination of clustering and ranking techniques for unsupervised band selection of hyperspectral images. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2015, 8, 2814–2823. [Google Scholar] [CrossRef]
  23. Bevilacqua, M.; Berthoumieu, Y. Multiple-feature kernel-based probabilistic clustering for unsupervised band selection. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6675–6689. [Google Scholar] [CrossRef] [Green Version]
  24. Chang, C.-I.; Kuo, Y.-M.; Chen, S.; Liang, C.-C.; Ma, K.Y.; Hu, P.F. Self-mutual information-based band selection for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 5979–5997. [Google Scholar] [CrossRef]
  25. Huang, S.; Zhang, H.; Pižurica, A. A structural subspace clustering approach for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [Google Scholar] [CrossRef]
  26. Mehta, A.; Dikshit, O. Segmentation-based projected clustering of hyperspectral images using mutual nearest neighbour. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2017, 10, 5237–5244. [Google Scholar] [CrossRef]
  27. Wang, Q.; Li, Q.; Li, X. Hyperspectral band selection via adaptive subspace partition strategy. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2019, 12, 4940–4950. [Google Scholar] [CrossRef]
  28. Sun, X.; Zhang, H.; Xu, F.; Zhu, Y.; Fu, X. Constrained-target band selection with subspace partition for hyperspectral target detection. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2021, 14, 9147–9161. [Google Scholar] [CrossRef]
  29. Wang, Q.; Li, Q.; Li, X. A fast neighborhood grouping method for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 5028–5039. [Google Scholar] [CrossRef]
  30. Wang, Q.; Zhang, F.; Li, X. Optimal clustering framework for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5910–5922. [Google Scholar] [CrossRef] [Green Version]
  31. Chang, C.; Du, Q.; Sun, T.; Althouse, M.L.G. A joint band prioritization and band-decorrelation approach to band selection for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 1999, 37, 2631–2641. [Google Scholar] [CrossRef] [Green Version]
  32. Jia, S.; Tang, G.; Zhu, J.; Li, Q. A novel ranking-based clustering approach for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2016, 54, 88–102. [Google Scholar] [CrossRef]
  33. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 3–55. [Google Scholar] [CrossRef] [Green Version]
  34. Jiao, L.; Feng, J.; Liu, F.; Sun, T.; Zhang, X. Semisupervised affinity propagation based on normalized trivariable mutual information for hyperspectral band selection. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2015, 8, 2760–2773. [Google Scholar] [CrossRef]
  35. Yang, C.; Liu, S.; Bruzzone, L.; Guan, R.; Du, P. A feature-metric-based affinity propagation technique for feature selection in hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2013, 10, 1152–1156. [Google Scholar] [CrossRef]
  36. Tang, G.; Jia, S.; Li, J. An enhanced density peak-based clustering approach for hyperspectral band selection. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium—IGARSS, Milan, Italy, 26–31 July 2015; pp. 1116–1119. [Google Scholar]
  37. Erven, T.V.; Harremos, P. Rényi divergence and kullback-leibler divergence. IEEE Trans. Inf. Theory 2014, 60, 3797–3820. [Google Scholar] [CrossRef] [Green Version]
  38. Polyanskiy, Y.; Wu, Y. Peak-to-average power ratio of good codes for gaussian channel. IEEE Trans. Inf. Theory 2014, 60, 7655–7660. [Google Scholar] [CrossRef] [Green Version]
  39. Hu, W.; Gao, J.; Li, B.; Wu, O.; Du, J.; Maybank, S. Anomaly detection using local kernel density estimation and context-based regression. IEEE Trans. Knowl. Data Eng. 2020, 32, 218–233. [Google Scholar] [CrossRef] [Green Version]
  40. Datta, N. Min- and max-relative entropies and a new entanglement monotone. IEEE Trans. Inf. Theory 2009, 55, 2816–2826. [Google Scholar] [CrossRef] [Green Version]
  41. Gour, G.; Tomamichel, M. Entropy and relative entropy from information-theoretic principles. IEEE Trans. Inf. Theory 2021, 67, 6313–6327. [Google Scholar] [CrossRef]
  42. Kontoyiannis, I.; Madiman, M. Sumset and inverse sumset inequalities for differential entropy and mutual information. IEEE Trans. Inf. Theory 2014, 60, 4503–4514. [Google Scholar] [CrossRef] [Green Version]
  43. Xu, P.; Susilo, W.; Wang, W.; Chen, T.; Wu, Q.; Liang, K. ROSE: Robust searchable encryption with forward and backward security. IEEE Trans. Inf. Forensics Secur. 2022, 17, 1115–1130. [Google Scholar] [CrossRef]
  44. Schouhamer Immink, K.A.; Weber, J.H. Hybrid minimum pearson and Euclidean distance detection. IEEE Trans. Commun. 2015, 63, 3290–3298. [Google Scholar] [CrossRef]
Figure 1. The description of the objective.
Figure 2. Flowchart of OCS. All bands are divided into m groups by coarse–fine grouping. In addition, two candidate representative bands are searched in each subspace and the search strategy is utilized to obtain the optimal combination.
Figure 3. An example of fine grouping to partition 16 bands. The input and output are, respectively, the result of coarse grouping and the result of fine grouping.
Figure 4. The OA curves of different classifiers on two datasets: (a) KNN on Pavia University; (b) SVM on Pavia University; (c) KNN on Salinas; and (d) SVM on Salinas.
Figure 5. The OA curves of KNN on different datasets. (a) Indian Pines; (b) Pavia University; (c) Salinas; and (d) Botswana.
Figure 6. The OA curves of SVM on different datasets. (a) Indian Pines; (b) Pavia University; (c) Salinas; and (d) Botswana.
Table 1. Subspace grouping results of different iterations. The bold part in the second column indicates the partitioning points of the change.
Number of Iterations | Subspace Partitioning Points
1 | 1, 4, 8, 14, 17, 20, 28, 31, 36, 37, 44, 45, 50, 55, 60, 62, 64, 73, 77, 81, 82, 85, 88, 97, 99, 105, 107, 113, 116, 118, 125, 128, 132, 137, 142, 145, 147, 153, 158, 161, 163, 170, 174, 175, 181, 183, 186, 189, 195, 198, 204
2 | 1, 3, 6, 14, 17, 19, 28, 31, 37, 38, 44, 45, 50, 58, 62, 63, 64, 76, 80, 82, 84, 85, 88, 99, 103, 107, 111, 113, 116, 118, 125, 128, 135, 140, 144, 145, 147, 153, 160, 161, 163, 173, 175, 179, 181, 183, 186, 188, 192, 198, 204
3 | 1, 3, 5, 10, 14, 17, 28, 31, 37, 38, 44, 45, 50, 60, 62, 63, 64, 79, 82, 84, 85, 86, 88, 99, 106, 107, 113, 114, 116, 118, 125, 132, 138, 143, 145, 146, 147, 153, 161, 162, 163, 175, 176, 181, 182, 183, 186, 188, 191, 195, 204
4 | 1, 3, 5, 8, 14, 17, 28, 31, 37, 38, 44, 45, 50, 62, 63, 64, 65, 81, 82, 84, 85, 86, 88, 103, 107, 111, 113, 116, 117, 118, 128, 136, 141, 145, 146, 147, 151, 153, 161, 162, 163, 175, 179, 181, 182, 183, 186, 188, 191, 195, 204
5 | 1, 3, 5, 8, 14, 17, 28, 31, 37, 38, 44, 45, 50, 62, 63, 64, 65, 82, 84, 85, 86, 88, 96, 106, 107, 113, 114, 116, 117, 118, 132, 139, 144, 145, 146, 147, 153, 154, 161, 162, 163, 175, 181, 182, 183, 184, 186, 188, 191, 195, 204
6 | 1, 3, 5, 8, 14, 17, 28, 31, 37, 38, 44, 45, 50, 62, 63, 64, 65, 82, 84, 85, 86, 88, 96, 106, 107, 113, 114, 116, 117, 118, 132, 139, 144, 145, 146, 147, 153, 154, 161, 162, 163, 175, 181, 182, 183, 184, 186, 188, 191, 195, 204
Table 2. OA of KNN classifier on Salinas dataset (%). Our method is highlighted in bold.
Select Strategy | 5 | 8 | 10 | 12 | 15 | 20 | 25 | 30 | 40 | 50
High Information Entropy and Similarity | 86.16 | 87.85 | 89.12 | 88.90 | 89.10 | 89.05 | 89.24 | 88.94 | 89.08 | 88.96
High Information Entropy | 85.60 | 86.93 | 88.33 | 88.23 | 88.57 | 88.96 | 89.22 | 89.21 | 89.06 | 89.12
High Similarity | 87.12 | 87.25 | 88.69 | 88.81 | 89.01 | 89.05 | 88.32 | 88.30 | 89.20 | 89.22
Table 3. OA of SVM classifier on Salinas dataset (%). Our method is highlighted in bold.
Select Strategy | 5 | 8 | 10 | 12 | 15 | 20 | 25 | 30 | 40 | 50
High Information Entropy and Similarity | 87.89 | 90.33 | 91.39 | 91.51 | 91.73 | 92.12 | 92.19 | 92.34 | 92.33 | 92.52
High Information Entropy | 87.86 | 89.89 | 91.00 | 91.45 | 91.60 | 91.89 | 92.14 | 92.31 | 92.41 | 92.63
High Similarity | 89.29 | 89.79 | 91.02 | 91.36 | 91.63 | 92.00 | 91.71 | 92.08 | 92.61 | 92.74
Table 4. AOA of two classifiers on Salinas dataset (%). Our method is highlighted in bold.
Select Strategy | KNN | SVM
High Information Entropy and Similarity | 88.61 | 91.52
High Information Entropy | 88.32 | 91.32
High Similarity | 88.50 | 91.34
Table 5. AOA of KNN and SVM on Pavia University and Salinas (%).
Classifier | Method | Pavia University | Salinas
KNN | FNGBS-OCS | 85.40 | 88.81
KNN | FNGBS [29] | 85.12 | 88.49
SVM | FNGBS-OCS | 85.65 | 88.96
SVM | FNGBS [29] | 85.38 | 88.70
Table 6. AOA of KNN on four public datasets (%).
Method | Indian Pines | Pavia University | Salinas | Botswana
FNGBS [29] | 66.88 | 85.12 | 88.49 | 82.77
MVPCA [31] | 56.09 | 78.55 | 81.95 | 71.92
E-FDPC [32] | 61.55 | 86.28 | 88.60 | 82.33
CDSP_MinV [28] | 66.15 | 85.04 | 88.35 | 82.43
OCS | 66.18 | 84.56 | 88.62 | 83.50
Table 7. AOA of SVM on four public datasets (%).
Method | Indian Pines | Pavia University | Salinas | Botswana
FNGBS [29] | 75.26 | 89.61 | 91.36 | 87.50
MVPCA [31] | 63.49 | 81.78 | 85.43 | 80.53
E-FDPC [32] | 67.11 | 90.02 | 91.43 | 86.11
CDSP_MinV [28] | 75.21 | 89.09 | 91.14 | 87.73
OCS | 76.18 | 89.12 | 91.32 | 88.36
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
