Article

Dimensionality Reduction of Hyperspectral Image Based on Local Constrained Manifold Structure Collaborative Preserving Embedding

1 School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
3 The Key Laboratory on Opto-electronic Technique and Systems, Ministry of Education, Chongqing University, Chongqing 400044, China
* Author to whom correspondence should be addressed.
Academic Editors: Lefei Zhang, Liangpei Zhang, Qian Shi and Yanni Dong
Received: 10 February 2021 / Revised: 26 March 2021 / Accepted: 29 March 2021 / Published: 2 April 2021
(This article belongs to the Special Issue Recent Advances in Hyperspectral Image Processing)

Abstract

Graph learning is an effective dimensionality reduction (DR) approach for analyzing the intrinsic properties of high-dimensional data. It has been widely used for DR of hyperspectral image (HSI) data, but existing methods ignore the collaborative relationship between sample pairs. In this paper, a novel supervised spectral DR method called local constrained manifold structure collaborative preserving embedding (LMSCPE) is proposed for HSI classification. First, a novel local constrained collaborative representation (CR) model is designed based on CR theory, which obtains more effective collaborative coefficients to characterize the relationship between sample pairs. Then, an intraclass collaborative graph and an interclass collaborative graph are constructed to enhance the intraclass compactness and the interclass separability, and a local neighborhood graph is constructed to preserve the local neighborhood structure of HSI. Finally, an optimal objective function is designed to obtain a discriminant projection matrix, from which discriminative features of various land cover types can be extracted. LMSCPE characterizes the collaborative relationship between sample pairs and explores the intrinsic geometric structure of HSI. Experiments on three benchmark HSI data sets show that the proposed LMSCPE method is superior to state-of-the-art DR methods for HSI classification.
Keywords: hyperspectral image; graph learning; dimensionality reduction; collaborative representation; local neighborhood structure

1. Introduction

Hyperspectral imagery (HSI) captures reflectance values over a wide range of the electromagnetic spectrum for each pixel, so it can distinguish more subtle differences between land cover types than traditional multispectral imagery (MSI) [1,2,3,4]. Owing to its detailed spatial structure and spectral information, HSI has been widely used in urban planning, environmental monitoring, precision agriculture, and land-cover classification [5,6,7]. However, the improvement in the spectral and spatial resolution of hyperspectral sensors has led to high-dimensional data sets, and processing such data requires considerable computational resources and storage capacity [8,9,10]. Moreover, classification performance often deteriorates as the dimensionality increases (the Hughes phenomenon) [11]. Therefore, it is of great importance to perform dimensionality reduction (DR) on HSI data while preserving the useful feature information [12,13].
Serving as a good tool for DR, graph learning methods have attracted increasing attention from researchers by mapping high-dimensional data into a lower-dimensional embedding space [14,15,16]. Based on this theory, many graph learning algorithms and their variants have been proposed to reveal the intrinsic geometric structure of high-dimensional data [17,18,19], such as Laplacian eigenmaps (LE) [20], locally linear embedding (LLE) [21], and isometric feature mapping (ISOMAP) [22]. LE finds low-dimensional representations of high-dimensional data by preserving the local geometry between samples [23]. LLE computes the low-dimensional features that best preserve the local geometry of each locally linear patch, seeking a lower-dimensional projection of the data that preserves distances within local neighborhoods [24]. ISOMAP preserves the geodesic distances of all similarity pairs to handle highly nonlinear manifolds, approximating the geodesic distance between two points by the shortest path between them [25]. However, because these algorithms are nonlinear, the DR process cannot produce an explicit projection matrix that maps test samples into the low-dimensional space [26]. Therefore, it is difficult for nonlinear algorithms to process new samples.
To overcome this drawback, researchers have proposed a series of linear graph learning methods [27,28], including locality preserving projection (LPP) [29], discriminative supervised neighborhood preserving embedding (DSNPE) [30], and multi-manifold discriminant analysis (MMDA) [31]. LPP constructs an adjacency matrix that weights the distance between each pair of sample points to learn a projection preserving the local manifold structure of the data. DSNPE finds the optimal projection direction by pulling neighboring points with the same class label as close as possible while simultaneously pushing neighboring points with different labels as far apart as possible. MMDA designs an intraclass graph and an interclass graph to characterize the intraclass compactness and the interclass separability, and seeks the discriminant matrix by simultaneously maximizing the interclass scatter and minimizing the intraclass scatter. To unify these algorithms, a graph embedding (GE) framework was designed to analyze graph learning methods on the basis of statistical or geometric theory [32,33,34]. Based on this framework, the discriminant analysis with graph learning (DAGL) method was proposed, which pulls within-class similar samples together while pushing between-class similar samples far away [35]. However, the above graph learning methods construct adjacency graphs mainly based on pairwise Euclidean distance; their classification performance is therefore sensitive to data noise, which may result in suboptimal graph representations [36,37,38].
Recently, graph learning methods based on sparse representation have achieved good classification performance [39,40,41]. The reason is that sparse coefficients can characterize the relationship between sample pairs more accurately and preserve more valuable information of HSI data for DR. Among these methods, sparse graph based discriminant analysis (SGDA) exploits the discriminant capability of sparse representation [42] and enhances the discriminant power with label information. Sparse manifold embedding (SME) utilizes the sparse coefficients of affine subspaces to construct a similarity graph and preserves this sparse similarity in the embedding space [43]. Discriminative learning by sparse representation projections (DLSP) incorporates the merits of both the local geometric structure and the global sparsity property, giving it better discrimination performance across classes [44]. Although sparse graph-based methods are superior to traditional graph learning methods, they suffer from over-sparseness when there are few samples. Besides, these methods solve for the sparse coefficients with the $\ell_1$-norm, an iterative procedure that may lead to a higher computational cost [45,46]. Therefore, collaborative representation has been introduced into graph learning to avoid the above-mentioned problems [47,48]. Collaborative graph-based discriminant analysis (CGDA) exploits the discriminant capability of collaborative representation; it can effectively characterize the relationship between sample pairs and improve the classification performance [49]. Manifold aware discriminant collaborative graph embedding (MADCGE) constructs an adjacency graph with the collaborative representation coefficients; it preserves the linear reconstructive relationships between samples and sufficiently utilizes the merits of label information and the nonlinear manifold structure to further improve the discriminative ability [50].
Collaborative representation based local discriminant projection (CRLDP) utilizes collaborative representation relationships among samples to construct the within-class graph and the between-class graph, which characterize the compactness and separability of samples, and then seeks an optimal projection matrix by maximizing the ratio of the between-class scatter to the within-class scatter [51]. However, these methods only consider the global geometry of high-dimensional data and ignore the local neighborhood information of within-class samples, which limits the discriminant ability of collaborative representation.
To characterize the collaborative relationship and intrinsic structure of HSI data, we propose a novel supervised spectral DR method, termed local constrained manifold structure collaborative preserving embedding (LMSCPE), for HSI classification. The LMSCPE method makes full use of the collaborative relationship and the local neighborhood information of HSI to extract discriminant features for classification. The main contributions of this paper are as follows: (1) Based on collaborative representation theory, we propose a novel local constrained CR model, which obtains more effective collaborative coefficients to characterize the relationship between sample pairs. (2) According to the graph embedding framework, an intraclass collaborative graph and an interclass collaborative graph are constructed to enhance the intraclass compactness and the interclass separability. (3) To preserve the local neighborhood structure of HSI, a local neighborhood graph is constructed from the k-nearest neighbors of each sample, which improves the aggregation of HSI data. (4) An optimal objective function is designed to obtain a discriminant projection matrix, which extracts discriminant features and subsequently improves the classification performance of HSI.
The remainder of this paper is organized as follows: Section 2 gives a brief description of GE and collaborative representation (CR). Section 3 describes the details of the proposed LMSCPE method. Section 4 presents the parameter analysis of LMSCPE for achieving the best classification performance. Section 5 provides analysis and discussion of the experimental results. Finally, Section 6 summarizes this paper and gives recommendations for future work.

2. Related Works

For convenience, denote an HSI data set with $D$ bands by $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{D \times N}$, where $N$ is the number of pixels in the HSI. Suppose the class label of the $i$-th pixel is $l_i \in \{1, 2, \ldots, c\}$, where $c$ is the total number of classes in the HSI. The goal of DR is to map $X \in \mathbb{R}^{D \times N}$ into $Y \in \mathbb{R}^{d \times N}$, where $d \ll D$ is the embedding dimensionality. For linear DR methods, the low-dimensional embedding features $Y \in \mathbb{R}^{d \times N}$ are computed as $Y = V^T X$, where $V \in \mathbb{R}^{D \times d}$ is the projection matrix.
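As a small illustration of this notation, the following NumPy sketch (with random data standing in for HSI pixels; the sizes are arbitrary assumptions) shows the shapes involved in the linear mapping $Y = V^T X$:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, d = 103, 200, 30           # bands, pixels, embedding dimensionality (arbitrary sizes)
X = rng.standard_normal((D, N))  # pixels as columns: X in R^{D x N}
V = rng.standard_normal((D, d))  # a projection matrix: V in R^{D x d}

Y = V.T @ X                      # low-dimensional embedding: Y in R^{d x N}
print(Y.shape)                   # (30, 200)
```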

2.1. Graph Embedding

The graph embedding (GE) framework redefines most DR algorithms in a unified framework; it characterizes the statistical and geometric properties of the data by an intrinsic graph $G = \{X, W\}$ and a penalty graph $G^P = \{X, W^P\}$, where $X$ is the vertex set, and $W \in \mathbb{R}^{N \times N}$ and $W^P \in \mathbb{R}^{N \times N}$ are the weight matrices [52,53]. In GE, the intrinsic graph $G$ describes the similarity relationships between different vertices, while the penalty graph $G^P$ reveals the dissimilarity properties between vertex pairs.
The GE framework aims to represent a graph in a low dimensional space which preserves as much graph property information as possible. The optimal objective function can be given by a graph preserving criterion:
$$\min_{\operatorname{tr}(Y^T H Y) = h} \frac{1}{2} \sum_{i \neq j} \| y_i - y_j \|^2 w_{ij} = \min_{\operatorname{tr}(Y^T H Y) = h} \operatorname{tr}(Y^T L Y)$$
in which $L$ is the Laplacian matrix of the intrinsic graph $G$, and $H$ is a constraint matrix introduced to avoid a trivial solution; it is typically a diagonal matrix or the Laplacian matrix of the penalty graph $G^P$. Then, $L$ and $L^P$ are given by
$$L = D - W, \quad D_{ii} = \sum_{j \neq i}^{N} w_{ij}, \quad W = \left[ w_{ij} \right]_{i,j=1}^{N}$$
$$L^P = D^P - W^P, \quad D_{ii}^P = \sum_{j \neq i}^{N} w_{ij}^P, \quad W^P = \left[ w_{ij}^P \right]_{i,j=1}^{N}$$
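To make the graph-preserving criterion concrete, the following sketch (a hypothetical 3-vertex graph, not from the paper) builds the Laplacian $L = D - W$ and checks numerically that $\frac{1}{2}\sum_{ij} \| y_i - y_j \|^2 w_{ij} = \operatorname{tr}(Y L Y^T)$ when the embeddings $y_i$ are stored as columns of $Y$:

```python
import numpy as np

def laplacian(W):
    """Graph Laplacian L = D - W, with D the diagonal degree matrix."""
    return np.diag(W.sum(axis=1)) - W

# a hypothetical 3-vertex graph
W = np.array([[0., 1., 0.],
              [1., 0., 2.],
              [0., 2., 0.]])
L = laplacian(W)

# 1-D embedding of the 3 vertices, stored as columns of Y (d x N convention)
Y = np.array([[1., 0., 2.]])
lhs = 0.5 * sum(W[i, j] * (Y[0, i] - Y[0, j]) ** 2 for i in range(3) for j in range(3))
rhs = np.trace(Y @ L @ Y.T)
assert np.isclose(lhs, rhs)      # the graph-preserving identity holds
```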

2.2. Collaborative Representation

In the hyperspectral imagery community, many representative graph learning methods have been proposed to preserve the intrinsic structure information of HSI data; these methods often employ sparse representation coefficients to characterize the relationship between different samples. However, sparse representation is an iterative procedure that is usually computationally expensive, and its solution is often sub-optimal. Therefore, collaborative representation (CR) has received considerable attention recently [54,55,56].
The basic idea of CR is that a query sample can be reconstructed by a set of training samples, and the reconstruction coefficients obtained by $\ell_2$-norm optimization can be treated as the affinities between the query sample and the other training samples. For each pixel $x_i$ in HSI, the typical collaborative representation can be formulated as the following $\ell_2$-norm optimization problem:
$$\min_{\alpha_i} \| \alpha_i \|_2 \quad \text{s.t.} \quad \| x_i - Z_i \alpha_i \|_2 \leq \epsilon$$
in which $Z_i \in \mathbb{R}^{D \times (N-1)}$ is the dictionary excluding $x_i$ itself, $\epsilon > 0$ is a small tolerance, and $\alpha_i = [\alpha_{i,1}, \ldots, \alpha_{i,N-1}]^T \in \mathbb{R}^{N-1}$ is the vector of collaborative representation coefficients of length $(N-1)$. Then, the objective function of CR can be reformulated as
$$\min_{\alpha_i} \| x_i - Z_i \alpha_i \|_2^2 + \gamma \| \alpha_i \|_2^2$$
where $\gamma$ is a tuning parameter that balances the residual term $\| x_i - Z_i \alpha_i \|_2$ and the regularization term $\| \alpha_i \|_2$ of the collaborative representation.
With some mathematical operations, the collaborative representation vector $\alpha_i$ can be calculated as
$$\alpha_i = \left( Z_i^T Z_i + \gamma I \right)^{-1} Z_i^T x_i$$
in which $I$ is the identity matrix of size $(N-1) \times (N-1)$.
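The closed-form CR solution above is an ordinary ridge regression; a minimal sketch with synthetic data (the `gamma` value and problem sizes are arbitrary assumptions) verifies it by checking that the gradient of the objective vanishes at the solution:

```python
import numpy as np

def collaborative_coefficients(x, Z, gamma):
    """Closed-form CR solution: alpha = (Z^T Z + gamma I)^{-1} Z^T x."""
    n = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + gamma * np.eye(n), Z.T @ x)

rng = np.random.default_rng(1)
Z = rng.standard_normal((50, 20))    # dictionary: 20 training pixels with 50 bands
x = rng.standard_normal(50)          # query pixel
alpha = collaborative_coefficients(x, Z, gamma=0.1)

# alpha minimizes ||x - Z a||_2^2 + gamma ||a||_2^2, so the gradient vanishes
grad = 2 * Z.T @ (Z @ alpha - x) + 2 * 0.1 * alpha
assert np.allclose(grad, 0)
```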

3. Local Constrained Manifold Structure Collaborative Preserving Embedding

To characterize the discriminative properties and intrinsic structure of the HSI data, a local constrained manifold structure collaborative preserving embedding (LMSCPE) method was proposed for DR. LMSCPE first designs a novel CR model to discover the collaborative relationship between different samples that belong to the same class. Based on the collaborative representation coefficients, it constructs an intraclass collaborative graph and an interclass collaborative graph to characterize the intraclass compactness and the interclass separability. Then, it selects neighbors to construct a local neighborhood graph, which can effectively preserve the local geometric structure of HSI. After that, the collaborative graph and local neighborhood graph are incorporated to learn an effective projection matrix. LMSCPE can preserve the local neighborhood structure in HSI and enhance the discrimination power of embedding features. The flowchart of the proposed LMSCPE method is shown in Figure 1.

3.1. Local Constrained Collaborative Graph Analysis Model

Due to nonlinear optical effects during transmission of the spectrum through the atmosphere, such as reflection, absorption, and dispersion, the spectral curves of pixels of the same category usually exhibit subtle differences, which affect the final classification performance of HSI [57,58,59]. Existing graph learning algorithms usually employ a distance-based weight matrix to reflect the similarity relationship between sample pairs and construct a graph structure to map the high-dimensional data into a low-dimensional space for DR [60,61]. However, the subtle differences in spectral curves often make the weight matrix inaccurate [62]. Therefore, according to CR theory, we design a local constrained collaborative representation (LCCR) model to obtain more accurate weight coefficients.
In the proposed LCCR model, we incorporate locality-constrained terms into collaborative representation, and the minimization problem can be formulated as
$$\arg\min_{\alpha_i} \| x_i - Z_i \alpha_i \|_2^2 + \gamma \| \Gamma_i \alpha_i \|_2^2 + \delta \| x_i - \hat{Z}_i \alpha_i \|_2^2$$
in which $\gamma$ and $\delta$ are two regularization parameters that balance the regularization terms against the residual part. $\hat{Z}_i$ consists of the $k$-nearest neighbors of $x_i$, with the remaining $(N-k)$ columns set to zero vectors, and $\Gamma_i$ is a biasing Tikhonov matrix given by
$$\Gamma_i = \begin{bmatrix} \| x_1 - x_i \|_2 & & 0 \\ & \ddots & \\ 0 & & \| x_{N-1} - x_i \|_2 \end{bmatrix}$$
where $x_1, x_2, \ldots, x_{N-1}$ are the columns of matrix $Z_i$.
In (7), the introduction of $\Gamma_i$ in the first regularization term adaptively adjusts the collaborative coefficients according to the distance between sample pairs, while the second regularization term explores the local geometric structure of HSI, which effectively enhances the aggregation of HSI data.
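A sketch of the LCCR solution under the model above (synthetic data; the k-NN dictionary is simulated by zeroing non-neighbor columns, and the `gamma`, `delta` values are arbitrary assumptions):

```python
import numpy as np

def lccr_coefficients(x, Z, Z_hat, gamma=1.0, delta=0.5):
    """LCCR solution sketch: adds a distance-based Tikhonov term and a
    k-NN reconstruction term to plain collaborative representation."""
    Gamma = np.diag(np.linalg.norm(Z - x[:, None], axis=0))  # biasing Tikhonov matrix
    A = Z.T @ Z + gamma * Gamma.T @ Gamma + delta * Z_hat.T @ Z_hat
    b = (Z.T + delta * Z_hat.T) @ x
    return np.linalg.solve(A, b)

rng = np.random.default_rng(2)
Z = rng.standard_normal((30, 10))    # dictionary: 10 atoms with 30 bands
Z_hat = Z.copy()
Z_hat[:, 3:] = 0.0                   # pretend the first 3 atoms are the k-NN of x
x = rng.standard_normal(30)
alpha = lccr_coefficients(x, Z, Z_hat)
assert alpha.shape == (10,)
```

Setting the gradient of the three-term objective to zero yields exactly the normal equations solved here, which matches the closed-form expression used later for the intraclass and interclass coefficients.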
According to the LCCR model, we construct the intraclass collaborative graph and the interclass collaborative graph to characterize the intraclass compactness and the interclass separability. For the intraclass collaborative graph, the weight matrix is set to the collaborative coefficients between each point and the points from the same class. Denote the samples of the $i$-th class by $X_i = [x_{i1}, \ldots, x_{iN_i}]$, where $N_i$ is the number of samples of the $i$-th class. Then, the intraclass dictionary $Z_{ij}^w$ of $x_{ij}$ is given by
$$Z_{ij}^w = X_i \setminus \{ x_{ij} \} = \left[ x_{i1}, \ldots, x_{i(j-1)}, x_{i(j+1)}, \ldots, x_{iN_i} \right] \in \mathbb{R}^{D \times (N_i - 1)}$$
Therefore, the intraclass collaborative representation coefficients of $x_{ij}$ can be solved by
$$\alpha_{ij}^w = \arg\min_{\alpha_{ij}^w} \| x_{ij} - Z_{ij}^w \alpha_{ij}^w \|_2^2 + \gamma \| \Gamma_i^w \alpha_{ij}^w \|_2^2 + \delta \| x_{ij} - \hat{Z}_{ij}^w \alpha_{ij}^w \|_2^2$$
With some mathematical operations, the collaborative representation coefficients $\alpha_{ij}^w$ can be calculated as
$$\alpha_{ij}^w = \left[ (Z_{ij}^w)^T Z_{ij}^w + \gamma (\Gamma_i^w)^T \Gamma_i^w + \delta (\hat{Z}_{ij}^w)^T \hat{Z}_{ij}^w \right]^{-1} \left[ \delta (\hat{Z}_{ij}^w)^T + (Z_{ij}^w)^T \right] x_{ij}$$
in which $\Gamma_i^w$ is the Tikhonov matrix of $x_{ij}$, given by
$$\Gamma_i^w = \begin{bmatrix} \| x_{i1} - x_{ij} \|_2 & & 0 \\ & \ddots & \\ 0 & & \| x_{i(N_i - 1)} - x_{ij} \|_2 \end{bmatrix}$$
where $x_{i1}, x_{i2}, \ldots, x_{i(N_i - 1)}$ are the columns of matrix $Z_{ij}^w$.
After obtaining the collaborative coefficients of each class, the intraclass weight matrix $W_s$ can be constructed as
$$W_s = \begin{bmatrix} W_s^1 & 0 & \cdots & 0 \\ 0 & W_s^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W_s^c \end{bmatrix}$$
in which $0$ denotes a zero matrix, and the entries of each block $W_s^i$ are defined as
$$\left( W_s^i \right)_{jk} = \begin{cases} 0, & k = j \\ \alpha_{ij,k}^w, & k < j \\ \alpha_{ij,(k-1)}^w, & k > j \end{cases}$$
where $\alpha_{ij,k}^w$ is the $k$-th element of $\alpha_{ij}^w$, and $(W_s^i)_{jk}$ is the entry in the $j$-th row and $k$-th column of $W_s^i$.
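The index shift in this definition — skipping the diagonal because a sample is excluded from its own dictionary — can be sketched as follows (toy coefficient vectors, not real CR solutions):

```python
import numpy as np

def intraclass_weight_block(alphas):
    """Assemble one class block W_s^i from per-sample coefficient vectors.

    alphas[j] has length N_i - 1 (sample j is excluded from its own
    dictionary), so column indices past the diagonal shift by one."""
    n = len(alphas)
    W = np.zeros((n, n))
    for j, a in enumerate(alphas):
        W[j, :j] = a[:j]          # k < j  ->  k-th coefficient
        W[j, j + 1:] = a[j:]      # k > j  ->  (k-1)-th coefficient
    return W

alphas = [np.array([0.2, 0.3]),   # sample 0 vs. samples 1, 2
          np.array([0.5, 0.1]),   # sample 1 vs. samples 0, 2
          np.array([0.4, 0.6])]   # sample 2 vs. samples 0, 1
W = intraclass_weight_block(alphas)
assert np.all(np.diag(W) == 0)
assert W[1, 0] == 0.5 and W[1, 2] == 0.1
```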
For the interclass collaborative graph, the interclass dictionary $Z_{ij}^b$ of $x_{ij}$ is designed as
$$Z_{ij}^b = X \setminus X_i = \left[ x_{11}, \ldots, x_{(i-1)N_{i-1}}, x_{(i+1)1}, \ldots, x_{cN_c} \right] \in \mathbb{R}^{D \times (N - N_i)}$$
where $x_{cN_c}$ is the $N_c$-th sample of the $c$-th class.
Then, the interclass collaborative representation coefficients of $x_{ij}$ are given by
$$\alpha_{ij}^b = \arg\min_{\alpha_{ij}^b} \| x_{ij} - Z_{ij}^b \alpha_{ij}^b \|_2^2 + \gamma \| \Gamma_i^b \alpha_{ij}^b \|_2^2 + \delta \| x_{ij} - \hat{Z}_{ij}^b \alpha_{ij}^b \|_2^2$$
With some mathematical operations, the collaborative representation coefficients $\alpha_{ij}^b$ can be calculated as
$$\alpha_{ij}^b = \left[ (Z_{ij}^b)^T Z_{ij}^b + \gamma (\Gamma_i^b)^T \Gamma_i^b + \delta (\hat{Z}_{ij}^b)^T \hat{Z}_{ij}^b \right]^{-1} \left[ \delta (\hat{Z}_{ij}^b)^T + (Z_{ij}^b)^T \right] x_{ij}$$
where $\Gamma_i^b$ has the same form as $\Gamma_i^w$.
Considering that the CR coefficients can effectively characterize the similarity relationship between sample pairs, we adopt the coefficients to construct the interclass weight matrix $W_b$ as follows:
$$W_b = \left[ (W_b^1)^T, (W_b^2)^T, \ldots, (W_b^c)^T \right]^T$$
where
$$\left( W_b^i \right)_{jk} = \begin{cases} \alpha_{ij,k}^b, & 0 < k \leq \sum_{t=1}^{i-1} N_t \\ 0, & \sum_{t=1}^{i-1} N_t < k \leq \sum_{t=1}^{i} N_t \\ \alpha_{ij,(k - N_i)}^b, & \sum_{t=1}^{i} N_t < k \leq \sum_{t=1}^{c} N_t \end{cases}$$
in which $\alpha_{ij,k}^b$ is the $k$-th element of $\alpha_{ij}^b$, and $(W_b^i)_{jk}$ is the entry in the $j$-th row and $k$-th column of $W_b^i$.
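The case analysis in this definition simply scatters each coefficient vector back to full length, leaving the own-class block zero; a toy sketch (hypothetical class sizes and coefficients):

```python
import numpy as np

def interclass_weight_row(alpha, class_sizes, i):
    """Expand an interclass coefficient vector (length N - N_i) to a full
    row of length N, with zeros over the own-class block."""
    start = sum(class_sizes[:i])       # first column of class i
    stop = start + class_sizes[i]      # one past its last column
    row = np.zeros(sum(class_sizes))
    row[:start] = alpha[:start]        # classes before i
    row[stop:] = alpha[start:]         # classes after i (indices shift by N_i)
    return row

alpha = np.array([0.1, 0.2, 0.3])      # coefficients w.r.t. classes 0 and 2
row = interclass_weight_row(alpha, class_sizes=[2, 2, 1], i=1)
assert np.allclose(row, [0.1, 0.2, 0.0, 0.0, 0.3])
```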
Therefore, based on the intraclass collaborative graph $G^w(X, W_s)$, the intraclass compactness in the reduced subspace can be defined as
$$J_w = \sum_{i=1}^{N} \left\| V^T x_i - \sum_{j=1}^{N} W_{s,ij} V^T x_j \right\|^2 = \operatorname{tr}\left[ V^T \sum_{i=1}^{N} \left( x_i - \sum_{j=1}^{N} W_{s,ij} x_j \right) \left( x_i - \sum_{j=1}^{N} W_{s,ij} x_j \right)^T V \right] = \operatorname{tr}\left[ V^T X (I - W_s)(I - W_s)^T X^T V \right] = \operatorname{tr}\left( V^T X M_w X^T V \right)$$
where $M_w = I - W_s - W_s^T + W_s W_s^T$ and $I$ is the identity matrix of size $N \times N$.
Similarly, according to the interclass collaborative graph $G^b(X, W_b)$, the interclass separability in the reduced subspace can be designed as
$$J_b = \sum_{i=1}^{N} \left\| V^T x_i - \sum_{j=1}^{N} W_{b,ij} V^T x_j \right\|^2 = \operatorname{tr}\left[ V^T \sum_{i=1}^{N} \left( x_i - \sum_{j=1}^{N} W_{b,ij} x_j \right) \left( x_i - \sum_{j=1}^{N} W_{b,ij} x_j \right)^T V \right] = \operatorname{tr}\left[ V^T X (I - W_b)(I - W_b)^T X^T V \right] = \operatorname{tr}\left( V^T X M_b X^T V \right)$$
where $M_b = I - W_b - W_b^T + W_b W_b^T$.
To seek an optimal projection, it is natural to minimize the intraclass compactness and maximize the interclass separability simultaneously, that is,
$$\min_V \operatorname{tr}\left( V^T X M_w X^T V \right), \qquad \max_V \operatorname{tr}\left( V^T X M_b X^T V \right)$$
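The identity $M = (I - W)(I - W)^T = I - W - W^T + W W^T$ used in both derivations can be checked numerically (random stand-in weights, not real collaborative coefficients):

```python
import numpy as np

def scatter_from_weights(W):
    """M = (I - W)(I - W)^T, the matrix appearing in J_w and J_b."""
    IW = np.eye(W.shape[0]) - W
    return IW @ IW.T

rng = np.random.default_rng(3)
W_s = rng.random((12, 12)) * 0.1       # stand-in for collaborative weights
M_w = scatter_from_weights(W_s)

# the expansion used in the derivation: M = I - W - W^T + W W^T
assert np.allclose(M_w, np.eye(12) - W_s - W_s.T + W_s @ W_s.T)
```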

3.2. Local Neighborhood Graph Analysis Model

Because the spectral curves of HSI pixels are easily affected by the external environment and the imaging equipment, the actually acquired spectral curves of each category exhibit a certain degree of variation, which leads to degraded classification performance [63,64,65]. To eliminate this impact, we construct a local neighborhood graph by considering the spectral similarity of pixels; it explores the projection relationship from the high-dimensional space to a lower-dimensional space by aggregating the local graph structure and separating the total data.
In the local neighborhood graph $G_L$, each point $x_i$ is connected with its intraclass neighbor points, i.e., neighbors from the same class. Denote the neighbor set of $x_i$ by $C(x_i) = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, where $k$ is the number of neighbors. Then, the similarity weight $w_{ij}^w$ between $x_i$ and $x_j$ is given by
$$w_{ij}^w = \begin{cases} \exp\left( - \dfrac{\| x_i - x_j \|^2}{2 t_i^2} \right), & \text{if } \left( x_i \in C(x_j) \text{ or } x_j \in C(x_i) \right) \text{ and } l_i = l_j \\ 0, & \text{otherwise} \end{cases}$$
where $t_i = \frac{1}{k} \sum_{j=1}^{k} \| x_i - x_j \|$.
To enhance the aggregation of data on the local neighborhood structure, each point and its neighbor points are used to formulate the optimization problem in the low-dimensional embedding space:
$$J_1(V) = \arg\min \sum_{i=1}^{N} \sum_{j=1}^{k} \| y_i - y_j \|^2 w_{ij}^w = \sum_{i=1}^{N} \sum_{j=1}^{k} \| V^T x_i - V^T x_j \|^2 w_{ij}^w = \operatorname{tr}\left[ V^T \sum_{i=1}^{N} \sum_{j=1}^{k} (x_i - x_j)(x_i - x_j)^T w_{ij}^w \, V \right] = \operatorname{tr}\left( V^T S V \right)$$
in which $S = \sum_{i=1}^{N} \sum_{j=1}^{k} (x_i - x_j)(x_i - x_j)^T w_{ij}^w$ is the local neighborhood scatter matrix of the training set.
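A direct (unoptimized) sketch of the heat-kernel weights and the scatter matrix $S$, assuming labeled pixels stored as columns of `X` (synthetic data; sizes are arbitrary):

```python
import numpy as np

def local_neighborhood_scatter(X, labels, k):
    """Heat-kernel weights over intraclass k-NN and the resulting scatter
    S = sum_ij w_ij (x_i - x_j)(x_i - x_j)^T."""
    D, N = X.shape
    S = np.zeros((D, D))
    for i in range(N):
        same = np.where(labels == labels[i])[0]
        same = same[same != i]                              # intraclass candidates
        dists = np.linalg.norm(X[:, same] - X[:, [i]], axis=0)
        order = np.argsort(dists)[:k]
        t_i = dists[order].mean()                           # mean k-NN distance
        for j, dist in zip(same[order], dists[order]):
            diff = X[:, i] - X[:, j]
            w = np.exp(-dist ** 2 / (2 * t_i ** 2))
            S += w * np.outer(diff, diff)
    return S

rng = np.random.default_rng(4)
X = rng.standard_normal((10, 20))       # 20 pixels, 10 bands
labels = np.repeat([0, 1], 10)
S = local_neighborhood_scatter(X, labels, k=2)
assert np.allclose(S, S.T)              # S is symmetric positive semidefinite
```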
In addition, to separate the local graph structures of different pixels as far as possible, the optimization problem between different classes can be designed as follows:
$$J_2(V) = \arg\max \sum_{i=1}^{N} c_i \| V^T x_i - V^T \bar{x} \|^2 = \operatorname{tr}\left[ V^T \sum_{i=1}^{N} c_i (x_i - \bar{x})(x_i - \bar{x})^T V \right] = \operatorname{tr}\left( V^T H V \right)$$
where $\bar{x}$ is the mean of the training set, $c_i = \exp\left( - \| x_i - \bar{x} \|^2 / (2 t_i^2) \right)$ is the weight between $x_i$ and $\bar{x}$, and $H = \sum_{i=1}^{N} c_i (x_i - \bar{x})(x_i - \bar{x})^T$ is the total scatter matrix.
To explore the local neighborhood structure in the low-dimensional embedding space, we minimize the local neighborhood scatter and maximize the total scatter. Therefore, the optimal projection matrix $V$ should satisfy the following two optimization criteria:
$$\arg\min_V \operatorname{tr}\left( V^T S V \right), \qquad \arg\max_V \operatorname{tr}\left( V^T H V \right)$$
For high-dimensional data, the collaborative relationship and the local neighborhood structure between pixel pairs should be discovered simultaneously. Therefore, based on the optimization problems in (22) and (26), we propose the local constrained manifold structure collaborative preserving embedding (LMSCPE) method for DR of HSI data. This method reveals the collaborative relationship and the local neighborhood structure to learn a more effective projection, and the optimal objective function of LMSCPE can be designed as
$$J(V) = \min_V \frac{\operatorname{tr}\left[ V^T \left( a X M_w X^T + (1-a) S \right) V \right]}{\operatorname{tr}\left[ V^T \left( a X M_b X^T + (1-a) H \right) V \right]}$$
where $a \in [0, 1]$ is a trade-off parameter that balances the contributions of the collaborative graph and the local neighborhood graph in the embedding space. Then, the optimization problem in (27) can be transformed into:
$$\min_V V^T \left( a X M_w X^T + (1-a) S \right) V \quad \text{s.t.} \quad V^T \left( a X M_b X^T + (1-a) H \right) V = B$$
where $B$ is a constant matrix.
With the Lagrangian multiplier method, the solution of (28) can be obtained through the following generalized eigenvalue problem:
$$\left[ a X M_w X^T + (1-a) S \right] v_i = \lambda_i \left[ a X M_b X^T + (1-a) H \right] v_i$$
in which $\lambda_i$ is the $i$-th eigenvalue and $v_i$ is the corresponding eigenvector. The eigenvectors $v_1, v_2, \ldots, v_d$ corresponding to the first $d$ eigenvalues form the optimal projection matrix $V = [v_1, v_2, \ldots, v_d] \in \mathbb{R}^{D \times d}$. Then, the low-dimensional embedding is obtained as $Y = V^T X$.
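A minimal sketch of this final step, assuming the two bracketed matrices have already been assembled (here replaced by random symmetric positive-definite stand-ins); since the objective is a minimization, the eigenvectors with the smallest generalized eigenvalues are kept:

```python
import numpy as np

def lmscpe_projection(Sw, Sb, d):
    """Solve Sw v = lambda Sb v via Sb^{-1} Sw and keep the d eigenvectors
    with the smallest eigenvalues (the objective is a minimization)."""
    vals, vecs = np.linalg.eig(np.linalg.solve(Sb, Sw))
    order = np.argsort(vals.real)
    return vecs[:, order[:d]].real

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 8))
Sw = A @ A.T + 0.1 * np.eye(8)          # stand-in for a X M_w X^T + (1-a) S
B = rng.standard_normal((8, 8))
Sb = B @ B.T + 0.1 * np.eye(8)          # stand-in for a X M_b X^T + (1-a) H
V = lmscpe_projection(Sw, Sb, d=3)
assert V.shape == (8, 3)
```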

4. Experimental Setup and Parameters Discussion

In this section, three public HSI data sets are adopted to demonstrate the effectiveness of LMSCPE by comparing it with some state-of-the-art DR algorithms.

4.1. Data Set Description

PaviaU data set: This data set was acquired by the ROSIS-03 sensor over the University of Pavia, Italy. The spatial size of the scene is 610 × 340 pixels with a spatial resolution of 1.3 m/pixel, and each pixel contains 115 spectral bands ranging from 0.43 to 0.86 μm. Because 12 bands suffer from water absorption, the remaining 103 bands were adopted for the experiments. Figure 2 shows the false color image, ground truth (GT), and spectral curves of the PaviaU data set.
LongKou data set: This data set was captured by Headwall Nano-Hyperspec imaging sensor on a UAV platform in Longkou Town, Hubei province, China. The full scene contains 550 × 400 pixels, each pixel contains 270 spectral bands ranging from 400 to 1000 nm. The data set possesses 9 different land cover types, and the spatial resolution is about 0.463 m. Figure 3 shows the false color image, ground truth (GT) and spectral curves of LongKou data set.
MUUFL data set: This data set was captured by the ITRES CASI-1500 sensor over the University of Southern Mississippi Gulf Park Campus. The full scene contains 325 × 220 pixels with a spatial resolution of 1 m, and the data set has 72 spectral bands. After removing 8 bands affected by noise, we adopted the remaining 64 spectral bands for classification. Figure 4 shows the false color image, ground truth (GT), and spectral curves of the MUUFL data set.

4.2. Experimental Setup

In this section, the HSI data were randomly divided into training and test sets. The training set was used to learn a DR model, while the test set was used to verify the validity of the model. We then employed the nearest neighbor (NN) classifier for classification. Four performance indicators were used to evaluate the effectiveness of the algorithms: the classification accuracy of each class (CA), the overall accuracy (OA), the average accuracy (AA), and the kappa coefficient (KC) [66,67,68]. For robustness, all experiments were repeated 10 times under the same conditions.
In the experiments, we compared LMSCPE with several state-of-the-art DR algorithms, including principal component analysis (PCA), locality preserving projection (LPP), linear discriminant analysis (LDA), local Fisher discriminant analysis (LFDA), local geometric structure Fisher analysis (LGSFA), sparse graph based discriminant analysis (SGDA), collaborative graph-based discriminant analysis (CGDA), and low-rank graph-based discriminant analysis (LGDA). Among them, PCA and LPP are unsupervised DR algorithms that only consider the spectral similarity between different pixels. The latter six algorithms are supervised DR algorithms, which simultaneously consider the spectral similarity and category information of pixels. Besides, the RAW method was added for comparison with the DR algorithms; it indicates that the test set was classified by the NN classifier directly, without any DR.
To demonstrate the effectiveness of the proposed LMSCPE algorithm, the parameters of all the DR models were optimized to achieve higher classification accuracies. For LPP and LFDA, the neighbors number was set to 7. For LGSFA, the numbers of intraclass neighbor and interclass neighbor were set as k 1 = 7 and k 2 = 7, respectively.
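The four evaluation indicators above can be sketched as follows (a toy label vector, not results from the paper); OA is the global hit rate, AA averages the per-class accuracies, and kappa corrects OA for chance agreement:

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Overall accuracy (OA), average accuracy (AA), and kappa coefficient (KC)."""
    classes = np.unique(y_true)
    ca = np.array([np.mean(y_pred[y_true == c] == c) for c in classes])  # per-class CA
    oa = np.mean(y_pred == y_true)
    aa = ca.mean()
    pe = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    kappa = (oa - pe) / (1 - pe)            # agreement corrected for chance
    return oa, aa, kappa

y_true = np.array([0, 0, 0, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 0])
oa, aa, kappa = classification_metrics(y_true, y_pred)
print(round(oa, 3), round(aa, 3), round(kappa, 3))   # 0.667 0.667 0.333
```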

4.3. Analysis of Neighbors Number k

To analyze the influence of different neighbor number k on classification performance, we randomly selected 50 labeled samples from each class to form the training set, and the remaining samples construct the testing set. The corresponding experimental results on these three different HSI data sets are given in Figure 5, in which parameter k varies from 1 to 15 with an interval of 1.
As shown in Figure 5, parameter k exhibits similar trends on the three HSI data sets: as the neighbor number k increases, the OAs first improve and then decrease. The reason is that a larger set of neighbor points carries richer spectral information, which can be used to construct a more effective graph learning model with more powerful discrimination capability for complex pixels in HSIs. However, when the neighbor number k becomes too large, it introduces too much useless information into the adjacency graph construction, which degrades the final classification performance; the classification accuracies of LMSCPE then remain stable or even decrease slightly. Besides, a larger spectral neighbor set leads to higher computational cost, requiring more computational resources and storage capacity. To balance running time and classification performance, we set k = 5 for the PaviaU data set and k = 7 for both the LongKou and MUUFL data sets in the following experiments.

4.4. Analysis of Regularization Parameters γ and δ

To evaluate the classification performance of the proposed LMSCPE method under different regularization parameters of γ and δ , 50 labeled samples per class were selected as the training set, and the remaining samples were used as the testing set. Parameters γ and δ were tuned with a set of { 0 , 10 , 20 , , 110 , 120 } and a set of { 0 , 1 , 2 , , 9 , 10 } , respectively. Figure 6 shows the classification results with different regularization parameters γ and δ .
From Figure 6, we can see that parameter γ has a strong influence on the classification performance of LMSCPE, which indicates that Tikhonov regularization helps the CR model obtain more appropriate collaborative coefficients for constructing adjacency graphs. That is, the Tikhonov matrix is an effective distance-based measure between sample pairs that adjusts the collaborative coefficients of each pixel adaptively. When two pixels in the HSI have similar spectra, the distance between them is small, and the Tikhonov regularization increases the corresponding weight coefficient, thereby enhancing the effectiveness of the adjacency graph structure and improving the discrimination ability of the DR model. Besides, increasing δ leads to a subtle improvement in classification accuracy, because it explores the local neighborhood structure and enhances the aggregation of HSI data. To achieve the best classification accuracies, we set γ and δ to 60 and 4 for the PaviaU data set, and to 70 and 3 for the LongKou and MUUFL data sets in the experiments.

4.5. Analysis of Trade-Off Parameter a

In the experiments, the trade-off parameter a will also affect the classification performance of LMSCPE, it is mainly employed to balance the contributions between the collaborative graph and local neighborhood graph. To investigate the classification performance with different trade-off parameter a, we randomly selected 50 labeled samples per class for training, and the remaining were for testing. To obtain the optimal classification performance, parameter a was chosen within a set of {0, 0.1, 0.2, 0.3, …, 0.9, 1}. Figure 7 illustrates the classification results of LMSCPE with different trade-off parameter a.
According to Figure 7, the OAs first improve as a increases and then slightly decrease after the peak value, which indicates that both the collaborative graph and the local neighborhood graph contribute to enhancing the classification performance. To achieve the best classification performance, we set a to 0.7 for the PaviaU and MUUFL data sets and 0.8 for the LongKou data set in the following experiments.
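The exhaustive search over the candidate set of a can be sketched as below; `evaluate_oa` is a hypothetical black-box callable (not from the paper) that trains LMSCPE with a given a and returns the resulting overall accuracy.

```python
import numpy as np

def select_tradeoff(evaluate_oa, grid=None):
    """Exhaustively search the trade-off parameter a over a grid and keep
    the value with the highest overall accuracy, mirroring the protocol in
    the text. `evaluate_oa` is an assumed, dataset-specific black box."""
    if grid is None:
        grid = np.round(np.arange(0.0, 1.01, 0.1), 1)  # {0, 0.1, ..., 1}
    scores = {a: evaluate_oa(a) for a in grid}
    best = max(scores, key=scores.get)  # the a with the highest OA
    return best, scores
```

The same one-dimensional sweep applies to γ and δ, only with the larger candidate sets given in Section 4.4 (in that case a two-dimensional grid).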

4.6. Investigation of Embedding Dimension d

For the above-mentioned DR methods, the value of the embedding dimension d determines the effectiveness of the low-dimensional embedding features and has an important impact on the final classification performance. Therefore, we analyzed the change of OAs with respect to different embedding dimensions d in Figure 8. In the experiments, 50 labeled samples from each land cover type were randomly selected as training samples, and the remaining samples were used as test samples.
From Figure 8, we can draw similar conclusions on the PaviaU, LongKou, and MUUFL data sets. As d varies from 1 to 40, the OAs of each DR method first increase and then remain stable, because a larger embedding dimension d preserves richer discriminant features in the low-dimensional embedding space, which enhances the classification performance, especially when few labeled training samples are available in complex scenes. However, when the number of training samples is fixed, the valuable information contained in the samples is limited; therefore, increasing d does not lead to a continuous increase of the OAs, which remain stable after reaching a peak value. In addition, most DR methods outperform RAW in terms of OA, which indicates that these methods can effectively extract the discriminant features of HSI data and thereby enhance the classification performance. Considering that all the DR methods reach the peak value when d is larger than 20, we set d = 30 for all DR methods except LDA in the experiments; the dimension of LDA is c − 1, where c is the number of classes in the HSI data set.
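Once a projection matrix has been learned, varying d simply means keeping more or fewer of its columns. A minimal sketch, assuming the columns of the projection matrix V are already sorted by discriminative importance:

```python
import numpy as np

def embed(X, V, d=30):
    """Project D-band pixels X (D x N) onto the first d projection
    directions in V (D x D); larger d preserves richer features until
    the accuracy curve saturates."""
    return V[:, :d].T @ X  # d x N low-dimensional features
```

Sweeping d from 1 to 40 and classifying each embedding reproduces the accuracy-versus-dimension curves of Figure 8.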

5. Experimental Results and Discussion

5.1. Analysis of Training Sample Size

To analyze the influence of training sample size on the classification performance, n_i (n_i = 20, 30, 40, 50, 60) samples were randomly selected from each land cover type as the training set, and the remaining samples were used as the test set. After that, the NN classifier was employed to predict the labels of the test set, and each experiment was repeated ten times to reduce random errors. Table 1, Table 2 and Table 3 present the quantitative results with different training sample sizes on the PaviaU, LongKou, and MUUFL data sets.
As shown in Table 1, Table 2 and Table 3, the OAs of each algorithm improve significantly as the number of training samples increases. The reason is that a larger number of training samples provides more abundant feature information, which helps distinguish land covers with similar spectra in complex scenes. In addition, the supervised DR methods LDA, LFDA, LGSFA, SGDA, LGDA, CGDA, and LMSCPE perform better than the unsupervised ones in most cases, which indicates that the category information of land covers contributes to improving the classification performance of the DR models. Among the supervised DR algorithms, the representation-based methods SGDA, LGDA, CGDA, and LMSCPE are superior to the others, because their sparse or collaborative coefficients effectively characterize the sparse or collaborative relationship between sample pairs and can be utilized as the weight coefficients of adjacency graphs; this gives them a clear advantage over traditional DR methods whose graph structure depends on pairwise Euclidean distance. Among all the above-mentioned DR methods, LMSCPE achieves the best OA and KC in all cases, because it simultaneously considers the collaborative relationship and the local neighborhood structure of HSI, which further improves the classification accuracy.
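The evaluation protocol described above (per-class random sampling, NN classification, averaging over repeated runs) can be sketched as follows; this is a plain 1-NN on the given features, so in the paper's setting X would be the low-dimensional embedding produced by each DR method.

```python
import numpy as np

def mean_overall_accuracy(X, y, n_train=50, repeats=10, seed=0):
    """Sketch of the evaluation protocol: sample n_train pixels per class
    for training, classify the remaining pixels with a 1-NN rule, and
    average the overall accuracy (OA) over several random runs to reduce
    random error. X is (N, D) features, y is (N,) integer labels."""
    rng = np.random.default_rng(seed)
    oas = []
    for _ in range(repeats):
        train = np.concatenate([
            rng.choice(np.flatnonzero(y == c), size=n_train, replace=False)
            for c in np.unique(y)
        ])
        test = np.setdiff1d(np.arange(len(y)), train)
        # 1-NN: squared Euclidean distance from each test pixel to every
        # training pixel, then take the label of the nearest one
        d2 = ((X[test][:, None, :] - X[train][None, :, :]) ** 2).sum(axis=2)
        pred = y[train][np.argmin(d2, axis=1)]
        oas.append(np.mean(pred == y[test]))
    return float(np.mean(oas))
```

Reporting the standard deviation of the per-run OAs alongside the mean yields the "OA ± STD" entries of Tables 1-3.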

5.2. Analysis of Classification Results

Considering that the land cover types of HSI suffer from sample imbalance in practical scenes, we constructed the training set by selecting a certain proportion of training samples from each class, and the remaining samples were adopted as the test set. In the experiments, we set the proportion to 1% for the PaviaU data set, 0.2% for the LongKou data set, and 1% for the MUUFL data set. For convenience of comparison, we list the CAs, OAs, AAs, and KCs of the different algorithms in Table 4, Table 5 and Table 6, respectively, and Figure 9, Figure 10 and Figure 11 present the corresponding classification maps for the whole scene on the three HSI data sets.
From Table 4, Table 5 and Table 6, we can see that the proposed LMSCPE method achieves the best OAs, AAs, and KCs on the three HSI data sets, and it obtains higher classification accuracies than the other algorithms in most classes. Besides, the classification maps of LMSCPE are smoother and show less salt-and-pepper noise in homogeneous regions, especially in the classes of Asphalt, Gravel, and Soil for the PaviaU data set; Cotton, Sesame, and Mixed weed for the LongKou data set; and Trees, Mixed ground surface, and Building shadow for the MUUFL data set. The reason is that LMSCPE not only explores the pixel-wise collaborative relationship of HSI data but also reveals the local neighborhood structure, so it has a stronger ability to extract the discriminant features of HSI data and enhance the classification performance.

5.3. Analysis of Computational Efficiency

For the classification task, both classification accuracy and running time are important evaluation indicators. To analyze the computational complexity of LMSCPE, denote the neighbor number of each sample as k. Constructing the intraclass weight matrix W_s and the interclass weight matrix W_b each takes O(N^2). The calculations of XM_wX^T and XM_bX^T are both O(DN^2). The local neighborhood scatter matrix S and the total scatter matrix H are computed with O(Nk^2) and O(N), respectively. It takes O(D^3) to solve the generalized eigenvalue problem in (29). Therefore, the overall computational complexity of LMSCPE is O(N^2 + Nk^2 + DN^2 + D^3), which mainly depends on the number of training samples, the number of bands, and the neighbor number.
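The O(D^3) generalized eigenvalue step can be sketched with plain NumPy as below. This is an illustrative whitening-based solver, assuming the right-hand matrix B is symmetric positive definite; in (29) the two matrices would be the D x D scatter matrices discussed above.

```python
import numpy as np

def solve_generalized_eig(A, B, d=30):
    """Solve A v = lambda B v for the top-d eigenpairs (the O(D^3) step of
    the DR objective). A and B are symmetric D x D matrices, with B assumed
    positive definite so the whitening below is well defined."""
    # Whiten with B^{-1/2}, turning the problem into an ordinary symmetric one
    w, U = np.linalg.eigh(B)
    B_inv_sqrt = U @ np.diag(1.0 / np.sqrt(w)) @ U.T
    evals, evecs = np.linalg.eigh(B_inv_sqrt @ A @ B_inv_sqrt)
    order = np.argsort(evals)[::-1]          # largest eigenvalues first
    V = B_inv_sqrt @ evecs[:, order[:d]]     # map back to the original space
    return evals[order[:d]], V
```

Each of `eigh` and the matrix products is cubic in D, which is where the O(D^3) term in the total complexity comes from.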
To quantitatively evaluate the efficiency of the above-mentioned DR methods, we report the running time of all DR algorithms in Table 7, in which 50 labeled samples were randomly selected from each land cover type for training, and the remaining samples were used for testing. All the experimental results were obtained on a personal computer with an i7-7800X CPU and 32 GB of memory, running 64-bit Windows 10 and MATLAB 2016b.
As shown in Table 7, LMSCPE costs more time than most of the DR methods, because it simultaneously considers the pixel-wise collaborative relationship and the local neighborhood structure of HSI data, which slightly increases the computational complexity. However, this slight increase in running time is acceptable relative to the improvement in classification performance.

6. Conclusions

Hyperspectral images (HSIs) contain abundant spectral information that can accurately distinguish the subtle differences between pixels. However, the high dimensionality of HSIs poses a huge challenge for land cover classification. Traditional graph learning algorithms cannot effectively characterize the collaborative relationship between sample pairs, which leads to degraded classification performance for HSI. In this paper, we designed a supervised spectral DR method termed local constrained manifold structure collaborative preserving embedding (LMSCPE) for HSI classification. LMSCPE first adopts a novel local constrained CR model to obtain more effective collaborative coefficients between sample pairs. Then, two collaborative graphs are constructed to enhance the intraclass compactness and the interclass separability, and a local neighborhood graph is constructed to explore the local neighborhood structure of HSI. After that, an optimal objective function is designed to obtain a discriminant projection matrix, from which the embedding features of different land cover types can be obtained. Experiments on the PaviaU, LongKou, and MUUFL hyperspectral data sets demonstrate that LMSCPE is superior to several state-of-the-art methods. However, spectral-based DR methods consider only the spectral information in HSI, which limits the final classification performance. Therefore, in future work, we will consider fusing spatial information into the DR model to further improve the classification accuracy of HSI.

Author Contributions

All the authors made significant contributions to the manuscript. G.S. designed the DR algorithm, completed the corresponding experiments, and finished the first manuscript draft. F.L. analyzed the results and performed the validation work. Y.T. and Y.L. edited the first manuscript and finalized the manuscript for communication. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61801336 and Grant 62071340, in part by the Fundamental Research Funds for the Central Universities under Grant 2042020kf0013, and in part by the China Postdoctoral Science Foundation under Grant 2019M662717 and Grant 2020T130480.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank the anonymous reviewers and associate editor for their valuable comments and suggestions to improve the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. The flowchart of the proposed local constrained manifold structure collaborative preserving embedding (LMSCPE) method.
Figure 2. PaviaU hyperspectral image. (a) False color image. (b) Ground truth. (c) Spectral curves. (Note that the number of samples for each class is shown in brackets).
Figure 3. LongKou hyperspectral image. (a) False color image. (b) Ground truth. (c) Spectral curves. (Note that the number of samples for each class is shown in brackets).
Figure 4. MUUFL hyperspectral image. (a) False color image. (b) Ground truth. (c) Spectral curves. (Note that the number of samples for each class is shown in brackets).
Figure 5. Classification results with different neighbor numbers k on the three hyperspectral image (HSI) data sets.
Figure 6. Classification results with different regularization parameters γ and δ . (a) PaviaU. (b) LongKou. (c) MUUFL.
Figure 7. Classification results of LMSCPE with different trade-off parameters a on the three HSI data sets.
Figure 8. Classification results with different embedding dimensions d. (a) PaviaU. (b) LongKou. (c) MUUFL.
Figure 9. Classification maps for the whole scene on the PaviaU data set. (a) False color image; (b) ground truth; (c) RAW; (d) PCA; (e) LPP; (f) LDA; (g) LFDA; (h) LGSFA; (i) SGDA; (j) LGDA; (k) CGDA; (l) LMSCPE.
Figure 10. Classification maps for the whole scene on the LongKou data set. (a) False color image; (b) ground truth; (c) RAW; (d) PCA; (e) LPP; (f) LDA; (g) LFDA; (h) LGSFA; (i) SGDA; (j) LGDA; (k) CGDA; (l) LMSCPE.
Figure 11. Classification maps for the whole scene on the MUUFL data set. (a) False color image; (b) ground truth; (c) RAW; (d) PCA; (e) LPP; (f) LDA; (g) LFDA; (h) LGSFA; (i) SGDA; (j) LGDA; (k) CGDA; (l) LMSCPE.
Table 1. Classification results of each method with different training sample sizes on the PaviaU dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method | 20 | 30 | 40 | 50 | 60
RAW | 68.77 ± 1.74 (0.607) | 70.48 ± 1.93 (0.628) | 71.21 ± 2.04 (0.637) | 72.79 ± 1.03 (0.655) | 73.29 ± 0.76 (0.661)
PCA | 68.75 ± 1.73 (0.607) | 70.44 ± 1.94 (0.628) | 71.23 ± 1.97 (0.637) | 72.77 ± 1.01 (0.655) | 73.28 ± 0.79 (0.661)
LPP | 66.12 ± 0.91 (0.578) | 70.91 ± 2.90 (0.634) | 72.56 ± 1.48 (0.654) | 74.58 ± 0.89 (0.677) | 76.13 ± 1.05 (0.695)
LDA | 67.28 ± 4.54 (0.589) | 72.53 ± 2.21 (0.652) | 75.38 ± 0.85 (0.683) | 77.36 ± 1.10 (0.709) | 77.57 ± 1.56 (0.711)
LFDA | 61.88 ± 2.44 (0.525) | 69.95 ± 2.84 (0.622) | 74.03 ± 1.80 (0.670) | 75.39 ± 1.02 (0.687) | 77.42 ± 1.89 (0.711)
LGSFA | 65.08 ± 1.96 (0.560) | 71.02 ± 1.71 (0.634) | 73.01 ± 1.51 (0.657) | 75.18 ± 3.02 (0.683) | 75.60 ± 0.98 (0.686)
SGDA | 73.04 ± 1.19 (0.659) | 75.62 ± 1.77 (0.690) | 76.13 ± 2.33 (0.696) | 78.43 ± 1.29 (0.723) | 79.06 ± 1.15 (0.730)
LGDA | 71.55 ± 1.90 (0.642) | 73.85 ± 1.60 (0.669) | 75.30 ± 2.44 (0.682) | 75.65 ± 0.93 (0.690) | 77.25 ± 0.81 (0.709)
CGDA | 72.22 ± 2.60 (0.651) | 75.20 ± 1.03 (0.686) | 76.03 ± 2.62 (0.696) | 77.20 ± 1.44 (0.709) | 78.95 ± 1.47 (0.730)
LMSCPE | 76.81 ± 1.55 (0.705) | 78.11 ± 1.53 (0.722) | 78.91 ± 2.50 (0.732) | 79.89 ± 0.64 (0.742) | 81.62 ± 2.19 (0.763)
Table 2. Classification results of each method with different training sample sizes on the LongKou dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method | 20 | 30 | 40 | 50 | 60
RAW | 79.39 ± 1.52 (0.740) | 81.39 ± 0.68 (0.765) | 82.20 ± 0.84 (0.775) | 82.49 ± 0.78 (0.778) | 83.53 ± 0.89 (0.791)
PCA | 79.38 ± 1.53 (0.740) | 81.38 ± 0.68 (0.764) | 82.18 ± 0.84 (0.774) | 82.48 ± 0.80 (0.778) | 83.52 ± 0.91 (0.791)
LPP | 66.70 ± 1.81 (0.592) | 73.26 ± 0.39 (0.668) | 76.66 ± 1.38 (0.708) | 80.06 ± 1.69 (0.749) | 82.45 ± 0.67 (0.777)
LDA | 83.94 ± 1.93 (0.795) | 85.60 ± 0.74 (0.816) | 87.73 ± 0.68 (0.843) | 89.74 ± 1.06 (0.868) | 91.16 ± 0.63 (0.886)
LFDA | 77.76 ± 2.43 (0.719) | 83.43 ± 2.33 (0.789) | 83.02 ± 0.55 (0.784) | 89.47 ± 0.57 (0.865) | 91.68 ± 0.47 (0.893)
LGSFA | 83.46 ± 0.66 (0.789) | 83.54 ± 2.16 (0.790) | 85.78 ± 1.13 (0.818) | 90.12 ± 0.36 (0.873) | 91.79 ± 0.40 (0.894)
SGDA | 87.47 ± 2.04 (0.839) | 89.05 ± 0.65 (0.859) | 89.93 ± 0.91 (0.870) | 90.59 ± 0.65 (0.879) | 91.62 ± 0.44 (0.892)
LGDA | 84.73 ± 2.62 (0.805) | 85.02 ± 0.62 (0.809) | 85.50 ± 1.23 (0.815) | 85.79 ± 0.50 (0.819) | 86.77 ± 0.53 (0.831)
CGDA | 88.03 ± 2.82 (0.847) | 89.61 ± 0.94 (0.866) | 89.92 ± 1.01 (0.870) | 90.17 ± 0.72 (0.874) | 90.99 ± 0.69 (0.884)
LMSCPE | 90.74 ± 1.31 (0.881) | 91.28 ± 0.98 (0.887) | 91.57 ± 0.73 (0.891) | 91.85 ± 0.82 (0.895) | 92.06 ± 0.94 (0.897)
Table 3. Classification results of each method with different training sample sizes on the MUUFL dataset [OA ± STD (%)(KC)]. (OA—overall classification accuracy, STD—standard deviation, KC—kappa coefficient, and the best results of a column are marked in bold).
Method2030405060
RAW68.72 ± 3.60 (0.610)70.55 ± 1.02 (0.632)71.41 ± 1.19 (0.641)72.58 ± 1.36 (0.655)72.70 ± 0.40 (0.656)
PCA68.71 ± 3.58 (0.610)70.55 ± 1.02 (0.632)71.42 ± 1.17 (0.642)72.57 ± 1.32 (0.654)72.74 ± 0.40 (0.657)
LPP63.08 ± 4.46 (0.545)67.32 ± 1.70 (0.594)68.86 ± 2.25 (0.612)70.90 ± 1.29 (0.635)72.16 ± 1.01 (0.649)
LDA62.23 ± 4.26 (0.532)65.65 ± 1.70 (0.570)65.76 ± 2.23 (0.571)66.89 ± 0.77 (0.583)67.04 ± 1.18 (0.586)
LFDA62.65 ± 4.03 (0.537)65.68 ± 2.88 (0.574)68.45 ± 2.09 (0.606)70.89 ± 1.25 (0.636)71.56 ± 0.74 (0.643)
LGSFA63.04 ± 4.68 (0.544)67.32 ± 1.38 (0.593)68.24 ± 2.50 (0.605)69.97 ± 2.23 (0.625)70.42 ± 0.73 (0.630)
SGDA70.03 ± 3.78 (0.625)72.61 ± 1.10 (0.656)73.32 ± 0.95 (0.664)73.87 ± 1.36 (0.671)74.89 ± 0.42 (0.682)
LGDA71.33 ± 3.75 (0.640)72.71 ± 0.72 (0.658)73.54 ± 0.86 (0.667)74.11 ± 0.20 (0.673)74.34 ± 1.19 (0.676)
CGDA71.12 ± 3.52 (0.638)72.40 ± 0.98 (0.654)73.04 ± 1.08 (0.661)74.62 ± 1.33 (0.679)74.71 ± 0.44 (0.680)
LMSCPE72.43 ± 4.31 (0.654)73.96 ± 1.62 (0.672)74.54 ± 1.83 (0.679)76.46 ± 1.39 (0.702)76.57 ± 0.93 (0.708)
Table 4. Classification results (%) of each type of land cover (t = 1%) with the nearest neighbor (NN) classifier on the PaviaU data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient; the best results in each row are marked in bold.)

| Class | Land Covers | Training | Test | RAW | PCA | LPP | LDA | LFDA | LGSFA | SGDA | LGDA | CGDA | LMSCPE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Asphalt | 66 | 6565 | 84.54 | 84.42 | 84.74 | 84.49 | 69.72 | 84.04 | 85.35 | 84.78 | 88.26 | **91.36** |
| 2 | Meadows | 186 | 18,463 | 89.87 | 89.84 | 89.43 | 88.93 | 77.56 | **96.91** | 89.60 | 90.88 | 92.83 | 95.54 |
| 3 | Gravel | 21 | 2078 | 49.09 | 48.89 | 41.29 | 38.69 | 48.85 | 31.57 | 52.17 | 56.59 | 60.59 | **64.87** |
| 4 | Trees | 31 | 3033 | 75.70 | 75.77 | 79.59 | 86.25 | **92.65** | 88.16 | 75.37 | 78.37 | 78.57 | 85.23 |
| 5 | Metal | 13 | 1332 | 98.57 | 98.57 | 99.17 | 99.62 | 99.02 | **99.70** | 99.10 | 98.57 | 99.32 | 99.55 |
| 6 | Soil | 50 | 4979 | 54.25 | 54.35 | 53.99 | 55.79 | 66.98 | 38.24 | 56.92 | 61.54 | 66.94 | **69.65** |
| 7 | Bitumen | 13 | 1317 | 64.54 | 64.77 | 51.86 | 20.12 | 51.94 | 32.73 | 65.60 | 63.86 | **72.74** | 63.33 |
| 8 | Bricks | 37 | 3645 | 72.98 | 72.62 | 65.24 | 62.09 | 64.20 | 73.39 | 73.77 | 73.20 | **78.38** | 76.27 |
| 9 | Shadows | 10 | 937 | **99.89** | **99.89** | 99.79 | 88.26 | **99.89** | 98.40 | **99.89** | 99.15 | 99.36 | 99.79 |
|  | AA |  |  | 76.60 | 76.57 | 73.90 | 69.36 | 74.53 | 71.46 | 77.53 | 78.55 | 81.89 | **82.84** |
|  | OA |  |  | 80.09 | 80.05 | 78.76 | 77.56 | 73.99 | 80.28 | 80.66 | 81.97 | 84.95 | **87.16** |
|  | KC |  |  | 73.29 | 73.23 | 71.55 | 70.02 | 66.59 | 72.92 | 74.10 | 75.87 | 79.84 | **82.76** |
Table 5. Classification results (%) of each type of land cover (t = 0.2%) with the NN classifier on the LongKou data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient; the best results in each row are marked in bold.)

| Class | Land Covers | Training | Test | RAW | PCA | LPP | LDA | LFDA | LGSFA | SGDA | LGDA | CGDA | LMSCPE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Corn | 69 | 34,442 | 93.36 | 93.30 | 82.20 | 93.03 | 94.56 | 96.90 | 95.46 | 96.85 | 97.77 | **98.81** |
| 2 | Cotton | 17 | 8357 | 47.24 | 47.24 | 34.79 | 47.79 | 52.02 | 56.19 | 53.05 | 45.14 | 54.83 | **66.02** |
| 3 | Sesame | 10 | 3021 | 37.80 | 37.54 | 15.33 | 33.40 | 32.54 | 65.18 | 40.19 | 56.04 | 64.71 | **68.42** |
| 4 | Broad-leaf soybean | 126 | 63,086 | 87.33 | 87.32 | 80.98 | 88.89 | 90.57 | 95.36 | 89.74 | 90.09 | 92.22 | **96.67** |
| 5 | Narrow-leaf soybean | 10 | 4141 | 53.39 | 53.34 | 27.58 | 33.30 | 35.69 | 57.38 | 54.94 | 74.50 | 76.33 | **76.45** |
| 6 | Rice | 24 | 11,830 | 90.51 | 90.45 | 78.88 | 98.08 | 97.98 | 95.58 | 91.62 | 96.13 | **99.82** | 99.80 |
| 7 | Water | 134 | 66,922 | 99.93 | 99.93 | 99.95 | 99.92 | 99.91 | **99.98** | 99.93 | 99.94 | 99.91 | 99.91 |
| 8 | Roads and houses | 14 | 7110 | 75.09 | 75.08 | 60.03 | 62.57 | 67.33 | 79.54 | 76.17 | 83.36 | **85.54** | 85.23 |
| 9 | Mixed weed | 10 | 5219 | 28.68 | 28.66 | 29.70 | 60.80 | 57.14 | 58.59 | 34.68 | 50.89 | 59.11 | **65.97** |
|  | AA |  |  | 68.15 | 68.10 | 56.60 | 68.64 | 69.75 | 78.30 | 70.64 | 76.99 | 81.14 | **84.14** |
|  | OA |  |  | 87.67 | 87.65 | 81.30 | 88.47 | 89.52 | 92.84 | 89.33 | 90.91 | 92.78 | **95.01** |
|  | KC |  |  | 83.80 | 83.77 | 75.47 | 84.78 | 86.13 | 90.52 | 85.96 | 88.03 | 90.49 | **93.42** |
Table 6. Classification results (%) of each type of land cover (t = 1%) with the NN classifier on the MUUFL data set. (AA—average classification accuracy, OA—overall classification accuracy, KC—kappa coefficient; the best results in each row are marked in bold.)

| Class | Land Covers | Training | Test | RAW | PCA | LPP | LDA | LFDA | LGSFA | SGDA | LGDA | CGDA | LMSCPE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Trees | 465 | 22,781 | 88.98 | 89.02 | 89.41 | 90.06 | 81.71 | 89.83 | 89.06 | 89.65 | 89.81 | **90.95** |
| 2 | Mostly grass | 85 | 4185 | 67.78 | 67.68 | 65.65 | 60.82 | 67.64 | 65.27 | 67.14 | **70.95** | 69.43 | 70.43 |
| 3 | Mixed ground surface | 138 | 6744 | 62.64 | 62.70 | 65.14 | 69.03 | 53.00 | 64.60 | 63.09 | 65.49 | 66.84 | **71.64** |
| 4 | Dirt/sand | 37 | 1789 | 58.13 | 58.08 | 64.88 | 62.39 | **80.53** | 72.84 | 57.52 | 62.94 | 67.87 | 71.24 |
| 5 | Road | 134 | 6553 | 86.60 | 86.66 | 83.56 | 72.01 | 66.89 | 77.92 | 86.27 | **88.17** | 87.34 | 87.70 |
| 6 | Water | 10 | 456 | 77.19 | 76.97 | 73.90 | 27.41 | 53.51 | 49.78 | 76.75 | **80.92** | 78.29 | 76.75 |
| 7 | Building shadow | 45 | 2188 | 61.28 | 60.83 | 57.49 | 30.08 | 52.92 | 41.75 | 60.83 | 62.78 | 64.36 | **65.31** |
| 8 | Buildings | 125 | 6115 | 77.78 | 77.70 | 75.17 | 69.96 | 64.45 | 76.21 | 77.97 | 82.03 | **83.20** | 82.10 |
| 9 | Sidewalk | 28 | 1357 | 43.25 | 43.25 | 44.42 | 32.82 | **50.69** | 41.06 | 43.76 | 45.95 | 43.03 | 41.79 |
| 10 | Yellow curb | 10 | 173 | 47.40 | 46.82 | 43.35 | 78.03 | **82.66** | 82.08 | 47.98 | 57.23 | 53.76 | 70.52 |
| 11 | Cloth panels | 10 | 259 | 88.80 | 88.80 | 89.96 | **94.98** | 94.59 | **94.98** | 88.42 | 89.19 | 87.64 | 89.58 |
|  | AA |  |  | 69.08 | 68.96 | 68.45 | 62.51 | 68.05 | 68.76 | 68.98 | 72.30 | 71.96 | **74.36** |
|  | OA |  |  | 78.70 | 78.69 | 78.42 | 74.98 | 70.84 | 77.39 | 78.69 | 80.65 | 80.93 | **82.21** |
|  | KC |  |  | 72.06 | 72.04 | 71.65 | 66.61 | 62.44 | 70.05 | 72.04 | 74.64 | 74.97 | **76.60** |
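For reference, the three summary metrics reported above (OA, AA, and the kappa coefficient) can all be derived from a confusion matrix over the test samples. The sketch below is illustrative only and is not the paper's code; the function name and toy matrix are assumptions.

```python
import numpy as np

def classification_metrics(cm):
    """Compute OA, AA, and kappa (KC) from a confusion matrix cm,
    where cm[i, j] counts test samples of true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    # Overall accuracy: fraction of all test samples classified correctly.
    oa = np.trace(cm) / n
    # Average accuracy: mean of the per-class accuracies.
    per_class = np.diag(cm) / cm.sum(axis=1)
    aa = per_class.mean()
    # Kappa: agreement beyond chance, estimated from the row/column marginals.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kc = (oa - pe) / (1 - pe)
    return oa, aa, kc

# Toy two-class confusion matrix (hypothetical counts).
cm = np.array([[40, 10],
               [5, 45]])
oa, aa, kc = classification_metrics(cm)
print(f"OA={oa:.3f}, AA={aa:.3f}, KC={kc:.3f}")  # OA=0.850, AA=0.850, KC=0.700
```

Note that the tables report KC on a 0–1 scale for Tables 2 and 3 and as a percentage (KC × 100) in the per-class tables.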
Table 7. Computational time (in seconds) of the different algorithms on the PaviaU, LongKou, and MUUFL data sets.

| Dataset | PCA | LPP | LDA | LFDA | LGSFA | SGDA | LGDA | CGDA | LMSCPE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PaviaU | 0.025 | 0.031 | 0.013 | 0.037 | 0.292 | 0.659 | 3.009 | 0.315 | 2.650 |
| LongKou | 0.014 | 0.124 | 0.023 | 0.040 | 0.616 | 0.308 | 1.655 | 0.112 | 0.887 |
| MUUFL | 0.031 | 0.042 | 0.012 | 0.019 | 0.356 | 0.287 | 2.038 | 0.125 | 1.038 |