Article

Novel Higher-Order Clique Conditional Random Field to Unsupervised Change Detection for Remote Sensing Images

1
Hubei Engineering Technology Research Center for Farmland Environmental Monitoring, China Three Gorges University, Yichang 443002, China
2
College of Computer and Information Technology, China Three Gorges University, Yichang 443002, China
3
Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(15), 3651; https://0-doi-org.brum.beds.ac.uk/10.3390/rs14153651
Submission received: 21 June 2022 / Revised: 24 July 2022 / Accepted: 26 July 2022 / Published: 29 July 2022
(This article belongs to the Special Issue Image Change Detection Research in Remote Sensing)

Abstract

Change detection (CD) is one of the most important topics in remote sensing. In this paper, we propose a novel higher-order clique conditional random field model for unsupervised CD of remote sensing images (termed HOC2RF), by defining a higher-order clique potential. The clique potential, constructed based on a well-designed higher-order clique of image objects, takes the interaction between the neighboring objects in both feature and location spaces into account. HOC2RF consists of five principal steps: (1) Two difference images with complementary change information are produced by change vector analysis and the spectral correlation mapper, which describe changes from the perspective of the vector magnitude and angle, respectively. (2) The fuzzy partition matrix of each difference image is calculated by fuzzy clustering, and the fused partition matrix is obtained by fusing the calculated partition matrices with evidence theory. (3) An object-level map is created by segmenting the difference images with an adaptive morphological reconstruction-based watershed algorithm. (4) The energy function of the proposed HOC2RF, composed of unary, pairwise, and higher-order clique potentials, is computed based on the difference images, the fused partition matrix, and the object-level map. (5) The energy function is minimized by the graph cut algorithm to obtain the binary CD map. The proposed HOC2RF CD approach combines the complementary change information extracted from the perspectives of vector magnitude and angle, and synthetically exploits the pixel-level and object-level spatial correlation of images. The main contributions of this article include: (1) proposing the idea of using the interaction between neighboring objects in both feature and location spaces to enhance the CD performance; and (2) presenting a method to construct a higher-order clique of objects, developing a higher-order clique potential function, and proposing a novel CD method, HOC2RF.
In the experiments on three real remote sensing images, the Kappa coefficient/overall accuracy values of the proposed HOC2RF are 0.9655/0.9967, 0.9518/0.9910, and 0.7845/0.9651, respectively, which are superior to those of several state-of-the-art CD methods. The experimental results confirm the effectiveness of the proposed method.

1. Introduction

Change information about the Earth's surface is of great importance due to its extensive use in various practical applications, such as urban studies, environmental monitoring, resource management, and damage assessment [1]. Change detection (CD) from remote sensing images provides a powerful tool to detect land cover changes. Generally, CD involves the analysis of multitemporal remote sensing images taken over the same ground area. Over the past several decades, a number of CD techniques have been proposed and developed for different types of remote sensing images [1,2].
The techniques can be grouped into two categories according to whether they require training samples, i.e., supervised and unsupervised [3]. The former is able to provide the “from–to” types of land cover transitions. However, it is often difficult and laborious to gain sufficient training samples in real applications. In contrast, the latter performs CD by comparing two temporal remote sensing images directly, without need for any additional information. As a consequence, unsupervised CD is easier to implement and more popular [4,5].
Unsupervised CD is typically realized by two key steps: (1) produce a difference image (DI) and (2) analyze the DI to discriminate the no-change and change pixels. In the first step, different comparison algorithms can be employed to generate DI, including image differencing, change vector analysis (CVA), and spectral correlation mapper (SCM). For the second step, many machine learning techniques have been adopted to produce the binary CD map, such as thresholding [3,6], fuzzy C-means clustering (FCM) [7,8], and information fusion [9].
Some unsupervised CD methods assume that the pixels in remote sensing images are independent of each other, and only use the spectral information of images. This often leads to “salt and pepper” noise in the generated CD map and thus reduces the CD accuracy. To address this problem, several methods have been presented to integrate spatial information into CD, such as neighboring windows [10], local histogram-based analysis [11], active contour model [12], and random field theory [4,13].
The Markov random field (MRF), as a classical model to utilize the spatial-context information in the labeling field, has been widely applied to CD studies [3,13,14]. In MRF-based CD, the joint probability distribution of the observed DI and an initial CD map is first modeled using a Bayesian generative framework [15]. Then, the final CD map is obtained by an inference algorithm, such as graph cuts, simulated annealing, or iterated conditional modes. However, for computational tractability, MRF generally assumes that the observed image is conditionally independent [16], which is not appropriate for some real applications and may result in over-smoothed CD maps.
To overcome the shortcomings of MRF, the conditional random field (CRF) model was applied to remote sensing image CD [15,17]. CRF, which takes spatial-context information into account without assuming the conditional independence of the observed image, is an improved version of MRF. It was first given by [18] to segment and label the 1-D natural language sequences, and then was extended by [19] to deal with the labeling task of 2-D images. From then on, CRF has been extensively applied to image analysis and classification because of its effectiveness and flexibility. The pairwise CRF is the most commonly used CRF model in the analysis of remotely sensed imagery.
Recently, the higher-order CRF (HOCRF) was introduced into the CD task [20,21]. HOCRF incorporates a higher-order potential function (object term) into the pairwise CRF, and can make better use of the spatial correlation of images. Experimental results showed that HOCRF could obtain higher CD accuracy than pairwise CRF. However, the HOCRF CD methods in [20,21] have two main limitations: (1) they only consider a single object and ignore the dependence of neighboring objects when computing higher-order potentials. This limits the methods’ ability to utilize the spatial contextual information of images for CD. (2) They only use the magnitude change of spectral vectors while ignoring the spectral angle (direction) difference, which is also crucial for CD [22].
In order to overcome the above two limitations and enhance the CD performance, in this study, we propose a novel higher-order clique CRF model (HOC2RF) for the unsupervised CD of remote sensing images. For the first limitation, HOC2RF defines a novel higher-order clique potential based on a properly designed clique of objects to utilize the interaction of neighboring objects in both feature and location spaces. For the second limitation, HOC2RF considers two complementary DI images in both observed and labeling fields. The two DI images describe change information from the perspective of vector magnitude and angle, respectively.
The proposed HOC2RF CD method is made up of five main steps. Two DI images providing complementary change information are generated first using CVA and SCM. Second, the fuzzy partition matrix for each DI is estimated by FCM, and the fused partition matrix is achieved by combining the estimated partition matrices with evidence theory. Third, an adaptive morphological reconstruction (AMR)-based watershed algorithm is used to segment the DI images for creating an object-level map. Then, the HOC2RF energy function with three potentials is calculated based on the DIs, the fused fuzzy partition matrix, and the object-level map. Finally, the CD map is obtained by minimizing the HOC2RF energy function with the graph cut algorithm. The main contributions of the paper are as below:
(1)
The basic idea of using the interaction between neighboring objects in both feature and location spaces to enhance CD performance.
(2)
The method to construct a higher-order clique of objects, the novel higher-order clique potential function, and the novel CD method HOC2RF.
The rest of this paper is organized as follows. Section 2 presents related works in the literature. Section 3 describes the proposed HOC2RF CD method in detail. Section 4 evaluates the performance of HOC2RF by three experiments. Finally, the discussion and conclusions of this paper are presented in Section 5 and Section 6, respectively.

2. Related Work

Recently, CRF has been applied to remote sensing CD, and some CRF-based CD techniques have been proposed and developed [15,17,20,21,23,24]. These techniques can be divided into three classes according to the types of CRF models they used: pairwise CRF-based, fully connected CRF-based (FCCRF), and HOCRF-based methods.
The pairwise CRF-based CD methods [15,23,24] include two potential functions, unary and pairwise. The former models the relationship between the labeling and observed fields and describes the cost of a single pixel being assigned to the change or no-change class. The latter models the spatial contextual information between the adjacent pixels in a local neighborhood. Cao et al. [23] applied pairwise CRF to unsupervised CD. The method uses FCM to compute unary potentials and uses a scaled squared Euclidean distance to define pairwise potential.
Lv et al. [15] proposed a multi-feature probabilistic ensemble CD method based on pairwise CRF. It combines the DI’s spectral and morphological features in order to obtain more accurate unary potential. To improve the accuracy of CD, Shao et al. [24] first fused three-scale DI images to compute the unary potential, and then used a spatial attraction model to improve the pairwise potential. Although the pairwise CRF-based methods can obtain effective CD results, they still have a common limitation: they do not fully exploit the spatial contextual information of images since their pairwise potentials only consider a small local neighborhood.
Different from the pairwise CRF, the FCCRF model establishes pairwise potentials based on all pairs of pixels in the whole image, which can enhance the ability to model the dependence of pixels in an image. Cao et al. [17] adopted FCCRF to perform CD to utilize the long-range dependence of pixels. Their experimental results demonstrate that FCCRF can yield more accurate CD results than pairwise CRF. However, the FCCRF CD method [17] (with five parameters) requires much more parameter tuning than the pairwise CRF methods (with only one parameter).
In order to utilize the spatial correlation of pixels in a higher-order neighborhood (i.e., an image object), HOCRF was introduced into the CD task by [20,21]. The HOCRF CD methods add an object term (i.e., a higher-order potential) into the pairwise CRF to capture rich statistics and contextual information in an object and can take advantage of the spatial correlation of pixels more effectively.
However, the HOCRF CD approaches in [20,21] have the following two drawbacks: (1) their higher-order potentials only consider a single object, ignoring the interaction between neighboring objects. (2) They only use the magnitude change of spectral vectors and ignore the spectral angle difference, which is also crucial for CD [23]. Table 1 summarizes the unsupervised CD methods based on CRF in the literature.
In addition, CRF has also been introduced into supervised CD [25,26]. Li et al. [25] used the supervised support vector machine (SVM) to compute the unary potential, and utilized the statistical distribution of the DI to enhance the performance of pairwise CRF. Shi et al. [26] proposed a class-priori CRF model for binary and multiclass CD tasks, which used the class posterior probabilities obtained by SVM to improve the CD accuracy.
This study follows the HOCRF CD methods [20,21], which can utilize the spatial contextual information in a local neighborhood and an image object and have fewer parameters than FCCRF. In order to maintain the advantages of these methods and overcome their two main limitations mentioned above, this paper proposes a novel HOC2RF model for unsupervised CD of remote sensing images. The details of the proposed CD method are described in the next section.

3. Proposed HOC2RF CD Method

This section details the proposed HOC2RF CD approach. HOC2RF maintains the advantages of the existing HOCRF CD methods and overcomes their two main limitations. First, HOC2RF defines a novel higher-order clique potential by constructing a higher-order clique of objects, to utilize the interaction between the neighboring objects in both feature and location spaces. Then, HOC2RF makes comprehensive use of the magnitude and angle change of spectral vectors in both observed and labeling fields to enhance the CD performance.

3.1. Procedure and Organization of HOC2RF

The proposed HOC2RF model involves two types of fields, observed and labeling, and includes three potentials: unary, pairwise, and higher-order clique. This study uses two complementary DIs computed by CVA and SCM to define the observed field. For the labeling field, an initial CD map is yielded by fusing the two DI images with FCM and evidence theory. In the process of fusing the two DIs, a fused fuzzy partition matrix is also obtained, which will be used to compute the unary potential. In addition, in order to compute the proposed higher-order clique potential, an object-level map needs to be generated.
Specifically, the proposed HOC2RF CD method is achieved by the following steps (Figure 1): (1) produce two complementary DI images using CVA and SCM; (2) combine the two DIs with FCM and evidence theory to obtain the fused fuzzy partition matrix and an initial CD map; (3) generate an object-level map using the AMR-based watershed algorithm; (4) compute the unary, pairwise, and higher-order clique potentials for HOC2RF; and (5) create the final CD map by optimizing the HOC2RF model with the graph cut algorithm.
Steps 1–3 are the preparation steps for defining the HOC2RF model. The two DIs obtained in Step 1 are used to define the observation field of HOC2RF. The fused fuzzy partition matrix and initial CD map yielded in Step 2 are used to compute the unary potential and define the labeling field, respectively. The object-level map generated in Step 3 is used for computing the higher-order clique potential.
The rest of Section 3 is organized as follows: Section 3.2, Section 3.3 and Section 3.4 present the details of Steps 1–3, respectively. Section 3.5 describes the proposed HOC2RF model and its implementation in detail.

3.2. Generating Complementary DI Images

Let us consider two multispectral (or hyperspectral) remote sensing images X1 and X2 (with the same size of N pixels) acquired in the same geographical area at two different dates, respectively, which have been radiometrically corrected and coregistered. Both X1 and X2 are composed of B spectral bands (B > 1); Xtb is the bth band of image Xt, t = 1, 2; b = 1, 2, …, B.
Generally, different land cover types have their own typical spectral characteristics represented by peculiar spectral curves, although there are the phenomena of “same object with different spectrum” and “different objects with same spectrum” in some cases. Accordingly, various features could be extracted from remote sensing images. For the CD task, the differences between multitemporal remote sensing images can reflect the changes occurring on the corresponding area during the observation times to an extent, and thus can provide hints to CD. Images with multiple spectral bands can constitute spectral vectors of elements [22]. For the multispectral image CD, the DI image is generally produced based on the multitemporal spectral vectors.
However, most unsupervised CD techniques mainly take the vector magnitude change into account, failing to utilize the vector angle (or direction). The vector magnitude and angle can provide complementary change information [22], and can be described by CVA and the spectral angle mapper (SAM), respectively. SAM uses cosine correlation to compute the vector angle, which cannot detect negatively correlated data and is sensitive to offset factors [27]. In contrast, SCM uses Pearson’s correlation to calculate vector angles, and can overcome the shortcomings of SAM to some extent. Given the above analysis, this study uses CVA and SCM to produce the DI images.
The DI image defined by CVA is denoted by DICVA. CVA uses the Euclidean distance to compute the differences between two temporal images. Specifically, DICVA can be computed through the following equation:
$$DI_{CVA}(i) = \sqrt{\sum_{b=1}^{B} \left( X_2^b(i) - X_1^b(i) \right)^2} \tag{1}$$
where Xtb(i) is the ith pixel at the bth band of image Xt, t = 1, 2; b = 1, 2, …, B; and i = 1, 2, …, N.
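As an illustration of the CVA definition above, the following minimal sketch computes the per-pixel Euclidean norm of the spectral change vector. It is written in plain Python for readability; the function name `cva_difference` and the list-of-pixels layout are assumptions made for this example, not part of the original method description.

```python
import math

def cva_difference(x1, x2):
    """Per-pixel CVA magnitude: Euclidean distance between spectral vectors.

    x1, x2: lists of N pixels, each pixel a list of B band values
    (hypothetical layout chosen for this sketch).
    """
    return [
        math.sqrt(sum((b2 - b1) ** 2 for b1, b2 in zip(p1, p2)))
        for p1, p2 in zip(x1, x2)
    ]

# Two pixels, three bands: the first is unchanged, the second changes by (0, 3, 4).
x1 = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
x2 = [[1.0, 2.0, 3.0], [4.0, 7.0, 8.0]]
print(cva_difference(x1, x2))  # [0.0, 5.0]
```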
Denote the DI determined with SCM by DISCM. SCM utilizes the angle of spectral vectors to model their difference. First, the Pearson’s correlation coefficient of X2 and X1 is calculated by
$$SCM_i(X_2, X_1) = \frac{\sum_{b=1}^{B} \left( X_2^b(i) - \bar{X}_2(i) \right) \left( X_1^b(i) - \bar{X}_1(i) \right)}{\sqrt{\sum_{b=1}^{B} \left( X_2^b(i) - \bar{X}_2(i) \right)^2 \sum_{b=1}^{B} \left( X_1^b(i) - \bar{X}_1(i) \right)^2}} \tag{2}$$
where Xtb(i) denotes the ith pixel at the bth band of image Xt, and $\bar{X}_t(i)$ represents the average of the spectral bands of Xt at pixel i, t = 1, 2; b = 1, 2, …, B; and i = 1, 2, …, N. The correlation coefficient in (2) can be converted to an angle by applying the arccosine operation to it. Thus, SCM is a version of SAM centered by $\bar{X}_1$ and $\bar{X}_2$. The SCMi(X2, X1) value varies in the interval [−1, 1]. SCMi(X2, X1) = 1 means that the two vectors are completely positively correlated, and SCMi(X2, X1) = −1 means that the two vectors are completely negatively correlated. Then, SCM(X2, X1) can be converted to the DI image DISCM by Equation (3) or (4):
$$DI_{SCM}(i) = 1 - SCM_i(X_2, X_1) \tag{3}$$
$$DI_{SCM}(i) = \arccos \left( SCM_i(X_2, X_1) \right) \tag{4}$$
Here, X1 and X2 represent the two considered remote sensing images (see the beginning of this subsection). After obtaining the two DI images, their pixel values are normalized to the interval [0, 1] in order to make different datasets have the same weight.
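The SCM-based DI and the normalization step can be sketched as follows. This is a plain-Python illustration under stated assumptions: the text does not say how the normalization to [0, 1] is performed, so min-max scaling is assumed here, and a pixel with a flat spectrum (zero variance) is treated as uncorrelated. The names `scm`, `scm_difference`, and `normalize` are hypothetical.

```python
import math

def scm(p1, p2):
    """Pearson correlation between two spectral vectors, as in Equation (2)."""
    mean1 = sum(p1) / len(p1)
    mean2 = sum(p2) / len(p2)
    num = sum((b2 - mean2) * (b1 - mean1) for b1, b2 in zip(p1, p2))
    den = math.sqrt(sum((b2 - mean2) ** 2 for b2 in p2)
                    * sum((b1 - mean1) ** 2 for b1 in p1))
    # Assumption: a zero-variance spectrum yields correlation 0.
    return num / den if den > 0 else 0.0

def scm_difference(x1, x2, use_angle=False):
    """DI from SCM: 1 - SCM by default, arccos(SCM) when use_angle is True."""
    out = []
    for p1, p2 in zip(x1, x2):
        r = max(-1.0, min(1.0, scm(p1, p2)))  # clamp against rounding error
        out.append(math.acos(r) if use_angle else 1.0 - r)
    return out

def normalize(di):
    """Min-max scaling to [0, 1] (assumed form of the normalization)."""
    lo, hi = min(di), max(di)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in di]

x1 = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
x2 = [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
print(normalize(scm_difference(x1, x2)))
```

In this toy input the first pixel keeps its spectral shape (only the magnitude doubles), so its SCM-based difference is near zero, whereas CVA would report a large change. This is the complementarity between magnitude and angle that motivates using both DIs.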

3.3. Combine DI Images with FCM and Evidence Theory

Evidence theory (also known as Dempster-Shafer theory) [28,29] is a popular decision-level fusion framework and has been successfully applied to various applications. It can deal with both single and composite hypotheses and allows the modeling of both uncertainty and ignorance. Let us consider a frame of discernment Ω consisting of all possible single hypotheses and its power set P(Ω). A mass function for the discernment frame Ω is a mapping m from P(Ω) to the interval [0, 1] and satisfies the following properties:
$$m(\emptyset) = 0, \qquad \sum_{A \in P(\Omega)} m(A) = 1 \tag{5}$$
where ∅ represents the empty set, A represents a nonempty subset of Ω, and m(A) represents the mass value of A.
In evidence theory, evidence from different sources is usually combined using the orthogonal sum. Consider D mass functions (namely, mn, n = 1, 2, …, D) from D pieces of evidence, respectively. Their fused mass function m can be determined as follows [30]:
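$$m(A) = m_1 \oplus m_2 \oplus \cdots \oplus m_D = \begin{cases} \dfrac{1}{1 - \Psi} \displaystyle\sum_{A_1 \cap \cdots \cap A_D = A} \prod_{n=1}^{D} m_n(A_n), & A \neq \emptyset \\ 0, & A = \emptyset \end{cases} \tag{6}$$
with
$$\Psi = \sum_{A_1 \cap \cdots \cap A_D = \emptyset} \prod_{n=1}^{D} m_n(A_n) \tag{7}$$
where Ψ represents the degree of conflict between evidence, called the conflict coefficient.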
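The orthogonal sum can be illustrated with a small, generic sketch of Dempster's rule. Representing each mass function as a dictionary from `frozenset` focal elements to mass values is a choice made for this example only, and the function name `dempster_combine` is hypothetical.

```python
from itertools import product

def dempster_combine(masses):
    """Orthogonal sum of mass functions with conflict normalization.

    masses: list of dicts mapping frozenset (focal element) -> mass value.
    Returns (fused mass function, conflict coefficient Psi).
    """
    fused, conflict = {}, 0.0
    # Enumerate every combination of one focal element per source.
    for combo in product(*(m.items() for m in masses)):
        inter = frozenset.intersection(*(a for a, _ in combo))
        weight = 1.0
        for _, v in combo:
            weight *= v
        if inter:
            fused[inter] = fused.get(inter, 0.0) + weight
        else:
            conflict += weight  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {a: v / (1.0 - conflict) for a, v in fused.items()}, conflict

CU, CC = frozenset({"Cu"}), frozenset({"Cc"})
fused, psi = dempster_combine([{CU: 0.8, CC: 0.2}, {CU: 0.6, CC: 0.4}])
print(psi, fused[CU], fused[CC])
```

With these two sources, the conflicting products are 0.8 × 0.4 and 0.2 × 0.6, so Ψ = 0.44, and the fused masses are 0.48/0.56 for no-change and 0.08/0.56 for change.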
For the CD task, there are two single hypotheses in Ω—namely, Ω = {Cu, Cc}, where Cu and Cc represent the no-change and change classes, respectively. Two pieces of evidence, the two produced DI images, are available for Ω in our case. In order to fuse the DIs with evidence theory, the first step is to define their mass functions.
Usually, there exists an overlap between the ranges of the DI pixel values from the change and no-change classes [8]. This leads to inherent uncertainty in the analysis of the DI. Fuzzy clustering provides a suitable tool for analyzing the DI owing to its capability to process uncertainty. In fuzzy clustering, the pixels are not assigned to either the no-change or the change category but to both categories with a certain degree of membership. Moreover, fuzzy clustering requires no prior assumption about the distribution of the no-change and change classes. Given the above analysis, we use the popular FCM clustering to analyze the DIs for estimating their fuzzy partition matrices (also called membership functions). The mass functions for the two pieces of DI evidence are then derived from the estimated fuzzy partition matrices. The FCM details can be found in [31].
Let Un = {uni(k)} represent the fuzzy partition matrix obtained by FCM based on the nth DI image, n = 1, 2, and uni(k) stands for the membership of the ith pixel with respect to class k, i = 1, 2, …, N; k ∈ {Cu, Cc}, satisfying
$$0 \le u_{ni}(k) \le 1, \qquad u_{ni}(C_u) + u_{ni}(C_c) = 1 \tag{8}$$
The mass function mn for the nth DI image can be determined according to the fuzzy partition matrix Un. In particular, the mass values of a given pixel i for no-change and change classes are obtained by
$$m_{ni}(k) = u_{ni}(k), \quad k \in \{C_u, C_c\} \tag{9}$$
where mni(k) represents the mass value of the ith pixel to class k obtained based on the nth DI image. Then, the combined mass function m is computed through Equation (6), and an initial CD map is yielded using the principle of maximum mass value. For a given pixel i, its initial class label yi is obtained as follows:
$$y_i = \begin{cases} C_c & \text{if } m_i(C_c) \ge m_i(C_u) \\ C_u & \text{if } m_i(C_c) < m_i(C_u) \end{cases} \tag{10}$$
where mi(Cu) and mi(Cc) represent the combined mass values of pixel i to the no-change and change classes, respectively.
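Because each DI assigns mass only to the two single hypotheses and the memberships sum to one, the combination rule reduces to a simple per-pixel formula: the conflict is the sum of the two cross products, and 1 − Ψ is the sum of the two agreeing products. The sketch below shows this special case together with the maximum-mass labeling rule. The function names and the tuple layout `(u(Cu), u(Cc))` are assumptions for illustration; the memberships themselves would come from FCM, which is not reproduced here.

```python
def fuse_memberships(u1, u2):
    """Fuse two (u(Cu), u(Cc)) membership pairs for one pixel.

    With only singleton hypotheses, 1 - Psi equals the sum of the
    agreeing products, so normalization needs no explicit Psi.
    """
    agree_u = u1[0] * u2[0]
    agree_c = u1[1] * u2[1]
    total = agree_u + agree_c  # equals 1 - Psi
    if total == 0.0:
        raise ValueError("the two DIs are in total conflict at this pixel")
    return agree_u / total, agree_c / total

def initial_label(m):
    """Maximum-mass rule: 1 = change (Cc), 0 = no-change (Cu); ties go to change."""
    return 1 if m[1] >= m[0] else 0

# Hypothetical FCM memberships of one pixel in the CVA and SCM difference images.
m = fuse_memberships((0.3, 0.7), (0.5, 0.5))
print(m, initial_label(m))
```

Note that an uninformative source such as (0.5, 0.5) leaves the other source's masses unchanged, which is the expected behavior of the orthogonal sum.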

3.4. Generate an Object-Level Map for HOC2RF

This subsection aims to generate an object-level map by segmenting the DI images, which will be used for computing the proposed higher-order clique potential function. Different segmentation techniques can be employed to produce object-level images, such as spectral clustering and seeded algorithms. Seeded segmentation algorithms have been successfully applied to many image segmentation applications because of their good performance [32]. The watershed algorithm is one of the most important seeded algorithms. However, it often suffers from the problem of over-segmentation.
This is because the watershed algorithm obtains seeds from the gradient image, which usually contains many seeds produced by unimportant texture details or noise. To solve this problem, an advanced AMR technique was proposed recently in [32] to improve the seed image. The AMR algorithm has the following advantages: (1) it is easy to implement; (2) it can adaptively remove useless seeds while maintaining meaningful ones; (3) it uses multiscale structuring elements of erosion and dilation operations and is robust to the structuring element scale; and (4) it has two attractive properties, namely monotonicity and convergence, which help AMR-based algorithms obtain a hierarchical segmentation. We refer to [32] for more details of AMR.
This study proposes to use the AMR-based watershed algorithm to segment the DIs for producing an object-level map consisting of image objects. Specifically, a gradient image is created first by applying the Sobel operator to the three-dimensional DI image, DI = {DICVA, DISCM, DImean}, where DImean = (DICVA + DISCM)/2. Then, the AMR algorithm is employed to reconstruct the gradient image adaptively. Finally, the reconstructed gradient image is used as the seed image, and the watershed algorithm is adopted to yield an object-level image.
In AMR, two parameters need to be set: the scale of the minimal structuring element s and the positive threshold η used to control the convergence condition. Since the segmentation results are not sensitive to these two parameters, they are fixed in this work: s = 2 and η = $10^{-5}$. The obtained object-level map will be used in the next subsection for computing the higher-order potential term of the proposed HOC2RF.

3.5. HOC2RF Model

This subsection defines the proposed HOC2RF model based on the DIs, the fused fuzzy partition matrix, the initial CD map, and the object-level map obtained in Section 3.2, Section 3.3 and Section 3.4. HOC2RF integrates the complementary change information extracted from the perspective of vector magnitude and angle, and synthetically utilizes the spatial correlation of images at both the pixel and object levels. The details of the proposed HOC2RF are as follows.
Let the random variable sets X = {x1, x2, …, xN} and Y = {y1, y2, …, yN} denote the observation field and labeling field of an image, respectively, where N represents the total number of pixels in the image. xi stands for the spectral features of pixel i, and yi ∈ C denotes the class label of pixel i, where C = {C1, C2, …, CM} is the class label set and M denotes the number of classes. For unsupervised CD, C = {Cu, Cc}.
In HOC2RF, X is defined based on the two complementary DIs, DICVA and DISCM (obtained in Section 3.2). In particular, X = DI = {DICVA, DISCM, DImean}, DImean = (DICVA + DISCM)/2, and xi = {DICVA(i), DISCM(i), DImean(i)}. The initial class label yi for the labeling field is obtained by combining DICVA and DISCM (see Section 3.3). Then, the energy function of the proposed HOC2RF is defined as follows:
$$E(X) = \sum_{i=1}^{N} \psi_i(y_i, X) + \lambda \sum_{i=1}^{N} \sum_{j \in N_i} \psi_{ij}(y_i, y_j, X) + \sum_{o \in S} \psi_{high}(v_o, X) \tag{11}$$
where $\psi_i(y_i, X)$, $\psi_{ij}(y_i, y_j, X)$, and $\psi_{high}(v_o, X)$ represent the unary, pairwise, and proposed higher-order clique potentials, respectively. Ni denotes the neighborhood of pixel i, and j ∈ Ni denotes a neighboring pixel of pixel i. In this study, the widely used second-order (i.e., eight-neighbor) neighborhood system is used to define Ni. S is the set of image objects from the object-level map produced in Section 3.4, o represents an image object, vo represents the higher-order clique of objects for o (see Equation (15)), and the parameter λ is the weight coefficient that controls the contribution of the pairwise potential.

3.5.1. Unary and Pairwise Potentials

The unary potential function $\psi_i(y_i, X)$ is used to describe the relationship between the labeling and observation fields. $\psi_i(y_i, X)$ denotes the cost of pixel i taking the class label yi given the observed data and is usually defined as the negative logarithm of the probability of pixel i belonging to class yi:
$$\psi_i(y_i, X) = -\log P(y_i) \tag{12}$$
where P(yi) represents the probability that pixel i belongs to class yi, yi ∈ {Cu, Cc}, and log is the natural logarithm. P(yi) can be computed with different techniques, such as FCM. In this study, P(yi) is defined using the joint mass function that is obtained by combining the two complementary DIs, DICVA and DISCM, with FCM and evidence theory (see Section 3.2 and Section 3.3). That is,
$$P(y_i) = \begin{cases} m_i(C_u) & y_i = C_u \\ m_i(C_c) & y_i = C_c \end{cases} \tag{13}$$
where mi(Cu) and mi(Cc) represent the combined mass values of pixel i to the no-change and change classes, respectively (see Section 3.3).
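A minimal sketch of the unary term follows: the cost of each label is the negative natural logarithm of the corresponding combined mass. The small floor `eps` is an assumption added here to avoid taking the logarithm of zero; it is not mentioned in the text, and the function name is hypothetical.

```python
import math

def unary_potential(m_u, m_c, eps=1e-12):
    """Return (cost of label Cu, cost of label Cc) for one pixel.

    m_u, m_c: combined mass values of the pixel for no-change and change.
    eps: assumed numerical floor so that a zero mass gives a large finite cost.
    """
    return -math.log(max(m_u, eps)), -math.log(max(m_c, eps))

# A pixel whose fused evidence favors "change" is cheap to label Cc
# and expensive to label Cu.
cost_u, cost_c = unary_potential(0.1, 0.9)
print(cost_u, cost_c)
```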
The pairwise potential function $\psi_{ij}(y_i, y_j, X)$ is used to utilize the spatial correlation of an image in a local neighborhood, and to model the interaction between pixel i and its neighboring pixels j in Ni. It imposes a label constraint on the image by constraining the class labels to be consistent, by which adjacent pixels with similar spectral values are encouraged to take the same class label. Following [15], the pairwise potential term in (11) is written as follows:
$$\psi_{ij}(y_i, y_j, X) = \begin{cases} 0 & \text{if } y_i = y_j \\ 1 + \exp \left( -\dfrac{d(x_i, x_j)}{2 \sigma^2} \right) & \text{otherwise} \end{cases} \tag{14}$$
where d(xi, xj) is the Euclidean distance between xi and xj, with xi = {DICVA(i), DISCM(i), DImean(i)} and xj = {DICVA(j), DISCM(j), DImean(j)}, and $\sigma^2$ is the mean value of d(xi, xj) over the neighborhood Ni.
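The pairwise term can be sketched as below. The sketch assumes the usual contrast-sensitive form in which the exponent is negative, so that neighboring pixels with similar DI features pay a higher penalty for taking different labels; it also assumes that σ² is positive. The function names are hypothetical.

```python
import math

def mean_neighbor_distance(x_i, neighbors):
    """sigma^2: mean Euclidean distance from x_i to its neighboring feature vectors."""
    return sum(math.dist(x_i, x_j) for x_j in neighbors) / len(neighbors)

def pairwise_potential(y_i, y_j, x_i, x_j, sigma2):
    """Penalty for a pair of neighboring pixels (sigma2 must be positive).

    Zero when the labels agree; otherwise between 1 and 2, largest when
    the two feature vectors are identical (assumed negative exponent).
    """
    if y_i == y_j:
        return 0.0
    d = math.dist(x_i, x_j)
    return 1.0 + math.exp(-d / (2.0 * sigma2))

x_i = (0.2, 0.1, 0.15)
neighbors = [(0.2, 0.1, 0.15), (0.8, 0.9, 0.85), (0.3, 0.1, 0.2)]
s2 = mean_neighbor_distance(x_i, neighbors)
print([pairwise_potential(0, 1, x_i, x_j, s2) for x_j in neighbors])
```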

3.5.2. Proposed Higher-Order Clique Potential

The higher-order potential (object term) was introduced into the CD task in [21] to enhance CD performance. However, the higher-order potential in [21] only considers a single object, ignoring the dependence between neighboring objects. To overcome this shortcoming, this study proposes a novel higher-order clique potential by constructing a higher-order clique of objects and by considering the dependence between neighboring objects in both feature and location spaces. For a given object o, its higher-order clique vo is defined as:
$$v_o = \{ o, r_1(o), r_2(o), g_1(o), g_2(o) \} \tag{15}$$
where o denotes the object o; r1(o) and r2(o) denote the two neighboring objects that are nearest to object o in the feature space; and g1(o) and g2(o) denote the two neighboring objects nearest to object o in the location space. Figure 2a,b shows an example of the feature space (location space) for a simple object-level image.
In order to determine the objects r1(o), r2(o), g1(o), and g2(o), we need to define the distances between objects in the feature and location spaces. For two given objects o1 and o2, their distance in the feature space is defined as the Euclidean distance of x(o1) and x(o2), where x(o1) and x(o2) denote the mean values of the DI features of the pixels in objects o1 and o2, respectively. The distance between o1 and o2 in the location space is defined as the Euclidean distance of the location coordinates of the center pixels of objects o1 and o2.
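The construction of a clique from the two distance definitions can be sketched as follows. The sketch assumes that every other object in the map is a candidate neighbor and that an object may appear in both the feature-space and location-space lists; the text does not state either point explicitly. The function name `build_clique` and the dictionary inputs are hypothetical.

```python
import math

def build_clique(o, features, centers, n_near=2):
    """Return [o, r1, r2, g1, g2] for object o.

    features: dict object id -> mean DI feature vector of the object
    centers:  dict object id -> (row, col) of the object's center pixel
    Assumption: all other objects are candidate neighbors.
    """
    others = [p for p in features if p != o]
    by_feature = sorted(others, key=lambda p: math.dist(features[o], features[p]))
    by_location = sorted(others, key=lambda p: math.dist(centers[o], centers[p]))
    return [o] + by_feature[:n_near] + by_location[:n_near]

features = {0: (0.10, 0.10, 0.10), 1: (0.12, 0.12, 0.12), 2: (0.90, 0.90, 0.90),
            3: (0.15, 0.15, 0.15), 4: (0.50, 0.50, 0.50)}
centers = {0: (0, 0), 1: (10, 10), 2: (0, 1), 3: (50, 50), 4: (1, 1)}
print(build_clique(0, features, centers))  # [0, 1, 3, 2, 4]
```

In this toy map, objects 1 and 3 are spectrally closest to object 0 although they lie far away, while objects 2 and 4 are spatially adjacent but spectrally different, so the clique gathers context from both spaces.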
The clique vo is made up of three parts: the object o, its two nearest neighboring objects in feature space, and its two nearest neighboring objects in location space. In general, there is correlation between the neighboring objects—in particular for the over- segmentation objects. The proposed higher-order clique potential function ψ h i g h ( v o , X ) is defined based on the object clique vo and is used to utilize the correlation of the pixels within an object and its nearest neighboring objects in both feature and location spaces. ψ h i g h ( v o , X ) takes the following form:
ψ h i g h v o , X = N ( v o ) × f o
where ψ h i g h ( v o , X ) is used to describe the cost of the label inconsistency in the clique vo. N(vo) denotes the number of the pixels in the clique vo, and consequently, a large clique will have a large weight. f o is used to define the cost coefficient, which takes both the clique segmentation quality and the clique likelihood for change/no-change into account. In particular, the cost coefficient f o is defined as:
$$f_o = \min\left\{ \min_{k}\left[\, q(v_o) \times z_k(v_o) + \bigl(1 - z_k(v_o)\bigr) \right],\ 1 \right\} \tag{17}$$
where min represents the minimum operator, q(vo) represents the clique segmentation quality of clique vo, and zk(vo) represents the clique likelihood of clique vo to class k, k∈{Cu, Cc}.
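To make the truncation behavior concrete, the following Python sketch evaluates the cost coefficient and the clique potential for given q(vo) and zk(vo). It assumes that the parenthesized term in the cost coefficient reads 1 − zk(vo); the function names and the dictionary representation of the two class likelihoods are hypothetical.

```python
def cost_coefficient(q_clique, z_clique):
    """f_o = min( min_k [ q(v_o) * z_k(v_o) + (1 - z_k(v_o)) ], 1 ).

    q_clique: clique segmentation quality q(v_o)
    z_clique: dict mapping each class label k to the likelihood z_k(v_o)
    """
    inner = min(q_clique * z_k + (1.0 - z_k) for z_k in z_clique.values())
    return min(inner, 1.0)  # the cost coefficient is truncated at 1

def higher_order_potential(n_pixels, q_clique, z_clique):
    """psi_high(v_o, X) = N(v_o) * f_o."""
    return n_pixels * cost_coefficient(q_clique, z_clique)

z = {"Cu": 0.1, "Cc": 0.9}                      # made-up likelihoods
f_good = cost_coefficient(0.4, z)               # well-segmented clique
f_poor = cost_coefficient(3.0, z)               # poorly segmented clique
psi = higher_order_potential(200, 0.4, z)
```

With q(vo) = 0.4 the coefficient is about 0.46, whereas with q(vo) = 3 the inner minimum exceeds 1 and the coefficient is truncated to 1.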
In this study, the clique segmentation quality, q(vo), is defined as the weighted average of the segmentation quality of the objects in clique vo:
$$q(v_o) = \frac{q(o) + 0.5\,q(r_1(o)) + 0.5\,q(r_2(o)) + 0.5\,q(g_1(o)) + 0.5\,q(g_2(o))}{1 + 4 \times 0.5} \tag{18}$$
where q(o), q(r1(o)), q(r2(o)), q(g1(o)), and q(g2(o)) stand for the segmentation quality of objects o, r1(o), r2(o), g1(o), and g2(o), respectively.
Let ol denote a given object in clique vo. The segmentation quality of ol is estimated based on the object consistency assumption, which encourages all the pixels in an object to have the same class label. Specifically, we define q(ol) as q(ol) = (N(ol) − Nk(ol))/Q(ol), where N(ol) denotes the number of pixels in object ol, Nk(ol) denotes the number of pixels assigned to class k in object ol, k∈{Cu, Cc}, and Q(ol) denotes a truncation parameter used to adjust the rigidity of q(ol). This study sets Q(ol) = 0.1 × N(ol). Consequently, if more than 90% of the pixels in ol are assigned to the same class (Cc or Cu), the value of q(ol) is less than 1. Similarly, if 70% of the pixels in object ol are assigned to the same class, q(ol) equals 3.
According to the definition of q(ol), the more pixels of an object share the same class label, the better the segmentation quality of this object and the smaller the value of q(ol). As a result, the better the segmentation quality of the objects in clique vo, the smaller the value of q(vo) (see Equation (18)).
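The two numerical cases mentioned above can be checked with a short Python sketch. The function name `object_quality` and the argument `rigidity` (the factor 0.1 in Q(ol)) are hypothetical, and the example assumes that class k is the label held by the majority of the pixels in the object.

```python
def object_quality(n_pixels, n_class_k, rigidity=0.1):
    """q(o_l) = (N(o_l) - N_k(o_l)) / Q(o_l), with Q(o_l) = rigidity * N(o_l)."""
    return (n_pixels - n_class_k) / (rigidity * n_pixels)

q_70 = object_quality(100, 70)  # 70% of the pixels share one label
q_95 = object_quality(100, 95)  # 95% of the pixels share one label
```

Here `q_70` evaluates to 3 and `q_95` to 0.5, so only objects with more than 90% label agreement obtain a quality value below 1.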
The clique likelihood zk(vo) is defined based on the number of pixels in the objects from clique vo and the objects’ joint mass values, taking the following form:
$$z_k(v_o) = \frac{N(o)\, m_k(o) + 0.5\,N(r_1(o))\, m_k(r_1(o)) + 0.5\,N(r_2(o))\, m_k(r_2(o)) + 0.5\,N(g_1(o))\, m_k(g_1(o)) + 0.5\,N(g_2(o))\, m_k(g_2(o))}{N(o) + 0.5\,N(r_1(o)) + 0.5\,N(r_2(o)) + 0.5\,N(g_1(o)) + 0.5\,N(g_2(o))} \tag{19}$$
where N(o), N(r1(o)), N(r2(o)), N(g1(o)), and N(g2(o)) denote the number of pixels in objects o, r1(o), r2(o), g1(o), and g2(o), respectively. mk(o) denotes the joint mass value of the object o to class k and is computed via mk(o) = Σi∈o mi(k)/N(o), where mi(k) denotes the joint mass value of pixel i to class k, k∈{Cu, Cc}. mk(r1(o)), mk(r2(o)), mk(g1(o)), and mk(g2(o)) are defined analogously to mk(o).
Generally, the center object in clique vo is more important than its neighboring objects. Accordingly, when computing q(vo) and zk(vo), the weight of object o is set to 1, whereas the weights of objects r1(o), r2(o), g1(o), and g2(o) are set to 0.5.
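The weighting scheme (1 for the center object, 0.5 for each of the four neighbors) can be sketched in Python as follows. The function names and the list-based interface are assumptions made for illustration; the input numbers are made up.

```python
def clique_quality(q_center, q_neighbors, w=0.5):
    """Weighted average of object qualities: center weight 1, neighbor weight w."""
    return (q_center + w * sum(q_neighbors)) / (1.0 + w * len(q_neighbors))

def clique_likelihood(n_center, m_center, n_neighbors, m_neighbors, w=0.5):
    """z_k(v_o): pixel-count-weighted average of the objects' joint mass values.

    n_*: pixel counts N(.) of the objects; m_*: joint mass values m_k(.) for class k.
    """
    num = n_center * m_center + w * sum(
        n * m for n, m in zip(n_neighbors, m_neighbors)
    )
    den = n_center + w * sum(n_neighbors)
    return num / den

# Neighbors are ordered as r1(o), r2(o), g1(o), g2(o).
q_v = clique_quality(0.2, [0.4, 0.4, 0.8, 0.8])
z_v = clique_likelihood(100, 0.9, [50, 50, 100, 100], [0.8, 0.8, 0.6, 0.6])
```

In this example the clique quality is 1.4/3 ≈ 0.467 and the clique likelihood is 190/250 = 0.76; large neighbors influence zk(vo) more than small ones because each mass value is weighted by its pixel count.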
On the one hand, the proposed higher-order clique potential encourages all pixels in clique vo to have the same class label. On the other hand, it uses the label consistency in a clique as a soft constraint and, thus, enables some pixels in the clique to take different labels. Accordingly, the higher-order clique potential can make effective use of the interaction of the pixels within an object and its nearest neighboring objects in both feature and location spaces and, thus, can improve the CD performance.
Different optimization algorithms, such as graph cuts and iterated conditional modes, can be adopted to minimize (optimize) the CRF model. The graph cut algorithm [33] is used to minimize the HOC2RF model for producing the final CD map.

4. Results

This section evaluates the performance of the proposed HOC2RF CD method. To this end, experiments were conducted on three real remote sensing datasets acquired by different sensors. Before performing CD, relative radiometric correction and co-registration were applied to the three datasets so that the two-temporal remote sensing images of each dataset are comparable in both the spectral and spatial domains.

4.1. Dataset Description and Experimental Settings

The first experiment was conducted on the Neimeng dataset, which comprises a pair of multispectral images taken by the Landsat-5 Thematic Mapper sensor on 22 August 2006 and 17 June 2011 in the boundary area between the Neimeng and Heilongjiang Provinces, China. This dataset covers an area of 1200 × 1350 pixels and contains one main type of land cover, forest. The changes in this study area were mainly caused by a wildfire. Bands 1, 2, 3, 4, 5, and 7 were used for CD. Figure 3a–c shows the images of 2006 and 2011, and their reference map, respectively. The reference map was produced by careful visual interpretation of the two-temporal images.
The second dataset, the Texas dataset, consists of two multispectral images of 1534 × 808 pixels acquired by the Landsat-5 Thematic Mapper sensor on 26 August and 11 September 2011. The dataset covers a forest fire in Bastrop County, Texas. Bands 1, 2, 3, 4, and 5 were used for CD. Figure 4a–c shows the images of August and September, and their reference map, respectively.
The third dataset is the Poyang River dataset (463 × 241 pixels), which is made up of two Earth Observing-1 (EO-1) Hyperion images acquired on 3 May 2013 and 31 December 2013, respectively, in Jiangsu province, China. The dataset has 198 bands available after noisy band removal. The images of May and December are displayed in Figure 5a,b, respectively; and their reference map is shown in Figure 5c.
The Texas and Poyang River datasets are provided by [34,35], respectively. These two datasets are both offered in MAT format, and their location information is unavailable. As a result, we cannot show the information regarding the north direction and detailed location for these two datasets (Figure 4 and Figure 5).
To assess the effectiveness of the proposed HOC2RF CD method, it was compared with nine related approaches: CVA, SCM, the reformulated fuzzy local information C-means (RFLICM) [36], MRF [3], the traditional pairwise CRF (CRF), the fully-connected CRF (FCCRF) [37], the improved FCCRF (IFCCRF) [17], the higher-order CRF (HOCRF) [21], and the improved nonlocal patch-based graph (INLPG) [38].
Seven measures [39] were employed to conduct the performance evaluation: false positives (FP), the number of the unchanged pixels that are wrongly detected as changed ones; false negatives (FN), the number of the changed pixels that are wrongly detected as unchanged ones; true positives (TP), the number of the correctly detected change pixels; true negatives (TN), the correctly detected no-change pixels; overall errors (OE), the sum of FP and FN, OE = FP + FN; the overall accuracy (OA), OA = 1 − OE/(TP + TN + FP + FN); and the Kappa coefficient (KC), which is calculated by
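$$\mathrm{KC} = \frac{N \times (\mathrm{TP} + \mathrm{TN}) - \bigl((\mathrm{TN} + \mathrm{FN}) \times (\mathrm{TN} + \mathrm{FP}) + (\mathrm{TP} + \mathrm{FP}) \times (\mathrm{TP} + \mathrm{FN})\bigr)}{N^2 - \bigl((\mathrm{TN} + \mathrm{FN}) \times (\mathrm{TN} + \mathrm{FP}) + (\mathrm{TP} + \mathrm{FP}) \times (\mathrm{TP} + \mathrm{FN})\bigr)} \tag{20}$$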
where N = TP + TN + FP + FN is the total number of pixels. For the CD task, FP and FN are also known as false alarms (FA) and missed detections (MD), respectively. FA and MD are more widely used than FP and FN in the CD literature. KC incorporates more classification information and is therefore a more convincing indicator than the others [39].
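As a sanity check on these definitions, the following Python sketch computes OE, OA, and KC from the four confusion counts. The function name `cd_metrics` is hypothetical; the example counts are those reported for HOC2RF on the Neimeng dataset in Table 3 (FA as FP, MD as FN).

```python
def cd_metrics(tp, tn, fp, fn):
    """Overall errors, overall accuracy, and Kappa coefficient of a binary CD map."""
    n = tp + tn + fp + fn                      # total number of pixels
    oe = fp + fn                               # overall errors
    oa = 1.0 - oe / n                          # overall accuracy
    chance = (tn + fn) * (tn + fp) + (tp + fp) * (tp + fn)
    kc = (n * (tp + tn) - chance) / (n ** 2 - chance)
    return {"OE": oe, "OA": oa, "KC": kc}

# HOC2RF on the Neimeng dataset (Table 3): TP, TN, FA, MD.
m = cd_metrics(tp=79009, tn=1535646, fp=3164, fn=2181)
print(m["OE"], round(m["OA"], 4), round(m["KC"], 4))
```

Rounded to four decimals, the computed OA and KC agree with the 0.9967 and 0.9655 listed for this row, and OE equals 5345.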
In addition, the running time of each algorithm is an important criterion, and it was recorded to compare the computational cost of the different algorithms. The nine comparative methods and the proposed HOC2RF were all run on a computer with an Intel(R) Core(TM) i7-9750H 2.59 GHz processor and 16 GB RAM.
The parameter m, used in FCM and RFLICM to adjust the fuzziness of the membership, was set to 2. The other parameters of the compared algorithms and of our algorithm were determined experimentally, and only the results obtained with the best parameters are reported for performance assessment. The weights of the pairwise potentials of MRF, CRF, HOCRF, and the proposed HOC2RF were set to 4/7/3, 5/8/2, 5/7/1, and 8/9/1, respectively, where each triple lists the values for the Neimeng/Texas/Poyang River datasets.
The weights of the higher-order potential of HOCRF [21] were set to 3, 2, and 1 for the three datasets, respectively. For our method, there is no need to set the weight of the higher-order clique potential, which is fixed at 1 (see Equation (11)). There are five parameters in FCCRF and IFCCRF: the two weights of the Gaussian kernels (w1 and w2), and the control parameters of nearness, similarity, and smoothness (θα, θβ, and θγ). Like the two weights, the three control parameters θα, θβ, and θγ are dimensionless. The values of the parameters in FCCRF and IFCCRF used in the experiments are shown in Table 2.

4.2. Result and Analysis

The CD results in this study are presented in two ways: the CD maps in a graphical format and the quantitative indicators in a tabular format. Figure 6, Figure 7 and Figure 8 demonstrate the CD maps of different methods on the three datasets: (a)–(j) were produced by CVA, SCM, RFLICM, MRF, CRF, FCCRF, IFCCRF, HOCRF, INLPG, and the proposed HOC2RF approach, respectively. Black stands for the correctly detected no-change pixels, white stands for the correctly detected change pixels, red stands for the MD pixels, whereas yellow stands for the FA pixels. Table 3, Table 4 and Table 5 list the quantitative indicators of different CD maps for the three datasets: The unit of time is the second (s), and the other indicators are dimensionless.
As shown in Figure 6, Figure 7 and Figure 8, CVA and SCM provide complementary CD maps for all three datasets: The change maps obtained by CVA contain a large number of FA errors (yellow areas) but a small number of MD errors (red areas), whereas the maps generated by SCM have small yellow FA areas but large red MD areas (Figure 6a,b, Figure 7a,b and Figure 8a,b). This observation indicates that CVA and SCM yield complementary change information and shows the potential to enhance CD performance by fusing them.
In terms of the other seven comparative algorithms, for the Neimeng dataset, HOCRF and INLPG yield better CD results than RFLICM, MRF, CRF, FCCRF, and IFCCRF (Figure 6c–i and Table 3). However, the map of HOCRF still contains a few apparent yellow FA errors, whereas the INLPG’s map includes some apparent red MD areas. For the Texas dataset, CRF and IFCCRF generate better CD results than other comparative methods (Figure 7c–i and Table 4).
However, some obvious red MD errors still exist at the boundary of the change regions in their CD maps (Figure 7e,g). For the Poyang River dataset, FCCRF and IFCCRF perform better than RFLICM, MRF, CRF, HOCRF, and INLPG (Figure 8c–i and Table 5). Nevertheless, the maps of FCCRF and IFCCRF have many MD errors.
By integrating FCM, evidence theory, and the novel higher-order clique CRF model (developed in this study), the proposed HOC2RF CD approach first combines the complementary change information coming from the perspective of vector magnitude and angle (direction) and then utilizes the spatial correlation of images at both pixel and object levels to enhance the performance of CD. HOC2RF performs better than the nine benchmark algorithms and produces the most accurate change maps for all the three datasets (Figure 6, Figure 7 and Figure 8).
Table 3, Table 4 and Table 5 demonstrate the quantitative superiority of the proposed HOC2RF CD method. It yields the lowest OE and highest KC for all three datasets. For example, for the Neimeng dataset, its KC value is 0.9655, which is 36.18, 5.88, 25.56, 20.45, 12.05, 9.40, 6.53, 4.39, and 4.83 percentage points higher than those of CVA, SCM, RFLICM, MRF, CRF, FCCRF, IFCCRF, HOCRF, and INLPG, respectively. For the Texas dataset, its OE is 11,136 pixels, which is at least 7600 pixels lower than those of the nine alternative methods.
In terms of computation time, INLPG takes much more time than the other methods because it has a complex process of generating DIs. The proposed HOC2RF has a slightly higher computation time requirement than CVA, SCM, MRF, CRF, FCCRF, IFCCRF, and HOCRF. RFLICM takes more time than our method for the first two datasets but less time for the third dataset.
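The quoted KC gains are absolute differences between Kappa values, expressed in percentage points. The short Python sketch below recomputes them from the KC column of Table 3; the dictionary and variable names are hypothetical.

```python
# KC values for the Neimeng dataset, copied from Table 3.
kc = {
    "CVA": 0.6037, "SCM": 0.9067, "RFLICM": 0.7099, "MRF": 0.7610,
    "CRF": 0.8450, "FCCRF": 0.8715, "IFCCRF": 0.9002, "HOCRF": 0.9216,
    "INLPG": 0.9172, "HOC2RF": 0.9655,
}

# Gain of HOC2RF over each benchmark, in percentage points.
gain = {
    name: round((kc["HOC2RF"] - value) * 100, 2)
    for name, value in kc.items()
    if name != "HOC2RF"
}
smallest = min(gain, key=gain.get)  # benchmark closest to HOC2RF
```

The smallest gap is the one to HOCRF (4.39 percentage points) and the largest is the one to CVA (36.18 percentage points).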

5. Discussion

5.1. Enhancing Process of HOC2RF

As shown in Table 3, Table 4 and Table 5, for all three datasets, the proposed HOC2RF method outperforms the nine benchmark methods in terms of both OE and KC. Furthermore, the consistent results across the three datasets demonstrate the robustness of HOC2RF to some extent.
In this subsection, the Neimeng and Texas datasets are taken as examples to analyze and discuss the enhancing process of HOC2RF. To this end, Table 6 demonstrates the CD results produced by evidence theory, SHOC2RF, CVA-HOC2RF and the proposed HOC2RF. The CD results of evidence theory were obtained by combining the ones of CVA and SCM with evidence theory. SHOC2RF and CVA-HOC2RF can be viewed as two special cases of HOC2RF, which are used to analyze the effects of using the higher-order clique potential (16) and using the two complementary DIs, respectively. In SHOC2RF, the higher-order clique potential (16) is replaced with the higher-order potential (object term) in [21]. CVA-HOC2RF only uses the DI produced by CVA, removing the DI from SCM.
By comparing the CD results generated by CVA, SCM, evidence theory, and the proposed HOC2RF in Table 3, Table 4 and Table 6, it can be seen that HOC2RF enhances the CD performance in two stages:
(1) Combining the CD results from CVA and SCM using evidence theory. For the Neimeng dataset, the CD results of the evidence theory are better than those of CVA, RFLICM, MRF, CRF, and FCCRF, which use only one DI (Table 3 and Table 6). For the Texas dataset, evidence theory performs better than CVA, SCM, and RFLICM (Table 4 and Table 6). Although the evidence theory’s CD results are slightly worse than those from SCM for Neimeng dataset, the FA errors of CVA and the MD errors of SCM are both significantly reduced by the fusion step. This results in more balanced FA and MD errors in the CD result of evidence theory, and thus makes it easier to further improve the performance of CD by using the CRF model. In addition, the advantages of using two complementary DIs also can be seen by comparing the CD results of CVA-HOC2RF and HOC2RF (Table 6). HOC2RF that fuses two DIs from CVA and SCM produces much better CD results than CVA-HOC2RF that only uses the CVA DI. For example, for the Neimeng dataset, the value of KC increases from 0.9276 for CVA-HOC2RF to 0.9655 for HOC2RF.
(2) Improving the fused CD results of evidence theory by utilizing the HOC2RF model. For both Neimeng and Texas datasets, HOC2RF outperforms MRF, CRF, FCCRF, IFCCRF, and HOCRF (Table 3, Table 4 and Table 6), demonstrating the superiority of the HOC2RF model. In addition, by comparing the SHOC2RF and HOC2RF rows in Table 6, we can see that, the proposed higher-order clique potential (16) performs much better than the higher-order potential used in [21]. For instance, for the Texas dataset, the KC value increases from 0.9128 for SHOC2RF to 0.9518 for HOC2RF. This is mainly because the higher-order potential in [21] only considers a single object, ignoring the dependence of the neighboring objects, whereas the proposed higher-order clique potential (16) uses an object clique consisting of an object and its neighboring objects in both feature and location spaces.

5.2. Parameter Comparison of Random Field Models

This subsection compares the parameters used in MRF, CRF, FCCRF, IFCCRF, HOCRF, and the proposed HOC2RF. Only one parameter (λ) needs to be set to implement the proposed HOC2RF, the same as for MRF and the traditional pairwise CRF. The parameter λ tunes the weight of the pairwise potential. In general, a small λ causes low MD errors but leaves a large amount of noise, whereas a large one removes some noise but may miss some detailed changes. HOCRF [21] requires two parameters, the weights of the pairwise potential and of the higher-order potential (object term). For both FCCRF and IFCCRF, there are five parameters to be set: the two weights of the Gaussian kernels (w1 and w2), and the control parameters of nearness, similarity, and smoothness (θα, θβ, and θγ). Given the above analysis, the proposed HOC2RF needs much less parameter tuning than FCCRF, IFCCRF, and HOCRF.

6. Conclusions

In this paper, a novel, unsupervised CD method was proposed by developing a higher-order clique CRF model, termed HOC2RF. For the observation field, HOC2RF further introduces the vector angle change of two temporal images compared with the existing CRF-based CD methods, which mainly utilize the vector magnitude change. For the labeling field, HOC2RF uses FCM and evidence theory to fuse the two complementary types of change information at the decision level to create an initial CD map.
Moreover, HOC2RF defines a novel higher-order clique potential based on a properly designed clique of objects. The clique potential considers the interactions between neighboring objects in both feature and location spaces. As a consequence, HOC2RF can combine the complementary change information coming from the perspective of vector magnitude and angle and utilize the spatial-context information of images at both the pixel and object levels effectively.
Three case studies verified the effectiveness and advantages of the proposed HOC2RF approach. The Kappa coefficient/overall accuracy values of HOC2RF were 0.9655/0.9967, 0.9518/0.9910, and 0.7845/0.9651, respectively, which are better than those of the nine benchmark methods (CVA, SCM, RFLICM, MRF, CRF, FCCRF, IFCCRF, HOCRF, and INLPG). For example, the Kappa coefficient values of HOC2RF were at least 4.39, 3.31, and 4.29 percentage points higher than those of the nine methods on the three datasets, respectively.
HOC2RF has only one parameter and therefore needs much less parameter tuning than HOCRF, FCCRF, and IFCCRF. Theoretically, this article contributes to CD development by proposing the idea of using the interaction between neighboring objects in both feature and location spaces to enhance the CD performance. Methodologically, we presented a method to construct a higher-order clique, developed a higher-order clique potential function, and proposed a novel CD method, HOC2RF.
The proposed method has two limitations: (1) It has a slightly higher computation time requirement than the existing CRF CD methods because it needs to combine two DI images and compute the higher-order clique potential function. (2) Though it has only one parameter, it still requires parameter tuning.
Future work can focus on the following two directions. (1) To further automate HOC2RF, additional work can be conducted on the automatic determination of the only parameter λ used in HOC2RF. (2) Additional research can be conducted on how to define new higher-order clique potential functions.

Author Contributions

Conceptualization, W.F., P.S. and T.D.; methodology, W.F. and P.S.; software, W.F.; validation, W.F., P.S. and T.D.; writing—original draft preparation, W.F. and P.S.; writing—review and editing, W.F., P.S., T.D. and Z.L.; supervision, P.S.; project administration, W.F. and P.S.; funding acquisition, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Project No.: 41901341).

Data Availability Statement

All datasets presented in this study are available upon request from the corresponding author.

Acknowledgments

The authors would like to thank Wang for providing the Poyang River dataset and Volpi for providing the Texas dataset. The authors are also grateful to the editors and anonymous reviewers for their valuable comments and insightful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hussain, M.; Chen, D.; Cheng, A.; Wei, H.; Stanley, D. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS J. Photogramm. Remote Sens. 2013, 80, 91–106.
  2. Lv, Z.; Liu, T.; Benediktsson, J.A.; Falco, N. Land cover change detection techniques: Very-high-resolution optical images: A review. IEEE Geosci. Remote Sens. Mag. 2022, 10, 44–63.
  3. Bruzzone, L.; Prieto, D.F. Automatic analysis of the difference image for unsupervised change detection. IEEE Trans. Geosci. Remote Sens. 2000, 38, 1171–1182.
  4. Hao, M.; Zhou, M.; Jin, J.; Shi, W. An advanced superpixel-based Markov random field model for unsupervised change detection. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1401–1405.
  5. Shao, P.; Shi, W.; Liu, Z.; Dong, T. Unsupervised change detection using fuzzy topology-based majority voting. Remote Sens. 2021, 13, 3171.
  6. Patra, S.; Ghosh, S.; Ghosh, A. Histogram thresholding for unsupervised change detection of remote sensing images. Int. J. Remote Sens. 2011, 32, 6071–6089.
  7. Shao, P.; Shi, W.Z.; He, P.F.; Hao, M.; Zhang, X.K. Novel approach to unsupervised change detection based on a robust semi-supervised FCM clustering algorithm. Remote Sens. 2016, 8, 264.
  8. Ghosh, A.; Mishra, N.S.; Ghosh, S. Fuzzy clustering algorithms for unsupervised change detection in remote sensing images. Inf. Sci. 2011, 181, 699–715.
  9. Du, P.J.; Liu, S.C.; Gamba, P.; Tan, K.; Xia, J.S. Fusion of difference images for change detection over urban areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 1076–1086.
  10. Celik, T. Unsupervised change detection in satellite images using principal component analysis and k-means clustering. IEEE Geosci. Remote Sens. Lett. 2009, 6, 772–776.
  11. Lv, Z.; Liu, T.; Shi, C.; Benediktsson, J.A. Local histogram-based analysis for detecting land cover change using VHR remote sensing images. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1284–1287.
  12. Bazi, Y.; Melgani, F.; Al-Sharari, H.D. Unsupervised change detection in multispectral remotely sensed imagery with level set methods. IEEE Trans. Geosci. Remote Sens. 2010, 48, 3178–3187.
  13. Hao, M.; Zhang, H.; Shi, W.; Deng, K. Unsupervised change detection using fuzzy c-means and MRF from remotely sensed images. Remote Sens. Lett. 2013, 4, 1185–1194.
  14. Hedjam, R.; Kalacska, M.; Mignotte, M.; Nafchi, H.Z.; Cheriet, M. Iterative classifiers combination model for change detection in remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6997–7008.
  15. Lv, P.; Zhong, Y.; Zhao, J.; Jiao, H.; Zhang, L. Change detection based on a multifeature probabilistic ensemble conditional random field model for high spatial resolution remote sensing imagery. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1965–1969.
  16. Zhong, P.; Wang, R. Learning conditional random fields for classification of hyperspectral images. IEEE Trans. Image Processing 2010, 19, 1890–1907.
  17. Cao, G.; Zhou, L.; Li, Y. A new change-detection method in high-resolution remote sensing images based on a conditional random field model. Int. J. Remote Sens. 2016, 37, 1173–1189.
  18. Lafferty, J.; McCallum, A.; Pereira, F.C. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning 2001 (ICML 2001), Williamstown, MA, USA, 28 June–1 July 2001; pp. 282–289.
  19. Kumar, S. Discriminative random fields: A discriminative framework for contextual interaction in classification. In Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France, 13–16 October 2003; pp. 1150–1157.
  20. Zhou, L.; Cao, G.; Li, Y.; Shang, Y. Change detection based on conditional random field with region connection constraints in high-resolution remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 3478–3488.
  21. Lv, P.Y.; Zhong, Y.F.; Zhao, J.; Zhang, L.P. Unsupervised change detection based on hybrid conditional random field model for high spatial resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4002–4015.
  22. Zhuang, H.; Deng, K.; Fan, H.; Yu, M. Strategies combining spectral angle mapper and change vector analysis to unsupervised change detection in multispectral images. IEEE Geosci. Remote Sens. Lett. 2016, 13, 681–685.
  23. Cao, G.; Li, X.; Zhou, L. Unsupervised change detection in high spatial resolution remote sensing images based on a conditional random field model. Eur. J. Remote Sens. 2016, 49, 225–237.
  24. Shao, P.; Yi, Y.; Liu, Z.; Dong, T.; Ren, D. Novel multiscale decision fusion approach to unsupervised change detection for high-resolution images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  25. Li, H.; Li, M.; Zhang, P.; Song, W.; An, L.; Wu, Y. SAR image change detection based on hybrid conditional random field. IEEE Geosci. Remote Sens. Lett. 2014, 12, 910–914.
  26. Shi, S.; Zhong, Y.; Zhao, J.; Lv, P.; Liu, Y.; Zhang, L. Land-use/land-cover change detection based on class-prior object-oriented conditional random field framework for high spatial resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16.
  27. Carvalho, O.A., Jr.; Guimarães, R.F.; Gillespie, A.R.; Silva, N.C.; Gomes, R.A. A new approach to change vector analysis using distance and similarity measures. Remote Sens. 2011, 3, 2473–2493.
  28. Dempster, A.P. Upper and lower probabilities induced by a multivalued mapping. Ann. Math. Statist. 1967, 38, 325–339.
  29. Shafer, G.A. A Mathematical Theory of Evidence; Princeton University Press: Princeton, NJ, USA, 1976.
  30. Shao, P.; Shi, W.; Hao, M. Indicator-Kriging-integrated evidence theory for unsupervised change detection in remotely sensed imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 4649–4663.
  31. Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1981.
  32. Lei, T.; Jia, X.; Liu, T.; Liu, S.; Meng, H.; Nandi, A.K. Adaptive morphological reconstruction for seeded image segmentation. IEEE Trans. Image Processing 2019, 28, 5510–5523.
  33. Wang, W.; Shen, J. Higher-order image co-segmentation. IEEE Trans. Multimed. 2016, 18, 1011–1021.
  34. Volpi, M.; Camps-Valls, G.; Tuia, D. Spectral alignment of multi-temporal cross-sensor images with automated kernel canonical correlation analysis. ISPRS J. Photogramm. Remote Sens. 2015, 107, 50–63.
  35. Wang, Q.; Yuan, Z.; Qian, D.; Li, X. GETNET: A general end-to-end 2-D CNN framework for hyperspectral image change detection. IEEE Trans. Geosci. Remote Sens. 2018, 57, 3–13.
  36. Gong, M.G.; Zhou, Z.Q.; Ma, J.J. Change detection in synthetic aperture radar images based on image fusion and fuzzy clustering. IEEE Trans. Image Processing 2012, 21, 2141–2151.
  37. Krähenbühl, P.; Koltun, V. Efficient inference in fully connected CRFs with Gaussian edge potentials. Adv. Neural Inf. Processing Syst. 2011, 24, 109–117.
  38. Sun, Y.L.; Lei, L.; Li, X.; Tan, X.; Kuang, G.Y. Structure consistency-based graph for unsupervised change detection with homogeneous and heterogeneous remote sensing images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–21.
  39. Gong, M.; Su, L.; Jia, M.; Chen, W. Fuzzy clustering with a modified MRF energy function for change detection in synthetic aperture radar images. IEEE Trans. Fuzzy Syst. 2014, 22, 98–109.
Figure 1. The flowchart of the proposed HOC2RF CD method.
Figure 2. (a) An example of the 2-D feature space for a simple object-level image and (b) an example of the location space for a simple object-level image.
Figure 3. (a) Image of 2006, (b) image of 2011, and (c) reference map.
Figure 4. (a) Image of August, (b) image of September, and (c) reference map.
Figure 5. (a) Image of May, (b) image of December, and (c) reference map.
Figure 6. CD maps produced by different methods for the Neimeng dataset.
Figure 7. CD maps produced by different methods for the Texas dataset.
Figure 8. CD maps produced by different methods for the Poyang River dataset.
Table 1. The comparison of unsupervised CD methods based on CRF.
| Category | References | Advantages | Limitations |
|---|---|---|---|
| Pairwise CRF | Cao et al. [23] | Easy to implement; having only one parameter | Only considering a small local neighborhood and therefore failing to fully use the spatial-context information of images; only using the magnitude change of spectral vectors |
| Pairwise CRF | Lv et al. [15] | Combining three features of DI for unary potential; having only one parameter | Only considering a small local neighborhood and therefore failing to fully use the spatial-context information of images; only using the magnitude change of spectral vectors |
| Pairwise CRF | Shao et al. [24] | Fusing three-scale DIs for unary potential; improving pairwise potential with a spatial attraction model; having only one parameter | Only considering a small local neighborhood and therefore failing to fully use the spatial-context information of images; only using the magnitude change of spectral vectors |
| FCCRF | Cao et al. [17] | Considering all pairs of pixels in the whole image | Demanding much more parameter tuning; only using the magnitude change of spectral vectors |
| HOCRF | Zhou et al. [20]; Lv et al. [21] | Utilizing the spatial-context information in both local neighborhoods and image objects | Only considering a single object and ignoring the dependence between neighboring objects; only using the magnitude change of spectral vectors |
Table 2. The values of the parameters of FCCRF and IFCCRF used in the experiments.
| Dataset | Method | w1 | w2 | θα | θβ | θγ |
|---|---|---|---|---|---|---|
| Neimeng | FCCRF | 8 | 4 | 80 | 10 | 30 |
| Neimeng | IFCCRF | 2 | 1 | 50 | 50 | 80 |
| Texas | FCCRF | 6 | 1 | 5 | 20 | 20 |
| Texas | IFCCRF | 3 | 1 | 30 | 5 | 40 |
| Poyang River | FCCRF | 1 | 1 | 80 | 10 | 10 |
| Poyang River | IFCCRF | 1 | 1 | 10 | 80 | 10 |
Table 3. Quantitative indicators for CD maps on the Neimeng dataset.
Table 3. Quantitative indicators for CD maps on the Neimeng dataset.
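| Methods | MD | FA | TP | TN | OE | OA | KC | Time (s) |
|---|---|---|---|---|---|---|---|---|
| CVA | 3400 | 87,835 | 77,790 | 1,450,975 | 91,235 | 0.9437 | 0.6037 | 10.39 |
| SCM | 12,280 | 1174 | 68,910 | 1,537,636 | 13,454 | 0.9917 | 0.9067 | 3.30 |
| RFLICM | 4356 | 53,102 | 76,834 | 1,485,708 | 57,458 | 0.9645 | 0.7099 | 37.94 |
| MRF | 2255 | 43,478 | 78,935 | 1,495,332 | 45,788 | 0.9718 | 0.7610 | 11.90 |
| CRF | 2384 | 24,634 | 78,806 | 1,514,176 | 27,018 | 0.9833 | 0.8450 | 14.85 |
| FCCRF | 10,425 | 9261 | 70,765 | 1,529,549 | 19,686 | 0.9878 | 0.8715 | 4.33 |
| IFCCRF | 7970 | 7362 | 73,220 | 1,531,448 | 15,332 | 0.9905 | 0.9002 | 7.22 |
| HOCRF | 4027 | 8375 | 77,163 | 1,530,435 | 12,402 | 0.9923 | 0.9216 | 21.62 |
| INPLG | 9557 | 2701 | 71,633 | 1,536,109 | 12,258 | 0.9924 | 0.9172 | 4058.78 |
| HOC2RF | 2181 | 3164 | 79,009 | 1,535,646 | 5345 | 0.9967 | 0.9655 | 26.93 |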
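The derived columns in Tables 3 to 5 can be recomputed from the four confusion counts. The sketch below assumes the usual definitions, which this section does not restate: MD (missed detections) and FA (false alarms) sum to the overall error OE, OA is the fraction of correctly labeled pixels, and KC is Cohen's kappa coefficient. The function name is hypothetical; the input is the HOC2RF row of the Neimeng table (MD = 2181, FA = 3164, TP = 79,009, TN = 1,535,646).

```python
def cd_metrics(md, fa, tp, tn):
    """Return (OE, OA, KC) for a binary change map from its confusion counts."""
    n = md + fa + tp + tn                  # total number of pixels
    oe = md + fa                           # overall error: all misclassified pixels
    oa = (tp + tn) / n                     # overall accuracy
    # Agreement expected by chance, from the row and column marginals.
    pe = ((tp + fa) * (tp + md) + (tn + md) * (tn + fa)) / (n * n)
    kc = (oa - pe) / (1 - pe)              # Cohen's kappa
    return oe, oa, kc

# HOC2RF row of the Neimeng table.
oe, oa, kc = cd_metrics(md=2181, fa=3164, tp=79009, tn=1535646)
print(oe, round(oa, 4), round(kc, 4))
```

Under these assumed definitions the computed values agree with the tabulated OE, OA, and KC for this row. Because unchanged pixels dominate all three datasets, OA stays high even for weak methods, which is why KC separates the methods more clearly.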
Table 4. Quantitative indicators for CD maps on the Texas dataset.
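| Methods | MD | FA | TP | TN | OE | OA | KC | Time (s) |
|---|---|---|---|---|---|---|---|---|
| CVA | 24,823 | 31,056 | 107,046 | 1,076,547 | 55,879 | 0.9549 | 0.7677 | 4.86 |
| SCM | 59,124 | 2648 | 72,745 | 1,104,955 | 61,772 | 0.9502 | 0.6770 | 2.53 |
| RFLICM | 27,531 | 20,366 | 104,338 | 1,087,237 | 47,897 | 0.9614 | 0.7918 | 26.56 |
| MRF | 18,097 | 14,788 | 113,772 | 1,092,815 | 32,885 | 0.9735 | 0.8589 | 9.11 |
| CRF | 14,124 | 6457 | 117,745 | 1,101,146 | 20,581 | 0.9834 | 0.9104 | 8.01 |
| FCCRF | 16,890 | 5515 | 114,979 | 1,102,088 | 22,405 | 0.9819 | 0.9012 | 2.28 |
| IFCCRF | 12,338 | 6449 | 119,531 | 1,101,154 | 18,787 | 0.9848 | 0.9187 | 7.02 |
| HOCRF | 18,197 | 5053 | 113,672 | 1,102,550 | 23,250 | 0.9812 | 0.8968 | 15.00 |
| INPLG | 92,221 | 19,771 | 39,648 | 1,087,832 | 111,992 | 0.9096 | 0.3731 | 3553.57 |
| HOC2RF | 8664 | 2472 | 123,205 | 1,105,131 | 11,136 | 0.9910 | 0.9518 | 20.96 |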
Table 5. Quantitative indicators for CD maps on the Poyang River dataset.
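| Methods | MD | FA | TP | TN | OE | OA | KC | Time (s) |
|---|---|---|---|---|---|---|---|---|
| CVA | 347 | 8286 | 9351 | 93,599 | 8633 | 0.9226 | 0.6443 | 0.63 |
| SCM | 3169 | 930 | 6529 | 100,955 | 4099 | 0.9633 | 0.7416 | 0.62 |
| RFLICM | 434 | 6581 | 9264 | 95,304 | 7015 | 0.9371 | 0.6922 | 2.07 |
| MRF | 1222 | 6182 | 8476 | 95,703 | 7404 | 0.9336 | 0.6605 | 0.71 |
| CRF | 2144 | 3787 | 7554 | 98,098 | 5931 | 0.9468 | 0.6889 | 0.92 |
| FCCRF | 3479 | 945 | 6219 | 100,940 | 4424 | 0.9604 | 0.7167 | 0.23 |
| IFCCRF | 3489 | 862 | 6209 | 101,023 | 4351 | 0.9610 | 0.7200 | 1.81 |
| HOCRF | 1105 | 6244 | 8593 | 95,641 | 7349 | 0.9341 | 0.6653 | 3.54 |
| INPLG | 3840 | 23,738 | 5858 | 78,147 | 27,578 | 0.7528 | 0.1924 | 1317.21 |
| HOC2RF | 1727 | 2168 | 7971 | 99,717 | 3895 | 0.9651 | 0.7845 | 4.94 |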
Table 6. CD results obtained by evidence theory, SHOC2RF, CVA-HOC2RF and HOC2RF.
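| Methods | Neimeng MD | Neimeng FA | Neimeng OE | Neimeng KC | Texas MD | Texas FA | Texas OE | Texas KC |
|---|---|---|---|---|---|---|---|---|
| Evidence | 5967 | 11,375 | 17,342 | 0.8910 | 29,200 | 6974 | 36,174 | 0.8342 |
| SHOC2RF | 1213 | 9436 | 10,649 | 0.9341 | 17,722 | 1738 | 19,460 | 0.9128 |
| CVA-HOC2RF | 2497 | 9104 | 11,601 | 0.9276 | 14,908 | 1600 | 16,508 | 0.9267 |
| HOC2RF | 2181 | 3164 | 5345 | 0.9655 | 8664 | 2472 | 11,136 | 0.9518 |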
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Fu, W.; Shao, P.; Dong, T.; Liu, Z. Novel Higher-Order Clique Conditional Random Field to Unsupervised Change Detection for Remote Sensing Images. Remote Sens. 2022, 14, 3651. https://0-doi-org.brum.beds.ac.uk/10.3390/rs14153651
