1. Introduction
According to statistics, the world has a total forest aera of 4.06 billion hectares (ha), which is 31% of the total land area [
1]. The acquisition of forest understory digital elevation model (DEM) has been a challenging problem. Accurate DEM data can assist decision makers in effectively planning forest land utilization, afforestation, and logging activities, thereby achieving the conservation and optimal utilization of forest resources [
2]. Meanwhile, the forest understory DEM is crucial for identifying landslides and debris flows [
3]. Through the analysis of accurate DEM data, we are able to pinpoint the location and path of potential landslides and debris flows, aiding in the implementation of effective disaster prevention measures.
Light Detection and Ranging (LiDAR) and Polarimetric Interferometric Synthetic Aperture Radar (PolInSAR) are the primary means in remote sensing for obtaining DEM under the forest. LiDAR can penetrate the forest canopy and has good accuracy in estimating DEM [
4,
5,
6]. However, it is limited by the atmosphere, mist and clouds. Due to high cost and spatial discontinuity, it is difficult to achieve global and long-term coverage [
7]. PolInSAR combines the sensitivity of InSAR to the spatial and height distribution of scatterers with the sensitivity of PolSAR to the shape and orientation of scatterers. It has advantages including all-weather and all-season functionality, as well as high temporal and spatial resolution, making it widely used in obtaining DEM [
8,
9,
10,
11,
12,
13,
14].
To obtain the required physical parameters, it is necessary to establish a model that correlates the PolInSAR data with these parameters. In recent years, more and more models have emerged for retrieving vegetation parameters from PolInSAR data. Yamada [
15] assumed that there were two uncorrelated scattering centers in the forest area, located at the top of the canopy and on the ground, respectively. This assumption transformed the problem into the direction-of-arrival (DOA) estimation. The interference phase of the ground and canopy can be obtained by ESPRIT (Estimation of Signal Parameters via Rotational Invariance Technique) [
16,
17,
18,
19]. However, because of the strong volume scattering component, the detection accuracy of this technique becomes worse for dense forest regions [
20,
21]. Based on the theory of electromagnetic scattering, Treuhaft et al. proposed the Random Volume over Ground (RVoG) model [
22,
23]. The model simplifies the forest region into a layer of randomly oriented uniform particles on the ground and derives the expression for complex interference coherence. It is a function of four physical parameters: the forest height that determines the thickness of the volume layer; the mean extinction coefficient that represents the attenuation of electromagnetic waves through the canopy; the ground phase related to the underlying topography; and the ground-to-volume amplitude ratio (GVR) that varies with polarization channels.
Solving the RVoG model parameters is a difficult problem, especially for the ground phase, which is the basis for solving the other parameters and an important parameter for inverting the DEM. In the RVOG model, the complex interference coherence can be expressed as a line in the complex plane. This line intersects the unit circle with two points, one of which represents the ground phase and the other has no practical physical meaning. This phenomenon is known as the double-candidate effect of ground phase. Based on this geometric property, Cloude et al. [
24] propose a three-stage inversion (TSI) method. This method divides the inversion process into three stages: (1) least squares line fit; (2) vegetation bias removal; (3) height and extinction estimation. From the perspective of estimation theory, Tabb et al. [
25,
26] proposed using a maximum likelihood estimation (MLE) to retrieve forest parameters. This approach begins with the observation that the covariance matrix of the PolInSAR data follows a complex Wishart distribution. By maximizing the log-likelihood function, an objective function related solely to the ground phase is obtained. MLE provides the ability to completely separate volume scattering from surface scattering. However, the objective function regarding ground phase in MLE usually presents two indistinguishable peaks, corresponding to the two intersections of the coherence line and the unit circle in the TSI, making this method also suffers from the double-candidate effect.
Due to the double-candidate effect, the solution of the ground phase becomes complex. The three-stage method often uses some assumptions based on physical scattering to solve this problem, such as selecting the intersection point that is far away from the complex coherence under HV polarization as the ground phase [
24]. However, these methods do not fully meet the assumptions in practice. Therefore, the accuracy of parameter inversion is limited. Some researchers propose solving the ground phase using the relationship between the vertical wavenumber (kz) and two candidate phases [
27], which performs well with airborne data but is less effective with certain spaceborne data. For MLE, Huang et al. [
28] proposed the maximum a posteriori inversion method for the RVoG model by modeling the ground phase as a Gaussian distribution (MAPG) with the mean of the topographic phase computed from the external DEM. This method effectively suppressed the double-candidate effect. However, the assumption of Gaussian distribution cannot satisfy the circular data, which may lead to incorrect results when the topographic phase appears near the phase jump point.
To make the MAP method more robust, this paper proposes a maximum a posteriori with the von Mises distribution as the prior (MAPV) method for solving the RVoG model. The method models the ground phase as a Von Mises distribution [
29,
30,
31], which effectively addresses the problems caused by modeling the ground phase with the Gaussian distribution, making the estimation of the ground phase more continuous and accurate. Additionally, since the solution method for MAP requires exhaustive searching of the phase, this will significantly decrease computational efficiency. This paper proposes a four-step optimization (FSO) method for solving ground phase based on the characteristics of the MAPV objective function, employing a gradient-based optimization approach [
32,
33]. This method markedly improves solving efficiency without sacrificing accuracy.
This paper is organized as follows.
Section 2 provides a detailed description of the experimental materials.
Section 3 introduces the RVoG model and provides a brief overview of the three-stage inversion process.
Section 4 presents the proposed MAPV method, and analyzes and simulates its CRLB. Furthermore, a four-step optimization method is proposed. Results and discussions are presented in
Section 5. Finally, conclusions are given in
Section 6.
2. Materials
The test area is located in the central region of Sardinia (
N and
E), which is in the southwest of Italy. The area is heavily forested, with the predominant vegetation being holm oak woods. The test area consists of mountains and plains, with ground elevations ranging from 0 to 760 m above mean sea level.
Figure 1a depicts the Google Earth optical image of the area projected onto the SAR coordinate system, with the dark green area indicating vegetated regions.
Paired quad-polarized data used for interferometry were acquired by the SAOCOM-1A and SAOCOM-1B satellites, respectively. The SAOCOM (Satélite Argentino de Observación COn Microondas) satellite series is developed by the Argentine Space Agency and comprises two satellites: SAOCOM 1A and SAOCOM 1B. This mission provides L-band (approximately 1.275 GHz) full polarimetric data with spatial resolutions ranging from 10 to 100 m in both real-time and stored modes, and an incidence angle between 20 to 50°.
Figure 1b displays a Pauli-based PolSAR image of the test area, with the RGB channels representing HH-VV, HV, and HH+VV, respectively.
Height checkpoints are from Ice, Cloud and land Elevation Satellite-2 (ICESat-2) data [
34]. This satellite carries the Advanced Topographic Laser Altimeter System (ATLAS), which introduced single photon detection technology for the first time in Earth’s elevation measurement, significantly improving the data acquisition rate for terrain detection. This article employs the land and vegetation height data (ATL08) from ICESat-2, with a vertical uncertainty of 0.2 m for flat terrain and 2.0 m for mountainous terrain [
35].
The external DEMs selected for the experiment are the ALOS Global Digital Surface Model “ALOS World 3D - 30 m” (AW3D30) and the Shuttle Radar Topography Mission 30 m (SRTM30), both datasets having an approximate resolution of 30 m.
Figure 1c,d show the Alos DEM projected to the SAR coordinate system and the topographic phase, respectively.
Figure 2 shows the flowchart of the proposed DEM inversion method. The entire inversion framework consists of (a) ground phase estimation based on MAPV and (b) DEM generation.
4. Maximum a Posteriori Estimation of the Ground Phase
Given the observed value
, the maximum a posteriori (MAP) estimate of the parameter
can be expressed as
where
is the likelihood function, which represents the performance of the data set, and
represents a prior distribution of
.
It is known that the observed coherence matrix of PolInSAR after multilook (
2) follows a complex wishart distribution
where
N is the number of looks,
is a constant for normalization,
is the determinant operator, and
is the trace operator.
A priori knowledge about the ground phase is obtained by introducing an external DEM.
4.1. A Prior Probability Model of the Ground Phase
With InSAR geometry, the topographic phase can be solved by an external DEM [
38]
The ground phase in PolInSAR is the same as the topographic phase, both representing the contribution of the underlying ground to the interferometric phase. Considering the influence of vertical accuracy and resolution, there is an error between the topographic phase calculated from an external DEM and the actual ground phase. The ground phase can be represented as:
where
denotes the error component. Some researchers modeled the
as a Gaussian distribution with zero mean and variance dependent on the external DEM. Therefore, the ground phase follows a Gaussian distribution with a mean of the
. The maximum a posteriori estimation method was used to solve the RVoG model parameters. This method effectively suppressed the double-candidate effect, making the two local maxima of the objective function clearly distinguishable [
28].
However, since the estimated ground phase is the principal value of the phase within the interval after the periodic wrapping, and differ numerically by but represent the same point in the complex plane. This point is called the phase jump point. The assumption of the Gaussian distribution does not take into account the characteristic of phase wrapping, so it is discontinuous near the phase jump point, which can lead to errors in phase estimation.
Figure 3a,c show the Gaussian distribution curves in the Cartesian coordinate system and on the unit circle of the complex plane, respectively, for one phase period. In the complex plan, it is evident that among the two candidates (
and
),
is closer to
, so it should have a higher probability of being the ground phase. However, in
Figure 3a, the probability of
being the ground phase is greater than that of
, which is obviously incorrect.
The modeling of the ground phase must consider the characteristic of phase wrapping while retaining the selectivity of the Gaussian distribution. At present, probability models have not found much application to circular data. Common ones include: Uniform, Cardioid, Wrapped Normal, Wrapped Cauchy, and Von Mises distributions [
29,
39]. This paper employs the Von Mises distribution to model ground phase, offering significant advantages in this context. Firstly, its circular nature enables better adaptation to periodic data, akin to Cardioid and Wrapped Normal distributions, but more flexible than Wrapped Cauchy distribution. Secondly, the Von Mises distribution possesses parametric flexibility, enabling adjustment of its parameters to accommodate different degrees of data concentration. Furthermore, the mathematical form of the Von Mises distribution is simple and easy to understand, facilitating straightforward computation and interpretation, while also possessing robust statistical properties such as mean, variance, etc., making it more manageable and analyzable in practical applications [
30,
40].
The probability density function of the Von Mises distribution is
where
represents the degree of concentration (similar to the inverse of variance),
is the modified Bessel function of the first kind of order 0, which is used for normalization.
As can be seen in
Figure 3b,d, the Von Mises distribution effectively solves the problem of discontinuity at phase jump point in the Gauss distribution, making the ground phase estimation more robust.
4.2. MAP with Von Mises Distribution as Prior
By changing
and
in (
5) to matrix form
and inserting them into (
4), we can obtain the matrix form of the RVoG model in the same form of (
3), and
where
are functions of the forest height
and extinction coefficient,
and
are the volume and ground coherence matrices, respectively. The volume only coherence can be obtained by
.
It is worth noting that , , and are the ensemble averages of the data and should be distinguished from the ideal mathematical expectations , , and .
Combining the complex Wishart distribution (
8) and the prior probability distribution (
11), the objective function of the MAPV estimation of the RVoG model is given by (
7)
where the first term is the objective function of the MLE. Maximising it yields [
25]
where
and
represents the angle of the intersection of the unit circle with the extension of the line segment connecting 1 and
in the complex plane, geometrically.
Substituting (
11) and (
15) into (
14), the objective function with respect to the
and
is obtained as
Differentiating the objective function with respect to
and
, respectively, and setting them to 0 yields
Combining the two equations of (
18), we can obtain the expression of
with respect to
Figure 4 shows the difference between the objective functions of MAPG and MAPV. By observing the positions of
,
, and
in the complex plane in
Figure 3, it can be seen that the expected peak of the objective function should appear near
, which is consistent with the MAPV objective function.
According to (
14) and (
19), the ground phase is given by
which is related to the underlying topography. The DEM can be obtained from part (b) of
Figure 2.
4.3. The Cramer–Rao Lower Bound Analysis
To evaluate the validity of the method, it is necessary to analyze its Cramer–Rao Lower Bound (CRLB), which provides the minimum bound of variance if it is unbiased.
Note that the above derivation of MAPV is done under the noise-free assumption. In practice, the PolInSAR system inevitably introduces noise, the noise-affected R can be expressed as [
41]
where
and
are the noise covariance matrices of the receivers at each end of the baseline. Assuming that the noise between channels is uncorrelated,
can be expressed as
where
denotes the noise variance of each channel. Note that since the noise of the two receivers is uncorrelated, the noise has no effect on
.
The CRLB can be derived from the inverse of the Fisher information. The derivation of the Fisher information is presented in
Appendix A for clarity, which can be expressed as [
42,
43]
Figure 5 shows the simulation results using the data from [
44] with
and
(approximately equivalent to
).
It is worth noting that the variance of the TSI method falls below the theoretical CRLB curve when the SNR is reduced below about −14 dB. This is due to the fact that the phase error is calculated in the sense of phase wrapping, whereas the CRLB is calculated using the local curvature of the distribution function [
45]. From
Figure 5, it can be seen that the inclusion of the prior effectively reduces the CRLB, and also demonstrates the superiority of the MAPV method over the TSI method for ground phase estimation.
4.4. Four-Step Optimized Solution for MAPV
To determine the ground phase, which corresponds to maximizing the objective function (
17), the typical approach involves conducting an exhaustive search (ES) over
. Although this ensures accuracy in finding the solution, it leads to high computational complexity and low computational efficiency.
Since the objective function of MAPV with respect to
is a continuous curve, as shown in
Figure 4. Based on this property, this paper employs the gradient optimization method to enhance the efficiency of solving ground phase. The gradient of the MAPV objective function is presented in
Appendix B.
Given the significant variations in the objective function for each pixel, the method requires addressing two distinct problems:
Initial phase: Due to the presence of two peaks in the objective function, a single initial point is prone to fall into the local maximum. Therefore, it is necessary to choose an appropriate strategy to ensure that the results converge to the global maximum.
Global learning rate: Because of the large differences between the objective functions of different pixels, it is very important to choose the global learning rate so that the method can adapt to all objective functions.
4.4.1. Initial Phase
Taking into account the characteristics of the MAPV objective function depicted in
Figure 6, this paper introduces a four-step solving approach.
Step 1: Gradient descent using as initial phase. Get the between the two peaks.
Step 2: Give the a in the opposite direction to as a .
Step 3: Perform gradient ascent separately using this and to obtain the two peak values and their respective phases.
Step 4: Compare the magnitudes of these two peak values and select the phase corresponding to the larger peak value as the final result.
4.4.2. Global Learning Rate
In the FSO method, we employed both gradient ascent and gradient descent algorithms. For convenience, this paper converts the gradient ascent process into a gradient descent process, with the specific conversion procedure given below.
The gradient descent algorithm is based on the Root Mean Square Propagation (RMSProp) [
33]. Although RMSprop adjusts the learning rate during iterations, the global learning rate determines the speed, convergence, and eventual performance of the model during training. If the learning rate is too high, it may lead to an unstable training process and difficulties in convergence. Conversely, a low learning rate might result in slower training speed, requiring more iterations to converge to an optimal solution.
Due to significant differences in the objective functions of various pixels, selecting a uniform global learning rate poses a challenge. Therefore, this paper proposes improvements to RMSprop to reduce the model’s reliance on the global learning rate to accommodate all objective functions, as shown in Algorithm 1.
When the current gradient g has opposite signs with the minimum gradient , or when the current function value is greater than the minimum function value , it indicates that the step size might be too large, causing oscillations. In such cases, the minimum phase is necessarily situated between and , closer to the side with the lower function value. In the fourth line of Algorithm 1, a weighted computation based on function values is applied for this condition, assigning higher weight to the lower function value when updating the phase. As the updated phase is very close to the minimum phase, it is advisable to moderately decrease the learning rate for better convergence.
Algorithm 1 assumes the objective function to be positive, but neglecting certain constant terms during derivation may result in the objective function potentially becoming negative. This could result in an error on line 4 during execution. This problem can be solved by preprocessing the coherence matrix, let
where
is the diagonal matrix formed by the eigenvalues of
.
Algorithm 1: Framework of improved RMSprop algorithm. |
Require: Global learning rate , Decay rate , Initial phase |
Initialize: , , , , , |
|
In FSO, the gradient ascent process can be converted into a gradient descent process. In the gradient ascent task, calculate the function value
f and gradient
g at that point, then let
Then, Algorithm 1 can be used for solving.
Although the method requires performing three iterations of gradient descent, the improvement made in RMSprop based on the characteristics of the objective function allows each gradient descent to converge to the extremum within ten iterations. Furthermore, the precision obtained by this algorithm often surpasses that achieved through exhaustive search (<1 degree).
5. Results
5.1. Evaluation Indicator
The evaluation indicators are mean error
m, root mean square error (RMSE)
r, and accuracy
where
is the idicative function, defined as
5.2. Efficiency of FSO for MAPV
As mentioned above, the discrepancy in the objective function between pixels poses a challenge in the selection of the global learning rate when using traditional RMSprop. Algorithm 1 proposed in this paper is an improvement specifically designed to address this issue, and it is necessary to discuss its convergence here.
Figure 7 presents the convergence performance of RMSprop and Algorithm 1 under different global learning rates. As depicted in
Figure 7a,d, under conditions of a relatively small global learning rate, both methods exhibit robust convergence. However, this comes at the cost of requiring more than 20 iterations. As the global learning rate increases, Algorithm 1 demonstrates a pronounced acceleration in convergence. In contrast, the RMSprop fails to converge to extrema due to oscillations.
While a larger appears to contribute to faster convergence in this experiment, it is important to note that, due to the potentially steep peaks in the MAPV objective function, excessively high global learning rates may cause the algorithm to jump out of these peaks. Therefore, it is also not advisable to set the global learning rate too high.
Table 1 presents the efficiency of the four-step optimization method for the ground phase of the test area. Here, ’ES’ denotes achieving 1-degree accuracy through exhaustive search. It is evident that the proposed method is 5.6 times faster than exhaustive search while ensuring accuracy. On average, convergence is achieved in only 24.5 steps per pixel.
5.3. Results of DEM Inversion in the Test Area
Given that the Gaussian distribution is not suitable for modeling the ground phase, MAPG encounters discontinuities when solving for ground phase. These discontinuities are propagated to the DEM during phase inversion, resulting in discontinuities within the DEM. This section will employ the Alos DEM as a benchmark to evaluate the extent of this discontinuity. Despite vertical accuracy differences between Alos DEM and actual values, it remains an essential reference in assessing DEM continuity.
As shown in
Figure 8, the phase estimated by MAPG and MAPV show similar fringes that are basically consistent with the topographic phase computed by the external DEM. However, the comparison shows that there are significant differences between the two methods in the vicinity of the phase jump point (the blue and red boundary region in the phase diagram). In this region, the MAPG typically transitions with an intermediate value, resulting in a phase discontinuity that causes elevation jumps when inverting the DEM.
The DEM inverted by the two methods are shown in
Figure 9a,b. Due to the large elevation range in the test area, it is not easy to observe differences over an entire image. Therefore, we select two areas for comparison, corresponding to the two areas in
Figure 8. Different scales are set for the two areas. As shown in
Figure 9, the MAPG-inverted DEM shows a strong discontinuity in some regions that are highly correlated with the vicinity of the phase jump point in
Figure 8a. From the elevation change curves of the marked line segments in Area 1 and Area 2 shown in
Figure 10a,b, it can be seen that the jump between adjacent pixels in the MAPG-inverted DEM reaches 20 m in some areas, which does not correspond to the actual situation. On the other hand, the MAPV-inverted DEM shows better results without significant elevation jumps.
Table 2 evaluates the continuity of Area 1 and Area 2. The criterion used is the accuracy
, where discrepancies greater than the set threshold compared to the Alos DEM are deemed as discontinuities. It is evident that compared to the MAPG-inverted DEM, the MAPV-inverted DEM exhibits better continuity.
5.4. Performance Assessment of the Proposed Method
In this section, IceSat-2 data are used to confirm the effectiveness of MAPV in generating DEM for forested areas.
Figure 11 shows the distribution of IceSat-2 data in the test forest area, where the color of the dots indicates the magnitude of the difference between the calculated elevation at that point and the LiDAR data.
Figure 11a,b show the difference between the Alos and SRTM DEM with the LiDAR DEM, respectively. With reference to the colorbar on the right, it can be seen that the Alos and SRTM DEM are generally higher than the LiDAR DEM.
Figure 11c shows the difference between the TSI-inverted DEM and the LiDAR DEM. It is worth noting that, as mentioned above, the assumptions made by TSI in solving the double-candidate effect cannot satisfy all conditions, and errors will occur in some low-coherence regions, resulting in elevation jumps in these regions when inverting the DEM.
Figure 11d shows the difference between the MAPV-inverted DEM and the LiDAR DEM, and it can be seen that the DEM obtained by this method is closer to the LiDAR DEM.
Table 3 shows the mean error (ME) and the root mean square error (RMSE) of the different source DEM compared to the LiDAR DEM. Compared to the DEM of Alos and SRTM, the RMSE of the MAPV-inverted DEM is improved by 23.1% and 24.1%, respectively. It should be noted that due to the elevation jumps in the TSI-inverted DEM, the elevation in some areas may differ significantly from the LiDAR DEM, resulting in a large RMSE.
Figure 12 shows a histogram of the distribution of elevation differences between different source DEM. Obviously, compared to the DEM of Alos, SRTM, and TSI, the error center of MAPV-inverted DEM is closer to 0.
The method proposed in this paper validates the feasibility of using spaceborne L-band PolInSAR for retrieving forest understory DEM. The rapid development of spaceborne PolInSAR technology has expanded the coverage and increased the update speed of PolInSAR data, enabling the generation of large-scale, accurate DEM. High-precision, rapidly updated DEM plays a crucial role in fields such as environmental monitoring and disaster monitoring, providing precise data support for assessing forest coverage and terrain changes, among other issues. However, solving for the ground phase requires high-quality data, with the interferometric pair needing suitable temporal and spatial baselines, as well as good coherence. Low-quality data can result in inaccurate phase solutions, leading to errors in the inverted DEM. Furthermore, future research needs to further explore the applicability of this method under different climatic conditions or forest types and address potential challenges, such as considering the impact of varying climatic conditions on data quality and investigating interference effects and L-band penetration depth in different types of forests.
6. Conclusions
This paper proposes a MAPV inversion algorithm for RVoG model, and generates high-precision radar DEM data in forested area. The method is based on maximum a posteriori estimation and models the ground phase as a Von Mises distribution, with its mean derived from the topographic phase computed from external DEM. This method effectively overcomes the double-candidate effect, addresses the phase jump issue caused by modeling ground phases as a Gaussian distribution, and ensures the continuity and accuracy of ground phase solutions. Additionally, this paper derives, analyzes, and simulates the CRLB for MAPV. The results indicate that the introduction of prior information significantly reduces the CRLB of this method. In response to the characteristics of the MAPV objective function, this paper proposes a FSO method to enhance the efficiency of ground phase determination. FSO obtains two candidate phases separately based on the gradient descent algorithm and selects the one with the higher function value as the ground phase, as the gradient descent algorithm requires setting a global learning rate and there exist significant differences in the objective functions among different pixels. This paper improves the traditional RMSprop algorithm by reducing its dependency on the global learning rate, enabling it to handle various objective functions.
In this paper, the L-band spaceborne SAOCOM data are used to estimate the PolInSAR understory DEM to evaluate and verify the effectiveness of the method. The results demonstrate that:
Compared to the traditional exhaustive search method, FSO significantly improves computational efficiency without compromising accuracy.
The MAPV method effectively addresses the issue of elevation jumps in DEM caused by the discontinuity in ground phase solutions by MAPG.
Using IceSat-2 data as a benchmark, the DEM of the test forest area is compared with the DEMs of Alos, SRTM, and TSI. The results show that MAPV has better estimation performance, with improvements in both mean error (ME) and root mean square error (RMSE).