Peer-Review Record

Hyperanalytic Wavelet-Based Robust Edge Detection

Remote Sens. 2021, 13(15), 2888; https://0-doi-org.brum.beds.ac.uk/10.3390/rs13152888

by Alexandru Isar *, Corina Nafornita and Georgiana Magu

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 21 June 2021 / Revised: 17 July 2021 / Accepted: 20 July 2021 / Published: 23 July 2021

(This article belongs to the Special Issue The Future of Remote Sensing: Harnessing the Data Revolution)

Round 1

Reviewer 1 Report

The manuscript presents a novel combination of denoising and edge detection in images. Although the work is supported by an extensive bibliography, most of the references are quite old. No mention is made of recent work, such as deep learning.

The experimental results are impressive. However, the text does not clearly distinguish between Gaussian and speckle noise filtering. Furthermore, as far as SAR speckle noise is concerned, the authors do not make clear the type of detection (linear or quadratic?) or the number of looks. It should be noted that this is extremely important in the derivation of the denoising filter. The authors use a homomorphic filter to deal with the multiplicative noise. But, for one-look, linear detection, the distribution of the speckle is Rayleigh and taking the log leads to a Fisher-Tippett distribution. However, nothing is explained about this.

Also, the examples of SAR and Sonar images lack information about the sensors, the type of detection and the number of looks.

A minor detail: Ref. [56] was published in 1978, instead of 1977.

Author Response

Comments and Suggestions for Authors

Following a remark of the editor, we have separated the Results and Discussion into two individual sections to improve the clarity of the results presentation. We have added newer references, [65-77], on deep learning approaches to remote sensing despeckling, in a new last paragraph of the Discussion section, and we have compared these modern despeckling methods with the proposed method. The content of this last paragraph is the following.

“Deep learning in remote sensing has become a modern direction of research, but it is mostly limited to the evaluation of optical data [65]. The development of more powerful computing devices and the increase in data availability have led to substantial advances in machine learning (ML) methods. The use of ML methods allows remote sensing systems to reach high performance in many complex tasks, e.g. despecklization [66-77], object detection, semantic segmentation or image classification. These advancements are due to the capability of deep neural networks to automatically learn suitable features from images in a data-driven approach, without manually setting the parameters of specific algorithms. Deep neural networks act as universal function approximators, using training data to learn a mapping between an input and the corresponding desired output. In the following, we present some ML methods for despeckling, found in two very recent references. The first one is [65]. Inspired by the success of image denoising using a residual learning network architecture in the computer vision community [66], a residual learning Convolutional Neural Network (CNN) for SAR image despeckling, named SAR-CNN, was introduced in [67]. It is a 17-layer CNN that learns to subtract speckle components from noisy images in a homomorphic framework. As in the case of the proposed despeckling method, the homomorphic approach is performed before and after feeding images to a denoising kernel, which in the case of [67] is the neural network itself. In this case, multiplicative speckle noise is transformed into an additive form and can be recovered by residual learning, where the log-speckle noise is regarded as the residual. An input log-noisy image is mapped identically to a fusion layer via a shortcut connection and then added element-wise to the learned residual image to produce a log-clean image. Afterwards, the denoised image is obtained by inverting the logarithm. The SAR-CNN is trained on simulated single-look SAR images. However, to ensure better fidelity to the actual statistics of the SAR signal and speckle, it is retrained on real SAR data, using multilooked images as approximate clean references.
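The homomorphic framework described above (logarithm, additive-noise denoising kernel, inverse logarithm) can be illustrated with a minimal sketch. It is not the SAR-CNN of [67] nor the HWT-based kernel of the manuscript: the denoising kernel here is a plain moving average used only as a stand-in, the signal is one-dimensional, and the names `box_filter`, `homomorphic_despeckle` and `variance` are hypothetical helpers introduced for this illustration.

```python
import math
import random


def box_filter(signal, radius=2):
    """Stand-in additive-noise denoiser: a moving average.

    Assumption: this replaces the real denoising kernel (a CNN in [67],
    a wavelet-domain MAP filter in the manuscript) purely for illustration.
    """
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def homomorphic_despeckle(noisy, denoiser=box_filter):
    """Homomorphic pipeline: log -> additive-noise denoiser -> exp."""
    log_noisy = [math.log(v) for v in noisy]   # multiplicative -> additive
    log_clean = denoiser(log_noisy)            # any additive-noise kernel
    return [math.exp(v) for v in log_clean]    # back to the original domain


def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


random.seed(0)
clean = [10.0] * 200 + [40.0] * 200            # two flat regions, one edge
looks = 4                                      # assumed number of looks
# Unit-mean gamma speckle, a common model for multi-look intensity data.
noisy = [s * random.gammavariate(looks, 1.0 / looks) for s in clean]
restored = homomorphic_despeckle(noisy)

# Compare the fluctuation inside the first flat region before and after.
print(variance(noisy[10:190]), variance(restored[10:190]))
```

Note that the logarithm shifts the mean of the speckle, so a complete homomorphic method also needs a bias correction, which this sketch omits.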

Wang et al. [68] proposed the ID-CNN for SAR image despeckling, which can directly learn denoised images via a component-wise division-residual layer with skip connections. Homomorphic processing is therefore avoided, but at a final stage the noisy image is divided by the learned noise to yield the clean image. This approach makes full sense, considering the multiplicative nature of the noise. Of course, a pointwise ratio of images may easily produce outliers when estimated noise values are close to zero. However, a tanh nonlinearity layer placed right before the output performs a soft thresholding, thus avoiding serious shortcomings.

In [69], Yue et al. proposed a novel deep neural network architecture specifically designed for SAR despeckling. It models both speckle and signal itself as random processes, so as to better take into account the homogeneous/heterogeneous nature of the observed cell. Working in the log-domain, the pdf of the observed signal can be regarded as the result of a convolution between the pdfs of clean signal (unknown) and speckle. The authors of [69] used a CNN to extract image features and reconstruct a discrete RADAR Cross Section (RCS) pdf. It is trained by a hybrid loss function which measures the distance between the actual SAR image intensity pdf and the estimated one that is derived from convolution between the reconstructed RCS pdf and prior speckle pdf.

The unique distribution of SAR intensity images was also taken into account in [70], which proposes a different loss function containing three terms computed between the true and the reconstructed image. These terms are: the common L2 loss; the L2 difference between the gradients of the two images; and the Kullback-Leibler divergence between the distributions of the two images. The three terms are designed to emphasize the spatial details, the identification of strong scatterers, and the speckle statistics, respectively.

In [71], the problem of despeckling was tackled using a time series of images. The authors utilized a multi-layer perceptron with several hidden layers to learn non-linear intensity characteristics of training image patches.

Again using single images instead of time series, the authors of [72] proposed a deep encoder-decoder CNN architecture focused on a weakness of CNNs, namely feature preservation. They modified U-Net [73] to accommodate speckle statistical features.

Another notable CNN approach was introduced in [74], where the authors used an NLM algorithm in which the weights for pixel-wise similarity measures were assigned using a CNN. The network takes as input a patch extracted from the original domain image and outputs a set of filter weights adapted to the local image content. Two types of CNN were conceived to implement this task. The first type is a standard CNN with 12 convolutional layers, while the second type is a 20-layer CNN which also includes two N3 layers to exploit image self-similarities. These layers associate each input feature with the set of its K nearest neighbors, which can be exploited in subsequent nonlocal processing steps. Both types of CNN are trained on synthetic data and on real multilooked SAR images. The results of this approach, called CNN-NLM, are impressive, with feature preservation and speckle reduction being clearly observable.

One of the drawbacks of the aforementioned algorithms is the requirement of noise-free and noisy image pairs for training. Often, those training data are simulated using optical images with multiplicative noise. This is of course not ideal for real SAR images. Therefore, one elegant solution is the noise2noise framework [75], where the network only requires two noisy images of the same area. The authors of [75] proved that the network is able to learn a clean representation of the image provided that the noise distributions of the two noisy images are independent and identical. This idea has been employed in SAR despeckling in [76]. The authors use multitemporal SAR images of the same area as the input to the noise2noise network. To mitigate the effect of the temporal change between the input SAR image pairs, the authors multiply the original loss function by a patch similarity term.

Some of the ML methods for SAR image despeckling already mentioned are also discussed in the second very new reference, [77]. We find Table 2 in [77] very useful: it presents 31 relevant deep learning based despeckling methods with their main features. Among these methods, we can find the SAR-CNN, the ID-CNN and the CNN-NLM methods.

Based on [65] and [77], it can be observed that most ML based despeckling methods employ CNN-based architectures with single images of the scene for training; they either output the clean image in an end-to-end fashion or propose residual-based techniques to learn the underlying noise model.

The despeckling method proposed in this paper is faster and requires fewer computational resources than the ML based methods, owing to the sparsity of the HWT (in accordance with property A) mentioned at the end of Section 1.1). It does not use any training and does not require any learning methodology.

With the availability of large archives of time series thanks to the Sentinel-1 mission, an interesting direction for ML based despeckling is to exploit the temporal correlation of speckle characteristics. Acting in the space domain, these methods require the statistical model of the speckle. In contrast, the proposed despeckling method acts in the Hyperanalytic transform domain. Due to the statistical properties D) and C) of the wavelet coefficients, mentioned at the end of Section 1.1, the proposed method does not require knowledge of the speckle’s statistical model. Due to the statistical property B) of the wavelet coefficients, mentioned at the end of Section 1.1, the proposed despeckling method already exploits the spatial correlation of detail wavelet coefficients. We consider it a good idea to also exploit the temporal correlation of detail wavelet coefficients, using the available archives of time series.

One critical issue of both the ML based and the proposed despeckling methods is over-smoothing. Many of the CNN-based methods and the proposed despeckling method perform well in terms of speckle removal but are not able to preserve weak edges. This is quite problematic for despeckling high resolution SAR images of urban areas, and for robust edge detection in particular.

Another problem in supervised deep learning-based despeckling techniques is the lack of ground truth data. This problem affects the proposed despeckling method as well, because some objective quality indicators, such as the PSNR or the SSIM, cannot be computed without a reference. A solution could be to use optical images of the same scenes, but such images are frequently not accessible or differ in parameters such as size or contrast. In many studies on ML based despecklization this problem is more acute, because the training data set is built by corrupting optical images with multiplicative noise. This is far from realistic for despeckling applied to real SAR data. Therefore, despeckling in an unsupervised manner would be highly desirable and worth attention.”

The experimental results are impressive.

Thank you very much.

However, there is not a clear distinction between Gaussian and speckle noise filtering in the text.

The case of Gaussian noise is treated in Section 2.2, Global MAP Filters Applied in Wavelet Domain, in equations (20) and (23)-(25) for the marginal ASTF filter and in equations (21), (26), (27) and (28) for the bishrink filter. The case of speckle noise is treated in Section 2.3, Multiplicative noise. Due to the statistical properties D) and C) of the detail wavelet coefficients (listed at the end of Section 1.1), the marginal ASTF and the bishrink MAP filters use the same expressions of the priors as in the case of Gaussian noise. The statistical model of the HWT of the log of the speckle noise (log(ni)) is Gaussian for the marginal ASTF and bivariate Gaussian for the bishrink, and the statistical model of the HWT of the log of the noiseless component of the remote sensing image (log(s)) is Laplacian for the marginal ASTF and bivariate symmetric Laplacian for the bishrink. To make the distinction between Gaussian and speckle noise filtering clearer, we have slightly modified the end of Section 2.3 by including the following text: “Considering that log(s) represents the noiseless component of the input image and that log(ni) represents the additive noise, we can apply the HWT-marginal ASTF or HWT-bishrink denoising associations, which are additive noise-denoising kernels. Due to the decorrelation effect of WTs, the statistical model of the HWT of log(ni) is Gaussian in the case of the HWT-marginal ASTF association and bivariate Gaussian in the case of the HWT-bishrink association. These priors were considered for log(ni) in the majority of the references on wavelet based despeckling already cited, for example [45] or [46]. Based on experiments, it was shown in different papers, for example [9], that the statistical model of the detail wavelet coefficients of noiseless images is characterized by a heavy-tailed distribution, for example a Laplacian, and that this model is invariant to different transformations, for example the logarithm. Therefore, the statistical model of the HWT of log(s) is Laplacian in the case of the HWT-marginal ASTF association and bivariate symmetric Laplacian in the case of the HWT-bishrink association.”
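To make the role of these priors concrete, the sketch below shows the textbook MAP shrinkage rules that follow from them: a soft threshold for a Laplacian signal prior with Gaussian noise, and the bivariate shrinkage (bishrink) rule for a bivariate symmetric Laplacian prior over a coefficient and its parent. These are the standard forms from the wavelet denoising literature; the manuscript's equations (20)-(28) are not reproduced in this record, so the paper's exact expressions and its local estimation of the variances may differ. The function names and the assumption that `sigma_n` (noise standard deviation) and `sigma` (signal standard deviation) are already known are illustrative.

```python
import math


def soft_threshold_map(y, sigma_n, sigma):
    """MAP estimate of a coefficient with a Laplacian prior (std sigma)
    observed in Gaussian noise (std sigma_n): soft thresholding with
    threshold sqrt(2) * sigma_n**2 / sigma."""
    t = math.sqrt(2.0) * sigma_n ** 2 / sigma
    return math.copysign(max(abs(y) - t, 0.0), y)


def bishrink(y_child, y_parent, sigma_n, sigma):
    """Bivariate shrinkage of a child coefficient given its parent,
    for a bivariate symmetric Laplacian prior and Gaussian noise.
    The threshold sqrt(3) * sigma_n**2 / sigma acts on the joint magnitude."""
    r = math.hypot(y_child, y_parent)
    if r == 0.0:
        return 0.0
    t = math.sqrt(3.0) * sigma_n ** 2 / sigma
    return max(r - t, 0.0) / r * y_child


# A small coefficient is zeroed by the marginal rule...
print(soft_threshold_map(0.5, sigma_n=1.0, sigma=1.0))
# ...but the same child survives in bishrink when its parent is large.
print(bishrink(0.5, 4.0, sigma_n=1.0, sigma=1.0))
```

The second call illustrates why the bivariate rule tends to preserve structure better: a weak coefficient is kept when its parent indicates that a real feature is present at that location.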

Furthermore, as far as SAR speckle noise is concerned, the authors do not make clear the type of detection (linear or quadratic?) or the number of looks. It should be noted that this is extremely important in the derivation of the denoising filter. The authors use a homomorphic filter to deal with the multiplicative noise. But, for one-look, linear detection, the distribution of the speckle is Rayleigh and taking the log leads to a Fisher-Tippett distribution. However, nothing is explained about this.

We have added an Appendix to explain the type of detection we considered and the statistics of speckle noise and of its logarithm.
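The reviewer's point about the log-domain statistics can be checked numerically. The sketch below is not taken from the Appendix (which is not reproduced in this record); it assumes the usual fully developed speckle model, in which one-look intensity speckle is unit-mean exponential and the linearly detected amplitude, its square root, is Rayleigh. Under that assumption the log-amplitude has mean -γ/2 and variance π²/24, where γ is the Euler-Mascheroni constant, so the log-speckle is neither zero-mean nor Gaussian.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)


random.seed(1)
n = 200_000
# One-look intensity speckle: unit-mean exponential (assumed model).
# The max() only guards against the negligible chance of log(0).
intensity = [max(random.expovariate(1.0), 1e-300) for _ in range(n)]
# Linear detection gives the amplitude sqrt(I), which is Rayleigh,
# so log(amplitude) = 0.5 * log(I).
log_amplitude = [0.5 * math.log(i) for i in intensity]

mean_log_a, var_log_a = mean_and_variance(log_amplitude)
print("mean:", mean_log_a, "theory:", -EULER_GAMMA / 2)
print("variance:", var_log_a, "theory:", math.pi ** 2 / 24)
```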

Also, the examples of SAR and Sonar images lack information about the sensors, the type of detection and the number of looks.

We have completed the introduction of Section 3, Results, with information about the sensors, the type of detection and the number of looks of the three real images used as examples of processing real remote sensing images. We have added the following text just before the beginning of Section 3.1: “We have also considered aerial SAR [58], SONAR [35] and SENTINEL-1 [59] images as real remote sensing images. We selected these types of images to be as different as possible, to highlight the universality of the proposed edge detection method. The aerial SAR image used in the first example of applying the proposed edge detection method to real remote sensing images was acquired with a NASA/JPL Airborne SAR (synthetic aperture RADAR) system mounted aboard a modified NASA DC-8 aircraft, using linear detection. During data collection, the plane flew at 8 kilometers above the average terrain height of the mountains in Switzerland. The number of looks of the acquired image, L, equals 2. A multi-look satellite SAR image is the object of the second example of processing real images. It is a Sentinel-1 Stripmap Ground Range Detected High resolution (GRDH) image. In the case of a Sentinel-1 Stripmap image, the SAR signal processor uses the complete data history and the full synthetic aperture to produce the highest possible resolution, which corresponds to a Single Look Complex (SLC) SAR image. The amplitude image corresponding to an SLC SAR image is obtained by quadratic detection. The intensity image corresponding to an SLC SAR image is obtained as the square of the amplitude image. Multiple looks may be generated by averaging intensity SLC SAR images over azimuth and/or range resolution cells. Ground Range Detected (GRD) products consist of focused SAR data that has been detected, multi-looked and projected to ground range using an Earth ellipsoid model. The Sentinel-1 Stripmap GRDH image used in the second example of applying the proposed edge detection method to real remote sensing images has a number of looks L=8. It represents a segment of the border between the Agulhas Current and the coast of South Africa. For the third example of applying the proposed edge detection method to remote sensing images we have selected a SONAR image. This sea-bed SONAR image shows the Swansea wreck stranded in Le Goulet Bay near Brest during World War I. It was acquired by a military SONAR system owned by GESMA and has a number of looks L=1.”
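The link between the number of looks and the strength of the speckle can be sketched with an idealized simulation. It assumes statistically independent one-look intensity images with exponential speckle over a homogeneous area, which real multi-looked products only approximate (neighbouring cells are correlated). The helpers `multilook` and `enl` and the reflectivity value are hypothetical; the ENL is estimated as the squared mean divided by the variance, which for this model is close to L.

```python
import random


def multilook(looks):
    """Average L co-registered one-look intensity images (equal-length lists)."""
    n = len(looks[0])
    return [sum(look[i] for look in looks) / len(looks) for i in range(n)]


def enl(region):
    """Equivalent number of looks of a homogeneous region: mean^2 / variance."""
    m = sum(region) / len(region)
    v = sum((x - m) ** 2 for x in region) / len(region)
    return m * m / v


random.seed(2)
reflectivity = 5.0   # assumed constant backscatter of the homogeneous area
pixels = 50_000

estimates = {}
for L in (1, 2, 8):  # the numbers of looks quoted for the three real images
    one_look_images = [
        [reflectivity * random.expovariate(1.0) for _ in range(pixels)]
        for _ in range(L)
    ]
    estimates[L] = enl(multilook(one_look_images))

for L, value in estimates.items():
    print(f"L = {L}: estimated ENL = {value:.2f}")
```

Under these assumptions the relative fluctuation of the intensity falls roughly as one over the square root of L, which gives some intuition for why images with few looks are the hard case for an edge detector.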

A minor detail: Ref. [56] was published in 1978, instead of 1977.

Thank you, we have corrected this date in the reference list.

To improve the understanding of our submission, we have added the following text to the Conclusions section: “The proposed denoising method achieves good Peak Signal to Noise Ratio (PSNR) results in the case of AWGN, outperforming the results obtained using the HWT-bishrink denoising association. Even in the case of synthesized speckle noise, our results are superior to those obtained with other denoising methods, such as the marginal ASTF-HWT denoising association (our first stage), the SA-WBMMAE method, the MAP-S algorithm and the PPB algorithm. The comparison results reported in this paper suggest that the proposed denoising method is competitive with some of the best wavelet-based results reported in the literature. It compares favorably with other despecklization systems acting in the spatial domain, for example the classical despecklization filters proposed by Lee, Kuan and Frost, or the MBD algorithm. Some of the ML based despecklization methods evoked in Section 4.3, which also act in the spatial domain, have better accuracy, but the proposed despeckling method is far faster and demands fewer computational resources than the ML based despecklization methods. It is therefore easier to embed the proposed despeckling method on portable devices. The principal advantage of the proposed despeckling method over the ML based despeckling methods lies in the fact that it acts in the wavelet domain and does not require knowledge of the speckle’s statistics.

Canny’s edge detector is not robust against AWGN and multiplicative speckle noise. The noise grains produce false edge pixels. The proposed edge detection method is robust against both AWGN and speckle noise. The results of this method do not contain false edges, because the proposed denoising system rejects all the noise grains. The three examples of applying the proposed despeckling method to real images have shown that the proposed despeckling system is useful when the speckle is strong, that is, when the original image has a small number of looks: 2 in the first example and 1 in the third example. The second example shows that, for a number of looks equal to 8, the robustness of Canny’s edge detector against noise is sufficient and the proposed despeckling system is no longer necessary. Therefore, for remote sensing images with a number of looks between 1 and 7, the proposed despeckling method improves the robustness of Canny’s edge detector.

The simulations for the proposed denoising method have shown good ENL results. The large ENL values obtained ensure the robustness against noise of the proposed edge detection method, eliminating the false edges produced by Canny’s edge detector when it is applied directly to the noisy image. However, increasing the ENL has the effect of oversmoothing the input image, which erases weak edges.

As further research directions, we intend to develop the equivalence between denoising based on ASTF and LASSO regularization in the more general framework of decomposition into dictionaries by imposing some useful constraints to the convex optimization problem, as for example the preservation of edges [25] and to exploit the temporal dependencies of the Sentinel-1 time series.”

We are grateful for the attention you have given to our paper.

Submission Date

21 June 2021

Date of this review

01 Jul 2021 22:03:51

Author Response File: Author Response.docx

Reviewer 2 Report

This paper proposes a two-stage denoising system acting in the Hyperanalytic Wavelet Transform domain against Gaussian and speckle noise. This denoising system is applied to improve the robustness of edge detection. Three classes of edge detectors, namely Gaussian-based detectors, zero-crossing edge detectors, and morphological edge detectors, are reviewed. All of the above detectors are noise sensitive. Thus, the aim of this paper is a new robust edge detector based on hyperanalytic wavelets. The image is denoised and then used for edge detection. This method seems to be somewhat creative. However, in the results shown in Tables 1, 2 and 3, the comparison between the proposed method and Canny’s detector is unfair, because the images used for Canny’s detector are not denoised. The performance of the proposed method should be compared based on the creative implementation instead of stacking all image processing steps. Besides, the authors should pay more attention to academic grammar, for example, “In [12] was introduced the concept of two stage denoising system”.

Author Response

Thank you for the appreciation. We have made some modifications to better highlight the creativity of the proposed method. Following a remark of the editor, we have separated the Results and Discussion into two individual sections, to improve the clarity of the results presentation.

Following a remark of Reviewer 1, to improve the design of our research, we have added newer references, [65-77], on deep learning approaches to remote sensing despeckling, in a new last paragraph of the Discussion section, and we have compared these modern despeckling methods with the proposed method.

For a better understanding of the proposed approach, we have also added an Appendix to explain the type of detection we considered and the statistics of speckle noise and of its logarithm.

However, in the results shown in Tables 1, 2 and 3, the comparison between the proposed method and Canny’s detector is unfair, because the images used for Canny’s detector are not denoised.

The Introduction presents a simplified description of the algorithmic steps of Canny’s edge detector. The first step of this description is:

To reduce the sensitivity of the edge detector to noise, a Gaussian filter is applied first.

So, a denoising mechanism implemented by Gaussian filtering is embedded in the structure of Canny’s edge detector. The aim of our paper is to test the robustness of Canny’s edge detector to Gaussian and speckle noise and to improve this robustness by using a supplementary denoising system. The association of the proposed denoising system with Canny’s edge detector represents the new edge detector proposed in this paper. The proposed denoising system is the first step of the proposed edge detector and Canny’s edge detector itself is the second step. Tables 1, 2 and 3 were created to test the robustness of Canny’s edge detector against Gaussian noise and to highlight the merits of the proposed denoising system. The first columns of Tables 1, 2 and 3 characterize the first step of the proposed edge detector. The increase in PSNR and the high values of the output SSIM demonstrate the high performance of the first step of the proposed edge detector in the case of Gaussian noise.

The performance of the proposed method should be compared based on the creative implementation instead of stacking all image processing steps.

Thank you. We have changed the heading of the fifth column in Tables 1, 2 and 3 from Second step to Final result, to compare the performance of the proposed edge detector with that of Canny’s edge detector based on the creative implementation of the proposed edge detector. The comparisons in Fig. 2 to Fig. 5 are based on the creative implementation of the proposed edge detector. The case of synthesized speckle noise is more difficult when the noiseless component of the image to be processed is not available, because objective quality measures such as the edge’s MSE cannot be computed. This is the case of the image in Fig. 6. The merits of the proposed edge detection method can be appreciated by visual inspection. The most difficult case is that of real remote sensing images. The merits of the proposed edge detection method can be easily observed in Fig. 7 and Fig. 10 by visual inspection. The case of the image in Fig. 9 is very tricky. The multilooking procedure used to generate this image makes the first step of the proposed edge detection method unnecessary, so the method reduces to the application of Canny’s edge detector alone.

Besides, Authors should pay more attention to academic grammar, for example, “In [12] was introduced the concept of two stage denoising system”.

Thank you for this remark. We have changed this sentence to the following form: “The concept of two stages denoising system was introduced in [12]”.

Submission Date

21 June 2021

Date of this review

07 Jul 2021 01:31:58

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

I am pleased with the modifications provided by the authors and I believe that the manuscript can be published now.

Reviewer 2 Report

Since the manuscript has been revised and the authors have persuaded me, I think this paper can be accepted in its present form.
