Article
Peer-Review Record

Assessing the Utility of Sentinel-1 Coherence Time Series for Temperate and Tropical Forest Mapping

by Ignacio Borlaf-Mena 1,*, Ovidiu Badea 2,3 and Mihai Andrei Tanase 1,2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 14 September 2021 / Revised: 19 November 2021 / Accepted: 23 November 2021 / Published: 27 November 2021
(This article belongs to the Special Issue SAR for Forest Mapping II)

Round 1

Reviewer 1 Report

This paper examines the use of long- and short-term coherence from C-band SAR time series to improve pixelwise forest/non-forest classifiers built from such SAR data (using Sentinel-1). The results are validated across a number of datasets, including a pixelwise forest classification based on GEDI LiDAR data.

This is a helpful contribution and overall a good paper, generally well written.

Two major issues:

- Lack of explanation of the classifier

It would be good to see more explanation of the classifier and justification for it. Why an SVM over other classifiers? Some discussion of how the training set should be built up would help. Was training data simply pooled across the two sites, or should it be specific to temperate/tropical forest? How did classification work for multiple classes in this instance?

- Somewhat unclear approach to validation

Some more explanation, with plots, of how the validation sets were chosen (exclusion masks, etc.) would be helpful; I found the description across the different datasets somewhat unclear. Figure 7 shows a land use classification: is this definitive, and how exactly was it arrived at? It was confusing to follow how the GEDI-derived classified points fed into it. Some discussion of temporal change in the land use classification would also help establish whether the changes described are real or erroneous; this argument was also quite hard to follow.

Generally just a few very minor issues with English language and style to correct - I have suggested some, but have not been exhaustive in listing them.

More detailed comments:

l20,21 - Maybe the issue here is less the overall accuracy (which has been reported before) than the new contribution of including the coherence measures, so the improvement due to their inclusion would be more relevant to mention here.

l34 have -> has

l89/90 Maybe there needs to be some distinction here between using SAR backscatter for detecting change in land use and using it for determining land use. As the authors note, backscatter is perhaps less helpful for classifying land use due to confounding behaviour; for detecting change in land use, however, it may be enough.

l90 Another issue here is that interferometric data require larger datasets (phase data, not just backscatter intensity, must be retained) and more processing to perform the interferometry. It is reasonable to ask: is this worth it? Perhaps there should be some introduction to the trade-off between the gain in classification from coherence and the extra costs involved.

l143 Figure needs a y-axis.

l145 There is no mention here of Sentinel-1 being C-band, perhaps to contrast with the ALOS PALSAR L-band and to note the difference in physical properties for detecting forest.

l155 remove 'in'

l186 There is not much discussion of the combination of ascending and descending orbits (I think from l148 you use both). Later, at l209, there is a discussion of masking layover and shadow; these areas presumably differ between ascending and descending orbits. Does this limit the number of usable pixels in the final analysis, based on the union of the masks? How does this work in practice for the analysis as written?

l202 I don't understand the window size oscillation: what does this mean? Why would the window need to change across the area, and what influence does this have on the results?

l203 The gamma^0 symbol is pixelated here; it may need to be typeset using LaTeX or Unicode.

l214 A z-score is standard deviations above or below the mean, not the median. The equivalent using a median is more complicated (and it is not clear from the text what is used here). See e.g. https://hausetutorials.netlify.app/posts/2019-10-07-outlier-detection-with-median-absolute-deviation/
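For reference, a minimal sketch of the median-based alternative described in the linked tutorial (the modified z-score built on the median absolute deviation); the function name and cutoff are illustrative, not taken from the manuscript:

```python
import numpy as np

def modified_z_scores(x):
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))   # median absolute deviation
    # 0.6745 (the 0.75 quantile of the standard normal) makes MAD
    # consistent with the standard deviation for normally distributed data
    return 0.6745 * (x - med) / mad

scores = modified_z_scores([1.0, 1.1, 0.9, 1.05, 5.0])
outliers = np.abs(scores) > 3.5        # a common cutoff for modified z-scores
```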

l216 I am not sure what basing things only on forest means in this context. Is the standard deviation used in the z-score based on forest pixels only?

l263 I can't see where to put this comment directly, so will mention it here. I don't see any justification for the use of an SVM. Why use this type of classifier? There should be some technical details about it (are the classification boundaries linear or nonlinear, etc.). How were multiple classes included in the classifier, and how was the decision made for final class membership (highest confidence?)?
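To illustrate the level of technical detail that would help, a minimal sketch assuming a scikit-learn-style RBF-kernel SVM; this is not necessarily the authors' actual setup, and the toy data are purely illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy data standing in for per-pixel SAR feature vectors.
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # nonlinear decision boundary
clf.fit(X_tr, y_tr)
# With this classifier, multiclass membership is decided by one-vs-one
# voting among class pairs; this is exactly the kind of detail to state.
labels = clf.predict(X_te)
```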

l267 What was the minimum number of pixels used for any sub-class?

l269 base -> based

l275 This procedure misses something significant, though. A z-score above 3 puts us 3 SDs above or below the mean, i.e. at the 99.7% location for a Normal distribution (with no guarantee the pixels here are distributed this way). But with 25,000 such pixels, roughly 75 points are expected in this tail and are not outliers; likewise, when selecting 5,000, about 15 are expected. So this procedure undersamples the tail of the distribution by declaring all the points there to be outliers.
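A quick back-of-the-envelope check of those counts, assuming normality purely for illustration:

```python
# Expected tail counts under the 68-95-99.7 rule.
tail_fraction = 1 - 0.997        # ~0.3% of points lie beyond |z| = 3
print(25000 * tail_fraction)     # ~75 legitimate tail points among 25,000 pixels
print(5000 * tail_fraction)      # ~15 among 5,000
```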

l284,285,286 What drives this choice of parameters? Could you comment on the sensitivity of the analysis to these parameters?

l288/289/290 I don't understand this weighting of votes: why should a larger coverage influence the result at a given pixel less? Some brief explanation would be good here.

l325 Classiffication -> Classification

l365 I'm not sure the alluvial diagram communicates the classifier's confusion very well. This would be easier to read in standard confusion matrix form.

l375-378 Is this a general feature, though? Does it not indicate that more care is needed when mixing data from different orbits?

l463-464 Again, for the discussion: is the increase worth the extra effort of using SLC data and the extra processing?

l494-495 This indicates that perhaps the critical thing would be to try to include spatial context in order to improve the classification.

l502-504 I wonder whether this indicates that the classifier used does not represent the variability well enough. This might present a good case for using a classifier with a more flexible decision boundary, such as a random forest.
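A minimal, self-contained sketch of the suggested alternative, assuming scikit-learn; the toy data and parameters are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy binary forest/non-forest stand-in data.
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=500, min_samples_leaf=5,
                            n_jobs=-1, random_state=0)
rf.fit(X, y)
proba = rf.predict_proba(X)    # per-class vote fractions, usable as confidence
```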

l524 I think this phrasing overemphasises the initial error rate; 'greatly reduced' seems very strong. Maybe this should be quantified in this paragraph.

l546 significative -> significant

l549 Although this suggests the GEDI-based land use classification is not as helpful for the current study as it could be.

l558 Reference [10], I believe, also suggests this discrimination is possible.

l582 requires -> require

l601-603 This is a more cautious and correct statement than the abstract; the abstract should perhaps be reworded to better reflect it.

Author Response

We would like to thank the reviewer for the time invested in providing comments on our manuscript. We have replied to the comments in the attached letter.

Author Response File: Author Response.docx

Reviewer 2 Report

This study compares three SAR feature sets (annual backscatter statistics, long-term coherence, and short-term coherence statistics) for classifications aimed at forest/non-forest differentiation over a Romanian and a Brazilian study area. The results are satisfactory; in some cases, 99% overall accuracy was achieved. An important finding is that forests can be classified with >92% accuracy even from GRD data. Also valuable is the finding that much better urban-forest separation can be achieved by including coherence information.

In the presented study, it would be good to include clearer information on the land cover datasets used to generate the reference datasets, and more precise information on the validation points (see below).

I have some questions and suggestions on the datasets used to generate the reference dataset:

  • The land cover datasets used have a certain accuracy of their own. As you are using them as both training and validation datasets, you must consider this accuracy when interpreting the results.
  • How were the land cover datasets combined? Some weighting should be given to the datasets where they disagree for a given pixel. Please describe this process/method in more detail.
  • Why did you choose these datasets? There are other datasets focusing on forest, e.g. the Copernicus High Resolution Layers and Hansen's Global Forest Change dataset; on urban areas, e.g. the Copernicus Urban Atlas; and yearly global LC datasets such as the Copernicus Global Land Cover Layers at 100 m resolution (2015-2019).
  • It would be good to include information about the accuracy of the different land cover classes of these LC datasets, or at least the overall accuracy of each database. This accuracy can differ by country (CLC) or, for global datasets, by region.
  • In any case, it would be useful to verify the accuracy of the GEDI data (at least the GEDI forest/non-forest data) against high-resolution imagery.
  • Indicate why it was sufficient to choose only a yearly average of S-1 and S-2 values as the reference dataset for the Brazilian site. Are there no seasonal effects at all?
  • It is not clear which datasets were used as reference datasets for the Brazilian site. According to Table 1, all datasets were used for the Romanian site, but the paragraph on lines 237-251 describes only the S-1/S-2 combination for the Brazilian site.
  • Please add to the discussion an explanation of the scientific contribution of this paper and the novelty of the methods used.

Other questions/suggestions:

Results described in the abstract do not fully correspond to the results in the Results and other sections. Please check this.

Have you considered using images from a specific period of the year for temperate forests, e.g. only images from the summer months? The winter period (snow, frozen conditions, low temperatures, leaf-off conditions) can cause differences in backscatter, and thus larger variation through the year, which can make differentiation from other LC classes more difficult.

Why were different DEMs used for the SAR data corrections? As far as I know, both NASADEM and the TanDEM-X DEM are available globally.

Section 4.2: please add detailed information about the changed area, at least the extent of change for each year. Also add information about the extent of the studied areas.

Comments to the text:

Figure 1 - add a map scale

Figure 2 - please add units and values to the y-axis; I guess it represents the number of pixels.

Tables 2-6: add units (e.g. %) where applicable

Figures 3 and 4 - did you really achieve a standard deviation of around -20? It should be a positive number...

line 153 - change "mages" to "images"

lines 209-217 - the definitions are not really clear; e.g. the sentence on lines 213-216 needs a comma.

lines 216-217 - the largest backscatter intensities occur only on slopes facing toward the radar sensor.

260-262 - this sentence is not necessary. It is clear that both conditions must be fulfilled to label a point as forest.

267-268 - Here you mention that 25,000 pixels were selected for each sub-class to show the variability in the data. You also mention that "pixels not included in the training sample were used as validation dataset". What if 25,001 pixels were available for a sub-class? Was only 1 pixel then used as validation data? On the other hand, on line 277 you mention that 5,000 random points were used for each class. As I understood it, these points were selected from the set of random points, so we still have no information about the validation dataset. It would be good to include exact information about it.

line 269 - Do you mean "based" instead of "base"

lines 288-290: please add a sentence along the following lines: "Detailed information about the methodology can be found in [...]".

lines 398-300: if you talk about classification stability, you should note that real changes could also have occurred in the studied period.

463 – Do you mean "as well as"?

518 – there is an opening parenthesis but no closing one

558 - there are several recent studies that have succeeded in forest classification using Sentinel-1

596 - delete the closing parenthesis

Author Response

We would like to thank the reviewer for the time invested in providing comments on our manuscript. We have replied to the comments in the attached letter.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The authors have improved the paper and addressed all the relevant comments. Just one point needs to be corrected:

If the standard deviation results were converted to decibel scale, please indicate this, at least in the captions of Figures 3 and 4.

Author Response

We would like to thank the reviewer for the advice provided. We have applied the suggested changes, along with some small modifications described in the attached document.

Author Response File: Author Response.docx
