Article
Peer-Review Record

Discriminative Feature Learning Constrained Unsupervised Network for Cloud Detection in Remote Sensing Imagery

by Weiying Xie, Jian Yang, Yunsong Li *, Jie Lei, Jiaping Zhong and Jiaojiao Li
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 11 December 2019 / Revised: 15 January 2020 / Accepted: 20 January 2020 / Published: 1 February 2020
(This article belongs to the Special Issue Remote Sensing Image Restoration and Reconstruction)

Round 1

Reviewer 1 Report

This paper presents a new method for detecting clouds in satellite imagery. Clouds, as well as their shadows on the ground, are a problem of major importance in satellite-based remote sensing. The method presented here seems to work well in detecting clouds, but not especially well in detecting their shadows.

The article is in some places ambiguous in describing the methodology, and the authors also use terms that are not properly written out. See my further comments below.

 

line 28 "...is of first importance preprocessing step..." -> "is an important preprocessing step"

line 33-34 "...the land cover in the same place does not change greatly..." Does this refer to real or interpreted land cover?

line 38 "Fisher integrated the morphological..." Should the citation be right after "Fisher"?

line 51 "...methods and techniques have made great efforts..."
I think there is somebody behind the methods who actually made the efforts.

line 66: term deep characteristics should be explained

line 67: comparative form is "more superior"

line 78 (and elsewhere): it is not clear to readers what latent space means

line 80: relatively sufficient ?

line 83: it is not clear to whom or which "them" refers

line 106: generation of which?

line 144: latent feature space should be explained, is that something else than latent space mentioned earlier?

line 160 "...to fool..." ?

line 185: For some reason there are lines that are not numbered after this.

line 190: It is not clear what "relative larger eigenvalue" is.

line 195: non-numbered lines below this line
line 209: These images are apparently samples or extracts of original Landsat images, since the original images are of different size.
line 223: On the GaoFen-5 satellite project homepage, the EMI instrument is named Environmental Monitoring Instrument.

line 256: It is not clear where the reference maps come from. Are they the same as the cloud reference provided by the Landsat data provider?


Figure 4: It is conspicuous that none of the presented methods seems able to detect cloud shadows, which are as much of a problem as the clouds themselves.
Table 1. In the case of clouds (or cloud shadows) errors of omission are more of a problem than errors of commission (i.e. all clouds should be detected).

Chapter 4.5 Discussion is quite shallow, and does not refer to any earlier studies. This part should be expanded.
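
The Table 1 comment on omission versus commission can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists with 1 for cloud and 0 for clear; the function name `omission_commission` and the toy masks are hypothetical and are not taken from the manuscript.

```python
def omission_commission(reference, predicted):
    """Return (omission_error, commission_error) for the cloud class.

    omission   = missed cloud pixels / reference cloud pixels
    commission = false cloud pixels  / predicted cloud pixels
    Masks are nested lists of 0 (clear) and 1 (cloud); this layout is an assumption.
    """
    tp = fp = fn = 0
    for ref_row, pred_row in zip(reference, predicted):
        for r, p in zip(ref_row, pred_row):
            if r == 1 and p == 1:
                tp += 1
            elif r == 0 and p == 1:
                fp += 1
            elif r == 1 and p == 0:
                fn += 1
    omission = fn / (tp + fn) if (tp + fn) else 0.0
    commission = fp / (tp + fp) if (tp + fp) else 0.0
    return omission, commission


# Toy 2 x 4 masks, for illustration only.
reference = [[1, 1, 0, 0],
             [1, 1, 0, 0]]
predicted = [[1, 0, 0, 0],
             [1, 1, 1, 0]]
omission, commission = omission_commission(reference, predicted)
```

Reporting the two rates separately, rather than a single accuracy figure, would show directly whether a method misses clouds or over-detects them.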

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript:

“Discriminative Feature Learning Constrained Unsupervised Network for Cloud Detection in Remote Sensing Imagery”, W. Xie, J. Yang, Y. Li, J. Lei and J. Zhong (Ref. No.: remotesensing-678686),

contains interesting material and may be suitable for publication. However, it is not well organized and should be considerably elaborated. In particular, the authors stated that “The proposed CDUN method depends on an important observation that clouds are sparse and modeled as sparse outliers”. This causes some confusion. What is the limitation of the CDUN method? How can one quantify when clouds are sparse (to what degree, or below which threshold value)? Moreover, it is not clear whether this method is applicable when clouds are not sparse. The authors considered other known methods such as K-means, PRS, etc. These methods perhaps perform better with sparser clouds. Therefore, the authors should provide more detailed results for more general cases (for sparse and relatively dense clouds). Furthermore, the authors show the advantages of the CDUN method. What are the disadvantages of this approach? These details should be provided in order to make the content of the manuscript more complete and self-explanatory.

Overall, the English is acceptable. However, a few sentences should be amended. The manuscript requires some citations.
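
One possible way to answer the question of how sparse is sparse would be to report the cloud fraction of each test scene. The sketch below is only an illustration under stated assumptions: the mask is a nested list of 0 (clear) and 1 (cloud), and both the helper names and the 0.3 threshold are hypothetical values chosen for the example, not values from the manuscript.

```python
def cloud_fraction(mask):
    """Fraction of pixels flagged as cloud in a binary mask (nested lists of 0/1)."""
    total = sum(len(row) for row in mask)
    cloudy = sum(sum(row) for row in mask)
    return cloudy / total if total else 0.0


def is_sparse(mask, threshold=0.3):
    """True if the cloud fraction does not exceed the threshold.

    The default threshold is an arbitrary placeholder for illustration.
    """
    return cloud_fraction(mask) <= threshold


# Toy scenes, for illustration only.
sparse_scene = [[1, 0, 0, 0],
                [0, 0, 0, 0]]
dense_scene = [[1, 1],
               [1, 0]]
```

Tabulating detection accuracy against this fraction for each scene would indicate where, if anywhere, the sparsity assumption starts to break down.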

Apart from this the following should be taken into account:

Abstract

1) There is an inconsistency between “unsupervised cloud detection network” and the abbreviation CDUN. Why is it called CDUN instead of UCDN?

2) Perhaps some key quantitative results should be shown in the Abstract.

Introduction

1) The sentence:

“… remote sensing images have been successfully applied to target detection [3], anomaly detection [4], and classification [5]”.

Are target detection, anomaly detection and classification related to the detection of clouds in images? The references [3], [4] and [5] require a brief description.

2) The introductory part is somewhat excessive and takes about a page in small font until “In view of this, …”. Perhaps it should be shortened.

3) The sentence:

“The proposed CDUN method depends on an important observation that clouds are sparse and modeled as sparse outliers [8,9]”,

Is it a requirement that clouds are always sparse for this method? If not, please provide more details (see also comments above).

4) Provide a few lines about the limitations of the CDUN method in the Introduction. When is the CDUN method not applicable?

5) The sentence:

“As far as we know, such an unsupervised adversarial feature learning model is utilized for the first time for MS and HS cloud detection”,

Should be rewritten, for example, as:

“To the best of our knowledge, such an unsupervised adversarial feature learning model is utilized for the first time for MS and HS cloud detection”.

6) The sentence:

“An image discriminator is enforced to prevent the generalization of out-of-class features”.

The above sentence is not clear. Perhaps more details are required about the generalization of out-of-class features.

Related Work

1) Equation (1) should be cited.

2) The sentence: “… of VAE can be interpreted as a probabilistic encoder and probabilistic decoder, which are denoted as p_θ(x|z) and q_φ(z|x), respectively”, should be cited.

3) Equations (4) and (5) should be cited.

4) The sentence:

“And when the model loss converges, the parameters including weight matrix and bias are obtained”.

What if the model loss does diverge?

5) The sentence: “The ST of the ith dimension in the residual error ΔZ can be defined as:”. This sentence should be cited.

6) Equations (10), (11) and (12) should be cited.

Experimental results and discussion

1) The sentence:

“Note that the reference maps for this dataset have not been published yet”, should be rewritten as:

“Note that the reference maps for this dataset have never been published yet”.

2) Since the CPU timings are not provided, it is sufficient to write: “All compared methods are carried out using MATLAB.”

3) The sentence:

“The SL method generates some detection mistakes in the image due to the thinness and thickness of the clouds”, is not clear and should be clarified.

4) The sentence:

“The reference image of the real hyperspectral dataset is not available”, is confusing. Perhaps it should be omitted.

Conclusion

1) Conclusion should show the key quantitative results obtained in this study.

2) The sentence: “…compared with other state-of-the-arts in terms of subjective and objective evaluations”, is confusing. What does “subjective and objective evaluations” mean?

3) The last sentence:

“How to further utilize the data characteristics of remote sensing images to optimize unsupervised networks to improve the performance of cloud detection will be the focus of our future work”, is not properly written and should be corrected.

The manuscript requires a major mandatory revision.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Looks like a good paper, but presentation needs improvement.

The science presented is pretty good, but the presentation needs
substantial improvement. Parts of the paper are somewhat confusing, and
the order in which material is presented is odd. The satellite
instruments should be introduced earlier, and mentioned explicitly in the
abstract. Currently it is only implicitly clear from the abstract that
satellite imagery is used at all.

Abstract, page 1

Line 6 ("clouds are sparse") and lines 9-10 ("clouds with limited
samples"). I don't understand this. Clouds are very common and cover on
average about 70% of the Earth surface. Why are clouds sparse? Are you
only looking at a particular cloud type, in a region where clouds are
rare, only at small scenes already selected to be relatively cloud-free,
or is there another reason?

§1. Introduction

Page 1 line 28, "of first importance preprocessing step", this is poor English,
I suggest "a first important preprocessing step"

Page 2, line 37-38, "A large number of methods are advanced", I don't understand
what the authors mean. Please rephrase.

Page 2, line 58, remove "task"

Page 2, line 59: which instrument on Landsat-8? OLI or TIRS?

Page 2, line 76: see my remark in the abstract; please explain why clouds are
sparse in your images when they are common in reality.

§2. Related Work

§2.2. Variational Autoencoders

Page 4, lines 124-128, could you please explain this equation a bit more?
Most of the terms have not been explained.

§3.3 Adversarial Image Learning Term

Page 5, line 174-175, "While the image...the real input", this is a poorly
phrased sentence, please rephrase.

Page 5, Equation 5, some of the symbols are not explained

§3.4 Latent Representation of Background

Line numbers are missing in first part of §3.4 on page 5.

Page 6, Equation 6, some of the symbols are not explained.

Page 6, lines 178-184: Those lines are hard to follow and introduce some
unexplained symbols that may not be familiar to the audience of the
journal. Please rephrase this paragraph, make sure all symbols are
described and that it is appropriately explained.

§3.5 Reconstruction loss

Line numbers missing on part of this subsection on page 6.

Page 6, line immediately below equation 8: From "And when the...", remove
"And".

Page 7, lines 189-190: The sentence "Therefore, the weight ... relative
larger eigenvalue" is unclear. How is delta-Z calculated by the
eigenvalue, how does an eigenvalue calculate something?

Line numbers are missing between equations 11 and 12.

Second line below equation 11, in "we utilized guided filter", add "a" to
make it read "we utilized a guided filter".

§4 Experimental Results and Discussion

Page 7, line 201: "we first present the dataset description", the
description of the datasets used belongs in the methods and not in the
results, because the datasets are an input to your research and not a
product of it.

§4.1.1 Landsat 8 Dataset

Page 8, lines 207--210: this subsection needs more information. Landsat 8
contains two instruments: which one of the two instruments was used?
What channels of this instrument were used?
By images, do the authors mean monochrome images, RGBs, or 3D "images" of
1000×1000×n with n the number of channels used? What is the spatial
resolution? When were the images taken? Where were the images taken?
What is the area covered by each image? This information belongs in the
paper such that the results can be adequately reproduced.

Page 8, line 210: what is a reference map and how is this derived?

§4.1.2 GF-1 WFV Dataset

Page 8, line 212: "launched by China", launched by what agency?

Page 8, line 217: What is the original spatial resolution, before
resampling to 16 m?

Page 8, line 217: "false-color images", how were those images produced? I
suppose it is an RGB consisting of three channels? Which channels? Or
was it produced differently?

§4.1.3 GF-5 Hyperspectral Dataset

Page 8, lines 221-224: What is the relevance of the other (non-AHSI)
instruments? Why are they mentioned here if you only use AHSI?

Page 8, line 220-228: Some important information is missing here. What
are the wavelengths in the AHSI image? I assume that by 430 × 430 × 180,
the authors mean 180 channels. What is the spectral range covered? What
is the spatial resolution? When and where were the images taken? How
large an area does each of them cover?

§4.1.4 Experimental setting

Page 8, line 231: should overfitting be mentioned in section 3, perhaps
near the end?

Page 8, line 231: "it is also proved...insufficient training samples", I
think this is already mentioned in section 3.

Page 8, line 233: What does "leaky" mean here?

Page 8, lines 240--242: There is too much detail on your hardware/software
here, this is not relevant. The brand and model of your graphics card is
not important and can be removed, as can your CPU brand and model as well
as your amount of RAM. The paper is not about computational performance
so computational details are not relevant. Please only keep the mention
of Python, TensorFlow, and Matlab.

§4.2 Comparative Methods and Evaluation Criterion

Page 9, lines 244-253: This subsection should be in the Methods section,
not in the Results section.

Page 9, line 252-253: Please describe how these evaluation criteria are
defined.

§4.3.1. Landsat 8 Dataset Results

Page 9, line 256: How are these reference maps obtained?

Page 9, lines 257-258: How have you selected those two images and how do
you know that they are representative?

Page 9, line 260: when you say "implement via publicly released codes",
what do you mean? Have you re-implemented them from scratch, or have you
installed publicly available software? Which software is this and where
did you get this? And how you implement it belongs in the methods not the
results section.

Page 9, line 273: replace "complies" by "agrees"

Page 10, Figure 4, as well as Figures 5 and 6: The order of the panels is
confusing. The first row is Image I, the second row Image II, the third
row is Image I and the last row is Image II again: I II I II. It would be
less confusing if you first had all the panels related to Image I and then
all the panels related to Image II.

§4.4 Component analysis

Page 12, line 300: replace "pubic" by "public"

§4.5 Discussion

Page 14, line 321: "there are some details", what details are those? What
did you find that can be improved? This is of interest to the reader.

Abbreviations: Perhaps the authors can add here as well a complete listing
of symbols used.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The manuscript:

“Discriminative Feature Learning Constrained Unsupervised Network for Cloud Detection in Remote Sensing Imagery”, W. Xie, J. Yang, Y. Li, J. Lei and J. Zhong (Ref. No.: remotesensing-678686),

is significantly improved after a major revision. The authors provided descriptions, added more material and data, and included the required citations. However, it still needs some corrections and minor amendments would be desirable as follows:

1) There is a mismatch between title and UCDN. According to title the abbreviation should be INCD. This mismatch should be fixed.

2) The sentence in Abstract should be corrected as: “… Vandenberg Air Force Base in California, launched on February 11, 2013, was initially known as the Landsat Data Continuity Mission (LDCM)”.

3) The ending part of the Abstract should be corrected as: “Moreover, the OA values on Images III and Image IV from the GF-1WFV dataset are 0.9957 and 0.9934, respectively. This signifies that our algorithm performs better than other known algorithms”.

4) The sentence in Introduction should be corrected as: “Work [4] introduces a method of hyperspectral anomaly detection using image pixel selection”.

5) The sentence should be corrected as: “Most methods using CNN are intended to extract deep characteristics”.

6) The sentence:

“The proposed UCDN method depends on an important observation that clouds are sparse and modeled as sparse outliers [27-29]”.

What if clouds are not sparse? Will UCDN work in this case? If yes, how are the UCDN characteristics affected when clouds are not sparse? This should be briefly clarified.

7) The sentence:

“Although the performance of our method is the highest compared with other methods, there is still space for improvement”,

is unclear. In fact, this sentence signifies that the UCDN method has some drawbacks that are not mentioned. What are those drawbacks? A brief description of the drawbacks of UCDN should be provided.

8) The equations (13), (14) and (15) should be cited.

The manuscript may be recommended for publication. However, it requires a minor revision.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Thank you for the revised manuscript. It has substantially improved and I
think it can be published with minor corrections, which I have outlined
below. The corrections below focus on the revised manuscript, in
particular the version with highlighted changes as sent to authors. In
addition, I would like to thank the authors for their focussed
point-by-point reply to the previous iteration of the review, which has
offered valuable clarification, although some of it should probably go
additionally into the article.

Abstract

Page 1, line 20: Replace "Landsat 8 is the rocket" with "Landsat 8 was
launched" or similar (it's a satellite, not a rocket).

Page 1, line 25: Although acronyms OA and WFV are explained in the
article, they should also be explained in the abstract, which should be
self-contained.

§1 Introduction

Page 3, line 82: Replace "using" by "use"

Page 3, line 86: Replace "CNNs-based" by "CNN-based"

Page 3, line 89: Replace "significant" by "important"

Page 3, lines 104--106: "Although the ... shadow detection", not sure if
this belongs in the introduction, it seems rather something for the
discussion (where the authors have also mentioned it).

Page 3, line 121: Section 5 "Discussion" is missing here.

§3.1 Constructing Residual Error in Latent Space

Page 5, line 178: "low (...) for the background image but (...) large for
clouds", please either use "low / high" or "small / large".

§3.5 Reconstruction loss

Page 7, line 241: "We assume ... larger eigenvalue", isn't lambda_1 the
larger eigenvalue by definition, such that this assumption is superfluous?

§3.6.1 Landsat 8 Dataset

Page 8, line 261-262: although the acronyms OLI and TIRS are now explained
in the appendix, they should also be explained on first use (i.e. here).

§3.6.5 GF-5 Hyperspectral Dataset

Page 9, line 283: After "430 × 430 × 180", insert "pixels".

Page 9, line 284: Replace "never" by "not". I know that the other
reviewer recommended the opposite but I think "not" is more appropriate
here.

§4. Experimental Results

Page 9, line 288-289: Remove "Specifically, we first present the dataset
description", since this has been moved (as it should be) to the
methodology.

§4.1 Experimental Setting

Page 9, line 299: replace "epoch" by "epochs"

Page 9, line 300-301: replace "we set epoch as 1000 to trade-off" by "we
set the number of epochs to 1000 as a trade-off"

§4.2 Comparative Methods and Evaluation Criterion

Page 9, line 309: Please expand acronyms PRS, SVM, PCANet, and SL on first
use.

Page 9, line 315-316: replace "three most commonly used evaluation
criteria including the area (...)" by "three commonly used evaluation
criteria: area (...)". There exist many evaluation criteria
(https://en.wikipedia.org/wiki/Confusion_matrix), these are indeed
commonly used but it is overstated, too general, and not necessary to
claim they are the most commonly used.
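
As a concrete illustration of how such criteria could be defined, the sketch below computes an overall accuracy and a rank-based area under the ROC curve. It rests on assumptions not confirmed by this record: that "OA" stands for overall accuracy, that the elided "area (...)" criterion is the area under the ROC curve, and that each pixel has a 0/1 reference label and a real-valued cloud score. The function names and toy values are hypothetical.

```python
def overall_accuracy(labels, predictions):
    """Share of pixels whose predicted class equals the reference class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen cloud pixel scores higher
    than a randomly chosen clear pixel; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy pixel labels and scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.5, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
oa = overall_accuracy(labels, predictions)
area = auc(labels, scores)
```

The pairwise loop is quadratic in the number of pixels, so it is suitable only for small illustrations; a sorted-rank formulation would be needed for full scenes.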

§4.3.1 Landsat 8 Dataset Results

Page 10, line 330: Please explain briefly, in the article, what a
reference map is.

Page 10, line 332: As discussed before, I don't think "representative" is
the right word here. In statistics, a representative sample is a sample
that shares the same statistical characteristics as the overall
population. This would appear to be an overstatement here, and it is not
necessary for your study. I would replace it by "examples", as in "Image
I is an example of a case with thin clouds, and Image II is an example of
a case with thick clouds".

§5 Discussion

Page 15, lines 402-410: this paragraph seems to repeat the previous
paragraph a bit.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
