Article
Peer-Review Record

Exploring the Potential of Deep Learning Segmentation for Mountain Roads Generalisation

ISPRS Int. J. Geo-Inf. 2020, 9(5), 338; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi9050338
by Azelle Courtial 1,*, Achraf El Ayedi 1, Guillaume Touya 1 and Xiang Zhang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 31 January 2020 / Revised: 7 May 2020 / Accepted: 20 May 2020 / Published: 25 May 2020
(This article belongs to the Special Issue Map Generalization)

Round 1

Reviewer 1 Report

River Deep, Mountain High: Deep Learning Segmentation for Mountain Roads Generalization

 In this paper, the authors experiment with different deep learning techniques to automate mountain road generalization.

 This paper is creative in its approach to problem-solving and reports many details of the process.

Unfortunately, the authors do not clearly describe the problem they attempt to solve, nor do they clearly describe and justify the solution they attempt. They also do not clearly explain the evaluation methods chosen to frame their results; they even state this in the conclusion. I see this as the main downfall of the paper, and it is therefore not publishable as a scientific paper.

There were several important scientific writing flaws. For example, Table 1 is in French with no translation. The chosen techniques are not well described, and the reasons for picking them are not justified.

Key terms are not sufficiently defined. Figures are not always referenced or described in the text. There are several problems with the language; the very awkward English makes the paper difficult to follow, and I found myself questioning what the authors were trying to argue. The paper is also disorganized: there are sentences about the value of the results in the background section and sentences about the methods in the results section. Acronyms are not spelled out the first time they are used. There are so many stylistic and language issues that they distract from the content. Why is the term "river" in the title? Rivers are never mentioned in the paper.

Mountain roads follow specific contours of a slope, and different countries have different laws concerning the grades at which roads may be built. I would expect to see this addressed somewhere in the paper or tested in the results. At the very least, to test the value and effectiveness of their generalization results, the outputs would need to be compared to a contour map: roads built in mountainous areas often follow the contours of the mountain, so checking whether the output is sufficient requires some comparison to elevation maps.

 

For all of these reasons, and more, this paper is not fit for publication. 

Author Response

Here are our responses to Reviewer 1. Please find attached our responses to all reviewers.

 

Unfortunately, the authors do not clearly describe the problem they attempt to solve. They do not clearly describe and justify the reasons for the solution they attempt.

We have now expanded the introduction section to make the problem and our motivation clearer.

Also, they do not clearly explain or describe their evaluation methods chosen to then frame their results. They even state this in the conclusion. This I see as the main downfall of this paper and it is therefore not a publishable scientific paper.

The evaluation is justified in Section 3.5. We have expanded this section in order to resolve the confusion between the internal evaluation (used during learning) and the final evaluation.

There were several important scientific writing flaws.

The English has now been checked and corrected.

For example, Table 1 is in French with no translation.

The attributes of the database are now translated.

The techniques that were chosen are not well described or justified for picking them.

We do not understand this remark: which techniques are meant? We have reviewed the explanations of the architecture choice and of the image creation. (The evaluation is better described in our response to point 2.)

Key terms are not sufficiently defined.

As for the previous point, we could not find which key terms are meant here. We have checked the definitions of the following terms: convolutional neural network, loss, generalization, segmentation, deep learning. The terms for the training, test, and validation datasets have been clarified.

 

Figures are not always referenced or described in the text.

Figure 2 is now referenced. Figure 11 has been merged with Figure 10 and is now referenced.

There are several problems with the language, very awkward English which makes the paper difficult to follow. I found myself questioning what the authors were trying to argue.

The English has now been checked and corrected.

The paper is also disorganized. There are sentences about the value of the results in the background section and sentences about the methods in the results section.

We could find neither sentences about our results in the background section nor sentences about our methods in the results section.

Acronyms are not spelled out the first time they are used. There are so many stylistic and language issues, it is distracting from the content.

The following acronyms have been spelled out:

GPU: Graphics processing unit

IGN: not an acronym, but the name of the national mapping agency.

REF: reference.

Why is the term river in the title? Rivers are never mentioned in the paper?

This was a reference to a song, but it may be inappropriate; the title has been changed to a more serious one that better emphasizes the aim of the paper: "Exploring the Potential of Deep Learning Segmentation for Mountain Roads Generalisation."

Mountain roads follow specific contours of a slope. Different countries have different laws in terms of grades in which roads may be built. I would expect to see this somewhere in the paper or test this in the results. At the very least, to test the value and effectiveness of the results of their generalization tests, they would need to be compared to a contour map as roads that are built-in mountainous areas often follow the contours of the mountain, so to check if the output is sufficient- there would need to be some comparison to elevation maps.

The inclusion of context is mentioned as future work. However, the relief is not represented on the map at 1:250,000 (at IGN; we assume this is the case for most other national mapping agencies as well). Secondly, the reference generalisation respects the elevation, so we assume that a prediction close to the reference would also be coherent with the relief.

Reviewer 2 Report

This study tries to address an interesting map generalization challenge using deep learning. The reviewer has the following comments that might be helpful to improve the manuscript in the future (the commented pdf is also attached).

#1. Page 6, Lines 172-177. It is not clear to me how post-processing of the tiles from the second method (object hold) was done. Could you elaborate more using an example?

#2. Page 6, Figure 4. Increase the spacing between the four sub-figures for better differentiability.

#3. Page 7, Lines 214-216. I think the texts should be removed.

#4. Page 8, Lines 233-234. “In our case the loop creation rather that the turn enlargement is not a big segmentation error but unimportant cartographic mistake.” Confused by this sentence. Can you state it in a more straightforward way?

Page 8, Line 254. The “test data” was never mentioned before (instead, data were divided into a training set and an evaluation set). Here, do you mean “training data” instead of “test data”? Same comment on the legend of Figure 6 (testing data).

Page 8, Figure 6. The legend has "Validation data," which was never mentioned before either. Do you mean "Evaluation data"? Please use consistent terms throughout the article.

Page 9, Line 268. “Another important point to note is the difference between training and validation evaluation. Here it is not so big, so there is not so much over-fitting of the learning model.” The difference of what? How can you conclude “there is not so much over-fitting”?

Page 9, Line 271. “…we have very different values.” Different values of what?

Page 9, Lines 274-277. The results on "test data" are really confusing, as "test data" were never introduced before in the article. I guess the author(s) mean "training data." Usually, prediction accuracy is higher (lower loss) on the training data than on the validation data, which is the case if by "test data" the author(s) mean "training data."

Page 9, Lines 285-287. “Moreover, our method does not guarantee the network connectivity inside each tile, and often some roads are disconnected in the predicted image.” This is a serious drawback. Any thoughts on how to address it?

Page 10, Lines 293-295. Finally, here the author(s) explained what they mean by “test data” – a part of the training data, which is problematic. In machine learning, test data are supposed to be independent from training data (i.e., test data should not be used in training in any way). In my view, the statistics (e.g., loss in Figure 6, evaluation score in Figure 8) computed on the so-called test data are actually on the training data.

Page 12, Figures 10 and 11. The captions of Figure 10 and Figure 11 are not informative (the readers do not know what you are comparing).

Page 13, Figure 12. The a, b, c referenced in the caption are not present on the figure.

Author Response

Here are our responses to Reviewer 2. Please find attached our responses to all reviewers.

 

Page 6, Lines 172-177. It is not clear to me how post-processing of the tiles from the second method (object hold) was done. Could you elaborate more using an example?

We added some explanations, illustrated by a figure, and the equation to compute the width of the roads.

Page 6, Figure 4. Increase the spacing between the four sub-figures for better differentiability.

Done.

Page 7, Lines 214-216. I think the texts should be removed.

We kept this text but moved it to the future work section.

Page 8, Lines 233-234. “In our case the loop creation rather that the turn enlargement is not a big segmentation error but unimportant cartographic mistake.” Confused by this sentence. Can you state it in a more straightforward way?

Done: "In our case, the segmentation tends to create loops in sinuous bends rather than just enlarging the bend (see the results in the following section). This is not a big segmentation error, because only a few pixels are misclassified, but it is clearly an important cartographic mistake."
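To see why such a loop costs little in segmentation terms yet matters cartographically, consider a toy pixel-accuracy computation on hypothetical 10x10 masks (illustrative only; these are not the paper's data or its evaluation metric):

```python
def pixel_accuracy(pred, ref):
    """Fraction of pixels where the predicted binary mask matches the reference."""
    total = sum(len(row) for row in ref)
    correct = sum(p == r
                  for prow, rrow in zip(pred, ref)
                  for p, r in zip(prow, rrow))
    return correct / total

# Hypothetical masks: the prediction closes a bend into a loop,
# flipping only 3 pixels out of 100 relative to the reference.
ref = [[0] * 10 for _ in range(10)]
pred = [row[:] for row in ref]
for x in (4, 5, 6):
    pred[5][x] = 1  # the spurious loop pixels
print(pixel_accuracy(pred, ref))  # 0.97
```

Three misclassified pixels still leave 97% pixel accuracy, which is why a purely pixel-wise loss barely penalises the spurious loop.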

Page 8, Line 254. The “test data” was never mentioned before (instead, data were divided into a training set and an evaluation set). Here, do you mean “training data” instead of “test data”? Same comment on the legend of Figure 6 (testing data).

Page 8, Figure 6. The legend has “Validation data” was never mentioned before neither. Do you mean “Evaluation data”? Please use consistent terms throughout the article.

Page 9, Lines 274-277. The results on "test data" are really confusing, as "test data" were never introduced before in the article. I guess the author(s) mean "training data." Usually, prediction accuracy is higher (lower loss) on the training data than on the validation data, which is the case if by "test data" the author(s) mean "training data."

There was an inconsistency between the figures and the text. You are right: the figures represent the values on the training and evaluation data, not on the test and validation data. We have corrected the whole article to use this vocabulary consistently.

Page 9, Line 268. “Another important point to note is the difference between training and validation evaluation. Here it is not so big, so there is not so much over-fitting of the learning model.” The difference of what? How can you conclude “there is not so much over-fitting”?

Page 9, Line 271. “…we have very different values.” Different values of what?

We drew this conclusion in comparison to other experiments. We have corrected the sentence to make this clear.

Page 9, Lines 285-287. “Moreover, our method does not guarantee the network connectivity inside each tile, and often some roads are disconnected in the predicted image.” This is a serious drawback. Any thoughts on how to address it?

We added a sentence in the discussion: "This is a serious drawback that we could address with a new architecture that can discriminate whether the roads remain connected (e.g. a generative adversarial network). We could also use morphological operations in the image or, after a conversion to vector data, use methods that preserve network connectivity during generalisation (Touya, 2010)."
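As a toy illustration of the morphological option mentioned in this response (not the paper's implementation; a real pipeline would more likely use scipy.ndimage or OpenCV), a 3x3 binary closing can reconnect a one-pixel gap in a rasterised road:

```python
def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any pixel in its
    (clamped) 3x3 window is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(img[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def erode(img):
    """3x3 binary erosion: a pixel stays 1 only if every pixel in its
    (clamped) 3x3 window is 1."""
    h, w = len(img), len(img[0])
    return [[int(all(img[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]

def closing(img):
    """Morphological closing: dilation followed by erosion; fills small gaps."""
    return erode(dilate(img))

# A one-pixel-wide road with a one-pixel gap in the middle row.
road = [[0] * 7, [0] * 7, [1, 1, 1, 0, 1, 1, 1], [0] * 7, [0] * 7]
closed = closing(road)
print(closed[2])  # the gap at column 3 is reconnected: [1, 1, 1, 1, 1, 1, 1]
```

Closing fills breaks smaller than the structuring element while leaving the rest of the one-pixel-wide road in place; larger disconnections would need a graph-based or learned repair such as the ones named in the response.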

Page 10, Lines 293-295. Finally, here the author(s) explained what they mean by “test data” – a part of the training data, which is problematic. In machine learning, test data are supposed to be independent from training data (i.e., test data should not be used in training in any way). In my view, the statistics (e.g., loss in Figure 6, evaluation score in Figure 8) computed on the so-called test data are actually on the training data.

Contrary to most machine learning techniques, there are three datasets in deep learning: the training set; the test set, which is used during training to guide the convergence of the training iterations through the loss values; and the validation or evaluation set, which is never used in the training phase. We have now clarified this point at the end of the method section.
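For illustration only (this is not the authors' code), the three-dataset scheme described in the response can be sketched in a few lines; the tile list, split fractions, and seed below are all hypothetical:

```python
import random

def three_way_split(tiles, train_frac=0.7, test_frac=0.15, seed=42):
    """Split tiles into the three datasets described in the response:
    the training set, the test set monitored during training via the
    loss values, and the evaluation set never used in training."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = tiles[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    return (shuffled[:n_train],                  # training set
            shuffled[n_train:n_train + n_test],  # test set
            shuffled[n_train + n_test:])         # evaluation set

train, test, evaluation = three_way_split(list(range(100)))
print(len(train), len(test), len(evaluation))  # 70 15 15
```

The constraint that matters for the reviewer's point is that the evaluation set is disjoint from everything seen during training, which the slicing above guarantees by construction.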

Page 12, Figures 10 and 11. The captions of Figure 10 and Figure 11 are not informative (the readers do not know what you are comparing).

These two figures have been merged because the comparison was not inside each of these figures but between these two figures.

Page 13, Figure 12. The a, b, c referenced in the caption are not present on the figure.

Done

Author Response File: Author Response.pdf

Reviewer 3 Report

Brief summary

This is a well written paper that has several strengths. The authors utilize deep learning techniques to overcome the cartographic generalization problem of sinuous bends in mountain roads. The method is quite innovative and the evaluation has been thoroughly carried out. The method still seems to have some limitations before it can be applied in practice, but once these are solved, I believe that it can contribute to more automatic generalization procedures.

Broad comments highlighting areas of strength and weakness.

  1. The visual presentations of data and results in most of the figures need to be improved and clarified so that the readers are able to understand them without much information beyond the associated figure text. This relates mostly to Figures 4, 7, 9, 10-15. For example, in Figure 7, please clarify inside the figure which columns represent which type of data and results. That is, improve the figures and legends so that they can, as far as possible, be understood by looking only at them and the figure title.
  2. As I understand the evaluation results, the developed method does not produce better results than the reference generalized data. If so, please mention this in the conclusions.
  3. In relation to point 2, the authors discuss what they could have done to improve the automatic generalization. For example, that they need more examples to train the model, larger images or images with a better resolution, and that the network needs to be deeper than the current setting. My question is then why the authors did not do this to improve the results? Was it not possible due to lack of data?
  4. Because of the complexity of quantitative evaluations for generalization results, I understand the authors' approach of combining them with a qualitative evaluation. Ideally, however, such a qualitative evaluation could be carried out with user tests in order to obtain a more objective view. Because the paper is already quite long, I do not suggest that the authors carry out such a test. However, they could mention it briefly when discussing the approach of their qualitative evaluation.

 

Specific comments referring to line numbers, tables or figures. 

Most figures: Please use scale-bars when relevant.

Lines 2-6: This long sentence could be divided in two shorter sentences.

Line 12: “… looks like a generalized version of the roads.” I suggest using other wording that the subjective “looks like” when you summarize the results from the evaluation.

Figure 1: This figure could be clearer if you add a zoomed-in area showing a detail example of the difference between the generalized roads and initial roads.

Figure 2: It is difficult to identify other indexes than the two extremes (“13-30” and “83-100”). Maybe other colors or shapes can make the other groups more visible to the reader.

Table 1: It would be useful to the reader if the Attribute values were translated to English.

Lines 144-149: Although the reason for using the 256*256 pixel size is discussed, it is still not very clear why other pixel sizes were not tested. Therefore, the choice of 256*256 seems somewhat arbitrary.

Figure 4: Borders around the figures and labels of A-D would make the figure clearer.

Figure 5: Please explain shortly (in the text) the legend in this figure. E.g., “Conv 3*3, ReLu”

Line 224: Please remove “really” from “… is a really common evaluation …”

Line 255: Please rephrase: “ …(so not too large here as our dataset is small)”.

Author Response

Here are our responses to Reviewer 3. Please find attached our responses to all reviewers.

 

The visual presentations of data and results in most of the figures need to be improved and clarified so that the readers are able to understand them without too much information other than the associated figure text. This relates mostly to Figures 4, 7, 9, 10-15. For example, in Figure 7, please clarify inside the figure which columns that represent which type of data and results. That is, improve the figures and legends so that they as much as possible can be understood by only looking at them and the figure title.

Figures and legends have been modified so that they can now be understood independently. We added column names and scale bars.

As I understand the evaluation results, the developed method does not produce better results than the reference generalized data. If so, please mention this in the conclusions.

It was not our goal to outperform existing techniques for now. We have now made this clear in the conclusion and at the beginning of the results section.

In relation to point 2, the authors discuss what they could have done to improve the automatic generalization. For example, that they need more examples to train the model, larger images or images with a better resolution, and that the network needs to be deeper than the current setting. My question is then why the authors did not do this to improve the results? Was it not possible due to lack of data?

It is true that the lack of data is the main limitation, as we need a manual matching between the generalised and ungeneralised roads. Moreover, computation time was also a limitation that we are trying to overcome. We now mention these limitations in the discussion section.

Because the complexity of quantitative evaluations for generalization results, I understand the authors approach of combining it with a qualitative evaluation. Ideally, however, such qualitative evaluation could be carried out with user-tests in order to obtain a more objective view. Because the paper is already quite long, I do not suggest the authors to carry out such a test. However, they could mention it shortly when discussing the approach of their qualitative evaluation.

We totally agree with the reviewer, and now mention it in the future work section.

Specific comments referring to line numbers, tables or figures. 

Most figures: Please use scale-bars when relevant.

Done.

Lines 2-6: This long sentence could be divided in two shorter sentences.

Done.

Line 12: “… looks like a generalized version of the roads.” I suggest using other wording that the subjective “looks like” when you summarize the results from the evaluation.

We did not change these terms because this subjective observation was our real goal: we wanted the segmentation result to look like generalised roads. However, we added a quantitative value to make the results summary less subjective.

Figure 1: This figure could be clearer if you add a zoomed-in area showing a detailed example of the difference between the generalized roads and initial roads.

Done.

Figure 2: It is difficult to identify other indexes than the two extremes (“13-30” and “83-100”). Maybe other colors or shapes can make the other groups more visible to the reader.

We provide new colors and a zoomed image for more clarity.

Table 1: It would be useful to the reader if the Attribute values were translated to English.

Done

Lines 144-149: Although the reason for using the 256*256 pixel size is discussed, it is still not very clear why other pixel sizes were not tested. Therefore, the choice of 256*256 seems somewhat arbitrary.

We added a sentence to explain this point.

Figure 4: Borders around the figures and labels of A-D would make the figure clearer.

Done

Figure 5: Please explain shortly (in the text) the legend in this figure. E.g., “Conv 3*3, ReLu”

We added a paragraph that explains in details the layers of the convolutional network.

Line 224: Please remove “really” from “… is a really common evaluation …”

Done

Line 255: Please rephrase: “ …(so not too large here as our dataset is small)”.

Done

 

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The authors addressed all my comments.

After reviewing the revised manuscript, the reviewer now tends to agree that the article is worth publishing on the merits of its exploration of the challenges and difficulties of using deep learning for mountain road generalization, although it is still a work in progress rather than a completed work.

Author Response

Thank you for reviewing our work.
