Article
Peer-Review Record

Extraction of Continuous and Discrete Spatial Heterogeneities: Fusion Model of Spatially Varying Coefficient Model and Sparse Modelling

ISPRS Int. J. Geo-Inf. 2022, 11(7), 358; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi11070358
by Ryo Inoue * and Koichiro Den
Reviewer 1:
Reviewer 2:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 28 April 2022 / Revised: 6 June 2022 / Accepted: 19 June 2022 / Published: 23 June 2022

Round 1

Reviewer 1 Report

This paper describes a hybrid method for regression analysis of spatial data for cases where it is anticipated that both continuous and discrete forms of spatial autocorrelation apply. The issue it addresses is significant and the argument for a hybrid method is well made. The particular approach chosen is appropriate, because it addresses the two types of autocorrelation as qualitatively different issues rather than, for example, attempting to handle discrete change as only an extreme inflection of a continuous surface.

Some of the claims are perhaps a little too strongly made, however, and I have some lingering doubt as to the method’s generality. On line 54 the authors note that ESF-SVC has several advantages over GWR, but they do not mention that it also comes with relative disadvantages, e.g. GWR can be fitted with a flexible bandwidth to accommodate non-stationarity in the scale of effects. The limitation of ESF-SVC is noted in lines 145-150, where it is stated that to avoid overfitting it is necessary to use only the larger eigenvalues in an ESF-SVC model, corresponding to larger-scale spatial heterogeneity. This limitation is applied on line 174 in the method used. So, continuous heterogeneity is being assumed as relevant only to the global scale, while the GL is applied to subregions, which in the simulated data are relatively small, i.e. discrete heterogeneity is assumed to be local scale. The simulated data embeds the same assumption as the method, which isn’t ideal; randomly sized and shaped subareas would have been preferable. It is still OK if the assumption holds for the use cases, but one can think of many cases where change appears discrete at global scales and vague at local scales.

In the case study, the scale at which each kind of heterogeneity should be modelled is partly selected out by the removal of high-rise condominiums on the grounds that they represent too many examples with similar values, but one could also see this as either a hyper-local patch with a discrete boundary or as areas with continuous but very local price heterogeneity. I wonder if it is also the cause of the overfitting noted by the authors in Figure 16, because the value surface around the CBD is very steeply inflected yet not truly discrete in the same way as would be the case for the effect of a popular postcode.

I do not feel this undermines the paper at all. It is a nice method and will be very useful in some circumstances. However, I do think that either the paper needs to more clearly show generality, or the conclusions should be more caveated.

Overall, the paper was easy to read and the method is interesting. 

 

Author Response

Thank you for your detailed comments. We really appreciate them.

We agree that the “ordinary” ESF-SVC models are limited in representing local and continuous spatial heterogeneity because they must avoid overfitting. Because our proposed model uses the ESF-SVC model, which can represent global and continuous heterogeneity but not local and continuous heterogeneity, it may express local and continuous heterogeneity as local and discrete heterogeneity. We are now trying to extend the proposed model by using the random effects ESF-SVC model, which can express spatial heterogeneity from global to local scales without overfitting issues. We are also considering multi-scale local spatial heterogeneity by using group lasso and tree-structured group lasso as an extension of the proposed method.

As a third possible improvement, we explained the problem you mentioned and added a proposed solution to it in the last part of the discussion.

Reviewer 2 Report

The extraction of continuous and discrete spatial heterogeneity is an important research topic in the field of spatial metrology, and supports the interpretation and analysis of spatial characteristics. This paper proposes an extraction method based on the “Fusion of Spatially Varying Coefficient Model and Sparse Model”. Although the research subject is very interesting and has good applications, there are still some shortcomings in this study, which are as follows:

1. In the Abstract, the research objectives of this paper are clearly introduced, but the research necessity and problems of this paper are not summarized. For example, what are the characteristics of continuous heterogeneity and discrete heterogeneity? What are the difficulties in common extraction? What are the limitations of separate extraction of the two types of heterogeneity? And how does the model in this paper solve these defects?

2. In section Introduction, the manuscript introduces the application background and situation of continuous heterogeneity and discrete heterogeneity models in detail, but does not introduce the research objectives, especially the research status and progress of the joint extraction of continuous and discrete spatial phenomena.

3. In sections 2.1.1 and 2.1.2, the fusion of ESF-SVC and GL proposed in this paper is a good idea, but why these two models are used for fusion is not well explained. There are several variants of ESF. What are their characteristics? Why can combining them with the GL model solve the problems presented in this paper?

4. In Section 2.2, the method introduction and calculation formula of the model seem to be relatively simple. It seems that only a discrete heterogeneity detection variable is added on the basis of ESF-SVC model. The author can discuss the progress of the method and increase the discussion on the physical meaning of formula variables.

 

5. In Section 2.3, the manuscript devotes considerable space to simulation experiments that evaluate the performance of this model, which is very effective. However, the setting of experimental data, such as the division of experimental areas and the selection of vector coefficients, is not explained in this paper. Is it representative or commonly used in similar studies? Authors can discuss or add references.

6. In the experiments in Section 2.3.3 and section 3, the manuscript mainly compares the ESF-SVC and GL models, but lacks the comparison with the typical ESF variant models and other relevant research results, which does not well illustrate the innovation of this method. At the same time, it is suggested to strengthen the qualitative analysis of the experiment in Section 3. At present, the comprehensibility of the results is not very high.

7. It is well known that spatial scale and the shape of spatial units have an important impact on heterogeneity analysis. How does the method of this paper consider the role of these factors, and what are the solutions?

8.   Add references to relevant work and classical methods, and strengthen the discussion on the research characteristics in the summary part, especially the problems that cannot be solved by the current methods.

Author Response

Thank you for your detailed comments. We really appreciate them. The responses are described as follows.

Comment #1
In the Abstract, the research objectives of this paper are clearly introduced, but the research necessity and problems of this paper are not summarized. For example, what are the characteristics of continuous heterogeneity and discrete heterogeneity? What are the difficulties in common extraction? What are the limitations of separate extraction of the two types of heterogeneity? And how does the model in this paper solve these defects?

Response to comment #1
To clarify the necessity to consider both types of spatial heterogeneity simultaneously, explanations have been added to the abstract, indicated by yellow markers.

 

Comment #2
In section Introduction, the manuscript introduces the application background and situation of continuous heterogeneity and discrete heterogeneity models in detail, but does not introduce the research objectives, especially the research status and progress of the joint extraction of continuous and discrete spatial phenomena.

Response to comment #2
We introduce the previous analysis methods for continuous spatial heterogeneity in lines 52-63, and for discrete spatial heterogeneity in lines 64-83. Then we introduce an example of a dataset with both types of spatial heterogeneity in lines 89-99 and state the present research status, in which little discussion has taken place on the consideration of both types of spatial heterogeneity.
We modified the sentence in lines 103 and 104 to clarify the research objective.

 

Comment #3
In sections 2.1.1 and 2.1.2, the fusion of ESF-SVC and GL proposed in this paper is a good idea, but why these two models are used for fusion is not well explained. There are several variants of ESF. What are their characteristics? Why combined with the GL model can solve the problems presented in this paper?

Response to comment #3
Unlike some variants of ESF-SVC models, such as an RE-ESF-SVC model, the “classical” ESF-SVC model is highly compatible with GL for constructing a fusion model, because it is represented by a linear regression model and can be estimated by OLS. We added the explanation in lines 159-161.
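
To illustrate why the linear form matters, the following is a minimal sketch, not the authors' implementation, of an ESF-SVC design matrix fitted by ordinary least squares. It assumes the common formulation in which each coefficient is a constant plus a linear combination of Moran eigenvectors; the exponential distance kernel, the function names, and the number of eigenvectors are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def moran_eigenvectors(coords, n_vectors):
    """Eigenvectors of the doubly centred connectivity matrix M C M.

    Assumption: C is an exponential distance-decay kernel with zero diagonal.
    Only the eigenvectors with the largest eigenvalues are returned, which
    correspond to the broadest-scale spatial patterns.
    """
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    conn = np.exp(-dist / dist[dist > 0].mean())
    np.fill_diagonal(conn, 0.0)
    centre = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(centre @ conn @ centre)
    order = np.argsort(vals)[::-1][:n_vectors]
    return vecs[:, order]


def esf_svc_design(x, eigvecs):
    """Stack [1, x] with the products of each column and every eigenvector.

    The coefficient of column j at location i is then
    beta_j + sum_l eigvecs[i, l] * gamma_jl, which is linear in the unknowns.
    """
    n = x.shape[0]
    x1 = np.column_stack([np.ones(n), x])
    interactions = [x1[:, [j]] * eigvecs for j in range(x1.shape[1])]
    return np.column_stack([x1] + interactions)


def fit_ols(design, y):
    """Ordinary least squares via a least-squares solver."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef
```

Because every unknown enters linearly, extra columns (for example subregion dummy variables) and penalty terms can be appended to the same least-squares objective, which is the compatibility the response refers to.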

 

Comment #4
In Section 2.2, the method introduction and calculation formula of the model seem to be relatively simple. It seems that only a discrete heterogeneity detection variable is added on the basis of ESF-SVC model. The author can discuss the progress of the method and increase the discussion on the physical meaning of formula variables.

Response to comment #4
As you pointed out, the proposed model has a simple structure that has subregion-specific coefficients and adds an ℓ1 penalty term on the differences between adjacent coefficients. The regularization term is effective in mitigating the scale issues related to the presetting of subregions. To clarify the meaning of the term, we added an explanation in lines 192-195. In addition, an explanation of the other regularization terms was added in lines 195-203 in response to another reviewer’s comment.
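
The penalty described in this response can be sketched as follows. This is an illustrative construction under stated assumptions, not the paper's equation (8): the adjacency list, the function names, and the weight `lam` are hypothetical, and only the ℓ1 term on differences between adjacent subregion coefficients is shown.

```python
import numpy as np


def difference_matrix(n_regions, adjacent_pairs):
    """One row per adjacent pair (i, j), encoding coef[i] - coef[j]."""
    d = np.zeros((len(adjacent_pairs), n_regions))
    for row, (i, j) in enumerate(adjacent_pairs):
        d[row, i] = 1.0
        d[row, j] = -1.0
    return d


def fused_penalty(coefs, d, lam):
    """lam times the l1 norm of the differences between adjacent coefficients."""
    return lam * np.abs(d @ coefs).sum()


# Hypothetical 2 x 2 grid of subregions, numbered row by row.
pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]
d = difference_matrix(4, pairs)
merged = fused_penalty(np.array([1.0, 1.0, 1.0, 1.0]), d, lam=0.5)
one_differs = fused_penalty(np.array([1.0, 1.0, 1.0, 3.0]), d, lam=0.5)
```

In this toy case `merged` is 0.0 and `one_differs` is 2.0. Adjacent subregions that share a coefficient cost nothing, so finely preset subregions can effectively merge into larger ones, which is one way to read the claim that the term mitigates scale issues in the subregion setting.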

 

Comment #5
In Section 2.3, the manuscript devotes considerable space to simulation experiments that evaluate the performance of this model, which is very effective. However, the setting of experimental data, such as the division of experimental areas and the selection of vector coefficients, is not explained in this paper. Is it representative or commonly used in similar studies? Authors can discuss or add references.

Response to comment #5
The explanation on how to create experimental data is in lines 211-240, the settings for creating data with continuous spatial heterogeneity are in lines 219-227, and the settings for creating data with discrete spatial heterogeneity (e.g., subregion settings, subregions where discrete spatial heterogeneity occurs) are in lines 228-235.
Simulated data are not commonly used in analyses of discrete spatial heterogeneity because prior studies used simple linear regression models with subregion-specific coefficients and there was no need to discuss the performance of the model.

 

Comment #6
In the experiments in Section 2.3.3 and section 3, the manuscript mainly compares the ESF-SVC and GL models, but lacks the comparison with the typical ESF variant models and other relevant research results, which does not well illustrate the innovation of this method. At the same time, it is suggested to strengthen the qualitative analysis of the experiment in Section 3. At present, the comprehensibility of the results is not very high.

Response to comment #6
The ordinary ESF models are not spatially varying coefficient models but models that represent the spatial autocorrelation of dependent variables. An ESF-SVC model is an extension of ESF that estimates spatially varying coefficients. To our knowledge, there are not many variants of ESF-SVC models that could be used for a linear regression model. An exception is an RE-ESF-SVC model, which is introduced in the discussion section. An RE-ESF-SVC model might be a good option for comparison. However, as it is also a model that assumes continuous spatial heterogeneity, it cannot be expected to produce highly accurate analysis results, especially for the simulation data with clear discrete spatial heterogeneity. We do not believe that comparisons with other methods are useful to show the difference in estimation results from conventional methods.

 

Comment #7
It is well known that spatial scale and the shape of spatial units have an important impact on heterogeneity analysis. How does the method of this paper consider the role of these factors, and what are the solutions?

Response to comment #7
As we wrote in the response to comment #4, we added the explanation in lines 192-195 that the GL regularization term on the differences between adjacent coefficients mitigates the scale issues of the subregion setting.

 

Comment #8
Add references to relevant work and classical methods, and strengthen the discussion on the research characteristics in the summary part, especially the problems that cannot be solved by the current methods.

Response to comment #8
We believe that the existing methods are adequately discussed in Section 1, and the differences between the existing methods and the proposed method are adequately discussed in Sections 2.3.3 and 3.3. As the summary of the application to apartment rent data analysis is described in section 3.4, just before section 4, we decided that repetition should be avoided. A conclusion is not mandatory according to the journal's instructions for authors, so we placed the discussion as the final section of this paper.

Reviewer 3 Report

 

This is an interesting and innovative approach to spatial heterogeneity assessment, linking geoinformatics with spatial statistics. The paper is ready for publication, but the authors might consider that the whole debate about spatial heterogeneity is part of a larger problem of spatial complexity: the more spatially complex an area is, the more heterogeneous it is and (possibly) the more approaches such as the one proposed by the authors make sense to analyze it (see Papadimitriou, F. 2020, Spatial Complexity. Theory, Mathematical Methods and Applications, Springer).

Author Response

Thank you for your comment.

We have checked the recommended book. However, it is difficult for us to discuss the relationship between spatial complexity and spatial heterogeneity, as spatial complexity relates to a broad field of geoinformatics.

Reviewer 4 Report

 

The paper improves the spatial econometric methodology by introducing a new model that combines discrete and continuous spatial heterogeneity. The study is solid, includes a good literature review, introduces a model with a significant level of novelty, proposes an estimator and illustrates it using the simulated data, and completes with a practical example.

Minor issues:

1. The literature review can be extended with papers devoted to discrete-continuous heterogeneity (including spatial heterogeneity).

2. The parameter identification problem frequently appears in spatial econometric models (especially structure-rich models). I recommend addressing this problem (and providing evidence of model parameter identifiability).

 

3. Misspelling “aug min” in formulas

 

Author Response

Thank you for your comments. We appreciate them. The responses to your comments are as follows.

Comment #1
The literature review can be extended with papers devoted to discrete-continuous heterogeneity (including spatial heterogeneity).

Response to comment #1
In our understanding, discrete-continuous models are behavioral models for describing situations in which discrete choice behavior and choice behavior regarding continuous quantities are partially associated by common factors. We think those models have little relevance to this study, so we did not add comments on them.

 

Comment #2
The parameter identification problem frequently appears in spatial econometric models (especially structure-rich models). I recommend addressing this problem (and providing evidence of model parameter identifiability).

Response to comment #2
We think the regularization of the ESF-SVC and subregion dummy variable coefficients themselves can reduce the occurrence of parameter identification problems between the ESF-SVC and the subregion dummy variable coefficients. We added an explanation of the proposed method in lines 192-203 to clarify the functions of the three regularization terms of equation (8). The BIC-based model selection is also effective in reducing the possibility of parameter identification problems. We added comments in lines 217-219. In the simulation study in section 2, no major parameter identification problems seem to have occurred, because the models are quite simple. We added comments in lines 316-321.

Although we could not determine whether parameter identification problems occurred in the rent analysis in section 3, we consider it unlikely that they did.
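
As a rough illustration of BIC-based model selection, the sketch below scores candidate fits with the standard Gaussian-error form of BIC and keeps the lowest score. The exact criterion and the way the paper counts the effective number of parameters are not stated in this record, so `n_params`, the function names, and the candidate labels are assumptions.

```python
import numpy as np


def gaussian_bic(y, fitted, n_params):
    """BIC under Gaussian errors: n * log(RSS / n) + n_params * log(n)."""
    n = y.shape[0]
    rss = float(np.sum((y - fitted) ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)


def select_by_bic(y, candidates):
    """candidates: iterable of (label, fitted_values, n_params) tuples.

    Returns the label with the smallest BIC and the full score table.
    """
    scores = {label: gaussian_bic(y, fitted, k) for label, fitted, k in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

When two candidates fit the data equally well, the one with fewer parameters receives the lower score, so redundant combinations of coefficients are penalized. That is consistent with the response's point that model selection reduces the room for identification problems, although it is not a proof of identifiability.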

 

Comment #3
Misspelling “aug min” in formulas

Response to comment #3
Thank you for pointing this out. We modified equations (5), (6), and (8).

Round 2

Reviewer 2 Report

The authors have improved the paper significantly, addressing most of my comments sufficiently. I therefore suggest accepting the submission for publication.
