Article
Peer-Review Record

Analysis of the Effect of Uncertainty in Rainfall-Runoff Models on Simulation Results Using a Simple Uncertainty-Screening Method

Water 2019, 11(7), 1361; https://doi.org/10.3390/w11071361
by Mun-Ju Shin 1 and Chung-Soo Kim 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 5 April 2019 / Revised: 28 June 2019 / Accepted: 28 June 2019 / Published: 30 June 2019

Round 1

Reviewer 1 Report

The manuscript should be subjected to proof reading.

Author Response

Thank you for the comment. We have revised and proofread the manuscript.

Reviewer 2 Report

The manuscript analyzes the parameter uncertainty of three rainfall-runoff models (GR4J, IHACRES, and Sacramento) using the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm. The results of the comparison show higher parameter uncertainty in the Sacramento model. Overall, the topic of this manuscript is of interest and use to the hydrologic community. The presentation of the results is also of acceptable quality for publication. However, the introduction section has been poorly written and should be revised. Here are my major comments:

1. The presentation of the goal and objectives of this manuscript should be improved in the introduction section. The authors claim that their approach can be used by modelers to select an appropriate model. However, it is not clear how one can make such a decision from the results of these analyses. Does it mean that the high uncertainty of the Sacramento model, shown in this paper, is the reason for selecting the other two models? Selection of an appropriate model depends on many factors, including both uncertainty and accuracy. Therefore, the authors should provide a better justification of the application and advantage of their approach in this section (the last lines of the abstract should also be modified for the same reason).

 

2. Uncertainty analysis of hydrologic models has been widely used in the literature. The authors should provide a better list of references in this area. For example, I suggest that the authors revise the first paragraph of the introduction. They can address the different types of uncertainty in hydrologic models (uncertainty in forcing data, model structure, parameters, and state variables), cite appropriate references for each type of uncertainty, and explicitly emphasize that their work is focused only on parameter uncertainty. It would also be nice to refer to some newer approaches, such as data assimilation techniques, for uncertainty quantification of hydrologic models.


 I also list more comments below:

 

Title: Change “using  Simple” to “Using a Simple” in the title of manuscript.

 

Line 14: An academic approach is not in conflict with being a practical method. This sentence should be revised.

 

Section 2: It would be useful if the authors added a new figure showing the location of the five catchments on a map, the shape of each catchment, and the location of the gauges used for the calibration.

 

Line 212-216: Please revise this part and explain why the NSE is close to 1. Is it because of human errors? Can the authors provide a possible reasoning for the odd behavior of parameters in figure 3?

 

Line 303-304: Why is the focus of this analysis on GR4J? For this table, I suggest that the authors discuss all three models equally.

 

Section 4.3: There is no line numbering here, but the authors mention "It means that the best one hundred parameter sets extracted for the period of 1980s could predict the hydrographs of 1990s and 2000s with a range of errors similar to that of the 1980s, but the uncertainty of the peak flow prediction for the period of 1970s was greater." Please provide the potential reasons for this behavior.





Author Response

Thank you for the comment. We added references to the first paragraph.

 

 I also list more comments below:

Title: Change “using Simple” to “Using a Simple” in the title of manuscript.

Response: Thank you for the comment. We modified the title.

 

Line 14: An academic approach is not in conflict with being a practical method. This sentence should be revised.

Response: Thank you for the comment. We revised the sentence.

 

Section 2: It would be useful if the authors added a new figure showing the location of the five catchments on a map, the shape of each catchment, and the location of the gauges used for the calibration.

Response: Thank you for the comment. We added a new figure showing the location of five catchments.

 

Line 212-216: Please revise this part and explain why the NSE is close to 1. Is it because of human errors? Can the authors provide a possible reasoning for the odd behavior of parameters in figure 3?

Response: Thank you for the comment. This problem did not occur when the same data were applied to the other models, so we do not attribute it to human errors in the data. We think that there is a problem with the structure of the model and have modified the sentence.

 

Line 303-304: Why is the focus of this analysis on GR4J? For this table, I suggest that the authors discuss all three models equally.

Response: Thank you for the comment. We have added an analysis of the results of the other two models.

 

Section 4.3: There is no line numbering here but the authors mention “It means that the best one hundred parameter sets extracted for the period of 1980s could predict the hydrographs of 1990s and 2000s with a range of errors similar to that of the 1980s, but the uncertainty of the peak flow prediction for the period of 1970s was greater.” Please provide the potential reasons for this behavior.

Response: Thank you for the comment. Since this behavior occurs only in the Sacramento model for wet catchments, the potential reason is that the structural complexity of this model may increase the variability of the peak flow during floods. We have added this to the manuscript.

Reviewer 3 Report

The authors claim to have performed uncertainty analysis of three rainfall-runoff models, but they failed to present the statistical confidence intervals of the modeling results. Also, I do not agree with them that they actually conducted statistical uncertainty analysis because they basically tried to identify "optimal parameter ranges" instead of identifying how much uncertainty exists in the model outputs (e.g., 95% confidence interval of the simulated streamflow). I know that they plotted the boxplot of IHAs, but IHAs are an indirect indicator of the model performance because streamflow information is aggregated into these indices. That is one type of uncertainty propagation from streamflow to IHAs. I cannot find these discussions. I believe that they need to address these major comments first to make the manuscript publishable in the Water journal. Please also find my detailed comments below.


Detail Comments

L35: It would make the sentence smoother to remove "however" because it emphasizes the preceding sentence rather than contradicting it.
L36: "error" => "errors"
L41: "model" => "models"
L49: There is a recent study using GLUE published in the Water journal. Please consider referring to "Huidae Cho, Jeongha Park, Dongkyun Kim, March 2019. Evaluation of Four GLUE Likelihood Measures and Behavior of Large Parameter Samples in ISPSO-GLUE for TOPMODEL. Water 11 (3), 447. doi:10.3390/w11030447."
L58-61: The authors need to justify this sentence and convince their audience. Why do they believe that the current uncertainty estimation methods are "excessively" academic to be practical? Personally, I do not see any problems in applying those uncertainty approaches in real scenarios. The bigger challenge would be to shift modelers' focus from "optimization" or "calibration" to "uncertainty estimation," which basically calls for the change of modeling paradigms.
L63: "appropriate rainfall-runoff model" => "an appropriate rainfall-runoff model"
L70: "Catchment" => "Catchments"
L110: "The range" => "The ranges"
L112: "various version" => "various versions"
L131: "that is R based" => "that is an R-based"
L136: "extracted" => "obtained"
L138: "more samples near the optimal value" Would not this behavior produce biased samples that over-emphasize optimal parameter values? Please refer to the publication by Cho et al. (2019) mentioned earlier. They discuss sampling bias in uncertainty estimation.
L140: "near the optimal likelihood value with high likelihoods that provides" => "near optimal likelihood values that provide a"
L151: Remove "respectively" because both models use the same number of function evaluations.
L152: "Dotty Plot" => "Dotty Plots"
L161: "A good simulation results is when" => "Good simulation results are when"
L162: A threshold value for the NSE highly depends on the study area, data, and the model used. For example, when the observed streamflow is highly dynamic, the denominator in Eq. (2) can become relatively bigger than its numerator, which in effect, increases the NSE without simulation being improved at all. This observation is discussed in Cho et al. (2019).
L179: I would argue that it is not just the structural uncertainty, but it can also be the different nature of observed data in different periods (e.g., wet vs. dry).
L182: "Rainfall-Runoff Model Structure" => "the Rainfall-Runoff Model Structure"
L196: "annual IHAs were averaged over a decade" Does it mean that there are four IHAs from 1970 to 2009 generated using one parameter set?
L213: "so they are not possible when using actual data" This sentence needs more explanations. Does it mean that they did not use "actual" data? Did they use synthetic data?
L216: "returning N/A" If any model returns N/A for whatever reasons, modelers need to figure out why this is happening and fix the issue first.
L219: "is abnormal" Please explain why it is considered abnormal. Multi-modal distributions are very common.
L219-222: The authors narrowed down the range of the lztwm parameter after the initial sampling. What justifies this approach? Are they trying to find "optimal" parameter ranges or the "uncertainty bounds" of model results conditional to a priori parameter distributions? If the range of red dots in Figure 3 is very narrow, why do they not take it as uncertainty in the model structure (e.g., lztwm parameterization)? If it is not the case, I would argue that the initial range of the lztwm parameter was physically invalid.
L223: "uncertainty in the parameter range" Any models should have reasonable parameter ranges. This study should not be about narrowing down the range of parameters, which is more of an optimization or parameter identification problem than an uncertainty estimation problem.
L229: "was propagated" I would not say it was propagated because the other parameters are not direct outputs generated by the lztwm parameter, like the model results. All these parameters are "correlated" and that is why the landscapes of dotty plots got clearer when those samples in the narrow range of the lztwm parameter were rid of. Propagation is more of the results of partial derivatives. For example, partial output per partial lztwm or, in general, the impact of parameter changes on outputs, not "between" parameters.

Author Response

Thank you for the comment. The purpose of this paper is not to find the statistical confidence interval of the model output, as in Cho et al. (2019)'s study. The purpose of this paper is to investigate the uncertainty of the hydrologic models using equally good parameter values. We have analyzed the uncertainty of the models by examining the distribution of equifinal parameter values and examining the ranges of NSE and IHA values by simulation period. Therefore, we did not present statistical confidence intervals of the modeling results in this paper.

Huidae Cho, Jeongha Park, Dongkyun Kim, March 2019. Evaluation of Four GLUE Likelihood Measures and Behavior of Large Parameter Samples in ISPSO-GLUE for TOPMODEL. Water 11 (3), 447. doi:10.3390/w11030447.

 

Detail Comments

L35: It would make the sentence smoother to remove "however" because it emphasizes the preceding sentence rather than contradicting it.

Response: Thank you for the comment. We removed "however" from the sentence.

 

L36: "error" => "errors"

Response: Thank you for the comment. We modified the sentence.

 

L41: "model" => "models"

Response: Thank you for the comment. We modified the sentence.

 

L49: There is a recent study using GLUE published in the Water journal. Please consider referring to "Huidae Cho, Jeongha Park, Dongkyun Kim, March 2019. Evaluation of Four GLUE Likelihood Measures and Behavior of Large Parameter Samples in ISPSO-GLUE for TOPMODEL. Water 11 (3), 447. doi:10.3390/w11030447."

Response: Thank you for the comment. We added the reference to the manuscript.

 

L58-61: The authors need to justify this sentence and convince their audience. Why do they believe that the current uncertainty estimation methods are "excessively" academic to be practical? Personally, I do not see any problems in applying those uncertainty approaches in real scenarios. The bigger challenge would be to shift modelers' focus from "optimization" or "calibration" to "uncertainty estimation," which basically calls for the change of modeling paradigms.

Response: Thank you for the comment. We revised the sentence.

 

L63: "appropriate rainfall-runoff model" => "an appropriate rainfall-runoff model"

Response: Thank you for the comment. We modified the sentence.

 

L70: "Catchment" => "Catchments"

Response: Thank you for the comment. We have modified the subtitle.

 

L110: "The range" => "The ranges"

Response: Thank you for the comment. We modified the sentence.

 

L112: "various version" => "various versions"

Response: Thank you for the comment. We modified the sentence.

 

L131: "that is R based" => "that is an R-based"

Response: Thank you for the comment. We modified the sentence.

 

L136: "extracted" => "obtained"

Response: Thank you for the comment. We modified the sentence.

 

L138: "more samples near the optimal value" Would not this behavior produce biased samples that over-emphasize optimal parameter values? Please refer to the publication by Cho et al. (2019) mentioned earlier. They discuss sampling bias in uncertainty estimation.

Response: Thank you for the comment. In this regard, we now refer to Cho et al. (2019)'s work in Section 3.1.

 

L140: "near the optimal likelihood value with high likelihoods that provides" => "near optimal likelihood values that provide a"

Response: Thank you for the comment. We modified the sentence.

 

L151: Remove "respectively" because both models use the same number of function evaluations.

Response: Thank you for the comment. We modified the sentence.

 

L152: "Dotty Plot" => "Dotty Plots"

Response: Thank you for the comment. We have modified the subtitle.

 

L161: "A good simulation results is when" => "Good simulation results are when"

Response: Thank you for the comment. We modified the sentence.

 

L162: A threshold value for the NSE highly depends on the study area, data, and the model used. For example, when the observed streamflow is highly dynamic, the denominator in Eq. (2) can become relatively bigger than its numerator, which in effect, increases the NSE without simulation being improved at all. This observation is discussed in Cho et al. (2019).

Response: Thank you for the comment. We eliminated this sentence because a threshold for a good NSE value was not used in this study.
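For readers following this exchange, the standard form of the NSE is shown below; we assume it corresponds to Eq. (2) of the manuscript. It makes the reviewer's point concrete: highly dynamic observed flows enlarge the denominator, which can inflate the NSE without the simulation improving.

```latex
\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T}\left(Q_{\mathrm{obs},t} - Q_{\mathrm{sim},t}\right)^{2}}{\sum_{t=1}^{T}\left(Q_{\mathrm{obs},t} - \overline{Q}_{\mathrm{obs}}\right)^{2}}
```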

 

L179: I would argue that it is not just the structural uncertainty, but it can also be the different nature of observed data in different periods (e.g., wet vs. dry).

Response: Thank you for the comment. NSE values may differ due to the different nature of the observed data in different periods (e.g., wet vs. dry). For example, the NSE value for the calibration period may be 0.8 while the NSE value for the validation period is 0.7. However, what we claim is that if the range of NSE values produced by the equifinal sample sets for the calibration period is much different from the range of NSE values for the validation period, this may be mainly due to structural problems of the model. For example, if the range of NSE values in the calibration period is 0.75-0.85 and the range in the validation period is 0.65-0.75, the two ranges have a similar width, so the model has low uncertainty and can predict runoff stably. The results for these ranges are described in detail in Section 4.2. We have added more explanation in Section 3.3 to clarify this.

 

L182: "Rainfall-Runoff Model Structure" => "the Rainfall-Runoff Model Structure"

Response: Thank you for the comment. We have modified the subtitle.

 

L196: "annual IHAs were averaged over a decade" Does it mean that there are four IHAs from 1970 to 2009 generated using one parameter set?

Response: Thank you for the comment. Yes, and we have provided additional explanations.
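As an illustration of the decade averaging discussed here, the following minimal Python sketch (with hypothetical placeholder values, not the authors' data) shows how one annual IHA series from 1970 to 2009, produced by a single parameter set, reduces to four decade-averaged values.

```python
import numpy as np

# Hypothetical example: one annual IHA value per year, 1970-2009,
# produced by a single parameter set.
years = np.arange(1970, 2010)
annual_iha = np.random.rand(years.size)  # placeholder values

# Averaging over each decade yields four values (1970s, 1980s, 1990s, 2000s).
decade_means = [annual_iha[(years >= d) & (years < d + 10)].mean()
                for d in (1970, 1980, 1990, 2000)]
print(decade_means)  # four decade-averaged IHA values for this parameter set
```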

 

L213: "so they are not possible when using actual data" This sentence needs more explanations. Does it mean that they did not use "actual" data? Did they use synthetic data?

Response: Thank you for the comment. We used actual observed data. The sentence means that such values cannot occur when actual data are used. We have provided additional explanations in this paragraph to clarify this.

 

L216: "returning N/A" If any model returns N/A for whatever reasons, modelers need to figure out why this is happening and fix the issue first.

Response: Thank you for the comment. We have added an explanation for this.

 

L219: "is abnormal" Please explain why it is considered abnormal. Multi-modal distributions are very common.

Response: Thank you for the comment. Not only is this multi-modal distribution perfectly separated, but the best 100 values are almost perfect, so we consider this to be abnormal. In order to clarify this, we added an additional explanation to the preceding sentence.

 

L219-222: The authors narrowed down the range of the lztwm parameter after the initial sampling. What justifies this approach? Are they trying to find "optimal" parameter ranges or the "uncertainty bounds" of model results conditional to a priori parameter distributions? If the range of red dots in Figure 3 is very narrow, why do they not take it as uncertainty in the model structure (e.g., lztwm parameterization)? If it is not the case, I would argue that the initial range of the lztwm parameter was physically invalid.

Response: Thank you for the comment. We removed the abnormal parameter range and re-analyzed the results in order to investigate the abnormal behavior of the lztwm parameter. This behavior may be caused by parameter uncertainty or by uncertainty in the model structure, which is why we performed the uncertainty propagation analysis in Section 4.1. This problem did not occur under other conditions, so the initial range of this parameter is not always physically invalid. To clarify this, we have provided additional explanation in the paragraph.

 

L223: "uncertainty in the parameter range" Any models should have reasonable parameter ranges. This study should not be about narrowing down the range of parameters, which is more of an optimization or parameter identification problem than an uncertainty estimation problem.

Response: Thank you for the comment. Narrowing down the range of the lztwm parameter is required to investigate the uncertainty in the parameter or in the model structure, so this step is necessary in this study.

 

L229: "was propagated" I would not say it was propagated because the other parameters are not direct outputs generated by the lztwm parameter, like the model results. All these parameters are "correlated" and that is why the landscapes of dotty plots got clearer when those samples in the narrow range of the lztwm parameter were rid of. Propagation is more of the results of partial derivatives. For example, partial output per partial lztwm or, in general, the impact of parameter changes on outputs, not "between" parameters.

Response: Thank you for the comment. We removed "propagate" and modified the sentence.

Round 2

Reviewer 2 Report

I see the authors have addressed almost all my comments properly. My only comment is that (as I mentioned in the previous round of the revision) I would like the authors to explicitly mention in the introduction that their work is focused only on the parameter uncertainty (not all sources of uncertainty). After this minor revision, I believe the manuscript will be appropriate for publication.

Author Response

I see the authors have addressed almost all my comments properly. My only comment is that (as I mentioned in the previous round of the revision) I would like the authors to explicitly mention in the introduction that their work is focused only on the parameter uncertainty (not all sources of uncertainty). After this minor revision, I believe the manuscript will be appropriate for publication.

Response: Thank you for the comment. We have added this to the last paragraph of the introduction section.

Reviewer 3 Report

I appreciate the authors' responses. I added some discussions below.


I suggest including the definitions of the IHAs as equations in the text for an audience not familiar with them.


L179: I would argue that it is not just the structural uncertainty, but it can also be the different nature of observed data in different periods (e.g., wet vs. dry).

Response: Thank you for the comment. NSE values may differ due to the different nature of the observed data in different periods (e.g., wet vs. dry). For example, the NSE value for the calibration period may be 0.8 while the NSE value for the validation period is 0.7. However, what we claim is that if the range of NSE values produced by the equifinal sample sets for the calibration period is much different from the range of NSE values for the validation period, this may be mainly due to structural problems of the model. For example, if the range of NSE values in the calibration period is 0.75-0.85 and the range in the validation period is 0.65-0.75, the two ranges have a similar width, so the model has low uncertainty and can predict runoff stably. The results for these ranges are described in detail in Section 4.2. We have added more explanation in Section 3.3 to clarify this.


Comment: Similarly high NSE values do not necessarily mean that the model performs equally well in both calibration and validation periods, because the NSE tends to be affected by the dynamics of the observed data, not just by the model performance. A model that performs well in a wet calibration period might produce a high NSE for a dry period that it poorly simulates (most likely by overestimation) with a couple of peak flows, or vice versa. Can we still claim that the model performs with stability? I still believe that an analysis of model reliability or uncertainty should assess how accurate the model results are compared to the observed data, not just how narrowly whatever metrics based on simulated data are distributed (e.g., NSE with simulated and observed, but aggregated, data, and IHAs with simulated data only, if I'm not mistaken). Have you compared the simulated IHAs with the observed IHAs?


Your approach can be summarized as:

1. Find equally good parameter sets

2. Assess the distribution of these parameter sets, NSE, and IHAs.

3. Lower variability in NSE or IHAs (maybe, parameter distributions also?) means lower parameter uncertainty.


How good these equifinal parameter sets are was evaluated only using the NSE, and the simulated IHAs did not seem to be compared to the observed IHAs. Again, the NSE has a weakness that can artificially inflate the model performance. The authors may not need to construct the statistical confidence interval of the hydrograph for their purposes, but a low variability of the NSE does not guarantee the stability of models. This is why it would be great to compare IHAs between simulated and observed data.

 


L213: "so they are not possible when using actual data" This sentence needs more explanations. Does it mean that they did not use "actual" data? Did they use synthetic data?

Response: Thank you for the comment. We used actual observed data. The sentence means that such values cannot occur when actual data are used. We have provided additional explanations in this paragraph to clarify this.

 

L216: "returning N/A" If any model returns N/A for whatever reasons, modelers need to figure out why this is happening and fix the issue first.

Response: Thank you for the comment. We have added an explanation for this.


Comment: OK, the data they used is "actual" data; only some part of simulated data is N/A. Based on these two responses, I assume that those 100 best models "failed" to simulate the entire calibration period with some time steps almost being perfect, but with others being N/A. If this is the case, I would not call these models "best" models and not include them in the analysis. Anyway, I believe that the authors got rid of these models later (?) to fix this issue. Please confirm this.

 

L219: "is abnormal" Please explain why it is considered abnormal. Multi-modal distributions are very common.

Response: Thank you for the comment. Not only is this multi-modal distribution perfectly separated, but the best 100 values are almost perfect, so we consider this to be abnormal. In order to clarify this, we added an additional explanation to the preceding sentence.


Comment: Again, as far as I understand, those 100 best models were not really almost perfect because the simulated data included N/A. The NSE would be N/A, not close to 1.



Author Response

Reviewer 3:

Comments and Suggestions for Authors

 

I appreciate the authors' responses. I added some discussions below.

 

I suggest including the definitions of the IHAs as equations in the text for an audience not familiar with them.

Response: Thank you for the comment. The six IHAs used in this study are described in Section 3.4, and we believe it is appropriate to describe them in the text rather than as equations because their concepts are simple.

 

 

L179: I would argue that it is not just the structural uncertainty, but it can also be the different nature of observed data in different periods (e.g., wet vs. dry).

Response: Thank you for the comment. NSE values may differ due to the different nature of the observed data in different periods (e.g., wet vs. dry). For example, the NSE value for the calibration period may be 0.8 while the NSE value for the validation period is 0.7. However, what we claim is that if the range of NSE values produced by the equifinal sample sets for the calibration period is much different from the range of NSE values for the validation period, this may be mainly due to structural problems of the model. For example, if the range of NSE values in the calibration period is 0.75-0.85 and the range in the validation period is 0.65-0.75, the two ranges have a similar width, so the model has low uncertainty and can predict runoff stably. The results for these ranges are described in detail in Section 4.2. We have added more explanation in Section 3.3 to clarify this.

Comment: Similarly high NSE values do not necessarily mean that the model performs equally well in both calibration and validation periods, because the NSE tends to be affected by the dynamics of the observed data, not just by the model performance. A model that performs well in a wet calibration period might produce a high NSE for a dry period that it poorly simulates (most likely by overestimation) with a couple of peak flows, or vice versa. Can we still claim that the model performs with stability? I still believe that an analysis of model reliability or uncertainty should assess how accurate the model results are compared to the observed data, not just how narrowly whatever metrics based on simulated data are distributed (e.g., NSE with simulated and observed, but aggregated, data, and IHAs with simulated data only, if I'm not mistaken). Have you compared the simulated IHAs with the observed IHAs?

Your approach can be summarized as:
1. Find equally good parameter sets
2. Assess the distribution of these parameter sets, NSE, and IHAs.
3. Lower variability in NSE or IHAs (maybe, parameter distributions also?) means lower parameter uncertainty.

How good these equifinal parameter sets are was evaluated only using the NSE, and the simulated IHAs did not seem to be compared to the observed IHAs. Again, the NSE has a weakness that can artificially inflate the model performance. The authors may not need to construct the statistical confidence interval of the hydrograph for their purposes, but a low variability of the NSE does not guarantee the stability of models. This is why it would be great to compare IHAs between simulated and observed data.

Response: Thank you for the comment. The purpose of this study is not to find parameters with high NSE values for the calibration and validation periods, because, as you mentioned, the NSE value is affected by the dynamics of the observed data.
As described in Section 3.3, this study first calculates a range of NSE values by extracting 100 equally good parameter sample sets for the calibration period. This is related to the identifiability of the parameters, as described in Section 3.2 and Figure 5. One may use model performance statistics other than the NSE, but as described in Section 3.2, we used the NSE because of its popularity. The 100 parameter sample sets are then applied to the validation period to calculate the range of NSE values. Due to the dynamics of the observed data, the NSE values during the validation period may be low and may differ from those in the calibration period, and that does not matter. However, the width of the range of the 100 NSE values in the calibration period (e.g., 0.1 for an NSE range of 0.75-0.85) should be similar to the width of the range of the 100 NSE values in the validation period (e.g., 0.1 for an NSE range of 0.65-0.75). This is because, if the parameters and the model structure are well defined, the widths of the NSE ranges in the calibration and validation periods should be similar, since we use 100 equally good parameter sample sets in the same model structure. The ability to simulate the validation period stably is therefore judged not by the magnitude of the NSE values but by the width of their range, because, as you mentioned, the NSE value itself is affected by the dynamics of the observed data. As shown in this study, the GR4J and IHACRES models mostly have similar widths of NSE and IHA ranges for the calibration and validation periods. However, in the case of the Sacramento model, the widths of the NSE and IHA ranges differ between the calibration and validation periods because of the uncertainty of the parameters or of the model structure. Therefore, the method proposed in this study is a relatively simple way to check the uncertainty of these hydrologic models. We have added more explanation in the second paragraph of Section 3.3 to clarify this.
Comparisons of simulated and observed IHAs are already described in Section 4.3.
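To make the screening idea concrete, here is a minimal Python sketch of the comparison described above. It is our paraphrase of the procedure, not the authors' code; the observed and simulated flow arrays are hypothetical placeholders.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency of one simulated series against observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def nse_range_width(obs, sims):
    """Width (max - min) of the NSE range over a set of equifinal simulations."""
    scores = np.array([nse(obs, s) for s in sims])
    return scores.max() - scores.min()

# Hypothetical usage: sims_cal and sims_val hold the runs of the 100 best
# parameter sets for the calibration and validation periods, respectively.
# width_cal = nse_range_width(obs_cal, sims_cal)
# width_val = nse_range_width(obs_val, sims_val)
# Similar widths indicate stable behaviour across periods; a much wider
# validation range flags parameter or structural uncertainty.
```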

 

 

L213: "so they are not possible when using actual data" This sentence needs more explanations. Does it mean that they did not use "actual" data? Did they use synthetic data?

Response: Thank you for the comment. We used actual observed data. The sentence means that such values cannot occur when actual data are used. We have provided additional explanations in this paragraph to clarify this.

L216: "returning N/A" If any model returns N/A for whatever reasons, modelers need to figure out why this is happening and fix the issue first.

Response: Thank you for the comment. We have added an explanation for this.

Comment: OK, the data they used is "actual" data; only some part of simulated data is N/A. Based on these two responses, I assume that those 100 best models "failed" to simulate the entire calibration period with some time steps almost being perfect, but with others being N/A. If this is the case, I would not call these models "best" models and not include them in the analysis. Anyway, I believe that the authors got rid of these models later (?) to fix this issue. Please confirm this.

Response: Thank you for the comment. We removed these models to fix the problem shown in Figure 3, and the corrected results are shown in Figure 4. We did not use the problematic models in Figure 3 for later analysis. We have added explanation in Section 4.1 to clarify this.

 

 

L219: "is abnormal" Please explain why it is considered abnormal. Multi-modal distributions are very common.

Response: Thank you for the comment. Not only is this multi-modal distribution perfectly separated, but the best 100 values are almost perfect, so we consider this to be abnormal. In order to clarify this, we added an additional explanation to the preceding sentence.

Comment: Again, as far as I understand, those 100 best models were not really almost perfect because the simulated data included N/A. The NSE would be N/A, not close to 1.

Response: Thank you for the comment. As you mentioned, these models are not really almost perfect, because N/A values are included in the simulated data; the 100 best models have almost perfect NSE values only because the NSE was calculated from the simulated data that did not include N/A. If N/A were included in the NSE calculation, the NSE would be N/A. We did not use these 100 best models, because the simulated data include N/A, as answered in the previous question. To clarify this, we modified the sentence in the first paragraph of Section 4.1 as follows: "The NSE value was close to 1 because it was calculated only for the simulated runoff not including N/A."
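The following minimal Python sketch (our illustration, not the authors' code) shows how an NSE computed only over non-N/A time steps can come out close to 1 even when much of the simulation failed.

```python
import numpy as np

def nse_ignoring_na(obs, sim):
    """NSE computed only over time steps where the simulated value is not N/A.

    If long stretches of sim are NaN, the few remaining steps may fit well,
    which is how an NSE near 1 can arise from a largely failed simulation.
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    mask = ~np.isnan(sim)
    if not mask.any():
        return np.nan  # the whole simulation is N/A
    o, s = obs[mask], sim[mask]
    return 1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2)
```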

 

Author Response File: Author Response.docx

Round 3

Reviewer 3 Report

The authors responded to my comments satisfactorily. I think that the major confusion was due to the fact that they studied the identifiability of model parameters using the NSE rather than using a pre-defined set of "true" parameter values, which is very common in this type of study.


Minor comments

L188: "the size the range of NSE values" => "the size of the range of NSE values"

Author Response

Minor comments

L188: "the size the range of NSE values" => "the size of the range of NSE values"

Response: Thank you for the comment. We have modified this sentence.

Author Response File: Author Response.docx
