Article
Peer-Review Record

An Innovative Intelligent System with Integrated CNN and SVM: Considering Various Crops through Hyperspectral Image Data

ISPRS Int. J. Geo-Inf. 2021, 10(4), 242; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi10040242
by Shiuan Wan 1, Mei-Ling Yeh 2,* and Hong-Lin Ma 3
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 22 January 2021 / Revised: 4 April 2021 / Accepted: 6 April 2021 / Published: 7 April 2021

Round 1

Reviewer 1 Report

  1. The authors must reference the literature used.
  2. The paper must be edited by an English-language editor.
  3. The paper does not flow properly (one of the reasons is the language).
  4. The method is not clearly stated.
  5. The paper could be quite interesting if the language issues are addressed and it flows properly.

 

Comments for author File: Comments.pdf

Author Response

Dear reviewer

 

First of all, I deeply appreciate the time you took to review the paper and your many helpful suggestions. This paper was written with the aim of combining the strengths of SVM and CNN. The paper is very interesting but not easy to understand because it is somewhat complicated. Let me explain it in detail.

First, we have a hyperspectral image containing many kinds of crops. The SVM is used for pixel-based classification and the CNN for region-based classification. We used the SVM first and then the CNN. However, pixel-based classification has the disadvantage of producing a salt-and-pepper effect in the thematic map. Thus, we developed the CNN approach to fix the salt-and-pepper effect. Please see Figure 3 for the steps of the entire paper.

A fundamental question arises: why not use the CNN directly to solve the classification problem? Because the CNN is very time-consuming. Therefore, we use the output of the CNN to repair the errors of the SVM. How is the repair done? Please see Table 2 and Table 5 for an explanation. Potatoes are used as an example: the CNN classifies potatoes with 100% accuracy and removes the errors of the SVM. Based on this method, PCA feature selection can use the smallest number of features to produce perfect classification performance (please see Table 7).

I hope the above description helps you to understand this paper. Although CNN and SVM are not new technologies, this brand-new idea makes the paper interesting, and it makes a good scientific contribution.

 

Milly
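
The two-stage idea described in the letter above (pixel-based SVM first, then region-based CNN output used to repair SVM errors) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' repair model: it assumes every pixel carries a region id, that the CNN yields one label per region, and that a repair simply overwrites the SVM label with the region label. The function `repair_labels`, the toy grids, and the label "other" are hypothetical.

```python
def repair_labels(svm_labels, region_ids, cnn_region_labels):
    """Overwrite pixel-level SVM labels with the CNN label of the enclosing region.

    Pixels whose region has no CNN label keep their SVM label.
    """
    repaired = []
    for label_row, region_row in zip(svm_labels, region_ids):
        repaired.append([
            cnn_region_labels.get(region, label)
            for label, region in zip(label_row, region_row)
        ])
    return repaired


# Toy 2x3 thematic map from a pixel-based classifier (hypothetical values).
svm_labels = [
    ["potato", "potato", "cabbage"],
    ["potato", "other", "cabbage"],  # "other" is an isolated salt-and-pepper pixel
]
# Region (field) id of every pixel, e.g. from a segmentation step (assumed).
region_ids = [
    [0, 0, 1],
    [0, 0, 1],
]
# One region-based label per field, standing in for the CNN output (assumed).
cnn_region_labels = {0: "potato", 1: "cabbage"}

repaired = repair_labels(svm_labels, region_ids, cnn_region_labels)
```

In this toy case the isolated pixel inside field 0 takes the field's label, which is the kind of salt-and-pepper cleanup the letter describes.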

 

We do appreciate that you provided the PDF file of the article. Indeed, it is very helpful.

The authors must reference the literature used

We followed your instructions and revised the paper. Please see the attached PDF file.

The paper must be edited by an English-language editor

The paper was edited by a native English editor before the first submission. Dr. Wan also teaches English technical writing at university and has published more than 30 international papers.

The paper does not flow properly

The paper is very interesting but not easy to understand because it is somewhat complicated. I have written the letter above and hope you can read and follow it.

The method is not clearly stated

We did our best to present the two approaches, CNN and SVM, individually. All the steps of the paper are listed in Figure 3.

The paper could be quite interesting if the language issues are addressed and it flows properly

The paper presents a brand-new idea on how to combine CNN and SVM. It is brilliant and makes an important scientific contribution.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

In Abstract, pag. 1, rows: 12-14: the sentence „The study plans to develop a variety of crop classification system to identify multiple major crops in the Chiayi Golden Corridor in multi-objective decision-making.” is unclear; please explain more clearly the main goal of the research.

In chp. 1 Introduction , pag. 1, rows 33-34: in the sentence ”Therefore, the spatial resolution of 6m to 40m satellite image data in the target area of common format is very hard to use.” please give some examples of satellite systems with 6 to 40 m spatial resolution. Please consider also the available satellite high resolution systems like: Jilin 1 (2019), Worldview 3 (2014), OHS (2018), etc.

In chp. 1 Introduction , pag. 1, rows 38-39: The sentence „The hyperspectral images have improved much image spatial resolution and image bands (or call image spectrum)”; the term „spectral resolution” (defined as the span of the wavelength over which a spectral channel operates by the sensor) is more appropriate.

In chp. 1 Introduction , pag. 1, rows 40-42: please verify the grammar of the sentence „However, the complicated hyperspectral image information has resulted in the use of traditional classifiers fail to obtain good classification accuracy”, and correct it.

In chp. 1 Introduction , pag. 2, rows 63-64: the sentence „Principal component analysis (PCA) is a technique to compute the principal components and transferring them to display a change of basis on the data”. Please consider also that the PCA is a technique for reducing the dimensionality of large datasets, increasing interpretability but at the same time minimizing information loss.

In chp. 1 Introduction , pag. 2, rows 40-42: the sentence ”The accuracy can effectively improve on how to effectively use the hyperspectral data to perform effective analysis on image classification” is unclear; please clarify it.

In chp. 1 Introduction , pag. 2, rows 75-77: in the sentence „Specifically, given a set of training instances, each training instance will mark as one of two categories or the other, the SVM training algorithm establishes a model for a new instance assigned to one of the two categories”, please explain what is the meaning of „each training instance will mark as one of two categories or the other”.

In chp. 1 Introduction , pag. 3, rows 100-101: in the sentence ”The study proposes a brand new idea that combines SVM and CNN to handle the multi-categories classification in remote sensing”, please consider the term „multiclass or multinomial classification” instead of multi-categories classification.

In chp. 2. Materials, 2.2. Brief Introduction on Image Format, pag. 3, rows 128-129: in the sentence „After the creation of the sampling, there are 1000 sample points by the hyperspectral data on the left, 500 points are trained, 500 points are tested”, please explain what are the „1000 sample points by the hyperspectral data on the left”?

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3, rows 137-139: The information included in the sentence „In 2016, the spectral image used in this study provides the Compact Airborne Spectrographic Imager (CASI) hyperspectral image for Chung Hsing Surveying Company of Taiwan” is already presented in the 2.2. subchp. (rows 123-125). Please avoid duplication.

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3-4, rows 140-142: Similar to the previous comment, the information given in the sentences „The image scanning system CASI-1500 was manufactured by THE Canadian COMPANY ITRES and the CASI-1500 instrument specifications have spectral wavelengths ranging from 365nm to 1050nm. It is equivalent to a visible to near-infrared light band range of 1m spatial resolution with 72 bands.” was already presented in subchp. 2.2, with a slight difference for the spectral domain: 365nm to 1050nm versus 380nm to 1050nm. Please correct.

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3-4, rows 151-152: „The hyperspectral image in this study is taken in January and April 2016 as shown in Figure 1(a).” Please specify what is the acquisition date of the image presented in fig. 1 (a).

In chp. 3. Research Method, 3.1. Study Plan, pag.4, rows 164-166: The sentence „As observing the remote sensing datasets, the convolutional neural networks (CNNs) have become the state-of-the-art approaches for object classification in images.” is only a general statement and I suggest it be removed.

In chp. 3. Research Method, 3.2. Brief on Support Vector Machine, pag.5, rows 202-203: The sentence ”A soft margin problem for the case of SVM’s is to handle the linearly non-separable data first by Cortes and Vapnik (1995).” is unclear; please clarify it.

In chp. 3. Research Method, 3.2. Brief on Support Vector Machine, pag.5, row 204: „In ξi ≥0 this case, SVMs algorithm...”, please define the parameter ξi.

In chp. 3. Research Method, the nr. of the subchp. „Brief on Deep Learning for Recognition and Identification” is wrong (has to be 3.3) ; please correct it.

In chp. 3. Research Method, subchp. Brief on Deep Learning for Recognition and Identification, pag.7, rows 236-238: The sentence „One way to characterize the image is to filter the image to get more useful  information, such as using the edge (Edge Detection) detection of the derivative (mask).” is unclear; please make it more clear.

In chp. 3. Research Method,pag. 8, row 282: the nr. and the name of the subchp. is wrong; please correct it. Please verify and correct the numbering and titles of the subchapters of chp. 3.

In chp. 3. Research Method, pag. 8, rows: 283-286”: in the sentence „The whole research plan is divided into four steps (see Figure 3) (1) material preparation for PCA ttribute selection (2) support vector machine before processing (3) convolutional neural network reprocessing detail class (4) to establish tree classification and layer rules, summarized as follows”; the steps presented are not clearly found in figure 3. Research Steps of Study. Please correct this.

In chp. 3. Research Method, pag. 8, row 288: Please include in the manuscript text a more detailed description of the Figure 3. Research Steps of Study.

In chp. 3. Research Method, pag. 8, rows 290-292: the sentence „Figure 4 (a) is the Regional Object Classification model (ROC) [32] program that selects a combination set of parameters of the seeds Area(A) and Similarity(S).” is unclear...I suggest „Figure 4 (a) shows how the Regional Object Classification model (ROC) [32] program is able to select a combination set of parameters ....”.

In chp. 3. Research Method, pag. 8, rows 292-293: the sentence „The red line is the linear regression function and it will increase to the surrounding levees for each margin of sides as an integrated patch.” is unclear... what are the „margin of sides”?

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, rows 306-308: in the sentence ”The user has to adjust the kernel function and parameters according to the situation, which will have a significant impact on the prediction accuracy rate.” please explain in a more concrete way the „situation”.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 314: please rephrase the sentence „This step is to optimize the optimal classification model obtained in the previous step.”... to optimize the optimal?

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 321-322: please correct the sentence „Table 2 confusion matrix of the results of SVM.”; Table 2 is the...

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 336-337: the sentence „The PCA takes 8, 16, 24 three different combinations to access the accuracy.” is unclear, please explain more in details the procedure.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 10, row  337: the specification „Our development program CNN is written in Python.” was already presented in the previous part of the manuscript and can be deleted.

In chp. 4. Discussion of the Results, 4.2. Second Stage: Improvement Classification of CNN, pag. 13, rows 366-367: in the sentence ”Only the cabbage fields have one misjudgment by omission error. The accuracy rate is 98.6%.” there is an inconsistency between the text and Table 5. „Confusion matrix of CNN by PCA 24 epoch 30”... cabbage or potatoes? Please verify and correct also the accuracy rate.

In chp. 4. Discussion of the Results, 4.2. Second Stage: Improvement Classification of CNN, pag. 14, rows 376-377: The sentence: „In other words, CNN performs well for a certain number of features [34-36]”; please explain more in detail this conclusion.

In chp. 5. Summary and Conclusion, pag. 15, rows 420-421: in the sentences: „Three different cases are considered: PCA=8, epoch=30, the accuracy is 96%. PCA=16, epoch=30 the accuracy is 98%. PCA=24, epoch=30, the accuracy is 98.6%”, please verify the accuracy values.

In chp. 5. Summary and Conclusion, pag. 15, rows 429-431: the sentence „It is expected that the pros of Support Vector Machine (SVM) for hyperspectral image classification with good results and deep learning (Convolution Neural Network; CNN) can improve the classification of image details.” is unclear, please rephrase it.

Author Response

Dear reviewer

 

First of all, I deeply appreciate the time you took to review the paper and your many helpful suggestions. This paper was written with the aim of combining the strengths of SVM and CNN. The paper is very interesting but not easy to understand because it is somewhat complicated. Let me explain it in detail.

First, we have a hyperspectral image containing many kinds of crops. The SVM is used for pixel-based classification and the CNN for region-based classification. We used the SVM first and then the CNN. However, pixel-based classification has the disadvantage of producing a salt-and-pepper effect in the thematic map. Thus, we developed the CNN approach to fix the salt-and-pepper effect. Please see Figure 3 for the steps of the entire paper.

A fundamental question arises: why not use the CNN directly to solve the classification problem? Because the CNN is very time-consuming. Therefore, we use the output of the CNN to repair the errors of the SVM. How is the repair done? Please see Table 2 and Table 5 for an explanation. Potatoes are used as an example: the CNN classifies potatoes with 100% accuracy and removes the errors of the SVM. Based on this method, PCA feature selection can use the smallest number of features to produce perfect classification performance (please see Table 7).

I hope the above description will help you to understand this paper. CNN and SVM are not new technologies; however, this brand-new idea makes the paper interesting, and it makes a good scientific contribution.

 

Milly

 

 

Please see the red text in the new manuscript.

In Abstract, pag. 1, rows: 12-14: the sentence „The study plans to develop a variety of crop classification system to identify multiple major crops in the Chiayi Golden Corridor in multi-objective decision-making.” is unclear; please explain more clearly the main goal of the research.

We changed it to:

The study planned to develop a multi-category crop hyperspectral image classification system for identifying the major crops in the Chiayi Golden Corridor.

In chp. 1 Introduction , pag. 1, rows 33-34: in the sentence ”Therefore, the spatial resolution of 6m to 40m satellite image data in the target area of common format is very hard to use.” please give some examples of satellite systems with 6 to 40 m spatial resolution. Please consider also the available satellite high resolution systems like: Jilin 1 (2019), Worldview 3 (2014), OHS (2018), etc.

We have a series of studies that use WorldView imagery (by Dr. Wan):

11.     Wan, S., and S. H. Chang. 2019. “Crop Classification with WorldView-2 Imagery Using Support Vector Machine Comparing Texture Analysis Approaches and Grey Relational Analysis in Jianan Plain, Taiwan.” International Journal of Remote Sensing 1–17

We wanted to try a different image format.

This research used a hyperspectral image format, which makes the classification better.

Thank you very much for suggesting Jilin 1. We will use it in our next study.

In chp. 1 Introduction , pag. 1, rows 38-39: The sentence „The hyperspectral images have improved much image spatial resolution and image bands (or call image spectrum)”; the term „spectral resolution” (defined as the span of the wavelength over which a spectral channel operates by the sensor) is more appropriate

We appreciate your suggestion and have followed it.

In chp. 1 Introduction , pag. 1, rows 40-42: please verify the grammar of the sentence „However, the complicated hyperspectral image information has resulted in the use of traditional classifiers fail to obtain good classification accuracy”, and correct it.

Thank you. We changed it to:

However, the hyperspectral image information which is complicated for traditional classifiers fail to obtain good classification accuracy.

In chp. 1 Introduction , pag. 2, rows 63-64: the sentence „Principal component analysis (PCA) is a technique to compute the principal components and transferring them to display a change of basis on the data”. Please consider also that the PCA is a technique for reducing the dimensionality of large datasets, increasing interpretability but at the same time minimizing information loss.

 

Very good suggestion. We changed it to:

On the one hand, Principal component analysis (PCA) is a technique to compute the principal components and transferring them to display a change of basis on the data[9]. On the other hand, It also is a technique for reducing the dimensionality of large datasets, increasing interpretability but at the same time minimizing information loss.

In chp. 1 Introduction , pag. 2, rows 40-42: the sentence ”The accuracy can effectively improve on how to effectively use the hyperspectral data to perform effective analysis on image classification” is unclear; please clarify it.

Indeed, this sentence was not good! We changed it to:

Through using PCA as preprocessing technique of image data, the accuracy can be effectively improved.

In chp. 1 Introduction , pag. 2, rows 75-77: in the sentence „Specifically, given a set of training instances, each training instance will mark as one of two categories or the other, the SVM training algorithm establishes a model for a new instance assigned to one of the two categories”, please explain what is the meaning of „each training instance will mark as one of two categories or the other”.

Thank you. We changed “training instance” to “training samples”.

In chp. 1 Introduction , pag. 3, rows 100-101: in the sentence ”The study proposes a brand new idea that combines SVM and CNN to handle the multi-categories classification in remote sensing”, please consider the term „multiclass or multinomial classification” instead of multi-categories classification.

Thank you. We changed it to “multiclass”.

In chp. 2. Materials, 2.2. Brief Introduction on Image Format, pag. 3, rows 128-129: in the sentence „After the creation of the sampling, there are 1000 sample points by the hyperspectral data on the left, 500 points are trained, 500 points are tested”, please explain what are the „1000 sample points by the hyperspectral data on the left”?

Thank you very much! This was an oversight on my part.

We changed it to:

After the creation of the sampling, there are 1000 sample randomly selected for the points by the hyperspectral data, 500 points are trained, 500 points are tested.

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3, rows 137-139: The information included in the sentence „In 2016, the spectral image used in this study provides the Compact Airborne Spectrographic Imager (CASI) hyperspectral image for Chung Hsing Surveying Company of Taiwan” are already presented in the 2.2. subchp. (rows 123-125). Please avoid duplication.

Thank you for the suggestion.

We deleted the second statement. Thank you very much!

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3-4, rows 140-142: Similar to the previous comment, the information given in the sentences „The image scanning system CASI-1500 was manufactured by THE Canadian COMPANY ITRES and the CASI-1500 instrument specifications have spectral wavelengths ranging from 365nm to 1050nm. It is equivalent to a visible to near-infrared light band range of 1m spatial resolution with 72 bands.” was already presented in subchp. 2.2, with a slight difference for the spectral domain: 365nm to 1050nm versus 380nm to 1050nm. Please correct.

Thank you for checking so carefully! We also hope this paper can be presented perfectly. Without your help, we could not make it perfect.

Both sentences have been clarified.

The second sentence on the spectral domain was deleted to avoid duplication.

We do appreciate your help!

In chp. 2. Materials, 2.3. Image Take Place on Study Area, pag. 3-4, rows 151-152: „The hyperspectral image in this study is taken in January and April 2016 as shown in Figure 1(a).” Please specify what is the acquisition date of the image presented in fig. 1 (a).

Thank you! We double-checked it.

 

It should be April 2016.

In chp. 3. Research Method, 3.1. Study Plan, pag.4, rows 164-166: The sentence „As observing the remote sensing datasets, the convolutional neural networks (CNNs) have become the state-of-the-art approaches for object classification in images.” is only a general statement and I suggest it be removed.

Thank you!

Considering the characteristics of different crop fields, we changed it to: “As observing the remote sensing datasets, the convolutional neural networks (CNNs) may be a very appropriate tool for analyzing various crop fields due to object classification.”

 

In chp. 3. Research Method, 3.2. Brief on Support Vector Machine, pag.5, rows 202-203: The sentence ”A soft margin problem for the case of SVM’s is to handle the linearly non-separable data first by Cortes and Vapnik (1995).” is unclear; please clarify it.

 

Thank you!

“Soft margin problem” is a general term.

We changed it to:

Soft Margin Formulation is employed to classify linearly inseparable data. More specifically, if there’s no specific linear decision boundary that can perfectly separate the data, This is so-called linearly inseparable.

In chp. 3. Research Method, 3.2. Brief on Support Vector Machine, pag.5, row 204: „In ξi ≥0 this case, SVMs algorithm...”, please define the parameter ξi.

 

Thank you!

It is fixed as:

The ξ means the count of errors made by our classifier on the set of training examples.
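
For reference, the standard soft-margin SVM formulation of Cortes and Vapnik (1995), which the manuscript cites, can be written as below. The symbols w, b, C, x_i, y_i, and n are the conventional ones and are assumptions here, not notation taken from the manuscript. In this standard form, ξ_i is a non-negative slack variable that measures how far training sample i violates the margin; the sum of the slacks is an upper bound on the number of training errors rather than an exact count.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n.
```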

In chp. 3. Research Method, the nr. of the subchp. „Brief on Deep Learning for Recognition and Identification” is wrong (has to be 3.3) ; please correct it.

 

The title was changed at the request of another reviewer. We changed it to:

3.1 Brief on Deep Learning for Regional Image Classification

The number should be 3.3. Thank you!

In chp. 3. Research Method, subchp. Brief on Deep Learning for Recognition and Identification, pag.7, rows 236-238: The sentence „One way to characterize the image is to filter the image to get more useful  information, such as using the edge (Edge Detection) detection of the derivative (mask).” is unclear; please make it more clear.

It is a good suggestion.

We added one more sentence for clarity:

In other words, convolutional layers are strong feature extractors in which the convolutional filters are capable of finding features of images.

In chp. 3. Research Method,pag. 8, row 282: the nr. and the name of the subchp. is wrong; please correct it. Please verify and correct the numbering and titles of the subchapters of chp. 3.

 

Thank you for checking so carefully! We do appreciate it.

It should be 3.4. We fixed it.

In chp. 3. Research Method, pag. 8, rows: 283-286”: in the sentence „The whole research plan is divided into four steps (see Figure 3) (1) material preparation for PCA ttribute selection (2) support vector machine before processing (3) convolutional neural network reprocessing detail class (4) to establish tree classification and layer rules, summarized as follows”; the steps presented are not clearly found in figure 3. Research Steps of Study. Please correct this.

The figure was changed to:

See next page

 

In chp. 3. Research Method, pag. 8, row 288: Please include in the manuscript text a more detailed description of the Figure 3. Research Steps of Study.

The whole research plan is divided into five steps (see Figure 3) (1) support vector machine before processing (2) material preparation for PCA attribute selection (3) convolutional neural network reprocessing detail class (4) establish multi-classification and layer rules, (5) execute the repair model for fix the error classification outcomes. It summarizes as follows:

In chp. 3. Research Method, pag. 8, rows 290-292: the sentence „Figure 4 (a) is the Regional Object Classification model (ROC) [32] program that selects a combination set of parameters of the seeds Area(A) and Similarity(S).” is unclear...I suggest „Figure 4 (a) shows how the Regional Object Classification model (ROC) [32] program is able to select a combination set of parameters ....”.

 

 

Thank you. We followed your suggestion.

In chp. 3. Research Method, pag. 8, rows 292-293: the sentence „The red line is the linear regression function and it will increase to the surrounding levees for each margin of sides as an integrated patch.” is unclear... what are the „margin of sides”?

Very good suggestion.

We rewrote it as:

The red line is the linear regression function by collecting the coordinate data of blue pixels which are generated from ROC model and it will gradually increase to the surrounding levees for each margin of sides to the whole integrated patch.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, rows 306-308: in the sentence ”The user has to adjust the kernel function and parameters according to the situation, which will have a significant impact on the prediction accuracy rate.” please explain in a more concrete way the „situation”.

Very good suggestion.

We explain the situation as follows:

More specifically, different distribution and dimension of data may search for a proper kernel function and the initial value of the parameter may also influence the computation speed.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 314: please rephrase the sentence „This step is to optimize the optimal classification model obtained in the previous step.”... to optimize the optimal?

We rewrote it as:

This step is to optimize the model by searching the appropriate solution to obtain the classification outcome in the previous step.

Thank you.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 321-322: please correct the sentence „Table 2 confusion matrix of the results of SVM.”; Table 2 is the...

Thank you for checking so carefully.

We fixed it.

We apologize for our oversight.

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 9, row 336-337: the sentence „The PCA takes 8, 16, 24 three different combinations to access the accuracy.” is unclear, please explain more in details the procedure.

We changed it to:

A fundamental question that arises is how many dimensions to reduce to when doing PCA. Hence, It is decided to have better a understanding on how many PCAs are enough to carry out the outcomes of this study, the process of PCA takes 8, 16, 24 three different combinations to access the accuracy.
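
The sweep over 8, 16, and 24 principal components mentioned above can be illustrated with a small NumPy sketch. It is not the authors' code: PCA is computed here through a singular value decomposition, the data are random numbers standing in for the 1000 sampled pixels with 72 bands, and the names `pca_reduce`, `pixels`, and `reduced` are hypothetical.

```python
import numpy as np


def pca_reduce(pixels, n_components):
    """Project the rows of `pixels` onto the top `n_components` principal axes."""
    # Center each band (column) so the SVD captures variance around the mean.
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Synthetic stand-in for 1000 sampled pixels with 72 spectral bands.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(1000, 72))

# One reduced dataset per candidate number of components.
reduced = {k: pca_reduce(pixels, k) for k in (8, 16, 24)}
shapes = {k: features.shape for k, features in reduced.items()}
```

Each entry of `reduced` could then be passed to a classifier and the resulting accuracies compared, which is the comparison the revised sentence describes.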

In chp. 4. Discussion of the Results, 4.1. First Stage Analysis of SVM, pag. 10, row  337: the specification „Our development program CNN is written in Python.” was already presented in the previous part of the manuscript and can be deleted.

It is deleted. Thank you very much!

In chp. 4. Discussion of the Results, 4.2. Second Stage: Improvement Classification of CNN, pag. 13, rows 366-367: in the sentence ”Only the cabbage fields have one misjudgment by omission error. The accuracy rate is 98.6%.” there is an inconsistency between the text and Table 5. „Confusion matrix of CNN by PCA 24 epoch 30”... cabbage or potatoes? Please verify and correct also the accuracy rate.

Table 5 shows the outputs of the CNN classification. The cabbage fields have one misjudgment, which we marked. We then used the potato fields as an example of fixing the SVM.

The accuracy rate is correct. We double-checked it.

One error occurs at cabbage (Table 5):

Ac = 1 - 1/70 = 98.6%
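
The arithmetic above can be checked with a short sketch that computes overall accuracy from a confusion matrix. The two-class matrix below is hypothetical and is not Table 5; it is only constructed to contain 70 samples with a single misclassification, matching the numbers quoted in the response. The function name `overall_accuracy` is also an assumption.

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical matrix (rows: true class, columns: predicted class).
# 70 samples in total, one of them misclassified.
confusion = [
    [34, 1],
    [0, 35],
]

accuracy = overall_accuracy(confusion)    # 69 / 70
accuracy_pct = round(100 * accuracy, 1)   # 98.6
```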

In chp. 4. Discussion of the Results, 4.2. Second Stage: Improvement Classification of CNN, pag. 14, rows 376-377: The sentence: „In other words, CNN performs well for a certain number of features [34-36]”; please explain more in detail this conclusion.

We changed it to:

In other words, through a series of testing, this study found that CNN performs well for a certain number of features to approach a satisfaction of accuracy rate.

In chp. 5. Summary and Conclusion, pag. 15, rows 420-421: in the sentences: „Three different cases are considered: PCA=8, epoch=30, the accuracy is 96%.  PCA=16, epoch=30 the accuracy is 98%. PCA=24, epoch=30, the accuracy is 98.6%”, please verify the accuracy values.

For PCA=8, epoch=30, the accuracy value is 97.1%. The rest are correct.

We have double-checked them.

Thank you for pointing this out.

In chp. 5. Summary and Conclusion, pag. 15, rows 429-431: the sentence „It is expected that the pros of Support Vector Machine (SVM) for hyperspectral image classification with good results and deep learning (Convolution Neural Network; CNN) can improve the classification of image details.” is unclear, please rephrase it.

Thank you!

The sentence had a grammatical error and was not clear.

We changed it to:

In this study, the pros of the Support Vector Machine (SVM) for hyperspectral image classification can obtain an initial relative good result and deep learning (Convolution Neural Network; CNN) with the developed repair model can improve the classification of image details.

 

We do appreciate your help. You must have spent a lot of time reviewing our paper!

Author Response File: Author Response.pdf

Reviewer 3 Report

The review comments to the authors are attached herewith. 

Comments for author File: Comments.pdf

Author Response

Dear reviewer

 

First of all, I deeply appreciate the time you took to review the paper and your many helpful suggestions. This paper was written with the aim of combining the strengths of SVM and CNN. The paper is very interesting but not easy to understand because it is somewhat complicated. Let me explain it in detail.

First, we have a hyperspectral image containing many kinds of crops. The SVM is used for pixel-based classification and the CNN for region-based classification. We used the SVM first and then the CNN. However, pixel-based classification has the disadvantage of producing a salt-and-pepper effect in the thematic map. Thus, we developed the CNN approach to fix the salt-and-pepper effect. Please see Figure 3 for the steps of the entire paper.

A fundamental question arises: why not use the CNN directly to solve the classification problem? Because the CNN is very time-consuming. Therefore, we use the output of the CNN to repair the errors of the SVM. How is the repair done? Please see Table 2 and Table 5 for an explanation. Potatoes are used as an example: the CNN classifies potatoes with 100% accuracy and removes the errors of the SVM. Based on this method, PCA feature selection can use the smallest number of features to produce perfect classification performance (please see Table 7).

I hope the above description helps you to understand this paper. Although CNN and SVM are not new technologies, this brand-new idea makes the paper interesting, and it makes a good scientific contribution.

 

Milly

 

We do appreciate that you provided the PDF file of the article. Indeed, it is very helpful.

Please see the blue text of the new manuscript.

Thank you very much for giving me the opportunity to review this manuscript entitled “An Innovative Intelligent System with Integrated CNN and SVM: Considering Various Crops through Hyperspectral Image Data”. The authors have documented an interesting experiment in analysing CASI images for crop identification over a vast area, using a novel two-stage classification method that combines pixel-based and region-based classification. This work is very useful and highly important for analysing remote sensing images for crop detection over a vast area within a considerable amount of time. It is a noble work to utilise the CNN region/cell-based classification results to correct the shortcomings of the SVM pixel-based classification results.

This paper provides a brand-new idea on how to integrate two classifiers. Neither CNN nor SVM is a new technology; however, we used a repair model (see Figure 3) to link them, taking advantage of the SVM's fast computation and the CNN's high accuracy.

As you note in your very positive comments, we combine pixel-based and region-based classification to obtain better accuracy and shorten the computation time.

Thank you for your encouragement.

After reading the manuscript, however, I found that some sections need modification, and I would therefore like the authors to revise those sections with proper explanations. At this point I have rejected the manuscript so that the authors have another chance to rectify the mistakes and resubmit.

Here are my comments to the manuscripts:

We will do our best to follow your suggestions and instructions.

The abstract and introduction are well written and make the aims and objectives of this research clear to readers. However, compared to the first part of the manuscript, the methods, results and analysis sections are not well written. I found that many sections need to be explained in detail.

Thank you for your comments

 

1. What are the rules of multi-object decisions that the authors mentioned on line 130?

 

This is a very good point. I think you are referring to the sentence "The rules for multi-objective decision-making are established". We rephrased it as:

The rules for multi-objective decision-making are established by considering each different crop as a decision and the 72 bands as attributes. These data are used for the analysis by the SVM and CNN classifiers.

It is a good approach to explain the SVM and CNN classification with regard to crop identification from CASI images, but there is not enough discussion of the stages by which the authors improve the SVM classification with the results of the CNN region-based classification. The function ‘cell’ is mentioned in the manuscript without further explanation, which may confuse readers. Likewise, there is not much explanation of Figure 6. I would suggest that the authors describe the cell function in detail or at least depict it as a flow diagram.

 

The cell program is developed based on the region-based classification outcomes. We explain it in Figure 4. We rephrased the sentence as:

The red line is the linear regression function fitted to the coordinates of the blue pixels, which are generated by the ROC model. It is extended to the surrounding levees on each side to form an integrated patch. These two parameters are adjusted to enlarge and merge different cells into one region.
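
As a small illustration of the line-fitting step mentioned above, an ordinary least-squares line can be fitted to a set of pixel coordinates as shown below. This is only a sketch under assumptions: the function name `fit_line` and the sample coordinates are hypothetical, and it does not reproduce the authors' cell program or the subsequent merging of cells.

```python
def fit_line(points):
    """Fit y = slope * x + intercept to (x, y) points by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x coordinates are equal; the line is vertical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical pixel coordinates (column, row) lying on a straight boundary.
slope, intercept = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
```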

We re-explain Figure 6 as

Figure 6 displays the different sample sizes in region-based classification. Since different crops have different sizes, the region-based classification program, which is executed by the Cell program described for Figure 4, detects the different areas of the entire image step by step.

Why does the model attain its optimum accuracy in such early epochs? The model also shows absolute 100% accuracy, which is desirable; however, if the model is trained too well on the specific data set of the study area, there is a concern about whether the model generalises at all. It would be a good idea for the authors to apply the model to a CASI image of a different area. The experiments should be run with more than one test sample.

 

Thank you for your comments

The SVM is used for pixel-based classification and the CNN for region-based classification. We apply the SVM first and then the CNN. However, pixel-based classification has the disadvantage of producing a salt-and-pepper effect in the thematic map. We therefore developed the CNN approach to fix the salt-and-pepper effect. Please see Figure 3 for the steps of the entire paper.

The authors should employ more than one type of error estimation. The dice index and F-score would be a good inclusion in the analysis section.

We display the results in two ways.

The confusion matrix and the accuracy rate are the two indices used to show the outcomes; they are the most popular presentation in international papers. The thematic maps are also shown.

It would be good if we could show more indices than those mentioned above. However, we are very sorry that we cannot show the Dice index and F-score, which are not familiar to us. We are also not able to explain them well within the 14 days allowed for revising the paper. Please forgive us.
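
For reference, both indices the reviewer mentions can be derived from a confusion matrix. For a single class, the F-score (F1) and the Dice index coincide: both equal 2TP / (2TP + FP + FN). The sketch below assumes rows are true classes and columns are predicted classes; the function names and the 2 × 2 example matrix are hypothetical and are not taken from the paper's tables.

```python
def per_class_scores(confusion):
    """Compute precision, recall and F1/Dice for every class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # rest of row k
        fp = sum(confusion[i][k] for i in range(n)) - tp  # rest of column k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = 2 * tp + fp + fn
        f1_dice = 2 * tp / denom if denom else 0.0
        scores.append({"precision": precision, "recall": recall, "f1_dice": f1_dice})
    return scores


def overall_accuracy(confusion):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[k][k] for k in range(len(confusion))) / total


example = [[8, 2],
           [1, 9]]  # made-up counts for two classes
scores = per_class_scores(example)
accuracy = overall_accuracy(example)
```

Reporting the per-class F1/Dice alongside the accuracy rate shows which classes are confused with each other even when the overall accuracy is high.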

Overall, I would suggest rewriting the Methods and Results sections and resubmitting for a thorough review. The authors are also required to proofread the manuscript for the problems in sentence construction, which I found mainly in the Results and Discussion sections.

Many parts of the Methods and Results sections have been rephrased. All the English writing has been carefully double-checked. I also explained the goals and steps of the entire study in the earlier part of this letter.

We appreciate your suggestion. The process is complicated; however, we do our best to present it well.

 

Author Response File: Author Response.pdf

Reviewer 4 Report

Dear authors,

I have the following recommendations that, I hope, will help you improve your paper. 

  1. Try to reduce the amount of copied and pasted text you used in your article, the similarity index of your document is high. 
  2. For the pattern classification stage it would have been good if you had validated your model with a more robust cross-validation method than Hold-Out. You could consider the K-Fold Cross-Validation with K=5 or K=10. 
  3. Provide the specs of the hardware (CPU, memory, GPU, etc.) and software (Operating System, libraries or packages) on which you developed your proposal.
  4. For comparison purposes, it would be good to know the results of the SVM with some other Kernel such as linear or polynomial. 
  5. It would be good to support the choice of the SVM, placing the results of other classifiers such as Random Forest or those based on Bayes' theorem. 
  6. Provide which are the features of the dataset that you generated for classification with the SVM, you could summarize the information of the dataset in a single table. 
  7. Provide the learning parameters with which you set up the CNN model. 

Author Response

Dear reviewer

 

First of all, I deeply appreciate the time you took to review the paper and your many helpful suggestions. This paper was written to combine the strengths of SVM and CNN. The paper is very interesting but not easy to understand because it is somewhat complicated. Let me explain it in detail.

 

First, we have a hyperspectral image containing many kinds of crops. The SVM is used for pixel-based classification and the CNN for region-based classification. We apply the SVM first and then the CNN. However, pixel-based classification has the disadvantage of producing a salt-and-pepper effect in the thematic map. We therefore developed the CNN approach to fix the salt-and-pepper effect. Please see Figure 3 for the steps of the entire paper.

 

A fundamental question arises: why not use the CNN directly to solve the classification problem? Because the CNN is very time-consuming. Therefore, we use the output of the CNN to repair the errors of the SVM. How are they repaired? Please see Table 2 and Table 5 for an explanation. Potatoes are used as an example: the CNN classifies potatoes with 100% accuracy and removes the errors of the SVM. Based on this method, PCA feature selection can use the smallest number of features to produce the perfect classification performance (please see Table 7).

 

I hope the above description helps you understand this paper. CNN and SVM are not new technologies; however, this brand-new idea makes the paper interesting, and it makes a good scientific contribution. Please see the green text for our modifications.

 

Milly

 

Please see blue text.

 

Try to reduce the amount of copied and pasted text you used in your article, the similarity index of your document is high. 

Thank you for your comment.

Indeed, the previous version of the submission had many duplicated passages.

This has been fixed.

Please take a look at the new version of our submission.

We are sorry for our carelessness.

For the pattern classification stage it would have been good if you had validated your model with a more robust cross-validation method than Hold-Out. You could consider the K-Fold Cross-Validation with K=5 or K=10. 

 

Figure 7 shows the results of the CNN with different epochs. Each epoch can be seen as a trial in the search for the best performance.

This is a very good suggestion for the SVM stage. We did try the pixel-based classification with many sets of training data, but the results on the testing data did not improve much.
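
To make the reviewer's suggestion concrete, K-fold cross-validation splits the samples into K folds and uses each fold once for testing while training on the rest. The sketch below only generates the index splits; the function name `k_fold_indices` and the sample count are hypothetical, and the classifier training inside each fold is left out.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Every sample is used for testing exactly once across the 5 folds.
splits = list(k_fold_indices(10, k=5))
```

Averaging the accuracy over the K test folds gives a more stable estimate than a single hold-out split, which is the point of the reviewer's comment.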

            

 

Provide the specs of the hardware (CPU, memory, GPU, etc.) and software (Operating System, libraries or packages) on which you developed your proposal.

Good suggestion. We put it before Table 7.

The testing computer, used to calculate the entire thematic map, has an i7-8700 CPU with 16 GB of RAM and a GTX 1050 video card with 4 GB of memory. The operating system is Windows 10, with the Keras Python package in TensorFlow.

For comparison purposes, it would be good to know the results of the SVM with some other Kernel such as linear or polynomial. 

 

Thank you for your suggestion

This would have been good when we initially started running the project. However, we are limited by the size of the paper. If we put those trials of the different kernel functions in the paper, it would lose the focus on our two-stage classification process with the new idea of a repair module.

It would be good to support the choice of the SVM, placing the results of other classifiers such as Random Forest or those based on Bayes' theorem. 

 

This is an excellent idea. However, for the same reason as above, we want to show a method for integrating SVM and CNN.

Actually, one of my students is working on Random Forest + CNN for his Master Thesis.

We will present it in the next paper.

Provide which are the features of the dataset that you generated for classification with the SVM, you could summarize the information of the dataset in a single table.

We may not have fully understood your suggestion, but we state in the article:

In this study, the Radial Basis Function (RBF) kernel is selected for calculation. To obtain better model parameters, the Grid Search method repeatedly tests possible combinations of the parameters C = 2100 (penalty parameter) and g = 2 (gamma function) and calculates the correct rate for each pair (C, g). If the condition is met, the repeated test ends and the best C and g parameters are output; otherwise new parameters are substituted until the combination is found. There is not enough information to list in a table, and the Journal limits the number of pages.
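
The grid search described above can be sketched in a few lines. This is an illustration under assumptions, not the authors' procedure: it assumes the candidate values of C and g are powers of two, and `toy_evaluate` is a made-up placeholder for training an RBF SVM with the given (C, g) and returning its validation accuracy.

```python
from itertools import product
from math import log2


def grid_search(evaluate, c_exponents, g_exponents):
    """Try every (C, g) pair on a power-of-two grid and keep the best score."""
    best = None
    for c_exp, g_exp in product(c_exponents, g_exponents):
        C, g = 2.0 ** c_exp, 2.0 ** g_exp
        score = evaluate(C, g)
        if best is None or score > best[0]:
            best = (score, C, g)
    return best


def toy_evaluate(C, g):
    # Placeholder score that peaks at C = 4, g = 0.5. A real run would train
    # the classifier here and return its accuracy on held-out data.
    return 1.0 - 0.01 * ((log2(C) - 2) ** 2 + (log2(g) + 1) ** 2)


best_score, best_C, best_g = grid_search(toy_evaluate, range(-2, 6), range(-4, 3))
```

With a real scoring function, the returned pair is the combination of penalty parameter and gamma with the highest correct rate on the grid.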

            

Provide the learning parameters with which you set up the CNN model. 

We rephrased the sentence as

The study uses 9 categories, treated as 9 classes, for the CNN data analysis. Various cell sizes are generated to fit the crop sizes as inputs for the CNN.

Some of the information is already in Section 4.2:

The program is designed to make a 30 × 30 CNN model and display different combination outcomes for results. For instance, Table 3 shows the outputs of sequential_14. The original size is 30 × 30. Layer 1 switches to 28 × 28. The activation function of the CNN model is “ReLU”. The maximum epoch number is set to 100 with a validation split of 0.2. After executing Maxpool 2 × 2, it reduces to 14 × 14, and so on. Then, the program automatically calculates down to softmax 7 × 1, where the output is transformed into a one-dimensional array.
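
The layer sizes quoted above follow from standard output-size arithmetic. The sketch below reproduces the 30 → 28 → 14 sequence, assuming a 3 × 3 convolution with stride 1 and no padding for Layer 1; the kernel size is an assumption chosen to match the stated sizes, and the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, pool, stride=None):
    """Spatial size after max pooling; the stride defaults to the pool size."""
    stride = pool if stride is None else stride
    return (size - pool) // stride + 1


after_conv = conv_output_size(30, kernel=3)        # 30 x 30 input -> 28 x 28
after_pool = pool_output_size(after_conv, pool=2)  # 28 x 28 -> 14 x 14
```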

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

There are a lot of flaws in the document. I would advise the authors to take the paper to a scientific language editor. I have tried to go through the paper but I could not finish. The research is sound but the reporting needs to be looked into thoroughly. 

Most of my comments from the previous review were not addressed. They must consider some of the inputs from my review and also re-work the whole document. 

Author Response

There are a lot of flaws in the document. I would advise the authors to take the paper to a scientific language editor. I have tried to go through the paper but I could not finish it. The research is sound but the reporting needs to be looked into thoroughly.

Response: Thank you for your advice. We sent it to the English Office of LTU. The modifications are done and the paper has been checked from stem to stern.

Most of my comments from the previous review were not addressed. They must consider some of the inputs from my review and also re-work the whole document.

Response: We followed your instructions from the previous review. You may have missed the PDF file; it is attached to this review. We know you spent a lot of time reviewing our work, and we deeply appreciate it. We carefully tracked all your advice, but you did not see our revisions in the PDF file. The rephrased text is in the new manuscript and in the hints (move the cursor over the highlighted text). There may be some misunderstanding. The entire paper has been fully revised.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors took into account the observations / comments submitted following the review.  

I recommend the manuscript for publication.

Author Response

The authors took into account the observations / comments submitted following the review.  

I recommend the manuscript for publication. 

Thank you for your help with the review. You must have spent a lot of time reviewing our paper.

We deeply appreciate your advice.

Reviewer 4 Report

Dear authors, 

In my opinion, you have addressed all my initial concerns. I have no more recommendations. 

Author Response

In my opinion, you have addressed all my initial concerns. I have no more recommendations. 

 

We deeply appreciate your help. You must have spent a lot of time. Many thanks.
