Article

A New Approach of Ensemble Learning Technique to Resolve the Uncertainties of Paddy Area through Image Classification

1 Department of Urban Planning and Spatial Information, Feng Chia University, Taichung 40724, Taiwan
2 Department of Information Technology, Ling Tung University, Taichung 40851, Taiwan
3 Department of Tourism and Leisure, National Penghu University, Magong 80011, Taiwan
* Author to whom correspondence should be addressed.
Remote Sens. 2020, 12(21), 3666; https://0-doi-org.brum.beds.ac.uk/10.3390/rs12213666
Submission received: 19 September 2020 / Revised: 1 November 2020 / Accepted: 3 November 2020 / Published: 9 November 2020

Abstract

Remote sensing technology provides a large amount of information for agriculture and has been widely used over the past few decades to monitor paddy-growing ecosystems. However, data fusion techniques carry uncertainties that need to be resolved in the image classification of paddy rice. In this study, a series of learning concepts, integrated through a probability-based Fuzzy Dempster-Shafer (FDS) analysis, is presented to combine various models and different types of image data. More specifically, the study used the FDS to generate a series of probability models within the classification system. Logistic Regression (LR), Support Vector Machine (SVM), and Neural Network (NN) approaches are incorporated into the developed FDS system, and two image types, satellite images and aerial photos, are used as the analysis material. The overall classification accuracy has been improved to 97.27%, with a kappa value of 0.93. The overall accuracy of paddy field image classification is between 85% and 90% for multi-period mid-scale satellite images and between 91% and 95% for multi-spectral numerical aerial photos. The FDS improves the accuracy of these image classification results.


1. Introduction

Paddy rice is a major food source for more than half of the world’s population, mostly in regions of Asia, Africa, and Latin America. Paddy cultivation has therefore drawn increasing attention from governments, and the investigation of paddy areas requires the integration of different technologies. Paddy rice is one of the major crops cultivated all over Taiwan. Hence, a good solution that integrates multiple data sources and different classifiers is needed. One use of satellite image data is the precise management of paddy areas. Satellite image data provide the essential technology and methodology for monitoring, mapping, and observing variation in paddy areas. Their repeated acquisition intervals also allow paddy-growing areas to be interpreted under a variety of aspects [1]. Furthermore, extensive research has been dedicated to employing satellite optical data to construct GIS maps that delineate paddy field areas by means of image classification. These problems have drawn great attention over the past half-century. Classification methods have used data from Landsat TM and ETM+ series [2], SPOT series [3], MODIS [4], RADARSAT series [5], ERS-1 and ERS-2 [6], ENVISAT/ASAR [7], IRS [8], AVHRR [9], aerial photos [10], UAV [11], etc. Classifiers are of two types: supervised and unsupervised. Related research has also estimated the size of the “area” of the “target category” through various classification methods. Among the abovementioned remotely sensed data, the present trend is toward higher spatial resolution, and computation time increases accordingly. This makes the extraction of crucial data in remote sensing a great challenge, even though higher image resolution provides a better basis for classification.
However, in pixel-based classification, the analyzed image is filled with confusing information; for example, the salt-and-pepper effect is produced. Texture information offers some solutions to the salt-and-pepper effect, such as the Gray Level Co-Occurrence Matrix (GLCM) [12], fractal dimension [13], and semi-variogram [14]. These techniques have been applied successfully in areas such as agricultural resource exploration, medical treatment, and face recognition. In addition, various classification rules are employed by machine learning [15]. Basically, the main idea is to use supervised learning, unsupervised learning, and hybrid classification (i.e., semi-supervised learning and a fusion of supervised and unsupervised learning) [16]. The major remote sensing image classification techniques include pixel-wise, sub-pixel-wise, and object-based methods, and the literature highlights the importance of incorporating spatial-contextual information in remote sensing image classification [17]. The supervised model requires the selection of training samples. The most popular methods are (1) support vector machines [18]; (2) neural networks [19]; and (3) logistic regression [20]. The unsupervised model does not require the selection of training samples. It mainly separates the supplied image data into categories by computing the maximum difference between them.
On the other hand, Taiwan has a cloudy and foggy environment, so the quality of most satellite images is poor. Such an environment cannot be handled with only one data source; in other words, we cannot rely on a single image source for paddy area thematic maps in Taiwan. In addition, different classifiers produce outcomes with different characteristics. The monitoring of rice areas by image classification is affected by climate, species, man-made farming patterns, and other factors [21]. Moreover, the spatial, temporal, and spectral resolution of images, and even the uncertainty of the analytical methods, can cause various errors in thematic map production. Past research has rarely discussed the uncertainty of classification problems. Instead, a solution is usually sought by determining the single best image source or classification method for producing paddy area thematic maps. This concept obviously cannot correctly solve the problem of a complex world [22]. Reflecting the complexity of the real world through massive data collection and systematic analysis, using multiple data sources and multiple classification techniques, is therefore a crucial goal of this research. A possible solution is to develop a Data Fusion Model (DFM) concept with ensemble learning for constructing the paddy field thematic map. The ensemble learning technique combines a set of different computations rather than evaluating the data with a single model. Straightforward algorithms are usually used in ensemble methods, although some complicated algorithms can render informative results as well.
Furthermore, the concept of DFM was first proposed in the 1960s as a set of mathematical models for data manipulation. It was implemented in the US in the 1970s in the fields of robotics and defense [23]. A general model was then proposed for the problems that arise during the analysis of engineering systems [23]. It can handle various kinds of data sources, including time series and images. The general model divides the data fusion technology into three major steps: identification, estimation, and validation (see Figure 1). The three steps are as follows:
  • The first step is the identification part, where the core issue is to understand data fusion. To handle the source of the data, one should be familiar with the methods and techniques of data analysis.
  • In the second step, estimation can be divided into four processing levels: (a) Signal level; (b) Pixel level; (c) Feature level; and (d) Symbol level. The signal level and pixel level are represented in different ways. Pixel-level data are more intuitive for humans, but both levels lack a relationship to the mathematical model of the measured object [23]. Feature level processing extracts features from the original data and then carries out the image fusion. Its methods follow two major mainstreams: (a) feature extraction and (b) feature selection [24]. Principal Component Analysis (PCA) is a common feature extraction method. Symbol level processing, the last stage, merges the data using statistics and logical inference to facilitate the subsequent mathematical modeling or data analysis: the data are combined with the aid of a mathematical model, and the analysis is based on statistical and logical inference. Symbol level processing yields the result of decision analysis. Its algorithms can be roughly divided into three categories: (a) physical model recognition algorithms (Kalman filter, maximum likelihood estimation, generalized least squares, etc.); (b) parameter classification and recognition algorithms (Bayesian estimation, Dempster-Shafer evidence theory, entropy estimation, supervised classification, and unsupervised classification); and (c) cognitive architecture models (expert system, fuzzy sets, LIDA [25], ACT-R [26], SOAR [27], etc.).
  • In the third step, validation comprises the following: (a) measuring the uncertainty in the solution content (probability measure, false alarm rate, or classification accuracy), which includes assessing the performance of the data fusion model; (b) establishing a benchmark program to improve data fusion; and (c) processing the data and the fused information at the validation stage. The information can then be integrated effectively through the above process.
Finally, data fusion is now applied in a wide range of fields, for instance pattern recognition and radar tracking [28], robotics [29], traffic control [30], remote sensing [31], and geosciences [21].
Therefore, this research developed a series of ensemble learning concepts through transformations of information processing, classification procedures, and data uncertainty analysis. To solve the aforementioned problems, it follows a systematic research method, stated as follows: (a) In the multi-sensor era, environmental monitoring no longer depends on a single device, and both satellite images and aerial photos are suitable for monitoring landform changes. Similarly, different classifiers have their own pros and cons. We therefore defined the equipment that monitors the environment as the source of Pixel Level information. In this study, we used “multi-period and multi-spectral satellite imagery” + “single-period and multi-spectral numerical aerial photography” as sources of information from different sensors. (b) We used spectrum, spectral index, and texture information for the Feature Level. (c) Different from other studies, we used three different classifiers as tools to generate the Symbol Level: the statistical “Logistic Regression (LR)”, the machine learning “Support Vector Machine (SVM)”, and the “Artificial Neural Network (ANN)” from the field of artificial intelligence. Each of these three classifiers has its advantages and disadvantages and produces different results for the paddy area classification problem. We therefore use the concept of Dempster-Shafer (DS) theory: through an evidence synthesis (Evidential Reasoning, ER) algorithm, the three pieces of classification information are integrated into a single piece of decision-making information, which reduces the uncertainty of the classification results. This is the method of the Fuzzy-DS (FDS) theory of evidence. Its key property is the ability to express what is “uncertain” and “not known” directly. 
It belongs to the category of artificial intelligence, and was first applied to some expert systems [32].
Finally, our study has six major steps: (a) data collection and pre-processing; (b) extraction and analysis of the spectral characteristics of paddy fields and non-paddy fields in satellite images and aerial images; (c) integration of the spectral characteristics of new satellite images and numerical aerial images, subdivided into mound patch units; (d) multi-feature classification analysis of multi-scale images; (e) establishment of the FDS decision-making module to present the uncertainty outcomes of patches; and (f) confusion matrix analysis for rice field image classification.

2. Material, Study Area, and Research Design

The Meinong area of Kaohsiung City was selected as the study case (Figure 2). Meinong lies northeast of the geographical center of Kaohsiung City, and its terrain is mostly mountainous. It is located on the alluvial fan of the Laku-laku stream, and the whole region has a rich hydrological system, with the Laku-laku stream and its tributary, the Meinong stream, running throughout the territory. The region is an important paddy-growing area in Taiwan. The analysis images are raster data divided into satellite images and numerical aerial images. The satellite images are optical image data from Formosa II, comprising a panchromatic image (spatial resolution 2 m) and a multi-spectral image (resolution 8 m), from which texture, shape, edge, and other characteristics can be used. The images used for interpreting the ground surface were taken on 1 February 2015, at the paddy transplanting stage (Figure 3a), and on 2 April 2015, at the paddy tillering stage (Figure 3b). For the high-resolution aerial photographs, the study used a Digital Mapping Camera (DMC), which has eight lenses: four in the middle form a high-resolution panchromatic image of 13,824 × 7680 cells, and four wide-angle lenses on the outside capture blue, green, red, and infrared. The Institute used the DMC to capture a total of 33 images of the first phase of the 2015 paddy season (Figure 3c). In addition, the ground truth consists of the results of the 2015 paddy area interpretation and the latest version of the cultivation status map from the Agriculture and Agriculture Administration (see Figure 2). The red patches are rice paddies and the green patches are non-paddy fields. In this study, we use the concept of cross-validation to train and verify the model. The overall data comprise 26,815 patches, of which 20,031 are non-rice. 
We initially divide the samples into training data of a suitable size. The sample locations in the satellite data and the aerial photograph data were carefully checked for consistency. The training data total 8200 samples: 3193 rice (labeled 1) and 5007 non-rice (labeled 0). Because class 0 is far more numerous, we selected a sufficiently large number of class 1 samples to meet the classification criteria; this ratio addresses the problem of unbalanced class sizes in sampling. The testing samples total 8604, of which 3591 are rice and 5013 are non-rice. This design should be sufficient for the sampling analysis.

3. Methods

The research method of this study is divided into four major parts: (a) the ancillary information for the spatial database; (b) the multiple classifiers used in our case; (c) the information processing architecture of the multiple classifiers; and (d) the decision-making interpretation module for the high-uncertainty patches generated by the multiple classifiers. Each part is described as follows.

3.1. Ancillary Information for Spatial Database

This study provides the characteristic information of paddy area patches from multi-phase optical remote sensing image data, which includes the original bands of the image, vegetation information, and texture information. For the paddy growth period, the study considers different conditions and uses images of different scales to extract the characteristic information of farmland growth, in order to establish a multi-scale image knowledge base of paddy area features. Table 1 presents the spectral characteristic indices used to improve the classification analysis: the Ratio Vegetation Index (RVI), the Normalized Difference Vegetation Index (NDVI), Perpendicular Vegetation Index (PVI), Soil-adjusted Vegetation Index (SAVI), Transformed Soil-adjusted Vegetation Index (TSAVI), Crop Management Factor Index (CMFI), Greenness Index (GI), Infrared Percentage Vegetation Index (IPVI), Modified Soil-adjusted Vegetation Index (MSAVI), Optimized Soil-adjusted Vegetation Index (OSAVI), and Generalized Soil-adjusted Vegetation Index (GSVI). In addition, the Gray Level Co-Occurrence Matrix (GLCM) method is used to extract image texture. Texture images contain information on spatial distribution and thus supply additional distinguishing information for image classification. In some cases, an appropriate selection of texture images can increase the classification accuracy. The following texture measures are computed for each band to produce the texture images: (1) Homogeneity, (2) Contrast, (3) Dissimilarity, (4) Entropy, (5) Variance, (6) Mean, and (7) Second Moment. All the vegetation indicators and texture information are shown in Table 1.
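As an illustration of how some of these indices are computed, the following minimal sketch evaluates RVI, NDVI, IPVI, and SAVI from a red and a near-infrared reflectance value. It uses the commonly cited textbook definitions, which are an assumption here because the exact formulas of Table 1 are not reproduced in this section; the function name and the soil factor default of 0.5 are likewise hypothetical.

```python
def vegetation_indices(nir, red, soil_factor=0.5):
    """Compute four common vegetation indices for one pixel or patch mean.

    Assumed textbook definitions (may differ from Table 1 of the article):
      RVI  = NIR / R
      NDVI = (NIR - R) / (NIR + R)
      IPVI = NIR / (NIR + R)
      SAVI = (1 + L) * (NIR - R) / (NIR + R + L), with L the soil factor
    """
    if red <= 0 or nir < 0:
        raise ValueError("reflectances must be non-negative, with red > 0")
    total = nir + red
    return {
        "RVI": nir / red,
        "NDVI": (nir - red) / total,
        "IPVI": nir / total,
        "SAVI": (1 + soil_factor) * (nir - red) / (total + soil_factor),
    }


# Illustrative reflectances for a vegetated patch (not values from the study).
indices = vegetation_indices(nir=0.5, red=0.1)
```

Under these definitions IPVI equals (NDVI + 1) / 2, which gives a quick consistency check on any implementation.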

3.2. Multiple Classifiers Application

3.2.1. Logistic Regression

As part of this study, Logistic Regression was used as one of the classifiers for prediction. Logistic Regression is used extensively in the sciences and in applications such as the prediction of floods, debris flows, and landslides [33,34,35]. It is usually used to predict the probability of occurrence of an event by fitting data to a logistic curve. Linear regression is unsuitable for this purpose because its predicted values can fall outside the interval from 0 to 1, which is theoretically inadmissible for a probability. In general, Logistic Regression predicts a discrete outcome, expressed as an integer category. The outcome is dichotomous, such as success/failure or occurrence/non-occurrence [36]. In statistics, a binary logistic model has a dependent variable with two values, such as pass or fail, represented by an indicator variable. The two values labeled “0” and “1” in this study represent non-paddy and paddy fields, respectively. In the logistic model, each independent variable can be a binary variable (two classes, coded by an indicator variable) or a continuous variable (any real value). The corresponding probability varies between 0 and 1. A sigmoid function is adopted in this study to produce this probability. The Multinomial-polynomial model was employed to construct our classifier, with rice in the training sample adopted as the base category for the classification. In addition, the Pearson chi-square statistic is used to test the independence between variables. The maximum number of iterations is set to 200 to facilitate convergence and achieve our classification purpose.
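The sigmoid mapping mentioned above can be sketched as follows. This is a minimal illustration of the standard logistic function, not the fitted model of the study: the function names, the weights, and the 0.5 decision threshold are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def paddy_probability(features, weights, intercept):
    """Probability of class 1 (paddy) from a linear predictor.

    `weights` and `intercept` are hypothetical; a real model would
    estimate them from the training patches.
    """
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def to_category(probability, threshold=0.5):
    """Map a probability to category 1 (paddy) or 0 (non-paddy).

    The 0.5 threshold is an assumption for this sketch.
    """
    return 1 if probability >= threshold else 0


p = paddy_probability([1.0, 2.0], weights=[0.5, -0.25], intercept=0.0)
```

Each classifier output in this form is a pair of category and probability, for example (1, 0.594) or (0, 0.132).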

3.2.2. Support Vector Machine

The Support Vector Machine (SVM) algorithm is a popular machine learning tool that offers solutions for both classification and regression problems. SVMs are well-accepted supervised learning methods for classification. Whereas the SVM classifier supports binary and multiclass classification, the structured SVM allows the training of a classifier for generally structured output labels. More specifically, many hyperplanes may be able to classify the data. One rational choice for the best hyperplane is the one that represents the largest separation, or margin, between the two classes; that is, the optimal hyperplane maximizes its distance from the nearest data point on each side. A special feature of these classifiers is that they minimize the empirical classification error and maximize the geometric margin simultaneously [37]. The core of the support vector machine is the selection of an appropriate kernel function. The kernel takes data as inputs and converts them into the required form. This is necessary because some types of data cannot be separated linearly in the original space; after nonlinear projection, the data can be easier to separate in a higher-dimensional space. Commonly used kernels are the linear, polynomial, radial basis function (RBF), and sigmoid functions. This study employed the polynomial kernel, with the bias value set to 0.
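For reference, a polynomial kernel can be sketched as below. This assumes the common form K(x, y) = (gamma * <x, y> + bias)^degree and assumes that the "bias" of the text corresponds to the additive constant of that form; the degree and gamma defaults are hypothetical, since the source does not state them.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, bias=0.0):
    """Polynomial kernel K(x, y) = (gamma * <x, y> + bias) ** degree.

    `degree` and `gamma` are illustrative defaults, not values from the study.
    With bias = 0 the kernel is homogeneous in the inner product.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + bias) ** degree


k_homogeneous = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2)
k_with_bias = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2, bias=1.0)
```

With a bias of 0 the kernel value depends only on powers of the inner product, so two orthogonal feature vectors always receive a kernel value of 0.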

3.2.3. Artificial Intelligence

The neural network is an artificial intelligence method for information processing inspired by the way biological neural systems process data. Neural Networks (NN) were first proposed in the early 1940s as an attempt to simulate the cognitive learning processes of the human brain [38]. They can be programmed to develop models of problems through trial and error or learning procedures. In the past twenty years, the Back Propagation Neural Network has been applied extensively in many fields. The relationship between massive data and a certain phenomenon is obtained through a learning system (instead of calculation), based on the concept of the neuron cell. Engineers and researchers have found that describing variables for classifying remote sensing imagery is a tough task. If a paddy area spatial database is well developed, so that the input variables and output categories are described rationally, a Back Propagation Neural Network may be a more suitable learning machine [39]. In essence, the neural network is composed of many nodes that connect input neurons and output neurons across three types of layers: input layers, hidden layers, and output layers. In this study, we designed and implemented a total of 22 different neurons: 2-, 4-, and 6-input biased and non-biased neurons, each with three different activation functions. Moreover, in terms of data training, the tool offers six training methods: Quick, Dynamic, Multiple, Prune, RBFN (radial basis function network), and Exhaustive Prune. The study uses the Quick method to construct the model from the sample data; this method uses rules of thumb and data characteristics to select an appropriate network topology. Finally, the activation function we selected is the Logarithmic-Sigmoid (LogSig) for the output module.

3.3. Information Processing Architecture of Multiple Classifiers

This study has two types of data sources and three classification methods. The attributes of the data include (a) the R, G, B, and NIR bands, (b) vegetation indicators, and (c) texture information from satellite imagery and aerial photos. Therefore, there are six outcomes. The two types of data are used in the multi-classification procedure (see Figure 4), and the interpretation results are placed on the same scale basis in the GIS database, which gives a preliminary determination of the patch category (Figure 4). Basically, every patch has two kinds of information: the “probability information” given by the classifier, and the final generalized “category information”. Past research has often used the category information as the classification result but has usually ignored the probability information given by the classifier, which contains some uncertainties. Because the large number of patches must be screened automatically, we first used the category information for problem-checking. If, for the same patch, the different data sources and classifiers all yield the same category, for example paddy area (result 1) or non-paddy area (result 0), we consider the patch to have high information certainty. Conversely, the most difficult patches to decide are those for which the different classification procedures contradict one another; such a patch is regarded as having high information uncertainty. The undetermined category requires further processing, which is discussed in the next part. The analytic tool in this study is the IBM SPSS Modeler, which can produce the different classification outputs; its graphical user interface (GUI) has been applied widely to different problems.
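The screening rule described above (all six category results agree, or they do not) can be sketched as follows. The function name and the returned labels are hypothetical; only the agreement logic comes from the text.

```python
def screen_patch(categories):
    """Classify a patch by the agreement of its category results.

    `categories` holds the 0/1 outputs of the classifiers, e.g. six values
    for LR, SVM, and ANN applied to satellite and aerial data.
    """
    if not categories or any(c not in (0, 1) for c in categories):
        raise ValueError("categories must be a non-empty sequence of 0/1 values")
    distinct = set(categories)
    if distinct == {1}:
        return "certain_paddy"
    if distinct == {0}:
        return "certain_non_paddy"
    return "uncertain"  # contradictory results: passed on for further processing


# Satellite LR/SVM/ANN followed by aerial LR/SVM/ANN for one contradictory patch.
status = screen_patch([0, 0, 0, 1, 1, 0])
```

Only patches flagged as uncertain need the probability information; the others can be labeled directly from the agreed category.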

3.4. Decision-Making Interpretation Module for Uncertain Patches

This study has developed a decision-making integrator for the high-uncertainty data generated by the above analysis. It is based on the Dempster-Shafer (DS) evidence theory, which was first proposed by Dempster in 1967 and further developed by his student Shafer in 1976 into a theory of inexact reasoning. DS evidence theory belongs to the category of artificial intelligence and was first applied to expert systems, owing to its ability to deal with uncertain information [32]. As an uncertain reasoning method, the main characteristic of evidence theory is that it requires weaker conditions than Bayesian probability theory. Many techniques have been refined and developed to implement the DS theory, one of which is the Evidential Reasoning (ER) algorithm. Because DS theory combines multiple pieces of uncertain evidence, the set of hypotheses is gradually narrowed down as the evidence accumulates, which leads to precise reasoning results. The accumulation of evidence requires a method or rule to calculate the degree of influence of multiple pieces of evidence on the hypothesis for a specific problem. If these pieces of information or evidence are incomplete, a degree of belief can be calculated as a joint function for statistical analysis.
The study first extracts the classification value of each algorithm (Table 2). The solid-line frame in Table 2 marks the “category information”, and the second value is the “probability information” used to distinguish the problem. For instance, patch 1 has the value (0, 0.00) from the Logistic method based on the satellite image: 0 is the category information and 0.00 is the probability information. The total number of patches in Table 2 is 26,815. Because there were too many items, we divided them into three categories for illustration purposes; the patch numbers within these categories are not consecutive. From top to bottom, the three categories are the patches with high or low certainty.
The number of patches classified as non-paddy with high certainty is 18,731; their results are all marked as (0, 0, 0, 0, 0, 0). The number of patches classified as paddy with high certainty is 5596; they are all marked as (1, 1, 1, 1, 1, 1). The third kind in the table is the uncertain classification, which covers 2488 patches. As an example of a patch with high uncertainty, we choose patch number 42,643 (Figure 5). For this patch, the satellite image classification results of LR, SVM, and ANN are (0, 0.132), (0, 0.102), and (0, 0.072), respectively, while the aerial image classification results of LR, SVM, and ANN are (1, 0.594), (1, 0.548), and (0, 0.329), respectively. Many factors may cause such differences, including image resolution and changes in complex farming behavior, and these may also lead to classification errors. In practice, it is very difficult to analyze such differences one by one. However, our system based on the FDS concept can integrate them rationally. For patch 42,643, this study applied a Gaussian blur function to the probability values in the dataset (Figure 5a); that is, the probability value from each algorithm is fuzzified. The probability values generated by the various methods were then integrated into a joint de-fuzzification probability (Figure 5b), and the de-fuzzified value across the different methods and scales is shown in Figure 5c. In Figure 5b, the y-axis is the normalized fuzzy rate value and the x-axis is the information intensity from the evidence theory. 
The x-axis values range from 0 to 1, where 0 is the strongest negative intensity, 1 is the strongest positive intensity, and 0.5 indicates the highest degree of information uncertainty. For each point in the dataset, we determine an appropriate alpha-cut value (the red dotted line on the y-axis in Figure 5c) as the basis for fuzzification. De-fuzzification yields the minimum, average, and maximum informative levels, which represent the information intensity in the theory of evidence. These are then combined across the different algorithms, and the most likely category is determined by the voting results. In this study, the alpha-cut (α) value was set to 0.9 as the threshold for decision categories.
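To make the alpha-cut step concrete, the sketch below cuts a single Gaussian membership function at a given α and returns the minimum, mean, and maximum positions on the [0, 1] intensity axis. This is a simplified assumption: the study's joint membership is built from several classifier probabilities (Figure 5), whereas this example uses one Gaussian with a hypothetical center and width.

```python
import math


def gaussian_alpha_cut(center, sigma, alpha=0.9):
    """Alpha-cut of mu(x) = exp(-(x - center)^2 / (2 * sigma^2)).

    Returns (minimum, mean, maximum) of the interval where mu(x) >= alpha,
    clipped to the [0, 1] intensity axis. `center` and `sigma` are
    illustrative inputs, not parameters reported in the study.
    """
    if not 0.0 <= center <= 1.0:
        raise ValueError("center must lie in [0, 1]")
    if sigma <= 0.0 or not 0.0 < alpha <= 1.0:
        raise ValueError("sigma must be positive and alpha in (0, 1]")
    # mu(x) >= alpha  <=>  |x - center| <= sigma * sqrt(-2 ln alpha)
    half_width = sigma * math.sqrt(-2.0 * math.log(alpha))
    low = max(0.0, center - half_width)
    high = min(1.0, center + half_width)
    return low, (low + high) / 2.0, high


x_min, x_mean, x_max = gaussian_alpha_cut(center=0.3, sigma=0.1, alpha=0.9)
```

A higher α gives a narrower interval, so at α = 0.9 the three levels stay close to the peak of the membership function.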
The reason for using the DS theory is to combine the fuzzy-based probabilities of the characteristic values to determine the final classification attribute. Because the pattern is already a Gaussian function, it has a standardized shape, and at the 0.9 threshold on the y-axis we take the minimum (X1), the mean (X3), and the maximum (X5) values. One can thus measure where this fuzzy integrated information lies and judge to which class it should be allocated. Taking 0.5 as the centerline (the green dotted line), if the overall information shifts to the left, we judge the patch to be a non-paddy field; if it shifts to the right, we judge it to be a rice paddy. Very few (almost no) examples fall exactly at the 0.5 position, so the value 0.5 serves as the threshold of determination. In this case, the information lies on the left side, so the value is 0, and vice versa. In this way, we can synthesize all the information to produce a final result. Equations (1) to (3) express the formulas as follows [40]:
$$X_1 = \min;\quad X_2 = 1 - \min;\quad X_3 = \mathrm{mean};\quad X_4 = 1 - \mathrm{mean};\quad X_5 = \max;\quad X_6 = 1 - \max$$

$$X_7 = X_1 \times X_2 + X_3 \times X_4 + X_5 \times X_6$$

$$Y_1 = \frac{X_1 \times X_3 \times X_5}{X_7},\qquad Y_2 = \frac{X_2 \times X_4 \times X_6}{X_7}$$

$$\text{if } Y_1 > Y_2 \text{ then class 0 (non-paddy area)};\qquad \text{if } Y_1 < Y_2 \text{ then class 1 (paddy area)}$$
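The combination in these formulas can be sketched numerically as below. The sketch reads X2, X4, and X6 as the complements 1 − min, 1 − mean, and 1 − max, and reads Y1 and Y2 as fractions over X7; both readings are assumptions about the original notation. The function name and the input values are hypothetical, and the class assignment is left to the comparison of Y1 and Y2 stated in the text.

```python
def combine_levels(x_min, x_mean, x_max):
    """Combine the three alpha-cut levels into the pair (Y1, Y2).

    Assumed reading of the printed formulas:
      X1, X3, X5 = min, mean, max
      X2, X4, X6 = 1 - min, 1 - mean, 1 - max
      X7 = X1*X2 + X3*X4 + X5*X6
      Y1 = X1*X3*X5 / X7,  Y2 = X2*X4*X6 / X7
    """
    for value in (x_min, x_mean, x_max):
        if not 0.0 <= value <= 1.0:
            raise ValueError("levels must lie in [0, 1]")
    x1, x3, x5 = x_min, x_mean, x_max
    x2, x4, x6 = 1.0 - x1, 1.0 - x3, 1.0 - x5
    x7 = x1 * x2 + x3 * x4 + x5 * x6
    if x7 == 0.0:
        raise ValueError("X7 is zero: every level is exactly 0 or 1")
    return (x1 * x3 * x5) / x7, (x2 * x4 * x6) / x7


# Illustrative levels lying left of the 0.5 centerline (not values from the study).
y1, y2 = combine_levels(0.2, 0.3, 0.4)
```

For levels that all lie below 0.5, the product of the complements dominates, so Y2 exceeds Y1; the reverse holds when all levels lie above 0.5.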

4. Results and Discussion

The results of this study consist of three major parts: (1) spectral characteristic extraction analysis of paddy and non-paddy fields; (2) integration of the spectral characteristics of satellite imagery and numerical aerial images for the knowledge-based unit of the paddy area; and (3) multi-scale image classification analysis. A decision-making interpretation module is established for the high-uncertainty information in the classification of paddy field images. The results are summarized as follows:

4.1. The Image Feature Analysis

This study uses the ancillary information of multi-period images, which contains the original bands of the image, vegetation information, and texture information. Images of different scales are selected to extract the growth characteristics of the paddy area under different conditions. The Regional Object Classification (ROC) technique [41] transfers the image information from the pixel scale to the regional-scale operating unit and establishes the paddy area information. The designed program is used to display the patches as needed.

4.2. Integration of the Spectral Characteristics of Satellite Imageries and Aerial Photos

The material includes: (a) the R, G, B and NIR bands; (b) vegetation indicators; and (c) texture information from the satellite imagery and aerial photos. The study adopted the mean value of these attribute data. Paddy areas are classified by patch detection: an image segmentation technique separates the image into different regions (patches), and the attribute values within each region are summed and averaged to form the material for patch detection. A knowledge database of paddy area patches is established and organized so that the FDS module can reduce misjudgments. The inconsistency among the two image types and three classifiers is analyzed by the probability process.
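The summation and averaging of each region can be sketched as follows. This is only an illustration, assuming that every pixel already carries a patch identifier from the segmentation step; the function name `patch_means` and the pixel values are hypothetical.

```python
from collections import defaultdict


def patch_means(patch_ids, values):
    """Average one attribute (e.g., a band or an index) over each patch."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pid, value in zip(patch_ids, values):
        sums[pid] += value      # summation within the region
        counts[pid] += 1
    # averaging: one mean value per patch
    return {pid: sums[pid] / counts[pid] for pid in sums}


# Hypothetical pixels: patch identifier and an attribute value per pixel
patch_ids = [1, 1, 2, 2, 2]
pixel_values = [0.2, 0.4, 0.6, 0.7, 0.8]
means = patch_means(patch_ids, pixel_values)
```

The same helper would be applied once per attribute (original band, vegetation index, or texture measure) to build the patch-level attribute table.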

4.3. Multiple Feature Classification Analysis Results of Multi-Scale Images

4.3.1. The Results of Feature Classification

In this study, there are 26,815 patches, of which 6784 are paddy area patch samples and 20,031 are non-paddy area patch samples. For the training process, 3193 paddy area samples and 8783 non-paddy area samples were randomly selected. The verification sample areas contain the remaining 3591 paddy field samples and 11,248 non-paddy field samples.
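A random selection of training samples per class can be sketched as below. The sample counts are those reported above; the function name `split_by_class`, the fixed seed, and the label ordering are assumptions made for illustration, since the study does not describe its sampling code.

```python
import random


def split_by_class(labels, n_train_per_class, seed=0):
    """Randomly pick a fixed number of training samples from each class."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    train, verify = [], []
    for cls, n_train in n_train_per_class.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        train.extend(idx[:n_train])    # training samples of this class
        verify.extend(idx[n_train:])   # the rest go to verification
    return sorted(train), sorted(verify)


# 1 = paddy (6784 patches), 0 = non-paddy (20,031 patches)
labels = [1] * 6784 + [0] * 20031
train_idx, verify_idx = split_by_class(labels, {1: 3193, 0: 8783})
```

The verification set then holds 3591 paddy and 11,248 non-paddy patches, which matches the counts given in the text.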
Table 3 lists the classification results of the Logistic Regression, Support Vector Machine, and Neural Network models for the training and verification sample areas. For the satellite imagery (original band + vegetation index + texture information), the overall accuracy of the training classification is between 94% and 96% with Kappa between 0.86 and 0.91, and the overall accuracy of the verification classification is between 94% and 95% with Kappa between 0.84 and 0.87. For the aerial imagery (original band + vegetation index + texture information), the overall accuracy of the training classification is between 95% and 96% with Kappa between 0.89 and 0.91, and the overall accuracy of the verification classification is between 95% and 96% with Kappa between 0.88 and 0.90. Overall, the spatial resolution of the satellite images is poorer than that of the aerial photos, although both give good results. However, these classification outcomes fall into two states: results with a high degree of certainty and results with uncertainty. High-certainty patches are not processed further, although they may include some negligible errors. The remaining samples are handled through the Gaussian blur and the probability values, using the concept of a fuzzy set to obtain the intersection of the fuzzy probability values.
Figure 6a shows the patches classified inconsistently from the satellite image (1107 inconsistent samples), and Figure 6b shows the 683 inconsistent patches from the aerial image. Figure 6c presents the six classification results together (satellite image and aerial image): 2488 patches are not consistent across the six results. They comprise (1) patches whose satellite image classification results were consistent but whose aerial image classification results were inconsistent (575 patches); (2) patches whose aerial image classification results were consistent but whose satellite image classification results were inconsistent (999 patches); and (3) patches whose satellite image classification results and aerial image classification results were both inconsistent (914 patches). These figures show that images of different resolutions differ in their ability to interpret paddy areas (Figure 6a,b). The classification uncertainty of the satellite imagery is clearly higher than that of the aerial photos; in addition to the lower resolution, clouds and fog may interfere with the classification process. Although they show fewer uncertainties, the aerial photos are not perfect either, largely because the photos capture only one period of paddy rice growth (only one realization). This is the first piece of evidence. The second piece of evidence is that the erroneous regions occur randomly, with no dominant error from any one model, as a comparison of Figure 6a,b shows. Under the same conditions, different classification methods produce different classification results. Although past studies have tended to ignore the impact of these problems, it is important to address them effectively.
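The screening of inconsistent patches can be illustrated with a short sketch. The category labels below (in the order Logistic, SVM, ANN) are copied from four rows of Table 2; the function name `agreement` and the returned keys are hypothetical, and the grouping is only one plausible way to flag disagreement among the six results.

```python
def agreement(sat, aerial):
    """Report whether the three classifiers agree within and across sources."""
    sat_agree = len(set(sat)) == 1          # three satellite results identical
    aerial_agree = len(set(aerial)) == 1    # three aerial results identical
    all_agree = sat_agree and aerial_agree and sat[0] == aerial[0]
    return {"satellite": sat_agree, "aerial": aerial_agree, "all_six": all_agree}


# patch number: (satellite categories, aerial categories), from Table 2
samples = {
    1: ((0, 0, 0), (0, 0, 0)),
    42643: ((0, 0, 0), (1, 1, 0)),
    54514: ((1, 1, 0), (1, 1, 1)),
    66191: ((0, 0, 0), (1, 1, 1)),
}

# Patches whose six results do not all agree are passed on as uncertain
uncertain = [pid for pid, (s, a) in samples.items()
             if not agreement(s, a)["all_six"]]
```

Patch 66191 shows why all six results must be compared: each image source is internally consistent, yet the two sources disagree with each other.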

4.3.2. Establishment of the Results of the Decision-Making Interpretation Module

This study develops a decision-making mechanism for the uncertain paddy area data generated by the above analysis. For each patch with high information uncertainty, it extracts the classification probability value of each algorithm. Based on these probability values, a Gaussian blur is generated for each algorithm, and the maximum–minimum multiplier operation is then performed to obtain the preliminary fuzzy outputs. The classification signals of the different scales are integrated using the probability distribution values of the evidence theory (DS) as a reference. Finally, this decision-making process determines the categories of the unknown patches, which effectively enhances the overall accuracy.
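The threshold cut of a Gaussian membership function can be written in closed form, which the sketch below illustrates. It assumes a single Gaussian membership function exp(-(x - c)^2 / (2 sigma^2)) with a hypothetical centre c and spread sigma, clipped to the probability range [0, 1]; how the study builds and combines the Gaussian patterns of the six classifiers is not specified here, so this is not a reproduction of its procedure.

```python
import math


def gaussian_alpha_cut(center, sigma, alpha=0.9):
    """Return (lower bound, centre, upper bound) of the alpha-cut interval."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Solve exp(-(x - c)^2 / (2 sigma^2)) = alpha for the half width
    half_width = sigma * math.sqrt(-2.0 * math.log(alpha))
    lower = max(0.0, center - half_width)   # clip to the probability range
    upper = min(1.0, center + half_width)
    return lower, center, upper


# Hypothetical membership function centred on 0.5 with spread 0.1
lower, middle, upper = gaussian_alpha_cut(0.5, 0.1, alpha=0.9)
```

A larger threshold gives a narrower interval, so an adjustable cut value controls how tightly the extracted minimum and maximum bracket the centre.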
The classification results were then overlaid on the ground truth of the study area (see Table 2). Among the six classification results (satellite image and aerial image), the consistent patches comprise 5596 patches classified as paddy area and 18,731 patches classified as non-paddy area, which is 90.72% of the total (Figure 7a and Table 4). Among these, 169 paddy area samples were misjudged as non-paddy area and 57 non-paddy area samples were misjudged as paddy area, which accounts for 0.84% of the total samples. The patches with inconsistent results among the six classifications (satellite imagery and aerial imagery) were processed by the FDS model. The FDS corrected 844 paddy area patches and 1138 non-paddy area patches, which account for 7.39% of the total samples (Figure 7b and Table 4). However, the FDS misjudged 232 paddy area patches and 274 non-paddy area patches, which account for 1.89% of the total samples (Figure 7c and Table 4).
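The percentages above can be checked directly from the patch counts in Table 4. The following sketch only repeats that arithmetic; the dictionary keys are descriptive labels chosen here.

```python
TOTAL_PATCHES = 26815

# Patch counts taken from Table 4
groups = {
    "consistent, correct": 5539 + 18562,
    "consistent, incorrect": 169 + 57,
    "inconsistent, corrected by FDS": 844 + 1138,
    "inconsistent, misjudged by FDS": 232 + 274,
}

# Share of each group in the whole data set, in percent
shares = {name: round(100.0 * count / TOTAL_PATCHES, 2)
          for name, count in groups.items()}

# All consistent patches, correct or not
consistent_share = round(100.0 * (5596 + 18731) / TOTAL_PATCHES, 2)
```

The four groups add up to the 26,815 patches of the study, and the consistent patches (correct and incorrect together) give the 90.72% quoted in the text.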
Table 5 gives the fuzzy-DS overall accuracy and classification results. Of the patches whose ground truth is paddy area, 6383 were classified as paddy area and 401 as non-paddy area. Of the patches whose ground truth is non-paddy area, 330 were classified as paddy area and 19,701 as non-paddy area. The user accuracy of the paddy area was 94.09% and the user accuracy of the non-paddy area was 98.35%. The producer accuracy of the paddy area was 95.08% and the producer accuracy of the non-paddy area was 98.01%. The overall accuracy is 97.27%, and the kappa value is enhanced to 0.93. The classification results are shown in Figure 8.
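The overall accuracy and the kappa value can be recomputed from the four counts in Table 5. The sketch below uses the standard definition of Cohen's kappa; the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of patches on the diagonal
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Rows: ground truth (paddy, non-paddy); columns: FDS result (paddy, non-paddy)
table5 = [[6383, 401],
          [330, 19701]]
accuracy, kappa = overall_accuracy_and_kappa(table5)
```

Rounded to the precision used in the text, the two values reproduce the reported 97.27% and 0.93.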
Furthermore, the advantages of the FDS method are presented in Figure 9. Figure 9a compares the three classifiers using the two data sources in the empirical area. It clearly shows that the same method can give different outcomes with different data sources, and that the same data source gives different outcomes with different classification methods. This is because each classifier has its own characteristics, which determine how well it can handle complex types of classification; no classifier can perform perfectly in image classification. We further discuss how this study improves the highly uncertain patches through the FDS method. Because the number of patches is too large, we can only extract a small range of data from Figure 9a (black frame), which contains a total of 1378 patches. From the patches with inconsistent classifications (blue frame), we randomly selected five, numbered 13,227, 13,240, 13,281, 13,336, and 13,795. For instance, for sample 13,227 the satellite image classification is rice, while the aerial image classification is not rice. The ground truth is not rice, and the final classification of this study through FDS is not rice, so the evidence is clear. In another example, for sample 13,795 the satellite image classification is not rice, but the aerial image classification is rice. The ground truth is rice, and the final classification of this study through FDS is rice, so the evidence is again clear. However, this tool is not perfect: omission errors and commission errors also occur. Most of them occur when the interpretation probabilities of rice and non-rice on the two image data are very close, a situation that happens occasionally.
We summarize several possible reasons, such as the interference of clouds or fog with image quality. The errors may also be produced by the complexity of the ground crops or by multiple crops within one patch. In addition, the diversity of the classifier characteristics is another important factor that affects the results. These reasons have been discussed above and may be addressed in the future. We have compiled examples of these uncertain patches in Table 6. In total, the misjudgments and missed judgments amount to 731 patches, about 3% of the entire data set. This error level is acceptable in practice, and the result is suitable for plotting a thematic map. To sum up, these uncertainties were seldom discussed in past research, but the FDS system in this research can integrate the uncertainties of these factors and at the same time obtain higher-precision classification results. The contribution of this research is to integrate data resources and various classifiers and combine them in a heterogeneous FDS system. The advantage of the FDS is that commission errors and omission errors can be fixed by the probability values of the judgments. However, a sample may have low uncertainty in the initial stage and still be wrong: for example, a sample is a rice patch according to all the classifiers and material data but is not a rice patch in the ground truth, or vice versa. Such a wrong judgment cannot be fixed by the FDS approach. This is the disadvantage of the FDS model.

5. Summary and Conclusions

Data fusion technology generally employs artificial intelligence and machine learning with a spatial network to develop simulation upgrade models. Our study builds a probability-based system that employs LR, SVM, and NN to resolve a complicated paddy rice determination problem. Past studies have rarely tackled classification uncertainties, and this study effectively improves the handling of uncertainty in patch detection. We refer to the concept of Esteban et al. [23] and integrate supervised learning to propose a new concept for the paddy area thematic map. The greatest feature of the study is not image fusion but information integration: approaches from statistical analysis, machine learning, and artificial intelligence can be rationally combined to produce better classification outcomes. Therefore, the study applies the concept of ensemble learning to obtain a thematic map of the patches from image data of multiple resources and scales. The Fuzzy-DS incorporates different evidence from different data resources with an adjustable α cut value to obtain better classification outcomes.
In our classification outcomes, the accuracy of a single classifier on the satellite image was between 94% and 95% (Kappa: 0.84–0.87), and the accuracy of the aerial image classification was between 95% and 96% (Kappa: 0.88–0.90). Through the FDS process, an integrated classification result was obtained in which the overall classification accuracy increased to 97.27% and the kappa value to 0.93. Of the 2488 inconsistent patches, 79.88% were amended to the correct category. The 226 patches that were misclassified consistently by all six results cannot be amended, and a further 506 patches were misjudged by the FDS. These 732 patches (226 + 506) are the commission and omission errors, which make up only 2.73% of the entire dataset.

Author Contributions

T.C.L. was responsible for the plan and design of this study and analyzed the data and discussion. S.W. helped with writing the manuscript and discussing the results. S.-C.W. wrote the computer program. H.-P.W. used the program to plot the thematic map and generate the tables. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by 109 Agriculture Science-7.1.2-Food-Z3(4).

Acknowledgments

The authors would like to thank S.G. Huang and P.Y. Li of the Agricultural Production Section of the Agricultural Council for providing image data and related information. The authors are also very grateful to Z. H. Zhu, Department of Geography, National Taiwan University, for his advice and suggestions for this project.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kuenzer, C.; Knauer, K. Remote sensing of rice crop areas. Int. J. Remote Sens. 2012, 34, 2101–2139.
  2. Diuk-Wasser, M.A.; Dolo, G.; Bagayoko, M.; Sogoba, N.; Toure, M.B.; Moghaddam, M.; Manoukis, N.; Rian, S.; Traore, S.F.; Taylor, C.E. Patterns of irrigated rice growth and malaria vector breeding in Mali using multi-temporal ERS-2 synthetic aperture radar. Int. J. Remote Sens. 2006, 27, 535–548.
  3. El Hajj, M.; Bégué, A.; Guillaume, S.; Martiné, J.-F. Integrating SPOT-5 time series, crop growth modeling and expert knowledge for monitoring agricultural practices—The case of sugarcane harvest on Reunion Island. Remote Sens. Environ. 2009, 113, 2052–2061.
  4. Chen, C.; Son, N.; Chang, L. Monitoring of rice cropping intensity in the upper Mekong Delta, Vietnam using time-series MODIS data. Adv. Space Res. 2012, 49, 292–301.
  5. Choudhury, I.; Chakraborty, M.; Parihar, J.S. Estimation of rice growth parameter and crop phenology with conjunctive use of RADARSAT and ENVISAT. In Proceedings of the Envisat Symposium, Montreux, Switzerland, 23–27 April 2007.
  6. Liew, S.C.; Kam, S.P.; Tuong, T.P.; Chen, P.; Minh, V.Q.; Lim, H. Application of multitemporal ERS-2 synthetic aperture radar in delineating rice cropping systems in the Mekong River Delta, Vietnam. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1412–1420.
  7. Shen, S.; Yang, S.; Li, B.; Tan, B.; Li, Z.; Le Toan, T. A scheme for regional rice yield estimation using ENVISAT ASAR data. Sci. China Ser. D Earth Sci. 2009, 52, 1183–1194.
  8. Panigrahy, R.K.; Ray, S.S.; Panigrahy, S. Study on the utility of IRS-P6 AWiFS SWIR band for crop discrimination and classification. J. Indian Soc. Remote Sens. 2009, 37, 325–333.
  9. Quarmby, N.A.; Milnes, M.; Hindle, T.L.; Silleos, N. The use of multi-temporal NDVI measurements from AVHRR data for crop yield estimation and prediction. Int. J. Remote Sens. 1993, 14, 199–210.
  10. Ryu, C.; Suguri, M.; Umeda, M. Multivariate analysis of nitrogen content for rice at the heading stage using reflectance of airborne hyperspectral remote sensing. Field Crop Res. 2011, 122, 214–224.
  11. Senthilnath, J.; Kandukuri, M.; Dokania, A.; Ramesh, K. Application of UAV imaging platform for vegetation analysis based on spectral-spatial methods. Comput. Electron. Agric. 2017, 140, 8–24.
  12. Anys, H.; He, D.-C. Evaluation of textural and multipolarization radar features for crop classification. IEEE Trans. Geosci. Remote Sens. 1995, 33, 1170–1181.
  13. Jaggi, S.; Quattrochi, D.A.; Lam, N.S.-N. Implementation and operation of three fractal measurement algorithms for analysis of remote-sensing data. Comput. Geosci. 1993, 19, 745–767.
  14. Chica-Olmo, M.; Abarca-Hernández, F. Computing geostatistical image texture for remotely sensed data classification. Comput. Geosci. 2000, 26, 373–383.
  15. Lillesand, T.; Kiefer, R.W.; Chipman, J. Remote Sensing and Image Interpretation; John Wiley and Sons: Hoboken, NJ, USA, 2015.
  16. Alajlan, N.; Bazi, Y.; Melgani, F.; Yager, R.R. Fusion of supervised and unsupervised learning for improved classification of hyperspectral images. Inf. Sci. 2012, 217, 39–55.
  17. Li, M.; Zang, S.; Zhang, B.; Li, S.; Wu, C. A Review of Remote Sensing Image Classification Techniques: The Role of Spatio-contextual Information. Eur. J. Remote Sens. 2014, 47, 389–411.
  18. Marconcini, M.; Camps-Valls, G.; Bruzzone, L. A Composite Semisupervised SVM for Classification of Hyperspectral Images. IEEE Geosci. Remote Sens. Lett. 2009, 6, 234–238.
  19. Deng, Z.; Sun, H.; Zhou, S.; Zhao, J.; Lei, L.; Zou, H. Multi-scale object detection in remote sensing imagery with convolutional neural networks. ISPRS J. Photogramm. Remote Sens. 2018, 145, 3–22.
  20. Dubovyk, O.; Menz, G.; Conrad, C.; Kan, E.; Machwitz, M.; Khamzina, A. Spatio-temporal analyses of cropland degradation in the irrigated lowlands of Uzbekistan using remote-sensing and logistic regression modeling. Environ. Monit. Assess. 2012, 185, 4775–4790.
  21. Tuia, D.; Pasolli, E.; Emery, W. Using active learning to adapt remote sensing image classifiers. Remote Sens. Environ. 2011, 115, 2232–2242.
  22. You, J.; Zhang, J. Uncertainty Comparison for Area-Class Maps Concerning Different Reference Data. Procedia Environ. Sci. 2011, 10, 2075–2082.
  23. Esteban, J.; Starr, A.; Willetts, R.; Hannah, P.; Bryanston-Cross, P. A Review of data fusion models and architectures: Towards engineering guidelines. Neural Comput. Appl. 2005, 14, 273–281.
  24. Richard, J. Combining rough and fuzzy sets for feature selection. Ph.D. Thesis, University of Edinburgh, Edinburgh, UK, 2005.
  25. Franklin, S.; Ramamurthy, U.; D’Mello, S.; McCauley, L.; Negatu, A.; Silva, R.; Datla, V. LIDA: A computational model of global workspace theory and developmental learning. In Proceedings of the AAAI Fall Symposium on AI and Consciousness: Theoretical Foundations and Current Approaches, Arlington, VA, USA, 9–11 November 2007.
  26. Anderson, J.R. The Architecture of Cognition; Harvard University Press: Cambridge, MA, USA, 1983.
  27. Laird, J.E. The Soar Cognitive Architecture; MIT Press: Cambridge, MA, USA, 2012; ISBN 978-0262122962.
  28. Linn, R.J.; Hall, D.L. A survey of data fusion systems. In Proceedings of the SPIE Conference on Data Structure and Target Classification, Orlando, FL, USA, 1–8 August 1991; Volume 1470, pp. 13–36.
  29. Ayari, I.; Haton, J.P. A framework for multi-sensor data fusion. In Proceedings of the IEEE Symposium on Emerging Technologies and Factory Automation, Paris, France, 10–13 October 1995; Volume 2, pp. 51–59.
  30. Nigay, L.; Coutaz, J. A generic platform for addressing the multimodal challenge. In Proceedings of the 27th International Conference on Human Factors in Computing Systems-CHI 95, Association for Computing Machinery (ACM), Denver, CO, USA, 7–11 May 1995; Volume 1, pp. 98–105.
  31. Bruzzone, L.; Fernandez, D.; Vernazza, G. Data fusion experience: From industrial visual inspection to space remote-sensing application. In Proceedings of the Academic and Industrial Cooperation in Space research (ESA SP-432), Vienna, Austria, 4–6 November 1998; pp. 147–151.
  32. Binaghi, E.; Madella, P. Fuzzy Dempster-Shafer reasoning for rule-based classifiers. Int. J. Intell. Syst. 1999, 14, 559–583.
  33. Santacana, N.; Baeza, B.; Corominas, J.; De Paz, A.; Marturiá, J. A GIS-Based Multivariate Statistical Analysis for Shallow Landslide Susceptibility Mapping in La Pobla de Lillet Area (Eastern Pyrenees, Spain). Nat. Hazards 2003, 30, 281–295.
  34. Ohlmacher, G.C.; Davis, J.C. Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA. Eng. Geol. 2003, 69, 331–343.
  35. Lee, S.; Sambath, T. Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models. Environ. Earth Sci. 2006, 50, 847–855.
  36. Shiu, Y.-S.; Chuang, Y.C. Yield Estimation of Paddy Rice Based on Satellite Imagery: Comparison of Global and Local Regression Models. Remote Sens. 2019, 11, 111.
  37. Wan, S.; Chang, S.-H. Crop classification with WorldView-2 imagery using Support Vector Machine comparing texture analysis approaches and grey relational analysis in Jianan Plain, Taiwan. Int. J. Remote Sens. 2018, 40, 8076–8092.
  38. Werbos, P. Beyond Regression, New tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974.
  39. Kumar, P.; Prasad, R.; Mishra, V.N.; Gupta, D.K.; Singh, S.K. Artificial neural network for crop classification using C-band RISAT-1 satellite datasets. Russ. Agric. Sci. 2016, 42, 281–284.
  40. D’Agostini, G. Bayesian Reasoning in Data Analysis—A Critical Introduction; World Scientific: Singapore, 2003.
  41. Wan, S.; Lei, T.C.; Chou, T.Y. Optimized object-based image classification: Development of landslide knowledge decision support system. Arab. J. Geosci. 2013, 7, 2059–2070.
Figure 1. The general model of the data fusion technology [23].
Figure 2. The research area (Meinong District, Kaohsiung City).
Figure 3. Research material. (a) Satellite image taken on 1 February 2015 (the rice transplanting stage). (b) Satellite image taken on 2 April 2015 (the rice tillering stage). (c) Aerial photo image (DMC) taken on 2 April 2015.
Figure 4. Data fusion steps of classification analysis through multi-scale images.
Figure 5. Solutions for the decision procedure of the uncertainty patches. (a) Probability fuzzification diagram. (No. of Patch is 42,643). (b) Probability de-fuzzification diagram. (c) Decision results of the uncertainty patches.
Figure 6. Inconsistent classification results of (a) satellite images, (b) aerial images, (c) inconsistent classification of six classification results (considering satellite image and aerial image for LR, SVM, ANN).
Figure 7. Results of FDS. (a) The consistent patches of the six classification results (considering satellite image and aerial image with LR, SVM, ANN). (b) The inconsistent patches corrected by the FDS program. (c) The inconsistent patches misjudged by the FDS program.
Figure 8. The integration results of FDS outputs for the Data Fusion concept of Figure 4.
Figure 9. Results of image data fusion performed through three classifiers using two data sources. (a) Comparison of the six different results with the FDS result. (b) Detail of the selected region for observing the uncertainty patches.
Table 1. Formula list of vegetation indices and texture information.

| Vegetation Indices | Formula | Texture Indices | Formula |
|---|---|---|---|
| RVI | $\frac{R}{NIR}$ | Homogeneity | $\sum_{i=0}^{N}\sum_{j=0}^{N}\frac{1}{1+(i-j)^{2}}\,C_{ij}(d,\theta)$ |
| NDVI | $\frac{NIR-R}{NIR+R}$ | Contrast | $\sum_{i}\sum_{j}\lvert i-j\rvert^{2}\,p(i,j)$ |
| PVI | $\frac{NIR-NIR_{Soil}}{\sqrt{1+B^{2}}}$ | Dissimilarity | $\sum_{i=0}^{n}\sum_{j=0}^{n}C_{ij}\,\lvert i-j\rvert$ |
| CMFI | $\frac{R}{NIR+R}$ | Entropy | $-\sum_{i=0}^{n}\sum_{j=0}^{n}C_{ij}\log C_{ij}$ |
| GI | $\frac{NIR}{G}$ | Variance | $\sum_{i=0}^{n}\sum_{j=0}^{n}(i-\mu)^{2}\,p(i,j)$ |
| IPVI | $\frac{NIR}{NIR+R}$ | Mean | $\frac{1}{n}\sum_{i=0}^{n}\sum_{j=0}^{n}P_{ij}$ |
| MSAVI | $\frac{2NIR+1-\sqrt{(2NIR+1)^{2}-8(NIR-R)}}{2}$ | Second Moment | $\sum_{i=0}^{n}\sum_{j=0}^{n}\{P(i,j)\}^{2}$ |
| OSAVI | $\frac{NIR-R}{NIR+R+Y}$ | | |
| GSVI | $\frac{NIR-NIR_{Soil}}{R+Z}$ | | |
| SAVI | $(1+L)\times\frac{NIR-R}{NIR+R+L}$ | | |
| TSAVI | $\frac{B\,(NIR-NIR_{Soil})}{R+B\,(NIR-A)+X\,(1+B^{2})}$ | | |

The experience factor: $L = 0.5;\ X = 0.08;\ Y = 0.16;\ Z = 0.35$
Soil linear equation considering multiple scattering conditions: $R_{\mathrm{Soil}} = A + B\,R;\ (A = 0.011,\ B = 1.16)$
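Three of the vegetation indices in Table 1 can be computed as in the sketch below. It uses the NDVI, SAVI, and OSAVI formulas with the experience factors L = 0.5 and Y = 0.16 listed with the table; the function names and the reflectance values are hypothetical, and the inputs are assumed to be patch-mean reflectances for which the denominators are not zero.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with experience factor L."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def osavi(nir, red, offset=0.16):
    """Optimized soil-adjusted vegetation index with experience factor Y."""
    return (nir - red) / (nir + red + offset)


# Hypothetical patch-mean reflectances in the NIR and red bands
nir_mean, red_mean = 0.5, 0.1
indices = {
    "NDVI": ndvi(nir_mean, red_mean),
    "SAVI": savi(nir_mean, red_mean),
    "OSAVI": osavi(nir_mean, red_mean),
}
```

The remaining indices follow the same pattern, with the soil-line parameters A and B supplied for the indices that need them.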
Table 2. The classification outputs for the corresponding probability of each algorithm. Each classifier cell gives the category information followed by the classification probability in parentheses.

| No. of Patches | Ground Truth | Satellite: Logistic | Satellite: SVM | Satellite: ANN | Aerial: Logistic | Aerial: SVM | Aerial: ANN |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 (0.00) | 0 (0.01) | 0 (0.07) | 0 (0.01) | 0 (0.01) | 0 (0.00) |
| 2 | 0 | 0 (0.00) | 0 (0.00) | 0 (0.07) | 0 (0.02) | 0 (0.07) | 0 (0.03) |
| 3 | 0 | 0 (0.06) | 0 (0.07) | 0 (0.07) | 0 (0.00) | 0 (0.00) | 0 (0.00) |
| 4 | 0 | 0 (0.00) | 0 (0.01) | 0 (0.07) | 0 (0.05) | 0 (0.07) | 0 (0.02) |
| 5 | 0 | 0 (0.00) | 0 (0.00) | 0 (0.07) | 0 (0.02) | 0 (0.04) | 0 (0.01) |
| 6 | 0 | 0 (0.00) | 0 (0.01) | 0 (0.07) | 0 (0.01) | 0 (0.05) | 0 (0.03) |
| 7 | 0 | 0 (0.00) | 0 (0.00) | 0 (0.07) | 0 (0.01) | 0 (0.06) | 0 (0.01) |
| 39 | 1 | 1 (1.00) | 1 (1.00) | 1 (0.87) | 1 (0.98) | 1 (0.99) | 1 (0.92) |
| 44 | 1 | 1 (0.98) | 1 (0.99) | 1 (0.87) | 1 (0.97) | 1 (0.96) | 1 (0.93) |
| 48 | 1 | 1 (1.00) | 1 (1.00) | 1 (0.87) | 1 (0.92) | 1 (0.86) | 1 (0.89) |
| 54 | 1 | 1 (1.00) | 1 (1.00) | 1 (0.87) | 1 (0.91) | 1 (0.97) | 1 (0.93) |
| 62 | 1 | 1 (0.93) | 1 (0.91) | 1 (0.87) | 1 (0.88) | 1 (0.93) | 1 (0.93) |
| 77 | 1 | 1 (0.997) | 1 (0.999) | 1 (0.868) | 1 (0.909) | 1 (0.964) | 1 (0.931) |
| 84 | 1 | 1 (0.995) | 1 (0.996) | 1 (0.868) | 1 (0.98) | 1 (0.983) | 1 (0.928) |
| 42643 | 0 | 0 (0.132) | 0 (0.102) | 0 (0.072) | 1 (0.594) | 1 (0.548) | 0 (0.329) |
| 54514 | 1 | 1 (0.977) | 1 (0.526) | 0 (0.072) | 1 (0.983) | 1 (0.961) | 1 (0.932) |
| 55063 | 0 | 0 (0.348) | 0 (0.302) | 1 (0.808) | 0 (0.033) | 0 (0.091) | 0 (0.016) |
| 55068 | 1 | 0 (0.423) | 1 (0.606) | 1 (0.868) | 1 (0.963) | 1 (0.976) | 1 (0.932) |
| 55193 | 0 | 1 (0.849) | 1 (0.563) | 1 (0.823) | 1 (0.661) | 0 (0.159) | 0 (0.077) |
| 58327 | 1 | 1 (0.721) | 0 (0.214) | 0 (0.081) | 1 (0.936) | 1 (0.906) | 1 (0.922) |
| 66191 | 0 | 0 (0.002) | 0 (0.018) | 0 (0.075) | 1 (0.639) | 1 (0.723) | 1 (0.824) |
Table 3. The classification results of single approach (satellite images and aerial images).

| Classifier | Stage | Class | Satellite: User's Accuracy | Satellite: Producer's Accuracy | Satellite: Accuracy | Satellite: KAPPA | Aerial: User's Accuracy | Aerial: Producer's Accuracy | Aerial: Accuracy | Aerial: KAPPA |
|---|---|---|---|---|---|---|---|---|---|---|
| Logistic | Training | Paddy | 91.79% | 94.52% | 96.39% | 0.91 | 93.83% | 92.58% | 96.35% | 0.91 |
| Logistic | Training | non-Paddy | 98.06% | 97.05% | | | 97.27% | 97.75% | | |
| Logistic | Testing | Paddy | 90.62% | 90.01% | 95.30% | 0.87 | 93.68% | 91.09% | 96.25% | 0.9 |
| Logistic | Testing | non-Paddy | 96.79% | 97.00% | | | 97.08% | 97.96% | | |
| SVM | Training | Paddy | 90.39% | 94.75% | 96.10% | 0.9 | 91.79% | 92.26% | 95.76% | 0.89 |
| SVM | Training | non-Paddy | 98.18% | 95.56% | | | 97.20% | 97.02% | | |
| SVM | Testing | Paddy | 89.42% | 90.86% | 95.26% | 0.87 | 92.12% | 90.28% | 95.69% | 0.88 |
| SVM | Testing | non-Paddy | 97.13% | 96.64% | | | 96.83% | 97.47% | | |
| ANN | Training | Paddy | 88.16% | 91.90% | 94.77% | 0.86 | 91.54% | 93.84% | 96.14% | 0.9 |
| ANN | Training | non-Paddy | 97.18% | 95.76% | | | 97.81% | 96.95% | | |
| ANN | Testing | Paddy | 88.25% | 87.28% | 94.04% | 0.84 | 91.76% | 91.71% | 96.00% | 0.89 |
| ANN | Testing | non-Paddy | 95.89% | 96.23% | | | 97.35% | 97.37% | | |
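Table 4. FDS statistics results of overall patches.

| Classification results of the satellite images and aerial images | Identification | Case | Patches | Subtotal | Share of total |
|---|---|---|---|---|---|
| Consistent | Correct patches identified | Paddy | 5539 | 24,101 | 89.88% |
| Consistent | Correct patches identified | Non-Paddy | 18,562 | | |
| Consistent | Incorrect patches identified | Paddy identified as non-Paddy | 169 | 226 | 0.84% |
| Consistent | Incorrect patches identified | Non-Paddy identified as Paddy | 57 | | |
| Inconsistent | Correct patches identified | Paddy | 844 | 1982 | 7.39% |
| Inconsistent | Correct patches identified | Non-Paddy | 1138 | | |
| Inconsistent | Incorrect patches identified | Paddy identified as non-Paddy | 232 | 506 | 1.89% |
| Inconsistent | Incorrect patches identified | Non-Paddy identified as Paddy | 274 | | |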
Table 5. FDS overall accuracy and classification results.

| Ground Truth | Classification Result: Paddy | Classification Result: Non-Paddy | Total | User's Accuracy |
|---|---|---|---|---|
| Paddy | 6383 | 401 | 6784 | 94.09% |
| Non-Paddy | 330 | 19,701 | 20,031 | 98.35% |
| Total | 6713 | 20,102 | 26,815 | |
| Producer's Accuracy | 95.08% | 98.01% | Accuracy | 97.27% |
| kappa | | | | 0.93 |
Table 6. An example for uncertainty patches for analysis.

| No. of Patches | Ground Truth | Satellite: Logistic | Satellite: SVM | Satellite: ANN | Aerial: Logistic | Aerial: SVM | Aerial: ANN | Equation (1) of X7 | Equation (2) of Y1 | Equation (2) of Y2 | FDS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 13,227 | 0 | 0.28 | 0.87 | 0.10 | 0.00 | 0.00 | 0.00 | 0.2161 | 0.000 | 3.5362 | 0 |
| 13,240 | 0 | 0.28 | 0.59 | 0.36 | 0.07 | 0.01 | 0.09 | 0.4025 | 0.0065 | 1.3517 | 0 |
| 13,281 | 1 | 0.06 | 0.07 | 0.03 | 0.92 | 0.93 | 0.94 | 0.0663 | 0.1418 | 0.0000 | 1 |
| 13,336 | 1 | 0.17 | 0.07 | 0.31 | 0.72 | 0.87 | 0.85 | 0.3499 | 0.2986 | 0.0398 | 1 |
| 13,795 | 1 | 0.32 | 0.07 | 0.24 | 0.96 | 0.93 | 0.9 | 0.1515 | 5.5129 | 0.0000 | 1 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lei, T.C.; Wan, S.; Wu, S.-C.; Wang, H.-P. A New Approach of Ensemble Learning Technique to Resolve the Uncertainties of Paddy Area through Image Classification. Remote Sens. 2020, 12, 3666. https://0-doi-org.brum.beds.ac.uk/10.3390/rs12213666
