Evaluation of Different Machine Learning Algorithms for Scalable Classification of Tree Types and Tree Species Based on Sentinel-2 Data

Wessel, Mathias; Brandmeier, Melanie; Tiede, Dirk

doi:10.3390/rs10091419


by Mathias Wessel ¹,², Melanie Brandmeier ¹ and Dirk Tiede ²,*

¹ ESRI Deutschland, Department Science and Education, Ringstr. 7, 85402 Kranzberg, Germany
² Department of Geoinformatics - Z_GIS, University of Salzburg, Schillerstr. 30, 5020 Salzburg, Austria
* Author to whom correspondence should be addressed.

Remote Sens. 2018, 10(9), 1419; https://0-doi-org.brum.beds.ac.uk/10.3390/rs10091419

Submission received: 31 July 2018 / Revised: 28 August 2018 / Accepted: 31 August 2018 / Published: 6 September 2018

(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)


Abstract:

We use freely available Sentinel-2 data and forest inventory data to evaluate the potential of different machine-learning approaches to classify tree species in two forest regions in Bavaria, Germany. Atmospheric correction was applied to the level 1C data, resulting in true surface reflectance or bottom of atmosphere (BOA) output. We developed a semiautomatic workflow for the classification of coniferous (mainly spruce), beech and oak trees by evaluating different classification algorithms (object- and pixel-based) in an architecture optimized for distributed processing. A hierarchical approach was used to evaluate different band combinations and algorithms (Support Vector Machines (SVM) and Random Forest (RF)) for the separation of broad-leaved vs. coniferous trees. The Ebersberger forest was the main project region and the Freisinger forest was used in a transferability study. Training and accuracy assessment of the algorithms were based on inventory data; validation was conducted using an independent dataset. A confusion matrix with User's and Producer's Accuracies, as well as Overall Accuracies, was created for all analyses. In total, we tested 16 different classification setups for coniferous vs. broad-leaved trees, achieving the best performance of 97% for an object-based multitemporal SVM approach using only band 8 from three scenes (May, August and September). For the separation of beech and oak trees we evaluated 54 different setups; the best result achieved an accuracy of 91% for an object-based, multitemporal SVM approach using bands 8, 2 and 3 of the May scene for segmentation and all principal components of the August scene for classification. The transferability of the model was tested for the Freisinger forest and showed similar results. This project shows that Sentinel-2 yields only marginally worse results than comparable commercial high-resolution satellite sensors and is well suited for forest analysis at the tree-stand level.

Keywords:

Support Vector Machines; Random Forest; forest classification; GIS; remote sensing


1. Introduction

Analysis and classification of tree species and tree species groups has a long history in the field of remote sensing. Recently, climate discussions have become more prominent and forests are among the ecosystems most affected by climate change [1]. Therefore, an understanding of forest dynamics and quantitative methods to assess climatic impacts on species distributions are of great relevance. Further interest in analyzing forest structures is driven by users of environmental monitoring, spatial planning enforcement or ecosystem-oriented natural resources management systems [2,3]. There is a wide range of studies [4,5,6,7,8,9] focusing on the usage of high or very high-resolution images (e.g., WorldView-2, RapidEye satellite imagery data), but these datasets are often expensive and not freely available. In recent years, deca-metric-resolution imagery (e.g., Landsat) has become more easily accessible and often cost-free for a broad majority of users, leading to many research projects. The launch of the Sentinel-2 series in 2015 (Sentinel-2A) [10,11,12], a new mission providing free and open satellite data with land monitoring as its main objective, opened new possibilities for research. The 180° phased twin-satellite constellation was completed with the launch of Sentinel-2B in March 2017. In the context of forest analysis, the Sentinel-2 mission is very important due to its 10 m spatial resolution bands in the visible and near infrared region (VNIR), as well as four bands (5, 6, 7, 8a) of 20 m resolution in the red-edge region of the electromagnetic spectrum and two bands (11, 12) of 20 m resolution in the shortwave infrared (SWIR) [13]. The red-edge region is especially interesting, as it is well known for vegetation analysis, and Sentinel-2 offers more bands in this spectral range than comparable satellite missions like the Landsat series. Fassnacht et al. [14] highlight that the visible to shortwave infrared wavelength regions of Sentinel-2 mainly coincide with absorption features of plant pigments and water, making it an ideal sensor for the analysis of vegetation characteristics. Thus, Sentinel-2 can set new standards for vegetation analysis with deca-metric-resolution imagery for areas that do not have a high complexity at a small scale. As the twin constellation of Sentinel-2 is phased at 180° in the same orbit, a high temporal revisit frequency of 5 days facilitates change detection analysis. It also has the advantage that many different scenes are available for one area of interest, which increases the chance of obtaining cloud-free data.

So far, only a few studies have used Sentinel-2 data for forest analysis. Immitzer et al. [15] used Sentinel-2 data for their research at two forest sites as well as cropland in Bavaria (Germany). They tested object-based image analysis (OBIA) and pixel-based (PB) methods and concluded that, for their study area, OBIA achieved slightly better overall accuracies (66.2% vs. 63.5%) for the forest classification of seven tree groups. They mentioned the red edge part of the spectrum and the shortwave infrared region as highly important for their study. Ng et al. [16] evaluated vegetation indices for the classification of vegetation using Sentinel-2 and Pléiades data. They mapped Prosopis (negative impacts on biodiversity) and Vachellia (positive biodiversity effects as a natural resource) trees, as Prosopis has invaded Kenya strongly in recent years. In their classification, they tested the Random Forest (RF) classification algorithm and concluded that the blue, green and near-infrared bands of Sentinel-2 were important for classification purposes. Compared to the commercial Pléiades data with a 2 m spatial resolution of the multispectral bands, Sentinel-2 data showed similar results in their study. First positive results with Sentinel-2 data were also reported by Hawrylo et al. [17] in a study about forest defoliation. They tested RF and Support Vector Machine (SVM) algorithms to investigate the defoliation of Scots pines in Poland and concluded that Sentinel-2 data were well suited for the purpose. Immitzer et al. [18] also focused on the comparison of deca-metric- and deci-centi-metric-resolution data by comparing Landsat data with WorldView-2 data. The authors achieved good classification results for both satellite data types, but stated that the time of the image collection and the acquisition parameters had significant impacts on the results. Further examples of studies using Sentinel-2 for vegetation monitoring and/or mapping include those by Addabbo et al. [19], Puletti et al. [20] and Sothe et al. [21]. Puletti et al. [20] focused on Mediterranean environments with good results for forest group classification, while Sothe et al. [21] compared Sentinel-2 and Landsat-8 data for classifying successional forest stages in Brazil.

In our study we build on these previous studies with a focus on machine learning, but also on providing a workflow that is as simple as possible and can be scaled for processing large datasets. This latter focus arose from the interest of state agencies and industry in workflows that can be applied to other, larger datasets. We focused on OBIA versus PB methods using RF and SVM classifiers implemented as so-called raster functions (tools that are optimized for distributed and dynamic processing) in the ArcGIS Platform technology.

2. Materials and Methods

The overall approach of this study is shown in Figure 1. All aspects of the workflow will be described in the following sections.

2.1. Study Area

We selected two forests in Germany where forest inventory data and orthophotos were available for training and validation purposes. Figure 2 shows the project areas, the Ebersberger and the Freisinger Forest. Both regions exhibit almost complete tree coverage with only a few glades, as well as patterns of forest management with rather homogeneous patches of spruce, oak and beech trees as dominant species. The occurrence of similar tree types within patches makes these sites well suited for analysis with deca-metric-resolution imagery like Sentinel-2. The characteristics of the test sites are described in detail by Immitzer et al. [15]. According to the Bavarian State Forest Enterprise [22], the mean annual temperature for the Ebersberger Forest is around 7.6 °C and the average annual precipitation is 850–950 mm, with around 500 mm falling between April and September. It is one of Germany's largest connected forests with an extent of 76.3 km². The Freisinger Forest covers nearly 9 km² and was used for a transferability study. The Ebersberger Forest is one of 10 state forest areas that are controlled by the forest enterprise Wasserburg. In 2015, the tillering ratio consisted of two-thirds coniferous trees and one-third broad-leaved tree types. Overall, spruce dominates with 56%, followed by beech with 11%. Pines reached 9%, whereas fir, larch and Douglas fir form a total of 1–2%. Oak trees account for 4% of all tree species in these forest areas.

2.2. Data

In order to evaluate the potential of the Sentinel-2 data in an optimal scenario, several scenes were selected according to the criterion of minimum cloud cover for several stages of the vegetation period. Furthermore, data were selected for a year that was close to the date of collection of the inventory data. Specifications of the datasets are given in Table 1. Data were downloaded from the Copernicus Open Access Hub. Three different dates over the year were chosen for multitemporal analysis. The product level of all files is L1C, i.e., the data are calibrated to top of atmosphere (ToA) reflectance.

Sentinel-2 has 13 bands with three different spatial resolutions and different bandwidths (Figure 3). There are four bands with a 10 m resolution, six bands with a 20 m resolution and three bands with a 60 m resolution. The sensor has a swath width of 290 km and allows for an ideal revisit time of five days at the equator.

For training and validation, we used inventory data and orthophotos provided by the Bayerische Staatsforsten (acquired during a regular forest inventory by the Bavarian State Forest Enterprise). The inventory data contain current data from 2014 to 2016 with information about the percentage of each tree species. The circular plot size is 500 m² (radius: 12.62 m) in the Freisinger forest and 400 m² (radius: 11.28 m) in the Ebersberger forest. In total, eight different tree groups are included in the inventory data (Table 2). Spruce is the most abundant coniferous tree type, with only minor occurrences of pine, larch or fir. Oak and beech dominate the broad-leaved trees and all other minor tree species are summarized within a mixed class. We filtered the data to retain only pure inventory circles, i.e., circles that contain a single species, in order to derive the training and testing data for the classification workflow (Table 2).
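The plot geometry and the purity filter described above can be illustrated with a short Python sketch. It is not the authors' implementation: the record layout (a dictionary mapping species names to percentages) and the function names are assumptions made for illustration, and the example plots are invented.

```python
import math

def plot_radius(area_m2):
    """Radius in metres of a circular inventory plot with the given area in m^2."""
    return math.sqrt(area_m2 / math.pi)

def is_pure(species_share):
    """True if one species accounts for 100% of the plot.

    `species_share` is a hypothetical record layout: species name -> percentage.
    """
    return any(share == 100 for share in species_share.values())

# Invented example plots, for illustration only.
plots = [
    {"spruce": 100},
    {"beech": 60, "oak": 40},
    {"oak": 100},
]
pure_plots = [p for p in plots if is_pure(p)]

print(round(plot_radius(500.0), 2))  # 500 m^2 plot (Freisinger forest)
print(round(plot_radius(400.0), 2))  # 400 m^2 plot (Ebersberger forest)
```

The two radii reproduce the values given in the text (12.62 m and 11.28 m), which confirms that the stated areas and radii are consistent.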

In addition to the inventory data, orthophotos from the Bavarian Administration of Surveying with a 20 cm spatial resolution were used for visual validation of the training and testing samples. Two types of orthophotos were used: RGB and color infrared (CIR). This visual check ensures the quality of the input data, which is crucial for the performance of machine-learning algorithms.

2.3. Data Pre-Processing

As Sentinel-2 Level 1C data are provided as top of atmosphere reflectance, we used the Sen2Cor processor to derive surface reflectance (bottom of atmosphere, BoA) data. Sen2Cor is a processor for Sentinel-2 Level 2A product generation and formatting; it performs atmospheric, terrain and cirrus correction of the top of atmosphere Level 1C input data [23]. Terrain correction is optional and was not used in this study because the terrain is flat.

According to Hadjimitsis et al. [24], this is particularly important for vegetation analysis, as atmospheric interaction is stronger when the target surfaces consist of non-bright objects, as is the case for vegetation. Furthermore, in order to evaluate spectral properties of tree species, correction algorithms that lead to a material-dependent signature are highly important. Using images without atmospheric correction can increase the uncertainty by up to 10% [24].

Figure 2 shows the corrected Sentinel-2 scene of the Ebersberger and Freisinger Forest for 29 September 2016 as an example of the good data quality. The image is clear and we observe no effect of clouds. The other images have a similar quality after correction.

2.4. Classification and Accuracy Assessment

The main goals of our study were (1) to evaluate Sentinel-2 data with its particular spectral bands and multitemporal characteristics, (2) to use and compare different machine-learning algorithms and (3) to design a semi-automated workflow that can be scaled and is optimized for processing big datasets. For this purpose, we used a hierarchical classification approach (Figure 1) and tested different band and scene combinations for PB and OBIA classifications. In a first step, we tested 14 combinations to differentiate between coniferous and broad-leaved tree types. Based on the best results acquired, we then tested 54 different approaches to distinguish between oak, beech and other broad-leaved trees within the broad-leaved subclass. Coniferous trees were not classified further, as the proportions of tree species other than spruce are minor and not enough training samples were available.
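The two-step hierarchy can be sketched in a few lines of Python. This is a schematic illustration only: the two toy decision functions stand in for the trained classifiers, and their thresholds and feature layout are arbitrary assumptions with no physical meaning; they are not values from the study.

```python
def classify_hierarchical(samples, is_coniferous, broadleaf_species):
    """Step 1 separates coniferous from broad-leaved samples;
    step 2 is applied only to samples that step 1 labels broad-leaved."""
    labels = []
    for features in samples:
        if is_coniferous(features):
            labels.append("coniferous")
        else:
            labels.append(broadleaf_species(features))
    return labels

# Hypothetical stand-ins for the trained classifiers (arbitrary thresholds).
def toy_step1(features):
    return features[0] < 0.30

def toy_step2(features):
    return "oak" if features[1] > 0.05 else "beech"

samples = [[0.20, 0.04], [0.40, 0.06], [0.40, 0.03]]
labels = classify_hierarchical(samples, toy_step1, toy_step2)
print(labels)
```

Passing the classifiers as functions keeps the hierarchy independent of the algorithm, so an SVM or RF model could be plugged into either step.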

The ML algorithms used are an SVM classifier with a radial basis kernel function and an RF algorithm, both implemented in ArcGIS Pro as so-called raster functions. In contrast to standard geoprocessing tools, these functions work dynamically (on the fly) and are well suited for distributed processing on an Image Server. This is a requirement if we envisage industrial use of the tools on large datasets.

According to Pal and Mather [25], the RF classifier (see also Breiman [26] and Pal [27]) consists of a combination of tree classifiers, where each classifier is generated using a random vector sampled independently from the input vector. It belongs to the ensemble methods of supervised learning, reduces overfitting effects and has become quite popular within the remote sensing community (cf. Belgiu and Drăgut [28]). The maximum number of trees we eventually used was 50 and the maximum tree depth was set to 30. As these parameters also influence classification results, we used the same values, which we found to work well, for all band combinations and tests performed in this study.

SVMs, described by Vapnik [29], are based on the principle of the Support Vector Classifier, a linear classifier. For non-linear real-world phenomena, SVMs were developed that use kernel functions, such as the radial basis function used in this study, to solve non-linear problems. These kernels enlarge the feature space, e.g., with polynomial kernels of a given degree or a radial kernel [30]. The radial kernel takes very small values when the Euclidean distance between a test observation x* and a training observation xi is large. Training observations that are far from x* therefore play essentially no role in predicting the class label of x*. Kernel functions thus make it possible to separate classes that are not linearly separable in the original feature space using a linear hyperplane in the enlarged feature space [31]. Using kernels instead of explicitly enlarging the feature space with functions of the original features has a computational advantage [30]: for distinct pairs i, i′, it is only necessary to compute K for the training observations xi and xi′ without working in the enlarged feature space, which is especially important when the enlarged feature space is so large that computations become intractable. We tested both algorithms in an OBIA and a PB approach for comparison.
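The distance behaviour of the radial kernel can be made concrete with a small Python sketch of the standard form K(x, z) = exp(-γ · ||x − z||²). The value of γ and the feature vectors below are assumptions for illustration; the text does not report the kernel parameter used in ArcGIS Pro.

```python
import math

def rbf_kernel(x, z, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Invented two-band reflectance vectors; gamma is an assumed value.
test_obs = [0.30, 0.40]
near_training_obs = [0.32, 0.41]
far_training_obs = [0.90, 0.95]
gamma = 10.0

k_near = rbf_kernel(test_obs, near_training_obs, gamma)
k_far = rbf_kernel(test_obs, far_training_obs, gamma)
print(k_near, k_far)
```

The kernel value for the distant training observation is close to zero, so that observation contributes almost nothing to the prediction for the test observation, while the nearby observation has a kernel value close to one.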

For the object-based approaches, we used the mean shift algorithm [32] for segmentation. It replaces each discrete point with a finitely bounded continuous kernel of density and then groups points according to a global density estimator [33]. The kernel can be manually adapted by the user. Spectral detail can be set in a range between 1.0 and 20.0, with a higher value being appropriate for features that should be classified separately but have somewhat similar spectral characteristics. Another parameter is the spatial detail setting. Small values produce smooth outcomes between clustered areas, whereas higher values are more appropriate when small objects are observed and should be matched together. As we are dealing with small objects, high values were chosen for both parameters. The values found to work well in this study were 15.5 for spectral detail and 15 for spatial detail.
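The idea behind mean shift can be illustrated with a minimal one-dimensional Python sketch using a flat kernel. This is a generic textbook variant, not the ArcGIS Pro segmentation: the `bandwidth` and `merge_distance` parameters are assumptions and do not correspond to the spectral and spatial detail settings, and the pixel values are invented.

```python
def mean_shift_1d(values, bandwidth, max_iterations=50, tolerance=1e-9):
    """Shift every point to the mean of all values within `bandwidth` until stable."""
    modes = list(values)
    for _ in range(max_iterations):
        largest_move = 0.0
        shifted = []
        for mode in modes:
            window = [v for v in values if abs(v - mode) <= bandwidth]
            new_mode = sum(window) / len(window)
            largest_move = max(largest_move, abs(new_mode - mode))
            shifted.append(new_mode)
        modes = shifted
        if largest_move < tolerance:
            break
    return modes

def group_modes(modes, merge_distance):
    """Give points whose modes lie within `merge_distance` the same segment label."""
    centres, labels = [], []
    for mode in modes:
        for index, centre in enumerate(centres):
            if abs(mode - centre) <= merge_distance:
                labels.append(index)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return labels

# Invented single-band pixel values forming two spectral groups.
pixels = [0.10, 0.12, 0.11, 0.50, 0.52, 0.51]
segment_labels = group_modes(mean_shift_1d(pixels, bandwidth=0.05), merge_distance=0.025)
print(segment_labels)
```

Each value drifts toward the local density maximum of its group, and values that end up at the same maximum form one segment. A real image segmentation applies the same principle jointly in the spectral and spatial domains.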

The first classification step consisted in separating broad-leaved and coniferous trees. Spectral profiles were analyzed with respect to absorption features and variability between the respective classes. Mean class spectra are shown in Figure 4. Significant differences in the reflectance values can be recognized in the red edge portion and the infrared region of the spectral signatures (most pronounced differences in bands 7–9). Figure 5 shows a multitemporal spectral signature (May, August and September) and it is obvious that differences are strongest in May, the flowering period of broad-leaved trees [34].

Based on the spectral characteristics, different band-combinations were tested on the seasonal input images. Furthermore, OBIA vs. PB performance was tested and all 14 runs are summarized in Table 3.

A total of 25 training regions and 10 independent test areas for accuracy assessment were defined based on the preprocessed inventory data. From these areas, samples were extracted by collecting the spectral signatures of each pixel contained in the circles. As only pixels completely within the circles were considered, the number of pixels per training/test area varies slightly, with four samples per area on average. The confusion matrix for the first classification step is given in the results section (Table 4).
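Why the sample count per plot varies around four can be seen in a short Python sketch. It assumes a 10 m pixel grid aligned to the coordinate origin and uses the pixel-center criterion mentioned in the accuracy-assessment description of this section; the function name and the plot coordinates are hypothetical.

```python
import math

def pixel_centres_in_circle(cx, cy, radius, pixel_size=10.0):
    """Return (col, row) indices of grid pixels whose centre lies inside the circle."""
    col_min = math.floor((cx - radius) / pixel_size)
    col_max = math.floor((cx + radius) / pixel_size)
    row_min = math.floor((cy - radius) / pixel_size)
    row_max = math.floor((cy + radius) / pixel_size)
    hits = []
    for col in range(col_min, col_max + 1):
        for row in range(row_min, row_max + 1):
            x = (col + 0.5) * pixel_size  # pixel centre, x
            y = (row + 0.5) * pixel_size  # pixel centre, y
            if math.hypot(x - cx, y - cy) <= radius:
                hits.append((col, row))
    return hits

# 400 m^2 plot (radius 11.28 m) centred on a pixel corner and on a pixel centre.
on_corner = pixel_centres_in_circle(0.0, 0.0, 11.28)
on_centre = pixel_centres_in_circle(5.0, 5.0, 11.28)
print(len(on_corner), len(on_centre))
```

Depending on where the plot center falls relative to the pixel grid, a 400 m² plot captures four or five pixel centers in this sketch, in line with the average of about four samples per area reported above.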

Based on the first classification step, a polygon mask for the broad-leaved trees was used to extract the area for the second step of the classification. Analogous to the initial classification step, 54 different classification settings were tested and are summarized in the results section (Table 5). In contrast to the previous step, spectral similarities are much higher between broad-leaved tree species (compare Figure 6). Thus, a multitemporal approach as well as dimension reduction using Principal Component Analysis (PCA) was used to improve results. The main goal of using principal components is reducing noise and extracting information from data. The first principal component captures the largest possible variance within a dataset while further PCs have less contribution [35]. PCA was computed in ArcGIS Pro for each input dataset. Furthermore, a vegetation index (NDVI) was included in the classification.
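For reference, the NDVI is the normalized difference of near-infrared and red reflectance. The Python sketch below assumes the usual Sentinel-2 choice of band 8 (NIR) and band 4 (red); the text does not state which bands were used, and the reflectance values are invented.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty or masked pixels
    return (nir - red) / total

# Invented surface reflectance values for band 8 (NIR) and band 4 (red).
band8 = [0.40, 0.35, 0.05]
band4 = [0.10, 0.05, 0.04]
ndvi_values = [ndvi(nir, red) for nir, red in zip(band8, band4)]
print(ndvi_values)
```

Because the index is built from only two bands, it adds little independent information once band 8 and the visible bands are already part of the classification input.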

For the object-based approach, we first evaluated 30 different segmentation settings to generate an optimal segmented image as input for the ML algorithms. In total, we tested 54 different classification settings that are summarized in Table 5. This approach allowed us to first evaluate segmentation settings and then to optimize classification by comparing the two ML algorithms in an OBIA and a PB approach on different bands, with or without additional variables such as PCs or the NDVI. A total of 15 training areas and five reference areas for the creation of training and test samples were available for this second step. Samples were created as described in the previous step.

Different classification settings were chosen based on the spectral information but also the spatial resolution of the different bands (10 m vs. 20 m). For segmentation, for example, higher spatial resolution is important for better detail. Different multitemporal approaches were tested with a combination of the 20 m resolution red edge bands (bands 5, 6 and 7). Different segmentation parameters were applied to the 22 May scene (with the highest spectral differences) to test the effects of varying parameters. The best segmentation result (see Table 3 part 1) was used as the segmented image for the OBIA for both ML algorithms and compared to the PB classification using the same bands, PCs and indices. In total, 54 classifications were conducted and compared to optimize the separation of broad-leaved tree species (oak, beech and other deciduous trees).

For accuracy assessment of all classification approaches, samples derived from the inventory areas (compare Table 2 and Figure 1) were randomly split into training and test samples. Each pixel whose center lies within an inventory circle was considered a sample. As metrics we calculated the confusion matrices with User's and Producer's accuracies, Kappa values and overall accuracies. Based on these metrics, all approaches were evaluated. We did not test statistically whether the performance of the different ML algorithms differs significantly, as the performance of these algorithms depends strongly on sampling and on the tuning of hyperparameters. In addition to this validation, we tested the overall workflow on a different area to better evaluate its performance and transferability: in this final step, a semi-automated workflow template was created and applied to the second study area, the Freisinger Forest, and the same metrics for accuracy assessment were calculated.
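The accuracy metrics named above follow directly from the confusion matrix, as the following Python sketch shows. It assumes that rows hold the classified labels and columns the reference labels, and that every class occurs at least once in both; the counts are invented for illustration and are not taken from the tables of this study.

```python
def accuracy_metrics(matrix, classes):
    """Overall accuracy, Kappa, User's and Producer's accuracy.

    Assumed layout: rows = classified labels, columns = reference labels.
    """
    size = len(classes)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(size)]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    correct = [matrix[i][i] for i in range(size)]

    overall = sum(correct) / total
    # Agreement expected by chance, derived from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (overall - expected) / (1 - expected)

    users = {c: correct[i] / row_totals[i] for i, c in enumerate(classes)}
    producers = {c: correct[i] / col_totals[i] for i, c in enumerate(classes)}
    return {"overall": overall, "kappa": kappa, "users": users, "producers": producers}

# Hypothetical counts, for illustration only.
toy_matrix = [[8, 2],
              [1, 9]]
metrics = accuracy_metrics(toy_matrix, ["broad-leaved", "coniferous"])
print(metrics)
```

User's accuracy describes how reliable a mapped class is (errors of commission), whereas Producer's accuracy describes how completely the reference samples of a class were found (errors of omission). Kappa corrects the overall accuracy for the agreement expected by chance.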

3. Results

Results for the first classification step, the separation of broad-leaved and coniferous trees, are summarized in Table 3.

A total of 14 different classification methods were tested and, except for one accuracy of only 74%, all other accuracies were above 80%, eight of them better than 90%. Comparing the results from OBIA and PB approaches, we do not find performance differences. However, a comparison of performance between the two machine-learning algorithms shows slightly better results for SVM than for the RF classifier. The confusion matrix for this classification step is presented in Table 4.

The best overall accuracy of 97% was reached in an object-based approach using the SVM classifier on only band 8 of each temporal image (22 May 2016, 9 August 2016, 29 September 2016). Similar results were achieved using bands 8, 4 and 3 of the spring scenes, i.e., the 10 m spatial resolution bands. Three spectral bands suffice to obtain high classification accuracies, and using more bands does not improve the results. This is due to significant differences between broad-leaved and coniferous trees in band 8 during the blooming and autumn periods, and to the higher spatial resolution of band 8 (and bands 3 and 4) in comparison to other spectral bands. A combination of the infrared band with the visible bands (2, 3 and 4), as well as a combination with the red edge bands (in this case 6 and 7), did not result in an accuracy as high as using only the infrared band 8. Combining two red edge bands (6 and 7) with the infrared band nevertheless resulted in a high accuracy of over 95%. These results show that a very simple and therefore reproducible approach is ideal for achieving very good classification results for coniferous vs. broad-leaved trees. There is no need to include many bands and therefore many variables, which might even reduce accuracies. Figure 7 shows the final classification result for the separation of coniferous and broad-leaved trees.

Results from the second classification step, separating single tree species (oak, beech, other broad-leaved trees) within the broad-leaved class, are summarized in Table 5 and shown in Figure 8. For the OBIA approach, the evaluation of different segmentation parameters and bands showed that the best results were achieved using the 22 May scene with a band combination of the infrared band (8), the green band (3) and the blue band (2), a maximum number of pixels per segment of five and high values for spectral and spatial detail (15.5 and 15, respectively). The attributes for segmentation were mean digital value and active chromaticity color. Further parameters such as compactness or rectangularity did not result in better segmentation, and therefore this segmentation was used for all OBIA classification approaches.

In contrast to the classification results for coniferous vs. broad-leaved trees only, the use of principal components 1–12 of the August scene achieved the highest overall accuracy of 91%. The use of principal components increased the accuracy by 2% compared to the untransformed August scene. The use of only the first five principal components for the same scene resulted in a similar accuracy of 90.5%, highlighting that the original Sentinel-2 image contains a lot of “noise” and/or redundant information. This is also indicated by the eigenvalues and factor loadings of the PCs. For all three times of the year, the eigenvalues of the first PC range between 87 and more than 90%. Most information of the Sentinel-2 image can thus be represented in only a few principal components. For all PCAs, band 8 contributes most to the first PC, highlighting its importance for classification. The statistics for the PCAs can be found as additional online material.
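How the share of variance captured by the first PC relates to band redundancy can be shown with a deliberately simplified Python sketch for two bands, where the eigenvalues of the 2 × 2 covariance matrix have a closed form. The study computed PCAs over all bands in ArcGIS Pro; the two-band restriction and the sample values here are assumptions for illustration.

```python
import math

def first_pc_share(band_a, band_b):
    """Share of total variance explained by the first principal component of two bands."""
    n = len(band_a)
    mean_a = sum(band_a) / n
    mean_b = sum(band_b) / n
    var_a = sum((a - mean_a) ** 2 for a in band_a) / (n - 1)
    var_b = sum((b - mean_b) ** 2 for b in band_b) / (n - 1)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(band_a, band_b)) / (n - 1)

    # Largest eigenvalue of the symmetric matrix [[var_a, cov], [cov, var_b]].
    trace = var_a + var_b
    largest = trace / 2 + math.sqrt(((var_a - var_b) / 2) ** 2 + cov ** 2)
    return largest / trace

# Invented samples: strongly correlated bands vs. uncorrelated bands.
redundant = first_pc_share([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
independent = first_pc_share([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])
print(redundant, independent)
```

When bands are highly correlated, almost all variance falls on the first component, which is the two-band analogue of a first PC that accounts for roughly 90% of the variance in the Sentinel-2 scenes.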

In general, we found that OBIA methods outperformed PB methods, especially for the combination of the infrared band with the green and blue bands. Furthermore, even when indices such as the NDVI or PCs were included, the PB approach had slightly lower accuracies than the OBIA approach. This is probably due to the patchy nature of the forest structure and the overall setting, with endmembers being rather similar. PB methods perform best if there are enough sample points for all classes that do not show too much collinearity.

From the ML perspective, slightly better results were acquired using the SVM. Of all settings tested, 18 classifications resulted in accuracies higher than 80%, all of them using an SVM classifier and the OBIA approach. The final classification result for this step is displayed in Figure 8.

Table 6 shows the confusion matrix for the best result based on 19 independent test samples. Beech trees reach a 94% UA and a 79% PA, whereas oak trees reach a 100% accuracy for UA and PA. Other broad-leaved trees are represented with an 81% UA and a 94% PA. The Kappa value is 0.87 and indicates a very good classification result (see [37] for a detailed discussion of the suitability of the Kappa value).

The final classification result was achieved by merging the tree species classes with the coniferous forest class. The result is shown in Figure 9 and includes all four tree species groups (Beech, Oak, Other broad-leaved trees and Coniferous trees). We performed another validation on the whole dataset (Table 7) with independent test samples from the inventory dataset (samples which were not used before).

The overall accuracy is 88% with a Kappa value of 0.83. For beech trees, the UA is 94% and the PA only 71%, with two samples being assigned to the coniferous forest and four samples to the general broad-leaved category. Oak trees have a 100% accuracy for both UA and PA, but the mixed class “Other Broad-Leaved” has a UA of 81% and a PA of 81%, with some confusion with the beech and coniferous classes. Coniferous Forest has a 100% PA and an 80% UA, with misclassification from beech trees and other broad-leaved trees. The very high PA and UA values, however, have to be interpreted carefully, as they depend on the sampling method. Because we used the available forest inventory data and information from some fieldwork, the sampling, and therefore the validation, depends on the distribution of these samples and is not completely random, as would be ideal for classification validation.

4. Transferability Study to the Freisinger Forest

Transferability is a crucial topic in image classification if the automation and applicability of a workflow are to be guaranteed. As one of our goals was to provide tools capable of scaling and distributed processing, we performed a transferability study of the final classification workflow on the test area, the Freisinger Forest. The same parameters as for the Ebersberger Forest were used. Table 8 shows the accuracies for the final classification of the Freisinger Forest. The overall accuracy is 85% with a Kappa value of 0.79. Beech trees have a UA of 56% and a PA of 71%. Oak trees reach a 100% UA and a 79% PA. Other broad-leaved trees are classified with a UA of 75% and a PA of 86%. The coniferous tree group achieves a 100% accuracy for both the UA and the PA. Results are slightly worse than for the Ebersberger forest but still very good. More misclassification occurs within the broad-leaved classes, probably due to the spectral similarity of the respective classes.

Figure 10 shows the final classification for the Freisinger Forest. The good results from the transferability study suggest that the classification results for the Ebersberger Forest are not only based on specific characteristics of the Ebersberger Forest. The workflow can at least be transferred to closely related forest regions with similar characteristics such as species presence/absence and distribution.

The transferability study was based on a semi-automatic workflow created in ArcGIS Pro in ModelBuilder and as a chain of raster functions for distributed processing. It is a ready-to-use toolset, which can be applied to new datasets (Figure 11).

Additionally, as described in the validation section, we created a mobile application using Collector for ArcGIS to validate some misclassified samples in the field. The mobile application synchronizes with the classification results as well as a predefined layer to collect data in the field that allows attaching photos. The app was especially useful to collect data for misclassified samples in the field but can also be used to update inventory data. One example of misclassification is shown in Figure 12 where an inventory data point of the class coniferous was wrongly classified as “Other Broad-leaved Trees”. When rechecking the point, the error is directly visible. There are indeed several coniferous trees, but also the occurrence of maple trees and other broadleaf trees in the undergrowth.

5. Discussion

Results of this study indicate a high potential of Sentinel-2 data for forest classification using a hierarchical semi-automatic workflow. For the classification of coniferous and broad-leaved forest types, a very simple combination of only three bands sufficed to obtain very good accuracies. Crucial bands were the red edge bands 6 and 7 and the infrared band 8. The best accuracy of 97% was obtained either by combining only the infrared band 8 of all three months without any further bands, or by using bands 8, 4 and 3 of the spring scenes. This result agrees with the findings of Puletti et al. [20], who reported an overall accuracy of 86.2% using an RF classifier for coniferous vs. broad-leaved forest classification in the Mediterranean summer, with bands 2, 8 and 7 being of most importance. Band 8 also shows the highest factor loadings for the first and most important PC calculated from the images.

The coarser spatial resolution of the red edge bands (20 m) in comparison to the 10 m resolution of bands 8 and 2–4 seems to be a great disadvantage for classification. This highlights that, even for rather patchy forests, spatial resolution is critical for the overall results in addition to spectral differences.

The classification of broad-leaved tree species gave results of 91% overall accuracy for the three classes beech, oak and other broad-leaved trees. This high accuracy was achieved with an object-based approach and an SVM classifier. This agrees with findings from Sothe et al. [21] and others (e.g., Ma et al. [38], Maldonado and Weber [39]), who also found that SVM learning schemes show good performance in OBIA. The segmentation was based on a mean shift algorithm using band 8 together with the green and blue bands of the May scene, which falls in the flowering period of the respective trees. The SVM classifier included the August scene with 12 PCs in addition to the segmented image. The usage of principal components increased the accuracy by 1.8%, whereas including the NDVI decreased the outcome accuracy to 80.8%. Our results confirm the relevance of the NIR region that was proposed earlier by Fassnacht et al. [14]. The authors also explain the high relevance of the visible bands by the absorption of the photosynthetic pigments chlorophyll a and b. However, their analysis showed that it is not enough to use only the RGB bands. Instead, a combination of infrared and visible bands leads to high accuracies for broad-leaved tree classifications with Sentinel-2 imagery. These results could be confirmed and extended by the inclusion of PCs, which were capable of increasing accuracies by several percent.

The final classification result of this study, with overall accuracies of 88% and 85% (Ebersberger and Freisinger Forest, respectively), does not reach the 96% overall accuracy reported for high-resolution imagery like WorldView-2 [40] but is based on cost-free data. However, whereas Wulder et al. [41] encountered difficulties in mapping tree species based on deca-metric-resolution imagery, our results indicate that Sentinel-2 is at least capable of providing good results for forests with a less complex structure and less intermingling of species. Wulder et al. also refer to Landsat data, which has a 30 m spatial resolution compared to the 10 m and 20 m resolution of the Sentinel-2 sensors. Thus, depending on the area, forest complexity and required accuracy, the data source has to be chosen accordingly.
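For reference, the NDVI mentioned above is the normalized difference of the near-infrared and red reflectance, which for Sentinel-2 corresponds to bands 8 and 4. The following is a minimal sketch in plain Python; the function name and the reflectance values are illustrative assumptions, not data or code from the study.

```python
from typing import List, Sequence


def ndvi(nir: Sequence[float], red: Sequence[float]) -> List[float]:
    """Return NDVI = (NIR - Red) / (NIR + Red) for each pixel.

    Pixels where NIR + Red equals zero get 0.0 to avoid a division by zero.
    """
    if len(nir) != len(red):
        raise ValueError("nir and red must have the same number of pixels")
    result = []
    for n, r in zip(nir, red):
        total = n + r
        result.append((n - r) / total if total != 0 else 0.0)
    return result


# Made-up surface reflectance values for three pixels (band 8 and band 4).
band8 = [0.45, 0.30, 0.05]
band4 = [0.05, 0.10, 0.05]
values = ndvi(band8, band4)
```

Because the index is built from only two bands, it condenses the spectrum into a single value per pixel, whereas the principal components are derived from all bands of a scene.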

With respect to the different ML algorithms, we found that the SVM classifier performed slightly better than the RF classifier. The good performance of SVM classifiers has also been reported in spectral classification studies from other fields [42,43,44]. However, there are several reasons for performance differences between ML algorithms. First of all, there is a dependence on the training data (e.g., quality, number of training samples per class, characteristics of the ground truth). If RF, for example, is used with unbalanced training data, it tends to focus on the prediction accuracy of the prevailing classes, which might lead to lower accuracies in less represented classes [21]. Furthermore, the tuning of hyperparameters has an impact on classification results and depends on the data used. The slightly better performance of SVM in our study might be due to the specific datasets, as was also found in other studies [45,46,47]. Thus, given the small performance differences, both algorithms are well-suited for land cover classification in general and forest classification in particular, as in this study.

Compared to commercial and often expensive very high-resolution data, results using Sentinel-2 are very good, even though the spatial resolution is much lower. The high spectral resolution in the red edge area of Sentinel-2, however, is an advantage compared to other platforms such as Landsat and was tested extensively in this study. Our results showed that the red edge region exhibits clearly visible differences in the spectral profiles, but the higher-resolution 10 m bands, especially band 8, still reached better accuracies. The lower spatial resolution (20 m) of the red edge region (bands 5 to 7) somewhat diminishes its capability to correctly classify vegetation on a species or even tree-patch level. However, for achieving good accuracies on a single-tree level, high spatial resolution as well as hyperspectral data are more adequate. For example, Dalponte et al. [6] found that the fusion of very high geometrical resolution multi-/hyperspectral images and light detection and ranging (LiDAR) data can result in accuracies of up to 93% for macro classes using ML algorithms. Even tree species classification at the individual tree crown level was shown to reach accuracies of 0.89 (kappa accuracy) for boreal forests using airborne hyperspectral data and LiDAR [48], making this kind of data more suitable for research on a single-tree level, while Sentinel-2 data is more appropriate on a tree-stand level in low-cost studies.

Besides optical sensors, there are also studies on active systems only. Liang et al. [49], for example, investigated the possibility of classifying broad-leaved and coniferous tree types based on the first and last pulses of LiDAR data. They reached an overall accuracy of 89%, concluding that parameters like the branch structure influence the accuracy of the classification using LiDAR technology. Compared to our study with overall accuracies of more than 90%, we conclude that, depending on the study area and the scale, deca-metric-resolution optical sensors perform well for environments with low complexity compared to very high-resolution point-cloud data.

However, for future studies, the use of additional data such as point cloud or radar data, to benefit from the advantages of both active and passive systems, might help to further improve results, as was also suggested by Stratoulias et al. [50]. Data fusion can help to overcome problems that arise with more complex forest structures where trees intermingle. Data that captures different elevation levels and internal structures and morphologies can provide valuable additional information.
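One common way to counter the effect of unbalanced training data is to weight each class inversely to its frequency. The sketch below uses the widely used "balanced" heuristic, weight = n_samples / (n_classes * class_count). The study does not report using class weights, so this is an assumption for illustration; the counts are the Ebersberger Forest inventory areas from Table 2 and stand in for per-class training sample counts.

```python
from typing import Dict


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Return weights inversely proportional to class frequency.

    Uses weight = n_samples / (n_classes * class_count), so that rare
    classes receive larger weights than prevailing ones.
    """
    if not counts or any(n <= 0 for n in counts.values()):
        raise ValueError("every class needs a positive sample count")
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


# Inventory areas per broad-leaved class (Table 2, Ebersberger Forest),
# used here only as stand-ins for training sample counts.
counts = {"beech": 75, "oak": 21, "other broad-leaved": 63}
weights = balanced_class_weights(counts)
```

With these counts, oak receives the largest weight and beech the smallest, which reflects how under-represented oak is relative to the other two classes.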

6. Conclusions

Results of the present study indicate the high potential of Sentinel-2 data for applications in applied forestry and vegetation analysis despite the deca-metric spatial resolution. Our proposed workflow achieved overall classification accuracies of 88% and 85% for the study area and the transferability study, respectively, indicating its robustness and potential for scaling to a larger level. The design and backend technology allow the use on large datasets due to the concept of raster functions that are capable of on-the-fly and distributed processing.

A comparison of SVM and RF classifiers using PB and OBIA hierarchical classification approaches showed that OBIA approaches are better suited than PB approaches in this setting. The slightly better performance of the SVM classifier is probably due to the training data used, as discussed in the previous section. Along these lines, machine learning algorithms are strongly dependent on the quality of the training data (here: inventory data) as well as on the selected areas. Thus, future studies are needed to evaluate the workflow in other forest areas to assess the effect of different forest structures and other tree species. Testing different band combinations to improve the classification results highlighted the importance of band 8 in combination with the red edge bands as well as the other 10 m resolution bands of the spring and summer scenes. The red edge region shows clearly visible differences in the spectral profiles, but the higher-resolution 10 m bands were crucially important for good results. The lower spatial resolution (20 m) of the red edge region (bands 5 to 7) somewhat diminishes its capability to correctly classify vegetation on a species or even tree-patch level, while using PCs of the summer scene together with the May scene shows the best results for broad-leaved tree types.

We conclude that the proposed design is well-suited to be used on larger areas with a similar forest structure and allows a streamlined workflow for applied forestry by providing analysis results directly to mobile applications for validation and data collection in the field. We showed that Sentinel-2 data is a suitable, cost-free alternative to commercial satellite data with higher spatial resolution for classifying trees at a stand level using machine-learning algorithms.

Author Contributions

Conceptualization, M.W., M.B. and D.T.; Methodology, M.W., M.B.; Supervision, M.B. and D.T.; Writing-original draft, M.W.; Writing-review & editing, M.B. and D.T.

Funding

This research received no external funding.

Acknowledgments

We would like to thank the Bayerische Staatsforsten for providing the forest inventory data used in this study and the Bayerische Vermessungsverwaltung for their support with orthophotos. Furthermore, we would like to thank C. Straub and R. Seitz for discussions and for helping with the provision of all datasets. The Open Access Publication Fund of the University of Salzburg supported this publication.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Dotzler, S.; Hill, J.; Buddenbaum, H.; Stoffels, J. The potential of EnMap and Sentinel-2 data for detecting drought stress phenomena in deciduous forest communities. Remote Sens. 2015, 7, 14227–14258.
2. Blaschke, T. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 2010, 65, 2–16.
3. Turner, W.; Spector, S.; Gardiner, N.; Fladeland, M.; Sterling, E.; Steininger, M. Remote sensing for biodiversity science and conservation. Trends Ecol. Evol. 2003, 18, 306–314.
4. Kovacs, J. Forest Cover Type Analysis of New England Forests Using Innovative WorldView-2 Imagery. Master’s Thesis, University of New Hampshire, Durham, NH, USA, 2014.
5. Zhou, Y.; Qiu, F. Fusion of high spatial resolution WorldView-2 imagery and LiDAR pseudo-waveform for object-based image analysis. ISPRS J. Photogramm. Remote Sens. 2015, 101, 221–232.
6. Dalponte, M.; Bruzzone, L.; Gianelle, D. Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral/hyperspectral images and LiDAR data. Remote Sens. Environ. 2012, 123, 258–270.
7. Johansen, K.; Phinn, S. Mapping structural parameters and species composition of riparian vegetation using IKONOS and Landsat ETM+ data in Australian tropical savannahs. Photogramm. Eng. Remote Sens. 2006, 72, 71–80.
8. Peerbhay, K.Y.; Mutanga, O.; Ismail, R. Investigating the capability of few strategically placed Worldview-2 multispectral bands to discriminate forest species in KwaZulu-Natal, South Africa. IEEE J. STARS 2014, 7, 307–316.
9. Dennison, P.E.; Brunelle, A.R.; Carter, V.A. Assessing canopy mortality during a mountain pine beetle outbreak using GeoEye-1 high spatial resolution satellite data. Remote Sens. Environ. 2010, 114, 2431–2435.
10. ESA. Sentinel satellites—Overview. Observing the Earth. Available online: http://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Overview4 (accessed on 27 December 2017).
11. European Space Agency. Sentinel-2 User Handbook. Available online: https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2_User_Handbook (accessed on 27 April 2018).
12. European Environment Agency. Copernicus. Available online: http://www.eea.europa.eu/about-us/what/seis-initiatives/copernicus (accessed on 28 December 2017).
13. Sentinel-2 MSI Introduction. User Guides, Sentinel Online. Available online: https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi (accessed on 28 December 2017).
14. Fassnacht, F.E.; Latifi, H.; Stereńczak, K.; Modzelewska, A.; Lefsky, M.; Waser, L.T.; Straub, C.; Ghosh, A. Review of studies on tree species classification from remotely sensed data. Remote Sens. Environ. 2016, 186, 64–87.
15. Immitzer, M.; Vuolo, F.; Atzberger, C. First experience with sentinel-2 data for crop and tree species classifications in central Europe. Remote Sens. 2016, 8, 166.
16. Ng, W.-T.; Rima, P.; Einzmann, K.; Immitzer, M.; Atzberger, C.; Eckert, S. Assessing the potential of sentinel-2 and pléiades data for the detection of prosopis and vachellia spp. in Kenya. Remote Sens. 2017, 9, 74.
17. Hawryło, P.; Bednarz, B.; Wężyk, P.; Szostak, M. Estimating defoliation of scots pine stands using machine learning methods and vegetation indices of Sentinel-2. Eur. J. Remote Sens. 2018, 51, 194–204.
18. Immitzer, M.; Böck, S.; Einzmann, K.; Vuolo, F.; Pinnel, N.; Wallner, A.; Atzberger, C. Fractional cover mapping of spruce and pine at 1 ha resolution combining very high and medium spatial resolution satellite imagery. Remote Sens. Environ. 2018, 204, 690–703.
19. Addabbo, P.; Focareta, M.; Marcuccio, S.; Votto, C.; Ullo, S.L. Contribution of Sentinel-2 data for applications in vegetation monitoring. Acta IMEKO 2016, 5, 44–54.
20. Puletti, N.; Chianucci, F.; Castaldi, C. Use of Sentinel-2 for forest classification in Mediterranean environments. Ann. Silvic. Res. 2018, 42.
21. Sothe, C.; Almeida, C.; Liesenberg, V.; Schimalski, M. Evaluating sentinel-2 and Landsat-8 data to map sucessional forest stages in a subtropical forest in southern brazil. Remote Sens. 2017, 9, 838.
22. BaySF. Regionales Naturschutzkonzept für den Forstbetrieb Wasserburg am Inn; Bayerische Staatsforsten Forstbetrieb Wasserburg: Wasserburg, Germany, 2013; p. 65. Available online: http://www.baysf.de/fileadmin/user_upload/01-ueber_uns/05-standorte/FB_Wasserburg_a._Inn/Naturschutzkonzept_Wasserburg.pdf (accessed on 8 February 2018).
23. Sen2Cor. STEP, Science Toolbox Exploitation Platform. Available online: http://step.esa.int/main/third-party-plugins-2/sen2cor/ (accessed on 5 June 2017).
24. Hadjimitsis, D.G.; Papadavid, G.; Agapiou, A.; Themistocleous, K.; Hadjimitsis, M.; Retalis, A.; Michaelides, S.; Chrysoulakis, N.; Toulios, L.; Clayton, C. Atmospheric correction for satellite remotely sensed data intended for agricultural applications: Impact on vegetation indices. Nat. Hazards Earth Syst. Sci. 2010, 10, 89–95.
25. Pal, M.; Mather, P. Support vector machines for classification in remote sensing. Int. J. Remote Sens. 2005, 26, 1007–1011.
26. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
27. Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2005, 26, 217–222.
28. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
29. Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998; Volume 3.
30. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: Berlin, Germany, 2013; Volume 112.
31. Yu, L.; Porwal, A.; Holden, E.-J.; Dentith, M.C. Towards automatic lithological classification from remote sensing data using support vector machines. Comput. Geosci. 2012, 45, 229–239.
32. ArcGIS for Desktop. Segment Mean Shift. Available online: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/segment-mean-shift.htm (accessed on 5 May 2017).
33. Comaniciu, D.; Meer, P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
34. Bachofer, M.; Mayer, J. Der Kosmos Baumführer; Frankh Kosmos Verlag: Stuttgart, Germany, 2015.
35. ArcMap. How Principal Components Works. Available online: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-principal-components-works.htm (accessed on 20 April 2018).
36. Visa, S.; Ramsay, B.; Ralescu, A.L.; Van Der Knaap, E. Confusion matrix-based feature selection. MAICS 2011, 710, 120–127.
37. Pontius, R.G., Jr.; Millones, M. Death to Kappa: Birth of quantity disagreement and allocation disagreement for accuracy assessment. Int. J. Remote Sens. 2011, 32, 4407–4429.
38. Ma, L.; Li, M.; Ma, X.; Cheng, L.; Du, P.; Liu, Y. A review of supervised object-based land-cover image classification. ISPRS J. Photogramm. Remote Sens. 2017, 130, 277–293.
39. Maldonado, S.; Weber, R. A wrapper method for feature selection using support vector machines. Inf. Sci. 2009, 179, 2208–2217.
40. Immitzer, M.; Atzberger, C.; Koukal, T. Tree species classification with random forest using very high spatial resolution 8-band Worldview-2 satellite data. Remote Sens. 2012, 4, 2661–2693.
41. Wulder, M.A.; Hall, R.J.; Coops, N.C.; Franklin, S.E. High spatial resolution remotely sensed data for ecosystem characterization. AIBS Bull. 2004, 54, 511–521.
42. Brandmeier, M.; Erasmi, S.; Hansen, C.; Höweling, A.; Nitzsche, K.; Ohlendorf, T.; Mamani, M.; Wörner, G. Mapping patterns of mineral alteration in volcanic terrains using aster data and field spectrometry in southern Peru. J. S. Am. Earth Sci. 2013, 48, 296–314.
43. Bazi, Y.; Melgani, F. Toward an optimal svm classification system for hyperspectral remote sensing images. IEEE Trans. Geosci. Remote Sens. 2006, 44, 3374–3385.
44. Foody, G.M.; Mathur, A. A relative evaluation of multiclass image classification by support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1335–1343.
45. Adam, E.; Mutanga, O.; Odindi, J.; Abdel-Rahman, E.M. Land-use/cover classification in a heterogeneous coastal landscape using RapidEye imagery: Evaluating the performance of random forest and support vector machines classifiers. Int. J. Remote Sens. 2014, 35, 3440–3458.
46. Ghosh, A.; Fassnacht, F.E.; Joshi, P.K.; Koch, B. A framework for mapping tree species combining hyperspectral and LiDAR data: Role of selected classifiers and sensor across three spatial scales. Int. J. Appl. Earth Observ. Geoinf. 2014, 26, 49–63.
47. Féret, J.; Asner, G.P. Tree species discrimination in tropical forests using airborne imaging spectroscopy. IEEE Trans. Geosci. Remote Sens. 2013, 51, 73–84.
48. Dalponte, M.; Ørka, H.O.; Ene, L.T.; Gobakken, T.; Næsset, E. Tree crown delineation and tree species classification in boreal forests using hyperspectral and ALS data. Remote Sens. Environ. 2014, 140, 306–317.
49. Liang, X.; Hyyppä, J.; Matikainen, L. Deciduous-coniferous tree classification using difference between first and last pulse laser signatures. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2007, 36, 253–257.
50. Stratoulias, D.; Balzter, H.; Sykioti, O.; Zlinszky, A.; Tóth, V.R. Evaluating Sentinel-2 for lakeshore habitat mapping based on airborne hyperspectral data. Sensors 2015, 15, 22956–22969.

Figure 1. Flowchart showing the methodology of the study.

Figure 2. Map showing the two project regions (Ebersberger Forest and Freisinger Forest). The true color image is from the Sentinel-2 scene in September after atmospheric correction. The areas used to generate the training and testing points during the classification process are also shown. Note that not all areas used in the first step, delineating coniferous vs. broad-leaved trees, could be used on a species basis in the next step. This is why larger symbols for “Deciduous Forest” appear beneath the symbols for single tree species.

Figure 3. Spatial and Spectral resolution of Sentinel-2 (modified from Immitzer et al. [15]).

Figure 4. Mean spectral profile Ebersberger Forest May 22—all tree-types. (Inventory Data@Bayerische Staatsforsten).

Figure 5. Spectral profile for a multitemporal analysis of three Sentinel-2 images for four tree groups. (Inventory Data@Bayerische Staatsforsten).

Figure 6. Spectral Profile Beech Trees vs. Oak Trees Ebersberger and Freisinger Forest 22 May Level 2A. (Inventory Data@Bayerische Staatsforsten).

Figure 7. Classification Result Coniferous Forest vs. Broad-Leaved Forest (Ebersberger Forest). Furthermore, areas for training and testing samples are shown that were randomly split. The result for this first step of our hierarchical classification is shown in Figure 7 and the confusion matrix [36] that was calculated on 28 independent test samples derived from the inventory circles is shown in Table 4.

Figure 8. Classification result of single tree species within the broad-leaved class.

Figure 9. Final Classification with Four Classes, Broad-Leaved Trees and Coniferous Forest, Ebersberger Forest. This classification was obtained by merging the coniferous class with the best result of the classification of the deciduous tree species (multitemporal object-based image analysis (OBIA) Support Vector Machines (SVM), compare to Table 5).

Figure 10. Final Classification for the Freisinger Forest (compare Table 8).

Figure 11. Semi-automatic workflow (ArcGIS Pro) for the classification process of single tree species.

Figure 12. Ebersberger Forest Ground Imagery, Inventory Data Coniferous Tree Misclassified as Other Deciduous Tree. Photographed 15 June 2017.

Table 1. Specifications of Sentinel-2 Images for the Ebersberger and Freisinger Forest that were used in this project.

Date	Acquisition Time	Cloud Coverage %	Product Level
22 May 2016	18:24:38	28.7	1C
09 August 2016	05:07:27	4.5	1C
29 September 2016 EF *	18:51:41	0.0	1C
29 September 2016 FF *	18:19:08	0.0	1C

* EF/FF: Ebersberger Forest, Freisinger Forest.

Table 2. Available pure inventory areas for training and testing. Note that each circle contains more than one sample depending on the spatial resolution of the pixel.

Tree Type	Ebersberger Forest	Freisinger Forest
Spruce	777	70
Pine	2	1
Larch	6	2
Fir	1	2
Other Coniferous	8	2
Beech	75	4
Oak	21	2
Other Broad-Leaved	63	11

Table 3. Results from the first classification step separating coniferous and broad-leaved trees (Ebersberger Forest).

Accuracy	Method	Classifier	Segmented Image	Segmentation Settings
95.2	OBIA	SVM	May22/Bands 6 7 8	DS
86.8	OBIA	RF	May22/Bands 6 7 8	DS
92.3	PB	SVM	May22/Bands 6 7 8	DS
90.2	OBIA	SVM	May 22/Bands 3 4 8	DS
74	OBIA	RF	May 22/Bands 3 4 8	DS
97	PB	SVM	May 22/Bands 3 4 8	DS
87	OBIA	SVM	May 22/Bands 3 4 8	SegSize 5
85	OBIA	SVM	Multitemporal/May Band 3, Aug Band 8, Sept Band 7	DS
86.6	OBIA	RF	Multitemporal/May Band 3, Aug Band 8, Sept Band 7	DS
83	PB	SVM	Multitemporal/May Band 3, Aug Band 8, Sept Band 7	DS
97	OBIA	SVM	Multitemporal/May Band 8, Aug Band 8, Sept Band 8	DS
92.4	OBIA	SVM	Multitemporal/May Band 8, Aug Band 8, Sept Band 8	SegSize 5
81.6	OBIA	RF	Multitemporal/May Band 8, Aug Band 8, Sept Band 8	DS
89.8	PB	SVM	Multitemporal/May Band 8, Aug Band 8, Sept Band 8	DS

OBIA = Object Based Image Analysis; PB = Pixel based; SVM = Support Vector Machine; RF = Random Forest; DS = Default Segmentation Settings (Spectral detail = 15.5, Spatial detail = 15, segment size = 20), SegSize 5 = Minimum Segment Size in Pixels equals 5.

Table 4. Confusion matrix based on independent test samples for the first classification step (Ebersberger Forest). Results are shown for the classification setting highlighted in Table 3.

Class Name	Broad-Leaved-Forest	Coniferous Forest	Total	UserAccuracy	Kappa
Broad-Leaved Forest	35	0	35	1	0
Coniferous Forest	3	54	57	0.95	0
Total	38	54	92	0	0
ProducerAccuracy	0.92	1	0	0.97	0
Kappa	0	0	0	0	0.93
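The summary values in Table 4 can be reproduced from the four cell counts. The following is a minimal sketch in plain Python, assuming that rows hold the classified (map) classes and columns the reference classes, which is consistent with the user's and producer's accuracies printed in the table. The function name is hypothetical, and the code is not part of the original workflow.

```python
import math
from typing import List, Tuple


def accuracy_metrics(
    matrix: List[List[int]],
) -> Tuple[List[float], List[float], float, float]:
    """Return user's accuracies, producer's accuracies, overall accuracy, kappa.

    matrix[i][j] counts samples classified as class i whose reference
    class is j (rows = classified, columns = reference).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if n == 0 or total == 0:
        raise ValueError("confusion matrix must contain at least one sample")
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]

    users = [d / r if r else 0.0 for d, r in zip(diagonal, row_totals)]
    producers = [d / c if c else 0.0 for d, c in zip(diagonal, col_totals)]
    overall = sum(diagonal) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (overall - expected) / (1 - expected) if expected < 1 else math.nan
    return users, producers, overall, kappa


# Cell counts from Table 4 (broad-leaved, coniferous).
table4 = [
    [35, 0],
    [3, 54],
]
users, producers, overall, kappa = accuracy_metrics(table4)
```

Rounded to two decimals, this gives an overall accuracy of 0.97 and a kappa of 0.93, matching the values reported in the table.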

Table 5. Final classification results for all tested broad-leaved classifications with an additional segmented image (Ebersberger Forest).

Accuracy	Method	Classifier	Input Image	Segmented Additional Image	Segmentation Settings
60.9	OBIA	SVM	Sept 29/All Bands	May 22/Bands 8 3 2	SegSize 5
54.3	OBIA	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-2	May 22/Bands 8 3 2	SegSize 5
49.4	OBIA	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-2	May 22/Bands 8 3 2	SegSize5 & SA 1-6
71.7	OBIA	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-4	May 22/Bands 8 3 2	SegSize 5
67.3	OBIA	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-5	May 22/Bands 8 3 2	SegSize5 & SA 1-6
59	OBIA	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-6	May 22/Bands 8 3 2	SegSize5 & SA 1-2-5
61.6	OBIA	RF	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-7	May 22/Bands 8 3 2	SegSize 5
63.3	PB	SVM	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-8
76.2	PB	RF	Sept 29/Bands 2 3 4 5 6 7 8 9 & PCA 1-9
67.7	OBIA	SVM	Sept 29/PCA 1-12	May 22/Bands 8 3 2	SegSize 5
61.8	OBIA	SVM	May 22/All Bands	May 22/Bands 8 3 2	SegSize 5
80.7	OBIA	SVM	May 22/Sept 29/Bands 8 3 2 & PCA 1-4 & NDVI May 22	May 22/Bands 8 3 2	SegSize 5
76	OBIA	SVM	May 22/All Bands & NDVI & PCA 1-4	May 22/Bands 8 3 2	SegSize 5
65	OBIA	SVM	May 22/Bands 2 3 8 & NDVI May 22	May 22/Bands 8 3 2	SegSize 5
75.7	OBIA	SVM	May 22/Bands 2 3 7 8 9	May 22/Bands 8 3 2	SegSize 5
74.3	OBIA	SVM	May 22/Bands 2 3 7 8 9 & PCA 1-4 & NDVI May 22	May 22/Bands 8 3 2	SegSize 5
55	PB	SVM	May 22/Bands 2 3 7 8 9 & PCA 1-4 & NDVI May 22
75.1	PB	RF	May 22/Bands 2 3 7 8 9 & PCA 1-4 & NDVI May 22
87.2	OBIA	SVM	May 22/Bands 5 6 7	May 22/Bands 8 3 2	SegSize 5
59.9	OBIA	SVM	May 22/Bands 5 6 7 & PCA 1-4	May 22/Bands 8 3 2	SegSize 5
78.8	OBIA	SVM	May 22/Bands 5 6 7	May 22/Bands 8 3 2	SegSize5 & SA 1-2-3-4
53.7	PB	SVM	May 22/Bands 5 6 7
67.1	OBIA	RF	May 22/Bands 5 6 7	May 22/Bands 8 3 2	SegSize 5
87.2	OBIA	SVM	May 22/Bands 5 6 7 & NDVI May 22	May 22/Bands 8 3 2	SegSize 5
87.2	OBIA	SVM	May 22/Bands 4 5 6 7	May 22/Bands 8 3 2	SegSize 5
62.5	OBIA	RF	May 22/Bands 4 5 6 7	May 22/Bands 8 3 2	SegSize 5
80.3	OBIA	SVM	May 22/Band 7 & Aug 09/Band 6 & May 22/Band 5	May 22/Bands 8 3 2	SegSize 5
63.3	OBIA	SVM	May 22/PCA 1-12	May 22/Bands 8 3 2	SegSize 5
60	OBIA	SVM	May 22/All Bands & PCA May 1-4	May 22/Bands 8 3 2	SegSize 5
89.1	OBIA	SVM	August 09/All Bands	May 22/Bands 8 3 2	SegSize 5
63.1	OBIA	RF	August 09/All Bands	May 22/Bands 8 3 2	SegSize 5
46	PB	SVM	August 09/All Bands
65.2	OBIA	SVM	August 09/All Bands & PCA May 1-4	May 22/Bands 8 3 2	SegSize 5
80.5	OBIA	SVM	August 09/Bands 5 6 7	May 22/Bands 8 3 2	SegSize 5
80.7	OBIA	SVM	August 09/Bands 2 3 8	May 22/Bands 8 3 2	SegSize 5
91	OBIA	SVM	August 09/PCA 1-12	May 22/Bands 8 3 2	SegSize 5
78.3	OBIA	SVM	August 09/PCA 1-3	May 22/Bands 8 3 2	SegSize 5
86.3	OBIA	SVM	August 09/PCA 1-4	May 22/Bands 8 3 2	SegSize 5
53.3	OBIA	RF	August 09/PCA 1-4	May 22/Bands 8 3 2	SegSize 5
51	PB	SVM	August 09/PCA 1-10
90.5	OBIA	SVM	August 09/PCA 1-5	May 22/Bands 8 3 2	SegSize 5
84.5	OBIA	SVM	August 09/PCA 1-6	May 22/Bands 8 3 2	SegSize 5
85.2	OBIA	SVM	August 09/PCA 1-7	May 22/Bands 8 3 2	SegSize 5
80.8	OBIA	SVM	August 09/PCA 1-10	May 22/Bands 8 3 2	SegSize 5
81.9	OBIA	SVM	August 09/PCA 1-11	May 22/Bands 8 3 2	SegSize 5
90.9	OBIA	SVM	August 09/PCA 1-12	May 22/Bands 8 3 2	SegSize 5
72.3	OBIA	SVM	August 09/PCA 1 & 4	May 22/Bands 8 3 2	SegSize 5
74.8	OBIA	SVM	August 09/PCA 2-4	May 22/Bands 8 3 2	SegSize 5
78.3	OBIA	SVM	August 09/PCA 1	May 22/Bands 8 3 2	SegSize 5
80.8	OBIA	SVM	August 09/NDVI August	May 22/Bands 8 3 2	SegSize 5
73.5	OBIA	SVM	August 09/PCA 1-12 & NDVI August 09	May 22/Bands 8 3 2	SegSize 5
77.2	OBIA	SVM	August 09/PCA All & Sept 29/PCA All	May 22/Bands 8 3 2	SegSize 5
66	OBIA	SVM	May 22/PCA All & August 09/PCA 1-12 & Sept 29/PCA 1-12	May 22/Bands 8 3 2	SegSize 5
82.3	OBIA	SVM	May 22/PCA All & August 09/PCA All	May 22/Bands 8 3 2	SegSize 5

OBIA = Object Based Image Analysis; PB = Pixel based; SVM = Support Vector Machine; RF = Random Forest; DS = Default Segmentation Settings, SegSize 5 = Minimum Segment Size in Pixels equals 5; spa/spec = Spatial/Spectral Detail; SA = Segment Attributes with 1 = Active chromaticity color, 2 = Mean digital number, 3 = Standard deviation, 4 = Count of pixels, 5 = Compactness, 6 = Rectangularity.
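To make the segment attributes listed above more concrete, the sketch below computes three of them (mean digital number, standard deviation and count of pixels) for each segment from a flat list of pixel values and segment IDs. It is a plain-Python illustration with hypothetical names and made-up values. The population standard deviation is used as an assumption, since the exact definition used by the segmentation software is not stated here.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Sequence


def segment_attributes(
    segment_ids: Sequence[int], values: Sequence[float]
) -> Dict[int, Dict[str, float]]:
    """Summarize the pixel values of one band per segment.

    Returns, for each segment ID, the mean value, the population
    standard deviation and the number of pixels in the segment.
    """
    if len(segment_ids) != len(values):
        raise ValueError("segment_ids and values must have the same length")
    groups = defaultdict(list)
    for segment_id, value in zip(segment_ids, values):
        groups[segment_id].append(value)
    return {
        segment_id: {
            "mean": mean(pixels),
            "std": pstdev(pixels),
            "count": len(pixels),
        }
        for segment_id, pixels in groups.items()
    }


# Made-up band values for five pixels that fall into two segments.
segment_ids = [1, 1, 1, 2, 2]
band8 = [10.0, 12.0, 14.0, 30.0, 34.0]
attrs = segment_attributes(segment_ids, band8)
```

In an object-based setup, such per-segment summaries take the place of individual pixel values as classifier inputs, which is what distinguishes the OBIA rows in Table 5 from the PB rows.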

Table 6. Validation Result Accuracy Assessment Broad-Leaved Forest Subset Classification (Ebersberger Forest). Results are shown for the classification approach highlighted in Table 5.

Class Name	Beech Trees	Oak Trees	Other Broad-Leaved	Total	UserAccuracy	Kappa
Beech Trees	15	0	1	16	0.94	0
Oak Trees	0	21	0	21	1	0
Other Broad-Leaved	4	0	17	21	0.81	0
Total	19	21	18	58	0	0
ProducerAccuracy	0.79	1	0.94	0	0.91	0
Kappa	0	0	0	0	0	0.87

Table 7. Final Confusion Matrix for Final 4 Classification Tree Species (Ebersberger Forest). This result was obtained by validating the final merged classification (Figure 9) with an independent test set.

Class Name	Beech Trees	Oak Trees	Other Broad-Leaved	Coniferous Forest	Total	UserAccuracy	Kappa
Beech Trees	15	0	1	0	16	0.94	0
Oak Trees	0	21	0	0	21	1	0
Other Broad-Leaved	4	0	17	0	21	0.81	0
Coniferous Forest	2	0	3	20	25	0.8	0
Total	21	21	21	20	83	0	0
ProducerAccuracy	0.71	1	0.81	1	0	0.88	0
Kappa	0	0	0	0	0	0	0.83

Table 8. Final Confusion Matrix for the final Classification of Tree Species, Freisinger Forest (compare Figure 10).

Class Name	Beech Trees	Oak Trees	Other Broad-Leaved	Coniferous Forest	Total	UserAccuracy	Kappa
Beech Trees	5	3	1	0	9	0.56	0
Oak Trees	0	11	0	0	11	1	0
Other Broad-Leaved	2	0	6	0	8	0.75	0
Coniferous Forest	0	0	0	11	11	1	0
Total	7	14	7	11	39	0	0
ProducerAccuracy	0.71	0.79	0.86	1	0	0.85	0
Kappa	0	0	0	0	0	0	0.79

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

