Rapid Land Cover Classification Using a 36-Year Time Series of Multi-Source Remote Sensing Data

Yan, Xingguang; Li, Jing; Smith, Andrew R.; Yang, Di; Ma, Tianyue; Su, Yiting

doi:10.3390/land12122149


¹ College of Geoscience and Surveying Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China

² School of Environmental and Natural Sciences, Bangor University, Bangor LL57 2UW, UK

³ Environment Centre Wales, Bangor University, Bangor LL57 2UW, UK

⁴ Wyoming Geographic Information Science Center, University of Wyoming, Laramie, WY 82071, USA

* Author to whom correspondence should be addressed.

Land 2023, 12(12), 2149; https://doi.org/10.3390/land12122149

Submission received: 9 November 2023 / Revised: 30 November 2023 / Accepted: 8 December 2023 / Published: 11 December 2023

(This article belongs to the Special Issue Land Surface Monitoring Based on Satellite Imagery II)


Abstract:

Long time series land cover classification information is the basis for scientific research on urban sprawl, vegetation change, and the carbon cycle. The rapid development of cloud computing platforms such as the Google Earth Engine (GEE) and access to multi-source satellite imagery from Landsat and Sentinel-2 enable the application of machine learning algorithms for image classification. Here, we used the random forest algorithm to quickly achieve a time series land cover classification at different scales based on fixed land classification sample points selected from images acquired in 2022 and the year-by-year spectral differences of those sample points. The classification accuracy was enhanced by using multi-source remote sensing data, such as synthetic aperture radar (SAR) and digital elevation model (DEM) data. The results showed that: (i) the maximum difference (threshold) of the sample points without land class change, determined by counting the sample points of each band of the Landsat time series from 1986 to 2022, was 0.25; (ii) the kappa coefficient and overall accuracy obtained with the Landsat 8 sensor from 2013 to 2022 are higher than those obtained with the TM and ETM+ sensor data; and (iii) the addition of a mining land cover type increases the mean kappa coefficient and overall accuracy of the Sentinel-2 image classification for a complex mining and forest area. Among the land classifications via multi-source remote sensing, the combined variables of Spectral band + Index + Terrain + SAR result in the highest accuracy, but the overall improvement is limited. The proposed method is applicable to remotely sensed images at different scales and from different sensors under complex terrain conditions. The use of the GEE cloud computing platform enabled the rapid analysis of remotely sensed data to produce land cover maps with high accuracy and a long time series.

Keywords:

Google Earth Engine; sample migration; land classification; multi-source remote sensing; spontaneous forest; machine learning; AI Earth

1. Introduction

Land cover classification is important in enabling detailed studies of temporal and spatial environmental change, land resource management, and sustainable development [1,2]. Changes in land cover can affect the carbon (C) balance; for example, a study in Shandong Province, China, showed that, between 2010 and 2020, land cover change resulted in the loss of 106 × 10⁴ t C stored in vegetation [3].

The classification of land cover is usually based on natural geographic features such as vegetation type, climatic conditions, and topographic features that enable the construction of different types of thematic classification, e.g., urban land [4], biogeoclimatic ecosystems [5], and forest types [6]. Land classification methods traditionally rely on historical land classification data and field observations that can require a large amount of time and resources to process, as image-based land classification was mainly achieved through the visual interpretation of photogrammetry. Subsequently, the availability of remotely sensed data enabled land classification based on the statistical analysis of spectral features extracted from image pixels [7]. As the availability and diversity of multi-source remote sensing data have increased, there have been opportunities to greatly improve the accuracy of land classification.

Remote sensing has an important role in determining land cover types, as multi-sensor-derived waveband information can be used to classify land use and land cover quickly and reproducibly at different temporal and spatial scales [8]. For example, Tadese et al. [9] used remote sensing data as a basis for analyzing and understanding the long-term dynamics of land use and land cover change in the Awash River Basin. Remote sensing imagery can also be used to generate macro time series land cover datasets for a region, country, or even globally. An example is the Global Land Cover 30 series (GlobeLand30) dataset, which consists of ten primary land cover classes, i.e., water bodies, wetland, artificial surfaces, cultivated land, permanent snow/ice, forest, shrubland, grassland, bareland, and tundra [10]. The release of GlobeLand30 provided a database for large-scale land cover change studies and has been used for large regional-scale studies [11]. Whilst the above studies demonstrate the value of land classification at large spatial scales, the datasets are only available for specific years and are not regularly updated. In addition, the spectral characteristics of land cover or landscape features can vary interannually, so the sample points selected for analysis in one year are not optimal for other years, which can create issues related to training datasets and model transferability [12]. To resolve this limitation, a sample point migration approach was developed, which enables the migration of classification thresholds for a feature from a single year to a long time series dataset [13].

The Google Earth Engine (GEE) has been recognized as a powerful tool for processing large-scale Earth observation data, with the ability to access and process large amounts of multi-source, multi-scale, and time series remote sensing data via a cloud platform [14]. The GEE provides access to a variety of datasets in an integrated system, including various satellite image sources, geophysical data, climate data, and demographic data that facilitate the use of time series and multi-source datasets for land cover mapping [15,16]. For example, Sidhu et al. [17] used the raster and vector processing capabilities of the GEE platform for the spatio-temporal analysis of urban and wetland land cover types in two subregions of Singapore, affirming the spatio-temporal analysis capabilities of the GEE. However, most existing studies focus on one land cover type or generate land cover maps for certain areas at specific times of image collection. As a result, these studies often find it difficult to incorporate long time series datasets. The utility of the GEE for land cover detection using annual Landsat-derived normalized difference vegetation index time series data was demonstrated by Huang et al. [18], who created a dynamic map of the land cover change in Beijing over a 30-year period with an overall accuracy of 86.61%.

The multi-petabyte curated catalogue of geospatial datasets available in the GEE can improve classification results by reducing the likelihood of dataset gaps and uncertainty through the provision of multiple sources of data [19]. Multi-source remote sensing data are particularly effective at improving the efficiency of land cover classification, as the data fusion and integration of spectral, spatio-temporal, and thermal information from multiple sensors can improve the accuracy of classification [20]. For example, Li et al. [21] generated a land cover map of the entire African continent at a resolution of 10 m using a combination of Sentinel-2, Landsat-8, Nighttime Light, and MODIS data.

Machine learning algorithms such as maximum likelihood [22], support vector machines [23], and random forest (RF) [24] are recognized as accurate and effective methods for analyzing large dimensional and complex spatio-temporal data when compared to traditional parametric algorithms [25]. The selection of a good classification method is a key factor in the classification process that is dependent on the analysis objectives; for example, RF is one of the most frequently used supervised machine learning methods due to its high efficiency and accuracy in identifying single-class elements such as urban number spaces [26] in remotely sensed imagery, as well as its ability to distinguish between multiple land types [27,28], time series data [29], and complex farming areas [30]. The improvement of machine learning methods to achieve efficient, fast, and accurate land classification for long time series remains a focus of research.

In this study, we implemented the RF classifier in the GEE to perform time series land classification at different spatial scales with Landsat-8 and Sentinel-2 datasets for the vegetation growing season in 2022. Our overarching aim was to use different land classification models constructed using multi-source remote sensing variables to establish an efficient, accurate, and general land classification model for time series datasets, and to identify land classification sample points and migration thresholds based on the differences in the sample point image values without land classification changes. Our objectives were to (1) determine the threshold value of sample point migration based on no change in land class; (2) analyze the accuracy of the land classification model produced using a 36-year time series of Landsat remote sensing imagery and high-precision Sentinel imagery based on those threshold values; and (3) determine the optimal RF land classification model based on different combinations of multi-source remote sensing variables and compare the impact of image resolution on the classification accuracy.

2. Materials and Methods

2.1. Study Area

Shanxi Province is located within the Loess Plateau and the Yellow River Basin (N34°34′–40°44′, E110°14′–114°33′) and occupies a total area of 156,700 km². Mountains account for more than 80% of the total surface area of the region, with its topography highest in the northeast and lowest in the southwest, with an average altitude of 1500 m. Shanxi Province is an important coal energy base in China, with its retained reserves of coal resources reaching 270.9 billion metric tons. Additionally, Shanxi Province contains seven national nature reserves and is an important ecological barrier between mining activities and the Yellow River Basin. Within the Jinzhong coal base of Shanxi Province, the Huodong National Coal Planning Area covers a total area of 4110 km². The region is widely forested and includes the Taiyue Mountain National Forest Park that is an intimate mix of mining and forestry operations. The study area and land classification sample sites are shown in Figure 1.

2.2. Data Sources

The Landsat series of satellites collect data at a resolution of 30 m and have been providing fundamental data for long time series scientific research on a global scale since their launch in 1972. In this study, remotely sensed data from 1 June 2022 to 31 August 2022 was used to capture the spectral reflectance of vegetation and assist in the identification and extraction of information on land cover types, such as forests and grasslands, while effectively distinguishing bare ground and other landscape features.

Sentinel-2 satellite data offers 13 spectral bands, which include four 10 m, six 20 m, and three 60 m spatial resolution bands. MultiSpectral Instrument (MSI), Level-1C data is the standard of the Sentinel-2 archive and represents the top of the atmosphere (TOA) reflectance. Sentinel-2 imagery is commonly used to monitor land use and land cover change on a global scale and is designed to provide high-resolution, multispectral remote sensing data for monitoring surface change and environmental conditions.

In addition to the above images, we used the National Aeronautics and Space Administration (NASA) digital elevation model (DEM) and Sentinel-1 synthetic aperture radar (SAR) as multi-source remote sensing images for land classification. All the multi-source remote sensing images involved in land classification are shown in Table 1.

The workflow of this analysis comprised four phases: (1) pre-processing of the acquired imagery; (2) sample-point threshold acquisition; (3) land classification; and (4) accuracy assessment (Figure 2).

2.3. Image Pre-Processing

The pre-processing of optical remote sensing images included image stitching, cloud removal, mosaicking, and cropping. Clouds and cloud shadows were removed by reading the flags in the QA (quality assessment) bands of the Landsat and Sentinel-2 data and applying a bitwise mask. The masked images were then composited using the median method, which produced the Landsat image series for 1986–2022 and the Sentinel-2 images for 2019–2022, both for the vegetation growing season.
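The bitwise masking and median compositing steps can be sketched in plain Python for a single pixel. This is an illustrative sketch rather than the authors' GEE script: the bit positions assume the Landsat Collection 2 QA_PIXEL band (bit 3 for cloud, bit 4 for cloud shadow), and the function names and reflectance values are hypothetical.

```python
from statistics import median

# Assumed bit positions in the Landsat Collection 2 QA_PIXEL band.
CLOUD_BIT = 3
CLOUD_SHADOW_BIT = 4


def is_clear(qa_value: int) -> bool:
    """Return True when neither the cloud nor the cloud-shadow bit is set."""
    cloud = (qa_value >> CLOUD_BIT) & 1
    shadow = (qa_value >> CLOUD_SHADOW_BIT) & 1
    return cloud == 0 and shadow == 0


def median_composite(observations):
    """Median of the clear observations of one pixel over a season.

    `observations` is a list of (reflectance, qa_value) pairs.
    Returns None when every observation is masked.
    """
    clear = [refl for refl, qa in observations if is_clear(qa)]
    return median(clear) if clear else None


# Hypothetical growing-season stack for one pixel and one band.
pixel_stack = [
    (0.12, 0b00000),  # clear
    (0.55, 0b01000),  # cloud bit set, masked
    (0.10, 0b00000),  # clear
    (0.03, 0b10000),  # cloud-shadow bit set, masked
    (0.14, 0b00000),  # clear
]
print(median_composite(pixel_stack))
```

Taking the median of the remaining clear observations limits the influence of any residual bright (cloud) or dark (shadow) outliers that the QA flags missed.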

The Sentinel-1 Ground Range Detected (GRD) data provided in the GEE have already undergone border noise removal, thermal noise removal, radiometric calibration, and radiometric correction. In this study, the VV and VH polarization bands in the interferometric wide swath (IW) mode, which are suitable for remote sensing studies of land surfaces, were selected. The DEM data were reprojected and resampled, and elevation, slope, and aspect were extracted as topographic factors for use in the construction of the land classification model.

2.4. Sample Point Selection

The land classification of Shanxi Province was divided into six types: forest land, grassland, arable land, bare land, water bodies, and impervious surfaces. Additionally, a mining land type was added to the land classification system to account for the Huodong mining area and to assist in the differentiation of the mining and forest in the Taiyue Mountain National Forest Park complex area.

Fixed sample points for different land classifications were selected by importing the sample points into Google Earth to determine their accuracy by comparing high-resolution remote sensing images. A total of 1507 sample points from the Landsat imagery and 1235 sample points from the Sentinel imagery were selected. In total, 70% of the sample points were used as training sample points and 30% as validation sample points in the classification process; the specific land classification sample points are shown in Table 2.
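A 70/30 split of this kind can be sketched as follows. The source does not state how the points were assigned to the two sets, so the seeded random shuffle, the function name `split_samples`, and the sample data are assumptions made for illustration.

```python
import random


def split_samples(points, train_fraction=0.7, seed=42):
    """Shuffle sample points reproducibly and split them into two lists."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample points: (point_id, land_cover_class).
samples = [(i, "forest" if i % 2 == 0 else "grassland") for i in range(20)]
train, validation = split_samples(samples)
print(len(train), len(validation))
```

Fixing the seed keeps the training and validation sets identical between runs, which makes accuracy figures for different years or variable combinations comparable.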

2.5. Technical Method

2.5.1. Sample Migration

Spectral features and indices are commonly used to analyze remotely sensed imagery. Spectral indices are calculated from ratios or differences between the reflectance or emissivity in different bands of the remotely sensed image. These features and indices can be used to extract feature information, monitor vegetation cover, and monitor water quality, among other things. In this study, the Normalized Difference Vegetation Index (NDVI), Normalized Difference Built-up Index (NDBI), Normalized Difference Water Index (NDWI), and Difference Vegetation Index (DVI) were used to calculate the year-to-year difference in values for forest, grassland, and cropland, and the NDBI and DVI were used to calculate the difference for built-up (working) land and bare land. The spectral characteristics of sample points whose land type did not change were collected over a number of years so that a reasonable threshold range could be determined. In the GEE, the ee.Image.spectralDistance function, which computes the per-pixel spectral distance between two images, was used for the image difference statistics.
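The idea of sample migration can be illustrated with a small, self-contained sketch: a sample point labeled in the reference year is reused in another year only if its spectral values have changed by no more than a threshold. The sketch uses the largest absolute per-band difference as a simplified stand-in for the GEE spectral distance, and the function names, band names, and values are hypothetical.

```python
def max_band_difference(reference, target):
    """Largest absolute per-band difference between two spectral vectors."""
    return max(abs(reference[band] - target[band]) for band in reference)


def migrate_samples(samples, threshold=0.25):
    """Keep the points whose spectra changed by no more than `threshold`.

    `samples` maps a point id to (class_label, reference_spectrum, target_spectrum).
    Returns a dict of point id -> class label for the retained points.
    """
    kept = {}
    for point_id, (label, reference, target) in samples.items():
        if max_band_difference(reference, target) <= threshold:
            kept[point_id] = label
    return kept


# Hypothetical values for the reference year (2022) and an earlier target year.
samples = {
    "p1": ("forest",
           {"red": 0.04, "nir": 0.35, "ndvi": 0.79},
           {"red": 0.05, "nir": 0.33, "ndvi": 0.74}),
    "p2": ("forest",
           {"red": 0.04, "nir": 0.36, "ndvi": 0.80},
           {"red": 0.18, "nir": 0.22, "ndvi": 0.10}),  # large change, dropped
}
print(migrate_samples(samples))
```

Points that pass the check keep their reference-year label and can train the classifier for the target year; points that fail are treated as possibly changed and are excluded.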

2.5.2. Random Forest Algorithm

The random forest algorithm trains multiple decision trees, each on randomly selected samples and features from the dataset, and combines the outputs of the individual trees to obtain a final result. The advantage of using the RF algorithm is that it reduces the risk of overfitting and is robust to data problems such as missing values and outliers.
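The two sources of randomness and the voting step can be shown with a deliberately small sketch. It is not a full random forest and not the classifier used in the GEE: each "tree" is a one-split decision stump, and all names, hyperparameters, and data values are assumptions chosen for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """Best one-split tree (decision stump) over the given candidate features."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not right:
                continue  # no actual split at the largest value
            left_label, right_label = majority(left), majority(right)
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # every sampled row has the same value: predict the majority
        label = majority(labels)
        return features[0], float("inf"), label, label
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], features))
    return forest


def predict(forest, row):
    """Majority vote over the individual stump predictions."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical training samples: (NDVI, NDWI) per point.
rows = [(0.80, -0.50), (0.75, -0.60), (0.70, -0.40), (0.85, -0.55),
        (0.05, 0.60), (-0.10, 0.70), (0.10, 0.50), (0.00, 0.65)]
labels = ["forest"] * 4 + ["water"] * 4
forest = fit_forest(rows, labels)
print(predict(forest, (0.90, -0.70)), predict(forest, (-0.20, 0.80)))
```

Because every stump sees a different bootstrap sample and feature subset, individual errors tend to be uncorrelated, and the majority vote is more stable than any single tree.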

2.5.3. Feature Model Construction

A comparison of the single and multi-source remote sensing variables was conducted by combining different dimensions of remote sensing variables to investigate their influence on the land classification results. Four groups of remote sensing feature variables were selected: spectral bands, spectral indices, terrain features, and SAR data, with the specific variable factors shown in Table 3. In the construction of the multi-source remote sensing variables, five combinations were used: spectral band, spectral band + spectral index, spectral band + SAR, spectral band + spectral index + SAR, and spectral band + spectral index + terrain features + SAR.
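The five variable combinations can be expressed as a simple lookup that assembles a feature vector per pixel. The sketch below is hedged: Table 3 is not reproduced here, so the band list, the green/NIR form of the NDWI, and all names and values are assumptions; only the combination names, the terrain variables, and the VV/VH bands come from the text.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    return (a - b) / (a + b) if (a + b) != 0 else 0.0


def spectral_indices(bands):
    """Common index definitions; the NDWI uses the green/NIR form (assumed)."""
    return {
        "NDVI": normalized_difference(bands["nir"], bands["red"]),
        "NDBI": normalized_difference(bands["swir1"], bands["nir"]),
        "NDWI": normalized_difference(bands["green"], bands["nir"]),
        "DVI": bands["nir"] - bands["red"],
    }


SPECTRAL = ["blue", "green", "red", "nir", "swir1", "swir2"]  # assumed band list
INDEX = ["NDVI", "NDBI", "NDWI", "DVI"]
TERRAIN = ["elevation", "slope", "aspect"]
SAR = ["VV", "VH"]

COMBINATIONS = {
    "Spectral band": [SPECTRAL],
    "Spectral band + Index": [SPECTRAL, INDEX],
    "Spectral band + SAR": [SPECTRAL, SAR],
    "Spectral band + Index + SAR": [SPECTRAL, INDEX, SAR],
    "Spectral band + Index + Terrain + SAR": [SPECTRAL, INDEX, TERRAIN, SAR],
}


def feature_vector(pixel, combination):
    """Assemble the classifier inputs for one pixel and one combination."""
    values = dict(pixel)
    values.update(spectral_indices(pixel))
    return [values[name] for group in COMBINATIONS[combination] for name in group]


# Hypothetical pixel with optical, terrain, and SAR values.
pixel = {"blue": 0.03, "green": 0.05, "red": 0.10, "nir": 0.40,
         "swir1": 0.20, "swir2": 0.10,
         "elevation": 1500.0, "slope": 12.0, "aspect": 180.0,
         "VV": -9.5, "VH": -16.0}
for name in COMBINATIONS:
    print(name, len(feature_vector(pixel, name)))
```

Keeping the combinations in one table makes it straightforward to train the same classifier once per combination and compare the resulting accuracies.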

2.5.4. Accuracy Assessment

The accuracy of the classification results was determined by calculating the overall accuracy (OA), the ratio of the number of correctly classified samples to the total number of samples, which is a common measure of classifier performance. The Kappa coefficient is a statistic that measures the agreement between two classifiers or evaluators, here between the classification result and the validation samples, after accounting for the agreement expected by chance. Kappa coefficient values range from −1 to 1, with higher values indicating better agreement.
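Both measures can be computed from a confusion matrix. The sketch below uses the standard definitions, OA = p_o and kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the row and column totals; the matrix values are hypothetical and not taken from the study.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa_coefficient(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(len(matrix))
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
confusion = [
    [45, 5],
    [10, 40],
]
print(round(overall_accuracy(confusion), 3), round(kappa_coefficient(confusion), 3))
```

In this example 85 of 100 samples are correct, while chance agreement is 0.5, so the kappa coefficient (0.7) is lower than the OA (0.85).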

3. Results

3.1. The Determination of Thresholds

A total of 180 sample points without land classification change were selected by comparing remote sensing images of the same period from 1986 to 2022. These comprised 30 sample points per land cover class, and the value of each spectral band (Blue, Green, Red, Swir1, Swir2) and spectral index (NDVI, NDBI, NDWI) was extracted for each point year by year to obtain the range between the maximum and minimum values (Table 4). The results show that the Landsat values of unchanged sample points fluctuate somewhat between bands and indices, with a fluctuation range between 0.01 and 0.25. The variation between land classes indicated that water bodies are the most stable, followed by grasslands; forests fluctuated more in the NDVI and NDWI indices. The final upper threshold value for the land classification sample points was set at 0.25 for Landsat long time series land classification.
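One way to read this procedure is as a range calculation: for every unchanged point and every band or index, take the difference between the largest and smallest yearly value, then use the largest such range as the upper threshold. The sketch below follows that reading; the function names and yearly values are hypothetical.

```python
def fluctuation_range(series):
    """Difference between the largest and smallest yearly value."""
    return max(series) - min(series)


def migration_threshold(points):
    """Largest fluctuation range over all unchanged points and variables.

    `points` maps a point id to {variable_name: [yearly values]}.
    """
    return max(
        fluctuation_range(values)
        for variables in points.values()
        for values in variables.values()
    )


# Hypothetical yearly values for two points whose land class did not change.
unchanged_points = {
    "water_01": {"blue": [0.05, 0.06, 0.05], "ndwi": [0.40, 0.43, 0.41]},
    "forest_01": {"red": [0.04, 0.06, 0.05], "ndvi": [0.60, 0.85, 0.72]},
}
print(round(migration_threshold(unchanged_points), 2))
```

A threshold derived this way is the largest change observed where no land class change occurred, so larger differences in other years are taken to indicate a possible change.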

3.2. Land Classification of Landsat Imagery

Land cover classification using Landsat remote sensing images from 1986–2022 was conducted with a sample point migration threshold of 0.25; the accuracy was assessed using the OA and kappa coefficient, and the number of migrated sample points was counted (Figure 3). The results show that the classification accuracy of the images was highest in the years closer to the 2022 initial land classification, while the difference between the kappa coefficient and OA became larger as the number of years from the 2022 initial land classification sample points increased. However, the overall land classification accuracy remained high, with the lowest kappa coefficient being 0.60 and the lowest OA being 0.75 in 1999. The number of classification sample points decreases as the number of years from 2022 increases, with the number of migrated sample points remaining stable at about 900, which accounts for approximately 60% of the original number of sample points. It is noteworthy that the differences between the Landsat TM/ETM+ and OLI sensor technology can explain the lower accuracy of results from the start of the study in 1986 until 2012.

3.3. Land Classification of Sentinel-2 Images

To verify the generality of this method among different remote sensing images and its reproducibility under complex terrain conditions, we selected the Huodong national planning mining area in Shanxi Province, with its complex terrain conditions, as the study area, and added a mining class to the land classification system for Sentinel-2 high-resolution remote sensing images from 2019 to 2022. The land cover classification accuracies in different threshold ranges (0.1–0.4) were assessed separately by counting each waveband for the different years of the land class (Table 5). The results show that the land classification accuracy is higher when the threshold value of the training sample point migration is set in the range of 0.20–0.30, and that the number of sample points for year-by-year land classification after threshold screening is maintained at about 70% of the original number, which largely meets the number of sample points required for land classification. At the same time, the kappa coefficients between 2019 and 2021 are stable around 0.90, while the OA is also maintained around 0.91.

3.4. Multi-Source Remote Sensing Images for Land Classification

3.4.1. Sentinel-2 Multi-Source Remote Sensing Land Classification

A combination of multi-source remote sensing variables improved the model accuracy of land classification (Table 6; Figure 4), and the model accuracy improved with an increase in the number of variables, especially for the combination of Spectral band + Index + SAR. In 2019, for example, the kappa coefficient increased from 0.863 for a single Spectral band to 0.910 for the Spectral band + Index + Terrain + SAR, whilst the OA also increased from 0.888 to 0.927 for that variable combination. In addition, threshold screening can eliminate sample points that were misclassified during the selection process, so the 2019–2021 land classification accuracy is better than the 2022 land classification accuracy, for which the sample points were not screened.

3.4.2. Landsat Multi-Source Remote Sensing Land Classification

The land classification accuracy of Landsat-8 (Table 7; Figure 5) with various combinations of variables is lower than that of the multi-source remote sensing land classification based on Sentinel-2 imagery. In 2022, for example, the highest land classification accuracy is achieved with the combination of Spectral band + Index + SAR, and the model combination of Spectral band + SAR is better than that of Spectral band + Index. The years 2019 and 2020 have the best accuracy for the full variable combination, while the best variable combination for 2021 and 2022 is Spectral band + Index + SAR.

3.4.3. Comparative Analysis of the Accuracy of Land Classification Products

The Landsat 8 and Sentinel-2 remote sensing images of the land classification results in 2020 are shown in Table 8; in the Huodong mining area, forest and grassland areas, as a whole, accounted for about 80% of the whole study area, of which forest land accounted for about 40%, while coal mine land accounted for about 1.5% of the whole study area, and the water and bare ground accounted for only between 0.2 and 0.3%.

Three high-resolution land classification products from 2022 were obtained for the comparison of forested land classification results at resolutions ranging from 10 to 30 m (Table 9). Notably, among the classification products listed in Table 9, the JAXA/ALOS/PALSAR/YEARLY/FNF4 product contains both forest and non-forest land classes. Despite this subtle difference in the classification procedure and imagery resolution, the area of forested land ranged from 1136.74 to 1418.27 km², with both the highest and lowest area estimates being produced at a 10 m resolution.

4. Discussion

In this study, the utility of the GEE cloud computing platform for building land cover classification models using multiple sources of Landsat and Sentinel remote sensing imagery at different spatial resolutions over a 36-year time series was assessed. High-accuracy spatiotemporal land cover classification maps can help to reveal the impact of human activities such as coal mining and urban expansion on land use over time, which could enhance our understanding of the impact of population growth and changes in demography, and provide an evidence base to facilitate future government policy decisions; for example, creating accurate assessments of the spatiotemporal changes in forest C stocks in the context of C accounting and net-zero targets [31].

Cloud computing platforms such as GEE, PIE-Engine, and AI Earth have improved our access to the high-performance computing necessary to process large and complex datasets and facilitated an increase in both the speed and accuracy of land cover classification. The approach used in this study was to use the GEE platform to conduct classification based on sample point migration and determine the sample point threshold value required to detect a land cover classification change. The selection of the sample points’ migration method has the advantage of not requiring new sample points to be chosen for each time period image, thereby improving the efficiency of the classification process [13].

The fusion of multi-source remote sensing data into composite data products has been shown to improve the accuracy of land cover classification [32]. In this study, when assessing the classification of both Landsat and Sentinel multispectral images, differences in the classification for crops and grassland were apparent because the imagery obtained for the vegetative growing season did not have substantial differences in the image spectra between grassland and crops. This finding supports the need for multiple sources of remotely sensed images obtained with different sensors (e.g., the SAR and multispectral data available in the Landsat and Sentinel series of images) to accurately classify land cover.

In our comparison of publicly available land classification products (Table 9), the land classification results for forested land ranged between 1136.74 and 1418.27 km² for the Google and ESA products, respectively, which is broadly consistent with our own classification of the forest land area. The higher estimation of forested land area by the ESA product is likely due to its inclusion of sparse forest land in the classification of forested land, whereas the difference between the other four products is only 50.88 km² despite their differences in image resolution.

Land cover classification based on the non-parametric RF algorithm is able to handle multi-dimensional and non-linear data sources whilst also removing the requirement for a balanced number of individual sample points [27], unlike the parametric minimum distance, maximum likelihood, and Bayesian classification methods. The combination of multi-source remote sensing data and the RF method has been shown to perform land cover classification effectively and accurately. Random forest methods of land cover classification have generated higher accuracy outputs compared to other non-parametric machine learning methods such as support vector machines and artificial neural networks [33,34].

The generation of accurate land cover maps over a 36-year period has several challenges relating to the detection of land cover change and technological advances. Sensor technology is continually evolving, which has improved the diversity, quality, and quantity of the remote sensing datasets available for analysis. The difference in the satellite sensors between the Landsat-8 Operational Land Imager (OLI) and Sentinel-2 MultiSpectral Instrument (MSI) did not have a large impact on the land cover classification results of the same area, despite the higher resolution of the Sentinel-2-acquired datasets, which, in theory, should facilitate a more accurate land cover classification and reduce the misclassification of features and the necessity of filtering imagery [35,36]. However, the fusion of multi-source remote sensing datasets that incorporate textural features [37] has resulted in greater improvements in classification than relying on the increased resolution of images. For example, the fusion of datasets from different sensors has been shown to improve the accuracy of land classification [38], forest biomass estimation [39], and natural disaster monitoring [40]. The complex topography and forest species’ composition and density in the typical mountainous mining area used in this study demonstrated that the effective integration of topographic features such as elevation and slope can be more conducive to distinguishing forests from buildings and crops.

5. Conclusions

The GEE remote sensing cloud platform was used for rapid land cover classification using Landsat 5, 7, 8, and Sentinel-2 remotely sensed images with a time series spanning 36 years. Single sample point migration was used to produce a time series land cover classification map at both the provincial–regional scale and the scale of mining operations. The final sample point migration threshold value that corresponded to no change in classification was 0.25. The optimal combination of the multi-source remote sensing variables used to parameterize the RF machine learning algorithm was the Spectral band + Index + Terrain + SAR combination for both Landsat 8- and Sentinel-2-generated data. The RF model produced the classification map with the highest accuracy for the year 2022 using the Landsat 8 data, with an OA of 0.90 and a Kappa coefficient of 0.919. Our analysis suggests that a higher accuracy can be achieved when imagery with a higher spatial and temporal resolution is used. Further work assessing the combination of low-resolution remotely sensed imagery and machine learning techniques will enable the assessment of a global-scale land cover classification map over a long time series. As sensor technology develops, we expect that the accuracy of land cover classification will continue to improve, enabling the future identification of land cover classes that have not yet been considered.

To aid visualization and interpretation, a GEE-based application for land classification based on spectral differences (1984–present) was developed and is available at the following URL: https://bqt2000204051.users.earthengine.app/view/land-classification-of-landsat-imagery (accessed on 1 October 2023). The main purpose of this land classification program is to allow users to input a predetermined set of land classification points for a specific year, choose a designated threshold, and use the RF algorithm to classify land images from 1984 to the current year.

Author Contributions

Conceptualization, X.Y., J.L. and A.R.S.; methodology, X.Y.; software, X.Y., D.Y. and T.M.; validation, D.Y. and A.R.S.; formal analysis, Y.S.; investigation, X.Y., T.M. and Y.S.; data curation, X.Y.; writing—original draft preparation, X.Y.; writing—review and editing, X.Y. and A.R.S.; visualization, X.Y.; supervision, A.R.S., J.L. and D.Y.; project administration, X.Y., J.L. and A.R.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program of China (intergovernmental and international cooperation in science, technology and innovation) under grant number 2022YFE0127700; the Royal Society International Exchanges 2022 Cost Share (NSFC) under grant number IEC\NSFC\223567; the University-Industry Collaborative Education Program under grant number 220702313180236.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors sincerely thank National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS) for providing the Landsat and DEM data. The authors thank the European Space Agency (ESA) for providing the Sentinel-1 and Sentinel-2 data. We would like to express our gratitude to Google Earth Engine, PIE-Engine, and AI Earth for offering free cloud computing services. Thank you very much for the support of Alibaba Damo Academy 2023 AI Earth Joint Innovative Research for this study. The authors thank the anonymous reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Overview of the study area. (a) Landsat 8 RGB image of Shanxi province in 2022; the red outline is the Huodong mining area. (b) Sentinel-2 RGB image of the Huodong mining area in 2022.

Figure 2. Flowchart of the land classification method based on machine learning methods and multi-source remote sensing variables. It consists of four parts: Data pre-processing, Sample Migration, Land Classification, and Accuracy Assessment.

Figure 3. Landsat land classification and sample sites, 1986–2022. The left y-axis shows the Kappa coefficient and the overall accuracy, and the right y-axis shows the number of sample points used for land classification.

Figure 4. Sentinel-2 land classification in 2019. (a) Spectral band, (b) Spectral band + Index, (c) Spectral band + SAR, (d) Spectral band + Index + SAR, and (e) Spectral band + Index + Terrain + SAR.

Figure 5. Landsat-8 land classification in 2019. (a) Spectral band, (b) Spectral band + Index, (c) Spectral band + SAR, (d) Spectral band + Index + SAR, and (e) Spectral band + Index + Terrain + SAR.

Table 1. Multi-source remote sensing image data, at two different resolutions, used in this analysis.

Name	Earth Engine Snippet	Date	Resolution
Landsat 5	LANDSAT/LT05/C02/T1_L2	16 March 1984–5 May 2012	30 m
Landsat 7	LANDSAT/LE07/C02/T1_L2	28 May 1999	30 m
Landsat 8	LANDSAT/LC08/C02/T1_L2	18 March 2013	30 m
Sentinel 1	COPERNICUS/S1_GRD	3 October 2014	10 m
Sentinel 2	COPERNICUS/S2	23 June 2015	10 m
DEM	NASA/NASADEM_HGT/001	11 February 2000	30 m

Table 2. Number of sample points for each land classification.

Image	Samples	Forest	Water	Crop	Grass	Building	Bare	Mining	Total
Landsat	Training	139	104	227	176	274	135	0	1055
Landsat	Validation	59	44	97	76	118	58	0	452
Sentinel	Training	150	29	160	141	195	21	168	864
Sentinel	Validation	65	12	69	60	84	9	72	371

Table 3. Multi-source remote sensing feature variables.

Multi-Source Remote Sensing Image	Variable Factors
Spectral Band	Blue, Green, Red, Nir, Swir1, Swir2
Spectral Index	NDVI, NDBI, NDWI, RVI, DVI
Terrain	Elevation, Slope, Aspect
SAR	HH, HV
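The spectral indices listed in Table 3 can be derived from the spectral bands in the same table. The Python sketch below uses commonly cited definitions (NDWI in its green/NIR form); these formulas are assumptions for illustration, since the exact definitions used by the authors are not restated in this table. The function name and the input reflectance values are hypothetical.

```python
from typing import Dict


def spectral_indices(green: float, red: float, nir: float,
                     swir1: float) -> Dict[str, float]:
    """Compute five indices from surface reflectance values.

    Assumes the commonly used definitions and non-zero denominators.
    """
    return {
        "NDVI": (nir - red) / (nir + red),        # vegetation
        "NDBI": (swir1 - nir) / (swir1 + nir),    # built-up
        "NDWI": (green - nir) / (green + nir),    # water (green/NIR form)
        "RVI": nir / red,                         # ratio vegetation index
        "DVI": nir - red,                         # difference vegetation index
    }


# Hypothetical reflectance values for a vegetated pixel.
idx = spectral_indices(green=0.08, red=0.05, nir=0.40, swir1=0.20)
for name, value in idx.items():
    print(f"{name}: {value:.3f}")
```

For a vegetated pixel such as this one, NDVI is strongly positive while NDBI and NDWI are negative, which is the kind of contrast the classifier can exploit alongside the raw bands.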

Table 4. Threshold information of each band for sample points without land class change.

Band	Forest	Water	Crop	Grassland	Building	Bare
Blue	0.13	0.07	0.12	0.11	0.15	0.04
Green	0.08	0.08	0.12	0.09	0.12	0.08
Red	0.02	0.05	0.06	0.04	0.15	0.11
Swir1	0.13	0.07	0.18	0.04	0.25	0.24
Swir2	0.06	0.05	0.10	0.02	0.21	0.19
NDVI	0.25	0.07	0.20	0.13	0.12	0.11
NDBI	0.04	0.03	0.23	0.08	0.01	0.02
NDWI	0.23	0.05	0.12	0.04	0.15	0.09
DVI	0.14	0.01	0.19	0.01	0.02	0.02
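The single threshold of 0.25 reported in the abstract corresponds to the largest entry in Table 4. The short Python check below copies the table values and takes that maximum. The per-class maxima are an additional illustration and are not a result reported by the authors.

```python
# Values copied from Table 4 (rows: band or index; columns: land cover class).
classes = ["Forest", "Water", "Crop", "Grassland", "Building", "Bare"]
table4 = {
    "Blue":  [0.13, 0.07, 0.12, 0.11, 0.15, 0.04],
    "Green": [0.08, 0.08, 0.12, 0.09, 0.12, 0.08],
    "Red":   [0.02, 0.05, 0.06, 0.04, 0.15, 0.11],
    "Swir1": [0.13, 0.07, 0.18, 0.04, 0.25, 0.24],
    "Swir2": [0.06, 0.05, 0.10, 0.02, 0.21, 0.19],
    "NDVI":  [0.25, 0.07, 0.20, 0.13, 0.12, 0.11],
    "NDBI":  [0.04, 0.03, 0.23, 0.08, 0.01, 0.02],
    "NDWI":  [0.23, 0.05, 0.12, 0.04, 0.15, 0.09],
    "DVI":   [0.14, 0.01, 0.19, 0.01, 0.02, 0.02],
}

# Largest difference over all bands and classes.
global_threshold = max(max(row) for row in table4.values())

# Largest difference per class (illustration only).
per_class_max = {
    cls: max(row[i] for row in table4.values())
    for i, cls in enumerate(classes)
}

print(global_threshold)
print(per_class_max)
```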

Table 5. Land classification accuracies for different thresholds in 2019–2021.

Threshold	Method	Accuracy (2019)	Number of Samples (2019)	Accuracy (2020)	Number of Samples (2020)	Accuracy (2021)	Number of Samples (2021)
0.10	Kappa	0.333	19	0.639	56	0.582	11
0.10	OA	0.500	19	0.923	56	0.684	11
0.15	Kappa	0.707	108	0.644	160	0.867	70
0.15	OA	0.818	108	0.792	160	0.896	70
0.20	Kappa	0.829	560	0.910	681	0.935	556
0.20	OA	0.874	560	0.949	681	0.941	556
0.25	Kappa	0.884	863	0.886	956	0.914	901
0.25	OA	0.907	863	0.908	956	0.931	901
0.30	Kappa	0.901	1028	0.914	1094	0.870	1055
0.30	OA	0.919	1028	0.931	1094	0.910	1055
0.35	Kappa	0.882	1112	0.921	1157	0.889	1132
0.35	OA	0.903	1112	0.904	1157	0.876	1132
0.40	Kappa	0.846	1173	0.891	1193	0.926	1176
0.40	OA	0.875	1173	0.905	1193	0.893	1176
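The two accuracy measures reported in Tables 5–7, overall accuracy (OA) and the kappa coefficient, are both computed from a confusion matrix of validation samples. The Python sketch below uses the standard definitions of both measures; the three-class confusion matrix is hypothetical and does not come from the article.

```python
from typing import List

Matrix = List[List[int]]  # rows: reference class, columns: predicted class


def overall_accuracy(cm: Matrix) -> float:
    """Fraction of validation samples on the diagonal of the matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_coefficient(cm: Matrix) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(cm)
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical confusion matrix for three classes (150 validation samples).
cm = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]

oa = overall_accuracy(cm)
kappa = kappa_coefficient(cm)
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts the agreement expected by chance, it is never higher than OA for the same matrix, so an entry where kappa exceeds OA (as in a few rows of Table 5) merits a second look at the underlying numbers.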

Table 6. Land classification accuracy of Sentinel-2 multi-source remote sensing variables in 2019–2022.

Variable Combinations	Kappa (2019)	OA (2019)	Kappa (2020)	OA (2020)	Kappa (2021)	OA (2021)	Kappa (2022)	OA (2022)
Spectral band	0.863	0.888	0.877	0.900	0.867	0.893	0.860	0.887
Spectral band + Index	0.874	0.907	0.878	0.900	0.867	0.892	0.883	0.905
Spectral band + SAR	0.866	0.890	0.878	0.901	0.907	0.924	0.875	0.896
Spectral band + Index + SAR	0.903	0.915	0.913	0.929	0.896	0.916	0.900	0.915
Spectral band + Index + Terrain + SAR	0.910	0.927	0.880	0.903	0.921	0.936	0.889	0.919

Table 7. Land classification accuracy of Landsat 8 multi-source remote sensing variables in 2019–2022.

Variable Combinations	Kappa (2019)	OA (2019)	Kappa (2020)	OA (2020)	Kappa (2021)	OA (2021)	Kappa (2022)	OA (2022)
Spectral band	0.833	0.864	0.828	0.864	0.836	0.869	0.881	0.903
Spectral band + Index	0.837	0.868	0.835	0.866	0.851	0.879	0.828	0.861
Spectral band + SAR	0.848	0.877	0.870	0.896	0.846	0.876	0.882	0.903
Spectral band + Index + SAR	0.831	0.864	0.866	0.894	0.871	0.894	0.917	0.933
Spectral band + Index + Terrain + SAR	0.872	0.897	0.892	0.913	0.848	0.878	0.900	0.919

Table 8. Land classification results of different remote sensing images in 2020.

Land Classification	Landsat Area (km²)	Landsat Percentage of Total Area (%)	Sentinel Area (km²)	Sentinel Percentage of Total Area (%)
Forest	1187.62	40.43	1163.98	39.68
Water	7.05	0.24	6.22	0.21
Crop	486.41	16.69	536.17	18.28
Grass	1161.89	39.55	1139.87	38.85
Building	48.11	1.64	48.77	1.66
Bare	8.29	0.28	8.00	0.27
Mining	34.41	1.17	34.58	1.18

Table 9. Comparative analysis of forested land classification products in 2022.

Earth Engine Snippet	Resolution (m)	Area (km²)
ESA/WorldCover/v100	10	1418.27
GOOGLE/DYNAMICWORLD/V1	10	1136.74
JAXA/ALOS/PALSAR/YEARLY/FNF4	25	1147.41
LANDSAT/LC08/C02/T1_L2	30	1187.62
COPERNICUS/S2_SR	10	1163.98

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
