An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes

Knauer, Kim; Gessner, Ursula; Fensholt, Rasmus; Kuenzer, Claudia

doi:10.3390/rs8050425

Open AccessArticle

An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes

by

Kim Knauer

^1,2,*,

Ursula Gessner

¹,

Rasmus Fensholt

³

and

Claudia Kuenzer

¹

German Remote Sensing Data Center (DFD), German Aerospace Center (DLR), 82234 Wessling, Germany

²

Department of Remote Sensing, University of Wuerzburg, 97074 Wuerzburg, Germany

³

Department of Geosciences and Natural Resource Management, University of Copenhagen, 1350 Copenhagen, Denmark

^*

Author to whom correspondence should be addressed.

Remote Sens. 2016, 8(5), 425; https://0-doi-org.brum.beds.ac.uk/10.3390/rs8050425

Submission received: 23 March 2016 / Revised: 10 May 2016 / Accepted: 12 May 2016 / Published: 19 May 2016

Download

Browse Figures

Versions Notes

Abstract

:

Monitoring the spatio-temporal development of vegetation is a challenging task in heterogeneous and cloud-prone landscapes. No single satellite sensor has thus far been able to provide consistent time series of high temporal and spatial resolution for such areas. In order to overcome this problem, data fusion algorithms such as the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) have been established and frequently used in recent years to generate high-resolution time series. In order to make it applicable to larger scales and to increase the input data availability especially in cloud-prone areas, an ESTARFM framework was developed in this study introducing several enhancements. An automatic filling of cloud gaps was included in the framework to make best use of available, even partly cloud-covered Landsat images. Furthermore, the ESTARFM algorithm was enhanced to automatically account for regional differences in the heterogeneity of the study area. The generation of time series was automated and the processing speed was accelerated significantly by parallelization. To test the performance of the developed ESTARFM framework, MODIS and Landsat-8 data were fused for generating an 8-day NDVI time series for a study area of approximately 98,000 km² in West Africa. The results show that the ESTARFM framework can accurately produce high temporal resolution time series (average MAE (mean absolute error) of 0.02 for the dry season and 0.05 for the vegetative season) while keeping the spatial detail in such a heterogeneous, cloud-prone region. The developments introduced within the ESTARFM framework establish the basis for large-scale research on various geoscientific questions related to land degradation, changes in land surface phenology or agriculture.

Keywords:

vegetation dynamics; ESTARFM; MODIS; Landsat; phenology; West Africa; cloud gap filling; time series analysis

Graphical Abstract

1. Introduction

Monitoring the spatio-temporal development of vegetation is a challenging task in heterogeneous and cloud-prone landscapes [1]. Consistent time series of high temporal and spatial resolution satellite imagery could substantially improve various regional analyses of land degradation, changes in land surface phenology (LSP), or changes in extent of agricultural area. However, no single satellite sensor has yet been able to provide time series to sufficiently monitor the vegetation development in these challenging areas. Moderate spatial resolution sensors such as MODIS (Moderate Resolution Imaging Spectroradiometer), MERIS (Medium Resolution Imaging Spectrometer) and SPOT-VEGETATION (Satellite Pour l’ Observation de la Terre) provide information on the temporal development of LSP while they cannot sufficiently capture small-scale mosaics of different land cover types. On the other hand, current medium spatial resolution sensors such as Landsat provide the spatial detail necessary for various approaches in heterogeneous landscapes. Yet, in cloud-prone areas and generally for the first Landsat missions which did not provide frequent images globally, data availability is insufficient to assess the various growth stages and temporal changes of vegetation over the growing cycle. To overcome this classic discrepancy in remote sensing, fusing data from different sensors has gained increasing popularity in recent years for combining high temporal resolution sensors with medium to high spatial resolution sensors. Especially the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) [2] and its enhanced version ESTARFM [3] have been frequently applied as described in the literature (e.g., [4,5,6,7,8]). Recent comparison studies highlighted the respective advantages of the two algorithms compared to other simple fusion approaches, but the overall accuracies are found to be highest for ESTARFM [9,10]. Furthermore, in heterogeneous landscapes, ESTARFM has been documented to reproduce the small-scale objects of the land surface more accurately than the original STARFM [10].

However, thus far, data fusion with these two algorithms has only been applied to relatively small study areas corresponding to a single Landsat scene (~33,300 km²) or less, and the vast majority of studies focused on regions in North America (USA, Canada), Australia and China. Gao et al. [11] investigated the use of STARFM time series for the monitoring of crop condition in a study area of 225 km² in central Iowa, USA. They found the synthetic time series derived from MODIS and Landsat to be well suited to the task and could even distinguish different crop types (soybean and corn) and forest areas. Focusing on forest disturbance and regrowth, Schmidt et al. [12] generated a 12-year time series with STARFM for a 2960 km² study area in Queensland, Australia. By analyzing the time series with the Breaks for Additive Seasonal and Trend (BFAST) algorithm [13], they identified forest-clearing events and monitored the regrowing vegetation. In their conclusion, the value of such data fusion for land management and forest conversion was highlighted [12]. Tewes et al. [5] fused 250 m MODIS and 5 m RapidEye data using ESTARFM to monitor vegetation dynamics in semi-arid rangelands of South Africa (study area: 1434 km²). Their results indicate the suitability of ESTARFM to characterize phenological development of different vegetation types in such heterogeneous areas. Except for the latter study, ESTARFM in particular has only been used in performance and feasibility studies that show the potential of the algorithm for different purposes like urban flood mapping or crop monitoring, sometimes in comparison to other data fusion algorithms [8,9,10,14]. However, to the best of the authors’ knowledge, there have not yet been any studies applying ESTARFM to investigating specific geoscientific questions and for larger study areas. The frequent use reflects the high interest in this fusion method, but it also indicates that there might be limitations which have so far hindered broader applications.

This study identifies possible limitations of the ESTARFM algorithm and aims to improve these for a better applicability on large spatial scales (e.g., more than one Landsat tile). For this purpose, the algorithm was implemented in a framework facilitating the generation of time series and automatically accounting for regional differences in the heterogeneity of the study area. A further aim was to enhance the application of ESTARFM in cloud-prone areas by improving the availability of useable input Landsat data. Lastly, the benefits of the developed ESTARFM framework are demonstrated by applying it to the generation of a 30 m NDVI time series for a large, cloud-prone, and heterogeneous region of West Africa.

2. Study Area

The study region is located in the transboundary zone of southern Burkina Faso and northern Ghana (3°10′33″W–0°11′31″W/9°29′4″N–12°24′9″N) with an approximate size of 98,000 km² (Figure 1). The climate is tropical and strongly seasonal with distinct rainy and dry seasons. The rainy season lasts for about four to five months between May and October with mean annual rainfall ranging from 700 to 1200 mm. The temperatures are relatively stable over the year with monthly averages between 25 °C and 32 °C. The study area is located in the West Sudanian Savanna ecoregion with vegetation types ranging from open grassland to closed woodland and forests [15]. Dense woody vegetation or forests mainly occur along rivers or in protected areas such as the Nazinga game ranch, Sissili classified forest, the Kaboré-Tambi National Park, or the Mole National Park. Outside these protected areas, agricultural use is predominant, e.g., around Kaboré-Tambi National Park, a strong increase in agricultural area (from about 30% to 60%) between 2001 and 2013 has been documented [16]. The agriculture in the study area mostly consists of small-scale rain-fed crops with average field sizes around 1 ha and a low productivity (1.15 t·ha⁻¹·year⁻¹ cereals for Burkina Faso) as well as livestock grazing [17,18]. The small-scale mosaic of different land cover and land use types (mainly semi-natural vegetation and croplands) make this study region spatially very heterogeneous [18]. An important landmark in the northeast is Lake Bagré which retains the White Volta for hydro-power generation but also supplies water to irrigated agricultural areas downstream of the dam.

In some parts of the study area, the rainy season triggers flood events, whereas fires frequently occur during the dry season. In summary, the study area is highly dynamic and heterogeneous in space and time with expanding small-scale agriculture and a rainy and dry season cycle that strongly influences the seasonal development of vegetation.

3. Materials and Methods

3.1. Landsat-8 Operational Land Imager (OLI)

Landsat-8 Operational Land Imager (OLI) surface reflectance scenes were obtained from the USGS Landsat Archive [20] distributed on four tiles (195/52, 195/53, 194/52 and 194/53) and covering the period from December 2013 to January 2015 (Figure 2). The datasets contain surface reflectances atmospherically corrected with the L8SR algorithm [21] and complemented by cloud and cloud shadow masks based on CFmask [21], an open source C code version of the Fmask algorithm [22]. The masks were applied to the reflectance data prior to the calculation of NDVI (Figure 3). After a thorough inspection of the available datasets, scenes with a cloud cover of more than 40% were excluded from the analysis; thus, a total number of 38 Landsat images were used for the fusion. However, the threshold of cloud cover can be defined by the user individually based on expert knowledge of the respective study region.

3.2. Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A4, Collection 6

For the same period as the Landsat data, daily, 500 m MODIS nadir bidirectional reflectance distribution function (BRDF)-adjusted reflectance datasets (MCD43A4) have been obtained in the product collection 6 for the tiles h17v07 and h17v08 (Figure 1). A major improvement of MCD43A4 collection 6 is that the production frequency has been increased from 8 days to daily, providing maximum flexibility in the definition of dates and time steps for generation of high temporal resolution time series with ESTARFM. MCD43A4 is a 16-day rolling composite, representing the best BRDF possible during this period of time. Unlike in the previous collection 5, the date associated with each daily dataset is the centre of the moving 16-day input window (and not its first day). More information on collection 6 of the MCD43A4 product can be found in the MODIS User Guide [25].

The red and NIR bands of the MODIS datasets have been resampled using the nearest neighbour method to a resolution of 30 m, automatically co-registered to Landsat, and reprojected to UTM zone 30°N using the MODIS Reprojection Tool (MRT) [26]. As recommended in the MODIS User Guide [25], only data with a full BRDF inversion (labelled with the best quality flag) have been processed. From these, NDVI was calculated and remaining outliers in the time series were removed (defined as observations which considerably differ from the median in a moving window with a length of 20 days). Similar to what was suggested by Eklundh and Jönsson [27], the threshold for outlier detection was defined as the standard deviation of the entire time series times a factor of 0.3 for the outliers above the median and times a factor of 0.1 for the outliers below the median. The temporal gaps in the result have been linearly interpolated and, subsequently, a Savitzky-Golay filter [28] with a total window length of 16 days and a quadratic polynomial has been applied to reduce residual noise. The factors and window lengths for outlier removal and Savitzky-Golay filtering were determined based on a visual analysis of their effect on temporal plots similar to those implemented in the software package TIMESAT [27].

3.3. Method for the Generation of High Temporal Resolution Time Series

The high temporal resolution Landsat-like time series were generated on the basis of the ESTARFM fusion algorithm [3] but with several modifications of the methodology to make the algorithm applicable to processing of large areas, characterized by heterogeneous landscapes and frequent cloud cover, in an automated way.

The original ESTARFM algorithm produces synthetic Landsat images at the dates of MODIS observations. For this purpose, one pair of simultaneous Landsat and MODIS images before (“left shoulder”) and one pair after the prediction date (“right shoulder”) is used. ESTARFM first establishes for each pixel a relationship between MODIS and Landsat of both image pairs via linear regression. The regressions are fitted for the central pixel in a moving window by taking into account all pixels in the window that are spectrally similar (similar pixels) to the central pixel. Spectral similarity is defined by the following threshold:

| R_{s} - R_{c} | \leq σ (B a n d) \times 2 / m

(1)

where

R_{s}

is the reflectance of the similar pixel candidate,

R_{c}

is the reflectance of the central pixel,

σ (B a n d)

is the standard deviation of the band within the entire image and

m

is the number of classes in the entire image, as estimated by the user.

The slope of the regression is used as a conversion coefficient between Landsat and MODIS. This coefficient allows transfer of the reflectance changes between the MODIS scenes (pair dates−prediction date) to the Landsat scene at the prediction date. For more information on the original ESTARFM fusion algorithm, see Zhu et al. [3].

However, ESTARFM has several limitations, as previously outlined by Zhu et al. [3], which hinder automated processing and its application to larger areas. These limitations are addressed and overcome in this study as described in the following.

3.3.1. Modified Selection of Similar Pixels

One issue of the original ESTARFM algorithm is how the aforementioned similar pixels are identified. Similar pixels are selected according to Equation (1), where the average number of land cover classes (

m

) in the study area must be estimated by the user. In areas of heterogeneous landscapes, a larger number of classes are specified, leading to a stricter condition for the selection of similar pixels [3]. On the one hand, the user estimation introduces an undesirable subjectivity to the process. On the other hand, by using an average number of land cover classes for the entire study area, spatial differences in the heterogeneity of the study area cannot be taken into account. This means that even in a study area containing small-scale mosaics of several land cover types in one part and relatively homogeneous land cover patterns in others, still only one overall number of classes would be considered. This becomes particularly important when working at larger scales.

In order to address this, we developed a modified approach for the identification of similar pixels in the proposed ESTARFM framework. This approach includes an automated clustering of the surface reflectance values of selected Landsat observations using the ISODATA algorithm [29]. The general possibility of a clustering was already mentioned by Zhu et al. [3], but they neither elaborated on it, nor implemented it in the original ESTARFM. In this study, clustering is performed once per year and on the basis of multi-spectral and multi-temporal Landsat reflectance values. Acquisition dates are selected for clustering which represent the major phenological stages of the vegetation development in the study area. The selection of these dates should be defined according to the seasonal and phenological characteristics of the study area. Then, based on this cluster image, similar pixels are selected as described in Equations (2) and (3). For a pixel to be selected as a similar pixel it has to belong to the same cluster as the central pixel of the moving window (Equation (2)) and the reflectance difference of the pixel to the central pixel must be below one standard deviation of the respective cluster in the moving window (Equation (3)).

In general, if a 30 m land cover dataset from the same year is available for a study area, it can be used to replace the clustering approach.

C_{s} = C_{c}

(2)

| R_{s} - R_{c} | \leq σ (C_{c W i n})

(3)

where

C_{s}

is the cluster of the similar pixel candidate,

C_{c}

is the cluster of the central pixel and

σ (C_{c W i n})

is the standard deviation of the reflectance values of this cluster in the moving window.

3.3.2. Integration of Cloud-Affected Landsat Scenes

Zhu et al. [3] recommended using a large number of temporally well-distributed input Landsat data in ESTARFM in order to rely on high-resolution observations that are as close as possible to the prediction dates. High-resolution images that are temporally too far from the prediction dates are expected to decrease the quality of the output, particularly if input scenes are from another vegetative season as compared to the prediction date [3]. However, previous ESTARFM studies either restricted the input data to clear-sky high-resolution images or, when integrating partially cloud-covered scenes, these approaches produce gaps in the resulting synthetic images [5,8]. This substantially hampers the utility of the ESTARFM algorithm in regions of high cloud cover. In this study, an automated gap-filling approach was implemented to make best use of available, even partly cloud-covered Landsat images, without producing gaps in the output of ESTARFM. The gaps are filled with ESTARFM predictions (Figure 4) based on the other modifications introduced in the framework and described as follows.

In a first step of this gap filling, the first target scene with cloud gaps to be filled is selected. The shoulder scene combination (SSC), i.e., one Landsat image before and one after the target scene, fulfilling two conditions is identified: (a) the temporal distance between these shoulder scenes has to be as small as possible; and (b) an ESTARFM prediction derived from this SSC has to fill at least 50% of the gaps in the target scene. The selected SSC is used to generate an ESTARFM prediction and fill the target scene. If after that the residual gaps make up less than one percent, the next cloud-affected scene is processed. If the gaps are still larger than one percent, the first step is repeated based on the remaining SSC candidates. If after this second iteration the residual gaps in the target scene is still above one percent, the SSC is selected that can fill the residual gaps so that the target scene is at least 99% gap free. This way, a maximum of three synthetic scenes have to be generated to fill one target scene and the filling is completed in a reasonable amount of time.

3.3.3. Quality Information

The ESTARFM algorithm does not give an indication on the quality of the fusion. It is, however, important to know whether a given prediction was based on stable regressions or if errors are likely to occur. For that reason, a quality dataset is produced here as an additional output of the ESTARFM framework (Figure 3). With each synthetic image, a layer with the number of similar pixels used for the prediction of each pixel is obtained. Furthermore, a layer for each band is calculated storing the coefficient of determination from the linear regressions between Landsat and MODIS used for derivation of the conversion coefficient. The information in these layers is considered an indicator of the stability and goodness of the regression and therefore provides an indicator of the proper functioning of the algorithm for different areas of the image.

3.3.4. Automation and Improvement of Performance

The original ESTARFM algorithm is designed for the generation of single synthetic scenes so that for the processing of a time series, it has to be run individually for each time step. To facilitate the generation of time series, a framework was designed in IDL (Figure 3), which requires as input all available MODIS and Landsat data for one Landsat tile and then generates the time series between a definable start and end date automatically. Input image pairs of Landsat and MODIS are automatically selected that are the temporally closest ones to the prediction date as well as the MODIS scene at the prediction date.

Furthermore, for the application of ESTARFM to larger areas and long time series, the processing speed was optimized. So far, most ESTARFM functions only run on one CPU core, so that the prediction of an exemplary Landsat scene with four bands (B-G-R-NIR) takes more than 24 h (Table 1). Thus, the algorithm has been parallelized here, so that the number of used CPU cores can be defined individually. The images are then split into tiles and one tile is assigned to each core. This way, the user can adapt the algorithm to its system specifications and the processing time can be improved significantly (Table 1).

3.3.5. Application of the Enhance Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) Framework in the Study Area

Using the daily MCD43A4 MODIS product with the developed ESTARFM framework, the user can flexibly define the dates and time steps of the output time series. Though a daily ESTARFM time series could be produced, this also means a high amount of processing time and storage requirements while less time steps are sufficient for many applications. In this study, the ESTARFM framework was applied to the generation of an 8-day 30 m NDVI time series for the year 2014 (day of year, DOY 7 to 359) in the study area using all Landsat and MODIS input data described in Section 3.1 and Section 3.2. Previous studies demonstrated that using NDVI directly as input to ESTARFM instead of using reflectances and calculating the NDVI from the fusion output results in higher accuracies due to reduced error propagation [10,30].

Of the 30 Landsat scenes from the year 2014, 17 were partly cloud-covered but could be included in the ESTARFM fusion due to the advancement of the algorithm (Section 3.3.2). For our study area, only one cloud-free (<1% cloud cover) Landsat scene for the vegetative season (DOY 159) was available for path 195 and none for path 194. However, especially the latter contained several scenes with a cloud cover of less than 40% that could be processed based on the developed cloud-filling approach (Figure 5). In particular, on the southeastern tile (194/53), some areas had persistent cloud coverage over the year and had to be filled up to four times. In such regions, an application of the input scenes in ESTARFM without previous filling would result in gaps in the synthetic time series for most of the predictions.

3.4. Accuracy Assessment

The accuracy of the predicted ESTARFM time series was assessed in different ways. Similar to a cross-validation, the first approach was to exclude one Landsat scene from the input time series and predict this scene with ESTARFM on the basis of the remaining image acquisitions. The predicted scene was then correlated with the actual, formerly excluded Landsat scene and differences between the scenes were calculated. By doing so for every Landsat image (except for the first and last scenes since these cannot be predicted without shoulder scenes) an accuracy assessment of each tile of the ESTARFM time series was conducted. For this purpose, the interval of the output time series was adapted to the 16-day acquisition scheme of the Landsat path 195 (Figure 2) so that the Landsat scenes of the tiles 195/52 and 195/53 could be directly used for validation. For the Landsat path 194 with a different acquisition interval, additional images at the dates of existing Landsat scenes were generated for validation and, in this way, a total number of 195 images have been produced.

In addition to this cross-validation, since the used Landsat scenes do not cover a major part of the rainy season, eight additional Landsat scenes with cloud cover of up to 80% were used for validation of this period. In this way, the quality of predictions for the main rainy season can be estimated, although the results might represent only a very limited part of the study region due to the dominating cloud cover on these scenes.

Furthermore, in order to display the small-scale spatial patterns of the prediction quality, a subset area was monitored over several time steps. Based on these images, the prediction of different land cover types related to different phenological stages can be assessed.

In addition, the consistency of ESTARFM predictions between the eastern and western Landsat path was analyzed for the overlapping areas. Generally, low differences between the paths indicate a high prediction quality and also reduce the amount of subsequent processing of the time series for further geoscientific applications.

4. Results

4.1. Spatio-Temporal Patterns of the ESTARFM Time Series

The overall spatial patterns of the small-scale landscape in the study area are well reproduced by the ESTARFM time series of 2014 (Figure 6). Parts of the river network with its gallery forests exhibiting high mean NDVI values stand out sharply (e.g., north of Wa or south of Bolgatanga), as do the forest patches in the highly fragmented northwest of the study area. Several small cities emerge from the densely vegetated areas as round patches of low NDVI (e.g., the cities Wa and Tamale). Furthermore, there are no edges, i.e., abrupt shifts, in mean NDVI level visible between the four different Landsat tiles. This indicates the general proper functioning of the ESTARFM fusion, since major differences in the prediction of the single tiles would be expressed by edges between them (see also Section 4.2).

The transect in Figure 6e, crossing the study area from north to south, reflects the general vegetation gradient of the study area which is in line with the general gradient of West Africa. From north to south, NDVI level and vegetation density increase with rising annual rainfalls. The comparison of the temporal development of the ESTARFM, MODIS and Landsat NDVI time series is highlighted in four plots (Figure 6a–d): Figure 6a shows the average NDVI development over a semi-natural savanna area on the Nazinga game ranch. Here, the time series of ESTARFM closely follows the one from MODIS and the single Landsat NDVI values are also well matched. Figure 6 also shows that if using only Landsat data for the interpretation of phenology, some features such as the peak NDVI would be missed. The NDVI level of the focus site in the Mole National Park (Figure 6b) is generally higher than for the one from the Nazinga game ranch. This focus site also consists of semi-natural savanna but is more densely vegetated and has a higher percentage of tree cover. These differences again reflect the north−south gradient of vegetation in the study region. For the Mole focus site, the NDVI values of MODIS and Landsat diverge slightly during the growing season with a higher level for Landsat. In such a case, ESTARFM gives priority to the Landsat input data and the derived ESTARFM time series matches Landsat more closely than MODIS.

Figure 6c displays the time series of a rain-fed agricultural area close to the city of Bolgatanga in Ghana. In comparison to the semi-natural savannas, a generally lower NDVI can be observed, especially during the vegetative season between DOY 100 and 320 (mid-April to mid-November). The agricultural area shows a later green-up and a lower peak of NDVI. Figure 6d represents an irrigated agricultural area at the White Volta downstream of the Bagré dam in southern Burkina Faso. Its temporal development is considerably different from the other focus areas since it is independent of the timing of the rainy season. Two crop cycles can be observed in the graph, one around the DOY 100 (mid-April) and a second one around DOY 300 (end of October).

When comparing the annual mean NDVI from ESTARFM with Landsat and MODIS NDVI for three focus areas characterizing different landscape types, major differences become apparent (Figure 7). With its 500 m resolution, the MODIS data (second row) cannot reproduce the spatial details of the landscape patterns. For example, the waterbodies in Ouagadougou or the cultivated areas in the fragmented area in southern Burkina Faso cannot be delineated from the MODIS NDVI. Also the course of rivers in the Mole National Park or the irrigated fields around the dam in the southeast of the fragmented focus site cannot be adequately identified at this scale. However, the mean annual NDVI values provided by MODIS can be seen as realistic, because of the high temporal resolution of MODIS covering the entire phenological cycle. In contrast, the mean annual NDVI of the original Landsat time series (fourth row) allows the differentiation of the aforementioned features of the focus sites. On the other hand, the overall level of Landsat’s mean annual NDVI is considerably lower than the one of MODIS. The reason for this is that there are only severely cloud-affected Landsat data available during the peak vegetative season between July and October (compare Figure 2) and thus this period is not well represented in the mean NDVI, thereby resulting in values that are too low in comparison to the MODIS benchmark. The ESTARFM time series (third row) combines both the high spatial detail which allows delineation of even small objects, and the high temporal resolution allowing derivation of a realistic mean NDVI (similar NDVI levels compared to MODIS, second row).

4.2. Accuracy Assessment of the ESTARFM Predictions

The cross-validation of the predicted ESTARFM time series comprises three accuracy measures (Figure 8): the coefficient of determination (R²), the mean absolute error (MAE), and the root mean square error (RMSE). In general, the northeastern tile (194/52) produced the lowest error values with MAE around 0.02 and average RMSE values of 0.03. Also, the correlations between predicted and observed NDVI values were the highest for this tile (average R² = 0.83). Comparing the northern tiles (row 52) with the southern tiles (row 53), it becomes evident that ESTARFM predictions for the northern tiles commonly produce higher accuracies and lower error values (ranging between 0.01 and 0.04 MAE in the north and between 0.01 and 0.07 MAE in the south).

Focusing on the temporal development of accuracy over the year 2014, distinct differences can be observed. During the first part of the year between January and April (DOY 15–79 for path 195 and DOY 8-88 for path 194) representing the dry season, the errors are relatively low for all four tiles. Then, the errors rise with increasing vegetation growth (also compare Figure 6a–d) and the highest errors are observed in mid-November, DOY 319 on path 195 (mean MAE of 0.04) and DOY 312 on path 194 (mean MAE of 0.07), respectively. These dates correspond to a period of rapid temporal land surface changes when leaves start browning and NDVI decreases. In addition, the first fires of the dry season occur resulting in widespread burned areas during this time. For the last dates of the year in December, the quality of the ESTARFM predictions is significantly increased again.

These spatio-temporal differences of the prediction accuracies can also be identified in the scatterplots of observed versus predicted NDVI values (Figure 9). During the dry season at the beginning of 2014 (DOY 40), the correlation is highest (R² = 0.88 for row 52 and 0.80 for row 53) and the range of NDVI values is rather low. While the NDVI range increases during the vegetative season in June (DOY 152), the correlation decreases (R² = 0.79 for row 52 and 0.54 for row 53). At the end of 2014 (DOY 344), at the beginning of the next dry season, the correlation increases again while the NDVI range is still high (R² = 0.87 for row 52 and 0.77 for row 53). The north−south difference in accuracy is also visible in these scatterplots, with row 52 generally showing lower errors and less scattering of values in the plot than row 53.

To estimate the prediction quality of the main rainy season where only severely cloud-affected Landsat data is available, eight additional Landsat scenes of path 195 with cloud cover of up to 80% were used for validation (Table 2). The errors for this period vary between 0.03 and 0.10 for MAE and between 0.05 and 0.12 for RMSE with R² values between 0.3 and 0.76. Error values are found to increase towards the peak of rainy season (around DOY 280, beginning of October) and decrease afterwards.

In addition, a subset area in the northeast of the study region was selected to show an example of the spatial patterns of absolute differences between observed and predicted NDVI (Figure 10). This area represents the different landscape types of the study region including grasslands, woodlands and heterogeneous parklands, i.e., mixtures of cropland and semi-natural vegetation. The general differences for the late dry season (DOY 40) are low with some minor discrepancies for the waterbody in the northern part of the subset. During the onset of the vegetative season (DOY 152), the differences increase, especially for the densely vegetated areas of the image (regions in dark red on the Landsat false color composite). For example, the rivers characterized by gallery forests display higher values of absolute difference, which can partially be attributed to temporarily inundated areas along the course of the streams. The agricultural areas and regions with less dense vegetation cover (bright colors on the Landsat false color composite) maintain low absolute differences for the vegetative season (DOY 152). In the early dry season of December (DOY 344), the overall differences decrease again with major deviations only for the waterbody in the northern part.

An additional way to assess the quality of the fusion is to analyze the consistency between different Landsat paths of the predicted time series. The overlapping areas between path 195 and 194 of the ESTARFM time series show a median deviation which is generally close to zero with a distinct temporal pattern over the course of the year (Figure 11). The first and third quartile shown in light blue ranges between −0.05 and 0.08. During the rainy season from DOY 100 to 320 (mid-April to mid-November), the differences between the two Landsat paths increase, which goes in line with the tendency towards increasing errors during the rainy season as described before.

4.3. Input Data Availability and Influence on ESTARFM Predictions

To analyze the influence of the temporal distance (number of days between Landsat input observations and prediction dates) on the error level, a Landsat scene from the middle of the year (DOY 143, 195/52) was predicted not only with its nearest shoulder scenes, but also with all other, more distant shoulder combinations. The MAE between these predictions and the observed Landsat NDVI have been calculated and visualized against the total temporal distance of the input shoulder combinations (Figure 12). It is evident that the error level increases with increasing temporal distance, from less than 0.02 MAE to over 0.05 MAE. However, it can also be observed that the temporal distance is not the only factor determining the MAE. The symbols of Figure 12 highlight that the right shoulder (i.e., MODIS-Landsat combination after the prediction date) is significantly more important for the quality of prediction than the left shoulder (MODIS-Landsat combination before the prediction date). In the presented case, the left shoulder dates are all from the dry season at the beginning of 2014 (Figure 2), where changes in phenology between the scenes are minor, while the right shoulders are all from different phenological stages: DOY 159 is from approximately the same phenological stage as the target scene, DOY 319 from the period of browning after the vegetation peak, DOY 351 includes widespread burned areas, and DOY 18 (2015) is from the dry season of the following year. Hence, every time the right shoulder date (i.e., the phenological stage) is changed, a pronounced jump in MAE is observed. This analysis emphasizes the importance of using a shoulder scene from the same phenological stage if possible.

5. Discussion

5.1. Uncertainties in the Prediction with ESTARFM

Several modifications have been introduced in the presented ESTARFM framework resolving shortcomings and challenges that have been mentioned as drawbacks in the usage of the existing ESTARFM algorithm [3,10,31]. However, certain points affecting the accuracy of the ESTARFM fusion remain. First of all, the availability of Landsat input data has a great influence on the quality of the fusion. Minimizing the temporal distance of the shoulder pairs to the target scene is crucial for a successful prediction [3,32] and it became evident that using at least one shoulder pair from the same phenological stage as the prediction date has even higher positive influence (Figure 12).

A further factor influencing the fusion quality is the land cover type of a pixel (Figure 10). In our study region, land cover types with a lower NDVI range, such as mixed rain-fed agricultural areas and grasslands, show low absolute difference between observed and predicted NDVI over the year. In contrast, the predicted NDVI values of densely vegetated areas such as closed woodlands differ more starkly from the actual NDVI values. In our study region, these areas exhibit stronger phenological dynamics which are usually more difficult to predict with ESTARFM [9,33]. These dynamics are also a potential reason for the north−south difference of prediction quality in the study area since the southern regions are generally more densely vegetated than the northern parts. In addition, some of these areas with a high NDVI range are characterized by rapid changes through flooding (especially gallery forests) or burning (semi-arid savannas) which negatively affects the quality of prediction. However, not only the temporal variance plays a role for the reduced prediction quality during the main rainy season, but also the cloud coverage of the MODIS dataset. Although MODIS cloud gaps have been interpolated here, these areas are less accurate in the prediction of synthetic Landsat scenes. This issue should be more emphasized for land cover types with a distinct phenology decreasing the accuracy of interpolation. For areas with high temporal variance, Emelyanova et al. [9] and Jarihani et al. [10] achieved better results with STARFM than with ESTARFM. However, these comparison studies showed at the same time that the performance of ESTARFM is superior to STARFM for study regions with a high spatial heterogeneity (as being the case in this study) and the overall accuracies of ESTARFM tend to be higher as well [9,10]. A comparison of ESTARFM and ESTARFM framework with the newly developed Flexible Spatiotemporal Data Fusion (FSDAF) method by Zhu et al. [34] could be the focus of future research in order to identify potential synergies.

In addition, especially in areas with a complex canopy structure, the BRDF correction of input data can have an impact on the fusion quality. Walker et al. [33,35] tested the usability of the BRDF-corrected MODIS product MCD43A4, which was also applied in our study, together with Landsat-5 TM (Thematic Mapper) as an input to STARFM. They found generally low NIR correlations between Landsat and STARFM predictions and concluded that this could be due to Landsat-5 TM data not being corrected for BRDF effects in contrast to MCD43A4. They further reasoned that the discrepancy between the input datasets would be highest in areas dominated by tall, multi-layered vegetation causing high scattering of NIR radiation [33]. However, they also pointed out the suitability of the MCD43A4 for the fusion since BRDF effects in MODIS surface reflectance products can be considerable due to the large swath width [35,36]. Furthermore, the use of NDVI as in the presented study should decrease BRDF effects due to the similarity of directional signatures in the red and NIR bands [37]. Since Landsat-8, which was used in this study, is a pushbroom sensor, view angle effects should be minor. For the use of the whiskbroom systems Landsat-5 and Landsat-7, which are more affected by view angle effects in combination with MCD43A4, a potential solution could be to also correct the Landsat data for directional effects as done by Schmidt et al. [12].

5.2. Added Value of the ESTARFM Framework for Cloud-Prone Areas

In West Africa, as in many regions with a distinct rainy season, acquisitions of clear sky observations from polar orbiting instruments are challenging [38]. Landsat data are almost only available during the dry season, when vegetation and land surface in general have a significantly different appearance than during the rainy season. Previous studies indicated the importance of using Landsat input scenes that are temporally as close as possible to the prediction dates to minimize errors in the ESTARFM prediction [3,32]. Our results confirmed this assumption and additionally showed the value of Landsat input scenes from the same season and that abrupt events such as fires or floods between input and prediction date can decrease the accuracy. In order to avoid large gaps between input and prediction dates, we included an automatic cloud gap filling of Landsat scenes in the ESTARFM framework to make best use of scenes with a partial cloud cover without producing gaps in the output of the fused time series. With this filling approach, a total number of 30 Landsat input scenes from the year 2014 could be used for the study area instead of only the 13 cloud-free scenes. Thus, Landsat scenes from different seasons could be added to the input dataset, significantly improving the prediction quality.

It should be noted that when using the original STARFM [2] with two shoulder input pairs, cloud gaps in the fusion output can also be avoided as long as at least one of the shoulder observations is cloud free. However, gaps remain for those parts of the image where both shoulder datasets are clouded.

The different Landsat tiles have thus far been treated individually in the processing of the ESTARFM framework. In order to increase data availability for an application in higher latitudes, where overlapping areas between Landsat paths are considerably bigger than in the study region of this paper, it would be interesting to use the tiles of different paths jointly in the ESTARFM framework. This could be part of future improvements of the framework.

5.3. Added Value of the ESTARFM Framework for Large-Scale Analyses

Previous studies on the ESTARFM algorithm indicate its great potential for the generation of high spatial and temporal resolution time series, but they also pointed out issues like the slow processing speed or the way how similar pixels are selected that hinder its application on large scales [3,10,31]. The ESTARFM framework developed here significantly decreases processing time by parallelization and automating the algorithm. This feature, in addition to avoiding repeated manual input, is crucial for its application on large scales. Moreover, the way similar pixels are selected for the regression between Landsat and MODIS was changed in the framework. The approach in the original ESTARFM with a global threshold for the area of interest does not account for spatial differences in the heterogeneity of the study area. This means that if a study area comprises several Landsat tiles covering different landscape types, a single threshold is not sufficient for the selection of similar pixels. With the new approach presented here, regional differences are considered independently of the size of the study area.

When working on large scales, several Landsat paths have to be used and, due to the different acquisition dates of the paths, errors might occur in the overlapping areas of the predicted output. In order to improve the consistency of ESTARFM predictions for certain periods of the time series (compare Section 4.2), a matching of the different paths with suitable algorithms might be useful. This, however, depends on the further application of the ESTARFM time series and is beyond the scope of this study.

The fusion algorithm has as of yet only been applied to small study areas of less than 6000 km², which is far less than one Landsat scene (~33,300 km²). However, many relevant research questions in the field of vegetation mapping and monitoring require consistent vegetation information and related analyses at the national level or in larger geographic regions [39,40]. It is expected that ESTARFM time series can be of great use for improved mapping and monitoring of agricultural areas in West Africa. This has been a challenging task due to the patchy small-scale farming practices in the region, where areas of agriculture can be adequately separated from rangeland only during short periods of the year coinciding with prevailing cloud cover [1,41,42]. The phenological signatures produced by ESTARFM in the presented study show distinct characteristics of the small-scale rain-fed and irrigated agriculture and savanna types which could be helpful for an improved discrimination. If, however, these results cannot be produced efficiently at the national to regional levels, their relevance for decision makers would remain limited.

6. Conclusions and Outlook

In this study, a framework for the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) fusion algorithm was developed to make it applicable for large, cloud-prone and heterogeneous areas. For this purpose, several modifications were introduced that focus on the automation, acceleration of processing speed, handling of the algorithm and maximizing use of Landsat input data. The way similar pixels are selected for the regression between Landsat and Moderate Resolution Imaging Spectroradiometer (MODIS) was enhanced by accounting for spatial differences in the heterogeneity of the study area. Furthermore, an automatic filling of cloud gaps was developed to maximize the use of available cloud-free Landsat input data. In addition, the processing of time series in the ESTARFM framework was automated and processing speed was accelerated significantly by parallelization of the algorithm.

The results indicate that even in a challenging study area with heterogeneous land cover and abrupt fire and flood events, the developed ESTARFM framework proved to be suitable to generate time series on larger spatial scales. The errors varied between average mean absolute error (MAE) of 0.02 for the dry season and 0.05 for the vegetative season which indicates a higher prediction uncertainty for the latter. These fused datasets have the potential to substantially improve various regional analyses of land degradation, changes in land surface phenology, or changes in cropping practice where a combination of high temporal and spatial resolution is essential for gaining new insights in land surface processes.

With most recent satellite missions such as the European Space Agency’s Sentinel program, new possibilities emerge for data fusion approaches extending and adding to the period of observations that can be covered by MODIS-Landsat time series (2000−present time). Especially the satellite constellations of Sentinel-2 (10 m, ~5 days) and Sentinel-3 (300 m, ~2 days) are predestined for data fusion. The short repeat cycle of the Sentinel-2 satellites will significantly contribute to a better availability of high-resolution data in comparison to Landsat. However, capturing the complete phenological cycle of land surface using only Sentinel-2 data is still not likely in cloud-prone areas. For such regions, the fusion of Sentinel-2 with Sentinel-3, Proba-V or VIIRS (Visible/Infrared Imager Radiometer Suite) could significantly contribute to a better quality of high temporal and spatial resolution time series.

Acknowledgments

This work was conducted in the WASCAL (West African Science Service Center on Climate Change and Adapted Land Use) project and supported by the German Federal Ministry of Education and Research (BMBF) under Grant FKZ 01LG1202D. We thank Xiaolin Zhu for providing the code of ESTARFM and for some helpful comments. The new collection 6 time series have been kindly provided by Professor Crystal Schaaf and Qingsong Sun, School for the Environment, University of Massachusetts Boston, before the official distribution via the USGS MODIS websites. We would also like to thank the three anonymous reviewers for their helpful comments.

Author Contributions

Kim Knauer designed the algorithm developments and together with Ursula Gessner conceived and designed the experiments; Kim Knauer performed the experiments and analyzed the data; Rasmus Fensholt and Claudia Kuenzer guided the processing and interpretation. Kim Knauer wrote the first version of the manuscript and all authors jointly improved the version based on critical discussion.

Conflicts of Interest

The authors declare no conflict of interest.

References

Knauer, K.; Gessner, U.; Dech, S.; Kuenzer, C. Remote sensing of vegetation dynamics in West Africa. Int. J. Remote Sens. 2014, 35, 37–41. [Google Scholar] [CrossRef]
Gao, F.; Masek, J.G.; Schwaller, M.; Hall, F. On the blending of the Landsat and MODIS surface reflectance: Predicting daily Landsat surface reflectance. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2207–2218. [Google Scholar]
Zhu, X.; Chen, J.; Gao, F.; Chen, X.; Masek, J.G. An enhanced spatial and temporal adaptive reflectance fusion model for complex heterogeneous regions. Remote Sens. Environ. 2010, 114, 2610–2623. [Google Scholar] [CrossRef]
Chen, B.; Huang, B.; Xu, B. Comparison of spatiotemporal fusion models: A review. Remote Sens. 2015, 7, 1798–1835. [Google Scholar] [CrossRef]
Tewes, A.; Thonfeld, F.; Schmidt, M.; Oomen, R.; Zhu, X.; Dubovyk, O.; Menz, G.; Schellberg, J. Using RapidEye and MODIS data fusion to monitor vegetation dynamics in semi-arid rangelands in South Africa. Remote Sens. 2015, 7, 6510–6534. [Google Scholar] [CrossRef]
Walker, J.; de Beurs, K.; Wynne, R. Phenological response of an Arizona dryland forest to Short-term climatic extremes. Remote Sens. 2015, 7, 10832–10855. [Google Scholar] [CrossRef]
Doña, C.; Chang, N.-B.; Caselles, V.; Sánchez, J.M.; Camacho, A.; Delegido, J.; Vannah, B.W. Integrated satellite data fusion and mining for monitoring lake water quality status of the Albufera de Valencia in Spain. J. Environ. Manag. 2015, 151, 416–426. [Google Scholar]
Bisquert, M.; Bégué, A.; Poncelet, P.; Teisseire, M. Evaluation of fusion methods for crop monitoring purposes. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec, QC, Canada, 13–18 July 2014; pp. 1429–1432.
Emelyanova, I.V.; McVicar, T.R.; van Niel, T.G.; Li, L.T.; van Dijk, A.I.J.M. Assessing the accuracy of blending Landsat-MODIS surface reflectances in two landscapes with contrasting spatial and temporal dynamics: A framework for algorithm selection. Remote Sens. Environ. 2013, 133, 193–209. [Google Scholar] [CrossRef]
Jarihani, A.; McVicar, T.; Van Niel, T.; Emelyanova, I.; Callow, J.; Johansen, K. Blending Landsat and MODIS data to generate multispectral indices: A comparison of “Index-then-Blend” and “Blend-then-Index” approaches. Remote Sens. 2014, 6, 9213–9238. [Google Scholar] [CrossRef] [Green Version]
Gao, F.; Wang, P.; Masek, J. Integrating remote sensing data from multiple optical sensors for ecological and crop condition monitoring. Proc. SPIE 2013, 8869. [Google Scholar] [CrossRef]
Schmidt, M.; Lucas, R.; Bunting, P.; Verbesselt, J.; Armston, J. Multi-resolution time series imagery for forest disturbance and regrowth monitoring in Queensland, Australia. Remote Sens. Environ. 2015, 158, 156–168. [Google Scholar] [CrossRef] [Green Version]
Verbesselt, J.; Hyndman, R.; Newnham, G.; Culvenor, D. Detecting trend and seasonal changes in satellite image time series. Remote Sens. Environ. 2010, 114, 106–115. [Google Scholar] [CrossRef]
Zhang, F.; Zhu, X.; Liu, D. Blending MODIS and Landsat images for urban flood mapping. Int. J. Remote Sens. 2014, 35, 3237–3253. [Google Scholar] [CrossRef]
Olson, D.M.; Dinerstein, E.; Wikramanayake, E.D.; Burgess, N.D.; Powell, G.V.N.; Underwood, E.C.; D’amico, J.A.; Itoua, I.; Strand, H.E.; Morrison, J.C.; et al. Terrestrial ecoregions of the world: A new map of life on earth. Bioscience 2001, 51, 933–938. [Google Scholar] [CrossRef]
Gessner, U.; Knauer, K.; Kuenzer, C.; Dech, S. Land surface phenology in a west African savanna: Impact of land use, land cover and fire. In Remote Sensing Time Series; Kuenzer, C., Dech, S., Wagner, W., Eds.; Springer International Publishing: Berlin, Germany; Heidelberg, Germany, 2015; pp. 203–223. [Google Scholar]
FAO FAOSTAT. Available online: http://faostat3.fao.org/faostat-gateway/go/to/home/E (accessed on 13 May 2016).
Forkuor, G. Agricultural Land Use Mapping in West Africa Using Multi-sensor Satellite Imagery; University of Wuerzburg: Wuerzburg, Germany, 2014. [Google Scholar]
Gessner, U.; Machwitz, M.; Esch, T.; Tillack, A.; Naeimi, V.; Kuenzer, C.; Dech, S. Multi-sensor mapping of West African land cover using MODIS, ASAR and TanDEM-X/TerraSAR-X data. Remote Sens. Environ. 2015, 164, 282–297. [Google Scholar] [CrossRef]
USGS Landsat Data Access. Available online: http://landsat.usgs.gov/Landsat_Search_and_Download.php (accessed on 3 December 2015).
USGS. Product Guide—Provisional Landsat 8 Surface Reflectance Product; Department of the Interior USA Geological Survey: Reston, VA, USA, 2016.
Zhu, Z.; Woodcock, C.E. Object-based cloud and cloud shadow detection in Landsat imagery. Remote Sens. Environ. 2012, 118, 83–94. [Google Scholar] [CrossRef]
Bliefernicht, J.; Kunstmann, H.; Hingerl, L.; Rummler, T.; Andresen, S.; Mauder, M.; Steinbrecher, R.; Frieß, R.; Gochis, D.; Gessner, U.; et al. Field and simulation experiments for investigating regional land-atmosphere interactions in West Africa: Experimental setup and first results. Clim. Land Surf. Chang. Hydrol. 2013, 359, 226–232. [Google Scholar]
Quansah, E.; Mauder, M.; Balogun, A.A.; Amekudzi, L.K.; Hingerl, L.; Bliefernicht, J.; Kunstmann, H. Carbon dioxide fluxes from contrasting ecosystems in the Sudanian Savanna in West Africa. Carbon Balance Manag. 2015, 10, 1–17. [Google Scholar] [CrossRef] [PubMed]
Professor Crystal Schaaf’s Lab MODIS User Guide V006. Available online: https://www.umb.edu/spectralmass/terra_aqua_modis/v006 (accessed on 14 March 2016).
LP DAAC MODIS Reprojection Tool. Available online: https://lpdaac.usgs.gov/tools/modis_reprojection_tool (accessed on 3 December 2015).
Eklundh, L.; Jönsson, P. TIMESAT 3.2 with Parallel Processing Software Manual; Lund University: Lund, Sweden, 2015. [Google Scholar]
Savitzky, A.; Golay, M.J.E. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 1964, 36, 1627–1639. [Google Scholar] [CrossRef]
Tou, J.T.; Gonzalez, R.C. Pattern Recognition Principles; NASA: Washington, DC, USA, 1974.
Tian, F.; Wang, Y.; Fensholt, R.; Wang, K.; Zhang, L.; Huang, Y. Mapping and evaluation of NDVI trends from synthetic time series obtained by blending Landsat and MODIS data around a coalfield on the loess plateau. Remote Sens. 2013, 5, 4255–4279. [Google Scholar] [CrossRef]
Fu, D.; Chen, B.; Wang, J.; Zhu, X.; Hilker, T. An improved image fusion approach based on enhanced spatial and temporal the adaptive reflectance fusion model. Remote Sens. 2013, 5, 6346–6360. [Google Scholar] [CrossRef]
Gao, F.; Hilker, T.; Zhu, X.; Anderson, M.; Masek, J.; Wang, P.; Yang, Y. Fusing Landsat and MODIS data for vegetation monitoring. IEEE Geosci. Remote Sens. Mag. 2015, 3, 47–60. [Google Scholar] [CrossRef]
Walker, J.J.; de Beurs, K.M.; Wynne, R.H. Dryland vegetation phenology across an elevation gradient in Arizona, USA, investigated with fused MODIS and Landsat data. Remote Sens. Environ. 2014, 144, 85–97. [Google Scholar] [CrossRef]
Zhu, X.; Helmer, E.H.; Gao, F.; Liu, D.; Chen, J.; Lefsky, M.A. A flexible spatiotemporal method for fusing satellite images with different resolutions. Remote Sens. Environ. 2016, 172, 165–177. [Google Scholar] [CrossRef]
Walker, J.J.; de Beurs, K.M.; Wynne, R.H.; Gao, F. Evaluation of Landsat and MODIS data fusion products for analysis of dryland forest phenology. Remote Sens. Environ. 2012, 117, 381–393. [Google Scholar] [CrossRef]
Fensholt, R.; Sandholt, I.; Proud, S.R.; Stisen, S.; Rasmussen, M.O. Assessment of MODIS sun-sensor geometry variations effect on observed NDVI using MSG SEVIRI geostationary data. Int. J. Remote Sens. 2010, 31, 6163–6187. [Google Scholar] [CrossRef]
Bréon, F.M.; Vermote, E. Correction of MODIS surface reflectance time series for BRDF effects. Remote Sens. Environ. 2012, 125, 1–9. [Google Scholar] [CrossRef]
Fensholt, R.; Anyamba, A.; Stisen, S.; Sandholt, I.; Pak, E.; Small, J. Comparisons of compositing period length for vegetation index data from polar-orbiting and geostationary satellites for the cloud-prone region of West Africa. Photogramm. Eng. Remote Sens. 2007, 73, 297–309. [Google Scholar] [CrossRef]
Brandt, M.; Mbow, C.; Diouf, A.A.; Verger, A.; Samimi, C.; Fensholt, R. Ground- and satellite-based evidence of the biophysical mechanisms behind the greening Sahel. Glob. Chang. Biol. 2015, 21, 1610–1620. [Google Scholar] [CrossRef] [PubMed]
Fensholt, R.; Rasmussen, K.; Kaspersen, P.S.; Huber, S.; Horion, S.; Swinnen, E. Assessing land degradation/recovery in the African Sahel from long-term earth observation based primary productivity and precipitation relationships. Remote Sens. 2013, 5, 664–686. [Google Scholar] [CrossRef] [Green Version]
Leroux, L.; Jolivot, A.; Bégué, A.; Seen, D.; Zoungrana, B. How reliable is the MODIS land cover product for crop mapping sub-saharan agricultural landscapes? Remote Sens. 2014, 6, 8541–8564. [Google Scholar] [CrossRef] [Green Version]
Vintrou, E.; Desbrosse, A.; Bégué, A.; Traoré, S.; Baron, C.; lo Seen, D. Crop area mapping in West Africa using landscape stratification of MODIS time series and comparison with existing global land products. Int. J. Appl. Earth Observ. Geoinf. 2012, 14, 83–93. [Google Scholar] [CrossRef]

Figure 1. Overview over the study area in southern Burkina Faso and northern Ghana, delineated by the intersection of four Landsat tiles (194/52, 194/53, 195/52, 195/53) with two MODIS tiles (h17v07, h17v08). Underlying land cover classification has a 250 m spatial resolution for the year 2006 (data source: Gessner et al. [19]).

Figure 2. Overview of Landsat data used in this study distributed over four tiles for the period December 2013 to January 2015. Upper part shows the weekly precipitation sums over the year 2014 averaged over three rainfall stations in the study area [23,24].

Figure 3. Workflow of the automated ESTARFM framework including the pre-processing of input data. The left box shows the pre-processing of MODIS and Landsat input data including the automatic filling of cloud gaps in the Landsat data. The upper right box displays the automatic generation of time series based on the ESTARFM framework. The lower right box summarizes the output layers of the framework. * See also Figure 4.

Figure 4. Workflow of the automatic filling of cloud-affected Landsat scenes. The filling is based on ESTARFM predictions for the gaps in the target scene. Shoulder scene combinations (SSC) are automatically selected after a certain ruleset and then used as input for ESTARFM. A maximum of three synthetic scenes are generated this way to fill one target scene.

Figure 5. Map of the study area showing how often a pixel in the input Landsat time series of 2014 was filled.

Figure 6. Annual mean NDVI for the year 2014 derived from the ESTARFM time series (top left). Temporal plots of ESTARFM, MODIS and Landsat time series of four different areas for the year 2014 (marked in orange in the map): semi-natural savanna area on Nazinga game ranch (a); semi-natural savanna area in the Mole National Park (b); rain-fed agricultural area (c); irrigated agricultural area (d); Spatial transect across the study area from north to south ((e), see dashed line in the map); dashed line in the plot represents the linear trend line.

Figure 7. Landsat false color composite (NIR-R-G, row 1), mean annual NDVI (year 2014) derived from MODIS (row 2), ESTARFM (row 3) and Landsat (row 4) for three focus areas: a fragmented landscape area in the south of Burkina Faso (left column), the city of Ouagadougou in Burkina Faso (middle column) and the Mole National Park in Ghana (right column).

Figure 8. Accuracy and error measures of the ESTARFM prediction: coefficient of determination (R²), mean absolute error (MAE) and root mean square error (RMSE) between predicted and observed NDVI values for the four tiles (paths 194, 195 and rows 52, 53) and dates (DOY) 2014.

Figure 9. Scatterplots of observed Landsat versus predicted ESTARFM NDVI values for Landsat path 194 for three selected dates representing the late dry season in early 2014 (DOY 40), the vegetative season (DOY 152) and the early dry season at the end of 2014 (DOY 344). Upper three plots show Landsat row 52, lower three plots Landsat row 53.

Figure 10. Absolute difference images created between observed and ESTARFM-predicted NDVI for a focus area of tile 194/52 (location indicated in Figure 1). The lower three images show the absolute difference for three selected dates of 2014 with corresponding Landsat false color composites (RGB: 5-4-3) in the upper three images.

Figure 11. Median deviation of NDVI (blue line) and 1st to 3rd quartile (light-blue area) between ESTARFM predictions for the overlapping areas of path 195 and path 194 for the year 2014.

Figure 12. Relationship between the mean absolute error (MAE) and the temporal distance between the shoulder scene combinations (SSC). The MAE was calculated by using all possible SSC for predicting the Landsat scene (195/52) acquired on DOY 143, 2014, and the MAE was derived between predicted scenes and observed scene. The different symbols represent different acquisition dates of the right shoulder scene, e.g., the four diamond symbols represent four predictions of DOY 143 with the same right shoulder (DOY 319) but different left shoulders.

Table 1. Overview of processing times with the original ESTARFM and the ESTARFM framework using nine CPU cores. The processing was tested with four reflectance bands and one band of NDVI.

**Table 1.** Overview of processing times with the original ESTARFM and the ESTARFM framework using nine CPU cores. The processing was tested with four reflectance bands and one band of NDVI.
Input	Original ESTARFM	ESTARFM Framework (9 Cores)
4 bands	24 h 49 min	4 h 38 min
NDVI	15 h 32 min	1 h 45 min

Table 2. Mean error values of the ESTARFM time series during the main rainy season validated with eight Landsat scenes of path 195 containing cloud cover of up to 80%.

**Table 2.** Mean error values of the ESTARFM time series during the main rainy season validated with eight Landsat scenes of path 195 containing cloud cover of up to 80%.
Row	DOY 191			DOY 271			DOY 287			DOY 303
Row	MAE	RMSE	R²	MAE	RMSE	R²	MAE	RMSE	R²	MAE	RMSE	R²
52	0.05	0.07	0.74	0.10	0.12	0.41	0.07	0.10	0.53	0.03	0.05	0.76
53	0.06	0.07	0.58	0.07	0.09	0.30	0.06	0.08	0.33	0.05	0.06	0.56

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Knauer, K.; Gessner, U.; Fensholt, R.; Kuenzer, C. An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes. Remote Sens. 2016, 8, 425. https://0-doi-org.brum.beds.ac.uk/10.3390/rs8050425

AMA Style

Knauer K, Gessner U, Fensholt R, Kuenzer C. An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes. Remote Sensing. 2016; 8(5):425. https://0-doi-org.brum.beds.ac.uk/10.3390/rs8050425

Chicago/Turabian Style

Knauer, Kim, Ursula Gessner, Rasmus Fensholt, and Claudia Kuenzer. 2016. "An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes" Remote Sensing 8, no. 5: 425. https://0-doi-org.brum.beds.ac.uk/10.3390/rs8050425

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

An ESTARFM Fusion Framework for the Generation of Large-Scale Time Series in Cloud-Prone and Heterogeneous Landscapes

Abstract

1. Introduction

2. Study Area

3. Materials and Methods

3.1. Landsat-8 Operational Land Imager (OLI)

3.2. Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43A4, Collection 6

3.3. Method for the Generation of High Temporal Resolution Time Series

3.3.1. Modified Selection of Similar Pixels

3.3.2. Integration of Cloud-Affected Landsat Scenes

3.3.3. Quality Information

3.3.4. Automation and Improvement of Performance

3.3.5. Application of the Enhance Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) Framework in the Study Area

3.4. Accuracy Assessment

4. Results

4.1. Spatio-Temporal Patterns of the ESTARFM Time Series

4.2. Accuracy Assessment of the ESTARFM Predictions

4.3. Input Data Availability and Influence on ESTARFM Predictions

5. Discussion

5.1. Uncertainties in the Prediction with ESTARFM

5.2. Added Value of the ESTARFM Framework for Cloud-Prone Areas

5.3. Added Value of the ESTARFM Framework for Large-Scale Analyses

6. Conclusions and Outlook

Acknowledgments

Author Contributions

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI