Next Article in Journal
Recent Progress in Quantitative Land Remote Sensing in China
Next Article in Special Issue
Using Time Series of High-Resolution Planet Satellite Images to Monitor Grapevine Stem Water Potential in Commercial Vineyards
Previous Article in Journal
Monitoring the Impact of Land Cover Change on Surface Urban Heat Island through Google Earth Engine: Proposal of a Global Methodology, First Applications and Problems
Previous Article in Special Issue
Deep Recurrent Neural Network for Agricultural Classification using multitemporal SAR Sentinel-1 for Camargue, France
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Assessing the Variability of Corn and Soybean Yields in Central Iowa Using High Spatiotemporal Resolution Multi-Satellite Imagery

1
USDA, Agricultural Research Service, Hydrology and Remote Sensing Laboratory, Beltsville, MD 20705, USA
2
USDA, National Agricultural Statistics Service, Washington, DC 20250, USA
*
Author to whom correspondence should be addressed.
Submission received: 2 July 2018 / Revised: 30 August 2018 / Accepted: 14 September 2018 / Published: 18 September 2018
(This article belongs to the Special Issue High Resolution Image Time Series for Novel Agricultural Applications)

Abstract

:
The utility of remote sensing data in crop yield modeling has typically been evaluated at the regional or state level using coarse resolution (>250 m) data. The use of medium resolution data (10–100 m) for yield estimation at field scales has been limited due to the low temporal sampling frequency characteristics of these sensors. Temporal sampling at a medium resolution can be significantly improved, however, when multiple remote sensing data sources are used in combination. Furthermore, data fusion approaches have been developed to blend data from different spatial and temporal resolutions. This paper investigates the impacts of improved temporal sampling afforded by multi-source datasets on our ability to explain spatial and temporal variability in crop yields in central Iowa (part of the U.S. Corn Belt). Several metrics derived from vegetation index (VI) time-series were evaluated using Landsat-MODIS fused data from 2001 to 2015 and Landsat-Sentinel2-MODIS fused data from 2016 and 2017. The fused data explained the yield variability better, with a higher coefficient of determination (R2) and a smaller relative mean absolute error than using a single data source alone. In this study area, the best period for the yield prediction for corn and soybean was during the middle of the growing season from day 192 to 236 (early July to late August, 1–3 months before harvest). These findings emphasize the importance of high temporal and spatial resolution remote sensing data in agricultural applications.

Graphical Abstract

1. Introduction

Accurate estimation of crop yields before harvest is critical for sustaining agricultural markets and ensuring food security. Over the past several decades, ground-based remote sensing data have been demonstrated as a useful tool for estimating crop yield [1,2,3,4]. In the early 1980s, Tucker et al. [2] reported that the normalized difference vegetation index (NDVI) for a five-week period from stem elongation to anthesis explained about 64 percent of the grain yield variation of wheat. Pinter et al. [3] summed the NDVI for wheat and barley from heading to full senescence, and found that the integral of NDVI explains 88 percent of yield variation. Daughtry et al. [3] found that a single observation of leaf area index (LAI) or greenness has limited value in predicting corn yield, but the accumulated absorbed solar radiation over the growing season explained approximately 65 percent of the variation in corn yields. Wiegand et al. [5] concluded that yields could be predicted through spectral observations of crop canopies, because high yields require sufficient crop canopy development to maximize the interception of incident insolation. Wiegand et al. [5] further demonstrated that the relationships between crop yields and vegetation indices are strongest through the early grain filling stage; the senesced or photosynthetically inactive tissues that develop after the grain filling phase can degrade the relationship. Phytomass production is proportional to the cumulative absorbed photosynthetically active radiation (APAR) when nutrients and water are not limiting [4,5,6]. The fraction of incoming PAR that is absorbed by the plant canopies can be estimated by remotely sensed multispectral data, and phytomass production can be estimated as a function of cumulative APAR. Although these pioneer studies were typically focused on specific crops and were conducted over small fields, they generally covered the major crop development phases. Those early field experiments demonstrate the importance of frequent remote sensing observations for estimating crop yields.
In the recent era of rich satellite data availability, numerous studies have been published using satellite imagery to estimate crop yields. Many of these used empirical relationships between yields and various vegetation indices (VIs) [7,8,9]. The empirical approach builds a relationship between ground yield survey samples and the remote sensing derived parameters, and then applies the relationship to remote sensing imagery to map yield over the entire area. VI-based metrics (e.g., maximum VI, integral VI from the entire growing season or for a specific growth period) have been used for estimating crop yield (see review by Funk and Budde [8]). Although an empirical model built for a specific region has limited applicability to different areas or years [9,10], the empirical method is simple and effective for the local region if ground survey samples are representative and accurate. In the early stage, Advanced Very High Resolution Radiometer (AVHRR) sensors were the main data sources used for yield estimation [11,12,13,14,15,16,17,18,19,20,21]. Rasmussen [14,15] integrated the NDVI from AVHHR to build a yield regression model for operational uses. Wall et al. [21] demonstrated that NDVI based on AVHRR data from 1987 to 2002 predicted wheat yield four weeks earlier than a model that used the cumulative moisture index. Since 2000, Moderate Resolution Imaging Spectroradiometer (MODIS) data products have been widely used in forecasting crop yield [7,8,9,22,23,24,25,26,27,28,29,30,31]. Becker-Reshef et al. [26] used the maximum NDVI from MODIS and built a generalized regression model for forecasting winter wheat yields. Franch et al. [28] further improved this approach by adjusting NDVI before the peak date using growing degree day (GDD) information for earlier prediction. This approach has been investigated by the Group on Earth Observations Global Agricultural Monitoring Program (GEOGLAM [32]). Zhang and Zhang [29] used greenness metrics derived from daily two-band enhanced vegetation index (EVI2) time series from AVHRR (1981–1999) and MODIS (2000–2013) sensors, and built a yield prediction model for cereal at 0.05 degrees (~5 km) resolution globally.
Routine daily satellite observations would capture the entire crop development phase and enable the evaluation of crop yield variability. The daily observations currently available, however, are coarse (a few hundred meters and coarser). Application of coarse spatial resolution imagery to a low intensity agricultural area or an area with small field size may have limitations [22,29]. In contrast, Landsat satellites provide routine observations at 30 m spatial resolution every 16 days, and they have been used for the routine mapping of crop types and area [33]. Landsat data have also been considered for crop yield estimation, but to date, generally with a focus only on relatively small regions due to data volume and computational demands [9,23,34,35,36,37]. Recently, however, large area mapping with Landsat has been enabled using cloud computing technologies [30,38,39]. Success in mapping crop yields at the field scale sometimes depends on the frequency of clear-sky Landsat data availability [7]. Azzari et al. [30] compared results from MODIS and Landsat, and found that the relative benefit of the temporal resolution of MODIS and the spatial resolution of Landsat depends on location. In the United States and India, correlations are consistently high for both Landsat and MODIS. In Zambia, clear Landsat observations are limited due to high cloud contamination during the growing season. Even though the area is very heterogeneous, the benefit of temporal frequency from MODIS outperformed the benefit of the fine spatial resolution from Landsat [30].
In these previous studies, the remote sensing data used for yield estimation have generally come from a single data source, which may have limitations in either temporal or spatial resolution. However, by effectively combining information from multiple Landsat- and MODIS-like satellites, these limitations can be mitigated [9]. For example, a recent study by Guan et al. [40] used a Landsat–MODIS data fusion approach to improve paddy rice yield estimation in Vietnam. As of today, Landsat-7 is still in operation, even though it has suffered a failure in scan-line-correction since May 2003. Landsat-8 has been operational since 2013 [41]. Together, they provide a spatial resolution of 30 m with a combined 8-day repeat cycle. Two recently launched Sentinel-2 satellites (A and B) provide global coverage every five days at a 10–60 m resolution, greatly improving the frequency of observation at this critical sub-field scale [42]. Skakun et al. [43] combined Landsat-8 and Sentinel-2A data for winter wheat yield assessment. Both Landsat and Sentinel-2 data are now freely available. Even though a 10–60 m resolution is good enough for the most of agricultural regions in the U.S., other regions dominated by smallholder agricultural systems may need even finer resolution imagery [44].
In addition, fusion of medium resolution periodic Landsat data with near daily moderate resolution MODIS data provides a means for further improving the temporal sampling of the imaging data stream for agricultural applications [45], particularly in regions that are frequently cloud covered. Spatiotemporal data fusion aims at blending data from low temporal but high spatial resolution (e.g., Landsat) with high temporal but low spatial resolution (e.g., MODIS) data. Many spatiotemporal data fusion approaches have been developed in recent years, with a comprehensive review given by Zhu et al. [46]. The Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) [47] is one of the earliest developments of such a model. The STARFM approach uses Landsat–MODIS image pairs acquired from the same day to predict Landsat-like images using a MODIS image from the prediction day. STARFM can produce daily Landsat–MODIS images at Landsat spatial resolution and has been used in numerous applications that require high spatiotemporal data (see reviews in [45,46]).
This paper extends on our previous work on mapping crop phenology using the Landsat–MODIS data fusion approach [48]. The main objective of this paper is to assess the value to crop yield estimation of a high temporal and spatial resolution remote sensing VI time-series developed from multiple satellite data sources. Our hypothesis is that daily fused 30 m resolution VI time-series will provide improved metrics for crop yield estimation in comparison with single-sensor approaches by better capturing the timing and biomass conditions of phenologically critical stages of crop development. Specifically, we aim to answer the following questions:
  • Do high spatiotemporal resolution data (fused Landsat/Sentinel-2 MODIS) characterize crop yield variability better than using Landsat, Sentinel-2, or MODIS data alone?
  • What is the optimal time window for crop yield prediction?
  • Which vegetation index (NDVI or EVI2, both of which can be computed using data fusion from the 250 m MODIS surface reflectance in red and near infrared bands) better explains the variability in crop yield? Which time-series metric (maximum or cumulative VI) better explains the variability in crop yield?
To answer these questions, we analyzed temporal variability in corn and soybean yields over a study area in central Iowa (part of the U.S. Corn Belt) using long-term fused Landsat–MODIS surface reflectance (SR) data from 2001–2015 (daily at a 30 m spatial resolution), and a shorter-term analysis (2016–2017) fusing a new harmonized Landsat8–Sentinel2 surface reflectance dataset with MODIS.

2. Materials

2.1. Study Area

Our long-term temporal analyses were focused in central Iowa, USA—a rain-fed agricultural area in the U.S. Corn Belt. Annual precipitation in central Iowa over the study period (2001–2015) ranged from 633 mm to 1238 mm [49]. Precipitation for the growing season (from April to October) was less than 500 mm for 2012, and above 1000 mm for 2008 and 2010. The major crops were corn and soybean, typically rotated in consecutive years. Figure 1 shows the study area overlaid with the target Landsat scene (WRS-2 path 26 and row 31) and Sentinel-2 tile (15TVG). The Landsat scene covers major portions of 20 counties in central Iowa, while the Sentinel-2 tile covers most of eight counties.

2.2. Satellite Datasets

Studies were conducted using two datasets from 2001–2017. The first dataset used the Landsat–MODIS data fusion approach from 2001–2015. The second dataset additionally incorporated Sentinel-2 data for 2016–2017 from the Harmonized Landsat and Sentinel-2 (HLS) project into the data fusion process. The MODIS data products from 2001 to 2017 were downloaded from the NASA EarthData website [50]. Landsat surface reflectances from 2001 to 2015 were ordered from the U.S. Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center. HLS surface reflectances from 2016–2017 were downloaded from the NASA Goddard Space Flight Center [51]. The yearly NASS Cropland Data Layers (CDL) from 2001–2017 were downloaded from the CropScape portal [52]. The processing of these datasets is described further in the following sections.

2.2.1. Landsat, MODIS, and Fused Landsat–MODIS Data from 2001–2015

Figure 2 shows the Landsat data that were used in this study. Each dot represents a Landsat acquisition, and the size of the dot represents the percentage of valid and clear Landsat pixels within the Landsat scene. Mostly clear Landsat images from each year were chosen to pair with MODIS images that were acquired from the same day to generate daily Landsat–MODIS surface reflectance as described in Gao et al. [48]. Partially clear Landsat images (small dots) were also used in generating the smoothed and gap-filled daily VI time-series. All available Landsat data, including data from Landsat 5 Thematic Mapper (TM), Landsat 7 Enhanced Thematic Mapper Plus (ETM+), and Landsat 8 Operational Land Imager (OLI), were used in this study.
MODIS Collection6 data products from 2001 to 2015 were downloaded and processed. These include the daily surface reflectance at both 250 m (MOD09GQ) and 500 m (MOD09GA) resolutions [53], the MODIS Bidirectional Reflectance Distribution Function (BRDF) parameters at 500 m resolution (MCD43A1) [54], and the MODIS land cover types at 500 m resolution (MCD12Q1) [55].
The Landsat-MODIS data fusion results for 2001–2014 were generated in a previous study [48]. The mean absolute differences between the fused Landsat-MODIS and actual Landsat observations (not used in the data fusion) were less than 0.026 for the red band, 0.052 for the near infrared (NIR) band, and 0.083 for NDVI. The mean biases were within ±0.01 for the red band, ±0.02 for the NIR band, and in the range of −0.011 to 0.028 for NDVI from the previous study [48]. Data fusion results for 2015 were generated using Landsat 8 OLI images from days 194, 226, 258, and 338 in this study. Cloud masks were extracted from Landsat and MODIS QA layers, and were used to exclude cloud, cloud shadow, and snow pixels. Since Landsat 5 TM operational imaging ended in November 2011, and Landsat 8 OLI was not launched until February 2013, Landsat 7 ETM+ Scan Line Corrector (SLC)-off images were the only available Landsat data. For this reason, 2012 was not included, even though 2012 represented an extremely dry year which could have been valuable in the analysis.

2.2.2. Fused Landsat8–Sentinel2-MODIS Data from 2016–2017

Sentinel-2A and -2B were launched on 23 June 2015 and 7 March 2017 respectively. The Sentinel-2 Multi-Spectral Instrument (MSI) includes shortwave spectral bands very similar to Landsat-8 OLI. The combined Sentinel-2A and -2B platforms observe the entire Earth every five days. Since Sentinel-2 was not in full capacity in the early mission stage, for many areas in the U.S. the Sentinel-2 acquisition repeated every 10–20 days or even longer. The NASA HLS project processed Sentinel-2 and Landsat-8 using the consistent methods and data formats. The HLS Sentinel-2 Level-1 TOA reflectances were atmospherically corrected, co-registered, and re-sampled to match Landsat 30 m resolution images over the selected tiles. Band pass adjustments and BRDF corrections were applied to the HLS products to further harmonize Landsat-8 and Sentinel-2 [56]. The HLS project adopted the Sentinel-2 tiling grid to organize Landsat and Sentinel-2 images. Each tile is about 100 × 100 km2. One pre-processed HLS tile (15TVG) is located within the central Iowa study area. Both Sentinel-2 and Landsat-8 data from this tile were used. MODIS daily surface reflectance and 16-day BRDF parameters from 2016 and 2017 were also downloaded and processed for fusing with Landsat-8 and Sentinel-2 surface reflectance.

2.3. Crop Type and Yield Data

The USDA National Agricultural Statistics Service (NASS) Cropland Data Layer (CDL) has provided annual crop type information for the continental U.S. at a 30 m spatial resolution since 2008 [33]. CDL data for Iowa are available since 2000 and were downloaded [52] and used to identify corn and soybean pixels for generating statistical metrics. CDL products are generated using multiple sensors, including Landsat 5 TM, Landsat 7 ETM+, Landsat 8 OLI, the IRS-P6 Advanced Wide Field Sensor (AWiFS), the ResourceSat-2 LISS-3, the Disaster Monitoring Constellation (DMC) satellites, and Sentinel-2 MSI. In Iowa, the overall classification accuracies for major crops (soybeans and corn) were generally above 96%, except for 2001 (92–95%). CDL data for Iowa for all years were reprojected and resampled to match the extent of the Landsat scene and Sentinel-2 tile.
USDA NASS reports monthly yield forecasts during the growing season (August through November for corn and soybeans) at the national and state level. These yield estimates are finalized and published in the following January. Most of this information is collected from two large, ongoing panel surveys known as the Agricultural and Objective Yields Surveys. Regional MODIS-based model information is also used internally by NASS to supplement the on-the-ground reporting, particularly when there are yield anomalies. After the growing season has ended, and farmers have a better handle on their yields, an additional large survey is directed at them to establish county-level statistics [57,58]. This county-level information is reconciled with the already established state numbers, and then published in February [7]. For this work, all of the relevant annual county-level grain yield data from 2001 to 2017 for the study area were downloaded from the USDA NASS QuickStats database [59].

3. Methods

3.1. Landsat and MODIS VI Metrics from 2001–2015

To generate high spatiotemporal surface reflectance, the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) was used to fuse frequent yet coarse resolution MODIS and infrequent but fine Landsat data [47]. The STARFM approach uses spatial information from Landsat and temporal information from MODIS to produce daily Landsat-MODIS images at Landsat spatial resolution. It has been used in numerous applications that require high volumes of spatiotemporal data [45,48]. In this study, Landsat surface reflectance and the 250 m MODIS daily Nadir-BRDF Adjusted Reflectance (NBAR) were fused for the red and NIR bands. The MODIS daily NBAR at 250 m resolution was produced in this study using the daily 250 m surface reflectance product (MOD09GQ), and the 16-day 500 m MODIS BRDF product (MCD43A1) through the MODIS BRDF magnitude inversion approach [54]. MODIS daily NBAR products at 500 m resolution are available in Collection 6 [60]. We generated the 250 m NBAR for red and NIR bands in this study to preserve the MODIS original spatial resolution. Note that even though the 250 m MODIS daily bi-directional surface reflectances were corrected to the nadir view, the effective spatial resolution for off-nadir pixels was coarser than 250 m [61]. MODIS pair images with smaller view zenith angles tend to perform better as an input to STARFM [62].
In this study, the Landsat–MODIS data fusion approach was applied to reflectances in the red ( ρ r e d ) and near-infrared ( ρ n i r ) bands. This enabled the computation of both the NDVI and EVI2 indices from the original Landsat, MODIS, and the fused Landsat–MODIS surface reflectances:
NDVI = ( ρ n i r ρ r e d ) / ( ρ n i r + ρ r e d )
EVI 2 = G * ( ρ n i r ρ r e d ) / ( ρ n i r + C * ρ r e d + L )
where G, C and L are the adjustment factors for EVI2. The standard values (G = 2.5, C = 2.4 and L = 1.0) from Jiang et al. [63] were used in this study. As is well known, NDVI can saturate during the peak growing season [63,64], especially when NDVI is computed from the surface reflectance. The red band surface reflectance for dense green vegetation can be very small after atmosphere correction. EVI2 introduces adjustment factors and is normally lower than NDVI during the peak growing season. Based on the index formulas (Equations (1) and (2)), if the red band reflectance is close to zero, NDVI will be always close to 1 regardless of NIR reflectance. However, EVI2 can still capture variations from the NIR band due to the introduced adjustment factor. Despite the known deficiencies of NDVI, we included it here because it is still widely used in operational agricultural monitoring.
Due to possible cloud contamination in the Landsat and MODIS images, the fused Landsat–MODIS results still may have invalid values or gaps. This can affect the computation of maximum and cumulative values if there are significant data gaps present during the growing season. To fill these gaps, a modified Savitzky–Golay (SG) filter approach was built and applied to smooth and gap-fill the NDVI and EVI2 time-series on a pixel-by-pixel basis. The SG filter is a moving fitting approach. Each point is smoothed using the value computed from the polynomial function fit to the observations within the moving window. In contrast with traditional fitting approaches that use a fixed size of the moving window, our modified SG filter computed polynomial function used a fixed number of samples available around the target date. Since remote sensing time-series are typically not evenly distributed, some periods may have been observed more frequently and some periods less frequently, depending on the season and region. In our modified approach, the period with frequent observations would have a smaller moving window, and thus small variations could be preserved. For the period when valid observations are sparse, a large size of moving window was used to ensure that enough samples could be obtained; thus, continuous time-series could be produced. In the implementation, samples were collected from the central day and then moved one day before and after for each iteration until the minimum number of samples (adjustable threshold, default 15 samples) was reached or the maximum moving window (adjustable threshold, default ±75 days) was reached. The program removes spike points if the fitting errors are larger than the predefined threshold (default 3 standard deviation). The modified SG filter allowed us to retain small variations, but it also filled large gaps in an unevenly distributed time-series VI. Figure 3 shows the EVI time-series for a corn and a soybean pixel around the flux tower sites (yellow star in Figure 1) from the different datasets. Data fusion results fill the gaps between Landsat observations. The SG filter further removes noise, smooths the time-series, and generates the daily VI.
Maximum and cumulative (integral) NDVI and EVI2 metrics at a 30 m resolution for each year from 2001–2015 were computed using the smoothed and gap-filled daily VI time-series during the active growing months. The cumulative VIs were computed based on the cumulative period (from start to end date) and a base VI value as:
Cum _ VI = i = s t a r t e n d ( VI i base )
The base VI was included to reduce background or low value effects. Both large and small cumulative VIs were computed. The large cumulative (LCum) VI was defined as the cumulative value from all VIs (base = 0) during the period. The small cumulative (SCum) VI was defined as the cumulative value above the base value. We used the annual averaged VI at the green-up time as the base value. The green-up dates for corn and soybean were produced from our previous study [48]. The base value varied from year to year for each crop.
By defining the base value, the SCum VI emphasized the growing period by excluding days with lower VI values. The concept of large and small cumulative value has long been used in remote sensing time-series analysis, and our small cumulative value was similar to the small integral value in the TIMESAT software package [65]. Cumulative NDVI and EVI2 were not computed for Landsat and Sentinel-2 data, since time gaps between the actual observations were too large to produce any meaningful integral values. The annual maximum MODIS NDVI and EVI2 were computed using the daily NBAR at 250 m resolution produced for the central Iowa Landsat scene (path 26 and row 31).

3.2. Landsat-8, Sentinel-2 and MODIS Data Fusion Metrics from 2016–2017

For 2016 and 2017, Landsat-8 and Sentinel-2 surface reflectances from the HLS project were used. The MODIS daily NBAR at 250 m resolution were first produced using the MODIS daily surface reflectance (MOD09GQ, 250 m) and MODIS BRDF parameters (MCD43A1, 500 m). Landsat-8 surface reflectance, Sentinel-2 surface reflectance, and the MODIS daily NBAR (250 m) were used to generate daily 30 m resolution surface reflectance using the STARFM data fusion approach. Four maximum EVI2 datasets were produced for this analysis, including the maximum EVI2 derived using Landsat-8 (L8) only, Sentinel-2 (S2) only, the Harmonized Landsat-8 and Sentinel-2 (HLS) combination, and the MODIS–L8–S2 (MLS) time-series. The maximum EVI2 for MLS was generated using a daily time-series that was composed of the observed L8, the observed S2, and the MODIS–L8–S2 fused data if both L8 and S2 were not available.

3.3. Evaluation Metrics

NDVI and EVI2 metrics (maximum and cumulative) for each year at a 30 m resolution were averaged for corn and soybean pixels as classified in the CDL over each county. The averaged values of each metric at the county level were used to compute linear regression statistics using NASS crop yields at the county level from all covered counties. Most empirical yield models use a linear regression approach because fAPAR is linearly related to NDVI [66,67] and crop yield is linearly related to APAR (PAR*fAPAR) [3]. Since the objective of this study is to assess the explanatory ability of high temporal and spatial resolution remote sensing data for crop yield rather than to build an empirical model, we used a simple linear equation to compute the coefficient of determination (R2) and the relative mean absolute error (RMAE) for evaluation. The RMAE was computed relative to the reported yield using Equation (4):
RMAE = 1 n * i = 1 n ( | y i y i | / y i ) * 100 %
where yi is the reported yield for county I, y’i represents the predicted yield from the linear regression model for county I, and n is the total number of counties. Using relative errors allows for comparisons of performance from different years.
The statistics were calculated separately for corn and soybean pixels as identified in the annual CDL. The assessments were applied to the original single-sensor data and the data fusion results for two periods (2001–2015 and 2016–2017). Corn and soybean yields from 2001–2015 were modelled based on Landsat, MODIS, and the fused Landsat-MODIS data using a simple linear regression approach. The cross-validation was performed by random selecting 80% of samples for training and the remaining 20% for validation using the Cubist software [68]. The cross-validation repeated 100 times for each dataset, and the averaged R2 and RMAE were computed. The p-value for each regression was computed to test statistical significance.
To investigate the contributions of NDVI and EVI2 from different growing periods, R2 and RMAE plots were computed as a function of end date and cumulative interval days similar to the accumulating approach used in Lopez-Lozano et al. [69].

4. Results

4.1. Yields and High Spatiotemporal Data

4.1.1. Landsat-MODIS Data Fusion versus Single Data Source

Statistics between yields and VIs from a single data source (Landsat or MODIS) were generated and compared to the performance of a fused VI data set. An annual maximum EVI2 image was produced using all available Landsat scenes from that year. Since clear Landsat dates and pixels were different for each year, the annual maximum EVI2 value may come from different dates in different years. Figure 4 shows the scatter plots between the yields and the maximum EVI2 from a single data source (Landsat or MODIS), and from the fused data for corn and soybeans. From the plots, Landsat–MODIS data fusion results (right panel) showed a good linear relationship to yields for most years. The Landsat-only data (left panel) show scattered distributions, especially for corn. In the plots, two years (2003 and 2010) that had lower yields are highlighted (circles). The year of 2010 was a wet year (1064 mm rainfall during growing season), which caused lower yields in corn while soybean yields were normal. The year of 2003 was a drought year (596 mm rainfall during the growing season) which caused lower yields for soybeans but normal yields for corn. This also indicates that the impact of extremes in water availability (drought or pluvial) could be different for different crops depending on the timing of extreme events during crop growth and development. Table 1 shows R2 and RMAE for corn and soybean from Landsat, MODIS, and the fused Landsat–MODIS data averaged from cross-validations. For corn, the fused Landsat–MODIS data showed better relationships, especially compared to the Landsat data alone. For soybean, the results from Landsat and the fused Landsat–MODIS were similar when using all of the samples from 2001–2015. However, when the samples from 2003 (drought year) were excluded, statistics for MODIS and fused Landsat–MODIS were substantially improved. Even though 2003 for soybean was different from the rest of the years, the spatial variability for 2003 was still well captured by the data fusion results.
The weaker yield-VI relationships developed from Landsat data may due to the fact that a clear Landsat observation close to the peak VI time may not be available, and because the timing of clear Landsat observations varies from year to year. This suggests that Landsat data alone are inadequate in generating reliable relationships of yield–VI in the study area.
In this area, MODIS data at a 250 m spatial resolution could capture the spatial variability of yield at county scale, which ranged in area from 1107 to 1885 km2 (about 17,714 to 30,164 MODIS 250 m pixels). The field sizes were relatively large in Iowa, with an average area of 0.33 km2 [70]. The effect of the mixed pixels at the 250 m pixel scale (with an area of 0.0625 km2) may be less severe than for other areas in the world that normally have smaller field sizes. Even though this area was relatively homogeneous and the 250 m resolution MODIS data captured the spatial variability of the yield at the county level, the resolution was still too coarse to reliably capture the spatial variability at sub-field scales. Figure 5 illustrates corn yields estimated from MODIS (250 m) and the Landsat–MODIS data fusion results (30 m) over a subset area in 2010 (red box in Figure 1). The yield map from the Landsat–MODIS fusion result shows detailed within-field spatial variability in yield, which is important for field level management. Even though the MODIS yield map at 250 m resolution showed a similar spatial pattern and can capture yield variability at county level, the pixel blocks and mixed pixels around field boundaries could be identified. Since yield data at the field scale were not available for this study, validation at the field scale could not be performed.

4.1.2. Landsat8-Sentinel2-MODIS Data Fusion versus Single Data Source

Figure 6 shows scatter plots between the yields and maximum EVI2 extracted from four datasets averaged at the county level for 2016 and 2017. The maximum EVI2 values were computed using Landsat-8 alone, Sentinel-2 alone, the Harmonized Landsat-8 and Sentinel-2 (HLS) combination, and the MODIS–L8–S2 (MLS) data fusion. For both years, there was no clear Landsat-8 or Sentinel-2 image around the peak greenness time. Sentinel-2 was not in full acquisition capability for 2016 and 2017 in the U.S. Therefore, the maximum EVI2 from L8, S2 and HLS was always lower than from MLS, which was generated from the daily fused product. The relationships between the yield and maximum EVI2 from single sensors were weak. This is because that the maximum EVI2 was composited from different clear dates due to clouds. The maximum value composition may interrupt the spatial continuity of image. The HLS dataset improves the relationship. For MLS, the maximum EVI2 values were generated from the actual observations and daily data fusion results at 30 m resolution, and thus improved the relationships for both corn and soybean. However, there were some counties (Marshall, Hardin and Polk for corn in 2017) that showed lower EVI2 but higher yields. Although this tile only included eight counties from two years, the benefits of high temporal and spatial resolution are still noticeable.

4.2. Optimal Time Window for Yield Prediction

To investigate the predictive value of VIs from different growth periods, we generated statistics for small cumulative VIs using different time windows based on the fused Landsat-MODIS data from 2001–2015. Figure 7 shows R2 and RMAE between yield and cumulative NDVI and EVI2 for corn and soybeans averaged from 2001–2015. The cumulative EVI2 shows higher R2 and smaller RMAE in comparison with the cumulative NDVI plots. This suggests that EVI2 outperformed NDVI in the yield-VI linear model when the same time windows were used. In peak correlation and low RMAE areas (inside black contours), the end date (x-axis) varied from day 210 to 250 for corn, with interval of 1 to 70 days for corn, which means that from day 180 (=250–70) to 250 was a good time window for corn yield prediction. For soybean, the end date varied from day 220 to 260, and the interval varied from 1 to 80 days. Thus, day 180 (=260–80) to 260 was a good time window for soybean yield prediction. For central Iowa, the time window from day 180 (end of July) to 250 (early September) was good for both corn and soybean yield predictions. While crop growth stages may vary from year to year, this high R2 timeframe covered the silking to the dent stage for corn, and the blooming to pod setting stage for soybean [58]—critical stages for yield development. The results revealed the importance of the cumulative period for yield prediction. In this study area, the harvest dates for corn and soybean were around days 270–300 [57,58]. A good prediction using remote sensing data can therefore be expected at 1–3 months before harvest.
To compare the performance of maximum and cumulative VI, we selected the best time window for computing the cumulative VIs. The peak R2 in Figure 7 is located around the point (236, 45) (or day 191–236) for corn and around the point (250, 58) (or day 192–250) for soybean. Therefore, we chose the best time window from day 192 to 236 (11 July to 24 August) to compute the cumulative VIs for both corn and soybean, and compared these to results using maximum VI in the next sections.

4.3. Performance of VIs and Metrics

4.3.1. Performance of NDVI vs. EVI2 in Explaining the Yearly Spatial Yield Variability

The R2 and RMAE between the yield and VIs (NDVI and EVI2) were calculated for two metrics (maximum and cumulative). Both small and large cumulative VIs were computed during the period from days 192 to 236. Since VI values during this period are all larger than base VI values, results are identical for large and small cumulative VIs. Figure 8 shows the R2 and RMAE between the yields and VIs (maximum and small cumulative) from 2001–2015 for corn and soybeans from the study area. Each symbol in Figure 8 shows the statistics for one year based on yield and VIs from 20 counties. From this figure, we can see that R2 values for EVI2 were higher, and RMAEs were slightly lower than NDVI, which meant that EVI2 better captured the spatial variability of yield for a given year.
The relative MAEs for NDVI and EVI2 metrics (maximum, small and large cumulative) for both corn and soybean are less than 8% when using a simple linear regression model. This means the linear regression model works well within the year for each year from 2001–2015 in the study area. Since our objective in this paper was to assess the explanatory ability of different remote sensing data for crop yield rather than to build an empirical model, we have focused on the coefficient of determination (R2) and fitting error (RMAE) of the simple linear model. To assess the quality of linear regression, both metrics are important.

4.3.2. Performance of the VI Metrics in Explaining Yield Variability

Spatial Variability of Yield

To evaluate the explanatory ability of maximum and cumulative EVI2, we plotted the R2 and RMAE for maximum and small cumulative values separately. Figure 9 shows that the statistics using small cumulative values are similar to those from maximum EVI2. For many years, the small cumulative EVI2 outperformed the maximum EVI2.
The cumulative NDVI and EVI2 in Figure 9 cover the growing period from day 192 to 236. Figure 10 compares the maximum VIs and small cumulative VIs computed from the entire growing season (day 150–300). The statistics from maximum VIs outperformed those from the cumulative VIs. The performance of the cumulative VIs depended on cumulative time window, as also revealed in Figure 7.

Temporal Variability of Yield

The statistics derived for each year from the 20 counties mainly assessed the spatial explanatory ability of VI for crop yield. To evaluate the temporal explanatory ability of VIs and to examine the inter-annual variation of yield, we also generated statistics for each county using the historical yields and VIs from 2001–2015. Figure 11 shows the R2 and RMAE for each county using small cumulative and maximum VIs. The statistics were generated using 14 years of data for each county to assess the capability of VIs and metrics in explaining temporal variability of yield. In Figure 11, the R2 for the maximum and cumulative VIs were close. RMAE from cumulative VIs were slightly smaller (better) than those from the maximum VIs.
The temporal statistics in Figure 11 are not as good as the spatial statistics shown in Figure 9. To compare results from temporal and spatial models, we generated a histogram of RMAE for both models using small cumulative EVI2 (Figure 12). The simple temporal model shows larger errors for both corn and soybeans, which further implies that temporal variability of yield is more difficult to explain than spatial variability by using VIs alone.

5. Discussion

5.1. Multi-Sensor Combination

This study shows the data fusion results outperform single sensor in explaining yield variability. The Sentinel-2 MSI and Landsat-8 OLI have similar shortwave spectral bands and are freely available. Many studies have shown the benefits when two datasets are used in combination [43]. However, the direct combination of the different data sources could be complicated by differences between the sensor characteristics and the data pre-processing procedures. NASA’s Harmonized Landsat and Sentinel-2 (HLS) project produces the harmonized Landsat-8 and Sentinel-2 surface reflectance in a consistent way [56]. Band pass differences and BRDF effects have been corrected. The HLS product can be used directly for such analysis once the data product becomes available over a larger area in the U.S, and over a longer timeframe. Two years (2016 and 2017) of Landsat-8 and Sentinel-2 data are certainly not long enough for rigorous analysis. Additional years need to be processed and analyzed in the future.
To fully investigate the value of high spatial and temporal information, crop yield data at the field scale are needed. However, yield data at the field scale are scarce, and many datasets are not available to the public. Even though we used yield data at the county level in this study, the value of high temporal information can be observed. The daily observations from MODIS (250 m) described the yield variability reasonably well at the county level; however, yield correlations improved when MODIS is fused with Landsat data (Table 1). Even though Sentinel-2 was not in full capacity during the study period, the direct combination of the harmonized Landsat-8 and Sentinel-2 show better results than using any single sensor alone. The benefit of spatial resolution is more pronounced when yield maps or validation are needed at field or sub-field scales. A fully integrated data fusion system using Landsat, Sentinel-2, MODIS, and VIIRS (Visible Infrared Imaging Radiometer Suite) surface reflectance can be expected to improve yield estimation.

5.2. Time Window for Yield Prediction

Earlier research has suggested that the cumulative VI is effective in capturing yield variability [3,6,14,15,71,72]. However, computation of the cumulative VI can be impacted by the availability of clear satellite observations during the accumulation period. Using daily VI time-series data generated through the MODIS–Landsat fusion, we found that NDVI and EVI2 from day 180 to 250 (29 June to 7 September) provided valuable information for capturing the spatial variability in yield in this area. The best prediction time for the area from 2001 to 2015 is around day 192 to 236 (11 July to 24 August). An earlier study showed that the annual variation of corn grain yields at the national and state levels in the U.S. can be predicted accurately in early August [73], which is within the optimal time window from this study.
In this study, the maximum values of NDVI and EVI2 during the growing season have stronger correlations to yield than do the cumulative indices from the entire growing season. Crops affected by water stress or nutrient deficits will often have lower yields, and the ability to capture these impacts in a given metric depends on the timing of the stress. If the stress event occurs before the peak of the growing season, it will reduce growth (lower leaf area index) and result in lower peak VIs. In these scenarios, the relationship between yield and peak VI may be more observable than between yield and cumulative VI. If the stress happens after the peak of the growing season (e.g., during flowering, grain filling and maturation stages), it may not affect the peak VI, and the relationship between yield and VI could be weaker. The cumulative VI metrics integrate throughout the growing season, and they may lessen the signal of short-term stress in the latter parts of the accumulation period. However, a shorter cumulative period from day 192–236 generally outperforms maximum VI. This period corresponds to the middle of the growing season at around the silking to the dent stage for corn, and from blooming to the pod setting stage for soybean [58]—the reproductive stage in phenological development. In addition to using a shorter cumulative period, another feasible approach is to use a higher base value to compute the cumulative VI during the entire growing season, and thus to reduce the dependence of time window.

5.3. Yield and VIs

In this study, we evaluated the relationships of corn and soybean yields to metrics derived from NDVI and EVI2 image time-series. The MODIS 250 m surface reflectance product provides two spectral bands (red and NIR), which limits our ability to produce other vegetation indices that require additional spectral bands. The results showed that EVI2 derived from surface reflectance correlates better with yield than does NDVI. This is likely because NDVI has a greater tendency to saturate during the peak growing season [63,64].
Other red-NIR band based vegetation indices such as the Wide Dynamic Range Vegetation Index (WDRVI) [64] introduce a weighting coefficient to the NIR band reflectance to increase the contribution from red band in comparison with NDVI. Therefore, the WDRVI has a wider dynamic range than NDVI in densely vegetated areas. WDRVI showed a strong linear relationship with corn green LAI [27,71,73].
This study was limited to a predominantly rain-fed agricultural area. For irrigated fields where water stress is not a concern, the relationship between yield and VI could be different. A regional irrigation map at similar spatial resolution is needed, and may be developed using remote sensing approaches [74,75].
Using a process-based model, crop yields can be determined by APAR (PAR*fAPAR), light use efficiency (LUE), and harvest index. Our central Iowa study area is relative small, and so spatial variability of PAR is small. LUE is often parameterized as a crop-type dependent value, modified by functionals describing impacts on moisture stress (conveyed by a normalized ET variable) and surface temperature extremes at different stages of crop growth. The LUE for a given crop in a small rain-fed region for a year may be spatially homogeneous if the moisture stress is relative uniform. Since fAPAR is linearly related to VI, in theory, the spatial variability of crop yield in a small area for a given year can be described by VI. This is the foundation for many VI-based yield prediction models. However, these relationships can change from year to year and from location to location as many empirical crop yield models revealed [9]. VI alone cannot capture the temporal variability of yield, which could additionally be affected by water and nutrient availability, seed improvement, management enhancement etc. The harvest index can be highly variable between fields and over time, especially when water stress occurs after the peak of the growing season. In this case, the grain yields can be significantly reduced, but the above-ground biomass may not. More environmental variables are needed for crop yield modeling. Evapotranspiration (ET) and normalized evaporative stress index (ESI) products derived at the Landsat scale by similar data fusion methods may provide the necessary additionally information to capture the temporal variability of crop yield [76,77,78]—this is a topic of current investigation. For large areas, spatial variability of PAR, ET, temperature, seed species, and management approach etc. needs to be considered in rigorous crop yield modeling.

6. Conclusions

The value of high temporal and spatial resolution remote sensing for describing the spatial and temporal variability of crop yield was evaluated. This study was conducted in central Iowa from 2001 to 2015 using the Landsat-MODIS data fusion and from 2016–2017, using the Landsat8–Sentinel2–MODIS data fusion. Our results show that:
  • High temporal and spatial resolution data from the fused daily Landsat/Sentinel-2 MODIS results explain crop yield variability better than do Landsat, Sentinel-2, or MODIS data alone.
  • The optimal time window for crop yield prediction is from day 192–236 (early July to late August, 1–3 months before harvest) for corn and soybean in the study area.
  • The two band Enhanced Vegetation Index (EVI2) explains the variability of crop yield better than the Normalized Difference Vegetation Index (NDVI) when derived from surface reflectance. The cumulative VIs from the optimal time window outperforms maximum VIs. However, the cumulative VIs from the entire growing season underperforms maximum VIs.
Our findings highlight the importance of high temporal and spatial remote sensing data for crop yield estimation, and to support the development of new medium resolution sensors for agricultural applications.

Author Contributions

Conceptualization, F.G. and M.A.; Formal analysis, F.G., M.A., C.D., D.J.; Writing—Original Draft Preparation, F.G.; Writing—Review & Editing, M.A., C.D., D.J.

Funding

This work was partially supported by the NASA Science of Terra, Aqua, and Suomi NPP program, the NASA LCLUC program, and the US Geological Survey (USGS) Landsat Science Team program.

Acknowledgments

USDA is an equal opportunity provider and employer.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. MacDonald, R.B.; Hall, F.G. Global crop forecasting. Science 1980, 208, 670–679. [Google Scholar] [CrossRef] [PubMed]
  2. Tucker, C.J.; Holben, B.N.; Elgin, J.H.; McMurtry, J.E., III. Relationship of spectral data to grain yield variation. Photogramm. Eng. Remote Sens. 1980, 46, 657–666. [Google Scholar]
  3. Pinter, P.J., Jr.; Jackson, R.D.; Idso, S.B.; Reginato, R.J. Multidate spectral reflectance as predictors of yield in water stressed wheat and barley. Int. J. Remote Sens. 1981, 2, 43–48. [Google Scholar] [CrossRef]
  4. Daughtry, C.S.T.; Gallo, K.P.; Bauer, M.E. Spectral Estimates of Solar Radiation Intercepted by Corn Canopies. Agron. J. 1983, 75, 527–531. [Google Scholar] [CrossRef]
  5. Wiegand, C.L.; Richardson, A.J.; Jackson, R.D.; Pinter, P.J., Jr.; Aase, J.K.; Smika, D.E.; Lautenschlager, L.F.; McMurtrey, J.E., III. Development of agrometeorologlcal crop model inputs from remotely sensed information. IEEE Trans. Geosci. Remote Sens. 1986, 24, 90–98. [Google Scholar] [CrossRef]
  6. Daughtry, C.S.T.; Gallo, K.P.; Goward, S.N.; Prince, S.D.; Kustas, W.P. Spectral estimates of absorbed radiation and phytomass production in corn and soybean canopies. Remote Sens. Environ. 1992, 39, 141–152. [Google Scholar] [CrossRef]
  7. Johnson, D.M. An assessment of pre- and within-season remotely sensed variables for forecasting corn and soybean yields in the United States. Remote Sens. Environ. 2014, 141, 116–128. [Google Scholar] [CrossRef]
  8. Funk, C.; Budde, M.E. Phenologically tuned MODIS NDVI-based production anomaly estimates for Zimbabwe. Remote Sens. Environ. 2009, 113, 115–125. [Google Scholar] [CrossRef]
  9. Doraiswamy, P.C.; Hatfield, J.L.; Jackson, T.J.; Akhmedoc, B.; Prueger, J.; Stern, A. Crop condition and yield simulations using Landsat and MODIS. Remote Sens. Environ. 2004, 92, 548–559. [Google Scholar] [CrossRef]
  10. Mkhabela, M.S.; Bullock, P.; Raj, S.; Wang, S.; Yang, Y. Crop yield forecasting on the Canadian prairies using MODIS NDVI data. Agric. For. Meteorol. 2011, 151, 385–393. [Google Scholar] [CrossRef]
  11. Tucker, C.J.; Vanpraet, C.L.; Sharman, M.J.; Van Ittersum, G. Satellite remote sensing of total herbaceous biomass production in the Senegalese Sahel: 1980–1984. Remote Sens. Environ. 1985, 17, 233–249. [Google Scholar] [CrossRef]
  12. Malingreau, J.-P. Global vegetation dynamics: Satellite observations over Asia. Int. J. Remote Sens. 1986, 7, 1121–1146. [Google Scholar] [CrossRef]
  13. Rasmussen, M.S. Assessment of millet yields and production in northern Burkina-Faso using integrated NDVI from the AVHRR. Int. J. Remote Sens. 1992, 13, 3431–3442. [Google Scholar] [CrossRef]
  14. Rasmussen, M.S. Operational yield forecast using AVHRR NDVI data: Reduction of environmental and inter-annual variability. Int. J. Remote Sens. 1997, 18, 1059–1077. [Google Scholar] [CrossRef]
  15. Rasmussen, M.S. Developing simple, operational, consistent NDVI-vegetation models by applying environmental and climatic information. Part II: Crop yield assessment. Int. J. Remote Sens. 1998, 19, 119–139. [Google Scholar] [CrossRef]
  16. Maselli, F.; Conese, C.; Petkov, L.; Gilabert, M.-A. Use of NOAA-AVHRR NDVI data for environmental monitoring and crop forecasting in the Sahel. Preliminary results. Int. J. Remote Sens. 1992, 13, 2743–2749. [Google Scholar] [CrossRef]
  17. Benedetti, R.; Rossini, P. On the use of NDVI profiles as a tool for agricultural statistics: The case study of wheat yield estimate and forecast in Emilia Romagna. Remote Sens. Environ. 1993, 45, 311–326. [Google Scholar] [CrossRef]
  18. Quarmby, N.A.; Milnes, M.; Hindle, T.L.; Silleos, N. The use of multi-temporal NDVI measurements from AVHRR data for crop yield estimation and prediction. Int. J. Remote Sens. 1993, 14, 199–201. [Google Scholar] [CrossRef]
  19. Maselli, F.; Rembold, F. Analysis of GAC NDVI data for cropland identification and yield forecasting in Mediterranean African countries. Photogramm. Eng. Remote Sens. 2001, 67, 593–602. [Google Scholar]
  20. Ferencz, C.; Bognar, P.; Lichtenberger, J.; Hamar, D.; Tarcsai, G.; Timár, G.; Molnár, G.; Pásztor, S.Z.; Steinbach, P.; Székely, B.; et al. Crop yield estimation by satellite remote sensing. Int. J. Remote Sens. 2004, 25, 4113–4149. [Google Scholar] [CrossRef]
  21. Wall, L.; Larocque, D.; Léger, P.-M. The early explanatory power of NDVI in crop yield modelling. Int. J. Remote Sens. 2008, 29, 2211–2225. [Google Scholar] [CrossRef]
  22. Bolton, D.K.; Friedl, M.A. Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agric. For. Meteorol. 2013, 173, 74–84. [Google Scholar] [CrossRef]
  23. Lobell, D.B.; Ortiz-Monasterio, J.I.; Asner, G.P.; Naylor, R.L.; Falcon, W.P. Combining field surveys, remote sensing, and regression trees to understand yield variations in an irrigated wheat landscape. Agron. J. 2005, 97, 241–249. [Google Scholar]
  24. Lobell, D.B. The use of satellite data for crop yield gap analysis. Field Crop Res. 2013, 143, 56–64. [Google Scholar] [CrossRef]
  25. Ren, J.; Chen, Z.; Zhou, Q.; Tang, H. Regional yield estimation for winter wheat with MODIS-NDVI data in Shandong China. Int. J. Appl. Earth Obs. Geoinf. 2008, 10, 403–413. [Google Scholar] [CrossRef]
  26. Becker-Reshef, I.; Vermote, E.; Lindeman, M.; Justice, C. A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data. Remote Sens. Environ. 2010, 114, 1312–1323. [Google Scholar] [CrossRef]
  27. Sakamoto, T.; Gitelson, A.A.; Arkebauer, T.J. MODIS-based corn grain yield estimation model incorporating crop phenology information. Remote Sens. Environ. 2013, 131, 215–231. [Google Scholar] [CrossRef]
  28. Franch, B.; Vermote, E.F.; Becker-Reshef, I.; Claverie, M.; Huang, J.; Zhang, J.; Justice, C.; Sobrino, J.A. Improving the timeliness of winter wheat production forecast in the United States of America, Ukraine and China using MODIS data and NCAR growing degree day information. Remote Sens. Environ. 2015, 161, 131–148. [Google Scholar] [CrossRef]
  29. Zhang, X.; Zhang, Q. Monitoring interannual variation in global crop yield using long-term AVHRR and MODIS observations. ISPRS J. Photogramm. Remote Sens. 2016, 114, 191–205. [Google Scholar] [CrossRef]
  30. Azzari, G.; Jain, M.; Lobell, D.B. Towards fine resolution global maps of crop yields: Testing multiple methods and satellites in three countries. Remote Sens. Environ. 2017, 202, 129–141. [Google Scholar] [CrossRef]
  31. Wardlow, B.D.; Egbert, S.L. Large-area crop mapping using time-series MODIS 250 m NDVI data: An assessment for the U.S. Central Great plains. Remote Sens. Environ. 2008, 112, 1096–1116. [Google Scholar] [CrossRef]
  32. The Group on Earth Observations Global Agricultural Monitoring Initiative. Available online: http://geoglam-crop-monitor.org/ (accessed on 28 August 2018).
  33. Boryan, C.; Yang, Z.; Mueller, R.; Craig, M. Monitoring US agriculture: The US Department of Agriculture, National Agricultural Statistics Service, Cropland Data Layer program. Geocarto Int. 2011. [Google Scholar] [CrossRef]
  34. Liu, J.; Pattey, E.; Miller, J.R.; McNairn, H.; Smith, A.; Hu, B. Estimating crop stresses, aboveground dry biomass and yield of corn using multi-temporal optical data combined with a radiation use efficiency model. Remote Sens. Environ. 2010, 114, 1167–1177. [Google Scholar] [CrossRef]
  35. Gitelson, A.A.; Peng, Y.; Masek, J.G.; Rundquist, D.C.; Verma, S.; Suyker, A. Remote estimation of crop gross primary production with Landsat data. Remote Sens. Environ. 2012, 121, 404–414. [Google Scholar] [CrossRef] [Green Version]
  36. Rudorff, B.F.T.; Batista, G.T. Wheat yield estimation at the farm level using TM Landsat and agrometeorological data. Int. J. Remote Sens. 1991, 12, 2477–2484. [Google Scholar] [CrossRef]
  37. Thenkabail, P.S.; Ward, A.D.; Lyon, J.G. Landsat-5 thematic mapper models of soybean and corn crop characteristics. Int. J. Remote Sens. 1994, 15, 49–61. [Google Scholar] [CrossRef]
  38. Lobell, D.B.; Thau, D.; Seifert, C.; Engle, E.; Little, B. A scalable satellite-based crop yield mapper. Remote Sens. Environ. 2015, 164, 324–333. [Google Scholar] [CrossRef]
  39. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
  40. Guan, K.; Li, Z.; Rao, L.N.; Gao, F.; Xie, D.; Hien, N.T.; Zeng, Z. Mapping paddy rice area and yields over Thai Binh Province in Viet Nam from MODIS, Landsat, and ALOS-2/PALSAR-2. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2018, 11. [Google Scholar] [CrossRef]
  41. Roy, D.P.; Wulder, M.A.; Loveland, T.R.; Woodcock, C.E.; Allen, R.G.; Anderson, M.C.; Helder, D.; Irons, J.R.; Johnson, D.M.; Kennedy, R.; et al. Landsat-8: Science and product vision for terrestrial global change research. Remote Sens. Environ. 2014, 145, 154–172. [Google Scholar] [CrossRef]
  42. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  43. Skakun, S.; Vermote, E.; Roger, J.C.; Franch, B. Combined use of Landsat-8 and Sentinel-2A images for winter crop mapping and winter wheat yield assessment at regional scale. AIMS Geosci. 2017, 3, 163. [Google Scholar] [CrossRef] [PubMed]
  44. Burke, M.; Lobell, D.B. Satellite-based assessment of yield variation and its determinants in smallholder African systems. Proc. Natl. Acad. Sci. USA 2017, 114, 2189–2194. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  45. Gao, F.; Hilker, T.; Zhu, X.; Anderson, M.; Masek, J.; Wang, P.; Yang, Y. Fusing Landsat and MODIS Data for Vegetation Monitoring. IEEE Geosci. Remote Sens. Mag. 2015, 3, 47–60. [Google Scholar] [CrossRef]
  46. Zhu, X.; Cai, F.; Tian, J.; Williams, T.K.-A. Spatiotemporal Fusion of Multisource Remote Sensing Data: Literature Survey, Taxonomy, Principles, Applications, and Future Directions. Remote Sens. 2018, 10, 527. [Google Scholar] [Green Version]
  47. Gao, F.; Masek, J.; Schwaller, M.; Hall, F. On the Blending of the Landsat and MODIS Surface Reflectance: Predict Daily Landsat Surface Reflectance. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2207–2218. [Google Scholar]
  48. Gao, F.; Anderson, M.C.; Zhang, X.; Yang, Z.; Alfieri, J.G.; Kustas, W.P.; Mueller, R.; Johnson, D.; Prueger, J.H. Toward mapping crop progress at field scales through fusion of Landsat and MODIS imagery. Remote Sens. Environ. 2017, 188, 9–25. [Google Scholar] [CrossRef]
  49. Iowa Department of Agriculture and Land Stewardship. Available online: http://www.iowaagriculture.gov/climatology.asp (accessed on 17 September 2018).
  50. NASA EarthData. Available online: https://search.earthdata.nasa.gov/search (accessed on 17 September 2018).
  51. NASA Harmonized Landsat and Sentinel-2 (HLS). Available online: https://hls.gsfc.nasa.gov/ (accessed on 17 September 2018).
  52. USDA National Agricultural Statistics Service. Cropland Data Layer Distributed through the George Mason University’s CropScape. Available online: https://nassgeodata.gmu.edu/CropScape/ (accessed on 17 September 2018).
  53. Vermote, E.F.; El Saleous, N.Z.; Justice, C.O. Atmospheric correction of MODIS data in the visible to middle infrared: First results. Remote Sens. Environ. 2002, 83, 97–111. [Google Scholar] [CrossRef]
  54. Schaaf, C.B.; Gao, F.; Strahler, A.H.; Lucht, W.; Li, X.; Tsang, T.; Strugnell, N.C.; Zhang, X.; Jin, Y.; Muller, J.P.; et al. First operational BRDF, albedo and nadir reflectance products from MODIS. Remote Sens. Environ. 2002, 83, 135–148. [Google Scholar] [CrossRef]
  55. Friedl, M.A.; McIver, D.K.; Hodges, J.C.; Zhang, X.Y.; Muchoney, D.; Strahler, A.H.; Woodcock, C.E.; Gopal, S.; Schneider, A.; Cooper, A.; et al. Global land cover from MODIS: Algorithms and early results. Remote Sens. Environ. 2002, 83, 287–302. [Google Scholar] [CrossRef]
  56. Claverie, M.; Ju, J.; Masek, J.G.; Dungan, J.L.; Vermote, E.F.; Roger, J.-C.; Skakun, S.; Justice, C.O. The Harmonized Landsat and Sentinel-2 data set. Remote Sens. Environ. 2018, accepted, in press. [Google Scholar]
  57. USDA National Agricultural Statistics Service. Field Crops Usual Planting and Harvesting Dates. Available online: http://usda.mannlib.cornell.edu/usda/current/planting/planting-10-29-2010.pdf (accessed on 17 September 2018).
  58. USDA National Agricultural Statistics Service. Crop Progress and Condition. Available online: https://www.nass.usda.gov/Publications/State_Crop_Progress_and_Condition/index.php (accessed on 17 September 2018).
  59. USDA National Agricultural Statistics Service. Quick Statistics. Available online: https://www.nass.usda.gov/Quick_Stats/ (accessed on 17 September 2018).
  60. Wang, Z.; Schaaf, C.B.; Sun, Q.; Shuai, Y.; Román, M.O. Capturing Rapid Land Surface Dynamics with Collection V006 MODIS BRDF/NBAR/Albedo (MCD43) Products. Remote Sens. Environ. 2018, 207, 50–64. [Google Scholar] [CrossRef]
  61. Pahlevan, N.; Sarkar, S.; Devadiga, S.; Wolfe, R.E.; Román, M.; Vermote, E.; Lin, G.; Xiong, X. Impact of Spatial Sampling on Continuity of MODIS–VIIRS Land Surface Reflectance Products: A Simulation Approach. IEEE Trans. Geosci. Remote Sens. 2017, 55, 183–196. [Google Scholar] [CrossRef] [Green Version]
  62. Xie, D.; Gao, F.; Sun, L.; Anderson, M. Improving Spatial-Temporal Data Fusion by Choosing Optimal Input Image Pairs. Remote Sens. 2018, 10, 1142. [Google Scholar] [CrossRef]
  63. Jiang, Z.; Huete, A.R.; Didan, K.; Miura, T. Development of a two-band enhanced vegetation index without a blue band. Remote Sens. Environ. 2008, 112, 3833–3845. [Google Scholar] [CrossRef]
  64. Gitelson, A.A. Wide dynamic range vegetation index for remote quantification of biophysical characteristics of vegetation. J. Plant Physiol. 2004, 161, 165–173. [Google Scholar] [CrossRef] [PubMed]
  65. Jonsson, P.; Eklundh, L. TIMESAT—A program for analysing time-series of satellite sensor data. Comput. Geosci. 2004, 30, 833–845. [Google Scholar] [CrossRef]
  66. Myneni, R.B.; Williams, D.L. On the relationship between FAPAR and NDVI. Remote Sens. Environ. 1994, 49, 200–211. [Google Scholar] [CrossRef]
  67. Myneni, R.; Hoffman, S.; Knyazikhin, Y.; Privette, J.; Glassy, J.; Tian, Y.; Wang, Y.; Song, X.; Zhang, Y.; Smith, G.; et al. Global products of vegetation leaf area and fraction absorbed PAR from year one of MODIS data. Remote Sens. Environ. 2002, 83, 214–231. [Google Scholar] [CrossRef] [Green Version]
  68. Cubist Regression Tree Software. Available online: https://www.rulequest.com/cubist-info.html (accessed on 17 September 2018).
  69. López-Lozano, R.; Duveiller, G.; Seguini, L.; Meroni, M.; García-Condado, S.; Hooker, J.; Leo, O.; Baruth, B. Towards regional grain yield forecasting with 1 km-resolution EO biophysical products: Strengths and limitations at pan-European level. Agric. For. Meteorol. 2015, 206, 12–32. [Google Scholar] [CrossRef]
  70. Yan, L.; Roy, D.P. Conterminous United States crop field size quantification from multi-temporal Landsat data. Remote Sens. Environ. 2016, 172, 67–86. [Google Scholar] [CrossRef]
  71. Guindin-Garcia, N.; Gitelson, A.A.; Arkebauer, T.J.; Shanahan, J.; Weiss, A. An evaluation of MODIS 8- and 16-day composite products for monitoring maize green leaf area index. Agric. For. Meteorol. 2012, 161, 15–25. [Google Scholar] [CrossRef]
  72. Tucker, C.J.; Holben, B.N.; Elgin, J.H.; McMurtrey, J.E., III. Remote sensing of total dry-matter accumulation in winter wheat. Remote Sens. Environ. 1981, 11, 171–189. [Google Scholar] [CrossRef] [Green Version]
  73. Sakamoto, T.; Gitelson, A.A.; Arkebauer, T.J. Near real-time prediction of U.S. corn yields based on time-series MODIS data. Remote Sens. Environ. 2014, 147, 219–231. [Google Scholar] [CrossRef]
  74. Ozdogan, M.; Gutman, G. A new methodology to map irrigated areas using multi-temporal MODIS and ancillary data: An application example in the continental US. Remote Sens. Environ. 2008, 112, 3520–3537. [Google Scholar] [CrossRef] [Green Version]
  75. Ozdogan, M.; Yang, Y.; Allez, G.; Cervantes, C. Remote sensing of irrigated agriculture: Opportunities and challenges. Remote Sens. 2010, 2, 2274–2304. [Google Scholar] [CrossRef]
  76. Anderson, M.C.; Zolin, C.A.; Sentelhas, P.C.; Hain, C.R.; Semmens, K.; Tugrul Yilmaz, M.; Gao, F.; Otkin, J.A.; Tetrault, R. The Evaporative Stress Index as an indicator of agricultural drought in Brazil: An assessment based on crop yield impacts. Remote Sens. Environ. 2016, 174, 82–99. [Google Scholar] [CrossRef] [Green Version]
  77. Anderson, M.C.; Hain, C.R.; Jurecka, F.; Trnka, M.; Hlavinka, P.; Dulaney, W.; Otkin, J.A.; Johnson, D.; Gao, F. Relationships between the evaporative stress index and winter wheat and spring barley yield anomalies in the Czech Republic. Clim. Res. 2016, 70, 215–230. [Google Scholar] [CrossRef]
  78. Yang, Y.; Anderson, M.; Gao, F.; Wardlow, B.; Hain, C.; Otkin, J.; Alfieri, J.; Yang, Y.; Sun, L.; Dulaney, W. Field-scale mapping of evaporative stress indicators of crop yield: An application over Mead, NE, USA. Remote Sens. Environ. 2018, 210, 387–402. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Study site in central Iowa (a) overlaid with the Landsat scene (path 26 and row 31, near infrared-red–green composite) containing the main portions of 20 counties (white polygons) and a Sentinel-2 tile (15TVG, yellow rectangle). The red box shows the zoom-in window for Figure 5. The yellow star shows the flux tower site.
Figure 1. Study site in central Iowa (a) overlaid with the Landsat scene (path 26 and row 31, near infrared-red–green composite) containing the main portions of 20 counties (white polygons) and a Sentinel-2 tile (15TVG, yellow rectangle). The red box shows the zoom-in window for Figure 5. The yellow star shows the flux tower site.
Remotesensing 10 01489 g001
Figure 2. Landsat-5 (2001–2011, red), Landsat-7 (2001–2015, yellow), and Landsat-8 (2013–2015, green) images used in Study #1. The size of the dot represents the percentage of valid and clear pixels from the Landsat images.
Figure 2. Landsat-5 (2001–2011, red), Landsat-7 (2001–2015, yellow), and Landsat-8 (2013–2015, green) images used in Study #1. The size of the dot represents the percentage of valid and clear pixels from the Landsat images.
Remotesensing 10 01489 g002
Figure 3. Two examples from Landsat, Landsat–MODIS data fusion, and the SG filter-smoothed result for corn (a) and soybean (b) in 2011 near the flux tower sites (yellow star in Figure 1).
Figure 3. Two examples from Landsat, Landsat–MODIS data fusion, and the SG filter-smoothed result for corn (a) and soybean (b) in 2011 near the flux tower sites (yellow star in Figure 1).
Remotesensing 10 01489 g003
Figure 4. Corn and soybean yields and maximum normalized difference vegetation index (NDVI) and maximum two-band enhanced vegetation index (EVI2) for each county and year using Landsat only (left), MODIS only (middle), and Landsat–MODIS data fusion results (right). The fused data showed better explanatory ability even at the very coarse county level. The circles show samples from the flood year (2010) for corn (ac,gi) and the drought year (2003) for soybean (df,jl). Statistics in the plots were computed using all available samples.
Figure 4. Corn and soybean yields and maximum normalized difference vegetation index (NDVI) and maximum two-band enhanced vegetation index (EVI2) for each county and year using Landsat only (left), MODIS only (middle), and Landsat–MODIS data fusion results (right). The fused data showed better explanatory ability even at the very coarse county level. The circles show samples from the flood year (2010) for corn (ac,gi) and the drought year (2003) for soybean (df,jl). Statistics in the plots were computed using all available samples.
Remotesensing 10 01489 g004
Figure 5. Corn yields estimated from MODIS (a) and the Landsat–MODIS data fusion results (b) using a simple linear model (Figure 4b) at the county level in 2010 for a subset area in Figure 1 (red box). The Cropland Data Layer (CDL) for 2010 was used to mask out non-corn classes (black).
Figure 5. Corn yields estimated from MODIS (a) and the Landsat–MODIS data fusion results (b) using a simple linear model (Figure 4b) at the county level in 2010 for a subset area in Figure 1 (red box). The Cropland Data Layer (CDL) for 2010 was used to mask out non-corn classes (black).
Remotesensing 10 01489 g005
Figure 6. Relationship between yield and maximum EVI2 for Landsat-8 (L8), Sentinel-2 (S2), the Harmonized Landsat-8 and Sentinel-2 (HLS), and MODIS-L8-S2 (MLS) data fusion for corn and soybeans from 2016–2017.
Figure 6. Relationship between yield and maximum EVI2 for Landsat-8 (L8), Sentinel-2 (S2), the Harmonized Landsat-8 and Sentinel-2 (HLS), and MODIS-L8-S2 (MLS) data fusion for corn and soybeans from 2016–2017.
Remotesensing 10 01489 g006
Figure 7. R2 and RMAE of the yield with the cumulative NDVI (left) and EVI2 (right) for corn and soybean. The x-axis represents the end date and the y-axis represents the interval. The high R2 area (highest 10%) and low RMAE area (lowest 10%) are highlighted by black contours.
Figure 7. R2 and RMAE of the yield with the cumulative NDVI (left) and EVI2 (right) for corn and soybean. The x-axis represents the end date and the y-axis represents the interval. The high R2 area (highest 10%) and low RMAE area (lowest 10%) are highlighted by black contours.
Remotesensing 10 01489 g007
Figure 8. Comparison of R2 (a) and the relative mean absolute errors (RMAE) (b) between yield and VIs (x-axis: with NDVI; y-axis: with EVI2). Each symbol shows the statistics from one year (2001–2015) generated from 20 counties. Two VI metrics include cumulative VI (Cum_VI) and maximum VI (Max_VI). EVI2 shows higher R2 (above the 1:1 line) and lower RMAE (below the 1:1 line) than NDVI. Statistics that are not significant (p-value < 0.05) from both NDVI and EVI2 are excluded.
Figure 8. Comparison of R2 (a) and the relative mean absolute errors (RMAE) (b) between yield and VIs (x-axis: with NDVI; y-axis: with EVI2). Each symbol shows the statistics from one year (2001–2015) generated from 20 counties. Two VI metrics include cumulative VI (Cum_VI) and maximum VI (Max_VI). EVI2 shows higher R2 (above the 1:1 line) and lower RMAE (below the 1:1 line) than NDVI. Statistics that are not significant (p-value < 0.05) from both NDVI and EVI2 are excluded.
Remotesensing 10 01489 g008
Figure 9. Comparison of R2 (a) and RMAE (b) from small cumulative (day 192–236) and maximum VIs for corn and soybeans from 2001–2015. Each point shows statistics for one year (2001–2015) from 20 counties. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Figure 9. Comparison of R2 (a) and RMAE (b) from small cumulative (day 192–236) and maximum VIs for corn and soybeans from 2001–2015. Each point shows statistics for one year (2001–2015) from 20 counties. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Remotesensing 10 01489 g009
Figure 10. Comparison of R2 (a) and RMAE (b) from the small cumulative VIs (NDVI and EVI2) during the period between day 150–300, and the maximum VIs for corn and soybeans from 2001–2015. Each point shows statistics for one year (2001–2015) from 20 counties. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Figure 10. Comparison of R2 (a) and RMAE (b) from the small cumulative VIs (NDVI and EVI2) during the period between day 150–300, and the maximum VIs for corn and soybeans from 2001–2015. Each point shows statistics for one year (2001–2015) from 20 counties. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Remotesensing 10 01489 g010
Figure 11. Comparison of R2 (a) and RMAE (b) from each county for small cumulative (day 192–236) and maximum VIs for corn and soybeans. Each point shows statistics for one county from 2001–2015. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Figure 11. Comparison of R2 (a) and RMAE (b) from each county for small cumulative (day 192–236) and maximum VIs for corn and soybeans. Each point shows statistics for one county from 2001–2015. Statistics that are not significant (p-value < 0.05) from both Max_VI and Cum_VI are excluded.
Remotesensing 10 01489 g011
Figure 12. Histogram of RMAE from the spatial and the temporal (inter-annual) models. It shows that it is harder (larger RMAEs) to capture temporal variability of yield (solid lines) than spatial variability (dashed lines) when using VI alone. The RMAEs were computed using small cumulative EVI2 from day 192–236.
Figure 12. Histogram of RMAE from the spatial and the temporal (inter-annual) models. It shows that it is harder (larger RMAEs) to capture temporal variability of yield (solid lines) than spatial variability (dashed lines) when using VI alone. The RMAEs were computed using small cumulative EVI2 from day 192–236.
Remotesensing 10 01489 g012
Table 1. The coefficient of determination (R2) and the relative mean absolute error (RMAE) of yields and maximum EVI2 separated by training (_t) and validation (_v) results for corn and soybean (all linear regressions are statistically significant with p-value < 0.01).
Table 1. The coefficient of determination (R2) and the relative mean absolute error (RMAE) of yields and maximum EVI2 separated by training (_t) and validation (_v) results for corn and soybean (all linear regressions are statistically significant with p-value < 0.01).
DatasetR2_tRMAE_tR2_vRMAE_vp-Value
Corn (2001–2015)
Landsat0.467.400.467.40<0.0001
MODIS0.576.090.586.40<0.0001
Landsat–MODIS0.635.980.596.11<0.0001
Soybean (2001–2015)
Landsat0.418.970.379.93<0.0001
MODIS0.319.030.2710.25<0.0001
Landsat–MODIS0.388.980.399.07<0.0001
Soybean (2001–2015, excluding 2003)
Landsat0.476.140.466.31<0.0001
MODIS0.635.280.625.22<0.0001
Landsat–MODIS0.585.450.585.43<0.0001
Soybean (2003 only)
Landsat0.346.49 0.0088
MODIS0.465.65 0.0014
Landsat–MODIS0.723.82 <0.0001

Share and Cite

MDPI and ACS Style

Gao, F.; Anderson, M.; Daughtry, C.; Johnson, D. Assessing the Variability of Corn and Soybean Yields in Central Iowa Using High Spatiotemporal Resolution Multi-Satellite Imagery. Remote Sens. 2018, 10, 1489. https://0-doi-org.brum.beds.ac.uk/10.3390/rs10091489

AMA Style

Gao F, Anderson M, Daughtry C, Johnson D. Assessing the Variability of Corn and Soybean Yields in Central Iowa Using High Spatiotemporal Resolution Multi-Satellite Imagery. Remote Sensing. 2018; 10(9):1489. https://0-doi-org.brum.beds.ac.uk/10.3390/rs10091489

Chicago/Turabian Style

Gao, Feng, Martha Anderson, Craig Daughtry, and David Johnson. 2018. "Assessing the Variability of Corn and Soybean Yields in Central Iowa Using High Spatiotemporal Resolution Multi-Satellite Imagery" Remote Sensing 10, no. 9: 1489. https://0-doi-org.brum.beds.ac.uk/10.3390/rs10091489

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop