## 1. Introduction

Net all-wave surface radiation (R_{n}), which characterizes the radiative energy available at the Earth’s surface, is the difference between total downward and upward radiation. R_{n} drives evaporation, evapotranspiration, and air and soil fluxes, as well as smaller energy-consuming processes such as photosynthesis [1,2]. R_{n} is a key component of the surface energy balance and largely determines the sensible and latent heat fluxes [3]. In agrometeorology, R_{n} is commonly used to estimate reference evapotranspiration and leaf wetness duration from physical models [4]. Thus, reliable spatial and temporal R_{n} information is required. However, directly measured R_{n} is available from only a very small number of standard radiometric observatories, because the instruments are expensive and need constant maintenance to provide reliable measurements [5], and these in-situ measurements cannot represent spatial variability. Alternative methods for R_{n} estimation have therefore been developed to compensate for the scarcity of experimental observations.

Mathematically, R_{n} consists of four components:

R_{n} = R_{ns} + R_{nl} = (R_{si} − R_{so}) + (R_{li} − R_{lo})    (1)

where R_{ns} and R_{nl} are the net shortwave and net longwave radiation (W∙m^{−2}), R_{si} is the incoming shortwave radiation (W∙m^{−2}), R_{so} is the reflected shortwave radiation (W∙m^{−2}), which is calculated as R_{so} = α R_{si}, α is the shortwave broadband albedo (dimensionless), R_{li} is the incoming longwave radiation (W∙m^{−2}), and R_{lo} is the outgoing longwave radiation (W∙m^{−2}). If all components in Equation (1) are known, the calculation of R_{n} is straightforward.
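As a minimal sketch of the component form of Equation (1), the following Python function computes R_{n} from the four quantities defined above. The function name, argument names, and sample values are illustrative assumptions and are not taken from the study.

```python
def net_radiation(r_si: float, albedo: float, r_li: float, r_lo: float) -> float:
    """Net all-wave radiation (W m^-2) from its four components.

    r_si   : incoming shortwave radiation (W m^-2)
    albedo : shortwave broadband albedo (dimensionless, 0..1)
    r_li   : incoming longwave radiation (W m^-2)
    r_lo   : outgoing longwave radiation (W m^-2)
    """
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must lie in [0, 1]")
    r_so = albedo * r_si   # reflected shortwave, R_so = alpha * R_si
    r_ns = r_si - r_so     # net shortwave
    r_nl = r_li - r_lo     # net longwave
    return r_ns + r_nl


# Made-up values chosen only to exercise the function.
print(net_radiation(r_si=800.0, albedo=0.2, r_li=350.0, r_lo=450.0))
```

The sketch assumes all four components are measured at the same place and time, which is the situation at an instrumented tower.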

Many radiation measurement towers measure these four components to determine net radiation, but these measurements are available only at individual sites. To estimate R_{n} at regional and global scales, various methods have recently been developed, such as generating R_{n} from various satellite data [6] or from meteorological reanalysis products [7,8,9]. Although these reanalysis products are spatially and temporally continuous, their spatial resolutions are coarse. To date, the finest spatial resolution for a global product is only about 0.3°. It is extremely difficult to estimate R_{n} directly from satellite data because frequent cloud coverage can block surface information. However, satellite data can be used to estimate incoming solar radiation under all-sky conditions [10,11]. Therefore, an important research goal is to develop robust methods for estimating accurate R_{n} at high spatial and temporal resolution from incoming shortwave radiation or net shortwave radiation, producing estimates that are location independent and universally applicable.

In recent years, numerous methods for accurately estimating R_{n} have been explored and developed. Initially, R_{n} was estimated only from the incoming solar radiation, R_{si} [12,13]. Surface albedo was then incorporated, with claims that it improved the fitting accuracy of the models [14], but later studies found that the improvement was mixed [15]. Later, other models were developed by incorporating more meteorological parameters (e.g., air temperature, cloud cover) and other variables (e.g., relative Earth-Sun distance, land cover represented by the Normalized Difference Vegetation Index (NDVI)), and these models were evaluated by many studies [16,17,18,19]. In addition to these empirical models, physically based models and hybrid models have also been developed. The physically based models usually estimate R_{n} by calculating the individual terms in Equation (1) separately, and most have focused on R_{nl} parameterization [20,21]. Some of these models focused on deriving R_{n} from satellite data to produce spatially and temporally continuous regional or global R_{n} estimates [22,23,24,25]. Recently, Jiang et al. [26] compared and evaluated seven popular empirical linear models and one newly developed model for R_{n} estimation using comprehensive global measurements. The results indicated that the linear empirical models performed well, but not well enough in all cases, and the authors therefore suggested that nonlinear empirical models should be considered for R_{n} estimation.

Although significant progress has been made in this area, some problems remain. Physically based models are thought to be more accurate than empirical models, but their calculations are usually more complex and time-consuming, especially because it is difficult to collect all the inputs. Therefore, empirical models are still the first option for practical applications. However, almost all popular empirical R_{n} estimation models consider only linear relationships between variables, and nonlinear models should therefore be explored. Ferreira et al. [27] and Geraldo-Ferreira et al. [28] calculated R_{n} using artificial neural networks (ANN) and demonstrated their applicability. However, only one ANN model was used, and the number of observations was too small to represent different atmospheric and environmental conditions. This suggests that more work in this research area is needed. Furthermore, the utility of other sources of data (e.g., model reanalysis products) in R_{n} estimation should be explored, because poor-quality or missing data are common in surface radiation measurements, meteorological variables, and even remotely sensed products.

To address these issues, the primary objective of this study was to develop nonlinear R_{n} estimation models using two ANNs based on multi-source data. The data included remotely sensed products, meteorological reanalysis products, and site observations. Two newly developed ANN models, a general regression neural network (GRNN) [29] and the Neuroet model [30], were used in this study. A GRNN has a multi-input-output architecture that differs from other ANN models, whereas Neuroet includes a novel procedure to determine the model architecture. The performance of these two ANNs for R_{n} estimation, especially their adaptability to different conditions globally, was compared and analyzed. The advantages and disadvantages of each model were thereby determined, and a sensitivity analysis was carried out for the GRNN.
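To make the GRNN idea concrete, the sketch below implements the standard textbook formulation of a GRNN, in which a prediction is a Gaussian-kernel-weighted average of the stored training targets and the kernel width `sigma` acts as the smoothing parameter. It is an illustration under that assumption, not the configuration used in this study; the function name and the toy inputs are hypothetical.

```python
import math
from typing import Sequence


def grnn_predict(
    x_train: Sequence[Sequence[float]],
    y_train: Sequence[float],
    x_new: Sequence[float],
    sigma: float,
) -> float:
    """Predict one target as a Gaussian-kernel-weighted mean of training targets."""
    if sigma <= 0.0:
        raise ValueError("sigma (smoothing parameter) must be positive")
    if len(x_train) == 0 or len(x_train) != len(y_train):
        raise ValueError("x_train and y_train must be non-empty and equally long")
    # Pattern layer: squared Euclidean distance to every stored training sample.
    d2 = [sum((a - b) ** 2 for a, b in zip(row, x_new)) for row in x_train]
    # Shifting by the smallest distance leaves the ratio unchanged but keeps
    # at least one weight equal to 1, so the denominator cannot underflow to 0.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    # Summation and output layers: weighted targets divided by total weight.
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)


# Toy, pre-scaled inputs (hypothetical): two predictors, three training samples.
x_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
y_train = [100.0, 300.0, 50.0]
print(grnn_predict(x_train, y_train, [0.5, 0.0], sigma=0.5))
```

In this formulation there is no iterative weight training: the training samples are stored, and every prediction visits all of them, so the cost of a prediction grows with the number of training samples.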

The remainder of this paper is organized as follows. Details of the ANN models and the data used in the study are provided in Section 2. Section 3 describes the analytical results. Discussions and concluding remarks are provided in Section 4.

## 4. Conclusions

High-resolution net radiation products for land surfaces are important for many applications. Two ANN models were developed for R_{n} estimation using multi-source data, including remotely sensed products, reanalysis products, and in-situ observations. To better understand the performance of the two models, comprehensive ground radiation measurements were collected for evaluation from 251 independent sites worldwide from 1992 through 2010, representing the major land cover types on Earth. The performance of the GRNN and Neuroet models was evaluated using the entire dataset (global mode) or four subsets based on surface albedo and NDVI values (conditional mode). The influence of scaling methods on these two models was also discussed. The importance of each input variable for the GRNN was examined using sensitivity analysis.

Based on extensive evaluations, the GRNN performed better and more stably than Neuroet in global mode: in validation, its estimates had a coefficient of determination (R^{2}) of 0.92, a root mean square error (RMSE) of 34.27 W∙m^{−2}, and a bias of −0.61 W∙m^{−2}. Neuroet can work as well as the GRNN in the conditional mode for four specific cases, and the GRNN conditional models performed similarly to the GRNN global model, which also demonstrates the robustness of the GRNN model.
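The three validation statistics quoted above can be computed as in the following sketch. The source does not spell out the formulas, so two conventions are assumed here: bias is the mean of (predicted − observed), and R^{2} is computed as 1 − SS_res/SS_tot. The function name and sample values are hypothetical.

```python
import math
from typing import Dict, Sequence


def evaluate(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return R^2, RMSE and bias for paired observations and predictions."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and equally long")
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n                      # mean error (assumed sign: pred - obs)
    ss_res = sum(e * e for e in errors)         # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values are constant; R^2 is undefined")
    r2 = 1.0 - ss_res / ss_tot                  # assumed definition of R^2
    return {"r2": r2, "rmse": rmse, "bias": bias}


# Made-up R_n values in W m^-2, for illustration only.
print(evaluate([100.0, 200.0, 300.0], [110.0, 190.0, 310.0]))
```

If the study instead used the squared Pearson correlation for R^{2}, the value would differ slightly from this definition whenever the predictions are biased.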

Furthermore, the structures of the two models can be re-built, which illustrates that the GRNN and Neuroet are not “black boxes” like other ANN models. In general, these two ANN models were found to be superior to linear regression models in terms of fitting accuracy, especially in some specific cases (i.e., S2 and S3), for which the fitting RMSEs were 33.82 and 15.66 W∙m^{−2} for the GRNN conditional models and 34.69 and 15.31 W∙m^{−2} for the Neuroet conditional models, respectively. Experimental results indicated that the Z-score normalization method was preferable for the GRNN and Neuroet models in this study, but that highly dispersed inputs would affect GRNN performance. Sensitivity analysis of the GRNN model suggested that R_{si}, ABD, NDVI, T_{min}, RH, and CI were the major contributors to predictive accuracy. The time needed for model training and prediction remained almost unchanged across the sensitivity tests, which indicated that the computational efficiency of the GRNN was determined mostly by the size of the training sample.
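For reference, Z-score normalization rescales each input variable to zero mean and unit standard deviation. The sketch below shows one common form; the use of the population standard deviation (dividing by n) and the function name are assumptions, since the source does not give these details.

```python
import math
from typing import List, Sequence, Tuple


def zscore(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale one variable to zero mean and unit (population) standard deviation.

    Returns the scaled values together with the mean and standard deviation,
    so that the same transform can be reapplied to new samples.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("constant input cannot be z-score normalized")
    return [(v - mean) / std for v in values], mean, std


# Made-up incoming shortwave values (W m^-2), for illustration only.
scaled, mean, std = zscore([200.0, 400.0, 600.0, 800.0])
print(scaled, mean, std)
```

In practice the mean and standard deviation would be computed on the training samples and then reused unchanged for validation samples.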

Although the GRNN performed better than Neuroet in this study, some limitations remain. First, the time required for GRNN model training and prediction is determined by sample size, which means that the GRNN is not suitable for very large datasets, whereas Neuroet requires less time. Second, under- or over-fitting is hard to detect in a GRNN because the entire training procedure is automatic and only the smoothing parameter can be adjusted. By contrast, the process of determining the optimal number of hidden neurons and other key parameters in Neuroet can help in obtaining the optimal model. Third, the importance of each independent variable can be easily obtained from Neuroet, but not from the GRNN. In summary, although the GRNN and Neuroet each have their own advantages and shortcomings, both work well for R_{n} estimation compared with linear regression models, and both could therefore be highly useful tools for future R_{n} estimation.

In the future, additional efforts should be made to improve the optimization methods for determining the smoothing parameter in the GRNN, which is essential to shortening the running time.