Article

Robust Estimation for Semi-Functional Linear Model with Autoregressive Errors

1 Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming 650091, China
2 City College, Kunming University of Science and Technology, Kunming 650500, China
3 School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China
4 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
* Author to whom correspondence should be addressed.
Submission received: 6 November 2022 / Revised: 31 December 2022 / Accepted: 2 January 2023 / Published: 5 January 2023
(This article belongs to the Special Issue Recent Advances in Computational Statistics)

Abstract: It is well known that traditional functional regression models are mainly based on the least squares or likelihood method. These methods usually rely on strong assumptions, such as independence and normality of the errors, that are not always satisfied: the response variable may contain outliers, and the error term may be serially correlated. Violation of these assumptions can adversely affect model estimation. We therefore develop a robust estimation procedure for a semi-functional linear model with autoregressive errors. We compare the efficiency of our procedure with that of the least squares method through a simulation study and two real data analyses. The results show that the proposed method outperforms the least squares method whenever the random errors follow a heavy-tailed distribution.

1. Introduction

As a powerful tool in functional data analysis (FDA), functional regression modeling has been extensively studied in the past several decades. For example, Cardot et al. [1,2] introduced the functional linear model, in which a scalar response variable is explained by a functional predictor. Yao et al. [3] proposed a functional linear regression model to analyse sparse longitudinal data. Hall and Horowitz [4] obtained the optimal convergence rates of estimators based on functional principal component analysis. More recently, Li et al. [5] extended classical clusterwise linear regression to incorporate multiple functional predictors and proposed clusterwise functional linear regression models. For the functional nonparametric model, Ferraty and Vieu [6] extended kernel regression to functional data, Baíllo and Grané [7] proposed a local linear regression approach, and Burba et al. [8] applied the k-Nearest Neighbour method. Moreover, to improve the predictive power or interpretability of functional regression models, several functional semiparametric models have been developed. For instance, Aneiros-Pérez and Vieu [9] first introduced the semi-functional partial linear regression model, which Aneiros-Pérez and Vieu [10] later used to predict time series. Shin [11] proposed new estimators for a partial functional linear model that explores the relationship between a scalar response variable and mixed-type predictors. Zhou and Chen [12] introduced a semi-functional linear model combining the features of a functional linear regression model and a nonparametric regression model.
It is well known that traditional functional regression models are mainly estimated by the least squares or likelihood method. However, a small proportion of outliers or a heavy-tailed error distribution can dramatically deteriorate the efficiency of such estimators. To tackle this problem, many robust estimation methods have been introduced into functional regression. For example, Yu et al. [13] proposed a robust estimation method based on the exponential squared loss for the semi-functional linear regression model. Subsequently, Yu et al. [14] proposed a robust estimation procedure for a partial functional linear model based on modal regression. Cai et al. [15] constructed a robust estimator with a modified Huber loss for partial functional linear models. Cao and Sun [16] studied robust estimation of partial functional linear regression models based on functional principal component analysis and weighted composite quantile regression. Tang [17] introduced a robust estimation method for the semi-functional linear model, and Boente et al. [18] constructed new robust estimators for semi-functional linear regression models with a bounded loss function and a preliminary residual scale estimator. Recently, Boente and Parada [19] considered robust estimation for functional quadratic regression models, and Pannu and Billor [20] proposed a functional adaptive group LASSO variable selection method based on the weighted least absolute deviation. In the aforementioned literature, the random errors are usually assumed to be independent and identically distributed. This assumption may be inappropriate in practice, especially when observations are collected sequentially over time; yet, as far as we know, only a few works have focused on robust estimation for regression models with serially correlated errors. For example, Sinha et al. [21] considered robust estimation of nonlinear regression with autoregressive errors, Riazoshams et al. [22] studied the performance of a robust two-stage estimator in nonlinear regression with autocorrelated errors, and Serenay and Baris [23] recently proposed a novel hybrid robust tapering approach for nonlinear regression in the presence of autocorrelation and outliers. Motivated by these works, this paper develops robust estimation for a semi-functional linear model with autoregressive errors (SFLAR). The rest of the paper is organized as follows. Section 2 introduces the robust estimation procedure and algorithm. Simulation studies are presented in Section 3. Section 4 illustrates the performance of the robust estimation method on two real data sets. Section 5 concludes.

2. Model and Estimation

Suppose that $\{(x_i(t), y_i, z_i)\}_{i=1}^{n}$ satisfies the following semi-functional linear model with autoregressive errors (SFLAR):

$$y_i = \int_T \beta(t)\, x_i(t)\, dt + g(z_i) + \varepsilon_i, \qquad \varepsilon_i = \sum_{l=1}^{q} a_l \varepsilon_{i-l} + e_i, \tag{1}$$
where $y_i$ is a real-valued random response variable, and $z_i$ is a real-valued covariate defined on a compact interval $\mathcal{Z} = [\underline{z}, \overline{z}]$. The functional covariate $x_i(t)$, defined on an interval $T \subset \mathbb{R}$, is a zero-mean, second-order (i.e., $E|x(t)|^2 < \infty$ for all $t \in T$) stochastic process defined on $(\Omega, \mathcal{B}, P)$ with sample paths in the Hilbert space $L^2(T)$. The inner product on $L^2(T)$ is defined by $\langle u, v \rangle = \int_T u(t) v(t)\, dt$ for any $u, v \in L^2(T)$, with norm $\|u\| = \langle u, u \rangle^{1/2}$. The unknown slope function $\beta(t)$ belongs to $L^2(T)$. Without loss of generality, throughout this paper we assume $T = [0, 1]$. The nonparametric function $g(z)$ is an unknown smooth function; $e_i$ is a random error with zero mean and finite variance, independent of $(x_i(t), z_i)$.
Since the unknown functions $\beta(t)$ and $g(z)$ are infinite-dimensional parameters, we cannot obtain their estimators directly via an optimization method, so some dimension reduction is needed. Specifically, the covariance function $C_x$ of the functional variable $x$ is defined by

$$C_x(s, u) = \mathrm{Cov}[x(s), x(u)], \tag{2}$$

where $C_x(s, u)$ is continuous on $[0, 1]^2$, and the eigenvalues of $C_x$, denoted by $\{\lambda_j\}_{j=1}^{\infty}$, satisfy $\lambda_1 > \lambda_2 > \cdots > 0$. By Mercer's theorem, $C_x(s, u)$ can be expanded as

$$C_x(s, u) = \sum_{j=1}^{\infty} \lambda_j \phi_j(s) \phi_j(u), \tag{3}$$
where the $\phi_j$ are continuous orthonormal eigenfunctions and the $\lambda_j$ are the corresponding eigenvalues. Obviously, the eigenfunction sequence $\{\phi_j\}_{j=1}^{\infty}$ forms an orthonormal basis of $L^2[0, 1]$. Then, the random function $x_i(t)$ and the slope function $\beta(t)$ can be expressed as

$$x_i(t) = \sum_{j=1}^{\infty} \xi_{ij} \phi_j(t), \qquad \beta(t) = \sum_{j=1}^{\infty} b_j \phi_j(t), \tag{4}$$

where $\xi_{ij} = \int_T x_i(t) \phi_j(t)\, dt$ are the functional principal component scores, satisfying $E\xi_{ij} = 0$ and $E\xi_{ij}\xi_{ik} = \lambda_j I(j = k)$, and $b_j = \int_T \beta(t) \phi_j(t)\, dt$.
Moreover, let $S_{r, N_n}$ be the space of polynomial splines on $[\underline{z}, \overline{z}]$ of order $r$ (degree $r - 1$) with knot sequence $\underline{z} < z_1 < \cdots < z_{N_n} < \overline{z}$. The space $S_{r, N_n}$ is a $K$-dimensional linear space with $K = K(n) = N_n + r$. Following the arguments of de Boor [24], if $g(z)$ is sufficiently smooth, it can be approximated as

$$g(z) \approx \sum_{k=1}^{K} c_k B_k(z), \tag{5}$$
where $B_k$, $k = 1, 2, \ldots, K$, are the B-spline basis functions of $S_{r, N_n}$. For more details on spline functions, we refer to de Boor [24]. Substituting Equations (4) and (5) into Equation (1) yields

$$y_i \approx \sum_{j=1}^{J} \xi_{ij} b_j + \sum_{k=1}^{K} c_k B_k(z_i) + \varepsilon_i. \tag{6}$$
However, because the covariance function $C_x$ is unknown, the eigenfunctions $\phi_j$ are also unknown, and the variables $\xi_{ij}$ are unobservable. To tackle this problem, we define the empirical version of the covariance function $C_x$ by

$$\hat{C}_x(s, u) = \frac{1}{n} \sum_{i=1}^{n} x_i(s)\, x_i(u), \tag{7}$$

and define the estimated eigenelements of $\hat{C}_x$ by

$$\int \hat{C}_x(s, u)\, \hat{\phi}_j(u)\, du = \hat{\lambda}_j \hat{\phi}_j(s), \qquad 1 \le j \le n. \tag{8}$$

By Mercer's theorem, we have

$$\hat{C}_x(s, u) = \sum_{j=1}^{n} \hat{\lambda}_j \hat{\phi}_j(s) \hat{\phi}_j(u), \tag{9}$$

where $(\hat{\lambda}_j, \hat{\phi}_j)$ are the (eigenvalue, eigenfunction) pairs of the linear operator with kernel $\hat{C}_x$, ordered such that $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge 0$. We regard $(\hat{\lambda}_j, \hat{\phi}_j)$ as an estimator of $(\lambda_j, \phi_j)$.
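To make this estimation step concrete, the following sketch (ours, not from the paper) computes the empirical eigenpairs and scores when each curve is recorded on a common, equally spaced grid; the names `X`, `grid`, and `empirical_fpca` are illustrative.

```python
# A minimal sketch of the empirical FPCA step, assuming the n curves are
# observed on a common equally spaced grid and stored row-wise in X (n x m).
import numpy as np

def empirical_fpca(X, grid, J):
    """Estimate the first J eigenpairs of C_hat_x and the scores xi_hat_ij."""
    n, m = X.shape
    dt = grid[1] - grid[0]                    # grid spacing for Riemann sums
    C_hat = X.T @ X / n                       # C_hat(s, u) on the grid, Eq. (7)
    # The integral operator of Eq. (8) is approximated by C_hat * dt
    eigvals, eigvecs = np.linalg.eigh(C_hat * dt)
    order = np.argsort(eigvals)[::-1]         # sort eigenvalues in decreasing order
    lam_hat = eigvals[order][:J]
    phi_hat = eigvecs[:, order][:, :J] / np.sqrt(dt)   # so that int phi_j^2 dt = 1
    xi_hat = X @ phi_hat * dt                 # xi_hat_ij = int x_i(t) phi_hat_j(t) dt
    return lam_hat, phi_hat, xi_hat
```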
Moreover, Equation (6) can then be written as

$$y_i \approx \hat{U}_i b + B_i c + \sum_{l=1}^{q} a_l \left( y_{i-l} - \hat{U}_{i-l} b - B_{i-l} c \right) + e_i, \qquad q + 1 \le i \le n. \tag{10}$$
Furthermore, denote

$$L_n(a, b, c) = \frac{1}{n - q} \sum_{i=q+1}^{n} \rho\!\left( y_i - \hat{U}_i b - B_i c - \sum_{l=1}^{q} a_l \left( y_{i-l} - \hat{U}_{i-l} b - B_{i-l} c \right) \right), \tag{11}$$

where $\rho(\cdot)$ is a proper robust loss function, $a = (a_1, \ldots, a_q)^\top$, $b = (b_1, \ldots, b_J)^\top$, $c = (c_1, \ldots, c_K)^\top$, $\hat{U}_i = (\hat{\xi}_{i1}, \hat{\xi}_{i2}, \ldots, \hat{\xi}_{iJ})$, and $B_i = (B_1(z_i), \ldots, B_K(z_i))$. Therefore, $\hat{a}$, $\hat{b}$ and $\hat{c}$ can be obtained by solving the following minimization problem:

$$\min_{a, b, c} L_n(a, b, c). \tag{12}$$
Finally, the estimators of the slope function $\beta(t)$ and the nonparametric function $g(z)$ are given by $\hat{\beta}(t) = \sum_{j=1}^{J} \hat{b}_j \hat{\phi}_j(t)$ and $\hat{g}(z) = \sum_{k=1}^{K} \hat{c}_k B_k(z)$, respectively.
Various loss functions $\rho(\cdot)$ have been proposed for robust estimation when outliers are present in the data. For example, the least absolute deviations (LAD) loss, $\rho(r) = |r|$, corresponds to median regression. Although the LAD loss is insensitive to outliers, it is not smooth at zero, which causes difficulties in model fitting. Therefore, considering the requirements of robustness and differentiability of the loss function, the Huber loss function (HB) [25] is adopted in this paper; it is convex, smooth, and symmetric about 0. Hence, the optimization problem (12) is well defined, and the local minimum problem is avoided.
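For concreteness, a minimal sketch of the Huber loss and its derivative $\psi$ is given below. The tuning constant c = 1.345 is the usual choice giving 95% efficiency under Gaussian errors; the paper does not state its value, so this is an assumption.

```python
import numpy as np

def huber_rho(r, c=1.345):
    """Huber loss: quadratic for |r| <= c, linear beyond c."""
    return np.where(np.abs(r) <= c, 0.5 * r ** 2, c * np.abs(r) - 0.5 * c ** 2)

def huber_psi(r, c=1.345):
    """Derivative of the Huber loss; psi(r) / r yields the IRLS weights."""
    return np.clip(r, -c, c)
```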
To implement the proposed method, we approximate the unknown function $g(z)$ by B-splines of fixed degree three with equally spaced knots, so only the numbers of eigenfunctions $J$ and B-spline basis functions $K$ need to be chosen. Many criteria, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), can be used for this purpose. In this paper, we choose the truncation parameters $J$ and $K$ by minimizing the following BIC criterion:
$$\mathrm{BIC}(J, K) = \log\!\left( \sum_{i=q+1}^{n} \rho\!\left( y_i - \sum_{j=1}^{J} \hat{b}_j \hat{\xi}_{ij} - \sum_{k=1}^{K} \hat{c}_k B_k(z_i) \right) \right) + \frac{\log n}{2n} (J + K), \tag{13}$$

which is a function of $J$ and $K$; the smaller the BIC value, the better the fit.
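A grid search over $(J, K)$ is straightforward; the sketch below is one way to organise it, where `fit_sflar` is a hypothetical wrapper around the TSIRLS fit described next that returns the residuals for a given $(J, K)$.

```python
import numpy as np

def select_JK(fit_sflar, J_grid, K_grid, n, rho):
    """Pick (J, K) minimizing the BIC of Equation (13)."""
    best_JK, best_bic = None, np.inf
    for J in J_grid:
        for K in K_grid:
            resid = fit_sflar(J, K)   # residuals y_i - U_hat_i b_hat - B_i c_hat
            bic = np.log(np.sum(rho(resid))) + np.log(n) / (2 * n) * (J + K)
            if bic < best_bic:
                best_JK, best_bic = (J, K), bic
    return best_JK
```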
Furthermore, denote $\theta = (b^\top, c^\top)^\top$ and $\hat{H}_i = (\hat{U}_i, B_i)$. Then, Equation (11) can be written as

$$L_n(a, \theta) = \frac{1}{n - q} \sum_{i=q+1}^{n} \rho\!\left( y_i - \hat{H}_i \theta - \sum_{l=1}^{q} a_l \left( y_{i-l} - \hat{H}_{i-l} \theta \right) \right). \tag{14}$$

Moreover, Equation (11) can be rewritten as

$$L_n(a, \theta) = \frac{1}{n - q} \sum_{i=q+1}^{n} \rho\left( V_i - D_i \theta \right), \tag{15}$$
where $V_i = y_i - \sum_{l=1}^{q} a_l y_{i-l}$ and $D_i = \hat{H}_i - \sum_{l=1}^{q} a_l \hat{H}_{i-l}$. Then, the estimators of the parameter vectors $a = (a_1, a_2, \ldots, a_q)^\top$ and $\theta = (b^\top, c^\top)^\top$ can be obtained by minimizing Equation (15). However, the objective function $L_n(a, \theta)$ has no analytic solution for the unknown parameters. Therefore, this paper uses a two-step iteratively reweighted least squares (TSIRLS) algorithm to obtain a local optimal solution $(\hat{a}, \hat{\theta})$ of Equation (15). The TSIRLS algorithm is summarized as follows:
Initialization: compute initial values $\hat{\theta}^{(0)}$ and $\hat{a}^{(0)}$ using the LS estimators, and set $k = 0$.

Step 1. Update $\hat{\varepsilon}_i^{(k)}$ and $W^{(k)} = \mathrm{diag}\{w_1^{(k)}, \ldots, w_n^{(k)}\}$:

$$\hat{\varepsilon}_i^{(k)} = y_i - \hat{H}_i \hat{\theta}^{(k)}, \qquad w_i^{(k)} = \frac{\psi(\hat{\varepsilon}_i^{(k)})}{\hat{\varepsilon}_i^{(k)}};$$

Step 2. Update $\tilde{V}^{(k)}$ and $\tilde{D}^{(k)}$:

$$\tilde{V}^{(k)} = \begin{pmatrix} \hat{\varepsilon}_{q+1}^{(k)} \\ \vdots \\ \hat{\varepsilon}_{n}^{(k)} \end{pmatrix}, \qquad \tilde{D}^{(k)} = \begin{pmatrix} \hat{\varepsilon}_{q}^{(k)} & \cdots & \hat{\varepsilon}_{1}^{(k)} \\ \vdots & & \vdots \\ \hat{\varepsilon}_{n-1}^{(k)} & \cdots & \hat{\varepsilon}_{n-q}^{(k)} \end{pmatrix};$$

Step 3. Update $\hat{e}_i^{(k)}$ and $\tilde{W}^{(k)} = \mathrm{diag}\{\tilde{w}_1^{(k)}, \ldots, \tilde{w}_n^{(k)}\}$:

$$\hat{e}_i^{(k)} = \hat{\varepsilon}_i^{(k)} - \sum_{l=1}^{q} \hat{a}_l^{(k)} \hat{\varepsilon}_{i-l}^{(k)}, \qquad \tilde{w}_i^{(k)} = \frac{\psi(\hat{e}_i^{(k)})}{\hat{e}_i^{(k)}};$$

Step 4. Update $\hat{a}^{(k+1)}$:

$$\hat{a}^{(k+1)} = \left( \tilde{D}^{(k)\top} \tilde{W}^{(k)} \tilde{D}^{(k)} \right)^{-1} \tilde{D}^{(k)\top} \tilde{W}^{(k)} \tilde{V}^{(k)};$$

Step 5. Update $V_i^{(k)}$ and $D_i^{(k)}$:

$$V_i^{(k)} = y_i - \sum_{l=1}^{q} \hat{a}_l^{(k+1)} y_{i-l}, \qquad D_i^{(k)} = \hat{H}_i - \sum_{l=1}^{q} \hat{a}_l^{(k+1)} \hat{H}_{i-l};$$

Step 6. Update $\hat{\theta}^{(k+1)}$:

$$\hat{\theta}^{(k+1)} = \left( D^{(k)\top} W^{(k)} D^{(k)} \right)^{-1} D^{(k)\top} W^{(k)} V^{(k)},$$

where

$$V^{(k)} = \begin{pmatrix} V_{q+1}^{(k)} \\ \vdots \\ V_{n}^{(k)} \end{pmatrix}, \qquad D^{(k)} = \begin{pmatrix} D_{q+1}^{(k)} \\ \vdots \\ D_{n}^{(k)} \end{pmatrix};$$

Step 7. Set $k = k + 1$ and repeat the above steps until $\|\hat{\theta}^{(k+1)} - \hat{\theta}^{(k)}\|_1 + \|\hat{a}^{(k+1)} - \hat{a}^{(k)}\|_1 \le 10^{-8}$. The estimates of the parameter vectors $\hat{a} = (\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_q)^\top$, $\hat{b} = (\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_J)^\top$, and $\hat{c} = (\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_K)^\top$ are then obtained.
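A compact sketch of the TSIRLS loop follows. It assumes the design matrix `H` (rows $\hat{H}_i = (\hat{U}_i, B_i)$) has already been built and uses the Huber $\psi$ from above; it is a direct transcription of Steps 1 to 7, not the authors' code, and a small constant guards the division when a residual is exactly zero.

```python
import numpy as np

def tsirls(y, H, q, psi, tol=1e-8, max_iter=200):
    """Two-step IRLS for the SFLAR model; returns (a_hat, theta_hat)."""
    n = len(y)
    theta = np.linalg.lstsq(H, y, rcond=None)[0]        # LS initialisation of theta
    eps = y - H @ theta
    D0 = np.column_stack([eps[q - l:n - l] for l in range(1, q + 1)])
    a = np.linalg.lstsq(D0, eps[q:], rcond=None)[0]     # LS initialisation of a
    safe = lambda r: np.where(r == 0, 1e-10, r)         # avoid division by zero
    for _ in range(max_iter):
        # Steps 1-2: residuals, their weights, and the lagged-residual design
        eps = y - H @ theta
        w = psi(eps[q:]) / safe(eps[q:])
        V_t = eps[q:]
        D_t = np.column_stack([eps[q - l:n - l] for l in range(1, q + 1)])
        # Steps 3-4: innovation weights and weighted LS update of a
        e = V_t - D_t @ a
        w_t = psi(e) / safe(e)
        Dw = D_t * w_t[:, None]
        a_new = np.linalg.solve(Dw.T @ D_t, Dw.T @ V_t)
        # Step 5: filtered response and design using the updated a
        V = y[q:] - sum(a_new[l - 1] * y[q - l:n - l] for l in range(1, q + 1))
        D = H[q:] - sum(a_new[l - 1] * H[q - l:n - l] for l in range(1, q + 1))
        # Step 6: weighted LS update of theta = (b, c)
        Dw = D * w[:, None]
        theta_new = np.linalg.solve(Dw.T @ D, Dw.T @ V)
        # Step 7: stop when the L1 change in (theta, a) is below the tolerance
        if np.abs(theta_new - theta).sum() + np.abs(a_new - a).sum() <= tol:
            return a_new, theta_new
        theta, a = theta_new, a_new
    return a, theta
```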

3. Simulation

In this section, we conduct simulation studies to investigate the finite-sample performance of the proposed method. The simulation data $\{(x_i(t), y_i, z_i), 1 \le i \le n\}$ are generated from the following SFLAR model:

$$y_i = \int_T \beta(t)\, x_i(t)\, dt + g(z_i) + \varepsilon_i, \qquad \varepsilon_i = \sum_{l=1}^{q} a_l \varepsilon_{i-l} + e_i,$$

where $z_i$ is distributed uniformly on $[-1, 1]$, and the functional predictor is $x_i(t) = \sum_{j=1}^{50} \xi_{ij} \phi_j(t)$, with the $\xi_{ij}$ independent normal with mean 0 and variance $\lambda_j = ((j - 0.5)\pi)^{-1}$, and $\phi_j(t) = \sqrt{2}\sin((j - 0.5)\pi t)$. For the unknown functions $\beta(\cdot)$ and $g(\cdot)$ and the autoregressive coefficients $(a_1, \ldots, a_q)$, we set $\beta(t) = \sqrt{2}\sin(0.5\pi t) + 3\sqrt{2}\sin(1.5\pi t)$, $g(z) = z + 3\cos(2\pi z)$, and a first-order autoregressive error $AR(1)$: $\varepsilon_t = 0.8\varepsilon_{t-1} + e_t$. We evaluate the performance of various loss functions with sample sizes $n = 100$, $300$ and $500$; each experiment is repeated 1000 times. Similar to Cai et al. [15], the following four error distribution scenarios are considered (a data-generation sketch follows the list below):
(1) $e_i \sim N(0, 1)$;
(2) $e_i \sim \pi N(0, 1) + (1 - \pi) N(0, 100)$, where $\pi = 0.8$;
(3) $e_i \sim t(2)$;
(4) $e_i \sim \mathrm{Cauchy}(0, 1)$.
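For illustration, the following sketch generates one replicate under Scenario (1); the grid size m = 100 and the AR burn-in length are our illustrative choices and are not specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(2023)
n, m, burn = 500, 100, 200
t = np.linspace(0, 1, m)
j = np.arange(1, 51)
lam = 1.0 / ((j - 0.5) * np.pi)                       # lambda_j = ((j - 0.5) pi)^(-1)
phi = np.sqrt(2) * np.sin(np.outer(t, (j - 0.5) * np.pi))   # phi_j(t), shape (m, 50)
xi = rng.normal(0.0, np.sqrt(lam), size=(n, 50))      # xi_ij ~ N(0, lambda_j)
X = xi @ phi.T                                        # curves x_i(t), shape (n, m)
z = rng.uniform(-1.0, 1.0, n)
beta = np.sqrt(2) * np.sin(0.5 * np.pi * t) + 3 * np.sqrt(2) * np.sin(1.5 * np.pi * t)
g = z + 3 * np.cos(2 * np.pi * z)
e = rng.standard_normal(n + burn)                     # Scenario (1): e_i ~ N(0, 1)
eps = np.zeros(n + burn)
for i in range(1, n + burn):                          # AR(1): eps_i = 0.8 eps_{i-1} + e_i
    eps[i] = 0.8 * eps[i - 1] + e[i]
y = X @ beta * (t[1] - t[0]) + g + eps[burn:]         # Riemann sum for the integral
```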
Moreover, according to one of the reviewers’ suggestions, we also consider different values of π in Scenario 2. Specifically, the values of π are taken as 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9.
To evaluate the performance of our estimators, we define the integrated mean square errors (IMSE) of the estimators of $\beta(t)$ and $g(z)$,

$$\mathrm{IMSE}_1 = \int_0^1 [\hat{\beta}(t) - \beta(t)]^2\, dt, \qquad \mathrm{IMSE}_2 = \int_0^1 [\hat{g}(z) - g(z)]^2\, dz,$$

and the mean square prediction error (MSPE):

$$\mathrm{MSPE} = \frac{1}{n} \sum_{i=q+1}^{n} \left( \hat{y}_i - g(z_i) - \int_0^1 \beta(t)\, x_i(t)\, dt \right)^2,$$

where $\hat{y}_i = \hat{g}(z_i) + \int_T \hat{\beta}(t)\, x_i(t)\, dt$. To obtain $\hat{\beta}(t)$ and $\hat{g}(z_i)$, the two tuning parameters $J$ and $K$ are selected by minimizing the BIC value, with $J$ searched over 1 to 6 and $K$ over 3 to 6.
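Under the same gridded setup as the data-generation sketch, these criteria can be computed as below; the trapezoidal integration and the variable names are our choices.

```python
import numpy as np

def imse(f_hat, f, grid):
    """Integrated squared error of an estimated function on a grid."""
    return np.trapz((f_hat - f) ** 2, grid)

def mspe(y_hat, g_true, X, beta_true, t, q):
    """Mean square prediction error against the true regression function."""
    signal = g_true + X @ beta_true * (t[1] - t[0])   # g(z_i) + int beta(t) x_i(t) dt
    return np.mean((y_hat[q:] - signal[q:]) ** 2)
```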
We compare the finite-sample performance of the SFLAR model under ordinary least squares (OLS), the least absolute deviations (LAD) loss, and the Huber (HB) loss function. Further, following one of the reviewers' suggestions, we also consider the exponential squared (EXP) loss function (Yu et al. [13]). To implement the robust method of Yu et al. [13], the tuning parameters $h$, $J$ and $K$ need to be selected; here they are chosen by cross-validation and BIC. Specifically, for a fixed $h$, the tuning parameters $J$ and $K$ are selected by BIC; then, given the selected $J$ and $K$, as in Jiang [26], the tuning parameter $h$ is chosen by 5-fold cross-validation (CV) over the grid $h = 0.5 \times 1.02^l$ $(l = 0, 1, \ldots, 100)$. We repeat this process until the tuning parameters no longer change.
Table 1 reports the bias and mean square error (MSE) of the estimator $\hat{a}$ under different error distributions. Table 2 and Table 3 present the sample mean and standard deviation (SD) of $\mathrm{IMSE}_1$ and $\mathrm{IMSE}_2$ under different error distributions. Table 4 lists the MSPE results of model prediction under different error distributions for the different estimation methods. The symbol "-" indicates values much larger than the others, which are therefore not listed. Figure 1 displays the sample means of $\mathrm{IMSE} = \mathrm{IMSE}_1 + \mathrm{IMSE}_2$ for the functional parameters $\beta(t)$ and $g(z)$, and the MSE ($\times 10^3$) of the autoregressive coefficient $a$, when $n = 500$ and $\pi$ takes values in $\{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9\}$ in Scenario (2).
From the simulation results in Table 1, Table 2, Table 3 and Table 4 and Figure 1, we can conclude the following:
(1) For the autoregressive coefficient vector $a$, the estimates from OLS, LAD, EXP and HB are all reasonably good whether the error distribution is normal or heavy-tailed, which indicates that the estimator $\hat{a}$ is not sensitive to the distribution of the random error. At the same time, the bias and MSE of HB and EXP are smaller than those of OLS, so the robust estimators based on HB and EXP outperform the OLS method.
(2) For the slope function $\beta(t)$ and the nonparametric function $g(z)$, the OLS estimator behaves badly when the error distribution is non-normal, particularly when the random error is heavy-tailed. In contrast, the LAD, EXP and HB estimators are much better than OLS. Even when the expectation and variance of the error do not exist, the robust estimation method remains feasible.
(3) Whether the distribution of the random error is normal or heavy-tailed, the bias and MSE of $\hat{a}$ and the mean and SD of $\mathrm{IMSE}_1$, $\mathrm{IMSE}_2$ and MSPE based on the LAD, EXP and HB methods decrease as the sample size increases. Hence, the proposed robust procedure is effective.

4. Applications to Real Data

In this section, we illustrate the proposed robust estimation procedure for the semi-functional linear model with autoregressive errors through two real data analyses. The first example uses electricity consumption data, and the second the Tecator data.

4.1. Electricity Consumption Data

This subsection aims to compare the proposed robust estimation method with the OLS method, that is, to examine whether the proposed robust method is as good as OLS estimation. For this purpose, we consider electricity consumption data, available at http://www.eia.doe.gov/emeu/aer (accessed on 31 December 2022). The data set includes the monthly electricity consumption $c_i$ (KWH) from January 1973 to January 2016 (517 months; see Figure 2) and the annual average retail price $z_i$ (per KWH, including tax) over 43 years.
As shown in Figure 2, the time series exhibits a clear linear trend and some heterogeneity in the variance structure. Similar to Aneiros-Pérez and Vieu [10] and Yu et al. [27], we eliminate the heteroscedasticity and the linear trend by differencing the logarithm of the electricity data, obtaining the time series (see Figure 3) $X_i = \log c_{i+1} - \log c_i$, $i = 1, 2, \ldots, 516$. Let $x_i(s) = X_{12(i-1)+s}$, $i = 1, 2, \ldots, 43$, $s = 1, 2, \ldots, 12$, denote the monthly differenced log electricity data.
Since there is only one observation per month, we use 10 cubic B-spline basis functions to convert the discrete monthly data into functional, continuous, smooth annual curve predictors $\{x_i(t), t \in [1, 12], i = 1, 2, \ldots, 43\}$ (see Figure 4). The response variable is $y_i(s) = X_{12i+s}$, representing the differenced log consumption in month $s$ of year $i$, where $i = 1, 2, \ldots, 42$, $s = 1, 2, \ldots, 12$. The data set is divided into the training sample $\{y_i(s), z_i, x_i(t)\}_{i=1}^{41}$ and the test sample $\{y_{42}(s), z_{42}, x_{42}(t)\}$.
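A sketch of this preprocessing is given below, assuming the 517 monthly consumption values are already loaded into a one-dimensional array `c`; the B-spline smoothing of each annual curve (10 cubic basis functions) is omitted for brevity.

```python
import numpy as np

X = np.diff(np.log(c))                     # X_i = log c_{i+1} - log c_i, length 516
curves = X.reshape(43, 12)                 # row i-1 holds x_i(s) = X_{12(i-1)+s}
x_pred = curves[:42]                       # predictors x_i, i = 1, ..., 42
y_resp = curves[1:]                        # responses y_i(s) = X_{12 i + s}
x_train, y_train = x_pred[:41], y_resp[:41]    # training years 1, ..., 41
x_test, y_test = x_pred[41], y_resp[41]        # test: predict y_42(s)
```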
To ascertain whether there are outliers in the data set, boxplots of the electricity consumption data from January to December are displayed in Figure 5. As shown in Figure 5, there are some outliers in March and November, so the robust method may perform better than OLS. Meanwhile, to determine whether the random errors are serially correlated, we first display the autocorrelation plot (see Figure 6) of the residual sequences $\{\hat{\varepsilon}_i(s)\}_{i=1}^{41}$, $s = 1, \ldots, 12$, of the following semi-functional linear model (SFLM), fitted by the HB estimation procedure:

$$y_i(s) = \int_1^{12} \beta(t)\, x_i(t)\, dt + g(z_i) + \varepsilon_i(s), \qquad i = 1, 2, \ldots, 41; \quad s = 1, \ldots, 12.$$
From Figure 6, it can be seen that serial correlation is present in the random error sequences $\varepsilon_i(8)$ and $\varepsilon_i(9)$. We further use the Ljung-Box (LB) test statistic to test for serial correlation up to fourth order; the corresponding $p$ values are 0.0049 and 0.0013, respectively. At the 0.05 significance level, we conclude that the random error sequences $\varepsilon_i(8)$ and $\varepsilon_i(9)$ exhibit fourth-order serial correlation, consistent with Figure 6. Hence, we use the robust method to estimate the following SFLAR model on the training set (a sketch of the LB test follows the model below):
$$y_i(s) = \int_1^{12} \beta(t)\, x_i(t)\, dt + g(z_i) + \varepsilon_i, \qquad \varepsilon_i = \sum_{l=1}^{q} a_l \varepsilon_{i-l} + e_i, \quad i = 1, 2, \ldots, 41,$$

and then the test set is used to predict the values of $y_{42}(s)$.
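The serial-correlation check can be reproduced with statsmodels' Ljung-Box test; in the sketch below, `resid_s` is assumed to hold the 41 HB residuals of a given month s.

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# Ljung-Box test for serial correlation up to lag 4 in one residual sequence
lb = acorr_ljungbox(resid_s, lags=[4], return_df=True)
print(lb["lb_pvalue"])   # reject independence when the p-value is below 0.05
```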
We also compare the robust estimation procedures (LAD, EXP and HB) with OLS to illustrate the proposed method. To implement our approach, the unknown function $g(\cdot)$ is approximated by a cubic B-spline function with equispaced knots, and the truncation parameters $J$ and $K$ are selected by the BIC method according to Equation (13) in Section 2. Similar to Aneiros-Pérez and Vieu [10] and Yu et al. [27], the following two error criteria, the mean quadratic error (MQE):

$$\mathrm{MQE} = \frac{1}{12} \sum_{s=1}^{12} \left( y_{42}(s) - \hat{y}_{42}(s) \right)^2,$$

and the mean relative quadratic error (MRQE):

$$\mathrm{MRQE} = \frac{1}{12} \sum_{s=1}^{12} \frac{\left( y_{42}(s) - \hat{y}_{42}(s) \right)^2}{\mathrm{Var}(y(s))},$$
are used to evaluate the prediction ability of the model, where $\mathrm{Var}(y(s))$ is the empirical variance of $\{y_i(s)\}_{i=1}^{41}$; a direct computation of both criteria is sketched below. Table 5 shows the MQE and MRQE of the LAD, HB, EXP and OLS estimators. It can be seen from Table 5 that the robust estimation procedures perform better than OLS when outliers are present.
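In the sketch, `y42` and `y42_hat` are the observed and predicted values for the twelve test months, and `y_train` (41 x 12, from the preprocessing sketch above) holds the training responses used for the per-month variances.

```python
import numpy as np

mqe = np.mean((y42 - y42_hat) ** 2)
mrqe = np.mean((y42 - y42_hat) ** 2 / y_train.var(axis=0))
```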

4.2. Tecator Data

In this subsection, we further apply the robust procedure to the Tecator spectral data set, available at http://lib.stat.cmu.edu/datasets/tecator (accessed on 31 December 2022). The data set consists of 215 meat samples. Each sample records a spectrometric curve, corresponding to the absorbance measured at 100 wavelengths between 850 and 1050 nm by the Near-Infrared Transmission (NIT) principle, together with the moisture, protein and fat contents measured in percentages. Chen et al. [28], Aneiros and Vieu [29], Lin et al. [30] and Sang et al. [31] have studied these data extensively, aiming to predict the content of certain components (such as fat) from the spectral absorbance. Recently, Cai et al. [15] predicted the fat content ($Y$) from the water content ($Z_1$), protein content ($Z_2$) and spectral curve $X(t)$.
This paper is interested in whether LAD, EXP and HB predict the fat content ($y$) more accurately than OLS when the distribution of the response is not normal and the random errors are correlated. For this purpose, we first use the Shapiro-Wilk statistic to test whether the fat content is normally distributed. The $p$ value of the test statistic is 0.000, indicating that the distribution is not normal, and the skewness value (0.796) shows that the data are right-skewed, so least squares estimation may not be effective and a robust method is adopted. Then, as in Section 4.1, we apply the LB test statistic, based on the HB method, to test for first-order serial correlation; the corresponding $p$ value is 0.0222, which is less than the 0.05 significance level. Therefore, we also apply the robust method to estimate the following SFLAR model (a sketch of these diagnostics follows the model below):
$$y_i = \int_{850}^{1050} \beta(t)\, x_i(t)\, dt + g(z_i) + \varepsilon_i, \qquad \varepsilon_i = a_1 \varepsilon_{i-1} + e_i, \quad i = 1, 2, \ldots, 215,$$

where the response variable $y_i$ is the fat content of the $i$-th sample, and the explanatory variables $z_i$ and $x_i(t)$ denote the protein content and the spectrometric curve of the $i$-th sample, respectively.
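The two preliminary diagnostics can be reproduced with SciPy, as sketched below for a length-215 array `y` of fat contents.

```python
from scipy.stats import shapiro, skew

stat, p_value = shapiro(y)      # Shapiro-Wilk normality test; reported p = 0.000
print(p_value, skew(y))         # reported skewness: 0.796 (right-skewed)
```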
Similar to Cai et al. [15], two criteria, the mean absolute prediction error (MAPE):

$$\mathrm{MAPE} = \frac{1}{214} \sum_{i=2}^{215} \left| y_i - \hat{y}_i \right|,$$

and the median prediction error (MPE):

$$\mathrm{MPE} = \mathrm{median}_{i \in \{2, 3, \ldots, 215\}} \left| y_i - \hat{y}_i \right|,$$

are used to evaluate the prediction performance of the different methods. From the MAPE and MPE values of LAD, HB, EXP and OLS displayed in Table 6, it can be seen that the robust methods perform better than OLS estimation when the distribution of the response is not normal.

5. Conclusions

This paper considers the estimation of a semi-functional linear model with autoregressive errors when the random error follows a heavy-tailed distribution or outliers are present. Traditional estimation methods, such as the least squares or likelihood method, are susceptible to outliers or heavy-tailed errors, rendering the corresponding estimators ineffective. Therefore, this paper introduces a robust estimation procedure. The simulation study and two real data analyses show that if the random error follows a normal distribution, the robust estimators have statistical properties as good as those of OLS, whereas if the random error is heavy-tailed, the robust estimators outperform the OLS method.

Author Contributions

Conceptualization, M.C.; methodology, J.Z. and B.Y.; software, B.Y. and T.S.; formal analysis, J.Z. and B.Y.; resources, J.Z. and B.Y.; data curation, J.Z. and B.Y.; writing—original draft preparation, B.Y.; writing—review and editing, J.Z.; visualization, B.Y.; supervision, M.C. and J.Z.; project administration, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under Grant Nos. 11861074, 11731011, 12261051 and 11731015, and the Applied Basic Research Project of Yunnan Province under Grant No. 2019FB138.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sets generated and analysed are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIC: Akaike information criterion
BIC: Bayesian information criterion
EXP: exponential squared loss
HB: Huber loss
IMSE: integrated mean square error
KWH: kilowatt hour
LAD: least absolute deviations
MN: mixed normal
MQE: mean quadratic error
MRQE: mean relative quadratic error
MSE: mean square error
MSPE: mean square prediction error
OLS: ordinary least squares
SD: standard deviation
SFLAR: semi-functional linear model with autoregressive errors
TSIRLS: two-step iteratively reweighted least squares

References

1. Cardot, H.; Ferraty, F.; Sarda, P. Functional linear model. Stat. Probab. Lett. 1999, 45, 11–22.
2. Cardot, H.; Ferraty, F.; Sarda, P. Spline estimators for the functional linear model. Stat. Sin. 2003, 13, 571–591.
3. Yao, F.; Müller, H.G.; Wang, J.L. Functional linear regression analysis for longitudinal data. Ann. Stat. 2005, 33, 2873–2903.
4. Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Stat. 2007, 35, 70–91.
5. Li, T.; Song, X.Y.; Zhang, Y.Y.; Zhu, H.T.; Zhu, Z.Y. Clusterwise functional linear regression models. Comput. Stat. Data Anal. 2021, 158.
6. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer: New York, NY, USA, 2006.
7. Baíllo, A.; Grané, A. Local linear regression for functional predictor and scalar response. J. Multivar. Anal. 2009, 100, 102–111.
8. Burba, F.; Ferraty, F.; Vieu, P. k-Nearest Neighbour method in functional nonparametric regression. J. Nonparametric Stat. 2009, 21, 453–469.
9. Aneiros-Pérez, G.; Vieu, P. Semi-functional partial linear regression. Stat. Probab. Lett. 2006, 76, 1102–1110.
10. Aneiros-Pérez, G.; Vieu, P. Nonparametric time series prediction: A semi-functional partial linear modeling. J. Multivar. Anal. 2008, 99, 834–857.
11. Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418.
12. Zhou, J.J.; Chen, M. Spline estimators for semi-functional linear model. Stat. Probab. Lett. 2012, 82, 505–513.
13. Yu, P.; Zhu, Z.Y.; Zhang, Z.Z. Robust exponential squared loss-based estimation in semi-functional linear regression models. Comput. Stat. 2019, 34, 503–525.
14. Yu, P.; Zhu, Z.Y.; Shi, J.H.; Ai, X.K. Robust estimation for partial functional linear regression model based on modal regression. J. Syst. Sci. Complex. 2020, 33, 527–544.
15. Cai, X.; Xue, L.G.; Lu, F. Robust estimation with a modified Huber's loss for partial functional linear models based on splines. J. Korean Stat. Soc. 2020, 49, 1214–1237.
16. Cao, P.; Sun, J. Robust estimation for partial functional linear regression models based on FPCA and weighted composite quantile regression. Open Math. 2021, 19, 1493–1509.
17. Tang, Q.G. Estimation for semi-functional linear regression. Statistics 2015, 49, 1262–1278.
18. Boente, G.; Salibian-Barrera, M.; Vena, P. Robust estimation for semi-functional linear regression models. Comput. Stat. Data Anal. 2020, 152, 107041.
19. Boente, G.; Parada, D. Robust estimation for functional quadratic regression models. arXiv 2022, arXiv:2209.02742.
20. Pannu, J.; Billor, N. Robust sparse functional regression model. Commun. Stat. Simul. Comput. 2022, 51, 4883–4903.
21. Sinha, S.K.; Field, C.A.; Smith, B. Robust estimation of nonlinear regression with autoregressive errors. Stat. Probab. Lett. 2003, 63, 49–59.
22. Riazoshams, H.; Midi, H.B.; Sharipov, O.S. The performance of robust two-stage estimator in nonlinear regression with autocorrelated error. Commun. Stat. Simul. Comput. 2010, 39, 1251–1268.
23. Serenay, K.; Baris, A. A novel hybrid robust tapering approach for nonlinear regression in the presence of autocorrelation and outliers. Commun. Stat. Simul. Comput. 2021, 1–17.
24. de Boor, C. A Practical Guide to Splines; Springer: New York, NY, USA, 1978.
25. Maronna, R.A.; Martin, R.D.; Yohai, V.J.; Salibián-Barrera, M. Robust Statistics: Theory and Methods (with R); Wiley: New York, NY, USA, 2018.
26. Jiang, Y.L. An exponential-squared estimator in the autoregressive model with heavy-tailed errors. Stat. Its Interface 2016, 9, 233–238.
27. Yu, P.; Li, T.; Zhu, Z.Y.; Zhang, Z.Z. Composite quantile estimation in partial functional linear regression model with dependent errors. Metrika 2019, 82, 633–656.
28. Chen, D.; Hall, P.; Müller, H.G. Single and multiple index functional regression models with nonparametric link. Ann. Stat. 2011, 39, 1720–1747.
29. Aneiros, G.; Vieu, P. Partial linear modelling with multi-functional covariates. Comput. Stat. 2015, 30, 647–671.
30. Lin, Z.H.; Cao, J.G.; Wang, L.L.; Wang, H.N. Locally sparse estimator for functional linear regression models. J. Comput. Graph. Stat. 2017, 26, 306–318.
31. Sang, P.J.; Richard, A.L.; Cao, J.G. Sparse estimation for functional semiparametric additive models. J. Multivar. Anal. 2018, 168, 105–118.
Figure 1. (a) The sample mean of IMSE for the functional parameters $\beta(t)$ and $g(z)$ when $n = 500$ and $\pi$ takes values in $\{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9\}$; (b) the sample mean of MSE ($\times 10^3$) for the autoregressive coefficient $a$ under the same setting.
Figure 2. The log (electricity data) from January 1973 to January 2016.
Figure 3. The differenced log (electricity data) from January 1973 to January 2016.
Figure 4. Annual curves of electricity data in the commercial sector from January 1973 to January 2016.
Figure 5. The boxplots of the electricity consumption data from January to December.
Figure 6. Autocorrelation plots of the residual sequences from January ($s = 1$) to December ($s = 12$).
Table 1. The Bias and MSE of $\hat{a}$ under different error distributions.

n    Method   N(0,1)              MN                  t(2)                Cauchy(0,1)
              Bias      MSE       Bias      MSE       Bias      MSE       Bias      MSE
100  OLS      −0.0106   0.0043    −0.0084   0.0040    −0.0081   0.0037    −0.0225   0.0038
     LAD      −0.1424   0.0264    −0.0613   0.0068    −0.1036   0.0164    −0.0232   0.0017
     EXP      −0.0240   0.0059    −0.0025   0.0008    −0.0084   0.0021    −0.0007   0.0300
     HB       −0.0106   0.0044    −0.0019   0.0006    −0.0049   0.0019    −0.0011   0.0002
300  OLS      −0.0051   0.0014    −0.0067   0.0012    −0.0051   0.0012    −0.0097   0.0011
     LAD      −0.0400   0.0033    −0.0140   0.0004    −0.0226   0.0012    −0.0026   0.0000
     EXP      −0.0143   0.0019    −0.0016   0.0001    −0.0041   0.0005     0.0000   0.0094
     HB       −0.0053   0.0014    −0.0021   0.0002    −0.0031   0.0005    −0.0002   0.0000
500  OLS      −0.0045   0.0008    −0.0042   0.0007    −0.0017   0.0007    −0.0053   0.0005
     LAD      −0.0234   0.0015    −0.0002   0.0001     0.0099   0.0004    −0.0011   0.0000
     EXP      −0.0119   0.0010    −0.0006   0.0001    −0.0018   0.0002    −0.0001   0.0058
     HB       −0.0044   0.0008    −0.0009   0.0001    −0.0008   0.0002    −0.0002   0.0000
Table 2. The mean (SD) of $\mathrm{IMSE}_1$ for $\hat{\beta}(t)$ under different error distributions and sample sizes.

n    Method   N(0,1)             MN                 t(2)                Cauchy(0,1)
              Mean (SD)          Mean (SD)          Mean (SD)           Mean (SD)
100  OLS      0.2664 (0.1266)    1.4068 (1.2305)    0.8923 (4.2805)     -    (-)
     LAD      0.3298 (0.1663)    0.4360 (0.2541)    0.4730 (0.3153)     0.4996 (0.3083)
     EXP      0.3243 (0.1714)    0.4247 (0.7672)    0.3786 (0.2029)     0.4602 (0.2534)
     HB       0.2711 (0.1305)    0.3963 (0.2124)    0.3800 (0.2211)     0.4734 (0.2721)
300  OLS      0.1099 (0.0445)    0.4753 (0.3015)    0.4589 (2.0454)     -    (-)
     LAD      0.1272 (0.0526)    0.1870 (0.0788)    0.1438 (0.0638)     0.1964 (0.0830)
     EXP      0.1663 (0.0712)    0.1787 (0.0745)    0.1792 (0.0759)     0.1982 (0.0815)
     HB       0.1110 (0.0451)    0.1813 (0.0764)    0.1357 (0.0569)     0.1986 (0.0805)
500  OLS      0.0836 (0.0305)    0.3248 (0.1985)    0.7908 (17.3387)    -    (-)
     LAD      0.0936 (0.0352)    0.0928 (0.0346)    0.1026 (0.0417)     0.1557 (0.0573)
     EXP      0.1394 (0.0537)    0.1469 (0.0553)    0.1463 (0.0546)     0.1570 (0.0583)
     HB       0.0844 (0.0308)    0.1474 (0.0550)    0.0980 (0.0380)     0.1567 (0.0584)
Table 3. The mean (SD) of $\mathrm{IMSE}_2$ for $\hat{g}(\cdot)$ under different error distributions and sample sizes.

n    Method   N(0,1)             MN                 t(2)                Cauchy(0,1)
              Mean (SD)          Mean (SD)          Mean (SD)           Mean (SD)
100  OLS      0.3509 (0.3710)    7.1999 (8.9383)    7.0279 (-)          -    (-)
     LAD      0.4821 (0.7918)    1.4814 (2.1051)    1.1504 (4.6860)     1.9271 (2.9633)
     EXP      0.4132 (0.4437)    1.0579 (2.0157)    0.7482 (0.8314)     1.2962 (1.5780)
     HB       0.3713 (0.3992)    0.8917 (1.0224)    0.7717 (0.8362)     1.3913 (1.7224)
300  OLS      0.1140 (0.1166)    2.2063 (2.6667)    2.0490 (12.9143)    -    (-)
     LAD      0.1597 (0.1726)    0.3016 (0.3629)    0.2406 (0.2751)     0.3699 (0.4413)
     EXP      0.1385 (0.1356)    0.2434 (0.2624)    0.2229 (0.2740)     0.3681 (0.4213)
     HB       0.1189 (0.1242)    0.2388 (0.2597)    0.2074 (0.2528)     0.3629 (0.4087)
500  OLS      0.0677 (0.0754)    1.3708 (1.5933)    3.2699 (-)          -    (-)
     LAD      0.1043 (0.1177)    0.1046 (0.1065)    0.1410 (0.1513)     0.1945 (0.2207)
     EXP      0.0914 (0.0891)    0.1414 (0.1428)    0.1376 (0.1440)     0.2129 (0.2366)
     HB       0.0722 (0.0787)    0.1485 (0.1636)    0.1266 (0.1406)     0.2085 (0.2365)
Table 4. The mean (SD) of MSPE under different error distributions and sample sizes.

n    Method   N(0,1)             MN                 t(2)                Cauchy(0,1)
              Mean (SD)          Mean (SD)          Mean (SD)           Mean (SD)
100  OLS      0.3588 (0.3542)    7.3662 (8.8633)    7.3858 (-)          -    (-)
     LAD      0.4677 (0.4338)    1.4368 (1.7128)    0.9803 (0.9559)     1.9208 (2.7565)
     EXP      0.4426 (0.4447)    1.0267 (1.6935)    0.7628 (0.8031)     1.3085 (1.3842)
     HB       0.3792 (0.3800)    0.8862 (0.9175)    0.7665 (0.7987)     1.4059 (1.6041)
300  OLS      0.1266 (0.1173)    2.3559 (2.6711)    2.1143 (12.5808)    -    (-)
     LAD      0.1767 (0.1725)    0.3310 (0.3619)    0.2613 (0.2739)     0.4033 (0.4421)
     EXP      0.1607 (0.1349)    0.2703 (0.2626)    0.2502 (0.2712)     0.4017 (0.4206)
     HB       0.1317 (0.1246)    0.2662 (0.2597)    0.2274 (0.2509)     0.3969 (0.4090)
500  OLS      0.0775 (0.0751)    1.4681 (1.5899)    3.6659 (-)          -    (-)
     LAD      0.1165 (0.1175)    0.1172 (0.1067)    0.1563 (0.1508)     0.2207 (0.2209)
     EXP      0.1105 (0.0889)    0.1635 (0.1429)    0.1601 (0.1448)     0.2399 (0.2368)
     HB       0.0821 (0.0785)    0.1706 (0.1644)    0.1409 (0.1411)     0.2353 (0.2365)
Table 5. MQE and MRQE of the four estimation methods under the SFLAR model.

Criterion   OLS       LAD       HB        EXP
MQE         0.00017   0.00016   0.00017   0.00017
MRQE        0.34483   0.27929   0.31347   0.34435
Table 6. MAPE and MPE of the four estimation methods under the SFLAR model.

Criterion   OLS     LAD     HB      EXP
MAPE        1.862   1.771   1.783   1.891
MPE         1.528   1.345   1.388   1.138