Article

A Convex Combination Approach for Artificial Neural Network of Interval Data

by Woraphon Yamaka, Rungrapee Phadkantha * and Paravee Maneejuk
Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University, Chiang Mai 50200, Thailand
* Author to whom correspondence should be addressed.
Submission received: 12 April 2021 / Revised: 22 April 2021 / Accepted: 26 April 2021 / Published: 28 April 2021
(This article belongs to the Special Issue Modeling and Simulation with Artificial Neural Network)

Abstract:
As the conventional models for time series forecasting often use single-valued data (e.g., daily closing prices or other end-of-day data), a large amount of information during the day is neglected. Traditionally, fixed reference points from intervals, such as midpoints, ranges, and lower and upper bounds, are generally considered to build the models. However, as different datasets provide different information in intervals and may exhibit nonlinear behavior, conventional models cannot be effectively implemented and are not guaranteed to provide accurate results. To address these problems, we propose the artificial neural network with convex combination (ANN-CC) model for interval-valued data. The convex combination method provides a flexible way to explore the best reference points from both input and output variables. These reference points are then used to build the nonlinear ANN model. Both simulation and real application studies are conducted to evaluate the accuracy of the proposed forecasting ANN-CC model. Our model was also compared with traditional linear regression forecasting methods (the information-theoretic method, the parametrized approach, and the center and range method) and conventional ANN models for interval-valued data prediction (regularized ANN-LU and ANN-Center). The simulation results show that the proposed ANN-CC model is a suitable alternative for interval-valued data forecasting because it provides the lowest forecasting error under both linear and nonlinear relationships between the input and output data. Furthermore, empirical results on two datasets also confirmed that the proposed ANN-CC model outperformed the conventional models.

Graphical Abstract

1. Introduction

Time-series point (single-valued data) forecasting normally fails to reflect the range of fluctuation or uncertainty in economic, financial, and environmental data. Moreover, existing models for interval forecasting remain incomplete, complex, and relatively inaccurate. Thus, interval-valued data forecasting has become an important issue to be investigated [1,2]. This study falls within the interval-valued time series forecasting framework: it introduces the convex combination (CC) method, designed to choose the reference points that best represent the interval-valued data. The CC method automatically explores the set of reference points from the input and output variables used to build the neural network (NN) models. This is an enhancement and generalization of existing methods.
Interval-valued data forecasting serves the needs of investors and data scientists who are interested not only in single-valued data but also in the variability of value intervals in the data. Interval-valued data provide rich information that can help investors and data scientists make accurate decisions [3]. With the advances in data science, enormous amounts of information and big data can now be collected. However, conventional forecasting methods cannot be effectively implemented on such big data to yield accurate results. Furthermore, these methods generally forecast future observations using only point-valued data, which gives rise to a higher computational cost when dealing with big data. Moreover, it is sometimes difficult to express the real behavior of a variable using only point-valued data. Thus, interval-valued data, such as the range of temperature, stock returns, or willingness to pay (minimum and maximum), are generally used to predict uncertainty in many situations. Interval analysis, suggested by Lauro and Palumbo [4], assumes that observations and estimations in the world are usually uncertain and incomprehensive and do not precisely represent the real behavior of the data. This is quite true, as using point-valued data causes us to lose substantial information about prices or values during the day or the week. With interval-valued data, however, we can capture more realistic movements of the variables and handle big data forecasting at the same time. Thus, the interval approach should be considered to explain real data behavior in the context of big data. Interval-valued data are a special data type in symbolic data analysis (SDA), composed of the lower and upper bounds of an interval. The objective of SDA is to provide a way to construct aggregated data described by multivalued variables, and thereby an efficient way to summarize large data sets by some reference value of the symbolic data.
Thus, estimation tools for interval-valued data analysis have been in strong demand in recent studies. The use of interval data to represent uncertainty is common in various situations, such as in coalitional games where the payoffs of coalitions are uncertain and can be modeled as intervals of real numbers; see, e.g., Branzei et al. [5] and Kiekintveld et al. [6].
From a methodological point of view, interval-valued data are generally transformed into point-valued (reference point) data by some technique; this reference point data is then used as a variable in the models. One well-known approach for dealing with interval-valued data is the mid-point method introduced by Billard and Diday [7]. They analyzed these data using ordinary least squares regression on the midpoints of the intervals of the independent and dependent variables, i.e., the averages of their lower and upper bounds. Neto and De Carvalho [8] improved this approach by presenting a new method based on two linear regression models (the center-range method). The first regression model is fitted to the midpoints of the intervals and the second to the ranges of the intervals. It was found to be more efficient than the method of Billard and Diday [7]. More recently, Souza et al. [9], Chanaim et al. [10], and Phadkantha et al. [11] argued that the mid-point and range methods are not appropriate references for interval data, as they cannot represent the real behavior of the data and are too restrictive. Thus, Souza et al. [9] introduced the parametrized approach (PM) for the intervals of input variables. This method can choose the reference points that better represent the input intervals before building the regression. Chanaim et al. [10] and Phadkantha et al. [11] extended the PM approach by suggesting the convex combination (CC) method to obtain the reference points of the input and output intervals before estimating the regression model. Specifically, instead of restricting the weight of the interval-valued data ($w$) to be 0.5 (mid-point), $X^c = 0.5 X^u + 0.5 X^l$, they generalized this weight to be unknown. Hence, the reference point data become $X^{cc} = w X^u + (1 - w) X^l$, where $w \in [0, 1]$ is the weight parameter. This reference point data is then used as the input data in the regression model. Buansing et al. [12] also proposed the iterative information-theoretic (IT) method for forecasting interval-valued data. Their method differs from the others in that it does not assume that every point in the interval emerged from the same underlying process; there may be multiple models behind the process. They showed that the IT method provides more accurate forecasts of the upper and lower bounds than the center-range method.
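The CC transformation described above reduces to a one-line computation. The sketch below (illustrative NumPy with made-up bound values, not data from the study) shows how a single weight $w$ maps an interval to its reference point, and that $w = 0.5$ recovers the classical midpoint:

```python
import numpy as np

# Illustrative lower/upper bounds of an interval-valued variable
X_l = np.array([1.0, 2.0, 3.0])
X_u = np.array([1.5, 2.8, 3.4])

def reference_point(lower, upper, w):
    """Convex-combination reference point X^cc = w*X^u + (1-w)*X^l, w in [0, 1]."""
    return w * upper + (1.0 - w) * lower

X_mid = reference_point(X_l, X_u, 0.5)  # w = 0.5 is the classical midpoint
X_cc = reference_point(X_l, X_u, 0.8)   # asymmetric weight toward the upper bound
```
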
There is voluminous literature on stock market estimation and forecasting based on a wide variety of both linear and nonlinear models. Linear models like autoregressive (AR) and autoregressive moving average (ARMA) [13,14] and nonlinear models like threshold and Markov switching models [15,16,17,18] have been used for time series forecasting. However, the equivocal and unforeseeable nature of the time series data has brought about the difficulty in prediction [19], so artificial neural networks (ANN) models were introduced to forecast the future return of the stock. The major advantage of ANN models is their flexible nonlinear modeling capability. With ANN, there is no need to specify a particular model form. Instead, the model is adaptively formed based on the features inherent in the data. This data-driven approach is suitable for many empirical data sets where no theoretical guidance is available to suggest an appropriate data generating process. ANN provides desirable properties that some traditional linear and nonlinear regression models lack, such as being noise tolerant. The structure of the ANN model is inspired by real-life information-processing abilities of the human brain. Key attributes of the brain’s information network include a nonlinear, parallel information processing structure and multiple connections between information nodes [20].
In the recent decade, several studies have indicated the higher performance of the ANN models in forecasting compared to that of regression models. Zhang et al. [21] and Leung et al. [22] examined various prediction models based on multivariate classification techniques and compared both neural network and classical forecasting models. Their experiment results suggested that the probabilistic neural network can outperform the level estimation models, including adaptive exponential smoothing, vector autoregression with Kalman filter updating, multivariate transfer function, and multilayered feedforward neural network, in terms of the prediction accuracy. In a more recent study, Cao et al. [23] demonstrated the accuracy of ANN in predicting Shanghai Stock Exchange (SHSE) movement by comparing the neural networks and linear models under the capital asset pricing model (CAPM) and Fama and French’s three-factor contexts, and they found that neural networks outperformed the linear models.
As mentioned above, this paper falls within the framework of interval-valued data forecasting by ANN. To the best of our knowledge, work related to interval-valued data forecasting by neural networks is somewhat limited. Some attempts along this line include San Roque et al. [24], Maia et al. [25], Maia and De Carvalho [26], and Yang et al. [27]. Maia et al. [25] proposed the ANN-Center method to predict the interval value by using the midpoint of the interval as the input. Later, Maia and De Carvalho [26] introduced the ANN-LU method, which uses ANN-Center and ANN-Range (predicting the difference between the upper and lower bounds) to predict the lower and upper bounds of intervals separately. Yang et al. [27] noted that ANN-LU may face the unnatural interval crossing problem, in which the predicted lower bounds of intervals are larger than the predicted upper ones or vice versa, thereby leading to invalid interval predictions. Hence, they introduced the regularized ANN-LU (RANN-LU) model for interval-valued time series forecasting.
Although ANN-Center, ANN-LU, and RANN-LU have been found superior to the classical linear regression models for interval-valued data prediction of Billard and Diday [7], these models rely either on the midpoint (ANN-Center) or on the lower and upper bounds of the interval for forecasting, which may not be good inputs for predicting the future value of an interval. For example, in the case of RANN-LU, if we predict the future values of the upper and lower bounds separately, the prediction may not be reliable, since the whole information of the interval is not taken into account [27,28]. Although ANN-Center and ANN-LU consider both the upper and lower bounds in the prediction process, the prediction still relies on the midpoint of the interval, implying a symmetric weight between the lower and upper bounds. To overcome these problems, in this study, we introduce the convex combination (CC) approach to the ANN for predicting interval-valued data, i.e., the lower and upper bounds. Our model is a generalization of ANN-Center that allows the weight to be flexible rather than fixed at 0.5, thus permitting asymmetric weights.
The novelty of the proposed ANN-CC can be summarized in the following two aspects: First, our method can construct prediction intervals based on the CC approach and ANN models. More specifically, this study proposes the novel CC method for interval ANN modeling. In this approach, the intervals of input and output variables are parametrized through the convex combination of lower and upper bounds. The proposed ANN-CC is a promising alternative to the existing approaches. With optimal reference points that better represent the intervals, we are able to find an efficient solution that improves the prediction accuracy of the lower and upper bounds. Second, to the best of our knowledge, there is no study extending the CC approach for interval ANN modeling. Our proposed method fills in such a literature gap and can capture both linear and nonlinear patterns within interval-valued data.
The rest of this paper is organized as follows. Section 2 gives a brief review of interval-valued prediction methods. Section 3 presents the proposed ANN with the convex combination method. In Section 4, we provide a simulation study to assess the performance of our proposed method. Section 5 describes the data used in this study and presents the analytical results. Section 6 concludes.

2. Review of Existing Methods

2.1. Linear Regression Based on Center Method

Let $X^l = (x_{t1}^l, \ldots, x_{tk}^l)$ and $X^u = (x_{t1}^u, \ldots, x_{tk}^u)$, $t = 1, \ldots, T$, be the lower and upper bounds of the intervals, respectively.
According to Billard and Diday [7], the center or mid-point of the interval-valued explanatory variables, denoted $X^c$, is calculated from
$$X^c = \frac{X^l + X^u}{2}.$$
Likewise, the center of the interval-valued response variable, denoted $Y^c$, is calculated from
$$Y^c = \frac{Y^l + Y^u}{2},$$
where $Y^l = (y_1^l, \ldots, y_T^l)$ and $Y^u = (y_1^u, \ldots, y_T^u)$.
Thus, the regression based on the center method can be constructed as
$$Y^c = X^c \beta^c + \varepsilon^c,$$
where $\beta^c = (\beta_1^c, \ldots, \beta_k^c)$ is the vector of parameters and $\varepsilon^c = (\varepsilon_1^c, \ldots, \varepsilon_T^c)$ are normally distributed errors. Using matrix notation, this problem can be estimated by the ordinary least squares (OLS) method under the full rank assumption:
$$\hat{\beta}^c = (X^{c\prime} X^c)^{-1} X^{c\prime} Y^c.$$
Then, the estimates for the response lower and upper bounds are given by $\hat{Y}^l = X^l \hat{\beta}^c$ and $\hat{Y}^u = X^u \hat{\beta}^c$, respectively.
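As a concrete illustration, the center method can be sketched in a few lines of NumPy. The data below are synthetic (a single predictor with intercept 1 and slope 5, echoing the simulation design later in the paper), not from the study:

```python
import numpy as np

rng = np.random.default_rng(42)
T = 200
# Synthetic interval-valued predictor: lower and upper bounds
X_l = rng.uniform(1, 3, size=(T, 1))
X_u = X_l + rng.uniform(0, 2, size=(T, 1))

X_c = (X_l + X_u) / 2                                # midpoints X^c = (X^l + X^u)/2
Y_c = 1.0 + 5.0 * X_c[:, 0] + rng.normal(0, 1, T)    # synthetic center response

# OLS on the midpoints (with an intercept column): beta_hat = (X'X)^{-1} X'Y
D = np.column_stack([np.ones(T), X_c])
beta_hat, *_ = np.linalg.lstsq(D, Y_c, rcond=None)

# Bound predictions: apply the fitted center coefficients to each bound
Y_l_hat = np.column_stack([np.ones(T), X_l]) @ beta_hat
Y_u_hat = np.column_stack([np.ones(T), X_u]) @ beta_hat
```
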

2.2. Linear Regression Based on Center-Range Method

Neto and De Carvalho [8] also introduced the center-range method to predict the upper and lower bounds of the dependent variable intervals. In this method, the lower and upper bounds of the interval-valued response variable are predicted from the mid-points and ranges of the interval-valued explanatory variables. Thus, this approach is built on two linear regression models, namely the regression based on the center method (Equation (3)) and the regression based on the range method:
$$Y^r = X^r \beta^r + \varepsilon^r,$$
where $X^r = (X^u - X^l)/2$ contains the half-ranges of the explanatory variables, $Y^r$ contains the half-ranges of the response variable, and $\varepsilon^r$ is the error. Using matrix notation, the least-squares estimator $\hat{\beta}^r$ is given by
$$\hat{\beta}^r = (X^{r\prime} X^r)^{-1} X^{r\prime} Y^r.$$
Then, we can predict the response lower and upper bounds as $\hat{Y}^l = X^c \hat{\beta}^c - X^r \hat{\beta}^r$ and $\hat{Y}^u = X^c \hat{\beta}^c + X^r \hat{\beta}^r$, respectively.
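A minimal NumPy sketch of this two-regression recipe follows (synthetic data with a known linear relation; the helper `ols` is ours, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 300
# Synthetic interval-valued predictor and response (true slope 2, intercept 1)
X_l = rng.uniform(1, 3, size=T)
X_u = X_l + rng.uniform(0, 2, size=T)
Y_l = 1 + 2 * X_l + rng.normal(0, 0.1, T)
Y_u = 1 + 2 * X_u + rng.normal(0, 0.1, T)

# Centers and half-ranges of both variables
X_c, X_r = (X_l + X_u) / 2, (X_u - X_l) / 2
Y_c, Y_r = (Y_l + Y_u) / 2, (Y_u - Y_l) / 2

def ols(x, y):
    """OLS fit with an intercept; returns (intercept, slope)."""
    D = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    return b

b_c = ols(X_c, Y_c)  # center regression
b_r = ols(X_r, Y_r)  # range regression

# Bound predictions: center fit minus/plus the fitted half-range
Y_l_hat = (b_c[0] + b_c[1] * X_c) - (b_r[0] + b_r[1] * X_r)
Y_u_hat = (b_c[0] + b_c[1] * X_c) + (b_r[0] + b_r[1] * X_r)
```
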

2.3. Linear Regression Based on Convex Combination Method

Chanaim et al. [10] suggested that the center method may lead to a misspecification problem, as the midpoint of the intervals might not be a good reference for the intervals. To tackle this problem, they proposed employing the convex combination approach to determine the best reference point within the range of the interval data:
$$x_k^{cc} = w_{1k} x_k^l + (1 - w_{1k}) x_k^u, \quad w_{1k} \in [0, 1],$$
$$Y^{cc} = w_2 Y^l + (1 - w_2) Y^u, \quad w_2 \in [0, 1],$$
where $w = [w_{11}, \ldots, w_{1k}, w_2]$ is the vector of weight parameters of the interval data, with values in [0, 1]. The advantage of this method lies in the flexibility to assign weights in calculating the appropriate value between the interval bounds. Thus,
$$Y^{cc} = X^{cc} \beta^{cc} + \varepsilon^{cc},$$
where $X^{cc} = (x_{t1}^{cc}, \ldots, x_{tk}^{cc})$, $t = 1, \ldots, T$. Using matrix notation, this problem is also estimated by the OLS method under the full rank assumption:
$$\hat{\beta}^{cc} = (X^{cc\prime} X^{cc})^{-1} X^{cc\prime} Y^{cc}.$$
The response lower bound prediction is described in Equation (11), and the model to predict the response upper bound is given by Equation (12),
$$\hat{Y}^l = X^{cc} \hat{\beta}^{cc} - w_2 \hat{Y}^r,$$
$$\hat{Y}^u = X^{cc} \hat{\beta}^{cc} + (1 - w_2) \hat{Y}^r,$$
where the range prediction is
$$\hat{Y}^r = X^u \hat{\beta}^{cc} - X^l \hat{\beta}^{cc},$$
and $X^{cc} = (w_{1j} x_j^l + (1 - w_{1j}) x_j^u)$, $j = 1, \ldots, k$.
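Since the weights are left unknown, a simple way to select them is a grid search that minimizes the in-sample sum of squared errors of the CC regression. The sketch below is an illustration of that idea, not the authors' code: it uses synthetic data with known weights ($w_1 = w_2 = 0.3$ on the lower bounds, following this section's convention) so the search has a recoverable target.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400
X_l = rng.uniform(1, 3, size=T)
X_u = X_l + rng.uniform(0, 2, size=T)

# Generate the response from a known reference point (w1 = w2 = 0.3)
x_ref = 0.3 * X_l + 0.7 * X_u
y_ref = 1.0 + 5.0 * x_ref + rng.normal(0, 0.2, T)
r_y = rng.uniform(0, 2, size=T)
Y_l = y_ref - r_y
Y_u = (y_ref - 0.3 * Y_l) / 0.7          # ensures 0.3*Y_l + 0.7*Y_u == y_ref

def sse(w1, w2):
    """Sum of squared OLS residuals of the CC regression at weights (w1, w2)."""
    x = w1 * X_l + (1 - w1) * X_u
    y = w2 * Y_l + (1 - w2) * Y_u
    D = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.sum((y - D @ b) ** 2))

grid = np.linspace(0.05, 0.95, 19)       # coarse grid; a finer step is possible
w1_best, w2_best = min(((a, b) for a in grid for b in grid), key=lambda p: sse(*p))
```

With enough observations, the minimizer lands near the true weights (0.3, 0.3) rather than at the midpoint (0.5, 0.5).
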

2.4. Regularized Artificial Neural Network (RANN)

Yang et al. [27] introduced RANN for interval-valued data prediction. This method is able to approximate various forms of nonlinearity in the data and directly models non-crossing lower and upper bounds of intervals. In this model, the relation between the interval-valued output ($Y^u$ and $Y^l$) and the interval-valued inputs ($X^u$ and $X^l$) is as follows:
$$\hat{Y}^l = f\left(\sum_{j=1}^{J} g\left(\sum_{i=1}^{2k} X_i \omega_{ij}^{I,l} + b_j^{I,l}\right) \omega_j^{o,l} + b_j^{o,l}\right),$$
$$\hat{Y}^u = f\left(\sum_{j=1}^{J} g\left(\sum_{i=1}^{2k} X_i \omega_{ij}^{I,u} + b_j^{I,u}\right) \omega_j^{o,u} + b_j^{o,u}\right),$$
where $X = (X^u, X^l)$ consists of the $2k$ inputs; $\omega_{ij}^{I,u}$, $\omega_{ij}^{I,l}$ and $b_j^{I,l}$, $b_j^{I,u}$ are the weight parameters and bias terms of the $j$th hidden neuron on the input side; $\omega_j^{o,u}$, $\omega_j^{o,l}$ and $b_j^{o,l}$, $b_j^{o,u}$ are those on the output side; and $f(\cdot)$ and $g(\cdot)$ are the activation functions. To enforce non-crossing lower and upper bounds of the intervals, a non-crossing regularizer is introduced into the loss function as follows:
$$Loss = \frac{1}{2T} \sum_{t=1}^{T} (Y^l - \hat{Y}^l)^2 + \frac{1}{2T} \sum_{t=1}^{T} (Y^u - \hat{Y}^u)^2 + \frac{\lambda}{2T} \sum_{t=1}^{T} \left\{\max(0, \hat{Y}^l - \hat{Y}^u)\right\}^2,$$
where $\lambda > 0$ is the regularization parameter controlling the non-crossing strength.
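The loss above is straightforward to state in code. The helper below (our own NumPy sketch, not the authors' implementation) shows how the hinge-style penalty activates only where the predicted bounds cross:

```python
import numpy as np

def rann_loss(Y_l, Y_u, Yhat_l, Yhat_u, lam=1.0):
    """Bound-wise squared error plus the non-crossing regularizer
    (lam / 2T) * sum(max(0, Yhat_l - Yhat_u)^2)."""
    T = len(Y_l)
    fit = (np.sum((Y_l - Yhat_l) ** 2) + np.sum((Y_u - Yhat_u) ** 2)) / (2 * T)
    crossing = np.maximum(0.0, Yhat_l - Yhat_u)  # nonzero only where bounds cross
    return fit + lam * np.sum(crossing ** 2) / (2 * T)
```

With valid (non-crossed) predictions the penalty vanishes; each crossed interval adds $\lambda (\hat{Y}^l - \hat{Y}^u)^2 / 2T$ to the loss.
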

3. The Proposed Method: Artificial Neural Network with Convex Combination (ANN-CC)

Artificial neural network (ANN) models can approximate various forms of nonlinearity in the data. There are various types of ANN, but the most popular one is the multilayer perceptron (MLP). ANN models have been successfully applied in a variety of fields such as accounting, economics, finance, and marketing, as well as forecasting [20,26]. In this study, we use a three-layer ANN model in which both the inputs and outputs contain the lower and upper bounds of intervals. As shown in Figure 1, suppose the interval-valued data $(X, Y)$ consist of one predictor and one response. The first layer is the input layer, the middle layer is the hidden layer, and the last layer is the output layer. All the data are expressed in the form of lower and upper bounds; thus, we have $\{X_1^u, X_1^l\}$ and $\{Y^u, Y^l\}$. For convenience, we use the notation $X^{cc} = w_1 X_1^l + (1 - w_1) X_1^u$ for the input variable and $Y^{cc} = w_2 Y^l + (1 - w_2) Y^u$ for the output variable. Note that ANN-Center is the particular case of ANN-CC obtained by setting $w_1 = w_2 = 0.5$. Each layer contains neurons $j = 1, \ldots, J$, which are connected to one another and to all neurons in the immediately following layer. Each neuron input path carries the signal $X^{cc}$, and the strength of the path is characterized by the weight of neuron $j$ ($\omega_j$). A neuron sums the path weight times the input signal over all paths and adds the node bias ($b$). Since the neural network consists of an input function and an output function, we can express these functions as:
$$H_j^{cc,I} = g(X^{cc} \omega_j^I + b_j^I),$$
where $H_j^{cc,I}$ is the $j$th hidden neuron's input, $\omega_j^I$ is the weight vector between the hidden layer and the input layer, $g(\cdot)$ is the activation function of the hidden layer, and $b_j^I$ is the bias term of the input layer. Then, $H_j^{cc,I}$ is transformed into the output $Y^{cc}$ with the activation function of the output layer. Thus, the model can be written as:
$$\hat{Y}^{cc} = f\left(\sum_{j=1}^{J} H_j^{cc,I} \omega_j^o + b_j^o\right),$$
where $\omega^o = \{\omega_1^o, \ldots, \omega_J^o\}$ is the weight vector between the hidden layer and the output layer, $b_j^o$ is the bias term of the output layer, and $f(\cdot)$ is the activation function of the output layer. A challenge in ANN design is the selection of the activation function. Also known as the transfer function, it can basically be divided into four types: the hyperbolic tangent activation function (tanh), the sigmoid or logistic activation function (sigmoid), the linear activation function (linear), and the exponential activation function (exp).
Learning occurs through the adjustment of the path weights and node biases. The most common method used for this adjustment is backpropagation, in which the optimal weights $\omega_j^I$ and $\omega_j^o$ are estimated by minimizing the squared difference between the observed and estimated model outputs. We formulate the loss function as follows:
$$loss = \frac{1}{T} \sum_{t=1}^{T} (Y^{cc} - \hat{Y}^{cc})^2,$$
where $\hat{Y}^{cc}$ and $Y^{cc}$ are the estimated output and observed output, respectively. In addition to the weights of neuron $j$ ($\omega_j$), our estimation also considers the weight parameter of the interval data ($w$) to determine the reference points of the output and input variables.
As interval data are manipulated as a new type of number represented by an ordered pair of minimum and maximum values, numerical manipulations of interval data should follow "interval calculus" [25]. In practice, we can separately predict the lower and upper bounds of the interval using the min-max method. However, this method does not guarantee the mathematical coherence of the predicted bounds; that is, the predicted lower bounds of intervals should be smaller than the predicted upper ones. Otherwise, the unnatural interval crossing problem will occur, leading to invalid interval predictions [27]. Furthermore, forecasting performance can be impaired if there is no clear dependency between the respective bounds of the output and input [28,29]. Instead of restricting the weight of the interval-valued data ($w = (w_1, w_2)$) to be 0.5 (mid-point), in this study, we use the convex combination method to obtain the reference points of the interval-valued data. Thus, in this estimation, the weights of neuron $j$ ($\omega_j$) depend on the given weight $w$. Since there is no closed-form solution for this weight parameter of the interval data, we employ a grid search to select the $w$ that minimizes the sum of squared errors, denoted $loss(w)$. We can then rewrite our loss function in Equation (19) as
$$loss(w) = \frac{1}{T} \sum_{t=1}^{T} \left(Y^{cc} - \hat{Y}^{cc}(w)\right)^2.$$
This loss function is computed in two steps. First, a nonlinear optimization is solved to obtain the $\omega_j$ corresponding to the candidate $w_i$. Second, the loss function is evaluated at that candidate $w_i$. We then introduce another candidate $w$ and repeat the first step. After $loss(w)$ has been computed for all candidates $w_i$, the candidate with the minimum value of $loss(w)$ is preferred. Thus, the optimal $w^* = (w_1^*, w_2^*)$ is obtained by
$$w^* = \operatorname*{argmin}_{w} \; loss(w).$$
We note that the grid search is performed over values of $w$ between 0 and 1. Finally, following the CC method in Section 2.3, we can predict the lower and upper bounds as follows.
$$\hat{Y}^l = f\left(\sum_{j=1}^{J} H_j^{cc,I}(w_1) \omega_j^o + b_j^o\right) - w_2 \hat{Y}^r,$$
$$\hat{Y}^u = f\left(\sum_{j=1}^{J} H_j^{cc,I}(w_1) \omega_j^o + b_j^o\right) + (1 - w_2) \hat{Y}^r,$$
where $H_j^{cc,I}(w_1) = g(X^{cc}(w_1) \omega_j^I + b_j^I)$ is the $j$th hidden neuron's input given $X^{cc}(w_1)$, and the range prediction is computed by
$$\hat{Y}^r = f\left(\sum_{j=1}^{J} H_j^{u,I} \omega_j^o + b_j^o\right) - f\left(\sum_{j=1}^{J} H_j^{l,I} \omega_j^o + b_j^o\right),$$
where $H_j^{l,I} = g(X^l \omega_j^I + b_j^I)$ and $H_j^{u,I} = g(X^u \omega_j^I + b_j^I)$. With this prediction, the predicted lower bounds of the intervals should not cross over the corresponding upper bounds. We also note that $\omega_j^I$ and $\omega_j^o$ contain the information of the weight parameters of the interval data $X$ and $Y$.
In our work, the ANN-CC methodology includes a grid search for the parameter $w = \{w_1, w_2\}$. Note that the grid search is executed before estimating the weight parameters $\omega_j^I$ and $\omega_j^o$ in the ANN structure. Pseudo-code for the grid search and ANN-CC estimation is presented in Algorithm 1.
Algorithm 1. Pseudo-code for the proposed ANN-CC with one predictor and one response
Require: $w = \{w_1, w_2\}$, where $w_1$ and $w_2$ are the sets of candidate weights of the input $\{X_1^u, X_1^l\}$ and output $\{Y^u, Y^l\}$, respectively, within [0, 1].
# Search for the optimal w
for each $w_{i1}, w_{i2}$ in $w_1, w_2 = [0.001, 0.002, \ldots, 1]$:
  Calculate $Y_i^{cc} = w_{i2} Y^l + (1 - w_{i2}) Y^u$ and $X_i^{cc} = w_{i1} X_1^l + (1 - w_{i1}) X_1^u$
  # Define the loss function of the ANN structure
  $\omega_i^o = \{\omega_{i1}^o, \ldots, \omega_{iJ}^o\}$, $\omega_i^I = \{\omega_{i1}^I, \ldots, \omega_{iJ}^I\}$ = Parameters( )
  $H_{ij}^{cc,I} = g(X_i^{cc} \omega_{ij}^I + b_{ij}^I)$
  $\tilde{Y}_i^{cc} = f\left(\sum_j H_{ij}^{cc,I} \omega_{ij}^o + b_{ij}^o\right)$
  $loss_i(w_i) = \| Y_i^{cc} - \tilde{Y}_i^{cc} \|^2$
  # Follow the gradients until convergence
  $\omega_i = (\omega_i^o, \omega_i^I)$
  repeat
    $\omega_i = \omega_i - \rho \nabla_{\omega_i} loss_i$
  until convergence
end for
# Choose the $w^* = \{w_1^*, w_2^*\}$ with the lowest $loss_i$ as the optimal weights
$w^* = \operatorname*{argmin}_{w} loss(w)$
Calculate $Y^{cc} = w_2^* Y^l + (1 - w_2^*) Y^u$ and $X^{cc} = w_1^* X_1^l + (1 - w_1^*) X_1^u$
# Compute the loss function of the ANN structure using $Y^{cc}$ and $X^{cc}$
$loss^*(w^*) = \| Y^{cc} - \hat{Y}^{cc} \|^2$
# Follow the gradients until convergence
$\omega = (\omega^o, \omega^I)$
repeat
  $\omega = \omega - \rho \nabla_{\omega} loss^*$
until convergence
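The algorithm can be sketched end to end in NumPy. The toy below uses a single tanh hidden neuron trained by plain full-batch gradient descent, a coarse 3-point grid per weight instead of the 0.001 step above, and synthetic intervals generated around a known reference point ($w_1 = w_2 = 0.2$ under this section's lower-bound-weight convention). It illustrates the two-step search and is not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
# Synthetic intervals whose reference points follow y = tanh(2x) + noise
X_l = rng.uniform(-2.0, 0.0, size=T)
X_u = X_l + rng.uniform(0.0, 1.0, size=T)
x_ref = 0.2 * X_l + 0.8 * X_u            # true input weight w1 = 0.2
y_ref = np.tanh(2.0 * x_ref) + rng.normal(0, 0.05, T)
Y_l = y_ref - rng.uniform(0.0, 1.0, size=T)
Y_u = (y_ref - 0.2 * Y_l) / 0.8          # so that 0.2*Y_l + 0.8*Y_u == y_ref

def train_ann(x, y, iters=3000, lr=0.1):
    """One-hidden-neuron MLP yhat = w_o*tanh(w_i*x + b_i) + b_o,
    trained by full-batch gradient descent; returns the final MSE."""
    w_i, b_i, w_o, b_o = 0.5, 0.0, 0.5, 0.0
    for _ in range(iters):
        h = np.tanh(w_i * x + b_i)
        e = w_o * h + b_o - y            # residuals
        g_h = e * w_o * (1.0 - h ** 2)   # backprop through tanh
        w_o -= lr * np.mean(e * h)
        b_o -= lr * np.mean(e)
        w_i -= lr * np.mean(g_h * x)
        b_i -= lr * np.mean(g_h)
    return float(np.mean((w_o * np.tanh(w_i * x + b_i) + b_o - y) ** 2))

# Step 1-2: for each candidate (w1, w2), train the ANN and record the loss
grid = [0.2, 0.5, 0.8]
losses = {(w1, w2): train_ann(w1 * X_l + (1 - w1) * X_u,
                              w2 * Y_l + (1 - w2) * Y_u)
          for w1 in grid for w2 in grid}
w_star = min(losses, key=losses.get)     # weights with the lowest loss
```

Candidates far from the true weights absorb extra interval-range noise into the regression target, so their trained loss stays higher than at the true reference point.
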

4. Simulation Study

To examine the performance of our proposed method, we conducted a simulation study. We considered two data generation processes which are different in structure: linear and nonlinear.

4.1. Linear Structure

A simple interval generation process is conducted with one independent variable. We considered three typical data generation processes with different weight parameters, giving the following three scenarios of weights in the intervals:
Scenario 1: Center of the interval data: $w_1 = 0.5$, $w_2 = 0.5$
$$[(0.5) Y^l + (1 - 0.5) Y^u] = 1 + 5 [(0.5) X^l + (1 - 0.5) X^u] + \varepsilon.$$
Scenario 2: Deviation from the center toward the lower bound of the interval data: $w_1 = 0.2$, $w_2 = 0.2$
$$[(0.2) Y^l + (1 - 0.2) Y^u] = 1 + 5 [(0.2) X^l + (1 - 0.2) X^u] + \varepsilon.$$
Scenario 3: Deviation from the center toward the upper bound of the interval data: $w_1 = 0.8$, $w_2 = 0.8$
$$[(0.8) Y^l + (1 - 0.8) Y^u] = 1 + 5 [(0.8) X^l + (1 - 0.8) X^u] + \varepsilon.$$
For each scenario, we performed 100 replications. In each simulation, we proceeded as follows.
(1) Generate the errors from the normal distribution with mean zero and variance one.
(2) Generate the upper bound of the independent variable, $X^u$, from the uniform distribution $U(1, 3)$. Then, compute the lower bound of the independent variable as $X^l = X^u - r_x$, where $r_x \sim U(0, 2)$ denotes the range between the upper and lower bounds.
(3) Compute the reference points of the independent variable intervals as $X^{cc} = w_1 X^u + (1 - w_1) X^l$. Then, generate the expected dependent variable as $Y^{cc} = X^{cc} \beta + \varepsilon$.
(4) Finally, derive the upper and lower bounds of the intervals as $Y^l = Y^{cc} - r_y$, where $r_y \sim U(0, 2)$, and $Y^u = (Y^{cc} - (1 - w_2) Y^l) / w_2$. We note that $r_x$ and $r_y$ are random numbers for simulating the intervals of $X$ and $Y$, respectively. This guarantees that the bounds do not cross each other.
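Steps (1)-(4) translate directly into NumPy. The sketch below reproduces Scenario 3 ($w_1 = w_2 = 0.8$) with the intercept and slope from the scenario equations; it is our reading of the generation steps, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, w1, w2 = 1000, 0.8, 0.8

eps = rng.normal(0.0, 1.0, n)          # step (1): N(0, 1) errors
X_u = rng.uniform(1.0, 3.0, n)         # step (2): upper bounds
r_x = rng.uniform(0.0, 2.0, n)
X_l = X_u - r_x                        #           lower bounds
X_cc = w1 * X_u + (1 - w1) * X_l       # step (3): input reference points
Y_cc = 1.0 + 5.0 * X_cc + eps          #           (intercept 1, slope 5)
r_y = rng.uniform(0.0, 2.0, n)         # step (4): recover the output bounds
Y_l = Y_cc - r_y
Y_u = (Y_cc - (1 - w2) * Y_l) / w2
```

By construction $Y^u - Y^l = r_y / w_2 \ge 0$, so the simulated bounds never cross.
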
In this simulation study, we performed 100 replications with a sample size of n = 1000 for all three scenarios. Each simulated dataset was randomly split, with 80% for training and 20% for testing. Our ANN with the convex combination method (ANN-CC) was then compared with two conventional models, namely the ANN-Center and the RANN-LU method of Yang et al. [27]. Four transfer functions were considered: the hyperbolic tangent (tanh), sigmoid or logistic (sigmoid), linear (linear), and exponential (exp) activation functions. To simplify the comparison, one input layer, one output layer, one hidden layer, and one hidden neuron were assumed. We note that the ANN-Center can be estimated with the validann package in the R programming language [30]. In addition, this package provides methods for the replicative, predictive, and structural validation of artificial neural network models.
To assess the performance of these models, we computed the following measures, averaged over the upper- and lower-bound predictions: the mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). The formulae are as follows:
$$\mathrm{MAE} = \frac{1}{2}\left(\frac{\sum_{i=1}^{N} |Y_i^u - \hat{Y}_i^u|}{N} + \frac{\sum_{i=1}^{N} |Y_i^l - \hat{Y}_i^l|}{N}\right),$$
$$\mathrm{MSE} = \frac{1}{2}\left(\frac{\sum_{i=1}^{N} (Y_i^u - \hat{Y}_i^u)^2}{N} + \frac{\sum_{i=1}^{N} (Y_i^l - \hat{Y}_i^l)^2}{N}\right),$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{2}\left(\frac{\sum_{i=1}^{N} (Y_i^u - \hat{Y}_i^u)^2}{N} + \frac{\sum_{i=1}^{N} (Y_i^l - \hat{Y}_i^l)^2}{N}\right)}.$$
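These three measures can be packaged in a small helper (ours, for illustration):

```python
import numpy as np

def interval_metrics(Y_l, Y_u, Yhat_l, Yhat_u):
    """MAE, MSE, and RMSE averaged over the lower- and upper-bound predictions."""
    mae = (np.mean(np.abs(Y_u - Yhat_u)) + np.mean(np.abs(Y_l - Yhat_l))) / 2
    mse = (np.mean((Y_u - Yhat_u) ** 2) + np.mean((Y_l - Yhat_l) ** 2)) / 2
    return mae, mse, np.sqrt(mse)
```
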
We repeated the simulation 100 times. An example of each of the simulated interval-valued time series is presented in Figure 2, in which each panel shows the relationship between the intervals of $X$ and $Y$.
Table 1 presents the results of the 100 repetitions for the linear structure case, reporting the MAE, MSE, and RMSE. We observed that the ANN model with the CC method (ANN-CC) showed a powerful nonlinear approximation ability (tanh, sigmoid, exp), as its MAE, MSE, and RMSE values were lower than those of ANN-Center and RANN-LU in Scenarios 2 and 3. The tanh function provided the best fit for the ANN-CC model on these simulated datasets. Not surprisingly, our ANN-CC did not outperform the ANN-Center method under Scenario 1, since the interval data were simulated from the midpoint; however, ANN-CC still performed better than RANN-LU in this scenario. In sum, we reached a similar conclusion for Scenarios 2 and 3: the proposed model performed well in the simulation study, and the ANN-CC method showed high performance across all scenarios.

4.2. Nonlinear Structure

Similar to the linear structure, we considered three typical data generation processes with different weight parameters. Three Scenarios of weight in the intervals were as follows:
Scenario 1: Center of the interval data: $w_1 = 0.5$, $w_2 = 0.5$
$$[(0.5) Y^l + (1 - 0.5) Y^u] = 1 + 3 e^{-[(0.5) X^l + (1 - 0.5) X^u]^2} + \varepsilon.$$
Scenario 2: Deviation from the center toward the lower bound of the interval: $w_1 = 0.2$, $w_2 = 0.2$
$$[(0.2) Y^l + (1 - 0.2) Y^u] = 1 + 3 e^{-[(0.2) X^l + (1 - 0.2) X^u]^2} + \varepsilon.$$
Scenario 3: Deviation from the center toward the upper bound of the interval: $w_1 = 0.8$, $w_2 = 0.8$
$$[(0.8) Y^l + (1 - 0.8) Y^u] = 1 + 3 e^{-[(0.8) X^l + (1 - 0.8) X^u]^2} + \varepsilon.$$
For each scenario, we performed 100 replications. These data were more complicated than those in the linear structure case, as they exhibit a nonlinear relationship between the dependent and independent variables, as shown in Figure 3. The results are summarized in Table 2.
Table 2 presents the simulation results for the nonlinear case. Similar results were obtained: the proposed ANN-CC showed higher performance in Scenarios 2 and 3, while ANN-Center still performed poorly there. The reason is simple: ANN-Center fixes the weight parameter at the center, which does not correspond to the true data-generating process, and therefore yields a higher prediction bias.
The experiments were carried out on an Intel Core i5-6400 CPU (2.7 GHz, 4 cores) with 16 GB RAM. The computational cost of the ANN-CC model was slightly higher than that of ANN-Center and RANN-LU, with training being the only time-consuming step for all models. The longer computation time of the proposed model arises because the additional weights of the interval-valued data are estimated simultaneously during optimization.

5. Application to Real Data

5.1. Capital Asset Pricing Model: Thai Stocks

The performance of the ANN-CC model was assessed using interval-valued returns (maximum and minimum returns) from the Stock Exchange of Thailand. We compared the forecasting accuracy of several ANN models in the context of the capital asset pricing model (CAPM). Our analysis employed the lower and upper bounds of real daily stock returns for SET50 and three company stocks, PTT, SCC, and CPALL, over the period from 26 October 2011 to 7 May 2019. These three companies were selected because they are considered well-performing in terms of share price, experienced large trading volumes over the past decade, and are regarded as fast-growing and highly volatile stocks in the Thai market. All data were collected from Thomson Reuters DataStream. The criteria for choosing the best-fit model were the MAE, MSE, and RMSE.
The CAPM was proposed independently by Sharpe [31] and Lintner [32] to measure the risk of an individual stock against the market in terms of the beta risk $\beta_i$. The $\beta_i$ measures a stock's risk (volatility of returns) by quantifying how its price fluctuates relative to the overall market; in other words, it is the stock's sensitivity to market risk. The model can be written as
$$r_{it} - r_{ft} = \beta_0 + \beta_i (r_{mt} - r_{ft}) + \varepsilon_{it},$$
where $r_{it}$ is the return of stock $i$, $r_{mt}$ is the return of the market, $r_{ft}$ is the risk-free rate, and $\varepsilon_{it}$ is the error term at time $t$. If $\beta_i > 1$, the stock is called an aggressive stock (high risk); otherwise, it is a defensive stock (low risk). In this study, we preserve the interval format of the return of stock $i$ as
$$[r_{it}^{l}, r_{it}^{u}] = \left[ \frac{P_{it}^{l} - P_{it-1}^{A}}{P_{it-1}^{A}}, \; \frac{P_{it}^{u} - P_{it-1}^{A}}{P_{it-1}^{A}} \right],$$
where $P_{it}^{u}$, $P_{it}^{l}$, and $P_{it}^{A}$ are the maximum, minimum, and average prices of the individual stock $i$ at time $t$, respectively. Note that the risk-free rates $r_{ft}^{l}$ and $r_{ft}^{u}$ are assumed to be zero in this empirical study.
Again, we preserve the interval format of the market return as
$$[r_{mt}^{l}, r_{mt}^{u}] = \left[ \frac{P_{mt}^{l} - P_{mt-1}^{A}}{P_{mt-1}^{A}}, \; \frac{P_{mt}^{u} - P_{mt-1}^{A}}{P_{mt-1}^{A}} \right],$$
where $P_{mt}^{u}$, $P_{mt}^{l}$, and $P_{mt}^{A}$ are the maximum, minimum, and average values of the market index at time $t$, respectively.
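Under these definitions, the interval returns are computed from each day's minimum and maximum prices and the previous day's average price. A small sketch with made-up prices:

```python
def interval_return(p_low, p_high, prev_avg):
    """Lower and upper return bounds relative to the previous day's
    average price, following the interval construction in the text."""
    return (p_low - prev_avg) / prev_avg, (p_high - prev_avg) / prev_avg

# hypothetical daily prices: (low, high, average)
prices = [(98.0, 102.0, 100.0), (99.0, 104.0, 101.0), (100.0, 105.0, 103.0)]
returns = [interval_return(lo, hi, prices[t - 1][2])
           for t, (lo, hi, _) in enumerate(prices) if t > 0]
```

For the second day this gives the interval return (−0.01, 0.04) relative to the first day's average price of 100.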
We constructed a neural network using stock returns from the Stock Exchange of Thailand (SET), the major stock market in Thailand. We selected three major stocks with high market capitalization at the beginning of the sample period, collecting SET50 (a proxy of the market) and the three company stocks, PTT, SCC, and CPALL, over the period from 4 January 2012 to 30 December 2019. The interval-valued data were constructed from the daily range of the selected price indexes, i.e., the lowest and highest trading values for each day were used to define the market movement for that day. The interval-valued predictions for these three stocks were then made using the models developed in Section 3. We note that all interval price series were transformed into interval returns following Equations (34) and (35). The descriptive statistics, namely the mean, standard deviation, minimum, and maximum of the variables for the full sample, are summarized in Table 3. The returns exhibited negative skewness for the lower-bound returns and positive skewness for the upper-bound returns. In addition, the skewness values on the two sides were asymmetric, indicating that gains and losses in the Thai stock market behaved quite differently. For illustration, the relationship between the stock market and each stock is shown in Figure 4.
Moreover, a unit root test was conducted to examine whether each time series was nonstationary and possessed a unit root. We used the minimum Bayes factor (MBF) as the testing tool. The MBF has a significant advantage over the p-value because the likelihood of the observed data can be expressed under each hypothesis [33]. If 1/3 < MBF < 1, 1/10 < MBF < 1/3, 1/30 < MBF < 1/10, 1/100 < MBF < 1/30, 1/300 < MBF < 1/100, or MBF < 1/300, the MBF indicates, respectively, weak, moderate, substantial, strong, very strong, and decisive evidence against the null hypothesis. According to the results, all data series were decisively stationary, as shown by the low MBF values [33].
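For a z-type test statistic, a standard formulation of the minimum Bayes factor is Goodman's MBF = exp(−z²/2), which reproduces the unit-root MBF values reported in Table 3 (e.g., exp(−2.230²/2) ≈ 0.083). A sketch of this formula together with the evidence categories above:

```python
import math

def minimum_bayes_factor(z):
    """Goodman's minimum Bayes factor for a z-type test statistic:
    MBF = exp(-z**2 / 2). Smaller values indicate stronger evidence
    against the null hypothesis."""
    return math.exp(-z ** 2 / 2)

def evidence_category(mbf):
    """Map an MBF to the evidence categories used in the text."""
    cuts = [(1 / 300, "decisive"), (1 / 100, "very strong"),
            (1 / 30, "strong"), (1 / 10, "substantial"),
            (1 / 3, "moderate"), (1.0, "weak")]
    for cut, label in cuts:
        if mbf < cut:
            return label
    return "no evidence against the null"

mbf = minimum_bayes_factor(3.454)   # unit-root statistic for the SET lower bound
```

Here the MBF is about 0.003, matching Table 3 and giving decisive evidence against the unit-root null.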

5.2. Comparison Results

This section presents the results of the artificial neural network models for interval-valued data from Thailand's stock market. In this empirical example, we considered one to three hidden neurons combined with four activation functions; again, one input layer, one hidden layer, and one output layer were assumed. Thus, twelve ANN specifications were used to describe and forecast the excess returns of PTT, SCC, and CPALL under the CAPM context. To evaluate the forecasting performance of these twelve specifications, each dataset was split into 80% for training and 20% for testing, so both in-sample and out-of-sample forecasts were conducted in this comparison. The performance of the interval-valued forecasting models was evaluated through the MAE, MSE, and RMSE.
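For time series data the 80/20 split is chronological (no shuffling), so the test segment follows the training segment in time. A minimal sketch:

```python
def chrono_split(series, train_frac=0.8):
    """Split a time-ordered sequence into training and test segments,
    keeping the time order, as in the 80/20 design used here."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

data = list(range(100))          # stand-in for 100 interval observations
train, test = chrono_split(data)
```

The in-sample errors are then computed on `train` and the out-of-sample errors on `test`.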
Table 4 shows the in-sample and out-of-sample estimation results for the different ANN-CC specifications. Focusing on the MAE, MSE, and RMSE, the results indicate that the interval-valued predictions were somewhat sensitive to the activation function. Comparing the activation functions, we found that the exponential activation function enabled more accurate forecasting in most cases, as its MAE, MSE, and RMSE were lower than those of the other activation functions. Comparing the numbers of hidden nodes, we observed that a larger number of hidden nodes lowered the MAE, MSE, and RMSE in some cases. Finally, we considered the weights of the interval-valued data, calculated by the convex combination method. The results show that most of the weight parameters for the interval excess stock return and the interval excess market return were not equal to 0.5. Therefore, the center assumption in the neural network forecasting of Maia et al. [25] may be inappropriate for practical problems. This result confirms the reliability of the convex combination method in ANN-CC forecasting.
Furthermore, we compared the performance of the ANN-CC models with the traditional ANN models: ANN-Center (Table 5) and RANN-LU (Table 6), with both tables reporting the MAE, MSE, and RMSE. The same four activation functions and numbers of hidden neurons were compared, and again the exponential activation function provided higher performance in most cases. The prediction error also decreased with a larger number of hidden neurons, as the MAE, MSE, and RMSE of the ANN-Center and RANN-LU models were lower when the number of hidden neurons increased.
For clarity, the best prediction models for the three stocks are summarized in Table 7. The exponential activation function was selected in most cases (except for ANN-Center in the PTT case), indicating a nonlinear pattern in these three stocks. The performance measures of the best ANN-CC specifications were lower than those of ANN-Center and RANN-LU for all stock indices, meaning that the ANN-CC model was superior to the conventional models in both in-sample and out-of-sample forecast performance. Additionally, we compared the performance of ANN-CC against methods proposed in the literature, namely the center and center-range methods of Billard and Diday [7], the PM method of Souza et al. [9], and the IT method of Buansing et al. [12]. The results clearly confirmed the higher prediction performance of our proposed ANN-CC.
A robustness check was also conducted to confirm the performance of the proposed model. In practice, simple loss functions such as the MAE, MSE, and RMSE may not provide sufficient information to identify a single forecasting model as "best". Therefore, another accuracy measure, the model confidence set (MCS) introduced by Hansen et al. [34], was used to evaluate forecasting performance. Our MCS tests were based on three loss functions: the MAE, MSE, and RMSE. We note that a higher p-value means that the null hypothesis of equal predictive ability is less likely to be rejected for that model; in other words, the greater the p-value, the better the model. For more details of the MCS test, we refer to Hansen et al. [34].
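The full MCS procedure of Hansen et al. [34] combines a bootstrap with iterative elimination of inferior models; as a rough illustration only, the t-statistic on a pairwise loss differential that lies at its core can be sketched as follows (with hypothetical per-observation losses):

```python
import math

def loss_differential_tstat(loss_a, loss_b):
    """t-statistic for the mean loss differential between two models,
    the building block of equal-predictive-ability tests such as MCS.
    The full MCS procedure adds a bootstrap and iterative elimination."""
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)   # sample variance
    return mean / math.sqrt(var / n)

# hypothetical per-observation squared errors for two models
loss_a = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10]
loss_b = [0.15, 0.14, 0.16, 0.13, 0.17, 0.15]
t = loss_differential_tstat(loss_a, loss_b)   # negative: model A loses less
```

A strongly negative statistic favors model A; in the MCS procedure, models with significantly worse losses are eliminated until only models with statistically indistinguishable performance remain.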
The statistical performance results for all competing models are provided in brackets in Table 7. According to the MCS test results, the ANN-CC model clearly outperformed the other competing models for both in-sample and out-of-sample forecasts. The p-values of the ANN-CC model were equal to one, while those of the other three models were below the 0.10 threshold. This means that the ANN-Center, RANN-LU, and linear regression forecasting models were eliminated during the MCS inspection process, leaving ANN-CC as the only surviving model.

5.3. Hong Kong Air Quality Monitoring Dataset

As a second example, we considered the Hong Kong air quality monitoring dataset. This dataset was suggested in Yang et al. [27] and can be retrieved from http://www.epd.gov.hk (accessed on 20 February 2021), which provides hourly air quality data from 16 monitoring stations in Hong Kong. We considered the data from the Central/Western station and downloaded the hourly data from 1 January 2020 to 31 December 2020. We then aggregated the hourly data into daily minimum and maximum values. Of the seven air quality indicators in the database, we selected respirable suspended particulates (RSP) as the interval-valued response variable and nitrogen dioxide (NO2) and sulfur dioxide (SO2) as the interval-valued explanatory variables.
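The hourly-to-daily aggregation described here simply takes each day's minimum and maximum readings. A small sketch with made-up hourly values:

```python
from collections import defaultdict

def daily_intervals(records):
    """Aggregate hourly readings into daily (min, max) intervals.

    `records` is an iterable of (date_str, value) pairs, e.g. hourly
    RSP readings; this mirrors the aggregation described in the text."""
    by_day = defaultdict(list)
    for day, value in records:
        by_day[day].append(value)
    return {day: (min(v), max(v)) for day, v in sorted(by_day.items())}

# hypothetical hourly RSP readings
hourly = [("2020-01-01", 35.0), ("2020-01-01", 52.5), ("2020-01-01", 41.0),
          ("2020-01-02", 28.0), ("2020-01-02", 61.0)]
intervals = daily_intervals(hourly)
```

Each day then contributes one interval-valued observation to the model.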
Again, the dataset was split into 80% for training and 20% for testing, so both in-sample and out-of-sample forecasts were conducted in this comparison. The performance of the interval-valued forecasting models was evaluated through the MAE, MSE, and RMSE. As shown in Table 8, and similar to the first example, the proposed ANN-CC model was the best in terms of the MAE, MSE, RMSE, and MCS p-value.
To illustrate the performance of the ANN-CC models, Figure 5 shows the out-of-sample forecasting results of our model for the Thai stocks and RSP. For clarity, only 10% of the out-of-sample forecasting results are shown. In this figure, each red vertical line segment represents a predicted interval-valued observation, while each gray vertical line segment represents the actual interval-valued observation; the extremes correspond to the minimum and maximum interval values. The comparison between actual and predicted values indicates the quality of the modeling and prediction task. The predicted values were very close to the actual values, indicating the goodness of fit of our model.

5.4. Discussion

Although traditional models like ANN-Center and RANN-LU provide acceptable prediction results for all stocks and RSP, they still face some limitations. ANN-Center relies on the midpoint of the data, which may not reflect the real behavior of the data during the day. Likewise, RANN-LU predicts future observations from the lower and upper bounds separately, clearly leaving out the information within the interval-valued data. Our proposed ANN-CC model addresses this challenge by considering the whole information within the interval. From the experimental results above, we can draw the following conclusions:
(1) Regarding prediction performance, our ANN-CC was superior to the other traditional models on all datasets.
(2) The weight within the interval data should not be fixed at w = 0.5. We found the prediction results sensitive to the weight w; thus, the weight should not be a fixed parameter.
(3) Our model outperformed the ANN-Center and RANN-LU models in situations in which the interval series exhibited linear or nonlinear behavior.
(4) We also studied the sensitivity of each activation function and found that the quality of the prediction model was not very sensitive to it in many cases. Nevertheless, the activation function should be chosen with care.
(5) Even though the exponential activation function seemed to fit best in this ANN architecture, other activation functions performed well in some cases. Although the exponential activation function performed very well for the three selected stocks, it may not be reliable for other stocks or under other ANN structures.
(6) Overall, our ANN-CC is a promising model for interval-valued data forecasting. It has the advantages of neither constraining the weights nor fixing the reference points: the model is adaptive and adjusts itself for the best fit. The fitted model allows analyzing the behavior of the response's lower and upper bounds based on the variation of the reference points of the input and output intervals.

6. Conclusions

This paper proposed an artificial neural network with convex combination (ANN-CC) method for interval-valued data prediction. Simulation and experimental results on real data showed that the proposed ANN-CC model is a useful tool for interval-valued prediction tasks, especially for complicated nonlinear datasets. Moreover, the proposed model fills a research gap by handling interval-valued data through the convex combination method. We examined our model by comparing its performance with the conventional ANN with the center method (ANN-Center), the regularized ANN-LU (RANN-LU), and linear regression with the center, center-range, PM, and IT methods, using three stock returns in the Thai stock market and the Hong Kong air quality monitoring dataset in the empirical comparison. According to the in-sample and out-of-sample forecasts, the performances of the various ANN-CC specifications were not much different; however, the tanh activation function performed well in both the in-sample and out-of-sample forecasts, while the linear activation function performed relatively well in the in-sample forecast. In addition, we confirmed the higher performance of our ANN-CC relative to ANN-Center and RANN-LU, and the experimental results on two real datasets also confirmed that the proposed ANN-CC model outperformed the conventional models.
In this study, a neural network with one hidden layer was assumed. However, real-world data are quite complex, and one hidden layer may not be enough to learn them; thus, a deep neural network (two or more hidden layers) should be more promising in approximation performance. Another meaningful direction for future work is to employ other deep learning methods, such as recurrent neural networks and long short-term memory, which process data in time order and learn from previous time steps to predict future values. In addition to deep learning methods, the fuzzy inference system (FIS) modeling approach for interval-valued time series forecasting [1] is also suggested for further study. Finally, our proposed model can be applied to forecasting in other areas, such as the environmental and medical sciences.

Author Contributions

Conceptualization, R.P. and W.Y.; methodology, W.Y.; software, W.Y.; validation, R.P., P.M. and W.Y.; formal analysis, R.P.; investigation, W.Y.; resources, P.M.; data curation, P.M.; writing—original draft preparation, W.Y.; writing—review and editing, W.Y.; visualization, R.P.; supervision, W.Y. and P.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

In this study, we used simulated data to show the performance of our model, and the simulation processes are explained in the paper. For the real data analysis, the data can be freely collected from Thomson Reuters DataStream. The data are also available from the author upon request ([email protected] (accessed on 27 April 2021)).

Acknowledgments

The authors would like to thank Laxmi Worachai, Vladik Kreinovich, and Hung T. Nguyen for their helpful comments on this paper. The authors are also grateful for the financial support offered by the Center of Excellence in Econometrics, Chiang Mai University, Thailand.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Maciel, L.; Ballini, R. A fuzzy inference system modeling approach for interval-valued symbolic data forecasting. Knowl. Based Syst. 2019, 164, 139–149.
2. Ma, X.; Dong, Y. An estimating combination method for interval forecasting of electrical load time series. Expert Syst. Appl. 2020, 158, 113498.
3. Chou, J.S.; Truong, D.N.; Le, T.L. Interval forecasting of financial time series by accelerated particle swarm-optimized multi-output machine learning system. IEEE Access 2020, 8, 14798–14808.
4. Lauro, C.N.; Palumbo, F. Principal component analysis of interval data: A symbolic data analysis approach. Comput. Stat. 2000, 15, 73–87.
5. Branzei, R.; Branzei, O.; Gök, S.Z.A.; Tijs, S. Cooperative interval games: A survey. Cent. Eur. J. Oper. Res. 2010, 18, 397–411.
6. Kiekintveld, C.; Islam, T.; Kreinovich, V. Security games with interval uncertainty. In Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems, St. Paul, MN, USA, 6–10 May 2013; pp. 231–238.
7. Billard, L.; Diday, E. Regression analysis for interval-valued data. In Data Analysis, Classification, and Related Methods; Springer: Berlin/Heidelberg, Germany, 2000; pp. 369–374.
8. Neto, E.D.A.L.; de Carvalho, F.D.A. Centre and range method for fitting a linear regression model to symbolic interval data. Comput. Stat. Data Anal. 2008, 52, 1500–1515.
9. Souza, L.C.; Souza, R.M.; Amaral, G.J.; Silva Filho, T.M. A parametrized approach for linear regression of interval data. Knowl. Based Syst. 2017, 131, 149–159.
10. Chanaim, S.; Sriboonchitta, S.; Rungruang, C. A Convex Combination Method for Linear Regression with Interval Data. In Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2016; Huynh, V.N., Inuiguchi, M., Le, B., Le, B., Denoeux, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 9978, pp. 469–480.
11. Phadkantha, R.; Yamaka, W.; Tansuchat, R. Analysis of Risk, Rate of Return and Dependency of REITs in ASIA with Capital Asset Pricing Model. In Predictive Econometrics and Big Data. TES 2018; Kreinovich, V., Sriboonchitta, S., Chakpitak, N., Eds.; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2018; Volume 753, pp. 536–548.
12. Buansing, T.T.; Golan, A.; Ullah, A. An information-theoretic approach for forecasting interval-valued SP500 daily returns. Int. J. Forecast. 2020, 36, 800–813.
13. Zhang, G.P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003, 50, 159–175.
14. Mondal, P.; Shit, L.; Goswami, S. Study of effectiveness of time series modeling (ARIMA) in forecasting stock prices. Int. J. Comput. Sci. Eng. Appl. 2014, 4, 13.
15. McMillan, D.G. Nonlinear predictability of stock market returns: Evidence from nonparametric and threshold models. Int. Rev. Econ. Financ. 2001, 10, 353–368.
16. Nyberg, H. Predicting bear and bull stock markets with dynamic binary time series models. J. Bank. Financ. 2013, 37, 3351–3363.
17. Pastpipatkul, P.; Maneejuk, P.; Sriboonchitta, S. Markov Switching Regression with Interval Data: Application to Financial Risk via CAPM. Adv. Sci. Lett. 2017, 23, 10794–10798.
18. Phochanachan, P.; Pastpipatkul, P.; Yamaka, W.; Sriboonchitta, S. Threshold regression for modeling symbolic interval data. Int. J. Appl. Bus. Econ. Res. 2017, 15, 195–207.
19. Hiransha, M.; Gopalakrishnan, E.A.; Menon, V.K.; Soman, K.P. NSE stock market prediction using deep-learning models. Procedia Comput. Sci. 2018, 132, 1351–1362.
20. Haykin, S.; Principe, J. Making sense of a complex world [chaotic events modeling]. IEEE Signal Process. Mag. 1998, 15, 66–81.
21. Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62.
22. Leung, D.; Abbenante, G.; Fairlie, D.P. Protease inhibitors: Current status and future prospects. J. Med. Chem. 2000, 43, 305–341.
23. Cao, Q.; Leggio, K.B.; Schniederjans, M.J. A comparison between Fama and French's model and artificial neural networks in predicting the Chinese stock market. Comput. Oper. Res. 2005, 32, 2499–2512.
24. San Roque, A.M.; Maté, C.; Arroyo, J.; Sarabia, Á. iMLP: Applying multi-layer perceptrons to interval-valued data. Neural Process. Lett. 2007, 25, 157–169.
25. Maia, A.L.S.; de Carvalho, F.D.A.; Ludermir, T.B. Forecasting models for interval-valued time series. Neurocomputing 2008, 71, 3344–3352.
26. Maia, A.L.S.; de Carvalho, F.D.A. Holt's exponential smoothing and neural network models for forecasting interval-valued time series. Int. J. Forecast. 2011, 27, 740–759.
27. Yang, Z.; Lin, D.K.; Zhang, A. Interval-valued data prediction via regularized artificial neural network. Neurocomputing 2019, 331, 336–345.
28. Mir, M.; Nasirzadeh, F.; Kabir, H.D.; Khosravi, A. Neural Network-based interval forecasting of construction material prices. J. Build. Eng. 2021, 39, 102288.
29. Moore, R.E. Methods and Applications of Interval Analysis; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1979.
30. Humphrey, G.B.; Maier, H.R.; Wu, W.; Mount, N.J.; Dandy, G.C.; Abrahart, R.J.; Dawson, C.W. Improved validation framework and R-package for artificial neural network models. Environ. Model. Softw. 2017, 92, 82–106.
31. Sharpe, W.F. Capital asset prices: A theory of market equilibrium under conditions of risk. J. Financ. 1964, 19, 425–442.
32. Lintner, J. Security prices, risk, and maximal gains from diversification. J. Financ. 1965, 20, 587–615.
33. Maneejuk, P.; Yamaka, W. Significance test for linear regression: How to test without P-values? J. Appl. Stat. 2021, 48, 827–845.
34. Hansen, P.R.; Lunde, A.; Nason, J.M. The model confidence set. Econometrica 2011, 79, 453–497.
Figure 1. The network architecture of the interval-valued data artificial neural network with convex combination (ANN-CC) model (one hidden layer is assumed).
Figure 2. Interval-valued data plot for the linear case. The red dots indicate the mid-point value within the interval.
Figure 3. Interval-valued data plot for the nonlinear case. The red dots indicate the mid-point value within the interval.
Figure 4. Interval-valued data plots for Stock Exchange of Thailand (SET) and stocks. The red dots present the midpoint data for SET and the stocks.
Figure 5. The out-of-sample interval forecasts by ANN-CC (red) vs. the actual data (gray) for the example data.
Table 1. Experimental results for the linear case.
Scenario 1:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 2.4011 (0.1548) | 10.8590 (1.3251) | 3.2951 (1.2434) | 2.3154 (0.1215) | 9.4115 (1.1154) | 3.0681 (1.0859) | 3.1244 (1.1245) | 14.2251 (2.3414) | 3.7729 (1.5521) |
| sigmoid | 2.5870 (0.1211) | 10.9894 (1.3511) | 3.3154 (1.2433) | 2.4023 (0.1011) | 9.5558 (1.1584) | 3.0917 (1.0756) | 3.1148 (1.3584) | 12.5441 (3.6974) | 3.5423 (1.2532) |
| linear | 2.7244 (0.1513) | 12.0524 (1.1125) | 3.4723 (1.3234) | 2.4488 (0.1254) | 10.3554 (1.0015) | 3.2184 (1.0028) | 3.5415 (1.2121) | 14.3554 (1.5487) | 3.7890 (1.1112) |
| exp | 2.5980 (0.1148) | 11.3486 (1.981) | 3.3693 (1.2113) | 2.4223 (0.1011) | 10.0215 (1.1057) | 3.1661 (1.0723) | 3.1057 (1.2554) | 12.5015 (3.4548) | 3.5361 (0.9723) |

Scenario 2:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 2.1350 (0.1254) | 7.0789 (1.3258) | 2.6610 (1.1123) | 3.1332 (2.1254) | 9.0173 (2.4848) | 3.0021 (1.4434) | 3.5445 (1.3145) | 15.1541 (2.6879) | 3.8935 (1.2329) |
| sigmoid | 2.3328 (0.1158) | 8.6030 (1.2217) | 2.9337 (1.0283) | 4.2214 (3.3141) | 10.6030 (1.6278) | 3.2565 (1.3233) | 2.5847 (1.2597) | 9.6984 (3.3354) | 3.1154 (1.2810) |
| linear | 2.7825 (0.1698) | 8.9356 (1.1369) | 2.9894 (1.0022) | 4.3112 (2.6545) | 10.9356 (2.3679) | 3.3078 (1.2232) | 4.1125 (1.3541) | 14.0778 (1.1548) | 3.7533 (1.0012) |
| exp | 2.4625 (0.1354) | 8.5072 (1.2589) | 2.9173 (0.9233) | 3.5845 (2.3651) | 10.4072 (2.2589) | 3.2269 (1.1129) | 3.4797 (1.5479) | 10.3155 (1.1554) | 3.2129 (0.9928) |

Scenario 3:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 2.1049 (0.1112) | 6.9441 (1.5159) | 2.6352 (0.6333) | 3.4488 (2.1254) | 9.9410 (4.3549) | 3.1539 (1.6727) | 2.4141 (1.0413) | 8.5454 (2.5444) | 2.9245 (1.5529) |
| sigmoid | 2.1249 (0.1874) | 7.0088 (1.6511) | 2.6468 (0.7843) | 3.9784 (2.3743) | 15.0113 (5.4035) | 3.8730 (1.2270) | 2.5454 (1.1115) | 10.5544 (2.1125) | 3.2456 (1.2332) |
| linear | 2.8524 (0.1369) | 12.9612 (1.3594) | 3.6013 (1.0091) | 3.8411 (2.6588) | 14.9023 (5.8941) | 3.8607 (1.3410) | 2.9445 (0.8797) | 13.1154 (2.1112) | 3.6210 (1.2221) |
| exp | 2.4489 (0.1364) | 10.4617 (1.5114) | 3.2344 (0.8833) | 4.8778 (4.2643) | 16.4107 (5.4113) | 4.0511 (2.0013) | 3.0124 (1.0694) | 11.3547 (1.9967) | 3.2683 (1.1009) |

Note: Values in parentheses are standard deviations. Bold numbers indicate the lowest values of the mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).
Table 2. Experimental results for the nonlinear case.
Scenario 1:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 3.6341 (0.3015) | 14.8140 (2.3840) | 3.8491 (1.0023) | 3.1258 (0.2474) | 9.8741 (2.0126) | 3.1438 (0.9343) | 3.9874 (1.4126) | 14.9874 (3.2158) | 3.8715 (0.9323) |
| sigmoid | 3.4558 (0.3114) | 10.9894 (5.0114) | 3.3155 (0.9833) | 3.2154 (0.2099) | 9.9741 (2.3654) | 3.1582 (0.8834) | 4.9845 (1.9136) | 15.3898 (5.1145) | 3.9245 (0.9823) |
| linear | 5.3155 (1.4259) | 17.1148 (5.6584) | 4.1377 (1.0824) | 5.2145 (1.3978) | 16.4213 (3.3665) | 4.0523 (1.1112) | 5.4136 (2.0113) | 17.8854 (5.7897) | 4.2295 (1.2334) |
| exp | 4.6211 (0.8797) | 16.3123 (2.8557) | 4.0391 (1.0067) | 3.1158 (0.9788) | 14.1314 (2.6547) | 3.7599 (0.9823) | 4.8654 (1.3541) | 17.0198 (3.9844) | 4.1256 (1.2220) |

Scenario 2:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 3.2250 (0.8453) | 9.1588 (3.6941) | 3.0372 (0.8832) | 3.9788 (2.5481) | 10.5448 (4.6641) | 3.2475 (1.1234) | 3.7888 (1.1255) | 9.8368 (3.6879) | 3.1367 |
| sigmoid | 4.0581 (1.3511) | 13.5154 (3.6444) | 3.6765 (0.8734) | 5.3698 (2.4145) | 20.3688 (4.3658) | 4.5139 (2.4449) | 4.3781 (1.8746) | 16.1556 (5.1034) | 4.0197 |
| linear | 4.8744 (2.5584) | 17.9981 (4.3398) | 4.2428 (0.9734) | 6.1142 (3.9451) | 27.6548 (10.4891) | 5.2597 (2.1240) | 5.6684 (2.9876) | 22.8314 (6.3598) | 4.7783 |
| exp | 4.1158 (0.7446) | 15.4458 (3.3658) | 3.9311 (0.9872) | 5.4101 (2.3155) | 20.0123 (4.4115) | 4.4744 (2.1098) | 4.2666 (1.3659) | 14.9155 (1.4231) | 3.8628 |

Scenario 3:

| Activation | ANN-CC MAE | ANN-CC MSE | ANN-CC RMSE | ANN-Center MAE | ANN-Center MSE | ANN-Center RMSE | RANN-LU MAE | RANN-LU MSE | RANN-LU RMSE |
|---|---|---|---|---|---|---|---|---|---|
| tanh | 3.4589 (0.7894) | 9.3158 (4.1158) | 3.0544 (0.9239) | 5.8115 (3.0125) | 21.1930 (10.3661) | 4.6033 (1.2323) | 3.5012 (1.2458) | 9.4125 (4.2320) | 3.0681 (1.9383) |
| sigmoid | 4.1585 (1.3841) | 14.3651 (3.9887) | 3.7901 (1.1112) | 5.9884 (2.3688) | 20.0113 (10.4035) | 4.4730 (1.2223) | 4.3685 (1.5645) | 14.8554 (4.3598) | 3.8544 (1.2234) |
| linear | 4.8664 (2.1355) | 16.6557 (6.0123) | 4.0814 (1.0389) | 5.9424 (2.6871) | 25.1253 (10.3211) | 5.0131 (1.4980) | 4.9785 (1.8994) | 16.9974 (5.3145) | 4.1229 (1.4409) |
| exp | 4.2556 (1.2154) | 14.4456 (4.1106) | 3.8009 (0.9227) | 6.6698 (5.1155) | 31.4107 (12.9987) | 5.6044 (1.3409) | 4.8664 (2.0115) | 15.5024 (2.1258) | 3.9377 (1.2284) |

Note: One hidden layer node is assumed for the ANN in this simulation study. Values in parentheses are standard deviations. Bold numbers indicate the lowest values of the MAE, MSE, and RMSE.
Table 3. Data description.
| | SET_u | SET_l | PTT_u | PTT_l | SCC_u | SCC_l | CPALL_u | CPALL_l |
|---|---|---|---|---|---|---|---|---|
| Mean | 0.012 | -0.011 | 0.019 | -0.018 | 0.018 | -0.017 | 0.021 | -0.019 |
| Median | 0.010 | -0.009 | 0.016 | -0.015 | 0.016 | -0.015 | 0.017 | -0.016 |
| Maximum | 0.097 | 0.025 | 0.136 | 0.033 | 0.113 | 0.027 | 0.185 | 0.032 |
| Minimum | -0.019 | -0.106 | -0.027 | -0.128 | -0.016 | -0.105 | -0.058 | -0.172 |
| Std. Dev. | 0.010 | 0.010 | 0.016 | 0.015 | 0.014 | 0.013 | 0.017 | 0.016 |
| Skewness | 1.894 | -1.812 | 1.576 | -1.564 | 1.653 | -1.475 | 2.038 | -2.280 |
| Kurtosis | 12.247 | 11.383 | 7.981 | 8.775 | 8.337 | 7.522 | 12.824 | 15.743 |
| Jarque–Bera | 7659.845 | 6398.846 | 2665.961 | 3308.472 | 3022.738 | 2235.950 | 8677.507 | 14,052.600 |
| MBF Jarque–Bera | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Observations | 1841 | 1841 | 1841 | 1841 | 1841 | 1841 | 1841 | 1841 |
| Unit root test | -2.230 | -3.454 | -2.125 | -2.125 | -2.661 | -2.385 | -2.307 | -3.255 |
| MBF unit root | 0.083 | 0.003 | 0.105 | 0.105 | 0.029 | 0.058 | 0.070 | 0.005 |

Note: _u and _l denote the upper and lower bounds, respectively.
Table 4. Experimental results of ANN-CC.
One Hidden Node

| PTT | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.648521 | 0.999000 | 0.014094 | 0.013565 | 0.000315 | 0.000298 | 0.017751 | 0.017265 |
| sigmoid | 0.608577 | 0.962942 | 0.011501 | 0.011931 | 0.000249 | 0.000237 | 0.015792 | 0.015362 |
| linear | 0.611678 | 0.999000 | 0.013707 | 0.013696 | 0.000303 | 0.000301 | 0.017401 | 0.017378 |
| exp | 0.607746 | 0.999000 | 0.010572 | 0.010023 | 0.000221 | 0.000209 | 0.014862 | 0.014465 |

| SCC | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.510112 | 0.927919 | 0.011382 | 0.011377 | 0.000218 | 0.000211 | 0.014768 | 0.014533 |
| sigmoid | 0.513121 | 0.910014 | 0.010506 | 0.011060 | 0.000190 | 0.000160 | 0.013781 | 0.012658 |
| linear | 0.519982 | 0.920239 | 0.011793 | 0.011423 | 0.000235 | 0.000232 | 0.015343 | 0.015238 |
| exp | 0.509987 | 0.930019 | 0.010734 | 0.011065 | 0.000193 | 0.000168 | 0.013897 | 0.012970 |

| CPALL | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.412312 | 0.481917 | 0.013848 | 0.013835 | 0.000391 | 0.000384 | 0.019765 | 0.019588 |
| sigmoid | 0.461660 | 0.500013 | 0.012627 | 0.012231 | 0.000235 | 0.000191 | 0.015343 | 0.013832 |
| linear | 0.386464 | 0.342022 | 0.013619 | 0.012975 | 0.000353 | 0.000277 | 0.018756 | 0.016650 |
| exp | 0.461668 | 0.500101 | 0.013608 | 0.012891 | 0.000342 | 0.000269 | 0.018489 | 0.016413 |

Two Hidden Nodes

| PTT | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.398420 | 0.398420 | 0.014168 | 0.013570 | 0.000328 | 0.000294 | 0.018121 | 0.017151 |
| sigmoid | 0.409278 | 0.409278 | 0.010203 | 0.009748 | 0.000167 | 0.000157 | 0.012934 | 0.012535 |
| linear | 0.421373 | 0.421373 | 0.013431 | 0.013755 | 0.000315 | 0.000298 | 0.017763 | 0.017269 |
| exp | 0.389728 | 0.389728 | 0.009030 | 0.009085 | 0.000151 | 0.000145 | 0.012275 | 0.012040 |

| SCC | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.358503 | 0.358503 | 0.012146 | 0.011344 | 0.000242 | 0.000201 | 0.0155543 | 0.014181 |
| sigmoid | 0.387659 | 0.387659 | 0.010360 | 0.010467 | 0.000162 | 0.000168 | 0.012736 | 0.012977 |
| linear | 0.359730 | 0.359730 | 0.010858 | 0.010981 | 0.000172 | 0.000190 | 0.013121 | 0.013779 |
| exp | 0.351820 | 0.351820 | 0.010011 | 0.010006 | 0.000156 | 0.000155 | 0.012494 | 0.012469 |

| CPALL | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.423709 | 0.423709 | 0.014073 | 0.013789 | 0.000318 | 0.000319 | 0.017836 | 0.017856 |
| sigmoid | 0.376741 | 0.376741 | 0.012700 | 0.012163 | 0.000255 | 0.000262 | 0.015960 | 0.016191 |
| linear | 0.431735 | 0.431735 | 0.015190 | 0.013474 | 0.000393 | 0.000301 | 0.019825 | 0.017356 |
| exp | 0.427827 | 0.427827 | 0.011830 | 0.011123 | 0.000212 | 0.000201 | 0.014567 | 0.014183 |

Three Hidden Nodes

| PTT | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.394010 | 0.394010 | 0.012486 | 0.011650 | 0.000279 | 0.000215 | 0.016751 | 0.014672 |
| sigmoid | 0.387755 | 0.387755 | 0.010510 | 0.009682 | 0.000192 | 0.000151 | 0.013866 | 0.012291 |
| linear | 0.425890 | 0.425890 | 0.012102 | 0.011565 | 0.000258 | 0.000222 | 0.016080 | 0.014912 |
| exp | 0.383586 | 0.383586 | 0.010312 | 0.009675 | 0.000180 | 0.000150 | 0.013425 | 0.012256 |

| SCC | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.363167 | 0.363167 | 0.011112 | 0.011669 | 0.000183 | 0.000215 | 0.013529 | 0.014668 |
| sigmoid | 0.337693 | 0.337693 | 0.010553 | 0.010319 | 0.000175 | 0.000161 | 0.013230 | 0.012692 |
| linear | 0.364510 | 0.364510 | 0.011384 | 0.011578 | 0.000187 | 0.000214 | 0.013677 | 0.014633 |
| exp | 0.367622 | 0.367622 | 0.010887 | 0.010341 | 0.000179 | 0.000164 | 0.013381 | 0.012815 |

| CPALL | w1 | w2 | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.399262 | 0.399262 | 0.014182 | 0.013737 | 0.000397 | 0.000306 | 0.019931 | 0.017499 |
| sigmoid | 0.410937 | 0.410937 | 0.012159 | 0.012331 | 0.000261 | 0.000261 | 0.016160 | 0.016158 |
| linear | 0.415327 | 0.415327 | 0.013745 | 0.012910 | 0.000319 | 0.000282 | 0.017865 | 0.016780 |
| exp | 0.423794 | 0.423794 | 0.011926 | 0.012321 | 0.000231 | 0.000252 | 0.015185 | 0.015873 |
Note: “in” and “out” denote in-sample and out-of-sample forecasts, respectively. Bold numbers indicate the lowest MAE, MSE, and RMSE in each case.
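The w1 and w2 columns in Table 4 are the estimated convex-combination weights that map each interval to a single reference point. A minimal sketch of this mapping, assuming the reference point is w·upper + (1 − w)·lower (the exact parameterization of the ANN-CC model, e.g. whether separate weights apply to input and output intervals, may differ from this simplification):

```python
import numpy as np

def convex_reference(lower, upper, w):
    """Map intervals [lower, upper] to reference points via a
    convex combination with weight w in [0, 1]."""
    assert 0.0 <= w <= 1.0
    return w * np.asarray(upper, dtype=float) + (1.0 - w) * np.asarray(lower, dtype=float)

# w = 0.5 recovers the interval midpoint used by center-based methods,
# while w near 0 or 1 emphasizes the lower or upper bound.
mid = convex_reference(np.array([1.0, 2.0]), np.array([3.0, 6.0]), 0.5)
```

This makes clear why the ANN-Center model (Table 5) is a special case: it fixes the weight at 0.5, whereas ANN-CC estimates it from the data.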
Table 5. Experimental results of ANN-Center.
One Hidden Node (weight parameters fixed at w1 = w2 = 0.500000 in all rows)

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.019527 | 0.019337 | 0.000626 | 0.000593 | 0.025023 | 0.024332 |
| sigmoid | 0.019831 | 0.019362 | 0.000639 | 0.000596 | 0.025286 | 0.024424 |
| linear | 0.019064 | 0.019355 | 0.000597 | 0.000583 | 0.024441 | 0.02415 |
| exp | 0.012656 | 0.011672 | 0.000261 | 0.000225 | 0.016154 | 0.015011 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.017691 | 0.017901 | 0.000479 | 0.000496 | 0.021866 | 0.022275 |
| sigmoid | 0.017738 | 0.017932 | 0.000506 | 0.000499 | 0.022495 | 0.022342 |
| linear | 0.017561 | 0.017892 | 0.000465 | 0.000486 | 0.021569 | 0.022032 |
| exp | 0.017469 | 0.017703 | 0.000455 | 0.000472 | 0.021335 | 0.021737 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.021078 | 0.020439 | 0.000801 | 0.000631 | 0.028314 | 0.025126 |
| sigmoid | 0.021049 | 0.020295 | 0.000797 | 0.000626 | 0.028236 | 0.025027 |
| linear | 0.021850 | 0.020635 | 0.000857 | 0.000658 | 0.029283 | 0.025666 |
| exp | 0.022129 | 0.020771 | 0.000891 | 0.000733 | 0.029861 | 0.027078 |

Two Hidden Nodes (w1 = w2 = 0.500000)

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.018938 | 0.019209 | 0.000558 | 0.000556 | 0.023622 | 0.023830 |
| sigmoid | 0.019426 | 0.019363 | 0.000608 | 0.000611 | 0.024658 | 0.024490 |
| linear | 0.018735 | 0.018200 | 0.000548 | 0.000541 | 0.023434 | 0.023269 |
| exp | 0.019611 | 0.019322 | 0.000629 | 0.000595 | 0.025092 | 0.024391 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.017883 | 0.017851 | 0.000472 | 0.000487 | 0.021731 | 0.022073 |
| sigmoid | 0.018067 | 0.017798 | 0.000517 | 0.000476 | 0.022742 | 0.021825 |
| linear | 0.018658 | 0.017653 | 0.000546 | 0.000461 | 0.023375 | 0.021481 |
| exp | 0.013203 | 0.012860 | 0.000291 | 0.000281 | 0.017057 | 0.016770 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.020415 | 0.020792 | 0.000690 | 0.000681 | 0.026271 | 0.026090 |
| sigmoid | 0.021457 | 0.020544 | 0.000752 | 0.000665 | 0.027426 | 0.025781 |
| linear | 0.020851 | 0.020698 | 0.000644 | 0.000692 | 0.025382 | 0.026314 |
| exp | 0.015538 | 0.014341 | 0.000472 | 0.000386 | 0.021739 | 0.019653 |

Three Hidden Nodes (w1 = w2 = 0.500000)

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.019384 | 0.019376 | 0.000581 | 0.000607 | 0.024115 | 0.024637 |
| sigmoid | 0.019355 | 0.019373 | 0.000608 | 0.000600 | 0.024667 | 0.024495 |
| linear | 0.019215 | 0.019423 | 0.000603 | 0.000602 | 0.024549 | 0.024536 |
| exp | 0.009504 | 0.009139 | 0.000178 | 0.000165 | 0.013344 | 0.012845 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.017898 | 0.017843 | 0.000493 | 0.000482 | 0.022213 | 0.021961 |
| sigmoid | 0.017337 | 0.017988 | 0.000458 | 0.000491 | 0.021412 | 0.022166 |
| linear | 0.017841 | 0.017864 | 0.000489 | 0.000483 | 0.022121 | 0.021968 |
| exp | 0.012653 | 0.012585 | 0.000273 | 0.000267 | 0.016530 | 0.016353 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.014358 | 0.014825 | 0.000393 | 0.000414 | 0.019813 | 0.020359 |
| sigmoid | 0.020873 | 0.020682 | 0.000718 | 0.000674 | 0.026790 | 0.025967 |
| linear | 0.021531 | 0.020515 | 0.000791 | 0.000655 | 0.028131 | 0.025598 |
| exp | 0.014286 | 0.014245 | 0.000413 | 0.000385 | 0.020323 | 0.019629 |
Note: “in” and “out” denote in-sample and out-of-sample forecasts, respectively. Bold numbers indicate the lowest MAE, MSE, and RMSE in each case.
Table 6. Experimental results of regularized ANN-LU (RANN-LU).
One Hidden Neuron

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.020572 | 0.020551 | 0.000699 | 0.000633 | 0.026443 | 0.025164 |
| sigmoid | 0.019717 | 0.019512 | 0.000641 | 0.000612 | 0.025322 | 0.024745 |
| linear | 0.011574 | 0.014203 | 0.000348 | 0.000239 | 0.018662 | 0.015467 |
| exp | 0.011498 | 0.014183 | 0.000347 | 0.000204 | 0.018634 | 0.014291 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.017615 | 0.017566 | 0.000461 | 0.000463 | 0.021466 | 0.021522 |
| sigmoid | 0.017725 | 0.017619 | 0.000529 | 0.000470 | 0.023012 | 0.021684 |
| linear | 0.018800 | 0.017891 | 0.000541 | 0.000487 | 0.023267 | 0.022070 |
| exp | 0.018008 | 0.017821 | 0.000519 | 0.000480 | 0.022791 | 0.021915 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.020698 | 0.020667 | 0.000679 | 0.000673 | 0.026053 | 0.025945 |
| sigmoid | 0.020784 | 0.020711 | 0.000735 | 0.000675 | 0.027123 | 0.025978 |
| linear | 0.020946 | 0.020754 | 0.000785 | 0.000730 | 0.028028 | 0.027032 |
| exp | 0.018078 | 0.016966 | 0.000571 | 0.000516 | 0.023883 | 0.022729 |

Two Hidden Neurons

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.020031 | 0.019463 | 0.000657 | 0.000611 | 0.025640 | 0.024736 |
| sigmoid | 0.009529 | 0.009196 | 0.000186 | 0.000165 | 0.013632 | 0.012854 |
| linear | 0.019771 | 0.019383 | 0.000608 | 0.000610 | 0.024667 | 0.024689 |
| exp | 0.009449 | 0.009187 | 0.000184 | 0.000160 | 0.013573 | 0.012657 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.018178 | 0.017976 | 0.000517 | 0.000506 | 0.022743 | 0.022488 |
| sigmoid | 0.013537 | 0.013251 | 0.000314 | 0.000294 | 0.017732 | 0.017155 |
| linear | 0.013337 | 0.013334 | 0.000295 | 0.000300 | 0.017182 | 0.017329 |
| exp | 0.013058 | 0.013048 | 0.000285 | 0.000281 | 0.016898 | 0.016773 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.014284 | 0.014469 | 0.000359 | 0.000406 | 0.018953 | 0.020149 |
| sigmoid | 0.014353 | 0.014793 | 0.000393 | 0.000415 | 0.019832 | 0.020377 |
| linear | 0.020375 | 0.020873 | 0.000663 | 0.000698 | 0.025759 | 0.026432 |
| exp | 0.020621 | 0.020821 | 0.000675 | 0.000696 | 0.025989 | 0.026378 |

Three Hidden Neurons

| PTT | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.019891 | 0.019421 | 0.000646 | 0.000611 | 0.025424 | 0.024728 |
| sigmoid | 0.019574 | 0.019510 | 0.000602 | 0.000621 | 0.024545 | 0.024933 |
| linear | 0.009158 | 0.009339 | 0.000173 | 0.000171 | 0.013169 | 0.013071 |
| exp | 0.009113 | 0.009108 | 0.000162 | 0.000156 | 0.012731 | 0.012481 |

| SCC | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.017995 | 0.018026 | 0.000514 | 0.000507 | 0.022674 | 0.022523 |
| sigmoid | 0.014593 | 0.013787 | 0.000355 | 0.000315 | 0.018857 | 0.017751 |
| linear | 0.018369 | 0.017916 | 0.000524 | 0.000503 | 0.022878 | 0.022455 |
| exp | 0.013318 | 0.013200 | 0.000296 | 0.000295 | 0.017223 | 0.017189 |

| CPALL | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| tanh | 0.020252 | 0.020950 | 0.000637 | 0.000713 | 0.025246 | 0.026711 |
| sigmoid | 0.020111 | 0.020974 | 0.000604 | 0.000719 | 0.024565 | 0.026829 |
| linear | 0.020553 | 0.020850 | 0.000651 | 0.000705 | 0.025521 | 0.026563 |
| exp | 0.014032 | 0.014000 | 0.000364 | 0.000393 | 0.019083 | 0.019832 |
Note: “in” and “out” denote in-sample and out-of-sample forecasts, respectively. Bold numbers indicate the lowest MAE, MSE, and RMSE in each case.
Table 7. Summary of the forecasting performance and model confidence set (MCS) test of the first dataset.
| PTT | Activation | Hidden Neurons | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| ANN-CC | exp | 2 | 0.009030 (1.0000) | 0.009085 (1.0000) | 0.000151 (1.0000) | 0.000145 (1.0000) | 0.012275 (1.0000) | 0.012040 (1.0000) |
| ANN-Center | linear | 2 | 0.018735 (0.0000) | 0.018200 (0.0000) | 0.000548 (0.0000) | 0.000541 (0.0000) | 0.023434 (0.0000) | 0.023269 (0.0000) |
| RANN-LU | exp | 3 | 0.009113 (0.2012) | 0.009108 (0.1015) | 0.000162 (0.2450) | 0.000156 (0.0010) | 0.012731 (0.3232) | 0.012481 (0.0000) |
| IF | | | 0.017932 (0.0000) | 0.017817 (0.0000) | 0.000513 (0.0000) | 0.000501 (0.0000) | 0.022652 (0.0000) | 0.022383 (0.0000) |
| PM | | | 0.018212 (0.0000) | 0.018326 (0.0000) | 0.000523 (0.0000) | 0.000561 (0.0000) | 0.022873 (0.0000) | 0.0236861 (0.0000) |
| Center | | | 0.021453 (0.0000) | 0.020541 (0.0000) | 0.000751 (0.0000) | 0.000664 (0.0000) | 0.027410 (0.0000) | 0.025771 (0.0000) |
| Center-range | | | 0.019877 (0.0000) | 0.01532 (0.0000) | 0.000717 (0.0000) | 0.000602 (0.0000) | 0.026771 (0.0000) | 0.024538 (0.0000) |

| SCC | Activation | Hidden Neurons | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| ANN-CC | exp | 2 | 0.010011 (1.0000) | 0.010006 (1.0000) | 0.000156 (1.0000) | 0.000155 (1.0000) | 0.012494 (1.0000) | 0.012469 (1.0000) |
| ANN-Center | exp | 3 | 0.012653 (0.0000) | 0.012585 (0.0000) | 0.000273 (0.0000) | 0.000267 (0.0000) | 0.016530 (0.0000) | 0.016353 (0.0000) |
| RANN-LU | exp | 2 | 0.013058 (0.0000) | 0.013048 (0.0000) | 0.000285 (0.0000) | 0.000281 (0.0000) | 0.016898 (0.0000) | 0.016773 (0.0000) |
| IF | | | 0.012455 (0.0000) | 0.012101 (0.0000) | 0.000250 (0.0000) | 0.000236 (0.0000) | 0.015813 (0.0000) | 0.015366 (0.0000) |
| PM | | | 0.013013 (0.0000) | 0.012992 (0.0000) | 0.000271 (0.0000) | 0.000263 (0.0000) | 0.016464 (0.0000) | 0.016221 (0.0000) |
| Center | | | 0.014044 (0.0000) | 0.014021 (0.0000) | 0.000285 (0.0000) | 0.000281 (0.0000) | 0.016877 (0.0000) | 0.016765 (0.0000) |
| Center-range | | | 0.013221 (0.0000) | 0.013019 (0.0000) | 0.000277 (0.0000) | 0.000275 (0.0000) | 0.016642 (0.0000) | 0.016581 (0.0000) |

| CPALL | Activation | Hidden Neurons | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| ANN-CC | exp | 2 | 0.011830 (1.0000) | 0.011123 (1.0000) | 0.000212 (1.0000) | 0.000201 (1.0000) | 0.014567 (1.0000) | 0.014183 (1.0000) |
| ANN-Center | exp | 3 | 0.014286 (0.0000) | 0.014245 (0.0000) | 0.000413 (0.0000) | 0.000385 (0.0000) | 0.020323 (0.0000) | 0.019629 (0.0000) |
| RANN-LU | exp | 3 | 0.014032 (0.0000) | 0.014000 (0.0000) | 0.000396 (0.0000) | 0.000393 (0.0000) | 0.019083 (0.0000) | 0.019832 (0.0000) |
| IF | | | 0.013251 (0.0000) | 0.013656 (0.0000) | 0.000372 (0.0000) | 0.000369 (0.0000) | 0.019291 (0.0000) | 0.019205 (0.0000) |
| PM | | | 0.015130 (0.0000) | 0.014561 (0.0000) | 0.000431 (0.0000) | 0.000426 (0.0000) | 0.020763 (0.0000) | 0.02071 (0.0000) |
| Center | | | 0.016125 (0.0000) | 0.015365 (0.0000) | 0.00510 (0.0000) | 0.000439 (0.0000) | 0.071423 (0.0000) | 0.020961 (0.0000) |
| Center-range | | | 0.019877 (0.0000) | 0.01532 (0.0000) | 0.000717 (0.0000) | 0.000602 (0.0000) | 0.026762 (0.0000) | 0.024524 (0.0000) |
Note: “in” and “out” denote in-sample and out-of-sample forecasts, respectively. Bold numbers indicate the lowest MAE, MSE, and RMSE in all cases. Numbers in brackets are p-values of the MCS test based on 1000 bootstrap replications.
Table 8. Summary of the forecasting performance and MCS test of the second dataset.
| PTT | Activation | Hidden Neurons | MAE out | MAE in | MSE out | MSE in | RMSE out | RMSE in |
| ANN-CC | sigmoid | 2 | 2.6374 (1.0000) | 2.7092 (1.0000) | 7.3232 (1.0000) | 7.4085 (1.0000) | 2.7078 (1.0000) | 2.7225 (1.0000) |
| ANN-Center | sigmoid | 2 | 2.9833 (0.0000) | 2.8363 (0.0000) | 8.1092 (0.0000) | 8.3233 (0.0000) | 2.8483 (0.0000) | 2.8854 (0.0000) |
| RANN-LU | sigmoid | 3 | 2.7762 (0.0000) | 2.7423 (0.0000) | 7.6403 (0.0000) | 7.5409 (0.0510) | 2.7656 (0.0000) | 2.74613 (0.0000) |
| IF | | | 2.8532 (0.0000) | 2.7821 (0.0000) | 8.5110 (0.0000) | 8.3368 (0.0000) | 2.9161 (0.0000) | 2.8866 (0.0000) |
| PM | | | 2.7011 (0.0000) | 2.6893 (0.0000) | 7.5215 (0.0000) | 7.4212 (0.0000) | 2.7427 (0.0000) | 2.7240 (0.0000) |
| Center | | | 3.0029 (0.0000) | 2.9861 (0.0000) | 8.7334 (0.0000) | 8.7433 (0.0000) | 2.9543 (0.0000) | 2.9571 (0.0000) |
| Center-range | | | 2.8763 (0.0000) | 2.7823 (0.0000) | 8.5532 (0.0000) | 8.3499 (0.0000) | 2.9239 (0.0000) | 2.8883 (0.0000) |
Note: “in” and “out” denote in-sample and out-of-sample forecasts, respectively. Bold numbers indicate the lowest MAE, MSE, and RMSE in all cases. Numbers in brackets are p-values of the MCS test based on 1000 bootstrap replications.
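The bracketed MCS p-values come from 1000 bootstrap replications: a p-value near 1 keeps a model in the set of best models, while a p-value near 0 eliminates it. As a simplified pairwise analogue (a bootstrap test of equal mean loss between two models, not the full multi-model MCS elimination procedure), the idea can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pvalue(loss_a, loss_b, n_boot=1000):
    """Two-sided bootstrap p-value for H0: equal mean loss.

    loss_a, loss_b: per-observation forecast losses (e.g. squared errors)
    of two competing models on the same evaluation sample."""
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    t_obs = abs(d.mean())
    d0 = d - d.mean()                  # recenter to impose the null
    n = d.size
    t_boot = np.array([abs(d0[rng.integers(0, n, n)].mean())
                       for _ in range(n_boot)])
    return (t_boot >= t_obs).mean()    # share of resamples at least as extreme
```

With this reading, the uniform (0.0000) entries for the competing models and (1.0000) for ANN-CC indicate that ANN-CC is the only model surviving in the confidence set.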
Yamaka, W.; Phadkantha, R.; Maneejuk, P. A Convex Combination Approach for Artificial Neural Network of Interval Data. Appl. Sci. 2021, 11, 3997. https://0-doi-org.brum.beds.ac.uk/10.3390/app11093997