Application of Fuzzy-Based Support Vector Regression to Forecast of International Airport Freight Volumes

Yang, Cheng-Hong; Shao, Jen-Chung; Liu, Yen-Hsien; Jou, Pey-Huah; Lin, Yu-Da

doi:10.3390/math10142399

¹ Department of Business Administration, Tainan University of Technology, Tainan 71002, Taiwan

² Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80778, Taiwan

³ Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung 80708, Taiwan

⁴ School of Dentistry, Kaohsiung Medical University, Kaohsiung 80708, Taiwan

⁵ Drug Development and Value Creation Research Center, Kaohsiung Medical University, Kaohsiung 80708, Taiwan

⁶ Department of Computer Science and Information Engineering, National Penghu University of Science and Technology, Penghu 880011, Taiwan

* Authors to whom correspondence should be addressed.

Mathematics 2022, 10(14), 2399; https://0-doi-org.brum.beds.ac.uk/10.3390/math10142399

Submission received: 2 June 2022 / Revised: 2 July 2022 / Accepted: 6 July 2022 / Published: 8 July 2022

(This article belongs to the Special Issue Advancements in Machine Learning and Statistical Modeling, and Real-World Applications)

Abstract

As freight volumes increase, airports are likely to require additional infrastructure development, increased air services, and expanded facilities. Prediction of freight volumes could ensure effective investment. Among the computational intelligence models, support vector regression (SVR) has become the dominant modeling paradigm. In this study, a fuzzy-based SVR (FSVR) model was used to solve the freight volume prediction problem in international airports. The FSVR model can use a fuzzy time series of historical traffic changes for predictions. A fuzzy classification algorithm was used for elements of similar levels in the time series to appropriately divide traffic changes into fuzzy sets, generate membership function values, and establish a fuzzy relationship to produce a fuzzy interpolation with a minimal error. A comparison of the FSVR model with other models revealed that the FSVR model had the lowest mean absolute percentage error (all < 2.5%), mean absolute error, and root mean square error for all types of traffic at all the analyzed airports. Fuzzy sets can handle uncertainty and imprecision in time series. Therefore, the prediction accuracy of the entire time series model is improved by taking advantage of SVR and fuzzy sets. By using the highly accurate FSVR model to predict the future growth of air freight volume, airport management could analyze their existing facilities and service capacity to identify operational bottlenecks and plan future development. The FSVR model is the most accurate forecasting model for air traffic forecasting.

Keywords:

air freight volume; fuzzy logic; support vector regression; time series prediction

MSC:

37M10

1. Introduction

Globalization has changed people’s lives. According to the International Monetary Fund [1], the four fundamental aspects of globalization are “trade and international exchanges” [2], “capital and investment flows” [3], “population flows”, and “knowledge spreading”. Increasing international trade and investment have integrated the global markets into a “global village” [4]. Air transportation, which facilitates international interactions, is a global industry. Transportation for any purpose, including tourism, entertainment, business, and freight delivery, deepens the connections among international terminals worldwide; these terminals are closely related and affect each other. In October 2018, the 20-Year Passenger Forecast by the International Air Transport Association (IATA) indicated that air passenger and freight transportation demand had remained strong and that the center of the industry was shifting eastward. The IATA further stated that air transport would double over the following 20 years compared with the level at that time [5]. The compound growth rate of air transport is expected to reach 3.5%, and by 2037 the number of air passengers is expected to double to 8.2 billion [6]. Furthermore, according to the global air transport periodic report published on 31 August 2021, air freight demand (calculated based on freight ton-kilometers) remained strong: global air freight demand in July 2021 was 8.6% higher than in July 2019, a substantial increase in comparison with the long-term average increase of 4.7% [7].

According to the IATA, global air traffic reached only 43% of its pre-pandemic level in 2021 [8], a downward revision from the estimation of 51% at the end of 2020. However, freight demand is increasing rapidly and may continue to exceed expectations. The IATA expected air freight volumes to increase by 13.1% in 2021 to 63.1 million tons, surpassing 2019 levels and approaching the peak in 2018. Air transport is ripe for development in numerous countries and the future of the aviation industry is bright. Aviation will flourish and remain a vital component of the world economy due to the benefits of increased international connectivity. However, the IATA has also suggested that airports and air traffic control may not be able to handle passenger demand if this trend continues. Governments and infrastructure management authorities must strategically plan the future development of aviation infrastructure and their decisions affect the value created by aviation in their corresponding regions [9].

Southwest Airlines instigated the age of low-cost aviation by entering the market as the earliest low-cost carrier (LCC) in 1971 [10]. In the last decade, LCCs have already increased the transportation demand. They provided approximately 3.6 billion seats in 2008 and this number had increased to approximately 5.3 billion by 2017. Likewise, the LCC market share grew from 21% in 2007 to 29% in 2017. Moreover, LCCs grew from having 4.4% of the intercontinental market share in 2008 to 11.4% in 2017; LCCs’ regional market share grew from 23.6% to 31.4% in the same time period. Thus, both intercontinental and regional LCCs have grown rapidly.

As air traffic increases, the aviation industry continues to develop. Governments worldwide must devise methods for reducing airline bottlenecks and developing domestic air transport infrastructure [11]. Existing infrastructure and air services must be improved in response to the increased demand and any expansion of these services must be sustainable. Forecasting airport operations is necessary to ensure that investments are well-targeted and obtain a return. Governments must also understand that globalization has increased the social and economic prosperity of the world and the use of protectionism to curb globalization will reduce development opportunities. Therefore, governments worldwide must perform appropriate traffic forecast planning for the aviation industry [12], construct efficient infrastructure and services, and deploy programs to meet national economic development goals [13].

Wang et al. explored the relative efficiency of three combined forecasts in forecasting the tourism demand for Hong Kong. Two-, three-, and four-model combinations were built from four modeling techniques: the autoregressive integrated moving average (ARIMA) model, autoregressive distributed lag model, error correction model, and vector autoregressive model. The results showed that the combined forecast outperformed the best single forecasting model and avoided the risk of complete forecast failure [14]. Therefore, when several predictive models are available but it is unknown which model would generate the best predictions, combining the predictions of the alternative models is a low-risk approach. Saayman used naive forecasting models, the Holt–Winters Exponential Smoothing Model (ETS) [15], ARIMA [16], and seasonal ARIMA (SARIMA) [17] to model and forecast tourism in South Africa’s major intercontinental tourism markets, namely the United Kingdom, Germany, the Netherlands, the United States, and France [18]. The results revealed that the SARIMA model had the most accurate arrival predictions on three time horizons: 3, 6, and 12 months. Univariate forecasts were concluded to provide relatively accurate forecasts of tourist arrivals in South Africa, especially in the short term. However, the model did not assess the impact of external events; thus, its policy applications are limited [18].

Hassani used the root mean square error (RMSE) and direction of change (DC) to evaluate models; singular spectrum analysis (SSA)-R, SSA-V [19], ARIMA, and the trigonometric Box–Cox autoregression (AR)-moving average (MA) trend seasonal model [20] outperformed other models for forecasting the number of tourists arriving in Europe. Suryani developed a system dynamics model to forecast future air freight demand and thereby determine the terminal capacity required to support long-term growth at Taiwan Taoyuan International Airport [21]. Alexander developed a gravity model to predict air freight demand and evaluated the ability of gravity models to predict and accurately explain major economic shocks such as the global financial crisis [22]. The least accurate models were a neural network [15], the ETS, and AR fractionally integrated MA (ARFIMA), MA, and weighted MA (WMA) models [23]. Cao et al. used a hybrid of ARIMA and support vector machines (SVMs) to analyze the diversion pattern of subway passenger transportation during holidays [24]. Passenger flow data from China’s Xi’an Line 3 subway station on National Day were used with the ARIMA model, the SVM model, and a hybrid of these models to predict the hourly passenger flow on the subway during the holiday. The results revealed that the relative error between the predicted and actual results was smaller for the hybrid model than for the other models. The hybrid model (ARIMA+SVM) is practical, generalizable, and suitable for predicting subway passenger diversion; thus, the hybrid model provides a quantitative basis for rationally developing the organization and management of passenger transport in subway stations during holidays or festivals [25].
Moreover, Bildirici proposed modeling strategies that benefit from neural network-based GARCH models and SVR-GARCH models to augment commonly used volatility models with support vector machines and neural networks [26]. Sharifian proposed a hybrid ensemble of SVR/GARCH predictors in each subscale and trained it to predict the refactored components of workload dynamic resource allocation in mobile cloud computing environments [27]. In summary, hybrid models can obtain better forecasting results than single models, and no single model is always superior in terms of prediction accuracy. Traditional time-series analysis models are not entirely inferior to machine-learning prediction models; a single traditional time-series analysis and forecasting model can be the best forecasting choice for a specific time series. However, all these models still have many disadvantages when forecasting in real situations. Traditional methods cannot handle forecasting problems in which the historical data are presented as approximate numerical values [28]. A comparison of the SVR-based and ARIMA-based methods, each of which has its own merits and weaknesses, has not been undertaken for forecasting from historical data presented as approximate numerical values.

The fuzzy time series (FTS) model has been demonstrated to effectively overcome the limitation related to forecasting from historical data presented as approximate numerical values [29]. Tai proposed an improved fuzzy time-series (IFTS) forecasting model that uses variations of data to effectively forecast such values. Moreover, IFTS has shown more accurate predictions than other fuzzy SVRs [30]. Our objective is to accurately forecast international airport freight volumes using fuzzy-based SVR (FSVR). Our FSVR used IFTS as the primary mechanism of the forecasting system, which employed SVR with an improved fuzzy set to approximate fuzzy upper and lower bounds and then approximate numerical forecast values. The parameters in the models were investigated to find the optimal values for each data set because no model was considered to be optimal in all situations. All parameters were investigated and determined by appropriate methods. The proposed model was applied to many data sets with different characteristics and compared with the existing models. Freight volume period data for membership functions are fuzzy values that can be used to simulate processes that require economic expertise and knowledge. Our forecasting system is suitable for seasonal time series and can interpolate historical data and predict the future. Furthermore, FSVR can efficiently handle time-series and nonlinear problems, resulting in better performance.
The contributions of this study are as follows: (1) development and optimization of an FSVR model for forecasting international airport freight volumes; (2) validation of the ability of the FSVR model to generate forecasts under multiple forecast parameters; (3) assessment of the model’s accuracy over 1 period and 12 periods by using robust statistical indicators to compare the international airport association’s forecast and observational data with data published on the airport authorities’ official websites.

2. Methods

2.1. Dataset Description

The object of this study was a time series of air traffic data. The selected airports were the 10 airports with the highest global airport passenger traffic in 2018 according to the statistics of Airports Council International; these 10 airports were Hartsfield–Jackson Atlanta International Airport (ATL), Beijing Capital International Airport (PEK), Dubai International Airport (DXB), Los Angeles International Airport (LAX), Tokyo International Airport (HND), Chicago O’Hare International Airport (ORD), London Heathrow Airport (LHR), Hong Kong International Airport (HKG), Shanghai Pudong International Airport (PVG), and Paris Charles de Gaulle International Airport (CDG) [31]. Data were collected for the period from August 2014 to December 2019. Data were obtained from the statistical reports of the International Airports Association and data published on the official websites of the airports’ management authorities [32]. For the training data set, 1590 observations among 10 airports were extracted from August 2014 to December 2018. The data used the original values of freight volume and the data unit was 10,000 tons for freight volume. For the test data set, 360 observations among 10 airports were extracted from January to December 2019. The time-series data used monthly intervals to sum the values of observations within the same month in an airport. The training data set was used to analyze and build various predictive models. The established model forecast was used to generate forecast values for the period of January to December 2019 and to verify the model forecast values against the test data set. The object of this study was to provide a forecasting time series of freight volume data.
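The observation counts above can be checked with a few lines of Python. This is a small sketch that assumes three monthly series per airport (passenger, aircraft movements, and freight, as listed in the input of Algorithm 1); that multiplier is inferred from the reported totals rather than stated in this paragraph, and the helper name `months_between` is purely illustrative.

```python
def months_between(start_year, start_month, end_year, end_month):
    """Count calendar months in an inclusive range."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

AIRPORTS = 10
SERIES_PER_AIRPORT = 3  # assumption: passenger, aircraft movements, freight

train_months = months_between(2014, 8, 2018, 12)  # August 2014 to December 2018
test_months = months_between(2019, 1, 2019, 12)   # January to December 2019

train_obs = train_months * AIRPORTS * SERIES_PER_AIRPORT
test_obs = test_months * AIRPORTS * SERIES_PER_AIRPORT
print(train_months, train_obs)  # 53 months, 1590 observations
print(test_months, test_obs)    # 12 months, 360 observations
```

Under this assumption the products reproduce the reported 1590 training and 360 test observations.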

2.2. Support Vector Regression

Support vector regression (SVR) extends the conventional SVM algorithm [33]. SVR is a supervised learning model based on regression analysis and is characterized by an ε-insensitive loss function for the training data; SVR can handle predictions of continuous data [34]. In SVR, a hyperplane is produced and the distance from the farthest sample point to the hyperplane is minimized. A nonlinear problem can be transformed into a linear problem by mapping the training data to a high-dimensional feature space. The training dataset is represented as {(x_i, y_i); i = 1, 2, …, N; x_i ∈ Rⁿ; y_i ∈ R}, where x_i is the ith n-dimensional input vector, y_i is the actual output value, and N is the dataset size. The SVR function is as follows:

y = f (x_{i}) = ω^{T} φ (x_{i}) + b

(1)

where f(x_i) is the predicted value and φ(x_i) is the feature function of the input; ω and b are adjustment factors, which are estimated using a penalty function as follows:

R (C) = \frac{1}{2} {‖ ω ‖}^{2} + C \cdot \frac{1}{n} \sum_{i = 1}^{n} {|y_{i} - f (x_{i})|}_{ε}

(2)

{|y - f (x)|}_{ε} = \begin{cases} 0, & |y - f (x)| \leq ε \\ |y - f (x)| - ε, & \text{otherwise} \end{cases}

(3)

where C is the penalty coefficient and ε is the maximum tolerable error. Two slack variables, ξ_{i} and ξ_{i}^{*}, are introduced to handle the infeasible constraints of the optimization problem; these can be expressed as follows:

\min_{ω, b, ξ^{(*)}} \frac{1}{2} {‖ ω ‖}^{2} + C \sum_{i = 1}^{n} (ξ_{i} + ξ_{i}^{*}), \text{ subject to } \begin{cases} y_{i} - ω^{T} φ (x_{i}) - b \leq ε + ξ_{i}, & i = 1, \dots, n, \\ ω^{T} φ (x_{i}) + b - y_{i} \leq ε + ξ_{i}^{*}, & i = 1, \dots, n, \\ ξ_{i}, ξ_{i}^{*} \geq 0, & i = 1, \dots, n, \end{cases}

(4)

where the slack variables ξ^{(*)} (that is, ξ_{i} and ξ_{i}^{*}) ensure that the constraints are satisfied, C controls the balance between model complexity and the training error rate, and ε is a constant that controls the error tolerance. If ε is too small, overfitting can occur; if it is too large, underfitting can occur. The Lagrangian dual of the optimization problem is obtained as follows:

\min_{α_{i}, α_{i}^{*}} \frac{1}{2} \sum_{i, j = 1}^{n} (α_{i} - α_{i}^{*}) (α_{j} - α_{j}^{*}) k (x_{i}, x_{j}) + \sum_{i = 1}^{n} ((ε - y_{i}) α_{i} + (ε + y_{i}) α_{i}^{*}), \text{ subject to } \begin{cases} \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) = 0, \\ 0 \leq α_{i}, α_{i}^{*} \leq C, & i = 1, \dots, n \end{cases}

(5)

Therefore, an SVR function can be obtained as follows:

f (x) = \sum_{i = 1}^{n} (α_{i} - α_{i}^{*}) k (x_{i}, x) + b,

(6)

where α_{i} and α_{i}^{*} are Lagrangian multipliers and k(x_i, x) is the kernel function. A multivariate model is built by additive decomposition of a univariate time-series model and the kernel function class is closed under additive decomposition for the SVR model. Five kernels (spline kernel, Gaussian radial basis function (RBF) kernel, linear kernel, polynomial kernel, and pair Hidden Markov Model (PHMM)) are commonly used for SVR models. Gaussian RBF kernels are widely used for nonlinear mapping between an input and a high-dimensional space. The Gaussian-RBF kernel works well with additive decomposition, especially when the interaction between the single time series is considered [35]. The Gaussian-RBF kernel assumes full interaction between a single time series and all time points in a window, which can be as error prone as the assumption of no interaction. Rohmah showed that the Gaussian-RBF kernel yields better results than other common SVR kernel functions in time-series analysis [36]. The Gaussian-RBF kernel function constructs a nonlinear decision hyperplane in the input space. The formula is as follows:

k (x_{i}, x) = \exp (- σ {‖ x - x_{i} ‖}^{2})

(7)

where σ is the scaling factor of the Gaussian RBF kernel.

The accuracy and stability of SVR are closely related to three parameters: the penalty coefficient C, kernel σ, and width ε of the insensitive loss function. C is the penalty coefficient and is used to balance the complexity of the model and training error; ε is the width of the insensitive loss function, which controls the width of the SVR sensitive band and affects the number of support vectors; finally, σ is the kernel parameter for the Gaussian RBF. The kernel parameter affects the distribution and range characteristics of the training sample data, which determine the width of the local neighborhood.
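Equations (3), (6), and (7) can be made concrete with a small pure-Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the dual coefficients (α_i − α_i*) and the bias b are supplied by hand rather than obtained by solving Equation (5), and inputs are plain lists of floats.

```python
import math

def eps_insensitive_loss(y, f_x, eps):
    """Equation (3): errors inside the eps-tube cost nothing."""
    return max(0.0, abs(y - f_x) - eps)

def rbf_kernel(x_i, x, sigma):
    """Equation (7): exp(-sigma * ||x - x_i||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-sigma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Equation (6): kernel expansion over the support vectors plus bias."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, dual_coefs)) + b

# Hypothetical values chosen by hand, not a fitted model.
support_vectors = [[0.0], [1.0]]
dual_coefs = [0.5, -0.5]  # stands in for (alpha_i - alpha_i*)
prediction = svr_predict([0.0], support_vectors, dual_coefs, b=1.0, sigma=1.0)
print(prediction)
print(eps_insensitive_loss(y=1.5, f_x=prediction, eps=0.1))
```

In this form, a larger σ makes the kernel decay faster with distance, so each support vector influences a narrower neighborhood, whereas a larger ε widens the tube within which errors are ignored.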

2.3. Fuzzy SVR

The FSVR model is based on the IFTS model [30]. The fuzzy system provides a dynamic, probable, and intensive rule base for the system to overcome uncertainties in the raw data [37,38]. The IFTS model is based on changes in data between two consecutive periods and the fuzzy relationship between elements in the series. The IFTS model can be used for fuzzy historical interpolation and prediction of the future; it can be applied to nonseasonal time series. In the proposed model, all parameters are calculated with appropriate methods to enable operations on data sets with different characteristics. The IFTS model is more effective for predictions than other models. The IFTS model is available as a function in an R package, which facilitates its application. The steps for the IFTS are introduced in Algorithm 1, and further details follow the algorithm [30].

Algorithm 1: Fuzzy time series using variations of data
Definition: The universal set U contains the interval between the least and greatest variations in the dataset: $U_{i} = X_{i + 1} - X_{i}, i = 1, 2, \dots, n - 1$; U = [Min [39], Max [39]].
Input: Air traffic (passenger, aircraft movements, and freight) data set $X_{i}$ corresponding to time $t_{i}$, i = 1, 2, …, n.
Output: Fuzzy model of the air traffic volume time series with the smallest RMSE value.
1	Divide U into m equal intervals (fuzzy sets) $u_{i}, i = 1, 2, \dots, m$, and find the midpoints of the intervals $u_{i}^{0}, i = 1, \dots, m$, for the initial values m = 5, 6, 7, …, 11.
2	Calculate the C value of each interval. At t = 0, set the initial values k = 500, ε = 1 × 10⁻⁶, $a^{(0)} = 0, b^{(0)} = 1, △ C^{(0)} = 0.5, n^{(0)} = 1$.
3	if (t = i && i ≥ 1)
4	  Compute $△ C^{(t)} = \frac{b^{(t)} - a^{(t)}}{k}$ and $C_{i}^{(t)}$
5	  if (a = 0 && b = 1)
6	    $C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 1, 2, \dots, k - 1$
7	  if (a = 0 && b ≠ 1)
8	    $C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 1, 2, \dots, k$
9	  if (a ≠ 0 && b = 1)
10	    $C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 0, 1, 2, \dots, k - 1$
11	  if (a ≠ 0 && b ≠ 1)
12	    $C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 0, 1, 2, \dots, k$
13	Run IFTS to find $C_{l}^{(t)}, 0 \leq l \leq k$
14	Find $C = C_{l}^{(m)}$ until $b^{(m)} - a^{(m)} < ε$
15	Determine the respective values of the fuzzy sets with C: $μ_{A_{i}} (u_{i}) = \frac{1}{1 + {[C \times (U_{i} - u_{i}^{0})]}^{2}}, i = 1, 2, \dots, m$
16	Choose a basis w = 12 (1 < w < n), corresponding to the intervals of prior time.
17	Calculate the fuzzy relationship matrix $R^{w} (t) = O^{w} (t) \cap K (t) = \begin{bmatrix} R_{11} & R_{12} & \dots & R_{1 j} \\ R_{21} & R_{22} & \dots & R_{2 j} \\ \dots & \dots & \dots & \dots \\ R_{i 1} & R_{i 2} & \dots & R_{i j} \end{bmatrix}$.
18	Define F(t) as the fuzzy forecast of variations at time t: $F (t) = [E (R_{11}, R_{21}, \dots, R_{i 1}) \; E (R_{12}, R_{22}, \dots, R_{i 2}) \; \dots \; E (R_{1 j}, R_{2 j}, \dots, R_{i j})]$, where $E (R_{1 k}, R_{2 k}, \dots, R_{i k}) = \frac{R_{1 k} + R_{2 k} + \dots + R_{i k}}{w - 2}, k = 1, 2, \dots, j$
19	Forecast the 7 fuzzy models (7 values of m × 1 value of w) for the time series; the forecast value at time t is calculated from the variations of the prior values (t − 1, …, t − w): $\hat{X} (t) = X (t - 1) + V (t)$, $V (t) = \frac{\sum_{i = 1}^{w} μ_{t} (u_{i}) \times u_{m}^{i}}{\sum_{i = 1}^{w} μ_{t} (u_{i})}$
20	Compare the data of each fuzzy model with the real data by calculating the RMSE, which is used as the evaluation criterion for comparison with the listed models.

Definition 1.

Let U be the whole object to be discussed, called the universe, U = {u₁, u₂, …, u_m}, where u_i represents each element in the universe. The fuzzy set A of U is defined as follows:

A = {μ_A(u₁)/ u₁, μ_A(u₂)/ u₂, …, μ_A(u_m)/ u_m}

(8)

where μ_A is the membership function, μ_A: U → [0, 1], and μ_A(u_i) ∈ [0, 1] indicates the degree of membership of u_i in A, with 1 ≤ i ≤ m.

Definition 2.

Let X(t) (t = 1, 2, …) be a time series with values in U, and let f_i(t) (i = 1, 2, …) be fuzzy subsets defined on X(t), each with membership values in [0, 1]. If F(t) is the collection of f₁(t), f₂(t), …, f_i(t), then F(t) is called the fuzzy time series of X(t).

Definition 3.

The difference between the original data and the estimated data, {{\hat{X}}_{i}}, i = 1, 2, …, n, is evaluated using the MSE, MAE, MAPE, symmetric MAPE (SMAPE), and MASE. A smaller difference indicates better predictive ability of the model.

Suppose the data set X_i corresponds to time t_i, i = 1, 2, …, n. Based on the fuzzy sets of the variations between X_{i+1} and X_i, the five steps of establishing an improved fuzzy time-series model [30] are as follows:

Step 1: calculate the data change between two consecutive periods and find the minimum value Min [39] and maximum value Max [39] of the changes, which bound the universe U.

U_{i} = X_{i + 1} - X_{i}, i = 1, 2, \dots, n - 1

(9)

Step 2: divide the universe U into m equal-length intervals u_i, i = 1, 2, …, m. Each interval u_i contains different growth rate values for different data. The midpoint of each interval (u_{i}^{0}, i = 1, 2, …, m) can then be obtained.

Step 3: determine the corresponding value of the fuzzy set A_i of F(t). The fuzzy sets A₁, A₂, …, A_m are defined as

A_{i} = \{μ_{A_{i}} (u_{i}) / u_{i}\}, u_{i} ∈ U, μ_{A_{i}} (u_{i}) ∈ [0, 1]

(10)

μ_{A_{i}} (u_{i}) = \frac{1}{1 + {[C \times (U_{i} - u_{i}^{0})]}^{2}}, i = 1, 2, \dots, m

(11)

where C is a constant, C ∈ (0, 1), U_i is the data change between two consecutive periods determined in Step 1, and u_{i}^{0} is the midpoint of each interval determined in Step 2.

Step 4: select the corresponding interval base w (1 < w < n) and calculate the fuzzy relation matrix R^w(t) in accordance with the value of w. Once w has been selected, an i × j operation matrix O^w(t) is obtained, where i is the number of rows and j is the number of columns. The operation matrix conforms to the data at t − 2, t − 3, …, t − w and is consistent with the number of change intervals. A 1 × j matrix K(t) (a row matrix corresponding to the fuzzy change in data from period t − 1) is also obtained. The relation matrix R^w(t) is represented as follows:

R^{w} (t) = O^{w} (t) \cap K (t) = \begin{bmatrix} R_{11} & R_{12} & \dots & R_{1 j} \\ R_{21} & R_{22} & \dots & R_{2 j} \\ \dots & \dots & \dots & \dots \\ R_{i 1} & R_{i 2} & \dots & R_{i j} \end{bmatrix}

(12)

The fuzzy time series F(t) is represented as

F (t) = [E (R_{11}, R_{21}, \dots, R_{i 1}) \; E (R_{12}, R_{22}, \dots, R_{i 2}) \; \dots \; E (R_{1 j}, R_{2 j}, \dots, R_{i j})]

(13)

where

E (R_{1 k}, R_{2 k}, \dots, R_{i k}) = \frac{R_{1 k} + R_{2 k} + \dots + R_{i k}}{w - 2}, k = 1, 2, \dots, j

(14)

Step 5: use the following formula to predict the data of time t:

\hat{X} (t) = X (t - 1) + V (t)

(15)

V (t) = \frac{\sum_{i = 1}^{w} μ_{t} (u_{i}) \times u_{m}^{i}}{\sum_{i = 1}^{w} μ_{t} (u_{i})}

(16)

where μ_t(u_i) is an element of F(t), X(t − 1) is the value at time t − 1, and \hat{X} (t) is the predicted value at time t. The predicted value \hat{X} (t) depends on the actual value of X(t − 1) and the value of V(t). The value of V(t) is determined in accordance with the change in data in the entire time series and its previous values. The algorithm is described as follows:

Data changes between two consecutive periods are divided into appropriate groups; larger data changes represent a greater number of clusters. A fuzzy relationship between the universe and the fuzzy set can be established. In accordance with the result of t = w based on the previous value (the results of the changes of t − 1, t − 2, …, t − w), a previous value (t = w) is selected as the predicted value. The obtained results are compared with the actual values and the error is estimated to evaluate the model’s validity. The constant C affects the result of μA_i(u_i). The criterion for evaluating the forecasting model (CEF model) is used to select the optimal C value; this is achieved through the following five steps:

Step 1: import k and ε values, where k is the number of times each iteration is divided and ε is the error of C. The smaller the value of ε, the longer the computation time required.

Step 2: when t = 0, allocate the initial value: a⁽⁰⁾ = 0, b⁽⁰⁾ = 1.

Step 3: when t = i, i ≥ 1, calculate the following values.

a^{(t)} = a^{(t - 1)} + [n^{(t - 1)} - 1] △ C^{(t - 1)},

(17)

b^{(t)} = a^{(t - 1)} + [n^{(t - 1)} + 1] △ C^{(t - 1)},

(18)

△ C^{(t)} = \frac{b^{(t)} - a^{(t)}}{k}, and C_{i}^{(t)},

(19)

where

if a = 0 and b = 1, then

C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 1, 2, \dots, k - 1,

if a = 0 and b ≠ 1, then

C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 1, 2, \dots, k,

if a ≠ 0 and b = 1, then

C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 0, 1, \dots, k - 1,

if a ≠ 0 and b ≠ 1, then

C_{i}^{(t)} = a^{(t)} + i △ C^{(t)}, i = 0, 1, \dots, k .

Step 4: calculate IFTS using C_{i}^{(t)}. Find C_{l}^{(t)} to optimize the CEF model.

Step 5: repeat Steps 3 and 4 to find C = C_{l}^{(m)} until b^{(m)} − a^{(m)} < ε.
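The interval-refinement search for C in Steps 1 to 5 can be sketched as follows. This is a minimal sketch, not the authors' implementation: `objective` is a hypothetical stand-in for running the IFTS model with a given C and returning the CEF value, the clamping of the bounds to [0, 1] is an added safeguard, and the loop assumes k > 2 so that the interval actually shrinks.

```python
def refine_c(objective, k=500, eps=1e-6):
    """Narrow [a, b] around the best candidate C until b - a < eps."""
    if k <= 2:
        raise ValueError("k must exceed 2 so the interval shrinks")
    a, b = 0.0, 1.0  # Step 2: initial bounds
    while True:
        delta = (b - a) / k  # Equation (19)
        # C must lie strictly inside (0, 1), so skip an endpoint equal to 0 or 1.
        first = 1 if a == 0.0 else 0
        last = k - 1 if b == 1.0 else k
        candidates = [a + i * delta for i in range(first, last + 1)]
        best_c = min(candidates, key=objective)  # Step 4: best candidate
        if b - a < eps:  # Step 5: stopping rule
            return best_c
        # Equations (17)-(18): keep one grid step on each side of the best C.
        a = max(0.0, best_c - delta)
        b = min(1.0, best_c + delta)

# Hypothetical objective with its minimum at C = 0.3 (not an IFTS run).
best = refine_c(lambda c: (c - 0.3) ** 2, k=10, eps=1e-6)
print(best)
```

Each pass replaces the interval of width b − a with one of width 2(b − a)/k, which is why a smaller ε or a smaller k increases the number of passes.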

In the IFTS model, dividing intervals for the universal set (DIU) algorithm includes the following three steps:

Step 1: when t = 0, let ε > 0 be a small positive number. The clustering elements of the initialization sequence are represented as Z^{(0)} = (z_{1}^{(0)}, z_{2}^{(0)}, \dots, z_{n}^{(0)}) = (x_{1}, x_{2}, \dots, x_{n}).

Step 2: update each fuzzy data point according to the following formula.

z_{i}^{(t + 1)} = \frac{\sum_{i^{'} = 1}^{n} f (z_{i}^{(t)}, z_{i^{'}}^{(t)}) \cdot z_{i^{'}}^{(t)}}{\sum_{i^{'} = 1}^{n} f (z_{i}^{(t)}, z_{i^{'}}^{(t)})}

(20)

where f (z_{i}^{(t)}, z_{i^{'}}^{(t)}) is a truncated Gaussian kernel given by

f (z_{i}^{(t)}, z_{i^{'}}^{(t)}) = \begin{cases} \exp (- \frac{d}{λ}), & \text{if } d = d (z_{i}^{(t)}, z_{i^{'}}^{(t)}) \leq d_{s}, \\ 0, & \text{if } d > d_{s}, \end{cases}

(21)

where d (z_{i}^{(t)}, z_{i^{'}}^{(t)}) is the Euclidean distance between z_{i}^{(t)} and z_{i^{'}}^{(t)}, and d_s is the average distance between all pairs of data elements, given by

d_{s} = \frac{2}{n (n - 1)} \sum_{i < i^{'}} d (x_{i}, x_{i^{'}})

(22)

where n is the number of data points, λ depends on d_s. When λ→0, the data have n intervals and when λ→∞, the data have one interval.

Step 3: repeat Step 2 until \max_{i} \{d (z_{i}^{(t)}, z_{i}^{(t + 1)})\} < ε is met. Each element in the data set then converges to a representative element z_{i}^{(t)}, i = 1, 2, …, m. When the calculation is finished, we have a sequence of m representative elements, and m is the number of intervals dividing the universe.
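The DIU update in Equations (20) to (22) resembles a mean-shift iteration and can be sketched in pure Python for one-dimensional data. The sketch makes several assumptions that the text does not specify: λ is passed in by the caller, distances are absolute differences (the one-dimensional Euclidean distance), `max_iter` caps the loop, and `count_intervals` is a hypothetical helper that counts representatives separated by more than a tolerance.

```python
import math

def diu_representatives(xs, lam, eps=1e-6, max_iter=1000):
    """Shift every point toward the kernel-weighted mean of its neighbours."""
    n = len(xs)
    # Equation (22): average distance over all pairs of original data points.
    d_s = 2.0 / (n * (n - 1)) * sum(
        abs(xs[i] - xs[j]) for i in range(n) for j in range(i + 1, n))
    z = list(xs)  # Step 1: initialise the sequence with the data
    for _ in range(max_iter):
        new_z = []
        for z_i in z:
            # Equation (21): truncated kernel, zero beyond d_s.
            weights = [math.exp(-abs(z_i - z_j) / lam)
                       if abs(z_i - z_j) <= d_s else 0.0 for z_j in z]
            # Equation (20): weighted mean; the self-weight is 1, so the sum is never 0.
            new_z.append(sum(w * z_j for w, z_j in zip(weights, z)) / sum(weights))
        shift = max(abs(old - new) for old, new in zip(z, new_z))
        z = new_z
        if shift < eps:  # Step 3: stopping rule
            break
    return z

def count_intervals(z, tol=1e-3):
    """Count groups of representatives separated by more than tol."""
    ordered = sorted(z)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > tol)

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]  # made-up variations forming two groups
reps = diu_representatives(data, lam=1.0)
print(count_intervals(reps))
```

The number of distinct representatives returned by such a procedure plays the role of m, the number of intervals dividing the universe.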

In this study, the fuzzy time series of the aforementioned model was used to fuzzify an original air traffic time series. A set of corresponding fuzzy data was output through fuzzy reasoning and the fuzzy time-series data were taken as the independent regression variable for SVR. The independent regression variable extracted the base number corresponding to a 12-period interval (due to the seasonal cycle) combined with the results from the machine-learning SVR model. Thus, an FSVR model was developed for predicting air traffic volume. In the process of fuzzification, the factors affecting the fuzzy set segmentation were as follows:

The schedule in winter and summer in accordance with the time zone of each airport;
The role of each airport in the global air transport network in terms of its unique function due to its geographical location;
The continuous holidays of countries in each region;
Demand for tourism in the low and peak seasons or the impact of significant activities, such as the Olympic Games or World Expo.

Two seasonal flight schedules were adopted. Winter flights were defined as those from November 1 to March 31 of the following year and summer flights were those from April 1 to October 31. The fuzzy set was established in accordance with the air traffic domain data based on the flight schedule factor and could be divided into the winter peak, summer peak, low peaks, and an intermediate conversion value. The basic fuzzy set segmentation was m = 5 (summer peak, winter peak, intermediate transformation, summer low peak, and winter low peak); other factors affecting traffic, such as local tourism demand in low and peak seasons, national holidays, and demand for goods or trade, were also considered. The membership function of the fuzzy features could be adjusted to obtain additional fuzzy sets. In this study, the air traffic data were divided into at least 5 and at most 11 fuzzy sets; the number of fuzzy sets was thus 5 ≤ m ≤ 11. The previous time interval was set to w = 12 because the seasonal cycle was 12 months. By using the IFTS model, seven groups of fuzzy data sets (one for each value of m from 5 to 11) were generated for w = 12. For each group of fuzzy data sets, the RMSE was calculated and the group with the smallest RMSE was selected as the input data for the autoregressive independent variables in the next step. The fuzzy extraction parameter w was the interval base corresponding to the 12 cycles and the fuzzy relationship matrix was calculated prior to these 12 cycles. Therefore, the amount of data in each group was divided by 12. The original data were the dependent variable for the SVR regression calculation; the fuzzy data were the independent variable. The data were then divided into training and testing sets to identify the optimal parameters for SVR and to establish the prediction model.
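Steps 1 to 3 of the IFTS procedure (Equations (9) to (11)) and the weighted average of Equation (16) can be illustrated with a short sketch. It rests on assumptions: the series values and C = 0.5 are made up, the function names are hypothetical, and `defuzzify` takes membership values directly (here those of the last observed change) instead of deriving them from the relation matrix of Step 4.

```python
def variations(series):
    """Equation (9): change between consecutive periods."""
    return [b - a for a, b in zip(series, series[1:])]

def interval_midpoints(values, m):
    """Step 2: split [min, max] into m equal intervals and return the midpoints."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / m
    return [lo + (i + 0.5) * width for i in range(m)]

def membership(change, midpoint, c):
    """Equation (11): membership decays with distance from the midpoint."""
    return 1.0 / (1.0 + (c * (change - midpoint)) ** 2)

def defuzzify(memberships, midpoints):
    """Equation (16): membership-weighted average of the interval midpoints."""
    return sum(mu * u for mu, u in zip(memberships, midpoints)) / sum(memberships)

series = [10.0, 12.0, 11.0, 15.0]        # made-up freight volumes
changes = variations(series)             # [2.0, -1.0, 4.0]
mids = interval_midpoints(changes, m=5)  # midpoints of five unit-width intervals
mu_last = [membership(changes[-1], u, c=0.5) for u in mids]
v = defuzzify(mu_last, mids)             # simplified stand-in for V(t)
forecast = series[-1] + v                # Equation (15): X(t - 1) + V(t)
print(changes, mids, round(forecast, 3))
```

Because the membership in Equation (11) equals 1 at the interval midpoint and falls off quadratically, a larger C concentrates the weight on the interval nearest to the observed change.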

2.4. Evaluation Criteria

In this study, the mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean square error (RMSE) were selected as the measurement indexes to evaluate the predictive performance of the models. The MAPE, RMSE, and MAE are calculated as follows:

\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{Y_{i} - \hat{Y}_{i}}{Y_{i}} \right| \times 100

(23)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{Y}_{i} - Y_{i} \right)^{2}}

(24)

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Y_{i} - \hat{Y}_{i} \right|

(25)

where Y_i represents the actual value, \hat{Y}_i represents the predicted value, and n represents the number of prediction periods. The MAPE is a relative index and is unaffected by the units or the magnitude of the actual and predicted values; it can be used to evaluate the relative size of the difference between predicted and actual values. The RMSE is the square root of the ratio of the sum of squared deviations between predicted and actual values to the number of observations. The RMSE is thus a measurement of the deviation between predicted and actual values that weights large errors more heavily. The MAE is the average of the absolute residuals between the predicted and actual values over all observations; thus, it can be used to evaluate differences between predicted and actual values in the units of the data.
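As a minimal sketch, Equations (23) to (25) can be written directly in Python. The sample values in `actual` and `predicted` are hypothetical and are not taken from the airport data.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent, Equation (23)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, Equation (24)."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, Equation (25)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly volumes, for illustration only.
actual = [5.0, 5.5, 6.0, 5.2]
predicted = [5.1, 5.4, 6.3, 5.2]

print(mape(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

Because the MAPE divides by the actual value, it is undefined when any Y_i is zero; this does not arise for the freight volumes considered here, which are all positive.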

3. Results and Discussion

The FSVR model predictions were compared with those of five models, namely the Holt–Winters, ETS, ARIMA, SARIMA, and SVR models. The training data set was extracted from August 2014 to December 2018; the test data set was extracted from January to December 2019. The training set was used to train each predictive model; the established models were then used to predict values from January to December 2019; these predicted values for 2019 were verified with the test set values. Table 1 and Table 2 provide data on the freight volume of each airport.

Following Zhang (2021) [40], we assumed an additive decomposition of the time series into three components:

y_{t} = T_{t} + S_{t} + R_{t}

(26)

where T_t is the trend item, S_t is the seasonal item, and R_t is the residual item.
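The additive form in Equation (26) can be illustrated with a classical decomposition. This is a sketch only: the source does not state which decomposition procedure was used, so the centered moving average for the trend, the even seasonal period, and the function name `additive_decompose` are all assumptions, and the series `y` is synthetic.

```python
def additive_decompose(y, period=12):
    """Classical additive decomposition; returns (trend, seasonal, residual).

    Assumes an even period and len(y) >= 2 * period. Trend and residual
    entries are None where the centered moving average is undefined.
    """
    n, half = len(y), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        # Centered moving average: half weight on the two end points.
        window = y[t - half:t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Average the detrended values for each position within the cycle.
    means = []
    for k in range(period):
        vals = [y[t] - trend[t] for t in range(k, n, period) if trend[t] is not None]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period  # center so the seasonal effects sum to zero
    seasonal = [means[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None else y[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a zero-sum 12-month pattern.
pattern = [-1, -0.5, 0, 0.5, 1, 1.5, 1, 0.5, 0, -0.5, -1, -1.5]
y = [10 + 0.1 * t + pattern[t % 12] for t in range(36)]
trend, seasonal, residual = additive_decompose(y)
```

For this noise-free synthetic series the residual is numerically zero wherever the trend is defined, which is a quick check that the three components add back to the original values.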

The trends and seasonality of the three air traffic time series (with 3-, 6-, or 12-month time horizons) were decomposed; the trend strength is defined as in Formula (27) and is between 0 and 1. The seasonal intensity is defined as in Formula (28), which is computed using detrended data and is also between 0 and 1. Seasonal intensity close to 0 indicates that the series has little seasonality [15].

F_{T} = \max\left(0, 1 - \frac{\mathrm{Var}(R_{t})}{\mathrm{Var}(T_{t} + R_{t})}\right)

(27)

F_{S} = \max\left(0, 1 - \frac{\mathrm{Var}(R_{t})}{\mathrm{Var}(S_{t} + R_{t})}\right)

(28)
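Given the components of an additive decomposition, the two strength measures in Formulas (27) and (28) reduce to a ratio of variances. The following sketch assumes the components are already available as equal-length lists; the numbers are hypothetical, and the helper name `strength` is introduced only for illustration.

```python
from statistics import pvariance


def strength(component, residual):
    """max(0, 1 - Var(R) / Var(component + R)), as in Formulas (27) and (28)."""
    combined = [c + r for c, r in zip(component, residual)]
    # The ratio is the same for population and sample variance because both
    # lists have the same length.
    return max(0.0, 1.0 - pvariance(residual) / pvariance(combined))


# Hypothetical components of a short series.
trend = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
seasonal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.0, 0.1, -0.1, 0.0]

f_t = strength(trend, residual)     # trend strength
f_s = strength(seasonal, residual)  # seasonal strength
```

A component that contributes nothing beyond the residual gives a strength of 0, whereas a component that dominates the residual gives a value close to 1.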

The seasonality and trend strength of the three air traffic time series are presented in Table 2. In terms of freight volume, the airports with seasonality exceeding 0.9 were PEK, HND, HKG, and PVG, which had values ranging from 0.92 to 0.96; DXB had the weakest seasonality, at 0.69. ATL, ORD, and CDG had seasonal intensity of 0.73–0.79, and LAX and LHR had seasonal intensity of 0.86–0.88. However, most airports had low seasonality for freight traffic. The trend strength for freight volume was lowest for DXB at 0.48; CDG also had a weak trend, with a strength of 0.53. LHR had the strongest trend, with a strength of 0.90. The other seven airports had trend strengths of 0.71–0.89, indicating weak trends. The trend in freight volume at each airport was weak.

Among the five comparison models, the SVR model resulted in the lowest MAE value for five airports, namely ATL, LAX, ORD, DXB, and CDG (Table 4). The Holt–Winters additive model yielded the smallest MAE for LHR and HKG. ETS had the lowest MAE for PEK and HND; SARIMA had the lowest MAE for only PVG. However, the MAE of the FSVR model was lower than those of the other models for all airports. The average MAE of the FSVR model (0.209) was approximately 83% smaller than that of the SVR model (1.215).

The MAPE of the ARIMA model (Table 4) was <10% for only two airports: ATL (8.601%) and PEK (9.925%). The Holt–Winters additive model had a MAPE >10% for ORD and PVG (good predictions) and <10% for all other airports (highly accurate predictions). However, the FSVR model yielded a lower MAPE (<1.6%) than the other models for all airports, indicating that its predictions were highly accurate. The overall average MAPE of the FSVR model (1.019%) was approximately 85% lower than that of the SVR model (6.625%).
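The relative improvements can be recomputed from the average rows of Table 4. The sketch below copies the averages for SARIMA, SVR, and FSVR from that table; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Average errors copied from the last three rows of Table 4.
averages = {
    "MAPE": {"SARIMA": 6.299, "SVR": 6.625, "FSVR": 1.019},
    "MAE": {"SARIMA": 1.096, "SVR": 1.215, "FSVR": 0.209},
    "RMSE": {"SARIMA": 1.313, "SVR": 1.681, "FSVR": 0.304},
}

for name, row in averages.items():
    vs_svr = relative_reduction(row["SVR"], row["FSVR"])
    vs_sarima = relative_reduction(row["SARIMA"], row["FSVR"])
    print(f"{name}: {vs_svr:.1f}% lower than SVR, {vs_sarima:.1f}% lower than SARIMA")
```

The percentage depends on which model is taken as the baseline, so the baseline should always be named alongside the figure.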

The FSVR model was thus the best model for forecasting freight volumes in international airports according to all three of the indicators. The relevant parameters of each prediction model are listed in Table 3. Figure 1 presents the actual and predicted freight volumes for 2019. The RMSEs of the five comparison models are also presented in Table 4. The ARIMA model had the highest average RMSE (2.568) and was thus the least accurate prediction model. The Holt–Winters additive model yielded the smallest RMSE for LAX, LHR, and HKG; the ETS model had the lowest RMSE for PEK and HND; and the SARIMA model had the lowest value for PVG. The SVR model returned the lowest RMSE for four airports: ATL, ORD, DXB, and CDG. However, the FSVR model had a smaller RMSE than the other five models for all airports. The mean RMSE for the FSVR model was 0.304, approximately 82% lower than that for the SVR model (1.681).

The hyperparameters of the SVR algorithm, namely the penalty coefficient C and the Gaussian RBF kernel parameter σ, should be optimized for each data set because no single setting is optimal in all situations. In most models, the width of the insensitive loss function ε is set to 0.1; however, ε is the allowable error of the “ε-tube”, which determines the approximation accuracy on the training data points. Many studies have demonstrated that adjusting ε can improve SVR accuracy [41,42,43]. In our results, ε also influenced the accuracy of both the SVR and FSVR algorithms.
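The role of ε can be illustrated with the standard ε-insensitive loss used in SVR, which ignores errors that fall inside the tube. This is a generic sketch of that loss, not code from the study; the sample values are hypothetical.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon=0.1):
    """Zero inside the epsilon-tube, linear in the excess error outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Three hypothetical predictions for an actual value of 5.0.
losses = [epsilon_insensitive_loss(5.0, p, epsilon=0.1) for p in (5.05, 5.1, 5.4)]
```

Only the third prediction lies outside the tube and contributes to the loss. A wider tube tolerates larger errors and typically yields fewer support vectors, which is one reason ε is worth tuning together with C and σ.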

Historical air traffic data were used as the autoregressive term in the SVR model. The functional relationship between the current data (the dependent variable) and the historical traffic data (the independent variable) was used to calculate the results. Global air transport is primarily passenger transport, and airlines estimate passenger travel demand for each season. Annual applications for allocations are divided into two fixed seasons, winter and summer, as determined by an international conference. Airline transportation operates in accordance with these winter and summer schedules; thus, the schedule has a 1-year (12-month) cycle. According to the seasonal analysis presented in Table 2, the seasonality of air freight volume was inconsistent across airports because of the relatively small volume of all-freight aircraft services, and the seasonal intensity varied from strong to weak. Therefore, the current traffic volume was assumed to be related to the value of the independent variable x either 1 or 12 periods behind in the original traffic data. Through SVR fitting of data from 1 or 12 periods in the past, the functional relationships of the data were calculated. The number of autoregressive lag periods that produced the optimal dependent variable y could be determined using the RMSE and MAPE values. For example, the lag period for the traffic volume at ATL was analyzed as shown in Table 5. “One period behind” indicates that the predicted variable is modeled with the first lagged observation, in the form y_t = f(y_{t−1}). For freight volume, the SVR model performed better when the independent variable was the traffic data from one period behind. Therefore, the SVR model predictions of the freight volume for each airport in this paper are based on data from one period behind.
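The lagged inputs for the two candidate orders can be built as follows. This is a minimal sketch with a hypothetical helper `lagged_pairs` and a synthetic 24-month series; it only shows how the (input, target) pairs are aligned, not how the SVR itself is fitted.

```python
def lagged_pairs(series, lag):
    """Return (inputs, targets) with inputs[i] = y[t - lag] and targets[i] = y[t]."""
    inputs = series[:-lag]   # values `lag` periods behind
    targets = series[lag:]   # values to be predicted
    return inputs, targets


# Synthetic 24 monthly values, for illustration only.
series = [float(v) for v in range(1, 25)]

x1, y1 = lagged_pairs(series, lag=1)     # one period behind
x12, y12 = lagged_pairs(series, lag=12)  # twelve periods behind
```

A longer lag leaves fewer training pairs (12 instead of 23 in this toy series), which is worth keeping in mind when comparing the errors of the two lag orders.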

For the SVM function in the R software, the three key parameters of SVR were first set to their default values: penalty coefficient C = 1, Gaussian RBF kernel parameter σ = 1, and width of the insensitive loss function ε = 0.1. The three parameters C, σ, and ε were then adjusted using the grid search method in R, with the default values as the starting values of the search. A corresponding model was trained for each parameter combination, its performance was determined, and the model with the highest performance was selected. During the training process, the built-in tune function in the R e1071 package was used to adjust the parameters and automatically cross-validate the model to ensure that it was reliable with the adjusted parameters. The following parameter settings were investigated: C = [2^0, 2^0.1, 2^0.2, …, 2^14], σ = [2^−10, 2^−9.9, 2^−9.8, …, 2^0], and ε = [2^−10, 2^−9.9, 2^−9.8, …, 2^0].
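The search grids can be sketched as follows, reading the parameter ranges as powers of two with exponents in steps of 0.1 (an assumption about superscripts lost in the text). The study used the tune function in R; this Python version, with the hypothetical helpers `power_of_two_grid` and `grid_search` and a toy scoring function, only illustrates the idea.

```python
from itertools import product


def power_of_two_grid(start_exp, stop_exp):
    """Return 2**e for e from start_exp to stop_exp inclusive, in steps of 0.1."""
    return [2 ** (k / 10) for k in range(start_exp * 10, stop_exp * 10 + 1)]


cost_grid = power_of_two_grid(0, 14)       # C
sigma_grid = power_of_two_grid(-10, 0)     # sigma
epsilon_grid = power_of_two_grid(-10, 0)   # epsilon
n_combinations = len(cost_grid) * len(sigma_grid) * len(epsilon_grid)


def grid_search(score, grids):
    """Return the parameter tuple with the lowest score; `score` is user supplied."""
    return min(product(*grids), key=score)


# Toy score on a coarse grid; a real score would be a cross-validated error.
coarse = ([1.0, 2.0, 4.0, 8.0], [0.25, 0.5, 1.0], [0.125, 0.25])
toy_score = lambda p: abs(p[0] - 4.0) + abs(p[1] - 0.5) + abs(p[2] - 0.125)
best = grid_search(toy_score, coarse)
```

Under this reading the full grid contains well over a million combinations, so in practice a coarser grid or a staged search is often used before refining around the best region.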

The fuzzification of different airports and traffic categories forms different universes of discourse. In fuzzy theory, each universe of discourse is divided into a number of fuzzy sets in accordance with the universe’s characteristics, and membership functions are calculated. The FSVR model combined the IFTS model with SVR: the IFTS model fuzzified the air traffic volume, and the fuzzified traffic volume was then used as the AR independent variable for SVR. In the fuzzification process, the universe of data for the time series of traffic volume was defined and divided into appropriate fuzzy sets with different increments of change; this process produced fuzzy data with a lower RMSE. As described near the end of the Methods section, seven candidate settings (m = 5, 6, 7, 8, 9, 10, or 11) were considered, and w was 12. Using the data from the preceding 12 periods, the series was fuzzified once for each of these seven settings. The fuzzified data with the smallest RMSE were used as the input for the AR term of the FSVR model.
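The selection of m by the smallest RMSE can be sketched as a simple loop. The fuzzifier below, `crisp_partition`, is a deliberately simple stand-in that replaces each value with the midpoint of its equal-width interval; it is not the IFTS model, and the series is hypothetical. Only the selection logic mirrors the procedure described above.

```python
import math


def crisp_partition(series, m):
    """Stand-in fuzzifier (NOT IFTS): map each value to its interval midpoint."""
    low, high = min(series), max(series)
    width = (high - low) / m
    out = []
    for v in series:
        k = min(int((v - low) / width), m - 1)  # the maximum falls in the last interval
        out.append(low + (k + 0.5) * width)
    return out


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical monthly volumes, for illustration only.
series = [5.2, 5.6, 4.9, 5.8, 6.1, 5.4, 5.0, 6.3, 5.7, 5.1, 4.8, 6.0]

# Approximate the series once for each candidate m and keep the best one.
rmse_by_m = {m: rmse(series, crisp_partition(series, m)) for m in range(5, 12)}
best_m = min(rmse_by_m, key=rmse_by_m.get)
```

With a real fuzzy time series model the error does not necessarily shrink as m grows, which is why the optimal m differs between airports.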

The top ten airports were divided into three regions in accordance with their geographic location: (1) North America; (2) the Middle East and Europe; and (3) Asia. Table 6 presents the optimal number of fuzzy sets for each airport, with these numbers obtained through experimentation. The freight volume series for airports in the North American region were divided into five to seven sets. In the Middle East and Europe, the segmentations for freight volume differed substantially between airports, at six to ten sets. In the Asian region, the segmentations also varied between airports, at six to nine sets.

The results presented in Table 6 reveal that most of the fuzzy sets of the air traffic IFTS were divided into five to seven sets; for some airports, a greater division into 8–11 sets was required to minimize the RMSE. In the IFTS model, the universe of discourse was established and fuzzy relationships were determined in accordance with changes in historical data. With the fuzzy classification algorithm, the data were appropriately divided into fuzzy sets to establish membership function values for elements of similar levels in a time series. Corresponding fuzzy membership values were then established. The fuzzy relationship matrix was calculated for the period before the relevant time series, facilitating fuzzification of the fuzzy interpolation time series with fewer errors. The fuzzified data set could achieve better predictions than the original data, as indicated by Table 7. The IFTS fuzzification was thus advantageous.

DXB expanded rapidly, but its rate of expansion declined. The decline in passenger traffic at DXB in 2019 was due to temporary runway closures and the bankruptcy of India’s Jet Airways (for which Dubai was a major destination). Moreover, Flydubai, the second-largest airline at DXB after Emirates, was affected by non-delivery of the Boeing 737 MAX. In 2019, PEK continued to be affected by intensifying economic and trade friction between China and the United States, geopolitical conflict, financial market turmoil, and other problems. With the completion and opening of Beijing Daxing Airport (PKX) on 25 September 2019, some flights were transferred from PEK. Traffic at PEK declined under the new pattern of “one city, two airports”. The freight and mail throughput of Beijing Capital Airport in 2019 reached 1,955,326 tons, 5.7% lower than the previous year.

HKG is another airport affected by Sino–US economic and trade issues. Many vital industries in Hong Kong, such as trade, finance, and tourism, rely on convenient and reliable air travel. The airport’s convenient transportation has made Hong Kong an essential gateway between the world and China. From the second half of 2019, the air traffic volume of Hong Kong International Airport was affected by intense political turmoil lasting several months and geopolitical tensions between China and the United States. Actions related to the Hong Kong protests, such as the blockage of transit facilities and widespread violence between protestors and police, affected Hong Kong’s export-oriented economy and international image. Due to the violence and rapidly changing political situation, HKG was closed, and approximately 40 countries issued a travel warning for Hong Kong. The overall freight volume of HKG decreased by 6.1% year-on-year to 4.8 million tons. International tourism in Hong Kong remained low; the decline in passenger traffic to and from mainland China and Southeast Asia also significantly affected the operation of the airport.

The passenger volume of DXB declined due to temporary shocks. PEK had reduced traffic volume due to competition from PKX. HKG was closed due to protests, violence, and political instability, resulting in a decline in its role as a traffic hub. Thus, various nonseasonal factors can lead to a decline in traffic at airports. The use of Tai’s IFTS model [30] enabled FSVR to be applied to nonseasonal time series. Historical traffic changes could be used to calculate the fuzzified traffic volume and improve fuzzy historical interpolation. In contrast to the other models, the FSVR model still accurately predicted the air traffic volume of each airport in 2019. A comparison of the actual and predicted volumes for these three airports is presented in Figure 2.

4. Conclusions

Forecast transportation volumes are the primary basis for the planning of transportation capacity increases by authorities in the transportation industry. Traffic volumes and trends must be accurately forecast as early as possible to predict bottlenecks and plan equipment updates and capacity expansion. Several years’ or decades’ worth of air freight volume data for airports were collected to form a time series. A reasonable and accurate forecast model was used to predict the future traffic volume of the airports. These predictions can be used for timely reviews of airport capacity to enable early planning of necessary construction projects.

A predictive analysis of the freight volume for the 10 international airports with the highest global passenger traffic (ATL, PEK, DXB, LAX, HND, ORD, LHR, HKG, PVG, and CDG) from August 2014 to December 2019 was performed. The input data for the FSVR model were air traffic time-series data fuzzified by the IFTS model proposed by Tai [30]. Because IFTS can be applied to nonseasonal time series and considers historical traffic changes when fuzzifying the time series, a fuzzy classification algorithm was used to appropriately classify traffic changes for elements of similar levels in the time series. A suitable fuzzy set was formed, a membership function was generated, and a fuzzy relationship was established, resulting in fuzzy interpolation with minimal errors. The fuzzy data set with the minimal RMSE was used as the input for SVR for predicting the time-series data. However, fuzzy uncertainty treats noise on different data entries as independent disturbances, which may limit FSVR applicability. The air freight volume forecasts made by the FSVR model had the lowest MAE, RMSE, and MAPE of all compared models for all airports. In particular, the MAPE of the FSVR model was <2.5% for all airports, indicating highly accurate forecasts that are superior to those of other forecasting models. Thus, the FSVR model is the best model for air traffic forecasting among the models analyzed. In addition, the hybrid FSVR model provides a significant improvement over the plain SVR model.

Author Contributions

C.-H.Y., conceptualization, project administration; J.-C.S., Y.-H.L. and P.-H.J., conceptualization, methodology, writing—review and editing; Y.-D.L., conceptualization, project administration. All authors have read and agreed to the published version of the manuscript.

Funding

The funding sources are the Ministry of Science and Technology, Taiwan (under Grant no. 108-2221-E-992-031-MY3 and 110-2222-E-346-002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data were obtained from the statistical reports of the International Airports Association and data published on the official websites of the airports’ management authorities (https://www.iata.org/en/services/statistics/, accessed on 1 June 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ritzer, G.; Dean, P. Globalization: The Essentials; John Wiley & Sons: Hoboken, NJ, USA, 2019.
2. Duarte, R.; Pinilla, V.; Serrano, A. Factors driving embodied carbon in international trade: A multiregional input–output gravity model. Econ. Syst. Res. 2018, 30, 545–566.
3. Hannan, S.A. Revisiting the Determinants of Capital Flows to Emerging Markets—A Survey of the Evolving Literature; International Monetary Fund: Washington, DC, USA, 2018.
4. Fatehi, K.; Choi, J. International Business Management; Springer: Berlin/Heidelberg, Germany, 2019.
5. IATA. 20 Year Passenger Forecast; International Air Transport Association (IATA): Geneva, Switzerland, 2018.
6. Dube, K.; Nhamo, G. Major Global aircraft manufacturers and emerging responses to the SDGs Agenda. In Scaling Up SDGs Implementation; Springer: Berlin/Heidelberg, Germany, 2020; pp. 99–113.
7. Gupta, D.; Garg, A. Sustainable development and carbon neutrality: Integrated assessment of transport transitions in India. Transp. Res. Part D Transp. Environ. 2020, 85, 102474.
8. Alam, M.; Parveen, R. COVID-19 and Tourism. Int. J. Adv. Res. 2021, 9, 788–804.
9. Abate, M.; Christidis, P.; Purwanto, A.J. Government support to airlines in the aftermath of the COVID-19 pandemic. J. Air Transp. Manag. 2020, 89, 101931.
10. Bowen, J. Low-Cost Carriers in Emerging Countries; Elsevier: Amsterdam, The Netherlands, 2019.
11. Dorian, J.P.; Franssen, H.T.; Simbeck, D.R. Global challenges in energy. Energy Policy 2006, 34, 1984–1991.
12. Belobaba, P.; Odoni, A.; Barnhart, C. The Global Airline Industry; John Wiley & Sons: Hoboken, NJ, USA, 2015.
13. Wensveen, J.G. Air Transportation: A Management Perspective; Routledge: London, UK, 2018.
14. Wong, K.K.; Song, H.; Witt, S.F.; Wu, D.C. Tourism forecasting: To combine or not to combine? Tour. Manag. 2007, 28, 1068–1078.
15. Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice; OTexts: Melbourne, Australia, 2018.
16. Hyndman, R.J.; Khandakar, Y. Automatic time series forecasting: The forecast package for R. J. Stat. Softw. 2008, 27, 1–22.
17. Banihabib, M.E.; Valipoor, M.; Behbahani, S.M. Comparison of autoregressive static and artificial dynamic neural network for the forecasting of monthly inflow of dez reservoir. J. Environ. Sci. Technol. 2011, 13, 1–14.
18. Saayman, A.; Saayman, M. Forecasting tourist arrivals in South Africa. Acta Commer. 2010, 10, 281–293.
19. Hassani, H. Singular spectrum analysis: Methodology and comparison. J. Data Sci. 2007, 5, 239–257.
20. De Livera, A.M.; Hyndman, R.J.; Snyder, R.D. Forecasting time series with complex seasonal patterns using exponential smoothing. J. Am. Stat. Assoc. 2011, 106, 1513–1527.
21. Suryani, E.; Chou, S.-Y.; Chen, C.-H. Dynamic simulation model of air cargo demand forecast and terminal capacity planning. Simul. Model. Pract. Theory 2012, 28, 27–41.
22. Alexander, D.; Merkert, R. Applications of gravity models to evaluate and forecast US international air freight markets post-GFC. Transp. Policy 2021, 104, 52–62.
23. Hassani, H.; Silva, E.S.; Antonakakis, N.; Filis, G.; Gupta, R. Forecasting accuracy evaluation of tourist arrivals. Ann. Tour. Res. 2017, 63, 112–127.
24. Cao, J.; Guan, X.; Zhang, N.; Wang, X.; Wu, H. A hybrid deep learning-based traffic forecasting approach integrating adjacency filtering and frequency decomposition. IEEE Access 2020, 8, 81735–81746.
25. Cao, X.; Ma, C.; Jia, Y. ARIMA and SVM Combination Forecast for Holiday Subway Passenger Traffic. In Proceedings of the 2018 World Transport Convention, Beijing, China, 18–21 June 2018.
26. Bildirici, M.; Ersin, Ö. Asymmetric power and fractionally integrated support vector and neural network GARCH models with an application to forecasting financial returns in ise100 stock index. Econ. Comput. Econ. Cybern. Stud. Res. 2014, 48, 1–22.
27. Sharifian, S.; Barati, M. An ensemble multiscale wavelet-GARCH hybrid SVR algorithm for mobile cloud computing workload prediction. Int. J. Mach. Learn. Cybern. 2019, 10, 3285–3300.
28. Gao, R.; Duru, O. Parsimonious fuzzy time series modelling. Expert Syst. Appl. 2020, 156, 113447.
29. Bose, M.; Mali, K. Designing fuzzy time series forecasting models: A survey. Int. J. Approx. Reason. 2019, 111, 78–99.
30. Vovan, T. An improved fuzzy time series forecasting model using variations of data. Fuzzy Optim. Decis. Mak. 2019, 18, 151–173.
31. Airports Council International. Preliminary World Airport Traffic Rankings Released. ACI World. 2019. Available online: https://aci.aero/2019/03/13/preliminary-world-airport-traffic-rankings-released/ (accessed on 5 July 2022).
32. Unal, A.; Hu, Y.; Chang, M.E.; Odman, M.T.; Russell, A.G. Airport related emissions and impacts on air quality: Application to the Atlanta International Airport. Atmos. Environ. 2005, 39, 5787–5798.
33. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Processing Syst. 1996, 9, 155–161.
34. Müller, K.-R.; Smola, A.J.; Rätsch, G.; Schölkopf, B.; Kohlmorgen, J.; Vapnik, V. Predicting time series with support vector machines. In Proceedings of the International Conference on Artificial Neural Networks, Lausanne, Switzerland, 8–10 October 1997; pp. 999–1004.
35. Rüping, S. SVM Kernels for Time Series Analysis; Technical Report: Komplexitätsreduktion in Multivariaten Datenstrukturen; Sonderforschungsbereich 475; Universität Dortmund: Dortmund, Germany, 2001.
36. Rohmah, M.; Putra, I.; Hartati, R.; Ardiantoro, L. Comparison Four Kernels of SVR to Predict Consumer Price Index. J. Phys. Conf. Ser. 2021, 1737, 012018.
37. Yang, C.-H.; Moi, S.-H.; Hou, M.-F.; Chuang, L.-Y.; Lin, Y.-D. Applications of deep learning and fuzzy systems to detect cancer mortality in next-generation genomic data. IEEE Trans. Fuzzy Syst. 2020, 29, 3833–3844.
38. Yang, C.-H.; Chuang, L.-Y.; Lin, Y.-D. Epistasis analysis using an improved fuzzy C-means-based entropy approach. IEEE Trans. Fuzzy Syst. 2019, 28, 718–730.
39. Zeng, G.; Yu, W.; Wang, R.; Lin, A. Research on Mosaic Image Data Enhancement for Overlapping Ship Targets. arXiv 2021, arXiv:2105.05090.
40. Zhang, Y.; Li, G.; Muskat, B.; Law, R. Tourism demand forecasting: A decomposed deep learning approach. J. Travel Res. 2021, 60, 981–997.
41. Parashar, N.; Khan, J.; Aslfattahi, N.; Saidur, R.; Yahya, S.M. Prediction of the Dynamic Viscosity of MXene/palm Oil Nanofluid Using Support Vector Regression. In Recent Trends in Thermal Engineering; Springer: Berlin/Heidelberg, Germany, 2022; pp. 49–55.
42. Yang, Y.; Che, J.; Deng, C.; Li, L. Sequential grid approach based support vector regression for short-term electric load forecasting. Appl. Energy 2019, 238, 1010–1021.
43. Shamsah, S.M.I.; Owolabi, T.O. Modeling the maximum magnetic entropy change of doped manganite using a grid search-based extreme learning machine and hybrid gravitational search-based support vector regression. Crystals 2020, 10, 310.

Figure 1. The forecast results of the freight volume of each airport from January to December 2019: (A) ATL Atlanta; (B) LAX Los Angeles; (C) ORD O’Hare; (D) DXB Dubai; (E) LHR London Heathrow; (F) CDG Paris Charles de Gaulle; (G) PEK Beijing Capital; (H) PVG Shanghai Pudong; (I) HND Tokyo Haneda; (J) HKG Hong Kong.

Figure 2. Comparison of air traffic trends in the past five years and FSVR forecast in 2019.

Table 1. Statistics on the freight volume of the top ten airports.

Airport	Min	Max	Mean	Q1	Q3	IQR	SD	CV
Atlanta	4.76	6.39	5.46	5.16	5.71	0.56	0.38	6.92
Beijing	10.371	18.87	16.49	16.10	17.52	1.43	1.74	10.54
Dubai	18.62	24.10	21.45	20.32	22.21	1.89	1.39	6.47
Los Angeles	15.16	21.66	19.01	18.06	20.42	2.36	1.64	8.63
Haneda, Tokyo	8.43	13.19	10.64	9.89	11.46	1.57	1.13	10.66
O’Hare	11.49	18.33	15.20	14.16	16.46	2.31	1.57	10.29
Heathrow, London	12.33	16.31	14.09	13.32	14.72	1.40	0.94	6.70
Hongkong	26.40	47.50	39.87	37.10	43.10	6.00	4.35	10.91
Pudong, Shanghai	19.43	36.39	29.82	27.87	32.06	4.19	3.36	11.28
Charles de Gaulle, Paris	14.01	20.10	16.24	15.32	16.91	1.59	1.19	7.30
Total	141.00	222.82	188.27	177.29	200.57	23.29	17.69	89.71

Unit—10,000 tons; Min—minimum value; Max—maximum value; Mean—average value; Q1—first quartile; Q3—third quartile; IQR—Interquartile range; SD—standard deviation; CV—coefficient of variation.
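Two of the derived columns in Table 1 follow directly from the others: the IQR is Q3 minus Q1, and the CV is the SD expressed as a percentage of the mean. The sketch below recomputes both for Atlanta from the rounded table entries; the helper names are hypothetical.

```python
def iqr(q1, q3):
    """Interquartile range."""
    return q3 - q1


def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * sd / mean


# Rounded Atlanta values from Table 1 (unit: 10,000 tons).
atl_iqr = iqr(5.16, 5.71)
atl_cv = coefficient_of_variation(0.38, 5.46)
```

Recomputing from the rounded entries gives roughly 0.55 and 6.96, close to the tabulated 0.56 and 6.92; the small differences are presumably because the table was computed from unrounded data.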

Table 2. Seasonality and trend strength of freight volume at the top ten airports.

Airport	Seasonal (Freight Volume)	Trend (Freight Volume)
Atlanta (ATL)	0.74	0.80
Beijing (PEK)	0.96	0.77
Dubai (DXB)	0.69	0.48
Los Angeles (LAX)	0.86	0.87
Haneda (HND)	0.92	0.87
O’Hare (ORD)	0.73	0.71
Heathrow, London (LHR)	0.88	0.90
Hongkong (HKG)	0.94	0.87
Pudong (PVG)	0.93	0.89
Charles de Gaulle (CDG)	0.79	0.53

Table 3. Parameters related to each forecasting model of freight volume.

Airport	Holt–Winters (ADD) (α, β, γ)	ETS (α, β, γ)	ARIMA (p, d, q)	SARIMA (p, d, q)(P, D, Q)S	SVR (ε, C, σ)	FSVR (ε, C, σ)
ATL	(0.404, 0.000, 0.194)	(0.462, N, 0.001)	(1, 0, 1) with non-zero mean	(2, 0, 2) (1, 0, 0) with non-zero mean	(0.536, 1.000, 0.125)	(0.010, 91.000, 0.125)
LAX	(0.731, 0.000, 1.000)	(0.772, 0.001, 0.001)	(4, 1, 1)	(2, 0, 0) (1, 0, 0) with non-zero mean	(0.59, 0.020, 0.04)	(0.010, 0.500, 0.100)
ORD	(0.438, 0.026, 1.000)	(0.589, N, 0.001)	(0, 1, 2)	(1, 0, 0) (1, 0, 0) with non-zero mean	(0.616, 0.010, 0.758)	(0.044, 0.794, 0.054)
DXB	(0.797, 0.000, 1.000)	(0.718, N, 0.002)	(0, 1, 2) with drift	(0, 1, 0) (0, 1, 0)	(0.435, 1.000, 0.189)	(0.088, 90.510, 0.088)
LHR	(0.046, 0.661, 0.432)	(0.001, 0.001, 0.003) φ = 0.973	(0, 1, 0)	(0, 1, 1)(0, 1, 1)	(0.354, 0.316, 0.074)	(0.010, 5.000, 0.016)
CDG	(0.273, 0.000, 0.560)	(0.365, N, 0.001)	(0, 1, 0)	(0, 1, 1) (0, 1, 1)	(0.500, 0.562, 0.063)	(0.100, 256.000, 0.100)
PEK	(0.327, 0.006, 0.612)	(0.424, N, 0.001)	(0, 1, 2)	(1, 0, 1) (1, 0, 0) with non-zero mean	(0.650, 8.000, 0.101)	(0.100, 64.000, 1.000)
PVG	(0.333, 0.000, 0.479)	(0.373, N, 0.001)	(0, 1, 1)	(0, 0, 2) (1, 0, 1) with non-zero mean	(0.287, 0.251, 0.574)	(0.032, 512.000, 0.001)
HND	(0.208, 0.569, 0.724)	(0.188, 0.133, 0.001)	(0, 1, 1)	(0, 1, 2) (0, 1, 1)	(0.100, 1.000, 0.300)	(0.100, 128.000, 0.001)
HKG	(0.204, 0.317, 0.749)	(0.515, N, 0.001)	(0, 1, 0)	(1, 1, 0) (0, 1, 1)	(0.650, 1.00, 0.790)	(0.100, 512.000, 0.001)

Table 4. MAPE, MAE, and RMSE values of the freight volume forecasting models.

Airport	Criteria	Holt–Winters (ADD)	ETS	ARIMA	SARIMA	SVR	FSVR
ATL	MAPE(%)	6.120	6.036	8.601	6.915	5.381	0.300
	MAE	0.316	0.313	0.449	0.359	0.286	0.016
	RMSE	0.412	0.385	0.485	0.419	0.349	0.020
LAX	MAPE(%)	5.304	7.961	10.782	5.418	4.937	1.431
	MAE	1.002	1.521	1.988	1.024	0.914	0.268
	RMSE	1.115	1.577	2.304	1.139	1.223	0.321
ORD	MAPE(%)	14.430	9.957	13.146	9.957	6.835	1.590
	MAE	2.125	1.441	1.830	1.406	0.933	0.232
	RMSE	2.176	1.536	2.204	1.628	1.334	0.297
DXB	MAPE(%)	9.781	7.356	11.845	7.376	6.708	1.071
	MAE	1.997	1.490	2.404	1.512	1.410	0.230
	RMSE	2.425	1.931	2.735	1.835	1.574	0.266
LHR	MAPE(%)	2.154	3.287	11.952	3.703	3.793	0.816
	MAE	0.303	0.465	1.641	0.520	0.546	0.115
	RMSE	0.374	0.535	1.740	0.610	0.778	0.154
CDG	MAPE(%)	6.454	7.916	11.585	5.697	5.753	0.176
	MAE	1.019	1.243	1.779	0.900	0.887	0.026
	RMSE	1.211	1.424	1.979	1.053	1.050	0.047
PEK	MAPE(%)	5.566	4.688	9.925	6.649	7.125	1.197
	MAE	0.840	0.705	1.311	0.952	0.854	0.190
	RMSE	0.998	0.865	2.195	1.229	2.040	0.373
PVG	MAPE(%)	10.544	9.353	10.497	4.180	9.011	1.243
	MAE	3.385	2.927	2.704	1.263	2.377	0.384
	RMSE	4.190	3.713	4.114	1.827	3.264	0.598
HND	MAPE(%)	8.492	6.867	12.366	6.994	7.930	1.114
	MAE	0.900	0.720	1.254	0.729	0.843	0.117
	RMSE	0.948	0.780	1.475	0.811	0.977	0.170
HKG	MAPE(%)	3.688	5.615	13.717	6.100	8.777	1.254
	MAE	1.347	2.061	4.808	2.292	3.099	0.510
	RMSE	1.648	2.431	6.454	2.578	4.220	0.797
Ave. of MAPE(%)		7.253	6.904	11.442	6.299	6.625	1.019
Ave. of MAE		1.323	1.288	2.017	1.096	1.215	0.209
Ave. of RMSE		1.550	1.518	2.568	1.313	1.681	0.304

Bold means the lowest value. Underline means the value >10.

Table 5. Analysis of SVR autoregressive lag periods for air traffic at Atlanta International Airport.

Lag Order (Freight Volume)	RMSE	MAPE	No. of Support Vectors
12 periods behind	0.586	10.555	51
1 period behind	0.368	5.523	46

Bold means the lowest RMSE and MAPE values.

Table 6. The optimal number of fuzzy sets for the time series of fuzzy air traffic.

Area	North America			Middle East and Europe			Asia
Airport	ATL	LAX	ORD	DXB	LHR	CDG	PEK	PVG	HND	HKG
Freight volume	5	6	7	10	7	6	6	7	9	6

Bold indicates that the fuzzy segmentation has more fuzzy sets.

Table 7. Fuzzy air freight traffic time series errors.

Freight Volume	Lag1-RMSE	Fuzzy-RMSE
ATL	0.397	0.048
LAX	1.462	0.586
ORD	1.438	0.452
LHR	0.855	0.250
CDG	1.420	0.055
DXB	1.416	0.439
PEK	2.033	0.574
PVG	3.237	0.943
HKG	4.252	1.328
HND	1.122	0.300

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
