1. Introduction
The modern power system infrastructure faces major challenges arising from the requirement that its operation be green, sustainable, safe, secure, and resilient. In this regard, one of the most promising designs is the development and implementation of hybrid power systems at both global and local levels. To cover the energy needs of regions geographically remote from large cities, a large-scale introduction of affordable alternative energy sources is required, and renewable energy sources are well suited to ensuring their energy supply. As a result, operation of the power system in an isolated mode, which was considered the most unfavourable scenario a few decades ago, today has great potential for local distributed generation based on a variety of renewable energy sources. However, the output of renewable sources such as solar and wind fluctuates with weather and climate conditions, which limits the stability of renewable energy systems. For example, wind speed is highly variable depending on climatic region, physical geography (the proportion of land and sea, the size of the land mass, etc.), topography (obstacles, mountains, valleys), and time (the amount of wind varies continuously, seasonally, and annually).
Even though long-term wind fluctuations are difficult to understand, which creates obstacles to accurately predicting the feasibility of a wind farm project, over shorter periods these fluctuations are quite predictable, despite remaining differences depending on location, time of day, etc. [1]. Because system operators are interested in wind predictions over short-term horizons (usually from seconds to an hour), it is possible to manage deviations from the planned wind energy and to prepare an action plan for unexpected fluctuations in wind power. As a result, power reserves and monitoring functions can be provided for balancing purposes and to guarantee network security [2]. Predictability is therefore very important for wind farms in terms of design optimization, reliable operation, and effective management and control, so that power supply from other generating units can be organized with the aim of minimizing cost.
Prediction and its methods became relevant in the 1910s and 1920s. It might be assumed that over such a long time the problem, if not fully solved, has at least reached its minimal error. However, the requirements for forecast quality and the expectations placed on forecast results have increased over time. The key indicators of prediction quality are its reliability and its speed. A large body of work is dedicated to prediction and to seeking the best solutions for a range of target application areas that grows annually. Many innovative methods and models have been developed for a variety of prediction tasks.
Examples of the combined use of solar, hydro, wind, and other energy technologies, depending on their feasibility, together with the criteria and requirements for their optimal design, are widely presented in the literature. Developments focus on accelerated cost reduction and efficiency improvements for these systems, with special attention to the description of the proposed optimization model, for example [3,4,5,6,7,8]. The literature review in [9] is of particular interest: its authors compare a variety of optimization methods for sizing the hybrid systems needed for the efficient and economical use of renewable energy sources (RES).
There are several main groups of methods used to predict wind speed [10]: physical models, statistical models, and intelligent (artificial intelligence) models.
The first group of models uses a large amount of meteorological data and complex models of atmospheric motion [11]. For instance, a model based on the Kalman filter is presented in [12], in which the change in meteorological parameters is described using a Gaussian process. Such an approach can provide the required forecast accuracy but imposes high requirements on the accuracy of input meteorological data, requires an archive of observations over a long period, and is characterized by high computational complexity [13]. On the other hand, the statistical approach is much simpler and, in general, can be applied if only retrospective data on wind speeds are available. As in other areas where time series forecasting is needed, various autoregressive methods are widely used, most often modifications of ARIMA (Autoregressive Integrated Moving Average) and methods using exponential smoothing. One of the main advantages of such methods is their low complexity (resulting in ease of customization, low risk of error in application, and low computational resource requirements) relative to physical models and machine learning-based methods. However, their accuracy is not always satisfactory. To improve accuracy, authors often use hybrid models that combine autoregression with some filtering method. This approach has been applied in [14,15,16,17,18,19]. The last group of methods, intelligent models, is very diverse. It can be argued that all branches of artificial intelligence have been applied to the problem of predicting wind speed; most often, artificial neural networks (ANN) [9,20,21,22,23,24,25,26], fuzzy logic [22,26,27], and support vector machines (SVM) [22,28] are used. In addition, [9] provides a detailed analysis of optimum sizing approaches for hybrid renewable energy systems based on genetic algorithms, particle swarm optimization, and simulated annealing.
For the efficient use of wind energy, wind speed is considered one of the most important parameters for predicting wind turbine power, including the selection of a site and of the optimal wind turbine size for that site. Wind speed can be predicted using traditional methods [29], and more recently artificial intelligence methods have been increasingly used [30]. Hybrid models are becoming more widely deployed because their design advancements and operating benefits improve on the performance of stand-alone models. For instance, the authors of [31] proposed a hybrid neural network (NN) model for short-term wind speed forecasting based on a time-series algorithm with a multi-learner ensemble and adaptive error correction. A hybrid model that includes three types of linear time series models (autoregressive, moving average, and autoregressive moving average) has been used for both short- and long-timescale prediction of wind speeds [32]. The authors of [33] proposed an adaptive neural fuzzy inference system algorithm for wind speed prediction at 30 s and 60 s based on historical data on wind speed and direction. Another study developed a hybrid prediction method integrating multilayer perceptron, random forest, K-nearest neighbours, and decision tree regressors on a five-minute timescale, targeting a low-cost solution [34]. Many other works propose robust techniques and investigate their prediction stability and accuracy, using input features such as wind speed, wind direction, temperature, air pressure, relative humidity, and local time, and producing a forecast of wind speed, wind power, or turbine power as the output [35,36,37,38,39,40,41,42,43].
This paper proposes a method for predicting wind speed based on a recurrent neural network (NN) with feedback in the form of a backpropagation coefficient. At the same time, forecasting is carried out for four seasons of the year based on hourly retrospective wind speed data. Two mathematical predicting models are considered for a long-term continuous sample and individual sample hours in a daily interval. The predicted values of renewable and alternative energy sources serve as the basis for optimizing the electricity consumption of individual generating consumers to minimize their financial and technical costs. Along with this, the possibility of exporting electricity to a neighboring country has been considered to obtain additional income, in particular, for the GBAO isolated power system during periods of excess. Such a problem statement is a systematic vision of energy efficiency in the conditions of autonomous power supply.
The organization of the paper is as follows:
Section 2 provides information about the object of the study and presents an algorithm for managing the learning process.
Section 3 contains the results of the review and discussion. Finally, the conclusions are given in Section 4.
2. Materials and Methods
2.1. The Study Object—Gorno-Badakhshan Autonomous Oblast
The Gorno-Badakhshan Autonomous Oblast (GBAO) is a region in the eastern part of the Republic of Tajikistan. The isolated power system of the GBAO is used to validate the proposed concept. This hybrid energy system includes generating units based on renewable energy sources: hydropower plants (HPP) on small mountain rivers (11 units in total) and one solar power plant (SPP) operating in conjunction with an energy storage device. The key feature of the electrical energy balance under these conditions is the unpredictability of energy generation by these sources [44,45,46,47].
The minimum tariffs for wind energy worldwide are at least 4 cents/kWh. In the GBAO, the electricity tariff is 0.24 somoni/kWh, which is equal to 2.5 cents/kWh. Therefore, wind energy currently cannot compete economically with the existing hydropower and solar generation in the region. However, during certain periods there is a shortage of electricity, which prompts the search for alternative energy sources, of which wind energy is one suitable candidate. For instance, from mid-autumn to mid-spring the water level in the rivers drops, which creates an opportunity to use wind energy, while in summer it would be advisable to accumulate excess energy for export or storage as needed [47,48,49].
In the GBAO, the network of meteorological observations is undeveloped and insufficiently dense. Therefore, the real potential of wind energy in this region remains practically unexplored. Based on the available data, the potential for wind generation is uneven across the GBAO. For example, the average annual wind speed is 6.0 m/s on the Fedchenko glacier, 3.0–3.7 m/s in typical places such as Lake Karakul, and 2.0–2.7 m/s in rural areas such as Rushan, Khorog, Murghab, and Ishkashim. In most other regions of the country, the average wind speed is low and varies from 0.9 to 4.8 m/s [50,51,52]. The indicators of the wind energy potential of the GBAO are presented in Figure 1.
2.2. The Forecasting Model and Learning Process Control Algorithm
Dispatching control in the power system is conducted separately for every hour, which requires the electricity balance to be monitored for every hour of the day. Therefore, the prediction block is an important component of the safe and reliable operation of a modern power system.
In this study, the prediction is performed by a NN in the form of a perceptron with one hidden layer (a multilayer perceptron). The simplest neural network architecture is deliberately selected to show that even for very compact neural networks, the choice of training method and activation function is very important. In addition, as the experimental results show, this choice is non-trivial, since it depends on the source data and the architecture of the neural network. For comparison, some of the most frequently used and at the same time significantly different training methods and activation functions are selected. The sigmoid is selected as a classical activation function, which together with the hyperbolic tangent dominated among activation functions for a long time, and ReLU as an example of a newer and already very frequently used function. A similar logic is used when choosing the pair of training methods: the classic stochastic gradient descent (SGD), which underlies many other methods, and the newer Adam.
At the input, the model receives retrospective wind speed data; at the output, it gives a forecast of one wind speed value for one hour or 24 h ahead, depending on the model-building option:
- (1)
Inputs are the previous hours that coincide with the forecast hour during the month (30 values);
- (2)
Inputs are all previous hours considered during the week (168 values).
The first option can be explained as follows: to forecast the wind speed at 10:00 a.m. on 10 June, the wind speeds at 10:00 a.m. from 11 May to 9 June (30 values) are used as input data. As a result, the model can be applied for forecasting both one hour ahead and 24 h ahead.
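The two input options can be illustrated with a short sketch. It assumes the hourly wind speeds are stored in a one-dimensional array without gaps; the function names, the synthetic series, and the index convention are hypothetical and are not taken from the study.

```python
import numpy as np

def same_hour_window(speeds, target_idx, days=30):
    """Option 1: speeds at the forecast hour on each of the `days` preceding days."""
    idx = target_idx - 24 * np.arange(days, 0, -1)  # oldest day first
    if idx[0] < 0:
        raise ValueError("not enough history for the requested window")
    return speeds[idx]

def previous_hours_window(speeds, target_idx, hours=168):
    """Option 2: all consecutive hours of the preceding week."""
    if target_idx < hours:
        raise ValueError("not enough history for the requested window")
    return speeds[target_idx - hours:target_idx]

# Synthetic hourly series (assumption): each value equals its own hour index,
# which makes the selected positions easy to inspect.
speeds = np.arange(24 * 40, dtype=float)
t = 24 * 35 + 10                       # 10:00 on day 35 of the series
x1 = same_hour_window(speeds, t)       # 30 values, all taken at 10:00
x2 = previous_hours_window(speeds, t)  # 168 values, hours t-168 .. t-1
```

In this sketch the most recent input of option 1 lies 24 h before the target hour, which is why that option can also serve a day-ahead forecast.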
The NN model used can be described as follows:
Min-max normalizer layer (it scales wind speeds to values from 0 to 1).
Input layer: 30 or 168 wind speed values.
Hidden layer with an adjustable number of neurons.
Output neuron.
Inverse min-max normalizer (it scales the output of the last neuron back to wind speed).
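The layer sequence above can be sketched as a forward pass in NumPy. This is a minimal illustration under stated assumptions: the scaler bounds (0 and 20 m/s), the random weights, the ReLU hidden activation, and the linear output neuron are chosen here for demonstration only and are not the trained model of the study.

```python
import numpy as np

class MinMaxScaler1D:
    """Scales wind speeds to [0, 1] and back, using fixed bounds (assumed here)."""
    def __init__(self, lo, hi):
        self.lo, self.hi = float(lo), float(hi)

    def transform(self, v):
        return (v - self.lo) / (self.hi - self.lo)

    def inverse(self, s):
        return s * (self.hi - self.lo) + self.lo

def relu(z):
    return np.maximum(z, 0.0)

def forward(x, params, scaler):
    """Normalizer -> hidden layer (ReLU) -> linear output neuron -> inverse normalizer."""
    W1, b1, W2, b2 = params
    h = relu(scaler.transform(x) @ W1 + b1)
    y_scaled = h @ W2 + b2
    return scaler.inverse(y_scaled)

rng = np.random.default_rng(0)
n_in, n_hidden = 30, 12                      # 30 inputs (option 1), 12 hidden neurons
params = (rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
          rng.normal(0, 0.1, (n_hidden, 1)), np.zeros(1))
scaler = MinMaxScaler1D(lo=0.0, hi=20.0)     # bounds in m/s are an assumption
x = rng.uniform(0, 10, size=(5, n_in))       # five synthetic input windows
y = forward(x, params, scaler)               # one forecast value per window
```

Switching to the second option would only change `n_in` to 168; the rest of the sketch stays the same.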
When the NN is trained on the training dataset, it is important that training does not turn into overfitting; otherwise, the model starts to memorize the training data instead of revealing the true dependencies between the input and output variables. The proposed learning algorithm analyses the graphs of error reduction during training on the training and validation parts of the dataset. This allows the learning process to be stopped when it stagnates or when there is a noticeable discrepancy between the error reduction on the training and validation samples. In turn, so that stopping at the moment that is optimal for the validation dataset does not itself contribute to overfitting, the decision to end training is made only once every 200 training epochs.
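One way such a checkpoint rule could be written in code is sketched below. The function name `should_stop`, the thresholds `min_gain` and `max_gap_growth`, and the use of relative differences are assumptions made for illustration; the text describes the decision in terms of the error graphs and gives no numeric criteria.

```python
def should_stop(train_err, val_err, check_every=200,
                min_gain=0.01, max_gap_growth=0.05):
    """Return True if training should end at the current epoch.

    train_err, val_err: lists with one error value per completed epoch.
    A decision is taken only at multiples of `check_every` epochs, by
    comparing the current errors with those at the previous checkpoint.
    """
    n = len(val_err)
    if n <= check_every or n % check_every != 0:
        return False  # not a checkpoint, or no earlier checkpoint to compare with
    t_prev, t_now = train_err[n - check_every - 1], train_err[-1]
    v_prev, v_now = val_err[n - check_every - 1], val_err[-1]
    # Stagnation: validation error improved by less than min_gain (relative).
    stagnated = (v_prev - v_now) < min_gain * v_prev
    # Divergence: the gap between validation and training error has widened.
    diverging = ((v_now - t_now) - (v_prev - t_prev)) > max_gap_growth * v_prev
    return stagnated or diverging

# Synthetic error curves (assumption) illustrating the three situations.
improving = [1.0 - 0.001 * i for i in range(400)]
flat = [0.5] * 400
train_fast = [1.0 - 0.002 * i for i in range(400)]
val_slow = [1.0 - 0.0005 * i for i in range(400)]
```

With these synthetic curves, the rule continues training while both errors fall together and stops when the validation error is flat or when the training error falls much faster than the validation error.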
Considering both options, the following hyperparameters of the model were determined experimentally:
- (1)
The number of hidden layer neurons that varies from 3 to 21 with a step of 3;
- (2)
The activation functions of the hidden layer such as ReLU and sigmoidal;
- (3)
The learning method such as SGD and Adam;
- (4)
The learning rate such as 10⁻⁴, 10⁻³, and 10⁻².
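The search space defined by the four hyperparameters above can be enumerated as a grid. The sketch below only lists the combinations; the dictionary keys are hypothetical names, and whether every combination was actually trained in every season is not stated in the text.

```python
from itertools import product

neurons = list(range(3, 22, 3))        # 3, 6, ..., 21 hidden neurons
activations = ["relu", "sigmoid"]
optimizers = ["sgd", "adam"]
learning_rates = [1e-4, 1e-3, 1e-2]

# One dictionary per candidate configuration (key names are illustrative).
grid = [
    {"n_hidden": n, "activation": a, "optimizer": o, "lr": lr}
    for n, a, o, lr in product(neurons, activations, optimizers, learning_rates)
]
```

The grid contains 7 × 2 × 2 × 3 = 84 configurations per input option and season.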
2.3. Neural Network Learning Algorithms
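In this work, we used common learning algorithms based on stochastic gradient descent: classical stochastic gradient descent (SGD) and its adaptive variant (Adam).
If classical gradient descent can be described by the expression [53]:
$$W = W - \alpha \, dW,$$
then the Adam method will be represented as follows [53]:
$$V_{dW} = \beta_1 V_{dW} + (1 - \beta_1)\, dW, \qquad S_{dW} = \beta_2 S_{dW} + (1 - \beta_2)\, dW^{2},$$
$$V_{dW}^{corr} = \frac{V_{dW}}{1 - \beta_1^{t}}, \qquad S_{dW}^{corr} = \frac{S_{dW}}{1 - \beta_2^{t}}, \qquad W = W - \alpha \, \frac{V_{dW}^{corr}}{\sqrt{S_{dW}^{corr}} + \varepsilon},$$
where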
W is the weight matrix;
dW is the matrix of gradients specifying the direction of error increase, ∂E/∂W;
V_dW is a matrix characterizing the inertial properties of the parameters of the ANN, in fact the matrix of the rate of change of the parameters;
β1 is a parameter that sets the balance between the previous direction of the gradient and the direction of the gradient obtained at the next training epoch and the next batch; its value is usually close to 1 (~0.9);
S_dW is a matrix characterizing the degree ("energy", since the gradient is squared) of the change in the ANN parameters, without taking into account the direction of change;
β2 is a parameter that sets the balance between the previous energy of the gradient and the energy of the gradient obtained at the next training epoch and the next batch; its value is usually close to 1 (~0.999);
ε is a positive number close to zero that prevents division by zero;
α is the size of the learning step;
t is the batch number during training.
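The two update rules can be sketched with the symbols defined above. This follows the standard formulation of Adam with bias correction, which is assumed here to match the cited expressions; the function names and the toy gradient are illustrative only.

```python
import numpy as np

def sgd_step(W, dW, alpha):
    """Classical gradient descent step: move against the gradient."""
    return W - alpha * dW

def adam_step(W, dW, V, S, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step; V and S are the running first and second moments, t >= 1."""
    V = beta1 * V + (1 - beta1) * dW           # inertia of the gradient direction
    S = beta2 * S + (1 - beta2) * dW ** 2      # "energy" of the gradient
    V_corr = V / (1 - beta1 ** t)              # bias correction for early steps
    S_corr = S / (1 - beta2 ** t)
    W = W - alpha * V_corr / (np.sqrt(S_corr) + eps)
    return W, V, S

# Toy problem (assumption): E(W) = 0.5 * ||W||^2, so the gradient dW equals W.
W0 = np.array([1.0, -2.0])
V0, S0 = np.zeros_like(W0), np.zeros_like(W0)
W1, V1, S1 = adam_step(W0, W0, V0, S0, t=1)
W_sgd = sgd_step(W0, W0, alpha=0.1)
```

On the first step the bias correction cancels the moving-average factors, so each Adam weight moves by approximately α in the direction opposite to the sign of its gradient, regardless of the gradient magnitude, whereas the SGD step is proportional to the gradient.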
As a result, the problem of predicting values from hourly wind speed samples was solved using the proposed learning process control algorithm together with the solution described above. For both options (activation function and learning algorithm), the experimentally determined hyperparameters are those listed in Section 2.2: the number of hidden layer neurons, the activation function of the hidden layer, the learning method, and the learning rate.
3. Obtained Validation Results and Discussion
The study was conducted based on wind speed data, including an analysis of indicators of the wind potential of the GBAO for one full year (wind speed values for each hour). Models were built and their results analysed separately for each of the four seasons of the year. This makes it possible to develop several simpler and more focused models instead of a single complicated model that operates in all climate conditions.
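Splitting a year of hourly data into seasonal subsets can be sketched as follows. The mapping of months to seasons (meteorological seasons, with winter as December to February) and the function name are assumptions; the text does not state how the season boundaries were defined.

```python
import numpy as np

# Assumed meteorological seasons; the study does not specify the boundaries.
SEASON_OF_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}

def split_by_season(months, speeds):
    """Group wind speed samples by the season of their month (1..12)."""
    months = np.asarray(months)
    speeds = np.asarray(speeds)
    seasons = np.array([SEASON_OF_MONTH[int(m)] for m in months])
    return {s: speeds[seasons == s]
            for s in ("winter", "spring", "summer", "autumn")}

# Tiny synthetic example (assumption): two samples per month.
months = np.repeat(np.arange(1, 13), 2)
speeds = np.arange(24.0)
by_season = split_by_season(months, speeds)
```

Each seasonal subset would then be split into training and validation parts and used to fit its own model.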
The best results for the first option for every combination of activation function and training method in the different seasons are given in Table 1, Table 2, Table 3 and Table 4. In this option, only the previous hours that coincide with the forecast hour during the month are fed to the model input.
Table 1, Table 2, Table 3 and Table 4 show that for each season of the year, an acceptable result was achieved only when using ReLU and Adam. The optimal number of neurons is stable at 12–15, which indicates the robustness of this pattern. The training step is also the same for the combination of ReLU and Adam, namely 10⁻³, while with SGD the optimal step varies from 10⁻² (Table 2 and Table 4) to 10⁻⁴ (Table 3) depending on the season and the number of neurons. The best combination, ReLU+Adam, requires from 600 to 800 epochs. Changing the activation function to sigmoid or the learning method to SGD slows down the learning process and does not improve its quality. The simulation results therefore show the advantage of the Adam learning method and the ReLU activation function for all seasons. Moreover, the autumn period in the examined area is characterized by sharp changes in wind speed (Table 4), which makes forecasting more difficult.
The detailed experimental procedure and the corresponding simulation results are provided in Table 5 and Table 6. Similar experiments were executed for all activation functions, training methods, numbers of neurons, and seasons.
Table 5 shows that if the learning rate is too high (10⁻²), the process does not converge: the error does not decrease during learning. With a correctly selected rate (10⁻³), the accuracy quickly increases on both the training and validation sets. If the learning rate is too low (10⁻⁴), the learning process is very slow, so the time to achieve acceptable accuracy is an order of magnitude longer than with a correctly chosen step.
Table 6 demonstrates the effect of the number of neurons on the results of model learning. It can be seen that 15 is the optimal neuron number for the ReLU+Adam option. Since the data set is quite small, an increase in the number of neurons can lead to the identification of false dependencies.
Figure 2 provides examples of the model learning process. For instance, it can be seen that when using ReLU+Adam, the number of neurons has a much stronger influence on the learning process than when using ReLU+SGD.
For the second option, in which the model accepts data for all previous hours during the week, the best ReLU+Adam configuration was taken with a learning rate of 10⁻³, and the number of neurons was selected separately (Table 7). The results show that this option is significantly inferior to the first option, in which only the corresponding hours of the previous days are used for the forecast. This also indicates a very high variability of wind speed in the GBAO of the Republic of Tajikistan. The day-ahead wind speed predictions are presented in Figure 3, which illustrates the variability of wind speed and shows that the forecast generally repeats the daily profile, although in some hours the deviations can be very large.
4. Conclusions
Nowadays to achieve carbon neutrality all over the world, there is a need for a rapid increase in the capacity of installations for generating electricity from RES, among which wind energy remains the undisputed leader. However, further expansion of wind energy production will require better climate forecasts that will be able to more accurately assess changes in wind speed in the coming seasons, years, and decades. This is extremely important for the planning of wind energy resources, which in turn is necessary to facilitate the large-scale integration of wind energy into the broader energy system. Wind energy, being at the same time a clean, easily accessible, and sustainable source of energy, is becoming increasingly important in the modern energy system. However, the chaotic, random, irregular, and unstable nature of its changes remains a big problem for the active introduction of wind farms into the energy sector, which, in turn, has a significant negative impact both on the planning and management of energy systems in general and on the dynamic management of wind turbines in particular. In this regard, the need to obtain the most accurate forecast of wind speed, which can improve the planning of wind energy production, reduce costs and improve the use of resources, is particularly acute.
This work determines the optimal hyperparameters of a multilayer perceptron, namely the number of neurons in the hidden layer, the learning algorithm, the learning step, and the activation function of the hidden-layer neurons, in terms of prediction accuracy. The learning process control algorithm makes it possible to find the moment when model training stagnates or starts to fit false dependencies. As a result, the training time is reduced and overfitting of the model is prevented. The average error in predicting wind speed on the validation dataset ranged from 20% to 28%. Improving the prediction accuracy requires additional meteorological data or Earth remote sensing data, which is beyond the scope of this work. Nevertheless, such a level of prediction provides some degree of probability for the delivery of guaranteed power from the wind farm. Moreover, the smaller the guaranteed delivered capacity of an individual wind power plant (WPP), the higher the probability of delivering it, because at any wind speed over 12 m/s the WPP can develop full power, and in the range of 3–12 m/s the power output will be guaranteed at 5–8%.
The limitations of the study include:
Non-use of additional meteorological features such as humidity, pressure, and temperature;
Absence of wind direction in the forecasting model;
Manual determination of the point in time when the neural network training process should be completed.
At the same time, the model architecture and the proposed approach make it possible to address these limitations and improve forecasting accuracy. To achieve this, it is enough to add meteorological features, including wind direction, to the model input and retrain the model. The wind direction should also be used as a model output so that the model can predict it. Automation of the learning process control is a direction for further research.