Article

An Empirical Modal Decomposition-Improved Whale Optimization Algorithm-Long Short-Term Memory Hybrid Model for Monitoring and Predicting Water Quality Parameters

1 School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun 130012, China
2 School of Computer and Automation, Wuhan Technology and Business University, Wuhan 430070, China
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(24), 16816; https://0-doi-org.brum.beds.ac.uk/10.3390/su152416816
Submission received: 13 November 2023 / Revised: 12 December 2023 / Accepted: 12 December 2023 / Published: 13 December 2023

Abstract

Prediction of water quality parameters is a significant aspect of contemporary green development and ecological restoration. However, the conventional water quality prediction models have limited accuracy and poor generalization capability. This study aims to develop a dependable prediction model for ammonia nitrogen concentration in water quality parameters. Based on the characteristics of the long-term dependence of water quality parameters, the unique memory ability of the Long Short-Term Memory (LSTM) neural network was utilized to predict water quality parameters. To improve the accuracy of the LSTM prediction model, the ammonia nitrogen data were decomposed using Empirical Modal Decomposition (EMD), and then the parameters of the LSTM model were optimized using the Improved Whale Optimization Algorithm (IWOA), and a combined prediction model based on EMD-IWOA-LSTM was proposed. The study outcomes demonstrate that EMD-IWOA-LSTM displays improved prediction accuracy with reduced Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) in comparison to the LSTM and IWOA-LSTM approaches. These research findings better enable the monitoring and prediction of water quality parameters, offering a novel approach to preventing water pollution rather than merely treating it afterwards.

1. Introduction

Water is a vital resource that sustains and propels human progress. However, the escalating need for water resources, driven by economic and societal developments, has aggravated the already deteriorating trend of water pollution [1]. Water quality prediction is essential for the development of water pollution control planning and comprehensive prevention and control. Its aim is to understand the current status and trend of water quality, providing a basis for subsequent water environment evaluation, water resource planning and scientific decision-making. Predictive analysis plays a crucial role in managing water environment information [2]. Based on water quality monitoring data, accurately predicting the trend of its changes is essential for early warning decision-making regarding major water pollution events [3].
Frequently applied techniques for forecasting water quality include the time series method [4], the grey system prediction method [5], and the regression analysis method [6], among others. The time series method is a statistical approach grounded in mathematical statistics and stochastic processes; it uses the trends in historical data to infer how a series will evolve in the future. Common time series methods include the Autoregressive (AR) model [7], the Vector Autoregressive (VAR) model [8], and the Autoregressive Moving Average (ARMA) model [9]. Time series water quality prediction methods have a sound theoretical basis, high computational speed and low sample requirements. However, they fail to account for the interactions between different water quality parameters, leading to a considerable margin of error. The grey system prediction method begins by analyzing the development trends among system factors, then processes the original data and establishes differential equations that reflect the system's change patterns, and finally predicts the future trend of the series. At present, the GM(1,1) grey model [10] is widely used in water quality prediction, and numerous enhancements have been built upon this foundation. Pai et al. [11] employed a first-order univariate grey model to predict nine groundwater parameters and showed that only conductivity, chloride, and total dissolved solids were predicted with good accuracy. The grey system prediction method requires little data and can be applied in scenarios with incomplete information, but it is not suitable for nonlinear water quality series. Regression analysis [12] is grounded in statistical regression and describes the interrelationships between variables through mathematical formulas. Research on water quality parameter prediction is increasing, and there have been successful applications of machine learning to water quality prediction [13]. Support Vector Regression (SVR) [14] is a machine learning model based on statistical learning theory that is well suited to complex nonlinear relationships; it has therefore replaced the traditional multiple regression method in many cases and has been widely studied and applied in water quality prediction. Su et al. [15] combined the Improved Sparrow Search Algorithm (ISSA) with SVR to predict water quality parameters, using ISSA to select the penalty factor c and kernel function parameter g of the SVR model, which improved its accuracy and generalization ability and yielded better prediction results.
With the rapid growth of artificial intelligence, deep learning demonstrates distinct advantages over conventional machine learning: it possesses excellent nonlinear modelling capabilities and performs better in terms of adaptability and generalization [16,17,18]. Deep learning is a subset of machine learning that employs complex neural network structures to grasp intricate data relationships [19]. Artificial neural networks (ANN), which process complex information by modelling the neural networks of the human brain, form a key component of deep learning methods. ANN prediction techniques are increasingly utilized in the field of water quality forecasting [20]. Gautam et al. [21] employed ANN modelling to predict parameters including the sodium adsorption ratio of groundwater, introducing the Levenberg–Marquardt (L-M) three-layer backpropagation technique to establish a dependable predictive model for sodium parameters. Water quality parameter data exhibit nonlinearity and temporality. With the development of deep learning, the Recurrent Neural Network (RNN) has proven more suitable than the ANN for processing nonlinear time series [22]. Kumar et al. [23] conducted a prediction study on monthly river flow data using an RNN and showed that RNNs achieve high accuracy in predicting time series data. However, RNNs also suffer from vanishing and exploding gradients. As a solution to these shortcomings, Hochreiter et al. [24] proposed Long Short-Term Memory (LSTM). Liu et al. [25] employed LSTM to investigate and forecast water quality; the model's predicted values fitted the actual values well, and the prediction error was smaller than that of other models, demonstrating the feasibility of LSTM for water quality prediction. Li et al. [26] used the sparrow search algorithm to select the optimal hyperparameters of an LSTM for wastewater quality prediction, and their findings indicate that optimizing LSTM parameters can enhance model performance. Yang et al. [27] proposed a whale optimization algorithm-bidirectional long short-term memory (WOA-BILSTM) model, using WOA to optimize the hyperparameters of the BILSTM model; their study found that WOA converged faster and was better at finding the optimal hyperparameters than Bayesian optimization and grid search. Cai et al. [28] proposed a hybrid model combining the Kalman filter and LSTM to enhance the accuracy of water quality predictions, where the Kalman filter is used to reconstruct and preprocess the data and thereby improves model performance. These findings indicate that treating water quality data in the preprocessing phase can enhance model prediction accuracy. Empirical Modal Decomposition (EMD) is a time-frequency decomposition method [29] that has been widely used in the data preprocessing stage of deep learning in recent years. Zhang et al. [30] utilized an EMD-LSTM model to forecast the water quality of an urban drainage network, showing that EMD is more effective at extracting data features than other data preprocessing techniques.
Based on the above, a single neural network model frequently has limitations [31,32]. Specifically, when dealing with input sequences of high nonlinear complexity, neural networks may struggle to adequately capture the key features of the data, increasing the difficulty of prediction [33]. Furthermore, the network parameters are a major factor influencing the performance of neural networks. In this paper, we propose a novel model for predicting water quality parameters, called the Empirical Modal Decomposition-Improved Whale Optimization Algorithm-Long Short-Term Memory (EMD-IWOA-LSTM) hybrid model. Specifically, we introduce a nonlinear convergence factor and adaptive weights to improve the traditional WOA. Compared to the traditional WOA, this revision not only boosts convergence speed but also achieves a lower convergence value. The main characteristics of the original data are obtained by decomposing it through EMD, and the LSTM network's parameters are adjusted with the aid of IWOA to overcome the constraints of a single neural network model and improve the prediction accuracy of the water quality parameters.
The research plan of this paper is shown in Figure 1, where the original data is first decomposed into several components using EMD, and then the training and test sets are divided. The training set data is then fed into the LSTM for training. Furthermore, the number of iterations, the learning rate, and the number of hidden layer units are ascertained using IWOA. Lastly, the trained model undergoes testing with the test set. The predicted values of all components are then merged to obtain the final results.
The paper is structured as follows: Section 2 divides the original data set, removes outliers, fills in missing values, and uses EMD to decompose the data into seven IMF components and one RES component; Section 3 gives an overview of the methodology used, improves the basic WOA, defines the parameters of the LSTM model and determines the optimization interval of the IWOA; Section 4 compares and discusses the results of the different models; Section 5 discusses the shortcomings of the work done in this paper and the ideas of the subsequent research; and Section 6 concludes the whole paper.

2. Data Sources and Processing

2.1. Data Sources

The research data were gathered from Changchun City, located in Jilin Province in Northeast China (Figure 2). The region experiences a temperate continental climate, and heavy industry is integral to its economy. Discharged industrial wastewater contains considerable quantities of hazardous substances that have a direct impact on the quality of the surrounding water bodies, making water quality prediction imperative. In this paper, 1440 sets of ammonia nitrogen concentration monitoring data were used for training and testing the model: the first 80% of the data were used to train the parameters of the model, and the last 20% were used to check the accuracy of the model.
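As a minimal illustration of this chronological split (a sketch assuming the ammonia nitrogen series is stored in a one-dimensional NumPy array; the file name and variable names are hypothetical):

```python
import numpy as np

def split_series(series: np.ndarray, train_ratio: float = 0.8):
    """Chronologically split a series into training and test parts (no shuffling)."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]

# With the 1440 ammonia nitrogen samples used in this study:
ammonia = np.loadtxt("ammonia_nitrogen.csv")   # hypothetical single-column file
train_set, test_set = split_series(ammonia)    # 1152 training / 288 test samples
```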

2.2. Outlier Handling and Filling

Considering that some of the monitoring data may change abruptly due to malfunctioning of the online monitoring equipment or other sudden factors, outliers in the sample data were eliminated using the Lajda criterion (3σ criterion), with the standard deviation σ calculated according to Formula (1):
\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2} \quad (1)
where X̄ is the average value of the samples X_1, X_2, …, X_n, and n is the number of samples. If a monitoring value X_i satisfies Formula (2):
\left| X_i - \bar{X} \right| > 3\sigma \quad (2)
then X_i is considered an outlier and is replaced by the average of the monitoring values from the two time intervals before and after it, as shown in Formula (3). The ammonia nitrogen monitoring data processed in this way are shown in Figure 3.
X_i = \frac{X_{i-2} + X_{i-1} + X_{i+1} + X_{i+2}}{4} \quad (3)
where X_i denotes the value to be replaced, and X_{i-2}, X_{i-1}, X_{i+1} and X_{i+2} denote the monitoring values from the two time intervals before and after it.
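A minimal sketch of this 3σ screening and neighbor-average replacement (Formulas (1)-(3)), assuming the series is held in a NumPy array; the function name is illustrative:

```python
import numpy as np

def remove_outliers_3sigma(x: np.ndarray) -> np.ndarray:
    """Replace points that violate the 3-sigma criterion (Formula (2)) with the
    average of the two preceding and two following monitoring values (Formula (3))."""
    x = x.astype(float).copy()
    mean = x.mean()
    sigma = x.std(ddof=1)                                # Formula (1): sample standard deviation
    for i in np.where(np.abs(x - mean) > 3 * sigma)[0]:  # Formula (2): 3-sigma test
        lo, hi = max(i - 2, 0), min(i + 3, len(x))       # window clipped at the series ends
        neighbors = np.delete(x[lo:hi], i - lo)          # X_{i-2}, X_{i-1}, X_{i+1}, X_{i+2}
        x[i] = neighbors.mean()                          # Formula (3): neighbor average
    return x
```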

2.3. EMD-Based Nonlinear Decomposition of Data

EMD is a method for analyzing nonlinear data. The EMD technique resolves the issue that traditional methods discard valuable information and represent intricate phenomena inaccurately. The nonlinear and non-smooth characteristics of water quality data are obstacles to accurate prediction of water quality parameters, and accurate prediction would otherwise require a large amount of a priori information that is time-consuming to collect over a long time span. Therefore, EMD can be employed to preprocess the water quality parameter data, extracting the essential features of the complex sequences and providing a reference for the subsequent prediction.
The results of the EMD decomposition of the ammonia nitrogen concentration parameters are shown in Figure 4:
In Figure 4, the original ammonia nitrogen data were decomposed by EMD into seven IMF sequences and a residual sequence, where the IMF1 sequence is the highest frequency component of the original data; the IMF2-IMF6 sequences represent the different mid-frequency components; the IMF7 sequence is the lowest frequency component; and the residual sequence contains the low-frequency or trend components of the original signal that were not fully captured by IMF1-IMF7. After EMD decomposition, the features of the ammonia nitrogen data can be mined more deeply and the accuracy of the neural network prediction can be improved.
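For reference, this kind of decomposition can be reproduced with the PyEMD package (a sketch under the assumption that this library is used; the cleaned series comes from the outlier-handling sketch in Section 2.2, and the exact call signature may differ between PyEMD versions):

```python
import numpy as np
from PyEMD import EMD   # provided by the "EMD-signal" package; an assumption about tooling

signal = remove_outliers_3sigma(np.loadtxt("ammonia_nitrogen.csv"))  # cleaned series (hypothetical file)

emd = EMD()
emd.emd(signal)                              # run the sifting process
imfs, residue = emd.get_imfs_and_residue()   # here: IMF1-IMF7 plus the residual trend

components = np.vstack([imfs, residue])      # each row is modeled separately later on
```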

3. Method Design and Improvement

3.1. LSTM Neural Network

RNN models are frequently employed to tackle temporal problems in neural networks. The RNN structure is shown schematically in Figure 5: X_t denotes the input at moment t; A denotes the memory unit, whose state is computed from the current input and the state of the hidden layer at the previous moment; and h_t is the output at moment t.
However, RNNs suffer severely from the problem of "long-term dependencies": the characteristics of earlier time points are easily overwritten. LSTM can solve this long-term dependency problem because it introduces a gate mechanism for the flow of information. This gating mechanism allows the network to selectively forget or store information, enabling it to better process time-series data, especially in contexts where long-term dependencies need to be taken into account, so LSTMs perform better than traditional RNNs. The cell structure of the LSTM is shown in Figure 6.
The forget gate decides what information is kept and what is forgotten. The formula for the forget gate is as follows:
f_t = \sigma \left( W_f \cdot \left[ h_{t-1}, x_t \right] + b_f \right) \quad (4)
where f_t denotes the output of the forget gate, W_f and b_f are the weight and bias of the forget gate, h_{t-1} is the hidden state at the previous moment, x_t is the current input, and σ denotes the sigmoid function.
The input gate determines what new information is added to the memory cell. The formulas for the input gate are as follows:
i_t = \sigma \left( W_i \cdot \left[ h_{t-1}, x_t \right] + b_i \right) \quad (5)
\tilde{C}_t = \tanh \left( W_C \cdot \left[ h_{t-1}, x_t \right] + b_C \right) \quad (6)
where i_t denotes the output of the input gate; W_i and b_i are the weight and bias of the input gate; W_C and b_C are the weight and bias used to compute the candidate cell state C̃_t at moment t; and tanh denotes the hyperbolic tangent function.
Next, the outputs of the forget gate and the input gate need to be used to update the memory cell. The formula for updating the memory unit is as follows:
C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t \quad (7)
where C_t denotes the state of the memory cell at the current moment.
The output gate, which controls what information is output from the memory cell, is formulated as follows:
o_t = \sigma \left( W_o \cdot \left[ h_{t-1}, x_t \right] + b_o \right) \quad (8)
h_t = o_t \cdot \tanh \left( C_t \right) \quad (9)
where o_t denotes the output of the output gate, W_o and b_o are the weight and bias of the output gate, respectively, and h_t denotes the output (hidden state) at moment t.
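To make Formulas (4)-(9) concrete, the following NumPy sketch implements a single LSTM cell step (illustrative only; the weights are assumed to be already trained, and the dictionary-based parameter layout is an assumption, not the framework implementation used later in this paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W and b are dicts holding the forget/input/candidate/output
    weights (W_f, W_i, W_C, W_o) and biases (b_f, b_i, b_C, b_o)."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate, Formula (4)
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate, Formula (5)
    c_hat = np.tanh(W["C"] @ z + b["C"])     # candidate cell state, Formula (6)
    c_t = f_t * c_prev + i_t * c_hat         # cell state update, Formula (7)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate, Formula (8)
    h_t = o_t * np.tanh(c_t)                 # hidden output, Formula (9)
    return h_t, c_t
```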

3.2. Principles and Improvement of WOA

3.2.1. Basic WOA

The WOA is a heuristic optimization algorithm inspired by the feeding behavior of humpback whales. Its optimization accuracy is largely insensitive to the dimensionality of the problem, and the underlying idea is novel and easy to implement, so the WOA has gradually become one of the most actively studied and improved swarm intelligence optimization algorithms in recent years and has been successfully applied in a number of fields [34,35].
The WOA is subdivided into three stages: surrounding the prey, producing a bubble-net attack, and seeking out the prey.
  • Surrounding the prey;
Assume that, in the d-dimensional space, the current position of the best whale individual X is (X_1, X_2, …, X_d) and the position of whale individual X^j is (X_1^j, X_2^j, …, X_d^j). The next position X^{j+1} = (X_1^{j+1}, X_2^{j+1}, …, X_d^{j+1}) of whale individual X^j under the influence of the best whale individual X is given by:
X_k^{j+1} = X_k - A_1 D_k \quad (10)
D_k = \left| C_1 X_k - X_k^{j} \right| \quad (11)
C_1 = 2 r_2 \quad (12)
A_1 = 2 a r_1 - a \quad (13)
where X_k^{j+1} denotes the k-th component of the position X^{j+1}, and |·| in the formula for D_k denotes the absolute value. Both r_1 and r_2 are random numbers between 0 and 1, and a decreases linearly from 2 to 0 as the number of iterations increases, according to:
a = 2 - \frac{2t}{T_{Max}} \quad (14)
where t denotes the current iteration number and T_{Max} denotes the maximum number of iterations.
  • Producing a bubble-net attack;
To model the bubble-net feeding behavior of whales accurately, two mathematical models were designed: a shrinking encirclement mechanism and a spiral position update.
(a)
Shrinking encirclement
The shrinking encirclement mechanism is similar to the mathematical model for surrounding the prey, with the distinction lying in the range of values of A_1: the range of A_1 is adjusted from [-a, a] to [-1, 1], while the other formulas remain unchanged.
(b)
Spiral position update
The current whale individual approaches the current best whale individual in a spiral with the mathematical formula:
X_k^{j+1} = X_k + D_k e^{bl} \cos \left( 2\pi l \right) \quad (15)
D_k = \left| X_k - X_k^{j} \right| \quad (16)
where b is the constant that defines the shape of the spiral and l is a random number between -1 and 1.
When rounding up prey, humpback whales not only contract their encirclement but also swim towards the prey along a spiral path, choosing between the two behaviors with a probability of 50% each, as modeled below:
X_k^{j+1} = X_k - A_1 D_k, \quad p < 0.5 \quad (17)
X_k^{j+1} = X_k + D_k e^{bl} \cos \left( 2\pi l \right), \quad p \ge 0.5 \quad (18)
  • Seeking out the prey.
In the mathematical model of shrinking encirclement, the value of A falls within the range [-1, 1]. When A lies outside this range, the current whale individual no longer approaches the best whale individual, which leads to the idea of searching for prey. This enhances the global search capability of the whale group but causes the current whale individual to deviate from the target prey. The mathematical model of the search (predation) behavior is presented below.
X_k^{j+1} = X_k^{rand} - A_1 D_k \quad (19)
D_k = \left| C_1 X_k^{rand} - X_k^{j} \right| \quad (20)
C_1 = 2 r_2 \quad (21)
A_1 = 2 a r_1 - a \quad (22)
where X_k^{rand} denotes the k-th component of the position of a randomly selected whale individual in the current population.
In the WOA process, when the control parameter satisfies |A| < 1, the search for a local optimal solution is carried out: the whale encircles the prey with 50% probability and performs the spiral motion with 50% probability. The algorithm carries out the global search when |A| ≥ 1 and p < 0.5.

3.2.2. Steps and Processes of WOA

The optimization process of the WOA is shown in Figure 7 and consists of the following steps (a compact code sketch of the loop is given after the list):
  • Initialize the population positions X^j of the whales and the parameters a, A, l, p, t and T_{Max};
  • Calculate the fitness value for each individual whale in the population and record the current optimal solution;
  • When p < 0.5 and |A| < 1, each whale updates its position according to the "surrounding the prey" behavior; when p < 0.5 and |A| ≥ 1, it updates its position according to the "seeking out the prey" behavior; and when p ≥ 0.5, it updates its position according to the spiral "bubble-net feeding" behavior;
  • Update the global optimal position to obtain the current global optimal solution;
  • Determine whether the termination condition of the algorithm is satisfied; if so, the algorithm ends, otherwise return to step 2;
  • Output the optimal position and fitness value.
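The sketch below implements this loop for minimizing a generic fitness function (illustrative only; parameter names follow Section 3.2.1, and the fitness used later in this paper is the validation error of an LSTM trained with the candidate parameters):

```python
import numpy as np

def woa(fitness, dim, bounds, n_whales=20, t_max=10, b=1.0):
    """Basic whale optimization algorithm (minimization), following Section 3.2.1."""
    lo, hi = bounds
    X = np.random.uniform(lo, hi, (n_whales, dim))         # initialize whale positions
    best = X[np.argmin([fitness(x) for x in X])].copy()    # current best individual
    for t in range(t_max):
        a = 2 - 2 * t / t_max                              # Formula (14): linear decrease
        for j in range(n_whales):
            r1, r2 = np.random.rand(), np.random.rand()
            p, l = np.random.rand(), np.random.uniform(-1, 1)
            A, C = 2 * a * r1 - a, 2 * r2                  # Formulas (12)-(13)
            if p < 0.5:
                if abs(A) < 1:                             # surround the prey, Formulas (10)-(11)
                    X[j] = best - A * np.abs(C * best - X[j])
                else:                                      # seek out the prey, Formulas (19)-(20)
                    rand = X[np.random.randint(n_whales)]
                    X[j] = rand - A * np.abs(C * rand - X[j])
            else:                                          # spiral update, Formulas (15)-(16)
                X[j] = best + np.abs(best - X[j]) * np.exp(b * l) * np.cos(2 * np.pi * l)
            X[j] = np.clip(X[j], lo, hi)                   # keep positions inside the search bounds
        scores = [fitness(x) for x in X]
        j_best = int(np.argmin(scores))
        if scores[j_best] < fitness(best):                 # record the global optimum so far
            best = X[j_best].copy()
    return best
```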

3.2.3. Improvement of WOA

The WOA is mainly divided into a population initialization phase, a global exploration phase and a local exploitation phase. The standard algorithm suffers from deficiencies such as low solution accuracy, slow convergence and a tendency to fall into local optima. To overcome these shortcomings and further improve the solution performance of the WOA, this paper proposes a multi-strategy collaborative IWOA.
  • Nonlinear convergence factor;
Analysis of the basic principles of the WOA shows that the global exploration and local exploitation abilities of the algorithm depend mainly on the parameter A, whose value in turn depends on the convergence factor a. The convergence factor a is therefore crucial for finding the optimal solution: a larger convergence factor provides stronger global exploration ability and helps avoid falling into local optima, while a smaller convergence factor gives the algorithm stronger local exploitation ability and accelerates its convergence. However, in the traditional WOA, the convergence factor decreases linearly as the number of iterations increases, and this can make the algorithm converge too slowly. This paper therefore adopts a nonlinear convergence factor:
a = 2 - 2 \sin \left( \mu \frac{t}{T_{Max}} \pi + \varphi \right) \quad (23)
where t represents the current iteration number, T_{Max} represents the maximum number of iterations, and μ and φ are the parameters of the expression; in this paper, μ is set to 1/2 and φ is set to 0. This decreasing convergence factor a generates a larger parameter A in the early iterations of the algorithm, which more effectively improves the global exploration ability and accelerates convergence, while in the later iterations it generates a smaller parameter A, which effectively improves the local exploitation ability. The curves of the convergence factor before and after the improvement are shown in Figure 8, where the orange curve corresponds to the original parameter a and the blue curve to the improved parameter a. A code sketch combining this factor with the adaptive weights introduced below is given at the end of this subsection.
  • Adaptive weighting factor;
In the WOA, the position of the prey represents the optimal solution of the optimization problem; however, the prey position X*(t) is underutilized in the position-update formulation of the traditional WOA. Therefore, in this paper, adaptive weights are introduced into the position update so that the optimal solution can be used more fully and the optimization accuracy of the algorithm improved. The definition is as follows:
X(t+1) = \frac{t^3}{T_{Max}^3} X^{*}(t) - A \cdot D, \quad |A| < 1,\ p < 0.5 \quad (24)
X(t+1) = \frac{t^3}{T_{Max}^3} X_{rand} - A \cdot D, \quad |A| \ge 1,\ p < 0.5 \quad (25)
X(t+1) = D e^{bl} \cos \left( 2\pi l \right) + \left( 1 - \frac{t^3}{T_{Max}^3} \right) X^{*}(t), \quad p \ge 0.5 \quad (26)
where t^3/T_{Max}^3 denotes the adaptive weight of the prey position. This weight coefficient increases with the number of iterations, which means that in the prey-encircling and random-search stages of the WOA (i.e., when p < 0.5) the adaptive weight applied to the prey position allows the optimal solution to be exploited more fully in the search. In the spiral position-update stage, the adaptive weight is 1 - t^3/T_{Max}^3: as the number of iterations increases the whale keeps approaching the prey, and the smaller weight applied to the prey position at this stage improves the algorithm's local exploitation ability and hence its optimization accuracy.
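Taken together, the two improvements amount to replacing the schedule of a and the inner position update of the basic WOA sketch in Section 3.2.2. The following is an illustrative sketch under that assumption (A, C, l and p are drawn per whale as before, and the names mirror the earlier sketch):

```python
import numpy as np

def nonlinear_a(t, t_max, mu=0.5, phi=0.0):
    """Nonlinear convergence factor, Formula (23); mu = 1/2 and phi = 0 as in the text."""
    return 2 - 2 * np.sin(mu * t / t_max * np.pi + phi)

def iwoa_update(X_j, best, X_rand, A, C, l, p, t, t_max, b=1.0):
    """IWOA position update with the adaptive prey weight, Formulas (24)-(26)."""
    w = (t / t_max) ** 3                                   # adaptive weight t^3 / T_Max^3
    if p < 0.5:
        if abs(A) < 1:                                     # encircling stage, Formula (24)
            return w * best - A * np.abs(C * best - X_j)
        return w * X_rand - A * np.abs(C * X_rand - X_j)   # random search stage, Formula (25)
    # spiral stage, Formula (26): prey position weighted by (1 - t^3 / T_Max^3)
    return np.abs(best - X_j) * np.exp(b * l) * np.cos(2 * np.pi * l) + (1 - w) * best
```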

3.3. EMD-IWOA-LSTM Based Prediction Model for Water Quality Parameters

In this paper, the EMD-IWOA-LSTM method is used for modeling: the original water quality parameter sequence {X_t} is first decomposed by EMD into intrinsic mode functions I_t^{(i)} and a residual part R_t, i.e., X_t = \sum_{i=1}^{N} I_t^{(i)} + R_t. The I_t^{(i)} and R_t sequences are then each predicted using the IWOA-LSTM method, and the prediction results are superimposed to obtain the prediction of the original data sequence. The prediction flow is shown in Figure 9.
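Conceptually, the hybrid pipeline reduces to training one IWOA-tuned LSTM per component and summing the component forecasts. The sketch below illustrates this; iwoa_optimize_lstm is a hypothetical helper standing in for the IWOA-tuned LSTM training described below:

```python
import numpy as np

def predict_emd_iwoa_lstm(components, train_len, iwoa_optimize_lstm):
    """Train one IWOA-tuned LSTM per EMD component and sum the component forecasts.

    components: array of shape (n_components, n_samples), i.e. IMF1-IMF7 plus the residual.
    iwoa_optimize_lstm: callable(train_series) -> fitted model exposing .predict(test_series).
    """
    total = None
    for comp in components:
        train, test = comp[:train_len], comp[train_len:]
        model = iwoa_optimize_lstm(train)        # IWOA selects units, learning rate, iterations
        pred = np.asarray(model.predict(test))   # forecast for this component
        total = pred if total is None else total + pred
    return total                                 # X_t is recovered as the sum of all component forecasts
```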
The basic parameters of the LSTM model proposed in this study are shown in Table 1. The employed loss function is the mean square error function, as displayed in Formula (27).
\text{loss} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \bar{Y}_i \right)^2 \quad (27)
where n denotes the number of samples, Y_i denotes the predicted value, and Ȳ_i denotes the actual value.
There are numerous hyperparameters in LSTM neural networks, and the number of LSTM layer units, the initial learning rate, and the maximum number of iterations greatly affect the model's performance. Therefore, employing IWOA to determine these three hyperparameters holds significant importance. The algorithm's whale population size is set to 20 with 10 iterations, and the intervals of the optimization parameters are defined as shown in Table 2.
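One possible way to wire the IWOA to the hyperparameter intervals in Table 2 is sketched below, using Keras purely as an assumed stand-in for the LSTM implementation (the fitness of a whale is the validation RMSE of the LSTM built from its three parameters; X_tr, y_tr, X_va and y_va are hypothetical training and validation arrays):

```python
import numpy as np
import tensorflow as tf

BOUNDS_LO = np.array([20, 0.001, 100])   # hidden layer units, initial learning rate, iterations (Table 2)
BOUNDS_HI = np.array([100, 0.1, 300])

def lstm_fitness(params, X_train, y_train, X_val, y_val):
    """Fitness of one whale: validation RMSE of an LSTM built from its three parameters."""
    units, lr, epochs = int(params[0]), float(params[1]), int(params[2])
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(units, input_shape=X_train.shape[1:]),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr), loss="mse")
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    pred = model.predict(X_val, verbose=0).ravel()
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))

# Optimize with the (I)WOA sketches from Section 3.2, e.g.:
# best = woa(lambda p: lstm_fitness(p, X_tr, y_tr, X_va, y_va),
#            dim=3, bounds=(BOUNDS_LO, BOUNDS_HI), n_whales=20, t_max=10)
```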

3.4. Evaluation Indicators

In order to verify the accuracy of the model predictions, the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE) are used to compare and evaluate the prediction performance of the models. RMSE describes the deviation between the predicted and measured values, while MAPE describes the average percentage by which the predicted value deviates from the true value. The formulas for RMSE and MAPE are as follows:
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( Y_i - f(x_i) \right)^2} \quad (28)
MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{f(x_i) - Y_i}{Y_i} \right| \quad (29)
where n is the sample size of the data, Y_i is the actual value at the i-th moment, and f(x_i) is the predicted value at the i-th moment.
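For completeness, equivalent NumPy implementations of the two metrics (a small sketch; note that MAPE assumes the actual values are nonzero):

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Square Error, Formula (28)."""
    actual, predicted = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean Absolute Percentage Error (in percent), Formula (29)."""
    actual, predicted = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((predicted - actual) / actual)) * 100)
```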

4. Evaluation of Method Results

4.1. Comparison of WOA and IWOA

An algorithm's performance can be evaluated through its convergence curve, where faster convergence indicates superior performance. We use the IWOA to optimize the number of LSTM units, the initial learning rate, and the maximum number of iterations of the LSTM model. The fitness of each whale is the average prediction error of the LSTM model trained with that whale's parameters. The convergence curves for the WOA and IWOA are shown in Figure 10, with the WOA in black and the IWOA in red.
Figure 10 illustrates that the IWOA surpasses the WOA, converging faster and reaching a lower final fitness value. This can be attributed to the inclusion of the nonlinear convergence factor and the adaptive weight factor. The nonlinear convergence factor a decays rapidly early in the iterations, improving the convergence speed, and late in the iterations it improves the local exploitation of the algorithm. The adaptive weight factor changes as the iterations progress, resulting in a better balance between the global and local searches.

4.2. Comparison of LSTM Model and IWOA-LSTM Model Simulation

The preprocessed sample data were used to train the LSTM model and the IWOA-LSTM model, respectively, and the prediction results of the two models on the test set are shown in Figure 11, where the black curve is the monitored value of the ammonia nitrogen content, the red curve is the prediction curve of the LSTM, and the blue curve is the prediction curve of the IWOA-LSTM model.
Utilizing the Improved Whale Optimization Algorithm (IWOA) to optimize the LSTM parameters, including the number of hidden layer units, the initial learning rate, and the maximum number of iterations, resulted in an optimal configuration of [45, 0.019, 113] after 10 iterations. This configuration, comprising 45 hidden layer units, an initial learning rate of 0.019, and a maximum of 113 iterations, notably improved prediction accuracy compared to the original LSTM model.
Figure 11 and Table 3 illustrate that the IWOA-LSTM model outperforms the base model significantly, resulting in a 56.5% and 54.1% decrease in RMSE and MAPE, respectively. This enhancement is attributed to IWOA’s effective parameter tuning, which enables the model to capture complex data features and significantly reduces the prediction error.

4.3. Comparison of IWOA-LSTM Model and EMD-IWOA-LSTM Model Simulation

The experimental results in Section 4.2 confirm that the IWOA-LSTM has clear advantages over the unoptimized LSTM. To further enhance the model's predictive capacity, we introduce EMD for data decomposition, which facilitates the extraction of additional features and contributes to the improvement of prediction accuracy. The IMF1-IMF7 components and the residual, obtained through EMD decomposition of the original training dataset, were each used to train a separate IWOA-LSTM; the resulting model parameters are shown in Table 4. The EMD-IWOA-LSTM predictions were generated by combining the model predictions for all IMF components and the residual, and are compared with the IWOA-LSTM predictions of Section 4.2 in Figure 12. The monitored ammonia nitrogen content is represented by the black curve, while the IWOA-LSTM and EMD-IWOA-LSTM prediction curves are represented by the blue and red curves, respectively.
Upon analyzing the prediction results in Figure 12, it is evident that the use of Empirical Mode Decomposition (EMD) as a preprocessing step to decompose the original data into multiple components enhances the model’s ability to capture non-linear relationships in time series data. Subsequently, each component is predicted individually through the use of the IWOA-LSTM approach. The ensuing predictions are combined to produce the final results. In contrast to the IWOA-LSTM-only method, the amalgamated EMD-IWOA-LSTM procedure demonstrates more effective data fitting.
Table 5 presents the RMSE and MAPE results of the EMD-IWOA-LSTM model and the IWOA-LSTM model. The former showed a 37.8% decrease in RMSE and a 41.1% decrease in MAPE compared to the latter. This supports the effectiveness of the EMD introduced in the data preprocessing step, which significantly reduces the model prediction error.

5. Discussion

The experimental outcomes exhibit that the EMD-IWOA-LSTM hybrid model proposed in this study enhances the precision of the model in forecasting ammonia nitrogen content in water quality parameters through data decomposition and LSTM parameters optimization. Nevertheless, there are still certain aspects that warrant further investigation in our research.
In our research, we employed EMD, a time-frequency domain signal decomposition technique, to process the data and obtained good results. Our findings support the superior performance of EMD in data decomposition, which is consistent with the literature [34,35,36]. However, EMD methods commonly present two issues, namely modal aliasing and marginal effects. Modal aliasing denotes the interaction between distinct intrinsic modal functions (IMFs) during EMD, which can lead to aliasing in one or more IMFs, ultimately impacting the accuracy of the decomposition results. To address this issue, future studies could explore incorporating random noise and performing EMD multiple times to achieve a more even decomposition outcome, which can help mitigate the impact of modal aliasing. Additionally, there is the problem of marginal (edge) effects: the lack of sufficient signal data at both ends may result in instability and inaccuracy of the IMFs, and at the signal boundaries the selection of extreme points for EMD may lack consistency and affect the decomposition results. Future research may reduce marginal effects and enhance the stability of EMD by adding periodic extensions to the signal ends. This improvement should alleviate the issues created by insufficient data at the signal edges, enabling a more reliable application of EMD for signal decomposition.
The significant impact of the hyperparameters on a neural network’s performance is demonstrated in the literature [37,38,39,40]. In this paper, the use of IWOA to determine the hyperparameters of the LSTM improves the prediction accuracy of the model, which is consistent with the findings in the above literature. However, due to the experimental limitations, we solely measured the ammonia nitrogen content among the water quality parameters as the initial dataset for this study. Our proposed hybrid EMD-IWOA-LSTM model utilizes solely ammonia nitrogen content as the input to predict ammonia nitrogen content. In other words, both the input and output of the model consist of ammonia nitrogen content. Future research should aim to incorporate additional key water quality parameters, including dissolved oxygen and pH, into the dataset. Furthermore, the intrinsic relationships between these parameters should be analyzed in more detail. By analyzing strongly correlated parameters and using them as auxiliary inputs, the model’s accuracy in predicting ammonia nitrogen content is expected to improve. Additionally, validation of the model’s generalization ability is necessary. This involves evaluating the model’s predictive accuracy for various water quality parameters and verifying its effectiveness under different regional and water body circumstances. Such comprehensive validation will help to confirm the applicability and reliability of the model in different environments.
The hybrid model combining EMD, IWOA and LSTM can predict water quality parameters more accurately, overcoming some limitations of the traditional methods and providing a new approach for water quality prediction. In the future, the model is expected to provide more accurate prediction results in real water quality monitoring. In practical applications, the model has the potential to become a powerful tool in the field of water quality monitoring and management. It can be applied to water quality monitoring and management to capture water pollution emergencies in a timely manner, predict water quality trends, and provide key information to decision makers so that they can quickly develop and implement appropriate measures. The model is expected to play an active role in the protection of water resources and make substantial contributions to society and the ecological environment.

6. Conclusions

In this paper, a hybrid prediction model, EMD-IWOA-LSTM, is developed to deal with the non-linear and time-series characteristics of water quality parameters. The following are our main conclusions:
(1)
Utilizing EMD to decompose the data during the data preprocessing phase facilitates LSTM in extracting data characteristics, hence enhancing prediction accuracy.
(2)
To address the slow convergence of the traditional WOA and its tendency to fall into local optima, the traditional WOA is improved by introducing a non-linear convergence factor and adaptive weights, which accelerate the convergence of the algorithm, lower its final convergence value, and make it easier to find the globally optimal solution.
(3)
By comparing the RMSE and MAPE of the various models, the findings demonstrate that the hybrid EMD-IWOA-LSTM model put forward in this manuscript exhibits the lowest prediction error and the greatest prediction accuracy. This suggests that the hybrid model can significantly enhance the precision of ammonia nitrogen content prediction in water quality parameters.
This paper confirms the effectiveness of the EMD-IWOA-LSTM hybrid model in predicting ammonia nitrogen levels in water quality parameters. It also provides a new idea for predicting water quality parameters, with good application prospects and practical value.

Author Contributions

Conceptualization, B.L. and H.X.; Data curation, H.X. and Y.S.; Funding acquisition, B.L.; Methodology, H.X. and P.L.; Resources, B.L.; Software, H.X. and P.L.; Validation, B.L. and Y.L.; Writing—original draft, H.X. and P.L.; Writing—review and editing, H.X., Y.L. and C.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by Science and Technology Development Project of Jilin Province under Grant 20210201106GX.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are experimental data from actual projects and are not suitable for public disclosure.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Meng, F. The impact of water resources and environmental improvement on the development of sustainable ecotourism. Desalination Water Treat. 2021, 219, 40–50. [Google Scholar] [CrossRef]
  2. Wang, W.; Tang, D.S.; Pilgrim, M.; Liu, J.N. Water Resources Compound Systems: A Macro Approach to Analysing Water Resource Issues under Changing Situations. Water 2016, 8, 2. [Google Scholar] [CrossRef]
  3. Rustam, F.; Ishaq, A.; Kokab, S.T.; Diez, I.D.; Mazón, J.L.V.; Rodríguez, C.L.; Ashraf, I. An Artificial Neural Network Model for Water Quality and Water Consumption Prediction. Water 2022, 14, 3359. [Google Scholar] [CrossRef]
  4. Li, H.Y.; Wang, X.S.; Guo, H.Y. Uncertain time series forecasting method for the water demand prediction in Beijing. Water Supply 2022, 22, 3254–3270. [Google Scholar] [CrossRef]
  5. Men, B.H.; Wu, Z.J.; Liu, H.L.; Hu, Z.H.; Li, Y.S. Improved grey prediction method for optimal allocation of water resources: A case study in Beijing in China. Water Supply 2019, 19, 1044–1054. [Google Scholar] [CrossRef]
  6. Liu, S.Y.; Xu, L.Q.; Li, D.L. Prediction of Aquaculture Water Quality Based on Combining Principal Component Analysis and Least Square Support Vector Regression. Sens. Lett. 2013, 11, 1305–1309. [Google Scholar] [CrossRef]
  7. Wang, Y.Z.; Zhang, K.; Ma, X.P.; Liu, P.Y.; Wang, H.C.; Guo, X.; Liu, C.L.; Zhang, L.M.; Yao, J. A physics-guided autoregressive model for saturation sequence prediction. Geoenergy Sci. Eng. 2023, 221, 211373. [Google Scholar] [CrossRef]
  8. Li, Y.H.; Genton, M.G. Single-Index Additive Vector Autoregressive Time Series Models. Scand. J. Stat. 2009, 36, 369–388. [Google Scholar] [CrossRef]
  9. Zhang, J.P.; Xiao, H.L.; Fang, H.Y. Component-based Reconstruction Prediction of Runoff at Multi-time Scales in the Source Area of the Yellow River Based on the ARMA Model. Water Resour. Manag. 2022, 36, 433–448. [Google Scholar] [CrossRef]
  10. Zhang, X.Q.; Wu, X.L.; Xiao, Y.M.; Shi, J.W.; Zhao, Y.; Zhang, M.H. Application of improved seasonal GM(1,1) model based on HP filter for runoff prediction in Xiangjiang River. Environ. Sci. Pollut. Res. 2022, 29, 52806–52817. [Google Scholar] [CrossRef]
  11. Pai, T.Y.; Wu, R.S.; Chen, C.H.; Lo, H.M.; Wan, T.J.; Liu, M.H.; Chen, W.C.; Lin, Y.P.; Hsu, C.T. Prediction of Groundwater Quality Using Seven Types of First-Order Univariate Grey Model in the Chishan Basin, Taiwan. Water Air Soil Pollut. 2022, 233, 481. [Google Scholar] [CrossRef]
  12. Mokhtar, A.; Elbeltagi, A.; Gyasi-Agyei, Y.; Al-Ansari, N.; Abdel-Fattah, M.K. Prediction of irrigation water quality indices based on machine learning and regression models. Appl. Water Sci. 2022, 12, 76. [Google Scholar] [CrossRef]
  13. Li, T.T.; Lu, J.; Wu, J.; Zhang, Z.H.; Chen, L.W. Predicting Aquaculture Water Quality Using Machine Learning Approaches. Water 2022, 14, 2836. [Google Scholar] [CrossRef]
  14. Liang, N.; Zou, Z.H.; Wei, Y.G. Regression models (SVR, EMD and FastICA) in forecasting water quality of the Haihe River of China. Desalination Water Treat. 2019, 154, 147–159. [Google Scholar] [CrossRef]
  15. Su, X.H.; He, X.L.; Zhang, G.; Chen, Y.H.; Li, K.Y. Research on SVR Water Quality Prediction Model Based on Improved Sparrow Search Algorithm. Comput. Intell. Neurosci. 2022, 2022, 7327072. [Google Scholar] [CrossRef] [PubMed]
  16. Bi, J.; Lin, Y.Z.; Dong, Q.X.; Yuan, H.T.; Zhou, M.C. Large-scale water quality prediction with integrated deep neural network. Inf. Sci. 2021, 571, 191–205. [Google Scholar] [CrossRef]
  17. Wang, Y.; Cheng, Y.H.; Liu, H.; Guo, Q.; Dai, C.J.; Zhao, M.; Liu, D.Z. A Review on Applications of Artificial Intelligence in Wastewater Treatment. Sustainability 2023, 15, 13557. [Google Scholar] [CrossRef]
  18. Zheng, H.; Liu, Y.Y.; Wan, W.H.; Zhao, J.S.; Xie, G.T. Large-scale prediction of stream water quality using an interpretable deep learning approach. J. Environ. Manag. 2023, 331, 117309. [Google Scholar] [CrossRef]
  19. Najah, A.; El-Shafie, A.; Karim, O.A.; El-Shafie, A.H. Application of artificial neural networks for water quality prediction. Neural Comput. Appl. 2013, 22, S187–S201. [Google Scholar] [CrossRef]
  20. Ding, S.F.; Li, H.; Su, C.Y.; Yu, J.Z.; Jin, F.X. Evolutionary artificial neural networks: A review. Artif. Intell. Rev. 2013, 39, 251–260. [Google Scholar] [CrossRef]
  21. Gautam, V.K.; Pande, C.B.; Moharir, K.N.; Varade, A.M.; Rane, N.L.; Egbueri, J.C.; Alshehri, F. Prediction of Sodium Hazard of Irrigation Purpose using Artificial Neural Network Modelling. Sustainability 2023, 15, 7593. [Google Scholar] [CrossRef]
  22. Wongburi, P.; Park, J.K. Prediction of Wastewater Treatment Plant Effluent Water Quality Using Recurrent Neural Network (RNN) Models. Water 2023, 15, 3325. [Google Scholar] [CrossRef]
  23. Nagesh Kumar, D.; Srinivasa Raju, K.; Sathish, T. River Flow Forecasting using Recurrent Neural Networks. Water Resour. Manag. 2004, 18, 143–161. [Google Scholar] [CrossRef]
  24. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  25. Liu, P.; Wang, J.; Sangaiah, A.K.; Xie, Y.; Yin, X.C. Analysis and Prediction of Water Quality Using LSTM Deep Neural Networks in IoT Environment. Sustainability 2019, 11, 2058. [Google Scholar] [CrossRef]
  26. Li, G.B.; Cui, Q.Z.; Wei, S.N.; Wang, X.F.; Xu, L.X.; He, L.X.; Kwong, T.C.H.; Tang, Y.Y. Long short-term memory network-based wastewater quality prediction model with sparrow search algorithm. Int. J. Wavelets Multiresolution Inf. Process. 2023, 21, 2350019. [Google Scholar] [CrossRef]
  27. Yang, X.Y.; Li, S.Y. Prediction of COVID-19 Using a WOA-BILSTM Model. Bioengineering 2023, 10, 883. [Google Scholar] [CrossRef] [PubMed]
  28. Cai, H.; Zhang, C.; Xu, J.L.; Wang, F.; Xiao, L.H.; Huang, S.X.; Zhang, Y.F. Water Quality Prediction Based on the KF-LSTM Encoder-Decoder Network: A Case Study with Missing Data Collection. Water 2023, 15, 2542. [Google Scholar] [CrossRef]
  29. Huang, Y.; Wang, K.; Zhou, Q.; Fang, J.; Zhou, Z. Feature extraction for gas metal arc welding based on EMD and time–frequency entropy. Int. J. Adv. Manuf. Technol. 2017, 92, 1439–1448. [Google Scholar] [CrossRef]
  30. Zhang, Y.T.; Li, C.L.; Jiang, Y.Q.; Sun, L.; Zhao, R.B.; Yan, K.F.; Wang, W.H. Accurate prediction of water quality in urban drainage network with integrated EMD-LSTM model. J. Clean. Prod. 2022, 354, 131724. [Google Scholar] [CrossRef]
  31. Pietrolaj, M.; Blok, M. Neural network training with limited precision and asymmetric exponent. J. Big Data 2022, 9, 63. [Google Scholar] [CrossRef]
  32. Yang, Y.F.; Ren, X.M.; Qin, W.Y.; Wu, Y.F.; Zhi, X.Z. Prediction of chaotic time series based on EMD method. Acta Phys. Sin. 2008, 57, 6139–6144. [Google Scholar] [CrossRef]
  33. Wang, D.H.; Zhao, G.S.; Chen, H.N.; Liu, Z.X.; Deng, L.; Li, G.Q. Nonlinear tensor train format for deep neural network compression. Neural Netw. 2021, 144, 320–333. [Google Scholar] [CrossRef]
  34. Liu, X.C.; Liu, B.L. A Hybrid Time Series Model for Predicting the Displacement of High Slope in the Loess Plateau Region. Sustainability 2023, 15, 5423. [Google Scholar] [CrossRef]
  35. Liu, D.; Zeng, H.T.; Xiao, Z.H.; Peng, L.H.; Malik, O.P. Fault diagnosis of rotor using EMD thresholding-based de-noising combined with probabilistic neural network. J. Vibroengineering 2017, 19, 5920–5931. [Google Scholar] [CrossRef]
  36. Li, Y.Y.; Shao, M.H.; Sun, L.J.; Wang, X.M.; Song, S.Z. Research on Demand Price Elasticity Based on Expressway ETC Data: A Case Study of Shanghai, China. Sustainability 2023, 15, 4379. [Google Scholar] [CrossRef]
  37. Peng, J.H.; Xie, W.; Wu, Y.; Sun, X.R.; Zhang, C.L.; Gu, H.; Zhu, M.Y.; Zheng, S. Prediction for the Sluice Deformation Based on SOA-LSTM-Weighted Markov Model. Water 2023, 15, 3724. [Google Scholar] [CrossRef]
  38. Song, C.G.; Yao, L.H. A hybrid model for water quality parameter prediction based on CEEMDAN-IALO-LSTM ensemble learning. Environ. Earth Sci. 2022, 81, 262. [Google Scholar] [CrossRef]
  39. Zhang, Q.L.; Zhu, Y.W.; Ma, R.; Du, C.X.; Du, S.L.; Shao, K.; Li, Q.B. Prediction Method of TBM Tunneling Parameters Based on PSO-Bi-LSTM Model. Front. Earth Sci. 2022, 10, 854807. [Google Scholar] [CrossRef]
  40. Chang, W.; Chen, X.; He, Z.; Zhou, S. A Prediction Hybrid Framework for Air Quality Integrated with W-BiLSTM(PSO)-GRU and XGBoost Methods. Sustainability 2023, 15, 16064. [Google Scholar] [CrossRef]
Figure 1. Research plan chart.
Figure 2. Geographic location of the study area.
Figure 3. Ammonia nitrogen concentration monitoring data.
Figure 4. EMD decomposition results of the ammonia nitrogen concentration monitoring data: (a) IMF1-IMF4 sequences; (b) IMF5-IMF7 and residual sequences.
Figure 5. Schematic structure of RNN.
Figure 6. LSTM cell structure.
Figure 7. Flowchart of WOA.
Figure 8. Plot of convergence factor before and after improvement.
Figure 9. EMD-LSTM model prediction flow.
Figure 10. Plot of WOA and IWOA adaptation curves.
Figure 11. LSTM and IWOA-LSTM prediction results.
Figure 12. IWOA-LSTM and EMD-IWOA-LSTM prediction results.
Table 1. LSTM network model parameters.

Parameter | Value
Input layer | 15
Output layer | 1
Fully connected layer | 1
Learning rate | 0.001
Iterations | 300
Number of hidden layers | 1
Number of hidden layer units | 30
Regularization factor | 0.001
Gradient descent algorithm | Adam
Activation function | ReLU
Table 2. Optimized parameter intervals.

Parameter | Interval
Number of hidden layer units | [20, 100]
Initial learning rate | [0.001, 0.1]
Iterations | [100, 300]
Table 3. Prediction errors of LSTM and IWOA-LSTM models.

Model | RMSE | MAPE
LSTM | 0.00085 | 0.037%
IWOA-LSTM | 0.00037 | 0.017%
Table 4. Optimized model structure.

Sequence | Number of hidden layer units | Initial learning rate | Iterations
IMF1 | 43 | 0.06627 | 191
IMF2 | 95 | 0.00563 | 279
IMF3 | 26 | 0.04197 | 165
IMF4 | 47 | 0.00622 | 139
IMF5 | 13 | 0.03990 | 202
IMF6 | 72 | 0.02310 | 127
IMF7 | 100 | 0.00100 | 300
RES | 68 | 0.00840 | 253
Table 5. Prediction errors of IWOA-LSTM and EMD-IWOA-LSTM models.

Model | RMSE | MAPE
IWOA-LSTM | 0.00037 | 0.017%
EMD-IWOA-LSTM | 0.00023 | 0.010%

