
Construction Project Cost Prediction Method Based on Improved BiLSTM

School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
* Author to whom correspondence should be addressed.
Submission received: 27 November 2023 / Revised: 21 January 2024 / Accepted: 22 January 2024 / Published: 23 January 2024
(This article belongs to the Special Issue Machine/Deep Learning: Applications, Technologies and Algorithms)

Abstract

In construction project management, accurate cost forecasting is critical for informed decision making. This article proposes a construction cost prediction method based on an improved bidirectional long short-term memory (BiLSTM) network to address the high interactivity among construction cost data and the difficulty of feature extraction. First, the correlation between the cost-influencing factors and the unilateral cost is calculated via grey relational analysis to select the characteristic indicators. Second, a BiLSTM network is used to capture the temporal interactions in the cost data at a deep level, and a hybrid attention mechanism is incorporated to enhance the model's feature extraction capability and comprehensively capture the interactions among the features of the cost data. Finally, a hyperparameter optimisation method based on an improved particle swarm optimisation algorithm is proposed, using the prediction accuracy as the fitness function of the algorithm. The MAE, RMSE, MPE, MAPE, and coefficient of determination of the simulated prediction results of the proposed method on the dataset are 7.487, 8.936, 0.236%, 0.393%, and 0.996, respectively; the positive MPE avoids the serious consequences of underestimating the cost. Compared with the unimproved BiLSTM, the MAE and RMSE are reduced by 15.271 and 18.193, and the MAPE by 0.784 percentage points, which reflects the superiority and effectiveness of the method and can provide technical support for project cost estimation in the construction field.

1. Introduction

The scale and complexity of construction work continue to rise amid the global construction boom. Engineering cost is one of the most critical factors affecting the success of engineering projects. In the bidding stage of a construction project, accurate cost prediction is the basis for assessing the feasibility of the project and selecting design options; it directly affects how reasonable the bidding price is and the probability of winning the bid, which in turn affect the economic feasibility and overall quality of the project. Consequently, an accurate method of predicting construction cost is indispensable.
In previous studies, linear models such as the autoregressive integrated moving average (ARIMA) model [1] were utilised owing to their simple structure and robustness to data size and noise levels. However, the ARIMA model is unsuitable for capturing the nonlinearities of engineering cost time series [2], which negatively affects prediction accuracy. To solve this problem, the support vector machine (SVM) [3,4,5], the backpropagation (BP) neural network [6,7,8], and other machine learning models have been applied to cost prediction. Although such methods can handle nonlinear problems effectively, the SVM has limitations in processing correlated data and suffers from slow processing speeds, and the BP neural network readily loses time series information and falls into local minima. With the continuous development of deep learning, long short-term memory (LSTM) networks have achieved good results in engineering cost prediction. Dong et al. [9] compared an LSTM network with an SVM in construction cost prediction experiments. The results revealed that the LSTM model handles high-dimensional feature vectors and the selective recording of historical information better than an SVM, while also having advantages in prediction accuracy and parameter adjustment. Cao et al. [10] used LSTM to predict the construction cost of highway projects; the results showed that LSTM provides more accurate predictions than the traditional model in short-, medium-, and long-term scenarios. However, LSTM still falls short in dealing with the strong interactions between the features of cost data. A bidirectional LSTM (BiLSTM) network [11,12,13,14] is a variant of the LSTM network with an additional layer providing an inverse structure that can mine more information from the data. Siami-Namini et al. [15] showed that BiLSTM achieves higher prediction accuracy than LSTM on time series problems and suggested using BiLSTM instead of LSTM for issues related to time series analysis. A single model often fails to achieve optimal performance, and combined models can deliver better performance and more accurate predictions by combining the strengths of multiple underlying models [16]. Zhang et al. [17] incorporated an attention mechanism (AM) into a BiLSTM model for feature weight adjustment, which further improved the prediction effect of the model. Vaziri et al. [18] used a particle swarm optimisation (PSO) algorithm to optimise the hyperparameters of a BiLSTM model and reduce its prediction error.
Implicit temporal patterns exist in construction cost data owing to factors such as material prices, labour, and variable project durations. In addition, the data contain nonlinear relationships and multi-level dependencies that give rise to complex interactions between different cost characteristic indicators. However, existing studies have yet to consider these effects fully, so the resulting models are deficient in generalisation ability and prediction accuracy. To address these problems, this paper introduces the BiLSTM network into the field of engineering cost prediction. Grey relational analysis (GRA) is performed to calculate the correlation between indicators and cost for feature indicator selection. The inter-temporal and inter-feature interactions in the cost data are captured by fusing a hybrid attention mechanism (HAM) with the BiLSTM network in the prediction component of the model. To address the shortcomings of the traditional particle swarm optimisation algorithm, an improved particle swarm optimisation algorithm is proposed to find the optimal hyperparameter configuration of the model, avoiding the tedium and uncertainty of manual parameter tuning. Experiments were conducted on 156 completed residential building projects in Jiangsu Province from 2019 to 2023, and the results show that the method has significant advantages in mean absolute error, root-mean-square error, and mean absolute percentage error.

2. Methodology for the Selection of Characteristic Indicators

Considering the specificity and diversity of construction projects, the factors affecting project cost are complex and varied. However, not all factors carry the same weight and importance for project cost [19]. Therefore, when selecting indicators, the principle of moderation should be applied to screen out indicators that effectively describe project characteristics or have a significant impact on project cost.
GRA is a data analysis method based on grey system theory that studies the interrelatedness of multiple indicators [20]. The data associated with each indicator are aggregated to determine the relative degree of each indicator, and the grey correlation between indicators is calculated to determine each indicator's degree of influence on the problem. Compared with traditional correlation analysis methods, GRA more accurately reflects the correlations between indicators and their degrees of influence. Additionally, GRA offers a simple model, small data requirements, and interpretable results, making it suitable for selecting construction cost indicators. The steps of the grey correlation calculation are as follows.
  • Raw data standardisation: the values of indicators in the raw data are mapped to intervals that ensure the same scale and range of variation among different variables.
  • Absolute difference matrix construction: we take the absolute value of the difference between the target indicator and each candidate indicator to form an absolute difference matrix:
    $$\Delta_i(k) = \left| x_0(k) - x_i(k) \right|$$
    where $x_0$ and $x_i$ denote the standardised target and candidate indicators, respectively, $k$ denotes the $k$th sample, and $i$ denotes the $i$th candidate indicator.
  • Correlation matrix construction: each element of the absolute difference matrix is combined with the minimum and maximum values of that matrix to obtain the correlation coefficients:
    $$\zeta_i(k) = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_i(k) + \rho\,\Delta_{\max}}$$
    where $\Delta_{\min}$ and $\Delta_{\max}$ denote the minimum and maximum values of the matrix, respectively, and $\rho$ denotes the resolution factor, set to 0.5 in this study.
  • The grey correlation is calculated as:
    $$r_{0,i} = \frac{1}{n}\sum_{k=1}^{n}\zeta_i(k)$$
In this study, GRA was used to select characteristic indicators. By analysing the frequency of influencing factors and the difficulty of obtaining them in previous literature [21,22,23,24,25,26], we selected 18 indicators that have significant impacts on construction cost as candidate indicators: foundation type, structural type, fortification intensity, management level, façade material, aboveground floor area, underground floor area, number of aboveground floors, number of underground floors, aboveground floor height, underground floor height, interior wall decoration, types of doors and windows, number of elevators, roof type, concrete prices, reinforcing steel prices, and duration. The correlations between the candidate indicators and the unilateral cost were calculated via GRA, and the characteristic indicators were selected according to the resulting correlation value.
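For concreteness, the sketch below implements the four GRA steps above in NumPy; the random data, indicator count, and the 0.9 selection threshold echo Section 6.1.1 but are placeholders, not the paper's dataset.

```python
# Sketch of grey relational analysis (GRA) for indicator selection.
import numpy as np

def grey_relational_grades(X, y, rho=0.5):
    """X: (m_indicators, n_samples) candidates; y: (n_samples,) target. One grade per indicator."""
    # Step 1: min-max standardisation of each series to [0, 1].
    def scale(a):
        return (a - a.min(axis=-1, keepdims=True)) / (a.max(axis=-1, keepdims=True) - a.min(axis=-1, keepdims=True))
    X, y = scale(X), scale(y)
    # Step 2: absolute difference matrix between target and candidates.
    delta = np.abs(y[None, :] - X)                                     # shape (m, n)
    # Step 3: correlation coefficients with resolution factor rho = 0.5.
    zeta = (delta.min() + rho * delta.max()) / (delta + rho * delta.max())
    # Step 4: grey relational grade = mean coefficient over samples.
    return zeta.mean(axis=1)

# Usage: keep indicators whose grade exceeds the 0.9 threshold of Section 6.1.1.
X = np.random.rand(18, 156)   # 18 candidate indicators, 156 samples (illustrative)
y = np.random.rand(156)
grades = grey_relational_grades(X, y)
selected = np.where(grades > 0.9)[0]
```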

3. Predictive Modelling Design

The structure of the prediction model is shown in Figure 1, which consists of four parts: an input layer, a BiLSTM layer, an attention layer, and an output layer.
In this design, the input layer pre-processes the cost data, transforming it into a shape the BiLSTM can process. The BiLSTM layer captures the inter-temporal interactions in the cost data, and the attention layer captures the inter-feature interactions. The output layer contains a dropout layer and a dense layer. During training, the dropout layer randomly switches off neurons with probability p, setting their outputs to zero and thereby reducing the risk of overfitting; at test time, the weights are scaled by the retention probability to compensate for the random switch-off during training, which improves the generalisation and robustness of the model. In the dense layer, each neuron is connected to every neuron in the previous layer, each connection has a corresponding weight, and each feature of the input is passed through these connections to the dense layer, where it is weighted and summed in the neuron and a nonlinear mapping is applied to produce the neuron's output.

3.1. Capturing Inter-Temporal Interactions

An LSTM network is a variant structure of a recurrent neural network (RNN) that solves the problem of gradient vanishing that occurs in RNNs when handling long sequential data. The structure of an LSTM network is illustrated in Figure 2.
An LSTM unit contains three gate structures: the forget gate, the input gate, and the output gate. The forget gate determines which information $f_t$ to discard based on the hidden state $h_{t-1}$ from the previous moment and the input $x_t$ at the current moment. The input gate produces the information update value $i_t$ and the candidate cell state $\hat{c}_t$ from $h_{t-1}$ and $x_t$. Multiplying $f_t$ by the previous cell state $c_{t-1}$ and adding the cell update yields the new cell state $c_t$. The output gate computes $o_t$ from $h_{t-1}$ and $x_t$, and the current output $h_t$ is obtained from $c_t$ and $o_t$. The corresponding formulas are defined as follows:
$$f_t = \sigma\left(w_f h_{t-1} + w_f x_t + b_f\right)$$
$$i_t = \sigma\left(w_i h_{t-1} + w_i x_t + b_i\right)$$
$$\hat{c}_t = \tanh\left(w_c h_{t-1} + w_c x_t + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \hat{c}_t$$
$$o_t = \sigma\left(w_o h_{t-1} + w_o x_t + b_o\right)$$
$$h_t = o_t \odot \tanh\left(c_t\right)$$
where $w_f$, $w_i$, $w_c$, and $w_o$ represent the weight matrices of the different gates; $b_f$, $b_i$, $b_c$, and $b_o$ represent the corresponding bias vectors; and $\sigma$ and $\tanh$ represent the activation functions.
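A minimal NumPy sketch of a single LSTM step following the gate equations above is given below; note that, as in most implementations, separate weight matrices are kept for the hidden-state and input paths, and all dimensions are illustrative.

```python
# One LSTM step: forget/input/output gates plus cell-state update.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    f_t = sigmoid(p["w_fh"] @ h_prev + p["w_fx"] @ x_t + p["b_f"])    # forget gate
    i_t = sigmoid(p["w_ih"] @ h_prev + p["w_ix"] @ x_t + p["b_i"])    # input gate
    c_hat = np.tanh(p["w_ch"] @ h_prev + p["w_cx"] @ x_t + p["b_c"])  # candidate state
    c_t = f_t * c_prev + i_t * c_hat                                  # new cell state
    o_t = sigmoid(p["w_oh"] @ h_prev + p["w_ox"] @ x_t + p["b_o"])    # output gate
    h_t = o_t * np.tanh(c_t)                                          # hidden state
    return h_t, c_t

# Illustrative dimensions: 12 input features (Section 6.1.1), 32 hidden units.
rng = np.random.default_rng(0)
d_in, d_h = 12, 32
p = {f"w_{g}h": rng.normal(0, 0.1, (d_h, d_h)) for g in "fico"}
p.update({f"w_{g}x": rng.normal(0, 0.1, (d_h, d_in)) for g in "fico"})
p.update({f"b_{g}": np.zeros(d_h) for g in "fico"})
h, c = lstm_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), p)
```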
In contrast to traditional LSTM, a BiLSTM network consists of two LSTM layers for the forward and backward directions and can fully account for both historical and future information. The structure of BiLSTM is illustrated in Figure 3.
The BiLSTM layer captures inter-temporal interactions by performing both forward and backward passes. During the forward pass, the data are processed step by step, starting from the first time step of the series; for each time step, the BiLSTM computes and updates the hidden state $h_t^f$ of the current time step based on the current input $x_t$ and the hidden state $h_{t-1}^f$ from the previous time step. In the backward pass, by contrast, the BiLSTM starts from the last time step and processes the series in reverse; for each time step, it computes and updates the hidden state $h_t^b$ based on the current input $x_t$ and the hidden state $h_{t+1}^b$ from the subsequent time step. Because past and future information in the cost data carry different degrees of importance, a more comprehensive representation $o_t$ is obtained by combining $h_t^f$ and $h_t^b$ through adaptively assigned weights, capturing the inter-temporal interactions at a deep level, as follows:
$$h_t^f = \mathrm{LSTM}\left(x_t, h_{t-1}^f\right)$$
$$h_t^b = \mathrm{LSTM}\left(x_t, h_{t+1}^b\right)$$
$$o_t = w^f h_t^f + w^b h_t^b + b$$
where $w^f$ and $w^b$ represent the weight matrices of the forward and backward LSTM layers, respectively, and $b$ represents the bias vector.
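As an illustration, the Keras sketch below builds this bidirectional combination from two ordinary LSTM layers: one reads the sequence forwards, one backwards, and a dense layer realises the weighted fusion $w^f h_t^f + w^b h_t^b + b$. Layer sizes are assumptions, not the paper's tuned values.

```python
# Bidirectional combination of two LSTM passes with a learned linear fusion.
import tensorflow as tf

T, d_in, d_h = 10, 12, 32                       # time steps, features, units (illustrative)
inputs = tf.keras.Input(shape=(T, d_in))
fwd = tf.keras.layers.LSTM(d_h, return_sequences=True)(inputs)              # h_t^f
bwd = tf.keras.layers.LSTM(d_h, return_sequences=True, go_backwards=True)(inputs)
bwd = tf.keras.layers.Lambda(lambda s: tf.reverse(s, axis=[1]))(bwd)        # re-align h_t^b to input order
o = tf.keras.layers.Dense(d_h)(tf.keras.layers.Concatenate()([fwd, bwd]))   # w^f h^f + w^b h^b + b
model = tf.keras.Model(inputs, o)
```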

3.2. Capturing Inter-Feature Interactions

The attention mechanism is a powerful tool in deep learning, and its design is inspired by the information processing mechanism of the human brain [27,28]. The attention mechanism makes the deep learning model pay more attention to the critical parts of the input data while ignoring the unimportant parts. The core idea is to assign different weights dynamically to different inputs so that the model can effectively learn and utilise important information related to the task and adapt more flexibly to the complex relationships in the data, thus improving the performance and generalisation of the model.
Given the complex spatial-temporal interactions between features of cost data, this article proposes a hybrid attention mechanism (HAM) to capture the inter-feature interactions, aiming to focus on both the temporal and feature dimensions of the data and to improve the ability of the model to capture multivariate time-series information. HAM contains two modules: channel attention and spatial attention. The structure is shown in Figure 4.
Here, the channel attention module aggregates information along the feature dimension through average pooling and maximum pooling to obtain a deep representation of the feature dimension, which a multilayer perceptron then maps into attention weights over the features; the sigmoid activation function is applied to the summed outputs to obtain weights $M_C$ between zero and one. Unlike channel attention, the spatial attention module targets the temporal dimension: the pooled outputs are concatenated and fed into a convolutional layer to extract features with high attention weights in the temporal dimension, the convolution operation emphasises the significance of the different time steps, and the sigmoid activation function is applied to obtain weights $M_S$ between zero and one. Connecting the channel attention and spatial attention modules in parallel lets the model consider the features of both the channel and spatial dimensions while avoiding their mutual interference, enabling a better understanding of multidimensional time series data. This hybrid design helps the model capture key information in the data more comprehensively, which in turn improves its performance and generalisation, as calculated in the following formulas:
$$M_c(F) = \sigma\left(\mathrm{MLP}\left(\mathrm{AvePool}(F)\right) + \mathrm{MLP}\left(\mathrm{MaxPool}(F)\right)\right)$$
$$M_s(F) = \sigma\left(\mathrm{Conv}\left(\mathrm{Conn}\left(\mathrm{AvePool}(F), \mathrm{MaxPool}(F)\right)\right)\right)$$
$$F' = M_c(F) \otimes F \otimes M_s(F)$$
where $\sigma$ represents the sigmoid function, $\mathrm{MLP}$ is a multilayer perceptron, $\mathrm{AvePool}$ and $\mathrm{MaxPool}$ denote average pooling and maximum pooling, $\mathrm{Conv}$ denotes convolution, $\mathrm{Conn}$ denotes concatenation, and $M_c$ and $M_s$ represent the outputs of channel attention and spatial attention, respectively.
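A hedged TensorFlow sketch of HAM following these formulas is shown below, patterned on CBAM-style channel and spatial attention applied to (time, feature) tensors; the MLP hidden size and convolution kernel width are assumptions, as the paper does not report them.

```python
# Hybrid attention: channel weights over features, spatial weights over time steps.
import tensorflow as tf
from tensorflow.keras import layers

def ham(F, mlp_hidden=8, kernel=3):
    """F: (batch, T, C). Returns F' = M_c(F) * F * M_s(F)."""
    C = F.shape[-1]
    # Channel attention: pool over time, share one MLP, sum, sigmoid -> (batch, 1, C).
    mlp = tf.keras.Sequential([layers.Dense(mlp_hidden, activation="relu"), layers.Dense(C)])
    avg_c = tf.reduce_mean(F, axis=1, keepdims=True)
    max_c = tf.reduce_max(F, axis=1, keepdims=True)
    m_c = tf.sigmoid(mlp(avg_c) + mlp(max_c))
    # Spatial (temporal) attention: pool over features, concatenate, convolve -> (batch, T, 1).
    avg_s = tf.reduce_mean(F, axis=-1, keepdims=True)
    max_s = tf.reduce_max(F, axis=-1, keepdims=True)
    m_s = tf.sigmoid(layers.Conv1D(1, kernel, padding="same")(tf.concat([avg_s, max_s], axis=-1)))
    return m_c * F * m_s   # the two weight maps are applied in parallel

x = tf.random.normal((4, 10, 12))   # batch of 4, 10 time steps, 12 features
y = ham(x)
```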

3.3. Loss Function Selection

During the training of the prediction model, the Adam optimisation algorithm was selected to update the model parameters based on the losses. Adam is a gradient descent-based optimisation algorithm that combines the advantages of momentum gradient descent and root-mean-square propagation (RMSProp) [29]. It can make model training more effective, converge to optimal solutions faster, and iteratively update the weights and biases of a neural network based on the training data to optimise the value of the loss function. The loss function of the model was the MSE, calculated as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}_i\right)^2$$
where $n$ denotes the number of samples, $y_i$ denotes the actual value, and $\bar{y}_i$ denotes the model output value.
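In a Keras-style implementation, this training setup reduces to compiling the model with the Adam optimiser and an MSE loss, as in the minimal sketch below; the learning rate is an assumption, since the paper does not report one.

```python
# Minimal sketch: Adam optimiser with MSE loss, as described in Section 3.3.
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])   # stand-in for the full model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
```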

4. Hyperparametric Optimisation Methods

In the formulation of the cost prediction task, the accuracy of the HAM-BiLSTM model not only depends on the comprehensiveness of feature extraction, but is also affected by the combination of hyperparameters used by the model. In this study, the improved particle swarm optimisation (IPSO) algorithm was used to search for optimal hyperparameter combinations automatically to avoid the tedium and instability of manual parameter tuning and further improve model performance.

4.1. PSO Algorithm

The PSO algorithm [30,31,32] is an optimisation algorithm based on swarm intelligence, inspired by the collective behaviour of bird flocks and fish schools. In the PSO algorithm, individuals are called particles; each particle represents a potential solution, and the particles find the optimal solution by adjusting their velocity and position. The inertia weight is an important parameter for balancing the global and local search abilities of the algorithm. The trajectory of a particle depends on its individual and social experiences, which are controlled by learning factors. The velocity and position update formulas for a particle are defined as follows:
$$V_{i,t+1} = w V_{i,t} + c_1 r_1 \left(pbest_i - X_{i,t}\right) + c_2 r_2 \left(gbest_t - X_{i,t}\right)$$
$$X_{i,t+1} = X_{i,t} + \lambda V_{i,t+1}$$
where $V_{i,t}$ denotes the velocity of particle $i$ after $t$ iterations, $w$ denotes the inertia weight, $c_1$ and $c_2$ denote the individual and social learning factors, $r_1$ and $r_2$ denote random numbers in $[0, 1]$, $X_{i,t}$ denotes the position of particle $i$ after $t$ iterations, and $\lambda$ denotes the velocity coefficient.
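The sketch below implements this canonical update loop in NumPy on a toy sphere objective; the swarm size, bounds, and coefficient values are illustrative choices.

```python
# Canonical PSO: velocity and position updates with personal/global bests.
import numpy as np

rng = np.random.default_rng(1)
n, d, T = 30, 5, 100                     # particles, dimensions, iterations
w, c1, c2, lam = 0.7, 2.0, 2.0, 1.0      # inertia, learning factors, velocity coefficient
X = rng.uniform(-5, 5, (n, d))
V = np.zeros((n, d))
fit = lambda x: np.sum(x**2, axis=-1)    # objective to minimise (sphere)

pbest, pbest_f = X.copy(), fit(X)
gbest = pbest[pbest_f.argmin()].copy()
for t in range(T):
    r1, r2 = rng.random((n, d)), rng.random((n, d))
    V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)   # velocity update
    X = X + lam * V                                              # position update
    f = fit(X)
    better = f < pbest_f
    pbest[better], pbest_f[better] = X[better], f[better]        # personal bests
    gbest = pbest[pbest_f.argmin()].copy()                       # global best
```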

4.2. Phase Adjustment of Inertia Weights and Learning Factors

In traditional PSO algorithms, the inertia weight and learning factors are typically set to fixed values, which limits the global optimisation capability and convergence speed. Reference [33] dynamically adjusted the inertia weight and learning factors according to changes in fitness, reference [34] dynamically adjusted the inertia weight according to the number of iterations, and reference [35] dynamically adjusted the learning factor according to the number of iterations. These schemes all yielded significant improvements in algorithm performance. However, considering the different states of particles in different periods, it is difficult to maximise PSO performance based on a single factor. Therefore, we combine the number of iterations with the fitness level and propose adjusting the inertia weight and learning factors in stages according to the following formulas:
$$w = w_{\min} + \left(w_{\max} - w_{\min}\right)\left(\frac{f_{i,t} - f_{best}}{f_{i,t}}\right)^m$$
$$c_1 = c_{1,\min} + \left(c_{1,\max} - c_{1,\min}\right)\left(\frac{f_{i,t} - f_{best}}{f_{i,t}}\right)^m$$
$$c_2 = c_{1,\min} + c_{1,\max} - c_1$$
$$m = \frac{t^2}{T^2}$$
where $w_{\max}$ and $w_{\min}$ are the maximum and minimum values of the inertia weight $w$, respectively; $f_{i,t}$ and $f_{best}$ are the fitness of particle $i$ in iteration $t$ and the fitness of the globally optimal particle, respectively; $c_{1,\max}$ and $c_{1,\min}$ denote the maximum and minimum values of the individual learning factor $c_1$, respectively; $c_2$ denotes the social learning factor; and $t$ and $T$ denote the current iteration and the maximum number of iterations, respectively. Additionally, $w \in [0.4, 0.9]$ and $c_1 \in [1, 4]$.
Stage regulation divides the iterative process of the algorithm into three phases.
  • Exploration phase: at the beginning of the iteration, when the value of m is small, the inertia weights and learning factors are relatively less affected by adaptation, which reduces the magnitude of changes in the weight and learning factors, thereby ensuring that the particles have larger inertia weights and stronger individual learning abilities, promoting global exploration and escape from local optima.
  • Equilibrium phase: in the middle of the iteration, when the value of m is moderate, the inertia weights and learning factors become smoother through adaptation, which improves the social learning ability of the particles and helps them better utilise collective experiences for refined searching and adjustment.
  • Convergence phase: in later iterations, when the value of m is large, the inertia weights and learning factors are more sensitive to the influence of adaptation, which enhances the social cognitive ability of the particles, accelerates the convergence of the algorithm and helps guide particles to converge to the globally optimal solution more quickly.
As the value of $m$ varies in the interval $(0, 1]$, it constrains the particles while adjusting their behaviour, enabling them to strike a balance between exploration and convergence and thereby optimising the performance of the algorithm.
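A small sketch of this staged adjustment is given below; it shows how $m = t^2/T^2$ damps the fitness ratio early in the run and lets it dominate later. The sample fitness values are illustrative.

```python
# Staged adjustment of inertia weight and learning factors via m = t^2 / T^2.
def staged_params(f_i, f_best, t, T,
                  w_min=0.4, w_max=0.9, c1_min=1.0, c1_max=4.0):
    m = (t / T) ** 2                                  # stage exponent in (0, 1]
    ratio = (f_i - f_best) / f_i                      # fitness gap of particle i
    w = w_min + (w_max - w_min) * ratio ** m
    c1 = c1_min + (c1_max - c1_min) * ratio ** m
    c2 = c1_min + c1_max - c1                         # complementary social factor
    return w, c1, c2

# Early iterations (small m): ratio**m is near 1, so w stays large (exploration).
# Late iterations (m near 1): w and c1 closely track the fitness gap (convergence).
print(staged_params(f_i=2.0, f_best=1.0, t=10, T=100))
print(staged_params(f_i=2.0, f_best=1.0, t=90, T=100))
```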

4.3. Negative Selection Evolutionary Strategy

In the iterative process of the algorithm, when the fitness values of the particles gradually become close to one another, the algorithm may be converging or stagnating. At this point, the particles are classified into advantaged and disadvantaged particles according to the average fitness value. In traditional PSO, the disadvantaged particles simply converge towards the advantaged particles as the iteration progresses, which prevents the algorithm from finding a better solution in the solution space.
Inspired by biological evolution, each particle is considered an individual and each dimension of a particle a gene segment. While retaining the superior particles in the population, a gene crossover operation breaks the gene structures of inferior individuals to promote their evolution. First, the ratio $C$ of the global optimal fitness to the average fitness of the particles is calculated using the following formulas:
$$C = \frac{f_{best}}{f_{ave}}$$
$$f_{ave} = \frac{1}{n}\sum_{i=1}^{n} f_{i,t}$$
where $f_{best}$, $f_{ave}$, and $f_{i,t}$ are the global optimal fitness, the average fitness of the particles, and the fitness of particle $i$ after $t$ iterations, respectively, and $n$ is the number of particles.
Second, a critical point $K$ is defined by observing the magnitude of the change in $C$. When $C$ is greater than $K$, inferior particles are selected as bi-parental samples (e.g., particles A and B). Then, $x$ gene segments at the same positions are randomly selected from the two parents and interchanged, where $x \in [1, n-1]$, producing new particles A1 and B1. This maintains the diversity of the population and prompts the algorithm to continue exploring the solution space. The negative selection evolutionary strategy is illustrated in Figure 5.
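The sketch below gives one plausible NumPy reading of this strategy for a minimisation problem: when $C = f_{best}/f_{ave}$ exceeds the critical point $K$, pairs of inferior particles exchange randomly chosen gene segments. The value of $K$ and the pairing scheme are assumptions.

```python
# Negative selection evolutionary strategy: crossover among inferior particles.
import numpy as np

rng = np.random.default_rng(2)

def negative_selection(X, fitness, K=0.9):
    """X: (n, d) particle positions; fitness: (n,) values (lower is better)."""
    f_ave, f_best = fitness.mean(), fitness.min()
    if f_best / f_ave <= K:                   # swarm still diverse: do nothing
        return X
    inferior = np.where(fitness > f_ave)[0]   # disadvantaged particles
    rng.shuffle(inferior)
    d = X.shape[1]
    for a, b in zip(inferior[::2], inferior[1::2]):        # bi-parental pairs
        genes = rng.choice(d, size=rng.integers(1, d), replace=False)
        X[a, genes], X[b, genes] = X[b, genes].copy(), X[a, genes].copy()
    return X
```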

4.4. Hyperparameter Optimisation Process

The hyperparameter optimisation process is shown in Figure 6.
The specific steps are as follows:
  • The batch size, number of LSTM layer cells, dropout probability, number of dense layer cells, and number of iterations in the HAM-BiLSTM model are selected as the hyperparameters to be optimised, and their respective search ranges are set.
  • We defined the population size N , maximum number of iterations T , inertia weight w search range, learning factor c 1 search range, and particle dimension d of the IPSO algorithm and initialised the particle velocity V and particle position X .
  • The MSE of the true and predicted values was used as the fitness function.
  • The performance was evaluated by training the HAM-BiLSTM model, calculating the fitness of each particle, and recording the average fitness f a v e and global optimal fitness f b e s t of the current particle population.
  • We determined whether $f_{best}$ was less than the predefined threshold $g_{best}$ or whether the current iteration count $t$ had reached $T$. If so, we output the optimal solution and stopped; otherwise, we continued with the following steps.
  • The individual and population extremes were updated based on particle fitness.
  • We updated w , c 1 , and c 2 according to Equations (18)–(21).
  • We determined whether the ratio C of f b e s t to f a v e was greater than the critical value K according to Equations (22) and (23). If so, the negative selection evolution strategy was executed.
  • The V and X values of the current particle were updated according to Equations (7) and (8) before returning to step four.
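A skeleton of the first three steps is sketched below: each particle encodes the five hyperparameters, and its fitness is the validation MSE of a freshly trained model. The helper build_and_train_model and the data dictionary are hypothetical stand-ins for the paper's training pipeline, not code from the paper.

```python
# Skeleton: hyperparameter encoding, search bounds, and fitness for IPSO.
import numpy as np

BOUNDS = np.array([[1, 30],      # batch size
                   [1, 100],     # LSTM layer cells
                   [0.1, 0.5],   # dropout probability
                   [1, 100],     # dense layer cells
                   [200, 200]])  # epochs (fixed, per Table 5)

def build_and_train_model(batch, lstm_units, dropout, dense_units, epochs, data):
    """Hypothetical stand-in: train a HAM-BiLSTM and return the fitted model."""
    raise NotImplementedError  # plug in the real training pipeline here

def fitness(position, data):
    """Fitness of one particle = validation MSE of the model it encodes."""
    batch, lstm_u, drop, dense_u, epochs = position
    model = build_and_train_model(int(batch), int(lstm_u), float(drop),
                                  int(dense_u), int(epochs), data)
    return model.evaluate(data["X_val"], data["y_val"], verbose=0)

def clip(X):
    """Keep particle positions inside the Table 5 search ranges."""
    return np.clip(X, BOUNDS[:, 0], BOUNDS[:, 1])
```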

5. Forecasting

The prediction process is shown in Figure 7.
The model prediction steps are as follows.
  • Data pre-processing: the sample data were divided into training and testing sets and normalised.
  • Model training: the model was trained using the training set, and the IPSO algorithm was utilised to determine the optimal combination of hyperparameters for the model.
  • Model prediction: the trained model performed predictions on the testing set by employing an optimal combination of hyperparameters.
  • Output results: the predictions of the model were inversely normalised to the original data range and outputted.
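A condensed sketch of this workflow is shown below, assuming an 8:2 chronological split and scikit-learn's MinMaxScaler for the max-min normalisation (the paper does not name a library); the tuned HAM-BiLSTM model itself is elided.

```python
# Data split, normalisation, and inverse-normalised prediction (sketch).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, y = np.random.rand(156, 12), np.random.rand(156, 1)     # placeholder data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)

x_scaler, y_scaler = MinMaxScaler(), MinMaxScaler()         # max-min normalisation
X_tr_s, X_te_s = x_scaler.fit_transform(X_tr), x_scaler.transform(X_te)
y_tr_s = y_scaler.fit_transform(y_tr)

# model = ...  train the HAM-BiLSTM on (X_tr_s, y_tr_s) with IPSO-tuned hyperparameters
# y_pred = y_scaler.inverse_transform(model.predict(X_te_s))  # back to cost units
```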

6. Experimental Results and Analysis

6.1. Data Preprocessing

This study is based on residential cost data provided by the Guanglianda Indicator Network and material price data provided by the Guangcai Network. Sample balance must be considered during collection: different areas have different land costs and code standards, high-rise buildings require more complex and robust structural design and engineering than mid-rise buildings, and different forms of delivery represent different levels of finishing, all of which affect the balance of the sample. Therefore, we restricted the region to Jiangsu Province, the number of building floors to mid-rise residential buildings (18 floors and below), and the form of residential delivery to simple. We initially screened 167 records from 2019 to 2023 and, after excluding invalid samples with missing data, ultimately obtained 156 valid samples.

6.1.1. Relevance Analysis

The correlation values of the candidate indicators are listed in Table 1, and the candidate indicators with correlation values greater than 0.9 are designated as feature indicators. The foundation type ($x_1$), structure type ($x_2$), fortification intensity ($x_3$), management level ($x_4$), aboveground floor area ($x_5$), number of aboveground floors ($x_6$), number of underground floors ($x_7$), aboveground floor height ($x_8$), underground floor height ($x_9$), concrete prices ($x_{10}$), reinforcing steel prices ($x_{11}$), and duration ($x_{12}$) were selected as the model inputs, and the unilateral cost ($y$) was defined as the model output.

6.1.2. Data Quantification

For qualitative indicators to be used as inputs for the model, they must be quantified. These indicators include foundation type, structure type, fortification intensity, and management level. The detailed quantification process is defined in Table 2.

6.1.3. Data Normalisation

The sample data were divided at a ratio of 8:2 into training and test sets. The model inputs were the 12 feature indicators selected above, and the output was the unilateral cost. Given the differences in magnitude between feature indicators, and to eliminate their impact on model training, shorten the training time, and accelerate convergence, we adopted the max-min method to normalise all sample data, mapping the original data to the interval $[0, 1]$ so that all feature indicators share the same scale range. This process is formulated as follows:
$$y_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}, \quad i = 1, 2, \ldots, n$$
where $x_{\min}$ and $x_{\max}$ denote the minimum and maximum values of indicator $x$ in the data, respectively.
Some of the data obtained after normalising the sample data are presented in Table 3.

6.2. IPSO Algorithm Performance Test

To verify the superiority of the IPSO algorithm, the PSO and IPSO algorithms were executed on eight benchmark test functions. For both algorithms, the population size was set to 50, the number of iterations to 100, the dimensionality to 30, and the number of independent runs to 20. As shown in Table 4, the minima, maxima, averages, and standard deviations obtained by the IPSO algorithm were better than those obtained by the PSO algorithm and closer to the global optimum. Figure 8 compares the convergence curves of the PSO and IPSO algorithms: the IPSO algorithm performs better in terms of convergence accuracy and gathers the particles more stably in the vicinity of the global optimum, thereby improving the ability of the PSO algorithm to escape local extrema.

6.3. Hyperparametric Optimisation Results

The hyperparameters to be optimised in the model are the batch size $N_{batch}$, the number of LSTM layer cells $N_{LSTM}$, the dropout probability $N_{Dropout}$, the number of dense layer cells $N_{Dense}$, and the number of iterations $N_{epoch}$. The hyperparameter search ranges were set as shown in Table 5.
The results of IPSO and traditional PSO optimisation searches were compared. The number of iterations of the optimisation algorithm was set to 50, and the population was set to 30. The variation of fitness values during the iteration process of the PSO and IPSO algorithms is shown in Figure 9.
One can see that the IPSO algorithm has a better ability to find the optimum compared to the PSO algorithm. The optimisation results are summarised in Table 6.

6.4. Evaluation Indicators

In this study, five evaluation indices were used to evaluate the model: the mean absolute error (MAE), root-mean-square error (RMSE), mean percentage error (MPE), mean absolute percentage error (MAPE), and coefficient of determination (R²). The smaller the values of MAE, RMSE, and MAPE, the better the performance of the model; the larger the value of R², the better the model's fitting ability. In assessing the feasibility of a project, the consequences of underestimating costs are more serious than those of overestimating them, so the MPE was used to measure the directionality of the forecast error. The specific formulas are as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2}$$
$$\mathrm{MPE} = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\hat{y}_i - y_i}{y_i}$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values, and $n$ is the number of data points.
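The five indices follow directly from these definitions; the sketch below computes them in NumPy on illustrative values.

```python
# The five evaluation indices computed from their definitions above.
import numpy as np

def metrics(y_true, y_pred):
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err**2))
    mpe = 100 * np.mean((y_pred - y_true) / y_true)   # sign indicates over/under-estimation
    mape = 100 * np.mean(np.abs(err) / y_true)
    r2 = 1 - np.sum(err**2) / np.sum((y_true - y_true.mean())**2)
    return mae, rmse, mpe, mape, r2

y_true = np.array([100.0, 120.0, 90.0])               # illustrative values
y_pred = np.array([101.0, 118.0, 92.0])
print(metrics(y_true, y_pred))
```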

6.5. Comparative Experiments and Analyses

Figure 10 compares the predicted and real value curves of the proposed model on the test set for the unilateral cost of construction projects. The predicted and real value curves fit well together, which shows that the model has good prediction performance for the unilateral cost.
To verify the superiority of BiLSTM in capturing the interactions within the data, BiLSTM was compared with the traditional benchmark models BP, SVM, and LSTM. The prediction results and errors are shown in Figure 11. The prediction results demonstrate that BiLSTM fits the true values best, and the error curves show that the overall error variation of BiLSTM is relatively smooth.
Table 7 shows the results of the comparison of each single model on the five evaluation indicators. For the construction cost data, the average errors of LSTM and BiLSTM with time series capturing capability are relatively small, and the MAE, RMSE, and MAPE of BiLSTM are reduced by 2.536, 3.107, and 0.142%, respectively, compared with those of LSTM, which verifies the superiority of the BiLSTM bi-directional processing mechanism in capturing the inter-temporal interactions of the data.
To validate the effectiveness of the model components further, the model was compared with BiLSTM, AM-BiLSTM, HAM-BiLSTM, and PSO-HAM-BiLSTM. The prediction results and error comparisons are shown in Figure 12.
As can be seen from the figure, although the prediction results of each model follow trends similar to the true value curve, the error curves differ considerably. Comparing the error curves of HAM-BiLSTM and AM-BiLSTM reveals that the number of errors greater than 20 decreases from 13 to 8, reflecting that HAM improves the model's ability to capture inter-feature interactions relative to the traditional AM. The HAM-BiLSTM with hyperparameters optimised by IPSO shows a smoother error curve and a smaller range of error fluctuations than the HAM-BiLSTM optimised by the conventional PSO, reflecting that IPSO can further improve the accuracy of the model.
Table 8 shows the comparison results of the models on the five evaluation metrics. The mean percentage error of all five models is positive, avoiding the serious consequences of underestimating the cost, and all the other error metrics of HAM-BiLSTM with IPSO-optimised hyperparameters are significantly reduced. In particular, compared with BiLSTM and AM-BiLSTM, respectively, the MAE of HAM-BiLSTM is reduced by 6.711 and 4.252, the RMSE by 7.757 and 3.670, and the MAPE by 0.325 and 0.237 percentage points, further demonstrating the effectiveness of HAM in capturing the interactions between features and enhancing the feature extraction capability of the model. Compared with HAM-BiLSTM and PSO-HAM-BiLSTM, respectively, the MAE of the IPSO-optimised HAM-BiLSTM is reduced by 8.560 and 6.236, the RMSE by 10.436 and 7.658, and the MAPE by 0.463 and 0.324 percentage points, indicating that the improved particle swarm optimisation algorithm has stronger search capability, better overcomes the problem of falling into local optima, and thus searches the hyperparameter space more comprehensively to find better hyperparameter combinations.

7. Conclusions

In this study, a new construction cost prediction method was developed to address the strong interactions among construction cost data. Building on the BiLSTM network from the field of deep learning, the temporal interactions in the data are captured at a deep level, and the HAM module is added to the network to enhance its ability to capture the interactions among features, yielding an effective construction project cost prediction model. In addition, to select the feature indicators with the greatest impact on cost as model inputs, GRA was used to assess the importance of the cost-influencing factors, and the most valuable feature indicators were screened according to the principle of moderation. To avoid the tedium and uncertainty of manual parameter tuning, an improved particle swarm optimisation algorithm was designed to find the optimal hyperparameter combination for the model automatically. Comparative experiments and analyses verified the performance of the method in construction cost prediction.
Even though the results obtained with the improved model are very positive compared with the unimproved BiLSTM network, its limitations must be recognised. First, because HAM contains two attention modules, its powerful feature extraction capability comes at the cost of additional computational resources. Second, IPSO is devoted to improving the algorithm's ability to find the globally optimal solution; it does not reduce the number of iterations or the running time relative to the original algorithm.
In future research, we will work to overcome these limitations, focusing on improved methods to simplify the model structure for feature extraction and on hyperparameter optimisation methods that balance accuracy and speed. In addition, the construction cost prediction problem requires not only an accurate prediction model; the selection of characteristic indicators is equally important. We will analyse the factors influencing cost in greater depth to contribute more value to the development of this field.

Author Contributions

Conceptualisation, C.W.; methodology, J.Q.; validation, J.Q.; investigation, J.Q.; data curation, J.Q.; writing—original draft preparation, J.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62072363) and the Natural Science Foundation of Shaanxi Province (No. 2019JM-167).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are available at (www.gldzb.com, accessed on 18 January 2024) and (www.gldjc.com, accessed on 18 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Choi, C.Y.; Ryu, K.R.; Shahandashti, M. Predicting City-Level Construction Cost Index Using Linear Forecasting Models. J. Constr. Eng. Manag. 2021, 147, 04020158. [Google Scholar] [CrossRef]
  2. Kim, S.; Choi, C.Y.; Shahandashti, M.; Ryu, K.R. Improving Accuracy in Predicting City-Level Construction Cost Indices by Combining Linear Arima and Nonlinear ANNs. J. Manag. Eng. 2022, 38, 04021093. [Google Scholar] [CrossRef]
  3. Petruseva, S.; Zileska-Pancovska, V.; Žujo, V.; Brkan-Vejzović, A. Construction Costs Forecasting: Comparison of the Accuracy of Linear Regression and Support Vector Machine Models. Tech. Gaz. 2017, 24, 1431–1438. [Google Scholar]
  4. Ali, Z.H.; Burhan, A.M.; Kassim, M.; Al-Khafaji, Z. Developing an Integrative Data Intelligence Model for Construction Cost Estimation. Complexity 2022, 2022, 4285328. [Google Scholar] [CrossRef]
  5. Li, L. Dynamic Cost Estimation of Reconstruction Project Based on Particle Swarm Optimization Algorithm. Informatica 2023, 47, 173–182. [Google Scholar] [CrossRef]
  6. Wang, X. Forecasting Construction Project Cost Based on BP Neural Network. In Proceedings of the 10th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), Changsha, China, 10–11 February 2018; IEEE Publications: New York, NY, USA, 2018; Volume 2018, pp. 420–423. [Google Scholar] [CrossRef]
  7. Ye, D. An Algorithm for Construction Project Cost Forecast Based on Particle Swarm Optimization-guided BP Neural Network. Sci. Program. 2021, 2021, 4309495. [Google Scholar] [CrossRef]
  8. Wang, B.; Dai, J. Discussion on the Prediction of Engineering Cost Based on Improved BP Neural Network Algorithm. J. Intell. Fuzzy Syst. 2019, 37, 6091–6098. [Google Scholar] [CrossRef]
  9. Dong, J.; Chen, Y.; Guan, G. Cost Index Predictions for Construction Engineering Based on LSTM Neural Networks. Adv. Civ. Eng. 2020, 2020, 6518147. [Google Scholar] [CrossRef]
  10. Cao, Y.; Ashuri, B. Predicting the Volatility of Highway Construction Cost Index Using Long Short-Term Memory. J. Manag. Eng. 2020, 36, 04020020. [Google Scholar] [CrossRef]
  11. Joseph, L.P.; Deo, R.C.; Prasad, R.; Salcedo-Sanz, S.; Raj, N.; Soar, J. Near Real-Time Wind Speed Forecast Model with Bidirectional LSTM Networks. Renew. Energy 2023, 204, 39–58. [Google Scholar] [CrossRef]
  12. Li, X.; Pan, Y.; Zhang, L.; Chen, J. Dynamic and Explainable Deep Learning-Based Risk Prediction on Adjacent Buildings Induced by Deep Excavation. Tunn. Undergr. Space Technol. 2023, 140, 105243. [Google Scholar] [CrossRef]
  13. Atef, S.; Nakata, K.; Eltawil, A.B. A Deep Bi-directional Long-Short Term Memory Neural Network-Based Methodology to Enhance Short-Term Electricity Load Forecasting for Residential Applications. Comput. Ind. Eng. 2022, 170, 108364. [Google Scholar] [CrossRef]
  14. Niu, D.; Sun, L.; Yu, M.; Wang, K. Point and Interval Forecasting of Ultra-short-Term Wind Power Based on a Data-Driven Method and Hybrid Deep Learning Model. Energy 2022, 254, 124384. [Google Scholar] [CrossRef]
  15. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The Performance of LSTM and BiLSTM in Forecasting Time Series. In Proceedings of the IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; IEEE Publications: New York, NY, USA, 2019; Volume 2019, pp. 3285–3292. [Google Scholar] [CrossRef]
  16. Ribeiro, M.H.D.M.; dos Santos Coelho, L. Ensemble Approach Based on Bagging, Boosting and Stacking for Short-Term Prediction in Agribusiness Time Series. Appl. Soft Comput. 2020, 86, 105837. [Google Scholar] [CrossRef]
  17. Zhang, Q.; Wang, R.; Qi, Y.; Wen, F. A Watershed Water Quality Prediction Model Based on Attention Mechanism and Bi-LSTM. Environ. Sci. Pollut. Res. Int. 2022, 29, 75664–75680. [Google Scholar] [CrossRef]
  18. Vaziri, J.; Farid, D.; Nazemi Ardakani, M.; Hosseini Bamakan, S.M.; Shahlaei, M. A Time-Varying Stock Portfolio Selection Model Based on Optimised PSO-BiLSTM and Multi-objective Mathematical Programming Under Budget Constraints. Neural Comput. Appl. 2023, 35, 18445–18470. [Google Scholar] [CrossRef]
  19. Hatamleh, M.T.; Hiyassat, M.; Sweis, G.J.; Sweis, R.J. Factors Affecting the Accuracy of Cost Estimate: Case of Jordan. Eng. Constr. Archit. Manag. 2018, 25, 113–131. [Google Scholar] [CrossRef]
  20. Gai, R.; Guo, Z. A Water Quality Assessment Method Based on an Improved Grey Relational Analysis and Particle Swarm Optimisation Multi-classification Support Vector Machine. Front. Plant Sci. 2023, 14, 1099668. [Google Scholar] [CrossRef] [PubMed]
  21. Ahn, J.; Ji, S.H.; Ahn, S.J.; Park, M.; Lee, H.; Kwon, N.; Lee, E.; Kim, Y. Performance Evaluation of Normalization-Based CBR Models for Improving Construction Cost Estimation. Autom. Constr. 2020, 119, 103329. [Google Scholar] [CrossRef]
  22. Dursun, O.; Stoy, C. Conceptual Estimation of Construction Costs Using the Multistep Ahead Approach. J. Constr. Eng. Manag. 2016, 142, 04016038. [Google Scholar] [CrossRef]
  23. Xiao, X.; Skitmore, M.; Yao, W.; Ali, Y. Improving Robustness of Case-Based Reasoning for Early-Stage Construction Cost Estimation. Autom. Constr. 2023, 151, 104777. [Google Scholar] [CrossRef]
  24. Ahn, J.; Park, M.; Lee, H.S.; Ahn, S.J.; Ji, S.; Song, K.; Son, B. Covariance Effect Analysis of Similarity Measurement Methods for Early Construction Cost Estimation Using Case-Based Reasoning. Autom. Constr. 2017, 81, 254–266. [Google Scholar] [CrossRef]
  25. Ji, S.H.; Ahn, J.; Lee, H.S.; Han, K. Cost Estimation Model Using Modified Parameters for Construction Projects. Adv. Civ. Eng. 2019, 2019, 8290935. [Google Scholar] [CrossRef]
  26. Hu, W.; Chang, Y.; He, X. Influencing Factors and Prediction Model of Construction Project Duration. Civ. Eng. 2018, 51, 103–112. [Google Scholar]
  27. Brauwers, G.; Frasincar, F. A General Survey on Attention Mechanisms in Deep Learning. IEEE Trans. Knowl. Data Eng. 2021, 35, 3279–3298. [Google Scholar] [CrossRef]
  28. Niu, Z.; Zhong, G.; Yu, H. A Review on the Attention Mechanism of Deep Learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
  29. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  30. Jain, M.; Saihjpal, V.; Singh, N.; Singh, S.B. An Overview of Variants and Advancements of PSO Algorithm. Appl. Sci. 2022, 12, 8392. [Google Scholar] [CrossRef]
  31. Shami, T.M.; El-Saleh, A.A.; Alswaitti, M.; Al-Tashi, Q.; Summakieh, M.A.; Mirjalili, S. Particle Swarm Optimization: A Comprehensive Survey. IEEE Access 2022, 10, 10031–10061. [Google Scholar] [CrossRef]
  32. Bonyadi, M.R. A Theoretical Guideline for Designing an Effective Adaptive Particle Swarm. IEEE Trans. Evol. Comput. 2019, 24, 57–68. [Google Scholar] [CrossRef]
  33. Zhao, G.; Jiang, D.; Liu, X.; Tong, X.; Sun, Y.; Tao, B.; Kong, J.; Yun, J.; Liu, Y.; Fang, Z. A Tandem Robotic Arm Inverse Kinematic Solution Based on an Improved Particle Swarm Algorithm. Front. Bioeng. Biotechnol. 2022, 10, 832829. [Google Scholar] [CrossRef]
  34. Wang, H.; Peng, M.J.; Wesley Hines, J.W.; Zheng, G.Y.; Liu, Y.K.; Upadhyaya, B.R. A Hybrid Fault Diagnosis Methodology with Support Vector Machine and Improved Particle Swarm Optimization for Nuclear Power Plants. ISA Trans. 2019, 95, 358–371. [Google Scholar] [CrossRef]
  35. Yu, H. Evaluation of Cloud Computing Resource Scheduling Based on Improved Optimization Algorithm. Complex Intell. Syst. 2021, 7, 1817–1822. [Google Scholar] [CrossRef]
Figure 1. Forecasting model structure.
Figure 2. LSTM network structure.
Figure 3. BiLSTM network structure.
Figure 4. Structure of the HAM.
Figure 5. Negative selection evolutionary strategy.
Figure 6. Hyperparameter optimisation process.
Figure 7. Model prediction process.
Figure 8. Comparison of convergence curves for eight test functions.
Figure 9. Hyperparametric optimisation convergence curves.
Figure 10. Comparison of real and predicted values.
Figure 11. Prediction results and errors for single-model comparisons.
Figure 12. Prediction results and errors for combined-model comparisons.
Table 1. Candidate indicator correlation values.

| Indicator Name | Relatedness | Indicator Name | Relatedness |
|---|---|---|---|
| Foundation type | 0.9852 | Aboveground floor height | 0.9732 |
| Structure type | 0.9897 | Underground floor height | 0.9436 |
| Fortification intensity | 0.9603 | Interior wall decoration | 0.8763 |
| Management level | 0.9718 | Types of doors and windows | 0.8525 |
| Facade material | 0.8956 | Number of elevators | 0.8643 |
| Aboveground floor area | 0.9496 | Roof type | 0.8726 |
| Underground floor area | 0.8831 | Concrete prices | 0.9558 |
| Number of aboveground floors | 0.9764 | Reinforcing steel prices | 0.9623 |
| Number of underground floors | 0.9668 | Duration | 0.9311 |
Table 2. Quantitative table of qualitative indicators.

| Foundation Type | Value | Structure Type | Value | Fortification Intensity | Value | Management Level | Value |
|---|---|---|---|---|---|---|---|
| Raft slab foundation | 1 | Shear wall structure | 1 | 6 degrees | 1 | Excellent | 1 |
| Pile foundation | 2 | Framework structure | 2 | 7 degrees | 2 | Good | 2 |
| Mantenna foundation | 3 | Frame shear construction | 3 | 8 degrees | 3 | Bad | 3 |
| | | | | 9 degrees | 4 | | |
Table 3. Normalised data.

| Number | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.00 | 0.00 | 1.00 | 1.00 | 0.54 | 0.91 | 1.00 | 0.00 | 0.78 | 0.34 | 0.47 | 0.98 | 0.83 |
| 2 | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.86 | 1.00 | 0.00 | 0.69 | 0.34 | 0.47 | 0.37 | 1.00 |
| 3 | 1.00 | 0.50 | 0.50 | 1.00 | 0.20 | 0.96 | 0.67 | 0.00 | 0.89 | 0.36 | 0.47 | 0.22 | 0.63 |
| 4 | 0.00 | 0.00 | 1.00 | 0.50 | 0.18 | 0.91 | 0.67 | 1.00 | 0.78 | 0.36 | 0.47 | 0.61 | 0.18 |
| … | … | … | … | … | … | … | … | … | … | … | … | … | … |
| 155 | 1.00 | 0.50 | 0.50 | 0.50 | 0.02 | 1.00 | 0.18 | 0.33 | 0.52 | 0.94 | 0.36 | 0.27 | 0.23 |
| 156 | 1.00 | 1.00 | 0.50 | 0.00 | 0.06 | 1.00 | 0.27 | 0.33 | 0.71 | 0.04 | 0.71 | 0.86 | 0.59 |
Table 4. Comparison of PSO and IPSO algorithm test results.

| Function | Algorithm | Min | Max | Average | STD |
|---|---|---|---|---|---|
| Sphere | PSO | 2.23 × 10⁻¹ | 1.12 | 5.53 × 10⁻¹ | 2.24 × 10⁻¹ |
| | IPSO | 3.62 × 10⁻³⁰ | 3.32 × 10⁻²⁶ | 1.46 × 10⁻²⁶ | 1.70 × 10⁻²⁷ |
| Schwefel | PSO | 1.22 | 4.05 | 1.97 | 6.30 × 10⁻¹ |
| | IPSO | 1.28 × 10⁻¹⁵ | 3.61 × 10⁻¹³ | 3.64 × 10⁻¹⁴ | 9.27 × 10⁻¹³ |
| Sum squares | PSO | 5.67 | 2.20 × 10¹ | 1.16 × 10¹ | 5.06 |
| | IPSO | 6.48 × 10⁻³⁰ | 5.78 × 10⁻²⁶ | 2.89 × 10⁻²⁶ | 3.97 × 10⁻²⁷ |
| Rosenbrock | PSO | 6.18 × 10¹ | 2.68 × 10² | 1.26 × 10² | 5.16 × 10¹ |
| | IPSO | 2.87 × 10¹ | 2.90 × 10¹ | 2.89 × 10¹ | 3.76 × 10⁻² |
| Rastrigin | PSO | 4.73 × 10¹ | 1.01 × 10² | 7.36 × 10¹ | 1.45 × 10¹ |
| | IPSO | 1.82 × 10⁻¹ | 1.35 × 10¹ | 6.46 | 3.79 |
| Ackley | PSO | 7.91 × 10⁻¹ | 3.18 | 2.28 | 4.75 × 10⁻¹ |
| | IPSO | 8.88 × 10⁻¹⁶ | 8.88 × 10⁻¹⁶ | 8.88 × 10⁻¹⁶ | 0 |
| Griewank | PSO | 2.43 × 10⁻² | 1.04 × 10⁻¹ | 4.86 × 10⁻² | 1.68 × 10⁻² |
| | IPSO | 2.84 × 10⁻⁴ | 7.60 × 10⁻³ | 8.78 × 10⁻⁴ | 1.60 × 10⁻³ |
| Penalised | PSO | 1.71 × 10⁻² | 2.45 × 10⁻¹ | 9.51 × 10⁻² | 6.30 × 10⁻² |
| | IPSO | 2.79 × 10⁻⁵ | 2.62 × 10⁻⁴ | 1.53 × 10⁻⁴ | 5.61 × 10⁻⁵ |
Table 5. Hyperparameter search ranges.

| Hyperparameter | Search Range |
|---|---|
| $N_{batch}$ | 1–30 |
| $N_{LSTM}$ | 1–100 |
| $N_{Dropout}$ | 0.1–0.5 |
| $N_{Dense}$ | 1–100 |
| $N_{epoch}$ | 200 |
Table 6. Optimisation results.

| Algorithm | $N_{batch}$ | $N_{LSTM}$ | $N_{Dropout}$ | $N_{Dense}$ | $N_{epoch}$ |
|---|---|---|---|---|---|
| PSO | 5 | 26 | 0.2 | 22 | 200 |
| IPSO | 3 | 32 | 0.1 | 18 | 200 |
Table 7. Comparison of single-model performance evaluation.

| Model | MAE | RMSE | MPE/% | MAPE/% | R² |
|---|---|---|---|---|---|
| BP | 33.980 | 39.174 | −0.124 | 1.787 | 0.921 |
| SVM | 29.254 | 36.068 | −0.154 | 1.517 | 0.933 |
| LSTM | 25.294 | 30.236 | 0.472 | 1.329 | 0.953 |
| BiLSTM | 22.758 | 27.129 | 0.054 | 1.187 | 0.962 |
Table 8. Comparison of performance evaluation of combined models.

| Model | MAE | RMSE | MPE/% | MAPE/% | R² |
|---|---|---|---|---|---|
| BiLSTM | 22.758 | 27.129 | 0.054 | 1.181 | 0.962 |
| AM-BiLSTM | 20.299 | 23.042 | 0.838 | 1.093 | 0.973 |
| HAM-BiLSTM | 16.047 | 19.372 | 0.647 | 0.856 | 0.981 |
| PSO-HAM-BiLSTM | 13.723 | 16.594 | 0.161 | 0.717 | 0.986 |
| IPSO-HAM-BiLSTM | 7.487 | 8.936 | 0.236 | 0.393 | 0.996 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
