Probabilistic Shear Strength Prediction for Deep Beams Based on Bayesian-Optimized Data-Driven Approach

Liu, Mao-Yi; Li, Zheng; Zhang, Hang

doi:10.3390/buildings13102471


by Mao-Yi Liu ¹, Zheng Li ²,* and Hang Zhang ²

¹ School of Civil Engineering, Chongqing University, Chongqing 400044, China

² Chongqing Urban Construction Investment (Group) Co., Ltd., Chongqing 400707, China

* Author to whom correspondence should be addressed.

Buildings 2023, 13(10), 2471; https://0-doi-org.brum.beds.ac.uk/10.3390/buildings13102471

Submission received: 11 September 2023 / Revised: 25 September 2023 / Accepted: 26 September 2023 / Published: 28 September 2023

(This article belongs to the Topic AI Enhanced Civil Infrastructure Safety)


Abstract:

To ensure the safety of buildings, accurate and robust prediction of a reinforced concrete deep beam’s shear capacity is necessary to avoid unpredictable accidents caused by brittle failure. However, the failure mechanism of reinforced concrete deep beams is very complicated, has not been fully elucidated, and cannot be accurately described by simple equations. To solve this issue, machine learning techniques have been utilized and corresponding prediction models have been developed. Nevertheless, these models can only provide deterministic prediction results of the scalar type, and the confidence level is uncertain. Thus, these prediction results cannot be used for the design and assessment of deep beams. Therefore, in this paper, a probabilistic prediction approach of the shear strength of reinforced concrete deep beams is proposed based on the natural gradient boosting algorithm trained on a collected database. A database of 267 deep beam experiments was utilized, with 14 key parameters identified as the inputs related to the beam geometry, material properties, and reinforcement details. The proposed NGBoost model was compared to empirical formulas from design codes and other machine learning methods. The results showed that the NGBoost model achieved higher accuracy in mean shear strength prediction, with an R² of 0.9045 and an RMSE of 38.8 kN, outperforming existing formulas by over 50%. Additionally, the NGBoost model provided probabilistic predictions of shear strength as probability density functions, enabling reliable confidence intervals. This demonstrated the capability of the data-driven NGBoost approach for robust shear strength evaluation of RC deep beams. Overall, the results illustrated that the proposed probabilistic prediction approach dramatically surpassed the current formulas adopted in design codes and machine learning models in both prediction accuracy and robustness.

Keywords:

shear capacity; deep beam; probabilistic prediction; machine learning; NGBoost

1. Introduction

Currently, reinforced concrete (RC) buildings are the prevalent choice for constructing civil infrastructure all over the world. They will suffer from various load scenarios throughout their service life [1]. To maintain their safety, the accurate prediction of the various RC components’ capacities is significant [2]. Among various RC components, the prediction of the shear strength of an RC deep beam is one of the most-challenging tasks due to its sophisticated failure mechanism [3,4]. To predict an RC deep beam’s shear capacity, in the past few decades, plenty of experimental studies have been carried out to figure out its failure process [5,6,7]. It has been found that the classical plane section assumption is no longer satisfied in deep beams, which makes their failure mechanism very complicated.

To resolve this issue, several theoretical models have been proposed, for instance the well-developed strut-and-tie model. Foster proposed the use of strut-and-tie models to describe the mechanisms of the observed non-flexural behavior and presented the rationale behind the empirically determined minimum reinforcement requirements, as well as the minimum mesh reinforcement requirements [8]. Tuchscherer et al. compiled a database of 868 deep beam shear tests from the literature and fabricated and tested 37 additional deep beam specimens. Through comprehensive analysis of the database, deficiencies in the existing specifications were discovered. Therefore, the authors proposed improvements to these methods, resulting in an improved strut-and-tie modeling procedure [9]. An experimental verification of reinforced concrete deep beams designed by the strut-and-tie method was performed by Starvcev et al. [10].

However, the shear capacity of a deep beam is controlled by various complicated behaviors, such as the flexure–shear coupling effect, reinforcement bonding behavior, and the size effect [11,12,13], whereas every theoretical model relies on unavoidable simplifications and assumptions. Consequently, a single theoretical model cannot consider all the effects of these behaviors, and the corresponding prediction accuracy is not ideal.

Currently, machine learning algorithms are becoming more and more prevalent in civil engineering [14,15,16,17]. They have been used to achieve various tasks, such as component and structure behavior prediction, model updating, and traffic load identification [18,19,20]. Chen et al. developed data-driven models using ensemble learning algorithms (GBDT and RF) to accurately predict the bond strength between carbon-fiber-reinforced polymer (CFRP) and steel in CFRP-strengthened steel structures, with the GBDT achieving the highest accuracy (R² = 0.98) and potential for structural design and evaluation [21]. Benbouras et al. designed a new machine learning model, in particular a DNN model, to predict the pile-bearing capacity more effectively and efficiently than traditional methods, providing a user-friendly interface called “BeaCa2021” for researchers and civil engineers [22]. Czarnecki et al. proposed a hybrid model combining non-destructive methods and neural networks to accurately estimate the subsurface tensile strength of cementitious composites containing waste granite powder, demonstrating its practical suitability for civil engineering applications with error values ranging from 10 to 12% [23].

In this scenario, machine learning algorithms have been utilized to tackle the prediction of the deep beam’s shear capacity owing to their excellent regression ability and adequate experimental RC deep beam data. Karim et al. utilized a machine learning (ML) algorithm to accurately estimate the shear strength of fiber-reinforced polymer (FRP) reinforced concrete (RC) beams and collected 302 shear test results from the literature to develop the most-effective prediction model [24]. Sanad and Saka [25] initially tried using a neural network for predicting the ultimate shear strength of RC deep beams. However, a neural network is usually treated as a black box model, which is uninterpretable by the users. Then, Ashour et al. [26] and Gandomi et al. [27] adopted a genetic programming technique to achieve an explicit predicting formula for the deep beam’s shear strength. However, the accuracy of a formula obtained from genetic programming is usually not comparable to the one obtained from the black box machine learning model.

Alshboul et al. proposed a closed-form model based on GEP to predict the shear strength of SFRC deep beams with an R² value of 78.9%. The analytical results showed that the GEP model accurately predicted the effects of the concrete strength, the flexural steel percentage, and the ratio of the shear span to the beam depth [28]. Esteghamati et al. proposed a framework based on training simulation-based seismic and environmental assessments of 720 mid-rise concrete office buildings. The framework aimed to develop generalizable surrogate models to predict the seismic vulnerability and environmental impacts of a class of buildings at specific sites [29]. Meanwhile, Esteghamati et al. compared knowledge-driven, data-driven, and physics-driven alternative models to assess the relative ability to estimate earthquake losses under complete and incomplete design information scenarios [30]. Feng et al. [31] implemented explainable ensemble learning algorithms such as the gradient boosting regression tree (GBRT) to achieve precise shear strength prediction of an RC deep beam. In addition, in the case that the relevant experimental data are not sufficient, Chen and Feng [32] proposed a method for utilizing multi-fidelity data to train a machine learning model for deep beam shear strength prediction.

Among all these studies, it can be found that the data-driven models generally surpass the formulas used in the design codes by a large margin [33,34]. In most cases, boosting ensemble learning algorithms such as the GBRT and extreme gradient boosting (XGBoost) are the most-promising choices. This trend also coincides with the results of well-known machine learning competitions such as those hosted on Kaggle [35]. However, these models can only provide a deterministic prediction of the shear strength. The aleatory or epistemic uncertainty within the training data, such as measurement and recording errors, is not considered in the models, which could generate large deviations [36]. Consequently, a scalar prediction without any confidence interval is of limited use, and can even be harmful, to structural assessment and design.

To address this shortcoming, Li et al. [4] proposed a probabilistic prediction model for a deep beam’s shear strength based on Markov chain Monte Carlo (MCMC) calibration. The obtained model could provide probabilistic prediction results in the form of a probability density function (PDF). Therefore, the aforementioned uncertainty could be effectively considered. Wu et al. proposed a probabilistic fatigue life prediction model to predict the fatigue life of reinforced concrete beams in a chlorinated environment and took into account statistical uncertainty using Bayesian inference to determine and update the model parameters [37]. Yu et al. proposed a probabilistic prediction model based on Gaussian process regression (GPR) with an anisotropic composite kernel. This method overcomes the limitations of traditional mechanics and data-driven approaches in rationally quantifying the complex uncertainties regarding the shear strength of reinforced concrete (RC) beam–column joints [38]. However, handling the convergence issues of MCMC demands considerable expertise from the users [39]. To attain convenient probabilistic predictions, a novel algorithm dubbed natural gradient boosting (NGBoost) was proposed and introduced into this field recently [40,41].

In this context, in this paper, NGBoost was used for the probabilistic prediction of the shear capacity of deep beams. For the convenience of the application, the Bayesian optimization of NGBoost’s hyper-parameters was adopted. The proposed model was thoroughly examined and compared with the existing prediction formulas and conventional machine learning models. In the following sections, the collected experimental database of the deep beams is presented first. Then, the methodology of the proposed probabilistic prediction approach is introduced. The implementation of the proposed method on the shear capacity prediction is presented next, along with a comparison of the results.

2. Collected Experimental Database

To establish an accurate prediction model of a deep beam’s ultimate capacity, a database with 267 deep beam experiment samples was collected and utilized [31]. Figure 1 illustrates a typical RC deep beam with the relevant experiment arrangement for measuring its shear capacity. Based on previous studies, the factors influencing a deep beam’s shear capacity include: (1) the geometric factors, such as the span of the beam L, the shear span a, the height h, the effective height h_{0}, and the width b of the beam section; (2) the material factors, such as the concrete strength f_{c}, the yield strength of the longitudinal reinforcement f_{y}, and the yield strengths of the horizontal and vertical web reinforcement f_{y h}, f_{y v}; (3) the reinforcement details, such as the longitudinal reinforcement ratio ρ_{l}, the horizontal and vertical web reinforcement ratios ρ_{w h}, ρ_{w v}, and the spacings of the horizontal and vertical web reinforcement S_{w h}, S_{w v}. Hence, these 14 factors were selected as the input features for the machine learning prediction models, while the output of the model was the beam’s shear capacity V_{U}.

The marginal distributions of the 14 input factors are displayed in Figure 2. As can be seen, the distribution of each factor covers a wide value range. In addition, the database contains 267 samples, which is more than 10 times the input feature dimension of 14. Therefore, this database is sufficient for establishing a data-driven prediction model via a machine learning algorithm.

3. Methodology

To achieve both accurate mean prediction and a robust probabilistic prediction of the deep beam’s shear capacity, a novel algorithm, NGBoost, was adopted here. The fundamentals of NGBoost are briefly introduced.

Fundamentals of NGBoost

Natural gradient boosting (NGBoost) is a gradient boosting method, which is a machine learning technique for supervised learning. NGBoost uses natural gradients to improve the model performance and convergence speed under the gradient-boosting framework.

The basic idea is to use the classification and regression tree (CART) as the base learner. Each time a new CART is trained, it reduces the error of the existing model:

M_{n} (x) = \sum_{i = 1}^{n} T_{i} (x) .

(1)

Specifically, assuming that the existing model is M_{n - 1} (x), the new CART model T_{n} (x) tries to approximate the negative gradient of the loss at the current model, that is:

T_{n} (x) \approx - \nabla loss (M_{n - 1} (x))

(2)

where loss represents the loss function.

Then, T_{n} (x) is added to the existing model to form a new model:

M_{n} (x) = M_{n - 1} (x) + T_{n} (x)

(3)
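The boosting loop of Equations (1)–(3) can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes a squared-error loss (whose negative gradient is simply the residual), a single input feature, depth-1 trees standing in for CART, a shrinkage factor `learning_rate` that does not appear in the equations above, and made-up toy data.

```python
# Sketch of Eqs. (1)-(3) for the squared-error loss, whose negative gradient
# is the residual y - M_{n-1}(x). Assumptions (not from the paper): a single
# input feature, depth-1 trees ("stumps") standing in for CART, a shrinkage
# factor `learning_rate`, and made-up toy data.

def fit_stump(xs, targets):
    """Fit a depth-1 regression tree: one split threshold, two leaf means."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: fall back to a constant leaf
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=50, learning_rate=0.5):
    """Build M_n(x) as a sum of trees, each fitted to the negative gradient."""
    trees = []

    def model(x):
        return sum(learning_rate * tree(x) for tree in trees)  # Eq. (1)

    for _ in range(n_trees):
        residuals = [y - model(x) for x, y in zip(xs, ys)]  # Eq. (2)
        trees.append(fit_stump(xs, residuals))              # Eq. (3)
    return model


def mean_squared_error(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data: shear span ratio versus shear capacity (illustrative numbers only).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [900.0, 620.0, 450.0, 330.0, 260.0, 210.0]

error_after_1_tree = mean_squared_error(boost(xs, ys, n_trees=1), xs, ys)
error_after_50_trees = mean_squared_error(boost(xs, ys, n_trees=50), xs, ys)
print(error_after_1_tree, error_after_50_trees)
```

Each added tree is fitted to what the current model still gets wrong, so the training error falls as trees accumulate.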

By continuously adding T_{n} (x) to approximate the negative gradient direction, the loss can be quickly reduced and the model improved. The innovation of NGBoost is that it uses “natural gradients” instead of numerical gradients. The reason is that numerical gradients are sensitive to changes in parameter scale, while natural gradients alleviate this problem. The natural gradient is calculated as:

G = F^{- 1} \nabla loss

(4)

where F represents the Fisher information matrix.

On this basis, NGBoost was proposed. Its major improvement is the substitution of the numerical gradient with the natural gradient as the training target of T_{i} (x), which is why it is named NGBoost. Compared with numerical gradients, natural gradients do not depend on how the parameter space is parameterized and are therefore more suitable as optimization targets for model improvement. This allows NGBoost to make probabilistic predictions, not just point estimates. Specifically, NGBoost uses the log-likelihood loss function, and the prediction target is the probability distribution P (y ∣ x, θ). Each new CART T_{n} (x) approximates the natural gradient of the current model distribution relative to the true distribution, so that the model as a whole approaches the true distribution, thereby making probabilistic predictions.

To achieve probabilistic prediction, the prediction result of a model should be a probability distribution P (y ∣ x, θ). Therefore, instead of a single scalar, the prediction should contain at least two parameters in order to describe a distribution. As a result, the classic boosting framework cannot be applied directly because the traditional loss function and numerical gradient are not suitable. To solve this problem, the negative log-likelihood - \log P (y ∣ x, θ), a common index for quantifying the difference between the predicted and true distributions, was used to replace the traditional loss function. Furthermore, the natural gradient was used for training T_{i} (x) to achieve a multi-dimensional prediction [42].
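To make the role of the negative log-likelihood concrete, the sketch below evaluates it for a normal predictive distribution. The function name and the numbers are illustrative assumptions, not values from this study.

```python
import math

# Negative log-likelihood of an observation y under a predicted normal
# distribution N(mu, sigma^2). Illustrative sketch; the numbers are made up.


def gaussian_nll(y, mu, sigma):
    """Return -log p(y | mu, sigma) for a normal distribution."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)


observed = 320.0  # measured shear capacity (kN), hypothetical

# Two predictions with the same mean but different stated uncertainty.
nll_cautious = gaussian_nll(observed, mu=300.0, sigma=20.0)
nll_overconfident = gaussian_nll(observed, mu=300.0, sigma=5.0)

print(nll_cautious, nll_overconfident)
```

Both predictions have the same mean error, so a scalar loss such as the squared error cannot tell them apart. The negative log-likelihood penalizes the overconfident prediction far more heavily, which is what allows the standard deviation to be learned alongside the mean.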

With this improvement, NGBoost could achieve probabilistic prediction while maintaining the highly accurate performance of a boosting algorithm.

4. Applying NGBoost in Deep Beam Shear Capacity Prediction

For the probabilistic prediction of a deep beam’s shear capacity, the collected database in Section 2 was separated into training and testing datasets at a 7:3 ratio. Then, in training NGBoost, just like conventional machine learning models, its hyper-parameters need to be carefully tuned in order to achieve a satisfying performance. In this study, instead of the commonly used grid search or random search strategy, the Bayesian optimization algorithm was used, which is more efficient for high-dimensional parameter optimization [43]. The main idea behind Bayesian optimization is to use a Gaussian process to construct a surrogate model of the objective function, and this surrogate is then used to find the optimal point in an iterative manner. Bayesian optimization usually contains four steps, as shown in Figure 3.

1. Surrogate model fitting: In this step, the Gaussian process (GP) algorithm is used to fit a surrogate model. The Gaussian process is a non-parametric Bayesian method that captures the uncertainty of the objective function by providing a distribution for each input point. The observed data points are used to train this model to predict function values and uncertainties for unobserved points.

2. Acquisition evaluation: The acquisition function defines whether the current information should be explored or exploited during the next search. Common acquisition functions include the expected improvement (EI), probability of improvement (PI), and upper confidence bound (UCB). These functions help determine the next sampling point by balancing exploration and exploitation.

3. Searching: In this step, the next candidate point is found according to the acquisition function. This usually involves an optimization process aimed at finding input values that maximize the value of the acquisition function. This point may lie near a predicted minimum of the objective function, or it may lie in an area of greater uncertainty, thereby achieving a balance between exploration and exploitation.

4. Update: The objective function is evaluated at the newly found candidate point, and then, the surrogate model is updated with this newly observed data point. This update process typically involves refitting the Gaussian process model to more accurately capture the shape and uncertainty of the objective function.

More-specific details about Bayesian optimization can be found in [43].
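The acquisition step can be illustrated with the expected improvement (EI), one of the acquisition functions named above. The sketch assumes a minimization problem and a surrogate that returns a normal prediction (mean and standard deviation) at each candidate. The candidate names and numbers are hypothetical, and the text does not state which acquisition function was used in this study.

```python
from statistics import NormalDist

# Expected improvement for a minimization problem, given the surrogate's
# normal prediction at a candidate point. All candidate values below are
# hypothetical and only serve to illustrate the exploration/exploitation
# trade-off.

_STD_NORMAL = NormalDist()


def expected_improvement(mean, std, best_so_far):
    """EI of a candidate whose objective value is predicted as N(mean, std^2)."""
    if std <= 0.0:
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    return (best_so_far - mean) * _STD_NORMAL.cdf(z) + std * _STD_NORMAL.pdf(z)


best_so_far = 40.0  # lowest cross-validated RMSE observed so far (hypothetical)

# Surrogate predictions (mean, std) of the RMSE for three candidate settings.
candidates = {
    "learning_rate=0.05": (41.0, 0.5),  # slightly worse, well explored
    "learning_rate=0.10": (39.5, 1.0),  # slightly better, fairly certain
    "learning_rate=0.30": (42.0, 4.0),  # worse on average, very uncertain
}

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_point = max(scores, key=scores.get)
print(scores, next_point)
```

In this example, the most uncertain candidate receives the highest EI even though its predicted mean is the worst, which shows how the acquisition function trades exploitation of promising regions against exploration of poorly known ones.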

Cross-validation (CV) was also adopted in this paper. CV is a statistical method for evaluating the performance of machine learning models. Its main purpose is to ensure that the model has a good generalization ability, that is, the model not only performs well on the training data, but also performs well on the test data.

The basic idea of CV is to divide the original dataset into multiple subsets and, then, use one subset as the test set and the remaining subsets as the training set to train and evaluate the model. This process is repeated multiple times, and each subset is used as a test set for validation, ultimately resulting in multiple model performance evaluation results. The average of these results is usually calculated to evaluate the overall performance of the model. In order to balance the reliability of the evaluation against the computational cost, this article used five-fold CV.
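The fold construction described above can be sketched as follows. The helper names, the random seed, and the training-set size of 187 samples (roughly 70% of the 267 samples) are assumptions for illustration. `fit_and_score` is a hypothetical callback that would train a model on the training indices and return its score on the held-out indices.

```python
import random

# Five-fold cross-validation index generation. Sketch only: the helper
# names, the seed, and the sample count are illustrative assumptions.


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (training_indices, validation_indices) for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out, validation_idx in enumerate(folds):
        training_idx = [j for k, fold in enumerate(folds) if k != held_out for j in fold]
        yield training_idx, validation_idx


def cross_validate(n_samples, fit_and_score, n_folds=5):
    """Average the score returned by fit_and_score over all folds."""
    scores = [fit_and_score(train, val) for train, val in k_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)


n_train = 187  # assumption: about 70% of the 267 samples
fold_sizes = [len(val) for _, val in k_fold_indices(n_train)]
print(fold_sizes)
```

Every sample is used for validation exactly once and for training in the other four folds, so the averaged score uses all of the training data without evaluating a model on samples it was fitted to.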

After these training processes, the obtained NGBoost model was further examined on the testing dataset to illustrate its prediction ability for the deep beam’s shear capacity. The training process for NGBoost is summarized in Figure 4.

5. Shear Capacity Prediction Result Discussion

The optimization process of the hyper-parameters of NGBoost for shear capacity prediction is given in Figure 5. The optimized hyper-parameters of NGBoost include the base learner’s (CART) maximum depth, the minimum sample number for splitting, the minimum sample number for each leaf, the learning rate, and the CART number for the entire model. Figure 5a, together with the parameter search ranges given in Table 1, shows that Bayesian optimization can efficiently obtain the optimal hyper-parameter set in a high-dimensional parameter space. Figure 5b also provides the typical optimization contour of two key hyper-parameters: the CART number and the learning rate. The optimal hyper-parameters are listed in Table 1.

Then, the average prediction performance of NGBoost on the deep beam’s shear capacity was compared with the performance of representative machine learning and ensemble learning algorithms, including linear regression, support vector regression, a neural network, random forest, and XGBoost. These algorithms are mature algorithms that have been realized in the Python software package “Scikit-learn” v1.3.1 [44].

5.1. Comparison with Representative Machine Learning Algorithms

The performance of each algorithm on the training and testing dataset of the deep beam’s shear capacity experiment is provided in Figure 6, where the red line represents a perfect prediction. It should be noted that, besides NGBoost, all five compared machine learning models’ hyper-parameters were also carefully optimized by Bayesian optimization, just like the process provided for NGBoost.

From Figure 6, it can be seen that, except for linear regression, all other models achieved good performance in predicting the shear capacity, with only slight differences among them. Moreover, NGBoost achieved scalar prediction performance comparable to that of the excellent ensemble learning algorithm XGBoost. Furthermore, to better quantify the performance of different models, two indices, the coefficient of determination (R²) and the root-mean-squared error (RMSE), were adopted to measure the overall fitting accuracy and the absolute error of each model, respectively.
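For reference, the two indices can be computed as in the sketch below, which uses their standard definitions (the formulas are not written out in the text). The sample values are made up.

```python
import math

# Standard definitions of the coefficient of determination (R^2) and the
# root-mean-squared error (RMSE). The sample values are illustrative only.


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


measured = [100.0, 200.0, 300.0, 400.0]   # hypothetical test capacities (kN)
predicted = [110.0, 190.0, 310.0, 390.0]  # hypothetical model outputs (kN)

print(r2_score(measured, predicted), rmse(measured, predicted))
```

R² is dimensionless and measures the share of the variance explained by the model, while the RMSE keeps the unit of the shear capacity (kN here), which is why the two are reported together.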

From Table 2, we can observe the error performance of different machine learning models in predicting the shear load capacity. Among all the listed models—linear regression (LR), support vector regression (SVR), neural network (NN), random forest (RF), XGBoost, and NGBoost—NGBoost showed the best prediction performance. First, it can be noted that NGBoost reached 0.9045 in the coefficient of determination (R²), which was the highest among all models, indicating that the NGBoost model has high relevance and accuracy in predicting actual values. Secondly, the NGBoost model also performed well in terms of the root-mean-squared error (RMSE), reaching the lowest value of 38.7976 kN, which further confirmed the accuracy and reliability of the model in predicting shear bearing capacity.

In comparison, the LR and SVR models performed worse in terms of the R² and RMSE, especially the linear regression model, whose R² was only 0.8302 and whose RMSE was 51.7434 kN, which is relatively high, indicating that its prediction accuracy and reliability were low.

NN, RF, and XGBoost also showed good performance, but still failed to surpass NGBoost. For example, although the neural network reached 0.9003 on the R², its RMSE was still 39.6392 kN, which was higher than NGBoost. Similarly, XGBoost’s R² was 0.8976, and the RMSE was 40.1715 kN, which also failed to surpass NGBoost.

Taken together, the NGBoost model performed most prominently in terms of the error in shear bearing capacity prediction. Its higher coefficient of determination and lower root-mean-squared error proved that it was superior to the other models. The NGBoost model demonstrated higher prediction accuracy and reliability.

5.2. Comparison with Empirical Formulas

To further illustrate the superiority of the NGBoost in shear capacity prediction, several empirical formulas globally adopted in current design codes were used for comparison. In this section, the prediction formulas from the codes in China, the U.S., Canada, and Europe were selected [45,46,47,48], as listed here:

Chinese Guobiao (GB) code for the design of concrete structures:

V_{U, GB} = \frac{1.75 f_{t} b h_{0}}{a / h_{0} + 1} + \frac{l_{0} / h - 2}{3 s_{h}} f_{y v} A_{s v} h_{0} + \frac{5 - l_{0} / h}{6 s_{v}} f_{y h} A_{s h} h_{0}

(5)

American Concrete Institute (ACI) building code requirements for structural concrete:

V_{U, ACI} = [0.765 w_{t} \cos θ \sin θ + 0.425 (l_{E} + l_{P}) \sin^{2} θ] β_{s} f_{c} b

(6)

This formula is derived from the strut-and-tie model, in which w_{t} = 2 (h - h_{0}) is the bottom width of the strut; l_{E}, l_{P} are the widths of the loading and supporting areas; θ = \arctan (d / l_{a}) \geq 25^{\circ} is the angle of the strut; and β_{s} is the strut coefficient.

Canadian Standards Association (CSA) design of concrete structures:

V_{U, CSA} = \frac{1.88 w_{t} \cos θ \sin θ + [(l_{E}) + (l_{P})] \sin θ}{1.6 + 340 ε_{1}} f_{c} b w_{s}

(7)

Similarly, this formula is also derived from the strut-and-tie model, where ε_{1} = ε_{s} + (ε_{s} + 0.002) \cot^{2} θ and ε_{s} is the ultimate tensile strain of the tie.

Eurocode (EU) design of concrete structures:

V_{U, EU} = 0.85 β_{s} f_{c} b w_{s} \sin θ

(8)

where w_{s} is the strut width and equals [1.85 w_{t} \cos θ + (l_{E} + l_{P}) \sin θ] / 2.
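As an illustration of how such a code formula is evaluated, the sketch below transcribes Equation (8) and the accompanying strut width expression literally. The function names, the units (MPa and mm, giving a force in N), and all input values, including the strut coefficient of 0.75, are assumptions for illustration only. No code-specified limits or safety factors are applied.

```python
import math

# Literal transcription of Eq. (8) and the strut width expression given
# with it. Function names, units, and all input values are hypothetical.


def strut_width(w_t, l_e, l_p, theta):
    """w_s = [1.85 w_t cos(theta) + (l_E + l_P) sin(theta)] / 2, theta in radians."""
    return (1.85 * w_t * math.cos(theta) + (l_e + l_p) * math.sin(theta)) / 2.0


def shear_capacity_eu(beta_s, f_c, b, w_t, l_e, l_p, theta):
    """V_U,EU = 0.85 beta_s f_c b w_s sin(theta); MPa and mm inputs give N."""
    return 0.85 * beta_s * f_c * b * strut_width(w_t, l_e, l_p, theta) * math.sin(theta)


# Hypothetical deep beam: 30 MPa concrete, 200 mm wide section, 100 mm tie
# width, 100 mm loading and support plates, strut inclined at 45 degrees.
capacity_n = shear_capacity_eu(
    beta_s=0.75, f_c=30.0, b=200.0, w_t=100.0, l_e=100.0, l_p=100.0,
    theta=math.radians(45.0),
)
print(capacity_n / 1000.0, "kN")
```

Because the formula needs only a handful of geometric and material inputs, it is easy to apply by hand, but it cannot account for the remaining input features used by the data-driven models.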

Subsequently, the prediction results of these four formulas on the identical testing set of deep RC beams were calculated and are presented in Figure 7. For better comparison, the mean prediction result of NGBoost is also provided here. Firstly, it can be seen that, because the formulas are intended to maintain the reliability level of the designed structure, all of them tended to underestimate the shear capacity of a deep beam, as most prediction points fall below the red line in Figure 7. Among the four formulas, though the CSA code’s formula had the best overall regression accuracy, its performance was still evidently worse than that of NGBoost based on the distribution of the scatters given in Figure 7.

Similarly, to better quantify the accuracy of each formula, their R² and RMSE scores are listed in Table 3. It is clear that the NGBoost model’s prediction accuracy on the deep beam’s shear capacity dramatically surpassed all the representative prediction formulas derived from relevant mechanics. The improvement ratio of NGBoost over the best formula, from CSA, reached 66% for the R² and 53% for the RMSE. These results strongly support the effectiveness and superiority of using NGBoost for the scalar prediction of a deep beam’s shear capacity.

5.3. Probabilistic Prediction Results

Besides its excellent scalar prediction ability for the deep beam’s shear capacity, the probabilistic prediction ability is the other advantage of NGBoost. As mentioned in Section 3, the NGBoost algorithm can predict the probability density distribution of the deep beam’s shear capacity. In this study, a normal distribution type was selected, and thus, NGBoost outputs a mean value μ and a standard deviation σ of the deep beam’s shear capacity. On this basis, a confidence interval can be provided for the users to make decisions. As given in Figure 8, apart from the mean prediction of the shear capacity, a confidence interval of μ \pm 2 σ, which covers approximately 95% of a normal distribution, was provided for the entire database. Although the mean prediction accuracy of NGBoost was already high, a prediction accompanied by a confidence interval is more robust and more helpful for the subsequent reliability-based structural design and assessment.
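Under the normal distribution assumed here, the coverage of an interval of the form μ ± kσ, and the half-width required for a chosen confidence level, can be checked with the Python standard library. The function names and the example prediction (μ = 300 kN, σ = 20 kN) are hypothetical.

```python
from statistics import NormalDist

# Relation between an interval mu +/- k*sigma and its confidence level for
# a normal predictive distribution. The example prediction is hypothetical.


def coverage_of_k_sigma(k):
    """Probability that a normal variable falls within mu +/- k*sigma."""
    standard_normal = NormalDist()
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def central_interval(mu, sigma, level):
    """Central interval of N(mu, sigma^2) containing the given probability."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mu - z * sigma, mu + z * sigma


two_sigma_coverage = coverage_of_k_sigma(2.0)
lower_98, upper_98 = central_interval(mu=300.0, sigma=20.0, level=0.98)

print(two_sigma_coverage)
print(lower_98, upper_98)
```

A band of two standard deviations corresponds to a coverage of roughly 95.4%, whereas a 98% central interval requires about 2.33 standard deviations on each side of the mean.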

To better illustrate the probabilistic ability of NGBoost for the deep beam’s shear capacity, four deep beam samples were selected, and the corresponding prediction results are given in Figure 9. Figure 9 shows the mean prediction together with the predicted probability density function, and the actual shear capacity is marked. These four samples demonstrate that NGBoost can adaptively consider the uncertainty within the given database and provide the corresponding probabilistic results on the deep beam’s shear capacity. In comparison with the classic machine learning algorithms, which can only attain scalar prediction, the probabilistic prediction result quantifies the uncertainty about the shear capacity. Consequently, the users could be more confident about the corresponding decisions based on pre-defined target reliability levels. For a higher reliability level, a wider confidence interval would be obtained. As a result, the actual shear capacity value would be more likely to be contained in the interval, and the following design and assessment result would be more reliable. With the help of NGBoost, data-driven prediction results could be used more readily in practical design and assessment.

6. Conclusions

The prediction of an RC deep beam’s shear capacity is still an important, but difficult task in civil engineering due to the complicated failure mechanism behind it. In order to solve this task and maintain the safety of the relevant structures, a data-driven prediction approach based on machine learning algorithms has been tried recently, but these models can only provide a scalar prediction result about the shear capacity. Because these machine learning models are usually uninterpretable, a scalar prediction is untrustworthy and cannot be used in practice. Under this circumstance, a probabilistic prediction approach of the shear strength of reinforced concrete deep beams was established based on the newly developed NGBoost algorithm. The performance of the proposed model was thoroughly compared with classic machine learning models and existing empirical formulas. In addition, its probabilistic prediction ability was illustrated. Based on the results, the following conclusions can be reached:

(1): Though the advantage of NGBoost lies in its probabilistic ability, the mean prediction accuracy of NGBoost on the deep beam’s shear capacity was still better than the accuracy of the commonly used machine learning algorithms, including advanced ensemble learning algorithms such as random forest and XGBoost.
(2): In comparison with the prediction formulas for the deep beam shear capacity used in the relevant codes, NGBoost’s prediction performance dramatically surpassed all the prediction formulas. These results revealed the effectiveness of using NGBoost in predicting the shear capacity of a deep beam.
(3): NGBoost can successfully provide a probabilistic prediction about the shear capacity of a deep beam via a probability density function. Based on this probability density function, the confidence level of the deep beam’s strength can be easily obtained by the users. Consequently, a more-robust design and assessment of the deep beam can be conducted. The issue of the prediction of a machine learning algorithm being untrustworthy can be mitigated. Overall, the proposed NGBoost-based probabilistic prediction approach could effectively provide an accurate and robust result about the deep beam’s shear capacity.

Although the model in this article achieved results that exceeded other research results, there are still some issues worthy of further study. To attain a probabilistic prediction via NGBoost, the type of probabilistic distribution should be pre-set, which requires some prior knowledge about the targeted issue, such as the shear capacity prediction in this study. If the type of distribution is not appropriate, the probabilistic prediction results of NGBoost may not be satisfactory. Therefore, in the future, NGBoost should be further modified to be able to adaptively construct the probabilistic distribution from the data, attaining more applicability.

Author Contributions

Conceptualization, Z.L.; data curation, M.-Y.L. and H.Z.; investigation, H.Z.; methodology, M.-Y.L. and Z.L.; resources, Z.L.; software, M.-Y.L.; writing—original draft, M.-Y.L.; writing—review and editing, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Chongqing Science Fund for Distinguished Young Scholars (Grant Number CSTB2022NSCQ-JQX0020), the Chongqing Talent Plan: Chongqing Technological Innovation and Application Development Project (Grant Number cstc2021ycjh-bgzxm0246), the Chongqing Technological Innovation and Application Development Project (Grant Number CSTB2022TIAD-KPX0144), and the Chongqing Construction Science and Technology Plan Project (Chengkezi 2023 No.1-1).

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

We confirm that we have no financial or personal affiliations with individuals or organizations that could have potentially influenced our work.


Figure 1. Illustration of a typical RC deep beam: (a) A typical deep beam. (b) Key impact factors for a deep beam.

Figure 2. Marginal distributions of the input factors in the deep beam shear database: (a) L. (b) a. (c) h. (d) h_{0}. (e) b. (f) f_{c}. (g) f_{y}. (h) f_{yh}. (i) f_{yv}. (j) ρ_{l}. (k) ρ_{wh}. (l) ρ_{wv}. (m) S_{wh}. (n) S_{wv}.

Figure 3. Bayesian optimization process.

Figure 4. Implementation workflow for NGBoost.

Figure 5. Optimization process of NGBoost’s hyper-parameters: (a) Optimization history. (b) Optimization contour of two basic hyper-parameters.

Figure 6. Scalar prediction performance of models on deep beam shear database: (a) Linear regression. (b) Support vector regression. (c) Neural network. (d) Random forest. (e) XGBoost. (f) NGBoost.

Figure 7. Scalar prediction performance of empirical formulas on testing database of deep beam shear experiment: (a) GB. (b) ACI. (c) CSA. (d) EU. (e) NGBoost.

Figure 8. Probabilistic prediction result of NGBoost on deep beam’s shear capacity.

Figure 9. Probabilistic prediction of shear capacity on four deep beam samples: (a) Sample 1. (b) Sample 2. (c) Sample 3. (d) Sample 4.

Table 1. Optimal hyper-parameters for NGBoost.

Hyper-Parameter	Searching Range	Optimal Value
Maximum depth for CART	[1, 50]	3
Minimum sample number for splitting	[2, 10]	5
Minimum sample number for each leaf	[1, 5]	3
Learning rate	[0.001, 0.3]	0.062
CART number	[10, 1000]	417
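The optimal values in Table 1 could be passed to the open-source `ngboost` and `scikit-learn` packages roughly as sketched below. The mapping from the table's row names to keyword arguments (`max_depth`, `min_samples_split`, `min_samples_leaf`, `learning_rate`, `n_estimators`) and the choice of a Normal output distribution are assumptions made for this illustration; the source does not list the exact calls used. The helper names `values_within_ranges` and `build_model` are hypothetical.

```python
# Optimal values and searching ranges copied from Table 1.
# The keyword names are an assumed mapping onto scikit-learn / ngboost arguments.
BASE_TREE_PARAMS = {"max_depth": 3, "min_samples_split": 5, "min_samples_leaf": 3}
BOOSTING_PARAMS = {"learning_rate": 0.062, "n_estimators": 417}

SEARCH_RANGES = {
    "max_depth": (1, 50),
    "min_samples_split": (2, 10),
    "min_samples_leaf": (1, 5),
    "learning_rate": (0.001, 0.3),
    "n_estimators": (10, 1000),
}


def values_within_ranges():
    """Check that every optimal value lies inside its searching range."""
    chosen = {**BASE_TREE_PARAMS, **BOOSTING_PARAMS}
    return all(lo <= chosen[name] <= hi for name, (lo, hi) in SEARCH_RANGES.items())


def build_model():
    """Build an NGBoost regressor with CART base learners (needs ngboost and scikit-learn)."""
    from ngboost import NGBRegressor
    from ngboost.distns import Normal  # assumed distribution type
    from sklearn.tree import DecisionTreeRegressor

    base_learner = DecisionTreeRegressor(**BASE_TREE_PARAMS)
    return NGBRegressor(Dist=Normal, Base=base_learner, **BOOSTING_PARAMS)


print("Table 1 values inside searching ranges:", values_within_ranges())
```

After fitting with `model.fit(X_train, y_train)`, `model.pred_dist(X_test)` returns the predicted distribution for each test sample, from which the mean prediction and the spread can be read.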

Table 2. Error performance of different machine learning models on shear capacity prediction.

Index	LR	SVR	NN	RF	XGBoost	NGBoost
R²	0.8302	0.8651	0.9003	0.8746	0.8976	0.9045
RMSE (kN)	51.7434	46.1173	39.6392	44.4582	40.1715	38.7976
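For readers who want to reproduce the two indices reported in Table 2, the following sketch computes R² and RMSE under their standard definitions, which are assumed here to match those used in the study. The four measured and predicted values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same unit as the inputs (kN here)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted shear capacities in kN (illustrative only).
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(f"RMSE = {rmse(measured, predicted):.2f} kN")
print(f"R^2  = {r_squared(measured, predicted):.4f}")
```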

Table 3. Error performance of different prediction formulas on shear capacity prediction.

Index	GB	ACI	CSA	EU	NGBoost
R²	0.1681	0.4063	0.5457	0.3617	0.9045
RMSE (kN)	114.5231	96.7460	84.6275	100.3124	38.7976
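The size of the improvement over the code formulas can be checked directly from Table 3. The short calculation below takes the RMSE values from the table and computes the relative RMSE reduction achieved by NGBoost against each formula; only the variable and function names are introduced here.

```python
# RMSE values in kN, copied from Table 3.
RMSE_FORMULAS = {"GB": 114.5231, "ACI": 96.7460, "CSA": 84.6275, "EU": 100.3124}
RMSE_NGBOOST = 38.7976


def relative_reduction(reference, improved):
    """Fractional reduction of `improved` relative to `reference`."""
    return 1.0 - improved / reference


reductions = {
    name: relative_reduction(value, RMSE_NGBOOST)
    for name, value in RMSE_FORMULAS.items()
}

for name, reduction in reductions.items():
    print(f"{name}: RMSE reduced by {100.0 * reduction:.1f}%")
```

Even against CSA, the best-performing formula in Table 3, the RMSE reduction is above one half.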

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, M.-Y.; Li, Z.; Zhang, H. Probabilistic Shear Strength Prediction for Deep Beams Based on Bayesian-Optimized Data-Driven Approach. Buildings 2023, 13, 2471. https://0-doi-org.brum.beds.ac.uk/10.3390/buildings13102471

