1. Introduction
Uncertainty quantification is a key resource for the design of complex electronic devices, since it allows the effects of possible uncertain design parameters (e.g., the component tolerances) on the system's performance to be quantified statistically [1].
Monte Carlo (MC) simulation can be seen as the most straightforward way to carry out the above statistical analysis. The underlying idea is to estimate the probability density function (pdf) of the outputs of interest by collecting the results of a large number of deterministic simulations calculated on a random set of configurations of the unknown parameters, drawn according to their probability distributions. Within the plain implementation of the MC method, the deterministic simulations are run with the so-called computational model. Such a deterministic model can be considered the most accurate synthetic approximation of the system under modeling, able to provide, for any configuration of the system parameters, a prediction of the system's outputs. Despite its accuracy, a plain implementation of the MC method turns out to be computationally heavy, since, in order to guarantee the convergence of the statistical quantities of interest (e.g., means and standard deviations), it requires running a large number of simulations (usually on the order of thousands) with the expensive computational model.
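As a rough illustration of the plain MC flow just described, the following Python sketch draws random parameter configurations and feeds them to a placeholder model; `expensive_model`, the nominal values, and the sample count are illustrative assumptions, not quantities from this work.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def expensive_model(x):
    # placeholder for the actual computational model (e.g., a circuit simulation)
    return float(np.sum(x**2))

n_runs = 10_000                      # plain MC typically needs thousands of runs
mean_x = np.array([1.0, 2.0, 3.0])   # assumed nominal parameter values
std_x = 0.2 * mean_x                 # assumed 20% relative standard deviation

samples = rng.normal(mean_x, std_x, size=(n_runs, mean_x.size))
outputs = np.array([expensive_model(x) for x in samples])

print(outputs.mean(), outputs.std())  # statistics of the output of interest
```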
Surrogate models, also known as metamodels, can be considered an effective solution to reduce the computational cost of MC simulations [2,3,4,5,6]. They provide a closed-form and fast-to-evaluate approximation of the non-linear input-output behavior of the computational model, thereby offering an efficient alternative that can be directly embedded within the MC simulation flow. Surrogate models are constructed via either regression or fitting techniques from a limited set of simulation results, called training samples, computed with the computational model. Several regression techniques with different features have been successfully adopted in many fields and applications for the construction of surrogate models, ranging from least-squares approaches [2] to the more recent kernel and machine learning regressions (e.g., support vector machine [3], least-squares support vector machine [4], and Gaussian process regression (GPR) [5,6]). However, the use of the most appropriate regression technique does not guarantee good model accuracy, since the latter is also influenced by the training samples used to train it. A common approach is to select the training samples based on a Latin hypercube sampling (LHS) scheme [7], in which the configurations of the input parameters used to train the model are selected so as to cover the experimental space as much as possible.
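As an illustration of how LHS training points can be generated in practice, the sketch below uses SciPy's Latin hypercube sampler and maps the unit-hypercube samples to Gaussian-distributed parameters; the dimensionality, training size, and distribution parameters are placeholder assumptions.

```python
import numpy as np
from scipy.stats import norm, qmc

d, L = 7, 25                                    # assumed input dimension and training size
mean_x = np.ones(d)                             # placeholder nominal values
std_x = 0.2 * mean_x                            # placeholder standard deviations

sampler = qmc.LatinHypercube(d=d, seed=1)
u = sampler.random(n=L)                         # space-filling samples in [0, 1)^d
x_train = norm.ppf(u, loc=mean_x, scale=std_x)  # mapped to the Gaussian parameters
# y_train = np.array([computational_model(x) for x in x_train])
```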
This work investigates the possible advantages and the performance of an alternative approach for the sample selection, given by the combination of an active learning (AL) scheme and the GPR [8,9,10,11,12]. The effectiveness of the proposed AL technique has been investigated by considering the uncertainty quantification of the DC efficiency of a switching converter as a function of seven uncertain parameters. The performance of the model built with the help of the proposed AL scheme is compared with that of an equivalent model in which the training samples were computed via plain LHS, by using as reference the results of an MC simulation with the computational model.
2. Methods
2.1. Gaussian Process Regression (GPR)
The discussion starts by introducing the GPR. Under the assumption that a generic non-linear computational model $\mathcal{M}$, which provides a non-linear input-output map $y = \mathcal{M}(\mathbf{x})$ between the parameters $\mathbf{x} \in \mathcal{X} \subseteq \mathbb{R}^d$ and the output of interest $y \in \mathbb{R}$, follows a Gaussian process (GP) prior, a noise-free GPR reads [13]:
$$\mathcal{M}(\mathbf{x}) \approx \tilde{\mathcal{M}}(\mathbf{x}) \sim \mathcal{GP}\big(\mu(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\big), \quad (1)$$
where $\tilde{\mathcal{M}}(\mathbf{x})$ is a GP defined by the trend function $\mu(\mathbf{x})$ and covariance function $k(\mathbf{x}, \mathbf{x}')$. A GP extends the concept of a Gaussian distribution from numbers to functions. The trend $\mu(\mathbf{x})$ provides the average function among the ones drawn from the GP prior, while the covariance provides the correlation between the values of such functions at different points (i.e., $\mathbf{x}$ and $\mathbf{x}'$) in the parameter space.
The above GP is called the prior distribution, since it fixes the properties of the unknown non-linear model before looking at the training data [13]. In fact, unlike deterministic regressions (e.g., support vector machine regression and least-squares regression), in which the candidate functions of the model are restricted to a specific class of functions (e.g., polynomial, linear, etc.), the GPR considers as candidate functions for our model all the possible non-linear functions drawn from the GP prior, letting the data "speak", and assigns a probability to each of them [14].
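To visualize the idea of candidate functions drawn from a GP prior, the following sketch (a one-dimensional toy example, not taken from this work) samples a few functions from an unfitted Gaussian process with a Matern 5/2 kernel using scikit-learn.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

x_plot = np.linspace(0.0, 1.0, 200).reshape(-1, 1)   # 1D inputs for plotting
gp_prior = GaussianProcessRegressor(kernel=Matern(length_scale=0.2, nu=2.5))
# before any training data are observed, sample_y draws functions from the prior
prior_draws = gp_prior.sample_y(x_plot, n_samples=5, random_state=0)
```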
The prior, combined with the information provided by the computational model, allows one to estimate the posterior distribution. Given a set of training samples $\mathbf{y} = [y_1, \ldots, y_L]^T$, computed for a given set of configurations of the input parameters $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_L\}$ with the computational model (i.e., $y_l = \mathcal{M}(\mathbf{x}_l)$), the posterior distribution approximates the output value $y^* = \mathcal{M}(\mathbf{x}^*)$ for any input $\mathbf{x}^* \in \mathcal{X}$ in terms of a Gaussian distribution, which reads:
$$\big(\tilde{\mathcal{M}}(\mathbf{x}^*) \mid X, \mathbf{y}\big) \sim \mathcal{N}\big(\mu^*(\mathbf{x}^*), \sigma^{*2}(\mathbf{x}^*)\big), \quad (2)$$
where the posterior mean $\mu^*(\mathbf{x}^*)$ and variance $\sigma^{*2}(\mathbf{x}^*)$ are:
$$\mu^*(\mathbf{x}^*) = \mu(\mathbf{x}^*) + \mathbf{k}^T K^{-1} (\mathbf{y} - \boldsymbol{\mu}), \quad (3)$$
$$\sigma^{*2}(\mathbf{x}^*) = k(\mathbf{x}^*, \mathbf{x}^*) - \mathbf{k}^T K^{-1} \mathbf{k}, \quad (4)$$
where $\boldsymbol{\mu} = [\mu(\mathbf{x}_1), \ldots, \mu(\mathbf{x}_L)]^T$, $K \in \mathbb{R}^{L \times L}$ is the correlation matrix in which the entries $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, $\mathbf{k} = [k(\mathbf{x}^*, \mathbf{x}_1), \ldots, k(\mathbf{x}^*, \mathbf{x}_L)]^T$ and $\mathbf{y} = [y_1, \ldots, y_L]^T$.
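A direct NumPy transcription of Equations (3) and (4) may help clarify the notation; this sketch assumes the constant trend adopted later in this section and uses a plain matrix inverse for readability (a Cholesky solve would be preferable in practice).

```python
import numpy as np

def gp_posterior(x_star, X_train, y_train, kernel, mu):
    """Posterior mean and variance at x_star for a noise-free GP with constant trend mu."""
    K = np.array([[kernel(xi, xj) for xj in X_train] for xi in X_train])  # L x L matrix
    k = np.array([kernel(x_star, xi) for xi in X_train])                  # cross-covariances
    K_inv = np.linalg.inv(K)
    post_mean = mu + k @ K_inv @ (y_train - mu)
    post_var = kernel(x_star, x_star) - k @ K_inv @ k
    return post_mean, post_var
```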
The above equations require one to specify both the trend and the covariance functions. In this paper, we will consider a GPR built from a GP prior with a constant mean function (i.e., $\mu(\mathbf{x}) = \mu$) and a Matern 5/2 covariance function with automatic relevance determination (ARD) hyper-parameters [13]:
$$k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \left(1 + \sqrt{5}\, r + \frac{5}{3} r^2\right) \exp\!\left(-\sqrt{5}\, r\right), \quad (5)$$
with
$$r = \sqrt{\sum_{i=1}^{d} \frac{(x_i - x_i')^2}{\theta_i^2}}, \quad (6)$$
where $\sigma_f^2$ and $\theta_i$ for $i = 1, \ldots, d$ are the hyper-parameters of the covariance. Both the covariance hyper-parameters and the GP mean (i.e., $\sigma_f^2$, $\theta_i$ for $i = 1, \ldots, d$ and $\mu$) are estimated during the training of the model from the training samples [13].
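One possible realization of such a GPR, assuming scikit-learn is used, is sketched below: a Matern 5/2 kernel with one length-scale per input (ARD) multiplied by a constant signal variance, whose hyper-parameters are fitted by maximizing the marginal likelihood. Note that scikit-learn assumes a zero prior mean (mitigated here via `normalize_y`) rather than the explicitly estimated constant trend described above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern

d = 7  # number of uncertain parameters
kernel = ConstantKernel(1.0) * Matern(length_scale=np.ones(d), nu=2.5)  # ARD Matern 5/2
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

# X_train: (L, d) input configurations, y_train: (L,) training responses
# gpr.fit(X_train, y_train)
# mu_star, sigma_star = gpr.predict(X_new, return_std=True)  # posterior mean and std
```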
The probabilistic interpretation in (2) allows computing, for any configuration of the input parameters $\mathbf{x} \in \mathcal{X}$, the confidence interval (CI), such that $\mathcal{M}(\mathbf{x}) \in [\mu^*(\mathbf{x}) - z\,\sigma^*(\mathbf{x}),\ \mu^*(\mathbf{x}) + z\,\sigma^*(\mathbf{x})]$ with a probability of $1 - \alpha$, where $z$ denotes the inverse of the Gaussian cumulative distribution function evaluated at $1 - \alpha/2$.
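As an example, the CI can be evaluated from the posterior mean and standard deviation as sketched below (the confidence level alpha is an assumed example value).

```python
from scipy.stats import norm

alpha = 0.05                   # e.g., a 95% confidence level (assumed)
z = norm.ppf(1 - alpha / 2)    # inverse Gaussian CDF evaluated at 1 - alpha/2
# mu_star, sigma_star = gpr.predict(X_new, return_std=True)
# ci_lower, ci_upper = mu_star - z * sigma_star, mu_star + z * sigma_star
```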
2.2. Active Learning (AL) Strategy
The statistical information provided by the probabilistic model constructed via the GPR can be suitably adopted to efficiently explore the parameter space $\mathcal{X}$, in order to obtain an optimal set of training samples [8,9,10,11]. The proposed AL approach is iterative. Given a set of training samples $(X, \mathbf{y})$, a probabilistic model $\tilde{\mathcal{M}}$ is constructed via the GPR. Then, the algorithm searches for a new candidate point $\mathbf{x}_{\mathrm{new}}$ to be included in the training set at the next iteration at the location where the posterior standard deviation $\sigma^*(\mathbf{x})$ with $\mathbf{x} \in \mathcal{X}$ provided by the GPR model $\tilde{\mathcal{M}}$ is largest, so that the overall uncertainty of the model is reduced as quickly as possible [9,11,12]. To that end, at each iteration, a new candidate configuration of the input parameters is selected by solving the following optimization problem:
$$\mathbf{x}_{\mathrm{new}} = \arg\max_{\mathbf{x} \in \mathcal{X}} \sigma^*(\mathbf{x}). \quad (7)$$
Unfortunately, the above optimization problem cannot be solved exactly, since it would require evaluating the posterior mean and standard deviation of the GPR model for every configuration of the input parameters belonging to the parameter space (i.e., for any $\mathbf{x} \in \mathcal{X}$). Our implementation of the above optimization scheme searches on a finite set of $N$ candidate points $\mathcal{D} \subset \mathcal{X}$, drawn according to the parameter distributions via an LHS, with $N \gg L$. It is important to remark that a large value of $N$ can be used, since the prediction of the posterior mean and standard deviation with the considered GPR model is extremely fast.
At the next iteration, only the new configuration of the input parameters $\mathbf{x}_{\mathrm{new}}$ selected during the above optimization process is used as input for the computational model to compute the corresponding output $y_{\mathrm{new}} = \mathcal{M}(\mathbf{x}_{\mathrm{new}})$, and a new model is trained with the enlarged training set $(X \cup \{\mathbf{x}_{\mathrm{new}}\}, \mathbf{y} \cup \{y_{\mathrm{new}}\})$. The iterative process starts at the first iteration with an initial set of $L_0$ training samples selected by a generic sampling scheme (e.g., the LHS), and it stops when either the model budget in terms of maximum number of training samples $L_{\max}$ or a given tolerance is reached.
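A compact sketch of the resulting AL loop is given below, using scikit-learn for the GPR and SciPy for the LHS; all names, sizes, and the acquisition rule, written as the maximization of the posterior standard deviation over the candidate pool following the reconstruction in (7), are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np
from scipy.stats import norm, qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern

def active_learning(model, mean_x, std_x, L0=10, L_max=100, n_cand=10_000):
    d = mean_x.size
    lhs = qmc.LatinHypercube(d=d, seed=0)
    # initial LHS training set and a large Gaussian-distributed candidate pool
    X = norm.ppf(lhs.random(L0), loc=mean_x, scale=std_x)
    y = np.array([model(x) for x in X])
    candidates = norm.ppf(lhs.random(n_cand), loc=mean_x, scale=std_x)

    kernel = ConstantKernel(1.0) * Matern(length_scale=np.ones(d), nu=2.5)
    while len(y) < L_max:
        gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
        _, sigma = gpr.predict(candidates, return_std=True)
        x_new = candidates[np.argmax(sigma)]   # most uncertain candidate
        X = np.vstack([X, x_new])
        y = np.append(y, model(x_new))         # one new expensive simulation
    return GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
```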
3. Results and Discussion
The AL technique presented in the previous section has been applied to the uncertainty quantification of the DC efficiency of the switching buck converter shown in Figure 1, as a function of seven uncertain parameters. Specifically, the values of the 12 V DC voltage source, the 50 μH inductor and its 10 mΩ equivalent series resistance (ESR), the 44.1 μF capacitance and its 20 mΩ ESR, the 1.25 Ω load resistance and the 0.3 Ω switch resistance have been modeled as seven uncorrelated Gaussian variables centered at their nominal values with standard deviations of 20% of their means. The above scenario has been implemented as a parametric netlist in LTspice (additional details are provided in [6]). The resulting model will be used as the computational model in the following analysis.
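A sketch of the corresponding parameter distributions is given below; `run_ltspice` stands in as a hypothetical wrapper around the parametric netlist that would return the simulated DC efficiency.

```python
import numpy as np

nominal = np.array([12.0,     # DC source voltage [V]
                    50e-6,    # inductance [H]
                    10e-3,    # inductor ESR [Ohm]
                    44.1e-6,  # capacitance [F]
                    20e-3,    # capacitor ESR [Ohm]
                    1.25,     # load resistance [Ohm]
                    0.3])     # switch resistance [Ohm]
sigma = 0.2 * nominal         # 20% relative standard deviation

rng = np.random.default_rng(1)
mc_samples = rng.normal(nominal, sigma, size=(10_000, nominal.size))
# efficiencies = np.array([run_ltspice(x) for x in mc_samples])  # hypothetical wrapper
```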
The AL sampling technique presented in Section 2 is applied to build a surrogate model for the prediction of the converter efficiency. The algorithm starts with $L_0$ training samples provided by the computational model and selected via a standard LHS. Then, the AL method is used to select the new candidate points in the parameter space and to compute, with the computational model, the corresponding training responses. The iterative algorithm stops when the maximum number of training samples $L_{\max}$ is reached. For the sake of simplicity, the value of $L_0$ is kept fixed in the following results.
The performance of the GPR + AL surrogate model, in which the GPR is combined with the AL scheme, has been investigated for an increasing number of training samples, $L_{\max}$ = 25, 50, 75 and 100, by considering the root-mean-square error (RMSE) computed between the model predictions and the corresponding results provided by an MC simulation with 10,000 samples. The obtained results are then compared with the ones predicted by an equivalent GPR-based model, called GPR + LHS, in which the training samples were selected via a plain LHS scheme. Since the accuracy of the resulting surrogates necessarily depends on the specific training samples used to build them, five different realizations of the training set are considered for each size $L_{\max}$.
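The accuracy metric can be summarized by the short sketch below (variable names are assumptions): for each training-set size, the RMSE is evaluated for every realization of the surrogate against the MC reference and then averaged.

```python
import numpy as np

def rmse(y_pred, y_ref):
    return float(np.sqrt(np.mean((y_pred - y_ref) ** 2)))

# errors = [rmse(surrogate.predict(mc_samples), mc_reference) for surrogate in surrogates]
# the mean and std of `errors` over the 5 realizations give one dot/bar in Figure 2
```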
Figure 2 shows the results of the above comparison in terms of mean values (red and green dots) and standard deviations (red and green bars) of the RMSE computed by considering the 5 different realizations of the training set for each size $L_{\max}$. The results clearly highlight the improved accuracy of the proposed GPR + AL model with respect to the plain GPR + LHS surrogate. In fact, the mean values of the RMSE computed for the GPR + AL surrogate are always lower than the corresponding ones obtained with the equivalent GPR + LHS surrogate, thereby highlighting the benefits of the proposed AL strategy.
For the sake of illustration, Figure 3 and Figure 4 show the scatter plots and the pdfs calculated from the predictions of the proposed GPR + AL surrogate and the ones of the GPR + LHS surrogate for a single realization of the training set, again by using as reference the results of a 10,000-sample MC simulation with the computational model. The plots confirm the capability of the two surrogate models to provide an accurate prediction of the actual behavior of the converter efficiency. Additionally, Figure 5 compares the CIs predicted by the two models for 15 validation samples randomly selected among the samples used for the MC simulation. According to the results, the two surrogate models allow one to accurately account for the uncertainty of the model predictions, since most of the validation samples fall within the CIs predicted by the probabilistic models.