Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation †

Giacomelli, Jacopo; Passalacqua, Luca

doi:10.3390/math9141679

Open AccessArticle

Calibrating the CreditRisk⁺ Model at Different Time Scales and in Presence of Temporal Autocorrelation †

by

Jacopo Giacomelli

^1,2,*,†

and

Luca Passalacqua

^2,*

¹

SACE S.p.A., Piazza Poli 42, 00187 Rome, Italy

²

Department of Statistics, Sapienza University of Rome, Viale Regina Elena 295, 00161 Rome, Italy

^*

Authors to whom correspondence should be addressed.

^†

The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of SACE S.p.A.

Mathematics 2021, 9(14), 1679; https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

Submission received: 4 June 2021 / Revised: 9 July 2021 / Accepted: 13 July 2021 / Published: 16 July 2021

(This article belongs to the Special Issue Statistical Methods of Analyzing Financial Equilibrium, Performance and Risk)

Download

Browse Figures

Versions Notes

Abstract

:

The CreditRisk

^{+}

model is one of the industry standards for the valuation of default risk in credit loans portfolios. The calibration of CreditRisk

^{+}

requires, inter alia, the specification of the parameters describing the structure of dependence among default events. This work addresses the calibration of these parameters. In particular, we study the dependence of the calibration procedure on the sampling period of the default rate time series, that might be different from the time horizon onto which the model is used for forecasting, as it is often the case in real life applications. The case of autocorrelated time series and the role of the statistical error as a function of the time series period are also discussed. The findings of the proposed calibration technique are illustrated with the support of an application to real data.

Keywords:

CreditRisk⁺; calibration; time series; default correlation; dependence structure

MSC:

62F25; 62H12; 62H25; 62M10; 62P05

JEL Classifications:

C38; C51; G21; G22

1. Introduction

While the development of modern portfolio credit risk models started in the 1980–1990 decade [1] within the framework of the Basel Accords, it is with the great credit crisis of 2008 [2] that increasing attention started to be paid to the precise determination of the structure of dependence among default events. It is well established [3] that tails of the distribution of the value of asset/liabilities portfolios are dominated by the structure of dependence rather than by the other fundamental components of credit risk (i.e., the marginal probability and the severity associated with each future default event). The vast research interest in modeling the structure of dependence resulted in the formalization of the so-called copula theory [4,5]. This “language” was explicitly adopted by the second generation of portfolio credit models to describe the dependence among loss events [6,7,8,9].

In this regard, the calibration issues raised by a particular structure of dependence (or, equivalently, the corresponding copula) can be as important as the choice of the structure itself. Generally, calibrating the dependence structure of a portfolio model is a demanding task, given the large number of parameters needed to provide a realistic description of the modeled dependencies, and considering that, on the other hand, historical data are usually not numerous enough to fill the sample space in a way sufficient for a precise estimation of the parameters.

In this work, we address a typical real-life problem: how to choose the frequency of the historical time series of default used to calibrate a classic credit portfolio model, CreditRisk

^{+}

, in order to provide the most accurate estimation of the structure of dependence parameters, or, in other words, how the calibration error “scales” with the time series frequency. This problem is especially relevant for all the cases when the debtors underlying a credit portfolio are small/medium enterprises. The lack of market information, such as CDS spread, stock price, or bond yield, forces to calibrate the model using a reduced-form approach based on historical cluster data, such as default rate time series associated with the economic sector of each debtor. This case is typical in activities such as credit insurance, surety, and factoring. In most cases, publicly available time series have a sampling period ranging from one to three months (e.g., [10]), while the calibrated CreditRisk

^{+}

model is used on a projection horizon that is at least one year long (e.g., the unwind period required to quantify a capital requirement both in Solvency 2 and in Basel 3 regulatory frameworks).

CreditRisk

^{+}

[11], disclosed in 1997, belongs to the first generation of portfolio credit risk models of “actuarial inspiration”. Applications of CreditRisk

^{+}

to the credit insurance sector are documented in the literature well before the 2008 financial credit crisis [12,13], while research activity is still ongoing in the area of actuarial science [14]. At present, CreditRisk

^{+}

is still one of the financial and actuarial industry standards for the assessment of credit risk in portfolios of financial loans or credit/suretyship policies.

Despite the vast research activity on this model and its calibration, the issue of using two different time scales for calibration and projection remains not investigated to date. The research conducted to date on the calibration of CreditRisk

^{+}

[14] has addressed the issues related to the decomposition of a given covariance matrix among the time series, which is the final necessary step to complete the calibration of the model. However, the covariance matrix is obtained by the “classical” estimator, under the assumption that the sampling period of the time series and the projection horizon are equal.

This work shows that calibrating the model at a shorter time scale than the projection horizon is possible, nontrivial, and convenient. The internal consistency of the CreditRisk

^{+}

assumptions when simultaneously imposed at different time scales has been proved and guarantees that the investigated calibration mode is not ill-posed. However, the form of the covariance estimator needed to obtain a set of parameters coherent with a specific projection horizon, using time series with a smaller sampling period, depends on the two chosen time scales. Indeed, the proposed estimator coincides with the classical one only when calibration and projection time scales are equal. Finally, we show that calibrating at a smaller time scale than the projection one provides a more precise estimation of the model parameters. The estimation error and its dependence on the difference between the two time scales are discussed.

The article is organized as follows. In Section 2, we summarize assumptions and features of the CreditRisk

^{+}

model. In Section 3, we discuss the internal consistency of the model assumptions when imposing them to be simultaneously true at different time horizons. The calibration of the model parameters, which define the dependence structure, is considered in Section 4. The different degree of precision of the estimators defined at increasing time scales is discussed in Section 5. The techniques introduced in this work are applied to a real-world case study in Section 6. The main results are summarized in Section 7.

2. The CreditRisk⁺ Model

The CreditRisk

^{+}

model is a portfolio model developed by Credit Suisse First Boston (CSFB) by Tom Wilde [15] and coworkers, first documented in [11] and later widely discussed in [16]. It is a model actuarially inspired in the sense that losses are due only to default events and not to other sources of financial risk, e.g., variation of the credit standing (the so-called “credit migration” effect). CreditRisk

^{+}

can be classified as a frequency–severity model, cast in a single-period framework, with the peculiarity that a doubly-stochastic process (i.e., the Poisson–Gamma mixture) describes the frequency of default events. Loss severity is assumed to be deterministic, although this ansatz can be easily relaxed at the cost of some additional computational burden. However, severity-related issues can be neglected for what follows.

The structure of dependence of default events is described using a factor model framework, where factors are unobservable (i.e., latent) stochastic “market” variables, whose precise financial/actuarial identification is irrelevant since the model integrates on all possible realizations (“market scenarios”). Therefore, CreditRisk

^{+}

can be further classified into the family of factor models and, in particular, into the subfamily of conditionally independent factor models, since, conditionally on the values assumed by the factors, defaults are supposed (by the model) to be independent.

The structure of the model can be summarized as follows. Let N be the number of different risks in a given portfolio and

{1 I}_{i}

the default indicator function of the i-th risk (

i = 1, \dots, N

) over the time horizon

(t, T]

. The indicator function

{1 I}_{i}

is a Bernoulli random variable such that

E [{1 I}_{i}] = q_{i}, var [{1 I}_{i}] = q_{i} (1 - q_{i}), i = 1, \dots, N .

(1)

The “portfolio loss” L over the reference time horizon

(t, T)

is then given by

L = \sum_{i = 1}^{N} {1 I}_{i} E_{i}

(2)

where each exposure

E_{i}

is supposed to be deterministic.

In order to ease the semianalytic computation of the distribution of L, the model introduces a new set of variables

Y_{i}

, each replacing the corresponding indicator function

{1 I}_{i}

(

i = 1, \dots, N

). The new variables

Y_{i}

are supposed to be Poisson-distributed, conditionally on the value assumed by the market latent variables.

Assumption 1

(CreditRisk

^{+}

distributional assumption). Given a time horizon

(t, T]

and a set of N risky debtors, the number

Y_{i}

of insolvency events generated by each i-th debtor over

(t, T]

is distributed as follows:

Y_{i} \sim Poisson (p_{i} (Γ)), p_{i} (Γ) : = q_{i} \cdot (ω_{i 0} + \sum_{k = 1}^{K} ω_{i k} Γ_{k})

(3)

where

Γ = (Γ_{1} \dots Γ_{K}) \in R_{+}^{K}

is an array of independent r.v.’s such that

Γ_{k} \sim Gamma (β_{k}^{- 1}, β_{k}), β_{k} \in R_{+}

(4)

and the factor loadings

ω_{i k}

are supposed to be all non-negative and to sum up to unity:

\begin{matrix} ω_{i k} \geq 0, i = 1, \dots, N, k = 0, \dots, K, \\ \sum_{k = 0}^{K} ω_{i k} = 1, i = 1, \dots, N . \end{matrix}

(5)

The

Γ

parameters set

{β_{1} \dots β_{K}}

is equivalent to the classical shape-scale parameterization

{α_{k}, β_{k}}

of each Gamma distributed r.v.

Γ_{k}

, after having imposed the assumption

E [Γ_{k}] = 1

, that is stated in the original formulation of the CreditRisk

^{+}

model. Hence, the k-th scale parameter

β_{k}

is equal to the variance

σ_{k}^{2}

of

Γ_{k}

. Given the independence among

Γ_{k}

’s, the covariance matrix

Σ

takes the form

Σ : = cov [Γ] = diag (σ_{1}^{2} \dots σ_{K}^{2}) = diag (β_{1} \dots β_{K})

(6)

Assumption 1 implies that

q_{i}

is the unconditional expected default frequency

q_{i} = E [p_{i} (Γ)] = \int_{R_{+}^{K}} p_{i} (Γ) f (Γ) d Γ_{1} \dots d Γ_{k},

(7)

where

f (x) = \prod_{k = 1}^{K} \frac{x_{k}^{α_{k} - 1}}{β_{k}^{α_{k}} Γ (α_{k})} e^{- x_{k} / β_{k}}, x_{k} \geq 0, α_{k}, β_{k} > 0,

(8)

and that the identity between the expected values of the original Bernoulli variable

{1 I}_{i}

and the new Poisson variable

Y_{i}

is granted:

E [Y_{i}] = E [{1 I}_{i}] = q_{i} .

(9)

The portfolio loss is now represented by the r.v.

L_{Y}

L_{Y} = \sum_{i = 1}^{N} Y_{i} \cdot E_{i}, where Y_{i} | Γ \sim Poisson (p_{i} (Γ)) .

(10)

In [11], the distribution of

L_{Y}

is obtained by using a recursive method, further described in [17]. The accuracy, stability, and possible variants of the original algorithm are discussed in [16]. The same distribution can be easily computed through Monte Carlo simulation due to the availability of a dedicated importance sampling algorithm in [18].

Notice that, although the distributions of L and

L_{Y}

differ, the expected value of the portfolio loss is the same

E [L] = E [L_{Y}]

.

In the language of copula functions, the structure of dependence implied by (3) corresponds [19] to a multivariate Clayton copula, i.e., an Archimedean copula where latent variables are Gamma-distributed (for the relation between Archimedean copula functions and factor models see, e.g., ([9] [§2.1])). The copula parameters are the factor loadings

ω_{i k}

and they can be gathered, taking into account the normalization condition stated in Assumption 1, in an

N \times K

matrix

Ω

:

Ω : = (\begin{matrix} ω_{11} & \dots & ω_{1 K} \\ ⋮ & ⋱ & ⋮ \\ ω_{N 1} & \dots & ω_{N K} \end{matrix}),

(11)

which is, for typical values of N and K, much smaller than the

N \times N

covariance matrix between the default indicators

1 I

.

Remark 1.

This work is specifically focused on improving the estimation of the CreditRisk

^{+}

copula parameters

{Ω, Σ}

. Further investigations on the properties of CreditRisk

^{+}

dependence structure, apart from those needed for the estimation improvement, and its comparison with the other copulae are beyond the scope of this study.

As shown in [14], it holds

cov [Y_{i}, Y_{j}] = q_{i} q_{j} \sum_{k = 1}^{K} ω_{i k} ω_{j k} σ_{k}^{2} + δ_{i j} q_{i},

(12)

where

δ_{i j}

is the Kronecker delta. Equation (12) allows the calibration of the factor loadings, and thus of the dependence structure of the CreditRisk

^{+}

model, by matching the observed covariance matrix of historical default time series with model values. However, since the model is defined in a single-period framework, with a reference “forecasting” time horizon

(t, T]

, that is typically of 1 year, i.e.,

T = t + 1

, it is not a priori evident how to use historical time series with a different frequency (e.g., quarterly) in a consistent way, when calibrating the model parameters. Naively, it is reasonable to expect that the larger the information provided by the historical time series (i.e., the higher the frequency), the better the calibration. This issue is addressed in the next sections.

3. CreditRisk⁺ Using Multiple Unwind Periods

The original CreditRisk

^{+}

formulation, summarized in Assumption 1, defines the model in a uniperiodal framework, where only one time scale

T - t

is considered. In this section, we discuss the internal consistency of the model assumption when imposing it more than once at distinct time scales. In this context, the expression “internal consistency” means that it is possible and well-posed imposing Assumption 1 to be true at two distinct time scales. The same applies also considering a slightly modified version of the CreditRisk

^{+}

framework (i.e., imposing Assumption 2, introduced in the following, instead of Assumption 1).

Extending the original CreditRisk

^{+}

formulation to a multiperiod framework enables the calibration of the model considering a time scale different from the one chosen for its application. The results presented in this section are applied in the next Section 4 to estimate the elements of the matrix

A : = Ω^{T} Σ Ω .

(13)

Estimating A is a fundamental step in order to complete the calibration of the model. In Section 4 estimators are defined using historical series sampled with a period that is not necessarily equal to the projection horizon on which

Σ

and

Ω

are defined. Section 5 shows the convenience of choosing a sampling period shorter than the projection horizon in order to evaluate

\hat{A}

.

3.1. The Single Unwind Period Case

As discussed in Section 2, in CreditRisk

^{+}

each risk (i.e., debtor) is modeled by a Poisson distributed r.v.

Y_{i}

, although the Bernoulli distribution is the natural choice to represent absorbing events, such as default. Assumption 1 is convenient in terms of analytical tractability since

L_{Y}

distribution can be computed through a semianalytical method. However, in order to address the problem of calibrating CreditRisk

^{+}

in a “roll-over” framework, defined by an arbitrary set of time intervals, it is useful to recover the Bernoulli representation of each debtor by introducing a new r.v.

{\tilde{Y}}_{i} : = {1 I}_{Y_{i} > 0}

.

Both the r.v.

Y_{i}

and its distribution parameter

p_{i} (Γ)

can take values larger than 1. This is formally correct, given that

Y_{i} \sim Poisson (p_{i} (Γ))

, despite not coping with the representation of absorbing events, that can occur at most once by definition. The so-called “Poisson approximation”, introduced by substituting

{1 I}_{i}

with

Y_{i}

, is numerically sound as

q_{i}

approaches to zero—a condition that is well fulfilled in most real world relevant cases.

Indeed, Assumption 1 implies that

{\tilde{Y}}_{i} | Γ \sim Bernoulli ({\tilde{p}}_{i} (Γ))

where the distribution parameter is

{\tilde{p}}_{i} (Γ) = Prob (Y_{i} > 0 | Γ) = 1 - exp [- q_{i} (ω_{i 0} + \sum_{k = 1}^{K} ω_{i k} Γ_{k})] .

(14)

It holds by construction

E [{\tilde{Y}}_{i}] = \int_{R_{+}^{K}} {\tilde{p}}_{i} (Γ) f (Γ) d Γ_{1} \dots d Γ_{K} .

(15)

Computing the integral in (15) and then approximating the term

exp [- q_{i} ω_{0}]

with its second order Taylor series centered at

q_{i} = 0

leads to the following result.

Proposition 1

(Asymptotic equivalence between Bernoulli and Poisson representation of risks). Let

{\tilde{Y}}_{i} : = {1 I}_{Y_{i} > 0}

where

Y_{i}

is distributed according to Assumption 1. Then

{\tilde{q}}_{i} : = E [{\tilde{Y}}_{i}] = 1 - e^{- q_{i} ω_{i 0}} \prod_{k = 1}^{K} {(1 + q_{i} ω_{i k} σ_{k}^{2})}^{- 1 / σ_{k}^{2}} .

(16)

Further,

{\tilde{q}}_{i} = q_{i} + O (q_{i}^{2}) \overset{q_{i} \to 0^{+}}{\to} q_{i} .

(17)

Proposition 1 implies that

E [L_{Y_{i}}] ≃ E [L_{{\tilde{Y}}_{i}}]

, provided that

q_{i} ≪ 1

. Moreover, the same result enables also the exact satisfaction of

E [L_{Y_{i}}] = E [L_{{\tilde{Y}}_{i}}]

, in case the stochastic parameter

{\tilde{p}}_{i} (Γ)

is redefined through the substitution

q_{i} \mapsto q_{i}^{'}

, where

q_{i}^{'}

verifies the following modified version of (16):

E [{\tilde{Y}}_{i} (q_{i}^{'})] = 1 - e^{- q_{i}^{'} ω_{i 0}} \prod_{k = 1}^{K} {(1 + q_{i}^{'} ω_{i k} σ_{k}^{2})}^{- 1 / σ_{k}^{2}} = q_{i} = E [Y_{i}] .

(18)

It is worth noticing that the substitution

{1 I}_{i} \mapsto Y_{i}

discussed in Section 2 implies the preservation of the expected value

E [L] = E [L_{Y}]

due to the fact that it is done before the introduction of the market factors

Γ

. On the other hand, restoring the Bernoulli representation of each risk after having introduced the dependence structure requires the results presented in Proposition 1.

Proposition 1 permits the introduction of a slightly modified version of the CreditRisk

^{+}

model that is asymptotically equivalent to the original one stated in Assumption 1. The equivalence between the two models is further analyzed in the next sections.

Assumption 2

(Modified CreditRisk

^{+}

distributional assumption). Given a time horizon

(t, T]

and a set of N risky debtors, the number of insolvency events generated by each i-th debtor over

(t, T]

is represented by the r.v.

{\tilde{Y}}_{i} \sim Bernoulli ({\tilde{p}}_{i} (Γ))

, where the distribution parameter

{\tilde{p}}_{i} (Γ)

satisfies (14). Assumptions on market factors Γ and factor loadings Ω remain the same stated in Assumption 1.

In Assumption 2 the linear dependence of the parameters

p_{i} (Γ)

from the latent variables has been replaced with a log link function. Thus, the modified version of CreditRisk

^{+}

is also referred to as “exponential” in the following.

3.2. The Multiple Unwind Periods Case

This section investigates the consequences of imposing the internal consistency of Assumption 1 or Assumption 2 at distinct time scales. Assumptions 3 and 4 are introduced hereinafter, in order to specify the family of parameters that have to be considered at the distinct time intervals where the model is applied.

The following assumption guarantees the internal consistency at different time scales of the classical CreditRisk

^{+}

model, defined in Assumption 1.

Assumption 3

(CreditRisk

^{+}

parameters at different time scales). Let

t \equiv t_{0}, t_{1}, \dots, t_{m} \equiv T

be a partition of the time interval

(t, T]

. Let Assumption 1 be satisfied over each j-th interval

(t_{j - 1}, t_{j}]

by the set

{Y_{i}^{(j)}}

(

i = 1 \dots N

), where

Y_{i}^{(j)}

is the r.v. representing the i-th risk observed during the j-th interval and the following holds for the associated set

{q_{i}^{(j)}; Γ^{(j)}; Ω^{(j)}}

of parameters and market factors:

q_{i}^{(j)} = q_{i} \frac{t_{j} - t_{j - 1}}{T - t} = constant,

(19)

Γ_{k}^{(j)} \sim Gamma (σ_{k}^{- 2} ξ_{k j}^{- 1} \frac{t_{j} - t_{j - 1}}{T - t}, σ_{k}^{2} ξ_{k j} \frac{T - t}{t_{j} - t_{j - 1}}),

(20)

Ω^{(j)} = Ω,

(21)

where

ξ_{k j} \in R_{+}

.

Further, the following assumption guarantees the internal consistency at different time scales of the modified version of CreditRisk

^{+}

model, introduced in Assumption 2.

Assumption 4

(Modified CreditRisk

^{+}

parameters at different time scales). Let

t \equiv t_{0}, t_{1}, \dots, t_{m} \equiv T

be a partition of the time interval

(t, T]

. Let Assumption 2 be satisfied over each j-th interval

(t_{j - 1}, t_{j}]

by the set

{{\tilde{Y}}_{i}^{(j)}}

(

i = 1 \dots N

), where

{\tilde{Y}}_{i}^{(j)}

is the r.v. representing the i-th risk observed during the j-th interval. The associated set

{q_{i}^{(j)}; Γ^{(j)}; Ω^{(j)}}

of parameters and market factors satisfies the same assumptions stated in Assumption 3.

Finally, for the sake of simplicity, the additional Assumption 5 is introduced, with regard to the independence among market factors considered at different times. However, being possible that real-data time series violate Assumption 5, this assumption is weakened in the following Section 3.3.

Assumption 5

(Non-autocorrelated market factors). Given Assumption 3, let

cov [Γ_{k}^{(j)}, Γ_{k}^{(j^{'})}] = δ_{j j^{'}} var [Γ_{k}^{(j)}] .

(22)

Considering the assumptions introduced above, we prove that CreditRisk

^{+}

is internally consistent when extended to a roll-over framework.

Theorem 1

(Internal consistency of CreditRisk

^{+}

in absence of autocorrelation). Let us consider a set of risks

{Y_{i}}

(

i = 1 \dots N

), observed through a time horizon

(t, T]

, and an arbitrary partition

t \equiv t_{0}, t_{1}, \dots, t_{m} \equiv T

of

(t, T]

, such that Assumptions 3 (“CreditRisk

^{+}

parameters at different time scales”) and Assumption 5 (“non-autocorrelated market factors”) are verified with

ξ_{k j} = 1

(23)

for each

i = 1 \dots N

,

k = 1 \dots K

and

j = 1 \dots m

. Then

{Y_{i}}

satisfies Assumption 1 (“CreditRisk

^{+}

distributional assumption”) over

(t, T]

.

The statement above remains true replacing Assumption 3 with Assumption 4 (“modified CreditRisk

^{+}

parameters at different time scales”) and Assumption 1 with Assumption 2 (“modified CreditRisk

^{+}

distributional assumption”), ceteris paribus.

The proof of Theorem 1 is reported in Appendix A.1.

This result shows that extending the CreditRisk

^{+}

model to a multiperiod framework is well-posed.

Remark 2.

The choice

ξ_{j k} = 1

implies no loss of generality, since a different (positive) constant

ξ_{j k} = c

is equivalent to redefine the variances of the market factors

c σ_{k}^{2} \mapsto σ_{k}^{2}

.

3.3. Internal Consistency and Autocorrelation in Time Series

As shown in Section 2, the dynamics of each parameter

p_{i}

is induced by the latent Gamma factors only. Imposing Assumption 5 to any (arbitrarily short) time scale implies that considered time series

{Γ_{k}^{(j)}}_{j = 1, 2, \dots}

must exhibit zero autocorrelation. Hence autocorrelation must be completely absent from the historical default frequencies too.

However, this requirement could not be satisfied by the observed time series used in calibrating the model. Indeed, we need to verify that the model can preserve its internal consistency if autocorrelation has to be considered.

The purpose of this work is to investigate whether it is possible and convenient to calibrate the CreditRisk

^{+}

model at a time scale that copes with the available historical data (i.e., the sampling period of the historical time series) instead of using the same time scale needed for projections (usually bigger). Hence, in case it is not possible to preserve the internal consistency of the model at each arbitrary time scale, due to the presence of autocorrelation, it is sufficient to ask that it holds up to the smallest of the two time scales of interest—the historical sampling period and the projection horizon.

Let us specialize to the constant mesh case

t_{j} - t_{j - 1} = (T - t) / m = δ_{m}

. This choice copes with a typical real case, where the sampling period

δ_{m}

of the available historical time series is constant and the considered projection horizon

T - t

is a multiple of it. Under these premises, a weakened version of Assumption 5 is introduced.

Assumption 6

(Autocorrelated market factors). Given Assumption 3, for each k-th latent variable, considered at the time scale

δ_{m}

, a time-invariant ACF

ϱ_{x k}

exists, such that

cov (Γ_{k}^{(j)}, Γ_{k}^{(j + x)}) = ϱ_{x k} var (Γ_{k}^{(j)}) .

(24)

Furthermore, the following closure with respect to the addition holds

\sum_{j = 1}^{m} Γ_{k}^{(j)} \sim Gamma (α_{k}, β_{k})

(25)

for a couple

α_{k}, β_{k}

of shape and scale parameters.

Assumption 6 is considered instead of Assumption 5 to state the following alternate version of Theorem 1.

Theorem 2

(Internal consistency of CreditRisk

^{+}

model in presence of autocorrelation). Let us consider a set of risks

{Y_{i}}

(

i = 1 \dots N

), observed through a time horizon

(t, T]

, and a uniform partition

{t_{j} : = t + j δ_{m}}_{j = 1 \dots m}

of

(t, T]

, such that Assumption 3 (“CreditRisk

^{+}

parameters at different time scales”) and Assumption 6 (“autocorrelated market factors”) are verified with

ξ_{k j} = {[1 + 2 \sum_{x = 1}^{m - 1} ϱ_{x k} (1 - \frac{x}{m})]}^{- \frac{1}{2}}

(26)

for each

i = 1 \dots N

,

k = 1 \dots K

and

j = 1 \dots m

. Then

{Y_{i}}

satisfies Assumption 1 (“CreditRisk

^{+}

distributional assumption”) over

(t, T]

.

The statement above remains true replacing Assumption 3 with Assumption 4 (“modified CreditRisk

^{+}

parameters at different time scales”) and Assumption 1 with Assumption 2 (“modified CreditRisk

^{+}

distributional assumption”), ceteris paribus.

The proof of Theorem 2 is reported in Appendix A.2.

Assumption 6 can be either well-posed or ill-posed, depending on the considered

ϱ_{x k}

. The trivial case

ϱ_{x k} = 0

for each

x \in Z

copes with Assumption 5. Correlated Gamma variables, as well as the distributional properties of the sum of Gamma variables, have been intensively studied in the literature, and this is still an active research field [20,21,22,23], due to its relevance for information technology. At least in case of identically distributed Gamma variables—such as

Γ_{k}^{(j)}

in our framework—with ACF obeying to a power-law

ϱ_{x k} = ρ_{k}^{| x |}, ρ_{k} \in (0, 1),

(27)

the distribution of the sum

Γ_{k}

is known to be approximately Gamma [20], while more generical cases imply the sum to be distributed differently [22,23]. Moreover, it is known that partial sums of independent Gamma variables can be used to generate sequences of (auto)correlated Gamma variables [21].

Remark 3.

The exponential ACF in Equation (27) provides a non-trivial case that satisfies Assumption Assumption 6 and, thus, Theorem 2. In the following Section 4.4, Theorem 2 permits the estimation of A in presence of autocorrelated time series. Equation (27) is then considered in Section 5.3 to investigate numerically the estimators introduced in Section 4.4. However, to date, a general framework is missing to tell whether a given

ϱ_{x k}

lets the partial sums

\sum_{j} Γ_{k}^{(j)}

remain (approximately) Gamma distributed, with the exception of exponential ACFs.

The estimators introduced in Section 4.4 to consider autocorrelation in time series are still applicable to an inconsistent framework, provided that at least the latent variables

Γ_{k}

(defined onto the projection horizon) are Gamma distributed and

Γ_{k}^{(j)}

satisfy the mean and variance requirements implied by Assumption 6 above.

4. Calibration of the Structure of Dependence

The model is calibrated based on a partition of the risks in H homogeneous sets

c_{h} (t)

,

h = 1, \dots, H

. In this context “homogeneity” means that two risks belonging to the same set

c_{h} (t)

have the same vector of factor loadings

ω^{(h)}

. The sets have an explicit time dependence since they can change by the occurrence of defaults. On the contrary, the structure of dependence, defined by

ω^{(h)}

is supposed to be time-independent.

Hence, solving the calibration problem requires the evaluation of

H factor loading vectors ${ω^{(h)}}_{h = 1 \dots H}$ , that link each of the homogenous clusters to the K latent variables;
K volatilities ${σ_{k}}_{k = 1 \dots K}$ , needed to specify the distribution of each of the latent variables.

The calibration is achievable by a two-step procedure. Firstly, the matrix

A : = Ω^{T} Σ Ω

, introduced in Section 3, is estimated. Then, A is decomposed under the proper constraints in order to evaluate

Ω

and

Σ

separately. This section describes a method to complete the first step, providing an estimator of A both for the single and the multiple unwind period cases, with a moment-matching approach that allows expressing

\hat{A}

as a function of the covariance matrix among the historical frequencies of default. The second step is addressed later in Section 6, which provides an example of calibration using a real data set.

Adopting the standard CreditRisk

^{+}

Assumption 1, Equation (12) can be used to link the covariance matrix among the historical frequencies of default with the matrix A. In Section 4.1,

\hat{A}

is provided in the case of historical frequencies of default, sampled with the same tenor of the projection horizon. In Section 4.2,

\hat{A}

is generalized to the case of historical frequencies of default sampled with an arbitrary tenor.

Furthermore, in Section 4.3,

\hat{A}

is determined under the exponential version of the CreditRisk

^{+}

framework, introduced in Assumption 2. Thanks to this modified assumption, the corresponding functional form of

\hat{A}

is simpler than the one obtained in Section 4.2 based on Assumption 1.

Section 4.2 and Section 4.3 cope with Assumption 5, that implies absence of autocorrelation in time series. The final Section 4.4 uses Assumption 6 instead, generalizing the main results presented in this section to the case where autocorrelation must be taken into account. In this case, the simpler form of

\hat{A}

obtained in Section 4.3 comes in handy in the generalization to the non-trivial ACF case.

4.1. The Single Unwind Period Case

The first case considered is that of a single unwind period

(t, T]

. For each set

c_{h} (t)

, let

n_{h} (t) : = | c_{h} (t) |

,

F_{h} : = \frac{1}{n_{h} (t)} \sum_{i \in c_{h} (t)} Y_{i}

and

G_{h} 1 - F_{h}

. The expected values of

F_{h}

and

G_{h}

are respectively:

q_{h} : = E [F_{h}] = \frac{\sum_{i \in c_{h} (t)} q_{i}}{n_{h} (t)},

(28)

s_{h} : = E [G_{h}] = 1 - E [F_{h}] .

(29)

Remark 4.

The slight abuse of notation in (28) is done to avoid the introduction of a new symbol to represent

E [F_{h}]

. However, the letters chosen for indexing risks and cluster (“i” and “h” respectively) are maintained in the following of this work, clarifying the meaning of the “q” symbol each time it is used.

For any pair of sets of risks

{h, h^{'}}

, the covariance between the default frequencies is:

\begin{matrix} cov (F_{h}, F_{h^{'}}) & = & E [(F_{h} - E [F_{h^{'}}]) (F_{h^{'}} - E [F_{h^{'}}])] \\ = & \frac{1}{n_{h} n_{h^{'}}} E [\sum_{i \in c_{h}} (Y_{i} - q_{i}) \sum_{i^{'} \in c_{h^{'}}} (Y_{i^{'}} - q_{i^{'}})] \\ = & \frac{1}{n_{h} n_{h^{'}}} \sum_{i \in c_{h}} \sum_{i^{'} \in c_{h^{'}}} cov (Y_{i}, Y_{i^{'}}), \end{matrix}

(30)

that, using Equation (12), becomes:

cov (F_{h}, F_{h^{'}}) = \frac{1}{n_{h} n_{h^{'}}} \sum_{i \in c_{h}} \sum_{i^{'} \in c_{h^{'}}} (q_{i} q_{i^{'}} \sum_{k = 1}^{K} ω_{i k} ω_{i^{'} k} σ_{k}^{2} + δ_{i i^{'}} q_{i}) .

(31)

Equation (31) shows the relation between the observed covariance of default frequencies and the factor loadings, describing the structure of dependence of the model.

Moreover, assuming that all risks in a given homogenous set share the same factor loadings, the above expression simplifies to:

cov (F_{h}, F_{h^{'}}) = q_{h} q_{h^{'}} \sum_{k = 1}^{K} ω_{h k} ω_{h^{'} k} σ_{k}^{2} + δ_{h h^{'}} \frac{q_{h}}{n_{h}}

(32)

Notice that the second term in Equation (32) is present only when

h = h^{'}

, and becomes quickly negligible as

n_{h}

grows (since

q_{h} < 1

).

Equation (32) enables the estimation of A over the same time scale

T - t

used for projections:

{\hat{A}}_{h h^{'}} = \frac{1}{q_{h} q_{h^{'}}} [c \hat{o} v (F_{h}, F_{h^{'}}) - δ_{h h^{'}} \frac{q_{h}}{n_{h}}] .

(33)

4.2. The Multiple Unwind Period Case

Let us consider a set of H time series defined using a constant step

δ_{m} = (T - t) / m

. As done in Section 2, each variable introduced in Section 4.1 for the time interval

(t, T]

can be redefined over each of the considered time intervals. Namely, in the following we use the set of observables quantities

{F_{h}, G_{h}, q_{h}, s_{h}}

, measured either over

(t, T]

or

(t_{j - 1}, t_{j} = t_{j - 1} + δ_{m}]

or a generic time interval

(t, t^{'}]

. For the latter two cases, we introduce the notation

{F_{h}^{(j)}, G_{h}^{(j)}, q_{h}^{(j)}, s_{h}^{(j)}}

and

{F_{h} (t, t^{'}), G_{h} (t, t^{'}), \dots}

, respectively. Further, the variables

F_{m h} : = 1 - \prod_{j = 1}^{m} [1 - F_{h}^{(j)}],

(34)

G_{m h} : = \prod_{j = 1}^{m} G_{h}^{(j)} = 1 - F_{m h}

(35)

are introduced.

In CreditRisk

^{+}

,

F_{h} (t, t^{'})

arises from a doubly stochastic process, since each absorbing event is generated conditioned to the latent systematic factors. For the sake of simplicity, we neglect the idiosyncratic uncertainty brought by each

Y_{i} (t, t^{'})

. In fact, for

n_{h} (t)

large enough, the Bernoulli (or Poisson) r.v.’s contributions to the variance of

F_{h} (t, t^{'})

are dominated by the contribution of

Γ (t, t^{'})

. This permits the following assumption.

Assumption 7

(Large clusters). For each cluster

c_{h}

(

h = 1 \dots H

) and each time interval

(t, t^{'}] \subseteq (t, T]

it holds

var [F_{h} (t, t^{'}) | Γ (t, t^{'})] = 0 .

Then the following holds:

Proposition 2

(CreditRisk

^{+}

scale-invariance law). Let us consider a set of risks

{Y_{i}}

(

i = 1 \dots N

), observed through a time horizon

(t_{a}, t_{b}]

and classified into a set of homogenous clusters

c_{h}

(

h = 1 \dots H

). Let Assumption 3 (“CreditRisk

^{+}

parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“large clusters”) hold with

ξ_{k j} = 1

for each

(t, T] \subseteq (t_{a}, t_{b}]

and for each uniform partition

t \equiv t_{0} < t_{1} < \dots < t_{m} \equiv T

of

(t, T]

, (

m \in N^{*}

). Then, the couple

F_{h} (t, T), F_{h^{'}} (t, T)

satisfies the conservation law

{[cov (F_{h} (t, T), F_{h^{'}} (t, T)) + s_{h} (t, T) s_{h^{'}} (t, T)]}^{\frac{1}{T - t}} = constant .

(36)

for each pair of clusters

c_{h}, c_{h^{'}}

and each

(t, T] \subseteq (t_{a}, t_{b}]

.

The proof of Proposition 2 is reported in Appendix A.3.

Proposition 2 is one of the main results of this work. It allows to build an estimator of

cov (F_{h} (t, T), F_{h^{'}} (t, T))

using default frequencies

F_{h}^{(j)}

defined on a different time scale

δ_{m}

. The dependence upon m of the precision of the covariance estimator is discussed in Section 5.

Indeed, applying Proposition 2 to Equation (33), it is possible to calibrate the dependence structure of the CreditRisk

^{+}

model, by first determining the elements of the A matrix as

A_{h h^{'}} = \frac{1}{q_{h} q_{h^{'}}} [{(cov (F_{h}^{(j)}, F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)})}^{m} - s_{h} s_{h^{'}} - δ_{h h^{'}} \frac{q_{h}}{n_{h}}]

(37)

for any

j = 1, \dots, m

, and then decomposing A, thus obtaining a separate estimate of the

\{Ω, σ_{Γ}^{2}\}

parameters. The SNMF decomposition can be performed, e.g., by using the technique described in [14].

4.3. The Exponential Case

In this section the problem of calibrating the dependence structure is addressed using the exponential form of the model introduced in Assumptions 2 and 4. Theorem 1 proves that also the exponential form remains consistent when considering multiple unwind periods. Since now

{\tilde{Y}}_{i}

variables are used instead of the corresponding

Y_{i}

, the frequencies

F_{h}

and their complements

G_{h}

are replaced by

{\tilde{F}}_{h}

and

{\tilde{G}}_{h}

, defined by the substitution

Y_{i} \mapsto {\tilde{Y}}_{i}

in

F_{h}

and

G_{h}

definitions, respectively. Furthermore, it is convenient to introduce the following

L_{h} : = - \frac{q_{h}}{q_{h}^{⋆}} ln {\tilde{G}}_{h}

(38)

where

q_{h}^{⋆} : = - ln \frac{\sum_{i \in c_{h} (t)} e^{- q_{i}}}{n_{h} (t)} .

(39)

The notation introduced in Section 4.2 for

{F_{h}, G_{h}, q_{h}, \dots}

are extended to the exponential case as well. Hence, the sets of symbols

{{\tilde{F}}_{h} (t, t^{'}), {\tilde{G}}_{h} (t, t^{'}), \dots}

and

{{\tilde{F}}_{h}^{(j)}, {\tilde{G}}_{h}^{(j)}, \dots}

are also used. The log link function that relates

{\tilde{p}}_{i}

and

Γ

simplifies the form of the scale invariariance law presented in Proposition 2. Indeed, in this case the following holds.

Proposition 3

(Modified CreditRisk

^{+}

scale-invariance law). Let us consider a set of risks

{{\tilde{Y}}_{i}}

(

i = 1 \dots N

), observed through a time horizon

(t_{a}, t_{b}]

and classified into a set of homogenous clusters

c_{h}

(

h = 1 \dots H

). Let Assumption 4 (“modified CreditRisk

^{+}

parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“large clusters”) hold with

ξ_{k j} = 1

for each

(t, T] \subseteq (t_{a}, t_{b}]

and for each uniform partition

t \equiv t_{0} < t_{1} < \dots < t_{m} \equiv T

of

(t, T]

, (

m \in N^{*}

). Then

L_{h} (t, T), L_{h^{'}} (t, T)

obey to the conservation law

\frac{1}{T - t} cov [L_{h} (t, T), L_{h^{'}} (t, T)] = constant

(40)

for each pair of clusters

c_{h}, c_{h^{'}}

and each

(t, T] \subseteq (t_{a}, t_{b}]

.

The proof of Proposition 3 is reported in Appendix A.4.

Proposition 3 states a conservation law for the modified version of the model, likewise Proposition 2 in the original (i.e., Poisson–Gamma) CreditRisk

^{+}

framework. The form obtained for the LHS of Equation (40) is simpler than the corrisponding LHS of Equation (36). In general, this framework results to be more tractable than the original model. This is especially useful when estimating A given a non-trivial ACF, as shown in the next Section 4.4.

In this case, A can be estimated as

A_{h h^{'}} = \frac{1}{q_{h} q_{h^{'}}} cov [L_{h}, L_{h^{'}}] = \frac{1}{q_{h}^{⋆} q_{h^{'}}^{⋆}} cov [ln (1 - {\tilde{F}}_{h}), ln (1 - {\tilde{F}}_{h^{'}})]

(41)

where we have neglected the contribution of

cov ({\tilde{Y}}_{i}, {\tilde{Y}}_{i}) \propto \frac{1}{n_{h} (t_{1})} ≃ 0

. Definition (38) and Proposition 3 imply

A_{h h^{'}} = \frac{m}{q_{h}^{⋆ (j)} q_{h^{'}}^{⋆ (j)}} cov [ln (1 - {\tilde{F}}_{h}^{(j)}), ln (1 - {\tilde{F}}_{h^{'}}^{(j)})]

(42)

for each

j = 1 \dots m

.

4.4. Handling Autocorrelated Time Series in Calibration

In this section a generalization of estimators in Equations (37) and (42) is provided, in case Assumption 5 has to be replaced with Assumption 6 due to the presence of autocorrelation in time series. We preliminarily report below a second order approximation that comes in handy to generalize Equation (37).

\begin{matrix} \prod_{j = 1}^{m} E [G_{h}^{(j)} G_{h^{'}}^{(j)}] & = & \prod_{j = 1}^{m} (1 - q_{h}^{(j)} - q_{h^{'}}^{(j)} + E [F_{h}^{(j)} F_{h^{'}}^{(j)}]) \\ = & 1 - \sum_{j = 1}^{m} (q_{h}^{(j)} + q_{h^{'}}^{(j)}) + \sum_{j = 1}^{m} E [F_{h}^{(j)} F_{h^{'}}^{(j)}] \\ + & \sum_{j < j^{'}} \sum_{h, h^{'} = 1, 2} q_{h}^{(j)} q_{h^{'}}^{(j^{'})} + \dots \end{matrix}

(43)

We now consider again the relation between

cov (F_{m h}, F_{m h^{'}})

and

cov (F_{h}^{(j)} F_{h^{'}}^{(j)})

implied by Proposition 2, under the presence of autocorrelation for the latent variables. Unlike in Section 3.2, in this case covariance terms at delay

| j - j^{'} | \geq 1

cannot be nullified.

\begin{matrix} cov (F_{m h}, F_{m h^{'}}) & = & E [\prod_{j = 1}^{m} G_{h}^{(j)} G_{h^{'}}^{(j)}] - s_{h} s_{h^{'}} \\ = & 1 - \sum_{j = 1}^{m} (q_{h}^{(j)} + q_{h^{'}}^{(j)}) \\ + & \sum_{j < j^{'}} \sum_{h, h^{'} = 1, 2} E [F_{h}^{(j)} F_{h^{'}}^{(j^{'})}] \\ + & \sum_{j = 1}^{m} E [F_{h}^{(j)} F_{h^{'}}^{(j)}] - s_{h} s_{h^{'}} + \dots \end{matrix}

(44)

Replacing Equation (43) into Equation (44), we have

\begin{matrix} cov (F_{m h}, F_{m h^{'}}) & = & \prod_{j = 1}^{m} E [G_{h}^{(j)} G_{h^{'}}^{(j)}] \\ + & \sum_{j < j^{'}} \sum_{h, h^{'} = 1, 2} cov [F_{h}^{(j)} F_{h^{'}}^{(j^{'})}] - s_{h} s_{h^{'}} + O_{3} \end{matrix}

(45)

where

O_{3}

is a compact notation for the sum of all the terms of order 3 or greater. Given that

O_{3} \overset{q \to 0}{\to} 0

, the approximation

O_{3} \approx 0

is numerically sound in practice and implies the following generalization of

A_{h h^{'}}

in Equation (37)

A_{h h^{'}} \approx \frac{1}{q_{h} q_{h^{'}}} [{(cov (F_{h}^{(j)}, F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)})}^{m} + A C_{h h^{'}}^{(L)} - s_{h} s_{h^{'}} - δ_{h h^{'}} \frac{q_{h}}{n_{h}}],

(46)

where the autocorrelation term

A C^{(L)}

is defined as

A C_{h h^{'}}^{(L)} : = \sum_{x = 1}^{m - 1} (m - x) (cov [F_{h}^{(j)} F_{h}^{(j + x)}] + cov [F_{h^{'}}^{(j)} F_{h^{'}}^{(j + x)}] + 2 cov [F_{h}^{(j)} F_{h^{'}}^{(j + x)}]) .

(47)

This completes the extension of the linear case presented in Section 4.2 to autocorrelated time series.

The exponential case—introduced in Section 4.3—turns out to be more tractable, since the linear structure implied by Proposition 3 allows us to avoid approximations similar to the one applied to extend the linear case above. Indeed, only the simplification implied by Assumption 5 must be abandoned, implying

cov [L_{h}, L_{h^{'}}] = m cov [L_{h}^{(j)}, L_{h^{'}}^{(j)}] + \sum_{x = 1}^{m - 1} 2 (m - x) cov [L_{h}^{(j)}, L_{h^{'}}^{(j + x)}] .

(48)

This is implied by the fact that

L_{h}^{(j)}

are still identically distributed for the same h but not independent. Hence, the estimator in Equation (42) becomes

A_{h h^{'}} (t, T) = \frac{m}{q_{h}^{⋆ (j)} q_{h^{'}}^{⋆ (j)}} cov [ln (1 - {\tilde{F}}_{h}^{(j)}), ln (1 - {\tilde{F}}_{h^{'}}^{(j)})] + A C_{h h^{'}}^{(E)}

(49)

where

A C_{h h^{'}}^{(E)} : = \frac{1}{q_{h}^{⋆ (j)} q_{h^{'}}^{⋆ (j)}} \sum_{x = 1}^{m - 1} 2 (m - x) cov [ln (1 - {\tilde{F}}_{h}^{(j)}), ln (1 - {\tilde{F}}_{h^{'}}^{(j + x)})]

(50)

5. The Advantage of a Short Sampling Period

Let us consider a

Δ_{t}

-long projection period and a set of historical time series of defaults that span a (past) time interval of length

n Δ_{t}

. Typical examples can be

Δ_{t} = 1

year and

5 \leq n \leq 20

. Moreover, let the historical time series be sampled with a period

δ_{m}

, which is m times smaller than

Δ_{t}

(i.e.,

δ_{m} : = Δ_{t} / m

). Considering

Δ_{t} = 1

year, realistic assumptions are

m = 4

(quarterly time series) or

m = 12

(monthly time series). Therefore, the considered time series are defined over

m \times n

intervals of length

δ_{m}

, defined by a schedule

t_{0}, \dots, t_{m \times n}

.

This section discusses the precision improvement achievable by calibrating the model on historical default time series with a period smaller than the time horizon on which the calibrated model is applied. Indeed, the statistical error on the determination of A depends on m, i.e., on the sampling frequency of the observations, as shown in Section 5.1. Further, given Assumption 7 (“large clusters”), the statistical error can be written as a closed-form function of m, as

σ_{k}^{2}

approaches to zero (

k = 1 \dots K

). In the following, the assumption of “small” volatilities is referred to as “Gaussian regime”, because it implies

Γ_{k} \dot{\sim} N (1, β_{k})

(

k = 1 \dots K

), as discussed in the proof of Theorem 3.

As in the previous Section 3 and Section 4, both the standard CreditRisk

^{+}

framework (Assumptions 1 and 3) and the modified “exponential” version (Assumptions 2 and 4) are discussed hereinafter.

In applications where

c_{h}

’s are scarcely populated or

σ_{k}

’s are not negligible, Theorem 3 is not guaranteed to cope with observations. This case is addressed in Section 5.2, where the robustness of the closed-form expression (54) is investigated by Monte Carlo simulations.

A numerical approach is maintained in Section 5.3 as well, where the estimation error of

\hat{A}

at different time scales is measured in presence of autocorrelation, following the generalization introduced in Section 3.3 and Section 4.4. In this case, the exponential version of the model comes in handy: indeed, it is observed that the error on the estimator introduced in (46) (i.e., standard CreditRisk

^{+}

version) does not decrease at increasing m, while the opposite is true for the estimator presented in (49) (i.e., exponential CreditRisk

^{+}

version).

In Section 4, the

\hat{A}

estimator has been presented in multiple versions, depending on the considered model (standard or exponential version of CreditRisk

^{+}

), the chosen sampling period

δ_{m}

and the presence or absence of autocorrelation. Thus, it is worth introducing a compact notation to identify the different versions of

\hat{A}

.

The expressions for

A_{h h^{'}}

presented in (37) and (46) are addressed as “linear” estimators (as opposed to “exponential”) in the following. In these cases the symbol

{\hat{A}}_{h h^{'}}^{(L, m)}

is used, where L stands for “linear” and

m = (T - t) / δ_{m}

is the ratio between the projection and calibration time scales.

On the other hand, the expressions for

A_{h h^{'}}

presented in (42) and (49) are addressed as “exponential” estimators and so the symbol

{\hat{A}}_{h h^{'}}^{(E, m)}

is used.

For the sake of brevity, when L or E is omitted,

{\hat{A}}_{h h^{'}}^{(m)}

refers to both the cases and, when m is omitted,

{\hat{A}}_{h h^{'}}

refers to the

m = 1

case.

The improvement in statistical precision with respect to the estimate with no subsampling, can be quantified by the following ratio:

ε [{\hat{A}}_{h h^{'}}^{(m)}] : = \sqrt{\frac{var [{\hat{A}}_{h h^{'}}^{(m)}]}{var [{\hat{A}}_{h h^{'}}]}} .

(51)

Symbol

ε_{h h^{'}}^{(m)}

and its further specifications

ε_{h h^{'}}^{(L, m)} : = ε [{\hat{A}}_{h h^{'}}^{(L, m)}]

and

ε_{h h^{'}}^{(E, m)}

can be used as well. The last short notation that results to be convenient in the following is

c_{h h^{'}}^{(L m)} : = cov [F_{h}^{(j)}, F_{h^{'}}^{(j)}] + s_{h}^{(j)} s_{h^{'}}^{(j)},

(52)

c_{h h^{'}}^{(E m)} : = cov [L_{h}^{(j)}, L_{h^{'}}^{(j)}] .

(53)

where

F_{h}^{(j)}

,

L_{h}^{(j)}

and

s_{h}^{(j)}

(

j = 1 \dots m

) are i.i.d. variables quantified using a sampling period

δ_{m}

.

Remark 5.

The notation “

\hat{A}

” refers to the fact the covariances involved in the definitions must be replaced with the corresponding sample estimators, when applying

A_{h h^{'}}^{(m)}

to historical time series. The same applies to the symbol

\hat{c}

.

5.1. Precision of $\hat{A}$ at Different Time Scales under the Gaussian Regime

The following result quantifies the precision gain of performing CreditRisk

^{+}

model calibration by historical time series available at increasing sampling frequencies. As anticipated, the precision of the estimated parameters increases as the sampling period decrease. This result holds under Assumption 7, in the limit

σ \to 0^{+}

and considering absence of autocorrelation. The cases where some

n_{h}

is small (i.e., it does not verify Assumption 7) or where some

σ_{k}

is not negligible are addressed numerically in the next Section 5.2—showing that the precision is still increasing as shorter sampling periods are considered. The introduction of autocorrelation is addressed in Section 5.3.

Theorem 3

(Estimation errors under Gaussian regime). Let us consider a set of risks, observed through a time interval

(t_{0}, t]

and classified into a set of homogenous clusters

c_{h}

(

h = 1 \dots H

). Let Assumption 3 (“CreditRisk

^{+}

parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“Large clusters”) hold with

ξ_{k j} = 1

for a given uniform partition

t_{0} < t_{1} < \dots < t_{j} < \dots < t_{m \times n} \equiv t

of

(t_{0}, t]

, (

t_{j} - t_{j - 1} = δ_{m}

;

m, n \in N^{*}

). Let

\hat{A}

be the estimate of A needed to calibrate the CreditRisk

^{+}

model in order to project losses over the time horizon

(t, T]

, such that

(t - t_{0}) / (T - t) = n

and

(T - t) / (δ_{m}) = m

. Then the following is true for

{\hat{A}}_{h h^{'}}^{(m)}

:

ε_{h h^{'}}^{(m)} \overset{σ \to 0^{+}}{\to} \sqrt{\frac{n - 1}{m \cdot n - 1}}

(54)

Equation (54) remains true also considering Assumption 4 (“modified CreditRisk

^{+}

parameters at different time scales”) instead of Assumption 3.

The proof of Theorem 3 is reported in Appendix A.5.

5.2. Beyond the Gaussian Regime: Numerical Simulations

In this section we verify that both the estimators

{\hat{A}}_{h h^{'}}^{(E, m)}

and

{\hat{A}}_{h h^{'}}^{(L, m)}

are more precise at increasing m. The closed-form results obtained in the Gaussian regime, discussed in Section 5.1, hold when the factor volatilities

σ_{Γ}

are much less than 1. Increasing

σ_{k}

(

k = 1, \dots, K

) the Gaussian regime becomes less satisfactory and the difference of precision amongst determinations with different values of m becomes smaller. However, the error of

{\hat{A}}_{h h^{'}}^{(m)}

remains monotonically decreasing in m, even far from the Gaussian regime conditions.

We considered a case study with a two-factors market (

Γ_{k}

,

k = 1, 2

). The couple of systematic factors induces the dependence between two populations of risks, as per the weights reported in Table 1.

The volatilities (

σ_{k}

,

k = 1, 2

) associated to the factors are chosen according to seven different scenarios (indexed by

i_{σ}

), respectively as

σ_{Γ} : = 2^{i_{σ}} (\begin{matrix} 2.5 \cdot 10^{- 2} \\ 5.0 \cdot 10^{- 2} \end{matrix}), i_{σ} = 0 \dots 6 .

(55)

For each scenario, the distributions of the estimators

{\hat{A}}_{12}^{(E, m)}

and

{\hat{A}}_{12}^{(L, m)}

(

m = 1 \dots 12

) have been determined using

10^{5}

simulations of

{F_{1} (t, n_{1}), F_{2} (t, n_{2})}

where

t \in (t_{0}, t_{0} + n Δ_{t}]

(

n = 10

) and

n_{h}

(

h = 1, 2

) is the number of risks belonging to each cluster. For both estimators the dynamic

F_{h} (t, n_{h})

is that reported in (14). All risks belonging to the same cluster are supposed to have the same unconditioned intensity of default

q_{i} (t, t + Δ_{t}) = - \frac{1}{Δ_{t}} log (0.99), i = 1, \dots, n_{h}, h = 1, 2 .

(56)

To investigate the additional contribution to the error

σ [{\hat{A}}_{12}]

, generated by the finiteness of each cluster, different values of

n_{h}

have been considered. In particular, the number of claims per each elementary temporal step

δ_{m} = 1 / m

is extracted from a binomial distribution with parameter

n_{h} \in \{10^{3}, 2.5 \cdot 10^{3}, 5 \cdot 10^{3}, 10^{4}, 2.5 \cdot 10^{4}, 5 \cdot 10^{4}\}, h = 1, 2 .

(57)

For simplicity’s sake, it is assumed that each defaulted risk is instantly replaced by a new risk, keeping the population of each cluster constant in time. Finally, the case

n_{h} = \infty

(absence of binomial source of randomness) is also considered.

Figure 1 shows the behaviour of

ε [{\hat{A}}_{12}^{(m)}]

as a function of m, comparing various choices of

σ_{Γ}

. In this case we are not considering yet the contribution to error due to the finite population (

n_{h} = \infty

for each cluster h). Equation (54) (red curve) is almost perfectly verified by the least volatility scenario (

σ_{Γ} = (2.5, 5.0) \cdot 10^{- 2}

). At increasing volatility values (brighter curves), the gain in precision obtained at higher m is reduced, as well as the accordance with Equation (54).

Since the transformation of

ε [{\hat{A}}_{12}^{(m)}]

moving away from the Gaussian regime (i.e., increasing

| σ_{Γ} |

) is smooth, estimating

A_{12}

with

m > 1

remains convenient even for

{[σ_{Γ}]}_{k} ≳ 1

, despite the fact that Equation (54) is not verified anymore.

Comparison between the left and the right panel of Figure 1 shows that the above argument holds both in the linear and in the exponential case. This fact is also verified for all the other results of this section.

The results shown in Figure 1 are numerically checked against the case of finite portfolio populations: we tested each of the

n_{h}

declared in Equation (57). Even the smallest size considered (i.e.,

n_{h} = 10^{3}

—Figure 2), that is affected by the largest binomial contribution to the error, leads to results comparable to the ones observed in the

n_{h} = \infty

case. The size

n_{h} = 10^{3}

is considered to be a limiting value for a realistic case.

We simulated the distribution of the estimator

{\hat{A}}_{12}^{(m)}

as a function of m, testing all the possible combinations of

σ_{Γ}

and

n_{h}

declared in Equations (55) and (57). Figure 3 reports an example of the results. All the other considered

(n_{h}, σ_{k})

couples resulted to have a similar behavior. The visual comparison between

E [{\hat{A}}_{12}^{(m)}]

(blue “X” symbol) and

A_{12}

level (red horizontal line) shows that indeed

{\hat{A}}_{12}^{(m)}

is unbiased, both in the linear and in the exponential case (Equations (37) and (42) respectively). The dispersion around the mean reduces at increasing m, in agreement with both Equation (54) and the numerical results in Figure 1 and Figure 2.

As implied by Figure 2 and Figure 3, the number of risks

n_{h}

does not play a relevant role (if any) in computing the ratio

ε [{\hat{A}}_{12}^{(m)}]

, while the absolute value of the standard error

σ [{\hat{A}}_{12}^{(m)}]

is sensitive to the size of the portfolio.

This fact is confirmed by the results shown in Figure 4, where the estimates of

σ [{\hat{A}}_{12}^{(m)}]

have been arranged as functions of

n_{h}

at fixed

σ_{Γ}

and m values. As expected, the standard error is greater when considering smaller

n_{h}

values, while the dependence on

n_{h}

of the error disappears quickly as approaching

n_{h} \to \infty

.

5.3. Estimation Error in Presence of Autocorrelation

In Section 5.2 the precision gain at increasing m is measured in absence of autocorrelation. In this section, the same numerical simulations are re-performed, introducing autocorrelation and comparing the results against the theoretical estimation of

ε

. The effect of autocorrelation on

ε

is discussed in Appendix B.

The numerical setup introduced above in Section 5.2 has been maintained, with a further assumption about ACF. Indeed, we assume that each latent variable (

k = 1, 2

) obeys to the following ACF law, discussed in Section 3.3

ϱ_{x k} = ρ^{| x |}, m = 12

where the considered

ρ

values are

0.05

and

0.5

. For

m^{'} < 12

cases, we have considered ACF’s resulting for the latent variables time series

{\tilde{Γ}}_{k}^{(j^{'})}

obtained by the clustering operation

{\tilde{Γ}}_{k}^{(j^{'})} \equiv \frac{m^{'}}{m} \sum_{j} Γ_{k}^{(j)}, j = 1 + \frac{m}{m^{'}} (j^{'} - 1) \dots \frac{m}{m^{'}} j^{'}

given the aforementioned ACF law at

1 / m

time scale. Since the contribution of the finite population to the error has been shown to be neglectable in Section 5.2, simulations in presence of autocorrelation have been performed under

n_{h} = \infty

assumption only.

Figure 5 shows that the estimator

{\hat{A}}_{h h^{'}}^{(E, m)}

remains more precise at increasing m, even in presence of autocorrelation. The analytical results obtained in the Gaussian regime (i.e., theoretical superior and inferior estimates of

ε

—dashed and solid red lines in Figure 5), discussed in Appendix B, are in good agreement with the numerical results obtained in the considered set up. All the empirical measures of

ε

are included between the two theoretical limits (yellow areas).

Moreover, precision gain (i.e.,

ε < 1

) at

m > 1

is also possible when using the estimator

ε [{\hat{A}}_{12}^{(L, m)}]

, introduced in Equation (46). However, due to the approximation introduced in this case, the estimator is not convenient (i.e.,

ε > 1

) in the majority of the considered configurations.

6. An Application to Market Data

This section provides an example of the calibration technique applied to a real-world data set. The calibration technique is applied to a set of historical time series of bad loan rates supplied from the Bank of Italy. “Bad loan” is a subcategory of the broader class “Non-Performing Loan” and it is defined as exposures to debtors that are insolvent or in substantially similar circumstances [24].

In particular, the chosen data set is composed of the quarterly historical series TRI30496 (

m = 4

) over a five year period (from 1 January 2013 to 31 December 2017,

n = 5

,

Δ_{t} = 1

). The data are publicly available at [10]. The time series are supplied by the customer sector (“counterpart institutional sector”) and geographic area (“registered office of the customer”). The latter, in the example, is held fixed to a unique value that corresponds to the whole country (Italy). Table 2 and Table 3 report the definition of the 6 different clusters and their main features.

By inspection of Table 3, it is possible to perform a rough estimate of

σ_{Γ}

. Equation (14) implies that the following holds for coefficients of variation

C V_{h}

(

h = 1, \dots, H \equiv 6

):

C V_{h} : = \frac{σ_{h}}{{\bar{p}}_{h}} ≃ \sum_{k = 1}^{K} ω_{h k} {[σ_{Γ}]}_{k}

Furthermore, the normalization requirement over the factor loadings

ω_{h k}

implies

\sum_{k = 1}^{K} ω_{h k} ≲ 1

Hence we can state that

〈C V〉 : = \frac{1}{H} \sum_{h} C V_{h}

has the same order of magnitude of

\frac{1}{K} \sum_{k} {[σ_{Γ}]}_{k}

. Since

〈C V〉 ≃ 0.124

, results in Section 5.2 suggest that this data set is not far from the Gaussian regime and so there is an appreciable increase of precision in estimating

A (0, 1)

with

m > 1

.

\hat{A} (0, 1)

is estimated by applying Equation (42) over a one-year period. The results obtained for

{\hat{A}}^{(E, m)} (0, 1)

(

m = 1, 4

) are reported in Table 4.

The elementwise precision gain for

m = 4

,

ε [{\hat{A}}^{(E, 4)} (0, 1)]

, obtained under the Gaussian regime assumption, is shown in Table 5. This result is obtained applying definition (51) and Equation (A25) both to the cases

m = 4

and

m = 1

. Equation (A25) has been shown to be valid under the Gaussian regime, discussed in Section 5.1.

In this case, the preliminary decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

, that would be needed using the Monte Carlo method discussed in Section 5.2, is not needed.

According to Equation (54), the elements of

ε [{\hat{A}}^{(E, 4)} (0, 1)]

reported in Table 5 should be all approximately equal to

0.46

, since they should depend only on the couple

m, n

(

m = 4

and

n = 5

in this case). However, in a real world case like the one considered, the assumption of zero autocorrelation is satisfied with a different precision by each time series

p_{h} (t)

. Furthermore, the estimated covariance matrices might need to be regularized (indeed the Higham regularization algorithm [25] was used both for

m = 1

and

m = 4

series). Hence, a different ratio for each element

(h, h^{'}) = 1, \dots, 6

is justified. Nonetheless, it is worth noticing that all the ratios reported in Table 5 have the same order of magnitude of the predicted value

0.46

.

Knowledge of the historical number of risky subjects

n_{h} (t)

for each cluster (

h = 1, \dots, 6

) at each observation date (

t = 1 / 4, 2 / 4, \dots, 5

) allows to take into account the binomial contribution to the error

σ [{\hat{A}}^{(E, m)} (0, 1)]

, both for

m = 4

(quarterly series) and

m = 1

(yearly series), although the finiteness of the population does not add a relevant contribution to the error, as already observed in Section 5.2.

Table 6 provides Monte Carlo estimation of

σ [{\hat{A}}^{(E, m)} (0, 1)]

(m = 1, 4)

, which considers also the role of

n_{h} (t)

. Since the values in Table 6 provide a measure of the error in the determination of

{\hat{A}}^{(E, m)} (0, 1)

, it turns out that the estimates reported in Table 4 are elementwise consistent one with the other.

The Monte Carlo estimation of

σ [{\hat{A}}^{(E, m)} (0, 1)]

, as done in Section 5.2, requires the a priori knowledge of the true dependence structure

W, σ_{Γ}

. Since this is a case study, we do not have an a priori parameterization of the calibrated model. Hence, we have used

\hat{W}, {\hat{σ}}_{Γ}

estimated from

{\hat{A}}^{(E, 4)} (0, 1)

instead, as a proxy of the “true” model parameters. The computation of

\hat{W}, {\hat{σ}}_{Γ}

from

{\hat{A}}^{(E, 4)} (0, 1)

is discussed below.

In order to complete the CreditRisk

^{+}

calibration, we have to decompose

\hat{A}

and find the factor loadings matrix

\hat{W}

together with the vector of systematic factors variances

{\hat{σ}}_{Γ}^{2}

. To do so, we use the Symmetric Non-negative Matrix Factorization (SNMF), an iterative numerical method to search an approximate decomposition of

\hat{A}

which satisfies the requirements of the CreditRisk

^{+}

model over

\hat{W}

(i.e., all elements

ω_{h k} > 0

and

\sum_{k} ω_{h k} = 1

). The application of SNMF to CreditRisk

^{+}

is discussed in detail in [14]. In the following, we give evidence only of the implementation details necessary to address this case study. Being an iterative method, SNMF requires an initial choice of matrixes

\begin{matrix} {\hat{U}}_{0} & : = & {\hat{W}}_{U} {\hat{Σ}}^{1 / 2}, \\ {\hat{V}}_{0} & : = & {\hat{Σ}}^{1 / 2} {\hat{W}}_{V}, \end{matrix}

such that

\hat{A} = {\hat{U}}_{0} {\hat{V}}_{0}

. It is not required that

{\hat{U}}_{0} = {\hat{V}}_{0}^{T}

, nor all the elements of

{\hat{U}}_{0}

and

{\hat{V}}_{0}

have to be positive. We set

{\hat{U}}_{0}, {\hat{V}}_{0}

from the eigenvalues decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

.

For the considered data set, the eigenvalues decomposition returned the set of eigenvalues and eigenvectors reported in Table 7.

We use the

\tilde{ω}, \tilde{σ}

notation to address the quantities over which the normalization requirement of CreditRisk

^{+}

has not been imposed yet.

Since more than the

95 %

of variance is explained by the first three eigenvectors, we reduced the dimensionality of the latent variables vector to be

K = 3

. Hence we define

\begin{matrix} {\hat{U}}_{0} & = & [{\tilde{ω}}_{1}, {\tilde{ω}}_{2}, {\tilde{ω}}_{3}] \cdot diag ({\tilde{σ}}_{1}, {\tilde{σ}}_{2}, {\tilde{σ}}_{3}) \\ = & [\begin{matrix} 1.80 & 5.29 & 1.15 \\ 2.78 & 5.18 & 3.71 \\ 2.51 & 5.58 & 4.46 \\ 27.79 & - 2.65 & - 0.57 \\ 2.33 & 5.88 & 1.93 \\ 2.07 & 10.58 & - 5.96 \end{matrix}] \cdot 10^{- 2} \end{matrix}

and

{\hat{V}}_{0} = {\hat{U}}_{0}^{T}

. In general, SNMF aims to minimize iteratively the cost function

{|\hat{A} - \hat{U} \hat{V}|}^{2} + α {|\hat{U} - {\hat{V}}^{T}|}^{2}

where

| \cdot |

is the Frobenious norm, eventually weighted, and

α

is a free parameter to weight the asymmetry penality term. Further details on the method are available in [14]. The application of SNMF method, together with the normalization constraint over the factor loadings, leads to the result reported in Table 8.

A reasonable economic interpretation supports the set of parameters resulting from the calibration process described above. Indeed, factor loadings associated with the “general government” sector (

h = 4

) are completely distinct from the ones of the other sectors (i.e., this is the only sector mainly depending on the

k = 1

factor): this fact copes with the different nature of the public entities from the ones belonging to the other considered sectors. Furthermore, “companies” (

h = 2, 3

) share approximately the same dependence structure. The same applies when considering “households” (

h = 1, 5

). Finally, the “institutions serving households” sector (

h = 6

) shares the same latent factor (

k = 2

) but shows a different balance between idiosyncratic and systematic factor loadings compared to “households”, that is coherent with the nature of a sector strongly linked to “household” sectors, despite not being completely equivalent.

Results in Table 8 have been used to quantify the estimation errors reported in Table 6.

7. Conclusions

In this work, we have investigated how to calibrate the dependence structure of the CreditRisk

^{+}

model, when the sampling period

δ_{m}

of the (available) default rate time series is different from

Δ_{t}

—the length of the future time interval chosen for the projections.

Preliminarily, we proved that CreditRisk

^{+}

remains internally consistent when imposing the underlying distributional assumption to be simultaneously true at different time scales (Theorem 1). The model internal consistency is robust against the introduction of autocorrelation, depending on the considered ACF form (Theorem 2).

Then the problem has been approached in terms of moment matching, providing two asymptotically equivalent formulations for estimating the covariance matrix A amongst the systematic factors of the model (Propositions 2 and 3). The choice between the two estimators of A, provided in Equations (37) and (42), depends on the functional form (linear or exponential) that links the probability of claim/default and the latent variables. Both the estimators are explicitly dependent on the ratio

Δ_{t} / δ_{m}

, allowing for the calibration of the model at a time scale that is different from the one chosen for applying the calibrated model. Both the estimators have been generalized to autocorrelated time series in Equations (46) and (49), although only the latter (i.e., exponential case) is an exact result, while a second-order approximation has been adopted for the linear case.

Furthermore, calibrating the model on a shorter time scale than the projection horizon has been proved to be convenient in terms of reduced estimation error on

\hat{A}

. Analytical expressions for the error are provided in the Gaussian regime (i.e., small variances of the latent variables) by Theorem 3. In contrast, the case of increasing variance has been investigated numerically, confirming that, in general, the precision of the calibration is higher when employing historical data with a shorter sampling period. It has been verified that the convenience of calibrating the model at short time scales also remains in the presence of autocorrelation, although this is guaranteed only in the exponential framework, where an exact correction term is available.

Finally, the techniques presented in this work are shown to be numerically sound when applied to a real, publicly available data set of Italian bad loan rates.

Author Contributions

Conceptualization, J.G. and L.P.; methodology, J.G. and L.P.; software, J.G. and L.P.; validation, J.G. and L.P.; formal analysis, J.G. and L.P.; investigation, J.G. and L.P.; resources, J.G. and L.P.; data curation, J.G. and L.P.; writing—original draft preparation, J.G.; writing—review and editing, L.P.; visualization, J.G. and L.P.; supervision, J.G. and L.P.; project administration, J.G. and L.P.; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are openly available in “Banca d’Italia—Base Dati Statistica” [10].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

This section reports the proofs of theorems and propositions presented in this study.

Appendix A.1. Proof of Theorem 1

Proof.

Firstly, the statement is proved considering Assumptions 1 and 3.

Assumption 3 implies by construction that

{Y_{i}^{(j)}}_{j = 1 \dots m}

is a set of Poisson r.v.’s, which are mutually independent, conditionally on the realization of

{Γ^{(j)}}_{j = 1 \dots m}

. Poisson distribution is closed with respect to addition. Hence

\sum_{j = 1}^{m} Y_{i}^{(j)} | Γ^{(j)} \sim Poisson (p_{i Σ}),

(A1)

where the distribution parameter is

p_{i Σ} = \sum_{j = 1}^{m} \underset{q_{i}^{(j)}}{\underset{⏟}{q_{i} \frac{t_{j} - t_{j - 1}}{T - t}}} (ω_{i 0} + \sum_{k = 1}^{K} ω_{i k} Γ_{k}^{(j)}) .

(A2)

Equation (20) in Assumption 3, the choice

ξ_{k j} = 1

and the scaling property of Gamma distribution imply that

\frac{t_{j} - t_{j - 1}}{T - t} Γ_{k}^{(j)} \sim Gamma (σ_{k}^{- 2} \frac{t_{j} - t_{j - 1}}{T - t}, σ_{k}^{2})

(A3)

Furthermore, Assumption 5 and the fact that independent Gamma r.v.’s with the same scale parameter are closed with respect to addition imply that

\sum_{j = 1}^{m} \frac{t_{j} - t_{j - 1}}{T - t} Γ_{k}^{(j)} \sim Gamma (σ_{k}^{- 2}, σ_{k}^{2}) .

(A4)

Hence

\sum_{j = 1}^{m} \frac{t_{j} - t_{j - 1}}{T - t} Γ_{k}^{(j)} \equiv Γ_{k}

and so

\sum_{j = 1}^{m} Y_{i}^{(j)} \equiv Y_{i}

. This implies that

{Y_{i}}

satisfies Assumption 1 over

(t, T]

.

The proof above can be extended to the exponential case, i.e., when considering Assumptions 2 and 4 instead of Assumptions 1 and 3. The form of parameter

p_{i Σ}

in (A2) can be obtained also can be obtained also from Assumption 4. In fact, the substitution

Y_{i}^{(j)} \mapsto {\tilde{Y}}_{i}^{(j)}

implies that

{\tilde{Y}}_{i} \sim Bernoulli ({\tilde{p}}_{i})

where

ln (1 - {\tilde{p}}_{i}) = ln \prod_{j = 1}^{m} (1 - {\tilde{p}}_{i}^{(j)}) = \sum_{j = 1}^{m} q_{i} \frac{t_{j} - t_{j - 1}}{T - t} (ω_{i 0} + \sum_{k = 1}^{K} ω_{i k} Γ_{k}^{(j)}) .

(A5)

Considering Equation (A5) instead of (A2), the proof presented above holds for the

{\tilde{Y}}_{i}

representation of risks, ceteris paribus, implying that

{{\tilde{Y}}_{i}}

satisfies Assumption 2 over

(t, T]

. □

Appendix A.2. Proof of Theorem 2

Proof.

The same arguments that lead to (A2) or to (A5) in proof of Theorem 1 are still valid in this case. Hence, it suffices to prove that mean and variance of the latent variable

Γ_{k}^{'} : = \sum_{j = 1}^{m} \frac{t_{j} - t_{j - 1}}{T - t} Γ_{k}^{(j)} = \frac{1}{m} \sum_{j = 1}^{m} Γ_{k}^{(j)}

remain consistent with CreditRisk

^{+}

requirements, stated in Assumption 1. It holds

E [Γ_{k}^{'}] = 1

, since

E [Γ_{k}^{(j)}] = 1

. Moreover, the coefficient

ξ_{j k}

compensates the bias introduced in

var [Γ_{k}^{'}]

by the fact that

Γ_{k}^{(j)}

(

j = 1 \dots m

) are autocorrelated according to the ACF

ϱ_{x k}

:

\begin{matrix} var [\sum_{j = 1}^{m} Γ_{k}^{(j)}] & = & \sum_{j = 1}^{m} var [Γ_{k}^{(j)}] + \sum_{j = 1}^{m} \sum_{j^{'} \neq j} cov [Γ_{k}^{(j)}, Γ_{k}^{(j^{'})}] \\ = & var [Γ_{k}^{(1)}] \underset{m ξ_{k j}^{- 2}}{\underset{⏟}{(m + 2 \sum_{x = 1}^{m - 1} (m - x) ϱ_{x k})}} \end{matrix}

which implies

var [Γ_{k}^{'}] = σ_{k}^{2}

directly.

The fact that

Γ_{k}^{'}

is Gamma distributed is imposed in Assumption 6, implying that

Γ_{k}^{'} \equiv Γ_{k}

and so that Assumption 1 is satisfied. □

Appendix A.3. Proof of Proposition 2

Proof.

Given a time interval

(t, T] \subseteq (t_{a}, t_{b}]

and a uniform partition (

j = 1 \dots m

) over

(t, T]

, Assumptions A3 and A5 imply that

{Y_{i}}

satisfies Assumption 1 over

(t, T]

by Theorem 1. Assumption 7 guarantees the convergence of

F_{h}

to

E [F_{h} | Γ]

and of

F_{h}^{(j)}

to

E [F_{h}^{(j)} | Γ^{(j)}]

, where we recall that

F_{h} = F_{h} (t, T)

.

For any interval

(t, T] \subseteq (t_{a}, t_{b}]

and any pair of clusters

c_{h}, c_{h^{'}}

, definitions (34), (35) and Assumption 5 imply that the covariance between

F_{m h}

and

F_{m h^{'}}

is given by

cov (F_{m h}, F_{m h^{'}}) = \prod_{j = 1}^{m} [cov (F_{h}^{(j)} F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)}] - s_{h} s_{h^{'}}

(A6)

Since all the considered subintervals

(t_{j - 1}, t_{j}]

have the same length

δ_{m} = t_{j} - t_{j - 1}

, the frequencies

F_{h}^{(j)}

are i.i.d., so that the above expression simplifies to:

cov (F_{m h}, F_{m h^{'}}) + s_{h} s_{h^{'}} = {[cov (F_{h}^{(j)}, F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)}]}^{m}

(A7)

for any

j = 1, \dots, m

.

Each cluster

c_{h}

is supposed to be homogenous by definition, i.e.,

ω^{(i)} = ω^{(h)}

for each risk

Y_{i} \in c_{h}

. Hence, distributional Assumptions 1 and 3 imply that both

F_{h} \overset{n_{h} (t) \to \infty}{\to} E [F_{h} | Γ]

and

F_{h}^{(j)} \overset{n_{h} (t_{j}) \to \infty}{\to} E [F_{h}^{(j)} | Γ^{(j)}]

are sample estimators of the parameters

p_{h} (Γ) : = q_{h} (ω_{h 0} + \sum_{k} ω_{h k} Γ_{k})

and

p_{h}^{(j)} (Γ^{(j)})

respectively, leading to the equivalence relation

F_{m h} = F_{h} = {\hat{q}}_{h} (ω_{h 0} + \sum_{k = 1}^{K} ω_{h k} Γ_{k}),

(A8)

therefore both

F_{m h} (t, T)

and

F_{h} (t, T)

are estimators of the default frequency for the

(t, T]

interval. Thus, Equation (A7) can be rewritten as:

cov (F_{h}, F_{h^{'}}) + s_{h} s_{h^{'}} = {[cov (F_{h}^{(j)}, F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)}]}^{m}

(A9)

and, since

m = (T - t) / δ_{m}

,

{[cov (F_{h}, F_{h^{'}}) + s_{h} s_{h^{'}}]}^{1 / (T - t)} = {[cov (F_{h}^{(j)}, F_{h^{'}}^{(j)}) + s_{h}^{(j)} s_{h^{'}}^{(j)}]}^{1 / δ_{m}} .

(A10)

To complete the proof, let

(t, T]

and

(t^{'}, T^{'}]

be two subintervals of

(t_{a}, t_{b}]

, such that

(T - t) / (T^{'} - t^{'}) \in Q

. Hence,

GCD {T - t; T^{'} - t^{'}} : = \bar{δ} \in R_{+}

exists.

\bar{δ}

can be used as the mesh to define two uniform partitions over the two considered intervals.

Given these partitions, (A10) can be applied both to

T - t

and to

T^{'} - t^{'}

, leading to

\begin{matrix} {[cov (F_{h} (t, T), F_{h^{'}} (t, T)) + s_{h} (t, T) s_{h^{'}} (t, T)]}^{1 / (T - t)} = \\ {[cov (F_{h} (t^{'}, T^{'}), F_{h^{'}} (t^{'}, T^{'})) + s_{h} (t^{'}, T^{'}) s_{h^{'}} (t^{'}, T^{'})]}^{1 / (T^{'} - t^{'})} \end{matrix}

and completing the proof. The requirement

(T - t) / (T^{'} - t^{'}) \in Q

can be easily weakened by the convergence of finite continued fractions with an increasing number of terms, until the desired degree of precision is reached. □

Appendix A.4. Proof of Proposition 3

Proof.

Given a time interval

(t, T] \subseteq (t_{a}, t_{b}]

and a uniform partition (

j = 1 \dots m

) over

(t, T]

, Assumptions 4 and 5 imply that

{{\tilde{Y}}_{i}}

satisfies Assumption 2 over

(t, T]

by Theorem 1.

Assumption 7 guarantees the convergence of

L_{h}

to

E [L_{h} | Γ]

, where we recall that

L_{h} = L_{h} (t, T)

. Furthermore, it holds by definition that

E [L_{h} | Γ] = p_{h} (Γ)

, where the notation

p_{h}

has been introduced in the proof of Proposition 2.

The same apply to

L_{h}^{(j)}

(

j = 1 \dots m

) for each uniform partition of

(t, T]

considered; indeed, Assumption 7 implies

L_{h}^{(j)} \to E [L_{h}^{(j)} | Γ^{(j)}] = p_{h}^{(j)} (Γ)

.

Since

p_{h} (Γ) = \sum_{j = 1}^{m} p_{h}^{(j)} (Γ^{(j)})

and given that the partition is uniform, it holds

L_{h} = m L_{h}^{(j)}

for each

j = 1 \dots m

. Since

m (T - t) / δ_{m}

, we have

\frac{1}{T - t} L_{h} = \frac{1}{δ_{m}} L_{h}^{(j)}

(A11)

Assumption 5 and Equation (A11) imply that

\frac{1}{T - t} cov [L_{h}, L_{h^{'}}] = \frac{1}{δ_{m}} cov [L_{h}^{(j)}, L_{h^{'}}^{(j)}]

(A12)

for each considered pair of clusters

c_{h}, c_{h^{'}}

. The proof is completed by the same argument used in proof of Proposition 2, after Equation (A10). □

Appendix A.5. Proof of Theorem 3

Proof.

Assumptions 3 and 5 and

ξ_{k j} = 1

imply Assumption 1 by Theorem 1. The same theorem implies Assumption 2 in case Assumption 4 is considered instead of Assumption 3, ceteris paribus. Furthermore, Assumptions 3, 5 and 7 and

ξ_{k j} = 1

imply that

{\hat{A}}_{h h^{'}}^{(L, m)} = \frac{1}{q_{h} q_{h^{'}}} [{({\hat{c}}_{h h^{'}}^{(L m)})}^{m} - s_{h} s_{h^{'}} - δ_{h h^{'}} \frac{q_{h}}{n_{h}}]

(A13)

by Proposition 2, for any

j = 1 \dots m

and

h, h^{'} = 1 \dots H

. Analogously, considering Assumption 4 instead of Assumption 3, it holds

{\hat{A}}_{h h^{'}}^{(E, m)} = \frac{m}{q_{h} q_{h^{'}}} {\hat{c}}_{h h^{'}}^{(E m)}

(A14)

by Proposition 3, for any

j = 1 \dots m

and

h, h^{'} = 1 \dots H

.

The next step of the proof is showing that

Γ_{k} \dot{\sim} N (1, β_{k})

in the limit

σ_{k} \to 0^{+}

. In fact, both Assumptions 3 and 4 state that

Γ_{k}^{(j)} \sim Γ (\frac{1}{m β_{k}}, m β_{k}), E [Γ_{k}^{(j)}] = 1, var [Γ_{k}^{(j)}] = m β_{k}, j = 1, \dots, m .

Hence their probability densities

d F_{k} (x)

satisfy the following:

d F_{k} (x) \propto x^{{(m β_{k})}^{- 1} - 1} exp (- {(m β_{k})}^{- 1} x) d x

(A15)

Since it holds

{(m β_{k})}^{- 1} - 1 \overset{σ_{k} \to 0^{+}}{\to} {(m β_{k})}^{- 1}

, we have

lim_{σ_{k} \to 0^{+}} d F_{k} (x) \propto exp (\frac{ln x - x}{m β_{k}}) d x .

(A16)

By introducing the auxiliary variable

x^{'} x - 1

and replacing

ln (1 + x^{'})

with the first three terms of its Maclaurin series, relation (A16) can be equivalently written as

lim_{σ_{k} \to 0^{+}} d F_{k} (x (x^{'})) \propto exp (- \frac{{x^{'}}^{2}}{2 m β_{k}}) d x^{'}

(A17)

In the limit

σ_{k} = β_{k} \to 0^{+}

, Equation (A17) implies that

Γ_{k}^{(j)} \sim N (μ = 1, σ^{2} = m β_{k}) .

(A18)

Hence it holds that each

F_{h}^{(j)}

is normally distributed, with variance

m σ_{h}^{2} : = m \sum_{k} ω_{h k} β_{k}

—when considering the linear case (i.e., Assumptions 1 and 3). Analogously, also each

L_{h}^{(j)}

is normally distributed in the exponential case (i.e., Assumptions 2 and 4).

Considering the market factors—as well as the historical observations of default frequency—as normal random variables is relevant to prove the theorem, since it implies that the covariance matrix estimators

{\hat{c}}^{(L m)}

and

{\hat{c}}^{(E m)}

are Wishart distributed. Hence the variance associated to a given matrix element is

var [{\hat{c}}_{h h^{'}}^{(m)}] = \frac{m^{2}}{m \cdot n - 1} (ρ_{h h^{'}}^{2} + 1) σ_{h}^{2} σ_{h^{'}}^{2}

(A19)

in both linear and exponential cases. In the exponential case Equation (A19) is equivalent to the following

var [{\hat{c}}_{h h^{'}}^{(E m)}] = \frac{1}{m \cdot n - 1} [{(c_{h h^{'}}^{(E m)})}^{2} + c_{h h}^{(E m)} c_{h^{'} h^{'}}^{(E m)}]

(A20)

while the same is not true in the linear case. Given Equation (A19), it is possible to prove Equation (54) separately in the two cases.

Proof in the linear case. Proposition 2 implies

\begin{matrix} var [{\hat{A}}_{h h^{'}}^{(L, m)}] & = & \frac{1}{{(q_{h} q_{h^{'}})}^{2}} var [{({\hat{c}}_{h h^{'}}^{(L m)})}^{m}] = \frac{1}{{(q_{h} q_{h^{'}})}^{2}} [{(E [{({{\hat{c}}_{h h^{'}}^{(L m)})}^{2}])}^{m} -(E [{\hat{c}}_{h h^{'}}^{(L m)}])}^{2 m}] \\ = & \frac{1}{{(q_{h} q_{h^{'}})}^{2}} [{(var [{\hat{c}}_{h h^{'}}^{(L m)}] + {(E [{\hat{c}}_{h h^{'}}^{(L m)}])}^{2})}^{m} - {(E [{\hat{c}}_{h h^{'}}^{(L m)}])}^{2 m}] \end{matrix}

(A21)

In the limit

σ \to 0^{+}

the binomial above can be replaced with its leading term. Hence

var [{\hat{A}}_{h h^{'}}^{(L, m)}] = \frac{1}{{(q_{h} q_{h^{'}})}^{2}} var [{\hat{c}}_{h h^{'}}^{(L m)}] {(E [{\hat{c}}_{h h^{'}}^{(L m)}])}^{2 (m - 1)}

(A22)

By applying Equation (A19) we have

\begin{matrix} var [{\hat{A}}_{h h^{'}}^{(L, m)}] & = & \frac{1}{{(q_{h} q_{h^{'}})}^{2}} \frac{m^{2}}{m \cdot n - 1} (ρ_{h h^{'}}^{2} + 1) σ_{h}^{2} σ_{h^{'}}^{2} {(c_{h h^{'}}^{(L m)})}^{2 (m - 1)} \\ = & \frac{1}{{(q_{h} q_{h^{'}})}^{2}} \frac{1}{m \cdot n - 1} \frac{ρ_{h h^{'}}^{2} + 1}{ρ_{h h^{'}}^{2}} {(c_{h h^{'}}^{(L m)} - s_{h}^{(j)} s_{h^{'}}^{(j)})}^{2} {(c_{h h^{'}}^{(L m)})}^{2 (m - 1)} \end{matrix}

(A23)

Applying Proposition 2 once again we have

{(c_{h h^{'}}^{(L m)})}^{2 m} = {(c_{h h^{'}}^{(L 1)})}^{2}

. Furthermore, we have

s_{h}^{(j)} s_{h^{'}}^{(j)} {(c_{h h^{'}}^{(L m)})}^{2 (m - 1)} \overset{σ \to 0^{+}}{\to} {(s_{h}^{(j)} s_{h^{'}}^{(j)})}^{2 m} = s_{h}^{2} s_{h^{'}}^{2}

. Hence it holds

var [{\hat{A}}_{h h^{'}}^{(L, m)}] = \frac{1}{{(q_{h} q_{h^{'}})}^{2}} \frac{1}{m \cdot n - 1} \frac{ρ_{h h^{'}}^{2} + 1}{ρ_{h h^{'}}^{2}} [{(c_{h h^{'}}^{(L 1)})}^{2} - s_{h}^{2} s_{h^{'}}^{2}]

(A24)

and thus the ratio

var [{\hat{A}}_{h h^{'}}^{(L, m)}] / var [{\hat{A}}_{h h^{'}}^{(L, 1)}]

verifies Equation (54), completing the proof for the linear case.

Proof in the exponential case. Equation (A20) and Proposition 3 imply

\begin{matrix} var [{\hat{A}}_{h h^{'}}^{(E, m)}] & = & \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} var [{\hat{c}}_{h h^{'}}^{(E m)}] = \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} \frac{1}{m \cdot n - 1} [{(c_{h h^{'}}^{(E m)})}^{2} + c_{h h}^{(E m)} c_{h^{'} h^{'}}^{(E m)}] \\ = & \frac{1}{{(q_{h} q_{h^{'}})}^{2}} \frac{1}{m \cdot n - 1} [{(c_{h h^{'}}^{(E 1)})}^{2} + c_{h h}^{(E 1)} c_{h^{'} h^{'}}^{(E 1)}] \end{matrix}

(A25)

The latter implies that in case

m = 1

we have

var [{\hat{A}}_{h h^{'}}^{(E, 1)}] = \frac{1}{{(q_{h} q_{h^{'}})}^{2}} \frac{1}{n - 1} [{(c_{h h^{'}}^{(E 1)})}^{2} + c_{h h}^{(E 1)} c_{h^{'} h^{'}}^{(E 1)}]

(A26)

Hence, the ratio

var [{\hat{A}}_{h h^{'}}^{(E, m)}] / var [{\hat{A}}_{h h^{'}}^{(E, 1)}]

verifies Equation (54), completing the proof for the exponential case. □

Appendix B. Covariance Estimation Error in Presence of Autocorrelation

In this section a generalization of Equation (54) is provided, considering the presence of autocorrelation. Only the exponential case is discussed, because a closed form for

{\hat{A}}_{h h^{'}}^{(E, m)}

is still available when autocorrelation has to be considered—while only a second order approximation has been computed for the linear case

{\hat{A}}_{h h^{'}}^{(L, m)}

.

A comparison between Equations (41) and (49) allows us to generalize Proposition 3.

c_{h h^{'}}^{(E 1)} = m c_{h h^{'}}^{(E m)} + 2 \sum_{x = 1}^{m - 1} (m - x)_{x} c_{h h^{'}}^{(E m)}

(A27)

where

_{x} c_{h h^{'}}^{(E m)} : = cov [L_{h}^{(j)}, L_{h^{'}}^{(j + x)}]

(A28)

It holds by definition

\begin{matrix} c_{h h^{'}}^{(E m)} & = & \frac{q_{h} q_{h^{'}}}{m^{2}} \sum_{k = 1}^{K} ω_{h k} ω_{h^{'} k} var [Γ_{k}^{(j)}] \\ _{x} c_{h h^{'}}^{(E m)} & = & \frac{q_{h} q_{h^{'}}}{m^{2}} \sum_{k = 1}^{K} ω_{h k} ω_{h^{'} k} cov [Γ_{k}^{(j)}, Γ_{k}^{(j + x)}] \end{matrix}

Hence, Assumption 6 implies

E [_{x} {\hat{c}}_{h h^{'}}^{(E m)}] = E [\frac{q_{h} q_{h^{'}}}{m^{2}} \sum_{k = 1}^{K} ω_{h k} ω_{h^{'} k} ϱ_{x k} var [Γ_{k}^{(j)}]] =_{x} {\tilde{ϱ}}_{h h^{'}}

(A29)

where

_{x} {\tilde{ϱ}}_{h h^{'}} : = \frac{\sum_{k = 1}^{K} {\tilde{w}}_{k h h^{'}} ϱ_{x k}}{\sum_{k = 1}^{K} {\tilde{w}}_{k h h^{'}}}; {\tilde{w}}_{k h h^{'}} : = ω_{h k} ω_{h^{'} k} m ξ_{k}^{2} σ_{k}^{2}

(A30)

Furthermore, applying Equation (A19), it follows that

\begin{matrix} var [_{x} {\hat{c}}_{h h^{'}}^{(E m)}] & = & \frac{1}{m \cdot n - 1} (E {[_{x} {\hat{c}}_{h h^{'}}^{(E m)}]}^{2} + E [\hat{cov} [L_{h}^{(j)}, L_{h}^{(j)}]] E [\hat{cov} [L_{h^{'}}^{(j + x)}, L_{h^{'}}^{(j + x)}]]) \\ = & \frac{1}{m \cdot n - 1} (_{x} {\tilde{ϱ}}_{h h^{'}}^{2} {(c_{h h^{'}}^{(E m)})}^{2} + c_{h h}^{(E m)} c_{h^{'} h^{'}}^{(E m)}) \\ = & var [{\hat{c}}_{h h^{'}}^{(E m)}] - \frac{1 -_{x} {\tilde{ϱ}}_{h h^{'}}^{2}}{m \cdot n - 1} {(c_{h h^{'}}^{(E m)})}^{2} \end{matrix}

(A31)

Equation (A29) leads to another version of Equation (A27)

c_{h h^{'}}^{(E m)} = \frac{1}{m} {(1 + 2 \sum_{x = 1}^{m - 1} (1 - \frac{x}{m})_{x} {\tilde{ϱ}}_{h h^{'}})}^{- 1} c_{h h^{'}}^{(E 1)}

(A32)

From Equation (49) we have

var [{\hat{A}}_{h h^{'}}^{(E, m)}] = \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} var [{\hat{c}}_{h h^{'}}^{(E m)} + 2 \sum_{x = 1}^{m - 1} (1 - \frac{x}{m})_{x} {\hat{c}}_{h h^{'}}^{(E m)}]

(A33)

Equation (A33) implies that

var [{\hat{A}}_{h h^{'}}^{(E, m)}]

depends on the correlation matrix

ϱ_{x x^{'}}^{(\hat{c})}

among the considered covariance estimators

_{x} {\hat{c}}_{h h^{'}}^{(E m)}

(x = 0, 1, \dots)

, as shown below by choosing an equivalent expression for the RHS:

var [{\hat{A}}_{h h^{'}}^{(E, m)}] = \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} \sum_{x, x^{'} = 0}^{m - 1} ϱ_{x x^{'}}^{(\hat{c})}_{x} s_{h h^{'}}_{x^{'}} s_{h h^{'}}

(A34)

where

_{x} s_{h h^{'}} : = (2 - δ_{0 x}) (1 - \frac{x}{m}) {(var [_{x} {\hat{c}}_{h h^{'}}^{(E m)}])}^{\frac{1}{2}}

(A35)

In case the covariance estimators are independent from each other (i.e.

ϱ_{x x^{'}}^{(\hat{c})} = δ_{x x^{'}}

), an inferior limit to the considered variance is obtained

var [{\hat{A}}_{h h^{'}}^{(E, m)}] \geq \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} (var [{\hat{c}}_{h h^{'}}^{(E m)}] + 4 \sum_{x = 1}^{m - 1} {(1 - \frac{x}{m})}^{2} var [_{x} {\hat{c}}_{h h^{'}}^{(E m)}])

(A36)

Equation (A31) can be substituted into Equation (A34). Hence, RHS of inequality (A36) becomes

var [{\hat{c}}_{h h^{'}}^{(E m)}] \frac{m^{2}}{{(q_{h} q_{h^{'}})}^{2}} [1 + 4 \sum_{x = 1}^{m - 1} {(1 - \frac{x}{m})}^{2} (1 - \frac{1 -_{x} {\tilde{ϱ}}_{h h^{'}}^{2}}{m \cdot n - 1} {cv}^{- 2} [{\hat{c}}_{h h^{'}}^{(E m)}])]

where the notation

cv [\cdot]

stands for the coefficient of variation.

{(c_{h h^{'}}^{(E m)})}^{2}

and

var [{\hat{c}}_{h h^{'}}^{(E m)}]

can be expressed by using Equation (A32). Hence Equation (A36) can be used to estimate an inferior limit to

ε [{\hat{A}}_{h h^{'}}^{(E, m)}]

in the gaussian regime. A superior limit for the same quantity can be computed as well, imposing

ϱ_{x x^{'}}^{(\hat{c})} = 1

for each considered

x, x^{'}

.

Remark A1.

Equation (A34) does not converge to (A25) in the limit

ϱ_{x k} \to 0 \Rightarrow_{x} {\tilde{ϱ}}_{h h^{'}} \to 0

. This copes with the fact that assuming

ϱ_{x k} = 0

in Equation (A25) implies a lesser error than measuring it.

References

Crouhy, M.; Galai, D.; Mark, R. A comparative analysis of current credit risk models. J. Bank. Financ. 2000, 24, 59–117. [Google Scholar] [CrossRef]
Murphy, D. Unravelling the Credit Crunch; CRC Press: Boca Raton, FL, USA, 2009. [Google Scholar]
Schönbucher, P.J. Credit Derivatives Pricing Models: Model, Pricing and Implementation; Wiley: Hoboken, NJ, USA, 2003. [Google Scholar]
Fréchet, M. Sur les Tableaux de Corrélation Dont les Marges Sont Donnés; Annales de l’Université de Lyon, Science: Lyon, France, 1951; Volume 4, pp. 13–84. [Google Scholar]
Sklar, A. Fonctions de Répartition à n Dimensions et Leurs Marges; Institut Statistique de l’Université de Paris: Paris, France, 1951; Volume 8, pp. 229–231. [Google Scholar]
Joe, H. Multivariate Models and Dependence Concepts; Chapman and Hall/CRC: Boca Raton, FL, USA, 1997. [Google Scholar]
Nelsen, R.B. Introduction to Copulas, 1st ed.; Springer: Berlin/Heidelberg, Germany, 1999. [Google Scholar]
Li, D.X. On Default Correlation: A Copula Function Approach. J. Fixed Income 2000, 9, 43–54. [Google Scholar] [CrossRef]
Mai, J.-F.; Scherer, M. Simulating Copulas: Stochastic Models, Sampling Algorithms, and Applications, 2nd ed.; World Scientific Publishing Company: Singapore, 2017. [Google Scholar]
The Bad Loan Rate Series Is Labelled as TRI30496_35120163. The Count of Performing Borrowers at the Initial Period Series Is Labelled as TRI30496_351122141. Bank of Italy Statistical Database. Available online: https://infostat.bancaditalia.it/inquiry/ (accessed on 1 May 2021).
Credit Suisse First Boston. CreditRisk⁺, a Credit Risk Management Framework; Credit Suisse First Boston: London, UK, 1998. [Google Scholar]
Passalacqua, L. A Pricing Model for Credit Insurance; Giornale Dell’Istituto Italiano Degli Attuari: Rome, Italy, 2006; Volume LXIX, pp. 1–37. [Google Scholar]
Passalacqua, L. Measuring Effects of Excess-of-Loss Reinsurance on Credit Insurance Risk Capital; Giornale Dell’Istituto Italiano Degli Attuari: Rome, Italy, 2006; Volume LXX, pp. 81–102. [Google Scholar]
Vandendorpe, A.; Ho, N.D.; Vanduffel, S.; Van Dooren, P. On the parameterization of the CreditRisk⁺ model for estimating credit portfolio risk. Insur. Math. Econ. 2008, 42, 736–745. [Google Scholar] [CrossRef]
Wilde, T. CreditRisk⁺. In Encyclopedia of Quantitative Finance; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
Gundlach, M.; Lehrbass, F. (Eds.) CreditRisk⁺ in the Banking Industry; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
Klugman, S.A.; Panjer, H.H.; Willmot, G.E. Loss Models: From Data to Decisions; Wiley: Hoboken, NJ, USA, 2012. [Google Scholar]
Glasserman, P.; Li, J. Importance Sampling for Portfolio Credit Risk. Manag. Sci. 2005, 51, 1643–1656. [Google Scholar] [CrossRef] [Green Version]
McNeil, A.; Frey, R.; Embrechts, P. Quantitative Risk Management; Princeton University Press: Princeton, NJ, USA, 2015. [Google Scholar]
Kotz, S.; Adams, J.W. Distribution of Sum of Identically Distributed Exponentially Correlated Gamma-Variables. Ann. Math. Stat. 1964, 35, 277–283. [Google Scholar] [CrossRef]
Mathai, A.M.; Moschopoulos, P.G. A Form of Multivariate Gamma Distribution. Ann. Inst. Stat. Math. 1992, 44, 97–106. [Google Scholar] [CrossRef]
Florent, C.; Borgnat, P.; Tourneret, J.; Abry, P. Parameter estimation for sums of correlated gamma random variables. Application to anomaly detection in Internet Traffic. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP-08, Las Vegas, NV, USA, 31 March–4 April 2008. [Google Scholar]
Feng, Y.; Wen, M.; Zhang, J.; Ji, F.; Ning, G. Sum of arbitrarily correlated Gamma random variables with unequal parameters and its application in wireless communications. In Proceedings of the IEEE 2016 International Conference on Computing, Networking and Communications (ICNC), Kauai, HI, USA, 15–18 February 2016. [Google Scholar] [CrossRef]
Non-Performing Loans (NPLs) in Italy’s Banking System. 2017. Available online: https://www.bancaditalia.it/media/views/2017/npl/ (accessed on 1 May 2021).
Higham, N. Computing the nearest correlation matrix—A problem from finance. IMA J. Numer. Anal. 2002, 22, 329–343. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Precision gain

ε_{12}^{(m)}

, as a function of m and

i_{σ}

. The left and right plots show the values of

ε [{\hat{A}}_{12}^{(E, m)}]

and

ε [{\hat{A}}_{12}^{(L, m)}]

respectively, as a function of m, for each volatility scenario (

i_{σ} = 0, \dots, 6

), each depicted with darker to brighter curves, in the

n_{h} = \infty

assumption. The red curve is the theoretical value of

ε [{\hat{A}}_{12}^{(m)}]

in the Gaussian regime.

Figure 1. Precision gain

ε_{12}^{(m)}

, as a function of m and

i_{σ}

. The left and right plots show the values of

ε [{\hat{A}}_{12}^{(E, m)}]

and

ε [{\hat{A}}_{12}^{(L, m)}]

respectively, as a function of m, for each volatility scenario (

i_{σ} = 0, \dots, 6

), each depicted with darker to brighter curves, in the

n_{h} = \infty

assumption. The red curve is the theoretical value of

ε [{\hat{A}}_{12}^{(m)}]

in the Gaussian regime.

Figure 2.

ε [{\hat{A}}_{12}^{(E, m)}]

and

ε [{\hat{A}}_{12}^{(L, m)}]

as a function of m, considering increasing

i_{σ}

(from darker to brighter curve) and

n_{h} = 10^{3}

. The red curve is the theoretical value of

ε [{\hat{A}}_{12}^{(m)}]

as a function of m in the Gaussian regime. For

σ_{1}, σ_{2} ≪ 1

the analytical result is perfectly satisfied. However,

ε [{\hat{A}}_{12}^{(L, m)}]

is shown to be a decreasing function of m in general. Comparing this result with the

n_{h} = \infty

case, we can state that

ε [{\hat{A}}_{h h^{'}}^{(L, m)}]

is almost insensitive to

n_{h}

(

h = 1, 2

).

Figure 2.

ε [{\hat{A}}_{12}^{(E, m)}]

and

ε [{\hat{A}}_{12}^{(L, m)}]

as a function of m, considering increasing

i_{σ}

(from darker to brighter curve) and

n_{h} = 10^{3}

. The red curve is the theoretical value of

ε [{\hat{A}}_{12}^{(m)}]

as a function of m in the Gaussian regime. For

σ_{1}, σ_{2} ≪ 1

the analytical result is perfectly satisfied. However,

ε [{\hat{A}}_{12}^{(L, m)}]

is shown to be a decreasing function of m in general. Comparing this result with the

n_{h} = \infty

case, we can state that

ε [{\hat{A}}_{h h^{'}}^{(L, m)}]

is almost insensitive to

n_{h}

(

h = 1, 2

).

Figure 3. Boxplot of

{\hat{A}}_{12}^{(E, m)}

and

{\hat{A}}_{12}^{(L, m)}

distributions, as a function of m. The red horizontal line represent the true value of

A_{12}

and the blue X’s stand for the average value of

{\hat{A}}_{12}^{(m)}

.

Figure 3. Boxplot of

{\hat{A}}_{12}^{(E, m)}

and

{\hat{A}}_{12}^{(L, m)}

distributions, as a function of m. The red horizontal line represent the true value of

A_{12}

and the blue X’s stand for the average value of

{\hat{A}}_{12}^{(m)}

.

Figure 4.

σ [{\hat{A}}_{12}^{(E, m)}]

and

σ [{\hat{A}}_{12}^{(L, m)}]

as a function of

n_{h}

. Decreasing m values are considered from darker to brighter curve.

Figure 4.

σ [{\hat{A}}_{12}^{(E, m)}]

and

σ [{\hat{A}}_{12}^{(L, m)}]

as a function of

n_{h}

. Decreasing m values are considered from darker to brighter curve.

Figure 5. Precision gain in presence of autocorrelation.

ε [{\hat{A}}_{12}^{(E, m)}]

(exact—left panels) and

ε [{\hat{A}}_{12}^{(L, m)}]

(2nd order approximation—right panels), for each volatility scenario (

i_{σ} = 0, \dots, 6

, depicted with darker to brighter curves), for

ρ = 0.05

(top) and

ρ = 0.5

(bottom). The yellow area includes all the values between the maximum (dashed red line) and the minimum (solid red line) expected from the results of Appendix B. The frontier

ε = 1

(dotted line) allows to check the presence of a precision gain at

m > 1

.

Figure 5. Precision gain in presence of autocorrelation.

ε [{\hat{A}}_{12}^{(E, m)}]

(exact—left panels) and

ε [{\hat{A}}_{12}^{(L, m)}]

(2nd order approximation—right panels), for each volatility scenario (

i_{σ} = 0, \dots, 6

, depicted with darker to brighter curves), for

ρ = 0.05

(top) and

ρ = 0.5

(bottom). The yellow area includes all the values between the maximum (dashed red line) and the minimum (solid red line) expected from the results of Appendix B. The frontier

ε = 1

(dotted line) allows to check the presence of a precision gain at

m > 1

.

Table 1. Matrix of weights used for the numerical simulations.

k	0	1	2
$ω_{1 k}$	0.30	0.40	0.30
$ω_{2 k}$	0.50	0.25	0.25

Table 2. Definition of the clusters

h = 1, \dots, 6

used in data set TRI30496.

Table 2. Definition of the clusters

h = 1, \dots, 6

used in data set TRI30496.

Cluster Index h	Sector Code	Description
1	600	Consumer households
2	S11	Non-financial companies
3	S12BI7	Financial companies other than monetary financial institutions
4	S13	General government
5	S14BI4	Producer households
6	S15BI1	Non-profit institutions serving households and unclassifiable units

Table 3. Main features of the considered historical time series over the period 1 January 2013–31 December 2017.

{\bar{p}}_{h}

(

h = 1, \dots, 6

) is the yearly average bad loan rate;

σ_{h}

is the volatility associated to each

{\bar{p}}_{h}

;

〈n_{h}〉

is the average number of borrowers.

Table 3. Main features of the considered historical time series over the period 1 January 2013–31 December 2017.

{\bar{p}}_{h}

(

h = 1, \dots, 6

) is the yearly average bad loan rate;

σ_{h}

is the volatility associated to each

{\bar{p}}_{h}

;

〈n_{h}〉

is the average number of borrowers.

h	1	2	3	4	5	6
${\bar{p}}_{h}$	0.0119	0.0352	0.0255	0.0056	0.0259	0.0088
$σ_{h}$	0.0010	0.0042	0.0023	0.0014	0.0022	0.0010
$〈n_{h}〉$	269,515	407,602	3191	5416	132,179	4020

Table 4. Values of

{\hat{A}}^{(E, m)} (0, 1)

(

m = 4

left,

m = 1

right) obtained from the quarterly historical series TRI30496 over the period 1 January 2013–31 December 2017. Results are expressed in

10^{- 2}

units.

Table 4. Values of

{\hat{A}}^{(E, m)} (0, 1)

(

m = 4

left,

m = 1

right) obtained from the quarterly historical series TRI30496 over the period 1 January 2013–31 December 2017. Results are expressed in

10^{- 2}

units.

0.53	0.28	0.33	0.36	0.41	0.48	0.68	0.40	0.36	1.01	0.56	0.73
0.28	0.59	0.48	0.61	0.43	0.40	0.40	1.50	0.98	1.26	0.87	−0.16
0.33	0.48	0.67	0.52	0.43	0.40	0.36	0.98	0.87	0.78	0.72	0.27
0.36	0.61	0.52	7.80	0.48	0.33	1.01	1.26	0.78	6.50	1.10	0.66
0.41	0.43	0.43	0.48	0.47	0.54	0.56	0.87	0.72	1.10	0.74	0.47
0.48	0.40	0.40	0.33	0.54	1.53	0.73	−0.16	0.27	0.66	0.47	1.35

Table 5. The elementwise precision gain

ε [{\hat{A}}^{(E, 4)} (0, 1)]

associated with results reported in Table 4.

Table 5. The elementwise precision gain

ε [{\hat{A}}^{(E, 4)} (0, 1)]

associated with results reported in Table 4.

0.36	0.26	0.37	0.41	0.33	0.39
0.26	0.18	0.24	0.30	0.23	0.33
0.37	0.24	0.35	0.43	0.30	0.45
0.41	0.30	0.43	0.55	0.37	0.52
0.33	0.23	0.30	0.37	0.29	0.42
0.39	0.33	0.45	0.52	0.42	0.52

Table 6.

σ [{\hat{A}}^{(E, m)} (0, 1)]

(

m = 4

left,

m = 1

right). These are the elementwise errors of the estimators reported in Table 4. The results above are expressed in

10^{- 2}

units.

Table 6.

σ [{\hat{A}}^{(E, m)} (0, 1)]

(

m = 4

left,

m = 1

right). These are the elementwise errors of the estimators reported in Table 4. The results above are expressed in

10^{- 2}

units.

0.11	0.12	0.19	0.44	0.11	0.31	0.41	0.25	0.41	1.09	0.24	0.64
0.12	0.20	0.29	0.64	0.13	0.40	0.25	0.86	0.72	1.47	0.50	0.97
0.19	0.29	1.38	1.08	0.21	0.69	0.41	0.72	1.75	2.31	0.49	1.53
0.44	0.64	1.08	5.12	0.49	1.60	1.09	1.47	2.31	9.11	1.20	3.40
0.11	0.13	0.21	0.49	0.13	0.34	0.24	0.50	0.49	1.20	0.29	0.79
0.31	0.40	0.69	1.60	0.34	3.25	0.64	0.97	1.53	3.40	0.79	4.65

Table 7. Set of eigenvalues

{\tilde{σ}}_{k}

and eigenvectors

{\tilde{ω}}_{k}

obtained by the eigenvalues decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

, as reported in Table 4.

Table 7. Set of eigenvalues

{\tilde{σ}}_{k}

and eigenvectors

{\tilde{ω}}_{k}

obtained by the eigenvalues decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

, as reported in Table 4.

${\tilde{ω}}_{k}$	k = 1	k = 2	k = 3	k = 4	k = 5	k = 6
	0.06	−0.34	0.13	0.84	0.00	0.39
	0.10	−0.33	0.43	−0.38	−0.65	0.36
	0.09	−0.36	0.52	−0.26	0.72	0.06
	0.98	0.17	−0.07	0.01	0.01	0.00
	0.08	−0.38	0.22	0.19	−0.22	−0.84
	0.07	−0.68	−0.69	−0.20	0.06	0.07
${\tilde{σ}}_{k}^{2}$	0.08	0.02	0.01	2.9 · $10^{- 3}$	1.4 · $10^{- 3}$	0.3 · $10^{- 3}$

Table 8. The complete set of parameters

\hat{W}, {\hat{σ}}_{Γ}^{2}

necessary to specify the dependence structure in CreditRisk

^{+}

model, obtained by the eigenvalues decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

, as reported in Table 4.

Table 8. The complete set of parameters

\hat{W}, {\hat{σ}}_{Γ}^{2}

necessary to specify the dependence structure in CreditRisk

^{+}

model, obtained by the eigenvalues decomposition of

{\hat{A}}^{(E, 4)} (0, 1)

, as reported in Table 4.

k	0	1	2	3
$ω_{1 k}$	0.67	0.04	0.29	0.00
$ω_{2 k}$	0.07	0.07	0.27	0.59
$ω_{3 k}$	0.00	0.06	0.28	0.66
$ω_{4 k}$	0.13	0.87	0.00	0.00
$ω_{5 k}$	0.63	0.06	0.31	0.00
$ω_{6 k}$	0.29	0.04	0.67	0.00
$σ_{k}^{2}$		0.103	0.031	0.010

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Giacomelli, J.; Passalacqua, L. Calibrating the CreditRisk⁺ Model at Different Time Scales and in Presence of Temporal Autocorrelation †. Mathematics 2021, 9, 1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

AMA Style

Giacomelli J, Passalacqua L. Calibrating the CreditRisk⁺ Model at Different Time Scales and in Presence of Temporal Autocorrelation †. Mathematics. 2021; 9(14):1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

Chicago/Turabian Style

Giacomelli, Jacopo, and Luca Passalacqua. 2021. "Calibrating the CreditRisk⁺ Model at Different Time Scales and in Presence of Temporal Autocorrelation †" Mathematics 9, no. 14: 1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation †

Abstract

1. Introduction

2. The CreditRisk+ Model

3. CreditRisk+ Using Multiple Unwind Periods

3.1. The Single Unwind Period Case

3.2. The Multiple Unwind Periods Case

3.3. Internal Consistency and Autocorrelation in Time Series

4. Calibration of the Structure of Dependence

4.1. The Single Unwind Period Case

4.2. The Multiple Unwind Period Case

4.3. The Exponential Case

4.4. Handling Autocorrelated Time Series in Calibration

5. The Advantage of a Short Sampling Period

5.1. Precision of A ^ at Different Time Scales under the Gaussian Regime

5.2. Beyond the Gaussian Regime: Numerical Simulations

5.3. Estimation Error in Presence of Autocorrelation

6. An Application to Market Data

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A. Proofs

Appendix A.1. Proof of Theorem 1

Appendix A.2. Proof of Theorem 2

Appendix A.3. Proof of Proposition 2

Appendix A.4. Proof of Proposition 3

Appendix A.5. Proof of Theorem 3

Appendix B. Covariance Estimation Error in Presence of Autocorrelation

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Calibrating the CreditRisk⁺ Model at Different Time Scales and in Presence of Temporal Autocorrelation †

2. The CreditRisk⁺ Model

3. CreditRisk⁺ Using Multiple Unwind Periods

5.1. Precision of $\hat{A}$ at Different Time Scales under the Gaussian Regime