Next Article in Journal
Self-Expressive Kernel Subspace Clustering Algorithm for Categorical Data with Embedded Feature Selection
Next Article in Special Issue
The Link between Corporate Reputation and Financial Performance and Equilibrium within the Airline Industry
Previous Article in Journal
On Sharp Oscillation Criteria for General Third-Order Delay Differential Equations
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation

by
Jacopo Giacomelli
1,2,*,† and
Luca Passalacqua
2,*
1
SACE S.p.A., Piazza Poli 42, 00187 Rome, Italy
2
Department of Statistics, Sapienza University of Rome, Viale Regina Elena 295, 00161 Rome, Italy
*
Authors to whom correspondence should be addressed.
The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of SACE S.p.A.
Submission received: 4 June 2021 / Revised: 9 July 2021 / Accepted: 13 July 2021 / Published: 16 July 2021

Abstract

:
The CreditRisk + model is one of the industry standards for the valuation of default risk in credit loans portfolios. The calibration of CreditRisk + requires, inter alia, the specification of the parameters describing the structure of dependence among default events. This work addresses the calibration of these parameters. In particular, we study the dependence of the calibration procedure on the sampling period of the default rate time series, that might be different from the time horizon onto which the model is used for forecasting, as it is often the case in real life applications. The case of autocorrelated time series and the role of the statistical error as a function of the time series period are also discussed. The findings of the proposed calibration technique are illustrated with the support of an application to real data.
MSC:
62F25; 62H12; 62H25; 62M10; 62P05
JEL Classifications:
C38; C51; G21; G22

1. Introduction

While the development of modern portfolio credit risk models started in the 1980–1990 decade [1] within the framework of the Basel Accords, it is with the great credit crisis of 2008 [2] that increasing attention started to be paid to the precise determination of the structure of dependence among default events. It is well established [3] that tails of the distribution of the value of asset/liabilities portfolios are dominated by the structure of dependence rather than by the other fundamental components of credit risk (i.e., the marginal probability and the severity associated with each future default event). The vast research interest in modeling the structure of dependence resulted in the formalization of the so-called copula theory [4,5]. This “language” was explicitly adopted by the second generation of portfolio credit models to describe the dependence among loss events [6,7,8,9].
In this regard, the calibration issues raised by a particular structure of dependence (or, equivalently, the corresponding copula) can be as important as the choice of the structure itself. Generally, calibrating the dependence structure of a portfolio model is a demanding task, given the large number of parameters needed to provide a realistic description of the modeled dependencies, and considering that, on the other hand, historical data are usually not numerous enough to fill the sample space in a way sufficient for a precise estimation of the parameters.
In this work, we address a typical real-life problem: how to choose the frequency of the historical time series of default used to calibrate a classic credit portfolio model, CreditRisk + , in order to provide the most accurate estimation of the structure of dependence parameters, or, in other words, how the calibration error “scales” with the time series frequency. This problem is especially relevant for all the cases when the debtors underlying a credit portfolio are small/medium enterprises. The lack of market information, such as CDS spread, stock price, or bond yield, forces to calibrate the model using a reduced-form approach based on historical cluster data, such as default rate time series associated with the economic sector of each debtor. This case is typical in activities such as credit insurance, surety, and factoring. In most cases, publicly available time series have a sampling period ranging from one to three months (e.g., [10]), while the calibrated CreditRisk + model is used on a projection horizon that is at least one year long (e.g., the unwind period required to quantify a capital requirement both in Solvency 2 and in Basel 3 regulatory frameworks).
CreditRisk + [11], disclosed in 1997, belongs to the first generation of portfolio credit risk models of “actuarial inspiration”. Applications of CreditRisk + to the credit insurance sector are documented in the literature well before the 2008 financial credit crisis [12,13], while research activity is still ongoing in the area of actuarial science [14]. At present, CreditRisk + is still one of the financial and actuarial industry standards for the assessment of credit risk in portfolios of financial loans or credit/suretyship policies.
Despite the vast research activity on this model and its calibration, the issue of using two different time scales for calibration and projection remains not investigated to date. The research conducted to date on the calibration of CreditRisk + [14] has addressed the issues related to the decomposition of a given covariance matrix among the time series, which is the final necessary step to complete the calibration of the model. However, the covariance matrix is obtained by the “classical” estimator, under the assumption that the sampling period of the time series and the projection horizon are equal.
This work shows that calibrating the model at a shorter time scale than the projection horizon is possible, nontrivial, and convenient. The internal consistency of the CreditRisk + assumptions when simultaneously imposed at different time scales has been proved and guarantees that the investigated calibration mode is not ill-posed. However, the form of the covariance estimator needed to obtain a set of parameters coherent with a specific projection horizon, using time series with a smaller sampling period, depends on the two chosen time scales. Indeed, the proposed estimator coincides with the classical one only when calibration and projection time scales are equal. Finally, we show that calibrating at a smaller time scale than the projection one provides a more precise estimation of the model parameters. The estimation error and its dependence on the difference between the two time scales are discussed.
The article is organized as follows. In Section 2, we summarize assumptions and features of the CreditRisk + model. In Section 3, we discuss the internal consistency of the model assumptions when imposing them to be simultaneously true at different time horizons. The calibration of the model parameters, which define the dependence structure, is considered in Section 4. The different degree of precision of the estimators defined at increasing time scales is discussed in Section 5. The techniques introduced in this work are applied to a real-world case study in Section 6. The main results are summarized in Section 7.

2. The CreditRisk+ Model

The CreditRisk + model is a portfolio model developed by Credit Suisse First Boston (CSFB) by Tom Wilde [15] and coworkers, first documented in [11] and later widely discussed in [16]. It is a model actuarially inspired in the sense that losses are due only to default events and not to other sources of financial risk, e.g., variation of the credit standing (the so-called “credit migration” effect). CreditRisk + can be classified as a frequency–severity model, cast in a single-period framework, with the peculiarity that a doubly-stochastic process (i.e., the Poisson–Gamma mixture) describes the frequency of default events. Loss severity is assumed to be deterministic, although this ansatz can be easily relaxed at the cost of some additional computational burden. However, severity-related issues can be neglected for what follows.
The structure of dependence of default events is described using a factor model framework, where factors are unobservable (i.e., latent) stochastic “market” variables, whose precise financial/actuarial identification is irrelevant since the model integrates on all possible realizations (“market scenarios”). Therefore, CreditRisk + can be further classified into the family of factor models and, in particular, into the subfamily of conditionally independent factor models, since, conditionally on the values assumed by the factors, defaults are supposed (by the model) to be independent.
The structure of the model can be summarized as follows. Let N be the number of different risks in a given portfolio and 1 I i the default indicator function of the i-th risk ( i = 1 , , N ) over the time horizon ( t , T ] . The indicator function 1 I i is a Bernoulli random variable such that
E 1 I i = q i , var 1 I i = q i ( 1 q i ) , i = 1 , , N .
The “portfolio loss” L over the reference time horizon ( t , T ) is then given by
L = i = 1 N 1 I i E i
where each exposure E i is supposed to be deterministic.
In order to ease the semianalytic computation of the distribution of L, the model introduces a new set of variables Y i , each replacing the corresponding indicator function 1 I i ( i = 1 , , N ). The new variables Y i are supposed to be Poisson-distributed, conditionally on the value assumed by the market latent variables.
Assumption 1
(CreditRisk + distributional assumption). Given a time horizon ( t , T ] and a set of N risky debtors, the number Y i of insolvency events generated by each i-th debtor over ( t , T ] is distributed as follows:
Y i Poisson p i ( Γ ) , p i ( Γ ) : = q i · ω i 0 + k = 1 K ω i k Γ k
where Γ = ( Γ 1 Γ K ) R + K is an array of independent r.v.’s such that
Γ k Gamma β k 1 , β k , β k R +
and the factor loadings ω i k are supposed to be all non-negative and to sum up to unity:
ω i k 0 , i = 1 , , N , k = 0 , , K , k = 0 K ω i k = 1 , i = 1 , , N .
The Γ parameters set { β 1 β K } is equivalent to the classical shape-scale parameterization { α k , β k } of each Gamma distributed r.v. Γ k , after having imposed the assumption E Γ k = 1 , that is stated in the original formulation of the CreditRisk + model. Hence, the k-th scale parameter β k is equal to the variance σ k 2 of Γ k . Given the independence among Γ k ’s, the covariance matrix Σ takes the form
Σ : = cov Γ = diag σ 1 2 σ K 2 = diag β 1 β K
Assumption 1 implies that q i is the unconditional expected default frequency
q i = E p i ( Γ ) = R + K p i ( Γ ) f Γ d Γ 1 d Γ k ,
where
f x = k = 1 K x k α k 1 β k α k Γ ( α k ) e x k / β k , x k 0 , α k , β k > 0 ,
and that the identity between the expected values of the original Bernoulli variable 1 I i and the new Poisson variable Y i is granted:
E Y i = E 1 I i = q i .
The portfolio loss is now represented by the r.v. L Y
L Y = i = 1 N Y i · E i , where Y i | Γ Poisson p i ( Γ ) .
In [11], the distribution of L Y is obtained by using a recursive method, further described in [17]. The accuracy, stability, and possible variants of the original algorithm are discussed in [16]. The same distribution can be easily computed through Monte Carlo simulation due to the availability of a dedicated importance sampling algorithm in [18].
Notice that, although the distributions of L and L Y differ, the expected value of the portfolio loss is the same E L = E L Y .
In the language of copula functions, the structure of dependence implied by (3) corresponds [19] to a multivariate Clayton copula, i.e., an Archimedean copula where latent variables are Gamma-distributed (for the relation between Archimedean copula functions and factor models see, e.g., ([9] [§2.1])). The copula parameters are the factor loadings ω i k and they can be gathered, taking into account the normalization condition stated in Assumption 1, in an N × K matrix Ω :
Ω : = ω 11 ω 1 K ω N 1 ω N K ,
which is, for typical values of N and K, much smaller than the N × N covariance matrix between the default indicators 1 I .
Remark 1.
This work is specifically focused on improving the estimation of the CreditRisk + copula parameters { Ω , Σ } . Further investigations on the properties of CreditRisk + dependence structure, apart from those needed for the estimation improvement, and its comparison with the other copulae are beyond the scope of this study.
As shown in [14], it holds
cov Y i , Y j = q i q j k = 1 K ω i k ω j k σ k 2 + δ i j q i ,
where δ i j is the Kronecker delta. Equation (12) allows the calibration of the factor loadings, and thus of the dependence structure of the CreditRisk + model, by matching the observed covariance matrix of historical default time series with model values. However, since the model is defined in a single-period framework, with a reference “forecasting” time horizon ( t , T ] , that is typically of 1 year, i.e., T = t + 1 , it is not a priori evident how to use historical time series with a different frequency (e.g., quarterly) in a consistent way, when calibrating the model parameters. Naively, it is reasonable to expect that the larger the information provided by the historical time series (i.e., the higher the frequency), the better the calibration. This issue is addressed in the next sections.

3. CreditRisk+ Using Multiple Unwind Periods

The original CreditRisk + formulation, summarized in Assumption 1, defines the model in a uniperiodal framework, where only one time scale T t is considered. In this section, we discuss the internal consistency of the model assumption when imposing it more than once at distinct time scales. In this context, the expression “internal consistency” means that it is possible and well-posed imposing Assumption 1 to be true at two distinct time scales. The same applies also considering a slightly modified version of the CreditRisk + framework (i.e., imposing Assumption 2, introduced in the following, instead of Assumption 1).
Extending the original CreditRisk + formulation to a multiperiod framework enables the calibration of the model considering a time scale different from the one chosen for its application. The results presented in this section are applied in the next Section 4 to estimate the elements of the matrix
A : = Ω T Σ Ω .
Estimating A is a fundamental step in order to complete the calibration of the model. In Section 4 estimators are defined using historical series sampled with a period that is not necessarily equal to the projection horizon on which Σ and Ω are defined. Section 5 shows the convenience of choosing a sampling period shorter than the projection horizon in order to evaluate A ^ .

3.1. The Single Unwind Period Case

As discussed in Section 2, in CreditRisk + each risk (i.e., debtor) is modeled by a Poisson distributed r.v. Y i , although the Bernoulli distribution is the natural choice to represent absorbing events, such as default. Assumption 1 is convenient in terms of analytical tractability since L Y distribution can be computed through a semianalytical method. However, in order to address the problem of calibrating CreditRisk + in a “roll-over” framework, defined by an arbitrary set of time intervals, it is useful to recover the Bernoulli representation of each debtor by introducing a new r.v. Y ˜ i : = 1 I Y i > 0 .
Both the r.v. Y i and its distribution parameter p i ( Γ ) can take values larger than 1. This is formally correct, given that Y i Poisson p i ( Γ ) , despite not coping with the representation of absorbing events, that can occur at most once by definition. The so-called “Poisson approximation”, introduced by substituting 1 I i with Y i , is numerically sound as q i approaches to zero—a condition that is well fulfilled in most real world relevant cases.
Indeed, Assumption 1 implies that Y ˜ i | Γ Bernoulli ( p ˜ i ( Γ ) ) where the distribution parameter is
p ˜ i ( Γ ) = Prob ( Y i > 0 | Γ ) = 1 exp q i ω i 0 + k = 1 K ω i k Γ k .
It holds by construction
E Y ˜ i = R + K p ˜ i ( Γ ) f ( Γ ) d Γ 1 d Γ K .
Computing the integral in (15) and then approximating the term exp [ q i ω 0 ] with its second order Taylor series centered at q i = 0 leads to the following result.
Proposition 1
(Asymptotic equivalence between Bernoulli and Poisson representation of risks). Let Y ˜ i : = 1 I Y i > 0 where Y i is distributed according to Assumption 1. Then
q ˜ i : = E Y ˜ i = 1 e q i ω i 0 k = 1 K 1 + q i ω i k σ k 2 1 / σ k 2 .
Further,
q ˜ i = q i + O ( q i 2 ) q i 0 + q i .
Proposition 1 implies that E [ L Y i ] E [ L Y ˜ i ] , provided that q i 1 . Moreover, the same result enables also the exact satisfaction of E L Y i = E L Y ˜ i , in case the stochastic parameter p ˜ i ( Γ ) is redefined through the substitution q i q i , where q i verifies the following modified version of (16):
E Y ˜ i ( q i ) = 1 e q i ω i 0 k = 1 K 1 + q i ω i k σ k 2 1 / σ k 2 = q i = E Y i .
It is worth noticing that the substitution 1 I i Y i discussed in Section 2 implies the preservation of the expected value E L = E L Y due to the fact that it is done before the introduction of the market factors Γ . On the other hand, restoring the Bernoulli representation of each risk after having introduced the dependence structure requires the results presented in Proposition 1.
Proposition 1 permits the introduction of a slightly modified version of the CreditRisk + model that is asymptotically equivalent to the original one stated in Assumption 1. The equivalence between the two models is further analyzed in the next sections.
Assumption 2
(Modified CreditRisk + distributional assumption). Given a time horizon ( t , T ] and a set of N risky debtors, the number of insolvency events generated by each i-th debtor over ( t , T ] is represented by the r.v. Y ˜ i Bernoulli ( p ˜ i ( Γ ) ) , where the distribution parameter p ˜ i ( Γ ) satisfies (14). Assumptions on market factors Γ and factor loadings Ω remain the same stated in Assumption 1.
In Assumption 2 the linear dependence of the parameters p i ( Γ ) from the latent variables has been replaced with a log link function. Thus, the modified version of CreditRisk + is also referred to as “exponential” in the following.

3.2. The Multiple Unwind Periods Case

This section investigates the consequences of imposing the internal consistency of Assumption 1 or Assumption 2 at distinct time scales. Assumptions 3 and 4 are introduced hereinafter, in order to specify the family of parameters that have to be considered at the distinct time intervals where the model is applied.
The following assumption guarantees the internal consistency at different time scales of the classical CreditRisk + model, defined in Assumption 1.
Assumption 3
(CreditRisk + parameters at different time scales). Let t t 0 , t 1 , , t m T be a partition of the time interval ( t , T ] . Let Assumption 1 be satisfied over each j-th interval ( t j 1 , t j ] by the set { Y i ( j ) } ( i = 1 N ), where Y i ( j ) is the r.v. representing the i-th risk observed during the j-th interval and the following holds for the associated set { q i ( j ) ; Γ ( j ) ; Ω ( j ) } of parameters and market factors:
q i ( j ) = q i t j t j 1 T t = constant ,
Γ k ( j ) Gamma σ k 2 ξ k j 1 t j t j 1 T t , σ k 2 ξ k j T t t j t j 1 ,
Ω ( j ) = Ω ,
where ξ k j R + .
Further, the following assumption guarantees the internal consistency at different time scales of the modified version of CreditRisk + model, introduced in Assumption 2.
Assumption 4
(Modified CreditRisk + parameters at different time scales). Let t t 0 , t 1 , , t m T be a partition of the time interval ( t , T ] . Let Assumption 2 be satisfied over each j-th interval ( t j 1 , t j ] by the set { Y ˜ i ( j ) } ( i = 1 N ), where Y ˜ i ( j ) is the r.v. representing the i-th risk observed during the j-th interval. The associated set { q i ( j ) ; Γ ( j ) ; Ω ( j ) } of parameters and market factors satisfies the same assumptions stated in Assumption 3.
Finally, for the sake of simplicity, the additional Assumption 5 is introduced, with regard to the independence among market factors considered at different times. However, being possible that real-data time series violate Assumption 5, this assumption is weakened in the following Section 3.3.
Assumption 5
(Non-autocorrelated market factors). Given Assumption 3, let
cov Γ k ( j ) , Γ k ( j ) = δ j j var Γ k ( j ) .
Considering the assumptions introduced above, we prove that CreditRisk + is internally consistent when extended to a roll-over framework.
Theorem 1
(Internal consistency of CreditRisk + in absence of autocorrelation). Let us consider a set of risks { Y i } ( i = 1 N ), observed through a time horizon ( t , T ] , and an arbitrary partition t t 0 , t 1 , , t m T of ( t , T ] , such that Assumptions 3 (“CreditRisk + parameters at different time scales”) and Assumption 5 (“non-autocorrelated market factors”) are verified with
ξ k j = 1
for each i = 1 N , k = 1 K and j = 1 m . Then { Y i } satisfies Assumption 1 (“CreditRisk + distributional assumption”) over ( t , T ] .
The statement above remains true replacing Assumption 3 with Assumption 4 (“modified CreditRisk + parameters at different time scales”) and Assumption 1 with Assumption 2 (“modified CreditRisk + distributional assumption”), ceteris paribus.
The proof of Theorem 1 is reported in Appendix A.1.
This result shows that extending the CreditRisk + model to a multiperiod framework is well-posed.
Remark 2.
The choice ξ j k = 1 implies no loss of generality, since a different (positive) constant ξ j k = c is equivalent to redefine the variances of the market factors c σ k 2 σ k 2 .

3.3. Internal Consistency and Autocorrelation in Time Series

As shown in Section 2, the dynamics of each parameter p i is induced by the latent Gamma factors only. Imposing Assumption 5 to any (arbitrarily short) time scale implies that considered time series { Γ k ( j ) } j = 1 , 2 , must exhibit zero autocorrelation. Hence autocorrelation must be completely absent from the historical default frequencies too.
However, this requirement could not be satisfied by the observed time series used in calibrating the model. Indeed, we need to verify that the model can preserve its internal consistency if autocorrelation has to be considered.
The purpose of this work is to investigate whether it is possible and convenient to calibrate the CreditRisk + model at a time scale that copes with the available historical data (i.e., the sampling period of the historical time series) instead of using the same time scale needed for projections (usually bigger). Hence, in case it is not possible to preserve the internal consistency of the model at each arbitrary time scale, due to the presence of autocorrelation, it is sufficient to ask that it holds up to the smallest of the two time scales of interest—the historical sampling period and the projection horizon.
Let us specialize to the constant mesh case t j t j 1 = ( T t ) / m = δ m . This choice copes with a typical real case, where the sampling period δ m of the available historical time series is constant and the considered projection horizon T t is a multiple of it. Under these premises, a weakened version of Assumption 5 is introduced.
Assumption 6
(Autocorrelated market factors). Given Assumption 3, for each k-th latent variable, considered at the time scale δ m , a time-invariant ACF ϱ x k exists, such that
cov Γ k ( j ) , Γ k ( j + x ) = ϱ x k var Γ k ( j ) .
Furthermore, the following closure with respect to the addition holds
j = 1 m Γ k ( j ) Gamma ( α k , β k )
for a couple α k , β k of shape and scale parameters.
Assumption 6 is considered instead of Assumption 5 to state the following alternate version of Theorem 1.
Theorem 2
(Internal consistency of CreditRisk + model in presence of autocorrelation). Let us consider a set of risks { Y i } ( i = 1 N ), observed through a time horizon ( t , T ] , and a uniform partition { t j : = t + j δ m } j = 1 m of ( t , T ] , such that Assumption 3 (“CreditRisk + parameters at different time scales”) and Assumption 6 (“autocorrelated market factors”) are verified with
ξ k j = 1 + 2 x = 1 m 1 ϱ x k 1 x m 1 2
for each i = 1 N , k = 1 K and j = 1 m . Then { Y i } satisfies Assumption 1 (“CreditRisk + distributional assumption”) over ( t , T ] .
The statement above remains true replacing Assumption 3 with Assumption 4 (“modified CreditRisk + parameters at different time scales”) and Assumption 1 with Assumption 2 (“modified CreditRisk + distributional assumption”), ceteris paribus.
The proof of Theorem 2 is reported in Appendix A.2.
Assumption 6 can be either well-posed or ill-posed, depending on the considered ϱ x k . The trivial case ϱ x k = 0 for each x Z copes with Assumption 5. Correlated Gamma variables, as well as the distributional properties of the sum of Gamma variables, have been intensively studied in the literature, and this is still an active research field [20,21,22,23], due to its relevance for information technology. At least in case of identically distributed Gamma variables—such as Γ k ( j ) in our framework—with ACF obeying to a power-law
ϱ x k = ρ k | x | , ρ k ( 0 , 1 ) ,
the distribution of the sum Γ k is known to be approximately Gamma [20], while more generical cases imply the sum to be distributed differently [22,23]. Moreover, it is known that partial sums of independent Gamma variables can be used to generate sequences of (auto)correlated Gamma variables [21].
Remark 3.
The exponential ACF in Equation (27) provides a non-trivial case that satisfies Assumption Assumption 6 and, thus, Theorem 2. In the following Section 4.4, Theorem 2 permits the estimation of A in presence of autocorrelated time series. Equation (27) is then considered in Section 5.3 to investigate numerically the estimators introduced in Section 4.4. However, to date, a general framework is missing to tell whether a given ϱ x k lets the partial sums j Γ k ( j ) remain (approximately) Gamma distributed, with the exception of exponential ACFs.
The estimators introduced in Section 4.4 to consider autocorrelation in time series are still applicable to an inconsistent framework, provided that at least the latent variables Γ k (defined onto the projection horizon) are Gamma distributed and Γ k ( j ) satisfy the mean and variance requirements implied by Assumption 6 above.

4. Calibration of the Structure of Dependence

The model is calibrated based on a partition of the risks in H homogeneous sets c h ( t ) , h = 1 , , H . In this context “homogeneity” means that two risks belonging to the same set c h ( t ) have the same vector of factor loadings ω ( h ) . The sets have an explicit time dependence since they can change by the occurrence of defaults. On the contrary, the structure of dependence, defined by ω ( h ) is supposed to be time-independent.
Hence, solving the calibration problem requires the evaluation of
  • H factor loading vectors { ω ( h ) } h = 1 H , that link each of the homogenous clusters to the K latent variables;
  • K volatilities { σ k } k = 1 K , needed to specify the distribution of each of the latent variables.
The calibration is achievable by a two-step procedure. Firstly, the matrix A : = Ω T Σ Ω , introduced in Section 3, is estimated. Then, A is decomposed under the proper constraints in order to evaluate Ω and Σ separately. This section describes a method to complete the first step, providing an estimator of A both for the single and the multiple unwind period cases, with a moment-matching approach that allows expressing A ^ as a function of the covariance matrix among the historical frequencies of default. The second step is addressed later in Section 6, which provides an example of calibration using a real data set.
Adopting the standard CreditRisk + Assumption 1, Equation (12) can be used to link the covariance matrix among the historical frequencies of default with the matrix A. In Section 4.1, A ^ is provided in the case of historical frequencies of default, sampled with the same tenor of the projection horizon. In Section 4.2, A ^ is generalized to the case of historical frequencies of default sampled with an arbitrary tenor.
Furthermore, in Section 4.3, A ^ is determined under the exponential version of the CreditRisk + framework, introduced in Assumption 2. Thanks to this modified assumption, the corresponding functional form of A ^ is simpler than the one obtained in Section 4.2 based on Assumption 1.
Section 4.2 and Section 4.3 cope with Assumption 5, that implies absence of autocorrelation in time series. The final Section 4.4 uses Assumption 6 instead, generalizing the main results presented in this section to the case where autocorrelation must be taken into account. In this case, the simpler form of A ^ obtained in Section 4.3 comes in handy in the generalization to the non-trivial ACF case.

4.1. The Single Unwind Period Case

The first case considered is that of a single unwind period ( t , T ] . For each set c h ( t ) , let n h ( t ) : = | c h ( t ) | , F h : = 1 n h ( t ) i c h ( t ) Y i and G h 1 F h . The expected values of F h and G h are respectively:
q h : = E F h = i c h ( t ) q i n h ( t ) ,
s h : = E G h = 1 E F h .
Remark 4.
The slight abuse of notation in (28) is done to avoid the introduction of a new symbol to represent E F h . However, the letters chosen for indexing risks and cluster (“i” and “h” respectively) are maintained in the following of this work, clarifying the meaning of the “q” symbol each time it is used.
For any pair of sets of risks { h , h } , the covariance between the default frequencies is:
cov F h , F h = E F h E F h F h E F h = 1 n h n h E i c h Y i q i i c h Y i q i = 1 n h n h i c h i c h cov Y i , Y i ,
that, using Equation (12), becomes:
cov F h , F h = 1 n h n h i c h i c h q i q i k = 1 K ω i k ω i k σ k 2 + δ i i q i .
Equation (31) shows the relation between the observed covariance of default frequencies and the factor loadings, describing the structure of dependence of the model.
Moreover, assuming that all risks in a given homogenous set share the same factor loadings, the above expression simplifies to:
cov F h , F h = q h q h k = 1 K ω h k ω h k σ k 2 + δ h h q h n h
Notice that the second term in Equation (32) is present only when h = h , and becomes quickly negligible as n h grows (since q h < 1 ).
Equation (32) enables the estimation of A over the same time scale T t used for projections:
A ^ h h = 1 q h q h c o ^ v F h , F h δ h h q h n h .

4.2. The Multiple Unwind Period Case

Let us consider a set of H time series defined using a constant step δ m = ( T t ) / m . As done in Section 2, each variable introduced in Section 4.1 for the time interval ( t , T ] can be redefined over each of the considered time intervals. Namely, in the following we use the set of observables quantities { F h , G h , q h , s h } , measured either over ( t , T ] or ( t j 1 , t j = t j 1 + δ m ] or a generic time interval ( t , t ] . For the latter two cases, we introduce the notation { F h ( j ) , G h ( j ) , q h ( j ) , s h ( j ) } and { F h ( t , t ) , G h ( t , t ) , } , respectively. Further, the variables
F m h : = 1 j = 1 m 1 F h ( j ) ,
G m h : = j = 1 m G h ( j ) = 1 F m h
are introduced.
In CreditRisk + , F h ( t , t ) arises from a doubly stochastic process, since each absorbing event is generated conditioned to the latent systematic factors. For the sake of simplicity, we neglect the idiosyncratic uncertainty brought by each Y i ( t , t ) . In fact, for n h ( t ) large enough, the Bernoulli (or Poisson) r.v.’s contributions to the variance of F h ( t , t ) are dominated by the contribution of Γ ( t , t ) . This permits the following assumption.
Assumption 7
(Large clusters). For each cluster c h ( h = 1 H ) and each time interval ( t , t ] ( t , T ] it holds
var F h ( t , t ) | Γ ( t , t ) = 0 .
Then the following holds:
Proposition 2
(CreditRisk + scale-invariance law). Let us consider a set of risks { Y i } ( i = 1 N ), observed through a time horizon ( t a , t b ] and classified into a set of homogenous clusters c h ( h = 1 H ). Let Assumption 3 (“CreditRisk + parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“large clusters”) hold with ξ k j = 1 for each ( t , T ] ( t a , t b ] and for each uniform partition t t 0 < t 1 < < t m T of ( t , T ] , ( m N ). Then, the couple F h ( t , T ) , F h ( t , T ) satisfies the conservation law
cov F h t , T , F h t , T + s h t , T s h t , T 1 T t = constant .
for each pair of clusters c h , c h and each ( t , T ] ( t a , t b ] .
The proof of Proposition 2 is reported in Appendix A.3.
Proposition 2 is one of the main results of this work. It allows to build an estimator of cov F h t , T , F h t , T using default frequencies F h ( j ) defined on a different time scale δ m . The dependence upon m of the precision of the covariance estimator is discussed in Section 5.
Indeed, applying Proposition 2 to Equation (33), it is possible to calibrate the dependence structure of the CreditRisk + model, by first determining the elements of the A matrix as
A h h = 1 q h q h cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) m s h s h δ h h q h n h
for any j = 1 , , m , and then decomposing A, thus obtaining a separate estimate of the Ω , σ Γ 2 parameters. The SNMF decomposition can be performed, e.g., by using the technique described in [14].

4.3. The Exponential Case

In this section the problem of calibrating the dependence structure is addressed using the exponential form of the model introduced in Assumptions 2 and 4. Theorem 1 proves that also the exponential form remains consistent when considering multiple unwind periods. Since now Y ˜ i variables are used instead of the corresponding Y i , the frequencies F h and their complements G h are replaced by F ˜ h and G ˜ h , defined by the substitution Y i Y ˜ i in F h and G h definitions, respectively. Furthermore, it is convenient to introduce the following
L h : = q h q h ln G ˜ h
where
q h : = ln i c h ( t ) e q i n h ( t ) .
The notation introduced in Section 4.2 for { F h , G h , q h , } are extended to the exponential case as well. Hence, the sets of symbols { F ˜ h ( t , t ) , G ˜ h ( t , t ) , } and { F ˜ h ( j ) , G ˜ h ( j ) , } are also used. The log link function that relates p ˜ i and Γ simplifies the form of the scale invariariance law presented in Proposition 2. Indeed, in this case the following holds.
Proposition 3
(Modified CreditRisk + scale-invariance law). Let us consider a set of risks { Y ˜ i } ( i = 1 N ), observed through a time horizon ( t a , t b ] and classified into a set of homogenous clusters c h ( h = 1 H ). Let Assumption 4 (“modified CreditRisk + parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“large clusters”) hold with ξ k j = 1 for each ( t , T ] ( t a , t b ] and for each uniform partition t t 0 < t 1 < < t m T of ( t , T ] , ( m N ). Then L h ( t , T ) , L h ( t , T ) obey to the conservation law
1 T t cov L h ( t , T ) , L h ( t , T ) = constant
for each pair of clusters c h , c h and each ( t , T ] ( t a , t b ] .
The proof of Proposition 3 is reported in Appendix A.4.
Proposition 3 states a conservation law for the modified version of the model, likewise Proposition 2 in the original (i.e., Poisson–Gamma) CreditRisk + framework. The form obtained for the LHS of Equation (40) is simpler than the corrisponding LHS of Equation (36). In general, this framework results to be more tractable than the original model. This is especially useful when estimating A given a non-trivial ACF, as shown in the next Section 4.4.
In this case, A can be estimated as
A h h = 1 q h q h cov L h , L h = 1 q h q h cov ln 1 F ˜ h , ln 1 F ˜ h
where we have neglected the contribution of cov ( Y ˜ i , Y ˜ i ) 1 n h ( t 1 ) 0 . Definition (38) and Proposition 3 imply
A h h = m q h ( j ) q h ( j ) cov ln 1 F ˜ h ( j ) , ln 1 F ˜ h ( j )
for each j = 1 m .

4.4. Handling Autocorrelated Time Series in Calibration

In this section a generalization of estimators in Equations (37) and (42) is provided, in case Assumption 5 has to be replaced with Assumption 6 due to the presence of autocorrelation in time series. We preliminarily report below a second order approximation that comes in handy to generalize Equation (37).
j = 1 m E G h ( j ) G h ( j ) = j = 1 m 1 q h ( j ) q h ( j ) + E F h ( j ) F h ( j ) = 1 j = 1 m q h ( j ) + q h ( j ) + j = 1 m E F h ( j ) F h ( j ) + j < j h , h = 1 , 2 q h ( j ) q h ( j ) +
We now consider again the relation between cov F m h , F m h and cov F h ( j ) F h ( j ) implied by Proposition 2, under the presence of autocorrelation for the latent variables. Unlike in Section 3.2, in this case covariance terms at delay | j j | 1 cannot be nullified.
cov F m h , F m h = E j = 1 m G h ( j ) G h ( j ) s h s h = 1 j = 1 m q h ( j ) + q h ( j ) + j < j h , h = 1 , 2 E F h ( j ) F h ( j ) + j = 1 m E F h ( j ) F h ( j ) s h s h +
Replacing Equation (43) into Equation (44), we have
cov F m h , F m h = j = 1 m E G h ( j ) G h ( j ) + j < j h , h = 1 , 2 cov F h ( j ) F h ( j ) s h s h + O 3
where O 3 is a compact notation for the sum of all the terms of order 3 or greater. Given that O 3 q 0 0 , the approximation O 3 0 is numerically sound in practice and implies the following generalization of A h h in Equation (37)
A h h 1 q h q h cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) m + A C h h ( L ) s h s h δ h h q h n h ,
where the autocorrelation term A C ( L ) is defined as
A C h h ( L ) : = x = 1 m 1 ( m x ) cov F h ( j ) F h ( j + x ) + cov F h ( j ) F h ( j + x ) + 2 cov F h ( j ) F h ( j + x ) .
This completes the extension of the linear case presented in Section 4.2 to autocorrelated time series.
The exponential case—introduced in Section 4.3—turns out to be more tractable, since the linear structure implied by Proposition 3 allows us to avoid approximations similar to the one applied to extend the linear case above. Indeed, only the simplification implied by Assumption 5 must be abandoned, implying
cov L h , L h = m cov L h ( j ) , L h ( j ) + x = 1 m 1 2 ( m x ) cov L h ( j ) , L h ( j + x ) .
This is implied by the fact that L h ( j ) are still identically distributed for the same h but not independent. Hence, the estimator in Equation (42) becomes
A h h ( t , T ) = m q h ( j ) q h ( j ) cov ln 1 F ˜ h ( j ) , ln 1 F ˜ h ( j ) + A C h h ( E )
where
A C h h ( E ) : = 1 q h ( j ) q h ( j ) x = 1 m 1 2 ( m x ) cov ln 1 F ˜ h ( j ) , ln 1 F ˜ h ( j + x )

5. The Advantage of a Short Sampling Period

Let us consider a Δ t -long projection period and a set of historical time series of defaults that span a (past) time interval of length n Δ t . Typical examples can be Δ t = 1 year and 5 n 20 . Moreover, let the historical time series be sampled with a period δ m , which is m times smaller than Δ t (i.e., δ m : = Δ t / m ). Considering Δ t = 1 year, realistic assumptions are m = 4 (quarterly time series) or m = 12 (monthly time series). Therefore, the considered time series are defined over m × n intervals of length δ m , defined by a schedule t 0 , , t m × n .
This section discusses the precision improvement achievable by calibrating the model on historical default time series with a period smaller than the time horizon on which the calibrated model is applied. Indeed, the statistical error on the determination of A depends on m, i.e., on the sampling frequency of the observations, as shown in Section 5.1. Further, given Assumption 7 (“large clusters”), the statistical error can be written as a closed-form function of m, as σ k 2 approaches to zero ( k = 1 K ). In the following, the assumption of “small” volatilities is referred to as “Gaussian regime”, because it implies Γ k ˙ N ( 1 , β k ) ( k = 1 K ), as discussed in the proof of Theorem 3.
As in the previous Section 3 and Section 4, both the standard CreditRisk + framework (Assumptions 1 and 3) and the modified “exponential” version (Assumptions 2 and 4) are discussed hereinafter.
In applications where c h ’s are scarcely populated or σ k ’s are not negligible, Theorem 3 is not guaranteed to cope with observations. This case is addressed in Section 5.2, where the robustness of the closed-form expression (54) is investigated by Monte Carlo simulations.
A numerical approach is maintained in Section 5.3 as well, where the estimation error of A ^ at different time scales is measured in presence of autocorrelation, following the generalization introduced in Section 3.3 and Section 4.4. In this case, the exponential version of the model comes in handy: indeed, it is observed that the error on the estimator introduced in (46) (i.e., standard CreditRisk + version) does not decrease at increasing m, while the opposite is true for the estimator presented in (49) (i.e., exponential CreditRisk + version).
In Section 4, the A ^ estimator has been presented in multiple versions, depending on the considered model (standard or exponential version of CreditRisk + ), the chosen sampling period δ m and the presence or absence of autocorrelation. Thus, it is worth introducing a compact notation to identify the different versions of A ^ .
The expressions for A h h presented in (37) and (46) are addressed as “linear” estimators (as opposed to “exponential”) in the following. In these cases the symbol A ^ h h ( L , m ) is used, where L stands for “linear” and m = ( T t ) / δ m is the ratio between the projection and calibration time scales.
On the other hand, the expressions for A h h presented in (42) and (49) are addressed as “exponential” estimators and so the symbol A ^ h h ( E , m ) is used.
For the sake of brevity, when L or E is omitted, A ^ h h ( m ) refers to both the cases and, when m is omitted, A ^ h h refers to the m = 1 case.
The improvement in statistical precision with respect to the estimate with no subsampling, can be quantified by the following ratio:
ε [ A ^ h h ( m ) ] : = var A ^ h h ( m ) var A ^ h h .
Symbol ε h h ( m ) and its further specifications ε h h ( L , m ) : = ε [ A ^ h h ( L , m ) ] and ε h h ( E , m ) can be used as well. The last short notation that results to be convenient in the following is
c h h ( L m ) : = cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) ,
c h h ( E m ) : = cov L h ( j ) , L h ( j ) .
where F h ( j ) , L h ( j ) and s h ( j ) ( j = 1 m ) are i.i.d. variables quantified using a sampling period δ m .
Remark 5.
The notation “ A ^ ” refers to the fact the covariances involved in the definitions must be replaced with the corresponding sample estimators, when applying A h h ( m ) to historical time series. The same applies to the symbol c ^ .

5.1. Precision of A ^ at Different Time Scales under the Gaussian Regime

The following result quantifies the precision gain of performing CreditRisk + model calibration by historical time series available at increasing sampling frequencies. As anticipated, the precision of the estimated parameters increases as the sampling period decrease. This result holds under Assumption 7, in the limit σ 0 + and considering absence of autocorrelation. The cases where some n h is small (i.e., it does not verify Assumption 7) or where some σ k is not negligible are addressed numerically in the next Section 5.2—showing that the precision is still increasing as shorter sampling periods are considered. The introduction of autocorrelation is addressed in Section 5.3.
Theorem 3
(Estimation errors under Gaussian regime). Let us consider a set of risks, observed through a time interval ( t 0 , t ] and classified into a set of homogenous clusters c h ( h = 1 H ). Let Assumption 3 (“CreditRisk + parameters at different time scales”), Assumption 5 (“non-autocorrelated market factors”) and Assumption 7 (“Large clusters”) hold with ξ k j = 1 for a given uniform partition t 0 < t 1 < < t j < < t m × n t of ( t 0 , t ] , ( t j t j 1 = δ m ; m , n N ). Let A ^ be the estimate of A needed to calibrate the CreditRisk + model in order to project losses over the time horizon ( t , T ] , such that ( t t 0 ) / ( T t ) = n and ( T t ) / ( δ m ) = m . Then the following is true for A ^ h h ( m ) :
ε h h ( m ) σ 0 + n 1 m · n 1
Equation (54) remains true also considering Assumption 4 (“modified CreditRisk + parameters at different time scales”) instead of Assumption 3.
The proof of Theorem 3 is reported in Appendix A.5.

5.2. Beyond the Gaussian Regime: Numerical Simulations

In this section we verify that both the estimators A ^ h h ( E , m ) and A ^ h h ( L , m ) are more precise at increasing m. The closed-form results obtained in the Gaussian regime, discussed in Section 5.1, hold when the factor volatilities σ Γ are much less than 1. Increasing σ k ( k = 1 , , K ) the Gaussian regime becomes less satisfactory and the difference of precision amongst determinations with different values of m becomes smaller. However, the error of A ^ h h ( m ) remains monotonically decreasing in m, even far from the Gaussian regime conditions.
We considered a case study with a two-factors market ( Γ k , k = 1 , 2 ). The couple of systematic factors induces the dependence between two populations of risks, as per the weights reported in Table 1.
The volatilities ( σ k , k = 1 , 2 ) associated to the factors are chosen according to seven different scenarios (indexed by i σ ), respectively as
σ Γ : = 2 i σ 2.5 · 10 2 5.0 · 10 2 , i σ = 0 6 .
For each scenario, the distributions of the estimators A ^ 12 ( E , m ) and A ^ 12 ( L , m ) ( m = 1 12 ) have been determined using 10 5 simulations of { F 1 ( t , n 1 ) , F 2 ( t , n 2 ) } where t t 0 , t 0 + n Δ t ( n = 10 ) and n h ( h = 1 , 2 ) is the number of risks belonging to each cluster. For both estimators the dynamic F h ( t , n h ) is that reported in (14). All risks belonging to the same cluster are supposed to have the same unconditioned intensity of default
q i ( t , t + Δ t ) = 1 Δ t log ( 0.99 ) , i = 1 , , n h , h = 1 , 2 .
To investigate the additional contribution to the error σ A ^ 12 , generated by the finiteness of each cluster, different values of n h have been considered. In particular, the number of claims per each elementary temporal step δ m = 1 / m is extracted from a binomial distribution with parameter
n h 10 3 , 2.5 · 10 3 , 5 · 10 3 , 10 4 , 2.5 · 10 4 , 5 · 10 4 , h = 1 , 2 .
For simplicity’s sake, it is assumed that each defaulted risk is instantly replaced by a new risk, keeping the population of each cluster constant in time. Finally, the case n h = (absence of binomial source of randomness) is also considered.
Figure 1 shows the behaviour of ε A ^ 12 ( m ) as a function of m, comparing various choices of σ Γ . In this case we are not considering yet the contribution to error due to the finite population ( n h = for each cluster h). Equation (54) (red curve) is almost perfectly verified by the least volatility scenario ( σ Γ = ( 2.5 , 5.0 ) · 10 2 ). At increasing volatility values (brighter curves), the gain in precision obtained at higher m is reduced, as well as the accordance with Equation (54).
Since the transformation of ε A ^ 12 ( m ) moving away from the Gaussian regime (i.e., increasing | σ Γ | ) is smooth, estimating A 12 with m > 1 remains convenient even for σ Γ k 1 , despite the fact that Equation (54) is not verified anymore.
Comparison between the left and the right panel of Figure 1 shows that the above argument holds both in the linear and in the exponential case. This fact is also verified for all the other results of this section.
The results shown in Figure 1 are numerically checked against the case of finite portfolio populations: we tested each of the n h declared in Equation (57). Even the smallest size considered (i.e., n h = 10 3 Figure 2), that is affected by the largest binomial contribution to the error, leads to results comparable to the ones observed in the n h = case. The size n h = 10 3 is considered to be a limiting value for a realistic case.
We simulated the distribution of the estimator A ^ 12 ( m ) as a function of m, testing all the possible combinations of σ Γ and n h declared in Equations (55) and (57). Figure 3 reports an example of the results. All the other considered ( n h , σ k ) couples resulted to have a similar behavior. The visual comparison between E [ A ^ 12 ( m ) ] (blue “X” symbol) and A 12 level (red horizontal line) shows that indeed A ^ 12 ( m ) is unbiased, both in the linear and in the exponential case (Equations (37) and (42) respectively). The dispersion around the mean reduces at increasing m, in agreement with both Equation (54) and the numerical results in Figure 1 and Figure 2.
As implied by Figure 2 and Figure 3, the number of risks n h does not play a relevant role (if any) in computing the ratio ε A ^ 12 ( m ) , while the absolute value of the standard error σ A ^ 12 ( m ) is sensitive to the size of the portfolio.
This fact is confirmed by the results shown in Figure 4, where the estimates of σ A ^ 12 ( m ) have been arranged as functions of n h at fixed σ Γ and m values. As expected, the standard error is greater when considering smaller n h values, while the dependence on n h of the error disappears quickly as approaching n h .

5.3. Estimation Error in Presence of Autocorrelation

In Section 5.2 the precision gain at increasing m is measured in absence of autocorrelation. In this section, the same numerical simulations are re-performed, introducing autocorrelation and comparing the results against the theoretical estimation of ε . The effect of autocorrelation on ε is discussed in Appendix B.
The numerical setup introduced above in Section 5.2 has been maintained, with a further assumption about ACF. Indeed, we assume that each latent variable ( k = 1 , 2 ) obeys to the following ACF law, discussed in Section 3.3
ϱ x k = ρ | x | , m = 12
where the considered ρ values are 0.05 and 0.5 . For m < 12 cases, we have considered ACF’s resulting for the latent variables time series Γ ˜ k ( j ) obtained by the clustering operation
Γ ˜ k ( j ) m m j Γ k ( j ) , j = 1 + m m ( j 1 ) m m j
given the aforementioned ACF law at 1 / m time scale. Since the contribution of the finite population to the error has been shown to be neglectable in Section 5.2, simulations in presence of autocorrelation have been performed under n h = assumption only.
Figure 5 shows that the estimator A ^ h h ( E , m ) remains more precise at increasing m, even in presence of autocorrelation. The analytical results obtained in the Gaussian regime (i.e., theoretical superior and inferior estimates of ε —dashed and solid red lines in Figure 5), discussed in Appendix B, are in good agreement with the numerical results obtained in the considered set up. All the empirical measures of ε are included between the two theoretical limits (yellow areas).
Moreover, precision gain (i.e., ε < 1 ) at m > 1 is also possible when using the estimator ε A ^ 12 ( L , m ) , introduced in Equation (46). However, due to the approximation introduced in this case, the estimator is not convenient (i.e., ε > 1 ) in the majority of the considered configurations.

6. An Application to Market Data

This section provides an example of the calibration technique applied to a real-world data set. The calibration technique is applied to a set of historical time series of bad loan rates supplied from the Bank of Italy. “Bad loan” is a subcategory of the broader class “Non-Performing Loan” and it is defined as exposures to debtors that are insolvent or in substantially similar circumstances [24].
In particular, the chosen data set is composed of the quarterly historical series TRI30496 ( m = 4 ) over a five year period (from 1 January 2013 to 31 December 2017, n = 5 , Δ t = 1 ). The data are publicly available at [10]. The time series are supplied by the customer sector (“counterpart institutional sector”) and geographic area (“registered office of the customer”). The latter, in the example, is held fixed to a unique value that corresponds to the whole country (Italy). Table 2 and Table 3 report the definition of the 6 different clusters and their main features.
By inspection of Table 3, it is possible to perform a rough estimate of σ Γ . Equation (14) implies that the following holds for coefficients of variation C V h ( h = 1 , , H 6 ):
C V h : = σ h p ¯ h k = 1 K ω h k σ Γ k
Furthermore, the normalization requirement over the factor loadings ω h k implies
k = 1 K ω h k 1
Hence we can state that C V : = 1 H h C V h has the same order of magnitude of 1 K k σ Γ k . Since C V 0.124 , results in Section 5.2 suggest that this data set is not far from the Gaussian regime and so there is an appreciable increase of precision in estimating A ( 0 , 1 ) with m > 1 .
A ^ ( 0 , 1 ) is estimated by applying Equation (42) over a one-year period. The results obtained for A ^ ( E , m ) ( 0 , 1 ) ( m = 1 , 4 ) are reported in Table 4.
The elementwise precision gain for m = 4 , ε A ^ ( E , 4 ) ( 0 , 1 ) , obtained under the Gaussian regime assumption, is shown in Table 5. This result is obtained applying definition (51) and Equation (A25) both to the cases m = 4 and m = 1 . Equation (A25) has been shown to be valid under the Gaussian regime, discussed in Section 5.1.
In this case, the preliminary decomposition of A ^ ( E , 4 ) ( 0 , 1 ) , that would be needed using the Monte Carlo method discussed in Section 5.2, is not needed.
According to Equation (54), the elements of ε A ^ ( E , 4 ) ( 0 , 1 ) reported in Table 5 should be all approximately equal to 0.46 , since they should depend only on the couple m , n ( m = 4 and n = 5 in this case). However, in a real world case like the one considered, the assumption of zero autocorrelation is satisfied with a different precision by each time series p h ( t ) . Furthermore, the estimated covariance matrices might need to be regularized (indeed the Higham regularization algorithm [25] was used both for m = 1 and m = 4 series). Hence, a different ratio for each element ( h , h ) = 1 , , 6 is justified. Nonetheless, it is worth noticing that all the ratios reported in Table 5 have the same order of magnitude of the predicted value 0.46 .
Knowledge of the historical number of risky subjects n h ( t ) for each cluster ( h = 1 , , 6 ) at each observation date ( t = 1 / 4 , 2 / 4 , , 5 ) allows to take into account the binomial contribution to the error σ A ^ ( E , m ) ( 0 , 1 ) , both for m = 4 (quarterly series) and m = 1 (yearly series), although the finiteness of the population does not add a relevant contribution to the error, as already observed in Section 5.2.
Table 6 provides Monte Carlo estimation of σ A ^ ( E , m ) ( 0 , 1 ) ( m = 1 , 4 ) , which considers also the role of n h ( t ) . Since the values in Table 6 provide a measure of the error in the determination of A ^ ( E , m ) ( 0 , 1 ) , it turns out that the estimates reported in Table 4 are elementwise consistent one with the other.
The Monte Carlo estimation of σ A ^ ( E , m ) ( 0 , 1 ) , as done in Section 5.2, requires the a priori knowledge of the true dependence structure W , σ Γ . Since this is a case study, we do not have an a priori parameterization of the calibrated model. Hence, we have used W ^ , σ ^ Γ estimated from A ^ ( E , 4 ) ( 0 , 1 ) instead, as a proxy of the “true” model parameters. The computation of W ^ , σ ^ Γ from A ^ ( E , 4 ) ( 0 , 1 ) is discussed below.
In order to complete the CreditRisk + calibration, we have to decompose A ^ and find the factor loadings matrix W ^ together with the vector of systematic factors variances σ ^ Γ 2 . To do so, we use the Symmetric Non-negative Matrix Factorization (SNMF), an iterative numerical method to search an approximate decomposition of A ^ which satisfies the requirements of the CreditRisk + model over W ^ (i.e., all elements ω h k > 0 and k ω h k = 1 ). The application of SNMF to CreditRisk + is discussed in detail in [14]. In the following, we give evidence only of the implementation details necessary to address this case study. Being an iterative method, SNMF requires an initial choice of matrixes
U ^ 0 : = W ^ U Σ ^ 1 / 2 , V ^ 0 : = Σ ^ 1 / 2 W ^ V ,
such that A ^ = U ^ 0 V ^ 0 . It is not required that U ^ 0 = V ^ 0 T , nor all the elements of U ^ 0 and V ^ 0 have to be positive. We set U ^ 0 , V ^ 0 from the eigenvalues decomposition of A ^ ( E , 4 ) ( 0 , 1 ) .
For the considered data set, the eigenvalues decomposition returned the set of eigenvalues and eigenvectors reported in Table 7.
We use the ω ˜ , σ ˜ notation to address the quantities over which the normalization requirement of CreditRisk + has not been imposed yet.
Since more than the 95 % of variance is explained by the first three eigenvectors, we reduced the dimensionality of the latent variables vector to be K = 3 . Hence we define
U ^ 0 = ω ˜ 1 , ω ˜ 2 , ω ˜ 3 · diag ( σ ˜ 1 , σ ˜ 2 , σ ˜ 3 ) = 1.80 5.29 1.15 2.78 5.18 3.71 2.51 5.58 4.46 27.79 2.65 0.57 2.33 5.88 1.93 2.07 10.58 5.96 · 10 2
and V ^ 0 = U ^ 0 T . In general, SNMF aims to minimize iteratively the cost function
A ^ U ^ V ^ 2 + α U ^ V ^ T 2
where | · | is the Frobenious norm, eventually weighted, and α is a free parameter to weight the asymmetry penality term. Further details on the method are available in [14]. The application of SNMF method, together with the normalization constraint over the factor loadings, leads to the result reported in Table 8.
A reasonable economic interpretation supports the set of parameters resulting from the calibration process described above. Indeed, factor loadings associated with the “general government” sector ( h = 4 ) are completely distinct from the ones of the other sectors (i.e., this is the only sector mainly depending on the k = 1 factor): this fact copes with the different nature of the public entities from the ones belonging to the other considered sectors. Furthermore, “companies” ( h = 2 , 3 ) share approximately the same dependence structure. The same applies when considering “households” ( h = 1 , 5 ). Finally, the “institutions serving households” sector ( h = 6 ) shares the same latent factor ( k = 2 ) but shows a different balance between idiosyncratic and systematic factor loadings compared to “households”, that is coherent with the nature of a sector strongly linked to “household” sectors, despite not being completely equivalent.
Results in Table 8 have been used to quantify the estimation errors reported in Table 6.

7. Conclusions

In this work, we have investigated how to calibrate the dependence structure of the CreditRisk + model, when the sampling period δ m of the (available) default rate time series is different from Δ t —the length of the future time interval chosen for the projections.
Preliminarily, we proved that CreditRisk + remains internally consistent when imposing the underlying distributional assumption to be simultaneously true at different time scales (Theorem 1). The model internal consistency is robust against the introduction of autocorrelation, depending on the considered ACF form (Theorem 2).
Then the problem has been approached in terms of moment matching, providing two asymptotically equivalent formulations for estimating the covariance matrix A amongst the systematic factors of the model (Propositions 2 and 3). The choice between the two estimators of A, provided in Equations (37) and (42), depends on the functional form (linear or exponential) that links the probability of claim/default and the latent variables. Both the estimators are explicitly dependent on the ratio Δ t / δ m , allowing for the calibration of the model at a time scale that is different from the one chosen for applying the calibrated model. Both the estimators have been generalized to autocorrelated time series in Equations (46) and (49), although only the latter (i.e., exponential case) is an exact result, while a second-order approximation has been adopted for the linear case.
Furthermore, calibrating the model on a shorter time scale than the projection horizon has been proved to be convenient in terms of reduced estimation error on A ^ . Analytical expressions for the error are provided in the Gaussian regime (i.e., small variances of the latent variables) by Theorem 3. In contrast, the case of increasing variance has been investigated numerically, confirming that, in general, the precision of the calibration is higher when employing historical data with a shorter sampling period. It has been verified that the convenience of calibrating the model at short time scales also remains in the presence of autocorrelation, although this is guaranteed only in the exponential framework, where an exact correction term is available.
Finally, the techniques presented in this work are shown to be numerically sound when applied to a real, publicly available data set of Italian bad loan rates.

Author Contributions

Conceptualization, J.G. and L.P.; methodology, J.G. and L.P.; software, J.G. and L.P.; validation, J.G. and L.P.; formal analysis, J.G. and L.P.; investigation, J.G. and L.P.; resources, J.G. and L.P.; data curation, J.G. and L.P.; writing—original draft preparation, J.G.; writing—review and editing, L.P.; visualization, J.G. and L.P.; supervision, J.G. and L.P.; project administration, J.G. and L.P.; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are openly available in “Banca d’Italia—Base Dati Statistica” [10].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

This section reports the proofs of theorems and propositions presented in this study.

Appendix A.1. Proof of Theorem 1

Proof. 
Firstly, the statement is proved considering Assumptions 1 and 3.
Assumption 3 implies by construction that { Y i ( j ) } j = 1 m is a set of Poisson r.v.’s, which are mutually independent, conditionally on the realization of { Γ ( j ) } j = 1 m . Poisson distribution is closed with respect to addition. Hence
j = 1 m Y i ( j ) | Γ ( j ) Poisson ( p i Σ ) ,
where the distribution parameter is
p i Σ = j = 1 m q i t j t j 1 T t q i ( j ) ω i 0 + k = 1 K ω i k Γ k ( j ) .
Equation (20) in Assumption 3, the choice ξ k j = 1 and the scaling property of Gamma distribution imply that
t j t j 1 T t Γ k ( j ) Gamma σ k 2 t j t j 1 T t , σ k 2
Furthermore, Assumption 5 and the fact that independent Gamma r.v.’s with the same scale parameter are closed with respect to addition imply that
j = 1 m t j t j 1 T t Γ k ( j ) Gamma σ k 2 , σ k 2 .
Hence j = 1 m t j t j 1 T t Γ k ( j ) Γ k and so j = 1 m Y i ( j ) Y i . This implies that { Y i } satisfies Assumption 1 over ( t , T ] .
The proof above can be extended to the exponential case, i.e., when considering Assumptions 2 and 4 instead of Assumptions 1 and 3. The form of parameter p i Σ in (A2) can be obtained also can be obtained also from Assumption 4. In fact, the substitution Y i ( j ) Y ˜ i ( j ) implies that Y ˜ i Bernoulli ( p ˜ i ) where
ln 1 p ˜ i = ln j = 1 m 1 p ˜ i ( j ) = j = 1 m q i t j t j 1 T t ω i 0 + k = 1 K ω i k Γ k ( j ) .
Considering Equation (A5) instead of (A2), the proof presented above holds for the Y ˜ i representation of risks, ceteris paribus, implying that { Y ˜ i } satisfies Assumption 2 over ( t , T ] . □

Appendix A.2. Proof of Theorem 2

Proof. 
The same arguments that lead to (A2) or to (A5) in proof of Theorem 1 are still valid in this case. Hence, it suffices to prove that mean and variance of the latent variable
Γ k : = j = 1 m t j t j 1 T t Γ k ( j ) = 1 m j = 1 m Γ k ( j )
remain consistent with CreditRisk + requirements, stated in Assumption 1. It holds E [ Γ k ] = 1 , since E [ Γ k ( j ) ] = 1 . Moreover, the coefficient ξ j k compensates the bias introduced in var Γ k by the fact that Γ k ( j ) ( j = 1 m ) are autocorrelated according to the ACF ϱ x k :
var j = 1 m Γ k ( j ) = j = 1 m var Γ k ( j ) + j = 1 m j j cov Γ k ( j ) , Γ k ( j ) = var Γ k ( 1 ) m + 2 x = 1 m 1 m x ϱ x k m ξ k j 2
which implies var Γ k = σ k 2 directly.
The fact that Γ k is Gamma distributed is imposed in Assumption 6, implying that Γ k Γ k and so that Assumption 1 is satisfied. □

Appendix A.3. Proof of Proposition 2

Proof. 
Given a time interval ( t , T ] ( t a , t b ] and a uniform partition ( j = 1 m ) over ( t , T ] , Assumptions A3 and A5 imply that { Y i } satisfies Assumption 1 over ( t , T ] by Theorem 1. Assumption 7 guarantees the convergence of F h to E [ F h | Γ ] and of F h ( j ) to E [ F h ( j ) | Γ ( j ) ] , where we recall that F h = F h ( t , T ) .
For any interval ( t , T ] ( t a , t b ] and any pair of clusters c h , c h , definitions (34), (35) and Assumption 5 imply that the covariance between F m h and F m h is given by
cov F m h , F m h = j = 1 m cov F h ( j ) F h ( j ) + s h ( j ) s h ( j ) s h s h
Since all the considered subintervals ( t j 1 , t j ] have the same length δ m = t j t j 1 , the frequencies F h ( j ) are i.i.d., so that the above expression simplifies to:
cov F m h , F m h + s h s h = cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) m
for any j = 1 , , m .
Each cluster c h is supposed to be homogenous by definition, i.e., ω ( i ) = ω ( h ) for each risk Y i c h . Hence, distributional Assumptions 1 and 3 imply that both F h n h ( t ) E [ F h | Γ ] and F h ( j ) n h ( t j ) E [ F h ( j ) | Γ ( j ) ] are sample estimators of the parameters p h ( Γ ) : = q h ( ω h 0 + k ω h k Γ k ) and p h ( j ) ( Γ ( j ) ) respectively, leading to the equivalence relation
F m h = F h = q ^ h ω h 0 + k = 1 K ω h k Γ k ,
therefore both F m h ( t , T ) and F h ( t , T ) are estimators of the default frequency for the ( t , T ] interval. Thus, Equation (A7) can be rewritten as:
cov F h , F h + s h s h = cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) m
and, since m = ( T t ) / δ m ,
cov F h , F h + s h s h 1 / ( T t ) = cov F h ( j ) , F h ( j ) + s h ( j ) s h ( j ) 1 / δ m .
To complete the proof, let ( t , T ] and ( t , T ] be two subintervals of ( t a , t b ] , such that ( T t ) / ( T t ) Q . Hence, GCD { T t ; T t } : = δ ¯ R + exists. δ ¯ can be used as the mesh to define two uniform partitions over the two considered intervals.
Given these partitions, (A10) can be applied both to T t and to T t , leading to
cov F h ( t , T ) , F h ( t , T ) + s h ( t , T ) s h ( t , T ) 1 / ( T t ) = cov F h ( t , T ) , F h ( t , T ) + s h ( t , T ) s h ( t , T ) 1 / ( T t )
and completing the proof. The requirement ( T t ) / ( T t ) Q can be easily weakened by the convergence of finite continued fractions with an increasing number of terms, until the desired degree of precision is reached. □

Appendix A.4. Proof of Proposition 3

Proof. 
Given a time interval ( t , T ] ( t a , t b ] and a uniform partition ( j = 1 m ) over ( t , T ] , Assumptions 4 and 5 imply that { Y ˜ i } satisfies Assumption 2 over ( t , T ] by Theorem 1.
Assumption 7 guarantees the convergence of L h to E [ L h | Γ ] , where we recall that L h = L h ( t , T ) . Furthermore, it holds by definition that E [ L h | Γ ] = p h ( Γ ) , where the notation p h has been introduced in the proof of Proposition 2.
The same apply to L h ( j ) ( j = 1 m ) for each uniform partition of ( t , T ] considered; indeed, Assumption 7 implies L h ( j ) E [ L h ( j ) | Γ ( j ) ] = p h ( j ) ( Γ ) .
Since p h ( Γ ) = j = 1 m p h ( j ) ( Γ ( j ) ) and given that the partition is uniform, it holds L h = m L h ( j ) for each j = 1 m . Since m ( T t ) / δ m , we have
1 T t L h = 1 δ m L h ( j )
Assumption 5 and Equation (A11) imply that
1 T t cov [ L h , L h ] = 1 δ m cov [ L h ( j ) , L h ( j ) ]
for each considered pair of clusters c h , c h . The proof is completed by the same argument used in proof of Proposition 2, after Equation (A10). □

Appendix A.5. Proof of Theorem 3

Proof. 
Assumptions 3 and 5 and ξ k j = 1 imply Assumption 1 by Theorem 1. The same theorem implies Assumption 2 in case Assumption 4 is considered instead of Assumption 3, ceteris paribus. Furthermore, Assumptions 3, 5 and 7 and ξ k j = 1 imply that
A ^ h h ( L , m ) = 1 q h q h c ^ h h ( L m ) m s h s h δ h h q h n h
by Proposition 2, for any j = 1 m and h , h = 1 H . Analogously, considering Assumption 4 instead of Assumption 3, it holds
A ^ h h ( E , m ) = m q h q h c ^ h h ( E m )
by Proposition 3, for any j = 1 m and h , h = 1 H .
The next step of the proof is showing that Γ k ˙ N ( 1 , β k ) in the limit σ k 0 + . In fact, both Assumptions 3 and 4 state that
Γ k ( j ) Γ 1 m β k , m β k , E Γ k ( j ) = 1 , var Γ k ( j ) = m β k , j = 1 , , m .
Hence their probability densities d F k ( x ) satisfy the following:
d F k ( x ) x ( m β k ) 1 1 exp ( m β k ) 1 x d x
Since it holds m β k 1 1 σ k 0 + m β k 1 , we have
lim σ k 0 + d F k ( x ) exp ln x x m β k d x .
By introducing the auxiliary variable x x 1 and replacing ln ( 1 + x ) with the first three terms of its Maclaurin series, relation (A16) can be equivalently written as
lim σ k 0 + d F k x ( x ) exp x 2 2 m β k d x
In the limit σ k = β k 0 + , Equation (A17) implies that
Γ k ( j ) N μ = 1 , σ 2 = m β k .
Hence it holds that each F h ( j ) is normally distributed, with variance m σ h 2 : = m k ω h k β k —when considering the linear case (i.e., Assumptions 1 and 3). Analogously, also each L h ( j ) is normally distributed in the exponential case (i.e., Assumptions 2 and 4).
Considering the market factors—as well as the historical observations of default frequency—as normal random variables is relevant to prove the theorem, since it implies that the covariance matrix estimators c ^ ( L m ) and c ^ ( E m ) are Wishart distributed. Hence the variance associated to a given matrix element is
var c ^ h h ( m ) = m 2 m · n 1 ρ h h 2 + 1 σ h 2 σ h 2
in both linear and exponential cases. In the exponential case Equation (A19) is equivalent to the following
var c ^ h h ( E m ) = 1 m · n 1 c h h ( E m ) 2 + c h h ( E m ) c h h ( E m )
while the same is not true in the linear case. Given Equation (A19), it is possible to prove Equation (54) separately in the two cases.
Proof in the linear case. Proposition 2 implies
var A ^ h h ( L , m ) = 1 q h q h 2 var c ^ h h ( L m ) m = 1 q h q h 2 E c ^ h h ( L m ) 2 m E c ^ h h ( L m ) 2 m = 1 q h q h 2 var c ^ h h ( L m ) + E c ^ h h ( L m ) 2 m E c ^ h h ( L m ) 2 m
In the limit σ 0 + the binomial above can be replaced with its leading term. Hence
var A ^ h h ( L , m ) = 1 q h q h 2 var c ^ h h ( L m ) E c ^ h h ( L m ) 2 ( m 1 )
By applying Equation (A19) we have
var A ^ h h ( L , m ) = 1 q h q h 2 m 2 m · n 1 ρ h h 2 + 1 σ h 2 σ h 2 c h h ( L m ) 2 ( m 1 ) = 1 q h q h 2 1 m · n 1 ρ h h 2 + 1 ρ h h 2 c h h ( L m ) s h ( j ) s h ( j ) 2 c h h ( L m ) 2 ( m 1 )
Applying Proposition 2 once again we have c h h ( L m ) 2 m = c h h ( L 1 ) 2 . Furthermore, we have s h ( j ) s h ( j ) c h h ( L m ) 2 ( m 1 ) σ 0 + s h ( j ) s h ( j ) 2 m = s h 2 s h 2 . Hence it holds
var A ^ h h ( L , m ) = 1 q h q h 2 1 m · n 1 ρ h h 2 + 1 ρ h h 2 c h h ( L 1 ) 2 s h 2 s h 2
and thus the ratio var A ^ h h ( L , m ) / var A ^ h h ( L , 1 ) verifies Equation (54), completing the proof for the linear case.
Proof in the exponential case. Equation (A20) and Proposition 3 imply
var A ^ h h ( E , m ) = m 2 q h q h 2 var c ^ h h ( E m ) = m 2 q h q h 2 1 m · n 1 c h h ( E m ) 2 + c h h ( E m ) c h h ( E m ) = 1 q h q h 2 1 m · n 1 c h h ( E 1 ) 2 + c h h ( E 1 ) c h h ( E 1 )
The latter implies that in case m = 1 we have
var A ^ h h ( E , 1 ) = 1 q h q h 2 1 n 1 c h h ( E 1 ) 2 + c h h ( E 1 ) c h h ( E 1 )
Hence, the ratio var A ^ h h ( E , m ) / var A ^ h h ( E , 1 ) verifies Equation (54), completing the proof for the exponential case. □

Appendix B. Covariance Estimation Error in Presence of Autocorrelation

In this section a generalization of Equation (54) is provided, considering the presence of autocorrelation. Only the exponential case is discussed, because a closed form for A ^ h h ( E , m ) is still available when autocorrelation has to be considered—while only a second order approximation has been computed for the linear case A ^ h h ( L , m ) .
A comparison between Equations (41) and (49) allows us to generalize Proposition 3.
c h h ( E 1 ) = m c h h ( E m ) + 2 x = 1 m 1 ( m x ) x c h h ( E m )
where
x c h h ( E m ) : = cov L h ( j ) , L h ( j + x )
It holds by definition
c h h ( E m ) = q h q h m 2 k = 1 K ω h k ω h k var Γ k ( j ) x c h h ( E m ) = q h q h m 2 k = 1 K ω h k ω h k cov Γ k ( j ) , Γ k ( j + x )
Hence, Assumption 6 implies
E x c ^ h h ( E m ) = E q h q h m 2 k = 1 K ω h k ω h k ϱ x k var Γ k ( j ) = x ϱ ˜ h h
where
x ϱ ˜ h h : = k = 1 K w ˜ k h h ϱ x k k = 1 K w ˜ k h h ; w ˜ k h h : = ω h k ω h k m ξ k 2 σ k 2
Furthermore, applying Equation (A19), it follows that
var x c ^ h h ( E m ) = 1 m · n 1 E x c ^ h h ( E m ) 2 + E cov ^ L h ( j ) , L h ( j ) E cov ^ L h ( j + x ) , L h ( j + x ) = 1 m · n 1 x ϱ ˜ h h 2 c h h ( E m ) 2 + c h h ( E m ) c h h ( E m ) = var c ^ h h ( E m ) 1 x ϱ ˜ h h 2 m · n 1 c h h ( E m ) 2
Equation (A29) leads to another version of Equation (A27)
c h h ( E m ) = 1 m 1 + 2 x = 1 m 1 ( 1 x m ) x ϱ ˜ h h 1 c h h ( E 1 )
From Equation (49) we have
var A ^ h h ( E , m ) = m 2 q h q h 2 var c ^ h h ( E m ) + 2 x = 1 m 1 ( 1 x m ) x c ^ h h ( E m )
Equation (A33) implies that var A ^ h h ( E , m ) depends on the correlation matrix ϱ x x ( c ^ ) among the considered covariance estimators x c ^ h h ( E m ) ( x = 0 , 1 , ) , as shown below by choosing an equivalent expression for the RHS:
var A ^ h h ( E , m ) = m 2 q h q h 2 x , x = 0 m 1 ϱ x x ( c ^ ) x s h h x s h h
where
x s h h : = ( 2 δ 0 x ) ( 1 x m ) ( var x c ^ h h ( E m ) ) 1 2
In case the covariance estimators are independent from each other (i.e. ϱ x x ( c ^ ) = δ x x ), an inferior limit to the considered variance is obtained
var A ^ h h ( E , m ) m 2 q h q h 2 var c ^ h h ( E m ) + 4 x = 1 m 1 ( 1 x m ) 2 var x c ^ h h ( E m )
Equation (A31) can be substituted into Equation (A34). Hence, RHS of inequality (A36) becomes
var c ^ h h ( E m ) m 2 q h q h 2 1 + 4 x = 1 m 1 ( 1 x m ) 2 1 1 x ϱ ˜ h h 2 m · n 1 cv 2 c ^ h h ( E m )
where the notation cv · stands for the coefficient of variation.
( c h h ( E m ) ) 2 and var c ^ h h ( E m ) can be expressed by using Equation (A32). Hence Equation (A36) can be used to estimate an inferior limit to ε A ^ h h ( E , m ) in the gaussian regime. A superior limit for the same quantity can be computed as well, imposing ϱ x x ( c ^ ) = 1 for each considered x , x .
Remark A1.
Equation (A34) does not converge to (A25) in the limit ϱ x k 0 x ϱ ˜ h h 0 . This copes with the fact that assuming ϱ x k = 0 in Equation (A25) implies a lesser error than measuring it.

References

  1. Crouhy, M.; Galai, D.; Mark, R. A comparative analysis of current credit risk models. J. Bank. Financ. 2000, 24, 59–117. [Google Scholar] [CrossRef]
  2. Murphy, D. Unravelling the Credit Crunch; CRC Press: Boca Raton, FL, USA, 2009. [Google Scholar]
  3. Schönbucher, P.J. Credit Derivatives Pricing Models: Model, Pricing and Implementation; Wiley: Hoboken, NJ, USA, 2003. [Google Scholar]
  4. Fréchet, M. Sur les Tableaux de Corrélation Dont les Marges Sont Donnés; Annales de l’Université de Lyon, Science: Lyon, France, 1951; Volume 4, pp. 13–84. [Google Scholar]
  5. Sklar, A. Fonctions de Répartition à n Dimensions et Leurs Marges; Institut Statistique de l’Université de Paris: Paris, France, 1951; Volume 8, pp. 229–231. [Google Scholar]
  6. Joe, H. Multivariate Models and Dependence Concepts; Chapman and Hall/CRC: Boca Raton, FL, USA, 1997. [Google Scholar]
  7. Nelsen, R.B. Introduction to Copulas, 1st ed.; Springer: Berlin/Heidelberg, Germany, 1999. [Google Scholar]
  8. Li, D.X. On Default Correlation: A Copula Function Approach. J. Fixed Income 2000, 9, 43–54. [Google Scholar] [CrossRef]
  9. Mai, J.-F.; Scherer, M. Simulating Copulas: Stochastic Models, Sampling Algorithms, and Applications, 2nd ed.; World Scientific Publishing Company: Singapore, 2017. [Google Scholar]
  10. The Bad Loan Rate Series Is Labelled as TRI30496_35120163. The Count of Performing Borrowers at the Initial Period Series Is Labelled as TRI30496_351122141. Bank of Italy Statistical Database. Available online: https://infostat.bancaditalia.it/inquiry/ (accessed on 1 May 2021).
  11. Credit Suisse First Boston. CreditRisk+, a Credit Risk Management Framework; Credit Suisse First Boston: London, UK, 1998. [Google Scholar]
  12. Passalacqua, L. A Pricing Model for Credit Insurance; Giornale Dell’Istituto Italiano Degli Attuari: Rome, Italy, 2006; Volume LXIX, pp. 1–37. [Google Scholar]
  13. Passalacqua, L. Measuring Effects of Excess-of-Loss Reinsurance on Credit Insurance Risk Capital; Giornale Dell’Istituto Italiano Degli Attuari: Rome, Italy, 2006; Volume LXX, pp. 81–102. [Google Scholar]
  14. Vandendorpe, A.; Ho, N.D.; Vanduffel, S.; Van Dooren, P. On the parameterization of the CreditRisk+ model for estimating credit portfolio risk. Insur. Math. Econ. 2008, 42, 736–745. [Google Scholar] [CrossRef]
  15. Wilde, T. CreditRisk+. In Encyclopedia of Quantitative Finance; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
  16. Gundlach, M.; Lehrbass, F. (Eds.) CreditRisk+ in the Banking Industry; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  17. Klugman, S.A.; Panjer, H.H.; Willmot, G.E. Loss Models: From Data to Decisions; Wiley: Hoboken, NJ, USA, 2012. [Google Scholar]
  18. Glasserman, P.; Li, J. Importance Sampling for Portfolio Credit Risk. Manag. Sci. 2005, 51, 1643–1656. [Google Scholar] [CrossRef] [Green Version]
  19. McNeil, A.; Frey, R.; Embrechts, P. Quantitative Risk Management; Princeton University Press: Princeton, NJ, USA, 2015. [Google Scholar]
  20. Kotz, S.; Adams, J.W. Distribution of Sum of Identically Distributed Exponentially Correlated Gamma-Variables. Ann. Math. Stat. 1964, 35, 277–283. [Google Scholar] [CrossRef]
  21. Mathai, A.M.; Moschopoulos, P.G. A Form of Multivariate Gamma Distribution. Ann. Inst. Stat. Math. 1992, 44, 97–106. [Google Scholar] [CrossRef]
  22. Florent, C.; Borgnat, P.; Tourneret, J.; Abry, P. Parameter estimation for sums of correlated gamma random variables. Application to anomaly detection in Internet Traffic. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP-08, Las Vegas, NV, USA, 31 March–4 April 2008. [Google Scholar]
  23. Feng, Y.; Wen, M.; Zhang, J.; Ji, F.; Ning, G. Sum of arbitrarily correlated Gamma random variables with unequal parameters and its application in wireless communications. In Proceedings of the IEEE 2016 International Conference on Computing, Networking and Communications (ICNC), Kauai, HI, USA, 15–18 February 2016. [Google Scholar] [CrossRef]
  24. Non-Performing Loans (NPLs) in Italy’s Banking System. 2017. Available online: https://www.bancaditalia.it/media/views/2017/npl/ (accessed on 1 May 2021).
  25. Higham, N. Computing the nearest correlation matrix—A problem from finance. IMA J. Numer. Anal. 2002, 22, 329–343. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Precision gain ε 12 ( m ) , as a function of m and i σ . The left and right plots show the values of ε A ^ 12 ( E , m ) and ε A ^ 12 ( L , m ) respectively, as a function of m, for each volatility scenario ( i σ = 0 , , 6 ), each depicted with darker to brighter curves, in the n h = assumption. The red curve is the theoretical value of ε A ^ 12 ( m ) in the Gaussian regime.
Figure 1. Precision gain ε 12 ( m ) , as a function of m and i σ . The left and right plots show the values of ε A ^ 12 ( E , m ) and ε A ^ 12 ( L , m ) respectively, as a function of m, for each volatility scenario ( i σ = 0 , , 6 ), each depicted with darker to brighter curves, in the n h = assumption. The red curve is the theoretical value of ε A ^ 12 ( m ) in the Gaussian regime.
Mathematics 09 01679 g001
Figure 2. ε [ A ^ 12 ( E , m ) ] and ε [ A ^ 12 ( L , m ) ] as a function of m, considering increasing i σ (from darker to brighter curve) and n h = 10 3 . The red curve is the theoretical value of ε A ^ 12 ( m ) as a function of m in the Gaussian regime. For σ 1 , σ 2 1 the analytical result is perfectly satisfied. However, ε A ^ 12 ( L , m ) is shown to be a decreasing function of m in general. Comparing this result with the n h = case, we can state that ε A ^ h h ( L , m ) is almost insensitive to n h ( h = 1 , 2 ).
Figure 2. ε [ A ^ 12 ( E , m ) ] and ε [ A ^ 12 ( L , m ) ] as a function of m, considering increasing i σ (from darker to brighter curve) and n h = 10 3 . The red curve is the theoretical value of ε A ^ 12 ( m ) as a function of m in the Gaussian regime. For σ 1 , σ 2 1 the analytical result is perfectly satisfied. However, ε A ^ 12 ( L , m ) is shown to be a decreasing function of m in general. Comparing this result with the n h = case, we can state that ε A ^ h h ( L , m ) is almost insensitive to n h ( h = 1 , 2 ).
Mathematics 09 01679 g002
Figure 3. Boxplot of A ^ 12 ( E , m ) and A ^ 12 ( L , m ) distributions, as a function of m. The red horizontal line represent the true value of A 12 and the blue X’s stand for the average value of A ^ 12 ( m ) .
Figure 3. Boxplot of A ^ 12 ( E , m ) and A ^ 12 ( L , m ) distributions, as a function of m. The red horizontal line represent the true value of A 12 and the blue X’s stand for the average value of A ^ 12 ( m ) .
Mathematics 09 01679 g003
Figure 4. σ [ A ^ 12 ( E , m ) ] and σ [ A ^ 12 ( L , m ) ] as a function of n h . Decreasing m values are considered from darker to brighter curve.
Figure 4. σ [ A ^ 12 ( E , m ) ] and σ [ A ^ 12 ( L , m ) ] as a function of n h . Decreasing m values are considered from darker to brighter curve.
Mathematics 09 01679 g004
Figure 5. Precision gain in presence of autocorrelation. ε [ A ^ 12 ( E , m ) ] (exact—left panels) and ε [ A ^ 12 ( L , m ) ] (2nd order approximation—right panels), for each volatility scenario ( i σ = 0 , , 6 , depicted with darker to brighter curves), for ρ = 0.05 (top) and ρ = 0.5 (bottom). The yellow area includes all the values between the maximum (dashed red line) and the minimum (solid red line) expected from the results of Appendix B. The frontier ε = 1 (dotted line) allows to check the presence of a precision gain at m > 1 .
Figure 5. Precision gain in presence of autocorrelation. ε [ A ^ 12 ( E , m ) ] (exact—left panels) and ε [ A ^ 12 ( L , m ) ] (2nd order approximation—right panels), for each volatility scenario ( i σ = 0 , , 6 , depicted with darker to brighter curves), for ρ = 0.05 (top) and ρ = 0.5 (bottom). The yellow area includes all the values between the maximum (dashed red line) and the minimum (solid red line) expected from the results of Appendix B. The frontier ε = 1 (dotted line) allows to check the presence of a precision gain at m > 1 .
Mathematics 09 01679 g005
Table 1. Matrix of weights used for the numerical simulations.
Table 1. Matrix of weights used for the numerical simulations.
k012
ω 1 k 0.300.400.30
ω 2 k 0.500.250.25
Table 2. Definition of the clusters h = 1 , , 6 used in data set TRI30496.
Table 2. Definition of the clusters h = 1 , , 6 used in data set TRI30496.
Cluster Index hSector CodeDescription
1600Consumer households
2S11Non-financial companies
3S12BI7Financial companies other than monetary financial institutions
4S13General government
5S14BI4Producer households
6S15BI1Non-profit institutions serving households and unclassifiable units
Table 3. Main features of the considered historical time series over the period 1 January 2013–31 December 2017. p ¯ h ( h = 1 , , 6 ) is the yearly average bad loan rate; σ h is the volatility associated to each p ¯ h ; n h is the average number of borrowers.
Table 3. Main features of the considered historical time series over the period 1 January 2013–31 December 2017. p ¯ h ( h = 1 , , 6 ) is the yearly average bad loan rate; σ h is the volatility associated to each p ¯ h ; n h is the average number of borrowers.
h123456
p ¯ h 0.01190.03520.02550.00560.02590.0088
σ h 0.00100.00420.00230.00140.00220.0010
n h 269,515407,60231915416132,1794020
Table 4. Values of A ^ ( E , m ) ( 0 , 1 ) ( m = 4 left, m = 1 right) obtained from the quarterly historical series TRI30496 over the period 1 January 2013–31 December 2017. Results are expressed in 10 2 units.
Table 4. Values of A ^ ( E , m ) ( 0 , 1 ) ( m = 4 left, m = 1 right) obtained from the quarterly historical series TRI30496 over the period 1 January 2013–31 December 2017. Results are expressed in 10 2 units.
0.530.280.330.360.410.480.680.400.361.010.560.73
0.280.590.480.610.430.400.401.500.981.260.87−0.16
0.330.480.670.520.430.400.360.980.870.780.720.27
0.360.610.527.800.480.331.011.260.786.501.100.66
0.410.430.430.480.470.540.560.870.721.100.740.47
0.480.400.400.330.541.530.73−0.160.270.660.471.35
Table 5. The elementwise precision gain ε A ^ ( E , 4 ) ( 0 , 1 ) associated with results reported in Table 4.
Table 5. The elementwise precision gain ε A ^ ( E , 4 ) ( 0 , 1 ) associated with results reported in Table 4.
0.360.260.370.410.330.39
0.260.180.240.300.230.33
0.370.240.350.430.300.45
0.410.300.430.550.370.52
0.330.230.300.370.290.42
0.390.330.450.520.420.52
Table 6. σ A ^ ( E , m ) ( 0 , 1 ) ( m = 4 left, m = 1 right). These are the elementwise errors of the estimators reported in Table 4. The results above are expressed in 10 2 units.
Table 6. σ A ^ ( E , m ) ( 0 , 1 ) ( m = 4 left, m = 1 right). These are the elementwise errors of the estimators reported in Table 4. The results above are expressed in 10 2 units.
0.110.120.190.440.110.310.410.250.411.090.240.64
0.120.200.290.640.130.400.250.860.721.470.500.97
0.190.291.381.080.210.690.410.721.752.310.491.53
0.440.641.085.120.491.601.091.472.319.111.203.40
0.110.130.210.490.130.340.240.500.491.200.290.79
0.310.400.691.600.343.250.640.971.533.400.794.65
Table 7. Set of eigenvalues σ ˜ k and eigenvectors ω ˜ k obtained by the eigenvalues decomposition of A ^ ( E , 4 ) ( 0 , 1 ) , as reported in Table 4.
Table 7. Set of eigenvalues σ ˜ k and eigenvectors ω ˜ k obtained by the eigenvalues decomposition of A ^ ( E , 4 ) ( 0 , 1 ) , as reported in Table 4.
ω ˜ k k = 1k = 2k = 3k = 4k = 5k = 6
0.06−0.340.130.840.000.39
0.10−0.330.43−0.38−0.650.36
0.09−0.360.52−0.260.720.06
0.980.17−0.070.010.010.00
0.08−0.380.220.19−0.22−0.84
0.07−0.68−0.69−0.200.060.07
σ ˜ k 2 0.080.020.012.9 · 10 3 1.4 · 10 3 0.3 · 10 3
Table 8. The complete set of parameters W ^ , σ ^ Γ 2 necessary to specify the dependence structure in CreditRisk + model, obtained by the eigenvalues decomposition of A ^ ( E , 4 ) ( 0 , 1 ) , as reported in Table 4.
Table 8. The complete set of parameters W ^ , σ ^ Γ 2 necessary to specify the dependence structure in CreditRisk + model, obtained by the eigenvalues decomposition of A ^ ( E , 4 ) ( 0 , 1 ) , as reported in Table 4.
k0123
ω 1 k 0.670.040.290.00
ω 2 k 0.070.070.270.59
ω 3 k 0.000.060.280.66
ω 4 k 0.130.870.000.00
ω 5 k 0.630.060.310.00
ω 6 k 0.290.040.670.00
σ k 2 0.1030.0310.010
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Giacomelli, J.; Passalacqua, L. Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation . Mathematics 2021, 9, 1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

AMA Style

Giacomelli J, Passalacqua L. Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation . Mathematics. 2021; 9(14):1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

Chicago/Turabian Style

Giacomelli, Jacopo, and Luca Passalacqua. 2021. "Calibrating the CreditRisk+ Model at Different Time Scales and in Presence of Temporal Autocorrelation " Mathematics 9, no. 14: 1679. https://0-doi-org.brum.beds.ac.uk/10.3390/math9141679

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop