Seminario de Matemática Financiera, Instituto MEFF-RiskLab
Volume 3: Years 2000 and 2001
Director: Santiago Carrillo Menéndez
© 2003 Santiago Carrillo Menéndez and others
Edition prepared by: Pablo Fernández Gallardo. Typesetting: Aula Documental de Investigación, Martín de los Heros 66, 28008 Madrid. Published by: Instituto MEFF. Printed by: JUMA.
ISBN: 84-688-2450-X. Depósito Legal: M-29207-2003. Printed in Spain. This book may not be reproduced, in whole or in part, nor transmitted by electronic, mechanical or magnetic means or by computerized storage and retrieval systems or any other method, nor lent, rented or otherwise transferred for use, without the prior written permission of the copyright holder or holders.
Prologue

Once again this year, Instituto MEFF and RiskLab Madrid have the satisfaction of presenting this book, which collects the talks given during 2000 and 2001 at the Seminario de Matemática Financiera that we organize jointly. The warm reception of this seminar is further proof of the necessary collaboration between academia and the financial industry, and at Instituto MEFF we are very pleased to help create a meeting point between the two communities that serves as a channel for sharing knowledge and concerns.

The importance of the university world, not only as a training ground for future professionals but also as a research centre that works with the industry to provide solutions to the needs of the markets, is especially evident in the world of derivative products. In this field the collaboration has been very fruitful, on the one hand broadening the range of products available to the financial community and, on the other, providing the assurance that rigorous and efficient pricing systems afford.

This seminar therefore aims to strengthen the ties between the two worlds, and seeks not only to be a source of dissemination of financial knowledge but also to establish itself as a permanent forum for discussion between academics and practitioners.

I would not want to end this prologue without thanking Santiago Carrillo, director of the seminar, for his effort and dedication; the speakers from academia and industry who generously contribute to this activity; all the staff of Instituto MEFF who make it possible; and, of course, the attendees who, month after month, encourage us with their presence to keep working on this engaging project.

Beatriz Alejandro
Director of Instituto MEFF
Contents

By way of introduction

Portfolio optimization for alternative investments
Ian Buckley, Gustavo Comezaña, Ben Djerroud, Luis Seco

Valoración de derivados financieros mediante diferencias finitas
José María Pesquero Fernández

Aplicación de la teoría de opciones a la evaluación del riesgo de crédito: relación entre probabilidad de impago y rating mediante un probit ordenado
Teresa Corzo Santamaría

Non-gaussian multivariate simulations in mark-to-future calculations
Olivier Croissant, Gustavo Comezaña, Marcos Escobar, Pablo Fernández, Nicolás Hernández, Luis Ángel Seco

Las Matemáticas en las Finanzas
Dídac Artés

A dynamical model for stock market indices
Jaume Masoliver, Miquel Montero, Josep M. Porrà

Risk management with drawdown functions
Alexei Chekhlov, Stanislav Uryasev, Michael Zabarankin

Local volatility changes in the Black-Scholes model
Hans-Peter Bermin, Arturo Kohatsu-Higa

Correlaciones y cópulas en finanzas
Juan Carlos García Céspedes

Introduction to the empirical analysis of swap spreads
David Méndez-Vives

Testing the optimality of immunization strategies with transaction costs
Eliseo Navarro, Juan M. Nave

Embedded options and integrated asset-liability management for life insurance
Gabriele F. Susinno

Métodos de valoración de opciones americanas: el enfoque "least-squares Monte Carlo"
Manuel Moreno, Javier F. Navas

Modelos de volatilidad del futuro sobre el bono nocional a 10 años
Ricardo Gimeno Nogués, Eduardo Morales Martínez

Aplicación a índices de bolsa de modelos de raíz unitaria estocástica
Román Mínguez Salido, Eduardo Morales Martínez

An analytic approach to credit risk of loan portfolios of Spanish banks
Juan Carlos G. Céspedes, Ángel Mencía, Mercedes Morris
By way of introduction

This third volume of the Seminario de Matemática Financiera collects sixteen of its eighteen contributions corresponding to the years 2000 and 2001. Once again it is worth highlighting, as one of the merits of this seminar, the diversity of the topics covered, as well as the collaboration of academics and practitioners from financial institutions in its smooth running.

The talk by Dídac Artés ("Las Matemáticas en las Finanzas") could well serve as a general introduction to all the volumes of the Seminario de Matemática Financiera. The numerical aspects underlying the valuation of certain derivatives were addressed in the talks (now articles) by Manuel Moreno and Javier Fernández Navas ("Métodos de valoración de opciones americanas: el enfoque least-squares Monte Carlo") and by José María Pesquero Fernández ("Valoración de derivados financieros mediante diferencias finitas").

Several speakers focused on credit risk: Teresa Corzo Santamaría ("Aplicación de la teoría de opciones a la evaluación del riesgo de crédito: relación entre probabilidad de impago y rating mediante un probit ordenado") and Juan Carlos García Céspedes, Ángel Mencía and Mercedes Morris ("An analytic approach to credit risk of loan portfolios of Spanish banks").

Two of the papers collected here tackle the delicate subject of non-Gaussian multivariate simulation, of vital importance if risks are to be aggregated consistently: "Non-gaussian multivariate simulations in mark-to-future calculations" (Olivier Croissant, Gustavo Comezaña, Marcos Escobar, Pablo Fernández, Nicolás Hernández, Luis Ángel Seco) and "Correlaciones y cópulas en finanzas" (Juan Carlos García Céspedes).

Proof of the diversity of the problems addressed in the seminar are the following papers, all on questions of undeniable theoretical and practical interest: "A dynamical model for stock market indices" (Jaume Masoliver, Miquel Montero and Josep M. Porrà), "Risk management with drawdown functions" (Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin), "Local volatility changes in the Black-Scholes model" (Hans-Peter Bermin and Arturo Kohatsu-Higa), "Introduction to the empirical analysis of swap spreads" (David Méndez-Vives), "Testing the optimality of immunization strategies with transaction costs" (Eliseo Navarro and Juan M. Nave), "Modelos de volatilidad del futuro sobre el bono nocional a 10 años" (Ricardo Gimeno Nogués and Eduardo Morales Martínez) and "Aplicación a índices de bolsa de modelos de raíz unitaria estocástica" (Román Mínguez Salido and Eduardo Morales Martínez).

For the first time in the Seminario de Matemática Financiera, topics related to "alternative management", so fashionable nowadays, are addressed in the paper by Ian Buckley, Gustavo Comezaña, Ben Djerroud and Luis Ángel Seco ("Portfolio optimization for alternative investments"), and insurance in that of Gabriele Susinno ("Embedded options and integrated asset-liability management for life insurance").

Two contributions were left out of this volume. That of Professor José Manuel Campa (at the time a visiting professor at CEMFI), "El uso de futuros en instrumentos de inversión" (January 2000), had been published previously (see the Journal of International Money and Finance, volume 17). For reasons beyond our control, we have not been able to publish that of Mario Petrucci, "La formulación de Black-Scholes con procesos de jump diffusion" (March 2001). The text of both talks remains available to interested readers on the RiskLab web page: www.risklab-madrid.uam.es/es/seminarios/meff-uam/seminarios.html.

I want to close these lines by thanking all the speakers and all the participants of the Seminario de Matemática Financiera for their interest: they are the ones who keep this "adventure", begun in 1997 by Instituto MEFF and RiskLab, in such good health as it is about to turn seven years old.
Santiago Carrillo Menéndez
Director of RiskLab-Madrid
Portfolio optimization for alternative investments
Ian Buckley, Gustavo Comezaña, Ben Djerroud and Luis Ángel Seco¹
Abstract: A tractable and practical generalization of Markowitz mean-variance portfolio theory is presented in which portfolios of hedge funds and commodity trading advisors (CTAs) can be handled successfully. Making the assumption that their returns follow a finite Gaussian mixture distribution, and using the probability of outperforming a target return as the objective function, these assets are optimized in the static setting by solving a non-linear programming problem for the portfolio weights.
1. Introduction

In this article, Markowitz mean-variance portfolio theory [20], the foundation of single-period investment theory, is generalized to describe portfolios of assets whose returns follow the (finite) Gaussian mixture (GM) distribution (also called a mixture of normals). Whilst the assets in the universe could be of the conventional variety, such as equities or bonds, our primary goal is to develop a framework which lends itself to the management of portfolios of hedge funds, or even to optimally combining the recommendations of a group of commodity trading advisors (CTAs). That is to say, we seek an approach suitable for finding an optimal fund of funds.

Because of the infinite variety of hedge fund and CTA strategies and the speed at which a given fund's composition can be changed, the alternative assets that we describe are not expected to behave like conventional assets such as individual equities, bonds or even long-only funds such as index tracker funds and exchange traded funds (ETFs). Indeed, we need to be prepared for their prices to be less predictable, more volatile and to have more exotic distributions. The new approach is ideal for an industrial setting, providing considerable additional flexibility over and above a standard Markowitz approach, with only a modest increase in complexity.

¹ Ian Buckley is a researcher at the Centre for Quantitative Finance (Imperial College) and a visiting member of RiskLab Toronto. Gustavo Comezaña is a member of RiskLab Toronto and director of Research and Development at the financial management firm Sigma Analysis and Management. Ben Djerroud is an analyst at Sigma Analysis and Management. Luis Ángel Seco is a professor in the Department of Mathematics of the University of Toronto, director of RiskLab Toronto and president of Sigma Analysis and Management. This talk was given by the last author (Luis Ángel Seco) at the January 2000 session of the Instituto MEFF-RiskLab Seminar.
The assumption that asset returns have the multivariate Gaussian distribution is a reasonable first approximation to reality and gives rise to tractable theories. Many theories forming the foundations of mathematical finance adopt this conjecture, including Black-Scholes-Merton option pricing theory, Markowitz portfolio theory, and the CAPM and APT equity pricing models. However, it is well known that for assets, both in the conventional sense of equities and bonds but also in a broader sense, for example in the form of country or sector-based equity or bond indices, and even more so for alternative investments such as hedge funds and CTAs, the situation is more complex. The purpose of the generalization described in this paper is to address two well-known limitations of the assumption that asset returns obey the multivariate Gaussian distribution with constant parameters over time:

• The skewed (asymmetric around the mean) and leptokurtic ("fatter-tailed" than a Gaussian distribution) nature of marginal probability density functions (pdfs).

• The asymmetric correlation (or correlation breakdown) phenomenon, which describes the tendency for the correlations between asset returns to depend on the prevailing direction of the market. Typically correlations are larger in a bear market than in a bull market.

The first point refers to the univariate distributions for returns that are observed if assets are considered one at a time. The second point describes effects that can only be observed when the returns to multiple assets are investigated together. One way to capture the dependence structure of multiple random variables in a risk management setting is by using copulas [8], [10], [13], [16], [19]. Copulas describe the part of the shape of the pdf that cannot be described by the marginal distributions. A (finite) Gaussian mixture distribution, as described in this paper, can be used to approximate a general multivariate distribution, expressed equivalently either as a pdf or decomposed into a series of univariate marginal distributions and a copula.

In fact, strictly speaking the term "asymmetric correlation" describes a proposed explanation for a phenomenon, rather than the underlying, fundamental phenomenon itself. The latter is simply that the iso-probability contours of the multivariate pdf for asset returns are less symmetric than the ellipsoidal contours of the multivariate normal distribution. Apart from those distributions with pdfs with ellipsoidal iso-probability surfaces, such as the Gaussian and multivariate t, for all other multivariate distributions the correlation matrix does not provide an adequate summary of the dependence structure of the constituent risks.

Allowing multiple regimes, within each of which the ordinary constant-parameter Gaussian assumptions prevail, gives rise to a model that reflects reality better than a standard Gaussian model without departing too radically from it. However, it is certainly not the only way to construct a model with non-ellipsoidal iso-probability contours. Indeed, alternative assumptions achieving the same ends may render the concept of correlation redundant, and statements about its evolution over time meaningless.
Apart from the distributional assumptions, the second fundamental difference between the novel approach described in this paper (the GM approach) and the Markowitz mean-variance approach is that we adopt a different objective function. This is essential in order to obtain different optimal portfolio weights; when we refer to the GM approach, the use of an alternative objective will be implicit. A drawback of using a more exotic objective than the variance is that the resultant optimization problem is not a linear-quadratic program (LQP), as it is when variance is the chosen risk measure. However, the nonlinear optimization problem with multiple local extrema that replaces it can be solved reliably and quickly with commonly available routines, at least for a moderate number of assets (i.e. fewer than ten).

The plan for the paper is as follows. Evidence of covariance regimes over time is given in Section 2, including a summary of existing results from the literature for conventional assets and original results for alternative investments. Section 3 introduces the GM distribution, with definitions of GM-distributed random variables, estimation issues, and some identities concerning moments and linear combinations of GM variables. In Section 4 we take the probability of shortfall objective, assume GM-distributed asset returns and develop a theory based on these ingredients; key theoretical and numerical results are described. Section 5 contains our conclusions.
2. Evidence of covariance regimes

In the aftermath of the 1987 market crash, a deficiency of risk models based on the multivariate normal distribution received increased attention: a simultaneous downward movement in all the markets of the world was a more frequent occurrence than the models predicted when calibrated using asset returns observed during tranquil periods. Diversifying amongst different assets or markets was less effective at reducing risk than many participants had hitherto believed. Important investigations of the contagion phenomenon include [9], [3], [18].

In common with their conventional-asset counterparts, alternative assets exhibit the correlation breakdown phenomenon. As evidence we present Figure 1, which shows the correlation matrices between hedge fund returns in tranquil and distressed regimes. During tranquil periods correlations are small, whereas during periods of market distress the asset returns become highly correlated, with off-diagonal correlation values close to one.

We note in passing that, because finding portfolio weights in a mean-variance setting is tantamount to inverting the covariance matrix, errors in the optimal portfolio weights will be most sensitive to errors in the covariance matrix when the latter is closest to being singular (having a determinant close to zero). The covariance matrix will be closer to singular in the distressed regime than in the tranquil regime.
Figure 1: Bar charts to show typical values of correlations between the returns of a group of eight hedge funds during tranquil (left) and distressed (right) periods.
3. Gaussian mixture distribution

The Gaussian mixture distribution is selected from the range of parametric alternatives to the normal distribution for its tractability: calculations using it often closely resemble those using the normal distribution. Whilst there are many univariate parametric probability distributions (e.g. hyperbolic, t, generalized beta, α-stable), the list of multivariate distributions is shorter (e.g. t, α-stable).

The GM distribution has been used before in the field of finance, mostly in its univariate guise for the estimation of Value at Risk (VaR). [15] develop a model for estimating VaR in which the user is free to choose any probability distribution for the daily changes in each market variable, and employ the univariate mixture of normals distribution as an example. In the same field, [24] assumes probability distributions for each of the parameters describing the mixture of normals and uses a Bayesian updating scheme, and [22] uses a quasi-Bayesian maximum likelihood estimation procedure. The current RiskMetrics methodology uses a GM with two normal components. More recently, GM models have been used by [17] to model futures markets and for portfolio risk management, and by [11] for credit risk. [23] develops an efficient analytical Monte Carlo method for generating changes in asset prices using a multivariate mixture of normal distributions with arbitrary covariance matrix. [5] describe computational tools for the calculation of VaR and of more sophisticated risk measures, such as shortfall, Max-VaR, conditional VaR and conditional risk measures, that aim to take account of the heteroskedastic structure of time series.
This paper describes the static, single-period setting only, in which the distributions of random variables are sufficient to specify the model. In this setting the key idea is that we build an exotic distribution by mixing simple ones, namely copies of the normal distribution. Our only assumption is that, over an interval of time, the returns to the assets in the universe are described by the multivariate GM distribution; it is unnecessary to make further assumptions about the nature of the asset price evolution during this interval. However, we do maintain an interest in the dynamic case, i.e. the multiple-period discrete or continuous time setting, because we wish to motivate the use of the GM distribution and we prefer to construct static models that extend naturally to the dynamic case.

There is a growing body of work in which exotic (asset return) stochastic processes have been constructed by mixing simpler ones. Processes for asset returns may be constructed from unconditional or conditional distributions. As an example of the latter case, by mixing autoregressive processes such as ARCH and GARCH, processes can be constructed that account for both the heteroskedastic and leptokurtic nature of financial time series; see [21] and [12]. GM distributions can arise naturally as the level of certain stochastic processes at a point in time, conditional on the level at an earlier time, e.g. Markov (regime) switching models and jump processes. Regime switching models describe processes in which the parameters of a continuous diffusion process may change discontinuously according to the realized stochastic path through an associated Markov chain. We conclude that GM distributions are better motivated and less contrived than first impressions might suggest. Recent applications of regime switching asset return processes include, in the field of Merton-style option pricing theory, [7], and in portfolio management, [4] (CAPM) and [1], [2] (international diversification).

Mixture distributions have the appeal that, by adding together a sufficient number of component distributions, any multivariate distribution may be approximated to arbitrary accuracy; with an infinite number of contributions, any distribution can be reconstructed exactly. As far as estimation is concerned, a disadvantage of using the GM distribution is that the log-likelihood function is unbounded, so it has no global maximum. A resolution to this problem explored in [14] is to use a modified log-likelihood function. Because of the use of the GM distribution and other mixture distributions in image processing, clustering and unsupervised learning, a host of estimation techniques have been developed to address this problem [6]. When using the GM distribution to model asset returns, [23] employs the EM algorithm.
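As an aside, the estimation step discussed above can be sketched with scikit-learn's EM implementation. This is an assumption of ours (the paper does not prescribe a library), and the data below are synthetic:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic returns from a two-regime mixture (tranquil 90%, distressed 10%).
tranquil = rng.multivariate_normal([0.01, 0.01], 1e-4 * np.eye(2), size=900)
distressed = rng.multivariate_normal([-0.05, -0.05],
                                     [[4e-3, 3.6e-3], [3.6e-3, 4e-3]], size=100)
X = np.vstack([tranquil, distressed])

gm = GaussianMixture(n_components=2, covariance_type='full', random_state=0).fit(X)
print(gm.weights_)       # estimated regime weights w_i
print(gm.means_)         # estimated regime means mu^(i)
print(gm.covariances_)   # estimated regime covariance matrices V^(i)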
3.1. Definitions

Definition 3.1.1 A (scalar) random variable Z has the univariate GM distribution if its probability density function f_Z(z) is of the form

(3.1.1)   f_Z(z) = \sum_{i=1}^{n} w_i f_{X_i}(z) = \sum_{i=1}^{n} \frac{w_i}{\sigma_i}\,\phi\!\left(\frac{z-\mu_i}{\sigma_i}\right)

where the random variables X_i are normally distributed with means µ_i and variances σ_i², φ(z) is the standard normal probability density function, and the weights w_i sum to one. The finite sum runs over the desired number of normal components, n.

Remark 3.1.2 The cumulative distribution function is trivially

(3.1.2)   F_Z(z) = \sum_{i=1}^{n} w_i \Phi\!\left(\frac{z-\mu_i}{\sigma_i}\right)

where Φ(z) is the standard normal cumulative distribution function. We will make use of this observation in Section 4.1 when we define the PoS objective.

Similarly,

Definition 3.1.3 A vector random variable Z has the multivariate GM distribution if its probability density function f_Z(z) is of the form

(3.1.3)   f_Z(z) = \sum_{i=1}^{n} w_i f_{X^{(i)}}(z) = \sum_{i=1}^{n} w_i\, \phi_{\mu^{(i)},V^{(i)}}(z)

where the (vector) random variables X^{(i)} are multivariate normally distributed with probability density functions φ_{µ^{(i)},V^{(i)}}(z), and the weights w_i sum to one. The vector random variable X^{(i)} has mean µ^{(i)} and variance-covariance matrix V^{(i)}; e.g. taking the a-th and b-th components of X^{(i)}, their covariance is the element (a,b) of V^{(i)}, i.e. Cov(X_a^{(i)}, X_b^{(i)}) = V_{ab}^{(i)}. Also E[X_a^{(i)}] = µ_a^{(i)}.

Remark 3.1.4 Note that by definition Cov(X_a^{(i)}, X_b^{(j)}) = 0 for i ≠ j.

Remark 3.1.5 In the numerical experiments described later, the mixture distribution contains two normal components, describing asset returns under tranquil and distressed conditions. We shall refer to the weights w_i as regime weights.
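To make these definitions concrete, the following minimal Python sketch (assuming NumPy and SciPy; the two-regime parameter values are illustrative, not taken from the paper) evaluates the univariate GM density (3.1.1) and CDF (3.1.2) and samples from the mixture:

import numpy as np
from scipy.stats import norm

# Illustrative two-regime mixture: tranquil and distressed (hypothetical values).
w = np.array([0.9, 0.1])          # regime weights, sum to one
mu = np.array([0.01, -0.05])      # regime means
sigma = np.array([0.02, 0.08])    # regime standard deviations

def gm_pdf(z):
    """Univariate GM density, Eqn (3.1.1)."""
    return sum(wi * norm.pdf(z, mi, si) for wi, mi, si in zip(w, mu, sigma))

def gm_cdf(z):
    """Univariate GM cumulative distribution, Eqn (3.1.2)."""
    return sum(wi * norm.cdf(z, mi, si) for wi, mi, si in zip(w, mu, sigma))

def gm_sample(size, rng=np.random.default_rng(0)):
    """Draw samples by first picking a regime, then drawing from its normal."""
    regime = rng.choice(len(w), size=size, p=w)
    return rng.normal(mu[regime], sigma[regime])

print(gm_pdf(0.0), gm_cdf(0.0), gm_sample(3))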
3.2. Moments

The mean of a random variable with the mixture distribution is simply a weighted combination of the means of the component normal distributions.

Proposition 3.2.1 The expectation of a function f of a random variable with the GM distribution can be expressed in terms of the expectations of functions of the component normally distributed variables:

(3.2.1)   E[f(Z)] = \sum_{i=1}^{n} w_i\, E[f(X^{(i)})]

Remark 3.2.2 In particular,

(3.2.2)   E[Z] = \sum_{i=1}^{n} w_i\, \mu^{(i)}

N.B. This is a potential source of confusion, given that it is not true in general that Z = \sum_{i=1}^{n} w_i X^{(i)}.

The variance depends not only on the variances of the components, but also on the differences between the means of the components.

Proposition 3.2.3 The variance of a random variable with the (univariate) GM distribution can be expressed in terms of the expectations and variances of the component normally distributed variables:

(3.2.3)   Var[Z_a] = \sum_{i=1}^{n} w_i \mathrm{Var}[X_a^{(i)}] + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(E[X_a^{(i)}] - E[X_a^{(j)}]\right)^2
                   = \sum_{i=1}^{n} w_i (\sigma_a^{(i)})^2 + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(\mu_a^{(i)} - \mu_a^{(j)}\right)^2

Remark 3.2.4 If we permit the variance and expectation operators to thread over the components of vector arguments, f(X)_a := f(X_a), this can be written in alternative vector form as

(3.2.4)   Var[Z] = \sum_{i=1}^{n} w_i \mathrm{Var}[X^{(i)}] + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(E[X^{(i)}] - E[X^{(j)}]\right)^2
                 = \sum_{i=1}^{n} w_i (\sigma^{(i)})^2 + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(\mu^{(i)} - \mu^{(j)}\right)^2

The variance result above is a special case of the following result for the covariance:

Proposition 3.2.5 The covariance between two elements of a vector random variable with the (multivariate) GM distribution can be expressed in terms of the expectations of functions of the component normally distributed variables:

(3.2.5)   Cov[Z_a, Z_b] = \sum_{i=1}^{n} w_i \mathrm{Cov}[X_a^{(i)}, X_b^{(i)}] + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(E[X_a^{(i)}] - E[X_a^{(j)}]\right)\left(E[X_b^{(i)}] - E[X_b^{(j)}]\right)
                        = \sum_{i=1}^{n} w_i V_{ab}^{(i)} + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(\mu_a^{(i)} - \mu_a^{(j)}\right)\left(\mu_b^{(i)} - \mu_b^{(j)}\right)

Remark 3.2.6 In matrix notation, where W_{ab} := Cov[Z_a, Z_b],

(3.2.6)   W = \sum_{i=1}^{n} w_i V^{(i)} + \frac{1}{2}\sum_{i,j=1}^{n} w_i w_j \left(\mu^{(i)} - \mu^{(j)}\right)\left(\mu^{(i)} - \mu^{(j)}\right)^{\top}

where ⊤ indicates matrix transpose.
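The identities (3.2.2) and (3.2.6) translate directly into code. A sketch, with hypothetical two-regime parameters for three assets:

import numpy as np

# Hypothetical two-regime parameters for three assets.
w = np.array([0.9, 0.1])                               # regime weights
mu = np.array([[0.010, 0.008, 0.012],                  # tranquil means
               [-0.040, -0.050, -0.030]])              # distressed means
V = np.array([np.diag([1e-4, 2e-4, 1.5e-4]),           # tranquil covariance
              3.6e-3 * np.ones((3, 3)) + np.diag([4e-4] * 3)])  # distressed

mean = w @ mu                                          # Eqn (3.2.2)

# Eqn (3.2.6): W = sum_i w_i V_i + (1/2) sum_{i,j} w_i w_j (mu_i-mu_j)(mu_i-mu_j)^T
W = sum(wi * Vi for wi, Vi in zip(w, V))
for i in range(len(w)):
    for j in range(len(w)):
        d = mu[i] - mu[j]
        W += 0.5 * w[i] * w[j] * np.outer(d, d)

print(mean, W, sep="\n")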
3.3. Linear combinations of random variables with the GM distribution

Proposition 3.3.1 Linear combinations of random variables with the (multivariate) GM distribution themselves have a (univariate) mixture of normals distribution. In particular, the (scalar) random variable Y = \sum_{a=1}^{m} \theta_a Z_a, where the m-vector random variable Z has the multivariate GM distribution and θ is an m-vector of real coefficients, has probability density function

(3.3.1)   f_Y(y) = \sum_{i=1}^{n} w_i\, \phi_{\bar\mu_i,\,\bar\sigma_i^2}(y)

where \bar\mu_i = \mu^{(i)} \cdot \theta and \bar\sigma_i^2 = \theta^{\top} V^{(i)} \theta. Similar identities may be found in [23].
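Proposition 3.3.1 is the workhorse for portfolio calculations: given weights θ, each regime contributes a normal component with mean µ̄_i and variance σ̄_i². A sketch of this reduction (the function name is ours):

import numpy as np

def portfolio_mixture_params(theta, mu, V):
    """Regime means and standard deviations of Y = theta . Z (Prop. 3.3.1).

    theta : (m,) portfolio weights
    mu    : (n, m) regime mean vectors
    V     : (n, m, m) regime covariance matrices
    Returns (mu_bar, sigma_bar), one entry per regime.
    """
    mu_bar = mu @ theta                                   # mu^(i) . theta
    sigma_bar = np.sqrt(np.einsum('a,iab,b->i', theta, V, theta))
    return mu_bar, sigma_bar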
3.4. Visualizing the GM distribution

In Figure 2, in four contour plots of pdfs, we see how two bivariate Gaussian distributions (top row) can be added to yield a GM distribution (bottom left). Note the potential for highly non-elliptic iso-probability contours when the GM distribution is used. For comparison, a bivariate normal with the same sample means, variances and covariances is included (bottom right). If the GM distribution shown were used to describe the returns to two assets, the lobe pointing down and to the left of the figure would describe the propensity of the two assets to decline sharply together (the asymmetric correlation phenomenon). The Gaussian distribution, with its elliptic contours, is clearly unable to capture this feature. These examples illustrate the two-asset case. With three assets the contours of the Gaussian distribution are three-dimensional ellipsoids (rather than ellipses in two dimensions), and for the GM distribution the contours are complicated lobed surfaces embedded in a three-dimensional space.
Figure 2: Contour plots of probability density functions. The top row contains two bivariate Gaussian distributions, potentially for the tranquil (left) and distressed (right) regimes. The bottom row illustrates the composite Gaussian mixture distribution obtained by mixing the two distributions from the top row (left) and a bivariate normal distribution with the same means and variance-covariance matrix as the composite (right).
4. Portfolio optimization

In our framework, the n Gaussian contributions to the GM distribution are associated with n regimes. In all the numerical examples we take n = 2, and call the two regimes the tranquil and distressed regimes.
4.1. Probability of shortfall as a risk measure

When variance is used as the optimization objective, the efficient frontier is independent of the distribution of the asset returns. Therefore, to achieve non-trivial results in the GM setting, we adopt alternative, non-quadratic objective functions. One with the advantage of being intuitive for practitioners is the probability of shortfall below a target return (PoS). When returns have a normal distribution, minimizing the PoS (equivalently, maximizing the probability of out-performance) is the same as maximizing the out-performance Sharpe ratio (the ratio of the difference between realized return and target return in the numerator to volatility in the denominator). Therefore adopting the GM approach will not represent a significant departure from current practice for many investors.
Note that the PoS is not the same as the VaR: the former is the probability of lying beyond a given point of the distribution (e.g. a target return of 18.2%), while the latter is the point of the distribution such that the probability of lying beyond it equals a given value (e.g. 1% or 5%).

In the GM setting of this paper, the objective is the probability that a univariate mixture of normals random variable, with regime means \bar\mu_i = \mu^{(i)} \cdot \theta and regime variances \bar\sigma_i^2 = \theta^{\top} V^{(i)} \theta, exceeds the target return k. From the expression (3.1.2) for the univariate mixture of normals CDF, it can be shown that:

Proposition 4.1.1 The probability that the portfolio return exceeds the target k is

(4.1.1)   F_k(\theta) = \sum_{i=1}^{n} w_i\, \Phi\!\left(\frac{\mu^{(i)} \cdot \theta - k}{\sqrt{\theta^{\top} V^{(i)} \theta}}\right)

so that maximizing F_k is equivalent to minimizing the probability of shortfall.
4.2. Statement of optimization problems

In each case µ_T is the target portfolio expected return.

Markowitz:

(4.2.1)   \min_{\theta}\; \theta^{\top} V^{S} \theta \qquad \text{s.t.} \qquad \theta \cdot \mathbf{1} = 1, \quad \mu \cdot \theta \ge \mu_T

where V^S is the sample variance-covariance matrix, calculated using the identity in Equation (3.2.5) or, equivalently, (3.2.6).

Probability of shortfall:

(4.2.2)   \max_{\theta}\; \sum_{i=1}^{n} w_i\, \Phi\!\left(\frac{\mu^{(i)} \cdot \theta - k}{\sqrt{\theta^{\top} V^{(i)} \theta}}\right) \qquad \text{s.t.} \qquad \theta \cdot \mathbf{1} = 1, \quad \mu \cdot \theta \ge \mu_T
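Problem (4.2.2) can be attacked with any nonlinear programming routine. A minimal sketch using SciPy's SLSQP solver follows; the regime parameters and targets are hypothetical, and since the objective may have several local maxima a production implementation would restart from multiple initial weights:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

w = np.array([0.9, 0.1])                       # regime weights (hypothetical)
mu = np.array([[0.012, 0.009, 0.011],          # tranquil regime means
               [-0.04, -0.06, -0.02]])         # distressed regime means
V = np.array([np.diag([1e-4, 2e-4, 1.5e-4]),
              3.6e-3 * np.ones((3, 3)) + np.diag([4e-4] * 3)])
k, mu_T = 0.0, 0.005                           # target return k and mu_T

mu_mix = w @ mu                                # mixture means, Eqn (3.2.2)

def objective(theta):                          # minus Eqn (4.2.2), to minimize
    m = mu @ theta
    s = np.sqrt(np.einsum('a,iab,b->i', theta, V, theta))
    return -np.dot(w, norm.cdf((m - k) / s))

cons = ({'type': 'eq',   'fun': lambda th: th.sum() - 1.0},
        {'type': 'ineq', 'fun': lambda th: mu_mix @ th - mu_T})
res = minimize(objective, x0=np.ones(3) / 3, method='SLSQP', constraints=cons)
print(res.x, -res.fun)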
4.3. Investment opportunity set

Figure 3 shows the investment opportunity set (IOS) in the 3-asset case, overlaid on the contour plot for the PoS objective function, in tranquil- and distressed-regime out-performance Sharpe ratio space. The investment opportunity set is the set of all attainable points in a given space that may be reached by constructing portfolios of the assets in a given universe. Typically variance and return are taken as the space to explore, but any statistics may be used. The GM-approach optimal portfolio is marked in the top right-hand corner, on the efficient frontier.
Figure 3: Scatter plot of portfolios in the investment opportunity set, overlaid on the contour plot of the probability of shortfall objective function. The axes are the out-performance Sharpe ratios in the tranquil (x) and distressed (y) regimes. The GM-approach optimal portfolio is marked in the top right-hand corner, on the efficient frontier. Typically, the portfolios that are optimal with respect to the mean-variance objectives in the tranquil and distressed regimes will be sub-optimal with respect to the probability of shortfall objective used in the GM approach. This is a three-asset example.
Compare this with Figure 4, which shows the IOSs for the tranquil and distressed regimes superimposed on the same plot. The axes are the portfolio means and variances in the two regimes. Clearly the portfolio that optimizes the shortfall-probability objective will be suboptimal with respect to the mean-variance efficient frontier in each of the component regimes.
5. Conclusions

We are certain that the GM approach, namely the assumption of a multivariate finite Gaussian mixture distribution for asset returns used in conjunction with a probability of shortfall objective, will be useful for a whole range of portfolio management applications. It is only slightly harder to implement than the standard Markowitz approach, with many features in common (e.g. we retain the use of covariance matrices), yet it is more flexible because of its ability to handle non-elliptic return distributions. Because it is intuitive, the technique is unlikely to face resistance from practitioners already familiar with mean-variance approaches. With two regimes, the objective function does not possess more than two maxima, so our numerical examples have been robust and quick to solve.
Figure 4: Investment opportunity sets for the tranquil and distressed regimes superimposed onto the same plot. The axes are the portfolio mean and variance. Typically the GM approach optimal portfolio will be sub-optimal with respect to both the tranquil and distressed mean-variance objectives. This is a three asset example.
We have compared the GM and mean-variance approaches. The obvious questions to ask are whether the two approaches give different optimal weights from one another and, if so, whether holding the GM weights will give improved performance using measures preferred by practitioners. The answer to both questions is in the affirmative. The optimal weights differ significantly between the approaches, and the optimal weights for the GM approach will by definition better serve an investor seeking to minimize the probability of shortfall in an environment with multiple regimes. Because the GM distribution is a better model of reality than the Gaussian distribution, we believe that the GM approach will do a better job of managing portfolios in the real world.
Acknowledgment

IB would like to thank Mr. Gerry Salkin and Prof. Nicos Christofides of the CQF, Imperial College, for funding, and his coauthors for their hospitality at RiskLab Toronto. This project was partly funded by NSERC and MITACS, a Canadian network of centres of excellence.
References

[1] Ang, A. and Bekaert, G.: How do regimes affect asset allocation?, April 2002. http://www.gsb.columbia.edu/faculty/aang/papers/inquire.pdf
[2] Ang, A. and Bekaert, G.: International asset allocation with regime shifts. On-line, April 2002.
[3] Campbell, R., Koedijk, K. and Kofman, P.: Increased correlation in bear markets. Financial Analysts Journal 58 (2002), 87–94.
[4] Capiello, L. and Fearnley, T. A.: International CAPM with regime switching GARCH parameters, July 2000. http://www.fame.ch:8080/research/papers/CapielloFearnley.pdf
[5] Carrillo, S. and Suárez, A.: Computational tools for the analysis of market risk. Computing in Economics and Finance (2000), no. 144.
[6] Dowe, D.: David Dowe's clustering, mixture modelling and unsupervised learning page, 1999. http://www.csse.monash.edu.au/~dld/mixture.modelling.page.html
[7] Driffill, J., Kenc, T. and Sola, M.: Merton-style option pricing under regime switching, January 2002. http://www.ms.ic.ac.uk/turalay/DriffillKencSola.pdf
[8] Embrechts, P., McNeil, A. J. and Straumann, D.: Correlation: pitfalls and alternatives. Risk Magazine (1999), 69–71.
[9] Erb, C. B., Harvey, C. R. and Viskanta, T. E.: Forecasting international equity correlations. Financial Analysts Journal 50 (1994), 32–45.
[10] Frees, E. and Valdez, E.: Understanding relationships using copulas. North American Actuarial Journal 2 (1998), no. 1, 1–25.
[11] Frey, R., McNeil, A. J. and Nyfeler, M.: Copulas and credit models. Risk (2001), 111–114.
[12] Graflund, A. and Nilsson, B.: Dynamic portfolio selection: The relevance of switching regimes and investment horizon, March 2002. http://swopec.hhs.se/lunewp/abs/lunewp2002_008.htm
[13] Haas, C. N.: On modeling correlated random variables in risk assessment. Risk Analysis 19 (1999), 1205–1214.
[14] Hamilton, J.: A quasi-Bayesian approach to estimating parameters for mixtures of normal distributions. Journal of Business and Economic Statistics 9 (1991), no. 1, 27–39.
[15] Hull, J. and White, A.: Value at risk when daily changes in market variables are not normally distributed. Journal of Derivatives 5 (1998), no. 3, 9–19.
[16] Klugman, S. A. and Parsa, R.: Fitting bivariate loss distributions with copulas. Insurance: Mathematics and Economics 24 (1999), 139–148.
[17] Labidi, Ch. and An, Th.: Revisiting the finite mixture of Gaussian distributions with applications to futures markets. Computing in Economics and Finance (2000), no. 67.
[18] Longin, F. and Solnik, B.: Correlation structure of international equity markets during extremely volatile periods. Journal of Finance 56 (2001), 649–676.
[19] Malevergne, Y. and Sornette, D.: Testing the Gaussian copula hypothesis for financial assets dependences, November 2001. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=291140
[20] Markowitz, H.: Portfolio selection. Journal of Finance 7 (1952), 77–91.
[21] Pelletier, D.: Regime switching for dynamic correlations, March 2002. http://www.crde.umontreal.ca/crde-cirano/pelletier.pdf
[22] Venkataraman, S.: Value at risk for a mixture of normal distributions: The use of quasi-Bayesian estimation techniques. Economic Perspectives (1997).
[23] Wang, J.: Modeling and generating daily changes in market variables using a multivariate mixture of normal distributions, January 2000. http://www.valdosta.edu/~jwang/paper/MixNormal.pdf
[24] Zangari, P.: An improved methodology for measuring VaR. RiskMetrics Monitor, second quarter. Reuters/JP Morgan, 1996.
Ian Buckley
Centre for Quantitative Finance, Imperial College
Exhibition Road, London SW7 2BX, UK
[email protected]

Gustavo Comezaña
Sigmanalysis, Suite 340, The Fields Institute
222 College St., Toronto, ON M5T 3J1, Canada
[email protected]

Ben Djerroud
Sigmanalysis, Suite 340, The Fields Institute
222 College St., Toronto, ON M5T 3J1, Canada
[email protected]

Luis Ángel Seco
RiskLab, University of Toronto, Room 205
1 Spadina Crescent, Toronto, ON M5S 3G3, Canada
[email protected]
Valoración de derivados financieros mediante diferencias finitas
José María Pesquero Fernández¹
Abstract: In this paper we review the finite difference techniques used to solve the partial differential equations arising in the pricing of financial derivatives. We start by formulating the pricing problem in terms of a partial differential equation with boundary constraints. We then introduce the numerical algorithms used, with a focus on discretization and on the convergence, consistency and stability of the technique. Applications to several exotic options are presented. Finally, we discuss the advantages of this method.
1. Introduction

By standard valuation theory, the price of a financial derivative can be expressed as the expected value, under a certain probability measure, of a function of the underlying asset. It is therefore usual to interpret the valuation of financial derivatives from a probabilistic point of view and to use numerical methods that match this intuition, which is why methods such as numerical integration, Monte Carlo simulation and probability trees are so widespread.

An alternative to this approach is to obtain the value of a financial derivative as the solution of a certain partial differential equation with appropriate boundary conditions. This equation can be derived directly by applying no-arbitrage arguments; it can also be obtained from the probabilistic formulation of the problem via the Feynman-Kac formula, which serves as a bridge between the two approaches. In most cases, solving these differential equations requires a numerical method. Finite differences are a powerful, versatile and simple alternative for this task. From here on I will use the Black-Scholes model as the basis for explaining the most important aspects of the application of this numerical technique.

¹ José María Pesquero heads the Financial Technology group of the New Products Department in the Markets Area of BBVA. This talk was given at the March 2000 session of the Instituto MEFF-RiskLab Seminar.
However, the conclusions obtained extend easily to other types of models, as I will occasionally illustrate.

The Black-Scholes model assumes lognormal behaviour of the asset S:

\frac{dS}{S} = \mu(t)\,dt + \sigma(t)\,dZ_t,

where µ(t) is the drift of the asset, σ(t) its volatility and Z_t the value of a Brownian motion at time t. Under certain hypotheses (no transaction costs, no arbitrage opportunities, continuous trading, short selling allowed and divisibility of the asset), the value V(S,t) of the derivative at time t for an underlying value S is given by the solution of the following partial differential equation:

(1)   \frac{\partial V}{\partial t} + r(t)\,S\,\frac{\partial V}{\partial S} + \frac{1}{2}\,\sigma^2(t)\,S^2\,\frac{\partial^2 V}{\partial S^2} - r(t)\,V = 0,

which is obtained by applying Itô's lemma, no-arbitrage arguments and the condition that the replicating portfolio be self-financing [Hull, 2002]; r(t) denotes the risk-free interest rate at time t. The problem is completely defined once the boundary conditions for the derivative in question are formulated; this is the only aspect that distinguishes one derivative from another under the same model for the evolution of the underlying asset.

Some features of the Black-Scholes differential equation (1) that will be relevant for the application of the numerical method are:

• It is a parabolic differential equation, i.e. it involves derivatives up to first order in time and up to second order in the spatial variable S.

• It contains a first derivative with respect to the asset value S (the convective or drift term). This term will give rise to problems of numerical instability.

• It is a "backwards" differential equation: starting from a final condition, the equation defines a smoothing of that condition.
• Uniqueness of the solution is enforced by imposing boundary conditions: frontier conditions and final conditions. The former define the value of the solution, or of its derivatives, at the ends of the S-dimension of the solution domain; no frontier conditions need to be imposed when the evolution domain of the asset is the infinite interval [0, +∞), which will be the general case in derivative pricing. The final condition defines the solution at a final time T; typically it is the payoff of the derivative at maturity T.

The Black-Scholes differential equation (1) can be transformed into the diffusion equation

(2)   \frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2}

by the following change of variables:

(3)   S = K e^{x}, \qquad t = T - \frac{\tau}{\sigma^2/2}, \qquad V = K \exp\!\left(-\tfrac{1}{2}(k_1 - 1)\,x - \tfrac{1}{4}(k_1 + 1)^2\,\tau\right) u(x,\tau), \qquad k_1 = \frac{r}{\sigma^2/2}.
Equation (2) is known as the one-dimensional heat equation, whose physical interpretation is the temperature u(t,x) at time t at position x along a bar. The heat equation has the following features:

• It is a parabolic differential equation: derivatives up to first order in time and up to second order in the spatial variable x.

• Uniqueness of the solution is enforced by imposing boundary conditions, of two types:
  – Frontier conditions: determine the solution u(t, x*), or some of its derivatives, at the ends x* of the bar, for all times.
  – Initial or final conditions: determine the initial or final temperature state at every point of the bar.

• It is a forward equation: given an initial condition, the diffusion equation defines a smoothing of that initial condition.
• The frontier conditions mentioned above can be of two types:
  – Infinite bar, −∞ < x < ∞: no conditions at the ends of the bar are needed. Technically, one only has to prevent the temperature from growing too fast; the solution is unique once the initial condition is formulated.
  – Finite bar, −L < x < L: conditions at the ends, for all times, are needed for uniqueness of the solution.

As an example, consider the boundary conditions for pricing a call option on the asset S with maturity T and strike price K. The value of the call, V(S,t), is obtained by solving the Black-Scholes differential equation (1) with the final condition

(4)   V(S,T) = \max(S - K,\, 0) \qquad \forall S.

Alternatively, after applying the change of variables (3), the problem becomes that of solving equation (2) with the initial condition

(5)   u(x,0) = \max\!\left(e^{\frac{1}{2}(k_1+1)x} - e^{\frac{1}{2}(k_1-1)x},\; 0\right) \qquad \forall x.
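As an illustration of how (2)-(5) are used in practice, here is a rough Python sketch, not from the paper, that prices a European call with the explicit scheme introduced in Section 2.2 below and then undoes the change of variables (3). Constant r and σ, and all parameter values, are assumptions of ours:

import numpy as np

# Illustrative parameters (assumed, not from the text): constant r and sigma.
K, T, r, sigma = 100.0, 1.0, 0.05, 0.2
k1 = r / (0.5 * sigma**2)

x = np.linspace(-4.0, 4.0, 401)        # truncated "infinite bar" in x = log(S/K)
dx = x[1] - x[0]
tau_max = 0.5 * sigma**2 * T           # transformed time corresponding to t = 0
n_steps = 125
dt = tau_max / n_steps
alpha = dt / dx**2                     # 0.4 <= 1/2, so the scheme is stable

# Initial condition (5).
u = np.maximum(np.exp(0.5 * (k1 + 1) * x) - np.exp(0.5 * (k1 - 1) * x), 0.0)
for _ in range(n_steps):               # explicit time stepping of Eqn (2)
    u[1:-1] = alpha * u[2:] + (1 - 2 * alpha) * u[1:-1] + alpha * u[:-2]

# Undo the change of variables (3) and read the price at S = K, i.e. x = 0.
V = K * np.exp(-0.5 * (k1 - 1) * x - 0.25 * (k1 + 1)**2 * tau_max) * u
print(np.interp(0.0, x, V))            # ~10.45, the Black-Scholes call value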
2. Numerical method: finite differences
Before going into the details of the finite difference method for solving partial differential equations, it is worth stating some general guidelines on the use of numerical methods:

• Not every numerical method for solving differential equations works well in every case. A study of the most suitable numerical method is needed before applying it to the specific problem; this guarantees that the numerically obtained solution is a good approximation to the solution of the problem posed.

• Suitability is judged by the following aspects:
  – Consistency: the approximating problem is as close to the originally posed problem as we wish.
  – Convergence: the solution of the approximating problem tends to the solution of the original problem.
  – Stability: small changes in the input data do not cause large changes in the result.
  – Efficiency: measured in terms of required resources, basically memory and execution time (number of operations performed).
2.1. Definition of the grid

Take as an example problem (1) with the final condition (4). Solving it means knowing the unknown V(S,t) for every time t and every asset value S. To tackle the problem numerically we lower our level of ambition: we will compute the (approximate) solution only at some points.

As a first step we truncate the domain of the solution: from the domain 0 ≤ t ≤ T, 0 ≤ S < +∞ on which the solution is defined, we pass to the domain

0 \le t \le T, \qquad 0 \le S \le S_{\max}.

The "size" of the problem is thus reduced by choosing a finite S_max. However, this purely numerical consideration makes it necessary to formulate a frontier condition at S = S_max, for all t, to guarantee uniqueness of the solution. Since it is generally not possible to specify the exact value of the solution V(S_max, t) for all t, it is customary to choose S_max large enough that the approximate frontier condition does not alter the solution in the domain of greatest interest.

Secondly, we define a grid of points at which the solution will be computed. (The figure in the original shows a rectangular grid covering 0 ≤ t ≤ T and S_min ≤ S ≤ S_max, with spacings ∆S and ∆t.) Without loss of generality the grid points are equally spaced, so they can be written in the form

(n\,\Delta S,\; m\,\Delta t), \qquad n = n_{\min}, \ldots, n_{\max}, \quad m = 0, \ldots, m_{\max}.

We will compute the solution only at the grid points:

u_n^m = u(n\,\Delta S,\; m\,\Delta t).
The solution at any other point can then be obtained from these values by some choice of interpolation, which must be chosen consistently with the error committed in the approximation of the derivatives.
2.2. Discretization of the differential equation

In this stage the derivatives in the differential equation are approximated by difference quotients, turning the differential equation into an algebraic equation that is easy to solve, as will be seen below. The approximating expressions for the derivatives come from the Taylor series expansion: if f(x) is sufficiently differentiable,

f(x_0 + h) = f(x_0) + \frac{\partial f}{\partial x}(x_0)\,h + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(x_0)\,h^2 + \frac{1}{3!}\frac{\partial^3 f}{\partial x^3}(x_0)\,h^3 + \cdots + \frac{1}{n!}\frac{\partial^n f}{\partial x^n}(x_0)\,h^n + \text{Error},

where

\text{Error} = \frac{1}{(n+1)!}\frac{\partial^{n+1} f}{\partial x^{n+1}}(\alpha)\,h^{n+1}, \qquad x_0 \le \alpha \le x_0 + h.

By trivial manipulation of this expansion we obtain the following approximation formulas for the first and second derivatives, together with the order of the error committed:

• First derivative:

Forward:   \frac{\partial f}{\partial x}(x_0) = \frac{f(x_0+h) - f(x_0)}{h} + O(h);

Backward:  \frac{\partial f}{\partial x}(x_0) = \frac{f(x_0) - f(x_0-h)}{h} + O(h);

Central:   \frac{\partial f}{\partial x}(x_0) = \frac{f(x_0+h) - f(x_0-h)}{2h} + O(h^2).

• Second derivative:

Central:   \frac{\partial^2 f}{\partial x^2}(x_0) = \frac{f(x_0+h) - 2f(x_0) + f(x_0-h)}{h^2} + O(h^2).

All the examples above are obtained with points equally spaced around x_0. Formulas can also be obtained for non-equidistant points. For example, the central approximation of the first derivative becomes

\frac{\partial f}{\partial x}(x_0) = \frac{f(x_0+h) - \frac{h}{h'}\,f(x_0-h') - \left(1 - \frac{h}{h'}\right) f(x_0)}{2h} + O(h - h'),

where h and h' are distinct.
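A quick numerical check of the error orders quoted above (a sketch; the test function sin x is arbitrary):

import numpy as np

f, dfdx = np.sin, np.cos
x0 = 1.0
for h in (1e-2, 1e-3):
    fwd = (f(x0 + h) - f(x0)) / h                      # O(h)
    cen = (f(x0 + h) - f(x0 - h)) / (2 * h)            # O(h^2)
    print(f"h={h:g}  forward err={abs(fwd - dfdx(x0)):.2e}"
          f"  central err={abs(cen - dfdx(x0)):.2e}")
# Reducing h by 10x cuts the forward error ~10x and the central error ~100x.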
Applying these approximations turns the differential equations into algebraic ones; how they are solved depends on which approximations are applied. Let us review some examples.

Explicit scheme

Take equation (2) as an example. On a grid of equally spaced points, the differential expression at the point (t_i, x_j) can be rewritten, using a forward difference for the time derivative and a central difference for the spatial derivative, as

\frac{u(t_i+\Delta t, x_j) - u(t_i, x_j)}{\Delta t} + O(\Delta t) = \frac{u(t_i, x_j+\Delta x) - 2u(t_i, x_j) + u(t_i, x_j-\Delta x)}{\Delta x^2} + O(\Delta x^2).

We denote by v_j^i the approximate solution of the differential equation (2), obtained by neglecting the error terms in the approximations:

(6)   \frac{v_j^{i+1} - v_j^i}{\Delta t} = \frac{v_{j+1}^i - 2v_j^i + v_{j-1}^i}{\Delta x^2}.

Rearranging,

v_j^{i+1} = v_j^i + \alpha\left(v_{j+1}^i - 2v_j^i + v_{j-1}^i\right), \qquad \text{where } \alpha = \frac{\Delta t}{\Delta x^2}.

This equation yields the solution at the point (t_{i+1}, x_j) from the solution at the three points of the previous time level, (t_i, x_{j+1}), (t_i, x_j) and (t_i, x_{j-1}).

Applying this scheme, the approximate solution of the problem can be computed once the solution is known at an initial time and on the frontiers j = j_min and j = j_max:

v_j^0 = u_0(j\,\Delta x), \qquad j_{\min} \le j \le j_{\max};
v_{j_{\min}}^i = f(i\,\Delta t), \qquad 0 \le i \le i_{\max};
v_{j_{\max}}^i = g(i\,\Delta t), \qquad 0 \le i \le i_{\max}.
The algorithm is then the following:

• The solution at time t = 0 is known (initial condition).
• The solution at the top and bottom grid points of time level i = 1 is known (frontier conditions).
• The solution at the remaining points of i = 1 is computed with the scheme

v_j^{i+1} = \alpha\,v_{j+1}^i + (1-2\alpha)\,v_j^i + \alpha\,v_{j-1}^i.

• The previous steps are repeated iteratively from i = 2 up to i = i_max.

Let us analyse the influence of the frontier conditions on the solution. A straightforward analysis of the algorithm shows that there is an interior region of the grid (the shaded triangular region in the original illustration) that is not reached by information propagating from the frontiers, and is therefore insensitive to errors in the frontier conditions. Thus, when an explicit discretization scheme is used, the error due to the frontier conditions can be kept out of our domain of interest by choosing a sufficiently wide grid.
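The algorithm translates almost line by line into code. A minimal sketch (the grid sizes and the functions u0, f and g are illustrative placeholders):

import numpy as np

def explicit_heat(u0, f, g, x, t):
    """Explicit scheme (6) for u_t = u_xx on grid x (space) by t (time)."""
    dx, dt = x[1] - x[0], t[1] - t[0]
    alpha = dt / dx**2
    assert alpha <= 0.5, "stability requires alpha <= 1/2 (see Section 2.3)"
    v = np.empty((len(t), len(x)))
    v[0] = u0(x)                                   # initial condition
    for i in range(len(t) - 1):
        v[i + 1, 1:-1] = (alpha * v[i, 2:]         # v_{j+1}^i
                          + (1 - 2 * alpha) * v[i, 1:-1]
                          + alpha * v[i, :-2])     # v_{j-1}^i
        v[i + 1, 0], v[i + 1, -1] = f(t[i + 1]), g(t[i + 1])  # frontiers
    return v

# Example: bar with ends held at zero and a sinusoidal initial profile.
x = np.linspace(0.0, 1.0, 51)
t = np.linspace(0.0, 0.1, 501)                     # alpha = 0.5 exactly
v = explicit_heat(lambda x: np.sin(np.pi * x), lambda t: 0.0, lambda t: 0.0, x, t)
print(v[-1].max())  # decays towards exp(-pi^2 * 0.1) ~ 0.373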
Another important observation is that the explicit scheme set out here can be interpreted as a tree:

v_j^{i+1} = \alpha\,v_{j+1}^i + (1-2\alpha)\,v_j^i + \alpha\,v_{j-1}^i.

The solution at a point of time level i+1 is computed as a weighted sum of the values of the solution at three points of the previous time level. The coefficients can be interpreted as probabilities whenever 0 < α ≤ 1/2, since they sum to 1 and each lies between 0 and 1. As we will see later, the condition 0 < α ≤ 1/2 will always be imposed on the explicit scheme for stability reasons, so an explicit scheme works in essence exactly like a tree: choosing α = 1/2 gives a binomial tree and 0 < α < 1/2 a trinomial tree. The tree can be interpreted as the discrete version of a jump of v according to a normal distribution with mean 0 and variance 2α∆x². This gives an intuitive justification for solving the Black-Scholes partial differential equation (1) after a change of variable that makes log S (or a linear function of it) the new variable, since in the Black-Scholes model log S is normally distributed; one example of such a change of variables is (3). This conclusion will be put on a more rigorous footing via the stability and convergence criteria for the method.

As indicated earlier, the solution at points not belonging to the grid is obtained by interpolation. Here a linear interpolation between neighbouring grid points is chosen, since the error committed in this approximation is of the same order as the one committed in the discretization of the differential equation.

Implicit scheme

Let us now use, for the same problem, a backward difference to approximate the time derivative. The approximate solution then satisfies

\frac{v_j^{i+1} - v_j^i}{\Delta t} = \frac{v_{j+1}^{i+1} - 2v_j^{i+1} + v_{j-1}^{i+1}}{\Delta x^2},

that is,

-\alpha\,v_{j+1}^{i+1} + (1+2\alpha)\,v_j^{i+1} - \alpha\,v_{j-1}^{i+1} = v_j^i.
This time the solution at a point does not depend only on the solution at the immediately preceding time level; it also depends on the solution at other points of the same time level. Computing the solution at a given time level therefore requires solving a linear system of equations:

\begin{pmatrix}
1+2\alpha & -\alpha & 0 & \cdots & 0 & 0\\
-\alpha & 1+2\alpha & -\alpha & \cdots & 0 & 0\\
0 & -\alpha & 1+2\alpha & \ddots & 0 & 0\\
\vdots & & \ddots & \ddots & \ddots & \vdots\\
0 & 0 & 0 & \ddots & 1+2\alpha & -\alpha\\
0 & 0 & 0 & \cdots & -\alpha & 1+2\alpha
\end{pmatrix}
\begin{pmatrix}
v_{j_{\min}+1}^{i+1}\\ v_{j_{\min}+2}^{i+1}\\ v_{j_{\min}+3}^{i+1}\\ \vdots\\ v_{j_{\max}-2}^{i+1}\\ v_{j_{\max}-1}^{i+1}
\end{pmatrix}
=
\begin{pmatrix}
v_{j_{\min}+1}^{i}\\ v_{j_{\min}+2}^{i}\\ v_{j_{\min}+3}^{i}\\ \vdots\\ v_{j_{\max}-2}^{i}\\ v_{j_{\max}-1}^{i}
\end{pmatrix}
+ \alpha
\begin{pmatrix}
v_{j_{\min}}^{i+1}\\ 0\\ 0\\ \vdots\\ 0\\ v_{j_{\max}}^{i+1}
\end{pmatrix}

The system matrix is diagonally dominant,

|1+2\alpha| > 2\,|\alpha|, \qquad \alpha > 0,
por lo que la matriz es invertible, y por tanto la soluci´ on u ´nica. Se utilizar´ a un m´etodo de resoluci´ on espec´ıfico para matrices tridiagonales, como por ejemplo el LU . El algoritmo a utilizar para calcular la soluci´ on es: • Se conoce la soluci´ on para i = 0 (condici´ on inicial). • Se conoce la soluci´ on para i = 1 en los extremos del mallado (condici´ on de frontera). • Se resuelve el sistema de ecuaciones previo para calcular la soluci´ on en el resto de puntos de i = 1. • Se repiten las etapas previas iterativamente desde i = 2 hasta i = im´ax . En esta ocasi´ on las condiciones de frontera afectan a la soluci´ on en todos los puntos del mallado. Las ecuaciones para el c´ alculo de la soluci´ on est´ an acopladas y cualquier cambio en alguno de los puntos del mallado afecta instant´ aneamente a la soluci´ on en el resto de puntos de ese momento de tiempo y consecuentemente a todos los instantes de tiempo posteriores. Esquemas semi-impl´ıcitos Se obtienen como combinaci´ on lineal convexa de los m´etodos expl´ıcito e impl´ıcito vistos previamente: i+1 i+1 i i vji+1 − vji vj+1 − 2vji+1 + vj−1 vj+1 − 2vji + vj−1 =θ + (1 − θ) . ∆t ∆x2 ∆x2 Es una media ponderada del esquema expl´ıcito e impl´ıcito. Si θ = 0 tenemos el esquema expl´ıcito y si θ = 1 el impl´ıcito. θ es un factor de ponderaci´ on que var´ıa entre 0 y 1. i+1 i i+1 i = vji + (1 − θ)α vj+1 . vji+1 − θα vj+1 − 2vji+1 + vj−1 − 2vji + vj−1
[Figure: stencil of the semi-implicit scheme on the $(t,x)$ grid.]

Again we must solve a system of equations to compute the solution at a given time level from the known solution at the previous one. An especially interesting case is the choice $\theta = 1/2$ (Crank-Nicolson method). In this case the error of the time derivative is of order $\Delta t^2$ instead of $\Delta t$, which is what we had in the previous schemes. The benefit this brings to the consistency of the method will become clear in the next subsection.
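As an illustration of how these three schemes are implemented in practice, here is a minimal sketch in Python of one time step of the $\theta$-scheme for $v_t=v_{xx}$, using a banded solver for the tridiagonal system (the function name and grid conventions are our own, not the paper's):

```python
import numpy as np
from scipy.linalg import solve_banded

def theta_step(v, alpha, theta, v_left, v_right):
    """One time step of the theta-scheme for v_t = v_xx.

    v                : solution at level i, including both boundary points
    alpha            : dt / dx**2
    theta            : 0 explicit, 1 implicit, 1/2 Crank-Nicolson
    v_left, v_right  : boundary values at level i+1
    """
    n = len(v) - 2                              # number of interior points
    # explicit part of the right-hand side
    rhs = v[1:-1] + (1 - theta) * alpha * (v[2:] - 2 * v[1:-1] + v[:-2])
    # boundary values of level i+1 enter the first and last equations
    rhs[0] += theta * alpha * v_left
    rhs[-1] += theta * alpha * v_right
    # tridiagonal matrix (1 + 2*theta*alpha on the diagonal) in banded storage
    ab = np.zeros((3, n))
    ab[0, 1:] = -theta * alpha                  # superdiagonal
    ab[1, :] = 1 + 2 * theta * alpha            # main diagonal
    ab[2, :-1] = -theta * alpha                 # subdiagonal
    v_new = np.empty_like(v)
    v_new[0], v_new[-1] = v_left, v_right
    v_new[1:-1] = solve_banded((1, 1), ab, rhs)
    return v_new
```

With $\theta=0$ the matrix reduces to the identity and the step is the explicit update; with $\theta=1/2$ it is the Crank-Nicolson step discussed above.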
2.3. Consistency, convergence and stability
At the beginning of Section 2 the concepts of consistency, convergence and stability were defined. Their analysis is essential to determine whether the method is suited to the problem under study. Below, these concepts are analysed for the explicit, implicit and semi-implicit schemes applied to problem (2). The case of differential equations with non-constant coefficients is analysed as well.

2.3.1. Explicit method
Consistency. The approximate model (difference equation) tends to the original model (differential equation). The error due to this approximation is called the truncation error $T$. According to (6), and using a Taylor series expansion,

$$T_j^i=T(j\Delta x,\,i\Delta t)=\frac{u_j^{i+1}-u_j^i}{\Delta t}-\frac{u_{j+1}^i-2u_j^i+u_{j-1}^i}{\Delta x^2}=\frac{1}{2}\,\Delta t\left.\frac{\partial^2 u}{\partial t^2}\right|_{j,i}+O(\Delta t^2)+\frac{1}{12}\,\Delta x^2\left.\frac{\partial^4 u}{\partial x^4}\right|_{j,i}+O(\Delta x^4).$$

If we write $M_{tt}\ge\bigl|\partial^2 u/\partial t^2\bigr|$ and $M_{xxxx}\ge\bigl|\partial^4 u/\partial x^4\bigr|$, the truncation error is bounded by

$$|T|\le\Delta t\left(\frac{1}{2}\,M_{tt}+\frac{1}{6\alpha}\,M_{xxxx}\right)+O(\Delta t^2). \tag{7}$$
It can be seen that the truncation error tends to 0 as the time step tends to 0 with $\alpha$ held constant. According to (7), the method is consistent of first order, except in the case $\alpha=1/6$, for which we have second-order consistency.

Stability. Small changes in the conditions of the problem imply small changes in the solution. This is very important because solving the approximate problem on a computer introduces rounding errors, which are not amplified in a stable model. The solution of the differential equation can be written as a Fourier series (a sum of harmonic terms whose amplitude varies with $t$), and the same holds for the solution of the difference equation. Trying the solution

$$v_j^i=\lambda^i e^{ik(j\Delta x)}$$

in equation (6) yields

$$\frac{\lambda-1}{\Delta t}\,v_j^i=\frac{e^{ik\Delta x}-2+e^{-ik\Delta x}}{\Delta x^2}\,v_j^i,\qquad
\lambda=1-4\alpha\,\sin^2\!\Bigl(\frac{k\Delta x}{2}\Bigr).$$
$\lambda$ can be interpreted as an amplification factor. It satisfies $|\lambda|\le 1$ for every $k$ precisely when

$$\alpha\le\frac{1}{2}.$$

Convergence. The solution of the approximate model tends to the solution of the original model. The error $e_j^i$ at time $t_i$ and point $x_j$ is

$$e_j^i=v_j^i-u(j\Delta x,\,i\Delta t).$$

Substituting into (6) and taking into account the definition of the truncation error,

$$\frac{e_j^{i+1}-e_j^i}{\Delta t}-\frac{e_{j+1}^i-2e_j^i+e_{j-1}^i}{\Delta x^2}=T_j^i,$$
$$e_j^{i+1}=\alpha\,e_{j+1}^i+(1-2\alpha)\,e_j^i+\alpha\,e_{j-1}^i-T_j^i\,\Delta t.$$

If $0<\alpha\le 1/2$, all the coefficients of the previous equation are positive and less than 1. Defining the upper bound of the error at time level $i$ as $E^i=\max_j|e_j^i|$, and a bound of the truncation error over the whole grid as $T\ge|T_j^i|$ for all $j,i$, we obtain

$$|e_j^{i+1}|\le E^i+T\,\Delta t.$$

Since the error at the initial time is zero (the solution there equals the initial condition),

$$E^i\le i\,T\,\Delta t,$$

and therefore

$$E^{i_{\max}}\le\Delta t\left(\frac{1}{2}\,M_{tt}+\frac{1}{6\alpha}\,M_{xxxx}\right)t_{\max}.$$

The convergence error tends to 0 as the time step is reduced with $\alpha$ held constant.
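The stability threshold can be checked numerically from the amplification factor derived above (a small sketch; names are ours):

```python
import numpy as np

def max_amplification(alpha, n_modes=1001):
    """Largest |lambda(k)| over the Fourier modes of the explicit scheme."""
    k_dx = np.linspace(0.0, np.pi, n_modes)              # k*dx in [0, pi]
    lam = 1.0 - 4.0 * alpha * np.sin(k_dx / 2.0) ** 2
    return np.abs(lam).max()

print(max_amplification(0.5))    # 1.0 -> stable
print(max_amplification(0.6))    # 1.4 -> some modes are amplified
```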
We arrive at the same condition $\alpha\le 1/2$ that we required for stability. This is no coincidence: it can be shown that a consistent and stable method is convergent.

2.3.2. Implicit method
Using arguments similar to those applied to the explicit method, the following conclusions are obtained:

Consistency. The same result as for the explicit method: consistent of first order.

Stability. The method is stable for every $\alpha>0$; it is therefore unconditionally stable.

Convergence. Being consistent and unconditionally stable, the implicit method is unconditionally convergent.
2.3.3. Semi-implicit methods
Using arguments similar to those applied to the explicit method, the following conclusions are obtained:

Consistency. They are consistent of first order. Choosing $\theta=1/2$ (Crank-Nicolson method) achieves second-order consistency:

$$|T|\approx O(\Delta t^2)+O(\Delta x^2).$$

Stability. The stability condition is

$$\alpha(1-2\theta)\le\frac{1}{2}.$$

If $\theta\ge 1/2$ the scheme is unconditionally stable; Crank-Nicolson belongs to this group of schemes.

Convergence. The scheme is convergent whenever it is stable, and is therefore conditionally convergent.

• Differential equations with non-constant coefficients and a convective term

One way proposed in the literature [Dupire, 1994] to represent the term structure of volatilities and the market smile is defined by the following choice of stochastic differential equation for the underlying:

$$\frac{dS}{S}=\mu(t)\,dt+\sigma(t,S)\,dZ_t. \tag{8}$$

According to this model, the price of a derivative solves the differential equation

$$\frac{\partial V}{\partial t}+r(t)\,S\,\frac{\partial V}{\partial S}+\frac{1}{2}\,\sigma^2(t,S)\,S^2\,\frac{\partial^2 V}{\partial S^2}-rV=0. \tag{9}$$

This is a parabolic equation with non-constant coefficients.
In these cases the stability analysis becomes more complex:

– Since the coefficients of the equation are functions of $t$ and $S$, the stability conditions differ at each grid point. If we take the most conservative situation in order to formulate a single stability condition for the whole grid, we penalize efficiency.

– The stability condition is formulated as two inequalities. The first limits the value of $\Delta S$ as a function of the maximum value of $S$ on the grid; this restriction appears because of the convective term
$$r(t)\,S\,\frac{\partial V}{\partial S}$$
in the differential equation. The second limits the value of
$$\alpha=\frac{\Delta t}{\Delta x^2}.$$

The first of these conditions did not appear in any of the previous cases, since there was no convective term in the differential equation. This has two effects: it harms efficiency and makes the stability condition depend on the chosen grid. For this reason it is advisable, whenever possible, to perform changes of variables that transform the differential equation into one with constant coefficients and eliminate the convective term, which tends to cause numerical problems.
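For reference, a sketch of one standard substitution of this kind for the constant-parameter Black-Scholes equation (the notation $u$, $x$, $\tau$ is ours; this is not necessarily the paper's change of variables (3)):

$$x=\log S,\qquad \tau=T-t,\qquad u(x,\tau)=V(S,t),$$

$$\frac{\partial u}{\partial \tau}=\frac{1}{2}\,\sigma^2\,\frac{\partial^2 u}{\partial x^2}+\Bigl(r-\frac{1}{2}\,\sigma^2\Bigr)\frac{\partial u}{\partial x}-ru.$$

The coefficients are now constant, and a further substitution $u=e^{ax+b\tau}w(x,\tau)$ with suitable constants $a$ and $b$ removes the first- and zero-order terms, leaving the heat equation $w_\tau=\tfrac{1}{2}\sigma^2 w_{xx}$.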
3. Application to the valuation of some exotic options

3.1. Digital options
These are options that pay one unit of currency at the option's expiry if the underlying asset is at that date above the strike price (call) or below it (put). The problem differs from the valuation of a standard call or put option (4) only in the boundary conditions. For a digital call:

$$V(S,T)=\begin{cases}1 & \text{if } S>K\\ 0 & \text{if } S\le K\end{cases}$$
$$V(0,t)=0\qquad\forall t;$$
$$V(S,t)\to e^{-r(T-t)}\qquad\forall t,\ \text{as } S\to\infty.$$
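Continuing the earlier sketch, a minimal explicit-scheme pricer for this digital call, assuming the untransformed Black-Scholes equation and illustrative parameters of our own choosing (as discussed in 2.3, the explicit scheme then needs a small time step to remain stable):

```python
import numpy as np

# illustrative parameters (ours, not the paper's)
K, r, sigma, T = 100.0, 0.05, 0.2, 1.0
S_max, n_space, n_time = 300.0, 150, 2000

S = np.linspace(0.0, S_max, n_space + 1)
dS, dt = S[1] - S[0], T / n_time
v = np.where(S > K, 1.0, 0.0)                 # terminal condition V(S, T)

for m in range(1, n_time + 1):
    t = T - m * dt                            # stepping backwards in time
    gamma = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dS**2
    delta = (v[2:] - v[:-2]) / (2.0 * dS)
    v_new = v.copy()
    v_new[1:-1] = v[1:-1] + dt * (0.5 * sigma**2 * S[1:-1]**2 * gamma
                                  + r * S[1:-1] * delta - r * v[1:-1])
    v_new[0] = 0.0                            # V(0, t) = 0
    v_new[-1] = np.exp(-r * (T - t))          # V -> e^{-r(T-t)} as S -> infinity
    v = v_new

price = np.interp(105.0, S, v)                # linear interpolation off-grid
```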
3.2. Compound options
These options give the holder the right to buy or sell, at a time $t_1$ and at a price $K_1$, another option with expiry $t_2$ ($>t_1$) and strike $K_2$. Their valuation can be set up as follows: we build a grid from 0 to $t_1$. The final condition at $t_1$ is

$$V(S,t_1)=\max\bigl(V(S,t_1;K_2)-K_1,\,0\bigr),$$

where $V(S,t_1;K_2)$ is the price at $t_1$ of an option with strike $K_2$ expiring at $t_2$; this price can be computed analytically. The boundary conditions depend on the type of options involved (calls or puts).
3.3. Chooser options

At a future time $t_1$ the holder decides whether the option expiring at $t_2$ is a call or a put. The way to solve it is to build a grid divided into two stretches: from 0 to $t_1$ and from $t_1$ to $t_2$. From $t_1$ to $t_2$ the price of a call $C(S,t)$ and of a put $P(S,t)$ is computed by the usual procedure at every grid point. At $t_1$ the final condition

$$V(S,t_1)=\max\bigl(C(S,t_1),\,P(S,t_1)\bigr)$$

is imposed, and from 0 to $t_1$ the boundary conditions are

$$V(0,t)=K\,e^{-r(t_1-t)},\qquad V(S,t)\to S-K\,e^{-r(t_1-t)}\ \text{ as } S\to\infty.$$

3.4. Barrier options
These are options that are activated or deactivated when the underlying asset touches certain levels during the life of the option. The price of these options is computed by applying the standard methodology with appropriate boundary conditions. For example, for an up-and-out call (the option is knocked out when the underlying touches an upper barrier) with barrier level $B$ and a rebate $R$ paid if the barrier is touched, the conditions would be:

$$V(S,T)=\max(S-K,\,0),$$
$$V(0,t)=0\qquad\forall t,$$
$$V(B,t)=R\qquad\forall t.$$
The grid in the $S$ dimension is therefore bounded between $S=0$ and $S=B$ as a consequence of the definition of the option.
This methodology also makes it possible to value partial or window options, in which the barrier is active only during part of the life of the option. The boundary condition associated with the barrier is imposed only during that part of the barrier's life.
3.5. Asian options

These are options whose payoff depends on the average of the underlying asset over the life of the option. They can be classified into two large groups:

A. Continuous averaging. The average is computed continuously. In this case it is necessary to add to the problem a new variable that defines the average:

$$I(t)=\int_0^t f\bigl(S(\tau),\tau\bigr)\,d\tau$$
with $f(S(\tau),\tau)=S(\tau)$ (arithmetic average) or $f(S(\tau),\tau)=\log S(\tau)$ (geometric average). Differentiating, $dI(t)=f(S(t),t)\,dt$. The arithmetic average is computed as $I(t)/t$ and the geometric one as $e^{I(t)/t}$. The differential equation satisfied by the value of the option is in this case [Tavella, 2000]:

$$\frac{\partial V}{\partial t}+f(S,t)\,\frac{\partial V}{\partial I}+rS\,\frac{\partial V}{\partial S}+\frac{1}{2}\,\sigma^2S^2\,\frac{\partial^2 V}{\partial S^2}-rV=0.$$

For those options in which the strike is the average of the underlying over the life of the option, the above differential equation and the boundary conditions can be expressed in terms of only two variables (one temporal and one spatial) by means of a suitable change of variables [Wilmott, 1993]. In those cases we would then apply the standard techniques seen above. On other occasions this reduction of variables is not possible, and one must face the solution of a three-dimensional problem:

– A grid with one time dimension $t$ and two spatial dimensions $S$ and $I$.
[Figure: three-dimensional grid in $t$, $S$ and $I$.]
– Discretization of the previous three-dimensional differential equation. The same discretization schemes as before could be applied, with similar stability results. However, the resulting implicit schemes would lead to matrices that are no longer tridiagonal, causing a loss of efficiency in the solution of the system of equations. For this reason specific discretization schemes are used (ADI and LOD methods) which, while preserving the stability results, allow the solution to proceed in two stages, both defined by tridiagonal matrices [Morton, 1994].

– Boundary conditions. If we have, for example, a put option on the arithmetic average, the conditions are:
$$V(S,I,T)=\max\Bigl(K-\frac{I(T)}{T},\,0\Bigr);$$
$$V(0,I,t)=e^{-r(T-t)}\,\max\Bigl(K-\frac{I(t)}{T},\,0\Bigr);$$
$$V(S,I,t)\to 0\quad\text{as } S\to\infty;$$
$$V(S,0,t)=e^{-r(T-t)}\,K;$$
$$V(S,I,t)\to 0\quad\text{as } I\to\infty.$$
– The first-derivative term with respect to $I$ (a convective term in the $I$ dimension) can generate numerical problems, as mentioned previously. These will have to be treated with specific numerical techniques.

B. Discrete averaging. In this case the average is defined over the value of the underlying at discrete times:

$$I(t_n)=\sum_{i=0}^{n} f\bigl(S(t_i),t_i\bigr).$$
$I$ is therefore a variable that remains constant between averaging dates $t_i$, so on those stretches $I$ can be treated as a parameter, and the value of the option between averaging dates obeys the Black-Scholes differential equation. At the averaging dates $t_i$ the value of the parameter $I$ changes abruptly. The value $V$ of the option, however, must behave continuously in order to rule out arbitrage: if a jump in $I$ implied a negative jump in $V$, the arbitrage could be exploited by selling the option immediately before the change in $I$ and buying it back immediately afterwards, earning a profit equal to the jump in $V$; if the jump in $V$ were positive, the strategy would be the opposite one. Therefore, to ensure the continuity of $V$ at the averaging dates, the so-called jump conditions are imposed:

$$V\bigl(S(t_i^+),\,I_i,\,t_i^+\bigr)=V\bigl(S(t_i^-),\,I_{i-1},\,t_i^-\bigr).$$
As an example, the jump condition for the case of an arithmetic average would read:

$$V\bigl(S,\,I_{i-1}+S,\,t_i^+\bigr)=V\bigl(S,\,I_{i-1},\,t_i^-\bigr).$$

Most likely $I_{i-1}+S$ will not coincide with any node in the $I$ dimension of the grid, which forces interpolation in this dimension. In these cases cubic-spline interpolation is recommended in order to eliminate the artificial-volatility problems that linear interpolation would generate [Tavella, 2000].

The computation of Asian options with discrete sampling therefore has the following ingredients:

– A grid with one time dimension $t$ and two spatial dimensions $S$ and $I$.
– Between sampling dates $t_i$ the Black-Scholes differential equation is solved with $I$ as a parameter.
– Boundary conditions need only be imposed in $S$, since $I$ behaves as a parameter.
– At the dates $t_i$ the jump conditions are applied.
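A sketch of this jump condition during backward induction at one sampling date, using cubic splines in the $I$ dimension as recommended above (array layout and names are our own):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def apply_jump_condition(v_plus, S_grid, I_grid):
    """Backward induction across a sampling date t_i (arithmetic average).

    v_plus[j, k] = V(S_grid[j], I_grid[k], t_i^+).  Continuity requires
    V(S, I, t_i^-) = V(S, I + S, t_i^+), since the running sum I jumps by +S.
    """
    v_minus = np.empty_like(v_plus)
    for j, S in enumerate(S_grid):
        spline = CubicSpline(I_grid, v_plus[j, :])
        # crude guard against evaluating beyond the I grid
        I_shifted = np.clip(I_grid + S, I_grid[0], I_grid[-1])
        v_minus[j, :] = spline(I_shifted)
    return v_minus
```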
3.6. American options

An American option gives the holder the right to exercise at any time during its life. Depending on the value of the underlying asset at each moment, it will or will not be optimal to exercise, so two regions must be distinguished: the exercise region and the no-exercise region. The boundary separating the two regions is a priori unknown: this is a free-boundary problem.

An obvious restriction in the valuation of American options is that the value of the option can never fall below the payoff obtained by exercising at that moment; otherwise there would be an arbitrage opportunity. For example, for an American option to buy an asset at a strike price $K$:

$$V(S,t)\ge\max(S-K,\,0);$$

in the exercise region $V(S,t)=\max(S-K,\,0)$, while in the no-exercise region $V(S,t)>\max(S-K,\,0)$.

On the other hand, the Black-Scholes differential equation becomes an inequality:

$$\frac{\partial V}{\partial t}+rS\,\frac{\partial V}{\partial S}+\frac{1}{2}\,\sigma^2S^2\,\frac{\partial^2 V}{\partial S^2}-rV\le 0;\qquad \mathcal{L}_{BS}V\le 0,$$

where $\mathcal{L}_{BS}$ is the Black-Scholes differential operator. This can be justified as follows: in the exercise region it is optimal to exercise the option, and holding the portfolio option + delta hedge yields a return below the riskless interest rate: $\mathcal{L}_{BS}V<0$.
In the no-exercise region the portfolio option + delta hedge returns the riskless interest rate: $\mathcal{L}_{BS}V=0$. The valuation can therefore be posed as a complementarity problem:

$$\mathcal{L}_{BS}V\le 0,$$
$$V(S,t)\ge\max(S-K,\,0),$$
$$\mathcal{L}_{BS}V\cdot\bigl(V-\max(S-K,\,0)\bigr)=0.$$

Its numerical solution can be set up with an explicit or an implicit scheme:

– Explicit scheme: the problem is set up as the valuation of a European option, computing the solution at one time level from the solution at the following one by solving the Black-Scholes differential equation. Once the solution at $t_i$ has been computed, the no-arbitrage condition is imposed:

$$V_j^i=\max\bigl(V_j^i,\ \max(S_j-K,\,0)\bigr).$$

– Implicit schemes: unlike the previous case, the American character of the option cannot be imposed through the previous equation on the solution computed by Black-Scholes. The reason is that in an implicit scheme the solution is given by a system of coupled equations: any variation of the solution at one node implies changes in the solution at all the other nodes of that time level. Specific methods, such as PSOR (projected successive over-relaxation), must be used.
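A minimal PSOR sketch for one implicit time step: the tridiagonal coefficients a, b, c come from whatever implicit discretization is chosen, and the relaxation parameter and tolerance are illustrative choices of ours:

```python
import numpy as np

def psor_step(rhs, a, b, c, payoff, v0, omega=1.2, tol=1e-8, max_iter=10000):
    """Projected SOR for the discrete complementarity problem:
    solve the tridiagonal system A v = rhs (a sub-, b main, c superdiagonal)
    subject to the early-exercise constraint v >= payoff."""
    v = v0.copy()
    n = len(v)
    for _ in range(max_iter):
        error = 0.0
        for i in range(n):
            lower = a[i] * v[i - 1] if i > 0 else 0.0
            upper = c[i] * v[i + 1] if i < n - 1 else 0.0
            gs = (rhs[i] - lower - upper) / b[i]       # Gauss-Seidel value
            v_new = max(payoff[i], v[i] + omega * (gs - v[i]))
            error += (v_new - v[i]) ** 2
            v[i] = v_new
        if error < tol:
            break
    return v
```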
4. Advantages of finite differences in derivative valuation

• They use numerical techniques already developed and refined over many years in other fields of science.

• They allow options to be valued under asset dynamics for which no analytical solutions exist. An example is model (8): valuing derivatives under this model requires solving equation (9) by finite differences. For other options, analytical solutions exist when the asset pays dividends at a continuous rate but cease to be valid when dividends are discrete; the techniques seen here allow valuation in both scenarios. When dividends are continuous, the differential equation to solve under the Black-Scholes model is

$$\frac{\partial V}{\partial t}+(r-D)\,S\,\frac{\partial V}{\partial S}+\frac{1}{2}\,\sigma^2S^2\,\frac{\partial^2 V}{\partial S^2}-rV=0,$$
where $D$ is the dividend rate, which may even be a function of $t$ and $S$. The only change with respect to the previous sections is the appearance of the parameter $D$ when discretizing the differential equation and in the boundary conditions. If the dividends are discrete, a simple arbitrage argument shows that if there is a dividend payment $D_i$ at time $t_i$, the underlying asset behaves at that instant as

$$S(t_i^+)=S(t_i^-)-D_i\bigl(S(t_i^-)\bigr).$$

The valuation of options on assets paying discrete dividends therefore requires adding, on top of what we have seen, jump conditions at the dividend payment dates. These jump conditions are determined by the continuity of the option price and can be formulated as

$$V\bigl(t_i^-,\,S(t_i^-)\bigr)=V\bigl(t_i^+,\,S(t_i^+)\bigr).$$
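A sketch of this jump condition in backward induction, assuming a cash dividend $D_i$ and the linear interpolation in $S$ used elsewhere in the paper (names are ours):

```python
import numpy as np

def dividend_jump(v_plus, S_grid, D_i):
    """V(t_i^-, S) = V(t_i^+, S - D_i): the asset drops by the dividend
    at t_i, so pre-dividend values are the post-dividend values read at
    the shifted asset level."""
    S_shifted = np.clip(S_grid - D_i, S_grid[0], S_grid[-1])
    return np.interp(S_shifted, S_grid, v_plus)
```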
• There are no restrictions on the construction of the grid. This allows exact representation of times and underlying levels that are key for the derivative being valued, such as dividend payment dates, option dates, strike prices, barriers, etc. Moreover, the possibility of using a variable grid allows the mesh to be refined in areas where one wants to reduce the approximation error, such as high-gamma regions of the derivative, while a coarser mesh can be kept in areas where the approximation error is small. With an intelligent choice of grid one obtains an efficient and accurate valuation method.

• The techniques seen here are useful for other valuation models; for example, they can be used to compute prices of interest rate options. The Hull-White model postulates the following evolution of the instantaneous interest rate under the risk-neutral probability:

$$dr=\bigl(\theta(t)-ar\bigr)\,dt+\sigma\,dW_t.$$

The differential equation governing the behaviour of any derivative $V$ is

$$\frac{\partial V}{\partial t}+\frac{1}{2}\,\sigma^2\,\frac{\partial^2 V}{\partial r^2}+\bigl(\theta(t)-ar\bigr)\,\frac{\partial V}{\partial r}-rV=0.$$

The numerical solution of this problem proceeds by discretizing this differential equation with suitable boundary conditions. Multifactor models can also be set up; the resulting differential equations include more dimensions, implying larger memory consumption and computational cost. As an example, the differential equation satisfied by the price
of a derivative whose payoff depends on two underlying assets following the Black-Scholes model (1) with correlation $\rho$ is

$$\frac{\partial V}{\partial t}+r(t)S_1\frac{\partial V}{\partial S_1}+r(t)S_2\frac{\partial V}{\partial S_2}+\frac{1}{2}\sigma_1^2(t)S_1^2\frac{\partial^2 V}{\partial S_1^2}+\frac{1}{2}\sigma_2^2(t)S_2^2\frac{\partial^2 V}{\partial S_2^2}+\rho\,\sigma_1(t)\sigma_2(t)S_1S_2\frac{\partial^2 V}{\partial S_1\partial S_2}-rV=0.$$

• They allow the valuation of path-dependent options, i.e., options whose payoff also depends on the path followed by the underlying up to expiry. This feature enters the option as a new variable (see the Asian options above), with the consequent increase in computational cost. By applying the techniques described for American options, options that are simultaneously American and path-dependent can be valued.

• The finite-difference technique computes the solution at every point of the grid. This makes it possible to determine sensitivities of the option price to movements of the underlying and of time (delta, gamma, theta) simply by differentiating numerically the results already obtained. Likewise, the solution provides scenarios for combinations of the value of the underlying and time. Other sensitivities can be obtained by solving the differential equation that those derivatives satisfy. For example, the vega $\upsilon$ (derivative with respect to volatility) of a derivative on an asset $S$ following the Black-Scholes model (1) is governed by

$$\frac{\partial \upsilon}{\partial t}+r(t)S\,\frac{\partial \upsilon}{\partial S}+\frac{1}{2}\,\sigma^2(t)S^2\,\frac{\partial^2 \upsilon}{\partial S^2}+\sigma(t)S^2\,\frac{\partial^2 V}{\partial S^2}-r\upsilon=0.$$

The equations governing the price and its derivatives can be solved simultaneously, taking advantage of the same grid and many common computations.

• One difficulty of the method is the formulation of the boundary conditions. Throughout this article we have used conditions that approximate the value of the solution at the ends of the grid, which sometimes force the use of very large grids for the approximation to be admissible; moreover, it is not always easy to formulate this type of boundary condition. An alternative is to define the boundary conditions in terms of derivatives of the solution. Products generally tend to behave linearly as we approach the ends of the grid; in those cases one can impose that the second derivative of the solution vanish at the ends, or use at the boundary points the same differential equation but without the second-derivative terms.
References

[1] Dupire, B. (1994): "Pricing with a smile". Risk 7 (January), 18–20.
[2] Hull, J.C. (2002): Options, Futures and Other Derivatives. Prentice Hall.
[3] Morton, K.W., Mayers, D.F. (1994): Numerical Solution of Partial Differential Equations. Cambridge University Press.
[4] Tavella, D., Randall, C. (2000): Pricing Financial Instruments. The Finite Difference Method. Wiley.
[5] Wilmott, P., Dewynne, J., Howison, S. (1993): Option Pricing. Oxford Financial Press.
[6] Wilmott, P. (1998): Derivatives. Wiley.
José María Pesquero Fernández
BBVA
Vía de los Poblados s/n
28033 Madrid

[email protected]
Application of option theory to the assessment of credit risk: the relationship between default probability and rating via an ordered probit

Teresa Corzo Santamaría¹
Abstract: Default probabilities are important to credit markets. Changes in default probabilities may forecast either credit migrations or default. While rating agencies such as Moody's and Standard & Poor's compute historical default frequencies, option models can also be used to calculate forward-looking or expected default frequencies. In this paper we compute risk-neutral default probabilities using the diffusion option models of Merton (1974) and Geske (1977). It is shown that the Geske model produces a term structure of default probabilities. With an ordered probit model we test the relationship between the default probabilities of each model and the rating given to the firm by Standard & Poor's. The results of this paper show that there is a relationship between the default probabilities and the quality of the debt.
1. Introduction

Credit risk is a subject of growing importance. The Bank for International Settlements in Basel has required all its member institutions to be able to measure and manage the interactions of their market and credit risks since 31 December 1998. One estimate of credit risk is provided by the rating of the debt securities issued by an agent. There is no unique way of establishing the rating of a given set of securities. Nowadays ratings are produced by specialized entities (rating agencies) that try to remain independent, but a rating is still a judgement about the economic and financial prospects of the issuer. The value of a particular issue depends, among other things, on the probability that the issuer will be unable to meet its commitments, i.e., on the probability of default or default risk.

1 Teresa Corzo Santamaría holds a PhD in Economics from the Universidad de Navarra and currently manages international equities and risk control at Renta 4 SGIIC. This talk was given at the April 2000 session of the Instituto MEFF-RiskLab Seminar.
Option theory provides a very useful framework for modelling default risk. In this context, debtholders (B) are viewed as holding an option on the assets of the firm (V), and shareholders (S) hold a residual option (S = V − B). The position of the debtholders can also be understood as a compound option, where the option to default on the current coupon exists only if the corporation has not defaulted on the previous coupon. Within this theory, the value of the firm, V, and the volatility of the firm value can be estimated precisely by observing the market prices of the firm's shares and their volatilities. Moreover, the default option yields the probability that the firm or bank will fail to meet its payment obligations. This probability, implicit in the model, is an expected probability, in a way similar to implied volatility. The estimates of these probabilities and their transition matrix can then be used to see whether the model helps improve the prediction of credit migrations of firms and banks.

In this context, and for US firms included in the Compustat database, a study and estimation of default probabilities for the Merton (1974) and Geske (1977) models can be found in Delianedis and Geske (1998). A comparison of the default probabilities extracted from these models with actual default frequencies can be found in Delianedis, Geske and Corzo (1998)². Using the same estimation results of those studies, in the present paper we relate the default probabilities implicit in the Merton (1974) and Geske (1977) option pricing models to the rating assigned to the firm's debt by Standard & Poor's through the estimation of an ordered probit. The results show that the levels of the default probability contain information about the quality of a firm's debt.

The paper is organized as follows: in Section 2 we explain how to obtain the default probabilities from the Merton (1974) and Geske (1977) models; in Section 3 we describe the database; in Section 4 we introduce the ordered probit model and the variable selection criterion. Finally, in Section 5 we comment on the results obtained and in Section 6 we conclude. The tables can be found in the corresponding Annex.

2 We reproduce the estimation results in the tables at the end of this paper.
2. The default probabilities

In financial markets, default means that the issuer of debt does not meet the promised repayment of money. When one considers the large number of entities that issue fixed-income securities and the relatively small number of defaults actually incurred, one might think that default is infrequent and therefore amenable to being modelled as a rare event; however, every debt issuer has a positive default probability. This probability
changes continuously with changes in the prices of firms' shares and causes variations in the value of the fixed-income assets held by investors, which worries the public even though actual default occurs only on rare occasions. The value of most fixed-income securities is inversely related to the probability of default. Rating agencies measure historical default frequencies. While these historical frequencies are interesting, they do not look forward. Option models supply a risk-neutral default probability whose changes can anticipate information about the quality of an issue and about a possible change in the rating³.
2.1. The Merton (1974) model

Merton (1974) uses continuous-time diffusion processes to describe changes in the value of the firm and was the first to show that a firm's default option can be modelled following Black and Scholes (1973). A share can be viewed as a call option on the firm with strike price equal to the face value of the total debt issued by the firm. If we treat the firm's liabilities as if they simply consisted of one issue of a zero-coupon bond with face value $M$ maturing at $T$, then the boundary condition for the share at $T$ is $S_T=\max(0,\,V_T-M)$. The solution for the share price subject to this boundary condition is the well-known Black-Scholes equation:

$$S=V\,N\bigl(k+\sigma\sqrt{T-t}\bigr)-M\,e^{-r(T-t)}\,N(k), \tag{1}$$

where

$$k=\frac{\ln(V/M)+\bigl(r-\frac{1}{2}\sigma^2\bigr)(T-t)}{\sigma\sqrt{T-t}},$$

and $S$ is the current market value of equity, $V$ the current market value of the firm, $M$ the face value of debt, $r$ the riskless interest rate, $\sigma$ the instantaneous volatility of the return on the firm's assets, $t$ the current time, $T$ the maturity date of the debt, and $N(\cdot)$ the cumulative distribution function of a standard normal variable. In this framework, the risk-neutral probability, today, that the value of the firm exceeds the value of the debt at a future date $T$ ($V_T\ge M$) is $N(k)$. Hence the Risk-Neutral Default Probability (RNDP) is

$$RNDP=1-N(k). \tag{2}$$
3 While risk-neutral default probabilities are suitable for valuation and hedging, their relevance for estimating actual default frequencies, and hence for Value at Risk (VaR), remains to be studied.
This RNDP is forward-looking and can be regarded as an "expected" default frequency conditional on the current value of the firm, its leverage, volatility, debt structure, and the riskless interest rate. In general, firms have liabilities with a more complex debt structure than the one assumed by Merton's model; at the cost of some generality, and to overcome this drawback, we construct a single debt maturing at the duration of the firm's original set of liabilities.

Equation (1) contains two unknowns, $V$ and $\sigma$. To find them we need an additional equation relating the volatility of an option to the volatility of the underlying asset⁴:

$$\sigma_S=\frac{\partial S}{\partial V}\,\frac{V}{S}\,\sigma_V. \tag{3}$$

4 The use of a numerical algorithm is required.
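A sketch of this calibration, solving (1) and (3) jointly for $V$ and $\sigma_V$ with a generic root-finder (the function names and the use of fsolve are our choices, not the paper's):

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

def merton_rndp(S, sigma_S, M, r, T):
    """Solve equations (1) and (3) for (V, sigma_V); return RNDP = 1 - N(k)."""
    def equations(x):
        V, sigma_V = x
        k = (np.log(V / M) + (r - 0.5 * sigma_V**2) * T) / (sigma_V * np.sqrt(T))
        d1 = k + sigma_V * np.sqrt(T)
        eq1 = V * norm.cdf(d1) - M * np.exp(-r * T) * norm.cdf(k) - S  # eq. (1)
        eq2 = norm.cdf(d1) * (V / S) * sigma_V - sigma_S  # eq. (3); dS/dV = N(d1)
        return [eq1, eq2]

    V, sigma_V = fsolve(equations, x0=[S + M, sigma_S * S / (S + M)])
    k = (np.log(V / M) + (r - 0.5 * sigma_V**2) * T) / (sigma_V * np.sqrt(T))
    return 1.0 - norm.cdf(k)

# e.g. equity worth 4, equity vol 60%, face value of debt 10, 5-year duration
print(merton_rndp(S=4.0, sigma_S=0.6, M=10.0, r=0.05, T=5.0))
```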
2.2. The Geske (1977) model
Geske (1979) generalizes the Black-Scholes option pricing model using Merton's approach: if a share is an option on the assets of a firm, then an option on a share is an option on an option, i.e., a compound option. In its application to debt valuation, Geske extends Merton's model to include short-term coupon-paying debt and other possible payment commitments. In this framework the debtholders' position is understood as a compound option, where the option to default on the current coupon exists only if the firm has not defaulted on the previous coupon.

We divide the firm's debt into short-term and long-term debt. Suppose the short-term debt has face value $M_1$ and matures at $T_1$, and the long-term debt has face value $M_2$ and matures at $T_2$, with $T_2>T_1$. Following Geske (1977), the value of equity today is given by

$$S=V\,N_2\bigl(k_1+\sigma_V\sqrt{T_1-t},\ k_2+\sigma_V\sqrt{T_2-t};\ \rho\bigr)-M_2\,e^{-r(T_2-t)}\,N_2(k_1,k_2;\rho)-M_1\,e^{-r(T_1-t)}\,N(k_1), \tag{4}$$

where

$$\rho=\sqrt{\frac{T_1-t}{T_2-t}}$$

and $N_2(\cdot)$ is a bivariate normal distribution function with the following integration limits:

$$k_1=\frac{\ln(V/\bar V)+\bigl(r-\frac{1}{2}\sigma_V^2\bigr)(T_1-t)}{\sigma_V\sqrt{T_1-t}},\qquad
k_2=\frac{\ln(V/M_2)+\bigl(r-\frac{1}{2}\sigma_V^2\bigr)(T_2-t)}{\sigma_V\sqrt{T_2-t}}.$$
As in Merton's application, equation (4) contains two unknowns, $V$ and $\sigma$, and we need equation (3) to relate them. But now we also need to compute $\bar V_{T_1}$. This is a critical value of the firm at $T_1$, equal to the face value of the short-term debt ($M_1$) plus the market value of the long-term debt at $T_1$ ($B_{2,T_1}$); that is, $\bar V_{T_1}=M_1+B_{2,T_1}$. When $V_{T_1}>M_1+B_{2,T_1}$ the firm is solvent and can refinance the debt, so the value of equity at $T_1$, after paying $M_1$, is positive. Under this debt-refinancing hypothesis the process followed by the firm value remains continuous; firms generally roll over their short-term debt, and if they did not, the value of the firm would drop drastically and the default probability of the remaining debt would rise. Furthermore, this condition adds realism to the model because the firm, when it declares itself insolvent, defaults on all its debts at once and not only on the short-term ones.

To compute $\bar V_{T_1}$ we can use the fact that $B_{2,T_1}$ is the value of the firm at $T_1$ minus the market value of equity at $T_1$ ($B_{2,T_1}=V_{T_1}-S_{T_1}$), and from Merton's model we know that the value of equity can be computed as a call option on the firm's assets, $S_{T_1}=\max(V_{T_1}-M_2,\,0)$, so that

$$\bar V_{T_1}=M_1+B_{2,T_1}=M_1+\bar V_{T_1}-S_{T_1}
=M_1+\bar V_{T_1}-\bar V_{T_1}\,N\bigl(k_2+\sigma_V\sqrt{T_2-T_1}\bigr)+M_2\,e^{-r(T_2-T_1)}\,N(k_2).$$

We can now compute three default probabilities and obtain a term structure of probabilities. The joint probability of default, at either $T_1$ or $T_2$, is $P_T=1-N_2(k_1,k_2;\rho)$. The probability of defaulting on the short-term debt at $T_1$ is $1-N(k_1)$; it is obtained in the same way as in Merton's model (see equation (2)), although it does not represent the same concept. In Merton's case it was the single probability collecting all the information relevant to the default event; here it is one probability that acquires its informational content within a group of three. The third probability is a forward probability of default on the long-term debt at $T_2$, conditional on not defaulting on the short-term debt at $T_1$:

$$P_f=1-\frac{N_2(k_1,k_2;\rho)}{N(k_1)}.$$

If the short-term default probability is lower than the forward default probability, the term structure of probabilities is upward sloping, i.e., the default probability for a given horizon grows with time. The term structure inverts if the short-term default probability exceeds the forward probability. This situation arises when the firm has a high short-term default probability, but if it manages to meet its payments over the next year and survives, its default probability will then fall. For example, a firm with a joint default probability of 0.6 and a short-term default probability of 0.5 has a forward probability of 0.2.
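A sketch of the three probabilities given the integration limits, using scipy's bivariate normal cdf (names are ours):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def geske_probabilities(k1, k2, rho):
    """Joint, short-term and forward risk-neutral default probabilities."""
    cov = [[1.0, rho], [rho, 1.0]]
    n2 = multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([k1, k2])
    p_joint = 1.0 - n2                       # default at T1 or T2
    p_short = 1.0 - norm.cdf(k1)             # default at T1
    p_forward = 1.0 - n2 / norm.cdf(k1)      # default at T2 given no default at T1
    return p_joint, p_short, p_forward

# consistent with the example above: if N2 = 0.4 and N(k1) = 0.5, then
# p_joint = 0.6, p_short = 0.5 and p_forward = 1 - 0.4/0.5 = 0.2
```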
3. The data

The data come from three sources: Compustat, the Center for Research in Security Prices (CRSP) and Data Resources Incorporated (DRI). Compustat provides quarterly information on the composition of firms' liabilities and on the Standard & Poor's (S&P) rating. CRSP supplies daily share prices and DRI supplies the term structure of interest rates monthly.

The S&P rating system consists of 27 categories: the first is used for unrated firms, and the other 26 range from the highest credit quality, "AAA", to the lowest, "D". We assign a number to each rating class, from 2 (AAA) to 27 (D). To identify a rating change we look at the numerical difference in the rating of two consecutive quarters⁵. The sample of this paper includes all firms with a valid (non-bankrupt) rating assigned. For the purposes of this study we have summarized the sample into six categories: AAA, AA, A, BBB, BB and B (B also collects all firms rated below B). Table 1 summarizes the average number of firms in each category.

Firms' indebtedness data also come from Compustat. With them we compute the Macaulay duration of the firm's liabilities, and that horizon is the one used in Merton's formula. For each quarterly Compustat observation we take from CRSP the share price and the number of shares outstanding of the month before that quarter and of two months afterwards. Equity volatility is computed with data from the 60 trading days prior to each quarter. With this information we can obtain, monthly, the value of the firm, its volatility and the changes in its indebtedness. Using in addition the monthly share price, its volatility, the face value of debt and its maturity, and the corresponding interest rate, we can solve equations (1) and (3) and obtain for each firm the time series of default probabilities. Tables 4–7 collect the descriptive statistics of these variables for each category.

To use Geske's model, we take as short-term debt the current liabilities and the debt due within one year, and as long-term debt all liabilities minus the short-term ones. With them we compute the duration of the short-term liabilities, $T_1$, and that of the long-term liabilities, $T_2$. With these data, the share price and its volatility, and the corresponding interest rate, we can proceed to compute the three probabilities. The results are shown in Table 8. As can be seen, the short-term (Geske) default probabilities are very small, ranging between 0% and 3%. The forward and the joint probabilities are very similar. The best probabilities are achieved after the 1987 stock market crash, and during and after the 1990–1991 recession. Comparing the short-term and the forward probabilities shows that,

5 Since the Compustat tapes show each quarter the S&P rating received by a representative debt issue of the firm, we do not know exactly when a rating change occurs within a quarter.
for this sample, the term structure of risk-neutral probabilities is, on average, upward sloping both for firms in the investment-grade category and for those in lower categories. The slope steepens in 1987, in both cases, and softens around 1989.
4. The ordered probit

The ordered probit uses discrete dependent variables that can take three or more different values. In ordered qualitative response models⁶ there is a single index function on which all the choices depend. This makes them especially useful when the responses (dependent variables) have an ordering, as in our study. The latent model is

$$y_i^*=X_i\beta+e_i,\qquad e_i\sim \text{IID}(0,1),\qquad i=1,\ldots,29106,$$

where $i$ indexes the firms in our sample, $y_i^*$ is the underlying dependent variable, $X_i$ are the $k$ explanatory variables, and $e_i$ is the residual, distributed as an IID variable. What we actually observe is a discrete variable $y_i$ that can take only six values:

$$y_i=\begin{cases}
1 & \text{if } y_i^*<\gamma_1;\\
2 & \text{if } \gamma_1\le y_i^*<\gamma_2;\\
3 & \text{if } \gamma_2\le y_i^*<\gamma_3;\\
4 & \text{if } \gamma_3\le y_i^*<\gamma_4;\\
5 & \text{if } \gamma_4\le y_i^*<\gamma_5;\\
6 & \text{if } \gamma_5\le y_i^*.
\end{cases}$$

The parameters of this model are $\beta$ and $\gamma\equiv[\gamma_1\cdots\gamma_5]$. The $\gamma_i$ are the thresholds that determine which value of $y_i$ is assigned to each $y_i^*$. The number of components of $\gamma$ is five, one fewer than the number of categories. The probit model assumes that the probability distribution of $y_i^*$ is normal (0,1). To estimate the model we maximize the log-likelihood function over the different parameters; the result is a system that must be solved iteratively. We use the Newton-Raphson algorithm, which converges to the global maximum of the likelihood function⁷.

The independent variables ($X_i$) for the probit estimation are: the level of the default probabilities, the duration, the equity volatility, the instantaneous volatility of the return on the firm's assets ($\sigma$), the leverage ratio (at market, not book, values), the total value of debt, and the value of the firm. The estimation is carried out for all the firms in the sample.

6 Amemiya (1981) is an excellent reference for qualitative response models.
7 See Pratt (1981).
To compute the most likely ratings according to our observations, we use the maximum-likelihood estimates of the probit parameters and, given the $X_i$, obtain the most likely categories. In addition to the ordered probit estimation, we show, as a goodness-of-fit measure, a table with the prediction success. A criterion for the adequacy of the model can be⁸

$$C=\frac{1}{N}\sum_{i=1}^{m}N_{ii},$$

where $N_{ii}$ is the number of correctly predicted ratings.

8 In Maddala (1983).
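A sketch of the estimation; for brevity we use a generic quasi-Newton optimizer from scipy rather than a hand-coded Newton-Raphson, and the threshold parametrization is our own device to keep the $\gamma_i$ ordered:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_ordered_probit(X, y, n_cat=6):
    """Maximum likelihood for the ordered probit.

    X : (n, k) explanatory variables;  y : integer ratings coded 1..n_cat
    """
    n, k = X.shape

    def unpack(params):
        beta = params[:k]
        # gamma_1 plus positive increments guarantees gamma_1 < ... < gamma_5
        gamma = np.cumsum(np.concatenate(([params[k]], np.exp(params[k + 1:]))))
        return beta, gamma

    def neg_loglik(params):
        beta, gamma = unpack(params)
        cuts = np.concatenate(([-np.inf], gamma, [np.inf]))
        xb = X @ beta
        # P(y_i) = Phi(gamma_{y_i} - x_i beta) - Phi(gamma_{y_i - 1} - x_i beta)
        p = norm.cdf(cuts[y] - xb) - norm.cdf(cuts[y - 1] - xb)
        return -np.sum(np.log(np.clip(p, 1e-300, None)))

    x0 = np.concatenate((np.zeros(k), [-2.0], np.zeros(n_cat - 2)))
    res = minimize(neg_loglik, x0, method="BFGS")
    return unpack(res.x)
```

The most likely category for each observation is then the one with the largest predicted probability, and C is the fraction of observations whose predicted category matches the S&P rating.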
5. Results

In estimating the ordered probit taking as independent variables ($X_i$) those described in Section 3, and as discrete variable the rating level $y_i$ (of which we only see certain realizations), we pursue two goals. First, to check whether the default probabilities obtained from the Merton and Geske models bear a relationship to debt quality, as measured by S&P, and to gauge the strength of that relationship; in other words, whether firms with lower default probabilities receive better ratings than firms with higher ones. Second, we want to find which variable contains the most information about the quality of firms' debt (and therefore serves best to discriminate among them), and how the probabilities behave relative to the other variables.

We therefore estimate the probit for each of the variables separately (Merton default probability, Geske default probabilities, total debt (TL), firm value (Fv), firm volatility ($\sigma_v$), equity volatility ($\sigma_s$), duration (D), leverage (market)) and for combinations of them. The appendix shows the probits with the best results⁹ according to the criterion (C) described in the previous section. The default probabilities achieve the best results (Table 9), followed by the equity volatility ($\sigma_s$). With the default probabilities of the Merton model and with the forward probability of the Geske model we assign 40% of the securities correctly, i.e., to the same S&P credit level, and 36% with $\sigma_s$. These two variables show a high correlation with each other and with the rating level (see Table 3), which, in our view, is evidence in favour of the widespread industry practice of using equity volatility as a proxy for the risk level of a firm's securities. The short-term probability of the Geske model contains little information about debt quality, but it is needed to compute the forward probability.

9 The remaining estimations, with worse results and not reproduced here for reasons of space, are available to interested readers.
As a benchmark for the goodness of the above percentages, we can compare them with the success we would achieve following a naïve strategy. For example, distributing the securities equally across the categories yields 16.66% correct assignment, or, better still, placing all securities in the same rating class, A (the one containing the most securities in our sample), would give a 28% success rate. Using the default probabilities (and also the volatility) improves these numbers considerably, which indicates that they contain information about debt quality.

Among the single-variable estimations, the worst performer was duration. This is probably influenced by the earlier simplifications: by the quality of the information provided by Compustat, which lumps all debt beyond ten years into a single item, so that debt at 11 years cannot be distinguished from debt at 30 years, and by our having computed only one or two maturities for the firm's total debt in order to apply the models.

Table 3 collects the correlations between the variables, which guided us when forming combinations. We first chose variables with low correlation to see whether they improved the success rate obtained by the default probabilities, but surprisingly the best combination (see Table 9: the combination of the default probabilities, firm volatility and leverage gives a prediction success of 0.4169) does not manage to do much better than the default probabilities on their own.
6. Conclusions

In this paper we relate the default probabilities implicit in the Merton (1974) and Geske (1977) option pricing models to the rating assigned to a firm's debt by Standard & Poor's through the estimation of an ordered probit. The results show that the levels of the default probability contain relevant information about the quality of a firm's debt.

Using the variables default probabilities, total debt (TL), firm value (Fv), firm volatility ($\sigma_v$), equity volatility ($\sigma_s$), duration (D) and leverage ratio (market), first individually and then in groups, to approximate the debt quality of a sample of US firms, we find that the default probabilities derived from the Merton model and the forward probability of the Geske model are the best approximation to the rating granted to the debt.

The results of this study support the use of option theory as a framework for estimating default risk. They are also evidence in favour of modelling the default probability as a diffusion process.
Annex: Tables

Table 1: Monthly average number of firms in the sample in each rating. B includes all issues rated B or lower.

Year   AAA   AA    A     BBB   BB    B↓    Total
1987    17   74    170   114    95   112    582
1988    18   70    178   119    90   111    586
1989    18   64    171   122    94   101    570
1990    17   65    164   130   101    78    555
1991    16   68    171   140    99    68    562
1992    18   72    178   153   115    79    615
1993    16   67    191   164   139    94    671
1994    16   64    196   177   170   106    729
1995    14   62    207   205   180   123    791
1996    13   64    232   241   218   164    932

Source: Delianedis & Geske (1998)

Table 2: Descriptive statistics of the variables. Number of observations: 29,106
                    Mean      Median    Std. dev.   Min         Max
Merton              0.1063    0.0154    0.1938      0           1
Total debt (TL)     2.86e+9   8.8e+8    8e+9        1,860,000   2.45e+11
Firm value (Fv)     5.38e+9   1.8e+9    1.27e+10    1,391,930   3.4e+11
Firm vol. (σv)      0.218     0.176     0.1823      0.00002     7.2
Equity vol. (σs)    0.344     0.291     0.2118      0.059       7.2
Duration (D)        5.596     5.43      1.2115      1.85        9.17
Leverage (market)   0.3982    0.3831    0.1925      0.00002     0.99994
Source: own elaboration

Table 3: Correlation between the variables

          Merton   TL      Fv      σv      σs      D       Leverage
Rating    0.53    -0.26   -0.38    0.31    0.51    0.10    0.34
Merton    1       -0.11   -0.17    0.68    0.89    0.09    0.28
TL                 1       0.86   -0.16   -0.14   -0.02    0.14
Fv                         1      -0.11   -0.17   -0.09   -0.09
σv                                 1       0.9    -0.11   -0.35
σs                                         1      -0.10    0.05
D                                                  1       0.13
Leverage                                                   1

Source: own elaboration. Correlations above 0.5 are shown in bold in the original.
Table 4: RNDP by rating

        Obs.    Mean     Std. dev.   Median
AAA      757    0.0159   0.06        0.0001
AA      3032    0.017    0.067       0.0003
A       8336    0.024    0.07        0.0021
BBB     6802    0.05     0.1         0.0119
BB      5550    0.1565   0.189       0.0846
B↓      4629    0.35     0.283       0.2762

Source: own elaboration

Table 5: Firm volatility (σv) and equity volatility (σs) by rating
        Obs.    σv mean   σv std.   σs mean   σs std.
AAA      757    0.0169    0.11      0.238     0.102
AA      3032    0.0173    0.09      0.239     0.105
A       8336    0.177     0.103     0.262     0.113
BBB     6802    0.178     0.105     0.291     0.122
BB      5550    0.239     0.156     0.41      0.174
B↓      4629    0.362     0.31      0.58      0.305

Source: own elaboration

Table 6: Duration and leverage (market) by rating
        Obs.    Duration mean   Duration std.   Leverage mean   Leverage std.
AAA      757    5.078           0.85            0.2935          0.253
AA      3032    5.43            1.16            0.289           0.166
A       8336    5.51            1.09            0.344           0.159
BBB     6802    5.7             1.04            0.415           0.159
BB      5550    5.7             1.13            0.458           0.196
B↓      4629    5.72            1.23            0.487           0.221

Source: own elaboration
Table 7: Firm value (Fv) and total debt (TL) by rating

        Obs.    Fv mean    Fv std.    TL mean    TL std.
AAA      757    3.86e+10   4.68e+10   1.89e+10   3.63e+10
AA      3032    1.26e+10   1.78e+10   5.29e+9    8.55e+9
A       8336    5.98e+9    8.95e+9    3.0e+9     5.32e+9
BBB     6802    3.92e+9    5.18e+9    2.6e+9     4.87e+9
BB      5550    1.73e+9    2.8e+9     1.34e+9    2.65e+9
B↓      4629    6.23e+8    1.48e+9    5.75e+8    1.54e+9

Source: own elaboration

Table 8: Mean default probabilities for the Merton and Geske models, grouped into two categories: investment grade (IG) and below investment grade (NIG).
        Merton           P(T)             P short          P(f)
Year    IG      NIG      IG      NIG      IG     NIG       IG      NIG
1987    0.105   0.296    0.072   0.205    0      0.006     0.071   0.2
1988    0.022   0.212    0.009   0.093    0      0.002     0.009   0.087
1989    0.004   0.147    0.001   0.054    0      0         0.001   0.051
1990    0.02    0.262    0.005   0.113    0      0.002     0.004   0.101
1991    0.018   0.286    0.005   0.13     0      0.005     0.005   0.115
1992    0.012   0.242    0.003   0.102    0      0.006     0.003   0.092
1993    0.008   0.218    0.002   0.106    0      0.006     0.002   0.099
1994    0.009   0.177    0.002   0.082    0      0.003     0.002   0.08
1995    0.006   0.204    0.002   0.092    0      0         0.002   0.088
1996    0.008   0.214    0.002   0.111    0      0.001     0.002   0.107
Fuente: Elaboraci´ on Propia ´ Tabla 9: Exito de predicci´ on del rating seg´ un las distintas variables Variables Probabilidad de impago Merton Probabilidad de impago a corto (Geske) Probabilidad de impago a plazo (Geske) Desviaci´ on t´ıpica de las acciones Pcorto+Plargo (Geske) Probabilidad de impago Endeudamiento (mercado) Desviaci´ on t´ıpica de la empresa Total deuda Duraci´ on Merton, T(1), y T(2): mismo rdo: Fuente: Elaboraci´ on Propia
C 0’3935 0’27 0’3957 0,3607 0’40 0’4169 0’3187 0’275 0’245 0’2337
References

[1] Amemiya, T. (1981): "Qualitative Response Models: A Survey", Journal of Economic Literature 19, 1483–1536.
[2] Black, F. and M. Scholes (1973): "The Pricing of Options and Corporate Liabilities", Journal of Political Economy 81, 399–418.
[3] Crouhy, M., D. Galai and R. Mark (2000): "A Comparative Analysis of Current Credit Risk Models", Journal of Banking and Finance 24, 59–117.
[4] Delianedis, G. and R. Geske (1998): "Credit Risk and Risk Neutral Default Probabilities: Information About Rating Migrations and Defaults", UCLA, Anderson School, Finance Working Paper.
[5] Delianedis, G., R. Geske and T. Corzo (1998): "Credit Risk Analysis with Option Models: Estimation and Comparison of Actual and Risk Neutral Default Probabilities", UCLA, Anderson School, Finance Working Paper.
[6] Geske, R. (1977): "The Valuation of Corporate Liabilities as Compound Options", Journal of Financial and Quantitative Analysis 12, 541–552.
[7] Geske, R. (1979): "The Valuation of Compound Options", Journal of Financial Economics 7, 63–81.
[8] Maddala, G. S. (1983): Limited-Dependent and Qualitative Variables in Econometrics, Cambridge University Press, 46–51.
[9] Merton, R. C. (1973): "Theory of Rational Option Pricing", Bell Journal of Economics and Management Science 4, 141–183.
[10] Merton, R. C. (1974): "On the Pricing of Corporate Debt: The Risk Structure of Interest Rates", Journal of Finance 29, 449–470.
[11] Pratt, J. W. (1981): "Concavity of the Log-likelihood", Journal of the American Statistical Association 76, 137–159.
Teresa Corzo Santamaría
C/ María de Molina 3, 5º Dcha.
28006 Madrid

[email protected]
Non-gaussian multivariate simulations in mark-to-future calculations

Olivier Croissant, Gustavo Comezaña, Marcos Escobar, Pablo Fernández, Nicolás Hernández, Luis Ángel Seco¹
Abstract: Scenario generation techniques for gaussian markets are very well understood, and are based on Monte Carlo methodologies for multivariate normal variables determined by marginal means and covariance matrices. This paper presents an approach for the calibration and generation of non-gaussian scenarios; it is first developed in the one-factor setting and then extended to multifactor situations. Both are based on extreme value theory.
1. Introduction

The problem of calibrating financial time series is at the heart of financial risk management. RiskMetrics, when it assumes linear portfolios of gaussian risk factors, needs estimates of covariance matrices. When RiskMetrics assumes nonlinear portfolios with GARCH underlyings, one needs to derive the parameters of the underlying marginal distributions, and even then one assumes that standard correlations are a suitable measure of dependence among the risk factors. In the Mark-to-Future framework, one is free from these assumptions but at the same time needs the ability to generate forward-looking scenarios that are compatible with historical observations. The basic premise of this paper is that it is possible to split the calibration problem into two parts:

• Calibration of one-dimensional marginal distributions.
• Calculation of a dependence structure among the risk factors.

1 Olivier Croissant is a member of the Research and Development Department of Algorithmics Inc. Gustavo Comezaña is a member of RiskLab Toronto and Director of Research and Development of the financial management firm Sigma Analysis and Management. Pablo Fernández is a professor in the Department of Mathematics of the Universidad Autónoma de Madrid and a member of RiskLab Madrid. Nicolás Hernández and Marcos Escobar are members of RiskLab Toronto and doctoral students in the Department of Mathematics of the University of Toronto. Luis Ángel Seco is a professor in that same department, Director of RiskLab Toronto and President of Sigma Analysis and Management. This talk was given by the second author (Gustavo Comezaña) at the May 2000 session of the Instituto MEFF-RiskLab Seminar.
The dependence structure is discussed in Section 2 below. Traditionally, the dependence structure is based on the correlation matrix. While this is totally appropriate for gaussian distributions, it does not adequately reflect the dependence structure of non-gaussian variables. In this paper we shall use instead the normal rank correlation as a measure of dependence, which enjoys a number of advantages over the standard approach.

The fitting of one-dimensional distributions is seemingly simpler, although careful inspection of the problem leads to a number of possible fitting methodologies. These can basically be classified as parametric versus non-parametric. Parametric methods may give rise to vastly different distributions depending on the model adopted. Non-parametric methodologies tend to overfit the data, leading to unsatisfactory scenario simulations.

The purpose of this paper is to examine the validity of non-parametric multivariate predictions for risk management, when the dependence structure is given by the normal rank correlation. We will test the methodology in the case of natural gas forward prices. The ideas in this paper lead naturally to considering other one-dimensional fitting methodologies, as well as more general dependence structures, and also generalizations to time-dependent (non-iid) distributions. We plan to tackle these in subsequent papers.

The paper is organized as follows. First (Section 2), we discuss dependence structures and how the calibration and generation of scenarios is reduced to one-dimensional problems through the use of the normal rank correlation. Next (Section 3), we give an overview of different methodologies for the one-dimensional fit problem. Section 4 puts the ideas together as a framework for the multivariate case. Section 5 shows some limitations of this framework. Given its importance in the financial environment, we provide some ways to compute VaR in Section 6. Section 7 contains a description of the applications to the above-mentioned business cases. We conclude with an Appendix that gathers the more mathematical aspects of the theory.
2. The dependence structure

In this section we wish to consider a general framework of dependence structures, which we will refer to as generalized correlations. It is a well known fact that two gaussian variables are independent if and only if they are uncorrelated. In other words, a single number encapsulates their entire dependence structure. The same is not true for general random variables. In fact, the concept of correlation can be very deceptive outside the gaussian domain. We try to bridge this gap by introducing a more general correlation concept, as described below. Let us agree on some notation and fundamental concepts first. Here, and throughout this paper, we define
$$\phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp(-t^2/2)\, dt,$$
the cdf of the univariate normal.
Lemma 1 (Fisher Transformation). If a random variable X has cumulative distribution function (cdf) P, then P(X) is uniformly distributed in [0,1]. Similarly, $\phi^{-1}(P(X))$ is a gaussian random variable with mean 0 and variance 1.

Next we discuss an extended notion of correlation. Consider two random variables X and Y, with cdfs given by P and Q respectively, and two increasing functions, F and G. We define:
$$\rho_{F,G}(X, Y) = E\{F^{-1}(P(X))\, G^{-1}(Q(Y))\}.$$
The idea of this generalization is to calculate the correlation of a convenient transformation of X and Y, namely their remapping into random variables with cumulative distributions given by F and G respectively. This generalizes more familiar correlation concepts as follows:

• Correlation, given by $\mathrm{Corr}(X, Y) = \rho_{P_X, Q_Y}(X, Y)$.

• Spearman ratio, given by $E(P_X(X) P_Y(Y)) = \rho_{1,1}(X, Y)$, where 1 denotes the cdf of the uniform distribution.
• Normal rank correlation, given by $K(X, Y) = \rho_{\phi,\phi}(X, Y)$.

Note that the normal rank correlation for gaussian variables coincides with the usual correlation. The normal rank correlation is especially meaningful because of the following result:

Lemma 2. Consider a random vector (X,Y) that maximizes the entropy among all random vectors with X and Y as marginals. Then X and Y are independent if and only if K(X, Y) = 0.

The advantages of this generalized concept of correlation include the fact that one can easily fit distributions in n factors to market data, represented simply by observed marginals with cdfs given by $P_i$ and computed normal rank correlations V, as follows: let X be a gaussian vector with standard normal marginals and variance/covariance matrix V. The random vector Y with components $Y_i = P_i^{-1}(\phi(X_i))$ has normal rank correlation V and marginal cdfs $P_i$, and hence solves the problem.
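The construction in the last paragraph is only a few lines of code. The following is a minimal sketch, not the authors' implementation; the Student-t marginals and all names are our own illustrative choices:

```python
# Simulate (Y1, Y2) with prescribed marginal cdfs and normal rank correlation k.
import numpy as np
from scipy import stats

def simulate_pair(k, n, P1_inv, P2_inv, seed=0):
    """(Y1, Y2) = (P1^{-1}(phi(X1)), P2^{-1}(phi(X2))), where (X1, X2) is
    standard bivariate gaussian with correlation k."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal([0.0, 0.0], [[1.0, k], [k, 1.0]], size=n)
    U = stats.norm.cdf(X)  # phi(X_i) is uniform on [0, 1] (Lemma 1)
    return P1_inv(U[:, 0]), P2_inv(U[:, 1])

# Fat-tailed Student-t marginals with 3 degrees of freedom (a placeholder):
t3 = stats.t(df=3)
Y1, Y2 = simulate_pair(0.7, 100_000, t3.ppf, t3.ppf)

# The sample normal rank correlation recovers k up to simulation noise:
k_hat = np.corrcoef(stats.norm.ppf(t3.cdf(Y1)),
                    stats.norm.ppf(t3.cdf(Y2)))[0, 1]
```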
3. One-factor models

The previous section reduced the problem of calibration and generation of financial scenarios to a collection of decoupled, one-factor models. The difficulties in dealing with one-factor distributions are twofold: reconstructing the "bulk" of the distribution, i.e., the part of the distribution which is away from extreme values, and reconstructing the tail of the distribution. In this section we present some general ideas dealing with the bulk of the distribution. Extreme value theory, which as its name indicates deals with the second problem above, is better known, and is presented at the end of this paper in the Appendix.

In what follows X will denote a real-valued random variable. We will specify the probability law associated to X by its cumulative distribution function (cdf), F. We will denote the conventional moments of X by
$$\mu_r = E[(X - \mu)^r], \qquad r = 1, 2, 3, \ldots$$

Location. Intuitively a good measure of location should reflect the "center of mass" of the distribution. For symmetric distributions it is very clear how to define such a measure: the center of symmetry is the only logical choice. For more general distributions different measures have been proposed in the statistical literature, the best-known among them being the mean of the random variable. The mean, usually denoted by µ, is defined to be the first moment of X.

Spread. A good measure of spread should reflect the degree of dispersion of the values taken by X. Traditionally this has been measured by the standard deviation, $\sqrt{\mu_2}$.

Skewness. A good measure of skewness should account for the degree of asymmetry of a distribution. A widely used measure of skewness is based on the third moment, and is given by
$$s = \frac{\mu_3}{\mu_2^{3/2}}.$$

Kurtosis. A good measure of kurtosis should reflect certain features of the shape of a distribution, such as the presence of fat tails. Traditionally this has been measured by the ratio between the fourth moment and the square of the variance,
$$k = \frac{\mu_4}{\mu_2^2}.$$
The sample analogues of these quantities are obtained by substituting sample estimators for the theoretical moments. Traditionally these quantities have been used to reflect, in one way or another, typical deviations from normality such as asymmetries or fatter tails. This line of thought has led some authors (see for example Hull and White (1998)) to extend the gaussian model to more general families of distributions by adding a number of additional parameters that can account for the empirical skewness and kurtosis observed in the sample.
This approach has some serious drawbacks. Estimators for the parameters of these distributions based on sample moments are as unlikely to be reliable as the sample moments on which they are based. This fact becomes more evident for long-tailed and/or very skewed distributions, for which the empirical moments exhibit a high variability as well as an extreme sensitivity to outliers. In practice, an unrealistically large sample size would be required in order to obtain accurate estimators for both empirical moments and the parameters of the model in question. In general, for data exhibiting a higher frequency of extreme observations than that expected for a normal model, more robust descriptive measures are needed.

Recently, an alternative and appealing system of similar descriptors has been proposed by Hosking (1990). In analogy with the conventional moments, they are called L-moments. L-moments are linear statistics of the quantile function $Q(u) = F^{-1}(u)$ of the random variable X. They are defined as
$$\lambda_r = \int_0^1 Q(u)\, P_{r-1}(u)\, du,$$
where the $P_r(u)$ are orthogonal polynomials on the interval (0,1) of the form
$$P_r(u) = \sum_{k=0}^{r} p_{r,k}\, u^k, \qquad p_{r,k} = \frac{(-1)^{r-k}\,(r+k)!}{(k!)^2\,(r-k)!}.$$
Like their traditional counterparts, L-moments can be interpreted as intuitive and simple descriptors of the shape of a general distribution, as location, scale, asymmetry and kurtosis can be described in terms of the first four L-moments. However, they offer a number of advantages over conventional moments. First, they can be defined for a wider class of distributions, such as distributions that decay like power laws, for which moments of higher order do not exist. Second, they completely characterize the probability law of the random variable, unlike conventional moments. Finally, and perhaps most important of all, they can be accurately estimated by their empirical analogs even for distributions with fat tails. It has been shown in Monte Carlo studies (see for example Hosking (1990) and Sankarasubramanian and Srinivasan (1999)) that typically only a moderate sample size is required to obtain reasonable accuracy. This advantage in efficiency and many other desirable properties offered by L-moments have been discussed in detail by Hosking (1986; 1990). Based on the assumption that the sample L-moments are reasonably robust and efficient estimators for the true L-moments of the unknown distribution of the population, we can expect a model that matches the empirical L-moments to be a more accurate and robust approximation to the unknown distribution than other models based on a fit to higher order sample moments. In an upcoming paper we discuss this problem in more detail and provide a solution to it. The focus of this paper is to deal with the reconstruction of the marginal one-factor distributions by several methods. We review two of them: one based on the Hill estimator, another based on an explicit formula that invokes the L-moments to determine the model parameters.
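For concreteness, the first four sample L-moments can be obtained from the order statistics through the probability weighted moments $b_r$. The following is a minimal sketch of the standard unbiased estimators (Hosking 1990); it is not the code used for the experiments in this paper:

```python
import numpy as np

def sample_l_moments(x):
    """First four sample L-moments via probability weighted moments b_r."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    j = np.arange(1, n + 1)                       # ranks of the order statistics
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((j - 1) * (j - 2) * (j - 3) /
                ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0                                       # location
    l2 = 2 * b1 - b0                              # scale
    l3 = 6 * b2 - 6 * b1 + b0                     # asymmetry (t3 = l3 / l2)
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0         # kurtosis  (t4 = l4 / l2)
    return l1, l2, l3, l4
```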
4. Multivariate model

The process of building up a multivariate simulation can be summarized in the following steps:

• Extraction of marginal distributions from historical samples. This is achieved by reconstructing the tail of the distribution from extreme value theory (see the Appendix), and reconstructing the bulk of the cumulative distribution function (cdf) by interpolation of the historical cdf.
• Computation of the normal rank correlation matrix.
• Creation of a gaussian simulation with "the correct" dependency structure, namely with a correlation equal to the sample normal rank correlation.
• Application of the inverse of the Fisher transform to the marginals. This yields a distribution with the same marginals as the one reconstructed from the sample (step 1 above), and the same normal rank correlation as the sample.

The pioneers of these ideas were Hull and White, who applied them to mixtures of gaussian distributions. Our step forward is the extension to the non-parametric case.
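A minimal sketch of these four steps follows, under simplifying assumptions of our own (purely empirical marginals with linear interpolation, and no EVT tail reconstruction); all names are illustrative:

```python
import numpy as np
from scipy import stats

def simulate_mtf(data, n_scen, seed=0):
    """data: (T, d) array of historical risk-factor observations.
    Returns (n_scen, d) scenarios with interpolated empirical marginals
    and the sample normal rank correlation. EVT tail reconstruction
    (see the Appendix) is omitted here for brevity."""
    T, d = data.shape
    # Step 1: empirical marginals; ranks give P_i(x) on a uniform grid.
    U_hist = stats.rankdata(data, axis=0) / (T + 1)
    # Step 2: normal rank correlation = correlation of phi^{-1}(P_i(X_i)).
    Z_hist = stats.norm.ppf(U_hist)
    V = np.corrcoef(Z_hist, rowvar=False)
    # Step 3: gaussian simulation with the sample normal rank correlation.
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(np.zeros(d), V, size=n_scen)
    # Step 4: inverse Fisher transform onto the empirical marginals
    # (linear interpolation of the historical quantile function).
    U = stats.norm.cdf(X)
    grid = np.arange(1, T + 1) / (T + 1)
    return np.column_stack([
        np.interp(U[:, i], grid, np.sort(data[:, i])) for i in range(d)
    ])
```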
5. Limitations of the model

The proposed model has a number of limitations, usually linked to the fact that the historical data available may not be stationary, or may show trends, mean reversion, etc. In this section we document some of them, and propose a methodology to be studied in upcoming papers. Our first exhibit of such limitations is a series of electricity price data.

[Figure 1: 5-, 120- and 300-day electricity forward prices, 21/07/98 to 12/10/99 (x-axis: date; y-axis: price).]
The data presents a number of clear features, all of them rooted in the strong temporal structure of the price data, making the methodology presented in this paper inadequate:

• There is a clear correlation between a forward curve for a fixed term and lagged versions of other terms.
• There is a clear seasonality in the series, as electricity prices tend to peak in the summer.

Note, however, that the extreme tails observed in the data are not an obstacle, since the methodology presented here addresses precisely those issues. Compare, for instance, the histogram of the electricity forward price data and the one for forward gas prices: the gas price histogram clearly gives rise to a marginal structure that is harder to fit.

[Figure 2: 5-day electricity forward price histogram (x-axis: price bin; y-axis: number of days).]

[Figure 3: 5-day gas forward price histogram (x-axis: price bin; y-axis: number of days).]
The methodology to be used in situations like this one is a generalization of the one presented here. Roughly speaking, we will apply filters to each of the marginals to obtain a series of residuals that are free from time-dependent effects (seasonality, autocorrelations, etc.) and are, in short, independent identically distributed observations. The resulting series of residuals will then be dealt with using the methods presented in this paper.
6. Traditional approaches to VaR

RiskMetrics-type VaR. The basic assumption is that the returns are normally distributed; the parameters (mean and volatility) are then estimated using the available historical information, and the distribution obtained in this way is used to compute quantiles. Problem: financial data seem to have heavier tails than the normal distribution. If that is so, the quantiles obtained with the normal approximation will be lower than they should be.

Historical VaR. The basic assumption now is that the distribution of returns is unknown; one proceeds to compute quantiles using the empirical distribution obtained from the historical data of the portfolio. Problem: since we are trying to understand extreme losses, the empirical distribution will not give good information for high quantiles, since there are, in general, only a few data points in the "extreme" range (or none at all!).

EVT provides a different approach to the problem (see the Appendix). We may assume that the distribution F of the X's is unknown, but that it satisfies the conditions for convergence of the maxima (i.e., the tail behaves like $x^{-1/\xi}$ for large x). We can then use the results we have reviewed to give a measure of the value at risk, in several ways:

• Get estimates of the tail of F and use them to compute quantiles that are taken as a measure of value at risk.
• MaxVaR: instead of using the X's, use a new random variable Y that corresponds to the maxima of the X's in consecutive blocks of n days. Estimate the parameters of the limit distribution of the maxima, use that limit distribution to obtain the quantiles of the maxima, and take those as a measure of value at risk.
• In either of the two former cases, instead of using quantiles as a measure of value at risk, use the quantities given by E[X|X > Q] and E[Y|Y > Q] respectively, where Q is a given quantile of the distribution.
In all three cases there are procedural difficulties, such as deciding how large n has to be in order that $M_n$ has an approximate distribution given by $H_{\xi;\mu,\psi}$, or what the initial threshold u should be to start using values in excess of the threshold to approximate the tail of F.
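For reference, the two traditional calculations above reduce to a few lines each. A minimal sketch, assuming a vector of daily returns and the convention (ours) that VaR is reported as a positive loss:

```python
import numpy as np
from scipy import stats

def normal_var(returns, level=0.99):
    """RiskMetrics-type VaR: fit a normal distribution to historical
    returns and read off the loss quantile."""
    mu, sigma = returns.mean(), returns.std(ddof=1)
    return -(mu + sigma * stats.norm.ppf(1 - level))

def historical_var(returns, level=0.99):
    """Historical VaR: the empirical quantile of the return distribution."""
    return -np.quantile(returns, 1 - level)
```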
7. Application to crude oil and natural gas portfolios

In this section we apply the multifactor methodology to the calculation of Value-at-Risk numbers for portfolios of forward price contracts for crude oil and natural gas. In this situation, a gaussian fit produces very poor results: if one computes the frequency of outliers beyond the theoretical gaussian VaR numbers, one gets the following table:

TABLE 1
          % outliers
5% VaR    8%
1% VaR    4%

The conclusion is obvious: gaussian methodologies are slightly off at 95% levels, and yield unacceptable results for 99% VaR calculations. The methodology described in this paper obtains results slightly better than gaussian at 95% levels, and dramatically better at 99% levels, where it still maintains acceptable accuracy in its VaR predictions.

The methodology used to test our predictions is an out-of-sample VaR comparison test, which uses the price series and a collection of futures portfolios. Roughly speaking, using subsets of the available history, we obtain out-of-sample predictions for the Value-at-Risk numbers for each of the portfolios under consideration, with confidence levels of 95% and 99%. Using the previously ignored history, we then check whether the actual losses incurred by the portfolio exceed the VaR calculations or not. Finally, after doing this for a large collection of sub-histories, we check whether the loss outliers have the correct frequency: 5% and 1% respectively.

In greater detail: consider futures portfolios $\Pi_k$, constant through time. Let x denote each scenario for a futures curve, which is a vector of dimension equal to the number of available contracts at any point in time; in our case, this equals 12. Our data set consists of a series of 800 daily observations of the forward curve, denoted $x_i$, $i = 1, \ldots, 800$.
[Figure 4: Crude forward prices, from 5 days to 12 months, 9/30/96 to 10/13/99 (x-axis: date; y-axis: price).]
The test selects a number m, for instance 400, which accounts for roughly one year's worth of data. Starting at the oldest point in time for which data is available, we select an initial window of m values given by $x_1, \ldots, x_m$, and we calibrate the multivariate distribution to this first window. Next, for the calibrated distribution, we generate a number of scenarios (1000, for example) which we then use to stress-test each portfolio. From these scenarios, we compute the non-parametric 95% or 99% Value-at-Risk, $V_{k,m}$, of portfolio $\Pi_k$. Using the observation $x_{m+1}$, which was available in our dataset but so far ignored in our mark-to-future VaR calculation, we check whether $\Pi_k(x_{m+1}) > V_{k,m}$ or not. This has the effect of testing the observed outliers in the P&L distribution against the VaR calculations according to our model, for this particular window. The window is then rolled one unit to the right, until our available data is exhausted, and the corresponding check is performed each time (see the sketch after Table 2). Finally, we compare how frequently the portfolio values exceed the projected value-at-risk number with the theoretical frequency, that is, 5% (or 1%).

The test was applied to 12 different futures portfolios, and the results are summarized in the table below. The numbers quoted include, for both the 95% and 99% VaR, the smallest outlier frequency, the largest frequency and the average frequency across the 12 different portfolios under consideration.

TABLE 2
          Smallest   Average   Largest
5% VaR    3.00%      6.05%     8.02%
1% VaR    0.25%      0.67%     1.50%
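The rolling test can be summarized in a few lines. The sketch below is ours, not the production code used for the table above; it assumes a generic var_fn that encapsulates steps 1 to 4 of Section 4 (calibration, scenario generation, revaluation, quantile):

```python
import numpy as np

def backtest_var(prices, portfolio, var_fn, m=400, level=0.95):
    """Rolling out-of-sample VaR test as described above.
    prices: (T, d) forward-curve observations; portfolio: weight vector;
    var_fn(window, portfolio, level) -> VaR estimate, a positive loss.
    Returns the observed outlier frequency, to be compared with 1 - level."""
    T = len(prices)
    outliers = 0
    for t in range(m, T - 1):
        window = prices[t - m:t]                  # calibration window
        var = var_fn(window, portfolio, level)    # e.g. simulate scenarios,
                                                  # revalue, take the quantile
        pnl = portfolio @ (prices[t + 1] - prices[t])
        outliers += (-pnl) > var                  # did the loss exceed VaR?
    return outliers / (T - 1 - m)
```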
Appendix: Extreme Value Theory

Extreme Value Theory (EVT) can be thought of as the study of the tails of distributions. It has been widely used in engineering for studying extremal events (such as earthquakes, floods, etc.). More recently, it has become popular in the financial context. Our setting is as follows: consider a random variable X (as a model of, say, the daily returns of a portfolio), with cumulative distribution function (cdf) F. One could postulate a certain distribution function for the random variable (for example, a lognormal one) and fit the parameters by means of historical data. If interested in extremal events, one may compute the quantiles and from there obtain risk measurements, such as VaR. However, it is well known that returns do not, in general, follow a lognormal distribution, but exhibit fat tails: normality or lognormality leads to underestimation of tail events, and losses that exceed VaR bounds occur more often than predicted.

What EVT proposes is to study the maximum of a random sample of n values of X, that is,
$$M_n = \max(X_1, \ldots, X_n),$$
where the $X_j$ are independent and identically distributed (i.i.d.) with common distribution function F. If the $X_j$ represent the daily returns of a portfolio, then $P(M_n > x)$ represents the chance of having at least one loss that exceeds x in a period of n days. The advantage of considering the maximum value is that, essentially regardless of F (after suitable centering and scaling of $M_n$, and under mild conditions on the tail),
$$\lim_{n\to\infty} P(M_n \le x) = H_{\xi;\mu,\psi}(x),$$
where
$$H_{\xi;\mu,\psi}(x) = \exp\left\{-\left(1 + \xi\,\frac{x-\mu}{\psi}\right)_+^{-1/\xi}\right\}.$$
Here ξ, µ and ψ > 0 are the shape, location and scale parameters, respectively, and $a_+ = \max(a, 0)$. The case ξ = 0 has to be understood in the limiting sense, that is,
$$H_{0;\mu,\psi}(x) = \exp\left\{-\exp\left(-\frac{x-\mu}{\psi}\right)\right\}.$$
The normal and lognormal distributions correspond to the case ξ = 0, but for many financial series ξ seems to be positive. This parameter is in some sense related to the size of the tail of the distribution: $1 - F(x) \approx x^{-1/\xi}$ for large x. This paper considers two methods for the estimation of the parameters ξ, µ, ψ (the biggest difficulty lying in the parameter ξ): maximum likelihood, and L-moments.

This ξ also appears when one considers the excess distribution function
$$F_u(x) = P(X - u \le x \mid X > u) = \frac{F(x+u) - F(u)}{1 - F(u)},$$
and the mean excess function
$$e(u) = E[X - u \mid X > u].$$
If the behaviour of the maximum is like $H_{\xi;\mu,\psi}$, then $F_u(x)$ looks like the generalized Pareto distribution $G_{\xi,\beta(u)}(x)$, where $G_{\xi,\beta}(x) \equiv G_\xi(x/\beta)$ and
$$G_\xi(x) = 1 - (1 + \xi x)^{-1/\xi}, \qquad x \ge 0.$$
This is behind some of the methods that are used to estimate the tail of F(x). In fact, since
$$F(u + x) = F(u) + F_u(x)\,(1 - F(u)),$$
one can use the empirical distribution of the sample to approximate F(u), and a parametric method, for example maximum likelihood, to estimate $F_u(x)$ through a GPD. We therefore need:

• A choice of the initial threshold u.
• Estimators $\hat\xi$ and $\hat\beta$ of ξ and β(u).

One way to choose u is based upon the following observation: for a generalized Pareto distribution $G_{\xi,\beta}$ with 0 < ξ < 1, the mean excess function is given by
$$e(u) = \frac{\beta + \xi u}{1 - \xi}.$$
In particular, it is linear in u. We can now take the empirical mean excess function
$$e_n(u) = \frac{1}{N_u} \sum_{j=1}^{n} (X_j - u)_+$$
(where $N_u$ is the number of excesses over u) and choose u in such a way that $e_n(x)$ is approximately linear for x ≥ u. The estimators $\hat\xi$, $\hat\beta$ can be obtained, for example, by using the maximum likelihood method to fit a $G_{\xi,\beta}$ to the data points in excess of u.
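A minimal sketch of the threshold choice and the maximum likelihood GPD fit, using SciPy's genpareto as an assumed stand-in for the estimation routines discussed above (all names are illustrative):

```python
import numpy as np
from scipy import stats

def mean_excess(x, thresholds):
    """Empirical mean excess function e_n(u); choose u where it becomes
    approximately linear (thresholds must lie below the sample maximum)."""
    x = np.asarray(x, dtype=float)
    return np.array([(x[x > u] - u).mean() for u in thresholds])

def fit_tail(x, u):
    """Fit a GPD to the excesses over u by maximum likelihood
    (location fixed at 0); returns (xi_hat, beta_hat)."""
    excesses = x[x > u] - u
    xi, _, beta = stats.genpareto.fit(excesses, floc=0)
    return xi, beta

def tail_quantile(x, u, p):
    """Quantile estimate from F(u + y) = F(u) + (1 - F(u)) G(y),
    assuming p lies above the empirical F(u)."""
    x = np.asarray(x, dtype=float)
    xi, beta = fit_tail(x, u)
    Fu = (x <= u).mean()                      # empirical F(u)
    y = stats.genpareto.ppf((p - Fu) / (1 - Fu), xi, loc=0, scale=beta)
    return u + y
```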
References

[1] Boothe, P. and Glassman, D. (1987): "The Statistical Distribution of Exchange Rates: Economic Evidence and Economic Implications". Journal of International Economics 22, 297–319.
[2] Borwein, J. W. and Lewis, A. S. (1991): "Convergence of best entropy estimates". SIAM Journal of Optimization 1 (2), 191–205.
[3] Corrado, C. and Su, T. (1997): "Implied Volatility Skews and Stock Index Skewness and Kurtosis implied by S&P 500 Index Option Prices". Journal of Derivatives 4, 8–19.
[4] Crow, E. L. and Siddiqui, M. M. (1967): "Robust estimation of location". Journal of the American Statistical Association 62, 353–389.
[5] Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997): Modelling Extremal Events for Insurance and Finance. Springer-Verlag.
[6] Hosking, J. R. M. (1986): "The theory of probability weighted moments". Research Report RC12210, IBM Research Division, Yorktown Heights, N.Y.
[7] Hosking, J. R. M. (1990): "L-moments: Analysis and estimation of distributions using linear combinations of order statistics". Journal of the Royal Statistical Society, Series B, 52, 105–124.
[8] Hull, J. and White, A. (1998): "Value at Risk When Daily Changes in Market Variables are not Normally Distributed". Journal of Derivatives 5, no. 3, 9–19.
[9] Kotz, S. and Nadarajah, S. (2000): Extreme Value Distributions. Imperial College Press.
Olivier Croissant Algorithmics 185 Spadina Ave Toronto ON M5T2C6, Canada
[email protected] Gustavo Comeza˜ na Sigmanalysis Suite 340 - The Fields Institute 222 College St., Toronto, ON M5T 3J1, Canada
[email protected] Marcos Escobar Department of Mathematics University of Toronto, Room 205 1 Spadina Crescent, Toronto, ON M5S 3G3, Canada
[email protected] Pablo Fern´ andez Departamento de Matem´aticas Universidad Aut´ onoma de Madrid Ciudad Universitaria de Cantoblanco, s/n. 28049-Madrid, Spain
[email protected]
Nicol´ as Hern´ andez Department of Mathematics University of Toronto, Room 205 1 Spadina Crescent, Toronto, ON M5S 3G3, Canada
[email protected] ´ Luis Angel Seco RiskLab University of Toronto, Room 205 1 Spadina Crescent, Toronto, ON M5S 3G3, Canada
[email protected]
Mathematics in Finance ("Las Matemáticas en las Finanzas")

Dídac Artés¹

1 Dídac Artés works as a consultant on wholesale banking matters and is a board member of MEFF, Altura Markets and BBVA Midas (Portugal). This talk was given at the June 2000 session of the Instituto MEFF-RiskLab Seminar.
Abstract: This article attempts to summarize the topics I touched upon in a talk I gave in 2000 within a cycle named Seminar of Financial Mathematics, organized by the Universidad Autónoma de Madrid and MEFF AIAF SENAF Holding de Mercados Financieros through Instituto MEFF. More than two years have elapsed between my talk and this article, and I feel I have to warn the reader that this article is not a literal transcription of the talk, but rather an account of some of the most important scientific and technological advances and the way they have come to influence financial technique. The written word, as opposed to the spoken, dictates some limitations in the treatment of the subject. When I gave my talk I spoke about a fairly wide range of developments in the different fields of human knowledge and how they have interacted with one another. When speaking to an audience one can do this exercise in a reasonable amount of time, even though a more precise account of those facts is sacrificed in the process. Now I necessarily have to limit the scope of those ideas, or risk ending up writing a whole book about them. The compromise I have found between scope and depth in the treatment of the subject is to browse over concepts and relationships that I very much feel are present among the different disciplines, and to leave the reader to go deeper into those themes that attract his or her interest using the references contained in the article. I believe, nonetheless, that the line of thought of this article remains faithful to the spirit that led me to accept to give the talk in the first place: to speak of the propagation of knowledge from certain sciences, particularly from mathematics and physics, to finance; of its limits; and to hint at what the future could bring us.
1. A small artifice

The title of this talk, "Mathematics in Finance", may perhaps have seemed somewhat contrived to the people who attended it. In this article the title again wants to play the role of a decoy, an excuse to talk about a whole series of developments that, above all during the last hundred years, have been taking place. These in turn have done nothing but provoke further developments and
discoveries at a pace that, in my opinion, is the fastest in the history of humankind. By this statement I do not mean that the years preceding that period were not of vital importance, above all as the necessary foundation for later advances; I am suggesting that it was during that brief period of one hundred years when, as a result of those developments, a way of understanding the world radically different from the one that had served us until then saw the light, giving rise to new conceptions of what surrounds us that have finally reached the financial world. We could say that this tendency to incorporate into the world of finance the knowledge acquired about the physical world has only just begun. In this talk I will pay particular attention to the contribution of physics to this phenomenon. Knowledge of the physical world has been linked not only to the understanding of the phenomena that surround us [Sag96], but also to the development of the mathematical instruments needed to act as vehicles with which to express the meaning of what is observed [Ber99]. It is this facet of Physics, and its relationship with Mathematics, that I will emphasize most. In these last (roughly one hundred) years other sciences have also contributed to the advance of economic science, in particular Biology and Psychology. But the influence of Physics and Mathematics has remained preponderant, above all in what concerns financial products and their markets. Mathematics, in all this, has played the role of a guide, of a tool that made possible, first, the formulation of our knowledge of what surrounds us as it became accessible and, later, its generalization. Mathematics has helped us order our knowledge of the large and the small; of what is near and what is far; of what we can see and what is invisible to us. The society we live in is a society "of exchange", and this exchange is based fundamentally (at least in the Western world) on economic considerations. For this reason, throughout history, finance (which, after all, is just the name we give to the rules that govern the form and the value assigned to those exchanges) has not lacked attention from mathematical developments. Nevertheless, in the case at hand (that of this brief period of roughly the last hundred years), almost half a century had to pass before that knowledge was transferred from Physics to Finance. As a result, the impact of one discipline on the other did not materialize until well into the second half of the 20th century. As the guiding thread of the exposition I have chosen what I consider to be the pillars of financial activity: trust and time. To the value judgments about the one and the other, the quantitative technique that has been incorporated into the financial world over these years has added the concept of valuing both: trust and time. With the passage of the years, the tendency has been to gradually replace value judgments by ever more precise and refined calculations [Hul89].
In turn, the practical application of these calculation techniques has been made possible by developments in computing and communication technologies, both of which are but another embodiment of the recently gained knowledge in the field of Physics.
Finally, I have tried to point out some trends in the progress of knowledge and the direction in which they may come to profoundly influence some of the most relevant aspects of the financial world. The waves that quantum physics began to raise almost a hundred years ago, in the quiet pond of human knowledge based on the classical laws, have finally reached our shore.
2. The evolution of our knowledge of the environment

Man² gains knowledge about his environment through his senses. That environment is, before anything else, a physical environment. At the beginning of the 20th century the science of Physics reached one of its highest peaks in terms of how deeply its gaze penetrated the knowledge of the nature of the very small, and, through it, of the environment that surrounds us. The realm of the very small had proved elusive until then (it was barely 100 years ago, in 1897, that J. J. Thomson discovered the electron). At the beginning of the 20th century a science was born, later to be known as Quantum Mechanics, which opened the doors to a new and deeper understanding of nature. Hand in hand with it would soon come technologies destined to permeate almost every aspect of our daily life; so much so that for some readers of this article these technologies will have been present in their lives since childhood: the transistor radio, the compact disc player, computers...

It is during the first quarter of the 20th century that we find such a concentration of discoveries and new formulations in the field of Physics that it is difficult to find its equal in the whole history of humanity. To cite only some of the most outstanding contributors to the advance of Physics during that brief period, we may recall Ludwig Boltzmann, who invented Statistical Mechanics during the last decade of the 19th century and realized the importance of J. C. Maxwell's Electromagnetic Theory (Electricity and Magnetism, 1873); Max Planck, who discovered the (black-body) Radiation Formula in 1900; Hendrik Antoon Lorentz, who developed the Mathematical Theory of the Electron (1902); and Albert Einstein, who developed the Special Theory of Relativity and Statistical Mechanics (1905) and the General Theory of Relativity (1912), with the help of the mathematicians Marcel Grossmann, Tullio Levi-Civita and Gregorio Ricci-Curbastro. In 1922 Niels Bohr obtained the Nobel Prize for his work on the atom and the radiation it emits; in 1925 he formulated the Principle of Complementarity, which argues for an understanding of quantum phenomena through their various interpretations. Paul Dirac formulated his Principles of Quantum Mechanics; in 1925 Erwin Schrödinger stated his Wave Mechanics, while in 1924 Louis de Broglie had formulated his theory of wave-particle duality, and Wolfgang Pauli the Exclusion Principle, which would bear his name from then on.

2 In this text, man is used in the same sense as in the Biblical phrase "God made man male and female".
Such is the avalanche of discoveries, and of new formulations giving them theoretical explanation, that the above list omits many relevant protagonists of that era. This omission does not imply a higher valuation of the discoveries listed above; they are merely a sample meant to illustrate the concentration of discoveries carried out in those brief years. If we had to summarize in two great themes the knowledge these discoveries and new theories contribute, they would be that (i) nothing can be in absolute rest, and (ii) at the heart of nature lies an irreducibly random process. In those years, both assertions stood in frontal contradiction with what had until then been the paradigm of knowledge of the physical world, resting as it did on Classical (Newtonian) Mechanics and on the (now also called classical) theories of Electromagnetism. Without violating the classical laws, a particle could be in absolute rest, and the motion of bodies could be expressed in a continuous way through laws believed to be immutable. The contradiction between the two viewpoints would shake the foundations of human knowledge forever. Most of the technologies we enjoy today were born from the new understanding these scientists gained of the nature of matter (particles), of light, and of how the two are related. It is in one of these fields, Statistical Mechanics, that we find the germ of the development of most of the mathematical apparatus that supports our financial markets today.
3. The pillars of financial activity

In the field of finance there are, in my opinion, two pillars that support financial activity: trust (in the system) and the value of time. Almost all financial activity, especially when viewed as a business, rests on these pillars and, more concretely, on being able to put a value on trust and on time. Assaying the value of trust is the ultimate determinant of the dynamics of all those activities that we know today under the generic label of "risk" activities. Here we find counterparty risk, credit risk, settlement risk, operational risk, country risk, political risk, and so on. They all share one characteristic: the valuation (implicit in all these activities) of the trust our counterparty deserves. This trust is expressed in terms of how certain we believe our counterparty to be able to fulfil the obligations it has acquired with us. The counterparty may be the market (a clearing house) or an institution acting as an agent in it. Putting a value on time (or on our perception of it) determines in turn another part of the business: that of the product and its variety. An intimate knowledge of the value we assign to time, and of how this value makes the different products behave in the market, allows us to conceive and design new products and, with them, to have new business opportunities.
In both respects, valuing trust and valuing time, Physics has had an important influence by contributing statistical methods for their treatment. Little could Robert Brown (who in 1827 observed what is known today as "Brownian motion") have imagined the repercussions his discovery would have in the financial world, nor the path this knowledge would follow to get there. As our knowledge of the world changes and deepens, so too (though at a much slower pace) does the knowledge of, and the treatment given to, trust and time in the world of finance.
4. Expectations and conservation of energy

At the beginning of the last century there were attempts to apply statistical techniques to a financial asset; perhaps the first was Louis Bachelier's, in his 1900 doctoral thesis "Théorie de la Spéculation". Bachelier used a generalization of "Brownian motion" (applied here to the movement in time of asset prices, instead of to grains of pollen, as in Brown's case) to carry out his calculations using statistical techniques. Later, and until the end of the 1960s, statistics applied to Physics and to Finance followed separate paths. The former progressed considerably with developments founded on Itô calculus, but it was not until the 1970s that stochastic methods applied to finance would see a revival at the hands of Black, Merton, Scholes and Samuelson. In 1973, Fischer Black and Myron S. Scholes published their now famous formula for valuing options [Black and Scholes, 1973]. In the wake of this work, formulas for computing the value of a contingent asset (option) would proliferate over time. All these works relied, among other things, on the hypothesis that the price of assets did not depend on the expectations of economic agents (these being already incorporated in the observable prices of the assets) and on another principle: the principle of no-arbitrage. The latter, broadly speaking, held that when an economic result could be reached along different paths, none of them could yield gains relative to the others without incurring additional risks. In other words, if the net result of an economic transaction was, for example, to borrow at a fixed interest rate for a given period of time, it could not happen that the same result could be reached by another route³ more cheaply. These two hypotheses, together with others assuming a simplified form for the behaviour of asset prices (which would be relaxed in the course of time), introduced a very fertile framework for tackling financial and pricing problems.

3 An alternative route could be to borrow a security, sell it in the market for spot value, and buy it back for a future value date coinciding with the period established for returning it.
An experienced observer cannot fail to notice the similarity between the no-arbitrage hypothesis and the Law of Conservation of Energy⁴. In the realm of classical mechanics the conservation of three magnitudes is postulated: work, momentum and angular momentum. If we focus on work, the Law states that "regardless of the path followed by a body moving within a field (gravitational, of electric potentials, ...) from point A to point B, the amount of work is invariable, that is, independent of the chosen path". It depends, then, on where we started and where we arrived. If, in this setting, we abstract the notion of amount of work and assimilate it to the economic result, we find that the no-arbitrage hypothesis says something extremely similar to the Law of Conservation of Energy: the economic result of an operation (in a market satisfying the stated hypotheses) must not depend on the path followed, but only on the points of departure and arrival. What had been stated in the field of Physics as the Law of Conservation of Energy did not find a parallel statement in the realm of Finance until practically the 1970s. From that moment on, however, the progress of quantitative methods applied to finance accelerated considerably, since this assumption made tractable certain problems that would otherwise be hard to solve. It is worth reflecting on the fact that both hypotheses (that the expectations of economic agents are already incorporated in the observable prices of assets, and that there should be no riskless arbitrage) made possible the appearance of a whole series of products less complex than options, but which had existed only as value judgments and not as the expression of a precise calculation. As an example we can consider FRAs (Forward Rate Agreements), whose use, despite the simplicity with which we judge the difficulty of computing them today, did not become widespread until the 1980s. Interest Rate Swaps did not begin to trade in volume until the end of that decade, when they came to be viewed analytically as chains of FRAs [Hul89].
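To make the famous formula mentioned above concrete for the quantitatively minded reader, here is a standard textbook rendering of the Black-Scholes value of a European call, in a few lines of code (the parameters shown are purely illustrative):

```python
from math import log, sqrt, exp
from scipy.stats import norm

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes (1973) price of a European call: spot S, strike K,
    time to maturity T (in years), risk-free rate r, volatility sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)

# Example: an at-the-money one-year call with 20% volatility and 5% rates.
price = black_scholes_call(S=100, K=100, T=1.0, r=0.05, sigma=0.20)
```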
5. When the absolute ceases to be so

For many years, until the beginning of the 20th century, the conception held of the nature of the world, embodied in the laws believed to govern it, was absolute; that is, those laws depended on nothing else. Sixteen centuries had been needed to pass from a stance that denied the motion of the Earth (Ptolemy), prizing as superior that which remains motionless amid the chaos, to one that accepted it together with the motion of the heavenly bodies in the firmament (Copernicus). Part

4 Perhaps the works of Julius Robert Mayer (1814-1878) can be considered the first statement of the Law of Conservation of Energy, even though he did not state it completely (Mayer was a surgeon), nor in a language that would strike us today as precise.
of the limitations imposed on the development of knowledge came from religious considerations prevalent in those years. The Catholic Church had made Ptolemaic astronomy its own and strongly resisted change, however great the evidence in favor of a change in the understanding of the created universe. This opposition nearly cost Galileo his life in 1615, and was partly responsible for the fact that it was not until Newton (1685) that the Copernican conception of the world was accepted by at least a large part of Europe (others would take almost another century to accept it). It was not until the 19th century that it was observed that, although the formulations (made by Newtonian mechanics) of the patterns in the motion of the heavenly bodies and other objects served perfectly well for practical purposes, they were not absolutely precise. This imprecision was even more noticeable when bodies were extremely small or moved very fast. At the beginning of the last century, with the introduction of the Theory of Relativity, it was confirmed that those patterns were not absolute. Planck's formula, which gave a theoretical answer to the evidence recorded in the laboratory on black-body emissions, was perhaps the first to lay the theoretical foundations of a fundamental limitation in the behaviour of matter and energy. Planck's discovery gave theoretical formulation to laboratory findings suggesting that a particle confined to a relatively small region (relative to its size) could not move at just any velocity within it; only certain velocities were allowed [Mil97]. To use a familiar analogy, it was as if on a stretch of highway vehicles could travel at 30, 70 and 100 km/h, but at no other speed. The observations also introduced limitations as to the places the particle could occupy, or where it could be found. The concept of probability amplitude was then introduced, making probabilistic calculation possible in the quantum domain. This same path, from the absolute to the relative, would also be followed, although many years later, by the evolution of financial knowledge.
6. On trust and time

Simplifying greatly, one could say that financial technique is concerned with the nature of time: the time that elapses, for example, from the moment a loan is granted until its interest is paid and until, finally, its principal is repaid. It is concerned, then, with the value today of monetary flows (or flows of goods and services) that occur in the future, and vice versa: with how much an asset owned today will be worth in the future⁵. In a way, the financial realm makes time travel possible, and it uses interest rates as its vehicle:

5 We may note that it is in the financial realm where time travel has met with the greatest success; it suffices to look at the boom that markets for financial futures, and even commodity futures, have enjoyed.
expected returns can be brought to the present, and present returns can be postponed to a future date. The calculations that until recently assisted us in computing that present value were, as in the case of classical mechanics, absolute. Nothing mediated between the calculated forecast actually happening (normally helped by a sound judgment made about the other aspect of every exchange transaction: the solvency of the counterparty) and its failing to happen because the operation was frustrated by a default. Success was absolute, and so was failure.

All this began to change, even if we in finance had not noticed it, when it began to be observed that certain particles were not exactly where the classical equations placed them. Those small, unexplained errors led researchers to dig deeper into the nature of things and eventually (once a whole series of other necessary elements became available⁶) to discover that those classical equations were in fact only a subset (a particular case, in fact) of other, broader laws, which were indeed capable of explaining the small differences observed between the forecast the former made of the world's behaviour and that behaviour itself. The same transition that had taken place in our knowledge of nature, from an absolute conception of the laws governing its phenomena to a relative one, spread, albeit with a certain delay, to the world of finance in the conception and valuation of trust and time.

In years past, the analysis made of a credit operation was a dual affair: either it was judged that the client would comply with what had been agreed, or it was judged that he would not. Not until recently (the late sixties and early seventies) was a new term introduced into that equation, one that has changed everything: the probability that the client complies. The introduction of the concept of probability into the financial realm has been responsible for one of the greatest changes observed in recent years. It has had a substantial impact on the way credit is granted, services are sold and the business is managed. One need only look at the incidence of "scoring" methods in retail banking, both in consumer credit and in mortgage lending, or at how the mass marketing of credit cards or small consumer loans (for example, for the purchase of household appliances) is carried out today, and compare it (if the activity existed at all before) with how those transactions used to be conducted. The statistical analysis performed today on data on average income, past defaults, recoveries of write-offs, etc., has also contributed to a substantial change in the whole business revolving around credit risk. Without the contribution of statistical techniques (Mathematics, after all) to this activity

6 In my opinion it is essential here to assess the progress of knowledge in the light of the availability of adequate empirical methods, of the right materials, and of computing power.
this development would not have been possible. Nor would it have been without the growing availability (in terms of computing power and of price) of the computational capacity needed to carry out all the calculations required by the statistical methods, since knowing how to calculate an answer to the problem would have been of little use if the time needed to carry out the calculation exceeded what market practice allows. It is in this last respect that Physics, specifically Quantum Physics, has made the necessary advances possible.

Much the same has happened with the valuation of time. In no field of economic activity has its impact been more visible than in the birth and generalization of contingent products. I do not mean that products of this nature did not exist before, above all in the realm of those restricted to verifying compliance with a contractual obligation (performance bonds), but the generalization of the techniques employed for their effective handling to all areas of financial activity, and the volume in which these instruments are traded, has recently exceeded all the limits we could have imagined only a few years ago. Here, too, something similar to what I pointed out earlier has occurred. It would have been of little use to find a theoretical answer to the problem of establishing the fair value of a contingent asset if the calculation of that value had not been feasible (in practical terms) for lack of the necessary computing elements, in terms of capacity and promptness. Without today's calculation systems, price dissemination systems and real-time electronic trading, the developments seen in these areas would possibly never have come about. Here again, Physics came to the rescue by making possible all those advances, which have been a necessary condition for the development of the markets. The introduction of a probabilistic approach to the value of assets in time has been responsible for the greatest development of products in the recorded history of finance. Thus options, futures contracts, and options on those contracts have appeared and developed with an increasing degree of complexity, both in their underlyings and in the conditions that must hold for the contracts to give rise to a defined payment. I do not think we will have to wait many years to see all financial transactions analyzed as contingent operations, applying to them the statistical calculation techniques that are applied today to computing the value of options.
7. On trust as communication

Another aspect of trust that is easy to overlook in an abstract discussion of the subject is the sense it has in the phrase "to take someone into one's confidence". There, trust expresses communication, and indeed a private communication, far from the eyes of third parties with no right to know [Sin00]. In today's world, market legislation recognizes the importance of the case in which an agent in the confidence of a transaction or of a material fact betrays that trust by using
the information to which he has had access for his personal and illicit enrichment (illicit, since it violates the market's transparency rules and thus attacks the very system it seeks to exploit). Contemporary legal frameworks in democratic systems also recognize the right to privacy. In financial matters, even if that right were not expressly recognized in the legislation in force, the parties would seek it for themselves all the same, in view of the harm that the disclosure of sensitive information would cause them. In a world in which communications increasingly take place through the medium of electromagnetic waves, the speed these give to communication would be of little use if we could not also count on the assurance that what we say can be protected from malicious third parties, that we can be certain of the identity of whoever claims to be our legitimate counterparty, and of the truthfulness of the data transmitted to us.

In this realm too mathematics comes to our aid [Sin00], and it does so, curiously enough, also in the 1970s. Exploiting the property of certain mathematical operations of being easy to compute in one direction and very hard in the opposite one (the factorization of large numbers, for example), Whitfield Diffie and Martin Hellman introduced what has come to be called public-key encryption [Lev01]. For the first time in history, using this system there is no need for the parties to exchange the key that encrypts the communication. The two parties that wish to communicate choose two very large prime numbers (of the order of 300 digits or more) and multiply them. They publish their product (or transmit it to each other over insecure channels) while retaining knowledge of the two numbers that gave rise to the product. When A sends B a message, A encrypts it using the product number as the key. B receives it and, possessing knowledge of at least one of its non-trivial prime factors, is able to decipher the message. Even if the key under which it was encrypted had been in plain sight of everyone, only the addressee can decipher it. By eliminating the need to exchange the key of the communication, one also eliminates one of the weakest links in the chain: one that in the past had caused many encrypted messages to be compromised [Sin00] when the key on which they were based fell into the hands of a malicious third party. A scheme similar to the one described is today the basis of secure communication over the Internet (the most widespread system being that of RSA Security). When a customer accesses his bank's portal and validates his identity, from that moment on the communications between the two are encrypted with a public key that the bank provides, the bank retaining its private key, which is never sent over the Net. Needless to say, the rapid acceptance of encryption systems like the one described above has much to do with the computing power available, since without it their use would not be feasible (because of the considerable time that would have to be invested in the encryption and decryption operations).
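The scheme just described is, in essence, the RSA system. A toy sketch with deliberately tiny, textbook primes (real keys use primes of hundreds of digits, as noted above); all names are illustrative:

```python
# Textbook RSA with tiny primes, for illustration only.
def make_keys(p, q, e=17):
    n = p * q                      # published: easy to compute...
    phi = (p - 1) * (q - 1)        # ...but phi requires the secret factors
    d = pow(e, -1, phi)            # private exponent (modular inverse)
    return (n, e), d

def encrypt(m, public):            # anyone can encrypt with the public key
    n, e = public
    return pow(m, e, n)

def decrypt(c, d, public):         # only the holder of d can decrypt
    n, _ = public
    return pow(c, d, n)

public, d = make_keys(p=61, q=53)  # n = 3233; messages must be below n
assert decrypt(encrypt(42, public), d, public) == 42
```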
8. Time as change
It has also been during the twentieth century that we have witnessed some of the historical developments that have most dented the concept of trust. The great wars, together with the causes that drove us into them, including the periods of hyperinflation that preceded them and the periods of reconstruction that followed them, have left a deep imprint on the outlook of the world's populations, above all those of Europe and Asia. Hyperinflation, the symptom par excellence of the loss of trust in a system of exchange based on money, is still today the root of the aversion, above all in Germany (and hence in Europe) and Japan, to preferring economic growth over price stability. The great wars also mark the beginning of an era in which the development of the markets coexists with, or rather finds its raison d'être in, price volatilities that had not been seen before. In this regard, the stability of the prices of the various currencies and securities at the beginning of the century and up to its first half contrasts sharply with the fluctuations we observe today, which had antecedents, to a greater or lesser extent, from the end of the fifties onwards. The market developments I have discussed have gone hand in hand with as many advances, no less spectacular, in the field of information processing [Kur00]. The available computing power has been growing exponentially since the first half of the twentieth century (some see this exponential growth starting much earlier, though they concede it did not become conspicuous until the last century [Kur90]) until reaching the levels of accessibility we know today. An average household (in the US, the EU or Japan) now has at home access to computing power comparable to what, a mere fifteen years ago, was reserved to those who could afford to buy processing time on a supercomputer, and greater than what all the Allied armies together had in the Second World War. The installed base of computing capacity available to the firms operating in the financial sector is perhaps the largest of any sector of economic activity, amounting nowadays, in aggregate, not to hundreds but to thousands of Teraflops⁷.
9. Computing power
The effect that the growing availability of this computing power (whether measured in units of information processed per unit cost, or in absolute figures of operations possible per unit of time) has had on time, in relation to certain activities such as finance, has been enormous. It has redefined the concept of what is possible within a given time scale or time frame.

⁷ A Teraflop is the computing capacity needed to carry out a million million floating-point operations in one second (10¹² floating-point operations per second).
Time, so to speak, has expanded⁸ by virtue of the new computing tools, which have made it possible to carry out in a relatively short period activities that would previously have required much longer periods to be undertaken. Another effect, this time of the ubiquity with which computing power manifests itself, has been to bring about a genuine revolution in human communications, whether through the written or the spoken word, or through images combined with them. The understanding of what constitutes information, and of the amount of information contained in a given message, laid the foundations for all these developments (Claude Shannon, 1948). For the availability of these resources to become "real", considerable advances were also needed in the way computers are used, in the amount of software available and in its cost [Loh01]. The technological developments in these fields, computation and communication, have been made possible by the evolution of our understanding of the universe around us and by the formulation, with the aid of mathematics, of that new knowledge. There has been a confluence in time between the availability of new knowledge and materials and a social environment different from that of the beginning of the century in its perception of time and of its effect on human affairs, particularly economic ones: the idea of change has gone from being something unwanted, or at most inevitable, to being considered good and desirable for a dynamic society that wants to progress. All these factors have had the effect of concentrating time, and they have laid the groundwork for the revolution we have witnessed in terms of the technological products (as embodiments of knowledge) available in every field, including finance. Computing power, and with it the feeling of control it gives over the things subjected to it, has facilitated the start of broad processes of liberalization of national market policies and, with these, their generalization or globalization. Although the number of practices considered lawful in the markets has kept expanding, so has the degree of knowledge that the authorities and the economic agents have of them. It is this awareness of knowledge that has provided the basis for undertaking thorough reforms of the functioning of the markets in practically the whole of the first world. As for the contingent claims traded in the markets, the new computing capacity brought the possibility of exploring millions of possible scenarios so as to put better prices on things. Once again technology, as the crystallization of physical laws expressed through mathematics, concentrates time, making it possible to reproduce in a few minutes, or at most hours, simulated developments of market conditions that would take months or years to unfold in the real world. The lessons of these techniques benefit, immediately after the product-development activities, the activities of controlling the risks associated with selling those products, as well as the regulators who see to it that the economic agents operate within a framework considered prudent, so as to preserve the model of the system and the survival of the economic agents themselves.

⁸ It is not without irony that time in the financial world has become relative. The cause of this relativization of time is none other than speed (here, of computation), the very cause that makes time relative in the physical world.
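The scenario exploration mentioned above is, in essence, Monte Carlo valuation. A minimal sketch, assuming (purely for illustration) a lognormal price model and a European call option with made-up parameters:

# Minimal Monte Carlo pricing sketch: European call under a lognormal model.
# All parameter values (S0, K, r, sigma, T) are illustrative assumptions.
import math, random

def mc_call_price(S0=100.0, K=105.0, r=0.04, sigma=0.2, T=1.0, n=100_000):
    total = 0.0
    for _ in range(n):
        z = random.gauss(0.0, 1.0)   # one simulated market scenario
        ST = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        total += max(ST - K, 0.0)    # payoff of the call in that scenario
    return math.exp(-r * T) * total / n   # discounted average over scenarios

print(round(mc_call_price(), 2))     # close to the Black-Scholes value (~7.6)

Averaging the discounted payoff over a hundred thousand simulated scenarios takes a fraction of a second on a household machine; this is exactly the kind of concentration of time described in the text.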
10. Knowledge open to new vistas
It is curious to consider that this concentration of time we are witnessing has brought with it, at the same time, a clearer view of the limits of knowledge. These limits have become manifest both in the physical world (the Uncertainty Principle, Werner K. Heisenberg, 1927, [Hei49]) and in that of mathematics (the incompleteness theorem, Kurt Gödel, 1931), and they are probably bringing us closer to an era in which we shall have to transcend new barriers in order to secure the continued development of knowledge and, with it, of technology as its embodiment. Also notable are the philosophical implications [Deu97] of this new model of the world, that of the existence of limits to knowledge itself. From the very theoretical developments of quantum physics two schools of thought seem to be born: one proposes the notion that the universe becomes concrete upon being observed by conscious beings; the other, that reality manifests itself in the existence of an infinity of universes differing from one another in varying degrees, from the most minute difference (the spin of an electron in a single atom) up to the largest (the configurations of entire galaxies). As has happened so many times throughout history, the very discoveries that require a new conception of the world (a theory) in order to be explained bring with them the solution for overcoming the barriers man encounters in his advance. Thus, the laws of Quantum Mechanics make it possible to exploit the properties of semiconductors, although for the moment they limit the scale at which these can be used (perhaps down to a feature size of 5 µm⁹). However, properties of matter, also anticipated by quantum mechanics, are already coming into view that may help to get around these obstacles [Mil97].

⁹ 1 micron = 10⁻⁶ m.
11. What tomorrow holds for us
It is very likely that the progress we experience from now on will be a blend of perfecting what we already know and of new inventions. In the financial sphere it will doubtless come through the ubiquity of market information, delivered in a time ever closer to real time. Together with this information, and the ease of access to it, will come the capacity, first, and the ease, later, with which it can be manipulated and made part of the assets (or liabilities) of each economic agent. A house, to give an example, will be sold together with a record of the historical behavior of its price, of the prices of similar houses and of the real-estate sector in the region. It will also carry a record of all the maintenance work it has required and will probably require, together with the time needed to carry it out. All that information will become part of the price of the good (the house, in this case), contributing to its determination. Economic agents will carry attached to them a credit history with the outcome of all their significant financial operations. Assuming that the problem of the right to privacy is resolved satisfactorily and that a given agent consents to the disclosure of those records to an interested party, this too will be incorporated into the price of the financial services he wishes to contract, making them cheaper in those cases where the record is better than average. Financial institutions, in turn, if they are able to show that the composition of their portfolios is better than average, will enjoy better credit ratings (in this case always public) and cheaper costs and use of capital. The ubiquity and the ease with which all this (and similar) information can be incorporated will depend, in turn, on continued advances in computing capacity per unit cost. In this respect, not only has the speed at which computing power improves not slackened (Moore's Law, 1970), but there are beginning to be signs that it may be accelerating. Some think (Kurzweil) that in just some twenty years a computer costing 1000 euros will match the computing capacity of the human brain [Kur00]. Should developments along these lines take place, many aspects of economic life will change profoundly. The impact of these advances will possibly be felt first in the dissemination of information, whether factual (news, data) or for entertainment (music, images); later, in how information is used for educational purposes, in the way the various professionals are trained. Perhaps later still it will become embedded in every economic transaction to such a degree that it will be difficult to separate the transaction from its information content. During the last century we began a transition from the age of machines to the age of information. The most complex artifacts we build today are not mechanical but incorporeal: they are computer programs. This revolution in the way the world is seen has had heralds in the field of modern Physics, where John Wheeler, for example, has postulated that a day will come when we understand every physical phenomenon as an information phenomenon. Indeed, an effort to reformulate some of the Laws of Physics in statistical terms is already under way, and the results being reached are extremely interesting [Mil97]. One field where these advances would undoubtedly have a great impact is computation and, with it, secure communication between parties. By virtue of what is known as the superposition of states experienced by elementary particles, quantum computers have first been theorized and then built [Deu97]. Their peculiarity is that they can determine all the solutions to a given problem at once.
While this capacity, were it developed commercially, would endanger secure-communication schemes such as those already mentioned (the public-key scheme, for example, on which the most widespread system on the Net, that of RSA Security, is based, since a quantum computer, able to establish all the correct answers in an instant, would make the time needed to compute the prime factors of a number, however large, vanish), it would also bring with it the ability to use encryption schemes that would be even harder (some think impossible) to break: quantum cryptography [Deu97, Lev01]. This form of message encryption also exploits the superposition-of-states properties of, say, photons, so that if a third party tried to eavesdrop on the communication, the mere act of observing it would change the photons' properties and the intrusion would be detected by the communicating parties, who would then have evidence of being spied upon. Another field in which important advances are likely to follow is so-called Artificial Intelligence. Since the 1950s universities and governments have been investing large sums of money in synthesizing the quality of intelligence. Almost all these efforts have failed or yielded results far poorer than expected. At the end of the 1990s, Steve Grand [Gra00] used a very different approach in a game he designed called Creatures. In this game the creatures did not have an intelligence distilled and made independent of the other aspects that have accompanied its appearance in the real world. Their intelligence was bound up with having to feed, shelter, grow and mate in order to procreate. All this took place in an environment of limited resources that forced them to compete for those resources and to ration them. The results obtained by this approach to the problem have been most surprising and promising. If recent history gives any indication of the likely course of events, it would not surprise me to see, in a not too distant future, information content become the currency of payment, that is, the measure of value of economic exchanges, with all economic transactions being valued in terms of their contingent nature. This transition to a new model of value will require profound changes in society and in economic systems. If, as has happened before in history, Physics again acts as the herald of the changes to come, in Economics we had better start paying attention to these developments.
References

[Sin00] Singh, Simon: The Code Book. Anchor Books, New York, 2000.
[Sag96] Sagan, Carl: The Demon-Haunted World. Ballantine Books, New York, 1996.
[Hei49] Heisenberg, Werner: The Physical Principles of the Quantum Theory. Dover Publications Inc., New York, 1949.
[LWME52] Lorentz, H. A., Weyl, H., Minkowski, H. and Einstein, A.: The Principle of Relativity. Dover Publications Inc., New York, 1952.
[Ber99] Berlinski, David: The Advent of the Algorithm. Harcourt, Inc., New York, 1999.
[Gra00] Grand, Steve: Creation: Life and How to Make It. Weidenfeld & Nicolson, London, 2000.
[Swa00] Swade, Doron: The Cogwheel Brain. Little, Brown & Company, London, 2000.
[Kur90] Kurzweil, Raymond: The Age of Intelligent Machines. The MIT Press, Cambridge, Massachusetts, 1990.
[Lev01] Levy, Steven: Crypto. Viking, New York, 2001.
[Mil97] Milburn, Gerard J.: Schrödinger's Machines. W. H. Freeman and Company, New York, 1997.
[Kur00] Kurzweil, Raymond J.: The Age of Spiritual Machines. Penguin USA, New York, 2000.
[Deu97] Deutsch, David: The Fabric of Reality. Penguin USA, New York, 1997.
[Loh01] Lohr, Steve: Go To. Basic Books, New York, 2001.
[Hul89] Hull, John: Options, Futures, and Other Derivative Securities. Prentice-Hall Inc., New Jersey, 1989.
Dídac Artés
Apartado de Correos, 6
28280 El Escorial, Madrid
[email protected]
A dynamical model for stock market indices

Jaume Masoliver, Miquel Montero and Josep M. Porrà¹

¹ Jaume Masoliver is Profesor Titular at the Departament de Física Fonamental of the Universitat de Barcelona. Miquel Montero is Profesor Ayudante in the same Department. Josep M. Porrà holds a doctorate in Physics and is a financial analyst; he works at Gaesco Bolsa, SVB. This talk was given by the first author (Jaume Masoliver) at the September 2000 session of the Instituto MEFF-RiskLab Seminar.
Abstract: High-frequency data in finance have led to a deeper understanding of the probability distributions of market prices. Several facts seem to be well established by empirical evidence. Specifically, probability distributions of financial indices, such as the Standard & Poor's 500 cash index, have the following properties: (i) they are not Gaussian, and their center is well fitted by Lévy distributions; (ii) they are long-tailed but have finite moments of any order; (iii) they are self-similar on many time scales; and, finally, (iv) at small time scales, price volatility follows a non-diffusive behavior. We extend Merton's ideas on speculative price formation and present a dynamical model that explains in a natural way, and with a high degree of accuracy, all of the above features found in our analysis of the historical records of the S&P 500.
1. Introduction

One of the most important problems in mathematical finance is to know the probability distribution of speculative prices. In spite of its importance for both theoretical and practical applications, the problem is yet unsolved. The first approach to the problem was given by Bachelier in 1900, when he modelled price dynamics as an ordinary random walk where prices can go up and down due to a variety of many independent random causes. Consequently, the distribution of prices was Gaussian [5]. The normal distribution is ubiquitous in all branches of the natural and social sciences, and this is basically due to the Central Limit Theorem: the sum of independent, or weakly dependent, random disturbances, all of them with finite variance, results in a Gaussian random variable. Gaussian models are thus widely used in finance although, as Kendall first noticed [11], the normal distribution does not fit financial data, especially at the wings of the distribution. Thus, for instance, the probability of events corresponding to 5 or more standard deviations is around 10⁴ times larger than the one predicted by the Gaussian distribution; in other words, the empirical distributions of
prices are highly leptokurtic. It is the existence of too many such events, the so-called outliers, that accounts for the "fat tails" and for the failure of the normal density, especially at the wings of the distribution. Needless to say, the tails of the price distributions are crucial in the analysis of financial risk. Therefore, obtaining a reliable distribution has deep consequences from a practical point of view [4, 3]. One of the first attempts to explain the appearance of long tails in financial data was made by Mandelbrot in 1963 [15] who, based on Pareto-Lévy stable laws [8], obtained a leptokurtic distribution. Nevertheless, the price to pay is high: the resulting probability density function has no finite moments, except the first one. This is indeed a severe limitation, and it is not surprising, since Mandelbrot's approach can still be considered within the framework of the Central Limit Theorem; that is, the sum of independent random disturbances of infinite variance results in the Lévy distribution, which has infinite variance [8]. On the other hand, the Lévy distribution has been tested against data in a great variety of situations, always with the same result: the tails of the distribution are far too long compared with actual data. In any case, as Mantegna and Stanley have recently shown [17], the Lévy distribution fits the center of the empirical distributions very well (much better than the Gaussian density) and it also shares the scaling behavior found in the data [17, 25, 10, 9]. Therefore, if we want to explain speculative price dynamics as a sum of weakly interdependent random disturbances, we are confronted with two different and in some ways opposed situations. If we assume finite variance, the tails are "too thin" and the resulting Gaussian distribution only accounts for a narrow neighborhood at the center of the distribution. On the other hand, the assumption of infinite variance leads to the Lévy distribution, which explains quite well a wider neighborhood at the center of the distribution but results in "too fat" tails. The necessity of having an intermediate model is thus clear, and this is the main objective of the paper. Obviously, since the works of Mandelbrot [15] and Fama [7] on Lévy distributions there have been several approaches to the problem, some of them applying cut-off procedures to the Lévy distribution [16, 12] and, more recently, the use of ARCH and GARCH models to obtain leptokurtic distributions [2]. The approaches based on cut-off procedures are approximations to the distributions trying to better fit the existing data, but they are not based on a dynamical model that can predict their precise features. On the other hand, ARCH [6] and GARCH [1] models are indeed dynamical adaptive models, but they do not provide an overall picture of the market dynamics resulting in a distinctive probability distribution. In fact, ARCH/GARCH models usually assume that the market is Gaussian with an unknown time-varying variance which is self-adjusted to obtain predictions. The paper is organized as follows. In Section 2 we recall the stochastic model presented in [22], setting the mathematical framework that leads to the probability distribution. In Section 3 we present the main results achieved by the model when applied to the S&P 500 case. Conclusions are drawn in Section 4.
2. The Model

Let S(t) be a random process representing stock prices or some market index value, the Standard & Poor's cash index in our case. The usual hypothesis is to assume that S(t) obeys a stochastic differential equation of the form

(1)    \frac{dS}{S} = \rho \, dt + dF(t),

where ρ is the instantaneous expected rate of return and F(t) is a random process with specified statistics. Sometimes F(t) is W(t), the Wiener process or Brownian motion, as shown in Fig. 1.
[Figure 1: Typical aspect of a Wiener process W(t). The increments are uncorrelated zero-mean Gaussian random variables. It may seem that the process leads to an increasing magnitude, but this is purely incidental.]
In this case the dynamics of the market is clear, since the return

    R(t) \equiv \log \frac{S(t)}{S(0)}

evolves like an overdamped Brownian particle driven by the "inflation rate" ρ and, in consequence, the return distribution is Gaussian. Let us take a closer look at price formation and dynamics. Following Merton [23], we say that the change in the stock price (or index) is basically due to the random arrival of new information. This mechanism is assumed to produce a marginal change in the price, and it is modelled by the standard geometric Brownian motion defined above. In addition to this "normal vibration" in price, there is an "abnormal vibration" basically due to the (random) arrival of important new information
that has more than a marginal effect on price. Merton models this mechanism as a jump process with two sources of randomness: the arrival times at which jumps occur, and the jump amplitudes. The effect on the overall picture is that the noise source F(t) in the price equation is now formed by the sum of two independent random components: one corresponding to the normal vibration, the Wiener process, and the other corresponding to the abnormal vibration in price, a "shot noise". This shot noise component can be explicitly written as

(2)    f(t) = \sum_{k=1}^{\infty} A_k \, \Theta(t - T_k),

where the A_k are jump amplitudes, the T_k are jump arrival times, and Θ(t) is the Heaviside unit-step function defined by

    \Theta(t) = \begin{cases} 1, & \text{if } t > 0, \\ 0, & \text{if } t < 0. \end{cases}

Fig. 2 shows an example of the behavior of such a process. Note that it is also assumed that the A_k and T_k are independent random variables with known probability distributions, given by h(x) and ψ(t) respectively [20].
[Figure 2: Typical aspect of a shot noise process f(t): jumps of amplitude A_k occur at the arrival times T_k. The increments are uncorrelated random variables, and the arrival times are independent of the remaining magnitudes.]
Therefore, the evolution of X(t), the zero-mean return, i.e., X(t) ≡ R(t) − ρt, must closely resemble what is depicted in Fig. 3.
[Figure 3: Evolution of the zero-mean return X(t) in Merton's scenario. Each "quiet" period is broken by a sudden change in X(t) at an arrival time T_k.]
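A minimal simulation may help fix ideas. The sketch below (with purely illustrative parameter values) superposes a Wiener component and a compound-Poisson shot noise, producing zero-mean return paths qualitatively like the one in Fig. 3:

# Sketch: a Merton-style zero-mean return path, i.e., a Wiener component
# plus the shot noise of Eq. (2). All parameter values are illustrative.
import math, random

def merton_path(T=1.0, n=1000, sigma_w=0.1, lam=5.0, jump_std=0.05, seed=1):
    random.seed(seed)
    dt = T / n
    x, path = 0.0, [0.0]
    for _ in range(n):
        dx = sigma_w * math.sqrt(dt) * random.gauss(0.0, 1.0)  # normal vibration
        if random.random() < lam * dt:                         # Poisson arrival
            dx += random.gauss(0.0, jump_std)                  # jump amplitude A_k
        x += dx
        path.append(x)
    return path

path = merton_path()
print(min(path), max(path))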
We now go beyond this description and specify the "inner components" of the normal vibration in price, unifying it with Merton's abnormal component. We thus assume that all changes in the stock price (or index) are modelled by different shot-noise sources corresponding to the detailed arrival of information; that is, we replace the total noise F(t) by the sum

(3)    F(t) = \sum_{n=n_0}^{m} f_n(t),

where the f_n(t) are a set of independent shot-noise processes, each like the one above. The amplitudes are independent random variables with zero mean and probability density function (pdf) h_n(x), depending only on a single "dimensional" parameter which, without loss of generality, we assume to be the standard deviation of the jumps, σ_n, i.e.,

(4)    h_n(x) = \sigma_n^{-1} \, h(x \sigma_n^{-1}).

We also assume that the occurrence of jumps is a Poisson process. In such a case the shot noises are Markovian, and the pdf of the time interval between jumps is exponential, ψ_n(t) = λ_n exp[−λ_n t]. Once again a single parameter, λ_n, characterizes each pdf: λ_n is the mean jump frequency, since 1/λ_n is the mean time between two consecutive jumps [20]. Finally, note that we order the mean frequencies in decreasing fashion, that is, λ_n < λ_{n−1}, n = 1, 2, 3, . . . .
Our main objective now is to obtain an expression for the pdf of X(t), p(x, t), and to compare it with the actual density of the real process, the Standard & Poor's 500 cash index, as shown in Fig. 4.

[Figure 4: Probability density function of the S&P 500 cash index: a sample of p(x, t) for t = 1 min, plotted against x/Σ, where Σ is the sampling standard deviation. Green circles represent empirical data from January 1988 to December 1996. Neither a Gaussian density (purple line) nor a Lévy distribution (dashed blue line) can satisfactorily account for the observed behavior.]
For computational reasons we will center our efforts on obtaining the characteristic function (cf) of X(t), p̃(ω, t), instead. This is just the Fourier transform of the pdf p(x, t) and contains the same information. Note that X(t) is a sum of independent jump processes; this allows us to generalize Rice's method for a single Markov shot noise to the present case of many shot noises [24]. The final result is

(5)    \tilde p(\omega, t) = \exp\Big\{ -t \sum_{n=n_0}^{m} \lambda_n \big[ 1 - \tilde h(\omega \sigma_n) \big] \Big\}.
As it stands, X(t) represents a shot noise process with mean jump frequency λ = Σ_n λ_n and jump distribution h(x) = Σ_n λ_n h_n(x)/λ. Nevertheless, we make a further approximation by assuming (i) n_0 = −∞, i.e., there is an infinite number of shot-noise sources, and (ii) there is no characteristic time scale limiting the maximum feasible value of the jump frequencies, so that λ_n → ∞ as n → −∞. Both assumptions are based on the fact that the "normal vibration" in price is formed by the addition of (approximately) infinitely many random causes, which we have modelled as shot noises. Accordingly, we introduce a "coarse-grained" description and replace the sum in Eq. (5) by an integral:

(6)    \tilde p(\omega, t) = \exp\Big\{ -t \int_{-\infty}^{u_m} \lambda(u) \big[ 1 - \tilde h(\omega \sigma(u)) \big] \, du \Big\}.
In order to proceed further we should specify functional forms for λ(u) and σ(u). Empirical evidence indicates that, in general, the bigger a sudden market change is, the longer the time we have to wait until we observe it. Therefore, since λ(u) decreases with u (recall that the frequencies are ordered decreasingly), σ(u) must increase with u. We thus require σ(u) to be a positive definite, regular and monotonically increasing function for all u: σ(u) = σ_0 e^u. On the other hand, the evidence of scaling properties in financial series [17, 25, 10, 9] is widely reported, and it is also found empirically in our own data, as shown in Fig. 5.

[Figure 5: Self-similar behavior of the probability density function of the S&P 500, p(x/Σ, t)Σ versus x/Σ, for several time scales (1, 3.16, 10 and 31.6 minutes).]
We summarize all the above requirements (i.e., the inverse relation between λ and σ, and scaling) by imposing the "dispersion relation"

(7)    \lambda = \lambda_0 (\sigma_0/\sigma)^{\alpha},

where α is the scaling parameter. Under these assumptions the cf of the return X(t) reads

(8)    \tilde p(\omega, t) = \exp\Big\{ -\lambda_0 t \sigma_0^{\alpha} \int_0^{\sigma_m} z^{-1-\alpha} \big[ 1 - \tilde h(\omega z) \big] \, dz \Big\},

where σ_m = σ_0 e^{u_m} is the maximum value of the standard deviation. We observe that if σ_m = ∞, which means that some shot-noise source has infinite variance, then Eq. (8) yields the Lévy distribution

(9)    \tilde L_{\alpha}(\omega, t) = \exp(-k t \omega^{\alpha}),

where

(10)    k = \lambda_0 \sigma_0^{\alpha} \int_0^{\infty} z^{-1-\alpha} \big[ 1 - \tilde h(z) \big] \, dz.
Hence, if we want a distribution with finite moments, we have to assume a finite value of σ_m. Let λ_m be the mean frequency corresponding to the maximum (finite) variance. Recall that, in the discrete case (cf. Eq. (5)), the shot-noise sources are ordered, so λ_m and σ_m correspond to the mean frequency and the variance of the last jump source considered. Our last assumption is that the total number of noise sources in Eq. (3) increases with the observation time t and, since n_0 = −∞, this implies that m = m(t) is an increasing function of time. Consequently, the mean period of the last jump source, λ_m^{-1}, also grows with t. The simplest choice is the linear relation λ_m t = a, where a > 0 is constant. Therefore, from the dispersion relation, Eq. (7), we see that the maximum jump variance depends on time as a power law:

(11)    \sigma_m^2 = (bt)^{2/\alpha},

where b ≡ \sigma_0^{\alpha} \lambda_0 / a. We finally have

(12)    \tilde p(\omega, t) = \exp\Big\{ -abt \int_0^{(bt)^{1/\alpha}} z^{-1-\alpha} \big[ 1 - \tilde h(\omega z) \big] \, dz \Big\}.
3. Results

Let us now present the main results and consequences of the above analysis. First, the volatility of the return is given by

(13)    \langle X^2(t) \rangle = \frac{a \sigma_m^2}{2-\alpha} = \frac{a}{2-\alpha} \, (bt)^{2/\alpha},

which proves that α < 2 and that the volatility shows super-diffusion.
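As a numerical aside, the sketch below evaluates the characteristic function of Eq. (12) by quadrature and recovers the variance from its small-ω expansion, p̃(ω, t) ≈ 1 − ⟨X²(t)⟩ω²/2, checking it against Eq. (13). A unit-variance Gaussian jump cf is assumed purely for convenience (a gamma law is fitted below), and the parameter values are merely of the order of those fitted to the S&P 500:

# Sketch: evaluate the cf of Eq. (12) numerically and check Eq. (13).
# Assumed here: Gaussian unit-variance jump cf h(w) = exp(-w^2/2), and
# illustrative values of (a, b, alpha).
import numpy as np

def cf(omega, t, a=2.97e-3, b=1.1e-4, alpha=1.30):
    zm = (b * t) ** (1.0 / alpha)
    s = np.linspace(0.0, 50.0, 4000)
    z = zm * np.exp(-s)                 # substitution z = zm*exp(-s) tames z -> 0
    one_minus_h = -np.expm1(-0.5 * np.outer(np.atleast_1d(omega), z) ** 2)
    integral = zm ** (-alpha) * np.trapz(np.exp(alpha * s) * one_minus_h, s, axis=1)
    return np.exp(-a * b * t * integral)

t, a, b, alpha = 1.0, 2.97e-3, 1.1e-4, 1.30
var_cf = 2.0 * (1.0 - cf(1.0, t)[0])    # from the small-omega expansion
var_th = a * (b * t) ** (2.0 / alpha) / (2.0 - alpha)   # Eq. (13)
print(var_cf, var_th)                   # the two values should agree closely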
The anomalous diffusion behavior of the empirical data (at least at small time scales) was first shown by Mantegna and Stanley, although they did not mention it explicitly [18, 19]. Second, the kurtosis is constant and given by

(14)    \gamma_2 = \frac{(2-\alpha)^2 \, \tilde h^{(iv)}(0)}{(4-\alpha)\, a}.

Thus γ₂ > 0 for all t; in other words, we have a leptokurtic distribution on all time scales. Third, the return probability distribution scales as

(15)    p(x, t) = (bt)^{-1/\alpha} \, p\big( x/(bt)^{1/\alpha} \big),

and the model is thus self-similar [17, 25, 10, 9].
[Figure 6: Second moment of the zero-mean return as a function of t (in minutes). Green circles correspond to empirical data from the S&P 500 cash index (January 1988 to December 1996). The red solid line shows the super-diffusive behavior predicted by Eq. (13), in clear contrast with a diffusive regime (blue dashed line).]
In Fig. 6 we plot the super-diffusive behavior. Circles correspond to empirical data from the S&P 500 cash index during the period January 1988 to December 1996. The solid line shows the super-diffusive character predicted by Eq. (13), setting α = 1.30 and ab^{2/α} = 2.44 × 10⁻⁸ (with time measured in minutes). The dashed line represents normal diffusion, ⟨X²(t)⟩ ∝ t. Observe that the data obey super-diffusion for t ≤ 10 min, while for t > 10 min there seems to be a "crossover" to normal diffusion.
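The scaling exponent can be estimated from a return series by the regression implicit in Fig. 6: aggregate the returns over several time scales and fit log⟨X²(t)⟩ against log t; the slope is 2/α. A sketch on synthetic white noise (for which the true slope is 1, i.e., α = 2, normal diffusion) illustrates the procedure:

# Sketch: estimate alpha from the growth of the second moment, Eq. (13).
# The input here is synthetic white noise, so the fitted slope should be
# close to 1; the S&P 500 minute data analysed in the text gave alpha = 1.30.
import numpy as np

rng = np.random.default_rng(0)
r = rng.normal(0.0, 1e-4, size=200_000)      # stand-in one-minute returns

scales = np.array([1, 2, 4, 8, 16, 32])
m2 = []
for s in scales:
    xs = r[: r.size // s * s].reshape(-1, s).sum(axis=1)  # t = s minutes
    m2.append(np.mean(xs ** 2))

slope, _ = np.polyfit(np.log(scales), np.log(m2), 1)
print("2/alpha ~", slope, "=> alpha ~", 2.0 / slope)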
We finally study the asymptotic behavior of our distribution. It can be shown from Eq. (8) that the center of the distribution, defined by |x| < (bt)^{1/α}, is again approximated by the Lévy distribution defined above. On the other hand, the tails of the distribution are solely determined by the jump pdf h(u) through the expression

(16)    p(x, t) \sim \frac{abt}{|x|^{1+\alpha}} \int_{|x|/\sigma_m}^{\infty} u^{\alpha} h(u) \, du, \qquad (|x| \gg (bt)^{1/\alpha}).

Therefore, return distributions have fat tails yet finite moments whenever the jump distribution behaves in the same way. This, in turn, allows us to make statistical hypotheses on the form of h(u) based on the empirical shape and moments of the pdf.
[Figure 7: Fit for the probability density function. The information of Fig. 4 is reproduced, together with a new plot, the red solid line, which represents our fit to the empirical data of the S&P 500 cash index, ranging from January 1988 to December 1996.]
In Fig. 7 we plot the probability density p(x, t) of the S&P 500 cash index returns X(t) observed at time t = 1 min (circles). Σ = 1.87 × 10⁻⁴ is the standard deviation of the empirical data. The dotted line corresponds to a Gaussian density with standard deviation Σ. The solid line shows the Fourier inversion of Eq. (8) with α = 1.30, σ_m = 9.07 × 10⁻⁴, and a = 2.97 × 10⁻³. We use the gamma distribution for the absolute value of the jump amplitudes,

(17)    h(u) = \frac{\mu^{\beta} |u|^{\beta-1} e^{-\mu |u|}}{2\,\Gamma(\beta)},

with

    \beta = 2.39 \quad \text{and} \quad \mu = \sqrt{\beta(\beta+1)} = 2.85.
The dashed line represents a symmetrical Lévy stable distribution of index α = 1.30 and scale factor k = 4.31 × 10⁻⁶ obtained from Eq. (10). We note that the values of σ_m and Σ predict that the Pareto-Lévy distribution fails to be an accurate description of the empirical pdf for x ≳ 5Σ (see Eq. (16)). The validity of this assertion becomes evident when we observe in detail the behavior of the tail of the distribution, Fig. 8.
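A quick way to check that the choice μ = √(β(β+1)) in Eq. (17) normalizes the jump variance to one is to sample the symmetric gamma law directly (a Monte Carlo sketch, not part of the original fit):

# Sketch: sample the symmetric gamma law of Eq. (17) and verify that
# mu = sqrt(beta*(beta+1)) makes the jump variance equal to one.
import numpy as np

beta = 2.39
mu = np.sqrt(beta * (beta + 1.0))            # ~ 2.85, as quoted in the text
rng = np.random.default_rng(0)

mag = rng.gamma(shape=beta, scale=1.0 / mu, size=1_000_000)   # |u| ~ gamma
u = rng.choice([-1.0, 1.0], size=mag.size) * mag              # symmetrize

print(u.var())    # ~ 1.0, since E[u^2] = beta*(beta+1)/mu^2 = 1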
[Figure 8: Tails of the probability density function for the S&P 500 data, comparing the model and the data with power laws of index α = 1.7 and α = 3.0. The logarithmic scale highlights the fact that a power-law approximation to the pdf, i.e., p(x, t) ∼ κ(t) x^{−1−α}, fails to cover the whole range of interest.]
We chose a gamma distribution of jumps because (i) as suggested by the empirical data analyzed, the tails of p(x, t) decay exponentially, and (ii) it does not favor jumps of very small size, i.e., jumps with almost zero amplitude, as shown in Fig. 9. In any case, it would be very useful to have a more microscopic approach (based, for instance, on a "many agents" model [4, 14]) giving some insight into the particular form of h(u).
[Figure 9: Profile of h(u), the probability density function of the jump amplitudes, for the gamma case. The parameters are those reported in the main text.]
4. Conclusions

Summarizing, by means of a continuous description of random pulses we have obtained a dynamical model leading to a probability distribution for speculative price changes. This distribution is given by the following characteristic function:

(18)    \tilde p(\omega, t) = \exp\Big\{ -a \int_0^1 z^{-1-\alpha} \big[ 1 - \tilde h(\omega z \sigma_m(t)) \big] \, dz \Big\},

where σ_m(t) = (bt)^{1/α}; it depends on three positive constants: a, b, and α < 2. The characteristic function (18) also depends on an unknown function h̃(ω), the unit-variance characteristic function of the jumps, to be conjectured and fitted from the tails of the empirical distribution. Therefore, starting from simple and reasonable assumptions, we have developed a new stochastic process that possesses many of the features of financial time series (fat tails, self-similarity, super-diffusion, and finite moments), thus providing a different point of view on the dynamics of the market. We finally point out that the model does not explain the correlations observed in empirical data (which some markets seem to exhibit [3, 13]). This insufficiency is due to the fact that we have modelled the behavior of returns through a mixture of independent sources of white noise. The extension of the model to include non-white noise sources and, hence, correlations can be found in [21].
References

[1] Bollerslev, T.: Generalized Autoregressive Conditional Heteroskedasticity. J. Econometrics 31 (1986), 307–327.
[2] Bollerslev, T., R. Y. Chou and K. F. Kroner: ARCH Modeling in Finance. J. Econometrics 52 (1992), 5–59.
[3] Bouchaud, J. P. and M. Potters: Théorie des Risques Financiers. Aléa-Saclay, Paris, 1997.
[4] Campbell, J. Y., A. W. Lo and A. C. MacKinlay: The Econometrics of Financial Markets. Princeton University Press, Princeton, 1997.
[5] Cootner, P. H. (ed.): The Random Character of Stock Market Prices. MIT Press, Cambridge MA, 1964.
[6] Engle, R. F.: Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of the United Kingdom Inflation. Econometrica 50 (1982), 987–1007.
[7] Fama, E.: Mandelbrot and the Stable Paretian Hypothesis. J. Business 35 (1963), 420–429.
[8] Feller, W.: An Introduction to Probability Theory and its Applications. J. Wiley, New York, 1971.
[9] Galluccio, S., G. Caldarelli, M. Marsili and Y.-C. Zhang: Scaling in Currency Exchange. Physica A 245 (1997), 423–436.
[10] Ghashghaie, S., W. Breymann, J. Peinke, P. Talkner and Y. Dodge: Turbulent Cascades in Foreign Exchange Markets. Nature 381 (1996), 767–770.
[11] Kendall, M. G.: The Analysis of Economic Time-Series. Part I: Prices. J. Royal Stat. Soc. 96 (1953), 11–25.
[12] Koponen, I.: Analytical Approach to the Problem of Convergence of Truncated Lévy Flights towards the Gaussian Stochastic Process. Phys. Rev. E 52 (1995), 1197–1199.
[13] Lo, A. W. and A. C. MacKinlay: Stock Market Prices Do Not Follow Random Walks: Evidence from a Simple Specification Test. Rev. Financial Studies 1 (1988); When Are Contrarian Profits Due to Stock Market Overreaction? Ibid. 3 (1990), 175–205.
[14] Lux, T. and M. Marchesi: Scaling and Criticality in a Stochastic Multi-agent Model of a Financial Market. Nature 397 (1999), 498–500.
[15] Mandelbrot, B.: The Variation of Certain Speculative Prices. J. Business 35 (1963), 394–419.
[16] Mantegna, R. N. and H. E. Stanley: Stochastic Process with Ultra-slow Convergence to a Gaussian: The Truncated Lévy Flight. Phys. Rev. Lett. 73 (1994), 2946–2949.
[17] Mantegna, R. N. and H. E. Stanley: Scaling Behaviour in the Dynamics of an Economic Index. Nature 376 (1995), 46–49.
[18] Mantegna, R. N. and H. E. Stanley: Turbulence and Financial Markets. Nature 383 (1996), 587–588.
[19] Mantegna, R. N. and H. E. Stanley: Stock Market Dynamics and Turbulence: Parallel Analysis of Fluctuation Phenomena. Physica A 239 (1997), 255–266.
[20] Masoliver, J.: First Passage Times for Non-Markovian Processes: Shot Noise. Phys. Rev. A 35 (1987), 3918–3928.
[21] Masoliver, J., M. Montero and A. McKane: Integrated Random Processes Exhibiting Long Tails, Finite Moments, and Power-Law Spectra. Phys. Rev. E 64 (2001), 1–11.
[22] Masoliver, J., M. Montero and J. M. Porrà: A Dynamical Model Describing Stock Market Price Distributions. Physica A 283 (2000), 559–567.
[23] Merton, R. C.: Option Pricing when the Underlying Stock Returns Are Discontinuous. J. Financial Economics 3 (1976), 125–144.
[24] Rice, S. O.: in Noise and Stochastic Processes, N. Wax (ed.). Dover, New York, 1954.
[25] Scalas, E.: Scaling in the Market of Futures. Physica A 253 (1998), 394–402.
Jaume Masoliver
Departament de Física Fonamental, Universitat de Barcelona
Diagonal 647, 08028 Barcelona, Spain
[email protected]

Miquel Montero
Departament de Física Fonamental, Universitat de Barcelona
Diagonal 647, 08028 Barcelona, Spain
[email protected]

Josep M. Porrà
Gaesco Bolsa, SVB, S.A.
Diagonal 429, 08036 Barcelona, Spain
[email protected]
Risk management with drawdown functions

Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin¹

¹ Alexei Chekhlov is Vice President of the Thor Asset Management Department. Stanislav Uryasev is a professor in the Industrial and Systems Engineering (ISE) Department of the University of Florida and director of the Risk Management and Financial Engineering (RMFE) Lab. Michael Zabarankin is a doctoral student in the Industrial and Systems Engineering Department of the University of Florida. This talk was given by the second author (Stanislav Uryasev) at the October 2000 session of the Instituto MEFF-RiskLab Seminar.
Abstract: We propose a new one-parameter family of risk measures called Conditional Drawdown-at-Risk (CDaR). These measures of risk are functionals of the portfolio drawdown (underwater) curve considered in active portfolio management. For some value of the tolerance parameter β, the CDaR is defined as the mean of the worst (1 − β) ∗ 100% drawdowns. The CDaR risk measure includes the Maximal Drawdown and the Average Drawdown as its limiting cases. For a particular example, we find the optimal portfolios for the case of Maximal Drawdown, the case of Average Drawdown, and several intermediate cases between these two. The CDaR family of risk measures is similar to Conditional Value-at-Risk (CVaR), which is also called Mean Shortfall, Mean Excess Loss, or Tail Value-at-Risk. Some recommendations on how to select the optimal risk measure for obtaining practically stable portfolios are provided. We solve a real-life portfolio allocation problem using the proposed measures.
1. Introduction

Optimal portfolio allocation is a longstanding issue in both practical portfolio management and academic research on portfolio theory. Various methods have been proposed and studied (for a recent review see, for example, [6]). All of them, as a starting point, assume some measure of portfolio risk. From the standpoint of a fund manager, who trades clients' or a bank's proprietary capital, and for whom the clients' accounts are the only source of income, coming in the form of management and incentive fees, losing these accounts is equivalent to the death of his business. This is true regardless of whether the employed strategy is valid in the long term and has very attractive expected-return characteristics. Such a fund manager's primary concern is to keep the existing accounts and to attract new ones
in order to increase his revenues. A particular client, who was persuaded into opening an account with the manager by reading the disclosure document, listening to the manager's attractive story, knowing his previous returns, etc., will most likely decide on firing the manager based on the sizes and durations of his account's drawdowns. In particular, it is highly uncommon for a Commodity Trading Advisor (CTA) to still hold a client whose account has been in a drawdown, even of small size, for longer than two years. By the same token, it is unlikely that a particular client will tolerate a 50% drawdown in an account with an average- or small-risk CTA. Similarly, in an investment bank setup, a proprietary system trader will be expected to make money within one year at the longest, i.e., he cannot be in a drawdown for longer than a year. Also, he may be shut down if a certain maximal drawdown condition is breached, which is normally around 20% of his backing equity. Additionally, he will be given a warning drawdown level (around 15%) at which his permission to keep running the system will be reviewed. Obviously, these issues make managed-account practitioners very concerned about both the size and the duration of their clients' drawdowns. First, we want to mention the paper [7], where an assumption of log-normality of equity statistics and the use of dynamic programming theory led to an exact analytical solution of a maximal drawdown problem in the one-dimensional case. A subsequent generalization of this work to multiple dimensions was done in [3]. In contrast to these works, which sought a time-dependent fraction of "capital at risk", we will be looking for a constant set of weights which will satisfy a certain risk condition over a period of time. We make no assumption about the underlying probability distribution, which allows us to consider a variety of practical applications. We primarily concentrate on the portfolio equity curves over a particular past history path, which, effectively, makes the risk measures historical rather than stochastic. Being perfectly aware of this insufficiency, we leave the issue of the predictive power of a constant set of weights for future research, trying to introduce and test the new approach in this simplified version. To some extent we consider a setup similar to the index tracking problem [4], where an index's historical performance is replicated by a portfolio with constant weights. In this paper, we have introduced and studied a one-parameter family of risk measures called Conditional Drawdown-at-Risk (CDaR). This measure of risk quantifies in an aggregated format the number and magnitude of the portfolio drawdowns over some period of time. By definition, a drawdown is the drop in the portfolio value compared to the maximum achieved in the past. We can define the drawdown in absolute or relative (percentage) terms. For example, if at the present time the portfolio value equals $9M and the maximal portfolio value in the past was $10M, we can say that the portfolio drawdown equals $1M in absolute terms and 10% in relative terms. For some value of the tolerance parameter β, the β-CDaR is defined as the mean of the worst (1 − β) ∗ 100% drawdowns experienced over some period of time. For instance, the 0.95-CDaR (or 95%-CDaR) is the average of the worst 5% of drawdowns over the considered time interval.
The CDaR risk measure includes the average drawdown and the maximal drawdown as its limiting cases. The CDaR takes into account both the size and the duration of the drawdowns, whereas the maximal drawdown measure concentrates on a single event: the maximal loss of the account from its previous peak. CDaR is related to the Value-at-Risk (VaR) risk measure and to the Conditional Value-at-Risk (CVaR) risk measure studied in the paper [13]. By definition, with respect to a specified probability level β, the β-VaR of a portfolio is the lowest amount α such that, with probability β, the loss will not exceed α over a specified time τ (see, for instance, [5]), whereas the β-CVaR is the conditional expectation of losses above that amount α. The CDaR risk measure is similar to CVaR and can be viewed as a modification of CVaR to the case in which the loss function is defined as a drawdown. CDaR and CVaR are conceptually closely related percentile-based risk performance measures. Optimization approaches developed for CVaR can be directly extended to CDaR. The paper [11] considers several equivalent approaches for generating return-CVaR efficient frontiers; in particular, it considers an approach which maximizes return subject to CVaR constraints. A nice feature of this approach is that the threshold which is exceeded (1 − β) ∗ 100% of the time is calculated automatically, using an additional variable (see details in [11, 13]), and the resulting problem is linear. CVaR is also known as Mean Excess Loss, Mean Shortfall [4, 10], or Tail Value-at-Risk [2]. A case study on the hedging of a portfolio of options using the CVaR minimization technique is included in [11]. Also, the CVaR minimization approach was applied to the credit risk management of a portfolio of bonds [1]. A case study on the optimization of a portfolio of stocks with CVaR constraints is considered in [11]. Similar to the Markowitz mean-variance approach [9], we formulate and solve the optimization problem with the return performance function and CDaR constraints. The return-CDaR optimization problem is a piecewise linear convex optimization problem (see the definition of convexity in [12]), which can be reduced to a linear programming problem using auxiliary variables. An explanation of the procedure for reducing piecewise linear convex optimization problems to linear programming problems is beyond the scope of this paper. In formulating the optimization problems with CDaR constraints and reducing them to linear programming problems, we follow the ideas presented in the paper [11]. Linear programming allows solving large optimization problems with hundreds of thousands of instruments. The algorithm is fast, numerically stable, and provides a solution in one run (without adjusting parameters, as in genetic algorithms or neural networks). Linear programming approaches are routinely used in portfolio optimization with various criteria, such as mean absolute deviation [8], maximum deviation [14], and mean regret [4]. The reader interested in other applications of optimization techniques in finance can find relevant papers in [15].
2. General Setup

Denote by w(x, t) the uncompounded portfolio value at time t, where the portfolio vector x = (x_1, x_2, . . . , x_m) consists of the weights of the m instruments in the portfolio. The drawdown function at time t is defined as the difference between the maximum of the function w(x, τ) over the history preceding the point t and the value of this function at time t:

(1)    f(x, t) = \max_{0 \le \tau \le t} \{ w(x, \tau) \} - w(x, t).

We consider three risk measures: (i) Maximum Drawdown (MaxDD), (ii) Average Drawdown (AvDD), and (iii) Conditional Drawdown-at-Risk (CDaR). The last risk measure, Conditional Drawdown-at-Risk, is actually a family of performance measures depending upon a parameter β. It is defined similarly to the Conditional Value-at-Risk studied in [2] and, as special cases, includes the Maximum Drawdown and the Average Drawdown risk measures. The maximum drawdown on the interval [0, T] is calculated by maximizing the drawdown function f(x, t), i.e.,

(2)    M(x) = \max_{0 \le t \le T} \{ f(x, t) \}.

The average drawdown is equal to

(3)    A(x) = \frac{1}{T} \int_0^T f(x, t) \, dt.

For some value of the parameter β ∈ [0, 1], the CDaR is defined as the mean of the worst (1 − β) ∗ 100% drawdowns. For instance, if β = 0 then CDaR is the average drawdown, and if β = 0.95 then CDaR is the average of the worst 5% of drawdowns. Let us denote by α(x, β) a threshold such that (1 − β) ∗ 100% of the drawdowns exceed it. Then the CDaR with tolerance level β can be expressed as follows:

(4)    \Delta_\beta(x) = \frac{1}{(1-\beta)\,T} \int_\Omega f(x, t) \, dt, \qquad \Omega = \{ t \in [0, T] : f(x, t) \ge \alpha(x, \beta) \}.

Here, when β tends to 1, CDaR tends to the maximum drawdown, i.e., \Delta_1(x) = M(x). To limit possible risks, depending upon our risk preferences, we can impose a constraint on the maximum drawdown given by (2), M(x) ≤ ν_1 C, on the average drawdown given by (3), A(x) ≤ ν_2 C, on the CDaR given by (4), \Delta_\beta(x) ≤ ν_3 C, or combine several constraints together:

(5)    A(x) \le \nu_2 C, \qquad M(x) \le \nu_1 C, \qquad \Delta_\beta(x) \le \nu_3 C,

where the constant C represents the available capital and the coefficients ν_1, ν_2 and ν_3 define the proportion of this capital which is "allowed to be lost". Usually,

(6)    0 \le \nu_1 \le 1, \qquad 0 \le \nu_2 \le 1, \qquad 0 \le \nu_3 \le 1.
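For concreteness, the sample versions of these three measures are easy to compute from a discretely observed equity curve. The following Python sketch is a hypothetical helper, not part of the paper (which embeds the measures in optimization problems instead); the CDaR estimate is taken as the mean of the worst (1 − β) share of the sampled drawdowns:

# Sketch: sample MaxDD, AvDD and beta-CDaR of an equity curve w_1..w_N.
import numpy as np

def drawdowns(w):
    w = np.asarray(w, dtype=float)
    return np.maximum.accumulate(w) - w      # f_i = max_{j<=i} w_j - w_i

def max_dd(w):
    return drawdowns(w).max()

def av_dd(w):
    return drawdowns(w).mean()

def cdar(w, beta=0.95):
    dd = np.sort(drawdowns(w))[::-1]         # worst drawdowns first
    k = max(1, int(round((1.0 - beta) * dd.size)))
    return dd[:k].mean()                     # mean of the worst (1-beta) share

w = np.cumsum(np.random.default_rng(0).normal(0.001, 0.02, 2000))  # toy curve
print(max_dd(w), av_dd(w), cdar(w, 0.95))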
Suppose that the historical returns of the m portfolio instruments on the interval [0, T] are available. Let the vector y(t) = (y_1(t), y_2(t), . . . , y_m(t)) be the set of uncompounded cumulative net profits of the m portfolio instruments at the time moment t. The cumulative portfolio value then equals

    w(x, t) = \sum_{k=1}^{m} y_k(t) \, x_k = y(t) \cdot x.

The average annualized return R(x) over the period [0, T], which is a linear function of x, is defined as follows:

(7)    R(x) = \frac{1}{Cd}\, w(x, T) = \frac{1}{Cd}\, y(T) \cdot x,

where d is the number of years in the time interval [0, T]. For the case considered, so-called technological constraints on the vector x need to be imposed. Here we assume that they are given by the set of box constraints

(8)    X = \{ x : x_{\min} \le x_k \le x_{\max}, \ \forall k = 1, \dots, m \},

for some constant values of x_min and x_max. Our objective is to maximize the return R(x) subject to constraints on various risk performance measures and to the technological constraints (8) on the portfolio positions.
3. Problem Statement

Maximization of the average return with a constraint on the maximum drawdown can be formulated as the following mathematical programming problem:

(9)    \max_{x \in X} R(x) \quad \text{s.t.} \quad M(x) \le \nu_1 C.

Maximization of the average return with a constraint on the average drawdown can be formulated as follows:

(10)    \max_{x \in X} R(x) \quad \text{s.t.} \quad A(x) \le \nu_2 C.
Analogously, maximization of the average return with a constraint on CDaR can be formulated as follows:

(11)    \max_{x \in X} R(x) \quad \text{s.t.} \quad \Delta_\beta(x) \le \nu_3 C.
Similar to [2], the problems (9), (10), (11) can be reduced to linear programming problems using some auxiliary variables.
4. Discrete Model

By dividing the interval [0, T] into N equal subintervals (for instance, trading days),

(12)    t_i = i \, \frac{T}{N}, \qquad i = 1, \dots, N,

we create the discrete approximations of the vector function y(t),

(13)    y(t_i) = y_i,

of the drawdown function,

(14)    f_i(x) = \max_{1 \le j \le i} \{ y_j \cdot x \} - y_i \cdot x,

and of the average annualized return function,

(15)    R(x) = \frac{1}{Cd} \, y_N \cdot x.
(16)
1≤i≤N 1≤j≤i
xk ∈ [xmin , xmax ],
∀ k = 1, m.
The optimization problem with constraint on average drawdown can be written as follows 1 yN · x x Cd N 1 s. t. { max {yj · x} − yi · x} ≤ ν2 C, N i=1 1≤j≤i xk ∈ [xmin , xmax ], ∀ k = 1, m. max
(17)
Following the approach for Conditional Value-at-Risk (CVaR) [2], it can be proved that the discrete version of the optimization problem with a constraint on CDaR may be stated as follows:

(18)    \max_x \ \frac{1}{Cd}\, y_N \cdot x \quad \text{s.t.} \quad \alpha + \frac{1}{(1-\beta)N} \sum_{i=1}^{N} \Big( \max_{1 \le j \le i} \{ y_j \cdot x \} - y_i \cdot x - \alpha \Big)^{+} \le \nu_3 C, \qquad x_k \in [x_{\min}, x_{\max}], \ \forall k = 1, \dots, m,
where we use the notation (g)^+ = max{0, g}. An important feature of this formulation is that it does not involve the threshold function α(x, β): an optimal solution of problem (18) with respect to x and α gives the optimal portfolio and the corresponding value of the threshold function. Problems (16), (17), and (18) have been reduced to linear programming problems using auxiliary variables and have been solved with the CPLEX solver (the inputs are prepared in the C++ programming language). An alternative verification of the solutions was obtained by solving similar optimization problems with a more general Genetic Algorithm method implemented in VB6, the discussion of which is beyond the present scope.
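To make the auxiliary-variable reduction concrete, here is a minimal sketch of problem (18) as a linear program, solved with SciPy's HiGHS solver on synthetic data rather than with CPLEX; the instance size, data and parameter values are illustrative assumptions. The running maximum is linearized through z_i ≥ y_i · x and z_i ≥ z_{i−1}, and the (·)^+ through u_i ≥ 0 and u_i ≥ f_i − α, which is exactly the device that makes the CDaR constraint linear:

# Sketch: LP reduction of problem (18); variables are (x, z, u, alpha),
# where z_i is the running maximum of the portfolio value and u_i plays
# the role of (f_i - alpha)^+. Synthetic data; illustrative parameters.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, m = 60, 4
y = np.cumsum(rng.normal(0.001, 0.01, size=(N, m)), axis=0)  # cumulative P&L
beta, nu3, C, d = 0.8, 0.05, 1.0, 1.0
xmin, xmax = 0.0, 0.8

nv = m + N + N + 1                           # layout: x | z | u | alpha
c = np.zeros(nv)
c[:m] = -y[-1] / (C * d)                     # maximize R(x) = y_N . x / (Cd)

A, b = [], []
for i in range(N):
    row = np.zeros(nv); row[:m] = y[i]; row[m + i] = -1.0
    A.append(row); b.append(0.0)             # z_i >= y_i . x
    if i > 0:
        row = np.zeros(nv); row[m + i - 1] = 1.0; row[m + i] = -1.0
        A.append(row); b.append(0.0)         # z_i >= z_{i-1}
    row = np.zeros(nv)
    row[:m] = -y[i]; row[m + i] = 1.0; row[m + N + i] = -1.0; row[-1] = -1.0
    A.append(row); b.append(0.0)             # u_i >= z_i - y_i . x - alpha
row = np.zeros(nv)
row[m + N:m + 2 * N] = 1.0 / ((1.0 - beta) * N); row[-1] = 1.0
A.append(row); b.append(nu3 * C)             # alpha + mean excess <= nu3*C

bounds = [(xmin, xmax)] * m + [(None, None)] * N + [(0.0, None)] * N + [(None, None)]
res = linprog(c, A_ub=np.array(A), b_ub=b, bounds=bounds, method="highs")
print(res.x[:m], -res.fun)                   # optimal weights and return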
5. Results

As the starting equity curves we have used the equity curves generated by a characteristic futures technical trading system in m = 32 different markets, covering a wide range of major liquid markets (currencies, currency crosses, U.S. treasuries both short- and long-term, foreign long-term treasuries, international equity indices, and metals). The list of market ticker symbols provided in the results below is mnemonic and corresponds to the widely used data provider, FutureSource. It is necessary to mention several issues related to the technological constraints (8). In our case we chose xmin = 0.2 and xmax = 0.8. This choice was dictated by the need to keep the resultant margin-to-equity ratio of the account within admissible bounds, which are specific to a particular portfolio. These constraints, in this futures trading setup, are analogous to the "fully invested" condition of classical Sharpe-Markowitz theory [1], and it is precisely this condition that makes the efficient frontier concave. In the absence of these constraints, the efficient frontier would be a straight line passing through (0,0), due to the virtually infinite leverage of these types of strategies. Another subtle issue has to do with the stability of the optimal portfolios if the constraints are "too lax". It is a matter of empirical evidence that the more lax the constraints are, the better the portfolio equity curve that can be obtained through optimal mixing, and the less stable these results are with respect to walk-forward analysis. The above set of constraints was empirically found both to lead to sufficiently stable portfolios and to allow enough mixing of the individual equity curves.
102
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
The individual equity curves, when the market existed at the time, covered a time span of 1/1/1988 through 9/1/1999. The equity curves were based on $20M backing equity in a margin account and were uncompounded, i.e. it was assumed that the amount of risk being taken, was always based of the original $20M, not taking the money being made or lost into account. The problem, then, is to find a set of weights x = (x1 , x2 , . . . , xm ), such that it solves the minimization problems (16), (17), or (18). Let us denote the problem (16) as the MaxDD problem, the problem (17) as the AvDD problem, and the problem (18) as the β-CDaR problem. We have solved the above optimization problems for cases of (1 − β) = 0, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8 and 1. As we have noted before, cases of (1 − β) = 0 and (1 − β) = 1 correspond to MaxDD and AvDD problems, respectively. Tables 1 and 2 below provide the list of markets and corresponding sets of optimal weights for MaxDD and AvDD problems. Table 3 provides the weights for the case with (1 − β) = 0.05 CDaR. In these tables, the solution achieving maximal Reward/Risk ratio is boldfaced. Note that the smallest value of risk is chosen in such a way that the solutions to the optimization problem still exist. This means that each problem does not have a solution beyond the upper and lower bounds of the risk range covered (the whole efficient frontier is shown). Notions of risk and rate of return are expressed in percent with respect to the original account size, i.e. $20M. Efficient frontiers for problems reward-MaxDD and reward-AvDD, are shown in Figures 1 and 2, respectively. We do not show efficient frontiers for CDaR measure on separate graphs (except for MaxDD and AvDD). However, we show on Figure 3 the reward-MaxDD graphs for portfolios optimal with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints. As it is expected, the case with (1 − β) = 0 CDaR corresponding to MaxDD has a concave efficient frontier majorating other graphs. The reward is not maximal for each level of MaxDD when we solved the optimization problems with (1 − β) = 0.05, 0.4 and 1 CDaR constraints. Viewed from the reference point of MaxDD problem, (1 − β) < 1 solutions are uniformly “worse”. However, none of these solutions are truly better or worse than others from a mathematical standpoint. Each of them provides the optimal solution in its own sense. Some thoughts on which might be a better solution from a practical standpoint are provided below. Similar to Figure 3, Figure 4 depicts the reward-AvDD graphs for portfolios optimal with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints. The case with (1 − β) = 1 CDaR corresponding to AvDD has a concave efficient frontier majorating other graphs. As in classical portfolio theory, we are interested in a portfolio with a maximal Reward/Risk ratio, i.e., the portfolio where the straight line coming through (0,0) becomes tangent to the efficient frontier. We will call the Reward/Risk ratios for Risk defined in terms of problems (16), (17), and (18) as MaxDDRatio, AvDDRatio, and CDaRRatio which, by definition, are (19)
MaxDDRatio =
R(x) , M (x)
AvDDRatio =
R(x) , A(x)
CDaRRatio =
R(x) . ∆β (x)
Risk management with drawdown functions
103
The charts of MaxDDRatio and AvDDRatio quantities are shown in Figures 5 and 6 for the same cases of (1 − β) as in Figures 3 and 4. We have solved optimization problem (18) for cases of (1 − β) = 0, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8 and 1. Let us note that already the case of (1 − β) = 0.05 (see Table 3), which considers minimization of the worst 5% part of the underwater curve, is producing a set of weights significantly different from the (1 − β) = 0 case (MaxDD problem), and (1−β) = 0.05 CDaR case includes several tens of events over which the averaging was performed. We consider that optimization with (1 − β) = 0.05 or 0.1 constraints produces a more robust portfolio than the optimization with MaxDD or AvDD constraints. CDaR solution takes into account many significant drawdowns, comparing to the case with MaxDD constraints, which considers only the largest drawdown. Also, CDaR solution is not dominated by many small drawdowns like the case with AvDD constraints. We have also made an alternative check of our results via solving the related nonlinear optimization problems corresponding to problems (16)-(18). These problems have optimized the corresponding drawdown ratios defined above within the same set of constraints. Verification was done using Genetic Algorithm-based search software. We were satisfied to find that this procedure has produced the same sets of weights for the optimal solutions.
6. Conclusions We have introduced a new CDaR risk measure, which, we believe, is useful for the practical portfolio management. This measure is similar to CVaR risk measure and has the MaxDD and AvDD risk measures as its limiting cases. We have studied Reward/Risk ratios implied by these measures of risk, namely MaxDDRatio, AvDDRatio, and CDaRRatio. We have shown that the portfolio allocation problem with CDaR, MaxDD and AvDD risk measures can be efficiently solved. We have posed and for a real-life example, solved a portfolio allocation problem. These developments, if implemented in a managed accounts’ environment will allow a trading or risk manager to allocate risk according to his personal assessment of extreme drawdowns and their duration on his portfolio equity. We believe that however attractive the MaxDD approach is, the solutions produced by this optimization may have a significant statistical error because the decision is based on a single observation of maximal loss. Having a CDaR family of risk measures allows a risk manager to have control over the worst (1 − β) ∗ 100% of drawdowns, and due to statistical averaging within that range, to get a better predictive power of this risk measure in the future, and therefore a more stable portfolio. Our studies indicate that when considering CDaR with an appropriate level (e.g., β = 0.95, i.e., optimizing over the 5% of the worst drawdowns), one can get a more stable weights allocation than that produced by the MaxDD problem. A detailed study of this issue calls for a separate publication.
104
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
Acknowledgements: Authors are grateful to Anjelina Belakovskaia, Peter Carr, Stephan Demoura, Nedia Miller, and Mikhail Smirnov for valuable comments which helped to improve the paper.
Appendix: Tables and Figures Risk, % 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 Reward, % 25.0 36.3 44.5 51.4 57.3 63.0 67.7 71.7 75.2 78.0 80.4 81.9 82.9 83.0 Reward/Risk 6.26 7.27 7.42 7.34 7.16 7.00 6.77 6.52 6.27 6.00 5.74 5.46 5.18 4.88
Optimal portfolio configuration AAO AD AXB BD BP CD CP DGB DX ED EU FV FXADJY FXBPJY FXEUBP FXEUJY FXEUSF FXNZUS FXUSSG FXUSSK GC JY LBT LFT LGL LML MNN SF SI SJB SNI TY
0.20 0.20 0.20 0.20 0.20 0.25 0.62 0.20 0.20 0.20 0.20 0.20 0.27 0.20 0.20 0.20 0.33 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.49 0.20 0.20
0.25 0.40 0.37 0.20 0.20 0.59 0.80 0.80 0.20 0.20 0.20 0.20 0.58 0.20 0.28 0.20 0.20 0.20 0.20 0.80 0.20 0.23 0.35 0.20 0.20 0.27 0.30 0.20 0.20 0.74 0.56 0.20
0.25 0.74 0.32 0.20 0.20 0.80 0.77 0.80 0.20 0.20 0.20 0.39 0.77 0.20 0.29 0.41 0.25 0.20 0.20 0.80 0.20 0.34 0.62 0.20 0.20 0.36 0.42 0.37 0.20 0.80 0.67 0.23
0.28 0.80 0.47 0.20 0.20 0.80 0.80 0.80 0.20 0.20 0.80 0.58 0.80 0.20 0.32 0.80 0.30 0.20 0.20 0.65 0.20 0.25 0.80 0.20 0.20 0.46 0.45 0.39 0.20 0.80 0.69 0.32
0.21 0.80 0.63 0.62 0.20 0.80 0.80 0.80 0.20 0.20 0.80 0.52 0.80 0.20 0.34 0.80 0.73 0.20 0.20 0.73 0.20 0.37 0.80 0.39 0.20 0.51 0.44 0.52 0.20 0.80 0.78 0.60
0.39 0.80 0.80 0.41 0.20 0.80 0.80 0.80 0.20 0.20 0.80 0.50 0.80 0.20 0.65 0.80 0.80 0.20 0.20 0.70 0.20 0.80 0.80 0.63 0.20 0.60 0.80 0.52 0.20 0.80 0.80 0.69
0.68 0.80 0.55 0.53 0.20 0.80 0.80 0.80 0.20 0.20 0.80 0.54 0.80 0.53 0.72 0.80 0.80 0.20 0.20 0.60 0.20 0.80 0.80 0.80 0.20 0.78 0.80 0.63 0.20 0.80 0.80 0.80
0.80 0.80 0.64 0.56 0.20 0.80 0.80 0.80 0.20 0.20 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.28 0.35 0.20 0.80 0.80 0.80 0.37 0.80 0.80 0.75 0.20 0.80 0.80 0.80
0.69 0.80 0.80 0.80 0.22 0.80 0.80 0.80 0.63 0.20 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.21 0.20 0.20 0.80 0.80 0.80 0.80 0.80 0.77 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.51 0.80 0.80 0.80 0.80 0.35 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.43 0.20 0.20 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.77 0.80 0.80 0.80 0.80 0.74 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.72 0.20 0.20 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.27 0.80 0.80 0.57 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.40 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80
Table 1: List of markets and corresponding sets of optimal weights for the MaxDD problem. The solution achieving maximal Reward/Risk ratio is boldfaced.
Risk management with drawdown functions
Risk, % Reward, % Reward/Risk
0.77 21.7 28.2
1.00 35.6 35.6
1.23 45.3 36.8
1.46 53.3 36.5
1.50 54.5 36.3
1.69 59.9 35.4
1.92 65.7 34.2
2.15 70.6 32.9
105
2.38 74.8 31.4
2.61 78.2 30.0
2.84 81.2 28.6
3.07 83.0 27.0
0.80 0.80 0.80 0.20 0.43 0.80 0.80 0.80 0.20 0.70 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.20 0.65 0.80 0.80 0.80 0.20 0.80 0.80 0.69
0.80 0.80 0.80 0.52 0.80 0.80 0.80 0.80 0.30 0.75 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.36 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.77
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.71 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.46 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.79 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80
Optimal portfolio configuration AAO AD AXB BD BP CD CP DGB DX ED EU FV FXADJY FXBPJY FXEUBP FXEUJY FXEUSF FXNZUS FXUSSG FXUSSK GC JY LBT LFT LGL LML MNN SF SI SJB SNI TY
0.20 0.21 0.20 0.20 0.20 0.20 0.24 0.33 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.29 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.23 0.20 0.20
0.46 0.57 0.20 0.20 0.20 0.37 0.60 0.80 0.20 0.30 0.20 0.20 0.20 0.20 0.20 0.59 0.62 0.20 0.20 0.74 0.20 0.38 0.52 0.20 0.20 0.20 0.20 0.20 0.20 0.67 0.33 0.20
0.61 0.80 0.23 0.20 0.20 0.54 0.80 0.80 0.20 0.35 0.20 0.37 0.20 0.32 0.29 0.80 0.80 0.20 0.20 0.80 0.20 0.62 0.80 0.20 0.20 0.21 0.20 0.38 0.20 0.80 0.47 0.20
0.77 0.80 0.55 0.20 0.20 0.80 0.80 0.80 0.20 0.33 0.20 0.50 0.31 0.49 0.53 0.80 0.80 0.20 0.40 0.80 0.20 0.80 0.80 0.20 0.20 0.34 0.20 0.50 0.20 0.80 0.62 0.20
0.80 0.80 0.62 0.20 0.20 0.80 0.80 0.80 0.20 0.32 0.20 0.53 0.33 0.50 0.58 0.80 0.80 0.20 0.48 0.80 0.20 0.80 0.80 0.20 0.20 0.34 0.20 0.54 0.20 0.80 0.66 0.20
0.80 0.80 0.80 0.20 0.20 0.80 0.80 0.80 0.20 0.21 0.20 0.76 0.42 0.69 0.77 0.80 0.80 0.27 0.71 0.80 0.20 0.80 0.80 0.20 0.20 0.49 0.20 0.67 0.20 0.80 0.72 0.20
0.80 0.80 0.80 0.20 0.20 0.80 0.80 0.80 0.20 0.31 0.46 0.80 0.57 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.20 0.29 0.64 0.42 0.80 0.20 0.80 0.80 0.20
0.80 0.80 0.80 0.20 0.20 0.80 0.80 0.80 0.20 0.44 0.80 0.80 0.73 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.20 0.48 0.80 0.80 0.80 0.20 0.80 0.80 0.32
Table 2: List of markets and corresponding sets of optimal weights for the AvDD problem. The solution achieving maximal Reward/Risk ratio is boldfaced.
106
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
Risk, % Reward, % α, % Reward/Risk
3.0 24.2 2.55 8.06
3.2 27.2 2.64 8.50
3.7 33.3 3.10 8.99
3.8 34.4 3.18 9.04
3.9 35.5 3.27 9.09
4.0 36.6 3.36 9.14
5.0 46.3 4.26 9.26
6.0 54.7 5.13 9.12
7.0 62.1 6.02 8.86
8.0 68.4 6.81 8.55
9.0 73.9 7.66 8.21
10.0 78.6 8.61 7.86
11.0 82.0 9.57 7.45
12.0 83.0 9.98 6.92
0.80 0.80 0.46 0.67 0.20 0.80 0.80 0.80 0.20 0.28 0.80 0.73 0.80 0.73 0.76 0.80 0.80 0.20 0.75 0.80 0.20 0.80 0.80 0.58 0.20 0.52 0.80 0.80 0.20 0.80 0.80 0.39
0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80 0.31 0.28 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.79 0.20 0.80 0.80 0.66 0.27 0.69 0.80 0.80 0.20 0.80 0.80 0.70
0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.80 0.80 0.48 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.20 0.80 0.80 0.20 0.80 0.80 0.76 0.66 0.74 0.80 0.80 0.20 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.64 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.77 0.80 0.80 0.20 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.58 0.80 0.80 0.80
0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80 0.80
Optimal portfolio configuration AAO AD AXB BD BP CD CP DGB DX ED EU FV FXADJY FXBPJY FXEUBP FXEUJY FXEUSF FXNZUS FXUSSG FXUSSK GC JY LBT LFT LGL LML MNN SF SI SJB SNI TY
0.20 0.24 0.20 0.20 0.20 0.20 0.23 0.50 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.31 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.47 0.21 0.20
0.21 0.36 0.20 0.20 0.20 0.20 0.34 0.71 0.20 0.20 0.20 0.20 0.22 0.20 0.20 0.35 0.20 0.20 0.20 0.20 0.20 0.35 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.57 0.22 0.20
0.30 0.60 0.20 0.20 0.20 0.29 0.41 0.80 0.20 0.20 0.23 0.20 0.33 0.20 0.29 0.68 0.28 0.20 0.20 0.22 0.20 0.42 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.71 0.29 0.20
0.32 0.64 0.20 0.20 0.20 0.31 0.44 0.80 0.20 0.20 0.26 0.23 0.34 0.20 0.31 0.72 0.30 0.20 0.20 0.22 0.20 0.43 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.74 0.29 0.20
0.33 0.68 0.20 0.20 0.20 0.32 0.46 0.80 0.20 0.20 0.30 0.25 0.35 0.20 0.34 0.74 0.31 0.20 0.20 0.24 0.20 0.45 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.77 0.30 0.20
0.34 0.69 0.20 0.20 0.20 0.33 0.51 0.80 0.20 0.20 0.31 0.30 0.36 0.20 0.34 0.77 0.29 0.20 0.20 0.25 0.20 0.47 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.80 0.33 0.20
0.49 0.80 0.20 0.20 0.20 0.49 0.80 0.80 0.20 0.26 0.80 0.47 0.49 0.20 0.43 0.80 0.38 0.20 0.20 0.61 0.20 0.75 0.47 0.25 0.20 0.20 0.20 0.20 0.20 0.80 0.58 0.20
0.54 0.80 0.20 0.60 0.20 0.64 0.80 0.80 0.20 0.27 0.80 0.47 0.69 0.32 0.39 0.80 0.59 0.20 0.37 0.80 0.20 0.80 0.80 0.28 0.20 0.20 0.34 0.54 0.20 0.80 0.80 0.20
0.69 0.80 0.33 0.69 0.20 0.80 0.80 0.80 0.20 0.31 0.80 0.56 0.80 0.50 0.46 0.80 0.80 0.20 0.59 0.80 0.20 0.80 0.80 0.43 0.20 0.31 0.74 0.80 0.20 0.80 0.80 0.20
Table 3: List of markets and corresponding sets of optimal weights for the CDaR problem with (1 − β) = 0.05. The solution achieving maximal Reward/Risk ratio is boldfaced.
Risk management with drawdown functions
Efficient Frontier
Uncompounded Portfolio Rate of Return, R(x)
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
MaxDD, M(x)
Figure 1: Efficient frontier for the MaxDD problem (rate of return versus MaxDD
Efficient Frontier
Uncompounded Portfolio Rate of Return, R(x)
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0
0.005
0.01
0.015
0.02
0.025
0.03
0.035
AvDD, A(x)
Figure 2: Efficient frontier for the AvDD problem (rate of return versus AvDD
107
108
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
Reward-MaxDD
R(x) 0.90 0.80 0.70
0% CDaR 5% CDaR 40% CDaR 100% CDaR
0.60 0.50 0.40 0.30 0.20 0.04
MaxDD, M(x) 0.06
0.08
0.10
0.12
0.14
0.16
0.18
Figure 3: Reward-MaxDD graphs for optimal portfolios with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints (rate of return versus MaxDD). The frontier is efficient only for the case with (1 − β) = 0 CDaR constraints, which corresponds to the MaxDD risk measure.
R(x)
Reward-AvDD
0.90 0.80 0.70 0% CDaR 5% CDaR 40% CDaR 100% CDaR
0.60 0.50 0.40 0.30 0.20 0.007
AvDD, A(x) 0.012
0.017
0.022
0.027
0.032
Figure 4: Reward-AvDD graphs for optimal portfolios with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints (rate of return versus AvDD). The frontier is efficient only for the case with (1 − β) = 1 CDaR constraints, which corresponds to the AvDD risk measure.
Risk management with drawdown functions
109
MaxDD Ratio 8 7.5 7 6.5
0% CDaR 5% CDaR
6
40% CDaR
5.5
100% CDaR
5 4.5 4 0.04
M(x) 0.06
0.08
0.10
0.12
0.14
0.16
Figure 5: MaxDDRatio graphs for optimal portfolios with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints (MaxDDRatio versus MaxDD). The maximum MaxDDRatio is achieved in the case with (1 − β) = 0 CDaR constraints, which corresponds to the MaxDD risk measure.
AvDD Ratio 39 37 35 0% CDaR
33
5% CDaR 40% CDaR
31
100% CDaR
29 27 25 0.007
A(x) 0.012
0.017
0.022
0.027
0.032
Figure 6: AvDDRatio graphs for optimal portfolios with (1 − β) = 0, 0.05, 0.4 and 1 CDaR constraints (AvDDRatio versus AvDD). The maximum AvDDRatio is achieved in the case with (1 − β) = 1 CDaR constraints, which corresponds to the AvDD risk measure.
110
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
Reward/MaxDD Ratio
8.0 7.5 7.0 6.5 6.0 5.5 5.0 4.5 4.0 0.2
0.3
0.4
0.5
0.6
weight BP
0.7
0.8 0.20
0.35
0.50
0.65
0.80
weight US
Figure 7: Example of Reward to Risk ratio of two instruments. The risk is defined by the value of portfolio MaxDD.
Reward/AvDD Ratio
30.0 29.5 29.0 28.5 28.0
0.20 0.50
weight BP
0.80
0.2
0.3
0.4
0.5
0.6
0.7 0.8
27.5
weight US
Figure 8: Example of Reward to Risk ratio of two instruments. The risk is defined by the value of portfolio AvDD. Using MaxDD leads to nonsmooth picture while, using AvDD, which is an integrated characteristic, determines the smooth ratio. Solutions based on using CDaR or AvDD seem to be more robust than those obtained by using MaxDD.
Risk management with drawdown functions
111
References [1] Andersson, F. and S. Uryasev (1999): Credit Risk Optimization With Conditional Value-At-Risk Criterion. Research Report 99-9. ISE Dept., University of Florida, August. (Revised version submitted to the journal of Mathematical Programming. Can be downloaded: www.ise.ufl.edu/uryasev/and mp.pdf) [2] Artzner, P., Delbaen F., Eber, J.M., and D. Heath (1999): Coherent Measures of Risk. Mathematical Finance 9, 203–228. [3] Cvitanic, J. and I. Karatzas (1995): On Portfolio Optimization Under “Drawdown” Constraints, IMA Lecture Notes in Mathematics & Applications 65, 77–88. [4] Dembo, R.S. and A.J. King (1992): Tracking Models and the Optimal Regret Distribution in Asset Allocation. Applied Stochastic Models and Data Analysis 8, 151–157. [5] Jorion, Ph. (1996): Value at Risk: A New Benchmark for Measuring Derivatives Risk. Irwin Professional Pub. [6] Grinold, R.C. and R.N. Kahn (1999): McGraw-Hill, New York.
Active Portfolio Management,
[7] Grossman, S. J. and Z. Zhou (1993): Optimal Investment Strategies for Controlling Drawdowns, Mathematical Finance 3, 241–276. [8] Konno, H. and H. Yamazaki (1991): Mean Absolute Deviation Portfolio Optimization Model and Its Application to Tokyo Stock Market. Management Science 37, 519–531. [9] Markowitz, H.M. (1952): Portfolio Selection. Journal of Finance 7, 77–91. [10] Mausser, H. and D. Rosen (1999): Beyond VaR: From Measuring Risk to Managing Risk. ALGO Research Quarterly 1, 5–20. [11] Palmquist, J., Uryasev, S. and P. Krokhmal: Portfolio Optimization with Conditional Value-At-Risk Objective and Constraints. Research Report 99-14, ISE Dept., University of Florida (can be downloaded: www.ise.ufl.edu/uryasev/pal.pdf). [12] Rockafellar, R.T. (1970): Convex Analysis. Princeton Mathematics 28, Princeton Univ. Press. [13] Rockafellar, R.T. and S. Uryasev (2000): Optimization of Conditional Value-at-Risk. The Journal of Risk, accepted for publication (can be downloaded: www.ise.ufl.edu/uryasev/cvar.pdf).
112
Alexei Chekhlov, Stanislav Uryasev and Michael Zabarankin
[14] Young, M.R. (1998): A Minimax Portfolio Selection Rule with Linear Programming Solution. Management Science 44, 673–683. [15] Ziemba, W.T. and J.M. Mulvey (Eds.) (1998): Worldwide Asset and Liability Modeling. Cambridge Univ. Press.
Alexei V. Chekhlov Thor Asset Management, Inc. 551 Fifth Avenue, Suite 601, 6th Floor New York, NY 10017 a [email protected]
Stanislav Uryasev Department of Industrial and Systems Engineering P.O. Box 116595 University of Florida 303 Weil Hall, Gainesville, FL 32611-6595 [email protected]
Michael Zabarankin Department of Industrial and Systems Engineering P.O. Box 116595 University of Florida 303 Weil Hall, Gainesville, FL 32611-6595 [email protected]
Local volatility changes in the Black-Scholes model Hans-Peter Bermin and Arturo Kohatsu-Higa1 Abstract: In this paper we address a sensitivity problem with finantial applications. Namely the study of price variations of different contingent claims in the Black-Scholes model due to changes in volatility. This study needs an extension of the classical Vega index, i.e. the price derivative with respect to the constant volatility, which we call the local Vega index (lvi). This index measures the importance of a volatility perturbation at a certain point in time. We compute this index for different options and conclude that for the contingent claims studied in this paper, the lvi can be expressed as a weighted average of the perturbation in volatility. In the particular case where the interest rate and the volatility are constant and the perturbation is deterministic, the lvi is an average of this perturbation multiplied by the classical Vega index. We also study the well-known goal problem of maximizing the probability of a perfect hedge and conclude that the speed of convergence is in fact related to the lvi.
1. Introduction In this paper we will analyze a sensitivity problem with respect to variations in parameters. To keep things simple we will place ourselves in the well-known Black-Scholes model. In order to fix our terminology we consider a finite time interval [0, T ∗ ] and assume that the basic market consists of two assets: one locally risk free asset of price B (·) and one stock of price S (·). The interpretation of the locally risk free asset is as usual that of a bank account where money grows at the short interest rate r. The asset prices are modelled by the (stochastic) differential equations dB (t) = rB (t) dt (1.1) B (0) = 1 dS (t) = αS (t) dt + σS (t) dW (t) (1.2) S (0) = S0 where we assume that the coefficients r, α, S0 and σ are strictly positive constants. 1 Hans-Peter Bermin es analista financiero en WestLB, Global Financial Markets Department, Londres, y profesor del Departamento de Econom´ıa de la Universidad de Lund. Arturo KohatsuHiga es profesor visitante del Departamento de Econom´ıa de la Universidad Pompeu Fabra. Esta charla fue impartida por el segundo autor (Arturo Kohatsu-Higa) en la sesi´ on del Seminario Instituto MEFF-RiskLab de noviembre de 2000.
114
Hans-Peter Bermin and Arturo Kohatsu-Higa
Here we let {W (t) : 0 ≤ t ≤ T ∗ } denote a Brownian motion defined on the complete probability space (Ω, F, P ) and we let F = {Ft : 0 ≤ t ≤ T ∗ } denote the natural filtration generated by the σ-fields (W (s) : 0 ≤ s ≤ t) and completed by the P -null sets of F. It is well known that the basic market is free of arbitrage opportunities and complete, see e.g. the classical article by Harrison and Pliska (1981). Hence, for any sufficiently integrable FT -measurable contingent claim G, we can define the corresponding price process Π (·) according to: (1.3)
Π (t) = B (t) B (T )
−1
EQ [G |Ft ] ,
where Qis the unique equivalent martingale measure under which the process V (·) := · W (·) − 0 r−α ds is a Brownian motion. Consequently, prices are to be computed σ given the dynamics dS (t) = rS (t) dt + σS (t) dV (t) (1.4) S (0) = S0 of the stock. For technical convenience, we will assume from now on that the contingent claims to be studied are square integrable, i.e. belong to the space L2 (Ω, F, Q). Note also that the maturity of a contingent claim T is an arbitrary value in the interval [0, T ∗ ]. Of course, the Black-Scholes model is a simplified model of reality. This is seen, for example, by taking the prices of traded options as well as the interest rate as given and thereafter solving backwards for the implied volatility, see e.g. Dupire (1994) for further details. As a result one normally finds that the implied volatilities are not constant over time, which contradicts the specification of the stock price in (1.4). However, taking into account the simplicity of the Black-Scholes model it is an outstanding benchmark. The only parameter to be estimated is the volatility σ, and thereafter closed form solutions for most contingent claims can quite easily be obtained. Hence, the derived contingent claim prices will depend upon the estimated volatility, and therefore it is of course natural to ask what is the effect in pricing and hedging of a possible mis-specification of the volatility. One possible solution to this problem is to assume that volatility depends on time and the value of the underlying. Valuation is then obtained via simulation or other approximation procedures. Due to this, pricing and hedging becomes difficult and therefore one wonders if this change necessarily implies a completely different price and hedging. Here again, between other related problems, we need to know how sensible is the pricing and hedging procedure to a mis-calculation of the parameters. In particular, one wonders what is the acceptable amount of error in the volatility estimation that introduces a reasonable amount of price and hedging error? To answer this and other related questions we intend to measure this importance in time. That is, we will be able to say at which times a volatility mis-specification becomes crucial. In fact, our objective is not only qualitative but also quantitative
Local volatility changes in the Black-Scholes model
115
in the sense that we will measure this importance through some weight functions. Related problems have already been considered by Hobson (1997) and El Karoui et al (1998) between others. Hobson (1997) obtains that if volatility is uniformly dominated then prices for lookback options will be higher. In El Karoui et al (1998) it is proven that, in the case of deterministic volatilities, if one dominates another in the L2 [0, T ]-sense then prices of European type option will also be higher. Although these studies seek full generality in the volatility structure they only obtain some qualitative results about the hedging and pricing behavior. So one can consider the problem of defining a distance concept for volatility functions so that distance properties of prices are implied. In the case of the Black Scholes model where σ is constant, the natural way to carry out this sensitivity analysis is to study the derivative of the contingent claim prices with respect to the volatility, i.e. ∂Π ∂σ (·) which in the financial field is known as the Vega index of the contingent claim. However, as we will show in this paper, there are other ways as well. To see this, consider a generalized version of the Black-Scholes model where the dynamics of the stock price are given by: dS (t) = r (t) S (t) dt + σ (t) S (t) dV (t) , where r (·) and σ (·) are strictly positive deterministic functions. The advantage of the generalized version is of course that we now can, more or less, calibrate the volatility structure to the implied volatilities obtained from the prices of the traded options at the market. However, this procedure gives us a set of implied volatilities for different points in time, to which we have to fit the continuous volatility function σ (·), hence again we are interested in some kind of sensitivity analysis. This time, though, things are not as obvious as before since we now would like to take the derivative of the contingent claim prices with respect to the deterministic volatility function. Clearly, this cannot be done so instead we have to consider directional derivatives. However, depending upon in which direction we perturb the volatility function, we get different directional derivatives. The quantity we will propose to study is therefore the lvi: ∂Πε ∂ −1 (1.5) (0) B (T ) EQ [Gε ] = , ∂ε ∂ε ε=0 ε=0 · where now B (·) := exp 0 r (s) ds . The contingent claim Gε has the same form as G except that the underlying security S (·) is replaced by the perturbed stock price S ε (·) defined as the solution to the stochastic differential equation: dS ε (t) = r (t) S ε (t) dt + σε (t, S ε (t)) dV (t) (1.6) S ε (0) = S0 where σε (·, ·) is a deterministic function such that the above equation has a unique a.s. strictly positive pathwise solution. For technical convenience we assume throughout this paper that ε belongs to the compact set [0, ε¯] for some ε¯ ≥ 1, and that the following assumption, which we will refer to as assumption A1, is satisfied.
116
Hans-Peter Bermin and Arturo Kohatsu-Higa
Assumption (A1) The volatility σε (t, x) := σ (t) x + εˆ σ (t, x) ≥ 0, satisfies 0 < σmin ≤ σ (t) ≤ σmax for all t ∈ [0, T ∗ ], x ≥ 0, and ε ≤ ε¯. Furthermore, we assume that σ ˆ (t, x) is infinitely differentiable in x with bounded partial derivatives of any order uniformly in t, and that σε (t, x) is bounded away from zero in a neighborhood of (0, S0 ) uniformly in ε. The short interest rate r (·) and σ ˆ (·, 0) are bounded functions. Note that under A1 we allow the perturbations to be random. From a practical point of view this may be interesting since it gives us the possibility to study different scenarios. For instance we could model an increase in anticipated volatility when the stock price reaches some threshold level. Hence, assumption A1 is indeed very general, and in fact for some of the contingent claims that are to be studied it is too general. We will therefore, sometimes, use the following stronger conditions, which we will refer to as assumption A2. Assumption (A2) r (t) = r, σ (t) = σ, and σ ˆ (t, x) = σ ˆ (t) x for all t ∈ [0, T ∗ ] and x≥0. The deterministic measurable function σ + εˆ σ (·) is strictly positive and bounded for all ε. More precisely we assume without loss of generality that 0 < σmin ≤ σ + εˆ σ (t) ≤ σmax for all ε, t. Note that under A2, we can intuitively relate the lvi to the usual Vega index since the stock price S (·) according to (1.4) then corresponds to the case where ε = 0 in (1.6). For the sake of simplicity we will in this case denote S 0 (·) simply by S (·) , and we will use this kind of convention for any quantity that depends on ε when we ∂ are working under A2. However, note that whenever we consider the operator ∂σ , as in the case of the Vega index, we assume that the stock price is defined by (1.4). The main question of interest is to study the new concept of lvi in (1.5) and compute it when possible. Under A2, we will also study its relationship with the Vega index ∂Π ˆ (·) for the two ∂σ (·) and give conditions on the deterministic function σ sensitivity indices to coincide. The relationship between the lvi and the classical Vega index comes out easily if it is possible to obtain a closed form solution for the expectation EQ [Gε ], however as we will see in the next sections even if this is not the case we will still be able to compute the lvi and relate it to the classical Vega index. Nevertheless, in order to start our analysis let us consider an example where this relationship can be deduced easily. Example 1.1 Let us consider a standard call option with payoff Gε = max (S ε (T ) − K, 0) for some constant strike price K and assume A2. It is easily verified that at time 0 the price is given by √ Πε (0) = S0 N (dε1 ) − e−rT KN dε1 − Σε ,
Local volatility changes in the Black-Scholes model
117
where N (·) denotes the cumulative distribution function of a standard normal random variable, and dε1 is defined by T ln (S0 /K) + rT + 12 Σε ε ε √ d1 = ; Σ = (σ + εˆ σ (t))2 dt. Σε 0 Straightforward calculations then gives, denoting ϕ (·) = dN dx (·), that T 1 √ ∂Π0 ∂Πε (0) (0) = S0 ϕ d01 = S0 ϕ d01 √ σ ˆ (t) dt ; T. ∂ε ∂σ T 0 ε=0 Finally, since
∂Π ∂σ
(0) =
∂Π0 ∂σ
(0), we get the relationship 1 T ∂Π ∂Πε (0) (0) . = σ ˆ (t) dt ∂ε T ∂σ 0 ε=0
In fact there are a lot of interesting results that are worth to be pointed out in this very simple example. First we see that the two approaches are identical if T σ ˆ (t) dt = T for any maturity T ∈ [0, T ∗ ]. This implies that for standard call 0 options the usual derivative ∂Π ∂σ (·) is identical to the directional derivative with an uniform perturbation, i.e. σ ˆ (·) = 1 a.e.-λ where from now on λ denotes the Lebesgue measure. In fact, as we will show in the forthcoming sections, this is a result that is true in general and not only for standard options. Moreover, for standard options T there is no local price change as long as 0 σ ˆ (t) dt = T . Hence, if the volatility depends on time and it drops and then rises symmetrically around the uniform level σ ˆ (·) = 1, the result will be identical to the converse scenario with a initial rise followed by a drop. This result, however, is not true for other type of options as we will see later on. If on the other hand, one believes that there can be significant changes in the volatility, the two approaches can give quite different results. For all the options studied in this paper, the lvi can be expressed as a weighted average of the perturbation in volatility. Under A2, the lvi is an average of this perturbation multiplied by the classical Vega index. In particular, we will conclude that for path-dependent options the change in option prices due to a perturbation of volatility decreases in importance as maturity is approached. This is natural since the payoff of a path-dependent option depends on the whole path of the stock price. Hence, a change of volatility at the beginning of the time to maturity will affect almost the whole path of the underlying security, while a change of volatility at the end of the time to maturity will only affect a small part of the path of the stock price. In the case of the lookback option, we find in comparison to example 1.1, that not only the average of change in volatility is an important quantity that determines the change in price, but also the modulus of continuity plays a role in the sensitivity analysis. Since, the lvi measures the local change in prices due to a perturbation in volatility in a quantitative manner, we can properly call it an index. Hence, a big value for this index corresponds to a big change in the option price. This contrasts with the qualitative results of El Karoui et al. (1998) and Hobson (1997).
118
Hans-Peter Bermin and Arturo Kohatsu-Higa
Obviously their results are implied by ours. For example, under assumption A2 we show that for all the options studied in this paper a local volatility change of the form σ ˆ (·) ≥ 0 implies the inequality Π (0) ≤ Πε (0). Consequently, if we use the lower bound Π (0) as the initial amount of money we can never obtain a perfect hedge of the contingent claim Gε . Therefore we study the problem of maximizing a perfect hedge and we show that the lvi is an important quantity in determining the speed at which the probability of maximizing a perfect hedge goes to one as ε goes to zero. To carry out the analysis in a general way we cannot assume that there exists a closed form solution to the expected value EQ [Gε ]. In fact, this will in Asian or lookback options there is no hope of getting such close formulae. Instead we will use the natural approach of derivation on Wiener space. For this reason we introduce Malliavin calculus and in particular the integration by parts formula in the next section. Malliavin calculus is a natural tool to use in finance as it gives information about hedging portfolios through the concept of stochastic derivation, see e.g. Bermin (1998a), (1998b), and Karatzas and Ocone (1991). Recently, Malliavin calculus has been applied in other areas through the integration by parts formula which allows for analysis of non-smooth functions of smooth random variables, see e.g. Fourni´e et al. (1997) who used this technique to study the numerical simulation of greeks. In the same spirit, one can also say that the lvi is deeply tied with the concept of greeks. The greeks and in particular the vega index is used in risk assessment, optimal contract design and to imply out parameters. This lvi could be used also for the same purposes. In order to give an example of the importance of such an index we study the goal problem and prove that the lvi plays an important role within this framework. The goal problem is a type of quantile hedging problem. Recently, F¨ ollmer and Leukert (1999), Karatzas (1996), Spivak and Cvitani´c (1999) between others have shown an alternative way to deal with the superreplication problem by using partial hedging. In the goal problem we suppose that the assumed model has underestimated volatility and therefore we find ourselves in a situation where perfect replication is impossible. Instead, assuming some borrowing without interest condition, we prove that the replication can be achieved with some probability that will depend of the ’amount’ of mis-specification. We prove exactly how this dependence works and furthermore that this perfect replication probability depends explicitly on the lvi showing therefore its importance. It is a natural extension to obtain the formulas that we give in the next sections in a general framework of underlying stock prices modelled by multi-dimensional diffusion processes (which may be the most common type of stochastic volatility model) that satisfy some type of H¨ ormander condition. As our interest, though, is to show that the concept of sensitivity can be developed and computed explicitly we have decided to leave the extensions to another publication. The techniques to study higher order derivatives of option prices are of course similar to the ones we use throughout the paper.
Local volatility changes in the Black-Scholes model
119
2. Stochastic flows and Malliavin calculus In this section we give a brief account of some elementary properties of stochastic flows and Malliavin calculus. For far more general results see for instance Protter (1990) and Nualart (1995) respectively. It is well-known that the unperturbed stock price S 0 (·), according to (1.6), admits the unique solution
t
t 1 2 σ (s) dV (s) ; 0 ≤ t ≤ T ∗. S 0 (t) = S0 exp r (s) − σ (s) ds + 2 0 0 Since S 0 (·) coincide with S (·) under A2, we see that
1 S (t) = S0 exp r − σ 2 t + σV (t) 2 and consequently we can without ambiguity, for all t ∈ [0, T ∗ ], define the derivative processes: dS dS (t) = − [σt − V (t)] S (t) . (t) = S (t) /S0 and dS0 dσ
(2.1)
In order to define the derivative process of S ε (·) with respect to ε, we first use Thm. 39 in Protter (1990) to ensure that we can always choose versions of {S ε (t) : 0 ≤ t ≤ T ∗ } which are continuously differentiable with respect to ε for each (t, ω) ∈ [0, T ∗ ] × Ω. Now, we let the stochastic process {Z ε (t) : 0 ≤ t ≤ T ∗ } denote the derivative process of S ε (·) with respect to ε, defined as the solution to the stochastic differential equation: dZ ε (t) = r(t)Z ε (t) dt + σε (t, S ε (t))Z ε (t) dV (t) + σ ˆ (t, S ε (t))dV (t) (2.2) Z ε (0) = 0 where σε denotes the derivative with respect to the space variable. Lemma 2.1 The solution to the stochastic differential equation (2.2) is given by
t t −1 ε ε −1 ε ε ε ε Z (t) = − E (s) σε (s, S (t)) σ ˆ (s, S (s)) ds − E (s) σ ˆ (s, S (s)) dV (s) E ε(t) , 0
0
t t 2 where E ε (t) := exp 0 r (s) − 12 σε (s, S ε (s)) ds + 0 σε (s, S ε (s)) dV (s) . Note especially that
t
t 1 2 0 σ (s) dV (s) =: S 0 (t) /S0 , E (t) = exp r (s) − σ (s) ds + 2 0 0
and in particular under A2:
t t Z (t) = − σ σ ˆ (s) ds − σ ˆ (s) dV (s) S (t) . 0
0
0
120
Hans-Peter Bermin and Arturo Kohatsu-Higa
Proof. The solution is constructed using theorem 52 in Protter (1990). However, the proof can also be obtained directly from the Itˆ o formula. For the sake of simplicity we will from now on use the short hand notation Z (·) for Z 0 (·) just as for the stock prices whenever we are working under A2. Remark 2.2 If we assume A2, and combine lemma we see that a.s. 2.1 with (2.1), t t dS ∗ for all t ∈ [0, T ], dσ (t) = Z (t) if and only if σ 0 σ ˆ (s) ds − 0 σ ˆ (s) dV (s) = (σt − V (t)). This will of course be the case only when σ ˆ (·) = 1. Hence, we recover the same result as we found in the introductory example for standard options. Now we introduce some concepts of Malliavin calculus. Note that under assumption A2, we can work directly on the probability space (Ω, F, Q) since the Brownian motions W (·) and V (·) generate the same filtration, see Karatzas and Ocone (1991) for details. In general though, this does not have to be the case. However, since we will only deal with quantities in expectation and all our stochastic differential equations have strong solutions, we can always change the underlying probability space accordingly so that the concepts of Malliavin Calculus can be applied there. We will do this if necessary without further mentioning. We will use the following version of the integration by parts formula. Let the stochastic variable F ∈ D1,2 (for definitions, see Nualart (1995)) and the stochastic process u (·) ∈ dom(δ), then
∗ T
EQ [F δ (u)] = EQ
0
u (t) Dt F dt .
Here δ(u) denotes the Skorohod integral of u (·) which coincides with the usual stochastic integral if u (·) is adapted, while D· denotes the Malliavin derivative defined in T∗ D1,2 . We remind the reader that Dt 0 g (s) dV (s) = g (t) for deterministic functions g (·) ∈ L2 ([0, T ∗ ]), and that the usual chain rule applies in the sense that if ψ (·) is a deterministic Lipschitz function and F is a stochastic variable in D1,2 , having a density, then ψ (F ) ∈ D1,2 with Dt ψ (F ) =
dψ (F ) Dt F. dx
In particular, it follows that Dt S 0 (s) = σ(t)S 0 (s) 1t≤s . Finally, we state the ClarkOcone formula which says that any FT -measurable stochastic variable F ∈ D1,2 has the representation T EQ [ Dt F | Ft ] dV (t) . F = EQ [F ] + 0
For more details about these concepts we refer the reader to Nualart (1995).
Local volatility changes in the Black-Scholes model
121
Remark 2.3 The space D1,2 is a dense subspace of L2 (Ω, F, Q) and therefore it is possible to extend the Clark-Ocone formula. However, for F ∈ L2 (Ω, F, Q) we ¨ unel (1997) generally have to consider D· F in the distributional sense, see e.g. Ust¨ and Bermin (1998a).
3. The effect of local volatility changes In this section we will start by proving a slight generalization of Example 1 using the tools of derivation of stochastic flows and the integration by parts formula of Malliavin calculus. In particular, we analyze the change in prices due to volatility changes for different kinds of European contingent claims. The purpose is to compute ε the lvi ∂Π (0) , and under assumption A2 derive relations between the lvi and ∂ε ε=0 the classical Vega index ∂Π ∂σ (0). It will be shown that these relations depend very much on the specific contingent claims that are treated. In order to define different kinds of contingent claims we need the concept of a payoff function and here we follow El Karoui et al. (1998). Definition 3.1 A payoff function is a convex function Φ (·), defined on (0, ∞) and having bounded one-sided derivatives, that is |Φ (x±)| ≤ C
∀x,
;
for a positive constant C. We recall that for any convex function Φ (·) : R+ → R, there is a countable set N ⊂ R+ such that Φ (·) is differentiable on R+ \N and Φ (x) = D+ Φ (x) = D− Φ (x)
;
x ∈ R+ \N,
where D+ Φ (·) (respectively, D− Φ (·)) denotes the derivative of Φ (·) taken from the right (respectively, from the left). The second derivative Φ (·) may not exist so we define the second derivative measure ς (·) on (R, B (R)) by ς ([a, b)) = D− Φ (b) − D− Φ (a)
;
−∞ < a < b < +∞,
such that ς (·) is positive and ς (dx) = Φ (x) dx if Φ (·) exists. In general, ς (dx) = Φ (x) dx + m (dx), where m (·) is a positive measure which is singular with respect to the Lebesgue measure and Φ (·) is defined as the second derivative of Φ (·) whenever it exists and zero otherwise. From now on, though, we will use the second derivative in a formal sense and interpret expressions like EQ [Φ (F )] with F being a random variable having a smooth density fF (·), by ∞ ∞ ∞ fF (x) ς (dx) = fF (x) Φ (x) dx + fF (x) m (dx) . (3.1) EQ [Φ (F )] := 0
0
0
In general, our assertions do hold under greater generality for the behavior at infinity of Φ (·) but we will only remark this on each respective section.
122
Hans-Peter Bermin and Arturo Kohatsu-Higa
3.1. Simple options We say that a simple option is a contingent claim Gε in the form Φ (S ε (T )). Recall that in example 1.1, Φ(x) = max(x − K, 0). If we for the moment assume that the payoff function is sufficiently smooth then formal calculations yield: ∂Πε −1 (0) = B (T ) EQ Φ S 0 (T ) Z 0 (T ) , (3.2) ∂ε ε=0 and
∂Π ∂S −1 (0) = B (T ) EQ Φ (S (T )) (T ) . ∂σ ∂σ
ε ∂Π Hence, due to remark 2.2 we see that we only have to consider ∂Π ∂ε (0) ε=0 , since ∂σ (0) can be obtained under A2 by just setting σ ˆ (·) = 1 in (3.2), i.e. under assumption A2, ∂Π ∂Πε (0) = (0) . An extension of this argument now gives us the following ∂σ ∂ε ε=0,ˆ σ (·)=1 proposition, which generalizes the results in example 1.1. Proposition 3.2 For simple options, i.e. contingent claims with payoff Φ (S ε (T )), we have that T ∂Πε (3.3) σ (s, S 0 (s)) σ(s)ds, (0) = EQ µ(s, T )ˆ ∂ε 0 ε=0 where µ (·, T ) is a positive adapted process that is independent of σ ˆ (·, ·) but depends on S 0 (·), Φ (·) and its derivatives. If we further assume A2 we have ∂Πε 1 T ∂Π (0) (0) . = σ ˆ (t) dt ∂ε T ∂σ 0 ε=0 Note that although the two formulas in the above proposition look somewhat different, it is straightforward to rewrite the second formula in the general form of (3.3). Remark 3.3 Note that the above formulas are satisfied in greater generality. Under the assumption A2 we only need Φ (·) to be an element of L2 (ϑ) where ϑ is a measure dominating all the measures ν ε . In particular, if Φ (·) is any measurable function with polynomial growth at infinity, the relationship between the local Vega and the classical Vega index is maintained. In the general case one could also allow some flexibility for Φ (·) but then the expression of the lvi starts to depend on the derivatives of σ ˆ (t, ·). By considering the general case (i.e. under assumption A1), we obtain a little bit more information about the local volatility σ ˆ (·, ·) than what could be extracted from example 1.1. For instance, the expression (3.3) also says that there is a trade-off between the size of volatility and the permissible amount of mis-specification of it. In other words, the lvi will be the same in the following two cases: First, one allows σ ˆ (·, ·) to be big when σ (·) is small and second one allows σ (·) to be big when σ ˆ (·, ·) is small.
Local volatility changes in the Black-Scholes model
123
A natural question to pose is whether the lvi is positive or negative. For the sake of simplicity let us assume A2. It is easily verified that for simple options ∂Π ∂σ (0) ≥ T ∂Πε 0, and therefore a sufficient condition for ∂ε (0)ε=0 ≥ 0 is that 0 σ ˆ (t) dt ≥ 0. Furthermore, this also implies that Πε (0) ≥ Π (0) for ε small enough. Now, let us try to answer the following question: when is the price Π1 (0) greater or equal than Π (0). Note that the first price corresponds to the case where the stock has the volatility σ + σ ˆ (·), while the second price corresponds to the case where the stock has the volatility σ. As shown in El Karoui , Jeanblanc-Picqu´ e and Shreve T 2 1 (1998) a condition for Π (0) ≥ Π (0) is that 0 ≤ 0 2σˆ σ (t) + σ ˆ (t) dt. We will hint that this is only a sufficient condition and present an equivalent condition by using the concept of the lvi. From the proof of proposition 3.2 it follows that T ∂Πε (0) = C ε [σ + εˆ σ (t)] σ ˆ (t) dt. ∂ε 0 −1 2 where the positive constant C ε = B (T ) EQ Φ (S ε (T )) S ε (T ) . Moreover, it follows from the Fubini theorem that 1 ∂Πε 1 (0) dε Π (0) − Π (0) = ∂ε 0 1 T ε = C dε σ + C¯ σ ˆ (t) σ ˆ (t) dt , 0
where C¯ = 1
1 0
εC ε dε/
1 0
0
C ε dε. Hence, we see that 0 ≤
T 0
2
σˆ σ (t) + C¯ σ ˆ (t)
dt if and
only if Π (0) ≥ Π (0).
3.2. Asian options We say that an Asian option is a contingent claim Gε in the form Φ
T 0
w(t)S ε (t) dt .
Here w (·) is a bounded measurable positive weight function, such that w (t) ≥ w0 > 0 in a neighborhood around zero. Note that in this case there exists no closed solution for the price of the option and therefore there is no straightforward way of computing the sensibility indices we are interested in. As usual one starts by assuming that Φ (·) is infinitely continuously differentiable with bounded derivatives. The general case will then be obtained by density arguments. Moreover, it is straightforward to show that we can interchange the order of integration and derivation since all the operators involved are linear. Hence, formal calculations yield:
T T ∂Πε −1 0 0 (0) = B (T ) EQ Φ w(t)S (t) dt w(t)Z (v) dv , ∂ε 0 0 ε=0
124
Hans-Peter Bermin and Arturo Kohatsu-Higa
T T ∂S ∂Π −1 (0) = B (T ) EQ Φ (v) dv . w(t)S (t) dt w(v) ∂σ ∂σ 0 0 ε ∂Π Again we see that we only have to consider ∂Π be obtained ∂ε (0) ε=0 , since ∂σ (0) can ∂Π ∂Πε by assuming A2 and setting σ ˆ (·) = 1, i.e. under A2, ∂σ (0) = ∂ε (0) ε=0,ˆσ(·)=1 .
and
Proposition 3.4 For Asian options, i.e. contingent claims with payoff T
Φ
w(t)S ε (t) dt
,
0
we have that
T ∂Πε (0) = E µ(s, T )ˆ σ (s, S 0 (s)) σ(s)ds, ∂ε 0 ε=0
where µ (·, T ) is a positive adapted process that is independent of σ ˆ (·, ·) but depends on Φ (·) and its derivatives, and on the past of S 0 (·) and w (·). Furthermore, if one assumes A2 then T ∂Π ∂Πε (0) (0) , = µ ¯(s, T )ˆ σ (s) ds ∂ε ∂σ 0 ε=0 where
2 T EQ Φ w(t)S (t) dt w(t)S (t) dt s µ ¯(s, T ) = 2 , T T T EQ Φ w(t)S (t) dt w(t)S (t) dt ds 0 0 s
T 0
and µ ¯(·, T ) is a decreasing density function such that lim µ ¯(s, T ) = 0. It can be shown that the pair
2
s→T
T 0
w(t)S (t) dt,
T s
w(t)S (t) dt has a smooth joint
density for all s ∈ [0, T ] and therefore the function µ ¯ (·, T ) is well defined in the formal sense of (3.1). We observe that the relations are qualitatively similar to the ones in Proposition 3.2. Therefore similar remarks as those already made are valid for this case too. In particular, under assumption A2 one finds similar to the previous section that there exists a positive decreasing density function µ ¯ε (·, T ) such that 1 T Π1 (0) − Π (0) = µ ¯ε (s, T ) σ ˆ (s) (σ + εˆ σ (s)) dsdε 0
=
0
T
2 C (s) σ ˆ (s) σ + C¯ (s) σ ˆ (s) ds
0 2 From
here and on we interpret
0 0
= 0 for all the density functions.
Local volatility changes in the Black-Scholes model
125
for certain positive deterministic functions C (·) and C¯ (·). In particular, one has that Π1 (0) ≥ Π (0) if σ ˆ (·) ≥ 0. By choosing particular forms for the weight function w (·) one can study properties of other Asian type options such as, for instance, discretely monitored Asian options. Needless to say, this is of great importance since every Asian option traded at the market is discretely monitored, i.e. in the form n T ε ε Φ h(ti )S (ti ) rather than Φ w(t)S (t) dt . 0
i=1
Note also that the weight function is easily extended to the multi dimensional case.
3.3. Lookback options So far, we have treated somewhat smooth functionals of the underlying asset. Now we concentrate on the case of an option based on an irregular functional such as the maximum process. As a consequence, the techniques used in the previous section cannot be used in this case due to the lack of smoothness of the maximum process. In order to solve this problem without having to worry about other technicalities, we assume A2 throughout this section. We say that is a contingent claim whose payoff function Gε is a lookback option in the form Φ sup0≤t≤T S ε (t) for some payoff function. (Actually it is also possible to define lookback options as contingent claims in the form Gε = Φ (inf 0≤t≤T S ε (t)), however, for notational simplicity we do not consider this case). This time, though, it is not obvious that we can interchange the order of integration and derivation since the running maximum process is highly path-dependent and non-smooth. Moreover, the problem is that we do not have a closed form expression for the derivative with respect to ε of the running maximum process and therefore we cannot simply use formal calculations to obtain a relationship between the two approaches. However, one still has the inequality Πε (0) ≥ Π (0) if the local volatility change is of the form σ ˆ (·) ≥ 0 and Φ (0) ≥ 0. Proposition 3.5 For lookback options, i.e. contingent claims with payoff
Φ sup S ε (t) , 0≤t≤T
we have that ∂Π ∂Πε (0) = (0) ∂σ ∂ε ε=0,ˆ σ (·)=1 T ε ∂Π ∂Π (0) (0) = µ ¯(s, T )ˆ σ (s) ds ∂ε ∂σ 0 ε=0
126
Hans-Peter Bermin and Arturo Kohatsu-Higa
where the density function µ ¯(·, T ) is given by EQ Φ sup0≤t≤T S (t) sup0≤t≤T S (t) 2r σ 1s≤τ + X µ ¯(s, T ) = T . EQ Φ sup0≤t≤T S (t) sup0≤t≤T S (t) 2r ds σ 1s≤τ + X 0 The random time τ is implicitly defined by the relation sup0≤t≤T S (t) = S (τ ) and X is an appropriate random variable that belongs to Lp (Ω, F, Q) for any p. Furthermore, if Φ (·) is monotone then µ(·, T ) is decreasing and if Φ (0) ≥ 0 then lims→T µ (s, T ) ≥ 0. To get a little bit more intuition of the way the lvi and the classical Vega index are related one to the other, we present the following result. Corollary 3.6 For lookback options we have the alternative characterization 2r 1 T ∂Π ∂Πε (0) (0) + 2 e−rT · = σ ˆ (t) dt ∂ε T ∂σ σ 0 ε=0
T 1 T σ ˆ (t) dV (t) − V (T ) σ ˆ (t) dt . ·EQ Φ sup S (s) T 0 0≤s≤T 0 In particular if σ ˆ (·) is differentiable, then 2r ∂Πε 1 T ∂Π (0) (0) + 2 e−rT · = σ ˆ (t) dt ∂ε T 0 ∂σ σ ε=0
T 1 T dˆ σ (t) V (t) dt . V (T ) (ˆ σ (T ) − σ ˆ (t))dt − · EQ Φ sup S (s) T 0 dt 0≤s≤T 0 Note that this last statement shows the dependence on the way the anticipated volatility structure changes. For example if one considers σ ˆ (t) = σ1 1(t ≤ T /2) + σ2 1(t > T /2), one finds that the lvi is given by re−rT σ2 + σ1 ∂Π ∂Πε (0) (0) + = (σ1 − σ2 ) · ∂ε 2 ∂σ σ2 ε=0
T V (T ) · EQ Φ sup S (s) V − . 2 2 0≤s≤T The second term could be positive or negative according to what the values for σ1 and σ2 are, which shows a clear dependence of the behavior of the price on the modulus of continuity of the local change in volatility. Note that if we assume Φ (0) ≥ 0 then Πε (0) ≥ Π (0) and consequently the lvi and the Vega index are both positive. Moreover, by using the integration by parts formula and setting MTS := sup0≤s≤T S (s) for notational simplicity, we find that the above expectation equals: S S σ T /2 σ T EQ Φ MT MT 1t≤τ dt − EQ Φ MTS MTS 1t≤τ dt. 2 0 2 T /2
Local volatility changes in the Black-Scholes model
127
This expression is positive since the integrand
EQ Φ sup S (s) sup S (s) 1t≤τ 0≤s≤T
0≤s≤T
is a decreasing function as shown in the final part of the proof of proposition 3.5. Hence, this little toy example clearly shows that a perturbation of volatility close to maturity (take for instance σ1 = 0 and σ2 = 1) has less effect on the option prices than a similar perturbation at the beginning of the time interval (σ1 = 1 and σ2 = 0).
4. Extensions on hedging of European options There are two aspects regarding the hedging problems that one may think of after the local change in volatility has been studied. One is how much more money is needed to put into the hedging strategy3 in order to cover for the change in volatility. This ∂ 2 Πε question is answered by analyzing the quantity ∂ε∂S0 (0) , which of course can be ε=0 carried out using similar calculations as the ones showed in the previous sections. The other question, though, is known as the goal problem and in our study it takes the form: given that we cannot or are not willing to add more money into the hedging portfolio what is the strategy to follow so that the chances of being able to cover the option are the highest? This question has been partly answered, although in a somewhat different setting, by Kulldorff (1993). Here, we briefly extend the results to time dependent volatility. The following analysis will be carried out under assumption A2 since the general case does not seem to lead to a tractable problem. We assume that the contingent claims to be studied are square integrable.
4.1. The goal problem Let us start by giving a short resume and a little extension of the goal problem. We refer to Karatzas (1996) for details and further references. Let us recall that the discounted value process of a self financing portfolio is given by the expression · −1 −1 (4.1) B (·) X x0 ,ξ (·) = x0 + B (s) ξ (s) σ (s) dV (s) ; σ (t) := σ + εˆ σ (t) . 0
Here x0 is the initial wealth in our portfolio, i.e. x0 = X x0 ,ξ (0), and the strategy ξ (·) represents the amount of money that is invested in the stock at each point in time. Of course we require ξ (·) to be an F-adapted process. According to the extended Clark-Ocone formula any FT -measurable contingent claim Gε ∈ L2 (Ω, F, Q) can be expressed as T −1 −1 −1 B (s) π ¯ (s) σ (s) dV (s) , B (T ) Gε = B (T ) EQ [Gε ] + 0
³Note that, when hedging a contingent claim, the number of units to be held in the underlying asset at each point in time is given by the derivative of the option price with respect to the stock price.
where $\bar\pi(t) = \sigma(t)^{-1} B(t) B(T)^{-1} E_Q[\,D_t G^{\varepsilon}\,|\,\mathcal{F}_t\,]$ a.s. for all $t \in [0,T]$. Hence, starting with the initial wealth $u_0 := \Pi^{\varepsilon}(0) = B(T)^{-1}E_Q[G^{\varepsilon}]$ and using the strategy $\bar\pi(\cdot)$, we will at maturity obtain a perfect hedge, i.e. $X^{u_0,\bar\pi}(T) = G^{\varepsilon}$ almost surely. Moreover, in this case $B(\cdot)^{-1}X^{u_0,\bar\pi}(\cdot)$ is a $Q$-martingale and consequently the fair price of $G^{\varepsilon}$, i.e. the price consistent with no arbitrage opportunities, is given by $X^{u_0,\bar\pi}(t) := B(t)B(T)^{-1}E_Q[\,G^{\varepsilon}\,|\,\mathcal{F}_t\,]$. Now, suppose that our initial wealth $x_0$ is less than the money required to obtain a perfect hedge, i.e. we assume $0 < x_0 \le u_0$. Then, as we can no longer obtain a perfect hedge, we will instead try to maximize the probability of a perfect hedge:

$$p(\varepsilon) := \sup_{\substack{\xi(\cdot)\ \text{tame} \\ X^{x_0+u_0,\xi}(T)\,\ge\, G^{\varepsilon}\ \text{a.s.}}} P\!\left[X^{x_0,\xi}(T) \ge G^{\varepsilon}\right].$$
Note that in the case of a perfect hedge $X^{u_0,\bar\pi}(T) = G^{\varepsilon} \in L^2(\Omega,\mathcal{F},Q)$, which implied that $B(\cdot)^{-1}X^{u_0,\bar\pi}(\cdot)$ was a $Q$-martingale. However, in this situation we do not have a terminal value for $X^{x_0,\xi}(\cdot)$, and therefore we have to impose the condition that $\xi(\cdot)$ is a tame strategy, meaning that the process $\int_0^{\cdot} B(s)^{-1}\xi(s)\sigma(s)\,dV(s)$ is a.s. uniformly bounded from below by some real constant. By using Fatou's lemma we see that the discounted value process $B(\cdot)^{-1}X^{x_0,\xi}(\cdot)$ is a $Q$-supermartingale whenever the portfolio $\xi(\cdot)$ is tame, and in this case we also have that

$$P\!\left[X^{x_0,\xi}(T) \ge G^{\varepsilon}\right] = P\!\left[B(T)^{-1}X^{x,\pi}(T) \ge 1\right],$$

where $x = x_0/u_0 \in [0,1]$ and $\pi(\cdot) = [\xi(\cdot) - \bar\pi(\cdot)]/u_0$, so that $\pi(\cdot)$ is a tame strategy as well. Moreover, it follows that the inequalities $X^{x_0+u_0,\xi}(T) \ge G^{\varepsilon}$ a.s. and $X^{x,\pi}(T) \ge 0$ a.s. are equivalent. Therefore the problem stated above is identical to solving

$$G_0(x) := \sup_{\substack{\pi(\cdot)\ \text{tame} \\ X^{x,\pi}(T)\,\ge\, 0\ \text{a.s.}}} P\!\left[B(T)^{-1}X^{x,\pi}(T) \ge 1\right]. \tag{4.2}$$
Proposition 4.1 The maximal probability of obtaining a perfect hedge is given by

$$p(\varepsilon) = N\!\left(N^{-1}\Big(\frac{x_0}{u_0}\Big) + |\alpha - r|\left(\int_0^T [\sigma + \varepsilon\hat\sigma(t)]^{-2}\,dt\right)^{1/2}\right),$$

with $u_0 := \Pi^{\varepsilon}(0) = e^{-rT}E_Q[G^{\varepsilon}]$. Moreover, the optimal strategy is given by the expression

$$\xi^*(t) = \bar\pi(t) + u_0\,\pi^*(t),$$

where

$$\bar\pi(t) = \sigma(t)^{-1} B(t) B(T)^{-1} E_Q[\,D_t G^{\varepsilon}\,|\,\mathcal{F}_t\,],$$
and

$$\pi^*(t) = e^{rt}\,\varphi\!\left(\frac{\displaystyle\int_0^t \sigma(s)^{-1}\,dV(s) + N^{-1}\Big(\frac{x_0}{u_0}\Big)\left(\int_0^T \sigma(s)^{-2}\,ds\right)^{1/2}}{\displaystyle\left(\int_t^T \sigma(s)^{-2}\,ds\right)^{1/2}}\right)\frac{\sigma(t)^{-2}}{\displaystyle\left(\int_t^T \sigma(s)^{-2}\,ds\right)^{1/2}}$$

$$\phantom{\pi^*(t)} = e^{rt}\,\varphi\!\left(N^{-1}\big(B(t)^{-1}X^{x,\pi^*}(t)\big)\right)\frac{\sigma(t)^{-2}}{\displaystyle\left(\int_t^T \sigma(s)^{-2}\,ds\right)^{1/2}}.$$
Here we have used the notation $\sigma(t) := \sigma + \varepsilon\hat\sigma(t)$, and $\varphi(\cdot)$ denotes the derivative of $N(\cdot)$. It is interesting to see that the expressions for the optimal strategies $(\pi^*, \xi^*)$ do not depend on the stock appreciation rate $\alpha$. This, however, ceases to be the case whenever the risk premium $|\alpha - r|$ is time dependent. Note that although the strategy $\bar\pi(\cdot)$ satisfies the requirements of the original problem, it is never optimal. One undesirable feature of the optimization problem considered in this proposition is that the set of admissible strategies cannot be chosen arbitrarily, i.e. we have to consider strategies such that $\xi(\cdot)$ is tame and $X^{x_0+u_0,\xi}(T) \ge G^{\varepsilon}$ almost surely. Without this restriction one could not obtain the sequence of equivalences between the optimization problems stated at the beginning of the proof. To some extent, then, the problem is somewhat artificial, since the class of strategies for which $X^{x_0+u_0,\xi}(T) \ge G^{\varepsilon}$ a.s. is indeed very large. Nevertheless, it can be interpreted as a loan with no interest taken in order to cover the contingent claim $G^{\varepsilon}$. In this framework, we find that the lvi is an important quantity in determining how fast perfect hedging is achieved. This is the topic of the next subsection.
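As a small numerical illustration of Proposition 4.1, the following sketch (with assumed, arbitrary parameter values and an assumed perturbation shape $\hat\sigma$) evaluates the maximal success probability for a few initial wealth fractions $x_0/u_0$.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Sketch of Proposition 4.1 (illustrative parameters): maximal probability of a
# perfect hedge starting from a fraction x0/u0 of the perfect-hedging capital,
# with sigma(t) = sigma + eps * sigma_hat(t).
alpha, r, sigma, T, eps = 0.10, 0.05, 0.20, 1.0, 0.1
sigma_hat = lambda t: np.sin(np.pi * t / T)        # assumed perturbation shape

def p_success(x0_over_u0):
    integral, _ = quad(lambda t: (sigma + eps * sigma_hat(t)) ** -2, 0.0, T)
    return norm.cdf(norm.ppf(x0_over_u0) + abs(alpha - r) * np.sqrt(integral))

for frac in (0.5, 0.8, 0.95):
    print(f"x0/u0 = {frac:.2f}  ->  p(eps) = {p_success(frac):.4f}")
```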
4.2. The speed of convergence

We consider the situation where the time 0 price of a contingent claim $G$ in the classical Black-Scholes model is given by the quantity $x_0 := \Pi(0) = e^{-rT}E_Q[G]$, whereas in the perturbed model the corresponding contingent claim is worth the amount $u_0 := \Pi^{\varepsilon}(0) = e^{-rT}E_Q[G^{\varepsilon}]$. Now, let us suppose that we only have $x_0$ money units available for hedging, where it is assumed that $x_0 \le u_0$. Furthermore, suppose that we do not know how to explicitly calculate the quantity $u_0$, although we have managed to show the inequality $x_0 \le u_0$ using either our results of the previous sections or other types of arguments. In both situations, though, we can no longer obtain a perfect hedge and therefore we will instead try to maximize the probability of a perfect hedge, as in Proposition 4.1:
$$p(\varepsilon) = N\!\left(N^{-1}\Big(\frac{\Pi(0)}{\Pi^{\varepsilon}(0)}\Big) + |\alpha - r|\left(\int_0^T [\sigma + \varepsilon\hat\sigma(t)]^{-2}\,dt\right)^{1/2}\right). \tag{4.3}$$

Now, letting $\varepsilon \to 0$, we see that $p(\varepsilon) \to 1$ according to the assumption $\sigma + \varepsilon\hat\sigma(t) \ge \sigma_{\min}$ for all $\varepsilon$ and $t \in [0,T^*]$. Hence, in the limit we will obtain a perfect hedge. So, the
question is: how fast does $p(\varepsilon)$ converge to one? As we will see, the quantity that determines the speed is the ratio $\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\big|_{\varepsilon=0}/\Pi(0)$. This gives another motivation for the importance of studying the lvi. Throughout this subsection we will make the following assumption:

Assumption The perturbed price $\Pi^{\varepsilon}(0)$ has a Taylor expansion of order 2 around $\varepsilon = 0$, in the sense that

$$\Pi^{\varepsilon}(0) = \Pi(0) + \varepsilon\,\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\bigg|_{\varepsilon=0} + G(\varepsilon)\,\varepsilon^2,$$

where $G(\cdot)$ is differentiable around 0 and $|G(\varepsilon)| \le C_1$ for $\varepsilon \le \bar\varepsilon$. By using the same techniques as in the previous sections, where we have computed the local Vega indices for different options, it can be proved that these prices satisfy the above assumption.

Proposition 4.2 The maximal probability of obtaining a perfect hedge in (4.3) has the property

$$\lim_{\varepsilon\to 0}\,\frac{1 - p(\varepsilon)}{\varepsilon\,\exp\!\big(-cN^{-1}(1-\varepsilon)\big)} = \exp\!\big(-c^2/2\big)\,\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\bigg|_{\varepsilon=0}\Big/\,\Pi(0),$$

where $c = \frac{|\alpha - r|}{\sigma}\sqrt{T}$.
Note that $N^{-1}(1-\varepsilon) \approx \sqrt{-\ln\varepsilon}$; hence $\exp\!\big(-cN^{-1}(1-\varepsilon)\big)$ goes to zero slower than any polynomial.
Results similar to the ones obtained in the previous proposition can also be obtained for other quantile hedging problems of the same type, such as the one in Spivak and Cvitanić (1999).
5. Conclusions and future work

In this section we give a brief summary of possible applications of the results obtained here; these will be the subject of future work. Throughout, let us assume A2 in order to simplify the discussion. A first situation where our result may be of help is that of a financial manager who is responsible for the trading activity at some company. The manager is aware of the fact that mis-specifications or changes of the volatility can drastically alter the balance of the trading activity. Therefore, the manager may impose the restriction on the traders that together they must be more or less "Vega neutral", i.e. $\sum_i \frac{\partial \Pi_i}{\partial \sigma}(\cdot) \approx 0$, where the sum is taken over every contingent claim traded by the company. Hence, by following this strategy the manager is on average protected from volatility changes. Now, let us suppose that one day the manager poses the
question: what will happen to our balance if the volatility rises in two months, due to a political meeting that will take place just before? To answer such specific questions, we already know that we have to use directional derivatives; hence, in this case the manager should rebalance his strategy setting $\sum_i \frac{\partial \Pi_i^{\varepsilon}}{\partial \varepsilon}(\cdot)\big|_{\varepsilon=0} \approx 0$, with the local volatility function $\hat\sigma(\cdot,\cdot)$ chosen according to previous analyses or beliefs. Consequently, it is of course interesting to see whether this new strategy will produce results other than the well-known Vega neutrality concept.

At first sight, the implementation of the lvi $\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\big|_{\varepsilon=0}$ might seem a little strange, due to the fact that the local volatility, i.e. $\varepsilon\hat\sigma(\cdot)$, is over-specified. Hence, suppose that the manager believes that the future volatility over the time period $[0,T]$ will be given by the deterministic function $\sigma(\cdot)$. He then solves backwards for $\varepsilon\hat\sigma(\cdot) = \sigma(\cdot) - \sigma$. Now, depending on what value of $\varepsilon$ is fixed at the beginning, the manager will obtain different functions $\hat\sigma(\cdot)$. To be precise, we see that the function $\hat\sigma(\cdot)$ is uniquely defined only up to a multiplicative constant. This is of course a problem, as we found that for all the options studied in this paper we had the relationship

$$\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\bigg|_{\varepsilon=0} = \left(\int_0^T \bar\mu(s,T)\,\hat\sigma(s)\,ds\right)\frac{\partial \Pi}{\partial \sigma}(0),$$

where $\bar\mu(\cdot,T)$ is a weight function independent of $\varepsilon$ and $\hat\sigma(\cdot)$. Therefore the lvi $\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\big|_{\varepsilon=0}$ is unique only up to a multiplicative constant as well. It follows that the quantity of interest for practical situations is $\varepsilon\,\frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\big|_{\varepsilon=0}$ and not the lvi by itself. Hence, the local Vega neutrality approach used by a financial manager should actually be written as

$$\sum_i \varepsilon_i\,\frac{\partial \Pi_i^{\varepsilon}}{\partial \varepsilon}(0)\bigg|_{\varepsilon=0} := \sum_i \left(\int_0^{T_i} \bar\mu_i(s,T_i)\,\sigma_i(s)\,ds - \sigma_i\right)\frac{\partial \Pi_i}{\partial \sigma}(0) \approx 0,$$

where the sum, as before, is taken over every contingent claim traded by the company. Note that for a portfolio composed of simple contingent claims with a fixed maturity $T$, one has $\bar\mu_i(\cdot,T) = \frac{1}{T}$ and the local Vega neutrality concept corresponds exactly to the classical Vega neutrality approach $\sum_i \frac{\partial \Pi_i}{\partial \sigma}(0) \approx 0$. However, in all other cases these two criteria do not coincide. In order to study the price variations we use the integral formula

$$\Pi^1(0) - \Pi^0(0) = \int_0^1 \frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\,d\varepsilon \approx \frac{\partial \Pi^{\varepsilon}}{\partial \varepsilon}(0)\bigg|_{\varepsilon=0},$$

where the price $\Pi^1(0)$ corresponds to the case where the stock has the volatility $\sigma + \hat\sigma(\cdot)$. This analysis can of course also be related to the classical Vega index, in the sense that

$$\Pi^0_{\sigma_1}(0) - \Pi^0(0) = \int_{\sigma}^{\sigma_1} \frac{\partial \Pi^0_v}{\partial \sigma}(0)\,dv \approx \frac{\partial \Pi^0_{\sigma}}{\partial \sigma}(0)\cdot(\sigma_1 - \sigma),$$
where we let $\Pi^0_{\sigma_1}(\cdot)$ denote the price given that the volatility is constant and equal to $\sigma_1$. Note that these are just first order approximations. By including higher order terms more accuracy can be gained; however, this would of course be more complicated. Another possibility is to use Riemann sum approximations. This may be of interest since there exists no closed form solution for the price of most exotic options whenever the volatility is time dependent (see e.g. Carr (2000)). To sum up, a possible application of our results arises naturally in risk management: how can a financial manager protect the company's position once he has formed a personal belief about the future development of the market? Typical questions that can be studied are, for instance, what happens to the option prices if the next two weeks turn out to be a very unstable period, or what happens to our positions in options if the volatility drops in a month, as predicted by a time series analysis. The lvi introduced here should be helpful to answer these and other related questions; for example, what is the relationship between this index and the presence of asymmetric information, and in particular the existence of inside information? Another interesting question is: what is the amount of rebalancing needed in order to keep a portfolio stable with respect to the lvi? We believe that this amount is smaller than the amount needed when using the classical Vega neutrality concept, since the lvi incorporates anticipated time dependent volatility structures. We leave it as an open question whether our results might be useful for detecting such phenomena. The lvi is a natural extension of the classical Vega index, i.e. the price derivative with respect to the constant volatility, in the sense that we perturb the volatility in different directions. For all the contingent claims studied in this paper, we show that the lvi can be expressed as a weighted average of the perturbation in volatility. In the case that one assumes that the volatility and the rate of interest are constant and the perturbation in volatility only depends on time, this average is multiplied by the classical Vega index, giving a clear relationship between the classical Vega index and the local one defined here. Moreover, in the case of path-dependent options these weighted averages in general have the property of putting less and less weight on events further in the future. Hence, according to this result, a financial manager should think in the short term and not worry (that much) about the distant future. We have also studied the well-known goal problem of maximizing the probability of a perfect hedge, and show that the speed of convergence depends on the lvi. We should also remark that all the quantities appearing in the formulae for the lvi are amenable to simulation, and therefore one could in fact compute these quantities in order to assess numerically the importance of volatility changes.
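To make the comparison concrete, here is a small sketch (ours; the anticipated volatility scenario and all parameter values are assumptions) contrasting the hedge ratio implied by classical Vega neutrality with the one implied by local Vega neutrality for two European calls, using $\bar\mu_i = 1/T_i$ as in the simple-claim case discussed above.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Sketch (not from the paper): for claim i, the practical local quantity is
# eps_i * lvi_i = (1/T_i * int_0^{T_i} sigma_anticipated(s) ds - sigma) * Vega_i.
S0, K, r, sigma = 100.0, 100.0, 0.05, 0.20
sigma_anticipated = lambda s: 0.20 + 0.10 * (s > 0.25)   # assumed vol scenario

def bs_vega(T):
    """Classical Black-Scholes Vega of an at-the-money-forwardish call."""
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return S0 * norm.pdf(d1) * np.sqrt(T)

def local_quantity(T):
    avg, _ = quad(sigma_anticipated, 0.0, T, points=[0.25])
    return (avg / T - sigma) * bs_vega(T)

T1, T2 = 0.5, 2.0
# units of claim 2 per unit of claim 1 under each neutrality criterion:
print("classical Vega-neutral ratio:", round(-bs_vega(T1) / bs_vega(T2), 3))
print("local Vega-neutral ratio:    ", round(-local_quantity(T1) / local_quantity(T2), 3))
```

The two ratios differ because the anticipated volatility rise affects the short- and long-dated claims through different averaging windows, which is exactly the point of the local index.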
References

[1] Bermin, H.-P. (1998a): Essays on Lookback and Barrier Options: A Malliavin Calculus Approach. Ph.D. thesis, Department of Economics, Lund University.
[2] Bermin, H.-P. (1998b): Hedging options: the Malliavin calculus approach versus the ∆-hedging approach. Working paper, Department of Economics, Lund University.
[3] Carr, P. (2000): Deriving derivatives of derivative securities. Journal of Computational Finance 4 (2), 5–29.
[4] Dupire, B. (1994): Pricing with a smile. Risk 7, 18–20.
[5] El Karoui, N., Jeanblanc-Picqué, M., and Shreve, S.E. (1998): On the robustness of the Black-Scholes equation. Mathematical Finance 8, 93–126.
[6] Fournié, E., Lasry, J.-M., Lebuchoux, J., Lions, P.-L. and Touzi, N. (1999): Applications of Malliavin calculus to Monte Carlo methods in finance. Finance and Stochastics 3, 391–412.
[7] Frey, R. (2000): Superreplication in stochastic volatility models and optimal stopping. Finance and Stochastics 4, 161–188.
[8] Harrison, J.M. and Pliska, S.R. (1981): Martingales and stochastic integrals in the theory of continuous trading. Stochastic Processes and Their Applications 11, 215–260.
[9] Hobson, D. (1997): Robust hedging via coupling. Working paper, School of Mathematical Sciences, University of Bath.
[10] Karatzas, I. (1996): Lectures on the Mathematics of Finance. CRM Monographs 8, American Mathematical Society.
[11] Karatzas, I. and Ocone, D. (1991): A generalized Clark representation formula, with applications to optimal portfolios. Stochastics and Stochastic Reports 34, 187–220.
[12] Karatzas, I. and Shreve, S.E. (1988): Brownian Motion and Stochastic Calculus, second edition. Springer-Verlag.
[13] Kulldorff, M. (1993): Optimal control of favorable games with a time-limit. SIAM Journal of Control & Optimization 31, 52–69.
[14] Kusuoka, S. and Stroock, D.W. (1984): Application of the Malliavin calculus I. In Stochastic Analysis, Proceedings of the Taniguchi International Symposium on Stochastic Analysis, Katata and Kyoto, 1982, ed.: Itô, K., Kinokuniya/North-Holland, Tokyo, 271–306.
[15] Nualart, D. (1995): The Malliavin Calculus and Related Topics. Springer-Verlag.
[16] Protter, P. (1990): Stochastic Integration and Differential Equations: a unified approach. Springer-Verlag.
[17] Rudin, W. (1976): Principles of Mathematical Analysis. McGraw-Hill.
[18] Seshadri, V. (1988): Exponential models, Brownian motion, and independence. Canadian Journal of Statistics 16, 209–221.
[19] Spivak, G. and Cvitanić, J. (1999): Maximizing the probability of a perfect hedge. Annals of Applied Probability 9, 1303–1328.
[20] Taniguchi, S. (1985): Applications of Malliavin's calculus to time-dependent systems of heat equations. Osaka Journal of Mathematics 22, 307–320.
[21] Üstünel, A.S. (1995): An Introduction to Analysis on Wiener Space. Springer Lecture Notes in Mathematics 1610.
Hans-Peter Bermin
Department of Economics, Lund University
P.O. Box 7082
220 07 Lund, Sweden
[email protected]

Arturo Kohatsu-Higa
Department of Economics, Universitat Pompeu Fabra
Ramon Trias Fargas 25-27
08005 Barcelona, Spain
[email protected]
Correlations and copulas in finance
Juan Carlos García Céspedes¹

Abstract: This paper deals with the concept of correlation, its meaning and, principally, its limitations. It also deals with the concept of copula. For a better understanding, we develop some striking counterexamples of uncorrelated Gaussian distributions that, nevertheless, are not independent. Finally, we present some particular examples of copulas: the Clayton copula, the Frank copula and the Gaussian copula. More and more often, copulas are used as tools for modelling dependencies where the Gaussian assumption is not reasonable. At present, copulas are basic tools in credit risk modelling (for example, copulas appear in the theoretical foundation of the Basel 2 proposal), in credit derivatives pricing, in asset securitisation design, in risk integration, in operational risk modelling and in the modelling of extreme market risks. As can be seen, their use in finance is becoming more frequent and, in that sense, we think this paper is an accessible introduction to this difficult subject.
1. Introduction

Originally, the title of this talk was "Correlations in finance"; however, in this revision I have thought it appropriate to include the word copula explicitly in the title. In January 2001, copulas were an absolutely novel topic² (as I write this, more than a year later, we could say that they are still novel, although more people know about them by now). The new Basel capital proposal (BIS II), whose underlying model is a Gaussian copula model, had not yet appeared. People talked about modelling market risk and the distribution of returns, and discussed how to model the tails, which everyone by then admitted were not Gaussian. However, little was said about modelling the dependencies between returns. Perhaps mistakenly, it was thought that the key lay in the tails and that the dependence relations were reasonably well captured by the correlation structure of the returns.

¹Juan Carlos García Céspedes is Director of the Corporate Risk Methodology Department at BBVA. This talk was given at the session of the Instituto MEFF-RiskLab Seminar of January 2001.
²So novel that, as I have said, I did not dare to include the word "copula" in the title of the talk. I did not want to run the risk that a title too technical or strange for the usual audience of these talks might reduce attendance.
Surprisingly, the use of copulas in finance arrived hand in hand with credit risk, where the normality assumption is clearly inadequate and, moreover, the typical distributions are highly asymmetric. In this setting the question arose of how to incorporate the benefits of diversification. Thus, in parallel, copulas started to be used for the valuation of credit derivatives involving more than two counterparties (credit derivatives on baskets, first to default, second to default...) and in asset securitisations, to estimate the distributions of the credit losses of the different tranches of a securitisation. In January 2001, just after the talk, the Basel Committee published its new capital requirement proposal, built on a credit model of Gaussian copulas. Currently, besides credit risk, work is being done on the use of copulas for modelling operational risk, for the integration of the different types of risk (market, credit and operational), and on their application to the measurement of market risk in clearly non-Gaussian settings, where dependence relations are important.

From my student days I remember that once, during a statistics class, I asked the lecturer to please give me an example of what he was explaining. The answer was: "you have no idea what you are asking for!". That really surprised me. I believed, and still believe, in the usefulness of examples, similes and metaphors as a method for fixing concepts that are complex in themselves... In this article, following this line of thought, I will try to give concrete examples for many of the concepts I want to introduce, with many supporting figures. In addition, a series of rhetorical questions will be posed throughout the article: I suggest that the reader, on encountering one of these questions, try to answer it... or at least make the mental effort of answering it.

The article is structured in three parts. The first part looks more deeply into the concept of correlation: the aim is to show the limitations of correlation as a measure of the degree of dependence between random variables. The second part introduces the concept of copula by means of an utterly disconcerting example: two Gaussian random variables are shown which have zero correlation and which, nevertheless, are not independent. In the third part, the concept of copula is formalised and some of the most common copulas are shown. Finally, an annex is included with some alternative dependence measures to the traditional linear correlation.
2. What is correlation?

When faced with this question, one usually gets answers and comments of the following kind:
- it is a measure of the linear relation between random variables;
- zero correlation does not imply independence; this is only true for Gaussian random variables;
- correlation is the covariance between two random variables divided by their standard deviations...
We would all agree with the comments above and yet, strictly speaking, they would need substantial qualification. Let us consider three specific questions about correlation which may seem to contradict some of the answers above (they will only seem to):

a) Does correlation exist? Mathematicians tend to ask first about the existence of things. The rest of us human beings (economists among them) tend to be simpler and take for granted many things³ that later turn out not to be so. Thus, one of the first things we usually do when presented with a pair of financial series is to estimate their correlation. But what are we really doing? What does the number we obtain mean?

b) In the context of Gaussian random variables, what exactly does correlation mean? Many of the problems, restrictions and precautions that the concept of correlation requires seem to disappear when we work with Gaussian random variables. Is this really true? If the random variables are Gaussian, is the concept of correlation sufficient to characterise the dependence between them?

c) What happens when we are not talking about Gaussian variables? What does correlation mean in that context? It is increasingly common to have to work in non-Gaussian settings and, moreover, often in multivariate contexts. In this situation, what real benefit can be drawn from the concept of correlation? In a non-Gaussian world, is it still possible to use the concept of correlation? How?
3. Does correlation exist?

As I said before, mathematicians tend to ask about the existence of things. Other branches of science are more practical: if things can be computed, they exist. Let us consider an example. The figures below show several samples of 500 observations obtained from bivariate t random variables. The data generating process is

$$X_t = X_{t-1} + \varepsilon_t, \qquad Y_t = Y_{t-1} + \upsilon_t,$$

where $\varepsilon_t$ and $\upsilon_t$ follow a "correlated" bivariate Student t distribution.

³It is surprising how many concepts that seem obvious turn out, on closer analysis, to be extraordinarily deep and by no means obvious.
In the figure above, the upper series were obtained from a "highly correlated" bivariate t distribution with 4 degrees of freedom. The lower series, on the other hand, are a sample from a "highly correlated" bivariate t distribution with only 2 degrees of freedom. The plots on the right show the scatter of $\Delta X_t = X_t - X_{t-1} = \varepsilon_t$ against $\Delta Y_t = Y_t - Y_{t-1} = \upsilon_t$, together with the regression line (instead of the correlation I have preferred to show the least-squares regression line; recall that $\rho = \sqrt{R^2}$). What happens if we generate another 500 random data points?
Any comments?⁴

⁴This is one of the rhetorical questions; let the reader try to answer it before continuing with the text. A hint: the badly named "atypical" data.
Let us look at another possible realisation:
And the last one:
The following summary table, with the R² values and the slopes of the regression lines of the four simulations, is very instructive:

R² of the regression      Sim1    Sim2    Sim3    Sim4
t with 2 d.o.f.           0.734   0.570   0.842   0.199
t with 4 d.o.f.           0.800   0.758   0.651   0.646

Slope of the regression   Sim1    Sim2    Sim3    Sim4
t with 2 d.o.f.           0.888   0.646   1.113   0.396
t with 4 d.o.f.           0.886   0.880   0.864   0.832
What is going on? What is happening is that for the bivariate t distribution with two degrees of freedom the correlation does not exist! Recall that the correlation is defined as

$$\rho_l(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\cdot\operatorname{Var}(Y)}},$$

but what is the variance of a t distribution? Dusting off some statistics manual, we find that the variance of a t distribution is

$$\operatorname{Var}(X) = \frac{C}{C-2},$$

where $C$ is the number of degrees of freedom. But for $C \le 2$ the expression above is not a positive real number... that is, for 2 or fewer degrees of freedom the variance does not exist. What does this mean? What would happen if we estimated the variance of a sample by means of the well-known formula

$$\hat\sigma^2 = \frac{1}{T-1}\sum_{t=0}^{T}\big(X_t - \bar X\big)^2, \qquad \bar X = \frac{1}{T}\sum_{t=0}^{T} X_t\;?$$
Let us do just that: estimate the standard deviation of a sample by means of the formula above, increasing the number of data points T in the sample. Let us see what happens⁵:

Case 1:

Case 2:

Case 3:

⁵T2 is computed from a Student t distribution with 2 degrees of freedom, T3 from one with 3 degrees of freedom and T4 from another with 4 degrees of freedom.
In the plots we see that the variance estimator

$$\hat\sigma^2 = \frac{1}{T-1}\sum_{t=0}^{T}\big(X_t - \bar X\big)^2, \qquad \bar X = \frac{1}{T}\sum_{t=0}^{T} X_t,$$

does not converge when the number of degrees of freedom is 2. However much the sample size is increased, $\hat\sigma$ remains unstable: it oscillates, jumps, rises, falls, rises again, falls again... This explains what is happening with the regressions shown before: in the case of 2 degrees of freedom the results, both the R² and the slope of the regression, are oscillating and unstable. In the example above, t distributions have been used for convenience, since they are sufficiently well-known distributions and, in addition, have no variance for 2 degrees of freedom; but any fat-tailed distribution would have served, understanding by fat-tailed distribution one that has no variance⁶.

First conclusion: correlations do not always exist. For example, fat-tailed distributions⁷, the distributions that have no variance, do not have correlations in the usual sense of the word either. This, however, by no means implies that the variables are unrelated to each other (linearly or otherwise).

Let us leave the murky universe of distributions without correlation. After all, it will be argued, these are strange, atypical cases with no practical application in finance... or perhaps not. It would be very interesting to model some processes, for instance exchange rates or some technology stocks, as processes generated by fat-tailed distributions, so that variances and correlations were not defined. Along the same lines, it would be very interesting to derive the option valuation formula in fat-tailed settings.
⁶Some will doubt the practical interest of the example: how can the variance of financial returns be infinite? And yet, surely nobody has stopped to think that in the widely used Gaussian models the support runs from −∞ to +∞, i.e. there is a non-zero probability of obtaining returns as large or as small as one likes.
⁷Now that it is recognised that return distributions have fat tails, the debate about the existence or not of the variance is, in my opinion, very important. For example, if the variance does not exist, the Black-Scholes model cannot be applied.
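The experiment above is easy to reproduce. The following sketch (with an assumed sample size and seed) prints the running estimate of the standard deviation at a few checkpoints and shows it failing to settle for 2 degrees of freedom.

```python
import numpy as np

# Running sample standard deviation for Student t samples: it converges for
# 3 and 4 degrees of freedom but keeps jumping for 2, where Var is infinite.
rng = np.random.default_rng(42)
n = 500_000
checkpoints = [1_000, 10_000, 100_000, 500_000]
for df in (2, 3, 4):
    x = rng.standard_t(df, size=n)
    stds = [x[:m].std(ddof=1) for m in checkpoints]
    print(f"t({df}):", "  ".join(f"{s:7.3f}" for s in stds))
```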
4. In the context of Gaussian distributions, what does correlation mean?

If we are talking about Gaussian distributions (I prefer not to call them normal, because they are less normal than they seem), what does correlation mean? Again, let us look at some examples.
The two series above are standard Gaussian (zero mean and variance 1); moreover, they have zero correlation with each other (they were built that way). The following plots show the histograms of these series against the standard Gaussian density; as we can see, the fit is good (it could not be otherwise, since that is how they were built)⁸.
⁸Believe me, reader: by construction, the two series are normal with zero mean and variance 1, regardless of the surprising results we will obtain with them.
Let us now look at the sum of the two, V1 + V2. It must have zero mean and variance 2 (being the sum of two uncorrelated standard Gaussians); and, logically, moreover... it must be Gaussian, because the sum of Gaussians is Gaussian, right?⁹
Any comments on the series V1 + V2?¹⁰ Perhaps if we plot the histogram of the series V1 + V2...

[Figure: histogram of V1 + V2]
It is certainly not normal (this time in both senses of the word).

⁹Again, reader, I encourage you to answer this question mentally.
¹⁰It does not look like a Gaussian series, does it?
We could also plot, instead of the density, the empirical cumulative distribution against that of a Gaussian with variance 2 and zero mean; or the Q-Q plot.
It still does not look much like a Gaussian distribution... What is going on? Let us look in more detail at the tails of the distribution of V1 + V2 and compare them with the tails of the Gaussian with the same variance:
We see that the "empirical" percentiles are smaller than those of the Gaussian with the same variance... The extreme variations of V1 + V2 are "smaller" than expected. Perhaps, if we finally plot the scatter of V1 against V2:
What we have obtained is not the typical cloud of points we were expecting¹¹.

¹¹Looking at the scatter of V1 against V2, we know they are uncorrelated, but are V1 and V2 independent?
Let us look at a second case. Again two standard Gaussian series with mean 0 and variance 1, V1 and V2, as well as their sum, V1 + V2:
In this case, it is much more obvious than before that the sum of V1 and V2 is not Gaussian... there are too many zeros. As before, we can plot the empirical cumulative distribution against the cumulative Gaussian with mean 0 and variance 2 (left-hand figure), and look in more detail at the tails of the distribution of V1 + V2 against the Gaussian with the same mean and variance (right-hand figure). In this case, unlike the previous one, we see how the "empirical" percentiles are larger than those of the Gaussian with the same variance... The extreme variations of V1 + V2 are "larger" than expected.
At this point, the reader will be wondering about the shape of the scatter of V1 against V2:
It is about time we learned how the two examples above were built. First, it is worth recalling a result much used in Monte Carlo simulation:

a) let X be a random variable with cumulative distribution $F(\cdot)$; then $U = F(X)$ is a uniform random variable;
b) let U be a uniform random variable; then $X = F^{-1}(U)$ is a random variable with cumulative distribution $F(\cdot)$.

This result makes it possible, in a very practical way, to generate two random variables with given distributions, say $F(\cdot)$ and $G(\cdot)$. One just generates uniform random variables $U$ and $V$ (generating uniform random variables on a computer is very easy): then $X = F^{-1}(U)$ and $Y = G^{-1}(V)$ are random variables with the desired distributions (computing inverse functions can also, at times, be very easy on a computer¹²):

- if, in addition, we want X and Y to be independent, we generate U and V independently;
- if instead we want X and Y to have some kind of relation, we can generate U and V non-independently.

The examples above were created by the methodology just described, where the uniform variables U and V were generated in a non-independent way; a minimal construction of this type is sketched below.
¹²Although on other occasions it can turn out to be really quite complicated.
The joint distribution of U and V used in the first example can be seen in the following plots:
And the joint distribution of U and V used in the second example, in these plots:
Now one just computes $X = N^{-1}(U)$ and $Y = N^{-1}(V)$, where $N(\cdot)$ is the standard Gaussian cumulative distribution. In this way, two random variables X and Y have been generated whose marginal distributions are Gaussian, whose correlation is zero¹³ and which are neither independent¹⁴ nor jointly Gaussian. Nobody should be surprised by these results. Although they seem to contradict what is usually taught in statistics, they do not; there is no such contradiction. The underlying problem is perhaps lexical: we are used to hearing that "the sum of normal random variables is normal"; we should probably rather say "the sum of multivariate normal random variables is normal". A multivariate normal is something more than a distribution with Gaussian marginals: it is a multivariate distribution which, besides having normal marginals, has a very particular "co-dependence" structure, so particular that it makes the sum of its components normal.

¹³By symmetry arguments it is very easy to see that the correlation between the random variables involved in the examples above is zero.
¹⁴It is now obvious that the random variables are not independent: U and V are related, so X and Y must be related too.
Therefore, an immediate conclusion is that two uncorrelated normal random variables need not be independent; they will only be independent if the joint distribution of the variables is also Gaussian.

I suppose most of the audience knows the concept of "Value at Risk" (VaR). It is nothing more than a percentile of the profit-and-loss distribution of a portfolio of assets, and it is one of the concepts most used within what is known as "risk measurement and management". One of the most common methods for estimating VaR consists of assuming that the return distributions of financial assets are Gaussian, as in the Black-Scholes model, so that, knowing the volatilities and the correlations between the financial assets, it is possible to estimate the VaR of a portfolio... since the sum of Gaussians is Gaussian. However, we have just seen that this is not so: even if the marginal distributions are Gaussian, the sum need not be.

The message for risk practitioners is really harsh. Not only is the normality assumption for financial series in question (in truth, one must say it is false) but, moreover, even if it held, we could not guarantee that series which are individually normal are jointly normal. Hence the risk (percentile) of the portfolio need not be the one derived from a Gaussian distribution.

To end this section, let me show one last example. Surely many readers will be thinking that the two cases we have seen involve some trickery, since the bivariate distributions with uniform marginals that I used are practically univariate, a cross in one case and the perimeter of a square in the other, so that the clouds of points are not really clouds¹⁵. It is not very difficult to modify the previous cases slightly to satisfy the most critical. The following figure shows another distribution generating uniform bivariate variables (uncorrelated but not independent).

[Figure: two panels, "Uniform marginals" and "Gaussian marginals"]

¹⁵Instead of "clouds of points" they should probably be called "edges of points".
We can plot the joint density of X and Y. Again, the marginal distributions are normal and the correlation is zero and yet X and Y are obviously not independent:
Next, the time series of X and Y (Variable 1 and Variable 2 in the plots) and of the sum X + Y following the distributions above are plotted.
Finally, we can plot the cumulative distribution of the sum X + Y, as well as the detail of the tails:
Here we have seen just a few cases; in fact, the number of distributions with zero correlation and Gaussian marginals that are not independent is infinite... The following figure provides one last example, taken from [2]:
Second conclusion: not even in the comfortable world of Gaussian distributions can we feel safe with correlations. Unless we guarantee something more than marginal normality, namely multivariate normality, correlation will not be an adequate measure of the dependence between the random variables.
5. What happens with multivariate distributions in general?

Although what we have seen so far might appear to complicate enormously the modelling of multivariate random variables, it is also true that new analysis tools become available. We have seen a way of generating random variables with arbitrary marginal distributions, based on the idea of relying on uniform distributions... but, in addition, we have the possibility of establishing (or not) "dependence" relations between the random variables, depending on whether the uniform random variables are independent or not.

The formalisation of this idea, building arbitrary random variables on top of uniforms, corresponds to the concept of "copula". A copula is a distribution of random variables whose marginals are uniform distributions:

$$C(u_1, u_2, \ldots, u_n) = \Pr(U_1 \le u_1,\, U_2 \le u_2,\, \ldots,\, U_n \le u_n).$$

A copula is a magnificent instrument for the simulation of random variables with given marginal distributions; we only have to simulate uniform variables with dependence structures determined by their copula, so that

$$F(x_1, x_2, \ldots, x_n) = C\big(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)\big),$$

where $F(\cdot,\cdot,\ldots,\cdot)$ is the cumulative distribution of $X_1,\ldots,X_n$, these have marginal distributions $F_1(\cdot), F_2(\cdot), \ldots, F_n(\cdot)$, and $C(\cdot,\cdot,\ldots,\cdot)$ is a copula. Moreover, there is a theorem, Sklar's Theorem, which guarantees the equivalence between arbitrary multivariate distributions and copulas¹⁶:

Let $F(x_1, x_2, \ldots, x_n)$ be a cumulative distribution function; then there exists an n-copula such that

$$F(x_1, x_2, \ldots, x_n) = C\big(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)\big).$$

As the reader can imagine, there is a multitude of copulas. By way of example, we can look at two of them (without forgetting the distributions we saw in the previous part, which are also copulas).
¹⁶What this theorem says is that any multivariate distribution admits an expression in copula form; the analytical tractability of that copula is a different matter.
The Clayton copula

Cumulative distribution: $C(u,v) = \left(u^{-\alpha} + v^{-\alpha} - 1\right)^{-1/\alpha}$, with $\alpha > 1$.
The larger the parameter α, the stronger the "relation" between the two uniform variables.
The Frank copula

Cumulative distribution: $C(u,v) = \dfrac{1}{\alpha}\,\ln\!\left(1 + \dfrac{(e^{\alpha u}-1)(e^{\alpha v}-1)}{e^{\alpha}-1}\right)$, with $-\infty < \alpha < \infty$.

Again, the larger the parameter α in absolute value, the stronger the "relation" between the two uniform variables.
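For concreteness, here is a sampling sketch for the Clayton copula. It uses the standard Marshall-Olkin gamma-frailty construction, which is an assumption of this sketch rather than anything used in the talk.

```python
import numpy as np

# Marshall-Olkin construction: with W ~ Gamma(1/alpha, 1) and independent
# E1, E2 ~ Exp(1), the pair U = (1 + E1/W)^(-1/alpha), V = (1 + E2/W)^(-1/alpha)
# has the Clayton copula above; larger alpha gives stronger dependence.
rng = np.random.default_rng(7)

def clayton_sample(alpha, n):
    w = rng.gamma(shape=1.0 / alpha, scale=1.0, size=n)
    e1, e2 = rng.exponential(size=n), rng.exponential(size=n)
    u = (1.0 + e1 / w) ** (-1.0 / alpha)
    v = (1.0 + e2 / w) ** (-1.0 / alpha)
    return u, v

u, v = clayton_sample(alpha=3.0, n=50_000)
# correlation of the uniforms = Spearman correlation of the underlying pair
print("Spearman correlation:", round(np.corrcoef(u, v)[0, 1], 3))
```

A scatter plot of (u, v) exhibits the lower-tail clustering characteristic of the Clayton family, which is what distinguishes it visually from the Gaussian case in the figure below.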
The following figure shows 3 clouds of points corresponding to two random variables with standard normal marginal distributions; one of them is bivariate Gaussian, while the others were generated using the Frank copula in one case and the Clayton copula in the other. The correlation coefficient is the same in all 3 cases; nevertheless, differences between the 3 distributions can be seen with the naked eye:

[Figure: three scatter plots, "Bivariate normal", "Clayton copula", "Frank copula"; the marginal distributions of the 3 clouds are the same, standard normal, and the correlations are equal]
If we superimpose the clouds of points generated by the Clayton copula and the bivariate Gaussian, the differences can be appreciated better:
Finally, there is a very particular case of copula of special interest: the so-called Gaussian copula. Let X and Y be two random variables with a bivariate standard Gaussian distribution. Then $U = N(X)$ and $V = N(Y)$ are uniform random variables. U and V are said to be defined by a Gaussian copula. By the above, $T = F^{-1}(U)$ is a random variable with cumulative distribution $F(\cdot)$ and $S = G^{-1}(V)$ is a random variable with cumulative distribution $G(\cdot)$. The joint distribution of T and S is very interesting because:

1. the dependence structure between T and S is defined by the correlation between X and Y (which are bivariate normal, so their whole dependence structure is determined by the correlation structure);
2. the marginal distributions are $F(\cdot)$ and $G(\cdot)$.

This is a simple way to generate multivariate random variables with given marginals and a reasonable dependence structure, with the added advantage that generating correlated Gaussian variables is very easy (Cholesky); a minimal sketch follows below. The Gaussian copula is receiving special attention in finance, above all in the area of credit risk modelling. For example, the Merton model of default is basically a Gaussian copula. In the valuation of credit derivatives involving more than one counterparty, for example insurance on baskets of bonds, first-to-default protection, etc., the dependencies between the counterparties involved are usually modelled by means of Gaussian copulas, with the correlation between the latent Gaussian variables typically inferred, for example, from the correlations between stock prices. One last example: the model underlying the new Basel capital proposal is a Gaussian copula.

Third conclusion: studying the concept of correlation in more detail has not only made us aware of its limitations, but has also led us to define the concept of copula, a very powerful tool that makes it easier to work in the more general world of multivariate distributions, Gaussian or not. In particular, the Gaussian copula is already being used in the financial markets for the valuation of credit derivatives.
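A minimal sketch of the recipe just described, with assumed illustrative marginals (an exponential and a Student t; the correlation value and all parameters are arbitrary choices):

```python
import numpy as np
from scipy import stats

# Gaussian copula: correlate standard normals via Cholesky, map to uniforms
# with N(.), then to the desired marginals with inverse cdfs.
rng = np.random.default_rng(3)
n, rho = 50_000, 0.7
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
x, y = L @ rng.standard_normal((2, n))          # correlated standard normals
u, v = stats.norm.cdf(x), stats.norm.cdf(y)     # uniforms coupled by the Gaussian copula
t_var = stats.expon.ppf(u, scale=2.0)           # marginal F: Exponential(mean 2)
s_var = stats.t.ppf(v, df=4)                    # marginal G: Student t(4)
print("Spearman rho of (T, S):", round(stats.spearmanr(t_var, s_var)[0], 3))
```

The same Cholesky step generalizes directly to n dimensions, which is why this construction is so convenient in practice.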
Annex: other measures of dependence between random variables

We have already seen that the concept of correlation "leaks": it only makes sense in certain settings, and in others it must be interpreted with great care. The problem is that, in general, correlation does not characterise the structure of relations between random variables; these structures are in general very rich and can hardly be captured by a single parameter¹⁷.

¹⁷In this sense, Gaussian distributions are an exception to the general rule.
In fact, there are other measures of correlation or, better said, of relation between random variables; let us mention two of them, the Spearman correlation and Kendall's tau. Recall first the "traditional" concept of correlation, which we will call linear correlation:

$$\rho_l(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}.$$

The Spearman correlation is defined as

$$\rho_s(X,Y) = \rho_l\big(F_X(X), F_Y(Y)\big) = \frac{\operatorname{Cov}\big(F_X(X), F_Y(Y)\big)}{\sqrt{\operatorname{Var}\big(F_X(X)\big)\cdot\operatorname{Var}\big(F_Y(Y)\big)}}.$$

The Spearman correlation is nothing more than the linear correlation in the world of uniform distributions; that is, the random variables are first transformed into uniforms (put another way, the Spearman correlation is just the correlation of the copula implicit in the distributions). Kendall's tau is defined as

$$\rho_\tau = P\big[(X_1 - X_2)(Y_1 - Y_2) > 0\big] - P\big[(X_1 - X_2)(Y_1 - Y_2) < 0\big].$$
[Figure: concordant and discordant point regions relative to (X1, Y1); positive correlation indicates that there are more concordant points than discordant ones, and vice versa]
There is a curious relation between the linear correlation and Kendall's tau:

$$\rho_\tau(X,Y) = \frac{2}{\pi}\arcsin\big(\rho_l(X,Y)\big).$$

To estimate Kendall's tau empirically, the following formula is recommended:

$$\hat\tau = \frac{c - d}{\sqrt{c + d + e_X}\cdot\sqrt{c + d + e_Y}},$$

where $c$ is the number of concordant pairs, $d$ the number of discordant pairs, and $e_X$ and $e_Y$ adjust for the cases in which there is neither concordance nor discordance, either because $X_1 - X_2 = 0$ or because $Y_1 - Y_2 = 0$. A direct implementation is sketched below.
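The following sketch implements this estimator on a toy sample (ours; for real data, scipy.stats.kendalltau computes the same tau-b quantity much faster):

```python
import numpy as np
from itertools import combinations

def kendall_tau_b(x, y):
    """Tau-b: c concordant, d discordant, e_x/e_y counting tied-only pairs."""
    c = d = ex = ey = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx * dy > 0:
            c += 1
        elif dx * dy < 0:
            d += 1
        elif dx == 0 and dy != 0:
            ex += 1
        elif dy == 0 and dx != 0:
            ey += 1
    return (c - d) / (np.sqrt(c + d + ex) * np.sqrt(c + d + ey))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.1, 1.9, 4.5, 5.0, 30.0]   # the last point mimics an outlier
print("tau-b:", round(kendall_tau_b(x, y), 4))
```

Note that the outlier barely moves tau, since only the ordering of the pairs matters; it would have a much larger effect on the linear correlation, as the tables below illustrate.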
As we can see in the following example, the linear correlation is strongly affected by outliers:

Sample without outliers
Linear correlation       70.23%
Kendall's tau            0.4996
Corr. implied by tau     70.71%

Kendall's tau, however, is much more robust to outliers:

Sample with outliers
Linear correlation       51.77%
Kendall's tau            0.49488
Corr. implied by tau     70.14%
References

[1] Embrechts, P., McNeil, A. and Straumann, D.: "Correlation: Pitfalls and Alternatives". Departement Mathematik, ETH Zentrum, CH-8092 Zürich.
[2] Embrechts, P., McNeil, A. and Straumann, D.: "Correlation and dependence in risk management: Properties and Pitfalls". Departement Mathematik, ETH Zentrum, CH-8092 Zürich.
[3] Li, David X.: "On Default Correlation: A Copula Function Approach". The RiskMetrics Group, Working Paper 99-07.
[4] Lindskog, F.: "Linear Correlation Estimation". RiskLab, ETH Zentrum, CH-8092 Zürich, Switzerland.
[5] Lindskog, F.: "Modelling Dependence with Copulas and Applications to Risk Management". Master Thesis, ETH Zentrum, CH-8092 Zürich, Switzerland.
[6] Frees, E.W. and Valdez, E.A.: "Understanding relationships using copulas". North American Actuarial Journal 2 (1998), 1–25.
[7] Belkacem, L.: "Processus stables et applications en finance: CAPM, risque, choix des portefeuilles, évaluation des options, dans un marché α-stable". Doctoral thesis, Université Paris IX Dauphine, U.F.R. Mathématiques de la décision.
Juan Carlos García Céspedes
Metodología de Riesgos Corporativos, BBVA
Paseo de Recoletos 8, tercera planta
28001 Madrid, España
[email protected]
Introduction to the empirical analysis of swap spreads
David Méndez-Vives¹
Abstract: This article describes the markets for interest rate swaps and presents some topics in their empirical analysis. The main contribution is an econometric study of the relation between swap spreads and corporate bond spreads, in a cointegration framework. The model includes other relevant variables, such as the slope of the yield curve or the spread differential between corporate bonds of different credit quality. The estimation results tend to confirm the hypothesis that the relation between swap and corporate bond spreads is due to the fact that the floating leg of a swap is indexed to a Libor rate, which is risky. In particular, we show that the credit risk of the swap itself is very low.
1. Introduction to swap markets

A swap is a derivative contract in which two parties agree to periodically exchange a stream of cashflows. In an interest rate swap (swap hereafter), one party, the "payer", periodically pays a fixed rate, the swap rate, to the other party, the "receiver", and in exchange receives a floating coupon, generally equal to the Libor rate (Euribor in the case of EUR swaps). According to the Bank for International Settlements (BIS), the notional outstanding on interest rate swaps at the end of the year 2000 stood at $48.768trn, making them by far the most important interest rate derivative contract. Arguably, swaps are one of the finest examples of the success of the financial innovation process in the last 30 years. Interest rate swaps are traded by a network of broker-dealers, and indicative swap rates are published by providers of financial information such as Reuters or Bloomberg. The swap rate is typically quoted as the spread in basis points over the benchmark Treasury bond with the closest maturity.²

¹David Méndez-Vives works in the Fixed Income Research Department of Lehman Brothers, London, having previously pursued doctoral studies in Economics at the London School of Economics. This talk was given at the session of the Instituto MEFF-RiskLab Seminar of February 2001.
²A benchmark or on-the-run government bond is, for a particular maturity, the most recently issued (and generally the most liquid) bond.
For maturity m, the swap spread is calculated as

$$\text{Swap Spread}_t^m = \text{Swap Rate}_t^m - \text{Treasury Bond Yield}_t^m. \tag{1}$$
As we mention above, the floating coupon of the swap is equal to the Libor rate. These rates apply to unsecured lending operations; hence they include a default risk premium that varies over time.³ These default and liquidity premia are reflected in the swap rates, which are generally higher than the risk-free Treasury rate. The fact that the swap payoffs are referenced to risky rates, together with the evidence that all sectors of the credit market are tightly correlated, translates into a significant correlation between swap spreads and corporate bond spreads. Swap markets tend to be more liquid than corporate bond markets, making swaps a natural instrument with which to hedge and speculate on credit risk. As an example, many dealers use swaps to hedge their inventories of corporate bonds. Given the liquidity and transparency of swaps, many market participants interpret swap spreads as a good measure of the evolution of global credit risk. Especially since the financial crisis of 1998, the relation between swap and credit spreads has been the main focus of swap markets. It is increasingly common to take the swap curve, instead of the Treasury curve, as the benchmark curve against which to reference instruments with credit risk. This is a more important issue in the Euro-area, where there is no single government bond yield curve, but a number of country curves that depend on the credit standing and liquidity of each country's debt. In contrast, the EUR swap curve is unique. The main contribution of this paper is an econometric study of the relation between swap and credit spreads, conditioning on the impact of a set of variables such as the slope of the yield curve or the spread differential between high-quality and low-quality corporate bonds. The econometric framework we use is that of cointegration and error-correction models.⁴ We will show evidence of spreads being integrated (or near-integrated). In this context, an error-correction model is interesting because it can shed light both on the long-term and the short-term dynamics of the processes. A stylised version of this two-step approach is sketched below.
2. The relation between swap and credit spreads It is well-known in the fixed income community that the corporate bond and the swap markets are highly related, and a positive contemporaneous correlation between credit spreads and swap spreads has been empirically documented by, among others, Minton (1994), Liu, Lang and Litzenberger (1998) and Baz et al. (1999). As an illustration, we present in Figure 1 the evolution of 10-year USD swap spreads and credit spreads for different bond ratings. 3 Libor rates are also sensitive to short–term liquidity conditions, which translate into short– lived rate spikes. For instance, in the period around Y2K, Libor rates experienced a large temporary increase as the value of liquidity increased, due to concerns about the possible failure of the technology supporting the payments system. 4 See Banarjee et al. (1993) for a comprehensive introduction.
[Figure: "USD Swap and Credit Spreads for the 10-year sector, 1994-2001 (relative to fitted Treasuries)"; y-axis in basis points; series: swap spread (10y), AA (10y), A (10y), BBB (10y)]
Figure 1: Evolution of the 10-year USD swap and credit spreads, relative to fitted off-the-run Treasuries. Data between 15 May 1994 and 31 May 2001, with semimonthly frequency (T=170 obs.).
The spreads on corporate bonds ("credit spreads") have been computed from the data in the Lehman Brothers US Corporate Investment Grade Index and are relative to off-the-run fitted Treasury yields. For each maturity, we also have corporate spreads by rating quality (AA, A, BBB) and by sector (Industrials and Financials). The swap spreads are measured as the difference between the mid-market quoted swap yields and the corresponding constant maturity fitted off-the-run Treasury yield.
2.1. Swaps and credit risk

In this section we discuss the motivation for the relation between swap and credit spreads. A major line of research is based on modelling the default risk embedded in the swap contract due to the default risk of the swap counterparties (see Duffie and Huang (1996), Sun et al. (1993) and Cossin and Pirotte (1997), among others). In the following, we will argue that counterparty risk is actually not very relevant. The financial industry, which is to a large extent self-regulated, has developed a number of mechanisms to reduce the counterparty credit risk of swaps, which we present below. First, the swap notional is not exchanged, hence the amount of money at risk is relatively small.⁵ In general, the swap spreads quoted by dealers are the same for
162
David M´ endez-Vives
all counterparties,6 and the discrimination for counterparty risk tends to happen via quantities: the magnitude of the gross exposure to a counterparty is determined by the size of its credit lines. The usual practice is that the two swap counterparties collateralize the net exposures by regularly posting cash or Treasury securities. It is fair to note, though, that these bilateral agreements do not make swaps risk free as a clearing house would do, because the mark-to-market is less frequent and the margining system is less transparent. In many instances the counterparties also agree on netting the swap cashflows, a practice by which all the swap cashflows between two counterparties are aggregated into a unique payment.7 The netting increases the security of the counterparties because at each payment date, the amount at stake is the net payment, instead of the gross. One situation that netting helps to avoid is that in a case of liquidity crisis, the party which is a net creditor may delay its payments to the other counterparty for fear of not being paid back at all, putting the net debtor closer to insolvency. Finally, in order to reduce the risk from the dealer side, swaps are generally contracted with dealer subsidiaries that are structured to be AAA-rated.8 Due to the credit-protection mechanisms described above, swap cashflows are generally regarded as of a high credit quality.9 This leads us to the other main motivation for the relation between swaps and credit spreads, based on the fact that the swap cashflows on the floating side are indexed to Libor rates. Libor rates apply to short– term unsecured borrowing operations (1 day to 1 year) and are determined every day by the British Bankers Association, through a poll of banks in London. The Libor panel includes 8 to 16 international banks in London, depending on the currency. Each bank communicates, by 11:00 am London time the rate at which it could borrow funds from other prime banks. The top and bottom quartiles of quotes are eliminated, and the Libor fixings are computed as the average of the rest. Since the Libor rates apply to unsecured lending operations, they incorporate a credit risk premium (in addition to a possible liquidity premium). Hence, the fixed rate will be naturally above the Treasury rate. A model of swap spreads based on this intuition is He (2000), where swap spreads are computed as the present value of the forward Libor vs. Repo spreads. 6 As
long as this have been approved by the risk-control and legal departments. practice there can be legal impediments to consolidating payment obligations. 8 These “special purpose vehicles” or SPVs are overcapitalized so that they have a better rating than the dealer that owns them. 9 The development of futures contracts referenced on swaps like the LIFFE’s “Swapnote” contract may be the final step in the elimination of counterparty risk in swaps. Essentially, the Swapnote contract is a future on a notional bond which is referenced on the EUR swap curve. The rationale for this contract lies in the increasing importance of the swap curve as the benchmark curve in the Euro-area. This contract tries to take advantage of two facts. First, many agents -typically portfolio managers- have mandates that do not allow them to enter into swap transactions, while they are free to enter into futures. These agents can use the new contract to take exposures to the swap curve, for instance to extend or decrease the duration of their bond portfolios. Second, and more related to our discussion, the futures contract is an exchange-traded contract. Therefore, the counterparty to all trades is a clearing house and the margining system is totally clear and transparent. 7 In
In recent years, especially since the crisis of 1998, there has been an ongoing debate about whether the swap curve is going to become the benchmark curve in fixed income markets. In any case, the swap curve has gained a large degree of visibility. The factors behind this phenomenon have been the large distortions in the US Treasury yield curve due to concerns over supply,10 and the very effects of the crisis of August 1998. Before the crisis, portfolios of corporate bonds were typically hedged with short positions in Treasuries. This practice resulted in large losses in 1998 when, in a period of flight to quality, spreads widened at the same time that Treasuries appreciated sharply. On the other hand, a hedge consisting of a long position in swap spreads would have worked properly. The high volatility of spread markets in the post-98 period has made the issue of how to hedge credit risk more important.

The way credit risk is traded has also had an impact on the relation between swap and credit spreads, via the financial innovation process. It has become increasingly common to trade corporate bonds on an "asset-swapped" basis. In an asset swap, the party buying a fixed-rate bond transforms it into a Libor floater by entering into a "tailored" swap, paying fixed (where the fixed rate is equal to the coupon of the corporate bond) and receiving Libor plus a spread (the asset swap spread). In this way, the owner of the bond eliminates the interest rate risk of the position and keeps an exposure to credit risk only. Notice that if the corporate bond defaults, he is still obliged to pay fixed in the swap.11 Economically, buying a bond on an asset-swapped basis is similar to buying a bond and paying fixed in an interest rate swap. The asset swap arises because investors may want swap payoffs that match the dates and amounts of the fixed bond payoffs. Nevertheless, we can think of the asset swap and of buying a bond plus paying fixed in a swap as essentially equivalent operations, and hence the asset swap spread will be roughly equivalent to the Libor spread. The evolution of the value of the bond on an asset-swapped basis will then be determined by the joint evolution of swap and credit spreads.
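As a rough illustration of the asset swap mechanics (not from the paper; the function, inputs and numbers below are ours), a minimal sketch of the par asset swap spread, assuming annual payments and a flat Libor discount curve: the spread over Libor equals the gap between the bond's value on the Libor curve and its market dirty price, spread over an annuity.

    # Minimal sketch of a par asset swap spread; assumes annual payments and a
    # flat Libor discount curve. Names and inputs are illustrative only.
    def asset_swap_spread(dirty_price, coupon, maturity_years, libor):
        dfs = [(1.0 + libor) ** -t for t in range(1, maturity_years + 1)]
        pv_on_libor = sum(coupon * df for df in dfs) + 100.0 * dfs[-1]
        annuity = sum(dfs)  # PV of 1 unit paid on each coupon date
        # spread (price points per year) compensating the par buyer for the
        # difference between the Libor-curve value and the price paid
        return (pv_on_libor - dirty_price) / annuity

    # A bond trading below its Libor-curve value swaps into Libor plus a positive spread:
    print(asset_swap_spread(dirty_price=97.5, coupon=6.0, maturity_years=5, libor=0.05))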
2.2. Some empirical implications

In this section we study the relation between swap spreads and two aggregate measures of credit risk: the "quality spread" (the differential between "low quality" BBB debt and "high quality" AA debt) and the financial vs. industrial spread (for single-A issues, the difference between Financial and Industrial spreads). We will refer to them as the QUAL and FININD spreads respectively, and we can see their evolution in Figure 2.

10 The relative importance of US Treasury securities in fixed income markets has decreased substantially, due to the US Federal budget surpluses in the late 90's. The uncertainty generated by the future evolution of Treasury issuance and the size of the buy-back program has made the US Treasury yield curve experience major swings. One example is the inversion of the US Treasury curve between the 2-year and the 30-year sector for most of year 2000, after it was announced that the US Treasury would buy back $30bn of long-dated bonds during that year.
11 The mechanics of the asset swap are explained in detail in O'Kane (1999).
[Figure 2 plot: "Relation between the Swap Spreads and the QUAL and FININD Spread Differentials (10-year sector, 1994-2001)"; y-axis in basis points (0 to 140), x-axis from 08/94 to 06/01; series: Swap spread 10y, QUAL 10y, FININD 10y.]
Figure 2: Relation between the USD 10-year swap spread and the QUAL and FININD spreads. Data between 15 May 1994 and 31 May 2001, with semi-monthly frequency (T=170 obs.).
We would expect the QUAL spread differential to be correlated with swap spreads if it is the default risk of swaps that drives the relation with credit. On the other hand, we would expect the FININD spread to be more closely related to swap spreads if what matters is the specific risk of the financial sector. We anticipate that the evidence tends to support the second hypothesis: the correlation between fortnightly changes in 10-year swap spreads and changes in 10-year FININD spreads is 0.259, against a correlation of 0.004 with the 10-year QUAL spread.

The QUAL spread is generally interpreted as a proxy for the effect of the business cycle on credit markets.12 The average QUAL spread increases with maturity, although for the period after August 1998 the average QUAL spreads are very similar across the three maturities considered, between 73.4bp and 78bp. The evolution of the QUAL spread is interesting in that it deteriorated after the crisis of August 1998 but then tightened back to more normal levels in 1999, even though swap spreads and corporate spreads experienced a large widening. It was in the second half of 2000 that the QUAL spread deteriorated strongly again, as the US economy approached recession.

The FININD spread measures the differential compensation for credit risk of the financial sector with respect to the industrial sector. Since swap spreads can be seen as the cost of transforming fixed-rate debt into floating, Libor-indexed debt, they should be sensitive to the factors that determine the Libor rates. Also, on one side of the swap we will find a financial dealer, hence the relevance of trying to capture the risk of the dealer community as a whole. The spread between Financials and Industrials increased markedly during and after the crisis of 1998. This can be interpreted as a reflection of the higher sensitivity to systemic risk of the financial sector relative to the industrial sector. The FININD spread came back to more normal levels at the beginning of 1999, but subsequently increased steadily as the Federal Reserve kept raising short-term interest rates, which tends to hurt the profitability of the financial sector. The average level of the FININD spread is relatively low (e.g. 14.6bp for the 10-year), and even negative for the 2-year maturity. However, the volatility of the FININD spread is substantial, at 25.7bp for the 10-year. This is even more extreme when we only consider the subsample after August 1998: the average FININD for the 10-year is 20.3bp but the volatility is as high as 38.2bp.

When we put the three spreads together in Figure 2, it is clear that swap spreads move more closely with the FININD spread. On the other hand, we can have significant movements in the QUAL spread which are not mirrored by movements of swap spreads. At this simple level, it seems that the sensitivity of swap spreads to credit risk is mostly driven by their sensitivity to the performance of the financial sector.

12 The quality spread was already included in the empirical study of swap spreads by Minton (1994), as the difference between AAA and BBB corporates. Minton (1994) did not find the QUAL spread to be a significant driver of swap spreads.
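The correlations quoted above are straightforward to reproduce once the semi-monthly series are at hand. A minimal sketch (the file name and column names are hypothetical, not part of the original study):

    import pandas as pd

    # Hypothetical input: semi-monthly 10-year series in basis points (T=170)
    df = pd.read_csv("spreads_10y.csv", parse_dates=["date"], index_col="date")
    changes = df[["swap_spread", "qual", "finind"]].diff().dropna()
    # correlation of fortnightly changes of swap spreads with QUAL and FININD
    print(changes.corr().loc["swap_spread", ["qual", "finind"]])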
3. An error-correction model for swap spreads In this section we formulate and estimate an econometric model of swap spreads. We will first comment on a number of variables that are a priori relevant in such a model.
3.1. The empirical model

First, it is intuitive that swap spreads are likely to be related to the dynamics of the Treasury yield curve. The dynamics of the yield curve can be explained mostly by two factors, generally identified as the level and slope of the curve (see Méndez-Vives (2000)). In general, we would expect that when bond markets sell off (yields increase), spreads will tend to widen, reflecting the negative news. However, in periods of extreme market instability or "flight-to-quality," we observe that Treasury yields fall while spreads widen, i.e. the correlation becomes negative. In practice, it is likely to be difficult to disentangle which of these effects prevails.13

The slope of the yield curve is a crucial variable, not only because it reflects the way the yield curve evolves over time, but also because there is evidence that it encapsulates market expectations regarding the business cycle. This point is argued extensively in Harvey (1988), where evidence is shown that a flat or inverted yield curve predicts a recession. In empirical work on spreads, it is usual to condition them on some measure of the business cycle.14 We will use the slope of the Treasury yield curve for that purpose, measured as the differential between 2-year and 10-year rates. In Figure 3, we can see the evolution of the slope of the yield curve together with the evolution of credit spreads. Notice the strong negative correlation since January 1999, with spreads widening when the slope turns negative and tightening when the curve steepens.

13 Another common argument is that the correlation between the level of rates and swap spreads is negative because swap markets adjust to new information more slowly than bond markets, due to their lower liquidity. For instance, lower Treasury yields would make spreads widen temporarily, in a mechanical fashion. This argument implies a degree of forecastability for swap spreads, a topic that, to our knowledge, has not been investigated.
[Figure 3 plot: "Credit Spreads and the Slope of the Yield Curve"; y-axis in basis points (-50 to 200), x-axis from 08/94 to 06/01; series: Slope (10y-2y UST), A Credit (10y).]
Figure 3: Evolution of the USD single-A 10-year credit spreads, and of the slope of the Treasury yield curve (computed as 10-year minus 2-year fitted yields). Data between 15 May 1994 and 31 May 2001, with semi-monthly frequency (T=170 obs.).
Another reason to introduce the slope of the yield curve is that it has a strong impact on the financing choices of firms and on their corresponding positions in swaps. When the yield curve is very steep, corporates may prefer to obtain finance on a floating basis and swap it into fixed-rate, long-term debt. In other words, when the curve is steep there is a receiver bias in swap markets, which tends to decrease swap rates and hence swap spreads.

As additional conditioning variables, we will also introduce the QUAL and FININD spread differentials, which have an interpretation in terms of the hypotheses we have formulated for the relation between swap and credit spreads. The choice of variables may become clearer once we discuss the previous literature and its main conclusions.

14 For instance, Lang et al. (1998) condition swap spreads on detrended unemployment, and Bevan and Garzarelli (2000) condition credit spreads on real GDP growth and the financing gap (difference between capital spending and internally generated funds) of US non-farm and non-financial corporations.
3.2. Estimation results

We will argue that credit and swap spreads are cointegrated and that an appropriate methodology to capture this is an error correction model (ECM) (see Banerjee et al. (1993)). We will estimate such a model following the Engle-Granger (1987) two-step procedure. The first step consists in estimating the cointegrating vector via an OLS regression of the variables in levels. The regression residual (which should be I(0)) is interpreted as the deviation of the swap spread from its long-term equilibrium level. In the second step, we estimate a regression with the variables in first differences, in which we also include the lagged residual from the levels regression. This equation tries to capture the short-term adjustment dynamics of the swap spread towards its equilibrium level. The adjustment will depend on the movements of a number of economic drivers and on how far the swap spread is from its long-term level. If the model is well-specified, swap spreads should tend to revert to their equilibrium level; hence the coefficient of the lagged error measures the speed of mean reversion and is expected to be negative.

For the sake of concreteness, we will present the results of the analysis for the 10-year maturity, which is generally the most representative sector of the swap market. Regarding credit spreads, we will focus on single-A rated corporates, which is the average credit quality of the USD credit market.15

In order to argue that swap and credit spreads are cointegrated, we first need to show that the series are I(1). For that purpose, we have performed a series of Augmented Dickey-Fuller tests, both for the full sample and for the two subsamples before and after the crisis of August 1998.16

                       A10     SWSPR10    SLOPE
Full sample
  ADF t-test          -0.71     -1.57     -1.83
  ADF Z-test          -1.42     -4.83     -6.25
Before Aug98
  ADF t-test          -0.83     -1.96     -2.11
  ADF Z-test          -3.55    -10.50     -5.53
After Aug98
  ADF t-test          -1.73     -1.85      0.29
  ADF Z-test          -4.14     -5.91      0.83
Table 1: Results of the Augmented Dickey-Fuller tests for 10-year credit (single-A) and swap spreads and for the slope of the yield curve (10-year minus 2-year Treasuries). Data between 15 May 1994 and 31 May 2001, with semi-monthly frequency (T=170 obs). The null hypothesis is that the series contains a unit root. The rejection values for the t-test are -2.57 (10%), -2.88 (5%) and -3.46 (1%). The rejection values for the Z-test are -11.2 (10%), -14.0 (5%) and -20.3 (1%).

15 The conclusions for the other maturities and for the other corporate bond ratings are similar to those that we present.
16 The reason for testing by subsamples is that nonstationarity might be caused by the presence of a structural break of exogenous nature.
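Unit-root tests of this kind can be reproduced with standard econometrics libraries. A sketch using statsmodels (the input file and column names are assumptions, not part of the original study):

    import pandas as pd
    from statsmodels.tsa.stattools import adfuller

    # Hypothetical input: the semi-monthly series behind Table 1
    df = pd.read_csv("spreads_10y.csv", parse_dates=["date"], index_col="date")

    def adf_report(series, name):
        # Null hypothesis: the series contains a unit root
        stat, pvalue, lags, nobs, crit, _ = adfuller(series.dropna(), regression="c")
        print(f"{name}: ADF t = {stat:.2f} (5% critical value {crit['5%']:.2f})")

    for col in ["A10", "SWSPR10", "SLOPE"]:
        adf_report(df[col], col)               # levels: expect no rejection, i.e. I(1)
        adf_report(df[col].diff(), "d" + col)  # first differences: expect rejection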
The results of the ADF tests in Table 1 are conclusive, as we cannot reject the null hypothesis of a unit root at the 5% level for any maturity or subsample. We have also performed ADF tests for the credit and swap spreads in first differences, for which the hypothesis of nonstationarity is strongly rejected. We have augmented the cointegrating relation with the slope of the Treasury curve (10-year yield minus 2-year yield, in basis points). This is because, as Lang et al. (1998) argue, the equilibrium relation between swap and credit spreads changes along the business cycle, and the evidence in Harvey (1988) supports the idea of using the slope of the yield curve as a proxy for it. The ADF test results in Table 1 indicate that the slope is I(1), whereas its first differences are found to be stationary. The cointegrating relation we estimate is then

(2)   \mathrm{Swap\ Spread}_t = \beta_0 + \beta_1\,\mathrm{Credit\ Spread}_t + \beta_2\,\mathrm{Slope}_t + \varepsilon_t .
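A sketch of the Engle-Granger two-step procedure for (2) and the adjustment equation (3) below, again on the hypothetical DataFrame used above; the Newey-West covariance with 6 lags mirrors the standard errors reported in Table 2:

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("spreads_10y.csv", parse_dates=["date"], index_col="date")  # hypothetical file

    # Step 1: cointegrating regression in levels, eq. (2)
    levels = sm.OLS(df["SWSPR10"], sm.add_constant(df[["A10", "SLOPE"]])).fit()
    eps = levels.resid                        # deviation from the long-run equilibrium

    # Step 2: short-run adjustment in first differences, eq. (3)
    dX = df[["A10", "BOND10", "SLOPE", "FININD10", "QUAL10"]].diff()
    dX["lagged_resid"] = eps.shift(1)         # its coefficient is rho, expected negative
    dX = sm.add_constant(dX).dropna()
    dy = df["SWSPR10"].diff().loc[dX.index]
    ecm = sm.OLS(dy, dX).fit(cov_type="HAC", cov_kwds={"maxlags": 6})  # Newey-West, 6 lags
    print(ecm.params)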
The results of this regression can be seen in Table 2, where the credit spread has a coefficient of 0.462 and the slope has a coefficient of -0.158. In other words, in the long run swap spreads are roughly half the size of single-A credit spreads, and the slope of the yield curve has a negative impact on them: a yield curve that is 10bp steeper results in swap spreads about 1.5bp tighter. The residual from the levels regression is shown to be I(0), where we correct the critical values of the ADF test to account for the fact that the cointegrating coefficients are estimated, not known.

Levels regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant     9.430      4.076       2.314     0.021
A10          0.462      0.042      10.921     0.000
SLOPE       -0.158      0.045      -3.490     0.000
Rc2          0.906   ADF t-test on residuals: -4.151

Changes regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant    -0.076      0.240      -0.316     0.752
ρ           -0.190      0.042      -4.577     0.000
∆A10         0.729      0.079       9.221     0.000
∆BOND10      0.041      0.020       2.074     0.038
∆SLOPE      -0.122      0.041      -2.969     0.003
∆FININD10   -0.069      0.083      -0.835     0.404
∆QUAL10     -0.303      0.102      -2.973     0.003
Rc2          0.572
Table 2: Results of the Error Correction model for 10-year swap spreads, for the full sample. Data between 15 May 1994 and 31 May 2001, with semi-monthly frequency (T=170 obs). The std. errors are Newey-West (with 6 lags). The critical value for the ADF t-test on the residuals of the regression in levels is -3.37 (5%) or -3.96 (1%).
The results of the above regression are consistent with the observed international differences in the average level of swap spreads. Credit spreads in USD are higher than in EUR (156bp for USD single-A debt and 91bp for EUR17), and the average slope of the yield curve is higher in the Euro-area than in the US: the spread between 2-year and 10-year bonds is 78bp and 18bp respectively (between January 1999 and July 2001). According to the model, this would result in higher swap spreads for USD than for EUR, as is the case.

Next, we estimate the short-term adjustment equation, where the lagged residual from the levels regression (2) captures the mean reversion in the process. In addition to the changes in credit spreads and the slope, this second equation also includes a number of variables that may influence the short-term dynamics of the swap spreads: the level of interest rates and the quality and financial vs. industrial spreads:18,19

(3)   \Delta\mathrm{Swap\ Spread}_t = \rho\,\varepsilon_{t-1} + \gamma_0 + \gamma_1\,\Delta\mathrm{Credit\ Spread}_t + \gamma_2\,\Delta\mathrm{Bond}_t + \gamma_3\,\Delta\mathrm{Slope}_t + \gamma_4\,\Delta\mathrm{Finind}_t + \gamma_5\,\Delta\mathrm{Qual}_t + u_t .
The results from the ECM (full sample in Table 2, pre- and post-98 crisis in Tables 3 and 4) show that the most important influence on swap spreads comes from credit spreads, with a positive sign as expected. The coefficient on the credit spread is 0.462 in the 10-year levels equation and 0.729 in the changes regression (both numbers are for the full sample). We also observe that the short-term relation between swap spreads and credit spreads is stronger after the crisis of 1998: the credit spread coefficient in the changes regression for the sample after the crisis is 0.733, against 0.468 for the sample before the crisis. The lagged residual is found to be strongly significant and enters the changes regression with a negative sign. Hence, deviations from the "fair" value given by the fitted level of the swap spread relative to the credit spreads tend to be corrected during subsequent periods. The magnitude of the coefficient differs over subsamples and, in particular, the speed of the reversion to fair levels is faster in the pre-crisis period. In other words, distortions between the swap and credit markets tend to persist longer after the crisis of 1998, which is consistent with general market opinion (see Baz et al. (1999)).
17 As of July 2001, average spreads from the single-A corporate sector of the Lehman Brothers US Aggregate and Euro-Aggregate indices.
18 All variables in equation (3) are introduced in first differences, as they are generally shown to be I(1) or near-I(1). Essentially, the bond yields and the slope of the yield curve appear to be highly nonstationary. For the spread differentials, FININD and QUAL, the results of the ADF tests are generally close to rejecting I(1) at the 5% level. For all the variables in first differences, we obtain a strong rejection of the null of nonstationarity. We do not report the results of these Augmented Dickey-Fuller tests for lack of space.
19 In the "changes" regression, all variables are as of time t, except for the lagged error from the levels regression. The intention in this paper is to describe the contemporaneous relation between variables, and not to look for forecasting relations. In order to study the latter, we would need a rigorous analysis of the exogeneity of the variables, Granger causality, etc., which is beyond the scope of the present work. Instead, we focus on the relative value relations between variables.
Levels regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant     8.560      3.186       2.687     0.007
A10          0.385      0.042       9.228     0.000
SLOPE       -0.037      0.011      -3.377     0.001
Rc2          0.714   ADF t-test on residuals: -5.130

Changes regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant    -0.044      0.158      -0.276     0.782
ρ           -0.378      0.084      -4.486     0.000
∆A10         0.468      0.059       7.866     0.000
∆BOND10      0.019      0.015       1.240     0.215
∆SLOPE      -0.037      0.022      -1.720     0.085
∆FININD10   -0.149      0.086      -1.727     0.084
∆QUAL10     -0.169      0.119      -1.412     0.158
Rc2          0.354

Table 3: Results of the Error Correction model for 10-year swap spreads, for the sample before the crisis of August 1998. Data between 15 May 1994 and 31 July 1998, with semi-monthly frequency (T=102 obs). The std. errors are Newey-West (with 6 lags). The critical value for the ADF t-test on the residuals of the regression in levels is -3.37 (5%) or -3.96 (1%).

Levels regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant    12.067      9.877       1.222     0.222
A10          0.455      0.076       6.023     0.000
SLOPE       -0.289      0.063      -4.602     0.000
Rc2          0.986   ADF t-test on residuals: -2.870

Changes regression:
Variable     Coeff    Std Error    T-Stat    Signif
Constant     0.109      0.598       0.182     0.855
ρ           -0.217      0.077      -2.824     0.005
∆A10         0.733      0.092       7.979     0.000
∆BOND10      0.072      0.044       1.644     0.100
∆SLOPE      -0.183      0.062      -2.971     0.003
∆FININD10   -0.026      0.100      -0.264     0.792
∆QUAL10     -0.281      0.111      -2.534     0.011
Rc2          0.620
Table 4: Results of the Error Correction model for 10-year swap spreads, for the sample after the crisis of August 1998. Data between 14 August 1998 and 31 May 2001, with semi-monthly frequency (T=68 obs). The std. errors are Newey-West (with 6 lags). The critical value for the ADF t-test on the residuals of the regression in levels is -3.37 (5%) or -3.96 (1%).
The yield curve variables (level and slope) tend to be significant mostly for the period after August 1998, and the slope seems to have the more important effect. The changes in the 10-year bond yield have a positive coefficient, but this is significant for the full sample and the post-crisis subsample at the 10% level only. The slope has a negative sign and is significant, again for the full sample and for the post-crisis period, at the 1% level. These results confirm our conjecture that it may be difficult to obtain a clear relation between changes in yields and changes in swap spreads, but the slope should have a clear effect, in the form of swap spreads tightening when the curve steepens.

The impact of changes in the FININD and QUAL spreads on swap spreads is either not significant (FININD) or, when it is, has a puzzling negative sign (QUAL). We think that this may be due to the high correlation of these variables with the changes in credit spreads,20 which obscures their relation with swap spreads. In other words, the information in the spread differentials seems to be already contained in the credit spread. In any case, the QUAL spread has a significantly negative sign both for the full sample and the post-crisis period, which is not consistent with the idea that differential counterparty default risk affects swap spreads.
4. Conclusions

The main focus of this introduction to the analysis of swap spreads is their relation with credit spreads, which we treat from an empirical point of view. The explanation of the high contemporaneous correlation between the swap and corporate bond markets is a topic that has not been solved theoretically. One of the reasons may be that there is scarce empirical evidence on which to base a theoretical investigation, mainly due to the difficulties in obtaining a comprehensive and accurate dataset on credit spreads.

We discuss two approaches to explaining the relation between swap and credit spreads. We tend to find more evidence supporting the explanation that relies on the fact that swap cashflows are indexed to Libor and that agents increasingly focus on the swap curve as the benchmark for trading and hedging credit risk. An alternative explanation, based on the default risk of the swap cashflows, is hindered by the fact that the industry has developed extensive credit-protection mechanisms. Swap spreads do not seem to be related to measures of global credit risk, like the spread differential between BBB and AA debt.

Given the statistical evidence and the economics of the problem, we have investigated the possibility of swap and credit spreads being cointegrated. We estimate an error correction model in which we also introduce a number of possibly relevant economic variables. We find that the relation between credit and swap spreads is strongly positive and stable. The relation between swap spreads and the slope of the yield curve is typically negative and stronger after the crisis of 1998. The credit spread differentials (FININD and QUAL) do not appear to be significant, and we conjecture that the reason is that the information they contain is already incorporated in the credit spreads. The evidence in this paper should be relevant for the construction of a theoretical model dealing with the integration between the different sectors of the fixed income markets.

20 The correlation of fortnightly changes of 10-year single-A spreads with the 10-year QUAL spread is 0.30, and 0.48 for the FININD spread.
References

[1] Banerjee, A., Dolado, J., Galbraith, W., and D. Hendry (1993): Cointegration, Error-correction and the Econometric Analysis of Non-stationary Data. Oxford University Press.
[2] Baz, J. (1997): "Capital market swaps: origins and functions". CEMS Business Review 2, 85-102, Kluwer AP.
[3] Baz, J., Méndez-Vives, D., Munves, D., Naik, V., and J. Peress (2001): The dynamics of swap spreads: a cross-country study. Analytical Research Series, Lehman Brothers Fixed Income Research.
[4] Bevan, A., and F. Garzarelli (2000): "Corporate bond spreads and the business cycle". Journal of Fixed Income 9, 61-78.
[5] Brown, K., Harlow, W., and D. Smith (1994): "An empirical analysis of swap spreads". Journal of Fixed Income 3, March issue, 61-78.
[6] Campbell, J., Lo, A., and C. MacKinlay (1997): The Econometrics of Financial Markets. Princeton University Press.
[7] Campbell, R., and T. Temel (2000): Interest rate swaps. Analytical Research Series, Lehman Brothers Fixed Income Research.
[8] Collin-Dufresne, P., and B. Solnik (2001): "On the term structure of default premia in the swap and libor markets". Journal of Finance 56, 1095-1115.
[9] Cossin, D., and H. Pirotte (1997): "Swap credit risk: An empirical analysis on transaction data". Journal of Banking and Finance 21, 1351-1373.
[10] Duffee, G. (1996): "Idiosyncratic variation of Treasury Bill yields". Journal of Finance 51, 527-552.
[11] Duffie, D. (1996): "Special repo rates". Journal of Finance 51, 493-526.
[12] Duffie, D., and M. Huang (1996): "Swap rates and credit quality". Journal of Finance 51, 921-949.
[13] Duffie, D., and K. Singleton (1997): "An econometric model of the term structure of interest rate swap yields". Journal of Finance 52, 1287-1321.
[14] Duffie, D., and K. Singleton (1999): "Modeling term structures of defaultable bonds". Review of Financial Studies 12, 687-720.
[15] Engle, R., and C. Granger (1987): "Co-integration and error correction: representation, estimation and testing". Econometrica 55, 251-276.
[16] Haidar, S. (1993): The complete guide to derivatives. Mimeo, Lehman Brothers Fixed Income Research.
[17] Harvey, C. (1988): "The real term structure and consumption growth". Journal of Financial Economics 22, 305-333.
[18] He, H. (2000): "Modeling term structures of swap spreads". Working Paper, Yale School of Management, Yale University.
[19] Lang, L., Litzenberger, R., and A. Luchuan Liu (1998): "Determinants of interest rate swap spreads". Journal of Banking and Finance 22, 1507-1532.
[20] Litzenberger, R. (1992): "Swaps: plain and fanciful". Journal of Finance 47, 831-850.
[21] Liu, J., Longstaff, F., and R. Mandell (2000): "The market price of credit risk: An empirical analysis of interest rate swap spreads". Working paper, Anderson School of Management.
[22] Minton, B. (1997): "An empirical examination of basic valuation models for plain vanilla US interest rate swaps". Journal of Financial Economics 44, 251-277.
[23] Monkkonen, H. (1999): Estimating credit spread curves. Fixed Income Research, Lehman Brothers.
[24] O'Kane, D. (2000): Introduction to asset swaps. Analytical Research Series, Fixed Income Research, Lehman Brothers.
[25] Sun, T., Sundaresan, S., and C. Wang (1993): "Interest rate swaps: an empirical investigation". Journal of Financial Economics 34, 77-99.
David Méndez-Vives
Lehman Brothers Fixed Income Research
One Broadgate, London EC2M 7HA
Great Britain
[email protected]
Testing the optimality of immunization strategies with transaction costs Eliseo Navarro and Juan M. Nave1
Abstract: In this paper, a dynamic fixed-income portfolio selection model is developed under different stochastic and non-stochastic term structure regimes. The model allows the introduction of transaction costs and shows that, in this context, the maximin strategy against interest rate risk may not be the immunization strategy: it may consist of building up a portfolio with an initial duration shorter than the investor's planning period. The optimality of the solutions provided by the model is tested using simulation techniques.
1. Introduction

One of the main results concerning the development of portfolio strategies against interest rate risk is the Dynamic Global Immunization Theorem enunciated by Khang (1983). According to this theorem, to guarantee a final portfolio value at the end of a given period of time, independently of interest rate changes, the optimal strategy is to keep, over time, the portfolio duration equal to the remaining investor's planning period. Due to the nature of portfolio duration, such a strategy would imply continuous portfolio rebalancing. However, the optimality of this strategy is based on a set of assumptions, including:

(a) the forward interest rate structure g(t), t > 0, changes to g*(t, λ), where λ is a stochastic shift parameter. Specifically, he assumes that

(1)   g^*(t, \lambda) = g(t) + \lambda ;

(b) absence of transaction costs.

The first assumption avoids the problem of the risk of misestimating the term structure behavior, called "immunization risk" by Fong and Vasicek (1983). Bierwag (1987) called it "stochastic process risk", pointing out that if an investor assumes an incorrect hypothesis about term structure changes, the perceived durations would be different from the actual ones, so the investor's losses from misestimation (or misguesstimation) of the correct process could be substantial. This problem has been studied in several papers assuming alternative hypotheses about the term structure behavior, and two different approaches can be distinguished. The first one could be called, according to De Felice and Moriconi (1991), semideterministic, and consists of assuming that instantaneous forward rates may change by a random factor. These sorts of models, however, may allow arbitrage opportunities (Ingersoll, Skelton and Weil (1978)) and lead to different definitions of duration depending on the initial hypothesis about interest rate changes. The second approach is based on non-arbitrage models of the term structure, such as those suggested by Cox, Ingersoll and Ross (1985) or Vasicek (1977). The immunization strategy under these term structure models has been analyzed by Boyle (1978) and Cox, Ingersoll and Ross (1979), where the concept of stochastic duration was first defined.

With respect to the hypothesis of the absence of transaction costs, it becomes crucial in a dynamic context: if transaction costs are taken into account, the strategy of continuous portfolio rebalancing may no longer be optimal, due to the high expenses it would incur. This problem has been analyzed by Maloney and Logue (1989), who measure the impact of transaction costs on the immunization strategy. Bierwag (1987) suggests that continuous portfolio rebalancing in order to keep portfolio duration equal to the remaining investor's planning period may be inefficient. Some years later, Lee and Cho (1992) developed a portfolio selection model using dynamic programming techniques, concluding that if transaction costs are high enough, the optimal strategy against interest rate risk may consist of a partial immunization in order to avoid transaction costs. However, they assume as a boundary condition that the optimal portfolio path must start with an immunized portfolio; this is an arbitrary constraint, which may lead to a suboptimal strategy.

In this article, we try to deal with these problems by developing a dynamic portfolio selection model under different assumptions about the term structure of interest rates (TSIR) behavior. In particular, three different cases are analyzed:

– the most simplistic one, consisting of assuming a flat term structure and parallel TSIR shifts, and
– two alternative non-arbitrage stochastic term structure models which assume that the instantaneous spot interest rate follows a diffusion process: the Vasicek (1977) and Cox, Ingersoll and Ross (1985) models.

The model is based on a previous static one that behaves according to the classical Fisher and Weil (1971) Immunization Theorem in the first case and to Boyle's (1978) Stochastic Immunization in the two stochastic cases. The model is then adapted to a dynamic context and enlarged in order to incorporate transaction costs. In this case, it is shown that, if transaction costs are high enough, the optimal strategy may differ from that proposed by Khang. Finally, the optimality of the solutions provided by the model is tested using simulation techniques, showing their superiority over the immunization strategy.

1 Eliseo Navarro is Professor of Financial Economics at the Universidad de Castilla-La Mancha. Juan M. Nave is Associate Professor of Financial Economics at the same university. This talk was given by the first author at the April 2001 session of the Instituto MEFF-RiskLab Seminar.
2. The static model

As Bierwag and Khang (1979) proved, immunization can be described as a maximin strategy in a game against Nature where the investor's target is to guarantee a minimum return over his planning period or, equivalently, a minimum value at the end of his horizon planning period (HPP). Thus, according to Dantzig (1971), this maximin solution can be worked out by solving an equivalent linear program. This linear program depends on the hypotheses about the term structure of interest rates assumed by the model. So, we first describe a set of common assumptions necessary to model this portfolio selection problem; then different hypotheses about the TSIR are introduced, each of them leading to alternative models, which are described in sections 2.1 (non-stochastic term structure) and 2.2 (stochastic term structure). The set of initial assumptions is the following:

– Financial markets are competitive: individual investors' decisions don't affect interest rates, which are given exogenously.
– Perfect divisibility of financial assets.
– Absence of transaction costs.
– Short sales are not allowed.2
2.1. Non-stochastic term structure models

In this case we make the following hypotheses about the TSIR:

a) The term structure is flat;
b) TSIR changes consist of parallel movements of the whole term structure, i.e., short and long term interest rate changes are equal.

Under this set of assumptions the immunization problem can be modeled as follows. We assume an investor who wants to allocate an amount of I dollars in a market where n different coupon-bearing bonds3 are available. Different portfolio allocations can be considered as the strategies played by the investor. Also, we assume that just after the purchase of the selected portfolio, interest rates may change from their current level (denoted by rc) to any of the values r1, ..., rc, ..., rm, where r1 < ... < rc < ... < rm.

2 This constraint is imposed in the model as a sufficient condition to guarantee that the net income generated by the portfolio is always non-negative throughout the planning period, which is one of the hypotheses of Khang's theorem.
3 In order to isolate interest rate risk we assume these are default-free bonds without any call feature.
Finally, we assume that no additional unexpected interest rate change takes place during the remaining HPP.4 These possible new interest rate values can be regarded as the strategies played by Nature.

Let pi (i = 1, ..., n) be the current price of one unit of asset i and xi the number of units of this asset included in the optimal portfolio. Then an investor strategy consists of a vector (x1, x2, ..., xn) which must verify the following budget constraint:

(2)   \sum_{i=1}^{n} x_i\, p_i = I .

If, just after selecting a strategy, interest rates move from rc to rj, the portfolio value at the end of the HPP (if no additional unexpected interest rate change takes place) is given by the following expression:

(3)   \sum_{i=1}^{n} x_i\, v_{ij} ,

where vij denotes the value at the end of the HPP of an investment of pi dollars in asset i if, as before, interest rates change just after the purchase from rc to rj, remaining unchanged until the end of the HPP. Note that vij is calculated assuming that coupon and principal payments due before the end of the HPP are reinvested at the expected interest rate immediately after the TSIR shift, according to the Pure Expectations Theory.

Denoting by V the minimum final portfolio value the investor wishes to maximize, the portfolio selection process can be modeled as follows:

\max_{x,V}\; V
s.t.   \sum_{i=1}^{n} x_i\, v_{ij} \ge V ,   j = 1, ..., m
       \sum_{i=1}^{n} x_i\, p_i = I
       x_i,\; V \ge 0
4 We assume throughout this paper that the Pure Expectations Hypothesis about the TSIR holds; so, under a flat term structure regime, any interest rate change is considered to be unexpected.
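For illustration, the maximin program above can be solved with any linear programming solver. A minimal sketch with scipy, using toy prices and scenario values (not the paper's data):

    import numpy as np
    from scipy.optimize import linprog

    p = np.array([97.0, 103.5])          # current bond prices (toy numbers)
    v = np.array([[112.0, 109.5],        # v[j, i]: HPP value per unit of bond i
                  [110.8, 110.6],        # under rate scenario r_j
                  [109.7, 111.4]])
    I = 1_000_000.0
    n, m = v.shape[1], v.shape[0]

    # Decision vector (x_1, ..., x_n, V); maximize V <=> minimize -V
    c = np.zeros(n + 1); c[-1] = -1.0
    A_ub = np.hstack([-v, np.ones((m, 1))])   # V - sum_i x_i v_ij <= 0 per scenario
    b_ub = np.zeros(m)
    A_eq = np.append(p, 0.0).reshape(1, -1)   # budget: sum_i x_i p_i = I
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[I],
                  bounds=[(0, None)] * (n + 1))
    print(res.x[:-1], "guaranteed final value:", res.x[-1])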
2.2. Stochastic term structure models

At this point we assume the following hypotheses about the term structure:

a) The instantaneous spot interest rate r(t) follows a diffusion process, so its behavior is described by the following stochastic differential equation:

(4)   dr(t) = f(r(t), t)\, dt + \rho(r(t), t)\, d\tilde z ,

where d\tilde z is a Wiener process with zero mean and variance dt.

b) There are no arbitrage opportunities.

The price P(r(t), t, s) at time t of a pure discount bond which matures at time s (t \le s) is a stochastic variable that verifies (by Ito's lemma) the following differential equation:

(5)   dP = \left( \frac{\partial P}{\partial t} + f \frac{\partial P}{\partial r} + \frac{1}{2}\rho^2 \frac{\partial^2 P}{\partial r^2} \right) dt + \rho \frac{\partial P}{\partial r}\, d\tilde z = P\mu\, dt + P\sigma\, d\tilde z ,

where

(6)   \mu = \frac{1}{P} \left( \frac{\partial P}{\partial t} + f \frac{\partial P}{\partial r} + \frac{1}{2}\rho^2 \frac{\partial^2 P}{\partial r^2} \right)

and

(7)   \sigma = \frac{1}{P}\, \rho\, \frac{\partial P}{\partial r} .

If we assume the Expectations Hypothesis and that there exist no arbitrage opportunities, then the price of a unit discount bond must satisfy the following partial differential equation [Vasicek (1977)]:

(8)   \frac{\partial P}{\partial t} + f \frac{\partial P}{\partial r} + \frac{1}{2}\rho^2 \frac{\partial^2 P}{\partial r^2} - rP = 0 .

At maturity the bond price must be equal to one, so P(r(t), s, s) = 1; this provides the boundary condition necessary to solve equation (8). We now make two specific assumptions about the stochastic process followed by the spot rate.

Case A: Vasicek model [Vasicek (1977)]

(9)   dr(t) = \alpha(\gamma - r(t))\, dt + \rho\, d\tilde z ,

where \alpha, \gamma and \rho are positive constants.
In this case the solution to (8) is given by

(10)   P(r(t), t, s) = \exp\left[ F(T)\,(G - r(t)) - T\,G - \frac{\rho^2}{4\alpha}\, F(T)^2 \right] ,

where:

(11)   T = s - t ,
(12)   F(T) = \frac{1}{\alpha}\left( 1 - \exp(-\alpha T) \right) ,
(13)   G = \gamma - \frac{\rho^2}{2\alpha^2} .
Under this term structure hypothesis, the relative basis risk of a discount bond is given by

(14)   -\frac{1}{P}\frac{\partial P}{\partial r} = F(T) .
As CIR (1979) point out, "if we want stochastic durations to be a proxy for the basis risk5 of coupon bonds with the units of time, it is natural to define it as the maturity of a discount bond with the same risk". Then the portfolio duration under the Vasicek TSIR model, DV, is given by

(15)   D_V = F^{-1}\left[ \frac{\sum_s C(s)\, P(r(t), t, s)\, F(s - t)}{\sum_s C(s)\, P(r(t), t, s)} \right] ,

where C(s) is the stream of cash flows generated by that portfolio and F^{-1}[X] is

(16)   F^{-1}[X] = \frac{-\ln(1 - \alpha X)}{\alpha} .
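Equations (10)-(16) translate directly into code. A sketch of the Vasicek discount bond price and the stochastic duration (the parameter values in the example are illustrative, not the Nowman (1997) estimates used later):

    import numpy as np

    def F(T, a):                      # eq. (12)
        return (1.0 - np.exp(-a * T)) / a

    def F_inv(X, a):                  # eq. (16)
        return -np.log(1.0 - a * X) / a

    def P_vasicek(r, T, a, g, rho):   # eq. (10), with T = s - t
        G = g - rho**2 / (2.0 * a**2)                     # eq. (13)
        return np.exp(F(T, a) * (G - r) - T * G - rho**2 * F(T, a)**2 / (4.0 * a))

    def duration_vasicek(cashflows, times, r, a, g, rho):  # eq. (15)
        times = np.asarray(times, dtype=float)
        pv = np.asarray(cashflows, dtype=float) * P_vasicek(r, times, a, g, rho)
        return F_inv((pv * F(times, a)).sum() / pv.sum(), a)

    # e.g. a 2-year 6% semiannual bond, illustrative parameters:
    print(duration_vasicek([3, 3, 3, 103], [0.5, 1.0, 1.5, 2.0],
                           r=0.0561, a=0.2, g=0.06, rho=0.01))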
Case B: Cox, Ingersoll and Ross model [CIR (1979)]

(17)   dr(t) = \kappa(\mu - r(t))\, dt + \sigma\sqrt{r(t)}\, d\tilde z ,

where \kappa, \mu and \sigma are positive constants. Now the solution to (8) is given by

(18)   P(r(t), t, s) = A(T)\, \exp(-r(t)\, B(T)) ,
5 Basis risk can be defined as the possibility that an institution's margin will rise or fall as a consequence of market rate movements.
where

(19)   A(T) = \left[ \frac{2\lambda\, \exp[(\kappa - \lambda)T/2]}{(\lambda + \kappa)[1 - \exp(-\lambda T)] + 2\lambda\, \exp(-\lambda T)} \right]^{2\kappa\mu/\sigma^2} ,

(20)   B(T) = \frac{2\,(1 - \exp(-\lambda T))}{(\lambda + \kappa)[1 - \exp(-\lambda T)] + 2\lambda\, \exp(-\lambda T)} ,

(21)   \lambda = \sqrt{\kappa^2 + 2\sigma^2} ,

and the portfolio duration under the CIR term structure model, D_CIR, is

(22)   D_{CIR} = B^{-1}\left[ \frac{\sum_s C(s)\, P(r(t), t, s)\, B(s - t)}{\sum_s C(s)\, P(r(t), t, s)} \right] ,

where

(23)   B^{-1}[X] = \frac{1}{\lambda}\, \ln\left[ \frac{2 - (\kappa - \lambda)X}{2 - (\kappa + \lambda)X} \right] .
It is important to point out that under these two term structure regimes the whole TSIR depends on the current instantaneously compounded spot interest rate r(t). The selection model should now be restated in terms of the variable r(t). Thus vij must be redefined as follows:

(24)   v_{ij} = \frac{\sum_s C_i(s)\, P(r_j, t, s)}{P(r_j, t, t_{HPP})} ,

where Ci(s) denotes the payment stream generated by one unit of bond i, P(rj, t, s) is the price at time t of a pure discount bond with maturity at s if the spot instantaneous interest rate becomes rj, and t_{HPP} denotes the end of the HPP.
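As a sketch of how (18)-(21) and (24) combine in practice, the code below computes the CIR discount bond price and the final-value coefficients v_ij fed to the selection program (parameter values are placeholders, not the Nowman (1997) estimates):

    import numpy as np

    def P_cir(r, T, k, mu, sig):      # eqs. (18)-(21)
        lam = np.sqrt(k**2 + 2.0 * sig**2)
        e = np.exp(-lam * T)
        den = (lam + k) * (1.0 - e) + 2.0 * lam * e
        A = (2.0 * lam * np.exp((k - lam) * T / 2.0) / den) ** (2.0 * k * mu / sig**2)
        B = 2.0 * (1.0 - e) / den
        return A * np.exp(-r * B)

    def v_coefficient(cashflows, times, r_j, hpp, k, mu, sig):   # eq. (24)
        pv = sum(c * P_cir(r_j, t, k, mu, sig) for c, t in zip(cashflows, times))
        return pv / P_cir(r_j, hpp, k, mu, sig)  # rolled up to the end of the HPP

    # illustrative: 18-month-horizon value if the spot rate jumps to 7%
    print(v_coefficient([3, 3, 3, 103], [0.5, 1.0, 1.5, 2.0],
                        r_j=0.07, hpp=1.5, k=0.2, mu=0.06, sig=0.08))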
2.3. Model results

To illustrate the former models numerically, we have applied them to a very simplistic case. Let's assume an investor with a horizon planning period of 18 months and a fixed income market where four different default-free coupon bonds are available, whose characteristics are described in Table 1. Additionally, we assume that the investor has an initial budget of one million dollars to allocate among these assets. The hypotheses about the term structure are the following:

Case 1: Flat term structure

In this case we assume a flat term structure with a current interest rate level of 10% (compounded semiannually). Interest rates may move up and down by 100 basis points to 9% or to 11%, i.e. r1 = 9%; r2 = rc = 10%; r3 = 11%.
The optimal solution of this program is shown in Table 2. As we can see, this result is consistent with the Fisher and Weil immunization theorem: the optimal solution consists of a portfolio with a duration equal to the HPP.

Case 2.A: Vasicek model

We assume that the instantaneous spot interest rate follows the diffusion process proposed by Vasicek. In order to give realistic values to the model, we used those estimated in Nowman (1997) for both the US (from the Treasury Bill market) and the UK (sterling one-month interbank rate); see Table 3. Notice that now Nature's strategies consist of the different values the current instantaneous spot rate can take, which we assume can vary 100 basis points (upwards or downwards) from its current level (5.61% for the US and 5.99% for the UK). The optimal solutions are shown in Table 2.

Case 2.B: Cox, Ingersoll and Ross model

Now it is assumed that the instantaneous spot interest rate follows the CIR process. Model parameters are, as before, those estimated in Nowman (1997) for the US and the UK (see Table 3). As before, we assume that the instantaneous spot interest rate may change by 100 basis points from its current value. The optimal solutions are presented in Table 2. It is important to see that "Redington's basic idea is still valid although there are important differences within the framework of the stochastic model of the term structure" [Boyle (1978)], i.e., under stochastic term structure models portfolio immunization consists of making the portfolio duration (properly defined) equal to the remaining HPP.
2.4. Immunization risk

In the former sections we have explicitly assumed a specific term structure behavior where the whole term structure is supposed to depend on a unique factor (the short term interest rate). However, as is well known, the nature of the dynamics of interest rates is by far much more complex.6 So, immunization strategies may fail if the TSIR behavior differs significantly. This is known as immunization risk.7

In order to minimize the immunization risk derived from an unexpected behavior of the term structure, several proposals have been suggested, but on the whole most of them consist of selecting, among the set of immunized portfolios, those that generate a payment stream as concentrated as possible around the end of the HPP. A trivial example would be a portfolio consisting of zero coupon bonds maturing at the end of the HPP: it is an immunized portfolio (duration equal to the HPP length) and simultaneously, due to the total concentration of payments at tk, it would be free of immunization risk.

Although there are several alternatives to measure immunization risk, one of the most usually accepted dispersion measures is that proposed by Fong and Vasicek,8 known as M2. As they proved, by minimizing this quadratic dispersion measure, the effect on the final portfolio value of a TSIR movement different from that assumed is minimized. In particular, they analyze the effect of a TSIR shift consisting of a linear movement of the instantaneous forward rate around the end of the HPP; in this case it is not possible to build up an immunized portfolio, but there is a lower bound for the final portfolio value that depends on M2. The dispersion measure corresponding to bond i can be defined as follows:

(25)   M_i^2 = \frac{\sum_s (s - HPP)^2\, C_i(s)\, P(r(t), t, s)}{\sum_s C_i(s)\, P(r(t), t, s)} ,   i = 1, ..., n ,

6 There is some international evidence that at least 95% of term structure movements can be explained by three factors: parallel shifts, slope changes and curvature changes. Depending on the country analyzed and the period covered by different studies, parallel shifts can explain between 72.2% and 96.88% of the variance of interest rate changes. See for further details Steeley (1990), Strickland (1993), D'Ecclesia and Zenios (1994), Navarro and Nave (1995) and Sherris (1995) for the UK, US, Italy, Spain and Australia respectively.
7 For a review of the effects of non-parallel yield curve shifts on the traditional immunization strategy see Reitano (1992). In Reitano (1991) Khang's result was generalized to any directional yield curve model and to a general multivariate nondirectional model.
where Ci(s) denotes the payment stream generated by bond i, HPP is the length of the investor's horizon planning period, and P(r(t), t, s) is the price at t of a zero coupon bond with maturity at s if the outstanding interest rate at t is r(t). This dispersion measure is introduced in the model by penalizing the objective function, which becomes:

(26)   V - A \sum_{i=1}^{n} M_i^2\, x_i ,
where A > 0 is a coefficient that depends on the investor's immunization risk aversion. If this dispersion measure is applied to the model, the optimal portfolio path consists of immunized portfolios of minimum dispersion, independently of the term structure assumption9 (see Table 4).

8 There are other alternative dispersion measures, such as the M-Absolute, derived from different assumptions about term structure movements; Nawalkha and Chambers (1996) test the use of this latter measure as a first order condition to protect an investment against interest rate risk, instead of using it as a second order condition to minimize immunization risk. Other authors have criticized the M2 measure, suggesting the convenience of including an asset with maturity at the end of the HPP as the best strategy against immunization risk (see Bierwag et al. (1993)). However, in practical terms, these three alternative measures lead to very similar results. For a generalization of immunization risk measures see Balbás and Ibáñez (1998).
9 In fact, any other decreasing function of M2 could be added to the objective function to penalize portfolio dispersion, as far as we are only trying to obtain the immunized portfolio of minimum dispersion.
3. The dynamic model

The portfolio selection model described in the previous section provides a portfolio that is immunized against interest rate risk only at the beginning of the HPP. However, the dynamic behavior of portfolio duration makes it impossible to keep that portfolio immunized during the whole planning period. Moreover, the immunization solution provided by the model is only valid for the current interest rate, so the portfolio has to be adjusted continuously. Only if the portfolio consists of zero coupon bonds with maturity at the end of the HPP would it be possible to keep it immunized along the HPP without any additional rearrangement. The selection model we describe next tries to obtain an optimal rebalancing path to keep the portfolio free of interest rate risk.

First, we make a partition of the HPP into k subintervals of equal length, [t0, t1], [t1, t2], ..., [tk−1, tk], t0 being the beginning of the HPP and tk the end of the HPP, and we assume that portfolio rebalancing is only allowed at the beginning of each subinterval. At these points Nature can play a set of strategies (scenarios about interest rates) analogous to those described in the former static model. But now we have to make a previous hypothesis about the expected interest rate level at the beginning of each subinterval, in order to analyze the impact of interest rate changes just after these rebalancing dates. It must be pointed out that actual interest rates at each ts will differ from those assumed initially, but what really matters in this model is not the level of interest rates but the effects of interest rate changes on the final portfolio value. However, the hypothesis about those future interest rates will differ depending upon the term structure regime we have assumed. In case 1 (flat term structure), if we assume the Expectations Hypothesis about the term structure we have that Et0[r(ts)] = rc, ts > t0, i.e., interest rates are assumed to remain unchanged. If a stochastic term structure model is assumed (case 2.A, Vasicek's model, and case 2.B, the Cox, Ingersoll and Ross model) we have:10
Case 2.A: Vasicek model:

(27)   E_{t_0}[r(t_s)] = \gamma + (r(t_0) - \gamma)\, e^{-\alpha(t_s - t_0)} ,   t_s > t_0 .

Case 2.B: CIR model:

(28)   E_{t_0}[r(t_s)] = \mu + (r(t_0) - \mu)\, e^{-\kappa(t_s - t_0)} ,   t_s > t_0 .
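Both conditional expectations are simple exponential decays of the current rate towards the long-run mean; a two-line sketch:

    import numpy as np

    def expected_rate_vasicek(r0, dt, a, g):    # eq. (27)
        return g + (r0 - g) * np.exp(-a * dt)

    def expected_rate_cir(r0, dt, k, mu):       # eq. (28)
        return mu + (r0 - mu) * np.exp(-k * dt)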
Then, what we analyze in this dynamic model are the effects of unexpected interest rate changes at each ts on the final portfolio value. For the sake of simplicity, and without loss of generality, we assume that there are n different coupon bonds available at t0, each of them maturing at t1, ..., tk−1, tk, ..., tn respectively; coupon payments are also due at the rebalancing points.

10 See Vasicek (1977) and Cox, Ingersoll and Ross (1985).
Now let x(s, i) be the number of units of asset i included in the portfolio at ts (s = 0, 1, ..., k − 1). Then x(s, i) − x(s − 1, i) is the number of units of asset i bought or sold at ts (depending upon x(s, i) − x(s − 1, i) being positive or negative). Let b(s, i) and z(s, i) be the number of units of asset i bought and sold at ts respectively, and p(s, i) the value at ts of one unit of asset i as if Et0[r(ts)] were the spot rate prevailing at ts, i.e., as if no unexpected interest rate change had taken place during the period [t0, ts]. Then the following set of constraints must be satisfied:

(i)    x(0, i) = b(0, i) ,   i = 1, ..., n.
(ii)   x(s, i) − x(s − 1, i) − b(s, i) + z(s, i) = 0 ,   s = 1, ..., k − 1 ,  i = s + 1, ..., n.
(iii)  x(s − 1, s) = z(s, s) ,   s = 1, ..., k − 1.
(iv)   x(k − 1, i) = z(k, i) ,   i = k + 1, ..., n.
The first set of constraints (i) indicates the number of assets bought at the beginning of the HPP, and the second group of constraints (ii) the purchases and sales at each subsequent ts. The third group of constraints (iii) represents the number of units of bond s maturing at ts (s = 1, ..., k − 1), which must be equal to the number of units of asset s held in the optimal portfolio at ts−1, i.e. with one period to maturity; this set of constraints is included in the model to distinguish the assets that are sold at ts from those maturing at ts. Finally, the last group of constraints (iv) indicates that at tk (the end of the HPP) all assets must be sold.

The initial budget constraint must be replaced by:

(29)   \sum_{i=1}^{n} x(0, i)\, p(0, i) = I ,
where I is the amount of money available at the beginning of the HPP. Now the budget constraint must be satisfied not only at t0 but during the whole planning period, so we have to add the following set of budget constraints:

(30)   \sum_{i=s+1}^{n} b(s, i)\, p(s, i) - \sum_{i=s+1}^{n} z(s, i)\, p(s, i) - z(s, s)\, p(s, s) - \sum_{i=s}^{n} C_i\, x(s - 1, i) = 0 ,
for s = 1, ..., k − 1, where Ci is the coupon payment of one unit of bond i, p(s, s) is the face value of asset s (i.e. the asset maturing at ts), and p(s, s) z(s, s) are the principal repayments at ts corresponding to asset s.
What this set of constraints tries to capture is the fact that the amount of money invested in new purchases at each ts (b(s, i) p(s, i)) must come from coupon payments (Ci x(s − 1, i)), sales (z(s, i) p(s, i)), and/or principal repayments (p(s, s) z(s, s)).

As in the static model, we assume that the investor's aim is to maximize at each ts the guaranteed portfolio value at the end of the HPP if an unexpected interest rate change takes place immediately after portfolio rebalancing, i.e., just after each ts. Then, if we denote by Vs the minimum final portfolio value to guarantee at ts, the following set of constraints must be satisfied:

(31)   \sum_{i=s+1}^{n} x(s, i)\, v_{i,j}(s) \ge V_s ,   s = 0, ..., k − 1 ,  j = 1, ..., m ,
where v_{i,j}(s) now denotes the final value of an investment of p(s, i) dollars in asset i at ts if the instantaneous spot interest rate changes, just afterwards, from Et0[r(ts)] to rj and no additional unexpected interest rate change takes place until the end of the HPP, i.e.

(32)   v_{i,j}(s) = \frac{1}{P(r_j, t_s, t_k)} \sum_{r=s+1}^{n} C_i(t_r)\, P(r_j, t_s, t_r) ,
where Ci(tr) represents the payment stream generated by bond i from ts onwards, and P(rj, ts, tr) the value at ts of a unit zero coupon bond with maturity at tr if the prevailing interest rate at ts is rj.

Since the investor's aim is to maximize these minimum portfolio final values at each ts and simultaneously to minimize immunization risk at each ts, the objective function is restated as follows:

(33)   \sum_{s=0}^{k} V_s - A \sum_{s=0}^{k-1} \sum_{i=s+1}^{n} M^2(s, i)\, x(s, i) ,
where M2(s, i) is the Fong and Vasicek dispersion measure corresponding to bond i at ts, defined as follows:

(34)   M^2(s, i) = \frac{\sum_{j=s+1}^{t_i} (t_j - t_k)^2\, C_i(t_j)\, P(E_{t_0}[r(t_s)], t_s, t_j)}{\sum_{j=s+1}^{t_i} C_i(t_j)\, P(E_{t_0}[r(t_s)], t_s, t_j)} ,   s = 0, ..., k − 1 ,  i = s + 1, ..., n ,
where Ci(tj) denotes the payment stream generated by bond i, ti is the term to maturity of bond i, tk is the end of the HPP, and P(Et0[r(ts)], ts, tj) is the price at ts of a zero coupon bond with maturity at tj if the outstanding interest rate at ts is the one expected at t0.
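A sketch of the dispersion measure (34): the PV-weighted average squared distance of each payment date from the end of the HPP, with the discount function passed in (e.g. one of the bond-price functions sketched earlier); written for ts = 0 for brevity.

    def m_squared(cashflows, times, t_end, discount):    # eq. (34), evaluated at t_s = 0
        # discount(t) should return P(E_t0[r(t_s)], t_s, t)
        pv = [c * discount(t) for c, t in zip(cashflows, times)]
        return sum(p * (t - t_end) ** 2 for p, t in zip(pv, times)) / sum(pv)

    # a zero coupon bond maturing at the end of the HPP has M^2 = 0:
    print(m_squared([100.0], [1.5], t_end=1.5, discount=lambda t: 0.93 ** t))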
Then the whole model is:

\max_{x,V}\; \sum_{s=0}^{k} V_s - A \sum_{s=0}^{k-1} \sum_{i=s+1}^{n} M^2(s, i)\, x(s, i)

subject to

\sum_{i=s+1}^{n} x(s, i)\, v_{i,j}(s) \ge V_s ,   s = 0, ..., k − 1 ;  j = 1, ..., m

x(0, i) = b(0, i) ,   i = 1, ..., n

x(s, i) − x(s − 1, i) − b(s, i) + z(s, i) = 0 ,   s = 1, ..., k − 1 ;  i = s + 1, ..., n

x(s − 1, s) = z(s, s) ,   s = 1, ..., k − 1

x(k − 1, i) = z(k, i) ,   i = k + 1, ..., n

\sum_{i=1}^{n} x(0, i)\, p(0, i) = I_0

\sum_{i=s+1}^{n} b(s, i)\, p(s, i) - \sum_{i=s+1}^{n} z(s, i)\, p(s, i) - z(s, s)\, p(s, s) - \sum_{i=s}^{n} C_i\, x(s - 1, i) = 0 ,   s = 1, ..., k − 1

\sum_{i=k+1}^{n} z(k, i)\, p(k, i) + z(k, k)\, p(k, k) + \sum_{i=k}^{n} C_i\, x(k − 1, i) = V_k

x(s, i),\; b(s, i),\; z(s, i),\; V_s \ge 0   \forall s, i .
We proceed to illustrate this model by applying it to the former example, assuming that the HPP (18 months) is subdivided into 3 subperiods of equal length (6 months) and that portfolio rebalancing is only allowed at the beginning of each subinterval. As before, we make three different assumptions about the TSIR behavior. Case 1. Flat term structure. We assume that at the beginning of the HPP the current interest rate is 10% (compounded semiannually) and that interest rates may move upwards and downwards by 100 basis points. Also, the expected interest rate outstanding at the beginning of each subinterval is the current interest rate (we assume the Pure Expectations Hypothesis). The optimal solution paths are reported in panel 1 of Table 5. We can see that this result is consistent with Khang’s Theorem: optimal portfolio duration consist of making duration equal to the remaining HPP at every ts . The small difference between these two variables are due to the fact of considering a finite number of scenarios about interest rate changes.
188
Eliseo Navarro and Juan M. Nave
Case 2. Stochastic term structure models. We assumed that the expected interest rates at the beginning of each subinterval are given by formulae (27) and (28) and that the instantaneously compounded spot interest rates may change by 100 basis points at the beginning of each subinterval. The current spot rate and the model parameters are those assumed in the former example for US and UK. As we can see in panel 1 of Tables 6 to 9, Khang’s Theorem is still valid under this stochastic term structure regimes. Also these results are consistent with those obtained by Gagnon and Johnson (1994) under stochastic interest rate in a discrete time framework11 .
4. Introduction of transaction costs The models described in the two previous sections are based on the assumption that there are no transaction costs and they lead to the solution suggested by Khang. The next step is the introduction of transaction costs in the model to analyze its effects on the optimal solution. We will assume that transaction costs incurred at each portfolio rearrangement are a percentage, α, of the volume traded (in dollars) at each ts . We also assume that principal and coupon repayments don’t generate any transaction costs although other assumptions could be easily implemented. So, asset purchase prices are increased by a percentage α and sale prices are reduce by the same proportion. This new purchase (sale) prices can be understood as the bid (ask) prices of the assets plus (minus) any proportional fee paid to intermediaries. Then, the budget constrains have to be modified as follows: n
p(0, i)(1 + α)b(0, i) = I0
i=1 n
(35)
p(s, i)(1 + α)b(s, i) −
i=s+1
− n i=k+1
n
n
p(s, i)(1 − α)z(s, i) − p(s, s)z(s, s)
i=s+1
Ci x(s − 1, i) = 0 ,
i = 1, . . . , k − 1
i=s+1
z(k, i)p(k, i)(1 − α) + z(k, k)p(k, k) +
n i=k
Ci x(k − 1, i) = Vk
Finally, the model is applied to the former example and the results are presented in panel 2 of Tables 5 to 9 for different a values (0.15%, 0.3%, 0.45% and 0.6%). The first outcome that must be pointed out is the fact that the optimal path depends now on the level of the transaction costs, i.e., on the level of α. So for α=0 we get the solution suggested by Khang: at each rebalancing point portfolio duration 11 In
particular they assume the Black, Derman and Toy (1990) arbitrage-free evolution model.
Testing the optimality of immunization strategies with transaction costs
189
must be equal to the remaining HPP. But for values of α bigger than 0.05%, the optimal path has an initial portfolio that is not immunized any longer. It is important point out that this critical α-values are very low, implying that, in practical terms, the immunization strategy cannot be optimal. In any case, this critical α-values will depend on the assumption about interest rates volatility which has been introduced in the model through a set of scenarios about the interest rates changes (in our example consisting of changes up to 100 basis points from its current level). Then the bigger the variability of interest rates, the bigger the critical α-value. An additional point is that the difference between the initial portfolio duration and the HPP increases as the level of transaction costs rises. In fact, in this simple example four different solutions are obtained; the initial portfolio duration range from 1.5 years (for α=0) to approximately 1,43 years (for α = 0.6%). It is worth pointing out that for high enough transaction costs the optimal solution consist of investing the whole initial budget in a bond with maturity at the end of the HPP. What this strategies suggest is that due to the behavior of portfolio duration, it is possible to take advantage of its evolution along time. If investors build up a portfolio with a duration less than the HPP, as time passes, portfolio duration becomes closer to length of the HPP. So interest rate risk disappears without any additional portfolio rebalancing. Finally, portfolio duration would be greater than the HPP, and then investors should wonder if it is worth avoiding the potential loss derived from an adverse interest rate change by immunizing their portfolio. If transaction costs are too high it could be worth assuming this risk, in order to avoid the loss derived from portfolio rebalancing. If portfolio duration is long enough, this process may be helped by a optimal coupon reinvestment in those bonds with a the shortest duration. However, if the HPP is short, it will not be possible to keep duration equal to the HPP unless we proceed to sell bonds with long durations and invest the proceeds in bonds with shorter duration. It is important to point out that these findings are common to all cases analyzed, i.e., they are independent of the TSIR model assumed. These models provide a first hint to answer the question posed by Maloney and Logue (1989) with respect to the “mismatch duration that is tolerable, given that allowing a modest mismatch will certainly reduce trading costs”.
5. Model testing The model we developed in the previous sections is based on the hypothesis of a given behavior of interest rates. Particularly, we assumed that short term interest rates can move upwards or downwards from its expected value by 100 basis points. This range of interest rate changes was chosen arbitrarily , so in this section we test the optimality of the different strategies if interest rates are allowed to change without any bound
190
Eliseo Navarro and Juan M. Nave
according to the interest rate model assumed (Vasicek or C.I.R’s models). In this case, extreme interest rate changes are possible although its probability of occurrence may be negligible. Theoretically if we don’t assume any bound for interest rate changes the maximin strategy is immunization independently of transaction costs; however very extreme interest rate changes are so unlikely that in practical terms other strategies may yield better results. One point that has to be highlighted is the existence of a trade-off between interest rate risk and the level of transaction costs. If we had assume in the previous model a higher level of interest rate risk (i.e. if we widen the range of interest rate changes) we would have increased the level of the potential loss derived from adverse interest rate movements and so it may be worth to immunize although this strategy can cause higher transaction costs. As we could see there were four different optimal strategies, depending on the level of transaction costs. These strategies are represented in Figure 1. In this Figure 1 we have represented the difference between portfolio duration and the investor’s planning period as a function of time. Strategy 1 (immunization), consists of a permanent portfolio rebalancing in order to avoid all interest rate risk along the whole planning period by keeping portfolio duration equal to the remaining investor’s planning period; this strategy, however, generates the highest transaction costs. Strategy 4, consists of investing all the investor’s budget in the bond maturing at the end of the planning period; thus, at the beginning , its duration is less than the planning period and so the investor is bearing some interest rate risk, but this strategy is the cheapest in terms of transaction costs. Strategies 2 and 3 are intermediate positions between minimization of interest rate risk (Strategy 1) and minimization of transaction costs (Strategy 4). In order to test the optimality of these four strategies we have proceed to simulate the behavior of interest rates according to Vasicek’s model12 . Particularly, we have discretized the model by dividing each period between rebalancing points into 26 subintervals (of approximately one week length13 ). Then (36)
rt+1 − rt = [α (γ − rt )] ∆t + ρ εt
√ ∆t ,
where ∆t = 1/52, εt ∼ i.i.d N (0, 1), r0 = 0.0561 and t = 1, 2, . . . , 78. For each interest rate path, we have calculated the final portfolio value we would have got if we had followed each strategy and repeated the process 10000 times, for different transaction costs levels. In Figure 2 we have drawn the outcomes of these simulations under the hypothesis of the absence of transaction costs. As we can see, Strategy 4 presents the widest range of final portfolio values meanwhile Stratey 1 concentrates all the outcomes 12 We have also, used CIR’s model, and te results are analogous so we explain here Vasicek model for the sake of simplicity. 13 We have assumed weeks of 7.019 days.
Testing the optimality of immunization strategies with transaction costs
191
around the same final porfolio value (the final value we would have got if interest rates had behaved as expected). To illustrate the effects of transaction costs on portfolio returns (or equivalently on final portfolio values ) we have represented in Figure 3 the final portfolios values under different transaction cost levels . On the whole, transaction costs make final portfolio values to move leftwards and the higher the level, the deeper the movement to the left. However , as Figure 3 shows transaction costs don’t affect all strategies in the same way. This leftwards movement is more intensive for Strategy 1, i.e. immunization strategy, meanwhile the effect of transaction costs is minimized when following Strategy 4. Due to this behavior of transaction costs it may happen that Strategies different from immunization may provide the maximun minimun return. This is shown in Table 10 where we provide the maximun, minimun and mean final portfolio values (after 10000 iterations) we got when following the four strategies described earlier for three different levels of transaction costs (α=0, 0.5 and 1 per cent). For instance, we can see that when α= 1 % strategy 4 yielded a minimum final portfolio value that is higher than the minimum portfolio value guaranteed by immunization although, theoretically, immunization is the maximin strategy when interest rates are not bounded.
6. Conclusions In this paper we have developed a dynamic portfolio selection model for interest rate risk management under different TSIR regimes. This model leads to a result which is consistent with Khang’s Dynamic Immunization Strategy consisting of a continuous rebalancing to keep portfolio duration equal to the investor’s HPP. The model is then enlarged in order to allow the introduction of transaction costs to analyze its effects on the optimal strategy. The results obtained through a very simple example suggest that if transaction costs are taken into account, the strategy consisting of making portfolio duration equal to the HPP is not optimal any longer. Moreover, the optimal path has an initial solution with a portfolio duration less than the HPP. Furthermore, the bigger the level of transaction costs, the bigger the difference between the initial portfolio duration and the HPP. This result held under different TSIR models. Finally, we have tested using simulation techniques, the optimality of these strategies when interest rate movements are not bounded concluding that if transaction costs are high enough, strategies consisting of portfolios with an initial duration less than the investor’s planning period provide the maximun minimun return, although theoretically inmunization is the maximin strategy.
192
Eliseo Navarro and Juan M. Nave
Appendix: Tables and Figures Table 1: Asset characteristics
Asset Asset Asset Asset
Maturity(1) 0.5 1 1.5 2
1 2 3 4
Coupon(2) 10% 10% 10% 10%
Duration(3) 0.5 0.9762 1.4297 1.8616
(1) Years. (2) Paid half-yearly. (3) Macaulay duration.
Table 2: Optimal strategies in a static framework Panel 1 NON STOCHASTIC MODEL x1 x2 x3 x4 2639.536 0 0 7360.465
US UK
US UK
x1 0 1044.207
x1 2858.766 0
Panel 2 VASICEK MODEL x2 x3 0 7886.288 0 4745.310 Panel 3 CIR MODEL x2 x3 0 0 3919.526 0
Duration(1) 1.5006
x4 1516.520 3676.038
Duration(1) 1.5010 1.4999
x4 6843.708 5545.140
Duration(1) 1.5012 1.5010
(1) Portfolio duration are calculated according to Macaulay, and formulae (15) and (22) for the non stochastic model, Vasicek model and CIR model respectively.
Testing the optimality of immunization strategies with transaction costs
193
Table 3: Parameter values of Vasicek and CIR models These parameters were estimated by Nowman (1997) using a discrete time model which reduces some of the temporal aggregation bias. The data used are US Treasury Bill one month yields from June 1964 to December 1989 and one month sterling interbank rate from March 1975 to March 1995.
US UK
a 0.0506 0.0311
US UK
k 0.0373 0.0279
Panel 1 VASICEK MODEL g r2 0.06917 0.0001 0.1028939 0.0001 Panel 2 CIR MODEL m s2 0.0697051 0.0008 0.1039427 0.0007
Current r(t)(1) 0.0561 0.0599
Current r(t)(1) 0.0561 0.0599
(1) April 1997.
Table 4: Optimal strategies of minimum dispersion in a static framework(a)
x1 0
US UK
x1 0 0
US UK
x1 0 0
Panel 1 NON STOCHASTIC MODEL x2 x3 x4 0 8382.429 1617.571 Panel 2 VASICEK MODEL x2 x3 x4 0 7914.367 1488.932 0 7966.707 1494.853 Panel 3 CIR MODEL x2 x3 x4 0 7916.721 1485.324 0 7956.605 1506.982
Duration(b) 1.4996
Duration(b) 1.4995 1.4999
Duration(b) 1.4999 1.5005
(a) Dispersion measure is calculated according to Fong and Vasicek M 2 formula (25). (b) Portfolio duration are calculated according to Macaulay, and formulae (15) and (22) for the non stochastic model, Vasicek model and CIR model respectively.
194
Eliseo Navarro and Juan M. Nave
Table 5: Optimal portfolio path under a flat term structure regime The α value represents the level of transaction costs as a percentage of the volume traded; α=0 means the absence of transaction costs. In this case the optimal strategy is consistent with Khang’s theorem, i.e., at each rebalancing point the portfolio has to be restructured in order to keep its duration equal to the remaining HPP.
s 0 0.5 1
x(s, 1) 0 0 11025.000
x(s, 2) 0 9951.796 0
s 0 0.5 1
x(s, 1) 0 0 10470.523
x(s, 2) 0 9947.132 536.395
s 0 0.5 1
x(s, 1) 0 0 10453.338
x(s, 2) 0 9931.548 535.554
s 0 0.5 1
x(s, 1) 0 0 10436.212
x(s, 2) 0 9916.016 534.717
s 0 0.5 1
x(s, 1) 0 0 10953.022
x(s, 2) 0 10434.412 0
Panel 1 α = 0.00% x(s, 3) 8382.429 548.204 0 Panel 2 α = 0.15% x(s, 3) 9448.628 536.395 0 α = 0.30% x(s, 3) 9434.535 535.554 0 α = 0.45% x(s, 3) 9420.485 534.717 0 α = 0.60% x(s, 3) 9940.359 0 0
(a) Portfolio duration are calculated according to Macaulay.
x(s, 4) 1617.571 0 0
Durationa 1.4996 0.9999 0.5000
x(s, 4) 536.395 0 0
Durationa 1.4529 0.9994 0.5232
x(s, 4) 535.554 0 0
Duration(a) 1.4529 0.9994 0.5232
x(s, 4) 534.717 0 0
Duration(a) 1.4529 0.9994 0.5232
x(s, 4) 0 0 0
Duration(a) 1.4297 0.9762 0.5000
Testing the optimality of immunization strategies with transaction costs
195
Table 6: Optimal portfolio path under Vasicek TSIR model using US data The α value represents the level of transaction costs as a percentage of the volume traded; α=0 means the absence of transaction costs. In this case the optimal strategy is consistent with Khang’s theorem, i.e., at each rebalancing point the portfolio has to be restructured in order to keep its duration equal to the remaining HPP.
s 0 0.5 1
x(s, 1) 0 0 10366.868
x(s, 2) 0 9363.004 0
s 0 0.5 1
x(s, 1) 0 0 10348.526
x(s, 2) 0 9344.520 0
s 0 0.5 1
x(s, 1) 0 0 9810.844
x(s, 2) 0 9329.912 512.549
s 0 0.5 1
x(s, 1) 0 0 10316.518
x(s, 2) 0 9836.590 0
s 0 0.5 1
x(s, 1) 0 0 10299.744
x(s, 2) 0 9821.278 0
Panel 1 α = 0.00% x(s, 3) 7914.367 510.227 0 Panel 2 α = 0.15% x(s, 3) 8893.123 513.351 0 α = 0.30% x(s, 3) 8879.864 512.549 0 α = 0.45% x(s, 3) 9387.450 0 0 α = 0.60% x(s, 3) 9373.477 0 0
(a) Portfolio duration are calculated according to formula (15).
x(s, 4) 1488.932 0 0
Duration(a) 1.4995 0.9999 0.5000
x(s, 4) 513.351 0 0
Duration(a) 1.4543 1.0002 0.5000
x(s, 4) 512.549 0 0
Duration(a) 1.4543 1.0002 0.5238
x(s, 4) 0 0 0
Duration(a) 1.4305 0.9764 0.5000
x(s, 4) 0 0 0
Duration(a) 1.4305 0.9764 0.5000
196
Eliseo Navarro and Juan M. Nave
Table 7: Optimal portfolio path under Vasicek TSIR model using UK data The α value represents the level of transaction costs as a percentage of the volume traded; α=0 means the absence of transaction costs. In this case the optimal strategy is consistent with Khang’s theorem, i.e., at each rebalancing point the portfolio has to be restructured in order to keep its duration equal to the remaining HPP.
s 0 0.5 1
x(s, 1) 0 0 10433.992
x(s, 2) 0 9426.207 0
s 0 0.5 1
x(s, 1) 0 0 10415.415
x(s, 2) 0 9416.569 0
s 0 0.5 1
x(s, 1) 0 0 9886.797
x(s, 2) 0 9401.809 504.493
s 0 0.5 1
x(s, 1) 0 0 9870.596
x(s, 2) 0 9387.090 503.703
s 0 0.5 1
x(s, 1) 0 0 10365.058
x(s, 2) 0 9882.670 0
Panel 1 α = 0.00% x(s, 3) 7966.707 511.094 0 Panel 2 α = 0.15% x(s, 3) 8960.509 505.285 0 α = 0.30% x(s, 3) 8947.112 504.493 0 α = 0.45% x(s, 3) 8933.754 503.703 0 α = 0.60% x(s, 3) 9431.205 0 0
(a) Portfolio duration are calculated according to formula (15).
x(s, 4) 1497.853 0 0
Duration(a) 1.4999 0.9999 0.5000
x(s, 4) 505.285 0 0
Duration(a) 1.4542 0.9998 0.5000
x(s, 4) 504.493 0 0
Duration(a) 1.4542 0.9998 0.5233
x(s, 4) 503.703 0 0
Duration(a) 1.4542 0.9998 0.5233
x(s, 4) 0 0 0
Duration(a) 1.4308 0.9764 0.5000
Testing the optimality of immunization strategies with transaction costs
197
Table 8: Optimal portfolio path under CIR TSIR model using US data The α value represents the level of transaction costs as a percentage of the volume traded; α=0 means the absence of transaction costs. In this case the optimal strategy is consistent with Khang’s theorem, i.e., at each rebalancing point the portfolio has to be restructured in order to keep its duration equal to the remaining HPP.
s 0 0.5 1
x(s, 1) 0 0 10365.467
x(s, 2) 0 9364.264 0
s 0 0.5 1
x(s, 1) 0 0 10346.979
x(s, 2) 0 9361.767 0
s 0 0.5 1
x(s, 1) 0 0 9827.833
x(s, 2) 0 9347.161 494.133
s 0 0.5 1
x(s, 1) 0 0 9811.754
x(s, 2) 0 9332.549 493.360
s 0 0.5 1
x(s, 1) 0 0 10297.982
x(s, 2) 0 9819.790 0
Panel 1 α = 0.00% x(s, 3) 7916.720 507.666 0 Panel 2 α = 0.15% x(s, 3) 8910.619 494.905 0 α = 0.30% x(s, 3) 8897.360 494.133 0 α = 0.45% x(s, 3) 8884.090 493.360 0 α = 0.60% x(s, 3) 9372.248 0 0
(a) Portfolio duration are calculated according to formula (22).
x(s, 4) 1485.324 0 0
Duration(a) 1.4999 1.0000 0.5000
x(s, 4) 494.905 0 0
Duration(a) 1.4539 0.9995 0.5000
x(s, 4) 494.133 0 0
Duration(a) 1.4539 0.9995 0.5230
x(s, 4) 493.360 0 0
Duration(a) 1.4539 0.9995 0.5230
x(s, 4) 0 0 0
Duration(a) 1.4309 0.9764 0.5000
198
Eliseo Navarro and Juan M. Nave
Table 9: Optimal portfolio path under CIR TSIR model using UK data The α value represents the level of transaction costs as a percentage of the volume traded; α=0 means the absence of transaction costs. In this case the optimal strategy is consistent with Khang’s theorem, i.e., at each rebalancing point the portfolio has to be restructured in order to keep its duration equal to the remaining HPP.
s 0 0.5 1
x(s, 1) 0 0 10433.109
x(s, 2) 0 9427.115 0
s 0 0.5 1
x(s, 1) 0 0 9901.047
x(s, 2) 0 9414.835 505.935
s 0 0.5 1
x(s, 1) 0 0 9898.075
x(s, 2) 0 9413.339 492.139
s 0 0.5 1
x(s, 1) 0 0 9881.896
x(s, 2) 0 9398.639 491.371
s 0 0.5 1
x(s, 1) 0 0 10364.794
x(s, 2) 0 9882.616 0
Panel 1 α = 0.00% x(s, 3) 7956.605 509.365 0 Panel 2 α = 0.15% x(s, 3) 8959.023 505.935 0 α = 0.30% x(s, 3) 8958.877 492.139 0 α = 0.45% x(s, 3) 8945.533 491.371 0 α = 0.60% x(s, 3) 9430.492 0 0
(a) Portfolio duration are calculated according to formula (22).
x(s, 4) 1506.982 0 0
Duration(a) 1.5005 0.9999 0.5000
x(s, 4) 505.935 0 0
Duration(a) 1.4542 0.9998 0.5234
x(s, 4) 492.139 0 0
Duration(a) 1.4537 0.9992 0.5228
x(s, 4) 491.371 0 0
Duration(a) 1.4537 0.9992 0.5228
x(s, 4) 0 0 0
Duration(a) 1.4309 0.9764 0.5000
Testing the optimality of immunization strategies with transaction costs
199
Table 10: Description of the data generated by the simulation of the optimal strategies with different levels of transaction costs
STRATEGY STRATEGY STRATEGY STRATEGY
STRATEGY STRATEGY STRATEGY STRATEGY
STRATEGY STRATEGY STRATEGY STRATEGY
1 2 3 4
TRANSACTION COSTS = 0 % Final Values MINIMUM MAXIMUM MEAN 1088500 1088600 1088525 1087250 1089700 1088525 1087150 1089950 1088525 1086650 1080300 1088525
1 2 3 4
TRANSACTION COSTS = 0’5% Final Values MINIMUM MAXIMUM MEAN 1080950 1081050 1080975 1080750 1083350 1082068 1080700 1083750 1082353 1080300 1084800 1082615
1 2 3 4
TRANSACTION COSTS = 1% Final Values MINIMUM MAXIMUM MEAN 1073500 1073600 1073530 1074250 1076900 1075682 1074450 1077700 1076243 1074900 1078650 1076762
Figure 1: Interest rate risk assumed by optimal strategies 0,1
Strategy 4
HPP-DURATION
Strategy 3 Strategy 2
0,05
0
-0,05 Strategy 1 (immunization) -0,1 0
0,25
0,5 TIME
0,75
1
1,25
1,5
Eliseo Navarro and Juan M. Nave
Figure 2: Outcomes of optimal strategies. Transaction costs 0% 1000 900
Strategy 1
800 700
Frecuency
600
Strategy 2
500 400
Strategy 4
300
Strategy 3
200 100 0 1086225 108647 108672108697 5 108722 5 108747108772 5 5 108797 5 1088225 5 108847108872 5 108897 5 1089225108947 5 Final portfolio Value 5 108972 108997 1090225 5 5 5
Figure 3: Effect of transaction costs on optimal strategies
E FFE C T OF TRANS AC TION C OSTS ON OP TIMAL S TR ATEGIE S 1000
TRA NS ACTION CO S T S 1 %
TRA NS ACTIION CO S T S 0'5 % TRA NS ACTIION CO S T S 0 %
900 800
S trategy 1 S trategy 1
700 FRE CUE NCY
200
600
S trategy 2
500 S trategy 3
400 300
S trategy 2
S trategy 4
200 S trategy 4 100 0 1073025
S trategy 3
1075025
1077025
1079025
1081025
1083025
FINAL P ORTFOLIO VALUE
1085025
1087025
1089025
1091025
Testing the optimality of immunization strategies with transaction costs
201
References ´ s, A. and Iba ´n ˜ ez, A. (1998): When can you inmunize a bond portfolio? [1] Balba Journal of Banking and Finance 22, 1571–1595. [2] Bierwag, G.O. (1987): Duration Analysis. Managing Interest Rate Risk. Ed. Bellinger, Cambridge, Mass. [3] Bierwag, G.O.; Fooladi, I. and Roberts, G.S. (1993): Designing an immunized portfolio: Is M-squared the key? Journal of Banking and Finance 17, 1147–1170. [4] Bierwag, G.O. and Khang, C. (1979): An immunization Strategy is a Minimax Strategy. Journal of Finance 34, 389–399. [5] Black, F.; Derman, E. and Toy, W. (1990): A One Factor Model of Interest Rates and its Application to Treasure Bond Options. Financial Analysts Journal 46, 33–39. [6] Boyle, P.P. (1978): Immunization Under Stochastic Models of the Term Structure. Transactions of the Faculty of Actuaries 108, 179–187. [7] Cox, J.C.; Ingersoll, J.E. and Ross, S.A. (1979): Duration and the Measurement of Basis Risk. Journal of Business 52, 51–61. [8] Cox, J.C.; Ingersoll, J.E. and Ross, S.A. (1985): A Theory of the Term Structure of Interest Rates. Econometrica 53, 385–407. [9] Dantzig, G.D. (1971): A Proof of the Equivalence of Programming Problem and the Game Problem. In Activity Analysis of Production and Allocation, T.C. Koopmans, ed., Yale University Press. [10] D’Ecclesia, L. and Zenios, S.A. (1994): Risk Factor Analysis and Portfolio Immunization in the Italian Bond Market. Journal of Fixed Income 4, 51–58. [11] De Felice, M. and Moriconi, F. (1991): La teoria dell’immunizzazione finanziaria. Modelli e strategie. Il Mulino, Bologna. [12] Fisher, L. and Weil, R.L. (1971): Coping with the risk of interest rates fluctuations: Returns to bondholders from naive and optimal strategies. Journal of Business 44, 408–431. [13] Fong, H.G. and Vasicek, O. (1983): The Tradeoff Between Return and Risk in Immunized Portfolios. Financial Analysts Journal 39, 73–78. [14] Gagnon, L. and Johnson, L.D. (1994): Dynamic Immunization under Stochastic Interest Rates. The Journal of Portfolio Management 20, 48–55. [15] Ingersoll, J.E.; Skelton, J. and Weil, R.L. (1978): Duration Forty Years Later. Journal of Financial and Quantitative Analysis 13, 627–650. [16] Khang, C. (1983): A Dynamic Global Portfolio Immunization Strategy in the world of Multiple Interest Rates Changes: A Dynamic Immunization and Minimax Theorem. Journal of Financial and Quantitative Analysis 18, 355–362.
202
Eliseo Navarro and Juan M. Nave
[17] Lee, S. B. and Cho, H. Y. (1992): A Rebalancing Discipline for an Immunization Strategy., The Journal of Portfolio Management 18, 56–62. [18] Maloney, K.J. and Logue, D.E. (1989): Neglected complexities in Structured Bond Portfolio. The Journal of Portfolio Management 15, 59–68. [19] Navarro, E. and Nave, J.M. (1995): An´ alisis de los factores de riesgo en el mercado espa˜ nol de Deuda P´ ublica. Cuadernos Aragonenses de Econom´ıa 5, 331–341. [20] Nawalkha, S.K. and Chambers, D.R. (1996): An Improved Immunization Strategy: M-Absolute. Financial Analysts Journal 52, 69–76. [21] Nowman, K.B. (1997): Gaussian Estimation of Single-factor Continuous Time Models of the Term Structure of Interest Rates. Journal of Finance 52, 1695– 1706. [22] Reitano, R.R. (1991): Multivariate Immunization Theory. Transactions of the Society of Actuaries 43, 392–441. [23] Reitano, R.R. (1992): Non-Parallel Yield Curve Shifts and Immunization. The Journal of Portfolio Management 18, 36–43. [24] Sherris, M. (1995): Interest Rate Risk Factors in the Australian Bond Market. AFIR International Colloquium 2, 859–869. [25] Steeley, J.M. (1990): Modeling the Dynamics of the Term Structure of Interest Rates. The Economic and Social Review 21, 337–361. [26] Strickland, C.R. (1993): Interest Rate Volatility and the Term Structure of Interest Rates. FORC Preprint., Vol 93/37. [27] Vasicek, O. (1977): An Equilibrium Characterization of the Term Structure. Journal of Financial Economics 5, 177–188.
Eliseo Navarro Departamento de Econom´ıa y Empresa Universidad de Castilla-La Mancha Plaza de la Universidad, 1. 02071-Albacete [email protected]
Juan M. Nave Departamento de Econom´ıa y Empresa Universidad de Castilla-La Mancha Avenida Alfares, 44. 16041-Cuenca [email protected]
Embedded options and integrated asset-liability management for life insurance Gabriele F. Susinno1
Abstract: In this paper we describe life insurance contracts as a portfolio of financial options. This type of policy constitutes the bulk of mathematical reserves of continental European insurance companies. A close examination of a typical contract reveals an exchange of options between policy holders and the Insurance company whereby the former is long a floor option (the minimum guaranteed return) on the fund and short a call on the fund excess return relative to the floor. From an insurance company’s point of view, this amounts to holding a portfolio of financial options vis-` a-vis the client (the most common types of options included in Insurance contracts are the standard European and cliquet with European exercise options). This framework can be successfully used to support strategic decisions at a firm-wide level: return on risk capital, product design and innovation, risk management, asset benchmark selection and hedging strategies.
1. Introduction Financial derivatives and insurance contracts are joined by the same origins, indeed both are based on the notion of fair expectation with respect to the realization of a random event (risk). The principal difference between the actuarial valuation approach (Insurance) and the Capital Markets approach (Finance) is based on the typology and phenomenology of risks considered. Indeed the mathematical paradigm of risk in insurance is based on a long term view controlled through the initial condition settings at the inception of the insurance policy [3]. While actuaries focuses their know-how on risks such as mortality or property hazard, Capital Markets tools are aimed to use or contrast the random evolution of financial securities. The paradigm of financial mathematics being based on a short term view implying a constant adaptation to changes. For both the basic concept is to provide insight into decision making in 1 Gabriele F. Susinno es doctor en F´ ısica experimental de part´ıculas y trabaj´ o durante a˜ nos en el CERN de Ginebra. En los u ´ltimos tiempos se ha interesado por las Finanzas y la Gesti´ on del Riesgo; actualmente es Managing Director de la firma Capital Management Advisors de Roma. Esta charla se imparti´ o en la sesi´ on del Seminario Instituto MEFF-RiskLab de mayo de 2001.
204
Gabriele F. Susinno
the face of uncertainty. The emergence of finance related insurance products such as index linked policies, catastrophe bonds, catastrophe future options, and the convergence to a low inflation-low interest rates environment creates the demand for a more dynamic interplay between insurance and finance. The standard operativity in life insurance market tend to re-conciliate the liability status with the related investments few times over the life cycle of the product (once a year or quarterly is a common practice). Often there is a little concern on the coherency of methodologies used to estimate the levels of Assets (Financial Side) and Liabilities (Actuarial Side). From a financial risk management point of view this operativity is a paradox! A life insurance contract is nothing else than an investment with a financial guarantee, where the guarantee can be exercised only on the occurrence of particular events (often independent of the market evolution). The insurance has sold exotic, long lived put options to its clients. This book of options, embedded in the liabilities evaluation process, should be monitored almost in real time and assets must be managed coherently to minimize the risk of loss (or even bankruptcy). Assets must be managed in order to maximize the probability to satisfy the constraints imposed by the contingent claim positions hidden in the liabilities side. A critical problem to achieve this goal is in the information flow process, and data quality. Indeed for investments in the Capital Markets, a real time information stream allowing for a constant survey of a segregated fund evolution is available. The update of the liabilities side of the balance sheet undergoes through a totally different process. In a risk management view the reconciliation between liabilities and assets on a frequent basis (e.g. daily or weekly) is mandatory to maximize the efficiency of any dynamic asset allocation strategy. In this framework we have developed a set of operational tools to interface actuarial analysis with financial engineering for life insurances. The main target of our efforts has been toward the application of portfolio insurance principles to life insurance segregated funds. In this context the objective is given by the liabilities profile and assets are managed accordingly. The idea to build-up a computer system dedicated to the ALM dates back to September 2000 when a group of professionals in collaboration with Arthur Andersen decided to open a dedicated business unit specialized on advisory services in the field of Dynamic Hedging and Asset and Liability Management. In this talk we report on a rationalization of the problem, and we outline main technical issues we have faced in the technical implementation. We are sincerely grateful to Prof. H. Buhlmann, to Prof. P. Embrechts, and to Prof. F. Delbaen for the useful discussions we had at the ETHZ. A warm acknowledgement to Dr V. Henderson for the interest shown in the problem of ALM for Life Insurance.
Embedded options and integrated ALM for life insurance
205
2. A contingent claim approach Leaving aside life and death probabilities, and consider a single-premium policy for which the Company is committed to pay a compounded amount at maturity T. Indeed, we will first focus entirely on financial risks, therefore the reader may also consider the products analyzed as a specific form of guaranteed investment contracts or GICs (see e.g. [2]). Consider a single cohort of Life insurance policies maturing at time t = T . At t = 0, the policyholder pays a premium L0 to be invested on an asset portfolio. For regulatory reasons, the Company will participate with its own capital for an amount equal to λL0 to the acquisition of the asset portfolio. The starting value invested in the portfolio will be: A0 = L0 (1 + λ) . The terms of the Life insurance cohort entitle the policyholders to receive at maturity an amount equal to: AT − A 0 rg T L0 · max e ; 1 + β . A0 Where: rg = annual interest rate; β = policyholder participation rate; L0 = single premium invested in t = 0; AT = value in t = T of the single premiuminvested in t = 0. In the typical case (for instance, in the Italian context) rg and β are contractual aspects which do not change over time. In some cases β may differ from contract to contract as a function of the premium amount. At maturity, the value of the policyholder investment is given by: AT − A 0 AT − A 0 T rg T − L0 1 + β + max L0 e ;0 ΠL = L0 1 + β A0 A0 1 βAT + max (1 + λ)L0 erg T − 1 + β − βAT ; 0 = (1 − β)L0 + 1+λ
K β − AT ; 0 (1) = (1 − β)L0 + AT + max 1+λ β where: (2)
K = L0 (1 + λ) erg T − (1 − β) .
2.1. Option decomposition Let us consider the ownership of the asset portfolio ΠA as the sum of two components, (3)
t t t ΠA = ΠE + ΠL ,
where the index A refers to the assets portfolio, which is equal to the equities E (shareholders) portfolio plus liabilities L (policyholder) portfolio.
206
Gabriele F. Susinno
At time t = 0 the policyholders’ premiums are paid in the fund for an amount of L0 , while the Company will contribute for an amount of λL0 . Therefore the initial balance can be written as: Πt=0 L
= L0 =
Πt=0 E
=
(4)
1 At=0 ; 1+λ
λ At=0 . 1+λ
Figure 1: Policyholder payoff at maturity.
The policyholder finances the implicit cost of the minimum guaranteed return by granting to the insurance company a participation of (1 − β) on the fund yield. Clearly this will happen only in case of over-performance of the latter compared to the minimum guaranteed return. No transfer of resources takes place between the two parties, however options are exchanged. Therefore the initial fair portfolio of options, embedded in the Insurance contract, must be worthless. Given a determined participation β of the policyholder in the profits at maturity, and given a lower bound on the final policyholder return determined by the guaranteed rate, the final payoff for the policyholder (Fig. 1) can be decomposed in a series
Embedded options and integrated ALM for life insurance
207
of claims. The claims are contingent to the final worth AT of the trading account of a given reference portfolio. Hence, the liabilities portfolio at a time t ≤ T can be written as: 1 L = ZC + At + P (At ; K, σA , T − t) − C(At ; K, σA , T − t)+ Πt 1+λ K (5) + βC At ; , σA , T − t β
(6)
⇐⇒
L Πt =
+
t
Πcc +
1 At , 1+λ
where ZC corresponds to the value of a Zero Coupon at time t and it is given by: ZC = L0 (1 − β)e−rf (T −t) , and where: rf P C + t Πcc
= = = =
continuously compounded risk-free rate; cost of the put option sold to the policyholder (guaranteed yield); cost of the call option; portfolio containing the contingent claims/bonds positions of the policyholder at time t.
Obviously there is not a unique possibility of decomposing the portfolio2 as the one showed in Eq. 6. However, the decomposition 6 highlight a put option with the smallest strike compatible with the terms of the contract. Indeed, the insurance company has sold to the policyholder a put option with a strike K. Replicating only this position will minimize the downside risks of the final payoff. Moreover, in absence of a mark-up required by the insurance company, it is evident that the initial conditions given by Eq. 4 imply that + Πt=0 cc is worthless. If the options exchanged between the Insurance and the insured are of the same type (e.g. plain vanilla European options) the the following evidence can be deduced: Proposition 1 Given the constraint Πt=0 = L0 then: L rg ≤ rf
∀ β ∈]0; 1] .
Proof. Using the Put-Call parity we have P (At ; K, σA , T − t) − C(At ; K, σA , T − t) ≥ Ke−rf (T −t) − At . Inserting the previous inequality in the Eq. 6 at t = 0 we obtain: −r T rg T β K −rf T f C A0 ; , σA , T ≤ L0 + L0 e −1+β e + (1 − β)L0 e 1+λ β 2 E.g.
the portfolio may be seen as as a zero coupon plus a portion of a call.
208
Gabriele F. Susinno
β K C A0 ; , σA , T ≤ L0 1 − e(rg −rf )T =⇒ 1+λ β Since the call price must positive, it follows that ∀β ∈]0; 1]: rg ≤ rf
The structure of the insurance positions (Fig. 2) at time t ≤ T , is deduced from Eq. 3 and can be written as: (7)
t ΠE =
λ At + 1+λ
−
t
Πcc ,
where the superscript “−” on the left side of Π indicate that the company is taking the opposite contingent claim positions of the policyholder.
Figure 2: Decomposition of the insurance payoff at maturity.
Embedded options and integrated ALM for life insurance
209
2.2. Embedding actuarial probabilities An exhaustive treatment of the following arguments may be found in [7]. Let be payoff (td , g) the payoff of a policy with guarantee g at time td corresponding to the death event of the insured. The time horizon over which the policy is defined is T and the probability density function of the death event i.e. the probability that the death of a subject aged of a will occur in the interval [t, t + dt] is fa (t). If Ta is the time-to-death of an a years old insured, then the survival probability is defined by: P (Ta > t) = 1 − Fa (t) , ∞ where Fa (t) = t fa (s)ds. Finally, let ma (t) be the hazard rate, i.e. the conditional probability that a policy holder who has survived up to time t will die in the time interval [t, t + dt]. It is straightforward that: ma (t) =
fa (t) . 1 − Fa (t)
A standard life insurance contract can be seen as the combination of two components: a term insurance in which the payment is made conditional to a dead event i.e. td ∈ [0, T ] and a pure endowment which is conditional to the fact that the policy holder has survived the maturity of the contract T , i.e. td > T . Indeed it s useful for the reasoning to keep them separate since the un-hedgeable mortality risk intervene in a different way in the two components. Let suppose a virtual equity evolving in a Black-Scholes world, the question is how the Black-Scholes investment strategies in bonds and risky assets to replicate the claims are affected by the additional source of randomness given by the mortality process? In a standard Black and Scholes formalism, is possible to derive a strategy {φ1 , φ2 } in such a way that: V (t) = w[SN (wd1 ) − ge−r(T −t) N (wd2 )] , where w = 1 for an European call and w = −1 and w = −1 for an European put option, exactly replicates the process of the option price. In this case, (8)
φ1
= wN (wd1 ) ,
φ2
= we−r(T −t) N (wd2 )
and N (·) is the cumulative function and di i = 1, 2 are parameters depending on market conditions and on terms of the claim contract (see e.g. [5]). In the insurance case, we can hope to work out a similar framework by carefully taking into account death probabilities [7]. In the term insurance contract, supposing that the death of the insured lie in the time interval [ν, ν + dν], the Black-Scholes strategy would be to hold a portfolio φν1 , φν2 for each possible time of death ν weighted by the corresponding probability to occur.
210
Gabriele F. Susinno
Clearly, the insurance company holds the portfolio corresponding to a client only if he is still alive, therefore we introduce the function 1(Ta >t) which is nil if the condition on parentheses is false. In this case for the term insurance we obtain: φterm 1
= 1(Ta >t)
φterm 2
= 1(Ta >t)
φν1 ma (ν)[1 − Fa+t (ν − t)]dν ,
t
(9)
T
t
T
φν2 ma (ν)[1 − Fa+t (ν − t)]dν ,
where φνi are equivalent to those described on Eq. 1 but where the maturity has been replaced by the time ν of the death event. Note that ma (ν)[1 − Fa+t (ν − t)]dν = P(ν < Ta < ν + dν | Ta > t) is the probability that the death event take place in the interval [ν, ν + dν], ν > t conditional to the fact that the policy holder has survived up to time t. In the pure endowment case, the situation is much simpler since the only condition to get paid is that the policy holder must be still alive at the end of the insurance policy, i.e.: (10)
φendowment 1
= 1(Ta >t) φ1 [1 − Fa+t (T − t)]
φendowment 2
= 1(Ta >t) φ2 [1 − Fa+t (T − t)]
Note that in absence of mortality the standard Black-Scholes strategy is recovered. Note as well that from eq. 2 and eq. 3 it follows that the portfolio must be continuously adjusted due to the presence of the death process. Given the previous results the single net premium Π is straightforward to compute since it is a weighted expectation, differing from the standard option price by the weight factors given by the survival probability. Therefore we have:
T
Πterm (t) =
ν EQ payoff (ν, g) e− t
r(s) ds
ma (ν) [1 − Fa+t (ν − t)] dν ,
t
(11)
− T r(s) ds [1 − Fa+t (T − t)] . Πendowment (t) = EQ payoff (T, g) e t
Is the standard option price
Therefore, if the mortality rate is known, the single net premium for a pure endowment contract can be obtained from the standard option price simply multiplicating it by the adequate survival probability. It is also evident that the option pricing process cannot be decoupled from the mortality process in a term insurance contract since in this case the single net premium is recovered from the convolution of the two processes.
211
Embedded options and integrated ALM for life insurance
2.3. Constant rate premium In this case instead of requiring a net single premium at the beginning of the contract, we allow the policy holder to pay his premiums Πterm and Πendowment at a constant rate A. Consider a standard discount process
B(t1 , t2 ) = e
t2 t1
r(s)ds
The standard strategy at time t0 for a constant amount to be paid at time t is {0, φ2 } = {0, B(t0 , t)}. Therefore, if according to this single premiums are invested in risk-less bonds, the previous results are modified as follows: (12) T φterm = 1 φν1 ma (ν)[1 − Fa+t (ν − t)]dν (Ta >t) 1 t T
φterm 2
=
1(Ta >t)
T
+ t
A · B −1 (t, ω)dω[1 − Fa+t (T − t)]
− t
φν2 −
ν
A·B
−1
(t, ω)dω
· ma (ν)[1 − Fa+t (ν − t)]dν
;
t
and for endowment: φendowment 1
= 1(Ta >t) φT1 [1 − Fa+t (T − t)] T
(13)
φendowment 2
= 1(Ta >t)
T
φT2 −
ν
− t
A · B −1 (t, ω)dω
· [1 − Fa+t (T − t)]
t
A · B −1 (t, ω)dω · ma (ν)[1 − Fa+t (ν − t)]dν
.
t
Which gives the insurer strategy. On the other end, the cumulative discounted payment process C0 is given by: (14)
C0 = A ·
T 0
B −1 (0, ν)1(Ta >ν) dν .
From which we can deduce the net premium rate A by equating the cumulative discounted premium process vale with the single net premium Π (15)
Aterm
=
T 0
(16)
Aendowment
=
T 0
Πterm B −1 (0, ν)1(Ta >ν) dν Πendowment B −1 (0, ν)1(Ta >ν) dν
,
.
212
Gabriele F. Susinno
3. From theory to practice: High performance computing To put theory in practice and to create a useful risk management tool based on the formalization outlined above, a careful study of the computing platform has been mandatory. Indeed the first trials on realistic cases have shown that the volume of data to be treated ask for a prohibitive computing time on “conventional” platforms. Hereafter we outline the contents of a joint project between Capital Management Advisors (CMA - a company of Deloitte Touche Tohmatsu-Italy) and major experts on the field of parallel computing to apply low cost/high efficiency solutions of high performance computing to financial applications. In particular CMA is specialized in supporting institutional investors with advisory services for Integrated Asset-Liability Management and Portfolio Insurance. Given the extreme path dependency imposed by the joint evolution of assets and liabilities, the problem of defining strategic asset allocation and management rules, given the investor’s profile, must be tackled in an integrated Monte Carlo framework. Indeed the conceptual complexity of the problem is reflected on the technological challenges for its solution. The quantitative department of CMA has investigated computational solutions which maximize the exploitation of the computational potential of its PC based intranet. The adopted strategy has been to locate and implement the solution at the divergence of the performance/price ratio. This has been achieved with a Linux operating system which grants to the CMA computing system stability, flexibility, efficiency, and portability. With the power and low prices of today’s PCs and the availability high speed Ethernet interconnect, it makes sense to combine them to build High-PerformanceComputing and Parallel Computing environment. This is the concept behind the parallel computing system we have implemented. The core idea is to apply tested techniques of High Performance Computing to tackle the problem of operative limits both from the hardware and software side via the use of Cluster of PC’s. Clusters are piles of powerful PC running the best available processors generally interconnected through a low latency communication network. By working in parallel, they can provide huge amounts of computing power. Of course software has to be tuned to benefit from this architecture. Clustering technology offers by far the best price/performance ratio and can beat costly vector computers by orders of magnitude (Fig. 3). It is important to note that up to now parallel computing has constituted an attracting research domain in academic and military communities. Given the apparent technological complexity and the costs of available proprietary solutions, their use on other fields had a limited development. The actual status of the art, both from the technological and administration point of view, allows to a wider spectrum of users to benefit from this kind of technical solution.
Embedded options and integrated ALM for life insurance
213
Figure 3: Performance enhancement from cluster technology. Comparative test from RTExpress, top line corresponds to a single processor Matlab computation.
3.1. The needs The CMA’s Asset-Liability Management model outlined above [4, 8] is used to determine the optimal asset allocation between the different risk classes of a portfolio given the constraints existing on the liability side of the balance sheet. Operative indicators are computed as expected values from a Monte Carlo generated set of scenarios, given an ex-ante hedging strategy. As a Monte Carlo simulation a large sample of independent scenarios need to be generated in order to minimize the statistical error of the estimated quantities. It results a natural parallelization of the process which consist to generate an optimal (the optimality is given by hardware constraints) number of scenarios on each node of the cluster (Fig. 4). The Parallel Statistically Independent Runs (PSIR) algorithm has coarse grained parallelism, high speedup and efficiency growing with the number of processor p < n (n the number of statistically independent runs). The maximum value of speedup for this algorithm can be obtained at p = n (See e.g. [1]). The Matlab environment has been both chosen for its high-level matrix based language that permits to express computations in an exceptionally concise manner and for historical inheritance. Unfortunately the main problem one has to face when moving to distributed memory multiprocessors is that Matlab is strongly rooted on uniprocessor platforms. To overcome this problem many solutions with different degrees of complexity has been investigated. The simplest and probably the most advisable solution is to take advantage of both Matlab functionalities and high levels of performance obtained through the use
214
Gabriele F. Susinno
Initialization Processes distribution
Random Numbers Generator Martket Evolution Fund Evolution Balance Estimation
. . .
NO
Modify Asset Allocation
. . .
YES Apply the Hedging Strategy NO
t = Horizon YES
Sinchronization Indicators Estimation (Averaging) OUTPUT Figure 4: Flowchart of Parallel Statistically Independent Runs (PSIR).
of a parallel system (Fig.5). In that direction some solutions have been investigated at Cornell University [6]. Indeed they developed the MultiMATLAB system designed to provide high-performance on multiprocessors while keeping the functionality of the Matlab environment. The major drawback of this solution could be the cost of Matlab Licenses, which grows linearly with the number of nodes in the cluster. Indeed the approach of bringing together the advantages of SCEs (scientific and technical computing environments) and of parallel and distributed processing systems can be implemented with the use of open source SCEs like Octave or Scilab. These environments may grant similar advantages than those obtained using Matlab with reduced costs. For the time being, we have asked to a team of researchers from Artabel and from the Ecole Polytechnique de Paris to analyze the eventual problems arising from a migration of the existing Matlab code to Scilab [10]. Alternative solutions like RTExpressTM [9] which provides a development and runtime environment allowing for MATLAB script files to be directly compiled and then executed on embedded and non-embedded parallel high performance computers (HPC) are considered.
Embedded options and integrated ALM for life insurance
215
11 00 111 000 000 111 USER
Interactive Processor
Processor
Processor
MATLAB
MATLAB
MATLAB
Process
Process
Process
INTERCONNECTING NETWORK
Figure 5: Multi-Matlab Architecture.
3.2. The collaboration Importing customized parallel computing solutions on quantitative financial advisory services constitute an excellent opportunity to propose a new scalable and performing solution to a market in a fast technological evolution. Since the beginning, CMA has used the consulting services of SuSE AG to optimize its internal network, and therefore they have been a preferential partner for the development of a parallel computing system. AMD has as well contributed to the project with its last generation hardware technology.
3.3. The system The system is made by eight identical nodes on a rack solution plus a dispatching server. Server Characteristics are: Component Processor: Mother Board: SDRAM: Hard Disk(raid): Network Card: Floppy Driver: CDRom:
Type AMD: Athlon 1.2GHz (FSB266) Motherboards supporting DDR 0.5Gb 30Gb IDE 7200t Ethernet 100 1 1
216
Gabriele F. Susinno
Figure 6: An example of Linux Cluster.
Non Exhaustive Node Characteristics are: Component Processor: Mother Board: SDRAM: Hard Disk(raid): Network Card: Floppy Driver: CDRom:
Type AMD: Athlon 1.2GHz (FSB266) Motherboards supporting DDR 1Gb 20Gb IDE 7200t Ethernet 100 1 1
The system is completed by a Ethernet Switch 100 (Cisco 24 ports), and the relative rack and power supply devices. A cluster picture is shown in Fig. 6. Given the typology of the problem to solve, the system’s latency do not constitute a key issue and will be eventually addressed in a second phase. Thanks to the parallelization of the Monte Carlo scenario simulation, we have been able to provide a risk-management service in almost real time. Indeed the time for the analysis of a typical client’s portfolio has been brought down to less than two hours.
Embedded options and integrated ALM for life insurance
217
4. Conclusions In this paper we developed a model describing the traditional life Insurance contract as a portfolio of zero-coupons and financial options. This representation, even though is simplified, highlights the sources of shareholders’ value for life insurance companies. The model is general enough to be subsequently refined and adapted at the firm level in order to take into account more detailed and realistic situations. Moreover, the portfolio represented in this way is potentially tradeable in the capital markets. This also creates value because it allows Companies to securitise an otherwise illiquid asset such as the “embedded value”. As to product planning, we showed that shareholders value (or return on capital) is a function of the bonus level, the minimum guaranteed return and the volatility of the underlying fund. When launching a new product it is therefore crucial to well balance the above variables in order to attain the desired return on capital given the current market conditions. Furthermore, since the volatility of the underlying fund is a proxy of asset allocation, the model can be adapted to set rules for benchmark selection and tracking error limits for the asset managers. The paper also addresses the question of how to best manage the embedded option portfolio once the policy is launched. Notice the importance of this issue in order to protect shareholders’ value against adverse changes in key financial variables such as interest rates.
References [1] Bykov, N. Y. and Lukianov, G. A.: Parallel direct simulation Monte Carlo of non-stationary rarefied gas flows at the supercomputers with parallel architecture.. IHPC&DB, Preprint no. 5-97. [2] Black, K. and Skipper, H. D.: Life Insurance. Prentice Hall, twelfth edition, 1994. [3] Buhlmann, H.: Mathematical Paradigms in Insurance and Finance. http://www.afshapiro.com/Bulhmann/index Buhlmann.htm. [4] Giraldi, C. et al: Consequences of the Reduction of Interest Rates on Insurance. Geneva Papers on Risk and Insurance, 1999. [5] Hull, J.: Options, Futures, and Other Derivatives. Prentice-Hall, 2002. [6] Menon, V. and Trefethen, A. E.: MultiMATLAB: Integrating MATLAB with High-Performance Parallel Computing. In Supercomputing 1997. IEEE Computing and ACM SIGARCH. San Jose, California. November, 1997. [7] Rolski, T., Schmidli, H., Schmidt, V. and Teugels, J.: Stochastic Processes for Insurance and Finance. Wiley, 1999.
218
Gabriele F. Susinno
[8] Susinno, G.: On Equity Linked Life Insurance Contracts, to be published. [9] Integrated Sensors Inc. http://www.rtexpress.com/ [10] Scilab Group, Jean-Philippe Chancelier http://www-rocq.inria.fr/scilab/
et
al,
INRIA-Rocquencourt.
Gabriele F. Susinno Capital Management Advisors S.r.l. Quantitative Strategies Department Via Piave, 8 00187 Roma, Italy [email protected]
Any views expressed here are the present personal views of the author. They do not necessarily represent the views of the author’s company.
M´ etodos de valoraci´ on de opciones americanas: el enfoque “least-squares Monte Carlo” Manuel Moreno y Javier F. Navas1
Abstract: This article presents a review of American option pricing models. We concentrate on Least-Squares Monte Carlo, a technique recently proposed by Longstaff and Schwartz (2001). This method is based on least squares regressions in which the explanatory variables are some polynomic functions.
1.
Introducci´ on
Los activos derivados son instrumentos financieros cuyo precio y caracter´ısticas dependen de otro activo (llamado “subyacente”). Ejemplos de dichos activos son futuros y opciones. La principal diferencia entre ambos es que el futuro implica una obligaci´ on a comprar (o vender) el activo subyacente en un momento determinado, mientras la opci´ on otorga a su propietario el derecho (pero no la obligaci´ on) a negociar el subyacente en un momento futuro a un precio predeterminado (precio de ejercicio o strike price). El propietario de una opci´ on de compra (de venta) ejerce su derecho si el precio de mercado del activo subyacente es superior (inferior) al precio de ejercicio. La diferencia entre ambos precios supone la ganancia para el inversor que ha comprado la opci´ on. Esta ganancia es bruta, pues se debe descontar el precio de la opci´ on, esto es, la prima que se pag´o en el momento inicial al vendedor. Dentro de las opciones “est´ andar”, se suele distinguir entre las de tipo “europeo” y “americano”. En el primer caso, la opci´ on se puede ejercer solamente en el momento final indicado por el vencimiento de dicha opci´ on. Por el contrario, la opci´ on americana se puede ejercer en cualquier momento hasta su vencimiento. Dada esta diferencia, es obvio que el precio de una opci´ on americana es igual o superior al precio de una opci´ on europea de iguales caracter´ısticas. Como se ha mencionado anteriormente, la decisi´ on de ejercer una opci´ on europea depende de la comparaci´ on entre el precio de mercado del subyacente en el momento 1 Manuel Moreno es Profesor Titular del Departament d’Econom´ ıa i Empresa de la Universitat Pompeu Fabra y miembro del Centre de Recerca en Econom´ıa Financera de dicha Universidad. Javier F. Navas es profesor de Finanzas en el Instituto de Empresa. Esta charla fue impartida por el segundo autor en la sesi´ on del Seminario Instituto MEFF-RiskLab de junio de 2001.
As mentioned above, the decision to exercise a European option depends on the comparison between the market price of the underlying at the time given by the option's maturity and the strike price. For an American option, however, there is the possibility of exercising the option before maturity, which is usually known as early exercise. An American call option is exercised when two conditions hold: a) the market price of the underlying is above the strike price, and b) the investor believes that this market price exceeds the present value of any of the possible future prices of the underlying (the so-called continuation value). This second condition makes pricing American options more difficult than pricing European ones.

In general, option pricing starts by assuming that the price of the underlying asset follows a certain diffusion process. Under no-arbitrage conditions, various mathematical techniques allow one to derive a partial differential equation whose solution is the option price.2 For European options, this solution can be obtained analytically. For American options, however, the solution is usually obtained numerically, using techniques such as finite differences, trees, numerical integration or Monte Carlo simulation.

Finite-difference schemes discretize the differential equation associated with the option price and solve the corresponding difference equation, whereas trees work with the discrete-time process followed by the price of the underlying asset. The Monte Carlo technique, in turn, simulates paths of the underlying price. For each path, the time at which the option is exercised is determined and the payoff to the investor is computed. This payoff is discounted back to the initial time using the risk-free interest rate. The process is repeated for all paths and the arithmetic mean of all the initial values is computed. This mean is the option price. The technique can price standard options, but it can also be applied to more complicated cases such as several underlying assets, jump-diffusion processes or stochastic volatility.

As we describe below, the Monte Carlo technique has rarely been used to price American options, which are usually handled with other numerical methods. An exception is Longstaff and Schwartz (2001), who proposed a method known as "Least-Squares Monte Carlo" (LSM) to estimate the continuation value. The estimation is carried out through a least-squares regression combined with the cross-sectional information provided by the Monte Carlo simulation. Specifically, this regression uses as dependent variable the (discounted) payoffs that we expect to receive in the future. The explanatory variables are a set of basis functions that depend on the prices of the underlying assets. With this method, prices are obtained for several derivatives such as, for example, American put options, an American-Bermudan-Asian option, an American option where the underlying price follows a jump-diffusion process, or an American option on the maximum of five underlying assets.

This paper is organized as follows: Section 2 presents an overview of option pricing models, with special attention to American-style derivatives. Section 3 presents the LSM technique and includes a numerical example. Finally, Section 4 presents the main conclusions.

2 Another alternative is to obtain this price as the present value of the expected future payoff. In this case, one obtains an integral.
2. Option pricing models

2.1. The Black-Scholes-Merton model
This model considers an economy with three financial instruments: a European option, a stock (the underlying asset of this option) and a risk-free asset. The model assumes that there are no market frictions, that assets are traded in continuous time, and that there is a constant risk-free interest rate (r) for lending and borrowing.3 Finally, the price of the underlying asset, S, is assumed to follow, under the risk-neutral probability measure, a geometric Brownian motion

(1)   dS = r S dt + σ S dz ,

where σ is the (constant) volatility rate of the stock return and z is a standard Brownian motion. It is sometimes useful to rewrite this equation in terms of the logarithm of the asset price, x = ln(S). With this new variable, equation (1) becomes

(2)   dx = (r − σ²/2) dt + σ dz ,

with the advantage that we now have a constant term both in the drift and in the volatility of the stochastic process.

After building a (risk-free) hedging portfolio, these authors apply no-arbitrage conditions and derive the following partial differential equation for the option price C(S, t):

(3)   (1/2) σ² S² ∂²C(S,t)/∂S² + r S ∂C(S,t)/∂S + ∂C(S,t)/∂t = r C(S, t) .

The terminal condition that the solution of this equation, C(S, T), must satisfy is given by the final payoff of the option. For a call option, we have

(4)   C(S, T) = max{S_T − K, 0} ,

3 Thus, the amount B_t invested in the risk-free asset at time t follows the differential equation dB_t = r B_t dt.
where S_T is the price of the underlying asset at the option's maturity, T, and K is the strike price of the option.

Using the variables x = ln(S), W(x, t) = C(S, t), one obtains an equation with constant coefficients for the partial derivatives:

(5)   (1/2) σ² ∂²W(x,t)/∂x² + (r − σ²/2) ∂W(x,t)/∂x + ∂W(x,t)/∂t = r W(x, t) .

After a certain change of variables,4 this equation is equivalent to the famous "heat equation", whose solution (the price of the European call option) is given by

C(S, t) = S N(d1) − K e^{−r(T−t)} N(d2) ,

where N(·) is the cumulative distribution function of a standard normal random variable and

d1 = [ln(S/K) + (r + σ²/2)(T − t)] / (σ √(T − t)) ,    d2 = d1 − σ √(T − t) .

Applying put-call parity, the price of a European put option is obtained:

P(S, t) = K e^{−r(T−t)} N(−d2) − S N(−d1) .

This formula cannot be applied to American options since, for these, one must decide at each exercise date whether to exercise the option or to wait and exercise it at a later time. The boundary separating the early-exercise and continuation regions is the optimal exercise boundary, which must be determined in order to price the option.
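For concreteness, the closed-form expressions above can be evaluated in a few lines of Python; a minimal sketch (function and parameter names are ours):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call_put(S, K, r, sigma, tau):
    """Black-Scholes European call and put prices; tau = T - t."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    N = NormalDist().cdf
    call = S * N(d1) - K * exp(-r * tau) * N(d2)
    put = K * exp(-r * tau) * N(-d2) - S * N(-d1)  # via put-call parity
    return call, put

print(bs_call_put(S=100, K=110, r=0.05, sigma=0.2, tau=1.0))
```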
2.2. Numerical methods for pricing American options
In this section we summarize the main techniques that have been proposed in the literature on pricing American-style derivative assets. Most of these methods are suitable only for some derivatives, and there is no consensus on which technique is the most adequate. We will focus on analytical solutions and on numerical and/or analytical approximations.

Analytical solutions provide closed-form expressions for option prices and are the most elegant (and fastest) way to price derivatives. However, this type of solution sometimes rests on very restrictive assumptions and can therefore lead to unrealistic models. Roll (1977), Geske (1979) and Whaley (1981) priced analytically an American call option on a stock that pays dividends at a (discrete) set of dates. McKean (1965) computes the solution for the infinite-horizon case. Closed-form expressions for the optimal exercise boundary can be found in Ait-Sahlia (1996) and Ait-Sahlia and Lai (1996, 2000).

4 See Black and Scholes (1973) for the details.
The method of lines (see Rektorys (1982)) has been applied, for example, in Carr and Faguet (1996) and Carr (1998) to obtain further analytical solutions. This method is based on discretizing the time derivative in the Black-Scholes differential equation. Carr and Faguet (1996) obtain a set of ordinary differential equations that they solve analytically, while Carr (1998) shows that this problem is equivalent to the infinite-horizon case and uses the results of McKean (1965) to obtain exact American option prices.

Analytical approximations are closed-form solutions to approximations of the original problem. This technique has been used to price American options in Johnson (1983), Geske and Johnson (1984), Barone-Adesi and Whaley (1987), Bunch and Johnson (1992), Broadie and Detemple (1996) and Ho et al. (1997), among others. Many of the approximations obtained in these papers have been compared numerically in Ait-Sahlia and Carr (1997) and Ju (1998).

Johnson (1983) obtains upper and lower bounds for the price of these options and uses them as explanatory variables in a regression in which the option price is the dependent variable. This author uses the results of this regression to obtain other prices by interpolation. Broadie and Detemple (1996) propose a similar method, although they use the lower bound (LBA) and the average of the upper and lower bounds (LUBA). The common feature of both papers is that the results depend heavily on the interpolation scheme and on the fit of these bounds.

Geske and Johnson (1984) price compound options by means of the Richardson extrapolation technique and obtain an expression involving an infinite series of multidimensional normal distributions. Several modifications of this method have been suggested. In particular, Bunch and Johnson (1992) simplify its numerical computation, Ho et al. (1994) use an exponential extrapolation, and Ho et al. (1997) generalize the original technique to deal with stochastic interest rates. Barone-Adesi and Whaley (1987) developed a very fast approximation based on a simplification of the differential equation, but this method is not well suited to options with long maturities.

In general, analytical methods are not apt for pricing assets with complex features (several stochastic processes, non-Markovian processes, ...). In those cases, we must resort to numerical methods.

2.2.1. Trees
As mentioned above, tree-based methods start from the discrete-time version of the (risk-neutral) continuous-time process followed by the underlying asset price. After this discretization, the option price is obtained from its final value by moving backwards in time. The most popular methods of this type are binomial and trinomial trees.

The binomial method is based on approximating the Brownian motion (the process followed by the option's underlying asset) by a discrete-time random walk. This method was originally proposed in Cox et al. (1979) and Rendleman and Bartter (1979) and provides a simple and intuitive numerical solution.

This method considers the partition {t0 = 0, t1, t2, ..., t_{N−1}, t_N = T} of the time interval [0, T]. At each point of this partition, the underlying asset price is assumed to follow a multiplicative binomial process: the price moves up by a proportion u or down by a proportion d. These two values, u and d, determine the mean and the volatility of the underlying asset. According to this evolution of the asset price, the final payoff of the call option is Cu = max{uS − K, 0} or Cd = max{dS − K, 0}. In analogy with the Black-Scholes model, a risk-free portfolio is built, and the price of a call option maturing in one period is given by

C = e^{−r ∆t} (p Cu + (1 − p) Cd) ,    p = (e^{r ∆t} − d) / (u − d) ,    ∆t = T / N .
Hence, the price of the call option can be interpreted as the (discounted) expected value of the option's future payoffs, where the expectation is computed under the risk-neutral probability measure.

Since the binomial method is an approximation of the continuous-time process followed by the asset price, the jump parameters (u and d) and the (risk-neutral) probability p are chosen so that the risk-neutral mean and variance of the discrete-time process match those of the continuous-time process given by equation (1). Since we have two equations and three parameters, one of these parameters can be chosen freely. Two alternative specifications have been proposed in the literature: Cox et al. (1979) assume equal jump sizes5 and Jarrow and Rudd (1983) assume that up and down moves in the underlying price are equally likely.

To avoid numerical problems, it is advisable to use the change of variable x = ln(S). In this case, x can move up to x + ∆xu or down to x + ∆xd with probabilities p and 1 − p, respectively. Matching the mean and variance of the discrete and continuous processes yields two equations and, as before, one of the three parameters (∆xu, ∆xd or p) can be chosen freely.

For the specification based on S (x), a tree for the asset price can be built from the initial value S0 (x0). At each node (i, j) of this tree, the asset and call prices are S_{i,j} = S0 u^j d^{i−j} (x_{i,j} = x0 + j ∆xu + (i − j) ∆xd) and C_{i,j}, respectively. We start at the final nodes of the tree at time T, where the option value (its final payoff) is known. Since we are working in a risk-neutral world, the option value at each node at time T − ∆t can be obtained as the expected value at time T multiplied by a discount factor:

C_{i,j} = e^{−r ∆t} (p C_{i+1,j+1} + (1 − p) C_{i+1,j}) .

5 Under this assumption, one obtains a recombining binomial tree (the order of up and down moves does not matter), a very desirable property from a computational point of view.
Moving backwards in time through all the nodes of the tree, the option price at the initial time, C_{0,0}, is obtained. For an American option, the only difference is that, at each node, we must compare the gain obtained by exercising the option before maturity with the gain achieved by exercising the option at a later time.

Breen (1991) generalized the binomial method with the "accelerated binomial method", which uses the Richardson extrapolation technique. Two other modifications were suggested in Broadie and Detemple (1996):

1. BBS method: in the binomial model, the Black-Scholes formula replaces the "continuation value" one instant before the option's maturity.

2. BBSR method: this method is the BBS method supplemented with the Richardson extrapolation technique.

The binomial method has been generalized by the trinomial tree method, originally proposed by Parkinson (1977) and Boyle (1988). The trinomial model assumes that the logarithm of the asset price, x, over a small interval ∆t, can a) move up by ∆x, b) stay at the same value, or c) move down by ∆x, with probabilities pu, pm and pd, respectively. As in the binomial model, these probabilities are chosen to match the risk-neutral mean and variance of the discrete-time process and of the process (2). The next step is to build a tree for the underlying asset price from its initial value x0. In analogy with the binomial method, the option price at each node (i, j) of this tree at time T − ∆t, C_{i,j}, is computed as the discounted expected value

C_{i,j} = e^{−r ∆t} (pu C_{i+1,j+1} + pm C_{i+1,j} + pd C_{i+1,j−1}) .

Backward induction through the nodes of the tree yields the current option price, C_{0,0}. The main advantage of trinomial trees is that, for a given number of periods, N, their convergence is faster than that of binomial trees (although their use requires more computer memory).
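A minimal sketch of this backward induction for an American put, under the Cox et al. (1979) specification of equal jump sizes (all names and parameter values are ours):

```python
import numpy as np

def american_put_binomial(S0, K, r, sigma, T, N):
    """American put on a recombining CRR binomial tree."""
    dt = T / N
    u = np.exp(sigma * np.sqrt(dt))       # up factor; d = 1/u (CRR)
    d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)    # risk-neutral probability
    disc = np.exp(-r * dt)
    # asset prices (highest first) and option payoffs at maturity
    S = S0 * u ** np.arange(N, -1, -1) * d ** np.arange(0, N + 1)
    C = np.maximum(K - S, 0.0)
    for i in range(N - 1, -1, -1):
        # discounted continuation value at every node of step i
        C = disc * (p * C[:-1] + (1 - p) * C[1:])
        S = S0 * u ** np.arange(i, -1, -1) * d ** np.arange(0, i + 1)
        C = np.maximum(C, K - S)          # early-exercise comparison
    return C[0]

print(american_put_binomial(S0=1.0, K=1.1, r=0.05, sigma=0.3, T=3.0, N=300))
```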
2.2.2. Finite-difference schemes
This method is an alternative to the previous technique. It starts by building a grid of points (t, x) = (ik, jh), i ∈ Z⁺, j ∈ Z, where h and k are the parameters that set the mesh size, as small as desired. An approximate solution of the partial differential equation is then obtained at these points by replacing the partial derivatives with finite differences.
These difference expressions can be centered around times i + 1, i, or i + 1/2. These alternatives give rise to three methods: fully explicit (TE), fully implicit (TI), and the Crank-Nicolson method (CN), respectively.6

These three methods can be compared in terms of their consistency, convergence and stability properties. Intuitively, these properties can be interpreted as follows:

1. Consistency: a model is consistent when it can be made as close to the original model as desired.

2. Convergence: the solution of the approximation converges to the solution of the original problem.

3. Stability: small changes in the original conditions do not imply large changes in the results.

The following table summarizes these properties for the three methods:

Method   Consistency          Convergence            Stability
TE       O((∆x)² + ∆t)        Only if ∆x > √(2∆t)    Only if ∆x > √(2∆t)
TI       O((∆x)² + ∆t)        Unconditional          Unconditional
CN       O((∆x)² + (∆t)²)     Unconditional          Unconditional
The fully explicit method has the drawback of being stable and convergent only if the restriction ∆x > √(2∆t) is imposed. This restriction implies that many infinitesimal periods may be needed to obtain the solution. This problem is avoided with either of the other two methods, which are unconditionally stable and convergent, although they require more sophisticated computations.7

Option pricing with finite-difference methods proceeds by backward induction in time, much as described for trees. Finite-difference methods can be applied to European- and American-style derivatives, but they are difficult to extend to derivatives that depend on the path of the underlying asset or to options involving several stochastic processes.
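As an illustration, a sketch of the fully explicit scheme on x = ln(S) for an American put, assuming the grid respects the stability restriction above (grid sizes, boundary choices and names are ours):

```python
import numpy as np

def american_put_explicit_fd(S0, K, r, sigma, T, Nt=2000, Nx=200, width=4.0):
    """Fully explicit finite differences on x = ln(S) for an American put."""
    dt = T / Nt
    x0 = np.log(S0)
    x = np.linspace(x0 - width, x0 + width, Nx + 1)
    dx = x[1] - x[0]
    S = np.exp(x)
    V = np.maximum(K - S, 0.0)            # terminal condition
    nu = r - 0.5 * sigma**2               # risk-neutral drift of x
    # explicit weights multiplying V_{j-1}, V_j, V_{j+1}
    a = 0.5 * dt * (sigma**2 / dx**2 - nu / dx)
    b = 1.0 - dt * (sigma**2 / dx**2 + r)
    c = 0.5 * dt * (sigma**2 / dx**2 + nu / dx)
    for _ in range(Nt):                   # backward induction in time
        V[1:-1] = a * V[:-2] + b * V[1:-1] + c * V[2:]
        V[0] = K - S[0]                   # deep in-the-money boundary
        V[-1] = 0.0                       # deep out-of-the-money boundary
        V = np.maximum(V, K - S)          # early-exercise comparison
    return np.interp(x0, x, V)

print(american_put_explicit_fd(S0=1.0, K=1.1, r=0.05, sigma=0.3, T=3.0))
```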
2.2.3. Quadrature (numerical integration)
So far, we have discussed techniques based on approximating either the stochastic process followed by the underlying asset price or the partial differential equation that the option price must satisfy. An alternative is given by quadrature techniques, whose aim is to approximate a certain integral.

6 The first two methods were applied to option pricing in Schwartz (1977) and Brennan and Schwartz (1977, 1978), respectively. Courtadon (1982) is the first paper to price options with the Crank-Nicolson method.

7 In this case, a tridiagonal system of equations must be solved, although this can be done very efficiently using the Thomas algorithm. See Morton and Mayers (1994) for the details.
Using an arbitrage argument, Karatzas (1988) shows that, for a European call option, its price C(S, t) at time t ∈ [0, T] is given by

(6)   C(S, t) = E_t^{P̃} [ e^{−r(T−t)} C(S, T) ] ,

where E_t is the expectation operator at time t, P̃ is the risk-neutral probability measure and C(S, T) is the option price at maturity (see equation (4)). This formula for the call price can be rewritten as

C(S, t) = ∫_{−∞}^{∞} e^{−r(T−t)} g(S) max{S_T − K, 0} dS ,
where g(S) is the risk-neutral probability density function of the underlying asset. In general, this integral can only be solved numerically. It can be approximated by a sum of the values of the integrand at certain points, multiplied by weighting coefficients. Simpson's rule is an example of this technique. Numerical integration is usually applied to European-style derivatives, although Parkinson (1977) used it to price American put options.

Another alternative for pricing options through integrals is the so-called "integral representation method", used, for example, in Kim (1990), Jacka (1991) and Carr et al. (1992). This method expresses the price of an American put option as the price of a European put with similar characteristics plus a premium. This premium reflects the advantage provided by the possibility of early exercise and is expressed as an integral.

Several papers develop different approximations for this integral. Huang et al. (1996) use step functions to approximate the integrand. After obtaining a sequence of approximate option prices, a four-point Richardson extrapolation is used to price an American put option. Ju (1998) proposed a piecewise exponential function to approximate the optimal exercise boundary. This author applies Richardson extrapolation to obtain a closed-form expression for this integral and shows that the proposed approximation is exact in the limiting cases where the time to maturity tends to zero or to infinity. Another (in this case, linear) approximation to the optimal exercise boundary was proposed in Ait-Sahlia and Lai (2000), who find two different solutions based on this approximation. Finally, Bunch and Johnson (2000) obtained a closed-form expression for the price of an American put option in the finite and perpetual cases. The key element of their derivation is that the critical price of the underlying asset can be interpreted as the maximum value of the asset price at which the put price does not depend on the time to maturity.
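As an illustration of the quadrature idea, a sketch pricing the European call by applying SciPy's composite Simpson rule to the lognormal risk-neutral density (the truncation of the integration domain and all names are ours):

```python
import numpy as np
from scipy.integrate import simpson   # composite Simpson's rule

def call_by_quadrature(S, K, r, sigma, tau, n=2001):
    """European call as a discounted integral of the payoff against
    the risk-neutral (lognormal) density of S_T."""
    mu = np.log(S) + (r - 0.5 * sigma**2) * tau    # mean of ln(S_T)
    sd = sigma * np.sqrt(tau)
    ST = np.linspace(1e-6, S * np.exp(6 * sd), n)  # truncated domain
    g = np.exp(-(np.log(ST) - mu) ** 2 / (2 * sd**2)) / (ST * sd * np.sqrt(2 * np.pi))
    payoff = np.maximum(ST - K, 0.0)
    return np.exp(-r * tau) * simpson(g * payoff, x=ST)

print(call_by_quadrature(S=100, K=110, r=0.05, sigma=0.2, tau=1.0))
```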
2.2.4. Monte Carlo simulation
This technique was first applied in a financial setting by Boyle (1977). For a survey of the literature, see Boyle et al. (1997).

As shown in equation (6), the price of an option is the expectation of its discounted final value, where this expectation is computed under the risk-neutral probability measure. This expected value can be estimated by averaging a large number of final payoffs. As mentioned earlier, the steps to follow are:

1. Simulate the risk-neutral process followed by the underlying asset price (see expression (1)) up to the option's maturity and compute the terminal payoff of the option. This step is repeated M times.

2. Compute the mean of these final payoffs.

3. Discount this mean at the risk-free interest rate to obtain an estimate of the option price.

The crucial point is to simulate properly the process followed by the underlying asset price, for which it is advisable to use the logarithm of this price. In this case, equation (2) is approximated by

x(t + ∆t) = x(t) + (r − σ²/2) ∆t + σ √∆t ε ,    ∆t = T / N ,

where ε is drawn from a standard normal distribution. This equation is used to obtain the value of x(t) from the initial time up to the final time, T.

The main drawback of this method is that it is computationally very intensive since, in general, many simulations are needed to achieve an acceptable degree of accuracy. This problem can be eased by using variance reduction techniques such as, for example, antithetic variables or control variates.

Tilley (1993) is the first paper to price American options with this technique. This author proposes an algorithm in which, at each date, the simulated paths are grouped according to the asset prices. Then, for each of these groups, an optimal exercise decision is made. However, as Broadie and Glasserman (1997) point out, this method has the following drawbacks: no convergence results are given for the algorithm, all simulated paths must be stored simultaneously, and the method is not easily applicable to the case of several state variables.

Barraquand and Martineau (1995) propose grouping the simulated values into a set of "bins" to reduce the dimension of the pricing problem. Further simulations allow the transition probabilities between these bins to be computed, and options are priced using each bin as a decision unit.
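A minimal sketch of steps 1 to 3 for a European call under the log-price discretization above (all names and parameter values are ours):

```python
import numpy as np

def mc_european_call(S0, K, r, sigma, T, M=100_000, N=50, seed=0):
    """Plain Monte Carlo: simulate x = ln(S), average discounted payoffs."""
    rng = np.random.default_rng(seed)
    dt = T / N
    eps = rng.standard_normal((M, N))
    # step 1: simulate M log-price paths up to maturity
    x = np.log(S0) + np.cumsum((r - 0.5 * sigma**2) * dt
                               + sigma * np.sqrt(dt) * eps, axis=1)
    payoff = np.maximum(np.exp(x[:, -1]) - K, 0.0)
    # steps 2 and 3: average the payoffs and discount
    return np.exp(-r * T) * payoff.mean()

print(mc_european_call(S0=100, K=110, r=0.05, sigma=0.2, T=1.0))
```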
Broadie and Glasserman (1997) develop an algorithm that yields point estimates of American option prices as well as pricing errors for those prices. After analyzing some of the properties of this algorithm, two (biased) estimators of those prices are generated. Combining both estimates, a confidence interval for the American option price is obtained. The techniques presented in these two papers are improved upon in Raymar and Zwecher (1997) and Broadie et al. (1997), respectively. Both papers price American options on the maximum of several assets.

Ibañez and Zapatero (1998) propose an algorithm whose fixed point is the optimal exercise boundary. To obtain this boundary, the values of all parameters but one are fixed and the proposed algorithm is used to obtain the value of the unknown parameter on the optimal exercise boundary. Assuming that American derivatives can be exercised at a finite number of dates, these authors price options on the maximum of two assets.

Finally, American options can also be priced by means of non-parametric methods. One example is neural networks, which rely on historical data and a learning process to obtain the desired price. This method can price European and American options with multiple stochastic processes. See, for example, Hutchinson et al. (1994).
3. The "Least-Squares Monte Carlo" method
As mentioned previously, the main difficulty in pricing American options stems from the possibility of exercising them before maturity. There are therefore several possible exercise dates, at each of which the owner of the option must decide between exercising the option or waiting to exercise at a future date. This decision depends on the comparison between (a) the gain obtained by exercising the option (the immediate exercise value) and (b) the gain achieved by exercising the option at a later time (the continuation value). Hence, the main question facing the investor is how to compute this continuation value.

Longstaff and Schwartz (2001) proposed a method to compute this value through a least-squares regression carried out at each possible exercise date. The information provided by this regression is combined with that obtained from Monte Carlo simulations. These regressions use as explanatory variables a set of functions whose arguments depend on the prices of the underlying assets. The expected continuation value of the option is given by the fitted value of these regressions. The optimal exercise decision is made by comparing these fitted values with the immediate exercise values. This process is repeated recursively at each possible exercise date, starting at the option's maturity and ending at the initial time. In this way, a cash flow is obtained at each exercise date. The price of the American option is obtained by discounting these cash flows back to the initial time.
These authors want to price, at the initial time t = 0, an option maturing at time T. On the finite time interval [0, T], a probability space,8 (Ω, F, P), and an equivalent martingale measure, Q, are defined. We denote by C(ω, s; t, T), ω ∈ Ω, s ∈ (t, T], the path of cash flows of the option, assuming that the option is exercised after t and that the investor always follows the optimal decision strategy.

The American option is assumed to be exercisable at a finite number of exercise dates 0 < t1 < t2 < ... < tK = T. This amounts to approximating the American option by its corresponding Bermudan option. Under no-arbitrage conditions, the continuation value equals the (risk-neutral) expectation of the discounted future cash flows C(ω, s; t_i, T):

(7)   F(ω; t_i) = E_Q [ Σ_{j=i+1}^{K} exp( − ∫_{t_i}^{t_j} r(ω, s) ds ) C(ω, t_j; t_i, T) | F_{t_i} ] ,

where r(ω, s) is the risk-free interest rate and F_{t_i} represents the information available at time t_i.

The main assumption underlying the LSM algorithm is that, at each possible exercise date, this conditional expectation can be approximated by a least-squares regression. Thus, at time t_{K−1}, F(ω; t_{K−1}) is assumed to admit a representation as a linear combination of orthonormal basis functions (p_j(X)) such as, for example, the Laguerre, Hermite, Legendre or Jacobi polynomials. That is,

F(ω; t_{K−1}) = Σ_{j=0}^{∞} a_j p_j(X) ,    a_j ∈ R ,

can be approximated by

F_M(ω; t_{K−1}) = Σ_{j=0}^{M} a_j p_j(X) ,    a_j ∈ R .

This procedure is repeated at each exercise date, starting at the instant at which the option matures and working back to the first possible exercise date. Longstaff and Schwartz (2001) apply this algorithm to price a series of American derivatives such as, for example, a put option, an American-Bermudan-Asian option, or an American option on the maximum of five underlying assets.
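A compact sketch of the algorithm for an American put, regressing discounted future cash flows on the simple basis (1, X, X²) along in-the-money paths, as in the numerical example of the next section (this basis choice and all names are ours):

```python
import numpy as np

def lsm_american_put(S0, K, r, sigma, T, n_paths=50_000, n_steps=50, seed=0):
    """Least-Squares Monte Carlo (Longstaff-Schwartz) for an American put."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)
    # simulate risk-neutral GBM paths of the underlying
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))
    cash = np.maximum(K - S[:, -1], 0.0)       # payoff if held to maturity
    for t in range(n_steps - 2, -1, -1):       # backward over exercise dates
        cash *= disc                           # discount one period
        X = S[:, t]
        itm = K - X > 0                        # regress only on ITM paths
        if itm.sum() > 3:
            A = np.column_stack([np.ones(itm.sum()), X[itm], X[itm] ** 2])
            coef, *_ = np.linalg.lstsq(A, cash[itm], rcond=None)
            continuation = A @ coef            # estimated continuation value
            exercise = K - X[itm]
            ex_now = exercise > continuation   # optimal exercise decision
            cash[np.flatnonzero(itm)[ex_now]] = exercise[ex_now]
    return disc * cash.mean()                  # discount back to time 0

print(lsm_american_put(S0=1.0, K=1.1, r=0.05, sigma=0.3, T=3.0))
```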
8 This probability space has three components: Ω, the set of all possible events (ω); F, the sigma-algebra of events at time T; and P, a probability measure defined on the elements of F.
3.1. A numerical example
To provide intuition about their method, Longstaff and Schwartz (2001) present a numerical example. Below, we include another numerical example showing that, if the LSM method is used with a small number of simulated paths, an American option can have a lower price than its European counterpart.

We price an American put option on a non-dividend-paying stock. The strike price is 1.1 and we consider three possible exercise dates. The continuously compounded risk-free interest rate equals 0.05. We simulate eight paths of the underlying asset price, as shown in the following table:9

Path   t = 0   t = 1       t = 2       t = 3       Payoff at t = 3
1      1       0.917938*   1.272171    1.417021    0
2      1       1.133931    1.290983    1.669802    0
3      1       1.162833    0.917742*   1.228432    0
4      1       1.096706*   1.081163*   1.118280    0
5      1       1.056690*   0.871784*   0.818722*   0.281278
6      1       1.416442    1.672474    1.263264    0
7      1       0.937138*   0.945920*   0.861259*   0.238741
8      1       0.872576*   0.658605*   0.475270*   0.624730
The last column of this table shows the final payoffs of a European option. Discounting these payoffs to time zero and averaging them, the price of this European option turns out to be 0.123162.

For an American option, the LSM method maximizes its value at each exercise date along the "in-the-money" (ITM) paths. At each date, X denotes the underlying asset price and Y represents the (discounted) cash flow to be received at future dates if it is decided that the option will not be exercised at that moment. At times t = 2, 3, we have 5 ITM paths (all but the first, the second and the sixth) and the values of X and Y are as follows:

Path   Y                          X
1      —                          —
2      —                          —
3      e^{−0.05} × 0              0.917742
4      e^{−0.05} × 0              1.081163
5      e^{−0.05} × 0.281278       0.871784
6      —                          —
7      e^{−0.05} × 0.238741       0.945920
8      e^{−0.05} × 0.624730       0.658605

9 The symbol '*' denotes the "in-the-money" paths. The LSM method is more efficient if we focus on this type of paths.
To decide between exercising and waiting, we estimate the continuation value and compare it with the immediate exercise value, 1.1 − X. The continuation value is estimated through a least-squares regression in which Y is the dependent variable. The explanatory variables are a constant, X and X². The result of this regression is

E[Y | X] = 2.848474 − 4.6539 X + 1.871826 X² .

Based on this expression, the exercise decision is as follows:

Path   1.1 − X     E[Y | X]     Decision
1      —           —            —
2      —           —            —
3      0.182258    0.1539056    Exercise
4      0.018837    0.0048106    Exercise
5      0.228216    0.2138467    Exercise
6      —           —            —
7      0.154080    0.1210645    Exercise
8      0.441395    0.5952915    Wait
In this table, we see that the option is exercised along all ITM paths except the eighth, for which 1.1 − X < E[Y | X]. Therefore, assuming that the option is not exercised before t = 2, the cash flows for its owner are the following:

Path   t = 1   t = 2       t = 3
1      —       0           0
2      —       0           0
3      —       0.182258    0
4      —       0.018837    0
5      —       0.228216    0
6      —       0           0
7      —       0.154080    0
8      —       0           0.62473
We repeat this process at t = 1, where we also have 5 ITM paths. Now, to compute the variable Y, we use the cash flows to be received at times t = 2 or t = 3 (but not at both dates) for each path. The values of X and Y are the following:

Path   Y                           X
1      e^{−0.05} × 0               0.917938
2      —                           —
3      —                           —
4      e^{−0.05} × 0.018837        1.096706
5      e^{−0.05} × 0.228216        1.056690
6      —                           —
7      e^{−0.05} × 0.154080        0.937138
8      (e^{−0.05})² × 0.624730     0.872576
Regressing Y again on a constant and on the first two powers of X, we obtain

E[Y | X] = 23.905695 − 47.1482 X + 23.23217 X² ,

which leads to the following exercise decision:

Path   1.1 − X     E[Y | X]     Decision
1      0.182062    0.202191     Wait
2      —           —            —
3      —           —            —
4      0.003294    0.1407488    Wait
5      0.043310    0.0255102    Exercise
6      —           —            —
7      0.162862    0.1244155    Exercise
8      0.227424    0.4539830    Wait
Therefore, the cash flows paid by this American option at the three exercise dates are the following:

Path   t = 1       t = 2       t = 3
1      0           0           0
2      0           0           0
3      0           0.182258    0
4      0           0.018837    0
5      0.043310    0           0
6      0           0           0
7      0.162862    0           0
8      0           0           0.62473
Hence, at t = 1, the option is exercised along the fifth and seventh paths. At t = 2, the option is exercised along the third and fourth paths and, at the final time, t = 3, a non-zero payoff is received along the eighth path. Obviously, all cash flows along the second and sixth paths are zero because both are "out-of-the-money" paths. Along the first path, the cash flows are also zero even though, at t = 1, the option is "in the money". This is because the optimal decision at that time was to wait.

Finally, discounting these cash flows to the initial time and averaging the values over all paths, the price obtained for the American option is 0.114473, 7% lower than that of the equivalent European option. Of course, this fact is a consequence of the small number of simulated paths. Increasing the number of simulated paths yields American option prices above those of the corresponding European options.
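The regression steps of this example are easy to reproduce; a sketch for the t = 2 cross-section, whose fitted coefficients should match those reported above up to rounding:

```python
import numpy as np

# ITM paths at t = 2: asset prices X and discounted future cash flows Y
X = np.array([0.917742, 1.081163, 0.871784, 0.945920, 0.658605])
Y = np.exp(-0.05) * np.array([0.0, 0.0, 0.281278, 0.238741, 0.624730])

# least-squares fit of Y on (1, X, X^2)
A = np.column_stack([np.ones_like(X), X, X**2])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(coef)   # should be close to [2.848474, -4.6539, 1.871826]
```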
4. Conclusions
The main feature of American options is the possibility of exercise at any time up to maturity. An American call option is exercised when a) the market price of the underlying asset is above the strike price and b) the investor believes that this market price exceeds the present value of any of the possible future prices of the underlying (the continuation value). This second condition makes pricing American options more difficult than pricing European options and, in general, numerical techniques are required.

This paper has presented an overview of option pricing models, with special attention to American-style derivatives. Among these techniques, we may mention trees, finite-difference schemes, numerical integration and Monte Carlo simulation. We then focused on the paper by Longstaff and Schwartz (2001), which proposes a numerical method to estimate the continuation value mentioned above. This estimation is carried out, at each possible exercise date, through a least-squares regression combined with the cross-sectional information provided by the Monte Carlo simulation. This regression attempts to explain the expected cash flows with functions that depend on the price of the underlying asset. This technique is known as Least-Squares Monte Carlo (LSM). Several American-style derivative assets are priced with this method. For a more developed treatment of this technique and its possible applications, the interested reader may consult Moreno and Navas (2001).

Acknowledgments. The first author acknowledges the financial support of DGES research project number PB98-1057 and the hospitality of the Financial Options Research Centre at Warwick Business School, where part of this work was carried out. Any remaining errors are the sole responsibility of the authors.
References

[1] Ait-Sahlia, F. (1996): Optimal Stopping and Weak Convergence Methods for Some Problems in Financial Economics. Doctoral dissertation, Dept. of Operations Research, Stanford University.

[2] Ait-Sahlia, F. and P. Carr (1997): American Options: A Comparison of Numerical Methods. In Numerical Methods in Finance, edited by L.C.G. Rogers and D. Talay, Cambridge University Press.

[3] Ait-Sahlia, F. and T.L. Lai (1996): Approximations for American Options. Working Paper, Cornell University.
[4] Ait-Sahlia, F. and T.L. Lai (2000): A Canonical Optimal Stopping Problem for American Options and its Numerical Solution. Journal of Computational Finance 3, 33–52.

[5] Barone-Adesi, G. and R.E. Whaley (1987): Efficient Analytic Approximation of American Option Values. Journal of Finance 42, 301–320.

[6] Barraquand, J. and D. Martineau (1995): Numerical Valuation of High Dimensional Multivariate American Securities. Journal of Financial and Quantitative Analysis 30, 383–405.

[7] Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 637–654.

[8] Boyle, P. (1977): Options: A Monte Carlo Approach. Journal of Financial Economics 4, 323–338.

[9] Boyle, P. (1988): A Lattice Framework for Option Pricing with Two State Variables. Journal of Financial and Quantitative Analysis 22, 1–12.

[10] Boyle, P., M. Broadie and P. Glasserman (1997): Monte Carlo Methods for Security Pricing. Journal of Economic Dynamics and Control 21, 1267–1321.

[11] Breen, R. (1991): The Accelerated Binomial Option Pricing Model. Journal of Financial and Quantitative Analysis 26, 153–164.

[12] Brennan, M.J. and E.S. Schwartz (1977): The Valuation of American Put Options. Journal of Finance 32, 449–462.

[13] Brennan, M.J. and E.S. Schwartz (1978): Finite Difference Methods and Jump Processes Arising in the Pricing of Contingent Claims: A Synthesis. Journal of Financial and Quantitative Analysis 13, 461–474.

[14] Broadie, M. and J. Detemple (1996): American Option Valuation: New Bounds, Approximations and a Comparison of Existing Methods. Review of Financial Studies 9, 1211–1250.

[15] Broadie, M. and P. Glasserman (1997): Pricing American-Style Securities using Simulations. Journal of Economic Dynamics and Control 21, 1323–1352.

[16] Broadie, M., P. Glasserman and G. Jain (1997): Enhanced Monte Carlo Estimation for American Option Prices. Journal of Derivatives 5, 25–44.

[17] Bunch, D. and H.E. Johnson (1992): A Simple and Numerically Efficient Valuation Method for American Puts Using a Modified Geske-Johnson Approach. Journal of Finance 47, 809–816.

[18] Bunch, D. and H.E. Johnson (2000): The American Put Option and Its Critical Stock Price. Journal of Finance 55, 2333–2356.
[19] Carr, P. (1998): Randomization and the American Put. Review of Financial Studies 11, 597–626.

[20] Carr, P. and D. Faguet (1996): Fast Accurate Valuation of American Options. Working Paper, Cornell University.

[21] Carr, P., R. Jarrow and R. Myneni (1992): Alternative Characterization of American Puts. Mathematical Finance 2, 87–106.

[22] Clewlow, L. and C. Strickland (1998): Implementing Derivatives Models. John Wiley & Sons Ltd., England.

[23] Courtadon, G. (1982): A More Accurate Finite Difference Approximation for the Valuation of Options. Journal of Financial and Quantitative Analysis 17, 697–703.

[24] Cox, J.C., S.A. Ross and M. Rubinstein (1979): Option Pricing: A Simplified Approach. Journal of Financial Economics 7, 229–263.

[25] Geske, R. (1979): A Note on an Analytical Valuation Formula for Unprotected American Options on Stocks with Known Dividends. Journal of Financial Economics 7, 375–380.

[26] Geske, R. and H.E. Johnson (1984): The American Put Option Valued Analytically. Journal of Finance 39, 1511–1524.

[27] Ho, T.S.Y., R.C. Stapleton and M.G. Subrahmanyam (1994): A Simple Technique for the Valuation and Hedging of American Options. Journal of Derivatives 2, 52–66.

[28] Ho, T.S.Y., R.C. Stapleton and M.G. Subrahmanyam (1997): The Valuation of American Options with Stochastic Interest Rates: A Generalization of the Geske-Johnson Technique. Journal of Finance 52, 827–840.

[29] Huang, J.Z., M.G. Subrahmanyam and G.G. Yu (1996): Pricing and Hedging American Options: A Recursive Integration Method. Review of Financial Studies 9, 277–300.

[30] Hutchinson, J., A. Lo and T. Poggio (1994): A Nonparametric Approach to Pricing and Hedging Derivative Securities. Journal of Finance 49, 851–886.

[31] Ibañez, A. and F. Zapatero (1998): Monte Carlo Valuation of American Options through Computation of the Optimal Exercise Frontier. Working Paper, The University of Southern California.

[32] Jacka, S.D. (1991): Optimal Stopping and the American Put. Mathematical Finance 1, 1–14.
[33] Jarrow, R. and A. Rudd (1983): Option Pricing. Dow Jones-Irwin, Homewood, Illinois.

[34] Johnson, H.E. (1983): An Analytic Approximation for the American Put Price. Journal of Financial and Quantitative Analysis 18, 141–148.

[35] Ju, N. (1998): Pricing an American Option by Approximating its Early Exercise Boundary as a Multipiece Exponential Function. Review of Financial Studies 11, 627–646.

[36] Karatzas, I. (1988): On the Pricing of American Options. Applied Mathematics and Optimization 17, 37–60.

[37] Kim, I.J. (1990): The Analytical Valuation of American Options. Review of Financial Studies 3, 547–572.

[38] Longstaff, F.A. and E.S. Schwartz (2001): Valuing American Options by Simulation: A Simple Least-Squares Approach. Review of Financial Studies 14, 113–147.

[39] McKean, H.P. (1965): Appendix: A Free Boundary Problem for the Heat Equation Arising from a Problem in Mathematical Economics. Industrial Management Review 6, 32–39.

[40] Merton, R.C. (1973): Theory of Rational Option Pricing. Bell Journal of Economics and Management Science 4, 141–183.

[41] Moreno, M. and J.F. Navas (2001): On the Robustness of Least-Squares Monte-Carlo (LSM) for Pricing American Derivatives. Working Paper no. 543, Departamento de Economía y Empresa, Universitat Pompeu Fabra, Barcelona.

[42] Morton, K.W. and D.F. Mayers (1994): Numerical Solution of Partial Differential Equations. Cambridge University Press.

[43] Parkinson, M. (1977): Option Pricing: The American Put. Journal of Business 50, 21–36.

[44] Raymar, S. and M. Zwecher (1997): Monte Carlo Estimation of American Call Options on the Maximum of Several Stocks. Journal of Derivatives 5, 7–24.

[45] Rektorys, K. (1982): The Method of Discretization in Time and Partial Differential Equations. D. Reidel Publishing, Boston, Mass.

[46] Rendleman, R. and B. Bartter (1979): Two-State Option Pricing. Journal of Finance 34, 1093–1110.
[47] Roll, R. (1977): An Analytic Valuation Formula for Unprotected American Call Options on Stocks with Known Dividends. Journal of Financial Economics 5, 251–258.

[48] Schwartz, E.S. (1977): The Valuation of Warrants: Implementing a New Approach. Journal of Financial Economics 4, 79–93.

[49] Tilley, J. (1993): Valuing American Options in a Path Simulation Model. Transactions of the Society of Actuaries 45, 83–104.

[50] Whaley, R.E. (1981): On the Valuation of American Call Options on Stocks with Known Dividends. Journal of Financial Economics 9, 207–211.
Manuel Moreno
Departament d'Economía i Empresa
Universitat Pompeu Fabra
Carrer Ramón Trias Fargas, 25-27
08005 Barcelona, Spain
[email protected]
Javier F. Navas
Departamento de Finanzas
Instituto de Empresa
María de Molina, 13
28006 Madrid
[email protected]
Volatility models for the futures contract on the 10-year Notional Bond
Ricardo Gimeno Nogués and Eduardo Morales Martínez1
Abstract: The revolution caused by the availability of high-frequency data has produced an explosion of academic work on financial time series. In these studies, two aspects play a key role in the process of implementing volatility models. First, while in traditional time series analysis the interval of time between two observations is a parameter given by the time series itself, working with tick data allows an additional parameter to be considered: time aggregation. Different volatility models are estimated for the 10-year Spanish Notional Bond futures contract quoted on MEFF (the Spanish financial futures market). It is verified that the choice between tick-by-tick series and five-minute aggregated series affects the selected model. Second, the presence of outliers in the data is a determinant factor in the estimation of the models. It is verified that the volatility models may present integrated variance if the outliers are not included, whereas this integrated variance disappears when the outliers appear explicitly in the model.
1. Introduction
The introduction of ARCH models by Engle [5] in 1982 and of GARCH models by Bollerslev [1] in 1986 opened a line of research that has contributed to a considerably better understanding of the stochastic characteristics of certain high-frequency financial variables; recent articles ([3] and, further, [2] and [6]) contain reviews of the main theoretical and empirical advances in this area.

The various model variants rooted in the aforementioned ARCH models aim to explain the behavior of the variances of financial returns as a function of the past of those returns (under the assumption that they are generated by a white-noise process of uncorrelated but not independent variables) or of the residuals of an ARMA model for the returns.

1 Ricardo Gimeno Nogués is Professor of Econometrics and of Financial Mathematics at Universidad Pontificia Comillas (ICADE). Eduardo Morales Martínez is Professor of Econometrics at Universidad San Pablo-CEU. This talk was given by the first author at the September 2001 session of the Instituto MEFF-RiskLab Seminar.
That is, in practice, the modeling of the conditional variance is highly dependent on the model proposed for the returns. This is precisely one of the points emphasized in this paper; in particular, new evidence is provided (see, for example, [10] and [4]) that reinforces the influence that the treatment of certain abnormal values (outliers) exerts on the parameters of the variance model.

On the other hand, the availability of financial data at frequencies even higher than daily makes it possible to ask whether conditional variance models are affected by a temporal aggregation problem, such as going from a series with tick-by-tick information, and therefore usually available at non-constant time intervals, to another equally spaced in time. To illustrate these two issues, we present the results of a study of the prices of the futures contract on the 10-year bond traded on MEFF.

The rest of the document is organized as follows. Section 2 presents two of the model types of the ARCH family most frequently used to model conditional variances in financial series: the GARCH and their immediate extension M-GARCH, and the EGARCH. Section 3 explains in detail the statistical characteristics of the data used and the temporal aggregation process followed. The modeling of the returns computed from the prices of all crossed trades shows that these are not generated by white-noise processes; Section 4 presents the models obtained for those returns as well as those corresponding to the temporally aggregated series. Section 5 includes the models for the conditional variances and discusses the importance that a small number of outliers has on their parameters. Finally, Section 6 presents the main conclusions.
2. Models for the conditional variances
In this section we concentrate on two types of models2, GARCH and EGARCH, which are the most frequently used for modeling the conditional variances of financial variables. The complete modeling of the returns requires an initial equation for the conditional means, which adopts the form (1) (with an ARMA structure),

(1)   (1 − L) r_t = [θ(L)/φ(L)] ε_t ,

and a second equation to capture the behavior of the conditional variances.

2 A more detailed exposition of these and of some other models of the same ARCH family can be found in [4].
The latter are referred to below as (2) for the GARCH models and (3) for the EGARCH models:

(2)   σ²_t = α0 + Σ_{i=1}^{q} α_i ε²_{t−i} + Σ_{j=1}^{p} β_j σ²_{t−j} .
The proposal of GARCH-type models, captured in equation (2), is that the random variables generating the residuals of the first equation have identical marginal distributions, but the second-order moments of the conditional distributions are a function of their past. The coefficients of this second equation have to be positive to avoid the absurd result of a negative variance. Furthermore, the equation requires that the sum of the α and β coefficients be less than one, so that the variance does not tend to infinity. If the estimate of the conditional standard deviation is incorporated as an explanatory variable in the mean equation, one obtains the GARCH-M models (see [7])3; their inclusion captures the relationship between asset volatility and expected returns.

The variance equation in the EGARCH models is given in (3):

(3)   ln σ²_t = α0 + α1 ( |ε_{t−1}| / σ_{t−1} − √(2/π) ) − λ (ε_{t−1} / σ_{t−1}) + β1 ln σ²_{t−1} .

These EGARCH models, being defined on the logarithm of the variance, make it possible to dispense with the non-negativity restrictions on the coefficients. However, the possible problem of integration in variance remains, which will require that the β parameters sum to less than one.
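For estimation in practice, a minimal sketch using the Python arch package, assuming it is installed (in this package the asymmetry term λ of equation (3) is labelled gamma; the simulated series merely stands in for real return data):

```python
import numpy as np
from arch import arch_model

# toy return series; in the application these would be the bond-future returns
rng = np.random.default_rng(0)
returns = rng.standard_normal(2000)

# GARCH(1,1) with a constant mean (an ARMA mean can be added separately)
garch = arch_model(returns, mean='Constant', vol='GARCH', p=1, q=1).fit(disp='off')
print(garch.params)    # omega (alpha_0 in (2)), alpha[1], beta[1]

# EGARCH(1,1) including the asymmetric term
egarch = arch_model(returns, vol='EGARCH', p=1, o=1, q=1).fit(disp='off')
print(egarch.params)
```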
3. Data
The new technologies adopted by financial markets have made it possible to store increasingly detailed data. From the now classic daily quotes, markets have moved on to tick-by-tick knowledge. There are different types of tick-by-tick information, ranging from the record of the trades crossed in the market to the record of any change in the bids and offers for shares. The Spanish financial derivatives market MEFF records in its MEFF Tick Data database all trades crossed on and off the market. Each record includes the price of the trade, the number of contracts exchanged, and the date and time at which the trade took place.

In this paper, we study the evolution of the prices of the futures contract on the 10-year Notional Bond throughout 1998, based on the data provided by MEFF Tick Data.

3 In this case equation (1) adopts the form: (1 − L) r_t = δ0 σ_t + [θ(L)/φ(L)] ε_t .
This futures contract is the one with the largest trading volume among Spanish fixed-income derivatives. It has four maturities per year: March, June, September and December. The models presented in the following sections refer to the last of these contracts; results for the other maturities can be found in the appendix. The evolution of the price of each maturity has been followed during the months in which it was the most heavily traded contract. This condition coincides with being the contract closest to the maturity date. The evolution of prices can be followed in the graphs of Figure 1. It should be borne in mind that, since the data come from series built from crossed trades, they will not be regularly spaced in time.
Figure 1: Evolution of the trade-by-trade (tick-by-tick) prices of the futures contract on the 10-year Notional Bond.
The temporal aggregation carried out consisted in obtaining time series that are regularly spaced in time, for which it is first necessary to define the time interval. Given the frequency with which trades are crossed on MEFF for the contract analyzed, the shortest time interval at which we can aggregate the series is five minutes. The aggregation method used consisted in dividing the total time during which the market operates into five-minute intervals. Among the trades that took place in each interval, the last one is chosen and taken as representative of that five-minute period. The results of this temporal aggregation are those shown in Figure 2.
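With the trades held in a pandas object indexed by timestamp, this last-trade aggregation can be sketched as follows (the column name is ours):

```python
import pandas as pd

# ticks: DataFrame with a DatetimeIndex (one row per trade) and a 'price' column
def to_five_minutes(ticks: pd.DataFrame) -> pd.Series:
    """Keep the last traded price of each five-minute interval."""
    five_min = ticks['price'].resample('5min').last()
    return five_min.dropna()   # drop intervals with no trades
```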
Figure 2: Evolution of the prices at regular (five-minute) intervals of the futures contract on the 10-year Notional Bond.
Both for the series that are regularly spaced in time and for those that are not, the evolution of prices can be represented mathematically by

(4)   P_{t+∆t} = P_t e^{r_t ∆t} .
In equation (4), in addition to the price P, a new variable, r, is included, representing the returns of the asset. We can define the returns from equation (5) when the time interval between observations is constant, for example five minutes, or from equation (6) when this interval is not constant, as is the case for tick-by-tick observations:

(5)   ∆ log P_t = r(t) ∆t ,

(6)   log P_{t2} − log P_{t1} = r(t) (t2 − t1) .
Figure 3 shows the behavior of the five-minute return series of the futures contract on the 10-year Notional Bond in 1998. These graphs display two very marked features:

1. Periods of high volatility followed by periods of low volatility, which would be a sign that the variance is not constant and that modeling through GARCH models could therefore be considered worthwhile;

2. A large number of extreme values are detected, which the proposed models are not able to handle unless they are incorporated as outliers through intervention analysis.
Figure 3: Five-minute returns of the futures contract on the 10-year Notional Bond.
Figures 4 and 5 show, respectively, the histograms and basic statistics of the return series, both at five minutes and tick-by-tick.
Figure 4: Histograms and statistics of the five-minute return series of the futures contract on the 10-year Notional Bond.
Figure 5: Histograms and statistics of the tick-by-tick return series of the futures contract on the 10-year Notional Bond.
Among the features common to both types of series, in all of them the hypothesis that returns have zero mean is acceptable. Likewise, the Jarque-Bera test rejects the normality hypothesis in every case. The main difference between the series lies in the shape of the histograms: in the tick-by-tick series the values are more concentrated around the mean, whereas the temporal aggregation process increases the dispersion.
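The zero-mean and normality checks reported in figures 4 and 5 can be reproduced with standard tools; a hedged sketch with scipy, where `r` stands for any of the return series:

```python
# Sketch of the basic checks: a t test for zero mean and the Jarque-Bera
# normality test, applied to a return series `r` (a 1-D array or Series).
from scipy import stats

def basic_checks(r):
    t_stat, t_pval = stats.ttest_1samp(r, popmean=0.0)
    jb_stat, jb_pval = stats.jarque_bera(r)
    # A large jb_stat (small jb_pval) rejects normality, as found for
    # every series in the study.
    return {"t": t_stat, "t p-value": t_pval, "JB": jb_stat, "JB p-value": jb_pval}
```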
4. Models for the returns of the 10-year future

For the analysis of the behaviour of returns we start from the general expression of ARIMA models:

(7)    (1−L)^d r_t = (θ(L)/φ(L)) ε_t .
In the case of the tick-by-tick return series, the order of the polynomials has been identified from the study of the correlograms included in the appendix. These show a clear formation law, both in the simple and in the partial correlogram, which leads us to propose an ARMA(1,1) model, whose estimate for the December 98 maturity is presented in table 1 of the appendix. The values obtained for the MA and AR coefficients are close to each other, but we decided to keep both because the correlogram of the returns is not what one would expect from white noise. The correlogram of the residuals of this model does allow us to state that it captures the linear correlations present in the series. The existence of a linear model for the return series implies that returns at one point in time depend on returns at earlier moments, against the efficient market hypothesis of Fama [8].
In the five-minute return series, the correlograms do not allow a clear identification of the orders of the autoregressive and moving-average parts, so the Schwarz information criterion [11] has been used instead. Following this criterion, the best model for the five-minute returns of the December maturity is an ARMA(2,4), whose estimate is presented in table 2. The order obtained in table 2 is higher than for the other temporally aggregated maturities, included in the appendix, for which AR(1) models or even white noise were obtained. A possible cause is that, as can be seen in figures 1 and 2, between 8 and 9 October 1998 the market fell (figure 6) by 29 times the standard deviation of the return series, owing to instability in the US government.
Figure 6: Price evolution of the 10-year Notional Bond on 8 and 9 October 1998.
The change in prices on 9 October is not attributable to the internal dynamics of the market, so an ARIMA model can hardly capture it; what it does cause is an artificial increase in the number of parameters needed to explain it with this type of model. It was therefore decided to treat this movement as an anomalous value that can be modelled by adding to the model an artificial step variable, taking the value zero for dates before 9 October 1998 and one from then on. The definitive model, following the Schwarz criterion, is the last one in table 2, containing only an MA(1) term, a much more reasonable model than the ARMA(2,4) shown second to last in the same table. The residual correlograms of both the MA(1) model for the five-minute data and the ARMA(1,1) model for the tick-by-tick series are presented in the appendix and allow us to state that both models capture the linear correlations of the two return series. Once the behaviour of the mean of the returns has been modelled, the behaviour of the variance of the series must be studied, since the plots of the returns already suggested that the variance might not be constant.
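A sketch of the intervened December model with statsmodels; the series name `r` and the date `crash_date` are placeholders. Note that the step variable defined on log prices becomes an impulse once the series is differenced into returns.

```python
# Sketch: MA(1) model for five-minute returns with an intervention dummy,
# assuming `r` is the return Series and `crash_date` marks 9 October 1998.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# The step variable on log prices, once differenced, is an impulse on returns.
impulse = (r.index.normalize() == crash_date).astype(float)

fit = ARIMA(r, exog=impulse, order=(0, 0, 1)).fit()
print(fit.summary())
print("Schwarz criterion (BIC):", fit.bic)  # used above to compare models
```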
5. Models for the volatility of the 10-year bond future
Analysing the correlograms of the residuals of the ARIMA models, it cannot be claimed that they are generated by mutually independent variables. From the correlograms of the squared residuals (in the appendix) and of their absolute values, it can be confirmed that dependencies exist in the second-order moments, which can be captured by GARCH-type models such as those presented in the second section of this study. Given the structure of the correlograms of the squared residuals of the models obtained in the previous section, GARCH(1,1) models have been estimated for the variance, both for the tick-by-tick series (table 3) and for the five-minute series (table 4). In the model estimated for the tick-by-tick series, the coefficients of the GARCH part are significant and sum to 0.85656, so the condition α + β < 1 required by these models is met. After temporal aggregation into five-minute series, however, the estimated model is an IGARCH, that is, α + β = 1, so the process is said to be integrated in variance. Persistence in variance is a feature that was not present in the tick-by-tick series and is therefore a result of the temporal aggregation. The plot of the return series casts doubt on the validity of this result: the return graphs show no persistence in variance, but the presence of anomalous values does stand out. For this reason, we check whether the presence of these outliers is enough to explain the appearance of IGARCH models. If artificial variables for the outliers are included in the mean equation of the returns, the estimated variance equation corresponds to table 7. Thirteen impulse variables, referring to thirteen anomalous values of the series, are included in the mean equation. Outliers have been identified by flagging returns that exceed seven times the standard deviation of the series. At this level, the moving-average coefficient of the model is no longer significant and has been removed. Once the effect of the outliers is removed, the five-minute returns can be considered independent. Likewise, the inclusion of outliers reduces the persistence in variance, so this feature, found after temporal aggregation, can be regarded as the result of not treating the outliers properly. The results are the same if the GARCH models are replaced by GARCH-M: in the tick-by-tick series the coefficients sum to less than one (table 5), while for the five-minute aggregation the integrated variance appears again and can be avoided by including outliers (table 8). It can also be seen that the standard deviation is a significant variable in the behaviour of the mean of the returns, so it can be stated that the higher the volatility of the futures contract, the higher its expected return. These results are confirmed by likelihood-ratio tests included in the appendix (table 11). Finally, EGARCH models also show reduced persistence in variance when the corresponding outliers are added to the mean equation. Table 6 presents the EGARCH estimate for the tick-by-tick series; the results after temporal aggregation are those of table 9. After the identified outliers are included, the coefficient β of the conditional variance equation falls from 0.94 to 0.89. Overall, the results are the same regardless of the model used to explain the behaviour of the variance of the returns.
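The persistence diagnosis can be reproduced with the `arch` package; a sketch under the assumption that `r` holds one of the return series (rescaled, since the optimizer works better with data of order one):

```python
# Sketch: GARCH(1,1) fit and persistence check (alpha + beta), assuming
# `r` is a return Series; values of alpha + beta at or very near 1
# correspond to the IGARCH behaviour discussed above.
from arch import arch_model

res = arch_model(100 * r, mean="Constant", vol="GARCH", p=1, q=1).fit(disp="off")
persistence = res.params["alpha[1]"] + res.params["beta[1]"]
print(res.summary())
print("alpha + beta =", persistence)
```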
6. Conclusions

By way of summary of the results obtained, the following conclusions can be drawn:

- The market for the futures contract on the 10-year Notional Bond traded at MEFF was not efficient during 1998 when the tick-by-tick series, in which the time intervals are not constant, are considered. After temporal aggregation into five-minute series, there are maturities for which the market is efficient and maturities for which it is not.
- In all the tick-by-tick series studied, volatility can be said to have a positive influence on the value of mean returns. In the five-minute series, after temporal aggregation, only the December 1998 maturity keeps volatility as a significant variable for explaining expected returns.
- The temporal aggregation of the series brings about a persistence in variance that was not present in the tick-by-tick series; this persistence goes as far as integration of the variance for the September and December 1998 maturities. It is shown that the persistence arising from temporal aggregation is due to outliers in the series that have not been properly treated.

To correct the persistence problem in these models, the following strategy is proposed (a code sketch is given below):

1. Estimate the ARIMA model that best captures the linear correlations detected in the mean of the returns.
2. Identify the outliers and include them in the model for the mean of the returns.
3. Estimate the GARCH model for the volatility on the residuals of the previous step, to avoid the spurious appearance of persistence in variance.
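The following sketch strings the three steps together; the 7-standard-deviation outlier threshold follows the text, while the function and variable names are illustrative.

```python
# Sketch of the proposed strategy: (1) ARIMA for the mean, (2) impulse
# dummies for outliers beyond k standard deviations, (3) GARCH(1,1) on the
# cleaned residuals. `r` is assumed to be the return Series.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from arch import arch_model

def outlier_robust_garch(r: pd.Series, order=(1, 0, 1), k: float = 7.0):
    # Step 1: ARIMA for the linear correlations in the mean; in the paper
    # the order is identified from correlograms / the Schwarz criterion.
    fit0 = ARIMA(r, order=order).fit()
    # Step 2: flag outliers on the residuals and add one impulse dummy each.
    resid = fit0.resid
    flagged = r.index[np.abs(resid) > k * resid.std()]
    dummies = pd.DataFrame(
        {f"D{i}": (r.index == t).astype(float) for i, t in enumerate(flagged)},
        index=r.index,
    )
    mean_fit = ARIMA(r, exog=dummies if len(flagged) else None, order=order).fit()
    # Step 3: GARCH(1,1) on the cleaned residuals, avoiding the spurious
    # persistence in variance discussed above.
    vol_fit = arch_model(100 * mean_fit.resid, mean="Zero", vol="GARCH",
                         p=1, q=1).fit(disp="off")
    return mean_fit, vol_fit
```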
Appendix: tables and figures
Figure 7: Correlogram and partial correlogram of the tick-by-tick return series. December 98 maturity.
Figure 8: Correlogram and partial correlogram of the residuals of the ARMA model for the tick-by-tick return series. December 98 maturity.
Figure 9: Correlogram and partial correlogram of the five-minute return series. December 98 maturity.
Figure 10: Correlogram and partial correlogram of the residuals of the ARMA model for the five-minute return series. December 98 maturity.
Figure 11: Correlogram and partial correlogram of the squared residuals (tick-by-tick). December 98 maturity.
Figure 12: Correlogram and partial correlogram of the squared residuals (five minutes). December 98 maturity.
     March               June                September           December
φ    0.32878 (0.01046)   0.30093 (0.01218)   0.22489 (0.01591)   0.42662 (0.02665)
θ    0.49525 (0.00962)   0.46112 (0.01134)   0.36569 (0.01519)   0.49219 (0.02565)

Table 1: ARIMA models estimated on trade-by-trade series (standard errors in parentheses),
(1 − φL)(1 − L) log P_t = (1 − θL) ε_t .
              March                  June   Sept.   December             Dec. (intervened)
φ1            -0.055347 (0.013985)                  -0.1905 (0.0096)
φ2                                                  -0.9664 (0.0093)
θ1                                                  -0.2535 (0.0158)    -0.0438 (0.0128)
θ2                                                  -1.0395 (0.0158)
θ3                                                  -0.0829 (0.0133)
θ4                                                  -0.0889 (0.0130)
Intervention                                                            -0.01229 (0.000382)

Table 2: ARIMA models estimated on five-minute series,
(1 − φ1 L − φ2 L² − ...)(1 − L) log P_t = β(1 − L)S_i + (1 − θ1 L − θ2 L² − ...) ε_t .
                    March                      June                    September                December
Mean equation
  φ                 0.28962 (0.00799)          0.29038 (0.00808)       0.35677 (0.01066)        0.24889 (0.00924)
  θ                 0.49952 (0.00717)          0.49025 (0.00709)       0.48695 (0.01038)        0.39599 (0.00861)
Variance equation
  α0                4.29670e-10 (1.74634e-12)  1.81967e-10 (0.00000)   6.08034e-11 (0.00000)    9.12470e-10 (0.00000)
  α1                0.04452 (1.82621e-4)       0.04610 (1.66956e-4)    0.04489 (1.10201e-4)     0.12880 (1.25574e-4)
  β1                0.81900 (6.68304e-4)       0.88490 (4.40610e-4)    0.94382 (9.44670e-5)     0.72776 (2.01349e-4)

Table 3: GARCH(1,1) models estimated on trade-by-trade series,
(1 − φL)(1 − L) log P_t = (1 − θL) ε_t ,    σ²_t = α0 + α1 ε²_{t−1} + β1 σ²_{t−1} .
                    March                    June                      September                 December
Mean equation
  φ                 -0.0388 (0.016)
  θ                                                                                              -0.0288 (0.0116)
  Intervention                                                                                   -0.015 (1.4711e-4)
Variance equation
  α0                1.4178e-8 (2.6667e-10)   5.71992e-9 (1.90239e-10)  1.24850e-9 (4.15802e-11)  6.5243e-9 (1.0162e-10)
  α1                0.4484 (0.0118)          0.30337 (0.00892)         0.31904 (0.00683)         0.2207 (2.4901e-3)
  β1                0.5244 (6.6466e-3)       0.63272 (0.00994)         0.68096 (0.00330)         0.7793 (2.0446e-3)

Table 4: GARCH(1,1) models estimated on five-minute series,
(1 − φL)(1 − L) log P_t = β(1 − L)S_i + (1 − θL) ε_t ,    σ²_t = α0 + α1 ε²_{t−1} + β1 σ²_{t−1} .
                    March                      June                    September               December
Mean equation
  φ                 0.30339 (0.00725)          0.29046 (0.00807)       0.35762 (0.01076)       0.25370 (0.00903)
  θ                 0.50517 (0.00688)          0.49039 (0.00708)       0.48786 (0.01056)       0.40159 (0.00836)
  δ (on σ_t)        0.00569 (6.95618e-4)       0.00285 (8.91897e-4)    0.00377 (7.35650e-4)    0.01216 (5.90851e-4)
Variance equation
  α0                1.04052e-10 (0.00000)      1.82752e-10 (0.000)     6.08072e-11 (0.000)     8.99021e-10 (0.00000)
  α1                0.03313 (9.65349e-5)       0.04628 (1.68416e-4)    0.04490 (1.13409e-4)    0.12841 (1.39677e-4)
  β1                0.93744 (1.66618e-4)       0.88442 (4.42004e-4)    0.94381 (9.58521e-5)    0.73064 (2.47990e-4)

Table 5: GARCH-M(1,1) models estimated on trade-by-trade series,
(1 − φL)(1 − L) log P_t = δ σ_t + (1 − θL) ε_t ,    σ²_t = α0 + α1 ε²_{t−1} + β1 σ²_{t−1} .
                    March                June                 September            December
Mean equation
  φ                 0.27064 (0.00549)    0.31502 (0.00691)    0.26809 (0.00867)    -0.9760 (4.3809e-4)
  θ                 0.46480 (0.00473)    0.50453 (0.00591)    0.39398 (0.00821)    -0.9802 (3.8406e-4)
Variance equation
  α0                -0.4700 (0.00168)    -0.8924 (0.00321)    -0.4052 (0.00106)    -0.0309 (9.1028e-5)
  α1                0.0659 (0.00016)     0.0771 (0.00018)     0.0895 (0.00013)     0.0529 (5.6744e-5)
  γ                 -0.3910 (0.00212)    -0.1638 (0.00239)    -0.0831 (0.00183)    -0.4494 (1.5922e-3)
  β1                0.9750 (0.00008)     0.9538 (0.00016)     0.9778 (0.00005)     0.9973 (5.3828e-6)

Table 6: EGARCH(1,1) models estimated on trade-by-trade series,
(1 − φL)(1 − L) log P_t = (1 − θL) ε_t ,
ln σ²_t = α0 + α1 [ |ε_{t−1}/σ_{t−1}| − √(2/π) ] + γ (ε_{t−1}/σ_{t−1}) + β1 ln σ²_{t−1} .
                    March                   June                   September              December
Mean equation
  Intervention 1    0.00484 (0.00013)       0.00275 (5.026)        -0.00164 (0.0172)      0.00332 (0.00025)
  Intervention 2    0.00184 (0.00069)       0.00156 (0.0444)       0.00272 (0.00032)      -0.00565 (0.00007)
  Intervention 3    0.00290 (0.00074)       -0.00187 (0.00009)     -0.00137 (0.00006)     0.00489 (0.00238)
  Intervention 4    -0.00344 (0.00015)      -0.00199 (0.00091)     -0.00173 (0.0121)      -0.00219 (0.00007)
  Intervention 5    0.00279 (0.0294)        -0.00571 (0.00011)     -0.00179 (0.8795)      -0.00358 (41.6558)
  Intervention 6    -0.00259 (0.00301)      0.00218 (0.00006)      0.00154 (0.00013)      -0.0110 (0.00044)
  Intervention 7    0.00206 (0.00013)       0.00175 (0.0933)       0.00296 (0.0113)       -0.00879 (0.00035)
  Intervention 8    -0.00205 (0.5179)       -0.00141 (0.00003)     0.00529 (0.00010)      -0.00148 (0.0003)
  Intervention 9    0.00194 (0.00069)       0.00208 (0.00007)      -0.0144 (0.00015)      0.00297 (0.00128)
  Intervention 10   0.00279 (0.00005)       0.00185 (0.00009)      0.00251 (0.00016)      0.00823 (0.00024)
  Intervention 11   0.00388 (0.00089)       0.00175 (0.00006)      -0.00225 (0.00010)     0.00411 (0.00007)
  Intervention 12   0.00234 (0.00102)       0.00181 (0.00018)      -0.00146 (0.00143)     -0.00363 (0.00017)
  Intervention 13   0.00194 (0.00072)                              0.00217 (0.00007)      0.00507 (0.00007)
  Intervention 14   0.00197 (0.00092)                              -0.00146 (0.00016)
  Intervention 15                                                  0.00128 (0.00008)
  Intervention 16                                                  -0.00174 (0.00067)
Variance equation
  α0                3.249e-9 (1.3606e-10)   2.6164e-9 (1.116e-10)  2.055e-9 (7.8331e-11)  5.9044e-9 (1.5213e-10)
  α1                0.1701 (0.00742)        0.2913 (0.0096)        0.2650 (0.0089)        0.1981 (0.00530)
  β1                0.7789 (0.0076)         0.686 (0.00748)        0.7348 (0.0066)        0.7679 (0.00448)

Table 7: GARCH(1,1) models estimated on five-minute series with interventions.
                    March                    June                     September               December
Mean equation
  σ_t               0.00267 (0.0139)         0.00137 (0.0101)         0.00261 (0.00998)       0.0426 (0.0102)
  Intervention 1    0.00484 (0.00013)        0.00274 (0.0498)         -0.00166 (2.4119)       0.00353 (0.00068)
  Intervention 2    0.00195 (0.00816)        0.00155 (0.2711)         0.00273 (0.00038)       -0.00565 (0.00007)
  Intervention 3    0.00301 (0.00039)        -0.00187 (0.00009)       -0.00138 (0.00006)      0.00468 (0.00115)
  Intervention 4    -0.00342 (0.00016)       -0.00197 (0.00109)       -0.00173 (0.0207)       -0.00220 (0.00007)
  Intervention 5    0.00278 (0.0660)         -0.00567 (0.00012)       -0.00179 (0.00559)      -0.00363 (15.9145)
  Intervention 6    -0.00262 (0.00919)       0.00218 (0.00006)        0.00186 (0.00014)       -0.0321 (0.00010)
  Intervention 7    0.00205 (0.00013)        0.00174 (0.0305)         0.00295 (0.0136)        -0.00768 (0.2444)
  Intervention 8    -0.00205 (0.00482)       -0.00141 (0.00003)       0.00532 (0.00011)       -0.00323 (0.2176)
  Intervention 9    0.00194 (0.0585)         0.00207 (0.00007)        -0.0142 (0.00015)       0.00360 (0.0321)
  Intervention 10   0.00239 (0.00022)        0.00185 (0.00008)        0.00250 (0.00015)       0.00820 (0.00024)
  Intervention 11   0.00385 (0.1670)         0.00175 (0.00006)        -0.00227 (0.00010)      0.00410 (0.00007)
  Intervention 12   0.00239 (0.00121)        0.00180 (0.00017)        -0.00148 (0.00169)      -0.00321 (0.00010)
  Intervention 13   0.00201 (0.0218)                                  0.00216 (0.00007)       0.00506 (0.00007)
  Intervention 14   0.00186 (0.00028)                                 -0.00147 (0.00016)
  Intervention 15                                                     0.00139 (0.00009)
  Intervention 16                                                     -0.00155 (0.00039)
Variance equation
  α0                3.1367e-9 (1.3039e-10)   2.5965e-9 (1.1155e-10)   1.8668e-9 (7.234e-11)   6.2260e-9 (1.7506e-10)
  α1                0.1706 (0.0075)          0.2924 (0.00963)         0.2554 (0.00845)        0.1887 (0.00546)
  β1                0.7809 (0.0076)          0.6862 (0.00749)         0.7441 (0.00636)        0.7697 (0.00477)

Table 8: GARCH-M(1,1) models estimated on five-minute series with interventions.
                    March                June                 September            December
Mean equation
  Intervention 1    0.00481 (0.00016)    0.00187 (0.00006)    -0.00165 (0.00101)   0.00354 (0.00043)
  Intervention 2    0.00202 (0.00697)    0.00155 (0.00079)    0.00272 (0.00028)    -0.00345 (0.00007)
  Intervention 3    0.00264 (0.00055)    -0.00184 (0.00008)   -0.00186 (0.00008)   0.00502 (0.00042)
  Intervention 4    -0.00333 (0.00012)   -0.00213 (0.00033)   -0.00173 (0.00078)   -0.00227 (0.00009)
  Intervention 5    0.00271 (0.00115)    0.00174 (0.00197)    -0.00179 (0.00584)   -0.00464 (0.00077)
  Intervention 6    -0.00534 (0.00003)   0.00218 (0.00007)    0.00184 (0.00019)    -0.00089 (0.00013)
  Intervention 7    0.00275 (0.00011)    0.00176 (0.00058)    0.00295 (0.00082)    -0.00757 (0.53122)
  Intervention 8    -0.00205 (0.00054)   -0.00191 (0.00002)   0.00535 (0.00009)    -0.00386 (0.6342)
  Intervention 9    0.00194 (0.00080)    0.00185 (0.00012)    -0.00465 (0.00030)   0.00622 (0.00192)
  Intervention 10   0.00230 (0.00039)    0.00151 (0.00005)    0.00158 (0.00012)    0.00858 (0.00041)
  Intervention 11   0.00386 (0.00084)    0.00122 (0.00007)    -0.00150 (0.00013)   0.00326 (0.00012)
  Intervention 12   0.00240 (0.00111)    0.00186 (0.00025)    -0.00133 (0.00045)   -0.00362 (0.00018)
  Intervention 13   0.00202 (0.00078)                         0.00161 (0.00009)    0.00326 (0.00011)
  Intervention 14   0.00202 (0.00060)                         -0.00151 (0.00022)
  Intervention 15                                             0.00109 (0.00075)
  Intervention 16                                             -0.00012 (0.00039)
Variance equation
  α0                -2.4814 (0.1005)     -1.7856 (0.0653)     -1.4175 (0.0507)     -1.6344 (0.0343)
  α1                0.2994 (0.0114)      0.4042 (0.0102)      0.3837 (0.0091)      0.2553 (0.0057)
  γ                 -0.1851 (0.0267)     -0.09409 (0.0142)    0.0146 (0.0138)      0.3585 (0.016)
  β1                0.8520 (0.0059)      0.8945 (0.0037)      0.9150 (0.0029)      0.8983 (0.0021)

Table 9: EGARCH(1,1) models estimated on five-minute series with interventions.
        March   June   September   December
 1σ     1093    851    1149        685
 2σ     178     243    240         146
 3σ     57      104    91          75
 4σ     32      55     61          48
 5σ     19      26     30          31
 6σ     15      19     18          20
 7σ     14      12     16          13
 8σ     9       10     6           11
 9σ     8       5      4           8
10σ     5       1      4           6
11σ     2       1      4           4
12σ     2       1      3           3
13σ     2       1      2           3
14σ     2       0      2           3
15σ     1       0      2           3

Table 10: Number of cases exceeding n times the standard deviation of the (five-minute) return series.
             Log-likelihood (GARCH)   Log-likelihood (GARCH-M)   Likelihood-ratio statistic   p-value
March        1,732,443                1,733,545                  2205.2985                    0
June         1,509,072                1,509,075                  5.5974                       0.01799
September    1,339,901                1,339,905                  8.8584                       0.00292
December     1,350,497                1,350,523                  52.1595                      0

Table 11: Likelihood-ratio test of the GARCH-M model against the GARCH model for the trade-by-trade return series of the 10-year bond future.
                   tick-by-tick   five minutes (no intervention)   five minutes (with intervention)
GARCH (α + β)      0.85656        1                                0.9660
GARCH-M (α + β)    0.85905        1                                0.9584
EGARCH (β)         0.99730        0.9417                           0.8983

Table 12: Persistence in the conditional variance. Coefficients of the variance equation in the volatility models estimated for the December 98 maturity.
References

[1] Bollerslev, T. (1986): "Generalized autoregressive conditional heteroskedasticity". Journal of Econometrics 31, 307-327.
[2] Bollerslev, T. (2001): "Financial Econometrics: Past Developments and Future Challenges". Journal of Econometrics 100, 41-51.
[3] Bollerslev, T., Chou, R.Y. and Kroner, K.F. (1992): "ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence". Journal of Econometrics 52, 498-505.
[4] Carnero, M.A., Peña, D. and Ruiz, E. (2001): "Outliers and conditional autoregressive heteroscedasticity in time series". Working Paper 01-07, Statistics and Econometrics Series 04, February 2001, Departamento de Estadística y Econometría, Universidad Carlos III de Madrid.
[5] Engle, R.F. (1982): "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation". Econometrica 50, 987-1007.
[6] Engle, R.F. (2001): "Financial Econometrics - A New Discipline with New Methods". Journal of Econometrics 100, 53-56.
[7] Engle, R.F., Lilien, D.M. and Robins, R.P. (1987): "Estimating time varying risk premia in the term structure: the ARCH-M model". Econometrica 55, 391-407.
[8] Fama, E. (1965): "The Behavior of Stock Market Prices". Journal of Business 38, 34-105.
[9] Nelson, D.B. (1991): "Conditional heteroskedasticity in asset returns: A new approach". Econometrica 59, 347-370.
[10] Morales, E. (1993): Modelos de predicción y adopción de decisiones: El caso de los tipos de cambio diarios de la peseta. Doctoral thesis, Universidad Complutense de Madrid, Facultad de Ciencias Económicas y Empresariales.
[11] Schwarz, G. (1978): "Estimating the dimension of a model". Annals of Statistics 6, 461-464.

Ricardo Gimeno Nogués
Universidad Pontificia Comillas de Madrid (ICAI-ICADE)
Alberto Aguilera 23
28015 Madrid, Spain
[email protected]

Eduardo Morales Martínez
Universidad San Pablo-CEU
Julián Romea 23
28003 Madrid, Spain
[email protected]
Application of stochastic unit root models to stock market indices
Román Mínguez Salido and Eduardo Morales Martínez¹
Abstract: The primary aim of this article is the question of whether daily financial data are generated by a fixed unit root process or by a stochastic unit root process whose root has unit expected value. The latter could explain why the process appears stationary at one point in time and non-stationary at another. Once the stochastic unit root process has been defined, we design the appropriate tests for any sample size and obtain the corresponding response surfaces. In particular, we test for the presence of stochastic unit roots (the alternative hypothesis) in 26 stock exchange indices, for each of which we have 1454 observations. The test results cast some doubt on the validity of the null hypothesis (fixed unit root), since the null has been rejected in more than half of the series analysed. Finally, we estimate the in-sample values of the stochastic unit root for the series under study by applying the Kalman smoothing technique.
1. Introduction

The statistical analysis of daily financial series has focused mainly on the study of the probability distributions of returns, approximated by the first difference of the natural logarithms of the corresponding observations (exchange rates, stock indices, etc.). The conclusion of most of this work is that such returns are generated by random variables with leptokurtic marginal distributions and conditional distributions that are uncorrelated but not independent. Since the pioneering work of Engle (1982) and Bollerslev (1986), the non-linear dependence has been analysed with some of the many variants of ARCH and GARCH models. These models, recall, propose a scheme of dependence of the variance on its own past, built on a function of the returns (under the assumption that these are generated by a white noise process) or of the residuals of a model with ARMA structure for those returns.

¹ Román Mínguez Salido and Eduardo Morales Martínez are professors of Econometrics at Universidad San Pablo-CEU, Madrid. This talk was given by the first author at the November 2001 session of the Instituto MEFF-RiskLab Seminar.
For the purposes of the work presented here, the relevant fact is that the models just cited do not question the existence of a unit root that is fixed over the whole sample analysed, and they adopt the logarithmic transformation as a way to achieve homoscedasticity in the marginal distributions. From a purely non-statistical standpoint, this way of proceeding finds justification in the meaning of the term return in the financial universe. This work addresses the problem of whether daily financial data are generated by a stochastic process with a fixed unit root or with a stochastic root, of mean equal to one, that allows the data-generating process to be stationary in some periods and non-stationary (homogeneous or explosive) in others. Specifically, the procedure for detecting stochastic unit roots has been applied to 26 stock indices for which a sample of 1454 observations is available. The test results cast some doubt on the validity of the null hypothesis of a fixed unit root, since it is rejected in more than half of the series analysed. Subsequently, the values of the stochastic unit root over the sample period are estimated by Kalman smoothing for each of the indices analysed. In most cases the root oscillates around one, and the periods in which it shows greater variability coincide with higher volatility in the returns. The rest of the paper is organised as follows. Section 2 describes the methodology, with special emphasis on the models used, the test statistics and the procedure for generating critical values based on the construction of response surfaces. Section 3 describes the statistical characteristics of the stock index data analysed and presents the results of the estimation of the stochastic unit root for each index. Finally, the last section presents the conclusions of the paper.
2. Methodology

2.1. Stochastic unit root models

Stochastic unit root models, set out by Leybourne, McCabe and Mills (1996, p. 255), start from the equation

(1)    y_t = ρ_t y_{t−1} + ε_t ,    ε_t i.i.d. N(0, σ_ε²) ,

where the process ρ_t can be written as ρ_t = 1 + δ_t, and δ_t may be generated by an AR(1) process or by a random walk:

(2)    δ_t = γ δ_{t−1} + η_t ,    η_t i.i.d. N(0, σ_η²) ,
(3)    δ_t = δ_{t−1} + η_t ,    η_t i.i.d. N(0, σ_η²) .
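To build intuition for equations (1)-(3), the following sketch simulates the process; the parameter values are illustrative only.

```python
# Sketch: simulate a stochastic-unit-root series, eqs. (1)-(2); setting
# gamma = 1 gives the random-walk root of eq. (3). delta_0 = 0 as in the text.
import numpy as np

def simulate_stur(T=1000, gamma=0.5, sigma_eps=1.0, sigma_eta=0.05, seed=0):
    rng = np.random.default_rng(seed)
    y = np.zeros(T)
    delta = 0.0
    for t in range(1, T):
        delta = gamma * delta + rng.normal(0.0, sigma_eta)            # eq. (2)
        y[t] = (1.0 + delta) * y[t - 1] + rng.normal(0.0, sigma_eps)  # eq. (1)
    return y
```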
In this model the process y_t always has a root with unit mean; if σ_η² = 0 the unit root is fixed² (that is, ρ_t = 1 in every period), whereas if σ_η² > 0 the unit root is generated by a stochastic process (the values of ρ_t oscillate around one). The reason for choosing a first-order autoregressive process or a random walk to model the stochastic unit root is to capture possible dependence in the evolution of the root. This modelling generalises that of Leybourne, McCabe and Tremayne (1996), in which the stochastic process assumed for the root is white noise (it seems less realistic to suppose that the movements of the root have nothing to do with the past). The degree of dependence in the trajectory of the stochastic unit root is given by the value of the parameter γ (the limiting case γ = 1 yields the random walk).
2.2. Statistics for testing the presence of a stochastic unit root
To discriminate whether data from a series with a unit root are better explained by a fixed unit root process or by a stochastic unit root process, Leybourne, McCabe and Mills (1996, pp. 255-261) develop tests of H0: σ_η² = 0 (fixed unit root) against H1: σ_η² > 0 (stochastic unit root). The tests allow the variable y_t in equation (1) to include, besides the unit root, a stationary autoregressive structure of AR(p+1) type around some trend function. The test starts by computing the OLS residuals of the regression

(4)    Δy_t = α + β t + Σ_{i=1}^{p} φ_i Δy_{t−i} + ε_t .

From these residuals one builds the accumulated sum ŵ_t = Σ_{j=p+1}^{t} ε̂_j and the estimators

σ̂² = (1/(T − (p+1))) Σ_{t=p+2}^{T} ε̂_t² ,    κ̂² = (1/(T − (p+1))) Σ_{t=p+2}^{T} (ε̂_t² − σ̂²)² .
The statistic for testing the existence of a stochastic unit root with AR(1) structure (equations (1) and (2)) is

(5)    Z1 = [T − (p+1)]^{−3/2} σ̂_ε^{−2} κ̂^{−1} Σ_{t=p+2}^{T} ŵ²_{t−1} (ε̂_t² − σ̂_ε²) .

² It is assumed, as initial condition, that δ_0 = 0.
If one wants to test whether the stochastic unit root is generated by a random walk (equations (1) and (3)), the statistic is

(6)    E1 = [T − (p+1)]^{−3} σ̂_ε^{−4} Σ_{i=p+2}^{T} [ ( Σ_{t=i}^{T} ε̂_t ŵ_{t−1} )² − σ̂_ε² Σ_{t=i}^{T} ŵ²_{t−1} ] .
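A sketch of both statistics as reconstructed above; the handling of the initial observations and the degrees of freedom follows the formulas literally and may differ in small details from the authors' own code.

```python
# Sketch: Z1 and E1 of eqs. (5)-(6) computed from regression (4); `y` is the
# level series and p the number of lagged differences (trend=False gives the
# residuals used for the Z2/E2 variants, where only a constant is included).
import numpy as np
import statsmodels.api as sm

def stur_statistics(y, p, trend=True):
    dy = np.diff(np.asarray(y, dtype=float))
    rows = range(p, len(dy))
    X = np.array([[1.0] + ([float(t)] if trend else [])
                  + [dy[t - i] for i in range(1, p + 1)] for t in rows])
    eps = sm.OLS(dy[p:], X).fit().resid
    w = np.cumsum(eps)                      # accumulated residual sums
    n = len(eps)                            # plays the role of T - (p + 1)
    s2 = np.mean(eps**2)
    k2 = np.mean((eps**2 - s2) ** 2)
    z1 = n**-1.5 / (s2 * np.sqrt(k2)) * np.sum(w[:-1] ** 2 * (eps[1:] ** 2 - s2))
    e1 = n**-3 / s2**2 * sum(
        np.sum(eps[i:] * w[i - 1:-1]) ** 2 - s2 * np.sum(w[i - 1:-1] ** 2)
        for i in range(1, n)
    )
    return z1, e1
```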
The distribution of both statistics converges asymptotically to functionals of Brownian motions (Leybourne, McCabe and Mills, pp. 259-260), so simulation is needed to obtain critical values usable in finite samples. Moreover, if regression (4) includes a constant instead of a linear trend³, the asymptotic distribution of the above statistics changes (something usual in unit root tests, whose distribution typically depends on the trend function specified in the regression). In the notation used in Leybourne, McCabe and Mills (1996) the corresponding test statistics are called E2 and Z2. Consequently, the critical values to be used in each case must differ. All the tests considered reject H0 (that is, they detect the existence of a stochastic unit root) if, for a given sample size, the value of the test statistic exceeds the corresponding critical value.
2.3. Response surfaces and obtaining critical values

To obtain suitable critical values for each case, one can consult Leybourne, McCabe and Tremayne (1996) for the Z1 test, Leybourne, McCabe and Mills (1996) for the E1 and E2 tests and Taylor and Van Dijk (1999) for the Z2 test. In this work, however, we have chosen to use response surfaces⁴ computed by simulation for each test. These response surfaces are developed following a methodology similar to that of MacKinnon (1991, 1994, 1996, 2000), who describes the specification of equations, or response surfaces, in which the dependent variable is the quantile of interest while the independent variables are negative powers of the sample size T. This functional specification has a theoretical justification based on the convergence rate of the estimators, which is of order⁵ O(T^k), so that equations of the form

(7)    q^α = β0^α + β1^α (1/T) + β2^α (1/T²) + ... + βk^α (1/T^k)

capture the asymptotic distribution of the statistic in the term β0^α, while the remaining terms capture the differences between the finite-sample quantiles of the statistic and the asymptotic distribution.

The first step of this methodology is the design and execution of a Monte Carlo experiment to obtain the quantiles of interest of the tests studied. In the Monte Carlo experiment, sampling distributions of each test statistic (Z_i and E_i) were simulated under the null hypothesis (fixed unit root), for which replications were generated of the random walk

(8)    y_t = y_{t−1} + ε_t ,    ε_t i.i.d. N(0, σ_ε²) ,

for different sample sizes, obtaining the value of the statistic in each of them and the quantiles over the different repetitions. The vector of 30 sample sizes used in the full simulation is T = <20, 30, ..., 100, 125, 150, ..., 500, 600, ..., 1000>; for each sample size in T, M Monte Carlo experiments are run, each with N replications⁶. Running, for each sample size, M experiments of N repetitions each, instead of a single simulation with MN replications, makes it possible to estimate the sampling variability of each estimated quantile, thus allowing the response surfaces to be estimated by Weighted Least Squares⁷. For each of the M experiments with N replications, the vector of 225 quantiles is stored for the values

(9)    α = <0.0001, 0.0002, ..., 0.001, 0.002, ..., 0.01, 0.015, ..., 0.99, 0.991, ..., 0.999, 0.9995, 0.9996, ..., 0.9999> .

³ Including a linear trend on Δy_t implies the existence of a quadratic trend in y_t.
⁴ Response surfaces are equations that yield, for any desired quantile, the critical value of the statistic as a function of the corresponding sample size.
⁵ In our case the Z_i statistics converge at a rate T^{3/2} and the E_i statistics at a rate T³, so setting k = 3 in equation (7) is enough to capture the distribution of the statistic adequately.
⁶ MacKinnon (2000) recommends values of N = 100000 or 200000 and M = 50 or 100.
⁷ In fact, for each quantile α and each sample size T_i, M values of each statistic are obtained (each computed with N replications). This gives at least an approximate idea of the sampling variability of the statistic and an estimate of its standard deviation, which can be used as a weight in the least squares regressions.
The algorithm is summarised in the following box:

Algorithm for obtaining quantiles of the statistics by simulation

1. Set i = 1.
2. Set T_i = i-th element of the vector T.
3. Set j = 1.
4. Generate N standard normal random samples of size T_i. Replicate N times the random walk given in (8), taking as sample values of ε_t the N generated samples(a) and an initial value y_0 = 0.
5. For each of the N sample realisations of the random walk of size T_i, run an OLS regression Δy_t = α + βt + ε_t and compute the values of the statistics Z1 and E1, given in (5) and (6), with the residuals (ε̂_t) and accumulated residual sums (ŵ_t) obtained from the regression. The values of the statistics E2 and Z2 are obtained using the residuals and accumulated sums from the regression Δy_t = α + ε_t.
6. From the N values obtained for each statistic, compute the vector of quantiles α given in (9). As MacKinnon (2000) points out on page 3 of his paper, for the estimates of the quantiles of the test statistics to be valid, the number of repetitions per experiment, N, must be such that αN is an integer for every value of the vector α. Although in our case the number of replications is clearly insufficient for some sample sizes (it does not exceed 25000), this condition is met for all values of N and α.
7. If j < M set j = j + 1 and return to step 4. Otherwise continue with the next step.
8. If i < dim(T) set i = i + 1 and return to step 2. Otherwise end the algorithm.

(a) The Z_i and E_i tests are invariant to σ_ε².
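A compressed version of the box, with `statistic` standing for any of the four tests (for example the Z1 computation sketched earlier); the default N and M are illustrative and far below MacKinnon's recommended values.

```python
# Sketch of the Monte Carlo loop: for each sample size T, run M experiments
# of N random-walk replications (steps 4-5) and keep the empirical quantiles
# of the statistic (step 6).
import numpy as np

def simulate_quantiles(statistic, sizes, alphas, M=10, N=1000, seed=0):
    rng = np.random.default_rng(seed)
    quantiles = {}
    for T in sizes:
        experiments = []
        for _ in range(M):
            paths = np.cumsum(rng.standard_normal((N, T)), axis=1)  # y0 = 0
            values = np.array([statistic(path) for path in paths])
            experiments.append(np.quantile(values, alphas))
        quantiles[T] = np.vstack(experiments)  # M quantile vectors per size
    return quantiles
```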
The computations were carried out with the Ox program, version 2.20 (Doornik, 1998), using a random number generator of L'Ecuyer (1997) with approximate period 2^113. Once the quantiles of each test statistic have been estimated for each sample size, the equations given by (7) are estimated with k = 3 (to obtain the asymptotic distribution, the E_i statistics are rescaled by T^{−3} while the Z_i statistics are rescaled by T^{−3/2}), that is⁸:

(10)    q^α = β0^α + β1^α (1/T) + β2^α (1/T²) + β3^α (1/T³) + ε_α .
The Ordinary Least Squares estimation of the 225 equations (one equation for each quantile of the vector α), each with TM (4000) data points, shows clear evidence of heteroscedasticity. To estimate efficiently under heteroscedasticity of unknown form, we have chosen the generalised method of moments (Cragg, 1983), which in this case reduces to applying Weighted Least Squares in two steps:

1. Estimate the variance of the statistic associated with each sample size (given by σ̂*²_{T_i}) in the regression⁹

(11)    [q_i^α − q̄^α_{T_i}]² = γ_∞ + γ_1 (1/T_i) + γ_2 (1/T_i²) + υ_i ,

where q_i^α is the value of the statistic in each of the M experiments carried out (for each quantile of the vector α and each sample size T_i) and q̄^α_{T_i} is the mean value of the statistic over the M experiments. The fitted values of this regression give the variances σ̂²_{T_i} which, suitably modified, are used as weights in the subsequent regression.

2. For each quantile of the vector α, run the weighted least squares regression with the 30 sample means (one for each value T_i):

(12)    q̄^α_{T_i} / σ̂*α_{T_i} = β0^α (1/σ̂*α_{T_i}) + β1^α (1/(T_i σ̂*α_{T_i})) + β2^α (1/(T_i² σ̂*α_{T_i})) + β3^α (1/(T_i³ σ̂*α_{T_i})) + u_{T_i} .

The weight used is σ̂*²_{T_i} = σ̂²_{T_i} / M, where σ̂²_{T_i} denotes the fitted values, for each sample size, of the dependent variable in regression (11). The response surfaces estimated for the main quantiles can be consulted in the Annex (Tables A5 and A6). The equations obtained with the modified vector T exclude the smallest sample sizes (T = 20, 30, 40, 50), since this yields a clearly better fit as measured by a specification test given in MacKinnon (2000).

⁸ Moreover, once equation (10) has been estimated one can test in a simple way whether some regressor should be dropped. A value of k = 4 was also tried initially, but the results are clearly unsatisfactory.
⁹ The variances [q_i^α − q̄^α_{T_i}]²/M of each statistic are not used directly as weights because the results are more unstable.
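A sketch of the second-step weighted regression (12); `qbar` and `var_hat` are assumed dictionaries holding, respectively, the mean quantile over the M experiments and the fitted variance from regression (11) for each sample size.

```python
# Sketch: fit the response surface (10) by weighted least squares, using
# sigma*^2 = var_hat[T] / M as the variance of each sample mean (eq. 12).
import numpy as np
import statsmodels.api as sm

def fit_response_surface(sizes, qbar, var_hat, M):
    T = np.asarray(sizes, dtype=float)
    y = np.array([qbar[t] for t in sizes])
    X = np.column_stack([np.ones_like(T), 1 / T, 1 / T**2, 1 / T**3])
    weights = M / np.array([var_hat[t] for t in sizes])  # 1 / sigma*^2
    res = sm.WLS(y, X, weights=weights).fit()
    # params[0] estimates the asymptotic quantile beta_0^alpha; the rest
    # are the finite-sample correction terms.
    return res.params
```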
3. Empirical results

The Z_i and E_i tests are applied to a financial database made up of 1454 daily observations of 26 stock indices from the main world markets. The sample runs from 03-01-1994 to 29-07-1999. The statistical characteristics of the stock returns, obtained as differences of logarithms, are summarised in Table 1. In this table it can be seen that, as one would expect, all the series have a statistically zero mean; most are asymmetric and leptokurtic (the table reports the excess kurtosis). The objective is to distinguish whether the logarithm of each stock index has a fixed unit root (H0) or a stochastic unit root (H1). The results of all the tests on the logarithms of the stock indices are presented¹⁰ in Table 2.

Name           T     Mean      Std. dev.  Min      Max     Range   Skewness  Excess kurtosis
MADRIDGEN      1453  0.00064   0.01171   -0.0672   0.0573  0.1244  -0.449     4.605
MADRIDIBEX35   1453  0.00064   0.01293   -0.0734   0.0632  0.1366  -0.417     4.344
DAX30          1453  0.00055   0.01279   -0.0838   0.0611  0.1449  -0.617     4.09
CAC40          1453  0.00043   0.01224   -0.0563   0.061   0.1172  -0.11      2.065
FTSE100        1453  0.0004    0.00925   -0.0366   0.0435  0.0801  -0.102     1.918
MILAN30        1453  0.00053   0.01523   -0.0643   0.0697  0.134   -0.037     1.662
PORBVL30       1453  0.0007    0.01023   -0.0894   0.069   0.1584  -0.905    11.969
TORONSE200     1453  0.00032   0.0066    -0.0571   0.0292  0.0863  -1.426     9.677
NIKKEI225      1453  0.00002   0.01389   -0.0596   0.0766  0.1362   0.196     3.152
AUSTRALIA      1453  0.00024   0.0083    -0.0745   0.0607  0.1352  -0.402     6.827
NEWZEAL40      1453  0         0.00983   -0.1331   0.0948  0.2278  -1.228    29.773
DJINDUSTR      1453  0.00073   0.00924   -0.0746   0.0486  0.1232  -0.623     6.732
DJTRANSPO      1453  0.00045   0.01135   -0.0752   0.067   0.1421   0.06      3.147
DJUTILIT       1453  0.00022   0.00761   -0.0377   0.0259  0.0636  -0.187     1.491
DJCOMP65       1453  0.00058   0.00832   -0.069    0.0435  0.1125  -0.537     5.69
BANGKOK        1453  -0.00089  0.01983   -0.1003   0.1135  0.2138   0.747     4.515
JAKARTA        1453  0.00003   0.01825   -0.1273   0.1313  0.2586   0.369     9.627
KUALAMPUR      1453  -0.00035  0.02149   -0.2415   0.2082  0.4497   0.592    28.042
PHILIPPIN      1453  -0.00022  0.0166    -0.0974   0.0967  0.1941   0.006     4.501
KOREA          1453  0.00009   0.02067   -0.116    0.1002  0.2162   0.245     4.275
SINGDBS50      1453  0.00008   0.01385   -0.0782   0.093   0.1712   0.436     6.178
HANGSENG       1453  0.00006   0.01945   -0.1474   0.1725  0.3198   0.24     10.309
TAIWAN         1453  0.00013   0.01526   -0.0778   0.0852  0.163   -0.163     3.239
SHENZHEN       1453  -0.00026  0.02304   -0.167    0.1245  0.2915  -0.01      9.838
ISTANBUL100    1453  0.00225   0.0317    -0.1617   0.1661  0.3278  -0.069     3.006
JSEOVERALL     1453  0.00025   0.01123   -0.1185   0.067   0.1855  -1.316    14.654

Table 1: Statistical summary of the distribution of the logarithmic returns of the main stock indices. Sample of 1453 daily observations, from 04-01-1994 to 29-07-1999.
¹⁰ The number of lags shown in the table corresponds to the lags of the dependent variable included in regression (4). In this methodology, known as general-to-specific, a maximum number of lags (30) is chosen initially and the last lag is dropped until it is significant according to the usual t test. Ng and Perron (1995) show that this methodology has good power properties in unit root tests.
Name           Lags  Z1      Concl.    E1      Concl.  Lags  Z2      Concl.    E2      Concl.
MADRIDGEN      15    -0.015  n.s.      -0.009  n.s.    15    -0.91   n.s.      -0.051  n.s.
MADRIDIBEX35   27    0.039   n.s.      -0.029  n.s.    27    -1.172  n.s.      -0.08   n.s.
DAX30          22    -0.051  n.s.      -0.006  n.s.    22    -0.635  n.s.      -0.028  n.s.
CAC40          20    0.051   n.s.      0.016   n.s.    20    -1.231  n.s.      -0.123  n.s.
FTSE100        28    -0.025  n.s.      -0.011  n.s.    28    -0.969  n.s.      -0.069  n.s.
MILAN30        27    0.212   sig. 5%   0.018   n.s.    27    -0.712  n.s.      -0.033  n.s.
PORBVL30       16    0.671   sig. 1%   -0.039  n.s.    16    -0.139  n.s.      -0.066  n.s.
TORONSE200     27    0.284   sig. 1%   0.008   n.s.    27    0.233   sig. 10%  -0.004  n.s.
NIKKEI225      27    0.179   sig. 5%   0.019   n.s.    27    0.649   sig. 5%   0.02    n.s.
AUSTRALIA      26    -0.104  n.s.      0.012   n.s.    26    -0.109  n.s.      -0.104  n.s.
NEWZEAL40      26    0.194   sig. 5%   0.019   n.s.    26    0.099   n.s.      0.008   n.s.
DJINDUSTR      10    0.012   n.s.      0.02    n.s.    10    -0.05   n.s.      0.003   n.s.
DJTRANSPO      23    -0.066  n.s.      -0.017  n.s.    23    -0.357  n.s.      -0.039  n.s.
DJUTILIT       15    0.11    sig. 10%  0.011   n.s.    15    -0.7    n.s.      -0.117  n.s.
DJCOMP65       10    -0.138  n.s.      -0.002  n.s.    10    -0.463  n.s.      -0.039  n.s.
BANGKOK        30    0.448   sig. 1%   0.001   n.s.    30    0.61    sig. 5%   0.004   n.s.
JAKARTA        30    0.245   sig. 5%   0.018   n.s.    30    0.729   sig. 1%   0.014   n.s.
KUALAMPUR      18    1.432   sig. 1%   0.004   n.s.    18    1.643   sig. 1%   0.008   n.s.
PHILIPPIN      12    0.676   sig. 1%   0.027   n.s.    12    1.098   sig. 1%   0.029   n.s.
KOREA          25    1.213   sig. 1%   -0.014  n.s.    25    3.427   sig. 1%   -0.016  n.s.
SINGDBS50      13    1.756   sig. 1%   -0.012  n.s.    13    4.257   sig. 1%   -0.029  n.s.
HANGSENG       10    0.241   sig. 5%   0.022   n.s.    10    0.719   sig. 1%   0.012   n.s.
TAIWAN         15    0.041   n.s.      0.031   n.s.    15    0.007   n.s.      0.032   n.s.
SHENZHEN       24    0.527   sig. 1%   -0.001  n.s.    24    -0.383  n.s.      -0.019  n.s.
ISTANBUL100    29    0.263   sig. 1%   0.04    n.s.    29    0.254   sig. 10%  0.031   n.s.
JSEOVERALL     17    0.363   sig. 1%   0.032   n.s.    17    0.011   n.s.      0.012   n.s.

Table 2: Summary of the stochastic unit root tests on the logarithms of the financial series. Results of testing H0: ln y_t is I(1) against H1: stochastic unit root in ln y_t.
The critical values, obtained from the response surfaces estimated for T = 1450 (strictly, for each series one should adjust for the number of lags included but, in all cases, the resulting modifications of the critical values are negligible), can be consulted in Table 3.
       Z1      Z2      E1      E2
90%    0.1093  0.2227  0.0633  0.0506
95%    0.1548  0.3412  0.0668  0.0572
99%    0.2727  0.6822  0.0719  0.066

Table 3: Summary of the critical values to be used in the stochastic unit root tests, computed from the estimated response surfaces with T = 1450.
Table 2 shows a substantial number of rejections of the null hypothesis (16 series at the 10% significance level or better) for the Z1-type tests. These tests detect a stochastic unit root generated by an AR(1) model with a linear trend function in t in regression¹¹ (4). However, a stochastic unit root generated by a random walk (E_i tests) is not detected in any case. It seems clear that the stochastic unit root models relevant for financial series are, from a practical point of view, those that include a stationary mechanism in the evolution of the root (the value of the AR(1) parameter would measure the degree of dependence in the stochastic unit root). As for the geographical distribution of the detections of stochastic unit roots, the number of rejections in the Z_i tests for the Asian markets clearly stands out. A possible explanation is the greater instability experienced by these markets during the sample analysed; it seems logical that, in periods of uncertainty, a model that allows transitions between stationarity and non-stationarity (stochastic unit root) would be preferred to one that allows no change in the character of the process (fixed unit root). In addition, the number of anomalous values or outliers in the indices of the Asian markets has been relatively large compared with the other markets, which raises the question of whether the detection of stochastic unit roots is driven by the existence of outliers. Once the stochastic unit root tests have been run, we proceed to estimate the values of the root over the sample period (in fact the last 300 observations are cut off, since they are used in a forecasting exercise for later work). To obtain the in-sample estimates of the root, the first step is to express the process given by equations (1) and (2) in state-space form, so that the Kalman filter can be applied. This filter yields optimal estimates (under normality; otherwise the estimates obtained are at least the best linear projections) of the values of the states over the sample period. The stochastic unit root model we want to express in state-space form is (Leybourne, McCabe and Mills, 1996):

¹¹ The Z2 test detects the same type of root, but the deterministic part included in the regression is a constant.
(13)    y*_t = (1 + δ_t) y*_{t−1} + ε_t ,    ε_t i.i.d. (0, σ_ε²) ,
        δ_t = ρ δ_{t−1} + η_t ,    η_t i.i.d. (0, σ_η²) ,
        y*_t = y_t − λ_t − Σ_{i=1}^{p} φ_i y_{t−i} ,    λ_t = α + β t .

This model allows the observed series to have a stochastic unit root around a deterministic linear trend (it seems reasonable to include a linear trend in the measurement equation for the logarithms of the stock indices). In addition, the stochastic structure of y*_t may include a stationary autoregressive process (the roots of the polynomial defining it must lie outside the unit circle) of order¹² p. In this model the stochastic root is assumed to follow a first-order autoregressive process, since in Table 2 a stochastic unit root is only detected by the Z_i tests (these are the tests in which the stochastic unit root is assumed to be generated by the autoregressive process given in equation (2)). Operating on the model above (substituting y*_t and rearranging terms) one obtains:

Δy_t = β + Σ_{i=1}^{p} φ_i Δy_{t−i} + δ_t [ y_{t−1} − (α + β(t−1)) − Σ_{i=1}^{p} φ_i y_{t−i−1} ] + ε_t ,
δ_t = ρ δ_{t−1} + η_t ,    ε_t i.i.d. N(0, σ_ε²) ,    η_t i.i.d. N(0, σ_η²) .

¹² The order of the autoregressive part must be chosen high enough to capture the correlation structure of the variable. If the true process were an invertible moving average, it could be approximated adequately by choosing a high value of p.
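Conditional on the remaining parameters, this equation is linear in the single state δ_t, with a time-varying observation coefficient z_{t−1} = y_{t−1} − (α + β(t−1)) − Σ φ_i y_{t−i−1}. The following sketch runs the scalar Kalman filter and a fixed-interval smoother over that representation; all parameter values are taken as given, whereas in the paper they are estimated by maximum likelihood.

```python
# Sketch: Kalman filter + smoother for the scalar state delta_t in
#   v[t] = z[t] * delta_t + eps_t,   delta_t = rho * delta_{t-1} + eta_t,
# where v[t] collects the left-hand side net of the ARIMA terms and z[t]
# is the time-varying coefficient defined in the text.
import numpy as np

def smooth_stochastic_root(v, z, rho, s2_eps, s2_eta):
    n = len(v)
    a_f = np.zeros(n); p_f = np.zeros(n)           # filtered mean / variance
    a, p = 0.0, s2_eta / max(1.0 - rho**2, 1e-12)  # start around delta_0 = 0
    for t in range(n):
        a, p = rho * a, rho**2 * p + s2_eta        # predict
        f = z[t]**2 * p + s2_eps                   # innovation variance
        k = p * z[t] / f                           # Kalman gain
        a, p = a + k * (v[t] - z[t] * a), (1.0 - k * z[t]) * p  # update
        a_f[t], p_f[t] = a, p
    a_s = a_f.copy()                               # backward (RTS) smoother
    for t in range(n - 2, -1, -1):
        gain = p_f[t] * rho / (rho**2 * p_f[t] + s2_eta)
        a_s[t] = a_f[t] + gain * (a_s[t + 1] - rho * a_f[t])
    return 1.0 + a_s   # smoothed path of the stochastic root rho_t
```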
Choosing p = 5 (a value high enough to capture both the regular correlations and the possible weekly seasonality in the returns given by Δy_t), this equation can be written directly in state-space form (the values of the stochastic unit root ρ_t = 1 + δ_t are obtained as one of the state variables) and the parameters estimated by maximum likelihood. The results of the estimation of this model for the stock series considered can be consulted in Table 4.
Table 4: Summary of the estimates of model (13) for the log of y_t. For each of the 26 indices the table reports σ̂_η and σ̂_ε with their standard deviations, the root parameter ρ̂ with its standard deviation, and the coefficients φ̂_1, ..., φ̂_5, α̂ and β̂.
As can be seen, the t statistics for the individual significance tests of the σ̂_η estimates are very high (the only values below 3 correspond to the Milan30, DJTranspo and Istanbul100 indices), indicating the apparently clear significance of the standard deviations σ_η estimated for the stochastic unit roots. Indeed, these results suggest an excessive importance of the stochastic character of the unit roots when compared with the outcome of the tests carried out earlier. On the other hand, the estimates of ρ̂ are very low, which is consistent with the earlier test results, since in no case is H0 rejected in the E_i tests, in which the stochastic unit root is generated by a random walk. Had values of ρ̂ close to one (in absolute value) been obtained, rejections of the null hypothesis in those E_i tests would have occurred with high probability. Moreover, for 8 stock indices the individual-significance t statistics are below 2, pointing to the possibility that the stochastic root is generated by a white noise process. The implications of these results are clear: there is high randomness (little dependence on the past) in the values of the stochastic unit roots. Once the parameters of the model expressed in equation (13) have been estimated, it is natural to estimate the values of the roots over the sample period considered using the Kalman filter. This is in fact a recursive filter that uses the information available up to each period, so the estimate of the root ρ_t = 1 + δ_t (given by a state variable) at time t uses only the information available up to that moment. There is, however, a variant of the filter, known as smoothing¹³, which obtains estimates of the states at each moment using all the available sample information (naturally, smoothing is useful for obtaining optimal estimates of the states within a given sample period, not for forecasting the values of the states and the observed variable out of sample). Intuitively, the smoother starts by estimating the values of the states (including, in our case, the value of the stochastic unit root) in the last sample period with all the available information; the values of the roots for the rest of the sample are then obtained recursively backwards (always using all the available information). The smoothed estimates obtained for the logarithms of the main stock indices can be seen in the following graphs (the estimates of the stochastic unit roots for the remaining indices are provided in an appendix at the end of the article). Each graph shows the value of the corresponding index together with the estimates of the stochastic unit root and confidence intervals (which, in most cases, are indistinguishable from the root estimate owing to the narrowness of the intervals).
[Figures: for LMADRIDIBEX35, LFTSE100, LDJCOMP65 and LNIKKEI225, each figure shows the log index (upper panel) and the smoothed estimate of the stochastic unit root with its confidence interval (lower panel; series *_Unit_Root, *_Low_Int_UR and *_Upp_Int_UR), over observations 0-1100.]
In these graphs it does appear that, in quite a few periods, the values of the root move away from unity, which would indicate that the stochastic unit root model is better suited than the fixed unit root model. The estimated roots show that the variability of the root is directly related to the volatility of the variable (something intuitively logical). Indeed, many series seem to display increasing variability towards the end of the sample, when the stock index shows a break in trend (this is easily observed in the case of the IBEX35). In addition, the outliers (corresponding to observation 1000, dated 31 October 1997) also show up in the estimate of the root for that period, which is very large in absolute value.
4. Conclusions

To conclude, it is worth highlighting several points made throughout this study:

- There is some evidence of a stochastic unit root in stock indices usually modelled with a fixed unit root on the logarithms. This fact has direct consequences for the evolution, and possible modelling, of stock returns. Its generalisation to other daily financial series remains to be studied.
- In the series in which a stochastic unit root has been detected, the appropriate model for the evolution of the root is an AR(1) (as opposed to the random walk alternative). The evolution of the root appears to show more persistence than white noise, but this persistence does not reach a non-stationary character.
- The detection of stochastic unit roots is concentrated, above all, in the Asian markets, where volatility was higher over the period analysed (this is clearly visible in the graphs in the Appendix). Whether stochastic unit root models can capture, by themselves, the volatility observed in common financial series remains an open question. Additionally, the number of outliers in the Asian stock indices, as well as their size, was comparatively larger than in the rest of the markets, which may influence both the detection and the estimation of the stochastic unit roots.
- Finally, the variability in the estimates of the stochastic unit roots appears to be directly related to breaks in the trends of the stock indices.
Appendix: Tables and Figures

Table 5: Estimated response surfaces for the E1 and E2 tests at the most common significance levels (values in parentheses are estimated standard deviations).

Test E1 (modified T vector):
  Quantile 90 (10%): 0.063021 (0.00001) + 0.4471 (0.0045) T^-1 - 0.4876 (0.6362) T^-2 - 70.645 (25.1769) T^-3
  Quantile 95 (5%):  0.066223 (0.00001) + 0.7754 (0.0044) T^-1 - 12.4035 (0.6325) T^-2 + 198.992 (25.624) T^-3
  Quantile 99 (1%):  0.070861 (0.00002) + 1.5536 (0.0109) T^-1 - 46.4312 (1.5893) T^-2 + 1032.888 (64.599) T^-3

Test E2 (complete T vector):
  Quantile 90 (10%): 0.05047 (0.00001) + 0.1743 (0.0016) T^-1 + 0.9055 (0.0932) T^-2 - 13.351 (1.324) T^-3
  Quantile 95 (5%):  0.05699 (0.00001) + 0.3519 (0.0022) T^-1 - 0.7984 (0.1283) T^-2 - 3.332 (1.850) T^-3
  Quantile 99 (1%):  0.06541 (0.00002) + 0.8739 (0.0055) T^-1 - 9.8079 (0.3415) T^-2 + 77.548 (5.038) T^-3

Table 6: Estimated response surfaces for the Z1 and Z2 tests at the most common significance levels (values in parentheses are estimated standard deviations).

Test Z1 (complete T vector):
  Quantile 90 (10%): 0.10638 (0.00033) + 4.3645 (0.1201) T^-1 - 100.7831 (7.7557) T^-2 + 1018.232 (116.346) T^-3
  Quantile 95 (5%):  0.15125 (0.00039) + 5.1772 (0.1323) T^-1 - 118.1505 (8.2580) T^-2 + 1151.557 (121.486) T^-3
  Quantile 99 (1%):  0.26788 (0.00059) + 7.0644 (0.1895) T^-1 - 170.5901 (11.4401) T^-2 + 1566.557 (164.898) T^-3

Test Z2 (modified T vector):
  Quantile 90 (10%): 0.21956 (0.00025) + 4.6456 (0.1314) T^-1 - 193.1104 (17.5838) T^-2 + 4566.203 (674.126) T^-3
  Quantile 95 (5%):  0.33765 (0.00037) + 5.2810 (0.1904) T^-1 - 200.7496 (25.3431) T^-2 + 4249.842 (966.818) T^-3
  Quantile 99 (1%):  0.67842 (0.00092) + 5.6045 (0.4739) T^-1 - 211.6117 (61.8925) T^-2 + 3785.999 (2317.002) T^-3
[Appendix figures: as for the indices in the main text, each figure shows the log index together with the smoothed stochastic unit root estimate and its confidence interval, for LDAX30, LCAC40, LMILAN30, LPORBVL30, LTORONSE200, LAUSTRALIA, LNEWZEAL40, LDJINDUSTR, LDJTRANSPO, LDJUTILIT, LBANGKOK, LJAKARTA, LKUALAMPUR, LPHILIPPIN, LKOREA, LSINGDBS50, LHANGSENG, LISTANBUL100, LTAIWAN, LSHENZHEN and LJSEOVERALL.]
References

[1] Bollerslev, T.: Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics 31 (1986), 307-327.
[2] Cragg, J.G.: More Efficient Estimation in the Presence of Heteroscedasticity of Unknown Form. Econometrica 51 (1983), 751-763.
[3] Doornik, J.: Ox: An Object-Oriented Matrix Programming Language. Timberlake Consultants Ltd., http://www.timberlake.co.uk, 1998.
[4] Engle, R.: Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of UK Inflation. Econometrica 50 (1982), 987-1008.
[5] Granger, C.W.J. and Swanson, N.R.: An Introduction to Stochastic Unit Root Processes. Journal of Econometrics 80 (1997), 35-62.
[6] Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton, 1994.
[7] Koopman, S.J., Shephard, N. and Doornik, J.A.: Statistical Algorithms for Models in State Space using SsfPack 2.2. The Econometrics Journal 2 (1999), 107-160.
[8] L'Ecuyer, P.: Tables of Maximally-Equidistributed Combined LFSR Generators. Mimeo, 1997.
[9] Leybourne, S.J., McCabe, B.P.M. and Mills, T.C.: Randomized Unit Root Processes for Modelling and Forecasting Financial Time Series: Theory and Applications. Journal of Forecasting 15 (1996), 253-270.
[10] Leybourne, S.J., McCabe, B.P.M. and Tremayne, A.R.: Can Economic Time Series Be Differenced to Stationarity? Journal of Business and Economic Statistics 14 (1996), 435-446.
[11] MacKinnon, J.G.: Numerical Distribution Functions for Unit Root and Cointegration Tests. Journal of Applied Econometrics 11 (1996), 601-618.
[12] MacKinnon, J.G.: Computing Numerical Distribution Functions in Econometrics. Working Paper, Department of Economics, Queen's University, 2000.
[13] MacKinnon, J.G.: Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests. Journal of Business and Economic Statistics 12 (1994), 167-176.
[14] MacKinnon, J.G.: Critical Values for Cointegration Tests. In Long-Run Economic Relationships: Readings in Cointegration, Engle, R.F. and Granger, C.W.J. (editors), chapter 13, 267-276. Oxford University Press, 1991.
[15] Maddala, G.S. and Kim, I.-M.: Unit Roots, Cointegration and Structural Change. Cambridge University Press, 1998.
[16] Mills, T.C.: The Econometric Modelling of Financial Time Series. Cambridge University Press, 1999.
[17] Ng, S. and Perron, P.: Unit Root Tests in ARMA Models with Data-Dependent Methods for the Selection of the Truncation Lag. Journal of the American Statistical Association 90 (1995), 268-281.
[18] Taylor, A.M.R. and van Dijk, D.J.C.: Testing for Stochastic Unit Roots: Some Monte Carlo Evidence. Report EI 9922/A, Econometric Institute, Erasmus University Rotterdam, 1999.
Román Mínguez Salido
Departamento de Métodos Cuantitativos para la Economía
Facultad de Ciencias Económicas y Empresariales
Universidad San Pablo-CEU
C/ Julián Romea, 23. 28003 Madrid
[email protected]

Eduardo Morales Martínez
Departamento de Métodos Cuantitativos para la Economía
Facultad de Ciencias Económicas y Empresariales
Universidad San Pablo-CEU
C/ Julián Romea, 23. 28003 Madrid
[email protected]
An analytic approach to credit risk of loan portfolios of Spanish banks

Juan Carlos García Céspedes, Angel M. Mencía and Mercedes Morris¹
Abstract: One of the problems in credit risk is the estimation of credit loss distributions. Because these distributions are non-normal, it is common practice to use Monte Carlo simulation techniques. The disadvantage of such simulations is their cost in terms of time and resources, which is why analytical approximations are sometimes used instead. In particular, this is the approach followed by the Basel Committee in its proposal on the New Capital Accord. This study aims to estimate an analytical expression for the credit loss distribution of loan portfolios of Spanish banks. The objective is to obtain an estimation of economic capital in a simpler way than with a simulation approach. The analytical approximation used is based on a one-factor model and is comparable to the distribution used by the Basel Committee in its recent proposal. The model is estimated not only for the whole sample of Spanish banks, but also for sub-samples, so as to take into account fundamental differences in terms of business operations and size. This differentiation enables us to identify the existence of several loss distributions and thus improve the quality of the analytical approximations. In particular we distinguish between banks and savings banks and by loan portfolio size, and conclude that the size criterion is the more relevant one.
1. Introduction

The fundamental concepts used when analysing risk are expected loss and unexpected loss. Expected loss is a measure of the loss the bank would expect to experience on its portfolio over a determined time horizon, whereas unexpected loss is a measure of the volatility of losses. While adequate pricing and provisioning should generate sufficient earnings to absorb any expected losses, banks need to set assets aside as a cushion to cover any unexpected losses and thus guarantee their solvency.

¹ Juan Carlos García Céspedes is Director, Angel M. Mencía González is Deputy Director, and Mercedes Morris Muñoz is a risk analyst in the Corporate Risk Methodology Department of BBVA. This talk was given by the third author at the December 2001 session of the Instituto MEFF-RiskLab Seminar.
Economic capital is measured as the difference between some selected high confidence percentile of losses and the expected loss, and corresponds to the level of capital that the bank needs to set aside in order to protect itself (with a certain level of confidence) against unexpected losses.

The analysis of credit risk differs fundamentally from that of market risk, for the following reason. Generally speaking, the assumption that market returns are normally distributed is widely accepted, although recent research is exploring other possible distributions. Under the assumption of normality, it is possible to reduce the analysis of market risk to the two parameters of the normal distribution: mean and standard deviation. In other words, these two summary statistics are sufficient to approximate the distribution of market returns relatively accurately. However, because credit returns are by nature highly skewed and fat-tailed, they cannot be assumed to be normally distributed. In this case the mean and standard deviation do not adequately describe risk and are not sufficient to estimate the percentile levels of the distribution, as they are for market returns.

The technical explanation for the non-normality of credit returns is the following. The central limit theorem states that the sum of independent random variables with finite variance converges to a normal distribution when the number of random variables tends to infinity. This assumption of independence is applicable to market returns, as portfolios of exposures tend to be sufficiently diversified. In the case of credit risk, however, the assumption fails because of the relation between loans: empirical evidence shows a certain cyclicality, i.e. a positive correlation between credit events of different debtors, induced by the fact that default rates are significantly higher in periods of economic recession than in periods of economic growth. The failure of the independence assumption means that the central limit theorem cannot be applied. Furthermore, because the credit loss distribution of an individual loan is very skewed, the central limit theorem still cannot be applied even though loan portfolios are highly diversified over different countries and industries, implying small correlations.

The non-normality of credit returns means that the generation of credit loss distributions is much more complex than that of market loss distributions, which is why in practice it is commonly done by Monte Carlo simulation. The disadvantage of such simulations is their cost in terms of time and resources. The main objective of this study is to obtain an analytical expression for the credit loss distribution of loan portfolios of Spanish banks. The advantage of an analytical expression is that it can be used to estimate economic capital in a simpler way than with a simulation approach. The analytical approximation used is based on a one-factor model and is comparable to the distribution used by the Basel Committee in its proposal on the New Capital Accord.

The paper is organised as follows. The theoretical model and the derivation of the limiting credit loss distribution are given in Section 2. In Section 3, using data for Spanish banks, we estimate the Gamma, Beta and Weibull distributions and the analytical approximation that best fit the empirical credit loss distribution. In Section 4 we split the sample in two groups to identify the existence of different credit
loss distributions and thus improve the analytical approximations, while in Section 5 we present estimations of economic capital at various confidence levels. Finally, we present the conclusions in Section 6.
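To make concrete the simulation approach that the analytical formula is meant to replace, here is a minimal Monte Carlo sketch for a homogeneous portfolio under a one-factor Gaussian model (the parameter values are illustrative, not estimates from this paper). It exploits the fact that, conditional on the common factor, defaults are independent, so the number of defaults per scenario can be drawn from a binomial distribution.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def simulate_loss_rates(n_loans=10_000, n_scen=100_000, p=0.014, rho=0.09, lgd=0.5):
    """Monte Carlo portfolio loss rates under a one-factor Gaussian model.
    Conditional on the factor f ~ N(0,1), each loan defaults independently
    with probability Phi((Phi^{-1}(p) + sqrt(rho)*f) / sqrt(1 - rho))."""
    f = rng.standard_normal(n_scen)
    cond_pd = norm.cdf((norm.ppf(p) + np.sqrt(rho) * f) / np.sqrt(1 - rho))
    defaults = rng.binomial(n_loans, cond_pd)      # defaults per scenario
    return lgd * defaults / n_loans                # loss rate per scenario

losses = simulate_loss_rates()
print(f"expected loss {losses.mean():.4%}, 99.9% percentile {np.quantile(losses, 0.999):.4%}")
```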
2. Theoretical framework

We base ourselves on the model introduced by Lucas, Klaassen, Spreij and Straetmans (1999). In their paper, the authors consider portfolio credit loss distributions based on a factor model for individual exposures and derive an analytic approximation to the credit loss distribution when the portfolio contains a large number of exposures.
2.1. Basic model

We consider a portfolio containing n exposures. Each exposure j is characterised by a four-dimensional stochastic vector

(1)  $(S_j, k_j, l_j, \pi(j, k_j, l_j, \psi)).$

The first element of the vector, $S_j$, is the firm's surplus (the difference between the market values of liabilities and assets). It triggers the mechanism for defaults and credit rating migrations, as in Merton's model². We assume that the portfolio exposures are driven by a vector of common factors:

(2)  $S_j = \mu_j + \beta_j' f + \varepsilon_j,$

where $\mu_j \in \mathbb{R}$ is a constant term, $\beta_j \in \mathbb{R}^m$ is a vector of factor loadings, $f \in \mathbb{R}^m$ is a vector of common factors, and $\varepsilon_j \in \mathbb{R}$ is a scalar representing idiosyncratic risk. Furthermore, we assume that $f \sim N(0, \Omega_f)$ and $\varepsilon_j \sim N(0, \omega_j)$, with $E(\varepsilon_j f) = 0$ for all j, $\Omega_f$ positive definite and $E(\varepsilon_i \varepsilon_j) = 0$ for all $i \neq j$. Given this common factor structure, the surplus variables $S_j$ of different firms are correlated. Because the $S_j$'s trigger the mechanism for defaults and credit rating migrations, correlation between them results in correlated probabilities of default and credit rating migrations.

² In Merton's model (1974), the firm defaults when the market value of the assets is insufficient to repay the liabilities. The market value of assets can be determined using an options-pricing based approach which recognises equity as a call option on the underlying assets of the firm, with a strike price equal to the book value of the firm's liabilities.
The second element of equation (1) is $k_j$, the exposure's initial rating category, while the third element is $l_j$, the exposure's end-of-period rating category. We assume that there are r rating categories, such that $k_j, l_j \in \{1, \ldots, r\}$, where the r-th rating category is the state of default, and that there is only one period. Migrations are driven by a Markovian transition matrix

$P = \begin{pmatrix} p_{11} & \cdots & p_{1r} \\ \vdots & \ddots & \vdots \\ p_{r1} & \cdots & p_{rr} \end{pmatrix},$

where $p_{kl}$ denotes the probability that a firm with initial rating k switches to rating l. For given values of $p_{kl}$, one can select constants $c_{kl}$, with $k = 1, \ldots, r$ and $l = 0, \ldots, r$, such that $c_{k0} = -\infty$ and $c_{kr} = +\infty$ for all k, and

(3)  $\Phi(c_{kl}) - \Phi(c_{k,l-1}) = p_{kl}$

for all k and $l = 1, \ldots, r$, where $\Phi(\cdot)$ is the standard normal cumulative distribution function. The end-of-period rating $l_j$ is such that

(4)  $c_{j,k_j,l_j-1} \equiv c_{k_j,l_j-1}\sqrt{\omega_j + \beta_j' \Omega_f \beta_j} \;<\; S_j \;\le\; c_{k_j,l_j}\sqrt{\omega_j + \beta_j' \Omega_f \beta_j} \equiv c_{j,k_j,l_j}.$

The fourth element of equation (1) is $\pi(j, k_j, l_j, \psi)$, exposure j's credit loss. We assume that the amount of credit loss depends on the exposure's initial ($k_j$) and final ($l_j$) rating category, as well as on the state of the economy ($\psi$). In other words, a credit loss occurs not only if a firm defaults, but also if the firm's rating deteriorates. This is due to different credit risk spreads across rating categories. The credit loss of a portfolio containing n exposures is given by the sum of the individual credit losses:

(5)  $C_n = \sum_{j=1}^{n} \pi(j, k_j, l_j, \psi).$
2.2. The limiting distribution of portfolio credit losses

In this section the distribution of the portfolio credit loss $C_n$ when the number of exposures becomes large is established.

Assumptions:

- $\frac{1}{n}\sum_{j=1}^{n} \beta_j \omega_j^{-1} \beta_j'$ converges to a finite, positive definite matrix;
- $\beta_n' \beta_n / (n\,\omega_n) \to 0$;
- $\sup_n \frac{1}{n}\sum_{j=1}^{n}\sum_{l=1}^{r} \pi(j, k_j, l, \psi)^2$ is bounded almost surely (a.s.).
Theorem 1. Under the previous assumptions, the R² of the factor regression model (2) is given by

(6)  $R_j^2 = \frac{\operatorname{Cov}(S_j, \beta_j' f)^2}{\operatorname{Var}(S_j)\,\operatorname{Var}(\beta_j' f)} = \frac{\beta_j' \Omega_f \beta_j}{\omega_j + \beta_j' \Omega_f \beta_j}.$

Define

$v_j = \sqrt{\frac{1 - R_j^2}{\omega_j R_j^2}}\; \Omega_f^{1/2} \beta_j$

such that $v_j' v_j = 1$, and let Y be an m-dimensional standard normal random variable defined by $Y = \Omega_f^{-1/2} f$. The conditional (on f) probability of migrating from rating $k_j$ to rating $l_j$ is given by

(7)  $\hat\Phi_{jl} = \Phi\!\left(\frac{c_{k_j,l} - \mu_j - \sqrt{R_j^2}\, v_j' Y}{\sqrt{1 - R_j^2}}\right) - \Phi\!\left(\frac{c_{k_j,l-1} - \mu_j - \sqrt{R_j^2}\, v_j' Y}{\sqrt{1 - R_j^2}}\right).$

Define the conditional (on f) portfolio credit loss as

(8)  $B_n = \sum_{j=1}^{n} \sum_{l=1}^{r} \hat\Phi_{jl}\, \pi(j, k_j, l, \psi).$

Then

(9)  $n^{-1} C_n - n^{-1} B_n \xrightarrow{a.s.} 0.$

In other words, the average portfolio credit loss converges almost surely to the average conditional portfolio credit loss.

We consider the particular case of a one-factor model (m = 1) where $v_j \equiv 1$. We define

(10)  $g(Y) = \lim_{n \to \infty} \frac{B_n}{n}.$

By the transformation-of-variables technique, the cumulative distribution function (c.d.f.) and the probability density function (p.d.f.) of credit losses c are given by

(11)  $F(c) = P\!\left(\lim_{n\to\infty} \frac{C_n}{n} \le c\right) = P\!\left(\lim_{n\to\infty} \frac{B_n}{n} \le c\right) = P(g(Y) \le c) = P(Y \le g^{-1}(c)) = \Phi(g^{-1}(c))$

and

(12)  $f(c) = \frac{\phi(g^{-1}(c))}{|g'(g^{-1}(c))|},$

where $g^{-1}(\cdot)$ and $g'(\cdot)$ denote the inverse and the first derivative of $g(\cdot)$ respectively, and $\phi(\cdot)$ is the standard normal probability density function.
By numerical integration one can approximate the expected credit loss and its variance, for k = 1, 2, by

(13)  $\sum_{i=1}^{N} g(y_i)^k\, \phi(y_i)\,(y_i - y_{i-1}),$

where $-K = y_0 < y_1 < \ldots < y_N = K$ denotes an appropriate partitioning of the interval $[-K, K]$ for a sufficiently large constant $K > 0$. It is also fairly easy to calculate the percentiles of the credit loss by using

(14)  $P\!\left(\lim_{n\to\infty} \frac{C_n}{n} \le c\right) = \delta \iff c = g(\Phi^{-1}(\delta)).$
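A direct transcription of (13) and (14) for a scalar factor, assuming the loss function g is increasing (as it is in the one-factor case developed below) and vectorised; this is only a sketch of the quadrature, with g supplied by the caller.

```python
import numpy as np
from scipy.stats import norm

def loss_mean_and_variance(g, K=8.0, N=4000):
    """Approximate E[g(Y)] and Var[g(Y)], Y ~ N(0,1), by the Riemann sum (13)."""
    y = np.linspace(-K, K, N + 1)
    w = norm.pdf(y[1:]) * np.diff(y)        # phi(y_i) * (y_i - y_{i-1})
    m1 = np.sum(g(y[1:]) * w)
    m2 = np.sum(g(y[1:]) ** 2 * w)
    return m1, m2 - m1**2

def loss_percentile(g, delta):
    """Percentile of the limiting loss via (14); valid when g is increasing."""
    return g(norm.ppf(delta))
```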
2.3. Default/non-default model

2.3.1. One-factor model

We consider a one-factor model (m = 1) in which there are only two rating categories, default and non-default: r = 2, where the rating category r = 2 corresponds to the state of default. Without loss of generality we can assume³ $\Omega_f = 1$ in $f \sim N(0, \Omega_f)$. We assume that the factor affects all exposures in the same way, so that $\mu_j = \mu = 0$ and $\beta_j = \beta > 0$ for all j, implying⁴ that $v_j = v = +1$. Furthermore, we assume that $\varepsilon_j \sim N(0, \omega)$ for all j. This implies that $R_j^2 \equiv \rho$, the asset correlation. Also, we assume that all firms have the same probability of default p, which implies that $c_{k_j,l} = c_{1,2} = +\infty$ and $c_{k_j,l-1} = c_{1,1}$ for all j, and, from equation (3), $p = 1 - \Phi(c_{1,1})$.

In this simple model we have $\pi(j, k_j, 1, \psi) \equiv 0$ and $\pi(j, k_j, 2, \psi) \equiv (1-\alpha)\,\pi(j)$, where $\pi(j)$ is the size of the j-th loan and α is the recovery rate. If we work with credit losses in percentage terms, then $\pi(j) = 1$, so that $\pi(j, k_j, 2, \psi) \equiv (1-\alpha)$. Given these assumptions, equation (8) can be written as

(15)  $B_n = \sum_{j=1}^{n} \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\alpha)\,\pi(j)$
such that

(16)  $g(Y) = \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\alpha)\,\bar\pi = \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\alpha).$

Using equations (11), (14) and (16), we obtain that

$c = g(Y) = g(\Phi^{-1}(\delta)) \iff Y = g^{-1}(c),$

which implies that

$Y = g^{-1}(c) = \frac{1}{\sqrt{\rho}}\left[c_{1,1} - \sqrt{1-\rho}\;\Phi^{-1}\!\left(1 - \frac{c}{1-\alpha}\right)\right],$

so that

(17)  $F(c) = \Phi(g^{-1}(c)) = \Phi\!\left(\frac{1}{\sqrt{\rho}}\left[c_{1,1} - \sqrt{1-\rho}\;\Phi^{-1}\!\left(1 - \frac{c}{1-\alpha}\right)\right]\right),$

or, alternatively, using the fact that $p = 1 - \Phi(c_{1,1})$,

(18)  $F(c) = \Phi(g^{-1}(c)) = \Phi\!\left(\frac{1}{\sqrt{\rho}}\left[\sqrt{1-\rho}\;\Phi^{-1}\!\left(\frac{c}{1-\alpha}\right) - \Phi^{-1}(p)\right]\right).$

Equation (18) is the analytical credit loss distribution used by the Basel Committee in its proposal on the New Capital Accord. The unknown parameters to be estimated in equation (18) are the asset correlation ρ, the recovery rate α, and the probability of default p.

2.3.2. Extension: Different recovery rates

If we assume that the recovery rate α varies for each observation⁵, equations (15) and (16) are replaced by

(19)  $B_n = \sum_{j=1}^{n} \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\alpha(j))\,\pi(j)$

such that

(20)  $g(Y) = \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\bar\alpha)\,\bar\pi = \left[1 - \Phi\!\left(\frac{c_{1,1} - \sqrt{\rho}\, Y}{\sqrt{1-\rho}}\right)\right](1-\bar\alpha),$

where $\bar\alpha = \frac{1}{n}\sum_{j=1}^{n} \alpha(j)$ is the average recovery rate.

³ If $\Omega_f \neq 1$ we can define a new factor $g = A_f^{-1} f$ such that $\operatorname{Var}(f) = \Omega_f = A_f A_f'$ and $S_j = \mu_j + \beta_j' A_f A_f^{-1} f + \varepsilon_j = \mu_j + \beta_j' A_f\, g + \varepsilon_j$, where $\operatorname{Var}(g) = \operatorname{Var}(A_f^{-1} f) = A_f^{-1}\operatorname{Var}(f)(A_f^{-1})' = 1$.
⁴ This implies that the systematic risk factor f has a positive impact on $S_j$. This can be interpreted in the following way: if we take GDP as the systematic risk factor, we are assuming that in periods of expansion firms are less likely to default, while in periods of recession firms are more likely to default.
⁵ We assume that α is deterministic. The same result is obtained when α is stochastic if it is i.i.d.
1 c −1 −1 −1 (21) F (c) = Φ(g (c)) = Φ √ 1 − ρΦ − Φ (p) ρ 1−α ¯ The expression for the analytical approximation to the credit loss distribution is the same as when the recovery rate is unique except that the unknown parameter is the average recovery rate α instead of the unique recovery rate α. 2.3.3. Extension: Different default rates We can also assume that there are three different types of customers depending on their credit quality: high, average and low credit quality. This implies that instead of a unique constant c1,1 which determines the default probability, there are three constants determining the default probability for each one of the three groups. We assume that β and ω, and thus ρ, are the same for the three groups of customers (β > 0), and that µ = 0. If we define the constants determining the default probability for the customers of high, average and low credit quality by ch , ca , cl respectively, and define the number of customers of each type by nh , na , nl , with n = nh + na + nl , then equations (15) and (16) are replaced by nh na c − √ρ Y c − √ρ Y h a √ Bn = (1 − α) π(j) + (1 − α) π(j) 1−Φ 1−Φ √ 1 − ρ 1−ρ j=1 j=1 nl c − √ρ Y l + 1−Φ √ (22) (1 − α) π(j) 1−ρ j=1 such that
c − √ρ Y c − √ρ Y c − √ρ Y h a l √ (23) g(Y ) = (1−α) 1−ph Φ −pa Φ √ −pl Φ √ , 1−ρ 1−ρ 1−ρ where ph = nh /n, pa = na /n, pl = nl /n are the fractions of customers of high, average and low credit quality, ph + pa + pl = 1. In this simple case it is not possible to invert equation (23) to obtain an equivalent expression to equation (18). Furthermore, there are now eight unknown parameters instead of three unknown parameters as in the case of a unique credit quality and default rate, which are ρ, ch , ph , ca , pa , cl , pl , α. Although this extension is interesting from a theoretical point of view, from a practical point of view it may be difficult to define the three groups of customers. If we could recover an expression for the cumulative distribution function, we would be able to analyse how an additional customer of a determined credit quality, i.e. a marginal change in the proportions of customers of the different credit quality types, would affect the average credit loss and the 99.9% percentile. Thus we could determine the incremental effect on the expected and unexpected credit losses and on the amount of economic capital required to support the loan portfolio.
2.3.4. Multi-factor model

We consider the same model as before, with only two rating categories, but now with two factors: m = 2, with $f \sim N(0, \Omega_f)$. Without loss of generality we can assume⁶

$\Omega_f = I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$

We extend the previous assumptions as follows:

$\mu_j = \mu = 0, \quad \beta_j = \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \quad \text{and} \quad \omega_j = \omega = 1 \quad \text{for all } j.$

This implies that

$R_j^2 = \rho = \frac{\beta' \Omega_f \beta}{\omega + \beta' \Omega_f \beta} = \frac{\beta_1^2 + \beta_2^2}{1 + \beta_1^2 + \beta_2^2}$

and that

$v_j' = v' = (v_1\;\; v_2) = \sqrt{\frac{1-\rho}{\rho}}\; \beta' \Omega_f^{1/2} = \sqrt{\frac{1-\rho}{\rho}}\; (\beta_1\;\; \beta_2),$

so that

$v_1 = \frac{\beta_1}{\sqrt{\beta_1^2 + \beta_2^2}} \quad\text{and}\quad v_2 = \frac{\beta_2}{\sqrt{\beta_1^2 + \beta_2^2}}.$

As before we have $\pi(j, k_j, 1, \psi) \equiv 0$ and $\pi(j, k_j, 2, \psi) \equiv (1-\alpha)\,\pi(j) \equiv (1-\alpha)$. Also, we assume that $c_{k_j,l} = c_{1,2} = +\infty$ and $c_{k_j,l-1} = c_{1,1} = \Phi^{-1}(1-p)$ for all j. Given these assumptions, equations (15) and (16) become

(24)  $B_n = \sum_{j=1}^{n} \left[1 - \Phi\!\left(\frac{\Phi^{-1}(1-p) - \sqrt{\rho}\, v' Y}{\sqrt{1-\rho}}\right)\right](1-\alpha)\,\pi(j)$

such that

(25)  $g(Y) = \left[1 - \Phi\!\left(\frac{\Phi^{-1}(1-p) - \sqrt{\rho}\, v' Y}{\sqrt{1-\rho}}\right)\right](1-\alpha),$

where $v' = (v_1\;\; v_2)$ and ρ are as above.

⁶ The proof is analogous to the proof given in the case of the one-factor model.
In this version of the model there are now four unknown parameters to be determined: $\beta_1$, $\beta_2$, p and α.

In the case of n factors the results are similar. Consider m = n, with $f \sim N(0, \Omega_f)$; without loss of generality we can assume⁷ $\Omega_f = I_n$. We extend the previous assumptions as follows: $\mu_j = \mu = 0$, $\beta_j = \beta = (\beta_1, \ldots, \beta_n)'$ and $\omega_j = \omega = 1$ for all j. This implies that

$R_j^2 = \rho = \frac{\beta' \Omega_f \beta}{1 + \beta' \Omega_f \beta} = \frac{\beta_1^2 + \beta_2^2 + \cdots + \beta_n^2}{1 + \beta_1^2 + \beta_2^2 + \cdots + \beta_n^2}$

and that

$v_j' = v' = (v_1\;\; v_2\;\; \cdots\;\; v_n) = \sqrt{\frac{1-\rho}{\rho}}\; \beta' \Omega_f^{1/2},$

so that $v_i = \beta_i / \sqrt{\beta_1^2 + \beta_2^2 + \cdots + \beta_n^2}$ for $i = 1, \ldots, n$. As before we have $\pi(j, k_j, 1, \psi) \equiv 0$ and $\pi(j, k_j, 2, \psi) \equiv (1-\alpha)\,\pi(j) \equiv (1-\alpha)$. Also, we assume that $c_{k_j,l} = c_{1,2} = +\infty$ and $c_{k_j,l-1} = c_{1,1} = \Phi^{-1}(1-p)$ for all j. Given these assumptions, equations (24) and (25) remain unchanged. In this version of the model there are now n + 2 unknown parameters to be determined: $\beta_1, \beta_2, \ldots, \beta_n$, p and α.

⁷ The proof is analogous to the proof given in the case of the one-factor model.
3. Estimation of the credit loss distribution

Using data from Spanish banks between 1992 and 1999, we construct an empirical credit loss distribution. We first use the methodology frequently employed by credit risk practitioners⁸ and estimate the Gamma, Beta and Weibull distributions that best fit the empirical distribution. Then we estimate an analytical expression for the credit loss distribution in three alternative manners, using the results derived from the default/non-default one-factor model developed in the previous section.
3.1. Data

The data available consist of loan loss provisions, loans and loan loss reserves between 1992 and 1999 for 151 banks in Spain (source: Bankscope, May 2000; values in '000 USD). Loan loss provisions are the amounts that figure in the income statement of each bank in a given year due to non-repayment of loans. The ratio of provisions to loans is legally determined by the Bank of Spain and depends on the delay of the loan repayment. The original data base does not contain data for all years for every bank, which explains why we only have 926 observations (where one observation corresponds to a given bank in a given year). The ratio "loan loss provisions/gross loans" for each bank and year serves as a proxy for the actual portfolio credit loss (in %). Gross loans are defined as the sum of loans and loan loss reserves. Below we represent credit losses against gross loans for all observations, excluding outliers corresponding to the 1st and 99th percentiles.

[Figure: credit losses (%) against gross loans ('000 USD), all observations excluding 1st/99th percentile outliers.]
8 This methodology generally generates good empirical results, although it is not based on any theoretical model.
We observe that there is a high concentration of credit losses around zero and that the large positive or negative credit losses correspond to low levels of gross loans. In other words, small entities contribute a lot of noise to the credit loss distribution. Furthermore, we observe that there are negative credit losses, i.e. gains. This can be explained by the problem of mismatches between gross provisions and recoveries, as explained below.

The loan loss provisions are net of recoveries, so it is possible that in a given year a bank has negative loss provisions. For example, in a period of recession loan loss provisions are expected to be high due to many non-repayments, while in a period of expansion banks will recover previous losses, so loan loss provisions might be negative, i.e. recoveries exceed gross provisions. The problem we face is thus one of timing and of potential mismatches between gross provisions and recoveries. One solution would be to obtain the gross provisions and the recoveries separately, and then construct the distribution for the gross provisions instead of for the net provisions. Yet such detailed data is not available. The approach adopted is thus to transform the data, adjusting the values of the credit losses on an entity basis so as to eliminate mismatches between gross provisions and recoveries in any given year. It should be mentioned that this problem of mismatches is not a problem on an aggregate basis, since the data we have contains both years of recession and years of expansion, i.e. data over a whole cycle. This enables us to transform the data using the following simple model⁹.

Transformation of the data. We assume there are two series: x(t), gross provisions, and y(t), recoveries. However, in the framework of our model we only observe the net provisions, given by $z(t) = x(t) - y(t)$. Under the assumption that $y(t) = \alpha\, x(t-1)$, where α is the recovery rate, $z(t) = x(t) - \alpha\, x(t-1)$. The variable of interest is the credit loss, i.e.

$(1-\alpha)\,x(t) = (1-\alpha)\,[z(t) + \alpha\, x(t-1)] = (1-\alpha)\,[z(t) + \alpha\, z(t-1) + \alpha^2 z(t-2) + \alpha^3 z(t-3) + \cdots].$

We need to make an assumption concerning the recovery rate α in order to recover the series of credit losses. We assume α = 50%, which is the benchmark recovery rate used frequently in the banking industry. This enables us to recover the series of credit losses from the observed net provisions on an entity basis.

⁹ We wish to mention that this is an ad hoc transformation and that other ways of adjusting the data could have been used.
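A small sketch of this adjustment, with the 50% benchmark as the default recovery rate: the recursion x(t) = z(t) + α x(t−1) reproduces the geometric sum above, applied entity by entity.

```python
import numpy as np

def credit_losses_from_net_provisions(z, alpha=0.5):
    """Recover credit losses (1 - alpha) * x(t) from net provisions
    z(t) = x(t) - alpha * x(t-1), assuming recoveries y(t) = alpha * x(t-1).
    Inverting gives x(t) = z(t) + alpha*z(t-1) + alpha^2*z(t-2) + ..."""
    x = np.zeros(len(z))
    acc = 0.0
    for t, zt in enumerate(z):
        acc = zt + alpha * acc      # running sum of alpha^k * z(t-k)
        x[t] = acc
    return (1 - alpha) * x
```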
Once we transform the data, we obtain a data base with 843 observations¹⁰. We compute ratios of "adjusted loan loss provisions/gross loans" for each bank and year, which as before proxy for the actual portfolio credit loss (in %). Below we represent credit losses against gross loans for all observations, excluding outliers corresponding to the 1st and 99th percentiles.

[Figure: adjusted credit losses (%) against gross loans ('000 USD), all observations excluding 1st/99th percentile outliers.]
As previously, there is a distinct concentration of credit losses around zero and the high positive credit losses correspond to low levels of gross loans. Furthermore, we have now eliminated all negative credit losses. It is possible that we have not solved the mismatch problem entirely and that the credit loss distribution still exhibits excess volatility. If this is the case we would need to adjust the data further, i.e. instead of only allowing one lag between provisions and recoveries allowing a larger number of lags. This would further reduce volatility.
3.2. Empirical credit loss distribution

We construct the empirical credit loss distribution for our sample of observations¹¹. It can be seen that the distribution obtained presents the typical characteristics of credit loss distributions, i.e. positive skewness and leptokurtosis.

¹⁰ We eliminate observations in a series of instances, such as when for a given entity the loan loss provisions in all years are negative, when the loan loss provisions in years between the first and the last year of data are missing, or when the loan loss provision in the first year is negative.
¹¹ We can construct a non-weighted and a weighted credit loss distribution. The former is based on the assumption that all banks have sufficiently diversified portfolios, so that when constructing the credit loss distribution we can assign the same weight to each loss. The latter assumes instead that larger banks have more diversified portfolios than smaller banks, so the distribution is constructed by assigning different weights to the observations, where the weight is given by the share of gross loans in total gross loans. In all that follows we present results based on the non-weighted credit loss distribution. Nevertheless, estimations have also been carried out for the weighted distribution.
[Figure: empirical credit loss distribution (frequency against credit loss, %). Summary statistics of credit losses (%): mean 0.68, std. dev. 0.60, skewness 2.10, kurtosis 10.81, minimum 0.00, maximum 5.18, 827 observations.]
3.3. Gamma, Beta and Weibull distributions

We use the methodology frequently employed by credit risk practitioners¹² and estimate the Gamma, Beta and Weibull distributions that best fit the empirical credit loss distribution. The parameters of the Gamma, Beta and Weibull distributions are estimated by matching the first two moments (mean and standard deviation) of the empirical credit loss distribution with the corresponding moments of the estimated distribution.

[Figure: empirical credit loss distribution together with the fitted Gamma, Beta and Weibull densities.]
The Gamma and Beta distributions are practically indistinguishable and are represented by the full line, while the Weibull distribution is represented by the dotted line.

¹² This methodology generally generates good empirical results, although it is not based on any theoretical model.
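As an illustration of the moment-matching step, the Gamma case has a closed form (mean = kθ, variance = kθ²); the Beta and Weibull parameters would be obtained by solving the analogous moment equations numerically. A minimal sketch, using the whole-sample moments quoted above:

```python
def gamma_from_moments(mean, std):
    """Method-of-moments Gamma fit: mean = k * theta, variance = k * theta**2."""
    theta = std**2 / mean           # scale
    k = mean / theta                # shape
    return k, theta

# whole-sample empirical moments: mean 0.68%, std 0.60%
k, theta = gamma_from_moments(0.68, 0.60)   # k ~ 1.28, theta ~ 0.53
```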
3.4. Analytical approximation to the credit loss distribution

In the one-factor model, and given the assumptions detailed in Section 2.3.1, we estimate the cumulative credit loss distribution using equation (26):

(26)  $F(c) = \Phi(g^{-1}(c)) = \Phi\!\left(\frac{1}{\sqrt{\rho}}\left[\sqrt{1-\rho}\;\Phi^{-1}\!\left(\frac{c}{1-\alpha}\right) - \Phi^{-1}(p)\right]\right).$
We estimate the three unknown parameters in equation (26), ρ, α and p, using three different estimation methods. First, assuming¹³ that α = 50%, we estimate the two parameters ρ and p of the analytical credit loss distribution such that its first two moments (mean and standard deviation) match the corresponding moments of the empirical credit loss distribution. Secondly, we estimate the three parameters ρ, α and p such that the first three moments (mean, standard deviation and skewness) of the analytical distribution match the corresponding moments of the empirical distribution. Finally, we estimate the three parameters of the analytical distribution such that determined central percentiles match the corresponding percentiles of the empirical distribution¹⁴. The results obtained with the three methods are similar. Consequently we only present the results of the estimations using the first three moments.

[Figure: analytical approximation to the credit loss distribution against the empirical one. Estimated parameters: asset correlation = 8.69%, recovery rate = 50.39%, probability of default = 1.42%.]

¹³ This is the benchmark recovery rate used frequently in the banking industry.
¹⁴ We believe this might be a more robust method, because the second and third moments are very sensitive to extreme values. The estimations are carried out for the percentiles 20%, 40%, 60% and 80%, as well as for the percentiles 10%, 30%, 50%, 70% and 90%.

The estimated recovery rate is in line with the common assumption in the banking industry (i.e. LGD = 50%). Moreover, the estimated asset correlation is in line with the asset correlations proposed by the Basel Committee in November 2001 (between 10% and 20% for corporate exposures and between 4% and 15% for retail exposures). We compare the analytical approximation obtained to the empirical distribution as well as to the Gamma, Beta and Weibull distributions obtained previously. We present the results for the whole distribution as well as in the tail of the distribution.
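A sketch of the three-moment estimation just described: the model moments are computed with the quadrature of equation (13), written here with the equivalent form of g using c_{1,1} = −Φ⁻¹(p), and matched to the empirical moments with a root finder. Convergence depends on the starting values, and in practice one would constrain 0 < ρ < 1, 0 < α < 1 and 0 < p < 1.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import fsolve

def model_moments(rho, alpha, p, K=8.0, N=4000):
    """Mean, std. dev. and skewness of the one-factor loss L = g(Y),
    g(y) = (1 - alpha) * Phi((Phi^{-1}(p) + sqrt(rho) * y) / sqrt(1 - rho)),
    approximated on a grid as in equation (13)."""
    y = np.linspace(-K, K, N)
    L = (1 - alpha) * norm.cdf((norm.ppf(p) + np.sqrt(rho) * y) / np.sqrt(1 - rho))
    w = norm.pdf(y)
    w /= w.sum()                                  # normalised Gaussian weights
    m1 = np.sum(L * w)
    m2 = np.sum((L - m1) ** 2 * w)
    m3 = np.sum((L - m1) ** 3 * w) / m2 ** 1.5    # skewness
    return np.array([m1, np.sqrt(m2), m3])

# empirical whole-sample moments quoted above: mean 0.68%, std 0.60%, skew 2.10
target = np.array([0.0068, 0.0060, 2.10])
rho, alpha, p = fsolve(lambda x: model_moments(*x) - target, x0=[0.10, 0.50, 0.015])
```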
[Figures: cumulative credit loss distributions (empirical, analytical, Gamma, Beta, Weibull), for the whole distribution and zoomed into the tail above the 95% cumulative frequency.]
The Gamma, Beta and Weibull distributions are practically indistinguishable and exhibit a relatively good fit to the empirical credit loss distribution. The analytical approximation also exhibits a good fit to the empirical distribution.
Finally, we compute the approximation errors relative to the empirical credit loss distribution for the different analytical estimation methods, as well as for the estimated Gamma distribution, at various confidence levels. We do not show the approximation errors for the Beta and Weibull distributions, as these are practically the same as for the Gamma distribution.

[Figure: approximation errors relative to the empirical credit loss distribution at the 95%, 97.5%, 99%, 99.5% and 99.9% confidence levels, for the two-moment, three-moment and percentile-based analytical estimations and the Gamma fit.]
4. Identification of different credit loss distributions The idea is to split the sample in two groups to take into account that the groups exhibit fundamental differences in terms of business operations and thus have different credit loss distributions. This should enable us to improve the quality of the analytical approximations. First we split the sample in banks and savings banks and secondly we split the sample in small banks and large banks.
4.1. Split between banks and savings banks

Generally speaking, savings banks are typically smaller banks dedicated to retail business on a relatively local level. This is why it is thought probable that their loan portfolios are less diversified over regions and industries and that they have higher unexpected credit losses than banks, which in general operate on a global basis and are thus believed to have more diversified loan portfolios.
[Figure: credit losses (%) against gross loans ('000 USD) for banks. Statistics of credit losses (%): mean 0.69, std. dev. 0.67, skewness 2.10, kurtosis 10.10, minimum 0.00, maximum 5.18, 512 observations.]

[Figure: credit losses (%) against gross loans ('000 USD) for savings banks; the observations with gross loans above USD 15 billion correspond to La Caixa and Caja Madrid. Statistics of credit losses (%): mean 0.65, std. dev. 0.45, skewness 1.42, kurtosis 6.17, minimum 0.00, maximum 2.84, 315 observations.]
It can be seen that banks have loan portfolios of all sizes, while in general savings banks have small loan portfolios. The exceptions are La Caixa and Caja Madrid, which have gross loans exceeding USD 15 billion in all years. In addition, although the expected loss (mean) is similar for banks and savings banks, the unexpected loss (standard deviation) is larger for banks.
[Figure: analytical approximation to the credit loss distribution for banks. Estimated parameters: asset correlation = 10.17%, recovery rate = 50.06%, probability of default = 1.41%.]

[Figure: analytical approximation to the credit loss distribution for savings banks. Estimated parameters: asset correlation = 5.34%, recovery rate = 50.09%, probability of default = 1.36%.]
In both estimations we obtain α very close to 50% and a probability of default of the order of 1.4%. Moreover, the asset correlations obtained imply a higher diversification of the savings banks relative to the banks. This result contradicts our belief that banks have more diversified loan portfolios than savings banks.
[Figures: cumulative credit loss distributions (empirical, analytical, Gamma, Beta, Weibull) in the tail, for banks and for savings banks.]
Even though it appears that the analytical approximations improve in the case of banks, for savings banks this does not seem to be so. We believe this can be explained by the fact that the savings banks group is not homogeneous in terms of operations and diversification. In particular we note that La Caixa and Caja Madrid are more comparable to banks than to savings banks in terms of their business and operations. Consequently, it seems that the criterion chosen to split the sample into banks and savings banks is rather arbitrary and not justifiable. Not only do we obtain counterintuitive results concerning unexpected losses and correlations, but we also find that this split does not lead to a satisfactory improvement of the analytical approximations to the credit loss distributions. For this reason we consider another criterion to split the sample: loan portfolio size.
4.2. Split between small banks and large banks

Small banks are expected to have higher unexpected credit losses than large banks due to their less important geographic and sectorial diversification. We split the sample in two groups composed of a similar number of observations according to the volume of gross loans, which we use as a proxy for the size of the bank. The critical value of gross loans chosen is USD 1 billion, which enables us to split the sample in two groups of similar size (number of observations).

[Figure: credit losses (%) against gross loans ('000 USD) for large banks. Statistics of credit losses (%): mean 0.68, std. dev. 0.46, skewness 1.13, kurtosis 4.85, minimum 0.00, maximum 3.05, 416 observations.]

[Figure: credit losses (%) against gross loans ('000 USD) for small banks. Statistics of credit losses (%): mean 0.68, std. dev. 0.71, skewness 2.20, kurtosis 10.00, minimum 0.00, maximum 5.18, 411 observations.]
The banks with the largest gross loans are BBVA, SCH, La Caixa and Caja Madrid. It can be seen that large banks have less extreme credit losses than small banks. Moreover, although the expected loss (mean) is the same for small banks as for large banks, the unexpected loss (standard deviation) is larger for small banks.

[Figure: analytical approximation to the credit loss distribution for large banks. Estimated parameters: asset correlation = 4.75%, recovery rate = 50.75%, probability of default = 1.43%.]

[Figure: analytical approximation to the credit loss distribution for small banks. Estimated parameters: asset correlation = 10.97%, recovery rate = 50.01%, probability of default = 1.41%.]
In both estimations we obtain α very close to 50% and a probability of default of the order of 1.4%. Moreover, the asset correlations obtained imply a higher diversification of large banks relative to small banks. This result corroborates our belief that large banks have more diversified loan portfolios than small banks.
[Figures: cumulative credit loss distributions (empirical, analytical, Gamma, Beta, Weibull) in the tail, for large banks and for small banks.]
We observe a distinct improvement of the analytical approximations for small banks, but not for large banks. This can be explained mainly by the heterogeneity that remains among the observations constituting the large banks group, given that the critical value of gross loans chosen to split the sample in two is only USD 1 billion. We believe we could improve the analytical approximations for large banks by considering a more homogeneous sub-sample, i.e. a larger critical value of gross loans to split the sample. However, this is not possible due to the small number of observations with a high volume of gross loans, which would make the estimations very imprecise.
5. Estimation of economic capital

The loss distribution can be used to determine economic capital, which is defined as the difference between some selected high confidence percentile of losses (e.g. 99.9%) and the expected loss. This corresponds to the level of capital that the bank needs to set aside in order to protect itself (with a certain level of confidence) against unexpected losses. Using the expected default frequencies (EDF) provided for each rating by Standard & Poor's, we calculate the levels of economic capital (EC) at different confidence levels for the whole sample, large banks and small banks, using for each of them the estimations obtained previously.
S&P Rating | EDF    | Conf. level | EC (All Banks) | EC (Large Banks) | EC (Small Banks)
AAA        | 0.01%  | 99.99%      | 5.52%          | 3.17%            | 6.97%
AA         | 0.03%  | 99.97%      | 4.67%          | 2.72%            | 5.82%
A          | 0.08%  | 99.92%      | 3.92%          | 2.37%            | 4.87%
BBB+       | 0.14%  | 99.86%      | 3.52%          | 2.12%            | 4.32%
BBB        | 0.20%  | 99.80%      | 3.27%          | 1.97%            | 4.02%
BBB-       | 0.30%  | 99.70%      | 2.97%          | 1.82%            | 3.62%
BB+        | 0.50%  | 99.50%      | 2.62%          | 1.62%            | 3.17%
BB         | 0.90%  | 99.10%      | 2.22%          | 1.42%            | 2.67%
BB-        | 1.50%  | 98.50%      | 1.87%          | 1.22%            | 2.22%
B+         | 2.50%  | 97.50%      | 1.57%          | 1.02%            | 1.82%
B          | 4.50%  | 95.50%      | 1.17%          | 0.82%            | 1.37%
B-         | 7.50%  | 92.50%      | 0.87%          | 0.62%            | 1.02%
CCC        | 15.00% | 85.00%      | 0.47%          | 0.37%            | 0.52%
It can be observed that as the rating diminishes, i.e. the EDF increases, the level of economic capital decreases for the three distributions. Furthermore, the economic capital corresponding to a given rating or EDF is on average twice as high for small banks as for large banks. This has a double interpretation. On one hand, to attain a given rating, small banks need to set aside twice as much capital as large banks. On the other hand, from a regulator’s perspective, if all banks are required to set aside a certain amount of capital, then this means that large banks have to maintain a better rating than small banks. For instance, if banks are required to set aside 3% of capital, then large banks need to be AAA, while small banks only need to be BB+. In other words, because the regulator is concerned about the health of the financial system, he is more concerned about large banks defaulting than about small banks defaulting.
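The table can be reproduced, up to rounding, from the estimated parameters: EC at a confidence level δ is the δ-percentile of the one-factor loss distribution, c = g(Φ⁻¹(δ)) as in equation (14), minus the expected loss (1−α)p. A sketch:

```python
import numpy as np
from scipy.stats import norm

def economic_capital(conf, p, rho, alpha):
    """EC = loss percentile at `conf` minus expected loss (one-factor model).
    The percentile follows from equation (14): c = g(Phi^{-1}(conf))."""
    c = (1 - alpha) * norm.cdf((norm.ppf(p) + np.sqrt(rho) * norm.ppf(conf))
                               / np.sqrt(1 - rho))
    return c - (1 - alpha) * p

# whole-sample estimates (rho = 8.69%, alpha = 50.39%, p = 1.42%):
# economic_capital(0.999, 0.0142, 0.0869, 0.5039) gives roughly 3.8%
```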
6. Conclusions

This study aims to obtain an analytical expression for the credit loss distribution of loan portfolios of Spanish banks, in order to estimate economic capital in a simpler way than with the usual simulation approach. The analytical approximation used is based on a one-factor model and is comparable to the distribution used by the Basel Committee in its proposal on the New Capital Accord. We find that the Gamma, Beta and Weibull distributions, employed frequently by practitioners, fit the empirical credit loss distribution relatively well. Furthermore, the analytical approximations obtained are also reasonably accurate approximations to the empirical distribution, and the estimated asset correlations are in line with the correlations proposed by the Basel Committee in November 2001 (between 10% and 20% for corporate exposures and between 4% and 15% for retail exposures)¹⁵.

In order to improve the accuracy of the analytical approximations, we split the whole sample in two groups based on the belief of the existence of fundamental differences between the two groups. This would justify the existence of two different credit loss distributions instead of a unique distribution and enable us to obtain better approximations by estimating the parameters for each of the two groups. We first split the sample in banks and savings banks, and then by loan portfolio size. The results obtained in terms of unexpected losses and asset correlations enable us to conclude that the appropriate criterion to split the sample is that of size. Estimations realised for large banks and small banks show that the two groups are similar in terms of credit quality (similar probabilities of default and recovery rates) but differ in diversification (different asset correlations).

Finally, we provide estimations of economic capital at different confidence levels for the whole sample, large banks and small banks. We obtain that small banks need to set aside twice as much capital as large banks to attain the same rating, or, equivalently, that if all banks are required to set aside the same amount of capital, then large banks have to achieve a better rating than small banks.

Possible extensions to this analysis are to use a multi-factor model or to distinguish between different types of customers. These extensions were presented from a theoretical point of view; their disadvantage from a practical point of view is the higher complexity of the model and the increasing number of parameters to be estimated.
15 Because the sample of banks employed contains in large part commercial banks, one would expect the asset correlations to be closer to those proposed for retail exposures. The results obtained go in this direction.
References

[1] Basel Consultative Papers. http://www.bis.org.
[2] Credit Suisse (1999): "CreditRisk+". http://www.csfp.csh.com.
[3] Crosbie, P. (1997): "Modelling Default Risk". http://www.kmv.com.
[4] J.P. Morgan (1999): "CreditMetrics" (4th ed.). www.creditmetrics.com.
[5] Lucas, A., Klaassen, P., Spreij, P. and Straetmans, S. (1999): "An Analytic Approach to Credit Risk of Large Corporate Bond and Loan Portfolios". Research Memorandum 1999-18.
[6] Merton, R. (1974): "On the Pricing of Corporate Debt: The Risk Structure of Interest Rates". Journal of Finance 29, 429-442.
[7] Ong, M. (1999): Internal Credit Risk Models. Risk Books.
[8] Schönbucher, P. (2000): "Factor Models for Portfolio Credit Risk". Working Paper, Bonn University.
Juan Carlos García Céspedes
Metodología de Riesgos Corporativos, BBVA
Paseo de Recoletos 8, tercera planta
28001 Madrid, España
[email protected]

Angel M. Mencía González
Metodología de Riesgos Corporativos, BBVA
Paseo de Recoletos 8, tercera planta
28001 Madrid, España
[email protected]

Mercedes Morris Muñoz
Metodología de Riesgos Corporativos, BBVA
Paseo de Recoletos 8, tercera planta
28001 Madrid, España
[email protected]