<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "https://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.1" specific-use="sps-1.9" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
	<front>
		<journal-meta>
			<journal-id journal-id-type="publisher-id">rmef</journal-id>
			<journal-title-group>
				<journal-title>Revista mexicana de economía y finanzas</journal-title>
				<abbrev-journal-title abbrev-type="publisher">Rev. mex. econ. finanz</abbrev-journal-title>
			</journal-title-group>
			<issn pub-type="ppub">1665-5346</issn>
			<issn pub-type="epub">2448-6795</issn>
			<publisher>
				<publisher-name>Instituto Mexicano de Ejecutivos de Finanzas, A. C.</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.21919/remef.v15i1.446</article-id>
			<article-categories>
				<subj-group subj-group-type="heading">
					<subject>Research papers</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>Hierarchical PCA and Applications to Portfolio Management</article-title>
				<trans-title-group xml:lang="es">
					<trans-title>PCA jerárquico y aplicaciones a la gestión de cartera</trans-title>
				</trans-title-group>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<name>
						<surname>Avellaneda</surname>
						<given-names>Marco</given-names>
					</name>
					<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
					<xref ref-type="corresp" rid="c1">*</xref>
				</contrib>
				<aff id="aff1">
					<label>1</label>
					<institution content-type="original">Courant Institute of Mathematical Sciences, NYU, USA</institution>
					<institution content-type="orgname">Courant Institute of Mathematical Sciences</institution>
					<addr-line>
						<state>NYU</state>
					</addr-line>
					<country country="US">USA</country>
				</aff>
			</contrib-group>
			<author-notes>
				<corresp id="c1">
					<label>*</label>251 Mercer Street, New York, NY 10012</corresp>
			</author-notes>
			<pub-date date-type="pub" publication-format="electronic">
				<day>01</day>
				<month>01</month>
				<year>2020</year>
			</pub-date>
			<pub-date date-type="collection" publication-format="electronic">
				<season>Jan-Mar</season>
				<year>2020</year>
			</pub-date>
			<volume>15</volume>
			<issue>1</issue>
			<fpage>01</fpage>
			<lpage>16</lpage>
			<history>
				<date date-type="received">
					<day>02</day>
					<month>07</month>
					<year>2019</year>
				</date>
				<date date-type="accepted">
					<day>08</day>
					<month>10</month>
					<year>2019</year>
				</date>
			</history>
			<permissions>
				<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by-nc/4.0/" xml:lang="en">
					<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License</license-p>
				</license>
			</permissions>
			<abstract>
				<title>Abstract</title>
				<p>It is widely known that the common risk-factors derived from PCA beyond the first eigenportfolio are generally difficult to interpret and thus to use in practical portfolio management. We explore an alternative approach (HPCA) which makes strong use of the partition of the market into sectors. We show that this approach leads to no loss of information with respect to PCA in the case of equities (constituents of the S&amp;P 500) and also that the associated common factors admit simple interpretations. The model can also be used in markets in which the sectors have asynchronous price information, such as single-name credit default swaps, generalizing the works of <xref ref-type="bibr" rid="B3">Cont and Kan (2011)</xref> and Ivanov (2016). </p>
			</abstract>
			<trans-abstract xml:lang="es">
				<title>Resumen</title>
				<p>Es ampliamente conocido que los factores de riesgo comunes derivados del PCA más allá de la primera eigenportafolio son generalmente difíciles de interpretar y, por lo tanto, de utilizar en la gestión práctica de la cartera. Exploramos un enfoque alternativo (HPCA) que hace un fuerte uso de la partición del mercado en sectores. Demostramos que este enfoque no conduce a la pérdida de información con respecto al PCA en el caso de la renta variable (constituidos por el S&amp;P 500) y también que los factores comunes asociados admiten interpretaciones simples. El modelo también se puede utilizar en mercados en los que los sectores tienen información asincrónica de precios, como single-name swaps de incumplimiento de crédito, generalizando las obras de <xref ref-type="bibr" rid="B3">Cont y Kan (2011)</xref> e Ivanov (2016). </p>
			</trans-abstract>
			<kwd-group xml:lang="en" kwd-group-type="JEL">
				<title><italic>JEL Classification:</italic></title>
				<kwd><italic>C02</italic></kwd>
				<kwd><italic>C65</italic></kwd>
				<kwd><italic>G24</italic></kwd>
			</kwd-group>
			<kwd-group xml:lang="en">
				<title><italic>Keywords:</italic></title>
				<kwd><italic>returns</italic></kwd>
				<kwd><italic>blocks</italic></kwd>
				<kwd><italic>PCA</italic></kwd>
				<kwd><italic>HPCA</italic></kwd>
				<kwd><italic>portfolio</italic></kwd>
			</kwd-group>
			<kwd-group xml:lang="es" kwd-group-type="JEL">
				<title><italic>Clasificación JEL:</italic></title>
				<kwd><italic>C02</italic></kwd>
				<kwd><italic>C65</italic></kwd>
				<kwd><italic>G24</italic></kwd>
			</kwd-group>
			<kwd-group xml:lang="es">
				<title><italic>Palabras clave:</italic></title>
				<kwd><italic>rendimiento</italic></kwd>
				<kwd><italic>bloques</italic></kwd>
				<kwd><italic>PCA</italic></kwd>
				<kwd><italic>HPCA</italic></kwd>
				<kwd><italic>portafolio</italic></kwd>
			</kwd-group>
			<counts>
				<fig-count count="14"/>
				<table-count count="2"/>
				<equation-count count="19"/>
				<ref-count count="9"/>
				<page-count count="16"/>
			</counts>
		</article-meta>
	</front>
	<body>
		<sec sec-type="intro">
			<title>1 Introduction</title>
			<p> Principal Components Analysis (PCA) and random matrix theory (RMT) have become widespread tools for data analysis. PCA (<xref ref-type="bibr" rid="B6">Joliffe (2002)</xref> [<xref ref-type="bibr" rid="B6">6</xref>]) provides a mathematical and objective approach to extract economic information from the correlation matrices of asset returns. In this approach, the analyst extracts common risk factors from the eigenvectors and eigenvalues of the correlation matrix.</p>
			<p>The first eigenvector of the correlation of stock returns corresponds to the solution of the variational problem </p>
			<p>
				<disp-formula id="e1">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mi>argmax</mml:mi>
						<mml:mspace width="0.5em"/>
						<mml:mo>{</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>t</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mi>R</mml:mi>
						<mml:mi>V</mml:mi>
						<mml:mo>;</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mi>V</mml:mi>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>=</mml:mo>
						<mml:mn>1</mml:mn>
						<mml:mo>}</mml:mo>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(1)</label>
				</disp-formula>
			</p>
			<p> Here, <inline-formula>
					<mml:math>
						<mml:mi>R</mml:mi>
					</mml:math>
				</inline-formula> is the correlation matrix of daily returns and <inline-formula>
					<mml:math>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
					</mml:math>
				</inline-formula> is the Euclidean norm in <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>R</mml:mi>
								<mml:mi>n</mml:mi>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula>, <inline-formula>
					<mml:math>
						<mml:mi>n</mml:mi>
					</mml:math>
				</inline-formula> being the total number of assets. <xref ref-type="disp-formula" rid="e1">Equation 1</xref> shows that the principal eigenvector represents the direction (line) that “captures the most variance” as described by the correlation matrix. The first eigenvector satisfies </p>
			<p>
				<disp-formula id="e2">
					<mml:math>
						<mml:mi>R</mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>λ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
					</mml:math>
					<label>(2)</label>
				</disp-formula>
			</p>
			<p>PCA also recursively finds additional (orthogonal) directions beyond <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>V</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mn>1</mml:mn>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> which capture the most variance. The other eigenvectors and eigenvalues are computed in the same way as in <xref ref-type="disp-formula" rid="e1">Eq. (1)</xref>, with the maximization restricted to the subspace orthogonal to the span of the eigenvectors computed previously, i.e., </p>
			<p>
				<disp-formula id="e3">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mi>argmax</mml:mi>
						<mml:mspace width="0.5em"/>
						<mml:mo>{</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>t</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mi>R</mml:mi>
						<mml:mi>V</mml:mi>
						<mml:mo>;</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mi>V</mml:mi>
						<mml:mo>|</mml:mo>
						<mml:mo>|</mml:mo>
						<mml:mo>=</mml:mo>
						<mml:mn>1</mml:mn>
						<mml:mo>,</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
								<mml:mi>t</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:msup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>l</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mn>0</mml:mn>
						<mml:mo>,</mml:mo>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mn>1</mml:mn>
						<mml:mo>≤</mml:mo>
						<mml:mi>l</mml:mi>
						<mml:mo>&lt;</mml:mo>
						<mml:mi>k</mml:mi>
						<mml:mo>}</mml:mo>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(3)</label>
				</disp-formula>
			</p>
			<p> The eigenvalues satisfy <inline-formula>
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>λ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>&gt;</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>λ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>2</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>≥</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>≥</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>λ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>n</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
					</mml:math>
				</inline-formula>. Assume that the data corresponds to the daily returns of a group of stocks. The Karhunen-Loeve representation of the standardized returns is</p>
			<p>
				<disp-formula id="e4">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>n</mml:mi>
								</mml:mrow>
							</mml:munderover>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msqrt>
							<mml:msup>
								<mml:mrow>
									<mml:mi>λ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mo>(</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mo>)</mml:mo>
								</mml:mrow>
							</mml:msup>
						</mml:msqrt>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
					</mml:math>
					<label>(4)</label>
				</disp-formula>
			</p>
			<p> where </p>
			<p>
				<disp-formula id="e5">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mfrac>
							<mml:mrow>
								<mml:mn>1</mml:mn>
							</mml:mrow>
							<mml:mrow>
								<mml:msqrt>
									<mml:msup>
										<mml:mrow>
											<mml:mi>λ</mml:mi>
										</mml:mrow>
										<mml:mrow>
											<mml:mo>(</mml:mo>
											<mml:mi>k</mml:mi>
											<mml:mo>)</mml:mo>
										</mml:mrow>
									</mml:msup>
								</mml:msqrt>
							</mml:mrow>
						</mml:mfrac>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>i</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>n</mml:mi>
								</mml:mrow>
							</mml:munderover>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mi> </mml:mi>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(5)</label>
				</disp-formula>
			</p>
			<p>By construction, <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>F</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mi>k</mml:mi>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> are uncorrelated and have variance 1. Since these random variables are linear combinations of the daily standardized returns of the assets, we call them (standardized) “eigenportfolio (EP) returns”, with the caveat that the actual portfolio “weights” are obtained by dividing each entry of the eigenvector by the volatility of the asset (<xref ref-type="bibr" rid="B1">Avellaneda and Lee 2008, 2010</xref>) [<xref ref-type="bibr" rid="B1">1</xref>].<xref ref-type="fn" rid="fn2"><sup>2</sup></xref>
			</p>
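			<p>As an illustration of <xref ref-type="disp-formula" rid="e4">Eq. (4)</xref> and <xref ref-type="disp-formula" rid="e5">Eq. (5)</xref>, the following minimal Python sketch computes standardized eigenportfolio returns with NumPy. The synthetic data, the function name <monospace>eigenportfolio_returns</monospace> and all variable names are assumptions made for this example; they are not taken from the original analysis.</p>
			<code language="python"><![CDATA[
```python
import numpy as np


def eigenportfolio_returns(returns):
    """Hypothetical helper: PCA of a T x n matrix of asset returns.

    Returns eigenvalues (descending), eigenvectors (columns), the
    standardized eigenportfolio returns F (T x n) and portfolio weights.
    """
    sigma = returns.std(axis=0, ddof=1)              # per-asset volatility
    X = (returns - returns.mean(axis=0)) / sigma     # standardized returns
    R = np.corrcoef(returns, rowvar=False)           # n x n correlation matrix
    lam, V = np.linalg.eigh(R)                       # eigh returns ascending order
    order = np.argsort(lam)[::-1]                    # reorder to descending
    lam, V = lam[order], V[:, order]
    F = (X @ V) / np.sqrt(lam)                       # Eq. (5): column k holds F^(k)
    weights = V / sigma[:, None]                     # eigenvector entries / volatility
    return lam, V, F, weights


# Synthetic data (an assumption): one common driver plus idiosyncratic noise.
rng = np.random.default_rng(0)
T, n = 500, 8
market = rng.standard_normal((T, 1))
returns = 0.01 * (0.6 * market + rng.standard_normal((T, n)))

lam, V, F, weights = eigenportfolio_returns(returns)
factor_cov = np.cov(F, rowvar=False)                 # sample covariance of the F^(k)
```
]]></code>
			<p>In this sketch the sample covariance of the columns of <monospace>F</monospace> equals the identity matrix up to rounding error, which is the numerical counterpart of the statement that the eigenportfolio returns are uncorrelated and have variance 1.</p>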
			<p>PCA is a framework for learning about the common factors which affect the returns of a given group of assets. The first eigenportfolio, associated with the r.v. <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>F</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mn>1</mml:mn>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula>, is a common risk factor which explains the maximum variability. We can write a one-factor model for each asset, namely </p>
			<p>
				<disp-formula id="e6">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>+</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ϵ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
					</mml:math>
					<label>(6)</label>
				</disp-formula>
			</p>
			<p> where <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>β</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> is the regression coefficient of the standardized return on the first EP. The “residuals” <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>ϵ</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> in <xref ref-type="disp-formula" rid="e6">equation 6</xref> are uncorrelated with <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>F</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mn>1</mml:mn>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula>, as desired. However, the residuals of different stocks are generally correlated with one another.</p>
			<p>The regression coefficients satisfy </p>
			<p>
				<disp-formula id="e7">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:msqrt>
							<mml:msup>
								<mml:mrow>
									<mml:mi>λ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mo>(</mml:mo>
									<mml:mn>1</mml:mn>
									<mml:mo>)</mml:mo>
								</mml:mrow>
							</mml:msup>
						</mml:msqrt>
						<mml:mi> </mml:mi>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>V</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mo>,</mml:mo>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi>j</mml:mi>
						<mml:mo>=</mml:mo>
						<mml:mn>1</mml:mn>
						<mml:mo>,</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>,</mml:mo>
						<mml:mi>n</mml:mi>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(7)</label>
				</disp-formula>
			</p>
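			<p>The identity in <xref ref-type="disp-formula" rid="e7">Eq. (7)</xref> can be checked numerically. The sketch below, written under the assumption of synthetic data and with hypothetical variable names, regresses each standardized return on the first eigenportfolio return and compares the ordinary least squares slope with the closed-form loading.</p>
			<code language="python"><![CDATA[
```python
import numpy as np

# Synthetic returns (an assumption): a common driver plus independent noise.
rng = np.random.default_rng(1)
T, n = 750, 6
common = rng.standard_normal((T, 1))
raw = 0.7 * common + rng.standard_normal((T, n))

X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)   # standardized returns
R = np.corrcoef(raw, rowvar=False)
lam, V = np.linalg.eigh(R)                               # ascending eigenvalues
lam1, V1 = lam[-1], V[:, -1]                             # top eigenvalue and eigenvector
F1 = X @ V1 / np.sqrt(lam1)                              # Eq. (5) with k = 1

# OLS slope of each X_j on F1 (both have zero sample mean).
beta_ols = (X.T @ F1) / (F1 @ F1)
beta_pca = np.sqrt(lam1) * V1                            # Eq. (7)

resid = X - np.outer(F1, beta_ols)                       # residuals of Eq. (6)
resid_factor_cov = resid.T @ F1 / (T - 1)                # covariance with F1, per asset
resid_corr = np.corrcoef(resid, rowvar=False)            # cross-asset residual correlation
```
]]></code>
			<p>Here <monospace>beta_ols</monospace> and <monospace>beta_pca</monospace> agree up to rounding error and the residuals have zero sample covariance with the factor, whereas the off-diagonal entries of <monospace>resid_corr</monospace> are generally nonzero.</p>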
			<p> In the case of economic data, which is noisy, the consensus is to disregard EPs which correspond to low eigenvalues. In a celebrated paper, <xref ref-type="bibr" rid="B7">Laloux <italic>et al</italic> (2000)</xref> [<xref ref-type="bibr" rid="B7">7</xref>] proposed to use random matrix theory (RMT) to establish a cutoff in the number of EPs used to model the standardized returns, namely </p>
			<p>
				<disp-formula id="e8">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>m</mml:mi>
								</mml:mrow>
							</mml:munderover>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mo>+</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ϵ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
					</mml:math>
					<label>(8)</label>
				</disp-formula>
			</p>
			<p> where <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msubsup>
								<mml:mi>β</mml:mi>
								<mml:mi>j</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mi>k</mml:mi>
									</mml:mfenced>
								</mml:mrow>
							</mml:msubsup>
						</mml:mrow>
					</mml:math>
				</inline-formula> are “factor loadings” and (with a slight abuse of notation) <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>ϵ</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> are residuals obtained after “defactoring” relative to the <inline-formula>
					<mml:math>
						<mml:mi>m</mml:mi>
					</mml:math>
				</inline-formula> eigenportfolios. The number <inline-formula>
					<mml:math>
						<mml:mi>m</mml:mi>
					</mml:math>
				</inline-formula> is a cutoff which is to be determined from the context.</p>
			<p>According to [<xref ref-type="bibr" rid="B7">7</xref>], the eigenvalues of a pure noise matrix follow the Marcenko-Pastur distribution and have a spectrum which, for large matrices, is asymptotically bounded from above by <inline-formula>
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>λ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>+</mml:mo>
								<mml:mo>,</mml:mo>
								<mml:mi>M</mml:mi>
								<mml:mi>P</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mo>(</mml:mo>
						<mml:mn>1</mml:mn>
						<mml:mo>+</mml:mo>
						<mml:msqrt>
							<mml:mi>n</mml:mi>
							<mml:mo>/</mml:mo>
							<mml:mi>T</mml:mi>
						</mml:msqrt>
						<mml:msup>
							<mml:mrow>
								<mml:mo>)</mml:mo>
							</mml:mrow>
							<mml:mrow>
								<mml:mn>2</mml:mn>
							</mml:mrow>
						</mml:msup>
					</mml:math>
				</inline-formula>, where <inline-formula>
					<mml:math>
						<mml:mi>T</mml:mi>
					</mml:math>
				</inline-formula> is the number of observations. Asymptotics should hold in the limit <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>n</mml:mi>
							<mml:mo>/</mml:mo>
							<mml:mi>T</mml:mi>
							<mml:mo>→</mml:mo>
							<mml:mi>γ</mml:mi>
						</mml:mrow>
					</mml:math>
				</inline-formula> (a constant) as <inline-formula>
					<mml:math>
						<mml:mi>n</mml:mi>
					</mml:math>
				</inline-formula> and <inline-formula>
					<mml:math>
						<mml:mi>T</mml:mi>
					</mml:math>
				</inline-formula> both tend to infinity. The way to use RMT to calculate the cutoff is to construct the correlation matrix <inline-formula>
					<mml:math>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>R</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mo>,</mml:mo>
								<mml:mi>j</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>m</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mo>=</mml:mo>
						<mml:mi>Corr</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ϵ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>,</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ϵ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>)</mml:mo>
					</mml:math>
				</inline-formula> for <inline-formula><mml:math><mml:mi>m</mml:mi></mml:math></inline-formula> large enough and verify that its top eigenvalue is of the order of <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>λ</mml:mi>
								<mml:mrow>
									<mml:mo>+</mml:mo>
									<mml:mo>,</mml:mo>
									<mml:mi>M</mml:mi>
									<mml:mi>P</mml:mi>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula>. One can also compare the empirical distribution of eigenvalues with the Marcenko-Pastur probability distribution.</p>
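			<p>The cutoff procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the implementation used in the cited works: the function names <monospace>mp_upper_edge</monospace> and <monospace>rmt_cutoff</monospace>, the tolerance <monospace>tol</monospace> (one arbitrary reading of “of the order of”) and the search limit <monospace>max_m</monospace> are all assumptions.</p>
			<code language="python"><![CDATA[
```python
import numpy as np


def mp_upper_edge(n, T):
    """Asymptotic upper edge of the Marcenko-Pastur spectrum, (1 + sqrt(n/T))^2."""
    return (1.0 + np.sqrt(n / T)) ** 2


def rmt_cutoff(returns, tol=1.1, max_m=20):
    """Smallest m whose residual correlation matrix has a top eigenvalue
    no larger than tol times the Marcenko-Pastur edge (None if not found).
    The parameters tol and max_m are hypothetical choices for this sketch."""
    T, n = returns.shape
    X = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
    _, V = np.linalg.eigh(np.corrcoef(returns, rowvar=False))
    V = V[:, ::-1]                                   # columns by descending eigenvalue
    edge = mp_upper_edge(n, T)
    for m in range(max_m + 1):
        top_m = V[:, :m]
        resid = X - X @ top_m @ top_m.T              # remove the first m eigenportfolios
        top = np.linalg.eigvalsh(np.corrcoef(resid, rowvar=False))[-1]
        if top <= tol * edge:
            return m
    return None


# Synthetic data (an assumption): pure noise, and noise plus one common factor.
rng = np.random.default_rng(2)
T, n = 1000, 100
noise = rng.standard_normal((T, n))
with_factor = noise + 0.5 * rng.standard_normal((T, 1))

edge = mp_upper_edge(n, T)
m_noise = rmt_cutoff(noise)
m_factor = rmt_cutoff(with_factor)
```
]]></code>
			<p>On data of this kind one would expect the pure-noise matrix to require few or no factors and the one-factor matrix to require at least one, although the exact values depend on the random draw and on the chosen tolerance.</p>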
			<p>PCA aided by RMT is an elegant approach to analyzing correlation matrices of financial data and can also be applied to many areas of science. The main strength of the method is that it can detect common risk factors based on a matrix of asset returns, without any additional information. In other words, PCA “lets the data speak for itself”. Generally speaking, PCA explains the most variability with the smallest number of factors. Most studies tend to justify the PCA approach by recognizing that it produces some factors which have <italic>ex-post</italic> economic interpretations, such as equating <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>E</mml:mi>
							<mml:msup>
								<mml:mi>P</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mn>1</mml:mn>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> with the Sharpe Market Portfolio (<xref ref-type="bibr" rid="B2">Boyle 2017</xref>) [<xref ref-type="bibr" rid="B2">2</xref>], or attempting to interpret higher-order EPs in terms of industry sectors [<xref ref-type="bibr" rid="B1">1</xref>]. In the case of fixed income, the EPs are often identified with “parallel shifts”, or with long-term vs. short-term oscillations of the yield curve (<xref ref-type="bibr" rid="B8">Litterman and Scheinkman, 1991</xref>) [<xref ref-type="bibr" rid="B8">8</xref>]. </p>
		</sec>
		<sec>
			<title>2 The identification problem</title>
			<p> One of the frequent criticisms of PCA in finance is that the common risk factors generated by the higher-order eigenportfolios (those beyond the first) are difficult to interpret and appear to be unstable across time. We call this the <italic>identification problem</italic>. Because of it, many portfolio managers favor traditional factor models such as Barra; see <xref ref-type="bibr" rid="B9">Shkolnik et al. (2016)</xref> [<xref ref-type="bibr" rid="B9">9</xref>] for alternative approaches to modeling financial correlations.</p>
			<p>The identification problem in PCA reflects the uncertainty, or unreliability, of cross-asset correlations. From a practical point of view, as the size of the trading universe increases, the correlations of assets which are not economically related (a tech stock with an energy stock, or with a foreign stock) are difficult to quantify and may be noisy. This could be due to several reasons: the lack of an “explanation” for the relation between the stocks, the fact that their prices are not sampled simultaneously (e.g. if they are end-of-day prices in different time zones), or a number of observations that is not large compared to the number of assets considered. For example, empirical correlations of price changes of out-of-the-money options with different underlying assets may not be as reliable or significant as the data would suggest.</p>
			<p>To mitigate the identification problem, we should seek a factor model which can recognize the economic nature or function of the asset as well as the statistical properties of returns. This leads us to the model described hereafter.</p>
		</sec>
		<sec>
			<title>3 Hierarchical PCA</title>
			<p> Hierarchical PCA (HPCA) applies to markets which can be partitioned into several sectors or asset classes. Consider first an abstract market, in which the empirical data matrix of asset returns, with dimensions <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>T</mml:mi>
							<mml:mo>×</mml:mo>
							<mml:mi>n</mml:mi>
						</mml:mrow>
					</mml:math>
				</inline-formula>, can be partitioned into “blocks of columns” labeled <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>k</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>1,2,...,</mml:mn>
							<mml:mi>b</mml:mi>
						</mml:mrow>
					</mml:math>
				</inline-formula>. These blocks have dimensions <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>T</mml:mi>
							<mml:mo>×</mml:mo>
							<mml:msub>
								<mml:mi>n</mml:mi>
								<mml:mi>k</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> with <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>k</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>1,2,...,</mml:mn>
							<mml:mi>b</mml:mi>
						</mml:mrow>
					</mml:math>
				</inline-formula>. Each block represents data sampled from a sector. For simplicity, we assume that the indices of the securities are organized so that the blocks are adjacent to one another in the matrix and do not overlap. We have a few concrete situations in mind:</p>
			<p>
				<list list-type="bullet">
					<list-item>
						<p>The blocks represent data of industry sectors for equities in the same economy (e.g. sectors associated with the 500 or so stocks in the S&amp;P 500 index). In this case, the columns of a block correspond to the historical standardized returns of the stocks in the sector observed over 
								<inline-formula>
									<mml:math>
										<mml:mi>T</mml:mi>
									</mml:math>
								</inline-formula>
							 consecutive dates.</p>
					</list-item>
					<list-item>
						<p>Each block represents a stock or index and all of the derivatives written on it. In this case, the columns in a block represent the returns of the stock and the changes of the implied volatilities of options with different strikes and tenors written on the stock (<xref ref-type="bibr" rid="B4">Dobi 2015</xref> [<xref ref-type="bibr" rid="B4">4</xref>]).</p>
					</list-item>
					<list-item>
						<p>In the context of credit derivatives, the data represents changes in credit spreads for CDS. The blocks correspond to CDS referencing the same obligor (issuer) but with different tenors (<xref ref-type="bibr" rid="B3">Cont and Kan (2011)</xref> [<xref ref-type="bibr" rid="B3">3</xref>], <xref ref-type="bibr" rid="B5">Ivanov (2017)</xref> [<xref ref-type="bibr" rid="B5">5</xref>]). </p>
					</list-item>
				</list>
			</p>
			<p>Define the function <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>I</mml:mi>
							<mml:mfenced>
								<mml:mi>j</mml:mi>
							</mml:mfenced>
							<mml:mo>=</mml:mo>
							<mml:mi>k</mml:mi>
						</mml:mrow>
					</mml:math>
				</inline-formula> if asset <inline-formula>
					<mml:math>
						<mml:mi>j</mml:mi>
					</mml:math>
				</inline-formula> is in block <inline-formula>
					<mml:math>
						<mml:mi>k</mml:mi>
					</mml:math>
				</inline-formula>. According to <xref ref-type="disp-formula" rid="e4">Eq. (4)</xref> we can write, for each asset in the “big universe”, </p>
			<p>
				<disp-formula id="e9">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>I</mml:mi>
								<mml:mo>(</mml:mo>
								<mml:mi>j</mml:mi>
								<mml:mo>)</mml:mo>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>+</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ϵ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>,</mml:mo>
					</mml:math>
					<label>(9)</label>
				</disp-formula>
			</p>
			<p> where <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>β</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> is the regression coefficient of the returns of asset <inline-formula>
					<mml:math>
						<mml:mi>j</mml:mi>
					</mml:math>
				</inline-formula> on the first factor of block <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>I</mml:mi>
							<mml:mfenced>
								<mml:mi>j</mml:mi>
							</mml:mfenced>
						</mml:mrow>
					</mml:math>
				</inline-formula> and <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>ϵ</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> is the residual.</p>
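			<p>As an illustration, the loading and residual in <xref ref-type="disp-formula" rid="e9">Eq. (9)</xref> can be computed for a single block with a few lines of NumPy. This is a minimal sketch rather than the implementation used in this paper: it assumes that the columns are standardized and that the first factor of the block is the return of the first eigenportfolio rescaled to unit variance, so that each loading equals the square root of the top eigenvalue times the corresponding eigenvector entry. The function name <monospace>block_first_factor</monospace> and the synthetic data are hypothetical.</p>
			<p>
				<code language="python"><![CDATA[
```python
import numpy as np

def block_first_factor(X_block):
    """Return (beta, F, lam1) for one block of standardized returns (T x n_k)."""
    T = X_block.shape[0]
    R = X_block.T @ X_block / T          # block correlation matrix (columns are standardized)
    lam, V = np.linalg.eigh(R)           # eigenvalues in ascending order
    lam1, v1 = lam[-1], V[:, -1]         # top eigenvalue and its eigenvector
    F = X_block @ v1 / np.sqrt(lam1)     # first factor, rescaled to unit variance
    beta = np.sqrt(lam1) * v1            # loading of each asset on the factor
    return beta, F, lam1

# Synthetic block (assumption): one common driver plus idiosyncratic noise.
rng = np.random.default_rng(0)
T, n_k = 500, 6
common = rng.standard_normal((T, 1))
raw = 0.7 * common + rng.standard_normal((T, n_k))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # zero mean, unit variance per column

beta, F, lam1 = block_first_factor(X)
ols = X.T @ F / (F @ F)          # least-squares slope of each column on F
eps = X - np.outer(F, beta)      # residuals of Eq. (9)
```
]]></code>
			</p>
			<p>Under these assumptions the loadings coincide with the least-squares slopes of each column on the factor, and the residuals are uncorrelated with the factor in sample.</p>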
			<p>We shall make the following assumption (“HPCA assumption”): </p>
			<p>
				<disp-formula id="e10">
					<mml:math>
						<mml:menclose notation="box">
							<mml:mtext>If</mml:mtext>
							<mml:mspace width="0.5em"/>
							<mml:mi>I</mml:mi>
							<mml:mo>(</mml:mo>
							<mml:mi>i</mml:mi>
							<mml:mo>)</mml:mo>
							<mml:mo>≠</mml:mo>
							<mml:mi>I</mml:mi>
							<mml:mo>(</mml:mo>
							<mml:mi>j</mml:mi>
							<mml:mo>)</mml:mo>
							<mml:mo>,</mml:mo>
							<mml:mspace width="0.5em"/>
							<mml:mtext>then</mml:mtext>
							<mml:mspace width="0.5em"/>
							<mml:mi>Corr</mml:mi>
							<mml:mo>(</mml:mo>
							<mml:msub>
								<mml:mrow>
									<mml:mi>ϵ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>i</mml:mi>
								</mml:mrow>
							</mml:msub>
							<mml:mo>,</mml:mo>
							<mml:msub>
								<mml:mrow>
									<mml:mi>ϵ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>j</mml:mi>
								</mml:mrow>
							</mml:msub>
							<mml:mo>)</mml:mo>
							<mml:mo>=</mml:mo>
							<mml:mn>0</mml:mn>
							<mml:mo>.</mml:mo>
						</mml:menclose>
					</mml:math>
					<label>(10)</label>
				</disp-formula>
			</p>
			<p> The assumption states that residuals are uncorrelated if their assets belong to different sectors. <xref ref-type="disp-formula" rid="e9">Equation (9)</xref> defines the asset statistics within each block exactly, and the model is completed by specifying the joint statistics of the factors <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>F</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mrow>
											<mml:mn>1,</mml:mn>
											<mml:mi>k</mml:mi>
										</mml:mrow>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
							<mml:mo>,</mml:mo>
							<mml:mo> </mml:mo>
							<mml:mo> </mml:mo>
							<mml:mi>k</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>1,2,...,</mml:mn>
							<mml:mi>b</mml:mi>
							<mml:mo>.</mml:mo>
						</mml:mrow>
					</mml:math>
				</inline-formula> The HPCA assumption says nothing new regarding intra-block correlations, which are set equal to the empirical correlations between asset returns within the same sector or block. Of course, the intra-block correlations could be further denoised using RMT if necessary ([<xref ref-type="bibr" rid="B4">4</xref>]).</p>
			<p>Under the HPCA assumption (<xref ref-type="disp-formula" rid="e10">Eq. (10)</xref>), the proposed model has the following modified correlation matrix for asset returns:</p>
			<p>
				<disp-formula>
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mover accent="true">
									<mml:mrow>
										<mml:mi>R</mml:mi>
									</mml:mrow>
									<mml:mo>~</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>R</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mspace width="1em"/>
						<mml:mtext>if</mml:mtext>
						<mml:mspace width="1em"/>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>i</mml:mi>
						<mml:mo>)</mml:mo>
						<mml:mo>=</mml:mo>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>j</mml:mi>
						<mml:mo>)</mml:mo>
					</mml:math>
				</disp-formula>
			</p>
			<p>
				<disp-formula id="e11">
					<mml:math>
						<mml:mo>=</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mover accent="false">
									<mml:mrow>
										<mml:mi>ρ</mml:mi>
									</mml:mrow>
									<mml:mo>¯</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>I</mml:mi>
								<mml:mo>(</mml:mo>
								<mml:mi>i</mml:mi>
								<mml:mo>)</mml:mo>
								<mml:mi>I</mml:mi>
								<mml:mo>(</mml:mo>
								<mml:mi>j</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mspace width="1em"/>
						<mml:mtext>if</mml:mtext>
						<mml:mspace width="1em"/>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>i</mml:mi>
						<mml:mo>)</mml:mo>
						<mml:mo>≠</mml:mo>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>j</mml:mi>
						<mml:mo>)</mml:mo>
					</mml:math>
					<label>(11)</label>
				</disp-formula>
			</p>
			<p>where <inline-formula>
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mover accent="false">
									<mml:mrow>
										<mml:mi>ρ</mml:mi>
									</mml:mrow>
									<mml:mo>¯</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>k</mml:mi>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mi>'</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mi>Corr</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>,</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>F</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mi>'</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>)</mml:mo>
					</mml:math>
				</inline-formula>.</p>
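			<p>The construction in <xref ref-type="disp-formula" rid="e11">Eq. (11)</xref> can be sketched numerically as follows. The code is illustrative only: it assumes standardized returns, integer block labels, and block factors defined as unit-variance first eigenportfolio returns. The function <monospace>hpca_correlation</monospace> and the synthetic three-sector data are hypothetical and are not taken from the empirical study in this paper.</p>
			<p>
				<code language="python"><![CDATA[
```python
import numpy as np

def hpca_correlation(X, labels):
    """Modified correlation matrix of Eq. (11).

    X      : standardized returns, shape (T, n)
    labels : integer block label I(j) for each column, length n
    """
    labels = np.asarray(labels)
    T, n = X.shape
    R = X.T @ X / T                                  # full empirical correlation matrix
    blocks = np.unique(labels)
    beta = np.zeros(n)
    F = np.zeros((T, blocks.size))
    for c, k in enumerate(blocks):
        idx = np.flatnonzero(labels == k)
        lam, V = np.linalg.eigh(R[np.ix_(idx, idx)])
        lam1, v1 = lam[-1], V[:, -1]
        beta[idx] = np.sqrt(lam1) * v1               # loadings on the block's first factor
        F[:, c] = X[:, idx] @ v1 / np.sqrt(lam1)     # unit-variance first factor of block k
    rho_bar = F.T @ F / T                            # correlations between block factors
    col = np.searchsorted(blocks, labels)            # position of each asset's block
    R_tilde = np.outer(beta, beta) * rho_bar[np.ix_(col, col)]
    same = labels[:, None] == labels[None, :]
    R_tilde[same] = R[same]                          # keep empirical intra-block correlations
    return R_tilde, beta, rho_bar

# Synthetic data (assumption): a market driver, three sector drivers, idiosyncratic noise.
rng = np.random.default_rng(1)
T = 750
labels = np.repeat([0, 1, 2], [4, 5, 3])
market = rng.standard_normal(T)
sector = 0.6 * market[:, None] + rng.standard_normal((T, 3))
raw = sector[:, labels] + rng.standard_normal((T, labels.size))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)

R_tilde, beta, rho_bar = hpca_correlation(X, labels)
```
]]></code>
			</p>
			<p>Only the cross-block entries differ from the empirical correlation matrix; flipping the sign of a block eigenvector changes the sign of both the loadings and the factor correlation, so the product entering the cross-block entries is unaffected.</p>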
			<p><bold>Proposition 1</bold> 
 <italic><xref ref-type="disp-formula" rid="e11">Eq. (11)</xref> corresponds to a symmetric non-negative definite matrix with</italic> <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mover accent="true">
									<mml:mi>R</mml:mi>
									<mml:mo>˜</mml:mo>
								</mml:mover>
								<mml:mrow>
									<mml:mi>i</mml:mi>
									<mml:mi>i</mml:mi>
								</mml:mrow>
							</mml:msub>
							<mml:mo>=</mml:mo>
							<mml:mn>1</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula> <italic>for all</italic> <inline-formula>
					<mml:math>
						<mml:mi>i</mml:mi>
					</mml:math>
				</inline-formula><italic>. In particular, it corresponds to the correlation matrix of a system of standardized random variables.</italic></p>
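			<p><bold>Proof</bold>. Symmetry and the unit diagonal follow directly from <xref ref-type="disp-formula" rid="e11">Eq. (11)</xref>, since the diagonal entries are intra-block empirical correlations. To check non-negative definiteness, note that for all <inline-formula>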
					<mml:math>
						<mml:mrow>
							<mml:mi>θ</mml:mi>
							<mml:mo> </mml:mo>
							<mml:mo>∈</mml:mo>
							<mml:mo> </mml:mo>
							<mml:msup>
								<mml:mi>R</mml:mi>
								<mml:mi>n</mml:mi>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> we have </p>
			<p>
				<disp-formula id="e12">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>θ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>t</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mover accent="true">
							<mml:mrow>
								<mml:mi>R</mml:mi>
							</mml:mrow>
							<mml:mo>~</mml:mo>
						</mml:mover>
						<mml:mi>θ</mml:mi>
						<mml:mo>=</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>b</mml:mi>
								</mml:mrow>
							</mml:munderover>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:mrow>
							<mml:munder>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>i</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>=</mml:mo>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>j</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>=</mml:mo>
									<mml:mi>k</mml:mi>
								</mml:mrow>
							</mml:munder>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msub>
							<mml:mrow>
								<mml:mi>θ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msub>
							<mml:mrow>
								<mml:mi>θ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>(</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>R</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>-</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>)</mml:mo>
						<mml:mo>+</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>,</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mi>'</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>b</mml:mi>
								</mml:mrow>
							</mml:munderover>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:mo>(</mml:mo>
						<mml:mrow>
							<mml:munder>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>i</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>=</mml:mo>
									<mml:mi>k</mml:mi>
								</mml:mrow>
							</mml:munder>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msub>
							<mml:mrow>
								<mml:mi>θ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>)</mml:mo>
						<mml:mi> </mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mrow>
							<mml:munder>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>j</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>=</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mi>'</mml:mi>
								</mml:mrow>
							</mml:munder>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msub>
							<mml:mrow>
								<mml:mi>θ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>)</mml:mo>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mover accent="false">
									<mml:mrow>
										<mml:mi>ρ</mml:mi>
									</mml:mrow>
									<mml:mo>¯</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>k</mml:mi>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mi>'</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(12)</label>
				</disp-formula>
			</p>
			<p> For any <inline-formula>
					<mml:math>
						<mml:mi>k</mml:mi>
					</mml:math>
				</inline-formula>, the matrix <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>R</mml:mi>
								<mml:mrow>
									<mml:mi>i</mml:mi>
									<mml:mi>j</mml:mi>
								</mml:mrow>
							</mml:msub>
							<mml:mo>−</mml:mo>
							<mml:msub>
								<mml:mi>β</mml:mi>
								<mml:mi>i</mml:mi>
							</mml:msub>
							<mml:msub>
								<mml:mi>β</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> restricted to sector <inline-formula>
					<mml:math>
						<mml:mi>k</mml:mi>
					</mml:math>
				</inline-formula> is identical to the sector correlation matrix, except that the eigenvalue corresponding to <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>V</mml:mi>
								<mml:mrow>
									<mml:mfenced>
										<mml:mrow>
											<mml:mn>1,</mml:mn>
											<mml:mi>k</mml:mi>
										</mml:mrow>
									</mml:mfenced>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> is set to zero. In particular, it is non-negative definite. Moreover, the matrix <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mover accent="true">
									<mml:mi>ρ</mml:mi>
									<mml:mo>¯</mml:mo>
								</mml:mover>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>,</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mo>'</mml:mo>
								</mml:mrow>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula> is also a correlation matrix, so it is non-negative definite. Since both summands in <xref ref-type="disp-formula" rid="e12">Eq. (12)</xref> are non-negative, it follows that <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>θ</mml:mi>
								<mml:mi>t</mml:mi>
							</mml:msup>
							<mml:mo> </mml:mo>
							<mml:mover accent="true">
								<mml:mi>R</mml:mi>
								<mml:mo>˜</mml:mo>
							</mml:mover>
							<mml:mi>θ</mml:mi>
							<mml:mo>≥</mml:mo>
							<mml:mn>0</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula> for all <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>θ</mml:mi>
							<mml:mo> </mml:mo>
							<mml:mo>∈</mml:mo>
							<mml:mo> </mml:mo>
							<mml:msup>
								<mml:mi>R</mml:mi>
								<mml:mi>n</mml:mi>
							</mml:msup>
						</mml:mrow>
					</mml:math>
				</inline-formula>.</p>
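			<p>The decomposition in <xref ref-type="disp-formula" rid="e12">Eq. (12)</xref> can also be checked numerically. The sketch below writes the modified matrix as a block-diagonal part plus a low-rank part built from the loadings and the factor correlations, and evaluates both summands for a random vector. It is a sanity check under the same assumptions as above (standardized columns, unit-variance first eigenportfolio factors), with hypothetical variable names and synthetic two-block data; it is not a substitute for the proof.</p>
			<p>
				<code language="python"><![CDATA[
```python
import numpy as np

# Synthetic two-block data (assumption): one common driver plus noise.
rng = np.random.default_rng(2)
T = 600
labels = np.repeat([0, 1], [3, 4])
n, b = labels.size, 2
raw = rng.standard_normal((T, 1)) + rng.standard_normal((T, n))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)
R = X.T @ X / T

B = np.zeros((n, b))   # column k holds the loadings of block k, zeros elsewhere
W = np.zeros((n, b))   # column k holds the factor weights v1 / sqrt(lam1) of block k
D = np.zeros((n, n))   # block-diagonal part: R_ij - beta_i * beta_j inside each block
for k in range(b):
    idx = np.flatnonzero(labels == k)
    lam, V = np.linalg.eigh(R[np.ix_(idx, idx)])
    B[idx, k] = np.sqrt(lam[-1]) * V[:, -1]
    W[idx, k] = V[:, -1] / np.sqrt(lam[-1])
    D[np.ix_(idx, idx)] = R[np.ix_(idx, idx)] - np.outer(B[idx, k], B[idx, k])

rho_bar = W.T @ R @ W              # correlation matrix of the block factors
R_tilde = D + B @ rho_bar @ B.T    # matrix form of Eq. (11)

theta = rng.standard_normal(n)
first = theta @ D @ theta                           # first summand of Eq. (12)
second = (B.T @ theta) @ rho_bar @ (B.T @ theta)    # second summand of Eq. (12)
total = theta @ R_tilde @ theta
```
]]></code>
			</p>
			<p>In this form the two summands are quadratic forms in a deflated block-diagonal matrix and in the factor correlation matrix respectively, which makes the non-negativity argument easy to inspect on data.</p>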
			<p>A concrete implementation of the data model is achieved as follows: let <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msub>
								<mml:mi>ψ</mml:mi>
								<mml:mn>1</mml:mn>
							</mml:msub>
							<mml:mn>,...,</mml:mn>
							<mml:msub>
								<mml:mi>ψ</mml:mi>
								<mml:mi>b</mml:mi>
							</mml:msub>
						</mml:mrow>
					</mml:math>
				</inline-formula> be Gaussian random variables with mean zero and covariance matrix <inline-formula>
					<mml:math>
						<mml:mover accent="true">
							<mml:mi>ρ</mml:mi>
							<mml:mo>¯</mml:mo>
						</mml:mover>
					</mml:math>
				</inline-formula>, and let <inline-formula>
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ζ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mi>k</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mo>,</mml:mo>
						<mml:mi>i</mml:mi>
						<mml:mo>:</mml:mo>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>i</mml:mi>
						<mml:mo>)</mml:mo>
						<mml:mo>=</mml:mo>
						<mml:mi>k</mml:mi>
						<mml:mo>,</mml:mo>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi>k</mml:mi>
						<mml:mo>=</mml:mo>
						<mml:mn>1</mml:mn>
						<mml:mo>,</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>,</mml:mo>
						<mml:mi>b</mml:mi>
					</mml:math>
				</inline-formula> be i.i.d. standardized Gaussian random variables which are independent of the <inline-formula>
					<mml:math>
						<mml:mi>ψ</mml:mi>
					</mml:math>
				</inline-formula>’s. The data model is </p>
			<p>
				<disp-formula id="e13">
					<mml:math>
						<mml:msub>
							<mml:mrow>
								<mml:mi>X</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>=</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>β</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>ψ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>I</mml:mi>
								<mml:mo>(</mml:mo>
								<mml:mi>i</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mo>+</mml:mo>
						<mml:mrow>
							<mml:munder>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mo>{</mml:mo>
									<mml:mi>j</mml:mi>
									<mml:mo>:</mml:mo>
									<mml:mi>j</mml:mi>
									<mml:mo>≥</mml:mo>
									<mml:mn>2</mml:mn>
									<mml:mo>,</mml:mo>
									<mml:mi> </mml:mi>
									<mml:mi> </mml:mi>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>j</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>=</mml:mo>
									<mml:mi>I</mml:mi>
									<mml:mo>(</mml:mo>
									<mml:mi>i</mml:mi>
									<mml:mo>)</mml:mo>
									<mml:mo>}</mml:mo>
								</mml:mrow>
							</mml:munder>
							<mml:mrow>
								<mml:mi mathvariant="normal">‍</mml:mi>
							</mml:mrow>
						</mml:mrow>
						<mml:msub>
							<mml:mrow>
								<mml:mi>γ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>i</mml:mi>
								<mml:mi>j</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:msub>
							<mml:mrow>
								<mml:mi>ζ</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>j</mml:mi>
								<mml:mi> </mml:mi>
								<mml:mi>I</mml:mi>
								<mml:mo>(</mml:mo>
								<mml:mi>i</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msub>
					</mml:math>
					<label>(13)</label>
				</disp-formula>
			</p>
			<p>The random variables need not be Gaussian: they can be multivariate Student-t, or they can be transforms of arbitrary distributions connected by a Gaussian or t-copula; see for instance [<xref ref-type="bibr" rid="B5">5</xref>].</p>
			<p>The multivariate distribution associated with HPCA presents an alternative model to the classical PCA (<xref ref-type="disp-formula" rid="e8">Eq. (8)</xref>). It has a tree structure: in the equity example discussed below, the top vertex corresponds to the “market”; there are 11 branches corresponding to industry sectors, and each of the 11 vertices has branches corresponding to the stocks in each sector.</p>
			<p>Hierarchical models with more than two layers arise naturally. For instance, HPCA can be used to model “world portfolios”, in which the first layer consists of countries or regions, the second of industry sector indices in each country, and the third of the individual securities in each region/sector.</p>
			<p>As another useful example, consider a stock market in which stocks belong to different industry sectors, and add columns associated with equity option returns. In this case, the tree has three layers because we can associate with each stock an additional sub-group: the block consisting of the stock returns and the returns of implied volatilities (on a constant delta/time-to-maturity grid). The root corresponds to the full market, the first layer corresponds to industry sectors, the second layer corresponds to stocks, and the third layer represents an individual name with all the associated option-implied volatilities.</p>
			<p>A similar approach works for credit derivatives. In this case, the returns of the CDS with different tenors referencing each obligor constitute a block associated with that obligor. These blocks can be grouped by industry sector or, alternatively, according to membership in a credit index (CDX.IG, CDX.HY, CDX.HV), or both [<xref ref-type="bibr" rid="B5">5</xref>].</p>
			<p>In summary, if financial data can be grouped into blocks or sectors with clear economic interpretation, with multiple instruments associated with each block, we can generate a data model with tree-like structure from the HPCA assumption in <xref ref-type="disp-formula" rid="e10">Eq. (10)</xref>. This approach combines information available for each asset (sector, sub-sector, reference obligor, option underlying asset) with the explanatory power of PCA. For simplicity, we will consider the analysis of a two-layer HPCA. Adding more layers is mathematically straightforward. </p>
		</sec>
		<sec>
			<title>4 Spectral analysis</title>
			<p> The HPCA assumption <xref ref-type="disp-formula" rid="e10">Eq. (10)</xref> gives rise to explicitly computable eigenvalues and eigenvectors for the matrix <inline-formula>
					<mml:math>
						<mml:mover accent="true">
							<mml:mi>R</mml:mi>
							<mml:mo>˜</mml:mo>
						</mml:mover>
					</mml:math>
				</inline-formula> defined in <xref ref-type="disp-formula" rid="e11">Eq. (11)</xref>.</p>
			<p><bold>Proposition 2.</bold></p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>1. For each sector</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mi>k</mml:mi>
											<mml:mo>=</mml:mo>
											<mml:mn>1,...,</mml:mn>
											<mml:mi>b</mml:mi>
										</mml:mrow>
									</mml:math>
								</inline-formula>, <italic>let</italic>
								<inline-formula>
									<mml:math>
										<mml:msup>
											<mml:mrow>
												<mml:mi>λ</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mn>1</mml:mn>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msup>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mo>&gt;</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:msup>
											<mml:mrow>
												<mml:mi>λ</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mn>2</mml:mn>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msup>
										<mml:mo>≥</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mo>≥</mml:mo>
										<mml:msup>
											<mml:mrow>
												<mml:mi>λ</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:msub>
													<mml:mrow>
														<mml:mi>n</mml:mi>
													</mml:mrow>
													<mml:mrow>
														<mml:mi>k</mml:mi>
													</mml:mrow>
												</mml:msub>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msup>
									</mml:math>
								</inline-formula> 
							<italic>denote the</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msub>
												<mml:mi>n</mml:mi>
												<mml:mi>k</mml:mi>
											</mml:msub>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>eigenvalues of the sector correlation matrix, ordered from largest to smallest, and let</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>V</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mrow>
															<mml:mi>i</mml:mi>
															<mml:mo>,</mml:mo>
															<mml:mi>k</mml:mi>
														</mml:mrow>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							  <italic>be the corresponding eigenvectors. Define the n-dimensional vectors</italic></p> 
					</list-item>
					</list>
			</p>
						<p>
							
								<disp-formula>
									<mml:math>
										<mml:msubsup>
											<mml:mrow>
												<mml:mi>W</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mi>j</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mi>i</mml:mi>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msubsup>
										<mml:mo>=</mml:mo>
										<mml:msubsup>
											<mml:mrow>
												<mml:mi>V</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mi>j</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mi>i</mml:mi>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msubsup>
										<mml:mspace width="1em"/>
										<mml:mtext>if</mml:mtext>
										<mml:mspace width="1em"/>
										<mml:mi>I</mml:mi>
										<mml:mo>(</mml:mo>
										<mml:mi>j</mml:mi>
										<mml:mo>)</mml:mo>
										<mml:mo>=</mml:mo>
										<mml:mi>k</mml:mi>
									</mml:math>
								</disp-formula>
							
						</p>
				
			<p>
				<disp-formula id="e14">
					<mml:math>
						<mml:mo>=</mml:mo>
						<mml:mn>0</mml:mn>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi>i</mml:mi>
						<mml:mi>f</mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi> </mml:mi>
						<mml:mi>I</mml:mi>
						<mml:mo>(</mml:mo>
						<mml:mi>j</mml:mi>
						<mml:mo>)</mml:mo>
						<mml:mo>≠</mml:mo>
						<mml:mi>k</mml:mi>
						<mml:mo>,</mml:mo>
					</mml:math>
					<label>(14)</label>
				</disp-formula>
			</p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>which correspond to the embedding of the sector-level eigenvectors</italic>, 
								<inline-formula>
									<mml:math>
										<mml:msup>
											<mml:mrow>
												<mml:mi>V</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mi>i</mml:mi>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msup>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mo>∈</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:msup>
											<mml:mrow>
												<mml:mi>R</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:msub>
													<mml:mrow>
														<mml:mi>n</mml:mi>
													</mml:mrow>
													<mml:mrow>
														<mml:mi>k</mml:mi>
													</mml:mrow>
												</mml:msub>
											</mml:mrow>
										</mml:msup>
									</mml:math>
								</inline-formula>
							, <italic>into the large space</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>R</mml:mi>
												<mml:mi>n</mml:mi>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>. <italic>The vectors</italic> 
								<inline-formula>
									<mml:math>
										<mml:msup>
											<mml:mrow>
												<mml:mi>W</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mo>(</mml:mo>
												<mml:mi>i</mml:mi>
												<mml:mo>,</mml:mo>
												<mml:mi>k</mml:mi>
												<mml:mo>)</mml:mo>
											</mml:mrow>
										</mml:msup>
										<mml:mo>,</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mi>i</mml:mi>
										<mml:mo>=</mml:mo>
										<mml:mn>1</mml:mn>
										<mml:mo>,</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>,</mml:mo>
										<mml:msub>
											<mml:mrow>
												<mml:mi>n</mml:mi>
											</mml:mrow>
											<mml:mrow>
												<mml:mi>k</mml:mi>
											</mml:mrow>
										</mml:msub>
										<mml:mo>,</mml:mo>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mi> </mml:mi>
										<mml:mi>k</mml:mi>
										<mml:mo>=</mml:mo>
										<mml:mn>1</mml:mn>
										<mml:mo>,</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>.</mml:mo>
										<mml:mo>,</mml:mo>
										<mml:mi>b</mml:mi>
									</mml:math>
								</inline-formula>
							 <italic>form an orthogonal basis of</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>R</mml:mi>
												<mml:mi>n</mml:mi>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>.</p>
					</list-item>
				</list>
			</p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>2. The subspace</italic> 
								<inline-formula>
									<mml:math>
										<mml:mtext>Ω</mml:mtext>
									</mml:math>
								</inline-formula>
							 <italic>of</italic>
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>R</mml:mi>
												<mml:mi>n</mml:mi>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							<italic>generated by the vectors</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>W</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mrow>
															<mml:mn>1,</mml:mn>
															<mml:mi>k</mml:mi>
														</mml:mrow>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
											<mml:mo>,</mml:mo>
											<mml:mo> </mml:mo>
											<mml:mo> </mml:mo>
											<mml:mi>k</mml:mi>
											<mml:mo>=</mml:mo>
											<mml:mn>1,...,</mml:mn>
											<mml:mi>b</mml:mi>
										</mml:mrow>
									</mml:math>
								</inline-formula>, <italic>viz.</italic></p>
					</list-item>
				</list>
			</p>
			<p>
				<disp-formula id="e15">
					<mml:math>
						<mml:mi mathvariant="normal">Ω</mml:mi>
						<mml:mo>=</mml:mo>
						<mml:mo>{</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>k</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>b</mml:mi>
								</mml:mrow>
							</mml:munderover>
						</mml:mrow>
						<mml:msub>
							<mml:mrow>
								<mml:mi>α</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>k</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:msup>
							<mml:mrow>
								<mml:mi>W</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>:</mml:mo>
						<mml:mo>(</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>α</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mn>1</mml:mn>
							</mml:mrow>
						</mml:msub>
						<mml:mo>,</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>.</mml:mo>
						<mml:mo>,</mml:mo>
						<mml:msub>
							<mml:mrow>
								<mml:mi>α</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>b</mml:mi>
							</mml:mrow>
						</mml:msub>
						<mml:mo>)</mml:mo>
						<mml:mo>∈</mml:mo>
						<mml:msup>
							<mml:mrow>
								<mml:mi>R</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>b</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>}</mml:mo>
						<mml:mo>,</mml:mo>
					</mml:math>
					<label>(15)</label>
				</disp-formula>
			</p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>is invariant under the action of</italic> 
								<inline-formula>
									<mml:math>
										<mml:mover accent="true">
											<mml:mi>R</mml:mi>
											<mml:mo>˜</mml:mo>
										</mml:mover>
									</mml:math>
								</inline-formula>
							 <italic>viewed as an operator from</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>R</mml:mi>
												<mml:mi>n</mml:mi>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>to</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>R</mml:mi>
												<mml:mi>n</mml:mi>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>. </p>
					</list-item>
					<list-item>
						<p><italic>3. Consider the</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mi>b</mml:mi>
											<mml:mo>×</mml:mo>
											<mml:mi>b</mml:mi>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>matrix</italic></p>
					</list-item>
				</list>
			</p>
			<p>
				<disp-formula id="e16">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mi>M</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>k</mml:mi>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mi>'</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>:</mml:mo>
						<mml:mo>=</mml:mo>
						<mml:msqrt>
							<mml:msup>
								<mml:mrow>
									<mml:mi>λ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mo>(</mml:mo>
									<mml:mn>1</mml:mn>
									<mml:mo>,</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mo>)</mml:mo>
								</mml:mrow>
							</mml:msup>
						</mml:msqrt>
						<mml:msqrt>
							<mml:msup>
								<mml:mrow>
									<mml:mi>λ</mml:mi>
								</mml:mrow>
								<mml:mrow>
									<mml:mo>(</mml:mo>
									<mml:mn>1</mml:mn>
									<mml:mo>,</mml:mo>
									<mml:mi>k</mml:mi>
									<mml:mi>'</mml:mi>
									<mml:mo>)</mml:mo>
								</mml:mrow>
							</mml:msup>
						</mml:msqrt>
						<mml:msup>
							<mml:mrow>
								<mml:mover accent="false">
									<mml:mrow>
										<mml:mi>ρ</mml:mi>
									</mml:mrow>
									<mml:mo>¯</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>k</mml:mi>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mi>'</mml:mi>
							</mml:mrow>
						</mml:msup>
						<mml:mo>.</mml:mo>
					</mml:math>
					<label>(16)</label>
				</disp-formula>
			</p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>Let</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>μ</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mn>1</mml:mn>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
											<mml:mn>,...,</mml:mn>
											<mml:msup>
												<mml:mi>μ</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mi>b</mml:mi>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>denote the eigenvalues of</italic> 
								<inline-formula>
									<mml:math>
										<mml:mi>M</mml:mi>
									</mml:math>
								</inline-formula>, <italic>ranked in decreasing order, and let</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>α</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mi>k</mml:mi>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
											<mml:mo>=</mml:mo>
											<mml:mfenced>
												<mml:mrow>
													<mml:msubsup>
														<mml:mi>α</mml:mi>
														<mml:mn>1</mml:mn>
														<mml:mrow>
															<mml:mfenced>
																<mml:mi>k</mml:mi>
															</mml:mfenced>
														</mml:mrow>
													</mml:msubsup>
													<mml:mn>,...,</mml:mn>
													<mml:msubsup>
														<mml:mi>α</mml:mi>
														<mml:mi>b</mml:mi>
														<mml:mrow>
															<mml:mfenced>
																<mml:mi>k</mml:mi>
															</mml:mfenced>
														</mml:mrow>
													</mml:msubsup>
												</mml:mrow>
											</mml:mfenced>
											<mml:mo>,</mml:mo>
											<mml:mo> </mml:mo>
											<mml:mi>k</mml:mi>
											<mml:mo>=</mml:mo>
											<mml:mn>1,...,</mml:mn>
											<mml:mi>b</mml:mi>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>represent the corresponding normalized eigenvectors (defined up to sign). The vectors</italic></p>
					</list-item>
				</list>
			</p>
			<p>
				<disp-formula id="e17">
					<mml:math>
						<mml:msup>
							<mml:mrow>
								<mml:mover accent="true">
									<mml:mrow>
										<mml:mi>W</mml:mi>
									</mml:mrow>
									<mml:mo>~</mml:mo>
								</mml:mover>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
						<mml:mo>=</mml:mo>
						<mml:mrow>
							<mml:munderover>
								<mml:mo stretchy="false">∑</mml:mo>
								<mml:mrow>
									<mml:mi>p</mml:mi>
									<mml:mo>=</mml:mo>
									<mml:mn>1</mml:mn>
								</mml:mrow>
								<mml:mrow>
									<mml:mi>b</mml:mi>
								</mml:mrow>
							</mml:munderover>
						</mml:mrow>
						<mml:msubsup>
							<mml:mrow>
								<mml:mi>α</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mi>p</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mi>k</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msubsup>
						<mml:mi> </mml:mi>
						<mml:msup>
							<mml:mrow>
								<mml:mi>W</mml:mi>
							</mml:mrow>
							<mml:mrow>
								<mml:mo>(</mml:mo>
								<mml:mn>1</mml:mn>
								<mml:mo>,</mml:mo>
								<mml:mi>p</mml:mi>
								<mml:mo>)</mml:mo>
							</mml:mrow>
						</mml:msup>
					</mml:math>
					<label>(17)</label>
				</disp-formula>
			</p>
			<p>
				<list list-type="simple">
					<list-item>
						<p><italic>are eigenvectors of</italic> 
								<inline-formula>
									<mml:math>
										<mml:mover accent="true">
											<mml:mi>R</mml:mi>
											<mml:mo>˜</mml:mo>
										</mml:mover>
									</mml:math>
								</inline-formula>, <italic>with corresponding eigenvalues</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>μ</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mi>k</mml:mi>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>, <italic>for</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mi>k</mml:mi>
											<mml:mo>=</mml:mo>
											<mml:mn>1,...,</mml:mn>
											<mml:mi>b</mml:mi>
										</mml:mrow>
									</mml:math>
								</inline-formula>. </p>
					</list-item>
					<list-item>
						<p><italic>4. For each sector</italic> 
								<inline-formula>
									<mml:math>
										<mml:mi>k</mml:mi>
									</mml:math>
								</inline-formula>
							 <italic>and each</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mi>j</mml:mi>
											<mml:mo>,</mml:mo>
											<mml:mo> </mml:mo>
											<mml:mo> </mml:mo>
											<mml:mn>2</mml:mn>
											<mml:mo>≤</mml:mo>
											<mml:mi>j</mml:mi>
											<mml:mo> </mml:mo>
											<mml:mo> </mml:mo>
											<mml:mo>≤</mml:mo>
											<mml:msub>
												<mml:mi>n</mml:mi>
												<mml:mi>k</mml:mi>
											</mml:msub>
										</mml:mrow>
									</mml:math>
								</inline-formula>, <italic>the vector</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>W</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mrow>
															<mml:mi>j</mml:mi>
															<mml:mo>,</mml:mo>
															<mml:mi>k</mml:mi>
														</mml:mrow>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>
							 <italic>is an eigenvector of</italic> 
								<inline-formula>
									<mml:math>
										<mml:mover accent="true">
											<mml:mi>R</mml:mi>
											<mml:mo>˜</mml:mo>
										</mml:mover>
									</mml:math>
								</inline-formula>, <italic>with eigenvalue</italic> 
								<inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:msup>
												<mml:mi>λ</mml:mi>
												<mml:mrow>
													<mml:mfenced>
														<mml:mrow>
															<mml:mi>j</mml:mi>
															<mml:mo>,</mml:mo>
															<mml:mi>k</mml:mi>
														</mml:mrow>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>. </p>
					</list-item>
				</list>
			</p>
			<p> This proposition completely characterizes the eigenvalues and eigenvectors of the HPCA correlation matrix by relating them to the eigenvalues and eigenvectors of the sector PCAs.<xref ref-type="fn" rid="fn3"><sup>3</sup></xref> Thus, the HPCA assumption eliminates the identification problem for common factors: “eigenportfolios” have concrete meanings tied to the information about sector correlations. In the examples that follow, we compare HPCA with PCA and show that the HPCA matrix is an excellent substitute for the full empirical correlation matrix when modeling multivariate financial data.</p>
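			<p>The statements of the proposition can be checked numerically on a toy example. The following Python sketch is illustrative only and is not part of the original analysis. It uses two hypothetical sectors of two stocks each, with made-up correlation values, so that the sector PCAs are available in closed form. It also assumes that each off-diagonal block of the HPCA matrix is the rank-one matrix given by the corresponding entry of the matrix in (16) times the outer product of the two first sector eigenvectors; this assumption is consistent with (16) but is stated here as an assumption. All variable and function names are hypothetical.</p>
			<p><code language="python">
```python
import math

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def is_eigenpair(A, v, value, tol=1e-12):
    """Check A v = value * v entry by entry, up to an absolute tolerance."""
    Av = matvec(A, v)
    return all(tol >= abs(p - value * q) for p, q in zip(Av, v))

# Two hypothetical sectors with two stocks each (n = 4, b = 2).
r1, r2 = 0.6, 0.3   # within-sector correlations (illustrative values)
rho_bar = 0.5       # correlation between the two sector factors (illustrative)

s = 1.0 / math.sqrt(2.0)
# Sector PCA in closed form: [[1, r], [r, 1]] has eigenpairs
# (1 + r, (1, 1)/sqrt(2)) and (1 - r, (1, -1)/sqrt(2)).
lam = {(1, 1): 1 + r1, (2, 1): 1 - r1, (1, 2): 1 + r2, (2, 2): 1 - r2}
# Embedded vectors W^(i,k) of equation (14): zero outside sector k.
W = {
    (1, 1): [s, s, 0.0, 0.0],
    (2, 1): [s, -s, 0.0, 0.0],
    (1, 2): [0.0, 0.0, s, s],
    (2, 2): [0.0, 0.0, s, -s],
}

# Matrix M of equation (16); its diagonal is lambda^(1,k) since rho_bar^{k,k} = 1.
m12 = math.sqrt(lam[(1, 1)]) * math.sqrt(lam[(1, 2)]) * rho_bar
M = [[lam[(1, 1)], m12], [m12, lam[(1, 2)]]]

# ASSUMPTION: off-diagonal blocks of the HPCA matrix are rank one,
# M^{k,k'} V^(1,k) (V^(1,k'))^T; here every entry equals m12 * s * s.
c = m12 * s * s
R_tilde = [
    [1.0, r1, c, c],
    [r1, 1.0, c, c],
    [c, c, 1.0, r2],
    [c, c, r2, 1.0],
]

# Eigenpairs of the symmetric 2x2 matrix M in closed form.
mean = 0.5 * (M[0][0] + M[1][1])
half = 0.5 * (M[0][0] - M[1][1])
disc = math.hypot(half, m12)
mu = [mean + disc, mean - disc]   # decreasing order
alpha = []
for value in mu:
    a = [m12, value - M[0][0]]    # solves (M - value I) a = 0 when m12 != 0
    norm = math.hypot(a[0], a[1])
    alpha.append([a[0] / norm, a[1] / norm])

# Equation (17): combine the first sector eigenvectors with weights alpha.
W_tilde = [
    [alpha[k][0] * x + alpha[k][1] * y for x, y in zip(W[(1, 1)], W[(1, 2)])]
    for k in range(2)
]

# Item 3: the combined vectors are eigenvectors with eigenvalues mu^(k).
multi_sector_ok = all(is_eigenpair(R_tilde, W_tilde[k], mu[k]) for k in range(2))
# Item 4: higher-order sector eigenvectors keep their sector eigenvalues.
within_sector_ok = all(is_eigenpair(R_tilde, W[(2, k)], lam[(2, k)]) for k in (1, 2))
print(multi_sector_ok, within_sector_ok)
```
			</code></p>
			<p>In this toy setting both checks should pass, and the four eigenvalues add up to the trace of the matrix, which is the number of stocks. With real data, the closed-form sector PCAs would be replaced by a numerical eigendecomposition of each sector block.</p>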
		</sec>
		<sec>
			<title>5 Application: S&amp;P 500 constituents</title>
			<p> We consider data for <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>n</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>434</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula> equities that are constituents of the S&amp;P 500 index. The data range from February 22, 2012 to February 16, 2018. We consider the correlation matrix of standardized stock returns and define the sectors as the Global Industry Classification Standard (GICS) sectors, referred to below as GICs, so <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>b</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>11</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula>; see <xref ref-type="table" rid="t1">Table 1</xref>.</p>
			<p>
				<table-wrap id="t1">
					<label>Table 1</label>
					<caption>
						<title>GIC sectors and number of companies in each sector. </title>
					</caption>
					<table>
						<colgroup>
							<col/>
							<col/>
							<col/>
						</colgroup>
						<thead>
							<tr>
								<th align="center">GIC (<inline-formula>
										<mml:math>
											<mml:mi>k</mml:mi>
										</mml:math>
									</inline-formula>)</th>
								<th align="center">Description</th>
								<th align="center">Number of companies (<inline-formula>
										<mml:math>
											<mml:mrow>
												<mml:msub>
													<mml:mi>n</mml:mi>
													<mml:mi>k</mml:mi>
												</mml:msub>
											</mml:mrow>
										</mml:math>
									</inline-formula>)</th>
							</tr>
						</thead>
						<tbody>
							<tr>
								<td align="center">1</td>
								<td align="center">Consumer Discretionary</td>
								<td align="center">73</td>
							</tr>
							<tr>
								<td align="center">2</td>
								<td align="center">Consumer Staples</td>
								<td align="center">56</td>
							</tr>
							<tr>
								<td align="center">3</td>
								<td align="center">Energy</td>
								<td align="center">27</td>
							</tr>
							<tr>
								<td align="center">4</td>
								<td align="center">Financials</td>
								<td align="center">59</td>
							</tr>
							<tr>
								<td align="center">5</td>
								<td align="center">Health Care</td>
								<td align="center">51</td>
							</tr>
							<tr>
								<td align="center">6</td>
								<td align="center">Industrials</td>
								<td align="center">57</td>
							</tr>
							<tr>
								<td align="center">7</td>
								<td align="center">Information Technology</td>
								<td align="center">58</td>
							</tr>
							<tr>
								<td align="center">8</td>
								<td align="center">Materials</td>
								<td align="center">23</td>
							</tr>
							<tr>
								<td align="center">9</td>
								<td align="center">Real Estate</td>
								<td align="center">27</td>
							</tr>
							<tr>
								<td align="center">10</td>
								<td align="center">Telecommunication Services</td>
								<td align="center">3</td>
							</tr>
							<tr>
								<td align="center">11</td>
								<td align="center">Utilities</td>
								<td align="center">28</td>
							</tr>
						</tbody>
					</table>
				</table-wrap>
			</p>
			<sec>
				<title>5.1 Eigenvalues</title>
				<p> We considered the full empirical correlation matrix<xref ref-type="fn" rid="fn4"><sup>4</sup></xref> and the HPCA correlation matrix <inline-formula>
						<mml:math>
							<mml:mover accent="true">
								<mml:mi>R</mml:mi>
								<mml:mo>˜</mml:mo>
							</mml:mover>
						</mml:math>
					</inline-formula> (“HPCA matrix”). The spectrum of the HPCA matrix is very similar to that of the empirical correlation matrix <inline-formula>
						<mml:math>
							<mml:mi>R</mml:mi>
						</mml:math>
					</inline-formula>, except that the eigenvalues of the latter at the top of the spectrum are slightly larger than those of the HPCA matrix. This is because PCA explains more variance with fewer common factors (see <xref ref-type="fig" rid="f1">Figure 1</xref>). On the other hand, the sum of the eigenvalues is equal to <inline-formula>
						<mml:math>
							<mml:mrow>
								<mml:mi>n</mml:mi>
								<mml:mo>=</mml:mo>
								<mml:mn>434</mml:mn>
							</mml:mrow>
						</mml:math>
					</inline-formula> in both cases, which means that for high enough rank, the higher-order eigenvalues of HPCA are larger than those of PCA. The lowest eigenvalues of <inline-formula>
						<mml:math>
							<mml:mi>R</mml:mi>
						</mml:math>
					</inline-formula> are infinitesimal, and the empirical matrix is degenerate. At the bottom of the spectrum (not shown here), the HPCA eigenvalues are much larger than those of PCA and are separated from zero, since they are bounded from below by the smallest eigenvalue across all the sectors. Thus, the HPCA matrix is better conditioned than the full empirical matrix. </p>
				<p>
					<fig id="f1">
						<label>Figure 1</label>
						<caption>
							<title>X-axis: rank (<inline-formula>
									<mml:math>
										<mml:mi>k</mml:mi>
									</mml:math>
								</inline-formula>) of the eigenvalues, sorted in decreasing order. Y-axis: sum of the first <inline-formula>
									<mml:math>
										<mml:mi>k</mml:mi>
									</mml:math>
								</inline-formula> eigenvalues divided by <inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mi>n</mml:mi>
											<mml:mo>=</mml:mo>
											<mml:mn>434</mml:mn>
										</mml:mrow>
									</mml:math>
								</inline-formula>. The PCA curve rises faster than HPCA, due to the nature of the PCA algorithm.</title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf1.svg"/>
					</fig>
				</p>
				<p>
					<table-wrap id="t2">
						<label>Table 2</label>
						<caption>
							<title>Top 25 eigenvalues of PCA and HPCA, sorted in decreasing order.</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
							</colgroup>
							<thead>
								<tr>
									<th align="center">PCA</th>
									<th align="center">HPCA</th>
									<th align="center">Eigenportfolio</th>
									<th align="center"> </th>
									<th align="center">PCA</th>
									<th align="center">HPCA</th>
									<th align="center">Eigenportfolio</th>
								</tr>
							</thead>
							<tbody>
								<tr>
									<td align="center">138.87</td>
									<td align="center">137.19</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.79</td>
									<td align="center">2.18</td>
									<td align="center">Industrials</td>
								</tr>
								<tr>
									<td align="center">26.84</td>
									<td align="center">20.70</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.52</td>
									<td align="center">2.15</td>
									<td align="center">Consumer Disc.</td>
								</tr>
								<tr>
									<td align="center">11.88</td>
									<td align="center">8.18</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.46</td>
									<td align="center">2.14</td>
									<td align="center">Healthcare</td>
								</tr>
								<tr>
									<td align="center">7.70</td>
									<td align="center">5.91</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.36</td>
									<td align="center">2.09</td>
									<td align="center">Inf. Technology</td>
								</tr>
								<tr>
									<td align="center">6.87</td>
									<td align="center">4.93</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.32</td>
									<td align="center">2.03</td>
									<td align="center">Multi-sector</td>
								</tr>
								<tr>
									<td align="center">5.75</td>
									<td align="center">3.69</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.24</td>
									<td align="center">1.94</td>
									<td align="center">Inf. Technology</td>
								</tr>
								<tr>
									<td align="center">5.16</td>
									<td align="center">3.38</td>
									<td align="center">Consumer Disc.</td>
									<td align="center"> </td>
									<td align="center">2.20</td>
									<td align="center">1.93</td>
									<td align="center">Industrials</td>
								</tr>
								<tr>
									<td align="center">4.70</td>
									<td align="center">2.88</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.18</td>
									<td align="center">1.92</td>
									<td align="center">Energy</td>
								</tr>
								<tr>
									<td align="center">3.90</td>
									<td align="center">2.80</td>
									<td align="center">Financials</td>
									<td align="center"> </td>
									<td align="center">2.13</td>
									<td align="center">1.80</td>
									<td align="center">Consumer Disc.</td>
								</tr>
								<tr>
									<td align="center">3.61</td>
									<td align="center">2.68</td>
									<td align="center">Multi-sector</td>
									<td align="center"> </td>
									<td align="center">2.06</td>
									<td align="center">1.59</td>
									<td align="center">Inf. Technology</td>
								</tr>
								<tr>
									<td align="center">3.48</td>
									<td align="center">2.67</td>
									<td align="center">Healthcare</td>
									<td align="center"> </td>
									<td align="center">2.01</td>
									<td align="center">1.57</td>
									<td align="center">Industrials</td>
								</tr>
								<tr>
									<td align="center">3.02</td>
									<td align="center">2.53</td>
									<td align="center">Consumer Disc.</td>
									<td align="center"> </td>
									<td align="center">1.96</td>
									<td align="center">1.57</td>
									<td align="center">Healthcare</td>
								</tr>
								<tr>
									<td align="center">2.87</td>
									<td align="center">2.25</td>
									<td align="center">Healthcare</td>
									<td align="center"> </td>
									<td align="center"> </td>
									<td align="center"> </td>
									<td align="center"> </td>
								</tr>
							</tbody>
						</table>
						<table-wrap-foot>
							<fn id="TFN1">
								<p>The column “Eigenportfolio” gives an interpretation of the corresponding HPCA eigenportfolio. “Multi-sector” corresponds to a 
										<inline-formula>
											<mml:math>
												<mml:mrow>
													<mml:msup>
														<mml:mi>μ</mml:mi>
														<mml:mrow>
															<mml:mfenced>
																<mml:mi>k</mml:mi>
															</mml:mfenced>
														</mml:mrow>
													</mml:msup>
												</mml:mrow>
											</mml:math>
										</inline-formula>
									-eigenvalue and eigenvector, which are combinations of the <italic>first</italic> eigenportfolios for each of the 11 sectors (space 
										<inline-formula>
											<mml:math>
												<mml:mtext>Ω</mml:mtext>
											</mml:math>
										</inline-formula>). The other eigenvalues/eigenvectors correspond to higher-order eigenvalues/eigenvectors for individual GIC sectors. Notice that, after sorting, some of the GIC eigenportfolios are more important in terms of explaining variability than multi-sector portfolios.</p>
							</fn>
						</table-wrap-foot>
					</table-wrap>
				</p>
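				<p>As a rough cross-check of Figure 1 against Table 2, the fraction of variance explained by the first few eigenportfolios can be recomputed from the tabulated eigenvalues. The following Python sketch is illustrative only: the function and variable names are hypothetical, and only the five largest eigenvalues of each method, copied from Table 2, are used.</p>
				<p><code language="python">
```python
from itertools import accumulate

n = 434  # number of stocks, equal to the trace of either correlation matrix

# Five largest eigenvalues of each method, copied from Table 2.
pca_top = [138.87, 26.84, 11.88, 7.70, 6.87]
hpca_top = [137.19, 20.70, 8.18, 5.91, 4.93]

def explained_fraction(eigenvalues, n_assets):
    """Cumulative sums of eigenvalues (assumed sorted in decreasing order), divided by n."""
    return [total / n_assets for total in accumulate(eigenvalues)]

pca_curve = explained_fraction(pca_top, n)
hpca_curve = explained_fraction(hpca_top, n)

# Print both curves and their gap at each rank.
for rank, (p, h) in enumerate(zip(pca_curve, hpca_curve), start=1):
    print(f"rank {rank}: PCA {p:.3f}  HPCA {h:.3f}  gap {p - h:.3f}")
```
				</code></p>
				<p>On these five ranks the PCA curve lies above the HPCA curve, in line with the caption of Figure 1. Since both spectra sum to the number of stocks, the gap has to close again at higher ranks.</p>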
			</sec>
			<sec>
				<title>5.2 Eigenvectors</title>
				<p> We turn to the empirical analysis of the eigenvectors of the HPCA and empirical correlation matrices, <italic>i.e.</italic>, to the identification problem for PCA/HPCA. The first eigenvectors for HPCA and PCA are plotted in <xref ref-type="fig" rid="f2">Figures (5.2)</xref> and <xref ref-type="fig" rid="f2">(5.2)</xref>. Since the first eigenvector of <inline-formula>
						<mml:math>
							<mml:mi>M</mml:mi>
						</mml:math>
					</inline-formula> has positive entries and the first eigenvectors of the sector correlation matrices also have positive entries, due to the positive correlations of stocks ([<xref ref-type="bibr" rid="B1">1</xref>], [<xref ref-type="bibr" rid="B2">2</xref>]), the EV1 loadings are positive for both PCA and HPCA. <xref ref-type="fig" rid="f2">Figure (5.2)</xref> superimposes both eigenvectors. Within each sector, stocks are ordered alphabetically along the X-axis, and the sectors are displayed in increasing order of GIC number, following <xref ref-type="table" rid="t1">Table 1</xref>. The two eigenvectors are practically indistinguishable, in the sense that their average difference is of order <inline-formula>
						<mml:math>
							<mml:mrow>
								<mml:mn>1.0</mml:mn>
								<mml:mo>×</mml:mo>
								<mml:msup>
									<mml:mrow>
										<mml:mn>10</mml:mn>
									</mml:mrow>
									<mml:mrow>
										<mml:mo>−</mml:mo>
										<mml:mn>5</mml:mn>
									</mml:mrow>
								</mml:msup>
							</mml:mrow>
						</mml:math>
					</inline-formula> and the standard deviation (centered RMS distance) is <inline-formula>
						<mml:math>
							<mml:mrow>
								<mml:mn>5.3</mml:mn>
								<mml:mo>×</mml:mo>
								<mml:msup>
									<mml:mrow>
										<mml:mn>10</mml:mn>
									</mml:mrow>
									<mml:mrow>
										<mml:mo>−</mml:mo>
										<mml:mn>3</mml:mn>
									</mml:mrow>
								</mml:msup>
							</mml:mrow>
						</mml:math>
					</inline-formula>. The RMS error is one order of magnitude smaller than the average size of the entries of the eigenvectors, which is approximately equal to <inline-formula>
						<mml:math>
							<mml:mrow>
								<mml:mn>4.7</mml:mn>
								<mml:mo>×</mml:mo>
								<mml:msup>
									<mml:mrow>
										<mml:mn>10</mml:mn>
									</mml:mrow>
									<mml:mrow>
										<mml:mo>−</mml:mo>
										<mml:mn>2</mml:mn>
									</mml:mrow>
								</mml:msup>
							</mml:mrow>
						</mml:math>
					</inline-formula>, in both cases.</p>
				<p>This identifies the first eigenportfolio of the market as a “portfolio of first eigenportfolios” of different sectors (GICs). The difference in explanatory power between the two eigenvectors is the difference between the corresponding eigenvalues, divided by the number of stocks, namely <inline-formula>
						<mml:math>
							<mml:mrow>
								<mml:mfenced>
									<mml:mrow>
										<mml:mn>138.87</mml:mn>
										<mml:mo>−</mml:mo>
										<mml:mn>137.19</mml:mn>
									</mml:mrow>
								</mml:mfenced>
								<mml:mo>/</mml:mo>
								<mml:mn>434</mml:mn>
								<mml:mo>=</mml:mo>
								<mml:mn>0.39</mml:mn>
								<mml:mtext>%</mml:mtext>
							</mml:mrow>
						</mml:math>
					</inline-formula>, which is negligible in this context. In particular, this suggests that using the first HPCA eigenportfolio as a proxy for the market portfolio gives rise to a better description of the market portfolio and an easier way to allocate to each stock. For instance, the first EV could be proxied by a capitalization-weighted sector ETF.<xref ref-type="fn" rid="fn5"><sup>5</sup></xref></p>
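				<p>The comparison statistics and the eigenvalue gap discussed above can be sketched in a few lines. This is a minimal illustration, assuming that the “average difference” is the mean of the entrywise differences and that the “centered RMS distance” is their standard deviation. The names <monospace>compare_eigenvectors</monospace> and <monospace>unit</monospace> and the synthetic input vectors are hypothetical; they are not the data or code used in this paper. Only the two eigenvalues and the number of stocks are taken from the text.</p>
				<preformat preformat-type="code">
```python
import math
import random


def compare_eigenvectors(v, w):
    """Return (mean difference, centered RMS distance, mean absolute entry of v)."""
    n = len(v)
    # Eigenvectors are defined up to sign: align w with v before comparing.
    dot = sum(a * b for a, b in zip(v, w))
    sign = math.copysign(1.0, dot)
    diffs = [a - sign * b for a, b in zip(v, w)]
    mean_diff = sum(diffs) / n
    # Standard deviation of the entrywise differences (centered RMS distance).
    centered_rms = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / n)
    mean_abs = sum(abs(a) for a in v) / n
    return mean_diff, centered_rms, mean_abs


def unit(v):
    """Rescale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


# Synthetic stand-ins for two nearly identical first eigenvectors
# (hypothetical data, NOT the eigenvectors computed in the paper).
random.seed(0)
n_stocks = 434
v_hpca = unit([1.0 + 0.1 * random.gauss(0.0, 1.0) for _ in range(n_stocks)])
v_pca = unit([a + 0.005 * random.gauss(0.0, 1.0) for a in v_hpca])

mean_diff, centered_rms, mean_abs = compare_eigenvectors(v_hpca, v_pca)
print(f"mean difference = {mean_diff:.2e}")
print(f"centered RMS    = {centered_rms:.2e}")
print(f"mean |entry|    = {mean_abs:.2e}")

# Eigenvalue gap quoted in the text. The trace of a correlation matrix equals
# its dimension, so eigenvalue / n_stocks is a share of total variance.
lam_large = 138.87  # larger of the two top eigenvalues quoted in the text
lam_small = 137.19  # smaller of the two top eigenvalues quoted in the text
gap = (lam_large - lam_small) / n_stocks
print(f"difference in explained variance = {gap:.2%}")  # 0.39%
```
				</preformat>
				<p>For a unit-norm vector of length 434 whose entries are roughly equal in size, a typical entry is close to 1/√434 ≈ 0.048, which is consistent with the average entry size reported above.</p>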
				<p>For eigenvectors 2 through 5 (<xref ref-type="fig" rid="f4">Figures 4</xref> through <xref ref-type="fig" rid="f8">8</xref>), we find that the PCA eigenvectors correspond to “noisy versions” of the corresponding HPCA eigenvectors. The latter are essentially long-short sector eigenportfolios. The discrepancy increases for higher-order eigenvectors, beyond the fifth. The sixth eigenvectors are not similar, as shown in <xref ref-type="fig" rid="f9">Figure 9</xref>: the PCA eigenvector contains both positive and negative signs within the Consumer Discretionary sector. Eigenvector 7 in HPCA is the first that is concentrated in a single sector, namely Consumer Discretionary (<xref ref-type="fig" rid="f10">Fig. 10</xref>). The remaining eigenvectors up to rank 10 are displayed in <xref ref-type="fig" rid="f11">Figures 11</xref> to <xref ref-type="fig" rid="f13">13</xref>.</p>
				<p>The main conclusions are: (a) most of the top eigenvalues and corresponding eigenvectors are related to the inter-sector correlation <inline-formula>
						<mml:math>
							<mml:mover accent="true">
								<mml:mi>ρ</mml:mi>
								<mml:mo>¯</mml:mo>
							</mml:mover>
						</mml:math>
					</inline-formula>. This provides an interpretation for these eigenportfolios, or common risk factors, as “portfolios of long-only sector portfolios”. (b) The remaining eigenvectors may be quite different. The HPCA separates the factors into “sector-sector” and “long-short intra-sector” types. PCA eigenvectors, in contrast, become increasingly difficult to interpret as simple sector-sector or intra-sector interactions.</p>
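				<p>One way to make the notions of “localized” and “long-short intra-sector” eigenvectors concrete is sketched below. The two diagnostics, the share of squared loadings carried by each sector and the list of sectors with mixed-sign loadings, are illustrative assumptions and not measures defined in this paper. The function names and the two-sector toy data are hypothetical.</p>
				<preformat preformat-type="code">
```python
def sector_weights(vector, sectors):
    """Share of the squared norm of `vector` carried by each sector."""
    total = sum(x * x for x in vector)
    weights = {}
    for x, s in zip(vector, sectors):
        weights[s] = weights.get(s, 0.0) + x * x / total
    return weights


def mixed_sign_sectors(vector, sectors):
    """Sectors whose nonzero loadings take both positive and negative values."""
    signs = {}
    for x, s in zip(vector, sectors):
        if x != 0.0:
            signs.setdefault(s, set()).add(x > 0.0)
    return sorted(s for s, seen in signs.items() if len(seen) == 2)


# Toy example with two hypothetical sectors "A" and "B" (not the paper's data).
sectors = ["A", "A", "A", "B", "B", "B"]
sector_factor = [0.4, 0.4, 0.4, -0.4, -0.4, -0.4]  # long sector A, short sector B
intra_factor = [0.7, -0.7, 0.0, 0.0, 0.0, 0.0]     # long-short inside sector A

weights_sector = sector_weights(sector_factor, sectors)
weights_intra = sector_weights(intra_factor, sectors)
print(weights_sector)   # weight is split evenly between the two sectors
print(weights_intra)    # all of the weight sits in sector A
print(mixed_sign_sectors(sector_factor, sectors))
print(mixed_sign_sectors(intra_factor, sectors))
```
				</preformat>
				<p>Under these assumptions, a “sector-sector” factor has one sign within each sector, while a “long-short intra-sector” factor puts nearly all of its weight in one sector and changes sign inside it.</p>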
				<p>
					<fig id="f2">
						<label>Figure 2</label>
						<caption>
							<title>First eigenvector of HPCA. Variance explained = 30%.</title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf2.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f3">
						<label>Figure 3</label>
						<caption>
							<title>Comparison of the first eigenvectors of HPCA and PCA, which have approximately the same explanatory value. Their Euclidean distance (RMS error) is <inline-formula>
									<mml:math>
										<mml:mrow>
											<mml:mn>5.5</mml:mn>
											<mml:mo>×</mml:mo>
											<mml:msup>
												<mml:mrow>
													<mml:mn>10</mml:mn>
												</mml:mrow>
												<mml:mrow>
													<mml:mo>−</mml:mo>
													<mml:mn>3</mml:mn>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:math>
								</inline-formula>, which is an order of magnitude smaller than the average entry size. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf3.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f4">
						<label>Figure 4</label>
						<caption>
							<title>Second eigenvector of HPCA. The variance explained is 4.7% for HPCA and 6.1% for PCA. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf4.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f5">
						<label>Figure 5</label>
						<caption>
							<title>Comparison of the second eigenvectors. The PCA eigenvector is essentially a noisy version of the HPCA eigenvector. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf5.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f6">
						<label>Figure 6</label>
						<caption>
							<title>The third eigenvectors of HPCA and PCA: one can observe again that PCA EV3 is a noisy version of HPCA EV3. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf6.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f7">
						<label>Figure 7</label>
						<caption>
							<title>The fourth eigenvectors. Notice the similar loadings for sectors. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf7.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f8">
						<label>Figure 8</label>
						<caption>
							<title>The fifth eigenvectors. Notice the similar loadings for sectors. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf8.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f9">
						<label>Figure 9</label>
						<caption>
							<title>The sixth eigenvectors. In this case, PCA presents a different shape and is not “localized” on any sector. The leftmost part of the PCA eigenvector corresponds to Consumer Discretionary. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf9.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f10">
						<label>Figure 10</label>
						<caption>
							<title>The seventh eigenvectors. The HPCA eigenvector is essentially localized on the Consumer Discretionary sector (it is the second eigenvector of this sector). The PCA eigenvector is completely delocalized. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf10.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f11">
						<label>Figure 11</label>
						<caption>
							<title>Eighth eigenvectors. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf11.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f12">
						<label>Figure 12</label>
						<caption>
							<title>Ninth eigenvectors. The HPCA eigenvector is localized in the Financials sector. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf12.svg"/>
					</fig>
				</p>
				<p>
					<fig id="f13">
						<label>Figure 13</label>
						<caption>
							<title>Tenth eigenvectors. </title>
						</caption>
						<graphic xlink:href="2448-6795-rmef-15-01-1-gf13.svg"/>
					</fig>
				</p>
			</sec>
		</sec>
		<sec sec-type="conclusions">
			<title>6 Analysis of residuals via RMT &amp; Conclusion</title>
			<p>To further evaluate the HPCA, we considered both models (HPCA and PCA) with a cutoff <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>m</mml:mi>
							<mml:mo>=</mml:mo>
							<mml:mn>30</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula>, and compared the multivariate statistics. We expect that after removing <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mi>m</mml:mi>
							<mml:mo>≈</mml:mo>
							<mml:mn>30</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula> eigenvectors, the correlations of the residuals (both intra- and inter-sector) should be small.</p>
			<p>Empirically, the top eigenvalues of the correlation matrices of residuals are approximately 6.8 (HPCA) and 7.7 (PCA), which correspond to an approximate average correlation of <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:mn>7.3</mml:mn>
							<mml:mo>/</mml:mo>
							<mml:mn>434</mml:mn>
							<mml:mo>=</mml:mo>
							<mml:mn>1.7</mml:mn>
							<mml:mtext>%</mml:mtext>
						</mml:mrow>
					</mml:math>
				</inline-formula>. We compared the histograms of the eigenvalues of the corresponding correlation matrices and found that they are very close to each other. We also compared these histograms with a discretization of the Marcenko-Pastur (MP) distribution (mimicking the comparable histogram in the large-matrix limit). The comparison suggests that the residuals behave like a random matrix in both models; see <xref ref-type="fig" rid="f14">Fig. 14</xref>. The majority of the lines, in both cases, are below the Marcenko-Pastur cutoff <inline-formula>
					<mml:math>
						<mml:mrow>
							<mml:msup>
								<mml:mi>λ</mml:mi>
								<mml:mo>+</mml:mo>
							</mml:msup>
							<mml:mo>=</mml:mo>
							<mml:mn>2.36</mml:mn>
						</mml:mrow>
					</mml:math>
				</inline-formula>, as postulated by RMT, and have sizes comparable to those of the MP distribution. There are, nevertheless, some lines above the MP threshold in both models (which are essentially equal), but they decrease in magnitude as <inline-formula>
					<mml:math>
						<mml:mi>λ</mml:mi>
					</mml:math>
				</inline-formula> increases, and could perhaps be interpreted as finite-size fluctuations.</p>
			<p>This calculation suggests that using the full empirical correlation matrix is not more informative than using the HPCA model, which uses only the sector correlation matrices, and in which inter-sector correlations are derived from the correlations of the EV1 of the different sectors. The HPCA thus provides a simpler description of common risk factors than PCA. It is therefore a viable alternative to PCA in the analysis of multivariate data in finance, which should be of interest for asset allocation and portfolio risk management.</p>
			<p>
				<fig id="f14">
					<label>Figure 14</label>
					<caption>
						<title>Histograms of the eigenvalues of the residual correlation matrices for HPCA (blue) and PCA (orange) after removing the first 3 eigenportfolios. For reference we display the “histogram” of the Marcenko-Pastur (MP) distribution corresponding to the same ratio of rows to columns (<inline-formula>
								<mml:math>
									<mml:mrow>
										<mml:mn>1508</mml:mn>
										<mml:mo>×</mml:mo>
										<mml:mn>434</mml:mn>
									</mml:mrow>
								</mml:math>
							</inline-formula>). The histograms of HPCA and PCA are comparable. Both are localized below the critical MP level of 2.36, with a smooth “leakage” as expected due to finite-size effects. </title>
					</caption>
					<graphic xlink:href="2448-6795-rmef-15-01-1-gf14.svg"/>
				</fig>
			</p>
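			<p>The Marcenko-Pastur threshold and the average residual correlation quoted in this section can be reproduced with a short calculation. The sketch below assumes unit-variance residuals and takes the matrix dimensions (1508 observations of 434 stocks) from the caption of Figure 14. The function names are hypothetical, and the midpoint-rule discretization is only one possible choice, not necessarily the one used for the figure.</p>
			<preformat preformat-type="code">
```python
import math


def mp_edges(n_assets, n_obs, sigma2=1.0):
    """Lower and upper edges of the Marcenko-Pastur spectrum."""
    q = n_assets / n_obs
    lower = sigma2 * (1.0 - math.sqrt(q)) ** 2
    upper = sigma2 * (1.0 + math.sqrt(q)) ** 2
    return lower, upper


def mp_density(lam, n_assets, n_obs, sigma2=1.0):
    """Marcenko-Pastur density at lam (zero outside the spectrum).

    Valid as a full probability density when n_assets does not exceed n_obs.
    """
    q = n_assets / n_obs
    lower, upper = mp_edges(n_assets, n_obs, sigma2)
    if lam >= upper or lower >= lam:
        return 0.0
    return math.sqrt((upper - lam) * (lam - lower)) / (2.0 * math.pi * q * sigma2 * lam)


def mp_histogram(n_assets, n_obs, n_bins=400):
    """Discretize the density into bin masses using the midpoint rule."""
    lower, upper = mp_edges(n_assets, n_obs)
    width = (upper - lower) / n_bins
    mids = [lower + (k + 0.5) * width for k in range(n_bins)]
    masses = [mp_density(x, n_assets, n_obs) * width for x in mids]
    return mids, masses


n_assets, n_obs = 434, 1508  # dimensions taken from the caption of Figure 14
lower, upper = mp_edges(n_assets, n_obs)
mids, masses = mp_histogram(n_assets, n_obs)
total_mass = sum(masses)  # should be close to 1 for a probability density

# Approximation used in the text: top eigenvalue divided by the number of stocks.
avg_corr = 7.3 / n_assets

print(f"lambda_plus = {upper:.2f}")
print(f"total mass of discretized density = {total_mass:.3f}")
print(f"average residual correlation = {avg_corr:.1%}")
```
			</preformat>
			<p>Multiplying each bin mass by the number of stocks gives expected eigenvalue counts per bin, which can be compared with the empirical histograms bin by bin.</p>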
		</sec>
	</body>
	<back>
		<ref-list>
			<title>References</title>
			<ref id="B1">
				<label>1</label>
				<mixed-citation>[1] Avellaneda, M. and Lee, JH, Statistical arbitrage in the US equities market, Quantitative Finance, 2010, vol. 10, issue 7, 761-782</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Avellaneda</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Lee</surname>
							<given-names>JH</given-names>
						</name>
					</person-group>
					<article-title>Statistical arbitrage in the US equities market</article-title>
					<source>Quantitative Finance</source>
					<year>2010</year>
					<volume>10</volume>
					<issue>7</issue>
					<fpage>761</fpage>
					<lpage>782</lpage>
				</element-citation>
			</ref>
			<ref id="B2">
				<label>2</label>
				<mixed-citation>[2] Boyle, Phelim P., Positive Weights on the Efficient Frontier (October 9, 2012). Available at SSRN: <ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=2159445">https://ssrn.com/abstract=2159445</ext-link> or http://dx.doi.org/10.2139/ssrn.2159445</mixed-citation>
				<element-citation publication-type="report">
					<person-group person-group-type="author">
						<name>
							<surname>Boyle</surname>
							<given-names>Phelim P.</given-names>
						</name>
					</person-group>
					<source>Positive Weights on the Efficient Frontier</source>
					<year>2012</year>
					<publisher-name>SSRN</publisher-name>
					<ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=2159445">https://ssrn.com/abstract=2159445</ext-link>
					<pub-id pub-id-type="doi">10.2139/ssrn.2159445</pub-id>
				</element-citation>
			</ref>
			<ref id="B3">
				<label>3</label>
				<mixed-citation>[3] Cont, R and Kan, Y.H., Statistical Modeling of Credit Default Swap Portfolios (April 1, 2011). Available at SSRN: <ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=1771862">https://ssrn.com/abstract=1771862</ext-link> or http://dx.doi.org/10.2139/ssrn.1771862</mixed-citation>
				<element-citation publication-type="report">
					<person-group person-group-type="author">
						<name>
							<surname>Cont</surname>
							<given-names>R</given-names>
						</name>
						<name>
							<surname>Kan</surname>
							<given-names>Y.H.</given-names>
						</name>
					</person-group>
					<source>Statistical Modeling of Credit Default Swap Portfolios</source>
					<year>2011</year>
					<publisher-name>SSRN</publisher-name>
					<ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=1771862">https://ssrn.com/abstract=1771862</ext-link>
					<pub-id pub-id-type="doi">10.2139/ssrn.1771862</pub-id>
				</element-citation>
			</ref>
			<ref id="B4">
				<label>4</label>
				<mixed-citation>[4] Dobi, Doris, Modeling Volatility Risk in Equity Options A Cross-Sectional Approach, Scholars’ Press, (June 2018), NYU Ph.D. Dissertation, August 2014.</mixed-citation>
				<element-citation publication-type="thesis">
					<person-group person-group-type="author">
						<name>
							<surname>Dobi</surname>
							<given-names>Doris</given-names>
						</name>
					</person-group>
					<source>Modeling Volatility Risk in Equity Options A Cross-Sectional Approach, Scholars’ Press</source>
					<year>2018</year>
					<publisher-loc>NYU</publisher-loc>
					<comment content-type="degree">Ph.D. Dissertation</comment>
				</element-citation>
			</ref>
			<ref id="B5">
				<label>5</label>
				<mixed-citation>[5] Ivanov, S, Initial margin estimations for credit default swap portfolios, RISK Journal of Financial Market Infrastructures, July 2017</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Ivanov</surname>
							<given-names>S</given-names>
						</name>
					</person-group>
					<article-title>Initial margin estimations for credit default swap portfolios</article-title>
					<source>RISK Journal of Financial Market Infrastructures</source>
					<year>2017</year>
				</element-citation>
			</ref>
			<ref id="B6">
				<label>6</label>
				<mixed-citation>[6] Jolliffe, I.T., Principal Component Analysis, 2nd edition, Springer, New York, 2002.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Jolliffe</surname>
							<given-names>I.T.</given-names>
						</name>
					</person-group>
					<source>Principal Component Analysis</source>
					<edition>2nd</edition>
					<publisher-name>Springer</publisher-name>
					<publisher-loc>New York</publisher-loc>
					<year>2002</year>
				</element-citation>
			</ref>
			<ref id="B7">
				<label>7</label>
				<mixed-citation>[7] Laloux, L., Cizeau, P., Potters, M. and Bouchaud, J.-P., Random Matrix Theory and Financial Correlations, Mathematical Methods in Applied Sciences, 2000.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Laloux</surname>
							<given-names>L.</given-names>
						</name>
						<name>
							<surname>Cizeau</surname>
							<given-names>P.</given-names>
						</name>
						<name>
							<surname>Potters</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Bouchaud</surname>
							<given-names>J.-P.</given-names>
						</name>
					</person-group>
					<source>Random Matrix Theory and Financial Correlations, Mathematical Methods in Applied Sciences</source>
					<year>2000</year>
				</element-citation>
			</ref>
			<ref id="B8">
				<label>8</label>
				<mixed-citation>[8] Litterman, R., and Scheinkman, J., Common factors affecting bond returns, The Journal of Fixed Income, 1991</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Litterman</surname>
							<given-names>R.</given-names>
						</name>
						<name>
							<surname>Scheinkman</surname>
							<given-names>J.</given-names>
						</name>
					</person-group>
					<article-title>Common factors affecting bond returns</article-title>
					<source>The Journal of Fixed Income</source>
					<year>1991</year>
				</element-citation>
			</ref>
			<ref id="B9">
				<label>9</label>
				<mixed-citation>[9] Shkolnik, A.D., Goldberg, L., Bohn, J.R., Identifying Broad and Narrow Financial Risk Factors with Convex Optimization (August 20, 2016). Available at SSRN: <ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=2800237">https://ssrn.com/abstract=2800237</ext-link> or http://dx.doi.org/10.2139/ssrn.2800237</mixed-citation>
				<element-citation publication-type="report">
					<person-group person-group-type="author">
						<name>
							<surname>Shkolnik</surname>
							<given-names>A.D.</given-names>
						</name>
						<name>
							<surname>Goldberg</surname>
							<given-names>L.</given-names>
						</name>
						<name>
							<surname>Bohn</surname>
							<given-names>J.R.</given-names>
						</name>
					</person-group>
					<source>Identifying Broad and Narrow Financial Risk Factors with Convex Optimization</source>
					<year>2016</year>
					<publisher-name>SSRN</publisher-name>
					<ext-link ext-link-type="uri" xlink:href="https://ssrn.com/abstract=2800237">https://ssrn.com/abstract=2800237</ext-link>
					<pub-id pub-id-type="doi">10.2139/ssrn.2800237</pub-id>
				</element-citation>
			</ref>
		</ref-list>
		<fn-group>
			<fn fn-type="other" id="fn1">
				<label><sup>1</sup></label>
				<p>No declared funding source for research development</p>
			</fn>
			<fn fn-type="other" id="fn2">
				<label><sup>2</sup></label>
				<p>We consider correlations instead of covariances because it is mathematically simpler to work in dimensionless units, i.e. to reduce to the case when all the volatilities are equal to one.</p>
			</fn>
			<fn fn-type="other" id="fn3">
				<label><sup>3</sup></label>
				<p>The proof of Proposition 2 is straightforward: one just has to observe that 
						<inline-formula>
							<mml:math>
								<mml:mrow>
									<mml:msub>
										<mml:mi>β</mml:mi>
										<mml:mi>i</mml:mi>
									</mml:msub>
									<mml:mo>=</mml:mo>
									<mml:msqrt>
										<mml:mrow>
											<mml:msup>
												<mml:mi>α</mml:mi>
												<mml:mrow>
													<mml:mi>I</mml:mi>
													<mml:mfenced>
														<mml:mi>i</mml:mi>
													</mml:mfenced>
												</mml:mrow>
											</mml:msup>
										</mml:mrow>
									</mml:msqrt>
									<mml:msubsup>
										<mml:mi>V</mml:mi>
										<mml:mi>i</mml:mi>
										<mml:mrow>
											<mml:mfenced>
												<mml:mrow>
													<mml:mn>1,</mml:mn>
													<mml:mi>I</mml:mi>
													<mml:mfenced>
														<mml:mi>i</mml:mi>
													</mml:mfenced>
												</mml:mrow>
											</mml:mfenced>
										</mml:mrow>
									</mml:msubsup>
								</mml:mrow>
							</mml:math>
						</inline-formula>
					 and calculate explicitly the action of 
						<inline-formula>
							<mml:math>
								<mml:mover accent="true">
									<mml:mi>R</mml:mi>
									<mml:mo>˜</mml:mo>
								</mml:mover>
							</mml:math>
						</inline-formula>
					 on each of the vectors 
						<inline-formula>
							<mml:math>
								<mml:mrow>
									<mml:msup>
										<mml:mi>W</mml:mi>
										<mml:mrow>
											<mml:mfenced>
												<mml:mrow>
													<mml:mi>j</mml:mi>
													<mml:mo>,</mml:mo>
													<mml:mi>k</mml:mi>
												</mml:mrow>
											</mml:mfenced>
										</mml:mrow>
									</mml:msup>
								</mml:mrow>
							</mml:math>
						</inline-formula>.</p>
			</fn>
			<fn fn-type="other" id="fn4">
				<label><sup>4</sup></label>
				<p>In the sequel we refer to the full empirical correlation matrix as the “PCA matrix”, for short.</p>
			</fn>
			<fn fn-type="other" id="fn5">
				<label><sup>5</sup></label>
				<p>A careful analysis of this idea, including out-of-sample tracking error analysis, will be done in a separate publication.</p>
			</fn>
		</fn-group>
	</back>
</article>