Correlation Analysis

Difficulty: advanced

Overview

Correlation measures how returns move together. It is a central input to portfolio risk (portfolio variance depends on each asset's volatility and on every pairwise correlation) and an often underestimated source of P&L surprises: correlations are unstable, regime-dependent, and tend to rise toward 1 in crashes.

Pearson Correlation

ρ(X, Y) = Cov(X, Y) / (σ_X · σ_Y)

where: Cov(X, Y) is the covariance of return series X and Y · σ_X is the standard deviation of X · σ_Y is the standard deviation of Y

does: gives a scale-free measure of linear dependence; used as the input to mean-variance optimizers and as the headline diversification check. It is misleading when the joint distribution has fat tails or non-linear dependence.

Range: −1 to +1. Measures linear dependence only.
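As a minimal sketch of the definition above, the function below computes ρ directly from demeaned series with NumPy. The two return series are hypothetical numbers chosen for illustration, not market data.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    # The 1/(n-1) factors in the covariance and the standard deviations cancel.
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical daily returns for two assets (illustrative only).
returns_a = [0.010, -0.004, 0.007, -0.012, 0.003, 0.009]
returns_b = [0.008, -0.001, 0.005, -0.015, 0.001, 0.011]

rho = pearson(returns_a, returns_b)
print(f"Pearson rho: {rho:.3f}")
```

The result should agree with `np.corrcoef(returns_a, returns_b)[0, 1]`, which is the more convenient call in practice.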

Spearman (Rank) Correlation

Pearson correlation computed on the ranks of the observations rather than on the values themselves. It captures monotonic but non-linear relationships and is more robust to outliers than Pearson.
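The "Pearson on the ranks" idea can be sketched in a few lines. This version assumes there are no ties in the data (ties would need average ranks), and the exponential relationship is a made-up example of a monotonic, non-linear link.

```python
import numpy as np

def ranks(values):
    """Ranks 1..n; assumes no ties (ties would need average ranks)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

x = np.linspace(-2.0, 2.0, 41)
y = np.exp(3.0 * x)  # strictly increasing, strongly non-linear

pearson_rho = float(np.corrcoef(x, y)[0, 1])
spearman_rho = spearman(x, y)
print(f"Pearson: {pearson_rho:.3f}  Spearman: {spearman_rho:.3f}")
```

Because y increases whenever x does, the rank correlation is 1 even though the linear correlation is well below 1.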

Rolling Correlation

A single static correlation is a poor proxy when you trade across regimes.

Plot a rolling 60-day correlation of two equities through 2008 or 2020: the rise toward 1 during stress is a defining risk-management lesson.
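The sketch below illustrates the effect on synthetic data rather than on 2008 or 2020 prices. The two regimes (true correlation 0.3, then 0.9), the volatilities, and the helper names are assumptions made for this example.

```python
import numpy as np

def rolling_corr(x, y, window):
    """Pearson correlation over each trailing window; NaN until the window fills."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(len(x), np.nan)
    for end in range(window, len(x) + 1):
        out[end - 1] = np.corrcoef(x[end - window:end], y[end - window:end])[0, 1]
    return out

rng = np.random.default_rng(0)

def correlated_pair(n, rho, vol):
    """Two synthetic return series with true correlation rho (hypothetical helper)."""
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    return vol * z1, vol * (rho * z1 + np.sqrt(1.0 - rho ** 2) * z2)

# 250 "calm" days followed by 250 "stress" days.
calm_a, calm_b = correlated_pair(250, 0.3, 0.01)
stress_a, stress_b = correlated_pair(250, 0.9, 0.03)
a = np.concatenate([calm_a, stress_a])
b = np.concatenate([calm_b, stress_b])

roll = rolling_corr(a, b, window=60)
full_sample = float(np.corrcoef(a, b)[0, 1])
calm_avg = float(np.nanmean(roll[:250]))
stress_avg = float(np.nanmean(roll[-150:]))  # windows that lie fully inside the stress regime
print(f"full sample: {full_sample:.2f}  calm: {calm_avg:.2f}  stress: {stress_avg:.2f}")
```

The full-sample number blends both regimes into one figure, which is the information a rolling estimate recovers.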

Covariance Matrix

The full portfolio risk object:

Σ_ij = ρ_ij · σ_i · σ_j

where: Σ_ij is the covariance between assets i and j · ρ_ij is the correlation between i and j · σ_i, σ_j are the standard deviations of the two assets

does: assembles the full covariance matrix from pairwise correlations and individual volatilities; used as the core input for portfolio variance, mean-variance optimization, risk parity, and VaR.
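In matrix form the same relation is Σ = D · ρ · D with D = diag(σ). A minimal sketch follows; the volatilities, correlations, and weights are hypothetical.

```python
import numpy as np

# Hypothetical annualized volatilities and correlations for three assets.
vols = np.array([0.20, 0.15, 0.10])
corr = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])

# Sigma_ij = rho_ij * sigma_i * sigma_j, applied element-wise.
cov = corr * np.outer(vols, vols)

weights = np.array([0.5, 0.3, 0.2])
portfolio_var = float(weights @ cov @ weights)   # w' Sigma w
portfolio_vol = float(np.sqrt(portfolio_var))
weighted_avg_vol = float(weights @ vols)

print(f"portfolio vol: {portfolio_vol:.4f}  weighted-average vol: {weighted_avg_vol:.4f}")
```

Portfolio volatility comes out below the weighted-average volatility because the correlations are below 1.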

Correlation Breakdown in Stress

Empirically, diversification weakens precisely when it is needed most. In normal regimes, equity pairs might run at 0.3–0.5 correlation; in panics they cluster above 0.8.

Regime	Avg. cross-asset ρ	Diversification benefit
Calm bull	0.2–0.4	Strong
Range-bound	0.4–0.6	Moderate
Sell-off	0.7–0.9	Weak
Crisis	0.85+	Minimal

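Implication: size positions on the stressed covariance matrix, not the calm one. A common heuristic when stress-testing is to multiply the off-diagonal correlations by a factor α > 1, capping each entry at 1 and checking that the resulting matrix is still positive semi-definite.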
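One way to apply the α heuristic is sketched below. The function name, the cap of 0.99 (kept slightly below 1 so the matrix stays invertible), and the eigenvalue-clipping repair are assumptions for this example; the repair is a simple fix, not a nearest-correlation-matrix algorithm.

```python
import numpy as np

def stress_correlation(corr, alpha, cap=0.99):
    """Scale off-diagonal correlations by alpha, cap them, and repair to PSD if needed."""
    corr = np.asarray(corr, dtype=float)
    stressed = np.clip(corr * alpha, -cap, cap)
    np.fill_diagonal(stressed, 1.0)
    # Entry-by-entry scaling can break positive semi-definiteness. If it does,
    # clip negative eigenvalues and rescale back to a unit diagonal.
    eigvals, eigvecs = np.linalg.eigh(stressed)
    if eigvals.min() < 0:
        eigvals = np.clip(eigvals, 1e-10, None)
        stressed = eigvecs @ np.diag(eigvals) @ eigvecs.T
        d = np.sqrt(np.diag(stressed))
        stressed = stressed / np.outer(d, d)
    return stressed

# Hypothetical four-asset book: equal weights, 20% vol each, calm correlation 0.4.
n = 4
calm_corr = np.full((n, n), 0.4)
np.fill_diagonal(calm_corr, 1.0)
vols = np.full(n, 0.20)
weights = np.full(n, 1.0 / n)

def portfolio_vol(corr):
    cov = corr * np.outer(vols, vols)
    return float(np.sqrt(weights @ cov @ weights))

stressed_corr = stress_correlation(calm_corr, alpha=2.0)  # 0.4 becomes 0.8
calm_vol = portfolio_vol(calm_corr)
stressed_vol = portfolio_vol(stressed_corr)
print(f"calm vol: {calm_vol:.4f}  stressed vol: {stressed_vol:.4f}")
```

Sizing against `stressed_vol` rather than `calm_vol` is the point of the exercise: same positions, same volatilities, materially more portfolio risk.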

Correlation vs. Causation

A correlated pair is not necessarily a stable pair. Spurious correlations multiply in finance because cross-testing thousands of series yields seemingly significant relationships by chance alone, and look-ahead, multiple-testing, and in-sample biases make them look more robust than they are.

As a rule of thumb, a "stable" pair's correlation shifts by less than 0.2 across windows; most pairs fail this test.
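The multiple-testing effect is easy to reproduce. In this sketch every series is independent noise by construction, so any "significant" correlation is spurious. The sample size, the number of candidates, and the cutoff of about 0.254 (roughly the 5% two-sided level for 60 observations) are choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_candidates = 60, 1000

# A target return series and 1000 candidate "signals", all independent noise.
target = rng.standard_normal(n_obs)
candidates = rng.standard_normal((n_candidates, n_obs))

# Pearson correlation of every candidate with the target, via standardized series.
t = (target - target.mean()) / target.std()
c = (candidates - candidates.mean(axis=1, keepdims=True)) / candidates.std(axis=1, keepdims=True)
rhos = c @ t / n_obs

threshold = 0.254  # approximate 5% two-sided cutoff for |rho| with 60 observations
false_hits = int((np.abs(rhos) > threshold).sum())
best = float(np.abs(rhos).max())
print(f"{false_hits} of {n_candidates} noise series pass the cutoff; largest |rho| = {best:.2f}")
```

Around 5% of the candidates are expected to clear the cutoff by chance, which is why a screen over many series needs a multiple-testing correction or out-of-sample confirmation.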

Eigenvalue Decomposition (PCA)

Principal Component Analysis of the correlation matrix reveals dominant risk factors. In equity markets, the first PC typically explains 40–60% of variance and is interpretable as the market factor.
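A minimal sketch of the decomposition, using a constant-correlation matrix as a stand-in for an equity universe (an assumption made so the result can be checked by hand). For n assets with common correlation ρ, the largest eigenvalue is 1 + (n − 1)·ρ.

```python
import numpy as np

n_assets, rho = 10, 0.4
corr = np.full((n_assets, n_assets), rho)
np.fill_diagonal(corr, 1.0)

# eigh is for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(corr)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Eigenvalues of a correlation matrix sum to the number of assets.
explained = eigvals / eigvals.sum()
first_pc_share = float(explained[0])
first_pc_loadings = eigvecs[:, 0]

print(f"first PC explains {first_pc_share:.0%} of variance")
print("loadings:", np.round(first_pc_loadings, 3))
```

Every asset loads equally on the first component (up to an overall sign), which is what makes it readable as a market factor in this toy case.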

Shrinkage

Sample correlation matrices are noisy, and they are singular (non-invertible) whenever the number of assets K exceeds the number of observations N. Shrink toward a structured target:

Σ_shrunk = (1 − δ) · Σ_sample + δ · Σ_target

where: Σ_sample is the raw sample covariance matrix · Σ_target is the structured shrinkage target (identity, constant-ρ, or factor) · δ ∈ [0, 1] is the shrinkage intensity

does: blends a noisy sample estimate toward a low-parameter target; used before inverting the covariance matrix in optimizers and large-portfolio risk models, stabilizing weights and reducing out-of-sample error.

Ledoit-Wolf provides an optimal δ analytically. Common targets:

Identity (no correlations)
Constant correlation
Single-factor (market beta) implied
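The blend itself is one line. The sketch below uses a scaled-identity target and a hand-picked δ = 0.3, which is an assumption for illustration and not the Ledoit-Wolf estimate. The returns are synthetic, with fewer observations than assets so the sample matrix is singular.

```python
import numpy as np

def shrink(sample_cov, target, delta):
    """Linear shrinkage: (1 - delta) * sample + delta * target."""
    return (1.0 - delta) * sample_cov + delta * target

rng = np.random.default_rng(1)
n_obs, n_assets = 30, 50  # fewer observations than assets
returns = 0.01 * rng.standard_normal((n_obs, n_assets))

sample_cov = np.cov(returns, rowvar=False)  # rank at most n_obs - 1, so singular

# Scaled-identity target: average sample variance on the diagonal, no correlations.
target = np.eye(n_assets) * np.trace(sample_cov) / n_assets

delta = 0.3  # hand-picked for illustration, NOT the Ledoit-Wolf optimum
shrunk_cov = shrink(sample_cov, target, delta)

rank_sample = int(np.linalg.matrix_rank(sample_cov))
min_eig_shrunk = float(np.linalg.eigvalsh(shrunk_cov).min())
print(f"sample rank: {rank_sample} of {n_assets}; smallest shrunk eigenvalue: {min_eig_shrunk:.2e}")
```

The shrunk matrix has strictly positive eigenvalues and can be inverted, whereas the raw sample matrix cannot.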

Diversification Metrics

Metric	Formula	Interpretation
Average correlation	mean of off-diagonals	Higher ⇒ more concentration in one factor
Effective N	(Σ w)² / Σ w² (inverse Herfindahl)	Equivalent number of equal-weight positions (ignores correlation)
Diversification ratio	(Σ w·σ) / σ_portfolio	> 1 ⇒ benefit from combining
Number of effective bets	function of cov eigenvalues	Concentration in PCs
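The metrics in the table can be computed as follows for a hypothetical four-asset portfolio. The table does not pin down the "number of effective bets" formula, so the entropy-based version here (the exponential of the entropy of each principal component's share of portfolio variance) is one possible choice and should be read as an assumption.

```python
import numpy as np

# Hypothetical portfolio: weights, volatilities, and a constant 0.5 correlation.
weights = np.array([0.4, 0.3, 0.2, 0.1])
vols = np.array([0.20, 0.18, 0.15, 0.25])
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)
cov = corr * np.outer(vols, vols)

# Average correlation: mean of the off-diagonal entries.
avg_corr = float(corr[~np.eye(4, dtype=bool)].mean())

# Effective N: inverse Herfindahl index of the weights.
effective_n = float(weights.sum() ** 2 / (weights ** 2).sum())

# Diversification ratio: weighted-average vol over portfolio vol.
portfolio_var = float(weights @ cov @ weights)
portfolio_vol = float(np.sqrt(portfolio_var))
diversification_ratio = float((weights @ vols) / portfolio_vol)

# Effective bets (assumed entropy-based definition): share of portfolio
# variance carried by each principal component, then exp(entropy).
eigvals, eigvecs = np.linalg.eigh(cov)
exposures = eigvecs.T @ weights
p = eigvals * exposures ** 2 / portfolio_var
p = p[p > 0]
effective_bets = float(np.exp(-(p * np.log(p)).sum()))

print(f"avg corr {avg_corr:.2f} | effective N {effective_n:.2f} | "
      f"div. ratio {diversification_ratio:.2f} | effective bets {effective_bets:.2f}")
```

Effective N looks only at the weights, while the diversification ratio and effective bets also reflect the covariance structure, so the three numbers can disagree for the same portfolio.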

Application to Position Sizing

For a basket of correlated positions, summing individual risks overstates total risk, while treating the positions as independent understates it. With n equal positions of volatility σ and average pairwise correlation ρ̄, basket volatility is σ · √(n · (1 + (n − 1) · ρ̄)), against σ · √n for independent positions. Holding basket risk at the independent-case level means scaling each position's risk to:

Correlation-adjusted risk per position = σ / √(1 + (n − 1) · ρ̄)

where: σ is the individual-position volatility · n is the number of positions in the basket · ρ̄ is the average pairwise correlation across the basket

does: gives the per-position risk budget after correlation drag; used to scale notional sizes when adding correlated positions so total portfolio volatility stays inside its limit.

Higher average correlation ⇒ less diversification ⇒ scale positions down.
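A quick numerical check of the scaling factor 1 / √(1 + (n − 1) · ρ̄), assuming n equally sized positions with the same volatility and the same pairwise correlation. The numbers (2% volatility, five positions, ρ̄ = 0.6) are hypothetical.

```python
import numpy as np

def correlation_scale(n, avg_rho):
    """Scaling factor 1 / sqrt(1 + (n - 1) * avg_rho)."""
    return 1.0 / np.sqrt(1.0 + (n - 1) * avg_rho)

sigma, n, avg_rho = 0.02, 5, 0.6
adjusted_sigma = sigma * correlation_scale(n, avg_rho)

# Build the basket covariance with the adjusted per-position risk.
corr = np.full((n, n), avg_rho)
np.fill_diagonal(corr, 1.0)
cov = corr * adjusted_sigma ** 2

ones = np.ones(n)
basket_vol = float(np.sqrt(ones @ cov @ ones))

# Benchmark: n independent positions, each at the unadjusted sigma.
independent_vol = float(sigma * np.sqrt(n))

print(f"adjusted per-position risk: {adjusted_sigma:.4f}")
print(f"basket vol: {basket_vol:.4f}  independent benchmark: {independent_vol:.4f}")
```

With the adjustment applied, the correlated basket carries the same volatility as n independent positions at full size, which is the sense in which the formula keeps total risk inside its limit.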

Practical Guidelines

Always look at rolling correlations. A single point estimate hides regime changes.
Stress-test with elevated correlations. Run "what if all pairs go to 0.9" scenarios on every portfolio.
Shrink before inverting. Mean-variance optimizers on raw covariance produce extreme weights.
Track correlation to your benchmark. Quietly drifting from "uncorrelated alpha" to "factor exposure" is a common failure mode.
Use rank or distance metrics for non-linear pairs. Pearson misses tail co-dependence.
Validate signal stability. A pair that "worked" in-sample with high ρ may have a much lower out-of-sample ρ.

Common Pitfalls

Pitfall	Effect
Computing ρ over too short a window	High variance, false signals
Computing ρ over too long a window	Misses regime change
Treating ρ as constant	Underestimates crash risk
Ignoring tail co-movement	Linear ρ silent on dependence in tails
Inverting raw sample Σ	Unstable optimization weights

References:

1. Ledoit & Wolf, "Honey, I Shrunk the Sample Covariance Matrix", Journal of Portfolio Management, 2004.
2. Longin & Solnik, "Extreme Correlation of International Equity Markets", Journal of Finance, 2001.
3. Embrechts et al., "Correlation and Dependence in Risk Management: Properties and Pitfalls", 2002.

Next Steps

Portfolio Optimization — using correlations to construct portfolios
VaR / CVaR — risk metrics built on the covariance matrix
Position Sizing — accounting for correlation in trade sizing
Stress Testing — modeling correlation breakdown