Correlation Analysis

Difficulty: advanced

Overview

Correlation measures how returns move together. It is the central input to portfolio risk (portfolio variance is a function of individual volatilities and pairwise correlations) and the most underestimated source of P&L surprises: correlations are unstable, regime-dependent, and tend to rise toward 1 in crashes.

Pearson Correlation

ρ(X, Y) = Cov(X, Y) / (σ_X · σ_Y)

where:

  • Cov(X, Y): covariance of return series X and Y
  • σ_X: standard deviation of X
  • σ_Y: standard deviation of Y

does: the scale-free linear-dependence statistic, used as the input to mean-variance optimizers and as the headline diversification check. It is misleading when the joint distribution has fat tails or the dependence is non-linear.

Range: −1 to +1. Measures linear dependence only.
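The definition above can be computed directly. The following is a minimal sketch in plain Python; the function name `pearson` and the return figures are illustrative assumptions, not data from any market.

```python
import math

def pearson(x, y):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n-1) factors in the covariance and in both standard deviations
    # cancel, so raw sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily returns; asset_b is exactly 2x asset_a
asset_a = [0.010, -0.020, 0.015, 0.003, -0.007]
asset_b = [2.0 * r for r in asset_a]

rho_same = pearson(asset_a, asset_b)                 # scale-free: +1 up to rounding
rho_flip = pearson(asset_a, [-r for r in asset_b])   # sign flip: -1 up to rounding
```

Doubling the second series leaves ρ unchanged, which is what "scale-free" means here: the statistic sees only the direction and tightness of the linear relationship, not its magnitude.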

Spearman (Rank) Correlation

Pearson on the ranks. Captures monotonic but non-linear relationships and is robust to outliers.
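As a sketch of "Pearson on the ranks", the example below ranks each series (ties share their average rank) and then applies the Pearson formula. The helper names `average_ranks` and `spearman` are hypothetical, and the data is synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [math.exp(v) for v in x]      # monotonic but strongly non-linear

rho_pearson = pearson(x, y)       # noticeably below 1
rho_spearman = spearman(x, y)     # 1 up to rounding: the ordering is identical
```

The gap between the two numbers is the point: the relationship is perfectly monotonic, so the rank statistic reports it in full, while the linear statistic is pulled down by the curvature.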

Rolling Correlation

Static correlation is a poor proxy when you trade across regimes.

Plot a rolling 60-day correlation of a pair of equities through 2008 or 2020: the rise toward 1 during stress is the defining risk-management lesson.
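A rolling estimate can be sketched as below. The series are synthetic (a deterministic "calm" stretch followed by a "stress" stretch where the two move in lockstep), and `rolling_correlation` is a hypothetical helper, not a library function.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rolling_correlation(x, y, window=60):
    """Correlation over each trailing window; None until the window is full."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    out = []
    for end in range(1, len(x) + 1):
        if end < window:
            out.append(None)
        else:
            out.append(pearson(x[end - window:end], y[end - window:end]))
    return out

# Synthetic "returns", not market data: y is unrelated to x for t < 120,
# then becomes an exact multiple of x (a stylised stress regime).
x = [math.sin(t) for t in range(200)]
y = [math.cos(t) if t < 120 else 2.0 * math.sin(t) for t in range(200)]

rolling = rolling_correlation(x, y, window=60)
calm_rho = rolling[59]     # window covers t = 0..59, entirely calm
stress_rho = rolling[-1]   # window covers t = 140..199, entirely stressed
```

A single full-sample correlation over all 200 points would report one blended number and hide the jump. With pandas available, `Series.rolling(window).corr(other)` would be the more usual way to compute the same thing.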

Covariance Matrix

The full portfolio risk object:

Σ_ij = ρ_ij · σ_i · σ_j

where:

  • Σ_ij: covariance between assets i and j
  • ρ_ij: correlation between i and j
  • σ_i, σ_j: standard deviations of the two assets

does: assembles the full covariance matrix from pairwise correlations and individual volatilities. Used as the core input for portfolio variance, mean-variance optimization, risk parity, and VaR.
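To make the assembly concrete, here is a small sketch that builds Σ from a correlation matrix and a volatility vector, then evaluates portfolio variance w'Σw. The function names and the two-asset numbers are illustrative assumptions.

```python
import math

def covariance_from_correlation(corr, vols):
    """Sigma_ij = rho_ij * sigma_i * sigma_j for every pair (i, j)."""
    n = len(vols)
    return [[corr[i][j] * vols[i] * vols[j] for j in range(n)] for i in range(n)]

def portfolio_variance(weights, cov):
    """Quadratic form w' Sigma w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))

# Hypothetical two-asset example: 20% and 10% volatility, correlation 0.5
corr = [[1.0, 0.5],
        [0.5, 1.0]]
vols = [0.20, 0.10]
weights = [0.5, 0.5]

cov = covariance_from_correlation(corr, vols)
variance = portfolio_variance(weights, cov)   # 0.0175 up to rounding
volatility = math.sqrt(variance)
```

By hand: 0.25·0.04 + 2·0.25·0.01 + 0.25·0.01 = 0.0175, so the 50/50 portfolio runs at about 13.2% volatility, below the 15% weighted average of the two individual volatilities.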

Correlation Breakdown in Stress

Empirically, diversification weakens precisely when needed. In normal regimes, equity pairs might run 0.3–0.5 correlation; in panics they cluster above 0.8.

Regime      | Avg. cross-asset ρ | Diversification benefit
------------|--------------------|------------------------
Calm bull   | 0.2–0.4            | Strong
Range-bound | 0.4–0.6            | Moderate
Sell-off    | 0.7–0.9            | Weak
Crisis      | 0.85+              | Minimal

Implication: size positions on the stressed covariance matrix, not the calm one. A common heuristic: when stress-testing, multiply the off-diagonals by α ≥ 1.0, capping each implied correlation at 1.
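One possible reading of the α heuristic, applied to a correlation matrix, is sketched below. The function name `stress_correlations` and the cap parameter are assumptions; the source gives only the multiplier.

```python
def stress_correlations(corr, alpha, cap=1.0):
    """Multiply off-diagonal correlations by alpha >= 1, clipping to [-cap, cap].

    Assumption: the diagonal stays at 1. The clipped matrix is not guaranteed
    to remain positive semi-definite, so it may need repair before inversion.
    """
    if alpha < 1.0:
        raise ValueError("alpha should be >= 1 for a stress scenario")
    n = len(corr)
    stressed = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(1.0)
            else:
                row.append(max(-cap, min(cap, alpha * corr[i][j])))
        stressed.append(row)
    return stressed

# Hypothetical calm-regime correlations for three assets
calm = [[1.0, 0.4, 0.3],
        [0.4, 1.0, 0.7],
        [0.3, 0.7, 1.0]]

stressed = stress_correlations(calm, alpha=1.5)
# 0.4 -> 0.6, 0.3 -> 0.45, 0.7 -> 1.05 which is clipped to 1.0
```

Note that a plain multiplier also pushes negative correlations further negative, which increases the apparent hedge; whether that is desirable in a stress test is a judgment call outside this sketch.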

Correlation vs. Causation

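A correlated pair is not a stable pair. Spurious correlations multiply in finance because cross-testing thousands of series will yield seemingly significant relationships by chance, and look-ahead, multiple-testing, and in-sample biases compound the problem. For example, 1,000 series form 499,500 pairs, so a 5% significance threshold alone would be expected to flag roughly 25,000 unrelated pairs.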

A "stable" pair shifts correlation by less than 0.2 across windows — most pairs do not.
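The 0.2 threshold can be turned into a simple check. The sketch below interprets "shifts across windows" as the range (max minus min) of correlations over consecutive non-overlapping windows; that interpretation, the helper names, and the synthetic data are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def window_correlations(x, y, window):
    """Correlation in each consecutive, non-overlapping window."""
    return [pearson(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, window)]

def correlation_shift(x, y, window):
    """Largest minus smallest window correlation."""
    corrs = window_correlations(x, y, window)
    return max(corrs) - min(corrs)

def is_stable_pair(x, y, window, max_shift=0.2):
    """True when the correlation moves by less than max_shift across windows."""
    return correlation_shift(x, y, window) < max_shift

# Synthetic series, not market data
x = [math.sin(0.7 * t) for t in range(120)]
y_stable = [3.0 * v + 1.0 for v in x]                      # same relation throughout
y_flip = [v if t < 40 else -v for t, v in enumerate(x)]    # relation reverses at t = 40

shift_stable = correlation_shift(x, y_stable, window=40)
shift_flip = correlation_shift(x, y_flip, window=40)
```

The flipped pair would show a full-sample correlation that looks merely weak, while the window view reveals that it went from +1 to −1. That is the kind of instability a single point estimate conceals.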

Eigenvalue Decomposition (PCA)

Principal Component Analysis of the correlation matrix reveals dominant risk factors. In equity markets, the first PC typically explains 40–60% of variance and is interpretable as the market factor.
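The share of variance explained by the first PC is the largest eigenvalue divided by the trace (which equals the number of assets for a correlation matrix). The sketch below finds that eigenvalue with power iteration in plain Python; in practice a linear-algebra routine such as `numpy.linalg.eigh` would be the usual choice. Function names and the example matrix are assumptions.

```python
import math

def mat_vec(matrix, vec):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def top_eigenpair(matrix, max_iter=1000, tol=1e-12):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix.

    Assumption: the leading eigenvector is not orthogonal to the equal-weight
    start vector, which holds when correlations are mostly positive.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(max_iter):
        product = mat_vec(matrix, vec)
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            raise ValueError("matrix maps the start vector to zero")
        vec = [p / norm for p in product]
        # Rayleigh quotient v'Av for the unit vector v
        estimate = sum(v * p for v, p in zip(vec, mat_vec(matrix, vec)))
        converged = abs(estimate - eigenvalue) < tol
        eigenvalue = estimate
        if converged:
            break
    return eigenvalue, vec

def first_pc_share(corr):
    """Fraction of total variance explained by the first principal component."""
    eigenvalue, loadings = top_eigenpair(corr)
    return eigenvalue / len(corr), loadings

# Hypothetical 4-asset matrix with every pairwise correlation equal to 0.4
corr = [[1.0 if i == j else 0.4 for j in range(4)] for i in range(4)]
share, loadings = first_pc_share(corr)
# Largest eigenvalue is 1 + 3 * 0.4 = 2.2, so the share is 2.2 / 4 = 0.55
```

In this stylised case the loadings are equal across assets, which is why the first PC reads as a "market" factor: it is close to an equal-weighted basket.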

Shrinkage

Sample correlation matrices are noisy and often ill-conditioned or non-invertible; they are always singular when the number of assets K exceeds the number of observations N. Shrink toward a structured target:

Σ_shrunk = (1 − δ) · Σ_sample + δ · Σ_target

where:

  • Σ_sample: raw sample covariance matrix
  • Σ_target: structured shrinkage target (identity, constant-ρ, or factor)
  • δ ∈ [0, 1]: shrinkage intensity

does: blends a noisy sample estimate toward a low-parameter target. Used before inverting the covariance matrix in optimizers and large-portfolio risk models, stabilizing weights and reducing out-of-sample error.

The Ledoit-Wolf estimator gives an analytical formula for the optimal δ. Common targets:

  • Identity (no correlations)
  • Constant correlation
  • Single-factor (market beta) implied
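The blend itself is a one-liner once a target is chosen. The sketch below builds a constant-correlation target from the sample matrix and shrinks toward it with a hand-picked δ. It does not implement the Ledoit-Wolf estimate of δ; the function names and numbers are assumptions.

```python
import math

def constant_correlation_target(cov):
    """Target that keeps sample variances and uses the average correlation everywhere."""
    n = len(cov)
    vols = [math.sqrt(cov[i][i]) for i in range(n)]
    pair_corrs = [cov[i][j] / (vols[i] * vols[j])
                  for i in range(n) for j in range(i + 1, n)]
    avg_corr = sum(pair_corrs) / len(pair_corrs)
    return [[cov[i][i] if i == j else avg_corr * vols[i] * vols[j]
             for j in range(n)] for i in range(n)]

def shrink(sample, target, delta):
    """Sigma_shrunk = (1 - delta) * Sigma_sample + delta * Sigma_target."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(sample)
    return [[(1.0 - delta) * sample[i][j] + delta * target[i][j]
             for j in range(n)] for i in range(n)]

# Hypothetical sample covariance: all vols 20%, correlations 0.2, 0.6, 0.4
sample = [[0.040, 0.008, 0.024],
          [0.008, 0.040, 0.016],
          [0.024, 0.016, 0.040]]

target = constant_correlation_target(sample)   # off-diagonals all 0.4 * 0.04 = 0.016
shrunk = shrink(sample, target, delta=0.5)     # delta chosen by hand for illustration
```

Halfway shrinkage pulls the 0.2 and 0.6 correlations to 0.3 and 0.5 while leaving the variances untouched. For an estimated δ, scikit-learn's `sklearn.covariance.LedoitWolf` is one existing implementation, though it shrinks toward a scaled identity rather than a constant-correlation target.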

Diversification Metrics

Metric                   | Formula                            | Interpretation
-------------------------|------------------------------------|-----------------------------------------------
Average correlation      | mean of off-diagonals              | Concentration in one factor
Effective N              | (Σ w)² / Σ w² (inverse Herfindahl) | Equivalent number of equally weighted positions
Diversification ratio    | (Σ w·σ) / σ_portfolio              | > 1 ⇒ benefit from combining
Number of effective bets | function of cov eigenvalues        | Concentration in PCs
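The first three metrics in the table can be sketched in a few lines. Function names and example inputs are assumptions; the effective-bets metric is omitted because the table does not pin down its formula.

```python
import math

def average_correlation(corr):
    """Mean of the off-diagonal entries of a correlation matrix."""
    n = len(corr)
    off_diagonal = [corr[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(off_diagonal) / len(off_diagonal)

def effective_n(weights):
    """(sum w)^2 / sum w^2: inverse Herfindahl index of the weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

def diversification_ratio(weights, vols, corr):
    """Weighted-average volatility divided by portfolio volatility."""
    n = len(weights)
    weighted_vol = sum(w * s for w, s in zip(weights, vols))
    variance = sum(weights[i] * vols[i] * corr[i][j] * weights[j] * vols[j]
                   for i in range(n) for j in range(n))
    return weighted_vol / math.sqrt(variance)

# Hypothetical inputs
n_equal = effective_n([0.25, 0.25, 0.25, 0.25])        # 4: equal weights
n_skewed = effective_n([0.5, 0.3, 0.2])                # 1 / 0.38, about 2.63

uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
lockstep = [[1.0, 1.0], [1.0, 1.0]]
dr_uncorrelated = diversification_ratio([0.5, 0.5], [0.2, 0.2], uncorrelated)  # sqrt(2)
dr_lockstep = diversification_ratio([0.5, 0.5], [0.2, 0.2], lockstep)          # 1: no benefit
```

Effective N looks only at weights, so it can report "4 positions" for four perfectly correlated assets; the diversification ratio is the metric that reacts to correlation.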

Application to Position Sizing

For a basket of correlated positions, the naive sum of individual risks overstates total risk, while sizing each position as if it were independent understates it. A common adjustment scales per-position risk by the correlation drag:

Effective risk per position = σ / √(1 + (n − 1) · ρ̄)

where:

  • σ: individual-position volatility
  • n: number of positions in the basket
  • ρ̄: average pairwise correlation across the basket

does: gives the correlation-adjusted risk to allocate to each position. With n equal-sized positions, basket volatility is σ · √(n · (1 + (n − 1) · ρ̄)), so dividing each position by √(1 + (n − 1) · ρ̄) keeps basket volatility at σ · √n, the level of n uncorrelated positions. Used to scale notional sizes when adding correlated positions so total portfolio volatility stays inside its limit.

Higher average correlation ⇒ less diversification ⇒ scale positions down.
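One way to read the formula above: with n equal-sized positions of volatility σ and average pairwise correlation ρ̄, basket volatility is σ · √(n · (1 + (n − 1) · ρ̄)), so scaling every position by 1 / √(1 + (n − 1) · ρ̄) brings it back to σ · √n, the level of an uncorrelated basket. The sketch below checks that under those equal-size, equal-volatility assumptions; the function names are hypothetical.

```python
import math

def correlation_scale(n, avg_corr):
    """Size multiplier 1 / sqrt(1 + (n - 1) * rho_bar)."""
    drag = 1.0 + (n - 1) * avg_corr
    if drag <= 0.0:
        raise ValueError("1 + (n - 1) * rho_bar must be positive")
    return 1.0 / math.sqrt(drag)

def basket_vol(n, sigma, avg_corr, size=1.0):
    """Volatility of n equal positions, each of the given size and volatility sigma."""
    return size * sigma * math.sqrt(n * (1.0 + (n - 1) * avg_corr))

# Hypothetical basket: 10 positions, 2% volatility each, average correlation 0.5
n, sigma, rho_bar = 10, 0.02, 0.5

scale = correlation_scale(n, rho_bar)                # 1 / sqrt(5.5), about 0.43
unscaled = basket_vol(n, sigma, rho_bar)             # what full-size positions produce
scaled = basket_vol(n, sigma, rho_bar, size=scale)   # after applying the multiplier
independent = basket_vol(n, sigma, 0.0)              # sigma * sqrt(n)
```

Here each position is cut to roughly 43% of its stand-alone size, and the scaled basket lands on the same volatility as ten independent positions would.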

Practical Guidelines

  1. Always look at rolling correlations. A single point estimate hides regime changes.
  2. Stress-test with elevated correlations. Run "what if all pairs go to 0.9" scenarios on every portfolio.
  3. Shrink before inverting. Mean-variance optimizers on raw covariance produce extreme weights.
  4. Track correlation to your benchmark. Quietly drifting from "uncorrelated alpha" to "factor exposure" is a common failure mode.
  5. Use rank or distance metrics for non-linear pairs. Pearson misses tail co-dependence.
  6. Validate signal stability. A pair that "worked" in-sample with high ρ may have a much lower out-of-sample ρ.
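Guideline 2 can be sketched as a comparison of portfolio volatility under the current correlations and under an "all pairs at 0.9" scenario. The portfolio below is hypothetical and uses a uniform calm correlation of 0.3 purely for illustration; in practice the calm matrix would be the estimated one.

```python
import math

def uniform_correlation(n, rho):
    """Correlation matrix with every off-diagonal entry equal to rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def portfolio_vol(weights, vols, corr):
    """Portfolio volatility from weights, volatilities, and correlations."""
    n = len(weights)
    variance = sum(weights[i] * vols[i] * corr[i][j] * weights[j] * vols[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)

# Hypothetical equal-weight portfolio of four assets at 20% volatility each
weights = [0.25] * 4
vols = [0.20] * 4

calm_vol = portfolio_vol(weights, vols, uniform_correlation(4, 0.3))
stressed_vol = portfolio_vol(weights, vols, uniform_correlation(4, 0.9))
vol_increase = stressed_vol / calm_vol - 1.0   # relative rise in volatility
```

With these inputs the variance goes from 0.019 to 0.037, so volatility rises by roughly 40% with no change in positions. A limit that was comfortable in the calm regime may be breached purely through the correlation move.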

Common Pitfalls

Pitfall                             | Effect
------------------------------------|---------------------------------------
Computing ρ over too short a window | High variance, false signals
Computing ρ over too long a window  | Misses regime change
Treating ρ as constant              | Underestimates crash risk
Ignoring tail co-movement           | Linear ρ silent on dependence in tails
Inverting raw sample Σ              | Unstable optimization weights

References:

  1. Ledoit & Wolf, "Honey, I Shrunk the Sample Covariance Matrix", Journal of Portfolio Management, 2004.
  2. Longin & Solnik, "Extreme Correlation of International Equity Markets", Journal of Finance, 2001.
  3. Embrechts et al., "Correlation and Dependence in Risk Management: Properties and Pitfalls", 2002.

Next Steps