Correlation Analysis
Difficulty: advanced
Overview
Correlation measures how returns move together. It is a central input to portfolio risk (portfolio variance depends on weights, volatilities, and pairwise correlations) and one of the most underestimated sources of P&L surprises: correlations are unstable, regime-dependent, and tend to rise toward 1 in crashes.
Pearson Correlation
$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
where:
- Cov(X, Y): covariance of return series X and Y
- σ_X: standard deviation of X
- σ_Y: standard deviation of Y

What it does: the scale-free linear-dependence statistic. It is used as the input to mean-variance optimizers and as the headline diversification check, and it is misleading when the joint distribution has fat tails or non-linear dependence.
Range: −1 to +1. Measures linear dependence only.
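As a minimal sketch, Pearson correlation can be computed directly as covariance divided by the product of the two standard deviations and checked against NumPy's built-in `np.corrcoef`. The helper name `pearson` and the synthetic return series are illustrative assumptions, not part of any library.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.cov(x, y, ddof=1)[0, 1]          # sample covariance
    return cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# Synthetic daily returns (assumption: y loads 0.6 on x plus independent noise).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.01, 500)
y = 0.6 * x + rng.normal(0.0, 0.01, 500)

rho = pearson(x, y)
rho_numpy = np.corrcoef(x, y)[0, 1]              # should agree with rho
rho_rescaled = pearson(100.0 * x, y)             # scale-free: unchanged by rescaling
```

Rescaling either series leaves the result unchanged, which is what "scale-free" means here.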
Spearman (Rank) Correlation
Pearson correlation computed on the ranks of the observations. It captures monotonic but non-linear relationships and is more robust to outliers than Pearson on the raw values.
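The sketch below illustrates the "Pearson on the ranks" idea on a strictly monotonic but strongly non-linear relationship. The helpers `ranks` and `spearman` are hypothetical names, and the ranking assumes no tied values (ties would need average ranks, which `scipy.stats.spearmanr` handles).

```python
import numpy as np

def ranks(a):
    """Ranks 1..n of the values in a; assumes there are no ties."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a)
    r = np.empty(len(a), dtype=float)
    r[order] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman correlation as Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = np.exp(3.0 * x)                  # monotonic in x, far from linear

pearson_rho = np.corrcoef(x, y)[0, 1]
spearman_rho = spearman(x, y)        # ranks of x and y coincide
```

Because `y` is an increasing function of `x`, the two rank vectors are identical and the rank correlation is 1, while Pearson on the raw values is pulled well below 1 by the few very large values of `y`.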
Rolling Correlation
Static correlation is a poor proxy when you trade across regimes.
Plot a rolling 60-day correlation of two equities through 2008 or 2020: the rise toward 1 during stress is the defining risk-management lesson.
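A rough sketch of a rolling 60-day correlation on synthetic data follows. The two regimes (correlation 0.3, then 0.9 with higher volatility) are assumptions chosen for illustration, not market data, and `rolling_corr` and `correlated_returns` are hypothetical helpers. With pandas, `Series.rolling(60).corr(other)` is the usual shortcut.

```python
import numpy as np

def rolling_corr(x, y, window=60):
    """Trailing-window Pearson correlation; NaN until a full window exists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(len(x), np.nan)
    for end in range(window, len(x) + 1):
        out[end - 1] = np.corrcoef(x[end - window:end], y[end - window:end])[0, 1]
    return out

def correlated_returns(rho, n, rng, vol=0.01):
    """Draw n days of bivariate normal returns with correlation rho."""
    cov = vol ** 2 * np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

rng = np.random.default_rng(42)
calm = correlated_returns(0.3, 500, rng)               # assumed calm regime
stress = correlated_returns(0.9, 250, rng, vol=0.03)   # assumed stress regime
returns = np.vstack([calm, stress])

roll = rolling_corr(returns[:, 0], returns[:, 1], window=60)
calm_avg = np.nanmean(roll[:500])     # windows ending inside the calm regime
stress_avg = np.nanmean(roll[559:])   # windows lying fully inside the stress regime
full_sample = np.corrcoef(returns[:, 0], returns[:, 1])[0, 1]
```

The single full-sample number blends the two regimes, which is the sense in which a static estimate hides the change that the rolling series makes visible.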
Covariance Matrix
The full portfolio risk object:
$$\Sigma_{ij} = \rho_{ij} \, \sigma_i \, \sigma_j$$
where:
- Σ_ij: covariance between assets i and j
- ρ_ij: correlation between i and j
- σ_i, σ_j: standard deviations of the two assets

What it does: assembles the full covariance matrix from pairwise correlations and individual volatilities. It is the core input for portfolio variance, mean-variance optimization, risk parity, and VaR.
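A small sketch of assembling a covariance matrix from volatilities and correlations, then using it for portfolio variance. The three assets, their volatilities, correlations, and weights are made-up illustrative numbers.

```python
import numpy as np

# Illustrative inputs (assumptions): annualized vols and pairwise correlations.
vols = np.array([0.20, 0.15, 0.10])
corr = np.array([[1.0, 0.5, 0.2],
                 [0.5, 1.0, 0.3],
                 [0.2, 0.3, 1.0]])

# Sigma_ij = rho_ij * sigma_i * sigma_j, built element-wise.
cov = np.outer(vols, vols) * corr

weights = np.array([0.5, 0.3, 0.2])
port_var = weights @ cov @ weights        # w' Sigma w
port_vol = np.sqrt(port_var)
weighted_avg_vol = weights @ vols         # what you would get with all rho = 1
```

With all correlations below 1, `port_vol` comes out below the weighted average of the individual volatilities; the gap is the diversification benefit.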
Correlation Breakdown in Stress
Empirically, diversification weakens precisely when it is needed most. In normal regimes, equity pairs might run at a correlation of 0.3–0.5; in panics they cluster above 0.8.
| Regime | Avg. cross-asset ρ | Diversification benefit |
|---|---|---|
| Calm bull | 0.2–0.4 | Strong |
| Range-bound | 0.4–0.6 | Moderate |
| Sell-off | 0.7–0.9 | Weak |
| Crisis | 0.85+ | Minimal |
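Implication: size positions on the stressed covariance matrix, not the calm one. A common heuristic: multiply the off-diagonal correlations by α > 1 when stress-testing, cap each entry at 1, and check that the stressed matrix is still positive semi-definite.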
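One way to sketch the stress heuristic: scale the off-diagonal correlations by a multiplier, clip them to the valid range, confirm the result is still positive semi-definite, and compare portfolio volatility before and after. The multiplier 1.8, the inputs, and the helper names are illustrative assumptions.

```python
import numpy as np

def stress_correlation(corr, alpha):
    """Scale off-diagonal correlations by alpha, clip to [-1, 1], keep a unit diagonal."""
    stressed = np.clip(np.asarray(corr, dtype=float) * alpha, -1.0, 1.0)
    np.fill_diagonal(stressed, 1.0)
    return stressed

def is_psd(matrix, tol=1e-10):
    """True if the smallest eigenvalue is non-negative up to tol."""
    return bool(np.linalg.eigvalsh(matrix).min() >= -tol)

def portfolio_vol(weights, vols, corr):
    cov = np.outer(vols, vols) * corr
    return float(np.sqrt(weights @ cov @ weights))

vols = np.array([0.20, 0.15, 0.10])
weights = np.array([0.5, 0.3, 0.2])
calm_corr = np.array([[1.0, 0.5, 0.2],
                      [0.5, 1.0, 0.3],
                      [0.2, 0.3, 1.0]])

stressed_corr = stress_correlation(calm_corr, alpha=1.8)
stressed_ok = is_psd(stressed_corr)      # scaling can break PSD, so always check
calm_vol = portfolio_vol(weights, vols, calm_corr)
stressed_vol = portfolio_vol(weights, vols, stressed_corr)
```

Element-wise scaling does not guarantee a valid correlation matrix in general; if `is_psd` fails, a nearest-correlation-matrix repair or a blend toward an all-ones matrix are possible alternatives.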
Correlation vs. Causation
A correlated pair is not necessarily a stable pair. Spurious correlations multiply in finance because cross-testing thousands of series yields seemingly significant relationships by chance (multiple testing), and look-ahead and in-sample biases inflate them further.
As a rule of thumb, a "stable" pair shifts correlation by less than 0.2 across windows; most pairs do not meet that bar.
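The multiple-testing point can be illustrated with a quick simulation: correlate many series that are independent by construction and count how many pairs still look "significant". The sample sizes and the rough cutoff of 2/√n for a two-sided 5% test are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(7)
n_obs, n_series = 60, 200
noise = rng.normal(size=(n_obs, n_series))     # independent series, true rho = 0

corr = np.corrcoef(noise, rowvar=False)        # 200 x 200 sample correlations
pairs = corr[np.triu_indices(n_series, k=1)]   # each distinct pair once

threshold = 2.0 / np.sqrt(n_obs)               # rough 5% cutoff under independence
n_pairs = pairs.size
n_flagged = int((np.abs(pairs) > threshold).sum())
max_abs = float(np.abs(pairs).max())
```

With 19,900 pairs, roughly one in twenty clears the cutoff purely by chance, and the largest sample correlation can look convincing even though every true correlation is zero.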
Eigenvalue Decomposition (PCA)
Principal Component Analysis of the correlation matrix reveals dominant risk factors. In equity markets, the first PC typically explains 40–60% of variance and is interpretable as the market factor.
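A minimal sketch of PCA on a correlation matrix, using a stylized constant-correlation matrix so the answer is known in closed form. The choice of 10 assets and ρ = 0.45 is an assumption for illustration; real matrices would come from estimated returns.

```python
import numpy as np

n_assets, rho = 10, 0.45                       # stylized inputs (assumption)
corr = np.full((n_assets, n_assets), rho)
np.fill_diagonal(corr, 1.0)

eigvals, eigvecs = np.linalg.eigh(corr)        # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]              # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()            # share of variance per component
first_pc_share = explained[0]
first_pc_loadings = eigvecs[:, 0]              # equal loadings on every asset
```

For a constant-correlation matrix the top eigenvalue is 1 + (n − 1)ρ, so the first component explains (1 + (n − 1)ρ)/n of the variance and loads equally on every asset, which is why it reads as a market factor.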
Shrinkage
Sample covariance and correlation matrices are noisy, and they are singular (non-invertible) whenever the number of assets K exceeds the number of observations N. Shrink toward a structured target:
$$\Sigma_{\text{shrunk}} = \delta \, \Sigma_{\text{target}} + (1 - \delta) \, \Sigma_{\text{sample}}$$
where:
- Σ_sample: raw sample covariance matrix
- Σ_target: structured shrinkage target (identity, constant-ρ, or factor)
- δ ∈ [0, 1]: shrinkage intensity

What it does: blends a noisy sample estimate toward a low-parameter target. It is used before inverting the covariance matrix in optimizers and large-portfolio risk models, stabilizing weights and reducing out-of-sample error.
Ledoit-Wolf provides an optimal δ analytically. Common targets:
- Identity (no correlations)
- Constant correlation
- Single-factor (market-beta) implied covariance
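The sketch below shows linear shrinkage toward a scaled-identity target in a case with more assets than observations. The intensity `delta=0.5` is an arbitrary placeholder, not the Ledoit-Wolf optimum, and the helper names are hypothetical; scikit-learn's `sklearn.covariance.LedoitWolf` estimates the intensity from the data.

```python
import numpy as np

def shrink_covariance(sample_cov, target, delta):
    """Linear shrinkage: delta * target + (1 - delta) * sample_cov."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * target + (1.0 - delta) * sample_cov

def scaled_identity_target(sample_cov):
    """Identity target scaled to the average sample variance (no correlations)."""
    k = sample_cov.shape[0]
    return np.eye(k) * np.trace(sample_cov) / k

rng = np.random.default_rng(3)
n_obs, n_assets = 30, 50                       # more assets than observations
returns = rng.normal(0.0, 0.01, size=(n_obs, n_assets))

sample_cov = np.cov(returns, rowvar=False)     # 50 x 50, rank-deficient
target = scaled_identity_target(sample_cov)
shrunk_cov = shrink_covariance(sample_cov, target, delta=0.5)

sample_rank = np.linalg.matrix_rank(sample_cov)
min_eig_shrunk = np.linalg.eigvalsh(shrunk_cov).min()
```

The sample matrix has rank below the number of assets and cannot be inverted, while the shrunk matrix has strictly positive eigenvalues for any δ > 0 with this target.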
Diversification Metrics
| Metric | Formula | Interpretation |
|---|---|---|
| Average correlation | mean of off-diagonals | Concentration in one factor |
| Effective N | (Σ w)² / Σ w² (inverse Herfindahl) | Equivalent number of equally weighted positions; ignores correlation |
| Diversification ratio | (Σ w·σ) / σ_portfolio | > 1 ⇒ benefit from combining |
| Number of effective bets | function of cov eigenvalues, e.g. exponential of the entropy of their normalized shares | Concentration in PCs |
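The table's metrics can be sketched in a few lines. The function names are illustrative, and `effective_bets` uses one possible definition (exponential of the entropy of normalized covariance eigenvalues); other definitions exist.

```python
import numpy as np

def average_correlation(corr):
    """Mean of the off-diagonal entries (upper triangle)."""
    k = corr.shape[0]
    return float(corr[np.triu_indices(k, k=1)].mean())

def effective_n(weights):
    """Inverse Herfindahl: (sum w)^2 / sum w^2. Ignores correlation."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / (w ** 2).sum())

def diversification_ratio(weights, vols, corr):
    """Weighted-average volatility divided by portfolio volatility."""
    cov = np.outer(vols, vols) * corr
    return float((weights @ vols) / np.sqrt(weights @ cov @ weights))

def effective_bets(cov):
    """Entropy-based count of effective bets from covariance eigenvalues (one definition)."""
    eig = np.linalg.eigvalsh(cov)
    p = eig[eig > 0] / eig[eig > 0].sum()
    return float(np.exp(-(p * np.log(p)).sum()))

# Illustrative basket: 4 assets, equal vols, constant correlation 0.25.
vols = np.full(4, 0.20)
corr = np.full((4, 4), 0.25)
np.fill_diagonal(corr, 1.0)
equal_w = np.full(4, 0.25)
skewed_w = np.array([0.7, 0.1, 0.1, 0.1])

avg_rho = average_correlation(corr)
n_equal = effective_n(equal_w)
n_skewed = effective_n(skewed_w)
div_ratio = diversification_ratio(equal_w, vols, corr)
bets = effective_bets(np.outer(vols, vols) * corr)
```

Effective N reacts only to weight concentration, while the diversification ratio and effective bets react to correlation, so the metrics are best read together.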
Application to Position Sizing
For a basket of n equal-size correlated positions, the naive sum of individual risks overstates total risk whenever the average correlation is below 1:
$$\sigma_{\text{basket}} \approx \sigma \sqrt{n + n(n-1)\,\bar{\rho}}$$
where:
- σ: individual-position volatility
- n: number of positions in the basket
- ρ̄: average pairwise correlation across the basket

What it does: approximates basket risk, and hence each position's contribution to portfolio risk, given correlation drag. It is used to scale notional sizes when adding correlated positions so total portfolio volatility stays inside its limit.
At ρ̄ = 1 the expression equals the naive sum nσ; at ρ̄ = 0 it falls to σ√n. Higher average correlation ⇒ less diversification ⇒ scale positions down.
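As a hedged sketch of how average correlation feeds into sizing, assume n equal-size positions that share the same volatility σ and an average pairwise correlation ρ̄. Under that assumption the basket volatility per unit of notional is σ·√(n + n(n−1)ρ̄), and a volatility limit translates into a maximum notional per position. The function names and the numbers are illustrative.

```python
import math

def basket_vol(sigma, n, avg_rho):
    """Vol of n equal-size positions per unit notional each, with common vol sigma."""
    inside = n + n * (n - 1) * avg_rho
    if inside < 0:
        raise ValueError("avg_rho is too negative to be feasible for this n")
    return sigma * math.sqrt(inside)

def max_notional_per_position(vol_limit, sigma, n, avg_rho):
    """Largest equal notional per position that keeps basket vol at or below vol_limit."""
    return vol_limit / basket_vol(sigma, n, avg_rho)

sigma, n, vol_limit = 0.02, 10, 100_000.0      # illustrative: 2% daily vol, currency limit

naive_sum = n * sigma                          # sum of individual risks
vol_uncorrelated = basket_vol(sigma, n, 0.0)   # sigma * sqrt(n)
vol_correlated = basket_vol(sigma, n, 0.5)
vol_perfect = basket_vol(sigma, n, 1.0)        # equals the naive sum

size_low_rho = max_notional_per_position(vol_limit, sigma, n, 0.0)
size_high_rho = max_notional_per_position(vol_limit, sigma, n, 0.5)
```

The allowed notional shrinks as ρ̄ rises, and at ρ̄ = 1 the basket behaves like a single position n times as large.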
Practical Guidelines
- Always look at rolling correlations. A single point estimate hides regime changes.
- Stress-test with elevated correlations. Run "what if all pairs go to 0.9" scenarios on every portfolio.
- Shrink before inverting. Mean-variance optimizers on raw covariance produce extreme weights.
- Track correlation to your benchmark. Quietly drifting from "uncorrelated alpha" to "factor exposure" is a common failure mode.
- Use rank or distance metrics for non-linear pairs. Pearson misses tail co-dependence.
- Validate signal stability. A pair that "worked" in-sample with high ρ may have a much lower out-of-sample ρ.
Common Pitfalls
| Pitfall | Effect |
|---|---|
| Computing ρ over too short a window | High variance, false signals |
| Computing ρ over too long a window | Misses regime change |
| Treating ρ as constant | Underestimates crash risk |
| Ignoring tail co-movement | Linear ρ silent on dependence in tails |
| Inverting raw sample Σ | Unstable optimization weights |
References:
1. Ledoit & Wolf, "Honey, I Shrunk the Sample Covariance Matrix", Journal of Portfolio Management, 2004.
2. Longin & Solnik, "Extreme Correlation of International Equity Markets", Journal of Finance, 2001.
3. Embrechts et al., "Correlation and Dependence in Risk Management: Properties and Pitfalls", 2002.
Next Steps
- Portfolio Optimization — using correlations to construct portfolios
- VaR / CVaR — risk metrics built on the covariance matrix
- Position Sizing — accounting for correlation in trade sizing
- Stress Testing — modeling correlation breakdown