Time Series Analysis¶

Difficulty: beginner

Overview¶

Financial data is inherently sequential: each observation is tied to a point in time, and the ordering itself carries information. Time series analysis provides tools to model, forecast, and extract signals from such data.

Components of a Time Series¶

Y_t = Trend_t + Seasonality_t + Cycle_t + Noise_t

where: Y_t observed value at time t · Trend_t slow-moving level · Seasonality_t fixed-period repetition · Cycle_t longer, irregular oscillation · Noise_t zero-mean random residual. does: classical additive decomposition. The conceptual map for what every time-series technique below is trying to isolate or remove.

Component	Description	Example
Trend	Long-term direction	Secular bull market
Seasonality	Regular periodic patterns	January effect, end-of-month flows
Cycle	Irregular multi-period patterns	Business cycles
Noise	Random fluctuations	Daily price movement
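As a rough illustration of the additive decomposition, the sketch below separates a synthetic monthly series into trend, seasonal, and residual parts using a centered moving average. The helper name `decompose_additive`, the period of 12, and the made-up data are assumptions for the example. The cycle component is not separated here; it ends up in the trend or the residual.

```python
import numpy as np

def decompose_additive(y, period):
    """Split y into trend, seasonal, and residual parts under an additive model."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Centered moving average spanning one full period; even periods get half-weight ends.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # the edges have no full window
    trend[half:n - half] = np.convolve(y, weights, mode="valid")
    # Seasonal index: average detrended value at each position in the cycle.
    detrended = y - trend
    index = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    index -= index.mean()  # seasonal effects sum to zero over one period
    seasonal = np.tile(index, n // period + 1)[:n]
    return trend, seasonal, y - trend - seasonal

# Synthetic example: linear trend + 12-period seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(120)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.standard_normal(120)
trend, seasonal, residual = decompose_additive(y, period=12)
```

The residual is undefined (NaN) for the first and last half-period, where the moving average has no full window.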

Stationarity¶

Definition¶

A time series is (weakly) stationary if:

1. Mean is constant over time: E[Y_t] = μ
2. Variance is constant over time: Var(Y_t) = σ²
3. Autocovariance depends only on lag, not time: Cov(Y_t, Y_{t-k}) = γ_k

most price levels are non-stationary · most returns are approximately stationary · always test (ADF, KPSS) before modelling

Why Stationarity Matters¶

Most statistical methods assume stationarity. Non-stationary data leads to:

- Spurious regression
- Invalid hypothesis tests
- Unreliable forecasts

Testing for Stationarity¶

Augmented Dickey-Fuller (ADF) Test:

H₀: Series has unit root (non-stationary)
H₁: Series is stationary

ADF test statistic compared to critical values
More negative = more likely stationary

where: H₀ null hypothesis · H₁ alternative hypothesis · "unit root" means a stochastic trend (a coefficient of 1 on the lagged level in an AR(1) regression). does: tests whether a series needs differencing to be stationary. Rejecting H₀ (a statistic more negative than the critical value, a small p-value) is evidence that the series is stationary as-is; failing to reject does not prove a unit root, which is why ADF is usually paired with KPSS.
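To make the test statistic concrete, here is a minimal sketch of the simplest Dickey-Fuller regression, with a constant and no lagged differences (so it is not the full augmented test). The helper name `dickey_fuller_stat` and the simulated series are assumptions for illustration. The value -2.86 is the approximate asymptotic 5% critical value for the constant-only case. In practice a library routine such as `adfuller` in statsmodels would normally be used instead.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on gamma in: dy_t = a + gamma * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

CRIT_5PCT = -2.86  # approximate asymptotic value; not a standard t critical value

rng = np.random.default_rng(0)
white_noise = rng.standard_normal(500)             # stationary by construction
random_walk = np.cumsum(rng.standard_normal(500))  # unit root by construction

for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    stat = dickey_fuller_stat(series)
    print(f"{name}: stat={stat:.2f}, reject unit root at 5%: {stat < CRIT_5PCT}")
```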

Making a Series Stationary¶

Differencing:

ΔY_t = Y_t - Y_{t-1}

For log prices: r_t = ln(P_t) - ln(P_{t-1}) ≈ (P_t - P_{t-1}) / P_{t-1}

where: ΔY_t first difference · Y_t series at time t · for prices, differencing log-prices yields log-returns — the standard transformation in finance. does: removes a stochastic trend (unit root). One round of differencing kills linear drift; a second round handles quadratic drift but is rarely needed for financial data.
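A small sketch of the log-return transformation, using made-up prices. It shows that differencing log prices gives log returns, that these are close to simple returns when moves are small, and that log returns add up across time.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0, 103.5])  # illustrative prices only

log_returns = np.diff(np.log(prices))            # r_t = ln(P_t) - ln(P_{t-1})
simple_returns = prices[1:] / prices[:-1] - 1.0  # (P_t - P_{t-1}) / P_{t-1}

# Log returns are additive: their sum equals the log of the total price ratio.
total_log_return = log_returns.sum()
```

One observation is lost per round of differencing, so four returns come out of five prices.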

Detrending:

Y_t' = Y_t - Trend_t

where: Trend_t deterministic trend component (e.g. linear fit, moving average, HP filter) · Y_t' detrended residual. does: removes a deterministic trend instead of a stochastic one. Choose differencing for unit-root series, detrending for trend-stationary series — the wrong choice over- or under-corrects.
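For the deterministic case, a minimal detrending sketch with a straight-line fit is shown below. The helper name `linear_detrend` is hypothetical, and a linear trend is only one of the options listed above.

```python
import numpy as np

def linear_detrend(y):
    """Return (residual, trend) after removing a least-squares straight line."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)  # highest power first
    trend = intercept + slope * t
    return y - trend, trend

rng = np.random.default_rng(0)
t = np.arange(200)
y = 3.0 + 0.2 * t + rng.standard_normal(200)  # trend-stationary by construction
residual, trend = linear_detrend(y)
```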

Transformation:

Log: Y_t' = ln(Y_t)
Box-Cox: Y_t' = (Y_t^λ - 1) / λ

where: ln natural log · λ Box-Cox shape parameter (λ=0 is defined as its limit, the log; λ=1 gives identity-minus-1) · both require Y_t > 0. does: variance-stabilizing transforms. Log turns multiplicative variation into additive form; Box-Cox tunes the compression strength through λ, which is usually estimated by maximum likelihood.
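The Box-Cox formula can be sketched directly. The function name `box_cox` is an assumption, and λ is passed in by hand rather than estimated; a routine such as `scipy.stats.boxcox` can pick λ from the data.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform; lam == 0 is treated as the log limit."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam

y = np.array([1.0, 2.0, 4.0])   # illustrative positive values
as_log = box_cox(y, 0)          # same as np.log(y)
as_shift = box_cox(y, 1)        # y - 1
as_sqrt = box_cox(y, 0.5)       # 2 * (sqrt(y) - 1)
```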

Autocorrelation¶

Autocorrelation Function (ACF)¶

ρ(k) = Cov(Y_t, Y_{t-k}) / Var(Y_t)

Measures correlation between Y_t and Y_{t-k}

where: ρ(k) autocorrelation at lag k · Y_t series at time t · Y_{t-k} series k periods earlier · ratio normalizes covariance by variance. does: measures linear dependence between the series and its own past. ACF dying off quickly → little memory; persistent ACF → long memory (or non-stationarity).
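A sample version of ρ(k) can be computed in a few lines. This is a sketch: the helper name `sample_acf` and the simulated AR(1) series with coefficient 0.8 are assumptions chosen so that the expected pattern (geometric decay, roughly 0.8^k) is known in advance.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = d @ d
    return np.array([1.0] + [(d[k:] @ d[:-k]) / denom for k in range(1, max_lag + 1)])

# Simulated AR(1): y_t = 0.8 * y_{t-1} + eps_t
rng = np.random.default_rng(1)
n, phi = 5000, 0.8
eps = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

acf = sample_acf(y, max_lag=5)
```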

Partial Autocorrelation Function (PACF)¶

Measures correlation between Y_t and Y_{t-k} after removing
the effect of intermediate lags (Y_{t-1}, ..., Y_{t-k+1})

where: PACF at lag k = regression coefficient on Y_{t-k} in an AR(k) fit · differs from ACF in that it nets out the influence of all shorter lags. does: isolates the direct dependence at lag k. The classic AR-order-selection tool: PACF that cuts off after lag p signals AR(p).
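Following the regression definition above, the PACF at lag k can be read off as the last coefficient of an OLS AR(k) fit. The sketch below does this with a hypothetical helper `sample_pacf` on a simulated AR(1) series. Library implementations often use other estimators (for example Yule-Walker), so values may differ slightly.

```python
import numpy as np

def sample_pacf(y, max_lag):
    """PACF at lag k = last OLS coefficient in an AR(k) regression with intercept."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    out = []
    for k in range(1, max_lag + 1):
        # Columns: intercept, y_{t-1}, ..., y_{t-k}, for t = k..n-1
        cols = [np.ones(n - k)] + [y[k - j:n - j] for j in range(1, k + 1)]
        coef, *_ = np.linalg.lstsq(np.column_stack(cols), y[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

# Simulated AR(1) with coefficient 0.8: PACF should cut off after lag 1.
rng = np.random.default_rng(1)
n, phi = 5000, 0.8
eps = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

pacf = sample_pacf(y, max_lag=4)  # lags 1..4
```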

Interpretation¶

Pattern	ACF	PACF	Model
AR(p)	Exponential decay	Cuts off after lag p	AR(p)
MA(q)	Cuts off after lag q	Exponential decay	MA(q)
ARMA(p,q)	Exponential decay	Exponential decay	ARMA(p,q)
Non-stationary	Slow decay	—	Difference first

ARIMA Models¶

AR(p) — Autoregressive¶

Y_t = c + φ₁Y_{t-1} + φ₂Y_{t-2} + ... + φₚY_{t-p} + ε_t

Current value depends on p past values

where: c constant · φ_i AR coefficient on lag i · ε_t white-noise innovation · p AR order. does: models Y_t as a linear combination of its own past values plus noise. Stationary if all roots of the lag polynomial 1 - φ₁z - ... - φₚzᵖ lie outside the unit circle (|φ₁| < 1 for AR(1)).
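The root condition can be checked numerically. The sketch below uses a hypothetical helper, `ar_is_stationary`, that builds the lag polynomial 1 - φ₁z - ... - φₚzᵖ and tests whether every root has modulus above one. The coefficient values are made up for illustration.

```python
import numpy as np

def ar_is_stationary(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects coefficients from the highest power down to the constant.
    poly = np.r_[-phi[::-1], 1.0]
    roots = np.roots(poly)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.5]))        # AR(1), |phi| < 1
print(ar_is_stationary([1.0]))        # random walk: root exactly on the unit circle
print(ar_is_stationary([-1.5]))       # AR(1) with phi < -1
print(ar_is_stationary([1.2, -0.5]))  # AR(2) with complex roots
print(ar_is_stationary([0.5, 0.6]))   # AR(2) whose coefficients sum above 1
```

The φ = -1.5 case shows why the AR(1) condition is on the absolute value of the coefficient rather than on the coefficient itself.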

MA(q) — Moving Average¶

Y_t = μ + ε_t + θ₁ε_{t-1} + θ₂ε_{t-2} + ... + θqε_{t-q}

Current value depends on q past error terms

where: μ mean of Y · θ_i MA coefficient on lag-i innovation · ε_t white noise · q MA order. does: models the series as a linear combination of current and past noise (not past Y values). A finite-order MA is always stationary regardless of the θ values; the MA-side counterpart of the AR stationarity condition is invertibility.
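A quick simulation illustrates the MA signature. For an MA(1), the lag-1 autocorrelation is θ₁ / (1 + θ₁²) and all higher lags are zero. The parameter values below are arbitrary, and `autocorr` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, theta = 20000, 0.0, 0.5

eps = rng.standard_normal(n + 1)
y = mu + eps[1:] + theta * eps[:-1]  # MA(1): Y_t = mu + eps_t + theta * eps_{t-1}

def autocorr(x, k):
    """Sample autocorrelation of x at lag k >= 1."""
    d = x - x.mean()
    return (d[k:] @ d[:-k]) / (d @ d)

rho1_theory = theta / (1 + theta**2)  # 0.4 for theta = 0.5
rho1_sample = autocorr(y, 1)
rho2_sample = autocorr(y, 2)          # should be near zero: the ACF cuts off after lag 1
```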

ARMA(p,q)¶

Y_t = c + φ₁Y_{t-1} + ... + φₚY_{t-p} + ε_t + θ₁ε_{t-1} + ... + θqε_{t-q}

where: φ_i AR coefficients on past Y · θ_j MA coefficients on past ε · ε_t white-noise innovation · c constant. does: combines AR (memory of past levels) and MA (memory of past shocks) in a single parsimonious model. Stationarity is determined by the AR side and invertibility by the MA side; together the two parts fit a wide range of autocorrelation structures.

ARIMA(p,d,q)¶

ARMA applied to d-times differenced series

p = AR order
d = differencing order
q = MA order

where: p AR lag count · d number of differencing operations applied to reach stationarity · q MA lag count · setting d=0 collapses to ARMA(p,q). does: extends ARMA to non-stationary series by integrating differencing into the model spec. The default forecasting workhorse for price levels, yields, and other unit-root series.
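As a simplified illustration of "ARMA on the differenced series", the sketch below fits an ARIMA(1,1,0) by ordinary least squares on first differences, then integrates the forecasts back to levels. Restricting to q=0 is an assumption made to keep the example short, since MA terms need iterative estimation. The helper names are hypothetical, and library fitters typically use maximum likelihood rather than OLS.

```python
import numpy as np

def fit_ar1_on_differences(y):
    """Estimate c and phi in: dy_t = c + phi * dy_{t-1} + e_t."""
    dy = np.diff(np.asarray(y, dtype=float))
    X = np.column_stack([np.ones(len(dy) - 1), dy[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, dy[1:], rcond=None)
    return c, phi

def forecast_levels(y, c, phi, steps):
    """Forecast differences recursively, then cumulate them onto the last level."""
    y = np.asarray(y, dtype=float)
    last_level, last_diff = y[-1], y[-1] - y[-2]
    out = []
    for _ in range(steps):
        last_diff = c + phi * last_diff      # next forecast difference
        last_level = last_level + last_diff  # integrate back to a level
        out.append(last_level)
    return np.array(out)

# Simulated I(1) series whose differences follow an AR(1) with phi = 0.6.
rng = np.random.default_rng(3)
n = 3000
e = rng.standard_normal(n)
dy = np.zeros(n)
for t in range(1, n):
    dy[t] = 0.6 * dy[t - 1] + e[t]
y = np.cumsum(dy)

c_hat, phi_hat = fit_ar1_on_differences(y)
forecast = forecast_levels(y, c_hat, phi_hat, steps=5)
```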

GARCH Models¶

Volatility Clustering¶

Financial returns exhibit volatility clustering:

- Large changes tend to follow large changes
- Small changes tend to follow small changes
- Direction may not be predictable, but magnitude is

ARCH(q)¶

σ²_t = α₀ + α₁ε²_{t-1} + α₂ε²_{t-2} + ... + αqε²_{t-q}

Conditional variance depends on past squared errors

where: σ²_t conditional variance at time t · α₀ baseline variance (> 0) · α_i ≥ 0 weights on past squared innovations · ε_{t-i} past return shocks. does: models volatility as a function of recent shock magnitudes. Captures clustering — large |ε| today predicts elevated variance tomorrow — but typically needs many lags to fit persistence.
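The ARCH recursion is simple enough to evaluate by hand. The sketch below computes the conditional variance for a short, made-up shock sequence with assumed parameters (α₀ = 0.5, α₁ = 0.3, α₂ = 0.1). The helper name `arch_variance` is hypothetical, and no parameter estimation is attempted.

```python
import numpy as np

def arch_variance(eps, alpha0, alphas):
    """sigma2_t = alpha0 + sum_i alphas[i-1] * eps_{t-i}^2, defined for t >= q."""
    eps = np.asarray(eps, dtype=float)
    q = len(alphas)
    sigma2 = np.full(len(eps), np.nan)     # the first q values lack enough history
    for t in range(q, len(eps)):
        past_sq = eps[t - q:t][::-1] ** 2  # eps_{t-1}^2, ..., eps_{t-q}^2
        sigma2[t] = alpha0 + np.dot(alphas, past_sq)
    return sigma2

shocks = np.array([0.1, -0.2, 2.0, -0.1, 0.05])  # one large shock at index 2
sigma2 = arch_variance(shocks, alpha0=0.5, alphas=[0.3, 0.1])
# sigma2[2:] is [0.513, 1.704, 0.903]: variance jumps right after the large shock
```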

GARCH(p,q)¶

σ²_t = α₀ + Σᵢ αᵢε²_{t-i} + Σⱼ βⱼσ²_{t-j}

More parsimonious than ARCH

where: α₀ > 0 baseline term · α_i weights on past squared innovations · β_j weights on past conditional variance · stationarity requires Σα + Σβ < 1 · GARCH(1,1) is the workhorse spec. does: adds lagged conditional variance to ARCH so variance is also driven by its own past. Captures long volatility persistence with just three parameters in the GARCH(1,1) case, which makes it the default vol model in finance.
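A simulation sketch of GARCH(1,1) under assumed parameters (α₀ = 0.1, α₁ = 0.1, β₁ = 0.8, Gaussian innovations). It illustrates two properties implied by the recursion: the unconditional variance is α₀ / (1 - α₁ - β₁), and squared shocks are positively autocorrelated. The function name is hypothetical. Fitting a GARCH model to real returns needs maximum likelihood, typically via a dedicated package.

```python
import numpy as np

def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate shocks eps_t = sigma_t * z_t with GARCH(1,1) conditional variance."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    sigma2 = np.empty(n)
    eps = np.empty(n)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

eps, sigma2 = simulate_garch11(50000, alpha0=0.1, alpha1=0.1, beta1=0.8)

unconditional_var = 0.1 / (1.0 - 0.1 - 0.8)  # equals 1.0 for these parameters
sq = eps**2 - np.mean(eps**2)
acf1_squared = (sq[1:] @ sq[:-1]) / (sq @ sq)  # clustering shows up as a positive value
```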

GARCH Variants¶

Model	Feature	Use Case
GARCH	Symmetric	General volatility modeling
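EGARCH	Asymmetric (leverage), models log variance	Negative shocks raise volatility more than positive ones
GJR-GARCH	Asymmetric (threshold term on negative shocks)	Leverage effect within the standard GARCH form
IGARCH	Integrated (Σα + Σβ = 1)	Variance shocks that never die out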
FIGARCH	Fractional	Very long memory

Cointegration¶

Definition¶

Two non-stationary series are cointegrated if a linear combination of them is stationary:

Y_t ~ I(1), X_t ~ I(1)
But: Y_t - βX_t ~ I(0)

where: I(1) integrated of order 1 (needs one difference to be stationary) · I(0) stationary · β cointegrating coefficient estimated by Engle-Granger or Johansen. does: formalizes "the spread between two random-walk-like series is mean-reverting." The mathematical foundation for pairs trading, statistical arbitrage, and many curve trades.
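The first step of the Engle-Granger approach can be sketched on simulated data: regress one series on the other, then inspect the residual spread. The data-generating process (β = 2, unit-variance noise) and the helper `ar1_coef` are assumptions for illustration. A formal test would apply a unit-root test to the spread using Engle-Granger critical values, which differ from the standard Dickey-Fuller ones.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta_true = 2000, 2.0
x = np.cumsum(rng.standard_normal(n))       # I(1): a random walk
y = beta_true * x + rng.standard_normal(n)  # shares x's stochastic trend

# Step 1: OLS of y on x (with intercept) estimates the cointegrating coefficient.
X = np.column_stack([np.ones(n), x])
(intercept, beta_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
spread = y - intercept - beta_hat * x

def ar1_coef(s):
    """Slope from regressing s_t on s_{t-1}; a value near 1 suggests a unit root."""
    d = s - s.mean()
    return (d[1:] @ d[:-1]) / (d[:-1] @ d[:-1])

persistence_x = ar1_coef(x)            # close to 1: the level wanders
persistence_spread = ar1_coef(spread)  # far below 1: the spread mean-reverts
```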

Kalman Filter¶

Purpose¶

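Recursive estimation of hidden states from noisy observations. The estimates are optimal (minimum mean-squared error) when the state-space model is linear with Gaussian noise.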

Applications¶

- Dynamic hedge ratio estimation
- Trend following
- Signal extraction from noise
- Tracking time-varying parameters
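A minimal sketch of the simplest case, a scalar "local level" model in which the hidden state is a random walk observed with noise. The function name, the noise variances q and r, and the simulated data are all assumptions. In applications q and r are usually estimated, and a dynamic hedge ratio would use a regression-style observation equation instead.

```python
import numpy as np

def kalman_local_level(obs, q, r, x0=0.0, p0=1e6):
    """Filter for: state x_t = x_{t-1} + w_t (var q), observation y_t = x_t + v_t (var r)."""
    x, p = x0, p0  # a large p0 means the starting guess carries almost no weight
    estimates = np.empty(len(obs))
    for t, y in enumerate(obs):
        p = p + q            # predict: random-walk state, so uncertainty grows by q
        k = p / (p + r)      # Kalman gain: how much to trust the new observation
        x = x + k * (y - x)  # update the state with the innovation
        p = (1.0 - k) * p    # posterior uncertainty
        estimates[t] = x
    return estimates

# Noisy observations of a constant hidden level.
rng = np.random.default_rng(6)
true_level = 5.0
obs = true_level + rng.standard_normal(500)
est = kalman_local_level(obs, q=1e-4, r=1.0)

mse_raw = np.mean((obs[50:] - true_level) ** 2)
mse_filtered = np.mean((est[50:] - true_level) ** 2)
```

A small q relative to r tells the filter the state moves slowly, so it averages heavily over past observations.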

Spectral Analysis¶

Fourier Transform¶

Decompose time series into frequency components:

X(f) = Σ_t x(t) × e^(-2πift)

where: X(f) complex amplitude at frequency f · x(t) time-domain signal · e^(-2πift) complex sinusoid basis function · i imaginary unit. does: projects a time series onto sines and cosines of every frequency. Used for cycle detection, denoising (low-pass filter), and constructing frequency-domain features for ML.
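Cycle detection with the discrete Fourier transform can be sketched as follows. The series is synthetic (a 20-sample cycle plus noise), and picking the single largest peak is a simplification; real data usually calls for windowing and some care with leakage.

```python
import numpy as np

rng = np.random.default_rng(4)
n, period = 200, 20
t = np.arange(n)
x = np.sin(2 * np.pi * t / period) + 0.5 * rng.standard_normal(n)

spectrum = np.fft.rfft(x - x.mean())  # real-input FFT, non-negative frequencies only
freqs = np.fft.rfftfreq(n, d=1.0)     # in cycles per sample
power = np.abs(spectrum) ** 2

peak = np.argmax(power[1:]) + 1       # skip the zero-frequency bin
dominant_period = 1.0 / freqs[peak]   # in samples
```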

Applications¶

Application	Purpose
Cycle detection	Find periodic patterns
Noise filtering	Remove high-frequency noise
Feature extraction	Frequency-domain features for ML

Practical Guidelines¶

Model Selection Workflow¶

1. Visualize the series
2. Test for stationarity (ADF, KPSS)
3. If non-stationary → Difference or transform
4. Plot ACF and PACF
5. Select candidate models
6. Fit and compare (AIC, BIC)
7. Check residuals (should be white noise)
8. Validate out-of-sample
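Steps 5 and 6 of the workflow can be sketched for pure AR candidates, which can be fitted by OLS. The helper names, the Gaussian AIC/BIC forms (correct up to an additive constant), and the simulated AR(2) data are assumptions. ARMA candidates would need a maximum-likelihood fitter.

```python
import numpy as np

def fit_ar_ols(y, p, max_p):
    """OLS AR(p) fit on a common sample (first max_p points dropped); returns (RSS, n)."""
    y = np.asarray(y, dtype=float)
    target = y[max_p:]
    cols = [np.ones(len(target))] + [y[max_p - j:len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return resid @ resid, len(target)

def information_criteria(rss, n, k):
    """Gaussian AIC and BIC up to an additive constant; k counts fitted coefficients."""
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

# Simulated AR(2): y_t = 0.5 * y_{t-1} - 0.4 * y_{t-2} + e_t
rng = np.random.default_rng(5)
n = 3000
e = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] - 0.4 * y[t - 2] + e[t]

max_p = 4
scores = {p: information_criteria(*fit_ar_ols(y, p, max_p), k=p + 1) for p in range(max_p + 1)}
best_by_bic = min(scores, key=lambda p: scores[p][1])
```

Using the same estimation sample for every candidate keeps the criteria comparable across orders.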

Common Pitfalls¶

Pitfall	Problem	Solution
Ignoring non-stationarity	Spurious results	Always test first
Overfitting	Poor out-of-sample	Use AIC/BIC, cross-validate
Ignoring structural breaks	Model instability	Test for breaks, use rolling models
Assuming linearity	Miss non-linear patterns	Try non-linear models
No out-of-sample test	Unknown true performance	Walk-forward validation
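Walk-forward validation can be sketched as a generator of rolling train/test index blocks. The function name and the window sizes are hypothetical. The key property is that each test block lies strictly after its training block, so no future data leaks into the fit.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_indices, test_indices); the test block always follows the training block."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll the window forward by one test block

splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

An expanding-window variant would keep the start of the training range fixed at zero instead of rolling it forward.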

Key Formulas Reference¶

Returns: r_t = ln(P_t / P_{t-1})
AR(p): Y_t = c + Σᵢ φᵢY_{t-i} + ε_t
MA(q): Y_t = μ + ε_t + Σⱼ θⱼε_{t-j}
GARCH: σ²_t = α₀ + Σαᵢε²_{t-i} + Σβⱼσ²_{t-j}
ACF: ρ(k) = Cov(Y_t, Y_{t-k}) / Var(Y_t)

Next Steps¶

Regression Models — Predictive modeling
Pairs Trading — Cointegration application
Stochastic Calculus — Continuous-time modeling