Time Series Analysis

Difficulty: beginner

Overview

Financial data is inherently sequential: observations are ordered in time, and each one typically depends on those before it. Time series analysis provides tools to model, forecast, and extract signals from such data.

Components of a Time Series

Y_t = Trend_t + Seasonality_t + Cycle_t + Noise_t

where: Y_t observed value at time t · Trend_t slow-moving level · Seasonality_t fixed-period repetition · Cycle_t longer, irregular oscillation · Noise_t zero-mean random residual. does: classical additive decomposition. The conceptual map for what every time-series technique below is trying to isolate or remove.

| Component | Description | Example |
| --- | --- | --- |
| Trend | Long-term direction | Secular bull market |
| Seasonality | Regular periodic patterns | January effect, end-of-month flows |
| Cycle | Irregular multi-period patterns | Business cycles |
| Noise | Random fluctuations | Daily price movement |

Stationarity

Definition

A time series is (weakly, or covariance) stationary if:

1. Mean is constant over time: E[Y_t] = μ
2. Variance is constant over time: Var(Y_t) = σ²
3. Autocovariance depends only on lag, not time: Cov(Y_t, Y_{t-k}) = γ_k

[Figure: a stationary series (mean and variance stable around a constant μ) next to a non-stationary series (drifting mean, trend, growing variance).]

Most price levels are non-stationary; most returns are approximately stationary. Always test (ADF, KPSS) before modelling.

Why Stationarity Matters

Most statistical methods assume stationarity. Non-stationary data leads to:

  • Spurious regression
  • Invalid hypothesis tests
  • Unreliable forecasts

Testing for Stationarity

Augmented Dickey-Fuller (ADF) Test:

H₀: Series has unit root (non-stationary)
H₁: Series is stationary

ADF test statistic compared to critical values
More negative = more likely stationary

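where: H₀ null hypothesis · H₁ alternative hypothesis · "unit root" means a stochastic trend (a coefficient of 1 on the lagged level in an AR(1) regression). does: tests whether a series needs differencing to be stationary. Rejecting H₀ (statistic more negative than the critical value, small p-value) is evidence that the series is stationary as-is. Failing to reject does not prove a unit root, which is why ADF is often paired with KPSS, whose null hypothesis is stationarity.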
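
To make the test statistic concrete, here is a minimal sketch of the plain (non-augmented) Dickey-Fuller regression in NumPy. The function name `dickey_fuller_stat`, the toy series, and the rounded 5% critical value of -2.86 (constant-only case, large sample) are assumptions for illustration. In practice a library routine such as `statsmodels.tsa.stattools.adfuller` adds lagged differences and reports p-values.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic on rho in: dy_t = a + rho * y_{t-1} + e_t (no lagged differences)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
random_walk = np.cumsum(shocks)   # has a unit root
white_noise = shocks              # stationary

CRITICAL_5PCT = -2.86  # approximate value, assumed here for the constant-only case
stat_rw = dickey_fuller_stat(random_walk)
stat_wn = dickey_fuller_stat(white_noise)
print(f"random walk: {stat_rw:.2f}, white noise: {stat_wn:.2f}")
```

The white-noise statistic should fall far below the critical value, while the random-walk statistic usually stays above it.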

Making a Series Stationary

Differencing:

ΔY_t = Y_t - Y_{t-1}

For log prices: r_t = ln(P_t) - ln(P_{t-1}) ≈ (P_t - P_{t-1}) / P_{t-1}

where: ΔY_t first difference · Y_t series at time t · for prices, differencing log-prices yields log-returns — the standard transformation in finance. does: removes a stochastic trend (unit root). One round of differencing kills linear drift; a second round handles quadratic drift but is rarely needed for financial data.
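
A short sketch of the log-return transformation, using a made-up price array (the numbers are hypothetical). It shows that log returns are close to simple returns for small moves and that they add up across time.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0])   # hypothetical closing prices

log_returns = np.diff(np.log(prices))             # r_t = ln(P_t) - ln(P_{t-1})
simple_returns = prices[1:] / prices[:-1] - 1.0   # (P_t - P_{t-1}) / P_{t-1}

# Log returns are additive: their sum equals the log of the total price ratio.
total_log_return = log_returns.sum()
print(log_returns, simple_returns, total_log_return)
```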

Detrending:

Y_t' = Y_t - Trend_t

where: Trend_t deterministic trend component (e.g. linear fit, moving average, HP filter) · Y_t' detrended residual. does: removes a deterministic trend instead of a stochastic one. Choose differencing for unit-root series, detrending for trend-stationary series — the wrong choice over- or under-corrects.
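
As an illustration of detrending with a linear fit, the sketch below builds a toy trend-stationary series (the intercept 5.0, slope 0.3, and noise level are assumptions) and subtracts a least-squares line.

```python
import numpy as np

t = np.arange(100, dtype=float)
rng = np.random.default_rng(1)
y = 5.0 + 0.3 * t + rng.standard_normal(100)   # toy trend-stationary series

slope, intercept = np.polyfit(t, y, deg=1)     # least-squares line, highest power first
trend = intercept + slope * t
detrended = y - trend                          # Y_t' = Y_t - Trend_t
print(f"estimated slope: {slope:.3f}")
```

The residual has zero mean by construction. Whether it is actually stationary still needs a test.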

Transformation:

Log: Y_t' = ln(Y_t)
Box-Cox: Y_t' = (Y_t^λ - 1) / λ

where: ln natural log · λ Box-Cox shape parameter (the limit λ→0 gives the log, λ=1 gives identity-minus-1) · both require Y_t > 0. does: variance-stabilizing transforms. Log compresses multiplicative variance into additive form; Box-Cox tunes the compression strength through λ, which is usually chosen by maximum likelihood.
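
The two limiting cases of Box-Cox can be checked directly. This is a minimal sketch with a hand-written `box_cox` helper (a hypothetical name). `scipy.stats.boxcox` offers the same transform and can also estimate λ.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform; falls back to the log when lam is (numerically) zero."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y**lam - 1.0) / lam

y = np.array([1.0, 2.0, 4.0, 8.0])
as_log = box_cox(y, 0.0)      # exact log
near_log = box_cox(y, 1e-6)   # tiny lambda, should be close to the log
shifted = box_cox(y, 1.0)     # identity minus 1
print(as_log, near_log, shifted)
```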

Autocorrelation

Autocorrelation Function (ACF)

ρ(k) = Cov(Y_t, Y_{t-k}) / Var(Y_t)

Measures correlation between Y_t and Y_{t-k}

where: ρ(k) autocorrelation at lag k · Y_t series at time t · Y_{t-k} series k periods earlier · ratio normalizes covariance by variance. does: measures linear dependence between the series and its own past. ACF dying off quickly → little memory; persistent ACF → long memory (or non-stationarity).
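
A minimal sample-ACF sketch follows. The helper name `acf` and the alternating toy series are assumptions. It uses the usual biased estimator, which divides every lag by the lag-0 sum of squares.

```python
import numpy as np

def acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = d @ d
    return np.array([1.0 if k == 0 else (d[k:] @ d[:-k]) / denom
                     for k in range(max_lag + 1)])

alternating = np.array([1.0, -1.0] * 50)   # flips sign every step
rho = acf(alternating, max_lag=2)
print(rho)
```

A series that flips sign every period shows strongly negative autocorrelation at lag 1 and strongly positive autocorrelation at lag 2.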

Partial Autocorrelation Function (PACF)

Measures correlation between Y_t and Y_{t-k} after removing
the effect of intermediate lags (Y_{t-1}, ..., Y_{t-k+1})

where: PACF at lag k = regression coefficient on Y_{t-k} in an AR(k) fit · differs from ACF in that it nets out the influence of all shorter lags. does: isolates the direct dependence at lag k. The classic AR-order-selection tool: PACF that cuts off after lag p signals AR(p).
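
The regression definition above translates almost literally into code. The sketch below (helper name `pacf_ols` and the simulated AR(1) with φ = 0.6 are assumptions) fits an AR(k) by least squares for each k and keeps the last coefficient.

```python
import numpy as np

def pacf_ols(y, max_lag):
    """PACF at lag k = last coefficient of an OLS AR(k) fit, for k = 1..max_lag."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    out = []
    for k in range(1, max_lag + 1):
        # Columns are lags 1..k of y, aligned with the target y[k:].
        X = np.column_stack([y[k - j : len(y) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

rng = np.random.default_rng(2)
phi = 0.6
eps = rng.standard_normal(5000)
y = np.zeros(5000)
for t in range(1, 5000):
    y[t] = phi * y[t - 1] + eps[t]   # simulated AR(1)

partial = pacf_ols(y, max_lag=3)
print(partial)
```

For an AR(1) the first value should sit near φ and the later ones near zero, which is the cut-off pattern used for order selection.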

Interpretation

| Pattern | ACF | PACF | Model |
| --- | --- | --- | --- |
| AR(p) | Exponential decay | Cuts off after lag p | AR(p) |
| MA(q) | Cuts off after lag q | Exponential decay | MA(q) |
| ARMA(p,q) | Exponential decay | Exponential decay | ARMA(p,q) |
| Non-stationary | Slow decay | | Difference first |

ARIMA Models

AR(p) — Autoregressive

Y_t = c + φ₁Y_{t-1} + φ₂Y_{t-2} + ... + φₚY_{t-p} + ε_t

Current value depends on p past values

where: c constant · φ_i AR coefficient on lag i · ε_t white-noise innovation · p AR order. does: models Y_t as a linear combination of its own past values plus noise. Stationary if all roots of the lag polynomial 1 - φ₁z - ... - φₚz^p lie outside the unit circle (for AR(1) this reduces to |φ₁| < 1).
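
The root condition can be checked numerically. This is a sketch under the assumption that the coefficients are given as a plain list `[φ₁, ..., φₚ]`; the helper name `is_stationary_ar` is hypothetical.

```python
import numpy as np

def is_stationary_ar(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects the highest power first: [-phi_p, ..., -phi_1, 1].
    poly = np.concatenate([-phi[::-1], [1.0]])
    roots = np.roots(poly)
    return bool(np.all(np.abs(roots) > 1.0 + 1e-9))

print(is_stationary_ar([0.5]))        # AR(1) with |phi| < 1
print(is_stationary_ar([1.0]))        # random walk, root exactly on the circle
print(is_stationary_ar([0.5, 0.3]))   # AR(2)
print(is_stationary_ar([0.5, 0.6]))   # AR(2) with a root inside the circle
```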

MA(q) — Moving Average

Y_t = μ + ε_t + θ₁ε_{t-1} + θ₂ε_{t-2} + ... + θqε_{t-q}

Current value depends on q past error terms

where: μ mean of Y · θ_i MA coefficient on lag-i innovation · ε_t white noise · q MA order. does: models the series as a linear combination of current and past noise (not past Y values). A finite-order MA is always stationary regardless of the θ values; the θ values instead determine invertibility, the MA counterpart of the AR stationarity condition.
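
Because an MA(q) is a finite sum of white-noise terms, its autocorrelation can be written down exactly. The sketch below (helper name `ma_acf` is an assumption) computes the theoretical ACF and shows the cut-off after lag q.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Theoretical ACF of Y_t = mu + eps_t + theta_1 eps_{t-1} + ... + theta_q eps_{t-q}."""
    psi = np.concatenate([[1.0], np.asarray(theta, dtype=float)])  # weights on eps_t, eps_{t-1}, ...
    gamma0 = psi @ psi
    return np.array([(psi[k:] @ psi[: len(psi) - k]) / gamma0 if k < len(psi) else 0.0
                     for k in range(max_lag + 1)])

rho = ma_acf([0.5], max_lag=3)   # MA(1) with theta_1 = 0.5
print(rho)                        # lag 1 is theta / (1 + theta^2); later lags are zero
```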

ARMA(p,q)

Y_t = c + φ₁Y_{t-1} + ... + φₚY_{t-p} + ε_t + θ₁ε_{t-1} + ... + θqε_{t-q}

where: φ_i AR coefficients on past Y · θ_j MA coefficients on past ε · ε_t white-noise innovation · c constant. does: combines AR (memory of past levels) and MA (memory of past shocks) in a single parsimonious model. Stationary on the AR side, invertible on the MA side — together they fit a wide range of autocorrelation structures.
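
One way to see how the AR and MA parts interact is to feed a single unit shock through the recursion. The sketch below assumes zero pre-sample values; `arma_filter` is a hypothetical helper, not a library function.

```python
import numpy as np

def arma_filter(eps, phi, theta, c=0.0):
    """Run the ARMA recursion on a given innovation sequence (pre-sample values set to 0)."""
    eps = np.asarray(eps, dtype=float)
    y = np.zeros_like(eps)
    for t in range(len(eps)):
        ar = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        y[t] = c + ar + eps[t] + ma
    return y

impulse = np.array([1.0, 0.0, 0.0, 0.0])
response = arma_filter(impulse, phi=[0.5], theta=[0.4])
print(response)
```

With φ₁ = 0.5 and θ₁ = 0.4, the MA term only affects the first lag, after which the response decays geometrically at rate φ₁.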

ARIMA(p,d,q)

ARMA applied to d-times differenced series

p = AR order
d = differencing order
q = MA order

where: p AR lag count · d number of differencing operations applied to reach stationarity · q MA lag count · setting d=0 collapses to ARMA(p,q). does: extends ARMA to non-stationary series by integrating differencing into the model spec. The default forecasting workhorse for price levels, yields, and other unit-root series.
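
A minimal sketch of the "difference, model, integrate back" idea for ARIMA(1,1,0). It fits the AR(1) on first differences by ordinary least squares, which is a simplification; production libraries typically use maximum likelihood. The helper names and the noiseless toy series are assumptions.

```python
import numpy as np

def fit_arima_110(y):
    """OLS fit of dy_t = c + phi * dy_{t-1} + e_t on the first differences of y."""
    dy = np.diff(np.asarray(y, dtype=float))
    X = np.column_stack([np.ones(len(dy) - 1), dy[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, dy[1:], rcond=None)
    return c, phi

def forecast_next_level(y, c, phi):
    """Forecast the next difference, then add it to the last level (undo d=1)."""
    dy_last = y[-1] - y[-2]
    return y[-1] + c + phi * dy_last

# Toy level series whose differences halve every step (exact AR(1), no noise).
dy = 0.5 ** np.arange(10)
y = np.concatenate([[10.0], 10.0 + np.cumsum(dy)])

c, phi = fit_arima_110(y)
next_level = forecast_next_level(y, c, phi)
print(c, phi, next_level)
```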

GARCH Models

Volatility Clustering

Financial returns exhibit volatility clustering:

  • Large changes tend to follow large changes
  • Small changes tend to follow small changes
  • Direction may not be predictable, but magnitude is

ARCH(q)

σ²_t = α₀ + α₁ε²_{t-1} + α₂ε²_{t-2} + ... + αqε²_{t-q}

Conditional variance depends on past squared errors

where: σ²_t conditional variance at time t · α₀ baseline variance (> 0) · α_i ≥ 0 weights on past squared innovations · ε_{t-i} past return shocks. does: models volatility as a function of recent shock magnitudes. Captures clustering — large |ε| today predicts elevated variance tomorrow — but typically needs many lags to fit persistence.
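
The ARCH recursion is simple enough to evaluate by hand. The following sketch (helper name `arch_variance` and the toy shocks are assumptions) computes the conditional variance for given parameters; it does not estimate them.

```python
import numpy as np

def arch_variance(eps, alpha0, alphas):
    """Conditional variance of an ARCH(q) model for a given shock sequence."""
    eps = np.asarray(eps, dtype=float)
    q = len(alphas)
    sigma2 = np.full(len(eps), np.nan)   # first q entries lack enough history
    for t in range(q, len(eps)):
        sigma2[t] = alpha0 + sum(alphas[i] * eps[t - 1 - i] ** 2 for i in range(q))
    return sigma2

sigma2 = arch_variance([1.0, 2.0, 0.0], alpha0=0.1, alphas=[0.5])
print(sigma2)   # the larger shock at t=1 raises the variance at t=2
```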

GARCH(p,q)

σ²_t = α₀ + Σᵢ αᵢε²_{t-i} + Σⱼ βⱼσ²_{t-j}

More parsimonious than ARCH

where: α_i weights on past squared innovations · β_j weights on past conditional variances · stationarity requires Σα + Σβ < 1 · GARCH(1,1) is the workhorse spec. does: adds an autoregressive term in the variance itself to ARCH, so variance is also driven by its own past. Captures long volatility persistence with only a few parameters (three for GARCH(1,1): α₀, α₁, β₁), which makes it the default vol model in finance.
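
To see the persistence that the β term adds, the sketch below runs the GARCH(1,1) variance recursion over a quiet series with one large shock. The parameter values and the choice to start at the unconditional variance α₀ / (1 - α₁ - β₁) are assumptions for illustration.

```python
import numpy as np

def garch11_variance(eps, alpha0, alpha1, beta1):
    """Conditional variance path of a GARCH(1,1) for a given shock sequence."""
    eps = np.asarray(eps, dtype=float)
    long_run = alpha0 / (1.0 - alpha1 - beta1)   # needs alpha1 + beta1 < 1
    sigma2 = np.empty(len(eps))
    sigma2[0] = long_run                          # assumed initialisation
    for t in range(1, len(eps)):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    return sigma2

alpha0, alpha1, beta1 = 0.05, 0.10, 0.85
calm = np.zeros(50)
calm[10] = 3.0                                    # a single large shock
sigma2 = garch11_variance(calm, alpha0, alpha1, beta1)
print(sigma2[9:14])
```

The variance jumps the period after the shock and then decays slowly rather than dropping straight back, which is the clustering behaviour described above.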

GARCH Variants

| Model | Feature | Use Case |
| --- | --- | --- |
| GARCH | Symmetric | General volatility modeling |
| EGARCH | Asymmetric (leverage) | Captures negative shock impact |
| GJR-GARCH | Asymmetric | Similar to EGARCH |
| IGARCH | Persistent | Long memory volatility |
| FIGARCH | Fractional | Very long memory |

Cointegration

Definition

Two non-stationary series are cointegrated if a linear combination of them is stationary:

Y_t ~ I(1), X_t ~ I(1)
But: Y_t - βX_t ~ I(0)

where: I(1) integrated of order 1 (needs one difference to be stationary) · I(0) stationary · β cointegrating coefficient estimated by Engle-Granger or Johansen. does: formalizes "the spread between two random-walk-like series is mean-reverting." The mathematical foundation for pairs trading, statistical arbitrage, and many curve trades.
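
A sketch of the first step of the Engle-Granger approach on simulated data: regress Y on X to estimate β, then form the spread. The true β of 2.0 and the simulated series are assumptions. The second step, a unit-root test on the spread with cointegration-specific critical values, is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = np.cumsum(rng.standard_normal(n))    # I(1) random walk
y = 2.0 * x + rng.standard_normal(n)     # shares x's stochastic trend, true beta = 2

# OLS of y on x with an intercept estimates the cointegrating coefficient.
X = np.column_stack([np.ones(n), x])
(intercept, beta_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
spread = y - beta_hat * x - intercept    # candidate I(0) series

print(f"beta_hat = {beta_hat:.3f}, spread std = {spread.std():.2f}, y std = {y.std():.2f}")
```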

Kalman Filter

Purpose

Estimates hidden states from noisy observations. The estimate is optimal (minimum mean-squared error) when the system is linear and the noise is Gaussian.

Applications

  • Dynamic hedge ratio estimation
  • Trend following
  • Signal extraction from noise
  • Tracking time-varying parameters
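
As a minimal illustration of signal extraction, the sketch below implements a scalar Kalman filter for a local-level model (hidden level follows a random walk, observations add noise). The model choice, the function name `local_level_filter`, and the noise variances `q` and `r` are assumptions, not a prescription.

```python
import numpy as np

def local_level_filter(obs, q, r, x0=0.0, p0=1e6):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, z_t = x_t + v_t."""
    x, p = x0, p0                 # state estimate and its variance (diffuse start)
    estimates = np.empty(len(obs))
    for t, z in enumerate(obs):
        p = p + q                 # predict: uncertainty grows by the state noise variance
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # update with the innovation z - x
        p = (1.0 - k) * p
        estimates[t] = x
    return estimates

rng = np.random.default_rng(4)
true_level = 5.0
obs = true_level + rng.standard_normal(300)   # noisy readings of a constant level
est = local_level_filter(obs, q=1e-4, r=1.0)
print(f"last estimate: {est[-1]:.3f}")
```

A small `q` relative to `r` tells the filter to trust its running estimate over any single observation, so the output is much smoother than the input.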

Spectral Analysis

Fourier Transform

Decompose time series into frequency components:

X(f) = Σ_t x(t) × e^(-2πift)

where: X(f) complex amplitude at frequency f · x(t) time-domain signal · e^(-2πift) complex sinusoid basis function · i imaginary unit. does: projects a time series onto sines and cosines of every frequency. Used for cycle detection, denoising (low-pass filter), and constructing frequency-domain features for ML.
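
A short sketch of cycle detection with NumPy's real FFT. The toy signal, a pure sine with a 20-sample period, is an assumption. Real financial series rarely show such a clean peak.

```python
import numpy as np

n = 200
t = np.arange(n)
x = np.sin(2 * np.pi * t / 20)               # toy signal with a 20-sample cycle

spectrum = np.abs(np.fft.rfft(x - x.mean())) # magnitude at each non-negative frequency
freqs = np.fft.rfftfreq(n, d=1.0)            # frequencies in cycles per sample
peak = np.argmax(spectrum[1:]) + 1           # skip the zero-frequency bin
dominant_period = 1.0 / freqs[peak]
print(f"dominant period: {dominant_period:.1f} samples")
```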

Applications

| Application | Purpose |
| --- | --- |
| Cycle detection | Find periodic patterns |
| Noise filtering | Remove high-frequency noise |
| Feature extraction | Frequency-domain features for ML |

Practical Guidelines

Model Selection Workflow

1. Visualize the series
2. Test for stationarity (ADF, KPSS)
3. If non-stationary → Difference or transform
4. Plot ACF and PACF
5. Select candidate models
6. Fit and compare (AIC, BIC)
7. Check residuals (should be white noise)
8. Validate out-of-sample
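
Steps 5 and 6 can be sketched as a small order-selection loop. The example below compares AR(p) fits by BIC on a simulated AR(2). The helper `ar_bic`, the OLS fitting, and the Gaussian-likelihood form of BIC (n·ln(RSS/n) + k·ln(n)) are simplifying assumptions.

```python
import numpy as np

def ar_bic(y, p, max_p):
    """BIC of an OLS AR(p) fit, using the same sample (t >= max_p) for every p."""
    y = np.asarray(y, dtype=float)
    target = y[max_p:]
    cols = [np.ones(len(target))] + [y[max_p - j : len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coef) ** 2)
    n = len(target)
    return n * np.log(rss / n) + X.shape[1] * np.log(n)

rng = np.random.default_rng(5)
eps = rng.standard_normal(3000)
y = np.zeros(3000)
for t in range(2, 3000):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + eps[t]   # simulated AR(2)

max_p = 5
bics = {p: ar_bic(y, p, max_p) for p in range(max_p + 1)}
best_p = min(bics, key=bics.get)
print(f"BIC-selected order: {best_p}")
```

Holding the estimation sample fixed across candidate orders keeps the criteria comparable.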

Common Pitfalls

| Pitfall | Problem | Solution |
| --- | --- | --- |
| Ignoring non-stationarity | Spurious results | Always test first |
| Overfitting | Poor out-of-sample | Use AIC/BIC, cross-validate |
| Ignoring structural breaks | Model instability | Test for breaks, use rolling models |
| Assuming linearity | Miss non-linear patterns | Try non-linear models |
| No out-of-sample test | Unknown true performance | Walk-forward validation |
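
Walk-forward validation can be sketched as a generator of expanding-window splits in which the test block always lies strictly after the training data. The function name and its parameters are hypothetical.

```python
def walk_forward_splits(n_obs, initial_train, test_size):
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + test_size <= n_obs:
        yield range(0, start), range(start, start + test_size)
        start += test_size

splits = list(walk_forward_splits(n_obs=10, initial_train=4, test_size=2))
for train, test in splits:
    print(f"train 0..{train[-1]}, test {list(test)}")
```

Each model is refit on the training range and scored only on the block that follows it, so no future information leaks into the fit.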

Key Formulas Reference

Returns: r_t = ln(P_t / P_{t-1})
AR(p): Y_t = c + Σᵢ φᵢY_{t-i} + ε_t
MA(q): Y_t = μ + ε_t + Σⱼ θⱼε_{t-j}
GARCH: σ²_t = α₀ + Σαᵢε²_{t-i} + Σβⱼσ²_{t-j}
ACF: ρ(k) = Cov(Y_t, Y_{t-k}) / Var(Y_t)

Next Steps