Kelly Criterion

Difficulty advanced

Overview

The Kelly Criterion determines the optimal fraction of capital to bet when you have an edge. It maximizes long-term geometric growth rate.

The Formula

Discrete Case (Binary Outcomes)

f* = (bp - q) / b

where:
b = Net odds (avg win / avg loss)
p = Probability of winning
q = 1 - p = Probability of losing

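does: the fraction of capital that maximizes expected logarithmic wealth growth for repeated bets with binary outcomes. Equivalent shortcut: f* = edge / odds, where edge = bp − q is the expected profit per unit staked, so f* = p − q/b. A negative f* means there is no edge and the bet should be skipped.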

Continuous Case

f* = μ / σ²

where:
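μ = Expected excess return (over the risk-free rate)
σ² = Variance of returns

does: the Kelly fraction for an approximately normally distributed return stream. The optimal exposure equals expected excess return divided by variance, with μ and σ² measured over the same period: high Sharpe + low variance → bet more; low Sharpe + high variance → bet less. A value above 1 implies leverage.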
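As a minimal sketch of the continuous formula, the snippet below computes f* = μ / σ² either from stated parameters or from a sample of per-period excess returns. The function names and the 8% / 20% figures are hypothetical, chosen only for illustration, and the sample version simply plugs in the sample mean and variance.

```python
from statistics import mean, variance


def continuous_kelly(mu: float, sigma: float) -> float:
    """Kelly fraction f* = mu / sigma**2 for mean excess return mu and volatility sigma."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return mu / sigma**2


def continuous_kelly_from_sample(excess_returns: list[float]) -> float:
    """Estimate f* from per-period excess returns via the sample mean and variance."""
    var = variance(excess_returns)  # unbiased sample variance (n - 1 denominator)
    if var == 0.0:
        raise ValueError("returns have zero variance")
    return mean(excess_returns) / var


# Hypothetical annual figures: 8% excess return, 20% volatility.
# The result is close to 2.0, i.e. roughly 2x leverage at full Kelly.
print(continuous_kelly(mu=0.08, sigma=0.20))
```

Because the estimate divides a noisy mean by a variance, short samples can produce very large values, which is one reason fractional sizing is usually applied on top.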

Example

Win rate: 55%
Avg win: $200
Avg loss: $150

b = 200/150 ≈ 1.333
p = 0.55
q = 0.45

f* = (1.333 × 0.55 - 0.45) / 1.333 = (0.733 - 0.45) / 1.333 = 0.2125

Optimal bet: about 21% of capital (f* = 0.2125)
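The same calculation can be sketched in a few lines of Python. The function name `kelly_fraction` is an illustrative choice, not part of any library; the inputs are the figures from the example above.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Discrete Kelly fraction f* = (b*p - q) / b for win probability p and net odds b."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p
    return (b * p - q) / b


# Win rate 55%, average win $200, average loss $150.
f_star = kelly_fraction(p=0.55, b=200 / 150)
print(f"Full Kelly: {f_star:.4f}")  # Full Kelly: 0.2125
```

A negative return value signals that the estimated edge is negative, in which case the sensible stake is zero.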

Why Not Full Kelly?

Full Kelly has drawbacks:

  1. High Volatility — Equity curve is very volatile
  2. Large Drawdowns — Can experience 50%+ drawdowns
  3. Parameter Uncertainty — Win rate and payoff estimates are noisy
  4. Changing Edge — Edge decays over time

[Figure: expected log growth versus bet size f, in multiples of the Kelly fraction. Marked points: ½× (half-Kelly) ≈ 75% of max, 1× (full Kelly) = max growth, 1½× ≈ 75% of max, 2× ≈ 0 (break-even).]

Caption: Kelly's growth curve is symmetric around f = 1× Kelly. Over-betting beyond f = 2× Kelly produces negative log-growth (bankruptcy almost surely). Half-Kelly captures about 75% of max growth with far lower drawdown variance.

Fractional Kelly

Fraction   Growth Rate    Volatility   Recommended For
100%       Maximum        Very High    Theoretical optimum
50%        ~75% of max    Moderate     Most traders
25%        ~44% of max    Low          Conservative
10%        ~19% of max    Very Low     Beginners
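The growth column can be checked with a short sketch. It assumes the continuous (quadratic) approximation, under which betting a multiple c of full Kelly earns a share 2c − c² of the maximum growth rate while volatility scales linearly with c. The helper name `relative_growth` is hypothetical.

```python
def relative_growth(c: float) -> float:
    """Share of maximum log growth when betting c times full Kelly.

    Assumes the quadratic approximation g(c) / g(1) = 2c - c**2.
    """
    return 2.0 * c - c * c


for c in (1.0, 0.5, 0.25, 0.10):
    print(
        f"{c:.0%} Kelly: {relative_growth(c):.0%} of max growth, "
        f"{c:.0%} of full-Kelly volatility"
    )
```

Under this approximation half-Kelly keeps about 75% of the growth, quarter-Kelly about 44%, and one-tenth Kelly about 19%.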

Growth Rate and Drawdown

Expected Growth Rate

G(f) = p × ln(1 + bf) + q × ln(1 - f)

Maximized at f = Kelly fraction

where:
f = Fraction of capital wagered
p = Probability of winning
q = 1 − p = Probability of losing
b = Net win-to-loss ratio

does: the expected logarithmic growth per bet as a function of bet size. It is used to visualize how growth rises to a peak at the Kelly fraction, falls back to roughly zero near 2× Kelly, and turns negative beyond that; the curve's flatness near the peak is the argument for fractional-Kelly sizing.
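A small sketch evaluating G(f) at several multiples of the Kelly fraction makes the shape concrete. It reuses the win rate and payoff ratio from the earlier example (p = 0.55, b = 200/150); the function name `growth_rate` is an illustrative choice.

```python
import math


def growth_rate(f: float, p: float, b: float) -> float:
    """Expected log growth per bet: G(f) = p*ln(1 + b*f) + q*ln(1 - f)."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1) so that ln(1 - f) is defined")
    q = 1.0 - p
    return p * math.log(1.0 + b * f) + q * math.log(1.0 - f)


p, b = 0.55, 200 / 150
f_star = (b * p - (1.0 - p)) / b  # full Kelly, 0.2125 for these inputs
g_max = growth_rate(f_star, p, b)

for multiple in (0.5, 1.0, 1.5, 2.0, 2.5):
    g = growth_rate(multiple * f_star, p, b)
    print(f"{multiple:.1f}x Kelly: G = {g:+.4f} ({g / g_max:+.0%} of max)")
```

For these inputs, half-Kelly and 1.5× Kelly both land near 75% of the peak, growth is close to zero at 2× Kelly, and it is clearly negative at 2.5× Kelly.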

Expected Drawdown

Expected DD ≈ e^{σ_Kelly × √2} - 1

where:
σ_Kelly = Annualized volatility of the equity curve when betting full Kelly

does: a rough closed-form estimate of the expected maximum drawdown at full Kelly. It is used to translate a stated Kelly fraction into the drawdown investors should expect to live through, and to motivate using a half- or quarter-Kelly fraction instead.
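A complementary way to gauge drawdown risk is sketched below. This is a different, standard result and not the formula above: it assumes wealth follows a continuous-time geometric Brownian motion with the edge known exactly. Under that assumption, when betting c times full Kelly (0 < c < 2), the probability of wealth ever falling to a fraction x of its current level is x^(2/c − 1). The function name is hypothetical.

```python
def prob_of_drawdown(drawdown: float, kelly_multiple: float = 1.0) -> float:
    """Probability of ever losing `drawdown` (e.g. 0.5 for 50%) from the current level.

    Assumes geometric Brownian motion and a known edge; kelly_multiple is the
    bet size as a multiple of full Kelly and must lie strictly between 0 and 2.
    """
    if not 0.0 < drawdown < 1.0:
        raise ValueError("drawdown must be between 0 and 1")
    if not 0.0 < kelly_multiple < 2.0:
        raise ValueError("kelly_multiple must be between 0 and 2")
    remaining = 1.0 - drawdown
    return remaining ** (2.0 / kelly_multiple - 1.0)


for c in (1.0, 0.5, 0.25):
    print(f"{c:.2f}x Kelly: P(50% drawdown) = {prob_of_drawdown(0.5, c):.4f}")
```

In this idealized model a 50% drawdown eventually occurs with probability 1/2 at full Kelly, 1/8 at half-Kelly, and 1/128 at quarter-Kelly. Real strategies with estimated, time-varying edges can fare worse.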

Kelly with Multiple Bets

For multiple simultaneous bets, Kelly generalizes to a vector of weights solved from the joint distribution of returns. Under the multivariate-normal assumption it reduces to a closed form involving the inverse covariance matrix and the vector of excess returns.

f* = Σ⁻¹ · μ

where:
f* = Vector of optimal capital fractions per asset
Σ⁻¹ = Inverse of the return covariance matrix
μ = Vector of expected excess returns over the risk-free rate

does: the multivariate Kelly solution, used to allocate across correlated strategies or assets. The weights are proportional to the unconstrained tangency portfolio, so allocation, leverage, and diversification fall out together. The result is sensitive to noise in Σ⁻¹; shrink the covariance matrix before inverting and bet a fraction (half- or quarter-Kelly) of the resulting weights.
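The following is a minimal pure-Python sketch of the multivariate solution. It solves Σ f = μ directly instead of forming the inverse, and it includes one simple shrinkage choice (scaling the off-diagonal covariances toward zero), which is an assumption of this example and not a method prescribed here. All function names and the two-asset inputs are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def shrink_covariance(cov, intensity):
    """Scale off-diagonal entries by (1 - intensity); the diagonal is unchanged."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    n = len(cov)
    return [
        [cov[i][j] if i == j else (1.0 - intensity) * cov[i][j] for j in range(n)]
        for i in range(n)
    ]


def portfolio_kelly(mu, cov, kelly_fraction=0.5, shrink=0.0):
    """Fractional Kelly weights: kelly_fraction times the solution of cov @ f = mu."""
    full = solve_linear(shrink_covariance(cov, shrink), mu)
    return [kelly_fraction * w for w in full]


# Two hypothetical assets: 6% and 4% excess return, 20% and 10% volatility,
# correlation 0.3 (covariance 0.3 * 0.2 * 0.1 = 0.006).
mu = [0.06, 0.04]
cov = [[0.04, 0.006], [0.006, 0.01]]
print(portfolio_kelly(mu, cov, kelly_fraction=1.0))  # full Kelly weights
print(portfolio_kelly(mu, cov, kelly_fraction=0.5, shrink=0.5))
```

With these inputs the full-Kelly weights are roughly 0.99 and 3.41, which shows how quickly unconstrained Kelly implies leverage when a low-volatility asset has a positive edge.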

Estimating Parameters

Shrinkage Estimator

Shrink Kelly toward zero to account for estimation error. Parameter uncertainty makes the raw Kelly fraction systematically optimistic — the higher the estimation variance relative to the squared edge, the more the bet should be pulled toward zero.

f_shrunk = f_Kelly × edge² / (edge² + Var(edge))

where:
f_Kelly = Raw Kelly fraction from point estimates
edge = Estimated mean excess return per bet
Var(edge) = Sampling variance of the edge estimate

does: a Bayes-shrinkage analogue applied to Kelly, used when win-rate or payoff parameters are estimated from limited history. It reduces sizing in proportion to how noisy the edge estimate is, recovering full Kelly only when Var(edge) → 0.
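As a sketch, the shrinkage formula translates directly into a small function. The name `shrunk_kelly` and the sample inputs are hypothetical; in practice Var(edge) would come from something like the squared standard error of the estimated edge.

```python
def shrunk_kelly(f_kelly: float, edge: float, var_edge: float) -> float:
    """Shrink a raw Kelly fraction: f_kelly * edge**2 / (edge**2 + var_edge)."""
    if var_edge < 0.0:
        raise ValueError("var_edge must be non-negative")
    denom = edge**2 + var_edge
    if denom == 0.0:
        return 0.0  # no edge and no uncertainty: nothing to bet
    return f_kelly * edge**2 / denom


# Hypothetical inputs: raw Kelly 20%, estimated edge 0.10 with standard error 0.10.
# The squared edge equals its sampling variance, so the stake is cut about in half.
print(shrunk_kelly(f_kelly=0.20, edge=0.10, var_edge=0.10**2))
```

When the standard error is as large as the edge itself, the multiplier is 0.5, so this rule lands on half-Kelly without any separate fractional adjustment.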

Practical Guidelines

  1. Use Half Kelly at Most — Full Kelly is too volatile
  2. Uncertainty Adjustment — Reduce Kelly when parameters are uncertain
  3. Recalculate Regularly — Edge changes; update parameters
  4. Cap Position Size — Never exceed 25% in any single position
  5. Portfolio Kelly — Consider correlations across positions
  6. Track Estimates — Compare predicted vs. actual win rate/payoff
  7. When in Doubt — Use less; survival > optimization
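Several of these guidelines can be combined into one sizing helper. The sketch below applies a fractional-Kelly multiplier, floors negative estimates at zero, and enforces the 25% cap from the list above. The function name and default values are illustrative assumptions, not a prescribed implementation.

```python
def position_size(
    capital: float,
    f_kelly: float,
    kelly_fraction: float = 0.5,
    max_fraction: float = 0.25,
) -> float:
    """Dollar stake from a raw Kelly estimate, scaled down and capped.

    kelly_fraction: multiplier on full Kelly (0.5 = half-Kelly).
    max_fraction:   hard cap on the share of capital in any single position.
    """
    if capital < 0.0:
        raise ValueError("capital must be non-negative")
    fraction = max(0.0, f_kelly) * kelly_fraction  # never bet on a negative edge
    fraction = min(fraction, max_fraction)         # never exceed the cap
    return capital * fraction


print(position_size(100_000, 0.2125))  # half-Kelly on the earlier example
print(position_size(100_000, 0.90))    # capped at 25% of capital
print(position_size(100_000, -0.05))   # negative edge: no position
```

For the earlier example (full Kelly of 0.2125), half-Kelly on $100,000 comes to a stake of about $10,625.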

Key Insight

Kelly maximizes long-term growth, but the growth rate curve is flat near the optimum:

Growth Rate
    │     *
    │    ***
    │   *****     ← Flat near optimum
    │  *******
    │ ***********
    │*************
    └───────────────────
     0   f*/2  f*  2f*
     Fraction of Kelly

Half Kelly gives ~75% of growth with much less volatility.

Q&A

Why does Kelly maximize log wealth instead of expected wealth?

Expected wealth is dominated by tail outcomes: a strategy that occasionally returns 1000× has high expected wealth even if it usually goes to zero. Log wealth (geometric growth) accounts for compounding. For example, a series of equally frequent 50% gains and 50% losses has an arithmetic average return of zero, yet each win-loss pair multiplies wealth by 1.5 × 0.5 = 0.75, so the compounded outcome trends to zero. Kelly optimizes the realistic long-run outcome of repeated betting.
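The gap between arithmetic and geometric outcomes can be shown with a short calculation. This sketch assumes equally likely +50% and −50% outcomes with the full bankroll staked each time; the variable names are illustrative.

```python
import math

outcomes = [+0.50, -0.50]  # equally likely returns per bet

# Arithmetic mean: the average return of a single bet.
arithmetic_mean = sum(outcomes) / len(outcomes)

# Expected log growth per bet: the quantity Kelly maximizes.
log_growth = sum(math.log(1.0 + r) for r in outcomes) / len(outcomes)

# Typical wealth multiple after 20 bets (10 wins and 10 losses).
wealth_after_20 = math.exp(20 * log_growth)

print(f"arithmetic mean return: {arithmetic_mean:+.2%}")
print(f"log growth per bet:     {log_growth:+.4f}")
print(f"wealth after 20 bets:   {wealth_after_20:.4f}x")
```

The average return per bet is zero, but the typical path compounds at 0.75 per win-loss pair, leaving under 6% of the starting bankroll after 20 bets.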

What happens if I bet more than Kelly?

Your expected geometric return decreases, then turns negative. In the continuous approximation the growth curve is a parabola that peaks at full Kelly, so at 2× Kelly your expected log growth is about zero: you still have positive arithmetic expectation but only break even once compounding is accounted for. Beyond 2× Kelly log growth is negative and you go bankrupt almost surely in the long run, even with a real edge.

What if I don't know my win rate?

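Then your Kelly estimate has wide confidence intervals. The rule of thumb: estimate Kelly, then bet a fraction of it (½ or ¼). Half-Kelly gives up only about 25% of compound growth while halving volatility, which cuts return variance by about 75%. It is also robust to overestimating your edge: if the true Kelly fraction is only half your estimate, half-Kelly lands on the true optimum instead of at the 2× break-even point.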

How is Kelly different from mean-variance optimization?

Mean-variance picks weights that maximize return for a given variance. Kelly maximizes log return without a variance constraint. For a single asset with normal returns, fractional Kelly is equivalent to a particular point on the MVO efficient frontier — but Kelly extends naturally to non-normal returns and binary bets where MVO breaks down.

Does Kelly work for continuous returns (not just binary bets)?

Yes. For a normally distributed return stream with mean μ and variance σ², the Kelly fraction is f* = μ / σ². Same idea — optimal log-growth — different functional form.

Next Steps