Model Selection for Trading¶
Overview¶
Choosing the right machine learning model for trading is critical. Unlike data in many other domains, financial data is non-stationary and has a low signal-to-noise ratio. Model selection must balance predictive power, interpretability, and robustness.
Difficulty: advanced
Model Taxonomy¶
Supervised Learning¶
| Model | Use Case | Pros | Cons |
|---|---|---|---|
| Linear/Logistic Regression | Baseline, feature importance | Interpretable, fast | Assumes linearity |
| Random Forest | Feature selection, non-linear relationships | Robust, handles non-linearity | Slower inference with many trees; deep trees can still fit noise |
| Gradient Boosting (XGBoost, LightGBM) | Most tabular data tasks | High accuracy, handles missing values | Many hyperparameters; overfits without early stopping or regularization |
| SVM | Classification, regime detection | Good in high dimensions | Slow, hard to tune |
| Neural Networks (MLP) | Complex patterns | Universal approximator | Data hungry, black box |
| k-Nearest Neighbors | Simple pattern matching | Non-parametric | Curse of dimensionality |
Time Series Models¶
| Model | Use Case | Pros | Cons |
|---|---|---|---|
| ARIMA/ARIMAX | Baseline forecasting | Interpretable | Linear; assumes stationarity after differencing |
| GARCH | Volatility forecasting | Captures volatility clustering | Models conditional variance only, not return direction |
| LSTM/GRU | Sequential prediction | Captures long-term dependencies | Hard to train, slow |
| Transformer | Multi-horizon prediction | Parallelizable, attention | Data hungry, complex |
| Temporal Fusion Transformer | Forecasting with covariates | Handles static/dynamic features | Very complex |
| N-BEATS | Pure forecasting | Strong univariate benchmark results | Univariate in its original form |
Reinforcement Learning¶
| Model | Use Case | Pros | Cons |
|---|---|---|---|
| Q-Learning / DQN | Discrete action spaces | Simple, proven | Unstable training |
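| PPO | Discrete or continuous actions | Stable training | On-policy, so less sample efficient than SAC; hyperparameter sensitive |
| SAC | Continuous control | Sample efficient (off-policy), stable | More components to tune (twin critics, entropy temperature) |
| A2C/A3C | Multi-environment training | Parallelizable | On-policy; high-variance gradient estimates |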
Selection Framework¶
Step 1: Problem Definition¶
1. Prediction Type:
- Classification: Direction up/down (binary or multi-class)
- Regression: Price return, volatility, volume
- Ranking: Relative performance across assets
- Sequence: Regime identification
2. Time Horizon:
- Intraday: Seconds to hours
- Short-term: Days to weeks
- Medium-term: Weeks to months
- Long-term: Months to years
3. Data Structure:
- Tabular: Features × observations
- Sequential: Time series
- Graph: Cross-asset relationships
- Text: News, social media
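The prediction types above differ mainly in how the target is built from forward returns. The following is a minimal sketch, assuming forward returns are already computed; the function names and the `threshold` dead zone are illustrative choices, not part of any library.

```python
def direction_label(forward_return, threshold=0.0):
    """Classification target: +1 for up, -1 for down, 0 inside the dead zone.

    `threshold` is a hypothetical dead zone that keeps tiny moves from being
    labelled as a direction.
    """
    if forward_return > threshold:
        return 1
    if forward_return < -threshold:
        return -1
    return 0


def cross_sectional_ranks(forward_returns):
    """Ranking target: map {asset: forward return} to ranks, 0 = worst performer."""
    ordered = sorted(forward_returns, key=forward_returns.get)
    return {asset: rank for rank, asset in enumerate(ordered)}


# The regression target is simply the forward return itself.
example = {"A": 0.02, "B": -0.01, "C": 0.005}
labels = {asset: direction_label(r, threshold=0.001) for asset, r in example.items()}
ranks = cross_sectional_ranks(example)
```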
Step 2: Baseline Models¶
Always start with simple baselines: a constant predictor (next return = 0), a momentum predictor (sign of trailing return), and OLS on a handful of well-understood features (volatility, spread, prior return). If a fancy model can't beat these out-of-sample, the fancy model isn't the edge; the features are. Establish these baselines during the feature-engineering phase, before any hyperparameter tuning.
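The three baselines can be compared on a held-out tail of the data. This is a dependency-free sketch under stated assumptions: the only feature is the prior-period return, the momentum forecast is the sign of that return scaled by the mean absolute training return, and the data is simulated noise. `baseline_report` and its helpers are hypothetical names.

```python
import random
from statistics import mean


def fit_ols_1d(x, y):
    """Closed-form OLS for y = intercept + slope * x with a single feature."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def mse(pred, actual):
    return mean((p - a) ** 2 for p, a in zip(pred, actual))


def sign(v):
    return (v > 0) - (v < 0)


def baseline_report(returns, split):
    """Out-of-sample MSE of each baseline; parameters are fit on the first `split` rows only."""
    x, y = returns[:-1], returns[1:]          # feature: return at t-1, target: return at t
    x_tr, y_tr = x[:split], y[:split]
    x_te, y_te = x[split:], y[split:]
    scale = mean(abs(v) for v in y_tr)        # assumed magnitude for the momentum forecast
    intercept, slope = fit_ols_1d(x_tr, y_tr)
    preds = {
        "zero": [0.0 for _ in x_te],
        "momentum": [scale * sign(xi) for xi in x_te],
        "ols": [intercept + slope * xi for xi in x_te],
    }
    return {name: mse(p, y_te) for name, p in preds.items()}


# Simulated i.i.d. returns (an assumption): no baseline should have a real edge here.
rng = random.Random(0)
simulated = [rng.gauss(0.0, 0.01) for _ in range(500)]
report = baseline_report(simulated, split=350)
```

Any candidate model would be scored on the same test slice and compared against these three numbers.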
Key Considerations for Trading¶
1. Overfitting Prevention¶
1. Out-of-sample testing: Never test on training data
2. Walk-forward validation: Rolling retraining
3. Purged cross-validation: Remove training samples whose labels overlap the test set
4. Feature selection: Keep only features whose importance holds up out-of-sample
5. Regularization: L1/L2 penalties, dropout
6. Ensemble methods: Reduce variance
7. Simplicity first: Prefer simpler models
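Walk-forward validation from the list above can be sketched as an index generator. This is a minimal version, assuming equally spaced observations and a test window that immediately follows the training window; `walk_forward_splits` and its parameters are hypothetical names.

```python
def walk_forward_splits(n_samples, train_size, test_size, expanding=False):
    """Yield (train_indices, test_indices) pairs in which training always precedes testing.

    With expanding=False the training window rolls forward with a fixed length;
    with expanding=True it always starts at index 0 and grows.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train_start = 0 if expanding else start
        train_end = start + train_size
        test_end = train_end + test_size
        yield list(range(train_start, train_end)), list(range(train_end, test_end))
        start += test_size  # step forward by one test window, then retrain


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

Each iteration corresponds to one retraining: fit on the training indices, score on the test indices, then move on.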
2. Stationarity¶
Financial data is non-stationary. Solutions:
- Use returns instead of prices
- Rolling z-scores for features
- Differencing, detrending
- Regime-aware models
- Regular retraining
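The first two items can be sketched with the standard library alone. One assumption is made explicit here: the z-score at time t uses only the preceding `window` values, so the statistic never includes the value being scored. The function names are illustrative.

```python
import math
from statistics import mean, stdev


def log_returns(prices):
    """Convert a price series into log returns (one element shorter than the input)."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_zscore(values, window):
    """Z-score each value against the preceding `window` values (no lookahead).

    The first `window` outputs are None because there is not enough history.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a standard deviation")
    out = []
    for i, v in enumerate(values):
        if i < window:
            out.append(None)
            continue
        history = values[i - window:i]
        sd = stdev(history)
        out.append((v - mean(history)) / sd if sd > 0 else 0.0)
    return out


prices = [100.0, 101.0, 99.5, 100.5, 102.0, 101.0, 103.0]
features = rolling_zscore(log_returns(prices), window=3)
```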
3. Class Imbalance¶
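Direction labels are often close to 50/50, but other targets are not:
- Extreme moves are rare
- Crisis periods are rare
- Mitigate with class weights, focal loss, or SMOTE; apply any resampling to the training folds only, so synthetic samples never reach the test set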
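Class weights are the simplest of these remedies to illustrate. The sketch below uses the inverse-frequency heuristic `n_samples / (n_classes * count)`, the same formula scikit-learn applies for `class_weight="balanced"`; the 2% cutoff used to label extreme moves is a made-up example value.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical labelling: 1 if the absolute return exceeds 2%, else 0.
returns = [0.001] * 90 + [0.03] * 10
labels = [1 if abs(r) > 0.02 else 0 for r in returns]
weights = balanced_class_weights(labels)
```

With these weights, each class contributes the same total weight to the loss regardless of how rare it is.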
4. Latency Requirements¶
Intraday trading:
- Model inference < 1ms for the most latency-sensitive strategies
- Prefer: Linear models, small trees
- Avoid: Large ensembles, deep networks
Daily/weekly trading:
- Inference < 1s acceptable
- Can use: XGBoost, neural networks
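Whether a model fits its latency budget is best measured rather than assumed. The following is a rough sketch, assuming a single-sample `predict` callable; `measure_latency` is a hypothetical helper, and the linear model stands in for whatever is being evaluated. Timings from a Python loop are only indicative of a production path.

```python
import time
from statistics import median


def measure_latency(predict, sample, n_calls=1000, warmup=100):
    """Return median and 99th-percentile wall-clock time per call, in seconds."""
    for _ in range(warmup):            # warm caches before timing
        predict(sample)
    timings = []
    for _ in range(n_calls):
        t0 = time.perf_counter()
        predict(sample)
        timings.append(time.perf_counter() - t0)
    timings.sort()
    p99_index = int(0.99 * (n_calls - 1))
    return {"median_s": median(timings), "p99_s": timings[p99_index]}


# Stand-in model: a dot product over three features.
coefficients = [0.2, -0.1, 0.05]
linear_predict = lambda x: sum(w * v for w, v in zip(coefficients, x))
stats = measure_latency(linear_predict, [1.0, 2.0, 3.0])
```

Tail latency (`p99_s`) usually matters more than the median when an order must go out within a fixed budget.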
Purged Cross-Validation¶
Standard k-fold cross-validation leaks information in time series, because labels built from overlapping windows share data across the train/test boundary. Use purged CV: drop training samples whose label window overlaps any test sample, then embargo a buffer of bars after each test fold to limit leakage from serial correlation. Without purging, a model "learns" the label by looking at adjacent overlapping observations, and the out-of-sample Sharpe is fiction. Apply this throughout the validation phase: every hyperparameter score must come from purged folds.
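The purge-and-embargo procedure can be sketched as follows. This is a simplified illustration and not a reference implementation: it assumes samples are sorted by label start, that each label window is given as an inclusive `(start_bar, end_bar)` pair, and that test folds are contiguous blocks. `purged_kfold` is a hypothetical name.

```python
def purged_kfold(label_spans, n_splits=5, embargo=0):
    """Yield (train_idx, test_idx) with purging and an embargo.

    label_spans[i] = (start_bar, end_bar), inclusive: the bars sample i's label depends on.
    embargo = number of bars after the test fold's last label bar to exclude from training.
    """
    n = len(label_spans)
    fold_size, extra = divmod(n, n_splits)
    start = 0
    for k in range(n_splits):
        stop = start + fold_size + (1 if k < extra else 0)
        test_idx = list(range(start, stop))
        test_first = min(label_spans[i][0] for i in test_idx)
        test_last = max(label_spans[i][1] for i in test_idx)
        train_idx = []
        for i in range(n):
            if start <= i < stop:
                continue
            s, e = label_spans[i]
            overlaps = s <= test_last and e >= test_first      # purge
            embargoed = test_last < s <= test_last + embargo   # embargo
            if not (overlaps or embargoed):
                train_idx.append(i)
        yield train_idx, test_idx
        start = stop


# 20 samples whose labels each look 2 bars ahead.
spans = [(i, i + 2) for i in range(20)]
folds = list(purged_kfold(spans, n_splits=4, embargo=1))
```

In the second fold, for example, the samples just before the test block are purged because their 2-bar label windows reach into it, and the first sample starting after the test labels end is embargoed.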
Model Checklist¶
- [ ] Baseline models established
- [ ] Purged cross-validation used (not random splits)
- [ ] Features tested for stationarity
- [ ] Overfitting checked (train vs. test performance)
- [ ] Feature importance analyzed
- [ ] Latency requirements met
- [ ] Model retraining schedule defined
- [ ] Out-of-sample performance economically significant
- [ ] Transaction costs included in evaluation
- [ ] Model explainability sufficient for risk management
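For the transaction-cost item, a rough way to check whether performance survives costs is to charge a proportional cost on every change in position before computing the Sharpe ratio. The sketch below assumes a single asset, a position held over each period, a flat cost per unit of turnover, and 252 periods per year; all names and numbers are illustrative.

```python
import math
from statistics import mean, stdev


def net_returns(positions, asset_returns, cost_per_unit_turnover):
    """Strategy returns after costs; positions[t] is held over asset_returns[t]."""
    out, prev = [], 0.0
    for pos, r in zip(positions, asset_returns):
        turnover = abs(pos - prev)                    # size of the trade needed to reach pos
        out.append(pos * r - cost_per_unit_turnover * turnover)
        prev = pos
    return out


def annualized_sharpe(returns, periods_per_year=252):
    """Mean over standard deviation, scaled by sqrt(periods_per_year); 0.0 if flat."""
    sd = stdev(returns)
    return math.sqrt(periods_per_year) * mean(returns) / sd if sd > 0 else 0.0


positions = [1, 1, -1, -1, 1]
asset_returns = [0.01, -0.02, 0.03, -0.01, 0.02]
gross = net_returns(positions, asset_returns, 0.0)
net = net_returns(positions, asset_returns, 0.001)
sharpe_gap = annualized_sharpe(gross) - annualized_sharpe(net)
```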
References¶
- Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley.
- Jansen, S. (2020). Machine Learning for Algorithmic Trading (2nd ed.). Packt.
- Gu, S., Kelly, B., & Xiu, D. (2020). "Empirical Asset Pricing via Machine Learning." Review of Financial Studies, 33(5), 2223-2273.