Statistical significance, volatility analysis, walk-forward validation, and overfitting diagnostics.

Combined 60/40 portfolio, Feb 2022 to Dec 2025, by calendar year:

| Year | Return | Volatility | Sharpe | Max DD | Win Rate |
|---|---|---|---|---|---|
| 2022 | +39.6% | 6.5% | 5.67 | -8.2% | 70.6% |
| 2023 | +110.8% | 11.7% | 6.49 | -6.4% | 64.8% |
| 2024 | +181.9% | 10.9% | 9.55 | -4.2% | 73.4% |
| 2025 | +167.6% | 12.5% | 8.00 | -11.5% | 74.0% |

The five deepest drawdowns over the period:

| Rank | Max DD | Start | Trough | Recovered | Duration |
|---|---|---|---|---|---|
| 1 | -11.5% | 2025-02-19 | 2025-04-07 | 2025-05-13 | 83 days |
| 2 | -8.2% | 2022-03-30 | 2022-05-10 | 2022-07-18 | 110 days |
| 3 | -6.4% | 2023-09-05 | 2023-10-23 | 2023-11-13 | 69 days |
| 4 | -6.2% | 2023-02-03 | 2023-03-13 | 2023-03-30 | 55 days |
| 5 | -5.0% | 2023-07-19 | 2023-08-17 | 2023-08-30 | 42 days |
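As a rough illustration of how a table like the one above can be derived, the sketch below scans an equity curve for completed drawdown episodes. It assumes that "Start" is the date of the last peak before the decline and that "Duration" counts calendar days from start to recovery, which matches the dates shown. The function name `drawdown_episodes` and the sample equity values are hypothetical and are not taken from the backtest.

```python
from datetime import date, timedelta

def drawdown_episodes(dates, equity):
    """Return (depth, start, trough, recovered, calendar_days) per completed episode.

    start     = date of the last equity peak before the decline (assumption)
    trough    = date of the lowest equity inside the episode
    recovered = first date equity is back at or above the prior peak
    """
    episodes = []
    peak_val, peak_date = equity[0], dates[0]
    trough_val, trough_date = None, None  # None means "not currently in a drawdown"
    for d, v in zip(dates[1:], equity[1:]):
        if v >= peak_val:
            if trough_val is not None:  # the open episode has just recovered
                depth = trough_val / peak_val - 1.0
                episodes.append((depth, peak_date, trough_date, d, (d - peak_date).days))
                trough_val = None
            peak_val, peak_date = v, d
        elif trough_val is None or v < trough_val:
            trough_val, trough_date = v, d
    # Deepest episode first, mirroring the Rank column
    return sorted(episodes, key=lambda e: e[0])

# Made-up equity curve, for illustration only
dates = [date(2025, 1, 1) + timedelta(days=i) for i in range(8)]
equity = [100, 110, 99, 104.5, 110, 120, 114, 121]
for depth, start, trough, recovered, days in drawdown_episodes(dates, equity):
    print(f"{depth:+.1%}  {start}  {trough}  {recovered}  {days} days")
```

An episode that has not yet recovered by the end of the series is deliberately left out of this sketch, since it has no recovery date to report.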
The 40% ML Prediction component runs a 15-seed LambdaRank max ensemble with walk-forward validation. Below are standalone E1D metrics before combining with ETF Rotation.
Several independent tests, summarized below, indicate that the model's alpha is genuine rather than an artifact of overfitting.
All reported results are strictly out-of-sample. The model is trained on historical data and tested on future, unseen periods using an expanding-window walk-forward protocol. At no point does the model see test-period data during training. The walk-forward protocol spans Feb 2022 to Dec 2025 with quarterly retraining.
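The expanding-window schedule described above can be sketched as a generator of train/test boundaries. This is a minimal illustration, not the production scheduler: it assumes test windows step forward in three-month blocks from 1 Feb 2022 and that training ends the day before each test window opens. The names `walk_forward_splits` and `add_months` are hypothetical.

```python
from datetime import date, timedelta

def add_months(d, months):
    """Shift a first-of-month date forward by a whole number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

def walk_forward_splits(first_test_start, last_test_end, step_months=3):
    """Yield (train_end, test_start, test_end) for an expanding-window protocol.

    Training uses all data up to and including train_end, which is always the
    day before test_start, so no test-period data can reach the training set.
    """
    test_start = first_test_start
    while test_start <= last_test_end:
        next_start = add_months(test_start, step_months)
        test_end = min(next_start - timedelta(days=1), last_test_end)
        yield test_start - timedelta(days=1), test_start, test_end
        test_start = next_start

splits = list(walk_forward_splits(date(2022, 2, 1), date(2025, 12, 31)))
for train_end, test_start, test_end in splits[:3]:
    print(f"train up to {train_end} | test {test_start} to {test_end}")
```

Under these assumptions the period splits into 16 test windows, with the final window clipped at the end of Dec 2025. The training window only ever grows, which is what distinguishes an expanding window from a rolling one.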
The strategy is profitable in every calendar year from 2022 through 2025, with annual Sharpe ratios between 5.67 and 9.55 (see the year-by-year table above).
An overfit model would typically degrade in later periods. Instead, annual Sharpe rises from 5.67 in 2022 to 9.55 in 2024 and holds at 8.00 in 2025, suggesting the model captures durable market structure rather than historical noise.
2,000 random stock-selection trials produce a mean Sharpe of about 1.4 (market beta). The model's observed Sharpe of 8.91 (E1D standalone) exceeds every one of the random trials, giving an empirical p < 0.0005 (fewer than 1 in 2,000). This makes luck an implausible explanation for the returns.
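One common way to turn such a comparison into a p-value is the empirical estimate (k + 1) / (N + 1), where k is the number of random trials that match or beat the observed statistic. The sketch below assumes that estimator. The null Sharpe ratios are synthetic placeholders drawn around 1.4 with an arbitrary spread of 0.5; they stand in for the 2,000 random-selection results and are not the actual trial data.

```python
import random

def empirical_p_value(observed, null_values):
    """One-sided p-value: share of null trials at least as good as `observed`.

    The +1 terms count the observed value as one more trial, so the estimate
    can never be exactly zero.
    """
    at_least_as_good = sum(1 for v in null_values if v >= observed)
    return (at_least_as_good + 1) / (len(null_values) + 1)

rng = random.Random(0)
# Synthetic stand-in for the random-selection Sharpe ratios (spread is assumed)
null_sharpes = [rng.gauss(1.4, 0.5) for _ in range(2000)]
p = empirical_p_value(8.91, null_sharpes)
print(f"empirical p = {p:.6f}")
```

When no random trial reaches the observed value, the estimate is 1/2001, which is just under 0.0005. With 2,000 trials that is the smallest p-value this method can report, so the figure reflects the number of trials as much as the size of the gap.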
15 model seeds were tested. Under the Bonferroni correction the significance threshold becomes 0.05/15 ≈ 0.0033, and the t-test p-value of 8.03e-44 clears it by roughly 40 orders of magnitude, so the result is robust to multiple-testing adjustment.
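The arithmetic behind the Bonferroni check is short enough to verify directly. The snippet below plugs in the figures quoted above; the helper name `bonferroni_threshold` is hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 15)  # 15 seeds, as reported
p_value = 8.03e-44                          # t-test p-value, as reported
passes = p_value < threshold
margin = math.log10(threshold / p_value)    # headroom in orders of magnitude
print(f"threshold = {threshold:.4f}, passes = {passes}, margin = {margin:.1f} orders")
```

Bonferroni is the most conservative of the standard corrections, so a result that passes it would also pass less strict alternatives such as Holm's step-down procedure.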
A bootstrap with 10,000 resamples yields a 95% confidence interval for Sharpe of [6.389, 8.418]. The entire interval lies far above zero, indicating that the strategy's risk-adjusted performance is statistically reliable rather than a sampling artifact.
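A percentile bootstrap for the Sharpe ratio can be sketched as follows. Several choices here are assumptions, since the text does not specify them: days are resampled independently (an i.i.d. bootstrap, which ignores any autocorrelation), Sharpe is annualized with 252 periods and a zero risk-free rate, and the interval is taken from raw percentiles. The daily returns are synthetic, and the demo uses fewer resamples than the 10,000 quoted above so that it runs quickly.

```python
import math
import random

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate taken as 0)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def bootstrap_sharpe_ci(returns, n_boot=10_000, level=0.95, seed=0):
    """Percentile CI: resample days with replacement and recompute Sharpe each time."""
    rng = random.Random(seed)
    n = len(returns)
    stats = sorted(sharpe(rng.choices(returns, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]

# Synthetic daily returns, purely for illustration
demo_rng = random.Random(1)
daily = [demo_rng.gauss(0.001, 0.01) for _ in range(250)]
low, high = bootstrap_sharpe_ci(daily, n_boot=2000)
print(f"95% CI for Sharpe: [{low:.2f}, {high:.2f}]")
```

If daily returns are serially correlated, a block bootstrap would be the more cautious choice, because resampling single days tends to understate the width of the interval.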
Daily return skewness (+0.305) is mildly positive and excess kurtosis (1.14) is modest, indicating returns are not driven by a few extreme outlier days. The positive skew means the strategy has a slight tendency toward large positive surprises rather than large losses.
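For reference, skewness and excess kurtosis can be computed from central moments as below. The text does not say which estimator produced the quoted figures, so the simple moment-based (uncorrected) versions are used here as an assumption. The sample values are made up.

```python
def central_moment(xs, k):
    """k-th central moment, using the population (1/n) convention."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def skewness(xs):
    """Positive when the right tail is longer; zero for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Zero for a normal distribution; positive means heavier tails."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# Made-up daily returns with one large gain, for illustration only
sample = [0.002, -0.001, 0.003, -0.002, 0.001, 0.015, -0.003, 0.000]
print(f"skewness = {skewness(sample):+.3f}")
print(f"excess kurtosis = {excess_kurtosis(sample):+.3f}")
```

A single large gain is enough to push the skewness of this small sample well above zero, which is why low skew and modest kurtosis together suggest that no handful of days dominates the result.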
The backtest includes all production safeguards: a stop-loss (5%, skip=4, using OHLCV prices), a chase filter (skip stocks up more than 50% in 5 days), NoRepeat cross-cohort deduplication, and a $100K minimum dollar-volume filter. Entry is at the Day 1 open; exit is at the Day 5 close, or at the open if the stop-loss has triggered. No idealized assumptions are made.
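The two entry filters named above are simple enough to express as a single predicate. The sketch below is one possible reading, with several assumptions: "5 days" is taken as five trading bars measured close to close, dollar volume is the latest close multiplied by the latest share volume, and no averaging window is applied. The function name `passes_entry_filters` and its parameters are hypothetical, and the stop-loss and NoRepeat logic are not modeled here.

```python
def passes_entry_filters(closes, volumes, max_run_up=0.50, lookback=5,
                         min_dollar_volume=100_000):
    """Return True if a stock clears both the chase filter and the liquidity filter.

    closes, volumes: per-bar series with the most recent bar last; at least
    lookback + 1 bars are required.
    """
    if len(closes) < lookback + 1 or len(volumes) < 1:
        raise ValueError("not enough history for the lookback window")
    run_up = closes[-1] / closes[-1 - lookback] - 1.0  # close-to-close change
    dollar_volume = closes[-1] * volumes[-1]           # assumed definition
    return run_up <= max_run_up and dollar_volume >= min_dollar_volume

# Made-up bars: a 60% run-up in five sessions fails the chase filter
print(passes_entry_filters([10, 11, 12, 13, 14, 16], [90_000] * 6))
# A 20% run-up with ample liquidity passes both checks
print(passes_entry_filters([10, 10.5, 11, 11, 11.5, 12], [50_000] * 6))
```

Applying these checks before ranking rather than after changes which stocks fill the final slots, so the order of operations should mirror whatever the live system does.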