Silver Alpha v2: Expanded 2020–2025 Out-of-Sample Validation
Theory is cheap — and so is a backtest that only covers one regime. We recently widened the Silver Alpha v2 validation window from the 2023–2025 period to a broader 2020–2025 out-of-sample test. This matters because the expanded window includes the COVID crash and rebound, the 2022 bear market, and the later AI-led recovery. The goal was not just to find a stronger return profile, but to see whether the signal still held up across very different market regimes.
Headline Result
In the expanded test, the quality-focused Silver Alpha v2 candidate turned a starting value of $100,000 into approximately $378,300, compared with about $229,100 for SPY over the same period.
Key stats:
- Total return: +278.3% (vs SPY +129.1%)
- Annualized return: 24.9% (vs SPY 14.9%)
- Excess annualized return: +10.0%
- Sharpe ratio: 1.07 (vs SPY 0.77)
- Maximum drawdown: -33.7%
- Completed trades: 96 (signal-based)
- Average trade return: +10.1%
- Median trade return: +3.6%
- Average excess return vs SPY: +6.3%
- Median excess return vs SPY: +1.3%
- Trades beating SPY: 53%
The main takeaway: the edge did not disappear when the test was expanded to include the COVID period and the 2022 drawdown. The system still produced positive excess return, though it did not eliminate equity-market drawdowns.
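As a quick check on how the headline figures relate, the sketch below recomputes total and annualized return from the start and end values quoted above. It assumes the window is exactly six years; the exact test dates are not stated here, so the annualized figures come out within about a tenth of a percentage point of the published ones rather than matching them exactly. The function names are illustrative only.

```python
def total_return(start_value, end_value):
    """Simple total return over the whole period, as a fraction."""
    return end_value / start_value - 1.0


def annualized_return(start_value, end_value, years):
    """Compound annual growth rate implied by the start and end values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumption: 2020-2025 is treated as six full years.
YEARS = 6.0

strategy_total = total_return(100_000, 378_300)
spy_total = total_return(100_000, 229_100)
strategy_cagr = annualized_return(100_000, 378_300, YEARS)
spy_cagr = annualized_return(100_000, 229_100, YEARS)

print(f"Strategy: total {strategy_total:+.1%}, annualized {strategy_cagr:.1%}")
print(f"SPY:      total {spy_total:+.1%}, annualized {spy_cagr:.1%}")
print(f"Excess annualized: {strategy_cagr - spy_cagr:+.1%}")
```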

Annual Performance
The strategy beat SPY in four of the six calendar years and trailed it in the other two (2020 and 2024):
| Year | Strategy | SPY | Excess |
|---|---|---|---|
| 2020 | +16.5% | +17.2% | -0.7% |
| 2021 | +54.7% | +30.5% | +24.2% |
| 2022 | -8.5% | -18.6% | +10.2% |
| 2023 | +41.6% | +26.7% | +14.9% |
| 2024 | +22.5% | +25.6% | -3.1% |
| 2025 | +39.6% | +18.0% | +21.6% |
The largest excess returns came in 2021, 2023, and 2025, years marked by sector rotation and momentum divergence. In the 2022 bear market the strategy still lost 8.5%, but that was roughly 10 percentage points better than SPY's 18.6% decline. In the COVID year (2020) and the narrow large-cap rally of 2024 it trailed SPY modestly, by 0.7 and 3.1 percentage points respectively.
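The excess column is simply the strategy return minus the SPY return for each year. A minimal sketch, using the rounded figures from the table, recomputes it and lists the years in which the strategy finished ahead. Because the inputs are already rounded to one decimal, a recomputed value can differ from the published column by 0.1 (2022 comes out at +10.1 here, versus +10.2 in the table, which was presumably computed from unrounded returns).

```python
# (strategy %, SPY %) per calendar year, copied from the table above.
annual_returns = {
    2020: (16.5, 17.2),
    2021: (54.7, 30.5),
    2022: (-8.5, -18.6),
    2023: (41.6, 26.7),
    2024: (22.5, 25.6),
    2025: (39.6, 18.0),
}

# Excess return in percentage points, rounded to match the table's precision.
excess = {
    year: round(strategy - spy, 1)
    for year, (strategy, spy) in annual_returns.items()
}

# Years in which the strategy finished ahead of SPY.
years_ahead = [year for year, value in excess.items() if value > 0]

for year, value in excess.items():
    print(f"{year}: {value:+.1f} pp")
print(f"Ahead of SPY in {len(years_ahead)} of {len(excess)} years")
```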

What Changed In The Research
The newer test puts more emphasis on trade quality and repeatability. Rather than optimizing only for final account value, we looked for a version with positive average excess return, positive median excess return, and a win rate against SPY above 50%.
That matters because annualized return can be distorted by a few unusually good trades. Silver Alpha v2 is now being evaluated more on whether the average trade has a repeatable edge over SPY, not only whether the equity curve compounds well in one lucky path.
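The three trade-level criteria can be sketched as a small screen. This is an illustration under stated assumptions, not the actual research code: the function names, the pairing of each trade with a same-period SPY return, and the sample trades are all hypothetical. The sample is built so that one outsized winner lifts the average while the median stays negative, which is the distortion the screen is meant to catch.

```python
from statistics import mean, median


def trade_quality(trades):
    """Summarize trades given as (trade_return, spy_return) pairs.

    Both returns are fractions measured over the same holding period.
    """
    excess = [trade - spy for trade, spy in trades]
    return {
        "avg_excess": mean(excess),
        "median_excess": median(excess),
        "win_rate_vs_spy": sum(e > 0 for e in excess) / len(excess),
    }


def passes_quality_screen(stats):
    """Positive average excess, positive median excess, win rate above 50%."""
    return (
        stats["avg_excess"] > 0
        and stats["median_excess"] > 0
        and stats["win_rate_vs_spy"] > 0.5
    )


# Hypothetical trades: one large winner, four small laggards.
sample_trades = [
    (0.60, 0.05),
    (-0.02, 0.01),
    (-0.01, 0.02),
    (0.00, 0.03),
    (0.01, 0.02),
]

stats = trade_quality(sample_trades)
print(stats)
print("Passes screen:", passes_quality_screen(stats))
```

In this sample the average excess is positive only because of the first trade, so the screen rejects it.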

Holding Window Finding
The fixed-horizon research suggests the edge is strongest at an intermediate holding window. Median excess return rose from roughly +1.6% at 21 trading days to +3.2% at 42 trading days and +5.3% at around 63 trading days. Longer holds remained positive, but the median edge stopped improving and slipped to about +4.5% by 126 trading days.
The practical implication: Silver Alpha v2 appears to work best as a disciplined multi-week to multi-month process, not as a buy-and-ignore strategy.
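A fixed-horizon study of this kind can be sketched as follows: for each signal, measure the stock's forward return over a set number of trading days, subtract SPY's return over the same days, and take the median across signals. The data layout (aligned lists of daily closes, entries as ticker and index pairs), the ticker `XYZ`, and the synthetic price paths are assumptions made for illustration. The synthetic series only demonstrates the mechanics and does not reproduce the figures above.

```python
from statistics import median

HORIZONS = (21, 42, 63, 126)  # trading days, as discussed in the text


def forward_return(prices, start, horizon):
    """Return from the close at `start` to the close `horizon` days later."""
    return prices[start + horizon] / prices[start] - 1.0


def median_excess_by_horizon(stock_prices, spy_prices, entries, horizons=HORIZONS):
    """Median (stock minus SPY) forward return for each holding horizon.

    stock_prices: dict of ticker -> daily closes, aligned with spy_prices.
    entries: list of (ticker, entry_index) signal events.
    """
    result = {}
    for horizon in horizons:
        excess = []
        for ticker, start in entries:
            prices = stock_prices[ticker]
            if start + horizon >= len(prices):
                continue  # not enough future data for this horizon
            stock_ret = forward_return(prices, start, horizon)
            spy_ret = forward_return(spy_prices, start, horizon)
            excess.append(stock_ret - spy_ret)
        result[horizon] = median(excess) if excess else None
    return result


# Synthetic, hypothetical price paths: steady daily drift for each series.
spy = [100.0 * 1.0004 ** day for day in range(300)]
stocks = {"XYZ": [50.0 * 1.0006 ** day for day in range(300)]}
entries = [("XYZ", 0), ("XYZ", 50), ("XYZ", 250)]

by_horizon = median_excess_by_horizon(stocks, spy, entries)
for horizon, value in by_horizon.items():
    print(f"{horizon:>3} days: median excess {value:+.2%}")
```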
Universe And Capacity
The broader research universe includes both large-cap and mid-cap stocks, but portfolio construction now gives preference to S&P 500 candidates when choices are close. Mid-cap stocks can still produce strong individual winners, but the large-cap preference improves liquidity, repeatability, and implementation quality for a public-facing system.
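One simple way to express a preference that applies only "when choices are close" is a small score bonus for index members, so that an S&P 500 name overtakes a mid-cap only if their scores are within that bonus. This is a hypothetical sketch: the scoring scale, the `sp500_bonus` value, and the field names are assumptions, and the actual portfolio-construction rule is not described here.

```python
def rank_candidates(candidates, sp500_bonus=0.02):
    """Rank candidates by score, nudging S&P 500 members up by a small bonus.

    candidates: list of dicts with 'ticker', 'score', and 'in_sp500' keys.
    sp500_bonus: hypothetical tie-break margin; a mid-cap that leads by more
    than this amount still ranks first.
    """
    def effective_score(candidate):
        bonus = sp500_bonus if candidate["in_sp500"] else 0.0
        return candidate["score"] + bonus

    return sorted(candidates, key=effective_score, reverse=True)


# Hypothetical candidates and scores.
candidates = [
    {"ticker": "MIDA", "score": 0.80, "in_sp500": False},
    {"ticker": "LRGB", "score": 0.79, "in_sp500": True},
    {"ticker": "MIDC", "score": 0.90, "in_sp500": False},
]

ranked = [c["ticker"] for c in rank_candidates(candidates)]
print(ranked)
```

Here the large-cap overtakes the mid-cap it nearly ties with, while the clearly stronger mid-cap keeps the top spot.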
Cumulative Excess Return
The chart below shows the cumulative relative performance of the strategy versus SPY over time. The excess built primarily during 2021 and 2023–2025, with a flat period during 2022 and a brief underperformance dip in early 2020 during the COVID crash.
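A relative-performance series of this kind can be computed from the two equity curves. The sketch below assumes "cumulative relative performance" means the strategy's growth divided by SPY's growth, minus one; a simple difference of cumulative returns is another common definition and would give different numbers. Only the start and end values quoted earlier are used, since the full daily curves are not reproduced here.

```python
def cumulative_relative_performance(strategy_equity, spy_equity):
    """Strategy growth relative to SPY growth, minus one, at each point.

    Both inputs are equity curves sampled on the same dates.
    """
    s0, b0 = strategy_equity[0], spy_equity[0]
    return [
        (s / s0) / (b / b0) - 1.0
        for s, b in zip(strategy_equity, spy_equity)
    ]


# Start and end account values from the headline result.
strategy_curve = [100_000, 378_300]
spy_curve = [100_000, 229_100]

relative = cumulative_relative_performance(strategy_curve, spy_curve)
print(f"Final relative performance vs SPY: {relative[-1]:+.1%}")
```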

Bottom Line
Silver Alpha v2 looks meaningfully more robust after the expanded 2020–2025 test. The best version did not rely only on a short post-2023 sample, and the trade-level statistics improved enough to justify continuing with the v2 research direction.
The research does not prove the strategy will outperform in every environment. It does show that the signal survived a wider and more difficult validation period, with a cleaner edge profile than the original public test.