OOS: Out-of-Sample Test
Also known as: out-of-sample validation, holdout test
What is it?
An out-of-sample test checks a strategy on data that it was never optimised on. The idea is simple: before you start tuning a strategy, you set aside a slice of historical data, the holdout, and you do not touch it while you build and adjust the rules. Once the strategy is finished, you run it on that held-back data for the first time. Because the strategy never learned from this period, its performance there is a much fairer indication of whether the edge is real or just fitted to the data you optimised on.
It is the main defence against overfitting, where a strategy looks perfect only because it was shaped around the exact past it was tested on. For example, you might tune a strategy on data from 2018 to 2022, then test it untouched on 2023 to 2024. If it performs similarly on that unseen period, that is an encouraging sign the edge may be genuine. If it collapses, that is a red flag that the strategy was probably curve-fit.
The single most important rule is to keep the out-of-sample data truly untouched: if you peek at it and then go back and adjust the rules, it is no longer independent and its value as a check is destroyed. Even a strong out-of-sample result is not a promise. It is the best pre-live evidence you can gather, but markets change, past performance does not guarantee future results, no strategy is risk-free, and your capital is at risk.
Why it matters: It is the main defence against overfitting. A strategy that holds up on unseen data is more likely to work live than one that has only been checked on the data it was tuned on.
Out-of-sample results are the strongest pre-live signal that an edge is genuine.
Real-world example
A strategy tuned on data from 2018 to 2022 is then tested untouched on 2023 to 2024. Similar performance is an encouraging sign; a collapse is a red flag.
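The split described in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the returns are invented numbers, the helper names split_by_date and summarise are hypothetical, and the mean-to-standard-deviation ratio is just one simple way to compare the two periods, not a recommended metric.

```python
from datetime import date
from statistics import mean, stdev


def split_by_date(records, cutoff):
    """Split (day, daily_return) pairs chronologically at the cutoff date."""
    in_sample = [r for d, r in records if d < cutoff]       # used for tuning
    out_of_sample = [r for d, r in records if d >= cutoff]  # held back, untouched
    return in_sample, out_of_sample


def summarise(returns):
    """Count, mean return, and an unannualised mean/stdev ratio."""
    avg = mean(returns)
    spread = stdev(returns)
    ratio = avg / spread if spread else float("nan")
    return {"n": len(returns), "mean": avg, "ratio": ratio}


# Hypothetical daily returns, invented purely for illustration.
records = [
    (date(2021, 3, 1), 0.004),
    (date(2021, 9, 1), -0.002),
    (date(2022, 6, 1), 0.006),
    (date(2022, 11, 1), 0.001),
    (date(2023, 2, 1), 0.003),
    (date(2023, 8, 1), -0.004),
    (date(2024, 5, 1), 0.002),
]

# Everything before 2023 is in-sample; 2023 onwards is the hold-out.
in_sample, out_of_sample = split_by_date(records, cutoff=date(2023, 1, 1))
print("in-sample:", summarise(in_sample))
print("out-of-sample:", summarise(out_of_sample))
```

The split is chronological rather than random so that the hold-out is a period the tuning never saw. With real data each period would need far more observations than this toy sample for the comparison to mean anything.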
How SignalBots handles it
SignalBots emphasises out-of-sample and forward results over in-sample backtests when presenting a feed's track record. See /risk-warning.
Pro tip
Hold out a meaningful, untouched slice of data before you start optimising, and never peek at it while tuning.
Common pitfalls
Quietly using the out-of-sample period during tuning, which destroys its value as an independent check.
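One way to make this pitfall harder to fall into by accident is to wrap the held-back data so that code can read it only once. The sketch below is a hypothetical discipline aid, not a feature of any library or of SignalBots: the class name OneShotHoldout and the sample numbers are assumptions made for illustration, and it cannot stop anyone from opening the raw data directly.

```python
class OneShotHoldout:
    """Wrap hold-out data so it can be read exactly once (hypothetical helper)."""

    def __init__(self, data):
        self._data = list(data)
        self._used = False

    def take(self):
        """Return the hold-out data on first call; refuse every later call."""
        if self._used:
            raise RuntimeError(
                "hold-out already used; a second look is no longer independent"
            )
        self._used = True
        return self._data


# Hypothetical out-of-sample returns, invented for illustration.
holdout = OneShotHoldout([0.003, -0.004, 0.002])

final_check = holdout.take()  # the single, final evaluation

try:
    holdout.take()  # a second look, e.g. after tweaking the rules
except RuntimeError as err:
    second_look = str(err)
    print("refused:", second_look)
```

The point of the design is that re-running the test after a rule change becomes an explicit, visible decision instead of something that happens quietly during tuning.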
Frequently asked questions
Why hold out data for testing?
So you can check the strategy on a period it never learned from. Strong out-of-sample results suggest a real edge rather than curve-fitting, though they still do not guarantee future profit.
How much data should I hold out?
A meaningful, untouched slice that includes varied conditions, often a recent chunk of the history. It needs to cover enough trades and market conditions that good performance on it is unlikely to be down to luck.
What happens if I peek at the out-of-sample data?
Its value as an independent check is destroyed. Once you adjust the rules based on it, the strategy has effectively been fitted to that period too, and the result can no longer serve as independent evidence that the edge is genuine.
Does a passing out-of-sample test guarantee live profit?
No. It is the strongest pre-live evidence you can gather, but future markets can differ from any tested period. Past performance does not guarantee future results, and your capital is at risk.
How is out-of-sample testing different from forward testing?
Out-of-sample testing uses held-back historical data the strategy never saw. Forward testing runs the strategy on new data as the market unfolds, often on a demo account. Both check robustness on unseen conditions.
Trading involves substantial risk of loss. Historical and backtested results do not guarantee future performance. Read the full risk warning.