Out-Of-Sample Backtesting: Importance and Strategies Explained
Most traders test their trading ideas on all their available data and then decide “yes or no” on going live with the strategy. But there is a major problem with this method: you test on known data, not unknown data. Almost all backtests are curve-fitted to some degree, mostly unconsciously. The missing element is out-of-sample testing:
An underrated part of trading is out of sample testing. Out-of-sample backtesting is when you divide your backtest into two parts: in sample vs. out of sample. The in-sample test is where you make the rules, signals, and parameters. The out-of-sample is where you test your rules and signals on unknown data. The best out of sample backtest is an incubation.
In this article, we explain what out of sample trading tests are and look into why this is important and how you should test out-of-sample in your backtesting.
In sample vs. out of sample means dividing your historical data into two parts: one part where you make the rules and parameters (in sample), and one part where you test the in-sample rules on unknown data (out of sample). Finally, before putting real money to the test, you test the trading strategy live in a demo account.
At the end of the article, we test a short strategy by dividing the backtest into two parts: in-sample and out-of-sample.
(Before we go on we’d like to mention that we have a backtesting course that covers all aspects of how to backtest.)
First, you need to generate trading ideas
In order to succeed as a trader, you need to spend a lot of time testing ideas. We have written about this before.
Why do you need to test and generate ideas?
Because most strategies tend to wither away as time goes by, due to a number of reasons. One reason is curve-fitting in your backtests, and another reason is that markets change.
What is a backtest?
This website is all about quantified trading, and backtesting is an essential part of forming trading strategies. Backtesting is when you test your ideas on a sample of data to see how they performed historically.
We can divide a backtest into the following steps:
- Observation: this is when you form your hypothesis or the idea you want to test.
- Then you make a quantified idea and hypothesis based on your observation(s).
- You need data to test your hypothesis.
- Test your idea and hypothesis on the data.
- Can you confirm or falsify your hypothesis?
When you have done these five steps, you have done the in-sample test:
What is an in-sample test?
An in-sample test is simply the testing you do on your available data. It’s the data you use to confirm or falsify your hypothesis.
Many traders like to split their dataset into two parts: one part to test in-sample and one part to test out-of-sample. You then compare the in-sample results with the out-of-sample results:
What is an out-of-sample test?
When you have tested a trading idea and formed a conclusion, you need to test your trading strategy on unknown data.
Let’s say you have data from 2005 until 2021. A practical way of testing is by splitting the dataset into two parts, for example, the in-sample test from 2005 until 2017, and then out of sample from 2018 until 2021.
Doing it this way, you do two tests: in-sample and out of sample.
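The split described above can be sketched in a few lines of Python. This is only an illustration: the price rows are made up, and the function name `split_in_out_of_sample` is our own, not part of any backtesting library.

```python
from datetime import date


def split_in_out_of_sample(bars, split_date):
    """Split (date, close) rows: in-sample is before split_date, out-of-sample is on or after it."""
    in_sample = [row for row in bars if row[0] < split_date]
    out_of_sample = [row for row in bars if row[0] >= split_date]
    return in_sample, out_of_sample


# Hypothetical data: one made-up closing price per year, 2005 through 2021.
bars = [(date(year, 12, 31), 100.0 + (year - 2005)) for year in range(2005, 2022)]

# Rules and parameters are developed on 2005-2017 only; 2018-2021 is held back.
in_sample, out_of_sample = split_in_out_of_sample(bars, date(2018, 1, 1))
print(len(in_sample), "in-sample rows,", len(out_of_sample), "out-of-sample rows")
```

The important part is discipline, not code: the out-of-sample rows should not be looked at until the rules are final.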
What is sample validation?
Validation is when you confirm your trading idea or hypothesis via an out-of-sample test. Did the in-sample results predict the out-of-sample results well? If not, you should not go live with the strategy. Instead, wait or test more.
We are skeptical about dividing your dataset into two parts:
First, most traders tend to “cheat” by looking at the out-of-sample test before they do the in-sample test. Second, it’s not a realistic way of trading. You get your results in the blink of an eye, but you miss the details. As the saying goes, “the devil is in the details”.
The best method of doing out of sample: incubation
We believe the best way to perform an out-of-sample test is to use a live demo account. We like to call this the incubation period.
After you have backtested a promising trading strategy, you don’t proceed to live trading. Instead, you put the strategy “on hold” for at least 6 months, but preferably 12 months or longer, depending on the number of trades. You observe the strategy and see how it performs out of sample.
By doing it this way you mimic live trading and you get to “feel” how the strategy performs. Furthermore, you might discover some small details you never thought of when you did the testing.
The main advantage of this method is time: a backtest is done in seconds or minutes, but via a demo account you experience the strategy in real time. A backtest done in minutes is worth a lot less than incubation, in our opinion.
A demo account is the best tool for out of sample
Luckily, most brokers offer demo accounts. At Interactive Brokers, you just check a box and you have a demo account ready in minutes. The account is practically just like a real account except for a few minor details.
Thus, after backtesting, put the trading rules in the demo account and let it run. Of course, a demo account is never a substitute for live trading, but incubation is significantly better than an out-of-sample test on historical data.
A practical example of in-sample and out of sample backtest (in sample vs. out of sample)
Let’s end this article by showing a practical example of an in-sample test and out-of-sample test:
We want to test a short strategy that we currently have in “incubation” on the XLP (the ETF tracking consumer staples) and might publish the strategy as a Trading Edge later this year or next year. We have developed many XLP trading strategies and consider this ETF as one of the best trading vehicles around.
We developed this strategy in 2017, but recently changed the exit criteria and we discovered a huge improvement in the profitability.
The strategy currently has two parameters for its entry signal, and the in-sample period from 1993 until the end of 2017 looks like this:
The in-sample test showed 264 trades, 0.4% average gain per trade, CAGR was 5.88%, time spent in the market was 8%, and the profit factor was 3.33.
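Figures like the number of trades, the average gain per trade, and the profit factor can be derived from a list of per-trade returns. Below is a minimal sketch, assuming equal position size on every trade so that percentage returns can be summed. The trade list is made up and is not the XLP strategy's data.

```python
def summarize_trades(trade_returns_pct):
    """Summarize per-trade returns (in percent), assuming equal position size per trade."""
    if not trade_returns_pct:
        raise ValueError("need at least one trade")
    gross_profit = sum(r for r in trade_returns_pct if r > 0)
    gross_loss = -sum(r for r in trade_returns_pct if r < 0)
    return {
        "trades": len(trade_returns_pct),
        "average_gain_pct": sum(trade_returns_pct) / len(trade_returns_pct),
        # Profit factor: total winnings divided by total losses.
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else float("inf"),
    }


# Hypothetical trade returns in percent (illustration only).
trades = [1.5, -0.5, 2.0, -1.0, 0.5]
stats = summarize_trades(trades)
print(stats)
```

Running the same function separately on the in-sample trades and the out-of-sample trades gives two sets of numbers that can be compared side by side.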
How does the out-of-sample test look so far? It still looks pretty good:
From 2018 until May 2021 it has generated 43 trades, the average gain is 0.41%, time spent in the market is 7.7%, the CAGR is 5.5%, and the profit factor is 4.98. We believe the result is pretty good for a short strategy.
In other words, the strategy has performed more or less in line with the in-sample test. However, we like to do some trades in the demo account to see how it performs on its own, and also how it performs together with the other strategies in XLP and other equity ETFs/futures.
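The comparison between the two periods can also be made explicit. The sketch below uses the figures reported above and flags any metric that has dropped by more than a chosen tolerance. The 25% tolerance (`MAX_DEGRADATION`) is a hypothetical threshold picked for illustration, not a rule we trade by.

```python
# Figures as reported in the text for the two periods.
in_sample = {"average_gain_pct": 0.40, "cagr_pct": 5.88, "profit_factor": 3.33}
out_of_sample = {"average_gain_pct": 0.41, "cagr_pct": 5.5, "profit_factor": 4.98}

MAX_DEGRADATION = 0.25  # hypothetical tolerance: flag drops larger than 25%


def relative_change(in_value, out_value):
    """Change from in-sample to out-of-sample as a fraction of the in-sample value."""
    return (out_value - in_value) / in_value


def degraded_metrics(in_stats, out_stats, max_degradation=MAX_DEGRADATION):
    """Return the metrics (all higher-is-better) that fell by more than max_degradation."""
    return [
        name
        for name in in_stats
        if relative_change(in_stats[name], out_stats[name]) < -max_degradation
    ]


for name in in_sample:
    print(name, f"{relative_change(in_sample[name], out_of_sample[name]):+.1%}")

flagged = degraded_metrics(in_sample, out_of_sample)
print("degraded metrics:", flagged)
```

With only 43 out-of-sample trades the numbers are still noisy, so a check like this is a sanity test rather than proof.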
Over the whole period the equity curve looks like this:
Walk forward optimization:
In order to make a better test that can stand the test of time, many traders like to use what is called walk forward optimization. Yes, the word optimization is correct: walk forward is a kind of optimization that repeatedly alternates in-sample and out-of-sample tests.
It’s done like this:
Let’s assume you have 20 years of data. You can divide the data into 10 equal parts, i.e., two years each. Each two-year block is then divided into two parts: the first year is for in-sample and the second is for out of sample. You find the best parameters in year one and test them out of sample in year two.
This is repeated ten times and the final results are evaluated to make the final parameters for the strategy.
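The procedure can be sketched as follows. This is a simplified illustration: the start year, the candidate parameters, and the `toy_score` function are all made up, and in a real test `score` would be a backtest of the strategy over the given year.

```python
def walk_forward_windows(first_year, n_years):
    """Cut n_years into two-year blocks: year one is in-sample, year two is out-of-sample."""
    return [
        {"in_sample_year": start, "out_of_sample_year": start + 1}
        for start in range(first_year, first_year + n_years, 2)
    ]


def walk_forward(windows, candidate_params, score):
    """Pick the best parameter in-sample, then record how it scores out of sample."""
    results = []
    for window in windows:
        best = max(candidate_params, key=lambda p: score(p, window["in_sample_year"]))
        results.append(
            (window["out_of_sample_year"], best, score(best, window["out_of_sample_year"]))
        )
    return results


def toy_score(param, year):
    """Made-up stand-in for a backtest result; higher is better."""
    return -abs(param - (year % 5))


# Hypothetical 20 years of data starting in 2001, giving ten two-year blocks.
windows = walk_forward_windows(2001, 20)
results = walk_forward(windows, [1, 2, 3], toy_score)
for out_year, best_param, out_score in results:
    print(out_year, best_param, out_score)
```

Only the out-of-sample scores are used to judge the strategy, since the in-sample scores are optimized by construction.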
Is this a good way of making strategies? We have tried it but we stopped for a number of reasons, the main reason being we didn’t find any improvement in our trading by doing it this way.
Out of sample backtesting – video
We made a video on YouTube: Out-of-sample backtesting
Conclusion: What is out of sample backtesting?
Out-of-sample backtesting is when you divide your backtest into two parts: in sample vs. out of sample. The in-sample test is where you make the rules, signals, and parameters. The out-of-sample is where you test your rules and signals on unknown data.
Out-of-sample tests are a necessity, even though they are not (of course) foolproof. The last stage of an out-of-sample test is the incubation period of many months where you trade it live in a demo account.
The whole point of doing backtests is to forecast the future. To do so, you need to be careful and patient. We believe an out-of-sample test is an important aspect of this procedure before you allocate money to a strategy.
FAQ:
– What is out-of-sample backtesting in trading?
Out-of-sample backtesting involves dividing a backtest into two parts: in-sample and out-of-sample. The in-sample test establishes rules and parameters, while the out-of-sample test evaluates these rules on unknown data.
– How do you perform in-sample vs. out-of-sample testing in backtesting?
In-sample vs. out-of-sample testing involves dividing historical data into two parts: in-sample (for rule creation) and out-of-sample (for testing rules on unknown data). This approach helps validate the robustness of a trading strategy.
– What is the difference between in-sample and out-of-sample tests in backtesting?
In-sample testing is conducted on historical data used to create rules, signals, and parameters. Out-of-sample testing, on the other hand, assesses the performance of these rules on data that the strategy has not encountered during development.