For back tests on more than one symbol, I'm finding that running the test on the maximum date range by default goes back as far as the symbol(s) with the oldest data. This gives substantially different results than running a back test only from the time when there is data for all of the symbols.
Is there a way to figure out and code the starting date for a back test based on the symbol with the least available data including an adjustment for the number of bars required for any indicators used?
The best I have come up with so far is to loop through all the symbols comparing the last value of BarIndex for each symbol to find the smallest value, but this doesn't account for bars required for indicators.
The Apply to Range could be set to All Bars and then a condition included in code to only test when the date is greater than the desired test start date, based on when data is available for all symbols plus whatever offset is required for the indicators used. The max offset bar value is probably something Amibroker calculates, but I didn't find a way to access it.
bi = BarIndex();
ind_b = 100;
sm100 = MA( C, ind_b);
sm100_corrected = IIf( bi > ind_b, sm100, Null);
// Null, 0 or extreme high 9999999 etc depending on use case
// see safe_divide_by_zero examples in forum
// sm100_corrected is what you then use,
// and same for all other bars count relying variables
Can you try this approach?
So what we are trying to do is not generate a signal until you have sufficient bars to get a correct indicator value, because AB by design might do MA( Array, min( barindex(), period ) ) or something similar for different things.
With no signal, there is no need to know the bar count ahead of time and no need for custom if/else code; it will iron out issues on the fly.
No, if any indicator, including MA( array, X ), does not have "X" bars available before the first analysis date it will just return Null. The Null value is special as it propagates thru expressions, therefore no buy/sell/short/cover signal will be generated if there is not yet sufficient data to calculate.
You should not need to use any calculations in your formula as the formula by itself would only generate signals when enough data are available.
If there isn't enough, you would just observe Nulls in those bars that don't have enough input yet.
Say
Buy = Cross( Close, MA( Close, 100 ) ); // would produce Null values for first 100 bars
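To see the Null behaviour for yourself, an exploration along these lines can help. This is only a minimal sketch: the variable names are made up for illustration, and it assumes the Analysis range is set to All quotes so the first bars of each symbol are included.

```afl
// Exploration sketch: list the first bars of each symbol and flag
// where MA( C, 100 ) has not yet produced a value (Null).
SetBarsRequired( sbrAll, sbrAll ); // use all bars so BarIndex() counts from the first quote

period = 100;                      // same period as the MA in the example above
bi = BarIndex();
ma100 = MA( C, period );

Filter = bi < period + 5;          // show only the start of each symbol's history
AddColumn( bi, "BarIndex", 1.0 );
AddColumn( ma100, "MA100", 1.2 );
AddColumn( IsNull( ma100 ), "IsNull", 1.0 ); // 1 while the MA has no value yet
```

The rows where the IsNull column is 1 are the bars on which any signal built from this MA cannot fire.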
Choosing MA() as an example was the wrong choice. I should've called it my_ind1().
As Tomasz says, AB indicators with too few bars are Null, so maybe you can check your own code, apply the same and see if you were missing something.
If you put Null there, you'll get rid of the signals.
Must not have done a good job of explaining what I wanted to do. Here is a simple example. Create a Watch List containing SPY, GOOG, PLTR, select that watch list as the "Apply To" filter, set the From and To dates to something like 1/1/1985 to "Today", or better yet set the range to "All Bars", and run a back test using the following code.
PositionScore = ROC(C, 63);
Buy = C > MA(C, 100);
Sell = C < MA(C, 100);
Depending on your data source and downloaded data base, the first trade will be maybe in 1993 on SPY, and all trades will be on SPY until the year 2014, when trades on GOOG start to show up. Trades will alternate between SPY and GOOG until 2023, when some trades on PLTR start to show up.
There is no data for PLTR before September of 2020, but the strategy back tests as far back as there is data for the symbol with the most data.
I'm looking for a way in code to limit the date range to only those dates when all of the symbols contained in the watch list have enough data to be potential trade candidates. In this case the start date when all of the symbols could be traded would be sometime in 2021, when there is enough data for all three symbols to be potential trades.
The start date would be first date of data for the symbol with the least amount of data plus however many bars are required for the indicator requiring the most bars.
It can be done by hand, but for all but very small symbol lists it can be time consuming to have to look up the beginning date of each symbol and the number of bars required for all the indicators used.
The backtester forum members are the best to answer on the optimal way.
I have an acronym, FWTO: First work, then optimize. In your case it looks like you need to make 2 passes.
Use 1 exploration and then 1 backtest, and use this trick by Tomasz.
"Instead of looking for first occurrence, Reverse Arrays and look for LAST occurrence"
See here
Once the 1st explore is done, you have the start trades.
In your main explore/backtest, you just drop all entry trades before the reference datetime derived from first run.
You could put all the code in one file but see what suits your needs best.
Thanks for the ideas. Here is a solution. Perhaps not the most elegant, but seems to work.
#pragma sequence(scan, backtest)
PositionScore = ROC(C, 63);
Buy = C > MA(C, 100);
Sell = C < MA(C, 100);
//--- Scan to get first date when all symbols are available for testing
if (Status("action") == actionScan) {
    if (Status("StockNum") == 0) StaticVarSet("BeginTest", 0);
    Begin = ValueWhen(ExRem(Buy, 0), DateNum());
    StaticVarSet("BeginTest", Max(Begin, StaticVarGet("BeginTest")));
    _exit();
}
//--- Add start date condition to Buys
Buy = Buy AND DateNum() >= StaticVarGet("BeginTest");
I noticed a bit of a delay when using it. Does anyone see a cleaner, faster, better way? Maybe moving PositionScore and Sell below the scan would help a little.
This runs the scan a bit faster for me, and I think it's also more accurate as it does not require your newest symbol (one with the least history) to have a Close above the MA but rather just that the MA is available.
#pragma sequence(scan, backtest)
//--- Scan to get first date when all symbols are available for testing
if (Status("action") == actionScan) {
    if (Status("StockNum") == 0) StaticVarSet("BeginTest", 0);
    isDataAvailable = !IsNull(MA(C,100));
    Begin = LastValue(ValueWhen(ExRem(isDataAvailable, 0), DateNum()));
    BeginTest = StaticVarGet("BeginTest");
    if (Begin > BeginTest)
    {
        StaticVarSet("BeginTest", Begin);
    }
    _exit();
}
//--- Add start date condition to Buys
PositionScore = ROC(C, 63);
SetPositionSize(10, spsPercentOfEquity);
Buy = C > MA(C, 100) AND DateNum() >= StaticVarGet("BeginTest");
Sell = C < MA(C, 100);
The suggested code works when entry conditions are simple, but may get more difficult when there are more Buy and PositionScore calculations involved. Considering this, I prefer the following, based on Buy. PositionScore might also be a factor, so I leave all Buy and PositionScore calculations above the BeginDate check.
#pragma sequence(scan, backtest)
//--- Buy conditions
SPY = Foreign("SPY", "C");
Buy = SPY > EMA(SPY, 200); // Bull market check
Buy = Buy AND C > MA(C, 50) AND MA(C, 50) > MA(C, 100); // Buy entry rule
//--- Rank Buys
PositionScore = ROC(C, 150);
//--- Scan to get first date when all symbols are available for testing
if (Status("action") == actionScan) {
    if (Status("StockNum") == 0) StaticVarSet("BeginTest", 0);
    Begin = LastValue(ValueWhen(ExRem(Buy, 0), DateNum()));
    BeginTest = StaticVarGet("BeginTest");
    if (Begin > BeginTest) StaticVarSet("BeginTest", Begin);
    _exit();
}
//--- Add start date condition to Buys
Buy = Buy AND DateNum() >= StaticVarGet("BeginTest");
Sell = C < MA(C, 100);
It's your code, so you can do what you want. However, by using the first Buy signal to generate the BeginTest date, you may be delaying your backtest start more than you need to. Consider a simple example where your only Buy rule is that MA(50) > MA(200). With my rules, you can start trading 200 days after the inception of the youngest symbol. With your rules, you will wait 200 days plus however long it takes (days? weeks? months?) for that symbol's fast MA to rise above its slow MA.
If you're going to use the Buy signal, you should also account for the possibility that some symbols may never generate a Buy signal, which will mess up your logic.
Understand your point, but think you are wrong about the effect. The indicators do not have to wait for the start date to start calculating. The logic only prevents an earlier Buy (Buy = Buy And Date > StartDate). The MA(200) will still have been calculated by the start date either way. It does not have to wait 200 extra days after the inception date of the youngest symbol.
If your goal is to know when the first trade could have theoretically taken place, your approach is definitely the way to go. But my goal was to make sure my back test did not start until all symbols had enough data to potentially trade. I think using Buy accomplishes this without any loss of time in the beginning.
Your problem was Buy (entries), so to speed up just run that set of code to get "the last" of first entries in all tickers.
Anyway, at least you found the way
I did not say to start calculating the indicators at the Start Date. My point was that you should make the Start Date as early as possible, and in your example, that's 200 days after the inception of the youngest symbol.
With your logic, no symbol will be able to generate a Buy signal until all symbols have generated at least one Buy signal, i.e. your Start Date is the latest "first Buy signal" among all symbols. What if the youngest symbol doesn't generate a Buy signal for 2 years after inception? Now you've delayed your entire backtest by 2 years after the most recent inception, instead of just waiting 200 days since the most recent inception.
Also, as you're probably aware, using this technique is going to make your portfolio metrics inaccurate, because things like CAR are calculated using the entire analysis date range, but your equity curve will be flat from the date range start until you allow Buy signals to be generated.
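To put rough numbers on that effect, here is a small arithmetic sketch. All figures are hypothetical, and it assumes CAR follows the usual compound annual growth formula, (ending equity / starting equity) ^ (1 / years) - 1.

```afl
// Hypothetical numbers, for illustration only
startEquity = 100000;
endEquity = 200000;  // equity doubles over the test
yearsInRange = 30;   // full analysis date range, flat until Buys are allowed
yearsTrading = 4;    // the period in which Buy signals were actually allowed

// Compound annual return in percent over each period
carRange = 100 * ( ( endEquity / startEquity ) ^ ( 1 / yearsInRange ) - 1 );
carTrading = 100 * ( ( endEquity / startEquity ) ^ ( 1 / yearsTrading ) - 1 );

printf( "CAR over full range: %.2f\n", carRange );
printf( "CAR over trading period: %.2f\n", carTrading );
```

With these made-up inputs the same doubling of equity works out to roughly 2.3% per year over the 30-year range versus roughly 18.9% per year over the 4 years actually traded, which is why the analysis range should be trimmed to match the effective start date.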
Ok, see your point and think you are right. I do want the back tests to start as soon as there is enough data for all symbols to possibly initiate a trade.
The problem with checking data for each indicator is that if you are working out the details of a new strategy and changing indicators and the number of bars they use, you then also have to adjust the start date scan code. It has to look not only for the symbol with the least data but also for the indicator that needs the most bars on each symbol.
Was hoping to find a single value that could be checked for each symbol. Wonder if there is something in the internal AFL engine that would indicate that date.
I think you are overengineering / overthinking this. In practice symbols are added / removed (delisted) constantly. You would exclude yourself from majority of market activity if you do what you are trying to do.
Also, it is easy to find which symbol has the shortest history: just run an exploration.
Delisted symbols are not a problem because the strategies to be tested are on watch lists with a limited number of symbols, often fewer than 10 or 12 as might be the case for a sector rotation strategy using ETFs, and sometimes even fewer than 5 or 6 as might be the case for asset classes using ETFs or Mutual Funds.
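For example, something along these lines should list one row per symbol with its first available date and bar count. It is a sketch, not code from this thread; the column titles are arbitrary.

```afl
// Exploration sketch: one row per symbol showing how much history it has.
SetBarsRequired( sbrAll, sbrAll );   // make sure all bars of the symbol are loaded

dt = DateTime();
Filter = Status( "lastbarinrange" ); // output only a single row for each symbol

AddColumn( dt[ 0 ], "First bar", formatDateTime ); // date of the oldest quote
AddColumn( BarCount, "Bars", 1.0 );                // total number of bars available
```

Clicking the "Bars" column header in the result list sorts the symbols so the one with the shortest history is easy to spot.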
Sometimes an ETF strategy back test period can be extended by using the index upon which the ETF is based, sometimes by using highly correlated alternate ETFs and Mutual Funds.
For example most XL sector ETFs have data going back to the year 1998, but XLC and XLRE are comparatively recent. The data for XLRE can be extended using IYR or DFREX while XLC can be extended using VOX or PRMTX.
With a limited number of symbols, back test results turn out quite differently in the period where all the symbols have data vs. those periods where only some of the symbols have data.
The solution of using the shortest BarCount as a start date solves the problem of only testing when all of the symbols could be traded, but can still give skewed results when looking at back test metrics as the first partial year of trading will be included in CAR and similar metric calculations.
The ideal solution is to know the earliest possible start date for the symbols being tested based on available data plus the amount of data required for all the indicators being used.
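One way to approximate that without tying the scan to any particular indicator is to keep a single hand-maintained lookback value and take the date of that bar on each symbol. The sketch below reuses the scan/backtest sequence from earlier in the thread. Note that maxLookback and firstReady are names invented for this example, and maxLookback still has to be updated by hand whenever the longest indicator period changes.

```afl
#pragma sequence(scan, backtest)
SetBarsRequired( sbrAll, sbrAll ); // array index 0 must be the first quote of the symbol

// ASSUMPTION: longest lookback used by any indicator below, maintained by hand
maxLookback = 100;

//--- Scan pass: find the latest "first ready" date among all symbols
if ( Status( "action" ) == actionScan )
{
    if ( Status( "stocknum" ) == 0 ) StaticVarSet( "BeginTest", 0 );

    // Symbols with too little history are skipped; their indicators
    // would stay Null for the whole test anyway
    if ( BarCount > maxLookback )
    {
        dn = DateNum();
        firstReady = dn[ maxLookback ]; // first bar with maxLookback bars behind it

        if ( firstReady > StaticVarGet( "BeginTest" ) )
            StaticVarSet( "BeginTest", firstReady );
    }

    _exit();
}

//--- Backtest pass: same rules as the simple example, gated by the start date
PositionScore = ROC( C, 63 );
Buy = C > MA( C, 100 ) AND DateNum() >= StaticVarGet( "BeginTest" );
Sell = C < MA( C, 100 );
```

This starts the test as soon as the youngest symbol has maxLookback bars, rather than waiting for its first Buy signal, and only one number needs changing when the indicators change. It does not fix the CAR issue on its own; the analysis From date would still need to be set to the BeginTest date for the metrics to cover only the traded period.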