I am developing a new system. After optimization, I got 7.1 CAR/MaxDD, which is wonderful, but the Sharpe Ratio is only 0.82. What does that mean? Why such divergence? (In my experience so far, the two numbers are usually positively correlated.) I am quite sure it's over-optimized, but are there any other reasons?
The Monte Carlo result is completely unacceptable, as more than 50% of the time it shows 100% drawdown. So obviously the high CAR/MaxDD above doesn't take the Monte Carlo result into consideration. Does that mean the CAR/MaxDD results from the best run out of the whole simulation? And how can I optimize for a good Monte Carlo result (i.e., which stat should I focus on rather than CAR/MaxDD or Sharpe Ratio)?
Copy and paste the report as TEXT here, not screenshots, because screenshots show only a small portion of the information. Trade stats are equally important as (if not more important than) summary metrics like CAR/MDD. You might have one or two disastrous trades that cause that Monte Carlo to show bankruptcy in 50% of cases. You say yourself that you over-optimized the system. CAR/MDD would show you this over-optimized result; Monte Carlo would show you what would happen if you are not super-lucky.
Quite clearly you have a few big winners and a few big losers (yet we can't see trade stats from your screenshots; send TEXT of the entire thing, not screenshots). When those lucky winners are taken out, your system loses big time. That is what Monte Carlo shows.
Your system is not a system, it is a lottery.
A few things to note. First, the metric calculated by AmiBroker is labelled "Sharpe Ratio of Trades". This is different from the portfolio-level Sharpe Ratio that is calculated from CAR as described here: Sharpe Ratio: Definition, Formula, and Examples. You can control the risk-free rate used to calculate the SR on the Report tab of the Analysis Settings window.
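As an illustration only (this is not AmiBroker's internal code), here is a minimal Python sketch of the portfolio-level Sharpe Ratio calculation that the linked article describes, assuming a list of monthly returns and an annual risk-free rate. It also shows why a volatile equity curve scores lower than a steady one with a similar average return:

```python
import math

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=12):
    """Annualized Sharpe Ratio from periodic (e.g. monthly) returns.

    returns: periodic portfolio returns as decimals.
    risk_free_rate: annual risk-free rate as a decimal.
    """
    rf_per_period = risk_free_rate / periods_per_year
    excess = [r - rf_per_period for r in returns]
    mean = sum(excess) / len(excess)
    # Sample standard deviation of excess returns.
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    sd = math.sqrt(var)
    return (mean / sd) * math.sqrt(periods_per_year)

# Hypothetical return series: nearly steady ~1.1%/month vs. a choppy
# series with a similar average but far larger standard deviation.
steady = [0.010, 0.012] * 6
volatile = [0.10, -0.08] * 6
```

The standard deviation in the denominator is exactly why, as discussed below, big upward thrusts in equity also drag the ratio down.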
The metrics in the Backtest Report are derived from the trade list generated by your backtest. They are not affected by the Monte Carlo analysis.
As for MC, I assume you have read the documentation here: Monte Carlo simulation. Pay particular attention to the Position Sizing section, and this statement:
Don't change - uses original position size as used during backtest. Keep in mind that it always uses original dollar value of the trade (or whatever currency you use), even if your formula is using percent of portfolio equity.
If your backtest was run using % of Equity position sizing or something similar, then it's likely that your position sizes (measured in dollars) were growing throughout the test. Therefore, a loss late in the backtest might be a large dollar amount, even though it may not have a huge impact on your equity. In an MC test, if that same trade occurs early, it could devastate your account, because it will be a much larger percentage of your equity. As @Tomasz already pointed out, you likely have quite a few of these big losers, or else they would not be affecting so many of the MC results.
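To see why the "Don't change" setting bites, here is a toy Python sketch with hypothetical dollar P&Ls (not your actual trades): the same fixed-dollar loss that was harmless at the end of the backtest, when equity had grown, is near-fatal when a Monte Carlo run happens to draw it first:

```python
def min_equity(trade_pnls, starting_equity=10_000):
    """Apply fixed dollar trade P&Ls in order; return the lowest
    equity level reached along the way."""
    equity = starting_equity
    low = equity
    for pnl in trade_pnls:
        equity += pnl
        low = min(low, equity)
    return low

# Hypothetical trade list from a %-of-equity backtest: dollar sizes
# grew with equity, so the late $9,000 loss was a modest percentage
# of equity when it actually occurred.
trades = [500, 800, 1200, 2000, 3000, -9000]
original_low = min_equity(trades)       # loss comes last: survivable

shuffled = [-9000, 500, 800, 1200, 2000, 3000]
early_loss_low = min_equity(shuffled)   # same loss first: near ruin
```

With these numbers the original sequence never drops below $8,500, while the shuffled one falls to $1,000, a 90% drawdown from the same trade list.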
Just my 2 cents. I agree with what both Tomasz and mradtke said, and would reiterate and share my opinion.
I am not much of a Monte Carlo fan, nor of the Sharpe Ratio. Addressing the latter is easiest. The Sharpe Ratio is a dinosaur throwback to buy-and-hold portfolios, where it could be argued that upside return volatility was somewhat indicative of downside volatility/risk. However, in systematic, tactical trading systems, you would never want to rely on a metric that penalizes any upward thrusts/volatility in equity.
Monte Carlo, well...
Even if you were a strong proponent of MC, if you were looking to measure "scrambled" results for tightness of consistency, and/or consistency across instruments, as was pointed out, you should do so with a constant, fixed bet size throughout the test. Then you can repeat it with variable bet sizing money management, choose your safety/emotional levels, and get a reality check.
That said, there is very important information, and there are correlations, that are lost in a "scramble", as well as causations. If you took a picture and then scrambled all of its pixels, you could make it show snow and nothing discernible, which might be interesting, but what does it really mean? So if you don't do this in a more risk-controlled environment, i.e., with limits and management of overall portfolio heat and position correlation from within the backtest, and the ability to manage it in the "moment of now" as you step through time, then MC is meaningless IMHO.
Also, keep in mind that in dynamic bet sizing simulations, the largest losses usually occur at the highest equity levels, where the sizing is obviously largest. So backtest stop dates and trade losses at the end have dramatic effects on performance metrics, especially when MC throws all those losers near the end of the backtest. Try to use logarithmic performance measures on your equity line, etc.
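On the logarithmic point: plotting log(equity) makes equal percentage moves occupy equal vertical distance, so a big late-test dollar swing no longer dwarfs everything earlier. A quick Python sketch with made-up numbers:

```python
import math

def log_equity(equity_curve):
    """Log-transform an equity curve: equal vertical steps now mean
    equal percentage moves, so late dollar swings stop dominating."""
    return [math.log(e) for e in equity_curve]

# A 10% move looks the same size early ($100 -> $110) and late
# ($1000 -> $1100) on a log scale, unlike on a linear one.
curve = [100.0, 110.0, 1000.0, 1100.0]
logs = log_equity(curve)
```

On the linear curve the second 10% gain is ten times taller than the first; on the log curve the two steps are identical.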
Thank you so much, mradtke. Yes, my backtest used percent of equity, but the MC used "Don't change". After I changed the MC setting to percent of equity, it fixed the problem.
@Tomasz Please see the TEXT report I posted earlier. Thank you
@Sean It makes so much sense, as you mentioned, that the Sharpe Ratio penalizes upward thrusts, given that the denominator in its formula is the standard deviation of excess return.
My new question: why is the backtest result in AmiBroker the same every time I run it? Isn't it just one random sample among hundreds or thousands of possible runs? I remember long ago, when I used MetaStock, each single run was different. Thank you again.
The backtest result is generated by applying your rules to your trading universe over the date range that you selected. There will be nothing random about those results unless you intentionally introduced randomness into your rules.
The MC results are generated by randomly selecting trades from the primary backtest trade list. As stated previously, the MC results do not affect the primary backtest metrics.
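Conceptually, that MC pass works roughly like the following Python sketch (with hypothetical trade returns; AmiBroker's actual implementation offers more options, such as the position-sizing modes mentioned earlier in the thread). The fixed trade list from the primary backtest goes in; only the random draws vary between simulated runs:

```python
import random

def monte_carlo_finals(trade_returns, n_runs=1000, seed=1):
    """Build alternative equity outcomes by drawing trades at random
    (with replacement here, one common resampling variant) from the
    fixed trade list produced by the primary backtest."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_runs):
        equity = 1.0
        for _ in range(len(trade_returns)):
            equity *= 1.0 + rng.choice(trade_returns)
        finals.append(equity)
    return finals

# Hypothetical per-trade returns (fractions of equity). The primary
# backtest trade list never changes; only each run's draws do.
trades = [0.02, -0.01, 0.03, -0.02, 0.01]
finals = monte_carlo_finals(trades)
```

The distribution of `finals` (or of each run's drawdown) is what the MC report summarizes; the primary backtest metrics are computed before any of this happens.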
@mradtke Interesting. Could you help me understand this, if each primary backtest is not random?
For example, in my end-of-day system, my maximum number of positions is 2, and today I get 3 buy signals, from which I have to choose 2 to buy. Obviously there are 3 combinations, which may generate different results. So how could every single run of a primary backtest with hundreds or thousands of trades produce the same result? Did I miss something?
The built-in PositionScore variable is used to rank the entry signals so that you take the same trades every time you run the backtest. If you don't assign PositionScore, then the entry signals will be ranked alphabetically using the ticker symbol. If you haven't already done so, you should read this: Portfolio-level back testing
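A toy Python illustration of the idea (not AmiBroker's code): once the ranking key is deterministic, e.g. score descending with an alphabetical tie-break, the same signals always produce the same entries, so the backtest is repeatable:

```python
def pick_entries(signals, max_positions=2):
    """signals: list of (symbol, score) pairs for one bar.
    Rank by score descending, break ties alphabetically by symbol,
    and take the top max_positions. The ordering is fully
    deterministic, so repeated runs choose identical trades."""
    ranked = sorted(signals, key=lambda s: (-s[1], s[0]))
    return [symbol for symbol, _ in ranked[:max_positions]]

# Three signals, room for only two positions: IBM wins on score,
# then AAPL beats MSFT alphabetically on the 1.5 tie.
signals = [("MSFT", 1.5), ("AAPL", 1.5), ("IBM", 2.0)]
chosen = pick_entries(signals)
```

Without a deterministic tie-break (imagine picking randomly among the 1.5-scored symbols), each run could indeed produce a different trade list, which is the MetaStock behavior described above.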
@mradtke there is an older Knowledge Base article that includes a few other factors in addition to the alphabetic order (when PositionScore is not defined).
Thanks @portfoliobuilder. I had a vague recollection that that was the case, but since I always use PositionScore, the default behavior beyond alphabetical order never sticks with me.
The low payoff ratio and high winning rate show that the profits are not particularly concentrated in a small set of winners. The max drawdown is far from high, so your MC results could be the consequence of a high frequency of losers in random scenarios. I'm more concerned with the low number of trades, which is not enough to draw conclusions. Test it over much longer periods.
By the way, there's nothing wrong with a strategy whose profits depend on a small percentage of big winners, although in that case you need a larger number of trades to assess its profitability and robustness at the same confidence level.
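One way to put a rough number on "enough trades": under the normal approximation to the binomial, the margin of error on an observed win rate shrinks with the square root of the trade count. A hedged Python sketch (illustrative numbers only, not a substitute for proper statistical testing):

```python
import math

def win_rate_margin(win_rate, n_trades, z=1.96):
    """Approximate 95% margin of error on an observed win rate,
    using the normal approximation to the binomial distribution."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / n_trades)

# A 55% observed win rate over 100 trades is known only to within
# roughly +/- 10 percentage points; 1000 trades narrows that to
# about +/- 3 points.
m_100 = win_rate_margin(0.55, 100)
m_1000 = win_rate_margin(0.55, 1000)
```

The same square-root logic applies to other per-trade statistics, which is why strategies relying on rare big winners need disproportionately long test periods.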
It's difficult to give advice with limited information; anyway, you could try to implement a filter that stops trading during unfavorable times, if you can identify them reliably. I've also noticed that the strategy holds losers longer than winners. In some cases, one can easily improve such a system with a time stop (possibly the N-bar stop option) or a time-based stop loss.
I appreciate your reply; a lot of good points there. The bad MC results were because of the wrong setting, which is now corrected after @mradtke pointed it out. I am curious: why do you say 500 trades is a low number? How many trades do you consider enough? The backtest was over a 5-year period.
Great observation that losing trades stay open longer. I actually implemented something to cut the length of trades; cutting trades shorter cuts losses as well as potential gains, and the result is already optimized.