I'm really trying to squeeze a lot out of AmiBroker to backtest some fast strategies, and I'm being slowed down by bottlenecks.
I am backtesting thousands of stocks at one-second resolution. I just start with one day and then slowly expand the date range to cover more time. The problem, as I think I read here in the forum, is memory bandwidth. The CPUs are not at 100% while backtesting with these settings; only when backtesting minute bars are they fully utilized.
I've tested my PC with Memory Mark and got this result:
So I am lost as to how I could improve my hardware to speed up the backtests. I'm not asking to compute everything in seconds, but I thought that since the CPUs are not at 100%, increasing the memory bandwidth would help.
I was considering buying an octa-channel memory server or similar, but after this test I don't know whether that would help.
I am lost as to how I could improve the speed of the backtests. Right now I am focused on improving the hardware.
So, more or less I guessed that, but I tried another server that scored better on the threaded memory test, and the backtest takes almost the same time while the CPUs wait all the time. The database is local and fully loaded into the AmiBroker cache; the disk is idle during the backtest.
What test could I run on the different servers to guide myself towards better hardware for AmiBroker? I don't really know what to do to speed up the backtests, if that is possible at all.
So accessing the data is serial; I guess this is the step that is the bottleneck.
But couldn't this step be parallel? Couldn't each thread access its own symbol to test?
Did you read the article carefully and slowly? You are asking questions that are answered in the article.
It is really recommended to read it very carefully, because everything is explained there. It is not "serial" vs "parallel"; that is an oversimplification you are making. Each thread accesses its own data in parallel, BUT... first this data has to be present in RAM somehow. It needs to be read from a single shared resource (the DISK), and system RAM is ALSO a single shared resource. Neither RAM nor DISK scales with your core count. RAM is much slower than a single core.
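A back-of-the-envelope way to see this (all figures below are hypothetical, for illustration only, not measurements of any real machine): if the memory bus is the shared resource, dividing its total bandwidth by each core's data demand gives a rough ceiling on how many cores it can actually keep busy.

```python
def cores_fed(bus_bandwidth_gb_s: float, per_core_demand_gb_s: float) -> float:
    """Rough upper bound on how many cores a shared memory bus can keep busy.

    Beyond this count, extra cores mostly wait on the bus instead of
    computing. Both inputs are illustrative guesses, not measured values.
    """
    return bus_bandwidth_gb_s / per_core_demand_gb_s

# e.g. a ~50 GB/s dual-channel bus, each core streaming ~10 GB/s of bar data:
print(cores_fed(50.0, 10.0))  # -> 5.0, so a 16-core CPU would sit mostly idle
```

This is why CPU usage stays low on memory-bound workloads even though nothing is "wrong" with the CPU itself.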
Guessing from the timings you get, I doubt your settings are correct. It looks like the in-memory cache is too small and data are read in and flushed out of the cache all the time.
Tools->Performance Monitor will show actual numbers.
You are using Apply to "ALL SYMBOLS", while the Performance Monitor shows that there are 28,000 symbols in the database and only 2,112 are in the in-memory cache. Since your database uses 200 MB per symbol, fully caching the entire database would require 28000 * 0.2 GB/symbol = 5600 GB OF RAM.
Obviously that is an unrealistic expectation, as your computer doesn't have that much.
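The arithmetic above, written out in Python using the figures quoted from the Performance Monitor:

```python
symbols_in_db = 28_000   # total symbols reported by the Performance Monitor
gb_per_symbol = 0.2      # ~200 MB of second bars per symbol

ram_needed_gb = symbols_in_db * gb_per_symbol
print(ram_needed_gb)     # 5600.0 GB -- far beyond any workstation's RAM
```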
But that is because I had AAPL selected, and it has a lot of activity and a lot of second bars; the stocks I am mostly interested in don't have that many bars.
I have filtered with a list of around 4,000 small caps that don't have that many bars, for example CERE in the screenshot.
What you are doing now is "doing nothing", just waiting for backfill. The "Info" tab shows that it took ZERO SECONDS to process the AFL.
Don't expect the CPU to be busy when your AFL IS EMPTY or doing pretty much nothing. Again, read the article, put some REAL, COMPLEX, HEAVY MATHEMATICAL WORK into the AFL, and then you will see CPU use.
IMPORTANT:
It is advised to run your backtest OFFLINE (not connected to IQFeed). If you are connected to IQFeed and try to run on 28,000 symbols, you will hit the 500-symbol limit that IQFeed has, and symbols will be subscribed/unsubscribed non-stop, hitting IQFeed servers. Most of the time will be spent WAITING FOR IQFEED TO RESPOND to the endless subscribe/unsubscribe requests. Your computer would show ZERO USAGE, because it just keeps hitting IQFeed, and IQFeed will be throttling such abuse.
PLEASE read the article. It explains "why". EVERYTHING IS IN THE ARTICLE. The "afl" figure is ZERO, which means ZERO seconds were spent processing AFL. Also, your backtest produces ZERO TRADES.
How should I know that? You did not say that in the first place. You post irrelevant screenshots and expect relevant responses. Sorry, but this is not productive.
Sorry about that; the Info tab keeps all the backtests I run,
but the last backtest in both screenshots is the relevant one. The screenshots are not irrelevant; I try to make them contain all the relevant information, but the forum doesn't allow me to post more than one image, which only makes communication harder.
In fact, both screenshots show that the analysis is still running and that the remaining time is quite high, and I can tell you it keeps increasing as time passes. I am trying to show you that the activity on the machine is low: the disk is not reading, the RAM is not filling up...
At the bottom, the Xmeters are not showing any activity. And I cannot show you the AFL because it is private strategy work, but it is complex, with some indicators...
If you want to try divide and conquer, maybe OP
can start 4 instances of the Analysis window with Backtest, dividing all the symbols into, say, 4 watchlists.
Or maybe even spin up 4 instances of AB (not that AB is the issue), but if you want to go down this lane, you might as well try it.
Observe if something changes.
Because the CPU utilization is so low, you can try to bump it up.
With it being a local DB, it should not be an issue if you can split the symbols into roughly 4 groups.
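To prepare those groups, a small helper like this could split the symbol list round-robin into roughly equal buckets, one per watchlist / Analysis instance (an illustrative Python sketch, not an AmiBroker API; the ticker names are placeholders):

```python
def split_symbols(symbols, groups=4):
    """Round-robin split of a symbol list into N roughly equal buckets,
    one bucket per watchlist / Analysis instance."""
    buckets = [[] for _ in range(groups)]
    for i, sym in enumerate(symbols):
        buckets[i % groups].append(sym)
    return buckets

# Placeholder ticker list; each bucket could be saved as a watchlist file.
tickers = [f"SYM{i}" for i in range(10)]
print([len(b) for b in split_symbols(tickers)])  # -> [3, 3, 2, 2]
```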
Why on earth does nobody seem to be reading the article and the advice above?
period = Optimize( "period", 10, 2, 501, 1 );
Buy = Cross( C, MA( C, period ) );
Sell = Cross( MA( C, period ), C );
// CRUCIAL PART !!!! FROM THE ARTICLE THAT WAS QUOTED
// DOZENS OF TIMES already
// YOU HAVE TO CALL HEAVY MATH to make CPU not idling
// add some math to force i7 CPU to sweat a little bit
for( i = 0; i < 1000; i++ ) x = acos( log( C ^ period ) );
Can you try what I suggested?
And you must share some part of your code with Tomasz, otherwise there is no point in discussing.
For example:
you could have a shared resource in your AFL which is just making the other threads wait;
you are accessing some AFL plugin functions;
you are using some OLE stuff / EnableScript(), etc.
I don't know, maybe some HTTP calls. You see why it's hard for others?
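A minimal sketch of that first point (in Python, as a hypothetical stand-in, since the actual AFL is private): when every thread has to pass through one shared lock, adding threads adds waiting rather than throughput, and the cores look idle.

```python
import threading

lock = threading.Lock()
processed = []

def worker(symbol):
    # Every thread queues on the same lock, so the "parallel" work is
    # effectively serialized: cores wait instead of computing.
    with lock:
        processed.append(symbol)

threads = [threading.Thread(target=worker, args=(f"SYM{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(processed))  # 8 symbols processed, but with no parallel speedup
```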
The high-CPU-utilization example code by TJ is also a case where the work is very parallel in nature.