Backtest speed and memory bandwidth

Hi,

I'm trying to squeeze as much as I can out of AmiBroker to backtest some fast strategies, and I am being slowed down by bottlenecks.

I am backtesting thousands of stocks on one-second bars. I start with 1 day and then slowly expand the date range to cover more time. The problem, as I read somewhere on this forum I think, is memory bandwidth. The CPUs are not at 100% while backtesting with these settings. They are fully utilized only when backtesting one-minute bars.

I've tested my PC with Memory Mark and got this result:

1 day of backtest takes around 5 minutes

Then I tested a server, with this Memory Mark result:

But the backtest takes almost the same time:

So I am lost as to how I could improve my hardware to speed up the backtests. I am not asking to compute everything in seconds, but I thought that if the CPUs were not at 100%, increasing the memory bandwidth would help.

I was considering buying an octa-channel memory server or similar, but after this test I don't know if that will help.

I don't know how I could improve the speed of the backtests. Right now I am focused on improving the hardware.

In your backtest, 99% of the time (315 seconds out of 319) is spent accessing the data. Data access is the problem, not anything else.
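To put that number in context, here is a rough Amdahl-style estimate. It is only a sketch: the 315 s and 319 s figures come from the Info tab quoted above, while the function name and the 8x CPU speedup are made up for illustration.

```python
def max_speedup(total_s, data_access_s, compute_speedup):
    """Best-case overall speedup if only the non-data-access part gets faster."""
    compute_s = total_s - data_access_s
    new_total_s = data_access_s + compute_s / compute_speedup
    return total_s / new_total_s


total_s = 319.0        # total backtest time reported in the Info tab
data_access_s = 315.0  # part of it spent accessing data

print(f"data access share: {data_access_s / total_s:.1%}")
# Hypothetical: the formula itself runs 8x faster (more or faster cores)
print(f"overall speedup: {max_speedup(total_s, data_access_s, 8):.3f}x")
```

Under these assumptions, even a much faster CPU changes the total run time by only about one percent, because almost all of the time sits in data access.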


I more or less guessed that, but I tried another server that scored better in the threaded memory test, and the backtest takes almost the same time while the CPUs are waiting all the time. The database is local and it is all loaded in the AmiBroker cache; the disk is idle during the backtest.

What test could I run on the different servers to guide me towards better hardware for AmiBroker? I don't really know what to do to speed up the backtests, if that is possible at all.

Please read

For the detailed discussion what limits the speed and how to interpret data given in the info tab

So accessing the data is serial; I guess this is the step that is the bottleneck.
But couldn't this step be parallel? Couldn't each thread access its own symbol to test?

Did you read the article carefully and slowly? You are asking questions that are answered in the article.

It is really recommended to read it very carefully, because everything is explained there. It is not "serial" vs "parallel". That is an oversimplification on your part. Each thread accesses its own data in parallel BUT.... first this data has to be somehow present in RAM. It needs to be read from a single shared resource (DISK), and system RAM is ALSO a single shared resource. Neither RAM nor DISK scales with your core count. RAM is much slower than a single core.

Judging from the timings you get, I doubt your settings are correct. It looks like the in-memory cache is too small and data are being read into and flushed out of the cache all the time.

Tools->Performance Monitor will show actual numbers.


So here is a backtest I am running on the server:

The remaining time keeps increasing.

The quotation data cache doesn't seem to fill very fast.

The CPUs are not being used at all.

Is AmiBroker flushing data in and out all the time??

You are using Apply to "ALL SYMBOLS", while the performance monitor shows that there are 28000 symbols in the database and only 2112 are in the in-memory cache. Since your database uses 200MB per symbol, fully caching the entire database would require 28000 * 0.2GB/symbol = 5600 GB OF RAM.
Obviously that is an unrealistic expectation, as your computer doesn't have that much.
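The same arithmetic as a small sketch, in case someone wants to plug in their own numbers. The symbol counts and the 200 MB per symbol are the figures quoted above; the 128 GB machine and the function names are hypothetical.

```python
def total_ram_gb(symbols, mb_per_symbol):
    """RAM needed to hold every symbol in the in-memory cache, in GB."""
    return symbols * mb_per_symbol / 1000


def symbols_that_fit(ram_gb, mb_per_symbol):
    """How many symbols of this size fit into the given amount of RAM."""
    return ram_gb * 1000 // mb_per_symbol


symbols_in_db = 28000   # from the performance monitor
symbols_cached = 2112   # from the performance monitor
mb_per_symbol = 200     # per-symbol size quoted above

print(f"RAM for full cache: {total_ram_gb(symbols_in_db, mb_per_symbol):.0f} GB")
print(f"share of symbols cached: {symbols_cached / symbols_in_db:.1%}")
# Hypothetical 128 GB machine, for illustration only
print(f"symbols fitting in 128 GB: {symbols_that_fit(128, mb_per_symbol)}")
```

The per-symbol size is the dominant input here, so the estimate changes a lot if most symbols have far fewer bars than the one it was measured on.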

But that is because I had AAPL selected, which has a lot of activity and a lot of one-second bars. The stocks I am mostly interested in don't have that many bars.

I have filtered with a list of around 4000 small caps that don't have that many bars, for example CERE in the screenshot.

The disk is not doing anything, the RAM is not filling up fast, the CPU is not doing anything...

Why is this?? Shouldn't at least one component be fully busy?

Again, please re-read the article.

What you are doing now is "doing nothing", just waiting for backfill. The "info" tab shows that it took ZERO SECONDS to process AFL.
Don't expect CPU to be busy when your AFL IS EMPTY or doing pretty much nothing. Again, read the article, put some REAL COMPLEX HEAVY MATHEMATICAL WORK in the AFL and then you will see CPU use.

IMPORTANT:

It is advised to run your backtest OFFLINE (not being connected to IQFeed). If you are connected to IQFeed and try to run on 28000 symbols, you will hit the 500-symbol limit that IQFeed has, and it will subscribe/unsubscribe symbols, hitting IQFeed servers. Most of the time will be spent WAITING FOR IQFEED TO RESPOND to non-stop subscribe/unsubscribe. Your computer would be showing ZERO USAGE, because it just keeps hitting IQFeed and IQFeed will be throttling such abuse.

Why do you say that?

It is a screenshot of an ongoing backtest. The AFL is quite complex; it is not empty.

PLEASE Read the article. It explains "why". EVERYTHING IS IN THE ARTICLE. The "afl" figure is ZERO, which means ZERO seconds spent processing AFL. Also your backtest produces ZERO TRADES.

The database is totally offline, not connected to any plugin.

Those backtests were run earlier and are not relevant.

The relevant one is the last one, which is still running. It says Backtest Started...

How should I know that? You did not say that in the first place. You post irrelevant screenshots and expect relevant responses. Sorry, but this is not productive.

Please follow this advice: How to ask a good question

And use COMPLEX FORMULA given in the article (with transcendental trigonometric functions)

Sorry about that, the Info tab keeps all the backtests I run.

But the last backtest in both screenshots is the relevant one. The screenshots are not irrelevant. I try to make them contain all the information I think is relevant, but the forum doesn't allow me to post more than one image, which only makes communication harder.

In fact, both screenshots show that the analysis is still running and that the remaining time is quite high. I can tell you that it keeps increasing as time passes. I am also trying to show you that activity on the machine is low: the disk is not reading, the RAM is not filling up...

At the bottom, XMeters is not showing any activity. I cannot show you the AFL because it is private strategy work, but it is complex, with some indicators...

Again another screenshot

Divide and conquer. Have you tried swapping the AFL for a simple one? Does it change anything?

@awilson Since the CPU usage is already so low for his existing strategy, it is unlikely that a simpler one would make any difference, unlike the case we had here:
Regarding afl execution time - AFL Programming - AmiBroker Community Forum

If you want to try divide and conquer, maybe the OP can

  1. start 4 Analysis windows, each running the backtest on one of 4 watchlists that together cover all symbols.
  2. OR maybe even spin up 4 instances of AB (not that AB is the issue), but if you want to go down this lane, you might as well try it.

Observe if something changes.

Because the CPU utilization is so low, you can try to bump it up this way.
With a local DB, it should not be an issue as long as you can split the symbols into roughly 4 groups.
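A minimal sketch of one way to build those 4 groups so that each gets a similar amount of data, assuming you can export a bar count per symbol. The bar counts and the tickers AAA to DDD below are made up for illustration, and turning the groups into watchlists is left to you.

```python
import heapq


def split_balanced(bar_counts, n_groups=4):
    """Greedy split: largest symbols first, each goes to the currently lightest group."""
    heap = [(0, i) for i in range(n_groups)]  # (total bars so far, group index)
    groups = [[] for _ in range(n_groups)]
    by_size = sorted(bar_counts.items(), key=lambda kv: kv[1], reverse=True)
    for symbol, bars in by_size:
        total, i = heapq.heappop(heap)
        groups[i].append(symbol)
        heapq.heappush(heap, (total + bars, i))
    return groups


# Hypothetical bar counts, for illustration only
bar_counts = {"AAPL": 23400, "CERE": 1200, "AAA": 900,
              "BBB": 800, "CCC": 700, "DDD": 600}

for n, group in enumerate(split_balanced(bar_counts), start=1):
    print(f"Watchlist {n}: {group}")
```

Splitting by bar count rather than by symbol count matters here because a heavily traded symbol can carry many times more one-second bars than a small cap.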

Why on earth does nobody seem to be reading the article and the advice above?

period = Optimize( "period", 10, 2, 501, 1 );
Buy = Cross( C, MA( C, period ) );
Sell = Cross( MA( C, period ), C );

// CRUCIAL PART !!!! FROM THE ARTICLE THAT WAS QUOTED
// DOZENS OF TIMES already
// YOU HAVE TO CALL HEAVY MATH to make CPU not idling
// add some math to force i7 CPU to sweat a little bit
for( i = 0; i < 1000; i++ ) x = acos( log( C ^ period ) );

I've now tested this code


and the server just flies over all the symbols, as expected.

The CPU is working at full load and the analysis takes under 3 minutes.

So what kind of code do I have that slows the analysis down so much without keeping the CPU, the memory, or the disk busy?

Can you try what I suggested?
And you must share some part of your code with Tomasz, otherwise there is no point in discussing.
For example,
you could have a shared resource in your AFL that is making the other threads wait.
you could be calling some AFL plugin functions
you could be using some OLE stuff / EnableScript() etc
I don't know, maybe some HTTP calls. You see why it's hard for others?

TJ's example of high-CPU-utilization code is also a case that is very parallel in nature.