Consider this. An AFL formula, when run on a single stock, takes 1 minute to execute fully and produces one trade in the Analysis window trade list. This takes up 1 thread. Suppose the CPU concerned has 10 cores and can run 20 threads in parallel, such as the i9-10900K.
So my question is: how much time should it take to run this AFL on 20 stocks? Should it not take the same amount of time, i.e. 1 minute? The CPU, as mentioned earlier, can run 20 threads in parallel, and since the Analysis window allocates 1 thread per instrument, logically the execution time should be 1 minute. BUT THIS IS NOT THE CASE. It takes much longer, about 4 minutes. Why?
The answer is: IT DEPENDS. First, your CPU has only 10 cores, not 20 cores, so it can't run 20 threads independently. 2 threads share the same CPU core. So from the start it is not 20x faster as you think. And there are dozens of other things to consider.
The world is not as simple as elementary school math.
Please read this:
I didn't know that 2 threads cannot run in parallel on one CPU. It seems a little deceitful of the processor companies if the 2 threads must share the CPU, because I always thought they could run in parallel. Anyway, thanks for clearing this up, Tomasz.
One problem I face is this: when live algo trading on a watchlist of 100 stocks at a base time interval of 1 minute, if 99 stocks complete execution in under 1 minute but one stock takes 10 minutes, then the queue is on hold and the cycle has to wait for that 1 stock to complete. This is very unfair to the 99 stocks that completed on time. My AFL is very complicated and does not use moving averages etc. It needs sub-leg verification, which requires me to iterate over lower timeframes one by one, sometimes as many as 100 timeframes. So that one culprit stock may need more time because its sub-leg was verified in a timeframe that was way down the line. I wonder how you professionals handle such scenarios. A timer that quits execution for the culprit stock comes to mind as a solution, like a timeout at 1 minute. That way the cycle completes at 1 minute and the trade list is refreshed every 1 minute no matter what. Please share your insight.
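To make the timeout idea concrete, here is a minimal sketch in Python (not AFL, and not anything AmiBroker provides). It assumes the sub-leg verification is a loop over timeframes, as described above. The names `verify_with_budget`, `check` and `budget_seconds` are hypothetical and only serve the illustration.

```python
import time

def verify_with_budget(timeframes, check, budget_seconds, clock=time.monotonic):
    """Try timeframes in order; stop at the first one that verifies,
    or when the time budget is used up.

    Returns (timeframe, timed_out): the verified timeframe (or None) and a
    flag telling whether the search was cut short by the budget.
    `check` is a hypothetical callback standing in for the real verification.
    """
    deadline = clock() + budget_seconds
    for tf in timeframes:
        if clock() >= deadline:
            return None, True   # budget exhausted: give up on this symbol
        if check(tf):
            return tf, False    # sub-leg verified in this timeframe
    return None, False          # every timeframe tried, nothing verified
```

The check is cooperative: the clock is only consulted between timeframes, so a single slow verification step can still overrun the budget. Whether skipping a symbol for one cycle is acceptable is a trading decision, not a coding one.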
You are mixing up the terms. 2 threads can run on one CPU. What I wrote was that "2 threads share same CPU CORE". A CORE is a part of the CPU. The CPU is the entire processor; it typically consists of MANY CORES.
In a so-called "hyper-threading" CORE you have ONE CORE that can execute 2 threads, but NOT entirely in parallel. It depends on the construction of the specific CORE, because they differ from one family to another. Some resources (like registers) might be duplicated, but some resources (like floating point execution units) may be shared. The CORE might process some instructions in parallel and some not. This is complex stuff and definitely not as easy as 2+2=4.
https://www.liquidweb.com/blog/difference-cpu-cores-thread/
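A rough back-of-envelope model of the point about shared cores, as a Python sketch. The parameter `smt_gain` is a made-up knob for how much extra throughput a core gets from its second thread; the real figure depends on the CPU family and the workload, and the model ignores memory, cache and everything else.

```python
def estimated_minutes(symbols, cores, minutes_per_symbol, smt_gain):
    """Best-case wall time when `symbols` equal jobs share `cores` physical cores.

    smt_gain is the extra throughput a core gets from running a second thread
    (0.0 = none, 1.0 = a second thread is as good as a second core).
    This is an illustrative assumption, not a measured value.
    """
    effective_cores = cores * (1.0 + smt_gain)
    # A single job can never finish faster than its own run time.
    return max(minutes_per_symbol, symbols * minutes_per_symbol / effective_cores)

# 20 symbols, 10 cores, 1 minute per symbol, for a few assumed gains
for gain in (0.0, 0.25, 1.0):
    print(gain, estimated_minutes(20, 10, 1.0, gain))
```

Only a gain of 1.0, meaning two threads on one core behave like two full cores, brings the estimate down to 1 minute. Any smaller gain gives more than 1 minute before the other factors are even counted, so this model alone does not account for the observed 4 minutes.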
As to your formula: it is your code and it is up to you to write it efficiently. Typically proper coding leads to much higher speedups than any hardware upgrade. If you are struggling with proper coding, hire someone who is more proficient: Third party services, blogs, courses, books, add-ons
Agreed. Nothing compares to well-written code. Also, minimising the number of bars in a chart and setting the highest possible base interval helps. Thanks Tomasz. I am going through each and every article/thread you linked. Very helpful. Thanks again.
I have gone through the article you have written and I understand that RAM and cache play a major role. Taking this conversation further, can you define the best PC build for AmiBroker IF MONEY IS NO OBJECT? Consider running the Analysis window on a watchlist containing 100 stocks, each with 4000 bars at the base interval. Imagine that on the present build this load puts CPU consumption at 100% and consumes 30 GB of RAM (total available is 48 GB) while the analysis is running. What is THE BEST hardware build that can speed things up IF MONEY DOESN'T MATTER?
If the formula does "not" have scope for optimization, then maybe look at server-grade hardware like the Xeon Scalable series.
Windows 11 Pro does support dual processors (multi-socket chipsets, up to 2 sockets).
All the Xeon 3rd/4th/5th Gen parts, minimum "Silver", are scalable to 2 sockets.
Windows processor requirements Windows 11 supported Intel processors | Microsoft Learn
For even higher specs:
AB should run fine on a full GUI install of Windows Server OS, which supports 4/8 sockets. You then need at least a 4th Gen Gold (Xeon) for 4 sockets, and the Platinum series does 8 sockets.
So if you get a dual-socket chipset for Win 11 Pro, disable hyper-threading, then pop in a
Xeon Platinum 8468H (48 true cores, 105 MB cache)
or a
Xeon Platinum 8490H (60 cores)
and see the performance, then scale to a 2nd one.
Intel® Xeon® Platinum 8490H Processor
The H series is higher performance cores, otherwise mixed fast ones are named P and some slow cores have E. And AB can be started in multiple instances too.
Thank you. Is AmiBroker scalable to a multiprocessor setup?
It doesn't really matter from the application's standpoint whether the processing cores are on a single chip or on multiple chips. It is the Operating System that assigns threads to physical cores.
That is very insightful. Thank you.
Please also share your opinion on overclocking. This may have some peculiarities with AmiBroker, as AmiBroker uses the CPU non-stop at 100%, especially during live trading and analysis runs. There is not a single moment of rest for the CPU, unlike what you may see with other applications. That means long hours (a trading session is usually 6 hours) without any break for the CPU. Can an overclocked CPU withstand that?
What are your thoughts on overclocking an i9-10900? Mine runs at 4 GHz (without overclocking). Can I bump it up to 5 GHz? How much can I overclock without frying it (given the unique situation introduced by AB)? Would I need additional cooling? If yes, what kind? A Peltier module or some solid-state cooling solution?
Overclocking that CPU won't get you anywhere near a multi-Xeon setup.
Everything needs cooling, because that's where the performance can be extracted.
Now, using liquid nitrogen will work, but it's not practical or cost-effective.
I used to watch those overclocking videos a decade ago, esp. the AMD ones.
AmiBroker doesn't create any "unique situation". You just overclock, then check the system for stability for at least 24 hours using typical torture tests like Prime95: GIMPS - Free Prime95 software downloads - PrimeNet
Yes, that's what I meant: AmiBroker is used to trade the market live (usually a 6 and 1/2 hour session), during which the CPU usage will be NON-STOP, whereas other applications like video rendering, CAD/CAM tools etc. generally do not use the CPU non-stop for 6 hours. But yes, I take your point that a stress test should do the job.
Generally speaking, the machine should pass a 24-hour Prime95 torture test. If it passes this, it will handle anything.
Suppose there are two CPUs, with 16 cores per CPU and 2 threads per core. So 2 x 16 x 2 = 64 threads are available.
Now if I run an AmiBroker analysis on a watchlist containing 64 stocks, how many threads will be used by AmiBroker?
He has written 64 threads here, and Win 10/11 will report a logical processor count of 64 in your example.
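The arithmetic can be checked with a few lines of Python. The helper name `logical_processors` is made up for the illustration; `os.cpu_count()` is the standard-library call that returns the number of logical processors the operating system reports on the machine running the script.

```python
import os

def logical_processors(sockets, cores_per_socket, threads_per_core):
    # Hardware threads = sockets x cores per socket x threads per core
    return sockets * cores_per_socket * threads_per_core

print(logical_processors(2, 16, 2))  # the example from the question: 64
print(os.cpu_count())                # whatever the OS reports on this machine
```

This only tells you how many hardware threads the OS can schedule onto. How many of them an application actually uses is up to the application.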
I wanted to ask you: is it possible to use the existing NVIDIA graphics card that is installed in the PC? In Task Manager the NVIDIA GeForce shows up as 0% used when AmiBroker is running, while my CPU is at 100%. Can you not divert some of the processing to the GPU? Why let it lie idle when we can use it? It would be like having a multiprocessor setup.
No. Graphics cards are not suited for non-massively-parallel problems.
To utilize the hardware architecture of a graphics card you need to have a problem that is massively parallel (like processing tens of millions of pixels, each independently, without any serial dependence).
Trading systems are typically serial by nature (i.e. a decision made "now" depends on a decision made "before").
Also, graphics cards don't use the normal programming paradigm. They use "shaders", which are small specialised programs that do small tasks but in millions of parallel copies. You have to code specifically for this completely different programming model.
The difference is somewhat like that between a human brain (CPU) and the millions of little brains of an ant colony (GPU).
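A small Python sketch of the two kinds of problem. Both functions are hypothetical illustrations: `brighten` stands for the per-pixel case and `positions` for a toy trading rule. Neither reflects how AmiBroker or any GPU API works internally.

```python
def brighten(pixels, amount):
    # Each output pixel depends only on its own input, so the pixels
    # could be processed in any order, or all at the same time.
    return [min(255, p + amount) for p in pixels]

def positions(signals):
    # Whether we are in the market after a bar depends on whether we were
    # in the market after the previous bar, so bars must be processed in order.
    in_market = False
    result = []
    for signal in signals:
        if signal == "buy" and not in_market:
            in_market = True
        elif signal == "sell" and in_market:
            in_market = False
        result.append(in_market)
    return result

print(brighten([10, 200, 250], 20))
print(positions(["buy", "buy", "sell", "sell", "buy"]))
```

In `brighten` there is nothing linking one element to the next. In `positions` the variable `in_market` carries state from bar to bar, which is the serial dependence mentioned above.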
I understand, Tomasz, that you had mentioned it earlier. But although it may not be suited for it, since most PCs have one, can we not send some of the load, like 10%, to it? My point is: why let it lie there unused if we can use it? I paid $350 for the NVIDIA GeForce card and it is just rotting away. I don't play games and I don't use high-end graphics, so if there is a way to utilise it then we should explore that.
It is not just "send some of the load". It is like totally rewriting everything. It is like wanting to teach an ant colony to play chess. It is a different universe.