Multi-CPU Thread Parallelism

I'm wondering if the thread parallelism I'm seeing is the limit of what is possible. I've read and applied all relevant articles, and addressed the single-thread limitation of Norgate Data. My test case optimization looks like this:
[Screenshots: Capture, Capture3]
On this dual-CPU machine, note that all 12 threads on Node1 (the second CPU) are used, while the 12 threads on Node0 remain idle. When I disable affinity for all threads on Node1, the AmiBroker workload moves to Node0. I never see busy threads on both nodes at the same time.
[Screenshot: Capture2]

In the BIOS, turning off NUMA and reverting to SMP puts perhaps 15% of the workload on the first 12 threads and 75% on the second 12 threads. Run times are slightly worse.

[Screenshots: 2021-04-15 102511, 2021-04-15 122755]

Is AmiBroker compiled to take advantage of multiple CPUs?
Is the use of static vars binding the workload to a single CPU?
Does anyone have code I can run that is known to use multiple CPUs?


88% (244 sec out of 277 sec) of time in your optimization is spent waiting for DATA PLUGIN, and not actual optimization or formula execution.
AFL execution is perfectly parallel (it takes only 12% of time).

As explained here:

Amdahl’s law says that if 95% of your program runs in multiple threads and only 5% of it is serial (single-threaded), the maximum achievable speedup regardless of how many CPUs and how many cores you have is 20x (20 times).
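To make the arithmetic concrete, here is a minimal Python sketch of Amdahl's law. The function names are made up for illustration. The first case reproduces the 95%/5% example quoted above. The second treats the 244 seconds of plugin wait as purely serial and the remaining 33 seconds as perfectly parallel, which is a simplifying assumption and not a measurement of how AmiBroker schedules work.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup over a single worker when only part of the job is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Quoted example: 95% parallel, 5% serial gives a ceiling of 20x.
    print(f"95% parallel, limit: {amdahl_limit(0.95):.1f}x")

    # Assumption for illustration: 244 s of the 277 s run is serial plugin wait.
    total_sec = 277.0
    serial_sec = 244.0
    parallel_fraction = (total_sec - serial_sec) / total_sec

    for workers in (12, 24):
        speedup = amdahl_speedup(parallel_fraction, workers)
        print(f"{workers} threads: {speedup:.3f}x")
    print(f"limit: {amdahl_limit(parallel_fraction):.3f}x")
```

Under that assumption, going from 12 to 24 threads changes the result by well under 1%, so spreading the work over the second CPU would not noticeably shorten the run while the data plugin dominates.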

Don't use plugins when you count on speed. Disconnect from the external source and use local data only. Make sure your "in memory cache" is large enough to keep ALL symbols in RAM. Also, don't use Norgate functions, as they are slow.

A thread on the very same subject already exists; please don't post duplicates. Continue in the existing thread:
