Comparing a data .dll to AFL ASCII importer loops and CSV parsing

I've now got my system to the point where it can import ASCII files using the OLE import (I'm not aware of a native AFL function for this).

Externally, I use the Python pandas library to fill a CSV, which is renamed every 5 seconds (via ShellExecute, then a simple 'ren' command inside a .BAT), and then I delete the file.

I need to tweak some bits and pieces so that it runs smoothly. I've noticed that sometimes the file that is meant to be deleted isn't deleted; it continues to grow, I then can't rename the prior file onto it, and data stops importing. I'm sure I can work out the logistics, but I've also noticed that when I import data there is a brief pause in AmiBroker, dependent on the size of the file.

My question is: if I were to go through the process of learning how to write a .dll, is the data treated any differently when importing into AmiBroker? I don't know a huge amount about .dlls, but I'm considering learning C++ now for the purpose of being able to write one, and I'm wondering whether it is worth the investment of time.

Spandy.

Try using a RAM drive: RAM drive - Wikipedia

Thanks for that. That's a good idea.
I have started to wonder if I'm running into performance issues. I am finding that, frequently, the VM I am running (Parallels 17 on an ARM64 Mac) just does not seem to be able to handle what I want it to. I'm limited to only 6 GB of RAM, so I'm wondering if it's viable with my current setup.
I'm actually thinking of getting a new physical Windows machine, or maybe something refurbished but powerful.

Hello SpandexMan,

in my mind it is not an issue of disk I/O, as you are already using an SSD ...

  1. OLE communication based on a variable-size data import is "asymmetric". You cannot frame it to 5 seconds; it depends on your dynamic data amount and your "tweaks".

  2. OLE communication inside Windows is not real-time.

  3. You should always trigger the AB OLE data import from outside of AB; in my case I use a simple compiled VB process. This process should not prepare/fill your CSV in append mode but write it fresh. This guarantees that the CSV is not locked by another process, and you will be able to handle this exception (error code) in your code.

  4. In your Python program, wait for the next OLE import schedule only after finishing the complete write, delete, rename operation/workflow ... of your CSV.

  5. To come closer to your 5-second schedule, or to expand to ~20 seconds, split your single schedule into several cascading schedules and build up some kind of staging system with "semaphores" to track the current state of your running CSV operations (write, delete, rename ...). A rough sketch of this staging idea follows the list below.

  6. Schedule every 1 second in the first/main instance, then hold further operations until the job in the next instance is done, or wait for the specific cascading instance to finish. You store this information/status in your "semaphores" ...
    Here it is important to check the state of your CSV file (locked, renamed, still existing := not deleted ...) and, if necessary, abort the current instance. The main schedule will execute every second ... and will advance your workflow whenever the CSV file state allows it.
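
A rough Python sketch of such a file-based staging flag (the state file, state names and timing below are hypothetical illustrations, not Peter's actual code):

import os
import time

STATE_FILE = "C:\\temp\\csv_state.txt"   # hypothetical location of the "semaphore" file

def read_state():
    # A missing file means the workflow is idle and a new cycle may start.
    try:
        with open(STATE_FILE) as f:
            return f.read().strip()
    except FileNotFoundError:
        return "IDLE"

def write_state(state):
    with open(STATE_FILE, "w") as f:
        f.write(state)

# Main schedule: fires every second, but only advances the workflow
# when the previous step has reported completion (event handling, not fixed timing).
while True:
    state = read_state()
    if state == "IDLE":
        write_state("WRITING")      # the producer may now create the CSV
    elif state == "CSV_READY":
        write_state("IMPORTING")    # the producer sets CSV_READY when its write is done;
                                    # trigger the OLE import step here
    # any other state: a step is still running, so just wait for the next tick
    time.sleep(1)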

I am trying to explain that you have to move from fixed timing to dynamic event handling.
Hopefully this post gives you some ideas.

Best regards,
Peter


You should really look at the examples included in the ADK.

The point is that you should always use the right tool for the task. You use a hammer, not a violin, to hit a nail.

Therefore you should write a proper data plugin using the ADK if you want to feed real-time data.


That’s a good analogy. Perhaps I should buy a hammer.
I’ll take another look at the ADK in the new year. I need to learn a bit of C++ first.

OK, I like this concept. I had not heard of semaphores before, but I've just read about them, and what you're saying really makes sense; I think I can start to use that concept generally in my coding. I'll take a look after Christmas at how I could set up some kind of system using this, especially as webhook subscriptions deliver data at unpredictable times.
Whether I'll end up with this as a final method or run with the ADK, I don't know yet. I suspect that learning C++ and building the .dll will improve my skills in any case. For me, stability and reliability are more important than speed, within reason, and I suspect the .dll route will end up being more stable.

For what it is worth, you don't need semaphores if you implement the classic producer-consumer scheme. Just create A NEW file with Python each time you have new data - only Python would be writing ("producer"). After Python is done with writing, trigger AmiBroker to import it - only AmiBroker would be reading ("consumer"). Then delete the file. And over and over again. No need for semaphores.

Appending to files is slower and not really needed at all, as anything that is already in the AmiBroker database does NOT need to be reimported.
You should just import NEW DATA, without anything that was already imported.
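
A minimal sketch of that producer-consumer cycle (the fetch_new_quotes() helper, paths and format file below are hypothetical placeholders, not code from this thread):

import os
import win32com.client

CSV_PATH = "C:\\data\\new_quotes.csv"   # hypothetical path for the freshly written file
FORMAT_FILE = "quotes.format"           # hypothetical ASCII import format definition

def fetch_new_quotes():
    # Placeholder: return only rows that have not been imported yet.
    return ["BTCUSDT,20231201,120000,42000,42100,41900,42050,123\n"]

def run_cycle():
    rows = fetch_new_quotes()
    if not rows:
        return
    # Producer: create a brand-new file in write mode (not append).
    with open(CSV_PATH, "w") as f:
        f.writelines(rows)
    # Consumer: only after the write is finished, tell AmiBroker to import it via OLE.
    ab = win32com.client.Dispatch("Broker.Application")
    ab.Import(0, CSV_PATH, FORMAT_FILE)
    ab.RefreshAll()
    # Clean up so the next cycle starts from scratch.
    os.remove(CSV_PATH)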

In the scenario where I was doing such an approach: if I simultaneously had a separate instance of AmiBroker open and I ran an exploration, would that read the new data, or would it be better in such an approach to run the explorations from outside of AmiBroker?

To clarify my post, I used the term "semaphore" loosely; as Tomasz wrote, it is not an operating-system semaphore. I should better call it "semaphore-like": something used inside your external "producer" (in Tomasz's words) to handle/control any event-triggered communication via OLE > AB. Tomasz is right that the ADK is the way to build professional data plugins.

No, I think between yourself and Tomasz, you have clarified a really useful approach for me. What you said makes perfect sense.

Hi SpandexMan,

it is called "there are many ways up a hill", and my skill and especially my time are limited. So I use MS Windows 11 natively with Visual Studio; for several years in the past I used Parallels on my lovely MacBook Core 2 Duo ...

https://forum.amibroker.com/t/system-performance-and-downloads/23050/3

This way has been a soft landing for me, and OLE is the key to my success ... I can confirm it works like a charm with many of my "spooky sources" and research. I studied books from Howard; he would say "I am fiddling around". But finally I got everything up and running: simulation, synthetic and random data, replaying of defined breakpoints and much more ...

Just one additional hint from my side: OLE data import through an exploration will not work, because we should not run OLE loops inside AB. Exploration technology is brilliant because you can set up time-based iterations simply via a menu setting, but it cannot eliminate your problem of dynamic, non-time-based execution and the workflow processing of your own logic.

A sample of my approach ...

best regards,
Peter

I had posted my query here and hoped to get an opinion from Tomasz but maybe he missed this post.


OK, so you have cleared up my understanding very much, and I finally understand what OLE is actually for.
I was previously calling a script (.bat) which was along the lines of "Broker.exe" /runbatch etc., and then, when it came to actually importing, I was calling AFL from inside AB. I had misunderstood, thinking that because I was calling from a .bat I was calling from outside AB, but AB had already loaded.

I noticed that AB.Import takes a while if AmiBroker is open, and this was the limiting step.

However, thanks to all you guys, I am now triggering these processes from Python OUTSIDE of AmiBroker, and the import is almost instantaneous. The rate-limiting step now is pulling multiple symbols from a REST API, but I can live with that. I can pull about 720 symbols in about 15-20 seconds by running several loops in parallel. (asyncio didn't work for me in Python; from what I have read, GET and POST requests etc. are not supported.)
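
As an illustration of pulling many symbols in parallel with a thread pool (the endpoint URL and symbol list below are placeholders, not the actual API calls used here):

import requests
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "https://example.com/api/klines"   # placeholder REST endpoint
symbols = ["BTCUSDT", "ETHUSDT", "XRPUSDT"]   # in practice ~720 symbols

def fetch(symbol):
    # Blocking GET request; the thread pool runs many of these concurrently.
    resp = requests.get(BASE_URL, params={"symbol": symbol}, timeout=10)
    resp.raise_for_status()
    return symbol, resp.json()

with ThreadPoolExecutor(max_workers=20) as pool:
    for symbol, data in pool.map(fetch, symbols):
        print(symbol, len(data))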

In any case, the strategy I went with was to count the number of files (each file is parsed into a master file) and, when it reaches a certain threshold, to import, and it is working.

import win32com.client
import os
import time

CSV_DIR = "C:\\Program Files\\AmiBroker\\Brokers\\Binance\\Data\\symbolcsv\\klines\\csvFiles\\"

# Poll until all 720 per-symbol files have been written, then import once.
while True:
    num_files = len(os.listdir(CSV_DIR))
    if num_files >= 720:
        # Rename the master file so the producer can start filling a fresh one.
        try:
            os.rename(CSV_DIR + "quotes.BinHist", CSV_DIR + "quotesRen.BinHist")
        except OSError:
            pass
        print("IMPORTING DATA INTO AMIBROKER")
        AB = win32com.client.Dispatch("Broker.Application")
        AB.LoadDatabase("C:\\Program Files\\AmiBroker\\Databases\\Crypto")
        AB.Import(0, CSV_DIR + "quotesRen.BinHist", "BinHist.format")
        # Remove the imported file, refresh charts and persist the database.
        os.remove(CSV_DIR + "quotesRen.BinHist")
        AB.RefreshAll()
        AB.SaveDatabase()
        break
    else:
        time.sleep(0.5)

Thanks guys!

Spandy


On a side note, I'll share my machinations, based on my (perhaps inaccurate) AB knowledge: a potential product idea for an AB server for customers' local area networks. One instance of AB, receiving the real-time market data feed through whatever plugin, could act as a master AB server to other AB instances on other computers on the local network, and any coded projects we write could "plug right into" the designated AB server through one of its connection sockets, for both the database and the real-time feed, with the server acting as a middleman that serves both its hard-drive-stored data and its memory-resident data over RDMA.

Why? It would seem really valuable to be able to serve other running instances of AB, other software, or custom-coded projects both the historical data and the real-time data that the master AB server is connected to and storing (through a standard connection protocol script, or a lightweight client module on the other computers that handles the connection?). Even instances of Excel, for example. In this way we could have SDK plugins running on the other AB instances, because they would not already be using an SDK plugin for real-time data, but would instead be connected to the master AB acting as the AB server.

Just thinking. :open_mouth:
