Precision Problem with Setting GICS Code in AFL

Hi,

I use a database with EOD data for stocks from many countries. For my system development I need the GICS code of the stock which is unfortunately not delivered by the data provider. However I have a CSV file with a mapping from ISIN code to GICS. So I wrote a small AFL exploration which reads that file and sets the GICS code with AmiBroker functions. When I do an AFL check on that code I get a warning:

Warning 506. You have specified precision that exceeds IEEE standard. Numbers are only accurate upto the 7th significant digit.

And indeed there are a lot of stocks which do not receive a GICS code because of that precision problem. I use AFL function StrToNum to convert the ASCII GICS code from the CSV file to a number which in AmiBroker in general is a single precision float. I use function CategoryAddSymbol to set the GICS code wich itself needs a number for the code and not a string. The GICS code is 8 digits decimal numeric, so a single precision float with 7 digits will loose the precision of the 8th digit and thus the wrong converted GICS code will not be found in AmiBroker GICS file and therefore not assigned to the stock.

So my question is: How can I set GICS values with 8 digit precision?

Example Code:

file = fopen(gicsFile, "r"); 
if(file) { 
	i = 0;
	while(!feof(file)) {
		line = fgets(file);
		if(i++ == 0)				// Skip Header
			continue;
		isin = StrExtract(line, 0, ';');
		gicsText1 = StrExtract(line, 1, ';');
		gicsFloat = StrToNum(gicsText1);
		gicsText2 = NumToStr(gicsFloat);

		_TRACE("FileProcessing: ISIN = " + isin + "   gicsText1 = " + gicsText1 + "   gicsFloat = " + gicsFloat + "   gicsText2 = " + gicsText2);

		CategoryAddSymbol(Name(), categoryGICS, gics);
	}

	fclose(file);
}

For example, I get

FileProcessing: ISIN = US7611521078   gicsText1 = 35101010    gicsFloat = 3.510101e+07   gicsText2 = 35 101 008.000

So GICS code 35101010 is shifted to 35101008 because of the precision loss.

@paxromana,

You need to get your "Verified Bade" to post questions. Search it and find out how.

Also, Search and you will find some other threads on precision.

Thanks! I added "Verified License Badge". I also did a search on "precision" and on "GICS" but did not found a solution to the problem.

@paxromana, Here is a link that you may need to review....

http://www.amibroker.com/kb/2010/07/20/about-floating-point-arithmetic/

Essentially you have a "Number" that is too big. I see a couple of way to deal with this...

  1. Break the GICS code down into sections, each with much smaller numbers
  2. Use the GICS code as TEXT, and use text manipulation to determine what you want.

Hope this helps.

@snoopy.pa30 Thank you for your answer!

Regarding 1.:
I understand the limitations of single precision float.

GICS standard allows codes with 8 digits. E.g. "35101010" in my example is a perfect legal GICS code. All I want is to pass this 8 digits to the propriate AmiBroker AFL function.

However AmiBroker interface requires a 7 digit data type in function CategoryAddSymbol(symbol, category, number) for passing the GICS code which by nature can not fit the entire length of the 8 digit GICS code. So function CategoryAddSymbol will only work for GICS codes which are 7 digits or less. From that point of view the GICS code is indeed too big.

Regarding 2:
Yes, that's almost the question I have: I can not find a AmiBroker function which allows me to set the GICS code with a string data type which then will allow me to make usage of 8 digits GICS code. I could not find such a function in the online help when searching for GICS and could not find it here in the forum search. A function which sets the GICS code by index number would also be ok.

@paxromana I don't have a solution, but with my data provider (Norgate) I can find all 8 digits in the GICS code. So Norgate has figured something out. In AmiBroker GicsID( 0 ) returns a string

AddTextColumn( GicsID( 0 ), "Gics Code");
AddTextColumn( GicsID( 1 ), "Gics Name");

image

Interestingly in all my sector data and all GICS data I have seen, it appears as if the 8th digit is always a zero? Don't know if that is relevant or may help you edit your data down to 7 digits.

Norgate have nothing figured.
It is AmiBroker who have figured out something and who have provided documented import feature. Norgate just use technology figured out by AmiBroker.

So @paxromana,
you say that you have Gics code in CSV file. This line gicsText1 = StrExtract(line, 1, ';'); tells that Gics code is in second column.

Then please do read Ascii Import documentation. Even search in AmiBroker help comes up with #1 topic -> inserting "Import Gics" in seach -> click List topics... Et voila, "Import from ASCII file"... that's it.

14

If you have Gics code ready in file then there is import format example at upper link.

$FORMAT Ticker,FullName,GICS
$OVERWRITE 1
$SEPARATOR ,
$CONT 1
$GROUP 255
$AUTOADD 1
$NOQUOTES 1

So all you need to do is to pick the file (once it is ready) via Import Wizard (and applying correct import format).
And that's it. No AFL, no loop, no CategoryAddSymbol() and what not... but just using existing feature.

1 Like

@fxshrat Thank you very much. I was able to load 8 digit GICS code with the ASCII importer. For me it is not as convenient as an AFL function (esp. if there is a AFL function designed for that case), but hopefully this process has not to be run often. So for me this is ok. I will install an import format and import type to accellerate and simplify the usage of that process.

@portfoliobuilder I don't know how @NorgateData implemented this. But when using a data plugin it is possible to access StockInfo Object which has gics attribute. Gics attribute is of C++ datatype int and can easily store 8 digit decimal without loss of precision.
In Norgate database some tickers exits with "5" as 8th digit, e.g. BEAT which is 35102015.

@paxromana,

not as convenient? You can't be serious.
How is something that you just need to copy from and change first line (to fit your CSV columns) and then save to Formats folder is less convenient than an AFL loop that does not work (and well, requires coding effort).

Secondly... and if you want to run by automatism then besides batch there is another way using OLE import method

SetOption("RefreshWhenCompleted", 1);
Buy = Sell = Short = Cover = 0;   
if ( Status("action") == actionScan ) {      
   if ( Status( "stocknum" ) == 0 ) {
      ab = CreateObject("Broker.Application");
      // GICS.format being part of Formats folder
      ab.Import(0, "C:\\Gics.csv","GICS.format" );
   }
}

How is that not as convenient than a loop version that does not work?
Even if your loop version would work it would not be more convenient.

So I repeat myself, you can't be serious.
And I can't stop shaking my head in disbelief all weekend.

1 Like

@fxshrat I still say thank you for support. However there is no reason for me to answer in that kind of language which you choosed.

Yes, it is not as convenient. Yes I am serious. No I am not kidding.

I wanted to drill down the problem to the part of the code where the problem appears, because it would be easier to understand for the users here. That's how the example code above was written. I never said this is the way how it will run in my workflow and even more I also never said that ISIN is identical to TICKER. It is not and that is the reason for the inconvenience, because it will add some manual (= errorneous) steps to the creation of the CSV file. Errorneous steps may misalign the CSV file and misaligend CSV files could lead to a mess in the database after importing.

The example code above is only written to reproduce that error. It will not work in reality.

That's something which I did not told you because I felt it is not related to the problem and I do not want to keep others busy with unneccessary stuff. Maybe this was a misjudgement from me and I apologize for that. However a simple question would have given you that information without shaking your heads all weekend.

AFAICS, my language was fine as it is based on given information and comparisons and does not contain insult. If I said that you can't be serious then I saw a sign of hope that what you meant to say can't be true. I am passionate optimist. "Being impossible" does not exist in my vocabulary. And your import process comparison just did not make sense. So there was nothing I could do about shaking head. It's my problem I reported about.

As for questions...
It is the other way around (addressing requesters).
Please read here:

Now here is the deal... feel free to make copy of your DB and delete all symbols in that copy except for ...3,4,5,... ones. Send it to my mail address anytime together with description of CSV creation and import process (plus CSV and AFL files). Then I am certain afterwards it will be convenient.

You've triggered sportsmanlike competitive spirit within me.

Thank you for your help!

The database cannot be shared because of licensing issues which also applies to parts of the database. The trading universe is delivered by third party via an Excel sheet which contains among other columns the ISIN and the GICS code and is seperately licensed and can also not be shared.