Correlation Matrix

Hi,

I have modified the code as described here: https://www.amibroker.com/kb/2014/12/04/how-to-display-correlation-between-symbols/ to display correlation of prices or returns.


// How to display correlation between symbols
//-------------------------------------------

/// @link https://www.amibroker.com/kb/2014/12/04/how-to-display-correlation-between-symbols/


// To use the formula we need to do the following: 
// (1) select Tools -> Send to Analysis menu
// (2) in the Analysis window, Apply to: Filter
//     (in Include tab hit Clear and choose the CORRECT watchlist)
// (3) CHOOSE YOUR PARAMETER SETTINGS
// (4) select Range: 1 recent bar(s) 
// (5) make sure you clicked on the CORRECT watchlist on the LHS
// (6) press Explore button



//===============================
// CHOOSE

// Watchlist Number
WLN	= Param( "WLN#", 63, 1, 1000, 1);

// Correlation Period
range = Param( "Periods", 757, 0, 2000, 1);


  
//===============================     
// read the list of symbols from Watchlist WLN
symlist = CategoryGetSymbols( categoryWatchlist, WLN );

// display only last bar from the Analysis range
Filter = Status( "lastbarinrange" );



SelectedIndicator = ParamList( "Prices_OR_Returns", "Prices,Returns", 1 );

// iterate through symbols
for ( i = 0; ( sym = StrExtract( symlist, i ) ) != ""; i++ )
{


	switch ( SelectedIndicator )
	{

			case "Prices":
			//===============================
			// Correlation of PRICES
			Corr = Correlation( C, Foreign( sym, "C" ), range );
			break;


			case "Returns":
			//===============================
			// Correlation of RETURNS
			Var1 = Foreign( sym, "C" );
			Var1ret = ROC( Var1, 1 );

			// calculate correlation RETURNS over range bars
			Corr = Correlation( ROC( C, 1 ), Var1ret, range );
			break;

	}
	

    //=============================== 
    // set color dynamically based on correlation values 
    // and display the output in exploration column
    Clr = 32 + SelectedValue( Corr ) * 32;
    AddColumn( Corr, sym, 1.2, 
               ColorHSB( 128 + Clr, 255, 255 ), 
               ColorHSB( Clr, 255, 255 ) );
               
}



SetSortColumns( 1 );     

I have 3 questions:

  1. How do I sort the matrix so that there are 1's on the leading diagonal ?
  2. The question of do I use prices or returns for analysis? There's an interesting discussion: http://www.portfolioprobe.com/2011/01/12/the-number-1-novice-quant-mistake/ , however I would be very interested to hear views on which to use and when and why?
  3. In the above code how do I show significance of correlation coefficients? Ideally a highlight of values or a star next to them or some other away?
    The t-stat is:
    t-stat = (r * sqrt( n-2) ) / sqrt ( 1 -( r^2) ), where
    r = correlation coefficient
    n = sample size
    For large degrees of freedom the t-tends to the normal, so values of +/-1.96, apply for 95% significance.
    Also, there's a p-value https://onlinecourses.science.psu.edu/stat501/node/259

Thanks.

1 Like

Anybody have any answers?

Thanks,
Amarjit

With regards to 1 - original code (from the official Knowledge Base) has 1's on diagonal. If you keep symbols in rows and columns the same (with same order across rows and columns), you will get that diagonal automatically because diagonal would represent same symbols, which means correlation = 1.

2 & 3 are not really AmiBroker questions.

3 Likes

I don't know the mathematically correct answers but have seen discussion in favor of using Log(today's Close/ yesterday's close) as a valid method of comparing correlations.

As for your question about highlighting values that interest you, with AmiBroker you can change the colors of either the font or the background of the cell's.

An example, you can use your choice of statistical testing but here is some code example that uses a Corr of the Log's greater than 0.6 to color the cell green.

// choose your WL in Analysis window
// I've added the log(Close/yesterdays Close) which seems more mathematically proper

CorrLB = Param("Lookback", 252, 50, 300, 1); // daily correlation look-back period//

wlnumber = GetOption( "FilterIncludeWatchlist" ); // choose your WL in Analysis window
symlist = GetCategorySymbols( categoryWatchlist, wlnumber );

// display only last bar from the Analysis range
Filter = Status( "lastbarinrange" );
SetSortColumns(1); // I added this to keep the columns and rows in order
Tickers = Name();

// iterate through symbols
for( i = 0; ( sym = StrExtract( symlist, i ) ) != ""; i++ )
{
    
    Corr = Correlation( log( C / Ref( C, -1 ) ), log( Foreign( sym, "C" ) / Ref( Foreign( sym, "C" ), -1 ) ), CorrLB );

    // set color dynamically based on correlation values
    // and display the output in exploration column
    Clr = 32 + SelectedValue( Corr ) * 32;
    Clr = IIf( Tickers == sym, colorBlack, Clr );
     AddColumn( Corr, sym, 1.3, IIf( Tickers == sym, colorWhite, IIf(Corr>0.6,colorWhite, colorDefault )), 
		IIf( Tickers == sym, colorDarkBlue, IIf(Corr>0.6, colorGreen, colorDefault) ), width = 60 );
}
   

//     AddColumn( Corr, sym, 1.3, IIf( Tickers == sym, colorWhite, ColorHSB( 128 + Clr, 255, 255 ) ), IIf( Tickers == sym, colorDarkBlue, ColorHSB( Clr, 255, 255 ) ), width = 60 );

Produces the output in the picture below on the DJIA.

Good luck.

8 Likes

@portfoliobuilder, please don't understand the following as rant but as hint(s) for your own and others' benefit using your code example or writing next (different) code next time.

Rather move things such as functions/arrays/ loop invariant code that have no place within loop to outside of the loop. So in your case rather move

log( C / Ref( C, -1 ) )

outside of loop and store to variable:

logC = log( C / Ref( C, -1 ) );

logC variable is then used within the loop.

Next one, calling arrays/array functions multiple times within loop is not a good idea, even more so if dealing with Foreign. So storing multiple calls to variable is another better idea (this refers to any multiple calls, not just foreign ones).

So better one

Frg = Foreign( sym, "C" ); // foreign just called one time
FrgLog = log( Frg / Ref( Frg, -1 ) );

In summary, a more optimal code would be a following one below. So if originally the code took 60 seconds to complete it now runs at 30 to 40 seconds (around 30% here). Massive time gain if you hate waiting.

CorrLB = Param("Lookback", 252, 50, 300, 1); // daily correlation look-back period//

wlnumber = GetOption( "FilterIncludeWatchlist" ); // choose your WL in Analysis window
symlist = GetCategorySymbols( categoryWatchlist, wlnumber );

// display only last bar from the Analysis range
Filter = Status( "lastbarinrange" );
SetSortColumns(1); // I added this to keep the columns and rows in order
Tickers = Name();

// iterate through symbols
logC = log( C / Ref( C, -1 ) );// does not belong within loop
for( i = 0; ( sym = StrExtract( symlist, i ) ) != ""; i++ )
{
    Frg = Foreign( sym, "C" ); // store Foreign to variable since it is called multiple times
    FrgLog = log( Frg / Ref( Frg, -1 ) );
    Corr = Correlation( logC, FrgLog, CorrLB );

    // set color dynamically based on correlation values
    // and display the output in exploration column
    corrcond = Corr>0.6;// used several times
    tickercond = Tickers == sym;// used several times
    Clr = 32 + SelectedValue( Corr ) * 32;
    Clr = IIf( tickercond, colorBlack, Clr );
    fntcolor = IIf( tickercond, colorWhite, IIf(corrcond,colorWhite, -1 ));
    cellcolor = IIf( tickercond, colorDarkBlue, IIf(corrcond, colorGreen, -1) );
    AddColumn( Corr, sym, 1.3, fntcolor, cellcolor, width = 60 );
}  

@amarjit,

As for 1. if you don't get proper diagonal then the reason is that column order depends on symbols order within Watchlist. So if symbols in watchlist are not sorted alphabetically like in this sample case (See EURJPY being at the end).

then you will get a diagonal like this (see EURJPY column being at the end too).

So what to do? Easy. Just go to Symbol window and there to your watchlist, right click it and choose "Sort alphabetically"

After that one is done re-do exploration and you will get proper diagonal since columns are sorted now too.

14 Likes

Thanks for the tips, I really appreciate any programming help. As you may know I don’t have a programming background so techniques that improve efficiency and reduce opportunities for errors are much appreciated.

I’ll try to learn and incorporate those ideas into future codes.

And please do not hesitate to offer constructive criticism to me anytime. I think you are a great asset to our AmiBroker community and we just have to keep some of our favourite Sopranos quotes on our private emails and not on the public forum :grinning:

7 Likes

Many thanks to everyone!

1 Like

Hi,

I am using the correlation matrix code posted above, amarjit's code, on a watchlist of 600 plus securities. The code runs correctly in Explorer and produces the matrix. When I select all and copy-paste to excel, all rows are produced correctly except for the first row which seems to abruptly stop and end with ",,,,".

Thanks,

Mike

imageMCDFX

MCEMX	MCHFX	MCHI	MDIIX	MDY	MEASX	MECFX	MELIX	MEOIX
0.742427	0.910101	0.808445	0.9211	0.688269	0.513533	...			
0.587688	0.768152	0.673008	0.748231	0.899174	0.791314	0.286097	0.916097	0.632079	0.577439
0.782474	0.77336	0.825344	0.815789	0.483168	0.350167	0.320215	0.556474	0.643448	0.676263
0.371448	0.529167	0.417871	0.474861	0.848407	0.434829	0.276045	0.718584	0.553492	0.426029
0.513086	0.680173	0.598885	0.636018	0.820376	0.8121	0.287302	0.884252	0.579213	0.548123

If all other rows are correct, can I run the script twice on the first symbol so it is duped and then I could delete the first row or can a parameter be changed to allow the first row to be produced and copied to Excel? Also, I tried pasting the matrix directly into a text file and the first row has the same issue.

Thanks,

Mike

@reds sorry but it works fine for me. I used Edit --> Copy and then pasted into Excel spreadsheet on the SP500 and all 505 stocks appear.
image

And top right hand corner of spreadsheet looks like this
image

thanks for taking a look! I have tried several times and always with the first line truncated but all other rows ok. I have also tried shorter watchlists and it all rows were produced. Maybe something with my data? I will check further. Thanks again!

Mike

portfoliobuilder,

Could your script be modified to produce a table that lists the primary symbols in column A and then the top 3 or 4 correlated symbols(not correlation coefficient), listed by correlation score, in columns in B,C,D,E…?

Thanks,

Mike

@reds i think the current script would not be of much use for what you are asking for. I believe you probably need to rank the correlations ( StaticVarGenarateRanks), and then for that output that you are looking for perhaps using AddMultiTextColumn would produce the result. I have not attempted that as it is a big job and I am too busy. But if you are able then please come back and share with us here on the forum. Good luck.

Thank you Larry for the suggestions. I will post back if I am able to find a solution but it may be out of my limited abilities…Thanks…Mike

@reds, this can be done by storing the corr. table to matrix at the end of exploration (using matrix functions). The sorted top N get picked from that matrix and output via Addrow.

It is a bit tricky (but not difficult). I'm saying tricky because you have to think a little bit.

Here is proof that it is possible to do that (output of sorted table (from largest to lowest together with standard correlation table below of it). As aside Addrow does not offer cell coloring (yet). I have output all but of course you can pick just a top N via MxGetBlock also.
230136

Here is another example just outputting overall top five pairs + corr.

174133

So as always endless possibilities of output.


BTW, instead of using Foreign for default correlation table it is much much faster using StaticVarAdd to store second array of Correlation function. In fact that method is lightning fast compared to using Foreign. Milliseconds per each exploration run.

3 Likes

@reds here is a possible way to do it (a bit unorthodox approach):

/****** Constants *****/
							// Change these as needed
MAX_ALLOWED_TICKERS = 1000; // Over this size probably it is too slow...
MAX_PAIRS_IN_TABLE = 250;   // To avoid using too many columns and hit a Windows limit

/***** Parameters *****/

// Number of the watchlist to use to create the correlation matrix
wlNumber = Param("Watchlist Number", 1, 0, 63, 1);         
// Number of columns pair to display in resulting table
maxColRes = Param("Number of (pair) columns in results (0 = max)", 5, 0, MAX_PAIRS_IN_TABLE, 1);  
// Daily correlation look-back period
corrLB = Param("Daily correlation lookback period", 252, 50, 300, 1);         
// Send trace messages for debugging purpose on progess
debugTrace = ParamToggle("Enable _TRACE()", "No|Yes", 1); 


/***** Functions *****/

function _TRACE_(msg) {
    if (debugTrace) 
		_TRACE(msg);
}		


/***** Exploration *****/

actionStatus = Status("action");
if (( actionStatus == actionExplore ) OR (actionStatus == actionBacktest)) {
	symlist = GetCategorySymbols( categoryWatchlist, wlNumber);
	Filter = 0;
	stockNum = Status("stocknum");
	if (stockNum == 0)
	{
		// Do this only ONCE
		if (symList != "") {
			EnableTextOutput( 0 ); 
			size = 1 + StrCount( symlist, ",");
			if (maxColRes >= size) {
				maxColRes = size-1;
			}
			if (maxColRes <= 0) {
				maxColRes = size-1;
			}
			if (maxColRes >= MAX_PAIRS_IN_TABLE) {
				maxColRes = MAX_PAIRS_IN_TABLE; // 
			}
			
			_TRACE_("Selected watchlist: " + WriteVal(wlNumber, 2.0) + " - Size: " + WriteVal(size, 3.0));
			_TRACE_("Symbols: " + symList);
			if (size < MAX_ALLOWED_TICKERS+1) {
				Filter = 0;
				SetOption("NoDefaultColumns", True );	
				AddColumn(Null, "Ticker", 1.2, -1, -1, 60);
				for (i = 1; i <= maxColRes; i++) {
					AddColumn(Null, "T" + WriteVal(i, 1.0), 1.2, -1, -1, 60);
					AddColumn(Null, "C" + WriteVal(i, 1.0), 1.2, -1, -1, 60);	
				}
				
				// Get Tickers and do calculations ONCE 
				_TRACE_("Parsing tickers");				
				for( i = 0; ( sym = StrExtract( symlist, i ) ) != ""; i++ ) 
				{
					VarSetText("T_" + i, sym);
					CFrgn = Foreign( sym, "C" );
					logCFrgn = log( CFrgn / Ref( CFrgn, -1 ) );
					VarSet("L_" + i, logCFrgn);					
				}	
				_TRACE_("Parsing tickers. Done");
				// Add top row used for ticker index 
				mx = Matrix(size+1, size, 0);		
				_TRACE_("Created matrix. " + WriteVal(size+1, 3.0) + "*" +  WriteVal(size, 3.0));						
				
				_TRACE_("Filling matrix with correlations.... ");
				// First row is reserved to ticker indexes
				for( col = 0; col < size; col++ )
				   mx[0][col] = col;
				
				// Filling a matrix with correlation values
				startCol = 0;
				for( row = 0; row < size; row++ ) 
				{
					symRow = VarGetText("T_" + row);
					// CRow = Foreign( symRow, "C" ); 
					// logCRow = log( CRow / Ref( CRow, -1 ) ); 
					if (row % 10 == 0) 
						_TRACE_("Row " + WriteVal(row, 3.0));
					for( col = startCol; col < size; col++ )
					{
						symCol = VarGetText("T_" + col);
						// _TRACE_("R " + WriteVal(row, 2.0) + " C " + WriteVal(col, 3.0) + " " + symRow + " * " + symCol);
						// CCol = Foreign( symCol, "C" ); // store Foreign to variable since it is called multiple times
						// LogCCol = log( CCol / Ref( CCol, -1 ) ); 
						// corrARray = Correlation( LogCRow, LogCCol, corrLB );
						// corr = LastValue(corrArray);
						corr = LastValue(Correlation( VarGet("L_" + row), VarGet("L_" + col), corrLB ));
						// corr is a columm 
						mx[col+1][row] = corr;
						// Filling the symmetrical cell
						mx[row+1][col] = corr;
					}
					startCol += 1; // reducing number of cells to fill thanks to to this matrix symmetry
				}    

				_TRACE_("Matrix done. Sorting - Ignoring header");

				
				_TRACE_("Coupling elements");		
				// Print result in the Analisys window using AddRow (to output too if needed)
				for( row = 0; row < size; row++ ) {

					// we need to transpose the matrix to sort by columns - then restore
					// each iteration we sort on a different row for each ticker (rows headers)
					// and row 0 will have the orders of correlated tickers (columns headers)
					mx2 = MxTranspose(mx); 
					mx3 = MxSortRows(mx2, False, row+1); // sort descending
					mx2 = MxTranspose(mx3);  
					symRow = VarGetText("T_" + row);				
					s = symRow + "\t";
					// Skip col 0 since it is 1.0 for the ticker own correlation.... 
					for( col = 1; col <= maxColRes; col++ )
					{
						symIdx = mx2[0][col]; // get the index from row 0
						symCol = VarGetText("T_" + symIdx);				
						corr = mx2[row+1][col];
						s = s + symCol + "\t" + WriteVal( corr, 3.3 ) + "\t";
					}		
					AddRow(s);
				}
				_TRACE_("Coupling elements/Add rows done");	
			} else {
				_TRACE_("Too many tickers. Skipping");	
			}
		} else {
			_TRACE_("Empty watchlist. Skipping");	
		}
	} 
	_TRACE_("Script completed for stock " + WriteVal(stockNum, 4.0));					
} else {
	_TRACE_("Script ignored. Status " + WriteVal(actionStatus, 1.0));					
}

w = 0; // used to set a breakpoint when using the debugger

I too used matrices and addRow().

(In this sample I applied the correlation logic used by @portfoliobuilder - using only the last value of the correlation arrays; you may want to change it as needed).

I use a matrix to store the correlations and then sort it multiple times; one for each ticker row - using the first row of my original unsorted matrix as an index for the tickers to display in the correlation table.

Seems to work well enough for medium/sized watchlists (I used a max of 500 tickers)

To display the resulting data in the Analysis window I used the AddRow() function with the filter set to 0 (search here in the forum other examples about its usage). Unfortunately as said also in the previous message no colors…

For the exploration, you have to choose a watchlist (selecting its corresponding number as a parameter).
In order to get the table a little faster, I suggest setting the analysis filter to “current” (it is actually ignored but the process will be done only once).

This sample raw code (where I left on purpose many commented lines to show how some code was developed/moved/replaced) is provided “as it is” (it is not fully tested); take it mainly as a starting point for your own further development.

(All suggestions to do it in a better way are welcome!)

9 Likes

P.S. Note that all the relevant code happens when the Status(“stocknum”) == 0 so it is run in a SINGLE THREAD (I did it on purpose to be able to easily follow the code logic tracing/debugging).

1 Like

Portfoliobuilder, fxshrat, and beppe…Thank you for all you recommendations and direction! I will consider all and move forward…Happy New Year! Best…Mike

Two questions about this matrix:

(1) Why are the correlation values on both sides of the diagonal sometimes different for the same pair of symbols? Which correlation is then the good one?

(2) If we want to do a correlation analysis using the weekly or monthly returns instead of the daily returns, would changing the "periodicity" in the settings window be the right way to do it? (Assuming that we also change the AFL number of bars lookback also obviously)

Thanks,

Oscar

How do i get Full Name of the ticker in the above matrix
i'm able to get in vertical lines but not on horizintal lines