Shuffle an array

Hello!
I'm trying to do Monte Carlo methods manually. I can not find any function to shuffle an array. What is the best way to do it?
Thank you

The function below shuffles ALL elements of input array in random order (note that it may pick the same element more than once)

function ShuffleArray( input )
{
   rndidx = floor( mtRandomA() * BarCount );
   
   for( i = 0; i < BarCount; i++ )
   {
		output[ i ] = input[ rndidx[ i ] ];
   }
   
   return output;
}
8 Likes

Nice clever code :+1:

Can we come up with another variant that can have BarCount iterations over each index of the input array but in this algorithm, we will swap the value at current index i with another index value which is rndidx or the random index.

Will need to check that the index is within bounds I guess for each iteration or generate the rndidx such that it is less than BarCount.

Let me give the code a try :slight_smile:

Thank you, @Tomaz.
Implemented the Fisher-Yates algorithm to shuffle an array.Fisher-Yates.
So you can not choose the same item more than once.
The array has a custom measure.
I would like to know if there is a simpler way to declare a 1D array.

// --------------------------------------------------------------------------//
// -                              Fisher-Yates                              -//
// --------------------------------------------------------------------------//

// Implementation Fisher-Yates

function SuffleArrayFisherYates( input )

{
  i = MxGetSize( input, 0 ) - 1;
  j = 0;
  temp = 0;

  while( i > 0 )
  {
    j = floor( mtRandom() * ( i ));
    temp = input[j][0];
    input[j][0] = input[i][0];
    input[i][0] = temp;

    i--;
  }

  return input;
}
array = Matrix (10,1);
array[0][0] = 0;
array[1][0] = 1;
array[2][0] = 2;
array[3][0] = 3;
array[4][0] = 4;
array[5][0] = 5;
array[6][0] = 6;
array[7][0] = 7;
array[8][0] = 8;
array[9][0] = 9;


rndArray = SuffleArrayFisherYates( array );

Thanks again!

input = C;
tmp = 0;

   for( i = 0; i < BarCount; i++ )
   {
	j = floor( mtRandom() * (i));
	tmp = input[j];
	input[j] = input[i];
	input[i] = tmp;
   }

Assuming that

mtRandom( seed = Null ) - returns single random number (scalar) in the range [0,1)

Therefore that Floor() of mtRandom() * (i) is less than BarCount

If you use BarCount you have to add 1 if not you never choose the last term.

	j = floor( mtRandom() * (i+1));

Best regards!

As I have shown in my code, you can use just single call to mtRandomA outside the loop instead of calling mtRandom for each bar inside loop. This creates ARRAY of random numbers and is much more effective than calling mtRandom inside loop.


rndind = floor( mtRandomA() * BarCount ); // one function call for entire array

for( i = 0; i < BarCount; i++ )
{
  j = rndind[ i ];
  tmp = input[ j ];
  input[ j ] = input[ i ];
  input[ i ] = tmp;
}
1 Like

Thanks for the reply @Tomaz
But what I'm trying to achieve is to shuffle arrays of a certain size, not the entire array. What is the best way to create an array in a single line without having to use the two dimensions of the matrix or the functions of moving from text to matrix. Is there no Built-in function to create 1D and slice arrays?

arraystring = "[0,1;2;3;4,5,6,7,8,9]";

array = MxFromString( arraystring );

printf( MxToString( array) );

Thanks again

Array in AFL has BarCount elements automatically. You don't need to use ALL elements. You can just use first N elements. That is the easiest way. You can also use matrices. Matrix in RAM is actually 1D because RAM is 1D.

@Jhony5 - like @Tomasz has said, it's just a matter of specifying the range.

The following builds upon @Tomasz's example, but I think there's still a slight problem with it:

function RandomiseSegmentArr(inpArray, begElem, endElem)
{
	// Range check - either do it here, or, before calling the function.
	if (endElem > BarCount)
		endElem = BarCount ;
	
	numElem = endElem - begElem + 1 ;
	
	// one function call for entire array - create once, read many.
	rndind = floor( mtRandomA() * numElem ); 
	rndInd = rndInd + begElem ;

	// This keeps the original pristine, just in case you need to reference it again.
	result = inpArray ;
	
	// Now randomise the elements
	for( i = begElem; i < endElem; i++ ) 
	{ 
		j = rndind[ i ]; 
		tmp = inpArray[ j ]; 
		result[ j ] = inpArray[ i ]; 
		result[ i ] = tmp; 
	}
	
	// Emit the shuffle.
	return result ;
}


initArray = Cum(1) ;

startElem = 10 ;
endElem = 19 ;

rndArray = RandomiseSegmentArr(initArray, startElem, endElem) ;

Plot(initArray, "initArray", ParamColor( "initArray Color", colorCycle ), ParamStyle("initArray Style", styleLine) );
Plot(rndArray, "rndArray", ParamColor( "rndArray Color", colorCycle ), ParamStyle("rndArray Style", styleLine) );


function RandomiseSegmentMtx(initArray, startElem, endElem)
{
	// similar code to above, but specific to matricies.
	return Null ;
}

numElem = endElem - begElem + 1 ;
	
// one function call for entire array - create once, read many.
rndind = floor( mtRandomA() * numElem ); 
rndInd = rndInd + begElem ;

I don't get what you are trying to achieve in the above code but I don't think all this is required.

All arrays are of size BarCount anyway.

If you just create an array of Random Values with
rndind = floor( mtRandomA() * BarCount );
then this part j = rndind[ i ];
will just select the corresponding random number at that index and use it for shuffling.

Hi @travick,

I’ll try to address your query, and provide some enhanced code at the end to illustrate it.

After submitting the post, something was nagging at me, and didn’t seem right, so I had another go at it, and I think I now understanding what’s going on, and why. There are a couple of coding mistakes also.

Firstly, a few posts ago, @Johny5 says:

, which I interpreted as a “segment”/ subset of the array. For example:

  1. divide a brand new deck of cards into 3 smaller decks of 1/3 each, and keep them separate from one another.
  2. shuffle only the 2nd small deck. Do not shuffle the other two small decks (1 and 3).
  3. Place the (now shuffled) 2nd deck on top of the 3rd small deck, and then the 1st small deck on top of the 2nd.

To do that, we need to restrict the range of bars that the for loop actually loops over, hence, begElem and endElem in the function arguments, which give us numElem. Then, use numElem to generate the random index, which means that the range of values generated are restricted to values between 0 and numElem:


rndind = floor( mtRandomA() * numElem ); // only values in the range 0..numElem are generated.

That’s Ok, but those numbers aren’t necessarily within the range of index numbers of the elements we want shuffled. So, we need to “shift” them into the range we want, by:


rndInd = rndInd + begElem ; // now we have pointers that only “point” to other elements _ *within* _ the range we want shuffled.

If we simply did rndind = floor( mtRandomA() * BarCount );, we’d get (some) pointers to elements that are outside of the range we want shuffled, ie. some of the elements in the specified range would be shuffled to other elements _ within _ the range (appropriate), and some of them _ outside _ the range (not appropriate).

However, there’s a conundrum, and it’s to do with the random numbers. Despite doing all of the above, we’re more likely to get repeated index numbers, eg. [1,0,2,4,1], than we are to get non-repeated numbers, eg. [0,3,4,1,2]. Logically, and this is what was bugging me, I expected the later to be the case, ie. all element numbers within the range to be represented. But, more times than not, it _ didn’t _ happen.

So, what’s going on? Then it dawned on me – something that @Tomasz said above:

, but why?

I think the answer lies in the numbers generated by the Mersene Twister (or any other) algorithm - technically they’re "different", but the difference between them is sometimes so small, that when we floor(rndNum * numElem) it, we end up with integers that are the same.

Does this really matter? Well, …, no, it doesn’t, so long as you don't mind some elements being shuffled more than once. Just to be sure, I did a paper-exercise with the following:

initArray: 
idx: [0, 1, 2, 3, 4] 
val: [A, B, C, D, E] 

rndIdxArray:
idx: [0, 1, 2, 3, 4]
rndIdx: [1, 0, 2, 4, 1]

result:
idx: [0, 1, 2, 3, 4]
val: [D, B, C, E, A]

, we end up with more shuffling than anticipated, more randomisation!

For example, if we set firstElem = 0 ; and lastElem = 4 ; in the main body of the script (see new code below), before calling the function, we can see the actual values of the first 5 elements of the array, at each step of debugging.

And, this is the first coding error:

result[ j ] = inpArray[ i ]; // This doesn't permit multi-shuffling, because we keep taking from the _original_.

, it should have been:

result[ j ] = result[ i ]; // This enables some elements to be shuffled many times, within the same segment.

The second coding error was to use the variable endElem twice, potentially leading to confusion and, bugs: declared as a (implicit) global variable in the body of the script, and as an (implicit local) argument to the function. I should have checked.

Anyway, here’s the revised code. If you put a breakpoint on lastElem = 4 ;, you can step through the code in debug mode, and check the variables after each step:

function RandomiseSegmentArr(inpArray, begElem, endElem)
{
	// Range check - either do it here, or, before calling the function.
	if (endElem > BarCount)
		endElem = BarCount ;
	
	numElem = endElem - begElem + 1 ;
	
	// one function call for entire array - create once, read many.
		// Note: the numbers generated by the Mersene Twister algorithm may be technically "different", 
		// but they're so close together, that when we ```floor(rndNum * numElem)```, we end up with some integers
		// that are the same.
		//	This is Ok, so long as we don't mind shuffling some elements multiple times - just need to 
		// follow the index numbers sequentially in order to verify this.
	//rndind = floor( mtRandomA() * BarCount ); 	// generates numbers outside the range we want.
	
	rndind = floor( mtRandomA() * numElem ) ; 	// only values in the range 0..numElem are generated.
	rndInd = rndInd + begElem ;		// "shifts" the number from 0..numElem range, into the begElem..endElem range.

	// This keeps the original pristine, just in case it needs to referenced again.
	result = inpArray ;
	
	// Now randomise the elements within the specified range
	for( i = begElem; i <= endElem; i++ ) 
	{ 
		j = rndind[ i ]; 
		tmp = inpArray[ j ]; 
		//result[ j ] = inpArray[ i ]; 	// This doesn't permit multi-shuffling, because we keep taking from the original!
		result[ j ] = result[ i ]; 	// This enables some elements to be shuffled many times, within the same segment.
		result[ i ] = tmp; 
	}
	
	// Emit the shuffle.
	return result ;
}


initArray = Cum(1) - 1 ;	// added the "- 1" to sync zero-based

firstElem = 0 ;
// endElem = 4 ;	// This may cause confusion/ bugs, because "endElem" (global) is also an argument to the function (local).
lastElem = 4 ;

rndArray = RandomiseSegmentArr(initArray, firstElem, lastElem) ;

Plot(initArray, "initArray", ParamColor( "initArray Color", colorCycle ), ParamStyle("initArray Style", styleLine) );
Plot(rndArray, "rndArray", ParamColor( "rndArray Color", colorCycle ), ParamStyle("rndArray Style", styleLine) );


function RandomiseSegmentMtx(initArray, startElem, endElem)
{
	// similar code to above, but specific to matricies.
	return Null ;
}

2 Likes

"The generation of random numbers is too important to be left to chance."
Robert R. Coveyou

A small addition is needed to make the above code work properly, under all conditions - add:

SetBarsRequired(sbrAll, sbrAll) ; // this is required for the data to show-up onscreen.

, before calling the function, otherwise, AB will limit the number of quotes it supplies, causing an Error 10: Array subscript out of range. And, logically, the function may need most of the array to work with.

Thanks for this @Jhony5,

I've tried to find a digital copy of the article by Coveyou, but EBSCO Host and Wiley only go back to 1998, and the article appeared sometime earlier. I'll keep looking.