Alaska Software Inc. - re : RandomInt()
Chris Palmer re : RandomInt()
on Thu, 09 Jul 2020 16:37:45 +0100
Hello

There is a problem with RandomInt() that I posted a while ago, has this 
been fixed ?
Basically, with any random number over 32735 - some numbers will never 
be returned using the RandomInt() function

The below example runs 1,000,000 draws, and writes the number of 'hits' 
for each possible number to a CSV using RandomInt(1,qHighNo)

If qHighNo is set to 32735 ( or below ) then everything is OK
If qHighno is set to 32736, then this number will never be returned ( 
every number apart from 32716 is OK )
If qHighno is set to 50000, then a pattern can be seen in the CSV file 
of numbers that are never returned ( 3, 6, 9, 12 .... )
You can also see a different pattern with 70000, 80000 etc

Does anybody have any answers ?
Thanks
Chris


--------------------------------------------------------------
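qHighNo=50000

// Send "?" and "??" output to test.csv
set alternate to "test.csv"
set alternate on

?? "Rand No,How Many Results"
?

// Build one {number, hit count} row per slot
arryN=array(0,2)
for qLoop1=1 to 1000000
     aadd(arryN,{qLoop1,0})
next

// Draw 1,000,000 random numbers and count the hits per number
for qLoop2=1 to 1000000
     sndx = randomint(1,qHighNo)
     arryN[sndx,2]=arryN[sndx,2]+1
next

// Write one CSV line per row: number, hit count
for qLoop3=1 to len(arryN)
     ? str(arryN[qLoop3,1])+","+str(arryN[qLoop3,2])
next

// Close the output file and stop redirecting
set alternate to
set alternate off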
Andreas Gehrs-Pahl
Re: re : RandomInt()
on Mon, 13 Jul 2020 07:21:41 -0400
Chris,

>There is a problem with RandomInt() that I posted a while ago, has this 
>been fixed ?

I don't know if this is an implementation or a documentation issue, but you 
might be able to create some workarounds for this or at least alleviate the 
issues, if you realize what the reason for those issues is.

>Basically, with any random number over 32735 - some numbers will never 
>be returned using the RandomInt() function

The reason for this (and the other issues) is probably that this function 
internally works with a limited number of values, 65,536 to be exact, which 
is basically a 16-bit value (a signed Int). Only 32,767 of them (the 
positive half) are available for the resulting random numbers. All other 
values are created by adding and/or subtracting (multiplying and/or 
dividing) values to distribute the results over the requested range.

>The below example runs 1,000,000 draws, and writes the number of 'hits' 
>for each possible number to a CSV using RandomInt(1,qHighNo)

Your example contains several issues, like creating and printing an array of 
1,000,000 items, even though a maximum of 50,000 are actually needed/used 
and saving the random value in a two-dimensional array, when the index 
number implicitly already contains that information.

>If qHighNo is set to 32735 ( or below ) then everything is OK
>If qHighno is set to 32736, then this number will never be returned ( 
>every number apart from 32716 is OK )

You mean every number but 32736 is returned (not 32716). This (and the other 
issues) is most likely caused by rounding problems in binary math and how 
the random values are distributed over the range of results.

>If qHighno is set to 50000, then a pattern can be seen in the CSV file 
>of numbers that are never returned ( 3, 6, 9, 12 .... )
>You can also see a different pattern with 70000, 80000 etc

These patterns are caused by the same rounding issue. Only specific values 
can be created by the actual randomizer routine, which then get to be (more 
or less) evenly distributed between the desired range of numbers (starting 
with 1 and ending with the specified value). This causes patterns to emerge 
at much lower number ranges than the 32-thousands.

The key words here are "evenly distributed". That means that on average, all 
the numbers returned come up approximately the same number of times over many 
iterations. But when the range of numbers gets too large, there will be gaps.

For example, if you use 4700 as your high number, you will notice that every 
35th or 36th value has approximately 10% fewer hits than all the other 
values. When the maximum gets much larger, there will be numbers that are 
never returned, with many (evenly distributed) "missing" values. If you 
test any range that is much larger, like 500,000, you will find that only 
32767 distinct values are returned. All others have zero hits. If you do a 
range starting with zero or a negative number, 32768 distinct values are 
returned.
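
The effect described above can be reproduced with a small simulation. This is 
only a sketch in Python, not Xbase++ code, and it rests on assumptions: that 
the generator yields 32767 distinct raw values and that RandomInt() scales 
them onto the requested range by multiply-and-divide. The names 
SOURCE_VALUES, scaled_draw and hit_profile are made up for this illustration. 
The real internals of RandomInt() are not documented in this thread.

```python
from collections import Counter

# Assumption: the raw generator can only produce this many distinct values.
SOURCE_VALUES = 32767


def scaled_draw(raw, low, high):
    """Map a raw value in 0..SOURCE_VALUES-1 onto low..high by scaling (assumed model)."""
    span = high - low + 1
    return low + raw * span // SOURCE_VALUES


def hit_profile(low, high):
    """Count how many raw values land on each result, over every possible raw value."""
    return Counter(scaled_draw(raw, low, high) for raw in range(SOURCE_VALUES))


small = hit_profile(1, 4700)
large = hit_profile(1, 50000)

# Range 1..4700: every result is reachable, but the raw values are shared unevenly.
print(len(small), sorted(set(small.values())))

# Range 1..50000: there cannot be more distinct results than raw values.
print(len(large), 50000 - len(large))
```

Under this model, 133 of the 4700 results receive 6 raw values and the rest 
receive 7, which is roughly one weaker value in every 35 and fits the pattern 
described above. For 1..50000, only 32767 results are reachable and 17233 can 
never occur. The exact hit counts and the exact missing numbers of the real 
function may differ.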

>Does anybody have any answers ?

The above hopefully explains the issues.

There are ways to get you all the numbers in a specified range, but it is 
hard to keep the "evenly distributed" part intact. I have attached a small 
program that tries to accomplish this.

The simplest way is to create a random offset value to fill the gaps between 
the original values created by RandomInt(). This works pretty well and keeps 
the resulting values pretty much (not perfectly) "evenly distributed". But 
it won't work for values with more than 31 bits, because the RandomInt() 
function is limited to 32-bit (signed) numbers (maximum is 0x7FFFFFFF or 
2,147,483,647, minimum is 0x80000000 or -2,147,483,648) for input values.

If two negative values are specified, the original RandomInt() routine will 
sometimes also return a value that is outside the specified range. For 
example, if a range from -300 to -100 is given, the routine sometimes 
returns -99. The new routine avoids that problem as well.
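
The random offset idea described above might look roughly like the following 
Python sketch. It is not the attached AGP_RandomInt() routine, whose source is 
not shown in this thread. It assumes a coarse generator with 32767 distinct raw 
values, and the names SOURCE_VALUES, bucket and gap_filled_int are 
hypothetical. Each raw value owns a contiguous slice of the range, and a second 
draw picks a position inside that slice.

```python
import random

# Assumption: the coarse generator yields this many distinct raw values.
SOURCE_VALUES = 32767


def bucket(raw, span):
    """Return (start, width) of the slice of 0..span-1 owned by one raw value."""
    start = raw * span // SOURCE_VALUES
    end = (raw + 1) * span // SOURCE_VALUES
    return start, end - start


def gap_filled_int(low, high, rng=random):
    """Coarse draw plus a random offset inside its gap (hypothetical helper)."""
    span = high - low + 1
    if span <= SOURCE_VALUES:
        # Small ranges have no gaps to fill, so plain scaling is enough.
        return low + rng.randrange(SOURCE_VALUES) * span // SOURCE_VALUES
    start, width = bucket(rng.randrange(SOURCE_VALUES), span)
    # The offset spreads the result over the whole slice, not just its start.
    return low + start + rng.randrange(width)


rng = random.Random(2020)
print([gap_filled_int(1, 50000, rng) for _ in range(5)])
print([gap_filled_int(-300, -100, rng) for _ in range(5)])
```

The slices differ in width by at most one, so some values are slightly more 
likely than others. That matches the "pretty much (not perfectly)" caveat 
above. Because the slices tile the range exactly, no result can fall outside 
the requested limits.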

Another option is to create two or three smaller numbers -- preferably with 
the same number of bits or similar values -- and then create separate random 
numbers for those smaller ranges and then add those values together, to 
create the larger random numbers. The problem is that the values won't be 
"evenly distributed" anymore, as some of the part values have a higher 
influence on the final value than others. But this allows much larger number 
ranges than the standard RandomInt() function does, which is limited to 
signed 32 bit values.
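
To see why adding separately drawn parts is no longer "evenly distributed", 
here is a toy Python illustration. It is not code from the attached demo, and 
the part size of four values is chosen only to keep the output short. It 
enumerates every equally likely pair of parts and counts how often each sum 
occurs.

```python
from collections import Counter
from itertools import product

# Each part is an even draw from 0..3 (toy size, chosen for illustration).
PART_VALUES = range(4)

# Enumerate every equally likely pair of parts and count the resulting sums.
sum_counts = Counter(a + b for a, b in product(PART_VALUES, repeat=2))

for total in sorted(sum_counts):
    print(total, sum_counts[total])
```

The middle sum 3 can be formed in four ways, while the extremes 0 and 6 can 
each be formed in only one way, so results near the middle of the range come 
up more often than results near the ends.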

The attached demo program runs for about 20 minutes to create 15 demo *.csv 
files. You can also enter one or two values to manually specify a specific 
range on the command prompt. I hope this AGP_RandomInt() routine will help 
you with your immediate problem.

Any code improvements or corrections are appreciated!

Andreas

Andreas Gehrs-Pahl
Absolute Software, LLC

phone: (989) 723-9927
email: Andreas@AbsoluteSoftwareLLC.com
web:   http://www.AbsoluteSoftwareLLC.com
[L]:   https://www.LinkedIn.com/in/AndreasGehrsPahl
[F]:   https://www.FaceBook.com/AbsoluteSoftwareLLC

AGP_RandomInt_Demo.zip