I do not care what you say. (Part 2)

VoiceOfUnreason · Jun 28, 2009

ParadigmShifter said:
Yeah it's an LCG alright.

EDIT: That method returns an integer in the range 0 .. (usNum-1) by the way, and m_ulRandomSeed will be an unsigned long member variable of the class CvRandom.

It's essentially the LCG taken straight out of Plauger's "The Standard C Library". If they had read Teukolsky et al, they might have made another choice, but I suspect that in practice, the standard RNG is "good enough".

Of course, it has the usual off by one distributions when attempting to choose uniformly among possibilities that do not evenly divide the space, but so it goes.

Jet · Jun 28, 2009

VoiceOfUnreason said:
Teukolsky et al

What? I looked quickly and the only books I found did not have him as the the first author.

VoiceOfUnreason · Jun 28, 2009

Jet said:
What? I looked quickly and the only books I found did not have him as the the first author.

Numerical Recipes.

I suppose William Press is the principal author, but I spent a summer playing cards against Teukolsky, so in my mind it is "his" book.

PieceOfMind · Jun 29, 2009

VoR you might be able to help here. In the code for the RNG DanF posted, is the LCG essentially one that uses m = 2^16? It seems in most places I look the pair of A and C that was used tend to go with m=2^32 instead.

If it is 2^16 then that would be its period from what I can tell. Now I'm just trying to figure out if the line

Code:

unsigned short us = ((unsigned short)((((m_ulRandomSeed >> RANDOM_SHIFT) & MAX_UNSIGNED_SHORT) * ((unsigned long)usNum)) / (MAX_UNSIGNED_SHORT + 1)));

does what it is supposed to do. Does anyone know?

ParadigmShifter · Jun 29, 2009

It takes the high 16 bits of the 32-bit m_ulRandomSeed, multiplies it by usNum to give a random int in the range [0, 65535*usNum] then divides by 65536 giving a result in the range [0, usNum-1]

EDIT: It doesn't necessarily mean the period is 2^32 though, as soon as m_ulRandomSeed repeats the sequence is repeated.

PieceOfMind · Jun 29, 2009

Thanks PS. Why is the "& MAX_UNSIGNED_SHORT" part necessary? Is part of the taking the 16 high bits? I'm guessing it is just the shift operation that does the taking the high bits.

ParadigmShifter · Jun 29, 2009

I think it is irrelevant. It masks out the high 16 bits (zero's them), but since a right shift with an unsigned int doesn't bring in leading 1's if the top bit is set (which does happen with signed ints since top bit set => negative number, so -1 >> 16 = -1 still).

It's processor/compiler implementation dependent though I think (i.e. not specified in the C++ standard) so it is the safe way to do it on all platforms.

EDIT: You are correct, the right shift 16 bits is extracting the high 16 bits.

Ultimocrat · Jun 29, 2009

I still think this response is a little lacking in the math department.

Given n 50-50 battles (or coin flips), the number of ways to win exactly r is: n!/(r!(n-r)!)

For n = 100, doing the factorials manually is prohibitively difficult, so we use a form of Stirling's approximation to estimate the logarithms of the factorials.

LN(m!) ~ m*LN(m) - m + 1/2*LN(2*PI*m)

for n=100, r=38 we have

LN(100!/(38!*62!)) = LN(100!) - LN(38!) - LN(62!)
~100*LN(100) - 100 +1/2*LN(2*PI*100) - 38*LN(38) + 38 - 1/2*LN(2*PI*38) - 62*LN(62) + 62 -1/2*LN(2*PI*62)

~ 63.908

and the total number of ways of rolling 100 50-50 battles is 2^100, and LN(2^100) = 100*LN(2)

~ 69.315

the probability of rolling exactly 38 wins out of 100 is thus
p(38:100) = [100!/(38!*62!)]/[2^100]
LN(p(38:100) = LN([100!/(38!*62!)]/[2^100]) = LN[100!/(38!*62!)] - LN[2^100]

~ -5.407

exponentiating gives p(38:100) ~ 0.0045, or 0.45%

peak probability of a single outcome is 50 wins, with ~8% odds of occurring.

PieceOfMind · Jun 29, 2009

Ultimocrat, isn't the maths itself fairly pointless unless it is used to support an argument? 50 wins may be the most probable outcome, and 38 wins might have 0.45% chance, but that doesn't tell you much else. You can't use those facts alone to argue anything about the RNG.

PieceOfMind · Jun 29, 2009

I've just finished doing the Runs test on the RNG using a bit string of length 2^28, which is one 16th of the full period. Using the RNG to produce random bits, it passed the Runs Test detailed in page K.4 of this document which I've been using as a reference for these tests.

This test confirms that the RNG produces as many runs of ones and zeroes as should be expected.

Ultimocrat · Jun 29, 2009

Ultimocrat, isn't the maths itself fairly pointless unless it is used to support an argument? 50 wins may be the most probable outcome, and 38 wins might have 0.45% chance, but that doesn't tell you much else. You can't use those facts alone to argue anything about the RNG.

I'm not trying to say anything about whether the RNG is working correctly, just trying to point out that the probability of occurrence of any outcome from the binomial distribution can be calculated in a fairly straightforward fashion, that nobody had actually gone through it in this thread yet, and that it is the basis for the way to think about odds over a large number of battles.

PieceOfMind · Jun 29, 2009

Ultimocrat said:
I'm not trying to say anything about whether the RNG is working correctly, just trying to point out that the probability of occurrence of any outcome from the binomial distribution can be calculated in a fairly straightforward fashion, that nobody had actually gone through it in this thread yet, and that it is the basis for the way to think about odds over a large number of battles.

Ah ok. I would have thought that a figure that would be more interesting than finding the probability of winning 38 battles would be winning between 38 and 62 battles. i.e. What is the confidence of the outcome falling in that range? (it's about 99% or more IIRC). Getting exactly 38 wins, and its probability are not all that useful because we don't judge that particular result to be any more interesting than winning 39 battles or 40 battles, which would also have similarly small odds of occuring.

I suppose you could find the standard deviation from the mean by using the normal distribution approximation. For memory, getting a result outside of 3 standard deviations from the mean is about 0.3% odds or so.

ParadigmShifter · Jun 29, 2009

I wouldn't call using Stirling's approximation to calculate 100! very straightforward

Especially when N>20 the Normal distribution with mean np and s.d. sqrt(npq) with the appropriate continuity correction applied is a very very good approximation

EDIT: for 100 50-50 battles the standard deviation would be sqrt(100*0.5*0.5) = 5 by the way PoM.

PieceOfMind · Jun 29, 2009

ParadigmShifter said:
EDIT: for 100 50-50 battles the standard deviation would be sqrt(100*0.5*0.5) = 5 by the way PoM.

Yeah I noticed just now.

Hey I knew what the mean would be! :lol:

I'm getting the odds of being outside of 38 to 62 as about 2.8%. Similar to the significance test done a couple pages ago, the number is not all that alarming. Less than 1% is where one really ought to be suspicious.

Ultimocrat · Jun 29, 2009

I wouldn't call using Stirling's approximation to calculate 100! very straightforward

Why not? It's just elementary statistical mechanics...

I would have thought that a figure that would be more interesting than finding the probability of winning 38 battles would be winning between 38 and 62 battles. i.e. What is the confidence of the outcome falling in that range? (it's about 99% or more IIRC). Getting exactly 38 wins, and its probability are not all that useful because we don't judge that particular result to be any more interesting than winning 39 battles or 40 battles, which would also have similarly small odds of occuring.

Yeah, it's 98.8% odds for the 38-62 (inclusive) range. 45-55 ~ 72.9%; 40-60 ~ 96.5%; 35-65 ~99.8% (if you can trust excel with large to very large numbers).

PieceOfMind · Jun 29, 2009

Ultimocrat said:
(if you can trust excel with large to very large numbers).

Depends what you do with the large numbers.

Some operations are stable - others aren't.

Ultimocrat · Jun 29, 2009

Depends what you do with the large numbers. Some operations are stable - others aren't.

Just dividing by a SUM() with arguments on the order of 10^28, which are the result of EXP() with arguments on the order of 60-70.

Cross-checking with the applet referenced earlier: http://onlinestatbook.com/stat_sim/normal_approx/index.html
confirms my results, so perhaps I should remove the caveat.

ParadigmShifter · Jun 29, 2009

I've got a stats tables book with logs of factorials in it.

log(100!) = 157.970004
log(62!) = 85.4978964
log(38!) = 44.7185205

EDIT: Whoops, just checked, they are logs to the base 10

How ridiculous is that???

But seriously, just use the normal approx

So for 100 50-50 trials, chances of between 38 and 62 is (with continuity correction applied)

phi( (62.5 - 50)/5 ) - phi( (37.5 - 50)/5 )

= phi ( 2.5 ) - phi( -2.5 )

= 0.9938 - 0.0062 = 98.76%

PieceOfMind · Jun 29, 2009

Note I accidentally linked the wrong pdf in post 90, which is now fixed.

When I reported 2.8% for 38 to 62 before, I neglected to take the continuity correction PS speaks of into account.

ParadigmShifter · Jun 29, 2009

Here's the normal dist approx for the other figures in Ultimocrat's post (#95)

45-55: phi( (55.5-50)/5 ) - phi( (44.5-50)/5 ) = phi(1.1) - phi( -1.1 ) = 0.8643 - 0.1357 = 72.86%

40-60: phi( (60.5-50)/5 ) - phi( (39.5-50)/5 ) = phi( 2.1 ) - phi( -2.1 ) = 0.9821 - 0.0179 = 96.42%

35-65: phi( (65.5 - 50)/5 ) - phi( (34.5 - 50)/5 ) = phi( 3.1 ) - phi( -3.1 ) = 0.99865 - 0.0010 = 99.77%

I do not care what you say. (Part 2)

Deity

No, no, please. Please.

Deity

Drill IV Defender

Random Nonsense Generator

Drill IV Defender

Random Nonsense Generator

Warlord

Drill IV Defender

Drill IV Defender

Warlord

Drill IV Defender

Random Nonsense Generator

Drill IV Defender

Warlord

Drill IV Defender

Warlord

Random Nonsense Generator

Drill IV Defender

Random Nonsense Generator

Similar threads