Let's discuss Mathematics

Well, there was an interesting question raised in the Civ4 GD forum.

What is the distribution of the number of runs of consecutive heads/tails in a coin toss experiment? How many tosses would you need to perform to have a probability of > 0.5 to have a run of 10 consecutive heads or tails?

I think the continuous case may be the F-distribution, but don't quote me on that.
 
ParadigmShifter said:
How many tosses would you need to perform to have a probability of > 0.5 to have a run of 10 consecutive heads or tails?

Well, you just need a run of 10 consecutive heads or tails, and once you've got that, each of those coins came up heads or tails with probability 1. Oh, you mean before you see the results of the coin flips :lol:
 
My intuition says it should be related to log2 of the number of trials.
 
Maybe look at each tail as a separate event?

i.e. T is length 1, p = 0.5

HT is length 2, p = 0.25

HHT is length 3, p = 0.125

HHHHHHHHHT is length 10, p = 0.000977

HHHHHHHHHH is length 10, p = 0.000977

Then all you want is how long before 10 heads comes up.

It's once in 1024 trials, which means that on average it comes up on the 1024th trial (the wait for a 1-in-1024 event is geometric, and a geometric with success probability p really does have mean 1/p = 1024).

1024 trials will have an average length of 512 x 1 + 256 x 2 + 128 x 3 + ... + 1 x 10 + 1 x 10 = 2046 coin tosses.

Easy enough to redo it for p different from 0.5; here it's p = 1/1024 for the individual success event, and the average event is just under 2 tosses long.
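(Checking that average: the events T, HT, ..., HHHHHHHHHT, HHHHHHHHHH have expected length \sum_{k=1}^{10} k/2^k + 10/2^10 = 2 - 12/1024 + 10/1024 = 2046/1024 ≈ 1.998, so 1024 events really do average 2046 tosses.)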
 
With 100 tosses you get more than one chance to get a run of 10.
 
With 100 tosses you get more than one chance to get a run of 10.

Yep, you get approximately 45 chances.

I can't tell if that was a response to my post, and if it was, what it was refuting.

Every time a tail comes up, you get a new chance at a run of 10 heads. Every time a head comes up, you're still inside the same chance. So every tail has a 1/1024 chance of being the tail right before a run of 10 heads, i.e. of the next 10 tosses all coming up heads.
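(That's where the ~45 comes from: on average about half of the first 90 tosses are tails, each opening a fresh 1/1024 window; a tail after toss 90 leaves no room for a full run.)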
 
Maybe you can do something like this:
[Chance to get a run of 10 in 10 flips]*90-sum(i=11,20,[Chance to get a run of i in i flips]*(100-i))-2*sum(i=21,30,[Chance to get a run of i in i flips]*(100-i))-3*sum(i=31,40,[Chance to get a run of i in i flips]*(100-i))-4*sum(i=41,50,[Chance to get a run of i in i flips]*(100-i))

That is, the number of chances to get a run of 10, times that probability, minus all the ways a run can be double counted.

EDIT2:
2^(1-10)*90-(sum(i=11 to 20) 2^(1-i)*(100-i)) -(2*sum(i=21 to 30)2^(1-i)*(100-i))-(3*sum(i=31 to 40) 2^(1-i)*(100-i))-(4*sum(i=41 to 50) 2^(1-i)*(100-i))

Wolfram yields:
0.00375735
 
My back of the envelope calculation says roughly 4.3% chance of getting a run of 10 heads in 100 coin flips.

It also says 52.3% chance of a run of 6 heads, and 0.064% chance of a run of 16 heads.
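(These are all roughly 1 - (1 - 2^-n)^((100-n)/2): about (100-n)/2 of the tosses are tails early enough to start a run, and each one is a 2^-n shot at being followed by n heads.)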

Who wants to do 1 million tests of 100 virtual coin flips and see how many have a run of 10 heads? I'm not, because my programming skills are non-existent.
 
I'd do a proper histogram of the results, not just record whether we got 10+ in a row or not.

It's very easy in C++ using a std::map for the histogram and a std::vector for the coin toss results.

I can't be bothered downloading Visual C++ express again (I used to have it but I had to reinstall windows).
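Something like this is what I mean, though (untested, since as noted I haven't got a compiler; std::bernoulli_distribution from C++11's <random> is just a convenient coin, rand() would do too):

[code]
// Untested sketch: 1,000,000 trials of 100 coin tosses, histogram of the
// longest run of heads in each trial, plus how many saw a run of 10+.
#include <cstdio>
#include <map>
#include <random>
#include <vector>

int main()
{
    std::mt19937 rng(12345);               // any seed; fixed so it repeats
    std::bernoulli_distribution coin(0.5); // fair coin, true = heads

    const int trials = 1000000;
    const int flips = 100;

    std::map<int, int> histogram; // longest heads run -> number of trials
    int tenOrMore = 0;

    for (int t = 0; t < trials; ++t)
    {
        std::vector<bool> tosses(flips);
        for (int i = 0; i < flips; ++i)
            tosses[i] = coin(rng);

        int longest = 0, current = 0;
        for (int i = 0; i < flips; ++i)
        {
            current = tosses[i] ? current + 1 : 0; // tail resets the run
            if (current > longest)
                longest = current;
        }
        ++histogram[longest];
        if (longest >= 10)
            ++tenOrMore;
    }

    std::printf("runs of 10+ heads: %d of %d trials (%.4f%%)\n",
                tenOrMore, trials, 100.0 * tenOrMore / trials);
    for (std::map<int, int>::iterator it = histogram.begin();
         it != histogram.end(); ++it)
        std::printf("longest run %2d: %d\n", it->first, it->second);
}
[/code]

std::map keeps the histogram sorted by run length, so printing it gives the whole distribution directly.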

EDIT: Looks like it has to do with Fibonacci numbers http://mathworld.wolfram.com/CoinTossing.html
 
Maybe look at it from a different way, e.g. what's the probability that a 100 digit number will contain a string of 10 even digits in a row.

(Or if your brain naturally thinks in binary, a 100 bit number containing 10 "1"s in a row.)

I haven't quite woken up yet, so bear with me... :p
 
It's a lot easier in binary ;)

You could enumerate all 2^100 possibilities to get the exact distribution... I suppose. That's about 1.3 x 10^30 cases, though, so I'm guessing it would take a while.

EDIT: And it doesn't generalise for arbitrary number of tosses.
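EDIT2: Actually, the Fibonacci link above means you don't need to enumerate anything. If p(i) is the probability of no run of 10 heads in i tosses, conditioning on where the first tail lands gives p(i) = \sum_{j=1}^{10} p(i-j)/2^j for i >= 10, with p(i) = 1 for i < 10; working with probabilities instead of counts also dodges the 2^100 overflow. An untested sketch:

[code]
// Untested sketch: exact P(at least one run of 10 heads in 100 tosses)
// using the k-step Fibonacci recurrence from the MathWorld page.
#include <cstdio>
#include <vector>

int main()
{
    const int k = 10, n = 100;
    // p[i] = P(no run of k heads in i tosses); p[i] = 1 for i < k.
    std::vector<double> p(n + 1, 1.0);
    for (int i = k; i <= n; ++i)
    {
        // First tail at position j (after j-1 heads), probability 1/2^j;
        // the remaining i-j tosses must also avoid a run of k heads.
        double sum = 0.0, weight = 0.5;
        for (int j = 1; j <= k; ++j, weight *= 0.5)
            sum += p[i - j] * weight;
        p[i] = sum;
    }
    std::printf("P(run of %d heads in %d tosses) = %.6f\n", k, n, 1.0 - p[n]);
}
[/code]

Changing n or k is free, so unlike the brute force this generalises to any number of tosses.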
 
I don't know how to do it, but I do know that this is a nontrivial subject.

Here are some preliminary thoughts:
In order to succeed you need 10+k tosses for some k = 0, 1, 2, ...
For k = 0 the probability is 1/2^10 = 1/1024.
Now suppose you get HHHHHHHHHH for the first time on the (10+k):th toss, with k > 0. Then the k:th toss must be T, and you must not have succeeded during the first k-1 tosses.
Therefore the probability of succeeding for the first time after n = 10+k tosses is
p_n = 1/1024 * 1/2 * q_{n-11},
where the factors are, respectively, the probability of success in the last 10 tosses, the tail on the k:th toss, and no success during the first k-1 tosses, q_{k-1} = q_{n-11}.
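(And since q_m = 1 - \sum_{j=10}^{m} p_j, the recursion closes and you can grind out the p_n one by one: p_10 = 1/1024, then p_11 = ... = p_20 = 1/2048 because q_m = 1 for m < 10, and the q factor only starts to bite at p_21 = (1/2048)(1023/1024).)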
 
It's 1/512 to start: it doesn't have to be a run of heads, tails are OK too.
 
I'm pretty sure nobody will ever need an exact solution. Can it be estimated? Also I'm not going to worry about the requirement not to have won in the previous k-1 tosses. To me, the utility is just in a rough stab at how many units you'll need to throw at a stack of 10 units if both sides have equal strength. Useful in other games too of course.
 
It's easy to estimate via simulation. You're the Excel whizz, Mise ;) As I said, I haven't got a C++ compiler at the moment.

I wonder if there is an exact distribution for this though, or a continuous version (like the normal for estimating binomial).

EDIT: I have got a ZX Spectrum emulator ;) I'm not going to use that though :lol:
 
Sure it can be estimated! For example, the probability is bigger than that of getting exactly ten H's in n tosses, which is
n!/(10! (n-10)!) * 1/2^n.
(I assume we're talking about only H's here, because as PS said, you can get H's or T's by doubling.)


On the other hand, it's smaller than the probability of getting THHHHHHHHHH at the end of the tosses; this is your approach of disregarding q_n. So for success on exactly the n:th toss that gives at most 1/2048 (n ≥ 11), and for success "by the n:th toss"
\sum_{k=1}^{n-10} 1/2048 = (n-10)/2048.
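(For n = 100 that gives 90/2048 ≈ 4.4%, right in line with the estimates above.)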

I'm pretty sure I did something stupid there. :)

If you want to know whether you or your opponent wins 10 times first, there are easier methods for that; I calculated it yesterday, actually...
 
Yeah, that's probably it!

I've been doing probability thingies lately, because I think I need a real job....

That's something I've been wanting to rant about for a while now: how frustrating these things can be, especially for a mathematician!

First I borrowed Probability with Martingales by David Williams. I had used it ~10 years ago to quickly learn martingales for Banach space studies. Then I noticed I didn't like his conversational style, for which it is often lauded, by the way. Also, I don't remember anything about generating functions and all these things, so I go and borrow the undergraduate book for our local university.

Well, that in turn is unreadable, because the writer doesn't really tell you anything: he uses words like "distribution" or "random variable" without definition. This is of course because they don't want to have measure theory as a prerequisite.

Then I rant about it to a mathematician friend, and she advises me to read another book, also used at the local university, but for graduate students. Again, I can't stand it, because this book doesn't have enough simple exercises, and it still contains undefined or poorly defined things. Like distribution: it's supposed to be "the set of probabilities for a random variable". So if p(1)=p(2)=1/4 and p(3)=...=p(6)=1/8, does this random variable have the same distribution as the one for which p(1)=p(2)=1/8 and p(3)=p(4)=p(5)=1/4, namely {1/4, 1/8} for both of them?

OK, now I borrow Shiryaev's Probability, which is more rigorous, but it has very polarized exercises, either very hard or trivial, and it also uses unorthodox symbols. This, by the way, is a general issue with these probability guys: they always have to use symbols the wrong way. They can't write X^{-1}(A) like normal people, that is, if normal people ever called functions X, but instead it must be {X ∈ A} or something else equally stupid.

Now I've finally found a good book, Feller's Introduction to Probability, which is from the 40s or 50s, and has lots of interesting examples, good exercises, very good justification for things, and also lovely nostalgia of the time when the Harvard computer lab printed heavy books like "binomial probabilities for n < 100".

I've learned that the dropping of bombs in the London Blitz conformed to a Poisson distribution, and that there were actually people who cast 12 dice some 12000 times to find out whether they conform to the multinomial distribution. I've also calculated how many raisins there should be in cookie dough for the probability of a raisin-free cookie emerging to be less than 1/100. :)
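(That one is Poisson too: with an average of λ raisins per cookie, P(no raisins) = e^{-λ}, so you need e^{-λ} < 1/100, i.e. λ > ln 100 ≈ 4.6 raisins per cookie.)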

It's an interesting subject, but it's very difficult to find proper literature.
[/rant]
 
Distribution is just the shape of the probability density curve. Scaling/shifting is just a change of the parameters for the distribution.

They use X as a random variable (from a given pdf usually), so it's both a variable and a function. I guess you're talking about stuff like p(X=a).

We normally use capital Greek letters for cumulative distribution functions (the inverse cdf doesn't come up very often), and we do use proper inverse notation for that.

Generating functions are just a convenient way to express a pdf as a formal power series (an integral transform, I think, for continuous pdf's). EDIT: They have some nice features which make calculating the expected value and variance very easy via differentiation.
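For example, take the geometric wait from earlier in the thread: if X is the number of events until the first success with p = 1/1024, the generating function is G(s) = E[s^X] = ps/(1 - (1-p)s), and differentiating gives E[X] = G'(1) = 1/p = 1024, which recovers the "once in 1024 trials" figure from before.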
 
Yeah, well, those are no longer a problem since I encountered Feller! :love:

I understood that the distribution is "the shape" of the density function; that's the intuitive meaning of the word. The problem, however, was that these writers didn't bother to give it an accurate definition, and yet they had sentences like "these two have the same distribution" or "find the distribution of this". That's shameful!
 