stats question

Hygro

This guy on twitch was playing Diablo 2 single player trying to get every item with one char.

He has one item left. The estimated rate the item drops, per boss, is 1/110000.

I estimated that he kills 4 of such bosses a minute (dunno if I was right).

I built a little calculator to see how many hours it would take for him to have a chance that basically converged to 1, which was a little over 450 hours. Let's say it's 400.


If something basically is converging to 1, you can say it's effectively guaranteed that in that amount of time the thing will happen.

But we also know that no matter what came up before, the next coin flip is an independent event.

So what if this guy is unlucky and doesn't get the item in the first 200 hours? That's pretty likely, he's got a 50% chance. But over 400-some hours it's pretty damn close to 1. Yet 2 sets of 200 hours is just two heads-or-tails chances.

So while it would be an exceedingly rare event in the universe if he doesn't get the item in the 400 hours, if he fails to get it in the first 200, and he's not more likely to get it in the second half than the first, at the point of the second half you would only bet he has a 50% chance that the all-but-guaranteed activity will happen. This is very strange to me.


Semantically I understand that the ~100% chance is a chance that "it will have happened" in that time frame, but if it doesn't happen in the first half of the time frame, which is quite likely, how does that not change the odds of the second half, which is itself independently quite likely?
 
The only reason I can think is that the way you're drawing your conclusions is by looking at the aggregate. He's going to be more likely over time to pick up the item since he's taken more chances at fighting the bosses.

If he's more likely to pick it up in the second half than the first half, why doesn't he play the second half first? :mischief:
 
It's a geometric distribution : the only distribution that has no "memory".

The error in your reasoning is that "converging to 1" in 400h is meaningless : the probability (which should be approximately 1-(109999/110000)^(400x4) ) is close to 1 but not equal to 1. Only the limit when t->infinity is converging to 1
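
To make the "no memory" property concrete, here is a minimal Python sketch. It assumes the thread's estimates (a 1/110000 drop chance per kill and 4 kills a minute, so 48,000 kills per 200 hours); the names `P_DROP`, `KILLS_PER_200H` and `p_no_drop` are made up for illustration.

```python
# Memorylessness of the geometric distribution, using the thread's estimates.
P_DROP = 1 / 110_000            # estimated drop chance per boss kill
KILLS_PER_200H = 200 * 60 * 4   # assumes 4 kills a minute


def p_no_drop(kills: int) -> float:
    """Probability that the item has NOT dropped after `kills` independent kills."""
    return (1 - P_DROP) ** kills


# Chance of a drop in a fresh block of 200 hours.
p_fresh = 1 - p_no_drop(KILLS_PER_200H)

# Chance of a drop in the second 200 hours, GIVEN nothing dropped in the first 200.
# P(no drop in 400h) / P(no drop in 200h) is the conditional chance of staying empty-handed.
p_after_dry_spell = 1 - p_no_drop(2 * KILLS_PER_200H) / p_no_drop(KILLS_PER_200H)

print(f"fresh 200h:            {p_fresh:.4f}")
print(f"200h after a dry 200h: {p_after_dry_spell:.4f}")
```

Both printed values agree: a dry first half neither helps nor hurts the second half.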
 
Using your estimations (1/110000 rate, 4 bosses in a minute) I got ~0.35 chance in 200 hours and ~0.58 in 400 hours.
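
Those two figures can be reproduced with a few lines of Python. This is only a sketch under the thread's estimates (1/110000 per kill, 4 kills a minute); `p_drop_within` is a hypothetical helper name.

```python
P_DROP = 1 / 110_000      # estimated drop chance per kill (from the thread)
KILLS_PER_HOUR = 4 * 60   # assumes 4 kills a minute (from the thread)


def p_drop_within(hours: float) -> float:
    """Chance of at least one drop within `hours` of farming."""
    kills = hours * KILLS_PER_HOUR
    return 1 - (1 - P_DROP) ** kills


for h in (200, 400):
    print(f"{h} hours: {p_drop_within(h):.2f}")
# 200 hours: 0.35
# 400 hours: 0.58
```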
 
Semantically I understand that the ~100% chance is a chance that "it will have happened" in that time frame, but if it doesn't happen in the first half of the time frame, which is quite likely, how does that not change the odds of the second half, which is itself independently quite likely?
It is a difficult thing to convince yourself of, but it is true. One way to answer would be: how would it change the odds? If the computer program is not specifically designed to alter the odds, how would the odds of getting the item change depending on how many times he has tried?

BTW, I think you are wrong about the ~100% chance that "it will have happened" in that time frame. Here is what I think the probability is that he will not have got it after a given number of hours, and it is still 0.2% after 2800 hours.


How I made the graph (I may be wrong):
Code:
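library(ggplot2)

# Chance of the item dropping on a single boss kill
p_one <- 1/110000
# Expected number of drops per hour at 4 kills a minute
rate_hour <- p_one * (4*60)

hours <- 1:2800
# Expected number of drops after each number of hours
lambda_hours <- rate_hour * hours

# Poisson probability of zero drops so far, i.e. exp(-lambda_hours)
p_zero <- ppois(q = 0, lambda = lambda_hours, lower.tail = TRUE)
df <- data.frame(hours, p_zero)

ggplot() +
  labs(title = "Chance of not getting item",
       x = "Hours",
       y = "Probability") +
  geom_line(data = df, aes(x = hours, y = p_zero))

tail(df$p_zero, n = 1)  # [1] 0.00222257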
 
Seems a bit like the frog jumping to the stone in the pond
where his first jump takes him halfway and as he tires
each subsequent jump is half that of the previous jump.
 
It's RNG, so if you play long enough, what you want will drop. If the item is locked to a particular boss, then you just have to keep killing that boss. Stay on task and you will get there. POE is an RNG-driven game for the best loot. The rarest item (Mirror of Kalandra) can drop anywhere from any source that produces loot. In 7600 hours of play I've never seen one. No-lifers see them with some regularity. It is all about luck and time on task.
 
So while it would be an exceedingly rare event in the universe if he doesn't get the item in the 400 hours, if he fails to get it in the first 200, and he's not more likely to get it in the second half than the first, at the point of the second half you would only bet he has a 50% chance that the all-but-guaranteed activity will happen. This is very strange to me.
It's because you've arbitrarily grouped the salient events - each Boss killed - into hours, and then arbitrarily split that into two groups of 200. As you've laid out the equation, the only meaningful units are Each Boss Killed and Total Bosses Killed. "Hours played" has no meaningful relationship to the variable, whether or not the item drops.
 
So what if this guy is unlucky and doesn't get the item in the first 200 hours? That's pretty likely, he's got a 50% chance. But over 400-some hours it's pretty damn close to 1. Yet 2 sets of 200 hours is just two heads-or-tails chances.

If one set of 200 hours gives a 50% chance, then two sets of 200 hours, or one set of 400 hours, give a 75% chance of getting it at least once.
If it is not, there are two possibilities:
1) The events are not independent
2) Your calculations are wrong.
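
The 75% figure follows from multiplying the chances of failing each independent block. A minimal Python sketch, where `p_at_least_once` is a hypothetical helper name and the 50% input is the original post's premise rather than a computed value:

```python
def p_at_least_once(p_single_block: float, blocks: int) -> float:
    """Chance of at least one success across `blocks` independent, identical blocks."""
    p_fail_every_block = (1 - p_single_block) ** blocks
    return 1 - p_fail_every_block


print(p_at_least_once(0.5, 1))  # 0.5
print(p_at_least_once(0.5, 2))  # 0.75
```

So a calculator that reports 50% for 200 hours and nearly 100% for 400 hours cannot be describing independent kills.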
 
It's the same reason why getting a heads in 2 coin flips is 75% but only 50% with one flip. With the 2nd flip, you only have a 50% chance of getting heads.
Right, one unit of two coin-flips isn't the same as two units of one flip apiece. If you say, "If I flip this coin twice, there's a 75% chance I'll get Heads at least once." Then you flip the coin just once, stop, and ask a whole new question: "Okay, I flipped the coin once and it came up tails. Why aren't my odds of getting Heads at least once with another flip of the coin 75%?" Because you've re-set the entire experiment. In effect, as soon as you stop to ask the question about the second coin flip, you've abandoned your original experiment halfway and are starting over from scratch.
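
The two questions can be told apart by listing every outcome of two fair coin flips. This is an illustrative Python sketch; the variable names are arbitrary.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Question asked before any flip: at least one Heads in two flips?
with_heads = [o for o in outcomes if "H" in o]
p_before = Fraction(len(with_heads), len(outcomes))

# Question asked after seeing Tails on the first flip: only TH and TT remain possible.
first_tails = [o for o in outcomes if o[0] == "T"]
p_after = Fraction(sum(1 for o in first_tails if "H" in o), len(first_tails))

print(p_before)  # 3/4
print(p_after)   # 1/2
```

Knowing the first flip was Tails throws away HH and HT, which is why the 75% no longer applies.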
 
Right, one unit of two coin-flips isn't the same as two units of one flip apiece. If you say, "If I flip this coin twice, there's a 75% chance I'll get Heads at least once." Then you flip the coin just once, stop, and ask a whole new question: "Okay, I flipped the coin once and it came up tails. Why aren't my odds of getting Heads at least once with another flip of the coin 75%?" Because you've re-set the entire experiment. In effect, as soon as you stop to ask the question about the second coin flip, you've abandoned your original experiment halfway and are starting over from scratch.
I understand how this isn't metaphysical, but it sure seems like it.
 
If he's more likely to pick it up in the second half than the first half, why doesn't he play the second half first? :mischief:
LMFAO! :lol:
Only the limit when t->infinity is converging to 1
Good point, converging on one is an actual math trait and I substituted my language's rounding limits for the real math in using that terminology, which is wrong.


It's because you've arbitrarily grouped the salient events - each Boss killed - into hours, and then arbitrarily split that into two groups of 200. As you've laid out the equation, the only meaningful units are Each Boss Killed and Total Bosses Killed. "Hours played" has no meaningful relationship to the variable, whether or not the item drops.
Wait, why can't we use time interchangeably if the time is set to the actions?

2) Your calculations are wrong.
I'm going to have to revisit it, maybe...
 
Wait, why can't we use time interchangeably if the time is set to the actions?
You sort of said it yourself in the first post:

But we also know that no matter what came up before, the next coin flip is an independent event.
If you decide that the next coin flip is an independent event, then the answer could be different from evaluating the set because you're asking a whole new question.

So what if this guy is unlucky and doesn't get the item in the first 200 hours? That's pretty likely, he's got a 50% chance. But over 400-some hours it's pretty damn close to 1. Yet 2 sets of 200 hours is just two heads-or-tails chances.
In this case, you're dividing one set into two and looking at each independently, where the result of the first doesn't impact the second. Technically the questions aren't even the same. Because you already have the results of the first, 200-hour set, the question you're answering is, Did 200 hours produce Result X? The question you're asking about the second set is, Might 200 hours produce Result X?
 
You sort of said it yourself in the first post:
Oh I see, you mean you can't divide up the time halfway as part of some progress bar. Yeah, I thought you were saying you can't convert the math to a time rate from an action rate and I was like.... :hmm: Clear now.


In this case, you're dividing one set into two and looking at each independently, where the result of the first doesn't impact the second. Technically the questions aren't even the same. Because you already have the results of the first, 200-hour set, the question you're answering is, Did 200 hours produce Result X? The question you're asking about the second set is, Might 200 hours produce Result X?
To be honest I think I was legitimately tripped up by the idea of how certain 400 hours was but how uncertain 200 hours was. Adrien set me straight and the rest of you ironed me out.

It didn't help I tried asking some math people and they were trying to tell me otherwise until they weren't. I figured CFC would understand me more clearly. I figured correctly.

I will be reviewing my calculator, and after reviewing it I will post it up here for criticism. It seems the numbers aren't consistent, and you guys are pulling numbers closer to my intuition for what the probabilities would be. Mine is suspiciously linear, but I chalked that up to being so close to a multiple of base 10 and my intuition not being a help in a calculation with that many iterations.
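
The calculator is not shown in the thread, so the following is only a guess at one common way to end up with a linear curve: adding the per-kill chance once per kill instead of compounding the chance of missing. The sketch assumes the thread's estimates, and both function names are hypothetical.

```python
P_DROP = 1 / 110_000      # estimated drop chance per kill (from the thread)
KILLS_PER_HOUR = 4 * 60   # assumes 4 kills a minute (from the thread)


def linear_guess(hours: float) -> float:
    """Naive running total: add P_DROP once per kill, capped at 1."""
    return min(1.0, hours * KILLS_PER_HOUR * P_DROP)


def exact(hours: float) -> float:
    """Chance of at least one drop, compounding the chance of missing every kill."""
    return 1 - (1 - P_DROP) ** (hours * KILLS_PER_HOUR)


for h in (100, 200, 400, 459):
    print(f"{h:>4} hours  linear: {linear_guess(h):.3f}  exact: {exact(h):.3f}")
```

Under these assumptions the naive total hits 1 at 110000 / 240, about 458 hours, which happens to sit close to the "little over 450 hours" in the first post, while the compounded chance at that point is only around 63%.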
 
Just FYI, if you are working on your calculator, I used the poisson approximation to the binomial distribution.
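
For a drop chance this small the approximation is very tight. A rough Python check under the thread's estimates (1/110000 per kill, 4 kills a minute), with made-up function names:

```python
import math

P_DROP = 1 / 110_000      # estimated drop chance per kill (from the thread)
KILLS_PER_HOUR = 4 * 60   # assumes 4 kills a minute (from the thread)


def p_none_exact(hours: int) -> float:
    """Exact chance of zero drops: every one of the kills has to miss."""
    return (1 - P_DROP) ** (hours * KILLS_PER_HOUR)


def p_none_poisson(hours: int) -> float:
    """Poisson approximation: exp(-expected number of drops)."""
    return math.exp(-hours * KILLS_PER_HOUR * P_DROP)


for h in (200, 400, 2800):
    exact, approx = p_none_exact(h), p_none_poisson(h)
    print(f"{h:>4} hours  exact: {exact:.8f}  poisson: {approx:.8f}")
```

The two columns differ only far past the decimal places anyone in the thread is quoting, so either method works for this question.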
 
I will be reviewing my calculator and after reviewing it will post it up here for criticism.
If all you need is to calculate the probability after a certain number of hours, you can use the formula

p=1-(109999/110000)^(N*4*60), where N is the number of hours

as in AdrienIer's post, just with a small correction (the exponent is multiplied by 60, since there are 4 kills a minute).
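
The same formula can be turned around to ask how many hours are needed for a given chance, which was the original question. A minimal Python sketch under the thread's estimates; `hours_for_probability` is a hypothetical name.

```python
import math

P_DROP = 1 / 110_000      # estimated drop chance per kill (from the thread)
KILLS_PER_HOUR = 4 * 60   # assumes 4 kills a minute (from the thread)


def hours_for_probability(target: float) -> float:
    """Hours N such that 1 - (1 - P_DROP)^(N * KILLS_PER_HOUR) equals `target`."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve (1 - P_DROP)^kills = 1 - target for kills, then convert to hours.
    kills = math.log(1 - target) / math.log(1 - P_DROP)
    return kills / KILLS_PER_HOUR


for target in (0.5, 0.9, 0.99):
    print(f"{target:.0%} chance: {hours_for_probability(target):.0f} hours")
```

With these inputs a coin-flip chance takes roughly 318 hours and 99% takes roughly 2,100 hours; no finite number of hours gives exactly 100%.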
 