Let's prove it! Roman deity competition

Melior Traiano · Jan 15, 2012

MyOtherName said:
This means the variance of your statistic is 81135. The standard deviation is its square root -- 285.

The value of the statistic you computed -- 800 -- is 2.81 standard deviations away from the mean (0).

The chance of seeing a deviation that large is at most about 1 in 7.8 (by Chebyshev's inequality, and ignoring selection bias).

If we boldly assume the statistic is normally distributed, it means it's closer to 1 in 200 or 1 in 300. (I don't have a table handy to do the lookup) But such an assumption is probably too bold.

If someone was motivated, they could work out the precise significance of the result. Alas I'm not motivated at the moment.

The mean (the number of winning combats minus the number of losing combats) actually should not be zero. It should be something greater than zero, because most of the combats were fought at winning odds.

The question becomes: is the observed number of winning combats significantly more than one would expect, given the stated odds for each combat in this sample of independent trials? Again, I for one don't see anything suspicious about this sample to even bother to run the calculations on it.

In any case, if one were to do this, you'd need to compute a test statistic that compares the odds of winning, as stated by the game, vs. the actual frequency observed in a sample large enough to be reasonably confident that you would detect a difference, if in fact, there were a difference.

leif erikson · Jan 15, 2012

Moderator Action: Removed several posts from this thread for incivility. A few seem to like to make accusations and that is not allowed.
Please read the forum rules: http://forums.civfanatics.com/showthread.php?t=422889

MyOtherName · Jan 15, 2012

Melior Traiano said:
The mean (the number of winning combats minus the number of losing combats) actually should not be zero. It should be something greater than zero, because most of the combats were fought at winning odds.

It is true that the mean of the statistic you quote should not be zero... but it has very little to do with the statistic that is actually being considered.

The statistic is computed by taking each combat with an x% chance of survival, adding 100-x if the unit survived, and subtracting x if the unit died.

The mean is

sum [(100-x) P(survival) - x P(death)] = sum [(100-x)(x/100) - x(100-x)/100] = sum 0 = 0
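That zero mean is easy to check mechanically. The following is only a minimal sketch, not the calculation anyone in the thread actually ran: the function names and the sample odds are made up for illustration. It also shows that a combat with x% survival odds contributes a variance of x(100-x) to the statistic.

```python
from fractions import Fraction

def luck_statistic(battles):
    """Sum the per-combat scores: +(100 - x) for a survival, -x for a death.

    `battles` is a list of (x, survived) pairs, where x is the displayed
    survival chance in percent.
    """
    return sum((100 - x) if survived else -x for x, survived in battles)

def expected_score(x):
    """Expected per-combat score if the displayed odds are correct."""
    p = Fraction(x, 100)
    return (100 - x) * p - x * (1 - p)

def score_variance(x):
    """Variance of the per-combat score: E[score^2] - E[score]^2."""
    p = Fraction(x, 100)
    second_moment = (100 - x) ** 2 * p + x ** 2 * (1 - p)
    return second_moment - expected_score(x) ** 2

# Hypothetical sample, NOT the battles from the game log.
sample = [(80, True), (65, True), (95, False), (30, True)]

print(luck_statistic(sample))                      # 20 + 35 - 95 + 70
print([expected_score(x) for x, _ in sample])      # every entry is 0
print(sum(score_variance(x) for x, _ in sample))   # sum of x * (100 - x)
```

Exact fractions are used so the zero mean comes out as exactly 0 rather than as floating-point noise.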

In any case, if one were to do this, you'd need to compute a test statistic that compares the odds of winning, as stated by the game, vs. the actual frequency observed in a sample large enough to be reasonably confident that you would detect a difference, if in fact, there were a difference.

While you could certainly use such a statistic, one doesn't need to.

MyOtherName · Jan 15, 2012

The Oz-Man said:
I haven't seen this kind of abuse of statistics since the last major political debate.

The solution isn't to abuse statistics in the opposite direction.

If you flip a coin, the odds of getting heads are 50%. If you flip it again, the odds are still 50%. The coin does not remember what your last flip was.

If you flip eight times and get eight heads in a row, it's not because it's a trick coin; it's because the odds of getting heads are 50% each time, and that's what happened.

Specifically, you're rejecting the very notion of statistical inference, and the empirical method itself.

The problem, here, is that you've started with the premise that "the coin is fair" is absolute, unquestionable fact. So, obviously, seeing 8 heads in a row has no effect on our confidence that the coin is fair.

But in reality, we don't have an absolute guarantee that any particular coin is fair. If I pull a coin out of my pocket, I'm merely virtually certain that it's fair. If I flip it 8 times and see 8 heads, that counts as evidence against the hypothesis it's fair -- but the evidence is relatively weak, and so it doesn't shake my confidence.

If I got 50 heads in a row, though, I would be a fool to remain certain that the coin flip was fair.

(aside: there are reasons why the coin flip could be unfair other than the coin itself being unfair)
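To put rough numbers on the 8-heads versus 50-heads argument, here is a small Bayesian sketch. It rests on assumptions that are not in the thread: only two hypotheses are considered (a fair coin, or a rigged one that always lands heads), and the one-in-a-million prior for "rigged" is an arbitrary illustrative choice.

```python
def posterior_fair(heads_in_a_row, prior_rigged=1e-6):
    """Posterior probability that the coin is fair after a run of heads.

    Assumes just two hypotheses: a fair coin, or a rigged one that always
    lands heads. `prior_rigged` is an arbitrary illustrative prior.
    """
    prior_fair = 1.0 - prior_rigged
    likelihood_fair = 0.5 ** heads_in_a_row   # P(run | fair)
    likelihood_rigged = 1.0                   # P(run | always heads)
    evidence = prior_fair * likelihood_fair + prior_rigged * likelihood_rigged
    return prior_fair * likelihood_fair / evidence

for n in (8, 50):
    print(n, posterior_fair(n))
```

Under these assumptions, 8 heads barely moves the needle (the coin is still fair with probability above 0.999), while 50 heads leaves the "fair" hypothesis with a probability below one in a hundred million.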

However, there is a huge selection bias effect which, while I have no idea how to quantify it, I imagine to be immense, rendering any one-off test like this one fairly unreliable.

Mylene had winning odds in nearly all of those battles. She won most of them. She only won against bad odds, what, three times? That's not abnormal, I don't think.

You see, this is what probability and statistics is meant to do -- so that we don't have to "think" or "guess", but instead get cold hard facts.

This is something that is easily quantified -- assuming a random sample according to the given sequence of percentages, the odds that the number Serial computed would have a magnitude of 800 or more are strictly worse than 1 in 7.8. A little unlikely, but only a little. I strongly expect a full analysis could raise that to a double digit number, or maybe even low triple digits.

(which, incidentally, I do not find significant due to selection bias)
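For anyone who wants to reproduce the two figures, here is a short sketch using the variance (81135) and the observed value (800) quoted earlier in the thread. The distribution-free bound is Chebyshev's inequality; the normal tail is only the "bold assumption" case and is not a claim that the statistic really is normal.

```python
import math

variance = 81135      # variance of the statistic, as quoted in the thread
observed = 800        # observed value of the statistic, as quoted

sd = math.sqrt(variance)
z = observed / sd     # distance from the mean (0) in standard deviations

# Chebyshev: P(|S| >= z * sd) <= 1 / z^2 for any distribution with this variance.
chebyshev_bound = 1 / z ** 2

# Two-sided tail if S were exactly normal: P(|Z| >= z) = erfc(z / sqrt(2)).
normal_tail = math.erfc(z / math.sqrt(2))

print(f"sd = {sd:.1f}, z = {z:.2f}")
print(f"Chebyshev bound: 1 in {1 / chebyshev_bound:.1f}")
print(f"Normal two-sided tail: 1 in {1 / normal_tail:.0f}")
```

The Chebyshev bound comes out at a little under 1 in 8, in line with the figure quoted, and the normal tail lands at roughly 1 in 200.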

The problem with statistics is not that they're unreliable or anything like that -- the problem is that people just don't know how to reason with them. The test Serial came up with is a good one...

To look at all that and imply that something was off--and then to throw out a number that means nothing from a statistics standpoint (800% gained!)--is nuts.

... but as you rightly point out, the result of the test was originally reasoned with badly.

The Oz-Man · Jan 15, 2012

...man, you guys are nerds.

I totally mean that in the nicest way possible.

Yes, that was my "I haven't taken a stats class in seven years, but here's a little bit of something anyway" answer up there.

Archbob · Jan 15, 2012

He basically proved that he can do just as well without Praets (or any special UU) as you could do with Praets, so Praets aren't some super-special unit. I've seen enough HA videos online in Civ 4 to know that it works just as well as Praets, and I've played some rush games myself.

It's a good unit but it's not some super-unit that wins you the game by itself. If you can rush and kill your opponents with Praets, you can probably do it with other Civs' UUs as well, or without using UUs at all. I don't understand your obsession with Praets honestly, there are other rushes that work just as well.

Melior Traiano · Jan 15, 2012

Being curious about these numbers, I went ahead & computed the variance & test statistic for this sample:

The test statistic is 5.86, the variance is 6.57, and the standard deviation is 2.56, so if we assume a Gaussian distribution, the one-sided P-value (i.e. the probability of getting a test statistic at least this far above the hypothesized mean of 0) is ~0.01. So call me surprised that the test statistic for this sample is over two standard deviations from the mean.
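The arithmetic behind that P-value can be reproduced from the quoted numbers alone. This is just a sketch under the same Gaussian assumption; it prints both the one-sided and the two-sided tail, and the ~0.01 figure matches the one-sided reading (the two-sided tail is about twice that).

```python
import math

test_statistic = 5.86   # as quoted in the post
variance = 6.57         # as quoted in the post

z = test_statistic / math.sqrt(variance)

# Standard normal upper tail: P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
one_sided = 0.5 * math.erfc(z / math.sqrt(2))
two_sided = 2 * one_sided

print(f"z = {z:.2f}")
print(f"one-sided p = {one_sided:.3f}, two-sided p = {two_sided:.3f}")
```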

But, there are caveats to this:

A valid statistical test would assume random sampling, which did not occur in this case.

Also, it isn't at all clear that a Gaussian distribution would be a good model for combat outcomes in Civ 4. I've seen runs of both good luck and bad luck that are similar to this sample while playing the game, so my suspicion is that a Gaussian fit would only be a very crude approximation to the distribution of differences between combat outcome frequencies vs. game-computed odds. This could be due to the (pseudo-)random number generator in the game not being truly random "enough", the formula(s) for computing odds being mathematically unsound, the rules or mechanics of the virtual dice rolls to determine combat outcomes being inconsistent with the odds computations, some combination of the above, or something else entirely.

Still, the bottom line is that based on what we do know, I just don't see that there's any sound basis for implying that anyone has cheated.

Archbob · Jan 16, 2012

I've noticed that if you start to win in a major stack-to-stack contest, you usually continue to win more than the odds would suggest. In all seriousness, he should have lost perhaps 3-4 more units, but he had odds on the vast majority of combat cases and I doubt losing 3-4 more units would have really stopped his conquest.

TheWilltoAct · Jan 16, 2012

Melior Traiano said:
This could be due to the (pseudo-)random number generator in the game not being truly random "enough", the formula(s) for computing odds being mathematically unsound, the rules or mechanics of the virtual dice rolls to determine combat outcomes being inconsistent with the odds computations, some combination of the above, or something else entirely.

Still, the bottom line is that based on what we do know, I just don't see that there's any sound basis for implying that anyone has cheated.

Wow. You hit that description of the odds calculator out of the park.

Basically a mathematical analysis is superfluous here, in the sense that with such a small sample size and possibly a host of other mitigating factors we must take the game results precisely as they are presented.
In other words, good show Mylene!

vranasm · Jan 16, 2012

wow such dense analysis of ~50 battles...

there are a couple of problems...

in the real world statisticians talk about ~millions of rows to balance things out, and you dig deep into 50 numbers?

this is a computer-generated random number, it's supposed to emulate gaussian, it definitely is NOT random at all... if you want you can dig out the algorithm used, which will prove that it's just a clever mathematical function that tries to emulate random numbers...

that said, I've lost enough 95% battles that I don't see a problem with someone consistently winning at 60%+ odds.
The thing is that losing a 95% battle with cuirassiers is not a problem with another 10 standing nearby, but losing an 80% battle with HAs usually proves a bigger issue.

Thus I don't like BC era rushes that much and the same definitely holds for Praets too. Btw anyone who has watched AZ videos knows how these things can go, since he BC rushes a lot...

Mylene · Jan 16, 2012

I have the autosave (set to every round cos of sgotm) where i got a bit lucky on taking Alex's second-last city, feel free to play the attack yourself ...

Mylene · Jan 16, 2012

Wrong one..

MyOtherName · Jan 16, 2012

vranasm said:
wow such dense analysis of ~50 battles...

there are a couple of problems...

in the real world statisticians talk about ~millions of rows to balance things out, and you dig deep into 50 numbers?

Math is fun.

You only need millions of samples to find subtle things, things that your test isn't very good at picking out, or things that are otherwise hard to detect. Dozens are enough for more obvious features.
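A rough way to see this is to compute the power of a simple one-sided binomial test. The sketch below is a simplification with made-up numbers: it pretends all 50 battles have the same stated 70% odds (the real log has mixed odds), and asks how often the test would flag true odds of 72% versus 90%.

```python
from math import comb

def upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_wins(n, p_stated, alpha=0.05):
    """Smallest win count whose upper tail under the stated odds is <= alpha."""
    for k in range(n + 2):
        if upper_tail(n, p_stated, k) <= alpha:
            return k

def power(n, p_stated, p_true, alpha=0.05):
    """Chance that a one-sided test at level alpha flags the discrepancy."""
    return upper_tail(n, p_true, critical_wins(n, p_stated, alpha))

# Hypothetical setup: 50 battles, all displayed as 70% odds.
for p_true in (0.72, 0.90):
    print(p_true, round(power(50, 0.70, p_true), 3))
```

With these assumed numbers a subtle gap (70% stated, 72% actual) is almost never detected in 50 battles, while a gross one (70% stated, 90% actual) is detected most of the time.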

this is a computer-generated random number, it's supposed to emulate gaussian, it definitely is NOT random at all... if you want you can dig out the algorithm used, which will prove that it's just a clever mathematical function that tries to emulate random numbers...

It is true that bad PRNGs can have a noticeable influence on simple tests, and good PRNGs can be detected with a sufficiently powerful test (often requiring immense amounts of data and/or computation).

The very definition of "good" PRNG is roughly that you can't tell the difference between it and true random with "simple" experiments.

I'm not aware of Civ 4 having a particularly bad PRNG, or any qualities of the statistic we're computing that make it likely to discover bad PRNGs.

PJyang · Jan 16, 2012

Previous content deleted.

EDIT to clarify my previous post: Can sum(p * (1-p)) tell you anything about the actual outcome of the battles?

Imagine there is only one battle with winning odds 0.1: you will get p * (1 - p) = 0.1 * 0.9 = 0.09 whether you win or lose.
Imagine there is only one battle with winning odds 0.9: you will get p * (1 - p) = 0.9 * 0.1 = 0.09 whether you win or lose.

What does the number 0.09 tell you about the outcome of the battle?
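One way to look at it: p * (1 - p) is the variance of a single battle treated as a Bernoulli trial, so on its own it says nothing about the outcome. It is the yardstick the outcome is measured against. A minimal sketch (the function name is made up for illustration):

```python
import math

def standardized_outcome(p, won):
    """How many standard deviations a single result sits from its expectation.

    A battle is treated as a Bernoulli trial: outcome 1 (win) with
    probability p, else 0. Its mean is p and its variance is p * (1 - p).
    """
    outcome = 1 if won else 0
    return (outcome - p) / math.sqrt(p * (1 - p))

for p in (0.1, 0.9):
    for won in (True, False):
        print(p, won, round(standardized_outcome(p, won), 2))
```

Both battles have the same variance of 0.09, but winning at 0.1 odds is 3 standard deviations above expectation, while winning at 0.9 odds is only about a third of one.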

Sorry for my bad English.

TheMeInTeam · Jan 16, 2012

I'm not interested in any of this "was it cheating" crummy stuff. However, I am interested in whether that is the kind of attack one would typically get away with...or if Mylene got "lucky" on an attack that normally would have failed horribly (or had at least a 30-50% chance of ending horribly). In evaluating whether a strategy is worth attempting, the odds of it leading to victory or a stronger position as opposed to failure are very important. One of the reasons I despise early rushes on high difficulties is that it seems there's always a reasonably solid chance that an unlucky run will freeze your offensive and leave you in a losing position.

It's not very easy to determine the exact chance of success with so many battles of variable odds, however, especially given that not all of them had to be fought in that exact order either.
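If the battles are treated as independent with fixed odds, the distribution of the number of wins can be computed exactly with a small dynamic program. This is only a sketch under that assumption: it ignores ordering effects, damage carried between fights, and anything else the game does, and the odds in the example are made up rather than taken from the game log.

```python
def win_count_distribution(odds):
    """Exact distribution of the number of wins over independent battles.

    `odds` is a list of win probabilities (0..1), one per battle.
    Returns a list where entry k is P(exactly k wins).
    """
    dist = [1.0]  # with zero battles, zero wins is certain
    for p in odds:
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1 - p)   # lose this battle
            nxt[k + 1] += prob * p     # win this battle
        dist = nxt
    return dist

def prob_at_least(odds, wins):
    """P(number of wins >= wins)."""
    return sum(win_count_distribution(odds)[wins:])

# Hypothetical odds, NOT taken from the game log.
odds = [0.8, 0.65, 0.95, 0.3, 0.7]
print(prob_at_least(odds, 4))
```

Feeding in the real per-battle odds and the number of wins an attack needs would give the chance of the attack succeeding, subject to the independence assumption above.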

leif erikson · Jan 16, 2012

PJyang said:
The proper way of testing if "Mylene cheats" from a statistical point of view is to compute the probability of the event "one would win the same number of battles or more, given the odds shown in the game log."

Moderator Action: Time to end the cheating discussion.
Please read the forum rules: http://forums.civfanatics.com/showthread.php?t=422889

MyOtherName · Jan 16, 2012

edit: removed comment on variance of Bernoulli trials, in reaction to leif's post.

leif erikson · Jan 16, 2012

MyOtherName said:
edit: removed comment on variance of Bernoulli trials, in reaction to leif's post.

Dirk1302 · Jan 16, 2012

ShengWuLien said:
I haven't been coming to the civfanatics forums for very long, so forgive me for speaking out on this.

All of you are coming close to ruining this place for other people. Both sides of this Praetorian debate are equally to blame--the level of vitriol regarding disagreements about a game is totally disproportionate. We have had half a dozen threads like this now in the last couple weeks, including wild acts of necromancy on threads 6 or 7 years old.

Please, please give it a rest now.

Agree though i don't think everyone is equally to blame. I like a good argument myself and we're not made of sugar here but this praet and Marathon/huge thing is way out of control (I don't mean the Marathon/Huge thing itself but the bickering about it).

ahcos said:
@Seraiel:

Last but not least - she is not my friend, neither am i her friend. I like and appreciate her just like i appreciate other people in the forum that i learned from or whom i had an interesting discussion with, like Dirk. There was a time when Gwaja reanimated the civfanatics IRC chat, and Mylene, Dirk, TMIT, some other folks and me used to hang out there for a few days - but that's about it. Yes, i care somewhat more for her than i care for people from the forums that i know less, but we're far from being friends. Which is sad, but true.

Frankly speaking, i'm not too happy that i have to spell all that out, but i somehow feel i'm forced to do so. And now, please, would you let it go... there's no conspiracy of players protecting each other or a certain way to play Civ4.

Indeed no conspiracy i know of. While i like Mylene and value her input (wouldn't have known of the engineer rush trick without the landsknecht thread), we disagree as often as we agree on things. Different viewpoints sometimes. Agree with you and TMIT quite often but not always. This is not a problem as long as the discussion is civilized and based on arguments. And before we begin talking about "sides" again, i'm not a fan of Marathon/huge but i certainly agree with some of the points Marigold's making about praets on standard speed.

Imo the name calling has to stop right now. Forum is deteriorating rapidly in this way. As it is now i will only post about forum games i play (Musketeers and the Yatta game atm). Discussing strategy in the way it is done now based on insults instead of arguments is a waste of time.

Mylene · Jan 16, 2012

Interesting how it goes on (sorry Leif, but i really feel like clarifying), but no one commented on my autosave.

Here you could see..
- i had enough HAs should i be "less lucky"
- i brought 2 Catas to reduce luck factors, slowing me down
- under peace vassal pressure, and waiting for 2 Catas was already risky

Of course the question was raised "why did i stop building HAs?".
Much easier to throw in more ???, compared to thinking for a while...
Alex reached longbows, and his last city was on a hill. Nothing i could have built to take it from him.
Of course i also expected capitulation..
