A statistical analysis of which start conditions improve the likelihood of winning

Candidly, I suspect variations in player skill would swamp any other factor in any multi-player analysis. Give me a good player with a crap start and he or she will dramatically outperform a poor player with the same start.

The problem is that player skill is not a variable easily controlled for in a statistical study. (It's not like there are any recognized Civ player-rating systems, such as those that exist, for example, with chess (FIDE, USCF, etc.).)

And that's before you get to the issue of reporting bias -- how many abandoned games would go unreported, and how would you even treat abandoned games, if they were reliably reported? This has always been an issue with extrapolating any conclusions from HOF submissions -- players routinely re-roll for "good dirt" (whatever that player subjectively views as good dirt), discard slow games, and only submit their best outcomes, which severely skews the picture you get from surveying HOF submissions.
 
Candidly, I suspect variations in player skill would swamp any other factor in any multi-player analysis. Give me a good player with a crap start and he or she will dramatically outperform a poor player with the same start.

The problem is that player skill is not a variable easily controlled for in a statistical study. (It's not like there are any recognized Civ player-rating systems, such as those that exist, for example, with chess (FIDE, USCF, etc.).)

I don't like to rain on any analytic parade, so I'm hesitant to chime in on this thread, but what Browd says is immensely on point. I see analytic conclusions all the time in sports and other areas, and the tricky part, which is often quite elusive, is how to cut through the noise.

CiV is especially ripe for noise. On the one hand, you have the obvious tiles and conditions that a player would want, but brainstorming those can be done in 30 seconds. As Browd suggests, it's player skill, not start conditions, that makes the largest difference, and skill is immensely difficult to measure.

This is particularly true if the real, subtler question is: what median or slightly above-median advantages would a good player want, but might not actually realize that he or she should want, over something more obvious?
 
Good work as usual and quite interesting! I like to see those low p-values, and they definitely lay the issue to rest for me given the consistency of your dataset. I wasn't sure how big an effect mountain or coast had. I know there are advantages, but there are downsides too: on multi, being on the coast makes you more vulnerable to attack despite the benefit of boosted routes. On multi you are warring a lot, and those routes are easy to plunder and assail, so I was not convinced it would result in a solid win so often. Maybe it is because Filthy is such an experienced multi player. As for mountain, if you start right next to the mountain, of course it's better. But 3 times out of 4 you've moved to get there, and thus your start can be less balanced. You might even have moved OFF water to get there. I've seen some players do that. I agree mining luxes are superior. I discuss why I think so below.

Browd's and others' concerns are quite valid, but I think most of what is discussed forgets the style of the 180 played games.

These are all the same player: FilthyRobot who is known as a very skilled player. With a random dataset you get the problem that a less-skilled player might not even know how to take best advantage of all the start conditions. So, though there is a bias in that Filthy might be better with certain starts, it is as best-controlled as possible by picking a large set of a single, very skilled, player.

These are all on the same settings: pangaea, small, quick speed, so the settings are as homogenized as can be. That said, a weakness of the analysis is that it does not consider the base terrain at all, which I find dramatically affects my game. Is this plains, grassland, tundra, or desert? Is it heavily forested or bare? Is it a heavy-jungle start? Are there nearby hills to work when I produce settlers? I'd like to see these variables tested in your model and reported, since I find them significant in my games.

Lastly, I doubt cherry-picking or rerolling is a problem with Filthy's games. These aren't SP uploads; these are multiplayer games, and thus he has to play out every one. You could argue he maybe got rid of losses, but given that he uploaded maybe 40% losses, I doubt it. It looks like this YT player uploads them all, which removes the bias from cherry-picking games. Also, since it is multi, rerolling is not an issue: Filthy plays every start he gets. Some multi games have player drops, but usually not so many that the game ends prematurely. That's the point of quick speed: to finish in one sitting. So this dataset is actually ideal for testing start conditions, since you don't have a bias from rerolling and probably not one from cherry-picking either.

For these reasons and the fact that they make intuitive sense, I think the finding that mining luxuries are superior, mountain start is better for midgame science, and coast start is better for victory is probably true.

I find mining luxes better for two reasons:
#1: one tech away from utilizing and it's the same tech that you need to chop forests--so on your early tech path already with no diversion
#2: they are mostly on hills and thus after mining give 3 production when worked and you will be working them if only for settler speed. The exception is salt which is amazing as a start for a balanced amount of production AND food. Some luxes are hard to work as the base values are bad but you will be working mining luxes a lot making them good.

This means all I have to do is research mining, steal/build a worker, and I can be pumping out early settlers way, way more quickly by working the mining lux tiles and simultaneously chopping forest. Fast expos on multiplayer mean a stronger game. Most luxes just add gold to the base terrain, so they are only good if you'd want to work that base terrain anyway. Mining luxes are on hills, which are a great terrain feature and result in significantly faster build times, on settlers in particular. The extra gold is just icing on the cake. So I consider them good due to the fast tech and the base terrain. The strength of the lux is inextricably confounded with the value of the base terrain.

Back to base terrain considerations. They DO make a huge difference. Being in pure grassland without hills is particularly harsh for production. Balanced starts with some hill/forest are my better games.

For simplicity, I would not consider any tiles in the third ring, only the first two, since they make the biggest difference when starting.
For a start, I would define classes of base terrain that are common and go from there. Here's the set I'd use:

pure flood-plain+desert
pure grassland start
pure plains start
tundra/forest start
mixed plains/grassland

Having forest to chop in rings 1-2 makes a big early difference, so you really should include that in the analysis if nothing else.
Having nearby hills also helps, and might be part of the reason why mining luxes scored so high, as they are better than normal hills.

I would also independently compare wins with these variables and generate a win% vs. condition curve:

nearby hills, nearby forest, nearby jungle.

light (0-2 tiles), moderate (3-8), heavy (9+ or approximately half or more of 1st+2nd ring tiles)

For these, continuous variables would be better, since you could test each variable once, as in a mixed model, but the classes above could be used for a simpler, binned test. I suspect, though, that while being near a few forest or hill tiles is always great for a start (in my opinion), if you have too many, the lack of growth offsets the effectiveness. Being near a moderate amount of jungle can be as strong as being near a mountain, in my opinion. Maybe even better, since the university comes a bit earlier. My best games have a nice balance, with several nearby forest AND hill tiles but not a majority of them. I don't know how you'd test the gradient without a continuous variable, though.

Lastly, to complete the full analysis you need to consider bonus and strategic terrain resources. You can group them together by class: production or food.
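The light/moderate/heavy bucketing suggested above could be sketched roughly as follows. This is only an illustration: the record layout (a tile count per feature plus a `won` flag) and the sample games are invented here, not taken from the actual dataset.

```python
from collections import defaultdict

def bucket(count):
    """Map a ring 1-2 tile count to the light/moderate/heavy classes."""
    if count <= 2:
        return "light"
    if count <= 8:
        return "moderate"
    return "heavy"

def win_rate_by_bucket(games, feature):
    """Return {bucket: win rate} for one feature (e.g. 'hills').

    `games` is a list of dicts holding a tile count under `feature`
    and a boolean under 'won' (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [wins, games played]
    for g in games:
        b = bucket(g[feature])
        totals[b][0] += int(g["won"])
        totals[b][1] += 1
    return {b: wins / n for b, (wins, n) in totals.items()}

# Hypothetical records, invented purely to show the shape of the data.
games = [
    {"hills": 1, "won": False},
    {"hills": 2, "won": True},
    {"hills": 4, "won": True},
    {"hills": 6, "won": True},
    {"hills": 7, "won": False},
    {"hills": 10, "won": False},
]
rates = win_rate_by_bucket(games, "hills")
```

Plotting `rates` for hills, forest, and jungle separately would give the win% vs. condition curve, though with only three buckets per feature it cannot show where inside a bucket the sweet spot lies.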

food bonus: wheat, bananas, sheep, cattle, bison, deer, fish
production bonus: stone, horses, iron

After improvement the effect can be mixed, but I'm neglecting this.
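As a rough sketch, the two classes above could be encoded as a lookup and used to tally what is in rings 1-2 of a start. The function name and the "other" bucket are assumptions added for illustration; only the two resource lists come from the post.

```python
# Resource classes exactly as listed above.
FOOD_BONUS = {"wheat", "bananas", "sheep", "cattle", "bison", "deer", "fish"}
PRODUCTION_BONUS = {"stone", "horses", "iron"}

def classify_resources(resources):
    """Count food-class and production-class resources near a start.

    Anything not in either list (e.g. a luxury) lands in 'other',
    which is a hypothetical catch-all added for this sketch.
    """
    counts = {"food": 0, "production": 0, "other": 0}
    for name in resources:
        key = name.lower()
        if key in FOOD_BONUS:
            counts["food"] += 1
        elif key in PRODUCTION_BONUS:
            counts["production"] += 1
        else:
            counts["other"] += 1
    return counts

# Example start with four nearby resources (invented).
example = classify_resources(["Wheat", "iron", "salt", "deer"])
```

The two counts could then go into the model as separate variables alongside the terrain classes.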
 
Filthy might be better with certain starts, it is as best-controlled as possible by picking a large set of a single, very skilled, player.

I don't think some of you guys realize how erroneous such an approach is. This isn't how proper statistics is done, especially regression analysis. You do not start with a belief (I.E. Filthy is a skilled player, thus it's best to use his games only), then collect data based on it. Even if it were true, there could still be big difference in gameplay between skilled players, yet we would have no way of knowing this. A large dataset of different sources could even show that the correlation of starting luxuries to wins is statistically insignificant for the vast majority of players, we would also have no way of knowing.
 
Obviously the analysis is limited because it only involves data from one source. However the data is of excellent quality as many others have pointed out. The initial results are obviously very intriguing and we're all clearly excited to debate its implications.

I think the next step is to conduct this same analysis on data from another source. Are there any other highly skilled multiplayer players who have uploaded 100+ games out there?
 
'Games with mining luxuries' is a set that includes the subsets 'salt starts' (known to be a massive advantage) and 'gold/silver starts' (presents a really good Pantheon, and with Currency, these get a bonus particular from Copper). The additional faith, culture and GPT for G/S is one thing, but the sheer power of a Salt capital (presumably accounting for 1/5 of Mining starts) is probably going to skew the data.
 
'Games with mining luxuries' is a set that includes the subsets 'salt starts' (known to be a massive advantage) and 'gold/silver starts' (presents a really good Pantheon, and with Currency, these get a bonus particular from Copper). The additional faith, culture and GPT for G/S is one thing, but the sheer power of a Salt capital (presumably accounting for 1/5 of Mining starts) is probably going to skew the data.

I don't think it would skew the data. It IS the data. 1/5th of "mining" starts are always going to be salt starts for everyone right?

Edit: OH ok you are saying that Mining might not be good...you are saying that salt starts are so insanely good that they make all mining starts look good and if just the salt starts were removed from the calculations mining starts would be not so great.
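One way to check that worry would be to compute the mining win rate twice, once with salt starts included and once without. A minimal sketch, assuming each game is recorded with a luxury name, a luxury type, and a win flag (the field names and the sample games are invented):

```python
def win_rate(games):
    """Fraction of games won; NaN when there are no games."""
    if not games:
        return float("nan")
    return sum(1 for g in games if g["won"]) / len(games)

def mining_with_and_without_salt(games):
    """Return (win rate of all mining starts, win rate excluding salt)."""
    mining = [g for g in games if g["lux_type"] == "mining"]
    no_salt = [g for g in mining if g["lux"] != "salt"]
    return win_rate(mining), win_rate(no_salt)

# Hypothetical games, invented only to illustrate the comparison.
games = [
    {"lux": "salt", "lux_type": "mining", "won": True},
    {"lux": "gold", "lux_type": "mining", "won": True},
    {"lux": "silver", "lux_type": "mining", "won": False},
    {"lux": "copper", "lux_type": "mining", "won": False},
    {"lux": "gems", "lux_type": "mining", "won": True},
    {"lux": "furs", "lux_type": "trapping", "won": False},
]
with_salt, without_salt = mining_with_and_without_salt(games)
```

If the second number stays clearly above the non-mining win rate, salt alone is not carrying the result. With only about a fifth of the mining games removed, the comparison would still be noisy.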

I think mining is a better start because it includes chopping with it and is on the path to cool early wonders like Mausoleum and Pyramids as well as Comp bowmen.
 
I don't think some of you guys realize how erroneous such an approach is. This isn't how proper statistics is done, especially regression analysis. You do not start with a belief (I.E. Filthy is a skilled player, thus it's best to use his games only), then collect data based on it. Even if it were true, there could still be big difference in gameplay between skilled players, yet we would have no way of knowing this. A large dataset of different sources could even show that the correlation of starting luxuries to wins is statistically insignificant for the vast majority of players, we would also have no way of knowing.

I recognize the point you are trying to make (and agree with you to an extent!), but to imply that trying to learn something from a particular sample dataset or case study is erroneous or wrong is a bit ridiculous. As I've already stated, I did not start with the belief "Filthy is a skilled player, so I'll use his data", I took what is honestly the only viable dataset available. No other player has uploaded all their wins and losses from a set of games that all play the same conditions (I even wrote to him to check that the games were complete!). We have the choice of this data... or nothing!

I want to address your point about selection bias in detail, and I'll use the mountain result as an example.

Based on the analysis I did, the correct conclusion to draw is:
- Starting next to a mountain increased the chance of FilthyRobot winning games

If I wanted to say:
- Starting next to a mountain increases the chance of most players winning games
That would technically be a hypothesis, as we only have data from one player.

Selection bias might mean we are wrong. Let's imagine that FilthyRobot is the only player clever enough to build an observatory when next to a mountain (a crude and ridiculous example but hopefully you get the point). We would therefore be wrong to assume that the data applies to all players!

So in that sense you are correct, and up to here I agree with you, but the rational thing to do is to place the results we see in the context of what we know about Civilization V. We all know about the big bonus from observatories, so it is not ridiculous or wrong to propose a broader conclusion that applies to all players. Now, if my article were a formal scientific article, I could state these precise conclusions in the results section and then explore what they might mean in the discussion. But my article isn't (and nor should it be), so I chose to try and integrate my results within the bigger picture to make it an enjoyable read that people could relate to. After all, trying to understand what results from a particular study might mean is part of the scientific process!

As an example from Biology, John Gurdon showed that if you take the nucleus from the skin of a tadpole and put it in a frog's egg cell, the egg is viable and able to turn into an adult frog. He could have stopped his paper there and said "We conclude this works in frogs, we cannot say anything about other animals or cell types". However, he went on to propose that this underpins all complex life, and also a much broader conclusion that DNA contains sufficient information to make an organism (this was a big deal at the time).

Although that might seem off topic, I'm trying to illustrate that case studies of particular examples can still tell us something, and that a valuable part of the process is placing findings in the bigger picture. With the things my article identified, none of them are exactly surprising, right? If the analysis had said settling next to a mountain makes you more likely to lose, then you might have a point in questioning the validity of the approach, but does anything in the results really suggest to you that the conclusions don't apply to most players?

Just to make sure we're on the same page, I'm also very clearly not proposing "the player that starts next to a mountain is more likely to win", that would be ridiculous. I am saying that all other things being equal, I expect the same player to be more likely to win starting next to a mountain than not.

Once again, we have the choice of this data... or nothing, and I'd choose trying to learn something (whatever the caveats) over doing no analysis every time :)
 
Candidly, I suspect variations in player skill would swamp any other factor in any multi-player analysis. Give me a good player with a crap start and he or she will dramatically outperform a poor player with the same start.

The problem is that player skill is not a variable easily controlled for in a statistical study. (It's not like there are any recognized Civ player-rating systems, such as those that exist, for example, with chess (FIDE, USCF, etc.).)

And that's before you get to the issue of reporting bias -- how many abandoned games would go unreported, and how would you even treat abandoned games, if they were reliably reported? This has always been an issue with extrapolating any conclusions from HOF submissions -- players routinely re-roll for "good dirt" (whatever that player subjectively views as good dirt), discard slow games, and only submit their best outcomes, which severely skews the picture you get from surveying HOF submissions.

I don't disagree, but I wanted to make extra clear that I'm not proposing "the player next to a mountain or coast is most likely to win". Instead I am saying that "all other things being equal, the same player is more likely to win when starting next to a mountain or coast than not". I would argue that taking data from one player only is controlling for the variance in player ability as best we can.
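For a single yes/no start condition and a single player, that "same player, with vs. without" comparison reduces to a 2x2 table of wins and losses. The sketch below runs a one-sided Fisher exact test on such a table. It is not necessarily the method used in the article, and the counts in the usage line are placeholders, not the real results.

```python
from math import comb

def fisher_one_sided(wins_with, losses_with, wins_without, losses_without):
    """One-sided Fisher exact test p-value.

    Probability of seeing at least `wins_with` wins among the games that
    had the feature, if the feature had no effect on winning.
    """
    with_feature = wins_with + losses_with
    total_wins = wins_with + wins_without
    n = with_feature + wins_without + losses_without
    upper = min(with_feature, total_wins)
    # Hypergeometric tail: sum the probabilities of all tables at least as extreme.
    tail = sum(
        comb(total_wins, k) * comb(n - total_wins, with_feature - k)
        for k in range(wins_with, upper + 1)
    )
    return tail / comb(n, with_feature)

# Placeholder counts (NOT the real data): 30 wins / 15 losses with a
# mountain, 60 wins / 75 losses without one.
p_value = fisher_one_sided(30, 15, 60, 75)
```

A small `p_value` would only support the narrow claim about this one player's games; extending it to other players remains a hypothesis, as discussed above.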

As for reporting bias, I think I am correct in saying that FilthyRobot uploaded all the games except those that crashed due to technical difficulties. Having seen most of the games, I can safely say there are one or two where he was outplayed, so I don't think he's trying to avoid embarrassment.

The vast majority of channels cherry-pick their best games, so this is a valid concern. However, FilthyRobot and I exchanged a couple of emails before I started the project, and he was interested to know his win/loss ratio. Someone who is only uploading their wins or a subset of games would probably not be asking that.
 
But mountains, mining luxes, and coast are already known advantages. You kind of already know your answer here, so we arguably ignore the player bias exactly because it confirms our intuition. Take a harder question and this kind of analysis fails as a generalisation because of this bias. If FilthyRobot's win rate shows an increase when using nqtradition over nqliberty, it becomes a lot harder to ignore the bias. His use of one tree may simply not be as good as his use of the other. There is no tradeoff with a mountain, so there is no question. It's a direct bonus.

I guess you can still conclude how things go for him in that case.
 
But mountains, mining luxes, and coast are already known advantages. You kind of already know your answer here, so we arguably ignore the player bias exactly because it confirms our intuition.

These threads are all intuition confirmation :). Fun, but quite limited in that respect.

Out of curiosity, Acken, what do you think about my question above: what median or slightly above-median advantages would a good player want, but might not actually realize that he or she should want, over something more obvious?

I think of river systems. Provided I can steal workers, I don't care as much which luxes I have or don't have; I find really good river systems and hill spots to settle on (particularly for a forward settle) an especially powerful ease-of-winning factor: Civil Service food, defensive organization, et al. We know they're good, but maybe they're still undervalued.

Any suggestions, or intuitions?
 
Well, I don't think civ5 provides enough tradeoffs in its maps to easily answer this question. Not much comes to my mind that isn't obvious. My perfect start would look very much like yours. We could maybe look at something like ivory, for example, which provides circuses on top of a luxury. Or having some jungle around.
 
@adsin15: I do think Mining starts are good starts, and I agree with your reasoning. I just think Salt starts are so brilliant they need to be in a distinct league of their own (likewise, for that matter, wiggly rivers through Desert hills).
 
But mountains, mining luxes, and coast are already known advantages. You kind of already know your answer here, so we arguably ignore the player bias exactly because it confirms our intuition. Take a harder question and this kind of analysis fails as a generalisation because of this bias. If FilthyRobot's win rate shows an increase when using nqtradition over nqliberty, it becomes a lot harder to ignore the bias. His use of one tree may simply not be as good as his use of the other. There is no tradeoff with a mountain, so there is no question. It's a direct bonus.

I guess you can still conclude how things go for him in that case.

We're not going to get exhaustive, perfect evidence without noise.

What we do get is some evidence, and it is useful so long as we don't make conclusions that don't follow from it.

For example, what if starting near mountains actually lowered Filthyrobot's win rate over 180 games? That would be inconsistent with our anticipation prior to reviewing the data, and could cause us to reconsider our model that mountains are an advantage.

Coast improved his win rate more than I'd have anticipated it would for an arbitrary player (I was not expecting it to outperform a number of strong luxuries). This could be because Filthyrobot is better than his overall average at utilizing it, yes, but it still serves as a piece of evidence that can influence how players weigh coast starts, even if you wouldn't attach high confidence to it.

Basically, it's not strong evidence, but it's evidence. It's interesting that his tier list had less predictive value on winning than his starting terrain. It certainly makes his basis for putting stuff in one tier vs another questionable; tier rating should reflect win % to at least some degree, but that isn't happening.
 
We're not going to get exhaustive, perfect evidence without noise.

What we do get is some evidence, and it is useful so long as we don't make conclusions that don't follow from it.

For example, what if starting near mountains actually lowered Filthyrobot's win rate over 180 games? That would be inconsistent with our anticipation prior to reviewing the data, and could cause us to reconsider our model that mountains are an advantage.

Coast improved his win rate more than I'd have anticipated it would for an arbitrary player (I was not expecting it to outperform a number of strong luxuries). This could be because Filthyrobot is better than his overall average at utilizing it, yes, but it still serves as a piece of evidence that can influence how players weigh coast starts, even if you wouldn't attach high confidence to it.

Basically, it's not strong evidence, but it's evidence. It's interesting that his tier list had less predictive value on winning than his starting terrain. It certainly makes his basis for putting stuff in one tier vs another questionable; tier rating should reflect win % to at least some degree, but that isn't happening.

Right, I think this nicely summarizes how I feel about it. Of course the things we feel strongest about are those that match our understanding (e.g. mountains = good), but the results can otherwise still be thought provoking. As you say, it's not strong evidence, but it's better than no evidence ;)

I have known people in multiplayer to move off the coast because they felt the disadvantages (vulnerability to frigates) were worse than the advantages. I'm hoping that will at least raise a question in people's minds. Sure, they might not defend as well as FilthyRobot, but at least those people will weigh up the fact they might be throwing away an advantage.

Also, it's one thing for civ experts to say that they feel the conclusions are obvious, but I had multiple replies from people on reddit that had never even really thought about luxuries, and how different ones will affect the speed of the start (other than the obvious salt fanaticism that happens over there). So some people are learning something!

To be honest, I also feel like I have learnt something. As I play single player 80-90% of the time, I genuinely wasn't sure if coast would work out with a net advantage in multiplayer, even if it is for one player. Similarly, given how much people complain when they roll a weak civ in multi-player, I thought that might have made a bigger difference in the data. It also makes me question if coastal start biases haven't been thought about enough in the popular tier lists.
 
Strangely enough, I find all those factors less important than settling along a river. However, the data is highly correlated, so this evidence, as you all say, is not strong evidence, but it's still evidence. You know, a trapping lux may mean a tundra start and a mining lux may mean a Salt start.
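How strongly two yes/no start conditions travel together (say, "trapping lux" and "tundra start") can be measured with the phi coefficient, which is just the Pearson correlation of two binary variables. A minimal sketch, with invented flags standing in for real game records:

```python
from math import sqrt

def phi(xs, ys):
    """Phi coefficient between two equal-length sequences of 0/1 flags."""
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    n10 = sum(1 for x, y in zip(xs, ys) if x and not y)
    n01 = sum(1 for x, y in zip(xs, ys) if not x and y)
    n00 = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # one variable never changes, so correlation is undefined
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical per-game flags (invented): 1 = condition present.
trapping_lux = [1, 1, 0, 0, 1, 0]
tundra_start = [1, 1, 0, 0, 0, 0]
overlap = phi(trapping_lux, tundra_start)
```

A value near 1 or -1 would mean the two conditions can hardly be told apart in the data, so the effect credited to one may really belong to the other.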
 
Based on the analysis I did, the correct conclusion to draw is:
- Starting next to a mountain increased the chance of FilthyRobot winning games
If I wanted to say:
- Starting next to a mountain increases the chance of most players winning games
That would technically be a hypothesis, as we only have data from one player.

I love that you pointed this out just as I was about to post something to this effect!

I agree that you are doing the best you can with the data you have.

OTOH, you are not controlling for a huge variable, namely opponent skill. Does FilthyRobot face the same set of opponents with enough frequency to use only those games?

Failing that, how about using only games where FilthyRobot wins by a certain VC, but then looking for a correlation between terrain and win time?
 
For people saying Captain's analysis is obvious: his findings on how unimportant the civilization is are very surprising. Either we reduce how important we think the bonuses are, or we find something wrong with the analysis (e.g. FilthyRobot only experiments with bad civs when he's playing weaker players, etc.).
 
...his findings on how unimportant the civilization is are very surprising.
I disagree. The conventional wisdom has long been:
Player Skill >>> Starting Dirt >>> Civ Choice

The frequent civ tier discussions never come to any consensus, which implies that such tiers are largely arbitrary. The best attempt at an objective numerical ranking ended with only a single point of difference between tiers, which again implies that the tiers are imaginary.

Still, it is nice to see some objective evidence that the civs are pretty well balanced!
 