[R&F] Use machine learning to select better city settlement locations

Kebnoa

I ran an experiment to see if I could use a Machine Learning model to determine what makes for a good city based only on the starting tiles' terrain, features, and resources.

Here are the results of the model:
  • 2 or more Grassland (Hills) with Woods tiles are great. (1 is better than none, though)
  • 1x Luxury is great. (More than 1 isn't significant, not having one is significantly negative).
  • 2x Stone is good. (1 is better than none)
  • Bonus resources are good and are significant in this order of preference:
    • 1x Bananas tile.
    • 1x Rice tile, more than 1 isn't significant.
    • 2 or more Wheat tiles.
    • 1x Deer tile
    • 1x Fish tile
  • 2x Plains with Woods is good.
  • 2x Plains (Hills) with Rainforest is good.
  • Minimal, or no, Grassland with Woods tiles are preferable.
  • Plains with Rainforest are positive.
  • 8 or more Grassland tiles are positive. (Less than this is generally negative)
  • 4 or more Grassland (Hills) tiles are positive.
  • 1x Coast and Lake tile is marginally positive, more than this is negative.
  • Minimal Plains tiles are preferable.
  • 1 or more Grassland Mountain tiles are positive
  • Not settling next to a River is negative.
You can read much more at the GitHub project I just published.

Questions, comments, critiques all welcome. Tell me what you think :-)
 
Really interesting, thanks for conducting the experiment and sharing it. A few quick questions (the answers might be apparent if I dig into the mod / code), but did I understand this correctly:
- You measure total yields of a city during its total lifespan?
- How are tile improvements managed? Is it "standard AI" behaviour?
- Are districts built as well?
 
On Autoplay the computer plays the game for you; you don't actually need to do anything, just observe if you want. This is base FireTuner functionality. It is great for testing mods and gathering gameplay info.
 
Really interesting, thanks for conducting the experiment and sharing it. A few quick questions (the answers might be apparent if I dig into the mod / code), but did I understand this correctly:
- You measure total yields of a city during its total lifespan?
- How are tile improvements managed? Is it "standard AI" behaviour?
- Are districts built as well?

Thank you!

I tried to keep it simple. I recorded the per-turn yields of each city settled within the first 3 turns, and I stopped recording the data after 55 turns. To calculate the city score, I summed these per-turn yields to determine the total yield produced in 50 turns.
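
The scoring described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the record layout (one dict of yields per turn) and the exact list of tracked yields are assumptions, apart from Faith being left out, which is mentioned later in the thread.

```python
from typing import Dict, List

# Assumed set of tracked yields; Faith is deliberately excluded.
YIELDS = ("food", "production", "gold", "science", "culture")


def city_score(per_turn: List[Dict[str, float]], turns: int = 50) -> float:
    """Sum every tracked yield over the first `turns` recorded turns."""
    window = per_turn[:turns]  # ignore anything recorded after the cut-off
    return sum(turn.get(name, 0.0) for turn in window for name in YIELDS)


# Hypothetical city: 55 recorded turns with constant yields.
records = [{"food": 2.0, "production": 1.0, "faith": 5.0}] * 55
score = city_score(records)  # only 50 turns counted, faith ignored
```

Truncating to a fixed window keeps cities settled on turn 1 and turn 3 comparable, since each contributes the same number of turns to its score.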

The idea was to simplify as much as possible. With this in mind I used the 19 tiles "visible" at settle time as input, and the total yield after 50 turns as output. I didn't capture or try to model any player actions, for example what impact building a granary has, or build order. All the actions by the AI etc. are "standard AI" behaviour, except for my own actions of course. This includes buildings, districts, improvements, etc.
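
To make the input side concrete, here is a minimal sketch of turning the 19 tiles into count features. The tile tuple layout and the label strings are assumptions for illustration. Lumping terrain and feature into one label follows what is described later in the thread; treating resources as separate counts is a guess based on how the results are phrased.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Assumed tile representation: (terrain, feature or None, resource or None).
Tile = Tuple[str, Optional[str], Optional[str]]


def tile_counts(tiles: List[Tile]) -> Dict[str, int]:
    """Count terrain+feature labels and resources across the given tiles."""
    counts: Counter = Counter()
    for terrain, feature, resource in tiles:
        # Terrain and feature are lumped into a single label.
        label = terrain if feature is None else f"{terrain} with {feature}"
        counts[label] += 1
        if resource is not None:
            counts[resource] += 1
    return dict(counts)


# Hypothetical 19-tile start.
start = (
    [("Grassland", None, None)] * 8
    + [("Grassland (Hills)", "Woods", None)] * 2
    + [("Plains", None, "Wheat")] * 2
    + [("Coast", None, "Fish")]
    + [("Plains", "Rainforest", None)] * 6
)
features = tile_counts(start)
```

Each distinct label becomes one model input, which is why the number of features grows so quickly once more detail is added.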

To reduce variability and improve comparability I played at Prince level (least AI benefits) on continent maps exclusively.

<edited to clarify>
 
So it seems the algorithm puts a focus on food and on tiles with natural Cog production like plains and woods (instead of more hills and resources with mines). Could it be that mines are overused by players and that the Maori (bonus from features) and Inca (bonus food from hills and production from mountains) will show another facet of CIV 6 to the player?
 
Talking more on substance, after reading your pdf report I would have the following questions (remarks):
1) The analysis does not account for the situation where a city produces a Settler. It is partly accounted for by production, but only partly. With the current structure, a city which produced, say, 2 Settlers would get a low output score;
2) Not all yields are equal. The most important yield is Production (not Food, and not Gold). Next come Science and Culture. I would suggest redistributing the weights accordingly.
 
1) The analysis does not account for the situation where a city produces a Settler. It is partly accounted for by production, but only partly. With the current structure, a city which produced, say, 2 Settlers would get a low output score;
2) Not all yields are equal. The most important yield is Production (not Food, and not Gold). Next come Science and Culture. I would suggest redistributing the weights accordingly.

Completely agree on both points.

The 1st one is tricky, and I found myself purposely NOT building settlers. I have no idea how to account for it, though. I also considered the 2nd topic, hence leaving Faith out completely :)

My questions back are: how do you account for the "penalty" when a city builds a settler? Likewise, how do I determine the ratio between the different yield contributions objectively?
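
For what it's worth, making the ratio an explicit parameter at least lets different weightings be compared on the same data. The sketch below is only that: the weight values are placeholders I made up to mirror the ordering suggested above (Production first, then Science and Culture), not numbers derived from anything.

```python
from typing import Dict

# Placeholder weights, NOT measured values. Faith at 0.0 mirrors leaving it out.
ILLUSTRATIVE_WEIGHTS: Dict[str, float] = {
    "production": 2.0,
    "science": 1.5,
    "culture": 1.5,
    "food": 1.0,
    "gold": 1.0,
    "faith": 0.0,
}


def weighted_score(totals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of a city's total yields; unknown yields count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in totals.items())


# Hypothetical 50-turn totals for one city.
totals = {"food": 100.0, "production": 80.0, "faith": 40.0}
score = weighted_score(totals, ILLUSTRATIVE_WEIGHTS)
```

Setting every weight to 1.0 (and Faith to 0.0) reproduces the plain unweighted sum, so the current scoring is one particular choice of ratio.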

I also politely ignored the following in pursuit of building a working model:
  • Leader bonuses
  • Build order
  • Impact of the city centre tile vs. adjacent ones vs. those in ring 2 and even ring 3
  • Impact of specific luxuries
At the moment I have terrain and features lumped together. It would be interesting to see what impact Rainforest or Woods have.
 
Probably the best way to account for Settlers would be to remove them from test runs altogether (as you have already done, but also ensuring the AI does not build them).
 
So it seems the algorithm puts a focus on food and on tiles with natural Cog production like plains and woods (instead of more hills and resources with mines). Could it be that mines are overused by players and that the Maori (bonus from features) and Inca (bonus food from hills and production from mountains) will show another facet of CIV 6 to the player?

Interesting thoughts, thank you for sharing them.

I would love to repeat the experiment with more data. At the moment the model is simplistic as the number of features (in ML terms) explodes very quickly when you try to take more into consideration.

I need to look at Autoplay, as suggested earlier in the thread. AND, also getting the data out of game is difficult at the moment. I found that writing JSON (or even simple strings) to the lua.log file breaks quickly. I wasn't able to store the data in the game save file either. If someone with better modding skills can solve these for me, we can repeat the exercise.
 
As a machine learning researcher, I seriously doubt what your system has learnt. From the description on GitHub, I guess it just learned how AIs settle their cities.

I used to think AIs don't move their capital; it seems I was wrong. But this doesn't change my conclusion here: your approach just learned how AIs settle their capitals.
 
AND, also getting the data out of game is difficult at the moment.
If you are talking here about getting yields data, it is super easy. Just loop over the cities at the end of each turn and simply print each yield into Lua.log in a CSV-compatible format... Or maybe I don't quite understand your needs here.
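
On the reading side, pulling such CSV lines back out of the log could look like the sketch below. The marker tag, the column order, and the idea that each printed line may carry some prefix before the payload are all assumptions for illustration; the real format depends entirely on what the logger mod prints.

```python
import csv
from typing import Dict, Iterable, List

MARKER = "CITYYIELD,"  # hypothetical tag the logger mod would print
FIELDS = ["turn", "city", "food", "production", "gold", "science", "culture"]


def parse_yield_lines(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Extract tagged CSV rows from log lines, skipping anything else."""
    rows = []
    for line in lines:
        pos = line.find(MARKER)
        if pos == -1:
            continue  # unrelated log output
        payload = line[pos + len(MARKER):].strip()
        values = next(csv.reader([payload]))
        if len(values) == len(FIELDS):  # drop truncated or malformed rows
            rows.append(dict(zip(FIELDS, values)))
    return rows


# Hypothetical log excerpt.
log = [
    "Some other log line",
    "MyLogger: CITYYIELD,12,Rome,3,2,5,1,1",
    "MyLogger: CITYYIELD,broken",
]
rows = parse_yield_lines(log)
```

Tagging each line and checking the column count means a row cut off mid-write is dropped rather than silently corrupting the dataset.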
 
As a machine learning researcher, I seriously doubt what your system has learnt. From the description on GitHub, I guess it just learned how AIs settle their cities, or maybe even just how the map generator arranges starting locations, since the AI never moves its capital and just settles in place.
The AI does move its starting settler; I even had to code something to force each civ's capital placement on the starting position for a TSL map.
 
The link to the pdf is 404 so I'm not entirely sure of your approach here - looks like you run a few games to gather the training vectors. Problem is, aren't you simply modeling the expert system they use to settle cities? Reverse engineering it as it were. It's also Civ dependent obviously, as they have hand coded criteria for each Civ's opening settlement.

If that's correct, a different approach would be to create a GAN which runs, for each civ, to determine optimal placement. It might actually be easier to do this in a toy game (not the actual Civ), encoded with the yields, then run against itself to generate the model.

Or is your mod doing the placing (sorry I don't have time to read through your code)? That's fine, but then the problem is each Civ has differing needs that aren't apparent early in the game (districts, mountain placement, etc).
 
As a machine learning researcher, I seriously doubt what your system has learnt. From the description on GitHub, I guess it just learned how AIs settle their cities, or maybe even just how the map generator arranges starting locations, since the AI never moves its capital and just settles in place.

Please expand a little more on your reasoning.

I was surprised to see that the AI does in fact, and relatively frequently, move before settling. Sometimes it even moves twice before doing so. If you look in the raw data you can find many examples where cities are settled on the 2nd or 3rd turn of the game.

Also, even if the model merely acts as a filter highlighting favourable tiles in comparison to the cities I determined as good, it is still worthwhile knowing, is it not?
 
The link to the pdf is 404 so I'm not entirely sure of your approach here - looks like you run a few games to gather the training vectors. Problem is, aren't you simply modeling the expert system they use to settle cities? Reverse engineering it as it were. It's also Civ dependent obviously, as they have hand coded criteria for each Civ's opening settlement.

If that's correct, a different approach would be to create a GAN which runs, for each civ, to determine optimal placement. It might actually be easier to do this in a toy game (not the actual Civ), encoded with the yields, then run against itself to generate the model.

Or is your mod doing the placing (sorry I don't have time to read through your code)? That's fine, but then the problem is each Civ has differing needs that aren't apparent early in the game (districts, mountain placement, etc).

The file is definitely there. Is anyone else experiencing the same problem?

I agree that running a GAN is likely to be more accurate. Adding in extra information related to leaders and civilizations would improve the model as well. As would capturing build order, and probably a few things I excluded or didn't think of. These approaches increase complexity, and for my purposes I wanted to reduce complexity.

The experiment is to see what the Machine Learning algorithm deems important given the limited information available to it :)

The mod is merely a logger; it is as dumb as dumb can be.
 
Well that could help to decide the age old question whether we should emphasize food or production to gain most productivity fastest. Didn't @Victoria do something along these lines?
 
Well that could help to decide the age old question whether we should emphasize food or production to gain most productivity fastest. Didn't @Victoria do something along these lines?

What I found interesting simply from a numerical perspective is the cumulative yields and the variability and shape of the distributions. Check out the Food yields graph for example.
 
Well that could help to decide the age old question whether we should emphasize food or production to gain most productivity fastest. Didn't @Victoria do something along these lines?
It was a long time ago and a little incorrect. The basic answer was that production without food, and vice versa, is not good; a happy balance is best. Mainly it comes down to growth being better until about 4 pop, to a degree. The strongest tile wins, but initially a strong production tile like 1-4 will likely drag you down, and with loyalty now important it certainly tips towards food initially... but it is situational, and I really appreciate that feel in Civ 6. Naturally, amenities and housing limits change the value of food later, and the choice is about the value of many districts.
 