Game AI & net-based machine learning

Yes, you can. With reinforcement learning, even data isn't a big problem as long as you can define what winning looks like.

But the more important point is that most people do not actually want a strong AI. Many players get their sense of achievement from "I can beat the Deity AI despite all its bonuses." A machine-learning-based AI would likely beat them even without any bonuses. Even if weaker players (here I mean those who can't pull off a T150 science victory in Civ6) were given handicaps, they would still lose most of their games and be dissatisfied with the experience.


If people feel dissatisfied with the AI they'll no longer buy the game or its expansions, so there's no benefit for the company in developing such an AI. That's why the current AI is so weak, and why the developers keep trying to make the AI even weaker rather than stronger.
 
You're stuck working with incomplete information, and "possible moves" and "viable moves" have a large gulf between them in both games.

I don't quite understand what you mean by "possible moves" and "viable moves".

I am no mathematician nor a competent Go player so I can't explain the specifics, but I can tell you that in Civ I rarely come across single-turn decisions that swing the entire board between victory and defeat.

And the exponential nature of possibilities in Go alone eclipses the growth and expansion nature of Civ games that force you into specific moves. In Go you can choose to lay your stone wherever you want and the possibilities of growing/hurting your foe are endless. In Civ it just comes down to managing your resources and relying on random chance.
 
The sheer scale of possibilities. There are 361 spaces on the Go board so every time a stone is put down the possibilities go like this:

361x360x359x358x357...

do this 200 times (until you come close to x160)

it is not even mathematically comparable to Civ games where you have a limited set of moves and it comes down to managing what you have to maximize your output
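
For a sense of scale, that product can be computed directly. A quick sketch in Python (the 200 placements are an assumed game length, and this deliberately ignores captures, illegal moves and board symmetry):

```python
from math import prod

# 361 * 360 * 359 * ... for 200 consecutive stone placements,
# ignoring captures, illegal moves and symmetry.
placements = 200
sequences = prod(range(361, 361 - placements, -1))

print(f"roughly 10^{len(str(sequences)) - 1} possible placement sequences")
```

It is only an upper bound, but it shows how quickly the product grows.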
And what would be the number for a Civilization game?
 
The sheer scale of possibilities. There are 361 spaces on the Go board so every time a stone is put down the possibilities go like this:

361x360x359x358x357...

do this 200 times (until you come close to x160)

it is not even mathematically comparable to Civ games where you have a limited set of moves and it comes down to managing what you have to maximize your output

There are way more than 361 spaces on a Civ board. It's also more than a two-player game. You make more than one move per turn. Actions involve more than just moving.
 
It's dangerous to make assumptions. In a 1v1 it might use demographic data and warrior-choke you to death or something if you underbuild military. Since it won't need much time for demographic analysis and can learn the implications of that data given enough iterations, it could pattern-react with near map-hack levels of efficiency.

I'm not convinced a sword rush always wins in such a scenario. AlphaZero makes moves that even professionals shy away from; I've seen several of its chess games against the Stockfish engine.

The point here isn't that a particular outcome is certain. The point is that one outcome will show up consistently.

Civ VI is no Go - it is not a well-pieced-together game, the systems and functions do not delicately interweave, and because of these imbalances there is definitely a single optimal way to play, whether or not we are aware of it yet. It is closer to a farming simulator than it is to chess. If there were some deeper level of trade-off in the game there might be some interesting outcomes from an AI experiment, but even if our base assumptions about what is optimal in the meta from a human point of view are proved wrong, it will only ever go for one victory, one way, with very minor variations. That will add nothing of value to the gaming experience except exposing the weakness at the core of the game IMO.

To go into the realm of horrifically life-threatening assumptions... maybe there would be slight modifications based on start location, but I'm 99% certain the AI would realise production is key, focus on developing it, and speed towards the quickest victory condition that production aids, which on standard maps would be domination.

There are way more than 361 spaces on a Civ board. It's also more than a two-player game. You make more than one move per turn. Actions involve more than just moving.

But in Go you can place a piece anywhere on the board, anytime. Think of it less as board space, and more like, "there are 361 different actions you can take on turn 1"
 
And what would be the number for a Civilization game?

I'd say multiplying a number close to 5 over and over for the first 100 turns, then multiplying by 20 over and over for the rest? You don't have an infinite number of choices; you have to pull off specific moves to get bigger and stronger. Nowhere near as non-linear as Go.

The 361x360... example I gave for Go is an exaggeration, but there really are hundreds of ways to respond to each and every move. Given that shorter matches can end in 100 turns and longer ones in 300, that's a lot of multiplication.
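
A rough back-of-the-envelope comparison of the two estimates in Python; the branching factors (5 and 20 for Civ, the 361-down-to-162 product for Go) are the guesses from this discussion, not measured values:

```python
import math

# Civ estimate from above: ~5 choices per turn for 100 turns, ~20 for the next 200.
civ_log10 = 100 * math.log10(5) + 200 * math.log10(20)

# Go estimate from above: 361 * 360 * ... for 200 stone placements.
go_log10 = sum(math.log10(361 - i) for i in range(200))

print(f"Civ estimate: ~10^{civ_log10:.0f} move sequences")
print(f"Go estimate:  ~10^{go_log10:.0f} move sequences")
```

Both numbers come out astronomically large either way; the disagreement is only about which astronomical number is bigger.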

There are way more than 361 spaces on a Civ board. It's also more than a two-player game. You make more than one move per turn. Actions involve more than just moving.

It just comes down to maximizing your output with the resources you have; the rest is negligible. There are usually no more than 10 or so different ways to play a map optimally. It's just about where to place your districts, farms and cities to get better techs, civics and units. On the other hand, the non-linearity in Go is unmatched. In Civ you don't have situations where you can respond to an enemy move in 200 different ways; in Go that's what happens every turn. What makes Civ so complex is random chance, that is the map, goody hut rewards and battle results. But you're still restricted to your yields and location, not laying your stone wherever you want on the board.

I'm no mathematician, but it only takes a high-school level of understanding to realize that of two exponential functions, the one with the greater base (ways to change the game with impact) will also have a far greater y-value (complexity) at the same x-value (turns passed).
 
The sheer scale of possibilities. There are 361 spaces on the Go board so every time a stone is put down the possibilities go like this:

361x360x359x358x357...

do this 200 times (until you come close to x160)

it is not even mathematically comparable to Civ games where you have a limited set of moves and it comes down to managing what you have to maximize your output


A Go board may have 361 spaces, but a duel-sized Civ map has 1,144 hexes, so I don't really see your point.

Furthermore, a Go AI only has to calculate the value of placing a piece on the remaining spaces. It only has to calculate one move per turn. In contrast, think of how many choices an AI player has to make in one turn midway through a Civ game. If the AI has fifteen military units it has to weigh the value of a combination of moves for every single one. If you account for actions like “fortify” and “pillage tile”, a single knight might have a hundred potential moves. The AI has to choose from a couple dozen production options in every city and about half a dozen techs and civics. The AI has to decide what improvements to build and where, what tiles to work, and so on. There is an absurd number of combinations of moves and the AI has to, somehow, know the value of each one. This is simple for a human brain, but even the most advanced neural networks aren't there yet.
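
A crude illustration of how those per-turn choices multiply, in Python. The counts are the rough figures from the paragraph above plus a couple of invented ones (e.g. the number of cities), so treat the result as an order-of-magnitude guess only:

```python
# Rough, illustrative per-turn option counts (not measured from the game).
units = 15
moves_per_unit = 100        # moves, attacks, fortify, pillage, etc.
cities = 8                  # assumed empire size mid-game
productions_per_city = 24   # units, buildings, districts, projects
techs = 6
civics = 6

turn_plans = (moves_per_unit ** units) * (productions_per_city ** cities) * techs * civics
print(f"roughly 10^{len(str(turn_plans)) - 1} distinct ways to play a single turn")
```

Even with conservative numbers, a single Civ turn branches far more than a single Go move; the real question is how many of those branches are ever worth considering.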

AlphaGo also has to play itself hundreds or thousands of times in order to get good; in a game like Civ VI I think that could prove difficult.

I also think that a neural network would struggle badly with randomly generated maps.
 
As SallowSeas points out, the number of legal and situationally relevant moves in Go is far less than the entire board, and you only place 1 stone per turn. In any Civ game, once you get a few turns into the game, you and the AI have many dozens of decisions to make every turn. Each unit may have the option to move to anywhere from 6 to 18 tiles (rings 1 and 2 around the unit's current location) and many more tiles for units (like horsemen, cav, etc.) with extra MP (taking into account mountains, occupied tiles, and enemy unit ZOC), or decide to stay in place (move 0 tiles), or decide to attack another unit, and if multiple units can attack the same unit, in what order to attack the unit. You can choose to reassign what tiles citizens work, or change what each city is producing, or what tech and civics you are researching. You can choose to (or not to) make trades with other civs, or engage in diplomatic actions with those civs, or send envoys (or not) to CS, etc. etc. etc. On a given turn, a player has the option of making 50, 100 or more decisions. So, I would submit that Civ beats Go in the raw number of decisions a player has to make on any given turn.

However, where Go beats Civ, hands down, is the importance of the one decision a Go player has to make on each turn, as compared to the dozens of decisions a Civ player makes -- the Go decision is irrevocable and always momentous. You can't move a stone once it's placed, so you have to think, think, think about every stone placement. Civ is, in contrast, extremely forgiving -- no particular move is all that momentous, most moves can be undone a turn or so later (and I'm not talking about reloads -- if you select the wrong tech to research, you can change it next turn; if you pick the wrong policy card, you'll get another option when the next civic is researched, or you can just drop gold to change policy cards more quickly, etc.).

Frankly, they are profoundly different games.
 
None of this is an issue for an AI as long as there’s adequate space for it to learn.

Play through a million times and it'll already have spotted patterns showing there are clearly better districts, policies, research choices etc. to pick at certain times.

Play through 2 million times, it’s refined it more.

It’s not a question of if, it’s a question of how long. And more importantly, to what benefit?
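
A toy sketch of that kind of outcome-driven learning in Python. The "game" here is a stand-in with a hidden best option for each of ten decision slots, nothing Civ-specific; the point is just that repeated play plus a score at the end is enough to surface the better choices:

```python
import random

SLOTS, OPTIONS, GAMES = 10, 5, 100_000
hidden_best = [random.randrange(OPTIONS) for _ in range(SLOTS)]   # stand-in for "the meta"

# Preference table: how strongly the agent favours each option in each slot.
prefs = [[1.0] * OPTIONS for _ in range(SLOTS)]

def pick(slot):
    # Sample an option in proportion to its learned preference.
    return random.choices(range(OPTIONS), weights=prefs[slot])[0]

for _ in range(GAMES):
    choices = [pick(s) for s in range(SLOTS)]
    score = sum(c == hidden_best[s] for s, c in enumerate(choices))  # crude end-of-game score
    for s, c in enumerate(choices):
        prefs[s][c] += score   # reinforce every choice made in a higher-scoring game

learned = [max(range(OPTIONS), key=lambda o: prefs[s][o]) for s in range(SLOTS)]
matches = sum(l == h for l, h in zip(learned, hidden_best))
print(f"top preference matches the hidden best in {matches}/{SLOTS} slots")
```

The real obstacles are the ones raised elsewhere in the thread: random maps, hidden information, and the cost of simulating millions of full games.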
 
As SallowSeas points out, the number of legal and situationally relevant moves in Go is far less than the entire board, and you only place 1 stone per turn. In any Civ game, once you get a few turns into the game, you and the AI have many dozens of decisions to make every turn. Each unit may have the option to move to anywhere from 6 to 18 tiles (rings 1 and 2 around the unit's current location) and many more tiles for units (like horsemen, cav, etc.) with extra MP (taking into account mountains, occupied tiles, and enemy unit ZOC), or decide to stay in place (move 0 tiles), or decide to attack another unit, and if multiple units can attack the same unit, in what order to attack the unit. You can choose to reassign what tiles citizens work, or change what each city is producing, or what tech and civics you are researching. You can choose to (or not to) make trades with other civs, or engage in diplomatic actions with those civs, or send envoys (or not) to CS, etc. etc. etc. On a given turn, a player has the option of making 50, 100 or more decisions. So, I would submit that Civ beats Go in the raw number of decisions a player has to make on any given turn.

However, where Go beats Civ, hands down, is the importance of the one decision a Go player has to make on each turn, as compared to the dozens of decisions a Civ player makes -- the Go decision is irrevocable and always momentous. You can't move a stone once it's placed, so you have to think, think, think about every stone placement. Civ is, in contrast, extremely forgiving -- no particular move is all that momentous, most moves can be undone a turn or so later (and I'm not talking about reloads -- if you select the wrong tech to research, you can change it next turn; if you pick the wrong policy card, you'll get another option when the next civic is researched, or you can just drop gold to change policy cards more quickly, etc.).

Frankly, they are profoundly different games.
With any one decision being that important in one case and not so important in the other, wouldn't that make it more difficult for the AI to learn to play Civ?

None of this is an issue for an AI as long as there’s adequate space for it to learn.

Play through a million times and it'll already have spotted patterns showing there are clearly better districts, policies, research choices etc. to pick at certain times.

Play through 2 million times, it’s refined it more.

It’s not a question of if, it’s a question of how long. And more importantly, to what benefit?
IMO the benefit would be to point out the exploits and imbalances that this kind of AI would use to win, because all of them would have to be fixed before I would agree to play against it.

And even then, I would not play long if it doesn't know how to role-play.
 
there's no point comparing a simple perfect-information 1v1 game with 2 or 3 rules to something like civ

as for machine learning, it's not a good fit for these types of games that put you in new situations so that you have to actually think about what moves to make instead of memorizing patterns from prior observed games

maybe it can eventually see some success with some rote build-order game like starcraft, but even if civ became the most played game in the world, there wouldn't be enough data to do much because it turns out that watching a bunch of mediocre players play doesn't really help that much

eventually i will write a real 4X game AI, but i highly doubt it will be as dominant (relative to human skill) as the AIs for trivial games like chess/go

a different topic would be making a mediocre AI by observing data from a bunch of mediocre players, but there are an infinite number of ways to make a mediocre AI so who really cares. it would be a neat technical achievement, but ultimately not useful for any serious strategy gamer
 
Having studied machine learning and mathematics, I've had this idea come to mind as well. However, a learning AI will probably never be feasible for strategy games like Civ due to (i) the dynamism, (ii) the stochasticity, and (iii) the incomplete information of the game.

I'll try to tackle the easiest of these, and explain how the dynamism blows up the number of alternative actions. Consider the first turn of the game. You can
a) Move your warrior to 6-18 different hexes, and you can move the warrior in one or two clicks, resulting in a maximum of about 6+12+6+12=36 different moves.
b) Move your settler to one of the reachable (max 36) hexes. You can also settle the city on a maximum of 7 hexes. If you settle, you can choose one of 4 technologies and around 10 build selections. Thus the 7 settling options result in about 7*4*10=280 different actions, so the actions resulting from your settler decision amount to about 316 different alternatives.
Let's say you only consider what the two units do, not the sequence of the actions. Then you can simply count the number of possible ways to combine any action of the settler with any action of the warrior, which is 316*36=11,376 alternatives.

Now this is where things get difficult. Any "two-click" actions can also be interleaved (for example, move warrior 1 hex, then move settler 1 hex, then settle and select the technology, then move the warrior one more hex, then select the building). I'm not 100% sure how to calculate this, but I'll try to make an educated guess. Let's first count the easy ones, the "one-click" actions, of which there are at most 12 for both units, thus a total of 12*12=144, and these can be made starting with either the warrior or the settler: 2*12*12=288. Let's then tackle the "two-click" actions. There are 2*2*2=8 different sequences if only the units are considered ([warrior, warrior, settler, settler], [warrior, settler, warrior, settler], etc.). The first movement can be made to 6 different hexes, and the second to 3 hexes after that, +1 if the turn is ended in the first hex. So there are 8*6*6*4*4=4,608 pure moving sequences. The second movement of the settler can instead be settling the city. After settling the city, there will be selections of the technology and the building. I'll make a simplifying assumption here that after the city is founded, the technology and building are immediately selected, so in addition to the 4 moving alternatives, the second settler action can be one of 2*4*10=80 ways of settling and selecting tech and build. This results in 8*6*6*4*84+288=97,056 different alternative sequences. Because the warrior can actually be moved before and between the tech and build selections, the actual number of alternatives is definitely over 100,000.

So on the first turn, with 2 units, you can have well over 100,000 formally possible sequences of actions. When human players play the game, 99.9% of these possibilities are unconsciously ruled out as "bad" decisions. But a pure learning AI would have to test these alternatives to find out that those 99.9% are bad.

Edit: I guess there are 5 techs (I forgot about Sailing), and the number of first buildings is 6, so the numbers I used aren't all spot on. Then again, I skipped some possible actions like selecting the worked tile after settling, so the true maximum number of possible sequences is definitely over 100,000.
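
The arithmetic above, spelled out in Python; all the per-action counts are the rough upper bounds from this post, not exact numbers from the game:

```python
# Rough counts from the post above (approximate upper bounds).
warrior_moves = 36          # one- or two-click warrior moves: 6 + 12 + 6 + 12
settler_moves = 36          # same movement options assumed for the settler
settle_spots = 7            # hexes where the settler could found the city
techs, builds = 4, 10       # first research choice and first production choice

settler_actions = settler_moves + settle_spots * techs * builds   # 36 + 280 = 316
unordered = settler_actions * warrior_moves                       # "what", ignoring order

one_click_sequences = 2 * 12 * 12                                 # either unit moves first
two_click_sequences = 8 * 6 * 6 * 4 * (4 + 2 * techs * builds)    # interleaved two-click actions

print(unordered)                                   # 11376
print(one_click_sequences + two_click_sequences)   # 97056
```

None of these counts are exact, but the order of magnitude is what matters: even turn one offers on the order of 10^5 formally distinct action sequences.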
 
Hey, I've signed up on the site because of this thread. I find this stuff fascinating, but agree with the last 2 posts in that developing a general AI for games like Civ using machine learning isn't presently achievable. But, do you think it'd be possible to train an AI only for combat scenarios, and have the game use its deep learning derived algorithms only when it comes to warfare? I'm talking just moving units, attacking and defending, coming up with tactics to complete military objectives that a general, normal, preprogrammed AI sets.

For example, the general AI sets the objective "I don't want to lose X city" and hands control of its military (or part of it) to the deep learning AI that's trained for warfare, in order to defend the city.

The military AI doesn't need to be perfect, just vastly more competent and adaptable in warfare scenarios than the current game AI is. In this split AI system, grand-scale strategic errors that leave some parts of a civilization vulnerable would be the general AI's fault for pointing its military to a specific zone of the map. This general AI could also handle the allocation of resources for each combat scenario it faces simultaneously, like being attacked by civs on opposing sides of its territory. The deep learning military AI would just "make do" with what's available to it, including maybe withdrawing troops when it's overwhelmed and can't complete the task it's been given by the general AI. That would prompt the general AI to reassign the military AI to another task (like defending the closest city). A system like this might preserve the organic feeling of Civ leaders and the bias they show towards specific paths, despite those paths working against their chances of winning, but save us from the stupid behaviour of enemy units when it comes to fighting.

I'm not exactly an expert on AI or deep learning, but I'd guess the limited scope of moves and situations might reduce the complexity of the calculations enough for it to work in the near future. It would be a bit like playing chess with civ units, taking terrain modifiers and nearby reinforcements into account.
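
A very rough sketch of how such a split might look, in Python. Every name here (the toy world, the rules, the placeholder tactical policy) is invented for illustration; it is not based on any actual Civ code or modding API:

```python
from dataclasses import dataclass

# Toy "world": city name -> (threat level, garrison strength).
world = {"Kyoto": (3, 1), "Osaka": (0, 2), "Edo": (5, 2)}

@dataclass
class Objective:
    kind: str
    city: str
    units: int = 0

def strategic_ai(world, reserve=6):
    """Scripted layer: decides WHAT to do and allocates units to each goal."""
    objectives = []
    for city, (threat, garrison) in sorted(world.items(), key=lambda kv: -kv[1][0]):
        if threat > garrison:
            send = min(reserve, threat - garrison)
            reserve -= send
            objectives.append(Objective("defend_city", city, send))
    return objectives

def tactical_ai(objective):
    """Stand-in for the learned layer: decides HOW to use the assigned units.
    A real version would query a trained policy; this is a placeholder rule."""
    if objective.units == 0:
        return f"{objective.city}: withdraw, no units were made available"
    return f"{objective.city}: defend with {objective.units} units"

for obj in strategic_ai(world):
    print(tactical_ai(obj))
```

The interesting property is the boundary: the learned part only ever sees a bounded, local problem ("defend this city with these units"), which is far easier to train than a policy for the whole game.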
 
It would be a bit like playing chess with civ units, taking terrain modifiers and nearby reinforcements into account.

I'd say this terrain consideration is THE issue in developing a decent AI, whether it is based on learning or on hard-coded decision rules. Learning AIs work best in situations where the map/game board is fixed, because then you can just run through the possible situations a billion times and step by step converge to an optimal strategy.

I guess some intermediate solution between a hard-coded and a learning AI could work. That is, instead of having an AI that uses, e.g., an evolutionary algorithm to test a large portion of the different possible action sequences, the programmed decision rules could be optimized by finding the best parameters for those rules via learning. Then again, I would be surprised if such optimization wasn't already part of the AI development of Civ.
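
A minimal sketch of that middle ground in Python: the decision rule itself stays hand-written, and only its weights are tuned by trial and error. The scoring function is a made-up stand-in for "play a game with these weights and measure the result":

```python
import random

def decision_rule(tile, weights):
    """Hand-coded rule: score a settle spot as a weighted sum of its features."""
    w_food, w_prod, w_hill = weights
    return w_food * tile["food"] + w_prod * tile["prod"] + w_hill * tile["hill"]

def evaluate(weights, trials=200):
    """Stand-in for playing games: how often the rule agrees with a hidden oracle."""
    rng = random.Random(0)          # fixed test maps so evaluations are comparable
    wins = 0
    for _ in range(trials):
        tiles = [{"food": rng.randint(0, 4), "prod": rng.randint(0, 4),
                  "hill": rng.randint(0, 1)} for _ in range(6)]
        oracle = max(tiles, key=lambda t: 2 * t["food"] + 3 * t["prod"] + t["hill"])
        pick = max(tiles, key=lambda t: decision_rule(t, weights))
        wins += pick is oracle
    return wins

# Simple hill climbing over the rule's weights.
best, best_score = [1.0, 1.0, 1.0], evaluate([1.0, 1.0, 1.0])
for _ in range(500):
    candidate = [w + random.gauss(0, 0.3) for w in best]
    score = evaluate(candidate)
    if score > best_score:
        best, best_score = candidate, score

print("tuned weights:", [round(w, 2) for w in best], "agreement:", best_score, "/ 200")
```

The AI's behaviour stays legible (it is still the hand-written rule), while the learning part only searches a tiny parameter space instead of the space of all possible games.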
 
But in Go you can place a piece anywhere on the board, anytime. Think of it less as board space, and more like, "there are 361 different actions you can take on turn 1"

Sure. Go is more complicated on move one. But as the game progresses, an AI needs to account for all the moves of every unit of every civ. It wouldn't take 500 turns for the game to become far, far more complicated than Go. You need to think about the number of possible unit placements, build decisions, city positions, etc.
 
Sure. Go is more complicated on move one. But as the game progresses, an AI needs to account for all the moves of every unit of every civ. It wouldn't take 500 turns for the game to become far, far more complicated than Go. You need to think about the number of possible unit placements, build decisions, city positions, etc.

Go is infinitely more reliant upon what your opponent does, though. Any AI in Civ would quickly recognise an optimal formula to apply to the game because it rarely has to react to an opponent's moves by comparison. The complication is removed by the fact that while there are more options, you would never EVER pick 99.99% of them, because they are clearly suboptimal.
 
Go is infinitely more reliant upon what your opponent does, though. Any AI in Civ would quickly recognise an optimal formula to apply to the game because it rarely has to react to an opponent's moves by comparison. The complication is removed by the fact that while there are more options, you would never EVER pick 99.99% of them, because they are clearly suboptimal.

The AI won't know that yet unless it's programmed. We're talking about deep learning here. If you program the AI to know some moves are suboptimal, we're back to square one.
 
The AI won't know that yet unless it's programmed. We're talking about deep learning here. If you program the AI to know some moves are suboptimal, we're back to square one.


Exactly, the AI wouldn't know not to delete its warrior on turn one or not to trade its cities away for luxuries; it would have to learn all of this. Perhaps it could learn this from studying data from human players, but the complexity of the game and the random maps would be crippling. Playing exclusively on a TSL Earth map with the same civs each time would probably help, but not much.
 
The AI won't know that yet unless it's programmed. We're talking about deep learning here. If you program the AI to know some moves are suboptimal, we're back to square one.

But my point is that that isn't the slightest issue. It can go through each possibility pretty much exactly once and rule out 99% based on the quality of the outcome it achieves at the end of the game.

Go cannot do that; all moves are inherently more valuable and dependent on external factors.

So yes, a Civ AI would require time and space, but it’s not complex, in the same way.
 