Diplomacy AI Development

fallen_addict · May 8, 2020

Recursive said:
Sure, I can do that. Not for next version, but it's on my to-do list.

(condensed to 'Lots and Lots and Lots and Lots of stuff')

I don't have a fixed timeline for when this will get done, but those are my plans for the time being.

Oh. My. Goodness!

That's incredible. Thank you!

Akbarthegreat · May 13, 2020

What factors does the AI currently consider when making defensive pacts? In my current game, I noticed that the Ottomans were signing pacts quite promiscuously, even with civs they did not like and civs they had been at war with previously. Whereas they refused to sign one with me despite us being friends for millennia :crazyeye:

Vhozite · May 13, 2020

So does the victory block aggression boost apply between AIs, or is that exclusive to the player?

Basically, will losing civs try to block a runaway civ the way they would a player?

Recursive · May 13, 2020

Akbarthegreat said:
What factors does the AI currently consider when making defensive pacts? In my current game, I noticed that the Ottomans were signing pacts quite promiscuously, even with civs they did not like and civs they had been at war with previously. Whereas they refused to sign one with me despite us being friends for millennia

A lot of different factors. I'll be looking over this function soon.

Vhozite said:
So does the victory block aggression boost apply between AIs, or is that exclusive to the player?

Basically, will losing civs try to block a runaway civ the way they would a player?

AIs usually treat AIs the same way they treat human players, with a few exceptions. Victory Block is not one of those exceptions.

tu_79 · May 14, 2020

A possible issue with the opinion modifiers has been brought up in the main thread.

The feedback was that opinion penalties stack up to very high numbers. I suppose the problem is that once you've passed a certain threshold, you know there's no coming back to good relations, ever. And that is not how we humans behave. We can get on good terms with an old enemy when the circumstances call for it.

I can't speak for myself, since I haven't been testing this last version, so I'd rather ask you. How do these penalties add up in terms of AI behavior and how are past actions forgotten?

Kim Dong Un · May 14, 2020

Recursive said:
Sure, I can do that. Not for next version, but it's on my to-do list.

For anyone who's curious as to what I'm up to, I've completed small bugfixes for next version, and am now working on the tedious task of dissecting Firaxis's dumb, terrible, redundant and overcomplicated memory system for the diplomacy AI, by which I mean going through every single memory value and:
- Learning how everything interacts;
- Cleaning up old code and inefficient routines;
- Adding checks for extra stability;
- Preventing overflow of values (there was a bug recently where the # of times Influence with a City-State was lowered would overflow to a negative number);
- Reducing the amount of memory it takes up overall by eliminating unused or unnecessary values (for instance, rather than tracking if something happened AND the turn on which it happened, the turn is all that really needs to be kept track of - and several values are candidates for deletion);
- Fixing various bugs and issues;
- Then testing everything to make sure it doesn't crash and everything works properly.
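Two of the items in the list above (preventing overflow, and keeping only the turn instead of a flag plus a turn) can be illustrated with a minimal sketch. This is Python for readability and is not the mod's actual code; `saturating_add`, `EventMemory`, `NEVER` and the 127 limit are all hypothetical names and values chosen for the example.

```python
# Hypothetical sketch; none of these names come from the mod's source.
INT8_MAX = 127  # assumed storage limit, purely for illustration
NEVER = -1      # sentinel meaning "this event has not happened"


def saturating_add(counter: int, amount: int = 1, limit: int = INT8_MAX) -> int:
    """Increase a counter, clamping at `limit` instead of wrapping negative."""
    return min(counter + amount, limit)


class EventMemory:
    """Remember only the turn an event happened on; no separate boolean."""

    def __init__(self) -> None:
        self.turn = NEVER

    def record(self, current_turn: int) -> None:
        self.turn = current_turn

    def happened(self) -> bool:
        # The "did it happen" flag is derived from the turn, so it is
        # never stored on its own.
        return self.turn != NEVER

    def turns_since(self, current_turn: int) -> int:
        return current_turn - self.turn if self.happened() else NEVER
```

With this shape, a counter that has reached its limit simply stays there, and "has it happened?" is answered by checking whether a turn was ever recorded.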

Replacing the system with one that isn't stupid will come later (hopefully), but for now I'm going to optimize the existing one as much as I can.

When this is done, I'll follow it up with a general cleanup of the code and reorganizing it in a more sensible order (right now there are a lot of functions scattered haphazardly throughout the source code files), along with implementing better logging and debugging for certain functions, namely DoRelationshipPairing and GetBestApproachTowardsMajorCiv.

And when that is done, I'll be in a much better position to make further improvements, including:
- A few requested features, like an option to show number values for opinion modifiers but allow the AI to hide modifiers as well;
- Finally finishing that rework for backstabbing penalties and training the AI to use it intelligently;
- Fixing the code for promises (which has problems, like some penalties remaining forever - I'm also gonna try to add some kind of "promise tracker" to the UI screen at some point if I can, because keeping them in the opinion table takes up a lot of space);
- Better scaling of opinion modifiers based on game speed, having most of them gradually rather than suddenly decay, and additional customization options;
- Major rework for interaction logic with other players (the Firaxis logic is so overcomplicated and inefficiently programmed it hurts, with lots of redundancy and general dumbness); this will take a long time, but when it's done I think players will appreciate it a lot.

I don't have a fixed timeline for when this will get done, but those are my plans for the time being.

Wow, tremendous! Make sure to pace yourself dammit, you're a beast! When this all becomes refined, you will go down as an absolute legend. As it is, I've already reserved you a seat on Supreme Leader's High Council of Glory!

Recursive · May 14, 2020

tu_79 said:
A possible issue with the opinion modifiers has been brought up in the main thread.

The feedback was that opinion penalties stack up to very high numbers. I suppose the problem is that once you've passed a certain threshold, you know there's no coming back to good relations, ever. And that is not how we humans behave. We can get on good terms with an old enemy when the circumstances call for it.

I can't speak for myself, since I haven't been testing this last version, so I'd rather ask you. How do these penalties add up in terms of AI behavior and how are past actions forgotten?

The city-state competition penalties are bugged right now, as I mentioned in the main thread. This has been fixed for the upcoming version.

Some modifiers are permanent; some are condition-based (and never go away until the condition is cleared); others decay over time, but at different rates that don't scale well with the nature of the offense or the game speed; and some things can reset the timer. This is another problem related to Firaxis's terrible vanilla code and memory system, as I mentioned in the post above. Explaining every single modifier in full detail would take a long time, but I'm working on this. After the rework I'll update the diplomacy guide on the wiki with the new information.

My intention is to remove all this varying decay rate nonsense and just have it decay by percentage - for instance, if you're 25% of the way through the modifier's expiry timer, the modifier will only be 75% as potent. The timer will scale with game speed.
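As a rough illustration of the percentage decay described here, a minimal sketch in Python follows. It is not the mod's code: the function names and the `SPEED_PERCENT` table are assumptions made up for the example, and linear decay to zero at expiry is one reading of "25% through the timer means 75% as potent".

```python
# Assumed game-speed scaling percentages, for illustration only.
SPEED_PERCENT = {"quick": 67, "standard": 100, "epic": 150, "marathon": 300}


def scaled_duration(base_turns: int, speed: str) -> int:
    """Scale a modifier's expiry timer with game speed (hypothetical helper)."""
    return max(1, base_turns * SPEED_PERCENT[speed] // 100)


def decayed_value(base_value: float, turns_elapsed: int, duration: int) -> float:
    """Linear decay: the modifier loses potency in proportion to elapsed time."""
    if turns_elapsed >= duration:
        return 0.0  # timer expired, modifier is gone
    remaining = 1.0 - max(turns_elapsed, 0) / duration
    return base_value * remaining


# A -20 penalty on a 40-turn timer, 10 turns in (25% elapsed, 75% potent):
print(decayed_value(-20, 10, 40))  # -15.0
```

On a slower speed the same penalty would use a longer timer, e.g. `scaled_duration(40, "epic")`, so it reaches the same percentage later.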

Still thinking about how to handle demands, spying, etc. where the penalties are cumulative, though.

Kim Dong Un said:
Wow, tremendous! Make sure to pace yourself dammit, you're a beast! When this all becomes refined, you will go down as an absolute legend. As it is, I've already reserved you a seat on Supreme Leader's High Council of Glory!

Not to worry, I only work on this in my spare time - the amount of which varies, as life can be unpredictable. Sadly, I haven't had as much time lately as I would like.

Thank you for your concern, but word on the wind is that Supreme Leader should be more concerned about his own health. That may, however, just be imperialist propaganda. :lol:

Recursive · May 14, 2020

Recursive said:
Still thinking about how to handle demands, spying, etc. where the penalties are cumulative, though.

I'm thinking perhaps these could be handled sort of like China's UA - reducing by half every time the modifier timer expires, with new offenses resetting the timer. That way large penalties could be accumulated from repeated offenses, but they wouldn't disappear as agonizingly slowly as they do now (it tends to be -1 penalty every 50 turns or so).
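A minimal sketch of that halving idea, in Python and with made-up names (`CumulativePenalty`, `add_offense`, `end_turn` are hypothetical, not functions from the mod). Penalties are stored as positive magnitudes here so that integer halving eventually reaches zero.

```python
class CumulativePenalty:
    """Penalty that stacks with repeated offenses and halves on each expiry."""

    def __init__(self, duration: int) -> None:
        self.duration = duration  # turns until the next halving
        self.value = 0            # current penalty magnitude
        self.timer = 0            # turns since the last offense or halving

    def add_offense(self, amount: int) -> None:
        # A new offense stacks on top and resets the timer.
        self.value += amount
        self.timer = 0

    def end_turn(self) -> None:
        if self.value == 0:
            return
        self.timer += 1
        if self.timer >= self.duration:
            self.value //= 2  # halve; 1 // 2 == 0, so it eventually clears
            self.timer = 0


p = CumulativePenalty(duration=10)
p.add_offense(8)
for _ in range(10):
    p.end_turn()
print(p.value)  # 4
```

Repeated offenses keep the total high because each one resets the timer, while a player who stops offending sees the penalty drop 8, 4, 2, 1, 0 over five timer periods.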

Kim Dong Un · May 14, 2020

Recursive said:
Thank you for your concern, but word on the wind is that Supreme Leader should be more concerned about his own health. That may, however, just be imperialist propaganda.

Hah, the fools they are! To think, that the pristine condition of Supreme Leader could ever be compromised? This is beyond asinine! The Glorious Dong shall never shrivel up into obscurity; he will rise harder, faster, stronger, and thrust himself back into the faces of his devoted followers, so that they may once again bask in the overflowing essence of purity and righteousness while leaving nothing but the taste and remnants of victory and salvation coursing throughout their being! :mischief:

To be honest though, just between you and me, but I'm actually holed up in an underground bunker with all my essential needs for the next couple months: LSD infused toothpaste, 300 cases of Mountain Dew and a silo full of Ramen, alongside Richard Simmons' anniversary collection to help keep my trainers fit. There's also a loop of "Yoko Ono's Greatest Hits" that plays throughout the facility, to keep me sane, so I think I'll be okay. I even had a custom bidet installed that has Emeril Lagasse yell "BAM!" every time the water shoots out (I had to splurge on that one). Oh and of course, I brought my PC so I can continue to run simulations of my world domination through VP - we can't be using this pandemic as an excuse to hinder our vision and deny our rightful fate! But yeah, since I did assign you to my Supreme Glorious Council, that'd be cool if you could just send me a text (1-800-1AND-UNLY) when this hoax blows over...

tu_79 · May 15, 2020

Recursive said:
Still thinking about how to handle demands, spying, etc. where the penalties are cumulative, though.

We faced this problem when adding up unhappiness from needs. They needed to be capped, but the way they were, removing one point of crime just added one unhappiness at poverty.

Current addition is more sensitive, since it works on the remaining happy people. You know how it goes, a percentage of the remaining happy people may become unhappy.

Maybe remembering this could inspire you.
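To make the comparison concrete, here is a small sketch of the "percentage of what remains" stacking described above, written in Python. It is only an interpretation of the idea; the function name and the example fractions are invented for illustration and do not reflect the actual happiness formulas.

```python
def stack_with_diminishing_returns(fractions, cap):
    """Each source converts a fraction of whatever is still left under the cap."""
    remaining = float(cap)
    for fraction in fractions:
        remaining -= remaining * fraction
    return cap - remaining


# Two sources that would each take 50% on their own total 75, not 100:
print(stack_with_diminishing_returns([0.5, 0.5], 100))  # 75.0
```

Because every source works on what is left, the total approaches the cap without ever exceeding it, and removing one source does not simply hand its full amount to another.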

Recursive · May 15, 2020

tu_79 said:
We faced this problem when adding up unhappiness from needs. They needed to be capped, but the way they were, removing one point of crime just added one unhappiness at poverty.

Current addition is more sensitive, since it works on the remaining happy people. You know how it goes, a percentage of the remaining happy people may become unhappy.

Maybe remembering this could inspire you.

The problem I see with this kind of focusing approach is that most of these triggers (demands, spying, converting cities, stealing artifacts, etc.) are entirely within player control and independent of each other, and it's logically sound for the AI to compound being upset about them, especially if done repeatedly. I don't see it as a problem aside from the penalties lasting for way too long sometimes.

tu_79 · May 16, 2020

Recursive said:
The problem I see with this kind of focusing approach is that most of these triggers (demands, spying, converting cities, stealing artifacts, etc.) are entirely within player control and independent of each other, and it's logically sound for the AI to compound being upset about them, especially if done repeatedly. I don't see it as a problem aside from the penalties lasting for way too long sometimes.

Only that this is not how we humans think.

Let's talk about holding grudges. Some people just can't forgive. You are playing competitively with them, you beat them to a settling location, and they get so pissed off that you become their sworn enemy forever. Other people will negotiate with their most hated player if it is in their interest. Rational people are aware of past actions, but don't let that stop them from thinking clearly. For example, if I know that Tim has been a jerk to everyone, I know I can't trust him, but maybe I can trade with him and even help him when he is being attacked by Bob, who is a bigger threat to me than Tim. But if I am the sensitive type, I would let Tim rot even when that is to my detriment. How fast a player forgives is part of their personality.

The actions of a player reveal his nature as well as his current strategy. For example, Tim might be a jerk, but he is being a jerk to everyone, not just me. He is trying to claim every single CS for himself, every arch site for himself, but he is not really picking on just me. He is too aggressive, but he is not after me in particular. Meanwhile, Bob is purposely claiming tiles and CS that are on my side, and he is not sending any trade routes towards me. Tim is more annoying, but the real threat is Bob. Bob is a focused, tactical, surgical player and I am his next target.
If I can correctly categorize a player by his behaviour and his intentions, then I can make a much better decision.

Going back to modifiers. They should help the AI categorize the human's behaviour, his NATURE. A player's nature is not going to change, so this kind of information must not decay. A player's nature can be categorized by:
1. Generous > Trusty > Untrusty > Unwordy. A trusty player is one whose interactions with friends are always positive and who never breaks a vow. The untrusty one doesn't mind competing against his friends. The unwordy one breaks his promises. Facing a trusty player does not mean that this player is never going to betray us, but at least we know our chances are better. The generous one offers help to other civs (obviously for selfish reasons).
2. Risk-averse > Cautious > Bold. Is the player acting against the weaker or against the stronger opponents? Maybe he is allying with all of his neighbours against faraway civs he is never going to actually engage? Or is he just going after the closest target for convenience? Does it take him too long to make a decision?
3. Focused > Blocks > Wild. This is determined by how the player picks targets and builds alliances: whether he commits acts of aggression against a single target at a time, against a select group of civs, or indiscriminately.

Then we need to guess what the player's CURRENT STRATEGY is. He might be focused on infrastructure, or extending his soft power, maybe he is turtling or building a large navy. Who is he annoying? Cutting trade routes? Not trading? That gives his probable next targets. Or maybe he is acting erratically.
The modifiers that talk about the current strategy must decay. I would revise my strategy every 30 or so standard turns.

In this game we have no friends; we have only temporary alliances, trade and self-interested cooperation. Even being generous with another player is because we want this player to be able to stand against a bigger threat.

Recursive · May 16, 2020

tu_79 said:
Only that this is not how we humans think.

Let's talk about holding grudges. Some people just can't forgive. You are playing competitively with them, you beat them to a settling location, and they get so pissed off that you become their sworn enemy forever. Other people will negotiate with their most hated player if it is in their interest. Rational people are aware of past actions, but don't let that stop them from thinking clearly. For example, if I know that Tim has been a jerk to everyone, I know I can't trust him, but maybe I can trade with him and even help him when he is being attacked by Bob, who is a bigger threat to me than Tim. But if I am the sensitive type, I would let Tim rot even when that is to my detriment. How fast a player forgives is part of their personality.

The actions of a player reveal his nature as well as his current strategy. For example, Tim might be a jerk, but he is being a jerk to everyone, not just me. He is trying to claim every single CS for himself, every arch site for himself, but he is not really picking on just me. He is too aggressive, but he is not after me in particular. Meanwhile, Bob is purposely claiming tiles and CS that are on my side, and he is not sending any trade routes towards me. Tim is more annoying, but the real threat is Bob. Bob is a focused, tactical, surgical player and I am his next target.
If I can correctly categorize a player by his behaviour and his intentions, then I can make a much better decision.

Going back to modifiers. They should help the AI categorize the human's behaviour, his NATURE. A player's nature is not going to change, so this kind of information must not decay. A player's nature can be categorized by:
1. Generous > Trusty > Untrusty > Unwordy. A trusty player is one whose interactions with friends are always positive and who never breaks a vow. The untrusty one doesn't mind competing against his friends. The unwordy one breaks his promises. Facing a trusty player does not mean that this player is never going to betray us, but at least we know our chances are better. The generous one offers help to other civs (obviously for selfish reasons).
2. Risk-averse > Cautious > Bold. Is the player acting against the weaker or against the stronger opponents? Maybe he is allying with all of his neighbours against faraway civs he is never going to actually engage? Or is he just going after the closest target for convenience? Does it take him too long to make a decision?
3. Focused > Blocks > Wild. This is determined by how the player picks targets and builds alliances: whether he commits acts of aggression against a single target at a time, against a select group of civs, or indiscriminately.

Then we need to guess what the player's CURRENT STRATEGY is. He might be focused on infrastructure, or extending his soft power, maybe he is turtling or building a large navy. Who is he annoying? Cutting trade routes? Not trading? That gives his probable next targets. Or maybe he is acting erratically.
The modifiers that talk about the current strategy must decay. I would revise my strategy every 30 or so standard turns.

In this game we have no friends; we have only temporary alliances, trade and self-interested cooperation. Even being generous with another player is because we want this player to be able to stand against a bigger threat.

That's a lot easier said than done in terms of getting the AI to understand all this (especially while avoiding false positives), let alone make decisions based on it.

While this kind of higher-level thinking is not impossible, it's far beyond what the current AI is capable of. Furthermore, the AI currently only acts on information available to it - it can't tell whether other players are trading generously with each other, for example, because it only sees its own trade deals. It's easier to program an AI that plucks information out of the game engine than one that acts on estimates and guesses, but that feels unfun.

Not saying I'll never consider implementing something like this, but it's outside the scope of my improvements at the moment.

tu_79 · May 16, 2020

Recursive said:
That's a lot easier said than done in terms of getting the AI to understand all this (especially while avoiding false positives), let alone make decisions based on it.

While this kind of higher-level thinking is not impossible, it's far beyond what the current AI is capable of. Furthermore, the AI currently only acts on information available to it - it can't tell whether other players are trading generously with each other, for example, because it only sees its own trade deals. It's easier to program an AI that plucks information out of the game engine than one that acts on estimates and guesses, but that feels unfun.

Not saying I'll never consider implementing something like this, but it's outside the scope of my improvements at the moment.

I understand and agree. But I love to throw out ideas and discuss them. You never know when one of those might be of help.

Recursive · May 16, 2020

tu_79 said:
I understand and agree. But I love to throw out ideas and discuss them. You never know when one of those might be of help.

Not a problem! As long as you don't take it personally if those ideas are challenged. My only goal here is to create a great end product.

tu_79 · May 16, 2020

Recursive said:
Not a problem! As long as you don't take it personally if those ideas are challenged. My only goal here is to create a great end product.

Believe me, after all these years I don't give a damn if I am rejected.

Vhozite · May 18, 2020

So when the AI asks you to DoW another Civ with them, you can still get the “concerns about warmongering” negative modifier from the Civ that asked for your help.

Is this intended? It’s not a huge deal since you get the positive modifier for fighting a common foe, but I don’t think it makes any sense.

Recursive · May 18, 2020

Vhozite said:
So when the AI asks you to DoW another Civ with them, you can still get the “concerns about warmongering” negative modifier from the Civ that asked for your help.

Is this intended? It’s not a huge deal since you get the positive modifier for fighting a common foe, but I don’t think it makes any sense.

Yeah, this is a known issue. I'll fix it as part of my upcoming memory rework.

Recursive · Jun 6, 2020

@Kim Dong Un I forget where, but I remember a post by you where you said the AI (Rome) was slow on the uptake in realizing that you were going to destroy them after you'd conquered every other civ.

AI should be better at recognizing dangerous conquerors now, but as for the "slow on the uptake" part, I've just added some functionality whereby the diplo AI can update the approaches of specific players while ignoring the approach curve, allowing it to "reevaluate" specific players while continuing to process other players normally, or simply reevaluate all of them.

Once the next version is posted I'll start my major rework, and this will include fixes in this area. I think when a player kills another player or wins a war, all other civs should reevaluate that player immediately, and there are a few other circumstances where this would be appropriate as well.

Kim Dong Un · Jun 6, 2020

Recursive said:
@Kim Dong Un I forget where, but I remember a post by you where you said the AI (Rome) was slow on the uptake in realizing that you were going to destroy them after you'd conquered every other civ.

Last paragraph of this post: Vox Populi Diplomacy Feedback

Recursive said:
AI should be better at recognizing dangerous conquerors now, but as for the "slow on the uptake" part, I've just added some functionality whereby the diplo AI can update the approaches of specific players while ignoring the approach curve, allowing it to "reevaluate" specific players while continuing to process other players normally, or simply reevaluate all of them.

Once the next version is posted I'll start my major rework, and this will include fixes in this area. I think when a player kills another player or wins a war, all other civs should reevaluate that player immediately, and there are a few other circumstances where this would be appropriate as well.

Sounds great!
