AI Jumbled Rumble : another set of AI Survivor Alternate Histories, with a twist

Thrasybulos · Jul 31, 2023

As I've already said, Sullla's AI Survivor (btw, let me add mine to the thanks for running that) is a concept which mixes several aspects:
- It's an AI competition, aiming at ultimately providing a ranking of the AIs.
- It's a prediction contest.
- It's a Live Show.

The Alternate Histories for Sullla's AI Survivor series provide information aiming at showing the different ways a game could have gone, and at determining how predictable the outcome actually was.
In other words, they're about the prediction contest : they are replays of the exact same setups, leaving essentially unanswered the question of whether the results should be explained in terms of AI strength or in terms of the context of the game (starting position, neighbours).

That led me to the notion of providing another set of Alternate Histories, but this time with a focus on the first aspect: ranking the AIs.
In order to achieve that, I'll be replaying the maps... while shuffling around the AIs on the map.
As with the Alternate Histories, I'll be playing 20 iterations of each game (same map, same AIs), but with a different permutation of the starting positions for each game.
That should provide an idea of the relative strength of each leader in that game's subset, regardless of its starting position and specific set of neighbours.

It'll also fulfill a secondary objective: determining how balanced (or unbalanced) each map was, and thus how much that influenced the game outcome.

Instead of Sullla's Power Rating (5 pts for a win / 2 pts for a runner-up result / 1 pt per kill), I'll use an Elo rating to rank the AIs.
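For reference, the Power Rating quoted above boils down to a one-line formula. A minimal sketch, where the function name and arguments are hypothetical and only the 5 / 2 / 1 weights come from the post:

```python
def power_rating(wins: int, runner_ups: int, kills: int) -> int:
    """Sullla-style Power Rating: 5 pts per win, 2 per runner-up finish, 1 per kill."""
    return 5 * wins + 2 * runner_ups + kills


# Example: one win, one runner-up finish and three kills
print(power_rating(wins=1, runner_ups=1, kills=3))
```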

Spoiler Elo Implementation :

Tournament Format:

Season 1 will be based on Season 4 of AI Survivor.

I'll be following AI Survivor's format, with some changes:
- No change to the Opening Round.
- The Playoffs will have different participants (those who score best here, as opposed to those which made it in the live games).
- The Championship will have different participants and use a different map.
- The bigger change concerns the Wildcard game : for Season 1, it's replaced with a Wildcard League. The 36 AIs which don't make it to the playoffs are made into two pools of 18 AIs. For each pool, three 6-player games are played, and the best two from each game move on to a 6-player Pool Finals. The winner of that game gets a wildcard to the Playoffs.
The idea is to get more play-time for the weaker AIs, and to send two "Elo-bags" to the playoffs.

Tournament Rules:

I'll use the same settings and rules as AI Survivor, with two exceptions.
- The big one: No UN.
Diplomatic Victory is disabled.

Spoiler Rationale :

- Enforced peace when an AI is at war with an opponent reduced to a single city hidden behind another civ with closed borders.

Spoiler Rationale :

"Test Protocol":

- Each game is run from the worldbuilder save (for practical reasons, and to get new peaceweights each time).
- Permutations are performed by changing the team number for the AIs, not by moving their units around. That means turn order is tied to each starting position, not to each AI.
- No Great Spy infiltrations to unlock demographics. Ok, they do have an impact (I seem to observe far fewer instances of an AI completely tanking its early eco - guess being free of that 20% spending on Espionage helps ; conversely, those AIs which choose to spend on Espionage target their actual opponents), but probably nothing major. The main reason is that they're a hassle: since I'm running the game from the wb file, I would have to re-add them each time.
They serve two purposes:
- Contact with the AIs: done through the wb file instead.
- Enabling graphs: done through a simple change to CIV4EspionageMissionInfo.xml, assigning 0 to the cost of the see demographics mission. That permanently enables them as long as you have at least one EP spent vs a civ. So I just run the espionage slider for the first turn of the game, and I'm done.

For each game, I'll also provide an archive containing:
- An Excel file with the detailed game results (the macro is used for the Elo calculations).
- A second Excel file with graphs about the game.
- Minimap pictures of the game start and end.
- The worldbuilder files used for the 20 runs.
- The replay files for each run.

Thrasybulos · Jul 31, 2023

(wip)

Opening Round

Wildcard League

Playoffs

Thrasybulos · Jul 31, 2023

The first game saw a surprise win from Isabella and an even more surprising Roosevelt making it to the runner-up spot.
The AH were more in line with the community's expectations, with Cyrus as the top leader, and Cathy right behind.

I expected roughly the same here, but Qin's starting position was clearly atrocious, so I was curious as to how he'd fare.

Cyrus did come on top, but Cathy was outperformed by Qin.
I was at first at a loss to explain how Cathy could perform worse than in the Alternate Histories, especially since it turns out her starting position was one of the worst... but I believe Qin to be the explanation: when not stuck with that horrible start, he proved a tough competitor who could prevail over her.
Qin is a balanced leader, an economy-focussed leader who doesn't neglect his military. His performance was rarely impressive, but usually solid.
Cathy seems more erratic: she could be an unstoppable force, especially when starting from Cyrus's start (A), or face a very early elimination as in her first game here.

Elizabeth and Roosevelt were the marked AIs here, owing to their peaceweight, and to the fact that their peaceweight would-be ally, Isabella, would often run a different religion, and we know what that means.

On the whole, no AI proved dominant here.

...and the explanation may stem from the map imbalance. Cyrus's initial start (position A) accounts for half the wins there!
Whichever AI would start there would become a powerhouse unless it was severely dogpiled. Even Elizabeth and Roosevelt managed to win from that spot!

Isabella's initial position (D) and Elizabeth's (C) provided a fighting chance, but they were still far inferior. It should be noted, though, that Isabella actually performed better from her initial spot than from Cyrus', where her zealotry brought her into more trouble.
As expected, Qin's (E) was the very worst spot to start from. The Chinese leader was the only one to almost get a win from there, but instead set a record that I'm 100% sure won't be beaten: in a crazy game which ended with a Time victory, he was eliminated on turn 495!
Roosevelt's start (F) proved better than it looked: if not for winning, at least for survival.
Combine that with Isabella's preference for start D, and the live game's outcome, if not the most likely, at least starts making sense.

Cyrus and Qin move on to the playoffs, the rest will play in the Wildcard League.

need my speed · Jul 31, 2023

Thrasybulos said:
- Permutations are performed by changing the team number for the AIs, not by moving their units around. That means turn order is tied to each starting position, not to each AI.

Do all AIs start with the exact same units (Scouts/Warriors/Quechuas/Workers/Fast Workers)?

Thrasybulos said:
- No Great Spies.

Why? This one confuses me and seems super random.

It's also not entirely clear to me if/how you account for neighbours (Gandhi next to Montezuma or Gandhi next to Mansa Musa are very very different games for Gandhi).

Fippy · Jul 31, 2023

Roosy has not just tundra but also ice starting 3 tiles away from his capital :lol:

(first time i saw the original starting position)

Worst land (if we can even call it that) ever... Sure, he's not a strong AI leader in this setup anyway, but come on, would Sullla start fan favorites like HC or Mansa there?

Thrasybulos · Jul 31, 2023

@need my speed
If an AI is entitled to special units, I try and make sure it gets them (I did mess up a coupla times, but mostly got it right). For instance if Gandhi was on Team 2, and plays the next game as Team 4, I edit the worker for Team 4 to make it a Fast Worker, and demote Team 2's Fast Worker to a mere Worker.

By "No Great Spies" I mean that I don't infiltrate Observer Civ Great Spies in each AI Capital at the start of the game as Sullla does. Not that Great Spies as a unit are removed from the game.

(I've reworded it, hoping it's clearer now).

I can't ensure total fairness regarding positions played and neighbours. As I said, I'd need to play all permutations for that, and well... No.
I do keep track of the permutations that were played, so that data is available to qualify the results. I also try to have permutations as varied as possible (eg, no runs differing by a single swap).

@Fippy
Qin's was worse : completely boxed in, in jungle. Not helped either by the fact that every AI sent its second settler to the SE coastal spot. Dry Rice and jungle. Couldn't figure that one out. Why that spot??
And have a look at Mao's start in Playoff 2. :lol:

Keler · Jul 31, 2023

Thrasybulos said:
Ideally we'd use something like the highest in-game score they'd reached, but getting that information would require a python mod or a tool to extract it from the replay file... way too much effort.

My utopic ideal would be to give them points based on their score area, maybe except for winners (especially culture winners). Sadly it is not possible unless I roughly try to draw them by myself on autocad.

Your excel is very professional and detailed. So let me try to understand
Your first game resulted in:

Domination-Turn 323
Cyrus 7665 points
Isabella 4656 points
Elizabeth Turn 312
Qin Shi Huang Turn 252
Roosevelt Turn 237
Catherine Turn 136

Now, winner Cyrus gets clean 5,00 points.
Isabella got 4,00 points simply being the only survivor, 1 point off.

Thrasybulos said:
the turn score = 1 + 2 + 3 + ... + Turn of elimination = (T Elim)*(T Elim + 1)/2

I assume all eliminated leaders can get no more than 3,00 points because survivors win against eliminated so 1-0.
From there, how exactly do you calculate how many points they should get?
I mean in relation to how much Elizabeth should get compared to Catherine, 312/136 = 2,29 times more,
or 48,82/9,31 = 5,24 times more points Elizabeth should get than whatever Cathy gets. But here I see Elizabeth got 3,25 times more points than Catherine. I mean I could just have assumed game end date 323 as 3,00 clean points and given them whatever is proportional to that. I really needed something simpler than this.

Here I see
Elizabeth got 2,08 points. She died in Turn 312, so 312x313/2=48,82
Qin Shi Huang got 1,69 points. He died in Turn 252, 252x253/2= 31,87
Roosevelt got 1,59 points. Epsilon would make it 28,20
Catherine got 0,64 points. That gives her 9,31

Clearly there is more to work through: for the games with more than 2 survivors, the Elo adjustments are completely different. I will try to understand them too.

Saxo Grammaticus · Jul 31, 2023

Thrasybulos said:
I'll use the same settings and rules as AI Survivor, with two exceptions.
- The big one: No UN.
Diplomatic Victory is disabled.

I understand your choice in the context of these tests while also wondering what to do about Diplomatic Victory in the general sense of AI Survivor. I like having it in the game not least of all because it rewards good relations, can resolve otherwise intractable late-game messes, and elicits terrific odds from Fippy.

One possible solution with added victory micro (as we already see when checking for Domination, for instance) would be to check for votes whenever the UN meets. This would have the benefit of rewarding the AI for friendly relations even if they are unaware of how to leverage that into victory. If the AI is unable to navigate the UN in its own interests, then I see it as less of an issue to award Diplomatic Victory in this manner. After all, not all games result in diplomatic relations that are compatible with victory. An issue I have observed, however, is where AI vote for Diplomatic Victory with no obvious friendly attitude.

antimony · Jul 31, 2023

There is software that will generate a random latin square (e.g. 6 permutations where each leader occurs in each position once). The random part is important because some latin square generators will create squares with a relatively regular structure (e.g. players 1 and 2 would move in each iteration but always be neighbors in terms of starting position), which you want to avoid. It also allows you e.g. to generate 3 random latin squares for 18 iterations.

I understand the point about 6 vs. 7 player maps, but given that you're using ELO does it matter if it's 20 iterations each time? In fact, given that each iteration is treated as a mini round robin tournament for ELO purposes, then it's like if each AI is playing 6 "games" on a 7-player map vs. 5 in a 6-player map. Therefore running for example 24 iterations of the 6-player maps (120 ELO duels for each Civ) is close to 21 iterations of the 7-player maps (126 ELO duels for each Civ).
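The duel arithmetic above can be checked in a couple of lines. A minimal sketch, assuming (as described in the post) that each iteration counts as one "duel" against every other AI on the map; the function name is hypothetical:

```python
def elo_duels_per_ai(players: int, iterations: int) -> int:
    """Number of pairwise Elo 'duels' one AI plays over a series of games.

    Each game is treated as a mini round robin, so an AI faces
    (players - 1) opponents per iteration.
    """
    return (players - 1) * iterations


print(elo_duels_per_ai(players=6, iterations=24))  # 6-player map, 24 runs
print(elo_duels_per_ai(players=7, iterations=21))  # 7-player map, 21 runs
```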

Thrasybulos · Jul 31, 2023

@Keler

Providing an expanded view of the score was (still is, but fell way down the list) on my TODO list.

Let me try and detail it here.
Let's take as an example game 12, which has 3 eliminated and 3 survivors.
Roosevelt, winner.
Cyrus, 2nd, lives with 2649 pts.
Isabella, 3rd, lives with 2535 pts.
Cathy, 4th, eliminated on turn 379, "turn score" = 379 * 380 / 2 = 72,010
Qin, 5th, eliminated on turn 287, turn score = 41,328
Elizabeth, last, eliminated on turn 129, turn score = 8,256

Cyrus gets vs Isabella 2649 / (2649 + 2535) = 0.51
Isabella gets the remainder 2535 / (2649 + 2535) = 1 - Cyrus's score = 0.49

Qin's score vs Elizabeth is 41,328 / (41,328 + 8,256) = 0.83
Elizabeth's is 1 - 0.83 = 0.17.
And so on...

Which gives :

vs	Roosevelt	Cyrus	Isabella	Cathy	Qin	Elizabeth	Total
Roosevelt	XXX	1	1	1	1	1	5
Cyrus	0	XXX	0.51	1	1	1	3.51
Isabella	0	0.49	XXX	1	1	1	3.49
Cathy	0	0	0	XXX	0.64	0.90	1.54
Qin	0	0	0	0.36	XXX	0.83	1.19
Elizabeth	0	0	0	0.10	0.17	XXX	0.27

I only display the last, total value, in the file.

But that's how the detailed score is calculated.

Hope that clears it.
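The same calculation can be sketched in Python. This is only an illustration of the rules as described in this post (the winner scores 1 against everyone, a survivor scores 1 against anyone eliminated, two survivors split the point by in-game score, two eliminated AIs split it by turn score); the function names and data layout are hypothetical and not taken from the Excel macro. Note that T*(T+1)/2 for turn 129 comes out at 8,385 rather than the 8,256 quoted above, which corresponds to turn 128; the pairwise values rounded to two decimals are the same either way.

```python
from itertools import combinations


def turn_score(turn: int) -> int:
    """1 + 2 + ... + turn of elimination."""
    return turn * (turn + 1) // 2


def score_table(winner, survivors, eliminated):
    """Return table[a][b] = share of the point that a takes from b.

    survivors:  {name: final in-game score}
    eliminated: {name: turn of elimination}
    """
    # (tier, weight): a higher tier always beats a lower one,
    # the weight only matters between two AIs of the same tier.
    rank = {winner: (2, 1.0)}
    rank.update({n: (1, float(s)) for n, s in survivors.items()})
    rank.update({n: (0, float(turn_score(t))) for n, t in eliminated.items()})

    table = {name: {} for name in rank}
    for a, b in combinations(rank, 2):
        (tier_a, w_a), (tier_b, w_b) = rank[a], rank[b]
        if tier_a != tier_b:
            share_a = 1.0 if tier_a > tier_b else 0.0
        else:
            share_a = w_a / (w_a + w_b)
        table[a][b] = share_a
        table[b][a] = 1.0 - share_a
    return table


# Game 12 from the example above
game12 = score_table(
    winner="Roosevelt",
    survivors={"Cyrus": 2649, "Isabella": 2535},
    eliminated={"Cathy": 379, "Qin": 287, "Elizabeth": 129},
)
for name, row in game12.items():
    print(f"{name:10s} total = {sum(row.values()):.2f}")
```

The totals printed here are sums of the unrounded pairwise values, so they can differ by 0.01 from totals obtained by adding up the rounded cells.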

Thrasybulos · Jul 31, 2023

@Saxo Grammaticus

I'm not suggesting that Sullla should remove the UN. That's part of the "Live Show" aspect, providing many a facepalm moment.

But for my purpose here, it's random noise.
Interesting idea about a "forced diplo victory"... simplest way to implement it would probably be through a mod which forces the victory vote as the first resolution after each secretary general election. But Sullla's very mod-reluctant...

@antimony

Good suggestions... Since there are only 4 7-player games in the tournament format, I'm not sure I'd like to increase the game count to 24 for most games, but keeping the same idea I could on the contrary decrease it to 18 runs for 6-player games, and 15 runs for 7-player games (14 "fair" games where each AI gets to play in each position twice, then a "final" where those still in the race get attributed a "good" position). That would be exactly 90 Elo duels in each situation.
Food for thought... :thumbsup:

Thrasybulos · Aug 1, 2023

Game 2 was a Gandhi cultural win which also featured Willem at last emerging from the mediocrity pit he'd wallowed in so far, and a very early and humiliating Pacal defeat, courtesy of the Troll King himself.
The AH have revealed Willem to be the dominant AI on this map, with Louis a solid second, and Gandhi managing to pull in some wins.

For the shuffled replays, purely based on past performance and impressions, I expected Willem and Pacal to be the best performers here. But as Season 5 champion, could Mehmed also be a contender? And what of Louis (this was before this season's finale)? What would be Gandhi's win / FTD ratio? And could Victoria, when starting in a better spot, live up to her strengths?
So lots of questions...

... and no real surprise, in fact. Willem and Pacal did end up as the better AIs in this group.
It was also a group where peaceweight mattered: a standard 4 low peaceweights vs 3 high peaceweights where although the high peaceweights managed 5 wins, the field ended up dominated by the low peaceweights.

Gandhi confirmed his status as a feast or famine leader: he got 3 wins, and died early most of the rest of the time.
Victoria (a mild disappointment) and Wang Kon weren't able to achieve much.
Mehmed just couldn't compete with the much better techers out there.
Louis was generally solid, and a real beast when starting from his original position, winning all 3 such games.
Willem may have got lucky a few times by surviving when he should have been eliminated, but otherwise played the high-risk, high-reward game we've come to know him for.
Pacal... I remember thinking at the time: he's a strong leader, but a bad leader. Meaning his strong traits and preferences carry him along, but his usual decisions and gameplan tend to be anything but impressive.

When it appeared towards the end that Willem and Pacal would be in a tight race for the top spot, I "rigged" the last game so that both would start in a strong position.
And had that game been on the live stream, it would have been something! :wow:

In a nutshell: Willem did get to Rifling early, but then turned on the Culture slider... way too early. It seemed he'd thrown the game. Meanwhile, Pacal did deliver, and got a crushing tech and territory lead. His spaceship would land way before Willem achieved 3 Legendary cities, and that's if he didn't simply conquer his way to Domination before.
And then, somehow, Willem managed to drastically increase his culture output, and it appeared he would actually beat Pacal's space victory by about 5 turns!
Pacal declared on him. Rifles don't do too well against mechs and modern armors. Willem's outer cities quickly fell, and Pacal's main stack moved next to one of Willem's Legendary candidates.
Pacal signed peace. :wallbash:

Willem dropped the slider. :smoke:

And turned it back after a few turns. He won by Culture on the very turn before Pacal's ship was due to land!!

As was the case for the previous map, this map wasn't balanced, with Louis's starting spot (C) accounting again for 50% of the wins.
It shows that it was a central position as it basically led to a win or die result.
The second strongest position was Willem's (G), with Mehmed's (D) decent as well.
The two worst starting positions were Pacal's (A) and Victoria's (E), although for different reasons. Pacal's was boxed in, with no Copper, and opponents on 3 sides. It has the highest elimination ratio as a result. Victoria's had a nice backline, although barb territory, but with no food and jungle-choked: it led to a very slow development.

I initially tried for this game a shuffling algorithm based on kills (basically killer swaps with victim). It was the most fun, and at least made kill steals interesting instead of annoying, but I had to drop the attempt quickly as it became apparent it couldn't lead to fair permutations. So if you're wondering why for instance Louis played his 3 games from position C so close together, that's why.

Keler · Aug 1, 2023

No matter what the settings are, low peaceweight leaders ALWAYS outperform high peaceweight leaders under equal enough conditions. What's a big surprise is how all those high peaceweight leaders like Gandhi and Mansa Musa outperformed in their live games when they really should not have, ending up as the best leaders in Sullla's ranking. Extreme amount of luck on their part. In fact there are way too many live games we watched where a 5% chance outcome happened, and the alternate replays of those same games pretty much support that. And here, with shuffled starts where every leader gets to play 3 or 4 times from every capital, the high pw leaders' scores pretty much say it all. I almost never see Gandhi doing well in my own single games either. I wouldn't consider Willem a low peaceweight leader; he should not be in good relations with Mehmed or Pacal. This is another equally distributed pw game just like game 1, where the high pw leaders proved unable to do anything. They're not into conquering anything, and they're a few techs away from rifles yet die to a warmonger grenade rush because their AIs are too busy building wonders and buildings. Their early game expansion could turn into a Pyramids and missionaries disaster too...

And as for the map, I didn't realise at the time that Victoria's start (E) would be that hopeless. Then again I never see such terrible starts as Pacal's (A) when I generate maps on my computer. I also thought Gandhi's (B) would do better than this at first glance.

Thrasybulos · Aug 2, 2023

High peaceweight leaders are outnumbered in the game, so the odds are stacked against them from the start. What seals the deal is, I believe, the fact that most of them cannot plot at Pleased. Add that to a usually low attack rating, and you've got a really bad combination for AI Survivor settings.
Mansa would be an exception: I find him to be fairly aggressive, and he can plot at Pleased. With his strong eco, that makes him a serious contender. At least for the Opening Round. The peaceweight situation in the Playoffs tends to make the situation pretty bad for him, barring extreme luck... which he seemed to have enjoyed on a few occasions indeed.
Here Gandhi got the expected 3 wins for 20 iterations of a 7-player game, so in that respect, his performance was simply average. But usually, he needs a high peaceweight field to perform well... which he got an undue number of times (Seasons 3, 5, 7 come to mind, and I might have forgotten some).

Outlier results are to be expected sometimes, but I must confess that this season (Season 7) seems to have been particularly rife with them.

Willem is peaceweight 4 (low end of "neutral"), Mehmed and Pacal are PW 2. Random modifiers at the start of the game can certainly widen the gap a lot, but a base gap of 2 isn't that bad. I'd say PW 0 leaders would be more of an issue for him.

Thrasybulos · Aug 2, 2023

Game 3's story was a simple one, whether for the live game or the AH: a complete and overwhelming domination by Julius Caesar.
His starting position seemed particularly good, and especially suited for his traits. So could he achieve similar results from different starts?

Well... No.
Caesar debunked ? Possibly. He didn't perform poorly (he does finish second after all), but he certainly didn't live up to expectations.

This is a game which got me worried on two accounts.
First, there's always the possibility that this experiment yields a result we'd rather not see: all AIs actually perform about the same, it's all about the game context.
And it surely felt that way here: until about run 10 or so, all 6 AIs had about the same score, and each run's result seemed dictated by the starting positions.
Then some started to pull ahead, and some to fall behind, and that gave rise to a second concern.
Look at the final results: Suleiman got 5 wins. Caesar got 5 wins. Joao got 5 wins. And Shaka, with only 3 wins, finishes 1st.
Is my assessment method just wrong? Note that Sullla's gets to the same result too.
My "score" heavily emphasizes survival (figures for "AI survivor"). Sullla's less so, but it rewards kills. And Shaka's typical game involved conquering 3 opponents while another AI conquered only one but seriously out-teched the Zulus along the way. Then came the time of the last war for the Zulus, the one that would ensure their Domination... and it went poorly. Massed cavs are impressive... until they face mechs, modern armors, and nukes. Boy, did Shaka get to glow in the dark game after game! But still, that approach works here: as the sole survivor beyond the game winner, he would score high. And score high on Sullla's Power Rating too through his numerous kills.
But what if a similar scenario happened all throughout? Would it be "right" for the ultimate champion to be decided not through getting more wins, but through dying less often?
If we look at sports or other competitions... the answer is "yes". A player or team which consistently finished second would certainly be ranked World n°1 (unless they always lost in finals to the same opponent).
But let's hope it doesn't happen as, however "right" it might be, it still wouldn't feel very satisfying.

So, in the end, Shaka came on top through outliving his victims (by definition).
For a while, Suleiman looked as if he'd be the runner-up. He had a series of good games as a well-rounded leader: good eco, fairly aggressive, goes after all three victory conditions. But he was eliminated a lot, often a consequence of founding a religion and failing to have it spread to the military leaders. Suleiman would actually have got more wins if he didn't insist on running Bureaucracy when going for culture. That's not even his favourite civic, so why? There was even one game where he was running Free Speech when he turned the slider on... only to revolt a few turns later into Bureaucracy! :smoke:

Joao performed comparably to Caesar, and was even more impressive than the Roman leader when starting from that godly spot. But he ultimately paid the price for his highish peaceweight.
De Gaulle was mediocre as expected.
Frederick was the only leader who couldn't pull a win, even when starting from Caesar's original OP position. Although to be fair, he did get to become the dominant AI on at least one occasion, only to be beaten to the punch by Ottoman Culture.
And as mentioned, JC makes it to the playoffs, but not very convincingly.
All in all, it doesn't seem like we have Championship material with these leaders, but we'll see.

The map turned out to be even less balanced than the previous ones, with JC's original position (A) being completely OP and accounting for 60% of the wins, while Suleiman's spot (B) in the middle of the map was an absolute death sentence. Only Shaka has managed to get a win from there.
Joao's (E) and Shaka's (F) were little better.
Frederick's (D), although better, was along the lines of Roosevelt's in game 1: safely tucked out of the way, offering good survival prospects but scant chances at winning.
Only De Gaulle's spot (C) really gave a fighting chance, and that's why he was the second strongest leader in the AH (far behind JC).

The same two warmongers as in the live event get sent to the playoffs, but in reverse order.
I suspect they won't fare any better, though...

Saxo Grammaticus · Aug 2, 2023

First off, I appreciate that your original post has the answers to most of my questions--very thorough! I'm just fascinated with the rating system.

Game 3 piques my interest in just how survival should be valued. At first glance, it seems baffling that Shaka would advance over Joao and Suleiman when they had more first place finishes. Of course, I can identify a soft spot on my part for the two leaders based on their performance in my tests this past season. There is also the nature of Sullla's system where first place finishes are weighted over second place finishes, with the added factor of kills. Is it wrong for Shaka to be rewarded for consistent survival? Hmm...

In your particular rating system, each match runs as a tournament with wins, losses, and "draws" for survivors and those eliminated. I wonder how much it makes sense to award a win to survivors against those eliminated. I can think of a number of scenarios where the survivors are effectively rump states. Your point split for survivors would seem to address the issue of several contenders with 2,000+ pts each as well as a 3,000 pt gap between 2nd and 3rd, for instance. Where I struggle, however, is the somewhat arbitrary nature of who stays and goes with a runaway calling the shots. How much is it a victory to not get chosen for elimination in the last ten turns? Are each of those victories over the eliminated worth the same as the winner's? Of course, Survivor's the name of the game :lol:

I also see merit in the way you evaluate eliminations via turn score. The earliest elimination we have seen in the contest is T80, so really a T100 elimination does not compare directly to one at T300.

Keler said:
My utopic ideal would be to give them points based on their score area, maybe except for winners (especially culture winners). Sadly it is not possible unless I roughly try to draw them by myself on autocad.

I am intrigued by this as well, though I have a hunch it could disproportionately reward "mid-game score leaders" who dominate the score board for much of the game with tech, wonders, and empire-building but end up eliminated. (This could be moderated by the absorption of their lands and the continuation of the game). I suppose this plus the above speak to how to evaluate the performance of the AI who place other than first, while distinguishing between this project and the contest at large.

As though this isn't already way too much thought given to the series :mischief:

I have a couple other questions.

-How do you randomize the starting positions?

-How will you calculate expected score once ELOs are "floating," i.e. from Wildcard on?

Keler · Aug 2, 2023

Saxo Grammaticus said:
I have a hunch it could disproportionately reward "mid-game score leaders" who dominate the score board for much of the game with tech, wonders, and empire-building but end up eliminated

right, or proportionately reward "was going to win but dogpiled to death" leaders who did not get a kill :lol:

and the worst thing is when there are 2 giant civs and 1 rump state civ left on the map, and one of the giants destroys the other one or puts him below that "all game slept" rump state in score late in the game. Fair, right?

I still don't know how ELO is calculated for opening rounds, below average gets minus points. And both 6 or 7 player game winner gets 7,50 points. Good stuff, I am going to stay tuned for that. I was just going to ask what does Elo stands for but turns out there is a wikipedia page for Elo rating system with a man named Arpad Elo!

Thrasybulos · Aug 3, 2023

Saxo Grammaticus said:
-How will you calculate expected score once ELOs are "floating," i.e. from Wildcard on?

Keler said:
I still don't know how ELO is calculated for opening rounds, below average gets minus points. And both 6 or 7 player game winner gets 7,50 points. Good stuff, I am going to stay tuned for that. I was just going to ask what does Elo stands for but turns out there is a wikipedia page for Elo rating system with a man named Arpad Elo!

iirc the Wikipedia page is a good source, but I've used this as a reference : https://www.omnicalculator.com/sports/elo
Basically, the Elo formula gives the expected result between two players. So I calculate it for each "duel" and then sum it all. That gives me the total expected result.
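As a rough illustration of "calculate it for each duel and then sum it all", here is a minimal sketch using the textbook Elo expected-score formula with the usual 400-point scale. The function names are hypothetical, and the K-factor is left as a required argument because the value used in the spreadsheet isn't stated in this thread.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expectation of A's result against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def expected_total(rating: float, opponent_ratings: list[float]) -> float:
    """Sum of the expected results over every 'duel' of one game."""
    return sum(expected_score(rating, r) for r in opponent_ratings)


def rating_change(actual_total: float, expected: float, k: float) -> float:
    """Standard Elo update: K * (actual - expected). K is an assumption here."""
    return k * (actual_total - expected)


# Opening Round: everybody starts level, so each AI in a 6-player game
# is expected to score 0.5 per duel.
print(expected_total(1500, [1500] * 5))
```

With equal starting ratings, any AI scoring below the expected total loses points, which would be consistent with below-average results getting "minus points" in the Opening Round.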

Saxo Grammaticus said:
-How do you randomize the starting positions?

I tried different "algorithms" based on the game results:
- Swap 1st and last, rotate the others
- Killer swaps with victim
- winner takes FTD's place, 6th takes 5th's place, 5th takes 4th's place, etc...
None of those guaranteed a fair distribution, so all had to be abandoned along the way.
I also tried (Game 3) pre-generating the 20 permutations.

What I did for the last games of the Opening Round was divide each series (for 6-player games, idea is similar for 7-player games) into 3 "rotations" of 6 games: each AI must have played every position inside a "rotation". So after 6 games, each AI has played from each starting position once, twice after 12 games, 3 times after 18 games.
I applied the last algo I'd tried (1 -> 6, 6 -> 5, 5 -> 4, etc...) for the first games of each rotation until it yielded an "illegal" result and filled up the rest "manually". With the constraints in place, it's like a sudoku game.
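A "rotation" in this sense is a Latin square: every AI appears once per game and once per starting position. The manual sudoku-style filling could be automated along these lines. This is a sketch of one possible approach (random row-by-row filling with backtracking inside each row), not the procedure actually used for the tournament, and it does not sample Latin squares uniformly.

```python
import random


def random_rotation(n, rng=None):
    """Build n games for n AIs; result[g][p] is the AI starting at position p in game g.

    Each AI plays exactly once per game and exactly once from each position.
    """
    rng = rng or random.Random()
    used_at_pos = [set() for _ in range(n)]  # AIs that already played position p
    games = []

    def fill(row, pos):
        # Depth-first search for a valid assignment of the remaining positions.
        if pos == n:
            return True
        candidates = [a for a in range(n)
                      if a not in row and a not in used_at_pos[pos]]
        rng.shuffle(candidates)
        for ai in candidates:
            row.append(ai)
            if fill(row, pos + 1):
                return True
            row.pop()
        return False

    for _ in range(n):
        row = []
        # A partially filled Latin square can always be extended by one more
        # row (Hall's marriage theorem), so earlier games never need redoing.
        assert fill(row, 0)
        for pos, ai in enumerate(row):
            used_at_pos[pos].add(ai)
        games.append(row)
    return games


# One rotation for a 6-player map, positions A-F as columns
for game in random_rotation(6, random.Random(2023)):
    print(" ".join("ABCDEF"[p] + str(ai + 1) for p, ai in enumerate(game)))
```

Calling it three times gives 18 games in which each AI starts from every position three times, leaving the last two runs free to be set by hand.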

The last two games are "rigged" to ensure fairness for the AIs still in the running: if two AIs are still competing for the 2nd, qualifying place for instance, it wouldn't do to have one play from two strong positions and the other from two "death spots".
Then, sometime during the Wildcard League, I started simply copy-pasting the same permutations for the first 18 games.

Saxo Grammaticus said:
In your particular rating system, each match runs as a tournament with wins, losses, and "draws" for survivors and those eliminated. I wonder how much it makes sense to award a win to survivors against those eliminated. I can think of a number of scenarios where the survivors are effectively rump states. Your point split for survivors would seem to address the issue of several contenders with 2,000+ pts each as well as a 3,000 pt gap between 2nd and 3rd, for instance. Where I struggle, however, is the somewhat arbitrary nature of who stays and goes with a runaway calling the shots. How much is it a victory to not get chosen for elimination in the last ten turns? Are each of those victories over the eliminated worth the same as the winner's? Of course, Survivor's the name of the game

Sullla's system has "kill steals"; I do indeed have cases of "survival steals". Particularly annoying are the instances when one AI conquers another, but then gets attacked and conquered by a 3rd, stronger AI, and the initial victim lives on with a couple of cities left.
It had me wondering for a while whether I should have a civ5-like conquest system: if an AI has lost its capital, it's considered eliminated when the game ends, with the turn of elimination being the last turn of the game.
After all, we had a case when an AI won the game after losing its capital and getting it back (hey, I've started season 2 and I had such a case yesterday evening: Willem, 20 turns away from a Cultural Victory, declares on a much, much stronger Kublai. He loses his capital, which had already gone Legendary. Utrecht is also Legendary, Rotterdam is the one lagging. Willem seems like he's on his way out, but then Ramesses, the game leader, attacks Kublai as well and destroys his armies. Rotterdam goes Legendary. Willem recaptures Amsterdam and wins instantly. As a side note, if Willem's attack was stupid, so was Ramesses', since bailing out Willem was the only way he could fail to win the game...).
But we've never had a game where an AI won after losing its capital and failing to get it back.
And that would be adding another system the AI doesn't understand. In one game, I had an AI lose its capital, then regain it. In a later war, it gave it away for peace (it was just a standard city for it).
In the end, I decided it wasn't worth it, for a few fringe cases.

If (and that's a big if) I find a way to read the data in the replay files (or the save files: since the replay is only created at the end, I guess the relevant data must also be present in the save files), I've got a few ideas about getting something close to what Keler has described. But that's entirely contingent on the ability to extract that data.

Thrasybulos · Aug 3, 2023

Game 4 saw Justinian use the AP to rein in a dominant Peter and claim victory for himself, with Charlie in his wake, leading to the infamous final result we all know.
The alternate histories (played without the AP) have shown that while Justinian was indeed strong in that game (7 wins), Peter was even better (11 wins). But the latter's starting position, next to the doomed Asoka and boxed-in, jungle-choked Sury, was suspected to have played a major role in those results.
And I indeed expected that when shuffling the AIs around, Peter would be nothing special, and Justinian would prove the better AI in that field. Who would be the second best performer? Ask the community, and I'm fairly certain most would have answered Sury without hesitating. That was my case too, even though I had the example of the AH for S4's wildcard game that I'd run, where Saladin outperformed Sury.
Another question mark was Monty: was he really that bad?

Justinian blew past my expectations by absolutely crushing the competition. If the previous game had given rise to doubts about whether some AIs were actually really better than others, this game laid them to rest immediately.

Justinian has two main weaknesses. The major one, of course, is that he cannot plot war at Pleased, which often prevents him from making a decisive move. The other is that, the way the Cultural Victory is coded, he basically never goes for it, even though he usually has very strong culture. But in spite of those weaknesses, his performance was stellar, combining a very strong economy with a strong military. And while he may at times be a tad late getting his cataphracts into play, there's no Rifling-eschewing nonsense from him.
Asoka and Charlie were doomed by their peaceweight: two high peaceweights vs 5 low peaceweights. And in that group, religious tensions would also add fuel to the animosity. To his credit, Asoka managed one miraculous win. But that's it.
Monty did better: he got two wins. But he was eliminated as often as Asoka: 17 times. I'm afraid that he's unfortunately as bad as he's said to be.
Peter got as many wins as Monty, and survived as many times as Charlie: in other words, he performed rather poorly, which I pretty much expected.
What I did not expect, on the other hand, was that Sury would basically be as bad as Peter. I had a gnawing suspicion that Sury was one of those leaders, like Cathy or Kublai, that a few good performances in AI Survivor had led people in general to somewhat overestimate... but this still came as a surprise.
Now, one group of leaders isn't enough of a sample to evaluate the overall performance of an AI, but that seemed to be a perfect group of AIs for him to do well.
Saladin was a nice surprise. Now, I played those games before Season 7's playoffs and Championship, which illustrated that he certainly could be a solid leader. His results here confirm that. He's not Justinian-tier, but certainly seems above average.

Peter's position (F) was indeed very strong, especially on account of having the extremely weak position B (Sury's) as a backline.
Justinian's position (A) was decent: sheltered, good land, with some room.
So was Charlie's (C), which is a bit more of a surprise, as it was central, with very close neighbours. Proximity with usual dogpile-bait locations?
Saladin's (G) was dependent on barb city captures, with lots of jungle: decent land, but slow development. Too slow.
Asoka's (C) had decent land, but it lacked production and was the most central position: not healthy in a group of religious fanatics.
Montezuma's (E) position seemed similar to but better than Charlie's, and yet it led to poorer results. I guess the metal situation was to blame: no copper, and iron availability iffy (closest could easily be stolen by whoever was at Charlie's spot).
Sury's position (B) was confirmed as awful: even Justinian couldn't make it work.

While I was harbouring doubts about the AIs that the previous game sent to the playoffs, here, with Justinian, we had definite Championship material.

Keler · Aug 3, 2023

Thrasybulos said:
Sury's position (B) was confirmed as awful: even Justinian couldn't make it work.

But he made it work by coming in as a well-deserved runner-up twice! And so he is rewarded for being the only leader not starting there 3 times.

This is so much fun to read, thanks for sharing all your efforts.
