How to handle version-differences in the HOF

Airny · Jan 31, 2007

As proposed in another thread, I'm starting a new one on this topic, which seems to be of interest to some people.
I quote the comments made so far:

superslug said:
Sorry about that!

Are you proposing an unofficial version filter? Or suggesting a divorce into a Vanilla HOF and Warlords HOF?

Bastian-Bux said:
Well, my impression is that some victory conditions suffer more from Warlords than others. Frex it seems that rushes are a tad harder than in Vanilla (still stupidly easy). As it is now, rushes are also some of the most played categories, so it is very hard to compete in this area with established Vanilla scores.

Another reason might be that many of the good players still prefer Vanilla. So mostly "sub average" players like me try to fill the scores with Warlords games.

But altogether I agree with the OP: there is a distinct imbalance between Vanilla and Warlords scores.

Now how to solve it?

Three possibilities:

1) two completely different HOF
2) an unofficial version filter
3) a version multiplier (frex Vanilla rushes get only a 0.x multiplier, while Warlords gets 1.0)

It's also possible to combine those three ways: an official filter is used to make two different HOFs for now (DB side they are stored in the same tables, only the output is divided), while the HOF team compares them and works out consistent multipliers, so that over time we might come back to one HOF.
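As a rough illustration of how options 2 and 3 could sit on top of one shared table, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Submission` record, the function names, and especially the multiplier values, which are placeholders and not numbers anyone in the thread has worked out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One HOF entry; the fields are hypothetical, not the real HOF schema."""
    player: str
    version: str  # "vanilla" or "warlords"
    score: float


# Placeholder multipliers, purely illustrative. Real values would have to
# come from comparing enough games on both versions.
MULTIPLIERS = {"vanilla": 0.9, "warlords": 1.0}


def filter_by_version(submissions, version):
    """Option 2: same table, but only show entries of one version."""
    return [s for s in submissions if s.version == version]


def adjusted_score(submission, multipliers=MULTIPLIERS):
    """Option 3: scale the raw score by a per-version multiplier."""
    return submission.score * multipliers[submission.version]


def combined_ranking(submissions, multipliers=MULTIPLIERS):
    """One merged table, ordered by the adjusted score (best first)."""
    return sorted(
        submissions,
        key=lambda s: adjusted_score(s, multipliers),
        reverse=True,
    )
```

With these placeholder values, a vanilla score of 100 would rank below a warlords score of 95 in the merged table, while the filter simply hides one version without touching any score.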

Why would that make sense? Well, let's face it, in the end all of us will probably play Civ4 with all its expansions. But it would be sad to declare the older results not comparable or, equally bad, to let them rot in separate HOFs nobody looks at anymore.

By separating them there would be more incentive for players to get Warlords results, and thereby build a database from which useful multipliers can be developed. And thus the "old" records don't lose their relevance. They are just multiplied down (or even up) to reflect the comparable difficulty.

Yeah, I know, it's quite some work. But let's just imagine the situation once a second or even third expansion hits the market.

dutchfire said:
My opinion:

If you think some leader will give you the best result, you pick him.
If you think some version will give you the best result, you pick that version.

IMO it's not that bad if "no-one" plays warlords, it's their own choice.

I'd suggest just a filter on the website atm.
It would be sufficient and could be off by default.

DeafDolphin · Feb 1, 2007

I would suggest no changes.

Having a filter, or even two separate tables, would lead to a tremendous amount of work for the already busy HoF staff, who are cooking up their nefarious schemes to make us pull our hair out at impossible winning dilemmas. :lol:

Not to mention, the version differences are balanced out in one way or another, via traits, UU, UB, and adjusting of their statistics.

For example, the change to Catherine's traits from Organized & Expansive to Imperialistic & Organized comes along with a change to the power of the Cossacks. Losing Expansive means a hit to health, while Imperialistic gains little beyond cheap settlers and increased Great General appearances. With the Cossacks being nerfed in Warlords, and smaller cities/production, Catherine is not quite the powerhouse she would be in the Vanilla version. You can make the case that others have benefited from the version differences, but it balances out in the end, one way or another.

I really don't think anything would be accomplished by having such a filter or separate set of tables, even if it is off by default. It just means more work and really accomplishes nothing. I don't see such a filter for the Civ3 HoF, despite the changes both expansion packs brought. That tells me the players don't perceive any problems.

Airny · Feb 1, 2007

@DeafDolphin:
Did you try to beat AI on high level with warlords 2.08?
Did you ever try to make a slingshot to CS now?

I'm mainly concerned with the changes they made in warlords 2.08; there isn't a single thing I'm aware of that's easier now.

Lexad · Feb 1, 2007

Like you always need a slingshot for a speed win. View it as a good opportunity to learn new strats.

DeafDolphin · Feb 1, 2007

Airny said:
@DeafDolphin:
Did you try to beat AI on high level with warlords 2.08?
Did you ever try to make a slingshot to CS now?

Yes. It's difficult, but doable, same as Vanilla.

CS Slingshot: Done it numerous times on the lower levels. It's harder on the upper levels. Besides, if you miss the CS Slingshot, it's not the end of the world.

Bastian-Bux · Feb 1, 2007

Lexad, I for example don't play Vanilla anymore at all, though I do know that I'd have a much easier time scoring high with it.

The problem stays: warlords changed the game balance so much that results from the same table ain't really comparable if one is 1.61 and the other is 2.08.

This is not true for all combinations, victory conditions and such, that's why I ain't proposing completely separate tables. Though it's true enough for many conditions.

And while there surely are some things that are easier using Warlords (you experienced players probably know this much better than me), in general a warlords score is "harder fought for".

Lets take a look (only deity difficulty, to make it easier):

huge: 28 vanilla, 1 warlord
large: 24 vanilla, 3 warlords
standard: 35 vanilla, 3 warlords
small: 29 vanilla, 2 warlords
tiny: 41 vanilla, 8 warlords
duel: 52 vanilla, 5 warlords

So out of all 231 submitted deity wins, 209 use vanilla and 22 warlords. A meager 9.5%. It will probably look a bit better at the lower difficulties, though the picture is clear: vanilla has been out for almost 18 months, warlords for 6 months. Shouldn't we see a bit more warlords submissions by now?
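For anyone who wants to re-check the tally, the arithmetic can be reproduced in a few lines. The counts are the ones from the list above; the variable names are just for illustration.

```python
# Deity wins per map size as listed above: (vanilla, warlords).
deity_wins = {
    "huge": (28, 1),
    "large": (24, 3),
    "standard": (35, 3),
    "small": (29, 2),
    "tiny": (41, 8),
    "duel": (52, 5),
}

vanilla_total = sum(vanilla for vanilla, _ in deity_wins.values())
warlords_total = sum(warlords for _, warlords in deity_wins.values())
total = vanilla_total + warlords_total

# Share of all deity wins that were played on warlords.
warlords_share = warlords_total / total

print(f"vanilla: {vanilla_total}, warlords: {warlords_total}, total: {total}")
print(f"warlords share: {warlords_share:.1%}")
```

This only counts submissions. It says nothing yet about why the warlords share is low.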

I think warlords is overall so much more difficult (again, not everywhere and at all times), that a direct comparison between vanilla and warlords scores isn't possible as is.

shyuhe · Feb 1, 2007

Correlation does not equal causation. One of the recent combination gauntlets had people attempt the game with the same settings on warlords and vanilla, and the warlords finish times were significantly faster than the vanilla times (it was space race). It's not so clear that the "higher" difficulty of warlords will directly affect all scores, as a faster AI tech pace helps victory conditions like cultural, space, and diplomatic. I think they should leave the system as it is, or at most include a filter that allows a user to distinguish between warlords and vanilla games.

Bastian-Bux · Feb 2, 2007

shyuhe, it's interesting how you argue:

"It's just coincidence that many results in Vanilla are higher than in warlords, but the few results from the space race gauntlet are a sure sign that it's easier in warlords."

As stated before: there are conditions which became easier with warlords. And there are (IMO more) conditions that got harder with warlords.

Anyway: your own statement gives reason to doubt the validity of comparing at least space race results (and by your add-on sentence also diplomatic and cultural results) from vanilla and warlords.

Let's follow this thought: you suggest that space race, diplomatic and cultural victories are easier in warlords due to the higher tech pace of the AIs. Now you'd have to question if this is not counterproductive at higher difficulties, but well.

The correlation of the data suggests strongly that other victory conditions (or games at higher difficulty) have become more difficult using warlords.
And we can probably assume that some conditions and difficulties will have stayed the same.

As long as the "gap" between the scores for each table doesn't become too wide, I'd agree with you on keeping the combined scores.

And even if the gap between single tables were significantly wide, but the overall effect cancelled it out, you could argue that only the quattromaster results are the important ones. While I would disagree (some people never manage a quattromaster, and care for single tables only), it's a valid argument.

Though let's be honest: both are speculation. None of us has enough data to decide this yet.

IMO there are several possibilities:

1.) there is no significant difference between the versions, and scores are completely comparable
2.) there are significant differences, though only limited to select tables, and only small differences. Overall it cancels out.
3.) there are significant differences, though only limited to select tables, but important differences for those. While it cancels out overall, those tables are clearly distorted, and even events are affected.
4.) the differences are so severe that not only single tables are affected, but even average scores like events or even quattromaster scores are distorted.

What would a good reaction to each outcome look like?

1.) no change needed.
2.) no change needed.
3.) a filter and maybe hints in the FAQ/rules which tables are affected.
4.) separate HOFs or if possible a difficulty multiplier for each table.

So which of the 4 possibilities is true, and as a result which way to choose?

Actually, we don't know. There are arguments for and against each possibility, and differing opinions. The only honest way to decide would be to hold more events like the parallel gauntlet, and compare them. Plus ask especially the high ranking players to give warlords a try so that we get a big enough database for an honest comparison.

Lotsa work for nothing, right? Nope. This whole HOF and especially the comparison between versions are invaluable in evaluating the balance between civs and leaders. If Sid Meier is smart (which he probably is), he has one of his statistics professionals regularly have a look at the HOF results. Anyone with some empirical background can quickly tell which civs/leaders clearly need an overall boost, and which are doing just fine, even if only for a certain function.

So by making a better HOF, we could indirectly support a better balanced Civ, where Huayna and Lizzy ain't making up 50% of all submissions.

DaddyMac · Feb 2, 2007

Bastian-Bux said:
So by making a better HOF, we could indirectly support a better balanced Civ, where Huayna and Lizzy ain't making up 50% of all submissions.

For the most recent HOF submissions update, Huayna and Isabella combined only made up 12.8% of all the submissions.

Bastian-Bux · Feb 2, 2007

Working on it.

Just submitted two quechua rushes in the low 3xxx BC range.

Bah, somehow this rushing feels like cheating.

superslug · Feb 2, 2007

Bastian-Bux said:
Working on it. Just submitted two quechua rushes in the low 3xxx BC range.

Bah, somehow this rushing feels like cheating.

I've even wondered at times if I should ban that strategy, but it's too effective a way for newer players to get confidence they can actually hang at higher difficulties.

playshogi · Feb 3, 2007

I voted for no changes, but I see the merit in suggesting that vanilla games be discounted in some way. I think that Civ4 will stand the test of time (until civ5) and so do we want players plugging away at vanilla years from now because of exploits that exist in it? I think that maybe the QM should be an annual competition with the tables wiped clean each year. Declare a champion at the end of the year and start over with only the most current version of the game in the new year.

Bastian-Bux · Feb 3, 2007

Or give scores a "half-life"? Let's say all scores decay by 0.2% a day. So a 10,000 score is worth only 9,980 the next day, and only 5,000 after about 346 days, which would be the half-life. Close enough to a year to talk about a "1 year half-life".
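The decay idea can be sketched as follows, assuming the 0.2% is compounded once per day. The function names are made up for this example. With daily compounding the half-life comes out near 346 days; the continuous approximation ln 2 / 0.002 gives about 346.6, so either way it is a little under a year.

```python
import math


def decayed_score(score, daily_decay, days):
    """Score left after `days` days, losing `daily_decay` (a fraction) per day."""
    return score * (1.0 - daily_decay) ** days


def half_life_days(daily_decay):
    """Number of days until a score has dropped to half its original value."""
    return math.log(0.5) / math.log(1.0 - daily_decay)


if __name__ == "__main__":
    print(decayed_score(10_000, 0.002, 1))   # value after one day
    print(half_life_days(0.002))             # half-life in days
```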

What would be the result? Newer submissions would be more valuable than older ones. So sure, you could go on and submit one vanilla 3835 BC deity quechua rush every week. But honestly, is that fun?

To encourage the use of new expansions it could even be argued that different decay rates should be used for different HOF versions. Frex 0.4% for HOF 1.0, 0.2% for HOF 2.0, 0.192% for HOF 2.08 ...

Heck, for simplicity's sake you could even say: the daily decay is 0.5% divided by the HOF version used. So a 10-day-old HOF 1.61010 submission would be worth 96.94%, while a 10-day-old HOF 2.08003 submission would be worth 97.62%. Likewise an 18-month-old HOF 1.00 submission would be worth ~6.7%, while a 6-month-old HOF 2.00 submission would still be worth 63.7%.
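Here is a small sketch of that rule, under two assumptions that are not spelled out in the post: the decay is compounded daily, and a month is counted as 30 days (so 18 months = 540 days, 6 months = 180 days). With those assumptions the percentages quoted above are reproduced. The function names are hypothetical.

```python
def daily_decay(hof_version):
    """Daily decay as a fraction: 0.5% divided by the HOF version number."""
    return 0.005 / hof_version


def remaining_fraction(hof_version, days):
    """Fraction of the original score left after `days` days."""
    return (1.0 - daily_decay(hof_version)) ** days


if __name__ == "__main__":
    # Assumption: one month is counted as 30 days.
    cases = [
        ("HOF 1.61010, 10 days", 1.61010, 10),
        ("HOF 2.08003, 10 days", 2.08003, 10),
        ("HOF 1.00, 18 months", 1.00, 18 * 30),
        ("HOF 2.00, 6 months", 2.00, 6 * 30),
    ]
    for label, version, days in cases:
        print(f"{label}: {remaining_fraction(version, days):.2%}")
```

Note that the rule ties the decay to the raw version number, so a future version 3.x would decay more slowly still. Whether that scaling is the right one is an open design question.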

So newer submissions (using newer Civ patches!) would have automatically higher scores, and submissions using newer expansions (resulting in higher version numbers) would keep their value longer.

This would also be an elegant way to give a higher priority to games inside the scope of one expansion but with higher patches (thus hopefully with fewer exploitable holes).

Airny · Feb 3, 2007

I don't see the point in such a penalty.
If nobody has beaten someone's record in his version, why penalize him?
A Hall of Fame that doesn't value its records is worth a ****.

azzaman333 · Feb 3, 2007

Maybe just set a filter to show only warlords/vanilla games? Doesn't affect the HoF, but people can compare their scores solely with x version if they want to.

playshogi · Feb 3, 2007

Airny said:
I don't see the point in such a penalty.
If nobody has beaten someone's record in his version, why penalize him?
A Hall of Fame that doesn't value its records is worth a ****.

The records remain, but the score QM depreciates over time. This has the effect of encouraging a player to submit more games instead of sitting on high scores. Also, it allows new players to surpass high scoring players that have quit playing. It's a worthwhile idea, really.

Methos · Feb 3, 2007

playshogi said:
The records remain, but the score QM depreciates over time.

Is this correct? I've never heard this before. Are you talking about how the QScore changes due to the adjustment of the average from additional submissions? If that is the case, an old submission with the fastest time (raw score of 100) would never depreciate until someone beat his/her time.

playshogi · Feb 4, 2007

No, the score doesn't depreciate, it was just an idea proposed in Bastian's post above this one. The score can only go down if someone submits a better game, but Bastian's idea was to automatically depreciate scores over time, although this still wouldn't prevent someone from playing a new game with the old exploits in 1.61.

Bastian-Bux · Feb 4, 2007

True, playshogi. But if you add in the version dependency, then it devalues results with old exploits (lower version) quicker than results using new exploits (higher version).

superslug · Feb 5, 2007

azzaman333 said:
Maybe just set a filter to show only warlords/vanilla games? Doesn't affect the HoF, but people can compare their scores solely with x version if they want to.

That would certainly be the easiest option for a ~~lazy~~ energy-efficient HOF staff.

playshogi said:
The records remain, but the score QM depreciates over time. This has the effect of encouraging a player to submit more games instead of sitting on high scores. Also, it allows new players to surpass high scoring players that have quit playing. It's a worthwhile idea, really.

What do you think of a change in the HOF as follows?

No changes plz

Make a filter, so I can set "vanilla 1.61 only" for example

Separate the HOF completely (warlords/vanilla)

Use multiplier to balance the versions

I've got another idea and post it here!
