Old GOTM Scoring Discussions

Cartouche Bee · Aug 15, 2002

Yeah, whatever you guys say. Anyway, I ran some tests with my interim scores, and using Beard's system I'd probably just play to 1880, give or take a couple of other factors like map and difficulty.

ainwood · Aug 16, 2002

Originally posted by Matrix
Again, there's no way we can punish milking if we don't square root the score before an early finish bonus is given, since the score also grows exponentially (or quadratically) when milking. So that's what we will do then.

Matrix,

I'll try to keep it short! :lol:

The aim is not to punish milking!

Kemal · Aug 19, 2002

I agree that the aim is not to punish milking but to get a better view of how your skills compare to other players. Obviously this comparison is hampered by the fact that civ3's game scoring formula heavily favours playing through to the end. Since not everyone has the drive or time to play an already won game to the end, I would like to suggest adding an average finishing date for the last five games played to the global ranking. This way it's easier for players to compare their skills with players that have the same playing style (e.g. milking, non-milking or early finishers).

I know it means lots of extra work to calculate all this, but I would be happy to help calculate these dates. Furthermore, since this only affects the global ranking, it should not have to be ready at the same time the GOTM results are published.

What do you think about this? Could this help to get to a better comparison of skills or is it a bad idea that only leads to extra work to implement it?
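A minimal sketch of the suggested "average finishing date for the last five games" figure. The function name, the list-of-years input, and the convention of writing BC years as negative numbers are assumptions for illustration; the thread does not specify how the dates would be stored.

```python
from statistics import mean


def average_finish_date(finish_years, window=5):
    """Average finish year over a player's most recent games.

    finish_years is ordered oldest to newest; BC years are assumed
    to be negative (e.g. 1000BC is -1000).
    """
    recent = finish_years[-window:]
    if not recent:
        raise ValueError("player has no finished games")
    return mean(recent)


# A player who milked the oldest game and finished early afterwards.
print(average_finish_date([2050, 1500, 1000, 500, 1800, 2050]))
```

Only the last five entries are averaged, so the oldest 2050 finish in the example drops out of the figure.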

Cruise · Aug 19, 2002

The civ3 tournament has a nice formula which rewards both early victories and milking.
Aeson knows more about it, consult him

Aeson · Aug 19, 2002

The problem with the scoring formula for the tournament is that it has no way of comparing between victory conditions though. :eek: It's designed to compare games played with the same required victory type. An early Conquest/Domination would almost always win in the GOTM setting.

Beard's formula (here) has some flaws I think. There is too much variation between similarly played games based on what other people did in their games. If a large majority goes for early victories, it devalues the date portion, and vice versa. Just look at comparable early finishes from game to game, some score drastically higher (both in score and relative to 'milked' games) because more people waited to finish their games.

Creepster · Aug 19, 2002

Beard Rinker,

I did look at the results you have posted and initially they do look good. My problem with your formula though is that you use the avg score. I have to agree with Aeson's comments above. The scoring formula needs to be a stand alone score, independent of what others do. If we were to have a GOTM where almost no one milks the game then the results will be skewed.

While I think the results from your formula would work in the short term, it is not a long term fix. I still think we need a formula that takes into account more variables and provides a unique score, based solely on one person's game.

Beard Rinker · Aug 19, 2002

Aeson & Creepster,

I see what you are saying but I don't think it is a big problem.

Comparing data from GOTM V and GOTM VI shows this problem the best. Using the proposed scoring formula, the top 5 finishes in GOTM V were all domination/conquest finishes. In GOTM VI the top 4 finishes were milking games. Not only that, the top milked games in GOTM VI had significantly higher scores than milked games in GOTM V. Similarly, the top fast finish games in GOTM V scored much higher than in GOTM VI. The reason behind this is the average finish date is much earlier in GOTM VI, devaluing the finish bonus in that game. This appears to be exactly what Aeson pointed out.

However, if you look a little farther down the list there is much less variability. Nathan Barcley got 70 in game 5 and 67 in game 6 for fast space finishes. My scores were 53 and 59 for a milked game and the fastest space finish. Aleric 65 and 71 for early domination victories. Alexman got 68 and 67. Like any set of numbers, the top numbers and bottom numbers experience the largest degree of variability.

What this suggests to me is in games where you think the average finish date will be late, finish quickly and games where you expect the average finish date is early, milk the game to some degree. Keep in mind that all players concerned with score will be doing the same thing, thus nullifying this strategy to some degree.

These kinds of strategic decisions are unfortunate and not what was intended when developing the formula. However, I think this does not apply to most players, only to the top players.

Originally posted by Aeson
There is too much variation between similarly played games based on what other people did in their games.

By making a guess at the average finish date, you can alter your strategy to achieve your best score. In GOTM V, there was a very difficult starting position suggesting a late average finish date whereas GOTM VI was a cakewalk and would likely have an early average finish. For GOTM V, finish fast for best score and GOTM VI milk the game to some degree to achieve your best score. Cartouche Bee's games are a good example. In GOTM V he finished fast and GOTM VI he milked the game. Using the proposed formula he would receive scores of 106 and 109.

The perfect solution? Hell no, but this seems better to me than the answer always being milk the game to 2050.

Unlike game score, the best way to achieve your best score using the scoring formula is not clear. I can speculate however that fewer players would milk their games to 2050, lowering the average score and finish date. Whether this will exacerbate the problem pointed out by Aeson or reduce it I'm not sure.

Originally posted by Creepster

The scoring formula needs to be a stand alone score, independent of what others do.

The problem with this is your scores from month to month would not be comparable. GOTM V is a good example why. It had a very difficult start position that resulted in the latest average finish date and one of the lowest average scores. I can't think of any way to mathematically compensate for a bad start position other than to use the results of others.

The average skewing problem probably could be resolved with a little more adherence to statistical formulas, but my guess is the result would be a formula that is far too complex. One of the objectives with this formula is to keep it simple enough that any player could understand the dynamics behind it without too much difficulty.

Aeson · Aug 20, 2002

I just don't like the fact that 80 people have a say in what your score turns out to be, and that the difference between 'good' and 'great' has nothing to do with how well the game was played. It turns into guesswork as to how everyone else will play, and could make reading the spoilers (even after you know the map) very much a spoiler.

If there is going to be a scoring system change, I think it should be like Cartouche Bee was suggesting earlier in the thread. Certain desirable factors are kept track of, and a static scoring system can be formed from those numbers. It's a lot more involved, but a utility program could handle it no problem. The only difficulty is deciding on how to weight the factors.

For game to game comparison, the map could be analyzed to figure out 'best' scores and dates, and players' results adjusted accordingly. This could be done before the game was released. All that is needed is a utility to extract the basic information (behind the scenes so the administrator can still participate) and make the calculations. The results would be released with the game.

Score

Score would be determined by the efficiency the player showed in obtaining the 'max' milking score. A bonus for early victory would then be added to keep milking and early victories both as viable options.

Objectives

- To determine the max milking score possible for the map
- To use that figure to modify the player's score

AverageFoodPerTile = Food / FoodTiles
ClaimableFood = DominationLimit * AverageFoodPerTile
WorkingCitizens = DominationLimit
SpecialistCitizens = ((AverageFoodPerTile - 2) * DominationLimit) / 2
MaxTurnScore = ((WorkingCitizens * 2) + SpecialistCitizens + DominationLimit) * Difficulty
MaxMilkedScore = MaxTurnScore * Landform

PlayerScore = ((GameScore - GameBonus) / MaxMilkedScore) * ScoreAdjustment

GameBonus = (2050 - PlayersDate) * Difficulty

Landform

Archipelago = ?.45?
Continent = ?.55?
Pangaea = .65

ScoreAdjustment = 10000 (arbitrary number to set a score in score form instead of %, this would result in a 'max' score [theoretically possible to exceed] of 10,000 for example)

Depending on the Landform (archipelago, continent, pangaea), another % of that is max 'milked' score. I'm pretty sure on the pangaea number already (65%) because of HOF attempts, and the others could be deduced from the plethora of well milked games that have been played in the GOTM's so far (excluding the dogpile game of course). It would give us a pretty decent max score to judge the games by, and would be a score that is available to the player as soon as they finish the game.
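The score section above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the `MapInfo` container and all function names are hypothetical, the numeric difficulty multiplier is assumed to be supplied by the caller, the archipelago and continent factors are the tentative values from the post, and PlayerScore is read as subtracting GameBonus from GameScore before dividing by MaxMilkedScore.

```python
from dataclasses import dataclass

# Landform factors from the post. Only the Pangaea value is stated with
# confidence; the other two are marked as guesses in the original.
LANDFORM = {"archipelago": 0.45, "continent": 0.55, "pangaea": 0.65}
SCORE_ADJUSTMENT = 10_000


@dataclass
class MapInfo:  # hypothetical container for the map statistics
    food: float            # total food over the food-producing tiles
    food_tiles: int
    domination_limit: int  # tiles a player can claim
    difficulty: int        # difficulty multiplier (scale assumed)
    landform: str


def max_milked_score(m: MapInfo) -> float:
    avg_food = m.food / m.food_tiles
    working = m.domination_limit
    specialists = ((avg_food - 2) * m.domination_limit) / 2
    max_turn = ((working * 2) + specialists + m.domination_limit) * m.difficulty
    return max_turn * LANDFORM[m.landform]


def game_bonus(player_date: int, difficulty: int) -> int:
    return (2050 - player_date) * difficulty


def player_score(game_score: float, player_date: int, m: MapInfo) -> float:
    # Assumed reading: remove the in-game finish bonus first, then
    # express the remainder as a share of the max milked score.
    bonus = game_bonus(player_date, m.difficulty)
    return ((game_score - bonus) / max_milked_score(m)) * SCORE_ADJUSTMENT


example = MapInfo(food=3000, food_tiles=1000, domination_limit=1000,
                  difficulty=4, landform="pangaea")
print(max_milked_score(example))
print(player_score(5000, 1050, example))
```

The example map is invented purely to exercise the arithmetic. ClaimableFood from the post is not used by any later line of the formula, so it is left out of the sketch.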

I think removing the territory portion of the score could also help to bring building back towards parity with early conquest then build. This way a size 6 city would be worth as much as 6 size 1 cities, though eventually more cities will win out (as they should). Population is dependent on territory, and so territory is counted twice (once directly, once indirectly through population) in the current scoring system.

Date

Date is a bit more difficult, but we can still come up with good estimates of a well played victory condition from map settings and difficulty level. Again, this is information that is available to the player before the game is played, and they can play accordingly, instead of having to guess at what the adjustments will be once all is said and done.

Objectives

- To balance scoring by finish date
- To give a date bonus which is comparable between victory conditions
- To allow for comparison between maps and difficulty levels

Each victory condition would get a 'best' date for each map setting/difficulty combination.

Code:

Base victory condition 'best' dates: (standard continents map, monarch)

Conquest       : 10AD
Culture (100k) : 1500AD
Culture (20k)  : 1700AD
Diplomatic     : 1300AD
Domination     : 500AD
Spaceship      : 1600AD

Difficulty        Chi   War   Reg   Mon   Emp   Dei

Conquest       : -300, -200, -100,    0, +300, +500
Culture (100k) : -100,  -50,  -25,    0, +100, +200
Culture (20k)  :  -50,  -25,    0,    0,  +75, +100
Diplomatic     : +150, +100,  +50,    0, -200, -300
Domination     : -300, -200, -100,    0, +200, +400
Spaceship      : +100,  +50,  +25,    0, -100, -300

Map Modifiers    Tiny Small Stand Large  Huge    Arch  Cont  Pang

Conquest       : -200, -100,    0, +200, +400    +200,    0, -200
Culture (100k) : +300, +200,    0, -100, -200    +100,    0, -100
Culture (20k)  :    0,    0,    0,    0,    0       0,    0,    0
Diplomatic     :    0,    0,    0,    0,    0       0,    0,    0
Domination     : -200, -100,    0, +200, +400    +200,    0, -200
Spaceship      :    0,    0,    0,  -50, -100    +100,    0, -100

EXAMPLE: Conquest/Emperor/Small/Cont

ConditionBestDate = 10 + 400 - 100 + 0 = 310AD

Of course all these numbers could use some refinement! (suggestions welcome) Also, it might be best to convert entirely into turns.
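The table lookup above can be sketched as a small Python function. The names `BASE`, `ROWS` and `condition_best_date` are hypothetical, and dates are treated as plain years AD. Note that the tables as posted list +300 for Conquest on Emperor, while the worked example uses +400 (the post was edited afterwards), so this sketch returns 210 rather than 310 for that case.

```python
# Base 'best' dates in years AD (standard continents map, Monarch).
BASE = {
    "Conquest": 10, "Culture (100k)": 1500, "Culture (20k)": 1700,
    "Diplomatic": 1300, "Domination": 500, "Spaceship": 1600,
}

LEVELS = ("Chi", "War", "Reg", "Mon", "Emp", "Dei")
SIZES = ("Tiny", "Small", "Stand", "Large", "Huge")
LANDS = ("Arch", "Cont", "Pang")

# Rows copied from the tables: (difficulty, map size, landform) modifiers.
ROWS = {
    "Conquest":       ((-300, -200, -100, 0, 300, 500), (-200, -100, 0, 200, 400), (200, 0, -200)),
    "Culture (100k)": ((-100, -50, -25, 0, 100, 200),   (300, 200, 0, -100, -200), (100, 0, -100)),
    "Culture (20k)":  ((-50, -25, 0, 0, 75, 100),       (0, 0, 0, 0, 0),           (0, 0, 0)),
    "Diplomatic":     ((150, 100, 50, 0, -200, -300),   (0, 0, 0, 0, 0),           (0, 0, 0)),
    "Domination":     ((-300, -200, -100, 0, 200, 400), (-200, -100, 0, 200, 400), (200, 0, -200)),
    "Spaceship":      ((100, 50, 25, 0, -100, -300),    (0, 0, 0, -50, -100),      (100, 0, -100)),
}


def condition_best_date(condition, level, size, land):
    """Base date plus the difficulty, map size and landform modifiers."""
    diff, sizes, lands = ROWS[condition]
    return (BASE[condition]
            + diff[LEVELS.index(level)]
            + sizes[SIZES.index(size)]
            + lands[LANDS.index(land)])


# Conquest / Emperor / Small / Continents with the tables as posted:
print(condition_best_date("Conquest", "Emp", "Small", "Cont"))  # 10 + 300 - 100 + 0 = 210
```

Keeping the modifiers in one table makes it easy to try the refinements the post asks for without touching the lookup itself.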

After all the modifications to the 'best' date for each victory condition are done, a bonus (in score form, comparable to the ScoreAdjustment) is added based on how quickly the player was able to achieve the victory condition. With a decent balance in the modifiers, it should award a comparable bonus for a 'good' conquest as a 'good' spaceship victory, even on different maps or difficulty levels. Taking out the difficulty level comparison could be done if it was desired.

Bonus = ((2050 - PlayerDate) / (2050 - ConditionBestDate)) * ScoreAdjustment

ScoreAdjustment = same as in the scoring section, to equally weight score and date.

FinalScore = PlayerScore + Bonus
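As a rough illustration of the Bonus and FinalScore lines, here is a short Python sketch. The function names are hypothetical, dates are assumed to be years AD, and ScoreAdjustment uses the 10,000 value given in the scoring section.

```python
SCORE_ADJUSTMENT = 10_000


def date_bonus(player_date, condition_best_date,
               score_adjustment=SCORE_ADJUSTMENT):
    """Share of the available time saved, scaled to score form."""
    return ((2050 - player_date) / (2050 - condition_best_date)) * score_adjustment


def final_score(player_score, player_date, condition_best_date):
    return player_score + date_bonus(player_date, condition_best_date)


# Matching the 'best' date earns the full bonus; finishing halfway
# between the 'best' date and 2050 earns half of it.
print(date_bonus(310, 310))
print(date_bonus(1180, 310))
print(final_score(6000, 1180, 310))
```

A finish earlier than the 'best' date gives a bonus above ScoreAdjustment, which is consistent with the post's remark that the 'max' is theoretically possible to exceed.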

Conclusion

I know there is a lot of tweaking to do in a system like this, but even as it is, it should give a reasonable basis for comparison of different victory types within the same game, and even from game to game. There could always be more modifiers added (wonders, future tech, etc.) as well. A utility to determine the score would definitely be needed. I could write one if necessary, though mapstat is already pretty much there if Chiefpaco would care to allow for it.

Each GOTM would be released with the following info: (example)

MaxMilkingScore = (would depend on the map generated)

Condition 'Best' Dates (Emperor/Small/Continents)

- Conquest = 310AD
- Culture (100k) = 1800AD
- Culture (20k) = 1825AD
- Diplomatic = 1400AD
- Domination = 500AD
- Spaceship = 1500AD

EDIT: changed some of the modifiers that aren't reflected in the examples posted below.

Aeson · Aug 20, 2002

Problems with this suggestion as it is:

- The map/difficulty adjustments need refining.

- Later victory conditions (Culture, Diplomacy, Spaceship) should always score more in the score section (more population), and so should perhaps have a general modifier to the base ConditionBestDate to compensate.

- One city culture (20k) might not be something that should be included at all.

Aeson · Aug 20, 2002

EDIT: Just going to do the examples in a spreadsheet, my math was off somewhere anyways.

Aeson · Aug 20, 2002

Here is the formula at work in GOTM 6-9. It's an Excel file.

ainwood · Aug 20, 2002

Aeson;

This is what I was getting at, although I didn't go as in-depth as you did. One addition that I would make is that you need some means of demonstrating that you had worked yourself into a position where you can viably milk to the end and win, i.e. there is no chance of a spaceship or diplomacy loss, for example. Perhaps you have to win by conquest, or alternatively reduce the world to three civs and hold at least 90% of the domination limit.
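The eligibility idea could be checked with something like the sketch below. The function name and parameters are hypothetical, and two details are assumptions: that "three civs" counts the player, and that progress towards the domination limit is measured in tiles.

```python
def can_claim_milking_credit(victory, civs_remaining,
                             tiles_owned, domination_limit):
    """True if the game reached a position from which milking was safe."""
    if victory == "conquest":
        return True
    # Integer arithmetic avoids float rounding right at the 90% mark.
    near_limit = tiles_owned * 10 >= domination_limit * 9
    return civs_remaining <= 3 and near_limit


print(can_claim_milking_credit("conquest", 1, 400, 1000))
print(can_claim_milking_credit("spaceship", 3, 900, 1000))
print(can_claim_milking_credit("spaceship", 5, 950, 1000))
```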

The aim here is to give the early finisher the equivalent score that they could have got by milking, without having to go through the mechanics of the milking.

Creepster · Aug 20, 2002

Originally posted by ainwood
The aim here is to give the early finisher the equivalent score that they could have got by milking, without having to go through the mechanics of the milking.

I don't believe that the player should be given the full value of potential milking at the earlier date. Most players out there cannot milk a game to its maximum potential. There are a lot of variables that go into a truly fine milked game, and so far only a handful of people have achieved this. This may in fact be due to boredom, or attention to detail, but the fact remains. Look at the scores for the GOTMs: there are several people who tried to milk the game, but their scores vary by quite a few points. A lot of this has to do with population management at the end of the game.

The early finisher should come out at some value less than an optimized milked game.

ainwood · Aug 20, 2002

My initial "idea" was to give people 80% of the theoretical maximum that could be got by milking.

Aeson · Aug 20, 2002

The formula I proposed doesn't do away with milking, it just puts early finishers on even ground in comparison. A well played spaceship launch should be comparable to a well played conquest or a well milked game.

Kemal · Aug 20, 2002

I agree with Aeson. It is undoubtedly true that one needs skill to successfully milk a game for a high score, and not everyone possesses these skills. Therefore, completing a successfully milked game should definitely not be punished but rewarded in the form of a high ending score.

However, I do not see why early finishers should consequently receive a lower score than a milker (as is the case with the current in-game scoring formula), since it takes (other) skills to obtain early finishes, and it doesn't make sense that those skills should be rewarded so little compared to good "milking" skills.

So it should be possible for early finishers to get a score which could compare with "milking-style" games.

Creepster · Aug 20, 2002

I agree that the scores should be comparable. A truly well played "conquest game" in some cases should or could beat a well played "milked game". My point is that one should not be favored over the other.

In the tournament for example it is set up to favor a fast finish and nothing else. I would not like to see that represented here.

Kemal · Aug 20, 2002

In my opinion playing a great game doesn't necessarily mean having lots of war and conquest; cultural, spaceship and diplomatic victories can be excellently played as well.

I just looked at Aeson's formula at work and I find the results obtained via this formula to be giving a much more balanced (and thus better) standing concerning milked and early finish games and definitely an improvement over the current formula.

Beard Rinker · Aug 20, 2002

The scoring formula proposed by Aeson is fundamentally different than the one I proposed. It has some advantages over the formula I proposed, but there are also some problems with this approach.

I will try to clarify the differences between the two approaches and identify the strengths and weaknesses of each approach. I will try and keep this analysis as unbiased as possible. :rolleyes:

First of all, the goals of these two formulas appear to be subtly different.

Originally posted by Aeson
Score would be determined by the efficiency the player showed in obtaining the 'max' milking score.

A similar statement about the scoring formula I proposed would be as follows:

Score is determined by how well a player did compared to the average score and average finish date.

Premise of Modifier Based Formula (Aeson's formula)

Score is determined by the efficiency the player showed in obtaining the 'max' milking score. A bonus for early victory is then added to keep milking and early victories both as viable options.

Essentially your score is modified based on various game statistics such as number of game tiles, average food each tile can produce, land form etc. These modifiers are balanced so no victory condition has an advantage and your score can be compared from month to month.

Premise of Average Based Formula (Beard's formula)

Score is computed using the average per turn game score and average finish date. The premise behind this is that averages provide a consistent benchmark to measure against that will automatically compensate for variations in each game such as map size, difficulty level and start position. This should make your score comparable from month to month.

The formula weights per turn score and finish bonus equally. The premise behind this is these two factors tend to work against each other. The faster you finish the lower your per turn score, the longer you take the higher your per turn score. By weighting them equally, no victory condition should be favored.

Strengths of Modifier Based Formula

Score is independent of other players results. You can compute your score as soon as finished and your results are not subject to what other players decided to do. This formula can also be used in non GOTM games.
Formula is open-ended. This approach allows for additional game statistics other than finish date, game score and victory condition to be added at some point.

Strengths of Average Based Formula

Scores are very comparable from month to month and between victory types. Using averages automatically compensates for all differences between games and victory conditions.

Weakness of Modifier Based Formula

Very complex. This is why I took a different approach. The complexity of this formula leads to a whole host of problems.
Scores not very comparable between different games. Many of the differences between games can be compensated for with modifiers but there are some that cannot. The most notable condition that cannot be compensated for is the starting position.
Scores not very comparable between victory types. Eventually this problem will diminish as the modifiers are improved but there will be an imbalance for some time.
Constantly Changing. There are many modifiers and these modifiers are largely determined by a best guess. These will likely change frequently as imbalances are detected.

Weakness of Average Based Formula

Higher variability in top and bottom scores. Like any set of statistics, the highest variability occurs in the top and bottom numbers.
Unusually difficult games such as the GOTM VII Deity level game produce unusual results. This may also be the case if we played a Chieftain level game with an easy start position.
Scores are subject to an "average skewing" effect. In games where there is an early average finish date, the finish bonus portion of games with the fastest finishes is devalued. Likewise the finish bonus for the fastest finishes is overvalued in games where the average finish date is late. This seems to only affect the results of the top scores.

An Exceptional Game

A description of an exceptional game from the perspective of each scoring formula also illustrates the differences between the two approaches.

Using Aeson's modifier based formula, an exceptional game is one where a player achieves a victory condition at or near the earliest theoretical finish date for that victory condition. Another exceptional game is one where the game's score is at or above the maximum theoretical score.

Using Beard's average based formula, an exceptional game is one that has double the per-turn score of the average game and is finished with double the average number of turns left. Another exceptional game might be one that has 4 times the average per turn score or one that is finished with 4 times the average number of turns left.
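The average based view of a game can be illustrated by computing the two ratios mentioned here. This is not Beard's formula, which is not reproduced in this thread; the sketch only computes a game's standing relative to the field averages, with hypothetical names and invented numbers, to show how the same finish is valued differently depending on what everyone else did.

```python
def relative_standing(per_turn_score, turns_left,
                      avg_per_turn_score, avg_turns_left):
    """Return (score ratio, turns-left ratio) against the field averages."""
    return (per_turn_score / avg_per_turn_score,
            turns_left / avg_turns_left)


# The same fast finish (200 turns left) in two different fields.
late_field = relative_standing(20, 200, avg_per_turn_score=20, avg_turns_left=100)
early_field = relative_standing(20, 200, avg_per_turn_score=20, avg_turns_left=200)
print(late_field)   # turns-left ratio is 2.0 when most players finish late
print(early_field)  # turns-left ratio is 1.0 when most players finish early
```

The drop in the second ratio is the "average skewing" effect listed among the weaknesses: an identical finish earns less credit when the field as a whole finishes early.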

Conclusion

As pointed out, there are merits to both approaches. I think determining a best formula depends on what we would like to achieve here.

If a comparison between players' scores and games from month to month is the objective, then an average based formula would work best. Using averages automatically compensates for differences between games and victory conditions by providing a solid benchmark for comparison.

If a measure of how efficient a game is compared to the theoretical perfect game is the objective then the modifier based formula proposed by Aeson is the best approach. This approach also allows for adding other game statistics at some point in the future.

I would also like to add that the complexity in the approach used by Aeson makes balancing the formula a difficult, if not impossible task. That Aeson has laid out the groundwork and determined many of the modifiers is an impressive feat. These modifiers will be subject to constant change and there will always be factors that cannot be compensated for such as start position.

Aeson · Aug 21, 2002

Originally posted by Beard Rinker
If a comparison between players' scores and games from month to month is the objective, then an average based formula would work best. Using averages automatically compensates for differences between games and victory conditions by providing a solid benchmark for comparison.

It wouldn't be too hard to incorporate a normalizing factor into the formula I posted. It could either be based off the max score posted, or the average of all scores. As it is, all the 'best' scores from the last 4 GOTM's are already in a pretty well defined 10k-12k area. This can be further refined through the map/difficulty modifications.
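A normalizing factor of the kind described could look like the sketch below. The function name, the `basis` switch and the target value of 100 are assumptions for illustration; the post only says the factor could be based on the max score posted or on the average of all scores.

```python
from statistics import mean


def normalize(scores, basis="max", target=100.0):
    """Rescale final scores so the chosen reference score maps to target."""
    if basis == "max":
        reference = max(scores)
    elif basis == "average":
        reference = mean(scores)
    else:
        raise ValueError("basis must be 'max' or 'average'")
    return [score / reference * target for score in scores]


print(normalize([12000, 6000], basis="max"))
print(normalize([10000, 12000, 8000], basis="average"))
```

Normalizing against the max or the average reintroduces some dependence on other players' results, which is the trade-off discussed earlier in the thread.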

Originally posted by Beard Rinker
These modifiers will be subject to constant change and there will always be factors that cannot be compensated for such as start position.

This certainly is true. As the GOTM gives the same starting condition to everyone, it shouldn't cause problems except in game to game comparison. That comparison is done through the global rankings, which nullifies the problem by taking the averages into account already. There could be starts which favor one victory condition over another, but any formula would be affected by that. It's up to the player to assess the map and decide which condition to pursue.
