da_Vinci said:
Absolutely, but in chess there is no attempt to estimate your opponent's performance in THAT game in isolation, the rating is used, which represents an estimate of average performance over several to many games.
Unless I have misunderstood, I thought you were trying to generate a rating for the AI based on a single game (or a single save), rather than AI performance (within diff levels) over many games. Which seems to add a lot of complexity.
You have to differentiate between a single performance rating and the averaged rating. The performance rating indicates a player's performance in a single game or tournament, while the player rating is an average with an exponential forgetting factor. A single game can be rated, and I know for a fact that this is sometimes done in the Danish Chess Team Championship.
In my rating proposal for GOTM, the AI and player ratings could similarly be calculated both as performance ratings and as averaged ratings, with the latter being the end result of the rating calculation.
da_Vinci said:
Assuming the AI does not learn ... its skill level should be pretty constant over time. Yes, it will have variability as it applies its algorithm to varying game conditions (the algorithm makes decisions that are better for some game situations than for others), but that is a pretty predictable and constant range of variation, I would think.
So if AI rating is estimated historically, how much complexity can be removed from your system?
As far as I know the AI doesn't learn from experience and thus it will have a fixed playing strength until the SW is updated or modified.
The AI playing strength is an inherent part of the statistical game model I have used, so fixing the AI rating after estimating it on old games has little influence on complexity.
da_Vinci said:
Perhaps the real opponents are the other humans ... so finishing (winning) faster than player X is considered a "win" over player X ... and perhaps the gap in number of turns would give a variable margin of outperformance (it is fixed in chess since all wins are equal, IIRC). Finishing (winning) slower is a loss ... also with a variable margin of underperformance. With appropriate ceilings and floors on those margins.
Still remains how to rate a loss to the AI ... if AI has an historical rating, a loss could be a set margin of underperformance to that, which should be simple (but is it accurate for the system?).
If you want to discuss modifications to the rating system I'm proposing, we should start by discussing the underlying model. The rating updates more or less follow from the model. Looking at the model also has the advantage that we can discuss how well it fits reality.
da_Vinci said:
Then how to rate an attempt with no submission ...
It must be rated as a loss - otherwise most players wouldn't submit any losses. Actually, there should be a small bonus for those who do submit a loss, as there already is in the existing ranking system.
da_Vinci said:
Hmm ... is it feasible to generate a ratings system based only on won games? Since those wins could be considered to be "wins" over all who finished slower, and "losses" to all who finished faster ... maybe consider the players immediately above and below the player to be rated (or the 2 or 3 above or below ... obviously issues for the top and bottom players)? Then no need to rate the AI at all.
That would remove any penalty for trying and failing, which might remove a lot of the reservations that are popping up in the thread.
dV
Since a loss is clearly a possible outcome of the game, I doubt that you can formulate any satisfactory game model that ignores it. It would also open the door to exploits - in some cases it would be better to lose or not submit than to submit a sub-standard win.
You can't have a ranking system that doesn't penalize a poor performance. Sometimes you lose, and that shows the limits of your skill - which is exactly what we are trying to measure. People who don't like to compete could be offered a "non-rated" game, as I have already suggested.
AlanH said:
Sounds reasonable to me. The AI is not the competitor here, the other players are. The AI is more like the golf course in this situation - a constant across one game, certainly, and actually a constant across many games. As dV says, it doesn't learn, until a new version of the game software is released.
The AI is certainly a competitor in the game itself, and that's why my game model has a notion of AI playing skill. The AI skill is constant for a given SW version and level, but that doesn't mean we shouldn't estimate it - either from old game data or by updating it like player ratings. The AI skill is inherent in the model and not something that can be removed without fundamentally changing the model. I think you may have gotten the impression that the AI rating is an additional feature complicating matters, but that is not the case.
Since we are discussing possible modifications of the system I think I should explain the game model in more detail. That would also ensure that we speak the same language.
In my model I introduce the concept of playing power (or skill), denoted P, and the concept of a total workload, denoted W, which is the work required to achieve a certain VC. The VC is then achieved when the playing power integrated over a number of turns, T, exceeds the required workload, i.e. when:
P*T >= W
The playing power P is not constant, since no person plays equally well every time. This is modeled by drawing P from a normal distribution with a standard deviation of 200 and a mean R, which is a parameter in the model (the rating). These happen to be exactly the same assumptions as in chess (Elo) rating. The workload W required to achieve victory will also vary depending on the map, the speed and the VC. I don't make any specific assumptions about the distribution of this variable.
So the victory date for the player in a particular game is:
T=W/P
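To make this concrete, here is a minimal sketch of one draw from the model. The rating and workload numbers are purely illustrative assumptions, not values from the proposal:

```python
import random

def sample_victory_date(rating, workload, sd=200):
    """Draw a playing power P ~ N(rating, sd) and return the victory
    date T = W / P.

    Returns None on a non-positive draw of P (a rare event for
    realistic ratings, where the model breaks down anyway).
    """
    p = random.gauss(rating, sd)
    if p <= 0:
        return None
    return workload / p

# Hypothetical numbers: a 1500-rated player facing a workload of
# 450000 "power-turns" finishes in about 300 turns on an average draw.
t = sample_victory_date(rating=1500, workload=450000)
```

A stronger player (larger R) draws a larger P on average and so tends to reach the VC in fewer turns, which is exactly the behaviour the formula encodes.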
Since the AIs are also participating in the game, there is some chance that one of them will finish first. Since the AI is playing the same map and speed, I have chosen to model the AI in the same way as the player, i.e. the victory date for the AI is:
T_AI=W/P_AI
Now the outcome of the game is:
Win in T turns if T <= T_AI
Loss in T_AI turns if T_AI < T
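The two draws and the comparison above can be sketched as a single simulation step. Again the ratings and workload passed in are illustrative assumptions:

```python
import random

def simulate_game(player_rating, ai_rating, workload, sd=200, rng=random):
    """Simulate one game under the model: both sides draw a playing
    power from N(rating, sd) and whoever covers the workload W in
    fewer turns wins. Returns ("win" or "loss", turns)."""
    p_player = max(rng.gauss(player_rating, sd), 1e-9)  # guard non-positive draws
    p_ai = max(rng.gauss(ai_rating, sd), 1e-9)
    t_player = workload / p_player
    t_ai = workload / p_ai
    if t_player <= t_ai:
        return "win", t_player
    return "loss", t_ai

result, turns = simulate_game(1600, 1400, 480000)
```

Note that under this rule the observed number of turns is always the winner's T, which matches how GOTM results are actually reported.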
Given a game result (a win or loss in T turns) it is then possible to derive estimates of the playing power in this particular game for both player and AI. The rating can then be updated as a running average of the estimated playing power with an exponential forgetting factor.

Anyway, this is where it starts getting technical, and in order to discuss modifications to the rating system it's much more relevant to address changes to the model. The key point is that the model should be able to explain exactly how a win or loss in T turns is generated as a function of the players' skills, since these are the readily observable data. The rating update is then "only" a matter of estimating the "skill" parameters in this statistical model.
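The "running average with an exponential forgetting factor" can be written as a simple recursive update. The factor 0.9 and the per-game performance estimates below are illustrative choices for the sketch, not values from the proposal:

```python
def update_rating(old_rating, performance, forgetting=0.9):
    """Exponentially weighted running average: each new per-game
    performance estimate shifts the rating by a fixed fraction,
    so older games fade out geometrically."""
    return forgetting * old_rating + (1 - forgetting) * performance

r = 1500.0
for perf in [1550, 1480, 1620]:  # illustrative per-game performance estimates
    r = update_rating(r, perf)
```

This is the same recursion chess federations approximate with a K-factor: a small (1 - forgetting) makes the rating stable but slow to react, a large one makes it responsive but noisy.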