Sirp,
I think you have to recognize that in an environment where we have literally hundreds of comparative examples for any given game, we can begin to develop a better understanding of the statistical impact of various potential exploits or cheating events.
The "undetectable" preknowledge events become detectable because they alter player behavior in the immediate game, and then again in other games over time. You cannot know something that you are not supposed to know and then expect your eye not to twitch in that direction at least once in a while.
We can use an example from the recent GOTM18 game results to make this process slightly better understood, even though it will still contain some uncertainty. In the GOTM18 game, the start position was specifically designed to encourage settling right on the starting square without moving, while at the same time potentially providing one piece of information to help detect players who might use foreknowledge of the map, gained from any number of sources, to alter the outcome of their game.
The image below shows these opening move positions for 87 games that were analysed in detail.
(thanks to RufRydyr for the graphic as well as the opening city placement analysis)
We expect that from any position a certain number of players will make any given move based on a variety of essentially random choices, so seeing players move to any of the surrounding squares is not unexpected. The 1's and 2's provide a basic background signature, which statistically means we could expect anywhere between 0 and 4 players to move into any given square with no statistical inference. The 70 players who settled at the starting position behaved exactly as expected, and the 2+1 and 2+1, plus up to 4 of the 11, would be totally within expectations. So a total of 70+2+1+2+1+4 = 80 out of the 87 games behaved just as expected, without considering the impact of any opening worker moves.
Moving the only worker to the 11 square would reveal the fish square, and following this reveal we would expect many players to make the additional move to the 11 square in a non-random way. Moving the worker to the 11 square would not be a primary strategic move, because we would actually expect most players to move the worker to the hill or to one of the four bonus grassland squares surrounding the start position. Again, the actual player moves confirmed these expectations, and only 5 players in the 87-game sample made opening worker moves onto the 11 square.
This process of elimination left us with a numeric expectation that 85 of the 87 opening move sequences would be within normal expectations, while 2 of the 87 games could perhaps be considered suspect (less than 2.3%). The open question at that point was which of the 4 random-move games and which of the 2 suspect games belonged to which of the 11 players who actually did move to the 11 square.
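The elimination accounting above can be sketched as a quick tally. The counts are the ones stated in this post; the variable names and the grouping into categories are mine, purely for illustration:

```python
# Tally of the 87 GOTM18 opening move sequences described above.
total_games = 87

settled_in_place = 70             # founded the city on the start square, as intended
background_moves = 2 + 1 + 2 + 1  # moves matching the 1/2 background signature
worker_reveal = 5                 # moved the worker to the 11 square first, saw the fish
random_of_eleven = 4              # of the 11 settling on the 11 square, allowed as random

within_expectations = (settled_in_place + background_moves
                       + worker_reveal + random_of_eleven)
suspect = total_games - within_expectations

print(within_expectations)                    # 85
print(suspect)                                # 2
print(round(100 * suspect / total_games, 1))  # 2.3 (percent), i.e. 97.7% confidence
```

Note that the 80 games from the previous paragraph are the first three categories minus the worker-reveal games (70 + 6 + 4 = 80); adding the 5 justified worker-reveal games brings the total within expectations to 85.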
Essentially, on its own this one move sequence could identify no bad games, but it could confirm that at least 97.7% of all the games were making valid opening moves. It could also identify a list of 11 games that could potentially be used as data points to compare across other similar situations, where we could identify one specific move in which knowing some hidden feature of the map in advance could have influenced the player's decisions.
All of this information set to the side, no other comparative gameplay event in the Civ3 world has access to the same volume and quality of data that we can use in the GOTM games to perform quality control and game performance analysis. These tools of comparing game events to similar events independently produced by players in totally separate game play experiences can tell us a great deal about how the game works and how the players respond to the game environment. As we get more and more data, we can extend our ability to predict what players might be expected to do, and also to determine the likelihood that a player would repeatedly do the one good but unexpected thing in a series of multiple events.
This actual example was used to analyze games that were submitted to the QSC19 event, and then several games that were not included in the QSC were validated against the expected 97.7% confidence level for moves that did not fall onto the 11-count position.