OoS Bug Hunting

Vehem

Modmod Monkey
Joined
Nov 22, 2005
Messages
3,219
With Patch M, Xienwolf has included support for Gerike's OoSLogger. Basically this will help us to trace any causes of OoS problems that are being reported. It won't mean that we are able to instantly solve them, but it will give us a much better idea of what is going on.

In order to do this effectively however, we need to get the relevant feedback from players. With the OoSLogger enabled, any OoS occuring in Fall Further will create a file named "OOSLog.txt" in the Civ4 directory. That file tracks various values that are used to create the sync-checksums (Sync Value and Options Value as displayed in game) and will help to narrow down which are becoming divergent.

All we need from you is;

  1. Rename the log file to something unique - preferably the name of the host, the name of the player and the date such as "HostVehem-PlayerXienwolf-27-09-07-OOSLog.txt"
  2. Post them to this thread.

We will need logs from both sides of the divergence (at least one of the "master" version and one from the "different" version) - in games with 2 players that means both players need to post. In games with more players, it's easier if all available logs are posted as we're sure to get the necessary ones.

===

If you would like to help further and don't mind a bit of reading - take a look at this thread and learn a bit about what the OoS errors and checksums actually mean. The explanation there goes into far more depth than I'd ever seen and I learnt a lot reading it. If more of us understand the problem, more of us can help solve it.
 
Playing FF patch M with FfH2 33g I was getting OoS errors when ending the first turn. Of the displayed numbers with the OoS message: the sync numbers are the same, but the options numbers are different. I generated the OoS log files as suggested, used a diff utility on them and it reported they were equal.

Anyway, I noticed that there was no OoS (on the first turn) when no AI players were in the game. So I experimented. Any time I have the Malakim, Luchuirp, or the Ljosalfar as an AI player, I get an OoS error when ending the first turn. I do not seem to get this error with any other civs. I notice that they all have life mana granted by their palace, but am not sure of any other common factor.

Of course it all may be coincidence and something else entirely is at fault.
 
Reading through the thread for OOS logger, I doubt that he logs any information from the Options Checksum, based on the assumption that they simply CANNOT change during the course of the game.

Code:
int CvGame::calculateOptionsChecksum()
{
	PROFILE_FUNC();

	int iValue;
	int iI, iJ;

	iValue = 0;

	for (iI = 0; iI < MAX_PLAYERS; iI++)
	{
		for (iJ = 0; iJ < NUM_PLAYEROPTION_TYPES; iJ++)
		{
			if (GET_PLAYER((PlayerTypes)iI).isOption((PlayerOptionTypes)iJ))
			{
				iValue += (iI * 943097);
				iValue += (iJ * 281541);
			}
		}
	}

	return iValue;
}


So the only value checked is if each player has a gameoption set on or off. Those options are:

PLAYEROPTION_ADVISOR_POPUPS,
PLAYEROPTION_ADVISOR_HELP,
PLAYEROPTION_WAIT_END_TURN,
PLAYEROPTION_MINIMIZE_POP_UPS,
PLAYEROPTION_SHOW_FRIENDLY_MOVES,
PLAYEROPTION_SHOW_ENEMY_MOVES,
PLAYEROPTION_QUICK_MOVES,
PLAYEROPTION_QUICK_ATTACK,
PLAYEROPTION_QUICK_DEFENSE,
PLAYEROPTION_STACK_ATTACK,
PLAYEROPTION_AUTO_PROMOTION,
PLAYEROPTION_START_AUTOMATED,
PLAYEROPTION_SAFE_AUTOMATION,
PLAYEROPTION_NUMPAD_HELP,
PLAYEROPTION_NO_UNIT_CYCLING,
PLAYEROPTION_NO_UNIT_RECOMMENDATIONS,
PLAYEROPTION_RIGHT_CLICK_MENU,
PLAYEROPTION_LEAVE_FORESTS,
PLAYEROPTION_MISSIONARIES_AUTOMATED,
PLAYEROPTION_MODDER_1,
PLAYEROPTION_MODDER_2,
PLAYEROPTION_MODDER_3,


And I am not entirely certain WHY those are part of the gamestate. It seems like it should not matter at all to each of the other player's computers how you have your options set. If it is set for Workers to start Automated, then the routine which assigns them a task during automation ought to send information on what task they performed.


Anyway, let us know what the values are when your options are different values. At the least it can inform us roughly which/how many options are different for the both of you. This might be tied to the same issue which is causing Clams Pop-ups to happen repeatedly for some people (also a health granting resource, like Life Mana).
 
Could it be caused by someone getting a "first health resource" advisor pop-up? I can't imagine that causing an OOS, but it could be related to the bug I am having with being unable to disable advisor pop-ups.
 
Here's some data, although I do not know what it means.

Started two games with same options and got OoS with same option checksums for both games:
Host options oos options checksum: 5081619
Options on: Wait at End of Turn; Minimize Pop-Ups.
2nd computer oos options checksum: 13569492
Options on: Advisor Pop-Ups; Sid's Tips; Wait at End of Turn.

Started another two games changing 2nd computer's options to same as host. Again the oos option checksums were the same for both games:
Host options oos checksum: 4701604
Options on: Wait at End of Turn; Minimize Pop-Ups.
2nd computer oos options checksum: 13189477
Options on: Wait at End of Turn; Minimize Pop-Ups.

All games started with Malakim, Luchuirp, and Ljosalfar as AI civs (and Elohim and Bannor as player civs). All games generated OoS at end of 1st turn.

Are the graphics, audio, and other options included in the checksum?
 
Reading through the thread for OOS logger, I doubt that he logs any information from the Options Checksum, based on the assumption that they simply CANNOT change during the course of the game.

It is however, a desync of that checksum that I am experiancing in most games around said time. So its safe to say something in that assumption is wrong.

Edit: I wonder if it could somehow be related to dowsing's little bug O.o... just a thought.
 
Looking at it again, the numbers for the checksum are simply stupid. There is one number with a large multiplier which indicates the player checked, and another number with a smaller number indicating the option value checked. But both of them are 6 digit numbers!


It does seem to make more sense in my opinion to make the player based number a 6 digit number which is simple, like 300000, and the option based number a single digit simple number, like 7. Then both are prime numbers (in their scale) and you can count on no overlap between them.
 
Here's some data, although I do not know what it means.

Started two games with same options and got OoS with same option checksums for both games:
Host options oos options checksum: 5081619
Options on: Wait at End of Turn; Minimize Pop-Ups.
2nd computer oos options checksum: 13569492
Options on: Advisor Pop-Ups; Sid's Tips; Wait at End of Turn.

Started another two games changing 2nd computer's options to same as host. Again the oos option checksums were the same for both games:
Host options oos checksum: 4701604
Options on: Wait at End of Turn; Minimize Pop-Ups.
2nd computer oos options checksum: 13189477
Options on: Wait at End of Turn; Minimize Pop-Ups.

All games started with Malakim, Luchuirp, and Ljosalfar as AI civs (and Elohim and Bannor as player civs). All games generated OoS at end of 1st turn.

Are the graphics, audio, and other options included in the checksum?


Crunched the numbers on this one today. First listed checksums were:

Host options oos options checksum: 5081619
Options on: Wait at End of Turn; Minimize Pop-Ups.
2nd computer oos options checksum: 13569492
Options on: Advisor Pop-Ups; Sid's Tips; Wait at End of Turn.

The options listed match the Checksum that was given by the Host. The Checksum on the second computer is off by Player 9 having Advisor Popups enabled (iI = 9 & iJ = 0)


The numbers from the second set with both computers having the same options set do the same thing, the difference in the values is exactly iI = 9.



So.... somehow, Player 2's computer thinks that Player 9 (an AI) has decided to turn on Option 0. And in #erebus Moltov informed me that it was a game with 10 slots filled, making Player 9 the last AI slot (excluding the Barbarians of course).
 
Ok, did some more testing with numbers as they came up while Moltov played a game. Seems like there is an overflow issue with the Feats spilling data into the options. This would also explain how some people have seen the Relationship boost from Casting Trust without ever having done so, and others have noticed that one of their options changed without them personally having asked to change it.


Testing showed that having the Shrine of Sirona (sets Feat 32 to 1 at the start of each turn) resulted in activation of the Option "Right-Click Menu" (Option 16). The option was disabled again after casting Sirona's Touch (sets Feat 32 back to 0)

Visiting Bradeline's Well (the last listed Feat in the FeatInfos File) activated Feat 39 out of 40 for me. Meaning that the game thinks there should be 1 more Feat than there actually is.

After completing all non-BtS Feats (Trust through Bradeline's Well), I saved and loaded the game. All options claimed to have been loaded properly.

Sirona's Touch was not available, so that Feat didn't load properly. I no longer had a relationship boost from the other Civilizations, so there is another down the drain. Global Spell is available as well. Of all the Site Visit Feats, I was able to redo each but the last 2, Odio's Prison and Bradeline's Well.

Counted the entries in the XML and sure enough, they come out to be 39 total.


Going to test a slight reformatting of how I have things now, and if that doesn't work then I will attempt to mimic how DamageTypeInfos is handled (both DLL list and XML setup) so that the Description tags can be kept for the FeatPrereq fields on promotions. Hopefully this is the only "big issue" broken in the code right now so we can have a clean entry into 34 compat mode :)
 
So, one of the infamous Vista-XP interconnect OOS. Prehaps you can do something with it.

Me(Moltov) Hei(whicheverwaytheygo) and Kraken(ur...whatever) were all in the game.
Me and Hei had identical checksums, but Krakens was... Quite a bit off.
Anyway, the file: [edited out, wasnt actualy of this oos]

Forum upload didnt wana work. *shrugs*

Edit: Seems the logger for the OOS didnt log for me. Odd.
 
OOS log-ushram, rogue, me(hei) 051008

hope this helps ^^


(btw really appreciate the work you guys are putting in into smoothing out these MP issues... FfH is way too good a game for MP not to work that well =P)

(that said, O seems to be much improved over M in MP stability =D we'd try to continue this game later to see if we get into one of those chronic OOSes or its a one-off issue)
 

Attachments

  • OOSLog-naterogue.rar
    36 KB · Views: 69
OOS log-ushram, rogue, me(hei) 051008

hope this helps ^^


(btw really appreciate the work you guys are putting in into smoothing out these MP issues... FfH is way too good a game for MP not to work that well =P)

(that said, O seems to be much improved over M in MP stability =D we'd try to continue this game later to see if we get into one of those chronic OOSes or its a one-off issue)

Here is my log from the game
 

Attachments

  • OOS LOG hostushramnate.rar
    36.1 KB · Views: 70
odd... the one we put was the prog files\...\civ IV\bts\OOSlog
they were too big to attach up there so we zipped it up
 
Strange, both of your attachments claim to be a .rar file, but they open as a Civ4 Save File.

The file you want to attach should be:

C:\Program Files\Firaxis Games\Sid Meier's Civilization 4\OOSLog.txt

odd... the one we put was the prog files\...\civ IV\bts\OOSlog
they were too big to attach up there so we zipped it up

They downloaded as rar'd text files for me - I've uploaded both into your folder on the server, Xien. Might be that CFC was getting it's attachments crossed again.
 
I did a little differencing on them myself btw. It strongly suggests the spawn of a Gnossling triggered the OOS unless I'm mistaken.
That said, wow, a lot of stuff OOSed in the immediate time therein...
 
Ok, so the breakdown of differences (read the list and tell me if anything springs to mind as being tied to actions you recently took):

Next Random Seed was off. So one of you did a random number check that the other did not.

Ushram's differences from heisenburg's:


Records 1 extra deal having been negotiated during this game

According to Ushram's computer, player 1 (so not the host, the guy in the slot right after him) has 1 extra unit, who is worth 1 Asset and 1 Power. This shows up later as being a Gnosling (UnitAI_Recon). This gnosling is at map location (8,3), which tells you nothing much if you were playing on a wrapped map.

Ushram records Player 1's Adept as having a bonus XP point, this is probably the random check. There is also a warrior belonging to player 1 on that tile, and a Barbarian Giant Spider is 30 tiles south of it, with a barbarian goblin just north of the spider. Player 2 (the third human) has a worker about 15 tiles east of these barbarians (if this helps to discover where the city with the extra gnosling and a single warrior defender is).



Ushram goes on to record that Player 3 (first AI) has a warrior 1 square east of where heisenburg thinks it should be, and that Player 5 is getting 2 less food & production total (so possible a barbarian standing on a 2/2/0 tile, or at war with that warrior and HE is standing on the 2/2/0 tile)



Ushram thinks that player 5 has an extra Hunter (UnitAI_ATTACK_CITY) (this hunter is being trained, doesn't exist on the map yet). He also records a Hunter belonging to player 5 as being 4 tiles off location, and yet another hunter as being healed more than it ought to be, and a third being out of location. Plus a Worker with an extra point of experience (further cascade effect from the random number being off)

2 of Player 5's warriors are out of location, with another one being full health and no XP, when it should be half health with 5 XP

Player 6 has a workboat out of location (2 tiles north, 1 east).

Player 7 has a wolf 1 tile east of where he should be

Player 10 is missing a unit worth 13 Asset & 5 power (a Chanter - UnitAI_Explore. Had 8 XP in proper logs, but was still level 1, so must have just won a battle). Also has a scout 2 tiles west of proper, another 2 NW of proper, and a chanter with 4 extra XP.


Player 12 has a worker missing a point of XP, and a warrior who didn't move 2 tiles west (so is healed and in the wrong spot).

Player 14 has a hunter 1 tile east of proper, and another 1 tile NE. Also a worker with 1 XP missing, another worker with 1 XP too much, and a trio of hunters each 3 tiles off proper.

Player 15 has a 2 scouts 1 tile east of proper, and a slave missing an XP point and promotion (so he has had a turn since the error occured).


Essentially every barbarian is off location by a little bit. Notable oddities other than that are that Barbatos has 1 XP and 1 Promotion less for Ushram.





Overall.. tons of stuff that is very different, I would have to suspect that quite a bit of it stems from the random number being different though, and it all cascaded from there. Major change was a Chanter belonging to player 10 died to a warrior belonging to player 5 (most likely), but that could also just be a cascade effect from the RNG being off. Though I am not sure how Player 10 had a chanter gain some XP, unless the one which died happened to have Spirit Guide, since it is precisely half as much XP as the dead guy had.
 
Thats great analysis xienwolf, and exactly what we need to try to draw a pattern into what may be causing these.

I looked through the function that grants spellcaster xp and I cant see anything fishy in it or the random roll. Maybe it was in a roll before that?

ps anyone else find it funny that we are unable to keep states aligned in games where one of the people is named Heisenberg? No, just me? </nerd>
 
Top Bottom