Multi Player bugs and crashes - After the 16th of May

One user looking over their promotions shouldn't trigger any losses. Selecting an earned promotion could, and so could a recalc, but either way that would be something that stays in sync, wouldn't it? (Unless, once a promotion is taken, the ensuing check stays async. There's no message designed for that, since I would have assumed this part of the selection process was back in sync by then. Hmm... that might be something to look into.)

But if that were the case, I'd think this sort of issue would come up more often... I could be wrong. However, another possible issue could be, again, differing bug settings. There are a few bug options (opportunity fire, terrain damage, archery bombard and storms, to name a few) that could account for a unit having more or less strength on one machine than on the other.

Events can also commonly be culprits. Could one of the events that either gives out promos or heals a unit in the field be to blame?

Yes any of the above is possible I suspect (except opportunity fire anyway - that does a sync roll to determine if it hits, and that would show up as one side performing an opportunity fire roll, and the other not, which is not in the logs)
 
But wouldn't you have problems linking the two open games to each other in an IP connection where they'd have the same IP?
When I tried it it worked fine. As khh says only one side needs to listen for a connection.

How does one achieve this? (a simple re-run of the main executable doesn't start a second instance, at least for me)
You need to add "multiple" (without the quotes) as a command line argument to the shortcut.
 
So...anyone got any new format random logs from OOS (probably) caused by the city multi-threading to look at yet...?
 
Logs (random, OOS, BBAI) from an OOS involving a combat versus an invisible unit. Using the latest SVN with the multi-threading.

Edited to add: Adding another set of logs. It didn't create an OOS log (though it went OOS at the beginning of the turn) but attaching the random and BBAI logs.
 


This looks like the previous one - units apparently entering combat with different stats, resulting in different combat paths (and consequential rand de-syncing). It is 99.99% certainly not related to the multi-threaded city processing, so it's not really my area, or the issue I am focussed on right now (which is OOS due to the city processing).

@AIAndy - I think this is just a long-running existing OOS issue - can you tell anything more precise from the logs than the above? (The OOS log shows a difference in one unit's health, but I'm not sure whether that is a result of the combat desync or the cause of it.) The BBAI logs don't show anything obvious.

Given the differing combat path the only explanation I can see is different starting strengths (because 'iAttackerCombatRoll < iAttackerHitChance' passes on one machine and gives the attacker a round to process, but doesn't on the other). This could be due to an incorrect use of an async rand somewhere (e.g. - in promotion choice decisions), but I can't see one...
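
To illustrate the reasoning above, here is a minimal sketch of how a difference in starting strength alone can change the combat path and the number of synced rolls consumed. Everything in it is an assumption made for illustration: `SyncRand`, `resolveCombat`, the hit chance formula and the fixed 20 damage per round are hypothetical stand-ins, not the mod's actual combat code. Only the comparison `iAttackerCombatRoll < iAttackerHitChance` is taken from the post.

```cpp
#include <cstdint>

// Hypothetical stand-in for the game's synced random stream (a plain LCG).
struct SyncRand {
    std::uint32_t state;
    int rolls = 0;  // how many numbers have been drawn so far

    explicit SyncRand(std::uint32_t seed) : state(seed) {}

    int next(int sides) {
        state = state * 1103515245u + 12345u;
        ++rolls;
        return static_cast<int>((state >> 16) % static_cast<std::uint32_t>(sides));
    }
};

struct CombatResult {
    bool attackerWon;
    int rollsUsed;
};

// Simplified combat loop: one synced roll per round, 20 damage per hit.
// The hit chance depends only on the two strengths, so a strength that
// differs between machines changes which branch each round takes.
CombatResult resolveCombat(SyncRand& rand, int attackerStr, int defenderStr) {
    const int dieSides = 1000;
    const int iAttackerHitChance = dieSides * attackerStr / (attackerStr + defenderStr);
    int attackerHp = 100;
    int defenderHp = 100;
    const int startRolls = rand.rolls;

    while (attackerHp > 0 && defenderHp > 0) {
        const int iAttackerCombatRoll = rand.next(dieSides);
        if (iAttackerCombatRoll < iAttackerHitChance) {
            defenderHp -= 20;  // attacker gets the round
        } else {
            attackerHp -= 20;  // defender gets the round
        }
    }
    return { defenderHp <= 0, rand.rolls - startRolls };
}
```

If two machines call `resolveCombat` with the same seed but different strengths, the same roll can pass the check on one and fail on the other. Once the number of rounds differs, the two streams are left in different states, so every later synced roll diverges as well.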
 
One more combat round is likely to make a difference in the health so I assume it is more likely that it is the result of the desync and not the cause.
I think we should include more information in both the sync checksum calculation (CvGame::calculateSyncChecksum) and the OOS log.
In particular, it should cover which promotions each unit has, so that any difference in promotions shows up as a desync as soon as it happens and not only once the promotion makes a difference in combat or elsewhere.
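
A rough sketch of the idea, not the real CvGame::calculateSyncChecksum: the `UnitState` struct, the `mix` helper and the FNV-style mixing are all assumptions chosen for illustration. It only shows why folding the promotion flags into the checksum makes a promotion mismatch visible immediately.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, heavily reduced view of a unit's synced state.
struct UnitState {
    int id;
    int damage;
    std::vector<bool> hasPromotion;  // one flag per promotion type
};

// FNV-style mixing step applied to whole values (an assumption, not the game's formula).
inline void mix(std::uint64_t& h, std::uint64_t v) {
    h ^= v;
    h *= 1099511628211ULL;
}

// Fold the units into one checksum; promotions are optional so the
// effect of including them can be compared directly.
std::uint64_t checksumUnits(const std::vector<UnitState>& units, bool includePromotions) {
    std::uint64_t h = 14695981039346656037ULL;
    for (const UnitState& u : units) {
        mix(h, static_cast<std::uint64_t>(u.id));
        mix(h, static_cast<std::uint64_t>(u.damage));
        if (includePromotions) {
            for (bool has : u.hasPromotion) {
                mix(h, has ? 1u : 0u);
            }
        }
    }
    return h;
}
```

With `includePromotions` set to false, two machines whose units differ only in one promotion produce the same checksum and the mismatch stays hidden until combat. With it set to true, the checksums differ on the very next comparison.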
 
Agreed - I think the state dump in the OOS log looks like the right place (to list promotions). I'm rather snowed under with work on making more of the city pipeline asynchronous right now though, so I may not get to it for a while. Let me know if you get there first...
 
I'll do it.

Edit: Done. The revision I just committed has the promotions added both to the sync checksum and the OOS log.
 
@Koshling... I just sent you some logs from last night's playtest with the multi-threading back on. It did of course go OOS after loading and hitting the end turn.
 
See email - they don't seem to be from the most recent version (logs are still the old format ones)

Edit - Falcon's are, so it's definitely not a problem with the logging code
 
Rev 5578. Second OOS on the second combat in the game. Logs from both PCs and savegame attached.

I have found a definite problem from these logs. Next time I make a push to the SVN a fix (for the issue I can see - can't guarantee it's the only one by any means!) will be included.
 
The main weird thing I see is that the city name is not properly logged in the random log there.
One city seems to have Religion Spread rolls on one computer only and it might well be the same player 2 city that later builds a wanderer on the computer of player 1 but nothing on the computer of player 0.
 
The cause of the OOS is the sync'd rand thrown by CvUnitAI::AI_init() on unit initialization, which is using the global stream, but occurring on city training of units (which is multi-threaded). The lack of city names as stream names is also concerning - I'll sort both out.
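
As a hedged illustration of why a shared stream is a problem here: with one global stream, the number a city receives depends on where it falls in the processing order, which multi-threading does not fix. Giving each city its own stream removes that dependency. The `Lcg` type, both functions and the seed derivation below are hypothetical and only sketch the principle, not the actual fix.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical tiny random stream (a plain LCG).
struct Lcg {
    std::uint32_t state;
    int next(int sides) {
        state = state * 1103515245u + 12345u;
        return static_cast<int>((state >> 16) % static_cast<std::uint32_t>(sides));
    }
};

// Shared stream: each city's roll depends on how many cities rolled before it.
std::map<int, int> rollShared(std::uint32_t seed, const std::vector<int>& order) {
    Lcg global{seed};
    std::map<int, int> rollByCity;
    for (int cityId : order) {
        rollByCity[cityId] = global.next(100);
    }
    return rollByCity;
}

// Per-city streams: each city's roll depends only on the seed and its own id,
// so the processing order no longer matters.
std::map<int, int> rollPerCity(std::uint32_t seed, const std::vector<int>& order) {
    std::map<int, int> rollByCity;
    for (int cityId : order) {
        Lcg own{seed ^ (0x9E3779B9u * static_cast<std::uint32_t>(cityId + 1))};
        rollByCity[cityId] = own.next(100);
    }
    return rollByCity;
}
```

Calling `rollPerCity` with the orders {1, 2, 3} and {3, 2, 1} gives the same roll for every city, whereas `rollShared` hands the first draw to whichever city happens to be processed first.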

Edit:

Oops:

Code:
	virtual const wchar*	GetName(void) const
	{
		return m_pCity->getName().c_str();
	}

returns a pointer to an object it then implicitly destructs. My bad.
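
One possible way to avoid the dangling pointer, assuming (as the post implies) that getName() returns its string by value: keep a copy of the name in the object that hands out the pointer, so the buffer lives as long as that object. `City` and `CityRandStream` are hypothetical stand-ins, and `wchar_t` is used in place of the game's `wchar` typedef.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the city class; getName() returns a temporary copy.
class City {
public:
    explicit City(std::wstring name) : m_name(std::move(name)) {}
    std::wstring getName() const { return m_name; }
private:
    std::wstring m_name;
};

// Hypothetical stream wrapper that caches the name at construction.
class CityRandStream {
public:
    explicit CityRandStream(const City& city) : m_name(city.getName()) {}

    // Safe: the pointer refers to m_name, which lives as long as this object.
    const wchar_t* GetName() const { return m_name.c_str(); }

private:
    std::wstring m_name;
};
```

The trade-off is that a cached name goes stale if the city is renamed later, so the copy would need refreshing at that point. Returning the string by value from GetName() would be another option.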
 
While that certainly is something to fix, I don't think it is the actual desync reason here.
The birthmark is only on one computer because only one computer built the unit while the same player 2 on the other computer decided to build nothing (the OOS log shows nothing produced on the computer of player 0 while on the computer of player 1 the city first produced a wanderer and then is currently producing some building).

So the actual action that desynced the game probably happened on turn 17 or turn 19, each of which shows a bunch of random rolls on one computer only. At least the second has to do with AI building decisions.
 
Given the stream names are all NULL are you sure they are not in order within each stream (or at least that that is the first thing that isn't)? With the bugged stream names I don't think there is much clarity here - best just get another set once that is fixed I think.
 