OOS Error after using setHasTech

JDPElGrillo

Hello, I have a couple of questions as I'm trying to get my mod to work for simultaneous multiplayer games. I've read through the tutorial here: http://forums.civfanatics.com/showthread.php?p=4609823#post4609823 , and I've looked at the differences in OOSLog.txt.

In my mod, Spiritual leaders discover the cheapest first-row tech unknown to them when they found their first city:

Code:
	def onCityBuilt(self, argsList):
		'City Built'
		city = argsList[0]
		pPlayer = gc.getPlayer(city.getOwner())
###########################################
#    Spiritual Trait Free Tech Start      #
###########################################
		if city.isCapital():
			if pPlayer.hasTrait(gc.getInfoTypeForString("TRAIT_SPIRITUAL")):
				techs = []
				team = gc.getTeam(gc.getPlayer(city.getOwner()).getTeam())
				for iTech in range(gc.getNumTechInfos()):
					if iTech == gc.getInfoTypeForString("TECH_FISHING") or iTech == gc.getInfoTypeForString("TECH_AGRICULTURE") or iTech == gc.getInfoTypeForString("TECH_HUNTING") or iTech == gc.getInfoTypeForString("TECH_THE_WHEEL") or iTech == gc.getInfoTypeForString("TECH_MINING") or iTech == gc.getInfoTypeForString("TECH_MYSTICISM"):
						if pPlayer.canResearch(iTech,1):
							iCost = team.getResearchLeft(iTech)
							if iCost > 0:
								techs.append((iCost, iTech))
				if techs:
					techs.sort()
					iTech = techs[0][1]
					team.setHasTech(iTech, True, city.getOwner(), False, False)
###########################################
#    Spiritual Trait Free Tech End        #
###########################################
		if (city.getOwner() == gc.getGame().getActivePlayer()):
			self.__eventEditCityNameBegin(city, False)	
		CvUtil.pyPrint('City Built Event: %s' %(city.getName()))

This works just fine in singleplayer (I know it's a bit inelegant to specify which techs are eligible, but I wanted to prevent corner cases where large teams of Spiritual leaders would result in them starting with multiple religions). However, in simultaneous multiplayer, I get this:



These are the only differences between the OOSLog.txt outputs. What's happening is that Player 0, who is Spiritual, discovers Fishing, and the city swaps from a 2/1/0 grassland forest to a 2/0/2 freshwater lake. Player 1 is not getting updated that Player 0 discovered a tech and swapped to a different tile. Now, to my understanding, onCityBuilt is within the global context, and I don't have any conditions based on active players in my code, so it should work just fine.

Two questions: Do I need to use ModNetMessage here, and if so, why?
 
I have created a function with a list in order of cost.

Code:
def onCityBuilt(self, argsList):
	'City Built'
	city = argsList[0]
	pPlayer = gc.getPlayer(city.getOwner())
	###########################################
	#    Spiritual Trait Free Tech Start      #
	###########################################
	if city.isCapital():
		iPlayer = city.getOwner()
		xxxxxxxxxx.doSpiritualCheck(iPlayer)


Try calling this function from onCityBuilt: See above.

Code:
def doSpiritualCheck(iPlayer):
	# By Orion Veteran
	pPlayer = gc.getPlayer(iPlayer)
	iTrait = gc.getInfoTypeForString("TRAIT_SPIRITUAL")
	
	dEarlyTechList = [
		"TECH_FISHING",
		"TECH_HUNTING", 
		"TECH_MINING",
		"TECH_MYSTICISM",
		"TECH_AGRICULTURE",
		"TECH_THE_WHEEL"
	]

	if not pPlayer.isNone() and not pPlayer.isBarbarian() and pPlayer.isAlive():
		if (pPlayer.hasTrait(iTrait)):
			for iszTechType in dEarlyTechList:
				iTechType = gc.getInfoTypeForString(str(iszTechType))
				
				if (pPlayer.getTeam() != TeamTypes.NO_TEAM):
					pTeam = gc.getTeam(pPlayer.getTeam())
				
					if not pTeam.isHasTech(iTechType):
						pTeam.setHasTech(iTechType, True, iPlayer, False, False)
						break
 
No dice. Your code works in singleplayer or non-simultaneous multiplayer, but it still produces an OOS error in simultaneous multiplayer.
 
I did a closer inspection of onCityBuilt from the first post. Everything looks fine until setHasTech. Following that function through the DLL reveals that it does not try to sync in a network game. This means the code relies on onCityBuilt being called at the same time on all computers. If for some reason it is only called for the building player, the game will go out of sync because the computers will disagree on which techs the team has.

The next question is how to determine whether it is called on all computers or not. If we know it is only called for the local player, then the answer is to call doSendCommand() or similar to transmit the command over the network. In this case the command is "team X gains tech Y". It might take a bit of searching to find the precise function call to do this, but it should be doable from Python.
 
Thanks for your help, Nightinggale! I was afraid that that might be the case, but I wasn't sure how to follow the function through the .DLL, as you put it. I do have a development environment set up, but I'm not too familiar with how to use it. In your opinion, would it be easier to make the change in the SDK so that setHasTech does sync in network multiplayer, or to use python to send out the command? How should I check other functions to see if they sync or not, short of trial-and-error playtesting?
 
Changing a vanilla function like setHasTech to handle network sync issues sounds dangerous. Vanilla has a well defined behavior for this function and it is the same in all mods. We better keep it that way.

The rules for network sync are actually quite simple.
  1. read only (get functions) can't desync anything
  2. write to memory (set functions) triggered on all computers will not cause desyncs (like doTurn events)
  3. set functions triggered on just one computer (like clicking a button) requires network code to avoid desync
The problem here is that I don't know if onCityBuilt is #2 or #3. I would assume #2, but the desync points towards #3.

The issue is that treating #2 as #3 will not only generate lots of network traffic, it will also cause other problems. Say a unit gains one XP and each computer then sends a packet telling everyone to give it +1 XP. With 2 computers in the game, it suddenly gets 2 XP because both computers tell everybody to add one.
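A toy simulation of the three rules and the double-XP pitfall may help. Nothing here is Civ4 API: `Machine`, `broadcast`, and the XP counter are hypothetical stand-ins for each computer's copy of the game state, written only to show why the same write must be applied exactly once on every computer.

```python
# Toy model only: each Machine is one computer's private copy of the game state.
class Machine(object):
    def __init__(self):
        self.xp = 0  # stand-in for any piece of synced game state

    def add_xp(self, amount):
        self.xp += amount


def broadcast(machines, amount):
    """Hypothetical network send: every machine applies the same change."""
    for m in machines:
        m.add_xp(amount)


def in_sync(machines):
    """True when every machine holds the same XP value."""
    return len(set(m.xp for m in machines)) == 1


# Rule 2: the event fires on every machine and each one writes locally.
rule2 = [Machine(), Machine()]
for m in rule2:
    m.add_xp(1)

# Rule 3 handled wrongly: the event fires on one machine, which writes locally only.
rule3_bad = [Machine(), Machine()]
rule3_bad[0].add_xp(1)

# Rule 3 handled correctly: the one machine that saw the event broadcasts it.
rule3_good = [Machine(), Machine()]
broadcast(rule3_good, 1)

# Rule 2 mistaken for rule 3: every machine saw the event AND every machine broadcasts.
double = [Machine(), Machine()]
for _sender in double:
    broadcast(double, 1)
```

The last case stays in sync but ends with 2 XP instead of 1, which is the failure described above. The `rule3_bad` case is the one that produces an OOS.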

This means the key to handling network stability is to have complete control over which lines of code run in sync on all computers and which run on just one. I can tell (mostly) in the DLL, but I don't actually know about the Python callbacks. For all I know, some callbacks are in sync while others aren't.

The way to figure out if functions are in sync or not is to read the code and determine if they are executed in sync or not. This could easily be an overwhelming task, particularly for somebody who failed to track just one function through the DLL. The alternative is to just playtest.
 
Hmm, thanks for the explanation. I'll try a couple of different things, but I'll stay away from the SDK for now. My main issue with playtesting is that it's a bit of a hassle: I either have to lean on my friends' patience, or wait for my relatively ancient laptop to boot up and run Civ4.
 
There is this trick with starting the game from a shortcut where you add mod=name. There is another useful command to add here, which is multiple. If you use this one, you should be able to run more than one instance of the game on the same computer and running two games at once in two windows should allow some network testing. I hope you have a big monitor.

More info http://forums.civfanatics.com/showthread.php?t=172014
 
Nightinggale said:
I did a closer inspection of onCityBuilt from the first post. Everything looks fine until setHasTech. Following that function through the DLL reveals that it does not try to sync in a network game. This means the code relies on onCityBuilt being called at the same time on all computers. If for some reason it is only called for the building player, the game will go out of sync because the computers will disagree on which techs the team has.

setHasTech is the only function I know of that can give a team a technology. If it is going to fail in a simultaneous multiplayer game, then what good is it? Many mods out there are not going to work in a simultaneous multiplayer game. OGI works for PBEM and hot seat. After considering how many functions just might fail, I think trying to run a simultaneous multiplayer game is just not worth the effort. I'm going to stick with PBEM.
 
I looked through the DLL code and there is no correct function to handle this. However we can easily make our own in python.

Code:
team.setHasTech(iTech, True, city.getOwner(), False, False)
Replace this line with
Code:
CyMessageControl().sendModNetMessage(0, iTech, city.getOwner(), team.getID(), 0)

In CvEventManager.py:
Code:
	def onModNetMessage(self, argsList):
		'Called whenever CyMessageControl().sendModNetMessage() is called - this is all for you modders!'
		
		iData1, iData2, iData3, iData4, iData5 = argsList
		
		# typo: iData is undefined here; the fix (iData1) is given later in the thread
		if iData == 0:
			# assign tech to team
			team = gc.getTeam(iData4)
			team.setHasTech(iData2, True, iData3, False, False)
		
		print("Modder's net message!")
		
		CvUtil.pyPrint( 'onModNetMessage' )

CyMessageControl().sendModNetMessage() is the way to send network data from Python. It takes 5 int arguments (32-bit signed ints, to be precise) and calls onModNetMessage with those 5 ints. The key point is that the call is made on one computer while onModNetMessage is executed on all computers. It seems to be the vanilla mechanism intended to let modders keep games in sync without modding the DLL itself.
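The flow can be pictured with a small stand-alone model. It does not use the game's API: `FakeNetwork`, `FakeClient`, and `send_mod_net_message` are hypothetical names that only imitate the behavior described above (one computer sends five ints, every computer runs the handler).

```python
class FakeClient(object):
    """One computer: holds its own copy of which team has which techs."""
    def __init__(self):
        self.team_techs = {}  # team id -> set of tech ids

    def on_mod_net_message(self, args):
        iData1, iData2, iData3, iData4, iData5 = args
        if iData1 == 0:
            # message type 0: team iData4 gains tech iData2
            self.team_techs.setdefault(iData4, set()).add(iData2)


class FakeNetwork(object):
    """Hypothetical transport that delivers each message to every client."""
    def __init__(self, clients):
        self.clients = clients

    def send_mod_net_message(self, d1, d2, d3, d4, d5):
        # Called on ONE computer; delivered to ALL of them, sender included.
        for client in self.clients:
            client.on_mod_net_message((d1, d2, d3, d4, d5))


clients = [FakeClient(), FakeClient()]
net = FakeNetwork(clients)
net.send_mod_net_message(0, 7, 0, 0, 0)  # tech 7 for team 0; values are arbitrary
```

After the single send, both simulated computers agree that team 0 has tech 7, which is the property the real message handler is relied on to provide.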
 
The problem may have nothing to do with the code posted, but rather with the mod itself.
Test the section in pure BTS to confirm if the code alone creates problems.
 
"The problem may have nothing to do with the code posted, but rather with the mod itself."
Maybe, but the main question is still very relevant: is onCityBuilt called on all computers?

"Test the section in pure BTS to confirm if the code alone creates problems."
That would be a good approach.

Often we know if we get OOS problems, but it doesn't tell what went wrong. I once managed to write a function, which would cause OOS, but only if it was called in doTurn. This meant it worked when I tested it, but occasionally it caused an OOS in the background, meaning the normal OOS bug hunting question (what did you do?) didn't help at all. We really should have better sync checking tools, but I have yet to figure out a good way to make those.

It should be noted that sync isn't checked constantly. Instead it is checked once each turn, meaning when it triggers it could be anything taking place last turn, including AI code, which didn't sync correctly.
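To see why a once-per-turn check makes the cause hard to pin down, here is a toy model. It is an illustration under assumptions, not the game's actual checksum: `state_checksum`, `run_turn`, and the dictionary-based state are all hypothetical.

```python
def state_checksum(state):
    """Hypothetical checksum: an order-independent summary of one machine's state."""
    return hash(tuple(sorted(state.items())))


def run_turn(states, actions):
    """Apply each (machine_index, key, delta) action, then compare once at turn end."""
    for machine_index, key, delta in actions:
        # None means "applied on every machine"; an index means "that machine only".
        targets = states if machine_index is None else [states[machine_index]]
        for s in targets:
            s[key] = s.get(key, 0) + delta
    return len(set(state_checksum(s) for s in states)) == 1


states = [{}, {}]
turn_actions = [
    (None, "gold", 5),
    (0, "trade_routes", 1),  # the unsynced write, buried among synced ones
    (None, "culture", 2),
]
synced = run_turn(states, turn_actions)
```

The check only reports that the turn ended out of sync. It does not say which of the three actions was responsible, so the offending write has to be found by other means.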
 
Code:
	def onCityBuilt(self, argsList):
		'City Built'
		city = argsList[0]
		if (city.getOwner() == gc.getGame().getActivePlayer()):
			self.__eventEditCityNameBegin(city, False)	
		CvUtil.pyPrint('City Built Event: %s' %(city.getName()))

Even in BTS, there is already code triggered in onCityBuilt, namely the city naming event.
Hence, the problem may not lie in the code itself, but rather in other components that modify CvEventManager.
 
Sorry for the delay, but I've got good news. I had to correct a typo:

Code:
if iData == 0:
	# assign tech to team
	team = gc.getTeam(iData4)
	team.setHasTech(iData2, True, iData3, False, False)

should be:

Code:
if iData1 == 0:
	# assign tech to team
	team = gc.getTeam(iData4)
	team.setHasTech(iData2, True, iData3, False, False)

But it works! And using the multiple startup option is a godsend, if only I'd known about it earlier. And now that I know about how onModNetMessage works, I should be able to tackle any remaining sync issues that I come across in the future. Thanks everyone for your help, really appreciate it :)
 
Heads up for anyone interested: we found another bit of OOS-inducing code, the culprit is:

Code:
city.changeExtraTradeRoutes(1)

Replaced with:

Code:
CyMessageControl().sendModNetMessage(1, iPlayer, iCity, 1, 0)

and

Code:
def onModNetMessage(self, argsList):
	'Called whenever CyMessageControl().sendModNetMessage() is called - this is all for you modders!'
		
	iData1, iData2, iData3, iData4, iData5 = argsList

	if iData1 == 1:
		# change city's trade routes
		pPlayer = gc.getPlayer(iData2)	
		city = pPlayer.getCity(iData3)
		city.changeExtraTradeRoutes(iData4)
 
"I had to correct a typo"
Leave it to me to write great code and then spoil everything with a typo like that :wallbash:

"And using the multiple startup option is a godsend, if only I'd known about it earlier."
There are many great features I would have liked to have known earlier. In fact, in retrospect, I would have gone into modding the day the SDK was released.

Lao Tzu said:
Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.
It fits very well here. I spent a bit of time fixing one problem and explained the solution. Now it seems that OOS problems can be fixed without me doing anything at all. That's great for modding stability. This has been bugging me for a while: it's not extremely hard to handle correctly, but quite a number of mods break multiplayer because it seems everybody has to reinvent the wheel. For the first time I actually managed to teach somebody the concept of how to get stable network games. I tried a bit before, but this time it actually resulted in more stable network code.

I noticed I didn't really explain why I set iData1 to 0. The concept is to give a unique ID to each type of network message, but it seems you figured it out. The DLL does something like that as well, though it uses enums for better readability.
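A minimal sketch of that idea, using named constants in place of the bare 0 and 1 used earlier in the thread. The constant names, the `HANDLERS` table, and the `log` list are hypothetical; in the mod, the handler bodies would call setHasTech and changeExtraTradeRoutes as in the posts above.

```python
# Hypothetical names for the two message types used in this thread.
MSG_GIVE_TECH = 0
MSG_CHANGE_TRADE_ROUTES = 1

log = []  # stand-in for game state so the sketch runs outside Civ4


def handle_give_tech(iTech, iPlayer, iTeam, _unused):
    log.append(("tech", iTeam, iTech, iPlayer))


def handle_trade_routes(iPlayer, iCity, iChange, _unused):
    log.append(("routes", iPlayer, iCity, iChange))


# One entry per message type; the first int of the message selects the handler.
HANDLERS = {
    MSG_GIVE_TECH: handle_give_tech,
    MSG_CHANGE_TRADE_ROUTES: handle_trade_routes,
}


def on_mod_net_message(argsList):
    iData1, iData2, iData3, iData4, iData5 = argsList
    handler = HANDLERS.get(iData1)
    if handler is not None:  # ignore IDs that belong to other mod components
        handler(iData2, iData3, iData4, iData5)


on_mod_net_message((MSG_GIVE_TECH, 12, 0, 0, 0))
on_mod_net_message((MSG_CHANGE_TRADE_ROUTES, 0, 3, 1, 0))
on_mod_net_message((99, 0, 0, 0, 0))  # unknown ID: no effect
```

Keeping the IDs in one place makes it harder for two features in the same mod to reuse the same number by accident.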

Speaking of "if only I'd known about it earlier", I would love to go back in time and explain this when Civ4 came out, so that modders would have known about network sync before they started breaking the mods. Civ4 uses a fairly standard approach, which means I knew the concept before Civ4 was even released.
 
Thanks again for your help, it really made a difference. Case in point, the trade route error came up very early on in one of our first MP games yesterday, and I was able to make that change, send out the revision, and host the save again. We managed to play to t150 without any further errors.
 
I also want to thank you guys, this was a very interesting read about OOS errors! :thumbsup:
 