
Civ4 Multiplayer Compatibility Guide

Discussion in 'Civ4 - Modding Tutorials & Reference' started by Gerikes, Oct 6, 2006.

  1. Gerikes

    Gerikes User of Run-on Sentences.

    Joined:
    Jul 26, 2005
    Messages:
    1,753
    Location:
    Massachusetts
    This guide is meant to shed light on issues that arise when trying to make your Civ4 mods multiplayer compatible. It explains how the Civ4 multiplayer system works, including some of the differences between Hotseat, PBEM, Internet, and Simultaneous games. It also discusses various ways to avoid and debug OOS (Out of Sync) errors that can so easily plague LAN and Internet-based mods. What I've collected are ideas about how Civ4 should work, gathered through experimentation or from other people on the board. There is still the possibility for mistakes, so don't take most of it as absolute, but the majority of it has served as my guiding principles, and I've been able to use it successfully in making Civcraft a solid multiplayer experience.

    It's pretty long, but it's intended for people who want to be super serial about their mods working with multiplayer.


    First and foremost, I would like to introduce this idea: As far as I’ve seen, XML files, so long as they remain equivalent on each machine that plays the mod, can NOT cause OOS errors. If your files differ from one machine to the next, you’re going to have problems, but in many cases these discrepancies will be caught in the game lobby. As for graphics and sound files, those too will not cause OOS problems. If your game is experiencing OOS problems, the problem is most likely in the Python or C++ code. So, unless you’re REALLY interested, XMLers and graphics/sound guys can skip this one.

    I wrote this guide because I felt that it was a field that I was particularly strong in where others may not be, and I know how thankful I am for those who write guides and tutorials about topics that I am not so strong in. I want to thank all those people who take the time out of creating their own mods and projects to make tutorials, be they graphical, coding or simply theory. And I encourage those who find that they have a particular strength, especially one that doesn’t seem to be well-known or well-understood so far in the modding community, to come out and share it to help better the entire community.

    Ok, enough banter.

    Most of this guide assumes that you have knowledge of Python, and have had some experience using it to mod Civ4, since many examples are from the CvEventManager.py file. Although there might be some C++ samples, they will probably not be very complicated, and understanding the individual lines of code won’t be as important as understanding what the code is doing, which I hope to explain.


    -----------------------------------


    Table of Contents:

    Section 1: Introduction to Multiplayer Games

    This section will concentrate on the ideas of multiplayer gaming on a general scale. It talks briefly about how multiplayer games communicate, how some data needs to be sent to other computers while some can be calculated locally, how random numbers can be generated on many different computers so that they all produce the same number, and what exactly is meant by game-state, checksum, and OOS error. It is meant to give you a general overview that is not game-specific, but uses Civ4 for all its examples.
    This is meant to be a quick introduction for those who have never dealt with these concepts. Some people might be able to skip this section, although I highly recommend a quick read through for anyone who isn’t too sure. I promise to try to keep it short.

    Section 2: Civilization 4 Multiplayer Concepts:

    This section gives a summary of the various multiplayer game styles (Hotseat, PBEM, Simultaneous, etc.) and how each one works. It explains how the onBeginGameTurn, onBeginPlayerTurn, onEndGameTurn, and onEndPlayerTurn functions work for each one, and how order is determined. In most styles of gameplay, they work the same, but some, especially Simultaneous, have some peculiarities.

    Section 3: Introduction to OOS errors: Global vs. Local Context

    This section will use two phrases I have completely made up for Civ4 modding: “Global Context” and “Local Context.” The idea is to define a terminology for how code is written, so that by knowing what context you’re writing your code in, you can be confident that what you’re writing won’t cause OOS problems.

    Section 4: The concept of “Active Player” and the getActivePlayer functions.

    This section explains what the concept of being the “Active Player” of a game means. Most OOS problems will result from improper use of the active player functions, so it’s important to understand what the results are actually telling you, as well as how to use the results properly.

    Section 5: Debugging OOS Errors

    No matter how hard you try, eventually, if you make a big-enough mod, you’ll probably end up with OOS problems. This section gives tips on how to go about debugging such errors, and provides a script that will probably be useful in this task.



    Note: My ability for formatting is pretty crappy, you might want to try the word version:
    Second Note: I've made changes over the mon...erm, years, nothing large, just fixing stuff to hopefully make it easier to read. However, the word document has not been updated, so you're better off sticking with the crappy forum post.
     


    Section 1: Introduction to Multiplayer Games


    The internet is SLOW! There, I said it.

    I don’t care how fast you can send your order for the new Tom Clancy novel online to the central distribution center at Amazon, for real-time games, the internet is slow. Without a doubt, in multiplayer games that are played on-line, the network is the bottleneck (in other words, the network is the place where data will be traveling slowest, and will cause faster components, such as your computer’s processor, to be wasted since it will be waiting for the network to finish transmitting data). When you have a bottleneck, you do one of two things:

    1.) Fix the bottleneck to run faster.
    2.) Try to minimize the use of the bottleneck.

    Obviously, the first is out of the question, unless EA were to buy up enough wires and servers to create their own little world-wide network that caters to their games where the routing tables would be static and the traffic would be low. Something tells me that’s not going to happen.

    So, as a result, game-makers will make sure to use the internet as little as possible. The result is games that send over the network only the pieces of data that absolutely need to be transmitted. So, for example, you’re not going to see games where the entire data of the model, skins, etc. are sent over the net every time a unit is created. These things are packaged and shipped to every computer during installation (or through patches), and then run right from the computer without the need to download the content during the game.

    What you end up with are games where multiple players are all playing their own little games, and whenever their game deviates from everyone else’s (normally because they changed something, like moving a unit), they send a small message about what they did. In the example of moving a unit, the message sent over the internet doesn’t say how to change the animation, or send data on what the unit looks like when it’s moving, or what the repercussions are for moving into a plot. It won’t even send what plot the unit is moving from, since we assume that all players’ computers are in the same state. Even if the unit is moving into a square to attack a unit, the message that the player is sending would be no different. In every case, it’s simply “This player moved this unit into this plot”.




    In the end, what is happening is that all computers use this knowledge, and they all go through the same steps: Each generates the path, each triggers the animation on their screen, each checks to see if there are units in the plot, each does attacks, each updates the visibility of the surrounding plots, etc. Thus, by adding code to give gold to a player who kills another player to the onCombatResults function, you are assured that the code you use will be called on every computer, so that every computer will have that gold added to the player.
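
    As a rough illustration of the idea above (and not Civ4's actual networking code), the sketch below models each computer as holding its own copy of the game-state and applying the same small "move" message. The names GameCopy and apply_move, the message fields, and the gold bonus are all made up for this example.

```python
# Hypothetical sketch: every machine keeps its own copy of the game-state
# and applies the same tiny message. None of these names come from Civ4.

class GameCopy:
    def __init__(self):
        # unit id -> (x, y) plot coordinates
        self.unit_plots = {7: (3, 4)}
        # player id -> gold
        self.gold = {0: 100, 1: 100}

    def apply_move(self, message):
        # The message carries only who moved which unit to which plot.
        # The starting plot is not sent; every copy already knows it.
        unit_id = message["unit"]
        self.unit_plots[unit_id] = message["to"]
        # Consequences are worked out locally on each machine.
        if message["to"] == (5, 5):  # pretend an enemy was killed here
            self.gold[message["player"]] += 10


# Three machines, all starting from the same state.
machines = [GameCopy() for _ in range(3)]

# Player 0 moves unit 7; this one small message is all that is "sent".
move = {"player": 0, "unit": 7, "to": (5, 5)}
for machine in machines:
    machine.apply_move(move)

# Every copy ends up identical without any state being transmitted.
states = [(m.unit_plots, m.gold) for m in machines]
all_in_sync = all(s == states[0] for s in states)
```

    Because the gold is added inside code that every machine runs in response to the same message, all three copies agree afterwards.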

    However, not all code for every computer will act EXACTLY the same in every case. For example, there is code that puts a message on your screen when you win a battle, and other code that puts a message on your screen when you lose a battle. When a unit is moved by a player, the result of the battle is calculated independently on all computers, just as above has shown. All this code is going to be exactly the same. However, somewhere the code will need to be different, since on one screen a victory message should be displayed, whereas for another player a losing message should be displayed, and still for others no message will be displayed. Thus, in this well-oiled machine of multiple, distant computers all calculating the same exact thing, there will be slight changes in what is being calculated and executed. It is in those small pockets of code that OOS errors typically breed.

    To that end, I now attempt to define the “Game-state.” The Game-state is all of the values that make up the game world. In the example above, it included the X and Y coordinates of the plot that the unit moved to, the players who owned the units, the HP and combat values of each of the units, the promotions and bonuses they received, etc. Whenever any of these values changes on one system, it must be changed on every other system.

    Not everything in your computer’s memory is considered as part of the game-state, however. If you took all the models of warriors and turned them into models of monkeys, you wouldn’t have changed the game-state, but rather the interface. The game-state is the direct opposite of the interface (the graphics, screens, etc). The interface takes all the binary data of the game-state (unit locations, promotions held, city names) and puts it into something we can understand (placing a unit’s model on a certain part of the screen, putting a little icon of a promotion next to their name, placing the text of the city onto the map below the city).

    Now, as shown before in the warrior to monkey example, you can change the interface on your computer all you want without notifying other computers, since it’s just part of the interface. The problem that sometimes happens, though, is changing the game-state and not notifying the other computers.

    Before I continue, I should probably note that in Civ4, most of the time you don’t need to “notify” other players of game-state changes, as it’s handled for you. In most cases you’re writing the code that is run when the computers are responding to an event. So, don’t get worried if you realize you’ve done pages of Python changes and haven’t once called any method called “NotifyOtherPlayersOfMyChanges()” or something like that, because it doesn’t exist. (There are some very advanced cases where you would need to call code on your own to pass a message to other computers, but this will be handled in other sections.)

    Now, one question you might be asking is how different computers can all do the battles on their own. Surely, if each computer is generating random numbers, wouldn’t each computer have a different result to any computations that require random numbers? The answer is no, and to understand why you have to understand how computers make random numbers.

    A computer can do exactly what you tell it, and nothing more. If you say, “Show me the number one”, it will show you the number one. If you say, “Give me all the prime numbers between one and one hundred”, it will ask you what a prime number is, which you will then describe in mathematical terms (a number only divisible by itself and one) and it will go off and do that. If you tell a computer, “Give me a random number”, it will say, “What’s a random number?” Here’s a little secret that computer developers have had for years: we can’t really define to a computer what a random number is well enough for it to go ahead and give us one.

    So, the random seed and pseudo-random generation algorithms were developed. It goes like this: you give the computer a number (the random “seed”) and a mathematical computation. Then, whenever you ask for a random number between two numbers (say, 0 and 99), it does the computation you defined on the seed you defined. Part of the result will be reduced to fall within the range you specified, and part of the result will be used to create a new seed. That new seed is then used for the next random number. The idea is, so long as you start with the same random number seed, and use the same series of ranges, you will always get the same results. So, if multiple computers all start from the same seed, and all use the same series of ranges, they will all generate the same results. For more info on this, check out the wiki on Pseudo-random Number Generators (http://en.wikipedia.org/wiki/Pseudorandom_number_generator).
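
    To make the seed idea concrete, here is a minimal sketch of a pseudo-random generator. It is a textbook linear congruential generator with commonly used constants, chosen for illustration; it is an assumption of mine and not the algorithm or constants Civ4 itself uses. TinyRand is a made-up name.

```python
# Minimal linear congruential generator, for illustration only.
# The constants are assumptions; Civ4's own generator may differ.

class TinyRand:
    def __init__(self, seed):
        self.seed = seed

    def get(self, upper):
        # Advance the seed with a fixed computation...
        self.seed = (self.seed * 1103515245 + 12345) % (2 ** 31)
        # ...and reduce part of the result into the range 0..upper-1.
        return (self.seed >> 16) % upper


# Two "computers" starting from the same seed.
machine_a = TinyRand(seed=2006)
machine_b = TinyRand(seed=2006)

# Both ask for the same series of ranges.
rolls_a = [machine_a.get(100) for _ in range(5)]
rolls_b = [machine_b.get(100) for _ in range(5)]

same_rolls = rolls_a == rolls_b                 # identical sequences
same_seed = machine_a.seed == machine_b.seed    # identical final seeds
```

    Nothing is sent between the two generators, yet they agree on every roll because the starting seed and the series of requests are the same.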

    This has two profound consequences: The first is that the synchronized pseudo-random number generators (like MapRand and SorenRand) are considered part of the game-state. If you do so much as make a random number on one computer using them, and don’t do that same random number on another computer, the seeds will be different, and thus the game-states will be different.

    The second is that we explain why making sacrifices to the RNG gods doesn’t do anything, because the actual acronym is PRNG, or if you’d like, SPRNG. So, make sure you make the sacrifices to the Goddess of SPRNG.
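
    The first consequence can be shown with a small sketch. TinyRand below is a made-up stand-in for a synchronized generator such as MapRand or SorenRand (its constants are an assumption, not Civ4's), and the "extra draw" stands for any random call made on only one machine.

```python
# Illustration only: one extra random draw on one machine is enough to
# make the seeds, and so the game-states, diverge.

class TinyRand:
    def __init__(self, seed):
        self.seed = seed

    def get(self, upper):
        self.seed = (self.seed * 1103515245 + 12345) % (2 ** 31)
        return (self.seed >> 16) % upper


machine_a = TinyRand(seed=42)
machine_b = TinyRand(seed=42)

# Both machines resolve the same battle with the same draw.
machine_a.get(100)
machine_b.get(100)
in_sync_before = machine_a.seed == machine_b.seed

# Machine A alone draws once more, for example inside code that only
# runs for the player sitting at that screen.
machine_a.get(3)
in_sync_after = machine_a.seed == machine_b.seed
```

    After the extra draw the two seeds no longer match, so every later "random" result on machine A will differ from machine B's.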


    So finally, we have all computers running for the most part independently of each other, only stopping to change the game-state when it or another computer says to make a small change, which they all then go about doing. This involves massive data crunching, making the same random numbers, and getting the same results. However, something can still go wrong. The networking code (which, count your lucky stars, you never will have to debug) could have something wrong, or a modder might put something that changes the game-state in code that is only running on one computer (to put a message on the screen, for example). Civ4 still needs to check for OOS, because if you only realize that something’s wrong many, many turns after the actual problem occurs, debugging is going to be tough.

    Luckily, Civ4 checks for OOS a lot. Safe to say, when you do something wrong, it will probably pick it up right away. But, how exactly does it do this? Simply sending every single piece of data over the network won’t work, because remember, the internet is slow. Instead, a checksum is used.

    You’ve used checksums before. When you give the cashier a fiver to pay for an item that is $4.37, do you count the change exactly? If you’re not a penny pincher like me, you take a look at the change and check that you have at least two quarters, maybe some other silver coins, and there should be a few pennies in there too.

    A checksum is a way to detect discrepancies without having to check all the data. If we wrote a computer program to do the checksum of the change, it might calculate exactly how many coins it should have in total, so rather than knowing exactly how many quarters, how many nickels, how many dimes, and how many pennies, you just have one number: how many coins.
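
    The coin-counting checksum can be written out in a few lines. This is only a toy to show the idea of reducing many values to one number; the function name and the coin dictionaries are invented for the example.

```python
# Toy checksum from the change example: reduce a handful of coins to one
# number, the total coin count. Purely illustrative.

def coin_count_checksum(coins):
    # coins maps a coin name to how many of that coin were handed over
    return sum(coins.values())


# Change from $5.00 for a $4.37 item is 63 cents.
expected = {"quarter": 2, "dime": 1, "penny": 3}
received = {"quarter": 2, "dime": 1, "penny": 3}
short_changed = {"quarter": 2, "penny": 3}
six_pennies = {"penny": 6}

matches = coin_count_checksum(received) == coin_count_checksum(expected)
mismatch = coin_count_checksum(short_changed) != coin_count_checksum(expected)
# Same number of coins, very different value: this checksum is fooled.
fooled = coin_count_checksum(six_pennies) == coin_count_checksum(expected)
```

    Comparing one number is cheap, but a checksum this crude cannot tell six pennies from the correct change.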

    Obviously, there are disadvantages: first off, it’s not always right. If you’re expecting X coins, and get that many pennies, the computer will think everything’s okay. The accuracy of a checksum is only as good as the algorithm used to make it. Typically, pretty accurate checksums can be created pretty simply. You can actually see the entire checksum process in the SDK:


    Code:
    int CvGame::calculateSyncChecksum()
    {
    	PROFILE_FUNC();
    
    	CvUnit* pLoopUnit;
    	int iMultiplier;
    	int iValue;
    	int iLoop;
    	int iI, iJ;
    
    	iValue = 0;
    
    	iValue += getMapRand().getSeed();
    	iValue += getSorenRand().getSeed();
    
    	iValue += getNumCities();
    	iValue += getTotalPopulation();
    	iValue += getNumDeals();
    
    	iValue += GC.getMapINLINE().getOwnedPlots();
    	iValue += GC.getMapINLINE().getNumAreas();
    
    	for (iI = 0; iI < MAX_PLAYERS; iI++)
    	{
    		if (GET_PLAYER((PlayerTypes)iI).isEverAlive())
    		{
    			iMultiplier = getPlayerScore((PlayerTypes)iI);
    
    			switch (getTurnSlice() % 4)
    			{
    			case 0:
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getTotalPopulation() * 543271);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getTotalLand() * 327382);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getGold() * 107564);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getAssets() * 327455);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getPower() * 135647);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumCities() * 436432);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumUnits() * 324111);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumSelectionGroups() * 215356);
    				break;
    
    			case 1:
    				for (iJ = 0; iJ < NUM_YIELD_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).calculateTotalYield((YieldTypes)iJ) * 432754);
    				}
    
    				for (iJ = 0; iJ < NUM_COMMERCE_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getCommerceRate((CommerceTypes)iJ) * 432789);
    				}
    				break;
    
    			case 2:
    				for (iJ = 0; iJ < GC.getNumBonusInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumAvailableBonuses((BonusTypes)iJ) * 945732);
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBonusImport((BonusTypes)iJ) * 326443);
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBonusExport((BonusTypes)iJ) * 932211);
    				}
    
    				for (iJ = 0; iJ < GC.getNumImprovementInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getImprovementCount((ImprovementTypes)iJ) * 883422);
    				}
    
    				for (iJ = 0; iJ < GC.getNumBuildingClassInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBuildingClassCountPlusMaking((BuildingClassTypes)iJ) * 954531);
    				}
    
    				for (iJ = 0; iJ < GC.getNumUnitClassInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getUnitClassCountPlusMaking((UnitClassTypes)iJ) * 754843);
    				}
    
    				for (iJ = 0; iJ < NUM_UNITAI_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).AI_totalUnitAIs((UnitAITypes)iJ) * 643383);
    				}
    				break;
    
    			case 3:
    				for (pLoopUnit = GET_PLAYER((PlayerTypes)iI).firstUnit(&iLoop); pLoopUnit != NULL; pLoopUnit = GET_PLAYER((PlayerTypes)iI).nextUnit(&iLoop))
    				{
    					iMultiplier += (pLoopUnit->getX_INLINE() * 876543);
    					iMultiplier += (pLoopUnit->getY_INLINE() * 985310);
    					iMultiplier += (pLoopUnit->getDamage() * 736373);
    					iMultiplier += (pLoopUnit->getExperience() * 820622);
    					iMultiplier += (pLoopUnit->getLevel() * 367291);
    				}
    				break;
    			}
    
    			if (iMultiplier != 0)
    			{
    				iValue *= iMultiplier;
    			}
    		}
    	}
    
    	return iValue;
    }
    



    You don’t have to understand this code, but as you can see, different values are retrieved, and each has a mathematical operation done to it. For each player, the results are accumulated in iMultiplier, which is then multiplied into the iValue variable; iValue is finally returned to the exe as the checksum. The final result of that checksum will be sent to each computer for comparison. In the end, discrepancies will be found, and the chances of two different game-states producing the same checksum are very small, all from just a single number. This number, not coincidentally, is what you see next to the player’s name when it goes OOS.

    There are actually two checksums: one for the game’s sync, and one for player options. In the rest of the guide, when talking about these checksums, I will just be covering the sync checksum, as it will probably discover more problems, but note that there is also the other checksum, whose calculations you can find in the SDK under the function CvGame::calculateOptionsChecksum(). Also note that the sync checksum doesn’t check for everything. For example, what promotions each unit has is not checked. Thus, there is still the possibility of a discrepancy happening without the game picking up that it’s gone OOS; the game will continue until the discrepancy finally snowballs, through a domino effect, into something that is noticed. If on one computer a unit has one promotion, and on another that same unit has a different promotion, it may not be recognized until the unit gets into a battle where those promotions have some effect. The end result will probably be different, and when the battle ends the game will finally notice it, even if this might be game-years later. In most cases, however, the sync checksum does enough to detect most problems. Even though civics are not included in the checksum, discrepancies in civics will most likely be detected instantly, since changes will normally change a player’s total commerce and yield outputs, which are included in the checksum.
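
    The promotion blind spot can be sketched with a cut-down checksum. The function below is a simplified stand-in of my own, loosely modeled on the unit loop in the SDK listing (it borrows the X, Y and damage multipliers from it); the dictionary fields and the promotion strings are placeholders.

```python
# Simplified stand-in for a sync checksum that looks at unit position and
# damage but not promotions. Not the real SDK function.

def toy_sync_checksum(units):
    value = 0
    for unit in units:
        value += unit["x"] * 876543
        value += unit["y"] * 985310
        value += unit["damage"] * 736373
        # unit["promotions"] is deliberately not included
    return value


# The same unit on two machines, with different promotions.
machine_a = [{"x": 2, "y": 9, "damage": 0, "promotions": ["COMBAT1"]}]
machine_b = [{"x": 2, "y": 9, "damage": 0, "promotions": ["DRILL1"]}]

states_differ = machine_a != machine_b
checksums_match = toy_sync_checksum(machine_a) == toy_sync_checksum(machine_b)

# Later the promotions change a battle's outcome, so damage diverges,
# and only then does the checksum notice.
machine_a[0]["damage"] = 20
machine_b[0]["damage"] = 35
detected_later = toy_sync_checksum(machine_a) != toy_sync_checksum(machine_b)
```

    The two states are already different at the start, but the mismatch only shows up once it leaks into a value the checksum actually reads.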

    The problem with checksums is that you can tell if there’s a problem, but you don’t know exactly what the problem is. In later sections, I will describe how to deal with figuring out exactly what the problem is.

    To summarize this section, here are the key things you should probably know:

    - The internet is slow.
    - Games try to send as little information as possible over the internet, as it is the bottleneck.
    - The game-state is the part of the game that must remain equivalent on every machine in order to keep the games “in sync.” This is opposed to the interface, which can be different on all machines.
    - When a game makes a change to the game-state, it’s typically a small event. That small event is passed as a message to other computers that are in the game, and each computer will, on its own, calculate all the remaining changes to the game-state.
    - OOS errors occur when one computer has a game-state that is different from another’s game-state. They are detected by using a checksum that takes the entire game-state and reduces it to a single number for comparison with other computers’ checksums, a process which happens continuously throughout the game.
     

    Section 2: Civilization 4 Multiplayer Concepts:

    This section will go a bit more in depth with multiplayer concepts that are specific to Civ4. It will explain the differences between the various multiplayer types, and give an overview of what happens every game turn and in what order. Unfortunately, I haven’t had much experience with Pitboss, so I won’t touch on that. If you’re just looking for OOS advice, you can probably skip over this chapter, but it might hold some good tidbits, especially for those trying to figure out simultaneous play. Mainly, I talk about how the turn-based system works in relation to simultaneous mode (onBeginPlayerTurn and onEndPlayerTurn both happen at the end of the turn in all game types except simultaneous, where they happen at the beginning). I also show why testing a mod in the PBEM and Hotseat multiplayer game types does not test for OOS problems.

    First off, in order to understand what happens in a multiplayer game, we’ll need a base to compare it against, so I’ll take a look at a typical singleplayer turn.


    A turn begins with player 0 (or, if player 0 is not alive, the first alive player in the list of players). The first thing that is done on a player’s turn is setting them to “active” (see section four for more about what it means to be “active”). Note that when this player’s turn becomes active (hence “starting” their turn), the onBeginPlayerTurn is NOT called. Actually, onBeginPlayerTurn is called at the end of a player’s turn. I’ll get back to this in a minute.

    Rather, the first thing done is that the doTurn() for each of the player’s selection groups is called. This will, in effect, call doTurn() on all units for that player, since every unit belongs to a selection group (even if that selection group contains only that one unit). A quick look through doTurn() shows us that it does basic things for each unit, such as resetting their movement points, doing any healing, and taking the unit out of automation if it’s in danger. This, however, does not do any automation (such as a worker doing its work, or a unit moving to the next plot in its queue). Any warnings (“An enemy spearman has been spotted outside your army of tanks!”) are then displayed to the screen and finally, the little bit that tells the computer to start cycling through your various selection groups is set to true, thus setting in motion the actual turn.

    “So now it calls ‘onBeginPlayerTurn’, right?” Hold on, not yet.

    Now the player’s turn actually has started, and all their construction popups and advisor ramblings and manual unit movements are done one at a time by the player. Finally, the player has no more moves left.

    “So, ‘onBeginPlayerTurn’ should have been called by now, right?” Nope.

    Once the player has no more units to give orders to, any units that need to have automoves done will do so. These units may also perform their automoves directly after a player forces the turn to end, or when the player runs out of time on their turn.

    The player, finished with their turn, hits enter or time runs out or whatever. In the SDK, the correct functions are called to make all of the automoves start. Sometime after this, the player’s CvPlayer::doTurn() function is called. This function starts with onBeginPlayerTurn being called (finally), and ends with onEndPlayerTurn being called. In between, it does things such as looping through all the player’s cities and calling each city’s doTurn function, and updating gold, revolution and anarchy turns, etc.

    So, it’s interesting to note that the onBeginPlayerTurn actually happens at the end of the entire turn. It’s called “begin” because it happens at the beginning of the CvPlayer::doTurn SDK function for the player.

    When the player’s turn is finished, they tell the next player in line that it’s their turn to start. Then, when that player’s done, the next player starts. The entire time, there is always at least one player who’s currently the “active” player.

    Throughout the entirety of the game, the CvGame::update function is repeatedly called. This happens multiple times per second. One thing it checks for is how many players have their turn considered “active”.

    Code:
    if (getNumGameTurnActive() == 0)
    {
    	if (!isPbem() || !getPbemTurnSent())
    	{
    		doTurn();
    	}
    }
    
    The getNumGameTurnActive() function returns the number of players whose turn is “active”. In all game types (with the exception of Simultaneous, see “Simultaneous Games” below), this will always be zero or one, because as soon as one person no longer has their turn active (bringing the total to zero), they simply tell the next player that their turn is active (bringing the total back to one). The last player, the Barbarian, will simply make itself inactive. During the CvGame::update function, when the number hits zero, it’s assumed that the last player has finished their turn, and thus all players have done their turn for this game turn. Thus, the game’s doTurn SDK function is called. Like the player’s doTurn SDK function, this starts with the onBeginGameTurn and ends with the onEndGameTurn. So, realize that the onBeginGameTurn Python event doesn’t happen before player 0’s turn, but rather at the end, after the last player’s turn, as the first thing done in the CvGame::doTurn() SDK function.

    Thus, the game does what it needs to do (spawn barbarians, update global warming, etc.), then increments to the next turn, and tells the first player to start their next turn.
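
    The ordering described above can be summarized with a small standalone model. It only records the order in which the Python events would fire in a singleplayer game turn; the helper functions are my own simplification and do not mirror the SDK's real signatures.

```python
# Rough model of the singleplayer turn order described in this section.
# It just logs the sequence of steps and events.

log = []

def player_do_turn(player_id):
    # Stand-in for CvPlayer::doTurn(): begins and ends with the events.
    log.append("onBeginPlayerTurn %d" % player_id)
    # cities, gold, anarchy turns, etc. would be processed here
    log.append("onEndPlayerTurn %d" % player_id)

def game_do_turn(turn):
    # Stand-in for CvGame::doTurn(): begins and ends with the events.
    log.append("onBeginGameTurn %d" % turn)
    # barbarian spawning, global warming, etc. would happen here
    log.append("onEndGameTurn %d" % turn)

def play_one_game_turn(turn, player_ids):
    for player_id in player_ids:
        log.append("player %d gives orders" % player_id)
        log.append("player %d automoves" % player_id)
        player_do_turn(player_id)  # only now do the player events fire
    # No player's turn is active any more, so the game's doTurn runs.
    game_do_turn(turn)

play_one_game_turn(0, [0, 1])
```

    Reading the log top to bottom shows each player's orders and automoves coming before that player's onBeginPlayerTurn, and onBeginGameTurn coming only after the last player has finished.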


    Okay, so that was just Singleplayer. Now we get to see what makes the other game types different. First off, the most important thing about multiplayer games: the human player is not always player 0 anymore. So, if you have been getting used to using player 0 as the human player all the time, stop it. This works for singleplayer, but not for multiplayer.
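
    One way to avoid the "player 0 is the human" habit is to ask each player whether it is human. The sketch below uses a stand-in class so it runs outside the game; the method names isHuman and isAlive are modeled on the game's player objects, while FakePlayer and human_player_ids are hypothetical names for this example.

```python
# Stand-in classes so the sketch runs outside the game. In a real mod the
# player objects would come from the game's Python API instead.

class FakePlayer:
    def __init__(self, player_id, human, alive=True):
        self.player_id = player_id
        self.human = human
        self.alive = alive

    def isHuman(self):
        return self.human

    def isAlive(self):
        return self.alive


def human_player_ids(players):
    # Ask each player whether it is human instead of assuming ID 0.
    return [p.player_id for p in players if p.isAlive() and p.isHuman()]


# A multiplayer setup where the humans sit in slots 1 and 3.
players = [
    FakePlayer(0, human=False),
    FakePlayer(1, human=True),
    FakePlayer(2, human=False),
    FakePlayer(3, human=True),
]

humans = human_player_ids(players)
```

    In this setup player 0 is an AI, so any code that hard-codes ID 0 as "the human" would act on the wrong player.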

    Hotseat:

    Hotseat is having one computer where players take turns behind the computer (during their turn, they’re in the “Hotseat”). Due to the nature of the game being played on one computer throughout its entirety, it is the closest to Singleplayer that you’ll get out of all the different multiplayer modes.

    The only real difference between Hotseat and a normal Singleplayer game is what happens at the start of a turn and how diplomacy works. At the start of a Hotseat turn, the screen is black and only a single popup to display the player’s entry box for their password is on the screen. Once that’s all done and the player enters their password, then their turn begins just like normal.

    For diplomacy, unlike singleplayer where your diplomatic treaties are handled with the AI in a real-time way, any time you try to make a deal it will be put up for consideration. On the turn of the player that you’re dealing with, they can accept, reject, or make a counter-offer. Their response will not be known to you until your next turn (or, since this is a Hotseat game, until the player actually talks to you, since presumably you guys can, you know, talk and communicate with each other. You do communicate in real life, correct?)

    However, all these details are handled for you. The only problem you need to worry about when making a Singleplayer mod compatible with Hotseat games is that you don’t assume that the human player is going to be player 0. Even if you do things that would cause OOS errors in a networked game, a hotseat game, because it naturally runs on one computer at one time, will not go OOS (more on this is in Section 3). However, that doesn’t mean that there isn’t room for screwing up. Using the wrong player ID, for example, will mean your event that gives a player a super-monkey unit for collecting 100 bananas will give the monkey unit to the wrong player. The game won’t go OOS, since the actual event of creating the unit will happen on the one computer, just for the wrong player.

    PBEM:

    Play By E-Mail (PBEM) games work like Hotseat games, with the main exception being that an e-mail is sent out with the saved game, so the game will typically be played on multiple computers. When a player finishes a turn, they set into action the next player’s turn. This continues until it hits the next human player. At that point, the game is saved just before that player’s turn starts, and the save is sent to the player.

    Diplomacy works the same way as in Hotseat. You don’t get the instant gratification of a Singleplayer game, but once again these issues are not things you need to worry about.

    Finally, just like Hotseat, the game cannot go OOS, but you can mess up how things work. A Singleplayer game might have given you the correct message of “You have received a super-monkey”, because you were human player 0, and you never made the AI build super-monkeys. But in PBEM, if human player 3 receives a super-monkey, and you told the message to show for player 0, then the message will be saved in the save game, and when player 0 opens it, they will be happy to know that they received a super-monkey without even trying.
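
    To make the wrong-player-ID pitfall concrete, here is a minimal standalone Python sketch. It does not use the Civ4 API: the Game class, add_message and the two award_super_monkey functions are hypothetical stand-ins for the per-player message log that ends up in a Hotseat or PBEM save.

```python
# Standalone sketch (no Civ4 API). All names here are hypothetical.

class Game:
    def __init__(self, num_players):
        # One message inbox per player ID, as it would be stored in a save.
        self.inbox = {pid: [] for pid in range(num_players)}

    def add_message(self, player_id, text):
        self.inbox[player_id].append(text)


def award_super_monkey_wrong(game, owner_id):
    # Bug: assumes the human is always player 0 and ignores owner_id.
    game.add_message(0, "You have received a super-monkey")


def award_super_monkey_right(game, owner_id):
    # Fix: address the message to the player the event is actually about.
    game.add_message(owner_id, "You have received a super-monkey")


wrong = Game(4)
award_super_monkey_wrong(wrong, owner_id=3)   # player 0 gets the message

right = Game(4)
award_super_monkey_right(right, owner_id=3)   # player 3 gets the message
```

    Neither version can go OOS on a single machine, which is exactly why this kind of bug tends to survive until somebody other than player 0 sits down to play.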



    Networked Multiplayer:

    Here is where the fun begins. Networked Multiplayer games happen on multiple computers at the same time. Unlike the previous game types, anything that happens on one computer that changes the game-state must be replicated on the other computers in the game. Thus, even if you’re player 0, and it is player 1’s turn, if player 1 moves a unit, that unit has to be moved on your computer as well. As described in Section One, player 1’s computer will send out a message to all connected computers that it moved the unit. Both of your computers respond to the message and set into action what happens when a unit moves.

    This is different from PBEM, where the action is done only on player 1’s computer and the results are stored in the save file, so that when your turn comes, you open it and have all the completed results. In a networked game, by contrast, all the changes to the game happen in real time.

    Along with the problems you can have in Hotseat and PBEM games where you are using the wrong player ID, another problem creeps up: the dreaded OOS error. This happens when one computer does something to the game-state without telling the other computers about it. In PBEM and Hotseat this is avoided because only one computer at a time has the up-to-date game-state. More on OOS errors in Section Three.

    Simultaneous games:

    You might be wondering why I went through the entire single-player example that I did, and just give a few paragraphs for the others. It seems as if I kinda went overboard on explaining Singleplayer, huh?

    Well, fear not. Simultaneous games are a whole different breed to begin with, and you should probably understand what’s going on if you want to support them with your mod.

    First off, unlike Singleplayer games, where the game tells the first player to make their turn active, and this continues down the line one at a time, in a Simultaneous game ALL players are set active at the beginning of a game turn. You might recall that things such as unit movements are done at the beginning of the player’s turn, whereas things such as city turns, research updates and anarchy turns are done at the end. When two players finish a research or wonder at the same time, the tie goes to the player who is first in the turn order. In simultaneous games, everyone goes at the same time, so whose turn is run first?

    A smart person would say that it’s the person who ends their turn first. After all, all the things such as city production and research updates are done during their doTurn, called at the end of the turn, correct? Well, don’t worry if you didn’t think this, because it’s wrong anyway.

    In order to keep things fair, you have to realize that the AI will probably finish its moves first, because it can think many, many times faster than you (even if it’s not many, many times smarter than you). But even in a game without AI, if two players are racing for a wonder, wouldn’t each player just try to end their turn as quickly as possible and be the first one done?

    In order to stop things like this, Firaxis made this decision: in a simultaneous game, all players’ turns are done at the beginning of the game turn, just before the unit turns are done. So now, since all players have their turn start at the same time (as opposed to ending at different times), each player and AI can be given an equal chance in a tie for finishing a wonder or research first.

    The key is that, unlike normal games where player 0 always gets to start their turn first, all the player IDs are “shuffled”. Then, players “start” their turn in the shuffled order, so player 0 is not always first, and it’s a matter of luck as to who wins if two civs finish at the same time. Of course, all of this happens so fast that players think it’s instantaneous, but as a modder you know better. The code is below.

    Code:
    void CvGame::doTurn()
    
    ...
    ...
    	
    	[b]if (isMPOption(MPOPTION_SIMULTANEOUS_TURNS))[/b]
    	{
    		shuffleArray(aiShuffle, MAX_PLAYERS, GC.getGameINLINE().getSorenRand());
    
    		for (iI = 0; iI < MAX_PLAYERS; iI++)
    		{
    			iLoopPlayer = aiShuffle[iI];
    
    			if (GET_PLAYER((PlayerTypes)iLoopPlayer).isAlive())
    			{
    				GET_PLAYER((PlayerTypes)iLoopPlayer).setTurnActive(true);
    			}
    		}
    	}
    ...
    
    As you can see, the emboldened code handles the special case where the game being played is a simultaneous game. If so, it makes each player active. Inside the CvPlayer::setTurnActive SDK function, we have this code:

    Code:
    void CvPlayer::setTurnActive(bool bNewValue, bool bDoTurn)
    ...
    ...
    if (bDoTurn)
    {
    	if (GC.getGameINLINE().getElapsedGameTurns() > 0)
    	{
    		if (isAlive())
    		{
    			if (isHuman())
    			{
    				if (getTeam() == GC.getGameINLINE().getSecretaryGeneral())
    				{
    					if (GET_TEAM(getTeam()).getSecretaryID() == getID())
    					{
    						for (iI = 0; iI < GC.getNumVoteInfos(); iI++)
    						{
    							if (!(GC.getVoteInfo((VoteTypes)iI).isSecretaryGeneral()))
    							{
    								GC.getGameINLINE().setVoteTriggered(((VoteTypes)iI), false);
    							}
    						}
    					}
    				}
    			}
    
    			[b]if (GC.getGameINLINE().isMPOption(MPOPTION_SIMULTANEOUS_TURNS))
    			{
    				doTurn();
    			}[/b]
    			doTurnUnits();
    		}
    	}
    	if ((getID() == GC.getGameINLINE().getActivePlayer()) && (GC.getGameINLINE().getElapsedGameTurns() > 0))
    	{
    		if (GC.getGameINLINE().isNetworkMultiPlayer())
    		{
    			gDLL->getInterfaceIFace()->addMessage(getID(), true, GC.getDefineINT("EVENT_MESSAGE_TIME"), gDLL->getText("TXT_KEY_MISC_TURN_BEGINS").GetCString(), "AS2D_NEWTURN", MESSAGE_TYPE_DISPLAY_ONLY);
    		}
    		else
    		{
    			gDLL->getInterfaceIFace()->playGeneralSound("AS2D_NEWTURN");
    		}
    	}
    
    	doWarnings();
    }
    
    So, as you can see, when a player’s turn is set to “active”, their “doTurn” is called. This happens at the beginning of their turn, not at the end as is the case in other game types. Also interesting to note is that in Singleplayer only one player has their turn active at a time, while in Simultaneous multiple players will be active. The same strategy of determining when all players have finished their turn still holds. If 18 players have their turn active at the start, they will finish one by one until no players remain with their turn active, and as a result the game goes through its CvGame::doTurn SDK function. Note that the CvGame::doTurn SDK function is always called at the end of a game turn, even in Simultaneous games.

    One thing to learn from this is that if you are doing something in the onBeginPlayerTurn function that depends on the players always being processed in order from 0 through 18, you’re going to have some Simultaneous trouble.
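
    The effect of the shuffle on tie-breaking can be sketched in a few lines of standalone Python. This is only a model of the idea, not the SDK code: the function names are made up, and a seeded random.Random stands in for the game’s own random number generator (getSorenRand in the excerpt above).

```python
import random

def sequential_order(num_players):
    # Sequential games: player IDs are processed in ascending order.
    return list(range(num_players))

def simultaneous_order(num_players, rng):
    # Simultaneous games: the IDs are shuffled before each doTurn is run.
    order = list(range(num_players))
    rng.shuffle(order)
    return order

def wonder_winner(order, finishers):
    # The first finisher encountered in processing order wins the tie.
    for pid in order:
        if pid in finishers:
            return pid
    return None

rng = random.Random(2006)   # stand-in for the game's random number generator
finishers = {2, 5}          # players 2 and 5 complete the wonder on the same turn

seq_winner = wonder_winner(sequential_order(8), finishers)
sim_winners = {wonder_winner(simultaneous_order(8, rng), finishers)
               for _ in range(200)}
```

    In the sequential case player 2 always wins the tie; over many shuffled turns both players get their share of wins. Any code that quietly assumes “player 0 is processed first” breaks in the same way.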



    One example I’ll give here dealing with the difference between Singleplayer and Multiplayer (specifically Simultaneous games) is from the Civcraft: SC mod (I promise I’ll try not to plug it too much in this tutorial, but this example was too good to pass up). For those not familiar with Starcraft, a Terran player can build a Nuclear Silo, inside of which a Tactical Nuke can be built. Unlike in Civ4, the nuke cannot simply be targeted anywhere within its range; rather, a covert ops “Ghost” unit must “mark” the nuke site with a laser. Some small amount of time passes between when the ghost first marks the site and when the nuke lands, giving the enemy a chance to hear the siren (“Nuclear Launch Detected!”) and try to find the ghost (which is typically invisible without a unit around to detect it). If they kill the ghost, or if the ghost cancels the nuke while it is in mid-flight, then the nuke is lost in space and never lands on its target.

    Translating this to a turn-based game, the idea is that a nuke should land one or more turns after the warning is first given. My original thought was to send up the warning as soon as the nuke gets launched (that is, as soon as the ghost unit makes the mark to have it launched). The nuke would land on the player’s next turn. Now, here’s the million dollar question: will this work in all multiplayer situations?

    First, let’s see how it should work in Singleplayer. Recall that in Singleplayer, a player’s doTurn is called at the end of their turn. So, if I were to place code in onBeginPlayerTurn or onEndPlayerTurn that makes the nuke land for any ghost that is currently targeting a site, then the nuke would land the very same turn. Also, in a simultaneous game, the player could just launch the nuke a second before the turn timer ends, and the nuke would land at the beginning of the next turn (remember, the doTurn of all players is done at the beginning of the turn in Simultaneous mode), giving nobody any chance of reacting.

    So, instead, I would just make it wait a turn. The siren goes up, and a counter is created for the player of that nuke that starts at zero. At the end of the turn, the counter gets incremented (from zero to one). All players would have one turn to “find the ghost” or at least try to get their units out of the way of the blast. On the player’s next turn, the counter is incremented from one to two, and because two is the magical nuke number, the nuke lands. In multiplayer, the same concept applies. Now, will this work in simultaneous mode?

    When the player launches the nuke, the siren goes up immediately. This means that if the player launches the nuke as the very first thing they do in the simultaneous turn, the other players will have the whole turn to react. That turn ends, and the next begins. At the beginning of the next turn, the counter is incremented (from zero to one), and once again the players have a whole turn to react. That turn ends, the next turn begins, the player’s nuke counter goes up to two, and the nuke lands. But, did you catch what happened?

    If the player launches the nuke at the beginning of the simultaneous turn, there are essentially two turns for players to react, whereas if they launch it at the end, players will have essentially one turn to react. Some mod makers might be okay with this, and if so that’s great, but to me simultaneous mode isn’t there to make you time your movements in a real-time manner; it’s just supposed to make the turn-based game go faster. At its root, it’s still supposed to be turn based. As such, when a player who plays the game in Singleplayer or any non-simultaneous multiplayer mode sees that there is one turn to react to a nuke, I want them to believe that the same rule applies in simultaneous mode.

    As a result, what I ended up doing is making the warning siren appear during the player’s doTurn. That way, no matter when during a player’s simultaneous turn they launch the nuke, the siren will go up exactly one full turn before the nuke hits. Of course, this may not be the best choice for your mod, and you might have liked it better the other way, but I guess making mods is a lot like how Sid Meier views games: “A series of interesting choices.”
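
    The difference between the two siren designs in simultaneous mode comes down to simple arithmetic, sketched below in standalone Python. The function names are hypothetical, and launch_time is an assumed convention: the game turn number plus the fraction of that turn already elapsed when the ghost marks the target (3.0 is the very start of turn 3, 3.75 is late in turn 3).

```python
import math

def warning_with_immediate_siren(launch_time):
    # Siren goes up at launch. The counter then ticks at the start of the
    # next two game turns (0 -> 1, then 1 -> 2), and the nuke lands on 2.
    impact = math.floor(launch_time) + 2
    return impact - launch_time

def warning_with_doturn_siren(launch_time):
    # Siren is raised by the owner's next doTurn (start of the next game turn);
    # the doTurn after that lands the nuke.
    siren = math.floor(launch_time) + 1
    impact = siren + 1
    return impact - siren

early = warning_with_immediate_siren(3.0)    # launched first thing in turn 3
late = warning_with_immediate_siren(3.75)    # launched late in turn 3
fixed_early = warning_with_doturn_siren(3.0)
fixed_late = warning_with_doturn_siren(3.75)
```

    With the immediate siren the defenders get anywhere from just over one turn to two full turns of warning depending on launch timing; with the siren tied to doTurn they always get exactly one.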


    That concludes this section on the various multiplayer game types. Here are the things to remember:

    - If a mod works in singleplayer, it doesn’t necessarily mean that it will work in any multiplayer types.
    - If a mod works in singleplayer, hotseat, and PBEM, it doesn’t necessarily mean it will work in the other multiplayer types.
    - Simultaneous games should work mostly the same as other network-type games, so long as you don’t do anything tricky with the ordering of the doTurns.
    - In all games, the unit’s CvUnit::doMove SDK function, which resets movement and other per-unit data, is called at the beginning of a player’s turn.
    - In all but simultaneous games, the CvPlayer::doTurn SDK function for a player, which contains the python onBeginPlayerTurn and onEndPlayerTurn, is called at the END of the turn. The CvGame::doTurn SDK function, which includes the python onBeginGameTurn and onEndGameTurn, is called after all players have finished their turn.
    - In simultaneous games, the CvPlayer::doTurn SDK functions for all players are called at the very beginning of the game turn. All players are considered “active” until they end their turn. The CvGame::doTurn is called, similar to other multiplayer games, after everyone is done.
    - Knowing how the various Multiplayer systems work is not absolutely required for a modder, but can prove useful in realizing how gameplay might change in unexpected ways.
     
    Section 3: Introduction to OOS errors: Global vs. Local Context



    The biggest problem that specifically plagues multiplayer games is OOS errors. This section describes what is meant by a game going OOS, and defines two terms I’ve completely made up to help describe the common situation that leads to OOS problems. It’s recommended that you read at least the first section of this tutorial before devouring this section. If you’re deciding to skip it, then at least know this: the root of all OOS errors is that changes are made to the game-state on some computers, but not others.

    So what is an OOS error?

    Recall that in Section One we talked about the concept of the game-state, and how, rather than constantly sending huge amounts of data back and forth when the game-state changes (remember, the Internet is slow?), the computers instead send small events that tell what has happened and let all the heavy number-crunching happen on every computer in the game simultaneously.

    This controlled chaos is easily disturbed by just a single wrench thrown into the machine. One computer that does a calculation wrong will cause a domino effect in which the further calculations of the game are wrong. If somehow your unit ends up with one hit point less on your computer than on another computer, it may not affect this battle, but it might affect the outcome of the next battle. Then suddenly you have your unit surviving a battle where someone else has your unit losing. Next thing you know you’re moving that unit that is still alive on your computer, a message is sent to the other player that says you’re moving the unit, the other computer thinks that unit is dead, gets scared, and shuts off like an ostrich sticking its head in the sand.

    Ok, so maybe it doesn’t get scared and turn off. But the problem is still there. In Section One, we discussed how checksums are used to try to detect these errors as soon as possible. The game does this by taking the game-state and summarizing all the data into one number that can be compared to the numbers calculated by all the other computers in the game. When one computer sees that its number is different from another’s, they throw red flags and sound alarms (actually, Civ4 doesn’t make an alarm sound; us modders get enough stress just seeing those bold red letters, and with an annoying siren sound effect there would be a lot of broken keyboards and monitors).

    When the OOS errors pop up, every player has a number next to their name. That number, actually, is the checksum that they calculated. The game says to look at these numbers, find the player whose number is wrong, and have them reconnect. The idea is that they will have the game-state of the other computers loaded in to replace their broken one, and the game can continue. This is, obviously, a hassle, and to avoid the actual OOS errors is a much better solution.
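
    The checksum idea can be illustrated with a small standalone Python sketch. This is an assumption-laden model, not Civ4’s actual algorithm: the game-state is reduced to a tiny dictionary, CRC32 from the standard library stands in for whatever summary the game really computes, and find_out_of_sync is a hypothetical helper that does what the OOS screen asks the players to do by eye.

```python
import zlib

def game_state_checksum(state):
    # Serialize in a fixed key order so equal states always give equal numbers.
    canonical = repr(sorted(state.items())).encode("utf-8")
    return zlib.crc32(canonical)

def find_out_of_sync(checksums):
    # checksums maps player ID -> checksum computed on that player's machine.
    # Returns the players whose number disagrees with the most common one.
    values = list(checksums.values())
    majority = max(set(values), key=values.count)
    return sorted(pid for pid, c in checksums.items() if c != majority)

state_a = {"unit_7_hp": 60, "player_0_gold": 120}
state_b = {"unit_7_hp": 60, "player_0_gold": 120}
state_c = {"unit_7_hp": 59, "player_0_gold": 120}   # one hit point off

checksums = {
    0: game_state_checksum(state_a),
    1: game_state_checksum(state_b),
    2: game_state_checksum(state_c),
}
odd_ones = find_out_of_sync(checksums)   # player 2 should reconnect
```

    Note that a checksum only tells you that the states differ, not where or why, which is what makes tracking down the offending code the hard part.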

    Because games go OOS when two computers are connected to the same game but have different game-states, we can safely say that it’s impossible for Hotseat and PBEM games to go OOS. This is because at any one time, the Hotseat and PBEM games are only being played (and more importantly, only having their game-state changed) on one computer. Whatever that computer says is the law. Even if it has some code that in a networked game would cause an OOS error, because there is no other computer to compare the data with, it will always just assume that it’s correct.

    So, OOS errors are simply when one player’s game-state goes unexpectedly different to another’s in a networked multiplayer game. So, how do we avoid this?


    First, let me define my two new terms:

    Global Context:
    Code can be said to be within a “Global Context” of the game if, whenever that code is called, it is also called on all the other computers in the game. For example, all the functions within CvEventManager.py are in the global context. Whenever these functions are called, you can be assured that in a networked multiplayer game the function is being called on all computers, and thus any changes you make to the game-state will be made on all computers.

    Most of the code you write is probably something that stems from an event. For example, you might have something happening in your mod that stems from a call to the onEndPlayerTurn function. Since this code is in the global context, it is safe to add gold to that player using gc.getPlayer(iPlayer).changeGold(50), since on every computer in the game that function is being called with the same arguments, and thus the same result will happen. The game-state will stay synchronized.

    Local Context:
    Code can be said to be within a “Local Context” of the game if the above is not true. Thus, if every time the code is called you can guarantee that code is NOT being run on other computers, it is in the local context. Likewise, if you have code that SOMETIMES runs in sync with other computers, it will be in the local context.

    One example of code that runs in the local context is screens. When you bring up the Civic advisor screen, code is being called to show that screen. Since that screen is being shown on your computer, but not anyone else’s, you can be quite sure that the code used to create the Civic advisor screen is in the local context. Similarly, the code used to handle events that happen within that screen is also in the local context, since the handleInput function can only be called by manipulating that screen. Making changes to the game-state there, such as calling CyPlayer.revolution with the new civics, will cause the game to go OOS. The reason is that the revolution function makes changes to the game-state that will not also be made on other computers. In order to make this work, Firaxis put in the CyMessageControl system, which allows such things to be delegated to the message system, ensuring that if the game is multiplayer the other players will get the message of the event. I won’t get into the CyMessageControl system, but if you’re interested check out this post, written by MatzeHH:

    http://forums.civfanatics.com/showpost.php?p=4300972&postcount=8
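
    The general idea behind routing a local action through a message system can be sketched in standalone Python. To be clear, this is not the CyMessageControl API: Machine, Network and both click handlers are hypothetical stand-ins that only show why “send a message that everyone applies” keeps the game-state in sync while “change it right here” does not.

```python
# Hypothetical stand-ins; this is not the CyMessageControl API.

class Machine:
    def __init__(self, name):
        self.name = name
        self.civics = {0: "Despotism", 1: "Despotism"}   # shared game-state

    def apply(self, message):
        kind, player_id, civic = message
        if kind == "revolution":
            self.civics[player_id] = civic


class Network:
    def __init__(self, machines):
        self.machines = machines

    def broadcast(self, message):
        # Every machine, including the sender, applies the same message.
        for machine in self.machines:
            machine.apply(message)


def on_civic_screen_click_unsafe(local, player_id, civic):
    # Local-context code writing to the game-state directly: only this
    # machine changes, so the game goes out of sync.
    local.civics[player_id] = civic


def on_civic_screen_click_safe(network, player_id, civic):
    # Local-context code only requests the change; the change itself is
    # applied on every machine when the message arrives.
    network.broadcast(("revolution", player_id, civic))


a1, b1 = Machine("A"), Machine("B")
on_civic_screen_click_unsafe(a1, 0, "Hereditary Rule")
unsafe_in_sync = (a1.civics == b1.civics)

a2, b2 = Machine("A"), Machine("B")
on_civic_screen_click_safe(Network([a2, b2]), 0, "Hereditary Rule")
safe_in_sync = (a2.civics == b2.civics)
```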


    Another example of code that runs within the local context is many of the functions in CvGameUtils.py. For example, the function canTrain(). This function will be called in many different occasions. Sometimes the AI needs to determine if it can build a certain unit. Because most (if not all) AI code is in the global context (run similarly on all computers), all the calls to the canTrain function COULD be considered in the global context. However, the canTrain function is also called on all different unit types every time you select a city (so that it can determine what icons to display for units that are able to be built in that city). Because your selecting a city is not an event that needs to be sent to other computers, no other computers respond to your selecting a city, and thus the canTrain function will be called on your computer, but not on others. Thus, the canTrain function can be said to be within the local and global context.

    In the case such as above, it is better off saying that it’s in the local context. The reason is this: anytime you’re looking at code in the global context, you are specifying that it’s safe to change the game-state within that code. If you were to change the game-state in the canTrain function (such as adding one gold to player 0’s sum), you would have OOS errors, since the function is only called synchronously on all the computers in the game SOME of the time.
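
    Here is a minimal standalone Python model of that mixed-context trap. Nothing in it is Civ4 API: Peer is a hypothetical stand-in for one computer in the game, and can_train deliberately contains the kind of bug described above (a query function that also changes the game-state).

```python
# Hypothetical model of two networked machines; no Civ4 API involved.

class Peer:
    def __init__(self):
        self.gold = {0: 100}   # a slice of the game-state

    def can_train(self, unit):
        # Bug under test: a "can I build this?" query that also adds gold.
        self.gold[0] += 1
        return True

    def ai_turn(self):
        # Global context: AI logic runs the same way on every machine.
        self.can_train("warrior")

    def select_city(self):
        # Local context: only the machine whose user clicked runs this,
        # and it queries every unit type to decide which icons to show.
        for unit in ("warrior", "archer", "settler"):
            self.can_train(unit)


a, b = Peer(), Peer()

a.ai_turn()
b.ai_turn()
in_sync_after_ai = (a.gold == b.gold)        # both machines made the same change

a.select_city()                              # only machine A's user clicks a city
in_sync_after_click = (a.gold == b.gold)     # A has drifted away from B
```

    As long as nobody selects a city the bug stays hidden, which is why code like this has to be treated as local context even though some of its callers are global.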

    Determining if the code you’re writing is in the global or local context can be difficult. Here are some hints to help you discover this:

    1.) Look around at the other code that Firaxis has included. For example, look for any code that would change the game-state (pPlayer.changeGold(), pUnit.kill(), etc.). If you see code like that within a function, that function is most likely in the global context.

    2.) Look at where that code is being called from. In the case of the CvEventManager.py file, you are looking at functions that you don’t really know if they are changing the game-state or not, because the functions are empty. This is where some SDK knowledge can help. Search the SDK for where those functions are being called from. For example:

    Code:
    void CvPlayer::doTurn()
    {
    	PROFILE_FUNC();
    
    	CvCity* pLoopCity;
    	int iLoop;
    
    	FAssertMsg(isAlive(), "isAlive is expected to be true");
    	FAssertMsg(!hasBusyUnit() || GC.getGameINLINE().isMPOption(MPOPTION_SIMULTANEOUS_TURNS), "End of turn with busy units in a sequential-turn game");
    
    	[b]gDLL->getEventReporterIFace()->beginPlayerTurn( GC.getGameINLINE().getGameTurn(),  getID());[/b]
    
    	GC.getGameINLINE().verifyDeals();
    
    	AI_doTurnPre();
    
    	if (getRevolutionTimer() > 0)
    	{
    		[b]changeRevolutionTimer(-1);[/b]
    	}
    
    The first emboldened line is where the beginPlayerTurn function in the CvEventManager.py is being called. Directly after that, we see changeRevolutionTimer, which changes the game-state. Thus, you can be pretty sure that beginPlayerTurn is located in the global context, because if it weren’t, then Civ4 would be getting OOS errors all by itself.

    Of course, this isn’t foolproof. We’re relying on the fact that, in order to find out what context a bit of code is in, we have to determine what context the code around it is in. This, obviously, requires knowledge of other functions. At first, this is going to be tough, but experience will help you more easily identify local vs. global contexts.



    3.) Experiment. This is not the best way, but can be effective. The idea of experimentation is to place some code where you want to know if it’s local or global, and then put something that you know will make the game go OOS. The typical case I use is that I run this code:

    Code:
    	CvUtil.pyPrint("Running code to test for context.")
    	gc.getPlayer(0).changeGold(1)
    
    I then test by loading up this game into a networked multiplayer setting (I luckily have two computers, if you don’t this might not be very easy for you) and having the code run. I’ll know it runs because I’ll see the text in the log. If the game goes OOS, I know that it is in the local context.

    There are a few problems with this approach. The first problem is that it can be very slow, since it takes a long time to change the code, place the files on two computers, get the game set up, and to find a way to ensure the code is called (this is especially true if you’re dealing with items that don’t come around until later in a game, in which case you’ll have to set up a saved game or scenario that has the later unit, building, tech, etc.).

    The second problem is that sometimes, like in the case of the canTrain() function in CvGameUtils.py, there are times when the python function is called in a global context, and other times when it’s called in a local context. If we were to do this test on the canTrain function, and never have either computer select a city, then the game wouldn’t go OOS. So, in other words, the best results you can get are a definite “this code is in the local context”, or an inconclusive “we’re still not 100% sure what context this is in.”

    4.) Ask on the forums. Hopefully by then, a few people will have at least read and understood this part of this tutorial so when you say “Local Context” or “Global Context” on the forums, people will know what you’re talking about. It’s all about community, baby. Can’t you feel the love?


    So, to answer the question that I posed before these definitions, how do you best avoid OOS errors? It all comes down to this one sentence:

    Keep anything that changes the game-state OUT of code that is in the local context. All game-state changing code MUST go into code that is in the global context. This one fact is the entire key to avoiding OOS errors.

    Knowing this fact brings you one step closer to help you determine your OOS problems. The problem isn’t memorizing those few words and understanding what they mean, the problem is being able to accurately determine what code can be considered in the global context, and what code can be considered in the local context.

    Note that sometimes the case isn’t that the game-state is being changed in the local context, but that the game-state is being changed in the global context, although each computer is making a different change. In the previous examples, the problem was code in the wrong place. It could be the possibility that you’re looking at code in the right place but with data that differs from computer to computer.

    However, that data was probably, in its turn, created during a local context piece of code. For a simple example, a player gets gold for killing another unit. We’ve defined a function getGoldForUnitKill.

    Code:
    def onUnitKilled(self, argsList):
    	'Unit Killed'
    	unit, iAttacker = argsList
    	player = PyPlayer(unit.getOwner())
    	attacker = PyPlayer(iAttacker)
    
    	[B]# My code here
    	iGold = getGoldForUnitKill(unit)
    	attacker.changeGold(iGold)[/B]
    
    This can be perfectly legitimate code, because the onUnitKilled function is in the global context. However, the value used as the amount of gold changed will have an impact on the game-state, and thus it’s imperative that getGoldForUnitKill() is thus also in the global context. That will probably be the case, since it’s probably just a simple function that is only called during this code.

    However, if you do something silly that might make the code within getGoldForUnitKill() become local context, you could be jeopardizing the return value. What could you possibly do to a standalone function to make it go from global context to local context? That would be our dear friend getActivePlayer, which I’ll speak about in the next chapter.
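
    A small standalone Python sketch shows how a helper’s return value can differ per machine even though the caller sits in the global context. Everything here is hypothetical: Client stands in for one computer, active_player for “whoever is behind that screen”, and the two gold_for_unit_kill functions for a bad and a good version of a getGoldForUnitKill-style helper.

```python
# Hypothetical model; no Civ4 API involved.

class Client:
    def __init__(self, active_player):
        self.active_player = active_player   # the player sitting at this machine
        self.gold = {0: 0, 1: 0}

def gold_for_unit_kill_unsafe(client, unit_owner):
    # Depends on who is behind the screen, so each machine gets its own answer.
    if unit_owner != client.active_player:
        return 50
    return 10

def gold_for_unit_kill_safe(unit_strength):
    # Depends only on data that is identical on every machine.
    return 10 * unit_strength

def on_unit_killed(client, attacker, amount):
    # Global-context handler: runs on every machine for every kill.
    client.gold[attacker] += amount

# Player 0 kills a unit owned by player 1; both machines process the event.
pc0, pc1 = Client(active_player=0), Client(active_player=1)
for pc in (pc0, pc1):
    on_unit_killed(pc, attacker=0,
                   amount=gold_for_unit_kill_unsafe(pc, unit_owner=1))
unsafe_gold = (pc0.gold[0], pc1.gold[0])

pc0, pc1 = Client(active_player=0), Client(active_player=1)
for pc in (pc0, pc1):
    on_unit_killed(pc, attacker=0,
                   amount=gold_for_unit_kill_safe(unit_strength=4))
safe_gold = (pc0.gold[0], pc1.gold[0])
```

    The handler is in the right place in both runs; only the data fed into it decides whether the two machines still agree afterwards.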

    Here is a summary of what I’ve spoken about in this section:

    - An OOS error occurs when two computers in a networked multiplayer game somehow end up with different game-states.
    - Hotseat and PBEM games cannot get OOS errors, even if the code used in those games would cause OOS errors if played in a network multiplayer game.
    - Code can be considered in the “global context” if it is always called synchronously with other computers in the game. Game-state changing code can safely be placed into code that is in the global context.
    - Code can be considered in the “local context” if it is ever called such that not all computers of the entire group of computers in the game run the code similarly. Game-state changing code placed in local context code will cause OOS errors.
    - Determining if code is local or global context can be difficult. Staying within the CvEventManager.py file is a safe bet, as it is all in the global context. The ability to recognize other pieces of code’s context will largely come down to experience.
    - How you calculated the data used in the function arguments is just as important as where you locate the function when determining if code might cause OOS errors.
     
    Section Four: The concept of “Active Player” and the getActivePlayer functions.

    As stated in the last section, the most basic way to avoid OOS errors in games is to ensure that your game-state changing code stays in the global context. I also mentioned that there are some cases of code you can write in global context code that can still cause problems. The main contributor to this is the getActivePlayer family, which consists of these functions:

    Python: gc.getActivePlayer(), gc.getGame().getActivePlayer()
    C++: GC.getActivePlayer(), GC.getGameINLINE().getActivePlayer()

    This section hopes to enlighten people as to what is meant by the active player, and furthermore, what you’re actually getting when you call getActivePlayer(). It will show through example why a getActivePlayer function can cause OOS errors when used incorrectly, and ways to fix these errors through proper getActivePlayer usage.

    We’ll start by going back to Section Two, where we talked about what happens during the span of a turn. There, we said that in a turn-based (non-simultaneous) game turn, the game will tell the first player it’s their turn by calling the CvPlayer::setTurnActive() SDK function for that player. However, this doesn’t necessarily mean that when you call getActivePlayer, you will get the player who last had their setTurnActive function called. Here, Firaxis sort of makes two different kinds of “active”. You can have your turn active, and/or you can be the active player.

    Having your turn active means it’s actually your “turn”. You can move units, etc. There may be more than one person who has their turn active if it’s a simultaneous game. This is a whole different can of beans from being the active player.

    The best way to think of the active player is this: the active player is the player that is currently behind the computer screen.

    - In a Singleplayer game, this will always be the human player.
    - In PBEM or Hotseat games, the active player will change. During player 0’s turn, if player 0 is a human, then player 0 is the active player. Player 0 will continue being the active player until it is the next human player’s turn (say, player 3). Then, player 3 is the active player, because they are now “controlling” the game. Eventually, player 0 will become the active player again.
    - In networked games, there are multiple people “controlling” the action. On each one of their computers, the person behind the screen is the active player. On player 0’s computer, player 0 is the active player. On player 1’s computer, player 1 is the active player, and so on.
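
    The three cases above can be mimicked with a toy model. This is only a sketch: the Machine class and its methods are made up for illustration and are not part of the Civ4 API, and it assumes the active player changes only when a human seated at that machine starts a turn.

```python
# Toy model of "who is behind the screen". Machine is hypothetical,
# not a Civ4 class.
class Machine(object):
    def __init__(self, local_humans):
        # Player ids of the humans who play on this physical computer.
        self.local_humans = list(local_humans)
        self.active_player = self.local_humans[0]

    def on_turn_start(self, player_id):
        # Assumption: the active player only changes when a human who sits
        # at this machine starts a turn (AI and remote turns do not count).
        if player_id in self.local_humans:
            self.active_player = player_id
        return self.active_player


def active_player_per_turn(machine, turn_order):
    # Record who is the active player on this machine as each turn starts.
    return [machine.on_turn_start(p) for p in turn_order]


turn_order = [0, 1, 2, 3]

# Hotseat: humans 0 and 3 share one computer, players 1 and 2 are AI.
hotseat = active_player_per_turn(Machine([0, 3]), turn_order)

# Networked: each human has their own computer.
net_player0 = active_player_per_turn(Machine([0]), turn_order)
net_player1 = active_player_per_turn(Machine([1]), turn_order)

print(hotseat)
print(net_player0)
print(net_player1)
```

    In the hotseat run the value changes over the course of the round, while in the networked run each machine keeps reporting its own player no matter whose turn it is.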

    Knowing exactly who is behind the screen can be very important. For example, one player might have the unit cycling option turned on, and another might not. In a PBEM or Hotseat game, knowing which one of the players is currently behind the screen is important to decide if the game should cycle through the units or not. Also, when you add a popup or message to the screen, you typically have to put in the id of the player to show the message or popup for. Typically, the rule of thumb is to use the items you have available to you. For example, here is an incorrect way of making a message appear when a unit moves:

    Code:
    def onUnitMove(self, argsList):
    	'unit move'
    	pPlot,pUnit = argsList
    	player = PyPlayer(pUnit.getOwner())
    	unitInfo = PyInfo.UnitInfo(pUnit.getUnitType())
    
    	# Incorrect: the message is addressed to whoever is the active player
    	CyInterface().addMessage(gc.getGame().getActivePlayer(), True, 10, "You moved a unit!", "", InterfaceMessageTypes.MESSAGE_TYPE_INFO,
    	                         "", ColorTypes.NO_COLOR, -1, -1, False, False)
    
    Obviously, the use of getActivePlayer() is wrong here. At first it appears right, because when you move your unit, you get a message. But then you realize you’re getting the message even when you don’t move a unit. In fact, you get a message when ANY unit is moved, even the AI’s.

    The correct way is to use what info you have. pUnit.getOwner() gives you the id of the player who moved their unit, so you would want to use that for your “Player” argument:

    Code:
    def onUnitMove(self, argsList):
    	'unit move'
    	pPlot,pUnit = argsList
    	player = PyPlayer(pUnit.getOwner())
    	unitInfo = PyInfo.UnitInfo(pUnit.getUnitType())
    
    	# Correct: address the message to the owner of the unit that moved
    	CyInterface().addMessage(pUnit.getOwner(), True, 10, "You moved a unit!", "", InterfaceMessageTypes.MESSAGE_TYPE_INFO,
    	                         "", ColorTypes.NO_COLOR, -1, -1, False, False)
    

    This will ensure that when your units are moved, the message is added to your screen. When the AI moves a unit, the message is queued for the AI player (when the game tries to display it, it realizes that the player is an AI and deletes the message). If you’re in a multiplayer game, the correct player will get the message.

    Of course, while a problem such as the one shown above might be annoying, it wouldn’t have caused an OOS error. The reason is that it doesn’t change the game-state. Of course, we could have easily made the example of the incorrect form something like this:

    Code:
    def onUnitMove(self, argsList):
    	'unit move'
    	pPlot,pUnit = argsList
    	player = PyPlayer(pUnit.getOwner())
    	unitInfo = PyInfo.UnitInfo(pUnit.getUnitType())
    
    	# Give money to the player, congratulating them on a very nice...movement
    	iActivePlayer = gc.getGame().getActivePlayer()
    	pPlayer = gc.getPlayer(iActivePlayer)
    	pPlayer.changeGold(40)
    
    This will cause an OOS in networked multiplayer games. Even though the code is written in a global context, the OOS error still results. The reasoning is this:

    Every computer in a networked game will get the message that the player moved. Therefore, every computer in the game will run that function. However, on every computer, the “active player” is different. On player 0’s computer, player 0 is the active player. On player 1’s computer, player 1 is the active player. So, here’s a side-by-side view of what happens.

    http://forums.civfanatics.com/attachment.php?attachmentid=139864&stc=1&d=1160114908

    As you can see, player 0 will give gold to player 0, and player 1 will give gold to player 1. In the end, the game-states differ, and as a result, the game goes OOS. The correct way to add gold to the player who just moved would be:

    Code:
    def onUnitMove(self, argsList):
    	'unit move'
    	pPlot,pUnit = argsList
    	player = PyPlayer(pUnit.getOwner())
    	unitInfo = PyInfo.UnitInfo(pUnit.getUnitType())
    
    	# Give money to the player, congratulating them on a very nice...movement
    	gc.getPlayer(pUnit.getOwner()).changeGold(40)
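
    The difference between the two gold handlers can be reproduced outside the game with a toy simulation. Everything here (Machine, the handler names, broadcast) is made up for illustration and is not Civ4 API; the only assumption carried over from the text is that every machine runs the same handler for the same unit move, while each machine has its own active player.

```python
# Toy simulation of a two-player networked game. Nothing here is Civ4 API.
class Machine(object):
    def __init__(self, active_player, num_players=2):
        self.active_player = active_player   # who sits behind this screen
        self.gold = [0] * num_players        # this machine's copy of the game-state


def on_unit_move_wrong(machine, unit_owner):
    # Mirrors the incorrect handler: pays whoever is active on THIS machine.
    machine.gold[machine.active_player] += 40


def on_unit_move_right(machine, unit_owner):
    # Mirrors the corrected handler: pays the owner of the unit that moved.
    machine.gold[unit_owner] += 40


def broadcast(handler, machines, unit_owner):
    # Every machine processes the same unit-move event.
    for machine in machines:
        handler(machine, unit_owner)


def in_sync(machines):
    first = machines[0].gold
    return all(machine.gold == first for machine in machines)


def run(handler):
    machines = [Machine(0), Machine(1)]
    broadcast(handler, machines, unit_owner=0)   # player 0 moves a unit
    return machines


wrong = run(on_unit_move_wrong)
right = run(on_unit_move_right)
print([m.gold for m in wrong], in_sync(wrong))
print([m.gold for m in right], in_sync(right))
```

    With the active-player handler the two copies of the game-state disagree about who holds the 40 gold; with the owner-based handler both machines end up with identical gold lists.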
    


    This summary is pretty simple:

    - When you need to make sure that something, such as a popup or other graphical screen, only shows up on one person’s computer, then use getActivePlayer (although in many cases you can control which player gets the popup, see Kael’s guide on popups).
    - Otherwise, avoid it like the plague. It might need to be used in advanced situations, but in most cases you can find another way around it.
     

    Section Five: Debugging OOS Errors


    OOS Errors are like that guy that posts “I’m having problems with my mod. Does anyone know what the problem is?”, without giving a single bit of information about what their problem might be. When an OOS error occurs, the only thing you know is the number (the checksum) that appears on each player’s screen, which isn’t very helpful. This section will show you ways of getting a better understanding of what part of your mod is causing the OOS error. Once you know where your error is coming from, you can then use what has been learned in the previous sections to hopefully fix the problem.

    In a good case, the problem is glaringly clear. In your latest mod based on the movie “Hackers”, you have code that, whenever you build the “Cookie Monster Virus” wonder, any player who owns a “Citrix Kernel” building has a bunch of cookie monsters appear on the screen shouting “I want a cookie” (until they type “cookie”, of course). Now, say when that wonder gets finished, you get an OOS. Obviously, you should check out the code that is run when the wonder is completed. Of course, that’s simple. I just wanted to make the “Hackers” reference.

    But in many situations, it won’t be so obvious. You attacked and killed another unit, but you just included 20 different mods off of the Mod Components site, and you’re not sure which of them modify onUnitMove, the onCombat callbacks, onUnitKilled, or any other functions that might be called. Also, if you’re in any multiplayer game (especially simultaneous, where everyone’s turns are ALL done at once) everyone has their own idea of what happened that might have caused the OOS (“I was moving a unit”, “I was switching my build queue”, “I was grabbing a soda”). Unlike normal errors, when an OOS error occurs, there isn’t anything that can tell you exactly where the problem is, since in the end everyone’s game worked fine and the code ran swimmingly; it’s just that everyone got a different result.

    The key to finding the exact point of the problem lies in being able to determine what’s different in the game-state. Remember, the OOS checksum that determines when things have gone OOS is nothing more than a collection of different pieces of data from the game, such as unit locations, promotions and levels; city locations, populations, and production; tech research, etc. The full code is below, take a quick glance at it (you can find it within . Don’t worry about the gameslice or the cases, that’s not important. Mostly, take a look at the different values that feed into iValue.

    http://forums.civfanatics.com/showpost.php?p=4609835&postcount=7


    As you can see, all the values get multiplied by a special number. Each value has its own multiplier, which helps make sure that two different game-states won’t produce the same checksum. But that’s not important. What is important is that, with the exception of the random number seeds, every piece of data used in the checksum can be retrieved from python. So, it’s completely possible to get a log of all of the data in the game-state and dump it to a file on the player’s computer. If this is done on every computer, the files can be collected, and then we can determine exactly where the game-states differed when the OOS happened. Now, you can go ahead and write this yourself, or you can just use the one I’ve created.

    http://forums.civfanatics.com/showpost.php?p=4609835&postcount=7

    The script is pretty simple. By having the onGameUpdate() function constantly call this module’s doGameUpdate() function, the game is constantly checked for whether it has gone OOS (using the CyInterface().isOOSVisible() function, which returns True if the game has gone OOS). If it has, the script runs a function that writes all the data it finds into a log file.

    When used in a two-player (plus one barbarian) game, if the OOS error happens in the first few turns or so, before the game gets really complicated, the log can be over 1,000 lines of data. Obviously, we won’t pick through the whole thing, but rather use diff to have our computer do the work for us by having it look at the two files and show us where the differences are.

    So, I have a mod, and it’s generating an OOS error whenever I move. I get the log files from player 0 and player 1 and put them on the same computer. Then, I run a diff, and get this result:

    Code:
    $ diff OOSLogHost.txt OOSLog_Player1.txt 
    29c29
    < Player 0 Gold: 20
    ---
    > Player 0 Gold: 0
    
    The “29c29” means that line 29 of the first file (OOSLogHost.txt) was changed to become line 29 of the second file (OOSLog_Player1.txt). The left angle bracket (<) shows us what the line looks like in the OOSLogHost.txt file, and the right angle bracket (>) shows us what the line looks like in the OOSLog_Player1.txt file. (For more info on reading diff output, visit http://en.wikipedia.org/wiki/Diff).

    What this tells us is that for the player who was running the computer that generated the OOSLogHost.txt, player 0 had 20 gold at the time of the OOS error. However, for the player who generated OOSLog_Player1.txt, player 0 had 0 gold. Somewhere, either player 0 received extra gold on the host’s machine, or lost gold on the other player’s machine, but in any case, the game-state was probably changed in a local context.

    So, knowing that the problem happens when I move, I would be looking through the mods that change onUnitMove. I also now know that the problem has to do with a player receiving or losing gold, and that it’s most likely coded in a local context. So, following my nose, I find my problem:
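
    If a diff tool isn’t handy, the same comparison can be sketched in a few lines of standalone Python (run outside the game, with a newer Python than the one Civ4 embeds). This is an illustration, not part of my script: changed_lines and read_lines are hypothetical helper names, and apart from the “Player 0 Gold” line the sample log lines are invented.

```python
import difflib


def read_lines(path):
    # Helper for real log files; not used by the in-memory demo below.
    handle = open(path, "r")
    try:
        return [line.rstrip("\n") for line in handle]
    finally:
        handle.close()


def changed_lines(a_lines, b_lines):
    """Return (line number in a, lines from a, lines from b) for each difference."""
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((i1 + 1, a_lines[i1:i2], b_lines[j1:j2]))
    return changes


# Invented sample data standing in for OOSLogHost.txt and OOSLog_Player1.txt.
host = ["Player 0 Score: 40", "Player 0 Gold: 20", "Player 1 Gold: 0"]
other = ["Player 0 Score: 40", "Player 0 Gold: 0", "Player 1 Gold: 0"]

changes = changed_lines(host, other)
for line_no, ours, theirs in changes:
    print("line %d: host=%r other=%r" % (line_no, ours, theirs))
```

    For real logs, pass the results of read_lines() for each player’s file instead of the sample lists.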

    Code:
    def onUnitMove(self, argsList):
    	'unit move'
    	pPlot,pUnit = argsList
    	player = PyPlayer(pUnit.getOwner())
    	unitInfo = PyInfo.UnitInfo(pUnit.getUnitType())
    
    	# The offending lines:
    	iActivePlayer = gc.getGame().getActivePlayer()
    	if (iActivePlayer == pUnit.getOwner() ):
    		gc.getPlayer(iActivePlayer).changeGold(20)
    
    What? How did those lines get in there? Oh yeah, I was testing to make sure my OOS script worked by making the player who moved the unit get the gold. Unfortunately, because of the getActivePlayer business, the changeGold command will only run if the unit that moved is owned by the active player. Thus, player 0 moved a unit, and on their computer received the gold addition, whereas on player 1’s computer, that change never happened since the if-statement failed.

    This was still a pretty simple example. You probably would’ve been able to discover the problem even without the script to tell you it had to do with Gold. But in many cases, especially when the code that makes the game go OOS runs at the end of a player’s turn, when LOTS of things are probably happening, it may be your only chance.

    Using that school of thought should cover most of your OOS cases. Sometimes, though, you could have problems with random numbers being out of sync, which depending on how you look at it might work a little bit differently.

    Say someone were to make a mod based on that “Hackers” movie where upon completing the “Cookie Monster” virus wonder the other players get a bunch of pictures of the cookie monster filling their screen if they’ve built the Citrix Kernel building. The pseudo-code might look something like this:

    Code:
    def onBuildingBuilt:
    	if building is cookie monster virus:
    		if active player is not builder of cookie monster virus:
    			if active player has Citrix kernel:
    				for 20 cookie monster pictures:
    					drawCookieMonsterRandomly()
    
    def drawCookieMonsterRandomly():
    	iX = gc.getGame().getSorenRandNum(SCREEN_WIDTH, "Cookie Monster X")
    	iY = gc.getGame().getSorenRandNum(SCREEN_HEIGHT, "Cookie Monster Y")
    	drawCookieMonsterAtXY(iX, iY)
    
    Do you see the OOS problem? If not, try again, it may not be obvious. In any case, assuming you had no idea why your game was going out of sync (for example, say the code was being run at the end of a turn and there are just too many things happening to realize that the cookie monster virus wonder was built), you decide to run the script to generate the logs. What you’ll get is a diff like the following:

    Code:
    $ diff OOSLogHost.txt OOSLog_Player1.txt
    2c2
    < Next Soren Rand Value: 2945
    ---
    > Next Soren Rand Value: 6234
    
    What the numbers are isn’t important. What IS important is that they’re different. In the actual checksum, the actual seeds of both the MapRand and SorenRand numbers are used, as shown below:

    iValue += getMapRand().getSeed();
    iValue += getSorenRand().getSeed();

    However, it’s not possible to get the actual seeds of the SPRNGs from Python, so instead I’ve done the next best thing: make the game generate a random number between 0 and 10,000.

    Code:
    pFile.write("Next Map Rand Value: %d\n" % CyGame().getMapRand().get(10000, "OOS Log"))
    pFile.write("Next Soren Rand Value: %d\n" % CyGame().getSorenRand().get(10000, "OOS Log"))
    
    Although using the seeds is the surefire way to make sure they’re synchronized, the chance of two SPRNGs with different seeds generating the same number between 0 and 10,000 is very small, so you’ll still see the difference in the random numbers.
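
    To put a rough number on “very small”: if the draws are uniform, two unrelated generators agree on a single draw from 10,000 values about once in 10,000 tries. The sketch below checks that empirically using Python’s own random module as a stand-in; it is not the Civ4 generator, and the seed pairs and trial count are arbitrary choices.

```python
import random

RANGE = 10000   # same range the log script draws from
TRIALS = 2000   # number of seed pairs to try (arbitrary choice)


def first_draw(seed):
    # First number produced by a freshly seeded stand-in generator.
    return random.Random(seed).randrange(RANGE)


matches = 0
for i in range(TRIALS):
    # 2*i and 2*i+1 are always different seeds.
    if first_draw(2 * i) == first_draw(2 * i + 1):
        matches += 1

expected = TRIALS / float(RANGE)
print("matches: %d (about %.1f expected by chance)" % (matches, expected))
```

    A collision on one draw is therefore possible but rare, and because the log records both the MapRand and SorenRand values, two desynchronized games are even less likely to look identical.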


    In case you haven’t figured it out, the problem in the cookie monster code is that the SorenRand is part of the game-state. Remember, it’s a synchronized RNG, and in order to stay synchronized the random numbers must be generated on all computers. Everything within the first “if active player” statement, including the functions called from within it, is in the local context, which I’ve emboldened below:

    Code:
    def onBuildingBuilt:
    	if building is cookie monster virus:
    		[b]if active player is not builder of cookie monster virus:
    			# Local context is emboldened
    			if active player has Citrix kernel:
    				for 20 cookie monster pictures:
    					drawCookieMonsterRandomly()[/b]
    
    def drawCookieMonsterRandomly():
    	# Local context is emboldened
    	iX = gc.getGame().getSorenRandNum(SCREEN_WIDTH, "Cookie Monster X")
    	iY = gc.getGame().getSorenRandNum(SCREEN_HEIGHT, "Cookie Monster Y")
    	drawCookieMonsterAtXY(iX, iY)
    


    The player who built the cookie monster virus will never call the drawCookieMonsterRandomly() function. If another player has the Citrix Kernel building, they will call the drawCookieMonsterRandomly() function 20 times, each time calling getSorenRandNum twice. Even calling getSorenRandNum once is enough to change the seed and desynchronize the random number generators. What makes this OOS problem even more interesting is that if no one owns a Citrix Kernel building, this OOS error will not occur, since no one will run the getSorenRandNum function, and all the SPRNGs will stay synchronized.

    Luckily, there is one more weapon in a modder’s arsenal for dealing with multiplayer OOS issues, one that comes in particularly handy when random numbers are generated in local context code: Synchronization Logging. I’m actually kind of surprised that I’ve buried it this deep into this tutorial. You can enable Synchronization Logging just like any other Civ4 logging, by setting the option to 1 in the ini file.

    Code:
    ; Enable synchronization logging
    SynchLog = 1
    
    The synchronization log works like any other Civ4 log, and resides in your logs directory as the file MPLog.txt. Here is an example from one turn:


    Code:
    Player 0 Turn ON
    Player 0 Unit 8192 (Gerikes's Settler) moving from -2147483647:-2147483647 to 18:9
    Player 0 Unit 16385 (Gerikes's Warrior) moving from -2147483647:-2147483647 to 20:9
    Player 1 Unit 8192 (TXT_KEY_LEADER_ISABELLA's Settler) moving from -2147483647:-2147483647 to 27:15
    Player 1 Unit 16385 (TXT_KEY_LEADER_ISABELLA's Warrior) moving from -2147483647:-2147483647 to 27:15
    Player 0 City 8192 built at 18:9
    Player 0 Unit 8192 (Gerikes's Settler) moving from 18:9 to -2147483647:-2147483647
    Rand = -243135540 on 202 (AI Best UnitAI ASYNC)
    Rand = 1606376981 on 202 (AI Best UnitAI ASYNC)
    Rand = 744727850 on 202 (AI Best UnitAI ASYNC)
    Rand = -1422110949 on 202 (AI Best Unit ASYNC)
    Rand = 1386879160 on 202 (AI Best UnitAI ASYNC)
    Rand = -599472495 on 202 (AI Best UnitAI ASYNC)
    Rand = 586082806 on 202 (AI Best UnitAI ASYNC)
    Rand = 356958711 on 202 (AI Best UnitAI ASYNC)
    Rand = 1514408036 on 202 (AI Best UnitAI ASYNC)
    Rand = 182336205 on 202 (AI Best UnitAI ASYNC)
    Rand = -568873086 on 202 (AI Best Building ASYNC)
    Rand = -2011461997 on 202 (Wonder Construction Rand ASYNC)
    Rand = -1083488560 on 202 (AI Best Building ASYNC)
    Player 0 Unit 16385 (Gerikes's Warrior) moving from 20:9 to 21:10
    Player 0 Turn OFF
    Rand = 1064263372 on 237 (Religion Spread)
    Rand = 253080853 on 237 (Religion Spread)
    Rand = 1822994474 on 237 (Religion Spread)
    Rand = -1812201957 on 237 (Religion Spread)
    Rand = 765041592 on 237 (Religion Spread)
    Rand = 1613335953 on 237 (Religion Spread)
    Rand = -369515274 on 237 (Religion Spread)
    Player 1 Turn ON
    Rand = -1143840521 on 237 (AI Goody)
    Player 1 Unit 16385 (TXT_KEY_LEADER_ISABELLA's Warrior) moving from 27:15 to 27:16
    Rand = -936854684 on 237 (Goodies)
    Rand = -171413043 on 237 (Goodies)
    Rand = -896329086 on 237 (Goodies)
    Rand = -1790831213 on 237 (Goody Gold 1)
    Rand = -1301796400 on 237 (Goody Gold 2)
    Player 1 Unit 24578 (TXT_KEY_LEADER_ISABELLA's Scout) moving from -2147483647:-2147483647 to 27:16
    Rand = -638100023 on 237 (AI Unit Birthmark)
    Player 1 City 8192 built at 27:15
    Player 1 Unit 8192 (TXT_KEY_LEADER_ISABELLA's Settler) moving from 27:15 to -2147483647:-2147483647
    Rand = -1084465970 on 238 (AI Explore)
    Rand = -1276217361 on 238 (AI Explore)
    Rand = -1225911556 on 238 (AI Explore)
    Rand = 403639685 on 238 (AI Explore)
    Rand = 241530842 on 238 (AI Explore)
    Rand = 408200203 on 238 (AI Explore)
    Rand = -779656472 on 238 (AI Explore)
    Rand = -566756095 on 238 (AI Explore)
    Rand = -1665976410 on 238 (AI Explore)
    Rand = -2086177305 on 238 (AI Explore)
    Rand = -337394284 on 238 (AI Explore)
    Rand = 983984189 on 238 (AI Explore)
    Rand = -926209998 on 238 (AI Explore)
    Rand = 762022275 on 238 (AI Explore)
    Rand = -1191074048 on 238 (AI Explore)
    Rand = 1888897849 on 238 (AI Explore)
    Rand = 1644139902 on 238 (AI Explore)
    Rand = 919812831 on 238 (AI Explore)
    Rand = -614845652 on 238 (AI Explore)
    Rand = -1593351691 on 238 (AI Explore)
    Rand = -931427446 on 238 (AI Explore)
    Rand = -136665605 on 238 (AI Explore)
    Rand = 1103975960 on 238 (AI Explore)
    Rand = 1127819377 on 238 (AI Explore)
    Rand = 34526806 on 238 (AI Explore)
    Rand = 35890903 on 238 (AI Explore)
    Rand = 1733147588 on 238 (AI Explore)
    Rand = -1461265747 on 238 (AI Explore)
    Rand = 1337590242 on 238 (AI Explore)
    Rand = 1972492659 on 238 (AI Explore)
    Rand = -515059664 on 238 (AI Explore)
    Rand = -1975290711 on 238 (AI Explore)
    Rand = -1879418322 on 238 (AI Explore)
    Rand = 44472783 on 238 (AI Explore)
    Rand = 1759952732 on 238 (AI Explore)
    Player 1 Unit 24578 (TXT_KEY_LEADER_ISABELLA's Scout) moving from 27:16 to 26:15
    Player 1 Turn OFF
    


    The synchronization log does a number of types of logging, all of which revolve around diagnosing OOS problems. The first thing it does is track when players’ turns are on or off. This is set in the CvPlayer::setTurnActive SDK function. You probably won’t use it too much, unless you really start screwing around with how turns work (which is something I highly recommend against). Another thing you’ll see is that it tracks all unit movements. Whenever a unit moves, the log will show what plots it moved from and to (when it shows some spazzy number like -2147483647 for a plot index, that just means that the unit is not on a plot: it was either just initialized or just killed).

    Another thing, which is so helpful for problems with random number generators, is that it shows you every random number generation. Note that it doesn’t show you the actual number generated, but most of the time that doesn’t really matter. If you want to find that out, you can add a debug statement in the code. An example line:

    Code:
    Rand = -1083488560 on 202 (AI Best Building ASYNC)
    
    Here are what the different parts mean:

    “Rand =” just tells us that this line logs a random number.

    “-1083488560” is the seed that the random number generator has BEFORE the generation is made.

    “on 202” means that this random number was generated on slice 202 of the turn. Every turn is broken up into slices, and every slice has its own OOS check. Thus, if one game has a random number being generated on 202, and another has it on 203, there will be a problem. You don’t have to worry about this, and it’s probably more useful for Firaxis to use to help code their synchronization code than for the casual modder.

    “(AI Best Building ASYNC)” The final part is in parentheses, and this is the text that you pass in quotes whenever you generate a random number. In this case, the random number has something to do with the AI determining what the best build would be.

    Notice that Firaxis uses “ASYNC” in some of the comments. This tells you that an asynchronous random number generator was used (as opposed to a synchronous random number generator, like MapRand or SorenRand). This means that it’s okay for that random number to be showing up in one player’s logs, but not another, because it’s not expected that these “match up” across computers. They’re useful for when you need to do some random number generation on one computer (such as generating fireworks for WLTKD or Cookie Monster pictures). In this case, the random number was used to help determine what the game selects as the “suggested” buildings when I started my first city.

    The way to use this to your advantage is as follows: if you are finding out that you’re getting OOS errors from random number generations, check out this file. Do a diff, and weed out all the ASYNCs, ’cuz those don’t matter. Sooner or later, you’ll find out that one player has a (synchronized) random number being generated where another doesn’t. If you were smart and gave your random generation comment arguments good, easy-to-find descriptive phrases, you’ll know the EXACT line that your OOS code is on, because that random number generation is going to be included in some players’ log files, and missing in others.
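
    The cookie monster problem and its asynchronous fix can be played out in a toy simulation. Nothing below is Civ4 code: ToyRand is a small stand-in generator whose seed changes on every draw, Machine is a made-up class, and the seed values are arbitrary. The only thing taken from the text is the rule that the synchronized seed is part of the game-state while an asynchronous generator is not.

```python
# Toy stand-ins; none of these names exist in Civ4.
class ToyRand(object):
    """Tiny linear congruential generator: every draw changes the seed."""
    def __init__(self, seed):
        self.seed = seed

    def get(self, limit):
        self.seed = (self.seed * 1103515245 + 12345) % (2 ** 31)
        return (self.seed >> 16) % limit


class Machine(object):
    def __init__(self, player_id, has_kernel):
        self.player_id = player_id           # the active player on this machine
        self.has_kernel = has_kernel
        self.sync_rand = ToyRand(1234)       # same seed everywhere: game-state
        self.async_rand = ToyRand(1000 + player_id)  # local only, never compared


def on_wonder_built(machine, builder_id, use_async):
    # Local context: only runs for a non-builder who owns the kernel.
    if machine.player_id != builder_id and machine.has_kernel:
        if use_async:
            rng = machine.async_rand
        else:
            rng = machine.sync_rand
        for _ in range(20):
            rng.get(1024)   # picture x position
            rng.get(768)    # picture y position


def sync_seeds_after(use_async):
    machines = [Machine(0, False), Machine(1, True)]
    for machine in machines:
        on_wonder_built(machine, builder_id=0, use_async=use_async)
    return [machine.sync_rand.seed for machine in machines]


buggy = sync_seeds_after(use_async=False)
fixed = sync_seeds_after(use_async=True)
print(buggy, buggy[0] == buggy[1])
print(fixed, fixed[0] == fixed[1])
```

    Drawing the picture positions from the synchronized generator leaves the two machines with different seeds, while drawing them from the local generator leaves the synchronized seeds untouched and equal.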
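
    The weeding-out step can also be scripted. The sketch below is a standalone helper (meant to be run outside the game with a newer Python than the one Civ4 embeds), not part of any Firaxis tool. It assumes log lines look like the “Rand = ... on ... (...)” example above, and the function names and the “Cookie Monster” sample lines are made up.

```python
import re
from collections import Counter

# Matches lines such as: Rand = -1083488560 on 202 (AI Best Building ASYNC)
RAND_LINE = re.compile(r"^Rand = (-?\d+) on (\d+) \((.*)\)$")


def sync_rand_labels(lines):
    """Return the labels of all non-ASYNC random number lines, in order."""
    labels = []
    for line in lines:
        match = RAND_LINE.match(line.strip())
        if match is None:
            continue                      # not a random number line
        label = match.group(3)
        if label.endswith("ASYNC"):
            continue                      # asynchronous draws may differ freely
        labels.append(label)
    return labels


def unmatched_labels(lines_a, lines_b):
    """Labels generated more often in one log than in the other."""
    count_a = Counter(sync_rand_labels(lines_a))
    count_b = Counter(sync_rand_labels(lines_b))
    return count_a - count_b, count_b - count_a


# Sample data; the Cookie Monster lines are invented for the example.
host = [
    "Player 0 Turn ON",
    "Rand = -243135540 on 202 (AI Best UnitAI ASYNC)",
    "Rand = 1064263372 on 237 (Religion Spread)",
]
other = [
    "Player 0 Turn ON",
    "Rand = 1064263372 on 237 (Religion Spread)",
    "Rand = 99 on 237 (Cookie Monster X)",
    "Rand = 100 on 237 (Cookie Monster Y)",
]

only_host, only_other = unmatched_labels(host, other)
print(dict(only_host))
print(dict(only_other))
```

    Any label that shows up on only one side names the random number call to go looking for in your code.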

    So far I’ve discussed many of the technical aspects of debugging, but now I hope to give a couple of tips on the actual way to go about doing it. The easiest way is probably with two computers sitting right next to each other in a networked game. Of course, not all people have this luxury. The next best is to have people you know online (or in your dorm, or whatever) play, and to make sure beforehand that you have a plan for how you will collect the logs and analyze them. If you just go into it without planning, you might end up in confusion and lose valuable data and time.

    If you need to go it alone, and only have the resources of one computer, then you’re really stuck, but not completely out. Note that Civilization comes with a command-line option that allows you to have multiple Civ4 games running at the same time. It can be buggy and memory-intensive at times, and you are probably going to have problems with log files overwriting each other. However, if you’re up to the task, make a shortcut to the Civ4 executable (you can do this by making a copy of the shortcut Civ4 places on your desktop). Then, right-click it and hit properties. In the field labeled “Target”, replace…

    "C:\Program Files\Firaxis Games\Sid Meier's Civilization 4\Civilization4.exe"

    …with…

    "C:\Program Files\Firaxis Games\Sid Meier's Civilization 4\Civilization4.exe" multiple mod=\MyModsName

    This will allow you to start the game directly loading to your mod, as well as using the “multiple” switch to load multiple instances.

    Another alternative would be to use virtualization software like VMWare, but if you decide to go that route I’ll just tip my hat and wish you luck.

    As a last note, OOS errors might be a little tougher to track down because doing so involves a little more teamwork. Players (or more likely, beta testers) have to send their files to each other and compare them, as opposed to just seeing error messages and bugs show up right in front of them, making a screenshot and posting a bug report. Having good beta testers is a very important part of any error detection, but for OOS, having patient beta testers is something not to be taken for granted. On a structural note, making a procedure for beta testers to gather their multiple log files together might not be such a bad idea, such as an “In the event of an OOS, do the following steps…” document. Perhaps someone could expand upon my script (or make their own) to automate this process.

    I hope by now you’re ready to put down this virtual book manifestation and start to knock out some of the problems plaguing your mods (whether you were looking for them before, or realize that you could be looking for them now). Here is a summary of this section:
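
    As one possible starting point for such automation, here is a small standalone sketch that bundles a tester’s log files into a single zip named after the player. It is not part of my script: bundle_logs and the archive naming are made up, and the demo uses throwaway files rather than a real Civ4 logs directory, since where that directory lives depends on the install.

```python
import os
import shutil
import tempfile
import zipfile


def bundle_logs(log_paths, player_name, out_dir):
    """Copy the given log files into one zip, prefixing each with the player name."""
    archive_path = os.path.join(out_dir, "oos_logs_%s.zip" % player_name)
    archive = zipfile.ZipFile(archive_path, "w")
    try:
        for path in log_paths:
            if os.path.isfile(path):      # skip logs that were never written
                arcname = "%s_%s" % (player_name, os.path.basename(path))
                archive.write(path, arcname)
    finally:
        archive.close()
    return archive_path


# Demo with throwaway files standing in for a real logs directory.
work_dir = tempfile.mkdtemp()
try:
    for name in ("OOSLog.txt", "MPLog.txt"):
        handle = open(os.path.join(work_dir, name), "w")
        handle.write("placeholder log contents\n")
        handle.close()

    wanted = [os.path.join(work_dir, name)
              for name in ("OOSLog.txt", "MPLog.txt", "Missing.txt")]
    archive_path = bundle_logs(wanted, "Player1", work_dir)

    archive = zipfile.ZipFile(archive_path, "r")
    bundled_names = sorted(archive.namelist())
    archive.close()
finally:
    shutil.rmtree(work_dir)

print(bundled_names)
```

    Prefixing each file with the player’s name means the archives from several testers can be unpacked into one folder without the logs overwriting each other.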

    - The easiest (but not necessarily most effective) way to find the source of an OOS error is to look for any strange code that most likely gets called around when the OOS error happens. I would keep an extra eye out for getActivePlayer() calls in those areas.
    - By using a simple python script, the entire game-state for all computers involved in a game can be dumped to a log file for comparison. This can help pinpoint the exact problem part of the game that is not staying synchronized, further pointing you to your source.
    - The MapRand and SorenRand random number generators are part of the game-state, and as such if they have their seeds changed (a result of generating a random number) in local context, they will make your game OOS.
    - Enabling the Synchronization log can help you fight some OOS problems, especially when you’ve discovered that random number generators are the culprit of your OOS errors. Also, naming your random number comments something that actually identifies what the random number is doing is a habit that you should start taking up, if you haven’t already. In the event of a SPRNG-related OOS error, you’ll have a MUCH easier time finding the source of the problem.
    - “Hackers” is one of the worst techy movies in the last few decades.
     
    Full code listing for sync Checksum:

    Code:
    int CvGame::calculateSyncChecksum()
    {
    	PROFILE_FUNC();
    
    	CvUnit* pLoopUnit;
    	int iMultiplier;
    	int iValue;
    	int iLoop;
    	int iI, iJ;
    
    	iValue = 0;
    
    	iValue += getMapRand().getSeed();
    	iValue += getSorenRand().getSeed();
    
    	iValue += getNumCities();
    	iValue += getTotalPopulation();
    	iValue += getNumDeals();
    
    	iValue += GC.getMapINLINE().getOwnedPlots();
    	iValue += GC.getMapINLINE().getNumAreas();
    
    	for (iI = 0; iI < MAX_PLAYERS; iI++)
    	{
    		if (GET_PLAYER((PlayerTypes)iI).isEverAlive())
    		{
    			iMultiplier = getPlayerScore((PlayerTypes)iI);
    
    			switch (getTurnSlice() % 4)
    			{
    			case 0:
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getTotalPopulation() * 543271);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getTotalLand() * 327382);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getGold() * 107564);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getAssets() * 327455);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getPower() * 135647);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumCities() * 436432);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumUnits() * 324111);
    				iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumSelectionGroups() * 215356);
    				break;
    
    			case 1:
    				for (iJ = 0; iJ < NUM_YIELD_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).calculateTotalYield((YieldTypes)iJ) * 432754);
    				}
    
    				for (iJ = 0; iJ < NUM_COMMERCE_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getCommerceRate((CommerceTypes)iJ) * 432789);
    				}
    				break;
    
    			case 2:
    				for (iJ = 0; iJ < GC.getNumBonusInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getNumAvailableBonuses((BonusTypes)iJ) * 945732);
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBonusImport((BonusTypes)iJ) * 326443);
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBonusExport((BonusTypes)iJ) * 932211);
    				}
    
    				for (iJ = 0; iJ < GC.getNumImprovementInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getImprovementCount((ImprovementTypes)iJ) * 883422);
    				}
    
    				for (iJ = 0; iJ < GC.getNumBuildingClassInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getBuildingClassCountPlusMaking((BuildingClassTypes)iJ) * 954531);
    				}
    
    				for (iJ = 0; iJ < GC.getNumUnitClassInfos(); iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).getUnitClassCountPlusMaking((UnitClassTypes)iJ) * 754843);
    				}
    
    				for (iJ = 0; iJ < NUM_UNITAI_TYPES; iJ++)
    				{
    					iMultiplier += (GET_PLAYER((PlayerTypes)iI).AI_totalUnitAIs((UnitAITypes)iJ) * 643383);
    				}
    				break;
    
    			case 3:
    				for (pLoopUnit = GET_PLAYER((PlayerTypes)iI).firstUnit(&iLoop); pLoopUnit != NULL; pLoopUnit = GET_PLAYER((PlayerTypes)iI).nextUnit(&iLoop))
    				{
    					iMultiplier += (pLoopUnit->getX_INLINE() * 876543);
    					iMultiplier += (pLoopUnit->getY_INLINE() * 985310);
    					iMultiplier += (pLoopUnit->getDamage() * 736373);
    					iMultiplier += (pLoopUnit->getExperience() * 820622);
    					iMultiplier += (pLoopUnit->getLevel() * 367291);
    				}
    				break;
    			}
    
    			if (iMultiplier != 0)
    			{
    				iValue *= iMultiplier;
    			}
    		}
    	}
    
    	return iValue;
    }
    
    Full code listing for python dump script.

    Code:
    import os
    from CvPythonExtensions import *
    
    gc = CyGlobalContext()
    
    szFilename = "OOSLog.txt"
    iMaxFilenameTries = 100
    
    bWroteLog = False
    
    SEPERATOR = "-----------------------------------------------------------------\n"
    
    
    # Simply checks every game turn for OOS. If it finds it, writes the
    # info contained in the sync checksum to a log file, then sets the bWroteLog
    # variable so that it only happens once.
    def doGameUpdate():
        global bWroteLog
        bOOS = CyInterface().isOOSVisible()
    
        if (bOOS and not bWroteLog):
            writeLog()
            bWroteLog = True
    
    def writeLog():
        pFile = open(szFilename, "w")
    
        #
        # Global data
        #
        pFile.write(SEPERATOR)
        pFile.write(SEPERATOR)
    
        pFile.write("  GLOBALS  \n")
    
        pFile.write(SEPERATOR)
        pFile.write(SEPERATOR)
        pFile.write("\n\n")
    
        pFile.write("Next Map Rand Value: %d\n" % CyGame().getMapRand().get(10000, "OOS Log"))
        pFile.write("Next Soren Rand Value: %d\n" % CyGame().getSorenRand().get(10000, "OOS Log"))
    
        pFile.write("Total num cities: %d\n" % CyGame().getNumCities() )
        pFile.write("Total population: %d\n" % CyGame().getTotalPopulation() )
        pFile.write("Total Deals: %d\n" % CyGame().getNumDeals() )
    
        pFile.write("Total owned plots: %d\n" % CyMap().getOwnedPlots() )
        pFile.write("Total num areas: %d\n" % CyMap().getNumAreas() )
    
        pFile.write("\n\n")
    
        #
        # Player data
        #
        iPlayer = 0
        for iPlayer in range(gc.getMAX_PLAYERS()):
            pPlayer = gc.getPlayer(iPlayer)
            if (pPlayer.isEverAlive()):
                pFile.write(SEPERATOR)
                pFile.write(SEPERATOR)
    
                pFile.write("  PLAYER %d  \n" % iPlayer)
    
                pFile.write(SEPERATOR)
                pFile.write(SEPERATOR)
                pFile.write("\n\n")
    
                pFile.write("Basic data:\n")
                pFile.write("-----------\n")
                pFile.write("Player %d Score: %d\n" % (iPlayer, gc.getGame().getPlayerScore(iPlayer) ))
    
                pFile.write("Player %d Population: %d\n" % (iPlayer, pPlayer.getTotalPopulation() ) )
                pFile.write("Player %d Total Land: %d\n" % (iPlayer, pPlayer.getTotalLand() ) )
                pFile.write("Player %d Gold: %d\n" % (iPlayer, pPlayer.getGold() ) )
                pFile.write("Player %d Assets: %d\n" % (iPlayer, pPlayer.getAssets() ) )
                pFile.write("Player %d Power: %d\n" % (iPlayer, pPlayer.getPower() ) )
                pFile.write("Player %d Num Cities: %d\n" % (iPlayer, pPlayer.getNumCities() ) )
                pFile.write("Player %d Num Units: %d\n" % (iPlayer, pPlayer.getNumUnits() ) )
                pFile.write("Player %d Num Selection Groups: %d\n" % (iPlayer, pPlayer.getNumSelectionGroups() ) )
    
                pFile.write("\n\n")
    
                pFile.write("Yields:\n")
                pFile.write("-------\n")
                for iYield in range( int(YieldTypes.NUM_YIELD_TYPES) ):
                    pFile.write("Player %d %s Total Yield: %d\n" % (iPlayer, gc.getYieldInfo(iYield).getDescription(), pPlayer.calculateTotalYield(iYield) ))
    
                pFile.write("\n\n")
    
                pFile.write("Commerce:\n")
                pFile.write("---------\n")
                for iCommerce in range( int(CommerceTypes.NUM_COMMERCE_TYPES) ):
                    pFile.write("Player %d %s Total Commerce: %d\n" % (iPlayer, gc.getCommerceInfo(iCommerce).getDescription(), pPlayer.getCommerceRate(CommerceTypes(iCommerce)) ))
    
                pFile.write("\n\n")
    
                pFile.write("Bonus Info:\n")
                pFile.write("-----------\n")
                for iBonus in range(gc.getNumBonusInfos()):
                    pFile.write("Player %d, %s, Number Available: %d\n" % (iPlayer, gc.getBonusInfo(iBonus).getDescription(), pPlayer.getNumAvailableBonuses(iBonus) ))
                    pFile.write("Player %d, %s, Import: %d\n" % (iPlayer, gc.getBonusInfo(iBonus).getDescription(), pPlayer.getBonusImport(iBonus) ))
                    pFile.write("Player %d, %s, Export: %d\n" % (iPlayer, gc.getBonusInfo(iBonus).getDescription(), pPlayer.getBonusExport(iBonus) ))
                    pFile.write("\n")
    
                pFile.write("\n\n")
    
                pFile.write("Improvement Info:\n")
                pFile.write("-----------------\n")
                for iImprovement in range(gc.getNumImprovementInfos()):
                    pFile.write("Player %d, %s, Improvement count: %d\n" % (iPlayer, gc.getImprovementInfo(iImprovement).getDescription(), pPlayer.getImprovementCount(iImprovement) ))
    
                pFile.write("\n\n")
    
                pFile.write("Building Class Info:\n")
                pFile.write("--------------------\n")
                for iBuildingClass in range(gc.getNumBuildingClassInfos()):
                    pFile.write("Player %d, %s, Building class count plus building: %d\n" % (iPlayer, gc.getBuildingClassInfo(iBuildingClass).getDescription(), pPlayer.getBuildingClassCountPlusMaking(iBuildingClass) ))
    
                pFile.write("\n\n")
    
                pFile.write("Unit Class Info:\n")
                pFile.write("--------------------\n")
                for iUnitClass in range(gc.getNumUnitClassInfos()):
                    pFile.write("Player %d, %s, Unit class count plus training: %d\n" % (iPlayer, gc.getUnitClassInfo(iUnitClass).getDescription(), pPlayer.getUnitClassCountPlusMaking(iUnitClass) ))
    
                pFile.write("\n\n")
    
                pFile.write("UnitAI Types Info:\n")
                pFile.write("------------------\n")
                for iUnitAIType in range(int(UnitAITypes.NUM_UNITAI_TYPES)):
                    pFile.write("Player %d, %s, Unit AI Type count: %d\n" % (iPlayer, gc.getUnitAIInfo(iUnitAIType).getDescription(), pPlayer.AI_totalUnitAIs(UnitAITypes(iUnitAIType)) ))
                
    
                pFile.write("\n\n")
    
                pFile.write("Unit Info:\n")
                pFile.write("----------\n")
                iNumUnits = pPlayer.getNumUnits()
    
                if (iNumUnits == 0):
                    pFile.write("No Units")
                else:
                    pLoopUnitTuple = pPlayer.firstUnit(False)
                    while (pLoopUnitTuple[0] != None):
                        pUnit = pLoopUnitTuple[0]
                        pFile.write("Player %d, Unit ID: %d, %s\n" % (iPlayer, pUnit.getID(), pUnit.getName() ))
                        pFile.write("X: %d, Y: %d\n" % (pUnit.getX(), pUnit.getY()) )
                        pFile.write("Damage: %d\n" % pUnit.getDamage() )
                        pFile.write("Experience: %d\n" % pUnit.getExperience() )
                        pFile.write("Level: %d\n" % pUnit.getLevel() )
    
                        pLoopUnitTuple = pPlayer.nextUnit(pLoopUnitTuple[1], False)
                        pFile.write("\n")
                    
    
                # Space at end of player's info
                pFile.write("\n\n")
            
        # Close file
    
        pFile.close()
    
    

    Python script download:
     


  8. Gerikes

    Gerikes User of Run-on Sentences.

    Joined:
    Jul 26, 2005
    Messages:
    1,753
    Location:
    Massachusetts
    I hope you enjoyed and/or learned from this guide. If you have suggestions for improvement, questions, or corrections, please reply below.
     
  9. keldath

    keldath LivE LonG AnD PrOsPeR

    Joined:
    Dec 20, 2005
    Messages:
    6,449
    Location:
    israel
    wow.....gerikes......


    no words...

    absolutely wonderful.

    thank you for this awesome and hard work!
     
  10. Jeckel

    Jeckel Great Reverend

    Joined:
    Nov 16, 2005
    Messages:
    1,637
    Location:
    Peoria, IL
    Nice, I look forward to reading this when I get some free time. :)

    This tut was so needed, and I am totally super serial about that. ;)
     
  11. Jeckel

    Jeckel Great Reverend

    Joined:
    Nov 16, 2005
    Messages:
    1,637
    Location:
    Peoria, IL
    Just finished reading it, good job Gerikes. Excelsior!
     
  12. MatzeHH

    MatzeHH Chieftain

    Joined:
    Jan 8, 2006
    Messages:
    210
    Location:
    Germany
    Not too bad, man, not too bad!

    And thanks for the credits.

    Matze
     
  13. Shiggs713

    Shiggs713 Immortal

    Joined:
    Mar 11, 2007
    Messages:
    2,361
    Location:
    Indianapolis
    Very nice work, I appreciate this and hope to be modding soon!
     
  14. krille

    krille CivDOS Fanatic

    Joined:
    Sep 5, 2005
    Messages:
    337
    Great effort! Looks really helpful.

    Which version is more up to date, the Word version or the version you posted here on the forums?
     
  15. Maniac

    Maniac Apolyton Sage

    Joined:
    Nov 27, 2004
    Messages:
    5,588
    Location:
    Gent, Belgium
    I have tried to incorporate Gerikes' OOSLogger into my Planetfall mod. Unfortunately, no OOSLog is produced when someone is playing a multiplayer game. Nor is the 'MPLog.txt' that Gerikes mentions created. SynchLogging is enabled.

    Do Gerikes' guide and code no longer work for BtS, or am I missing something obvious??
     
  16. Gerikes

    Gerikes User of Run-on Sentences.

    Joined:
    Jul 26, 2005
    Messages:
    1,753
    Location:
    Massachusetts
    It will only produce the log when it finally does go OOS, so you won't see anything until then.

    I haven't seen the code base since vanilla, so I'm not sure what has changed. I think I once looked at the Warlords code and saw that it hadn't changed much, so I'm assuming the same for BtS. In any case, I don't have the expansion (and my Civ addiction sense is telling me it's probably best not to install it :p)

    The actual script really isn't too hard to understand; it's just a matter of putting that one if-statement in the right spot. Just find a place where a piece of code is run over and over (which is why I stuck it on the OnUpdate method, if I remember correctly, since I was sure that method would eventually be run even if the game is OOS). Once that code runs after the game has gone OOS, it should create a new file, dump the info to it, and save the file.

    There are still a couple of things I could think of that could go wrong:

    1.) The user is running Vista, which has a different set of security measures and may not be allowing the log file to be written. I'm using Python's own file API to write the log, so it might fail if there is a security conflict at the operating-system level. (For example, I can't even remember where it writes the log to, but if it tries to write to C:\Program Files\..etc..., it may fail in Vista.)

    You can try changing the code as follows. Place this class definition alongside the code:

    Code:
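    class Writer:
        def __init__(self):
            # Collects everything that would have gone to the file.
            self.log = ""
    
        def write(self, logComment):
            self.log += logComment
    
        def close(self):
            # Do whatever you want to do with the self.log variable...
            # print it to the screen, or whatever...
            pass  # placeholder so the method has a body and the class compiles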
    
    Now, replace the line...

    Code:
    pFile = open(szFilename, "w")
    
    ...with...

    Code:
    pFile = Writer()
    
    The Writer class just mocks the typical Python file object so that the rest of the code still works, but now you control what happens to the output. In the close method, you can display the contents of self.log on screen or do anything else you might want with it.
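
    As a rough sketch of the idea above, the swap can also be made automatic: try the real file first and fall back to the in-memory Writer only if opening fails. The names open_log and write_report and the sample numbers are made up for illustration, and the snippet is written to run in a plain Python install outside the game.

```python
class Writer:
    """In-memory stand-in for a file object: same write()/close() methods."""

    def __init__(self):
        self.log = ""

    def write(self, logComment):
        self.log += logComment

    def close(self):
        # Hypothetical choice: print the collected text instead of saving it.
        print(self.log)


def open_log(szFilename):
    """Return a real file if it can be opened for writing, else a Writer."""
    try:
        return open(szFilename, "w")
    except (IOError, OSError):
        return Writer()


def write_report(pFile):
    # Stands in for writeLog(): it only relies on write() and close().
    pFile.write("Total num cities: %d\n" % 3)
    pFile.write("Total population: %d\n" % 17)
    pFile.close()


# The directory does not exist, so open() fails and the Writer is used instead.
pFile = open_log("no_such_directory_for_oos_demo/OOSLog.txt")
write_report(pFile)
```

    Because write_report only calls write() and close(), it does not care which of the two objects it receives.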

    2.) The code is, for some reason, just not running at all. You might want to throw in a statement whose effect you can check, to be sure the code is running at all. It may be running but generating an error (although if I remember correctly Civ4 does well with showing any error that happens during Python along with the traceback, so you would've seen this). Perhaps the name of the method that is run constantly has changed from doGameUpdate to something else?

    Remember that the actual writeLog() function will only run once: when the game has gone OOS.
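
    To make both points concrete (checking that the hook runs, and that the log is written only once), here is a small simulation that runs outside the game. isOOSVisible is a stub standing in for CyInterface().isOOSVisible(), and the two counters are hypothetical additions that exist only to make the behaviour visible.

```python
bWroteLog = False
iWriteCount = 0    # hypothetical: counts how often writeLog() ran
iUpdateCount = 0   # hypothetical "heartbeat": counts calls to doGameUpdate()


def isOOSVisible(iUpdate):
    # Stub for CyInterface().isOOSVisible(): pretend the game desyncs at update 3.
    return iUpdate >= 3


def writeLog():
    global iWriteCount
    iWriteCount += 1


def doGameUpdate(iUpdate):
    global bWroteLog, iUpdateCount
    iUpdateCount += 1
    if isOOSVisible(iUpdate) and not bWroteLog:
        writeLog()
        bWroteLog = True


for iUpdate in range(10):
    doGameUpdate(iUpdate)

print("updates seen: %d, logs written: %d" % (iUpdateCount, iWriteCount))
# prints "updates seen: 10, logs written: 1"
```

    If the heartbeat counter stays at zero in a real game, the hook is not being called and no amount of file-writing fixes will help.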
     
  17. Maniac

    Maniac Apolyton Sage

    Joined:
    Nov 27, 2004
    Messages:
    5,588
    Location:
    Gent, Belgium
    Thanks. I don't understand enough python to know what I could do with that Writer function, but thanks for the response in any case.
     
  18. Gerikes

    Gerikes User of Run-on Sentences.

    Joined:
    Jul 26, 2005
    Messages:
    1,753
    Location:
    Massachusetts
    I can't remember what the exact code is, but there's some way in python to write to the actual log files. So you could write in code....


    CySomething.LogMessage("Test")

    And it would log "Test" to the log files. You can use this to do...

    CySomething.LogMessage(self.log)

    In fact, it might just be a matter of doing this...

    print self.log

    in the code I have above (in the close method). This would dump the contents of self.log (one huge string holding what the OOSLog.txt file would have contained) to one of the Python log files. It might not work, since the string might be too large, or the log might ignore whitespace and clump it all together (which I guess could still be workable with a diff program).
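
    However the dump ends up being captured, the useful step is comparing one machine's output with another's. A minimal sketch of that comparison with Python's standard difflib module, meant to be run outside the game on two saved dumps; the two sample strings and the helper name differing_lines are invented for illustration.

```python
import difflib

# Invented sample dumps from two machines in the same game.
szHostLog = (
    "Player 0 Gold: 120\n"
    "Player 0 Num Units: 7\n"
    "Player 1 Gold: 95\n"
)
szClientLog = (
    "Player 0 Gold: 120\n"
    "Player 0 Num Units: 8\n"
    "Player 1 Gold: 95\n"
)


def differing_lines(szA, szB):
    """Return only the lines that differ: '- ' marks szA, '+ ' marks szB."""
    diff = difflib.ndiff(szA.splitlines(), szB.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


for line in differing_lines(szHostLog, szClientLog):
    print(line)
```

    With these samples the only lines reported are the two "Num Units" entries, which is the kind of hint that narrows down where the game states diverged.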

    I'm suspecting that the log file is not being written because it's probably trying to write to an area that it does not have rights to.

    So, you might want to try one of these...

    Code:
        def close(self):
            print "OOS Error Discovered:\n" + self.log
    
    ...or...

    Code:
        def close(self):
            raise Exception, "OOS Error Discovered:\n" + self.log
    
    One of these methods should get Python to print the log (that huge string of values the log file would have contained) to the Python log file. I think print works, but if not you can try raising an exception.

    But, once again, you need to make sure that the code is actually running in the first place, otherwise, all this effort is for naught.
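
    On the earlier worry that one huge string might be too large to log in a single call: whether the game's log actually has such a limit is not confirmed here, but if it does, one workaround is to hand the text over in smaller blocks. In this sketch log_in_chunks and its emit parameter are hypothetical; inside the mod, emit would be whatever function ends up writing to the Python log.

```python
def log_in_chunks(szLog, emit, iMaxLines=50):
    """Pass szLog to emit() in blocks of at most iMaxLines lines each."""
    lines = szLog.splitlines()
    iChunks = 0
    for iStart in range(0, len(lines), iMaxLines):
        emit("\n".join(lines[iStart:iStart + iMaxLines]))
        iChunks += 1
    return iChunks


# Demo: 120 invented log lines, collected in a list instead of a real log.
collected = []
szLog = "".join(["Player %d Gold: %d\n" % (i, i * 10) for i in range(120)])
iChunks = log_in_chunks(szLog, collected.append, iMaxLines=50)
print("chunks emitted: %d" % iChunks)  # prints "chunks emitted: 3"
```

    Splitting on line boundaries keeps each entry intact, so the pieces can still be stitched back together and diffed.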
     
  19. Maniac

    Maniac Apolyton Sage

    Joined:
    Nov 27, 2004
    Messages:
    5,588
    Location:
    Gent, Belgium
    Thanks for your post.

    I ended up trying this code in the close function:
    Code:
    CvUtil.pyPrint(self.log) + self.log
    Now the OOSlog information (160 MB for one turn!) is written to the pythonDebug file :D but at the same time it is also causing a python exception :(
    Code:
    TypeError: unsupported operand type(s) for +: 'NoneType' and 'unicode'
    ERR: Python function onEvent failed, module CvEventInterface
    I guess just remove one of them and see what happens.
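
    For anyone hitting the same traceback: the 'NoneType' in the message means the left-hand side of the + was None, which is what a function call evaluates to when the function returns nothing. The sketch below reproduces that outside the game. Its pyPrint is a stand-in written for this example (assumed, like the real one, to log its argument and return nothing), not the actual CvUtil function.

```python
logged = []


def pyPrint(szMessage):
    # Stand-in for CvUtil.pyPrint: records the text and returns None.
    logged.append(szMessage)


szLog = u"Player 0 Gold: 120\n"

try:
    pyPrint(szLog) + szLog   # the call logs first, then None + unicode fails
    bRaised = False
except TypeError:
    bRaised = True

pyPrint(szLog)               # the call on its own logs without raising
print("raised: %s, messages logged: %d" % (bRaised, len(logged)))
# prints "raised: True, messages logged: 2"
```

    That matches what was observed: the text reached the log, and the exception came afterwards from the addition. Dropping the trailing "+ self.log" should leave just the logging call.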
     
