In this post I will give some detailed information about how the multiplayer synchronization works in Civ4 and how to avoid and find OOS errors. Everyone dealing with C++ or Python code or importing either needs to keep this in mind. How does the Civ4 synchronization model work? The full gamestate is kept on all computers at the same time. That is all information that has a direct effect on gameplay like the unit and city positions, what a city currently produces and what each player researches. It also includes the full state of each AI. To keep it that way, all code that changes the gamestate must run on all computers in the same order and it must be deterministic. This also means that code that changes the gamestate only depends on the synchronized gamestate. So if S is the synchronized gamestate and F and G are functions that represent a change of the gamestate, then all computers must run F and G in the same order and the result G(F(S)) is synchronized as well. Synchronization and Random Numbers So how do you keep the code deterministic but still have random numbers? Well, pseudo random number generators actually generate a deterministic sequence of numbers depending on the seed. So if all computers start with the same seed they will get the same random numbers. That means the state of the RNG is part of the gamestate. It is synchronized. That also means using it changes the gamestate so it may only be used in code that runs on all computers, in synchronized code. There is also an async RNG which you can use to generate random numbers in code that is not synced. Which code is synchronized? In general all code that changes the game state is synchronized, all code that deals with the user interface is not. There are some functions that are used both in async and in sync context. In which context they are called is determined by a bool that is passed so you need to keep that in mind and use the right RNG depending on that bool. Pretty much all synchronized code starts either from CvGame::update or from messages. How to initiate synced code from async code and vice versa Async code is often started from user input like pressing a button or the result of a popup. If you only want to display some information screen, that can be done directly in async code. But lets assume pressing that button should actually initiate a command on the unit which has a gameplay effect. Then you need to start synced code from the async user input. That is done with messages. A message can contain any amount of information, which is serialized and passed to all computers, including the one sending the message. The Civ4 exe makes sure that all messages are handled on all computers in the same order so the code executed from messages is synced. Sometimes you want to change the UI or camera on one computer only starting from synced code. In that case you can simply use getActivePlayer in CvGame which will be different on the different computers so "if (GC.getGameINLINE().getActivePlayer() == iPlayer) doSomething()" will execute doSomething on one computer only. Caches and Synchronization For performance reasons a lot of expensive information is cached. If you add a cache, you can choose between two synchronization variants. Variant 1 is to consider the cache state synchronized. That means the restriction that only synchronized code may change the cache state applies. Async code may read from it, but if it is a cache miss, the result of the uncached calculation may not be added to the cache. Variant 2 is to consider the query result synchronized. That means that the cached result and the result of the uncached calculation must be equal at all times. So if anything the calculation is based on changes, the cache entry must be marked as dirty. Multi-Threading and Synchronization To add multiple threads to parts of the code you usually start them at some point and then join them to the main thread at a later point. There is no guarantee on the execution order in between the threads so one might be executed entirely before the others or completely mixed. To keep the game state synchronization, the result of the threads must be deterministic and must not depend on the execution order. This will often be the case if you are just gathering information or valueing buildings and the like with one important problem: the RNG. Using the synced RNG changes the game state so any use of it in the threads would make the result non deterministic. You can solve this using this pattern: Generate one random number in the main thread for each worker thread. Use that as the seed of a thread-specific RNG. Now the threads can use their specific RNGs and the result will be deterministic again. Finding out of sync errors Out of sync errors (OOS) are very difficult to find without a good OOS report from a user. The main tools are the OOS log and the Random log. Both are generated on each computer participating in a multiplayer game if the respective INI values are set. The important information is in the difference between the logs. The OOS log contains a good part of the game state in text form that is also used to generate a checksum to recognize out of sync states. The Random log contains all uses and results of the synced RNG. If the OOS log shows that the state of the random number generator is different, then you can look into the Random log to see at which point it started to be different. A common source is that the RNG is used in async context which will result in a RNG usage that can only be seen in one log. Use a program like the merge program in Tortoise to compare the different logs and when you have found the first different random number, then search with a program like Windows Grep where the log phrase of the RNG usage is in the code files. Mind in general that a state out of sync snowballs, especially a bad usage of the RNG as all following random numbers will be different as well so it can sometimes be hard to find the exact root cause of an out of sync as a lot of other things have gone out of sync before it is recognized. An example could be that a promotion is set from async context which gives a unit double movement on forest which means it generates a different path and therefore might fight against different animals and you might see a subdued animal generated on one computer but the root cause was actually a bad promotion.