Multiplayer: Hacking A Fix Around OOS Errors

Afforess · Sep 6, 2014

I have things mostly working, except for one major setback. FFreeListTrashArray containers are impossible to resynchronize mid-game. Saves work because they destroy or create a new FFreeListTrashArray. You can not re-organize the structure in-place, as they do not allow any choice of the IDs used to allocate indexes. This is bad for cities and units, which are stored in FFreeListTrashArray.

Destroying and Creating a new FFreeListTrashArray in its place is also not viable as the FFreeListTrashArray "tracks memory usage internally", meaning when you destroy it, it also destroys every object in its containers, calling the destructor on cities and units. Not good. The game's graphics engine is very unhappy if you wipe those.

I initially tried to iterate over the existing elements, and update them in place, if they had an ID that matched what was sent over to be resynced. If there was extra units not sent in the resync (client has extra units that the host game lacks) kill those. But it falls apart when the host game has extra units the client lacks. There is no way to allocate the units on the client and guarantee they will have the correct IDs from the FFreeListTrashArray, since you have no control over how it allocates ids.

Is this a big deal? Yes, if units are out of order in FFreeListTrashArray, they will be out of order when the game iterates over units (which happens often), and will cause a breakdown in the expected determinism.

I think the only reasonable option is to replace the FFreeListTrashArray for cities and units, and use a new storage container. FFreeListTrashArray is Firaxis's attempt at creating a Linked-Hash Map, which provides constant access with the key (ids) and also provides deterministic orderings when iterating over elements (which the stdext::hash_map lacks).

I was unhappy to discover the C++ STL has no Linked Hash Map standard object. This is a rather common feature in other languages... I now have to locate a suitable alternative. [I am considering just combining a std::vector and std::hash_map in a class, which could provide iteration via the vector, and constant access via the hash_map. Is anyone aware to any drawbacks to this approach, besides the duplicated memory footprint?]

Assuming I do replace FFreeListTrashArray with a Linked Hash Map, where I can control the ID-space, then the units/cities resync problems should be solved.

Thunderbrd · Sep 6, 2014

I would be willing to wager you'll run across problems with the exe if you do this but I don't mean to say that to deter you. If it works it sounds like it would help things.

Afforess · Sep 6, 2014

Thunderbrd said:
I would be willing to wager you'll run across problems with the exe if you do this but I don't mean to say that to deter you. If it works it sounds like it would help things.

The actual container m_cities and m_units in CvPlayer are not exposed to the engine itself, and iteration and lookup is all controlled in the CvPlayer class through functions which can be modified.

I don't think there are any EXE problems with this, as long as the existing function headers remain the same.

Further, Koshling modified the ID allocation order (to avoid exhausting the global id space) and there were no adverse affects.

AIAndy · Sep 6, 2014

Consider using Boost.MultiIndex which allows you to have a container with efficient lookup by key and efficient sequential traversal in one. That is similar to combining vector and map.

http://www.boost.org/doc/libs/1_32_0/libs/multi_index/doc/index.html

Thunderbrd · Sep 6, 2014

Ok I'll admit I don't have the deepest understandings on these issues BUT there's a few things I wonder...

1) Will a differing method enable us to eliminate the current unit limit?

2) Would a new method allow us to reassign the old ID's of units no longer in play to new units so as to help in delaying the unit count limit wall?

Does any of this have anything to do whatsoever with the reason there is a limit to the number of units a player may have? We've had some trouble running up against this limit before and some modifications made have enhanced this concern.

Afforess · Sep 6, 2014

Thunderbrd said:
Ok I'll admit I don't have the deepest understandings on these issues BUT there's a few things I wonder...

1) Will a differing method enable us to eliminate the current unit limit?

2) Would a new method allow us to reassign the old ID's of units no longer in play to new units so as to help in delaying the unit count limit wall?

Does any of this have anything to do whatsoever with the reason there is a limit to the number of units a player may have? We've had some trouble running up against this limit before and some modifications made have enhanced this concern.

Yes to both counts. There is no problem (that I am aware of ) preventing previously used ids could not be recycled, and as long as the game has less than 2^32 - 1 units alive, per player, things should be fine.

AIAndy said:
Consider using Boost.MultiIndex which allows you to have a container with efficient lookup by key and efficient sequential traversal in one. That is similar to combining vector and map.

http://www.boost.org/doc/libs/1_32_0/libs/multi_index/doc/index.html

Thanks, will look into.

Thunderbrd · Sep 7, 2014

Afforess said:
Yes to both counts. There is no problem (that I am aware of ) preventing previously used ids could not be recycled, and as long as the game has less than 2^32 - 1 units alive, per player, things should be fine.

Thanks, will look into.

Then with all my heart I hope to see you successful in this endeavor!!! That would be very great advances for us all.

Afforess · Sep 7, 2014

FYI, I have a local test case where I replaced m_units FFreeTrashListArray with std::vector and stdext::hash_map and nothing has blown up. EXE problems disproven.

Afforess · Sep 7, 2014

Thunderbrd said:
Then with all my heart I hope to see you successful in this endeavor!!! That would be very great advances for us all.

It's coded: http://forums.civfanatics.com/showpost.php?p=13439824&postcount=1187

Ids for units will recycle. Ids for cities will not, because I didn't feel a need to recycle them. I could, but it would probably be unneeded complexity.

It is save-backwards compatible.

Spirictum · Sep 8, 2014

Does this mean RAND is already OOS-immune?

Afforess · Sep 8, 2014

Spirictum said:
Does this mean RAND is already OOS-immune?

Nope.

That discussion about unit ids is all a side-problem that needed to be fixed to continue work on resyncing. The side-problem is now fixed. (As a side benefit, it also solves a problem some users had where they had games with more units than the previous id system would allow... I doubt RAND users ever hit it, but I understand some C2C games have it).

I will make a bigger announcement if/when OOS-solutions are implemented.

Koshling · Sep 8, 2014

Afforess said:
I have things mostly working, except for one major setback. FFreeListTrashArray containers are impossible to resynchronize mid-game. Saves work because they destroy or create a new FFreeListTrashArray. You can not re-organize the structure in-place, as they do not allow any choice of the IDs used to allocate indexes. This is bad for cities and units, which are stored in FFreeListTrashArray.

Destroying and Creating a new FFreeListTrashArray in its place is also not viable as the FFreeListTrashArray "tracks memory usage internally", meaning when you destroy it, it also destroys every object in its containers, calling the destructor on cities and units. Not good. The game's graphics engine is very unhappy if you wipe those.

I initially tried to iterate over the existing elements, and update them in place, if they had an ID that matched what was sent over to be resynced. If there was extra units not sent in the resync (client has extra units that the host game lacks) kill those. But it falls apart when the host game has extra units the client lacks. There is no way to allocate the units on the client and guarantee they will have the correct IDs from the FFreeListTrashArray, since you have no control over how it allocates ids.

Is this a big deal? Yes, if units are out of order in FFreeListTrashArray, they will be out of order when the game iterates over units (which happens often), and will cause a breakdown in the expected determinism.

I think the only reasonable option is to replace the FFreeListTrashArray for cities and units, and use a new storage container. FFreeListTrashArray is Firaxis's attempt at creating a Linked-Hash Map, which provides constant access with the key (ids) and also provides deterministic orderings when iterating over elements (which the stdext::hash_map lacks).

I was unhappy to discover the C++ STL has no Linked Hash Map standard object. This is a rather common feature in other languages... I now have to locate a suitable alternative. [I am considering just combining a std::vector and std::hash_map in a class, which could provide iteration via the vector, and constant access via the hash_map. Is anyone aware to any drawbacks to this approach, besides the duplicated memory footprint?]

Assuming I do replace FFreeListTrashArray with a Linked Hash Map, where I can control the ID-space, then the units/cities resync problems should be solved.

This sounds entirely reasonable to me in principal, but it's really rather hard to come up with an algorithm that can make this work I think, because of the (inadequate) size of the key space. You require all of the following:

1) Allocation will never assign an id that has previously been assigned (NOT just one that is not CURRENTLY in use, because unit ids etc. can be kept in other objects as references, and their non-existence due to destruction is only revealed by the attempt to dereferece the id [to give null], which means it cannot safely be re-used [else it will deref to the new instance])

2) Ids can be imposed (this is a new requirement you have to get out of OOS)

3) Ids are 32-bit quantities

The current scheme meets the needs of (1) and (3), but not (2). Any sort of hash-list would fail on (2) due to unavoidable key reuse because a 32-bit space is nowhere near large enough to avoid it by statistical means, and keeping a list of all previously allocated ids would be unscalable [and would also need syncing])

IMO your best bet is to do this (but it's FAR from trivial):

i) Internally increase id size to 64-bits

ii) Maintain a 64 bit internal id <-> 32 bit external id map, that guarantees (1-1) mapping but does NOT guarantee non-reuse of the external id. Provide the external id via the DLL externals (so that's what the game engine sees). Use new internal only methods to retrieve the full id and use that internally to the DLL (and in syncing)

iii) To form the 64-bit id use a linked hash map (or similar) with a 32-bit id space, that guarantees uniqueness within the current set but doesn't know anything about reuse. On allocation form the top 32-bits from a monotonically increasing sequence incremented on each allocation (ensures no reuse of the 64-bit result) with the top 6 bits reserved for the player number of the local human player (i.e. - each location being synced guarantees to allocate from a distinct space and so they cannot clash)

The need to do the 64<->32 internal<->external translations everywhere an id might escape from the DLL makes this a LOT of work though.

Afforess · Sep 8, 2014

Koshling said:
This sounds entirely reasonable to me in principal, but it's really rather hard to come up with an algorithm that can make this work I think, because of the (inadequate) size of the key space. You require all of the following:

1) Allocation will never assign an id that has previously been assigned (NOT just one that is not CURRENTLY in use, because unit ids etc. can be kept in other objects as references, and their non-existence due to destruction is only revealed by the attempt to dereferece the id [to give null], which means it cannot safely be re-used [else it will deref to the new instance])

What sort of objects store the unit id as a reference? [Off the top of my head, Selection Groups and Tiles both do] I let the game recycle unit ids and I didn't see any immediate issues, although very probable that I either didn't notice or haven't run enough games to see the issue crop up.

It seems more straightforward to attempt to locate and invalidate id use when a unit is destroyed than to attempt a larger 64-bit mapping.

alberts8 · Sep 8, 2014

Afforess said:
What sort of objects store the unit id as a reference? [Off the top of my head, Selection Groups and Tiles both do] I let the game recycle unit ids and I didn't see any immediate issues, although very probable that I either didn't notice or haven't run enough games to see the issue crop up.

It seems more straightforward to attempt to locate and invalidate id use when a unit is destroyed than to attempt a larger 64-bit mapping.

You can change the ID usage in the dll but the problem is we don't know and can't change what the exe does with those ID's.

Afforess · Sep 8, 2014

alberts2 said:
You can change the ID usage in the dll but the problem is we don't know and can't change what the exe does with those ID's.

That's not entirely true. Whenever the EXE wants the unit it has to call the DLL method getUnit. So while we can't know what it is doing with them, we can observe important information. Information like the frequency of lookups, whether it performs lookups on known dead id space, etc.

That is all we really need to know.

Afforess · Sep 8, 2014

Koshling said:
This sounds entirely reasonable to me in principal, but it's really rather hard to come up with an algorithm that can make this work I think, because of the (inadequate) size of the key space.

I feel I should also mention that what is an inadequate key space for Caveman2Cosmos is an adequate key space for RAND. I suspect that keys can be recycled, but even if they can not, I have never once seen a player consume the entire key space in a RAND game. It just doesn't happen.

Also, random thought. If C2C runs out of key space, could you not, say, up the DLL MAX_PLAYERS to 100, and keep the extra 50 spaces as extra slots for when a player's key space is consumed? When one player's full 2^32 bit range is consumed, you could switch all of the content over to a new, unused player slot, and boom, fresh key space.

alberts8 · Sep 9, 2014

Afforess said:
I feel I should also mention that what is an inadequate key space for Caveman2Cosmos is an adequate key space for RAND. I suspect that keys can be recycled, but even if they can not, I have never once seen a player consume the entire key space in a RAND game. It just doesn't happen.

Also, random thought. If C2C runs out of key space, could you not, say, up the DLL MAX_PLAYERS to 100, and keep the extra 50 spaces as extra slots for when a player's key space is consumed? When one player's full 2^32 bit range is consumed, you could switch all of the content over to a new, unused player slot, and boom, fresh key space.

I don't think players in C2C ever consumed the entire key space because 2^32 bit is alot maybe only in case of bad key allocation. I only saw a few games reaching FLTA_MAX_BUCKETS that is 8192 with Units.

AIAndy · Sep 9, 2014

I don't really see the problem with key reuse. So there is no need to keep a list of previously allocated IDs. Just use synced rands until you find a free spot in the hash map.

alberts8 · Sep 9, 2014

Maybe i'am lost but i don't see the need to reuse them. Those ID's have a 2^32 range that should be enough. The 8192 FLTA_MAX_BUCKETS limit has nothing to do with ID's and it should be gone if we use a Vector and a HashMap as containers.

Koshling · Sep 9, 2014

Recycling causes rare and obscure bugs (was seen in the past with a first attempt to change this structure, that did allow recycling, a few years ago). I'm not sure what the exhaustive list of things hat can hold unit or selection group ids is, but off the top of my head:

Selection groups (units they are shadowing and units that comprise the group of course)
Selection group AI (units they are rendevousing with I think)
Tiles (selection groups present)
Units I think (transport units know what they are carrying or visa versa, also cached id of commander)

Trying to find all the uses when a unit dies will be horrible for performance, which is why it relies on the later dereferencing returning null (which requires the id to not be recycled). The best way to test the impact is to modify it to recycle IMMEDIATELY (i.e. - be the worst possible case), which will show all the otherwise-obscure cases right up front. The resulting issues should then show up much more commonly for debugging. If they can all be fixed then that would green-light recycling generally.

Multiplayer: Hacking A Fix Around OOS Errors

The White Wizard

C2C War Dog

The White Wizard

Deity

C2C War Dog

The White Wizard

C2C War Dog

The White Wizard

The White Wizard

Immortal

The White Wizard

Vorlon

The White Wizard

Emperor

The White Wizard

The White Wizard

Emperor

Deity

Emperor

Vorlon

Similar threads