Newer versions of Windows are less forgiving about certain programming errors. Take a look at
this thread that CTD took place in an similar scenario.
Now that was a fascinating thread! I may comment on this bug there. You should take a look at the rather simple change that this debug required. The thread you posted suggests that this may be an issue elsewhere throughout all vanilla based mods and when I found the issue I was left to wonder about that because the 'fix' was made in a function I do not suspect we've ever modified.
Mind you, the problem with the gamestate doesn't begin there. Where it actually begins is a bigger mystery. How would you get two duplicate unit IDs on the same plot in the first place? In this case, that may have even happened in rounds that took place before this save. It won't cause a problem until the unit then tries to move again (or gets killed which results in a move to an invalid plot.)
A little further investigation should be done on this still I think. Perhaps I should place an assert on CvPlot's add unit function to try and catch when a duplicate unit ID is added to the list.
The same thinking applies to the bug you were discussing in the other thread. A NULL pointer is only returned by the GetUnit function when cycling through a list, be it a plot list or a selection group list if something has gone horribly wrong. Even eliminating a unit mid-loop should still not allow such a problem since the 'next' pointer is re-established properly during the removal process. removeunit not being run while the unit is otherwise deleted would cause this problem. Most of these reasons would be painfully obvious immediately upon being incorrectly programmed.
But if you have a situation like this one where the unit was somehow added to the list twice, in this case, the plot list, you get a crash when the unit dies and the assumption there's only one instance of its ID makes it only removed once while the rest of the unit data is eliminated, making it apparent why you then get a crash the next time the unit data is asked for since the plot list is saying it should still exist given that only one of the two instances listed there was removed when the unit died.
Ultimately, my point is:
Code:
if (pLoopUnit == NULL)
{
// Koshling - this really isn't a valid condition I think but it's been seen
// and until we can understand the underlying cause, mitigate by ignoring the
// invalid unit
continue;
}
as a fix is terribly flawed. As Nightinggale was pointing out, this is only addressing a symptom, not addressing the disease. There are so many places in the code where node lists are used that just patching up one instance, especially on plot node lists, of a null call will simply lead to another elsewhere. There's a reason this should be safe despite getUnit having a valid NULL response, and that is that if you get that NULL response, the problem is in a corruption in the node list, and that should never happen... unless there's a big problem elsewhere. If you don't fix that initial flawed gamestate creating error, you are going to set yourself up for later crashes and other larger problems down the line.
Only by following the debugger run very carefully was I able to figure out what happened as it was so far out of the box of any presumptions I might have had coming into the issue that I had to be forced to recognize what was actually taking place by watching the code process through every single step. Took me a while to narrow in on it. I would have thought it was impossible what was actually taking place.
In this case, the effort led to deeper analysis of what was happening here:
Code:
void CvPlot::removeUnit(CvUnit* pUnit, bool bUpdate)
{
CLLNode<IDInfo>* pUnitNode;
pUnitNode = headUnitNode();
while (pUnitNode != NULL)
{
if (::getUnit(pUnitNode->m_data) == pUnit)
{
FAssertMsg(::getUnit(pUnitNode->m_data)->at(getX_INLINE(), getY_INLINE()), "The current unit instance is expected to be at getX_INLINE and getY_INLINE");
m_units.deleteNode(pUnitNode);
at m_units.deleteNode(pUnitNode);
Code:
template <class tVARTYPE>
inline CLLNode<tVARTYPE>* CLinkList<tVARTYPE>::deleteNode(CLLNode<tVARTYPE>* pNode)
{
CLLNode<tVARTYPE>* pPrevNode;
CLLNode<tVARTYPE>* pNextNode;
FAssert(pNode != NULL);
pPrevNode = pNode->m_pPrev;
pNextNode = pNode->m_pNext;
if ((pPrevNode != NULL) && (pNextNode != NULL))
{
pPrevNode->m_pNext = pNextNode;
pNextNode->m_pPrev = pPrevNode;
}
When pNextNode also showed up as the same player and unit ID I did a double-take. Could there REALLY be a duplicate unit ID Node??? Sure enough there was. So the easiest 'fix' to correcting this was to back up and adjust removeUnit to:
Code:
while (pUnitNode != NULL)
{
if (::getUnit(pUnitNode->m_data) == pUnit)
{
FAssertMsg(::getUnit(pUnitNode->m_data)->at(getX_INLINE(), getY_INLINE()), "The current unit instance is expected to be at getX_INLINE and getY_INLINE");
m_units.deleteNode(pUnitNode);
/*break;*///TBDebug: It has been found possible that a node may have somehow duplicated itself and thus the plot may have more than 1 of the node. If these duplicates aren't destroyed, we get a crash as well. Best case would be to somehow catch how this gets setup but in the debugging case that discovered this, it likely happened long beforehand. Jumptonearestvalidplot is the suspected culprit as it was an exile that showed the issue.
}
//else
//{
pUnitNode = nextUnitNode(pUnitNode);
//}
}
I'd bet quite a bit that the same thing (a duplicate ID) was happening in the CVSelection remove Unit command in their mod.
Again though, now I need to figure out what happened to create this duplication!
That different OPs and compilers handle the issue differently isn't too surprising really. It's a matter of a difference in what would happen in the loop when it hits a null unit pointer ejecting from the while sequence rather than holding on and hitting a crash due to a null reference. Either that or the gamestate altering problem in their case was based on a random being generated that was varying on one system to the next.