1. We have added a Gift Upgrades feature that allows you to gift an account upgrade to another member, just in time for the holiday season. You can see the gift option when going to the Account Upgrades screen, or on any user profile screen.
    Dismiss Notice

HOW TO: Isolate Crashes

Discussion in 'Civ4 - Modding Tutorials & Reference' started by Kael, Jun 21, 2006.

  1. Kael

    Kael Deity

    Joined:
    May 6, 2002
    Messages:
    17,401
    Location:
    Ohio
    [tab]Okay, you have finally done it. After many long hours you have the mod you always wanted to make. But then you, or someone else, try to play and it crashes. What do you do?


    Isolation and Resolution

    [tab]There are two steps to fixing problems, isolating to find out what piece of code is causing the problem and changing that code to resolve the issue. This article will only deal with isolation. Resolution is going to be specific to your code so I can’t offer general advice. The hardest part of fixing problems is isolating the issue, typically if you can do that resolution is easy.


    The Changelog

    [tab]Before we start talking through the steps it is important that you keep a changelog of the changes you are making between versions. Players will often give you a good indication of what was going on during the crash and if you have a good changelog you will be able to drill into the issue right away.

    [tab]Changelogs become even more important as you start writing new functions. In these cases crashes may not be a problem in the function that crashes so much as in “bad” data passed to them. It is extremely frustrating to be working on a new crash in a function that hasn’t changed in forever and to have forgotten that you made a change to a related module that passed data to it.

    Special note, the SDK changelog:

    [tab]The changelog is good advice for modding, but if you are going to be making SDK changes it is mandatory. Not only is it required to help troubleshoot issues but you will need the changelog to convert all of your changes into the next SDK when Firaxis releases future patches.

    [tab]The following is the format I use for SDK changes:

    [tab]It’s pretty self explanatory. It contains why the change was made, the version it was checked into, the file and function that was changed and the actual code. Notice when the code is changed the original code is always left in but commented out. I never remove “real” code, it is too important as a reference.


    Step 1: Get a save

    [tab]To fix a problem you are going to need a save game right before a reproduceable crash. Load up the save and verify that the crash occurs consistently. If the crash doesn’t happen then we can’t go any further. We will talk about intermittent issues later in this article.


    The Little Hammer

    [tab]Once we have a reproducible crash you can begin to isolate exactly what is going on in the save that is triggering the crash. I usually go through the following steps:

    1. Remove all players except the player and one other in the Worldbuilder. See if the crash still occurs. If the Crash still occurs go to 2, if it doesn’t go to step 3.

    2. If the crash still occurs with only the player and one other, we will need to delete more stuff. Remove all players except the player and one other again, but this time remove all of the units and cities for two existing players except for an empty city each. Make sure neither of these cities is finishing production the turn the crash occurs. If the crash still occurs go to the next section on ‘The Big Hammer’. If the crash doesn’t occur go to step 3.

    3. Now that we have a save game where we can stop the crashes by removing a certain portion of the environment we have only to add pieces back until the crash happens again. This can get time consuming but I typically reload and only delete half of what I did the previous time (when the crash didn’t occur) and see of the crash still doesn’t happen. If it doesn’t repeat this step but delete even less. If it does repeat this step but only delete half of what I just added back in until I narrow down on the specific unit/action/city/etc that is causing the issue.


    The Big Hammer

    [tab]The Little Hammer doesn’t always work. Sometimes you can remove just about everything in the save game and it still crashes. The Big Hammer is another method of isolation that works at the mod level instead of within the save game. It is very useful for isolating python and SDK issues.

    [tab]As above you will need a reproducible crash to test with.

    1. If you are using an SDK mod rename the ‘CvGameCoreDLL.dll’ to ‘CvGameCoreDLL old.dll’. If the crash still occurs then you can assume that the problem isn’t in the SDK modified sections. If the crash no longer occurs then you have an SDK issue. You will have to go through your changelog (you do have a changelog right!) and review the changes that have been made in your latest version.

    [tab]I keep all of the old CvGameCoreDLL.dll files from my versions so I can walk back through the changes until I find the first CvGameCoreDLL.dll that is broken, review the changes that were checked into that one and I should have my issue isolated.

    2. If you aren’t using an SDK mod or removing the CvGameCoreDLL.dll didn’t fix your problem your next suspect is python. To rule out a python issue rename your python directory to ‘python.old’ then reload your save and see if the crash still occurs.

    [tab]If it still occurs with your CvGameCoreDLL.dll and your python directory out of your way then you have to be looking at an XML issue and you should probably go back to the Little Hammer section and see what is still around to delete.

    [tab]Assuming renaming the python directory keeps the crash from occurring you can start adding python file back until the crash returns and then see what changed in the last python file added since the last version. You could also swap in python directories from older version of your mod to see if those have the same issue (to prove if this crash is a new issue or if it has been around before).


    Intermittent Issues

    [tab]Intermittant issues are the worst! If they are extremely intermittent then they are almost impossible to trap because they may happen on only 1 out of 20 times. You may make a change that you think will fix it and load the save 40 times without a problem and you still wont know if you actually did anything to help with the issue or if you just got lucky.

    [tab]Fortunatly since Civ4 saves the random generator these issues are less common. Even “random” events will occur exactly as before if no one consumes a random number they didn’t in the original.

    [tab]For example I had a reproduceable crash that occured if I fortified my unit and ended the turn. If I used the unit to attack and kill a nearby barbarian the crash wouldn’t happen. So I spent hours trying to figure out what was up with that barbarian or the attacking unit that could have anything to do with the error. Then I tried to attack another unit with one of mine, again there was no crash. All I was doing with the attack was eating up some of the random numbers so that the actual random event that caused the crash didn’t occur. The lesson I learned from this was, if you have crashes related to random functions you need to make sure your actions are identical each time you reload the save, or your results won’t be meaningful.


    Practice makes Perfect

    [tab]I hope this helps. I plan on updating this article with me information and suggestions as I come across them. But don’t wait for me, post your own tips and tricks for isolating issues with mods in this thread!
     
  2. Red Door

    Red Door Man of Mayhem

    Joined:
    May 29, 2005
    Messages:
    12,665
    Location:
    USA #1
    Wow! Nice job Kael, should help out a lot of mod-makers.
     
  3. Kael

    Kael Deity

    Joined:
    May 6, 2002
    Messages:
    17,401
    Location:
    Ohio
    Thanks, I know to some of the community this will be information they already know but I think it may be of value to new modders.
     
  4. Shqype

    Shqype Shqyptar

    Joined:
    Nov 15, 2005
    Messages:
    2,468
    Location:
    New York + Shqypni
    True, but some of us already know because you've helped us out in the past ;)
     
  5. Chalid

    Chalid Black Dragon

    Joined:
    Nov 24, 2005
    Messages:
    1,742
    Location:
    Munich, Germany
    For locating problems i usually add some code in the SDK.

    I have a log message that reports when the program enters a major part of the code and it reports back when it leaves the code.

    Eg. It reports when AI player 1 begins the turn, then it reporte when the cities are all worked, when the cities are finished, then when it starts doing units, when the units are finished, ..., and finally when the entire player has finished its turn. Then the same thing for the next player.

    This way i simply can look which action was begin but was not finished. That often gives a good clue where to look for your bug.
     
  6. Kael

    Kael Deity

    Joined:
    May 6, 2002
    Messages:
    17,401
    Location:
    Ohio
    Could you provide some example code?
     
  7. Chalid

    Chalid Black Dragon

    Joined:
    Nov 24, 2005
    Messages:
    1,742
    Location:
    Munich, Germany
    I'll post some code sniplets this evening.
     
  8. talchas

    talchas Prince

    Joined:
    Jan 3, 2006
    Messages:
    428
    This is my logging function that I use in the SDK. Just stick this at the top of the cpp file after all of the #include statements to use it. Also, if you use it in multiple files, you need to give it a different name for each file (or do it properly and give it a header file that is included, but this is simpler, especially if each file uses its own logfile).
    Code:
    void logMsg(char* format, ... )
    {
            static char buf[2048];
            _vsnprintf( buf, 2048-4, format, (char*)(&format+1) );
            gDLL->logMsg("ai.log", buf);
    }
    This would log the messages that use this function to "Documents and Settings\<username>\My Documents\My Games\<whatever your civ4 dir is called>\logs\ai.log" not "program files\Firaxis Games\Civ4\anything".

    you would call this exactly like printf - logMsg("message to log"); or logMsg("log:%d",aNumber);


    Also, instead of using a changelog like Kael's, you might want to use version control if you can deal with it. It would allow you to save all the versions of the files so you know exactly what changes you made. I personally use Subversion for the spells code I wrote for FFH.
     
  9. wotan321

    wotan321 Emperor

    Joined:
    Oct 25, 2001
    Messages:
    1,225
    Location:
    NC, USA
    Could you shed some light on what the .log files tell us? How do we use the log files to isolate errors?

    Thanks for this valuable thread, btw.
     
  10. talchas

    talchas Prince

    Joined:
    Jan 3, 2006
    Messages:
    428
    Well, its the standard printf method of debugging - you go to the area where you think the program is crashing + insert log statements like this:
    Code:
    someMethod(){
    log("1");
    <code>
    log("2");
    <code>
    log("3");
    ...
    }
    
    where <code> is a few lines of the actual code. Then, after it crashes, you can go look at the log file. If the log shows
    Code:
    1
    2
    
    or even
    Code:
    1
    2
    3
    1
    2
    3
    ...
    1
    2
    
    you know that the CTD is occurring between the log("2") statement and the log("3") statement. (this is where java-style stacktraces, with line numbers would be really nice). However, this doesn't tell you exactly what the problem is (however, its almost certainly a null reference, the trick is finding out where and why), for that you need to do more debugging. You can print out a number like this: log("my number is:%d", mynumber) and it'll print something like this: "my number is:14". To print a string use %s instead of %d (for more info google for printf). If you want to tell if a given pointer is null you do one of these two things: either log("my pointer is this:%d", myPointer), which will print out 0 if it is null and some large number otherwise, or more properly log("my pointer is null:%d", myPointer==NULL), which will print "my pointer is null:1" if it is null and "my pointer is null:0" otherwise. However, even checking for nulls may not be enough as you might access a deleted object (this should rarely happen, as you should generally set any deleted references to NULL unless you are sure you don't have to). Then the best you can do is to try something like this:
    Code:
    someMethod(){
    ...
    log("before");
    log("%d", myPointer->someVariable); //or myPointer->someMethod();
    log("after");
    ...
    }
    
    if it prints both "before" and "after" then the pointer is (probably) valid and the crash is somewhere else.

    Isn't C++ debugging w/o a debugger fun? :crazyeye:
     
  11. Olleus

    Olleus Deity

    Joined:
    Oct 30, 2005
    Messages:
    6,478
    Location:
    Beyond the Veil
    A simpler way of logging is to use gDLL->messageControlLog(char *), which prints to the MPLog. However, this only accepts char*and not wchar* which is what civ uses most of the time and converting one to the other can lead to some problems.
     
  12. Jeckel

    Jeckel Great Reverend

    Joined:
    Nov 16, 2005
    Messages:
    1,637
    Location:
    Peoria, IL
    Instead of printing values and checks to the log, I have then print on screen in the game and to the ingame event log. That way you can see it right there without having to alt-tab and open the log file.
     
  13. talchas

    talchas Prince

    Joined:
    Jan 3, 2006
    Messages:
    428
    Only problem with that is that only so many messages will show up, and if one shows up, but then the program crashes immediately afterward, you may not see the message. Also, you can always run Civ in windowed mode, and have the file open in the background.
     
  14. Houman

    Houman TR Team Leader

    Joined:
    Jan 24, 2002
    Messages:
    1,083
    Thanks Kael for this wonderful thread.
    And Many thanks talchas for your logging function. This is exactly where we at Total Realism are stuck. It is really hard to debug the SDK.

    Regarding your code
    void logMsg(char* format, ... )

    What should the ... mean? More parameters?

    Many thanks
    Houman
     
  15. talchas

    talchas Prince

    Joined:
    Jan 3, 2006
    Messages:
    428
    That is one of the ways to define a variable argument list in C/C++ - I copied it from somewhere else in the SDK.
     
  16. Kael

    Kael Deity

    Joined:
    May 6, 2002
    Messages:
    17,401
    Location:
    Ohio
    Np, if you get some time you should write a how-to on building an automated installed for a mod. Yours is really cool since it creates an icon on the desktop, adds an entry to add/remove programs and allows the player to select options during the install. I would love to know how to do all that.
     
  17. Houman

    Houman TR Team Leader

    Joined:
    Jan 24, 2002
    Messages:
    1,083
    Thanks Kael,

    Yeah definitely its worth an artikel. I will publish it as soon as I fix these CTDs and prepare the Warlord release. Right now the community is close to throwing rotten eggs at us. :)
     
  18. Houman

    Houman TR Team Leader

    Joined:
    Jan 24, 2002
    Messages:
    1,083
    Gys,

    I have problems: Some of the files are fine with the debugger function, others say:

    cannot convert 'this' pointer from 'const Unit' to 'Unit &

    I.e.:
    Code:
    bool CvUnit::isNukeVictim(const CvPlot* pPlot, TeamTypes eTeam) const
    {
    	CvPlot* pLoopPlot;
    	int iDX, iDY;
    
    	// Dale - MAD: don't declare war on self
    					logMsg("CvUnit.cpp - 16");
    	if(getTeam() == eTeam)
    	{
    		return false;
    	}
    .
    .
    .
    
    I have already declared the method in each header file:
    void logMsg(char* format, ... );


    WHat shall I do please?
     
  19. talchas

    talchas Prince

    Joined:
    Jan 3, 2006
    Messages:
    428
    Try making that const void logMsg...
     
  20. Houman

    Houman TR Team Leader

    Joined:
    Jan 24, 2002
    Messages:
    1,083

    Ok by making the function itself to const void logMsg(...)

    how do I call a const function from code?

    Because now it says:
    'CvUnit::logMsg' : cannot convert 'this' pointer from 'const CvUnit' to 'CvUnit &'

    Thanks
    Houman
     

Share This Page