[TUTORIAL] Avoid creating bugs from the exe calling the DLL

Nightinggale

The setup

The game consists of 3 parts: exe, dll and python. Each of those can call functions in the other two, either directly or indirectly.

The problem is not making such calls yourself, but modded code that is called from somewhere else. The biggest issue is modifying a dll function that is called from the exe. If such a call goes wrong, the game can crash seemingly at random or behave in other really weird ways.

How the exe calls the dll

There are two approaches. One is that the exe has access to anything using the DllExport keyword. This will allow the exe to call using standard C++ calls. The other is calling virtual functions even if they do not have the DllExport keyword.

Very simplified, a function call in C++ (and other compiled languages because it's really about how the CPU works) is done by placing the arguments in registers. A register is a tiny memory inside the CPU and it can hold a single int (32 bit). For simplicity let's say arguments are placed in registers numbered from 1 and forget that it can use the stack too.

Say the function takes one argument. The caller places the argument in register 1. The called function will then assume the variable to be present in register 1. If there is a second argument, the same thing happens with register 2 and so on.

The problem

If a modder changes what an argument is or adds an argument, nobody will tell the exe. If the dll assumes 2 arguments and the exe only places a value in register 1, the dll will read the second argument as whatever was left in register 2 from the last time it was used. This introduces corrupted data, and then all sorts of weird stuff can happen.
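The mismatch can be played through with a toy model. This is only a sketch of the simplified register picture used above, not real calling-convention code: the "registers" are a plain array, and all function names are made up for illustration.

```cpp
#include <array>

// Toy model only: a handful of "registers" shared by caller and callee.
// Index 0 is unused so the numbering matches the text (register 1, 2, ...).
static std::array<int, 4> g_registers = {0, 0, 0, 0};

// The "dll" side after a modder added a second parameter:
// it now reads both register 1 and register 2.
int moddedDllFunction()
{
    int iFirst = g_registers[1];
    int iSecond = g_registers[2]; // the old caller never writes this one
    return iFirst + iSecond;
}

// The "exe" side still believes the function takes one argument,
// so it only fills register 1 before making the call.
int exeCallsWithOneArgument(int iValue)
{
    g_registers[1] = iValue;
    return moddedDllFunction();
}

// Some unrelated earlier call that leaves a value behind in register 2.
void unrelatedEarlierCall()
{
    g_registers[2] = 1234;
}
```

Calling unrelatedEarlierCall() and then exeCallsWithOneArgument(5) gives 1239 instead of 5: the leftover value is silently used as the second argument.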

Default values will not help because they don't do anything at runtime. If argument 2 defaults to 0, that is a message to the compiler to make the caller place 0 in register 2, while the function itself does not change. Since the exe does not know about the default, it will not place the value, so register 2 will still contain corrupted data.
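That defaults live at the call site can be shown in plain C++. The sketch below uses a made-up function, scaleValue, and two callers that see different declarations of it. Each caller gets the default from the declaration it was compiled against, while the function body stays the same.

```cpp
// Declaration as a header might present it (hypothetical function).
int scaleValue(int iValue, int iFactor = 2);

// A caller compiled against the declaration above: the compiler turns
// scaleValue(3) into scaleValue(3, 2) at this call site.
int callerUsingHeaderDefault()
{
    return scaleValue(3);
}

// A caller that sees a different declaration of the same function.
// The block-scope declaration carries its own default, so this call
// becomes scaleValue(3, 10) although the function body is unchanged.
int callerUsingOtherDefault()
{
    int scaleValue(int iValue, int iFactor = 10);
    return scaleValue(3);
}

// The definition contains no trace of any default value.
int scaleValue(int iValue, int iFactor)
{
    return iValue * iFactor;
}
```

The first caller returns 6 and the second returns 30. The exe is in the position of a caller that never saw the modded declaration, so it never supplies the new default.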

How to stop the exe from providing garbage arguments

You can't. We are stuck with the function calls the exe makes and we can't change the arguments. We can move the contents of a function into a new function without DllExport and then make the original call the new function. That way we have some sort of "buffer" in the dll where default values can be added.
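A minimal sketch of that "buffer" pattern follows. ExampleCity, changeFood and changeFoodInternal are hypothetical names chosen for illustration, and DllExport is defined as empty here only so the snippet compiles outside the game project.

```cpp
// Assumption: DllExport is defined as empty so this sketch compiles anywhere.
// In the real dll it is the project's own export macro.
#define DllExport

class ExampleCity // hypothetical class, not a real game class
{
public:
    ExampleCity() : m_iFood(0) {}

    // Called by the exe: the signature stays exactly as in vanilla.
    DllExport void changeFood(int iChange)
    {
        changeFoodInternal(iChange, false);
    }

    // Called only from inside the dll: free to gain arguments and defaults.
    void changeFoodInternal(int iChange, bool bDouble = false)
    {
        m_iFood += bDouble ? 2 * iChange : iChange;
    }

    int getFood() const { return m_iFood; }

private:
    int m_iFood;
};
```

Dll code calls changeFoodInternal directly and can use the new argument, while the exe keeps calling changeFood with the argument list it has always used.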

Which functions will be called from the exe

Ideally it should be any function with DllExport. However vanilla added this to too many functions and over the years modders have added it for new functions as well, apparently because "vanilla works, so copy vanilla even if I don't know what it does".

I attached a list of functions called by the two exe files.

The way to identify the virtual functions called by the exe is.... who knows? Apparently the exe can call, say, the 4th virtual function in CvPlayer, and it just assumes that function takes the vanilla arguments. No known list of used virtual functions exists.
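The "call the Nth virtual function" idea can be pictured with a toy table of function pointers. This is a simplified model, not how the compiler literally lays out a class, and the slot names are invented. It shows why a call by position breaks when a new virtual function is declared before the slot the caller uses.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model: a "vtable" is just an ordered list of function pointers.
typedef std::string (*SlotFunction)();

std::string getName()   { return "getName"; }
std::string getGold()   { return "getGold"; }
std::string doTurn()    { return "doTurn"; }
std::string moddedNew() { return "moddedNew"; }

// Slot order the caller was compiled against (hypothetical).
std::vector<SlotFunction> vanillaTable()
{
    std::vector<SlotFunction> table;
    table.push_back(getName);
    table.push_back(getGold);
    table.push_back(doTurn);
    return table;
}

// A modder declares a new virtual function before doTurn.
std::vector<SlotFunction> tableWithInsertedVirtual()
{
    std::vector<SlotFunction> table;
    table.push_back(getName);
    table.push_back(getGold);
    table.push_back(moddedNew);
    table.push_back(doTurn);
    return table;
}

// The caller only knows a position, never a name.
std::string exeCallsSlot(const std::vector<SlotFunction>& table, std::size_t slot)
{
    return table[slot]();
}
```

With the vanilla table, slot 2 is doTurn. With the modded table the same slot number silently reaches moddedNew, whatever arguments the caller thought it was passing.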

How to remove the extra DllExport keywords

You can go through the lists, but it would likely be easier to go to the source of the lists. I copy-pasted the output from Dependency Walker (DP). You can just download and use that application as well.

The easiest approach would likely be to copy the BTS (or Colonization) exe files and dll files into Assets in your mod. Obviously you are using git, meaning you can easily remove those extra files later, and there is no need to commit them as they will only bloat your repository.

Open DP and open the exe in Assets. The left menu lists the dll files it can call, and the mod dll is last in the list. Click it. The top right will now show a list of functions the exe can call. The first column is a bunch of green icons. Click the column header once to sort by that column.

Now remove all DllExport from a file and compile. Close the file in DP (file menu) and open the exe again. Select the mod dll again. You will now have red lines on top (because it's sorted). Adding DllExport will fix the red icon and make it go green again. Fix all red icons.

If you do this for all files, then you should end up knowing that you are free to do whatever you want with arguments for any function not using the DllExport keyword while those with the keyword are all hardcoded to "do not touch arguments".

DP will also be able to check if the arguments have been modded, meaning it can help catch already introduced bugs.
 

Attachments

  • BTS exe.txt
    65.7 KB · Views: 210
  • BTS pitboss.txt
    56.2 KB · Views: 158
As I wrote in the K-Mod subforum (link), I wanted to try removing all the unused DLLExports from my K-Mod-based mod by looking up every exported function in Nightinggale's lists. I've done so now, and, so far, everything still seems to work. I've started singleplayer and multiplayer games with various settings, ran some turns on AI Auto Play, saved, (quick-)loaded, changed some player options, browsed through menus and Advisor screens – all good. Only Pitboss, Steam multiplayer and LAN multiplayer I haven't tested at all. So it seems that the lists are reliable even when it comes to constructors and destructors. Oddly, I've come across one function, CyArgsList::add(wchar const *) [not: char const *] that I can't find in the lists, but that is actually called by the executable on game start. Dependency Walker does recognize the error if the DLLExport is removed. Oh well.

I don't know if I can recommend removing all the DLLExports; even if I were positive that it doesn't break anything, the procedure still took a couple of hours as there are several hundred DLLExports to remove, and the only real upside is that one doesn't have to keep Nightinggale's lists at hand when changing function signatures. (And it should make it easier to identify functions that are never called from anywhere.) In any case, since I've gone through the trouble, I thought I'd post my list of functions that don't require the DLLExport keyword.

Here's a git commit with my changes. Unfortunately, that's based on a mod that had already removed a few DLLExports beforehand, so the list isn't quite complete. Missing from the commit:
Spoiler :
Code:
CvPlayer::convert
CvSelectionGroup::autoMission
CvTeam::declareWar
CvTeam::makePeace
CvGameTextMgr::setUnitHelp
CvGameTextMgr::getTradeString
CvGameTextMgr::getDealString
And possibly one or two others that I can't remember. :(

Edit (June 2019): A few more that I had previously overlooked: Git commit

While I was at it, I also removed the 2 times 19 read/write(FDataStreamBase*) functions in CvInfos.h. My tests suggest that these functions are indeed never called – as Nightinggale had already strongly suspected in this thread. (My hypothesis is that they were originally used for the "No Cheating" option, and that the behavior was later changed so that checksums of the XML files are stored instead – perhaps in patch 1.61, which renamed the option to "Lock Modified Assets".) I suppose some modder might someday write code that wants to store the XML data in a savegame, so I've removed the dead code through a preprocessor flag SERIALIZE_CVINFOS (see CvDefines.h, CvGlobals.cpp, CvInfos.h and CvInfos.cpp in the commit).
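For readers who have not used this technique, here is a rough sketch of how a preprocessor flag can compile such functions out. Only the flag name SERIALIZE_CVINFOS and the FDataStreamBase type name come from the post. ExampleInfo and the helper function are hypothetical, and the actual commit may be organized differently.

```cpp
// Leave SERIALIZE_CVINFOS undefined to compile the serialization code out.
// #define SERIALIZE_CVINFOS

class FDataStreamBase; // only used through a pointer in this sketch

class ExampleInfo // hypothetical stand-in for a CvInfo class
{
public:
    ExampleInfo() : m_iValue(0) {}
    int getValue() const { return m_iValue; }

#ifdef SERIALIZE_CVINFOS
    // Only declared (and only needs a definition) when the flag is set.
    void read(FDataStreamBase* pStream);
    void write(FDataStreamBase* pStream);
#endif

private:
    int m_iValue;
};

// Helper for this sketch only: reports whether the flag was set at compile time.
bool isSerializationCompiledIn()
{
#ifdef SERIALIZE_CVINFOS
    return true;
#else
    return false;
#endif
}
```

A modder who later wants the read/write functions back only has to define the flag in one place instead of restoring the deleted code.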
 
Another possibility is that it is used to cache xml data. Rather than loading the xml files, the game can read them like savegames to start faster. However it's a stupid idea because it's the cause of so many issues when switching or updating mods. Also, loading xml data can be done in a few seconds. The slow part of loading a mod is loading graphics, meaning the cache is not actually speeding up the game startup time.

This can be disabled in CivilizationIV.ini (DisableCaching). However if you want your mod to disable this feature regardless of user settings, you can edit CvXMLLoadUtility::LoadGlobalClassInfo() in CvXMLLoadUtilitySet.cpp. The last argument pArgFunction is a pointer to the cache object function for the file in question. Set this to NULL and the game is forced to read the xml files each time.

Obviously if the DLL avoids using the cache entirely, then the read/write functions will be useless.
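The effect of passing NULL can be pictured with a small stand-in. This is not the real signature of CvXMLLoadUtility::LoadGlobalClassInfo(). The loader, the function pointer type and the names below are all hypothetical and only illustrate the idea that a null cache function leaves the xml path as the only option.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for "a function that can provide cached data".
typedef bool (*CacheReadFunction)(const std::string& fileRoot);

// Pretends that the cache always has the requested file.
bool pretendCacheHit(const std::string&) { return true; }

// Returns where the data came from. NULL means no cache is consulted at all.
std::string loadInfos(const std::string& fileRoot, CacheReadFunction pCacheFunction)
{
    if (pCacheFunction != NULL && pCacheFunction(fileRoot))
    {
        return "cached";
    }
    return "xml"; // fall back to parsing the xml files
}
```

loadInfos("CIV4TechInfos", pretendCacheHit) reports "cached", while loadInfos("CIV4TechInfos", NULL) always reports "xml", regardless of what any user setting says.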
 
Another possibility is that it is used to cache xml data. [...]
That's got to be it. The contents of my cache folder:
Spoiler :
Code:
catalogCiv4BeyondSword1029623623.dat       8 B
catalogCiv4BeyondSword491075293.dat       86 B
catalogCiv4BeyondSword79756253.dat       167 B
catalogCiv4BeyondSword2298406619.dat     624 B
catalogCiv4BeyondSword2775974657.dat     640 B
catalogCiv4BeyondSword1154765664.dat   1.3 KB
catalogCiv4BeyondSword2715548016.dat  16 KB
catalogCiv4BeyondSword3979112632.dat 124 KB
catalogCiv4BeyondSword1511201814.dat 130 KB
catalogCiv4BeyondSword2330217323.dat 332 KB
CIV4BonusInfos.dat                     9.8 KB
CIV4BuildingInfos.dat                 15 KB
CIV4CivicInfos.dat                    47 KB
CIV4CivilizationInfos.dat            114 KB
CIV4DiplomacyInfos.dat                93 KB
CIV4EventInfos.dat                 1.6 MB
CIV4EventTriggerInfos.dat             89 KB
CIV4HandicapInfo.dat                   5.7 KB
CIV4ImprovementInfos.dat              76 KB
CIV4LeaderHeadInfos.dat               67 KB
CIV4PromotionInfos.dat                35 KB
CIV4TechInfos.dat                     51 KB
CIV4UnitInfos.dat                     16 KB
crc.dat                               33 B
GlobalDefines.dat                     17 KB
GlobalText.dat                       865 KB
--
26 Files                           3.7 MB
... and the CvInfo classes with read/write functions:
Spoiler :

CvBonusInfo
CvBuildingInfo
CvCivicInfo
CvCivilizationInfo
CvDiplomacyInfo and its component CvDiplomacyResponse
CvEventInfo
CvEventTriggerInfo
CvHandicapInfo
CvImprovementInfo and its component CvImprovementBonusInfo
CvLeaderHeadInfo
CvPromotionInfo
CvTechInfo
CvUnitInfo
CvDiplomacyTextInfo and its component Response
the two base classes CvInfoBase and CvHotKeyInfo
Apart from CvDiplomacyTextInfo, it's a 100% match. Whereas e.g. WorldInfo and ReligionInfo are absent from both lists.

The strange thing is, though, that I've always had caching enabled, and my mod still never stepped into those read/write functions and never created those ...Infos.dat files. In fact, I can't get any mod to create those files, and they all say "Init XML (uncached)" during initialization. Only the unmodded game creates them. That's probably just me?
Edit: Technical details moved into spoiler tags:
Spoiler :
But I've tried all combinations of the caching and file system caching options, it's the proper INI file (e.g. the Fullscreen switch works) and the DisableCaching switch does what it's supposed to do for the unmodded game.

With a mod, regardless of the INI settings, I do get e.g. "Wrote Technologies to cache" in xml.log, and in my own mod, I can see that the pCache object is created (with szFileRoot=CIV4TechInfos), and gDLL->cacheWrite gets called and returns true. An FAssert(GC.getDLLIFace()->ChangeINIKeyValue("CONFIG","DisableCaching","0")) before the gDLL->cacheWrite succeeds, but doesn't make a difference (however, toggling FullScreen this way works). gDLL->cacheRead always returns false.
Perhaps I'll ask about this in the Quick Modding Questions thread – if anyone has ever seen "Init XML (cached)" when loading a mod. I'm not so sure that the XML cache works at all for mods.
 
I once loaded a mod where units were severely broken to the point where it was unplayable. Editing the ini file to disable the cache fixed the problem. Apparently it does cache in some cases and will read the cache in some cases. That's not the same as it working as intended, and it's certainly possible that the vanilla implementation is buggy for mods. Personally I always disable the cache and haven't experimented. The only issue is when the game crashes at startup and resets the ini file without telling me.

There is a command line argument to use a different ini file. It kind of works, but I haven't managed to get it to work reliably. It's possible that the whole ini setup could be a bit unstable unless you use the vanilla code/settings.
 
My inquiry in Quick Modding Questions hasn't yielded anything so far, but, either way, you're right that:
Apparently it does cache in some cases and will read the cache in some cases.
And I agree that it's best to disable it. Thanks for all the info!
 
Is it safe to add virtual functions to the end of the list? That seems like it should not interfere with "call Xth virtual function of class Y" calls from the exe.
 
In theory yes, but it's untested. I would however do this instead:

Code:
inline CvPlayerAI* AI() {return (CvPlayerAI*)this;}
inline const CvPlayerAI* AI() const {return (const CvPlayerAI*)this;}

Before the class, announce the existence of CvPlayerAI with the following line:
Code:
class CvPlayerAI;

Add this to CvPlayer and you will gain access to the AI functions by adding an AI()-> prefix to the call. It works because all instances of CvPlayer are actually CvPlayerAI, and as such it should be safe for our usage. It's NOT safe in general in C++ because it assumes the instance to be of type CvPlayerAI without checking, and it crashes if it isn't an instance of CvPlayerAI. There are C++ typecasts to get around this, but they will not work in this case because it would mean CvPlayer.h should include CvPlayerAI.h while CvPlayerAI.h has to include CvPlayer.h. Instead it uses a C style typecast, where the compiler doesn't actually know that CvPlayer and CvPlayerAI have anything to do with each other.

This should work for all AI classes.
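Here is a self-contained sketch of the same idea with made-up classes, ExamplePlayer and ExamplePlayerAI, in place of the game's own. It relies on the same assumption as above: every object reached through AI() really is an instance of the derived class.

```cpp
class ExamplePlayerAI; // forward declaration: the full class is not known yet

class ExamplePlayer
{
public:
    ExamplePlayer() : m_iGold(0) {}
    void setGold(int iGold) { m_iGold = iGold; }
    int getGold() const { return m_iGold; }

    // C style casts: accepted even though ExamplePlayerAI is incomplete here.
    inline ExamplePlayerAI* AI() { return (ExamplePlayerAI*)this; }
    inline const ExamplePlayerAI* AI() const { return (const ExamplePlayerAI*)this; }

private:
    int m_iGold;
};

// The derived class adds only functions, no data of its own.
class ExamplePlayerAI : public ExamplePlayer
{
public:
    int AI_goldTarget() const { return 2 * getGold(); }
};

// Code that only holds a base-class reference can still reach the AI function.
int goldTargetOf(const ExamplePlayer& kPlayer)
{
    return kPlayer.AI()->AI_goldTarget();
}
```

Creating an ExamplePlayerAI with 5 gold and passing it to goldTargetOf() returns 10. Passing a plain ExamplePlayer would compile just as well, which is exactly the unchecked part that makes this unsafe in general.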
 