DLL coding thread

After a lot of hard work, I think I'm more or less done redesigning the makefile to handle multiple subdirs and having the project files in a different directory. I wrote it to be generic enough to add more subdirs without actually changing the makefile itself. It is once again intended to work on all mods and have no M:C specific code anymore.

I ran into problems with the maximum length of each line (a funny Windows limitation), and for some reason it only affects fastdep. I wrote 3 ways to use fastdep:
  • 0: All files at once (one execution)
  • 1: All files in a directory at once (one execution for each directory)
  • 2: One file at a time (one execution for each file)
The major performance cost here is starting fastdep. This means the lower the setting, the faster the makefile starts, but the higher the setting, the more files it can handle. It's currently set to 1.
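To make the trade-off between the three settings concrete, here is a minimal C++ sketch of how a file list could be split into fastdep runs. It is not the makefile code; the names FastdepMode, directoryOf and makeBatches are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical names for the settings 0, 1 and 2 described above.
enum FastdepMode { ALL_AT_ONCE = 0, PER_DIRECTORY = 1, PER_FILE = 2 };

typedef std::vector<std::string> FileList;

// Returns the directory part of a path, or "" if the path has none.
std::string directoryOf(const std::string& path)
{
    const std::string::size_type slash = path.find_last_of("/\\");
    return slash == std::string::npos ? std::string() : path.substr(0, slash);
}

// Splits the file list into batches. Each batch stands for one fastdep
// execution, so fewer batches mean fewer process starts but longer lines.
std::vector<FileList> makeBatches(const FileList& files, FastdepMode mode)
{
    std::vector<FileList> batches;
    if (mode == ALL_AT_ONCE) {
        if (!files.empty()) batches.push_back(files);
    } else if (mode == PER_FILE) {
        for (std::size_t i = 0; i < files.size(); ++i)
            batches.push_back(FileList(1, files[i]));
    } else {
        // PER_DIRECTORY: group the files by the directory they live in.
        std::map<std::string, FileList> byDirectory;
        for (std::size_t i = 0; i < files.size(); ++i)
            byDirectory[directoryOf(files[i])].push_back(files[i]);
        std::map<std::string, FileList>::const_iterator it;
        for (it = byDirectory.begin(); it != byDirectory.end(); ++it)
            batches.push_back(it->second);
    }
    return batches;
}
```

With two files in one directory and one file in another, the three settings give 1, 2 and 3 runs respectively.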


We shouldn't add more files to DLLsources. The current number of cpp files is near the line length limit, and going over it means the makefile starts up slower. We should consider using subdirectories more.
 
I made a design change in the Makefile. Instead of having a lot of magic going on when you start the file, the list of cpp files and the fastdep output are now made by calling certain targets in the Makefile. As a result, the line length limitation is avoided, and the idea was that this could be called once instead of every time nmake/jom is started. However it's even better than that, because as it turns out, the system with the line length limitation is rather slow. As a result, getting to the point where it writes _precompile.cpp is faster even though nmake is called 3 times instead of once.

Apart from a possibly tiny issue with fastdep, I think my makefile rewrite is done. It feels more complete now and with the ability to compile cpp files in multiple directories, I can clean up the data directory.

EDIT:
I figured out how precompiled headers are included. It was set to trigger on "CvGameCoreDLL.h" regardless of path to this file. I added a configurable path prefix (default ../) for each subdir and now the code can compile using non-broken C++ code. Before it actually had to include CvGameCoreDLL.h from the subdir even though it wasn't there, which resulted in red lines and problems finishing functions and stuff like that. Now it behaves as intended.

Using the correct location for header includes also fixes the issue fastdep had with this file. No wonder it was confused about including a file, which wasn't there.
 
I wrote a perl script, which modifies the project files to allow starting the game from the project itself. F5 with debugger and control+F5 without debugger.

Since it gathers the info anyway, it also updates Makefile.settings to add the path to boost and there is no longer any need to enter this info manually.

Usage guide:
  1. Install strawberry perl (if you haven't done that already)
  2. Close the project if it is open
  3. Unrar and place SetupPaths.pl next to the project files
  4. double click on SetupPaths.pl
  5. wait for the script to finish (might take a while depending on how quickly it locates the exe)
That's it and it sets up all targets, not just debugging. Naturally it makes no sense to attach the debugger to non-debug builds, but it would still make sense to start from the project anyway as it activates the makefile before starting the game, meaning you are sure the DLL is completely up to date.

One nice thing about this is that everything ends up in .vcxproj.user and Makefile.settings. Those two files aren't part of git because they are intended for personal use. This means we can't commit our local paths even by accident, as both are in gitignore.

How it works:
The script identifies the mod name by locating MODS in the path to the script itself and then reading the next part of the path.
The exe needs to be located too, and that is a bit more tricky. It starts by looking next to MODS in the path. If the mod isn't in My Documents, this would be the place to look. It checks for both Colonization.exe and Civ4BeyondSword.exe at this location. This means it should work on BTS mods as well (including FTTW).

If it fails to detect any valid exe at this location, it asks Windows for C:\Program Files (x86). Since it asks Windows, it should work on non-English as well as 32 bit systems, but I haven't tested that. In this directory, it loops through all directories recursively to locate Colonization.exe and hooks up to that one. This could take a while depending on how much you have in Program Files as well as how late in the alphabet your install path is. The only "progress bar" I could come up with is printing the directory it is currently looking at. That way it is at least possible to see that it didn't stall.
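The mod-name step can be illustrated in a few lines. The real script is Perl; this is only a C++ sketch of the same idea, and the function name modNameFromPath is hypothetical. It assumes the directory is spelled exactly "MODS" and accepts both slash styles.

```cpp
#include <string>

// Returns the path component that follows "MODS", or "" if there is none.
// Assumption: the match on "MODS" is case-sensitive in this sketch.
std::string modNameFromPath(const std::string& path)
{
    std::string::size_type start = 0;
    bool previousWasMods = false;
    while (start <= path.size()) {
        // Find the end of the current path component.
        std::string::size_type end = path.find_first_of("/\\", start);
        if (end == std::string::npos) end = path.size();
        const std::string part = path.substr(start, end - start);
        if (previousWasMods && !part.empty()) return part;
        previousWasMods = (part == "MODS");
        start = end + 1;
    }
    return std::string();
}
```

For a script placed in a hypothetical C:\Games\Colonization\MODS\MyMod\Project Files, this returns "MyMod".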

Known issues:
  • Running the script multiple times will add the boost path multiple times (the makefile still works though as each line overwrites the result of the previous)
  • MSVC complains that it lacks the symbols for Colonization.exe. We obviously don't have those and you should tell it to ignore this issue.
  • .vcxproj.user is overwritten. To my knowledge we don't put anything useful in that file besides the script output, but it would still be better if it could add settings instead of overwriting everything.
I will add the script to git eventually. At the moment we don't have two people working on the same branch, which is why I ended up posting it here instead. Also it is poorly tested and feedback is welcome.
 


So, if I understand it correctly, this perl script should work perfectly with Civ IV BtS too, on a 64 bit system with Visual Studio used as a debugger
 
It's untested, but yes it should and it's the intended goal. It's made for Visual Studio and the format for the project files is the same regardless of which game you mod for ;)
 
I have written something about hardcoding XML data into the DLL somewhere. Since I can't find it right now, I have decided to add it here as I will look here when I search for it in the future ;)

I have been thinking more about this and have come up with a number of steps. To some degree each step can be used independently and I will write a list here. We can then later decide what we want and what we don't. None of this will be in the next release as it is long term plans.

My take on this is that we should implement all of it. The only step with arguments against it is the actual hardcoding of XML data, but that is only for an "optimized release", and it should tell people when they should use the included assert DLL instead. If people change XML files, they should use the assert DLL anyway.

XML enums
A perl script generates a header file where the types from XML are defined. The specific types are defined inside #ifdef HARDCODE_XML_TYPES. HARDCODE_XML_TYPES is then defined by the project, but not by the makefile. This means the debugger will show the types, but the compiler will give an error if something is hardcoded into the code itself.

Adding this file to sourceMOD will allow people to compile even though they haven't installed perl. There will be an error while compiling, but the build commands will carry on and produce a DLL anyway. Since it's a matter for the debugger and nothing at runtime is changed, it doesn't matter.

Bonus: all types defined in the debugger and nothing is hardcoded.
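A minimal sketch of what such a generated header could look like. The entries UNIT_COLONIST and UNIT_SOLDIER and the helper isValidUnit are hypothetical; only HARDCODE_XML_TYPES and the idea of guarding the XML-derived entries come from the description above.

```cpp
enum UnitTypes
{
    NO_UNIT = -1,
#ifdef HARDCODE_XML_TYPES
    // Hypothetical entries, generated from XML. They only exist when
    // HARDCODE_XML_TYPES is defined (by the project, not by the makefile).
    UNIT_COLONIST,
    UNIT_SOLDIER,
#endif
};

// Compiles in both configurations because it only relies on NO_UNIT.
// A line such as "UnitTypes eUnit = UNIT_COLONIST;" would only compile
// when HARDCODE_XML_TYPES is defined, which is how accidental hardcoding
// gets caught by a build that leaves it undefined.
inline bool isValidUnit(UnitTypes eUnit, int iNumUnits)
{
    return eUnit > NO_UNIT && eUnit < iNumUnits;
}
```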

Fast size for info arrays
Right now the code uses GC.getNum*Infos() and what it does at runtime is this:

  1. Fetch the pointer to CvGlobals
  2. Use that pointer to call a function
  3. This function calls a member function of std::vector
  4. The vector uses some code to tell the length
It works, but it is a whole lot of memory I/O in order to get the answer and this is done each time the loop condition is checked.

Solution:
Define an int somewhere (like the XML reading file) and then add it to a header with the extern keyword. This will make a global variable. If you look up global variables, people advise against them. However the problem with them is that if one thread writes to it and another reads, something undefined happens. We will not encounter this as we will not change the value once set (also I think they are only used by one thread, but the graphics engine might read this too using a different thread, though it doesn't matter if it does or not).

Bonus:
Getting the length of an array is reading a single int. Reading this repeatedly will place it in the CPU cache, making it much faster to handle the loops.
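A minimal sketch of the extern int idea, squeezed into one file for illustration. The names NUM_UNIT_INFOS, setNumUnitInfos and sumOfIndexes are hypothetical.

```cpp
// Would live in a shared header: declares the global without defining it.
extern int NUM_UNIT_INFOS;

// Would live in exactly one .cpp file, for instance the XML loading code.
int NUM_UNIT_INFOS = 0;

// Called once after the XML file has been read. The value is never
// changed afterwards, which is what keeps the global safe to read.
void setNumUnitInfos(int iCount)
{
    NUM_UNIT_INFOS = iCount;
}

// The loop condition now reads a single int instead of going through
// CvGlobals, a function call and std::vector.
int sumOfIndexes()
{
    int iSum = 0;
    for (int i = 0; i < NUM_UNIT_INFOS; ++i)
    {
        iSum += i;
    }
    return iSum;
}
```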

Hardcoding XML values
Requires all previous ideas to be implemented first.

HARDCODE_XML_TYPES is set by the makefile for certain targets (possibly new targets rather than changing existing ones). The header file with the extern int lengths should have an #ifdef #else setup where instead of ints, the same variable names are defined as an enum.

Bonus:
Looping is much faster because no variable is read at runtime. Instead the value is hardcoded at compile time and the loop can possibly be optimized further because the compiler knows the value. The result is also faster code.

Since this depends on compiler settings, it will be possible to only use this for really optimized DLL files. Also it should contain an error popup telling people to use the assert DLL if the XML doesn't match.

Note: unlike before, this time the perl script will be able to affect the resulting DLL file. Building an optimized DLL like this will require perl.
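A minimal sketch of the #ifdef/#else setup, assuming a hypothetical length called NUM_YIELD_INFOS with a generated value of 3. The setter doubles as the mismatch check: in the hardcoded build it only compares, so the caller can show the popup.

```cpp
#ifdef HARDCODE_XML_TYPES
// Hypothetical generated value, known to the compiler.
enum { NUM_YIELD_INFOS = 3 };

// Nothing to store. Returns false on a mismatch so the caller can tell
// people to use the assert DLL.
inline bool setNumYieldInfos(int iCountFromXml)
{
    return iCountFromXml == NUM_YIELD_INFOS;
}
#else
extern int NUM_YIELD_INFOS;   // declaration, as it would appear in the header
int NUM_YIELD_INFOS = 0;      // definition, belongs in exactly one .cpp file

inline bool setNumYieldInfos(int iCountFromXml)
{
    NUM_YIELD_INFOS = iCountFromXml;
    return true;
}
#endif

// Identical source in both builds. Only the hardcoded build lets the
// compiler see the loop bound.
inline int countYieldLoopIterations()
{
    int iCount = 0;
    for (int i = 0; i < NUM_YIELD_INFOS; ++i)
    {
        ++iCount;
    }
    return iCount;
}
```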

Linear storage
Requires Fast size for info arrays

Instead of storing a vector of pointers, store each XML file in an array where the actual class instances are stored in the array (not pointers to the class instances). This will store the data in a linear fashion, which makes looping predictable to the hardware. If a class uses 40 bytes and we loop over it to read an int, the hardware will detect that the code reads an address, then address +40, then +40 and so on. This will cause it to fetch data from memory and move it to the CPU cache before it is requested because it will likely be requested in the near future.

Bonus:
Memory latency is the biggest performance killer on modern computers. This solution is aimed at "hiding" latency and hence making the game faster.

Note on implementation: read XML data into temp vector using existing code, then copy into newly allocated array. That requires minimal recoding. Info classes should be reviewed to not copy pointers and then free those pointers as this would cause crashes with the copy.
It should likely be something like a wrapper class for the read function, which will make it easy to implement this for a single XML file at a time.
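A minimal sketch of the copy step, assuming a hypothetical plain-data class YieldInfo and using std::vector<YieldInfo> as the contiguous array. A real info class that owns pointers would need a proper copy before this is safe, as noted above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an info class. Plain data only, so a
// member-wise copy cannot leave dangling pointers behind.
struct YieldInfo
{
    int iPrice;
    int aiOtherData[9];
};

// Copies the objects the existing loader produced behind pointers into
// one contiguous block of class instances.
std::vector<YieldInfo> copyToLinear(const std::vector<YieldInfo*>& loaded)
{
    std::vector<YieldInfo> linear;
    linear.reserve(loaded.size());
    for (std::size_t i = 0; i < loaded.size(); ++i)
    {
        linear.push_back(*loaded[i]);
    }
    return linear;
}

// Walks memory with a fixed stride of sizeof(YieldInfo), which is the
// access pattern hardware prefetchers can follow.
int sumPrices(const std::vector<YieldInfo>& linear)
{
    int iSum = 0;
    for (std::size_t i = 0; i < linear.size(); ++i)
    {
        iSum += linear[i].iPrice;
    }
    return iSum;
}
```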
 
I just had a new idea about running such a perl script. It should generate the yield enum. It's already hardcoded, but it can be done automatically. However just updating automatically isn't the main topic of this idea. We add bools to XML, like bIsHammers and such, telling which yield has which of the features hardcoded in the DLL. Col2071 broke at some point because it renamed reserved names. This approach can be used to avoid that.

This way the order in XML doesn't matter and the names don't matter. Even better, we can add something like bIsPolution (I think col2071 needs that one) and if no yield in XML has this bool set, then whatever it is supposed to do in the DLL will not be compiled, making an optimized DLL for whatever yield we can come up with.

Naturally the yieldgroups can be generated by the script as well, just by adding bYieldGroupSomething.
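To show what the generator would have to do, here is a C++ sketch of the header generation (the actual plan is a perl script). The struct YieldXml, the function generateYieldHeader and the emitted names YIELD_ROLE_HAMMERS and USE_YIELD_POLUTION are hypothetical; the tag names bIsHammers and bIsPolution are spelled as in the post.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: one yield as read from XML.
struct YieldXml
{
    std::string type;   // for example "YIELD_FOOD"
    bool bIsHammers;
    bool bIsPolution;
};

// Builds the text of a header: the enum follows XML order, and the
// bools decide which role defines are emitted.
std::string generateYieldHeader(const std::vector<YieldXml>& yields)
{
    std::ostringstream out;
    out << "enum YieldTypes\n{\n\tNO_YIELD = -1,\n";
    for (std::size_t i = 0; i < yields.size(); ++i)
    {
        out << "\t" << yields[i].type << ",\n";
    }
    out << "\tNUM_YIELD_TYPES\n};\n";
    for (std::size_t i = 0; i < yields.size(); ++i)
    {
        if (yields[i].bIsHammers)
        {
            // The DLL refers to the role, not to the XML name.
            out << "#define YIELD_ROLE_HAMMERS " << yields[i].type << "\n";
        }
        if (yields[i].bIsPolution)
        {
            // Only emitted if some yield sets the bool, so the related
            // DLL code can be compiled out otherwise.
            out << "#define USE_YIELD_POLUTION\n";
        }
    }
    return out.str();
}
```

Because the DLL would only use the role define, renaming or reordering the yield in XML would not break it.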

The more I think about it, the more I like adding a perl script to the compiler. It solves so many of the issues we are having where XML and DLL have to match, and I really prefer a setup where the problems are fixed automatically rather than just the current assert if it goes wrong.
 
I have done some thinking on such a perl script and the issue of having to recompile when XML is modded.

I have come to the conclusion that a good tradeoff between XML modder friendliness, performance and DLL modder friendliness is to hardcode some XML files (like now), but to be picky about which files to hardcode. The ability to make a super optimized release DLL could still be possible. Naturally all hardcoded values should be checked against the XML values at startup and cause an error if there is a mismatch. The script can easily generate C++ code for such a check.
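A minimal sketch of what a generated startup check could look like, assuming a hypothetical domain enum with two entries. The table GENERATED_DOMAIN_NAMES and the function firstDomainMismatch are made-up names; the idea is only that the generated names are compared against what XML delivers at startup.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generated enum and the matching table of type strings.
enum DomainTypes { NO_DOMAIN = -1, DOMAIN_SEA, DOMAIN_LAND, NUM_DOMAIN_TYPES };

static const char* const GENERATED_DOMAIN_NAMES[NUM_DOMAIN_TYPES] =
{
    "DOMAIN_SEA",
    "DOMAIN_LAND"
};

// Returns -1 if the XML types match the hardcoded enum exactly,
// otherwise the index of the first entry that differs.
int firstDomainMismatch(const std::vector<std::string>& xmlTypes)
{
    const std::size_t count = static_cast<std::size_t>(NUM_DOMAIN_TYPES);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (i >= xmlTypes.size() || xmlTypes[i] != GENERATED_DOMAIN_NAMES[i])
        {
            return static_cast<int>(i);
        }
    }
    // XML has more entries than the DLL was compiled with.
    return xmlTypes.size() > count ? static_cast<int>(count) : -1;
}
```

Any return value other than -1 would be the trigger for the error at startup.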

I think hardcoding the following files would be good for performance and DLL modding, while at the same time would have minimal XML modder impact:
  • Domains
  • Yields
  • Father point types (particularly FATHER_POINT_REAL_TRADE would be a performance boost)
  • UnitAIInfos
  • City types (currently not even an XML file)
  • Possibly more here over time when we spot something in the DLL
I think those are changed rarely enough to not really cause problems for XML modding while at the same time they greatly impact performance. In fact FPs are currently the only one on the list not already hardcoded, which means the real impact of the script is to update the DLL according to XML values each time the DLL is compiled instead of doing it manually.

The script can be made to check for certain types and then set defines based on those. For instance DOMAIN_SPACE can trigger #define USE_DOMAIN_SPACE, which in turn will enable the compiler to compile the code designed specifically for this domain.
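A minimal sketch of how such a define could gate domain-specific code. USE_DOMAIN_SPACE and DOMAIN_SPACE are from the paragraph above; the function canLeaveAtmosphere and the other enum entries are hypothetical.

```cpp
// The script would emit the following line when XML contains DOMAIN_SPACE:
// #define USE_DOMAIN_SPACE

enum DomainTypes
{
    NO_DOMAIN = -1,
    DOMAIN_SEA,
    DOMAIN_LAND,
#ifdef USE_DOMAIN_SPACE
    DOMAIN_SPACE,
#endif
    NUM_DOMAIN_TYPES
};

// Hypothetical feature check that only costs anything in mods which
// actually have the space domain.
inline bool canLeaveAtmosphere(DomainTypes eDomain)
{
#ifdef USE_DOMAIN_SPACE
    return eDomain == DOMAIN_SPACE;
#else
    (void)eDomain;   // no space domain in this mod, nothing to check
    return false;
#endif
}
```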

I think the script should generate the enums used for hardcoding. However I have come up with a little trick, which will allow the debugger to use the enums, but not the compiler. This means the debugger will print the type for, say, UnitTypes rather than just a number, which will make debugging easier. Since the compiler will not have access to the full enum, accidental hardcoding will result in an error while compiling.

I think we can all agree on the script, if this is how it works. I have a hard time finding bad stuff about it, apart from the need to install perl, which in itself shouldn't really be an issue.
 
That sounds like a good compromise - IMO it is worth it to have a quality optimized DLL, so the need for a Perl script and recompiling when changing some major features isn't a big issue if this gives a worthwhile performance boost in the final mod. :borg:

Domains
These shouldn't be changed frequently by modders, so it makes sense to set them at compile time. 2071 would eventually need to add a Space Domain like in Androrc's mod, but this could wait since I know there is lots more urgent stuff to work on :hammer: I would imagine World History and other mods also could eventually use DOMAIN_AIR imported from Civ4BtS, which effectively enables units to "fly" across multiple domains.

Yields
Will be different for each mod, but shouldn't change frequently during modding so are ok to set up at compile time. There is so much involved in adding new Yields (gamefonts, etc etc) that I'd actually be fine with starting from a standard number of Yields by base M:C as a functional maximum for modmods. (IMO one of the lessons learned from RaR is that there are diminishing returns to continuing to add more and more Yield types, and past a certain point adding more can start to detract from gameplay and the AI). I know there were some AI Yield Groups in the current DLL setup (ie Food, Raw Material, Export and Equipment yields etc), do you think this approach should be used in future? I'm fine with doing anything that you think might help the AI with Yield planning; I'm not sure the current AI is really able to deal with Yields optimally through the existing system yet though.:crazyeye:

Fatherpoints
I'm less sure that FatherPoints should be hardcoded, since these will each be different by mod, and everything about them is essentially handled by XML. (If I understand the new CivEffects / Perks correctly, Perks will become a class of Civeffect that can be made purchasable by whatever types of FatherPoints a modder chooses, rather than centering around FATHER_POINT_REAL_TRADE). But FatherPointInfos shouldn't be changed frequently for each mod, so it's fine to have them fixed at compile time if this helps code efficiency. :science:

UnitAIInfos
I think UnitAI behaviors are all programmed in the DLL so it makes sense to specify them there. In old 2071 I made it possible for some Ship units to harvest Asteroid Belt Features and build space Improvements which worked well for human players, but I realized that AI player ships having vanilla Worker AI were unable to clear Features and build Improvements appropriately; I think this was just because the vanilla UnitAI couldn't handle them having a non-Land Domain, so hopefully this could be possible to solve with new Domains. :science: To help the AI (and human automated units) cope with the new Animal Hunting system and the Gathering system, I suppose there could be one standard UnitAI set up for any "Hunting" profession and for any "Gathering" profession, as long as it could make its decisions based on what was set up in XML.

CityTypes
Also makes sense to have a limited number set up at compile time. I'm not sure but I think I may largely minimize use of different CityTypes when modding, it seems potentially problematic for players and the AI to cope well with. However, if possible it could be interesting to have an Undersea CityType for oceandwellers, and a Starbase CityType located in Deep Space tiles. I hope it should work fine as long as the AI can cope with absence of MC-specific things like Castle/Monastery, and work with a few basic citytypes using BuildingClass availability set by XML.:nuke:

The option to have a 2-plot city radius rather than 1-plot city radius is one thing that could be useful to have available for mods; I think this worked really well when tried in RaR as well as Civ4BtS. However I imagine this might be very challenging to adapt the AI to; and if implemented would make sense as a global flag set once per mod at compile time.
 
I'd actually be fine with starting from a standard number of Yields by base M:C as a functional maximum for modmods.
It's not a max, it's the number of yields the mod has to have, no more and no less.

I know there were some AI Yield Groups in the current DLL setup (ie Food, Raw Material, Export and Equipment yields etc), do you think this approach should be used in future?
I was planning on removing those to avoid hardcoding. However using a perl script, it can instead be made into a bunch of bools in XML, which the perl script reads to generate the groups. If they are that easy to update, then sure we could expand on this concept and it would likely be a good idea.

I will still use the runtime calculation of military yields. After all the required yields will change according to owned CivEffects for humans and it would make sense if the AI can adapt to different tech levels as well.

I'm less sure that FatherPoints should be hardcoded, since these will each be different by mod, and everything about them is essentially handled by XML.
That's kind of the point with a script. The enums would be different for each mod, hence the need to hardcode different values. If it was just about FPs, then it wouldn't be a good idea. However since we need to hardcode some other values anyway, we might as well improve on the trade system.

I think UnitAI behaviors are all programmed in the DLL so it makes sense to specify them there.
Yeah, they are hardcoded for this very reason. The issue with this is that both the enum and XML translate a string into an int. If something is changed and a single string translates into two different ints, then everything will go wrong. By updating the DLL to match the XML settings each time it compiles, they would have a harder time going out of sync ;)

One huge advantage of this is that if M:C adds a new type to DLL and XML, and another mod updates the DLL code without noticing, the compile will fail, saying that UNITAI_AMBUSH is undefined.

The option to have a 2-plot city radius rather than 1-plot city radius is one thing that could be useful to have available for mods; I think this worked really well when tried in RaR as well as Civ4BtS. However I imagine this might be very challenging to adapt the AI to; and if implemented would make sense as a global flag set once per mod at compile time.
I did something like this in RaRE. If you define a certain flag in Makefile.settings and recompile, then it will switch back to 1 plot radius. My plan was a runtime switch, but looking at the code, I made it compile time for performance reasons.

I don't really like 2 plot radius because it makes the AI even worse. The AI pioneers have code which assumes 1 plot radius, which makes them even worse at figuring out what to do with the plots (they are bad to begin with).
 
I just had an idea. If we add the ability to NOT get a new random seed when requested, we can load a savegame, make it autoplay 100 turns, load the savegame and autoplay 100 turns and it will do 100% the same thing.

This kind of consistency is useful for bug hunting and fix testing. If autoplay crashes after 37 turns, then it will crash consistently, which means it can be made to crash with the debugger attached. It is also a decent way of testing fixes.

Another good use of this is profiling. The game profiles 100 turns, some code is optimized, and then the same 100 turns can be profiled again for comparison. Say some code is optimized and the reading afterwards shows that the game runs 10% faster. Is that by chance, because the AI lost more units and took fewer actions, or is it because the optimization really provided a 10% speed boost? Removing any new random seeds will ensure that the game does the same thing in both runs, hence it is only a measurement of code efficiency.

With a cool setup like this, it will be possible to tell what kind of speed boost we can gain from hardcoding XML values as well as how to order the data in memory. If/when such optimization is coded and measured, we can use proper profiling to figure out if releases should be hardcoded for performance reasons. Personally I'm quite interested in knowing precisely how much performance can be gained this way, both for the game and for general coding "research". There are some data designs in vanilla which don't follow the general guidelines for writing high performance code. I would like to know how big a performance impact it has to not follow the guidelines, as this is important knowledge far beyond modding :scan:
 
Well, I think you can turn off random seed. Yeah, in a Custom game it gives you the option it seems.
That option controls whether you get a new random seed on game load. Effectively it means you can save, enter a ruin, load, enter again, and with a new random seed you get a different result, while without a new random seed you get the same outcome. The same goes for combat and stuff like that.

Regardless of this setting, some next turn code will provide a new random seed. I want to be able to ignore this call, which makes the random seed predictable for an unlimited number of turns, not just the same turn. It would be lame for gameplay to have no randomness at all, but it would rule for automated tests where different code configurations are compared.
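A minimal sketch of the "ignore the reseed call" idea, using a small linear congruential generator as a stand-in for the game's random class. The class name SketchRandom and its methods are hypothetical and not the DLL's actual interface.

```cpp
// Hypothetical stand-in for the game's random number class.
class SketchRandom
{
public:
    explicit SketchRandom(unsigned long ulSeed)
        : m_ulSeed(ulSeed & 0xFFFFFFFFUL), m_bIgnoreReseed(false) {}

    // When set, later reseed requests are silently dropped.
    void setIgnoreReseed(bool bIgnore) { m_bIgnoreReseed = bIgnore; }

    // What the next turn code would call to get a fresh seed.
    void reseed(unsigned long ulSeed)
    {
        if (!m_bIgnoreReseed)
        {
            m_ulSeed = ulSeed & 0xFFFFFFFFUL;
        }
    }

    // Returns a value in [0, uiMax). Intended for small uiMax only.
    unsigned int next(unsigned int uiMax)
    {
        m_ulSeed = (1103515245UL * m_ulSeed + 12345UL) & 0xFFFFFFFFUL;
        return static_cast<unsigned int>(((m_ulSeed >> 16) & 0x7FFFUL) * uiMax / 0x8000UL);
    }

private:
    unsigned long m_ulSeed;
    bool m_bIgnoreReseed;
};
```

Two generators that start from the same seed and ignore reseeding produce the same sequence no matter what seeds they are offered later, which is the property the repeatable autoplay runs depend on.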
 
With the lack of a python thread, I will use this one.

I wrote a guide on how to handle the python interface in the DLL. It took way longer than expected :( and I haven't proofread it or done any editing. I'm not sure if/when I will do anymore about it, which is why I link to it now. Hopefully people can understand enough based on this and if not, then know who to ask. I can then read it again to remember what I figured out way back when I needed to add CivEffect classes :lol:
https://sourceforge.net/p/colonizationmodcollection/wiki/Python Interface/

I added the tag CodingGuide, which I think we should use whenever we write something, which tells how to code for some specific feature or something. We can then ask the wiki to list all files with this tag to get an overview of what guides we have.
 
If that would be helpful for other mods or Civ4, you could post a thread for it in other places, and it would also draw attention to M:C and its sibling mods.
That's the plan. However for now I need my guinea pigs... errr... fellow modders to understand it first to ensure it's written correctly. It is also likely to get an update on the graphical presentation (indented code fails and such) and all the TODOs should likely be solved as well.
 
I have read Nightinggale’s Python Interface Guide to share an opinion, but as you know I am a mere xml number changer, so I assume I am not the target modder to comment on your guide.

But having said that, for people at my level, I think it could help if it is explained what a modder can achieve by adding a new class onto vanilla or an existing mod. From what I believe when I read the xml code, classes are professions, yields or promotions? Is that right?

So if the guide says that by adding a new class you can implement a new series of goods that need to be consumed on the map from wagons by a fortified army, that may make the modder's mouth water; they can see the point and continue reading.
 
I think it seems well written enough, but of course the necessary concepts would only really be understood by those with some programming expertise who are familiar enough with object oriented programming and the concepts behind classes/pointers/inheritance etc.
 
Maybe I should add a foreword telling that the intended reader is someone who has made a new C++ class and wants to add python access to it. I have been thinking of writing a bit on class inheritance or finding a good link, because CivEffects use it on a much higher level than vanilla. This means I can't assume other modders can follow what goes on. However it should explain as little as possible about C++ classes, just enough to understand the design I have set up. If people want to know everything about C++ classes, buy a book.

But having said that, for people at my level, I think it could help if it is explained what a modder can achieve by adding a new class onto vanilla or an existing mod. From what I believe when I read the xml code, classes are professions, yields or promotions? Is that right?
A class defines a type of object, be it CvUnitInfo (XML data) or CvUnit (each unit in game). A new class is needed if a new type of object is needed. CivEffects needs this because civicInfo is now split into civics, censures, perks and techs. That makes 3 new classes, but in reality there are more because using class inheritance allows them to share code for common features.

Writing this makes me wonder about trade screens. I did add a C++ class for them to allow saving prices and such. However I don't think we have a python interface for it. Considering the GUI is written in python, it might be a really good idea to provide full access to python.

So if the guide says that by adding a new class you can implement a new series of goods that need to be consumed on the map from wagons by a fortified army, that may make the modder's mouth water; they can see the point and continue reading.
I'm not quite sure that would require a new class and even if it did, I think it would have to be handled in the DLL, not python for performance reasons.

The thing is, telling what you can use a new class for is kind of pointless. That would be like telling people why a car is good by telling a specific destination people can drive to while most if not all people want/need to go somewhere else. The point in this guide is telling how to provide python access to C++ classes using the horribly outdated boost library (which can't be updated because of the exe), not what the C++ classes should contain. If I start writing about when to make new classes and how to inherit other classes in C++, it would turn into a book and there are plenty of books on this subject already.
 