Core DLL and XML DLL

Lib.Spi't

So I opened this thread so that anyone who is clever can look at (and hopefully add to) a list of things that will break cross compatibility for mods that are part of the UMS (Universal Mod Solution) Project. These are mods that share a common .dll code base.

Here are some previous comments that inspired me to make a thread on this:

Lib:
So I guess in this goal oriented conversation we need to understand what modifications will/will not or may/may not break the core .dll.

For the purposes of this conversation I am making two definitions in my mind: Core Dll and Xml Dll.

Now I know they are one and the same, but for this conversation they need a simple differentiation between 'breakable code base' and 'non-breakable code base'.

Core Dll:
Is the stuff like you just mentioned (see Night's post below).
Stuff that will break 'cross mod compatibility'.

Xml Dll:
Stuff that will automatically read a default (probably 'off' in most cases) for every mod unless it has the appropriate xml entry to make it do something. This will not affect cross compatibility, but will automatically allow any mod to use the new xml effects if it so chooses when it updates its .dll.
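As a sketch of what such a default-reading setup could look like in the DLL (the function and tag here are hypothetical, just to show the "off unless the mod opts in" behavior):

```cpp
// Hypothetical sketch of the "Xml Dll" idea: a new tag is read with a
// default of 0 ("off"), so a mod that never adds the xml entry behaves
// exactly as before, even though the DLL contains the new code.
int readIntTag(bool bTagPresent, int iValueInXml)
{
    const int iDefaultOff = 0;   // used by every mod that doesn't opt in
    return bTagPresent ? iValueInXml : iDefaultOff;
}
```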




Nighttinggale:

What I want to do before next release is to make a "final" version of XML and savegame setups.

1. CivEffects
2. No arrays in savegames. Replace all with JIT arrays as this will make savegames less likely to break
3. Make JIT arrays save even less data (I have a nice idea on how to reduce savegame file size)

That's it. Those changes could cause issues for other mods if added later, which is why I want it now. Everything else that I want to add can be added to multiple mods with few or no issues.

"So the final question, what other modifications/concepts would/might be Core Dll, not Xml Dll?"

Good question. Hopefully nothing after the next release. I made that list because I can't think of anything else offhand which I would like to add and which would break stuff. There are a number of core features (like AI) which can be improved without breaking anything.

If we need a new setting, which has to be set, one solution could be to assert on the default setting. That way we will know which setting to adjust to fix the problem, which is more or less the same as not breaking the game.

Now that I think about it, I wrote about route movement cost. It is currently set to how many 1/60ths of a movement point it costs. I want to change it into how many plots one movement point can travel, as this is a whole lot more friendly towards CivEffects, which can then give +1. In fact you can make a road which costs 1 movement point, have two CivEffects each giving +1, and they stack as intended, something which isn't possible right now.
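The proposed model can be sketched like this (function and parameter names are hypothetical, not from the actual DLL):

```cpp
// Hypothetical sketch of the proposed route cost model: a route stores how
// many plots one movement point can travel, and each CivEffect adds a flat
// +1, so a 1-plot road with two +1 CivEffects moves 3 plots per point.
int getRoutePlotsPerMovePoint(int iBasePlotsPerMove, int iNumPlusOneCivEffects)
{
    return iBasePlotsPerMove + iNumPlusOneCivEffects;
}
```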

I have come to the conclusion that we can split development into a few steps:

1. CivEffects is finished. XML modding can begin and graphics can be added.
2. More CivEffects (like traits) are added to XML and can be changed by a mod.
3. DLL gets a savegame upgrade, which breaks old savegames.
4. M:C is tested, bugfixed and released.

Originally I planned for other mods to start at step 4. However strictly speaking mod development could start when 1 is done, which is why I will aim at 1 right now.
 
Having answered a number of questions in the WHM thread, I decided to add a summary here.

The issue is how to set up the game to apply mod specific features. We have two ways to go:

1: move all setup to XML and make one DLL file which can turn any settings into a valid game. This will allow copying the DLL file from one mod to another.

2: hardcode settings in the DLL. This will require less coding and make the game run faster.

#1 is quite obvious and has been the goal according to everybody on the forum for a while. However currently we have #2 and I'm wondering about extending the DLL in this direction.

Currently we have 3 header files which are mod specific and are placed in sourceMOD (read: the mod git repository, not the DLL source repository). This is where yields are hardcoded.

#2 will sacrifice the 100% XML solution and place high performance settings in the DLL. This means we keep yields in the DLL and then add flags as well in order to configure it.

Example. Colonization 2071 requires a space domain at some point. The mod specific header will then contain
PHP:
#define USE_DOMAIN_SPACE
It can then be used in the shared DLL code like this:
PHP:
	switch (getDomainType())
	{
	case DOMAIN_SEA:
		return true;

	case DOMAIN_LAND:
		break;
#ifdef USE_DOMAIN_SPACE
	case DOMAIN_SPACE:
		return SpaceFunction();
#endif
	case DOMAIN_IMMOBILE:
		return false;

	default:
		FAssert(false);
		break;
	}
This means that at compile time, mods defining USE_DOMAIN_SPACE will include all space code, hardcoded for performance reasons, while mods not defining it will skip the code inside the #ifdef. In other words it ends up being a boolean setting which in theory could be set in XML, but we move it to the DLL for performance reasons.

Using the define flag approach will allow us to make a high performance DLL which is still fairly easy to configure for XML modders. Also, since the #ifdef is handled by the preprocessor, it can be added everywhere: in switch-cases (as in the example) or in enums. The compiler optimizes the code at compile time, before runtime (obviously). This means flag checks done in the preprocessor happen before optimization and cost nothing at runtime, while a runtime bool check can't be optimized away.
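For example, the same flag could gate an enum entry (a simplified sketch; the real DomainTypes enum has more to it):

```cpp
// Sketch: USE_DOMAIN_SPACE gating an enum entry, so DOMAIN_SPACE only
// exists at all in mods whose mod specific header defines the flag.
#define USE_DOMAIN_SPACE   // a space-faring mod's header would define this

enum DomainTypes
{
    DOMAIN_SEA,
    DOMAIN_LAND,
#ifdef USE_DOMAIN_SPACE
    DOMAIN_SPACE,
#endif
    DOMAIN_IMMOBILE,
    NUM_DOMAIN_TYPES   // 4 with the flag defined, 3 without
};
```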

The hardcoding should be done with well documented flags and not really anything else. This means the overall strategy of not hardcoding anything in the shared code, leaving it up to the XML modder to decide, is the goal for both solutions. The question is whether the XML modder should mod XML only, or XML and a few header files.

At some point we will have to make a decision if we want solution #1 or #2. From a coding perspective, I'm leaning more and more towards #2, which happens to be the performance solution as well. I know it means recompiling the DLL more often, but it's a tradeoff situation where neither solution can be perfect.
 
I just had a great idea for making setup a whole lot easier. If we require modders to have perl installed (I recommend strawberry perl because it's easy to install and works great), then we can add a perl script to the makefile/project file build commands.

This new perl script can then read a select few XML files and autogenerate the needed header files. I already wrote a script to add XML enum values for debugging and that would do this task just fine. There is also the script to create the yield groups, which can be changed to do the very same thing based on XML settings.

The result is that we would still have XML data hardcoded in the DLL. However when we compile, the script will apply the XML setup to the headers before doing the actual compilation. As a result, the XML modder can mod the XML files and then simply compile a new DLL without looking at the C++ code.
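To illustrate, a generated header for yields could look something like this (the yield names are just examples pulled from the discussion; the script would emit one enum value per &lt;Type&gt; in the XML file):

```cpp
// Hypothetical autogenerated header: do not edit by hand, rerun the script.
// One entry per <Type> in the yield XML file, in file order.
enum YieldTypes
{
    YIELD_FOOD,
    YIELD_LUMBER,
    YIELD_CROSS,
    NUM_YIELD_TYPES   // length known at compile time
};
```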

The define flags would also work in this setup. Back to the example of USE_DOMAIN_SPACE. The perl script simply defines it if the domain XML file contains the type DOMAIN_SPACE.

I think this will be the solution. We get the hardcoded coding abilities and performance, and the XML modder will only have to worry about the XML files, not the C++ files. Still, it would be best if all XML modders can compile, and it shouldn't be impossible to help everybody get the compiler up and running.
 
There are already tutorials for compiling, so we can always snatch and modify one to make a 'How To' UMS Thread.

I am not against hardcoded concepts; for example, culture is a hardcoded concept.

But via the xml, you can do a lot of things to use the concept of culture (expanding borders) in a different way.

It is just the idea of how many of the 'specifics' we can make the .dll look up in the xml to get an answer on what it should do.

For example:
City Types and Plot Group Bonuses:
Could we design a system where, instead of the DLL having a hardcoded rule that says something like:
CITYTYPE_MONASTERY when connected to CITYTYPE_ECONOMY in (plotgroup) gives 50% increase to YIELD_CROSS.

Could we have something that says:

<tagcitytypebonus> when connected to <tagcitytype> in (plotgroup) gives <tagcitytypebonusproduction> (I forget the word we were using for + and +%; it could also be a positive or negative value) to <tagyieldtype>.

That way we can make a whole bunch of Rules, the details of which are fiddled with in an XML file like citytypes.xml

?
 
It is just the idea of how many of the 'specifics' we can make the .dll look up in the xml to get an answer on what it should do.

For example:
City Types and Plot Group Bonuses:
Could we design a system where, instead of the DLL having a hardcoded rule that says something like:
CITYTYPE_MONASTERY when connected to CITYTYPE_ECONOMY in (plotgroup) gives 50% increase to YIELD_CROSS.

Could we have something that says:

<tagcitytypebonus> when connected to <tagcitytype> in (plotgroup) gives <tagcitytypebonusproduction> (I forget the word we were using for + and +%; it could also be a positive or negative value) to <tagyieldtype>.

That way we can make a whole bunch of Rules, the details of which are fiddled with in an XML file like citytypes.xml
That's a good idea, but that would not require DLL hardcoding at all. It could just be an array of yields for each city type. Plotgroups can then cache how many cities of each type they contain, meaning it will be fairly quick for the city to look up the numbers it needs for the calculation.

Also I would like it to stack without making it insanely good. Something like this:

Production of yield X is increased by 50%/N. This means if you have it connected to 3 monasteries, then crosses would be increased by 50/1 + 50/2 + 50/3 = 50 + 25 + 16 = 91%. 50% is likely too high a number to start with, but that's not an issue for the concept itself.
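The diminishing stacking rule can be sketched like this (the function name is made up; integer division matches the 50 + 25 + 16 = 91 arithmetic above):

```cpp
// Sketch of the diminishing stacking rule: the n-th connected city of the
// matching type contributes iBaseBonus/n percent. Integer division keeps
// it consistent with the worked example (50% over 3 monasteries = 91%).
int getStackedBonusPercent(int iBaseBonus, int iNumCities)
{
    int iTotal = 0;
    for (int n = 1; n <= iNumCities; ++n)
    {
        iTotal += iBaseBonus / n;
    }
    return iTotal;
}
```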

However this idea is written in the wrong thread :p

If this is about the concept of hardcoding for performance, then the compile script could check if this modifier is even mentioned in XML. If the bonus is always 0%, then the compiler can disable the check itself, making the next turn event faster. It's not really what I had in mind, but it would be possible if the build instructions call a perl script prior to compiling.
 
I just had a new idea. If we decide to use a perl script when compiling, then we can autogenerate a header file containing the enum types which match the XML files. Not just yields, but all of them. This means that while debugging, the types will always appear. However, even more important, it allows adding the number of elements in an enum to the enum itself, like replacing GC.getNumUnitInfos() with NUM_UNIT_TYPES. The compiler translates the enum to a fixed number, which it inserts in the machine code in the DLL, making it an instant lookup. The GC version will at runtime look up the global pointer, call the getNum function through that pointer, and then go into a vector to get the length. That is a lot of memory I/O we avoid, which makes the code faster.
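The two lookup styles can be contrasted in miniature (Globals here is a stand-in for CvGlobals; the real getter also goes through a global pointer before reaching the vector):

```cpp
#include <vector>

// Compact stand-in for the two lookup styles. The enum constant is baked
// into the machine code at compile time; the getter has to read the vector
// length from memory every time it is called.
enum { NUM_UNIT_TYPES = 4 };   // compile-time constant, free to read

struct Globals                 // stand-in for CvGlobals
{
    std::vector<int> m_unitInfos;   // filled during XML loading
    int getNumUnitInfos() const { return (int)m_unitInfos.size(); }
};
```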

Another benefit is that the enum is available at compile time while GC is available only after the XML has been read. The tiny difference can be significant because it means we can reserve memory for the entire XML file before it's read. Having the data stored in one block rather than many will use the CPU cache more efficiently when we loop over all infos in an array, which again allows for faster code. Also, currently JIT arrays can't be members of a class which is allocated before the XML is read. This means we can't use them in CvGlobals, and it also matters for the order in which XML files are read. If JIT arrays start to use enums instead, they can be used everywhere without any problems.

Another interesting thing is that if everybody installs strawberry perl, then I can assume everybody has everything it installs, not just perl. It also installs the gcc compiler as well as GNU make. Currently we use NMake, but it is full of issues, and most of those issues aren't present in GNU make. This means if I run into a problem again, I can simply switch to GNU make, update the project file to call the right one, and then it will work again. Nobody else will even have to know about it; it will just happen in the background. We might even switch to the gcc compiler at some point if we take this path, as it is able to optimize for modern CPUs.

The more I think about it, the more I think we should demand that all modders have strawberry perl installed and use perl to hardcode XML data in the DLL.
 
Using perl doesn't seem like a problem, if you are going to learn to compile then you can probably, without much more effort, learn to use perl too.

We would just need to develop a nice tutorial that takes people through the whole process from start to finish. When I tried to use perl before, having never done anything with it, I just could not figure out what I was supposed to do or how it actually worked. I could definitely see the benefits of it though.

The point of my example was not to look specifically at one type of city and percentage; it was more of a broad question on DLL vs XML: how many gameplay concepts/rules can we have 'complete' control over in the xml without it crippling performance in some way?

Also, if a concept was available in the Xml but not used by the mod, like say a citytypes.xml, how much performance would be lost by its 'silent existence' compared to if it did not exist at all?

It is just going back to the idea of how many of the current gameplay features of M:C could be turned into an xml 'moddable' concept without destroying performance like you said Domains would.
 
Using perl doesn't seem like a problem, if you are going to learn to compile then you can probably, without much more effort, learn to use perl too.
The issue is whether perl should be part of the build process, not whether people can code in perl. Most modders don't actually know how the compiler works. They click build, they get a DLL file, and that is enough to mod just fine. This will be the same with and without perl.

If we make the build process rely on perl, then a perl script will be called when you click build. The modder doesn't have to know this script is called or be able to understand it. What is important here is that if we agree that everybody installs strawberry perl, then I know that I can make the build process rely on any exe installed with that installer (like perl.exe) and it will just work.

We would just need to develop a nice tutorial that takes people through the whole process from start to finish.
We have a tutorial on the wiki. All we would have to do is to add "install strawberry perl from this URL" on the page regarding installing the compiler.

Also, if a concept was available in the Xml but not used by the mod, like say a citytypes.xml, how much performance would be lost by its 'silent existence' compared to if it did not exist at all?

It is just going back to the idea of how many of the current gameplay features of M:C could be turned into an xml 'moddable' concept without destroying performance like you said Domains would.
This is tricky to answer. Simply put, each time an XML value is used, it is a bit slower than using a hardcoded value. The real question is how often it is called. If it is used once for each city or even each player on the new turn event, we will not be able to tell the difference. If it is used when calculating whether a unit can enter a plot, then there will be a huge difference because the AI does that all the time.

The XML values called most often would be yield and domain (I think), and vanilla hardcodes those two. We simply do not know how much it will slow down if we un-hardcode them, but in theory it could matter a lot. Also, GC.getNum*Infos() are used all the time. They use so little each time that the profiler has a hard time picking up how much time is spent in them, but we would benefit if we could change all of them into enum values.

However "how many" is not really the question. If we add a perl script to handle hardcoded yields (and possibly domains), then it makes no sense to waste time moving other data to XML. It's an all or nothing question, and if we hardcode something in the DLL (and hence will need to recompile to match the XML), then the real question isn't whether a setting is exposed in XML, but rather whether it is easily available to the XML modder. If we like, we can add a tag to XML which the DLL will not even read. Instead it is used by the perl script to generate hardcoded values for a header file. The result would be that the XML modder can modify everything in XML.

The only difference is that the XML modder would have to recompile the DLL if anything regarding <Type> is changed (added/removed/renamed/changed order). Also, the other hardcoded tags which would require a recompilation could be put into a tag group named something like <HardcodedTags>.
 
I just had a new idea. If we decide to use a perl script when compiling, then we can autogenerate a header file containing the enum types which match the XML files. Not just yields, but all of them. This means that while debugging, the types will always appear. However, even more important, it allows adding the number of elements in an enum to the enum itself, like replacing GC.getNumUnitInfos() with NUM_UNIT_TYPES. The compiler translates the enum to a fixed number, which it inserts in the machine code in the DLL, making it an instant lookup. The GC version will at runtime look up the global pointer, call the getNum function through that pointer, and then go into a vector to get the length. That is a lot of memory I/O we avoid, which makes the code faster.

Another benefit is that the enum is available at compile time while GC is available only after the XML has been read. The tiny difference can be significant because it means we can reserve memory for the entire XML file before it's read. Having the data stored in one block rather than many will use the CPU cache more efficiently when we loop over all infos in an array, which again allows for faster code. Also, currently JIT arrays can't be members of a class which is allocated before the XML is read. This means we can't use them in CvGlobals, and it also matters for the order in which XML files are read. If JIT arrays start to use enums instead, they can be used everywhere without any problems.

Another interesting thing is that if everybody installs strawberry perl, then I can assume everybody has everything it installs, not just perl. It also installs the gcc compiler as well as GNU make. Currently we use NMake, but it is full of issues, and most of those issues aren't present in GNU make. This means if I run into a problem again, I can simply switch to GNU make, update the project file to call the right one, and then it will work again. Nobody else will even have to know about it; it will just happen in the background. We might even switch to the gcc compiler at some point if we take this path, as it is able to optimize for modern CPUs.

The more I think about it, the more I think we should demand that all modders have strawberry perl installed and use perl to hardcode XML data in the DLL.
hmm I can't say I totally understand the implications of all the above, but it sounds fine to me - installing strawberry perl is fairly straightforward for anyone, and who knows I may be able to take some advantage of my slight perl knowledge! :science:

What does it mean that you'd have to recompile the DLL if anything regarding <Type> is changed? If that's any time you added a Unit / Building etc. you'd potentially be recompiling a lot, but I guess it could be manageable, especially if the resulting DLL is faster.

It's a cool idea to have a tag group like <HardcodedTags> keep track of all hardcoded content - in making a total conversion mod it takes a surprisingly long time to get down to a basic "skeleton" version, removing M:C-specific content to start fresh with new content while still finding and keeping the M:C content tags (UNITCLASS_RELIC, etc.) that are required due to hardcoding.
 
What does it mean that you'd have to recompile the DLL if anything regarding <Type> is changed? If that's any time you added a Unit / Building etc. you'd potentially be recompiling a lot, but I guess it could be manageable, especially if the resulting DLL is faster.
Yeah, if the XML length is hardcoded, changing the length would require recompiling. It is a tradeoff, not just a bonus, which is why there is a thread about it. If it were some internal DLL change which makes the game faster with no side effects, then I would implement it without any debate.

If you start the game from MSVC, it will recompile if needed. This means if you edit XML and you always start the game from the project file, you will not have to consider if the DLL needs recompiling. This requires that I write a guide on how to set up the project file in a way that allows this. Maybe it would be faster to simply write a perl script which does it, as it has to be done for each target.

In semi related news I figured out how to configure the project to start the game in the debugger and load the correct DLL :)

One good thing about using a perl script and possibly hardcoding XML data is that at the moment, it will not actually change anything in the XML files themselves (except if we add yield group tags). This means it can be added to the DLL at any time without considering XML file compatibility.

It's a cool idea to have a tag group like <HardcodedTags> keep track of all hardcoded content - in making a total conversion mod it takes a surprisingly long time to get down to a basic "skeleton" version, removing M:C-specific content to start fresh with new content while still finding and keeping the M:C content tags (UNITCLASS_RELIC, etc.) that are required due to hardcoding.
Generally speaking it would be good to clean up the XML schema files. To some extent even vanilla is messy, and M:C sure hasn't improved on this. Luckily I have perl scripts which can be used to upgrade XML files to a new schema file, which will make upgrading multiple mods easier in the future. I have already planned to make it more modular, meaning if you make a mod using DLL version 2 and you upgrade to version 2.1, then you can simply run the 2.1 upgrade script and it should work.
 
It's a cool idea to have a tag group like <HardcodedTags> keep track of all hardcoded content - in making a total conversion mod it takes a surprisingly long time to get down to a basic "skeleton" version, removing M:C-specific content to start fresh with new content while still finding and keeping the M:C content tags (UNITCLASS_RELIC, etc.) that are required due to hardcoding.

We should completely remove all hardcoded types in the DLL. Any such hardcodes we have now can be added to the xml, like in WorldInfos.xml or BasicInfos.xml. Then make sure the game will still run if any of those entries are NULL, or have it assert. All the hardcodes should be in GlobalDefineAlt.xml.

Yeah, if the XML length is hardcoded, changing the length would require recompiling. It is a tradeoff, not just a bonus, which is why there is a thread about it. If it were some internal DLL change which makes the game faster with no side effects, then I would implement it without any debate.

So, what we are talking about here is making it so that modders would have to install perl and have a script run each time they add a new "type", in order to increase performance, right? Well, for us Big Four (Lib, Night, orlanth, Kailric) it wouldn't be that big of an issue, but it would not be very user friendly at all. Most casual players will just download the mod and start playing without ever reading the threads on the forums, and if they make any changes to types, will this mess up their game? I think we should concentrate on other performance enhancements first before making that big a change.

In semi related news I figured out how to configure the project to start the game in the debugger and load the correct DLL :)

Please do tell:deal:
 
Most casual players will just download the mod and start playing without ever reading the threads on the forums, and if they make any changes to types, will this mess up their game? I think we should concentrate on other performance enhancements first before making that big a change.
Good point. It would require a bit of new code, but interestingly it would require nearly the same amount of coding to make it a compile time switch. This means if the length in the DLL and the XML isn't the same, we can generate a popup telling people to use the dynamic assert DLL. We would then include two DLLs in releases. Even if people do not read the readme or forums, they would likely read an error popup, which will then have to be well written.

Regardless of optimization, we still have the issue of yield hardcoding. Lib.Spi't rightfully complained that it is too complex to add more yields since yields are hardcoded in so many locations. However, the more I think about it, the less convinced I am about moving everything to XML. I think it would be better to autogenerate when compiling. Particularly yield groups are tricky to write code for by hand, and the current code is made by a script. I do have a goal of writing yield stuff in yieldInfo.xml and then having a script/DLL code generate all the needed yield code, rather than the current setup where I think you need to edit 7 files to add a yield, and that is not even counting graphics.

Currently the DLL checks most/all hardcoded XML values against what is read from XML. We should keep doing that and make the error message a proper popup window rather than an assert, because even release builds should report errors which will make the game fail to execute correctly.
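A minimal sketch of such a check, producing a user-readable message instead of a silent assert (the function, file name, and message wording are all hypothetical):

```cpp
#include <string>

// Hypothetical sketch of the consistency check described above: compare a
// length hardcoded at compile time against what the XML loader actually
// read, and return a popup-ready message on mismatch (empty string = OK).
std::string checkXmlLength(const std::string& szFile, int iHardcoded, int iRead)
{
    if (iHardcoded == iRead)
    {
        return "";   // lengths match; no popup needed
    }
    return "The number of entries in " + szFile +
           " does not match this DLL. Please use the dynamic (assert) DLL "
           "or recompile.";
}
```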
 
Now, adjusting the code to make adding new yields more user friendly would be a different issue, at least to me. Any modder that is going to start adding new yields will learn really quickly that it is no easy task, and if they are knowledgeable enough to pull it off, then they should be able to handle perl scripts. If we can work up the dll for a more modder friendly way to adjust yields, then I would be all for it. We could then have a tutorial thread describing how to add new yields with the new dll. Only serious modders would go that far in their modding, and serious modders would be more apt to check the forums for solutions.
 
Any modder that is going to start adding new yields will learn really quickly that it is no easy task, and if they are knowledgeable enough to pull it off, then they should be able to handle perl scripts.
The idea is that it is called just like nmake is called right now. The modder will not have to know how to do it in order to use it. However, the modder would have to install the exe (like nmake.exe or perl.exe) in order for the compilation to work. Knowing perl is a bonus, but it should be possible to use this without ever looking at a perl script or even manually executing one.

Now that I think about it, if we add the generated header file to git and the build fails to find perl, it will print an error about not finding perl but otherwise compile just fine, as it uses the header it downloaded with git. In other words, perl will only be needed if the modder changes something which is hardcoded in the DLL, like yields or domains (at some point we should make a complete list of which XML files are hardcoded).

I have been thinking a bit about hardcoding and performance vs modder friendliness. We should have a "final release" target which hardcodes, and then release with both optimized and assert DLL files. This means regular compiling will produce a dynamic DLL, which only needs updating when hardcoded XML files are edited (hard to avoid that), while we can still produce a DLL optimized as much as possible for speed for non-modding users.

The way this is done is by adding code like this:
In some header:
PHP:
#ifdef XML_HARDCODING
#define XML_HARDCODE_PREFIX const
#define XML_HARDCODE_SUFFIX( x ) = x
#else
#define XML_HARDCODE_PREFIX extern
#define XML_HARDCODE_SUFFIX( x )
#endif

XML_HARDCODE_PREFIX UnitTypes iNumUnitTypes XML_HARDCODE_SUFFIX( NUM_UNIT_INFOS );
// repeat with a line here for each XML file
The preprocessor will then expand this to produce:
PHP:
// hardcoded
const UnitTypes iNumUnitTypes = NUM_UNIT_INFOS;

// dynamic (non-optimized)
extern UnitTypes iNumUnitTypes;
The first will be a number fixed at compile time and work like the enum. The latter means any file using iNumUnitTypes will use the same 32 bit variable in memory, which can then be set during XML loading. In other words all files will be able to use iNumUnitTypes and the compiler will generate the code we expect in both cases. The only downside is that the dynamic variable would not be write protected, but if some code writes to it, the compiler will catch the error when compiling the hardcoded build, where it is a const.
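As a runnable illustration of the hardcoded branch (using int and a literal 48 as stand-ins for UnitTypes and the autogenerated NUM_UNIT_INFOS):

```cpp
// Sketch of the hardcoded branch of the macro trick above. int and the
// literal 48 stand in for UnitTypes and the autogenerated NUM_UNIT_INFOS;
// removing the XML_HARDCODING define would switch the line to an extern
// declaration instead, with no change to any code using the variable.
#define XML_HARDCODING

#ifdef XML_HARDCODING
#define XML_HARDCODE_PREFIX const
#define XML_HARDCODE_SUFFIX( x ) = x
#else
#define XML_HARDCODE_PREFIX extern
#define XML_HARDCODE_SUFFIX( x )
#endif

// expands to: const int iNumUnitTypes = 48;
XML_HARDCODE_PREFIX int iNumUnitTypes XML_HARDCODE_SUFFIX( 48 );
```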
 