I finally reached the point where I could start working on the new JIT array save code. A block consists of a header byte as well as a start and an end. If the code thinks it's a good idea, it will save an array as multiple blocks and merge them on load.
Example. Say we have an array where we have written to elements 4, 5, 6 and 18. It will make one block starting at 4, containing 3 elements and another one starting at 18 containing a single element. Everything not mentioned in the savegame will be set to the default value (usually 0, but it can be whatever the C++ coder wants it to be, like NO_CIVIC).
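A minimal sketch of how the splitting in this example could work. The names Block, findBlocks and maxGap are made up for illustration and are not the actual JIT array code; maxGap is an assumed knob for how many default values may sit inside a block before it is split.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block descriptor: first index plus number of stored elements.
struct Block
{
    std::size_t start;
    std::size_t length;
};

// Collect runs of non-default elements.
// maxGap (assumption): gaps of up to maxGap default values are kept inside
// a block instead of starting a new one. 0 means every gap splits.
std::vector<Block> findBlocks(const std::vector<int>& data, int defaultValue, std::size_t maxGap = 0)
{
    std::vector<Block> blocks;
    std::size_t i = 0;
    while (i < data.size())
    {
        if (data[i] == defaultValue)
        {
            ++i;
            continue;
        }
        const std::size_t start = i;
        std::size_t last = i; // last non-default index seen in this block
        for (std::size_t j = i + 1; j < data.size() && j - last <= maxGap + 1; ++j)
        {
            if (data[j] != defaultValue)
            {
                last = j;
            }
        }
        Block block = { start, last - start + 1 };
        blocks.push_back(block);
        i = last + 1;
    }
    return blocks;
}
```

For a 20 element array with writes to 4, 5, 6 and 18, this gives one block starting at 4 with 3 elements and one starting at 18 with a single element. Everything outside the blocks is simply not stored and would get the default value on load.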
I ran this on some actual game arrays to see if it works as intended and to see how well it performs. Saving a specific array I looked into:
Vanilla: 88 bytes
JIT (old): 78 bytes
JIT (new): 9 bytes
I would call that a success.

Part of the reason why the saving is this good is that each block looks at the min and max values of the numbers in the block. It then figures out the smallest variable able to hold all the numbers. In this case it was an int array (4 bytes per element), but all the numbers were so low that they could fit in a single byte. That alone gave a 3 byte or 75% size reduction for each element. The other reason is that the array contains only 3 elements different from 0. They are so far apart that it makes 3 blocks, each with a single element. The old approach was to also save the 0s in between the important elements.
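The min/max idea can be sketched roughly like this. It assumes a 32 bit int and signed storage, and bytesPerElement is a made-up name, not the function in the DLL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Bytes needed per element so that every value in the block fits in a
// signed integer of that width (assumes int is 32 bit).
int bytesPerElement(const std::vector<int>& block)
{
    assert(!block.empty());
    const int lo = *std::min_element(block.begin(), block.end());
    const int hi = *std::max_element(block.begin(), block.end());

    if (lo >= std::numeric_limits<std::int8_t>::min() && hi <= std::numeric_limits<std::int8_t>::max())
    {
        return 1;
    }
    if (lo >= std::numeric_limits<std::int16_t>::min() && hi <= std::numeric_limits<std::int16_t>::max())
    {
        return 2;
    }
    return 4;
}
```

Since the check is done for each block, one large number only costs extra bytes in the block it lives in, not in the whole array.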
I then started wondering how to test if it always behaves as expected. It appeared to work just fine from testing, but being 98% sure of working code isn't good enough if the remaining 2% corrupts savegames at random. After thinking long and hard about how to become as close to 100% sure as we can get, I came up with what I would consider a brilliant solution. The new code is written from scratch, leaving the old read and write functions intact in JIT arrays. Now if SAVEGAME_DEBUG is defined, it will save with both systems. On load, it will load the new format into a temporary JIT array and then assert that the two arrays have loaded precisely the same data. If two different functions save the same data and both load the same data back, then we can be pretty sure it went well. I defined SAVEGAME_DEBUG in the mod define header and we will use that for quite a while now. It should be removed prior to making a stable release, but if it can be used throughout testing without asserting even a single time, we should be able to trust the new code. It also adds some counters to the savegame to assert on, to check that each class reads and writes the same number of bytes.
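To illustrate the double save idea, here is a self-contained sketch. Only SAVEGAME_DEBUG is a real name; Stream, writeOld, readOld, writeNew, readNew, save and load are stand-ins I made up, and both formats are simplified toys rather than the real savegame layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#define SAVEGAME_DEBUG // in the mod this comes from the mod define header

typedef std::vector<int> Stream; // stand-in for the real savegame stream

// Toy "old" format: element count, then every element.
void writeOld(Stream& s, const std::vector<int>& a)
{
    s.push_back(static_cast<int>(a.size()));
    s.insert(s.end(), a.begin(), a.end());
}

std::vector<int> readOld(const Stream& s, std::size_t& pos)
{
    const std::size_t n = static_cast<std::size_t>(s[pos++]);
    std::vector<int> a(s.begin() + pos, s.begin() + pos + n);
    pos += n;
    return a;
}

// Toy "new" format: element count, number of non-zero entries, then
// (index, value) pairs. Everything not mentioned loads as 0.
void writeNew(Stream& s, const std::vector<int>& a)
{
    s.push_back(static_cast<int>(a.size()));
    const std::size_t countPos = s.size();
    s.push_back(0);
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (a[i] != 0)
        {
            s.push_back(static_cast<int>(i));
            s.push_back(a[i]);
            ++s[countPos];
        }
    }
}

std::vector<int> readNew(const Stream& s, std::size_t& pos)
{
    std::vector<int> a(static_cast<std::size_t>(s[pos]), 0);
    const int entries = s[pos + 1];
    pos += 2;
    for (int e = 0; e < entries; ++e, pos += 2)
    {
        a[static_cast<std::size_t>(s[pos])] = s[pos + 1];
    }
    return a;
}

void save(Stream& s, const std::vector<int>& a)
{
#ifdef SAVEGAME_DEBUG
    writeOld(s, a); // debug builds store the array with both systems
#endif
    writeNew(s, a);
}

std::vector<int> load(const Stream& s, std::size_t& pos)
{
#ifdef SAVEGAME_DEBUG
    const std::vector<int> result = readOld(s, pos);
    const std::vector<int> temp = readNew(s, pos); // temporary array
    assert(temp == result); // both systems must agree exactly
    return result;
#else
    return readNew(s, pos);
#endif
}
```

After a save followed by a load, the read position should end up exactly at the end of what was written, which is the same kind of check as the byte counters mentioned above.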
DLL version
I added makefile code (technically to Makefile.project, as it is unique to our project). It will run git to get the revision ID and forward it to a perl script, which then generates a .h file. Here the revision string is a static const char pointer (that is, a read only string using minimal resources). A CvString copies the content and then it calls readWrite(). This means the string enters the savegame and whatever is in the savegame will enter the string on load. It isn't used at all, but it can be read using a debugger, which means DLL modders can identify the version of the DLL. That would be really helpful if somebody shows up with a buggy savegame in the future while saying "I don't know which version it is from".
The header is ignored by git, which means it will never be committed. Instead it is generated just prior to running fastdep. This naturally adds the requirement of having git and perl installed. The reason why the header isn't committed is that it would always be outdated. It will contain the current revision when you reach the point where you commit, but once the file goes into git, a new revision is made while the header file still contains the old one. Because of this, it wouldn't make any sense to store the revision. We should figure out what to do about released source code and this requirement. Right now it wouldn't compile because such a release wouldn't be in a git repository. It should likely have some release flag to make the generated header optional, and the release should include a header file stating that it is the release.
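One possible shape for such a release flag, purely as a sketch. MOD_SOURCE_RELEASE, autogenerated_revision.h and szDllRevision are hypothetical names, not what the makefile or the perl script currently use.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag. A source release would define it, since there is no
// git repository to ask for a revision.
#define MOD_SOURCE_RELEASE

#ifdef MOD_SOURCE_RELEASE
// Fixed string shipped with the release instead of a generated header.
static const char* const szDllRevision = "source release (no git revision)";
#else
// Hypothetical name for the header the perl script generates. It would be
// expected to define szDllRevision with the current git revision.
#include "autogenerated_revision.h"
#endif

// Stand-in for copying the constant into a string that is then saved.
std::string getDllRevision()
{
    return std::string(szDllRevision);
}
```

With the flag set, the build no longer needs git or perl, and a savegame from such a build would still tell which kind of DLL wrote it.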
XML value conversion on load
I use the JIT conversion to convert even more arrays on game load, meaning savegames will be preserved if even more xml files are changed. To do this, I just added a whole bunch of xml file data to savegames and I suspect I might have overdone it. The savegames are now full of strings for the tables and I lost track of what is used and what isn't.
The savegame appears to be too big and too complex to keep track of such types manually. I'm now considering just adding access to all xml files and then making a counter for how many times each one is used. This will allow adding assert checks for unwanted numbers, like if it detects saving the strings for an xml file while the type is used 0 times.
We should have some on/off switch to tell if we want xml files to be included in a savegame. Or rather, we should decide if we want the conversion table for the type. If we happen to save something which lacks a conversion table, then we can simply write the type string. On load, since the table wasn't in the savegame, it knows it should read a string and then loop through the types of the xml file in question to find a matching ID. While I prefer to save ints when saving a number, it would make sense to save them as strings if there are fewer variables saved than there are types in the xml file. It wouldn't surprise me if some types are only saved once. Also, even with say 30 diploEvents in a savegame, it would pay off to save those 30 strings rather than the (currently) 154 strings needed for the conversion table.
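A rough sketch of both halves of that idea: the rule of thumb for when the table is worth it, and the string lookup on load. The function names and the example type strings are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rule of thumb from the paragraph above: the table costs one string per
// type in the xml file, so it only pays off when at least that many values
// of the type are saved.
bool shouldWriteConversionTable(std::size_t numSavedValues, std::size_t numTypesInXml)
{
    return numSavedValues >= numTypesInXml;
}

// Without a table, each value is saved as its type string. On load, loop
// through the types of the xml file to find the matching ID.
// Returns -1 if the string no longer exists in the xml file.
int findTypeIndex(const std::vector<std::string>& xmlTypes, const std::string& savedType)
{
    for (std::size_t i = 0; i < xmlTypes.size(); ++i)
    {
        if (xmlTypes[i] == savedType)
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With 30 saved values and 154 types, shouldWriteConversionTable returns false, so the strings would be written directly. A real check would probably compare byte counts rather than string counts, since a table entry and a saved int aren't the same size.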
Also I looked into the ability to set a prefix to a type and then not save the prefix in the table. This would reduce the disk space needed to store some files by more than 50%. Now I can't think of any reason not to do this, other than it requires some xml type cleanup.
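The prefix idea could look something like this. The helper names and the UNIT_ example strings are made up, and the assert shows why the xml type cleanup is needed first: every type in the file has to actually start with the prefix.

```cpp
#include <cassert>
#include <string>

// True if the type string starts with the given prefix.
bool hasPrefix(const std::string& type, const std::string& prefix)
{
    return type.compare(0, prefix.size(), prefix) == 0;
}

// Drop the shared prefix before writing the string to the table.
std::string stripPrefix(const std::string& type, const std::string& prefix)
{
    assert(hasPrefix(type, prefix)); // fails for types needing xml cleanup
    return type.substr(prefix.size());
}

// Put the prefix back on load, before looking the type up.
std::string restorePrefix(const std::string& stored, const std::string& prefix)
{
    return prefix + stored;
}
```

For example, stripPrefix("UNIT_SOLDIER", "UNIT_") stores just "SOLDIER", and restorePrefix("SOLDIER", "UNIT_") gives the full type string back on load.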
Even though I have done quite a lot to the savegame now, it seems that there are still quite a lot of potential improvements. At least I'm not bored.
I'm reading what you're saying, and as soon as I get my bachelor's in applied science I'll give you a proper reply.

But it all sounds good.
And I was optimistic and expected a proper reply this century
