I don't think it matters as much if we use a one-file or a two-file approach because save games aren't traded around as often.
Oh, I think it matters more than that. Play-by-email, succession games, and sharing saves when asking for help are all a thing, especially on this forum. But also, if you have a crapload of save files and go to clean them up or organize them, keeping the C3X and SAV files together would be a huge pain.
Unfortunately I don't know much about the internals of the save format. Don't all saves include a BIQ internally?
Well...maybe and probably not. Non-custom "epic" games don't have the BIQ in them, and C3C then loads the default Conquests.biq file. Games based on BIQs *do* include the BIQ, but I can confirm that the trailing data we add on after the BIQ is not thrown into the SAV. (Confirmed because the uncompressed autosave of my last test is smaller than my megabyte-of-trailing-data test BIQ.)
I believe both the BIQ and SAV data are serialized dumps of the in-memory class instances. They are not a "format" or generic serialization that I can discern. I'm not sure everyone agrees with me that these files are near-raw representations of in-memory objects, but only you and Antal1987 have really dug into the in-memory data.
If you look at the decompressed BIQ and SAV files with a hex editor or string parser you'll find what Civ3 format descriptions call "sections", each starting with a char[4] ASCII header like CIV3, GAME, TILE, WRLD, CITY, UNIT, LEAD. Many of these repeat at odd intervals or even seemingly at random. (Although I finally figured out that there are something like 86 repetitions of the CITY header for each city in the game. Even knowing that, each CITY section varies in length based on ... something, I forget, maybe citizen count and/or other stuff.)
I tend to think of these "sections" as actual representations of the class instance memory objects, largely because each map tile in C3C has 4 TILE sections with lengths (from memory, might be wrong) of 128, 12, 32, and 4. It seems clear that these represent a cascade of object inheritance, as the terrain data for PTW saves is in a different TILE portion than the terrain data for C3C saves.
In any case, there is no complete public understanding of the whole data structure, so both Quintillus and I approach it by searching for an occurrence (usually the first) of a particular header and then decoding data from that point. But there are pitfalls here. Sometimes a BIQ includes a map, which makes telling the SAV map from the scenario map a bit more involved. Also, C3C doesn't zero out allocated memory, so stray instances of text strings often pop up in uninitialized data, particularly in LEAD sections, and finding the first legit CITY and UNIT in particular takes some care.
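To make the header-search approach concrete, here's a rough Python sketch of it. This is not Quintillus's code or mine, just an illustration: the function name `section_offsets` is made up, and it assumes you already have the decompressed file contents as bytes. It does nothing about the stray-string problem, so every hit still has to be sanity-checked before you decode from it.

```python
def section_offsets(data, tag):
    """Return every byte offset at which the 4-byte section tag occurs.

    `data` is the decompressed SAV or BIQ contents and `tag` is a header
    such as b"TILE" or b"CITY". Hits are raw byte matches only: stray
    text in uninitialized memory can match too, so treat them as
    candidates rather than confirmed sections.
    """
    offsets = []
    pos = data.find(tag)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(tag, pos + 1)
    return offsets


def first_section(data, tag, start=0):
    """Return the offset of the first occurrence of `tag` at or after
    `start`, or None if there is no occurrence."""
    pos = data.find(tag, start)
    return None if pos == -1 else pos
```

The `start` argument is there for the embedded-map case: if you know roughly where the embedded BIQ ends, you can begin the search past it instead of taking the first hit in the whole file.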
Heh, that's the short version. Sorry.
The main point being, I *am* quite sure that the BIQ embedded in the SAV is deserialized into memory then serialized into the SAV, so no tacked-on data would survive that process as it's never slurped in by the deserializer in the first place.
Which actually gives me more hope that "hiding" data in unused/obsolete places in the existing data would survive the deserialization/serialization process without your having to jump into the file read/write routines.
Also when you say you're "manually decompressing" BIQs is that with an off the shelf decompressor like 7-zip or a scenario editor?
C3C uses the PKWare DCL (Data Compression Library) which seems to have been a proprietary-only streamable de/compression format (analogous to gzip), and pretty much NO standard utilities can handle it. I know one's mind jumps immediately to PKWare being equivalent to ZIP, but DCL is just not a format used in zip archives or utilities.
And it's not openly licensed or commonly used, so nobody is really freely distributing binary de/compressor utilities.
But it has been reverse-engineered, and the reference C code is known as "blast" (the PKWare internal name is "explode"):
https://github.com/madler/zlib/blob/master/contrib/blast/blast.c . So you can find various implementations here and there, or grab the code yourself.
For what it's worth, C3C does not use the Huffman-coded literals and always uses the same sliding dictionary size. I coded my own decoder-only implementation in Go, with no Huffman logic, and it works fine on Civ3 files.
https://github.com/myjimnelson/c3sat/tree/master/civ3decompress . All implementations are based on a reverse-engineering posted at
https://groups.google.com/g/comp.compression/c/M5P064or93o/m/W1ca1-ad6kgJ (although note that the blast.c implementor says the example encoding is incorrect...blast.c is well-documented, and I think blast.c is right.)
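If you just want to tell whether a given file is compressed before handing it to a decompressor, a quick header sniff is enough. A rough Python sketch: per the comments in blast.c, a DCL stream starts with one byte that is 0 for uncoded literals or 1 for coded literals, then one byte that is 4, 5, or 6 for the dictionary size. The function name `sniff_civ3_file` is made up, and the uncompressed magic values (a SAV starting with the CIV3 header, a BIQ starting with "BIC") are my assumptions rather than anything documented.

```python
def sniff_civ3_file(data):
    """Guess from the first few bytes whether a Civ3 file is compressed.

    Returns one of "uncompressed SAV", "uncompressed BIQ",
    "DCL-compressed", or "unknown".
    """
    # Assumption: an uncompressed SAV begins with the CIV3 section header.
    if data[:4] == b"CIV3":
        return "uncompressed SAV"
    # Assumption: an uncompressed BIQ begins with "BIC" plus one more byte.
    if data[:3] == b"BIC":
        return "uncompressed BIQ"
    # DCL stream header as described in blast.c: literal-coding flag
    # (0 or 1), then the number of extra distance bits (4, 5, or 6).
    if len(data) >= 2 and data[0] in (0, 1) and data[1] in (4, 5, 6):
        return "DCL-compressed"
    return "unknown"
```

Two bytes is a weak signature, so treat "DCL-compressed" as a hint and let the decompressor have the final say.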
But, in your case, the compressor and decompressor are obviously compiled into the executable, so it would probably make as much sense for you to leverage that as any third-party utility or library.
What I actually did just now was use a PowerShell script that calls a C# implementation of Blast to do the decompression. This is the implementation we're using so far in the new-game-from-scratch code. I had sort-of intended to port my Go decompressor code to C#, but there's not a lot of compulsion to do so.