Multithreading & 64bit memory access to increase Civ4's speed in large games?

So you are saying automation is bad for turn times? I knew it. Automation is the devil's tool and now I have proof.
 
Now you just need to stop the AI from automating things.

The real solution of course is to make it fast. There's a huge amount of redundant computation. A city shouldn't have to reevaluate its plots if they didn't change. I wonder if you could have like some sort of centralised cache that maps arbitrary plots to arbitrary values. The hard problem is ensuring correctness. Proper invalidation.

Anyway, logging:

Spoiler End of game startup sequence :
Code:
Mutating CvInitCore::setLeader
Mutating CvInitCore::setColor
Mutating CvInitCore::setMinorNationCiv
InitUninit CvPlayer::init
InitUninit CvMap::init
CreateDestroy CvMapGenerator::GetInstance
Mutating CvMapGenerator::generateRandomMap
Mutating CvMapGenerator::addGameElements
InitUninit CvGame::initDiplomacy
Mutating CvGame::setInitialItems
Mutating CvGame::initScoreCalculation
InitUninit CvGame::initEvents
Mutating CvGame::setFinalInitialized
InitUninit CvGlobals::SetGraphicsInitialized
InitUninit CvPlayer::setupGraphical x19
InitUninit CvMap::setupGraphical
InitUninit CvSelectionGroup::reset
Mutating CvGame::getGlobeviewConfigurationParameters
Mutating CvMap::updateSymbolVisibility x2
Mutating CvInitCore::reopenInactiveSlots
Mutating CvGame::setInitialTime
Mutating CvPlayer::setStartTime x19
Python CyArgsList::makeFunctionArgs
Mutating CvMap::updateMinimapColor
CreateDestroy CvPlotIndicatorData::CvPlotIndicatorData
InitUninit CvEventReporter::gameStart
Python CyArgsList::makeFunctionArgs
Mutating CvPlayer::updateHuman x19
Python CyArgsList::makeFunctionArgs
ComplexGetter CvGame::calculateOptionsChecksum
ComplexGetter CvGame::calculateSyncChecksum
Mutating CvGame::update
Python CyArgsList::makeFunctionArgs
CreateDestroy CyUnit::CyUnit
CreateDestroy CyPlot::CyPlot
CreateDestroy CyUnit::CyUnit
Python CyArgsList::makeFunctionArgs
CreateDestroy CvPopupInfo::CvPopupInfo
Mutating CvPopupInfo::operator =
CreateDestroy CvDLLButtonPopup::getInstance
Mutating CvDLLButtonPopup::launchButtonPopup


This logging should only be showing entries into the DLL from the EXE, with the help of _ReturnAddress(). That return address should also let you see the call site in Ghidra and any surrounding logic.
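
A minimal sketch of how that kind of entry logging might look, assuming MSVC's `_ReturnAddress()` intrinsic from `<intrin.h>`, with GCC/Clang's `__builtin_return_address(0)` as a stand-in so the sketch builds elsewhere. `ModuleRange`, `logIfExternalCaller`, `LOG_ENTRY` and `g_log` are hypothetical names, not from the DLL, and the DLL's address range is assumed to be known already.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

#if defined(_MSC_VER)
#include <intrin.h>
#define CALLER_ADDRESS() _ReturnAddress()
#else
// Stand-in for non-MSVC compilers so the sketch builds anywhere.
#define CALLER_ADDRESS() __builtin_return_address(0)
#endif

// Hypothetical: the address range [begin, end) occupied by the DLL's own code.
struct ModuleRange {
    std::uintptr_t begin;
    std::uintptr_t end;
    bool contains(const void* p) const {
        const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
        return a >= begin && a < end;
    }
};

std::vector<std::string> g_log;  // hypothetical in-memory log

// Record the call only if the caller lives outside the DLL,
// i.e. the call came from the EXE rather than from another DLL function.
bool logIfExternalCaller(const ModuleRange& dll, const void* caller,
                         const char* category, const char* function) {
    if (dll.contains(caller)) return false;
    char line[256];
    std::snprintf(line, sizeof line, "%s %s (caller %p)", category, function, caller);
    g_log.push_back(line);
    return true;
}

// Place at the top of an exported function. The macro expands inside that
// function, so CALLER_ADDRESS() yields the address the function returns to.
#define LOG_ENTRY(dll, category) \
    logIfExternalCaller((dll), CALLER_ADDRESS(), (category), __FUNCTION__)
```

The logged caller address is what you would look up in Ghidra to find the call site.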

With a log of function calls, it should be possible to create a simple program that uses the DLL and just executes turns and compares save files. If you want that x64 engine. But either way, if you get a working DLL out of VS2022, then at least you can easily modernise the code. A fun exercise for anybody who has the time.
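
The comparison step of such a harness could be as simple as the sketch below, assuming both saves have already been read into byte buffers. `diffSaves` and `IgnoredRange` are hypothetical names, and the ignore list is only there for fields that are expected to differ between runs; any offsets you feed it would have to come from inspecting real saves.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: a byte range [begin, end) that is allowed to differ,
// for example a field holding a real-time value.
struct IgnoredRange {
    std::size_t begin;
    std::size_t end;
};

// Return the offsets at which two save buffers differ, skipping ignored ranges.
// A length mismatch shows up as differences over the unmatched tail.
std::vector<std::size_t> diffSaves(const std::vector<std::uint8_t>& a,
                                   const std::vector<std::uint8_t>& b,
                                   const std::vector<IgnoredRange>& ignored) {
    std::vector<std::size_t> diffs;
    const std::size_t longest = std::max(a.size(), b.size());
    for (std::size_t i = 0; i < longest; ++i) {
        const bool skip = std::any_of(ignored.begin(), ignored.end(),
            [i](const IgnoredRange& r) { return i >= r.begin && i < r.end; });
        if (skip) continue;
        const bool inBoth = i < a.size() && i < b.size();
        if (!inBoth || a[i] != b[i]) diffs.push_back(i);
    }
    return diffs;
}
```

An empty result means the two runs produced the same save, apart from the ranges you chose to ignore.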
 
You don't need a cache. After all, you need to reevaluate the whole city if a change has happened anyway. So you only need one boolean per city to represent that something, anything, has changed, and then hook that up to all the functions that change things.

So every time your city creates a new building or a tile in its BFC changes you just set that flag to true and that tells the system to do precisely one evaluation after which it resets the flag.
 
Yes, you could locally cache things. But something like plots which are modified elsewhere, having a centralised caching system may have use. I don't exactly know what's best though, that would be something for somebody with time to figure out. You'd have to really look into the code and find all the data dependencies.
 
Just a noob's noob question:
Aren't trade routes recalculated every turn? Won't that make this approach impossible?

Spoiler :
Since I'm no coder I understand little of the technical parts of the conversation and may say/ask stupid things (if I happen to say/ask anything), yet I read it with great pleasure and joy 😁
Basically I'm here to root for your success :popcorn: :woohoo::banana:
 
I haven't seen trade routes involved in plot assignment.

But even so, in general, any function that shows up in profiling, that takes the same inputs and produces the same outputs, should benefit from some form of caching. The tricky part is when these inputs are global state, modified elsewhere.
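
One way to picture the global-state problem is the sketch below. It assumes every write to the relevant state goes through a setter that bumps a version counter. All names here (`g_yieldModifier`, `g_stateVersion`, `expensivePlotValue`, `cachedPlotValue`) are hypothetical stand-ins and not the DLL's code.

```cpp
#include <map>
#include <utility>

// Hypothetical global state that the expensive function reads.
int g_yieldModifier = 0;
unsigned g_stateVersion = 0;  // bumped on every write to the state above
int g_evaluations = 0;        // counts real evaluations, for demonstration

void setYieldModifier(int value) {
    g_yieldModifier = value;
    ++g_stateVersion;  // every cached result is now stale
}

// Stand-in for an expensive function of (x, y) plus global state.
int expensivePlotValue(int x, int y) {
    ++g_evaluations;
    return x * 10 + y + g_yieldModifier;
}

struct CachedValue {
    unsigned version;  // state version the value was computed under
    int value;
};

// Memoised wrapper: reuse a result only if the state version still matches.
int cachedPlotValue(int x, int y) {
    static std::map<std::pair<int, int>, CachedValue> cache;
    const std::pair<int, int> key(x, y);
    auto it = cache.find(key);
    if (it != cache.end() && it->second.version == g_stateVersion) {
        return it->second.value;
    }
    const int value = expensivePlotValue(x, y);
    cache[key] = CachedValue{g_stateVersion, value};
    return value;
}
```

A single counter is the bluntest form of invalidation: any write throws away everything. It is only correct if no code path modifies the state without bumping the counter, which is exactly the tricky part.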

*It might also be possible to do plot assignment for all cities in parallel, but you'd be lucky if you could do that! It seems to be done in CvGame::update, so if there are no inter-dependencies, it would be easy to thread that.
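
If the no-inter-dependencies assumption holds, the threading itself could look roughly like this sketch. `CityData`, `assignWorkingPlots` and `assignAllInParallel` are hypothetical and heavily simplified. It uses `std::thread`, so it presumes a modern toolchain such as the VS2022 build rather than VC71.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical, heavily simplified city: each one only touches its own data.
struct CityData {
    std::vector<int> plotYields;
    int bestPlot = -1;
};

// Stand-in for per-city plot assignment: pick the first highest-yield plot.
void assignWorkingPlots(CityData& city) {
    int best = -1;
    for (std::size_t i = 0; i < city.plotYields.size(); ++i) {
        if (best < 0 || city.plotYields[i] > city.plotYields[best]) {
            best = static_cast<int>(i);
        }
    }
    city.bestPlot = best;
}

// Split the cities into contiguous chunks, one worker thread per chunk.
// Safe only because no two workers ever write to the same city.
void assignAllInParallel(std::vector<CityData>& cities, unsigned workerCount) {
    workerCount = std::max(1u, workerCount);
    const std::size_t chunk = (cities.size() + workerCount - 1) / workerCount;
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < cities.size(); begin += chunk) {
        const std::size_t end = std::min(cities.size(), begin + chunk);
        workers.emplace_back([&cities, begin, end] {
            for (std::size_t i = begin; i < end; ++i) assignWorkingPlots(cities[i]);
        });
    }
    for (std::thread& t : workers) t.join();
}
```

The moment one city's assignment reads or writes another city's plots, this stops being safe without locking or a two-phase approach.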
 
Honestly, I don't really see the logic for it. Not in this case anyway. If your goal is to make the function that checks where to move workers around in a city faster by making it run less often, a cache is plainly overkill. Remember, maintaining caches comes with its own price in terms of memory footprint and CPU work. And fundamentally I just don't really see a reason to pay that if we don't have to.

I mean what can practically require a test?
  1. Improvement status changed.
  2. Building status changed (potentially new specialists available / old no longer available or tile yields changed).
  3. An event with the specific XML tags that change yields has affected the city.
  4. A tech with the specific XML tags that change yields has been researched or resource unlocked.
I don't really think I've missed anything. Of these, only #4 requires a global rescan of all your cities and the rest just affect one city at a time. So just having one boolean per city and having these situations raise that flag should be the cheapest solution. All it would take is the boring legwork of figuring out all the scenarios where this could happen and adding the code.
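
The four triggers above could be wired to a per-city flag roughly as follows. This is only a sketch: `City`, `Player` and the `on...` hooks are hypothetical stand-ins, not the DLL's actual classes or functions.

```cpp
#include <vector>

// Hypothetical stand-ins; the real DLL's city and player classes differ.
struct City {
    bool plotsDirty = true;  // a new city needs one initial evaluation
    int evaluations = 0;     // counts real evaluations, for demonstration
};

struct Player {
    std::vector<City> cities;
};

// Triggers 1 to 3 touch a single city.
void onImprovementChanged(City& city) { city.plotsDirty = true; }
void onBuildingChanged(City& city)    { city.plotsDirty = true; }
void onYieldEvent(City& city)         { city.plotsDirty = true; }

// Trigger 4 (a tech or newly unlocked resource) touches every city the player owns.
void onTechResearched(Player& player) {
    for (City& city : player.cities) city.plotsDirty = true;
}

// Called every turn: does the expensive work only when the flag is raised.
void updateWorkedPlots(City& city) {
    if (!city.plotsDirty) return;
    ++city.evaluations;       // stand-in for the expensive plot evaluation
    city.plotsDirty = false;  // reset until the next change
}
```

The flag costs one byte per city. The legwork is making sure every code path that changes yields calls one of the hooks.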

And I think the same approach can be applied to a great number of things to essentially produce a state of calculate on demand only. Because ultimately the fastest way to do something is to not do it at all.
 
To elaborate a bit on what I am proposing.

Basically, I am thinking of using std::unordered_map for cities. The key is the city's name, ID or whatever Civ4 uses internally to track them. When a city is built / razed it is added to / removed from the map. The value for the map is a boolean representing whether something about the city, or affecting the city, has changed.

Each operation that does anything with the city info can then check the boolean and know whether to skip its work. It's basically a mark-as-dirty flag.

This is all assuming we can't just edit the memory structure that defines a city in the first place to just add a boolean flag. Which might well be possible.
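
As a rough sketch of the side-table variant, assuming cities can be keyed by an integer ID (I don't know what the DLL actually uses). `CityDirtyTable` and its methods are hypothetical names.

```cpp
#include <unordered_map>

// Hypothetical side table: city ID -> "something changed" flag.
// Integer IDs are an assumption about how cities are keyed.
class CityDirtyTable {
public:
    // A freshly founded city starts dirty so it gets one initial evaluation.
    void onCityFounded(int cityId) { dirty_[cityId] = true; }
    void onCityRazed(int cityId)   { dirty_.erase(cityId); }

    // Raise the flag; unknown IDs are ignored rather than inserted.
    void markDirty(int cityId) {
        auto it = dirty_.find(cityId);
        if (it != dirty_.end()) it->second = true;
    }

    // Return true once per change, clearing the flag as it is read.
    bool consumeDirty(int cityId) {
        auto it = dirty_.find(cityId);
        if (it == dirty_.end() || !it->second) return false;
        it->second = false;
        return true;
    }

private:
    std::unordered_map<int, bool> dirty_;
};
```

Usage would be along the lines of `if (table.consumeDirty(id)) { /* reassign plots */ }`. A plain bool member on the city would do the same job without the hash lookup, if the city structure can be edited.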
 
Working plot assignment is just one small thing. But there are many small things. The DLL already has dirty flags, but they are apparently insufficient.

I would worry that if one were to start refactoring redundant computations all over the place, including AI, and doing specialised invalidation for each little thing, that one would start to lose track and introduce bugs, and that's where some general caching system might seem like a better idea. Whereby the cache entry knows what it depends on, and because it's centralised, there's only one thing to invalidate.
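
To make the "cache entry knows what it depends on" idea concrete, here is a speculative sketch. `DependencyCache`, the string keys and the integer plot IDs are all hypothetical, and it assumes callers declare their dependencies correctly.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical central cache: each entry records which plots it was computed from.
class DependencyCache {
public:
    using PlotId = int;

    // Return the cached value for key, computing and storing it on a miss.
    int get(const std::string& key, const std::set<PlotId>& dependsOn,
            const std::function<int()>& compute) {
        auto it = values_.find(key);
        if (it != values_.end()) return it->second;
        const int value = compute();
        values_[key] = value;
        for (PlotId plot : dependsOn) dependents_[plot].insert(key);
        return value;
    }

    // The single invalidation point: drop every entry that depends on the plot.
    // Leftover reverse links under other plots can only cause extra
    // invalidation later, never a stale value.
    void plotChanged(PlotId plot) {
        auto it = dependents_.find(plot);
        if (it == dependents_.end()) return;
        for (const std::string& key : it->second) values_.erase(key);
        dependents_.erase(it);
    }

private:
    std::map<std::string, int> values_;
    std::map<PlotId, std::set<std::string>> dependents_;
};
```

Code that modifies a plot then only has to call `plotChanged`, without knowing which computations happened to read it.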

But it's just speculation. Somebody will have to go in there and find out what's best. I have given a bunch of info so that anybody who has VS2022 and time can get started. There is ample optimisation opportunity, as long as the engine itself isn't the problem.
 
Honestly, I am not sure what you mean by caching in this context. I mean, all the data about the cities and units etc. is already in memory anyway. It's not like it's written to a disk. And presumably the game does not just literally recalculate everything from scratch for each function call. Right?

I mean it's not like, say, the call to reassign specialists that we are talking about goes through your techs, then every tile of your city, and calculates the yields from scratch. Right? I mean, that would be beyond the pale.
 
Yes, the DLL is chock full of recalculation. I don't know how much of it is a good candidate for caching though. I've just been looking at CvCityAI::AI_assignWorkingPlots and AI_plotValue within. It's literally a bunch of nested looping on top of whatever else those functions are doing. What you'd do is keep trimming things off of profiling results and hope you can handle whatever needs to be done without any major rearchitecting.
 
I guess it makes sense if you have to work within a 2GB memory limit, since caches would add up very quickly. But still, god damn. Honestly, just finding a way to rid ourselves of the memory limit would be huge in its own right, if for no other reason than the mods. I wonder if that is even possible though. Probably not, since we can't recompile the actual .exe.

Well, I mean, we could. We could decompile the machine code, manually edit everything to make it work with x64-sized variables and then recompile it. But that would both be a violation of Firaxis's copyright and trust on a monumental level and a project of such complexity and scope that it would require employing the entire tech industry of a small country even if it were allowed.
 
I'm not sure where most memory usage comes from. Probably art assets. My 200 city save loads up at 1.1GB according to heap debugging. 835MB of that is apparently in the engine *and other sources, leaving 291MB from the DLL.

Implementing an x64 engine would be a certain kind of Fun. It should actually be easy to get the bare minimum going. Just something that uses the DLL correctly enough, minimal graphics, minimal UI. And then you'd build up the UI, the graphics engine, the fancy L-System stuff, NIF loading (Gamebryo).
 
The next performance issue: If you bring up the city screen enough, it will slow down. Reproduce with rapid Esc, F10 in an end game save. Unfortunately, it seems the bottleneck is in the engine. Very spread out. It could be allocation overhead, but _nh_malloc from VC71 is only 3.65% total CPU time.
 
Okay, but that's not a realistic issue :dunno:
I mean who cares that we can freeze the game by some extreme action? Or is this the symptom of something else?
 
Not rare at all. It happens all the time in my huge games. UI gets slow, enqueuing production gets slow, I actively avoid going into the city screen unless I need to. You restart the game to get it back to normal.
 
Well, it's 64-bit at least. :thumbsup:

[Image: 2024y05m17d - Cv4MiniEngine cropped.png]


Performance is great of course. It can start a new game, load, save, it can end the first few turns, cycle units, found a city with naming UI. And this one test game produces a save that is almost byte identical (turn slice number is realtime). UI made easier with FTXUI: buttons, resizeable panels, scrollable action list. Uses the latest Python 2 (2.7.18).

How deep down the rabbit hole should I go for a proof of concept miniengine?
 
Yeah, it's actually not all that difficult. The DLL and Python code is 99% of the game and you mostly write a bunch of UI and DLL interface implementations, with Ghidra to guide you. It's even calling into the engine to display the tech chooser popup:

Code:
Player 0 event: The villagers have given you the secrets of a new Technology!
CmdCiv4 log Info void __cdecl CyPopupInfo::addPopup(int): player 0
Player 0 event: You have discovered TXT_KEY_TECH_AGRICULTURE!
addPopup for player 0: What would you like to research next?

I gave this some time as I had a reprioritisation of projects, but I don't intend to complete this right now. But if anybody out there wants to do a full engine reimplementation, I can be a source of info.
 