How can I improve turn load times?

Icevulture

Great mod. As I get later into a game, around 1650-1700AD, the turn load times start getting quite long. They get progressively worse the longer into a game I am. I've got a 7th gen i7, 16gb ram and an ssd, so thinking hardware isn't the issue. Was most recently playing south america scenario, but this happens with other maps as well.
 
Idk if it is just my imagination or what, but I recently cleaned out all my old unused save games, and it seems to have made everything ingame run much faster and smoother. Of course I had an insane number of files. But just a thought...
 
Great mod. As I get later into a game, around 1650-1700AD, the turn load times start getting quite long. They get progressively worse the longer into a game I am. I've got a 7th gen i7, 16gb ram and an ssd, so thinking hardware isn't the issue. Was most recently playing south america scenario, but this happens with other maps as well.

16gb ram isn't going to help because the game is 32-bit and won't use more than 2gb (4gb if the large address aware flag is enabled; I don't know if it is).
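Whether the flag is set can be read straight from the EXE header. A minimal sketch in Python, assuming a standard Windows PE file: the large-address-aware bit is 0x0020 in the COFF "Characteristics" field. The function name is made up for illustration, and the script only reads bytes; it does not modify anything.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF "Characteristics" field


def is_large_address_aware(exe_bytes: bytes) -> bool:
    """Return True if a Windows PE image has the large-address-aware bit set."""
    if exe_bytes[:2] != b"MZ":
        raise ValueError("not a DOS/PE executable (missing MZ signature)")
    # The 32-bit value at offset 0x3C points to the PE header.
    (pe_offset,) = struct.unpack_from("<I", exe_bytes, 0x3C)
    if exe_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4-byte signature; Characteristics is its
    # last 2-byte field, 18 bytes into that header.
    (characteristics,) = struct.unpack_from("<H", exe_bytes, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

To use it, open the game's EXE in binary mode (`open(path, "rb")`) and pass the result of `read()` to the function.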

The game relies a lot on Python code, which is slow and inefficient. If you save and reload it should help, but it is an old game and modern hardware is not going to have much of an influence on the way it runs.
 
It's hard to do anything about it yourself. The "turn load time" isn't actually loading, but rather the AI trying to figure out what to do. It doesn't intuitively know the results of many actions the way human players do, so it tries a lot of options and calculates the expected result of each to see if it wants to do that. This means that if a colony has a fisherman on a fish plot, the AI will try to move him to a water plot without fish each turn, only to figure out that's a bad idea and then not do it.

The real problem is mainly pathfinding. For each AI unit the AI tries to figure out where to move the unit. The time spent on this increases linearly with the number of units (2 units take twice as long as one unit). However, the time spent on figuring out what to do with each unit increases exponentially with the number of available options. This means it spends longer on each unit for every plot that is discovered. In the late game, pathfinding can take up more time than all other tasks combined.

Without changing the code itself, what you can do is to use a smaller map. If the map is smaller, the pathfinding for each unit will be significantly faster. That's really the only thing you can do in software as a player.

If you are to invest in hardware, what you need is something that can look up data really fast. Most of the pathfinding isn't calculations, but rather looking up data. If a unit tries to enter a plot, it has to look up the terrain on that plot, look up the terrain data and unit data to see if the unit can move on the terrain in question, then look up more data to figure out the movement costs etc.

For hardware this means you want low latency memory. Intel CPUs have faster memory access than AMD for this kind of work. The higher the single core performance, the better.
EDIT: this was written before AMD released the Ryzen 3000 series. I don't know if the Intel vs AMD statement is still valid.
 
I agree with what @Nightinggale wrote: pathfinding is indeed the single biggest consumer of CPU time. From profiling I've seen that CvUnitAI::AI_update() per unit approaches 90% of AI turn time in the late game.

Other large consumers of CPU time:
1) AI job assignment
2) Plot storm processing
3) Surprisingly enough, achievements on huge maps
4) The industrialization victory condition, which, if enabled, often consumes 10% of CPU time by itself.

I have parallelized 1-4 with some success but that work still has beta status. I have only pushed my work concerning 1) so far: https://github.com/We-the-People-civ4col-mod/Mod/tree/task_parallelism

The "holy grail" would be to parallelize pathfinding which would require us to replace calls to the "stock" pathfinder (in the EXE) with (most likely) instances of the K-Mod pathfinder. We'd also have to restructure CvUnitAI::AI_update() to allow units to be processed at least partially in parallel. Note that some Civ4 mods like Caveman 2 Cosmos did implement multithreading for unit processing but for various reasons they did not achieve any meaningful improvements in performance. My approach would be a bit different than theirs though and should be able to overcome the issues that they encountered by using several nested levels of task based parallelism.

I think the main problem that WTP has right now is the lack of active programmers :(
 
Note that some Civ4 mods like Caveman 2 Cosmos did implement multithreading for unit processing but for various reasons they did not achieve any meaningful improvements in performance

It was explored and abandoned due to it being too challenging because of the lack of native multithreading support in the civ iv engine.

In fairness, C2C is an incredibly OTT mod, so you can understand why they would want to do that.

WTP is never going to be comparable to that style of mod.
 
C2C has a workaround: start a hotseat multiplayer game and set the turns to simultaneous. Set yourself as the only human player. The AI will then take all its turns at the same time.

I don't think/know if you can play a scenario that way though, so you might be limited to random maps.
 
In my case, in RAR the turns go fine until 1700 or so, but in WTP I'm already getting 10-20 second waits in the first turns.
 
After reading all this, it's quite sad to realize it won't be solved anytime soon, unless some genius invests a massive amount of time re-coding everything... Something clearly unreasonable when you don't get paid IRL.
The mod is great but sadly hard to play because of those endless calculations (I can't stand it...). Were these projects doomed from the beginning because of those limitations? I hope not...
 
I have some ideas on how to improve performance. For starters I want to flatten the memory layout of the xml data. The idea is to not have to touch the xml files themselves or the code, which makes use of the xml data, but just the xml storage layout itself. The reason is the same as Factorio's fluidbox optimization, which actually happened after I came up with the idea. It will improve speed in cases like the AI looping through all BuildInfos to figure out what a pioneer should build. Currently it will read each build info from a random location in memory, but by placing the data linearly, the RAM interface of the CPU can figure out a reading pattern and read data before the CPU needs it, hence less waiting time for the data to arrive.

Another concept is hardcoding xml data into the DLL. I made something of that nature in Medieval Conquest. The idea is that instead of looping X times where X is read from memory, the compiler knows the loop will take place maybe 3 times and then the compiler can optimize the code accordingly, maybe by writing the loop code 3 times and not include the loop overhead. The tradeoff is it will break if anybody changes certain xml data meaning there should be a version for xml editing and one for fast gameplay.

I also want to add a memory pool to reuse memory allocations instead of releasing and allocating all the time. When used frequently with the same sizes (mainly one int for each yield), the same memory will be used over and over, which will boost performance. Making it part of JustInTimeArray means the code using it won't have to be updated to use it.
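The bookkeeping behind such a pool is small. Below is a rough sketch of a free-list pool for fixed-size buffers, written in Python only to keep it short; the real saving comes from skipping allocator calls in the C++ DLL, which Python cannot demonstrate. `YieldArrayPool` and its methods are hypothetical names, not anything taken from JustInTimeArray.

```python
class YieldArrayPool:
    """Free-list pool of fixed-size integer buffers (illustration only)."""

    def __init__(self, size):
        self._size = size
        self._free = []        # released buffers waiting to be reused
        self.allocations = 0   # how many buffers were actually created

    def acquire(self):
        """Hand out a zeroed buffer, reusing a released one when possible."""
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return [0] * self._size

    def release(self, buf):
        """Return a buffer to the pool instead of discarding it."""
        assert len(buf) == self._size, "buffer does not belong to this pool"
        for i in range(self._size):
            buf[i] = 0  # reset so the next user sees a clean buffer
        self._free.append(buf)
```

Acquiring, releasing and acquiring again hands back the same buffer, so a loop that needs one temporary array per iteration creates it only once.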

That's 3 ways of boosting performance without touching the code itself, only "support code". It's not like we don't have ideas on what to do regarding performance.

I think the main problem that WTP has right now is the lack of active programmers :(
This seems to be the main issue.

unless some genius invest a massive amount of time re-coding everything...
Not everything. What would be nice would be some clever person (perhaps genius) who can rewrite pathfinding. It's A* pathfinding, which is widely used in games and in principle isn't too bad. The issue is that it always seems to be the slow part in games, and getting it to run fast, ideally on multiple CPU cores, is problematic. To get an idea of the problem, I propose reading about the Better Pathfinder mod for Rimworld. It shows A* pathfinding and the number of plots each approach needs to look at. Many plots = slow. Low number of plots = stupid pathfinder with weird and slow routes.

If we can have a proper pathfinder, most of the performance issues would likely be gone.
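The trade-off between plots examined and route quality can be tried out with a toy search. This is a generic weighted A* on a square grid, not the game's or K-Mod's pathfinder; the grid format, the `weight` parameter and the function name are assumptions made for the sketch.

```python
import heapq


def find_path(grid, start, goal, weight=1.0):
    """Weighted A* on a 4-connected grid.

    grid[y][x] is the cost of entering that plot (>= 1), or None if impassable.
    weight=0 behaves like Dijkstra, weight=1 is plain A*, and weight>1 trusts
    the heuristic more, which can cut the plots examined but may return a
    route that is not the cheapest.
    Returns (route_cost, plots_expanded); route_cost is None if unreachable.
    """
    height, width = len(grid), len(grid[0])

    def heuristic(pos):
        # Manhattan distance; never overestimates because every step costs >= 1.
        return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

    best_cost = {start: 0}
    closed = set()
    frontier = [(weight * heuristic(start), 0, start)]
    while frontier:
        _, cost, pos = heapq.heappop(frontier)
        if pos in closed:
            continue  # stale queue entry, this plot was already expanded
        closed.add(pos)
        if pos == goal:
            return cost, len(closed)
        x, y = pos
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= nx < width and 0 <= ny < height):
                continue
            step = grid[ny][nx]
            if step is None or nxt in closed:
                continue
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + weight * heuristic(nxt)
                heapq.heappush(frontier, (priority, new_cost, nxt))
    return None, len(closed)
```

Running the same start and goal with `weight=0`, `weight=1` and a larger weight on a map with mixed terrain costs shows the pattern described above: the first two agree on the route cost while the heuristic version expands no more plots than the other, and a large weight gives up the guarantee of the cheapest route.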

Note that some Civ4 mods like Caveman 2 Cosmos did implement multithreading for unit processing but for various reasons they did not achieve any meaningful improvements in performance.
Looks like it's mainly thread locks, which caused issues as threads are apparently waiting for each other. Quoting C2C from a week ago.
The multithreading in C2C is really inefficient, causing lots of waits, and it is no longer needed. It was added years ago when the turn times were a lot longer than they are now. The reason for those long turn times was really inefficient coding in some places, which I optimized a lot.

That multithreading will be removed in a big commit including some other performance improvements some time after v39 is released.
That's not saying multithreading is bad. It's awesome if done correctly. What's important to remember is that there are correct and incorrect ways of implementing threads. Doing it correctly is a difficult task and the approach @devolution brings to the table seems like it has a good chance of boosting performance. The question is if we can apply it successfully to the slow parts.
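One common way to avoid threads waiting on each other is to split the work into a parallel phase that only reads shared data and a serial phase that writes the results. The sketch below shows that structure with a made-up job-scoring rule; it is not the code in the task_parallelism branch, and in CPython the global interpreter lock means it illustrates the shape of the solution rather than an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def score_job(colonist, jobs):
    """Pure function: reads shared data and writes nothing, so no lock is needed."""
    return max(jobs, key=lambda job: jobs[job] * colonist["skill"].get(job, 1))


def assign_jobs(colonists, jobs, workers=4):
    """Pick a job for every colonist without any task blocking another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1 (parallel): every task is independent and read-only.
        choices = list(pool.map(lambda c: score_job(c, jobs), colonists))
    # Phase 2 (serial): apply the results in a fixed order so the outcome is
    # deterministic regardless of how the tasks were scheduled.
    for colonist, job in zip(colonists, choices):
        colonist["job"] = job
    return colonists
```

Because nothing is written during the parallel phase, no task ever has to wait for another, which is the failure mode described in the C2C quote.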
 
I need your help
I need a late game savegame on a big map. One where you have to wait a while for the next turn.

I have come up with an idea on how to boost performance for pathfinding. In short it's about caching xml data used by the pathfinder for faster access. I have actually started work on version 2.8 (the next version to break savegames), but since part of this idea is that it won't break savegames, if it works well it would be valid to aim for a 2.7.2 release. This has nothing to do with multithreading, though if this works, any future multithreaded calls to the same code will also be faster.
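As an illustration of the general idea (not the actual change, which lives in the C++ DLL), the sketch below precomputes every unit/terrain movement cost once into a flat table, so a pathfinder step becomes a single indexed read instead of a chain of lookups. All data, names and numbers here are invented for the example.

```python
# Hypothetical XML-derived data; the entries are made up for illustration.
TERRAIN_INFO = {
    "grass": {"move_cost": 1, "water": False},
    "hills": {"move_cost": 2, "water": False},
    "ocean": {"move_cost": 1, "water": True},
}
UNIT_INFO = {
    "colonist": {"domain": "land"},
    "caravel": {"domain": "sea"},
}
IMPASSABLE = -1


def movement_cost_slow(unit_type, terrain):
    """Uncached version: several lookups and a rule check on every call."""
    unit = UNIT_INFO[unit_type]
    info = TERRAIN_INFO[terrain]
    if (unit["domain"] == "sea") != info["water"]:
        return IMPASSABLE
    return info["move_cost"]


class MovementCostCache:
    """Flat unit-by-terrain table, filled once from the slow function."""

    def __init__(self):
        self._unit_ids = {name: i for i, name in enumerate(UNIT_INFO)}
        self._terrain_ids = {name: i for i, name in enumerate(TERRAIN_INFO)}
        self._width = len(TERRAIN_INFO)
        self._table = [
            movement_cost_slow(unit, terrain)
            for unit in UNIT_INFO
            for terrain in TERRAIN_INFO
        ]

    def unit_id(self, unit_type):
        return self._unit_ids[unit_type]

    def terrain_id(self, terrain):
        return self._terrain_ids[terrain]

    def cost(self, unit_id, terrain_id):
        """One index calculation and one read from contiguous storage."""
        return self._table[unit_id * self._width + terrain_id]
```

Since the table is derived from data that is already loaded and is never written to a save, a cache like this does not have to change the savegame format.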

The big question is what kind of performance we can gain from this. For that purpose I need a worst case, but realistic, savegame for testing purposes. I could make one, but it would likely be better to use a savegame you report as "horribly slow", particularly because anything you have wouldn't be created for the test, hence it would automatically be a realistic setup that a player can encounter. Knowing precisely how much performance is gained (or, less likely, lost) from each change is useful info for future optimization, and it helps to tweak the optimization to gain the most out of it.
 
These are slow, but not "horribly slow" so I'm sure you can find some better examples, but these are the best I've got (just started playing the mod!).
 


Here is a link to a post with a late slow game save:

https://forums.civfanatics.com/threads/we-the-people-bug-reporting.636760/page-7#post-15426291

@Nightinggale Perhaps you could make a stickied thread to get a wider response
I profiled this savegame and the result isn't as great as I had hoped. Yes it can be optimized, but the speedup isn't great enough for the player to make this optimization urgent. Since it won't affect savegames, it's something which can always be added whenever somebody has the time to look into this.

For the time being, I will work towards WTP 2.8 features instead.
 
I've been on a break from this for a bit but 2.8 might change that. Any projection on when it will be out, so I know when to check back?
 
Thx. When I retire after the end of the year, I'll have some free time. ;)
Maybe I'll volunteer to help out.
 
Excuse me if I'm saying something obvious, but from my experience playing 2.7.1, the turn processing time is a lot worse using 1-radius settlements than with 2-radius settlements. Using 2-radius, I've yet to encounter any performance issues worth complaining about. I've also noticed that the AI builds reasonable amounts of coastal ships in 2-radius mode; only in 1-radius mode do I see the great armadas of coastal ships. If pathfinding is indeed the main issue, and AI coastal ships contribute the most, maybe it's just an AI bug. Perhaps the AI overestimates how many transports it needs with smaller settlements. Maybe it's inefficiently setting up too many trade routes.

However, again while playing 1-radius settlements, I also noticed significant freezes whenever I add/remove a colonist to a settlement, which makes me believe it's the city governor AI that's slowing things down. Strangely, I don't notice it too much using 2-radius settlements. I would expect a settlement pushing 30 pops and 24 tiles to need more calculations than 20 pops and 8 tiles. So that's odd.

Hope this helps.
 