I can't prove it, but I suspect that the causes of late-game slowdown, particularly with certain modmods, wouldn't be helped much by better hardware.
Civ4 is single-threaded in nature. Aside from maybe a few DirectX-related things in the graphics engine, some of the disk I/O for writing log files, and perhaps a few other trivial things like that, it only uses one CPU core.
So a CPU that is "better" because it has more cores and/or hyperthreading will do nothing for you. The only thing that matters is the processing power of a single core. A CPU with a higher clock rate than another of the same architecture will give you more performance. Newer generation CPUs may also give you slightly better performance at the same clock speed, since their improvements usually add, on average, a couple of percent each generation. Since Civ4's executable is not being rebuilt for each CPU generation, the features added in newer CPUs (extensions like AVX, which first appeared in CPUs in 2011, and such) don't do anything for you. It is also unlikely that recompiled DLLs will use them, since they are built with the same old compiler and libraries, which don't use them for anything either.
So it all boils down to better performance from higher clock speed and from architectural improvements in newer CPUs. This is why the fastest clocked Intel processor of the newest generation is currently the best you can do. AMD processors have more cores, which does nothing for Civ4, and while they have higher clock speeds they get much less done per clock cycle, so a 4GHz Intel CPU is faster for this than a 5GHz AMD CPU. If future CPUs continue to improve in single-threaded performance, that will carry over to Civ4. But in recent years the rate at which this has happened has been pretty slow: each generation of Intel CPUs is maybe 100MHz faster at a given price point (which is not much of a gain when it is already over 3GHz) and gets something like 0% to maybe 5% more work done per Hz. The per-Hz gains actually alternate, with 0 to 1% in one generation step and 3-5% in the next, then back to 0-1%. This is because every second generation is a new internal architecture, and the ones in between are process shrinks with only minor tweaks to how the core functions. This is called the "tick-tock" progression, where a "tick" is a shrink and a "tock" is a new core design.
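To put rough numbers on that, here is a small sketch of the arithmetic. It assumes single-core speed is simply clock rate times work done per clock cycle. The per-cycle figure for the AMD chip (0.70), the 3.5GHz starting clock, and the function names are made-up illustrative values, not measurements; the 100MHz and alternating per-Hz gains are the rough figures from above.

```python
# Rough model: single-core speed ~ clock rate x work done per clock cycle.
# All per-cycle figures here are illustrative assumptions, not benchmarks.

def single_thread_speed(clock_ghz, work_per_cycle):
    """Relative single-core throughput in arbitrary units."""
    return clock_ghz * work_per_cycle


# A 4GHz chip with baseline work per cycle vs a 5GHz chip that is
# ASSUMED to get 30% less done per cycle.
intel_speed = single_thread_speed(4.0, 1.00)
amd_speed = single_thread_speed(5.0, 0.70)


def speed_after_generations(clock_ghz, generations, mhz_per_gen=100,
                            gains=(0.005, 0.04)):
    """Relative speed-up after some number of CPU generations.

    Each generation adds mhz_per_gen to the clock, and the per-cycle gain
    alternates between the two values in `gains` (small step, then larger
    step), mimicking the tick-tock pattern described above.
    """
    per_cycle = 1.0
    new_clock = clock_ghz
    for g in range(generations):
        new_clock += mhz_per_gen / 1000.0
        per_cycle *= 1.0 + gains[g % 2]
    return per_cycle * new_clock / clock_ghz


if __name__ == "__main__":
    print("4GHz chip:", intel_speed, " 5GHz chip:", amd_speed)
    print("Speed-up after 4 generations from 3.5GHz:",
          round(speed_after_generations(3.5, 4), 3))
```

With these assumed numbers the slower-clocked chip still comes out ahead, and four generations of small gains compound to well under a 1.3x speed-up.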
As far as I know, the one exception to this is, to a very limited extent, the Caveman 2 Cosmos mod and anything that uses its DLL (the new version of Rise of Mankind: A New Dawn and the Rocks 2 Rockets mod, which I did, are the only 2 mods that I know of that use it). That mod added a little multithreading in the DLL, but not a lot. It can trim a bit off the end-of-turn processing because the barbarian and animal spawning code will use more than one thread, and there is a little somewhere in the city processing, although I don't remember what that covers (possibly choosing what to produce, which is actually a large chunk of what cities do each turn). One thing that reduces the effect of this approach is the "turbo" mode a lot of CPUs have, which increases the clock speed when not all of the cores are active and the temperature is low enough. Adding multithreading can make the CPU fall out of turbo mode, or use a smaller clock speed increase, so the cores it is using run slower, which can negate at least some of the speed gain from the multithreading. The complexities of multithreading can also make the processing take longer, just from the things you have to do to get it to work properly. The net effect is only a small improvement, but every little bit helps when the end-of-turn times are really long.
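The turbo trade-off can be sketched with a simple model. This is only an illustration under assumptions: the split of a turn into 80 seconds of single-threaded work and 20 seconds of work that can be spread over threads, the clock speeds, the 1 second of threading overhead, and the function name `turn_time` are all hypothetical, not figures measured from any mod.

```python
# Simple model of end-of-turn time when part of the work is multithreaded
# and using more cores lowers the clock speed the CPU can sustain.
# All numbers are hypothetical and only show the shape of the trade-off.

def turn_time(serial_s, parallel_s, threads, turbo_ghz, actual_ghz,
              overhead_s=0.0):
    """Estimated turn time in seconds.

    serial_s and parallel_s are the times each part takes on one thread
    at the single-core turbo clock (turbo_ghz). The parallel part is split
    evenly over `threads`, overhead_s is added for synchronisation, and the
    total is scaled up if the cores actually run at a lower clock.
    """
    work = serial_s + parallel_s / threads + overhead_s
    return work * turbo_ghz / actual_ghz


# One thread, full single-core turbo.
single = turn_time(80, 20, threads=1, turbo_ghz=3.9, actual_ghz=3.9)

# Four threads, mild clock drop and a little overhead: a small net gain.
multi = turn_time(80, 20, threads=4, turbo_ghz=3.9, actual_ghz=3.7,
                  overhead_s=1.0)

# Four threads, but the CPU falls back much further: the gain is gone.
worse = turn_time(80, 20, threads=4, turbo_ghz=3.9, actual_ghz=3.3,
                  overhead_s=1.0)

if __name__ == "__main__":
    print("single-threaded:", round(single, 1), "s")
    print("multithreaded, mild clock drop:", round(multi, 1), "s")
    print("multithreaded, large clock drop:", round(worse, 1), "s")
```

Because most of the turn is still single-threaded in this model, the multithreaded part can only shave a little off, and a big enough clock drop wipes that out entirely.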