[BTS] Max Civ IV performance, which CPU to buy?

ManUnited4Ever

Chieftain
Joined
Aug 16, 2007
Messages
75
I'm buying a new computer soon, and I want to optimize turn loading time on large Civ IV maps. Which CPU should I buy? Is the answer more complicated than simply buying the one with the best single-thread benchmark results, i.e. the AMD Ryzen 7 3800X as of writing?

I'm going to buy some sophisticated cooling gear and try overclocking. Not sure if this is relevant to the decision.

https://www.cpubenchmark.net/singleThread.html
 
I still haven't bought new gear, but I thought I might bump this thread one more time before I actually do. I've read several posts by @Nightinggale about lategame pathfinding and terrain XML lookups, and his recommendation is to prioritize DRAM latency and L3 cache. I'm not sure whether to go for the Intel 9900K or the Ryzen 9 3900X. According to this benchmark, Intel still seems significantly better than AMD on latency: the 9900K measures 2/3 of the Ryzen delay. (I'm not completely sure what I'm reading though.) However, the Ryzen L3 cache is 4x greater than that of the 9900K.

If my schematic understanding is correct, a greater L3 cache means that for each individual lookup there is a greater probability that RAM need not be accessed, and this probability scales linearly with L3 cache size. In those cases where RAM is accessed, latency is the bottleneck.
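To put rough numbers on that reasoning, here is a minimal sketch of the usual "average access time" model: each lookup is either served from cache or has to wait for RAM. The function name and all latency figures are hypothetical and chosen only for illustration. They are not measurements of either CPU, and real hit rates do not have to scale linearly with cache size.

```python
def average_access_ns(hit_rate, cache_ns, dram_ns):
    """Expected time per lookup: hits are served from cache, misses wait for DRAM."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns


# Hypothetical latencies, for illustration only (not measured on any CPU).
CACHE_NS = 10.0
DRAM_NS = 70.0

for hit_rate in (0.50, 0.75, 0.90):
    avg = average_access_ns(hit_rate, CACHE_NS, DRAM_NS)
    print(f"hit rate {hit_rate:.0%}: {avg:.1f} ns per lookup")
```

In this simplified model, both a higher hit rate (bigger usable cache) and a lower DRAM latency reduce the average, which is why the two CPUs are hard to rank from spec sheets alone.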

For now, my choice is the i9-9900K Ryzen 9 3900X. If someone here has an opinion or point of consideration they would like to share, I'd love to hear it.
 
Ryzen isn't "a CPU", but rather a cluster of CPUs. The 3900X consists of 4 CCX units (core clusters), each with 16 MB L3 cache. Since one CCX can't use the L3 cache from another CCX, both the 3900X and the 9900K have 16 MB L3 cache for single core operation.

I would say 9900K will be fastest based on the fact that it always wins when people benchmark them against each other for single core performance. Also the civ4 engine highly depends on DRAM latency.

However, AMD recently announced its next generation of Ryzen, which will be pin compatible with the existing CPUs. This means going Ryzen will allow you to replace the CPU next year with something that is likely even better than anything available today. Intel usually requires a new motherboard if you want a new CPU.

Also be aware that since AMD uses a standard CPU socket, you can fit a 3900X into a budget motherboard that can only supply enough power for 8 cores. This means you should really pay attention to the capacity/quality of the VRMs.

You should however be aware of history. Intel struggled to get ahead of the competitors, and with the Pentium they managed to do just that: they finally made something where the clones were noticeably slower. This lasted for decades, but then AMD started catching up in latency, and then came the hardware security bugs (Meltdown etc). It was then revealed that the way Intel managed to stay ahead of the competition was to skip some hardware level security checks, because nobody would notice anyway. Then came the internet, and AMD didn't use this cheat. This means that while AMD has such issues too, Intel is hit harder because they never fixed the cheat they used back in the day. So it's not unreasonable to assume that the 3900X can become faster than the 9900K if security patches keep developing the way they have recently.

I'm not sure what I would recommend today. It was easy prior to Ryzen, when it would be "whatever has the highest single core performance from Intel". Now it's way too complex, even before taking into account that you can get great single core performance from something like the 3950X, an expensive CPU which is designed for many core operation and isn't cost effective for single core gaming.

You should also consider how much better it is to upgrade. New CPUs tend to have many cores, but improvements in single core performance aren't as great as we would like to think. Take for instance the 4790K. Running at full boost speed at all times for single core performance (requires 50-60 W, not unrealistic to cool, 4th gen didn't boost much), it will provide the same performance as a 9900K used for single core performance at base clock (as in not boosting). This means going from the 4th gen flagship to the 9th gen flagship gains... well, nothing in single core performance unless you cool your new CPU for serious boosting. In fact, buying a CPU today at the same price as the 4790K used to cost will likely give you twice the cores, but reduced single core performance. In other cases upgrading will boost single core performance significantly. This means upgrading is beneficial for some people and not for others, even if they upgrade from equally aged computers.

AMD did a lot of good when it gave us Ryzen, but the idea of adding lots of cores (an idea which Intel has copied) means the market for good single core performance has become way more confusing.

Generally speaking, look for single core benchmarks for what you want to compare, and you should also prefer low latency memory. How to figure out which memory is good is also way more complex than "pick the expensive one/the one with the best numbers". What you want is the one with the highest "frequency divided by CAS" number, but at the same time you want high frequency because that will give you more MB/s.
 
Thanks again for your long and elaborate explanation.

What you want is the one with the lowest "frequency divided by CAS numbers", but at the same time you want high frequency because that will give you more MB/s.

This is a bit counterintuitive to me. Naively, CAS is the number of cycles to retrieve something. So, latency would be CAS * (cycle duration). The higher the frequency, the lower the cycle duration, so I would think that I would want (frequency/CAS) to be as high as possible, not the other way around. Is this true, or no?
 
Oops, that went a bit too fast. I swapped the fraction around (long live skipping proofreading :p). I was thinking CAS/frequency, which gives the time needed for RAM to do a certain task. This needs to be as low as possible. Frequency divided by CAS gives the number of operations per second, which needs to be as high as possible.

The main problem is that CAS tends to go up when frequency goes up, which isn't surprising by itself. It does however become an issue if CAS/frequency goes up, which sadly isn't uncommon. This means you should watch out for expensive ultra high frequency memory, because it can actually slow down the game. It's also possible for the RAM to have such a high frequency that the motherboard/CPU can't keep up and run it stably, which is another issue to watch out for. Just because the RAM says 4000 MHz doesn't mean the motherboard's bus routing is insulated enough to keep the noise level low enough to be stable at that speed. At the same time, getting low frequency RAM in order to get low CAS might not help if it's just poor quality and the latency isn't better just because the CAS is lower. Contrary to popular belief online, it is actually possible to get high frequency RAM with low latency. The question is whether it's reasonable for your budget.
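As a rough illustration of the CAS/frequency comparison, the sketch below converts a kit's advertised speed and CL rating into nanoseconds. It assumes the advertised "MHz" figure is the DDR data rate (two transfers per clock), so the actual clock is half of it. The three kits are hypothetical examples, not recommendations, and CAS is only one of several timings that affect real latency.

```python
def cas_latency_ns(data_rate, cas_cycles):
    """CAS latency in nanoseconds.

    data_rate is the advertised DDR figure (e.g. 3600). DDR transfers twice
    per clock, so the memory clock in MHz is data_rate / 2.
    """
    clock_mhz = data_rate / 2
    return cas_cycles * 1000.0 / clock_mhz


# Hypothetical kits, for illustration only.
kits = [(3200, 16), (3600, 14), (4000, 19)]

for data_rate, cl in kits:
    print(f"DDR4-{data_rate} CL{cl}: {cas_latency_ns(data_rate, cl):.2f} ns")
```

With these example numbers, the 4000 kit has the highest frequency but not the lowest CAS time, which is the trap described above.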

One more thing to look out for: Zen 2 (Ryzen) has an internal bus named Infinity Fabric. If the memory is more than twice the speed of the Infinity Fabric, the memory I/O part will decouple the clock, which then requires a buffering bridge between the RAM bus and the internal bus. This means that while the memory throughput will increase, the buffering will increase latency. Since the Infinity Fabric runs at 1800 MHz (good chips a bit higher), you can't expect to go faster than 3600 MHz RAM without a latency penalty.
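The rule of thumb above can be written as a quick check. This is only a sketch of the "RAM speed at most twice the fabric clock" rule as stated in this post. The function names are made up for illustration, and the 1800 MHz default is the typical figure quoted above, not a guarantee for any particular chip.

```python
def max_coupled_ram_speed(fabric_mhz):
    """Highest advertised RAM speed that still matches the fabric clock (2x rule)."""
    return 2 * fabric_mhz


def runs_coupled(ram_speed, fabric_mhz=1800):
    """True if the RAM speed does not exceed twice the fabric clock."""
    return ram_speed <= max_coupled_ram_speed(fabric_mhz)


for ram_speed in (3200, 3600, 4000):
    state = "coupled" if runs_coupled(ram_speed) else "decoupled (extra latency)"
    print(f"{ram_speed} MHz RAM with 1800 MHz fabric: {state}")
```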
 
I ordered my new computer now. Went for the i9-9900K.

The good RAM was out of stock, but I had to place the order before the campaign ended. So I configured the build with the cheapest RAM available, because it was not technically possible to order without RAM installed. The cheap RAM is 2400 MHz and CL18 I think, whereas the RAM I'll actually use - once I get it - will be 3600 MHz and CL14. I'll benchmark before and after and report the results. Then I might try some OC and see where it gets me.

Is there some super-late-game save I can use for my benchmark?
 
Ok, I changed the RAM from 2400 MHz CL16 to 3600 MHz CL14. On my first benchmark, I got exactly 53 seconds again. Then I noticed that the frequency was set to 2100 MHz in my UEFI after the upgrade, even though the factory setting is 3600 MHz. I changed it manually and ran the test again. Still exactly 53 seconds. Then I noticed that the timings were off as well, and I changed them from 15-16-36 to 14-15-35. Ran the benchmark one more time, and I got 43 seconds.

So... the results are weird, and I don't know what to make of them. Something seems to have gone wrong with my testing.

First RAM: https://www.corsair.com/eu/en/Categ...GB)-DDR4-2400MHz-C16-DIMM/p/CMV8GX4M1A2400C16

Second RAM: https://www.gskill.com/specification/165/326/1562839044/F4-3600C14D-16GTZN-Specification

i9-9900K @ 3.6 GHz

2400 MHz CL16 (8 GB) — 53 seconds
2100 MHz CL15 (16 GB) — 53 seconds
3600 MHz CL15 (16 GB) — 53 seconds
3600 MHz CL14 (16 GB) — 43 seconds
 
Hitting precisely 53 seconds with 3 different setups could indicate a systematic error in your measurements, though it's impossible to tell based on what you have written. In fact if you run the test multiple times with the same setup, then I wouldn't even expect the same result each time. This has to do with the randomness of where the data is stored in memory, what is running in the background, what happens to be in the CPU cache etc. Gone are the days when computers worked in hard realtime and a certain calculation would always take the same amount of time. It's one of those things we lost when we gained the ability to run more than one application at the same time.

You need to run the test multiple times with each setting to make sure the results are consistent. I don't think you should test fewer than 5 times, though if you get the same time 3 times in a row, you are likely getting consistent timing.
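One way to judge consistency is to write down each run and look at the mean and the spread. The sketch below assumes the turn times are measured by hand and typed in. The function name and the run values are hypothetical and are not results from this thread.

```python
from statistics import mean, stdev


def summarize(times_s):
    """Return (mean, sample standard deviation, max - min) for timings in seconds."""
    if len(times_s) < 2:
        raise ValueError("need at least two runs to judge consistency")
    return mean(times_s), stdev(times_s), max(times_s) - min(times_s)


# Hypothetical stopwatch readings for one RAM setting, in seconds.
runs = [52.4, 53.1, 53.6, 52.9, 53.0]

avg, sd, spread = summarize(runs)
print(f"mean {avg:.1f} s, std dev {sd:.2f} s, spread {spread:.1f} s over {len(runs)} runs")
```

If the difference between two settings is smaller than the spread within one setting, the measurement can't really tell them apart.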

One explanation for your results could be that you increased memory throughput, which doesn't really matter to the AI. What matters to the AI is latency, meaning how long the CPU waits from requesting certain data until it has the data ready. This means it isn't surprising that reducing the CAS timing results in improvements while increasing memory frequency does not, if the CAS is adjusted accordingly. However, a 19% improvement from going from 15-16-36 to 14-15-35 is surprising.
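To see why that jump looks too large, compare the relative change in CAS with the relative change in turn time. This is plain arithmetic on the numbers reported above (CL15 to CL14, 53 s to 43 s). The helper name is made up for illustration.

```python
def reduction_pct(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100.0


cas_change = reduction_pct(15, 14)   # CAS cycles, same frequency
time_change = reduction_pct(53, 43)  # reported turn time in seconds

print(f"CAS reduced by {cas_change:.1f}%")
print(f"turn time reduced by {time_change:.1f}%")
```

The turn time drops by far more than the CAS timing does, so the timing change alone is unlikely to explain the whole difference.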
 