I'm sure diffusion has a massive effect, especially as the number of cities seems to correlate well with the extended slowdowns.
It will definitely have a significant impact on late-game turn times. I'm less certain how much it affects the early game, though the algorithm probably loops through every tile each turn throughout the game.
The new game I started (the one that crashed just now) is only 500 turns old, and I notice a relatively long wait between my last unit move (or other action) and the blinking red button. That means the AI is spending a long time in the background calculating moves, spawning animals, etc.
Compared with earlier games that had masses of cities, my feeling is that it is not the number of cities or the number of civs that matters, but the total number of existing units that have to be calculated each turn. It makes no difference to CPU calculation time whether there are 100 cities and 10,000 units or 10 cities and 10,000 units. Each unit needs to be calculated separately. So graphics are not the issue; that is only one-time work for the GPU when loading. The CPU's processing time is the point.
This would be the same in the early game, when masses of animals are around and spawn again and again.
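To put rough numbers on that, here is a toy cost model, not the game's actual code. It assumes each unit and each city is processed once per turn and that both cost about the same; the names `turn_cost`, `unit_share` and both constants are made up for illustration. If cities are in fact much more expensive per turn (as the diffusion argument suggests), the picture would change.

```python
# Toy cost model: all constants below are made-up assumptions, not measured values.
COST_PER_UNIT = 1.0   # hypothetical work to process one unit per turn
COST_PER_CITY = 1.0   # hypothetical work to process one city per turn


def turn_cost(cities: int, units: int) -> float:
    """Estimated per-turn CPU work if every city and unit is processed once."""
    return cities * COST_PER_CITY + units * COST_PER_UNIT


def unit_share(cities: int, units: int) -> float:
    """Fraction of the per-turn work that is spent on units."""
    return units * COST_PER_UNIT / turn_cost(cities, units)


many_cities = turn_cost(cities=100, units=10_000)
few_cities = turn_cost(cities=10, units=10_000)
print(many_cities, few_cities)
print(round(unit_share(100, 10_000), 3))
```

Under these assumptions the two scenarios differ by less than 1%, because the unit term swamps the city term.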
The only way I see to reduce calculation time is to reduce the total number of units, which can sometimes exceed 10,000 per civ (as in my last game). It is an XP-era game that runs 9 threads on only 2 cores; it is not a 64-bit game!
That means creating a way for a unit to become much stronger through upgrades while at the same time reducing the number of units, so that, for example, 10 good attackers are enough to take a city and 10 defenders are enough to hold it. In my last game, by contrast, a city had more than 250 defenders (all low-level units), and each surrounding tile held more than 250 as well, so I needed more than 1,000 military units to attack effectively.
The AI prefers a mass of low-level units over a smaller number of well-upgraded ones. Maybe you can manage that through maintenance costs (food or gold), which should be the same for low-level and high-level units and high enough that the AI can't build that mass and decides to upgrade instead of creating a new unit.
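A rough sketch of why a flat upkeep would push in that direction. This is only a toy model with invented numbers and invented function names (`affordable_units`, `best_choice`), not how the AI really decides, and it ignores the one-time cost of the upgrade itself.

```python
# Toy model; every number and name here is a made-up assumption for illustration.
def affordable_units(income: int, upkeep_per_unit: int) -> int:
    """How many units a civ can keep if every unit costs the same flat upkeep."""
    return income // upkeep_per_unit


def army_strength(units: int, strength_per_unit: int) -> int:
    """Total strength of an army of identical units."""
    return units * strength_per_unit


def best_choice(income: int, upkeep_per_unit: int,
                low_strength: int, high_strength: int):
    """With equal upkeep for both tiers the unit cap is the same,
    so the tier with more strength per unit gives the stronger army."""
    cap = affordable_units(income, upkeep_per_unit)
    low = army_strength(cap, low_strength)
    high = army_strength(cap, high_strength)
    return ("upgrade", cap, high) if high > low else ("build more", cap, low)


# Cheap upkeep allows a huge stack; expensive upkeep caps the army size.
print(affordable_units(1000, 2))
print(affordable_units(1000, 50))
print(best_choice(1000, 50, low_strength=3, high_strength=12))
```

With the same income, raising the upkeep from 2 to 50 cuts the affordable army from 500 units to 20, and at that cap only the upgraded units give a worthwhile army.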
Limiting the number of units per tile does not help. It only means every tile fills up to the allowed maximum, units can no longer move, and the civ in question is de facto frozen. (That is what happened when I set the limit as a test; the result was terrible.)
Edit
Just ran a timing test:
i7-4700 2.4 GHz / 16 GB RAM / GeForce 780 / 4 GB VRAM / Win7-64 / SSD
GEM map
turn number 496
load save (3,680 KB) after starting the game - 65 sec
initial recalculation - 45 sec. This shows where the problem is: that is far too long; it should be no more than 15 sec at this early stage of the game.
wait for next turn - 61 sec
Edit 2
I am playing that game without random events!