It's just a quick grab bag of stuff I noticed that piqued my interest (and can remember off the top of my head right now), that I want to investigate with an eye to improving performance:

Sparse bool arrays - these seem to be used a lot, and in every case I've looked at so far they are forcing the code to be slower instead of faster, and the memory usage to be higher than it should be (which also impacts performance).

Massive classes - the bigger an object is in memory, the less of it fits in cache.

Vectors of pointers - this is bad performance-wise in almost all cases. Of course some design requirements might make this impossible, but, as an example, the replacement system, where this is most used, is NOT one of them (at least as far as I can tell thus far from my surface reading of the code).

Pointers in general - they shouldn't be the default for how to instantiate an object, but they seem to be in this code, so I'm going to look into that. Maybe most or all uses are the correct choice; I shall see.

Const correctness - yes, the optimizer can in fact perform better when it knows that code doesn't have side effects. It will also make it much easier to apply...

Data parallelization - once you know that the contents of a for loop *actually* don't have side effects (not that const actually guarantees it, but it gets you most of the way there), you can run it on more than one thread so it finishes faster. I don't know how many places could currently benefit from this, but some places that currently can't might be refactored so that they could.

Branching - it's slow. Especially convoluted branching will hurt the branch predictor and the optimizer. I already noticed plenty of redundant branching as well, where constraints are checked multiple times.

Critical sections - but I think alberts is removing all this stuff.
Bake the static portions of the AI scoring - a large part of the valuation that the AI gives to buildings when deciding which to build is based on their raw numbers and not contingent on any external factors, and thus can be pre-computed instead of recomputed for every city, for every building, every turn. Probably the same is true of unit builds.

Maybe I can't improve on the code in some or any of these cases; they are just my checklist of things to investigate.

/edit The above stuff wouldn't change gameplay or behaviour at all (intentionally at least!). There is another set of things I want to look at (I haven't worked them all out yet, I just know they are there waiting to be discovered) that *would* change gameplay behaviour. This doesn't mean they make it worse, or even noticeably different, though. In fact, in the end they could lead to gameplay/AI improvements implemented using the spare CPU cycles that were saved:

Better utilization of caches once they are formed, e.g. longer-term city build planning for the AI.

More accurate cache invalidation rules - caches don't need to be cleared at the start of a round; instead they can be cleared, or partially cleared, when values known to affect them have changed.

Multi-level cache with a standardized representation, from which invalidation rules can be derived automatically - this would be the ultimate way to model the caches. Not a trivial task.

Queue multiple buildings - the AI is already evaluating and assigning a score to every possible building every time it changes production. Why not just put the top 5 rated buildings in the queue straight away? Then decide each turn whether the AI feels strongly about clearing the rest of the queue due to some significant change in circumstances (war, a tech with an un-built wonder in it, etc.). This is how most humans play, after all.

I didn't look into the unit AI at all yet, so more ideas to come hopefully!