Note: this post uses "city" and "colony" interchangeably; they are the very same thing. They are called colonies in-game, but the C++ code is mostly reused from Civ4, where they are called cities.
Those calculations are done in-game when we open the "military advisor" screen. In an advanced game (with something like a thousand units), opening that window freezes the game for about 5 seconds while the calculations run. If this operation had to be performed several times by the player and by each AI, it could significantly slow down the game.
Those calculations are done in Python, which is significantly slower than C++. However, just transferring the current profession variable from memory to the CPU would be way too slow, meaning it's already too slow even before the CPU starts to calculate.
With a table, the calculations required would be significantly reduced, as data would simply be "stored" and picked up. I don't know if the game has such a table system, but I assume it must; otherwise, how could things like trade routes work? Is everything really calculated all the time? Is nothing ever stored?
The answer to this can be reduced to just a single word:
cache
It's more or less as you say. If the calculation results in the same answer each time, the answer can be calculated once and then stored for quick access. The problem with a cache is that if, say, the cache is the sum of 1000 numbers and any of those numbers changes after the cache was calculated, using the cache gives an incorrect number, which is a bug. This means the cache needs updating, and doing that whenever it's needed, and only when it's needed, can sometimes be somewhat tricky.
I think the best example of a good cache is one of the first things I made as a modder. I cached the yield requirements for professions. Vanilla calculates this each time it's needed, with trait and FF bonus and whatever. I spotted this to be slow, calculated it once and for all, and the AI reduced the time it needed by 17.5%. The cache needs to be recalculated, but I think it's only when the player gets a new FF and only if a yield cost modifier is different from 0. A really rare event (perfect for a cache), but coding this was far from trivial.
You mentioned trade routes, and that particular part of the code is actually quite slow. It does in fact loop through all of them and look at the city stock and so on each time a fully automated transport has to decide what to do. I would love to improve this part, but I just don't have any idea how to do it. Caching data would be pointless, as colonies change their stockpile way too often.
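To make the "sum of 1000 numbers" case concrete, here is a minimal sketch of a cache with a dirty flag. It is standalone C++ with made-up names (`CachedSum`, `set`, `sum`), not anything from the game code. The point is only that every write has to go through one place that marks the cache as stale.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical example class, not part of the game code.
class CachedSum {
public:
    explicit CachedSum(std::vector<int> values)
        : m_values(std::move(values)) {}

    // Every write goes through here, so the cache cannot become stale unnoticed.
    void set(std::size_t index, int value) {
        assert(index < m_values.size());
        m_values[index] = value;
        m_dirty = true;
    }

    // Recalculates only if something changed since the last call.
    int sum() {
        if (m_dirty) {
            m_cache = std::accumulate(m_values.begin(), m_values.end(), 0);
            m_dirty = false;
            ++m_recalculations;
        }
        return m_cache;
    }

    // How many times the expensive calculation actually ran.
    int recalculations() const { return m_recalculations; }

private:
    std::vector<int> m_values;
    int m_cache = 0;
    bool m_dirty = true;
    int m_recalculations = 0;
};
```

The tricky part mentioned above is exactly the `set` function: in real code the numbers can be changed from many places, and each of them has to remember to mark the cache dirty.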
What we should do about units/professions:
My plan is to figure out how the numbers should be used and how to calculate them. They would be calculated on game start or game load. While the game is running, the numbers would have to be kept up to date, but a partial update is enough. For instance, the unit function that changes a profession would subtract one from the old profession and add one to the new.
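A rough sketch of that idea, assuming one counter per profession. All names here (`Unit`, `ProfessionCounts`, `rebuild`, `changeProfession`) are invented for illustration and the real unit class looks different. `rebuild` is the full count done on game start or load, and `changeProfession` is the partial update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a unit; the real game class is much bigger.
struct Unit {
    int profession; // index of the profession, assumed to follow XML order
};

class ProfessionCounts {
public:
    explicit ProfessionCounts(std::size_t numProfessions)
        : m_counts(numProfessions, 0) {}

    // Full recount: done once on game start or game load.
    void rebuild(const std::vector<Unit>& units) {
        m_counts.assign(m_counts.size(), 0);
        for (const Unit& unit : units) {
            assert(valid(unit.profession));
            ++m_counts[unit.profession];
        }
    }

    // Partial update: subtract one from the old profession, add one to the new.
    void changeProfession(Unit& unit, int newProfession) {
        assert(valid(unit.profession) && valid(newProfession));
        --m_counts[unit.profession];
        ++m_counts[newProfession];
        unit.profession = newProfession;
    }

    int count(int profession) const {
        assert(valid(profession));
        return m_counts[profession];
    }

private:
    bool valid(int profession) const {
        return profession >= 0 &&
               static_cast<std::size_t>(profession) < m_counts.size();
    }

    std::vector<int> m_counts;
};
```

A useful sanity check while developing something like this is to run a full `rebuild` on a second object now and then and compare it with the incrementally updated one. If they differ, some code path changes a profession without going through `changeProfession`.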
If it's possible to generate a table (now I'm sure of nothing), here's how I would do it.
Each city determines which specialists it needs (for instance, the AI could simply collect the professions in the city that are currently not done by a specialist, and that's good enough). Then that data is stored in a table which would look like this:
Code:
[B]Specialist   City     On his way[/B]
Lumberjack      City A   0
Blacksmith      City A   1
Tobacconist     City B   1
Carpenter       City C   0
C++ allows organizing memory precisely the way we want, and we could implement a table structure if we like. However, I would prefer a simple array structure. It's a list of numbers with one number for each profession. If you want to know how many lumberjacks you have, you look up lumberjack, and if it's, say, the 7th in the XML file, you read the 7th number. It's easy to code and extremely fast at runtime. When the "key" is the position, there is no need to search and no need to read data just to detect that it's not the entry you are looking for.
Your example with "already in city" and "on his way" could be implemented as two arrays. It could also be one, where each element is two numbers. For now, we could call the data structure tables and pretend some SQL design if you think that's easier. It will not be 100% like that when implemented in C++, but that's not important right now.
I think the easiest solution to handle is to store the arrays in the city object. This will automatically create one list for each city. It would likely be a good idea to have a similar list in the player object, which stores the combined values from all cities.
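Here is a sketch of how that could look, using the "one array where each element is two numbers" variant. Everything here is a made-up name (`SpecialistCount`, `City`, `Player`, `specialistDispatched`, `specialistArrived`), not the game's actual city or player classes, and the profession index is assumed to be the position in the XML file. The player keeps its own array of combined values and updates it together with the city array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the game's city or player class.
struct SpecialistCount {
    int inCity = 0;   // specialists already in the city
    int onTheWay = 0; // specialists travelling towards the city
};

struct City {
    explicit City(std::size_t numProfessions) : specialists(numProfessions) {}
    std::vector<SpecialistCount> specialists; // indexed by profession
};

class Player {
public:
    Player(std::size_t numCities, std::size_t numProfessions)
        : m_cities(numCities, City(numProfessions)), m_totals(numProfessions) {}

    // A specialist has been sent towards a city.
    void specialistDispatched(std::size_t city, std::size_t profession) {
        check(city, profession);
        ++m_cities[city].specialists[profession].onTheWay;
        ++m_totals[profession].onTheWay;
    }

    // A specialist that was on the way has arrived.
    void specialistArrived(std::size_t city, std::size_t profession) {
        check(city, profession);
        SpecialistCount& local = m_cities[city].specialists[profession];
        assert(local.onTheWay > 0);
        --local.onTheWay;
        ++local.inCity;
        --m_totals[profession].onTheWay;
        ++m_totals[profession].inCity;
    }

    // Direct lookups by position: no searching involved.
    const SpecialistCount& inCity(std::size_t city, std::size_t profession) const {
        check(city, profession);
        return m_cities[city].specialists[profession];
    }
    const SpecialistCount& total(std::size_t profession) const {
        assert(profession < m_totals.size());
        return m_totals[profession];
    }

private:
    void check(std::size_t city, std::size_t profession) const {
        assert(city < m_cities.size());
        assert(profession < m_totals.size());
    }

    std::vector<City> m_cities;
    std::vector<SpecialistCount> m_totals; // combined values from all cities
};
```

Keeping the combined array in the player costs two extra additions per update, but it means a question like "how many blacksmiths are on their way anywhere" is a single read instead of a loop over all cities.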
For the AI, we can imagine that at every turn, it would check the list and only insert a new line if a new unit comes in. The difficulty though is that as I recall, the AI recalculates professions of all its citizens at every turn, but that's another topic.
I'm thinking of modding the AI profession selection code to do something like this:
- loop through all units in the city with the profession in question
- if one is found, which isn't an expert, kick the unit (A) out of the city
- insert expert into the slot left free by A
- tell A to find best profession
If no unit is found or the unit isn't an expert, use the current code to determine a profession.
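The steps above could be sketched roughly like this. It is a simplified model with invented names (`Unit`, `NO_PROFESSION`, `tryReplaceNonExpert`), not the existing AI code. A return value of -1 means "nothing was replaced, fall back to the current code", and a displaced unit is left without a profession so the caller can tell it to find its best one.

```cpp
#include <cstddef>
#include <vector>

const int NO_PROFESSION = -1;

// Hypothetical stand-in for a unit; the real game class is much bigger.
struct Unit {
    int profession;       // current profession index, NO_PROFESSION if none
    int expertProfession; // profession the unit is an expert in, NO_PROFESSION if none

    bool isExpert() const {
        return profession != NO_PROFESSION && profession == expertProfession;
    }
};

// Tries to put `incoming` into the slot of a non-expert doing the incoming
// unit's expert profession. Returns the index of the displaced unit (A) in
// `cityUnits`, or -1 if nobody was displaced.
int tryReplaceNonExpert(std::vector<Unit>& cityUnits, Unit& incoming) {
    if (incoming.expertProfession == NO_PROFESSION) {
        return -1; // not an expert: the caller uses the current code
    }
    for (std::size_t i = 0; i < cityUnits.size(); ++i) {
        Unit& candidate = cityUnits[i];
        if (candidate.profession == incoming.expertProfession &&
            !candidate.isExpert()) {
            // Kick A out of the slot and give the slot to the expert.
            candidate.profession = NO_PROFESSION;
            incoming.profession = incoming.expertProfession;
            return static_cast<int>(i);
        }
    }
    return -1; // no non-expert found: the caller uses the current code
}
```

This only covers the swap itself. Checks like "is there any ore in the colony" would have to happen before calling it.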
I think this would be an improvement. However it's not perfect and it would require some thinking, like don't place a blacksmith as a blacksmith if the colony is out of ore.
Alternatively finding the best profession would include the ability to kick out an inferior unit. There are multiple ways to handle this issue and it's a problem I will leave for later.