Nightinggale
Deity
- Joined
- Feb 2, 2009
- Messages
- 5,279
I haven't looked at what was copied yet, but I can reply based on what I read in this thread.
The current advisor file in Medieval Conquest is written to be copied to RaRE without modifications. In other words, it is the "correct" one. I wrote it generically enough to display correctly in both mods (setting columns and such based on XML values), as that allows writing new features for both mods at once. It also means a bugfix for one likely works in both.
3) Advisor Screen from Nightinggale
However, I'm not sure you would be able to tell the difference in RaR whether you used the newest version from RaRE or from M:C. The difference is likely just in the code (I didn't bother to check right now).
What I did change from RaR:
- Made it more generic, allowing addition of new code more easily.
- New screen with combined production storage of all yields for all colonies.
- Fixed wrong column width on production page 3 (copy paste error)
- Removed buildings page 3 (it was empty)
- Added missing column on buildings page 1
- External events that redraw the screen now always recalculate the page in front and never a background page (previously, if you were on, say, production page 2, that page wouldn't update, but trade and natives would)
- The clicked button is highlighted when it's the active button (RaR only had highlighting for page 1)
My optimization strategy was to improve the slow parts of the AI code to reduce the wait time for the next turn, so this is the only part I actually measured. I managed to reduce the wait time by 25%.
I can't give exact numbers, but it feels faster.
But I can't tell for sure.
Some of it will make the player's UI faster as well, like creating the list of professions for a unit inside a colony. I didn't actually measure that, but I imagine at least some people will notice a smoother colony interface, especially on low-end computers.
The code is generally not that badly written when it comes to speed. The bad part is that it recalculates values which are often the same throughout the entire game, or repeatedly accesses constant XML values. This is why caching appears to be the most efficient approach to increasing performance.
The things I did, do improve performance only marginally.
(Basic code improvements to have logic a bit more efficient.)
I did find one place I can improve by changing the logic. I didn't, as I (at least at that time) had a plan to add a feature to that particular piece of code, and I could easily have ended up improving a function I would no longer call. I decided to postpone that until I'm fairly sure the code is there to stay for quite a while. Having said that, I can tell for sure that it will not have the same overall impact as the caching I did.
The inlining was done before I had the profiler working, so I don't know how much it actually improves things. However, it's a textbook example of what should be inlined (every function just returns a variable), as it reduces code size and gives a small speed boost. It reduces the DLL file size by 8k. The speed gain is unknown, but likely not that great. It is still likely a good improvement relative to the coding time, as it was mainly copying one block of code from a cpp file to a header file and then using search and replace to add inline. I'm not sure how long it took me, but it was in the area of a few minutes, and testing can more or less be reduced to "can it compile".
Several things Nightinggale had done, I can tell by looking at the code that they increase speed, since we did that already before.
(Caching of Globals)
With others it is really hard to predict by simply looking at the code.
(Inlined functions, the "Just-In-Time" arrays, ...)
Here I simply trust Nightinggale's tests and examinations.
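As a rough illustration of the inlining mentioned above, here is a minimal sketch of trivial getters whose definitions live in a header and carry the inline keyword. The class UnitInfo and its members are hypothetical names picked for the example and are not taken from the mod's source.

```cpp
#include <cassert>

// Hypothetical class; the names are illustrative only.
class UnitInfo {
public:
    UnitInfo(int moves, int cost) : m_iMoves(moves), m_iCost(cost) {
        assert(moves >= 0 && cost >= 0);   // simple sanity check on the inputs
    }

    int getMoves() const;
    int getCost() const;

private:
    int m_iMoves;
    int m_iCost;
};

// Definitions that used to sit in a .cpp file, moved to the header and
// marked inline. Each one only returns a member variable, so the compiler
// can replace the call with a direct read instead of a function call.
inline int UnitInfo::getMoves() const { return m_iMoves; }
inline int UnitInfo::getCost() const { return m_iCost; }
```

Whether the compiler actually inlines a given call is still its own decision, so any gain has to be measured rather than assumed.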
"Just-In-Time" arrays are not designed to increase speed on their own, but rather to save memory. The point is that they aren't allocated until they are written to, and unallocated arrays read as full of 0. They are primarily used for arrays which are full of 0 from the start and which the AI will not change. The speed goal is to match "normal" arrays, which is more or less achieved, especially for reading. They do have one speed boost though: it is quick to check whether an array is allocated, which in turn makes "is all content 0" a single comparison.
The first commit with a working profiler was caching CvPlayer::getYieldEquipmentAmount(). On its own this cache is responsible for half the speed improvement. However, in addition to the caching, I noticed that whenever this function is called, it is often in a pattern like "for each yield, call the function, check that the yield is in the colony, etc.". In other words, this is safe to skip when all values are 0. Skipping those checks when the array isn't allocated contributes 20% of the speed boost. There is absolutely no way to predict this without proper testing, and I had no idea how well it would work before I was done and had tested it.
This means this single commit (50% from the cache plus 20% from the skipped checks) has more than twice the impact of all the other speed improvements in RaRE combined.
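To make the idea concrete, here is a minimal sketch of an array that allocates only on the first non-zero write, plus a per-yield loop that is skipped entirely when the array was never allocated. JustInTimeArray and countSatisfiedYields are hypothetical names, and the "required amount per yield" data is an assumption for illustration; the real RaRE array class and the cache behind CvPlayer::getYieldEquipmentAmount() are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: storage is allocated on the first non-zero write,
// and an unallocated array reads as all 0.
class JustInTimeArray {
public:
    explicit JustInTimeArray(std::size_t length) : m_length(length) {}

    // False guarantees that every element is 0.
    bool isAllocated() const { return !m_data.empty(); }

    std::size_t length() const { return m_length; }

    int get(std::size_t index) const {
        assert(index < m_length);
        return isAllocated() ? m_data[index] : 0;
    }

    void set(std::size_t index, int value) {
        assert(index < m_length);
        if (!isAllocated()) {
            if (value == 0) return;        // writing 0 into an all-0 array changes nothing
            m_data.assign(m_length, 0);    // allocate on the first real write
        }
        m_data[index] = value;
    }

private:
    std::size_t m_length;
    std::vector<int> m_data;   // stays empty until the first non-zero write
};

// Counts the yields for which the colony stock covers the cached required
// amount. When the cache was never allocated, every amount is 0, so the
// whole loop is replaced by a single comparison.
int countSatisfiedYields(const JustInTimeArray& required,
                         const std::vector<int>& colonyStock) {
    assert(colonyStock.size() == required.length());
    if (!required.isAllocated()) return 0;
    int satisfied = 0;
    for (std::size_t y = 0; y < required.length(); ++y) {
        const int needed = required.get(y);
        if (needed > 0 && colonyStock[y] >= needed) ++satisfied;
    }
    return satisfied;
}
```

In this sketch, an array that has been allocated and later written back to all 0 stays allocated, so isAllocated() returning true does not promise non-zero content. Only the false case is a guarantee.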
The key to writing faster code is to measure and profile. If you ever compile a DLL yourself, I highly recommend the makefile I wrote. It not only compiles using all CPU cores, it also allows things like profiling, making it easy to identify which functions, and even which lines of code, use the CPU time. Without this, I wouldn't have been able to identify the parts of the code to optimize.
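For anyone without a profiler set up, a hand-rolled timer is a much cruder way to get a first measurement. The sketch below is not the profiling that the makefile enables; ScopedTimer and sumTo are hypothetical names, and it assumes a compiler with C++11 std::chrono, which may not match the toolchain used for the DLL.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper: adds the wall-clock time spent in a scope to a
// running total stored under a label.
class ScopedTimer {
public:
    typedef std::map<std::string, long long> Totals;   // label -> microseconds

    ScopedTimer(Totals& totals, const std::string& label)
        : m_totals(totals), m_label(label),
          m_start(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::steady_clock::duration elapsed =
            std::chrono::steady_clock::now() - m_start;
        m_totals[m_label] +=
            std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    }

private:
    Totals& m_totals;
    std::string m_label;
    std::chrono::steady_clock::time_point m_start;
};

// Example function under measurement: sums the integers 1..n.
long long sumTo(ScopedTimer::Totals& totals, int n) {
    assert(n >= 0);
    ScopedTimer timer(totals, "sumTo");   // time is recorded when timer goes out of scope
    long long sum = 0;
    for (int i = 1; i <= n; ++i) sum += i;
    return sum;
}
```

A real profiler gives per-function and per-line data without touching the code, which is why it remains the better tool for finding what to optimize.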