Nightinggale
I decided to start a new thread on coding issues. This is not about anything related to the players. Instead it is announcements and discussions on how the internals of the DLL should work.
I pushed an update for the cache for getDefineINT(). Now that XML files are read differently, we can set the cache in a more efficient way because we know everything is set before the main menu.
I decided to take this one step further and reorder where in the array each int is. If a variable is often used together with another variable, then those two can benefit from being next to each other (hardware cache). We currently have 306 variables in the cache, so analyzing all of them manually would take far more time than I plan to spend on this. Instead I made a simple counter, played a bit and then read the counter. This tells me how many times each variable is called during my play session, and the plan was to sort by this list to group the often called ones together. At the very least this prevents often called ones from sharing space with really rarely called ones. This gave me a top 10:
83,24% | XML_CIVICOPTION_INVENTIONS
10,79% | XML_DEFAULT_YIELD_ARMOR_TYPE
2,41% | XML_AUTORESEARCH_ALL
0,82% | XML_CITY_YIELD_CAPACITY
0,82% | XML_FEATURE_PRODUCTION_YIELD_MAX_DISTANCE
0,37% | XML_FATHER_COST_EXTRA_TEAM_MEMBER_MODIFIER
0,19% | XML_MAX_WITHDRAWAL_PROBABILITY
0,13% | XML_FATHER_POINT_REAL_TRADE
0,12% | XML_MIN_CIV_STARTING_DISTANCE
0,12% | XML_STARTING_DISTANCE_PERCENT
This reveals something interesting. More than 96% of all calls to the cache go to the same 3 variables.
Also, only about 1% of the calls go to the 296 variables that aren't in the top 10.
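For anyone who wants to repeat the measurement, the counter can be sketched roughly like this. This is a minimal stand-alone illustration, not the code in the DLL: DefineCache, NUM_CACHED_DEFINES and topDefines are hypothetical names, and the real cache is indexed by the XML_ enum rather than a plain size_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real cache (306 ints at the time of writing).
const std::size_t NUM_CACHED_DEFINES = 306;

struct DefineCache
{
    int values[NUM_CACHED_DEFINES];
    unsigned long long hits[NUM_CACHED_DEFINES]; // profiling only, remove for release

    DefineCache() : values(), hits() {} // zero-initialize both arrays

    int get(std::size_t index)
    {
        assert(index < NUM_CACHED_DEFINES);
        ++hits[index]; // count every lookup
        return values[index];
    }
};

typedef std::pair<std::size_t, unsigned long long> HitEntry;

static bool moreHits(const HitEntry& a, const HitEntry& b)
{
    return a.second > b.second;
}

// Returns (index, share of all calls in percent), most called first.
std::vector<std::pair<std::size_t, double> > topDefines(const DefineCache& cache, std::size_t count)
{
    unsigned long long total = 0;
    std::vector<HitEntry> ranked;
    for (std::size_t i = 0; i < NUM_CACHED_DEFINES; ++i)
    {
        total += cache.hits[i];
        ranked.push_back(std::make_pair(i, cache.hits[i]));
    }
    std::stable_sort(ranked.begin(), ranked.end(), moreHits);

    std::vector<std::pair<std::size_t, double> > result;
    for (std::size_t i = 0; i < ranked.size() && i < count; ++i)
    {
        const double share = total == 0 ? 0.0 : 100.0 * ranked[i].second / total;
        result.push_back(std::make_pair(ranked[i].first, share));
    }
    return result;
}
```

Calling topDefines(cache, 10) after a play session gives the same kind of list as the top 10 above. The counting itself costs a little, so it only belongs in a profiling build.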
Now I have a better idea of how to optimize this. The top 2-3 variables get hardcoded into the DLL.
CIVICOPTION_INVENTIONS: this really needs to be hardcoded, but I'm not sure which would be the best approach yet.
DEFAULT_YIELD_ARMOR_TYPE can be part of the yield enum.
AUTORESEARCH_ALL could be a define set in Makefile.settings or a header as it is fairly rarely used with a non-default value.
If those are hardcoded, then I think the rest will be called rarely enough that their order doesn't matter, and we can keep the current order, which is human readable and has comments about file of origin etc.
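To make the two easier cases concrete, here is a rough sketch of what hardcoding could look like. Everything in it is an assumption for illustration: the enum is cut down to two yields, the default of 0 for AUTORESEARCH_ALL is a guess, and getCachedDefine is a hypothetical stand-in for the existing cache lookup. CIVICOPTION_INVENTIONS is left out on purpose since the approach for it is still undecided.

```cpp
#include <cassert>

// DEFAULT_YIELD_ARMOR_TYPE as part of the yield enum: the compiler sees a
// constant instead of a cache lookup. The yields listed here are placeholders.
enum YieldTypes
{
    YIELD_FOOD,
    YIELD_ARMOR,
    NUM_YIELD_TYPES,
    DEFAULT_YIELD_ARMOR_TYPE = YIELD_ARMOR // alias, value is an assumption
};

// AUTORESEARCH_ALL as a compile-time define. The build settings (or a header)
// can define it before this point; otherwise the assumed default is used.
#ifndef AUTORESEARCH_ALL
#define AUTORESEARCH_ALL 0
#endif

inline YieldTypes getDefaultArmorYield()
{
    return DEFAULT_YIELD_ARMOR_TYPE;
}

inline bool isAutoResearchAll()
{
    return AUTORESEARCH_ALL != 0;
}

// The rarely called defines keep going through the runtime cache.
// Hypothetical helper: the real lookup lives elsewhere in the DLL.
inline int getCachedDefine(const int* cache, int size, int index)
{
    assert(index >= 0 && index < size);
    return cache[index];
}
```

The trade-off is that a hardcoded value can no longer be changed from XML alone; changing it means recompiling the DLL.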
If you feel like knowing more about memory management optimization:
Spoiler :
This video talks about how memory access latency affects performance, and how the problem gets more severe the newer the computer is. The video is from 2007 and stuff has happened since then, but this problem certainly didn't go away.
There is a lot of talk about shared/global memory and locks. This is only interesting for multithreaded applications, and the DLL is single-threaded. This means we completely avoid this issue, though it also means we only use one CPU core for the game logic while the rest are idle. The graphics display runs in another thread though, which hopefully means another CPU core.
I think the most illustrative part about the cause of the issue is from 17:30 to 28:10 (roughly).
Last, but not least: it's ok not to get this video. It is certainly possible to write decent code without understanding this. I don't really know how tricky this is to understand for non-engineers, but I have a hunch it could be tricky. The video is theoretical and only gives hints on how to code for this. It likely takes some background knowledge to really turn this into actual code.
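As one small example of turning the idea into code, this is roughly what "sort by call count so the hot values sit next to each other" could look like. It is only a sketch under assumptions: buildHotFirstLayout and applyLayout are made-up names, and the call counts are expected to come from a profiling run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Orders indices by descending call count.
struct MoreHits
{
    const std::vector<unsigned long long>* hits;
    explicit MoreHits(const std::vector<unsigned long long>* h) : hits(h) {}
    bool operator()(std::size_t a, std::size_t b) const
    {
        return (*hits)[a] > (*hits)[b];
    }
};

// Returns slotOf[oldIndex] = new position in the array, most called first.
std::vector<std::size_t> buildHotFirstLayout(const std::vector<unsigned long long>& hits)
{
    std::vector<std::size_t> order(hits.size());
    for (std::size_t i = 0; i < order.size(); ++i)
    {
        order[i] = i;
    }
    std::stable_sort(order.begin(), order.end(), MoreHits(&hits));

    std::vector<std::size_t> slotOf(hits.size());
    for (std::size_t pos = 0; pos < order.size(); ++pos)
    {
        slotOf[order[pos]] = pos;
    }
    return slotOf;
}

// Copies the values into their new slots so the hot ones are contiguous.
std::vector<int> applyLayout(const std::vector<int>& values, const std::vector<std::size_t>& slotOf)
{
    assert(values.size() == slotOf.size());
    std::vector<int> packed(values.size());
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        packed[slotOf[i]] = values[i];
    }
    return packed;
}
```

With 4 byte ints and the commonly seen 64 byte cache line, the 16 most called values would end up in the same line, provided the array starts on a line boundary. Whether that is measurable in the game is a separate question.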