More Naval AI Profiling

Terkhen

As everyone who has played many turns on a big map has noticed, More Naval AI performance suffers a lot in the late game. I would like to work on improving this situation.

To find out which methods cause the biggest problems, I used the FProfiler included in the dll to profile More Naval AI r1332 for 10 turns of a late game (turn 465) on a Huge Pangaea map. The savegame I used and the profiling results for each turn are attached to this post.

I checked the source code of each of the methods that take a lot of time and took some notes on them. I would appreciate any comments on these methods. Once I have a clear idea of which changes to these or other methods could improve game speed, I plan to start coding and profiling those changes to see whether they actually improve turn speed.

CvPlayerAI::AI_doTurnUnitsPost:
This method has a big cost in all turns. Improving its performance should be a priority. Tholal has a ToDo note in this method. The change he proposes should help a lot in improving the performance of this method.

CvPlayerAI::AI_unitValue:
This method has a big cost on all turns. Improving its performance should be a priority. Most of its time cost comes from the huge number of times it is called; the cost of a single call seems to be negligible. If this value could be cached for each unit and marked as dirty as infrequently as possible, we could see a huge improvement in late game speed. In any case, switching from the bValid magic to code that bails out of the method as early as possible should improve things a bit.
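To illustrate the "bail out early" idea, here is a minimal sketch. It is not the real AI_unitValue: the types, fields and scoring formula below are hypothetical stand-ins, and only the control flow is the point. The first function carries a bValid flag to the end, the second returns as soon as the unit is known to be unsuitable.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the real game types.
enum UnitAI { UNITAI_ATTACK, UNITAI_SETTLE, NUM_UNITAI };

struct UnitInfo {
    int combat;     // combat strength
    bool canFound;  // true if the unit can found cities
    int moves;      // movement points
};

// Flag style: the validity check is stored in bValid and tested later.
int unitValueFlagStyle(const UnitInfo& info, UnitAI eAI) {
    bool bValid = false;
    switch (eAI) {
    case UNITAI_ATTACK: bValid = info.combat > 0; break;
    case UNITAI_SETTLE: bValid = info.canFound; break;
    default: break;
    }
    int value = 0;
    if (bValid) {
        value = 1 + info.combat * 10 + info.moves * 5;  // made-up scoring
    }
    return value;
}

// Early-exit style: return 0 as soon as the unit is known to be unsuitable,
// so none of the later scoring code is reached for invalid combinations.
int unitValueEarlyExit(const UnitInfo& info, UnitAI eAI) {
    assert(eAI >= 0 && eAI < NUM_UNITAI);
    if (eAI == UNITAI_ATTACK && info.combat <= 0) return 0;
    if (eAI == UNITAI_SETTLE && !info.canFound) return 0;
    return 1 + info.combat * 10 + info.moves * 5;  // same made-up scoring
}
```

Both functions return the same result for every valid UnitAI; the second one just makes the cheap rejections explicit. How much this saves in the real method would have to be measured with the profiler.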

CvTeamAI::AI_doWar:
This method does not have a high cost on every turn, but on the turns where it does, the cost is very large.

This method calls the Python function "AI_doWar", but since that function is empty it should not be adding much to the cost. I don't see any obvious way to reduce the cost of this method.

CvTeamAI::AI_doWar -> CvPlayerAI::AI_doPeace:
This method does not have a high cost on every turn, but on the turns where it does, the cost is very large. I don't spot any obvious ways to improve this code.

CvTeam::setHasTech:
This method does not have a high cost on every turn, but on the turns where it does, the cost is very large. It cycles twice through all the plots to check for bonus changes, even if the technology being set is not the "techReveal", "techCityTrade" or "techObsolete" of any bonus. Since most of its cost comes from calling CvTeam::processTech, this does not seem like a big issue.
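For reference, the check described above could look roughly like this. It is only a sketch under assumptions: BonusInfo is a hypothetical stand-in for the bonus data, techs are plain ints with -1 meaning "none", and the plot loop is a placeholder that only counts visits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-bonus data (-1 means "no tech").
struct BonusInfo {
    int techReveal;
    int techCityTrade;
    int techObsolete;
};

// True if eTech reveals, enables trading of, or obsoletes at least one bonus.
bool techAffectsAnyBonus(const std::vector<BonusInfo>& bonuses, int eTech) {
    assert(eTech >= 0);
    for (std::size_t i = 0; i < bonuses.size(); ++i) {
        const BonusInfo& b = bonuses[i];
        if (b.techReveal == eTech || b.techCityTrade == eTech ||
            b.techObsolete == eTech) {
            return true;
        }
    }
    return false;
}

// Placeholder for the plot update: returns how many plots were visited.
int updatePlotsForTech(const std::vector<BonusInfo>& bonuses, int eTech,
                       int numPlots) {
    if (!techAffectsAnyBonus(bonuses, eTech)) {
        return 0;  // no bonus depends on this tech, so skip the plot loop
    }
    int visited = 0;
    for (int i = 0; i < numPlots; ++i) {
        ++visited;  // the real code would refresh the plot's bonus state here
    }
    return visited;
}
```

The number of bonuses is tiny compared to the number of plots on a huge map, so the extra check is cheap. As noted, processTech dominates the cost, so the gain from this alone is probably small.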

CvTeam::setHasTech -> CvTeam::processTech:
This method does not have a high cost on every turn, but on the turns where it does, the cost is very large. I do not see any obvious change that could improve the speed of this code.

CvPlayer::isFullMember:
This method has a noticeable cost in all turns. This cost comes from being called half a million times each turn. Trying to reduce the number of times it is called should help a bit with its cost.

CvUnit::canCast:
This method has a noticeable cost in all turns. The cost comes mostly from the number of times it is called. Although canCast does not look like a huge performance drain in my profiling log, I have reasons to suspect that its cost increases a lot when more spells are added (especially if they have "unit on stack" or "promotion in stack" prerequisites, or if they are permanent summoning spells). For example, Snarko mentioned at #erebus that CvUnit::canCast was a big performance problem in Rise from Erebus. Adapting the changes he made in Rise from Erebus to More Naval AI should help a bit with its performance, and it should also help any More Naval AI modmods that add a huge number of spells.
 

Attachments

  • PERFORMANCE.CivBeyondSwordSave
    932 KB · Views: 145
  • dllprofile.rar
    4.1 KB · Views: 115
Thanks for the links :)

The download link for the first one is dead. Since the thanks mention Kael and I remember some references to a partial Wildmana merge in FFH2, I checked the More Naval AI source code to see whether it had already been merged. I found "SPEEDTWEAK ... Sephi" mentions in the source, so it seems that it was already merged at some point.

In the CAR mod, the changes seem to be tagged as "Sanguo Mod Performance start, added by poyuzhe". I found a few mentions of this tag in the More Naval AI source code as well, which means that MNAI already has a partial merge of CAR.

What I plan to do right now is to just tweak, modify and adapt the methods that cause the biggest peaks in time consumption in More Naval AI. In the future (if I have time and I want to spend time on coding) I would like to take a look at some speed improvements that are being coded right now in Realism Invictus. While talking about what would bring real improvements in time at #erebus, Snarko suggested modifying directly how information is stored to make it accessible in a faster way than always looping through everything. But for now, I only plan minor tweaks :)
 
I don't know if this change was ever made, so I'm posting it here. It was proposed in the original MNAI thread several months back, so Tholal might already have incorporated it.
 
IIRC it improves game speed once hell terrain starts spreading by using more efficient code for the terrain change
 
Gekko said:
IIRC it improves game speed once hell terrain starts spreading by using more efficient code for the terrain change

With a lot of world builder and a bit of luck, I managed to create a savegame in which a neutral civ (the Khazad) is about to become evil and switch to the Ashen Veil with an AC of 60. Since the Infernals are nearby, this triggers the change of their lands into hell terrain. I profiled 10 turns and attached the results to this post. I did not notice any significant differences between these results and the original ones.

Very cool!

Thanks :)

I'm thinking about creating a cache for the CvPlayerAI::AI_unitValue method. It is one of the biggest performance eaters, and most of its calculations could be done just once (on load and on game start, for example). The cache would index unit values by a (UnitTypes, UnitAITypes) pair. The cached value would contain the sum of all "static" calculations currently made by AI_unitValue. "Dynamic" calculations would be those that call AI_unitImpassableCount or those based on the pArea parameter. Whenever the new AI_unitValue method is called, it would perform only the dynamic calculations and add their result to the cached value. Do you foresee any problems with this approach?
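A rough sketch of the static/dynamic split, to make the proposal concrete. Everything here is an assumption for illustration: the enum sizes, the two compute functions and the UnitValueTable class are hypothetical, ints stand in for UnitTypes, UnitAITypes and the area, and the real code would need one table per player.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sizes; the real counts come from the loaded XML.
const int NUM_UNIT_TYPES = 3;
const int NUM_UNITAI_TYPES = 2;

// Stand-in for the expensive part that does not depend on the area.
int computeStaticValue(int eUnit, int eUnitAI) {
    return (eUnit + 1) * 100 + eUnitAI * 10;
}

// Stand-in for the part based on pArea or AI_unitImpassableCount.
int computeDynamicValue(int eUnit, int areaId) {
    return areaId - eUnit;
}

class UnitValueTable {
public:
    // Fill every (unit, unit AI) slot once, e.g. on load or game start.
    UnitValueTable() : m_values(NUM_UNIT_TYPES * NUM_UNITAI_TYPES) {
        for (int u = 0; u < NUM_UNIT_TYPES; ++u) {
            for (int a = 0; a < NUM_UNITAI_TYPES; ++a) {
                m_values[index(u, a)] = computeStaticValue(u, a);
            }
        }
    }

    // Per call: one table read plus the dynamic part only.
    int unitValue(int eUnit, int eUnitAI, int areaId) const {
        return m_values[index(eUnit, eUnitAI)] +
               computeDynamicValue(eUnit, areaId);
    }

private:
    static std::size_t index(int eUnit, int eUnitAI) {
        assert(eUnit >= 0 && eUnit < NUM_UNIT_TYPES);
        assert(eUnitAI >= 0 && eUnitAI < NUM_UNITAI_TYPES);
        return static_cast<std::size_t>(eUnit) * NUM_UNITAI_TYPES + eUnitAI;
    }

    std::vector<int> m_values;
};
```

The main risk with this design is a "static" value that silently depends on something that changes during the game; in that case the table would need to be rebuilt or marked dirty when that input changes.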
 

Attachments

  • PERFORMANCE_HELL.CivBeyondSwordSave
    280.1 KB · Views: 138
  • dllprofile.rar
    2.9 KB · Views: 92

Before you write it yourself, you might want to check other mods to see if they have already done something like this or found another way to speed it up. IIRC some of the mods that add lots of content, like Caveman2Cosmos, also have performance enhancements. K-Mod should definitely be checked... but I believe some parts of K-Mod have already been merged into MNAI?

Your approach sounds good to me if no one else has already created something.
 
I believe the hell terrain tweak has to do with RAM and graphics rather than CPU processing, so that's probably why you didn't notice a difference. People with less beefy computers should notice though :)
 
Before you write it yourself, you might want to check other mods to see if they have already done something like this or found another way to speed it up. IIRC some of the mods that add lots of content, like Caveman2Cosmos, also have performance enhancements. K-Mod should definitely be checked... but I believe some parts of K-Mod have already been merged into MNAI?

Your approach sounds good to me if no one else has already created something.

Given the amount of AI additions from other mods and tweaks made to More Naval AI, I would have to check any "external" changes very thoroughly to make sure that they don't conflict with MNAI changes. That said, I have checked this method in Rise from Erebus, Realism Invictus and Caveman2Cosmos (I was not able to find K-MOD source code for the dll, please direct me to it if you know where to find it). I did not see any changes similar to what I'm proposing here.

Gekko said:
I believe the hell terrain tweak has to do with RAM and graphics rather than CPU processing, so that's probably why you didn't notice a difference. People with less beefy computers should notice though :)

I see :)

My main goal is to improve CPU processing performance. I can easily check how code changes affect it, and in my opinion it is the biggest performance problem in MNAI. Although I would like to avoid any RAM cost increases, given the usual RAM vs CPU conflict in increasing performance some future performance changes may increase RAM consumption. It is not the case for the change I'm proposing right now though, and I don't see the need for huge caches in any other of the problematic methods either.
 
That sounds like a decent plan. Just be sure that you keep the bUpgrade variable in AI_unitValue - it has some important overrides.
 
Given the amount of AI additions from other mods and tweaks made to More Naval AI, I would have to check any "external" changes very thoroughly to make sure that they don't conflict with MNAI changes. That said, I have checked this method in Rise from Erebus, Realism Invictus and Caveman2Cosmos (I was not able to find K-MOD source code for the dll, please direct me to it if you know where to find it). I did not see any changes similar to what I'm proposing here.

Yep it is a pain to look at two mods with very different source code. :sad:

K-Mod Github link

I lurk on K-mod's thread, and there was some discussion about merging the K-mod efficiency changes just the other day. They might not have done what you plan to, but there might be other worthwhile changes.

Question:
Also can you give me any help on what I would need to change to merge the K-Mod efficiency changes?
Karadoc's response:
I probably won't be able to help you with it much. About all I can tell you at the moment is that the biggest speed gains have come from the changes in CvPlayerAI::AI_unitUpdate, combined with some changes to the return value in CvSelectionGroupAI::AI_update (it now returns false in a few places where it used to return true, and these are important for the speed gains).
Also in the same thread from Cruel who works on Realism Invictus:
Dave, I think it's worth you looking at our SVN. Our programmers have done the merge and changed some details that are very important to achieve speed gains in large mods like LoR.
 
I took a look at CvPlayerAI::AI_unitValue in K-Mod and MNAI. Looks like K-Mod has a few improvements that would speed things up a little but nothing like caching. UNITAI_ATTACK_CITY in particular looked better or at least quicker.
 
That sounds like a decent plan. Just be sure that you keep the bUpgrade variable in AI_unitValue - it has some important overrides.

Okay, I'll code it when I have time. Then we'll see if it works or not :)

Yep it is a pain to look at two mods with very different source code. :sad:

Well, besides some tweaks or mod-specific stuff, AI_unitValue is mostly identical in all the mods I have checked (including K-Mod) so it wasn't that complicated in this particular case. This similarity makes me wonder why no one attempted this before; maybe I'm following a dead end.

I lurk on K-mod's thread, and there was some discussion about merging the K-mod efficiency changes just the other day. They might not have done what you plan to, but there might be other worthwhile changes.

Thanks for the link and for the pointers. I'll check those when I'm done with AI_unitValue. They seem quite promising, especially the AI_unitUpdate change.

Also in the same thread from Cruel who works on Realism Invictus:

It's good to know that. Once I start looking into K-Mod I'll check the RI tweaks to the K-Mod changes too. It may be a while until I reach that point, though :)
 
I had an idea about the cache, but it might already be how you are planning on doing it. The idea is this - AI_unitValue first checks the cache for a match on the (UnitTypes, UnitAITypes) pair. If it exists in the cache already then it uses the value returned from the cache. If it does not exist in the cache yet then the value is calculated and stored in the cache for future use.

Waiting to calculate and cache values until they are actually asked for would reduce the size of the cache because values that weren't needed wouldn't be taking up space.
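The lookup-then-fill idea could be sketched like this. The class name, the int stand-ins for UnitTypes and UnitAITypes, and the placeholder computeStaticValue are all hypothetical; the compute counter exists only to show that a repeated request does not recalculate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

class LazyUnitValueCache {
    typedef std::pair<int, int> Key;   // (unit type, unit AI type)
    typedef std::map<Key, int> Cache;

public:
    LazyUnitValueCache() : m_computeCount(0) {}

    // Return the cached value, computing and storing it on first request.
    int get(int eUnit, int eUnitAI) {
        assert(eUnit >= 0 && eUnitAI >= 0);
        const Key key(eUnit, eUnitAI);
        Cache::const_iterator it = m_cache.find(key);
        if (it != m_cache.end()) {
            return it->second;  // hit: no recalculation
        }
        const int value = computeStaticValue(eUnit, eUnitAI);
        m_cache.insert(std::make_pair(key, value));
        return value;
    }

    // Drop everything, e.g. when the inputs of the static part change.
    void clear() { m_cache.clear(); }

    std::size_t size() const { return m_cache.size(); }
    int computeCount() const { return m_computeCount; }

private:
    // Placeholder for the expensive static part of AI_unitValue.
    int computeStaticValue(int eUnit, int eUnitAI) {
        ++m_computeCount;
        return eUnit * 100 + eUnitAI;
    }

    Cache m_cache;
    int m_computeCount;
};
```

Only pairs that are actually requested take up memory, at the price of one map lookup per call.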
 
My first thought was to initialize all values in the cache from the beginning, to avoid the overhead of checking whether a value is initialized every time.

Now that you mention it, that's probably not a great idea, though. (num_players * unit_classes * ai_types) * 4 bytes is not exactly a nice amount to keep in memory at all times. I'll use your idea, thanks for the suggestion :)
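For anyone who wants to plug in the counts of their own modmod, the estimate above can be written as a small helper. The function name is hypothetical, and the 4 bytes per entry is the figure used in the formula, not a measured value.

```cpp
#include <cassert>

// Size of one cached value in bytes, as assumed in the formula above.
const long BYTES_PER_VALUE = 4;

// Memory needed if every (player, unit class, unit AI) slot is filled
// up front, instead of only the slots that are actually requested.
long denseCacheBytes(long numPlayers, long numUnitClasses, long numUnitAIs) {
    assert(numPlayers >= 0 && numUnitClasses >= 0 && numUnitAIs >= 0);
    return numPlayers * numUnitClasses * numUnitAIs * BYTES_PER_VALUE;
}
```

With the lazy approach the same formula is only an upper bound, reached if every combination is requested at least once.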

I'll have to check which is faster, STL maps or the maps provided by the completely outdated version of boost that Civilization IV uses. I'm betting on the former, given that boost is currently 21 versions ahead of "our" version.
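A minimal way to compare candidates is a templated timing loop like the one below. It is only a sketch: the function names are hypothetical, it assumes int keys, and it only needs a container with find(), end() and a const_iterator whose second member is the value, so any map type with that interface can be passed in. It uses std::clock, so the resolution is coarse and the repetition count must be large enough to get a usable number.

```cpp
#include <cassert>
#include <ctime>
#include <map>

// Time `repetitions` passes of looking up keys 0..numKeys-1 in `table`.
// The checksum forces the lookups to be used, so they are not optimized away.
template <typename MapType>
double timeLookups(const MapType& table, int numKeys, int repetitions,
                   long& checksum) {
    assert(numKeys > 0 && repetitions > 0);
    checksum = 0;
    const std::clock_t start = std::clock();
    for (int r = 0; r < repetitions; ++r) {
        for (int k = 0; k < numKeys; ++k) {
            typename MapType::const_iterator it = table.find(k);
            if (it != table.end()) {
                checksum += it->second;
            }
        }
    }
    const std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Build a std::map with keys 0..numKeys-1, each mapped to twice the key.
std::map<int, int> makeTable(int numKeys) {
    std::map<int, int> table;
    for (int k = 0; k < numKeys; ++k) {
        table[k] = k * 2;
    }
    return table;
}
```

Running the same call against each candidate container, with the same keys and in a release build of the dll, should be enough to see which one wins; the profiler results for a real turn remain the final word.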
 