Measuring Mod Performance in SDK

xienwolf · Oct 12, 2009

So you don't get a screen like this one:

when you hit CTRL + D?

Nor Me · Oct 12, 2009

I do get a screen like that one. And selecting "Show DLL profile" will give me a dllprofile.log file. But it doesn't profile the entire turn, only the last call to CvGame::update.

Afforess · Nov 15, 2009

I'm looking at improving mod performance, and I figure this is the best thread to hijack.

I was looking through CvTeam code, and have a question about this section:

Spoiler :

Is there any logical reason we are doing 3 loops for the same exact thing there?

Wouldn't it work exactly the same, except 3x faster with this?

Spoiler :

Also, a question for the HoTK team, if I am using the CAR mod in my SDK, any time this code comes up...

Code:

for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if (GET_PLAYER((PlayerTypes)iI).getTeam() == getID())
		{

I can replace it with this, for a small speed boost?
(Also, I will have to replace any counter variables with *iter, correct?)

Code:

for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
{

xienwolf · Nov 15, 2009

You'd have to check what each function does. If they did 3 separate loops, you must assume that each loop depends on the results of the previous one. War Weariness doesn't completely make sense to have influence on your plotgroups, unless during the war weariness update you can have a city go into revolt. But PlotGroups would influence available traderoutes, so they must ALL be updated before you update ANY traderoutes.
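To make the dependency concrete, here is a minimal standalone sketch. Everything in it (ToyPlayer, updateGroup, updateRoutes and the two run functions) is made up for illustration and is not SDK code. It assumes the second phase reads the first-phase result of every player, which is the situation described above for plot groups and trade routes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a player; not the SDK's CvPlayer.
struct ToyPlayer
{
    ToyPlayer() : iPlotGroupSize(0), iTradeRoutes(0) {}
    int iPlotGroupSize; // written by phase 1
    int iTradeRoutes;   // written by phase 2
};

// Phase 1: each player recomputes only its own value.
void updateGroup(ToyPlayer& kPlayer, int iNewSize)
{
    kPlayer.iPlotGroupSize = iNewSize;
}

// Phase 2: reads the phase-1 value of EVERY player.
void updateRoutes(ToyPlayer& kPlayer, const std::vector<ToyPlayer>& kAll)
{
    int iTotal = 0;
    for (std::size_t i = 0; i < kAll.size(); ++i)
    {
        iTotal += kAll[i].iPlotGroupSize;
    }
    kPlayer.iTradeRoutes = iTotal;
}

// Two loops: phase 1 finishes for everyone before phase 2 starts.
std::vector<ToyPlayer> runSeparateLoops(std::vector<ToyPlayer> aPlayers, const std::vector<int>& aiNewSizes)
{
    assert(aPlayers.size() == aiNewSizes.size());
    for (std::size_t i = 0; i < aPlayers.size(); ++i)
    {
        updateGroup(aPlayers[i], aiNewSizes[i]);
    }
    for (std::size_t i = 0; i < aPlayers.size(); ++i)
    {
        updateRoutes(aPlayers[i], aPlayers);
    }
    return aPlayers;
}

// One fused loop: early players run phase 2 against stale phase-1 data.
std::vector<ToyPlayer> runFusedLoop(std::vector<ToyPlayer> aPlayers, const std::vector<int>& aiNewSizes)
{
    assert(aPlayers.size() == aiNewSizes.size());
    for (std::size_t i = 0; i < aPlayers.size(); ++i)
    {
        updateGroup(aPlayers[i], aiNewSizes[i]);
        updateRoutes(aPlayers[i], aPlayers);
    }
    return aPlayers;
}
```

With three players whose new sizes are 1, 2 and 3, the separate loops give every player the same total, while the fused loop gives the first two players a smaller number because the later players had not been updated yet. Merging loops is only safe when no phase reads another player's result from an earlier phase.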

Afforess · Nov 15, 2009

Hmm...

It seems that it's more difficult than I thought to improve performance. However, using the CAR as an example, wouldn't this run a bit faster?

Code:

        // Order matters: every player's plot groups must be updated
        // before any player's trade routes are recalculated.
        for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
        {
            GET_PLAYER(*iter).updateWarWearinessPercentAnger();
        }
        for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
        {
            GET_PLAYER(*iter).updatePlotGroups();
        }
        for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
        {
            GET_PLAYER(*iter).updateTradeRoutes();
        }

EmperorFool · Nov 15, 2009

Keep in mind that the code that does the looping itself (iI++, iI < MAX_PLAYERS, getTeam(), getID(), etc) probably takes 0.00001% of the time that those other functions being called take. Shaving that down to 0.000003% is not saving you all that much time (microseconds at the most).

This is why profiling is the most important first step to performance tuning. You want to measure the parts that take the largest amounts of time and focus your efforts there.

And as xienwolf points out, when you do, you have to analyze your changes to ensure you aren't changing the logic of the code.

As for iterating over the members iterator versus looping over all players and checking for which are members, yes that will be faster and should be functionally identical. I suspect that the function that checks if a player is a member of a team iterates over the members array looking for a matching ID itself. Again, however, you're saving a tiny amount of time. You won't notice it unless it's run thousands of times per turn.
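A rough standalone sketch of why the two approaches should agree, as long as the cached list is kept in sync. ToyPlayer, ToyTeam and scanForMembers are invented names for illustration, and the sketch assumes the cached list holds only living members, which you would need to verify for the actual member array in your sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types; the real SDK classes are far larger.
struct ToyPlayer
{
    ToyPlayer() : bAlive(false), iTeam(-1) {}
    bool bAlive;
    int iTeam;
};

class ToyTeam
{
public:
    explicit ToyTeam(int iID) : m_iID(iID) {}

    // Must be called whenever a player joins the team or comes alive.
    void addMember(int iPlayer)
    {
        assert(iPlayer >= 0);
        std::vector<int>::iterator it = std::lower_bound(m_aiMembers.begin(), m_aiMembers.end(), iPlayer);
        if (it == m_aiMembers.end() || *it != iPlayer)
        {
            m_aiMembers.insert(it, iPlayer); // keep the list sorted and unique
        }
    }

    // Must be called whenever a player leaves the team or dies.
    void removeMember(int iPlayer)
    {
        std::vector<int>::iterator it = std::lower_bound(m_aiMembers.begin(), m_aiMembers.end(), iPlayer);
        if (it != m_aiMembers.end() && *it == iPlayer)
        {
            m_aiMembers.erase(it);
        }
    }

    int getID() const { return m_iID; }
    const std::vector<int>& getMembers() const { return m_aiMembers; }

private:
    int m_iID;
    std::vector<int> m_aiMembers;
};

// The original pattern: visit every player slot and filter.
std::vector<int> scanForMembers(const std::vector<ToyPlayer>& aPlayers, int iTeam)
{
    std::vector<int> aiResult;
    for (std::size_t i = 0; i < aPlayers.size(); ++i)
    {
        if (aPlayers[i].bAlive && aPlayers[i].iTeam == iTeam)
        {
            aiResult.push_back(static_cast<int>(i));
        }
    }
    return aiResult;
}
```

The catch is the bookkeeping: if any code path changes a player's team or alive status without touching the cached list, the two results drift apart, and that kind of bug costs far more than the loop ever did.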

Afforess · Nov 15, 2009

Okay, so profiling comes first. If I wanted to profile my own functions, what would I have to change? And where do the profiler logs get saved? I haven't ever seen them, as far as I know, so there must be some option I haven't enabled...

EmperorFool · Nov 15, 2009

Doesn't the beginning of this thread describe how to turn it on?
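For instrumenting your own functions, the general idea is a scoped timer placed at the top of the function. The sketch below is a generic standalone version and is not the SDK's built-in profiler; ToySample, toySamples, ToyScopedTimer and toyExpensiveWork are all hypothetical names, and it uses std::clock only because that is portable.

```cpp
#include <cassert>
#include <ctime>
#include <map>
#include <string>

// Hypothetical per-function accumulator (call count and total clock ticks).
struct ToySample
{
    ToySample() : iCalls(0), iTicks(0) {}
    int iCalls;
    std::clock_t iTicks;
};

// Global table of samples, keyed by the name passed to the timer.
std::map<std::string, ToySample>& toySamples()
{
    static std::map<std::string, ToySample> s_samples;
    return s_samples;
}

// Starts timing in the constructor and records in the destructor,
// so every exit path of the enclosing scope is counted.
class ToyScopedTimer
{
public:
    explicit ToyScopedTimer(const char* szName)
        : m_szName(szName), m_iStart(std::clock())
    {
        assert(szName != 0);
    }

    ~ToyScopedTimer()
    {
        ToySample& kSample = toySamples()[m_szName];
        kSample.iCalls += 1;
        kSample.iTicks += std::clock() - m_iStart;
    }

private:
    ToyScopedTimer(const ToyScopedTimer&);            // not copyable
    ToyScopedTimer& operator=(const ToyScopedTimer&); // not assignable

    const char* m_szName;
    std::clock_t m_iStart;
};

// Example of an instrumented function (hypothetical workload).
int toyExpensiveWork(int iN)
{
    ToyScopedTimer kTimer("toyExpensiveWork");
    int iSum = 0;
    for (int i = 0; i < iN; ++i)
    {
        iSum += i % 7;
    }
    return iSum;
}
```

After a few calls, toySamples()["toyExpensiveWork"] holds the call count and accumulated ticks, which is the same kind of Time and Num information the profile screen reports per function.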

Afforess · Nov 15, 2009

EmperorFool said:
Doesn't the beginning of this thread describe how to turn it on?

Yep, my bad.

Afforess · Nov 15, 2009

Wow, my profiler is very revealing. It's pretty obvious which function is eating up lots of time:

[882186.438] DBG: Total Frame MS: 698.0 FPS: 001 Min:143 Max:000 Avg:001 SampleFilter:1.010000
Time : Ave : Min% : Max% : Num : Profile Name
-----------------------------------------------------
697.0 : 000.0 : 000.0 : 000.0 : 001 : CvGame::doTurn()
003.0 : 000.0 : 000.0 : 000.0 : 006 : CvTeam::doTurn()
009.0 : 000.0 : 000.0 : 000.0 : 001 : CvMap::doTurn()
009.0 : 000.0 : 000.0 : 000.0 : 1024 : CvPlot::doTurn
007.0 : 000.0 : 000.0 : 000.0 : 1024 : CvPlot::doFeature()
663.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayer::doTurnUnits
277.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayerAI::AI_doTurnUnitsPre
277.0 : 039.7 : 000.6 : 039.7 : 001 : CvPlayerAI::AI_updateFoundValues
236.0 : 033.8 : 005.8 : 033.8 : 488 : CvPlayerAI::AI_baseBonusVal
236.0 : 033.8 : 005.8 : 033.8 : 012 : CvPlayerAI::AI_baseBonusVal::recalculate
013.0 : 001.9 : 000.1 : 001.9 : 2528 : CvPlayer::canTrain
13595.0 : 000.0 : 000.0 : 000.0 : 2463 : CvCity::canTrain
385.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayerAI::AI_doTurnUnitsPost
008.0 : 000.0 : 000.0 : 000.0 : 006 : CvUnitAI::AI_upgrade
002.0 : 000.0 : 000.0 : 000.0 : 1275 : CvPlayer::calculateScore
--------------------------------------------------

Afforess · Nov 15, 2009

For fun, I commented out the Python callback code in CvCity::canTrain() and the time went from 13K+ ms to about 1.3K ms.

That's 10x faster. Stupid Python callbacks. I bet if I could remove every single one, I could increase my speed by at least 5-10%.

EmperorFool · Nov 15, 2009

Why not just set the callback in the XML to 0? No need to comment out the code. Or is that callback non-optional?

Afforess · Nov 15, 2009

EmperorFool said:
Why not just set the callback in the XML to 0? No need to comment out the code. Or is that callback non-optional?

No, those callbacks are used. I just need to go and develop non-python methods of doing exactly the same thing. I believe it's used for inquisitors.
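The shape I have in mind, as a standalone sketch with made-up names (ToyCallback, ToyCity and toyCanTrain are not SDK code): run the cheap native checks first, and only pay for the scripted callback when it is switched on and the cheap checks have not already rejected the unit. The rule inside the fake callback is arbitrary.

```cpp
#include <cassert>

// Hypothetical stand-in for a slow scripted callback; counts its own calls.
struct ToyCallback
{
    ToyCallback() : iCalls(0) {}

    bool cannotTrain(int iUnit)
    {
        ++iCalls;
        return iUnit == 3; // pretend the script forbids unit 3
    }

    int iCalls;
};

// Hypothetical city state used by the check.
struct ToyCity
{
    bool bCallbackEnabled; // plays the role of an on/off switch in the XML
    int iNumUnitTypes;     // cheap native rule: valid units are 0..iNumUnitTypes-1
};

bool toyCanTrain(const ToyCity& kCity, int iUnit, ToyCallback& kCallback)
{
    assert(kCity.iNumUnitTypes >= 0);

    // Cheap native checks first, so most rejections never reach the callback.
    if (iUnit < 0 || iUnit >= kCity.iNumUnitTypes)
    {
        return false;
    }

    // Only pay for the callback when it is switched on.
    if (kCity.bCallbackEnabled && kCallback.cannotTrain(iUnit))
    {
        return false;
    }

    return true;
}
```

Soft-coding a rule means moving it out of the callback and into the cheap native section, so the switch can eventually be turned off without losing the behavior.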

Ninja2 · Nov 16, 2009

If you could do that, it would be awesome! :goodjob:

One thing about canTrain... those Inquisitors usually have a limit for each civ - like 3 each (similar to Executives). Would it be possible to instruct the AI to not even consider building units, when the limit has been reached?
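Something like the following is what I mean, written as a standalone sketch rather than real SDK code. ToyUnitClass, isLimitReached and trainableCandidates are hypothetical names, and treating -1 as "no limit" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-class bookkeeping for one player.
struct ToyUnitClass
{
    ToyUnitClass(int iMax, int iCount) : iMaxInstances(iMax), iOwned(iCount) {}
    int iMaxInstances; // -1 means unlimited (assumption for this sketch)
    int iOwned;        // how many the player already has
};

// True when the player may not build any more units of this class.
bool isLimitReached(const ToyUnitClass& kClass)
{
    assert(kClass.iOwned >= 0);
    return kClass.iMaxInstances >= 0 && kClass.iOwned >= kClass.iMaxInstances;
}

// Drop capped classes up front, so the expensive per-unit evaluation
// only runs for units that could actually be built.
std::vector<int> trainableCandidates(const std::vector<ToyUnitClass>& aClasses)
{
    std::vector<int> aiCandidates;
    for (std::size_t i = 0; i < aClasses.size(); ++i)
    {
        if (!isLimitReached(aClasses[i]))
        {
            aiCandidates.push_back(static_cast<int>(i));
        }
    }
    return aiCandidates;
}
```

The AI would then only score the indices that come back, instead of asking canTrain about units it can never build.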

davidlallen · Nov 16, 2009

Please take a look at the RevDCM forum. Phungus420 is in the middle of changing inquisitors to be soft-coded, i.e., controlled by XML flags + SDK. This would remove the inquisitor parts of CvGameUtils.py, including canTrain.

phungus420 · Nov 17, 2009

davidlallen said:
Please take a look at the RevDCM forum. Phungus420 is in the middle of changing inquisitors to be soft-coded, i.e., controlled by XML flags + SDK. This would remove the inquisitor parts of CvGameUtils.py, including canTrain.

It's already done, and the code is posted.

Afforess · Nov 17, 2009

phungus420 said:
It's already done, and the code is posted.

davidlallen said:
Please take a look at the RevDCM forum. Phungus420 is in the middle of changing inquisitors to be soft-coded, i.e., controlled by XML flags + SDK. This would remove the inquisitor parts of CvGameUtils.py, including canTrain.

Indeed, I already have the code in my sources.

As for RoM though, it has a total of 5 callbacks, which is what has been slowing it down so much.... I'm working on softcoding them all.

Ninja2 · Nov 17, 2009

Brilliant! :goodjob:

I also need to look at cloning that OR Array...

Fimbulvetr · Aug 29, 2010

xienwolf said:
So you don't get a screen like this one:

View attachment 230873

when you hit CTRL + D?

I have the same question as Nor Me: what options on that page do I have to enable to see more than just a handful of lines?
...Never mind, that seems to be yet another side effect of AIAutoplay. If I end the turn normally, I get to see more.
