Measuring Mod Performance in SDK

So you don't get a screen like this one:

Debug Window.jpg

when you hit CTRL + D?
 
I do get a screen like that one. And selecting "Show DLL profile" will give me a dllprofile.log file. But it doesn't profile the entire turn, only the last call to CvGame::update.
 
I'm looking at improving mod performance, and I figure this is the best thread to hijack.

I was looking through CvTeam code, and have a question about this section:
Code:
for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if ((GET_PLAYER((PlayerTypes)iI).getTeam() == getID()) || (GET_PLAYER((PlayerTypes)iI).getTeam() == eTeam))
		{
			GET_PLAYER((PlayerTypes)iI).updateWarWearinessPercentAnger();
		}
	}
}

for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if ((GET_PLAYER((PlayerTypes)iI).getTeam() == getID()) || (GET_PLAYER((PlayerTypes)iI).getTeam() == eTeam))
		{
			GET_PLAYER((PlayerTypes)iI).updatePlotGroups();
		}
	}
}

for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if ((GET_PLAYER((PlayerTypes)iI).getTeam() == getID()) || (GET_PLAYER((PlayerTypes)iI).getTeam() == eTeam))
		{
			GET_PLAYER((PlayerTypes)iI).updateTradeRoutes();
		}
	}
}

Is there any logical reason we are doing 3 loops for the same exact thing there?

Wouldn't it work exactly the same, except 3x faster with this?
Code:
for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if ((GET_PLAYER((PlayerTypes)iI).getTeam() == getID()) || (GET_PLAYER((PlayerTypes)iI).getTeam() == eTeam))
		{
			GET_PLAYER((PlayerTypes)iI).updateWarWearinessPercentAnger();
			GET_PLAYER((PlayerTypes)iI).updatePlotGroups();
			GET_PLAYER((PlayerTypes)iI).updateTradeRoutes();
		}
	}
}

Also, a question for the HoTK team: if I am using the CAR mod in my SDK, then any time this code comes up...

Code:
for (iI = 0; iI < MAX_PLAYERS; iI++)
{
	if (GET_PLAYER((PlayerTypes)iI).isAlive())
	{
		if (GET_PLAYER((PlayerTypes)iI).getTeam() == getID())
		{

I can replace it with this, for a small speed boost?
(Also, I will have to replace any counter variables with *iter, correct?)

Code:
for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
{
 
You'd have to check what each function does. If they did 3 separate loops, you must assume that each loop depends on the results of the previous one. War Weariness doesn't completely make sense to have influence on your plotgroups, unless during the war weariness update you can have a city go into revolt. But PlotGroups would influence available traderoutes, so they must ALL be updated before you update ANY traderoutes.
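
To make the ordering point concrete, here is a toy sketch in standard C++ (it assumes a C++11 compiler). ToyPlayer and the two update functions are made-up stand-ins, not the SDK's CvPlayer methods: phase 2 simply reads state that phase 1 writes for every player, which is the kind of dependency described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a player; not the SDK's CvPlayer.
struct ToyPlayer {
    bool wantsConnection;  // input to phase 1
    bool connected;        // written by phase 1
    int  tradePartners;    // written by phase 2
};

// Phase 1: refresh this player's own connection state.
void updatePlotGroups(ToyPlayer& p) {
    p.connected = p.wantsConnection;
}

// Phase 2: count the OTHER players that are connected right now.
void updateTradeRoutes(std::vector<ToyPlayer>& all, std::size_t self) {
    assert(self < all.size());
    int partners = 0;
    for (std::size_t j = 0; j < all.size(); ++j) {
        if (j != self && all[j].connected) {
            ++partners;
        }
    }
    all[self].tradePartners = partners;
}

// Three players who should all end up connected.
std::vector<ToyPlayer> makePlayers() {
    ToyPlayer proto = { true, false, 0 };
    return std::vector<ToyPlayer>(3, proto);
}

// Collect the phase 2 results so the two loop shapes can be compared.
std::vector<int> partnerCounts(const std::vector<ToyPlayer>& players) {
    std::vector<int> counts;
    for (std::size_t i = 0; i < players.size(); ++i) {
        counts.push_back(players[i].tradePartners);
    }
    return counts;
}

// Original shape: finish phase 1 for everyone, then run phase 2.
std::vector<int> runSeparateLoops() {
    std::vector<ToyPlayer> players = makePlayers();
    for (std::size_t i = 0; i < players.size(); ++i) {
        updatePlotGroups(players[i]);
    }
    for (std::size_t i = 0; i < players.size(); ++i) {
        updateTradeRoutes(players, i);
    }
    return partnerCounts(players);
}

// Merged shape: both phases per player in a single pass.
std::vector<int> runMergedLoop() {
    std::vector<ToyPlayer> players = makePlayers();
    for (std::size_t i = 0; i < players.size(); ++i) {
        updatePlotGroups(players[i]);
        updateTradeRoutes(players, i);  // later players are not updated yet
    }
    return partnerCounts(players);
}
```

With separate loops every player sees two partners. With the merged loop the counts come out as 0, 1 and 2, because earlier players run phase 2 before later players have finished phase 1. Whether the real functions depend on each other this way is something you would have to confirm by reading them.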
 
Hmm...

It seems that it's more difficult than I thought to improve performance. However, using the CAR as an example, wouldn't this run a bit faster?

Code:
// Note: m_aePlayerMembers only covers this team's members; the original loops also included eTeam's players.
for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
{
	GET_PLAYER(*iter).updateWarWearinessPercentAnger();
}
for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
{
	GET_PLAYER(*iter).updatePlotGroups();
}
for (std::vector<PlayerTypes>::const_iterator iter = m_aePlayerMembers.begin(); iter != m_aePlayerMembers.end(); ++iter)
{
	GET_PLAYER(*iter).updateTradeRoutes();
}
 
Keep in mind that the code that does the looping itself (iI++, iI < MAX_PLAYERS, getTeam(), getID(), etc) probably takes 0.00001% of the time that those other functions being called take. Shaving that down to 0.000003% is not saving you all that much time (microseconds at the most).

This is why profiling is the most important first step to performance tuning. You want to measure the parts that take the largest amounts of time and focus your efforts there.
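
As a rough illustration of "measure first", here is a minimal scope-based timer in standard C++ (it needs a C++11 compiler for <chrono>). This is not the SDK's built-in profiler. ScopedTimer, profileTable and expensiveWork are hypothetical names for this sketch only.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical accumulator; one entry per named code section.
struct ProfileEntry {
    double totalMs;
    int    calls;
};

// Global table of timings, created on first use.
std::map<std::string, ProfileEntry>& profileTable() {
    static std::map<std::string, ProfileEntry> table;
    return table;
}

// Starts timing on construction and records the elapsed time on destruction,
// so every exit path out of the enclosing scope is measured.
class ScopedTimer {
public:
    explicit ScopedTimer(const std::string& name)
        : m_name(name), m_start(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - m_start;
        assert(elapsed.count() >= 0.0);  // steady_clock never goes backwards
        ProfileEntry& entry = profileTable()[m_name];  // zero-initialized on first use
        entry.totalMs += elapsed.count();
        entry.calls += 1;
    }

private:
    std::string m_name;
    std::chrono::steady_clock::time_point m_start;
};

// Hypothetical workload, instrumented with a single line at the top.
long expensiveWork(int n) {
    ScopedTimer timer("expensiveWork");
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += i % 7;
    }
    return sum;
}
```

After a run, sorting profileTable() by totalMs shows where the time actually goes, and the call count tells you whether a function is slow per call or just called very often.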

And as xienwolf points out, when you do, you have to analyze your changes to ensure you aren't changing the logic of the code.

As for iterating over the members iterator versus looping over all players and checking for which are members, yes that will be faster and should be functionally identical. I suspect that the function that checks if a player is a member of a team iterates over the members array looking for a matching ID itself. Again, however, you're saving a tiny amount of time. You won't notice it unless it's run thousands of times per turn.
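
Here is a small sketch of the two approaches side by side, using made-up types (ToyPlayer and ToyTeam are not SDK classes). It assumes the cached list is maintained by explicit add and remove calls. How the real m_aePlayerMembers is kept up to date, and whether it excludes dead players, is something to check in the CAR code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a player slot.
struct ToyPlayer {
    bool alive;
    int  team;
};

// Original pattern: scan every slot and filter by alive + team.
std::vector<int> membersByScan(const std::vector<ToyPlayer>& players, int team) {
    std::vector<int> result;
    for (std::size_t i = 0; i < players.size(); ++i) {
        if (players[i].alive && players[i].team == team) {
            result.push_back(static_cast<int>(i));
        }
    }
    return result;
}

// Cached pattern: the team keeps its own member list.
class ToyTeam {
public:
    void addMember(int playerId) {
        assert(playerId >= 0);
        m_members.push_back(playerId);
    }

    // Must be called whenever a player dies or leaves the team,
    // otherwise the cache and the scan disagree.
    void removeMember(int playerId) {
        m_members.erase(
            std::remove(m_members.begin(), m_members.end(), playerId),
            m_members.end());
    }

    const std::vector<int>& members() const { return m_members; }

private:
    std::vector<int> m_members;
};
```

The two only stay functionally identical while the cache is kept in sync. Note that the iterator loop drops the isAlive() check from the original, so a stale or dead entry in the list would be processed where the scan would have skipped it.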
 
Okay, so that is profiling the main functions. If I wanted to profile my own functions, what would I have to change? Where do the profiler logs get saved? I haven't ever seen them, as far as I know, so there must be some option I haven't enabled...
 
Wow, my profiler is very revealing. It's pretty obvious which function is eating up lots of time:
[882186.438] DBG: Total Frame MS: 698.0 FPS: 001 Min:143 Max:000 Avg:001 SampleFilter:1.010000
Time : Ave : Min% : Max% : Num : Profile Name
-----------------------------------------------------
697.0 : 000.0 : 000.0 : 000.0 : 001 : CvGame::doTurn()
003.0 : 000.0 : 000.0 : 000.0 : 006 : CvTeam::doTurn()
009.0 : 000.0 : 000.0 : 000.0 : 001 : CvMap::doTurn()
009.0 : 000.0 : 000.0 : 000.0 : 1024 : CvPlot::doTurn
007.0 : 000.0 : 000.0 : 000.0 : 1024 : CvPlot::doFeature()
663.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayer::doTurnUnits
277.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayerAI::AI_doTurnUnitsPre
277.0 : 039.7 : 000.6 : 039.7 : 001 : CvPlayerAI::AI_updateFoundValues
236.0 : 033.8 : 005.8 : 033.8 : 488 : CvPlayerAI::AI_baseBonusVal
236.0 : 033.8 : 005.8 : 033.8 : 012 : CvPlayerAI::AI_baseBonusVal::recalculate
013.0 : 001.9 : 000.1 : 001.9 : 2528 : CvPlayer::canTrain
13595.0 : 000.0 : 000.0 : 000.0 : 2463 : CvCity::canTrain
385.0 : 000.0 : 000.0 : 000.0 : 001 : CvPlayerAI::AI_doTurnUnitsPost
008.0 : 000.0 : 000.0 : 000.0 : 006 : CvUnitAI::AI_upgrade
002.0 : 000.0 : 000.0 : 000.0 : 1275 : CvPlayer::calculateScore
--------------------------------------------------
 
For fun, I commented out the Python callback code in CvCity::canTrain() and the time went from 13K+ ms to 1.3K ms.

That's 10x faster. Stupid Python callbacks. I bet if I could remove every single one, I could increase my speed by at least 5-10%.
 
Why not just set the callback in the XML to 0? No need to comment out the code. Or is that callback non-optional?
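
For anyone following along, this is roughly what gating a callback behind a config flag looks like, as a toy sketch. CallbackConfig, scriptAllowsTraining and this canTrain are hypothetical names, not the SDK's actual code, and the "script" is just a function that counts its own calls.

```cpp
#include <cassert>

// Hypothetical config value, standing in for a flag loaded from XML.
struct CallbackConfig {
    bool useCanTrainCallback;
};

// Counts how often the (pretend) expensive script call is made.
int g_scriptCalls = 0;

// Stand-in for a slow scripted check; here it vetoes unit type 3.
bool scriptAllowsTraining(int unitType) {
    ++g_scriptCalls;
    return unitType != 3;
}

// The script is only consulted when the flag is on.
bool canTrain(const CallbackConfig& config, int unitType) {
    assert(unitType >= 0);
    if (config.useCanTrainCallback && !scriptAllowsTraining(unitType)) {
        return false;  // script vetoed this unit
    }
    return true;  // native rule checks would go here
}
```

With the flag off the script is never called, which is where the speed comes from. It also means unit type 3 is no longer blocked, so turning the flag off is only safe if nothing relies on the script's veto.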
 
No, those callbacks are used. I just need to go and develop non-python methods of doing exactly the same thing. I believe it's used for inquisitors.
 
If you could do that, it would be awesome! :goodjob: One thing about canTrain... those Inquisitors usually have a limit for each civ - like 3 each (similar to Executives). Would it be possible to instruct the AI to not even consider building units, when the limit has been reached?
 
Please take a look at the RevDCM forum. Phungus420 is in the middle of changing inquisitors to be soft-coded, i.e. controlled by XML flags plus SDK code. This would remove the inquisitor parts of CvGameUtils.py, including canTrain.
 
It's already done, and the code is posted.
 
Indeed, I already have the code in my sources.

As for RoM though, it has a total of 5 callbacks, which is what has been slowing it down so much.... I'm working on softcoding them all.
 
Brilliant! :goodjob: I also need to look at cloning that OR Array... :)
 
So you don't get a screen like this one:

View attachment 230873

when you hit CTRL + D?

I have the same question as Nor Me: what options on that page do I have to enable so I get to see more than just a handful of lines?
..Nvm, that seems to be yet another side effect of AIAutoplay. If I end the turn normally, I get to see more.
 