C++ Speed Question

General Tso

In the process of improving the AI in my mod I added the following function.

Code:
bool CvUnit::isPhaseActive() const
{
	if (canMove())
	{
		CvPlayer& pPlayer = GET_PLAYER(getOwnerINLINE());
		if (getStartPhase(pPlayer.getPhaseMode()) != NO_PHASE)
		{
			if (getStartPhase(pPlayer.getPhaseMode()) <= pPlayer.getPhaseType())
			{
				return true;
			}
		}
	}

	return false;
}

It is called at least 16 times per turn for each AI unit, plus at least once for each group performing an AutoMission. Needless to say, it's getting called many times per turn.

If I were to change the function so that the player object is passed in, as below:

Code:
bool CvUnit::isPhaseActive(CvPlayer* pPlayer) const
{
	if (canMove())
	{
		// pPlayer is a pointer here, so members are accessed with -> rather than .
		if (getStartPhase(pPlayer->getPhaseMode()) != NO_PHASE)
		{
			if (getStartPhase(pPlayer->getPhaseMode()) <= pPlayer->getPhaseType())
			{
				return true;
			}
		}
	}

	return false;
}

Would that improve the speed at all?
 
General advice for speeding up code: profile first, tweak later. I am sure that your change would save a few microseconds. However, if this code is not among the top hundred contributors to overall runtime, the change will not really be noticed. An awful lot of CPU time is taken up by graphics, overall turn logistics, etc., which you cannot really do anything about. I recommend looking at some real profile results to see whether the function you mention accounts for much of the total runtime.
 
Thanks davidladden. How does one go about profiling? I have a vague idea that it might involve using something like PROFILE("CvGame::update"); and compiling a debug DLL. Unfortunately I'm not familiar with either.
 
I haven't done it myself. Afforess and EF have done a lot, as discussed in nearby threads. Maybe search for "profile" in this sub-forum?
 
Thanks, will do.
 
Short answer: no.

If you're passing in a pointer to the player object, presumably you'd have had to call 'GET_PLAYER(getOwnerINLINE())' somewhere further up the call chain to get the pointer in the first place, so it won't make any difference whatsoever.

It's not an expensive lookup, anyway -- only a few assembly instructions and the function calls are both inlined. I doubt you'd save a whole lot even if you were calling it tens of thousands of times per turn. As David said, though, profile first.
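
To illustrate the point, here is a minimal sketch, assuming the lookup boils down to an index into a fixed array of players. Player, kMaxPlayers and getPlayer are made-up stand-ins for this example, not SDK names. Both functions return the same result; the second just does the lookup once instead of once per unit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the player class; NOT the real CvPlayer.
struct Player {
    int phaseMode = 0;
    int phaseType = 0;
};

constexpr std::size_t kMaxPlayers = 18; // assumed size, for illustration only

// Assumed shape of an inlined lookup: a bounds check plus one array index.
inline Player& getPlayer(std::array<Player, kMaxPlayers>& players, std::size_t id) {
    assert(id < players.size());
    return players[id];
}

// Looks the player up again for every unit.
inline int sumPhasesLookupEachTime(std::array<Player, kMaxPlayers>& players,
                                   std::size_t owner, int unitCount) {
    int total = 0;
    for (int i = 0; i < unitCount; ++i) {
        total += getPlayer(players, owner).phaseType;
    }
    return total;
}

// Looks the player up once and reuses the reference for every unit.
inline int sumPhasesHoisted(std::array<Player, kMaxPlayers>& players,
                            std::size_t owner, int unitCount) {
    const Player& kPlayer = getPlayer(players, owner);
    int total = 0;
    for (int i = 0; i < unitCount; ++i) {
        total += kPlayer.phaseType;
    }
    return total;
}
```

With the lookup inlined, an optimizing compiler may well generate nearly the same code for both versions, which is another reason to measure before changing anything.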
 
Only one player has control of a given turn slice, so the same player object calls isPhaseActive() for all of its units. It's most efficient that way because the unit id values are stored in the player object.
 
OK, but you're still talking about a statement that does a single array index lookup and nothing else. If your function is being called enough that this makes a noticeable difference, there may well be better things to optimise. Having just this function's code doesn't really give enough information. What do 'getStartPhase', 'getPhaseMode' and 'getPhaseType' do, for example?
 
If I understand the original game code correctly (and I may not), it works like this. CvGame::update is called once every turn slice from the executable. The human player's turn is processed over multiple turn slices until they end their turn. Then all of the AI players get processed during one turn slice. I never tested this with a very large game, but that's how it works with a few hundred units on the map. I changed this for my mod. Now each AI player gets multiple turn slices. I've broken their turn into Phases (4 in the original post, but I changed it to 3). Each phase has 4 modes: Recon, Ranged Attack, Combat, and Movement. So the AI now gets 12 turn slices to do its turn. I then added fields to the Civ4UnitInfos file to indicate when the AI will start a given mode for that unit. The attached picture might make it clearer. Note: -1 is NO_PHASE and means the unit never gets called for that mode.

So the AI player gets 12 chances per turn to control a unit, in this order: Phase 1 Recon, Phase 1 Ranged, Phase 1 Combat, Phase 1 Move. Then the same for Phase 2, then Phase 3. In the first turn slice for the AI, all Recon Units and Fighters get called for recon missions. On the second pass, all unlimbered towed artillery gets called for ranged attack missions. On the third pass, fighters get called for combat missions, and so on. The advantage of this is that the AI now uses its units in a more intelligent manner. Fighter sweeps happen before bomber attacks. Artillery bombards before armor advances. As I mentioned, the default starting phases for a unit are located in Civ4UnitInfos. However, once a unit is created, its starting phases can be changed in Python or the SDK, so units don't have to be 100% predictable. As a bonus, since the AI is called over multiple turn slices, the game doesn't "freeze" during the AI's turn; you can scroll the map around and look at things.

So to finally answer your question. Once every AI turn slice my mod cycles through all of the units for one player calling isPhaseActive. If the phase is active CvUnit::AI_Update gets called.
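
For anyone who finds code easier to follow than the description, here is a rough standalone sketch of that cycle. It is not the mod's actual code: the Unit struct, the 0-based phase numbering and countUpdates are simplifications made up for this example, and all 12 slices are walked in one function instead of one slice per CvGame::update call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins modelled on the description above, not the mod's real types.
enum ModeTypes { MODE_RECON, MODE_RANGED, MODE_COMBAT, MODE_MOVE, NUM_MODES };
const int NO_PHASE = -1;   // the unit is never called for that mode
const int NUM_PHASES = 3;  // phases are numbered from 0 here (an assumption)

struct Unit {
    int startPhases[NUM_MODES]; // first phase in which each mode becomes active
    bool canMove;

    // Same test as isPhaseActive, with the player's mode and phase passed in.
    bool isPhaseActive(ModeTypes mode, int phase) const {
        assert(mode >= 0 && mode < NUM_MODES);
        if (!canMove) {
            return false;
        }
        const int start = startPhases[mode];
        return start != NO_PHASE && start <= phase;
    }
};

// Walks the 12 slices in order (phase by phase, mode by mode) and counts
// how many times a unit would be handed to the AI.
inline int countUpdates(const std::vector<Unit>& units) {
    int updates = 0;
    for (int phase = 0; phase < NUM_PHASES; ++phase) {
        for (int mode = 0; mode < NUM_MODES; ++mode) {
            for (const Unit& unit : units) {
                if (unit.isPhaseActive(static_cast<ModeTypes>(mode), phase)) {
                    ++updates; // the mod would call CvUnit::AI_Update here
                }
            }
        }
    }
    return updates;
}
```

A unit whose Recon start phase is the first phase and whose Move start phase is the second would be counted for Recon in all three phases and for Move in the last two.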

The three functions you asked about are pretty straightforward. Here they are.

Code:
PhaseTypes CvUnit::getStartPhase(ModeTypes mode) const
{
	return (PhaseTypes)m_iStartPhases[mode];
}

PhaseTypes CvPlayer::getPhaseType() const
{
	return (PhaseTypes)m_iPhaseType;
}


ModeTypes CvPlayer::getPhaseMode() const
{
	return (ModeTypes)m_iPhaseMode;
}

I guess my original question should have been more general, instead of about that specific function. I have some code that gets called many times over a (hopefully) short period, and I'm trying to learn what kinds of things cost a few extra CPU cycles in C++.
 

Attachments

  • Default Phases.JPG
As davidallen said, profile first. Not only does this tell you where to focus your optimization efforts, it also lets you measure your effect. If you just spent 3 hours shaving 23 microseconds off an 8.3 second turn time, you're wasting your time.
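
If you only want a rough before-and-after number while you get a real profiler set up, a scoped timer is enough. This is a generic sketch using std::chrono, not the SDK's PROFILE macro; ScopedTimer and timeWork are names invented for this example, and the loop body is a placeholder for the code being measured.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: adds the lifetime of the object, in microseconds,
// to a running total owned by the caller.
class ScopedTimer {
public:
    explicit ScopedTimer(long long& totalMicros)
        : m_total(totalMicros), m_start(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - m_start;
        m_total += std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    }

private:
    long long& m_total;
    std::chrono::steady_clock::time_point m_start;
};

// Times a placeholder workload and returns the elapsed microseconds.
// The checksum keeps the loop from being trivially discarded.
inline long long timeWork(int iterations, long long& checksum) {
    assert(iterations >= 0);
    long long micros = 0;
    {
        ScopedTimer timer(micros); // starts timing here
        for (int i = 0; i < iterations; ++i) {
            checksum += i % 7;
        }
    } // timer goes out of scope here and adds its elapsed time to micros
    return micros;
}
```

Accumulate the total over a whole AI turn and compare it with the full turn time before deciding whether the function is worth touching.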

That being said, there are usually some simple micro-optimizations that can be done such as storing return values from functions that will be used multiple times. Another is to pass in a needed value instead of having the called function call out to get it. As long as that value will definitely be used, it could be a gain.

Here it seems that the function calling isPhaseActive() probably knows what phase the player is in. Instead of the unit asking the player, the function could tell it directly.

Code:
bool CvUnit::isPhaseActive(ModeTypes ePlayerMode, PhaseTypes ePlayerPhase) const
{
	if (canMove())
	{
		PhaseTypes eUnitStartPhase = getStartPhase(ePlayerMode);
		if (eUnitStartPhase != NO_PHASE && eUnitStartPhase <= ePlayerPhase)
		{
			return true;
		}
	}

	return false;
}

The downside to doing a "premature optimization" such as this is that if most units cannot move when this function is called, you'll end up increasing the time this function takes on average, because the caller must place two values that rarely get used onto the stack before calling the function. This is why it's so important to profile first. You may make this change never realizing you've slowed down your AI!

BTW, when using a reference (&) to objects such as CvPlayer, the SDK code uses "k" as the prefix instead of "p" (for pointer).
 
Thanks for the info, it's greatly appreciated. Once I get a few things straightened out I'll look into profiling.
 