Measuring Mod Performance in SDK

stmartin

aka. poyuzhe
Joined
Aug 4, 2007
Messages
640
Location
Shanghai
In order to improve performance of HoTK Mod, I'm going to measure the performance (aka. time used) of each part of the game code. I think this kind of job is best done in SDK.

I could use time library of C++. I also notices in SDK there's a header file 'FProfiler.h' which seems to have something to do with performance tweaking. The only thing is I don't understand that file a tiny bit more, so I have no clue how to use it.

Anyone has some good ideas about measuring performance of Civ4? I'd love to hear that! Thanks.
 
I'm not sure if you are using python, but I have used the following approach to find performance hogs. The idea is to print the total elapsed time to a file every 10 turns. Then, by turning on or off certain features, you can see which has a runtime impact on the game. Features which have a high runtime impact can then be redesigned to be more efficient.

In python, you can add an onGameTurnEnd callback, check if (turn % 10) is zero, and then use time.strftime() to write the current time into a file.

Your mileage may vary.
 
If you're measuring Python code, check out the Timer class in BUG's BugUtil.py module. You could copy that class alone to a new module for use in your mod if it's not already based on BUG. Just change the debug() calls to CvUtil.PyPrint() for logging the times.

Code:
timer = BugUtil.Timer("Check for prereqs")
... code to time ...
timer.log()

This allows you to time long tasks. You can also measure each iteration in a loop:

Code:
timer = BugUtil.Timer('draw loop', False)  # don't start timer yet
for/while ...
	timer.start()
	... code to time ...
	timer.log()
timer.logTotal()

It prints out the time in milliseconds. You could use this to time C++ code, but you'd be timing the entire function call.

Code profiling is a little different and much better. It constantly tracks what part of code is being executed as it runs and then shows you where most time is spent. You don't normally need to add anything to your code to profile it except the profiling tool itself.

I suspect that FProfile.h is actually just a timer like BUG's because wherever I see it being used it simply starts it at the beginning of a function and stops it at the end. This is typical timer stuff and doesn't tell you exactly where in the function the time is taken. It can narrow down to which function on a whole is costing the most time, however. That's sometimes enough information to allow you to make a good performance fix.
 
Yes, the Profiler (FProfiler.h) allows you to use in-built timing features to check DLL performance. At the start of many functions you'll notice a line saying PROFILE_FUNC(); or similar but with a text key in it. These are where the profiler hooks in, and thus the functions that get timed.


When you compile, there are certain things that are DEFINED for the compile process, and you'll run into statements in the code saying "ifdef XXXX_XXX" and later something which ends the ifdef statement, or does an else for it. Essentially all of FProfiler.h is wrapped in a massive ifdef, so if you activate that define (normally done in a Release (not Final Release) build type) then your profiler is enabled.

You can also force the define by placing early in FProfiler.h a statement which defines exactly what the ifdef checks for, or by removing the ifdef so that it ALWAYS does it. Then Profiler is always enabled. But that slows down your mod a tad bit for the users who don't care to profile things and also increases the size of the DLL, and thus your download size.


To use the profiler... I think it saves in your Logs folder. You might need to have chipotle enabled and go into the debug menu (CTRL+D) and click on the "enable profiler" checkbox as well to trigger things off. I know that doing that without a debug DLL (which has the profiler enabled) tends to cause an instant crash, so it does require the profiler to be turned on to function properly.
 
Thanks EF and Xienwolf, as always. I compiled a Debug DLL, enabled profile system, got some result. However, the result confused me.

Some of the functions, such as CvGame::update(), only start to appear in log after about 50+ turns, and will not always appear each turn after that. Is there a way to make it appear every turn?

The result has some strange points. Such as CvPlayerAI::AI_getAttitudeVal() got called 40000+ times and takes 20+ seconds per turn. Also sometimes CvPlot::updatePlotGroup(player) got called 5 digit times and took 10+ seconds per turn, but sometimes it only got called single digit time and took less than 0.001 second.

I've pasted the log result for one turn in a HoTK Mod scenario game. If anyone's interested in interpreting the result, I'm all ears.

I've also attached a more complete version of the log.

Spoiler :
--------------------------------------------------
[15160.156] DBG: Total Frame MS: 41055.0 FPS: 000 Min:015 Max:000 Avg:000 SampleFilter:0.010000
Time : Ave : Min% : Max% : Num : Profile Name
-----------------------------------------------------
25605.0 : 062.4 : 057.5 : 067.6 : 024 : CvPlayer::doTurn
003.0 : 000.0 : 000.0 : 000.0 : 127 : CvPlayer::hasBusyUnit
894.0 : 002.2 : 001.7 : 002.7 : 024 : CvPlayerAI::AI_doTurnPre
007.0 : 000.0 : 000.0 : 000.0 : 702 : CvPlayer::canDoCivics
668.0 : 001.6 : 001.6 : 001.6 : 254 : CvPlayerAI::AI_civicValue
001.0 : 000.0 : 000.0 : 000.0 : 515 : CvCityAI::AI_yieldMultiplier
21560.0 : 052.5 : 000.0 : 054.1 : 40594 : CvPlayerAI::AI_getAttitudeVal
060.0 : 000.1 : 000.0 : 001.3 : 186 : CvPlayerAI::AI_calculateStolenCityRadiusPlots
052.0 : 000.1 : 000.0 : 000.5 : 39006 : CvTeam::countNumCitiesByArea
7851.0 : 019.1 : 001.0 : 026.1 : 105 : CvCityAI::AI_assignWorkingPlots
006.0 : 000.0 : 000.0 : 000.2 : 212 : CvPlot::updateSymbolDisplay
021.0 : 000.1 : 000.0 : 000.1 : 105 : CvCityAI::AI_updateSpecialYieldMultiplier
068.0 : 000.2 : 000.1 : 000.2 : 2372 : CvCityAI::angryPopulation
7770.0 : 018.9 : 000.9 : 025.9 : 677 : CvCityAI::AI_addBestCitizen
012.0 : 000.0 : 000.0 : 000.0 : 782 : CvCityAI::AI_avoidGrowth
015.0 : 000.0 : 000.0 : 000.1 : 782 : CvCityAI::AI_ignoreGrowth
015.0 : 000.0 : 000.0 : 000.0 : 642 : CvCityAI::AI_foodAvailable
032.0 : 000.1 : 000.0 : 000.1 : 1015 : CvCityAI::AI_specialistValue
8320.0 : 020.3 : 000.8 : 027.8 : 9865 : CvCityAI::AI_plotValue
285.0 : 000.7 : 000.0 : 000.7 : 79668 : CvPlot::calculateImprovementYieldChange
4827.0 : 011.8 : 010.6 : 015.7 : 103 : CvCity::doTurn()
4289.0 : 010.4 : 009.5 : 014.6 : 103 : CvCityAI::AI_doTurn
094.0 : 000.2 : 000.0 : 000.3 : 28975 : CvPlayerAI::AI_plotTargetMissionAIs
2022.0 : 004.9 : 001.5 : 006.0 : 103 : CvCityAI::AI_updateBestBuild
727.0 : 001.8 : 001.1 : 001.8 : 1825 : CvCityAI::AI_bestPlotBuild
162.0 : 000.4 : 000.1 : 000.4 : 78707 : CvPlayer::canBuild
413.0 : 001.0 : 000.2 : 008.2 : 1132 : CvPlayerAI::AI_baseBonusVal
413.0 : 001.0 : 000.2 : 008.2 : 239 : CvPlayerAI::AI_baseBonusVal::recalculate
019.0 : 000.0 : 000.0 : 000.5 : 14232 : CvPlot::isCoastalLand
184.0 : 000.4 : 000.0 : 000.5 : 33609 : CvPlayer::getBestRoute
001.0 : 000.0 : 000.0 : 000.0 : 624 : CvPlayer::canTrain
004.0 : 000.0 : 000.0 : 000.0 : 111 : CvPlayerAI::AI_getOurPlotStrength
377.0 : 000.9 : 000.0 : 001.6 : 16313 : CvUnit::maxCombatStr
129.0 : 000.3 : 000.3 : 000.4 : 103 : CvCityAI::AI_doDraft
021.0 : 000.1 : 000.0 : 000.1 : 104 : CvCityAI::AI_doHurry
4264.0 : 010.4 : 000.0 : 012.4 : 72644 : CvPlayerAI::AI_getPlotDanger
018.0 : 000.0 : 000.0 : 002.3 : 103 : CvCityAI::AI_doEmphasize
005.0 : 000.0 : 000.0 : 000.0 : 3087 : CvPlot::isPotentialCityWorkForArea
023.0 : 000.1 : 000.0 : 001.8 : 3082 : CvPlot::setCulture
002.0 : 000.0 : 000.0 : 000.0 : 546 : CvPlot::doImprovement
012.0 : 000.0 : 000.0 : 000.0 : 031 : CvPlayer::countUnimprovedBonuses
011.0 : 000.0 : 000.0 : 002.6 : 012 : CvPlayer::initUnit
001.0 : 000.0 : 000.0 : 000.2 : 106 : CvPlot::updateMinimapColor
362.0 : 000.9 : 000.6 : 001.4 : 021 : CvCityAI::AI_chooseProduction
017.0 : 000.0 : 000.0 : 000.1 : 11691 : CvPlot::isAdjacentToArea
008.0 : 000.0 : 000.0 : 000.8 : 10754 : CvTeam::isHuman
003.0 : 000.0 : 000.0 : 000.0 : 006 : CvPlayerAI::AI_countNumAreaHostileUnits
243.0 : 000.6 : 000.5 : 000.7 : 101 : CvPlayerAI::AI_getTotalFloatingDefendersNeeded
418.0 : 001.0 : 000.8 : 001.2 : 190 : CvPlayerAI::AI_getTotalAreaCityThreat
477.0 : 001.2 : 001.1 : 001.4 : 1458 : CvCityAI::AI_cityThreat
584.0 : 001.4 : 000.0 : 002.1 : 1281 : CvPlayerAI::AI_getAttitude
180.0 : 000.4 : 000.4 : 000.4 : 253 : CvCityAI::AI_buildingValueThreshold
012.0 : 000.0 : 000.0 : 002.0 : 055 : CvPlot::updateSymbols
11916.0 : 029.0 : 020.0 : 029.4 : 024 : CvPlayerAI::AI_doTurnPost
11910.0 : 029.0 : 020.0 : 029.4 : 023 : CvPlayerAI::AI_doDiplo
062.0 : 000.2 : 000.1 : 000.2 : 074 : CvTeamAI::AI_declareWarTrade
050.0 : 000.1 : 000.0 : 000.1 : 087 : CvPlayerAI::AI_bonusTrade
120.0 : 000.3 : 000.2 : 000.3 : 070 : CvTeamAI::AI_openBordersTrade
113.0 : 000.3 : 000.2 : 001.9 : 520 : CvPlayerAI::AI_playerCloseness
134.0 : 000.3 : 000.2 : 001.9 : 159 : CvCityAI::AI_cachePlayerCloseness
027.0 : 000.1 : 000.0 : 001.0 : 013 : CvTeamAI::AI_teamCloseness
935.0 : 002.3 : 002.1 : 029.1 : 069 : CvTeamAI::AI_surrenderTrade
10305.0 : 025.1 : 000.0 : 025.8 : 6570 : CvTeamAI::AI_techTrade
006.0 : 000.0 : 000.0 : 000.0 : 001 : CvTeamAI::AI_declareWarTradeVal
008.0 : 000.0 : 000.0 : 000.0 : 009 : CvTeamAI::AI_mapTrade
006.0 : 000.0 : 000.0 : 000.0 : 023 : CvPlayerAI::AI_doSplit
2531.0 : 006.2 : 005.9 : 042.8 : 024 : CvPlayer::doTurnUnits
548.0 : 001.3 : 000.1 : 005.5 : 024 : CvPlayerAI::AI_doTurnUnitsPre
450.0 : 001.1 : 000.1 : 005.5 : 024 : CvPlayerAI::AI_updateFoundValues
040.0 : 000.1 : 000.0 : 000.4 : 3246 : CvMap::findCity
032.0 : 000.1 : 000.0 : 010.0 : 425 : CvSelectionGroup::doTurn()
021.0 : 000.1 : 000.0 : 009.8 : 467 : CvUnit::doTurn()
1949.0 : 004.7 : 004.7 : 041.4 : 024 : CvPlayerAI::AI_doTurnUnitsPost
103.0 : 000.3 : 000.2 : 014.8 : 472 : CvUnitAI::AI_promote
1589.0 : 003.9 : 003.9 : 034.4 : 615 : CvUnitAI::AI_upgrade
10026.0 : 024.4 : 015.6 : 026.2 : 040 : CvPlayerAI::AI_unitUpdate
9993.0 : 024.3 : 015.5 : 026.2 : 636 : CvSelectionGroupAI::AI_update
9970.0 : 024.3 : 015.4 : 026.2 : 365 : CvUnitAI::AI_update
025.0 : 000.1 : 000.1 : 000.1 : 011 : CvUnitAI::AI_assaultSeaMove
002.0 : 000.0 : 000.0 : 000.0 : 234 : CvPlayerAI::AI_unitTargetMissionAIs
070.0 : 000.2 : 000.1 : 000.5 : 357 : CvUnitAI::AI_group
123.0 : 000.3 : 000.3 : 000.3 : 542523 : CvUnitAI::AI_plotValid
017.0 : 000.0 : 000.0 : 000.0 : 046 : CvUnitAI::AI_pickup
494.0 : 001.2 : 000.9 : 001.4 : 1509 : CvCityAI::AI_neededDefenders
461.0 : 001.1 : 000.7 : 001.3 : 1263 : CvCityAI::AI_isDefended
1984.0 : 004.8 : 004.8 : 004.8 : 063 : CvUnitAI::AI_retreatToCity
854.0 : 002.1 : 000.9 : 003.4 : 386 : CvSelectionGroup::pushMission
001.0 : 000.0 : 000.0 : 000.0 : 391 : CvSelectionGroup::canStartMission
176.0 : 000.4 : 000.4 : 000.4 : 001 : CvUnitAI::AI_exploreSeaMove
457.0 : 001.1 : 001.1 : 003.6 : 246 : CvUnitAI::AI_anyAttack
366.0 : 000.9 : 000.9 : 000.9 : 085 : CvUnitAI::AI_pillageRange
905.0 : 002.2 : 002.2 : 002.2 : 028 : CvUnitAI::AI_exploreRange
853.0 : 002.1 : 002.1 : 002.1 : 14476 : AI_exploreRange 1
022.0 : 000.1 : 000.1 : 000.1 : 84920 : AI_exploreRange 2
805.0 : 002.0 : 002.0 : 002.0 : 7918 : AI_exploreRange 3
768.0 : 001.9 : 001.9 : 001.9 : 7823 : AI_exploreRange 4
9135.0 : 022.3 : 013.6 : 024.0 : 27623 : CvSelectionGroup::generatePath()
206.0 : 000.5 : 000.5 : 000.6 : 112300 : pathAdd
292.0 : 000.7 : 000.7 : 001.2 : 27623 : pathDestValid
051.0 : 000.1 : 000.1 : 000.1 : 007 : AI_exploreRange 5
118.0 : 000.3 : 000.3 : 000.3 : 219 : CvSelectionGroup::groupMove
373.0 : 000.9 : 000.9 : 000.9 : 018 : CvUnitAI::AI_collateralMove
2025.0 : 004.9 : 001.4 : 004.9 : 427 : CvUnitAI::AI_guardCity
228.0 : 000.6 : 000.5 : 000.6 : 111 : CvUnitAI::AI_cityDefenseMove
001.0 : 000.0 : 000.0 : 000.0 : 231 : CvUnitAI::AI_load
107.0 : 000.3 : 000.0 : 000.3 : 046 : CvUnitAI::AI_guardCityMinDefender
308.0 : 000.8 : 000.4 : 000.8 : 018 : CvUnitAI::AI_cityDefenseExtraMove
135.0 : 000.3 : 000.0 : 000.3 : 227 : CvUnitAI::AI_heal
438.0 : 001.1 : 000.0 : 001.7 : 024 : CvUnitAI::AI_heroUnitDefendMove
409.0 : 001.0 : 000.2 : 001.0 : 025 : CvUnitAI::AI_heroUnitOffendMove
1276.0 : 003.1 : 003.1 : 003.1 : 048 : CvUnitAI::AI_attackMove
454.0 : 001.1 : 001.1 : 001.1 : 048 : CvUnitAI::AI_attackMove() 1
095.0 : 000.2 : 000.2 : 000.4 : 017 : CvUnitAI::AI_reserveMove
001.0 : 000.0 : 000.0 : 000.0 : 001 : CvUnitAI::AI_settlerSeaMove
007.0 : 000.0 : 000.0 : 000.0 : 042 : CvUnitAI::AI_pickupStranded
002.0 : 000.0 : 000.0 : 000.3 : 018 : CvGame::calculateSyncChecksum
37183.0 : 090.6 : 000.0 : 094.5 : 018 : CvGame::update
123.0 : 000.3 : 000.0 : 019.9 : 1584 : CvPlayer::calculateScore
033.0 : 000.1 : 000.1 : 000.1 : 001 : CvUnitAI::AI_reconSpy
001.0 : 000.0 : 000.0 : 000.0 : 047 : CvSelectionGroup::getBestBuildRoute
001.0 : 000.0 : 000.0 : 000.0 : 011 : CvUnit::healRate
1214.0 : 003.0 : 003.0 : 003.0 : 030 : CvUnitAI::AI_counterMove
154.0 : 000.4 : 000.4 : 000.4 : 065 : CvUnitAI::AI_cityAttack
048.0 : 000.1 : 000.1 : 000.1 : 013 : CvUnitAI::AI_attackSeaMove
001.0 : 000.0 : 000.0 : 000.0 : 027 : CvUnitAI::AI_shadow
094.0 : 000.2 : 000.2 : 000.2 : 051 : CvUnitAI::AI_protect
093.0 : 000.2 : 000.2 : 000.2 : 066 : CvUnitAI::AI_patrol
009.0 : 000.0 : 000.0 : 000.3 : 019 : CvUnit::kill
1855.0 : 004.5 : 004.5 : 004.5 : 002 : CvUnitAI::AI_workerSeaMove
019.0 : 000.0 : 000.0 : 000.0 : 016 : CvUnitAI::AI_improveBonus
085.0 : 000.2 : 000.0 : 5896.0 : 752 : CvPlot::updatePlotGroup(Player)
087.0 : 000.2 : 000.2 : 000.2 : 010 : CvUnitAI::AI_workerMove
001.0 : 000.0 : 000.0 : 000.0 : 027 : CvPlayerAI::AI_isPlotThreatened
006.0 : 000.0 : 000.0 : 000.0 : 010 : CvUnitAI::AI_connectCity
043.0 : 000.1 : 000.1 : 000.1 : 009 : CvUnitAI::AI_improveCity
033.0 : 000.1 : 000.1 : 000.1 : 009 : CvUnitAI::AI_bestCityBuild
748.0 : 001.8 : 001.8 : 001.8 : 010 : CvUnitAI::AI_attackCityMove
016.0 : 000.0 : 000.0 : 000.0 : 047 : CvUnitAI::AI_goody
800.0 : 001.9 : 001.9 : 001.9 : 031 : CvUnitAI::AI_attackMove() 2
015.0 : 000.0 : 000.0 : 000.0 : 022 : CvUnitAI::AI_guardFort
033.0 : 000.1 : 000.1 : 000.1 : 007 : CvUnitAI::AI_guardCitySite
085.0 : 000.2 : 000.0 : 017.9 : 028 : CvPlot::updatePlotGroup
021.0 : 000.1 : 000.0 : 004.0 : 4416 : CvPlot::calculateCulturalOwner()
039.0 : 000.1 : 000.0 : 043.6 : 4313 : CvPlot::setOwner
015.0 : 000.0 : 000.0 : 040.6 : 4378 : CvPlot::verifyUnitValidPlot
650.0 : 001.6 : 000.9 : 003.0 : 003 : CvUnitAI::AI_heroMove
118.0 : 000.3 : 000.3 : 000.9 : 002 : CvUnitAI::AI_heroDefendArea
360.0 : 000.9 : 000.9 : 000.9 : 021 : CvUnitAI::AI_explore
330.0 : 000.8 : 000.8 : 000.8 : 91392 : AI_explore 1
055.0 : 000.1 : 000.1 : 000.1 : 141920 : AI_explore 2
532.0 : 001.3 : 001.3 : 001.3 : 001 : CvUnitAI::AI_heroOffendArea
634.0 : 001.5 : 001.5 : 001.5 : 017 : CvUnitAI::AI_exploreMove
134.0 : 000.3 : 000.3 : 000.3 : 001 : CvUnitAI::AI_moveIntoCity
015.0 : 000.0 : 000.0 : 000.0 : 008 : CvUnit::updateCombat
005.0 : 000.0 : 000.0 : 000.0 : 008 : CvUnit::resolveCombat
038.0 : 000.1 : 000.1 : 000.1 : 013 : CvUnitAI::AI_pillage
104.0 : 000.3 : 000.3 : 000.3 : 027 : CvUnitAI::AI_defend
003.0 : 000.0 : 000.0 : 000.0 : 003 : CvUnitAI::AI_guardBonus
002.0 : 000.0 : 000.0 : 000.0 : 001 : CvUnitAI::AI_formLegion
001.0 : 000.0 : 000.0 : 001.8 : 003 : CvUnit::processLegion
111.0 : 000.3 : 000.3 : 000.3 : 001 : CvUnitAI::AI_pillageMove
098.0 : 000.2 : 000.0 : 008.7 : 104 : CvPlayerAI::AI_targetCityValue
001.0 : 000.0 : 000.0 : 000.0 : 048 : CvPlayerAI::AI_neededMissionaries
006.0 : 000.0 : 000.0 : 000.0 : 003 : CvUnitAI::AI_connectPlot
1134.0 : 002.8 : 002.8 : 002.8 : 004 : CvUnitAI::AI_missionaryMove
1058.0 : 002.6 : 002.6 : 002.6 : 004 : CvUnitAI::AI_spreadReligion
034.0 : 000.1 : 000.1 : 000.1 : 002 : CvUnitAI::AI_safety
048.0 : 000.1 : 000.1 : 000.1 : 005 : CvUnitAI::AI_campMove
027.0 : 000.1 : 000.1 : 000.1 : 015 : CvUnitAI::AI_campInPlotRange
001.0 : 000.0 : 000.0 : 000.0 : 001 : CvGameAI::AI_promoteHero
001.0 : 000.0 : 000.0 : 000.0 : 001 : CvUnitAI::AI_escortSeaMove
074.0 : 000.2 : 000.2 : 000.2 : 014 : CvUnitAI::AI_targetCity
020.0 : 000.0 : 000.0 : 000.0 : 021 : CvUnitAI::AI_animalMove
808.0 : 002.0 : 002.0 : 002.0 : 019 : CvUnitAI::AI_barbAttackMove
080.0 : 000.2 : 000.0 : 016.4 : 025 : CvPlotGroup::recalculatePlots
772.0 : 001.9 : 001.5 : 100.0 : 001 : CvGame::doTurn()
398.0 : 001.0 : 000.9 : 052.0 : 024 : CvTeam::doTurn()
292.0 : 000.7 : 000.6 : 028.2 : 024 : CvTeamAI::AI_updateWorstEnemy
010.0 : 000.0 : 000.0 : 002.6 : 960 : CvTeamAI::AI_calculateAreaAIType
006.0 : 000.0 : 000.0 : 002.2 : 016 : CvTeam::countEnemyDangerByArea
091.0 : 000.2 : 000.2 : 032.4 : 023 : CvTeamAI::AI_doWar
088.0 : 000.2 : 000.0 : 029.6 : 022 : CvPlayerAI::AI_doPeace
075.0 : 000.2 : 000.1 : 047.5 : 001 : CvMap::doTurn()
075.0 : 000.2 : 000.1 : 047.5 : 4352 : CvPlot::doTurn
023.0 : 000.1 : 000.0 : 008.0 : 4352 : CvPlot::doFeature()
041.0 : 000.1 : 000.0 : 045.1 : 4352 : CvPlot::doCulture()
003.0 : 000.0 : 000.0 : 021.3 : 001 : CvPlot::updateFeatureSymbol
--------------------------------------------------
 

Attachments

  • dllprofile.7z
    33.4 KB · Views: 55
If anyone's interested in interpreting the result, I'm all ears.

That is exactly the problem. Some of these discussions turn into software engineering philosophy. I am not sure that function by function profiling will get you what you want. What I think you want is to find out which features you have added in your mod are consuming too much runtime, right? Function profiling will detect if there is a function deep in the engine which is called very often and takes a long time. But there is no way to figure out why, or what to do about it; it is not like you are going to reimplement that engine function more efficiently. This type of profiling is handy for the engine developers at Firaxis, but IMHO is not the best approach for modders.

I proposed a slightly higher level of profiling. Suppose there are 10 features you have added in your mod. Invoke the mod and save a game at turn 0. Now autoplay 100, or 200 turns of the game and look at the total elapsed time. Now remove a couple of features from the mod which may consume a lot of runtime, and rerun. How much time did it save? What you are looking for is minor features which, when removed, make the game run a lot faster. These can be removed. If you find a major feature which costs major time, you need to decide whether to remove it, or you need to find a more efficient way to implement it.

I did this for Dune Wars; some details at this post.
 
@davidlallen

Allow me to explain why I tried function profiling first.

HoTK did all their changes in SDK. When we did our job, we keep performance in mind, so the code in Python is very unlikely to be heavy. We think we have done the best we can to put performance cost to minimum. But the game is still a bit slow. Because our players love scenarios much more than random map, and scenarios have many players, cities and units.

So we want to see what on earth is consuming all that time. This insight can only be gained by function profiling.

Another reason is, sometimes disabling feature is simply too costly. For example, in HoTK we added many more combat variables. Some of these modify a unit's strength, adding lots of code to CvUnit::maxCombatStr(). Disabling all of them would be painful. Also, if we found the game is slightly faster without all the extra variables, should be remove that feature? That's a tough decision to make. By function profiling, luckily, the result shows that maxCombatStr() is not very time consuming in the big picture.

Yes, I agree with you it's unlikely we could make the Civ 4 engine more efficient. However, at a small price, we could make the game run much faster. The function profiling result shows CvPlayerAI::AI_getAttitudeVal() is the big boss, and CvTeamAI:AI_isLandTarget() is one of the small bosses. So I just made a decision: cache the result of those both functions, calculate them only once per turn per player/team instead of thousands of times, at a cost of more memory usage and less real-time values. I still don't know if there's any hidden cost here. Nevertheless, HoTK now runs MUCH faster now.
 
I'm glad you were able to get useful results. The method I described happens to use python to measure the data, but you can use the same method even if you have added many features into the sdk instead of python. In fact, one of the major problems we found using this method was that some of the graphic effects attached to some of the units were slowing down runtime. We re-measured the runtime after replacing the unit with an empty nif, and proved that this was a major factor. I don't think profiling would reveal that.
 
... The function profiling result shows CvPlayerAI::AI_getAttitudeVal() is the big boss, and CvTeamAI:AI_isLandTarget() is one of the small bosses. So I just made a decision: cache the result of those both functions, calculate them only once per turn per player/team instead of thousands of times, at a cost of more memory usage and less real-time values. I still don't know if there's any hidden cost here. Nevertheless, HoTK now runs MUCH faster now.

Please share this result and code with BetterAI, or maybe even the unofficial patch. This sounds like a great improvement in code.
 
Now that you have it actually WORKING, some pointers/observations from my own VERY limited use of the function:

Just as with Assert messages, if you don't understand why something is being profiled, it does you no good at all. So pruning all of the existing profile messages is potentially a good start.

Since you are worried about the new functions you have written, except in places where they are mostly inserted into already profiled functions, add a profile call to your function itself. Actually even if mostly just called from one already profiled function, add a call anyway (if you know that your function should be called twice each time that ::maxCombatStr() is called, check if that is the case by comparing call counts).

For functions that confuse you as to why they take so long, find any loops in them and place a profiler in something called within that loop. I am not sure if you can just place a profile call inside the loop with a text key and have it re-trigger, but it makes sense you SHOULD be able to.


As for some functions not showing up every turn which you KNOW are running each turn: Maybe this is just the top offender list? (I don't find that likely, the list is long and some entries are brief) I'm really not completely sure on that one to be honest. It probably varies per function, like some CvGame functions may span multiple turns.


Specifically: CvPlot::updatePlotGroup(player), this one as I recall is processed for every plot on the map when any team learns a new tech, to find if a bonus should be revealed or traderoutes updated or some such. But is also processed on individual tiles when a road is built to see if it linked two groups. Hence sometimes running 5 digit number of times and sometimes running once.


All said, the big trick is that the profiler just points you in the right direction to analyze, then you have to pick apart WHY. If a function is called a ton of times with a low average processing speed, then use the Caller's Graph (In VS2008, I right click on the function name and the first option is for caller's graph, choose to "display callers" and it shows everywhere in the code that calls that function. Sometimes it is priceless, sometimes it is useless. If the results don't make sense, compare it with a FIND ALL search to see if the Intellisense is screwing with you). Many functions are called from places they don't need to be, or can be nested in a conditional to prevent the call if it isn't needed. For others, like AttitudeValue, you may want to consider creating a Cache system where you update the value from scratch at the start of each turn, and force reference to a static variable during the turn. This is done for BestBuildValue in CvCity/CvPlot as I recall if you need an example to work from.


EDIT: Yeah, sometimes reading comprehension is your friend. I skimmed near the end of your post and missed that you already DID move to a cache system. Only caught it when reading the crosspost of phungus... geez I can be dense ;)
 
Given that actions taken during the turn that alter the attitude would not take effect until the next turn, I wouldn't want to see this caching solution in the UP or BBAI as-is. However, with a little grunt work you could add a call to invalidate the cache for a given player pair whenever any of the values that make up the total attitude change.

Most should be fairly easy: when anyone declares war on an AI, invalidate everyone's attitude toward that player. Hmm, you also need to account for shared war, but I don't know when that kicks in. Plus it increases over time, but your per-turn invalidation will catch that.

To make sure the human always sees the correct attitudes you could ignore the cache when the human is the active player.
 
I'm glad you were able to get useful results. The method I described happens to use python to measure the data, but you can use the same method even if you have added many features into the sdk instead of python. In fact, one of the major problems we found using this method was that some of the graphic effects attached to some of the units were slowing down runtime. We re-measured the runtime after replacing the unit with an empty nif, and proved that this was a major factor. I don't think profiling would reveal that.

I've checked your post about performance experiment with Dune Wars again. What do you mean by an empty nif?

I noticed the spice nif you used to use cost much time. It's a resource, right? How frequently would it appear on map? I'm very curious as a single resource graphic could make such huge difference.
 
Thanks phungus, Xienwolf and EF for noticing my little 'achievement'. I agree with EF completely that this little modification is not up to the standard of UP or BBAI, because AI is actually using attitude value of last turn.

Then I noticed a large part (about 85% percent) of time AI_getAttitudeVal() consumes comes from a function called AI_calculateStolenCityRadiusPlots(). It calculates how many plots other players 'steal' from your cities using culture. This value will change very little from turn to turn, so I cached that value instead of the whole AI_getAttitudeVal(). The performance increase in not as good as caching the whole attitude, but is substantial enough.

I will try to make some further improvements if I can then tell jdog about it.
 
Caching that value--and invalidating it when things change it during the turn--seems pretty reasonable. And saving 85% of a costly routine is rather significant. :goodjob: Off the top of my head:

  • Use Great Artist for a Great Work
  • Perform Spy's "Spread Culture" mission
Does anything else add culture or switch ownership of tiles?
 
Caching that value--and invalidating it when things change it during the turn--seems pretty reasonable. And saving 85% of a costly routine is rather significant. :goodjob: Off the top of my head:

  • Use Great Artist for a Great Work
  • Perform Spy's "Spread Culture" mission
Does anything else add culture or switch ownership of tiles?

How about giving/trading a city or creating a colony?
 
I've checked your post about performance experiment with Dune Wars again. What do you mean by an empty nif?

I noticed the spice nif you used to use cost much time. It's a resource, right? How frequently would it appear on map? I'm very curious as a single resource graphic could make such huge difference.

In general, people find that high poly count models, or models with complex particle effect animations, can slow down the game. In the particular case of Dune Wars and the spice resource, about 15% of the map is covered with this bonus, so a huge map could have a hundred or more. A slightly older screenshot is at this link. It used to also have a smoke effect, along the lines of an unhappy city. This may be an exceptionally bad case, but generally artists do try to minimize the poly count and effect to avoid hurting runtime.

The Dune Wars Art God, deliverator, made a nif which is literally empty, with no objects or anything inside. By changing the artdef from the complex smoky animation to the empty nif, I could measure the runtime before and after, and prove whether this animation was causing a slow runtime. As you can see from later in the thread of the post I quoted, replacing the model with a less complex one did make a significant improvement in the runtime. This is one example where I am pretty sure a profiler would not provide any insight.

In case it is helpful, the empty nif is attached in this post.
 
Then I noticed a large part (about 85% percent) of time AI_getAttitudeVal() consumes comes from a function called AI_calculateStolenCityRadiusPlots(). It calculates how many plots other players 'steal' from your cities using culture. This value will change very little from turn to turn, so I cached that value instead of the whole AI_getAttitudeVal(). The performance increase in not as good as caching the whole attitude, but is substantial enough.

I don't know where, but there has to be a function, which is called after a culture expansion, because there's a python function onCultureExpansion in the CvEventManager.
-> So the simplest way (from logic, not from the programming effort) would be, to let the function be called from there.
 
Caching that value--and invalidating it when things change it during the turn--seems pretty reasonable. And saving 85% of a costly routine is rather significant. :goodjob: Off the top of my head:

  • Use Great Artist for a Great Work
  • Perform Spy's "Spread Culture" mission
Does anything else add culture or switch ownership of tiles?

The value is all about how many plots THEY were able to work which now YOU are working. So you simply place a check in CvPlot::changeOwner() or whatever such a function is called which tests if the plot is in the working radius of a city before the change, and if it is owned by a player after the change. If so, invalidate this cache value.
 
Thanks everyone for giving me suggestions, but I seems to have made a mistake, in original BTS close border attitude value is already cached...Looks like I have to check it more carefully.
 
Top Bottom