snowern
Warlord
- Joined
- Aug 5, 2017
- Messages
- 131
Made all the performance enhancements runtime configurable. So now we have ini settings, in case my buggy code is too buggy.
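For anyone wondering what "runtime configurable" can look like in practice, here is a minimal sketch of reading 0/1 toggles from one ini section. It is not the actual DLL code: `parsePerfFlags`, `flagOr` and `trim` are hypothetical names, and the parser only assumes the `Key = 1` layout and `;` comment lines used in the ini for this build.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strips leading and trailing whitespace (including '\r').
static std::string trim(const std::string& s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Parses "Key = 0/1" lines from one [section] of ini text into a flag map.
// Lines starting with ';' are comments. Keys in other sections are ignored.
std::map<std::string, bool> parsePerfFlags(const std::string& iniText,
                                           const std::string& section) {
    std::map<std::string, bool> flags;
    std::istringstream in(iniText);
    std::string line;
    bool inSection = false;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';') continue;
        if (line[0] == '[') {
            inSection = (line == "[" + section + "]");
            continue;
        }
        if (!inSection) continue;
        std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;
        // Any value other than "0" counts as enabled.
        flags[trim(line.substr(0, eq))] = (trim(line.substr(eq + 1)) != "0");
    }
    return flags;
}

// Returns the flag value, or the given default when the key is absent.
bool flagOr(const std::map<std::string, bool>& flags,
            const std::string& key, bool fallback) {
    auto it = flags.find(key);
    return it == flags.end() ? fallback : it->second;
}
```

The idea would be to parse once at startup and branch on something like `flagOr(flags, "VectorisedPathfinder", false)`, so a missing key falls back to the original behaviour.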
So I expect the release configurations will be "Strict"/"Vanilla", "PerfEnhanced AVX2", and "PerfEnhanced AVX-512", where the vanilla version does not have any enhancements compiled in and should yield the same game state checksum as the Firaxis DLL.
Even with all that, though, turn times still reach 26-33s at T660+ on a 5x Huge map. To optimise further, one would need to: get rid of the plot-revealed Python event (only BUG Event Signs uses it), use approximate pathfinding, run unit AI in parallel for most units, and parallelise city updates. Pathfinding is currently about 40-60% of main thread time, so even if pathing were instant, you still wouldn't have fast turn times. That would need fully parallel AI, which is almost a DLL rewrite.
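The "even instant pathing isn't enough" point is just Amdahl's law, and can be put into numbers with a tiny sketch. The function name is made up for illustration, and it assumes the whole turn time is main-thread time, which the figures above do not strictly say.

```cpp
// Turn time left if only the pathfinding share of the main thread gets faster.
// turnSeconds:  current turn time (26 to 33 s in the figures above).
// pathFraction: share of main-thread time spent pathfinding (0.4 to 0.6).
// pathSpeedup:  how many times faster pathfinding becomes; a huge value
//               approximates "instant" pathing.
// Assumption: all of turnSeconds is main-thread time.
double remainingTurnSeconds(double turnSeconds, double pathFraction,
                            double pathSpeedup) {
    return turnSeconds * ((1.0 - pathFraction) + pathFraction / pathSpeedup);
}
```

With effectively instant pathing, the best case (26 s, 60% pathing) leaves about 10.4 s and the worst case (33 s, 40% pathing) about 19.8 s per turn, so the remaining serial AI work still dominates.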
So, gigantic maps take too long for a practical start-to-end AI autoplay test.
But Huge Marathon with 18 players takes about 9 minutes from start to end; it finished at turn 765. 5x Huge would take, like, an hour or two. Things might be different if a human was playing, as you'd be busy reducing the number of AI units. Better get around to it within the first few hundred turns, though.
INI:
[CV4MINIENGINE_PERF_ENHANCED_DLL]
; Enables fixes for city updates, plot grouping, pathfinder usage, and enables the FAStar heuristic update. These fixes are important for testing.
GeneralFixes = 1
; Enables the fast vectorised pathfinder. This is a large bug-prone pile of code, but is so much faster than FAStar. Benefits from AVX-512. Only used for AI units.
VectorisedPathfinder = 1
; Enables fast vectorised evaluation of found values. This is a large bug-prone pile of code, but is so much faster than uncached scalar found value evaluation. Benefits from AVX-512. This option should not change game state checksums.
VectorisedFoundValues = 1
; Enables the blockwise union-find-based plot group system. This is a large bug-prone pile of code, but is so much faster than the original CvPlotGroup system. This option should not change game state checksums.
BlockwisePlotGroupSystem = 1
; Enables parallel unit updates for barbarians only. This is a large bug-prone pile of code, but makes gigantic maps much more practical when you have barbs enabled. Will parallelise "simple" unit AI and fall back to serial for everything else. This option only has an effect if VectorisedPathfinder is enabled.
ParallelBarbarianUnitUpdate = 1
; For gigantic maps, especially on Deity, barbs can easily exceed unit ID allocation capacity. This works around that by capping barbs to 4000.
HardCapBarbSpawns = 1
; Enables a small parallel city update. Only does AI_updateRouteToCity for now. This option should not change game state checksums.
CvCity_doTurnParallel = 1
; This option should not change game state checksums.
Faster_CvCityAI_AI_updateRouteToCity = 1
; A faster algorithm for computing the simple path step distance between plots. This option should not change game state checksums.
Fast_CvMap_calculatePathDistance = 1
; Relaxes "group cycle" ordering for AI, to speed up this function.
Fast_CvPlayer_updateGroupCycle = 1
; Basic whole-map parallel loop. This option should not change game state checksums.
Parallel_CvPlayer_countUnimprovedBonuses = 1
; Avoid doing things that only a human player cares about. This option should not change game state checksums.
Skip_AI_CvPlayer_doWarnings = 1
; Parallel loop over cities. Also calls CvMap::calculatePathDistance in parallel. This option should not change game state checksums.
Parallel_CvPlayerAI_AI_findTargetCity = 1
; Use a caching system to accelerate queries. This option should not change game state checksums.
Fast_CvPlayerAI_AI_calculateStolenCityRadiusPlots = 1
; Basic whole-map parallel loop. This option should not change game state checksums.
Parallel_CvPlayerAI_AI_updateCitySites = 1
; Use a fast flood-fill algorithm. This option should not change game state checksums.
Fast_CvPlot_updateIrrigated = 1
; Cache this value. This option should not change game state checksums.
Fast_CvPlot_isCoastalLand = 1
; Use a faster flood-fill algorithm to count plots instead of FAStar. Only has an effect if BlockwisePlotGroupSystem is disabled. This option should not change game state checksums.
Fast_CvPlotGroup_recalculatePlots = 1
; This AI procedure expects a short path, so use a path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvSelectionGroup_groupAttack = 1
; Use a caching system to accelerate queries. This option should not change game state checksums.
Fast_CvTeamAI_AI_calculateAdjacentLandPlots = 1
; Trivial path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvUnitAI_AI_group = 1
; Trivial path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvUnitAI_AI_guardCityMinDefender = 1
; Trivial path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvUnitAI_AI_guardCity = 1
; Whole-map parallel loop with distance sort and path limit.
Parallel_CvUnitAI_AI_guardBonus = 1
; Whole-map parallel loop with distance sort and path limit.
Parallel_CvUnitAI_AI_guardFort = 1
; Non-trivial path limit.
PathLimit_CvUnitAI_AI_spreadReligion = 1
; Rearranges logic to perform less pathfinding.
Faster_CvUnitAI_AI_construct = 1
; Loop over units instead of plots.
Faster_CvUnitAI_AI_protect = 1
; Perform a localised search instead of a whole-map loop. This is an approximation of the original exploration AI.
Localised_CvUnitAI_AI_explore = 1
; Perform a flood-fill to find reachable plots instead of using a pathfinder.
Localised_CvUnitAI_AI_exploreRange = 1
; Sorts cities by distance and uses a path limit.
Faster_CvUnitAI_AI_targetCity = 1
; Trivial path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvUnitAI_AI_targetBarbCity = 1
; Non-trivial parallelisation of privateer AI. May not be all that effective. Privateer pathing is almost worst case scenario.
Parallel_CvUnitAI_AI_pirateBlockade = 1
; Whole-map parallel loop to filter pathing targets.
Parallel_CvUnitAI_AI_improveBonus = 1
; Sorts cities by distance and uses a path limit.
Faster_CvUnitAI_AI_retreatToCity = 1
; Sorts cities by distance and uses a path limit.
PathLimit_CvUnitAI_AI_trade = 1
; Trivial path limit. This option should not change game state checksums, unless VectorisedPathfinder is enabled.
PathLimit_CvUnitAI_AI_getEspionageTargetValue = 1
; Sorts cities by distance and uses a path limit.
PathLimit_CvUnitAI_AI_moveToStagingCity = 1
; Sorts cities by distance and uses a path limit.
Faster_CvUnitAI_AI_pillage = 1
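As background for the BlockwisePlotGroupSystem option, here is a minimal sketch of plain union-find applied to a plot grid. It is a generic textbook version, not the blockwise implementation in the DLL: `DisjointSet` and `buildPlotGroups` are hypothetical names, and adjacency is simplified to the four orthogonal neighbours.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Generic disjoint-set (union-find) with path halving and union by size.
class DisjointSet {
public:
    explicit DisjointSet(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each plot is its own root
    }
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // path halving
            x = parent_[x];
        }
        return x;
    }
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size_[a] < size_[b]) std::swap(a, b);  // attach smaller tree to larger
        parent_[b] = a;
        size_[a] += size_[b];
    }
    bool same(int a, int b) { return find(a) == find(b); }

private:
    std::vector<int> parent_;
    std::vector<int> size_;
};

// Hypothetical grid helper: merges every pair of orthogonally adjacent plots
// that both carry a link (e.g. a route). linked[y * width + x] is the flag.
DisjointSet buildPlotGroups(const std::vector<bool>& linked,
                            int width, int height) {
    DisjointSet groups(width * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int i = y * width + x;
            if (!linked[i]) continue;
            if (x + 1 < width && linked[i + 1]) groups.unite(i, i + 1);
            if (y + 1 < height && linked[i + width]) groups.unite(i, i + width);
        }
    }
    return groups;
}
```

After the build, `groups.same(a, b)` answers "are these two plots in the same group?" in near-constant time. Note that plain union-find cannot split a group, so in this sketch removing a link would mean rebuilding.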