Pazyryk
Deity
- Joined
- Jun 13, 2008
- Messages
- 3,584
This tutorial is a work in progress. I'll keep adding things as I test them. Please make suggestions if you want me to test a specific issue. Or post your own test. There are similar analyses elsewhere (my test here is modeled on this web page), but I thought it might be useful to have one for Civ5. See here for an excellent article on optimization by the leading architect of Lua himself.
An important point is that, in many cases, you don't need to optimize and there is no reason to bother. Run time checks to see where you need to improve something. Still, it's nice to know what runs faster and it may change the way you write your code.
All tests were run on my computer (top of the line 5 years ago) with live tuner running and a Civ5 game in progress (small map; turn 0). I put it all in a mod with a single Lua file, with code shown here (mod was rebuilt for each test):
You can ignore the code above. Just remember that times are averaged over 5 runs (interpolated), so an OS or Civ5 dll "hickup" won't affect results much. I also run the enclosing TimeTest() function 3 times to ensure that the results are consistent within a few percent. I adjust iteration number in each test so that it is high enough for a reliable test, but not so high that the map flickers or goes black (which happens if Lua code runs for more than a couple seconds). Percent time is always relative to the fastest code.
Some tests use a table called "a". This is an array (indexes 1 to 100) that holds 100 random non-integer numbers.
Test 1: for-loops (is ipairs faster than pairs?)
Code 1:
Code 2:
Code 3:
Results:
Code 1: 0.083000000000004 (163%)
Code 2: 0.10359999999999 (204%)
Code 3: 0.050800000000004 (100%)
Conclusions:
No, ipairs is not faster than pairs. In fact, the opposite is true. If your indexes are continuous integers, then you should use the method in Code 3 ("for i = 1, #table do" or whatever range you need). Note that code 3 can be sped up even more (maybe 2x, though I haven't tested it) if you store the value of #a so Lua doesn't have to calculate it at this step. Use pairs when your indexes are not continuous integers. There is really no reason that I can think of to use ipairs.
Test 2: Localization of a simple math function
Code 1
Code 2
Results:
Code 1: 0.16780000000003 (171%)
Code 2: 0.098199999999952 (100%)
Conclusion:
Localize your functions!
Test 3: Table lookup vs. localized constant
Code 1
Code 2
Code 3
Code 4
Results:
Code 1: 0.19180000000001 (1744%)
Code 2: 0.19599999999991 (1782%)
Code 3: 0.053000000000111 (482%)
Code 4: 0.010999999999967 (100%)
Conclusion:
Getting an integer from GameInfoTypes is much faster than from GameInfo. However, if you are going to reference a table item a lot (e.g., you are iterating over all plots and testing terrain, improvement or feature types), it is much faster to put whatever values you need in specific (localized!) constants. My suggestion is to put all of these in a "local defines" section at the top of every Lua file.
Test 4: Inserting table items
Code 1:
Code 2:
Code 3:
Code 4:
Results:
Code 1: 0.16999999999989 (392%)
Code 2: 0.13500000000004 (311%)
Code 3: 0.11140000000014 (257%)
Code 4: 0.043399999999929 (100%)
Conclusion:
table.insert is slower than the simpler (imho) method in Code 3, especially if you don't localize the insert function. However, you can get a big boost by keeping track of table position in an external variable (Code 4).
Test 5: Recycling/reusing tables (vs. re-initing)
Code 1:
Code 2:
Code 3:
Results:
Code 1: 0.13920000000007 (165%)
Code 2: 0.11739999999991 (139%)
Code 3: 0.084400000000005 (100%)
Conclusion:
Better to recycle/reuse your tables rather than letting Lua clean up after you. There is a lot of behind-the-scenes action in code 1 even though it looks simpler. Lua has to init and grow the table with each iteration, plus garbage collect tables from previous iterations. However, Lua is quite fast at this, so take advantage of it in cases where code simplicity is more important than a modest speed improvement.
Test 6: GameInfo.Units() iteration: condition test in Lua vs. SQL
Code 1:
Code 2:
Results:
Code 1: 0.096799999999985 (130%)
Code 2: 0.074400000000014 (100%)
Conclusion:
Better to pass the condition test through SQL. But note the tiny number I had to use on the i iterator here. Both of these tests were horrendously slow! That brings me to the next test...
Test 7: Using GameInfo.Units() iterator versus iterating over the same table data cached in Lua
Code 1:
Code 2:
Results:
Code 1: 0.74800000000001 (6800%)
Code 2: 0.01099999999999 (100%)
--prints 57000 in both cases
Conclusion:
Iterators like GameInfo.Units() are really very slow. I'm totally guessing, but perhaps these iterate through the actual DB rather than C++ cached values (which is crazy!). In any case, if you need speed in a DB table iteration, cache the table (or the parts needed) into a Lua table and iterate over that instead.
To be continued if folks show any interest...
An important point is that, in many cases, you don't need to optimize and there is no reason to bother. Run time checks to see where you need to improve something. Still, it's nice to know what runs faster and it may change the way you write your code.
All tests were run on my computer (top of the line 5 years ago) with live tuner running and a Civ5 game in progress (small map; turn 0). I put it all in a mod with a single Lua file, with code shown here (mod was rebuilt for each test):
Spoiler :
Code:
local a = {} --just some tables to hold stuff
local b = {}
for i = 1, 100 do
number = Game.Rand(1000000, "hello") / 1000
a[i] = number
b[i] = tostring(number)
end
local TestCode = {}
function TimeTest()
local time = {}
for j = 1, #TestCode do
time[j] = 0
end
--do 5 runs testing each code block
for i = 1, 5 do
for j = 1, #TestCode do
local Code = TestCode[j]
local start = os.clock()
Code()
time[j] = time[j] + os.clock() - start
end
end
--find smallest time
local minimumTime = 1000
for j = 1, #TestCode do
if time[j] < minimumTime then minimumTime = time[j] end
end
for j = 1, #TestCode do
local percent = math.floor(100 * time[j] / minimumTime + 0.5)
print("Code " .. j .. ": " .. tostring(time[j] / 5) .. " (" .. percent .. "%)")
end
end
-- TEST CODE -------------------------------------------------------------------------
TestCode[1] = function() -- "Code 1"
for i = 1, 10000 do
for j, v in pairs(a) do
local x = v
end
end
end
TestCode[2] = function() -- "Code 2"
for i = 1, 10000 do
for j, v in ipairs(a) do
local x = v
end
end
end
TestCode[3] = function() -- "Code 3"
for i = 1, 10000 do
for j = 1, #a do
local v = a[j]
local x = v
end
end
end
--[[
TestCode[4] = function()
end
TestCode[5] = function()
end
]]
Some tests use a table called "a". This is an array (indexes 1 to 100) that holds 100 random non-integer numbers.
Test 1: for-loops (is ipairs faster than pairs?)
Code 1:
Code:
for i = 1, 10000 do
for j, v in pairs(a) do
local x = v
end
end
Code 2:
Code:
for i = 1, 10000 do
for j, v in ipairs(a) do
local x = v
end
end
Code 3:
Code:
for i = 1, 10000 do
for j = 1, #a do
local v = a[j]
local x = v
end
end
Results:
Code 1: 0.083000000000004 (163%)
Code 2: 0.10359999999999 (204%)
Code 3: 0.050800000000004 (100%)
Conclusions:
No, ipairs is not faster than pairs. In fact, the opposite is true. If your indexes are continuous integers, then you should use the method in Code 3 ("for i = 1, #table do" or whatever range you need). Note that code 3 can be sped up even more (maybe 2x, though I haven't tested it) if you store the value of #a so Lua doesn't have to calculate it at this step. Use pairs when your indexes are not continuous integers. There is really no reason that I can think of to use ipairs.
Test 2: Localization of a simple math function
Code 1
Code:
for i = 1, 1000000 do
local x = math.log10(222.222)
end
Code 2
Code:
local log10 = math.log10
for i = 1, 1000000 do
local x = log10(222.222)
end
Results:
Code 1: 0.16780000000003 (171%)
Code 2: 0.098199999999952 (100%)
Conclusion:
Localize your functions!
Test 3: Table lookup vs. localized constant
Code 1
Code:
for i = 1, 1000000 do
local x = GameInfo.Improvements.IMPROVEMENT_FARM.ID
end
Code 2
Code:
for i = 1, 1000000 do
local x = GameInfo.Improvements["IMPROVEMENT_FARM"].ID
end
Code 3
Code:
for i = 1, 1000000 do
local x = GameInfoTypes.IMPROVEMENT_FARM
end
Code 4
Code:
local IMPROVEMENT_FARM_ID = GameInfo.Improvements.IMPROVEMENT_FARM.ID
for i = 1, 1000000 do
local x = IMPROVEMENT_FARM_ID
end
Results:
Code 1: 0.19180000000001 (1744%)
Code 2: 0.19599999999991 (1782%)
Code 3: 0.053000000000111 (482%)
Code 4: 0.010999999999967 (100%)
Conclusion:
Getting an integer from GameInfoTypes is much faster than from GameInfo. However, if you are going to reference a table item a lot (e.g., you are iterating over all plots and testing terrain, improvement or feature types), it is much faster to put whatever values you need in specific (localized!) constants. My suggestion is to put all of these in a "local defines" section at the top of every Lua file.
Test 4: Inserting table items
Code 1:
Code:
local c = {}
for i = 1, 500000 do
table.insert(c, "some text")
end
Code 2:
Code:
local insert = table.insert
local c = {}
for i = 1, 500000 do
insert(c, "some text")
end
Code 3:
Code:
local c = {}
for i = 1, 500000 do
c[#c + 1] = "some text"
end
Code 4:
Code:
local c = {}
local position = #c
for i = 1, 500000 do
position = position + 1
c[position] = "some text"
end
Results:
Code 1: 0.16999999999989 (392%)
Code 2: 0.13500000000004 (311%)
Code 3: 0.11140000000014 (257%)
Code 4: 0.043399999999929 (100%)
Conclusion:
table.insert is slower than the simpler (imho) method in Code 3, especially if you don't localize the insert function. However, you can get a big boost by keeping track of table position in an external variable (Code 4).
Test 5: Recycling/reusing tables (vs. re-initing)
Code 1:
Code:
for i = 1, 10000 do
local tempTable = {} -- re-init table at each iteration
for j = 1, #a do
tempTable[j] = a[j]
end
end
Code 2:
Code:
local tempTable = {}
for i = 1, 10000 do
for j = 1, #tempTable do -- recycle table (clean out old contents first)
tempTable[j] = nil
end
for j = 1, #a do
tempTable[j] = a[j]
end
end
Code 3:
Code:
local tempTable = {}
local tempTablePosition = 0 -- reuse table without cleaning previous contents (keep position in external variable)
for i = 1, 10000 do
tempTablePosition = 0
for j = 1, #a do
tempTablePosition = tempTablePosition + 1
tempTable[j] = a[j]
end
end
Results:
Code 1: 0.13920000000007 (165%)
Code 2: 0.11739999999991 (139%)
Code 3: 0.084400000000005 (100%)
Conclusion:
Better to recycle/reuse your tables rather than letting Lua clean up after you. There is a lot of behind-the-scenes action in code 1 even though it looks simpler. Lua has to init and grow the table with each iteration, plus garbage collect tables from previous iterations. However, Lua is quite fast at this, so take advantage of it in cases where code simplicity is more important than a modest speed improvement.
Test 6: GameInfo.Units() iteration: condition test in Lua vs. SQL
Code 1:
Code:
for i = 1, 100 do
for row in GameInfo.Units() do
if row.Cost < 200 then
local x = row.Type
end
end
end
Code 2:
Code:
for i = 1, 100 do
for row in GameInfo.Units("Cost < 200") do
local x = row.Type
end
end
Results:
Code 1: 0.096799999999985 (130%)
Code 2: 0.074400000000014 (100%)
Conclusion:
Better to pass the condition test through SQL. But note the tiny number I had to use on the i iterator here. Both of these tests were horrendously slow! That brings me to the next test...
Test 7: Using GameInfo.Units() iterator versus iterating over the same table data cached in Lua
Code 1:
Code:
local n = 0
for i = 1, 1000 do
for row in GameInfo.Units("Cost < 200") do
n = n + 1
end
end
print(n)
Code 2:
Code:
local tempTable = {}
for row in GameInfo.Units() do
tempTable[row.ID] = row
end
local highestID = #tempTable
local n = 0
for i = 1, 1000 do
for j = 0, highestID do
local row = tempTable[j]
if row.Cost < 200 then
n = n + 1
end
end
end
print(n)
Results:
Code 1: 0.74800000000001 (6800%)
Code 2: 0.01099999999999 (100%)
--prints 57000 in both cases
Conclusion:
Iterators like GameInfo.Units() are really very slow. I'm totally guessing, but perhaps these iterate through the actual DB rather than C++ cached values (which is crazy!). In any case, if you need speed in a DB table iteration, cache the table (or the parts needed) into a Lua table and iterate over that instead.
To be continued if folks show any interest...