1. We have added a Gift Upgrades feature that allows you to gift an account upgrade to another member, just in time for the holiday season. You can see the gift option when going to the Account Upgrades screen, or on any user profile screen.
    Dismiss Notice

Optimizing your Lua code for speed

Discussion in 'Civ5 - Modding Tutorials & Reference' started by Pazyryk, Jun 5, 2012.

  1. Pazyryk

    Pazyryk Chieftain

    Joined:
    Jun 13, 2008
    Messages:
    3,584
    This tutorial is a work in progress. I'll keep adding things as I test them. Please make suggestions if you want me to test a specific issue. Or post your own test. There are similar analyses elsewhere (my test here is modeled on this web page), but I thought it might be useful to have one for Civ5. See here for an excellent article on optimization by the leading architect of Lua himself.

    An important point is that, in many cases, you don't need to optimize and there is no reason to bother. Run time checks to see where you need to improve something. Still, it's nice to know what runs faster and it may change the way you write your code.

    All tests were run on my computer (top of the line 5 years ago) with live tuner running and a Civ5 game in progress (small map; turn 0). I put it all in a mod with a single Lua file, with code shown here (mod was rebuilt for each test):
    Spoiler :
    Code:
    local a = {}		--just some tables to hold stuff
    local b = {}
    for i = 1, 100 do
    	number = Game.Rand(1000000, "hello") / 1000
    	a[i] = number
    	b[i] = tostring(number)
    end
    
    local TestCode = {}
    
    function TimeTest()
    
    	local time = {}
    	for j = 1, #TestCode do
    		time[j] = 0
    	end
    
    	--do 5 runs testing each code block
    	for i = 1, 5 do
    		for j = 1, #TestCode do
    			local Code = TestCode[j]
    			local start = os.clock()
    			Code()
    			time[j] = time[j] + os.clock() - start
    		end
    	end
    
    	--find smallest time
    	local minimumTime = 1000
    	for j = 1, #TestCode do
    		if time[j] < minimumTime then minimumTime = time[j] end
    	end
    
    	for j = 1, #TestCode do
    		local percent = math.floor(100 * time[j] / minimumTime + 0.5)
    		print("Code " .. j .. ": " .. tostring(time[j] / 5) .. " (" .. percent .. "%)")
    	end
    end
    
    -- TEST CODE -------------------------------------------------------------------------
    
    TestCode[1] = function()	-- "Code 1"
    	for i = 1, 10000 do
    		for j, v in pairs(a) do
    			local x = v
    		end
    	end
    end
    
    TestCode[2] = function()	-- "Code 2"
    	for i = 1, 10000 do
    		for j, v in ipairs(a) do
    			local x = v
    		end
    	end
    end
    
    TestCode[3] = function()	-- "Code 3"
    	for i = 1, 10000 do
    		for j = 1, #a do
    			local v = a[j]
    			local x = v
    		end
    	end
    end
    
    --[[
    TestCode[4] = function()
    
    end
    
    TestCode[5] = function()
    
    end
    ]]
    

    You can ignore the code above. Just remember that times are averaged over 5 runs (interpolated), so an OS or Civ5 dll "hickup" won't affect results much. I also run the enclosing TimeTest() function 3 times to ensure that the results are consistent within a few percent. I adjust iteration number in each test so that it is high enough for a reliable test, but not so high that the map flickers or goes black (which happens if Lua code runs for more than a couple seconds). Percent time is always relative to the fastest code.

    Some tests use a table called "a". This is an array (indexes 1 to 100) that holds 100 random non-integer numbers.

    Test 1: for-loops (is ipairs faster than pairs?)

    Code 1:
    Code:
    	for i = 1, 10000 do
    		for j, v in pairs(a) do
    			local x = v
    		end
    	end
    Code 2:
    Code:
    	for i = 1, 10000 do
    		for j, v in ipairs(a) do
    			local x = v
    		end
    	end
    Code 3:
    Code:
    	for i = 1, 10000 do
    		for j = 1, #a do
    			local v = a[j]
    			local x = v
    		end
    	end
    Results:
    Code 1: 0.083000000000004 (163%)
    Code 2: 0.10359999999999 (204%)
    Code 3: 0.050800000000004 (100%)

    Conclusions:
    No, ipairs is not faster than pairs. In fact, the opposite is true. If your indexes are continuous integers, then you should use the method in Code 3 ("for i = 1, #table do" or whatever range you need). Note that code 3 can be sped up even more (maybe 2x, though I haven't tested it) if you store the value of #a so Lua doesn't have to calculate it at this step. Use pairs when your indexes are not continuous integers. There is really no reason that I can think of to use ipairs.

    Test 2: Localization of a simple math function

    Code 1
    Code:
    	for i = 1, 1000000 do
    		local x = math.log10(222.222)
    	end

    Code 2
    Code:
    	local log10 = math.log10
    	for i = 1, 1000000 do
    		local x = log10(222.222)
    	end
    Results:
    Code 1: 0.16780000000003 (171%)
    Code 2: 0.098199999999952 (100%)

    Conclusion:
    Localize your functions!

    Test 3: Table lookup vs. localized constant

    Code 1
    Code:
    	for i = 1, 1000000 do
    		local x = GameInfo.Improvements.IMPROVEMENT_FARM.ID
    	end
    Code 2
    Code:
    	for i = 1, 1000000 do
    		local x = GameInfo.Improvements["IMPROVEMENT_FARM"].ID
    	end
    Code 3
    Code:
    	for i = 1, 1000000 do
    		local x = GameInfoTypes.IMPROVEMENT_FARM
    	end
    Code 4
    Code:
    	local IMPROVEMENT_FARM_ID = GameInfo.Improvements.IMPROVEMENT_FARM.ID
    	for i = 1, 1000000 do
    		local x = IMPROVEMENT_FARM_ID
    	end
    Results:

    Code 1: 0.19180000000001 (1744%)
    Code 2: 0.19599999999991 (1782%)
    Code 3: 0.053000000000111 (482%)
    Code 4: 0.010999999999967 (100%)

    Conclusion:
    Getting an integer from GameInfoTypes is much faster than from GameInfo. However, if you are going to reference a table item a lot (e.g., you are iterating over all plots and testing terrain, improvement or feature types), it is much faster to put whatever values you need in specific (localized!) constants. My suggestion is to put all of these in a "local defines" section at the top of every Lua file.

    Test 4: Inserting table items

    Code 1:
    Code:
    	local c = {}
    	for i = 1, 500000 do
    		table.insert(c, "some text")
    	end
    Code 2:
    Code:
    	local insert = table.insert
    	local c = {}
    	for i = 1, 500000 do
    		insert(c, "some text")
    	end
    Code 3:
    Code:
    	local c = {}
    	for i = 1, 500000 do
    		c[#c + 1] = "some text"
    	end
    Code 4:
    Code:
    	local c = {}	
    	local position = #c
    	for i = 1, 500000 do
    		position = position + 1
    		c[position] = "some text"
    	end
    Results:
    Code 1: 0.16999999999989 (392%)
    Code 2: 0.13500000000004 (311%)
    Code 3: 0.11140000000014 (257%)
    Code 4: 0.043399999999929 (100%)

    Conclusion:
    table.insert is slower than the simpler (imho) method in Code 3, especially if you don't localize the insert function. However, you can get a big boost by keeping track of table position in an external variable (Code 4).

    Test 5: Recycling/reusing tables (vs. re-initing)

    Code 1:
    Code:
    	for i = 1, 10000 do
    		local tempTable = {}	-- re-init table at each iteration
    		for j = 1, #a do
    			tempTable[j] = a[j]
    		end
    	end

    Code 2:
    Code:
    	local tempTable = {}
    	for i = 1, 10000 do
    		for j = 1, #tempTable do	-- recycle table (clean out old contents first)
    			tempTable[j] = nil
    		end
    		for j = 1, #a do
    			tempTable[j] = a[j]
    		end
    	end
    Code 3:
    Code:
    	local tempTable = {}
    	local tempTablePosition = 0		-- reuse table without cleaning previous contents (keep position in external variable)
    	for i = 1, 10000 do
    		tempTablePosition = 0
    		for j = 1, #a do
    			tempTablePosition = tempTablePosition + 1
    			tempTable[j] = a[j]
    		end
    	end
    Results:
    Code 1: 0.13920000000007 (165%)
    Code 2: 0.11739999999991 (139%)
    Code 3: 0.084400000000005 (100%)

    Conclusion:
    Better to recycle/reuse your tables rather than letting Lua clean up after you. There is a lot of behind-the-scenes action in code 1 even though it looks simpler. Lua has to init and grow the table with each iteration, plus garbage collect tables from previous iterations. However, Lua is quite fast at this, so take advantage of it in cases where code simplicity is more important than a modest speed improvement.

    Test 6: GameInfo.Units() iteration: condition test in Lua vs. SQL

    Code 1:
    Code:
    	for i = 1, 100 do
    		for row in GameInfo.Units() do
    			if row.Cost < 200 then
    				local x = row.Type
    			end
    		end
    	end
    Code 2:
    Code:
    	for i = 1, 100 do
    		for row in GameInfo.Units("Cost < 200") do
    			local x = row.Type
    		end
    	end
    Results:
    Code 1: 0.096799999999985 (130%)
    Code 2: 0.074400000000014 (100%)

    Conclusion:
    Better to pass the condition test through SQL. But note the tiny number I had to use on the i iterator here. Both of these tests were horrendously slow! That brings me to the next test...

    Test 7: Using GameInfo.Units() iterator versus iterating over the same table data cached in Lua

    Code 1:
    Code:
    	local n = 0
    	for i = 1, 1000 do
    		for row in GameInfo.Units("Cost < 200") do
    			n = n + 1
    		end
    	end
    	print(n)
    Code 2:
    Code:
    	local tempTable = {}
    	for row in GameInfo.Units() do
    		tempTable[row.ID] = row
    	end
    
    	local highestID = #tempTable
    
    	local n = 0
    	for i = 1, 1000 do
    		for j = 0, highestID do
    			local row = tempTable[j]
    			if row.Cost < 200 then
    				n = n + 1
    			end
    		end
    	end
    	print(n)
    Results:
    Code 1: 0.74800000000001 (6800%)
    Code 2: 0.01099999999999 (100%)
    --prints 57000 in both cases

    Conclusion:
    Iterators like GameInfo.Units() are really very slow. I'm totally guessing, but perhaps these iterate through the actual DB rather than C++ cached values (which is crazy!). In any case, if you need speed in a DB table iteration, cache the table (or the parts needed) into a Lua table and iterate over that instead.

    To be continued if folks show any interest...
     
  2. Pazyryk

    Pazyryk Chieftain

    Joined:
    Jun 13, 2008
    Messages:
    3,584
    reserved...
     
  3. Pazyryk

    Pazyryk Chieftain

    Joined:
    Jun 13, 2008
    Messages:
    3,584
    I'm redoing tests 1 to 3. I had an assignment to a global x rather than a local x inside the loop. Assignment to global was slow enough to partly mask the speed differences (just magnitude; no real difference in conclusion).
     
  4. Danthor

    Danthor Chieftain

    Joined:
    Feb 17, 2006
    Messages:
    262
    Location:
    Punta Arenas, Chile
    Nice!
     
  5. Pazyryk

    Pazyryk Chieftain

    Joined:
    Jun 13, 2008
    Messages:
    3,584
    I renamed Tests 6 and 7 above because I'm not really sure if these involve DB traversal or not. I suspect that they do because they are sooooo slow. GameInfo.Units and similar constructs are a strange animal: they are type userdata, which is (if I understand it) a call to some C++ code. So GameInfo.Units() returns an iterator (a function) and GameInfo.Units[arg] returns a table. It's hard to say how the GameInfo.Units() iterator works, but it's clearly slower than Lua iteration using Lua-cached table data.
     
  6. bc1

    bc1

    Joined:
    Jan 12, 2004
    Messages:
    1,178
    Has anyone tested whether the GameInfo. iterator works any faster than a DB. sql query ? Thanks.
     
  7. Pazyryk

    Pazyryk Chieftain

    Joined:
    Jun 13, 2008
    Messages:
    3,584
    Haven't tested it. I suspect they would be similar. In both cases you are sending info from Lua to C++ (takes time), then C++ has to build an SQL query (a text string), then SQLite has to compile that text string into its own code, then SQLite runs it, then C++ has to get the results from SQLite, convert it into C++ values, and then port it back out to Lua in the form of a table. Each row in the iteration requires a whole new Lua table to be inited from scratch and then grown by Lua's table routines. There is really no way that this could be a fast process.

    If you need speed, then do what the dll does: cache the database values into a "native" form. For Lua, this would be tables, as I did in Test #7. You can cache whole tables or just the parts you need. Do this anywhere speed is an issue.
     
  8. bc1

    bc1

    Joined:
    Jan 12, 2004
    Messages:
    1,178
    Testing shows GameInfo is more than 7 times faster than DB.Query
     
  9. qqqbbb

    qqqbbb Chieftain

    Joined:
    Sep 25, 2010
    Messages:
    411
    When running test 7, line "if row.Cost < 200 then" gives error "attempt to index local 'row' (a nil value)". How can there be nil entries in tempTable?
     
  10. whoward69

    whoward69 DLL Minion

    Joined:
    May 30, 2011
    Messages:
    8,217
    Location:
    Near Portsmouth, UK
    If you've deleted units from the Units table, there will be gaps in that, and as tempTable is created from the <Units> table using its rowId value any gaps will be carried over.

    Replace
    Code:
    tempTable[row.ID] = row
    
    with

    Code:
    table.insert(tempTable, row)
    
    and any gaps will be closed up

    Otherwise, just run the tests on an unmodded game ;)
     
  11. qqqbbb

    qqqbbb Chieftain

    Joined:
    Sep 25, 2010
    Messages:
    411
    I am running tests on unmodded game but there are gaps in vanilla Units table.
     
  12. qqqbbb

    qqqbbb Chieftain

    Joined:
    Sep 25, 2010
    Messages:
    411
    Something worth mentioning:
    Code:
    for unitsRow in GameInfo.Units() do
    	for policyRow in GameInfo.Policies() do
    		if unitsRow.PolicyType == policyRow.Type then
    		end
    	end
    end
    Caching Units table in lua does not decrease code execution time. Caching Policies table does.
     

Share This Page