General Lua Techniques

oChuckles

Hi everyone! I'm a guy who writes a LOT of Lua. I thought I'd share some of the more advanced techniques that I regularly use, because I see a lot of code written (both in Civ V and in my personal experience) that doesn't take full advantage of this very powerful language. I'm attempting to leverage my experience and give advice, but I will also be explaining concepts that are explained in about as much detail in Programming In Lua, which I'm assuming you've got on hand.

My main goal is to help you think in Lua. I see a lot of programmers from the C or Java worlds who see Lua and think "oh I guess it's nice not to have to worry about x y and z but I don't really see what it buys me. I'm hardcore enough not to need the convenience." My main point here is to show that things like dynamic typing, garbage collection, and first class functions are not just conveniences, but that they allow the language to provide you with indispensable tools that can create non-trivial elegance and conciseness that could not be easily expressed in a lower level language.


Lua not LUA

Lua is a word which means "Moon" in Portuguese; it's not an acronym.


Closures

Both tables and functions can be used for side-effects, independently. Those of you who know a bit of Lua (but apparently not enough) might ask "hey if only tables pass by reference, how can a function make a side effect without taking a table as an argument?"

The answer lies in two properties of functions that Lua has that languages like C do not and languages like Python discourage with syntax.

The first is that functions are first class objects. They are a value type like strings and numbers. "function foo() end" is just syntactic sugar for "foo = function() end".
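Here's a minimal sketch of what "first class" buys you in practice. The names shout, whisper, voices and apply_twice are made up for illustration.

```lua
-- "local function f" is sugar for "local f; f = function ... end".
local function shout(s) return s:upper() .. "!" end
local whisper = function(s) return s:lower() .. "..." end

-- Functions can be stored in tables like any other value...
local voices = { loud = shout, quiet = whisper }

-- ...and passed to (or returned from) other functions.
local function apply_twice(f, x)
  return f(f(x))
end

print(voices.loud("hi"))           --> HI!
print(apply_twice(whisper, "Hi"))  --> hi......
```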

The second is that Lua functions carry around their scope in something called a closure. Closures are references to dynamic scopes. Scopes in Lua are sort of isomorphic to classes in, say, C++, except that their definition, instantiation, and deallocation are all implicit in the structure of your code.

Scopes are governed by two rules:

- When you call a function, a scope containing all that function's local variables is "instantiated."
- When Lua evaluates the "function()...end" expression, references to all enclosing scopes are created, as a sort of "this" or "parent" pointer. This is called a closure.

Function calls aren't the only thing that dynamically create scopes. Control structures like "if then end" and "for do end" and even just "do...end" do as well. All of these can "hide" things, but functions can do so in generic, patterned ways. This is what makes them a powerful tool for abstraction.

It's important to note that this has some effects on optimization. I'll touch on that later.

Right now you might be a bit confused about Lua scopes. How are they different from levels of the stack (which in C are delimited by { and })? The answer is that Lua has no stack*. Scopes are garbage collected like anything else, rather than being freed at the point where their block happens to end. This means that functions carry around references to fully-formed, garbage-collected objects (kind of like the concept of a "delegate" in C# or D). So, if no first-class function value is referencing a variable in a scope, it does indeed get collected in a "stack-like" way. But otherwise it sticks around.

So Lua does not have a strict separation of algorithm and data structure like C does. Your code in essence describes a data structure, even if you aren't using a table.

*except the one used to pass things back and forth from C and to and from functions, but don't let that confuse you. It isn't used for memory management, which is coincidentally one of the reasons it works so well with C.

Enough with the high-level; let's see some examples.

Code:
do --a scope is created, containing a local called oldprint
  local oldprint = print    --oldprint is set to the value of the global function "print"
  print = function() oldprint"haha you can't print now" end  --this creates a closure over the do...end because it uses oldprint
end --so as long as print is equal to that new function, the scope does NOT get garbage collected

--oldprint can no longer be accessed by anyone, except through the "new and improved" print.

print = nil --now the scope will be garbage collected (along with both the old function and the new closure)

One thing that's easy to forget (because some languages, such as Python, do it the opposite way) is that your loop variables are inside the loop's scope, not the enclosing scope. So for instance,

Code:
local t = {}
for i = 1,5 do
  t[i] = function() return i end
end

print(i) --> nil

for j = 1,5 do
  print(t[j]()) -->prints 1 2 3 4 5, not 5 5 5 5 5
end


Although similar to classes, closures are more flexible and generic. I have found it more useful to think of classes as a special case of closures than the other way around, both in my Lua and my OOP code.

Here's an example of something resembling a class that uses no tables whatsoever.

Code:
--sort of like a class constructor
local function accgen(n) 

  --default argument
  n = n or 0 

  --sort of like a "private member"
  local original = n 

  --sort of like a "private method"
  local function printOriginal() 
    print("the original value of n was "..original)
  end

  --sort of like a "public method"
  return function(delta) 
    n = n + (delta or 0)  --parentheses needed: "n + delta or 0" parses as "(n + delta) or 0"
    printOriginal()  
    return n
  end
end

local inst = accgen(5)
print(inst(1)) --> the original value of n was 5
                  --> 6
print(inst(5))  --> the original value of n was 5
                   --> 11
print(accgen()(3)) -->the original value of n was 0
                         --> 3

Since there was only 1 public method, I just had it return the method. If there had been more than one, I would have returned a table containing them.
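For the "more than one public method" case, a sketch might look like the following. newCounter and its methods are hypothetical names; the point is only that the returned table is a bundle of closures over the same private locals.

```lua
-- Hypothetical counter "class" built only from closures.
local function newCounter(start)
  local n = start or 0          -- private state shared by the methods below

  local function clamp()        -- private helper, invisible to callers
    if n < 0 then n = 0 end
  end

  -- the returned table is just a bundle of public methods
  return {
    add = function(delta)
      n = n + (delta or 1)
      clamp()
      return n
    end,
    get = function() return n end,
    reset = function() n = start or 0 end,
  }
end

local a, b = newCounter(5), newCounter()
a.add(2)
b.add(-10)
print(a.get(), b.get())  --> 7    0
```

Note that the methods are called with a dot, not a colon: there is no "self", because the state lives in the closure rather than in the table.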

A great way to explain closures is with iterators. Let's take a look at a Civ V iterator:

Code:
--an iterator
local function hexneighbors(x,y) --this part is called explicitly at the start of the loop below
  local i = 0
  local odd = y % 2
  local neighs =
  {
    {x-1,y},
    {x-1+odd, y+1},
    {x+odd,y+1},
    {x+1,y},
    {x+odd, y-1},
    {x-1+odd, y-1}
  }
  return function() --this part is called implicitly by the language on every iteration. 
    i = i + 1
    if neighs[i] then return unpack(neighs[i]) end
    --A return of nil means to stop.
  end
end

for x,y in hexneighbors(10,10) do print(x,y) end

Tables on the other hand are less useful for abstraction. In my experience, tables are more for grouping like things and creating better-looking interfaces. Tables are for presentation and configuration; functions are for abstraction.

So now I'm going to make a claim that might have seemed counter-intuitive before: functions are more like C++/Java-style classes than tables are. Their goal is to hide things in generic, abstract ways.

It may be confusing and strange at first, but the combination of closures and first-class functions is absolutely worth grokking. Once you really get them, they make your life so much easier in ways that are difficult to explain until you get out there and prove it to yourself. A good way to get good at it is to challenge yourself to use as few tables as you can. (That's what I did, anyway.) This attitude will also generally improve the performance of your programs.

Of course, table/metatable combinations CAN be used for abstraction, and there are many situations where it makes more sense. In my experience these situations are not in the majority, however. The problem is not so much that I often see people using tables where a function would do, but that I see people completely ignoring the existence of closures and trying to code under some restrictive "paradigm" that Lua is not enforcing on them. It's a bit masochistic.


Metatables

Metatables are used for a bunch of things in Lua, so I'm going to go over them one at a time.

__index and __newindex
Essentially these metamethods are used to overload the [] and []= operators. They're good for "subverting" functionality, similar to the "get" and "set" of C#, but more powerful because of dynamic typing. It's also possible to implement inheritance with them, but I think passing first class functions is often more elegant and is usually more efficient. It's important to remember that setting __index to a table is only a convenience. The basic behavior is for it to be a function. __index and __newindex are basically a way of giving functions table-like syntax. One thing that is easy to forget is that they are only called if the table's field is empty (nil) for that key, so it's sometimes useful to put a metatable on an empty dummy table. Another thing to remember is rawset and rawget, which are occasionally needed inside the definition of an __index or __newindex function.
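As a rough sketch of the "empty dummy table" trick: the proxy below never holds any fields itself, so both metamethods fire on every access. The helper name tracked and the log format are assumptions made up for this example.

```lua
-- Hypothetical helper: wraps a table so every read and write is logged.
local function tracked(target)
  local log = {}
  local proxy = {}   -- stays empty, so the metamethods fire on every access
  setmetatable(proxy, {
    __index = function(_, key)
      log[#log + 1] = "get " .. tostring(key)
      return rawget(target, key)
    end,
    __newindex = function(_, key, value)
      log[#log + 1] = "set " .. tostring(key)
      rawset(target, key, value)   -- write to the real table, not the proxy
    end,
  })
  return proxy, log
end

local unit, log = tracked({ hp = 10 })
unit.hp = unit.hp - 3
print(unit.hp)                  --> 7
print(table.concat(log, ", "))  --> get hp, set hp, get hp
```

If __newindex had written into the proxy instead, the field would no longer be empty and later reads and writes of that key would silently bypass both functions.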

Likewise, __call is for the opposite -- giving tables function-like syntax.
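A small sketch of __call, assuming a made-up Vec namespace: the table holds helper functions, but can also be called directly as if it were a constructor.

```lua
-- Hypothetical "vector" namespace that can also be called like a function.
local Vec = setmetatable({}, {
  __call = function(self, x, y)   -- self is the Vec table itself
    return { x = x, y = y }
  end,
})

function Vec.length2(v) return v.x * v.x + v.y * v.y end

local v = Vec(3, 4)        -- a table called with function syntax
print(Vec.length2(v))      --> 25
```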

The math operator overloading metamethods like __add are for things like, say, making a dice function that lets you do
Code:
--creates a function that when called rolls "1d8 + 1d4 + 3"
rollDamage = dice(1,8) + dice(1,4) + 3 
--I've implemented this myself, but I'll leave it as a challenge for the reader :D
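Spoiler warning for anyone who wants to take the challenge: below is one possible sketch, not the implementation mentioned above. It assumes a "roll" is a callable table (via __call) rather than a plain function, and the names Roll, roll and lift are invented for the example.

```lua
local Roll = {}   -- hypothetical metatable shared by all roll objects

-- wrap a plain function in a callable, addable object
local function roll(f)
  return setmetatable({ f = f }, Roll)
end

-- numbers are promoted to constant rolls so "dice + 3" works
local function lift(v)
  if type(v) == "number" then return roll(function() return v end) end
  return v
end

Roll.__call = function(self) return self.f() end
Roll.__add = function(a, b)
  a, b = lift(a), lift(b)
  return roll(function() return a() + b() end)
end

local function dice(count, sides)
  return roll(function()
    local total = 0
    for _ = 1, count do total = total + math.random(sides) end
    return total
  end)
end

local rollDamage = dice(1, 8) + dice(1, 4) + 3
print(rollDamage())  -- some value between 5 and 15
```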

You can also use metatables to create weak keys or values in a table. Sometimes it makes more sense to have a table be weak than to explicitly delete its members. It's a bit of an edge case but important nonetheless.
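A sketch of weak keys, assuming the reference Lua implementation's collector: the side table below attaches notes to objects without keeping those objects alive. The table and function names are made up.

```lua
-- Hypothetical side table that annotates objects without owning them.
local notes = setmetatable({}, { __mode = "k" })   -- "k" means weak keys

local kept = {}                 -- still referenced by a local, so it survives
notes[kept] = "still in use"

local function addTemporary()
  local temp = {}               -- nothing else refers to this table
  notes[temp] = "garbage soon"
end
addTemporary()

collectgarbage()                -- force a full collection cycle

local count = 0
for _ in pairs(notes) do count = count + 1 end
print(count)  -- only the entry for "kept" should remain
```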

Coroutines

I have found that coroutines are invaluable for 4 things: state machines, messaging, iterators, and making functions re-entrant. I often find myself wishing other languages had coroutines, because they cover a lot of ground.

Basically, any time you have something you want to "pause and come back to," coroutines will make your life much easier. It's possible to think about all kinds of problems in terms of coroutines! And when you do, you end up writing code that is smaller and easier to read... provided that the reader understands coroutines.

So the next time you find yourself jumping through hoops trying to get two or more "conceptual lines of code that -feel- like they should just look like code" go back and forth, use a coroutine.
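To make the state machine case concrete, here is a toy sketch (the traffic light is just an invented example). Each yield is a state, and the transitions are ordinary control flow instead of a hand-written transition table.

```lua
-- Hypothetical traffic light: the coroutine "pauses" in each state.
local function trafficLight()
  return coroutine.wrap(function()
    while true do
      coroutine.yield("green")
      coroutine.yield("yellow")
      coroutine.yield("red")
    end
  end)
end

local nextState = trafficLight()
local seen = {}
for i = 1, 4 do seen[i] = nextState() end
print(table.concat(seen, " "))  --> green yellow red green
```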

example:

Code:
--let's revisit our hexneighbors function with a coroutine.
local cowrap = coroutine.wrap
local coyield = coroutine.yield

local function hexneighbors(x,y)
  local odd = y % 2
  return cowrap(function()
    coyield(x-1,y)
    coyield(x-1+odd, y+1)
    coyield(x+odd,y+1)
    coyield(x+1,y)
    coyield(x+odd, y-1)
    coyield(x-1+odd, y-1)
  end)
end

Not only is this more concise, it also avoids constructing any tables, which usually makes it more performant (creating the coroutine has a cost of its own, though, so measure if it matters).

It's easy to see how this can make certain tasks a lot simpler. Imagine if the cowrapped function had conditionals, for instance.
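For instance, here is a sketch of the same iterator with a conditional: neighbors that fall off the map are skipped. The width and height parameters and the zero-based bounds check are assumptions for illustration, not anything taken from the Civ V API.

```lua
-- Same neighbor order as before, but off-map plots are never yielded.
local function hexneighborsInBounds(x, y, width, height)
  local odd = y % 2
  local function visit(nx, ny)
    if nx >= 0 and nx < width and ny >= 0 and ny < height then
      coroutine.yield(nx, ny)
    end
  end
  return coroutine.wrap(function()
    visit(x - 1, y)
    visit(x - 1 + odd, y + 1)
    visit(x + odd, y + 1)
    visit(x + 1, y)
    visit(x + odd, y - 1)
    visit(x - 1 + odd, y - 1)
  end)
end

-- The corner plot (0,0) of a 4x4 map has only two on-map neighbors.
for nx, ny in hexneighborsInBounds(0, 0, 4, 4) do print(nx, ny) end
```

Doing the same with the table-based version would mean either filtering the table up front or adding a skip loop inside the iterator function.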

It's also great for web development, a practice known as "continuation passing style."


Code reuse

I mentioned earlier that functions that take functions are similar to inheritance. In fact, passing functions to functions is a good way to do code reuse in general.

For instance:
Code:
local min,max = math.min, math.max

--initial move_objs
local function move_objs (objs, dx, dy)

  local x0, y0, x1, y1 = objs.bounds()

  for i,o in ipairs(objs) do
    o.x = o.x + dx
    o.y = o.y + dy
  end

  local xa, ya, xb, yb = objs.bounds()

  redraw(min(x0,xa), min(y0,ya), max(x1,xb), max(y1, yb))
end

--similar scale_objs
local function scale_objs(objs, factor)

  local x0, y0, x1, y1 = objs.bounds()

  for i,o in ipairs(objs) do
    o.x = o.x * factor
    o.y = o.y * factor
  end

  local xa, ya, xb, yb = objs.bounds()

  redraw(min(x0,xa), min(y0,ya), max(x1,xb), max(y1, yb))

end

--instead you could write:

--function describing the pattern 
local function with_redraw(f)
  return function(objs, ...)

    local x0, y0, x1, y1 = objs.bounds()

    for i,o in ipairs(objs) do
      f(o,...)
    end

    local xa, ya, xb, yb = objs.bounds()

    redraw(min(x0,xa), min(y0,ya), max(x1,xb), max(y1, yb))

  end
end

--redefine the functions more concisely.
move_objs = 
with_redraw(function(o, dx, dy)
  o.x = o.x + dx
  o.y = o.y + dy
end)

scale_objs = 
with_redraw(function(o, factor)
  o.x = o.x * factor
  o.y = o.y * factor
end)

Lua does not have a macro system, but macros are easy to do without. In C++, I often use them to express "small patterns" that would be a pain to do with inheritance. It's simply not a pain in Lua. In Lisp, macros are used for things that you would use metatables for in Lua. Granted, Lisp macros are more powerful than metatables (nothing is more powerful than Lisp macros), but that's just where Lua draws its line. It prefers having a syntax to being arbitrarily modifiable.


Avoid globals

Lua files are Lua functions, so "local" approximately means file scope when used at the file level. Not only that, but Lua files can and should return values. Usually I like to return a function (usually some kind of constructor) or a table (usually some kind of namespace) from my files. A common idiom of mine is
Code:
 local foo = require"foo"
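
To show what the other side of that require might look like without needing two files, the sketch below registers the module body in package.preload. This assumes a host that exposes the standard require; the module name "counter" and its functions are made up.

```lua
-- Stand-in for a separate file "counter.lua" (hypothetical module).
package.preload["counter"] = function()
  -- everything below would be the body of counter.lua
  local created = 0             -- file-scope local, invisible to other files

  local M = {}
  function M.new()
    created = created + 1
    local n = 0
    return function() n = n + 1; return n end
  end
  function M.created() return created end

  return M                      -- the file's return value is what require hands back
end

local counter = require "counter"
local tick = counter.new()
tick(); tick()
print(tick(), counter.created())  --> 3    1
```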

It's very easy to entirely avoid globals in Lua. It's a good idea to take advantage of this for reasons that have been well explained by others. In Lua, making everything local has the added advantage of performance. Globals sit in the _G table, and local access is much faster than table lookup.
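A common way to apply this is to cache library functions in locals once at the top of the file. The distance function here is just an invented example.

```lua
-- Cache library functions in locals once, at file scope.
local sqrt, floor = math.sqrt, math.floor

local function distance(x0, y0, x1, y1)
  local dx, dy = x1 - x0, y1 - y0   -- plain locals, no table access
  return sqrt(dx * dx + dy * dy)    -- no _G or math table lookup per call
end

print(floor(distance(0, 0, 3, 4)))  --> 5
```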

Obviously, an API may require you to use globals, but that doesn't mean you have to use any more than they're making you use.


Optimization

In C, declaring variables outside of the loop is faster than declaring them inside (although most modern compilers will optimize it anyway.) In C++ accessing a member variable can be much faster than making one on the stack every time a method is called, even if you aren't going to use it for side-effects.

However, in Lua, it's faster to declare your locals in the deepest scope you can, because it's harder to look up a local in an outer scope than an inner one. If you're interested in learning to optimize Lua, it's important to note that there are many counter-intuitive issues like this. One thing you might not believe: this "scope lookup" thing is almost always faster than indexing into a "self" variable, and closures are also usually more memory-efficient than tables (but not always).

In Lua, the things to watch out for performance-wise (that are unique to Lua) are:
1) table construction (try reusing tables or rearchitecting with closures or coroutines. Another thing you can do is to make what I call "sideways objects" where each field has a table rather than each object, and you keep track of an ID rather than a table reference.)
2) too much string concatenation (try using the string library, or table.concat, which takes a table and is much more efficient than manually concatenating each entry.)
3) function constructors (It might make more sense to use a metatable-style object for things with many many methods that will be instantiated many many times.)
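
A rough sketch of points 1 and 2 together, with made-up names: "sideways objects" stored as one table per field and identified by an integer ID, plus table.concat to build a single string from many pieces.

```lua
-- "Sideways objects": one table per field, objects identified by an ID.
local xs, ys, names = {}, {}, {}
local count = 0

local function newUnit(name, x, y)
  count = count + 1
  xs[count], ys[count], names[count] = x, y, name
  return count                     -- the "object" is just its ID
end

local function moveAll(dx, dy)
  for id = 1, count do             -- no per-object table is ever created
    xs[id], ys[id] = xs[id] + dx, ys[id] + dy
  end
end

-- Collect the pieces, then join once instead of repeated ".." in the loop.
local function describeAll()
  local parts = {}
  for id = 1, count do
    parts[#parts + 1] = string.format("%s@%d,%d", names[id], xs[id], ys[id])
  end
  return table.concat(parts, "; ")
end

newUnit("warrior", 1, 1)
newUnit("settler", 4, 2)
moveAll(1, 0)
print(describeAll())  --> warrior@2,1; settler@5,2
```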

Things not to worry so much about:
1) Passing, hashing, and comparing strings. Although strings pass by value, they are also immutable. This means that under the hood, Lua can make all kinds of sexy optimizations. For instance, comparing two strings for equality is a constant time operation, unlike C's strcmp. The big thing to worry about with strings is construction and concatenation. Passing them to functions, hashing them in tables, etc. are all much more performant than you might think.


Unlearning Limitations

You don't need classes
Classes are really just functions that return functions. Inheritance is passing functions to functions. It's all functions! The OOP mindset is limiting because closures are actually more powerful than classes are. A function that returns a function that returns a function is almost like a "manager" or "factory pattern" in OOP. (You'll notice a lot of design patterns that are invisible or else trivial in Lua (if you're the kind of programmer who notices these things.)) There are as many levels of "staticness" as you want, and there's a lot that can be anonymous and invisible. FooVisitorIteratorManager is not something you write in Lua, but it might be something that is going on under the hood.
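
As a sketch of "inheritance is passing functions to functions" (all names here are invented): the base constructor takes the varying behavior as a function, and the "subclasses" are just functions that call it.

```lua
-- Hypothetical "base class": the behavior that varies comes in as a function.
local function makeUnit(name, attackFn)
  local hp = 10                    -- private state
  return {
    name = function() return name end,
    attack = function(target) return attackFn(name, target) end,
    damage = function(n) hp = hp - n; return hp end,
  }
end

-- "Subclasses" call the base constructor with their own behavior.
local function makeArcher(name)
  return makeUnit(name, function(who, target)
    return who .. " shoots " .. target
  end)
end

local function makeWarrior(name)
  return makeUnit(name, function(who, target)
    return who .. " strikes " .. target
  end)
end

print(makeArcher("Robin").attack("a barbarian"))  --> Robin shoots a barbarian
print(makeWarrior("Conan").damage(3))             --> 7
```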

You don't need enums
Most of the purpose of an enum is for static typing only. Regardless of whether you pass a string or a number indexed by a string, you're going to end up with a runtime error in a dynamically typed language. Passing a string in Lua is no harder on the machine than passing a number. It's more for syntactic ease and ability to subvert behavior that you'd put it in a table. If you want to loop through them, it's often better to have {"yes","no"} rather than {yes = 0, no = 1}. The exception of course is utilizing enums written in C.
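
A minimal sketch of the string-list approach, with hypothetical names: the list is what you loop over, and a lookup set is derived from it only if you want validation.

```lua
-- The "enum" is just a list of strings.
local answers = { "yes", "no" }

-- Derive a lookup set from the list when validation is wanted.
local isAnswer = {}
for _, name in ipairs(answers) do isAnswer[name] = true end

local function respond(answer)
  assert(isAnswer[answer], "unknown answer: " .. tostring(answer))
  return "you said " .. answer
end

for i, name in ipairs(answers) do print(i, respond(name)) end
print(pcall(respond, "perhaps"))  -- false, followed by the error message
```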

You don't need to prematurely specify anything
Programmers brought up on static OOP may find it disquieting that so little "pre-design" is required in a language like Lua. You simply don't need to know the structure ahead of time. The compiler isn't baby-sitting you; you're free!
 
Being a programmer first and a modder second, I really appreciate someone stepping up and explaining Lua beyond "copypasta this to do X". Although I doubt I'll ever play or mod Civ5 (aversion to Steam), I hope I'll get into Lua in the near future. Your examples sure made me more impatient.

Anyway, I got to nitpick:
- Python fully supports first-class functions and closures, it only chooses a different (and admittedly, perhaps strange if you're not used to it) tradeoff with regards to scoping: Local variables are created implicitly (i.e. without a "local"), so changing a variable in an outer scope needs that variable to be declared "nonlocal". Alas, that's Python 3 only... in 2.x (which Civ 4 used), scoping is half-assed in this regard (you can still hack around this with minimal effort and zero extra lines, it's just ugly).
- I think you got "closure" and "scope" backwards in a few places. "do ... end" is its own scope. A closure is a function that refers to variables from an outer scope (e.g. the "function () ... end" that refers to oldprint).

But that's just nitpicking. Really, thank you for this post. Also, nice plea for dynamicness :clap:
 
Thanks! I've corrected the terminology.

I haven't had a chance to use Python 3 yet, but the syntax seems to still highly discourage this style of coding, even if the language can now semantically handle it. I'm guessing the implementation hacks around the stack instead of just removing it (like Stackless Python does,) which means I wouldn't use closures for performance reasons in Python anyway. Although, to be honest, I'd just use Lua from the get go... :p
 
Mhm I just realized that Firaxis removed all the Lua execution routines except for include and their own context functions. So no require, dofile or loadfile. And therefore no return of values from loading files, if I'm not mistaken.
 
Hmmm, I'd like to see their argument against require(). Maybe it's a security concern?

I've noticed that most of Firaxis's Lua basically just looks like straight C. It's too bad that this limited style is forced on the modders :/ It bugs me that C++ programmers tend to think of scripts as "code that is faster to compile but slower to run" rather than "code that will make you 4x as productive if you really get it."

I can't wait for the dll to come out. One thing I'm considering is a "Luafication Mod" that pulls a lot of the C++ game logic and as much as possible of the XML into Lua. It kind of depends on if I'm employed/how many classes I'm taking at the time... (these are the reasons I wrote this guide instead of a mod :p)
 