1.

What Makes LuaJIT Faster Than Lua?

Answer»

Firstly, LuaJIT has a faster baseline interpreter. Even without the JIT, LuaJIT is already faster than baseline Lua for three reasons:

The interpreter uses a custom bytecode format. The Lua 5.1 format needs a bit more bit fiddling to decode an instruction, but LuaJIT's format only uses fields that are multiples of 1 byte. This makes decoding instructions faster. Since decoding has to be done for every single instruction, a simpler format directly translates into a faster interpreter. (By how much depends on the complexity of each instruction, though.)
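
To make the decoding difference concrete, here is a minimal sketch in C. The field widths and positions are assumptions based on my reading of the two formats (6/8/9/9 bits for Lua 5.1, four 8-bit fields for LuaJIT), and `Decoded`, `decode_lua51`, and `decode_bytewise` are hypothetical names, not code taken from either project.

```c
#include <stdint.h>

/* Hypothetical container for the decoded operands. */
typedef struct { unsigned op, a, b, c; } Decoded;

/* Lua 5.1-style layout (assumed): 6-bit opcode, 8-bit A, 9-bit C, 9-bit B.
 * Every field needs a shift plus a mask that is not byte-aligned. */
static Decoded decode_lua51(uint32_t ins) {
    Decoded d;
    d.op = ins & 0x3Fu;
    d.a  = (ins >> 6)  & 0xFFu;
    d.c  = (ins >> 14) & 0x1FFu;
    d.b  = (ins >> 23) & 0x1FFu;
    return d;
}

/* LuaJIT-style layout (assumed): four 8-bit fields, opcode in the low byte.
 * Each field is exactly one byte, so a compiler (or hand-written assembly)
 * can fetch it with a plain byte load instead of shift-and-mask. */
static Decoded decode_bytewise(uint32_t ins) {
    Decoded d;
    d.op = ins & 0xFFu;
    d.a  = (ins >> 8)  & 0xFFu;
    d.c  = (ins >> 16) & 0xFFu;
    d.b  = ins >> 24;
    return d;
}
```

The C source looks similar for both, but on most CPUs the byte-aligned version can compile down to single byte-extraction instructions, which is where the saving comes from.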

It uses direct dispatch. The standard way of implementing an interpreter in C is to use a loop with a big `switch` statement at the top, which dispatches to the code that executes each instruction. A faster way is to use a table of code labels and have each instruction's handler decode the next instruction and jump directly to the label for its opcode. If you want to do this in C you need a GNU C extension (labels as values, often called computed goto), which GCC and Clang support. You cannot do this in ANSI C, which standard Lua aims to conform to.
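
The two dispatch styles can be sketched side by side on a toy stack machine. This is only an illustration: the opcodes and the functions `run_switch` and `run_direct` are made up for the example, and the second function relies on the GNU C labels-as-values extension, so it builds with GCC or Clang but not with a strict ANSI C compiler.

```c
#include <stdint.h>

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Conventional dispatch: one loop, one switch at the top. */
static int run_switch(const uint8_t *pc) {
    int stack[16];
    int sp = 0;
    for (;;) {
        switch (*pc++) {
        case OP_PUSH: stack[sp++] = *pc++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1; /* unknown opcode */
        }
    }
}

/* Direct dispatch: every handler fetches the next opcode and jumps
 * straight to its label, without returning to a central switch.
 * Requires GNU C "labels as values" (&&label and goto *ptr). */
static int run_direct(const uint8_t *pc) {
    static void *const labels[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    int stack[16];
    int sp = 0;
#define DISPATCH() goto *labels[*pc++]
    DISPATCH();
do_push:
    stack[sp++] = *pc++;
    DISPATCH();
do_add:
    sp--; stack[sp - 1] += stack[sp];
    DISPATCH();
do_mul:
    sp--; stack[sp - 1] *= stack[sp];
    DISPATCH();
do_halt:
    return stack[sp - 1];
#undef DISPATCH
}
```

Both functions compute the same result for the same program, for instance `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }`. The difference is that the direct version ends each handler with its own indirect jump, which tends to be friendlier to the CPU's branch predictor than a single shared jump at the top of the loop.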

LuaJIT's interpreter is written in assembly. This makes matters quite a bit more complicated (and obviously unportable), but makes it possible to outsmart the compiler. For this specific use case, hand-rolled assembly indeed beats a compiler in almost all cases. Google's Dalvik VM interpreter is also written in assembly, and I believe so is the JVM's.



