LuaJIT

LuaJIT is a just-in-time compiler for the Lua programming language. It is essentially a hard fork of Lua 5.1, although it includes several features backported from Lua 5.2.

LuaJIT
Original author(s): Mike Pall
Stable release: 2.0.5 / May 1, 2017
Repository: github.com/LuaJIT/LuaJIT
Written in: C, Lua
Operating system: Unix-like, macOS, Windows, iOS, Android, PlayStation
Platform: x86, x86-64, PowerPC, ARM, MIPS[1]
Type: Just-in-time compiler
License: MIT License[2]
Website: luajit.org

History

The LuaJIT project was started in 2005 by developer Mike Pall and is released under the MIT open source license.[3]

The second major release of the compiler, 2.0.0, brought major performance improvements.[4]

The latest stable release, 2.0.5, was released in 2017. Since then, the project has been maintained only by its contributors.[5]

Notable users

Performance

LuaJIT is often the fastest Lua runtime.[10] It has also been described as one of the fastest implementations of a dynamic programming language.[11]

Code that relies on features designed for the just-in-time compiler can run significantly slower when it is executed by the interpreter. For example, while foreign function interface (FFI) structs can be faster than LuaJIT's hash tables in compiled code, reading or writing these structs in non-hot code (i.e., code which is likely to be interpreted) is significantly slower. For this reason, although JIT-specific features can improve performance, the theoretically slower hash tables may be the better choice in code that is unlikely to be compiled.
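The two approaches can be placed side by side as follows. This is a minimal sketch, assuming a LuaJIT build in which the ffi module is available; point_t, sum_table and sum_struct are hypothetical names chosen for illustration, and no timings are claimed.

```lua
-- Sketch only: point_t, sum_table and sum_struct are illustrative names.
local ok, ffi = pcall(require, "ffi")   -- the ffi module is specific to LuaJIT

-- Plain Lua table: works on any Lua runtime; the fields are hash keys.
local function sum_table(n)
  local p = { x = 0, y = 0 }
  for i = 1, n do
    p.x = p.x + i
    p.y = p.y + 2 * i
  end
  return p.x, p.y
end

-- FFI struct: fields are doubles at fixed offsets. Accesses are cheap in
-- compiled traces, but comparatively slow when the loop is interpreted.
local sum_struct
if ok then
  ffi.cdef[[ typedef struct { double x, y; } point_t; ]]
  sum_struct = function(n)
    local p = ffi.new("point_t")        -- zero-initialised struct
    for i = 1, n do
      p.x = p.x + i
      p.y = p.y + 2 * i
    end
    return p.x, p.y
  end
end

print(sum_table(1000))
if sum_struct then print(sum_struct(1000)) end
```

Both helpers compute the same sums; which one is faster depends on whether the loop is hot enough to be compiled.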

Because LuaJIT is a tracing compiler, the code it generates is often recompiled and refined as the program's workload changes. As code gets 'hotter' and becomes more of a bottleneck for the program, LuaJIT continues to refine the traced code to perform well for that workload. Cloudflare relies on this behavior, using these linear 're-traces' to increase the resilience of its web application firewall to denial-of-service attacks.[12]

Major optimizations performed

  • Allocation sinking, which was introduced in LuaJIT 2.0, is a code sinking optimization which removes many unused or 'temporary' allocations from compiled code by moving each allocation to the position where the object might escape to the Lua heap. Developers of LuaJIT programs can allocate many temporary objects, and these remain on the stack or in registers until it is determined whether the temporary object may become long-lived.
  • Stitching, which was added in LuaJIT 2.1.0-beta2, enables compiled Lua code to quickly de-optimize into the interpreter in order to call a Lua C function. Previously, when JIT-compiled code attempted to call a non-FFI C function, compilation was aborted. With stitching, compilation can continue, using the interpreter's logic to call the C function.
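The kind of code that allocation sinking targets can be sketched as follows. This is an illustrative example in plain Lua, with hypothetical helper names (new_point, add, sum_points); whether a given allocation is actually sunk depends on the LuaJIT version and on how the trace is recorded.

```lua
-- Sketch only: new_point, add and sum_points are illustrative names.
local function new_point(x, y)
  return { x = x, y = y }
end

local function add(a, b)
  -- Allocates a fresh result table on every call.
  return new_point(a.x + b.x, a.y + b.y)
end

local function sum_points(n)
  local acc = new_point(0, 0)
  for i = 1, n do
    -- new_point(i, i) is a temporary: it is never used after this iteration,
    -- so a compiler that sinks allocations may avoid creating it on the heap.
    acc = add(acc, new_point(i, i))
  end
  -- Only the fields of the final accumulator are observed by the caller.
  return acc.x, acc.y
end

print(sum_points(100))
```

The source code stays in a natural object-per-value style; avoiding the temporary tables is left to the compiler rather than done by hand.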

Internal representation

LuaJIT uses two types of internal representation. A register-based bytecode is used by the interpreter, and a static single-assignment (SSA) form is used by the just-in-time compiler. Dumps of the traced bytecode and of the compiler's representation are available using the -jdump command-line option.

LuaJIT's interpreter bytecode is portable across architectures and minor version bumps, and can be used as a compact form of the source code. Loading bytecode is not safe, however, and bytecode loading should only be enabled if the source is trusted. LuaJIT's SSA form is ephemeral and only used while recording and compiling a trace.
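A round trip through bytecode can be sketched with the standard string.dump function. This is a minimal example; square and load_chunk are hypothetical names, and the fallback to load is only there so that the snippet also runs on Lua versions without loadstring.

```lua
-- Sketch only: square and load_chunk are illustrative names.
local function square(x)
  return x * x
end

-- Serialize the function to a bytecode string.
local bytecode = string.dump(square)

-- loadstring exists in Lua 5.1 and LuaJIT; later Lua versions use load.
local load_chunk = loadstring or load

-- Only load bytecode that comes from a trusted source.
local restored = assert(load_chunk(bytecode))

print(type(bytecode), restored(7))
```

The restored function behaves like the original, but the bytecode string is tied to the runtime that produced it and should not be fed to a different Lua implementation.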

LuaJIT has many "not yet implemented" (NYI) facilities which cannot be JIT-compiled. Whenever one of these is encountered, the trace is aborted and nothing is compiled.

for i = 1,100 do
    io.write("hello, world!")
    io.write("The current number is"..((i % 2 == 0 and "even") or "odd"))
end

As LuaJIT is a tracing just-in-time compiler, its compilation is trace-based. The only branching control flow a trace can contain is a conditional jump out of the trace (called a "guard"), which resumes execution in the interpreter or in an appropriate side trace. A trace can contain a loop, but this is not required; LuaJIT may choose to begin a trace at the start of a function, especially if it is avoiding a compiler abort in a function higher in the call stack.

LuaJIT does not support ahead-of-time compilation of traces.

---- TRACE 1 IR
-- Trace setup & constants redacted for brevity
0015    udt FLOAD  nil   #204
0016    p64 FLOAD  0015  udata.file  -- Load STDOUT
0017 >  p64 NE     0016  NULL        -- Sanity check: Ensure STDOUT pointer is not null
--> loop-invariant code duplication
--> This is the first iteration of the loop, performed above the loop.
--> All loop-invariants are patched into the loop from here by the LuaJIT loop optimization.
0018    nil  +17038428  +1           -- Call arguments
0019    nil  0018  +13
0020    nil  0019  0016
0021    int CALLS  fwrite      ([0x7f4b78fec858] +1   +13  0016)
0022    int BAND   0001  +1          
0023 >  int NE     0022  +0
0026    nil  +16978892  +1           -- Call arguments
0027    nil  0026  +24
0028    nil  0027  0016
0029    int CALLS  fwrite      ([0x7f4b7900b3e8] +1   +24  0016)
0030  + int ADD    0001  +1
0031 >  int LE     0030  +100
0032 ------ LOOP ------------
0033    int CALLS  fwrite      ([0x7f4b78fec858] +1   +13  0016) 
                                     -- ^^^ Write "hello, world!" (13 bytes) to STDOUT (0016)
0034    int BAND   0030  +1          -- Perform a modulo by bitwise-ANDing i to find i % 2
0035 >  int NE     0034  +0          -- If number is even, then we will abort back to the interpreter.
0036    int CALLS  fwrite      ([0x7f4b7900b3e8] +1   +24  0016) 
                                     -- ^^^ Write "The current number isodd" (24 bytes) to STDOUT (0016)
0037  + int ADD    0030  +1
0038 >  int LE     0037  +100
0039    int PHI    0030  0037        -- 0030 and 0037 are the loop-carried values of i (on entry and after the increment).
---- TRACE 1 mcode 350

In this case, LuaJIT first compiles a trace only for the loop path which prints "The current number isodd", because the loop becomes hot on its 57th iteration, where i is odd. The other case, "The current number iseven", is compiled later as a side trace. Instead of falling back to the interpreter at the 0035 "not equal" guard, execution then exits to the side trace.

---- TRACE 2 IR
-- Heavily redacted for brevity
0004 >  p64 BUFPUT 0003  "The current number i"~
0005 >  fun EQ     0002  io.write
0008 >  p64 NE     0007  NULL
0012    int CALLS  fwrite      ([0x7f4b78fdcee0] +1   +25  0007)


Extensions

LuaJIT adds several extensions to its base implementation, Lua 5.1, most of which do not break compatibility.[13]

  • "BitOp" for bitwise operations on 32-bit integers (these operations are also compiled by the just-in-time compiler)[14]
  • "CoCo", which allows the VM to be fully resumable across all contexts[15]
  • A foreign function interface[16]
  • Portable bytecode (across instruction sets, not across versions)
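The BitOp extension can be exercised as follows. This is a minimal sketch that assumes the bit module is available (it is built into LuaJIT); the pcall guard simply skips the example on runtimes that lack it.

```lua
-- Sketch only: skipped on runtimes without the "bit" module.
local ok, bit = pcall(require, "bit")

if ok then
  print(bit.band(0xff, 0x0f))    -- bitwise AND
  print(bit.bor(0x10, 0x01))     -- bitwise OR
  print(bit.lshift(1, 4))        -- logical shift left
  print(bit.tohex(255))          -- 8-digit hexadecimal string
  -- Results are normalized to the signed 32-bit range.
  print(bit.tobit(0xffffffff))
end
```

Because results are normalized to signed 32-bit values, a value such as 0xffffffff is returned as -1; bit.tohex can be used when an unsigned hexadecimal rendering is wanted.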

DynASM

DynASM
Developer(s): Mike Pall
Stable release: 2.0.5 / May 1, 2017
Preview release: 2.1.0 beta3 GC64
Repository: github.com/LuaJIT/LuaJIT
Written in: Lua, C[17]
Platform: x86, x86-64, PowerPC, ARM, MIPS
Type: Preprocessor, Linker
License: MIT License[2]
Website: luajit.org/dynasm.html

DynASM is a lightweight preprocessor for C which was created for LuaJIT 1.0.0 to make developing the just-in-time compiler easier. DynASM replaces assembly code in C files with runtime writes to a 'code buffer', so that a developer can generate and then invoke code at runtime from a C program.

DynASM was phased out in LuaJIT 2.0.0 after a complete rewrite of the assembler, but remains in use by the LuaJIT contributors as a better assembly syntax for the LuaJIT interpreter.

DynASM includes a bare-bones C header file which is used at compile time for logic the preprocessor generates. The actual preprocessor is written in Lua.

Example

|.type L,      lua_State,  esi  // L.
|.type BASE,   TValue,     ebx  // L->base.
|.type TOP,    TValue,     edi  // L->top.
|.type CI,     CallInfo,   ecx  // L->ci.
|.type LCL,    LClosure,   eax  // L->ci->func->value.
|.type UPVAL,  UpVal

|.macro copyslot, D, S, R1, R2, R3
|  mov R1, S.value;  mov R2, S.value.na[1];  mov R3, S.tt
|  mov D.value, R1;  mov D.value.na[1], R2;  mov D.tt, R3
|.endmacro

|.macro copyslot, D, S;  copyslot D, S, ecx, edx, eax; .endmacro

|.macro getLCL, reg
||if (!J->pt->is_vararg) {
|  mov LCL:reg, BASE[-1].value
||} else {
|  mov CI, L->ci
|  mov TOP, CI->func
|  mov LCL:reg, TOP->value
||}
|.endmacro

|.macro getLCL;  getLCL eax; .endmacro

[...]

static void jit_op_getupval(jit_State *J, int dest, int uvidx)
{
  |  getLCL
  |  mov UPVAL:ecx, LCL->upvals[uvidx]
  |  mov TOP, UPVAL:ecx->v
  |  copyslot BASE[dest], TOP[0]
}

References

  1. "LuaJIT". LuaJIT. Retrieved 25 February 2022.
  2. "LuaJIT/COPYRIGHT at v2.1 · LuaJIT/LuaJIT". GitHub. 7 January 2022.
  3. https://luajit.org
  4. Pall, Mike. "Re: [ANN] llvm-lua 1.0". lua-users.org. Retrieved 25 February 2022.
  5. "Download".
  6. Deniau, Laurent. "Lua(Jit) for computing accelerator beam physics". CERN Document Server. CERN. Retrieved 25 February 2022.
  7. "OpenResty® - Official Site". openresty.org.
  8. "Kong/kong". GitHub. Kong. 25 February 2022. Retrieved 25 February 2022.
  9. "Helping to make Luajit faster". blog.cloudflare.com. 19 October 2017. Retrieved 25 February 2022.
  10. "LuaJIT Performance".
  11. "Laurence Tratt: The Impact of Meta-Tracing on VM Design and Implementation". tratt.net. Retrieved 2 March 2022.
  12. Pall, Mike. "Re: How does LuaJIT's trace compiler work? - luajit - FreeLists". www.freelists.org. Retrieved 2 March 2022.
  13. "Extensions". LuaJIT. Retrieved 25 February 2022.
  14. "BitOp Semantics". LuaJIT. Retrieved 25 February 2022.
  15. "Coco - True C Coroutines". LuaJIT. Retrieved 25 February 2022.
  16. "FFI Library". LuaJIT. Retrieved 25 February 2022.
  17. "DynASM Features". DynASM. Retrieved 25 February 2022.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.