LLVM vs JVM
Compile time:
- JVM: a language compiler (e.g. javac / kotlinc / scalac) compiles source code (e.g. Java / Kotlin / Scala) to bytecode;
- LLVM:
  - the front-end (e.g. Clang / Flang) compiles source code to IR (intermediate representation). IR was designed from the beginning to be a portable assembly.
  - the back-end turns IR into native machine code, which is linked into a binary executable.
Runtime:
- JVM: provides garbage collection, access to resources, etc. The JVM executes Java bytecode (by interpreting it and JIT-compiling hot code), so it has to be running during program execution (bigger footprint and higher overhead).
- LLVM: not needed at runtime, since it generates architecture-specific executables in advance.
Just-in-time and Ahead-of-time
As discussed above, LLVM is primarily ahead-of-time, while the JVM is primarily just-in-time.
LLVM also supports just-in-time compilation (based on the generated IR), since in some cases code needs to be generated on the fly, e.g. when using the REPL in Julia. (lli directly executes programs in LLVM bitcode format, using a just-in-time compiler or an interpreter.)
GraalVM (via Native Image) can compile Java applications ahead-of-time to native binaries.
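To make the just-in-time idea concrete, here is a toy sketch in Python. It is not how the JVM or LLVM work internally: the instruction tuples, the `TieredFunction` class and the `HOT_THRESHOLD` value are all made up for illustration, and "compiling" here means generating Python source and handing it to Python's built-in `compile()`, whereas a real JIT emits machine code. It only shows the general shape: interpret first, compile once the code turns out to be hot.

```python
# Toy sketch of tiered execution; every name here is hypothetical.
HOT_THRESHOLD = 3  # number of calls after which the program counts as "hot"


class TieredFunction:
    """Interprets a tiny program until it becomes hot, then runs a compiled version."""

    def __init__(self, params, program, result):
        self.params = params      # parameter names, e.g. ["a", "b", "c"]
        self.program = program    # list of (op, dest, lhs, rhs) tuples
        self.result = result      # name of the value to return
        self.calls = 0
        self.compiled = None

    def _interpret(self, args):
        # Slow path: walk the instructions one by one on every call.
        regs = dict(zip(self.params, args))
        for op, dest, lhs, rhs in self.program:
            if op == "add":
                regs[dest] = regs[lhs] + regs[rhs]
            elif op == "mul":
                regs[dest] = regs[lhs] * regs[rhs]
            else:
                raise ValueError(f"unknown op: {op}")
        return regs[self.result]

    def _compile(self):
        # One-time translation of the program into a Python function.
        symbols = {"add": "+", "mul": "*"}
        lines = [f"def compiled({', '.join(self.params)}):"]
        for op, dest, lhs, rhs in self.program:
            lines.append(f"    {dest} = {lhs} {symbols[op]} {rhs}")
        lines.append(f"    return {self.result}")
        namespace = {}
        exec(compile("\n".join(lines), "<jit>", "exec"), namespace)
        return namespace["compiled"]

    def __call__(self, *args):
        self.calls += 1
        if self.compiled is None and self.calls >= HOT_THRESHOLD:
            self.compiled = self._compile()
        if self.compiled is not None:
            return self.compiled(*args)
        return self._interpret(args)


# (a + b) * c
f = TieredFunction(
    ["a", "b", "c"],
    [("add", "t0", "a", "b"), ("mul", "t1", "t0", "c")],
    "t1",
)
```

The first two calls to `f` go through `_interpret`; from the third call on, the compiled function is used. An ahead-of-time compiler would instead do the `_compile` step before the program ever runs.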
Register-based vs stack-based
- LLVM: low-level, register-based virtual machine (LLVM IR uses an unlimited set of virtual registers). It is designed to abstract the underlying hardware and draw a clean line between a compiler back-end (machine code generation) and front-end (parsing, etc.).
- JVM: higher-level, stack-based virtual machine (rather than loading values into registers, JVM bytecode pushes values onto an operand stack and computes on them there).
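The difference is easier to see on a small example. The sketch below evaluates (a + b) * c on two toy machines written in Python. Neither instruction set is real JVM bytecode or LLVM IR; the opcode names and the helper functions `run_stack` and `run_register` are assumptions made for illustration only.

```python
# Toy illustration only: these are not real JVM or LLVM instructions.

def run_stack(program, env):
    """Stack style: operands are pushed, operators pop them and push the result."""
    stack = []
    for op, *args in program:
        if op == "load":                      # push a variable's value
            stack.append(env[args[0]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()


def run_register(program, env, result):
    """Register style: every instruction names its operands and its destination."""
    regs = dict(env)                          # variables and temporaries by name
    for op, dest, lhs, rhs in program:
        if op == "add":
            regs[dest] = regs[lhs] + regs[rhs]
        elif op == "mul":
            regs[dest] = regs[lhs] * regs[rhs]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs[result]


# (a + b) * c in both styles
STACK_PROGRAM = [("load", "a"), ("load", "b"), ("add",), ("load", "c"), ("mul",)]
REGISTER_PROGRAM = [("add", "t0", "a", "b"), ("mul", "t1", "t0", "c")]

env = {"a": 2, "b": 3, "c": 4}
stack_result = run_stack(STACK_PROGRAM, env)
register_result = run_register(REGISTER_PROGRAM, env, "t1")
```

The stack program needs five short instructions with implicit operands, while the register program needs two longer instructions with explicit operands. Both compute the same value.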
Cross Language
Because of IR and bytecode, both LLVM and JVM can support multiple languages.
- LLVM: C, C++, Rust, Swift, Fortran, Kotlin/Native, etc.
- JVM: Java, Kotlin/JVM, Scala, Groovy, Clojure, etc.
- GraalVM: Java, JavaScript, Python, Ruby, R, WASM, etc.
LLVM provides primitives for common programming language features, e.g. functions, global variables, coroutines and C foreign-function interfaces. LLVM has many of these as standard elements in its IR. The split between front-end and back-end frees high-level language compilers from having to target every platform (they only need to emit LLVM intermediate representation).
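The benefit of the split can be sketched with a toy pipeline in Python. Two made-up front-ends (one for postfix, one for prefix expressions) lower to the same tiny "IR", and two made-up back-ends consume it. The IR here is just a list of tuples, not real LLVM IR, and all function names are hypothetical. The point is only that adding a language means writing one front-end, and adding a target means writing one back-end.

```python
# Toy sketch: the "IR" is a list of (op, dest, lhs, rhs) tuples, not LLVM IR.
OPS = {"+": "add", "*": "mul"}


def frontend_postfix(source):
    """Hypothetical front-end 1: lower a postfix expression such as 'a b + c *'."""
    ir, stack = [], []
    for token in source.split():
        if token in OPS:
            rhs, lhs = stack.pop(), stack.pop()
            dest = f"t{len(ir)}"              # fresh temporary name
            ir.append((OPS[token], dest, lhs, rhs))
            stack.append(dest)
        else:
            stack.append(token)               # a variable name
    return ir


def frontend_prefix(source):
    """Hypothetical front-end 2: lower a prefix expression such as '* + a b c'."""
    tokens = iter(source.split())
    ir = []

    def lower():
        token = next(tokens)
        if token not in OPS:
            return token                      # a variable name
        lhs = lower()
        rhs = lower()
        dest = f"t{len(ir)}"
        ir.append((OPS[token], dest, lhs, rhs))
        return dest

    lower()
    return ir


def backend_eval(ir, env):
    """Hypothetical back-end 1: execute the IR directly and return the last result."""
    regs = dict(env)
    result = None
    for op, dest, lhs, rhs in ir:
        regs[dest] = regs[lhs] + regs[rhs] if op == "add" else regs[lhs] * regs[rhs]
        result = regs[dest]
    return result


def backend_asm(ir):
    """Hypothetical back-end 2: emit a made-up assembly listing."""
    return "\n".join(f"{op.upper()} {dest}, {lhs}, {rhs}" for op, dest, lhs, rhs in ir)


ir_from_postfix = frontend_postfix("a b + c *")
ir_from_prefix = frontend_prefix("* + a b c")
```

Both front-ends produce the same IR for (a + b) * c, so either back-end works with either source syntax without knowing which one was used.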
LLVM front-ends:
- Clang: for C family of languages, e.g. C, C++, Objective-C
- Flang: for Fortran, included in LLVM releases since LLVM 11
Implementation
- LLVM itself is written in C++; it provides C and C++ APIs.
- OpenJDK's JVM (HotSpot) is written in C++.
- GraalVM is written in Java.