Yvm

Abstract

Yvm is a compiler library. It is similar to LLVM, but it is written in Yao and focuses on functional programming and on ease of optimization and analysis.

The other thing that sets Yvm apart from LLVM is that it is meant as a “build library.”

Rationale

Many people may ask, “Why make another compiler library when LLVM works well and is so well-supported?”

That is a fair question. There are numerous reasons:

  1. Design deficiencies in LLVM and its IR.

    The design document covers the design deficiencies of LLVM and its IR in more detail, but suffice it to say that LLVM IR includes instructions, branching, and other things that make writing optimization and analysis passes more difficult than it needs to be.

  2. Deficiencies in the LLVM C API.

    C is the lingua franca of the computing world. A C library is easier to create language bindings for, it is easier to reason about, and it requires only a C compiler, not a C++ compiler.

    The LLVM C API is not officially supported, and trying to figure out where it matches up with the C++ code is difficult at best. Even if it weren’t difficult, LLVM’s C API is incomplete; there are things in the C++ API that are inaccessible from C. And that leads to my next point:

  3. Deficiency of LLVM as a build library.

    The original use case of this library was to compile and/or run Yao, and the most important innovation of Yao, in the eyes of its creator, was using Yao as its own build system.

    (Credit where credit is due: the creator originally got the idea from Jonathan Blow, of Braid and The Witness fame, and his programming language Jai.)

    What this means is that a Yao script is used to build a Yao project. This makes the language extensible in the same way that Lisp is extensible, but without terrible syntax.

    LLVM just does not cut it, both because of its design as a very low-level IR, and because of the deficiencies in the C API.

  4. Deficiency of LLVM as a JIT.

    Yao is meant to also be run under a JIT or interpreter. LLVM makes that difficult.

  5. LLVM IR is vague.

    LLVM IR has strict rules, but it does not specify what will happen when frontends, passes, or backends break those rules.

  6. LLVM IR has target-specific features.

    LLVM IR includes a lot of target-specific features that have to be handled. However, handling them is a better job for backends.

  7. LLVM requires target-specific ABI code.

    In some cases, LLVM requires frontends to generate target-specific ABI code to interoperate with C libraries. Frontends should not have to worry about targets.

  8. LLVM IR is not stable and is not backwards compatible.

    LLVM IR files and LLVM bitcode can change drastically from version to version (though that might be changing).

  9. LLVM IR is not exactly portable.

    This has a lot of side effects: it is harder to write backends, it is harder to write frontends, and passes may need to be written for specific targets.

  10. LLVM is written in an unsafe language.

    Yvm will be written in Yao itself. And since Yao will use Yvm as its backend, and Yvm will be able to interoperate with C code, a C API will be easy to make.

  11. LLVM IR is not flexible.

    LLVM makes it almost impossible to add new instructions and specify their semantics.

  12. LLVM leaves some things undefined.

    Undefined behavior leads to bugs. Yvm will carefully define everything, including how syscalls work, which will give programmers a solid foundation to build on.

  13. LLVM IR cannot be used as a platform-agnostic distribution medium.

    One of the high-level goals of Yvm is to provide a format for distribution of code in a platform-agnostic way. LLVM cannot do this; LLVM IR is usually specialized to one particular platform, and when it’s not, it’s because the code that was compiled was platform-agnostic itself.