[haskell-llvm] llvm-general

Mon Aug 12 20:50:05 BST 2013

Hello all,

I've just learned of this mailing list, so I thought given your recent discussion I'd make a belated announcement of llvm-general:

https://github.com/bscarlet/llvm-general
http://hackage.haskell.org/package/llvm-general

llvm-general is a set of bindings and tools for working with LLVM centered around a pure Haskell AST for LLVM IR.
* It supports just about all of the LLVM IR language - I won't claim no deficiencies, but any which do exist are bugs. For example, this coverage includes full support for the exact flag on all instructions which support it.
* It encapsulates all the logic necessary to build arbitrary IR - which can be somewhat circuitous when various IR structures are self-referencing, given the stateful nature of the LLVM API.
* It supports not only generating LLVM internal C++ objects from the Haskell AST, but vice versa - i.e. it includes the essential functionality for implementing IR transformations in Haskell.
* It includes support for Instrumentation passes.
* It is Unicode-clean: strings are properly encoded to and decoded from UTF-8 before being passed through the FFI - e.g. it can handle non-ASCII identifiers and metadata.
* It currently supports llvm-3.2 and llvm-3.3 with versions on Hackage, and tracks llvm-3.4svn on the master branch on GitHub.
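
To illustrate the point about Unicode above, here is a minimal sketch of round-tripping a Haskell String through a UTF-8 encoded C string, using only GHC.Foreign from base. This is not llvm-general's own marshalling code, and the helper names roundTripUtf8 and utf8ByteLength are made up for this example.

```haskell
import qualified GHC.Foreign as GF
import GHC.IO.Encoding (utf8)

-- Encode a String as a NUL-terminated UTF-8 C string, then decode it again.
-- An FFI binding would hand the CString to foreign code in between.
roundTripUtf8 :: String -> IO String
roundTripUtf8 s = GF.withCString utf8 s (GF.peekCString utf8)

-- Size in bytes of the UTF-8 encoding, which can exceed the character count.
utf8ByteLength :: String -> IO Int
utf8ByteLength s = GF.withCStringLen utf8 s (return . snd)

main :: IO ()
main = do
  -- "\955_gr\246\223e" is a non-ASCII identifier (lambda, o-umlaut, sharp s).
  let name = "\955_gr\246\223e"
  back <- roundTripUtf8 name
  bytes <- utf8ByteLength name
  putStrLn ("round trip preserved the name: " ++ show (back == name))
  putStrLn ("characters: " ++ show (length name) ++ ", bytes: " ++ show bytes)
```

Naming the encoding explicitly, rather than relying on the process locale, is what keeps the result independent of the environment the program runs in.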

The AST does represent most of the syntactic constraints of the IR language, but it does not attempt to use Haskell types to constrain the IR to be semantically valid in any sense. As such it does not offer much of the safety of the llvm package. On the other hand, it is much simpler to generate certain constructs - e.g. a function of statically unknown arity.
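
As a rough illustration of that trade-off, here is a small self-contained sketch. The types below are toy stand-ins invented for this example, not llvm-general's actual AST, and the names (Function, sumFunction, renderFunction) are hypothetical. Because the parameters are an ordinary list, the arity of the generated function is just a run-time value.

```haskell
import Data.List (intercalate)

-- Toy types for illustration only; these are NOT llvm-general's data types.
data Operand = Local String | ConstInt Integer

-- A single instruction form: result name and two i32 operands.
data Instr = Add String Operand Operand

data Function = Function
  { fnName   :: String
  , fnParams :: [String]  -- a plain list, so arity is decided at run time
  , fnBody   :: [Instr]
  , fnResult :: Operand
  }

-- Build an i32 function that sums n i32 parameters, for any n >= 0.
sumFunction :: Int -> Function
sumFunction n = Function "sum" params body (last accs)
  where
    params = ["a" ++ show i | i <- [1 .. n]]
    temps  = ["t" ++ show i | i <- [1 .. n]]
    -- Running totals: the constant 0, then each temporary in turn.
    accs   = ConstInt 0 : map Local temps
    body   = zipWith3 (\t acc p -> Add t acc (Local p)) temps accs params

renderOperand :: Operand -> String
renderOperand (Local name) = "%" ++ name
renderOperand (ConstInt i) = show i

renderInstr :: Instr -> String
renderInstr (Add r x y) =
  "  %" ++ r ++ " = add i32 " ++ renderOperand x ++ ", " ++ renderOperand y

-- Render the toy function in the textual LLVM IR syntax.
renderFunction :: Function -> String
renderFunction f = unlines $
  [ "define i32 @" ++ fnName f ++ "("
      ++ intercalate ", " ["i32 %" ++ p | p <- fnParams f] ++ ") {" ]
  ++ map renderInstr (fnBody f)
  ++ [ "  ret i32 " ++ renderOperand (fnResult f), "}" ]

main :: IO ()
main = putStr (renderFunction (sumFunction 3))
```

Nothing in these types stops a caller from referring to an undefined local or mixing operand types; that is the kind of safety an untyped AST gives up in exchange for this flexibility.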

I have recently split llvm-general-pure out of llvm-general, isolating the AST from the FFI code and the significant amount of wrapper code that encapsulates it to keep its exposure sane. I am currently implementing a pure-Haskell version of (de)serialization to and from textual LLVM IR, so that a Haskell package could generate and emit LLVM IR using the AST without needing to link in LLVM itself. In my wilder moments, I dream that GHC itself might one day use such a facility in its LLVM backend, and so ultimately enable in-process JITing of Haskell code.

I should note that although I am of course gratified that llvm-general has been mentioned as "the new hotness," I have no particular agenda with respect to the current llvm and llvm-base packages.

That said, I can offer my own perspective as to both the value and difficulty of various ways in which llvm-general might better coexist with llvm-base and llvm:

Clearly there is considerable overlap between llvm-base and the internal FFI in llvm-general. llvm-general's FFI bindings cover a lot more, and have a fair bit of C++ glue to implement functionality missing from the LLVM C API. They also use a lot of automation (in the form of CPP macros, Template Haskell, etc.) to reduce code redundancy. For these reasons I think it might be easier, and produce a more maintainable result, to split the actual FFI bindings out of llvm-general as llvm-general-ffi than it would be to move llvm-general onto llvm-base.

The llvm package's type safety is valuable, though hard to use in some circumstances. It could be rehomed on top of llvm-general-ffi, but at least part of it might be rehomed on top of llvm-general-pure instead. The latter would expose the benefit of the static type safety even to programs not directly linked against LLVM.

Unfortunately I cannot presently commit to taking on any of these potential integration projects, though I may be able to in the future.

Regards,
Benjamin S. Scarlet

P.S. - come chat on #haskell-llvm