Introducing PureScript Erlang backend

I’m not sure why I decided to create an Erlang backend for the PureScript compiler. The platform is somewhat interesting, but I didn’t know it, and I don’t have any particular use case for it. All I know is I’d had it in my head for a while (I’m sure I wasn’t the only one to consider it), I had most of an implementation a long time before I uploaded anything, and I kept quiet about it for some time after.

Anyway the cat’s now at the bag, a compiler fork is available, along with ports of various libraries under the purerl organisation.

Overview

A new compiler executable pserlc is introduced which compiles PureScript to Erlang source. This takes the same approach as the JavaScript backend, of attempting to produce reasonable, readable output. In some places this leaves a little to be desired, and in others the translation comes out pretty neatly.

FFI is provided via Erlang source files, while functions other than compilation (psc-ide, psci, etc.) are not yet supported.

Types

PureScript type	Erlang type
`Int`	`integer()`
`Number`	`float()`
`Boolean`	`boolean()`
`String`	`string()`
`Array`	`array()`
Records	`#{atom() => any()}`
Tagged union	Tuple with tag element

Erlang data types are immutable, which works nicely as a target for a functional language. Numbers are unbounded, and these are mapped to the primitive numeric types.

Strings are currently represented as Erlang strings, which are simply lists of characters (which are themselves simply integers). This is simple, straightforward as per Haskell strings, and probably a bad idea for the same reasons - Elixir uses a Binary representation of strings, and we should maybe do the same.

Arrays are not entirely obvious. Array has special status in the PureScript compiler, with [] notation for construction, and instances provided in various core libraries. This makes it easy to interoperate with JavaScript arrays, but the performance characteristics are often poor. Erlang has built-in immutable functional arrays, implemented as 10-ary trees, which give good lookup performance (O(log n)) with reasonable immutable update performance in some cases. However the API is somewhat lean compared to lists, and in other cases there is not such a simple implementation. Many typeclass instances are currently provided via round-tripping to lists, clearly not ideal. And it is unfortunate that the native lists are not easily used with [] syntax and pattern matching.

Records come out nicely as Erlang maps, giving a simple representation of construction and update.

Tagged unions are represented as tuples, eg Just 42 and Nothing are simply {just, 42} and nothing.

Output

PureScript modules correspond to Erlang modules with some name mangling, where Data.List comes out as data_list (its FFI module is separate at data_list@foreign, re-exported via the main module as required).

The top level identifiers of a PureScript module are translated to functions in the Erlang module. To support non-function values, every definition is translated to a nullary function returning the translated value.

For example

x :: Int
x = 42

id :: forall a. a -> a
id x = x

translates to

x() = 42.

id() = fun (X) -> X end.

Erlang functions can be of higher arity, but these do not appear in the generated code. purescript-functions can be used to deal with tupled functions, and the FFI supports auto-currying imports.

Erlang syntax has an unfortunate precedence for function calls, such that application of curried functions f a b cannot be f(a)(b) but instead must be (f(a))(b). Combined with an imperfection in code generation, compiler output currently contains a staggering number of parentheses.

Pattern matching

Pattern matching in PureScript is translated into pattern matching in Erlang output. For example, consider maybe:

maybe :: forall a b. b -> (a -> b) -> Maybe a -> b
maybe b _ Nothing = b
maybe _ f (Just a) = f a

The output is (whitespace adjusted)

maybe() ->
  fun (V) ->
    fun (V1) ->
      fun (V2) -> (fun
        (B, _, { nothing }) -> B;
        (_, F, { just, A }) -> (F(A))
      end(V, V1, V2))
    end
  end
end.

A function call is used rather than a case expression due to scope. Of course with uncurried functions this would be much simpler:

maybe(B, _, { nothing }) -> B;
maybe(_, F, { just, A }) -> F(A).

Things work out nicely when it comes to nested patterns, there’s nothing for us to do.

Guard expressions, on the other hand, are a mixed bag. The simplest guards can become guards in the Erlang output.

isJust :: forall a. Maybe a -> Boolean
isJust = case _ of
  Just _ | true -> true
  _ -> false

isJust() ->
  fun (V) -> (fun
      ({ just, _ }) when true -> true;
      (_) -> false
  end(V))
end

However Erlang guard expressions are restricted, and cannot contain arbitrary expressions like function calls. For now in fact the above trivial guard and those that are boolean variables are the only one which will be translated directly.

For complex expressions we must first evaluate the guard expressions before the start of the pattern match. Due to the fact that guard expressions will generally make use of bindings in the pattern match, we end up with 2 pattern match expressions in the output.

isLarge :: Maybe Int -> Boolean
isLarge = case _ of
  Just n | n > 1000 -> true
  _ -> false

isLarge() ->
  fun (V) -> begin
    _@2 = (fun
      ({ just, N }) -> ((((data_ord:greaterThan())((data_ord:ordInt())))(N))(1000));
      (_) -> false
    end(V)),
    (fun
      ({ just, N }) when _@2 -> true;
      (_) -> false
    end(V))
  end
end.

Here the expression n > 1000 is evaluated initially, and then used as a guard in the 2nd match. If we were able to inline the typeclass dictionary the expression N > 1000 would be suitable for inclusion in a guard directly.

Lists, tuples and friends

PureScript’s Data.List can be used as is, and the representation should be reasonable, but does not coincide with Erlang lists - native Erlang lists are provided via purescript-erl-lists. This may be more efficient and could be useful for interfacing with Erlang libraries, but while a cons operator (:) is provided, as this is not a data constructor it cannot be used in pattern matching, and this must go via uncons.

Data.Tuple again can be used, but a tuple such as Tuple 1 2 will correspond to a tuple {tuple, 1, 2}. Native Erlang tuples are provided in purescript-erl-tuples - this allows for higher-arity tuples and may be useful for FFI. Again a downside of this is a lack of pattern matching, in this case conversion to Data.Tuple.Nested is provided which can permit pattern matching with /\.

Core libraries

Many of the core libraries have been ported to Erlang FFI. This has been an experience in discovering just how many transitive dependencies an innocent-looking package may have, and just how many packages include FFI definitions (sometimes for no apparent reason).

Some libraries which should work now - either with ported FFI as part of the purerl org, or FFI-free packages (which must be used with the correct dependencies such as prelude, via bower resolutions).

In purerl:

purescript-prelude
purescript-integers
purescript-math
purescript-assert
purescript-eff
purescript-partial
purescript-console
purescript-foldable-traversable
purescript-unfoldable
purescript-functions

Should work (and many more):

purescript-maybe
purescript-either
purescript-tuples
purescript-bifunctors

Further steps

I’m continuing to work on this, despite distractions of Atom/Code plugins, work and life. The large part of this is fleshing out library ports, with the occasional compiler bug surfacing. I’m driving this by trying to write an interesting app, and chasing relevant dependencies. Expect to see more here. Eventually.