Introducing PureScript Erlang backend
I’m not sure why I decided to create an Erlang backend for the PureScript compiler. The platform is somewhat interesting, but I didn’t know it, and I don’t have any particular use case for it. All I know is I’d had it in my head for a while (I’m sure I wasn’t the only one to consider it), I had most of an implementation a long time before I uploaded anything, and I kept quiet about it for some time after.
Anyway the cat’s now at the bag, a compiler fork is available, along with ports of various libraries under the purerl organisation.
Overview
A new compiler executable pserlc
is introduced which compiles PureScript to Erlang source. This
takes the same approach as the JavaScript backend, of attempting to produce reasonable, readable
output. In some places this leaves a little to be desired, and in others the translation
comes out pretty neatly.
FFI is provided via Erlang source files, while functions other than compilation (psc-ide, psci, etc.) are not yet supported.
Types
PureScript type | Erlang type |
---|---|
Int |
integer() |
Number |
float() |
Boolean |
boolean() |
String |
string() |
Array |
array() |
Records | #{atom() => any()} |
Tagged union | Tuple with tag element |
Erlang data types are immutable, which works nicely as a target for a functional language. Numbers are unbounded, and these are mapped to the primitive numeric types.
Strings are currently represented as Erlang strings, which are simply lists of characters (which are themselves simply integers). This is simple, straightforward as per Haskell strings, and probably a bad idea for the same reasons - Elixir uses a Binary representation of strings, and we should maybe do the same.
Arrays are not entirely obvious. Array
has special status in the PureScript compiler,
with []
notation for construction, and instances provided in various core libraries.
This makes it easy to interoperate with JavaScript arrays, but the performance characteristics
are often poor. Erlang has built-in immutable functional arrays, implemented as 10-ary trees,
which give good lookup performance (O(log n)) with reasonable immutable update performance
in some cases. However the API is somewhat lean compared to lists, and in other cases
there is not such a simple implementation. Many typeclass instances are currently provided
via round-tripping to lists, clearly not ideal. And it is unfortunate that the
native lists are not easily used with []
syntax and pattern matching.
Records come out nicely as Erlang maps, giving a simple representation of construction and update.
Tagged unions are represented as tuples, eg Just 42
and Nothing
are simply {just, 42}
and nothing
.
Output
PureScript modules correspond to Erlang modules with some name mangling, where Data.List
comes out as
data_list
(its FFI module is separate at data_list@foreign
, re-exported via the main module as required).
The top level identifiers of a PureScript module are translated to functions in the Erlang module. To support non-function values, every definition is translated to a nullary function returning the translated value.
For example
x :: Int
x = 42
id :: forall a. a -> a
id x = x
translates to
x() = 42.
id() = fun (X) -> X end.
Erlang functions can be of higher arity, but these do not appear in the generated code. purescript-functions can be used to deal with tupled functions, and the FFI supports auto-currying imports.
Erlang syntax has an unfortunate precedence for function calls, such that application of curried functions
f a b
cannot be f(a)(b)
but instead must be (f(a))(b)
. Combined with an imperfection in code
generation, compiler output currently contains a staggering number of parentheses.
Pattern matching
Pattern matching in PureScript is translated into pattern matching in Erlang output. For example, consider maybe
:
maybe :: forall a b. b -> (a -> b) -> Maybe a -> b
maybe b _ Nothing = b
maybe _ f (Just a) = f a
The output is (whitespace adjusted)
A function call is used rather than a case expression due to scope. Of course with uncurried functions this would be much simpler:
maybe(B, _, { nothing }) -> B;
maybe(_, F, { just, A }) -> F(A).
Things work out nicely when it comes to nested patterns, there’s nothing for us to do.
Guard expressions, on the other hand, are a mixed bag. The simplest guards can become guards in the Erlang output.
isJust :: forall a. Maybe a -> Boolean
isJust = case _ of
Just _ | true -> true
_ -> false
However Erlang guard expressions are restricted, and cannot contain arbitrary expressions like function calls. For now in fact the above trivial guard and those that are boolean variables are the only one which will be translated directly.
For complex expressions we must first evaluate the guard expressions before the start of the pattern match. Due to the fact that guard expressions will generally make use of bindings in the pattern match, we end up with 2 pattern match expressions in the output.
isLarge :: Maybe Int -> Boolean
isLarge = case _ of
Just n | n > 1000 -> true
_ -> false
isLarge() ->
fun (V) -> begin
_@2 = (fun
({ just, N }) -> ((((data_ord:greaterThan())((data_ord:ordInt())))(N))(1000));
(_) -> false
end(V)),
(fun
({ just, N }) when _@2 -> true;
(_) -> false
end(V))
end
end.
Here the expression n > 1000
is evaluated initially, and then used as a guard in the 2nd match.
If we were able to inline the typeclass dictionary the expression N > 1000
would be suitable
for inclusion in a guard directly.
Lists, tuples and friends
PureScript’s Data.List
can be used as is, and the representation should be reasonable,
but does not coincide with Erlang lists - native Erlang lists are provided via purescript-erl-lists. This may be more
efficient and could be useful for interfacing with Erlang libraries, but while a cons operator (:)
is provided,
as this is not a data constructor it cannot be used in pattern matching, and this must go via uncons
.
Data.Tuple
again can be used, but a tuple such as Tuple 1 2
will correspond
to a tuple {tuple, 1, 2}
. Native Erlang tuples are provided in
purescript-erl-tuples - this allows
for higher-arity tuples and may be useful for FFI. Again a downside of this is a lack
of pattern matching, in this case conversion to Data.Tuple.Nested
is provided which can permit
pattern matching with /\
.
Core libraries
Many of the core libraries have been ported to Erlang FFI. This has been an experience in discovering just how many transitive dependencies an innocent-looking package may have, and just how many packages include FFI definitions (sometimes for no apparent reason).
Some libraries which should work now - either with ported FFI as part of the purerl org, or FFI-free packages (which must be used with the correct dependencies such as prelude, via bower resolutions).
In purerl:
- purescript-prelude
- purescript-integers
- purescript-math
- purescript-assert
- purescript-eff
- purescript-partial
- purescript-console
- purescript-foldable-traversable
- purescript-unfoldable
- purescript-functions
Should work (and many more):
- purescript-maybe
- purescript-either
- purescript-tuples
- purescript-bifunctors
Further steps
I’m continuing to work on this, despite distractions of Atom/Code plugins, work and life. The large part of this is fleshing out library ports, with the occasional compiler bug surfacing. I’m driving this by trying to write an interesting app, and chasing relevant dependencies. Expect to see more here. Eventually.