Related Work

Conventional parser generator

In traditional parser generators, the bridge between purely syntactic analysis and AST construction is done with semantic actions. Interaction between an expression and the user code is usually done with one of these two techniques (digit being a rule parsing an integer):

  1. Positional arguments: digit "+" digit { $$ = $1 + $3; } is a technique used in Yacc for example.
  2. Expression labelling: digit:x "+" digit:y { x + y } is similar to what is used in Menhir (parser generator written in OCaml).

The first technique is often discouraged because some errors can silently appear if you change the order of expression inside a rule without changing the associated action or if you make a mistake when numbering the arguments. The generated code will fail to compile if the host language is statically typed and if the two expressions have different types, but in the general case this technique is not safe. Expression labelling is better but it has the inconvenient of burdening the grammar syntax. Also note that none of these techniques help the user to build the corresponding AST, their purposes is to offer a simple interface between grammar and host code.

Using the idea of typing grammar, we can give a type to each expression and directly pass the value to the semantic action without any labelling or positional notation. The previous example becomes digit "+" digit > add with > being a "reverse function call operator", the expression digit "+" digit produces a value v of type (i32, i32) and the code generated looks like add(v). It is even smarter and will automatically unpack the tuple into function arguments, so the function add will be called with two arguments of type i32.

Parser combinators

Implementations

I read, get inspired or used some ideas of the following implementations (non-exhaustive list):

Paper references