In traditional parser generators, semantic actions bridge purely syntactic analysis and AST construction. An expression usually interacts with the user code through one of these two techniques (digit being a rule parsing an integer):
digit "+" digit { $$ = $1 + $3; }
This positional notation is the technique used in Yacc, for example.
digit:x "+" digit:y { x + y }
This labelled notation is similar to what is used in Menhir (a parser generator written in OCaml).
The first technique is often discouraged because errors can silently creep in if you reorder the expressions inside a rule without updating the associated action, or if you make a mistake when numbering the arguments. The generated code will fail to compile if the host language is statically typed and the two expressions have different types, but in the general case this technique is not safe. Expression labelling is better, but it has the drawback of burdening the grammar syntax. Also note that neither technique helps the user build the corresponding AST; their purpose is to offer a simple interface between the grammar and the host code.
Using the idea of typing grammars, we can give a type to each expression and pass its value directly to the semantic action, without any labelling or positional notation. The previous example becomes:
digit "+" digit > add
Here > is a "reverse function call operator": the expression digit "+" digit produces a value v of type (i32, i32), and the generated code looks like add(v). It is even smarter than that: the tuple is automatically unpacked into function arguments, so the function add is called with two arguments of type i32.
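To make the unpacking concrete, here is a hand-written Rust sketch of roughly what code for this rule could look like. It is not the actual output of the generator: the helper names plus, digit_plus_digit and sum, and the choice of representing the input as a byte slice with an explicit position, are assumptions made for illustration only.

```rust
/// Stand-in for the `digit` rule: parses one decimal digit at `pos`
/// and returns its value together with the next position.
fn digit(input: &[u8], pos: usize) -> Option<(i32, usize)> {
    match input.get(pos) {
        Some(&c) if c.is_ascii_digit() => Some(((c - b'0') as i32, pos + 1)),
        _ => None,
    }
}

/// Matches the literal "+". In this sketch a literal carries no value,
/// so only the next position is returned (an assumption).
fn plus(input: &[u8], pos: usize) -> Option<usize> {
    if input.get(pos) == Some(&b'+') {
        Some(pos + 1)
    } else {
        None
    }
}

/// The expression `digit "+" digit`, whose value has type (i32, i32).
fn digit_plus_digit(input: &[u8], pos: usize) -> Option<((i32, i32), usize)> {
    let (x, pos) = digit(input, pos)?;
    let pos = plus(input, pos)?;
    let (y, pos) = digit(input, pos)?;
    Some(((x, y), pos))
}

/// The user-written semantic action: an ordinary function of two i32.
fn add(x: i32, y: i32) -> i32 {
    x + y
}

/// The expression `digit "+" digit > add`: the tuple `v` is unpacked
/// into the two arguments expected by `add`.
fn sum(input: &[u8], pos: usize) -> Option<(i32, usize)> {
    let (v, pos) = digit_plus_digit(input, pos)?;
    Some((add(v.0, v.1), pos))
}
```

Note that add knows nothing about parsing: it never sees positions, labels or argument numbers. If the two sides of the expression had different types, the call add(v.0, v.1) would simply fail to type check.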
I read, was inspired by, or borrowed ideas from the following implementations (non-exhaustive list):