stz parsing structures


One of the most painful things about writing this kind of compiler, where the compiler itself runs an interpreter and I don't want to implement the parser more than once, is that the compiler needs to be able to create the same structures that stz code uses.

That's not strictly true. Compiler types could be added to the stz compiletime libraries to match, but then I'd be implementing things like list in multiple ways. Instead, I've opted to implement things like list multiple times: once in C and once in stz.

It's not ideal. It's a little bit painful, in fact. But the reward will be great: I'll be able to parse an arbitrary string of stz code from the compiler itself. When you say 'stzc build a.stz b.stz c.stz', that's just the start. It might then deal with imports from directories and so on. At some point the build instructions for a program might be written in stz as well.

stz will then be able to dynamically drive the compilation, pulling in files, generating code and installing it all into the compilation unit itself. It's possible the compiler only needs to scan the available directories to know what files are available, and the compiletime interpreter could drive all the other behaviour.

This is a pretty minor gripe. I have a working lexer now. It's a surprisingly simple syntax, all things considered. There are only a few places you can get a syntax error from the lexer: malformed numbers, missing string terminators and missing comment terminators. The rest of the errors will come from the parser itself, where we can reason about the lexer tokens and complain if they don't make sense.
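To make the "missing comment terminator" case concrete, here is a minimal sketch in C of how a lexer might skip a block comment and notice that it was never closed. It assumes block comments nest, as the embedded comment in the test file suggests. The function name `skip_block_comment` and its return convention are hypothetical and not taken from stzc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from stzc.
 * `text` must point at the opening slash-star of a block comment.
 * Returns the offset just past the matching close, or -1 if the
 * input ends while a comment is still open. */
long skip_block_comment(const char *text)
{
    assert(text != NULL && text[0] == '/' && text[1] == '*');

    size_t i = 2;   /* step over the opening delimiter */
    int depth = 1;  /* number of comments currently open */

    while (text[i] != '\0') {
        if (text[i] == '/' && text[i + 1] == '*') {
            depth++;            /* a nested comment opens */
            i += 2;
        } else if (text[i] == '*' && text[i + 1] == '/') {
            depth--;            /* the innermost comment closes */
            i += 2;
            if (depth == 0)
                return (long)i; /* outermost comment is complete */
        } else {
            i++;
        }
    }
    return -1;                  /* missing comment terminator */
}
```

A caller would turn the -1 result into a lexer error at the position where the comment started. Missing string terminators could be detected the same way: scan for the closing quote and report an error if the input ends first.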

> bin/stzc lex src/test.stz
// line comment
/* multi
  line
     comment
       /* with embedding */
*/
0b01010101 0o07070707 0x0F0F0F0F -123456789 123456789
'test\' "test\"" `test\``
x'testx' xyz"testxy\zxyz" ø`test\øø`
method-call
... deferred-method-call
receiver keyword-selector: argument
(arg-type-1, arg-type-2) -> return-type
a || b
a / b
a * b
a ÷ b
a < b
a ≤ b
a > b
a ≥ b
a == b
a = b
^return-this
err'foobar'
0b01010101:token_binary_integer
0o07070707:token_octal_integer
0x0F0F0F0F:token_hexadecimal_integer
-123456789:token_decimal_integer
123456789:token_decimal_integer
'test\':token_sq_string
"test\"":token_dq_string
`test\``:token_bq_string
x'testx':token_sq_string
xyz"testxy\zxyz":token_dq_string
ø`test\øø`:token_bq_string
method-call:token_identifier
...:token_ellipsis
deferred-method-call:token_identifier
receiver:token_identifier
keyword-selector::token_keyword
argument:token_identifier
(:token_open_bracket
arg-type-1:token_identifier

...etc
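
The integer token kinds in the output above, together with the "malformed number" error, can be illustrated with a small C sketch. The token names come from the lexer output. Everything else is an assumption made for illustration: the `token_error` kind, the `classify_integer` function, accepting lowercase hex digits, and treating a prefix with no digits as malformed.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    token_error,               /* hypothetical: not shown in the output */
    token_binary_integer,
    token_octal_integer,
    token_hexadecimal_integer,
    token_decimal_integer
} token_kind;

/* Value of a digit character, or -1 if it is not a digit in any base. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Classify a complete integer literal; returns token_error when malformed. */
token_kind classify_integer(const char *s)
{
    assert(s != NULL);

    int base = 10;
    token_kind kind = token_decimal_integer;

    if (s[0] == '0' && s[1] == 'b') {
        base = 2;  kind = token_binary_integer;      s += 2;
    } else if (s[0] == '0' && s[1] == 'o') {
        base = 8;  kind = token_octal_integer;       s += 2;
    } else if (s[0] == '0' && s[1] == 'x') {
        base = 16; kind = token_hexadecimal_integer; s += 2;
    } else if (s[0] == '-') {
        s++;       /* a leading minus is part of a decimal literal */
    }

    if (*s == '\0')
        return token_error;    /* prefix or sign with no digits */

    for (; *s != '\0'; s++) {
        int v = digit_value(*s);
        if (v < 0 || v >= base)
            return token_error; /* digit not valid for this base */
    }
    return kind;
}
```

With these assumptions, `classify_integer("0b01010101")` yields `token_binary_integer`, while an input such as `"0b012"` is rejected as malformed.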