stz code blocks


There's no doubt that some parts of the Smalltalk syntax are a little awkward. No programming language is without its quirks. This is a topic we'll likely revisit many times.

If it's going to be a Smalltalk then we have to stick with the absolutely iconic:
[ code goes here ]
but let's change up lists of things. From now on the comma is syntax, not a valid message. We're going to use it for many things, but chief among them is listing parameters, eg:
[ a, b, c | a + b + c ]

We should also shake off the idea that we can return to the scope where the block of code was defined. As awesome as that is, this is a systems programming language and we need things to be predictable. We can't 'unwind' the stack - that part of the stack might not even exist anymore. Who's to say we won't have tail recursion optimisation?

Smalltalk has always used ^ to indicate a return. I suppose we'll stick with that for now. Our simple example should indicate that it is going to return something because, as a systems language, we're allowed to have methods that return nothing.
[ a, b, c | ^a + b + c ]

This brings us to the problem of types. Sometimes we need type information if we're going to take advantage of being a systems language. It should be assumed that all data lives on the stack and in registers, and anything that happens to be on the heap was put there by the programmer or a library. Memory management will have to be its own whole post.

Technically speaking, every object in Smalltalk is a list of machine words. On a 64-bit machine that means a class of 'name, age, height' is 3 × 64 bits, or 24 bytes of memory. We should probably support more complex types than that. There's a plethora of approaches for structures after all. Also that pesky alignment thing.
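To make the word counting concrete, here's a rough sketch in Python using ctypes to mimic a C-style layout. Python is only standing in because stz doesn't exist yet, and the Person and Mixed structures and their field types are made up for illustration.

```python
import ctypes

# Hypothetical 'name, age, height' object: three machine words on a 64-bit target.
class Person(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_uint64),    # stand-in for a word-sized reference
        ("age", ctypes.c_uint64),
        ("height", ctypes.c_uint64),
    ]

# A mixed-width layout (also hypothetical) to show where padding creeps in.
class Mixed(ctypes.Structure):
    _fields_ = [
        ("flag", ctypes.c_uint8),     # 1 byte of real data
        ("count", ctypes.c_uint64),   # 8 bytes, must start on an aligned boundary
    ]

person_size = ctypes.sizeof(Person)   # 3 words of 8 bytes each
mixed_size = ctypes.sizeof(Mixed)     # more than the 9 bytes of data it holds
count_offset = Mixed.count.offset     # where 'count' actually starts

print(person_size, mixed_size, count_offset)
```

On a typical 64-bit machine the count field gets pushed out to an aligned offset, so Mixed occupies more than its 9 bytes of payload. That's the pesky alignment thing in action.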

If we look over at Jai and Odin and a few other languages for a moment we discover that their variable declaration syntax looks like this:
varname : type = value
The type is optional if the value can be used to infer it. The value is optional if you provide the type. That means varname := 5 is perfectly valid and, wouldn't you know it, looks just like Smalltalk. So for now let's just adopt this convention and see where it takes us.
[ a: int, b: int, c: int | ^a + b + c ]
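The "type or value, at least one of them" rule is easy to poke at with a tiny sketch. This is Python standing in for a parser that doesn't exist yet; parse_decl, infer_type and the regex are all hypothetical, and it only understands simple type names plus integer and quoted-string literals.

```python
import re

# Hypothetical pattern for `name : type = value`.
# Both the type and the value are optional as far as the regex is concerned.
DECL = re.compile(r"^\s*(\w+)\s*:\s*([A-Za-z_.][\w.]*)?\s*(?:=\s*(.+?))?\s*$")

def infer_type(value):
    # Assumption: only two kinds of literal are recognised in this sketch.
    if re.fullmatch(r"-?\d+", value):
        return "int"
    if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        return "string"
    raise TypeError(f"cannot infer a type from {value!r}")

def parse_decl(text):
    match = DECL.match(text)
    if match is None:
        raise SyntaxError(f"not a declaration: {text!r}")
    name, type_name, value = match.groups()
    if type_name is None and value is None:
        # Neither half was given, so there is nothing to go on.
        raise SyntaxError(f"{name} needs a type, a value, or both")
    if type_name is None:
        type_name = infer_type(value)
    return name, type_name, value

print(parse_decl("varname := 5"))
print(parse_decl("a: int"))
print(parse_decl("c : int = 123"))
```

Note how varname := 5 falls out for free: the type slot between the : and the = is simply empty.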

a: int sure looks like a receiverless message send. In fact, if we imagine that the area between the [ and | is an object, then we'd be sending a: int to it. The int is a class name, which means everything after the : and before the , or | is a statement that will run during compile time.

Now we get our first taste of meta programming the type system. We can create parametric types.
[ people: (Array of: Person) | ... ]

The compilation phase starts out as an interpreter running stz, and then the actual compilation happens on the result of that.
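Here's a minimal sketch of what "the type annotation is a statement that runs at compile time" might look like, again in Python as a stand-in. The Type class, the evaluate function and the environment are all hypothetical, and the evaluator only understands names and a single right-nested of: message.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type:
    name: str
    params: tuple = ()

    def of(self, *params):
        # Stand-in for the `of:` message: returns a new parametric type.
        return Type(self.name, tuple(params))

    def __str__(self):
        if not self.params:
            return self.name
        return f"{self.name}<{', '.join(str(p) for p in self.params)}>"

def evaluate(expr, env):
    """Evaluate a type expression at 'compile time' and return a Type value."""
    expr = expr.strip()
    if expr.startswith("(") and expr.endswith(")"):
        expr = expr[1:-1].strip()
    if " of: " in expr:
        receiver, argument = expr.split(" of: ", 1)
        return evaluate(receiver, env).of(evaluate(argument, env))
    return env[expr]  # a plain class name

# Hypothetical compile-time environment holding the known classes.
env = {"Array": Type("Array"), "Person": Type("Person")}

people_type = evaluate("(Array of: Person)", env)
print(people_type)
```

The point is that the result of running the annotation is an ordinary value, and that value is what the later compilation step gets to work with.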

Okay, what about return values? Well, let's just keep borrowing and see where that takes us. If we add a little more syntax so that → represents a return type we get this:
[ a: int, b: int, c: int → int | ^a + b + c ]

Leaving off the return type but having a ^ in the code section tells us the return type should be inferred. Likewise, any parameter without a specified type has its type inferred. For simplicity, any time the code is called the types get filled in and the compiler attempts to compile a new version of it. If the message sends don't match up with the types, we will have to provide a decent explanation to the developer in an error message.
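The "fill in the types at each call and compile a new version" idea can be sketched as a cache keyed by argument types. This is Python pretending to be the compiler; the Block class, its sends list and the error wording are all assumptions for illustration, and "compiling" here is nothing more than checking that each argument type understands the messages the body sends.

```python
class Block:
    def __init__(self, body, sends):
        self.body = body      # the code section of the block
        self.sends = sends    # messages the body sends to its parameters (hypothetical)
        self.compiled = {}    # one specialised version per tuple of argument types

    def call(self, *args):
        types = tuple(type(a) for a in args)
        if types not in self.compiled:
            self.compiled[types] = self.specialise(types)
        return self.compiled[types](*args)

    def specialise(self, types):
        # Check every message send against every parameter type before "compiling".
        for position, param_type in enumerate(types):
            for message in self.sends:
                if not hasattr(param_type, message):
                    raise TypeError(
                        f"parameter {position} is a {param_type.__name__}, "
                        f"which does not understand {message}"
                    )
        return self.body

# Stand-in for [ a, b, c | ^a + b + c ]
add3 = Block(lambda a, b, c: a + b + c, sends=["__add__"])

int_result = add3.call(1, 2, 3)
float_result = add3.call(1.0, 2.0, 3.5)
print(int_result, float_result, len(add3.compiled))
```

Calling it a second time with the same types reuses the existing version, and calling it with something that can't be added fails up front with a message that names the offending parameter.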

It also means we can provide default values for parameters, eg:
[ a: int, b: int, c := 123 → int | ^a + b + c ]

One thing I note here is the desire to keep types lower-case. Hitting that shift key all the time is not the easiest thing in the world. The catch is that we want to avoid constantly clashing names with variables, eg: a class called 'person' and a parameter named 'person' is very likely to happen.

Smalltalk doesn't really have the same namespacing approach that procedural languages have. You don't tend to prefix your method names with a namespace. But you can do it with your classes. Let's reserve . to be part of class name resolution with modules, eg:
[ person: (core.array of: .person), name: string → boolean | ... ]

A typename starting with a . would be local to the compilation scope. This is the reverse of a domain name where the top-level is at the end (eg: .com); instead the top level is at the start.
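A rough sketch of that lookup rule, in Python as a stand-in: a leading . means the compilation scope, a dotted name is walked from the top-level module down. The resolve function and the sample scopes are hypothetical, and treating bare names like string as built-ins is my assumption, not something settled above.

```python
def resolve(type_name, local_scope, modules, builtin_types):
    """Look up a type name; every argument is a plain dict in this sketch."""
    if type_name.startswith("."):
        # Leading dot: local to the compilation scope.
        return local_scope[type_name[1:]]
    if "." in type_name:
        # Top level comes first, the reverse of a domain name.
        *path, leaf = type_name.split(".")
        scope = modules
        for part in path:
            scope = scope[part]
        return scope[leaf]
    # Assumption: an unqualified name refers to a built-in type.
    return builtin_types[type_name]

# Hypothetical scopes, with strings standing in for real type objects.
modules = {"core": {"array": "<core array type>"}}
local_scope = {"person": "<local person type>"}
builtin_types = {"string": "<string type>", "boolean": "<boolean type>"}

print(resolve("core.array", local_scope, modules, builtin_types))
print(resolve(".person", local_scope, modules, builtin_types))
print(resolve("string", local_scope, modules, builtin_types))
```

With this in place a parameter named person and a type named .person never collide, because the type always carries its dot.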

I'm a big fan of importing things into my namespace. That'll be a topic for another blog post.