stz

closures, returns, and defer

Something that can save a great many headaches in programming is defer.

file-handle = io open: 'foo.txt' mode: io O_RDONLY
defer io close: file-handle

In Smalltalk, with garbage collection, the garbage collector can tell us when an object is about to be destroyed. Things like IO handles use that notification to close themselves, so if you simply forget about a handle it goes away.

SmalltalkZero aims at systems-level programming, so it has no garbage collection. That makes some mechanism like defer highly desirable. We are creating volatile handles and allocating memory, and if we forget to clean up after ourselves we're going to be unhappy.

The first syntax that came to mind for me was:

defer: [ file-handle | io close: file-handle ]

But note that we have to capture file-handle to do this. It's a little awkward. It's noisy. I'm not a fan. Whatever the mechanism for defer ends up being, it could still be handed a closure when that power is needed. I'd much rather be able to write it as a standard statement.

I think we could explore the ellipsis syntax. We haven't used it yet and this could be a great place for it.

file-handle = io open: 'foo.txt' mode: io O_RDONLY
... io close: file-handle

Does this work? It's visual enough to stand out as special. Let's roll with it for now.

Just moments ago we had to make a variable capturing closure to do the close operation. Now we don't. The compiler will move that io close: file-handle to every exit point for the current scope. This should be more than enough to cover most cases where we want to deallocate/release things.
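To make the 'every exit point' idea concrete, here is a rough sketch in Python rather than SmalltalkZero. It is only an illustration: try/finally stands in for the ... statement, and close, closed, first_item_written and first_item_expanded are names I made up for this example.

```python
closed = []

def close(handle):
    # Stand-in for "io close: file-handle": just record what was closed.
    closed.append(handle)

# What the programmer writes. try/finally plays the role of the "..." statement.
def first_item_written(handle, items):
    try:
        if not items:
            return None       # exit point 1: early return
        return items[0]       # exit point 2: normal return
    finally:
        close(handle)

# Roughly what the compiler would produce: the deferred statement is
# copied in front of every exit point of the scope.
def first_item_expanded(handle, items):
    if not items:
        close(handle)
        return None
    result = items[0]
    close(handle)
    return result

results = [
    first_item_written('h1', []),
    first_item_expanded('h2', []),
    first_item_written('h3', ['a', 'b']),
    first_item_expanded('h4', ['a', 'b']),
]
```

Both versions close the handle on both paths. The written one just doesn't make you repeat yourself.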

Returning from a code block is ...weird... Let's look at a complicated example:

[ &person do-fancy-person-thing-or-else: block | profile, block -> block return-type |
  ... record-fancy-action
  upgradificate-or-else: [ ^block evaluate ]
  take-note-of-success ]

[ &person upgradificate-or-else: block | profile, block -> block return-type |
  status = try-upgrade
  status == no-good then: [ ^block evaluate ]
  rejoice ]

[ main |
  jane :: profile
  jane name: 'Jane'
  jane do-fancy-person-thing-or-else: [ ^99 ]
  stdio print: 'fancy thing did not work'
]

In this scenario when we get to the ^99 the stack will look like this:

stack frames:
1: main
2: do-fancy-person-thing-or-else:
3: upgradificate-or-else:
4: [ ^block evaluate ] upgradificate-or-else:
5: [ ^block evaluate ] do-fancy-person-thing-or-else:
6: [ ^99 ] main

The return in ^99 currently means 'leave this code block with the result of 99', which means execution will step back through the stack to main and then print out 'fancy thing did not work'.

There are two ways to interpret that ^. The first is that the code block is part of main, the second is to interpret it as being a closure.

If it's a code block then ^ will leave main. If it's a closure then ^ will leave the closure. To the compiler these two things look the same. Capturing variables or not, we want to alter the trajectory of the execution.

This is only a problem because we give the code block to the method we're calling. If methods like then: were built in they would behave as expected, but the -or-else: control flow we just invented would not.
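The two readings can be compared side by side in a Python sketch. This is an assumption-heavy illustration, not how SmalltalkZero would compile it: an exception stands in for the 'leave main' jump, and NonLocalReturn, main_as_closure and main_as_code_block are hypothetical names.

```python
class NonLocalReturn(Exception):
    """Carries a value out to the method that defined the block."""
    def __init__(self, home, value):
        super().__init__()
        self.home = home
        self.value = value

def upgradificate_or_else(block):
    upgrade_failed = True            # pretend try-upgrade came back no-good
    if upgrade_failed:
        block()

def do_fancy_person_thing_or_else(block):
    upgradificate_or_else(lambda: block())

def main_as_closure(out):
    # Reading 1: ^99 only leaves the block itself.
    do_fancy_person_thing_or_else(lambda: 99)
    out.append('fancy thing did not work')
    return None

def main_as_code_block(out):
    # Reading 2: ^99 leaves main, the method the block was written in.
    home = object()                  # identifies this activation of main
    def block():
        raise NonLocalReturn(home, 99)
    try:
        do_fancy_person_thing_or_else(block)
        out.append('fancy thing did not work')
        return None
    except NonLocalReturn as exit_:
        if exit_.home is not home:
            raise                    # meant for some other method
        return exit_.value

closure_out, block_out = [], []
closure_result = main_as_closure(closure_out)
block_result = main_as_code_block(block_out)
```

Same source text for the block, two very different outcomes: one prints the message and returns nothing, the other returns 99 and never prints.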

We could change the meaning of ^ to jump to the end of the method it is defined in. That works so long as it knows where to jump to. The moment you break away from the stack that jump instruction no longer makes sense. There's no stack frame to return to.
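Here is a small Python sketch of the 'no stack frame to return to' situation. Again an exception stands in for the jump, and make_block and NonLocalReturn are made-up names for the illustration.

```python
class NonLocalReturn(Exception):
    def __init__(self, value):
        super().__init__()
        self.value = value

def make_block():
    # The block is written here, so this call is its home frame.
    def block():
        raise NonLocalReturn(99)     # stand-in for ^99
    try:
        # The handler below only exists while make_block is running.
        return block                 # the block escapes and the home frame ends
    except NonLocalReturn as exit_:
        return exit_.value

escaped = make_block()

try:
    escaped()
    outcome = 'returned normally'
except NonLocalReturn:
    # make_block finished long ago, so nothing is left to receive the jump.
    outcome = 'no home frame to return to'
```

In Python the stray jump surfaces as an uncaught exception. In compiled code it would be a jump into a frame that no longer exists.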

One cheaty solution is to have a different syntax for closures and to say regular code blocks cannot be passed on to another method. Then when you're using closures you know that ^ will behave differently.

Calling regular code blocks is akin to a goto with a hidden parameter passed 'in case you want to come back to the place that called you'. That ^99 would secretly look like this:

[ main_99: return-address | (result) |
  // if we wanted to return to the method that called us we could do something like this:
  // return-address jump-with-result: ___
  main-exit jump-to ]

Jumping straight to main-exit introduces another problem though. We had only just added deferred execution within a scope. If we jump to main-exit we don't call any of the deferred code. That would be a big problem.
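The skipped-defer problem shows up clearly if the hidden return address is written out by hand. The Python sketch below is my own illustration: addresses are modelled as plain functions, the deferred statement is assumed to live at the method's exit point, and exit_point, after_call and block_returns_to_caller are hypothetical names.

```python
log = []

def do_fancy_person_thing_or_else(block, return_address):
    def exit_point(result):
        # The deferred statement sits at this method's exit.
        log.append('deferred: record-fancy-action')
        return return_address(result)
    # Hidden parameter: where to go 'in case you want to come back'.
    return block(exit_point)

def main(block_returns_to_caller):
    def main_exit(result):
        return result
    def after_call(result):
        log.append('fancy thing did not work')
        return main_exit(None)
    def block(return_address):
        if block_returns_to_caller:
            return return_address(99)    # return-address jump-with-result: 99
        return main_exit(99)             # main-exit jump-to
    return do_fancy_person_thing_or_else(block, after_call)

polite_result = main(True)
polite_log = list(log)
log.clear()
direct_result = main(False)
direct_log = list(log)
```

When the block goes back through the return address, the deferred line runs. When it jumps to main-exit, the log stays empty.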

upgradificate-or-else: and do-fancy-person-thing-or-else: both contain returns. To do the return we issue a RET instruction in assembly which will pop the return address off the stack and move the PC there.

If we instead included a continuation idea of 'next-address' as a hidden parameter then each of those returns could be given not just a return value but also a new next-address value. Instead of doing a RET we can pop that off the stack and do a JMP.

The CALL instruction already pushes the next Program Counter address on to the stack. It is kind of a freebie. I'm aware some architectures use a shadow-stack to balance CALL/RET and make sure nothing weird is going on. Some investigation is required here. CALL/RET aren't actually faster than PUSH+JMP, POP+JMP. They do take up fewer bytes in memory for the machine code though.

Instead we'd be loading up R15 or some register depending on the platform with our next address and then JMPing to the procedure we want to call. If the procedure then needs to save that address it can put it on the stack. There is a minuscule time saving if we don't have to push it to the stack (i.e. a leaf method). Not really worth mentioning in the grand scheme of things.

Each of the stack frames unwinds like normal which gives them the opportunity to run any deferred code before it jumps to the next destination.
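One possible reading of this, sketched in Python: every return hands back a pair of (value, next-address), and each frame that sees a next-address it doesn't own runs its deferred code and passes the pair along. This is my interpretation for illustration only. The pair convention and the main_exit name are assumptions, and try/finally stands in for the deferred statement.

```python
log = []

def upgradificate_or_else(block):
    value, next_address = block()
    if next_address is not None:
        # Someone further down wants to continue elsewhere: pass it along.
        return value, next_address
    log.append('rejoice')
    return None, None

def do_fancy_person_thing_or_else(block):
    try:
        value, next_address = upgradificate_or_else(lambda: block())
        if next_address is not None:
            return value, next_address
        log.append('take-note-of-success')
        return None, None
    finally:
        # ... record-fancy-action
        log.append('deferred: record-fancy-action')

def main():
    def main_exit(value):
        return value
    # ^99 becomes: return 99 and ask to continue at main_exit.
    block = lambda: (99, main_exit)
    value, next_address = do_fancy_person_thing_or_else(block)
    if next_address is not None:
        return next_address(value)   # the JMP to the new next-address
    log.append('fancy thing did not work')
    return None

result = main()
```

The result is 99, 'fancy thing did not work' is never printed, and the deferred record-fancy-action still ran on the way out.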

We're starting to get into the nitty gritty of turning the syntax all the way into an executable. We'll need to think about what kind of backend to use real soon...
