stz the invasive maybe
Maybe is a really powerful concept. It's sometimes called Choice and it has a few other names too. The basic idea is a structure of 'value' and 'state' where that state might be an error-code, or it might simply be a true/false indicator as to whether value exists or not.
Pointer variants of Maybe can test the pointer value to see if it has a pointer or not - if it's 0 then there's no value.
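To make the 'value' and 'state' structure concrete, here is a minimal sketch in Python rather than stz. Everything in it is an assumption made for illustration: the Maybe class, the convention that a state of 0 means the value is present, the error codes, and the from_pointer helper that mimics the pointer variant by treating None as the null pointer.

```python
from dataclasses import dataclass
from typing import Any

OK = 0  # assumed convention: state 0 means the value is present


@dataclass(frozen=True)
class Maybe:
    """A value paired with a state; a non-zero state is an error code."""
    value: Any = None
    state: int = OK

    def has_value(self) -> bool:
        return self.state == OK


def from_pointer(pointer: Any) -> Maybe:
    """Pointer variant: a null pointer (None here) means there is no value."""
    missing = 1  # hypothetical error code for "nothing there"
    return Maybe(value=pointer) if pointer is not None else Maybe(state=missing)


present = Maybe(value="jane")
failed = Maybe(state=-2)  # carries an error code instead of a value

print(present.has_value())             # True
print(failed.has_value())              # False
print(from_pointer(None).has_value())  # False
```

Python pays for this at runtime, of course. The sketch only shows the shape of the data, not the cost model stz is after.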
With this information we're armed to implement then: and else: to work on something other than booleans. We can have them work on a Maybe.
people
~> [index, person |
([1000:2000] includes: index)
then: [person]
else: [out-of-bounds]]
~> stdout
Let's imagine for a moment that the entire standard library for stz is built around the idea of Maybe. Right now it's built around the idea of returning an error code and storing the value you care about in a reference pointer. While that is incredibly powerful, it doesn't help us create better programming idioms at zero cost to runtime.
The people example above can be better. We test whether index is inside the range 1000-2000; if it is we return the person, and if not we return out-of-bounds. In other words, we repeat the outcome of the #includes: call.
Instead of doing that, we only care about the then: clause, using it to return a person instead of an index. The else: clause is the same whether we're returning an index or a person, so we can drop it and rewrite the snippet like so:
people
~> [index, person | ([1000:2000] includes: index) then: [person]]
~> stdout
The last value on the stack is returned, which means then: should return whatever its block returns, and that could be another Maybe object. It may well be: we don't know what people is, after all. It could be an array of references.
And in the case where then: fails, we want it to return the same Maybe we tested it on. Therefore the two possible return values from the block in the example above are a person or an error. A Maybe.
Writing this out longhand is painful, which is why the shorthand is so helpful. But sometimes you want to be explicit about what your code does:
people
~> [index: int, person: person -> return: (maybe of: person) |
return <- ([1000:2000] includes: index) then: [
-> return2: (maybe of: person) |
person]]
~> stdout
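The behaviour described for then: can be sketched in Python. This illustrates the semantics only, and the names are all assumptions of mine: Maybe, the free function then, and the "out-of-bounds" error value. The small range stands in for [1000:2000].

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present

    def ok(self) -> bool:
        return self.error is None


def then(condition: bool, block: Callable[[], Any],
         error: Any = "out-of-bounds") -> Maybe:
    """Run block when condition holds; otherwise return a failed Maybe."""
    if not condition:
        return Maybe(error=error)
    result = block()
    # A block that already produced a Maybe is passed through, not re-wrapped.
    return result if isinstance(result, Maybe) else Maybe(value=result)


people = ["ann", "bob", "cy"]
wanted = range(1, 3)  # small stand-in for [1000:2000]

# No else: clause is needed; the failure case falls out of then() itself.
results = [then(index in wanted, lambda p=person: p)
           for index, person in enumerate(people)]

for r in results:
    print(r.value if r.ok() else r.error)
```

Passing an existing Maybe through untouched is what lets people be an array of plain values or an array of maybes without the block caring.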
What does this mean for the standard library? For the most part it doesn't know or care, but in any case where we would return either a value or an error we now have a specific way of doing it:
jane? := people-by-name["jane"]
jane? then: [jane | stdout print: jane]
Of course we can keep pushing the Maybe down the hill:
jane? := people-by-name["jane"]
stdout print: jane?
And eventually forget we're even dealing with maybe... unless it matters.
stdout print: (people-by-name["jane"] else: [{person | name: 'Jane'}])
This replaces a lot of the boilerplate code we usually write in C along the lines of:
int result = do_the_thing(with_a_thing);
if (result < 0) {
undo_the_thing(with_a_thing);
exit(-1);
}
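For comparison, the jane? lookups can be sketched in Python. The names are assumptions: lookup stands in for people-by-name["jane"], "not-found" is an invented error value, and the else: message is spelled otherwise because else is a reserved word in Python.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present

    def then(self, block: Callable[[Any], Any]) -> Any:
        """Run block on the value; a failed Maybe is returned unchanged."""
        return block(self.value) if self.error is None else self

    def otherwise(self, block: Callable[[Any], Any]) -> Any:
        """Unwrap the value, or let block turn the error into a fallback."""
        return self.value if self.error is None else block(self.error)


def lookup(table: Dict[str, Any], key: str) -> Maybe:
    """Hypothetical stand-in for people-by-name["jane"]."""
    if key in table:
        return Maybe(value=table[key])
    return Maybe(error="not-found")


people_by_name = {"jane": {"name": "Jane"}}

printed = []
lookup(people_by_name, "jane").then(printed.append)
lookup(people_by_name, "john").then(printed.append)  # skipped: no value
fallback = lookup(people_by_name, "john").otherwise(lambda error: {"name": "Jane"})

print(printed)
print(fallback)
```

There is no result < 0 check at any call site. The test lives inside then and otherwise.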
We do need a way to panic if we get an error. We can then differentiate between "I don't mind that this failed" and "this absolutely cannot fail":
people-by-name["jane"] else: [panic]
people-by-name["jane"] else-panic
people-by-name["jane"] else-exit: -1
These are all acceptable. I think the first one has the most utility:
people-by-name["jane"] else: [panic]
people-by-name["jane"] else: [error |
logger print-error: error
panic]
people-by-name["jane"] else: [error | logger print-panic: error]
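The reason else: with a block has the most utility is that the block receives the error and decides what happens next. A Python sketch, with assumptions stated: Panic is an exception standing in for a real panic, and otherwise stands in for else:.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


class Panic(Exception):
    """Stand-in for a panic; a real one would abort the process."""


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present

    def otherwise(self, block: Callable[[Any], Any]) -> Any:
        """Unwrap the value, or hand the error to block."""
        return self.value if self.error is None else block(self.error)


def panic(error: Any) -> None:
    raise Panic(error)


log: List[str] = []


def log_then_panic(error: Any) -> None:
    # Mirrors: else: [error | logger print-error: error. panic]
    log.append(f"error: {error}")
    panic(error)


found = Maybe(value="jane")
missing = Maybe(error="not-found")

print(found.otherwise(panic))  # jane (the handler is never called)
try:
    missing.otherwise(log_then_panic)
except Panic as exc:
    print("panicked with", exc)  # panicked with not-found
```

else-panic and else-exit: would just be this with the handler baked in.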
This approach does put more onus on interfaces to C libraries. The POSIX APIs are pretty inconsistent at times, but their idiom is at least predictable. It will be worthwhile wrapping up their error codes into enums and utilising those as the error parameter in a maybe:
socket: (maybe of: tcp-socket else: tcp-socket-error)
socket := (io connect-to: 'localhost:8080') else: logger panic-print
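Wrapping errno values in an enum might look like this in Python. It's a sketch under assumptions: TcpSocketError and wrap are invented names, the choice of which errno values to expose is arbitrary, and the failure is simulated so no network is needed.

```python
import errno
import os
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class TcpSocketError(Enum):
    """Hypothetical enum naming the errno values we choose to expose."""
    REFUSED = errno.ECONNREFUSED
    TIMED_OUT = errno.ETIMEDOUT
    UNREACHABLE = errno.ENETUNREACH
    OTHER = -1  # catch-all for codes we did not name


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present


def wrap(call: Callable[[], Any]) -> Maybe:
    """Run a call that reports failure via OSError and wrap the outcome."""
    try:
        return Maybe(value=call())
    except OSError as exc:
        try:
            return Maybe(error=TcpSocketError(exc.errno))
        except ValueError:  # errno has no matching enum member
            return Maybe(error=TcpSocketError.OTHER)


def refused() -> None:
    # Simulated failure so the sketch does not need a network.
    raise OSError(errno.ECONNREFUSED, os.strerror(errno.ECONNREFUSED))


print(wrap(refused).error.name)       # REFUSED
print(wrap(lambda: "socket").value)   # socket
```

The wrapper is where the inconsistency of the underlying API gets absorbed, so callers only ever see the enum.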
Another consideration is how we program with maybe. If we wanted to implement a method that returned either an error or an object do we have to wrap the object and error up in a maybe to do that?
Implicit conversions are generally frowned upon. Sure, they save you time as a programmer, but they are also the root cause of many bugs. Just look at JavaScript treating '1' == 1 as true, for instance. That is utter insanity when you want to have a predictable program.
person-error (bad-timing)
make-person ~ -> (maybe a: person else: person-error) [ -> return |
random next < 0.5 then: [return <- bad-timing]
return <- {person | name: 'Jane'}]
Here we are implicitly converting a person or a person-error into a maybe. How does the compiler know how to do that, and should it? It's my opinion that maybe, being such a core concept, should have its return value converted implicitly. But by allowing that facility we invite it anywhere and everywhere.
There should be a standard way that we convert things, and the compiler would insert the code to do that. It's clear that a decision needs to be made: does bad-timing belong in the value member or the error member of the maybe return value? What if the a: and the else: were both of type person-error?
These questions tell us that an implicit conversion would not be enough. We'd need to do the conversion ourselves or embed Maybe directly into the language itself.
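The ambiguity is easy to show with a small Python sketch. The implicit_maybe helper is hypothetical: it guesses which member a returned object belongs in by looking only at its type, which is all an implicit conversion has to go on.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class PersonError(Enum):
    BAD_TIMING = 1


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present


def implicit_maybe(result: Any, value_type: type, error_type: type) -> Maybe:
    """Guess the member from the type alone, as an implicit conversion would."""
    fits_value = isinstance(result, value_type)
    fits_error = isinstance(result, error_type)
    if fits_value and fits_error:
        raise TypeError("ambiguous: result fits both the value and the error member")
    if fits_error:
        return Maybe(error=result)
    if fits_value:
        return Maybe(value=result)
    raise TypeError("result fits neither member")


# Distinct types on each side: the guess works.
print(implicit_maybe(PersonError.BAD_TIMING, dict, PersonError))

# Same type on both sides: nothing tells the two members apart.
try:
    implicit_maybe(PersonError.BAD_TIMING, PersonError, PersonError)
except TypeError as exc:
    print(exc)
```

A compiler in the same position can either reject the code or pick a side silently, and picking silently is exactly the kind of implicit behaviour that causes bugs.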
One way we could embed Maybe in the language is to utilise ? to indicate a Maybe type:
make-person ~ -> person? [ ... ]
We'd still need some way to indicate a failure return value instead of a person return value though. <- is special syntax that informs the compiler we're leaving the current context with a value. We could have a secondary syntax to indicate we're leaving with a failure instead.
One could argue for 'throw' as a keyword to do that. But then someone might wonder where 'catch' is in this language. And the way implicit returns work, you'd be unable to indicate a failure return either.
The second point can be handled by saying that if you want to be able to return custom failures, and not just pass back a maybe you got from somewhere else, you need to name the return variable.
The first point can be handled by introducing new syntax. I'm of two minds about this. It would be really cool to have bad-timing by itself, without any other syntax, be the entire solution. Simply referencing the error would return from the current context. But what if we wanted to return from an outer context instead and skip further processing because of the error? The current return semantics allow for that.
And simply adding a ? on the end of the type doesn't specify what kind of error we expect in the error member of the maybe. We could use a different syntax instead. The types are computed at compile time after all, so we could use a syntax to state 'a or b' or 'a else b':
person-error (bad-timing)
make-person ~ -> person else: person-error [...]
That's pretty clear. You're going to get a person back; otherwise there'll be an error to say why you didn't. It might not be the best naming, because you might then get confused as to whether person is a boolean or a maybe itself. Let's try something more striking. We want errors to stand out a little bit:
person-error (bad-timing)
make-person ~ -> person / person-error [...]
Not that this matters, but if the 'okay' state is 0 then you avoid divide by zero. Ahem. Anyway, that solves the question of declaring the type succinctly. How do we return a failure instead of a success?
person-error (bad-timing)
make-person ~ -> person / person-error [ -> return |
random > 0.5 then: [return / bad-timing]
return <- {person | name: 'Jane'}]
The failure return mirrors the type declaration. You cannot send messages to the return variable: it's a pseudo-variable that performs a RET, JMP or YIELD instruction depending on the implementation and its type.
This is still not good enough. What if I want the return to be a yield?
make-person ~ > yield of: person / person-error [ -> yield |
random > 0.5 then: [yield <- bad-timing]
yield <- {person | name: 'Jane'}]
We're back to the same problem of differentiating between the value and the error parts. We could, instead, dig in to the guts of the maybe:
make-person ~ > yield of: person / person-error [ -> yield |
random > 0.5 then: [yield <- {fail: bad-timing}]
yield <- {ok: {person | name: 'Jane'}}]
The type is implicit in the {} because it has to be the return type. We're therefore making a maybe a: person else: person-error object and sending either fail: or ok: to it. That can then set its internal state as necessary.
This is the ideal solution. It's the most consistent and flexible. It maintains encapsulation (if we even care about that), reinforces message sending and lets us yield. So let's go back to the basic example and write it out how it should be:
person-error (bad-timing)
make-person ~ -> person / person-error [
random > 0.5
then: [{fail: bad-timing}]
else: [{ok: {name: 'Jane'}}]]
And our iteration example:
people
~> [index, person | ([1000:2000] includes: index) then: [{ok: person}]]
~> stdout
Whether or not people contains maybes doesn't matter: what comes out of our first block are definitely maybes, and those are passed to stdout. This also allows us to have objects we can send things to that will act only if there is a failure:
people
~> [index, person | ([1000:2000] includes: index) then: [{ok: person}]]
~> logger-with-panic
Now we print out either the person or the error, and if there's an error we will panic too. It doesn't matter where the error came from; it'll get passed through because we let the types be determined by the compiler.
We might need to merge errors from different parts of the chain. We can solve this by declaring each enum to be a collection of errors, which lets us specify error rather than a specific enum type:
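The explicit ok: and fail: messages and the panicking sink can be sketched together in Python. The names are assumptions: Maybe.ok and Maybe.fail stand in for {ok: ...} and {fail: ...}, select plays the part of the first block, logger_with_panic is a plain function rather than a stream, and Panic is an exception standing in for a real panic.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List


@dataclass(frozen=True)
class Maybe:
    value: Any = None
    error: Any = None  # None means the value is present

    @classmethod
    def ok(cls, value: Any) -> "Maybe":
        return cls(value=value)

    @classmethod
    def fail(cls, error: Any) -> "Maybe":
        return cls(error=error)


class Panic(Exception):
    """Stand-in for a panic."""


def select(people: Iterable[str], low: int, high: int) -> Iterator[Maybe]:
    """Yield ok: person for indexes in [low, high], and fail: otherwise."""
    for index, person in enumerate(people):
        if low <= index <= high:
            yield Maybe.ok(person)
        else:
            yield Maybe.fail("out-of-bounds")


def logger_with_panic(items: Iterable[Maybe], out: List[str]) -> None:
    """Record every item; panic as soon as a failure arrives."""
    for item in items:
        if item.error is not None:
            out.append(f"error: {item.error}")
            raise Panic(item.error)
        out.append(str(item.value))


out: List[str] = []
try:
    logger_with_panic(select(["ann", "bob", "cy"], 0, 1), out)
except Panic:
    pass
print(out)  # ['ann', 'bob', 'error: out-of-bounds']
```

Because ok and fail are explicit, there is never any question about which member of the maybe is being set, even if the value and the error happen to share a type.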
person-error (error | bad-timing)
Now if we need to merge the person-error with a socket-error or a file-error or a memory-error we can, because we're not returning person / person-error anymore, but person / error. Armed with this knowledge we can safely introduce the ? syntax:
person-error (error | bad-timing)
make-person ~ -> person? [
random > 0.5
then: [{fail: bad-timing}]
else: [{ok: {name: 'Jane'}}]]
This requires the compiler to know what an error is and also to convert type? into (maybe a: type else: error), but that is not a huge stretch.
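The idea of every error enum also being an error can be approximated in Python with a member-less base enum. This is only an analogy for person-error (error | bad-timing), and the names Error, PersonError, SocketError and connect_then_make_person are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Error(Enum):
    """Member-less base, so every concrete error enum is also an Error."""


class PersonError(Error):
    BAD_TIMING = 1


class SocketError(Error):
    REFUSED = 1


@dataclass(frozen=True)
class Maybe:
    """Plays the role of type?: a value, or any Error member."""
    value: Any = None
    error: Any = None  # None means the value is present


def make_person(lucky: bool) -> Maybe:
    if not lucky:
        return Maybe(error=PersonError.BAD_TIMING)
    return Maybe(value={"name": "Jane"})


def connect_then_make_person(connected: bool) -> Maybe:
    """Errors from two different sources share one return type."""
    if not connected:
        return Maybe(error=SocketError.REFUSED)
    return make_person(lucky=False)


for outcome in (connect_then_make_person(False), connect_then_make_person(True)):
    print(isinstance(outcome.error, Error), outcome.error)
```

Both failures flow through the same return type, yet they stay distinguishable: the two members compare unequal even though they share the underlying number 1.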
This might be a good place to pause and reflect on the proposed language changes and their implications.