Optimizing Affine Handlers
Abilities and their cost
Abilities in Unison are a convenient way to structure a lot of programs. They let you program against a sort of domain-specific language, decoupling the code that uses the domain-specific operations from the implementation of their behavior. They also don't require rewriting functions in a different 'style' when switching between 'pure' code and domain-specific/effectful code.
However, a disadvantage they have is overhead. For an ability, say Tell, which might be declared something like this:
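ability Tell a where
  tell : a -> ()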
When you call the tell operation in your code, it is an inherently dynamic event, because the exact behavior depends on which handler for the Tell ability is active at that point in the execution. Further, a handler for it looks something like this:
Tell.ignore : Request {g, Tell a} x -> x
Tell.ignore = cases
  { r } -> r
  { tell _ -> k } -> handle k() with Tell.ignore
In the tell case, the handler gets access to the continuation of the tell request, and in general it can do whatever it wants with that continuation. This means that in general we need to actually capture that continuation, which involves building a value from some of the stack contents, walking back along the stack as we do so. Then, when the handler calls the continuation, what we copied gets written back onto the stack.
In this example, it's pretty obvious that this is a lot of wasted work. The handler doesn't really do anything, so the stack is getting copied only to be overwritten with exactly the same contents. We are essentially implementing a plain function call by hand, using continuation operations.
Even worse, the cost of this is proportional to the size of the stack segment from where the tell is called to where the handler was installed. So, if Tell.ignore was the first handler you used, and you have several other handlers after it, as well as some non-tail function calls, this might be a very expensive no-op. It'd be nice if we could do better.
Mitigating the expense
There is literature on what sorts of ability handlers are able to avoid stack copying. With sufficient work, it's possible to optimize each individual effect case, so that only that case needs to be in a 'good' form to benefit. However, making that work in the widest range of situations requires some additional tricky facilities in the runtime.
So, instead, I've been working on a more modest optimization. It revolves around recognizing what you might call "affine" handlers. To fall into this category, a handler must have every branch in one of two forms:
- The continuation is not used at all.
- The continuation is used exactly once, in the tail position of a (tail) handle block that reinstalls the same handler.
Tell.ignore above is an example of the second form. The key thing about these sorts of handlers is that they never need to capture continuations. In the first case, we just throw the continuation away. In the second case, we put it back right where it was, so invoking it can be implemented as a somewhat special function call.
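For instance, a handler in the first form simply throws the continuation away. A minimal sketch, using a hypothetical Abort-style ability:

ability Abort where
  abort : a

-- the continuation is discarded entirely, so there is nothing to capture
Abort.toOptional : '{g, Abort} a ->{g} Optional a
Abort.toOptional body =
  handle Some (body()) with cases
    { r } -> r
    { abort -> _ } -> None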
The only tricky part is that ability requests within the handler are supposed to refer to the dynamic context the handler occurs in. Normally this is accomplished by unwinding the stack while copying it, but we want to leave it in place. So, instead, we remember the context of the affine handler when we install it. As long as there are only affine handlers preceding it, there is no way to change the dynamic context of the handler, so just remembering the initial value works.
This does mean that we can only use the optimized version of an affine handler as long as only other affine handlers have been used so far. However, affine handlers are actually quite common. Every handler that is basically implementing some kind of state or exception effect is affine.
The main counterexamples are handlers that implement some kind of repetition or non-determinism. For instance, handlers for the Each ability would typically not be affine. However, I think it is common to mainly use Each handlers at the 'lowest' level, to repeat operations using some other effects, and in that case the outer affine handlers would be in a position to be optimized.
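To make the distinction concrete, here is a sketch of the kind of handler I mean; the definitions are illustrative and may not match the library's actual Each:

ability Each a where
  each : [a] -> a

Each.toList : Request {Each a} r -> [r]
Each.toList = cases
  { r } -> [r]
  -- the continuation is resumed once per list element, so this handler is not affine
  { each as -> k } -> List.flatMap (x -> handle k x with Each.toList) as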
Details
The optimization of affine handlers has recently been merged into the Unison interpreter implementation. It happens when loading code, so it is even applied to code from previous versions that has been saved somewhere.
Initially it was pretty simple and limited, but over the past couple of weeks it's been fleshed out quite a bit, by making sure that most of the (pretty gnarly) handlers we use in the cloud computing implementation are properly recognized. So, for instance, you don't need to write precisely in the style of Tell.ignore; you can use the following sort of 'nice' handler structure.
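A version of Tell.ignore restructured to take a delayed computation might look something like this:

Tell.ignore : '{g, Tell a} r ->{g} r
Tell.ignore body =
  handle body() with cases
    { r } -> r
    { tell _ -> k } -> Tell.ignore do k()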
Also, handlers for somewhat bizarre abilities can be recognized as well.
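As an illustrative sketch (these exact definitions are hypothetical), take an ability whose operation returns a function, together with a handler that always resumes with the identity:

ability Bizarre where
  adjust : Nat -> (Nat -> Nat)

-- the continuation is used exactly once, reinstalling the same handler
Bizarre.noop : '{g, Bizarre} r ->{g} r
Bizarre.noop body =
  handle body() with cases
    { r } -> r
    { adjust _ -> k } -> Bizarre.noop do k (n -> n)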
Also, there can be a fair amount of complication in the structure. So, for instance, if we have a Counter ability along these lines:
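ability Counter where
  inc : Nat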
then the following handler is affine:
complicated : Nat -> '{Counter} r -> r
complicated n th =
  use Nat +
  m = n + 1
  handle th() with cases
    { inc -> k } ->
      match m with
        2 -> complicated m do k (m + 35)
        _ -> complicated m do k m
    { r } -> r
The main blocker to the optimization is resuming the continuation in a separate function, which can sometimes be deceptive based on the surface syntax. The example I came across in auditing our cloud code looks like the following.
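In sketch form, with a hypothetical Log ability standing in for the real definitions, it had roughly this shape:

ability Log where
  emit : Text -> ()

Log.console : '{g, Log} r ->{g, IO} r
Log.console body =
  handle body() with cases
    { r } -> r
    { emit msg -> k } ->
      -- the resume happens inside the catch block, i.e. inside another handler
      res = catch do
        printLine msg
        Log.console do k()
      match res with
        Left f -> bug f
        Right r -> r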
It's true that k is called here exactly once, in a tail position. However, this only happens indirectly, inside the Exception handler wrapping the printLine call. That involves some indirection internally, so the handler cannot be recognized. Instead, to get the optimization to apply, the calls to k should be factored out of the exception handler, so that they occur syntactically within the same handler, like so:
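Continuing the sketch:

Log.console : '{g, Log} r ->{g, IO} r
Log.console body =
  handle body() with cases
    { r } -> r
    { emit msg -> k } ->
      -- the exception handler just returns a value; the resume is outside it
      res = catch do printLine msg
      match res with
        Left f -> bug f
        Right _ -> Log.console do k()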
It's fine to have the exception handler return a value and match on that value to determine how to call k. It is mainly the additional handler that introduces too much indirection.
Also, mutually recursive handlers are simply not optimized right now. We may be able to support them in the future, but for the time being, only self-recursive handlers can be optimized.
This optimization is now in the 0.5.42 release of Unison. Hopefully everyone gets a nice performance boost when they upgrade.