Method chaining

Google Feeds pushed a blog post of Dave Method Chaining, Fluent Interfaces, and the Finishing Problem to me yesterday. It is a good article which analyzed the finishing problem in method chaining. However, it missed a possible solution (or workaround in his wording), which I'll discuss in this post.

For who are not familiar with the term, method chaining is simply something like a.b().c().d(), where each method returns an object (usually this) for successive calls. jQuery is a good example of method chaining. Method chaining is considered visually beautiful and clean by most people.

The finishing problem

One of the biggest problem of method chaining is the finishing problem, as extensively discussed in Dave's blog post. In short, the finishing problem means it is hard for the library to decide when to take an action. As in Dave's example:

    Log.Message("Oh, noes!").Severity(Severity.Bad).User("jsmith");

The Log library does not know if it should start writing when .Message is called, since it might be followed by .Severity or .User. Dave listed several workarounds, but they need either a syntax or semantic change, except for the "casting" approach. But the "casting" approach is suboptimal as well: it introduces an unused reference, requires a proper type to convert to, and impossible in dynamic languages like Python and Javascript.

The road not taken: destructor

However, there is yet another option Dave did not mention, and it does not need to change the syntax and does exactly what is expected: taking action directly after the chain ends. The idea is, well, taking action in the destructor (or finalizer in some languages). Here is a short example in Python.

class Log():
    def __init__(this):
        this._message = ""
        this._severity = "debug"
    def message(this, x):
        this._message = x
        return this
    def severity(this, x):
        this._severity = x
        return this
    def __del__(this):
        print(f"{this._severity}: {this._message}")

def main():
    print("start")
    Log().message("message1").severity("info")
    Log().message("message2")
    print("end")

main()

Looks crazy, unreliable, and an abuse of the destructor? Yes, there are lots of problems. For example, if the compiler managed to allocate these variables on stack, it is not clear will __del__ be called directly after the statement or postponed to the end of function when the memory is released. It is also not clear that whether and how exceptions are caught. However, all these arguments are valid only because it is written in Python, and Python does not make clear promise on when and how the destructors are called (or at least most Python programmers do not know). With a little formalization this crazy idea can become a super powerful technique, and that's Rust.

Scope, "with", and lifetime

Before advertisements of Rust, we first explain what happened in the above code. Here is the pseudo code that shows the implicit function calls in main:

    print("start")
    a = malloc(Log)
    Log::__init__(a)
    Log::message(a, "message1")
    Log::severity(a, "info") # this is the last explicit reference to `a`
    Log::__del__(a)
    free(a)
    b = malloc(Log)
    Log::__init__(b)
    Log::message(b, "message2")
    Log::__del__(b)
    free(b)
    print("end")

I use malloc/free to show the memory allocation, but they are not necessarily allocated in heap. An insane optimizer might even make them reuse the same memory, since a dies before b is born.

The important part is how malloc/free and __init__/__del__ are related. Actually, the memory management can be fused into the constructor and destructor by the compiler. The __init__/__del__ pattern reflects the basis of resource management: a resource has its lifetime, within which it can be accessed and used. At the start and end of a resource's lifetime, there are an entering action and an exiting action, where the resource is allocated and released respectively.

Sounds familiar? You are probably thinking of with statement in Python. Actually when I search for Python+destructor, the first result Google gave me is telling me to use with instead.

with can be seen as syntax sugar of a special case of manual resource management, and it is far less powerful than the well-developed RAM management: the lifetime of the resource is strictly bound to the lexical scope, you cannot assign the resource to a global variable or pass it out in a closure to use later. Since we all trust the compiler of managing the memory, why we have to manually manage other resources?

Let's review the several methods of resource management. The first one is obviously the manual managing: you just call the allocating and releasing methods yourself. This is the most accurate, powerful but tedious method. The second is binding the resource to a lexical scope, like memory of stack variables and other resources managed by with statements. This method lacks generality: not all resources's lifetime is exactly the same as the scope they defined in. The third method is automatic tracking. The compiler tries to figure out the lifetime of each resource with techniques like data flow analysis, and fall back to runtime tracking like reference counting or other kinds of GC.

The last method is clearly the winner. It correctly deals with resources that live longer than their scopes, but also avoids the hassles of manual allocating and releasing. However, this technique is hardly used outside the domain of RAM, until Rust appears.

Rust

Rust is to my knowledge the first languages that fully leverages the modern memory manage techniques (who use RAII and smart pointers correctly in C++?). Following is an example from the document

(comments and imports removed).
let data = Arc::new(Mutex::new(0));

let (tx, rx) = channel();
for _ in 0..N {
    let (data, tx) = (data.clone(), tx.clone());
    thread::spawn(move || {
        let mut data = data.lock().unwrap();
        *data += 1;
        if *data == N {
            tx.send(()).unwrap();
        }
    });
}

rx.recv().unwrap();

If you are not familiar with Rust, you may think there is a bug since there is no statement that release the lock. However, for Rust programmers it's very clear that the lock is released right before data goes out of its scope. That's how finalizers work in Rust, and they work well. Beyond the scope-based management as shown in example, Rust also provides powerful lifetime analysis and explicit reference counting to support resources that live long, all in the same manner of RAM managing.

So fans of method chaining, time to switch to Rust :)