"Just run it in The Background", aka concurrency antipatterns


So we must send this email in The Background

Some of this perhaps belongs on the shitty code page.

Let's start with a tiny, beautiful helper:

import threading

def background(func):
    def run(*args, **kwargs):
        threading.Thread(target=lambda: func(*args, **kwargs)).start()
    return run

Good? Good! Now we can send emails in The Background!

from .utils import background

@background
def send_email(to, subject, message):
    ...

Wait, didn't I forget something? Like, error reporting?

So your send_email function, for some odd reason, has begun sending emails to /dev/null, and you've only noticed it by pure chance, after it's been doing so for the past three days, even though you've been a good little devops and have been using Sentry all along. Congratulations, your "background" code is, for all intents and purposes, the equivalent of fuckit.py.

Ok, so let us use @raven_client.capture_exceptions and move on...
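For the record, here is a minimal sketch of what "reporting errors from The Background" could look like. It is not the Sentry integration itself: plain logging stands in for the Sentry client, the logger name is made up, and returning the thread is my own addition so that callers can join it.

```python
import functools
import logging
import threading

# Assumption: plain logging stands in for the Sentry client here.
log = logging.getLogger("background")


def background(func):
    """Same idea as the helper above, but failures get reported."""
    @functools.wraps(func)
    def run(*args, **kwargs):
        def target():
            try:
                func(*args, **kwargs)
            except Exception:
                # Without this, the caller never hears about the error.
                log.exception("background task %s failed", func.__name__)

        thread = threading.Thread(target=target)
        thread.start()
        return thread  # not in the original helper; lets callers join()
    return run
```

The caller still gets no return value and no exception, only a log entry somewhere else, which is rather the point of this whole page.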

Wait, can I benchmark this code?

Suppose a teammate, unbeknownst to us, added a call to send_email inside of some trivial function:

 def frobnicate(bazang):
     frobs = boozate(bang for bang in bazang if bang.fooble)
+    send_email("me@example.com", frobs[0], "hi frbo")
     return reversed(frobs)

(Yes, I've seen code more silly than that in the wild.)

We want to benchmark two different implementations of frobnicate, to determine which one runs faster. Let's fire up IPython and use the beautiful %timeit magic^W macro.

%timeit frobnicate([1, 2])

How fast can %timeit spawn new threads? I hope it's fast enough for you: on my laptop it manages about 180 µs per call, or roughly 5000 threads per second!
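If you'd like to see this without IPython, here is a rough sketch. The slow_task function and its 0.5 second sleep are made-up stand-ins for the email sending; the helper is the same one as above.

```python
import threading
import time


def background(func):
    def run(*args, **kwargs):
        threading.Thread(target=lambda: func(*args, **kwargs)).start()
    return run


@background
def slow_task():
    time.sleep(0.5)  # hypothetical stand-in for actually sending an email


def timed_call():
    """Time the call the way a benchmark would: from the caller's side."""
    start = time.perf_counter()
    slow_task()
    return time.perf_counter() - start


elapsed = timed_call()
print(f"call returned after {elapsed:.6f}s; the task itself needs 0.5s")
```

The measured time is the cost of spawning a thread, not of the work. The work is still running after the stopwatch has stopped, competing with whatever gets benchmarked next.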

Wait, but bash has a builtin syntax for running it in The Background! &

Yes, bash (and ksh, and sh, and...) does a lot of things that were widely considered good ideas in the 1970s, '80s, even '90s:

OK, OK, I'm not here to bash bash. It's a great and pragmatic tool, but just because it assumes a particular default doesn't mean that default is always a good idea.

And no, I never use & interactively (use a new terminal), and almost never in scripts - and if I do, I always follow it with PID=$! and, soon after, wait $PID.
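The same discipline translates to Python more or less directly. A small sketch, where the child command is a made-up placeholder that just exits with status 3:

```python
import subprocess
import sys

# Start a child process without blocking, like `cmd &` in a shell.
# The child is a hypothetical placeholder that exits with status 3.
proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
pid = proc.pid  # the counterpart of PID=$!

# ... do other work here ...

status = proc.wait()  # the counterpart of `wait $PID`
print(f"child {pid} exited with status {status}")
```

Keeping the handle around means you find out when the thing finished and whether it failed, instead of hoping for the best.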

Wait, doesn't Go have the go keyword for running it in The Background?

Of course! Let's see how to actually use it correctly:

var done = make(chan bool)
go func(done chan<- bool) {
    // do stuff...
    done <- true
}(done)
// wait for func...
<-done

A bit chatty, huh? I didn't expect more from a language that despite having beautiful concurrency primitives:

Where were we? Ah, The Background.

Yes, I know now! Let's have a thread pool and throw random things at it!

Did you take a minute to think about how your concurrent code uses your system resources? I once worked on an installation (no, you can't spin up more VMs in The Cloud) where a single computer had to handle groups of processes that each have:

What would happen if I just threw all these tasks at a single thread pool? So the pool is too small, and the disk reads cause OSC messages to be missed? Do I increase the pool size? Now what if all of the 30 CPU-bound jobs try to run at once?
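One possible way out, sketched under heavy assumptions: give each kind of work its own bounded pool, so that one kind can't starve the other. The pool sizes, the job functions and the file name below are all made up for illustration, and the sleeps only pretend to be work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sizes: tune them to the machine, not to the number of tasks.
io_pool = ThreadPoolExecutor(max_workers=8, thread_name_prefix="io")
cpu_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="cpu")

_lock = threading.Lock()
_running = 0
peak_cpu_jobs = 0  # highest number of CPU jobs seen running at once


def cpu_job(n):
    """Stand-in for a CPU-bound job; also tracks concurrency."""
    global _running, peak_cpu_jobs
    with _lock:
        _running += 1
        peak_cpu_jobs = max(peak_cpu_jobs, _running)
    try:
        time.sleep(0.01)  # pretend to compute
        return n * n
    finally:
        with _lock:
            _running -= 1


def read_disk(name):
    """Stand-in for a blocking disk read."""
    time.sleep(0.01)
    return f"contents of {name}"


cpu_futures = [cpu_pool.submit(cpu_job, n) for n in range(30)]
io_future = io_pool.submit(read_disk, "sample.wav")

results = [f.result() for f in cpu_futures]  # .result() re-raises task errors
disk_data = io_future.result()

cpu_pool.shutdown()
io_pool.shutdown()
```

All 30 CPU jobs get queued, but only two ever run at once, and the disk read does not have to wait behind them. In CPython, real CPU-bound work would probably want a process pool rather than threads; threads are used here only to keep the sketch short.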

Doing concurrency right: it's not hard!

Tools

Philosophy

What else?

I need to learn Erlang one day. I'm quite sure I've built an ad-hoc, informally-specified, bug-ridden, slow implementation of half of OTP at least once by now.

