My scripts: wait-for-host

A walkthrough of a tiny but surprisingly useful script that waits for some host(s) to come back up (or go down).

Useful for rebooting, or very simple monitoring.

Complete source

Find it on GitHub.

Walkthrough

Portability, or: ping(8) is a mess

I use three UNIX-like operating systems on a regular basis. Turns out, the further you dig down, the more often you will find minor (or major!) differences.

Even a command as simple as ping(8) differs: on macOS, -W takes its timeout in milliseconds; on OpenBSD the equivalent flag is spelled -w; and the set of supported flags differs from one OS to the next…

Making ping do the same thing on macOS, Linux, and OpenBSD
case $(uname) in
    Darwin)
        check_ping() {
            # https://keith.github.io/xcode-man-pages/ping.8.html
            ping -qQ -c1 -W$((${TIMEOUT} * 1000)) "$1" 2>/dev/null >/dev/null
        }
        ;;
    Linux)
        check_ping() {
            # https://linux.die.net/man/8/ping
            ping -q -c1 -W${TIMEOUT} "$1" 2>/dev/null >/dev/null
        }
        ;;
    OpenBSD)
        check_ping() {
            # https://man.openbsd.org/ping.8
            ping -q -c1 -w${TIMEOUT} "$1" 2>/dev/null >/dev/null
        }
        ;;
    *)
        printf 'unsupported OS: %s\n' "$(uname)" >&2
        exit 111
        ;;
esac

My general strategy is to wrap all OS-specific commands in a single place, like a giant case block, so that the rest of the script can remain easily portable.

case $(uname) in
    This-Os)
        do_convert() {
            echo "wololo"
        }
        ;;
    thatOS)
        do_convert() {
            winfestor --neural-parasite
        }
        ;;
    *)
        printf 'unsupported OS: %s\n' "$(uname)" >&2
        exit 111
        ;;
esac

Side note: I usually don’t bail out on an unsupported OS; instead I fall back to whatever the relevant standard specifies. In ping’s case, though, it’s safe to assume that it’s not safe to assume.
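To illustrate the "fall back to a standard" variant of this pattern, here is a minimal sketch. The file_size helper is hypothetical and not part of wait-for-host; it assumes GNU stat on Linux and BSD stat on macOS and OpenBSD, and uses the POSIX-specified wc -c everywhere else instead of exiting.

```shell
#!/bin/sh
# Hypothetical helper (not from wait-for-host): print a file's size in bytes.
case $(uname) in
    Linux)
        file_size() {
            stat -c %s "$1"    # GNU-style stat
        }
        ;;
    Darwin|OpenBSD)
        file_size() {
            stat -f %z "$1"    # BSD-style stat
        }
        ;;
    *)
        # Unknown OS: rely on what POSIX specifies rather than giving up.
        file_size() {
            # The arithmetic expansion strips the padding that some wc(1)
            # implementations put around the number.
            printf '%s\n' "$(( $(wc -c < "$1") ))"
        }
        ;;
esac
```

The rest of a script would call file_size without caring which branch defined it, for example: file_size /etc/hosts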

Argument parsing: getopt(3) is still great

There are thousands of libraries that help you parse command line arguments, many considered easy, modern, or powerful. I still prefer getopt(3) (here via its shell counterpart, getopt(1)), because it’s universally available, flexible, and nudges you towards more carefully planned UX.

The option parsing code
    mode=up
    # getopt(1) normalizes the flags and appends "--" before the hostnames
    args=$(getopt "du" $*)
    set -- $args
    while :; do
        case "$1" in
            -u)
                mode=up
                shift
                ;;
            -d)
                mode=down
                shift
                ;;
            --)
                shift
                break
                ;;
        esac
    done

Perhaps this is somewhat chatty, but it’s also simple and obvious. With -u (the default), the script waits for each host to come up; with -d, it waits for each host to go down. I could write some extra code to make these options mutually exclusive, but I believe that would be overkill; as written, the last one given simply wins.

What this code is missing is a proper help message. The entire script is 83 lines; my only excuse is laziness.
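For what it's worth, a help message could be bolted on roughly like this. This is only a sketch under assumptions: the usage wording, the parse_args and usage function names, the hosts variable, and the exit code 2 are all made up for illustration and do not come from the actual script.

```shell
#!/bin/sh
# Hypothetical usage text; the wording is an assumption, not the author's.
usage() {
    printf 'usage: wait-for-host [-d | -u] host ...\n' >&2
    printf '  -u  wait for each host to come up (default)\n' >&2
    printf '  -d  wait for each host to go down\n' >&2
}

# Hypothetical wrapper around the option parsing shown above.
# Sets $mode and $hosts; returns 2 after printing the usage text on misuse.
parse_args() {
    mode=up
    # getopt(1) exits non-zero when it meets an unknown flag.
    args=$(getopt "du" "$@") || { usage; return 2; }
    set -- $args
    while :; do
        case "$1" in
            -u) mode=up; shift ;;
            -d) mode=down; shift ;;
            --) shift; break ;;
        esac
    done
    # At least one host is required.
    [ $# -gt 0 ] || { usage; return 2; }
    hosts=$*
}
```

A caller would then run something like: parse_args "$@" || exit 2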

The main course

Depending on the check (up or down), I pick one of the two slightly different main loops:

for host in ...; ...
    while ! check_ping "$host"; ...
    printf ' up\n'

for host in ...; ...
    while check_ping "$host"; ...
    printf ' down\n'

Perhaps these different loops could be refactored and folded into one, but that would make the code less readable.
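Since the snippets above elide the details, here is one way the two loops might be filled in. Treat it as a guess rather than the real source: the wait_up and wait_down function names, the INTERVAL variable, the hostname prefix, and the progress dots are all assumptions, and the check_ping defined here is only a stand-in for the per-OS wrappers.

```shell
#!/bin/sh
# Sketch only: INTERVAL, wait_up and wait_down are hypothetical names.
INTERVAL=${INTERVAL:-1}

# Stand-in so the sketch runs on its own; the real script uses the per-OS
# check_ping wrappers from the case block.
check_ping() {
    ping -q -c1 "$1" >/dev/null 2>&1
}

# Wait until each host answers, printing one dot per failed attempt.
wait_up() {
    for host in "$@"; do
        printf '%s' "$host"
        while ! check_ping "$host"; do
            printf '.'
            sleep "$INTERVAL"
        done
        printf ' up\n'
    done
}

# Wait until each host stops answering, printing one dot per successful ping.
wait_down() {
    for host in "$@"; do
        printf '%s' "$host"
        while check_ping "$host"; do
            printf '.'
            sleep "$INTERVAL"
        done
        printf ' down\n'
    done
}
```

The only difference between the two functions is the negation in the while condition and the final word, which is why keeping them separate costs little.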

Another small thing worth noting is that I use printf(1) rather than echo. I do know that echo -n omits the final newline, but since I’m interleaving many small output calls, I find that spelling out each newline explicitly makes the code clearer.

Future improvements?

This utility handles a list of hosts, but it processes them serially. I could introduce forking via “&” and wait, but making the output make sense would be too complex.
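For the curious, a rough sketch of the “&” and wait approach follows. It sidesteps the output problem rather than solving it: each background job stays silent and prints a single complete line when its host answers, so the per-host progress dots are lost. The wait_up_parallel name and INTERVAL are hypothetical, and check_ping is again a stand-in for the per-OS wrappers.

```shell
#!/bin/sh
# Sketch only: wait_up_parallel and INTERVAL are hypothetical names.
INTERVAL=${INTERVAL:-1}

# Stand-in so the sketch runs on its own; the real script would use the
# per-OS check_ping wrappers instead.
check_ping() {
    ping -q -c1 "$1" >/dev/null 2>&1
}

wait_up_parallel() {
    for host in "$@"; do
        (
            # Each subshell polls one host and stays silent until it answers.
            while ! check_ping "$host"; do
                sleep "$INTERVAL"
            done
            # One short, complete line per host, to avoid mixing partial
            # output from different hosts.
            printf '%s up\n' "$host"
        ) &
    done
    # Block until every background subshell has finished.
    wait
}
```

Lines appear in the order the hosts come up, not in the order they were given on the command line.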

In the case of ad-hoc monitoring, it could cycle through the entire list of hosts in order, and report when any of them is down. This would probably be a separate mode from -d. However, I’m afraid that by making this too easy, I could be tempted into settling on this as an actual monitoring solution. There’s nothing more permanent than a temporary fix.

Final words

I have two simple rules for shell scripting:

  1. It must be simple to read, while fitting in 100 lines.
  2. It must accomplish something useful that a one-liner couldn’t do.

This one checks both boxes.