Continued from Building Docker Exec Part 1

Some of the programs I wanted to run with my new tools used features enabled at the compiler level, for example C++11 or C++14. I didn’t want separate Docker images for the various versions of each language. Further to this, what if the code I wanted to execute had multiple source files, and what if it required user input? I decided to pass everything to the script as part of the entrypoint arguments when calling ‘docker run’.

Expanding the number of source files for compiled languages was easy: just pass more files for mounting to ‘docker run’. But now there were three types of thing I needed to pass to the script, and each set was of arbitrary length, which ruled out picking them out based on order. Instead I implemented parsing of command line switches and flags in the script, which allowed me to denote arguments passed to the compiler like this:

-b foo
--build-arg foo
--build-arg=foo

Arguments passed to the executing program took the form:

-a foo
--arg foo
--arg=foo

And sources were anything else.

At this point I wanted a way to verify that all the images handled input and output consistently, so I wrote a test runner in shell that would clone each image repository from GitHub and build the image with a “testing” tag. It would then execute sample programs for each language and verify that the output was as expected. The two test programs included in every image repository were:

  • hello world - the test would verify the output was “hello world”
  • echo chamber - the test would verify the output was the arguments passed to the program separated by newlines
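The gist of each check is simple: run the sample program in the freshly built image and compare what comes back with what was expected. Here is a rough sketch of that idea in Go rather than shell, purely for illustration - the image tag, sample file names and helper name are assumptions, though the mount point matches the examples later in this post:

package main

import (
    "fmt"
    "os/exec"
    "path/filepath"
    "strings"
)

// runSample executes one sample source file in an image built with the
// "testing" tag and returns whatever the program printed.
func runSample(image, source string, progArgs ...string) (string, error) {
    abs, err := filepath.Abs(source)
    if err != nil {
        return "", err
    }
    mount := fmt.Sprintf("%s:/tmp/dexec/build/%s", abs, filepath.Base(source))
    args := append([]string{"run", "--rm", "-v", mount, image, filepath.Base(source)}, progArgs...)
    out, err := exec.Command("docker", args...).Output()
    return strings.TrimSpace(string(out)), err
}

func main() {
    // hello world: the output must be exactly "hello world"
    if out, err := runSample("dexec/cpp:testing", "helloworld.cpp"); err != nil || out != "hello world" {
        fmt.Println("FAIL: hello world")
    }

    // echo chamber: the output must be the program arguments, one per line
    if out, err := runSample("dexec/cpp:testing", "echochamber.cpp", "--arg=foo", "--arg=bar"); err != nil || out != "foo\nbar" {
        fmt.Println("FAIL: echo chamber")
    }
}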

That covered the simple case for a program, but what about header files in C and C++, or input files for parsing? They don’t get passed to the compiler - they’re picked up just because they’re there, which means they need to be mounted into the container. I wanted to avoid mounting the current directory: if the image were run from the host system’s root, Docker would attempt to mount the whole file system into the executing container, so I preferred mounting individual files.

This was flexible, but ended up with verbose commands like:

docker run -t --rm \
    -v $(pwd -P)/foo.cpp:/tmp/dexec/build/foo.cpp \
    -v $(pwd -P)/bar.cpp:/tmp/dexec/build/bar.cpp \
    -v $(pwd -P)/foo.hpp:/tmp/dexec/build/foo.hpp \
    -v $(pwd -P)/bar.hpp:/tmp/dexec/build/bar.hpp \
    -v $(pwd -P)/infile.txt:/tmp/dexec/build/infile.txt \
    dexec/cpp foo.cpp bar.cpp \
    --build-arg=-std=c++11 \
    --arg=infile.txt

I started thinking about how best to reduce the verbosity and initially pulled it into a shell function called ‘dexec’, for ‘Docker Exec’. The shell function would perform similar command line parsing to that already present in the scripts on the images themselves.
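In essence, the wrapper just has to reassemble the verbose ‘docker run’ command shown above from its parsed inputs. A minimal sketch of that step in Go (the first version really was a shell function; the function names here are illustrative, while the image name and mount point follow the example above):

package main

import (
    "fmt"
    "os"
    "os/exec"
)

// buildDockerArgs assembles the argument list for 'docker run' from the
// parsed inputs: source files, extra files to mount (headers, input files),
// compiler arguments and program arguments.
func buildDockerArgs(image, dir string, sources, extraMounts, buildArgs, progArgs []string) []string {
    args := []string{"run", "-t", "--rm"}
    for _, f := range append(append([]string{}, sources...), extraMounts...) {
        args = append(args, "-v", fmt.Sprintf("%s/%s:/tmp/dexec/build/%s", dir, f, f))
    }
    args = append(args, image)
    args = append(args, sources...)
    for _, b := range buildArgs {
        args = append(args, "--build-arg="+b)
    }
    for _, a := range progArgs {
        args = append(args, "--arg="+a)
    }
    return args
}

func main() {
    wd, _ := os.Getwd()
    args := buildDockerArgs("dexec/cpp", wd,
        []string{"foo.cpp", "bar.cpp"},
        []string{"foo.hpp", "bar.hpp", "infile.txt"},
        []string{"-std=c++11"},
        []string{"infile.txt"})

    cmd := exec.Command("docker", args...)
    cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
    if err := cmd.Run(); err != nil {
        fmt.Fprintln(os.Stderr, err)
    }
}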

This all sounded brilliant, but as I have the attention span of a three year old, having implemented a basic version as a shell function I decided to rewrite it in Go. Go seemed an interesting language - a colleague of mine had recently declared that the VM was dead (with reference to Java) and that Go was the future - and I was curious to take it a step further than a cursory “hello world”.

So I began implementing a version of the ‘dexec’ function in Go as a command line interface. One of the things I immediately liked about Go was its companion tooling, by which I mean the built-in test runner and the ‘gofmt’ utility for formatting code according to the accepted Go conventions. I used Atom as an editor, which has great support for these through the excellent go-plus package: it lints on the fly and runs gofmt on save. In addition, the integration of the ‘go get’ command with GitHub is a master stroke, allowing easy dependency resolution without the need for a hand-rolled repository service like those of Maven or npm.

I really like the ability to test open source projects for free with Travis CI, and it has great out-of-the-box support for Go. I made use of the new containerised infrastructure option (simply a matter of adding “sudo: false” at the top of the .travis.yml file), and the unit tests are run on every commit.

Initially I made use of codegangsta’s cli library to do the parsing, but it expected arguments not prefixed with a switch to come at the end of the argument list. I didn’t want that restriction - I preferred to put the execution arguments at the end. As my requirements were fairly specific, I chose to write a simple argument parser myself that fitted them exactly.
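The parser itself doesn’t need to be anything clever: walk the argument list once, pick out the switches, and treat everything else as a source file wherever it happens to appear. Something along these lines - a simplified sketch rather than the actual dexec code:

package main

import (
    "fmt"
    "os"
    "strings"
)

// parsedArgs holds the three kinds of argument dexec distinguishes.
type parsedArgs struct {
    sources   []string
    buildArgs []string
    progArgs  []string
}

func parse(args []string) parsedArgs {
    var p parsedArgs
    for i := 0; i < len(args); i++ {
        switch arg := args[i]; {
        case strings.HasPrefix(arg, "--build-arg="):
            p.buildArgs = append(p.buildArgs, strings.TrimPrefix(arg, "--build-arg="))
        case strings.HasPrefix(arg, "--arg="):
            p.progArgs = append(p.progArgs, strings.TrimPrefix(arg, "--arg="))
        case arg == "-b" || arg == "--build-arg":
            if i+1 < len(args) {
                i++
                p.buildArgs = append(p.buildArgs, args[i])
            }
        case arg == "-a" || arg == "--arg":
            if i+1 < len(args) {
                i++
                p.progArgs = append(p.progArgs, args[i])
            }
        default:
            // anything without a switch is a source file, wherever it appears
            p.sources = append(p.sources, arg)
        }
    }
    return p
}

func main() {
    // sources can sit before, between or after the switches
    fmt.Printf("%+v\n", parse(os.Args[1:]))
}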

Extending this, I added the ability to mount files and directories with an include flag. This also accepted a suffix of ‘:ro’ or ‘:rw’, which Docker uses when mounting to indicate ‘read only’ or ‘read write’. The utility would also derive which image was required from the extension of the first source file, removing another line from the verbose invocation and allowing for a much nicer syntax that I wouldn’t object to on a single line:

dexec foo.cpp -i bar.hpp -b=-std=c++11 -a foo -a bar
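Both conveniences are small pieces of bookkeeping: a lookup from file extension to image, and a mount string builder that passes any ‘:ro’ or ‘:rw’ suffix straight through to Docker. Roughly like this - the extension map below is an illustrative subset, not the full list the utility supports:

package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// imageForExtension maps a source file extension to the image that can
// build and run it. Only a handful of entries are shown here.
var imageForExtension = map[string]string{
    ".c":   "dexec/c",
    ".cpp": "dexec/cpp",
    ".go":  "dexec/go",
    ".py":  "dexec/python",
}

// mountFlag turns an include such as "bar.hpp" or "data:ro" into the value
// passed to docker run's -v flag, preserving any ':ro' or ':rw' suffix.
func mountFlag(include, workDir string) string {
    path, mode := include, ""
    if strings.HasSuffix(include, ":ro") || strings.HasSuffix(include, ":rw") {
        path, mode = include[:len(include)-3], include[len(include)-3:]
    }
    return fmt.Sprintf("%s:/tmp/dexec/build/%s%s", filepath.Join(workDir, path), filepath.Base(path), mode)
}

func main() {
    fmt.Println("image:", imageForExtension[filepath.Ext("foo.cpp")])
    fmt.Println("-v", mountFlag("bar.hpp:ro", "/home/user/project"))
}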

As a further step, I wanted to be able to make any source file supported by the system executable using shebang notation at the start of the file, which would instruct the parent shell to invoke ‘dexec’ on that source file. The problem with adding a shebang is that most compilers and interpreters don’t like it, which meant stripping it out at some point while making sure not to modify the original file. I came up with a hacky method for achieving this using directory diffs which is good enough for now, so I added a test for it, tagged the common source code and updated the repositories where it was used as a submodule to point to the tagged version.
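For reference, the shebang here is just a first line along the lines of ‘#!/usr/bin/env dexec’ at the top of the source file. One straightforward way to strip it without touching the original - a sketch of the general idea, not the directory-diff approach described above - is to write a sanitised copy somewhere temporary before mounting it:

package main

import (
    "bufio"
    "fmt"
    "os"
    "path/filepath"
    "strings"
)

// copyWithoutShebang writes a copy of src into destDir with any leading
// "#!" line removed, leaving the original file untouched.
func copyWithoutShebang(src, destDir string) (string, error) {
    in, err := os.Open(src)
    if err != nil {
        return "", err
    }
    defer in.Close()

    dest := filepath.Join(destDir, filepath.Base(src))
    out, err := os.Create(dest)
    if err != nil {
        return "", err
    }
    defer out.Close()

    scanner := bufio.NewScanner(in)
    for first := true; scanner.Scan(); first = false {
        line := scanner.Text()
        if first && strings.HasPrefix(line, "#!") {
            continue // drop the shebang so the compiler never sees it
        }
        if _, err := fmt.Fprintln(out, line); err != nil {
            return "", err
        }
    }
    return dest, scanner.Err()
}

func main() {
    dest, err := copyWithoutShebang("foo.cpp", os.TempDir())
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Println("sanitised copy written to", dest)
}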

At this point I rewrote the Dockerfiles for the images, using debian:8.0 as the base image for all of them rather than a disparate variety of base images. The reason was that some of the resulting images were quite large (and a handful still are!), so I figured sharing a single base image might help keep the overall size down if you had all of the images downloaded locally. The stable images could then themselves be tagged in both GitHub and Docker Hub. With the tagged images, I could pin exact versions in the ‘dexec’ utility, providing a guaranteed environment for executing code.

With a 1.0.0 version of the application, I had a look at options for distribution, at the very least so I could easily use it on other platforms. The simplest means with Go is to run “go install github.com/docker-exec/dexec”, but this requires Go to be configured first. Adding it to package managers would require platform-specific configuration - e.g. adding a PPA on Ubuntu or a tap for Homebrew.

Go has an awesome community project called “goxc”, or Go Cross Compiler, which allows you to build and package binaries for a whole host of systems. On top of this, it will also publish to Bintray, which hosts open source binaries for free. This was exactly what I wanted and was incredibly easy to set up to boot, resulting in the compilation, testing, packaging and distribution of dexec 1.0.0.

The final step was to update the ‘dexec’ readme with examples. I also created a GitHub page for the project at https://docker-exec.github.io/, as the whole project, or at least parts of it, may be useful to other people.

This is the end result, a simple command line utility capable of executing code in many different languages using Docker:

[dexec demo animation]