DockerConEU 2015 Talk – You Know More Than You Think

This is the prepared text of a talk I gave at DockerConEU 2015.



‘Trust yourself. You know more than you think.’ If I had to distil this talk into one phrase, it would be that. My experience of initiating and prosecuting change within an organisation has only hardened my view that it’s you – the engineer who’s been shipping code for years, the technical leader who’s been fighting the good fight in meeting after meeting – it’s you who knows what needs to be done, and often that involves doing things that may feel wrong, or even be wrong. I hope that by the end you’ll feel emboldened to trust yourself that bit more and enable Docker for your own organisation in your own way.



My name is Ian Miell and I’m honoured to be talking here today, given that so many great people submitted talks I’d have liked to hear.

So why should you listen to me? From a Docker perspective I’ve done all of the usual community things:

– written one of the three million Docker ecosystem tools (sorry)
– had so many builds on Docker Hub they had to ask me to stop (sorry)
– written a Docker and DevOps blog
– spoken at meetups
– published a video on Docker aimed at web developers
– published a book on Docker in Practice

I worked for 14 years for the leading supplier of online sports betting and casino software, and pounced on Docker as a solution to many of the problems I faced as Head of DevOps (whatever that means), and latterly when I was put in charge of IT infrastructure as well. Using this and other experience, I then moved on somewhere else, where a large chunk of my responsibility is to be a reference point for Docker.

It’s the practical angle I want to talk about today. About how getting Docker done in a living, breathing organisation meant breaking some rules, and how it worked out for me. I hope it’s useful to some of you, and if not, I hope it’s at least interesting.



First I need to set the scene of why I decided to go all-in on Docker.


In September 2013 I read in Wired magazine about a new technology called Docker.


The timing couldn’t have been more perfect. I was a DevOps Manager with no budget for DevOps, in a company that couldn’t get a useful VM infrastructure going. We were a software company with 25 customers, and of those about 6 were big players. They competed with each other to throw changes out as fast as possible and were willing to accept technical debt, even wear it as a badge of commitment to delivery. I knew this because I lived daily with the consequences. I was responsible for managing outages.


I liked to argue that we had exactly the wrong number of customers to avoid technical debt – if we had two customers of similar size we could do things consistently between them; if we had 200,000 we could do what we wanted, and they would vote with their feet. On top of this we were a time-and-materials company, and which customer would want to pay for testing that other customers get the benefit of?

As it was, we had a few big players with big pockets who wanted to differentiate themselves from their rivals by pushing forked changes out ever faster. Not a great environment for productization, and there were subtle and significant differences between customer systems.

As Live Problem Manager and DevOps Head my biggest frustration was an inability to create realistic customer-specific environments to reproduce problems. Environments were a rare commodity, hand-crafted by old hands like me based on folklore, wiki pages full of bash commands hurriedly noted, and old-fashioned grit. New features always took priority and no customer wanted to pay to sort out technical debt, but were happy to pay to shout at me.

I’d long bemoaned this and wondered what could be done about it. The standard answers – VMs plus Chef/Puppet/Ansible – were not yielding results, since no-one wanted to tackle this 15-year-old software stack and no-one had the time. As an aside, one of the commonest objections to my advocacy of Docker was: ‘you can do all that with VMs’. Which is true, but it’s far less convenient, iterations are slower, and in my experience it’s less stable and more painful to use. Tellingly, our technical presales engineer – who had to manage multiple environments in various states – went back to maintaining shell scripts for the environments on his laptop because VMs ‘were not worth the hassle’. In any case, despite the talk, no-one had managed to demonstrate this working. As I’ll discuss later, the time and resource savings that containers bring change the paradigm of development.
After reading the article I checked the project out, started using it, and on Monday went into work with a proof of concept


which led to my first choice:


At this point I could have gone to my management and argued the case for this new technology, and waited for a decision and a budget. I chose not to. Instead I sent an email out on the Monday asking whether anyone else had heard of it and whether anyone wanted to work on a way of solving some of our problems. I’ll talk about what happened next in a moment, but at this point I want to talk about failure.

It was failure that drove me down this path, because I’d been here before. In 2006 I and a few others advocated the use of Erlang to solve some specific engineering challenges we were going to have in the coming years to do with scalability and real-time data. We went to the CTO and asked for his support. After many months a relatively insignificant and unrelated toy project was given to someone else in the company apparently uninterested in learning a new technology, and we watched as the results withered on the vine. I still don’t know whether that was the right outcome or not, but I’d seen enough not to let the fate of my vision depend on something I had so little influence over.

As an aside, I think the parallels between Erlang and the whole ‘Data Centre as a Computer’ movement, of which Docker is a part, are under-explored. In case you don’t know, Erlang was an engineering solution to the problem of fault tolerance in telco data centres, built around a message-passing architecture.


You can’t get much more microservices than services built from millions of co-routines that take up only a few dozen bytes of memory by default, and this standard Erlang diagram is one that will look familiar to anyone who’s used Kubernetes. Anyone interested in what happens next with microservices will do well to look at the history of Erlang.

But back to the point. After sending out my email I had a few responses and a small group of people interested in taking things further. This had a number of beneficial consequences in the following months:

– the team was motivated, and the quality of engineers involved was high
– those that didn’t deliver anything found they had no voice and dropped out
– conversely, those that did deliver got a say and felt empowered to contribute more
– it was fun! solving long-standing problems one by one and pulling together was incredibly satisfying
– time was allocated naturally to where we felt it was important – there was no bureaucracy, no deliverables, no project plans, no business case

By focussing on building solutions rather than seeking support elsewhere, a lot of time and energy was saved. A lot, but not all: I had to pull in a lot of favours to get resources and access to things outside the normal processes. A lot of chatter ensued in the organisation about what we were doing, and about the supposed conflict between our work and the more strategic solutions being posited by others.

Much of this chatter centred around our solutions not being ‘industry standard’, which leads me to my next choice:

When I saw Docker I thought ‘great!’ – I can run multiple reproducible environments cheaply, save state usefully, all without much hassle and outlay. So the natural and immediate plan was to simply shove everything into a container and allow everyone to consume it as a reference.


I went onto mailing lists to ask about how to achieve this, and got responses like ‘I wouldn’t start from here – you should be using microservices, that’s what Docker is for’.
Fortunately I had the confidence to decide against doing this, mainly because the task was too great. Converting a 15-year-old hub-and-spoke architecture with millions of lines of code and hundreds of apps was not a project I wanted to take on in my spare time, and attempting it would have doomed my efforts to complete failure. I think legacy is a fascinating area for Docker, and one it’s going to have to deal with. Based on experience at this and other organisations, I’ve come to believe that the approach to Docker for legacy apps should come in three stages:


– Monolithic build, the speed of which enables
– A DevOps workflow, which naturally leads to
– A break-up into microservices

The point is that real projects, real budgets (0 in my case) cannot afford to do everything properly, and even if they do, they risk running into the sand and losing momentum. An evolutionary approach is required.
Given that I had a monolith to contain, my next choice was how to build it. Again, I need to set the scene.

Since environments were created by hand by experienced engineers, in whatever inconsistent environments our customers supplied, there was a lack of configuration management experience where I worked. Nonetheless, I figured I should try to do the standard thing, and spent one of my precious weekend days trying to learn Chef by watching some introductory videos. A couple of hours later time was running out and I was no nearer. At that point – and out of frustration – I whipped up a solution which I knew would work for me, using tools I already knew: Python, bash and (p)expect.

I didn’t believe this was the ideal, but I’d built what I needed, I had complete control over it, and I knew that our project could deliver something useful quickly. When I showed what I’d done to people at work, the response typically was: ‘you should be using industry standard tools for this’, to which my response was: ‘agreed – here are the shell scripts, here’s how I’ve done it, please replicate what I’ve done with whichever tool you like and we’ll move to it’. No-one did this.

This approach proved to be very useful for the project for a number of reasons:

One, I’d designed it to be easy to hack on. As an engineer, all you needed to do to contribute was cut and paste code that amounted to shell commands, and re-run it on your laptop. As our work got taken up through the organisation, contributions were easily made by others.

Two, the tool did exactly what we needed to achieve our goal: no more, and no less. If it didn’t, we built it. This was fun, and empowering. I learned a hell of a lot about config management tooling challenges, which has helped me a great deal as I’ve moved on and picked up other ‘real’ configuration management tools as part of my work. [As an aside, I did a similar thing with CI tools like Jenkins – I implemented a minimal CI tool in bash called ‘cheapci’, available on GitHub, which also helped me understand the problems of CI.]

Three, it allowed us to defer the decision about which configuration management tool to use. Since I’d designed it to organise a series of shell scripts and run them in a defined order, one of the outputs was a list of commands that could be fed into any tool you liked, or even run by hand.
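A minimal sketch of that idea – nothing like the real ShutIt internals, and with invented step names – might look like this:

```shell
# Sketch: 'config management' as ordered shell scripts. Each numbered step is
# a plain shell snippet; running the steps in lexical order both builds the
# environment and emits a replayable command list.
mkdir -p steps
printf 'echo installing dependencies\n' > steps/01-deps.sh
printf 'echo configuring application\n' > steps/02-config.sh

: > commands.log
for step in steps/*.sh; do
    cat "$step" >> commands.log   # the log doubles as input for any real CM tool
    bash "$step"                  # actually perform the step
done
```

Because the output is just a list of shell commands, the eventual ‘real’ configuration management tool could be fed the same steps later.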

So, as with monoliths, I’m not sure ‘not invented here’ deserves such a bad rap. If your aim is to deliver and control your solution and you have the skills, building your own tool can be the right choice, at least for getting your project done. And if you want to get Docker working in your organisation, getting to useful is your first priority. The tool eventually became known as ShutIt (ie not Chef, not Puppet: ShutIt), and after 4 months of legal discussion it was open-sourced; I still maintain it as one of those three million ecosystem tools. To be clear, I don’t suggest you use ShutIt (though I welcome contributions) – I’m just using it as an example of how ‘not invented here’ can be the right choice on the ground.

At this point I want to dwell on one of the points I just made in order to bring me onto


As I just mentioned, one of my design goals for ShutIt was that it should not get in the way of engineers that wanted to contribute to our endeavour. I didn’t want people to have to learn both Docker and another technology to contribute.

This was part of a broader plan to get people on board with what we were doing as far as possible, to reduce the barriers to entry, and to increase cross-fertilization between different parts of the company.

One of the patterns of failure I’d seen in attempts at technical change was that it was guarded and defended by a group of elite engineers, with little attempt made to persuade others – ie those that would eventually have to build on and maintain their efforts – to understand what was going on. A former colleague of mine pointed out that he’d been forced to use Maven with little support and that this caused him great resentment.

So from the very beginning I made sure I talked openly about what we were up to, both inside and outside the company.

One thing I was absolutely determined to make sure of following my experience with Erlang was to ensure that I took responsibility for knowledge sharing.

First I tried doing lectures in a room, which went OK, but a lucky accident led me down a different path. I couldn’t get a room for one session, so decided to do it over Google Hangout instead. This made the whole process far more efficient. People would remain at their desks as I introduced the material, and then they worked through the examples, speaking up when they got stuck. It allowed people to work at their own pace, interrupt freely, and feel like they had me as a helper as they taught themselves. I could even get on with other work while they worked through it. The PR effect was massive: people felt part of the change, a number of great ideas came out of it, and it made people want to smooth our path. I couldn’t recommend this more. And I’m in good company:


I came across this quote coincidentally last week and couldn’t agree with it more.

The other thing I did was put myself out there at meetups and talk openly about what we were up to and how we were doing it. What I found interesting was that a pattern of thought emerged which held people back from advocating change. Consistently, people would tell me that their organisation was dysfunctional


but that company X, or even all other companies seemed to have it sorted out:


I haven’t seen this place. I’ve seen some places do some things better than others, but usually these are the things that those businesses exist to do; it’s what they’re optimized for.

So much for the decisions. How did we get Docker taken up and what did it do for us?

One of our number worked in a team of forty engineers, and took up the challenge of getting his colleagues to use it.

There was significant resistance at first. Believe it or not, people were happy maintaining environments by hand. The critical insight we had was that while someone is on a project they don’t want to change, but when they come to start the next project, the benefits of the ‘dev env in a can’ are obvious.
Then, as more people started to use it, a network effect was created, and once about 8 were on it, the others soon followed.

There are many benefits I could mention, but I want to talk about three here that Docker facilitated. As more and more engineers embraced it, these benefits became mutually reinforcing in a virtuous circle.


By having a repeatable daily build of a development environment, friction between engineers and teams was significantly reduced.
Before Docker, environments were unique, so discussions about the software often devolved into discussions of the archaeology of that particular environment. Since we now had a reproducible way to get to the same starting state, reproducing state became simpler.
I ran the 3rd line support team, and with Docker we could instantly get an environment up and running to recreate problems seen on live, without begging favours from environment owners. In an early win for Docker I managed to reproduce a database engine crash caused by a single SQL command moments after we saw it happen on live. No need to find an environment that people were using and check it was OK to crash – this was contained on my laptop, and I didn’t even have to wait for an OS to boot up.

Interactions with test teams were made far simpler also. The daily build of the dev environment had some automated endpoint testing added to it, and the test team were notified by email, with the logs attached. This reduced the friction of interaction between testers and developers greatly, as there was no debate or negotiation about the environments being discussed.
Speed of delivery was also facilitated. Since fixes to the environment setup were shared across the team, there was a reduction in duplicated effort, and the benefits fed into the automated build. Testing these changes was much quicker thanks to the layered filesystem, which our build tool leveraged to allow quick testing before a full ‘phoenix’ (from-scratch) build.

To show I eat my own dogfood: a few years ago I wrote a website in my spare time to track mortgage rates, and I rebuild the site from scratch daily (video here). Doing that has had a number of very useful consequences. I can quickly make changes and run very simple tests against this static system, then throw it away if it doesn’t work – very little overhead. It also acts as a canary: I’ve caught some interesting problems very quickly that I otherwise wouldn’t have, had I only rebuilt on demand.
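The daily from-scratch rebuild amounts to something like the following sketch (the image name, test script and schedule here are assumptions, not the site’s actual setup):

```shell
# Write a daily rebuild script, then schedule it with cron, eg:
#   0 4 * * * /home/me/daily-rebuild.sh
cat > daily-rebuild.sh <<'EOF'
#!/bin/sh
set -e
docker build --no-cache -t mysite:daily .    # phoenix build: no stale cached layers
docker run --rm mysite:daily ./run_tests.sh  # smoke-test the fresh image
EOF
chmod +x daily-rebuild.sh
```

The `--no-cache` flag is the canary part: every dependency is fetched fresh, so upstream breakage shows up the next morning rather than at deploy time.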

Quality was also improved by being able to iterate faster and earlier in the cycle than before. A vivid example of this was with DB upgrades. Formerly, as we’d only had a few environments that were expensive to re-provision, DB upgrades were a haphazard and costly affair that took place on infrastructure hosted centrally.


Now DB upgrades could be iterated in very tight cycles on the dev laptop, reducing the cost of failure and improving the quality by the time the customer saw it.


Our CI process was also changed in two significant ways.


We had a monolithic model of CI: an enormous Jenkins server shared across all teams, on which changes could not easily be made – if you wanted a new version of Python, for example, that created all sorts of headaches for the central IT team, who found it hard to maintain stability while accommodating these demands. Docker threw all that out:


Teams could now take ownership of their own environments and take responsibility for stability themselves by producing their own images and containing dependencies to their own isolated environments.

What we did went beyond that, as we used the Jenkins Swarm plugin (not to be confused with the Docker product) to allow the developers’ own laptops to run CI. As one of my colleagues put it: ‘why is it so hard for me to provision a VM when I have a Core i5 laptop on my desk that’s mostly idle?’ So developers would submit their hardware to the Jenkins server as slaves, and Docker images were run on that hardware. This had the interesting property of allowing the compute to scale with the team – the more people that were in work committing changes, the more compute was available to use.
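The moving part on each laptop was small: just the Swarm client pointed at the central Jenkins master. A sketch, with the jar location, master URL and label all assumed rather than taken from our real setup:

```shell
# Turn a mostly-idle developer machine into a Jenkins slave.
# Requires Java and the swarm-client jar from the Jenkins Swarm plugin.
start_ci_slave() {
    java -jar swarm-client.jar \
        -master http://jenkins.internal:8080 \
        -labels docker \
        -executors 2 \
        -name "$(hostname)-slave"
}
# Run 'start_ci_slave' while you're at your desk; Ctrl-C it to take the cores back.
```

Jobs labelled `docker` then get farmed out to whichever laptops are connected, which is what makes the compute scale with the number of people at work.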


Once we’d done all this and got Docker embedded, we looked for ways to measure the return on investment. We had plenty of anecdotal evidence by this point, and positive feedback from both engineers and customers.

There was one small but vivid example of the savings made. There was an escrow process that we had to go through with some customers, which involved demonstrating to an auditor that in the event of a disaster the customer could reconstruct the website without us. Traditionally, this had taken a fair number of days to work through, and a good amount of negotiation with the auditor to get them to accept. In addition, it was un-repeatable – it took n days each time. With Docker and the tooling we’d built, we not only completed the task in one-fifth of the time, but the auditor (who had never heard of Docker) was satisfied after watching one run-through that the reconstruction was replicable – and the developers on that team got their environment into a container into the bargain.


These sorts of anecdotes were all very well, but we wanted to put real numbers on it. To this end we performed a survey of engineers that were actively using it, which boiled down to a simple question: how much time is this saving you a month? To cut to the chase, the rough figure was around 4 days a month for those users that actively embraced it. Interestingly, we found that engineers were reluctant to admit time was saved, as they felt this implied they had been inefficient pre-Docker.

In any case, if we took that 4-days-a-month figure and applied it across the 600 engineers, we came up with a figure of about 130 person-years saved per year, which amounted to a lot of money, as you can imagine. And bear in mind that this was before we got to improvements in customer perception – a less tangible, but no less important benefit – or efficiencies in hardware usage, which were significant.
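For the record, the back-of-envelope arithmetic behind that figure, assuming roughly 220 working days in a year:

```shell
# 4 days/month x 12 months x 600 engineers = total working days saved per year
days_saved_per_year=$((4 * 12 * 600))
echo "$days_saved_per_year days saved"                    # 28800 days
echo "$((days_saved_per_year / 220)) person-years saved"  # ~130 person-years
```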


These decisions are not advice! All of them were made in the context I had worked in for over a decade. If you already have working CM tools, maybe you should use those! If your C-level has a good history of funding and delivering promising projects, maybe a skunkworks would be needlessly hamstringing yourself. As I said at the beginning, you’re the one in your current situation, and you’re in the best place to figure out what needs to be done.

Thanks for listening.


The experience discussed here informed the writing of this book: Get 39% off with the code 39miell



My Favourite Docker Tip

Currently co-authoring a book on Docker: Get 39% off with the code 39miell


The Problem

To understand the problem, we’re going to show you a simple scenario where not having this tip in place is just plain annoying.

Imagine you are experimenting in Docker containers, and in the midst of your work you do something interesting and reusable. Here it’s going to be a simple echo command, but it could be some long and complex concatenation of programs that results in a useful output.

docker run -ti --rm ubuntu /bin/bash
echo my amazing command

Now you forget about this triumph, and after some time you want to recall the incredible echo command you ran earlier. Unfortunately you can’t recall it, and you no longer have the terminal session on your screen to scroll through. Out of habit you try looking through your bash history on the host:

history | grep amazing

…but nothing comes back, as the bash history was kept within the now-removed container, not on the host you were returned to.

The Solution – Manual

To share your bash history with the host, you can use a volume mount when running your docker images. Here’s an example:

docker run \
    -e HIST_FILE=/root/.bash_history \
    -v=$HOME/.bash_history:/root/.bash_history \
    -ti \
    ubuntu /bin/bash

The -e argument sets the history file that bash inside the container will write to.

The -v argument maps the container root user’s bash history file onto the host’s, saving the container’s history into your user’s bash history on the host.

This is quite a handful to type every time, so to make it more user-friendly you can set up an alias by putting the above command into your ‘~/.bashrc’ file:

alias dockbash='docker run -e HIST_FILE=/root/.bash_history -v=$HOME/.bash_history:/root/.bash_history'

Making it Seamless

This is still not seamless, as you have to remember to type ‘dockbash’ when you really wanted to perform a ‘docker run’ command. For a more seamless experience you can add this to your ‘~/.bashrc’ file:

function basher() {
    if [[ $1 = 'run' ]]; then
        shift
        /usr/bin/docker run \
            -e HIST_FILE=/root/.bash_history \
            -v $HOME/.bash_history:/root/.bash_history "$@"
    else
        /usr/bin/docker "$@"
    fi
}
alias docker=basher

This sets up an alias for docker which by default calls the ‘real’ docker executable in /usr/bin/docker. If the first argument is ‘run’, it adds the bash history magic.

Now when you next open a bash shell and run any ‘docker run’ command, the commands you run within that container will be added to your host’s bash history.


As a heavy Docker user, I’ve found this change has reduced my frustrations considerably. No longer do I think to myself ‘I’m sure I did something like this a while ago’ without being able to recover my actions.

Convert Any Server to a Docker Container






This post is based on material from Docker in Practice, available on Manning’s Early Access Program. Get 39% off with the code: 39miell


How and Why?

Let’s say you have a server that has been lovingly hand-crafted that you want to containerize.

Figuring out exactly what software is required on there and what config files need adjustment would be quite a task, but fortunately blueprint exists as a solution to that.

What I’ve done here is automate that process down to a few simple steps. Here’s how it works:


You kick off a ShutIt script (as root) that automates the bash interactions required to get a blueprint copy of your server. This in turn kicks off another ShutIt script which creates a Docker container, provisions it with the right stuff, and commits it. Got it? Don’t worry – it’s automated, and only a few lines of bash.

There are therefore 4 main steps to getting your server into a container:

– Install ShutIt on the server

– Check out the ‘copyserver’ ShutIt script

– Run the ‘copyserver’ ShutIt script

– Run your copyserver Docker image as a container

Step 1

Install ShutIt as root:

sudo su -
(apt-get update && apt-get install -y python-pip git docker) || (yum update -y && yum install -y python-pip git docker which)
pip install shutit

The pre-requisites are python-pip, git and docker. The exact names of these in your package manager may vary slightly (eg docker-io), depending on your distro.

You may need to make sure the docker server is running too, eg with ‘systemctl start docker’ or ‘service docker start’.

Step 2

Check out the copyserver script:

git clone

Step 3

Run the copy_server script:

cd shutit_copyserver/bin

There are a couple of prompts – one to correct perms on a config file, and another to ask what docker base image you want to use. Make sure you use one as close to the original server as possible.

Note that this requires a version of docker that has the ‘docker exec’ option.

Step 4

Run the resulting image:

docker run -ti copyserver /bin/bash

You are now in a practical facsimile of your server within a docker container!

This is not the finished article, so if you need help dockerizing a server, let me know what the problem is, as improvements can still be made.



A Field Guide to Docker Security Measures




If you’re unsure of how to secure Docker for your organisation (given that security wasn’t part of its original design), I thought it would be useful to itemise some of the ways in which you can reduce, or help manage, the risk of running it.

The Two Sides

In this context there are two sides to security from the point of view of a sysadmin, ‘outsider’ and ‘insider’:

  • ‘Outsider’ – preventing an attacker doing damage once they have access to a container
  • ‘Insider’ – preventing a malicious user with access to the docker command from doing damage

‘Outsider’ will be a familiar scenario to anyone who’s thought about security.

‘Insider’ may be a new scenario to some. Since Docker gives you the root user on the host system (albeit within a container), there is the potential to wreak havoc on the host, by accident or by design. A simple example (don’t run this at home, kids – I’ve put a dummy flag in anyway) is:

docker run --dontpastethis --privileged -v /usr:/usr busybox rm -rf /usr

which will delete your host’s /usr folder. If you want people to be able to run docker, but without the ability to do this level of damage, there are some steps you can take.

Some measures, naturally, will apply to both. Also some are as much organisational as technical.

Insiders and Outsiders

  • Run the Docker daemon with --selinux-enabled

If you run your Docker daemon with the --selinux-enabled flag, it will do a great deal to prevent those inside containers from doing damage to the host system, by applying its own security labels to container processes and files.

This can be set in your docker config file, which usually lives under /etc – typically /etc/docker or /etc/sysconfig/docker.
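On a Red Hat-style distro, for example, the fragment might look like this (the file location and variable name vary by distro and Docker version, so treat it as a sketch):

```shell
# /etc/sysconfig/docker – extra options passed to the Docker daemon at startup
OPTIONS='--selinux-enabled'
```

After restarting the daemon, `docker info` should list selinux among its security options.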

Defending Against Outsiders

  • Remove capabilities

Capabilities are a division of root’s privileges into 32 categories. Many of these are disabled by default in Docker (for example, you can’t manipulate iptables rules inside a Docker container by default).

To disable all of them you can run:

docker run -ti --cap-drop ALL debian /bin/bash

Or, if you want to be more fine-grained, you can drop them individually – start by dropping everything, then remove ‘--cap-drop’ flags as you find you need those capabilities:

docker run -ti --cap-drop=CHOWN --cap-drop=DAC_OVERRIDE \
    --cap-drop=FSETID --cap-drop=FOWNER --cap-drop=KILL \
    --cap-drop=MKNOD --cap-drop=NET_RAW --cap-drop=SETGID \
    --cap-drop=SETUID --cap-drop=SETFCAP --cap-drop=SETPCAP \
    --cap-drop=NET_BIND_SERVICE --cap-drop=SYS_CHROOT \
    --cap-drop=AUDIT_WRITE \
    debian /bin/bash

Run ‘man capabilities’ for more information.

Defending Against Insiders

The main problem with giving users access to the docker runtime is that they could run with --privileged and wreak havoc, even if you have SELinux enabled.

So if you’re sufficiently paranoid that you want to remove users’ direct access to the docker runtime, some problems arise:

– How do you prevent users from effectively running docker with privileges?

– How do you allow users to build images?

udocker is a highly experimental and as-yet incomplete program which only allows you to run docker containers as your own (already logged-in) user id.

It’s small enough for security inspection (just a few lines of code), and potentially very useful where you want to lock down what can be run.

To run:

$ git clone
$ apt-get install golang-go
$ go build
$ id
uid=1001(imiell) gid=1001(imiell) groups=1001(imiell),27(sudo),132(docker)
$ ./udocker fedora:20 whoami
whoami: cannot find name for user ID 1001
$ ./udocker fedora:20 build-locale-archive
permission denied
FATA[0000] Error response from daemon: Cannot start container 6ba3db7094a20c9742a3289401dcf915e03a2906d4e44dbbed42e194de13fd44: [8] System error: permission denied

Compare with normal docker:

$ docker run fedora:20 id
uid=0(root) gid=0(root) groups=0(root)

If you then lock down the docker runtime so that it is executable only by root, you disable much of docker’s attack surface.
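That lock-down can be as simple as the following sketch (run as root; the path is the usual default, and may differ on your system):

```shell
# Make the docker client runnable by root only; everyone else goes through
# whatever controlled channel (udocker, a build service) you provide instead.
lock_down_docker() {
    chown root:root /usr/bin/docker
    chmod 700 /usr/bin/docker   # no group or world access at all
}
```

Note this restricts the client binary only – anyone who can talk to the docker socket directly can still bypass it, so socket permissions need the same treatment.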

  • Docker build on audited server (and private registry)

One solution that allows building without access to the docker runtime may be to let people submit Dockerfiles via a limited web service which takes care of building the image for them.

It’s relatively easy to knock up a server with a web framework such as python-flask that takes a Dockerfile as a POST request, builds the image, and then deposits the resulting image for post-processing. Or you could even use email as a transport, and email back a tar file of the checked image build :)

You can also do your static Dockerfile and image checking here before allowing promotion to a privately-run registry. For example you could:

  • Enforce USERs in images

If you have a build server that takes a Dockerfile and produces an image, it becomes relatively easy to do tests.

The first static check I implemented was checking that the image had a valid USER:

– There is at least one USER line

– The last USER line is not root/uid0
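As a sketch (my own illustration, not the build server’s actual code), that check is only a few lines of Python:

```python
def check_user(dockerfile_text):
    """Static check on a Dockerfile: there must be at least one USER
    line, and the last USER must not be root (or uid 0)."""
    users = [line.split(None, 1)[1].strip()
             for line in dockerfile_text.splitlines()
             if line.strip().upper().startswith("USER ")]
    if not users:
        return False  # no USER line at all
    return users[-1] not in ("root", "0")

good = "FROM fedora:20\nRUN useradd app\nUSER app\nCMD app"
bad = "FROM fedora:20\nUSER root\nCMD app"
print(check_user(good), check_user(bad))  # True False
```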

  • Run in a VM

The Google approach. Give each user a locked-down VM on which they can run and do what they like, and define ingress and egress at that level.

This can be a pragmatic approach. Some will object that you lose a lot of the benefits of running Docker at scale, but for many developers running tests or Jenkins servers and slaves this will not matter.

Future Work

  • User namespaces

Support for the mapping of users from host to container is being discussed here:

Further Reading

There’s lots more going on in this space. Here are some highlighted links:

Comprehensive CIS Docker security guide

Docker’s security guide

GDS Docker security guidelines

Dan Walsh (aka Mr SELinux) talk on Docker security

This post is based on material from Docker in Practice, available on Manning’s Early Access Program. Get 39% off with the code: 39miell


Docker SELinux Experimentation with Reduced Pain




As a Docker enthusiast working for a corp that cares about security, I knew SELinux was going to be a big deal. While SELinux is simple in principle, in practice it’s difficult to get to grips with. My initial attempts involved reading out-of-date blogs about deprecated tools, and confusing introductions that left me wondering where to go.

Fortunately, I came across this blog, which explained how to implement an SELinux policy for apache in Docker.

I tried to apply this to a Vagrant CentOS image with Docker on it, but kept getting into a state where something wasn’t working, I didn’t know what had happened, and I’d have to re-provision the box, re-install the software, retrace my steps, and so on.

So I wrote a ShutIt script to automate this process, reducing the iteration time to re-provision and re-try changes to this SELinux policy.

See it in action here


This diagram illustrates the way this script works.


Once ShutIt is set up, you run it as root with:

# shutit build --delivery bash

The ‘build’ argument tells ShutIt to run the commands in the script against the relevant delivery target. By default this is Docker, but here we’re using ShutIt to automate the process of delivery via bash. ssh is also an option.

Running as root is obviously a risk, so be warned if you experiment with the script.

The script is here. It’s essentially a dynamic shell script (readily comprehended in the build method), which can react to different outputs. For example:

# If the Vagrantfile exists, we assume we've already init'd appropriately.
if not shutit.file_exists('Vagrantfile'):
	shutit.send('vagrant init jdiprizio/centos-docker-io')

only calls ‘vagrant init’ if there’s no Vagrantfile in the folder. Similarly, these lines:

# Query the status - if it's powered off or not created, bring it up.
if shutit.send_and_match_output('vagrant status',['.*poweroff.*','.*not created.*','.*aborted.*']):
    shutit.send('vagrant up')

send ‘vagrant status’ to the terminal and call ‘vagrant up’ if the status output indicates the VM isn’t already up. So the script only brings up the VM when needed.
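Stripped of ShutIt’s plumbing, that decision is just pattern-matching on the command’s output; a standalone sketch:

```python
import re

# The same patterns the script matches against `vagrant status` output.
DOWN_PATTERNS = ['poweroff', 'not created', 'aborted']

def needs_vagrant_up(status_output):
    """Return True if the `vagrant status` output indicates the VM
    is not currently running and should be brought up."""
    return any(re.search(p, status_output) for p in DOWN_PATTERNS)

print(needs_vagrant_up("default  poweroff (virtualbox)"))  # True
print(needs_vagrant_up("default  running (virtualbox)"))   # False
```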

And these lines:

vagrant_dir = shutit.cfg[self.module_id]['vagrant_dir']
setenforce  = shutit.cfg[self.module_id]['setenforce']

pick up the config items set in the get_config method, and use them to determine where to deploy on the host system and whether to fully enforce SELinux on the host.

Crucially, it doesn’t destroy the vagrant environment, so you can re-use the VM with all the software on it pre-installed. It ensures that the environment is cleaned up in such a way that you don’t waste time waiting for a long re-provisioning of the VM.

By setting the vagrant directory (which defaults to /tmp/vagrant_dir, see below) you can wipe it completely with an ‘rm -rf’ if you ever want to be sure you’re starting afresh.


Here’s the invocation with configuration options:

# shutit build -d bash \
    -s io.dockerinpractice.docker_selinux.docker_selinux setenforce no \
    -s io.dockerinpractice.docker_selinux.docker_selinux vagrant_dir /tmp/tmp_vagrant_dir

The -s options define the options available to the docker_selinux module. Here we specify that the VM should have setenforce set to off, and the vagrant directory to use is /tmp/tmp_vagrant_dir.


Instructions on setup are kept here

#install git
#install python-pip
#install docker
git clone
cd shutit
pip install --user -r requirements.txt
echo "export PATH=$(pwd):${PATH}" >> ~/.bashrc
. ~/.bashrc

Then clone the docker-selinux repo and run the script:

git clone
cd docker-selinux
sudo su
shutit build --delivery bash


Note that you may need to alter the line in the relevant file that invokes ‘docker’, changing it to ‘sudo docker’ or however you run docker on your host.


This has considerably sped up my experimentation with SELinux, and I now have a reliable and testable set of steps to help others (you!) get to grips with SELinux and improve our understanding.



Storage Drivers and Docker



Storage Drivers?

If you don’t know, Docker has various options for how to store its data. Originally it used AUFS (a layered filesystem), but this was not beloved by all, so as the likes of Red Hat got interested, other options appeared, including Devicemapper, VFS and OverlayFS.

Here’s a deck from a great talk by Jérôme Petazzoni on the subject.

OK, So What?

Docker is sexy, this is not.

But it’s going to be important to think about this if Docker is to be used in production. The selling point of Docker (and XaaSes in general) is more efficient use of resources, and a bad decision on storage drivers (or no decision at all) could cost you in compute resources or operational overhead.

I’ve put together this high-level, incomplete, and probably wrong view of storage drivers here, as I couldn’t find such a table anywhere else. I’d welcome corrections and improvements, and hope to update as I go.

The criteria are:

  • High density: is it designed to have lots of containers on the same disk (ie copy-on-write)?
  • Big files: does it handle big files gracefully (ie block-level rather than file-level)?
  • Encryption: does it support encryption of the files?
  • SELinux: is there SELinux support?
  • Space limits: will the container hit space limits (before standard FS limits are hit)?
  • Page cache sharing: can the OS share page caches between different containers?


Page Cache Sharing

As someone that works for a corp with the capacity to run a private Docker environment, the column I find most interesting is the “page cache share” one. If you’re running hundreds of thousands of containers over your estate and you have a limited number of blessed images to work from, then the savings in memory from sharing page caches across containers will be compelling.

Big Files

I’ve experienced first hand the pain of having a system that copies large files on write. If you have a monolithic database running within a container (I’m talking several Gig), then it’s painful to wait for the copy of a single massive data file to update one row while your container is running.


As VFS does a full copy rather than copy-on-write, it may be useful if you are OK taking the filesystem hit when starting up your containers and don’t care about disk space. In return, you get (presumably) near-native performance. I’ve not used this.
Space Limits
By default, Devicemapper has a 10G limit for containers. It’s surprisingly difficult to resize this out of the box, which can get operationally annoying if you’ve not seen it before.


The area of storage drivers within Docker is still not mature. While OverlayFS looks promising (and is reputedly dog-fooded at Docker itself), it may not be the last word, and it isn’t supported everywhere.

Feedback Wanted

Please send me feedback via twitter (@ianmiell) or if you want to mail me privately go via LinkedIn (Ian Miell)



Play With Kubernetes Quickly Using Docker






In case you don’t know, Kubernetes is a Google open source project that tackles the problem of how to orchestrate your Docker containers on a data centre.

In a sentence, it allows you to treat groups of Docker containers as single units with their own addressable IP across hosts, and to scale them as you wish. It lets you be declarative about services much as Puppet or Chef let you be declarative about configuration, leaving Kubernetes to take care of the details.


Kubernetes has some terminology it’s worth noting here:

  • Pods: groupings of containers
  • Controllers: entities that drive the state of the Kubernetes cluster towards the desired state
  • Service: a set of pods that work together
  • Label: a simple name-value pair
  • Hyperkube: an all-in-one binary that can run a server
  • Kubelet: an agent that runs on nodes and monitors containers, restarting them if necessary

Labels are a central concept in Kubernetes. By labelling Kubernetes entities, you can take actions across all relevant pods in your data centre. For example, you might want to ensure web server pods run only on specific nodes.
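To make the idea concrete, a label selector boils down to matching name-value pairs. A toy sketch in Python (the entity dicts and the `heavy_db_server` label are illustrative, not Kubernetes API objects):

```python
def select(entities, **labels):
    """Return the entities whose labels include all the given
    name-value pairs -- the essence of a Kubernetes label selector."""
    return [e for e in entities
            if all(e.get("labels", {}).get(k) == v
                   for k, v in labels.items())]

nodes = [
    {"name": "node-1", "labels": {"heavy_db_server": "true"}},
    {"name": "node-2", "labels": {}},
]
print([n["name"] for n in select(nodes, heavy_db_server="true")])  # ['node-1']
```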


I tried to follow Kubernetes’ Vagrant stand-up, but got frustrated with its slow pace and clunkiness, which I characterized uncharitably as ‘soviet’. Amazingly, a Twitter-whinge about this later and I got a message from Google’s Lead Engineer on Kubernetes saying they were ‘working on it’. Great, but this moved from great to awesome when I was presented with this, a Docker-only way to get Kubernetes running quickly.

NOTE: this code is not presented as stable, so if this walkthrough doesn’t work for you, check the central Kubernetes repo for the latest.

Step One: Start etcd

Kubernetes uses etcd to distribute information across the cluster, so as a core component we start that first:

docker run \
    --net=host \
    -d kubernetes/etcd: \
    /usr/local/bin/etcd \
        --addr=$(hostname -i):4001 \
        --bind-addr=

Step Two: Start the Master

docker run \
    --net=host \
    -d \
    -v /var/run/docker.sock:/var/run/docker.sock \
    /hyperkube kubelet \
        --api_servers=http://localhost:8080 \
        --v=2 \
        --address= \
        --enable_server \
        --hostname_override=

Kubernetes has a simple Master-Minion architecture (for now – I understand this may be changing). The master handles the APIs for running the pods on the Kubernetes nodes, the scheduler (which determines what should run where based on capacity and constraints), and the replication controller, which ensures the right number of nodes have replicated pods.

If you run it immediately, your docker ps should now look something like this:

imiell@rothko:~$ docker ps
CONTAINER ID IMAGE                              COMMAND              CREATED        STATUS        PORTS NAMES
98b25161f27f "/hyperkube kubelet  2 seconds ago  Up 1 seconds        drunk_rosalind 
57a0e18fce17 kubernetes/etcd:            "/usr/local/bin/etcd 31 seconds ago Up 29 seconds       compassionate_sinoussi

One thing to note here is that this master is run from a hyperkube kubelet call, which in turn brings up the master’s containers as a pod. That’s a bit of a mouthful, so let’s break it down.

Hyperkube, as we noted above, is an all-in-one binary for Kubernetes. It will go off and enable the services for the Kubernetes master in a pod. We’ll see what these are below.

Now we have a running Kubernetes cluster, you can manage it from outside using the API by downloading the kubectl binary:

imiell@rothko:~$ wget
imiell@rothko:~$ chmod +x kubectl
imiell@rothko:~$ ./kubectl version
Client Version: version.Info{Major:"0", Minor:"14", GitVersion:"v0.14.1", GitCommit:"77775a61b8e908acf6a0b08671ec1c53a3bc7fd2", GitTreeState:"clean"}
Server Version: version.Info{Major:"0", Minor:"14+", GitVersion:"v0.14.1-dirty", GitCommit:"77775a61b8e908acf6a0b08671ec1c53a3bc7fd2", GitTreeState:"dirty"}

Let’s see how many minions we’ve got using the get sub-command:

imiell@rothko:~$ ./kubectl get minions

We have one, running on localhost. Note the LABELS column. Think how we could label this minion: we could mark it as “heavy_db_server=true” if it were running on the tin needed to run our db beastie, and direct db server pods there only.

What about these pods then?

imiell@rothko:~$ ./kubectl get pods
POD       IP CONTAINER(S)       IMAGE(S)                                   HOST                LABELS STATUS  CREATED
nginx-127    controller-manager  Running 16 minutes

This ‘nginx-127’ pod has got three containers from the same Docker image running the master services: the controller-manager, the apiserver, and the scheduler.

Now that we’ve waited a bit, we should be able to see the containers using a normal docker ps:

imiell@rothko:~$ docker ps -a
CONTAINER ID IMAGE                                      COMMAND              CREATED        STATUS        PORTS NAMES
25c781d7bb93 kubernetes/etcd:                    "/usr/local/bin/etcd 4 minutes ago  Up 4 minutes        suspicious_newton 
8922d0ba9a75 "/hyperkube controll 40 seconds ago Up 39 seconds       k8s_controller-manager.bca40ef7_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_c40c7396 
943498867bd6 "/hyperkube schedule 40 seconds ago Up 40 seconds       k8s_scheduler.b41bfb6e_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_871c00e2 
354039df992d "/hyperkube apiserve 41 seconds ago Up 40 seconds       k8s_apiserver.c24716ae_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_4b062320 
033edd18ff9c kubernetes/pause:latest                    "/pause"             41 seconds ago Up 41 seconds       k8s_POD.7c16d80d_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_da72f541 
beddf250f4da "/hyperkube kubelet  43 seconds ago Up 42 seconds       kickass_ardinghelli

Step Three: Run the Service Proxy

The Kubernetes service proxy allows you to expose pods as services from a consistent address. We’ll see this in action later.

docker run \
    -d \
    --net=host \
    --privileged \
    /hyperkube proxy \
        --master=

This is run separately as it requires privileged mode to manipulate iptables on your host.

A docker ps will show the proxy as being up:

imiell@rothko:~$ docker ps -a
CONTAINER ID IMAGE                                      COMMAND              CREATED        STATUS        PORTS NAMES
2c8a4efe0e01 "/hyperkube proxy -- 2 seconds ago  Up 1 seconds        loving_lumiere 
8922d0ba9a75 "/hyperkube controll 15 minutes ago Up 15 minutes       k8s_controller-manager.bca40ef7_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_c40c7396 
943498867bd6 "/hyperkube schedule 15 minutes ago Up 15 minutes       k8s_scheduler.b41bfb6e_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_871c00e2 
354039df992d "/hyperkube apiserve 16 minutes ago Up 15 minutes       k8s_apiserver.c24716ae_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_4b062320 
033edd18ff9c kubernetes/pause:latest                    "/pause"             16 minutes ago Up 15 minutes       k8s_POD.7c16d80d_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_da72f541 
beddf250f4da "/hyperkube kubelet  16 minutes ago Up 16 minutes       kickass_ardinghelli

Step Four: Run an Application

Now we have our Kubernetes cluster set up locally, let’s run an application with it.

imiell@rothko:~$ ./kubectl -s http://localhost:8080 run-container todopod --image=dockerinpractice/todo --port=8000
todopod todopod dockerinpractice/todo run-container=todopod 1

This creates a pod from a single image (a simple todo application).

imiell@rothko:~$ kubectl get pods
POD IP        CONTAINER(S)       IMAGE(S)                                   HOST        LABELS                 STATUS  CREATED
nginx-127     controller-manager                   Running About a minute
todopod-c8n0r todopod            dockerinpractice/todo                       run-container=todopod Pending About a minute

Lots of interesting stuff here – the HOST for our todopod (which has been given a unique name as a suffix) has not been set yet, because the provisioning is still Pending (it’s downloading the image from the Docker Hub).

Eventually you will see it’s running:

imiell@rothko:~$ kubectl get pods
POD           IP          CONTAINER(S)       IMAGE(S)                                   HOST                LABELS                STATUS  CREATED
nginx-127                 controller-manager                 Running About a minute
todopod-c8n0r todopod            dockerinpractice/todo             run-container=todopod Running 5 seconds

and it has an IP address. A replication controller is also set up for it, to ensure it gets replicated:

imiell@rothko:~$ ./kubectl get rc
CONTROLLER   CONTAINER(S)   IMAGE(S)                SELECTOR                REPLICAS
todopod      todopod        dockerinpractice/todo   run-container=todopod   1

We can address this service directly using the pod ip:

imiell@rothko:~$ wget -qO- | head -1

Step Five: Set up a Service

But this is not enough – we want to expose these pods as a service to port 80 somewhere:

imiell@rothko:~$ ./kubectl expose rc todopod --target-port=8000 --port=80
NAME      LABELS    SELECTOR                IP          PORT
todopod       run-container=todopod   80

So now it’s available on

imiell@rothko:~$ ./kubectl get service
NAME          LABELS                                  SELECTOR              IP        PORT
kubernetes    component=apiserver,provider=kubernetes         443
kubernetes-ro component=apiserver,provider=kubernetes         80
todopod                                         run-container=todopod 80

and we’ve successfully mapped port 8000 on the pod to port 80.

Let’s make things interesting by killing off the todo container:

imiell@rothko:~$ docker ps | grep dockerinpractice/todo
3724233c6637 dockerinpractice/todo:latest "npm start" 13 minutes ago Up 13 minutes k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_da1467a2
imiell@rothko:~$ docker kill 3724233c6637

and then after a moment (to be sure, wait 20 seconds), call it again:

imiell@rothko:~$ wget -qO- | head -1

The service is still there even though the container isn’t! The replication controller picked up that the container died, and restored service for us:

imiell@rothko:~$ docker ps -a | grep dockerinpractice/todo
b80728e90d3f dockerinpractice/todo:latest "npm start" About a minute ago Up About a minute k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_00316aec 
3724233c6637 dockerinpractice/todo:latest "npm start" 15 minutes ago Exited (137) About a minute ago k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_da1467a2

Step Six: Make the Service Resilient

Management’s angry that the service was down momentarily. We’ve figured out this is because the container died (and the service was automatically recovered) and want to take steps to prevent a recurrence. So we decide to resize the todopod:

imiell@rothko:~$ ./kubectl resize rc todopod --replicas=2

and there are now two pods running todo containers:

imiell@rothko:~$ kubectl get pods
POD           IP          CONTAINER(S)       IMAGE(S)                                   HOST                LABELS                STATUS  CREATED
nginx-127                 controller-manager                 Running 28 minutes
todopod-c8n0r todopod            dockerinpractice/todo             run-container=todopod Running 27 minutes
todopod-pmpmt todopod dockerinpractice/todo run-container=todopod Running 3 minutes

and here are the two containers:

imiell@rothko:~$ docker ps | grep dockerinpractice/todo
217feb6f25e8 dockerinpractice/todo:latest "npm start" 16 minutes ago Up 16 minutes k8s_todopod.6d3006f8_todopod-pmpmt_default_8e645492-dc50-11e4-be97-d850e6c2a11c_480f79b7 
b80728e90d3f dockerinpractice/todo:latest "npm start" 26 minutes ago Up 26 minutes k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_00316aec

It’s not just the containers that are resilient – try running:

./kubectl delete pod

and see what happens!

It’s Not Magic

Management now thinks that the service is bullet-proof and perfect – but they’re wrong!

The service is still exposed to failure: if the machine that Kubernetes is running on dies, the service goes down.

Perhaps more importantly, they don’t understand that the todo app is per browser session only, so their todos will not be retained across sessions. Kubernetes does not magically make applications scalable; some kind of persistent storage and authentication method is required in the application to make this work as they want.


This only scratches the surface of Kubernetes’ power. We’ve not looked at multi-container pods and some of the patterns that can be used there, or using labels, for example.

Kubernetes is changing fast, and is being incorporated into other products (such as OpenShift), so it’s worth getting to understand the concepts underlying it. Hyperkube’s a great way to do that fast.
