Make Your Own Bespoke Docker Image

Recently, a few ideas around Docker converged in my mind and led me to make my own distribution, and to enable others to do the same. Hear me out.

  • Docker is a software packaging tool (a point hammered home to me in a talk by Nic Ferrier).
  • Docker separates build from deployment.
  • Docker allows for stateless builds (and ShutIt on top of that allows for complex stateless builds).
  • Linux distributions are essentially packaging systems.
  • It’s really annoying when you get “distribution clash” between Docker builds, eg when someone writes a Dockerfile for Debian and then you want to integrate it with one written for CentOS. This happens.
  • Docker has a layered filesystem, so bloat only matters if you don’t re-use layers.
  • Linux from Scratch exists (and rocks).

Putting all the above together I realised that with the advent of Docker there was less of a need to bother with traditional packaging systems at all. Why not build from source?

“Because it’s bloody hard, bloody time-consuming, and bloody painful” is the standard answer. But since you can reproduce builds at will deterministically and deploy later elsewhere, much of that difficulty can be offloaded to someone else. Add in Docker’s layered filesystem, and these other issues become easier:

  • Bug/build issue reproduction (complete audited steps).
  • More efficient space usage (build in the layers/applications you need and no more, share common layers down the dependency hierarchy).
  • Tweaking your distro from a given layer is relatively easy (just add a ShutIt module).
  • Collaboration on improvements is much more effective (we’re all talking about the same thing).
  • Auditing builds (for escrow purposes, or even security purposes).
  • A common reference point for applications built in Docker.
  • Dockerfiles/ShutIt can help produce self-documenting dependencies and build steps for porting elsewhere if desired.
  • You can have more bleeding-edge versions of software if you want (or configure and try older ones if preferred).
  • Automated testing is trivial (just run the build and use ShutIt’s test phase – see the sketch after this list).
  • Patching is trivial (docker pull, or just hack, commit, and tag the image).
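
To make the test-phase point concrete, here is a minimal sketch of what such a module might look like. Everything in it is hypothetical – the module name, tarball URL and commands are invented, and I’m assuming the test phase hook is a test method mirroring build; the real ShutIt-Distro modules are the reference:

# Minimal sketch of a ShutIt-Distro-style module. All names, URLs and
# commands here are illustrative only.
from shutit_module import ShutItModule

class example_tool(ShutItModule):

    def is_installed(self, shutit):
        # Always rebuild in this sketch; real modules can check for artifacts.
        return False

    def build(self, shutit):
        # Each send runs a command in the build's bash session.
        shutit.send('mkdir -p /tmp/build && cd /tmp/build')
        shutit.send('curl -LO http://example.com/example-tool-1.0.tar.gz')
        shutit.send('tar -xzf example-tool-1.0.tar.gz && cd example-tool-1.0')
        shutit.send('./configure --prefix=/usr && make && make install')
        return True

    def test(self, shutit):
        # The test phase runs after the build to sanity-check the image.
        shutit.send('example-tool --version')
        return True

def module():
    return example_tool(
        'shutit.tk.example_tool.example_tool', 1000.00,
        depends=['shutit.tk.setup'])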

The ShutIt Docker Distro

So I built (and present) the ShutIt Distro. Standing on the giant shoulders of Linux From Scratch (and Beyond Linux From Scratch) and the broad shoulders of Docker, it allows you to create a bespoke distribution targeted at the application you want, mixing and matching applications with a few clicks, with dependencies managed and everything built from source.

There are a lot of images already available here, eg:

  • OSQuery
  • rpm
  • node
  • emscripten

all built from source and with a full bash history available for reference.

Here is the sequence of commands to get from debian:jessie to the sd_base filesystem artifact lfs.tar.xz (see below; the resulting base image is available via docker pull imiell/sd_base). Empty lines indicate just hitting return to accept defaults.

Here is the sequence of commands to get from imiell/sd_base to a complete working OSQuery image.

And here’s an example of how to run the OSQuery image:

$ docker run -t -i imiell/sd_osquery
bash-4.3# osqueryi
[...]
osquery> select name, path from processes;                  
+----------+-------------------------+
| name     | path                    |
+----------+-------------------------+
| bash     |                         |
| osqueryi | /usr/local/bin/osqueryi |
+----------+-------------------------+
osquery>

Dependencies

Since the environment is so predictable and restricted, certain aspects of the dependency management process become simpler.

ShutIt allows you to produce dependency graphs of your modules with the sc (show config) sub-command. Here is a gist example.
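
If you want to render such a graph yourself, here is a minimal Python sketch, assuming the sub-command’s output is a graphviz digraph saved to deps.dot (the filename is hypothetical) and that the graphviz Python package and binaries are installed:

# Render a ShutIt dependency graph saved as graphviz "digraph" text.
import graphviz

with open('deps.dot') as f:
    dot_source = f.read()

# Writes deps.png next to the source file.
graphviz.Source(dot_source).render('deps', format='png', cleanup=True)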

Here’s a (big – get ready with the zoom!) image of all the dependencies for the whole of the ShutIt-Distro module “universe” as it stands today:

[Image: dependency graph of the entire ShutIt-Distro module universe]

and a much smaller one just for the modules needed for the OSQuery module:

[Image: OSQuery module dependency graph]

and NodeJS’s much simpler one (bear in mind that the base image already has a base toolchain et al in it):

[Image: NodeJS module dependency graph]

Build Your Own Image From Source

Want to cherry-pick your own image, with tools and editors to your fancy? Instructions are here, and a raw cut-and-paste for Ubuntu is here. You’ll be presented with a sparse web interface like this:

[Image: the ShutIt Distro web interface]

On the left are the modules in strict dependency order, with the more complex modules at the bottom. If we take OSQuery as an example, clicking on it will embolden all the modules it depends on (this can take a few seconds if there are lots of deps):

[Image: the web interface with OSQuery’s dependencies emboldened]

You can add any other modules you like to your image. So if you’re an emacs user, you can tick that box and it’ll be added too. This way you can configure your own personalised image. Then scroll back up and click on “Begin Build” to kick off the build. The commands issued will be shown in the third pane, the log in the fourth.

When the build is complete, the image can be downloaded as a tar file by clicking on “Export Container”, which you can then load in like so:

cat downloaded.tar | docker load

which will import the image ready to run. Using the web interface is not required, but it’s obviously easier than using the command line and configs, and is a good intro to the ShutIt framework.

How the Base Toolchain Image is Produced

Initially I tried to build the entire toolchain within debian:jessie, but got myself into hot water pretty quickly. Turns out building a toolchain from scratch is quite hard. Thankfully, Automated Linux From Scratch exists, so I just used that.

That delivers an artifact called lfs.tar.xz to the filesystem which can be picked up to make a base image with this Dockerfile:

FROM scratch
ADD lfs.tar.xz /
CMD ["/bin/bash"]

The base image is then delivered to the Docker Hub here. This forms the basis for the ShutIt Distro. Here is a gist for the commands to do this yourself. Note: it takes a while!

[Image: base image build flow]

Lessons Learned

  • Sourceforge is in disaster-recovery mode a lot. In fact, primary sites (eg gnu.org, down for a couple of hours the other day) are unavailable surprisingly often.
  • You’re a complete idiot if you deviate from Linux From Scratch one iota. I did this a few times and was beaten back into line with brutal severity each time.
  • NodeJS and Ruby have surprisingly few dependencies needed to build from source.
  • apt-file is an incredibly handy tool.

Help Wanted

If you’ve read this far and are still interested, do get in touch.

There’s plenty to grep TODO :)

Taming Slaves with Docker and ShutIt

At our company we had a problem.

We used Jenkins for CI, but had no clear way of setting up machines in a reliable, portable and reproducible way. We had centralised VMs, but provisioning was slow and complex.

So we bought a big beefy development server. Everyone was happy, for it was Big and the developers did Rejoice. This server was and is maintained by a tightly-controlled Puppet script.

As Jenkins use and CI takeup increased over the years we ran out of CPU and memory (thanks, Java) and explored a few options: AWS, a VMWare Virtual Private Cloud, even buying VM server boxes and handing them to teams. All had their drawbacks.

The centralised dev server approach meant a single constraint on delivery. Requirements among the teams diverged, and the (centralised and harassed) IT infrastructure cost centre was reluctant to make changes for fear of helping one team and breaking another (“Why can’t we just upgrade python pleeease? What could go wrong!?”).

The system we build is monolithic and has complex dependencies, so when Docker came along we saw the opportunity to rid ourselves of this. How complex?

ShutIt has a feature (“shutit depgraph”) whereby you can output the dependency graph. Here’s the graph of the Jenkins slave server image, suitably anonymised:

[Image: anonymised dependency graph of the Jenkins slave image]

The dependencies go from top to bottom. At the top is the slave module, which adds a few bits onto its dependencies (the publicly available memcache and docker modules for example – yes, we run docker-within-docker to make some build processes even simpler), and those modules in turn depend on others, many of which we’ve had to tweak or maintain for our proprietary needs. Each module can be configured for specific versions that are on different real-world deployments, and used to test upgrades/fixes and the like. Doing this with standard CM tools was far more complex and difficult to maintain.
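
As an illustration of the shape of this (a hypothetical sketch, not our actual module – the module ids, commands and run order are invented), a slave module is just another ShutIt module declaring its dependencies:

# Illustrative sketch of a Jenkins slave module that pulls in other
# ShutIt modules via depends. All module ids and commands are hypothetical.
from shutit_module import ShutItModule

class jenkins_slave(ShutItModule):

    def is_installed(self, shutit):
        return False

    def build(self, shutit):
        # The heavy lifting happens in the dependencies; the slave module
        # just adds a few bits on top.
        shutit.send('useradd -m jenkins')
        shutit.send('mkdir -p /home/jenkins/slave')
        return True

def module():
    return jenkins_slave(
        'our.company.jenkins_slave.jenkins_slave', 2000.00,
        # Stand-ins for the public memcache and docker modules, plus the
        # proprietary modules we tweak and maintain in-house.
        depends=['shutit.tk.setup',
                 'shutit.tk.memcache.memcache',
                 'shutit.tk.docker.docker'])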

The nature of ShutIt means that it’s easy for a developer or team to add a few tweaks to this image, and regression testing can happen in the background off git hooks (or another pet project, cheapci), as all you need to know is bash, rather than having to learn and understand some declarative framework that’s new to you.

The nature of Docker makes these slaves easy to deploy (docker pull) and portable (run on your Core i5 laptop? A provisioned VM? A Digital Ocean instance? AWS? No problem).

In the next post I plan to explain why ShutIt and Docker are a perfect configuration management fit for developers looking to manage complex build needs.

Docker – One Year On

Introduction

It’s just over a year since I first heard of Docker, and I’ve been using it avidly one way or another ever since. A year seems like a good time to look back over what it’s done for us at $corp compared to what I’d hoped.

Docker is such a flexible technology it’s fascinating to see how different people view it and run with it. Some see it as “just a packaging tool” (as if that were not a significant thing in itself), others as a means to delivering on the microservices dream, still others as a way of saving CPU cycles at scale.

Our use case was very much about improving predictability in a sprawling decades-old codebase for the development, QA, support and testing cycles. As far as we’re concerned, it’s definitely delivered, but we’ve had to go off-piste a bit.

Main Achievements

The tl;dr is that using Docker has saved us significant amounts of time, and in addition enabled us to do things considered previously impractical. There are too many to list (available on request!) but here are some highlights:

  • Our biggest dev team (~50 devs) has switched from a “shared dev server” development paradigm to dev’ing on local machines from Docker images built daily. Others are now following.
  • QA processes have improved significantly – for example, automated endpoint testing was a trivial addition to the daily-built dev env, complete with emails to the testers including logs etc.
  • We have reduced the reliance and load on a centralised IT team always torn between stability and responsiveness (“why can’t we just upgrade python on the shared servers!?”).
  • CI is now using a centralised reference image as a Jenkins slave, again reducing dependence on IT for changes. Docker’s image layering provides a neat saving in on-premise disk space also.
  • Multi-node regression testing environments can be easily deployed on an engineer’s laptop (we use Skydock for this).
  • Escrow build auditing processes which were previously onerous now come “for free”, as we phoenix-deploy dev environments.

Central to all these has been using ShutIt to enable us to encapsulate our complex build and dev needs into a single point of reference. Uptake of ShutIt has been far better than with traditional CM tools like Puppet or Ansible, mainly due to speed-to-market rather than the maturity of the tool. Relatively little training is needed to slot your changes into the ShutIt builds and regression-test them; everyone can pitch in quickly.

Lessons Learned

Getting traction for a new technology in a software development house like ours (500+ engineers in various locations) is non-trivial. Here are some lessons learned:

  • Find a few motivated people and give them space to make it work
  • Focus them on a single well-defined problem (in our case: canned dev environments)
  • This problem should have a clear benefit to the business

In terms of the Docker systems, these were some of the things we learned as we went:

  • Dockerfiles did not scale with our image needs. This may be because we’re “doing it wrong” (ie our software architecture is too monolithic), but Dockerfiles were plain impractical for us, so we effectively dropped them in favour of ShutIt.
  • Pre-built vanilla base images were very useful in getting traction. Being able to hand people a useful and up-to-date image to start working on was seen as a win. A monthly cadence for the base image worked for us, while dailies were essential for the development team.
  • Building on this known base has facilitated several projects that otherwise may have taken longer to get going.
  • The Docker community is very strong and very useful – without Jérôme Petazzoni’s blogs and help on the Docker mailing lists we’d have been stymied a few times.
  • Running your containers within a VM can help reduce nerves elsewhere in your business about security beyond isolation, or managing shared compute resource (until that space matures in the Docker ecosystem).
  • Managing your own registry so that people can play with it is a good idea if you want privacy, but make sure you have a plan for when the disk space usage blows up!

Future

As Docker usage grows throughout our organisation there are several things we have our eyes on over the horizon.

  • CoreOS looks like a very promising technology for managing Docker resources, and we’re playing with this very informally
  • Enterprise support is something we’ve explored, but since we’ve not gone near live with this yet we’ve not pursued it. I’d be interested to know what people’s experience with it is though

Over to You

Are your experiences with getting Docker out there similar, or completely different? I’d be delighted to hear from you (@ianmiell / ian.miell@gmail.com)

Using ShutIt to Build Your Own Taiga Server

Recently someone brought this issue on GitHub to my attention:

https://github.com/taigaio/taiga-scripts/issues/3

It’s a common problem: an awesome-looking project, but how exactly do I install it to play with it?

This is a perfect use case for ShutIt.

I built it by setting up a skeleton directory, which creates a standalone ShutIt module:

./shutit skeleton /path/to/shutit/library/taigaio taigaio shutit.tk

which gave me the boilerplate to produce the build section here:

https://github.com/ianmiell/shutit/blob/master/library/taigaio/taigaio.py#L10

You can see the ShutIt API is fairly intuitive – you call commands like login, send, multisend, run_script and logout on the shutit object to perform actions within the bash session that’s set up for you.
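
For a flavour of what that looks like, here is a condensed, hypothetical sketch of a build section using those calls. The prompts, packages and script are invented, and the exact signatures may differ slightly – the linked taigaio.py is the reference:

# Condensed sketch of a ShutIt build section. All commands, prompts and
# paths here are illustrative only.
def build(self, shutit):
    # login/logout bracket a shell session – here, becoming root.
    shutit.login(command='su -')
    shutit.send('apt-get update')
    # multisend answers interactive prompts: expected output -> reply.
    shutit.multisend('apt-get install postgresql',
                     {'Do you want to continue': 'Y'})
    # run_script ships a multi-line script into the session and runs it.
    shutit.run_script('''#!/bin/sh
echo setup complete''')
    shutit.logout()
    return True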

Once written, you can test with:

$ cd /path/to/shutit/library/taigaio/bin
$ ./build.sh

Then, to run it:

$ docker run -i -t -p 127.0.0.1:8000:8000 -p 127.0.0.1:8001:8001 taigaio bash -c '/root/start_postgres.sh && /root/start_taiga.sh && echo READY! && sleep 3000d'

Navigate to http://localhost:8000 and log in as admin/123123.

If you just want to get going:

$ docker run -i -t -p 127.0.0.1:8000:8000 -p 127.0.0.1:8001:8001 imiell/taigaio bash -c '/root/start_postgres.sh && /root/start_taiga.sh && echo READY! && sleep 3000d'

Wait until you see “READY!”, then navigate to http://localhost:8000 and log in as admin/123123.

Voila:

[Image: the running Taiga server]

Using ShutIt and Docker to play with AWS (Part Two)

Previously I showed a very basic way of using ShutIt to connect to AWS.

I’ve taken this one step further so now there is a template for automating:

  • killing any AWS instance you have running
  • provisioning a new instance
  • logging onto the new instance
  • installing docker on it
  • pulling and running your image

All you will need is a .pem file and to know the security group you want to use.

I’m assuming you already have an AWS account on the free tier and have nothing running on it, or one throwaway instance we’re going to kill.

Here are the steps to get going, from an ubuntu:14.04 basic install:

1) Basic installs

apt-get update
apt-get install -y docker.io python-pip
git clone https://github.com/ianmiell/shutit.git
cd shutit
pip install -r requirements.txt 
mkdir -p ~/.shutit && touch ~/.shutit/config && chmod 600 ~/.shutit/config 
vi ~/.shutit/config

2) Edit config for AWS

Then edit the file as follows, changing the bits in CAPS as necessary:

[shutit.tk.aws.aws]
access_key_id:YOUR_AWS_ID
secret_access_key:YOUR_AWS_KEY
# region, eg 
# region:eu-west-1
region:YOUR_AWS_REGION
[shutit.tk.aws_example.aws_example]
# Your pem filename, eg if your pem file is called: yourpemname.pem
# pem_name:yourpemname 
pem_name:YOUR_PEM_NAME

3) Place your .pem file in the context’s pem directory:

cp /path/to/yourpemname.pem examples/aws_example/context/yourpemname.pem

4) Run it

cd examples/aws_example_provision
../../shutit build --shutit_module_path ../../library:.. --interactive 0

And wait.

Once you’ve seen that that works, you can now try changing it to automate the setup of an app on AWS.

You can start by uncommenting the lines here in aws_example_provision.py:

 shutit.send('sudo yum install -y docker')
 shutit.send('sudo service docker start')
 # Example of what you might want to do. Note that this won't work out
 # of the box as the security policy of the VM needs to allow access to
 # the relevant port.
 #shutit.send('sudo docker pull training/webapp')
 #shutit.send('sudo docker run -d --net=host training/webapp')
 # Exit back to the "real container"
 shutit.logout()

and then running the build again.

Using ShutIt and Docker to play with AWS (Part One)

I’m using ShutIt to play with AWS at the moment.

I can leverage the core libraries to easily build on top of them with my secret data and store the result in my own source control system, and I’ll show how you can do this here.

Firstly, there’s a core aws module, part of the ShutIt libraries, that takes care of installing the aws command line tool:

Shutit AWS module

ShutIt AWS build script

It contains the ability to configure the AWS access token, but obviously we don’t want to store that in the core library.

The solution to this is to create my own module which effectively inherits from that generic solution, adding my pems and configuring for access.

/space/git/shutit/shutit skeleton /my/git/path/ianmiellaws ianmiellaws my.domain
cd /my/git/path/ianmiellaws
mv /path/to/pems context/
cd /my/git/path/ianmiellaws/
vi configs/
vi ianmiellaws.py
./test.sh

Here’s what the module looked like after the vi edits above:

imiell@lp01728:/space/git/work/notes/aws/ianmiellaws$ cat ianmiellaws.py
from shutit_module import ShutItModule

class ianmiellaws(ShutItModule):

    def is_installed(self,shutit):        
        return False

    def build(self,shutit):
        shutit.send_host_file('t2.pem','context/pems/t2.pem')     
        return True


def module():
    return ianmiellaws(
        'my.domain.ianmiellaws.ianmiellaws', 1159697827.00,
        description='',
        maintainer='',
        depends=['shutit.tk.setup','shutit.tk.aws.aws']
    )

In the config file edit I put (replacing the stuff in caps with my details):

[my.domain.aws.aws]
access_key_id:MYKEYHERE
secret_access_key:MYSECRETACCESSKEYHERE
region:MYREGION
output:

Then build it:

/path/to/shutit build -m /path/to/shutit/library

Then run it:

docker run -t -i ianmiellaws /bin/bash

and you should be able to access your AWS services from wherever you have the container.

In the next posts I’ll show how to build on top of this to write a module to automatically provision an AWS instance and run a docker service on it.