Ten Things I Wish I’d Known About Chef

1) Understand How Chef Works

This sounds obvious, but is important to call out.

Chef’s structure can be bewildering to newcomers. There are many concepts to get to grips with all at once, and most of them may be new to you: server, chef-client, knife, chefdk, recipe, role, environment, run list, node, cookbook… the list goes on and on.

I don’t have great advice here, but I would avoid doing too many theoretical tutorials, and just focus on getting an environment that you can experiment on to embed the concepts in your mind. I automated an environment in Vagrant for this purpose for myself here. Maybe you’ve got a test env at work you can use. Either way, unless you’re particularly gifted you’re not going to get conversant with these things overnight.

Then keep the chef docs close to hand, and occasionally browse them to pick up things you might need to know about.

2) A Powerful Debugger in Two Lines

This is less well known than it should be, and has saved me a ton of time. Adding these two lines to your recipes will give you a breakpoint when you run chef-client.

require 'pry'
binding.pry

You’re presented with a ruby shell you can interact with mid-run. Here’s a typical session:

root@chefnode1:~# chef-client
Starting Chef Client, version 12.16.42
resolving cookbooks for run list: ["chef-repo"]
Synchronizing Cookbooks:
 - chef-repo (0.1.0)
Installing Cookbook Gems:
Compiling Cookbooks...

Frame number: 0/22

From: /opt/chef/embedded/lib/ruby/gems/2.3.0/gems/chef-12.16.42/lib/chef/cookbook_version.rb @ line 234 Chef::CookbookVersion#load_recipe:

220: def load_recipe(recipe_name, run_context)
 221: unless recipe_filenames_by_name.has_key?(recipe_name)
 222: raise Chef::Exceptions::RecipeNotFound, "could not find recipe #{recipe_name} for cookbook #{name}"
 223: end
 225: Chef::Log.debug("Found recipe #{recipe_name} in cookbook #{name}")
 226: recipe = Chef::Recipe.new(name, recipe_name, run_context)
 227: recipe_filename = recipe_filenames_by_name[recipe_name]
 229: unless recipe_filename
 230: raise Chef::Exceptions::RecipeNotFound, "could not find #{recipe_name} files for cookbook #{name}"
 231: end
 233: recipe.from_file(recipe_filename)
 => 234: recipe
 235: end
[1] pry(#<Chef::CookbookVersion>)> 

The last line above is a prompt from which you can inspect the local state, similar to other breakpoint debuggers.

CTRL-D continues the run.

See here for more.

3) Run Locally-Modified Cookbooks

I spent a long time being frustrated by my inability to re-run chef-client with a slightly modified set of cookbooks in the local cache (in /var/chef/cache...).

Then the chef client we were using was upgraded, and the --skip-cookbook-sync option was available. This did exactly what I wanted: use the cache, but run the recipes in exactly the same way, run list and all.

The -z flag can do something similar, but you need to specify the run list by hand. --skip-cookbook-sync ‘just works’ if you want to keep everything exactly the same and add a log line or something.

If you like this, you might like one of my books:
Learn Bash the Hard Way

Learn Git the Hard Way
Learn Terraform the Hard Way

Buy in a bundle here

4) Learn Ruby

Ruby is the language Chef uses, so learning it is very useful.

I used Learn Ruby the Hard Way to quickly get a feel for the language.

5) Libraries

It isn’t immediately obvious how you avoid repeating the same code in recipe after recipe.

Here’s a sample of a ‘ruby library’ embedded in a Chef cookbook. It works out the roles of the nodes.

One thing to note is that because you are outside the Chef recipe, to access the standard Chef functions, you need to explicitly refer to their namespace. For example, this line calls the standard search:

Chef::Search::Query.new.search(:node, "role:rolename")

The library is used eg here. The library object is created:

server_info = OpenShiftHelper::NodeHelper.new(node)

and then the object is referenced as items are needed, eg:

first_master = server_info.first_master
master_servers = server_info.master_servers

Note that the node object is passed in, so it’s visible within the library.
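
As a rough illustration of the shape such a library takes, here is a minimal sketch. The module name ExampleHelper, the 'servers' attribute and the 'role'/'fqdn' keys are all hypothetical, and a plain Hash stands in for the node object so that it runs without Chef. The real library uses Chef search rather than a hard-coded list.

```ruby
# Hypothetical library, e.g. saved under libraries/ in a cookbook.
# ExampleHelper, 'servers', 'role' and 'fqdn' are illustrative names only.
module ExampleHelper
  class NodeHelper
    # The recipe passes its node object in so the library can see it.
    # A plain Hash stands in for the node here.
    def initialize(node)
      @node = node
    end

    # Every server whose (assumed) 'role' key is 'master'.
    def master_servers
      @node.fetch('servers', []).select { |server| server['role'] == 'master' }
    end

    # The first master, or nil if there are none.
    def first_master
      master_servers.first
    end
  end
end

# Stand-in for the node object a recipe would pass in.
example_node = {
  'servers' => [
    { 'fqdn' => 'master1.example.com', 'role' => 'master' },
    { 'fqdn' => 'node1.example.com',   'role' => 'node' },
    { 'fqdn' => 'master2.example.com', 'role' => 'master' }
  ]
}

server_info = ExampleHelper::NodeHelper.new(example_node)
puts server_info.first_master['fqdn']
puts server_info.master_servers.length
```

Keeping the lookups behind methods means each recipe only needs the one-line constructor call and can then ask for whatever it needs.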

6) Logging and .to_s

If you want to ‘quickly’ log something, it’s easy:

log 'my log message' do
  level :debug
end

and then run at debug level with:

chef-client -l debug

To turn a value into a string, try the .to_s function, eg:

log 'This is a string: ' + node.to_s do
  level :debug
end
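
The reason .to_s matters is plain Ruby rather than anything Chef-specific, so it can be tried outside a recipe. The sketch below uses two hypothetical helper methods (log_message and log_message_concat) and the number 42 as a stand-in for an attribute value.

```ruby
# Build a log message from any value; interpolation calls .to_s for you.
def log_message(value)
  "This is a string: #{value}"
end

# The explicit form: concatenation needs .to_s on non-string values.
def log_message_concat(value)
  'This is a string: ' + value.to_s
end

puts log_message(42)
puts log_message_concat(42)

# Without .to_s, String + Integer raises TypeError.
begin
  'This is a string: ' + 42
rescue TypeError
  puts 'TypeError: .to_s (or interpolation) is needed'
end
```

Interpolation is usually the less error-prone of the two, since it never raises on a non-string value.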

7) Search and Introspection Functions

The ‘search’ function in Chef is a very powerful tool that allows you to write code that switches based on queries to the Chef server.

Some examples are here, and look like this:

graphite_servers = search(:node, 'role:graphite-server')

Similarly, you can introspect the client’s node using its attributes and standard Ruby functions.

For example, to introspect a node’s run list to determine whether it has the webserver role assigned to it, you can run:
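
A minimal sketch of that kind of check follows. It is not necessarily the code the example repository uses: the run list here is a plain array standing in for node.run_list, and role_in_run_list? is a hypothetical helper, so that it runs without Chef. Inside a real recipe, Chef also provides node.role?('webserver') for this.

```ruby
# A stand-in for a node's run list so this runs without Chef.
# Inside a recipe the real object would be node.run_list.
EXAMPLE_RUN_LIST = ['role[base]', 'role[webserver]', 'recipe[ntp]'].freeze

# True if the run list contains role[<role_name>].
def role_in_run_list?(run_list, role_name)
  run_list.include?("role[#{role_name}]")
end

if role_in_run_list?(EXAMPLE_RUN_LIST, 'webserver')
  puts 'webserver role is assigned'
end
```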


This technique is also used in the example code mentioned above.

8) Attribute precedence and force_override

Attribute precedence becomes important pretty quickly.

Quite often I have had to refer to this section of the docs to remind myself of the order that attributes are set.

Also, force_override is something you should never have to use as it’s a filthy hack, but occasionally it can get you out of a spot. But it can’t override everything (see 10 below)!
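
To make the ordering concrete, here is a simplified model in plain Ruby. It only covers the six broad levels (Chef's real table also distinguishes attribute files, recipes, roles and environments within them), and the resolve method and the IP addresses are made up for illustration.

```ruby
# Simplified precedence levels, lowest to highest. Chef's real table is finer
# grained than this; the sketch only models the broad ordering.
PRECEDENCE = %i[default force_default normal override force_override automatic].freeze

# `settings` maps a level to the value set at that level (hypothetical input).
# The value from the highest level that sets one wins.
def resolve(settings)
  winner = PRECEDENCE.reverse.find { |level| settings.key?(level) }
  winner && settings[winner]
end

# An override beats a default.
puts resolve({ default: '10.0.0.1', override: '10.0.0.2' })

# Even force_override loses to an automatic (ohai) attribute.
puts resolve({ force_override: '192.168.33.10', automatic: '10.0.2.15' })
```

The second call is the situation described in item 10: the automatic value wins no matter what is set below it.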

9) Chef’s Two-Pass model

This can be the cause of great confusion. If the order of events in Chef seems counter-intuitive in a run, it’s likely that you’ve not understood the way Chef processes its code.

The best explanation of this I’ve found is here. For me, this is the key sentence:

This also means that any Ruby code in the file not explicitly delayed (ruby_block, lazy, not_if/only_if) is run when the file is run, during the compile phase.

Don’t feel you need to understand this from day one, just keep it in mind when you’re scratching your head about why things are happening in the wrong order, and come back to that page.
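
A toy model can help the ordering stick. The sketch below is not Chef's implementation; ToyRun and its methods are invented purely to show that plain Ruby runs while the recipe is being compiled, and queued resources only run afterwards.

```ruby
# Toy model of a Chef run: NOT Chef's implementation, just the ordering.
class ToyRun
  attr_reader :events

  def initialize
    @events = []
    @resources = []
  end

  # Declaring a resource only queues it during the compile phase.
  def resource(name, &action)
    @events << "compile: declared #{name}"
    @resources << [name, action]
  end

  # Plain Ruby in a recipe runs immediately, during the compile phase.
  def plain_ruby(description)
    @events << "compile: ran #{description}"
  end

  # The converge phase then executes the queued resources in order.
  def converge
    @resources.each do |name, action|
      action.call
      @events << "converge: executed #{name}"
    end
  end
end

run = ToyRun.new
run.resource('package[ntp]') {}
run.plain_ruby('a File.exist? check')
run.resource('service[ntp]') {}
run.converge
puts run.events
```

The File.exist? check is written between the two resources, but it runs before either of them is executed, which is exactly the kind of surprise the two-pass model produces.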

10) Ohai and IP Addresses

This one caused me quite a lot of grief. I needed to override the IP address that ohai (the tool that gathers information about each Chef node and places it in the node object) gets from the node.

It takes the IP address of the default route’s interface by default, which caused problems when using Vagrant. force_override (see 8 above) doesn’t work because it’s an automatic ohai variable.

I am not the only one with this problem, but I never found a ‘correct’ solution.

In the end I used this hack.

Find the ruby file that sets the ip and mac address. Depending on the version this may differ for you:


Then get the ip address and mac address of the interface you want to use (in this case the eth1 interface):

IPADDR=$(ip addr show eth1 | grep -w inet | awk '{print $2}' | sed 's/\(.*\).24/\1/')
MACADDRESS=$(ip addr show eth1 | grep -w link.ether | awk '{print $2}')

Finally, use sed (or gsed if you are on a mac) to hard-code the ruby file that gets the details to return the information you want:

sed -i "s/\(.*ipaddress \).*/\1 \"${IPADDR}\"/" $RUBYFILE
sed -i "s/\(.*macaddress \)m.*/\1 \"${MACADDRESS}\"/" $RUBYFILE
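
For anyone who would rather do the extraction step in Ruby than with grep, awk and sed, a rough equivalent is sketched below. The sample output, the addresses in it and the parse_ip_addr helper are made up for illustration; on a real node the text would come from running ip addr show eth1.

```ruby
# Made-up sample of `ip addr show eth1` output, for illustration only.
SAMPLE_OUTPUT = <<~OUT
  3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
      link/ether 08:00:27:aa:bb:cc brd ff:ff:ff:ff:ff:ff
      inet 192.168.33.10/24 brd 192.168.33.255 scope global eth1
      inet6 fe80::a00:27ff:feaa:bbcc/64 scope link
OUT

# Pull the IPv4 address (without the /prefix) and the MAC address out of the
# command output. Returns nil for a field that is not present.
def parse_ip_addr(output)
  {
    ipaddress:  output[%r{^\s*inet\s+(\d+(?:\.\d+){3})/}, 1],
    macaddress: output[%r{^\s*link/ether\s+(\h{2}(?::\h{2}){5})}, 1]
  }
end

details = parse_ip_addr(SAMPLE_OUTPUT)
puts details[:ipaddress]
puts details[:macaddress]
```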

5 thoughts on “Ten Things I Wish I’d Known About Chef”

  1. As the author of `--skip-cookbook-sync` I would highly encourage you and all your readers to not use that, particularly for this purpose. What you’re suggesting is actually sort of the “oh god that’s what I was afraid people would do when I wrote it” approach to developing and debugging cookbooks. I would prefer that new users did not even know it existed, because it is the proverbial footgun.

    It is a lot easier and safer to experiment with other means of running chef-client locally with a local set of cookbooks and not hack up cookbooks in /var/chef/cache.

    You can use chef-apply by dropping `#!/usr/bin/env chef-apply` at the top of a file and then `chmod +x` that file and run it. You won’t get templates or cookbook_files or the rest of the cookbook filesystem structure, but for a simple way to play with some resources it works fine.

    You can also run chef-zero locally with `chef-client -z -j dna.json`, where the contents of `dna.json` can be as simple as:

    { "run_list": [ "recipe[test]" ] }

    then create `cookbooks/test/recipes/default.rb` and a minimal `cookbooks/test/metadata.rb`
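
As a sketch of that minimal layout, the Ruby below writes the three files into a temporary directory. The write_minimal_repo helper, the one-line recipe body and the metadata contents are assumptions made for illustration; only the file names and the run list come from the description above.

```ruby
require 'fileutils'
require 'json'
require 'tmpdir'

# Writes the minimal layout described above into `root` and returns the
# paths it created. The recipe body and metadata contents are assumptions.
def write_minimal_repo(root)
  cookbook = File.join(root, 'cookbooks', 'test')
  FileUtils.mkdir_p(File.join(cookbook, 'recipes'))

  dna      = File.join(root, 'dna.json')
  recipe   = File.join(cookbook, 'recipes', 'default.rb')
  metadata = File.join(cookbook, 'metadata.rb')

  File.write(dna, JSON.pretty_generate({ 'run_list' => ['recipe[test]'] }))
  File.write(recipe, "log 'hello from a local run'\n")
  File.write(metadata, "name 'test'\nversion '0.1.0'\n")

  [dna, recipe, metadata]
end

Dir.mktmpdir do |root|
  write_minimal_repo(root).each { |path| puts path.sub("#{root}/", '') }
end
```

Running `chef-client -z -j dna.json` from a directory laid out like this is then the local run described above.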

    When you start wanting to play around with community cookbooks you can use test-kitchen as a wrapper around running berkshelf to pull those down and test against locally created recipes and spin them up in a virtual environment instead of running cookbooks directly against your local host. That sounds complicated but it is reasonably simple:

    ChefDK, Test Kitchen Driven NTP Cookbook

    This gist uses TK+Berkshelf to drive creating vagrant virts and converging a simple recipe to install and configure NTPd. This is a simple cookbook that has one recipe, one template (for ntp.conf) and one attribute file. It works on Ubuntu 12.04 and CentOS 6.4 (and derivatives) and the attribute file is used to support both distros.

    This should work on Mac (where I developed it) and any chef-supported Linux that you can get Vagrant onto (Ubuntu/CentOS).

    Because I use ChefDK and Test Kitchen, I can largely ignore setting up Vagrant and Berkshelf and can get right to work on writing recipe code.

    NOTE: Modern (7/6/2014) Recipe Generation

    • We do not create a Vagrantfile (TK does that for us)
    • We do not use vagrant-omnibus or vagrant-berkshelf plugins (TK does that for us)
    • We use ChefDK (aka the chef command) to generate Berkshelf and .kitchen.yml files so we don’t have to touch those
    • We use the ChefDK package to install everything for us (except vagrant)
    • We don’t use “knife cookbook create” and instead use “chef generate cookbook”

    If you’re trying to mix+match these instructions with other HOWTOs on the Internet that have you editing your own Vagrantfile or installing vagrant plugins like vagrant-omnibus or vagrant-berkshelf then you’ll probably have a bad time. What this HOWTO is trying to do is leverage TK and Berkshelf (and ChefDK to configure both of those for you) so that you can quickly move on to converging chef recipes on Ubuntu and CentOS virtual machines.

    This is the correct way to start building Chef Cookbooks and leverage ChefDK and Test Kitchen as of this writing. This HOWTO should not build any bad habits, and the cookbook that results from this will only need to be extended to include tests.

    Note that we’re not using Test Kitchen to actually test our cookbook. We are ‘only’ using it to run vagrant and berkshelf, sync our cookbook(s), install and/or update the chef-client on the virtual node that we provision, and converge the node (which is quite a lot really: test-kitchen has to do a lot of heavy lifting in order to run tests, and we can utilize that even if we do not run any tests). Test-driven cookbook design is outside of the scope of this HOWTO, but would fit nicely on top of this. This is a minimum skeleton designed to quickly get to applying configuration to an Ubuntu or CentOS virtual machine.


    1. Install Vagrant
    2. Install ChefDK
    3. Use ChefDK to generate a cookbook skeleton
    4. Add a recipe
    5. Add a template
    6. Add an attribute file
    7. Use test-kitchen to create vagrant VMs and apply the chef cookbook

    Install Vagrant

    FIXME: steps to install vagrant — go to the website, download and install.

    Install ChefDK

    The ChefDK package compiles together a development environment with chef-client, knife, the chef command line tool, test-kitchen, berkshelf and drivers for vagrant. ChefInc has done the work to make sure that all the tools in the package are compatible so that you won’t wind up needing to worry about json gem conflicts and become an expert in ruby dependency management just in order to converge a simple cookbook.

    curl -L https://omnitruck.chef.io/install.sh | sudo bash -s -- -P chefdk

    Make sure you have at least version 0.2.0 (there was a bug in 0.1.0 that affected this workflow):

    chef --version
    Chef Development Kit Version: 0.2.0

    Create a Repo

    I’m going to assume a ~/chef-repo/{cookbooks,data_bags,roles,environments} structure, and we are going to follow a one-git-repo-per-cookbook model:

    % mkdir -p ~/chef-repo/cookbooks/ntp
    % cd ~/chef-repo/cookbooks/ntp
    % git init .
    % touch README.md
    % git add .
    % git commit -a -m 'first commit'

    Create Cookbook Skeleton

    % cd ~/chef-repo/cookbooks/ntp
    % chef generate cookbook .

    This magically picks up the name of our cookbook based on the name of the directory we are in. It also creates:

    • metadata.rb
    • README.md
    • chefignore
    • Berksfile
    • .kitchen.yml
    • recipes/default.rb

    It is a good idea now to make a commit so that you can go back to an unmodified skeleton if you want to:

    cd ~/chef-repo/cookbooks/ntp
    git add .
    git commit -m 'Generated cookbook skeleton'

    Add Your Recipe Code

    edit recipes/default.rb and include some resources:

    package "ntp" do
      action :install
    end

    template "/etc/ntp.conf" do
      source "ntp.conf.erb"
      variables( :ntp_server => "time.nist.gov" )
      notifies :restart, "service[ntp_service]"
    end

    service "ntp_service" do
      service_name node[:ntp][:service]
      action [:enable, :start]
    end

    Add Your Template

    Since we included a template resource, we need to create the content for the template. ChefDK contains a really simple generator for the ‘scaffolding’:

    % cd ~/chef-repo/cookbooks/ntp
    % chef generate template ntp.conf

    All that does is create the file templates/ntp.conf.erb. You’ll need to edit that file and include the contents that you want for ntp.conf.erb anyway (so you can skip the ‘chef generate template’ step entirely):

    restrict default kod nomodify notrap nopeer noquery
    restrict -6 default kod nomodify notrap nopeer noquery
    restrict -6 ::1
    server <%= @ntp_server %>
    server     # local clock
    driftfile /var/lib/ntp/drift
    keys /etc/ntp/keys
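
To see how the `variables` hash in the recipe ends up as `@ntp_server` in the template, the substitution can be tried with Ruby's standard ERB library outside Chef. TemplateContext below is a hypothetical stand-in for the context Chef renders templates in, not Chef's own class.

```ruby
require 'erb'

# Minimal stand-in for how a template sees its variables: each entry in the
# hash becomes an instance variable, so :ntp_server is read as @ntp_server.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  # Render ERB source with this object's instance variables in scope.
  def render(source)
    ERB.new(source).result(binding)
  end
end

context = TemplateContext.new({ ntp_server: 'time.nist.gov' })
puts context.render('server <%= @ntp_server %>')
```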

    Add Your Attribute File

    On RHEL based systems we need to start the “ntpd” service and use the “/etc/init.d/ntpd” init script. On Debian/Ubuntu systems we need to start the “ntp” service and use the “/etc/init.d/ntp” init script. To deal with that difference correctly we included the attribute node[:ntp][:service] in the recipe and we need to set it correctly (based on the platform_family) in an attribute file.

    Again you can run ChefDK to generate the attribute file or just create it directly:

    % cd ~/chef-repo/cookbooks/ntp
    % chef generate attribute default

    That should create attributes/default.rb which you need to edit to include the content:

    default["ntp"]["service"] =
      case node["platform_family"]
      when "rhel", "fedora"
        "ntpd"
      when "debian"
        "ntp"
      end

    Commit Our Work To Git

    Now is a good point to commit and make a checkpoint since things should be working.

    git add .
    git commit -m 'Added Cookbook Code'

    Run Kitchen List

    To see what virts ChefDK sets up to build initially we can do a kitchen list (configured in the .kitchen.yml file that ChefDK generated for us):

    % kitchen list
    Instance             Driver   Provisioner  Verifier  Transport  Last Action
    default-ubuntu-1604  Vagrant  ChefZero     Inspec    Ssh        <Not Created>
    default-centos-72    Vagrant  ChefZero     Inspec    Ssh        <Not Created>

    Run a Test Kitchen Converge

    The TK “converge” command will run chef-client converge and then leave the instance around to be logged into. Use “kitchen test” if you’d like to destroy the instance after you get a successful converge and not leave test VMs lying around.

    % cd ~/chef-repo/cookbooks/ntp
    % kitchen converge default-ubuntu-1604

    Login to Instance

    Since you used kitchen converge previously you can login to the instance:

    % cd ~/chef-repo/cookbooks/ntp
    % kitchen login default-ubuntu-1604

    Test CentOS

    Similarly we can verify that we do the right thing for CentOS:

    % cd ~/chef-repo/cookbooks/ntp
    % kitchen converge default-centos-72
    % kitchen login default-centos-72

    Test Kitchen Matching

    The instance argument for test kitchen should be read as a regular expression bounded by wildcards. So if the string you send it matches an instance it will try to converge or login to it. You can only login to a single box, so you need to give it enough to uniquely identify an instance when you’re doing a login, but multiple servers can be converged with a single command line:

    % kitchen converge             # converge all the instances in .kitchen.yml
    % kitchen converge ubuntu      # converge all the ubuntu servers in .kitchen.yml (would match default-ubuntu-1204 and default-ubuntu-1404)
    % kitchen converge centos      # converge all the centos servers in .kitchen.yml
    % kitchen converge default     # converge all the default test suites in .kitchen.yml (test suites are more advanced TK use)
    % kitchen login ubuntu-1604    # login to the ubuntu-1604 instance (if 'default' is the only suite you have defined)
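
The matching rule can be modelled in a few lines of Ruby. This is a sketch of the behaviour as described, not Test Kitchen's actual code; matching_instances is a hypothetical helper and the instance names are the two from the kitchen list output earlier.

```ruby
# Instance names as `kitchen list` printed them for this cookbook.
INSTANCES = %w[default-ubuntu-1604 default-centos-72].freeze

# Rough model of the matching rule: the argument is treated as a regular
# expression that may match anywhere in the instance name. With no argument,
# every instance is selected.
def matching_instances(pattern, instances = INSTANCES)
  return instances.dup if pattern.nil? || pattern.empty?

  instances.grep(Regexp.new(pattern))
end

p matching_instances('ubuntu')
p matching_instances('default')
p matching_instances('ubuntu-1604')
p matching_instances(nil)
```

A login needs the result to contain exactly one instance, which is why 'ubuntu-1604' works for login here while 'default' would not.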

    Benefits As Starting Workflow

    (This is aimed more at explaining to experienced chef users why to teach Chef in this way)

    There’s no explicit interaction with a Chef Server involved in this example. Under the covers, your kitchen config will be firing up a chef-zero server and using it to converge the node, but this detail is hidden and we do not need to install a chef-server first before doing work. There’s no additional setup of EC2 or Digital Ocean keys as well. We also do not scribble over the User’s desktop configuration. We install two utilities and get right to converging recipes as fast as possible on a virt. There is probably less initial overhead compared to setting up chef-solo and explaining dna.json and solo.rb files. The focus is as much as possible on writing Chef recipes, but done with correct tooling first.

    There’s no going down the wrong path. This outline conforms to the Law of Primacy (http://en.wikipedia.org/wiki/Principles_of_learning#Primacy): you will fall back on what you learn first. So we do not use chef-solo because fundamentally that approach leads to dead ends, and does not teach that chef is best used with a server. We also start the user down the path of using git-repos-per-cookbook. We use tools like test-kitchen which can grow into fully TDD design later, and we converge CentOS and Ubuntu virts on a workstation that might be MacOS (or soon Windows) because later that is the best workflow to solve that problem (rather than, say, firing up an EC2 cloud instance and logging in and using chef-zero every time you want to test a cookbook).

    What Magic Just Happened

    Notice that you:

    • Didn’t have to configure or touch Berkshelf
    • Didn’t have to find vagrant boxes
    • Didn’t have to deal with installing chef on your vagrant box
    • Didn’t have to deal with old installed versions of chef on your vagrant box
    • Didn’t have to use hosted chef
    • Didn’t have to setup a chef server (either open source or private enterprise chef)
    • Didn’t have to touch a Vagrantfile
    • Didn’t have to touch the Test-Kitchen config (bit of a lie until ChefDK 0.1.0 comes out)
    • Didn’t have to install any vagrant plugins
    • Didn’t have to fight with installing ruby tools and gem dependencies

    Most of the work in this HOWTO was on editing your Chef cookbook and using test kitchen to converge virts using your cookbook.

    While that is an old gist it is still mostly accurate (although some of the philosophical argumentation I’m making there was more geared towards an internal Chef, Inc audience or expert-audience and is now all just accepted and could have been skipped to turn it into a better tutorial). Note that this gives you a route to eventually start writing chefspec and serverspec/inspec tests against your cookbooks but I strongly suggest not adding that additional complexity from the start. Just running `kitchen converge` does the basic testing of “run this cookbook, for reals, with chef-client and show me the results”.

    1. That’s all very well, but where I work it is the only option available to us due to the restricted environment we work in. The only alternative is to make a change and wait 30 minutes for it to deploy.
