Why Everyone Working in DevOps Should Read The Toyota Way

Ignore the Noise, Go to the Signal

In a former life I was a history student. I wasn’t very good at it, and one of my weaknesses was an unwillingness to cut out the second-hand nonsense and read the primary texts. I would read up on every historian’s views on (say) the events leading up to the First World War, thinking that would give me a short-cut to the truth.

The reality was that just reading the recorded deliberations of senior figures at the time would have given me a view of the truth, and a way to evaluate all the other opinions I felt bombarded by.

What I should have learned, in other words, was: ignore the noise, and go to the signal.

Lean Schmean?

I was reminded of this learning moment recently when I finally read The Toyota Way. I had heard garbled versions of its message over the years through:

  • Reading blog after blog exhorting businesses to be ‘lean’ (I was rarely the wiser as to what that really meant)
  • Hearing senior leadership in one company use ‘lean’ as a verb (as in: ‘we need to lean this process’ – I know, right?)
  • Being told by a colleague in an all-hands that we should all stop development whenever there was a problem ‘like they do at Toyota’ (‘How and why the hell is that going to help with 700 developers, and how do I explain that to customers?’, I thought. It came to nothing)

In other words, ‘lean’ seemed to be a content-free excuse to just vaguely tell people to deliver something cheaper, or give some second-hand cargo-cult version of ‘what Toyota do’.

So it was with some scepticism that I found myself with some time and a copy of The Toyota Way in my hands. Once I started reading it, I realised that it was the real deal, and that it articulated, better than I had, many of the things I’d done to make change in business before, some of which I wrote about here and here.

‘That’s Manufacturing. I Work in a Knowledge Industry.’

One of the most obvious objections to anyone who foists The Toyota Way (TTW) on you is that its lessons apply to manufacturing, which is obviously different from a knowledge industry. How can physical stock levels or assembly line management principles apply to what is done in such a different field?

The book deals with that objection on a number of levels.

The Toyota Way is a Philosophy, Not a Set of Rules

First, it emphasises that TTW is a philosophy for production in general, and not a set of rules governing efficient manufacturing. This philosophy can (depending on context) result in certain methods and approaches being taken that can feel like, or in effect become, rules, but can be applied to any system of production, whether it be production of pins, medicines, cars, services, or knowledge.

What that means in terms of ‘rules’ for your business will depend on your specific business’s constraints. So you’re under no obligation to do things the same way Toyota do them, because even they break their own supposed ‘rules’ if it makes sense for them. One example of this is the ‘rule’ that’s often cited that stock levels must always be low or minimal to prevent waste.

A high-level overall goal of TTW is to create a steady flow of quality product output in a pipeline that reduces waste. That can mean reducing stock levels in some cases (commonly considered a ‘rule’ of lean manufacturing), or even increasing them in others, depending on the overall needs of the system to maintain a steady flow.

So while the underlying principles of TTW are relatively fixed (such as ‘you should go and see what is going on on the floor’, ‘visual aids should be used collaboratively’, and so on), the implementation of those principles is relatively loose and non-prescriptive.

This maps perfectly to DevOps or Agile, each of which has a relatively clear set of principles (CALMS, and the Agile Manifesto, respectively) which can be applied in all sorts of ways, none of which is necessarily ‘correct’ for any given situation. In this context, the Agile and DevOps industry that’s been built up around these movements is just noise.

Waste and Pipelines are Universal

Secondly, the concept of waste and pipeline is not unique to manufacturing. If your job is to produce weekly reports for a service industry, then you might consider that time spent making that report is wasted if its contents are not acted upon, or even read in a timely way.

A rather shocking amount of time can be spent in knowledge industries producing information that doesn’t get used. In my post on documentation I wrote about the importance of the ‘knowledge factory’ idea in running an SRE team, and the necessity to pay a ‘tax’ on maintaining those essential resources (roughly 5% of staff time in that case). The dividend, of course, was far greater.

Most of that tax was spent on removing or refining documentation rather than writing it. That was time well spent, as the biggest problem with documentation I’ve seen in decades of looking at corporate intranets is too much information, leading to distrust and the gradual decay of the entire system. So it was gratifying to read in TTW that:

  • Documentation audits take place regularly across the business
  • They are performed by a third party who ensures the documents are in order and follow standards
  • The principal check is a search for out-of-date documentation, rather than a check on quantity or correctness (how can an outsider easily determine that anyway?)

The root of the approach is summed up perfectly in the book:

‘Capturing knowledge is not difficult. The hard part is getting people to use the standards and contribute to improving it’

The Toyota Way
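The ‘search for out-of-date documentation’ check is simple enough to sketch in a few lines. This is only an illustration under assumptions of my own: the function name find_stale_docs, the idea of recording a ‘last reviewed’ date per document, the 180-day threshold and the file names are all hypothetical, and none of them come from the book.

```python
from datetime import date, timedelta


def find_stale_docs(last_reviewed, today, max_age_days=180):
    """Return names of documents not reviewed within max_age_days, oldest first.

    last_reviewed maps a document name to the date it was last reviewed.
    The 180-day default is an arbitrary, hypothetical threshold.
    """
    cutoff = today - timedelta(days=max_age_days)
    stale = [(reviewed, name) for name, reviewed in last_reviewed.items()
             if reviewed < cutoff]
    # Sorting the (date, name) pairs puts the most neglected document first.
    return [name for _, name in sorted(stale)]


# Invented example data: three documents and the dates they were last reviewed.
docs = {
    "runbook.md": date(2023, 1, 10),
    "oncall.md": date(2024, 5, 1),
    "dr-plan.md": date(2022, 6, 1),
}
stale_docs = find_stale_docs(docs, today=date(2024, 6, 1))
print(stale_docs)
```

Note that the check says nothing about whether a document is correct, only whether anyone has looked at it recently, which is the kind of question an outsider can answer.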

So in my experience, the fact that a car is being created instead of knowledge or software is not a reason to ignore TTW. Just like software delivery, a car is a product that requires both repeated activity in a pipeline, and creative planning of features and the building of technology in a bespoke way. All these parts of the process are covered and examined in TTW.

How Flow is Not Achieved

So how do you achieve a harmonious flow of output in your non-material factory? Again, this is essentially no different to manufacturing: what you’re dealing with is a system that has various sub-processes that themselves have inputs, outputs, dependencies and behaviours whose relationships need to be understood in order to increase throughput through the system.

How Flow is Achieved: Visualise for Collaboration First

Understanding, visualising and communicating your view of these relationships with your colleagues is hard, and critical to getting everyone pointing in the same direction.

This is something I’d also stumbled towards in a previous job as I’d got frustrated with the difficulty of visualising the constraints we were working under, and over which we had no control. I wrote about this in the post ‘Project Management with Graphviz’, where I used code to visualise and maintain the dependencies in a graph. I had to explain to so many people in so many different meetings why we couldn’t deliver for them that these graphs saved me lots of time.
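To give a flavour of what ‘using code to visualise dependencies’ can look like, here is a minimal sketch (it is not the code from that post). It turns a plain mapping of task to prerequisites into Graphviz DOT text. The function name to_dot, the task names and the red highlight for blocked tasks are all invented for illustration.

```python
def to_dot(dependencies, blocked=()):
    """Render {task: [prerequisite, ...]} as Graphviz DOT source text."""
    lines = ["digraph project {", "  rankdir=LR;"]
    # Highlight tasks that are blocked on something outside the team's control.
    for task in sorted(blocked):
        lines.append(f'  "{task}" [style=filled, fillcolor=lightcoral];')
    # One edge per dependency, pointing from prerequisite to dependent task.
    for task in sorted(dependencies):
        for prerequisite in sorted(dependencies[task]):
            lines.append(f'  "{prerequisite}" -> "{task}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical project plan: every name here is made up for the example.
plan = {
    "deploy app": ["firewall change", "build image"],
    "build image": ["registry access"],
}
dot_source = to_dot(plan, blocked=["firewall change"])
print(dot_source)
```

Saving the output to a file and running it through Graphviz (for example dot -Tpng plan.dot -o plan.png) should give a picture you can put in front of people. Because the graph is kept as data, updating it when a dependency changes is a one-line edit.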

Interlude – Visual Representations

Another principle outlined in TTW: visual representations should be simple and share-able. Unfortunately, this is the kind of thing you get delivered to your inbox as an engineer in an enterprise:

Now, I’m sure Project Managers eat this kind of thing for breakfast, and it makes perfect sense to them, but unless it corresponds to a commonly-seen and understood reality, it’s infinitely compressible information to the typical engineer. I used to almost literally ignore them. That’s the point of the visual representations principle of TTW: effective collaboration first, not complex schemas that don’t drive useful conversations.

In retrospect, and having read TTW, the answer to the problem of slow enterprise delivery is logically quite obvious: the processes that others depend on need to be improved before downstream processes can achieve flow. For many IT organisations, that means infrastructure must be focussed on first, as it provides the services the development teams depend on.

But what often happens in real world businesses (especially those that do not specialise in IT)? Yup, centralised infrastructure gets cut first, because it is perceived that it ‘doesn’t deliver value’. Ironically, cutting centralised infrastructure causes more waste by cutting off the circulatory systems other parts of the business depend on for air.

So the formerly centralised cost gets mostly duplicated in every development team (or business unit) as they slowly learn they have to battle the ‘decagon of despair’ themselves separately from the infrastructure team that specialised in that effort before.

This is the infrastructure gap that AWS jumped headlong into: by scaling up infrastructure services to a global level, they could extract a tax from each business that uses it in exchange for providing services in a finite but sufficient way that removes dependencies on internal teams.

It is also the infrastructure gap that Kubernetes is jumping headlong into. By standardising infrastructure needs such as mutual TLS authentication and network control via sidecars, Kubernetes’ nascent open source companion product Istio is pulling those infrastructure needs back together in an industry-standard way.

1) How Flow is Not Achieved: No Persistence in Pursuing Change

A key takeaway from the book is that efforts to make real change take significant lengths of time to achieve. TTW reports that it took Ford 5 years to see any benefits from adopting the Toyota Production System, and 10 years for any kind of comparable culture to emerge.

It’s extremely rare that we see this kind of patience in IT organisations (or their shareholders) trying to make cultural change. The only examples I can think of spring from existential crises that result in ‘do-or-die’ attempts to change, where the change needed is the last roll of the dice before the company implodes. Apple is the most notable (and biggest) of these, but many other smaller examples are out there. You can probably think of analogous examples from your own life, where feeling you had no choice but to make a change helped you achieve it.

2) How Flow is Not Achieved: Problems are Not Surfaced

The book contains anecdotes on the importance Toyota place on surfacing problems rather than hiding them. One example of this approach is the famous andon principle: problems are signalled as quickly and clearly as possible to all the appropriate people, so the right focus can be given to resolving the problem before production has to stop. If the problem can’t be fixed quickly, the team ‘stops the line’ to ensure it is properly resolved before continuing.

One such anecdote is the senior manager who criticised a junior one for not having any line stoppages on the junior manager’s watch: if there are no line stoppages then everything must be perfect, and it clearly never can be (unless quality control is doing its job and finds no problems, which was not the case in this instance).

This is the opposite to most production systems in IT, where problems are generally covered up or worked around in order to hide challenges from managers up the chain. This approach can only work for so long and results in a general deterioration in morale.
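Translated very loosely into a delivery pipeline, the andon idea might look like the sketch below: when a step fails, everyone is told straight away, and the line stops unless the problem can be fixed quickly. The names run_line, LineStopped, notify and quick_fix are hypothetical and of my own choosing; this is an illustration of the principle, not a description of how Toyota or any CI tool implements it.

```python
class LineStopped(Exception):
    """Raised when a problem could not be fixed quickly and work must halt."""


def run_line(steps, notify, quick_fix=None):
    """Run (name, callable) steps in order, surfacing any failure immediately.

    notify is called with a message as soon as a step fails.
    quick_fix, if given, returns True when the problem was resolved on the spot.
    """
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as err:
            # Surface the problem straight away instead of working around it.
            notify(f"ANDON: step '{name}' failed: {err}")
            fixed = quick_fix(name, err) if quick_fix else False
            if not fixed:
                # Stop the line: later steps do not run on top of a known problem.
                raise LineStopped(name) from err
        completed.append(name)
    return completed


def failing_test_step():
    raise RuntimeError("integration tests failed")


# Invented pipeline: the 'deploy' step must not run once 'test' has failed.
alerts = []
deployed = []
steps = [
    ("build", lambda: None),
    ("test", failing_test_step),
    ("deploy", lambda: deployed.append("app")),
]
try:
    run_line(steps, notify=alerts.append)
except LineStopped as stop:
    print(f"line stopped at: {stop}")
```

The design choice worth noticing is that the notification happens before anyone decides whether the problem is ‘worth’ raising.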

3) How Flow is Not Achieved: Focus on Local Optimisation

There is a great temptation, when trying to optimise production systems, to focus on local optimisations to small parts of the system that seem to be ripe for optimisation. While it can be satisfying to make small parts of the system run faster than before, it is ultimately pointless if the overall system is not constrained on those parts.

In manufacturing cars, optimising the production rate of the wing mirrors is pointless if wing mirrors are already produced faster than the engines are. Similarly, shaving a small amount off the cost of a wing mirror is (relatively speaking) effort wasted if the overriding cost is the engine. Better to focus on improving the engine.

In a software development flow, making your tests run a little faster is pointless if your features are never waiting for the tests to complete to deploy. Maybe you’re always waiting 2 days elapsed time for a manager to sign off a release, and that’s the bottleneck you should focus on.
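The arithmetic behind this is worth spelling out. In a simple model where every stage must finish its part for each unit of output, the system can go no faster than its slowest stage. The sketch below uses made-up rates (units per day) for the car example; the function names and numbers are assumptions for illustration only.

```python
def bottleneck(rates):
    """Return the name of the stage with the lowest rate."""
    return min(rates, key=rates.get)


def system_throughput(rates):
    """In this simple model the whole system runs at the pace of its slowest stage."""
    return min(rates.values())


# Hypothetical rates in units per day; the numbers are invented.
rates = {"wing mirrors": 120, "engines": 40, "final assembly": 60}

# Doubling a stage that is not the constraint changes nothing overall.
faster_mirrors = {**rates, "wing mirrors": 240}

# A modest improvement to the constraint lifts the whole system.
faster_engines = {**rates, "engines": 50}

print(bottleneck(rates))                  # engines
print(system_throughput(rates))           # 40
print(system_throughput(faster_mirrors))  # 40
print(system_throughput(faster_engines))  # 50
```

The same reasoning applies to the release sign-off example: replace the rates with measurements from your own pipeline and the stage to work on usually picks itself.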

4) How Flow is Not Achieved: Failure to ‘Go and See’

Throughout TTW, the importance of ‘going and seeing’ as a principle of management is reiterated many times. I wrote about the importance of this before in my blog on changing culture (Section 1: Get on the floor), but again it was good to see this intuition externally validated.

Two examples stuck in my mind: the story of the senior leader who did nothing but watch the production line for four hours so he could see for himself what was going on, and the minivan chief designer who insisted on personally driving in all 50 US states and Canada. The minivan designer then went back to the drawing board and made significant changes to the design that made sense in North America, but not in Japan (such as having multiple cup-holders for the long journeys typical of that region).

Both of these leaders could have had an underling do this work for them, but the culture of Toyota goes against this delegatory approach.

Implicit in this is that senior leadership need to be bought into and aware of the detail in the domain in order to drive through the changes needed to achieve success.

Go Read the Book

I’ve just scratched the surface here of the principles that can be applied to DevOps from reading TTW.

It’s important not to swallow the kool aid whole here as well. Critiques of The Toyota Way exist (see this article from an American who worked there), and are worth looking at to remind yourself Toyota have not created the utopia that reading The Toyota Way can leave you thinking they have. However, the issues raised there seem to deal with the general challenges of the industry, and the principles not being followed in certain cases (Toyota is a human organisation, after all, not some kind of spiritual production nirvana).

Oh, and at the back of the book there’s also a free ‘how to do Lean consulting’ section that gives you something like a playbook for those who want to consult in this area, or to deconstruct what consultants do with you if you bring them in.


14 thoughts on “Why Everyone Working in DevOps Should Read The Toyota Way”

  1. Also read “The High-Velocity Edge” by Steve Spear. It gets at the underlying mechanics of TPS.
