What I Learned Using Private LLMs to Write an Undergraduate History Essay

  • TL;DR
  • Context
  • Writing A 1996 Essay Again in 2023, This Time With Lots More Transistors
    • ChatGPT 3
    • Gathering the Sources
    • PrivateGPT
    • Ollama (and Llama2:70b)
  • Hallucinations
  • What I Learned

TL;DR

I used private and public LLMs to answer an undergraduate essay question I spent a week working on nearly 30 years ago, in an effort to see how the experience would have changed in that time. There were two rules:

  • No peeking at the original essay, and
  • No reading any of the suggested material, except to find references.

The experience turned out to be radically different with AI assistance in some ways, and similar in others.

If you’re not interested in my life story and the gory detail, skip to the end for what I learned.

Context

Although I work in software now, there was a time when I wanted to be a journalist. To that end I did a degree in History at Oxford. Going up in 1994, I entered an academic world before mobile phones, and with an Internet so nascent that the ‘computer room’ was still fighting with the snooker table for space next to the library (the snooker table was ejected a few years later, sadly). We didn’t even have phones in our dorms: you had to communicate with fellow students either in person, or via an internal ‘pigeon post’ system.

My typical week then was probably similar to that of a History student in 1964: tutorials were generally on Friday, when I would be given the essay question and reading list for the following week:

A typical History essay reading list from 1996. My annotations at the top related to the library reference details, and the notes at the bottom (barely legible even to me now) I suspect were the notes I made during the hour as my tutor was telling me what I should have written about.

That afternoon I would go to several libraries, hoping that other students studying the same module hadn’t borrowed the books on the list. Over the following week I would try to read and make notes on the material as much as possible given my busy schedule of drinking incredibly cheap and nasty red wine, feeling depressed, and suffering from the endless colds living in Oxford (which is built on a bog) gifts you free of charge.

The actual writing of the essay would take place on the day before delivery, and involve a lot of procrastination (solitaire and minesweeper on my Compaq laptop, as I recall) while the painful process of squeezing the juices from my notes took place.

I didn’t go to lectures. They were way too early in the afternoon for me to get out of bed for.

Writing A 1996 Essay Again in 2023, This Time With Lots More Transistors

The essay question I chose to answer was from a General British History (19th-20th century) course:

How far was the New Liberalism a Departure from the Old?

ChatGPT 3

I first went to ChatGPT (3, the free version – I’m a poor student after all) to try and familiarise myself with the topic:

Already I was way ahead of my younger self: I had an idea of the broad strokes I needed to hit to put a passable essay together. ChatGPT is great for this kind of surveying or summarizing work, but is pretty bad at writing an undergraduate-level essay for you. It can write a passable essay (literally), but it will be very obvious it was written by an LLM, and will probably at best get you a 2:2 or a third, due to the broad-brush tone and supine, mealy-mouthed ‘judgements’ which contain very little content.

After a few more interactions with ChatGPT over the next few minutes:

I was now ready to gather the information I needed to write the essay.

Gathering the Sources

The first step was to get the materials. It took me a couple of hours to find all the texts (through various, uh, avenues) and process them into plain text. I couldn’t get the journal articles (they were academically paywalled), which was a shame, as they tended to be the best source of figuring out what academic debate was behind the question.

It totalled about 10M of data:

756K    high_and_low_politics_in_modern_britain_bentley.txt
1.3M    lancashire_and_the_new_liberalism.txt
904K    liberalism_and_sociology_collini.txt
1012K   liberals_and_social_democrats.txt
812K    liberals_radicals_and_social_politics_emy.txt
684K    new_liberalism_allett.txt
1.5M    our_partnership__beatrice_webb.txt
556K    the_age_of_lloyd_george_kenneth_o_morgan.txt
708K    the_new_liberalism_freeden.txt
540K    the_new_liberalism_weiler.txt
148K    the_origins_of_the_liberal_welfare_reforms_hay.txt
1.1M    unemployment_a_problem_of_industry_beveridge.txt

Those couple of hours would have been reduced significantly for any future essay I might choose to write, as I spent some time figuring out exactly where to get them and installing software to extract the plain text. So I would now allocate an hour to getting the vast majority of texts, as opposed to the small subset of texts I would hunt down over a significantly longer time in the physical libraries 30 years ago.
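For the curious, the conversion step can be a one-liner, assuming the sources are PDFs and you have poppler’s pdftotext installed (this is illustrative, not the exact command I ran):

$ for f in *.pdf; do pdftotext "$f" "${f%.pdf}.txt"; done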

PrivateGPT

One of the well-known limitations of ChatGPT is that it doesn’t tell you which relevant sources lie behind the text it gives you.

I found out that the privateGPT repo gives us a quick way to augment an LLM with a set of documents, so I installed it on an old Dell Xeon desktop with 24 cores and 128G of RAM. This machine cost me about £600 a couple of years ago, and still works a treat (I bought it to run Kubernetes clusters). It doesn’t have a graphics card in it, so I thought it wouldn’t be usable for this kind of work, but it turns out to be just fine with ‘normal’ CPUs (as long as you don’t demand instant responses).
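For reference, the setup looked roughly like this, assuming the 2023-era imartinez/privateGPT repo layout (the source path is illustrative):

$ git clone https://github.com/imartinez/privateGPT
$ cd privateGPT
$ pip3 install -r requirements.txt
$ cp ~/essay/sources/*.txt source_documents/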

$ python3 ingest.py
Appending to existing vectorstore at db
Using embedded DuckDB with persistence: data will be stored in: db
Loading documents from source_documents
Loading new documents: 100%|███████████████████| 12/12 [00:00<00:00, 274.84it/s]
Loaded 12 new documents from source_documents
Split into 38046 chunks of text (max. 350 tokens each)
Creating embeddings. May take some minutes…
Ingestion complete! You can now run privateGPT.py to query your documents

Ingestion of the 10M sources took about five minutes. I asked it a few questions, and this was a typical output:

> Question:
what was the changing role of the state in new versus old liberalism

> Answer (took 169.94 s.):
The main difference between "new" and "old" liberalism lies in their approach towards the role of the government within society. While traditional or 'Old' liberals believed that minimizing intervention by the State would lead to a more prosperous economy, ‘New’ Liberals acknowledged the need for state action and social reform as necessary measures against industrial poverty. In essence, new liberalism marked an evolution of thought among liberals towards greater acceptance of government responsibility in promoting general welfare through interventionist policies.

> source_documents/the_age_of_lloyd_george_kenneth_o_morgan.txt:
It was, therefore, the Old Liberalism which helped to repair the fortunes of the party—nonconformity and industrial free trade, the very essence of the Liberal
faith for half a century. Where, however, was the ‘New Liberalism’ which could embrace a more positive attitude towards the central government, for which many

> source_documents/the_new_liberalism_weiler.txt:
In the broadest sense, the term new Liberalism refers both to the welfare legislation of the 1906-14 administration and to the changes in Liberal social and
political theory. At the time, however, the term first referred to the theoretical changes which preceded the legislative achievements of the Liberal government.

> source_documents/the_new_liberalism_weiler.txt:
What then was the new Liberalism? As defined here, it was the modification of Liberal ideology from a vehicle for a mid-Victorian ideal of laissez-faire to a
philosophy based on state action and social reform to cope with industrial poverty. This modification was achieved by Liberal journalists and theorists seeking to

> source_documents/the_new_liberalism_weiler.txt:
state, a development that represents the fruition of the new Liberalism.

New Liberalism
The Political Economy
of J.A. Hobson

UNIVERSITY OF TORONTO PRESS
Toronto Buffalo London

Toronto Buffalo London
Printed in Canada
ISBN 0-8020-5558-3

> source_documents/the_new_liberalism_weiler.txt:
understanding of the phenomenon of the new Liberalism itself.

> source_documents/the_new_liberalism_freeden.txt:
THE SOCIAL POLICY OL THE NEW LIBERALISM

You can see that the output is mixed in quality (the model I used was relatively small; see the Ollama section below), but it did point me at potentially relevant sources for quotes and bits to read. I spent about another hour rooting around these texts and gathering bits of text that I might use later. Note the response time, however (about three minutes is typical). I got a bit frustrated with the response-time-versus-quality trade-off and went looking for other tech to use.

Ollama (and Llama2:70b)

This was the combination of technology that blew my mind, and it absolutely changes the game for this kind of work. Ollama is a project designed to ‘get up and running with large language models locally’ which borrows concepts from Docker, like the Modelfile (similar to the Dockerfile) and commands such as ollama pull and ollama list.
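A minimal sketch of that Docker-like workflow (the Modelfile contents and the essay-helper name are illustrative, not the exact ones I used):

$ ollama pull llama2:70b
$ ollama list
$ cat Modelfile
FROM llama2:70b
SYSTEM "You are a tutor helping an undergraduate write a History essay."
$ ollama create essay-helper -f Modelfile
$ ollama run essay-helper "How far was the New Liberalism a departure from the Old?"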

Compared to the ChatGPT output, Llama2:70b (a Llama2 model with 70 billion parameters) allowed me to automate a significant chunk of the work, reducing the time needed to write the essay from days to less than a day.

I downloaded the biggest model I could find (Llama2:70b) and asked it the question in a very unspecific way. It returned a much better essay plan than ChatGPT did:

At this point I started writing the body of the essay (introductions and conclusions are best left to the end), basing each paragraph on the numbered points above. As I wrote, looked at specific texts to find references, and asked LLMs questions, something unexpected happened: I was learning about the nuances of the debate (mainly through looking at the Collini book I hadn’t been able to find in the 90s).

I was getting sucked into a temptation to actually read the texts, and had to remind myself that I wasn’t allowed to do that, according to the rules of the ‘game’ I was playing here.

This appreciation of which texts were important, and of what I was being invited to write about, happened much faster than it would have done pre-Internet. I really do wonder how students do this today (get in touch if you can tell me!).

Hallucinations

A well-known limitation of LLMs is their tendency to ‘hallucinate’, or ‘bullshit’ a plausible-seeming answer. I definitely saw this, and it wasted some time for me. Llama2 gave me this nice quote, which raised suspicions with me as it seemed too good to be true:

Evidence: In his book "Industrial Democracy" (1902), Webb argues that collective ownership and control of the means of production is necessary to address the systemic issues that perpetuate poverty and inequality. He writes, "The only way to secure industrial democracy is through the collective ownership and control of the means of production."

But I couldn’t find that quote anywhere in the book. Llama2 just made it up.

What I Learned

After writing my AI-assisted essay, I took a look at my original essay from 1996. To my surprise, the older one was far longer than I remembered essays being (about 2500 words to the AI-assisted one’s 1300), and was, it seemed to me, of much higher quality. It may have been longer because it was written in my final year, when I was probably at the peak of my essay-writing skills.

It took me about six hours of work (spread over four days) to write the AI-assisted essay. The original essay would have taken me a whole week, but the number of hours spent actively researching and writing it would have been somewhere between 20 and 30.

I believe I could have easily doubled the length of my essay to match my old one without reducing the quality of its content. But without actually reading the texts I don’t think I would have improved from there.

I learned that:

  • There’s still no substitute for hard study. My self-imposed rule that I couldn’t read the texts limited the quality to a certain point that couldn’t be made up for with AI.
  • History students using AI should be much more productive than they were in my day! I wonder whether essays are far longer now than they used to be.
  • Private LLMs (I used llama2:70b) can be way more useful than ChatGPT 3.5 for this type of work, not only in the quality of the generated responses, but also in their capability to identify relevant passages of text.
  • I might have reduced the time significantly with some more work to combine the llama2:70b model with the reference-generating code I had. More research is needed here.

Hope you enjoyed my trip down memory lane. The world of humanities study is not one I’m in any more, but it must be being changed radically by LLMs, just as it must have been by the internet. If you’re in that world now, and want to update me on how you’re using AI, get in touch.

The git repo with essay and various interactions with LLMs is here.


Learn jq the Hard Way, Part IV: Pipes


Pipes

The pipe is the next-most used feature of jq after filters. If you are already familiar with pipes in the shell, it’s a very similar concept.

How Important is this Post?

Pipes are fundamental to using jq, so this section is essential to understand. We will cover:

  • The pipe (|)
  • The .[] filter
  • Nulls
  • Array references

Setup

Create a folder to work in, and move into it:

$ mkdir ljqthw_pipes
$ cd ljqthw_pipes

The .[] Filter

Before we get to the pipe, we’re going to talk about another filter. This filter
doesn’t have a snappy name, so we’re going to refer to it as the .[] filter. It’s
technically known as the ‘array/object value iterator’.

Create a simple JSON document in a file:

$ echo '[{"username": "alice", "id": 1 }, {"username": "bob", "id": 2 }]' > doc1

This document is an array representing users on a system, with two objects in it.
Each object has a username and id name-value pair.

Since the file is a simple array, we can filter the individual objects by referring
to the array index in a fashion similar to other languages, by putting the index
number in square brackets:

$ jq '.[0]' doc1
$ jq '.[1]' doc1

If the referenced index doesn’t exist, then a null is returned:

$ jq '.[2]' doc1

You can also refer to negative indexes to reference the array from the end rather than the start:

$ jq '.[-1]' doc1
$ jq '.[-2]' doc1
$ jq '.[-3]' doc1

If you don’t provide an index, then all the items in the array are returned:

$ jq '.[]' doc1

Look at the output. How are the items returned?

  • In a new array?
  • In a series of JSON documents?
  • In a comma-separated list?

Running these commands may help you decide. Note that in the below commands we are using a shell pipe, not a jq pipe. The two uses of the pipe (|) character are analogous, but not the same:

$ jq '.[]' doc1 | jq '.'
$ jq '.[]' doc1 | jq '.[0]'

It’s really important to understand that the .[] filter does not just work on
arrays, even though it uses an array’s square brackets. If a series of JSON
objects is supplied, then it will iterate through the name-value pairs, returning
the values one by one. That’s why it’s called the ‘array/object value iterator’.

Type this out to see that in practice:

$ echo '{"username": "alice"} {"id": 1}' > doc2
$ jq '.' doc2
$ jq '.[]' doc2

This time you had no array, just a series of two JSON objects. The .[] filter
iterates over each object in the series, returning the values in each. Remember,
it’s formally known as the ‘array/object value iterator’.
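For reference, the output should be the two values, one from each object in the stream:

$ jq '.[]' doc2
"alice"
1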

Now try this. What happens? Why?

$ echo '"username"' > doc3
$ jq '.[]' doc3

This filter confused me for a long time when I was learning jq, as I thought the
.[] filter just removed the square brackets from the JSON, returning an array’s
contents, like this example does:

$ echo '[{"a": "b", "c": "d" }]' > doc4
$ jq '.' doc4
$ jq '.[]' doc4

While it obviously does do that in this case, you should now be aware that this
filter does more than just ‘remove the square brackets’!

Introducing The Pipe

OK, back to doc1. This time you’re going to run the .[] filter over it, and then
use a jq pipe to process it further. Run these commands and think about what
the pipe character (|) is doing.

$ jq '.' doc1
$ jq '.[]' doc1
$ jq '.[] | .["username"]' doc1

The first command shows you the JSON document pretty-printed, to remind you of the
raw content.

The second command (the ‘array/object value iterator’) iterates over the objects in
the array, and produces two JSON documents. This is an example of what was pointed
out above: the .[] filter does not simply ‘remove the square brackets’. If it
did, the resulting two objects would have commas between them. It turns the two
objects within the array into a series of JSON documents. Make sure you understand
this before continuing.
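For reference, the second command’s output looks like this: two separate pretty-printed JSON documents, with no enclosing array and no comma between them:

$ jq '.[]' doc1
{
  "username": "alice",
  "id": 1
}
{
  "username": "bob",
  "id": 2
}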

The third command introduces the jq pipe. Note that this time, the pipe character
(|) is inside the jq string’s single quotes, making it a jq pipe rather than
the ‘shell pipe’ we saw earlier. The jq pipe passes the output of the .[] operator
on to the next filter (.["username"]). This filter then effectively retrieves
the value of the username name/value pair from each JSON document from the output
of the first .[] filter.

You can run the last command in other ways too, with identical results:

$ jq '.[] | ."username"' doc1
$ jq '.[] | .username' doc1

jq will figure out what you mean if you use either of those shorthands.
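All three forms should produce the same output:

$ jq '.[] | .username' doc1
"alice"
"bob"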

Nulls

Not all JSON documents have objects in them with consistent names. Here you create
a document with inconsistent names in its objects (username and uname):

$ echo '[{"username": "alice"}, {"uname": "bob" }]' > doc4

Now see what happens when you reference one of the names in your query:

$ jq '.[] | .username' doc4

The object with the matching name outputs its value, but the other object
outputs a null value, indicating that there was no match for the query in
that object.
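For reference, the output should look like this:

$ jq '.[] | .username' doc4
"alice"
null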

Referencing the other object’s name swaps which value it outputs and which
results in a null:

$ jq '.[] | .uname' doc4

What You Learned

  • What the .[] (array/object value iterator) filter does, and how it works on arrays and objects
  • How to reference array items
  • The various ways you can refer to names in JSON objects
  • How the jq pipe (|) passes the output of one filter on to the next
  • That referencing a name that does not exist in an object returns null

Exercises

1) Create a file with three JSON documents in it: an array with a single name-value pair object in it, an array with two name-value pair objects in it, and a single bare object. Run it through the .[] filter and predict what output it will give.



Learn jq the Hard Way, Part III: Filters


Simple Filters

In this section we introduce the most frequently used feature of jq: the filter. Filters allow you to reduce a given stream of JSON to another smaller, more refined stream of JSON that you can then do more filtering or processing on if you want.

How Important is this Post?

Filters are fundamental to using jq, so this post’s content is essential to understand.

We will cover:

  • The most commonly-used filters
  • Selecting values from a stream of JSON
  • Railroad diagrams, and how arrays cannot contain name-value pairs in JSON

Setup

Create a folder to work in, and move into it:

$ mkdir ljqthw_nv
$ cd ljqthw_nv

Now create a simple JSON document to work with:

$ echo '{"user1": "alice", "user2": "bob"}' > json_object

This file contains a simple JSON object with two name-value pairs (user1 and user2 being the names, and alice and bob being their respective values).

The Dot Filter

The concept of the filter is central to jq. The filter allows us to select those parts of the JSON document that we might be interested in.

The simplest filter – and one you have already come across – is the ‘dot’ filter. This filter is a simple period character (.) and doesn’t do any selection on the data at all:

$ jq . json_object

Note that here we are using the filename as the last argument to jq, rather than passing the data through a UNIX pipe with cat.

Arrays, Name-Value Pairs, and Railroad Diagrams

Now let’s try and create a similar array with the same name-value pairs, and run the dot filter against it:

$ echo '["user1": "alice", "user2": "bob"]' > json_array
$ jq . json_array

Was that what you expected? When I ran this I expected it to just work, based on my experience of data structures in other languages. But no. Arrays in JSON cannot contain a bare name-value pair as one of their values: name-value pairs can only exist inside an object, and arrays can only contain JSON values.
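If you want an array that carries the same information, each name-value pair must be wrapped in an object. A quick sketch (json_array2 is a filename made up for this example):

$ echo '[{"user1": "alice"}, {"user2": "bob"}]' > json_array2
$ jq . json_array2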

What can an array contain? Have a look at this railroad diagram, from https://json.org/:

The above diagram defines what an array consists of. Make sure you understand how to read the diagram before continuing, as being able to read such diagrams is useful in many contexts in software development.

Railroad Diagram
A railroad diagram is also known as a syntax diagram. It visually defines the syntax for a particular language or format. As your code or document is read, you can follow the line, choosing which path to take as it splits. If your code or document can be traced through the ‘railroad’, then it is syntactically correct. As far as I can tell, there is no ‘official’ format for railroad diagrams, but the conventional signs can easily be found and deciphered by searching online for examples.

The ‘value’ in the array diagram above is defined here:

Following the value diagram, you can see that there is no ‘name-value’ pair defined within a value. Name-value pairs are defined in the object railroad diagram:

A JSON object consists of zero or more name-value pairs, separated by commas, making it fundamentally different from an array, which contains only JSON values, also separated by commas.

It’s worth understanding these diagrams, as they can help you a lot as you try and create and parse JSON with jq. There are further diagrams on the https://json.org site (string, number, whitespace). They are not reproduced in this section, as you only need to reference them in the rarer cases when you are not sure (for example) whether something is a number or not in JSON.

Our First Query – Selecting A Value

Now, back to our original document.

$ cat json_object
{"user1": "alice", "user2": "bob"}

Let’s say we want to know what user2’s value is in this document. To do this, we need a filter that outputs only the value for a specific name.

First, we can select the object, and then select by name:

$ jq '.user2' json_object

This is our first proper query with jq. We’ve filtered the contents of the file down to the information we want.

In the above we just put the bare string user2 after the dot filter. jq accepts this, but you can also refer to the name by placing it in quotes, or placing it in quotes inside square brackets:

$ jq '.["user2"]' json_object
$ jq '."user2"' json_object

However, using the square brackets without the quotes does not work:

$ jq '.[user2]' json_object

This is because the [""] form is the official way to look up a name in an object. The simpler .user2 and ."user2" forms we saw above are just a shorthand for this, so you don’t need the four characters of scaffolding ([""]) around the name. As you are learning jq it may be better to use the longer form to embed the ‘correct’ form in your mind.
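Whichever of the three forms you use, the result is the same:

$ jq '.["user2"]' json_object
"bob"
$ jq '."user2"' json_object
"bob"
$ jq '.user2' json_object
"bob"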

Types of Quote Are Important

It’s important to know that the types of quote you use in your JSON are significant to jq. Type this in, and think about how it is different to the ‘non-broken’ JSON above.

$ echo "['user1': 'alice', 'user2': 'bob']" > json_array_broken
$ jq '.' json_array_broken

What You Learned

  • The ‘dot’ filter
  • How to select the value of a name from an object
  • The various allowed forms for performing the selection
  • The syntactic difference between an object, an array, and a name-value pair
  • The types of quote are important

Exercises

1) Create another json_array file as per this section, but this time with two JSON objects (ie the same line twice). Re-run the queries above and see what happens to demonstrate that jq works on a stream of JSON objects.



Learn jq the Hard Way, Part II: The jq Command


jq

This section introduces you to the jq command, starting with the simplest possible commands as well as a brief look at the most commonly-used flags.

What is jq?

jq is a program for parsing, querying, and manipulating JSON. More broadly, it’s a filter: it takes input and transforms it to output.

It can help you programmatically answer questions like:

  • How many AWS VMs does my account have?
  • How many of these VMs were created before last week?
  • What is the running status of these older VMs?
  • What is the list of VMs that have the tag ‘sales’, were created before last week, and are still running?

Before jq existed, the most common way to work out these things would be to use older command-line tools such as sed and awk, or programming languages like Python or Perl. These could be highly sensitive to changes in input, difficult to maintain, prone to error, and slow. jq is much more elegant.

However, jq can be difficult for the uninitiated to understand and is more often used than understood. This book aims to help with that, to the point where you can confidently come up with your own solutions to challenges you face at work.

Invoking jq

Let’s get our hands dirty with jq.

Run this on your terminal:

$ echo '{}' | jq

This is the simplest JSON document you can pass to jq. The output of the echo command is piped into the jq command, which reads the output and ‘pretty-prints’ the JSON received for you because it’s going to the terminal rather than a file or another program. In this case, the input looks the same as the output, but the ‘pretty-printing’ involves emboldening the braces.


Input to jq

jq takes a stream of JSON documents as input. This input can be via a pipe like this:

$ echo '{}' | jq

or from a given filename, like this:

$ echo '{}' > doc.json
$ jq . doc.json

It’s really important to understand that jq does not take a single JSON document as input (although it can). According to the documentation, it takes ‘a stream of JSON entities’.

What does this mean in practice? It means that if you run this:

$ echo '{}[]' | jq

you have just seen jq process ‘a stream of JSON entities’. What you inputted to jq was not valid JSON. If you want to prove this to yourself, just plug in {}[] to a JSON-parsing website (easily Google-able) or run this:

$ echo '{ {}[] }' | jq

The above just enclosed the ‘stream of JSON’ within a single JSON object (the surrounding {}), and jq throws an error, because what’s inside the single JSON object, ie:

{}[]

is not itself a single valid JSON document.
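A quick way to see this stream handling in action is to feed jq several top-level entities at once; it pretty-prints each one in turn:

$ echo '{"a": 1} [2] "three"' | jq .
{
  "a": 1
}
[
  2
]
"three"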

Two jq Flags

The jq command has a number of flags which are useful to know about. At this stage of the book we’re just going to look at two, as to list them all would overwhelm you. These are the two I have seen used most often.

The -r Flag

The first is most often used in quick scripts and shell pipelines. All it does is remove the quotes from the items in the output. For example, just passing in a string to jq will output the string with the quotes surrounding it:

$ echo '"asd"' | jq

Adding the -r flag removes the quotes:

$ echo '"asd" "fgh' | jq -r

The -S Flag

The second flag sorts fields in JSON objects. In this first example, the a and b items are output in the same order they were input:

$ echo '{"b":0,"a":0}' | jq

Adding the -S flag sorts them by their key:

$ echo '{"b":0,"a":0}' | jq -S

In this simple example the value of sorting these fields is not so clear, but if you have a huge JSON document to look through, then knowing this flag is a simple way to save lots of time hunting through screen after screen of data.



What You Learned

  • What jq is, and the kinds of question it can help you answer
  • How to invoke jq, both via a pipe and against a file
  • That jq takes a ‘stream of JSON entities’ as input, not just a single JSON document
  • The -r flag, which removes quotes from the output, and the -S flag, which sorts object fields



Learn jq the Hard Way, Part I: JSON


Introduction

This jq series has been written to help users to get to a deeper understanding and proficiency in jq. It doesn’t aim to make you an expert immediately, but you will be more confident about using it and building your knowledge up from that secure base.

You may well have already played with jq a little – maybe been given a jq command to run by someone else, found a useful one-liner from StackOverflow, or hacked something together quickly that ‘does the job’, but without really understanding what you did. While that’s a great way to get going, a guided course that shows you how the pieces fit together really helps you go further. Understanding these pieces enables you to be more creative, solving your own challenges in ways that work for your problem domain.

Why ‘The Hard Way’?

The ‘Hard Way’ is a method that emphasises the process required to learn anything. You don’t learn to ride a bike by reading about it, and you don’t learn to cook by reading recipes. Content can help (hopefully, this does) but it’s up to you to do the work.

This book shows you the path in small digestible pieces and tells you to actually type out the code. This is as important as riding a bike is to learning to ride a bike. Without the brain and the body working together, the knowledge does not properly seep in.

Before we get hands on with jq, it’s important to know what JSON is, and what it is not.

In this post, we:

  • Explain what JSON is
  • Look at examples of it
  • Introduce key terms
  • Briefly look at its place in the software landscape

Why Should I Read This Post?

This is an introductory post, but an important one.

Even if you’ve seen JSON before, I strongly encourage you to read over this. The reason for that is that getting a clear grasp of the terminology will help enormously when reading jq docs later. In fact, a good working understanding of the terminology is the best way to avoid confusion when using jq.

What Is JSON?

JSON is a ‘data interchange format’. This is a fancy way of saying that it is a standardised way to write information and send it to other entities who can understand what it means.

You might have heard of (or even used) other data-interchange formats, such as XML, CSV, Apache Parquet, or YAML. Each of these formats has its benefits and disadvantages relative to the others. CSV is very simple and easily understood, but is not very good at expressing complex nested information, and can be ambiguous in how it represents data. XML allows for very complex data to be encapsulated, but can be verbose and hard for humans to parse. YAML is optimised for human readability, allowing comments and using whitespace rather than special characters to delimit structure.

JSON is ubiquitous for a few reasons. First, it is simple, being easily parsed by anyone familiar with standard programming languages. Second, it is natively understood by JavaScript, a very popular programming language in the IT industry. Third, it can be parsed by almost every programming language through easily available libraries.

JSON Is Simple

Here is an example JSON object.

{
  "accounting": [
    {
      "firstName": "Alice",
      "lastName": "Zebra",
      "building": "7a",
      "age": 19
    },
    {
      "firstName": "Bob",
      "lastName": "Young",
      "age": 28
    }
  ],
  "sales": [
    {
      "firstName": "Celia",
      "lastName": "Xi",
      "building": "Jefferson",
      "age": 37
    },
    {
      "firstName": "Jim",
      "lastName": "Galley",
      "age": 46
    }
  ]
}

The above JSON represents two departments of a workplace and their employees. The departments sit in a ‘collection’ of name-value pairs: "accounting" and "sales" are the names, and each value is an ordered list (known as an array) of objects, each of which contains its own name-value pairs.

Anything enclosed within a pair of curly braces (‘{‘ and ‘}‘) is an object. Anything enclosed within a pair of square brackets (‘[‘ and ‘]‘) is an array.

It might sound theoretical, but it’s really important that you understand the above terminology, or at least understand that it’s important. Most jq documentation makes these distinctions carefully, but some documents use the terms wrongly, or loosely. This can cause great confusion. When you look at JSON as you read this book, be sure you can explain what it is in clear and correct terms to yourself and others.

The format is flexible, allowing items within an object to have different name-value pairs. Here, the “building” name is in Celia’s and Alice’s entries, but not in Jim’s or Bob’s.

A JSON document can be an object or an array. Here is the same document as above, but in an array rather than an object.

[
  {
    "accounting": [
      {
        "firstName": "Alice",
        "lastName": "Zebra",
        "building": "7a",
        "age": 19
      },
      {
        "firstName": "Bob",
        "lastName": "Young",
        "age": 28
      }
    ]
  },
  {
    "sales": [
      {
        "firstName": "Celia",
        "lastName": "Xi",
        "building": "Jefferson",
        "age": 37
      },
      {
        "firstName": "Jim",
        "lastName": "Galley",
        "age": 46
      }
    ]
  }
]

In this document, the departments are in a specific order, because they are placed in an array rather than in an object.

In the above passage, the key terms to grasp are:

  • Name
  • Value
  • Name-value pairs
  • Object
  • Array

We will cover these in more depth later in this series, but for now just be aware that these names exist, and that understanding them is key to getting to mastery of jq.

Natively Understood By JavaScript

JSON arose from JavaScript’s need for a way to communicate between processes on different hosts in an agreed format. It was established as a standard around the turn of the century, and any JavaScript interpreter now understands JSON out of the box.

Used By Many Languages

JSON is not specific to JavaScript. It was invented for JavaScript, but is now a general-purpose format that is well-supported by many languages.

Here is an example of an interactive Python session parsing a simplified version of the above JSON into a Python dictionary.

$ python3
>>> json_str = '{"sales": [{"name": "Alice"}], "accounting": [{"name": "Bob"}]}'
>>> import json
>>> json_parsed = json.loads(json_str)
>>> json_parsed
{'sales': [{'name': 'Alice'}], 'accounting': [{'name': 'Bob'}]}
>>> type(json_parsed)
<class 'dict'>
>>> json_parsed['sales']
[{'name': 'Alice'}]
>>> json_parsed['sales'][0]
{'name': 'Alice'}
>>> json_parsed['sales'][0]['name']
'Alice'

JSON and YAML

Many engineers today make extensive use of YAML as a configuration language. JSON and YAML express very similar document content, but they look different. YAML is easier for humans to read than JSON, and also allows for comments in its documents.

Technically, JSON can be converted into YAML without any loss of information. But this conversion cannot always go both ways. YAML has a few extra features, such as ‘anchors’ that allow you to reference other items within the same document, which can make converting back to JSON impossible.

JSON Can Be Nested

JSON can have a nested structure. This means that any value within a JSON object or array can have the same structure as the whole document. In other words, every value could itself be a JSON document. So each of the following lines is a valid JSON document:

  {}
  "A string"
  { "A name" : {} }
  { "A name" : [] }

and this one is not valid:

{ {} }

because a JSON object can only contain name-value pairs, and the bare {} inside has no name attached to it.

This one is also not valid:

{ Thing }

because strings in JSON, whether names or values, must always be quoted (in JavaScript object literals you can get away with unquoted names; in JSON you cannot).
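If you already have jq installed, you can check these claims by piping each document through it: jq accepts the valid documents and throws a parse error on the invalid ones. A quick sketch:

$ echo '{ "A name" : {} }' | jq .
{
  "A name": {}
}
$ echo '{ {} }' | jq .     # jq reports a parse error here
$ echo '{ Thing }' | jq .  # and here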

We will go into more detail on name-value pairs in an upcoming post.



What You Learned

  • What JSON is
  • What a JSON object is
  • What a JSON array is

Exercises

1) Read the page https://www.json.org/json-en.html

2) Pick a programming language of your choice and parse a JSON document into it



Monoliths, Microservices and Mainframes – Reflections on Amazon Prime Video’s Monolith Move

Recently an Amazon Prime Video (APV) article about their move from serverless tools to ECS and EC2 did the rounds on all the tech socials. A lot of noise was made about it, initially because it was interpreted as a harbinger of the death of serverless technologies, followed by a second wave that lashed back against that narrative. This second wave argued that what had happened was not a failure of serverless, but rather a standard architectural evolution of an initial serverless microservices implementation to a ‘microservice refactoring’.

This brouhaha got me thinking about why, as an architect, I’ve never truly got onto the serverless boat, and what light this mini-drama throws on that stance. I ended up realising how Amazon and AWS had been at the centre of two computing revolutions that changed the computing paradigm we labour within.

Before I get to that, let’s recap the story so far.

The Story

The APV team had a service which monitored every stream viewed on the platform, and triggered a process to correct poorly-operating streams. This service was built using AWS’s serverless Step Functions and Lambda services, and was never intended to run at high scale.

As the service scaled, two problems were hit which together forced a re-architecture. Account limits were hit on the number of AWS Step Function transitions, and the cost of running the service was prohibitive.

In the article’s own words: ‘The move from a distributed microservices architecture to a monolith application helped achieve higher scale, resilience, and reduce costs. […] We realized that [a] distributed approach wasn’t bringing a lot of benefits in our specific use case, so we packed all of the components into a single process.’

The Reactions

There were more than a few commentators who relished the chance to herald this as the return of the monolith and/or the demise of the microservice. The New Stack led with an emotive ‘Amazon Dumps Microservices’ headline, while David Heinemeier Hansson, as usual, went for the jugular with ‘Even Amazon Can’t Make Sense of Serverless or Microservices’.

After this initial wave of ‘I told you so’ responses, a rearguard action was fought by defenders of serverless approaches to argue that reports of the death of serverless were premature, and that others were misinterpreting the significance of the original article.

Adrian Cockcroft, former AWS VP and well-known proponent of microservices, fired back with ‘So Many Bad Takes – What Is There To Learn From The Prime Video Microservices To Monolith Story’, which argued that the original article did not describe a move from microservice to monolith; rather, it was ‘clearly a microservice refactoring step’, and the team’s evolution from serverless to microservice was a standard architectural pathway called ‘Serverless First’. In other words: nothing to see here, ‘the result isn’t a monolith’.

The Semantics

At this point, the debate has become a matter of semantics: What is a microservice? Looking at various definitions available, the essential unarguable point is that a microservice is ‘owned by a small team’. You can’t have a microservice that requires extensive coordination between teams to build or deploy.

But that can’t be the whole story, as you probably wouldn’t describe a small team that releases a single binary with an embedded database, a web server and a Ruby-on-Rails application as a microservice. A microservice implies that services are ‘fine-grained […] communicating through lightweight protocols’.

There must be some element of component decomposition in a microservice. So what is a component? In the Amazon Prime Video case, you could argue both ways. You could say that the tool is the component, and is a bounded piece of software managed by a small team, or you could say that the detectors and converters are separate components mushed into a now-monolithic application. You could even say that my imagined Ruby-on-Rails monolithic binary above is a microservice if you want to just define a component as something owned by a small team.

And what is an application? A service? A process? And on and on it goes. We can continue deconstructing terms all the way down the stack, and as we do so, we see that whether or not a piece of software is architecturally monolithic or a microservice is more or less a matter of perspective. My idea of a microservice can be the same as your idea of a monolith.

But does all this argumentation over words matter? Maybe not. Let’s ignore the question of what exactly a microservice or a monolith is for now (aside from ‘small team size’) and focus on another aspect of the story.

Easier to Scale?

The second paragraph of AWS’s definition of microservices made me raise my eyebrows:

‘Microservices architectures make applications easier to scale and faster to develop, enabling innovation and accelerating time-to-market for new features.’

Regardless of what microservices were, these were their promised benefits: faster to develop, and easier to scale. What makes the APV story so triggering to those of us who had been told we were dinosaurs is that the original serverless implementation of their tool was ludicrously un-scalable:

We designed our initial solution as a distributed system using serverless components (for example, AWS Step Functions or AWS Lambda), which was a good choice for building the service quickly. In theory, this would allow us to scale each service component independently. However, the way we used some components caused us to hit a hard scaling limit at around 5% of the expected load.

and not just technically un-scalable, but financially too:

Also, the overall cost of all the building blocks was too high to accept the solution at a large scale.

To me, this doesn’t sound like their approach has made it ‘easier to scale’. Some, indeed, saw this coming.

Faster to Develop?

But what about the other benefit, that of being ‘faster to develop’? Adrian Cockcroft’s post talks about this, and lays out this comparison table:

This is where I must protest, starting with the second line which states that ‘traditional’, non-serverless/non-microservices development takes ‘months of work’ compared to the ‘hours of work’ microservices applications take to build.

Anyone who has actually built a serverless system in a real world context will know that it is not always, or even usually, ‘hours of work’. To take one small example of problems that can come up: https://twitter.com/matthewcp/status/1654928007897677824

to which you might add: difficulty of debugging, integration with other services, difficulty of testing scaling scenarios, state management, getting IAM rules right… the list goes on.

You might object to this, and argue that if your business has approved all the cloud provider’s services, and has a standard pattern for deploying them, and your staff is already well versed in the technologies and how to implement them, then yes, you can implement something in a few hours.

But this is where I’m baffled. In an analogous context, I have set up ‘traditional’ three-tier systems in a minimal and scalable way in a similar time-frame. Much of my career has been spent doing just that, and I still do that in my spare time because it’s easier for me for prototyping to do that on a server than wiring together different cloud services.

The supposed development time difference between the two methods is not based on the technology itself, but the context in which you’re deploying it. The argument made by the table is tendentious. It’s based on comparing the worst case for ‘traditional’ application development (months of work) with the best case for ‘rapid development’ (hours of work). Similar arguments can be made for all the table’s comparisons.

The Water We Swim In

Context is everything in these debates. As all the experts point out, there is no architectural magic bullet that fits all use cases. Context is as complex as human existence itself, but here I want to focus on two areas specifically:

  • governance
  • knowledge

The governance context is the set of constraints on your freedom to build and deploy software. In a low-regulation startup these constraints are close to zero. The knowledge context is the degree to which you and your colleagues know how a set of technologies work. It’s assumptions around these contexts that make up the fault lines of most of the serverless debate.

Take this tweet from AWS, which approvingly quotes the CEO of Serverless:

“The great thing about serverless is […] you just have to think about one task, one unit of work”

I can’t speak for other developers, but that’s almost always true for me when I write functions in ‘traditional’ codebases. When I’m doing that, I’m not thinking about IAM rules, how to connect to databases, or the huge application around me. I’m just thinking about this one task, this unit of work. And conversely, if I’m working on a serverless application, I might have to think about all the problems I might run into that I listed above, starting with database connectivity.

You might object that a badly-written three-tier system makes it difficult to write such functions in isolation because of badly-structured monolithic codebases. Maybe so. But microservices architectures can be bad too, and let you ‘think about the one task’ you are doing when you should be thinking about the overall architecture. Maybe your one serverless task is going to cost a ludicrous amount of money (as with APV), or is duplicated elsewhere, or is going to bottleneck another task elsewhere.

Again: The supposed difference between the two methods is not based on the technology itself, but the context in which you’re working. If I’m fully bought into AWS as my platform from a governance and knowledge perspective, then serverless does allow me to focus on just the task I’m doing, because everything else is taken care of.

Here I’d like to bring up a David Foster Wallace parable about fish:

There are these two young fish swimming along and they happen to meet an older fish swimming the other way, who nods at them and says “Morning, boys. How’s the water?” And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes “What the hell is water?”

When you’re developing, you want your context to be like water to a fish: invisible, not in your way, sustaining you. But if I’m not a fish swimming in AWS’s metaphorical water, then I’m likely to splash around a lot if I dive into it.

Most advocates of serverless take it as a base assumption of the discussion that you are fully, maturely, and exclusively bought into cloud technologies, and the hyperscalers’ ecosystems. But for many more people working in software (including our customers), that’s not true, and they are wrestling with what, for them, is still a relatively unfamiliar environment.

A Confession

I want to make a confession. Based on what you’ve read so far, you might surmise I’m someone who doesn’t like the idea of serverless technology. But I’ve spent 23 years so far doing serverless work. Yes, I’m one of those people who claims to have 23 years’ experience in a 15-year-old technology.

In fact, there’s many of us out there. This is because in those days we didn’t call these technologies ‘serverless’ or ‘Lambda’, we called them ‘stored procedures’.

I worked for a company for 15 of those years where the ‘big iron’ database was the water we swam in. We used it for message queues (at such a scale that IBM had to do some pretty nifty work to optimise for our specific use case and give us our own binaries off the main trunk), for our event-driven architectures (using triggers), and as our serverless platform (using stored procedures).

The joy of having a database as the platform was exactly the same then as the joy of having a serverless platform on a hyperscaler now. We didn’t have to provision compute resources for it (DBA’s problem), maintain the operating system (DBA’s problem), or worry about performance (DBA’s problem, mostly). We didn’t have to think about building a huge application, we just had to think about one task, one unit of work. And it took minutes to deploy.

People have drawn similar analogies between serverless and xinetd.

Serverless itself is nothing new. It’s just a name for what you’re doing when you can write code and let someone else manage the runtime environment (the ‘water’) for you. What’s new is the platform you treat as your water. For me 23 years ago, it was the database. Now it’s the cloud platform.

Mainframes, Clouds, Databases, and Lock-In

The other objection to serverless that’s often heard is that it increases your lock-in to the hyperscaler, something that many architects, CIOs, and regulators say they are concerned about. But as a colleague once quipped to me: “Lock-in? We are all locked into x86”, the point being that we’re all swimming in some kind of water, so it’s not about avoiding lock-in, but rather choosing your lock-in wisely.

It was symbolic when Amazon (not AWS) got rid of their last Oracle database in 2019, replacing them with AWS database services. In retrospect, this might be considered the point where businesses started to accept that their core platform had moved from a database to a cloud service provider. A similar inflection point where the mainframe platform was supplanted by commodity servers and PCs might be considered to be July 5, 1994, when Amazon itself was founded. Ironically, then, Amazon heralded both the death of the mainframe, and the birth of its replacement with AWS.

Conclusion

With this context in mind, it seems that the reason I never hopped onto the serverless train is because, to me, it’s not the software paradigm I was ushered into as a young engineer. To me, quickly spinning up a three-tier application is as natural as throwing together an application using S3, DynamoDB, and API Gateway is for those cloud natives that cut their teeth knowing nothing else.

What strikes this old codger most about the Amazon Prime Video article is the sheer irony of serverless’s defenders saying that its lack of scalability is the reason you need to move to a more monolithic architecture. It was serverless’s very scalability and the avoidance of the need to re-architect later that was one of its key original selling points!

But when three-tier architectures started becoming popular I’m sure the mainframers of the past said the same thing: “What’s the point of building software on commodity hardware, when it’ll end up on the mainframe?” Maybe they even leapt on articles describing how big businesses were moving their software back to the mainframe, having failed to make commodity servers work for them, and joyously proclaimed that rumours of the death of the mainframe were greatly exaggerated.

And in a way, maybe they were right. Amazon killed the physical mainframe, then killed the database mainframe, then created the cloud mainframe. Long live the monolith!


This article was originally published on Container Solutions’ blog and is reproduced here by permission.




Is it Imperative to be Declarative?

Recently, in Container Solutions’ engineering Slack channel, a heated argument ensued amongst our engineers after a Pulumi-related story was posted. I won’t recount the hundreds of posts in the thread, but the first response was “I still don’t know why we still use Terraform”, followed by a still-unresolved ping-pong debate about whether Pulumi is declarative or imperative, followed by another debate about whether any of this imperative vs declarative stuff really matters at all, and why can’t we just use Pulumi please?

This article is my attempt to calmly lay out some of the issues (my first draft said ‘prove that I was right and everyone else was wrong’) and help you understand both what’s going on and how to respond to your advantage when someone says your favoured tool is not declarative and therefore verboten.

What does declarative mean, exactly?

Answering this question is harder than it appears, as the formal use of the term can vary from the informal use within the industry. So we need to unpick first the formal definition, then look at how the term is used in practice.

The formal definition

Let’s start with the Wikipedia definition of declarative:

“In computer science, declarative programming is a programming paradigm—a style of building the structure and elements of computer programs—that expresses the logic of a computation without describing its control flow.”

This can be reduced to:

“Declarative programming expresses the logic of a computation without describing its control flow.”

This immediately begs the question: ‘what is control flow?’ Back to Wikipedia:

“In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. Within an imperative programming language, a control flow statement is a statement that results in a choice being made as to which of two or more paths to follow.”

This can be reduced to:

“Imperative programs make a choice about what code is to be run.”

According to Wikipedia, examples of control flow include if statements, loops, and indeed any other construct that changes which statement is to be performed next (e.g. jumps, subroutines, coroutines, continuations, halts).

Informal usage and definitions

In debates around tooling, people rarely stick closely to the formal definitions of declarative and imperative code. The most commonly heard informal definition is: “Declarative code tells you what to do, imperative code says how to do it”. It sounds definitive, but discussion about it quickly devolves into definitions of what ‘what’ means and what ‘how’ means.

Any program tells you ‘what’ to do, so that’s potentially misleading, but one interpretation of that is that it describes the state you want to achieve.

For example, by that definition, is this pseudo-code declarative or imperative?

if exists(ec2_instance_1):
  create(ec2_instance_2)
create(ec2_instance_1)

Firstly, strictly speaking, it’s definitely not declarative according to a formal definition, as the second line may or may not run, so there’s control flow there.

It’s definitely not idempotent, as running once does not necessarily result in the same outcome as running twice. But an argument put to me was: “The outcome does not change because someone presses the button multiple times”, some sort of ‘eventually idempotent’ concept. Indeed, a later clarification was: “Declarative means for me: state eventually consistent”.

It’s not just engineers in the field who don’t cling to the formal definition. This Jenkinsfile documentation describes the use of conditional constructs whilst calling itself declarative.

So far we can say that:

  • The formal definitions of imperative vs declarative are pretty clear
  • In practice and general discussion, people get a bit confused about what it means and/or don’t care about the formal definition

Are there degrees of declarativeness?

In theory, no. In practice, yes. Let me explain.

What is the most declarative programming language you can think of? Whichever one it is, it’s likely that either there is a way to make it (technically) imperative, or it is often described as “not a programming language”.

HTML is so declarative that a) people often deride it as “not a programming language at all”, and b) we had to create the JavaScript monster and the SCRIPT tag to ‘escape’ it and make it useful for more than just markup. This applies to all pure markup languages. Another oft-cited example is Prolog, which has loops, conditions, and a halt command, so is technically not declarative at all.

SQL is to many a canonical declarative language: you describe what data you want, and the database management system (DBMS) determines how that data is retrieved. But even with SQL you can construct conditionals:

-- runs the insert only if a matching row exists in table2
insert into table1
select 'some value'
where exists (
  select 1
  from table2
  where table2.column1 = 'some value'
)


The insert to table1 will only run conditionally, i.e. if there’s a row in table2 whose column1 matches the text ‘some value’. You might think that this is a contrived example, and I won’t disagree. But in a way this backs up my central argument: whatever the technical definition of declarative is, the difference between most languages in this respect is how easy or natural it is to turn them into imperative languages.

Now consider this YAML, yanked from the internet:

job:
  script: "echo Hello, Rules!"
  rules:
    - if: '$CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "master"'
      when: always
    - if: '$VAR =~ /pattern/'
      when: manual
    - when: on_success

This is clearly, in effect, imperative code. It runs in order from top to bottom, and has conditionals. It can run different instructions at different times, depending on the context it is run in. However, YAML itself is still declarative. And because YAML is declarative, we have the hell of Helm, kustomize, and the various devops pipeline languages that claim to be declarative (but clearly aren’t) to deal with, because we need imperative, dynamic, conditional, branching ways to express what we want to happen.

It’s this tension between the declarative nature of the core tool and our immediate needs to solve problems that creates the perverse outcomes we hate so much as engineers, where we want to ‘break out’ of the declarative tool in order to get things done in the way we want them done.

Terraform and Pulumi

Which brings us neatly to the original subject of the Slack discussion we had at Container Solutions.

Anyone who has used Terraform for any length of time in the field has probably gone through two phases. First, they marvel at how the declarative nature of it makes it in many ways easier to maintain and reason about. And second, after some time using it, and as complexity in the use case builds and builds, they increasingly wish they could have access to imperative constructs.

It wasn’t long before HashiCorp responded to these demands and introduced the ‘count’ meta-argument, which effectively gave us a kind of loop concept, and hideous bits of code like this abound to give us if statements by the back door:

count = var.something_to_do ? 1 : 0

There are also for and for_each constructs, and the local-exec provisioner, which allows you to escape any declarative shackles completely and just drop to the (decidedly non-declarative) shell once the resource is provisioned.

It’s often argued that Pulumi is not declarative, and despite protestations to the contrary, if you are using it for its main selling point (that you can use your preferred imperative language to declare your desired state), then Pulumi is effectively an imperative tool. If you talk to the declarative engine under Pulumi’s hood in YAML, then you are declarative all the way down (and more declarative than Terraform, for sure).

The point here is that not being purely declarative is no bad thing, as it may be that your use case demands a more imperative language to generate a state representation. Under the hood, that state representation describes the ‘what’ you want to do, and the Pulumi engine figures out how to achieve that for you.
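You can see this split in Pulumi’s own workflow. A sketch, assuming you already have a Pulumi project and stack configured:

$ pulumi preview   # runs your (imperative) program to compute the desired state,
                   # then shows the diff the engine would apply
$ pulumi up        # has the engine apply those changes to reach the declared state

The imperative code is confined to generating the state representation; the reconciliation against reality stays declarative.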

Some of us at Container Solutions worked some years ago at a major institution that built a large-scale project in Terraform. For various reasons, Terraform was ditched in favour of a Python-based boto3 solution, and one of those reasons was that the restrictions of a more declarative language produced more friction than the benefits gained. In other words, more control over the flow was needed. It may be that Pulumi was the tool we needed: a ‘Goldilocks’ tool that was the right blend of imperative and declarative for the job at hand. It could have saved us writing a lot of boto3 code, for sure.

How to respond to ‘but it’s not declarative!’ arguments

Hopefully reading this article has helped clarify the fog around declarative vs imperative arguments. First, we can recognise that purely declarative languages are rare, and even those that exist are often contorted into effectively imperative tooling. Second, the difference between these tools is how easy or natural they make that contortion.

There are good reasons to make it difficult for people to be imperative. Setting up simple Kubernetes clusters, for example, can be a repeatable and portable process precisely because of Kubernetes’ declarative configuration. When things get more complex, you have to reach for tools like Helm and kustomize, which may make you feel like your life has been made more difficult.
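Kubernetes’ command line shows both sides of this line neatly. A sketch, assuming a deployment.yaml manifest exists in the current directory:

$ kubectl create -f deployment.yaml   # first run: creates the resources
$ kubectl create -f deployment.yaml   # second run: fails with an 'AlreadyExists' error
$ kubectl apply -f deployment.yaml    # converges on the state described in the file
$ kubectl apply -f deployment.yaml    # safe to repeat: reports the resources as unchanged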

With this more nuanced understanding, next time someone uses the “but it’s not declarative” argument to shut you down, you can tell them two things: that that statement is not enough to win the debate; and that their suggested alternative is likely either not declarative, or not useful. The important question is not: “Is it declarative?” but rather: “How declarative do we need it to be?”


This article was originally published on Container Solutions’ blog and is reproduced here by permission.


If you like this, you might like one of my books:

  • Learn Bash the Hard Way
  • Learn Git the Hard Way
  • Learn Terraform the Hard Way

Buy in a bundle here

If you enjoyed this, then please consider buying me a coffee to encourage me to do more.

The Biggest Cloud Native Strategy Mistake

Business strategy is very easy to get wrong. You’re trying to make sure your resources and assets are efficiently deployed and focussed on your end goal, and that’s hard. There’s no magic bullet that can help you both get the right strategy defined, and then successfully deliver on it, but there are many resources we’ve found that can help reduce the risk of failure. 

Under the heading ‘The Future of Cloud’, Gartner recently ran a symposium for CIOs and IT executives including much discussion about strategies relating to Cloud and Cloud Native trends. At least two of the main talks (one, two) were centred around a five-year horizon, discussing where cloud adoption will be in 2027 compared to where it is now.

As part of those talks, Gartner referenced a useful pyramid-shaped visualisation of different stages of cloud adoption. It could be viewed as a more schematic version of our Maturity Matrix, which we use as part of our Cloud Native Assessments with clients. 

In this article, we’re going to use the Gartner visualisation to talk about one of the biggest mistakes made in Cloud Native transformation strategies.

The pyramid

Gartner’s pyramid depicts four stages of cloud adoption from the perspective of business scope. These stages are shown as a hierarchy where the bottom layer represents the lowest dependency (“Technology Disruptor”) and the top layer represents the highest level business goal (“Business Disruptor”).

The four stages can be briefly summarised as:

  • Cloud As Technology Disruptor
    • The new base technology is adopted. For example, containerised applications, or a move to using a cloud service provider instead of a data centre.
  • Cloud As Capability Enabler
    • Now that you have the new technology in place, you can more easily build capabilities that may have been more difficult to achieve before, such as automated testing, or CI/CD.
  • Cloud As Innovation Facilitator
    • With new capabilities, you have the right environment to foster innovation in your business. This means you might, for example, leverage your cloud platform to deliver features more quickly, or conduct A/B testing of new features to maximise your return on investment.
  • Cloud As Business Disruptor
    • The most advanced stage of cloud adoption, where you can use the previous three stages’ outputs to change your business model by, for example, migrating to a SaaS model, or scaling your client base significantly, or introducing an entirely new product line.

Whilst it is somewhat higher level, this pyramid is similar to our Maturity Matrix in that it helps give you a common visual reference point for a comprehensible and tangible view of both where you are, and where you are trying to get to, in your Cloud Native program. For example, it can help in discussions with technologists to ask them how the changes they are planning relate to stage four. Similarly, when talking to senior leaders about stage four, it can help to clarify whether they and their organisation have thought about the various dependencies below their goal and how they relate to each other.

It can also help you avoid the biggest Cloud Native strategy mistake.

The big mistake

The biggest anti-pattern we see when consulting on Cloud Native strategy is to conflate all four phases of the pyramid into one monolithic entity. This means that participants in strategic discussions treat all four stages as a single stage, and make their plans based on that.

This anti-pattern can be seen at both ends of the organisational spectrum. Technologists, for example, might focus on the technical challenges, and are often minded to consider cloud strategy as simply a matter of technology adoption, or even just technology choice and installation. Similarly, business leaders often see a successful Cloud Native transformation as starting and stopping with a single discrete technical program of work rather than an overlapping set of capabilities that the business needs to build in its own context.

This monolithic strategy also conflates the goals of the strategy with the adoption plan, which in turn can lead to a tacit assumption that the whole program should be outlined in a single static and unchanging document.

For example, a business might document that their ‘move to the cloud’ is being pursued in order to transition their product from a customer installation model to a more scalable SaaS model. This would be the high-level vision for the program, the ‘level four’ of the pyramid. In the same document, there might be a roadmap which sets out how the other three levels will be implemented. For example, it might outline which cloud service provider will be used, which of those cloud service provider’s services will be consumed, which technology will be used as an application platform, and what technologies will be used for continuous integration and delivery.

This mixing of the high-level vision with the adoption plan risks them being treated as a single task to be completed. In reality, the vision and adoption plan should be separated, as while it is important to have clarity and consistency of vision, the adoption plan can change significantly as the other three levels of the pyramid are worked through, and this should be acknowledged as part of the overall strategy. At Container Solutions we call this ‘dynamic strategy’: a recognition that the adoption plan can be iterative and change as the particular needs and capabilities of your business interact with the different stages.

The interacting stages and ‘organisational indigestion’

Let’s dig a little bit deeper into each phase.

In the first ‘Technology Disruptor’ phase, there is uncertainty about how fast the technology teams can adopt new technologies. This can depend on numerous local factors such as the level of experience and knowledge of these technologies among your teams, their willingness to take risks to deliver (or even deliver at all), and external blocks on delivery (such as security or testing concerns). It should also be said that whilst skills shortages are often cited as blocking new technology adoption, it is no longer practical to think of skills as a fixed thing that is hired as part of building a team to run a project based on a specific technology. Rather, technology skills need to be continuously developed by teams of developers exploring new technologies as they emerge and mature. To support this, leaders need to foster a “learning organisation” culture, where new ideas are explored and shared routinely.

The second ‘Capability Enabler’ phase has a basic dependency on the ‘Technology Disruptor’ phase. If those dependencies are not managed well, then organisational challenges may result. For example, whilst CI/CD capabilities can be built independently of the underlying technology, their final form will be determined by their technological enablers. A large-scale effort to implement Jenkins pipelines across an organisation may have to be scrapped and reworked if the business decides that AWS-native services should be used, and therefore that the mandated tool for CI is AWS CodePipeline. This conflict between the ‘Technology Disruptor’ phase (the preference for AWS-native services) and the ‘Capability Enabler’ phase can be seen as ‘organisational indigestion’ that causes wasted time and effort as contradictions in execution are worked out.

The third ‘Innovation Facilitator’ phase is also dependent on the lower phases, as an innovation-enabling cloud platform is built for the business. Such a platform (or platforms) cannot be built without the core capabilities being enabled through the lower phases.

In practice, the three base phases can significantly overlap with one another, and could theoretically be built in parallel. However, ignoring the separability of the phases can result in the ‘organisational indigestion’ mentioned above, as the higher phases need to be revisited if the lower phases change. To give another simple example: if a business starts building a deployment platform on AWS CodeDeploy, that platform would need to be scrapped if a lower-level decision is made to use Kubernetes services on Google Cloud.

The wasted effort and noise caused by this ‘organisational indigestion’ can be better understood and managed through the four phases model.

Treating Cloud Native strategy adoption as a single static monolith can also lead a business to downplay or ignore the organisational challenges that lie ahead. For example, implementing a Cloud Native approach to automated testing could be a straightforward matter of getting engineers to write tests that previously didn’t exist, or it could equally be a more protracted and difficult process of retraining a manual testing team to program automated tests.

Finally, the monolithic approach can lead to a collective belief that the project can be completed in a relatively short period of time. What’s a reasonable length of time? It’s worth remembering that Netflix, the reference project for a Cloud Native transformation, took seven years to fully move from their data centre to AWS. And Netflix had several things in their favour that made their transformation easier to implement: a clear business need (they could not scale fast enough and were suffering outages); a much simpler cloud ecosystem; a clarity of product (video streaming) that made success easy to define; and a lack of decades of legacy software to maintain while they were doing it.

What to do about it?

We’ve outlined some of the dangers that not being aware of the four stages can bring, so what can you do to protect yourself against them?

Be clear about your path on the pyramid – optimisation or transformation?

The first thing is to ensure you have clarity around what the high-level vision and end goals for the transformation are. Gartner encapsulates this in a train map metaphor, to prompt the question of what your journey is envisaged to be. The ‘Replacement’ path, which goes across the first ‘Technology Disruptor’ phase, can also encompass the classic ‘Lift and Shift’ journey; the ‘Cloud Native’ path might cross both the first and second phases; and the ‘Business Transformation’ journey can cross all four phases.

The ‘east-west’ journeys can be characterised as ‘optimisation’ journeys, while the ‘south-north’ journeys can be characterised as ‘transformation’ journeys.

If the desired journey is unclear, then there can be significant confusion between the various parties involved about what is being worked towards. For example, executives driving the transformation may see a ‘Replacement’ approach as sufficient to make a transformation and therefore represent a journey up the pyramid, whilst those more technologically minded will immediately see that such a journey is an ‘optimisation’ one going across the first phase.


This advice is summarised as the vision first Cloud Native pattern, with executive commitment also being relevant.

Vision fixed, adoption path dynamic

A monolithic strategy that encompasses both vision and adoption can result in a misplaced faith in some parties that the plan is clear, static, and linearly achievable. This faith can founder when faced with the reality of trying to move an organisation across the different phases.

Each organisation is unique, and as it works through the phases the organisation itself changes as it builds its Cloud Native capabilities. This can have a recursive effect on the whole program as the different phases interact with each other and these changing capabilities.

You can help protect against the risk of a monolithic plan by separating your high-level vision from any adoption plan. Where the vision describes why the project is being undertaken and should be less subject to change, the adoption plan (or plans) describes how it should be done, and is more tactical and subject to change. In other words, adoption should follow the dynamic strategy pattern.

Start small and be patient

Given the need for a dynamic strategy, it’s important to remember that if you’re working on a transformation, you’re building an organisational capability rather than doing a simple installation or migration. Since organisational capability can’t be simply transplanted or bought in in a monolithic way, it’s advisable to follow the gradually raising the stakes pattern. This pattern advocates for exploratory experiments in order to cheaply build organisational knowledge and experience before raising the stakes. This ultimately leads up to commitment to the final big bet, but by this point risk of failure will have been reduced by the learnings gained from the earlier, cheaper bets.

As we’ve seen from the Netflix example, it can take a long time even for an organisation less encumbered by a long legacy to deliver on a Cloud Native vision. Patience is key, and a similar approach to organisational learning needs to be taken into account as you gradually onboard teams onto any cloud platform or service you create or curate.

Effective feedback loop

Since the strategy should be dynamic and organisational learning needs to be prioritised, it’s important that an effective and efficient feedback loop is created between all parties involved in the transformation process. This is harder to achieve than it might sound, as there is a ‘Goldilocks effect’ in any feedback loop: too much noise, and leaders get frustrated with the level of detail; too little, and middle management can get frustrated as the reality of delivering on the vision outlined by the leaders hits constraints from within and outside the project team. Similarly, those on the ground can get frustrated by either the perceived bureaucratic overhead of attending multiple meetings to explain and align decisions across departments, or the ‘organisational indigestion’ mentioned above when decisions at different levels conflict with each other and work must be scrapped or re-done.

Using the pyramid

The pyramid is an easily-understood way to visualise the different stages of cloud transformation. This can help align the various parties’ conception of what’s ahead and avoid the most often-seen strategic mistake when undergoing a transformation: the simplification of all stages into one static and pre-planned programme.

Cloud transformation is a complex and dynamic process. Whilst the vision and goals should not be subject to change, the adoption plan is likely to be, as you learn how the changes you make to your technology expose further changes that need to be made to the business to support and maximise the benefits gained. It is therefore vital to separate the high-level goals of your transformation from the implementation detail, and to ensure there is an effective feedback loop.

Through all this complexity, the pyramid can help you both conceptualise the vision for your transformation and define and refine the plan for adoption, allowing you to easily connect the more static high level goals to the details of delivery.

This article was originally published on Container Solutions’ blog and is reproduced here by permission.



Practical Strategies for Implementing DevSecOps in Large Enterprises

At Container Solutions, we often work with large enterprises that are at various stages of adopting cloud technologies. These companies are typically keen to adopt modern Cloud Native software working practices and technologies as itemised in our Maturity Matrix, so they come to us for help, knowing that we’ve been through many of these transformation processes before.

Financial services companies are especially keen to adopt DevSecOps, as the benefits to them are obvious given their regulatory constraints and security requirements. This article will focus on a common successful pattern of adoption for getting DevSecOps into large-scale enterprises that have these kinds of constraints on change.

DevSecOps and institutional inertia

The first common misconception about implementing DevSecOps is that it is primarily a technical challenge but, as we’ve explored on WTF before, it is at least as much about enabling effective communication. Whilst we have engineering skills in cutting-edge tooling and cloud services, there is little value in delivering a nifty technical solution if the business it’s delivered for is unable or unwilling to use it. If you read technical blog posts on the implementation of DevSecOps, you might be forgiven for thinking that the only things that matter are the tooling you choose, and how well you write and manage the software that is built on this tooling.

For organisations that were ‘born in the cloud’, where everyone is an engineer and has little legacy organisational scar tissue to consider, this could indeed be true. In such places, where the general approach to DevSecOps is well-grasped and agreed on by all parties, the only things to be fought over are indeed questions of tooling. This might be one reason why such debates take on an almost religious fervour.

The reality for larger enterprises that aren’t born in the cloud is that there are typically significant areas of institutional inertia to overcome. These include (but are not limited to):

  • The ‘frozen middle’
  • Siloed teams that have limited capability in new technologies and processes
  • Internal policies and processes designed for the existing ways of working

Prerequisites for success

Before outlining the pattern for success, it’s worth pointing out two critical prerequisites for enterprise change management success in moving to DevSecOps. As an aside, these prerequisites are not just applicable to DevSecOps but apply to most change initiatives.

The first is that the vision to move to a Cloud Native way of working must be clearly articulated to those tasked with delivering on it. The second is that the management who articulate the vision must have ‘bought into’ the change. This doesn’t mean they just give orders and timelines and then retreat to their offices; it means that they must back up the effort when needed with carrots, sticks, and direction when those under them are unsure how to proceed. If those at the top are not committed in this way, then those under them will certainly not push through and support the changes needed to make DevSecOps a success.

A three-phase approach

At Container Solutions we have found success in implementing DevSecOps in these contexts by taking a three-phase approach:

  1. Introduce tooling
  2. Baseline adoption
  3. Evolve towards an ideal DevSecOps practice

The alternative this approach is usually put up against is ‘build it right first time’, where everything is conceived and delivered in one “big bang” style implementation.

  1. Introduce tooling

In this phase you correlate the security team’s (probably manual) process with the automation tooling you have chosen, and determine their level of capability for automation. At this point you are not concerned with how closely the work being done now matches the end state you would like to reach. Indeed, you may need to compromise against your ideal state. For example, you might skip writing a full suite of tests for your policies.

The point of this initial phase is to create alignment on the technical direction between the different parties involved as quickly and effectively as possible. To repeat: this is a deliberate choice to prioritise alignment over technical purity, or speed of delivery of the whole endeavour.

The security team is often siloed from both the development and cloud transformation teams. This means that they will need to be persuaded, won over, trained, and coached to self-sufficiency.

Providing training to the staff at this point can greatly assist the process of adoption by emphasising the business’s commitment to the endeavour and setting a minimum baseline of knowledge for the security team. If the training takes place alongside practical implementation of the new skills learned, it makes it far more likely that the right value will be extracted from the training for the business.

The output of this phase should be that:

  • Security staff are comfortable with (at least some of) the new tooling
  • Staff are enthused about the possibilities offered by DevSecOps, and see its value
  • Staff want to continue and extend the efforts towards DevSecOps adoption

  2. Get to baseline adoption

Once you have gathered the information about the existing processes, the next step is to automate them as far as possible without disrupting the way the team works too much. For example, if security policy adherence is checked manually in a spreadsheet by the security team (not an uncommon occurrence), those steps can be replaced by automation. Tools that might be used for this include some combination of pipelines, Terraform, InSpec, and so on. The key point is to start to deliver benefits for the security team quickly and help them see that this will make them more productive and (most importantly of all) increase the level of confidence they have in their security process.
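To give a flavour of what replacing a spreadsheet check can look like, here is a minimal sketch of a pipeline step running a compliance profile with InSpec (the profile path and output filename here are hypothetical; the inspec exec command and its --reporter option are real):

# Run the checks, showing results to humans (cli) and saving them for the pipeline (json):
inspec exec ./profiles/security-baseline --reporter cli json:results.json

A later pipeline step can then archive or gate on results.json, replacing the manual tick-box exercise.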

Again, the goal for this stage is to level up the capabilities of the security team so that the move towards DevSecOps is more self-sustaining rather than imposed from outside. This is the priority over speed of delivery. In practical terms, this means that it is vital to offer both pairing (to elevate knowledge) and support (when things go wrong) from the start to maintain goodwill towards the effort. The aim is to spread and elevate the knowledge as far across the department as possible. 

Keep in mind, though, that knowledge transfer will likely slow down implementation. This means that it is key to ensure you regularly report to stakeholders on progress regarding both policy deployment and policy outputs, as this will help sustain the momentum for the effort.

Key points:

  • Report on progress as you go
  • Provide (and focus on) help and support for the people who will maintain this in future
  • Where you can, prioritise spreading knowledge far and wide over delivering quickly

Once you have reached baseline adoption, you should be at a ‘point of no return’ which allows you to push on to move to your ideal target state.

  3. Evolve to pure DevSecOps

Now that you have brought the parties on-side and demonstrated progress, you can start to move towards your ideal state. This begs the question of what that ideal state is, but we’re not going to exhaustively cover that here as that’s not the focus. Suffice it to say that security needs to be baked into every step of the overall development life cycle and owned by the development and operations teams as much as it is by the security team.

Some of the areas you would want to work on from here include:

  • Introducing/cementing separation of duties
  • Setting up tests on the various compliance tools used in the SDLC
  • Approval automation
  • Automation of policy tests’ efficacy and correctness
  • Compliance as code

These areas, if tackled too early, can bloat your effort to the point where the business sees it as too difficult or expensive to achieve. This is why it’s important to tackle the areas that maximise the likelihood of adoption of tooling and principles in the early stages.

Once all these things are coming together, you will naturally start to turn to the organisational changes necessary to get you to a ‘pure DevSecOps’ position, where development teams and security teams are working together seamlessly.

Conclusion

Like all formulas for business and technological change, this three-phase approach to introducing DevSecOps can’t be applied in exactly the same way in every situation. However, we’ve found in practice that the basic shape of the approach is very likely to be a successful one, assuming the necessary prerequisites are in place.

Building DevSecOps adoption in your business is not just about speed of delivery, it’s about making steady progress whilst setting your organisation up for success. To do this you need to make sure you are building capabilities and not just code.


This article was originally published on Container Solutions’ blog and is reproduced here by permission.




A Little Shell Rabbit Hole

Occasionally I run dumb stuff in the terminal. Sometimes something unexpected happens and it leads me to wonder ‘how the hell did that work?’

This article is about one of those times and how looking into something like that taught me a few new things about shells. After decades using shells, they still force me to think!

The tl;dr is at the end if you don’t want to join me down this rabbit hole…

The Dumb Thing I Ran

The dumb thing I occasionally ran was:

grep .* *

If you’re experienced in the shell you’ll immediately know why this is dumb. For everyone else, here are some reasons:

  • The first argument to grep should always be a quoted string – without quotes, the shell treats the .* as a glob, not a regexp
  • grep .* just matches every line, so…
  • you could just get almost the same output by running cat *

Not Quite So Dumb

Actually, it’s not quite as dumb as I’ve made out. Let me explain.

In the bash shell, ‘.*’ (unquoted) is a glob matching all the files beginning with the dot character. So the ‘grep .* *’ command above, interpreted in this (example) context:

$ ls -a1
.    ..    .adotfile    file1   file2

would be interpreted as the following command (prefixing it with echo makes the shell print the expansion):

$ echo grep .* *
grep . .. .adotfile file1 file2

The .* gets expanded by the shell as a glob to all file or folders beginning with the literal dot character.

Now, remember, every folder contains at least two folders:

  • The dot folder (.), which represents itself.
  • The double-dot folder (..), which represents the parent folder

So these get added to the command:

grep . ..

Followed by any other file or folder beginning with a dot. In the example above, that’s .adotfile.

grep . .. .adotfile

And finally, the ‘*‘ at the end of the line expands to all of the files in the folder that don’t begin with a dot, resulting in:

grep . .. .adotfile file1 file2

So, the regular expression that grep takes becomes simply the dot character (which matches any line with a single character in it), and the files it searches are the remaining items in the file list:

..
.adotfile
file1
file2

Since one of those is a folder (..), grep complains that:

grep: ..: Is a directory

before going on to match any lines with any characters in. The end result is that empty lines are ignored, but every other line is printed on the terminal.

Another reason why the command isn’t so dumb (and another way it differs from ‘cat *’) is that because multiple files are passed into grep, it reports on the filename, meaning each line of output is automatically prefixed with the file it came from.

bash-5.1$ grep .* *
grep: ..: Is a directory
.adotfile:content in a dotfile
file1:a line in file1
file2:a line in file2

Strangely, for two decades I hadn’t noticed that this is a very roundabout and wrong-headed (ie dumb) way to go about things, nor had I thought about its output being different from what I might have expected; it just never came up. Running ‘grep .* *‘ was probably a bad habit I picked up when I was a shell newbie last century, and since then I never needed to think about why I did it, or even what it did until…

Why It Made Me Think

The reason I had to think about it was that I started to use zsh as my default shell on my Mac. Let’s look at the difference with some commands you can try:

bash-5.1$ mkdir rh && cd rh
bash-5.1$ cat > afile << EOF
text
EOF
bash-5.1$ bash
bash-5.1$ grep .* afile
grep: ..: Is a directory
afile:text
bash-5.1$ zsh 
zsh$ grep .* afile
zsh:1: no matches found: .*

For years I’d been happily using grep .* but suddenly it was telling me there were no matches. After scratching my head for a short while, I realised that of course I should have quotes around the regexp, as described above.

But I was still left with a question: why did it work in bash, and not zsh?

Google It?

I wasn’t sure where to start, so I googled it. But what to search for? I tried various combinations of ‘grep in bash vs zsh‘, ‘grep without quotes bash zsh‘, and so on. While there was some discussion of the differences between bash and zsh, there was nothing which addressed the challenge directly.

Options?

Since Google wasn’t helping me, I looked for shell options that might be relevant. Maybe bash or zsh had a default option that made them behave differently from one another?

In bash, a quick look at the options did not reveal many promising candidates, except for maybe noglob:

bash-5.1$ set -o | grep glob
noglob off
bash-5.1$ set -o noglob
bash-5.1$ set -o | grep glob
noglob on
bash-5.1$ grep .* *
grep: *: No such file or directory

But this is different from zsh‘s output. What noglob does is completely prevent the shell from expanding globs. Both the ‘.*’ and the trailing ‘*’ are passed to grep untouched, so grep treats the literal ‘*’ as a filename and complains, since there is no file actually named ‘*’ in this folder.

And for zsh? Well, it turns out there are a lot of options in zsh…

zsh% set -o | wc -l
185

Even just limiting to those options with glob in them doesn’t immediately hit the jackpot:

zsh% set -o | grep glob
nobareglobqual        off
nocaseglob            off
cshnullglob           off
extendedglob          off
noglob                off
noglobalexport        off
noglobalrcs           off
globassign            off
globcomplete          off
globdots              off
globstarshort         off
globsubst             off
kshglob               off
nullglob              off
numericglobsort       off
shglob                off
warncreateglobal      off

While noglob does the same as in bash, after some research I found that the remainder are not relevant to this question.

(Trying to find this out, though, is tricky. First, zsh‘s man page is not all in one place like bash‘s: it’s divided into multiple man pages. Second, concatenating all the zsh man pages with man zshall and searching for noglob gets no matches. It turns out that options are documented in caps with underscores separating words. So, in noglob‘s case, you have to search for NO_GLOB. Annoying.)

zsh with xtrace?

Next I wondered whether this was due to some kind of startup problem with my zsh setup, so I tried starting up zsh with the xtrace option to see what’s run on startup. But the output was overwhelming, with over 13,000 lines pushed to the terminal:

bash-5.1$ zsh -x 2> out
zsh$ exit
bash-5.1$ wc -l out
13328

I did look anyway, but nothing looked suspicious.

zsh with NO_RCS?

Back to the documentation, and I found a way to start zsh without any startup files by starting with the NO_RCS option.

bash-5.1$ zsh -o NO_RCS
zsh$ grep .* afile
zsh:1: no matches found: .*

There was no change in behaviour, so it wasn’t anything funky I was doing in the startup.

At this point I tried using the xtrace option, but then re-ran it in a different folder by accident:

zsh$ set -o xtrace
zsh$ grep .* *
zsh: no matches found: .*
zsh$ cd ~/somewhere/else
zsh$ grep .* *
+zsh:3> grep .created_date notes.asciidoc

Interesting! The original folder I created to test the grep just threw an error (no matches found), but when there is a dotfile in the folder, it actually runs something… and what it runs does not include the dot folder (.) or the parent folder (..).

Instead, the ‘grep .* *‘ command expands the ‘.*‘ into all the files that begin with a dot character. For this folder, that is one file (.created_date), in contrast to bash, where it is three (. .. .created_date). So… back to the man pages…

tl;dr

After another delve into the man page, I found the relevant section in man zshall that gave me my answer:

FILENAME GENERATION

[...]

In filename generation, the character '/' must be matched explicitly; also, a '.' must be matched explicitly at the beginning of a pattern or after a '/', unless the GLOB_DOTS option is set. No filename generation pattern matches the files '.' or '..'. In other instances of pattern matching, the '/' and '.' are not treated specially.

So, it was as simple as: zsh ignores the ‘.‘ and ‘..‘ files.
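That last sentence of the extract is easy to verify. In a folder containing just .adotfile and file1, for example, you should see something like this:

zsh$ echo .*          # the leading dot is matched explicitly, but . and .. never appear
.adotfile
zsh$ setopt globdots
zsh$ echo *           # with GLOB_DOTS set, '*' now matches dotfiles too...
.adotfile file1
zsh$ echo .*          # ...but still not . or .. – no pattern ever matches them
.adotfile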

But Why?

But I still don’t know why it does that. I assume it’s because the zsh designers felt that that wrinkle was annoying, and wanted to ignore those two folders completely. It’s interesting that there does not seem to be an option to change this behaviour in zsh.

Does anyone know?

