Container Networking is Hard Enough…
To those not versed in the dark networking arts, one of the mysteries of OpenShift (Red Hat's wrapper around Kubernetes) is how a pod communicates with the outside world.
This article is more about DNS on clusters, but the point is the same: things can get pretty complicated pretty quickly.
Let’s Add DNSMasq…
Recently I was grappling with this while debugging a Vagrant OpenShift cluster test suite, when someone smarter than me took the time to explain what was happening.
I wasn’t sure I’d got all the details, so I put together these diagrams to help me follow.
External DNS Lookup
Here’s the ‘simple’ case of a single-container pod pinging google.com:
The steps can be described linearly as:
- Process starts in container, and needs to know what google.com resolves to.
- Process looks up /etc/resolv.conf to see where DNS queries should be resolved.
- Process asks the DNS server at 10.0.2.15 on port 53 for google.com's IP address (an illustrative lookup is sketched after this list).
- DNSMasq determines that this is a query that needs to go to the outside world, so it passes the query on.
- In this particular setup, it passes the lookup out via DNSMasq's configured DNS exit point, which is eth1.
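To see where queries are actually going from the container's point of view, something like the following shows the resolver being used. This is an illustrative sketch using the IPs from this Vagrant setup, not output captured from it, and the answer section is elided:

$ nslookup google.com
Server:    10.0.2.15
Address:   10.0.2.15#53
...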
Even here I'm skirting over a lot. The ping process can also refer to /etc/host.conf, /etc/nsswitch.conf, /etc/gai.conf, and /etc/hosts, for example. And I use landrush to manage host lookups for my VMs (between the VMs and to/from the host).
In these diagrams I don't show the whole cluster; everything shown is happening on one node.
Also, the IP addresses for eth0 are the standard Vagrant-allocated IPs.
resolv.conf
In OpenShift, the resolv.conf file in the container is constructed by taking the resolv.conf from the host operating system and then placing a nameserver entry above the host's entries (this nameserver can be set in your node.yaml file).
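A minimal sketch of what that construction might produce, assuming the host has a single upstream nameserver (10.0.2.3 is a common VirtualBox NAT DNS address, but the entries and search domains depend entirely on your setup and namespace):

[root@master1 ~]# cat /etc/resolv.conf          # on the host
nameserver 10.0.2.3

$ cat /etc/resolv.conf                          # inside the container
nameserver 10.0.2.15
nameserver 10.0.2.3
search default.svc.cluster.local svc.cluster.local cluster.local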
DNSmasq
By default, this nameserver points to your host IP (10.0.2.15 in this Vagrant setup), which expects a DNS resolver (typically a dnsmasq server) to be sitting on that IP's port 53. If no value is set, it defaults to the kubernetes service IP, bypassing dnsmasq.
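For reference, the relevant setting looks roughly like this; the key name and file location vary between OpenShift versions, so treat it as a sketch rather than definitive syntax:

# node.yaml (path depends on your install)
# dnsIP is the nameserver that ends up at the top of each container's resolv.conf
dnsIP: 10.0.2.15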
DNSMasq uses the servers specified in the files in /etc/dnsmasq.d/*
According to this thread, there is no specific ordering among these servers; dnsmasq just asks each in turn until it gets an answer.
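I haven't reproduced the exact file that OpenShift drops into /etc/dnsmasq.d/ here, but the key idea is dnsmasq's server=/domain/address syntax, which routes queries for particular domains to particular servers. Something along these lines (filename and contents illustrative):

[root@master1 ~]# cat /etc/dnsmasq.d/origin-dns.conf
# cluster-internal names go to the OpenShift node process on localhost
server=/cluster.local/127.0.0.1
# reverse lookups for cluster IPs go the same way
server=/in-addr.arpa/127.0.0.1

Anything not matching those domains falls through to dnsmasq's other configured upstream servers.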
Local Cluster DNS Lookup
So that’s the ‘simple’ case of an external lookup.
Now we come to a local DNS lookup within the Kubernetes cluster.
The steps can be described linearly as:
- Process starts in container, and needs to know what kubernetes.default.svc.cluster.local resolves to.
- Process looks up /etc/resolv.conf to see where DNS queries should be resolved.
- Process asks the DNS server at 10.0.2.15 on port 53 for kubernetes.default.svc.cluster.local's address.
- DNSMasq determines that this is a query that needs to go to the cluster, so it passes it to the OpenShift node process to look up. That process is listening on port 53 of the localhost IP (127.0.0.1).
- The OpenShift node process either returns the IP address from its cache (which is why bouncing the node process can make some resolution issues go away), or passes the request on to the master process's DNS server (you can query each of these listeners directly with dig, as sketched just after this list).
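Poking at each listener in the chain with dig is a handy way to work out which hop is misbehaving. A minimal sketch using the IPs from this setup:

# ask dnsmasq (what the container's resolv.conf points at)
$ dig +short @10.0.2.15 kubernetes.default.svc.cluster.local

# ask the OpenShift node process directly
$ dig +short @127.0.0.1 kubernetes.default.svc.cluster.local

If the chain is healthy, both return the kubernetes service IP.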
To help see this setup, you can run this command on the host. In this setup, I have a node and a master OpenShift process running on one Vagrant VM:
[root@master1 ~]# netstat -nltp | grep 53
tcp        0      0 127.0.0.1:53      0.0.0.0:*    LISTEN      30998/openshift
tcp        0      0 10.0.2.15:53      0.0.0.0:*    LISTEN      31034/dnsmasq
tcp        0      0 0.0.0.0:8053      0.0.0.0:*    LISTEN      29316/openshift
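Reading that output: the OpenShift node process is serving cluster DNS on 127.0.0.1:53 (the address dnsmasq forwards cluster queries to), dnsmasq itself is on the host IP 10.0.2.15:53 (the address the containers' resolv.conf points at), and the openshift process on 8053 is, as far as I can tell, the master's DNS server that the node process falls back to.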
This is based on work in progress from the second edition of Docker in Practice.
Get 39% off with the code: 39miell2