Containers vs. Docker vs. Kubernetes vs. containerd vs. runC... Part 1: What's in the Box?


Containers, Docker, Kubernetes, runC, CRI-O, containerd… The container space is growing fast! What are all of these (and more!), and how do they connect to each other and to your production workloads?

Keeping up with the container space, if you don't live and breathe it, is tough. I myself lost my grip on it the moment I tilted my head to briefly check my phone. Containers aren't just about Docker and Kubernetes anymore. Heck, what is a container, exactly?! Knowing all that – and more – about containers can and will help you decide what's best for you and your organization when it comes to running your containerized workloads in production.

So, let's discuss containers a little: their history, how containers aren't equal to Docker, how modern containers actually work, and, finally, how to run one yourself – without installing Docker.

But first… What are containers?

This is a great question that, given my engineer brain, shamelessly took me way longer to ask myself than it should have. So, I'll start just like most would: comparing containers to a Virtual Machine (VM).

The term VM was first used in the early 1960s by IBM researchers. As you might be aware, it describes a virtual (as in not physical) environment with its own resources (CPU, memory, etc.) backed by actual physical resources, which in turn might also be shared among other virtual machines.

For an ordinary end-user this might not make sense at first – who would want to run two copies of Windows on the same laptop, taking twice the already-limited resources? For servers, though, this approach makes a lot of sense, for a few reasons.

One of them: it allows enterprises to run, for instance, mail and web servers with different software requirements, such as different operating system or Java versions, all from the same box. Among other benefits, this translates to cheaper operation, since it means buying and maintaining just one physical server.

Containers are not that different from VMs when you think about the why, since they are also a way to share and isolate resources. That's why you might see some people use the term lightweight virtualization to describe containers – which, in my humble opinion, is a great way to describe them! Modern Linux containers leverage features buried in the Linux kernel to provide an isolated environment, so you can run processes independently from each other. You don't need Docker, Kubernetes, or some new container technology you just heard of to run containers in your Linux environment.

Time for a history lesson

When did you first hear about containers in the IT world? Probably right next to the word Docker, right? Well, containers are a bit older than that… Get your DeLorean ready because it’s time to do some time-traveling.

1979 – chroot

Our first stop is 1979, when the concept of containers was born with Unix chroot, although nobody called it that yet. During the development of Version 7 Unix, the chroot system call was introduced. It changes the apparent root directory for the current running process and its children, effectively isolating those processes from the rest of the files on disk.

2000 – FreeBSD Jails

Returning to our DeLorean and traveling a few years ahead, we get to the year 2000. This is when the first mainstream container technology was released, although the name container still wasn't being used.

FreeBSD Jails improve on the concept of the traditional chroot environment in several ways. In a traditional chroot environment, processes are only limited in the part of the file system they can access. The rest of the system resources, system users, running processes, and the networking subsystem are shared by the chrooted processes and the processes of the host system.

Jails expand this model by virtualizing access to the file system, the set of users, and the networking subsystem. More fine-grained controls are available for tuning the access of a jailed environment.

2008 – LXC

Eight years have passed, and we are now in 2008. Linux Containers, or LXC for short, is an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.

This is often considered something in between a chroot and a full-fledged virtual machine. The goal of LXC is to create an environment as close as possible to a standard Linux installation, but without the need for a separate kernel.

2013 – Docker

Our last DeLorean trip (for now) is to 2013, when Docker was first released. And since you are here, I assume Docker needs no introduction. What you might not know is that Docker is, at least at first and deeply summarized, an interface between the user and existing technologies. Similarly to LXC, it makes those technologies easier to use, which is what made its adoption so widespread.

Nowadays, by default, Docker uses libcontainer, created by the Docker team themselves (and later donated to become the core of runC), to interface with the kernel features it needs. This means you don't need Docker to run your own modern containers. Don't believe me? Let me guide you through proving my point.

Kernel features

Let's first discuss – and get our hands dirty with – the kernel features that enable Docker, or any container engine (more on that later) for that matter, to operate. In the process, we will create our own container, Dockerlessly. For each feature, I'll run the man command (short for manual) to get straight from the source what it is supposed to do, and then discuss how it can help us achieve all that.

chroot

$ man chroot

chroot – run command or interactive shell with special root directory

chroot – pronounced change root – as its manual entry says, runs a command or interactive shell using a special root directory instead of the default one. This allows us to run a new process and isolate it, filesystem-wise, giving it any folder as its root directory. It takes a NEWROOT and the desired command as arguments. So, let's try it on our own.

$ chroot ~ ls /

chroot: cannot change root directory to ‘/home/pi’: Operation not permitted

“This container thing without Docker doesn’t work!!!”, you must be mentally shouting. chroot, however, needs root permission to run. So, let’s try once more, adding sudo to it.

$ sudo chroot ~ ls /

chroot: failed to run command ‘ls’: No such file or directory

Ok, that’s interesting. It isn’t what we expected, but it isn’t a permission error either! It explicitly tells us that the command ‘ls’ doesn’t exist. Any idea why? If this was a YouTube video, that would be the part where I ask you to pause and post in the comments what you think the reason is. Since it isn’t, it’s fine to take note mentally.

Answer: We are trying to run this command (ls) from a completely new root folder – in this scenario, our own user's home folder – a place where the ls binary doesn't exist. Let's fix this. First, we need to find where ls lives:

$ which ls

/usr/bin/ls

Still chroot-ing

Take note of this directory and replace it as needed in the following commands:

$ mkdir -p ~/usr/bin #Creating the folder /usr/bin inside our home folder.

$ cp /usr/bin/ls ~/usr/bin #Copying the `ls` binary to this new folder

$ ll ~/usr/bin/ #Checking that the file was actually copied

total 116

drwxr-xr-x 2 pi pi 4096 May 12 14:10 ./

drwxr-xr-x 3 pi pi 4096 May 12 14:10 ../

-rwxr-xr-x 1 pi pi 108752 May 12 14:10 ls*

Great, our binary is there now! Care to chroot again?

$ sudo chroot ~ ls /

chroot: failed to run command ‘ls’: No such file or directory

“Liar!”, you are thinking, “It still doesn’t work!” It doesn’t, you are right. But that’s just because ls has library dependencies that we haven’t imported yet. Let’s find out what they are and copy them over.

Note: I'm using a Raspberry Pi running Raspbian. If you are running these commands anywhere else, the folders in the steps below will probably differ because of a different architecture.

First, let’s find which dependencies ls has.

$ man ldd

ldd prints the shared objects (shared libraries) required by each program or shared object specified on the command line.


$ ldd /usr/bin/ls

linux-vdso.so.1 (0x7efde000)

/usr/lib/arm-linux-gnueabihf/libarmmem-${PLATFORM}.so => /usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so (0x76f33000)

libselinux.so.1 => /lib/arm-linux-gnueabihf/libselinux.so.1 (0x76f01000)

libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76db3000)

/lib/ld-linux-armhf.so.3 (0x76f48000)

libpcre.so.3 => /lib/arm-linux-gnueabihf/libpcre.so.3 (0x76d3c000)

libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0x76d29000)

libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0x76cff000)

Taking note of the dependencies and their locations (which, again, could be different on your end), we run the right commands to copy them over to our "new root folder."

$ mkdir -p ~/lib/arm-linux-gnueabihf/ ~/usr/lib/arm-linux-gnueabihf/ #Creating folders

$ cp /usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so ~/usr/lib/arm-linux-gnueabihf/ #Copying the library over to the matching new folder

$ cp /lib/ld-linux-armhf.so.3 ~/lib/ #The dynamic loader must sit at its exact original path

$ cp /lib/arm-linux-gnueabihf/libselinux.so.1 /lib/arm-linux-gnueabihf/libc.so.6 /lib/arm-linux-gnueabihf/libpcre.so.3 /lib/arm-linux-gnueabihf/libdl.so.2 /lib/arm-linux-gnueabihf/libpthread.so.0 ~/lib/arm-linux-gnueabihf/ #Copying the remaining libraries over to the matching new folder
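By the way, if copying dependencies one by one feels tedious, here is a little helper that automates the steps we just performed. This is only a sketch: copy_with_deps and NEWROOT are names I made up for illustration, not standard tools.

NEWROOT=~ #Our improvised root folder; change as needed

copy_with_deps() {
  local bin
  bin=$(command -v "$1") || return 1
  mkdir -p "$NEWROOT$(dirname "$bin")" #Recreating the binary's folder inside the new root
  cp "$bin" "$NEWROOT$(dirname "$bin")/"
  for lib in $(ldd "$bin" | grep -oE '/[^ ]+'); do #Grabbing every absolute path ldd prints
    [ -e "$lib" ] || continue #Skipping entries (like linux-vdso) that aren't real files on disk
    mkdir -p "$NEWROOT$(dirname "$lib")"
    cp "$lib" "$NEWROOT$(dirname "$lib")/"
  done
}

copy_with_deps ls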

Final chroot… I promise

$ sudo chroot ~ ls /

bin etc-dnsmasq.d lib libseccomp2_2.4.4-1~bpo10+1_armhf.deb … usr

YES! It works! We have containers!!! …or do we? Ok, not really, but it's a start. Let's go back to the manual's own definition: "Run command or interactive shell with special root directory" (yes, I'm using every possible way to create emphasis on the word directory.)

Even though the root directory for the ls process was your own home directory, the isolation stops there. If you were to run a long-lived process on your host, like a webserver for instance, a "chrooted" kill command could still terminate it. If you don't believe me, try running a long-standing command from your terminal and using chroot to kill it 😉
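Here's what I mean, as a sketch (it assumes you've also copied the sleep and kill binaries, plus their dependencies, into your new root the same way we copied ls; the PID shown is illustrative):

$ sleep 1000 & #Starting a long-lived process on the host

[1] 4242

$ sudo chroot ~ kill 4242 #Killing it from "inside" the chroot

[1]+ Terminated sleep 1000

The "chrooted" kill reached a host process just fine – chroot isolates the filesystem view, not the process table.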

Namespaces

We just saw that running a command with an isolated root directory is good but isn’t enough. So, let’s talk about Namespaces.

$ man namespaces

A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace but are invisible to other processes. One use of namespaces is to implement containers.

Wait… isn’t that pretty much the VM definition that we came up with?

Namespaces can get really complicated quickly, so we are not going to go over all the details here. It's interesting to know, however, that there are multiple namespace types that one can leverage to isolate different resources. Straight from the namespaces(7) manual, they are:

Cgroup – isolates the cgroup root directory

IPC – isolates System V IPC and POSIX message queues

Network – isolates network devices, stacks, ports, etc.

Mount – isolates mount points

PID – isolates process IDs

Time – isolates boot and monotonic clocks (on recent kernels)

User – isolates user and group IDs

UTS – isolates the hostname and NIS domain name

Using namespaces to create a container

Let's use namespaces to create our own container. For that, however, let me introduce yet another command: unshare.

$ man unshare

run program with some namespaces unshared from parent

That seems straightforward enough since we now know what namespaces are. It pretty much isolates a process… creating a container around it. Let’s put it to use.
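Here's a quick taste (a sketch; the --mount-proc flag mounts a fresh /proc for us, so ps reads from the new PID namespace – the output is illustrative):

$ sudo unshare --pid --fork --mount-proc ps

PID TTY TIME CMD

1 pts/0 00:00:00 ps

A brand-new PID namespace, where our lonely process believes it is PID 1.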

Before we build something more complete, though, instead of copying every binary we want plus all its dependencies file by file, let's do it the smart way, using a tool called debootstrap (available for Debian-based distros only.)

$ man debootstrap

Bootstrap a basic Debian system.

Debootstrap can be used to install Debian in a system without using an installation disk, but can also be used to run a different Debian flavor in a chroot environment. This way you can create a full (minimal) Debian installation which can be used for testing purposes.

$ debootstrap --variant=minbase focal ~/new-root

Still Namespace-ing

Now we have a minimal installation of the OS in our new-root folder, as we can see below:

$ ls -lah ~/new-root/

total 68K

drwxr-xr-x 17 root root 4.0K Aug 10 14:29 .

drwx------ 8 root root 4.0K Aug 10 20:12 ..

lrwxrwxrwx 1 root root 7 Aug 10 14:28 bin -> usr/bin

drwxr-xr-x 2 root root 4.0K Apr 15 2020 boot

drwxr-xr-x 4 root root 4.0K Aug 10 14:28 dev

drwxr-xr-x 30 root root 4.0K Aug 10 14:29 etc

drwxr-xr-x 2 root root 4.0K Apr 15 2020 home

lrwxrwxrwx 1 root root 7 Aug 10 14:28 lib -> usr/lib

drwxr-xr-x 2 root root 4.0K Aug 10 14:28 media

drwxr-xr-x 2 root root 4.0K Aug 10 14:28 mnt

drwxr-xr-x 2 root root 4.0K Aug 10 14:28 opt

drwxr-xr-x 2 root root 4.0K Apr 15 2020 proc

drwx------ 2 root root 4.0K Aug 10 14:39 root

drwxr-xr-x 4 root root 4.0K Aug 10 14:28 run

lrwxrwxrwx 1 root root 8 Aug 10 14:28 sbin -> usr/sbin

drwxr-xr-x 2 root root 4.0K Aug 10 14:28 srv

drwxr-xr-x 2 root root 4.0K Apr 15 2020 sys

drwxrwxrwt 2 root root 4.0K Aug 10 14:29 tmp

drwxr-xr-x 10 root root 4.0K Aug 10 14:28 usr

drwxr-xr-x 11 root root 4.0K Aug 10 14:28 var

Before we even run unshare, let’s first retest this environment with chroot.

$ chroot ~/new-root/ bash

$ mount -t proc none /proc

$ ps

PID TTY TIME CMD

1548 ? 00:00:00 sudo

1549 ? 00:00:00 su

1550 ? 00:00:00 bash

19626 ? 00:00:00 bash

19632 ? 00:00:00 ps

$ exit

Notice that, even inside the chroot, ps listed the host's processes (including the sudo and su that got us a root shell in the first place). Let's try the same now, using unshare:

$ unshare --mount --uts --ipc --net --pid --fork --user --map-root-user chroot ~/new-root/ bash

$ mount -t proc none /proc

$ ps

PID TTY TIME CMD

1 ? 00:00:00 bash

5 ? 00:00:00 ps
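And PIDs aren't the only thing that got isolated. Since we also asked for a UTS namespace (--uts), we can, for instance, rename our "container" without touching the host (a sketch; run it from inside the unshared shell above):

$ hostname my-container #Only visible inside our UTS namespace

$ hostname

my-container

Open another terminal on the host, run hostname there, and you'll still see the original name.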

Done Namespace-ing

"Wow!! That's it! We finally have containers!!", you might be thinking. And you are kind of right. However, how useful would they be if you couldn't put constraints on the physical resources each one uses, such as limiting RAM or CPU per container?

cgroups

cgroups, or Control Groups, do exactly that.

$ man cgroups

Control groups, usually referred to as cgroups, are a Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored. The kernel’s cgroup interface is provided through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).

We're not going to go over how to use cgroups because, again, it can get really complicated really quickly. If you want to dig in, I recommend reading about cgroup-tools, which let you use the cgroups features more easily straight from your terminal.
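Still, just to give you a taste, here's roughly what that looks like with cgroup-tools on a cgroup-v1 system (a sketch; on newer, cgroup-v2-only distros the controller and file names differ):

$ sudo cgcreate -g memory:demo #Creating a cgroup named demo under the memory controller

$ sudo cgset -r memory.limit_in_bytes=64M demo #Capping the group at 64 MB of RAM

$ sudo cgexec -g memory:demo bash #Running a shell inside the group; it and its children can't exceed the cap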

In closing

I hope this article helped demystify what containers are and showed you that Docker isn't anything other than a (damn useful) layer between you and the kernel features of your favorite OS.

This content alone probably won't make you a better container user, developer, or admin, but it's a great intro for our next discussion around container runtimes, engines, and orchestration. Please feel free to chime in below with any thoughts you have. In the meantime, I'll see you later!
