Saturday, October 20, 2018

Buildah inside a centos 7.5 docker container on a centos 7.5 host

Our current solution uses Jenkins to start a Nomad job, which starts an unprivileged docker container in which a developer's Dockerfile is built (as root) using the docker daemon on the host.

The goal is to replace the docker build in the container with buildah, so that we no longer need to make the host's docker available inside the container.

Unfortunately the path to this wasn't straightforward: a lot of yaks needed shaving.

Start of the journey

We start with a basic container in which we install buildah:
# docker run --rm -ti centos:7 /bin/bash
[root@7387c68139dd /]# yum -y install buildah
And a very simple Dockerfile:
FROM centos:7
RUN uptime

Yak 1 - overlay problems

Out of the box, running buildah in the container gives an overlay error:
# buildah bud -t test .
ERRO[0000] 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay"
ERRO[0000] 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay"
kernel does not support overlay fs: 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
kernel does not support overlay fs: 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
Spoiler: the real reason this doesn't work is that buildah tries to do a mount call, which requires the SYS_ADMIN capability (or a privileged container).

Using --storage-driver vfs fixed this problem.
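Given the spoiler above, a build script could pick the storage driver up front by checking whether the container holds CAP_SYS_ADMIN. This is a minimal sketch, assuming the effective capability mask is read from the CapEff line of /proc/self/status and that CAP_SYS_ADMIN is capability number 21 (as in linux/capability.h). The function names are made up for illustration, and holding the capability does not guarantee that overlay works on the backing filesystem.

```shell
# CAP_SYS_ADMIN is capability number 21 in linux/capability.h.
CAP_SYS_ADMIN_BIT=21

# Print the effective capability mask (hex) of the current process.
current_cap_eff() {
    awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# Succeeds when the given hex mask contains CAP_SYS_ADMIN.
has_cap_sys_admin() {
    [ $(( 0x$1 & (1 << CAP_SYS_ADMIN_BIT) )) -ne 0 ]
}

# Hypothetical helper: print the storage driver to pass to buildah.
pick_storage_driver() {
    if has_cap_sys_admin "$1"; then
        echo overlay
    else
        echo vfs
    fi
}

# Usage (not run here):
#   buildah --storage-driver "$(pick_storage_driver "$(current_cap_eff)")" bud -t test .
```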

On to the next one.

Yak 2 - mount namespace error aka unshare(CLONE_NEWNS) permission aka the wrong yak

Spoiler: this yak is a red herring
# buildah --storage-driver vfs bud -t test .
STEP 1: FROM centos:7
Getting image source signatures
Copying blob sha256:aeb7866da422acc7e93dcf7323f38d7646f6269af33bcdb6647f2094fc4b3bf7
 71.24 MiB / 71.24 MiB [====================================================] 4s
Copying config sha256:75835a67d1341bdc7f4cc4ed9fa1631a7d7b6998e9327272afea342d90c4ab6d
 2.13 KiB / 2.13 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
STEP 2: RUN uptime
error running container: error creating new mount namespace for [/bin/sh -c uptime]: operation not permitted
error building at step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[uptime] Flags:[] Attrs:map[] Message:RUN uptime Original:RUN uptime}: exit status 1
strace to the rescue
unshare(CLONE_NEWNS)              = -1 EPERM (Operation not permitted)
After some googling I found that CentOS/RHEL kernels have user namespaces disabled by default and need a kernel parameter to enable them.
We can enable this by running the following on the host:
sudo grubby --args="namespace.unpriv_enable=1 user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
We also set the maximum number of user namespaces that any user in the current user namespace may create:
echo "user.max_user_namespaces=15000" >> /etc/sysctl.conf
Now we can reboot the server and come to the conclusion that it still doesn't work.
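The echo into /etc/sysctl.conf appends a new line every time it is run. A minimal sketch of an idempotent variant follows; ensure_sysctl is a hypothetical helper name, and the sketch only checks whether the key is present, it does not update an existing value.

```shell
# Append "key=value" to a sysctl config file only if the key is not set there yet.
# The dots in the key are treated as regex wildcards, which is good enough
# for a presence check.
ensure_sysctl() {
    file=$1 key=$2 value=$3
    if ! grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        echo "${key}=${value}" >> "$file"
    fi
}

# Usage on the host (not run here):
#   ensure_sysctl /etc/sysctl.conf user.max_user_namespaces 15000
#   sysctl -p    # loads /etc/sysctl.conf; the kernel arguments still need the reboot
```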

Yak 3 - outdated buildah version

Thanks to the #buildah channel on freenode, I found out that the problem in yak 2 was actually caused by an outdated buildah version.
CentOS only has a buildah 1.2 RPM, but 1.4 or higher was needed, so I had to build my own.

You can have this pleasure too with the following script containing a modified RPM spec.

Run a new centos:7 container
# docker run -ti -v /tmp:/tmp centos:7 /bin/bash
and run the following commands in the container:
yum -y group install development
yum -y install wget
mkdir -p /root/rpmbuild/SOURCES /root/rpmbuild/SPECS
cd /root/rpmbuild/SOURCES
wget "" -O buildah-608fa84.tar.gz
tar zxvf buildah-608fa84.tar.gz
mv containers-buildah-608fa84 buildah-608fa843cce45e7ee58ccb71a90297b645a984d3 
tar zcvf buildah-608fa84.tar.gz buildah-608fa843cce45e7ee58ccb71a90297b645a984d3
rm -rf buildah-608fa843cce45e7ee58ccb71a90297b645a984d3
cd ../SPECS
wget -O buildah.spec
yum-builddep -y buildah.spec
rpmbuild -ba buildah.spec
This will give you your RPMs
Wrote: /root/rpmbuild/SRPMS/buildah-1.4-1.git608fa84.el7.centos.src.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/buildah-1.4-1.git608fa84.el7.centos.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/buildah-debuginfo-1.4-1.git608fa84.el7.centos.x86_64.rpm
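To find out whether an installed buildah already meets the 1.4 requirement, the version can be compared in shell. This is a minimal sketch: version_ge and buildah_version are hypothetical helper names, the comparison relies on GNU sort's -V option, and the shape of the `buildah --version` output is an assumption.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the version number from `buildah --version`.
# Assumes output shaped like "buildah version 1.4 (...)".
buildah_version() {
    buildah --version | awk '{ print $3 }'
}

# Usage (not run here):
#   version_ge "$(buildah_version)" 1.4 || echo "buildah is too old, build the RPM"
```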

Yak 4 - proc mount error

Progress, a new error when running buildah 1.4!
# buildah --storage-driver vfs bud -t test .
STEP 1: FROM centos:7
Getting image source signatures
Copying blob sha256:205941c9c2d103bcdff0bc72d8836e0ffc4573ec0e6e524ec1a59606062a289f
 71.25 MiB / 71.25 MiB [====================================================] 4s
Copying config sha256:e26dc8af6a3b1856b9f4a893d5b51855c02dfe3b9cec58a4e55002036528c669
 2.14 KiB / 2.14 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
STEP 2: RUN uptime
container_linux.go:336: starting container process caused "process_linux.go:399: container init caused \"rootfs_linux.go:58: mounting \\\"/proc\\\" to rootfs \\\"/tmp/buildah596035765/mnt/rootfs\\\" at \\\"/proc\\\" caused \\\"operation not permitted\\\"\""
error running container: error creating container for [/bin/sh -c uptime]: : exit status 1
error building at step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[uptime] Flags:[] Attrs:map[] Message:RUN uptime Original:RUN uptime}: exit status 1
ERRO[0012] exit status 1
Again thanks to the #buildah channel, I found out that running with --isolation chroot would solve it.


Finally it works: we have an image created by buildah running in an unprivileged container.
# buildah --storage-driver vfs bud --isolation chroot -t test .
STEP 1: FROM centos:7
STEP 2: RUN uptime
 21:30:55 up 32 min,  0 users,  load average: 0.39, 0.12, 0.08
STEP 3: COMMIT containers-storage:[vfs@/var/lib/containers/storage+/var/run/containers/storage]localhost/test:latest
Getting image source signatures
Skipping fetch of repeat blob sha256:f972d139738dfcd1519fd2461815651336ee25a8b54c358834c50af094bb262f
Skipping fetch of repeat blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
Copying config sha256:26e3b2177f9e9db1bdc8f49083d09dbb980a99ed4e606f4dc45b79ca865588ce
 1.17 KiB / 1.17 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
--> 26e3b2177f9e9db1bdc8f49083d09dbb980a99ed4e606f4dc45b79ca865588ce
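For a build job, the working invocation can be wrapped in a small function. This is a minimal sketch using only the flags from the command above; build_image and the DRY_RUN variable are hypothetical names.

```shell
# Build an image with the flags that worked in the unprivileged container:
# the vfs storage driver (yak 1) and chroot isolation (yak 4).
# With DRY_RUN set, the command is only printed.
build_image() {
    tag=$1 context=${2:-.}
    set -- buildah --storage-driver vfs bud --isolation chroot -t "$tag" "$context"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (not run here):
#   build_image test .
```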
But after testing, a new yak appears.

Yak 5 - a lot of disk space

This one is related to yak 1. Because we're using the vfs storage driver, disk space is not used very efficiently: a more complicated docker build reportedly uses gigabytes of disk with the vfs storage driver compared to the overlay driver.
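To see how much a build costs on your own images, the storage directory can be measured before and after. This is a minimal sketch, assuming the storage root is /var/lib/containers/storage as in the error messages above; storage_usage_kb is a hypothetical helper name.

```shell
# Print the disk usage of a directory in kilobytes (du -sk reports 1024-byte units).
storage_usage_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Usage in the build container (not run here):
#   before=$(storage_usage_kb /var/lib/containers/storage)
#   buildah --storage-driver vfs bud --isolation chroot -t test .
#   after=$(storage_usage_kb /var/lib/containers/storage)
#   echo "build added $(( after - before )) KB"
```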

To run with the overlay driver we need access to the mount call, which means we have to run our docker container with CAP_SYS_ADMIN, which is unfortunate.

# docker run --rm --cap-add SYS_ADMIN -ti centos:7 /bin/bash


It's possible to run buildah in an unprivileged container, but only with the vfs storage driver, so beware of the disk usage when building images!


Monday, September 28, 2015

How to create an IPv6-only consul cluster with docker


  • we're using docker to run consul (and registrator and our services) in, and IPv6 makes this easier (no NAT => better performance)
  • it's easier to maintain one stack
  • consul is known to give issues with NAT and docker
  • IPv4 is legacy and obsolete ;-)
Consul 0.5.2 has some issues running such a setup, but if you build consul from master (which includes some fixes), it will work fine.

Issues to be aware of:

  • the IPv4 version of consul listens by default on private address ranges, when using IPv6 you'll be running on 'public' addresses. So be sure you're firewalling those from the internet.
  • If you're using consul's recursive powers, you'll also need IPv6 DNS recursors (e.g. Google's 2001:4860:4860::8888).
  • Not IPv6 related, but for extra stability, enable leave_on_terminate.
  • Also not IPv6 related, but I've noticed that the default LAN settings for consul can be a bit too strict when running on VMware hosts. This patch increases the probe timeout to 2 seconds (instead of 500 msec).

Consul extra configuration server and client

The extra settings below are necessary for both the consul server and the client agent setup:

        "recursor": "[2001:4860:4860::8888]",
        "leave_on_terminate": true,
        "client_addr": "::",
        "addresses": { "http": "::"}

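The settings above are a fragment; consul expects a complete JSON document. A minimal sketch that writes them to a config file follows. The function name and the file path in the usage comment are assumptions, the keys and values are the ones listed above.

```shell
# Write the extra IPv6 settings as a complete JSON document to the file in $1.
write_consul_ipv6_config() {
    cat > "$1" <<'EOF'
{
    "recursor": "[2001:4860:4860::8888]",
    "leave_on_terminate": true,
    "client_addr": "::",
    "addresses": { "http": "::" }
}
EOF
}

# Usage (not run here; /etc/consul.d/ipv6.json is a hypothetical path):
#   write_consul_ipv6_config /etc/consul.d/ipv6.json
#   consul agent -config-file /etc/consul.d/ipv6.json ...
```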
Consul server setup

The consul servers run as docker host-mode containers (which means they share the same network namespace as the host).

The reason is that we need a fixed IPv6 address for the servers, because we're forwarding our DNS requests to them. (Of course, with some extra work we could write a script that dynamically updates our DNS forwards to a dynamic IP address.)

Our server has multiple IPv6 addresses, so we'll have to add an -advertise and a -bind flag:

consul agent -server -advertise 2001:db8::1 -bind 2001:db8::1 -bootstrap-expect 3 -retry-join [2001:db8::1]:8301 -retry-join [2001:db8::2]:8301 -retry-join [2001:db8::3]:8301
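With more servers the -retry-join list gets repetitive. A minimal sketch that generates the flags from a list of addresses follows; retry_join_flags is a hypothetical helper name, and port 8301 is taken from the command above.

```shell
# Print one "-retry-join [addr]:8301" flag per line for every IPv6 address given.
retry_join_flags() {
    for addr in "$@"; do
        printf '%s [%s]:8301\n' -retry-join "$addr"
    done
}

# Usage (not run here). The substitution is left unquoted on purpose so the
# flags are split into words; set -f keeps the brackets from being globbed.
#   set -f
#   consul agent -server -advertise 2001:db8::1 -bind 2001:db8::1 -bootstrap-expect 3 \
#       $(retry_join_flags 2001:db8::1 2001:db8::2 2001:db8::3)
```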

We're using consul-docker as our consul docker container (for both client and server).

Consul client setup 

You'll need to cherry-pick this PR into your local build:
The IPv6 address in the docker container will be random and we want to bind to the IPv6 address.
This patch looks for the first 'public' IPv6 address and uses this address to advertise.

So we start the client with:

consul agent -bind :: -join consul.service.consul

Gotchas here:
-bind :: actually binds to both the IPv4 and IPv6 addresses in the container, but because we advertise the IPv6 address, the IPv4 address won't be used.

Other software


We also use registrator to register our services in consul. So every time a container starts or stops, registrator handles the consul service registration process.

Registrator also needs some extra fixes for IPv6 support (not yet merged).

Because we're running consul on IPv6 this means registrator also needs to connect to the IPv6 address.

registrator consul://server1.node.consul:8500

Registrator can then register other services that are running on the docker host, e.g. elasticsearch.


Besides the main registrator we also run registrator-netfilter, which automatically firewalls the IPv6 services in the container. The containers are no longer NATted but directly accessible, so they need to be firewalled.


A /64 is allocated for docker and a /80 is given to each docker host, which runs with the switches:

--ipv6=true --fixed-cidr-v6=2001:db8::/80
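A /64 holds 2^16 = 65536 different /80 networks, so each docker host can get its own by numbering the hosts. This is a minimal sketch of one possible scheme, assuming the host number is simply used as the fifth group of the address; host_cidr_v6 is a hypothetical helper name and the numbering scheme is not taken from the original setup.

```shell
# Print the /80 for docker host number $2 inside the /64 whose first four
# groups are given in $1 (e.g. "2001:db8:0:0"). The host number becomes the
# fifth group, written in hex.
host_cidr_v6() {
    prefix64=$1 index=$2
    if [ "$index" -lt 0 ] || [ "$index" -gt 65535 ]; then
        echo "host index must be between 0 and 65535" >&2
        return 1
    fi
    printf '%s:%x::/80\n' "$prefix64" "$index"
}

# Usage (not run here): pass the result to --fixed-cidr-v6 on each host.
#   host_cidr_v6 2001:db8:0:0 1
```

Host number 0 gives a network equivalent to the 2001:db8::/80 used in the switches above.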


ES also runs IPv6-only, using registrator, registrator-netfilter and consul.
You can find the relevant options to give to docker below:

docker run --net bridge -e SERVICE_NAME=es -e SERVICE_9200_TAGS=http-data 
-e SERVICE_9300_TAGS=transport-data -e SERVICE_9200_IPV6=tcp -e SERVICE_9300_IPV6=tcp 

Tuesday, February 10, 2015

tmux memory usage on linux

So a while ago I switched from screen to tmux. My reason for switching was that GNU screen didn't work in my docker containers and tmux did ;-)

All was well for a few months and I was replacing screen with tmux everywhere. It did have some other niceties besides working in containers and seemed to do its job.


wim       1660  1.3 12.8 135056 131404 ?       Ss    2014 722:46 tmux -u

Notice anything special above? Compare it with screen.

wim      29595  0.0  4.5  48784 46116 ?        Ss    2014   3:49 SCREEN -c mscreen

The tmux session has 8 open windows and 10000 history limit. (set -g history-limit 10000)
The screen session has 39 open windows and 10000 history limit (defscrollback 5000)

So tmux seems to be using an awful lot of memory: almost three times the resident memory of screen (131404 KB vs 46116 KB), for a 'lighter' session setup.
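The comparison can be made concrete with the RSS values (sixth column, in KB) from the two ps lines above. This is a minimal sketch; the helper names are made up for illustration.

```shell
# Print the ratio of two RSS values (in KB) with two decimals.
rss_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Print the average RSS per window in whole KB (integer division).
rss_per_window() {
    echo $(( $1 / $2 ))
}

rss_ratio 131404 46116      # tmux vs screen: prints 2.85
rss_per_window 131404 8     # tmux, 8 windows: prints 16425
rss_per_window 46116 39     # screen, 39 windows: prints 1182
```

Per window, tmux uses roughly 16 MB of resident memory here against a bit over 1 MB for screen.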

A quick google showed that other people were having the same issues

My first thought was 'memory leak', so I checked the code, but everything seemed to be freed correctly.

I joined the #tmux channel on freenode for some help and was told that it's a glibc-specific (Linux) issue. Although the memory was freed, glibc wasn't releasing it back to the OS.

But you could force it by using malloc_trim(0). And maybe you could use specific glibc environment variables to control memory allocation behaviour to also emulate malloc_trim().

Too much time was wasted googling and testing. I couldn't get it to work: the memory wasn't getting released back to the OS.

So I made a small patch to tmux which
- calls malloc_trim(0) when a window gets destroyed
- also frees memory when you clear your history manually in a window (and also calls malloc_trim())

The patch works for me but YMMV

I tried to get this patch into upstream tmux, but was told: 'It's up to glibc to decide how malloc works'.

PS: if you set history-limit 0, tmux actually uses less memory than screen (and doesn't grow), but of course you don't have a scrollback ;-)

Saturday, January 24, 2015

Rancid 3.2 alpha + git

Rancid lovers rejoice, a 3.2 alpha version has been released with (at least) 2 interesting features.

- Git support: based on the patch by jcollie.

But with a 'small' difference, not one repository for all the groups, but a repository per group.
Maybe fine if you're starting from scratch, but for my situation I like the one-repository setup of the original patch.

You can find the latest version with the original setup of one repository for everything, together with some other minor patches on

- WLC support: Now you can back up your Cisco Wireless LAN Controller configuration out of the box. One patch less to maintain. Hurrah!

I'm running Rancid in a Docker setup, so upgrading and testing was quite easy.
No issues found yet with this version.

Tuesday, February 25, 2014

Circumventing IPv6 feature parity: drop AAAA to specific IPs

Unless you've been living under a rock, you'll be aware that IPv6 usage has been increasing.

Yes, it even has come to this: mere mortals can use it at home. The audacity!

Unfortunately not all vendors (if any?) have feature parity. In our case a specific VPN product doesn't support IPv6.
The client will only receive an IPv4 address from the VPN server.

When a user at home starts the VPN and asks for an internal resource (which also has an IPv6 address), the client will try to connect to this resource using the IPv6 address from the user's provider (it didn't receive one from the VPN server). This doesn't work, because this specific resource is firewalled for outside addresses.

Luckily the user has to use our DNS server to look up records (the VPN client forces this).
Luckily we're using the PowerDNS recursor, which supports Lua scripting that can modify DNS responses.

The script below gives normal answers to every client that is not coming from the VPN client address ranges. Otherwise, if the answer contains an AAAA record, it drops that record and returns the rest.
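The effect of the filtering can be illustrated outside PowerDNS. This is a minimal sketch in shell, not the PowerDNS Lua hook itself: it assumes answer lines in the "name ttl class type rdata" layout of dig's answer section, and drop_aaaa is a hypothetical helper name.

```shell
# Drop AAAA records from answer lines read on stdin; all other records
# pass through unchanged. The record type is the fourth field.
drop_aaaa() {
    awk '$4 != "AAAA"'
}

# Example with made-up records from the documentation address ranges:
printf '%s\n' \
    'host.example. 300 IN A 192.0.2.10' \
    'host.example. 300 IN AAAA 2001:db8::10' | drop_aaaa
```

Only the A record is printed, which is what a VPN client should see after the recursor has filtered the response.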

More information about Lua scripting for PowerDNS can be found here:

Sunday, December 1, 2013

Winter is coming

Which means I have some more time, expect some IPv6 and Cisco related posts in the near future.

Monday, May 13, 2013

Compiling your own PuTTY-CAC with EID support

So we've got electronic IDs (smartcards), but except for doing our taxes we're not using them much.

Now under Linux there are options to use them for SSH authentication, but these days I'm mostly using PuTTY on Windows, so I wanted it to work with this client.

After some searching I found a possible candidate: PuTTY-CAC.

It works with CAPI, the military uses it, it's open source and based on PuTTY. Seems like a win-win-win-win. And for once it also is :-)

I compared it with the official PuTTY source to see whether anything suspicious had been added to the code. There wasn't, so I could safely build the binary myself.

I remembered that Visual Studio Express was a free C++ compiler from Microsoft, so I downloaded version 2010.

So now just open the project and press build, right? Wrong! The project was made in Visual Studio 6 and apparently you cannot convert from Visual Studio 6 to Visual Studio 2010. According to the internets you first need to install Visual Studio 2008, convert there, save it, open it in Visual Studio 2010, convert, save and build.

Here is an overview for those that want to do this:

Start Visual C++ 2008
Open Project - c:\temp\putty-cac-master\windows\MSVC\putty.dsw
Convert and open project
Choose File - Save All
Start Visual C++ 2010 (and close 2008 ;-)
Open Project - c:\temp\putty-cac-master\windows\MSVC\putty.sln
You'll get a wizard: Next - Next - Finish

Now when you try to build it, it won't. You'll need to add a define:

Open c:\temp\putty-cac-master\windows\
Add  #define SECURITY_WIN32 at the top of the file

If you compile now you'll get linking errors. You'll need to add 'sc.c' and 'capi.c' to the 'source files'.

Now you're finally ready to build your binary. Press build and enjoy your own self-built PuTTY.

To actually use your EID with this, just follow the CAPI instructions on

Stuff you'll need to get this working:

Microsoft Windows SDK for Windows 7 and .NET Framework 4

Visual 2008 express

Visual 2010 express

PuTTY CAC source

Good luck!