Pages

Saturday, October 20, 2018

Buildah inside a centos 7.5 docker container on a centos 7.5 host

Our current solution uses Jenkins to start a Nomad job which starts a (unprivileged) docker container in which a developers Dockerfile is being build (as root) using the docker on the host.

The goal is to replace the docker build in the container by buildah so that we don't need to make the docker on the host available inside the container.

The path to this wasn't as straightforward unfortunately, a lot of yaks needed shaving.

Start of the journey

We're starting with a basic container where we install buildah in
# docker run --rm -ti centos:7 /bin/bash
[root@7387c68139dd /]# yum -y install buildah
And a very simple Dockerfile
FROM centos:7
RUN uptime

Yak 1 - overlay problems

Out of the box running buildah in the container will give an overlay error.
# buildah bud -t test .
ERRO[0000] 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay"
ERRO[0000] 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay"
kernel does not support overlay fs: 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
kernel does not support overlay fs: 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver
Spoiler: The real reason this doesn't work is because it tries to do a mount call, which can only be done with the SYS_ADMIN capability (or in a privileged container).

Using --storage-driver vfs fixed this problem.

On to the next one.

Yak 2 - mount namespace error aka unshare(CLONE_NEWNS) permission aka the wrong yak

Spoiler: this yak is a red herring
# buildah --storage-driver vfs bud -t test .
STEP 1: FROM centos:7
Getting image source signatures
Copying blob sha256:aeb7866da422acc7e93dcf7323f38d7646f6269af33bcdb6647f2094fc4b3bf7
 71.24 MiB / 71.24 MiB [====================================================] 4s
Copying config sha256:75835a67d1341bdc7f4cc4ed9fa1631a7d7b6998e9327272afea342d90c4ab6d
 2.13 KiB / 2.13 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
STEP 2: RUN uptime
error running container: error creating new mount namespace for [/bin/sh -c uptime]: operation not permitted
error building at step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[uptime] Flags:[] Attrs:map[] Message:RUN uptime Original:RUN uptime}: exit status 1
strace to the rescue
unshare(CLONE_NEWNS)              = -1 EPERM (Operation not permitted)
After some googling I found that centos/rhel kernels have user namespace disabled by default and need to have a kernel parameter set to get this working.
We can enable this by running on the host
sudo grubby --args="namespace.unpriv_enable=1 user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
And also set the maximum number of user namespaces that any user in the current user namespace may create by running
echo "user.max_user_namespaces=15000" >> /etc/sysctl.conf
Now we can reboot the server
And come to the conclusion that it still doesn't work.

Yak 3 - outdated buildah version

Thanks to the #buildah channel on freenode, I found out that the problem of yak 2 was actually an outdated buildah version.
Centos has only a buildah 1.2 rpm, but 1.4 or higher was needed so I'd have to build my own.

You can have this pleasure too with the following script containing a modified RPM spec.

Run a new centos:7 container
# docker run -ti -v /tmp:/tmp centos:7 /bin/bash
and run following commands in the container:
yum -y group install development
yum -y install wget
cd /root/rpmbuild/SOURCES
wget "https://github.com/containers/buildah/tarball/608fa843cce45e7ee58ccb71a90297b645a984d3" -O buildah-608fa84.tar.gz
tar zxvf buildah-608fa84.tar.gz
mv containers-buildah-608fa84 buildah-608fa843cce45e7ee58ccb71a90297b645a984d3 
tar zcvf buildah-608fa84.tar.gz buildah-608fa843cce45e7ee58ccb71a90297b645a984d3
rm -rf buildah-608fa843cce45e7ee58ccb71a90297b645a984d3
cd ../SPECS
wget https://gist.githubusercontent.com/42wim/848fba2ed2d64d457f56eeebef0e85a2/raw/bb3ad3c524529ed921626fb077b8ff78a56783fc/buildah.spec -O buildah.spec
yum-builddep -y buildah.spec
rpmbuild -ba buildah.spec
This will give you your RPMs
Wrote: /root/rpmbuild/SRPMS/buildah-1.4-1.git608fa84.el7.centos.src.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/buildah-1.4-1.git608fa84.el7.centos.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/buildah-debuginfo-1.4-1.git608fa84.el7.centos.x86_64.rpm

Yak 4 - proc mount error

Progress, a new error when running buildah 1.4!
# buildah --storage-driver vfs bud -t test .
STEP 1: FROM centos:7
Getting image source signatures
Copying blob sha256:205941c9c2d103bcdff0bc72d8836e0ffc4573ec0e6e524ec1a59606062a289f
 71.25 MiB / 71.25 MiB [====================================================] 4s
Copying config sha256:e26dc8af6a3b1856b9f4a893d5b51855c02dfe3b9cec58a4e55002036528c669
 2.14 KiB / 2.14 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
STEP 2: RUN uptime
container_linux.go:336: starting container process caused "process_linux.go:399: container init caused \"rootfs_linux.go:58: mounting \\\"/proc\\\" to rootfs \\\"/tmp/buildah596035765/mnt/rootfs\\\" at \\\"/proc\\\" caused \\\"operation not permitted\\\"\""
error running container: error creating container for [/bin/sh -c uptime]: : exit status 1
error building at step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[uptime] Flags:[] Attrs:map[] Message:RUN uptime Original:RUN uptime}: exit status 1
ERRO[0012] exit status 1
Again thanks to #buildah channel, I found out that running --isolation chroot would solve it.

Victory!

Finally it works, we have an image created by buildah running in an unprivileged container.
# buildah --storage-driver vfs bud --isolation chroot -t test .
STEP 1: FROM centos:7
STEP 2: RUN uptime
 21:30:55 up 32 min,  0 users,  load average: 0.39, 0.12, 0.08
STEP 3: COMMIT containers-storage:[vfs@/var/lib/containers/storage+/var/run/containers/storage]localhost/test:latest
Getting image source signatures
Skipping fetch of repeat blob sha256:f972d139738dfcd1519fd2461815651336ee25a8b54c358834c50af094bb262f
Skipping fetch of repeat blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
Copying config sha256:26e3b2177f9e9db1bdc8f49083d09dbb980a99ed4e606f4dc45b79ca865588ce
 1.17 KiB / 1.17 KiB [======================================================] 0s
Writing manifest to image destination
Storing signatures
--> 26e3b2177f9e9db1bdc8f49083d09dbb980a99ed4e606f4dc45b79ca865588ce
But after testing a new yak appears.

Yak 5 - a lot of diskspace

This one is related to yak 1, because we're using the vfs storage driver, it uses the disk not very space efficient (according to https://docs.docker.com/storage/storagedriver/vfs-driver/) a more complicated docker build uses gigabytes of disk when using the vfs storage driver compared to the overlay driver.

To run with the overlay driver we need access to the mount call which means we have to run our docker container with CAP_SYS_ADMIN which is unfortunate.

# docker run --rm --add-cap SYS_ADMIN -ti centos:7 /bin/bash

Conclusion

It's possible to run buildah in an unprivileged container but only using the vfs storage driver, but beware of the disk usage when building images!

Interesting links:

2 comments:

  1. Hello, when I follow your process, the first step is the following:

    ```
    [root@b07810c3fc08 opt]# buildah bud -t test .
    Error during unshare(CLONE_NEWUSER): Operation not permitted
    ERRO[0000] error parsing PID "": strconv.Atoi: parsing "": invalid syntax
    ERRO[0000] (unable to determine exit status)
    ```
    Should I add other parameters when docker starts the container?

    ReplyDelete
  2. Zhou,

    To resolve this problem you need to run the container with --cap-add SYS_ADMIN or as a privileges container.

    ReplyDelete