One of my colleagues pointed me to an article on using Buildah to create container images: How rootless Buildah works: Building containers in unprivileged environments.
I decided to test it out! In this case, I just wanted to build a container using the shell script method described in the article, rather than using a Dockerfile. Although the rootless aspect is interesting to me, I believe that requires newer versions of Buildah than what is available by default on CentOS 7. A project for another day!
First, I needed a test case. I decided to use my NLTK chatbot container image, which can be found at:
- https://hub.docker.com/repository/docker/cherdt/nltk-chatbot (container image)
- https://github.com/cherdt/docker-nltk-chatbot (Git repo including Dockerfile)
The existing container image, built from the Dockerfile, was 499 MB. I wanted to note this since one of my goals was to keep unnecessary packages out of the container image. For example, building uWSGI inside a container requires a C compiler (gcc) and Python headers (python3-devel). However, those aren’t needed in the final container. Why not just use the existing compiler and headers in the build environment and avoid putting them in the container image at all?
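For this to work, the compiler and headers have to be present on the build host instead. A minimal sketch, assuming a CentOS 7 host where the python3 package also provides the pip3 command; the function name install_build_deps is hypothetical:

```shell
# Hypothetical helper: install the build-time dependencies on the host,
# not inside the container image.
install_build_deps() {
    # gcc and python3-devel are needed to compile uWSGI;
    # python3 is assumed to provide pip3 on the host
    yum --assumeyes install gcc python3 python3-devel
}
```

Running install_build_deps once as root before the build should be enough; none of these packages end up in the image.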
The shell script example from the article had a couple of bugs, which I ironed out to create this shell script:
#!/bin/sh
# Build the container using buildah instead of Docker
# Stop at the first failed command, so an empty $MNT is never used as a path
set -e
# Retrieve container
CONTAINER=$(buildah from centos:centos7)
echo "$CONTAINER"
# Mount the container filesystem
MNT=$(buildah mount "$CONTAINER")
echo "$MNT"
# Install python3 within the container path
yum --assumeyes install --installroot "$MNT" python3
# Install python libraries within the container path
pip3 install --prefix="$MNT/usr/" Flask flask-cors nltk pyyaml requests uwsgi
# Remove unnecessary cache files
rm -rf "$MNT/var/cache" "$MNT"/var/log/*
# Copy chatbot app files
cp chatbot.py "$MNT/"
cp chat.html "$MNT/"
# Set container config options (port, entrypoint)
buildah config --port 9500/tcp "$CONTAINER"
buildah config --entrypoint 'uwsgi --http :9500 --manage-script-name --mount /=chatbot:app' "$CONTAINER"
# Commit the changes to cherdt/nltk-chatbot
buildah commit "$CONTAINER" cherdt/nltk-chatbot
One of the interesting things is that buildah simply mounts the container filesystem, and then the script uses that as a target for yum installs and pip installs. Using buildah helped emphasize to me that a container image is really just a series of files with some metadata/configuration included.
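To look at both halves directly, you can list the mounted files and dump the configuration. A small sketch, assuming the CONTAINER and MNT values printed by the build script are still valid; show_container_parts is a hypothetical helper name:

```shell
# Hypothetical helper: show the two parts of a working container.
show_container_parts() {
    ctr=$1   # working container name, e.g. the value of $CONTAINER
    mnt=$2   # mount point, e.g. the value of $MNT
    echo "== files =="
    ls "$mnt"                # top level of the container's root filesystem
    echo "== metadata =="
    buildah inspect "$ctr"   # configuration as JSON (ports, entrypoint, ...)
}
```

For example: show_container_parts "$CONTAINER" "$MNT"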
Once the container image was built, I tested it using podman:
podman run -d -p 9500:9500 --name nltk_eliza cherdt/nltk-chatbot
It worked!
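A quick way to confirm this from the command line is to request a page from the published port. This is only a sketch: it assumes curl is installed and that the app answers on the root path, neither of which is guaranteed here, and check_chatbot is a hypothetical name:

```shell
# Hypothetical smoke test for the running container.
check_chatbot() {
    # Default URL is an assumption; pass another one as the first argument
    url=${1:-http://localhost:9500/}
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
    if curl --silent --fail --output /dev/null "$url"; then
        echo "chatbot is answering at $url"
    else
        echo "no successful response from $url" >&2
        return 1
    fi
}
```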
When I examined the new container size, I found that I had shaved off 189 MB:
# buildah images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/cherdt/nltk-chatbot latest 8e508778df12 47 hours ago 310 MB
Shell scripts versus Dockerfile
What are the advantages or disadvantages of this method over using a Dockerfile?
- Advantage: keep unnecessary packages and libraries out of the container image.
- Advantage: smaller file size.
- Disadvantage: those packages and libraries now need to be installed on your build environment, which was not previously necessary since everything happened in the containers themselves.
- Disadvantage: the build host probably needs to be similar. I haven’t tested this, but I created a container image based on CentOS 7 on a CentOS 7 host. I’m not sure if it would work trying to create the same CentOS 7 image on, say, an Ubuntu host on the same x86_64 architecture.
- Disadvantage: since this was my first time using a shell script to build a container image, I ran into a lot of errors along the way. Every time I re-ran the script, it built the entire container image over again. This was time-consuming. Building with Dockerfiles creates intermediate container images and makes changes to those images only when needed. There may be a way to accomplish the same thing using buildah, but I haven’t explored that yet.
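On the last point, one approach that might work is to commit the slow steps (yum and pip) to a separate base image, then start later builds from that image. This is an untested sketch, and BASE_IMAGE, build_base, build_app, and build_image are hypothetical names:

```shell
#!/bin/sh
# Sketch: split the build into a slow "base" stage and a fast "app" stage.
BASE_IMAGE=nltk-chatbot-base   # hypothetical local image name

build_base() {
    ctr=$(buildah from centos:centos7) &&
    mnt=$(buildah mount "$ctr") &&
    [ -n "$mnt" ] &&   # never run the steps below against an empty path
    yum --assumeyes install --installroot "$mnt" python3 &&
    pip3 install --prefix="$mnt/usr/" Flask flask-cors nltk pyyaml requests uwsgi &&
    rm -rf "$mnt/var/cache" "$mnt"/var/log/* &&
    buildah commit "$ctr" "$BASE_IMAGE"
}

build_app() {
    # Start from the committed base image, so yum and pip do not run again
    ctr=$(buildah from "$BASE_IMAGE") &&
    mnt=$(buildah mount "$ctr") &&
    [ -n "$mnt" ] &&
    cp chatbot.py chat.html "$mnt/" &&
    buildah config --port 9500/tcp "$ctr" &&
    buildah config --entrypoint 'uwsgi --http :9500 --manage-script-name --mount /=chatbot:app' "$ctr" &&
    buildah commit "$ctr" cherdt/nltk-chatbot
}

build_image() {
    # Build the base only when it is not already in local storage
    buildah inspect --type image "$BASE_IMAGE" >/dev/null 2>&1 || build_base || return 1
    build_app
}
```

Calling build_image after editing chatbot.py would then repeat only the copy, config, and commit steps. Unlike Dockerfile layer caching, nothing here notices when the package list changes, so the base image has to be removed by hand to force a full rebuild.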
Buildah can create container images using Dockerfiles (using buildah bud). Podman can do the same (using podman build). Worth pointing out in case you want to explore alternatives to Docker without changing how you build container images!
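Both commands take a tag and a build context much like docker build does. A short sketch, assuming it is run from the directory that contains the Dockerfile; build_from_dockerfile and the example tag are hypothetical:

```shell
# Hypothetical wrapper: build the Dockerfile in the current directory
# with either tool.
build_from_dockerfile() {
    tool=$1   # "buildah" or "podman"
    tag=$2    # image name to assign, e.g. cherdt/nltk-chatbot:dockerfile
    case $tool in
        buildah) buildah bud -t "$tag" . ;;
        podman)  podman build -t "$tag" . ;;
        *) echo "unknown tool: $tool" >&2; return 2 ;;
    esac
}
```

For example: build_from_dockerfile podman cherdt/nltk-chatbot:dockerfile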
I also found that building the container image from the existing Dockerfile no longer worked (Python 2 is obsolete). I modified it to use Python 3 instead, and found that the resulting container image is 473 MB, still 163 MB larger than the image built using the shell script.