How to understand S6 Overlay v3
Otherwise known as “Container hands-off”
For those developers who like to dockerize everything, S6 Overlay is a godsend utility to manage the lifetime of a container. The new version 3 brings more flexibility, but compared to the prior version 2, it’s less straightforward to use and somewhat stricter.
If you don’t know what S6 Overlay is, let me explain it with 🍎 and 🍊. When you create a container image for your app, you also need to do the chores: monitoring each process, creating start and finish tasks, adding failure scripts, and so on. S6 Overlay is a drop-in framework that will automatically execute all of these tasks for you, instead of letting you fight with signal termination, just by creating some small files and scripts.
While it’s easy to understand how to set it up, the new v3 is not just a walk in the park. In the world of tech, you should always stay on the latest stable version when the developers say the magic words: deprecation and legacy, and some features of version 2 have been marked as such.
Setting up S6 Overlay v3
Okay, let’s imagine we have a simple app that does something. This app is made of two Node.js apps, one that acts as a web frontend, and another to process files: the “Web App” and the “Files App”, respectively.
There are also some steps that need to be executed before both start, and after both end. We don’t have the luxury of having one-app-per-container.
The container lifetime is the following:
- Prepare the directories
- Delete old cache files
- Download a base file for our Files App
- Start the “Web App”
- Start the “Files App” after the “Web App”
- Terminate Apps in any order
- Remove unprocessed files of the “File Processor” App
As you can see, trying to manage all of these “steps” is kind of chaotic. How do we tell the container to monitor both apps? What happens if a step fails?
That’s where S6 Overlay comes in.
Preparing our container
You can think of S6 Overlay as a script that will read some files in a directory, and these files will tell it how to manage the container. Don’t sweat about it.
S6 Overlay will work with the most used container base images around: the minimal Alpine, the standard Ubuntu, the i-know-what-im-doing Debian, and the i-do-this-for-a-living scratch.
If you want to choose something, my recommendation is Ubuntu, as it comes with more tools out-of-the-box and fewer headaches when compiling, but in production you may want to go for other alternatives.
Alpine is good if your app is already compiled (like under an intermediary container running Ubuntu or whatever) but you still need some minor utilities. The scratch container is my favorite if you only have a single executable that does all the jobs.
Downloading S6 Overlay
S6 Overlay (now) uses tarballs that must be extracted into the container instead of just… well… downloading a container image with S6 Overlay ready. While you should be able to use almost any container base image in any architecture, it also pins you to a specific version.
The tarballs can be separated in three: the scripts, the binaries, and the symlinks. You will surely need both the scripts and the binaries. The symlinks are mandatory if you’re using minimal container images or scratch.
If you’re using an architecture different from x86-64, you can change the download URL to any other supported one. Anyway, these can be downloaded directly from GitHub.
# We're gonna use Ubuntu because of convenience
FROM ubuntu:latest
# Set the overlay version.
ENV S6_OVERLAY_VERSION="3.1.4.1"
# Download the scripts to the temporal directory
ADD https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-noarch.tar.xz /tmp
# Extract the scripts at the container root
RUN tar -C / -Jxpf /tmp/s6-overlay-noarch.tar.xz
# Download the binaries to the temporal directory
ADD https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-x86_64.tar.xz /tmp
# Extract the binaries at the container root
RUN tar -C / -Jxpf /tmp/s6-overlay-x86_64.tar.xz
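If you build the same image for more than one architecture, BuildKit’s TARGETARCH build argument can pick the right tarball for you. A sketch, assuming curl is already installed in the base image (the amd64→x86_64 and arm64→aarch64 names match the release filenames):

```dockerfile
# TARGETARCH is provided automatically by BuildKit (amd64, arm64, ...).
ARG TARGETARCH
# Map Docker's architecture names to the S6 Overlay tarball names,
# then download and extract the matching binaries.
RUN case "${TARGETARCH}" in \
      amd64) S6_ARCH=x86_64 ;; \
      arm64) S6_ARCH=aarch64 ;; \
      *) echo "unsupported architecture: ${TARGETARCH}" && exit 1 ;; \
    esac \
    && curl -L -o /tmp/s6-overlay-arch.tar.xz \
      "https://github.com/just-containers/s6-overlay/releases/download/v${S6_OVERLAY_VERSION}/s6-overlay-${S6_ARCH}.tar.xz" \
    && tar -C / -Jxpf /tmp/s6-overlay-arch.tar.xz
```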
After that, we’re free to make it the base for our container, or label it as the base for other containers we want to use with S6 Overlay.
Adding the project files
The next step is to add our applications. Since there are two, we will tell Docker to copy the files into two separate directories, called “webapp” and “filesapp”.
The problem with these directories is the oversized node_modules directory, the .build and cache directories with some development artifacts, and the storage directory that shouldn’t be copied from development. We will create a .dockerignore in each project folder to tell Docker to ignore them when copying.
.build
node_modules
storage
After adding both files, we can safely copy the projects into the Docker image, fixing any permissions while doing it; S6 Overlay recommends doing this here rather than when the container starts.
# Copy the project files
COPY /project/app /var/www/webapp
COPY /project/files /var/www/filesapp
# Fix any permission by assigning them to "www-data" user
RUN chown -R www-data:www-data /var/www/webapp \
&& chown -R www-data:www-data /var/www/filesapp \
&& chmod -R 750 /var/www/webapp \
&& chmod -R 750 /var/www/filesapp
Adding Node.js
Because our apps depend on Node.js (“Node” from now on), we need to install it in our container. While Ubuntu includes APT as the package manager, it doesn’t include Node in its repositories.
We have to add the official Node repository. Thankfully, the folks at NodeSource have a shell script we can execute to do that in APT with one line.
# Set the Node.JS version as the latest stable ("current")
ENV NODE_VERSION="current"
# Install and update Ubuntu tools, and clean up.
RUN apt-get update \
    && apt-get -y install git curl ca-certificates gnupg \
    && apt-get clean
# Add the Node repository, install Node from it, and clean up.
RUN curl -sL https://deb.nodesource.com/setup_${NODE_VERSION}.x | bash - \
    && apt-get -y install nodejs \
    && apt-get clean \
    && node -v
You could do all of the above in one step to save layers, but in any case, Node should now be working, and the last line will output the installed version.
If you’re using another interpreter like Python or PHP, you may need to check their installation procedure, as these depend on the container and architecture.
Configuring S6 Overlay
The S6 Overlay will be called at the start of the container lifecycle as the first process, PID 1. After some minor internal setup, it will run the services you declare, and once the services end or the container is signaled to terminate, it will run your ending scripts.
We don’t need to wait for all prior steps to end before launching the Node apps, and that’s the flexibility of the new S6 Overlay v3. We can tell the overlay which services depend on which.
All services can be longrun, oneshot or bundle, and are declared inside etc/s6-overlay/s6-rc.d with a descriptive name. The service definition is documented here, and explained in detail over there. In short, you use a bunch of small files.
Before doing anything, we will create a directory with the S6 Overlay configuration that we will copy into our image later.
mkdir -p ~/projects/my-app/docker/etc/s6-overlay
cd ~/projects/my-app/docker
1. Preparing the directories
By the S6 Overlay instructions, a service that creates directories qualifies as a oneshot: it does something and then exits when it’s done.
Oneshots in S6 Overlay are kind of tricky. You have to define the type as oneshot, and include the up file that points to the script or service to execute. The down file, which is optional, is executed when the service is brought down.
First, we have to declare the service as a oneshot inside a file called type.
mkdir -p etc/s6-overlay/s6-rc.d/prepare-directories
echo "oneshot" >| etc/s6-overlay/s6-rc.d/prepare-directories/type
Next, we have to add the up file. This file must have only one line as a UNIX command, because it’s parsed by execlineb, which, contrary to shell utilities like bash or sh, replaces itself with the called process.
The recommended way is to call the responsible process directly instead of a script, shell, or another process monitor. If you call bash to execute a script that executes your app, S6 Overlay will monitor the bash process, not the app itself.
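The “replaces itself” part is the same thing exec does in a shell: after exec, the new program keeps the PID of the caller instead of becoming a child. A quick sketch with bash:

```shell
# "exec" replaces the current process image, so the PID does not change:
# a supervisor watching that PID ends up watching the real program.
pids=$(bash -c 'echo $$; exec bash -c "echo \$\$"')
# Both lines are the same PID, because exec did not spawn a child.
echo "$pids"
```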
In our case, it will call bash, which in turn will run a script that handles all of the directory preparation. Since this oneshot simply runs and exits, monitoring the bash process here is harmless.
echo "bash /etc/s6-overlay/s6-rc.d/prepare-directories/run.sh" >| etc/s6-overlay/s6-rc.d/prepare-directories/up
After that, the only thing left is to create our script as run.sh inside /etc/s6-overlay/s6-rc.d/prepare-directories, and here we will also fix some permissions and ownership that are required for these directories to work properly.
#!/bin/bash
mkdir -p /var/www/webapp/cache/
mkdir -p /var/www/webapp/storage/
mkdir -p /var/www/filesapp/storage/
# Make the group ownership inheritable
chmod g+s /var/www/webapp/
chmod g+s /var/www/filesapp/
# Make folders writable by the www-data group used by the app processes
chmod g+w /var/www/webapp/cache/
chmod g+w /var/www/webapp/storage/
chmod g+w /var/www/filesapp/storage/
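A quick way to convince yourself of what g+s does: the setgid bit shows up as a leading “2” in the octal mode, and files created inside the directory inherit its group. A small sketch in a throwaway directory:

```shell
# Create a directory, set the setgid bit, and inspect the result.
dir=$(mktemp -d)/storage
mkdir -p "$dir"
chmod g+s "$dir"
# The leading "2" in the octal mode is the setgid bit (e.g. 2755).
stat -c '%a' "$dir"
# Files created inside inherit the directory's group automatically.
touch "$dir/example.txt"
stat -c '%G' "$dir/example.txt"
```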
2. Delete old cache files
This step is relatively easy. We only need to remove all files in the cache directory. Since it’s another oneshot service, we take the same steps as before:
mkdir -p etc/s6-overlay/s6-rc.d/delete-cache
echo "oneshot" >| etc/s6-overlay/s6-rc.d/delete-cache/type
echo "bash /etc/s6-overlay/s6-rc.d/delete-cache/run.sh" >| etc/s6-overlay/s6-rc.d/delete-cache/up
And in our run.sh, add the commands necessary to delete the cached files, if there are any.
#!/bin/bash
# -f keeps the command from failing when there is nothing to delete
rm -rf /var/www/webapp/cache/*
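As an aside, if you would rather not rely on shell globbing (which can behave differently across shells when the directory is empty), a find-based variant deletes everything inside the directory while keeping the directory itself; a sketch with a stand-in path:

```shell
# Stand-in for /var/www/webapp/cache/ so the sketch is self-contained.
cache=$(mktemp -d)
touch "$cache/a.tmp" "$cache/b.tmp"
# -mindepth 1 protects the cache directory itself from being deleted.
find "$cache" -mindepth 1 -delete
# The directory survives, its contents are gone.
ls -A "$cache"
```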
The problem with this command is simple: what happens if the directory doesn’t exist? How can we be sure the directories are prepared before executing this step? That’s where dependencies come in, to ensure a command runs after another.
Dependencies in S6 Overlay are defined in the dependencies.d directory of the service, each of them as an empty file with the name of the service it depends on. We can add one or many, it doesn’t matter.
mkdir -p etc/s6-overlay/s6-rc.d/delete-cache/dependencies.d
touch etc/s6-overlay/s6-rc.d/delete-cache/dependencies.d/prepare-directories
S6 Overlay will automatically check if these dependencies have run successfully, and just then run the service. We don’t need to make any of these checks by ourselves.
3. Download base file
Same as before, create a oneshot service and add the script that will handle the download.
mkdir -p etc/s6-overlay/s6-rc.d/download-base
echo "oneshot" >| etc/s6-overlay/s6-rc.d/download-base/type
In my case, I can simply call a Node script that will handle the download of the base file.
echo "npm --prefix /var/www/filesapp run download" >| etc/s6-overlay/s6-rc.d/download-base/up
This command depends on the prepared directories, so, as with the previous service, we need to add the dependency; otherwise it will run as soon as it can.
mkdir -p etc/s6-overlay/s6-rc.d/download-base/dependencies.d
touch etc/s6-overlay/s6-rc.d/download-base/dependencies.d/prepare-directories
4. Start the Webapp
This one is not a oneshot, but a long-lasting process that will only exit because of an unrecoverable state, or because the container has been told to stop and it receives the termination signal from S6 Overlay. In that case, we have to set it as longrun.
mkdir -p etc/s6-overlay/s6-rc.d/start-webapp
echo "longrun" >| etc/s6-overlay/s6-rc.d/start-webapp/type
To run our Webapp, we need to call the NPM script called “production”. Unlike oneshots, a longrun service uses an executable run file instead of up (the COPY --chmod=755 we do later makes it executable inside the image):
cat > etc/s6-overlay/s6-rc.d/start-webapp/run <<'EOF'
#!/command/execlineb -P
npm --prefix /var/www/webapp run production
EOF
There is one more step we have to make: setting the dependencies for this service. We can’t run the Webapp until the cache is cleared, because the WebApp has to regenerate it so changes are recompiled properly.
mkdir -p etc/s6-overlay/s6-rc.d/start-webapp/dependencies.d
touch etc/s6-overlay/s6-rc.d/start-webapp/dependencies.d/delete-cache
5. Start the Files App
This app is also long-running, since it constantly checks if a new file has to be processed, and saves the results somewhere. Same as before, we set it as longrun and add the command to start the application.
mkdir -p etc/s6-overlay/s6-rc.d/start-fileapp
echo "longrun" >| etc/s6-overlay/s6-rc.d/start-fileapp/type
cat > etc/s6-overlay/s6-rc.d/start-fileapp/run <<'EOF'
#!/command/execlineb -P
npm --prefix /var/www/filesapp run production
EOF
Of course, this process depends on the Web App, since without it there will be no files to process. It also depends on the base file being downloaded. S6 Overlay will smartly resolve the dependency chain of both dependencies.
mkdir -p etc/s6-overlay/s6-rc.d/start-fileapp/dependencies.d
touch etc/s6-overlay/s6-rc.d/start-fileapp/dependencies.d/download-base \
etc/s6-overlay/s6-rc.d/start-fileapp/dependencies.d/start-webapp
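To see why declaring dependencies is all it takes, here is a tiny illustration (this is not s6-rc’s actual algorithm, just the same idea) that recreates this article’s dependencies.d files and derives a valid start order with tsort:

```shell
# Recreate the dependency files in a scratch directory.
root=$(mktemp -d)
for svc in prepare-directories delete-cache download-base start-webapp start-fileapp; do
  mkdir -p "$root/$svc/dependencies.d"
done
touch "$root/delete-cache/dependencies.d/prepare-directories"
touch "$root/download-base/dependencies.d/prepare-directories"
touch "$root/start-webapp/dependencies.d/delete-cache"
touch "$root/start-fileapp/dependencies.d/download-base"
touch "$root/start-fileapp/dependencies.d/start-webapp"

# Emit "dependency service" pairs and topologically sort them:
# every service is printed after all of its dependencies.
for svc in "$root"/*/; do
  name=$(basename "$svc")
  for dep in "$svc"dependencies.d/*; do
    [ -e "$dep" ] && echo "$(basename "$dep") $name"
  done
done | tsort
```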
6. Terminate the “Web App”
Our WebApp has to terminate gracefully. Luckily for us, there is nothing to “undo” when the WebApp shuts down.
On the other hand…
7. Remove unprocessed files of the “File Processor” App
…our File App doesn’t like to terminate gracefully.
When the Files App terminates, any ongoing file processing is also terminated abruptly. That means that lingering trash will pile up, and unprocessed files will still be marked as “ongoing”.
That’s a problem. When the container starts again, the Files App will read the information about the incomplete files as “ongoing” and it won’t pick them up. Instead, we need to mark these as “pending” again.
Enter the finalization scripts.
First, we’re going to set a timeout to finish. It’s basically a file that indicates the milliseconds to wait for the service to terminate before being killed. We don’t want this service to hang indefinitely while being shut down, but we also want to give it time to end any file processing that is close to the finish line. This file is entirely optional, but I can make a case for creating it for this app.
Adding 10,000 milliseconds (10 seconds) should be enough.
echo "10000" >| etc/s6-overlay/s6-rc.d/start-fileapp/timeout-down
Now, we will add a script that will set all ongoing processes that didn’t terminate to “pending” in the database, signaling the app to pick them up again once it starts.
This “unorphaning” is done via an NPM script, so there is no need to pull our hair out writing database calls in bash. We only have to create the finish script file inside the service directory.
#!/bin/sh
# -----------------------------------------------------------
# This file is in etc/s6-overlay/s6-rc.d/start-fileapp/finish
# -----------------------------------------------------------
# Tell the FileApp to unorphan each file processing.
npm --prefix /var/www/filesapp run unorphan
# If the service was killed by a signal ($1 is 256), report 128 + the signal number.
if test "$1" -eq 256 ; then
e=$((128 + $2))
else
e="$1"
fi
# Pass the exit code to S6 Overlay so we can know the exit code later.
echo "$e" > /run/s6-linux-init-container-results/exitcode
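That 256 check deserves a word: s6-supervise passes 256 as the first argument when the service died from a signal, with the signal number as the second argument, and the shell convention for “killed by signal N” is exit code 128 + N. The mapping can be sketched as a function:

```shell
# Reproduce the finish script's exit-code logic in isolation.
map_exitcode() {
  if test "$1" -eq 256 ; then
    echo $((128 + $2))   # killed by signal $2
  else
    echo "$1"            # normal exit code, passed through
  fi
}

map_exitcode 0 0     # clean exit        -> prints 0
map_exitcode 256 15  # killed by SIGTERM -> prints 143
map_exitcode 256 9   # killed by SIGKILL -> prints 137
```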
And that’s it. Now we need to tell S6 Overlay to run these services.
Telling S6 Overlay to start the services
We have defined the services to run, but we haven’t told S6 Overlay to run them. Luckily, S6 Overlay offers a simple way to handle this: by adding each of them to the user bundle.
Let’s get this straight. S6 Overlay starts the built-in user service, which is a “bundle”, meaning it’s considered a group of services. Each service declared inside starts automatically.
S6 Overlay also includes the base service, another bundle that starts before user; it’s useful to avoid race conditions, as it guarantees that its tasks run and end before the user service bundle does. In short: the base bundle runs first, the user bundle runs second.
We need to add each service to the user bundle. The start order is defined by the dependencies of each, which is handled magically by S6 Overlay.
For that, we can create empty files with the names of the services inside the user/contents.d directory.
mkdir -p etc/s6-overlay/s6-rc.d/user/contents.d
touch etc/s6-overlay/s6-rc.d/user/contents.d/prepare-directories \
etc/s6-overlay/s6-rc.d/user/contents.d/delete-cache \
etc/s6-overlay/s6-rc.d/user/contents.d/download-base \
etc/s6-overlay/s6-rc.d/user/contents.d/start-webapp \
etc/s6-overlay/s6-rc.d/user/contents.d/start-fileapp
When S6 Overlay runs, these services will run in any order, but respecting their dependencies.
Copying S6 Overlay to the Container
We will simply copy the configuration into the Docker container like any other directory, adding the proper permissions to it. This is also a great opportunity to copy other files, if you have them.
# Copies S6 Overlay config into the container
COPY --chmod=755 /etc/s6-overlay /etc/s6-overlay
Finishing touches
Usually I like to set the user at the end of the Dockerfile, because sometimes defining it earlier may conflict with RUN commands, especially when updating dependencies.
Then, we set the work directory for commands, the entrypoint, and the default command.
# Set the user for this container
USER www-data
# Set the webapp as the workdir
WORKDIR /var/www/webapp
# Run the S6 Overlay INIT
ENTRYPOINT ["/init"]
# Set the default command to be Node version.
CMD ["node", "-v"]
And that’s it. We can now build the container image and check whether it works or not.
docker build -t my-s6-app .
docker run --name my-s6-app-demo -d -p 80:80 -p 8080:8080 my-s6-app
docker top my-s6-app-demo acxf
S6 Overlay offers a lot of convenience, but it will also make some container technologies scream in panic because it comes with one important caveat:
It only runs as PID 1.
PID 1 or get out
In the world of Linux, the Process Identifier 1 (PID 1) is considered the “master” process, and there is no parent process above it. If this process terminates, the computer shuts down, because it is understood there is no more work to be done.
S6 Overlay needs to be at the top of the chain of processes to efficiently monitor everything that’s happening below it. If not, some processes may appear outside of what S6 Overlay controls, and these can become uncontrollable zombie processes.
Since version 3, being PID 1 is mandatory. If you use an entrypoint different from ["/init"], like a bash script or just another app, then I’m sorry, but you will have to use S6 Overlay v2.
For example, S6 Overlay doesn’t work with Fly.io, because they inject their own init process into the container images they receive, which is probably what allows them to monitor each container in their platform (it runs Firecracker underneath, so there is the explanation).
There may be a workaround, though: call S6 Overlay’s init with unshare so it gets its own PID 1, but it beats me if this works, and for how long it would.
What unshare does is fairly simple: it runs a given process as its child, but inside a new PID namespace, so that process sees itself as PID 1 from its own point of view; it will also pass any signal to the child process. Of course, this utility should be used with care.