Modify containers to run as non-root and other mitigations
Following some guidelines here:
Openshift specific:
In general, when authoring containers, developers should try run with the least privileges as possible.
Modifying a container’s USER
If a container’s Dockerfile does not set a USER, then it runs as root by
default.This is dangerous because root inside a container is also root on the
host. Openshift prevents containers from running as root by applying a
default restricted SecurityContextConstraint. When a container is started,
Openshift will randomly select a uid from a range that does not have access to
anything on the worker node in case a malicious container process is able to
break out of its sandbox.
In most application scenarios, the actual user a process runs as doesn’t matter,
but there are some legitimate cases where the container expects to be run as a
particular user, such as some database containers or other applications where it
needs to read or write to its local filesystem or to a persistent volume. A
simple mitigation is to add the USER directive to the Dockerfile before the CMD
or ENTRYPOINT so that the main container process does not run as root, e.g.
USER 1000
Then making sure the files it modifies contains the correct permissions.
It’s better to provide a numeric value rather than an existing user in
/etc/passwd in the container’s filesystem, as Openshift will be able to validate
the numeric value against any SCCs that restrict the uids that a container may
run as. In the case where we use a third party container and we are not able to
modify the Dockerfile, or the USER directive refers to a user that corresponds
to something in /etc/passwd, we can add the securityContext section to the
podspec to identify the UID that it the pod refers to. For example, in
BlueCompute the MySQL container we used is from dockerhub, but they allow
running as USER mysql which corresponds to uid 5984 in /etc/passwd, so we added
this section to the podSpec in the deployment:
securityContext: runAsUser: 5984 runAsGroup: 5984 fsGroup: 1000
The fsGroup is useful to provide supplemental groups which are added to the container’s processes. For example, in the above case the container process can also interact with files owned by group 1000, which might be helpful if using existing shared storage where there are directories owned by the group.
Modifying a container’s filesystem for read/write
If filesystem access is needed in the container filesystem, then those files
should be owned by and read/writable by the root group. In Openshift, the
arbitrary uid used by the restricted SCC will be added to the root group.
Directories that must be read to/written from as scratch space may add the
following to the Dockerfile:
RUN chgrp -R 0 /some/directory && \
chmod -R g=u /some/directory
Another strategy that we’ve had success with is to create an emptyDir volume and
mount it to the directory, which Kubernetes will create and destroy with the
pod. The emptyDir volume is owned by root but is world writable and can be used
as local storage for the container. This also helps someone reviewing the pod
definition identify which directories will be written to.
volumes:
- emptyDir: {}
name: database-storage