Originally, some of my Dockerfiles looked something like this:
FROM docker.psidox.com/base
ADD large.zip .
RUN apt-get update
RUN apt-get install -y unzip
..
RUN unzip large.zip ...
..
CMD ["/bin/bash", "/bootstrap.sh"]
The Docker images built from these Dockerfiles carry a lot of unnecessary layers, since each instruction in a Dockerfile produces its own layer. Some layers, like the one that adds “large.zip”, can be quite large and stay with the image forever, even though the file was only needed temporarily. Deleting the file in a later instruction does not help, because the earlier layer still contains it.
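To make the layer count concrete, here is a rough sketch that counts the instructions in a Dockerfile like the one above. The file contents are a hypothetical copy of the listing without its elided lines, and the helper name count_layers is made up for this example. It assumes every instruction after FROM adds a layer, as described above. On a machine with Docker, `docker history <image>` shows the real layers of a built image.

```shell
# Hypothetical copy of the Dockerfile above, without the elided lines.
dockerfile="$(mktemp)"
cat > "$dockerfile" <<'EOF'
FROM docker.psidox.com/base
ADD large.zip .
RUN apt-get update
RUN apt-get install -y unzip
RUN unzip large.zip
CMD ["/bin/bash", "/bootstrap.sh"]
EOF

# Count every line that is not blank, not a comment, and not FROM.
# Assumption: one instruction per line (no backslash continuations).
count_layers() {
  grep -cvE '^[[:space:]]*(#|FROM[[:space:]]|$)' "$1"
}

layers="$(count_layers "$dockerfile")"
echo "instructions adding a layer: $layers"
```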
My solution to the problem, until something better comes along, is this three-line Dockerfile:
FROM docker.psidox.com/base
RUN scp -r vagrant@172.17.42.1:/vagrant/docker/keycloak/* . && /bin/bash /fs/build.sh && echo 'success.'
CMD ["/bin/bash", "/bootstrap.sh"]
We now have a single run entry that does the following things:
- Uses scp to copy the files needed to build the image, including the build script, in from the host machine
- Executes the build script
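The build script itself is not shown in this post, so the following is only a guess at its general shape, not my actual /fs/build.sh. It uses a temporary directory and a tar archive as stand-ins (the names workdir, payload and large.tar.gz are invented) so that it runs anywhere. In a real script, package installation such as `apt-get install -y unzip` and the actual unzip call would take their place.

```shell
#!/bin/bash
# Hypothetical stand-in for a build script run inside the single RUN step.

# Simulate the files that scp would have copied in: one large archive.
workdir="$(mktemp -d)"
mkdir "$workdir/payload"
echo "application data" > "$workdir/payload/app.txt"
tar -czf "$workdir/large.tar.gz" -C "$workdir" payload
rm -r "$workdir/payload"

# Step 1: unpack the archive (the real script would use unzip here).
tar -xzf "$workdir/large.tar.gz" -C "$workdir"

# Step 2: delete the archive in the same step, so that the layer
# committed at the end of the RUN instruction never contains it.
rm -f "$workdir/large.tar.gz"

# Show what is left: only the unpacked files.
ls -R "$workdir"
```

The important part is that unpacking and cleanup happen inside one RUN instruction, so the temporary file is gone before the layer is committed.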
There are a few advantages / disadvantages to this approach, as I see it:
Pros
- The resulting image adds a single RUN layer on top of the base image instead of one layer per Dockerfile instruction
- Large files (e.g. zip files) used during the build can be removed, so the resulting layer carries no excess baggage
Cons
- You need a base image with an SSH key in it so that you can scp from the host, or you need something like sshpass with a plain-text password, which will be visible in the Docker image history forever
- Partially a repeat of the first point: the extra complexity of using scp to copy files into the container and of having to specify the host by IP address
Hope for a better future..
I am not entirely happy with this method and hope to find a better way of doing this in the future. One issue I am currently dealing with is that I also build these Docker images as part of my CI process. My development process uses Vagrant as the host, but on my CI machine I use a Docker container as the Jenkins slave. The problem shows up when you need to specify the host to scp files from: in Vagrant+Docker the host IP is 172.17.42.1, whereas in my Jenkins+Docker environment it is 10.0.42.1. So far it is not possible to pass environment variables in as part of the “docker build” process, so one possible solution is to compile the Dockerfile before execution. See “Support for environment variables for building containers”.
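The "compile the Dockerfile first" idea could look roughly like this. It is only a sketch: the BUILD_ENV variable, the __HOST__ placeholder and the host_for helper are invented for illustration, and it assumes the vagrant user and path are the same in both environments, which may not hold on a real CI machine.

```shell
# Map an environment name to the host IP (values taken from the text above).
host_for() {
  case "$1" in
    vagrant) echo "172.17.42.1" ;;
    jenkins) echo "10.0.42.1" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Hypothetical template with a __HOST__ placeholder instead of a fixed IP.
template="$(mktemp)"
cat > "$template" <<'EOF'
FROM docker.psidox.com/base
RUN scp -r vagrant@__HOST__:/vagrant/docker/keycloak/* . && /bin/bash /fs/build.sh && echo 'success.'
CMD ["/bin/bash", "/bootstrap.sh"]
EOF

# BUILD_ENV is a made-up variable; it defaults to the development setup.
BUILD_ENV="${BUILD_ENV:-vagrant}"
host_ip="$(host_for "$BUILD_ENV")"

# "Compile" the template into a real Dockerfile in a scratch build directory.
out_dir="$(mktemp -d)"
sed "s/__HOST__/$host_ip/g" "$template" > "$out_dir/Dockerfile"
cat "$out_dir/Dockerfile"

# With Docker available, the result could then be built with:
#   docker build "$out_dir"
```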
Also mentioned in the link above, there may be a way coming up soon to use image builders to create docker images. Could this be the solution I am looking for?
Update: July 4, 2014
With the release of Docker 1.1.0, one of the above-mentioned problems has been solved. The following is an excerpt from the release documentation:
Allow a tar file as context for docker build
You can now pass a tar archive to `docker build` as context. This can be used to automate docker builds, for example:
cat context.tar | docker build -
or
docker run builder_image | docker build -
So, it is now possible to pipe a Dockerfile in via STDIN. This allows us to do variable replacement on a “Dockerfile template”, which can then be piped into “docker build -”:
FROM docker.psidox.com/base
RUN scp -r $HOST_USERNAME@$HOST:$HOST_PATH . && /bin/bash /fs/build.sh && echo 'success.'
CMD ["/bin/bash", "/bootstrap.sh"]
Then to create an image from this Dockerfile template:
export HOST_USERNAME=vagrant
export HOST=172.17.42.1
export HOST_PATH=/vagrant/docker/keycloak
envsubst < Dockerfile | docker build -
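One caveat with the command above: by default envsubst replaces every $VARIABLE reference it finds, so a shell variable used inside a RUN line would be substituted as well. GNU envsubst accepts a list of variable names that limits what it touches. The sketch below uses a hypothetical template with an extra $PATH reference to show the difference; it prints the rendered Dockerfile instead of building it, so it runs without Docker.

```shell
# Hypothetical template: the HOST_* placeholders should be replaced,
# but $PATH should reach the container's shell untouched.
template="$(mktemp)"
cat > "$template" <<'EOF'
FROM docker.psidox.com/base
RUN scp -r $HOST_USERNAME@$HOST:$HOST_PATH . && echo "PATH is $PATH"
CMD ["/bin/bash", "/bootstrap.sh"]
EOF

export HOST_USERNAME=vagrant
export HOST=172.17.42.1
export HOST_PATH=/vagrant/docker/keycloak

# The single-quoted argument lists the only variables envsubst may replace.
rendered="$(envsubst '$HOST_USERNAME $HOST $HOST_PATH' < "$template")"
printf '%s\n' "$rendered"

# With Docker available, the rendered text could be piped on:
#   printf '%s\n' "$rendered" | docker build -
```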
To conclude, we can now use a Dockerfile template combined with Linux environment variable substitution and pipe the result into a “docker build -” command. This does not quite solve the complexities involved with the scp process, but in the process of writing this entry I have realized that a Dockerfile is not necessary to build a Docker image at all. In my next post I will share this new approach, which will help us get rid of the scp command.