Channel: Sam Saffron's Blog - Latest posts

Discourse in a Docker container


Deploying Rails applications is something we all struggle with.

You would think that years on we would have made some progress on this front, but no, deploying Rails is almost as complicated as it was back in the mongrel days.

Yes, we have Passenger; however, getting it installed and working with rvm/rbenv is a bit of a black art, and let us not mention daemonizing Sidekiq or Resque. Or, god forbid, configuring PostgreSQL.

This is why we often outsource the task to application-as-a-service providers.

Last week I decided to spend some time experimenting with Docker.

What is Docker?

an open source project to pack, ship and run any application as a lightweight container

The concept is that developers and sysadmins can author simple images, usually via Dockerfiles, that provide a pristine state encapsulating an application. Docker uses all sorts of trickery to make authoring these images a painless experience, and it offers a central repo where users can share images.

Think of it as a VM without the performance penalty of a VM: Docker containers run on the same kernel as the host, unvirtualized.

When a user launches a "container", a private unique IP is provisioned and the process runs isolated. Docker launches a single process inside the container; however, that process may spawn others.

Docker (today: version 0.6.5) is a front end that drives Linux LXC containers and uses a copy-on-write storage engine built on AUFS. It is the "glue" that gives you a simple API to deal with containers and optionally run them in the background, persistently.

Docker is built in golang, and has a very active community.

Restrictions

Docker version 0.6.5 is still not deemed "production ready". The technologies it wraps are considered production ready; however, the APIs are changing rapidly, with some radical changes to come later on.

There are plans to extract the AUFS support and probably use lvm thin provisioning as the preferred storage backend.

As it stands, the only OS the Docker team recommends running Docker on is Ubuntu LTS 12.04.03 (note, LTS ships with multiple kernels; you need at least 3.8). I have had luck with Ubuntu 13.04; however, 13.10 does not work with Docker today (since it ships with an incompatible, alpha version of LXC). Additionally, you should be aware of a networking issue in VMs that affects kernel 3.8.
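A quick sanity check before installing can save some head scratching. The version comparison below is the only real logic; the package hint in the message is the usual 12.04 backport kernel, but verify it for your setup:

```shell
# Check the running kernel is at least 3.8 before installing Docker.
kernel_ok() {
  # true when $1 sorts at or after 3.8 in version order
  [ "$(printf '%s\n' "3.8" "$1" | sort -V | head -n 1)" = "3.8" ]
}

if kernel_ok "$(uname -r)"; then
  echo "kernel is new enough for Docker"
else
  echo "kernel too old; on 12.04 install the lts-raring backport kernel"
fi
```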

The AUFS dependency is the main reason for this tough restriction; however, I feel confident it is going away. Red Hat are banking on it.

Security

It is very important to read through the LXC security document. Depending on your version of LXC, the root user inside a container may have global root privileges. This may not matter to you, or it may be critical, depending on your application and usage.

Additionally, file mounts are a mess: if you mount a location external to the Docker container using the -v option of docker run, permissions get a bit crazy. UIDs inside Docker do not match UIDs outside of it, so for example:

View from the outside

View from inside the container.

There are plans to mitigate this problem. It can be worked around with NFS shares, avoiding mounts or synchronizing users and groups between containers and host.
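A hypothetical illustration of the mismatch (the image name and paths here are mine, not from any official setup, and the docker step is guarded so the snippet is harmless on a box without Docker):

```shell
# Create a file on the host; ls -ln shows raw numeric UIDs.
host_dir=$(mktemp -d)
touch "$host_dir/uploads"
ls -ln "$host_dir"    # owned by your numeric UID, e.g. 1000

# Inside a container, that same numeric UID maps to whichever user
# happens to hold UID 1000 there, or to no named user at all.
if command -v docker >/dev/null; then
  docker run -v "$host_dir:/shared" samsaffron/discourse ls -ln /shared
fi
```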

The 42 layer problem in AUFS

AUFS only supports 42 layers. That may seem like a lot, but you hit it very early when building complex images. Dockerfiles make it very easy to reuse work when building images. For example, say I am building an image and decide to add "one more thing". When I add a new RUN command, Docker is smart enough to reuse all my previous work, so building the image is snappy. As a result, many Dockerfiles contain lots and lots of RUN commands.

To circumvent this issue, our base image is built as a single layer. When I am experimenting with changes I add them at the end of the file, eventually rolling them into the big shell command.
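As a sketch, the pattern looks like this: everything permanent lives in one chained RUN, and experiments get tacked on at the end. The package list is illustrative, not the real base image, and the Dockerfile is generated inline here only so the snippet is self-contained:

```shell
cd "$(mktemp -d)"   # scratch dir so the snippet is self-contained

# One big chained RUN produces one AUFS layer instead of a dozen.
cat > Dockerfile <<'EOF'
FROM ubuntu:12.04
RUN apt-get update &&\
    apt-get -y install build-essential git libpq-dev &&\
    git clone https://github.com/discourse/discourse.git &&\
    apt-get clean
# while iterating, add experiments down here, then fold them into
# the big RUN above once they work:
# RUN echo "one more thing"
EOF

grep -c '^RUN' Dockerfile    # prints 1: a single layer-producing RUN
```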

Gotchas developing with Docker

When developing with Docker it is quite easy to accumulate a pile of images you never use, and containers that stopped long ago and are disposable. It is fairly important to stay vigilant and keep cleaning up: any complex Docker environment is going to need a clean process for eliminating unnecessary containers and images.

While developing I found myself running the following quite a lot:

docker rm `docker ps -a  | grep Exit | awk '{ print $1 }'`

remove all containers that exited
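The companion for images is handy too. Docker 0.6.x has no dangling-image filter yet, so grepping for the <none> placeholder does the job (guarded here so the line is harmless on a box without Docker):

```shell
# remove untagged images left behind by rebuilds
if command -v docker >/dev/null; then
  docker rmi `docker images | grep '<none>' | awk '{ print $3 }'`
fi
```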

This blog is running on Docker

There has been a previous attempt to run Discourse under Docker by srid. However I wanted to take a fresh look at the problem and in a "trial-by-fire" come up with a design that worked for me.

Note, this setup is clearly not something we will be supporting externally, or would like made official quite yet; however, it has an enormous amount of appeal and potential. After working through a Docker Discourse setup with our awesome sysadmin supermathie, he described it as "20% of the work" he usually does.

This is how you would work through it:

  • Install Ubuntu 12.04.03 LTS
  • sudo apt-get install git
  • git clone https://github.com/SamSaffron/discourse_docker.git
  • cd discourse_docker, run ./launcher for instructions on how to install docker
  • Install docker
  • Build the base image: sudo docker build image (note the hash of the image)
  • Tag the image: sudo docker tag [hash of image] samsaffron/discourse
  • Modify the base template to suit your needs (standalone.yml.sample):

# this is the base template, you should not change it
template: "standalone.template.yml"
# which ports to expose?
expose:
  - "80:80"
  - "2222:22"

params:
  # ssh key so you can log in
  ssh_key: YOUR_SSH_KEY
  # git revision to run
  version: HEAD


  # host name, required by Discourse
  database_yml:
    production:
      host_names:
        # your domain name
        - www.example.com


# needed for bootstrapping, lowercase email
env:
  DEVELOPER_EMAILS: 'my_email@email.com'

  • Save it as, say, web.yaml
  • Run sudo ./launcher bootstrap web to create an image for your site
  • Run sudo ./launcher start web to start the site

At this point you will have a Discourse site up and running, with sshd / nginx / postgresql / redis / unicorn running in a single container and runit ensuring all the processes keep running (though I still need to build in the monitoring bits).

At no point during this setup did you have to pick the redis and postgres version, or mess around with nginx config files. It was all scripted in a completely reproducible fashion.

This solution is 100% transparent and hackable for other purposes

The launcher shell script has no Discourse-specific logic built in. Nor does pups, the YAML-based image bootstrapper inspired by ansible. You can go ahead and adapt this solution to your own purposes and extend it as you see fit.

I took it on myself to create the most complex setup first; however, this can easily be adapted to run separate applications per container using the single base image. You may prefer to run PostgreSQL and Redis in one container and the web in another, for example. The base image has all the programs needed, and copy-on-write makes storage cheap.

I elected to keep all persistent data outside of the container, that way I can always throw away a container and start again from scratch, easily.
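As a sketch, that means bind-mounting the mutable locations at launch. The paths, image name, and init command below are my examples, not the official setup, and -v has the permission caveats mentioned earlier:

```shell
# all mutable state lives on the host; the container can be thrown away
docker run -d \
  -v /var/docker/data/postgres:/var/lib/postgresql \
  -v /var/docker/data/uploads:/var/www/discourse/uploads \
  samsaffron/discourse /sbin/runit
```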

The importance of the sshd backdoor into the container

During my work with Docker I really wanted to be able to quickly log on to a container and mess about a bit. I am not alone.

A common technique to allow users direct access into a system container is to run a separate sshd inside the container. Users then connect to that sshd directly. In this way, you can treat the container just like you treat a full virtual machine where you grant external access. If you give the container a routable address, then users can reach it without using ssh tunneling.
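With the 2222:22 mapping from the template above, that looks like this (the host name and user are placeholders):

```shell
# sshd inside the container, published on the host's port 2222
ssh -p 2222 admin@www.example.com
```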

One process per container

Docker only launches a single process per container; it is your responsibility to launch any other processes you need and take care of monitoring. This is why I picked runit as the ideal tool for this task: its memory footprint is tiny.

Compare that to the 105000 VSZ and 18700 RSS bluepill takes.

VSZ and RSS numbers this low are probably very foreign to today's programmers. runit is perfect for this task and makes orchestrating a container internally very simple. It takes care of dependencies, so, for example, unicorn will not launch until Postgres and Redis are running.
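A sketch of what such a service looks like. The sv calls and paths illustrate the pattern rather than being copied from the real image, and the script is written to a local directory here; inside the image it would live under /etc/service:

```shell
cd "$(mktemp -d)"   # scratch dir so the sketch is self-contained

mkdir -p service/unicorn
cat > service/unicorn/run <<'EOF'
#!/bin/bash
exec 2>&1
# sv start blocks until the dependency is up (or fails), giving us ordering
sv start postgres || exit 1
sv start redis || exit 1
cd /var/www/discourse
exec sudo -u discourse -E -H bundle exec unicorn -E production -c config/unicorn.conf.rb
EOF
chmod +x service/unicorn/run
```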

The upgrade problem

Docker opens a bunch of new options when it comes to application upgrades. For example, you can bootstrap a new container with a new version, stop your old container and start the new one.

You can also enable seamless upgrades on a single machine using 4 containers: a db container, an haproxy container, and 2 web containers. Just notify haproxy that a web is going down, pull it out of rotation, upgrade that container, and push it back into rotation.
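Sketched with haproxy's admin socket (socat, the socket path, and the web/web1 backend name are all my assumptions, not part of the setup above):

```shell
SOCK=/var/run/haproxy.sock
ha() { echo "$1" | socat stdio "$SOCK"; }

ha "disable server web/web1"    # drain traffic away from web1
sudo ./launcher bootstrap web1  # build the upgraded container
sudo ./launcher start web1      # bring it back up
ha "enable server web/web1"     # back into rotation
```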

Since we are running sshd in each container we can still use the traditional mechanisms of upgrade as well.

In more "enterprisey" setups you can run your own Docker registry; that way your CI machine can prep the images, and the deploy process simply pulls the image on each box, shuts down old containers and starts new ones. Distributing images is far more efficient and predictable than copying thousands of files with rsync each time you deploy.
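The flow is roughly this (the registry host name is a placeholder, and [image hash] is whatever your build produced):

```shell
# on the CI machine: tag the prepped image and push to a private registry
docker tag [image hash] registry.internal:5000/discourse
docker push registry.internal:5000/discourse

# on each app box: pull the image, then swap containers over
docker pull registry.internal:5000/discourse
```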

Why yet another ansible?

While working on the process I came up with my own DSL for bootstrapping my Discourse images. I purpose-built it to solve the main issues I was hitting with a simple shell script: multiline replaces are hard with awk and grep, the syntax is scary to some, and merging YAML files is not something you can easily do in a shell script.

pups makes these problems quite easy to solve:

run:
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /upstream[^\}]+\}/m
      to: "upstream discourse {
        server 127.0.0.1:3000;
      }"

multiline regex replace for an nginx conf file

The DSL and tool live here: https://github.com/samsaffron/pups. Feel free to use it where you need it. I picked it over ansible because I wanted an exact fit for my problem.

The initial image was simple enough to fit in a Dockerfile; however, the process of bootstrapping Discourse is complex. You need to spin up background processes, do lots of fancy replacing, and so on. You can see the template I am using for this site here: https://github.com/SamSaffron/discourse_docker/blob/master/standalone.template.yml

The future

I see a very bright future for Docker; a huge ecosystem is forming, with new Docker-based applications launching monthly. For example, CoreOS, Deis and others are building businesses on top of Docker. OpenStack Havana supports Docker out of the box.

Many of the issues I have raised in this post are being actively resolved. Docker is far more than a pretty front end on the decade-old BSD jail concept. It is attempting to provide a standard we can all use, in dev and in production, regardless of the OS we are running, allowing us to set up environments quickly and cleanly.

