Behind the scenes: How we test CFEngine

Over the last year we have changed the way we test our software from a manual process to a highly automated one. The new system is capable of taking a change from our source code repository and following it all the way to the point where we update our internal staging servers, giving us valuable information while keeping manual intervention to a minimum.

Overview

Our test system consists of the following pieces:

Two policy hubs that manage our build and deploy farm
Vagrant to deploy new machines on the fly
A Jenkins master + build slaves
Build scripts that automate most processes

We follow the mantra of eating our own dog food, so we use CFEngine to manage our build farm. All of our build slaves (and even the build master) run CFEngine. This lets us focus on the important things, knowing that CFEngine will keep our build slaves behaving correctly. It has also served a second purpose: we have found (and fixed) a few issues by using CFEngine this way.

We also have a second policy hub that is used to launch and configure machines on demand. We use this policy hub to control the hypervisor that hosts our build slaves and the Jenkins master. It is kept separate from the other policy hub because we like to have both a physical and a conceptual boundary. It also contains policies that do housekeeping on our hypervisor. As an added bonus, this allows us to reprovision our hypervisor very quickly on a new machine if the current machine fails.

We will describe our testing procedures later, but for now the key point is that we need to be able to deploy machines quickly and bring them down once they are done. For that, we have found that the Vagrant + CFEngine integration makes our life very easy.

Like most software shops, we have a full-size Continuous Integration system. We chose Jenkins because of its myriad plugins, which make our life much easier.

Finally, we keep our Jenkins recipes to a minimum. We keep all of our build instructions in a separate repository. This has the advantage of allowing us to build things even if the whole CI system is down. And of course, the instructions are easy to track since they live in a proper source code repository.

Building and initial testing

We start the test cycle by triggering a build when a commit reaches our source code repository. This is done by configuring Jenkins to poll our repositories every 5 minutes, which gives us some time to push several commits at once.

Once Jenkins detects a commit, it bootstraps the code. That is, it runs the autogen script with the NO_CONFIGURE=1 flag so that configure and friends are properly generated. We do this on a dedicated machine to make sure that no strange versions of autoconf or automake produce a weird configure script. We then use Jenkins to preserve the workspace and use it as the base for the other build processes.
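The bootstrap step can be sketched in a few lines of Python. This is only an illustration and not the actual build script: the script name `./autogen.sh` and the helper names `bootstrap_env` and `bootstrap` are assumptions made for the example. Only the NO_CONFIGURE=1 setting comes from the description above.

```python
import os
import subprocess


def bootstrap_env(base_env):
    """Return a copy of base_env with NO_CONFIGURE=1 set.

    The caller's environment is left untouched.
    """
    env = dict(base_env)
    env["NO_CONFIGURE"] = "1"
    return env


def bootstrap(workspace, script="./autogen.sh"):
    """Run the autogen script inside the checked-out workspace.

    The script path is an assumption for this sketch. check=True makes a
    failing bootstrap raise CalledProcessError so later stages never start.
    """
    return subprocess.run(
        [script],
        cwd=workspace,
        env=bootstrap_env(os.environ),
        check=True,
    )
```

Keeping the environment handling in its own function makes it easy to check without invoking the autotools at all.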

The next stage is a flash compilation. This is just a sanity check that compiles the code without running any tests. We run it on two platforms: mingw (cross-compilation from Linux) and Ubuntu 10.04. If this stage fails, the process stops and an email is sent to the committers and to our internal development mailing list.
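The "stop at the first failure and tell people" behaviour is easy to picture as code. The sketch below is a simplified model, not how Jenkins implements it: `run_stages` and the `notify` callback are hypothetical names, and each stage is reduced to a callable that returns True on success.

```python
def run_stages(stages, notify):
    """Run (name, step) pairs in order and stop at the first failure.

    Each step is a callable returning True on success. On failure,
    notify(name) is called once and the remaining stages are skipped.
    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            notify(name)  # e.g. email the committers and the mailing list
            break
        completed.append(name)
    return completed
```

Because the expensive build and test stages come after the cheap compile check, an obviously broken commit costs only a few minutes of machine time.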

The next step is to build our software and run our unit and acceptance tests. We do this on several platforms, since there might be slight differences in compiler default options and in the location of header files. We run this stage only on platforms that take less than one hour to compile and run the tests and that are currently supported by their respective vendor. We have a separate stage for platforms that take too long, so we do not start building on those if there is an obvious problem on the other platforms. The last step in this stage is to produce packages. In case of problems, the committers and our internal mailing list get a notification.
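The platform selection rule can be written down as a small filter. This is a sketch under stated assumptions: the `Platform` record and `split_stages` function are invented for the example, and it assumes that any platform failing either criterion is pushed to the later stage, which the text above does not spell out for unsupported platforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    build_minutes: int      # time to compile and run the tests
    vendor_supported: bool  # still supported by its vendor


def split_stages(platforms, limit_minutes=60):
    """Split platforms into (main, deferred) lists of names.

    main: builds in under limit_minutes and is vendor supported.
    deferred: everything else (an assumption of this sketch).
    """
    main, deferred = [], []
    for p in platforms:
        if p.build_minutes < limit_minutes and p.vendor_supported:
            main.append(p.name)
        else:
            deferred.append(p.name)
    return main, deferred
```

The deferred list would only be built once every platform in the main list has passed.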

Deployment testing

Once we have finished producing packages, we start the deployment tests. We first test the deployment of the hub packages. For that we deploy a new machine and install the fresh packages on it. This way we are sure that there is no transient state left over on the machine. After package installation we run a few sanity checks to make sure that everything is working as expected.

Our next stop is the deployment and testing of the agent packages. We start by deploying a hub (also on a clean machine) and then start several agent machines on demand. We install the packages on those agent machines and then bootstrap them to our hub. We run some sanity checks on the agent machines to make sure everything works as expected.

For our enterprise version we run UI tests using Selenium and PHPUnit. The tests run on a machine that is provisioned on demand, and the results are sent back to our Jenkins installation.

All of these tests are performed across multiple platforms. We use Vagrant to provision these machines since they are short-lived and Vagrant is really good at provisioning machines. The machines are configured using a mixture of Vagrant scripts and CFEngine policies. Once the configuration phase is done, we start running tests and report the results back to our Jenkins master.
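The lifecycle of such a short-lived machine maps nicely onto a Python context manager. The sketch below is an illustration, not our test harness: `throwaway_machine` is a hypothetical helper, the machine name must match one defined in a Vagrantfile, and the `run` parameter exists only so the command runner can be swapped out. The `vagrant up` and `vagrant destroy -f` commands themselves are standard Vagrant CLI calls.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def throwaway_machine(name, run=subprocess.check_call):
    """Bring a Vagrant machine up for the duration of a with-block.

    The machine is destroyed afterwards even if provisioning or the
    tests inside the block raise, so no half-built machines pile up.
    """
    try:
        run(["vagrant", "up", name])
        yield name
    finally:
        # -f skips the interactive confirmation prompt.
        run(["vagrant", "destroy", "-f", name])
```

A test run then becomes `with throwaway_machine("hub") as machine: ...`, with the sanity checks inside the block.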

Updating our staging areas

The final step in this process is the update of our internal staging areas. Internally we always keep staging areas, which are basically just hubs with some agents deployed to them so we can see how our software is working.

We have one staging area per branch we are working on. Each staging area lives as long as it holds the latest working revision of its branch. After a round of successful testing, we deploy a new machine from scratch and use CFEngine to configure it. We install the new CFEngine package on it and bring the machine online.
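The rule "one staging area per branch, replaced only by a newer working revision" can be modelled as a small mapping update. This is a toy model under assumptions: `update_staging` and the branch and revision values are hypothetical, and the real work of deploying and configuring the machine is left out.

```python
def update_staging(staging, branch, revision, tests_passed):
    """Return the branch -> revision mapping after one test round.

    A failed round leaves the existing staging area for that branch in
    place; a successful round replaces it with the new revision.
    """
    if not tests_passed:
        return staging
    updated = dict(staging)
    updated[branch] = revision
    return updated
```

The effect is that each staging area always shows the last revision of its branch that made it through the whole pipeline.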

This is especially useful during development because it gives us quick feedback on the progress that we have made.

Useful tips

As mentioned, we rely on Jenkins to help us manage our builds. We use Jenkins for two main reasons: it is simple yet powerful, and it has a myriad of plugins. In particular, we rely on the following plugins:

Build Pipeline: To provide a nice pipeline view that is very simple and informative.
Email-ext: To provide customized email messages.
FTP publisher: To collect the packages and copy them to our internal FTP server.
Git plugin: To access our repositories.
Multiple SCM plugin: We need to monitor and check out things from several repositories, and this plugin lets us do exactly that.
Clone Workspace SCM plugin: As explained, we bootstrap the code on a particular machine and then copy it to the other build slaves. This plugin hides all the complexity and lets us do exactly that.
Regression report plugin: With this plugin Jenkins tells us which tests failed.

Conclusions

If you are in a situation like ours where you not only need to be able to compile and package your software but also to deploy it and test it, then the combination of CFEngine, Vagrant and Jenkins is a winning move.

We are able to deploy new build slaves, deployment machines and even our whole infrastructure within minutes, knowing that the configuration we need will be maintained by CFEngine, which avoids configuration drift.

We can easily customize the build slaves or the deployment machines by simply modifying our policies or writing new policies.

Finally, if you would like to replicate this setup for your own software project, let us know and we will try to help you get started.