CFEngine 3.6.2 now available: Focus on High Availability and Custom actions

October 2, 2014

CFEngine 3.6.2 is now available - in both Community and Enterprise editions! The Enterprise hub gains two major new features: High Availability and Custom actions. In addition, we have resolved numerous issues to provide you with a very stable release. It has been about 8 weeks since the 3.6.1 release, and we plan to continue on a 6-8 week schedule for maintenance releases going forward.

High availability for the hub

A common requirement for most enterprises is that key processes and mission critical applications are highly available - in essence to ensure there is no single point of failure. Although CFEngine is a distributed system, with decisions made by autonomous agents running on each node, the hub can be viewed as a single point of failure. Essentially, the hub has two responsibilities:

  • Serving latest policy to agents (control)
  • Collecting, aggregating and visualizing reports from the agents (visibility)

The responsibility of serving the latest policy is usually the most important, because losing this ability would mean that the infrastructure can no longer be changed. However, given the autonomous nature of CFEngine, this is quite easy to make fault tolerant by creating a separate failover hub and adding it to the copy_from list in the failsafe policy.

For fault tolerance of the visibility aspect, the reporting features of the Enterprise edition need the same kind of failover. This means setting up redundancy for all components, most importantly the PostgreSQL database. To solve this problem, CFEngine Enterprise 3.6.2 introduces support for an active-passive High Availability configuration, as shown in the diagram below. One hub is the active hub, while the other serves as a passive hub and is a fully redundant instance of the active hub. If the passive hub determines that the active hub is down, it becomes active and starts serving the Mission Portal, collecting reports and serving policy. Mission Portal in 3.6.2 has a new indicator that shows the status of the High Availability configuration.

Installation and setup

Existing CFEngine Enterprise installations can upgrade their single-node hub to a High Availability system in version 3.6.2. However, as with most High Availability systems, setting it up requires carefully following a series of steps with dependencies on network components. The setup can therefore be error-prone, so if you are a CFEngine Enterprise customer we recommend that you contact support for assistance if you do not feel 100% comfortable doing this on your own. To learn more, have a look at the Overview of High Availability as well as the High Availability installation Guide in the documentation.

Users of the CFEngine Community Edition should create their own High Availability setup using a redundant policy server, included in copy_from in the update policy, and synchronize the policy servers’ masterfiles with version control. The Community Edition does not need redundancy for reporting, so this is an easier problem.

Custom actions

CFEngine Enterprise 3.6.0 debuted the new concept of ‘alerts’, which enables UI- and email-based notifications on conditions such as policy failures, insecure configurations, and software available for update. CFEngine Enterprise 3.6.2 introduces a new way to notify: Custom actions. A Custom action is a script that gets called with a reference to a file containing parameters about the triggered alert. Any scripting language is supported; the only limitation is that the hub needs an interpreter for the language.

One typical use for a Custom action script is to integrate notification about promises not kept into a monitoring or ticketing system, like Nagios or JIRA. Using Custom actions, you can, for example, open a ticket if a policy bundle becomes not kept.

As an example, a bash script that logs events about policy alerts to syslog on the hub is shown below.

    #!/bin/bash
    # $1 is the path to the file with the parameters of the triggered alert.
    source "$1"
    # Write one syslog entry describing the alert and its policy condition.
    logger -i "Policy alert '$ALERT_NAME' $ALERT_STATUS. Now triggered on $ALERT_FAILED_HOST hosts. Defined with $ALERT_POLICY_CONDITION_FILTERBY='$ALERT_POLICY_CONDITION_FILTERITEMNAME', promise handle '$ALERT_POLICY_CONDITION_PROMISEHANDLE' and outcome $ALERT_POLICY_CONDITION_PROMISEOUTCOME"
    exit $?

The resulting log message of a triggered alert may then look like the following.

    Sep 26 02:00:53 localhost user[18823]: Policy alert 'Web service' fail. Now triggered on 11 hosts. Defined with bundlename='web_service', promise handle '' and outcome NOTKEPT

The script can be uploaded in the Mission Portal Settings by members of the admin role. After this, the script may be associated with any alert. From then on, the script is called whenever the alert changes state from OK to triggered, or the other way around. Now you can get rid of those email notifications and integrate notifications into your existing notification infrastructure, whatever it may be! To get started, you may have a look at the Custom action documentation.
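Building on the logging script above, a Custom action can also branch on the alert state, for example to open a ticket when an alert triggers and close it when the alert clears. The following is a minimal sketch rather than the documented interface. It assumes the parameter file is a list of shell variable assignments (as the source call in the script above implies) and that any ALERT_STATUS other than fail means the alert has cleared. The function names format_alert and handle_alert and the cfe-alert syslog tag are made up for illustration.

```shell
#!/bin/bash
# Sketch only: format_alert, handle_alert and the "cfe-alert" tag are
# hypothetical names. ALERT_NAME, ALERT_STATUS and ALERT_FAILED_HOST are the
# variables used by the logging example; their file format is assumed.

# Print a one-line summary for the alert described in the parameter file.
format_alert() {
  local param_file="$1"
  (
    # Source inside a subshell so the alert variables do not leak out.
    source "$param_file" || exit 1
    if [ "$ALERT_STATUS" = "fail" ]; then
      echo "OPEN: '$ALERT_NAME' triggered on $ALERT_FAILED_HOST hosts"
    else
      # Assumption: any status other than "fail" means the alert cleared.
      echo "CLOSE: '$ALERT_NAME' is now $ALERT_STATUS"
    fi
  )
}

# Send the summary to syslog; replace logger with a call to your
# ticketing or monitoring system.
handle_alert() {
  local summary
  summary=$(format_alert "$1") || return 1
  logger -i -t cfe-alert "$summary"
}

# Run only when a parameter file is passed as the first argument.
if [ "$#" -ge 1 ]; then
  handle_alert "$1"
fi
```

Keeping the formatting in its own function makes it possible to try the script against a hand-written parameter file before uploading it to the Mission Portal.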

Mission Portal dashboard

The first thing you will notice when accessing the Mission Portal in 3.6.2 is that the dashboard has a new out-of-the-box alert widget called System health. It contains one alert, Low disk space, which is triggered if a primary disk partition of any managed node drops below 15% free space. For critical issues like these, you might also want to add an email notification or perhaps a Custom action to make sure you capture it as soon as it happens. As you can see in the above screenshot, the alert status pages have a new, more intuitive and space-efficient design since 3.6.1. This is based on usage patterns we have seen developing since the initial release.
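To get a feel for the 15% threshold on a single host, a rough manual check can be written in bash. This is only a sketch and not how the Low disk space alert is evaluated by CFEngine: it derives the free percentage from the Capacity column of df -P, which may differ slightly from what the hub reports. The helper names free_percent_from_df and is_low_disk are hypothetical.

```shell
#!/bin/bash
# Sketch only: free_percent_from_df and is_low_disk are hypothetical helpers,
# not part of CFEngine. The 15% default is the threshold quoted for the alert.

# Read "df -P" output on stdin and print the free percentage of the first
# listed filesystem (100 minus the Capacity column).
free_percent_from_df() {
  awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Succeed (exit 0) if the filesystem holding the given path has less free
# space than the threshold, which defaults to 15 percent.
is_low_disk() {
  local path="$1" threshold="${2:-15}" free
  free=$(df -P "$path" | free_percent_from_df)
  [ -n "$free" ] && [ "$free" -lt "$threshold" ]
}
```

For example, running is_low_disk / && echo "low disk space on /" prints a warning only when the root filesystem is below 15% free.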

LSB-based OS inventory

If you upgrade the masterfiles framework (recommended), you will get access to a new inventory variable named inventory_os.description. This variable will use Linux Standard Base to create a more human-friendly OS name, and fall back to CFEngine discovery (sys.flavor) if not available. The variable is also available as the OS inventory attribute in the Mission Portal.
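The fallback behaviour described above can be illustrated outside of CFEngine with a few lines of bash. This sketch is not the masterfiles implementation: it assumes the LSB description is read with lsb_release -ds, and the function name os_description is hypothetical. The fallback argument stands in for the value of sys.flavor.

```shell
#!/bin/bash
# Sketch only: os_description is a hypothetical helper that mimics the
# "LSB first, otherwise fall back" behaviour; it is not CFEngine code.

# Print the LSB description if lsb_release is installed and returns one,
# otherwise print the fallback value given as the first argument.
os_description() {
  local fallback="$1" desc=""
  if command -v lsb_release >/dev/null 2>&1; then
    desc=$(lsb_release -ds 2>/dev/null)
  fi
  echo "${desc:-$fallback}"
}
```

On a host without lsb_release, os_description "ubuntu_14" simply prints ubuntu_14, while a host with LSB tools installed prints the longer, human-friendly description.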

Other changes

The admit_keys, admit_ips and admit_hostnames access promises were introduced in 3.6.0. Coupled with the dynamic variables connection.ip, connection.hostname and connection.key, they make it easy to grant access to resources dynamically, based on the connecting client’s identity. The admit_* promises were initially enabled only when using the TLS protocol, but in 3.6.2 they are also available with the legacy protocol.

On the Enterprise packaging side, we have upgraded PHP from 5.4.20 to 5.4.32, and the Windows package now bundles the LibXML library so that it can edit XML files with edit_xml promises. A more comprehensive list of changes for CFEngine Community can be found in the change log.

Get it!

As always, you can download CFEngine Enterprise 3.6.2 packages for the supported platforms, or give it a quick spin with the CFEngine 3.6.2 vagrant environment. If you are using the Community Edition, we provide you with source code, packages, and package repositories - to make sure we cover the distribution channel of your choice! We hope you enjoy 3.6.2, and we look forward to hearing about your experience in the CFEngine Google Group!