CFEngine and Ansible are two complementary infrastructure management tools that both work with so-called inventories. However, the shared term can be confusing because an Ansible Inventory and a CFEngine Inventory are defined and created in very different ways. In the most basic case, an Ansible Inventory is just a file with a list of hosts and groups of hosts that Ansible manages when given the inventory file. CFEngine Inventory, on the other hand, is a database of information about all the hosts in the infrastructure managed by CFEngine, reported by the hosts themselves. In a more complex scenario, an Ansible Inventory can also contain a lot of information about the hosts, but that information needs to be pulled from somewhere else and given to Ansible. With CFEngine, hosts talk to a CFEngine Hub, pull policy from it and report information back to it. With Ansible, policy is pushed to the hosts from one place, which must therefore have a list of all hosts available in advance, potentially with some extra information (parameters) about the hosts.
So the information for an Ansible Inventory has to be pulled from somewhere and then given to Ansible, and that somewhere can of course be CFEngine Inventory, which has all the information. Using such data, Ansible can be used more efficiently, targeting only the hosts where some actions need to be performed, as shown in the image below and explained later in this post.
CFEngine Hosts and Classes
Classes are one of the key concepts in CFEngine. Policy and CFEngine
components can define classes, which are used for making decisions and
some of which are also reported to the hub. The word class has, in
this case, its mathematical meaning: a generalization of a set.
Looking at individual hosts, classes classify the hosts on which they
are defined, and for the policy they define a context (that is why they
are sometimes called contexts). From the global view, however,
classes actually define sets of hosts.[1] For example, the class linux
used in a policy as a so-called class guard, linux::, defines
a required context for the promises that follow. If the
policy is evaluated in a linux context, i.e. in an environment where
the class linux is defined, the promises that follow the class guard
are evaluated; otherwise they are skipped. And the hosts where the class
linux is defined form a set of, well, hosts running Linux.
CFEngine Hosts by hard class API
Recent versions of CFEngine provide a new API to get hosts grouped by hard classes. Hard classes are special classes defined implicitly by CFEngine components based on environment discovery, so they reflect the OS the host is running, network parameters, the CFEngine version, virtualization technology (if any), etc. With a simple API request, the user can get a JSON- or YAML-formatted response with all the hosts known to CFEngine grouped by the hard classes defined on them.
The API request has the following form:
URI: https://hub.example.com/api/hosts/by-class
Method: GET
Parameters:
  context-include  Comma-delimited string of regular expressions.
  format           Output format. Default value is "json". Allowed values: "json", "yaml".
  withInventory    Include inventory data in the API response. Default value is
                   "false". Allowed values: "true", "false".
With no parameters specified (and thus the defaults being used), the response looks like this (with parts omitted for brevity):
{
  "cfengine": {
    "hosts": [
      "ip-172-31-21-241.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal",
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-25-102.eu-west-1.compute.internal",
      "ip-172-31-22-152.eu-west-1.compute.internal"
    ]
  },
  "127_0_0_1": {
    "hosts": [
      "ip-172-31-22-152.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal",
      "ip-172-31-25-102.eu-west-1.compute.internal",
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-21-241.eu-west-1.compute.internal"
    ]
  },
  "172_31_21_241": {
    "hosts": [
      "ip-172-31-21-241.eu-west-1.compute.internal"
    ]
  },
  "172_31_22_152": {
    "hosts": [
      "ip-172-31-22-152.eu-west-1.compute.internal"
    ]
  },
  "172_31_23_151": {
    "hosts": [
      "ip-172-31-23-151.eu-west-1.compute.internal"
    ]
  },
  "linux": {
    "hosts": [
      "ip-172-31-23-151.eu-west-1.compute.internal",
      "ip-172-31-21-241.eu-west-1.compute.internal",
      "ip-172-31-25-102.eu-west-1.compute.internal",
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-22-152.eu-west-1.compute.internal"
    ]
  },
  "centos": {
    "hosts": [
      "ip-172-31-21-241.eu-west-1.compute.internal",
      "ip-172-31-22-152.eu-west-1.compute.internal",
      "ip-172-31-25-102.eu-west-1.compute.internal"
    ]
  },
  "centos_7": {
    "hosts": [
      "ip-172-31-25-102.eu-west-1.compute.internal",
      "ip-172-31-22-152.eu-west-1.compute.internal",
      "ip-172-31-21-241.eu-west-1.compute.internal"
    ]
  },
  "ubuntu": {
    "hosts": [
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal"
    ]
  },
  "ubuntu_16": {
    "hosts": [
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal"
    ]
  },
  "ubuntu_16_04": {
    "hosts": [
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal"
    ]
  },
  "xen": {
    "hosts": [
      "ip-172-31-28-22.eu-west-1.compute.internal",
      "ip-172-31-25-102.eu-west-1.compute.internal",
      "ip-172-31-22-152.eu-west-1.compute.internal",
      "ip-172-31-23-151.eu-west-1.compute.internal",
      "ip-172-31-21-241.eu-west-1.compute.internal"
    ]
  }
}
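The request parameters described above can be combined in a single curl call. The following is a minimal sketch, not an official client: the function name cfe_hosts_by_class and the CFE_HUB variable are made up for this example, and CFE_USER and CFE_PWD are assumed to be exported credentials for the hub. It relies on curl's --get and --data-urlencode options so that regular expressions passed in context-include are URL-encoded.

```shell
#!/bin/sh
# Sketch only: cfe_hosts_by_class and CFE_HUB are hypothetical names.
# CFE_USER and CFE_PWD are assumed to hold credentials for the hub.
CFE_HUB="${CFE_HUB:-hub.example.com}"

# Usage: cfe_hosts_by_class [CONTEXT_INCLUDE] [FORMAT] [WITH_INVENTORY]
cfe_hosts_by_class() {
    include="${1:-}"
    format="${2:-json}"
    with_inventory="${3:-false}"

    # Build the curl argument list; --get sends the --data-urlencode
    # pairs as the URL-encoded query string of a GET request.
    set -- --silent --show-error --fail --get \
        --user "${CFE_USER}:${CFE_PWD}" \
        --data-urlencode "format=${format}" \
        --data-urlencode "withInventory=${with_inventory}"
    if [ -n "$include" ]; then
        set -- "$@" --data-urlencode "context-include=${include}"
    fi
    curl "$@" "https://${CFE_HUB}/api/hosts/by-class"
}
```

For example, cfe_hosts_by_class 'ubuntu,centos_7' yaml true passes two regular expressions in context-include and asks for YAML output with inventory data included, while calling the function without arguments uses the documented defaults.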
The first part of the example response above shows the group of hosts with the
cfengine class defined. This special class is defined on all hosts
managed by CFEngine, so this group can conveniently be used as an
equivalent of what is commonly known as the all group. Then we can see
some groups (classes) based on the IP addresses assigned to the hosts,
the group of Linux hosts and multiple groups based on the operating
system (GNU/Linux distribution) the hosts are running. Those groups
nicely demonstrate a common pattern of how CFEngine defines hard
classes: a specific class (e.g. ubuntu_16_04) is defined together
with more generic classes (ubuntu_16 and ubuntu) that are
constructed from it by removing the trailing parts separated by
underscores. Last but not least, there's a group of hosts running in
the Xen environment, which is the same as the cfengine group because
all the hosts used to generate the above example were running
in AWS.
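The naming pattern described above can be illustrated with a few lines of shell. This is only a sketch of the pattern, not of how CFEngine itself computes hard classes, and the function name generic_classes is made up for this example.

```shell
#!/bin/sh
# Sketch only: generic_classes is a hypothetical helper that mimics the
# naming pattern; it is not part of CFEngine.

# Print a class name followed by every more generic name obtained by
# repeatedly removing the last underscore-separated part.
generic_classes() {
    class="$1"
    printf '%s\n' "$class"
    # ${class%_*} strips the shortest trailing "_part"; it leaves the
    # name unchanged once no underscore is left, which ends the loop.
    while [ "${class%_*}" != "$class" ]; do
        class="${class%_*}"
        printf '%s\n' "$class"
    done
}
```

Calling generic_classes ubuntu_16_04 prints ubuntu_16_04, ubuntu_16 and ubuntu, one per line, matching the three groups in the example response.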
CFEngine to Ansible
The above example response from the CFEngine hosts by class API shows
the format of the response (in JSON, but the YAML format is
just a direct equivalent). Each class/group has its own top-level object
with a hosts array of host names. This format is directly recognized
by Ansible, and not by accident: the API response format
was designed that way as part of the effort to make the integration of
CFEngine and Ansible easier.
Ansible inventory data formats
Ansible supports various inventory sources
providing inventory data in various formats. The most common source is a
simple file in the JSON, YAML or INI format, but if given an executable,
Ansible uses the script plugin,
which runs the executable and parses its output. Such an executable is
expected to output JSON in the format shown in the example above.
Unfortunately, this format differs from the JSON/YAML format of an
inventory file, even though the difference is very subtle.[2] The
CFEngine hosts by class API was designed to be used in a script (usually
just a simple curl invocation), and thus the format of the response
follows the specification of the script plugin, which means it
doesn't work as an inventory file. Future versions of CFEngine will
recognize an extra parameter, inventoryFile=true/false, to support
scenarios where the response is saved into a file instead of being
fed directly into one of the Ansible utilities.
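Until such a parameter is available, one possible workaround is to let Ansible itself do the conversion. This is a suggestion rather than something the CFEngine API provides: the ansible-inventory utility can read an inventory source, including an executable, and write the parsed result to a file. A minimal sketch, with save_cfe_inventory as a made-up helper name and the script path and output file name chosen only for illustration:

```shell
#!/bin/sh
# Sketch only: save_cfe_inventory is a hypothetical helper name.

# Usage: save_cfe_inventory INVENTORY_SCRIPT OUTPUT_FILE
save_cfe_inventory() {
    script="$1"
    output="$2"
    # --list dumps the whole parsed inventory, --yaml selects YAML
    # output and --output writes it to a file instead of the screen.
    ansible-inventory -i "$script" --list --yaml --output "$output"
}
```

After save_cfe_inventory ./get_cfe_inventory.sh cfe_hosts.yml, the resulting file is expected to be usable as a regular inventory file (for example with ansible -i cfe_hosts.yml), but it is only a snapshot and has to be regenerated to pick up changes reported to the hub.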
Example inventory script
As described above, the CFEngine hosts by class API returns inventory data suited to Ansible's inventory script plugin, which expects an executable to run. The most trivial script for such use is just:
#!/bin/bash
# Print the hosts grouped by class; CFE_USER and CFE_PWD must be exported.
curl -u "${CFE_USER}:${CFE_PWD}" https://hub.example.com/api/hosts/by-class
with the CFE_USER and CFE_PWD environment variables set and
exported with the export command.[3] With a script like the above
saved as, let's say, get_cfe_inventory.sh, with the executable bit set
(chmod u+x get_cfe_inventory.sh), the CFEngine Inventory data can be
used with the Ansible utilities like this:
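history -a    # append the history so far to the history file
export CFE_USER="admin" CFE_PWD="testingCFEngine"
ansible -m setup -i ./get_cfe_inventory.sh debian
unset CFE_PWD
history -c    # clear the in-memory history, which contains the password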
where the last argument to the ansible command (debian) tells
Ansible to only target hosts from the debian group as defined in the
inventory data, i.e. as returned by the CFEngine hosts by class API.
Of course, it's possible to execute some tasks only on some specific
hosts with Ansible itself (although the debian class in CFEngine is
defined on all hosts running Debian-like/Debian-derived GNU/Linux
distributions). However, adding a condition like
when: ansible_facts['os_family'] == "Debian"
means Ansible connects to all the hosts and
only then skips the particular task(s) on those hosts that do not fulfill the
condition. Using the information from CFEngine speeds things up
significantly if there are many hosts in the infrastructure, and it makes
the full potential of CFEngine's hard classes available to Ansible.
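The trivial script shown earlier ignores its arguments. Ansible's script plugin calls the executable with --list to get the whole inventory and, when the --list output contains no _meta section, with --host <hostname> for each host to get its variables. A slightly more defensive variant could look like the following sketch. The function names and the CFE_HUB variable are made up for this example, and answering --host with an empty object is a simplification that provides no per-host variables.

```shell
#!/bin/sh
# Sketch only: cfe_fetch, cfe_inventory and CFE_HUB are hypothetical names.
CFE_HUB="${CFE_HUB:-hub.example.com}"

# Fetch the hosts grouped by class; CFE_USER and CFE_PWD must be exported.
cfe_fetch() {
    curl --silent --show-error --fail \
        --user "${CFE_USER}:${CFE_PWD}" \
        "https://${CFE_HUB}/api/hosts/by-class"
}

# Dispatch on the arguments the script plugin passes to the executable.
cfe_inventory() {
    case "${1:-}" in
        --list)
            cfe_fetch
            ;;
        --host)
            # No per-host variables in this sketch.
            printf '{}\n'
            ;;
        *)
            printf 'usage: %s --list | --host <hostname>\n' "$0" >&2
            return 2
            ;;
    esac
}

# When saved as an executable inventory script, end the file with:
#   cfe_inventory "$@"
```

With --fail, curl exits with a non-zero status on an HTTP error, so a failed request surfaces as a failing inventory source instead of being parsed as inventory data.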
Host variables
As explained in the first part of this post,
CFEngine Inventory contains information about the hosts in the
infrastructure managed by CFEngine, which the hosts themselves report. An Ansible
Inventory, on the other hand, may contain host-specific variables with
host-specific values used in tasks and playbooks. Obviously, these
two mechanisms can be chained, just like getting the groups of hosts
from CFEngine and feeding them to Ansible. By adding the
withInventory=true argument to the API query (i.e. appending
?withInventory=true to the URL used in the curl command), the response
is extended with a piece of data like this:
{
  "_meta": {
    "hostvars": {
      "ip-172-31-22-152.eu-west-1.compute.internal": {
        "CFEngine Inventory": {
          "OS": "CentOS 7",
          "Kernel": "linux",
          "Timezone": "UTC",
          "CPU model": "Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz",
          "Host name": "ip-172-31-22-152.eu-west-1.compute.internal",
          "Interfaces": "eth0",
          "BIOS vendor": "Xen",
          "CFEngine ID": "SHA=0854031414e708d8c647496b75580fb09972b6a108737af08fcaa03e43af7de6",
          "CPU sockets": "1",
          "System UUID": "35A626EC-2164-5893-1476-75B7BC9325D4",
          "Architecture": "x86_64",
          "BIOS version": "4.2.amazon",
          "Virtual host": "xen",
          "Disk free (%)": "91.00",
          "MAC addresses": "06:3e:2f:ff:99:cf",
          "IPv4 addresses": "172.31.22.152",
          "Kernel Release": "3.10.0-957.27.2.el7.x86_64",
          "Policy Servers": "172.31.21.241",
          "System version": "4.2.amazon",
          "Uptime minutes": "96",
          "Ports listening": "22, 25, 111, 5308",
          "CFEngine version": "3.18.0",
          "Memory size (MB)": "14875.92",
          "CPU logical cores": "4",
          "Policy Release Id": "79d0f8cfca33efdc0aea676048ec4b7b9889cc0a",
          "CPU physical cores": "2",
          "System manufacturer": "Xen",
          "System product name": "HVM domU",
          "Timezone GMT Offset": "+0000",
          "Physical memory (MB)": "15360",
          "System serial number": "ec26a635-6421-9358-1476-75b7bc9325d4",
          "Primary Policy Server": "172.31.21.241",
          "Allowed hosts for cf-runagent": "172.31.21.241",
          "Allowed users for cf-runagent": "root"
        }
      }
    }
  }
}
In this way, host-specific variables are defined for all the hosts in all the
groups that are part of the response. These so-called attributes are
user-defined (with some pre-defined) in the CFEngine policy and thus can
be easily extended. Once again, the format is ready to use with Ansible,
so Ansible tasks and playbooks can refer to these values using
the special hostvars[] dictionary. This can be a nice extension of, or a
complement to, the Ansible Facts functionality.
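As a sketch of how such a value could be read, the following helper runs Ansible's debug module against a group and prints one CFEngine Inventory attribute per host. The helper name show_cfe_attribute is made up, and the inventory script is assumed to be the get_cfe_inventory.sh script from the earlier example with ?withInventory=true appended to its URL. Because the variable name CFEngine Inventory contains a space, it is looked up through hostvars rather than referenced directly.

```shell
#!/bin/sh
# Sketch only: show_cfe_attribute is a hypothetical helper, and the
# inventory script path is an assumption based on the earlier example.

# Usage: show_cfe_attribute GROUP ATTRIBUTE
show_cfe_attribute() {
    group="$1"
    attribute="$2"
    # The debug module prints the templated message for every host
    # in the targeted group.
    ansible "$group" -i ./get_cfe_inventory.sh -m debug \
        -a "msg={{ hostvars[inventory_hostname]['CFEngine Inventory']['${attribute}'] }}"
}
```

For example, show_cfe_attribute centos OS would be expected to report CentOS 7 for the host shown in the response above.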
Conclusions
This post is another piece in the series of posts focused on integration of Ansible and CFEngine and using the best of what the two tools can provide. It shows how the CFEngine Inventory and Ansible Inventory, two seemingly conceptually different things, are actually very similar and how they can be used together to provide enhanced performance and better ease of use. Once again, the combination of Ansible for the deployment part of the infrastructure management and for the well-targeted real-time changes with CFEngine for the general long-term maintenance and infrastructure knowledge and overview proves to be very useful.
[1] And class expressions, which are logical expressions combining classes with the logical operators AND, OR and NOT, define sets that are the results of the respective set operations: intersection, union and complement.
[2] In an inventory file, the hosts key must be an object and the individual hosts need to be objects too, not just strings.
[3] Unfortunately, the Ansible script plugin doesn't support passing through standard error output and standard input, so just having something like curl -u admin https://hub.example.com/api/hosts/by-class and letting curl prompt for the password is not possible.