The Complete CFEngine Enterprise
Table of Contents
- High Availability
- Hub Administration
- Regenerate Self Signed SSL Certificate
- Configure a custom LDAP port
- Backup and Restore
- Lookup License Info
- Custom SSL Certificate
- Re-installing Enterprise Hub
- Policy Deployment
- Enable plain http
- Custom LDAPs Certificate
- Public key distribution
- Adjusting Schedules
- Reset administrative credentials
- Extending Query Builder in Mission Portal
- Extending Mission Portal
- Install and Get Started
- User Interface
- Settings
- Hosts and Health
- Alerts and Notifications
- Custom actions for Alerts
- Enterprise Reporting
- Monitoring
- Enterprise API
- Best Practices
CFEngine Enterprise is an IT automation platform that uses a model-based approach to manage your infrastructure and applications at WebScale while providing best-in-class scalability, security, enterprise-wide visibility, and control.
WebScale IT Automation
CFEngine Enterprise provides a secure and stable platform for building and managing both physical and virtual infrastructure. Its distributed architecture, minimal dependencies, and lightweight autonomous agents enable you to manage 5,000 nodes from a single policy server.
WebScale does not just imply large server deployments. The speed at which changes are conceived and committed across infrastructure and applications is equally important. With execution times measured in seconds and a highly efficient verification mechanism, CFEngine reduces exposure to unwarranted changes and prevents long delays for planned changes that must be applied urgently at scale.
Intelligent Automation of Infrastructure
Automate your infrastructure with self-service capabilities. CFEngine Enterprise enables you to take advantage of agile, secure, and scalable infrastructure automation that makes repairs using a policy-based approach.
Policy-Based Application Deployment
Achieve repeatable, error-free and automated deployment of middleware and application components to datacenter or cloud-based infrastructure. Together with infrastructure automation, automated application deployment provides a standardized platform.
Self-Healing Continuous Operations
Gain visibility into your infrastructure and applications, and be alerted to issues immediately. CFEngine Enterprise contains built-in inventory and reporting modules that automate troubleshooting and compliance checks, as well as remediate in a self-healing fashion.
CFEngine Enterprise Features
User Interface
The CFEngine Enterprise Mission Portal provides a central dashboard for real-time monitoring, search, and reporting for immediate visibility into your environment’s actual vs desired state. You can also use Mission Portal to set individual and group alerts and track system events that make you aware of specific infrastructure changes.
Scalability
CFEngine Enterprise has a simple distributed architecture that scales with minimal resource consumption. Its pull-based system eliminates the need for server-side processing, which means that a single policy server can concurrently serve up to 5,000 nodes doing 5 minute runs with minimal hardware requirements.
Configurable Data Feeds
The CFEngine Enterprise Mission Portal
provides System Administrators and Infrastructure Engineers with detailed information about the actual state of the IT infrastructure and how that compares with the desired state.
Federation and SQL Reporting
CFEngine Enterprise has the ability to create federated structures, in which parts of organizations can have their own configuration policies, while at the same time the central IT organization may impose some policies that are more global in nature.
Monitoring and reporting
The CFEngine Enterprise Mission Portal contains continual reporting that details compliance with policies, repairs and any failures of hosts to match their desired state.
Role-based access control
Users can be assigned roles that limit their access levels throughout the Mission Portal.
High Availability
Overview
Although CFEngine is a distributed system, with decisions made by autonomous agents running on each node, the hub can be viewed as a single point of failure. To keep both roles the hub is responsible for - policy serving and report collection - available, a High Availability feature was introduced in 3.6.2. It is based on the well known and broadly used cluster resource management tools corosync and pacemaker, together with the PostgreSQL streaming replication feature.
Design
CFEngine High Availability is based on redundancy of all components, most importantly the PostgreSQL database. An active-passive PostgreSQL database configuration is the essential part of the High Availability feature. While PostgreSQL supports different replication methods and active-passive configuration schemes, it does not provide an out-of-the-box database failover-failback mechanism. To provide one, a well known cluster resource management solution based on the Linux-HA project has been selected.
Overview of CFEngine High Availability is shown in the diagram below.
One hub is the active hub, while the other serves the role of a passive hub and is a fully redundant instance of the active one. If the passive host determines that the active host is down, it will be promoted to active and will start serving the Mission Portal, collecting reports, and serving policy.
Corosync and pacemaker
Corosync and pacemaker are well known and broadly used cluster resource management tools. For the CFEngine hub they are configured to manage the PostgreSQL database and one or more IP addresses shared across the nodes in the cluster. In the ideal configuration, one link managed by corosync/pacemaker is dedicated to PostgreSQL streaming replication and one to accessing the Mission Portal, so that when failover happens the change of active-passive roles is transparent to the end user, who can keep using the same shared IP address to log in to the Mission Portal or run API queries.
PostgreSQL
For best performance, PostgreSQL streaming replication has been selected as the database replication mode. It ships WAL records from the active server to all standby database servers. This PostgreSQL 9.0 and above feature allows continuous recovery and almost immediate visibility on the standby of data inserted into the primary server. For more information, please see the PostgreSQL streaming replication documentation.
CFEngine
In a High Availability setup all clients are aware of the existence of more than one hub. The current active hub is selected as the policy server, and both policy fetching and report collection are done by the active hub. One difference compared to a single-hub installation is that instead of having one policy server, clients have a list of hubs from which they fetch policy and which may initiate report collection (when using call collect). Also, after bootstrapping to either the active or the passive hub, clients are implicitly redirected to the active one. Trust is then established between the client and both the active and passive hubs, so that all clients are able to communicate with both. This allows a transparent transition to the passive hub when failover happens, as all clients have already established trust with it.
Mission Portal
Mission Portal in 3.6.2 has a new indicator which shows the status of the High Availability configuration.
High Availability status is constantly monitored, so that as soon as a malfunction is discovered the user is notified about the degraded state of the system. Besides the simple visualization of High Availability, the user can get detailed information about the reason for a degraded state, as well as when data was last reported from each hub. This gives a comprehensive overview of the whole setup.
Inventory
There are also new Mission Portal inventory variables indicating the IP address of the active hub instance and the status of the High Availability installation on each hub. Inventory reports are especially helpful for diagnosing problems when High Availability is reported as degraded.
CFEngine High Availability installation
Existing CFEngine Enterprise installations can upgrade their single-node hub to a High Availability system in version 3.6.2. Detailed instructions on how to upgrade from a single hub to High Availability, or how to install CFEngine High Availability from scratch, can be found here.
Installation Guide
Overview
This tutorial describes the installation steps for the CFEngine High Availability feature. It is suitable both for upgrading existing CFEngine installations to HA and for installing HA from scratch. Before starting the installation we strongly recommend reading the CFEngine High Availability overview.
Installation procedure
As with most High Availability systems, setting it up requires carefully following a series of steps with dependencies on network components. The setup can therefore be error-prone, so if you are a CFEngine Enterprise customer we recommend that you contact support for assistance if you do not feel 100% comfortable doing this on your own.
Please also make sure you have a valid license for the passive hub so that it will be able to handle all your CFEngine clients in case of failover.
Hardware configuration and OS pre-configuration steps
- CFEngine 3.6.2 (or later) hub package for RHEL6 or CentOS6.
- We recommend a dedicated interface for PostgreSQL replication, and optionally one for heartbeat.
- We recommend having one shared IP address assigned to the interface where the Mission Portal is accessible (optional) and one where PostgreSQL replication is configured (mandatory).
- The active and passive hub machines must be configured with different host names.
- Basic host name resolution works (hub names can be placed in /etc/hosts or DNS can be configured); an example /etc/hosts sketch follows the network configuration below.
Example configuration used in this tutorial
In this tutorial we use the following network configuration:
- Two nodes, one acting as active (node1) and one acting as passive (node2).
- Optionally a third node (node3) used as a database backup for offsite replication.
- Each node has three NICs: eth0 is used for the heartbeat, eth1 for PostgreSQL replication, and eth2 for the Mission Portal and bootstrapping clients.
- IP addresses configured as follows:
Node | eth0 | eth1 | eth2 |
---|---|---|---|
node1 | 192.168.0.10 | 192.168.10.10 | 192.168.100.10 |
node2 | 192.168.0.11 | 192.168.10.11 | 192.168.100.11 |
node3 (optional) | --- | 192.168.10.12 | 192.168.100.12 |
cluster shared | --- | --- | 192.168.100.100 |
Detailed network configuration is shown in the picture below:
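For reference, the basic host name resolution required in the prerequisites can be provided with /etc/hosts entries like the sketch below, which reuses this tutorial's addresses; which interface you resolve the names to depends on your own setup:

# /etc/hosts entries on node1 and node2 (illustrative; adjust to your addressing)
192.168.100.10  node1
192.168.100.11  node2
192.168.100.12  node3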
Install cluster management tools
On both nodes:
yum -y install pcs pacemaker cman fence-agents
In order to operate the cluster, proper fencing must be configured, but a description of how to fence the cluster and which mechanism to use is out of the scope of this document. For reference, please use the Red Hat HA fencing guide.
IMPORTANT: please carefully follow the indicators describing if the given step should be performed on the active (node1), the passive (node2) or both nodes.
Make sure that the hostnames of the nodes are node1 and node2 respectively. Running the command
uname -n | tr '[A-Z]' '[a-z]'
should return the correct node name. Make sure that DNS or the entries in /etc/hosts are updated so that the hosts can be accessed using their host names. In order to use pcs to manage the cluster, set a password for the hacluster user (the account designated to manage the cluster) with
passwd hacluster
on both nodes. Make sure that the pcsd daemon is started, and configure it to be enabled on boot, on both nodes.
service pcsd start
chkconfig pcsd on
Authenticate hacluster user for each node of the cluster. Run the command below on the node1:
pcs cluster auth node1 node2 -u hacluster
After entering the password, you should see a message similar to the one below:
node1: Authorized
node2: Authorized
Create the cluster by running the following command on the node1:
pcs cluster setup --name cfcluster node1 node2
This will create the cluster
cfcluster
consisting of node1 and node2. Give the cluster time to settle (about 1 minute) and then start the cluster by running the following command on node1:
pcs cluster start --all
This will start the cluster and all the necessary daemons on both nodes.
At this point the cluster should be up and running. Running
pcs status
should print something similar to the output below.

Cluster name: cfcluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: cman
Current DC: node2 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Wed Oct 17 12:25:42 2018
Last change: Wed Oct 17 12:24:52 2018 by root via crmd on node2

2 nodes configured
0 resources configured

Online: [ node1 node2 ]

No resources

Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
If you are setting up just a testing environment without fencing, you should disable it now (**on the node1**):
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
Before PostgreSQL replication is set up, we need to set up a floating IP address that will always point to the active node and configure some basic resource parameters (**on the node1**):
pcs resource defaults resource-stickiness="INFINITY"
pcs resource defaults migration-threshold="1"
pcs resource create cfvirtip IPaddr2 ip=192.168.100.100 cidr_netmask=24 --group cfengine
pcs cluster enable --all node{1,2}
Verify that the cfvirtip resource is properly configured and running.
pcs status
should give something like this:
Cluster name: cfcluster
Last updated: Tue Jul 7 09:29:10 2015
Last change: Fri Jul 3 08:41:24 2015
Stack: cman
Current DC: node1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
1 Resources configured

Online: [ node1 node2 ]

Full list of resources:

 Resource Group: cfengine
     cfvirtip   (ocf::heartbeat:IPaddr2):       Started node1
PostgreSQL configuration
- Install the CFEngine hub package on both node1 and node2.
Make sure CFEngine is not running (**on both node1 and node2**):
service cfengine3 stop
Configure PostgreSQL on node1:
Create two special directories owned by the cfpostgres user:
mkdir -p /var/cfengine/state/pg/{data/pg_arch,tmp}
chown -R cfpostgres:cfpostgres /var/cfengine/state/pg/{data/pg_arch,tmp}
Modify the /var/cfengine/state/pg/data/postgresql.conf configuration file to set the following options accordingly (**uncomment the lines if they are commented out**):
listen_addresses = '*'
wal_level = replica
max_wal_senders = 5
wal_keep_segments = 16
hot_standby = on
restart_after_crash = off
archive_mode = on
archive_command = 'cp %p /var/cfengine/state/pg/data/pg_arch/%f'
Modify the pg_hba.conf configuration file to enable access to PostgreSQL for replication between the nodes (note that the second pair of IP addresses, not the heartbeat pair, is used here):
echo "host replication all 192.168.100.10/32 trust" >> /var/cfengine/state/pg/data/pg_hba.conf echo "host replication all 192.168.100.11/32 trust" >> /var/cfengine/state/pg/data/pg_hba.conf
IMPORTANT: The above configuration allows accessing PostgreSQL without any authentication from both cluster nodes. For security reasons we strongly advise creating a replication user in PostgreSQL and protecting access with a password or certificate. Furthermore, we advise using SSL-secured replication instead of the unencrypted method described here if the hubs are on an untrusted network.
Do an initial sync of PostgreSQL:
Start PostgreSQL on node1:
pushd /tmp; su cfpostgres -c "/var/cfengine/bin/pg_ctl -w -D /var/cfengine/state/pg/data -l /var/log/postgresql.log start"; popd
On node2, initialize PostgreSQL from node1 (again using the second IP, not the heartbeat IP):
rm -rf /var/cfengine/state/pg/data/*
pushd /tmp; su cfpostgres -c "/var/cfengine/bin/pg_basebackup -h 192.168.10.10 -U cfpostgres -D /var/cfengine/state/pg/data -X stream -P"; popd
On node2, create the recovery.conf file to configure PostgreSQL to run as a hot-standby replica:
cat <<EOF > /var/cfengine/state/pg/data/recovery.conf
standby_mode = 'on'
# 192.168.100.100 is the cluster-shared IP address pointing at the active/master cluster node
primary_conninfo = 'host=192.168.100.100 port=5432 user=cfpostgres application_name=node2'
restore_command = 'cp /var/cfengine/state/pg/pg_arch/%f %p'
EOF
chown --reference /var/cfengine/state/pg/data/postgresql.conf /var/cfengine/state/pg/data/recovery.conf
Start PostgreSQL on the node2 by running the following command:
pushd /tmp; su cfpostgres -c "/var/cfengine/bin/pg_ctl -D /var/cfengine/state/pg/data -l /var/log/postgresql.log start"; popd
Check that PostgreSQL replication is setup and working properly:
Node2 should report that it is in recovery mode:
/var/cfengine/bin/psql -x cfdb -c "SELECT pg_is_in_recovery();"
should return:
-[ RECORD 1 ]-----+--
pg_is_in_recovery | t
Node1 should report that it is replicating to node2:
/var/cfengine/bin/psql -x cfdb -c "SELECT * FROM pg_stat_replication;"
should return something like this:
-[ RECORD 1 ]----+------------------------------
pid              | 11401
usesysid         | 10
usename          | cfpostgres
application_name | node2
client_addr      | 192.168.100.11
client_hostname  | node2-pg
client_port      | 33958
backend_start    | 2018-10-16 14:19:04.226773+00
backend_xmin     |
state            | streaming
sent_lsn         | 0/61E2C88
write_lsn        | 0/61E2C88
flush_lsn        | 0/61E2C88
replay_lsn       | 0/61E2C88
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
Stop PostgreSQL on both nodes:
pushd /tmp; su cfpostgres -c "/var/cfengine/bin/pg_ctl -D /var/cfengine/state/pg/data -l /var/log/postgresql.log stop"; popd
Cluster resource configuration
Download the PostgreSQL resource agent supporting the CFEngine HA setup on both nodes.
wget https://raw.githubusercontent.com/cfengine/core/master/contrib/pgsql_RA
/bin/cp pgsql_RA /usr/lib/ocf/resource.d/heartbeat/pgsql
chown --reference /usr/lib/ocf/resource.d/heartbeat/{IPaddr2,pgsql}
chmod --reference /usr/lib/ocf/resource.d/heartbeat/{IPaddr2,pgsql}
Create the PostgreSQL resource (**on node1**).
pcs resource create cfpgsql pgsql \
  pgctl="/var/cfengine/bin/pg_ctl" \
  psql="/var/cfengine/bin/psql" \
  pgdata="/var/cfengine/state/pg/data" \
  pgdb="cfdb" pgdba="cfpostgres" repuser="cfpostgres" \
  tmpdir="/var/cfengine/state/pg/tmp" \
  rep_mode="async" node_list="node1 node2" \
  primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
  master_ip="192.168.100.100" restart_on_promote="true" \
  logfile="/var/log/postgresql.log" \
  config="/var/cfengine/state/pg/data/postgresql.conf" \
  check_wal_receiver=true restore_command="cp /var/cfengine/state/pg/data/pg_arch/%f %p" \
  op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
  op monitor timeout="60s" interval="4s" on-fail="restart" --disable
Configure PostgreSQL to work in Master/Slave (active/standby) mode (**on node1**).
pcs resource master mscfpgsql cfpgsql master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
Tie the previously configured shared IP address and the PostgreSQL cluster resource together to make sure both will always run on the same host, and add ordering constraints to make sure that the resources are started and stopped in the correct order (**on node1**).
pcs constraint colocation add cfengine with Master mscfpgsql INFINITY
pcs constraint order promote mscfpgsql then start cfengine symmetrical=false score=INFINITY
pcs constraint order demote mscfpgsql then stop cfengine symmetrical=false score=0
Enable and start the new resource now that it is fully configured (**on node1**).
pcs resource enable mscfpgsql --wait=30
Verify that the constraints configuration is correct.
pcs constraint
should give:
Location Constraints:
  Resource: mscfpgsql
    Enabled on: node1 (score:INFINITY) (role: Master)
Ordering Constraints:
  promote mscfpgsql then start cfengine (score:INFINITY) (non-symmetrical)
  demote mscfpgsql then stop cfengine (score:0) (non-symmetrical)
Colocation Constraints:
  cfengine with mscfpgsql (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
Verify that the cluster is now fully set up and running.
crm_mon -Afr1
should give something like:
Stack: cman
Current DC: node1 (version 1.1.18-3.el6-bfe4e80420) - partition with quorum
Last updated: Tue Oct 16 14:19:37 2018
Last change: Tue Oct 16 14:19:04 2018 by root via crm_attribute on node1

2 nodes configured
3 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Resource Group: cfengine
     cfvirtip   (ocf::heartbeat:IPaddr2):       Started node1
 Master/Slave Set: mscfpgsql [cfpgsql]
     Masters: [ node1 ]
     Slaves: [ node2 ]

Node Attributes:
* Node node1:
    + cfpgsql-data-status             : LATEST
    + cfpgsql-master-baseline         : 0000000004000098
    + cfpgsql-receiver-status         : normal (master)
    + cfpgsql-status                  : PRI
    + master-cfpgsql                  : 1000
* Node node2:
    + cfpgsql-data-status             : STREAMING|ASYNC
    + cfpgsql-receiver-status         : normal
    + cfpgsql-status                  : HS:async
    + master-cfpgsql                  : 100
IMPORTANT: Please make sure that there is one Master node and one Slave node, and that cfpgsql-status is reported as PRI for the active node and as HS:async or HS:alone for the passive node.
CFEngine configuration
Create the HA configuration file on both nodes.
cat <<EOF > /var/cfengine/ha.cfg
cmp_master: PRI
cmp_slave: HS:async,HS:sync,HS:alone
cmd: /usr/sbin/crm_attribute -l reboot -n cfpgsql-status -G -q
EOF
Bootstrap the nodes.
Bootstrap the node1 to itself:
cf-agent --bootstrap 192.168.100.10
Bootstrap the node2 to node1 (to establish trust) and then to itself:
cf-agent --bootstrap 192.168.100.10
cf-agent --bootstrap 192.168.100.11
Stop CFEngine on both nodes.
service cfengine3 stop
Create the HA JSON configuration file on both nodes.
cat <<EOF > /var/cfengine/masterfiles/cfe_internal/enterprise/ha/ha_info.json
{
  "192.168.100.10": {
    "sha": "@NODE1_PKSHA@",
    "internal_ip": "192.168.100.10"
  },
  "192.168.100.11": {
    "sha": "@NODE2_PKSHA@",
    "internal_ip": "192.168.100.11"
  }
}
EOF
The @NODE1_PKSHA@ and @NODE2_PKSHA@ strings are placeholders for the host key hashes of the nodes. Replace the placeholders with real values, which can be obtained by running the following on any node:

cf-key -s
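Alternatively, if you only need the hash of a single key, something like the following should work; this assumes cf-key's -p (--print-digest) option is available in your version and that the digest is printed with the SHA= prefix:

# Print the digest of the local hub key and strip the "SHA=" prefix
cf-key -p /var/cfengine/ppkeys/localhost.pub | sed 's/^SHA=//'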
IMPORTANT: Copy over only the hashes, without the SHA= prefix.

On both nodes, modify the /var/cfengine/masterfiles/controls/def.cf and /var/cfengine/masterfiles/controls/update_def.cf files to enable HA by uncommenting the following line:
"enable_cfengine_enterprise_hub_ha" expression => "enterprise_edition";
and commenting or removing the line
"enable_cfengine_enterprise_hub_ha" expression => "!any";`
On both nodes, run
cf-agent -Kf update.cf
to make sure that the new policy is copied from masterfiles to inputs. Then start CFEngine on both nodes:
service cfengine3 start
Check that the CFEngine HA setup is working by logging in to the Mission Portal at https://192.168.100.100 in your browser. Note that it can take up to 15 minutes for everything to settle and for the OK HA status to be reported in the Mission Portal's header.
Configuring 3rd node as disaster-recovery or database backup (optional)
Install the CFEngine hub package on the node which will be used as the disaster-recovery or database backup node (node3).
Bootstrap the disaster-recovery node to the active node first (to establish trust between the hubs) and then bootstrap it to itself. At this point the hub will be capable of collecting reports and serving policy.
Stop the cf-execd and cf-hub processes.
Make sure that the PostgreSQL configuration allows database replication connections from the 3rd node (see the PostgreSQL configuration section, point 5.3, for more details).
Repeat steps 4 - 6 from PostgreSQL configuration to enable and verify the database replication connection from node3. Make sure that both node2 and node3 are connected to the active database node and that streaming replication is in progress.
Running the following command on node1:
/var/cfengine/bin/psql cfdb -c "SELECT * FROM pg_stat_replication;"
Should give:
```
 pid  | usesysid |  usename   | application_name |  client_addr   | client_hostname | client_port |         backend_start         |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
------+----------+------------+------------------+----------------+-----------------+-------------+-------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
 9252 |       10 | cfpostgres | node2            | 192.168.100.11 |                 |       58919 | 2015-08-24 07:14:45.925341+00 | streaming | 0/2A7034D0    | 0/2A7034D0     | 0/2A7034D0     | 0/2A7034D0      |             0 | async
 9276 |       10 | cfpostgres | node3            | 192.168.100.12 |                 |       52202 | 2015-08-24 07:14:46.038676+00 | streaming | 0/2A7034D0    | 0/2A7034D0     | 0/2A7034D0     | 0/2A7034D0      |             0 | async
(2 rows)
```
Modify the HA JSON configuration file to contain information about node3 (see CFEngine configuration, step 2). You should have a configuration similar to the one below:
[root@node3 masterfiles]# cat /var/cfengine/masterfiles/cfe_internal/enterprise/ha/ha_info.json
{
  "192.168.100.10": {
    "sha": "b1463b08a89de98793d45a52da63d3f100247623ea5e7ad5688b9d0b8104383f",
    "internal_ip": "192.168.100.10",
    "is_in_cluster" : true
  },
  "192.168.100.11": {
    "sha": "b13db51615afa409a22506e2b98006793c1b0a436b601b094be4ee4b32b321d5",
    "internal_ip": "192.168.100.11"
  },
  "192.168.100.12": {
    "sha": "98f14786389b2fe5a93dc3ef4c3c973ef7832279aa925df324f40697b332614c",
    "internal_ip": "192.168.100.12",
    "is_in_cluster" : false
  }
}
Please note that the is_in_cluster parameter is optional for the two nodes in the HA cluster and defaults to true. For the 3-node setup, node3, which is not part of the cluster, MUST be marked with the "is_in_cluster" : false configuration parameter.

Start the cf-execd process (don't start the cf-hub process, as it is not needed until a manual failover to node3 is performed). Please also note that during normal operations the cf-hub process should not be running on node3.
Manual failover to disaster-recovery node
Before starting the manual failover process make sure both the active and passive nodes are not running.
Verify that PostgreSQL is running on the 3rd node and that data replication from the active node is no longer in progress. If the database is still actively replicating data from the active cluster node, make sure that this process finishes and that no new data is stored in the active database instance.
After verifying that replication is finished and data is synchronized between the active database node and the replica node (or once node1 and node2 are both down), promote PostgreSQL to exit recovery and begin read-write operations:
cd /tmp && su cfpostgres -c "/var/cfengine/bin/pg_ctl -c -w -D /var/cfengine/state/pg/data -l /var/log/postgresql.log promote"
In order to make the failover process as easy as possible there is a
"failover_to_replication_node_enabled"
class defined in both /var/cfengine/masterfiles/controls/VERSION/def.cf and /var/cfengine/masterfiles/controls/VERSION/update_def.cf. In order to start collecting reports and serving policy from the 3rd node, uncomment the line defining this class.
IMPORTANT: Please note that as long as either the active or the passive cluster node is accessible for clients to contact, failover to the 3rd node is not possible. If the active or passive node is running and failover to the 3rd node is required, make sure to disable the network interfaces that clients are bootstrapped to, so that clients cannot access any node other than the disaster-recovery one.
Troubleshooting
If either the IPaddr2 or pgsql resource is not running, try to enable it first with

pcs cluster enable --all

If this does not start the resources, you can try to run them in debug mode with the command

pcs resource debug-start <resource-name>

The latter command should print diagnostic messages explaining why the resources are not started. If
crm_mon -Afr1
is printing errors similar to those below:

[root@node1]# pcs status
Cluster name: cfcluster
Last updated: Tue Jul 7 11:27:23 2015
Last change: Tue Jul 7 11:02:40 2015
Stack: cman
Current DC: node1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 Resource Group: cfengine
     cfvirtip   (ocf::heartbeat:IPaddr2):       Started node1
 Master/Slave Set: mscfpgsql [cfpgsql]
     Stopped: [ node1 node2 ]

Failed actions:
    cfpgsql_start_0 on node1 'unknown error' (1): call=13, status=complete, last-rc-change='Tue Jul 7 11:25:32 2015', queued=1ms, exec=137ms
you can try to clear the errors by running

pcs resource cleanup <resource-name>

This should clear the errors for the given resource and make the cluster restart it.

[root@node1 vagrant]# pcs resource cleanup cfpgsql
Resource: cfpgsql successfully cleaned up
[root@node1 vagrant]# pcs status
Cluster name: cfcluster
Last updated: Tue Jul 7 11:29:36 2015
Last change: Tue Jul 7 11:29:08 2015
Stack: cman
Current DC: node1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 Resource Group: cfengine
     cfvirtip   (ocf::heartbeat:IPaddr2):       Started node1
 Master/Slave Set: mscfpgsql [cfpgsql]
     Masters: [ node1 ]
     Stopped: [ node2 ]
After a cluster crash, make sure to always start the node that should be active first, and then the one that should be passive. If the cluster is not running on a given node after restart, you can enable it by running the following command:
[root@node2]# pcs cluster start
Starting Cluster...
Hub Administration
Find out how to perform common hub administration tasks like resetting admin credentials, or using custom SSL certificates.
Regenerate Self Signed SSL Certificate
When first installed, a self-signed SSL certificate is automatically generated
and used to secure Mission Portal and API communications. You can regenerate
this certificate by running the cfe_enterprise_selfsigned_cert
bundle with the
_cfe_enterprise_selfsigned_cert_regenerate_certificate
class defined. This can be
done by running the following command as root on the hub.
# cf-agent --no-lock --inform \
--bundlesequence cfe_enterprise_selfsigned_cert \
--define _cfe_enterprise_selfsigned_cert_regenerate_certificate
Configure a custom LDAP port
Mission Portal's User settings and preferences page provides a radio button for encryption. This controls the encryption used and the port to connect to.
If you want to configure LDAP authentication to use a custom port you can do so via the Status and Setting REST API.
Status and Settings REST API
This example shows using jq to preserve the existing settings and update the
SSL LDAP port to 3269
.
Note: The commands are run as root on the hub, and the hub's self-signed certificate is used to connect to the API over HTTPS. An accessToken must be retrieved from /var/cfengine/httpd/htdocs/ldap/config/settings.php.
[root@hub ~]# export CACERT="/var/cfengine/httpd/ssl/certs/hub.cert"
[root@hub ~]# export API="https://hub/ldap/settings"
[root@hub ~]# export AUTH_HEADER="Authorization:<accessToken from settings.php as mentioned above>"
[root@hub ~]# export CURL="curl --silent --cacert ${CACERT} -H ${AUTH_HEADER} ${API}"
[root@hub ~]# ${CURL} | jq '.data'
{
"domain_controller": "ldap.jumpcloud.com",
"custom_options": {
"24582": 3
},
"version": 3,
"group_attribute": "",
"admin_password": "Password is set",
"base_dn": "ou=Users,o=5888df27d70bea3032f68a88,dc=jumpcloud,dc=com",
"login_attribute": "uid",
"port": 2,
"use_ssl": true,
"use_tls": false,
"timeout": 5,
"ldap_filter": "(objectClass=inetOrgPerson)",
"admin_username": "uid=missionportaltesting,ou=Users,o=5888df27d70bea3032f68a88,dc=jumpcloud,dc=com"
}
[root@hub ~]# ${CURL} -X PATCH -d '{"port":3269}'
{"success":true,"data":"Settings successfully saved."}
Backup and Restore
With policy stored in version control there are a few things that should be preserved in your backup and restore plan.
Hub Identity
CFEngine's trust model is based on public and private key exchange. In order to re-provision a hub and for remote agents to retain trust, the hub's key pair must be preserved and restored.
Include $(sys.workdir)/ppkeys/localhost.pub
and
$(sys.workdir)/ppkeys/localhost.priv
in your backup and restore plan.
Note: This is the most important thing to backup.
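A minimal backup sketch, assuming the default /var/cfengine work directory and a hypothetical /backup destination:

# Archive the hub's key pair (the two files listed above)
tar -czf /backup/cfengine-hub-ppkeys.tar.gz \
  /var/cfengine/ppkeys/localhost.pub \
  /var/cfengine/ppkeys/localhost.priv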
Hub License
Enterprise hubs will collect reports for up to the licensed number of hosts. When re-provisioning a hub you will need the license that matches the hub identity in order to be able to collect reports for more than 25 hosts.
Include $(sys.workdir)/licenses
in your backup plan.
Hub Databases
Data collected from remote hosts and configuration information for Mission Portal is stored on the hub in PostgreSQL which can be backed up and restored using standard tools.
If you wish to rebuild a hub and restore the history of policy outcomes, you must back up and restore these databases.
Host Data
cfdb
stores data related to policy runs on your hosts, for example host inventory.
Backup:
# pg_dump -Fc cfdb > cfdb.bak
Restore:
# pg_restore -Fc -d cfdb cfdb.bak
Mission Portal
cfmp
and cfsettings
store Mission Portals configuration information for
example shared dashboards.
Backup:
# pg_dump -Fc cfmp > cfmp.bak
# pg_dump -Fc cfsettings > cfsettings.bak
Restore:
# pg_restore -Fc -d cfmp cfmp.bak
# pg_restore -Fc -d cfsettings cfsettings.bak
Lookup License Info
Information about the currently issued license can be obtained from the About section in Mission Portal web interface or from the command line as shown here.
Note: When the CFEngine Enterprise license expires, report collection is limited. No agent-side functionality is changed. However, if you are using functions or features that rely on information collected by the hub, that information will no longer be a reliable source of data.
Get license info via API
Run from the hub itself.
$ curl -u admin http://localhost/api/
Get license info from cf-hub
Run as root
from the hub itself.
# cf-hub -Fvn | grep -i expiring
2016-07-11T15:54:23+0000 verbose: Found 25 CFEngine Enterprise licenses, expiring on 2222-12-25 for FREE ENTERPRISE - http://cfengine.com/terms for terms
Custom SSL Certificate
When first installed, a self-signed SSL certificate is automatically generated
and used to secure Mission Portal and API communications. You can swap this
certificate out with a custom one by replacing
/var/cfengine/httpd/ssl/certs/<hostname>.cert
and
/var/cfengine/httpd/ssl/private/<hostname>.key
, where hostname is the fully
qualified domain name of the host.
After installing the certificate please make sure that the certificate
at /var/cfengine/httpd/ssl/certs/<hostname>.cert
is world-readable on the hub.
This is needed because the Mission Portal web application needs to access it directly.
You can test by verifying that you can access the certificate with an unprivileged user account on the hub.
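For example, a quick check could look like the following; the nobody account here is just a stand-in for any unprivileged user:

# Should print the certificate without a permission error
sudo -u nobody cat /var/cfengine/httpd/ssl/certs/$(hostname -f).cert > /dev/null \
  && echo "certificate is readable"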
You can get the fully qualified hostname on your hub by running the following commands.
[root@hub ~]# cf-promises --show-vars=default:sys\.fqhost
default:sys.fqhost hub inventory,source=agent,attribute_name=Host name
[root@hub ~]# hostname -f
hub
Re-installing Enterprise Hub
Sometimes it is useful to re-install the hub while still preserving existing
trust and licensing. To preserve trust the $(sys.workdir)/ppkeys
directory
needs to be backed up and restored. To preserve enterprise licensing,
$(sys.workdir)/license.dat
and $(sys.workdir)/licenses/
should be backed up.
Note: Depending on how and when your license was installed,
$(sys.workdir)/license.dat
and/or $(sys.workdir)/licenses/
may not exist. That is ok.
Warning: This process will not preserve any Mission Portal specific configuration except for the upstream VCS repository configuration. LDAP, roles, dashboards, and any other configuration done within Mission Portal will be lost.
This script in core/contrib serves as an example.
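As a rough sketch, assuming the default /var/cfengine work directory and a hypothetical /root/hub-identity.tar.gz destination, the material above could be captured before re-installation with something like:

# Missing license files are reported on stderr (suppressed here); the rest is still archived
tar -czf /root/hub-identity.tar.gz \
  /var/cfengine/ppkeys \
  /var/cfengine/license.dat \
  /var/cfengine/licenses 2>/dev/null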
Policy Deployment
By default CFEngine policy is distributed from /var/cfengine/masterfiles
on
the policy server. It is common (and recommended) for masterfiles to be backed
by a version control system (VCS) such as git or subversion. This document
details usage with git, but the tooling is designed to be flexible and easily
modified to support any upstream versioning system.
CFEngine Enterprise ships with tooling to assist in the automated deployment of
policy from a version control system to /var/cfengine/masterfiles
on the hub.
Ensure policy in upstream repository is current
This is critical. When you deploy policy, you will overwrite your current
/var/cfengine/masterfiles
. So take the current contents thereof and make sure
they are in the Git repository you are going to use.
For example, if you create a new repository in GitHub by following the
instructions from https://help.github.com/articles/create-a-repo, you can add
the contents of masterfiles
to it with the following commands (assuming you
are already in your local repository checkout):
cp -r /var/cfengine/masterfiles/* .
git add *
git commit -m 'Initial masterfiles check in'
git push origin master
Configure the upstream VCS
To configure the upstream repository you must provide the URI, credentials (a passphraseless SSH key), and the branch to deploy from.
Configuring upstream VCS via Mission Portal
The upstream VCS is configured in the Mission Portal VCS integration panel. To access it, click on "Settings" in the top-left menu of the Mission Portal screen, and then select "Version control repository".
Configuring upstream VCS manually
The upstream VCS can be configured manually by modifying
/opt/cfengine/dc-scripts/params.sh
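The file contains shell variables describing the repository. The variable names below are illustrative assumptions only; consult the comments in params.sh on your hub for the exact names your version uses:

# /opt/cfengine/dc-scripts/params.sh -- sketch with hypothetical values
VCS_TYPE="GIT"
GIT_URL="git@github.com:example/masterfiles.git"    # your upstream repository
GIT_REFSPEC="master"                                 # branch to deploy from
GIT_PRIVATE_KEY="/opt/cfengine/userworkdir/admin/.ssh/id_rsa"  # passphraseless key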
Manually triggering a policy deployment
After the upstream VCS has been configured you can trigger a policy deployment
manually by defining the cfengine_internal_masterfiles_update
class for a run of the
update policy.
For example:
[root@hub ~]# cf-agent -KIf update.cf --define cfengine_internal_masterfiles_update
info: Executing 'no timeout' ... '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh'
info: Command related to promiser '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh' returned code defined as promise kept 0
info: Completed execution of '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh'
This is useful if you would like more manual control of policy releases.
Configuring automatic policy deployments
To configure automatic deployments simply ensure the
cfengine_internal_masterfiles_update
class is defined on your policy hub.
Configuring automatic policy deployments with the augments file
Create def.json
in the root of your masterfiles with the following content:
{
"classes": {
"cfengine_internal_masterfiles_update": [ "hub" ]
}
}
Configuring automatic policy deployments with policy
Simply edit bundle common update_def
in controls/update_def.cf
.
bundle common update_def
{
# ...
classes:
# ...
"cfengine_internal_masterfiles_update" expression => "policy_server";
# ...
}
Troubleshooting policy deployments
Before policy is deployed from the upstream VCS to /var/cfengine/masterfiles
the policy is first validated by the hub. If this validation fails the policy
will not be deployed.
For example:
[root@hub ~]# cf-agent -KIf update.cf --define cfengine_internal_masterfiles_update
info: Executing 'no timeout' ... '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh'
error: Command related to promiser '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh' returned code defined as promise failed 1
info: Completed execution of '/var/cfengine/httpd/htdocs/api/dc-scripts/masterfiles-stage.sh'
R: Masterfiles deployment failed, for more info see '/var/cfengine/outputs/dc-scripts.log'
error: Method 'cfe_internal_masterfiles_stage' failed in some repairs
error: Method 'cfe_internal_update_from_repository' failed in some repairs
info: Updated '/var/cfengine/inputs/cfe_internal/update/cfe_internal_update_from_repository.cf' from source '/var/cfengine/masterfiles/cfe_internal/update/cfe_internal_update_from_repository.cf' on 'localhost'
Policy deployments are logged to /var/cfengine/outputs/dc-scripts.log
. The
logs contain useful information about the failed deployment. For example here I
can see that there is a syntax error in promises.cf
near line 14.
[root@prihub ~]# tail -n 5 /var/cfengine/outputs/dc-scripts.log
/opt/cfengine/masterfiles_staging_tmp/promises.cf:14:46: error: Expected ',', wrong input '@(inventory.bundles)'
@(inventory.bundles),
^
error: There are syntax errors in policy files
The staged policies in /opt/cfengine/masterfiles_staging_tmp could not be validated, aborting.: Unknown Error
Enable plain http
By default HTTPS is enforced by redirecting any non-secure connection requests.
If you would like to enable plain HTTP you can do so by defining
cfe_enterprise_enable_plain_http
from an augments file.
For example, simply place the following inside def.json
in the root of your
masterfiles.
{
"classes": {
"cfe_enterprise_enable_plain_http": [ "any" ]
}
}
Custom LDAPs Certificate
To use a custom LDAPS certificate, install it into your hub's operating system.
Note: you can use the LDAPTLS_CACERT
environment variable to use a custom
certificate for testing with ldapsearch
before it has been installed into the
system.
[root@hub]:~# env LDAPTLS_CACERT=/tmp/MY-LDAP-CERT.cert.pem ldapsearch -xLLL -H ldaps://ldap.example.local:636 -b "ou=people,dc=example,dc=local"
Public key distribution
How can I arrange for the hosts in my infrastructure to trust a new key?
If you are deploying a new hub, or authorizing a non-hub to copy files from peers, you will need to establish trust before communication can take place.
In order for trust to be established each host must have the public key of the
other host stored in $(sys.workdir)/ppkeys
named for the public key sha.
For example, we have 2 hosts. host001 with public key sha
SHA=917962161107efaed9610de3e034085373142f577fb7e7b9bddec2955b748836
and hub
with public key sha
SHA=af00250085306c68bb6d5f489f0239e2d7ff8a1f53f2d00e77c9ad2044309dfe
. For
trust to be established host001
must have
$(sys.workdir)/ppkeys/root-SHA=af00250085306c68bb6d5f489f0239e2d7ff8a1f53f2d00e77c9ad2044309dfe.pub
and hub must have
$(sys.workdir)/ppkeys/root-SHA=917962161107efaed9610de3e034085373142f577fb7e7b9bddec2955b748836.pub
.
The files must be root owned with write access restricted to the owner (644 or
less).
This policy shows how public keys can be stored in a central location on the policy server and automatically installed on all hosts.
bundle agent trust_distkeys
#@ brief Example public key distribution
{
meta:
"tags" slist => { "autorun" };
vars:
"keystore"
comment => "We want all hosts to trust these hosts because they perform
critical functions like policy serving.",
string => ifelse( isvariable( "def.trustkeys[keystore]" ), "$(def.trustkeys[keystore])",
"distkeys");
files:
"$(sys.workdir)/ppkeys/."
handle => "trust_distkeys",
comment => "We need trust all the keys stored in `$(keystore)` on
`$(sys.policy_hub)` so that we can communicate with them
using the CFEngine protocol.",
copy_from => remote_dcp( "$(keystore)", "$(sys.policy_hub)" ),
depth_search => basedir,
file_select => public_keys,
perms => mog( "644", "root", "root" );
}
bundle server share_distkeys
#@ brief Share the directory containing public keys we need to distribute
{
access:
(policy_server|am_policy_hub)::
"/var/cfengine/distkeys/"
admit_ips => { "0.0.0.0/0" },
shortcut => "distkeys",
handle => "access_share_distkeys",
comment => "This directory contains public keys of hosts that should be
trusted by everyone.";
}
body depth_search basedir
#@ brief Search the files in the top level of the source directory
{
include_basedir => "true";
depth => "1";
}
body file_select public_keys
#@ brief Select plain files matching public key file naming patterns
{
# root-SHA=abc123.pub
leaf_name => { "\w+-(SHA|MD5)=[[:alnum:]]+\.pub" };
file_types => { "plain" };
file_result => "leaf_name.file_types";
}
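With the policy above in place, trusting a new host is a matter of dropping its public key into the shared directory on the policy hub. A sketch, reusing the host001 key digest from earlier in this section and assuming the hub already holds that key in its ppkeys directory:

# On the policy hub: stage a client's public key for distribution
mkdir -p /var/cfengine/distkeys
cp /var/cfengine/ppkeys/root-SHA=917962161107efaed9610de3e034085373142f577fb7e7b9bddec2955b748836.pub \
   /var/cfengine/distkeys/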
Adjusting Schedules
Set cf-execd agent execution schedule
By default cf-execd
is configured to run cf-agent
every 5 minutes. This can
be adjusted by tuning the schedule in body executor
control
. In the Masterfiles Policy Framework, body executor control can be found in controls/cf_execd.cf.
Set cf-hub hub_schedule
cf-hub, the CFEngine Enterprise report collection component, has a hub_schedule defined in body hub control, which also defaults to a 5 minute schedule. It can be adjusted to control how frequently hosts should be collected from. In the Masterfiles Policy Framework, body hub control can be found in controls/cf_hub.cf.
Note: Mission Portal has an "Unreachable host threshold" that defaults to 15
minutes. When a host has not been collected from within this window the host is
added to the "Hosts not reporting" report. When adjusting the cf-hub
hub_schedule
consider adjusting the Unreachable host threshold proportionally.
For example, if you change the hub_schedule
to execute only once every 15
minutes, then the Unreachable host threshold should be adjusted to 45 minutes
(2700 seconds).
Set Unreachable host threshold via API
Note: This example uses jq to filter API results to only the relevant values. It is a 3rd party tool, and not shipped with CFEngine.
Here we create a JSON payload with the new value for the Unreachable host
threshold (blueHostHorizon
). We post the new settings and finally query the
API to validate the change in settings.
[root@hub ~]# echo '{ "blueHostHorizon": 2700 }' > payload.json
[root@hub ~]# cat payload.json
{ "blueHostHorizon": 2700 }
[root@hub ~]# curl -u admin:admin http://localhost:80/api/settings -X POST -d @./payload.json
[root@hub ~]# curl -s -u admin:admin http://localhost:80/api/settings/ | jq '.data[0]|.blueHostHorizon'
2700
Reset administrative credentials
The default admin
user can be reset to defaults using the following SQL.
cfsettings-setadminpassword.sql:
INSERT INTO "users" ("username", "password", "salt", "name", "email", "external", "active", "roles", "changetimestamp")
SELECT 'admin', 'SHA=aa459b45ecf9816d472c2252af0b6c104f92a6faf2844547a03338e42e426f52', 'eWAbKQmxNP', 'admin', 'admin@organisation.com', false, '1', '{admin,cf_remoteagent}', now()
ON CONFLICT (username, external) DO UPDATE
SET password = 'SHA=aa459b45ecf9816d472c2252af0b6c104f92a6faf2844547a03338e42e426f52',
salt = 'eWAbKQmxNP';
To reset the CFEngine admin user, run the following SQL as root on your hub:
root@hub:~# psql cfsettings < cfsettings-setadminpassword.sql
Extending Query Builder in Mission Portal
This guide explains how to extend the Query Builder when the Enterprise hub database has new or custom tables that you want to use on the reporting page.
The workflow in this guide edits a file that CFEngine will overwrite when you upgrade to a newer version,
so your changes would otherwise be deleted. Please make sure to either keep a copy of the edits you want to preserve,
or add the relative file path scripts/advancedreports/dca.js
to $(sys.workdir)/httpd/htdocs/preserve_during_upgrade.txt
to preserve dca.js
during the CFEngine upgrade process.
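A minimal sketch of the second option, assuming the default /var/cfengine work directory:

# Keep the customized dca.js across hub package upgrades
echo "scripts/advancedreports/dca.js" >> /var/cfengine/httpd/htdocs/preserve_during_upgrade.txt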
How to add new table to Query Builder
To extend the query builder with your custom data you need to edit the javascript file located on your hub here:
$(sys.workdir)/share/GUI/scripts/advancedreports/dca.js
.
There you will find the DCA
variable that contains a JSON object:
var DCA = {
'Hosts':
.........
}
Each element of this JSON object describes database table information. You need to add a new JSON element with your new table information.
Structure of JSON element
Below you can see an example of the hosts table represented as a JSON element.
'Hosts':
{
'TableID': 'Hosts',
'Keys' : {'primary_key': 'HostKey' },
'label' : 'Hosts',
'Fields' : {
'Hosts.HostKey': {
"name" : "HostKey",
"label" : "Host key",
"inputType" : "text",
"table" : 'Hosts',
"sqlField" : 'Hosts.HostKey',
"dataType" : "string"
},
'Hosts.LastReportTimeStamp': {
"name" : "LastReportTimeStamp",
"label" : "Last report time",
"inputType" : "text",
"table" : 'Hosts',
"sqlField" : 'Hosts.LastReportTimeStamp',
"dataType" : 'timestamp'
},
'Hosts.HostName': {
"name" : "HostName",
"label" : "Host name",
"inputType" : "text",
"table" : 'Hosts',
"sqlField" : 'Hosts.HostName',
"dataType" : "string"
},
'Hosts.IPAddress': {
"name" : "IPAddress",
"label" : "IP address",
"inputType" : "text",
"table" : 'Hosts',
"sqlField" : 'Hosts.IPAddress',
"dataType" : "string"
},
'Hosts.FirstReportTimeStamp': {
"name" : "FirstReportTimeStamp",
"label" : "First report-time",
"inputType" : "text",
"table" : 'Hosts',
"sqlField" : 'Hosts.FirstReportTimeStamp',
"dataType" : 'timestamp'
}
}
}
Structure:
Each element has a key and a value. When you create your own JSON element please use a unique key. The value is a
JSON object; please see the explanations below. The element's key should be equal to TableID
.
- TableID (string) Table id, can be the same as main element key, should be unique.
- Keys (json)
Table keys; describe the primary key here, e.g.:
{'primary_key': 'HostKey'}
. The primary key is case-sensitive. primary_key
is the only possible key in the Keys
structure. - Label (string) Label contains a table's name that will be shown on the UI. It is not necessary to use the real table name; it can be an alias for better representation.
Fields (json) JSON object that contains table columns.
Fields structure:
Fields object is presented as JSON, where key is unique table's key and value is JSON representation of table column properties.
The element's key should be equal to sqlField
- name (string) Field's name
- label (string) Label contains a field's name that will be shown on the UI. Not necessary to use a real field name, an alias can be used for better representation.
- inputType (string)
Type of input fields, will be used to create filter input for this field. Allowed values:
text
,textarea
,select
- a drop-down list,multiple
- a drop-down list that allows multiple selections,radio
,checkboxes
- table (string) Field's table name
- sqlField (string)
Concatenation of
table name
.field name
, e.g. Hosts.FirstReportTimeStamp
- dataType (string)
Column's database type, allowed values:
timestamp
,string
,real
,integer
,array
After editing dca.js, please validate the content of the DCA variable (var DCA =
) with a JSON validation tool;
there are many online tools for this. Once your content is validated and the file is saved, your changes will appear after
the next agent run.
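For instance, if you copy the value assigned to DCA into a scratch file by hand (here /tmp/dca.json, a hypothetical path), a command-line check with the third-party jq tool (not shipped with CFEngine) could look like this:

# Exits non-zero and prints a parse error if the content is not valid JSON
jq . /tmp/dca.json > /dev/null && echo "DCA content is valid JSON"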
Example
Let's walk through an example of extending the Query Builder with a new test table.
- Create a new table in the cfdb database
CREATE TABLE IF NOT EXISTS "test" (
"hostkey" text PRIMARY KEY,
"random_number" integer NOT NULL,
"inserted_time" timestamptz NOT NULL DEFAULT 'now()'
);
- Fill the table with data from the hosts.
INSERT INTO "test" SELECT "hostkey", (random() * 100)::int as random_number FROM "__hosts";
- Add a new element to the JSON object
'Test':
{
'TableID': 'Test',
'Keys' : {'primary_key': 'Hostkey' },
'label' : 'Test table',
'Fields' : {
'Test.hostkey': {
"name" : "hostkey",
"label" : "Host key",
"inputType" : "text",
"table" : 'Test',
"sqlField" : 'Test.hostkey',
"dataType" : "string"
},
'Test.random_number': {
"name" : "random_number",
"label" : "Random number",
"inputType" : "text",
"table" : 'Test',
"sqlField" : 'Test.random_number',
"dataType" : 'integer'
},
'Test.inserted_time': {
"name" : "inserted_time",
"label" : "Inserted time",
"inputType" : "text",
"table" : 'Test',
"sqlField" : 'Test.inserted_time',
"dataType" : "timestamp"
}
}
}
- See the result in the Query Builder
After the next cf-agent run the file will be updated in the Mission Portal and you will be able to see the new table in the Query Builder. You can use this table just like the predefined ones.
Report based on the new table:
Extending Mission Portal
Custom pages requiring authenticated users
Mission Portal can render static text files (html, sql, txt, etc.) for users who are logged in.
How to use
Upload files to
$(sys.workdir)/httpd/htdocs/application/modules/files/static_files
on your
hub. Access the content using the url https://hub/files/view/file_name.html,
where file_name.html
is the name of the file. Please note, uploaded files should
have read permission for the cfapache
user.
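A small sketch of publishing a page; report-notes.html is a hypothetical file name:

# Copy the page into place with world-readable permissions so the cfapache user can read it
install -m 0644 report-notes.html \
  /var/cfengine/httpd/htdocs/application/modules/files/static_files/
# The page is then available at https://hub/files/view/report-notes.html for logged-in users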
Custom help menu entries
The Mission Portal help menu can be extended with custom entries. This can be useful if you would like to make extra content, like documentation, easily available to users.
How to use
Upload html files into
$(sys.workdir)/httpd/htdocs/application/views/extraDocs/
on your hub. Menu
items will appear named for each html file where underscores are replaced with
spaces. Files must be readable by the cfapache
user.
Example
File test_documentation.html
was uploaded to the directory specified above.
Mission Portal Style
Use the following structure in your HTML to style the page the same as the rest of Mission Portal.
<div class="contentWrapper help">
<div class="pageTitle">
<h1>PAGE TITLE</h1>
</div>
<!-- CONTENT -->
</div>
Install and Get Started
Installation
The General Installation instructions provide the detailed steps for installing CFEngine, which are generally the same steps to follow for CFEngine Enterprise, with the exception of license keys (if applicable), and also some aspects of post-installation and configuration.
Installing Enterprise Licenses
Before you begin, you should have your license key, unless you only plan to use the free 25 node license. The installation instructions will be provided with the key.
Post-Install Configuration
Change Email Setup After CFEngine Enterprise Installation
For Enterprise 3.6 local mail relay is used, and it is assumed the server has a proper mail setup.
The default FROM email for all emails sent from the Mission Portal is admin@organization.com
. This can be changed on the CFE Server in /var/cfengine/httpd/htdocs/application/config/appsettings.php:$config['appemail']
.
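To see the value currently in effect, you can grep the setting directly on the hub (path as above):

grep appemail /var/cfengine/httpd/htdocs/application/config/appsettings.php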
Version your policies
Consider enabling the built-in version control of your policies as described in Version Control and Configuration Policy.
Whether you do or not, please put your policies in some kind of
backed-up VCS. Losing work because of "fat fingering" rm
commands is
very, very depressing.
Configure collection for monitoring data
Monitoring allows you to sample a metric and assess its value
across your hosts over time. Collection of monitoring information is disabled by
default. Metrics must match monitoring_include
in the appropriate
report_data_select
body.
The Masterfiles Policy Framework uses body
report_data_select default_data_select_policy_hub
to specify metrics that
should be collected from policy hubs and default_data_select_host
to specify
metrics that should be collected from non hubs.
For example:
To collect all metrics from hubs:
body report_data_select default_data_select_policy_hub
# @brief Data to collect from policy servers by default
#
# By convention variables and classes known to be internal, (having no
# reporting value) should be prefixed with an underscore. By default the policy
# framework explicitly excludes these variables and classes from collection.
{
metatags_include => { "inventory", "report" };
metatags_exclude => { "noreport" };
promise_handle_exclude => { "noreport_.*" };
monitoring_include => { ".*" };
}
To collect cpu
, loadavg
, diskfree
, swap_page_in
,
cpu_utilization
, swap_utilization
, and memory_utilization
from
non hubs:
body report_data_select default_data_select_host
##### @brief Data to collect from remote hosts by default
#####
##### By convention variables and classes known to be internal, (having no
##### reporting value) should be prefixed with an underscore. By default the policy
##### framework explicitly excludes these variables and classes from collection.
{
metatags_include => { "inventory", "report" };
metatags_exclude => { "noreport" };
promise_handle_exclude => { "noreport_.*" };
monitoring_include => {
"cpu",
"loadavg",
"diskfree",
"swap_page_in",
"cpu_utilization",
"swap_utilization",
"memory_utilization",
};
}
Review settings
See the Masterfiles Policy Framework for a full list of all the settings you can configure.
User Interface
The challenge in engineering IT infrastructure, especially as it scales vertically and horizontally, is to recognize the system components, what they do at any given moment in time (or over time), and when and how they change state.
CFEngine Enterprise's data collection service, the cf-hub
collector, collects,
organizes, and stores data from every host. The data is stored primarily in a
PostgreSQL database.
CFEngine Enterprise's user interface, the Mission Portal, makes that data available to authorized users as high level reports or alerts and notifications. The reports can be designed in a GUI report builder or directly with SQL statements passed to PostgreSQL.
Dashboard
The Mission Portal dashboard allows users to create customized summaries showing the current state of the infrastructure and its compliance with deployed policy.
The dashboard contains informative widgets that you can customize to create alerts. All notifications of alert state changes, e.g. from OK to not-OK, are stored in an event log for later inspection and analysis.
Make changes to shared dashboard
Create an editable copy by clicking the button that appears when you hover over the dashboard's row.
Alert widgets
Alerts can have three different severity levels: low, medium and high. These are represented by yellow, orange and red rings respectively, along with the percentage of hosts the alerts have triggered on. Hovering over the widget will show the information as text in a convenient list format.
You can pause alerts during maintenance windows or while working on resolving an underlying issue to avoid unnecessary triggering and notifications.
Alerts can have three different states: OK, triggered, and paused. It is easy to filter by state on each widget's alert overview.
Find out more: Alerts and Notifications
Changes widget
The changes widget helps to visualize the number of changes (promises repaired) made by cf-agent.
Event log
The event log on the dashboard is filtered to show only information relevant to the widgets present. It shows when alerts are triggered and cleared and when hosts are bootstrapped or decommissioned.
Host count widget
The host count widget helps to visualize the number of hosts bootstrapped to CFEngine over time.
Hosts and Health
CFEngine collects data on promise compliance, and sorts hosts according to 3 different categories: erroneous, fully compliant, and lacking data.
Find out more: Hosts and Health
Reporting
Inventory reports allow for quick reporting on out-of-the-box attributes. The attributes are also extensible, by tagging any CFEngine variable or class, such as the role of the host, inside your CFEngine policy. These custom attributes will be automatically added to the Mission Portal.
You can reduce the amount of data or find specific information by filtering on attributes and host groups. Filtering is independent from the data presented in the results table: you can filter on attributes without them being presented in the table of results.
Add and remove columns from the results table in real time, and once you're happy with your report, save it, export it, or schedule it to be sent by email regularly.
Find out more: Reporting
Follow along in the custom inventory tutorial or read the MPF policy that provides inventory.
Sharing
Dashboards, Host categorization views, and Reports can be shared based on role.
Please note that the logic for sharing based on roles is different from the logic that controls which hosts a given role can access data for. When a Dashboard, Host categorization, or report is shared with a role, anyone having that role is allowed to access it. For example, if a dashboard is shared with the reporting and admin roles, users with either the reporting role or the admin role are allowed access.
For example:
- user1 has only the reporting role.
- admin has the admin role.
If the admin user creates a new dashboard and shares it with the reporting role, then any user (including user1) with the reporting role will be able to subscribe to the new dashboard. Additionally, the dashboard owner, in this case admin, also has access to the custom dashboard.
Monitoring
Monitoring allows you to get an overview of your hosts over time.
Find out more: Monitoring
Settings
A variety of CFEngine and system properties can be changed in the Settings view.
Find out more: Settings
Settings
A variety of CFEngine and system properties can be changed in the Settings view.
- Opening Settings
- Preferences
- User Management
- Role Management
- Manage Apps
- Version Control Repository
- Host Identifier
- Mail Settings
- Authentication settings
- About CFEngine
Opening Settings
Settings are accessible from any view of the Mission Portal, from the drop down in the top right-hand corner.
Preferences
User settings and preferences allow the CFEngine Enterprise administrator to change various options, including:
- Turn on or off RBAC
- When RBAC is disabled any user can see a host that has reported classes
- Note, administrative functions like the ability to delete hosts are not affected by this setting and hosts that have no reported classes are never shown.
- Unreachable host threshold
- Number of samples used to identify a duplicate identity
- Log level
- Customize the user experience with the organization logo
User Management
User management is for adding or adjusting CFEngine Enterprise UI users, including their name, role, and password.
Role Management
Roles limit access to host data and access to shared assets like saved reports and dashboards.
Roles limit which hosts can be seen based on the classes reported by the host. For example, if you want to limit a user's reporting to only hosts in the "North American Data Center", you could set up a role that includes only the location_nadc class.
When multiple roles are assigned to a user, the user can access only resources that match the most restrictive role across all of their roles. For example, a user who has the admin role and a role that matches zero hosts will not see any hosts in Mission Portal. To access a shared report or dashboard, the user must have all roles that the report or dashboard was shared with.
In order to see a host, none of the classes reported by the host can match the class exclusions from any role the user has.
Users without a role will not be able to see any hosts in Mission Portal.
Role SUSE:
- Class include: SUSE
- Class exclude: empty
Role cfengine_3:
- Class include: cfengine_3
- Class exclude: empty
Role no_windows:
- Class include: cfengine_3
- Class exclude: windows
Role windows_ubuntu:
- Class include: windows
- Class include: ubuntu
- Class exclude: empty
User one has role SUSE.
User two has roles no_windows and cfengine_3.
User three has roles windows_ubuntu and no_windows.
A report shared with SUSE and no_windows will not be seen by any of the listed users.
A report shared with no_windows and cfengine_3 will only be seen by user two.
A report shared with SUSE will be seen by user one.
User one will only be able to see hosts that report the SUSE class.
User two will be able to see all hosts that have not reported the windows class.
User three will only be able to see hosts that have reported the ubuntu class.
Predefined Roles
- admin - The admin role can see everything and do anything.
- cf_remoteagent - This role allows execution of cf-runagent.
Default Role
To set the default role, click Settings -> User management -> Roles. You can then select which role will be the default role for new users.
Behaviour of Default Role:
Any new users created in Mission Portal's local user database will have this new role assigned.
Users authenticating through LDAP (if you have LDAP configured in Mission Portal) will get this new role applied the first time they log in.
Note that the default role will not have any effect on users that already exist (in Mission Portal's local database) or have already logged in (when using LDAP).
In effect this allows you to set the default permissions for new users (e.g. which hosts a user is allowed to see) by configuring the access for the default role.
Manage Apps
Application settings can help adjust some of CFEngine Enterprise UI app features, including the order in which the apps appear and their status (on or off).
Version Control Repository
The repository holding the organization's masterfiles can be adjusted on the Version Control Repository screen.
Host Identifier
Host identity for the server can be set within settings, and can be adjusted to refer to the FQDN, IP address, or an unqualified domain name.
Mail settings
Configure outbound mail settings:
Default from email: Email address that Mission Portal will use by default when sending emails.
Mail protocol: Use the system mailer (Sendmail) or an SMTP server.
Max email attachment size (MB): Mails sent by Mission Portal with attachments exceeding this size will have the attachment replaced with links to download the files.
Authentication settings
Mission Portal can authenticate against an external directory.
Special Notes:
LDAP API URL refers to the API CFEngine uses internally for authentication. Most likely you will not alter the default value.
LDAP filter must be supplied.
LDAP Host is the IP address or hostname of your LDAP server.
LDAP bind username should be the username used to bind and search the LDAP directory. It must be provided in distinguished name format.
Default roles for users is configured under Role Management.
See Also: LDAP authentication REST API
About CFEngine
The About CFEngine screen contains important information about the specific version of CFEngine being used, license information, and more.
Hosts and Health
Host Compliance
CFEngine collects data on promise
compliance. Each host is in one of two groups: out of compliance or fully compliant.
- A host is considered out of compliance if less than 100% of its promises were kept.
- A host is considered fully compliant if 100% of its promises were kept.
You can look at a specific sub-set of your hosts by selecting a category from the menu on the left.
Host Info
Here you will find extensive information on single hosts that CFEngine detects automatically in your environment. Since this is data gathered per host, you need to select a single host from the menu on the left first.
Host Health
You can get quick access to the health of hosts, including direct links to reports, from the Health drop down at the top of every Enterprise UI screen. Hosts are listed as unhealthy if:
- the hub was not able to connect to and collect data from the host within a set time interval (unreachable host). The time interval can be set in the Mission Portal settings.
- the policy did not get executed for the last three runs. This could be caused by cf-execd not running on the host (scheduling deviation) or an error in policy that stops its execution. The hub is still able to contact the host, but it will return stale data because of this deviation.
- two or more hosts use the same key. This is detected if the IP address tied to a CFEngine key has changed in the last three scheduled runs. The number of scheduled runs that cause the unhealthy status is configurable in settings.
- reports have recently been collected, but cf-agent has not recently run. “Recently” is defined by the configured run-interval of their cf-agent.
These categories are non-overlapping, meaning a host will only appear in one category at a time even if conditions satisfying multiple categories are present. This makes reports simpler to read, and makes it easier to detect and fix the root cause of the issue. As one issue is resolved, the host might then move to another category. In either situation the data from that host will be from old runs and probably does not reflect the current state of that host.
Alerts and Notifications
Create a New Alert
From the Dashboard, locate the rectangle with the dotted border.
When the cursor hovers over it, an Add button will appear.
- Click the button to begin creating the alert.
Add a unique name for the alert.
Each alert has a visual indication of its severity, represented by one of the following colors:
- Low: Yellow
- Medium: Orange
- High: Red
From the Severity dropdown box, select one of the three options available.
The Select Condition drop down box represents an inventory of existing conditional rules, as well as an option to create a new one.
When selecting an existing conditional rule, the name of the condition will automatically populate the mandatory condition Name field.
When creating a new condition the Name field must be filled in.
Each alert also has a Condition type:
- Policy conditions trigger alerts based on CFEngine policy compliance status. They can be set on bundles, promisees, and promises. If nothing is specified, they will trigger alerts for all policy.
- Inventory conditions trigger alerts for inventory attributes. These attributes correspond to the ones found in inventory reports.
- Software Updates conditions trigger alerts based on packages available for update in the repository. They can be set either for a specific version or trigger on the latest version available. If neither a package nor a version is specified, they will trigger alerts for any update.
- Custom SQL conditions trigger alerts based on an SQL query. The SQL query must return at least one column: hostkey.
Alert conditions can be limited to a subset of hosts.
- Notifications of alerts may be sent by email or custom action scripts.
Check the Email notifications box to activate the field for entering the email address to notify.
The Remind me dropdown box provides a selection of intervals to send reminder emails for triggered events.
Custom actions for Alerts
Once you have become familiar with the Alerts widgets, you might see the need to integrate the alerts with an existing system like Nagios, instead of relying on emails for getting notified.
This is where the Custom actions come in. A Custom action is a way to execute a script on the hub whenever an alert is triggered or cleared, as well as when a reminder happens (if set). The script will receive a set of parameters containing the state of the alert, and can do practically anything with this information. Typically, it is used to integrate with other alerting or monitoring systems like PagerDuty or Nagios.
Any scripting language may be used, as long as the hub has an interpreter for it.
Alert parameters
The Custom action script gets called with one parameter: the path to a file with a set of KEY=VALUE lines. Most of the keys are common for all alerts, but some additional keys are defined based on the alert type, as shown below.
Common keys
These keys are present for all alert types.
Key | Description |
---|---|
ALERT_ID | Unique ID (number). |
ALERT_NAME | Name, as defined when creating the alert (string). |
ALERT_SEVERITY | Severity, as selected when creating the alert (string). |
ALERT_LAST_CHECK | Last time alert state was checked (Unix epoch timestamp). |
ALERT_LAST_EVENT_TIME | Last time the alert created an event log entry (Unix epoch timestamp). |
ALERT_LAST_STATUS_CHANGE | Last time alert changed from triggered to cleared or the other way around (Unix epoch timestamp). |
ALERT_STATUS | Current status, either 'fail' (triggered) or 'success' (cleared). |
ALERT_FAILED_HOST | Number of hosts the alert is currently triggered on (number). |
ALERT_TOTAL_HOST | Number of hosts the alert is defined for (number). |
ALERT_CONDITION_NAME | Condition name, as defined when creating the alert (string). |
ALERT_CONDITION_DESCRIPTION | Condition description, as defined when creating the alert (string). |
ALERT_CONDITION_TYPE | Type, as selected when creating the alert. Can be 'policy', 'inventory', or 'softwareupdate'. |
Policy keys
In addition to the common keys, the following keys are present when ALERT_CONDITION_TYPE='policy'.
Key | Description |
---|---|
ALERT_POLICY_CONDITION_FILTERBY | Policy object to filter by, as selected when creating the alert. Can be 'bundlename', 'promiser' or 'promisees'. |
ALERT_POLICY_CONDITION_FILTERITEMNAME | Name of the policy object to filter by, as defined when creating the alert (string). |
ALERT_POLICY_CONDITION_PROMISEHANDLE | Promise handle to filter by, as defined when creating the alert (string). |
ALERT_POLICY_CONDITION_PROMISEOUTCOME | Promise outcome to filter by, as selected when creating the alert. Can be either 'KEPT', 'REPAIRED' or 'NOTKEPT'. |
Inventory keys
In addition to the common keys, the following keys are present when ALERT_CONDITION_TYPE='inventory'.
Key | Description |
---|---|
ALERT_INVENTORY_CONDITION_FILTER_$(ATTRIBUTE_NAME) | The name of the attribute as selected when creating the alert is part of the key (expanded), while the value set when creating is the value (e.g. ALERT_INVENTORY_CONDITION_FILTER_ARCHITECTURE='x86_64'). |
ALERT_INVENTORY_CONDITION_FILTER_$(ATTRIBUTE_NAME)_CONDITION | The name of the attribute as selected when creating the alert is part of the key (expanded), while the value is the comparison operator selected. Can be 'ILIKE' (matches), 'NOT ILIKE' (doesn't match), '=' (is), '!=' (is not), '<', '>'. |
... | There will be pairs of key=value for each attribute name defined in the alert. |
Software updates keys
In addition to the common keys, the following keys are present when ALERT_CONDITION_TYPE='softwareupdate'.
Key | Description |
---|---|
ALERT_SOFTWARE_UPDATE_CONDITION_PATCHNAME | The name of the package, as defined when creating the alert, or empty if undefined (string). |
ALERT_SOFTWARE_UPDATE_CONDITION_PATCHARCHITECTURE | The architecture of the package, as defined when creating the alert, or empty if undefined (string). |
Example parameters: policy bundle alert not kept
Given an alert that triggers on a policy bundle being not kept (failed), the following is example content of the file being provided as an argument to a Custom action script.
ALERT_ID='6'
ALERT_NAME='Web service'
ALERT_SEVERITY='high'
ALERT_LAST_CHECK='0'
ALERT_LAST_EVENT_TIME='0'
ALERT_LAST_STATUS_CHANGE='0'
ALERT_STATUS='fail'
ALERT_FAILED_HOST='49'
ALERT_TOTAL_HOST='275'
ALERT_CONDITION_NAME='Web service'
ALERT_CONDITION_DESCRIPTION='Ensure web service is running and configured correctly.'
ALERT_CONDITION_TYPE='policy'
ALERT_POLICY_CONDITION_FILTERBY='bundlename'
ALERT_POLICY_CONDITION_FILTERITEMNAME='web_service'
ALERT_POLICY_CONDITION_PROMISEOUTCOME='NOTKEPT'
Saving this as a file, e.g. 'alert_parameters_test', can be useful while writing and testing your Custom action script. You could then simply test your Custom action script, e.g. 'cfengine_custom_action_ticketing.py', by running
./cfengine_custom_action_ticketing.py alert_parameters_test
When you get this to work as expected on the command line, you are ready to upload the script to the Mission Portal, as outlined below.
Example script: logging policy alert to syslog
The following Custom action script will log the status and definition of a policy alert to syslog.
#!/bin/bash
source "$1"
if [ "$ALERT_CONDITION_TYPE" != "policy" ]; then
logger -i "error: CFEngine Custom action script $0 triggered by non-policy alert type"
exit 1
fi
logger -i "Policy alert '$ALERT_NAME' $ALERT_STATUS. Now triggered on $ALERT_FAILED_HOST hosts. Defined with $ALERT_POLICY_CONDITION_FILTERBY='$ALERT_POLICY_CONDITION_FILTERITEMNAME', promise handle '$ALERT_POLICY_CONDITION_PROMISEHANDLE' and outcome $ALERT_POLICY_CONDITION_PROMISEOUTCOME"
exit $?
What gets logged to syslog depends on which alert is associated with the script, but an example log-line is as follows:
Sep 26 02:00:53 localhost user[18823]: Policy alert 'Web service' fail. Now triggered on 11 hosts. Defined with bundlename='web_service', promise handle '' and outcome NOTKEPT
Uploading the script to the Mission Portal
Members of the admin role can manage Custom action scripts in the Mission Portal settings.
A new script can be uploaded, together with a name and description, which will be shown when creating the alerts.
Associating a Custom action with an alert
Alerts can have any number of Custom action scripts as well as an email notification associated with them. This can be configured during alert creation. Note that for security reasons, only members of the admin role may associate alerts with Custom action scripts.
Conversely, several alerts may be associated with the same Custom action script.
When the alert changes state from triggered to cleared, or the other way around, the script will run. The script will also run if the alert remains in triggered state and there are reminders set for the alert notifications.
Enterprise Reporting
CFEngine Enterprise can report on promise outcomes (changes made by cf-agent across your infrastructure), variables, classes, and measurements taken by cf-monitord. Reports cover fine-grained policy details; explore all the options by checking out the custom reports section of the Enterprise Reporting module.
Exactly which information is allowed to be collected by the hub for reporting is configured by report_data_select bodies. default_data_select_host() defines the data to be collected for a non policy hub and default_data_select_policy_hub() defines the data that should be collected for a policy hub.
Controlling which variables and classes should be collected by an Enterprise hub is done primarily with a list of regular expressions matching promise meta tags for either inclusion or exclusion. cf-hub collects variables that are not prefixed with an underscore (_) or that have meta tags matching metatags_include, as long as they do not have any meta tags matching metatags_exclude and do not have a handle matching promise_handle_exclude.
cf-hub collects namespace-scoped (global) classes that have any meta tags matching metatags_include and do not have any meta tags matching metatags_exclude.
Instead of modifying the list of regular expressions to control collection, we recommend that you leverage the defaults provided by the MPF (Masterfiles Policy Framework). The MPF includes inventory and report in metatags_include, noreport in metatags_exclude, and noreport_.* in promise_handle_exclude.
If it's desirable for classes and variables to be available in the specialized inventory subsystem, they should be tagged with inventory and given an additional attribute_name= tag, as described in the custom inventory example.
cf-hub collects information resulting from all other promise types (except reports and defaults, which cf-hub does not collect). This can be further restricted by specifying promise_handle_include or promise_handle_exclude.
Collection of measurements taken by cf-monitord is controlled using the monitoring_include and monitoring_exclude report_data_select body attributes.
Limitations:
There are various limitations on the size of information that is collected into central reporting. Data that is too large to be reported will be truncated, and a verbose-level log message will be generated by cf-agent. Some notable limitations are listed below.
- string variables are limited to 1024 bytes
- lists are limited to 1024 bytes of serialized data
- data variables are limited to 1024 bytes of serialized data
- meta tags are limited to 1024 bytes of serialized output
- log messages are truncated to 400 bytes
Please note that these limits may be lower in practice due to internal encoding.
Users cannot configure which data is stored to disk. For example, you cannot prevent the enterprise agent from logging to promise_log.jsonl.
For information on accessing reported information please see the Reporting UI guide.
Reporting Architecture
The reporting architecture of CFEngine Enterprise uses two software components from the CFEngine Enterprise hub package.
cf-hub
Like all CFEngine components, cf-hub is located in /var/cfengine/bin. It is a daemon process that runs in the background, and is started by cf-agent and from the init scripts.
cf-hub wakes up every 5 minutes and connects to the cf-serverd of each host to download new data.
To collect reports from any host manually, run the following:
$ /var/cfengine/bin/cf-hub -H <host IP>
Add -v to run in verbose mode to diagnose connectivity issues and trace the data collected.
Delta (differential) reporting, the default mode, collects data that has changed since the last collection. Rebase (full) reporting collects everything. You can choose full collection by adding -q rebase (for backwards compatibility, also available as -q full).
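For example, a full collection from a single host with verbose output might look like this (the IP address is a placeholder):
# Force a full (rebase) collection from one host and trace what is collected
/var/cfengine/bin/cf-hub -H 192.0.2.10 -v -q rebase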
Apache
REST over HTTP is provided by the Apache HTTP server, which also hosts the Mission Portal. The httpd process is started through CFEngine policy and the init scripts and listens on ports 80 and 443 (HTTP and HTTPS).
Apache is part of the CFEngine Enterprise installation in /var/cfengine/httpd. A local cfapache user is created with privileges to run cf-runagent.
SQL Queries Using the Enterprise API
The CFEngine Enterprise Hub collects information about the environment in a centralized database. Data is collected every 5 minutes from all bootstrapped hosts. This data can be accessed through the Enterprise Reporting API.
Through the API, you can run CFEngine Enterprise reports with SQL queries. The API can create the following report queries:
- Synchronous query: Issue a query and wait for the table to be sent back with the response.
- Asynchronous query: A query is issued and an immediate response with an ID is sent so that you can check the query later to download the report.
- Subscribed query: Specify a query to be run on a schedule and have the result emailed to someone.
Synchronous Queries
Issuing a synchronous query is the most straightforward way of running an SQL query. We simply issue the query and wait for a result to come back.
Request:
curl -k --user admin:admin https://test.cfengine.com/api/query -X POST -d
{
"query": "SELECT ..."
}
Response:
{
"meta": {
"page": 1,
"count": 1,
"total": 1,
"timestamp": 1351003514
},
"data": [
{
"query": "SELECT ...",
"header": [
"Column 1",
"Column 2"
],
"rowCount": 3,
"rows": [
],
"cached": false,
"sortDescending": false
}
]
}
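As a concrete sketch, assuming the hub is reachable at hub.example.com and admin credentials are valid, a synchronous query against the Hosts table (used in later examples) could look like this:
# Synchronous query: host names and IP addresses of all hosts
curl -k --user admin:admin https://hub.example.com/api/query -X POST -d '
{
  "query": "SELECT Hosts.HostName, Hosts.IPAddress FROM Hosts"
}'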
Asynchronous Queries
Because some queries can take some time to compute, you can fire off a query and check the status of it later. This is useful for dumping a lot of data into CSV files for example. The sequence consists of three steps:
- Issue the asynchronous query and get a job id.
- Check the processing status using the id.
- When the query is completed, get a download link using the id.
Issuing the query
Request:
curl -k --user admin:admin https://test.cfengine.com/api/query/async -X POST -d
{
"query": "SELECT Hosts.HostName, Hosts.IPAddress FROM Hosts JOIN Contexts ON Hosts.Hostkey = Contexts.HostKey WHERE Contexts.ContextName = 'ubuntu'"
}
Response:
{
"meta": {
"page": 1,
"count": 1,
"total": 1,
"timestamp": 1351003514
},
"data": [
{
"id": "32ecb0a73e735477cc9b1ea8641e5552",
"query": "SELECT ..."
}
]
}
Checking the status
Request:
curl -k --user admin:admin https://test.cfengine.com/api/query/async/:id
Response:
{
"meta": {
"page": 1,
"count": 1,
"total": 1,
"timestamp": 1351003514
},
"data": [
{
"id": "32ecb0a73e735477cc9b1ea8641e5552",
"percentageComplete": 42,
]
}
Getting the completed report
This is the same API call as checking the status. Eventually, the percentageComplete field will reach 100 and a link to the completed report will be available for downloading.
Request:
curl -k --user admin:admin https://test.cfengine.com/api/query/async/:id
Response:
{
"meta": {
"page": 1,
"count": 1,
"total": 1,
"timestamp": 1351003514
},
"data": [
{
"id": "32ecb0a73e735477cc9b1ea8641e5552",
"percentageComplete": 100,
"href": "https://test.cfengine.com/api/static/32ecb0a73e735477cc9b1ea8641e5552.csv"
}
]
}
Subscribed Queries
Subscribed queries happen in the context of a user. Any user can create a query on a schedule and have it emailed to someone.
Request:
curl -k --user admin:admin https://test.cfengine.com/api/user/name/
subscription/query/file-changes-report -X PUT -d
{
"to": "email@domain.com",
"query": "SELECT ...",
"schedule": "Monday.Hr23.Min59",
"title": "Report title"
"description": "Text that will be included in email"
"outputTypes": [ "pdf" ]
}
Response:
204 No Content
Reporting UI
CFEngine collects a large amount of data. To inspect it, you can run and schedule pre-defined reports or use the query builder for your own custom reports. You can save these queries for later use, and schedule reports for specified times.
If you are familiar with SQL syntax, you can input your query into the interface directly. Make sure to take a look at the database schema. Please note: manual entries in the query field at the bottom of the query builder will invalidate all field selections and filters above, and vice-versa.
You can share the report with other users, either by using the "Save" button, or by base64-encoding the report query into a URL. You can also provide an optional title by adding a title parameter to the URL, like this:
HUB_URL="https://hub"
API="/index.php/advancedreports/#/report/run?sql="
SQL_QUERY="SELECT Hosts.HostName AS 'Host Name' FROM Hosts"
REPORT_TITLE="Example Report"
LINK="${HUB_URL}${API}$(echo "${SQL_QUERY}" | base64)&title=$(/usr/bin/urlencode "${REPORT_TITLE}")"
echo "${LINK}"
https://hub/index.php/advancedreports/#/report/run?sql=U0VMRUNUIEhvc3RzLkhvc3ROYW1lIEFTICdIb3N0IE5hbWUnIEZST00gSG9zdHMK&title=Example%20Report
You can query fewer hosts with the help of filters above the displayed table. These filters are based on the same categorization you can find in the other apps.
You can also filter on the type of promise: user defined, system defined, or all.
See also:
Query Builder
Users not familiar with SQL syntax can easily create their own custom reports in this interface. Please note that the query builder can be extended with your custom data.
- Tables - Select the data tables you want to include in your report first.
- Fields - Define your table columns based on your selection above.
- Filters - Filter your results. Remember that unless you filter, you may be querying large data sets, so think about what you absolutely need in your report.
- Group - Group your results. May be expensive with large data sets.
- Sort - Sort your results. May be expensive with large data sets.
- Limit - Limit the number of entries in your report. This is a recommended practice for testing your query, and even in production it may be helpful if you don't need to see every entry.
- Show me the query - View and edit the SQL query directly. Please note that editing the query directly here will invalidate your choices in the query builder interface, and changing your selections there will override your SQL query.
Ensure the report collection is working
The reporting bundle must be called from promises.cf. For example, the following defines the attribute Role, which is set to database_server. You need to add it to the top-level bundlesequence in promises.cf or in a bundle that it calls.
bundle agent myreport
{
  vars:
    "myrole" string => "database_server",
      meta => { "inventory", "attribute_name=Role" };
}
Note the meta tag inventory.
The hub must be able to collect the reports from the client. TCP port 5308 must be open and, because 3.6 uses TLS, should not be proxied or otherwise intercepted. Note that bootstrapping and other standalone client operations go from the client to the server, so the ability to bootstrap and copy policies from the server doesn't necessarily mean the reverse connection will work.
Ensure that variables and classes tagged as inventory or report are not filtered by controls/cf_serverd.cf in your infrastructure. The standard configuration from the stock CFEngine packages allows them and should work.
Note: The CFEngine report collection model accounts for long periods of time when the hub is unable to collect data from remote agents. This model preserves recorded data until it can be collected. Data (promise outcomes, etc.) recorded by the agent during normal agent runs is stored locally until the cf-hub process collects it. At collection time the local data stored on the client is cleaned up, and only the last hour's worth of data remains on the client. It is important to understand that as the time since the last hub collection grows, or as the number of clients that cannot be collected from grows, the amount of data to transfer and store in the central database also grows. A large number of clients that have not been collected from and then become available at once can cause increased load on the hub collector and affect its performance until it has been able to collect from all hosts.
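A quick way to verify the TCP port 5308 requirement mentioned above is a simple probe from the hub to a client; this is only a sketch, the hostname is a placeholder and netcat must be installed:
# Check that cf-serverd on the client is reachable from the hub on port 5308
nc -zv -w 5 client.example.com 5308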
Define a New Single Table Report
- In Mission Portal select the Report application icon on the left hand side of the screen.
- This will bring you to the Report builder screen.
- The default for what hosts to report on is All hosts. The hosts can be filtered under the Filters section at the top of the page.
- For this tutorial leave it as All hosts.
- Set which tables' data we want reports for.
- For this tutorial select Hosts.
- Select the columns from the Hosts table for the report.
- For this tutorial click the Select all link below the column labels.
- Leave Filters, Sort, and Limit at the default settings.
- Click the orange Run button in the bottom right hand corner.
Check Report Results
- The report generated will show each of the selected columns across the report table's header row.
- In this tutorial the columns being reported back should be: Host key, Last report time, Host name, IP address, First report-time.
- Each row will contain the information for an individual data record, in this case one row for each host.
- Some of the cells in the report may provide links to drill down into more detailed information (e.g. Host name will provide a link to a Host information page).
- It is possible to also export the report to a file.
- Click the orange Export button.
- You will then see a Report Download dialog.
- Report type can be either csv or pdf format.
- Leave other fields at the default values.
- If the server's mail configuration is working properly, it is possible to email the report by checking the Send in email box.
- Click OK to download or email the csv or pdf version of the report.
- Once the report is generated it will be available for download or will be emailed.
Inventory Management
Inventory allows you to define the set of hosts to report on.
The main Inventory screen shows the current set of hosts, together with relevant information such as operating system type, kernel and memory size.
To begin filtering, first select the Filters drop down, and then select an attribute to filter on (e.g. OS type = linux).
After applying the filter, it may be convenient to add the attribute as one of the table columns.
Changing the filter, or adding additional attributes for filtering, is just as easy.
We can see here that there are no Windows machines bootstrapped to this hub.
Client Initiated Reporting / Call collect
Pull collect is the default mode of reporting. In this mode, the reporting hub connects out to hosts to pull reporting data.
In call collect mode, clients initiate the reporting connection, by "calling" the hub first. The hub keeps the connection open and collects the reports when it's ready. Call collect is especially useful in environments where agents cannot be reached from the hub. This could be because of NAT (routes) or firewall rules.
Call collect and client initiated reporting are the same thing; both terms refer to the same functionality.
How do you enable call collect?
The easiest way to enable call collect is via the augments file. Modify /var/cfengine/masterfiles/def.json on the hub:
{
"classes": {
"client_initiated_reporting_enabled": [ "any" ]
},
"vars": {
"mpf_access_rules_collect_calls_admit_ips": [ "0.0.0.0/0" ],
"control_hub_exclude_hosts": [ "0.0.0.0/0" ]
}
}
Client initiated reporting will be enabled on all hosts, since all hosts have the any class set.
mpf_access_rules_collect_calls_admit_ips controls what network range clients are allowed to connect from. This should be customized to your environment.
control_hub_exclude_hosts will exclude the IPs in the given network range(s) from pull collection. This network range should usually match the one above.
Trying to use both pull and call collect for the same host can cause problems and unnecessary load on the hub.
See also: call_collect_interval, collect_window
When are hosts collected from? How is collection affected by hub interval?
Call collect hosts are handled as soon as possible.
Agents initiate connections according to their own schedule, and the hub handles them as quickly as possible.
There is a separate call collect thread which waits for incoming connections, and queues them.
Whenever a thread in the cf-hub thread pool is available, it will prioritize the call collect queue before the pull queue.
Neither the call collect thread nor the worker thread pool is affected by the hub reporting schedule (hub_schedule).
How can I see which hosts are call collected?
This is recorded in the PostgreSQL database on the hub, and can be queried from command line:
/var/cfengine/bin/psql -d cfdb -c "SELECT * FROM __hosts WHERE iscallcollected='t'";
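Building on the same table, the following sketch counts how many hosts use each collection mode, using the column shown in the query above:
# Count hosts by collection mode (t = call collect, f = pull collect)
/var/cfengine/bin/psql -d cfdb -c "SELECT iscallcollected, COUNT(*) FROM __hosts GROUP BY iscallcollected;"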
Are call collect hosts counted for enterprise licenses?
Yes, call collect hosts consume a license.
If you have too many hosts (pull + call) for your license, cf-hub will start emitting errors and skip some hosts. cf-hub prioritizes call collect hosts, and will only skip pull collect hosts when over license.
Note that in other parts of the product, like Mission Portal, there is no distinction between call collect and pull collect hosts.
How do you disable call collect?
Update the def.json file with the new classes and appropriate network ranges.
For hosts which are already using call collect, but shouldn't, the easiest approach is to generate new keys, bootstrap again, and then remove the old host in Mission Portal or via API.
Unfortunately, there is currently no easy way to make a host switch back to pull collection.
Monitoring
Monitoring allows you to get an overview of your hosts over time.
If multiple hosts are selected in the menu on the left, then you can select one of three key measurements that is then displayed for all hosts:
- Load average
- Disk free (in %)
- CPU(ALL) (in %)
You can reduce the number of graphs by selecting a subset of hosts from the menu on the left. If only a single host is selected, then a number of graphs for various measurements will be displayed for this host. Exactly which measurements are reported depends on how cf-monitord is configured and extended via measurements promises.
Clicking on an individual graph allows you to select different time spans for which monitoring data will be displayed.
If you don't see any data, make sure that:
- cf-monitord is running on your hosts.
- cf-hub is allowed to collect the monitoring data from your hosts. See Configuring Enterprise Measurement and Monitoring Collection in the Masterfiles Policy Framework.
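As a minimal first check on a host that shows no data (assuming a standard Linux install with pgrep available):
# Verify the monitoring daemon is running on the host
pgrep -l cf-monitord || echo "cf-monitord is not running"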
Enterprise API
The CFEngine Enterprise API allows HTTP clients to interact with the CFEngine Enterprise Hub. Typically this is also the policy server.
The Enterprise API is a REST API, but a central part of interacting with the API uses SQL. With the simplicity of REST and the flexibility of SQL, users can craft custom reports about systems of arbitrary scale, mining a wealth of data residing on globally distributed CFEngine Database Servers.
See also the Enterprise API Examples and the Enterprise API Reference.
Best Practices
Version Control and Configuration Policy
CFEngine users version their policies. It's a reasonable, easy thing
to do: you just put /var/cfengine/masterfiles
under version control
and... you're done?
What do you think? How do you version your own infrastructure?
Problem statement
It turns out everyone likes convenience and writing the versioning machinery is hard. So for CFEngine Enterprise 3.6 we set out to provide version control integration with Git out of the box, disabled by default. This allows users to use branches for separate hubs (which enables a policy release pipeline).
Release pipeline
A build and release pipeline is how software is typically delivered to production through testing stages. In the case of CFEngine, policies are the software. Users have at least two stages, development and production, but typically the sequence has more stages including various forms of testing/QA and pre-production.
How to enable it
To enable masterfiles versioning, you have to plan a little bit. These are the steps:
Configure your repository
Use a remote Git repository, accessible via the git or https protocol, populated with the contents of masterfiles.
Using a remote repository
To use a remote repository, you must enter its address, login credentials and the branch you want to use in the Mission Portal VCS integration panel. To access it, click on "Settings" in the top-left menu of the Mission Portal screen, and then select "Version control repository". This screen by default contains the settings for using the built-in local repository.
Make sure your current masterfiles are in the chosen repository
This is critical. When you start auto-deploying policy, you will
overwrite your current /var/cfengine/masterfiles
. So take the
current contents thereof and make sure they are in the Git repository
you chose in the previous step.
For example, if you create a new repository in GitHub by following the
instructions from https://help.github.com/articles/create-a-repo, you
can add the contents of masterfiles
to it with the following
commands (assuming you are already in your local repository checkout):
cp -r /var/cfengine/masterfiles/* .
git add *
git commit -m 'Initial masterfiles check in'
git push origin master
Enable VCS deployments in the versioned update.cf
In the file update_def.cf
under a version-specific subdirectory of
controls/
in your version-controlled masterfiles, change
#"cfengine_internal_masterfiles_update" expression => "enterprise.!(cfengine_3_4|cfengine_3_5)";
"cfengine_internal_masterfiles_update" expression => "!any";
to
"cfengine_internal_masterfiles_update" expression => "enterprise.!(cfengine_3_4|cfengine_3_5)";
#"cfengine_internal_masterfiles_update" expression => "!any";
This is simply commenting out one line and uncommenting another.
Remember that you need to commit and push these changes to the repository you chose in the previous step, so that they are picked up when you deploy from the git repository. In your checked-out masterfiles git repository, these commands should normally do the trick:
git add controls/
git commit -m 'Enabled auto-policy updates'
git push origin master
Now you need to do the first-time deployment, whereupon this new update.cf and the rest of your versioned masterfiles will overwrite /var/cfengine/masterfiles. We made that easy too, using standard CFEngine tools. Exit the cfapache account and run the following command as root on your hub:
cf-agent -Dcfengine_internal_masterfiles_update -f update.cf
Easy, right? You're done; from now on, every time update.cf runs (by default, every 5 minutes) it will check out the repository and branch you configured in the Mission Portal VCS integration panel.
Please note that all the work is done as the cfapache user, except the very last step of writing into /var/cfengine/masterfiles.
How it works
The code is fairly simple and can even be modified if you have special
requirements (e.g. Subversion integration). But out of the box there
are three important components. All the scripts below are stored under
/var/cfengine/httpd/htdocs/api/dc-scripts/
in your CFEngine
Enterprise hub.
common.sh
The script common.sh
is loaded by the deployment script and does two
things. First, it redirects all output to
/var/cfengine/outputs/dc-scripts.log
. So if you have problems,
check there first.
Second, the script sources /opt/cfengine/dc-scripts/params.sh
where
the essential parameters like repository address and branch live.
That file is written out by the Mission Portal VCS integration panel,
so it's the connection between the Mission Portal GUI and the
underlying scripts.
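If a deployment seems to do nothing, inspecting that log and the generated parameters file is usually the quickest check (both paths are the ones mentioned above):
# Look at the most recent deployment output and the parameters written by Mission Portal
tail -n 50 /var/cfengine/outputs/dc-scripts.log
cat /opt/cfengine/dc-scripts/params.sh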
masterfiles-stage.sh
This script is called to deploy the masterfiles from VCS to
/var/cfengine/masterfiles
. It's fairly complicated and does not
depend on CFEngine itself by design; for instance it uses rsync
to
deploy the policies. You may want to review and even modify it, for
example choosing to reject deployments that are too different from the
current version (which could indicate a catastrophic failure or
misconfiguration).
This script also validates the policies using cf-promises -T
. That
command looks in a directory and ensures that promises.cf
in the
directory is valid. If it's not, an error will go in the log file and
the script exits.
NOTE: this means that clients will never get policies that are invalid according to the hub.
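You can run a similar validation by hand against a checked-out copy before pushing; a minimal sketch, where the checkout path is a placeholder:
# Validate the policy entry point of a masterfiles checkout; a failure here means
# the deployment script would refuse to deploy it
/var/cfengine/bin/cf-promises -f /path/to/masterfiles-checkout/promises.cf && echo "policy is valid"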
Policy changes
If you want to make manual changes to your policies, simply make those
changes in a checkout of your masterfiles repository, commit and push
the changes. The next time update.cf
runs, your changes will be
checked out and in minutes distributed through your entire
infrastructure.
Benefits
To conclude, let's summarize the benefits of versioning your masterfiles using the built-in facilities in CFEngine Enterprise.
- easy to use compared to home-grown VCS integration
- supports Git out of the box and, with some work, can support others like Subversion, Mercurial, and CVS.
- tested, reliable, and built-in
- supports any repository and branch per hub
- your policies are validated before deployment
- integration happens through shell scripts and update.cf, not C code or special policies
Scalability
When running CFEngine Enterprise in a large-scale IT environment with many thousands of hosts, certain issues arise that require different approaches compared with smaller installations.
With CFEngine 3.6, significant testing was performed to identify the issues surrounding scalability and to determine best practices in large-scale installations of CFEngine.
Moving PostgreSQL to Separate Hard Drive
Moving the PostgreSQL database to a physical drive separate from the other CFEngine components can improve the stability of large-scale installations, particularly when a solid-state drive (SSD) hosts the PostgreSQL database.
Data access involves a large number of random IO operations on small chunks of data. An SSD may give the best performance because it is designed for exactly this kind of workload.
Important: The PostgreSQL data files are in /var/cfengine/state/pg/ by default. Before moving the mount point, please make sure that all CFEngine processes (including PostgreSQL) are stopped and the existing data files are copied to the new location.
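A rough sketch of such a move, assuming the new SSD is mounted at /mnt/ssd and the service name matches your installation:
# Stop all CFEngine processes, including the bundled PostgreSQL
service cfengine3 stop
# Copy the data files, preserving ownership and permissions
rsync -a /var/cfengine/state/pg/ /mnt/ssd/pg/
# Keep the old directory as a fallback and mount the new location in its place
mv /var/cfengine/state/pg /var/cfengine/state/pg.old
mkdir /var/cfengine/state/pg
mount --bind /mnt/ssd/pg /var/cfengine/state/pg
service cfengine3 start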
Setting the splaytime
The splaytime
tells CFEngine hosts the base interval over which they will communicate with the policy server
, which they then use to "splay" or hash their own runtimes.
Thus when splaytime
is set to 4, 1000 hosts will hash their run attempts evenly over 4 minutes, and each minute will see about 250 hosts make a run attempt. In effect, the hosts will attempt to communicate with the policy server and run their own policies in predictable "waves." This limits the number of concurrent connections and overall system load at any given moment.