Guest blog post: Don't use your distro's package manager

Posted by Jeff Carlson
January 23, 2023

I have stopped using my Linux distro’s package manager, and you should, too. Maybe I should clarify that. I don’t install software with my distro’s package manager any more. I still upgrade my system.

A few different factors influenced me. Top among these is something required in certain industries called a change advisory board or committee. This requirement says that changes to production computers have to be reviewed and approved by all stakeholders in those computers’ operations.

Second among these is the concept of reproducible builds. Every so often, I get a new computer or a new hard drive. This always means installing a set of software packages and configuring them to my liking.

And third was my commitment to configuring my machines with CFEngine. This includes installing software. Indeed, CFEngine has a powerful packages promise type that works across multiple distros (by identifying the underlying package system and calling that package manager behind the scenes).

Public companies require a change management process

Let’s look a little deeper into the change management process. My first introduction to this was working for a company that had recently become publicly traded, where we instituted a change advisory board meeting every day. As a member of the technical operations team, I had to bring my case for any change I needed to make before the board during the regularly scheduled meeting. This was generally a simple process. I would say that this or that machine needs its packages updated, or that we need to install this service. The impact will be the following. The risk is such and such. The rollback procedure is like this. It will be scheduled for this time on this day. Then everybody at the meeting says it’s OK, and I go do the work.

Now, there was one situation that never sat quite right with me. I don’t want to out anybody for not following procedures correctly because the impact here was minimal, but here we go.

The rule was, don’t install software on any production machine without an approved change ticket. But sometimes you log into a machine and something simple, which you need to diagnose a problem, isn’t installed. Something like traceroute. So I would ask, is it ok if I install traceroute on this production server? My boss said yes. Like I said, it’s a little thing, but it wasn’t documented.

I decided there’s a better way to handle this scenario, which I can easily implement with CFEngine. Create a list of software you want installed on every computer. That’s easy enough. Using CFEngine means we can guarantee it actually gets installed. Now when we want to add another package to that list, we have a short vote in the change management meeting, and add it. CFEngine will do the rest. That package will be installed on every production machine.

Now we create what I call a baseline: packages that are approved to be installed on every machine and that can be lumped into one bundle because they don’t require any configuration.

Reproducibility

I use Fedora, btw. I also use Debian, btw. Sometimes I use Ubuntu, btw. You get the picture.

Last year I got a new hard drive for my laptop. I set about installing a new operating system on it, and I decided I was going to manage this computer exclusively with CFEngine. This is something I read Mark Burgess did in his book on CFEngine version 2. That means I will install all my software with CFEngine, which is the point of this article.

I want to point out that distros do not always give the same package the same name. For example, Red Hat family distros package the ag command as the_silver_searcher, but Debian family distros package it as silversearcher-ag. This requires some manipulation of the package list, which I will demonstrate further on.
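
To make the naming difference concrete, here is a minimal sketch that picks the right name using CFEngine’s built-in debian and redhat classes. It is a different technique from the per-distro package lists built later in this article, and the variable name ag_package is made up for the illustration. It only reports the name, so it should be safe to run anywhere, for example with cf-agent -KIf ./ag_name.cf (the file name is hypothetical, too).

```cfengine3
bundle agent main
{
  vars:
    # "debian" and "redhat" are hard classes set by CFEngine itself;
    # "ag_package" is a name invented for this example.
    debian::
      "ag_package" string => "silversearcher-ag";

    redhat::
      "ag_package" string => "the_silver_searcher";

  reports:
    # Only report on hosts where one of the names above was selected.
    debian|redhat::
      "The ag command is packaged as ${ag_package} on this host";
}
```

The same variable could just as well serve as the promiser of a packages promise.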

Introducing the baseline bundle

Let’s start with a new CFEngine policy file. Of course, I’m using Emacs. I named my file baseline.cf. If you followed my previous article, Quick-start guide to using Emacs for CFEngine, you know how I create a new policy bundle using the YASnippet Emacs extension to insert various templates into my file. Let’s start with the outline.

bundle agent baseline
{
  meta:
  vars:
  defaults:
  classes:
  users:
  files:
  packages:
  guest_environments:
  methods:
  processes:
  services:
  commands:
  storage:
  databases:
  reports:
}

This is going to be a very simple bundle. We’re going to add a few vars promises, and a packages promise. Add the following under the vars: heading.

vars:
  "lists" -> { "def.json" }
    slist => "@{def.baseline}";
  "pkg[${lists}]"
    if => fileexists("${lists}"),
    slist => readstringlist(
      "${lists}", "\s*#[^\n]*$", "\n", "inf", "inf");
  "packages"
    slist => getvalues("pkg");

For those who may be new to CFEngine, I will walk through all of these.

The first variable is a string list called lists. This list will be retrieved from the augments file, def.json. This is indicated in the promisee definition. I always indicate that the values of certain variables will be found in def.json in the promisee. That way, when I am re-using my policy files, I remind myself where I need to define something. I will go over the contents needed in def.json below.

The second variable is an associative array variable called pkg holding a list of strings (it’s actually a collection of individual variables which CFEngine can treat as a cohesive unit like data type vars for some operations). The keys used in this structure will be the file names coming from lists. The contents of each of the string lists will be read from the corresponding files. And there’s an if property to prevent any error occurring if one of those named files is not found.

The final variable is a string list of the actual packages to install called packages. I named it that to correspond to the packages promise.
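
The way pkg and packages fit together can be tried out in isolation. The following is a minimal standalone sketch, assuming hard-coded lists in place of the files read from disk; the keys default and debian_11 and the package names are placeholders. It only prints what would be installed.

```cfengine3
bundle agent main
{
  vars:
      # Stand-ins for the lists that readstringlist() would read from
      # the files named in def.baseline.
      "pkg[default]"   slist => { "curl", "jq", "traceroute" };
      "pkg[debian_11]" slist => { "silversearcher-ag" };

      # getvalues() flattens every list stored under pkg[...] into a
      # single list.
      "packages" slist => getvalues("pkg");

  reports:
      # One report per list element, thanks to implicit list iteration.
      "Would install: ${packages}";
}
```

The order of the resulting list should not be relied on, which is fine here because each package is promised independently. Swapping the reports promise for a packages promise is what turns this listing into actual installs.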

And speaking of the packages promise, let’s add that next.

packages:
  "${packages}"
    comment => "Ensure ${this.promiser} is installed",
    handle  => "${this.bundle}_packages_${this.promiser}",
    policy  => "present";

There is only one packages promise statement, thanks to CFEngine’s automatic looping over list values. The first property in this promise is the comment. This will be printed if cf-agent is run in verbose mode. The second property is the handle, which is also printed in verbose mode. I always include the bundle name and the current promiser, which in this case will be the package name currently being processed, in the handle. This allows me to identify the exact location of any promise iterations that cause problems.

The final property is the policy, which is set to “present.” This relies on the package module implementation of packages promises [1], available in the masterfiles promise library, which doesn’t require declaring which package manager to use. apt will be used on Ubuntu and zypper will be used on SUSE, for example.
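
For contrast, a packages promise can also name its package module explicitly. The sketch below is a standalone file that assumes the MPF standard library is installed under sys.libdir and that it provides a package_module body named apt_get; the bundle name explicit_module_example is hypothetical. It installs traceroute only on Debian family hosts and needs to run as root.

```cfengine3
body common control
{
      # Assumption: the MPF standard library lives under sys.libdir.
      inputs         => { "${sys.libdir}/stdlib.cf" };
      bundlesequence => { "explicit_module_example" };
}

bundle agent explicit_module_example
{
  packages:
    debian::
      "traceroute"
        policy         => "present",
        # Explicitly select the apt_get package module body
        # instead of relying on the platform default.
        package_module => apt_get;
}
```

In the baseline bundle this attribute is left out, so the platform default chosen by the MPF applies.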

This is all we have to do for the promise bundle. All the actual configuration will be handled in def.json and text lists referenced therein.

{
  "inputs": [
    "services/baseline.cf"
  ],
  "vars": {
    "control_common_bundlesequence_end": [
      "baseline"
    ],
    "baseline": [
      "${sys.inputdir}/baseline.default.dat",
      "${sys.inputdir}/baseline.${sys.flavor}.dat",
      "${sys.inputdir}/baseline.${sys.os}.dat",
      "${sys.workdir}/baseline.local.dat"
    ]
  }
}

If you are already familiar with def.json and its contents, yours may include far more than what is presented above. This is just a simple example to support the bundle we have created.

Note: The above snippet uses the original-style syntax. Since 3.18.0, CFEngine supports a more expressive structure which allows specification of comments, metadata, and targeting variables in any bundle of any namespace. This mirrors the above:

{
  "inputs": [
    "services/baseline.cf"
  ],
  "variables": {
    "default:def.control_common_bundlesequence_end": {
      "value": [
        "baseline"
      ],
      "comment": "Custom policies to run at the end of the standard MPF bundlesequence"
    },
    "default:def.baseline": {
      "value": [
        "${sys.inputdir}/baseline.default.dat",
        "${sys.inputdir}/baseline.${sys.flavor}.dat",
        "${sys.inputdir}/baseline.${sys.os}.dat",
        "${sys.workdir}/baseline.local.dat"
      ],
      "comment": "Data files used to specify baseline configurations"
    }
  }
}

Adding the policy file name to inputs means I can add the file to the inputs attribute of body common control without editing promises.cf.

Likewise, adding the name of the bundle to the control_common_bundlesequence_end variable means I can add this bundle to the bundlesequence without editing promises.cf.

I consider the ability to modify these properties without editing a file shipped with the Masterfiles Policy Framework (MPF) important because if I don’t have to edit any of these files, I won’t introduce my own bugs.

Finally, there is a list of file names which will be the value of the variable def.baseline. Notice I have included a few variables in the file names, coming from sys. These variables are determined in CFEngine’s C code and available before def.json is read. This is only true of sys (and const) variables, so they are the only ones which can be included in def.json.

The first three of these files will be found in the sys.inputdir directory, which is usually /var/cfengine/inputs. The last one, baseline.local.dat, is looked up in sys.workdir, which is usually /var/cfengine.

I have named all of these files ending with “.dat” because when cf-agent processes update.cf, it will only copy files with certain names to the target inputs directory. File names must match one of the patterns reported by the following command.

cf-promises -f update.cf --show-vars=update_def.input_name_patterns

This results in the following file patterns.

  • *.awk
  • *.cf
  • *.conf
  • *.csv
  • *.dat
  • *.json
  • *.mustache
  • *.pl
  • *.ps1
  • *.py
  • *.rb
  • *.sed
  • *.sh
  • *.txt
  • *.yaml
  • cf_promises_release_id

I thought *.dat seemed logical. I think *.txt would have been equally acceptable.

My baseline.default.dat file contains the list of packages that I always want installed. If change management approves a new package, it should get added here. As a systems administrator, I find it quite frustrating when the following are not installed.

bash-completion
curl
gzip
iperf3
iproute
iputils
jq
less
lynx
man-db
mg
mlocate
more
moreutils
nmap
pcre-tools
procps-ng
screen
strace
sysstat
tar
tcpdump
traceroute
tree
wget
which
whois

The variable sys.flavor represents the Linux distro, if it is Linux. Otherwise sys.os is the operating system, which would be Linux, Solaris, or perhaps AIX. Recall the if property in the vars promise: it allows the file for either of these to be absent.
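
To find out which file names a given host will look for, the values can simply be printed. This is a minimal sketch to run as a standalone file, for example with cf-agent -KIf ./show_flavor.cf (a hypothetical file name); nothing in it is specific to the baseline bundle apart from the file name patterns copied from def.json.

```cfengine3
bundle agent main
{
  reports:
      # The raw values CFEngine discovered for this host.
      "sys.flavor = ${sys.flavor}";
      "sys.os     = ${sys.os}";

      # The per-distro and per-OS list files this host would read,
      # following the patterns used in def.json.
      "Flavor list: ${sys.inputdir}/baseline.${sys.flavor}.dat";
      "OS list:     ${sys.inputdir}/baseline.${sys.os}.dat";
}
```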

Here is an example of a baseline.debian_11.dat file.

apt-transport-https
bind9-dnsutils
bind9-host
mailutils
needrestart
silversearcher-ag

And the following is an example of a baseline.centos_7.dat file.

bind-utils
mailx
the_silver_searcher
yum-plugin-copr

I use my baseline.local.dat to contain packages which I want on my desktop, but the name provides a convenient location to store any package lists for a specific host. Don’t put this file in masterfiles; save it in /var/cfengine/.

alacritty
emacs
firefox
rofi
thunderbird

Conclusion

Using CFEngine on my desktops or laptops, I can keep my machines installed with all my favorite software following an Infrastructure-as-Code model. Once CFEngine is set up, any program can be installed just by adding the package name to one of the data files.

Additionally, this method works well in a corporate environment where changes to machines are managed carefully. The same configuration works equally well there and lets the admin team know that the programs they require are installed on every server.


  1. The package module based implementation of packages promises was introduced in CFEngine 3.7.0. A package module can be implemented in any language. Package modules communicate with cf-agent via the package module API. This implementation is the preferred syntax for policy writing, and usage documentation is found under the packages promise type.