Change in behavior: The arglist attribute now preserves spaces

Posted by Lars Erik Wik
January 17, 2024

When executing commands in a shell, the program and its arguments are typically separated by spaces. This command reads the content of two files and prints it out (concatenating it):

command
cat one.txt two.txt

In the example above, cat is the command / program, while one.txt is the first argument, and two.txt is the second argument. This is great because it makes commands really easy to read and type. However, there is one obvious drawback: when a space has a special meaning (separating arguments), what do you do when you actually need a space?

For example, you might want to print files with spaces in their filenames. The most common approach is to use quotes, which also have a special meaning:

command
cat 'file one.txt' 'file two.txt'

Now, the first argument is what’s between the first pair of quotes (file one.txt) and the second argument is predictably what’s between the second pair of quotes (file two.txt) [1]. In CFEngine, commands promises are used to run commands like these, usually in one of three ways:

example.cf
bundle agent main
{
  commands:
    # First way
    "command with arguments";
    # Second way
    "command"
      args => "with arguments";
    # Third way
    "command"
      arglist => { "with", "arguments" };
}

In this blog post, we’ll take a look at a recent bugfix for the arglist attribute. Because of the change in behavior required for this fix, some users may have to review and update their policy.

The original behavior of the arglist attribute in the commands promise must clearly have been intended to allow for arguments with embedded spaces without having to quote or escape them, as you would normally have to do in a shell. The documentation seemed to support this claim as well:

As with args, it is convenient to separate command and arguments. With arglist you can use an slist directly instead of having to provide a single string as with args. That’s particularly useful when there are embedded spaces and quotes in your arguments, but also when you want to get them directly from an slist without going through join() or other functions.

However, this has never been the case, which has led numerous community members to (rightfully) report confusion in our bug tracker, with tickets dating back to 2017. See CFE-2724, CFE-2869, and CFE-4253. It is about time we did something about it.

As an example, the following policy:

example.cf
bundle agent __main__
{
  files:
    "/tmp/test.py"  # A program that echoes each argument
      content => "import sys; print(*(arg for arg in sys.argv), sep='\n')";
  commands:
    "/usr/bin/python3 /tmp/test.py"
      arglist => { "one", "two three" };
}

would produce the following output:

command
cf-agent -Kf ~/test.cf
output
  notice: Q: "...in/python3 /tmp": /tmp/test.py
Q: "...in/python3 /tmp": one
Q: "...in/python3 /tmp": two
Q: "...in/python3 /tmp": three

while one would expect:

command
cf-agent -Kf ~/test.cf
output
  notice: Q: "...in/python3 /tmp": /tmp/test.py
Q: "...in/python3 /tmp": one
Q: "...in/python3 /tmp": two three
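The difference between the two outputs can be reproduced outside CFEngine. The following is a minimal Python sketch, not CFEngine's actual implementation: it assumes the old behavior is equivalent to joining the list and splitting it on whitespace again, and it uses an inline script as a stand-in for /tmp/test.py. The names ECHO_ARGS and run_with_args are made up for this illustration.

```python
import subprocess
import sys

# Stand-in for /tmp/test.py (assumption): print each argument on its own line.
ECHO_ARGS = "import sys\nfor arg in sys.argv[1:]:\n    print(arg)"


def run_with_args(args):
    """Run the echo script with the given argument vector and return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", ECHO_ARGS, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


arglist = ["one", "two three"]

# Modeled old behavior: the list is flattened and split on whitespace again.
old_style = run_with_args(" ".join(arglist).split())

# Expected behavior: each list element reaches the program as one argument.
new_style = run_with_args(arglist)

print(old_style)  # ['one', 'two', 'three']
print(new_style)  # ['one', 'two three']
```

The second call passes the list straight through as the argument vector, which is what one would expect an slist-based attribute to do.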

The next LTS release, 3.24, includes a fix for this bug: whitespace is now preserved in arguments given through the arglist attribute. The exception is when the useshell attribute is set to "useshell" or "powershell", because the shell itself interprets spaces as argument separators. Also, due to limitations in the Win32 API, the fix is not yet available on Windows.
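To see why a shell still needs quoting, consider what happens when a list of arguments is turned back into a single command line. The sketch below uses Python's shlex module, which follows POSIX shell word-splitting rules. It is an approximation for illustration only and does not model PowerShell or CFEngine itself.

```python
import shlex

arglist = ["one", "two three"]

# Joining with plain spaces loses the argument boundaries:
# a POSIX shell would see three words.
naive = " ".join(arglist)

# shlex.join quotes each element where needed, so the boundaries survive.
quoted = shlex.join(arglist)

print(shlex.split(naive))   # ['one', 'two', 'three']
print(quoted)               # one 'two three'
print(shlex.split(quoted))  # ['one', 'two three']
```

This is why quotes inside arglist entries remain meaningful when the promise runs through a shell.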

What to do

As a workaround, we have previously suggested the use of nested quoting as illustrated in the following example:

example.cf
bundle agent __main__
{
  files:
    "/tmp/test.py"  # A program that echoes each argument
      content => "import sys; print(*(arg for arg in sys.argv), sep='\n')";
  commands:
    "/usr/bin/python3 /tmp/test.py"
      arglist => { "one", "'two three'" };
}

The example above would produce the following output:

command
cf-agent -Kf ~/test.cf
output
  notice: Q: "...in/python3 /tmp": /tmp/test.py
Q: "...in/python3 /tmp": one
Q: "...in/python3 /tmp": two three

However, the bug fix is a breaking change for any use of this workaround, because the enclosing quotes are now part of the argument. In CFEngine 3.24, the example above produces the following output:

command
cf-agent -Kf ~/test.cf
output
  notice: Q: "...in/python3 /tmp": /tmp/test.py
Q: "...in/python3 /tmp": one
Q: "...in/python3 /tmp": 'two three'
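The effect on the workaround can be modeled in a few lines of Python. This is a hedged sketch rather than CFEngine code: old_model assumes the previous behavior resembled a quote-aware split of the joined list, new_model assumes 3.24 passes every entry through verbatim, and strip_enclosing_quotes is a hypothetical helper showing what the migration amounts to. It only handles the single quotes and backticks that appear in this post's examples.

```python
import shlex


def old_model(arglist):
    """Modeled pre-3.24 result: entries are joined, then split with quote handling."""
    return shlex.split(" ".join(arglist))


def new_model(arglist):
    """Modeled 3.24 result: each entry is passed through unchanged."""
    return list(arglist)


def strip_enclosing_quotes(entry):
    """Drop one pair of matching ' or ` characters wrapped around an entry."""
    if len(entry) >= 2 and entry[0] == entry[-1] and entry[0] in "'`":
        return entry[1:-1]
    return entry


workaround = ["one", "'two three'"]
migrated = [strip_enclosing_quotes(entry) for entry in workaround]

print(old_model(workaround))  # ['one', 'two three']
print(new_model(workaround))  # ['one', "'two three'"]
print(new_model(migrated))    # ['one', 'two three']
```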

Note: This change only affects the arglist attribute, not args.

Thus, when upgrading to CFEngine 3.24, you should review your policy and remove the enclosing quotes wherever this workaround is used in arglist, unless the useshell attribute is set to "useshell" or "powershell".

As an example, you can use the following command to locate all occurrences of arglist from within your MPF directory:

command
grep --colour -Irn . -e "arglist" --include="*.cf"
output
 ---snip---
./services/autorun/bogus.cf:10:    arglist => { "one",  "'two three'" };
./services/autorun/bogus.cf:12:    arglist => { "five",  "'$(six)'" };
./services/autorun/bogus.cf:15:    arglist => { "seven",  "`eight nine`" };
./services/autorun/bogus.cf:17:    arglist => { "ten",  "eleven" };
 ---snip---

For each located occurrence, determine whether the promise uses a shell. If it does not, remove any nested quotes from strings in the slist or any referenced variables. See the example below:

example.cf
body contain example
{
  useshell => "useshell";
}

bundle agent main
{
  commands:
    "/usr/bin/foo"  # Requires fixing
      arglist => { "one", "'two three'" };
    "/usr/bin/bar"  # Requires further investigation
      arglist => { "five", "$(six)" };
    "/usr/bin/baz"  # This one is fine as is
      contain => example,
      arglist => { "seven", "`eight nine`" };
    "/usr/bin/qux"  # This one is fine as is
      arglist => { "ten", "eleven" };
}
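If there are many occurrences to go through, a small script can narrow the list down to likely candidates. The following Python sketch is a hypothetical helper and not part of CFEngine. It assumes each arglist is written on a single line, as in the grep output above, and it only flags literal entries wrapped in single quotes or backticks. It cannot tell whether a promise uses a shell, it does not look inside variables such as $(six), and it does not handle entries containing a closing brace, so every hit still needs a manual check.

```python
import re

# Matches a single-line arglist attribute and captures the slist body.
ARGLIST_RE = re.compile(r'arglist\s*=>\s*\{([^}]*)\}')
# Matches a double-quoted policy string, allowing backslash escapes.
STRING_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def find_nested_quotes(policy_text):
    """Return (line_number, entry) pairs for arglist entries wrapped in ' or `."""
    hits = []
    for number, line in enumerate(policy_text.splitlines(), start=1):
        match = ARGLIST_RE.search(line)
        if match is None:
            continue
        for entry in STRING_RE.findall(match.group(1)):
            if len(entry) >= 2 and entry[0] == entry[-1] and entry[0] in "'`":
                hits.append((number, entry))
    return hits


sample = '''\
commands:
  "/usr/bin/foo"
    arglist => { "one", "'two three'" };
  "/usr/bin/bar"
    arglist => { "five", "$(six)" };
  "/usr/bin/qux"
    arglist => { "ten", "eleven" };
'''

for number, entry in find_nested_quotes(sample):
    print(f"line {number}: {entry}")
```

Run against the sample, this reports only line 3. The $(six) entry is not flagged, which is exactly why it requires further investigation by hand.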

If you have questions or need help, reach out on the mailing list or GitHub discussions. If you have a support contract, feel free to open a ticket in our support system.


  1. An observant reader might notice that we need similar tricks now if we want quotes inside our arguments. For those quotes, we’d need to use escape characters, or more quotes, but let’s skip this since it’s not relevant for what we’re looking at here.