Migrating from Travis to GitHub Actions

Posted by Craig Comstock
October 30, 2023

At CFEngine we manage several public and private repositories on GitHub for our Open Source and Enterprise products. To ensure quality we run many checks on the code, both in nightly builds and on each pull request. We use a Jenkins server for nightlies, which also runs more extensive deployment tests on all of the platforms we support. Previously we used Travis for many of these checks, but that system started to show its age and limitations.

In this blog post I will discuss how we migrated part of our continuous integration environment from Travis to GitHub Actions.

Let’s get on with the major components of continuous integration in our case.

For our open source CFEngine agent we have three repositories involved: core, buildscripts and masterfiles. For Enterprise we have several additional repositories for the enterprise agent and the enterprise hub.

Most of our build and test logic is in buildscripts. That is what we use to drive our Jenkins-based CI as well.

Secrets

To let GitHub Actions build our Enterprise product we must give it read access to private repos. This is done through the secrets management provided by GitHub. We also need to provide our builds with a few other secrets. All of the secrets need to be shared with both the organization and GitHub’s Dependabot so that when Dependabot creates pull requests for dependency upgrades, the GitHub Actions workflows can access those secrets.
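As a rough sketch of what that sharing can look like from the command line, gh can store the same organization secret once for Actions and once for Dependabot. The organization name, the placeholder secret name GH_SECRET_ONE (borrowed from the workflow example later in this post), the value argument and the "all repositories" visibility are assumptions for illustration, not a record of our actual setup. With DRY_RUN left at its default of 1 the function only prints the commands.

```shell
# Sketch: store one organization secret for both GitHub Actions and Dependabot.
# Assumptions: org name, secret name and visibility are illustrative only.
# DRY_RUN=1 (the default) prints the commands instead of running them.

share_org_secret() (
  org="$1"; name="$2"; value="$3"
  for app in actions dependabot; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "gh secret set $name --org $org --app $app --visibility all"
    else
      # gh reads the secret value from standard input when --body is not given.
      printf '%s' "$value" |
        gh secret set "$name" --org "$org" --app "$app" --visibility all
    fi
  done
)

# Example (prints two commands, one per secret store):
#   share_org_secret cfengine GH_SECRET_ONE "$SECRET_VALUE"
```

Keeping the two stores in one loop makes it harder to update the Actions secret and forget the Dependabot copy.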

Public versus private

One snag we ran into here, though, is that we want builds in our public repos to have read access to our Enterprise code. GitHub secrets are not available to pull requests opened from forks, even if the pull request is submitted by a member of the organization. The workaround in this case is to push the branch to the organization repo instead of our forks. This is easily done with the gh command, which I have been fond of using lately so that I can avoid my browser a bit longer.

Assuming I have two remotes, upstream and origin, and origin is my fork:

$ git remote -v
origin  git@github.com:craigcomstock/buildscripts (fetch)
origin  git@github.com:craigcomstock/buildscripts (push)
upstream        git@github.com:cfengine/buildscripts (fetch)
upstream        git@github.com:cfengine/buildscripts (push)

gh will ask where I want to push my branch:

$ gh pr create
? Where should we push the 'test' branch?  [Use arrows to move, type to filter]
> cfengine/buildscripts
  craigcomstock/buildscripts
  Skip pushing the branch
  Cancel

Choosing cfengine instead of craigcomstock will allow the GitHub Actions workflows to receive the required secrets.
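If you would rather skip the prompt, one option is to push the branch to the organization remote yourself and then tell gh which branch to use. This is a minimal sketch, assuming the remote named upstream is the organization repository as in the listing above. The helper names run and pr_from_org_branch are made up for this example, and gh may still ask which repository is the base the first time it is used in a clone.

```shell
# Sketch: push to the organization remote, then open the PR from that branch.
# Assumptions: "upstream" is the organization repo; DRY_RUN=1 (default) only
# prints the commands so the sketch can be tried safely.

run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

pr_from_org_branch() (
  remote="${1:-upstream}"
  # Default to the branch that is currently checked out.
  branch="${2:-$(git rev-parse --abbrev-ref HEAD)}"
  run git push --set-upstream "$remote" "$branch" &&
    # --head names the already-pushed branch, so gh does not push or fork.
    run gh pr create --head "$branch"
)

# Example:
#   pr_from_org_branch upstream test
```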

YAML explosion and contraction (DRY)

Initially, as we migrated, we created many YAML files for various jobs in various repos. Often we would copy these files from one repo to another and make small changes to suit the target repo. This became a maintenance problem because we typically maintain a master and two LTS (long term support) branches for several repos, all of which would have essentially identical YAML files for the build, package and test jobs.

To solve this issue we moved the yaml files into buildscripts and referenced them from the various other repos in a master ci.yml file which simply defined which jobs to run, not the content of the jobs.

For example, in buildscripts we have .github/workflows/deployment-tests.yml which we want to use in several repositories.

In each repository we have a main .github/workflows/ci.yml which is our main entry point that builds on pull_request activity.

on: pull_request

The deployment-tests.yml activates on workflow_call to enable it to be called from other workflows.

on:
  workflow_call:
    secrets:
      GH_SECRET_ONE:
        required: true

jobs:
  deployment_tests:
    runs-on: ubuntu-latest
    steps:
      ... etc ...

And then from each repo we simply refer to the yml file:

jobs:
  deployment_tests:
    secrets: inherit
    uses: cfengine/buildscripts/.github/workflows/deployment-tests.yml@master

Generally, it is best practice to pin to a commit SHA instead of @master. In this case we have tight controls over who can commit changes to buildscripts, so we felt it was worth staying with the latest and avoiding the work of updating the commit SHAs whenever we changed the YAML in buildscripts.
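For readers who do want to pin, the lookup can be scripted. The sketch below is not something we run ourselves. The function names latest_sha and pinned_uses are invented for illustration, and it assumes the repository can be read anonymously over HTTPS.

```shell
# Sketch: build a `uses:` line pinned to a commit instead of a branch.
# Function names are hypothetical helpers, not part of buildscripts.

latest_sha() {
  # $1 = owner/repo, $2 = branch. The first column of `git ls-remote`
  # output is the commit the ref currently points at.
  git ls-remote "https://github.com/$1" "refs/heads/$2" | cut -f1
}

pinned_uses() {
  # $1 = owner/repo, $2 = workflow path in that repo, $3 = full commit SHA
  printf 'uses: %s/%s@%s\n' "$1" "$2" "$3"
}

# Example:
#   pinned_uses cfengine/buildscripts .github/workflows/deployment-tests.yml \
#     "$(latest_sha cfengine/buildscripts master)"
```

The cost mentioned above is visible here: every change to the shared workflow means re-running this and committing the new line in each calling repo and branch.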

Containers (docker)

In the process we also found that containers could be very helpful in making the jobs easy to run anywhere, both in GitHub and locally, so that we can debug without having to wait or pay for GitHub’s infrastructure.

We already had a head start using Docker so decided to go with that. It is well supported on macOS and Linux. We would like to gradually migrate to something a bit more lightweight such as buildah in the future.

As our enterprise product is a collection of several independent processes, our implementation was a bit non-standard: we start systemd as the main process in the container and then drive the remaining steps through calls into the container instead of using a one-shot Dockerfile.

Dockerfile-cfengine-deployment-tests

FROM ubuntu:20.04
RUN apt-get update -y && apt-get install -y systemd sudo
CMD [ "/lib/systemd/systemd" ]

Later, a separate shell script runs the test steps inside the container:

docker exec -i $name bash -c 'cd /data; ./buildscripts/ci/deployment-tests.sh'

Not exactly best practices but it does work and it also enables us to run this deployment-tests.sh script in a plain host/virtual machine if we want to dig into things step by step for debugging.
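Put together, the container lifecycle around that docker exec call might look roughly like the sketch below. Only the Dockerfile name and the exec line come from this post. The --privileged flag (a common way to let systemd run inside Docker), the bind mount of the current directory at /data, and the function names are assumptions. DRY_RUN defaults to 1 so the commands are printed rather than executed.

```shell
# Sketch: build the image, start systemd in the background, run the tests
# through `docker exec`, then clean up. Flags marked "assumption" are not
# taken from our real scripts.

run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

deployment_test() (
  name="${1:-cfengine-deployment-tests}"
  run docker build -t "$name" -f Dockerfile-cfengine-deployment-tests . &&
    # assumption: privileged mode and a /data bind mount of the checkout
    run docker run -d --privileged --name "$name" -v "$PWD:/data" "$name" &&
    run docker exec -i "$name" bash -c 'cd /data; ./buildscripts/ci/deployment-tests.sh'
  status=$?
  # Always remove the container, but report the status of the steps above.
  run docker rm -f "$name"
  exit "$status"
)

# Example:
#   DRY_RUN=0 deployment_test my-test-container
```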

Together

Since we have a multi-repo setup for our software, we needed a way to coordinate builds when a change involved PRs for multiple repos. Previously we had a way of looking in the pull request description for mentions (URLs) of PRs in other repos. The CI code would then check out that PR instead of the branch we were on, thereby including all the changes needed.
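The description-scanning idea can be sketched in a few lines of shell. This is an illustration of the technique rather than our actual CI code. The function name linked_prs is invented, and it assumes the linked PRs all live under the cfengine organization and are written as full URLs.

```shell
# Sketch: find PRs in other repos that are mentioned in a PR description.
# Reads the description on stdin, prints one "repo number" pair per line.

linked_prs() {
  grep -oE 'https://github\.com/cfengine/[A-Za-z0-9._-]+/pull/[0-9]+' |
    sed -E 's#.*/cfengine/([^/]+)/pull/([0-9]+)$#\1 \2#' |
    sort -u
}

# Example, assuming each repo is cloned in a directory of the same name:
#   gh pr view --json body --jq .body | linked_prs |
#   while read -r repo number; do
#     # GitHub exposes every pull request as the ref pull/<number>/head.
#     git -C "$repo" fetch origin "pull/$number/head" &&
#       git -C "$repo" checkout FETCH_HEAD
#   done
```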

Save time and money with caching

As we worked through many builds during development of our GitHub Actions we burned through a LOT of minutes. In order to optimize this we needed to use any caching available to us.

GitHub cache

We were surprised to see that GitHub caching was specific to each pull request. We had hoped that we could leverage build caches from other repos and even from the same repo. For example, if we made a change which involved 4 repos, we wouldn’t build the same code 4 times but rather share the results.

Custom cache

We include several third party dependencies as part of our package. Since these don’t change all that often and can sometimes take quite a while to build we already had a cache of these built dependencies. Re-using this with GitHub proved to save a lot of build minutes.

Test-only changes reuse old cache

When you are iterating on a pull request you might need to add or adjust testing code often based on the results in GitHub Actions. Since we are saving minutes by caching our dependencies and packages, we don’t want to invalidate those caches with test expectation changes or additional tests. For this we implemented a package_sha, which computes a SHA from code-only files:

SHA=$(find "${REPO}" -type f -and \( -path "${REPO}/api/*" -or -path "${REPO}/db/*" -or -path "${REPO}/lib*" -or -path "${REPO}/report-collect-plugin/*" \) -print0 | xargs -0 sha1sum | awk '{print $1}' | sha1sum | cut -c -8)

Each repo has its own formula for what is code and what is not. We combine all of those SHAs into one, which we use to decide whether to rebuild the package or not.
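The combining step could look something like the following sketch. It is a variant of the one-liner above, not a copy of our script. The function names code_sha and combined_sha are made up, and sorting the file list is an addition of this sketch so that the result does not depend on the order in which find happens to return files. It assumes GNU find, sort, xargs and sha1sum.

```shell
# Sketch: a per-repo SHA over code-only directories, then one combined key.

code_sha() (
  # $1 = repo directory; remaining arguments = code directories inside it.
  repo="$1"; shift
  cd "$repo" || exit 1
  # Hash file contents in a stable order, then hash the list of hashes.
  find "$@" -type f -print0 | LC_ALL=C sort -z | xargs -0 sha1sum |
    awk '{print $1}' | sha1sum | cut -c -8
)

combined_sha() {
  # Reads one per-repo SHA per line on stdin and folds them into one key.
  sha1sum | cut -c -8
}

# Example (directory names taken from the one-liner above):
#   { code_sha "$REPO" api db; code_sha "$OTHER_REPO" lib; } | combined_sha
```

Because only the listed code directories are hashed, editing a file under a tests directory leaves the key unchanged and the cached package is reused.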

All in all, we have re-implemented most of the CI functionality we had in Travis and are generally happy with the result.

Continuous improvement

As with many projects we weren’t able to include every feature that we wanted. A few tasks stand out as things we will likely address in the near future:

  • shared caching between projects
  • improved logging to ease debugging of failures
  • reduced execution minutes with improved Dockerfiles and saving images for re-use

Hope this summary helps you in your continuous integration work. If you have any thoughts or tips, send them our way at contact@northern.tech.