White paper

Efficient synchronization of tabular data

CFEngine is a configuration management tool that offers a domain-specific language for automating system administration tasks while abstracting away platform-specific differences. A reporting system is available as an additional feature, exclusively to CFEngine Enterprise customers. It collects information about managed hosts by mirroring their states onto a centralized hub. These states are stored as tabular data structures and are synchronized using a log-based synchronization technique. The existing implementation of the reporting system is not optimized for minimal bandwidth consumption, primarily because of scalability limitations in its design, yet bandwidth consumption is a crucial concern for CFEngine customers. In response, this paper summarizes an enhanced log-based synchronization technique, introduced by Lars Erik Wik, that is designed to address this challenge.