Show posts by author:
Mark Burgess

Self-Repairing Deployment Pipelines

Or what we should mean by Distributed Orchestration

Orchestrating complicated distributed processes is an unfamiliar aspect of computing that leads to all kinds of confusion. We are not taught how to do it in college, so we end up trying to apply whatever methods we were taught, often in inappropriate ways.

Promise theory paints a very simple picture of distributed orchestration. Rather than imagining that a central conductor (controller) somehow plays every instrument by remote magic wand, in an algorithmic fashion, promise theory says: let every player in an ensemble know their part, and leave them all to get on with it. The result is an emergent phenomenon. Their natural dependencies on one another will make them all play together.

Over-thinking the storyline in a distributed process is the easiest way to get into a pickle. This is a key point: how we tell the story of a process and how it gets executed are two different things. Modern programming languages sometimes pretend they are the same, and sometimes separate the two entirely.

Scroll back time to 1992, and the world was having all the same problems as today, in different wrapping. Then there was cron, pumping out scripts on hourly, daily and weekly schedules. It was used not only for running jobs on the system but also for basic configuration health checks. Cron scripts were like a wild horde of cats, each with its own life, hard to bring into some sense of order. In the early 90s the acme of orchestration was to sort out all your cron jobs to do the right thing at the right time. The scripts had to differ between machines too, because the flavours of Unix were quite different – and thus there was distributed complexity. Before CFEngine, people would devise devious ways of creating one cronfile for each host and then pushing them out. This was considered to be orchestration in 1992. One of the first use cases for CFEngine was to replace all of this with a single, uniform, model-oriented language/interface.
CFEngine was target oriented, because it had to be repeatable. This property is convergence. In this article I explain why virtual environments and containers are basically this issue all over again.

Another tool of this epoch is make, for building software from dependencies. In 1994, Richard Stallman pointed out to me that CFEngine was very like make. Indeed, this ended up influencing the syntax of the language. The Makefile was different: it was the opposite of a script. Instead of starting in a known state and pushing out a sequence of transitions from there, it focused on the end state and asked: how can I get to that desired end state? In math parlance, it was a change of boundary condition. This was an incredibly important idea, because it meant that – no matter what kind of a mess you were in – you would end up with the right outcome. This is far more important than knowing where you started from.

Makefiles did not offer much in the way of abstraction; you could substitute variables and make simple patterns, but this was sufficient for most tasks, because patterns are one of the most important mechanisms for dealing with complexity. Make was also a serial processor running on a single machine, not really suitable for today’s distributed execution requirements. The main concession to parallelism was the addition of “-j” to parallelize building of dependencies. What was really needed was a model-based approach where we could provide answers to the following questions: what, when, where, how and why.

So now we come to the world of today, where software is no longer shackled to a workstation or a server, but is potentially a small cog in a large system. And more than that – it is a platform for commerce in the modern world. It’s not just developers and IT folks who care about having stuff built – it’s everyone who uses a service. Many of the problems we are looking to solve can be couched in the model of a deployment of some kind.
Whether it is in-house software (“devops”), purchased off-the-shelf software (say “desktop”) or even batch jobs in HPC clusters, all of these typically pass through a test phase before being deployed onto some infrastructure container, such as a server, process group, or even embedded device.

Alas, the technologies we’ve invented are still very primitive. If we look back at the history of logic, it grew out of the need to hit objects with projectiles in warfare. Ballistics was the cultural origin of mathematics and logic in the days of Newton and Boole. Even today, we basically still try to catapult data and instructions into remote hosts using remote copies and shells.

So a script is like a catapult that takes us from one decision to the next in a scripted logic. Another name for this triggered branching process is a chain reaction (an explosion). A Makefile is the opposite: a convergent process, like something sliding easily down a drain. The branching logic in a script leads to multitudes of parallel alternative worlds. When we branch in git or other version control systems we add to this complexity. In a convergent process we are integrating possible worlds into a consistent outcome. This is the enabler for continuous delivery.

So developers might feel as though they have their triggered deployments under control, but do they really? No matter, we can go from this … to this …

This picture illustrates for me the meaning of true automation. No one has to push a button to get a response. The response is proactive and distributed into the very fabric of the design – not like an add-on. The picture contrasts how we go from manual labour, to assisted manual labour, to a proper redesign of process. Automation that still needs humans to operate it is not automation; it is a crane or a power-suit.
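The contrast between a script (a chain of transitions from an assumed starting point) and a convergent process (repair towards a desired end state) can be made concrete with a small sketch. This is only an illustration in Python, not CFEngine code: the `DESIRED` dictionary, its keys, and the functions `run_script` and `converge` are all hypothetical names invented for the example.

```python
"""Sketch: a scripted sequence of transitions versus a convergent repair."""

# Hypothetical desired end state for one host. The keys and values are
# invented for illustration and are not taken from CFEngine.
DESIRED = {"package": "installed", "config": "v2", "service": "running"}


def run_script(state):
    """Imperative script: assumes a known starting state and pushes
    a fixed sequence of transitions out from there."""
    if state.get("package") != "absent":
        # The script only knows one starting point; anything else is an error.
        raise RuntimeError("script assumed a clean host")
    state = dict(state)
    state["package"] = "installed"
    state["config"] = "v2"
    state["service"] = "running"
    return state


def converge(state, desired=DESIRED):
    """Convergent repair: compare each item with the desired end state
    and fix only what differs. Returns (new_state, repaired_keys)."""
    new_state = dict(state)
    repaired = []
    for key, want in desired.items():
        if new_state.get(key) != want:
            new_state[key] = want
            repaired.append(key)
    return new_state, repaired


if __name__ == "__main__":
    # A host in "some kind of mess": half configured, service stopped.
    messy = {"package": "installed", "config": "v1", "service": "stopped"}
    fixed, repaired = converge(messy)
    print(repaired)        # the keys that needed repair
    _, repaired_again = converge(fixed)
    print(repaired_again)  # nothing left to repair on the second pass
```

The script refuses to run on the messy host because it only knows one starting point, whereas the convergent function can be run from any state, as often as we like, and always ends in the same place. Running it again on an already correct host changes nothing, which is what makes unattended, repeated execution safe.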
CFEngine’s model of promises is able to answer all of the questions what, when, where, how and why, at a basic level and has been carefully designed to have the kind of desired-end-state self-healing properties of a drain. Every CFEngine promise is a controlled implosion that leaves a desired end-state.

Today, configuration promises have to be supported across many different scales, from the smallest containers like a user identity, to processes, process groups, virtual and physical machines, local networks, organizational namespaces and even globally spanning administrative domains. How do we do that? The simple answer is that we always do it “from within” – through autonomous agents that collaborate and take responsibility for keeping desired-end-state promises at all levels.

Traditionally, we think of management of boxes: server boxes, rack boxes, routing boxes, etc. We can certainly put an agent inside every one of those processing entities… But we also need to be able to address abstract containers, labelled by the properties we use in our models of intent – business purpose. These are things like: linux, smartos, webservers, storage devices, and so on. They describe the functional roles in a story about the business purpose of our system.

This brings up an important issue: how we tell stories. Despite what we are taught in software engineering, there is not only one version of reality when it comes to computer systems. There is the story:

Posted by Mark Burgess
August 5, 2014

What is WebScale and how does CFEngine help you achieve it?

This is a term often used today to acknowledge the extraordinary growth of the major web companies over a decade (social media, retailing, games, cloud etc) from handfuls of machines to the largest installations on the planet. The major web players today have datacenters with 10,000, 100,000 and even 1,000,000 computers serving their operations. Of course, this kind of growth is not appropriate for everyone. WebScale often goes together with quite singular or focused applications, by contrast with very complex industries that have to support thousands of applications for different lines of business. There is also a link to ideas of cloud computing. WebScale operations do not necessarily involve virtualisation, but typically there is a correlation between ideas of cloud computing and web scale. Some of the issues at web scale include:

Posted by Mark Burgess
June 2, 2014

CFEngine and the future of monitoring

Since writing my earlier post on model-based monitoring, I have talked to many users who encouraged me to describe CFEngine’s simple capabilities in more detail. Although CFEngine is not intended as a traditional monitoring platform, it offers a considerable amount of human-friendly information, with a model that could be a hint of the future. At CFEngine, we like to innovate, and this post offers some hints about how we are thinking.

Posted by Mark Burgess
December 28, 2012

Model-based monitoring with CFEngine

“A model is a lie that helps you to see the truth.” (Howard Skipper) “There is nothing more practical than a good theory.” (Kurt Lewin) The past year has seen a plethora, one might even say an entire movement, of talks and blog posts under the heading “Monitoring Sucks”. Plenty of valid criticisms have been made about the state of the art in monitoring. Back in 1998, I was similarly dissatisfied with the state of the art, and began to ask some basic questions that resulted in CFEngine’s present-day tools for system monitoring. This article is a reminder of CFEngine’s smart and extremely lightweight tools for de-centralized monitoring. These tools were designed to be adaptable and hands-free, and to scale to tens of thousands of hosts, while handling machine-learning pattern matching and responding automatically to thresholds and anomalies with minimal latency, with or without human intervention.

Posted by Mark Burgess
June 10, 2012

CFEngine 3 Enterprise is leaving dry-dock ...

Today, the CFEngine team is announcing CFEngine 3 Enterprise. With the major part of the CFEngine 3 technology being in an open source core, our exploratory commercial edition was originally dubbed ‘CFEngine Nova’ – the ‘New star in configuration management’. Today, CFEngine 3 is no longer a newcomer, but a proven solution in datacentres around the world. With today’s launch, CFEngine 3 Enterprise leaves orbit and begins its voyage to manage an ever expanding universe of IT.

Posted by Mark Burgess
May 29, 2012

Scale and scalability

If someone asks you about the scalability of your operations, don’t tell them about the number of machines you run; tell them rather about what it costs you to tend them each month. The total cost of that burden can be summed up from the cost of hardware, software, maintenance, people, lost revenue during downtime, time lost during maintenance, and time wasted from not managing knowledge well.

Posted by Mark Burgess
May 22, 2012

Ten Reasons for 5-minute configuration update and repair

How often should your configuration management system verify the integrity of your system? The default choices we’ve made at CFEngine are the result of almost 20 years of research into this area. Below you will find ten issues and references that explain why these choices are underpinned by the science. These ten things really all amount to the same thing: if you are playing ping-pong against the adversary of change, you need to be as quick on your feet as your opponent – and faster.

Posted by Mark Burgess
March 19, 2012

What makes clouds float and developers operative? Agility!

In an office, high above New York City, we are looking at the screen of a computer, discussing how the recent blog entry on CFEngine, SysAdmin 3.0 and the Third Wave of IT Engineering applies to the challenges of institutional agility on Wall Street and beyond. ‘Speed is the product,’ says R. R is a CFEngine customer from a large and heavily regulated organization that has been choked with bureaucracy and process management. He has just explained how deploying a server has gone from taking 3-6 months to taking just a few minutes, thanks to his new CFEngine-based process. The present moves faster than the past, and the future moves even faster than that. By casting off by-gone vestiges of industrialized mass production in favour of Third Wave individual customization, he has turned a bureaucratic mess into an inexpensive triumph.

Posted by Mark Burgess
January 6, 2012

CFEngine, SysAdmin 3.0 and the Third Wave of IT Engineering

PART I: the sysadmin poverty trap

The Third Wave of human society is an age where knowledge and information drive prosperity. Parallel to the changes that pushed humanity through major industrialization in manufacturing are developments happening in IT management today, some thirty years behind the manufacturing industry. CFEngine and its users have been a primus motor for these changes – helping to transform old techniques into a knowledge-based approach to IT – and we are celebrating this new chapter with the announcement of CFEngines Nova and Constellation.

Posted by Mark Burgess
November 2, 2011

CFEngine - The Third Wave of Configuration Management

If you have been anywhere near a Unix system in the past ten years, you will almost certainly have heard of CFEngine and its ‘revolutionary, self-healing approach’ to datacentre automation. However, what you might not know is that its current third incarnation, CFEngine 3, is both younger and more advanced than most of its imitators, harnessing the very latest ideas about system management – and the difference is all about knowledge.

Posted by Mark Burgess
November 1, 2011