What is WebScale and how does CFEngine help you achieve it?

Posted by Mark Burgess
June 2, 2014

WebScale is a term often used today to acknowledge the extraordinary growth of the major web companies over a decade (social media, retailing, games, cloud, etc.), from handfuls of machines to the largest installations on the planet. The major web players today have datacenters with 10,000, 100,000 and even 1,000,000 computers serving their operations. Of course, this kind of growth is not appropriate for everyone. WebScale often goes together with quite singular or focused applications, in contrast with very complex industries that have to support thousands of applications for different lines of business. There is also a link to cloud computing: WebScale operations do not necessarily involve virtualisation, but the two ideas typically go together. Some of the issues at web scale include:

  • Elastic scaling to handle bursty loads and seasonal trends.
  • Service oriented architectures.
  • Heavy loads inside datacenters (so-called East-West traffic)
  • Content delivery networks and caches for scaling
  • Continuous delivery of updates
  • High rate of change
  • Software defined infrastructure

Opinions differ on how to manage all of these, but there are common themes. The solutions adopted by the major showcases might not apply to all businesses, but the innovations made by these companies can be applied by almost anyone to good effect. The challenges associated with web scale range from programming and architecting solutions to carrying the traffic over wires in the datacenter. The problems are partly about building software stacks, partly about maintaining them, and partly about robustness under pressure.

How does CFEngine help get you there?

CFEngine was conceived long before the web mushroomed, but it is still a great tool to help scale computing in all scenarios, because of the core principles it adheres to. CFEngine comprises a set of decentralised, agent-based controllers that may be coordinated and measured through shared policy and data. It is fast, lightweight and resilient under pressure. Some of its properties include:

  • Decentralised means scaling is unlimited
  • Hands-free changes are fully automated as repeatable desired end-states
  • High availability - no need to reprovision servers with minor drifts or errors
  • Versioning and knowledge management allow you to match intent with outcome
  • Model-oriented means you can use patterns to reduce complexity
  • Speed means dynamic adaptation and rapid provisioning
  • Efficiency means little wastage in valuable resources
  • Pro-active repair means avoiding bad states
  • A language reflecting a desired end-state model keeps you on task, not distracted by programming your way out of trouble.
  • Built-in monitoring saves you from setting up multiple, uncorrelated models for verifying outcomes.
  • Machine-learning allows you to see trends and variability of workloads
  • Policy federation into human-centric organisational centres is a key to socialising processes such as DevOps, auditing, validation and compliance
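
The desired end-state idea in the list above can be illustrated with a small sketch. This is not CFEngine policy language and not how CFEngine is implemented; it is a minimal Python illustration, and the function name `converge` and the settings shown are hypothetical. It assumes a host's state can be represented as simple key/value pairs, and shows how an agent might repair only what has drifted, so that running it again changes nothing.

```python
from typing import Dict, List


def converge(actual: Dict[str, str], desired: Dict[str, str]) -> List[str]:
    """Repair only the settings that have drifted from the desired end-state.

    Mutates `actual` in place and returns a description of each repair made.
    Settings not mentioned in `desired` are left alone.
    (Hypothetical helper for illustration; not part of CFEngine.)
    """
    repairs = []
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            actual[key] = want  # minor drift is fixed in place, no reprovisioning
            repairs.append(f"{key}: {have!r} -> {want!r}")
    return repairs


# Hypothetical desired end-state and a host that has drifted from it.
desired = {"ntp": "running", "sshd_root_login": "no"}
host = {"ntp": "stopped", "sshd_root_login": "no", "hostname": "web01"}

first = converge(host, desired)   # repairs the one drifted setting
second = converge(host, desired)  # already converged, so nothing to do
print(first)
print(second)
```

The second call returning no repairs is the point: the same policy can be applied repeatedly and by every agent independently, which is what makes the changes repeatable and hands-free.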

CFEngine has been instrumental in helping to scale several well-known web giants (Facebook, Amazon, LinkedIn). It clearly has the tools to configure the whole software stack, but it also lets you choose just as much as you want. It doesn't force you into only one approach. CFEngine's history and Promise Theory-friendly design bring great flexibility.

On June 19th Glenn O’Donnell and I will be discussing Continuous Operations for a competitive advantage in a joint webinar. You can find out more by visiting https://cfengine.com/company/blog-detail/upcoming-webinar-with-forrester-research/. We hope to see you there.