Today, manually managing homegrown software in large environments is no longer feasible. A software-defined approach is needed, which has become known as Infrastructure-as-Code (IaC).
Many of the available configuration management tools, such as Ansible, Terraform, Puppet, Chef, and SaltStack, provide automation for infrastructure, cloud, compliance, and security management, as well as integration with continuous integration and continuous deployment (CI/CD) pipelines. But what is the best tool to start automating your particular environment?
The difficult task of evaluating Configuration Management Tools prevents DevOps from evolving technically and proposing improvements to the environment they manage. The task can seem daunting when many of the tools perform similarly and there doesn’t seem to be much difference. With this guide, sysadmins who aren’t yet on the automation bandwagon don’t have to remain in the dark any longer.
The main benefits of Configuration Management (CM) in infrastructure are:
According to Stack Overflow Trends and Google Trends, Ansible leads the marathon in the searches for automation tools, followed by Terraform, Chef, and Puppet, as can be seen in the graphs below (Saltstack didn’t appear in the Stack Overflow search):
Stack Overflow Trends
Google Trend for web search
These graphs show that, of the five tools being analyzed, Ansible and Terraform have attracted the most interest in the last five years. On the other hand, interest in Puppet, Chef, and Salt has decreased.
Tool | Ansible | Terraform | Puppet | Chef | SaltStack |
Supported resources | Configuration Management, Orchestration and Provisioning | Provisioning | Configuration Management | Configuration Management, Orchestration | Configuration Management, Vulnerability Compliance |
Desired State | Idempotency | Convergence | Convergence | Convergence | Idempotency |
Infrastructure | Mutable | Immutable | Mutable | Mutable | Mutable |
Syntax | Declarative | Declarative | Declarative | Declarative / Imperative | Declarative / Imperative |
Approach control | Serverless and Agentless | Serverless and Agentless | Server and Agent | Server and Agent | Server and Minion (Agent) |
Configuration Language | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓ | ✓✓✓ | ✓✓✓✓ |
Community and Cost Support | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓ |
Maturity and Learning Curve | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓ |
Can it be used with other tools? | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓✓ | ✓✓✓ | ✓✓✓✓ |
Best for | The best tool for those beginning with Infrastructure-as-code managing heterogeneous environments. | To simplify the management of environments in the public and private clouds, multi-cloud and hybrid clouds. | To orchestrate environments that have rigid compliance requirements, maintaining an immutable configuration. | Continuous Automation on complex topologies or deployments that need speed. | To orchestrate and to automate IT tasks with speed and flexibility. |
RATING SCORE | 4.83 | 4.50 | 4.17 | 3.67 | 3.67 |
Before diving into the configuration tools themselves, let’s explore some fundamental concepts. Feel free to skip to the tool reviews below if you’re already familiar with them.
Traditional software configuration management (SCM) tools are used to track changes made during the lifetime of an application. Among the benefits of using SCM are keeping track of configuration items, establishing baselines, controlling changes, and auditing.
The concepts of convergence and idempotency are often confused, which leads to mistakes. Convergence typically means that if a process is run 4 or 5 times, only the necessary changes are made each time, so that the resource being managed converges to the desired state defined in the configuration files. Idempotency means that an operation verifies the current state of what will be modified: if the resource is already in the desired state, no action is taken, so running the operation repeatedly gives the same result as running it once.
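To make the idea concrete, here is a minimal sketch of an idempotent operation in Python. The function name `ensure_line` and the sample rules are hypothetical and are not taken from any of the tools discussed; the point is only that the current state is checked before anything is changed.

```python
def ensure_line(lines: list[str], wanted: str) -> bool:
    """Append `wanted` only if it is missing.

    Returns True when a change was made, False when the list was
    already in the desired state.
    """
    if wanted in lines:
        return False  # already in the desired state: no action is taken
    lines.append(wanted)
    return True


# Hypothetical starting configuration and desired rule.
config = ["+ : root : ALL"]

# Run the same operation five times, as a scheduled tool would.
changes = [ensure_line(config, "+ : deploy : ALL") for _ in range(5)]
print(changes)
print(config)
```

Only the first run reports a change; the later runs find the desired state already in place and leave the list untouched.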
Whether an infrastructure is mutable or immutable depends on whether an environment can change after its creation. A mutable environment allows changes to be made during its lifecycle, such as fixing configuration errors and updating resources that are already provisioned. In an immutable environment, that is not possible: the resource is destroyed and created again with a new version.
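The difference can be sketched with two toy Python classes. All names here (`MutableServer`, `ImmutableServer`, `redeploy`) are invented for illustration and do not correspond to any tool’s API.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # hypothetical ID generator standing in for a cloud provider


@dataclass
class MutableServer:
    server_id: int
    version: str

    def upgrade(self, version: str) -> None:
        # Mutable: the existing resource is changed in place.
        self.version = version


@dataclass(frozen=True)
class ImmutableServer:
    server_id: int
    version: str


def redeploy(old: ImmutableServer, version: str) -> ImmutableServer:
    # Immutable: the old resource is discarded and a new one is created.
    return ImmutableServer(server_id=next(_ids), version=version)


mutable = MutableServer(server_id=next(_ids), version="1.0")
mutable.upgrade("1.1")  # same server, new version

old = ImmutableServer(server_id=next(_ids), version="1.0")
new = redeploy(old, "1.1")  # different server, new version
print(mutable, old, new)
```

The mutable server keeps its identity across the upgrade, while the immutable one is replaced by a new instance with a new ID.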
Configuration management tools use different methodologies to define configuration in code, namely procedural or declarative syntax. The first describes, in sequence, all the steps necessary to reach a specific state. The second simply defines the state a resource should be in; how that state is reached is left to the engine of the automation tool.
A procedural syntax is best exemplified by a common shell script. For instance, if a systems administrator uses a shell script to customize the access.conf file of a server to include a set of lines authorizing access to that server, they will have to define a number of instructions: check if the file exists, load its contents into memory, check if any of the lines already exist, insert those that aren’t in the file, and finally save the new content to disk.
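The same sequence of steps can be sketched as follows. The article’s example is a shell script; Python is used here only for illustration. The function name `add_access_rules`, the sample rules, and the choice to treat a missing file as empty are all assumptions.

```python
import os
import tempfile


def add_access_rules(path: str, rules: list[str]) -> list[str]:
    """Procedurally add missing rules to a file; return the rules added."""
    # Step 1: check if the file exists (assumption: a missing file is empty).
    if os.path.exists(path):
        # Step 2: load its contents into memory.
        with open(path) as fh:
            lines = fh.read().splitlines()
    else:
        lines = []

    # Steps 3 and 4: check which lines already exist, insert the others.
    added = []
    for rule in rules:
        if rule not in lines:
            lines.append(rule)
            added.append(rule)

    # Step 5: save the new content to disk, only if something changed.
    if added:
        with open(path, "w") as fh:
            fh.write("\n".join(lines) + "\n")
    return added


# Demonstration on a temporary file rather than a real access.conf.
with tempfile.TemporaryDirectory() as tmp:
    conf = os.path.join(tmp, "access.conf")
    with open(conf, "w") as fh:
        fh.write("+ : root : ALL\n")
    wanted = ["+ : root : ALL", "+ : deploy : ALL"]
    first = add_access_rules(conf, wanted)
    second = add_access_rules(conf, wanted)
    print(first, second)
```

Every step is spelled out by the author of the script, which is exactly what a declarative tool takes off the administrator’s hands.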
The Terraform tool, for example, uses the declarative approach: an infrastructure object is defined as a resource, and the particular configurations of this object are defined as parameters in the resource definition. The administrator does not declare how a particular state will be reached; that is done by the Terraform engine. If a number of servers is defined, Terraform will create or destroy instances until the defined number of infrastructure objects is present.
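The reconciliation idea behind a declarative engine can be sketched in a few lines of Python. This is a toy model and not Terraform’s actual algorithm; the function `plan` and the instance names are hypothetical.

```python
def plan(desired_count: int, running: list[str]) -> tuple[int, list[str]]:
    """Compare desired and current state.

    Returns (number of instances to create, names of instances to destroy).
    """
    missing = desired_count - len(running)
    if missing > 0:
        return missing, []
    # Too many (or exactly enough): destroy everything past the desired count.
    return 0, running[desired_count:]


# The administrator only declares the desired count; the engine works out
# which actions are needed from the current state.
scale_up = plan(3, ["web-0"])
scale_down = plan(1, ["web-0", "web-1", "web-2"])
no_change = plan(2, ["web-0", "web-1"])
print(scale_up, scale_down, no_change)
```

The caller never states whether to create or destroy; the same declaration produces different actions depending on what already exists.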
One of the things a sysadmin must evaluate when choosing tools is the trade-off between the benefits and the complexity a tool will add to the environment being managed.
Managing master servers and client agents can be a daunting task, especially in big environments. A main challenge is how to provision the agents for the first time and how to keep them up to date.
On the other hand, an agentless solution has some limitations and it needs some extra work to maintain the resources’ compliance. All of this has to be taken into account and placed on a scale when deciding.
Ansible is described as “a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications — automate in a language that approaches plain English, using SSH, with no agents to install on remote systems.”[1]
It’s no accident that Ansible is the most popular choice among automation tools. It’s very flexible and can be considered an “all-in-one” tool. All automation steps can be done with Ansible, from Orchestration, Configuration Management, Application Deployment, Provisioning, and Continuous Integration & Delivery (CI/CD) to Security & Compliance.
This is because Ansible was developed in Python: in addition to inheriting the features of a general-purpose language, you can make use of thousands of existing packages from the Python community to create your own modules.
Supported resources:
Ansible does not require agents installed on the endpoints, so it supports equipment such as firewalls, load balancers, containers, enterprise storage appliances, and other network devices.
Ansible, currently in version 2.9, has more than 3300 modules in several areas of IT infrastructure:
Source: Deck Ansible Workshop
The Ansible community is very active, with meet-up groups across the globe, IRC channels, and mailing lists. Like other Red Hat maintained products, Ansible’s code is open source and receives direct contributions from the community.
Support plans are available for Ansible Engine, Ansible Tower, and Ansible Content Collection. The price varies from US$5000 to US$14000. Licensing is done in groups of 100 servers and is billed annually.
Ansible has a very short learning curve, with easy installation and initial configuration. In less than 30 minutes, it is possible to install, configure and execute ad-hoc commands for ’n’ servers to solve a specific problem, such as daylight saving time adjustments, time synchronization, root password change, updating servers, restarting services, etc.
Syntax and workflow are simple to understand, making Ansible easy to learn for new users. The files use YAML (YAML Ain’t Markup Language), a human-readable format that is widely used by other tools and easy to understand, and Ansible’s functionality can be extended with customized modules written in Python.
Even though Ansible is like a Swiss Army knife, it can also be used with other tools. Due to its flexibility and simplicity, Ansible can be combined with Terraform for maintaining immutable environments or with Puppet for persistent configuration on servers.
Due to the great (and increasing) number of supported resources and its ease of use, Ansible is a great choice for those starting with configuration management tools and infrastructure-as-code.
The tool is ideal when automating tasks that do not depend on maintaining state. So first installations of software, correcting configuration files across several instances, backing up switches configuration, and similar activities are easily automated with Ansible.
In heterogeneous environments, Ansible is an excellent choice because it allows all these resources to be managed with a single tool. Being able to manage Windows and Linux boxes or provisioning resources in more than one cloud provider and on-premises with the same tool saves a lot of time for the Ops teams.
Even the management of different Linux distributions can be simplified with small adaptations to playbooks. It should be noted that Ansible’s support for Linux and Unix-like systems is greater than its support for Windows. However, Ansible can be used to orchestrate PowerShell scripts and Desired State Configuration (DSC) resources if a specific Windows module has not yet been developed.
Read this tutorial to get started with Ansible quickly
Terraform is an open-source infrastructure-as-code software tool written in the Go Language. As described in their first blog post, “it focused on providing a way to describe the complete Infrastructure as Code from physical servers to containers to SaaS products, Terraform is able to create and compose all the components necessary to run any service or application. With Terraform, you describe your complete infrastructure as code, even as it spans multiple service providers. Your servers may come from AWS, your DNS may come from Cloudflare and your database may come from Heroku”[4]
As described above, Terraform has come to cover the growth of cloud platforms. A gap began to increase when the paradigm of infrastructure changed from the on-premises solutions to the cloud platform services. A tool was needed to facilitate handling all the solutions for several cloud providers at the same time without having to implement different APIs and develop scripts to orchestrate all the different components.
Terraform makes it possible to change from one platform to another painlessly, including in-house solutions mixed with multi-cloud platforms (e.g., IaaS or SaaS services), all in a transparent way. “It is a tool for building, changing, and managing infrastructure in a safe, repeatable way.”[5]
Concrete use cases of Terraform are Multi-Tier Applications, Self-Service Clusters, Software Demos, Disposable Environments, Software Defined Networking (SDN), Resource Schedulers, and Multi-Cloud Deployment.
The resources and features available[6]:
Terraform does not require agents installed on the endpoints. The tool’s concept is to work with any service that exposes an API in order to make resources available, managing low-level components such as compute instances (physical and VMs), storage, and networking (switches) as well as high-level components such as containers, DNS entries, SaaS services, etc.
Terraform, currently in version 0.12, has more than 300 providers in several areas of cloud computing like IaaS, SaaS and PaaS services.
Source: Open Source and Cloud
Terraform is an open-source tool and its code is hosted on GitHub. Anyone can contribute to the development and help to extend its features, fix bugs, and document new use cases. The community is growing and people are willing to help the project through various channels like forums, bug trackers, and community portals.
HashiCorp offers support plans for Terraform Cloud and Terraform Enterprise. These include a central interface for running Terraform, access and privilege control, remote state file storage, notifications, and built-in integration with Version Control Systems (VCS).
Terraform Cloud is free for up to 5 Users, and “for Terraform teams that do want enterprise governance features, Terraform Cloud for Teams comes with role-based access control for private module registries and support for unlimited collaborators in a version priced at $20 per user per month. For $70 per user per month, Terraform Cloud for Teams also includes Sentinel policy as code and advanced policy and permissions features that can be customized among multiple regions and time zones and enforced as mandatory or suggestions. The $70 per month version also includes a new cloud infrastructure cost estimation feature that alerts users about the projected costs of infrastructure they are about to provision with Terraform.” [7]
The drawback of Terraform is the state. By default, Terraform stores the state in a local file named terraform.tfstate. When you work in a team, you must use remote state, a kind of remote storage for state files that is shared with all members of the team.
There’s a modest learning curve and the documentation helps those that are getting started. The organization of the documentation is well structured, visually clean and very well written. A learning platform is available with several Getting Started tutorials, mini online training courses, which are totally free. The learning platform also includes a review guide for Product Certification Exam Prep.
Yes. Terraform is compatible with other tools. Provisioners can be used to model specific actions on the local machine or on a remote machine in order to prepare servers or other infrastructure objects for service. There are provisioners for Chef, Puppet, and Salt. Provisioners are a last resort and should be used only for “certain behaviors that can’t be directly represented in Terraform’s declarative model.”[8]
Terraform is growing because it simplifies the management of environments in the cloud. Several cloud services appeared over time, and what was supposed to be a simple task ended up becoming a complex one. There are public and private clouds, multi-clouds, and hybrid clouds. Terraform arrived with the intent of being the tool to manage resources across varied cloud providers:
Read this tutorial to get started with Terraform quickly
Puppet is a company that began around the name of its main product, despite having other products and services today. Puppet represents the very implementation of the configuration management specification.
The basic architecture of Puppet is a master-client approach, in which a master node controls the desired state of a number of managed nodes through agents.
A Puppet agent “regularly performs Puppet runs, wherein it sends facts to a master and receives a configuration catalog, then applies the catalog to the local system using its providers.” [9]
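The agent run described in this quote (send facts, receive a catalog, apply it locally) can be sketched as a toy model in Python. None of this is Puppet code: the function names, the node dictionary, and the package policy are hypothetical and serve only to show the flow of data.

```python
def collect_facts(node: dict) -> dict:
    # Agent side: gather facts about the local system.
    return {"hostname": node["hostname"], "os": node["os"]}


def compile_catalog(facts: dict) -> dict:
    # Master side: turn facts into a desired-state catalog.
    # Hypothetical policy: every node gets a web server package.
    package = "apache2" if facts["os"] == "debian" else "httpd"
    return {"packages": {package}}


def apply_catalog(node: dict, catalog: dict) -> list[str]:
    # Agent side: change only what differs from the catalog.
    missing = catalog["packages"] - node["installed"]
    node["installed"] |= missing
    return sorted(missing)


def agent_run(node: dict) -> list[str]:
    facts = collect_facts(node)          # 1. send facts to the master
    catalog = compile_catalog(facts)     # 2. receive a configuration catalog
    return apply_catalog(node, catalog)  # 3. apply it to the local system


node = {"hostname": "web01", "os": "debian", "installed": {"openssh-server"}}
first_run = agent_run(node)
second_run = agent_run(node)
print(first_run, second_run)
```

The second run changes nothing, which is the behavior expected from an agent that runs regularly against an unchanged catalog.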
Puppet is an ecosystem of products that provides tools to automate the management of your infrastructure. The community of users and contributors are involved in the Puppet Open Source version. There are also other Puppet enterprise products available with a support plan for increased functionality and performance of the core product.
Available resources and features:
By default, Puppet requires agents installed on the endpoints and managed nodes. Another approach is to use modules backed by an API for services that do not permit installed agents. Thus, it supports configuration management of almost all operating systems, cloud, virtualization and containers, applications, networking, and databases, and integrates with other provisioning tools.
Puppet, currently in open source version 6.14, has more than 6443 modules in the Puppet Forge to support several areas of IT infrastructure, such as:
Source: Open Source and Cloud
The members of the Puppet community are very active. The Puppet Forge has over 6,000 community modules created by a network of users and developers. There are several events and programs such as Puppet Camps, Puppet Champions, and Puppet Test Pilots.
Puppet at its core is open-source software and most users will start off with the Open Source version of Puppet. Puppet Enterprise can be used for free on up to 10 nodes. For the support plans, Puppet Enterprise standard pricing starts at $112 per node/year. Bolt is completely free.
The programming language (DSL) is easy to learn and robust for complex implementation. Puppet was released in 2005 and it has good community support across development tools.
The basic infrastructure for the initial configuration is complex. Installation, configuration, and maintenance of the Puppet master and agents are not trivial for sysadmins who are entering the DevOps world or discovering automation tools. Despite the arrival of the Bolt tool, which makes things easier for newcomers, Puppet’s real potential lies in its client-server architecture.
Puppet integrates best with Terraform, using Bolt tasks for executing many Terraform actions like “apply”, “destroy”, “init”, and “output”. There are other initiatives from the user communities for other management tools like Ansible, Chef, and SaltStack.
Puppet is recommended for orchestrating environments that have rigid compliance requirements, maintaining an immutable configuration of nodes, with reports and role-based access control.
Read this tutorial to get started with Puppet quickly
Don’t read this part hungry. You’ll hear a lot of culinary terms like recipe, cookbook, kitchen, knife, and supermarket, but keep calm, we’re talking about a company called Chef. Chef was created in 2008 around one product called Chef Infra. The company grew by creating good products focused on delivering a fast automation solution for both infrastructure and applications.[10]
In their mission, they describe themselves as a platform “to help the most enduring and transformative companies use Chef to become a fast, efficient, and innovative software-driven organization.”[11]
Like the other automation tools, Chef embraces several areas in the DevOps (ALDO[12]) software solutions, but with a different premise: Speed and scalability. Chef’s architecture is a client-server approach similar to Puppet, with one difference that each node computes its settings and manages its own needs.
Chef states the quality of their solution is due to the assurance that the “configuration policy is flexible, versionable, testable, and human-readable”[13]. The main resources and features are:
The main product is Chef Infra that interacts with Chef Workstation and Chef Server. Chef Infra supports several areas of automation like Cloud, overall Operating Systems nodes, Virtualization, Containers, Provisioning, Continuous Integration, and Configuration Management Tools.
Chef’s supported free distribution is currently in version 12, while Chef Infra Server and Chef Client are in version 14. There are more than 3,914 cookbooks in the public Chef Supermarket, covering several areas of IT infrastructure:
Source: Chef Get Data Sheet
Chef has community support for each portfolio of products behind the Chef Supermarket, with events, meetings, forum discussions, Slack channels, podcasts, and blog posts. There is a big annual event called ChefConf, similar to Apple’s WWDC, where consumers, developers, enthusiasts, and fans meet.
Pricing starts at $16,500 per year, for Standard support on smaller deployments, including Effortless Infrastructure Suite in Essential Plan with Chef Infra, Inspec Automate for 100 nodes/targets. This does not include Chef Habitat. The Essentials Plan, starting at $35,000 per year, for Enterprise Automation Stack, includes Chef Habitat for 100 service instances.[17]
Chef has a medium Learning Curve and provides learning resources. The Learn Chef Rally is an open platform of learning for the community that makes Tutorials, Quickstarts, Tracks, Hands-on with Modules and Demos available for “your DevOps learning journey with Chef, Chef Automate, InSpec, and Habitat”[18].
The quality of Chef materials and resources available is a highlight and improves the user experience. The website, documentation, whitepapers, datasheets, presentations and eBooks contain beautiful diagrams, infographics and graphic representations using a clean design.
Like any configuration management tool, Chef is not incompatible with other tools. There are points of intersection, but they can be combined to extract the best features of each one. Terraform is the recommended tool to cover the gap of Chef concerning Cloud Providers.
Chef’s main strength is Continuous Automation: Infrastructure, Application and enforcing compliance on complex topologies or deployments that need speed, efficiency, and risk management with cross-functional teams. It is “for organizations seeking the highest degree of confidence before introducing change to production systems.”[19]
Salt is one of the open-source configuration management tools written in Python. The project was created in 2011 by the SaltStack company and “was largely the work of its co-founder, Tom Hatch. His street cred in CM development stems largely from his long-time use of both Puppet and Chef. He felt dissatisfied with both, particularly their slowness and over-reliance on Ruby.”[20]
As described in its getting started documentation, “Salt is designed for high-performance and scalability. Salt’s communication system establishes a persistent data pipe between the Salt master and minions (agent nodes) using ZeroMQ or raw TCP, giving Salt considerable performance advantages over competing solutions”[21].
The main resources and features are:
By default, Salt requires agents installed on the nodes (minions) and at least one Salt master server. The flexibility of Python allows Salt to run nearly everywhere that Python runs. There is a proxy minion system for devices that can’t be managed with Python. It supports various technologies for major cloud providers, almost all operating systems (including the Microsoft Windows ecosystem), the main virtualization and containers platforms, database infrastructure, monitoring and networking.
Salt, currently in version 3000, has around 1200 modules or plugins in several areas of IT infrastructure, security compliance and networking automation:
Source: Salt Proxy Minion
“Salt uses a server-agent communication model, (though it works well as a standalone single-server management utility, and also provides the ability to run agentless over SSH). The server component is called the Salt master, and the agent is called the Salt minion.”[22]
Salt has many resources for its community of users, contributors, and developers. These include meetups around the world, Virtual User Groups with weekly online events, IRC channels, and a Slack workspace.
The big event for enthusiastic users, consumers, partners and developers is SaltConf. The SaltConf20 EU was scheduled to be held in the Netherlands, June 9-10, 2020.
SaltStack Enterprise is the version with a support plan and can cost up to $150 per minion/year.
The Learning Curve is low for the basics of getting SaltStack up and running. Tutorials and Getting Started, online documentation, as well as The Hacks podcast are easily accessible.
Reference materials like the site, documentation, resources, and use cases fall short of expectations and can frustrate new fans. The tutorials, on the other hand, are well done. The organization of the online documentation is not intuitive and has terrible design compared with the other tools assessed in this article. This is the main drawback of SaltStack.
Salt integrates with any configuration management tool. There are rosters for Ansible and Terraform (for Salt SSH), modules for Chef and Puppet, and Ansiblegate. With the last one, it’s possible to use the playbooks and module resources of the Ansible core.
It is best for enterprise IT organizations that wish to orchestrate and to automate IT tasks with speed and flexibility to deliver continuous security compliance, vulnerability remediation and IT security.
How can you benefit from the DevOps culture and be agile when new tools and solutions are emerging and disappearing every day? How can you choose the best CM tool for your particular needs?
The main challenges:
Recommendations:
DevOps and Sysadmins should make their own toolsets for infrastructure automation:
[1] “ansible/ansible: Ansible is a radically simple IT … – GitHub” – GitHub
[2] “Ansible module development: getting started — Ansible …” – Ansible
[3] “Working With Plugins — Ansible Documentation” – Ansible
[4] “Terraform Announcement” – Terraform
[5] “Introduction to Infrastructure as Code with Terraform” – Terraform
[6] “Intro – Terraform” – Terraform
[7] “New HashiCorp Terraform pricing aims for midsize firms, teams” – TechTarget
[8] “Provisioners” – Terraform
[9] “Agent glossary” – Puppet
[10] “The Overview platform” – Chef
[12] “ALDO Agile Lean DevOps Outcomes” – Chef
[16] “Chef Infra Overview” – Chef
[18] “Learn Chef Rally” – Chef
[19] “Chef and Puppet” – Chef
[20] “Salt vs Puppet: Which One to Choose?” – Upguard
[21] “Understanding SaltStack” – SaltStack