Ansible for developers 101

Introduction

When I started out as a developer, I quickly grew frustrated whenever I had to repeat the same task two or more times.

Being a developer does not mean you only write code. Your code will hopefully run in several environments, and it is often up to you to manage part of them.

Nowadays automation greatly improves developers' quality of life. You can rely on tools that take care of the boring, repetitive work for you, so you can stay focused on your actual job.

Ansible is one of my favourite tools. It comes with a gentle learning curve and feels familiar to developers, since its concepts are quite similar to the ones you use when writing code.

Ansible

Ansible is an automation platform that comes with an automation language that can describe an IT application infrastructure in Ansible Playbooks. An automation engine runs Ansible Playbooks.

Ansible also comes with Ansible Tower: an enterprise framework for fully controlling, securing and managing your Ansible automation with a UI and RESTful API.

Why Ansible?

Ansible is Open Source and has a large community;
Ansible has a gentle learning curve;
It lets you describe automation in an (almost) human-readable fashion;
No special coding skills needed unless you want to extend it;
Your tasks are executed in order;
Automation steps are repeatable and testable in different environments;
You can manage the deployment and configuration of your application;
Ansible works with existing tools, so you can achieve a faster start;
It comes with over 500 included modules for the more common use cases;
You can orchestrate your workflow and application lifecycle.

Unlike some of its competitors, Ansible is agentless. To execute tasks on another machine it uses OpenSSH or WinRM, so you don’t have to install, update or manage agents. This gives you a faster start and more control, and it also leads to a more efficient and secure workflow.

Ansible is also multiplatform and supports all major OS variants.

Use cases

Configuration management: centralizing configuration file management and deployment is a common use case for Ansible, and it’s how many power users are first introduced to the Ansible automation platform;
Security and compliance: when you define your security policy in Ansible, scanning and remediation of site-wide security policy can be integrated into other automated processes and instead of being an afterthought, it’ll be integral in everything that is deployed;
Application deployment: when you define your application with Ansible, and manage the deployment with Ansible Tower, teams are able to effectively manage the entire application lifecycle from development to production;
Orchestration: configurations alone don’t define your environment. You need to define how multiple configurations interact and ensure the disparate pieces can be managed as a whole. Out of complexity and chaos, Ansible brings order;
Continuous delivery: creating a CI/CD pipeline requires buy-in from numerous teams. You can’t do it without a simple automation platform that everyone in your organization can use. Ansible Playbooks keep your applications properly deployed (and managed) throughout their entire lifecycle;
Provisioning: your apps have to live somewhere. If you’re PXE booting and kickstarting bare-metal servers or VMs, or creating virtual or cloud instances from templates, Ansible and Ansible Tower help streamline the process.

How Ansible works

Ansible components follow a kind of hierarchical structure:

Managed Nodes are the network devices and/or servers you manage with Ansible. Ansible does not need to be installed on these nodes;
The Inventory contains a structured set of managed nodes with optional attached metadata. Those hosts can be grouped together;
Modules are units of code to be executed on the hosts specified in the inventory. You can invoke a single module with a task;
Tasks are the units of action in Ansible, and each task invokes a module. You can execute a single task once with an ad-hoc command, or several tasks with a Playbook;
Playbooks are ordered lists of tasks, stored so you can run those tasks in that order repeatedly. Playbooks can include variables as well as tasks. Playbooks are written in YAML and are easy to read, write, share and understand.

An important aspect of Ansible is that you don’t have to know the implementation details of a module. You just have to pass it the correct arguments. Some common modules are apt/yum, ping, copy, uri, user, assert, git, template and so on. You can also check out the Ansible Module Index.

Those Modules are going to be used to act against an Inventory. You can also include inventory-specific data, making everything more reusable.

A static Inventory might look like the following:

[control]
control ansible_host=10.45.21.2

[web]
node1   ansible_host=10.43.21.0
node2   ansible_host=10.43.21.1
node3   ansible_host=10.43.21.2
node4   ansible_host=10.43.21.3

[haproxy]
haproxy ansible_host=10.43.20.0

[all:vars]
ansible_user=vagrant
ansible_ssh_private_key_file=~/.vagrant.d/private_key

[web:vars]
ansible_user=dario

This Inventory contains three groups: control with a single control node, web with four web nodes, and haproxy with one HAProxy node. Each host is listed with its IP address. You can also attach variables: those under all:vars apply to every host, while those under web:vars apply only to hosts in the web group. More specific variables take precedence, so here the web nodes connect as dario while every other host connects as vagrant.
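
The same inventory can also be written in YAML, which Ansible accepts alongside the INI format. The following is only a sketch of the equivalent structure; the file name hosts.yml is an assumption:

```yaml
# hosts.yml (hypothetical file name)
all:
  vars:                       # same role as [all:vars]
    ansible_user: vagrant
    ansible_ssh_private_key_file: "~/.vagrant.d/private_key"
  children:
    control:
      hosts:
        control:
          ansible_host: 10.45.21.2
    web:
      vars:                   # same role as [web:vars]; overrides the value under all
        ansible_user: dario
      hosts:
        node1:
          ansible_host: 10.43.21.0
        node2:
          ansible_host: 10.43.21.1
        node3:
          ansible_host: 10.43.21.2
        node4:
          ansible_host: 10.43.21.3
    haproxy:
      hosts:
        haproxy:
          ansible_host: 10.43.20.0
```

In this form the nesting makes the scoping visible: a vars block applies to everything below the group it belongs to.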

Hosts specified in your inventory don’t need to be aware of Ansible. They can be any virtual or physical machine with a Python interpreter installed, whether on-premises or in a public cloud, an OpenStack or VMware cloud, and so on.

Ansible is written in Python, so you can extend it with plugins written in Python that use the Ansible API.

Installing Ansible

Before we start, we need to install Ansible. This is quite simple because you only have to run:

$ pip install ansible

Now, to check that it works, run:

$ ansible --version

Ad-Hoc Commands

You can also execute some Ad-Hoc Commands. An Ansible ad-hoc command uses the /usr/bin/ansible command-line tool to automate a single task on one or more managed nodes. Ad-hoc commands are quick and easy, but they are not reusable.

An example of an ad-hoc command is the following:

ansible all -i hosts -u dario -m ping

This command runs the ping module on all hosts in the inventory file called hosts, connecting as the user dario. The ping module is not an ICMP ping: it checks that Ansible can log in to each host and find a usable Python interpreter. It is a handy check to run before a full Ansible Playbook. You could also run the setup module to gather facts about the target machines without modifying them.
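
Since ad-hoc commands are not reusable, the same check can be kept in a small playbook. This is only a sketch: the play name, the file name ping.yml and the choice to skip fact gathering are assumptions, not part of the original command.

```yaml
---
# ping.yml (hypothetical file name)
- name: check that every host is reachable
  hosts: all
  remote_user: dario        # same user as -u dario in the ad-hoc command
  gather_facts: false       # assumption: skip fact gathering to keep the check fast
  tasks:
    - name: ping the host
      ping:
```

It would be run with ansible-playbook -i hosts ping.yml.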

Variables

Ansible was born to bring automation, and automation implies reuse. Reuse, in turn, calls for variables. Ansible can work with metadata from various sources and manage it in the form of variables.

Using variables you can also keep another layer of abstraction in your playbooks, separating information related to the execution and information related to hosts.

Once defined, variables can be overridden by others with higher precedence, so you can handle special cases without breaking your reusable workflow. The documentation this post was based on lists 16 levels of precedence; newer releases list more. As a rule of thumb, the more specific a variable is, the higher its precedence.
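
A minimal sketch of one such override, assuming a variable named http_port and the web group from the inventory above. A variable set on a task outranks the same variable set on the play:

```yaml
---
- name: show variable precedence
  hosts: web
  gather_facts: false
  vars:
    http_port: 80             # play-level value
  tasks:
    - name: print the play-level value
      debug:
        msg: "http_port is {{ http_port }}"

    - name: print the task-level override
      vars:
        http_port: 8080       # task vars outrank play vars
      debug:
        msg: "http_port is {{ http_port }}"
```

Extra variables passed on the command line with -e (for example -e "http_port=9090") have the highest precedence, so they would win in both tasks.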

Ansible Playbooks

As we said, an Ansible Playbook contains an ordered list of Ansible Tasks to be executed against hosts in your inventory. Let’s see how a playbook works.

To run a playbook, use ansible-playbook instead of ansible, passing the path to the playbook file. For example, ansible-playbook -i hosts my_playbook.yml runs the playbook my_playbook.yml against the inventory file hosts.

Let’s construct a playbook step-by-step. First of all, we have to include the list of tasks:

Tasks

tasks:
    - name: add cache dir
      file:
        path: /opt/cache
        state: directory
    
    - name: install nginx
      yum:
        name: nginx
        state: latest
    
    - name: restart nginx
      service:
        name: nginx
        state: restarted

With this playbook, we:

add a cache directory at /opt/cache using the file module, passing the desired path as the path parameter and the desired state, directory, as the state parameter;
install nginx using the yum module, setting the name parameter to nginx and the state parameter to latest, meaning that we want the latest available version of nginx installed on the machine;
restart the nginx service using the service module, setting the name parameter to nginx and the state parameter to restarted. So we are saying we want the nginx service to be restarted on the machine.

The outcome of a task can be one of the following:

ok: status is as desired;
changed: task modified something to apply the status as desired;
failed: final state is not as desired because something went wrong.

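This example illustrates the semantics of modules: you specify the desired state declaratively, and it is up to the module to run the steps that bring the machine to that state. As a consequence, when we run the same playbook a second time, the first two tasks report ok because the directory and the package are already in place (unless a newer nginx package has been published in the meantime). This highlights an important property of Ansible Tasks: they should be idempotent. The restart nginx task is the exception, since state: restarted restarts the service on every run and always reports changed.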
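
To make the idea of idempotency more concrete, here is a sketch that contrasts a declarative module with a raw command. The play header, the use of become and the /opt/cache/tmp path are assumptions made for illustration:

```yaml
---
- name: idempotency sketch
  hosts: web
  become: true                # assumption: writing under /opt needs elevated privileges
  tasks:
    # Declarative: reports "changed" on the first run and "ok" afterwards.
    - name: add cache dir
      file:
        path: /opt/cache
        state: directory

    # Imperative: the command module cannot tell whether work is needed,
    # so without "creates" it would run mkdir on every execution.
    - name: add a temp dir with a raw command
      command: mkdir /opt/cache/tmp
      args:
        creates: /opt/cache/tmp   # hypothetical path; the command is skipped once it exists
```

The first task needs no extra hints because the file module compares the current state with the desired one; the second relies on creates to get comparable behaviour.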

Handlers

Handlers are special tasks that run at the end of a play only if another task notifies them. A handler can be notified multiple times during the play, but it still runs only once, at the end. For example, you can restart a service only when its configuration has changed.

Let’s refactor the preceding playbook to use a handler:

tasks:
    - name: add cache dir
      file:
        path: /opt/cache
        state: directory
    
    - name: install nginx
      yum:
        name: nginx
        state: latest
      notify: restart nginx
    
handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted

This way, the restart nginx handler runs only when the install nginx task reports changed as its status.
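
The configuration use case mentioned earlier can be sketched in the same way. In this example two tasks notify the same handler, which still runs only once at the end of the play. The play header and the files nginx.conf.j2 and cache.conf are hypothetical:

```yaml
---
- name: configure nginx
  hosts: web
  become: true                        # assumption: writing under /etc needs elevated privileges
  tasks:
    - name: write the main nginx config
      template:
        src: nginx.conf.j2            # hypothetical template next to the playbook
        dest: /etc/nginx/nginx.conf
      notify: restart nginx

    - name: write the cache settings
      copy:
        src: cache.conf               # hypothetical file next to the playbook
        dest: /etc/nginx/conf.d/cache.conf
      notify: restart nginx

  handlers:
    - name: restart nginx             # must match the name used in notify
      service:
        name: nginx
        state: restarted
```

If neither file changes, neither task reports changed and nginx is left running untouched.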

Plays

A play is an ordered set of tasks executed against a selection of hosts from your inventory. A playbook contains one or more plays.

A playbook with Plays can be the following:

---
- name: install and start apache
  hosts: web
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root

  tasks:
  - name: install httpd
    yum: pkg=httpd state=latest
  - name: write the apache config file
    template: src=/srv/httpd.j2 dest=/etc/httpd.conf
  - name: start httpd
    service: name=httpd state=started

The top-level name field is the name of the play;
The hosts field selects the inventory hosts or group the play runs against, here the web group;
The vars map defines variables that are made available to the tasks. Here they are used to render the template in the template task;
The remote_user field sets the user that runs the tasks on the managed nodes;
The tasks section lists the tasks to be executed. Here we use the more compact key=value form to describe them.
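
The compact key=value form is equivalent to the expanded dictionary form used in the earlier sections. As a sketch, the tasks of this play could be written as follows, assuming httpd is the intended service name in the last task:

```yaml
  tasks:
    - name: install httpd
      yum:
        name: httpd             # "pkg" in the compact form is an alias of "name"
        state: latest

    - name: write the apache config file
      template:
        src: /srv/httpd.j2
        dest: /etc/httpd.conf

    - name: start httpd
      service:
        name: httpd             # assumed service name, matching the installed package
        state: started
```

The expanded form is longer, but it is easier to read in diffs and avoids quoting problems when a value contains spaces.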

Conclusions

With these concepts in hand, you can start playing with Ansible. I will cover more advanced features, such as Ansible Roles and loops, in other posts.