10.6 Ansible - Configuration Management
Learning Objectives
By the end of this chapter, you will be able to:
Define configuration management and understand its role in the IaC ecosystem
Install and configure Ansible for managing Google Cloud Platform resources
Write Ansible playbooks using YAML syntax and best practices
Manage inventory files for different environments and cloud deployments
Implement Ansible roles for modular and reusable configuration management
Apply Ansible security practices including Ansible Vault for secrets management
Integrate Ansible with Terraform workflows for complete infrastructure automation
Troubleshoot common Ansible issues and connectivity problems
Design scalable Ansible architectures for production environments
Prerequisites: Understanding of basic Linux administration, SSH connectivity, and familiarity with YAML syntax. Knowledge of Terraform from previous chapters is recommended.
Chapter Focus: This chapter focuses on Ansible for configuration management with practical examples using Google Cloud Platform infrastructure provisioned by Terraform.
What is Configuration Management?
Configuration management is the practice of handling changes to a system in a way that maintains integrity over time. In the context of infrastructure, it ensures that servers and applications are configured consistently and remain in their desired state.
The Configuration Management Problem
After infrastructure is provisioned (using tools like Terraform), you need to:
Install and configure software packages
Set up application services and dependencies
Configure security settings and user accounts
Deploy applications and manage their lifecycle
Ensure configurations remain consistent over time
Without Configuration Management:
# Manual approach - error-prone and not scalable
ssh user@server1 "sudo apt update && sudo apt install nginx"
ssh user@server2 "sudo apt update && sudo apt install nginx"
ssh user@server3 "sudo apt update && sudo apt install nginx"
# Different results on each server:
# - Server1: nginx 1.18.0, different config
# - Server2: nginx 1.20.1, default config
# - Server3: nginx failed to install
With Ansible Configuration Management:
# Declarative approach - consistent and scalable
- name: Configure web servers
  hosts: web_servers
  become: true
  tasks:
    - name: Install Nginx
      apt:
        name: nginx
        state: present
        update_cache: yes
    - name: Configure Nginx
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: restart nginx
  # A handler named in "notify" must be defined, or the play fails
  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
# Result: All servers have identical configuration
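The declarative approach works because each task describes a desired state and only acts when the system differs from it. The following is a minimal Python sketch of that idea, not Ansible's actual implementation; the function name `ensure_line` and the file contents are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state: do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "nginx.conf"  # hypothetical file, not a real nginx config
    first_run = ensure_line(conf, "worker_processes 4;")
    second_run = ensure_line(conf, "worker_processes 4;")
    final_lines = conf.read_text().splitlines()

# The first call changes the file, the second call finds nothing to do.
print(first_run, second_run, final_lines)
```

Running the same function twice leaves the file with a single copy of the line, which is the property that makes rerunning a playbook safe.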
What is Ansible?
Ansible is an open-source automation tool for software provisioning, configuration management, and application deployment. It was created by Michael DeHaan in 2012 and acquired by Red Hat in 2015.
Core Ansible Principles:
Simple: Uses YAML syntax that’s easy to read and write
Agentless: No agents to install or manage on target systems
Powerful: Can manage everything from packages to complex deployments
Flexible: Works with any system that can run Python and accept SSH connections
Efficient: Push-based model with parallel execution
Key Ansible Characteristics:
| Characteristic | Description |
|---|---|
| Communication | SSH for Linux/Unix, WinRM for Windows |
| Language | YAML for playbooks, Python for modules |
| Architecture | Push-based (control node pushes) |
| State | Stateless (no central database) |
| Execution | Tasks run sequentially in the order written; each task runs on several hosts in parallel |
| Idempotency | Tasks can be run multiple times safely |
Ansible vs Other Configuration Management Tools:
| Feature | Ansible | Chef | Puppet | Salt |
|---|---|---|---|---|
| Agent Required | No | Yes | Yes | Yes |
| Language | YAML | Ruby | Puppet DSL | YAML/Python |
| Architecture | Push | Pull | Pull | Push/Pull |
| Learning Curve | Easy | Steep | Moderate | Moderate |
| Setup Time | Minutes | Hours | Hours | Hours |
Note
For a detailed comparison between Ansible and Puppet, including architectural differences, configuration syntax examples, and decision criteria, see Chapter 10.0: Infrastructure as Code Introduction, section “Configuration Management: Ansible vs Puppet”.
Ansible Architecture and Components
Ansible Control Node
The machine where Ansible is installed and from which automation is executed:
Control Node (Your laptop/CI server)
├── Ansible Core Engine
├── Inventory Files (defines target hosts)
├── Playbooks (automation scripts)
├── Roles (reusable automation)
└── Configuration (ansible.cfg)
Managed Nodes
The target machines that Ansible manages:
Managed Nodes (GCP Compute Engine instances)
├── SSH Server (for connectivity)
├── Python (for module execution)
└── Target Applications/Services
Core Ansible Components:
Inventory: Defines which machines to manage
[web_servers]
web1.example.com
web2.example.com
[databases]
db1.example.com
[production:children]
web_servers
databases
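To make the `:children` relationship concrete, here is a small Python sketch that parses the inventory above and expands the `production` group into its hosts. It is a simplified illustration, not Ansible's own parser: it handles only plain host lines and `:children` sections, and the function names are hypothetical. On a real control node, `ansible-inventory --graph` shows the same hierarchy.

```python
from collections import defaultdict

INVENTORY = """\
[web_servers]
web1.example.com
web2.example.com

[databases]
db1.example.com

[production:children]
web_servers
databases
"""


def parse_inventory(text: str):
    """Return (hosts_by_group, children_by_group) for a simple INI inventory."""
    hosts, children = defaultdict(list), defaultdict(list)
    section, is_children = None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            name, _, suffix = line[1:-1].partition(":")
            # :vars sections are skipped in this sketch
            section = None if suffix == "vars" else name
            is_children = suffix == "children"
            continue
        if section is None:
            continue  # ungrouped hosts are ignored in this sketch
        target = children if is_children else hosts
        target[section].append(line.split()[0])  # drop any inline host variables
    return hosts, children


def resolve(group: str, hosts, children) -> list:
    """Expand a group into its hosts, following :children references."""
    found = list(hosts.get(group, []))
    for child in children.get(group, []):
        found.extend(h for h in resolve(child, hosts, children) if h not in found)
    return found


hosts, children = parse_inventory(INVENTORY)
print(resolve("production", hosts, children))
```

Targeting `production` in a playbook or ad hoc command therefore reaches every host in both child groups.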
Playbooks: YAML files that define automation workflows
---
- name: Configure web servers
  hosts: web_servers
  become: true
  tasks:
    - name: Install packages
      apt:
        name: "{{ item }}"
        state: present
      loop:
        - nginx
        - git
        - curl
Modules: Reusable units of code that perform specific tasks
- name: Manage files
  file:
    path: /var/www/html
    state: directory
    owner: www-data
    group: www-data
    mode: '0755'
Roles: Organized collections of playbooks, variables, and files
roles/
└── webserver/
    ├── tasks/main.yml      # Main task list
    ├── handlers/main.yml   # Event handlers
    ├── templates/          # Jinja2 templates
    ├── files/              # Static files
    ├── vars/main.yml       # Role variables
    └── defaults/main.yml   # Default variables
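The layout above can be scaffolded by hand or with a script. The sketch below creates the same directories and empty `main.yml` files in a temporary directory; the helper name `create_role` is hypothetical. In practice, `ansible-galaxy init webserver` generates a similar skeleton (with a few extra directories).

```python
import tempfile
from pathlib import Path

# Files and directories taken from the role layout shown above
ROLE_FILES = [
    "tasks/main.yml",
    "handlers/main.yml",
    "vars/main.yml",
    "defaults/main.yml",
]
ROLE_DIRS = ["templates", "files"]


def create_role(base: Path, name: str) -> Path:
    """Create roles/<name>/ with the standard subdirectories; safe to rerun."""
    role = base / "roles" / name
    for rel in ROLE_FILES:
        target = role / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.write_text("---\n")  # empty YAML document
    for rel in ROLE_DIRS:
        (role / rel).mkdir(parents=True, exist_ok=True)
    return role


with tempfile.TemporaryDirectory() as tmp:
    role = create_role(Path(tmp), "webserver")
    created = sorted(p.relative_to(role).as_posix() for p in role.rglob("*"))

print(created)
```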
Ansible in the Modern DevOps Toolchain
Ansible fits into the broader DevOps ecosystem as the configuration management layer:
DevOps Toolchain Integration:
1. Version Control (Git)
├── Infrastructure code (Terraform)
├── Configuration code (Ansible)
└── Application code
2. CI/CD Pipeline
├── Test infrastructure code
├── Apply infrastructure changes (Terraform)
├── Configure infrastructure (Ansible)
└── Deploy applications (Ansible)
3. Monitoring & Observability
├── Infrastructure monitoring
├── Application monitoring
└── Configuration drift detection
Typical Workflow: Terraform + Ansible
# 1. Developer commits infrastructure changes
git add infrastructure/ configuration/
git commit -m "Add load balancer and update web server config"
git push origin main
# 2. CI/CD pipeline triggers
# 2a. Provision infrastructure
terraform init
terraform plan
terraform apply
# 2b. Configure infrastructure
ansible-playbook -i gcp_inventory.yml site.yml
# 2c. Deploy applications
ansible-playbook -i gcp_inventory.yml deploy.yml
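The hand-off between steps 2a and 2b depends on Ansible learning which hosts Terraform created. The workflow above uses an inventory file named `gcp_inventory.yml`; as one alternative illustration, the Python sketch below turns data shaped like `terraform output -json` into a static INI inventory. The output name `web_server_ips`, the addresses, and the function name are assumptions made for this example.

```python
import json

# Sample shaped like `terraform output -json`; the output name
# "web_server_ips" and the addresses are hypothetical.
TERRAFORM_OUTPUT = json.loads("""
{
  "web_server_ips": {
    "sensitive": false,
    "type": ["list", "string"],
    "value": ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
  }
}
""")


def render_inventory(tf_output: dict, output_name: str, group: str, user: str) -> str:
    """Render one inventory group from a Terraform list output."""
    lines = [f"[{group}]"]
    for index, ip in enumerate(tf_output[output_name]["value"], start=1):
        lines.append(f"web-{index} ansible_host={ip} ansible_user={user}")
    return "\n".join(lines) + "\n"


inventory_text = render_inventory(TERRAFORM_OUTPUT, "web_server_ips", "web_servers", "ubuntu")
print(inventory_text)
```

A generated file like this is simple to inspect and version, while a dynamic inventory plugin avoids the regeneration step whenever instances change.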
Example: Complete Web Application Setup
Step 1: Terraform provisions the infrastructure
# Create GCP compute instances
resource "google_compute_instance" "web_servers" {
  count        = 3
  name         = "web-${count.index + 1}"
  machine_type = "e2-standard-2"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
    }
  }

  network_interface {
    network = google_compute_network.main.id
    access_config {} # Assigns an ephemeral public IP
  }

  metadata = {
    # file() does not expand "~", so wrap the path in pathexpand()
    ssh-keys = "ubuntu:${file(pathexpand("~/.ssh/id_rsa.pub"))}"
  }
}
Step 2: Ansible configures the servers
---
- name: Configure web application stack
  hosts: web_servers
  become: true
  roles:
    - common       # Basic server setup
    - security     # Security hardening
    - nginx        # Web server configuration
    - application  # App deployment
    - monitoring   # Observability setup
Step 3: Ansible deploys and manages applications
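- name: Deploy web application
  hosts: web_servers
  become: true  # /var/www/html is not writable by the login user
  tasks:
    - name: Pull latest application code
      git:
        repo: https://github.com/company/webapp.git
        dest: /var/www/html
        version: "{{ app_version | default('main') }}"
      notify: restart nginx
    - name: Install application dependencies
      pip:
        requirements: /var/www/html/requirements.txt
        virtualenv: /var/www/html/venv
    - name: Update application configuration
      template:
        src: app_config.py.j2
        dest: /var/www/html/config.py
        owner: www-data
        group: www-data
        mode: '0644'
      notify: restart application
  # Handlers named in "notify" must be defined, or the play fails
  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
    - name: restart application
      service:
        name: myapp  # assumed service name; replace with your application's unit
        state: restarted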
Ansible Use Cases and When to Use It
Primary Ansible Use Cases:
Configuration Management
Installing and configuring software packages
Managing system configurations and settings
Ensuring configuration consistency across environments
Application Deployment
Deploying web applications and services
Managing application lifecycle (start, stop, restart)
Rolling updates and blue-green deployments
Orchestration
Complex multi-step procedures
Coordinating actions across multiple systems
Managing dependencies between services
Continuous Delivery
Automated deployment pipelines
Integration with CI/CD systems
Environment promotion workflows
When to Use Ansible:
| Use Ansible When | Example Scenarios |
|---|---|
| Infrastructure | |
| Application Deployment | |
| Operational Tasks | |
| Multi-step Workflows | |
When NOT to Use Ansible:
| Don’t Use Ansible For | Use Instead |
|---|---|
| Infrastructure Provisioning | Terraform |
| Real-time Monitoring | |
| Container Orchestration | |
Ansible Installation and Setup
Installation Options:
# Option 1: Using pip (recommended)
pip3 install ansible
# Option 2: Using package manager (Ubuntu/Debian)
sudo apt update
sudo apt install ansible
# Option 3: Using package manager (macOS)
brew install ansible
# Verify installation
ansible --version
ansible-playbook --version
Initial Configuration:
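# Create a project-level Ansible configuration file.
# Ansible reads ./ansible.cfg or ~/.ansible.cfg; it does not read ~/.ansible/ansible.cfg.
cat > ansible.cfg << 'EOF'
[defaults]
inventory = ./inventory
remote_user = ubuntu
private_key_file = ~/.ssh/id_rsa
# Disabling host key checking suits short-lived lab VMs; keep it enabled in production
host_key_checking = False
timeout = 30

[privilege_escalation]
become = True
become_method = sudo
become_user = root
EOF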
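Where the configuration file lives matters, because Ansible uses only the first file it finds: the path in the `ANSIBLE_CONFIG` environment variable, then `ansible.cfg` in the current directory, then `~/.ansible.cfg`, then `/etc/ansible/ansible.cfg`. The Python sketch below mirrors that lookup order for illustration; the function names are hypothetical, and `ansible --version` reports the file actually in use.

```python
import os
from pathlib import Path


def candidate_configs(env: dict, cwd: Path, home: Path) -> list:
    """Return config locations in the order Ansible checks them."""
    candidates = []
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(Path(env["ANSIBLE_CONFIG"]))
    candidates += [
        cwd / "ansible.cfg",
        home / ".ansible.cfg",
        Path("/etc/ansible/ansible.cfg"),
    ]
    return candidates


def active_config(env: dict, cwd: Path, home: Path):
    """First existing candidate wins; later files are not merged in."""
    for path in candidate_configs(env, cwd, home):
        if path.is_file():
            return path
    return None


# Prints the file this sketch would pick on the current machine (or None)
print(active_config(dict(os.environ), Path.cwd(), Path.home()))
```

Because files are not merged, a project-level `ansible.cfg` completely replaces any settings in `~/.ansible.cfg` while you work in that directory.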
Google Cloud Platform Integration:
# Install GCP collection
ansible-galaxy collection install google.cloud
# Install required Python libraries
pip3 install requests google-auth
Testing Ansible Installation:
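# Test with localhost (ansible_connection=local runs modules directly instead of connecting over SSH)
echo "localhost ansible_connection=local" > inventory
ansible localhost -m ping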
# Expected output:
localhost | SUCCESS => {
"changed": false,
"ping": "pong"
}
Ansible Ad Hoc Commands
Ad hoc commands are one-time commands you run against your hosts without writing a playbook. They’re perfect for quick tasks, troubleshooting, and getting immediate results.
Ad Hoc Command Syntax:
ansible <host-pattern> -m <module-name> -a "<module-arguments>"
# Basic structure:
# ansible: The command
# host-pattern: Which hosts to target
# -m: Module to use
# -a: Module arguments
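When ad hoc commands are launched from scripts, it can help to assemble them as an argument list rather than a single string. The Python sketch below builds the same structure as the syntax above; the helper name `adhoc_command` is hypothetical, and only flags that appear in this chapter (`-m`, `-a`, `--become`, `-f`) are used.

```python
import shlex


def adhoc_command(pattern, module, args="", become=False, forks=None):
    """Build the argv list for: ansible <pattern> -m <module> -a "<args>"."""
    argv = ["ansible", pattern, "-m", module]
    if args:
        argv += ["-a", args]
    if become:
        argv.append("--become")
    if forks is not None:
        argv += ["-f", str(forks)]
    return argv


cmd = adhoc_command("web_servers", "apt", "name=nginx state=present", become=True)
# shlex.join renders the list as a copy-pasteable shell command
print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run(cmd)` avoids shell quoting problems with patterns such as `web_servers[0]`.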
Basic Ad Hoc Commands
1. Connectivity and System Information:
# Test connectivity to all hosts
ansible all -m ping
# Check uptime on web servers
ansible web_servers -m command -a "uptime"
# Get system facts (detailed system information)
ansible all -m setup
# Get specific fact (like IP address)
ansible all -m setup -a "filter=ansible_default_ipv4"
# Check disk space
ansible all -m command -a "df -h"
# Check memory usage
ansible all -m command -a "free -h"
# Get OS information
ansible all -m setup -a "filter=ansible_distribution*"
2. Package Management:
# Update package cache (Ubuntu/Debian)
ansible all -m apt -a "update_cache=yes" --become
# Install a package
ansible web_servers -m apt -a "name=nginx state=present" --become
# Install multiple packages
ansible all -m apt -a "name=htop,vim,curl state=present" --become
# Remove a package
ansible all -m apt -a "name=apache2 state=absent" --become
# Upgrade all packages
ansible all -m apt -a "upgrade=dist" --become
# Check if a package is installed (pipes need the shell module; command does not run through a shell)
ansible all -m shell -a "dpkg -l | grep nginx"
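The `command` module executes its arguments directly rather than through a shell, so pipes, redirects, and globs are only interpreted when the `shell` module is used. The Python sketch below shows the same distinction with `subprocess`; it is an analogy rather than Ansible's code, and it assumes a POSIX system with `echo` and `tr` available.

```python
import shlex
import subprocess

PIPELINE = "echo nginx apache2 | tr a-z A-Z"

# Without a shell: the string is split into arguments, so "|" and "tr"
# are handed to echo as literal text (comparable to the command module).
no_shell = subprocess.run(
    shlex.split(PIPELINE), capture_output=True, text=True, check=True
).stdout.strip()

# With a shell: the pipe is interpreted and tr receives echo's output
# (comparable to the shell module).
with_shell = subprocess.run(
    PIPELINE, shell=True, capture_output=True, text=True, check=True
).stdout.strip()

print(repr(no_shell))
print(repr(with_shell))
```

Prefer `command` when no shell features are needed, since it avoids shell quoting and injection pitfalls, and switch to `shell` only for pipelines like the one above.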
3. Service Management:
# Start a service
ansible web_servers -m service -a "name=nginx state=started" --become
# Stop a service
ansible web_servers -m service -a "name=apache2 state=stopped" --become
# Restart a service
ansible web_servers -m service -a "name=nginx state=restarted" --become
# Enable service at boot
ansible web_servers -m service -a "name=nginx enabled=yes" --become
# Check service status
ansible all -m command -a "systemctl status nginx"
# List all running services
ansible all -m command -a "systemctl list-units --type=service --state=running"
4. File Operations:
# Create a directory
ansible all -m file -a "path=/opt/myapp state=directory mode=0755" --become
# Create a file with content
ansible all -m copy -a "content='Hello World' dest=/tmp/hello.txt"
# Copy a local file to remote hosts
ansible all -m copy -a "src=./config.txt dest=/etc/myapp/config.txt backup=yes" --become
# Change file ownership
ansible all -m file -a "path=/var/www/html owner=www-data group=www-data" --become
# Change file permissions
ansible all -m file -a "path=/opt/scripts/backup.sh mode=0755"
# Remove a file
ansible all -m file -a "path=/tmp/oldfile.txt state=absent"
# Create a symbolic link
ansible all -m file -a "src=/opt/app/current dest=/opt/app/latest state=link"
# Check if file exists
ansible all -m stat -a "path=/etc/nginx/nginx.conf"
5. User Management:
# Create a user
ansible all -m user -a "name=deployuser shell=/bin/bash" --become
# Create user with specific UID and home directory (create_home is the current name of the createhome option)
ansible all -m user -a "name=appuser uid=1001 home=/opt/appuser create_home=yes" --become
# Add user to sudo group
ansible all -m user -a "name=deployuser groups=sudo append=yes" --become
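# Set user password (hashed with SHA-512 before it is sent; avoid typing real passwords here, since the command is saved in shell history)
ansible all -m user -a "name=deployuser password={{ 'mypassword' | password_hash('sha512') }}" --become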
# Remove a user
ansible all -m user -a "name=olduser state=absent remove=yes" --become
# Lock a user account
ansible all -m user -a "name=tempuser password_lock=yes" --become
6. SSH Key Management:
# Add SSH public key to user
ansible all -m authorized_key -a "user=ubuntu key='{{ lookup('file', '~/.ssh/id_rsa.pub') }}'"
# Add SSH key from URL
ansible all -m authorized_key -a "user=deployuser key=https://github.com/username.keys"
# Remove SSH key
ansible all -m authorized_key -a "user=ubuntu key='ssh-rsa AAAA...' state=absent"
7. Process Management:
# List running processes
ansible all -m command -a "ps aux"
# Find processes by name
ansible all -m command -a "pgrep -f nginx"
# Kill a process by PID
ansible all -m command -a "kill -9 1234" --become
# Kill processes by name
ansible all -m command -a "pkill -f 'old-service'" --become
8. Network Operations:
# Test network connectivity
ansible all -m command -a "ping -c 4 google.com"
# Check listening TCP ports (ss ships with Ubuntu; netstat requires the net-tools package)
ansible all -m command -a "ss -tlnp" --become
# Check network interfaces
ansible all -m command -a "ip addr show"
# Download a file
ansible all -m get_url -a "url=https://releases.ubuntu.com/22.04/ubuntu-22.04.3-live-server-amd64.iso dest=/tmp/"
9. System Monitoring and Logs:
# Check system load
ansible all -m command -a "cat /proc/loadavg"
# View last few lines of log file
ansible all -m command -a "tail -n 20 /var/log/syslog"
# Check journal logs
ansible all -m command -a "journalctl -n 10"
# Find large files
ansible all -m command -a "find /var/log -type f -size +100M"
# Check CPU info (uses the shell module because of the pipes)
ansible all -m shell -a "cat /proc/cpuinfo | grep 'model name' | head -1"
10. Archive and Compression:
# Create tar archive
ansible all -m archive -a "path=/var/www/html dest=/tmp/website-backup.tar.gz"
# Extract archive
ansible all -m unarchive -a "src=/tmp/backup.tar.gz dest=/opt/ remote_src=yes"
# Download and extract from URL
ansible all -m unarchive -a "src=https://example.com/app.tar.gz dest=/opt/ remote_src=yes"
Advanced Ad Hoc Patterns
1. Using Variables:
# Use extra variables
ansible web_servers -m template -a "src=nginx.conf.j2 dest=/etc/nginx/nginx.conf" \
--extra-vars "server_name=myapp.com worker_processes=4" --become
# Use variables from file
ansible all -m debug -a "var=ansible_hostname" --extra-vars "@vars.yml"
2. Conditional Execution:
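# Restrict a run to part of the inventory.
# --limit takes host patterns (hosts, groups, wildcards), not fact expressions
# such as ansible_distribution=='Ubuntu'; fact-based conditions need a playbook task with "when:".
ansible all -m apt -a "name=htop state=present" \
--limit web_servers --become
# Run on specific hosts (quote the pattern so the shell does not treat [0] as a glob)
ansible 'web_servers[0]' -m service -a "name=nginx state=restarted" --become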
3. Parallel Execution:
# Run on 10 hosts in parallel (default is 5)
ansible all -m ping -f 10
# Run with an increased connection timeout (-T sets the connection timeout in seconds, not a task time limit)
ansible all -m command -a "sleep 30" -T 60
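The forks setting controls how many hosts are worked on at once, while tasks themselves still run one after another. The Python sketch below simulates that behavior with a thread pool; it is a simplified model of the default execution order, and the host names, task names, and `run_task` stand-in are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

HOSTS = [f"web{i}.example.com" for i in range(1, 8)]  # hypothetical hosts
TASKS = ["Install Nginx", "Configure Nginx"]


def run_task(task: str, host: str) -> str:
    # Stand-in for connecting over SSH and running a module.
    return f"{host}: ok ({task})"


def run_play(hosts, tasks, forks=5):
    """Run each task on all hosts, at most `forks` hosts at a time."""
    log = []
    with ThreadPoolExecutor(max_workers=forks) as pool:
        for task in tasks:
            # Consuming map() blocks until every host has finished this task,
            # so the next task starts only afterwards.
            log.extend(pool.map(lambda host: run_task(task, host), hosts))
    return log


results = run_play(HOSTS, TASKS, forks=5)
print(len(results), results[0], results[-1])
```

With seven hosts and five forks, the first task runs on five hosts while two wait for a free worker, and only when all seven are done does the second task begin.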
4. Output Formatting:
# One line output
ansible all -m ping --one-line
# Tree format output
ansible all -m setup --tree /tmp/facts
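# JSON output (the default output prefixes each result with "host | SUCCESS =>", which jq cannot parse;
# this assumes the json stdout callback plugin is available on the control node)
ANSIBLE_LOAD_CALLBACK_PLUGINS=1 ANSIBLE_STDOUT_CALLBACK=json ansible all -m setup | jq '.'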
5. Dry Run and Check Mode:
# Check what would change (dry run)
ansible all -m apt -a "name=nginx state=present" --check --become
# Show differences
ansible all -m copy -a "src=config.txt dest=/etc/app/config.txt" --check --diff --become
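Check mode and diff mode follow a simple pattern: compute what would change, report it, and skip the write. The Python sketch below illustrates that pattern for a single file; it is not how Ansible modules are implemented, and the function name `ensure_content` and the file contents are hypothetical.

```python
import difflib
import tempfile
from pathlib import Path


def ensure_content(path: Path, desired: str, check: bool = False):
    """Return (changed, diff). With check=True, report but never write."""
    current = path.read_text() if path.exists() else ""
    if current == desired:
        return False, ""
    diff = "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            desired.splitlines(keepends=True),
            fromfile="before",
            tofile="after",
        )
    )
    if not check:
        path.write_text(desired)
    return True, diff


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.txt"
    target.write_text("port=80\n")
    # Dry run: reports a change and a diff, but leaves the file alone
    dry_changed, dry_diff = ensure_content(target, "port=8080\n", check=True)
    after_dry_run = target.read_text()
    # Real run: applies the change
    real_changed, _ = ensure_content(target, "port=8080\n")
    after_real_run = target.read_text()

print(dry_changed, repr(after_dry_run), real_changed, repr(after_real_run))
print(dry_diff)
```

Not every module can predict its result without running, so check mode output is best treated as a preview rather than a guarantee.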
Practical Ad Hoc Workflows
Quick Server Health Check:
#!/bin/bash
# health_check.sh - Quick server health assessment
echo "=== Connectivity Check ==="
ansible all -m ping --one-line
echo -e "\n=== Disk Space ==="
ansible all -m command -a "df -h /" --one-line
echo -e "\n=== Memory Usage ==="
ansible all -m command -a "free -h" --one-line
echo -e "\n=== Load Average ==="
ansible all -m command -a "uptime" --one-line
echo -e "\n=== Service Status ==="
ansible web_servers -m command -a "systemctl is-active nginx" --one-line
Emergency Response Commands:
# Stop all web services immediately
ansible web_servers -m service -a "name=nginx state=stopped" --become -f 20
# Clear cache across all servers (shell module is required for "&&" and the redirect)
ansible all -m shell -a "sync && echo 3 > /proc/sys/vm/drop_caches" --become
# Restart all application servers
ansible app_servers -m service -a "name=myapp state=restarted" --become
# Check for security updates (shell module is required for the pipe)
ansible all -m shell -a "apt list --upgradable | grep -i security"
Log Collection:
# Collect error logs from all servers
ansible all -m fetch -a "src=/var/log/nginx/error.log dest=/tmp/logs/{{ inventory_hostname }}/ flat=yes"
# Search for errors in logs (shell module is required for the pipe)
ansible all -m shell -a "grep -i error /var/log/syslog | tail -10"
# Check log file sizes (shell module is required to expand the * glob)
ansible all -m shell -a "du -sh /var/log/*" --one-line
Troubleshooting Ad Hoc Commands
Common Issues and Solutions:
# SSH connection issues
ansible all -m ping -vvv # Verbose output for debugging
# Permission issues
ansible all -m command -a "whoami" # Check current user
ansible all -m command -a "whoami" --become # Check privilege escalation (should print root)
# Python issues
ansible all -m command -a "which python3" # Check Python location
# Module not found
ansible all -m setup -a "filter=ansible_python*" # Check Python interpreter
Note
Ad hoc commands are powerful for immediate tasks, but for complex or repeated operations, consider writing playbooks. Playbooks provide better documentation, error handling, and reusability.
Note
The examples in this section assume you have infrastructure provisioned by Terraform from the previous chapters. If you haven’t completed the Terraform chapters, you can still follow along by manually creating GCP compute instances for practice.
Next Steps:
Continue to Chapter 10.7: Ansible Core Concepts to start writing your first playbooks and learn essential Ansible patterns for configuration management.