Automated Configuration Drift Detection and Remediation
Objective
The goal of this task is to detect and correct configuration drift in infrastructure managed via Terraform and Ansible.
Configuration drift occurs when resources are changed outside the IaC workflow (for example, by a manual edit in the AWS console), so the live infrastructure no longer matches what the code describes.
Architecture Overview
Tools Used
- Terraform – Infrastructure as Code (IaC) to provision and manage infrastructure.
- Ansible – Configuration management tool to maintain consistent state.
- AWS – Cloud provider for deploying resources (EC2, S3, RDS, etc.).
- Jenkins – Automates drift detection checks and remediation playbooks.
- Prometheus/Grafana – Monitors infrastructure metrics and triggers alerts for drift.
- GitHub/GitLab – Source control for Terraform configurations and Ansible playbooks.
- AWS Config – Monitors AWS resources for changes.
Implementation Steps
Step 1: Provision Infrastructure using Terraform
- Define AWS resources using Terraform (main.tf).
- Apply configuration using terraform apply.
- Store the Terraform state in an S3 bucket (remote backend) so that every Jenkins run plans against the same state.
Example Terraform Code (main.tf):
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  tags = {
    Name = "WebServer"
  }
}
Step 2: Define Configuration with Ansible
- Create an Ansible playbook to enforce configurations on instances.
Example Ansible Playbook (config-enforce.yml):
- name: Enforce Configuration on Web Server
  hosts: webserver
  become: yes
  tasks:
    - name: Ensure Apache is installed
      yum:
        name: httpd
        state: present

    - name: Ensure Apache service is running
      service:
        name: httpd
        state: started
        enabled: yes
Step 3: Detect Configuration Drift
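Method 1: Using Terraform Plan for Drift Detection
- Schedule a Jenkins job to run terraform plan -detailed-exitcode. With this flag, exit code 0 means no changes, 1 means the plan failed, and 2 means the plan contains changes.
- If terraform plan reports changes while no code change is pending, treat that as drift and trigger a remediation workflow.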
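The exit-code check described above can be sketched in Python for use in a wrapper script. This is a minimal illustration, not the pipeline's actual code: the function names and the status strings are assumptions, and check_drift expects terraform to be on the PATH with an initialized working directory.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_STATUS = {0: "in_sync", 1: "error", 2: "drift"}


def classify_plan_exit(returncode: int) -> str:
    """Map a terraform plan exit code to a drift status (names are illustrative)."""
    # Any unexpected code is treated as an error rather than silently ignored.
    return PLAN_STATUS.get(returncode, "error")


def check_drift(workdir: str = ".") -> str:
    """Run terraform plan in workdir and report the drift status."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

Keeping the classification separate from the subprocess call makes the decision logic testable without a Terraform installation.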
Method 2: Using Ansible Check Mode
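- Run Ansible in check mode (ansible-playbook --check --diff), which reports what would change without modifying the hosts.
- Compare the actual state vs. the desired state: any task reported as changed marks a setting that has drifted from the playbook.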
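One way to turn a check-mode run into a drift signal is to read the PLAY RECAP that ansible-playbook prints at the end of a run. The sketch below assumes the output was captured as plain text with color disabled; the function name hosts_with_drift is hypothetical. Note that tasks which do not support check mode (for example, most command and shell tasks) are skipped, so they cannot reveal drift this way.

```python
import re

# Matches one host line of the PLAY RECAP, e.g.
# "web1 : ok=3 changed=1 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def hosts_with_drift(output: str) -> dict[str, int]:
    """Return {host: changed_count} for hosts where check mode predicted changes."""
    drifted = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue
        # Split "ok=3 changed=1 ..." into a {"ok": "3", "changed": "1", ...} mapping.
        counters = dict(pair.split("=") for pair in match["counters"].split())
        changed = int(counters.get("changed", 0))
        if changed > 0:
            drifted[match["host"]] = changed
    return drifted
```

A non-empty result means at least one host differs from the playbook and is a candidate for remediation.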
Method 3: Using AWS Config
- Enable AWS Config to track resource changes.
- Create AWS Config rules to detect non-compliant instances.
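Once a Config rule exists, its non-compliant resources can be listed programmatically. The sketch below assumes a boto3 AWS Config client is passed in (created with boto3.client("config")) and uses its get_compliance_details_by_config_rule call; the function name and the rule name in the usage comment are hypothetical.

```python
def noncompliant_resources(config_client, rule_name: str) -> list[str]:
    """Return IDs of resources that AWS Config marks NON_COMPLIANT for a rule."""
    resource_ids = []
    kwargs = {"ConfigRuleName": rule_name, "ComplianceTypes": ["NON_COMPLIANT"]}
    while True:
        page = config_client.get_compliance_details_by_config_rule(**kwargs)
        for result in page.get("EvaluationResults", []):
            qualifier = result["EvaluationResultIdentifier"]["EvaluationResultQualifier"]
            resource_ids.append(qualifier["ResourceId"])
        # Follow pagination until AWS stops returning a NextToken.
        token = page.get("NextToken")
        if not token:
            return resource_ids
        kwargs["NextToken"] = token


# Usage (requires boto3 and AWS credentials; the rule name is made up):
#   import boto3
#   ids = noncompliant_resources(boto3.client("config"), "webserver-approved-ami")
```

Passing the client in as a parameter keeps the function independent of credentials and lets it be exercised with a stub.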
Step 4: Automate Drift Correction
- If drift is detected, automatically reapply Terraform and Ansible configurations.
- Use Jenkins to trigger terraform apply and ansible-playbook.
Jenkins Pipeline Script (Jenkinsfile)
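pipeline {
    agent any
    stages {
        stage('Check Terraform Drift') {
            steps {
                script {
                    sh 'terraform init -input=false'
                    // -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes (drift)
                    def rc = sh(
                        script: 'terraform plan -detailed-exitcode -input=false -out=tfplan',
                        returnStatus: true
                    )
                    if (rc == 1) {
                        error('terraform plan failed')
                    }
                    env.TF_DRIFT = (rc == 2) ? 'true' : 'false'
                }
            }
        }
        stage('Apply Terraform Changes') {
            // Apply only when the plan found drift, and apply exactly the saved plan.
            when { environment name: 'TF_DRIFT', value: 'true' }
            steps {
                sh 'terraform apply -input=false tfplan'
            }
        }
        stage('Enforce Configuration with Ansible') {
            // Runs after Terraform so that any recreated instance is configured again.
            steps {
                sh 'ansible-playbook -i inventory config-enforce.yml'
            }
        }
    }
}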
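Applying changes unattended is risky when remediation would destroy or replace a resource. A possible safeguard is to inspect the machine-readable plan before the apply step. The sketch below assumes the plan was saved with terraform plan -out=tfplan (a file name chosen for illustration) and converted with terraform show -json tfplan; the function name destructive_changes is hypothetical.

```python
import json


def destructive_changes(plan_json: str) -> list[str]:
    """List resource addresses that the plan would delete or replace."""
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement is reported as both "delete" and "create".
        if "delete" in actions:
            flagged.append(change["address"])
    return flagged
```

A pipeline could pause for manual approval whenever this list is non-empty and auto-apply only in-place updates.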
Step 5: Integrate Monitoring & Alerts
- Use Prometheus to monitor AWS instances and configuration drift.
- Set up Grafana dashboards to visualize drift data.
- Configure Prometheus Alertmanager to notify via Slack/Email.
Prometheus Alert Rule (prometheus.rules.yml)
groups:
  - name: drift_alerts
    rules:
      - alert: ConfigurationDriftDetected
        # Assumes the drift-check job publishes a custom gauge named
        # config_drift_detected (hypothetical name; 1 = drift found).
        expr: config_drift_detected == 1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Configuration Drift Detected on {{ $labels.instance }}"
          description: "Terraform or Ansible drift detected."
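Prometheus exporters do not report drift out of the box, so the drift-check job has to publish its own result. One option is to push a gauge to a Prometheus Pushgateway after each run. The sketch below is an assumption-laden illustration: the metric name config_drift_detected, the job name drift_check, the environment label, and the gateway URL are all made up for this example.

```python
import urllib.request


def render_drift_metric(drifted: bool, environment: str) -> str:
    """Render a single gauge in the Prometheus text exposition format."""
    value = 1 if drifted else 0
    return (
        "# HELP config_drift_detected 1 if the last drift check found drift.\n"
        "# TYPE config_drift_detected gauge\n"
        f'config_drift_detected{{environment="{environment}"}} {value}\n'
    )


def push_drift_metric(gateway_url: str, drifted: bool, environment: str) -> None:
    """PUT the gauge to a Pushgateway under the (hypothetical) job name drift_check."""
    request = urllib.request.Request(
        url=f"{gateway_url}/metrics/job/drift_check",
        data=render_drift_metric(drifted, environment).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


# Usage (assumes a Pushgateway reachable at this illustrative address):
#   push_drift_metric("http://pushgateway.example.internal:9091", True, "prod")
```

An alert rule can then fire on the pushed gauge, and the next clean run resets it to 0.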
Final Workflow
- Terraform provisions infrastructure.
- Ansible enforces configuration.
- Jenkins runs scheduled drift detection (terraform plan, ansible-playbook --check).
- If drift is detected:
  - Alert via Prometheus Alertmanager.
  - Auto-correct using Terraform and Ansible.
- Grafana dashboard provides visualization.
- AWS Config continuously monitors for changes.
Benefits
- Automated Drift Detection – Ensures consistency across infrastructure.
- Self-Healing Infrastructure – Automatically corrects drift.
- Real-time Monitoring & Alerts – Improves visibility into infrastructure changes.
- Integration with CI/CD – Drift checks and remediation run in the same Jenkins pipelines and from the same version-controlled code as regular changes.
Next Steps
- Deploy the solution in a test AWS environment.
- Set up Slack notifications for alerts.
- Extend to Kubernetes environments using Helm and Kustomize.
- Implement least-privilege IAM policies for the roles that run drift detection and remediation.