🚀 Executive Summary
TL;DR: Many organizations face ‘automation debt’ due to chaotic, unversioned scripts, hindering scalability and auditability. This guide outlines a three-stage DevOps automation journey, from ad-hoc scripting to a streamlined GitOps workflow, to help engineers escape ‘script hell’ and build robust, auditable systems.
🎯 Key Takeaways
- The three stages of DevOps automation are: Wild West (ad-hoc scripting), Framework (centralized & repeatable with CI/CD and config management), and Operating System (declarative GitOps with IaC and GitOps agents).
- Automation debt arises from prioritizing urgent, short-term fixes (like quick bash scripts) over important, long-term solutions, leading to unversioned and untrustworthy scripts.
- Transitioning effectively requires sequential progression: first, version control all scripts and introduce CI/CD (Stage 2), then gradually adopt Infrastructure as Code and GitOps agents for declarative state management (Stage 3).
A senior engineer’s guide to navigating the three stages of DevOps automation, from chaotic scripts to a streamlined GitOps workflow, and how to escape the “good enough” trap.
The Three Stages of DevOps Automation (And How to Escape Script Hell)
I remember my first week at a previous gig. My manager, a guy who lived on coffee and stress, told me to deploy the latest marketing service. “The script is on the ops share,” he said, waving vaguely towards the server room. I navigated to \\ops-shared\critical_scripts and my blood ran cold. It was a digital graveyard of good intentions: deploy.sh, deploy_new.sh, deploy_v2_final.sh, and my personal favorite, DO_NOT_USE_deploy_old.sh. That day, I learned the difference between having scripts and having automation. One is a collection of tools; the other is a system. We’ve all been there, and if you’re there now, let’s talk about how to get out.
The “Why”: The Automation Debt Spiral
Why do we end up with folders full of undeclared, unversioned, and untrustworthy scripts? It’s not because we’re bad engineers. It’s because we’re firefighters. A production server goes down at 3 AM. You write a quick bash script to check a process and restart it. It works. The fire is out. You save it as fix_prod_db_01.sh and move on to the next blaze. You’ve just taken on “automation debt.” It solved an immediate, urgent problem, but it wasn’t an important, long-term solution. Multiply that by a dozen engineers over three years, and you get the critical_scripts folder. The root cause is prioritizing the urgent over the important.
Stage 1: The Wild West (Ad-Hoc Scripting)
This is where everyone starts. It’s characterized by individual scripts, written to solve a specific problem, often living on an engineer’s laptop or a shared drive. This is the “just get it done” phase.
What it looks like:
- A collection of `.sh` or `.ps1` files.
- Execution is manual. You SSH into `prod-worker-03` and run `./run_cleanup.sh`.
- There is no source control. The “latest version” is the one you edited last.
- Knowledge is tribal. Only Janet knows why `update_config_special.sh` needs to be run with the `--force-legacy` flag.
Here’s a classic example of a “Stage 1” deployment script. It’s hacky, but we’ve all written one.
```bash
#!/bin/bash
# DEPLOY SCRIPT - DO NOT EDIT WITHOUT TALKING TO DAVE
echo "Connecting to production server..."
ssh user@prod-api-01 << 'ENDSSH'
echo "Stopping service..."
sudo systemctl stop my-api-service
echo "Pulling latest code from main..."
cd /var/www/my-api
git pull origin main
echo "Installing dependencies..."
npm install --production
echo "Restarting service..."
sudo systemctl start my-api-service
echo "Deployment complete on prod-api-01!"
ENDSSH
```
Warning: This stage is a ticking time bomb. It’s not scalable, it’s not auditable, and it’s one “fat finger” away from a production outage. The goal is to escape this stage as quickly as possible.
Stage 2: The Framework (Centralized & Repeatable)
This is the most critical leap in your automation journey. You stop thinking in terms of individual scripts and start thinking in terms of repeatable processes. The goal here is consistency and control, not perfection.
How to get there:
- Version Control Everything: The first step is non-negotiable. Create a Git repository (e.g., `ops-automation`) and commit every single script. Now you have history, accountability, and the ability to review changes.
- Introduce a CI/CD Tool: Use Jenkins, GitLab CI, or GitHub Actions as your single point of execution. No more manual SSH sessions. The CI tool pulls the repo and runs the script. This gives you an audit log and control over who can run what.
- Adopt a Configuration Management Tool: Instead of raw bash scripts, start using a tool like Ansible or Puppet. This forces you to think declaratively (“ensure this package is installed”) instead of imperatively (“run apt-get install”).
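That first step is a ten-minute task, not a project. Here is a minimal sketch of the initial commit, assuming Git is installed; the repository path, script name, and commit identity are all illustrative:

```shell
set -e
# Hypothetical first step out of Stage 1: put one existing script under version control.
rm -rf /tmp/ops-automation
mkdir -p /tmp/ops-automation && cd /tmp/ops-automation
git init -q
git config user.name "ops"                 # local identity so the commit works anywhere
git config user.email "ops@example.com"
printf '#!/bin/bash\necho "deploying..."\n' > deploy.sh   # stand-in for a real script
chmod +x deploy.sh
git add deploy.sh
git commit -q -m "Import first script from the ops share"
git log --oneline                          # history and accountability start here
```

From here, every edit to the script is a diff someone can review, not a silent overwrite on a shared drive.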
Your “deployment” now becomes a CI/CD job that runs an Ansible playbook. The playbook itself is version-controlled.
```yaml
# ansible/playbooks/deploy-api.yml
---
- hosts: api_servers
  become: yes
  tasks:
    - name: Pull latest code from Git
      git:
        repo: 'git@github.com:your-org/my-api.git'
        dest: /var/www/my-api
        version: main
      notify: Restart API Service

    - name: Install NPM dependencies
      npm:
        path: /var/www/my-api
        state: present
        production: yes

  handlers:
    - name: Restart API Service
      service:
        name: my-api-service
        state: restarted
```
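The playbook still needs a controlled trigger. One possible sketch, assuming GitHub Actions with a self-hosted runner that has Ansible and SSH access to the hosts; the workflow name and inventory path are hypothetical:

```yaml
# .github/workflows/deploy-api.yml (hypothetical)
name: Deploy API
on:
  workflow_dispatch:        # deploys are deliberate, manually triggered runs
jobs:
  deploy:
    runs-on: self-hosted    # runner can reach the api_servers inventory group
    steps:
      - uses: actions/checkout@v4
      - name: Run the deployment playbook
        run: ansible-playbook -i inventory/production ansible/playbooks/deploy-api.yml
```

Every run now leaves a job log with who triggered it, when, and what happened, which is exactly the audit trail Stage 1 lacks.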
Stage 3: The Operating System (Declarative GitOps)
This is the promised land. In this stage, your Git repository isn’t just a place to store scripts; it is the desired state of your entire infrastructure. You don’t “run” anything. You declare the state you want in Git, and an automated system makes it a reality. This is how you retire manual operations for good.
What it looks like:
- Infrastructure as Code (IaC): Tools like Terraform or Pulumi define your servers, load balancers, and databases in code. A change to a server type is a pull request, not a panicked click in a cloud console.
- GitOps Agents: Tools like ArgoCD (for Kubernetes) or Flux constantly compare the live state of your environment to the desired state in your Git repo. If there’s a drift, they automatically correct it.
- The Human Role: Engineers don’t push buttons anymore. They write code, create pull requests, and get them reviewed. Merging to the main branch is the deployment.
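For a Kubernetes workload, the “GitOps agent” piece might look like the following ArgoCD Application; the repository URL, manifest path, and namespaces are illustrative:

```yaml
# argocd/apps/my-api.yaml (hypothetical)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/ops-automation.git
    targetRevision: main
    path: k8s/my-api            # manifests describing the desired state
  destination:
    server: https://kubernetes.default.svc
    namespace: my-api
  syncPolicy:
    automated:
      prune: true               # delete resources removed from Git
      selfHeal: true            # revert manual drift in the cluster
```

With `selfHeal` enabled, even a well-intentioned `kubectl edit` gets reverted: Git, not the cluster, is the source of truth.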
```hcl
# terraform/modules/s3/main.tf
# Defines a private S3 bucket for application logs
resource "aws_s3_bucket" "app_logs" {
  bucket = "techresolve-prod-app-logs"

  tags = {
    Name        = "Prod App Logs"
    Environment = "Production"
    ManagedBy   = "Terraform"
  }
}

resource "aws_s3_bucket_public_access_block" "app_logs_access" {
  bucket = aws_s3_bucket.app_logs.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```
Pro Tip: Don’t try to jump from Stage 1 to Stage 3 overnight. You’ll burn out. The path is sequential. Get your scripts into Git (Stage 2). Then, start carving off one piece of your infrastructure—like S3 buckets or security groups—and manage it with Terraform (the beginning of Stage 3).
Your Journey at a Glance
It’s easy to get lost, so here’s a quick comparison of the stages.
| Attribute | Stage 1: Wild West | Stage 2: Framework | Stage 3: GitOps |
|---|---|---|---|
| Trigger | Manual (SSH & Run) | CI/CD Job (e.g., Jenkins) | Git Commit / Merge |
| Source of Truth | Engineer’s memory | Git Repo for Scripts/Playbooks | Git Repo for Desired State |
| Audit Trail | None / Bash History | CI/CD Job Logs | Git History (Immutable) |
| Scalability | Very Low | Medium | Very High |
Wherever you are on this journey, the key is to recognize it and take the next small, concrete step forward. If your scripts are a mess, your first step isn’t to learn Terraform. It’s to `git init` a new repository and commit that first script. You’re not alone in this, and every single senior engineer has that deploy_v2_final_REAL.sh file somewhere in their past.
🤖 Frequently Asked Questions
❓ What are the three stages of DevOps automation?
The three stages are: Stage 1 (Wild West) characterized by ad-hoc, unversioned scripts; Stage 2 (Framework) involving centralized, repeatable processes with version control, CI/CD tools, and configuration management; and Stage 3 (Operating System) which is declarative GitOps using Infrastructure as Code and GitOps agents.
❓ How does a GitOps approach (Stage 3) differ from traditional CI/CD (Stage 2)?
In Stage 2, CI/CD tools execute version-controlled scripts or playbooks, acting as the trigger for deployments. In Stage 3 (GitOps), the Git repository itself is the desired state of the infrastructure, and GitOps agents automatically reconcile the live environment with this declared state, making a Git commit or merge the immutable deployment trigger.
❓ What is a common pitfall when trying to advance in the automation journey?
A common pitfall is attempting to jump directly from Stage 1 (Wild West) to Stage 3 (GitOps). The recommended approach is sequential: first, get scripts into version control and implement CI/CD (Stage 2), then gradually introduce IaC and GitOps for specific infrastructure components.