Executive Summary
TL;DR: The article addresses the problem of managing near-identical Infrastructure as Code (IaC) for different environments (Dev, Staging, Prod) without resorting to error-prone copy-pasting. It advocates for separating core infrastructure logic from environment-specific configurations to improve maintainability and prevent issues like accidental production data writes.
Key Takeaways
- Hardcoding environment-specific values in IaC is the root cause of duplication and maintenance issues; it treats code like a template document rather than reusable logic.
- The ‘Quick Fix’ uses variable-driven naming and the `count` meta-argument for conditional resource creation. It suits small projects but quickly becomes unmanageable as environments and resources multiply.
- Terraform Workspaces (or Terragrunt) is the recommended ‘Right Way’ solution, enabling deployment of the exact same code to different environments using separate state files and `.tfvars` for configuration.
- For large organizations, a ‘Module Power Play’ strategy involves creating generic, reusable modules in separate Git repositories, which are then consumed by environment-specific configurations.
- The core principle across all solutions is to decouple the ‘what to build’ (resource logic) from the ‘how to build it for this environment’ (configuration details like names, sizes, and IP ranges).
Tired of copy-pasting Terraform code for different environments? Learn how to manage near-identical IaC for Dev, Staging, and Prod without losing your sanity or creating a maintenance nightmare.
Stop Copying-Pasting! A Senior Engineer’s Guide to Managing Similar IaC
I still remember the pager alert at 2 AM. It was a Tuesday. A junior engineer, let’s call him Mark, had been tasked with spinning up a new staging environment. He did what seemed logical: he copied our production Terraform directory, did a find-and-replace for “prod” to “staging,” and ran terraform apply. What he missed was a hardcoded ARN for a critical IAM role that gave write access to a production S3 bucket. For about 15 minutes, our new, unstable staging app was gleefully writing garbage test data into our live customer data bucket. We caught it fast, but it was a cold-sweat moment. That, right there, is the price of the “copy-paste-and-pray” deployment strategy.
The Root of the Problem: Code Isn’t a Photocopier
When you find yourself duplicating entire folders of .tf files just to change a server name or a VPC CIDR block, it’s a symptom of a deeper issue. Your Infrastructure as Code (IaC) isn’t being treated like code; it’s being treated like a template document. The root cause is almost always hardcoding environment-specific values. You’ve coupled the logic of “what to build” (a VM, a database, a network) with the configuration of “how to build it for this specific environment” (the name, the size, the IP range).
The goal is to separate that logic from the configuration. The module that defines your web application stack should be blissfully unaware if it’s being deployed to dev-webapp-01 or prod-webapp-cluster-blue. It should only care about the variables it receives.
The Fixes: From Duct Tape to a New Engine
I’ve seen this problem solved a dozen different ways, but they generally fall into three categories. Let’s walk through them, from the quick fix to the long-term architectural solution.
Solution 1: The Quick Fix – Variable-Driven Naming & Count
Let’s say you’re in a single project and just need to deploy a dev and a prod version of the same resource without creating two separate directories. This is a common pattern in smaller projects or when you’re just starting out. You can use a combination of variables and the count meta-argument to conditionally create resources.
Imagine you have a single variable defining the environment:
# variables.tf
variable "environment" {
  type        = string
  description = "The deployment environment (e.g., 'dev' or 'prod')."
  default     = "dev"
}
Now, in your main configuration, you can use this to dynamically name resources and even change their properties:
# main.tf
resource "aws_instance" "app_server" {
  # Only create this instance if the environment is 'prod'
  count         = var.environment == "prod" ? 1 : 0
  ami           = "ami-0c55b159cbfafe1f0" # Example AMI
  instance_type = "t2.large" # Larger instance for prod

  tags = {
    Name = "prod-app-server-01"
  }
}

resource "aws_instance" "dev_server" {
  # Only create this instance if the environment is 'dev'
  count         = var.environment == "dev" ? 1 : 0
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro" # Smaller instance for dev

  tags = {
    Name = "dev-app-server-01"
  }
}
Is it hacky? A little. You’re cluttering your main file with conditional logic, and this gets unmanageable fast with more than two environments or many resources. But for a quick-and-dirty solution to avoid a second folder, it works.
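The same quick fix can also be collapsed into a single resource block, which is closer to the dynamic naming described above. This is only a minimal sketch: it assumes the `environment` variable from variables.tf, reuses the same example AMI, and introduces a hypothetical `instance_types` local that is not part of the original setup.

```hcl
# main.tf (alternative sketch)
locals {
  # Hypothetical lookup table: one instance size per environment.
  instance_types = {
    dev  = "t2.micro"
    prod = "t2.large"
  }
}

resource "aws_instance" "app_server" {
  ami           = "ami-0c55b159cbfafe1f0" # Example AMI
  instance_type = local.instance_types[var.environment]

  tags = {
    # The name is built from the variable instead of being hardcoded.
    Name = "${var.environment}-app-server-01"
  }
}
```

With this shape, adding a third environment means adding one map entry rather than a third resource block, and an environment that is missing from the map produces an error instead of silently creating nothing.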
Solution 2: The “Right Way” – Terraform Workspaces (or Terragrunt)
This is the solution you should be aiming for. Terraform Workspaces are designed specifically for this problem. A workspace is essentially a separate state file for the same configuration. This allows you to deploy the exact same code to different environments, each with its own state, while per-environment `.tfvars` files supply the values that differ.
Here’s how we run our projects at TechResolve:
1. Create workspaces:
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod
2. Use variable files (`.tfvars`) for each environment:
Create files like dev.tfvars and prod.tfvars.
# dev.tfvars
instance_type = "t2.micro"
instance_name = "dev-web-app-01"

# prod.tfvars
instance_type = "m5.large"
instance_name = "prod-web-app-01"
3. Your resource code becomes beautifully generic:
# main.tf - No more conditional logic!
variable "instance_type" { type = string }
variable "instance_name" { type = string }

resource "aws_instance" "web_app" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = var.instance_type

  tags = {
    Name = var.instance_name
  }
}
4. Deploy by selecting a workspace and specifying the var file:
terraform workspace select dev
terraform apply -var-file="dev.tfvars"
terraform workspace select prod
terraform apply -var-file="prod.tfvars"
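One risk with this workflow is pairing a workspace with the wrong var file. A possible guard, sketched here under the assumption of Terraform 1.2 or later (where `precondition` blocks are available), is to have each `.tfvars` file also set an `environment` value and compare it with the built-in `terraform.workspace` value. The `environment` variable is an addition for illustration and is not part of the setup above.

```hcl
# main.tf (sketch): fail the plan when the var file and workspace disagree
variable "environment" {
  type        = string
  description = "Set in each .tfvars file, e.g. environment = \"dev\"."
}

variable "instance_type" { type = string }
variable "instance_name" { type = string }

resource "aws_instance" "web_app" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = var.instance_type

  tags = {
    Name = var.instance_name
  }

  lifecycle {
    # Evaluated before the resource is planned or changed.
    precondition {
      condition     = var.environment == terraform.workspace
      error_message = "Var file is for '${var.environment}' but the selected workspace is '${terraform.workspace}'."
    }
  }
}
```

With this in place, running `terraform apply -var-file="prod.tfvars"` while the `dev` workspace is selected should stop with the error message rather than change anything.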
Pro Tip: For even more power, look into Terragrunt. It’s a thin wrapper around Terraform that helps you keep your configuration DRY (Don’t Repeat Yourself) by managing remote state and variable inputs in a much cleaner way, especially across many accounts or regions.
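To give a feel for the Terragrunt approach, here is a minimal sketch of a per-environment `terragrunt.hcl`. The directory layout, the `root.hcl` file name, and the module path are assumptions made for illustration; `include`, `terraform`, `inputs`, and `find_in_parent_folders` are standard Terragrunt constructs.

```hcl
# dev/terragrunt.hcl (hypothetical layout)

# Pull in shared settings (for example remote state) from a parent root.hcl.
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# Point at the shared Terraform code instead of copying it.
terraform {
  source = "../modules/webapp"
}

# Environment-specific values, passed to the module as input variables.
inputs = {
  instance_type = "t2.micro"
  instance_name = "dev-web-app-01"
}
```

A `prod/terragrunt.hcl` would look the same apart from the `inputs` block, so the only thing that differs between environments is configuration.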
Solution 3: The “Module Power Play” – A Git-Based Module Strategy
For large organizations or very complex components, we take it a step further. We create a generic, reusable module in its own Git repository, complete with its own versioning and tests.
Repo 1: `terraform-aws-webapp` (The Module)
This repo contains the core logic for building our web application stack. It has inputs (variables) for everything that could possibly change: instance sizes, VPC IDs, security group rules, etc. It has no hardcoded names.
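As a rough illustration of what "inputs for everything, no hardcoded names" could look like inside such a module, here is a minimal sketch. The variable names and the security group resource are assumptions chosen for the example, not the actual contents of the `terraform-aws-webapp` repo; other inputs, such as sizes, would be declared the same way.

```hcl
# terraform-aws-webapp/variables.tf (sketch)
variable "environment" {
  type        = string
  description = "Environment name, used only for naming and tagging."

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type for the application servers."
}

variable "vpc_id" {
  type        = string
  description = "ID of the VPC to deploy into."
}

# terraform-aws-webapp/main.tf (sketch)
resource "aws_security_group" "webapp" {
  # Every name is derived from inputs; nothing is hardcoded per environment.
  name   = "${var.environment}-webapp-sg"
  vpc_id = var.vpc_id

  tags = {
    Environment = var.environment
  }
}
```

The `validation` block is optional, but it lets the module reject a mistyped environment name before any resource is planned.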
Repo 2: `live-infrastructure` (The Implementation)
This repo contains the actual environment configurations. The file structure might look like this:
live-infrastructure/
├── dev/
│   └── main.tf
├── staging/
│   └── main.tf
└── prod/
    └── main.tf
And the prod/main.tf would simply be:
module "webapp" {
  source = "git::https://github.com/TechResolve/terraform-aws-webapp.git?ref=v1.2.0"

  # Prod-specific configuration
  instance_type = "c5.2xlarge"
  min_size      = 3
  max_size      = 10
  environment   = "prod"
  vpc_id        = "vpc-11223344"
  # ... other variables
}
This approach gives you the cleanest separation of concerns. The team building the core infrastructure (the module) can be different from the team deploying it. Because each environment pins its own module version through `ref`, you can safely update the module to version `v1.3.0` and roll it out to `dev` and `staging` for testing before ever touching the production deployment.
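A staged rollout might look like the following sketch of `dev/main.tf`, where only `dev` moves to the new tag while `prod/main.tf` keeps `ref=v1.2.0`. The dev-specific values here are illustrative placeholders, not taken from a real environment.

```hcl
# dev/main.tf (sketch): dev tries the new module release first
module "webapp" {
  source = "git::https://github.com/TechResolve/terraform-aws-webapp.git?ref=v1.3.0"

  # Dev-specific configuration (illustrative values)
  instance_type = "t3.small"
  min_size      = 1
  max_size      = 2
  environment   = "dev"
  vpc_id        = "vpc-00000000" # placeholder
}
```

After changing `ref`, re-run `terraform init` in that directory so the new module version is fetched before planning. Promoting the release to production is then a one-line change to the `ref` in `prod/main.tf`.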
Choosing Your Path
So, which is right for you? It’s not about finding the single “best” way, but the right tool for the job at hand.
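| Solution | Best For | Pros | Cons |
| --- | --- | --- | --- |
| 1. Conditional Logic (count) | Small projects, quick prototypes, or when refactoring isn’t an option yet. | Quick to set up; avoids a second directory. | Clutters the main file with conditional logic; becomes unmanageable with more than two environments or many resources. |
| 2. Workspaces / Terragrunt | Most professional projects. The industry standard. | The exact same code for every environment; separate state and `.tfvars` per environment. | All workspaces share one backend configuration; the right workspace and `.tfvars` file must be paired on every run. |
| 3. Git Modules | Large organizations, multi-team environments, and reusable infrastructure components. | Maximum reusability; independently versioned modules; clear separation of concerns. | Adds repository management overhead. |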
Stop the copy-paste madness. Your future self, and your pager at 2 AM, will thank you.
Frequently Asked Questions
What is the main problem with managing similar IaC across different environments?
The main problem is hardcoding environment-specific values directly into IaC, which leads to duplicating entire codebases (copy-pasting) for each environment. This results in maintenance nightmares, increased error potential, and a lack of scalability.
How do Terraform Workspaces compare to using Git-based modules for IaC management?
Terraform Workspaces are ideal for most professional projects, providing clean separation of state and variables for different environments using the same core configuration. Git-based modules, conversely, are suited for large organizations or complex components, offering maximum reusability and enforcing separation of concerns by versioning core infrastructure logic independently, though they add repository management overhead.
What is a common implementation pitfall when managing similar IaC, and how can it be avoided?
A common pitfall is the ‘copy-paste-and-pray’ deployment strategy, where engineers duplicate entire directories and manually find-and-replace environment-specific values. This can be avoided by externalizing environment-specific configurations through Terraform variables, `.tfvars` files, or by encapsulating reusable infrastructure logic within dedicated Terraform modules.