🚀 Executive Summary

TL;DR: A recent ‘pentest-mcp’ update with auto-remediation features caused critical Terraform state drift by directly modifying IAM roles, leading to production pipeline failures. The article details how to resolve these ‘dueling automation’ conflicts by either disabling auto-remediation, integrating security findings into Terraform as the single source of truth, or enforcing read-only access via IAM permissions.

🎯 Key Takeaways

  • Uncoordinated automation tools like ‘pentest-mcp’ can cause Terraform state drift by directly modifying cloud resources, breaking infrastructure-as-code pipelines.
  • Solutions to resolve ‘dueling automation’ include temporarily disabling auto-remediation, integrating security tool findings into Terraform for IaC-driven policy application, or enforcing read-only permissions via IAM.
  • Maintaining Terraform as the single source of truth for infrastructure state is crucial to prevent conflicts, ensure stable deployments, and codify security requirements directly in your repository.

pentest-mcp got big update, and a lot more automation of admin work

A new tool update is ‘helpfully’ automating your admin work and breaking your infrastructure-as-code pipelines. Here’s how to stop the bleeding and fix the state drift for good.

When Good Automation Goes Bad: Taming the Overeager ‘pentest-mcp’ Update

I got the ping at 2:17 AM. A high-priority PagerDuty alert. The main deployment pipeline for our production environment was bleeding red. A junior engineer, bless his heart, had been staring at it for an hour, terrified to touch anything. The error was maddeningly simple and yet deeply confusing: Terraform was failing on a plan, screaming about a “state drift” on an IAM role we hadn’t touched in months. It claimed a new inline policy existed on prod-iam-auditor-role that wasn’t in our code. My first thought? Manual change. Someone with console access got click-happy. But the CloudTrail logs showed the change was made by a service account, specifically the one our security team uses for their scanning tools. That’s when it hit me—the email from last week I’d skimmed: “Heads up, we’re rolling out the new ‘pentest-mcp’ update with enhanced auto-remediation features!” And there it was. A tool, trying to be helpful, had declared war on our source of truth.

The Root of the Problem: Dueling Banjos, DevOps Edition

What we were seeing is a classic case of “dueling automations.” In our world, Terraform is the single source of truth. It maintains a state file (terraform.tfstate) that maps the infrastructure it manages. When you run terraform plan or terraform apply, it refreshes that state against what’s actually in your AWS account, then compares the result with the desired state in your code. If a managed resource was changed outside Terraform, the plan lights up with changes nobody committed. That mismatch is called “drift.”

The new pentest-mcp update, in its infinite wisdom, decided to start “fixing” perceived security gaps by directly modifying IAM roles. It saw a role, thought, “This could be more secure,” and attached a new policy. The tool didn’t know or care about our Terraform state. So now you have two systems fighting for control:

  • Terraform: “The state for prod-iam-auditor-role must match what’s in git!”
  • pentest-mcp: “I am a security tool, and I will add this policy to make this role more secure!”

On the next pipeline run, Terraform sees the change, tries to remove the policy to match its configuration, and the cycle of pain continues. This isn’t just annoying; it breaks deployments and erodes trust in your automation.
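One detail worth spelling out: Terraform only notices an extra inline policy if the configuration claims ownership of the role’s full set of inline policies. The article doesn’t show how prod-iam-auditor-role is declared, so what follows is a minimal sketch under that assumption, using the AWS provider’s aws_iam_role_policies_exclusive resource (available in recent provider versions). The policy name auditor-read, its contents, and the trust policy are placeholders, not our real configuration.

```hcl
# Sketch only: the trust policy, policy name, and permissions are placeholders.
data "aws_iam_policy_document" "auditor_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"] # assumed trust relationship
    }
  }
}

resource "aws_iam_role" "auditor" {
  name               = "prod-iam-auditor-role"
  assume_role_policy = data.aws_iam_policy_document.auditor_assume.json
}

data "aws_iam_policy_document" "auditor_read" {
  statement {
    effect    = "Allow"
    actions   = ["iam:Get*", "iam:List*"]
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "auditor_read" {
  name   = "auditor-read"
  role   = aws_iam_role.auditor.id
  policy = data.aws_iam_policy_document.auditor_read.json
}

# Declares that the role may carry ONLY the inline policies listed here.
# Any inline policy added out-of-band shows up in the next plan as drift
# and is removed on apply.
resource "aws_iam_role_policies_exclusive" "auditor" {
  role_name    = aws_iam_role.auditor.name
  policy_names = [aws_iam_role_policy.auditor_read.name]
}
```

With a declaration like this, a policy attached by an outside tool is exactly the kind of change Terraform will try to undo on every run.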

How to Get Your Pipeline Green Again

Look, the goal isn’t to get rid of the security tool. It’s doing its job. The goal is to make it play nice with our existing automation. Here are three ways to tackle this, from the quick-and-dirty to the architecturally sound.

Solution 1: The Quick Fix (And Why You Shouldn’t Stop Here)

The immediate goal is to stop the bleeding. Your pipeline is blocked, and you need to get it running. The fastest way is to tell the new tool to back off. Dig into the pentest-mcp configuration—usually a YAML or HCL file on the runner—and find the setting that controls this new “helpful” behavior.

In this case, their docs mentioned a new flag. We found it in /etc/pentest-mcp/config.yml:

# /etc/pentest-mcp/config.yml

# Setting this to 'false' will prevent the tool from making direct
# changes to cloud resources. It will only report findings.
enable_auto_remediation: false

# You might even be able to get more granular
remediate_iam_roles: false

By setting enable_auto_remediation to false and re-running the security scan, the tool stopped modifying our IAM roles. We could then run a terraform apply to revert the change it made, and the pipeline went green. This is a band-aid, but it’s a necessary first step to restore service.

Warning: This is a temporary solution! You’ve just turned off a feature the security team probably wants. Your next step should be to have a conversation with them and figure out a permanent solution. Don’t just leave it disabled and hope nobody notices.

Solution 2: The ‘Right’ Fix (Making Them Work Together)

The ideal state is one where Terraform remains the single source of truth, but it can be informed by the security tool. You want to integrate, not isolate. The best way to do this is to change the security tool’s workflow from “Scan and Fix” to “Scan and Report.”

A solid pattern is to have the tool run in a dry-run or report-only mode that outputs structured data (like JSON) about its findings. This report can then be used to drive changes in your IaC.

  1. Configure pentest-mcp to output a findings report: Instead of applying changes, have it save a findings.json to an S3 bucket or an artifact repository.
  2. Ingest the data in Terraform: Use an http or aws_s3_object data source in Terraform to pull this JSON file during the plan phase.
  3. Make intelligent changes: Use the data from the JSON to programmatically generate the security policies within Terraform itself. This way, the security team’s *intent* is captured in code and applied through your standard pipeline.
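# Example of what this might look like in Terraform.
# The report layout (recommended_iam_policies keyed by role name) is assumed,
# and aws_iam_role.auditor is expected to be defined elsewhere in the config.

data "aws_s3_object" "pentest_findings" {
  bucket = "security-scan-reports-prod"
  key    = "latest-findings.json"
}

locals {
  # Parse the JSON report from the security tool.
  # body is only populated for text-like content types (text/* or
  # application/json), so the report must be uploaded with one of those.
  findings = jsondecode(data.aws_s3_object.pentest_findings.body)

  # Extract the recommended policy from the findings
  recommended_policy_doc = local.findings.recommended_iam_policies["prod-iam-auditor-role"]
}

resource "aws_iam_role_policy" "auditor_policy_from_scan" {
  name = "pentest-mcp-recommended-policy"
  role = aws_iam_role.auditor.id

  # jsondecode returns an object, but policy expects a JSON string.
  policy = jsonencode(local.recommended_policy_doc) # The policy is now managed by Terraform!
}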

This is the grown-up solution. It respects your IaC workflow, keeps Terraform as the source of truth, and codifies the security requirements directly in your repository. It’s more work upfront but pays dividends in stability and auditability.
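The example above leans on a report shape the article never shows, so here is a self-contained sketch you can evaluate without S3 to see the parsing step in isolation. Everything in it is an assumption: the recommended_iam_policies key, the sample policy, and the local names are made up for illustration, and the real pentest-mcp output may look quite different. It also uses try() so that a report with no recommendation for the role yields null instead of a failed plan.

```hcl
locals {
  # Hypothetical report contents; stands in for the S3 object body.
  sample_findings_json = <<-JSON
    {
      "recommended_iam_policies": {
        "prod-iam-auditor-role": {
          "Version": "2012-10-17",
          "Statement": [
            { "Effect": "Deny", "Action": "iam:DeleteRole", "Resource": "*" }
          ]
        }
      }
    }
  JSON

  sample_findings = jsondecode(local.sample_findings_json)

  # try() falls back to null when the key or the role entry is missing.
  sample_policy = try(
    local.sample_findings.recommended_iam_policies["prod-iam-auditor-role"],
    null
  )
}

# Re-encode the object as the JSON string an IAM policy argument expects.
output "sample_policy_json" {
  value = local.sample_policy == null ? null : jsonencode(local.sample_policy)
}
```

Running terraform plan in a scratch directory with only this file is a cheap way to check the parsing logic before wiring it to the real report.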

Solution 3: The ‘Nuclear’ Option (Total Isolation)

Sometimes, you don’t have time for a full integration, or you simply don’t trust the tool’s automation. In this scenario, you put the tool in a padded room where it can’t break your toys. You use IAM to enforce your workflow.

The approach is simple: create a dedicated IAM role for pentest-mcp that gives it read-only access and nothing more. Specifically, allow permissions like iam:Get*, iam:List*, and ec2:Describe*, and explicitly deny permissions like iam:PutRolePolicy, iam:AttachRolePolicy, or ec2:RunInstances.

This effectively neuters the tool’s auto-remediation feature at the permissions level. It can look, it can report, but it cannot touch. The tool will throw errors when it tries to make changes, and those errors are your signal to manually review its findings and implement them via Terraform if you agree.
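As a rough sketch of what that role could look like in Terraform, using the actions named above: the resource names, the role name, and the scanner_principal_arn variable are hypothetical, and the action lists are a starting point rather than a complete read-only policy.

```hcl
# Sketch only: names and the trusted principal are hypothetical.
variable "scanner_principal_arn" {
  type        = string
  description = "ARN of the identity pentest-mcp runs as (assumed)"
}

data "aws_iam_policy_document" "pentest_mcp_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = [var.scanner_principal_arn]
    }
  }
}

data "aws_iam_policy_document" "pentest_mcp_read_only" {
  statement {
    sid       = "AllowReadOnly"
    effect    = "Allow"
    actions   = ["iam:Get*", "iam:List*", "ec2:Describe*"]
    resources = ["*"]
  }

  # An explicit Deny overrides any Allow the role might pick up later.
  statement {
    sid    = "DenyWrites"
    effect = "Deny"
    actions = [
      "iam:PutRolePolicy",
      "iam:AttachRolePolicy",
      "ec2:RunInstances",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role" "pentest_mcp_scanner" {
  name               = "pentest-mcp-read-only"
  assume_role_policy = data.aws_iam_policy_document.pentest_mcp_assume.json
}

resource "aws_iam_role_policy" "pentest_mcp_read_only" {
  name   = "read-only-with-explicit-deny"
  role   = aws_iam_role.pentest_mcp_scanner.id
  policy = data.aws_iam_policy_document.pentest_mcp_read_only.json
}
```

Defining the scanner’s role in Terraform has a side benefit: the tool’s permissions go through the same review and pipeline as everything else.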

Pro Tip: This is a great default strategy for any new third-party tool you’re introducing into your environment. Start with the most restrictive, read-only permissions possible and only grant additional write access when you have a well-defined, automated process (like Solution 2) to manage its changes.

Comparison of Approaches

| Approach | Speed to Implement | Long-Term Stability | Risk Level |
| --- | --- | --- | --- |
| 1. The Quick Fix | Very Fast (Minutes) | Low (It’s a band-aid) | Low (but ignores security intent) |
| 2. The ‘Right’ Fix | Slow (Days) | High (Sustainable & Auditable) | Low (Changes are in code) |
| 3. The ‘Nuclear’ Option | Fast (Hours) | Medium (Requires manual review) | Medium (Risk of human error) |

The Takeaway: Trust, But Verify Your Automation

At the end of the day, this isn’t the security tool’s fault, and it’s not Terraform’s fault. It’s a process problem. Automation is powerful, but uncoordinated automation is a chaos engine. Any time you introduce a tool that can alter the state of your infrastructure, you have to ask yourself: “How does this respect my source of truth?” If the answer is “it doesn’t,” you’re setting yourself up for a 2 AM wake-up call. Talk to your security teams, read the update emails, and always, always treat your IaC state as sacred.

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


🤖 Frequently Asked Questions

❓ What is ‘state drift’ in the context of Terraform and security tools like ‘pentest-mcp’?

State drift occurs when the actual infrastructure configuration in the cloud (e.g., an IAM role) differs from what Terraform’s state file and code expect. Tools like ‘pentest-mcp’ can cause this by making direct, unmanaged changes to resources.

❓ How do the proposed solutions for ‘pentest-mcp’ conflicts compare in terms of long-term stability?

The ‘Quick Fix’ (disabling auto-remediation) is a temporary band-aid. The ‘Right Fix’ (integrating findings into Terraform) offers high long-term stability by codifying security requirements in IaC. The ‘Nuclear Option’ (read-only IAM) provides medium stability: it requires manual review and carries a risk of human error.

❓ What is a common implementation pitfall when introducing new security automation tools with existing IaC pipelines?

A common pitfall is granting new security tools write access and auto-remediation capabilities without coordinating their changes with the IaC’s source of truth. This leads to state drift, broken deployments, and erosion of trust in automation. The solution is to start with read-only permissions and integrate any necessary changes through the established IaC workflow.
