πŸš€ Executive Summary

TL;DR: The article addresses the common struggle with GitOps repository structures, where complex configurations can lead to deployment chaos and production outages. It outlines three real-world approachesβ€”Monorepo, Repo-per-Team/App, and Repo-per-Environmentβ€”to help balance configuration reuse (DRY) with critical blast radius control, preventing accidental production impacts.

🎯 Key Takeaways

  • The core challenge in GitOps repo structure is balancing the DRY principle (Don’t Repeat Yourself) with effective blast radius control to prevent non-production changes from impacting production.
  • The Monorepo approach, while simple for small teams, carries a high blast radius and requires stringent branch protection and CODEOWNERS files to manage risks.
  • The Repo-per-Team/App strategy offers high scalability and a low blast radius by assigning ownership to specific teams, utilizing ArgoCD’s Application CRDs to point to individual application repositories.
  • The Repo-per-Environment approach, though high in process complexity and violating DRY, provides maximum security and compliance with a very low blast radius, suitable for highly regulated industries.

KubeCodex: GitOps Repo Structure

Struggling with your GitOps repo structure? We break down three real-world approaches, from the simple monorepo to more complex multi-repo strategies, to help you escape deployment chaos.

From the Trenches: Taming the GitOps Repo Hydra

I still remember the 3 AM PagerDuty alert. A full production outage. We scrambled, only to find that a well-intentioned config change for the staging environmentβ€”a simple replica count adjustmentβ€”had been deployed to production. The root cause wasn’t a faulty pipeline or a bad line of code. It was our GitOps repository. We had a “clever” structure with overlapping Kustomize overlays that was so complex, a junior engineer’s PR to the `staging` branch accidentally triggered a sync on our `production` ArgoCD application. That night, I learned a hard lesson: your GitOps repo structure isn’t just about organization; it’s a critical safety mechanism.

Why Is This So Hard? The Core Conflict

Let’s be honest, the “right” way to structure a GitOps repo is a source of endless debate. The fundamental problem is a tug-of-war between two competing goals:

  • DRY (Don’t Repeat Yourself): We all want to reuse configuration. Using shared Helm charts or Kustomize bases feels efficient. Why define `nginx-ingress` three times for dev, staging, and prod?
  • Blast Radius Control: We need to ensure a mistake in a non-production environment can never impact production. The more you share, the greater the risk of a single change causing a cascading failure.

Finding the balance between these two is the key. There’s no single perfect answer, but based on my experience, most solutions fall into one of three patterns.
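To make the tension concrete, here is a minimal Kustomize sketch (the paths and file names are hypothetical, not from the incident above). Every environment overlay points at the same shared base, so a single edit to that base fans out to dev, staging, and prod on their next sync:

```yaml
# apps/environments/prod/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base/nginx-ingress   # shared base, also referenced by dev and staging
patches:
  - path: replica-patch.yaml   # prod-only override
```

This is maximally DRY, and that is exactly the problem: a well-intentioned change to `base/nginx-ingress` reaches production with no prod-specific review gate.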

Solution 1: The Monorepo – “The Get It Done Approach”

This is where most of us start. You have a single Git repository that holds all your Kubernetes manifests and configurations for all your environments and applications. It’s straightforward and easy to grasp.

The Structure

A common layout using the “app-of-apps” pattern with ArgoCD might look something like this:


gitops-repo/
β”œβ”€β”€ clusters/
β”‚   β”œβ”€β”€ dev/
β”‚   β”‚   └── root-app.yaml   # Points to apps/environments/dev
β”‚   └── prod/
β”‚       └── root-app.yaml   # Points to apps/environments/prod
└── apps/
    β”œβ”€β”€ base/
    β”‚   β”œβ”€β”€ app-a/
    β”‚   β”‚   β”œβ”€β”€ deployment.yaml
    β”‚   β”‚   └── service.yaml
    β”‚   └── app-b/
    β”‚       └── helm-chart/
    └── environments/
        β”œβ”€β”€ dev/
        β”‚   β”œβ”€β”€ app-a-patch.yaml
        β”‚   └── app-b-values.yaml
        └── prod/
            β”œβ”€β”€ app-a-patch.yaml
            └── app-b-values.yaml

Here, your cluster bootstrap points to a `root-app.yaml` for its environment, which then deploys all the applications defined in the corresponding environment folder. Kustomize overlays or Helm values handle the environment-specific differences.
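For illustration, the dev cluster's `root-app.yaml` might look like this. This is a sketch; the repo URL and sync policy are placeholders, not a prescribed config:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: 'https://github.com/example/gitops-repo.git'  # placeholder URL
    path: apps/environments/dev        # the folder this root app fans out from
    targetRevision: main
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: argocd
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual drift in the cluster
```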

Pro Tip: This approach lives and dies by your branch protection and CODEOWNERS file. Rigorously protect your `main` or `prod` branches and require approvals from senior team members for any merges.
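As a sketch, a CODEOWNERS file in the monorepo might route reviews like this (the team handles are hypothetical):

```
# Cluster-level changes require platform-team review
/clusters/                  @example-org/platform-team
# Production overlays need a senior approver
/apps/environments/prod/    @example-org/senior-engineers
# Each team owns its application's base manifests
/apps/base/app-a/           @example-org/team-a
```

Combined with a branch protection rule that requires code-owner review, this means no one can merge a change under `/apps/environments/prod/` without a senior sign-off.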

Solution 2: The Repo-per-Team/App – “The Scalable Approach”

As your organization grows, the monorepo becomes a bottleneck. The commit history is noisy, PRs collide, and you don’t want the database team to have permissions to change the frontend application’s deployment manifests. The solution is to split things up.

The Structure

You end up with multiple repositories, each with a clear owner:

  • `platform-infra-repo`: Managed by the DevOps/Platform team. Contains cluster-wide tools like Istio, cert-manager, Prometheus, and the ArgoCD App-of-Apps definitions.
  • `app-team-a-repo`: Managed by the ‘A’ team. Contains only the manifests and Helm values for their specific application(s).
  • `app-team-b-repo`: Managed by the ‘B’ team. Contains their app configs.

The `platform-infra-repo` defines `Application` CRDs that point to the other repos. For example, a file in the platform repo would look like this:


apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: team-a-app-prod
  namespace: argocd
spec:
  project: default
  source:
    repoURL: 'https://github.com/TechResolve/app-team-a-repo.git'
    path: helm/charts/my-app
    targetRevision: main
    helm:
      valueFiles:
      - values-prod.yaml
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: team-a-prod

This gives teams autonomy to manage their own release cycles while the platform team maintains control of the underlying cluster infrastructure. It’s a fantastic balance for scaling organizations.
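One way to enforce that separation in ArgoCD itself is an `AppProject` per team; here is a hedged sketch (the project name and namespace pattern are assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-a
  namespace: argocd
spec:
  description: Team A's applications
  sourceRepos:
    - 'https://github.com/TechResolve/app-team-a-repo.git'  # only this repo may be synced
  destinations:
    - server: 'https://kubernetes.default.svc'
      namespace: 'team-a-*'   # only team A's namespaces
  # No clusterResourceWhitelist: the project cannot create cluster-scoped resources
```

Pointing the `Application` above at `project: team-a` instead of `default` would then make these boundaries mandatory, not just conventional.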

Solution 3: Repo-per-Environment – “The High-Stakes Option”

I call this the ‘nuclear’ option, and many consider it an anti-pattern. It intentionally violates the DRY principle by creating entirely separate repos for each major environment. I’ve seen it implemented as `config-staging` and `config-production`.

The Structure

You have two nearly identical repositories. A change is made in `config-staging`, tested, and then promoted by copying the fully rendered, duplicated YAML into a pull request against the `config-production` repo.

Why would anyone do this? Two reasons: maximum security and compliance. In environments with strict SOX or PCI-DSS requirements, you need an unimpeachable audit trail. This model provides it. The promotion PR is a crystal-clear record of exactly what is changing in production, with no chance of a shared base component causing an unforeseen side effect. It’s slow and cumbersome, but for environments where a mistake could cost millions, it’s a trade-off some are willing to make.

Warning: Don’t reach for this unless you absolutely have to. The overhead of keeping the repos in sync and managing the promotion process is significant. It’s a solution for a very specific, high-consequence problem.
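If you do adopt it, the promotion step is worth automating. A rough sketch using GitHub Actions and the `gh` CLI follows; the repo names, token secret, and workflow layout are all assumptions, not a tested pipeline:

```yaml
# .github/workflows/promote.yaml in config-staging (hypothetical)
name: Promote to production
on:
  workflow_dispatch:   # promotion stays an explicit, human-triggered step
jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Open a PR against config-production with the rendered YAML
        env:
          GH_TOKEN: ${{ secrets.PROMOTION_TOKEN }}  # PAT with write access to config-production
        run: |
          gh repo clone TechResolve/config-production prod
          rsync -a --delete --exclude '.git' --exclude 'prod' ./ prod/
          cd prod
          git checkout -b promote-$GITHUB_SHA
          git add -A && git commit -m "Promote staging@$GITHUB_SHA"
          git push origin HEAD
          gh pr create --base main \
            --title "Promote staging@$GITHUB_SHA" \
            --body "Fully rendered YAML for audit review"
```

The resulting PR diff is the audit artifact: reviewers see exactly the manifests that will land in production, nothing inherited from a shared base.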

Comparison at a Glance

| Approach | Setup Complexity | Scalability | Blast Radius | Best For |
|---|---|---|---|---|
| 1. Monorepo | Low | Low-Medium | High | Startups, small teams, proof-of-concepts |
| 2. Repo-per-Team | Medium | High | Low | Growing companies, microservices architectures |
| 3. Repo-per-Env | High (process) | Low | Very Low | Highly regulated industries (finance, healthcare) |

Ultimately, there is no magic bullet. Start with the simplest thing that can possibly work (usually the monorepo), and don’t be afraid to refactor as your team and your platform grow. Thinking about these trade-offs upfront will save you from a 3 AM page and a painful post-mortem.

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


πŸ€– Frequently Asked Questions

❓ What are the primary GitOps repository structures discussed in the article?

The article details three main GitOps repository structures: the Monorepo (‘Get It Done Approach’), the Repo-per-Team/App (‘Scalable Approach’), and the Repo-per-Environment (‘High-Stakes Option’), each with distinct trade-offs.

❓ How do the Monorepo, Repo-per-Team, and Repo-per-Environment strategies compare in terms of scalability and blast radius?

The Monorepo has low-medium scalability and a high blast radius. Repo-per-Team offers high scalability and a low blast radius. Repo-per-Environment has low scalability (due to process overhead) but a very low blast radius.

❓ What is a common implementation pitfall with the Monorepo approach and how can it be addressed?

A common pitfall with the Monorepo is a high blast radius, where a change intended for a non-production environment can accidentally affect production. This is mitigated by rigorously protecting `main` or `prod` branches and enforcing CODEOWNERS approvals from senior team members for merges.
