🚀 Executive Summary
TL;DR: The Pulumi Kubernetes Operator doesn’t automatically update stacks after a Git push because it reconciles based on changes to its `Stack` Custom Resource (CR), not direct Git repository events. To achieve true GitOps, implement event-driven triggers like Git webhooks to explicitly signal the operator to re-evaluate the repository, or use manual annotations for immediate fixes.
🎯 Key Takeaways
- The Pulumi Kubernetes Operator’s reconciliation loop is triggered by changes to its `Stack` Custom Resource (CR), not directly by Git pushes to the source code repository.
- Manually annotating the `Stack` CR (e.g., `kubectl annotate stack/your-stack-name pulumi.com/last-update=$(date +%s) --overwrite`) forces the operator to re-evaluate and pull the latest commit.
- For a robust, event-driven GitOps workflow, integrate Git webhooks with a notification controller (like FluxCD’s `notification-controller`) to automatically trigger updates to the `Stack` CR upon a Git push.
- The `resyncFrequency` property in the `Stack` CR provides a scheduled polling mechanism, but it’s less efficient and introduces delays compared to event-driven solutions.
Struggling with the Pulumi Kubernetes Operator not updating after a Git push? I’ve been there. Let’s break down why it happens and explore three real-world fixes, from a quick manual kick to a robust, event-driven GitOps workflow.
GitOps, Pulumi, and the Silent Treatment: Why Your Stacks Aren’t Updating
It was 2 AM on a Tuesday. A classic P1 incident. A critical service, our `prod-payments-api`, was timing out under load. The fix was simple: a one-line change in our Pulumi TypeScript code to increase a container CPU limit. I pushed the change, got the PR approved, and merged it into the `main` branch. I leaned back in my chair, expecting to see the new pods roll out within a minute. I waited. And waited. I refreshed my Lens dashboard. Nothing. The deployment was unchanged, happily running the old, under-powered configuration. For a gut-wrenching ten minutes, we were stumped. The code was in Git, but the cluster was blissfully unaware. That was my first real lesson in the subtle “gotchas” of the Pulumi Kubernetes Operator.
The “Why”: Understanding the Reconciliation Loop
So, what’s actually going on here? It feels like a bug, but it’s really a fundamental concept of how many Kubernetes operators work, including Pulumi’s. The operator’s job is to watch for changes to its Custom Resources (CRs). In our case, it’s watching the `Stack` resource you defined in your YAML.
When you push a code change to your Pulumi project in Git, you are not changing the `Stack` CR itself. The YAML that tells the operator “my code lives at `github.com/TechResolve/prod-infra.git` in the `payments-api` directory” is still the same. From the operator’s perspective, nothing has changed, so it has no reason to trigger a new `pulumi up`. It will eventually resync based on its default schedule, but “eventually” isn’t a word you want to hear during an outage.
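One way to see this gap for yourself is to compare the commit the operator last deployed with the current head of the branch. The following is a minimal sketch, not an official tool: the function name `stack_is_behind` is made up for illustration, and it assumes the `Stack` status exposes `.status.lastUpdate.lastSuccessfulCommit`, which you should verify against the CRD installed in your cluster.

```shell
#!/usr/bin/env bash
# Sketch: compare the commit the operator last deployed with the branch head.
# Assumption: the Stack status exposes .status.lastUpdate.lastSuccessfulCommit.
# stack_is_behind is an illustrative name, not part of the operator.

stack_is_behind() {
  local stack="$1" repo="$2" branch="${3:-main}"
  local deployed remote

  # Commit recorded by the operator after its last successful update.
  deployed="$(kubectl get stack "$stack" \
    -o jsonpath='{.status.lastUpdate.lastSuccessfulCommit}')"

  # Current head of the branch on the remote; ls-remote prints "<sha><TAB><ref>".
  remote="$(git ls-remote "$repo" "refs/heads/$branch" | cut -f1)"

  echo "deployed=$deployed remote=$remote"

  # Exit status 0 means the cluster is behind Git.
  [ "$deployed" != "$remote" ]
}
```

Called as `stack_is_behind your-stack-name https://github.com/TechResolve/prod-infra.git main`, it prints both commit hashes and returns success when they differ, which is exactly the state I was stuck in at 2 AM.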
The core problem is that the process isn’t event-driven out of the box. A `git push` doesn’t magically poke the operator. We need to create that poke ourselves. Here are three ways to do it, ranging from a quick emergency fix to a proper, long-term solution.
Solution 1: The Quick Fix (“The Manual Kick”)
This is my go-to when I’m in a hurry, like that 2 AM incident. We need to make the operator think the `Stack` CR has changed, forcing it to re-evaluate and pull the latest commit from Git. The easiest way to do this is by adding or changing an annotation on the resource using `kubectl`.
I call this “DevOps Percussive Maintenance.” You’re essentially just giving it a good kick to get it going again.
Run this command, replacing `your-stack-name` with the name of your stack resource:
kubectl annotate stack/your-stack-name pulumi.com/last-update=$(date +%s) --overwrite
This command adds an annotation `pulumi.com/last-update` with the current Unix timestamp. Since the annotation value is always changing, it guarantees a “change” to the CR’s metadata, which is enough to trigger the operator’s reconciliation loop. It will then check the Git repo, see your new commit, and run the update.
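If you end up running the kick more than once, it can be wrapped in a small shell function. This is only a sketch: `kick_stack` and its namespace argument are illustrative names I'm introducing here, and the annotation key is the same one used in the command above.

```shell
#!/usr/bin/env bash
# Sketch: the manual kick as a reusable function.
# kick_stack and its arguments are illustrative, not part of the operator.

kick_stack() {
  local stack="$1" namespace="${2:-default}"

  # A fresh Unix timestamp makes the annotation value differ on every call,
  # so the Stack object's metadata always changes.
  kubectl annotate "stack/$stack" \
    --namespace "$namespace" \
    "pulumi.com/last-update=$(date +%s)" \
    --overwrite
}
```

For example, `kick_stack your-stack-name payments` annotates the stack in a hypothetical `payments` namespace; leaving out the second argument falls back to `default`.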
Warning: This is a manual intervention. It’s great for emergencies or testing, but it’s not a sustainable GitOps practice. If you find yourself doing this every day, it’s time to move to a real solution.
Solution 2: The Permanent Fix (“The Webhook Bridge”)
This is the “right” way to solve the problem and achieve a truly event-driven GitOps workflow. We’ll set up a webhook in our Git provider (like GitHub or GitLab) that fires every time you push to your main branch. This webhook will send a notification to a listener inside our Kubernetes cluster, which will then perform the “manual kick” from Solution 1 for us automatically.
While you could build this listener yourself, why reinvent the wheel? Tools like FluxCD and Argo CD have components built for exactly this purpose.
Using the Flux Notification Controller
My team and I are big fans of Flux, and its `notification-controller` is perfect for this. Here’s the high-level workflow:
- Install Flux: If you don’t have it already, install the FluxCD components, including the `notification-controller`.
- Create a Receiver: You’ll create a Flux `Receiver` resource in the cluster. This will expose a unique webhook endpoint URL.
- Configure GitHub/GitLab: In your Git repository’s settings, add a new webhook pointing to the URL exposed by the Flux Receiver. Set it to trigger on `push` events.
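- Create an Alert: Create a Flux `Alert` resource together with a `Provider`. This is the magic piece. Flux Alerts forward events to a Provider endpoint rather than running commands themselves, so the endpoint on the receiving side (for example, a small in-cluster service) is what annotates our Pulumi `Stack` CR when an event for the repository arrives.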
This setup bridges the gap. A `git push` now directly triggers the Pulumi Operator. It’s fully automated, auditable, and the way GitOps is meant to be.
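The Receiver step above can be sketched as follows. This is a minimal sketch, assuming Flux's `notification.toolkit.fluxcd.io/v1` API; the names `github-receiver`, `webhook-token`, and `prod-infra` are hypothetical, and `apply_receiver` is just an illustrative wrapper. Note that a Receiver's `resources` list points at Flux objects (here a `GitRepository`), so the piece that ultimately touches the Pulumi `Stack` is not covered by this fragment.

```shell
#!/usr/bin/env bash
# Sketch: create a Flux Receiver for GitHub push events.
# Hypothetical names: github-receiver, webhook-token, prod-infra.

apply_receiver() {
  kubectl apply -f - <<'EOF'
apiVersion: notification.toolkit.fluxcd.io/v1
kind: Receiver
metadata:
  name: github-receiver
  namespace: flux-system
spec:
  type: github
  events:
    - "ping"
    - "push"
  secretRef:
    name: webhook-token   # Secret holding the shared webhook token
  resources:
    - apiVersion: source.toolkit.fluxcd.io/v1
      kind: GitRepository
      name: prod-infra
EOF
}
```

Once the Receiver is ready, its status should report the generated webhook path, which is what you append to your ingress hostname when configuring the webhook in GitHub.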
Solution 3: The ‘Nuclear’ Option (“The Forced Resync”)
What if you don’t want to set up webhooks and you can tolerate a small delay? The Pulumi `Stack` CR has a built-in property for this: `resyncFrequency`. This tells the operator to ignore everything and just force a check against the Git repository on a schedule.
You can add this directly to your `Stack` definition YAML:
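apiVersion: pulumi.com/v1
kind: Stack
metadata:
  name: prod-api-gateway
spec:
  # ... your other spec properties like projectRepo, stack, etc.
  projectRepo: https://github.com/TechResolve/prod-infra.git
  stack: TechResolve/api-gateway/prod
  # Add this line
  resyncFrequency: "5m" # Check for new commits every 5 minutes
  # Note: the operator's v1 Stack CRD spells this field resyncFrequencySeconds
  # and takes an integer (for example, 300), so verify against your installed CRD.
...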
This is a blunt instrument. It works, but it has downsides. It creates constant, low-level traffic to your Git provider as the operator polls for changes, and your updates are only as fast as the interval you set. A 5-minute delay might be fine for a staging environment, but it’s often too slow for production hotfixes.
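To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The function name `polling_cost` is made up for illustration, and the average delay assumes pushes land at a uniformly random point within the polling interval.

```shell
#!/usr/bin/env bash
# Sketch: rough cost of polling at a fixed interval.
# polling_cost is an illustrative helper, not part of any tool.

polling_cost() {
  local interval_seconds="$1"
  # One poll per interval over a 24-hour day (86400 seconds).
  local polls_per_day=$(( 86400 / interval_seconds ))
  # A push right after a poll waits a full interval; on average, half of it.
  local avg_delay=$(( interval_seconds / 2 ))
  echo "polls_per_day=$polls_per_day worst_delay_s=$interval_seconds avg_delay_s=$avg_delay"
}

polling_cost 300   # the 5-minute interval from the example above
# prints: polls_per_day=288 worst_delay_s=300 avg_delay_s=150
```

So a 5-minute interval means 288 checks per stack per day, and a hotfix waits two and a half minutes on average before the operator even notices it.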
My Take: I see `resyncFrequency` as a fallback or a simple solution for non-critical stacks. For anything that matters, the effort to set up webhooks (Solution 2) is absolutely worth it.
Comparing the Solutions
To make it clearer, here’s how I think about these three approaches:
| Approach | Speed | Complexity | When to Use |
| --- | --- | --- | --- |
| 1. Manual Kick | Instant (after command) | Very Low | Emergencies, debugging, one-off updates. |
| 2. Webhook Bridge | Near-Instant (event-driven) | Medium (requires another tool) | The default for all production & critical stacks. |
| 3. Forced Resync | Delayed (by interval) | Low | Non-critical dev/staging, or as a safety net. |
At the end of the day, understanding that the operator is keying off the `Stack` CR is the real lesson. Once you know that, you can make an informed decision about how you want to tell it when something important has happened upstream in your code. Don’t be like me at 2 AM, frantically wondering why your merge isn’t doing anything. Set up your webhooks, and let the robots do the work.
🤖 Frequently Asked Questions
❓ Why doesn’t the Pulumi Kubernetes Operator update my stack after a Git push?
The operator monitors its `Stack` Custom Resource (CR) for changes, not direct Git repository pushes. A Git push doesn’t alter the `Stack` CR itself, so the operator doesn’t perceive a need to trigger a `pulumi up` until its default resync schedule or an explicit trigger.
❓ How does this compare to alternatives like Argo CD or FluxCD for GitOps?
While the Pulumi Kubernetes Operator manages Pulumi stacks, tools like FluxCD and Argo CD are full-fledged GitOps controllers that can manage the deployment of the `Stack` CRs themselves and provide the event-driven webhook capabilities needed to trigger the Pulumi Operator effectively. The article suggests using FluxCD’s `notification-controller` to bridge this gap.
❓ What’s a common implementation pitfall when using the Pulumi Kubernetes Operator for GitOps?
A common pitfall is assuming the operator will automatically detect Git pushes. The solution is to implement an event-driven mechanism, such as configuring Git webhooks to trigger an update to the `Stack` CR (e.g., by adding a changing annotation) or using a `resyncFrequency` for scheduled polling, to explicitly signal the operator to re-evaluate the Git repository.