🚀 Executive Summary
TL;DR: Unexplained Google Cloud storage growth in GCR/GAR often stems from underlying GCS buckets having Object Versioning enabled, causing ‘deleted’ images to persist as costly ‘zombie data’. To fix this, disable bucket versioning and implement lifecycle policies or perform a manual `gsutil` scrub to reclaim space and reduce billing.
🎯 Key Takeaways
- Google Container Registry (GCR) and Artifact Registry (GAR) are abstractions built on top of Google Cloud Storage (GCS) buckets, meaning their storage behavior is dictated by the underlying GCS bucket configurations.
- If Object Versioning is enabled on the underlying GCS bucket for GCR/GAR, registry deletions via `gcloud container images delete` or the GUI only unlink the image, leaving the physical data as ‘non-current’ archived versions that continue to incur storage costs. (`docker rmi` only removes your local copy and never touches the registry.)
- To effectively manage GCR/GAR storage and prevent billing for ‘ghost images,’ administrators must either disable Object Versioning, apply GCS Lifecycle Policies to automatically delete non-current objects, or use `gsutil` for a manual, immediate cleanup of archived versions.
If your Google Cloud bill shows unexplained storage growth despite aggressive image pruning, you likely have a “zombie data” problem in your underlying GCS buckets; here is how to disable object versioning and scrub the archives before finance sends you a nasty email.
Stop Paying for Ghost Images: The GCR/GAR Versioning Trap
There I was, staring at a GCP billing report that made absolutely no sense. It was three years ago on a project called legacy-monolith-prod. We had automated our CI/CD pipelines to aggressively prune old Docker images. Every night, a script ran, keeping only the last 10 tags. Clean, efficient, right?
Wrong. Despite our dashboard showing only 50GB of images, we were being billed for terabytes. I actually argued with a support rep, convinced their metering was broken. Spoiler: It wasn’t. It was me. I was the problem.
I had treated the Google Artifact Registry (and GCR before it) like a magic box. I forgot that underneath the fancy UI, it’s just a dumb Cloud Storage bucket. And that bucket had Object Versioning enabled. Every time our script “deleted” an image, it didn’t actually free up space; it just moved the blob to a “non-current” state, effectively creating a graveyard of zombie data we were paying premium rates to store.
The “Why”: Anatomy of a Billing Disaster
Here is the technical reality: Google Container Registry (and Artifact Registry) are abstractions. They store layers as objects in a GCS bucket.
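If that underlying bucket has Versioning turned on, deleting an image with gcloud container images delete or the GUI delete button does not remove the data. (docker rmi only removes the copy on your local machine; it never touches the registry.) The delete merely unlinks the image. The physical bytes remain on disk (well, on object storage) as an archived version.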
This is usually fine for a standard backup bucket. But for a CI/CD system pushing a 2GB layer to dev-build-runner-01 every 15 minutes? You are generating massive amounts of churn. If you don’t delete the archived versions, your storage graph will look like a hockey stick pointing straight at your bankruptcy.
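To put rough numbers on that churn, here is a back-of-the-envelope sketch. It assumes every push uploads a full, unique 2GB layer and that no archived version is ever cleaned up; real builds share layers, so treat the result as an upper bound. The variable names and the 30-day window are made up for illustration.

```shell
#!/bin/sh
# Back-of-the-envelope churn estimate (assumed inputs, not measured data).
layer_gb=2          # size of each pushed layer, from the example above
interval_min=15     # one push every 15 minutes
days=30             # assumed billing window

pushes_per_day=$(( 24 * 60 / interval_min ))
gb_per_day=$(( pushes_per_day * layer_gb ))
gb_per_month=$(( gb_per_day * days ))

echo "pushes/day: $pushes_per_day"    # 96
echo "GB/day:     $gb_per_day"        # 192
echo "GB/${days}d:    $gb_per_month"  # 5760
```

Under those assumptions a single build runner leaves behind roughly 5.8TB of archived data a month, which is how a 50GB dashboard turns into a multi-terabyte bill.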
Solution 1: The Quick Fix (Stop the Bleeding)
First, verify if this is actually your problem. Don’t guess. Check the bucket configuration.
# Check versioning status
gsutil versioning get gs://artifacts.[PROJECT-ID].appspot.com
# Output usually looks like:
# gs://artifacts...: Enabled
If it says “Enabled,” you need to turn that off immediately for any registry that handles high-churn build artifacts. This stops new “deleted” objects from being archived. It does not clean up the existing mess, but it stops the hole from getting deeper.
# Turn it off
gsutil versioning set off gs://artifacts.[PROJECT-ID].appspot.com
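GCR can also keep images in regional buckets (us.artifacts…, eu.artifacts…, asia.artifacts…) when you push to us.gcr.io, eu.gcr.io, or asia.gcr.io, so it is worth checking all of them. The following is a minimal sketch that filters `gsutil versioning get` output down to the buckets still reporting Enabled. The `enabled_buckets` helper name and the `my-project` sample ID are hypothetical, and the sample lines stand in for real gsutil output.

```shell
#!/bin/sh
# Print the buckets whose versioning is still Enabled.
# Reads `gsutil versioning get` output ("gs://bucket: Enabled") on stdin.
# `enabled_buckets` is a hypothetical helper name, not a gsutil feature.
enabled_buckets() {
  awk '$NF == "Enabled" { sub(/:$/, "", $1); print $1 }'
}

# Sample input standing in for real gsutil output:
printf '%s\n' \
  'gs://artifacts.my-project.appspot.com: Enabled' \
  'gs://eu.artifacts.my-project.appspot.com: Suspended' \
  | enabled_buckets

# Against a real project (PROJECT_ID is a placeholder you must set):
# for prefix in "" us. eu. asia.; do
#   gsutil versioning get "gs://${prefix}artifacts.${PROJECT_ID}.appspot.com"
# done | enabled_buckets
```

With the sample input, only the first bucket is printed. Buckets that do not exist in your project will make gsutil print an error, which the filter simply ignores.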
Solution 2: The “Let Cloud Handle It” (Lifecycle Policy)
This is the responsible, architect-approved way to handle it. We want versioning off (usually), but if you absolutely must keep it on for compliance, you need a Lifecycle Policy to auto-delete the ghosts after a few days.
Create a JSON file named kill-zombies.json. This policy tells GCS: “If an object is not the current live version, kill it after 1 day.”
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {
        "isLive": false,
        "daysSinceNoncurrentTime": 1
      }
    }
  ]
}
Then apply it:
gsutil lifecycle set kill-zombies.json gs://artifacts.[PROJECT-ID].appspot.com
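If compliance calls for a longer grace period than one day, the retention window can be parameterized instead of hand-edited. This is a minimal sketch, assuming the GCS JSON lifecycle conditions `isLive` and `daysSinceNoncurrentTime`. The `write_policy` helper, the 30-day figure, and the output file name are illustrative, not part of the original setup.

```shell
#!/bin/sh
# write_policy DAYS FILE: hypothetical helper that writes a lifecycle policy
# deleting noncurrent versions DAYS days after they stopped being live.
write_policy() {
  cat > "$2" <<EOF
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {
        "isLive": false,
        "daysSinceNoncurrentTime": $1
      }
    }
  ]
}
EOF
}

write_policy 30 kill-zombies-30d.json
cat kill-zombies-30d.json

# Apply and read back (bucket name is the placeholder used above):
# gsutil lifecycle set kill-zombies-30d.json gs://artifacts.[PROJECT-ID].appspot.com
# gsutil lifecycle get gs://artifacts.[PROJECT-ID].appspot.com
```

Reading the policy back with `gsutil lifecycle get` is a cheap way to confirm the bucket accepted what you meant to send.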
Pro Tip: It can take up to 24 hours for Google’s background sweepers to actually delete the data and reflect the change in your billing. Don’t panic if the graph doesn’t drop instantly.
Solution 3: The Nuclear Option (Manual Scrub)
Sometimes you don’t have time to wait for a lifecycle policy to kick in, or you just disabled versioning (Solution 1) and realized you still have 50TB of “non-current” data sitting there from last year. Turning versioning off does not delete the history.
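This is a “hacky” script I’ve used in the trenches when I needed to reclaim space now. It lists every object version, filters out the live ones, and feeds the remaining archived (non-current) versions to a delete command.
Warning: Run this on a test bucket first, and pause pushes while it runs. gsutil ls -a prints live versions too, each with a #generation suffix, so a bare grep "#" would match (and delete) your live images. If you type the bucket name wrong, you’re going to have a bad day.
# List every version first (newest generation first), then the live object names.
BUCKET="gs://artifacts.[PROJECT-ID].appspot.com"
gsutil ls -a "$BUCKET/**" | LC_ALL=C sort -r > all.txt
gsutil ls "$BUCKET/**" > live.txt
# DRY RUN: See what you are about to destroy.
# For each object that still exists, the newest generation is the live one;
# skip it and keep everything else.
awk 'FILENAME == "live.txt" { live[$0]; next }
{ name = $0; sub(/#[0-9]+$/, "", name) }
(name in live) && !(name in seen) { seen[name]; next }
{ print }' live.txt all.txt > noncurrent.txt
wc -l noncurrent.txt
# THE NUCLEAR LAUNCH:
# Delete exactly the versions reviewed above, read from stdin.
gsutil -m rm -I < noncurrent.txt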
When I ran this on the legacy-monolith-prod project, the console output scrolled for 45 minutes. But the next day? Storage costs dropped by 60%.
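You do not have to wait for the bill to confirm a scrub worked. `gsutil du -s` totals the live objects in a bucket, and adding `-a` includes archived versions, so the difference is the zombie data. A minimal sketch of that subtraction follows; the `noncurrent_bytes` helper is a hypothetical name and the byte counts are made-up sample values, not measurements.

```shell
#!/bin/sh
# noncurrent_bytes LIVE_BYTES ALL_BYTES: hypothetical helper that prints the
# bytes held only by archived (noncurrent) versions.
noncurrent_bytes() {
  echo $(( $2 - $1 ))
}

# Made-up sample values standing in for real `gsutil du` output (bytes):
live=1000
all=4500
echo "archived bytes: $(noncurrent_bytes "$live" "$all")"

# Against a real bucket (placeholder name as used above):
# live=$(gsutil du -s gs://artifacts.[PROJECT-ID].appspot.com | awk '{print $1}')
# all=$(gsutil du -s -a gs://artifacts.[PROJECT-ID].appspot.com | awk '{print $1}')
# noncurrent_bytes "$live" "$all"
```

After a successful cleanup the result should be at or near zero. Running the same comparison before the scrub tells you how much you stand to reclaim.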
Clean up your buckets, folks. The cloud is only “pay for what you use” if you’re smart enough to stop using what you’ve deleted.
🤖 Frequently Asked Questions
❓ Why am I still being billed for GCR/GAR storage after deleting old Docker images?
Your GCR/GAR storage costs are likely high because the underlying GCS buckets have Object Versioning enabled. This feature retains ‘deleted’ images as non-current versions, which continue to consume storage and incur charges.
❓ What are the recommended methods to manage GCR/GAR storage and eliminate ‘zombie data’?
To manage storage, you can disable Object Versioning on the GCS bucket to stop new archives. For existing ‘zombie data,’ implement a GCS Lifecycle Policy to automatically delete non-current objects after a set period, or use `gsutil` commands for an immediate, manual cleanup of all archived versions.
❓ What is a critical step often missed when trying to reduce GCR/GAR storage costs?
A critical step often missed is that simply disabling Object Versioning only prevents *future* archiving; it does not clean up *existing* non-current versions. You must explicitly apply a lifecycle policy or perform a manual `gsutil` scrub to delete the historical ‘zombie data’.