🚀 Executive Summary

TL;DR: Traefik SSL certificate errors often stem from a stale or misconfigured ‘acme.json’ file, which stores obtained certificates and private keys, leading to state management issues. Resolving this involves either deleting the file to force re-issuance, surgically editing it to remove problematic entries, or ensuring correct file permissions and volume mounting for persistent storage.

🎯 Key Takeaways

  • The ‘acme.json’ file is crucial for Traefik’s persistence: it stores every obtained SSL certificate and its private key, along with your ACME account information.
  • Deleting ‘acme.json’ forces Traefik to re-issue all certificates, acting as a quick fix but risking Let’s Encrypt rate limits if overused.
  • Manually editing ‘acme.json’ allows for targeted removal of specific problematic certificate entries, preserving other valid certificates.
  • Ensuring ‘acme.json’ has correct file permissions (600) and is explicitly mounted as a file (not just a directory) in Docker Compose prevents silent failures and ensures persistent storage.

ssl cert error for traefik

Tired of Traefik’s stubborn SSL certificate errors? Here’s a senior engineer’s guide to fixing the dreaded ‘acme.json’ file issue, from quick hacks to permanent solutions that will save your weekend.

Decoding the Traefik SSL Riddle: Why Your ‘acme.json’ is Haunting You

I remember it clear as day. 2 AM on a Tuesday, and a PagerDuty alert screams that our new marketing campaign’s landing page is down. The error? A nasty SSL certificate mismatch. We’d just migrated `promo.techresolve.com` to a new container, a simple five-minute job. Yet, here we were, an hour into a production outage because Traefik was stubbornly serving an old, incorrect certificate for a domain we weren’t even using anymore. It felt like the reverse proxy had a personal grudge. That night, I learned a valuable, frustrating lesson about Traefik’s statefulness, and it all came down to one little file: `acme.json`.

So, What’s Really Going On? The Ghost in the JSON

Before we start deleting things, let’s understand the culprit. Traefik doesn’t request new certificates every time it starts; that would be incredibly inefficient and get you rate-limited by Let’s Encrypt in a heartbeat. Instead, it stores every certificate it successfully obtains, along with the matching private keys and your ACME account details, inside a single file you designate, usually named `acme.json`.

This is great for persistence. But it becomes a nightmare when you change things. If you remove a domain from a container’s labels or change your ACME resolver configuration, Traefik doesn’t always clean up the old entries in acme.json. It sees the old cert, sees your new config, gets confused, and throws an error or serves the wrong certificate. It’s essentially a state management problem where the persisted state (the JSON file) is out of sync with the desired state (your new configuration).
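
To make that mismatch visible, here is a minimal Python sketch that compares the hostnames persisted in the file against the ones you actually want. It assumes the Traefik v2 layout (resolver name at the top level, then a `Certificates` array whose entries carry `domain.main` and an optional `domain.sans` list); the helper names and the `wanted` list are mine, not anything Traefik ships.

```python
import json


def stored_domains(acme_data):
    """Map each resolver name to the set of hostnames it holds certificates for."""
    result = {}
    for resolver, body in acme_data.items():
        names = set()
        # "Certificates" may be missing or null before the first issuance.
        for cert in body.get("Certificates") or []:
            domain = cert.get("domain", {})
            if domain.get("main"):
                names.add(domain["main"])
            names.update(domain.get("sans") or [])
        result[resolver] = names
    return result


def stale_domains(acme_data, wanted):
    """Return stored hostnames that the current configuration no longer requests."""
    found = set()
    for names in stored_domains(acme_data).values():
        found |= names
    return sorted(found - set(wanted))


# Inline sample instead of a real file, so no private keys are read here.
sample = json.loads("""
{
  "myresolver": {
    "Account": {},
    "Certificates": [
      {"domain": {"main": "good-service.techresolve.com"}, "certificate": "...", "key": "..."},
      {"domain": {"main": "problem-service.techresolve.com"}, "certificate": "...", "key": "..."}
    ]
  }
}
""")

# Hostnames your routers currently ask for (hypothetical list).
wanted = ["good-service.techresolve.com"]
print(stale_domains(sample, wanted))  # ['problem-service.techresolve.com']
```

Against a real file you would swap the inline sample for `json.load(open(path))`. Anything the function returns is a candidate for cleanup, not proof of a problem.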

Fixing The Mess: Three Tiers of Sanity Restoration

Depending on your situation—whether you’re in a panic during a production outage or just setting up a dev environment—there are a few ways to tackle this. Let’s go from the quickest fix to the most robust.

Solution 1: The “Have You Tried Turning It Off and On Again?” Fix

This is the classic, blunt-force approach. It’s fast, it works, but it’s not elegant. You’re essentially telling Traefik to forget everything it knows about certificates and start over from scratch.

  1. Stop your Traefik container.
  2. Locate your acme.json file on the host machine.
  3. Delete it (or, better yet, rename it to `acme.json.bak` just in case).
  4. Restart your Traefik container.

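Traefik will start, find no stored certificates, and run the Let’s Encrypt challenge for every host it discovers from your container labels. Within a minute or two, it will write a fresh `acme.json` containing only the certificates your current configuration asks for. One caveat: if you bind-mount `acme.json` as a single file, create an empty replacement with `600` permissions before restarting, because Docker’s short volume syntax will otherwise create a directory at the missing path.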
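
If you’d rather script the rename than do it by hand at 2 AM, something like the following sketch works. It moves the file aside with a timestamp and leaves an empty, owner-only file in its place. The function name `set_aside` and the backup naming scheme are my own choices, and the demo runs in a throwaway directory rather than on a real Traefik host.

```python
import os
import shutil
import tempfile
import time


def set_aside(path):
    """Rename acme.json with a timestamp and leave an empty, owner-only file behind."""
    backup = f"{path}.{time.strftime('%Y%m%d-%H%M%S')}.bak"
    shutil.move(path, backup)
    # O_EXCL makes this fail loudly if something recreated the file in between.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    os.close(fd)
    os.chmod(path, 0o600)  # be explicit regardless of umask
    return backup


# Demo in a temporary directory; use a real path only with Traefik stopped.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "acme.json")
with open(path, "w") as fh:
    fh.write('{"myresolver": {}}')

backup = set_aside(path)
print(os.path.basename(backup), os.path.getsize(path))
```

Keeping the old file around means you can put it back if the re-issuance hits a rate limit.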

Warning: Be careful with this! Let’s Encrypt enforces rate limits, including a cap on duplicate certificates for the same set of hostnames. If you do this too many times in a short period for the same domains, new issuance will be refused until the limit resets. This is for emergencies, not for regular maintenance.

Solution 2: The Surgical Strike (Editing the JSON)

This is my preferred method when I have a few minutes to spare and want to avoid the risks of the first solution. Instead of wiping the whole file, you dive in and manually remove the specific certificate that’s causing the problem. This is great if you have 20 working services and only one is acting up.

Your `acme.json` file is organized by certificate resolver. Find the section for your resolver (e.g., `myresolver`) and look inside its `Certificates` array for the object whose `domain.main` value matches your problematic domain.

Here’s a simplified look at the structure:


{
  "myresolver": {
    "Account": { ... },
    "Certificates": [
      {
        "domain": {
          "main": "good-service.techresolve.com"
        },
        "certificate": "...",
        "key": "..."
      },
      {
        "domain": {
          "main": "problem-service.techresolve.com"
        },
        "certificate": "...",
        "key": "..."
      }
    ]
  }
}

Your job is to stop Traefik, carefully delete the entire JSON object for `problem-service.techresolve.com` (from the opening `{` to the closing `}`, plus the comma that separates it from its neighbor so the file stays valid JSON), save the file, and restart Traefik. It will then re-issue just that one missing certificate, provided a router still requests it.
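
Hand-editing JSON under pressure is a good way to leave a trailing comma behind, so here is a hedged Python sketch of the same surgery. It assumes the simplified structure shown above; `remove_certificate` is a hypothetical helper of mine, and the demo operates on a temporary copy, not on a live file.

```python
import json
import os
import shutil
import tempfile


def remove_certificate(path, resolver, domain):
    """Delete certificates whose domain.main equals `domain`; return the number removed."""
    with open(path) as fh:
        data = json.load(fh)
    certs = data[resolver].get("Certificates") or []
    kept = [c for c in certs if c.get("domain", {}).get("main") != domain]
    removed = len(certs) - len(kept)
    if removed:
        shutil.copy2(path, path + ".bak")  # rollback copy, same permissions
        data[resolver]["Certificates"] = kept
        with open(path, "w") as fh:  # rewrite the existing file in place
            json.dump(data, fh, indent=2)
        os.chmod(path, 0o600)
    return removed


# Demo against a throwaway file shaped like the simplified structure.
demo_path = os.path.join(tempfile.mkdtemp(), "acme.json")
with open(demo_path, "w") as fh:
    json.dump({"myresolver": {"Account": {}, "Certificates": [
        {"domain": {"main": "good-service.techresolve.com"}, "certificate": "...", "key": "..."},
        {"domain": {"main": "problem-service.techresolve.com"}, "certificate": "...", "key": "..."},
    ]}}, fh)

removed_count = remove_certificate(demo_path, "myresolver", "problem-service.techresolve.com")
with open(demo_path) as fh:
    remaining = [c["domain"]["main"] for c in json.load(fh)["myresolver"]["Certificates"]]
print(removed_count, remaining)  # 1 ['good-service.techresolve.com']
```

As with the manual edit, run it only while Traefik is stopped, and keep the `.bak` copy until the new certificate shows up.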

Solution 3: The ‘Nuke it from Orbit’ and Do It Right This Time

Often, the root cause isn’t just a stale entry; it’s a fundamental permissions or volume mapping issue. If `acme.json` isn’t on a persistent volume, every certificate is lost when the container is recreated. If its permissions are too open, Traefik refuses to use it, and the only hint is an error line in the logs that is easy to miss. This is the permanent, architectural fix.

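Traefik (not Let’s Encrypt) is very particular about the security of the file containing your private keys. It must have strict permissions: `600`, meaning only the owner can read and write. With anything more permissive, Traefik logs an error saying the permissions are too open and will not use the file.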

First, fix the permissions on your host machine:


# Create the file if it doesn't exist yet, then restrict it to its owner
touch /opt/traefik/acme.json
chmod 600 /opt/traefik/acme.json
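
If you want to verify the result instead of trusting it, a small check like this Python sketch can run before every deploy. It flags the two classic failure modes: the path being a directory (which Docker creates when a bind-mount source is missing) and a mode other than `600`. The function name is hypothetical, and the demo uses a temporary file instead of `/opt/traefik/acme.json`.

```python
import os
import stat
import tempfile


def check_acme_file(path):
    """Return a list of problems with the storage file; an empty list means it looks fine."""
    if not os.path.exists(path):
        return [f"{path} does not exist"]
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        return [f"{path} is a directory, not a file"]
    problems = []
    mode = stat.S_IMODE(st.st_mode)
    if mode != 0o600:
        problems.append(f"mode is {mode:o}, expected 600")
    return problems


# Demo on a temporary file.
tmp_dir = tempfile.mkdtemp()
tmp_file = os.path.join(tmp_dir, "acme.json")
open(tmp_file, "w").close()

os.chmod(tmp_file, 0o644)
too_open = check_acme_file(tmp_file)
os.chmod(tmp_file, 0o600)
fixed = check_acme_file(tmp_file)
print(too_open, fixed)  # ['mode is 644, expected 600'] []
```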

Next, ensure your `docker-compose.yml` or container run command is set up correctly. You need to explicitly mount the file (not just the directory) and ensure the static configuration points to it.

Here’s a snippet from a solid `docker-compose.yml`:


services:
  traefik:
    image: "traefik:v2.10"
    container_name: "traefik"
    command:
      - "--api.insecure=true" # Unauthenticated dashboard on :8080; fine for testing, avoid in production
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.myresolver.acme.tlschallenge=true"
      - "--certificatesresolvers.myresolver.acme.email=darian.vance@techresolve.com"
      - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json" # Point to the path inside the container
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/opt/traefik/acme.json:/letsencrypt/acme.json" # Mount the file directly

# ... your other services
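
The easiest mistake in that snippet is letting the `acme.storage` path and the volume target drift apart. As a rough sanity check, this Python sketch takes the `command` and `volumes` lists exactly as they appear in the compose file and confirms the storage path sits on a mounted volume. The function is my own illustration and assumes short-syntax volumes of the form `host:container[:mode]` with no colons inside the paths.

```python
def storage_is_mounted(command, volumes, resolver="myresolver"):
    """True if the resolver's acme.storage path lies on one of the mounted volumes."""
    prefix = f"--certificatesresolvers.{resolver}.acme.storage="
    storage = next((arg[len(prefix):] for arg in command if arg.startswith(prefix)), None)
    if storage is None:
        return False
    for volume in volumes:
        parts = volume.split(":")
        if len(parts) < 2:
            continue  # anonymous volume, no host side
        target = parts[1]
        # Either the file itself is mounted, or a directory that contains it.
        if storage == target or storage.startswith(target.rstrip("/") + "/"):
            return True
    return False


command = [
    "--certificatesresolvers.myresolver.acme.tlschallenge=true",
    "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json",
]
volumes = [
    "/var/run/docker.sock:/var/run/docker.sock:ro",
    "/opt/traefik/acme.json:/letsencrypt/acme.json",
]
print(storage_is_mounted(command, volumes))      # True
print(storage_is_mounted(command, volumes[:1]))  # False
```

A `False` here means certificates would land in the container’s own filesystem and vanish the next time it is recreated.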

By explicitly mounting the file and setting the permissions correctly *before* you start, you eliminate a whole class of silent failures. This, combined with the surgical approach when needed, is how we keep our environments stable at TechResolve. Don’t let a simple JSON file be the reason for your next 2 AM wake-up call.

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


🤖 Frequently Asked Questions

❓ Why does Traefik serve old or incorrect SSL certificates?

Traefik stores all obtained certificates in ‘acme.json’. If domain configurations change or are removed, Traefik may serve stale certificates from this file because it doesn’t automatically clean up old entries, causing a state mismatch with the desired configuration.

❓ How do the different ‘acme.json’ solutions compare?

Deleting ‘acme.json’ is a blunt, quick fix for emergencies, forcing a full re-issuance but risking Let’s Encrypt rate limits. Surgically editing ‘acme.json’ is a more precise method for isolated issues, avoiding mass re-issuance. The architectural fix (correct permissions and explicit volume mounting) is a permanent solution addressing root causes of persistence failures.

❓ What is a common implementation pitfall when configuring ‘acme.json’ for Traefik?

A common pitfall is incorrect file permissions or improper volume mapping for ‘acme.json’. The file must have strict permissions (600) and be explicitly mounted as a file (e.g., ‘/opt/traefik/acme.json:/letsencrypt/acme.json’) to ensure Traefik can read/write persistently and securely, preventing silent failures or loss of certificates on restart.
