🚀 Executive Summary
TL;DR: Docker containers can experience DNS resolution failures if the host’s /etc/resolv.conf is modified after the Docker daemon has started, as the daemon caches DNS settings at launch. The primary solution involves restarting the Docker daemon to force it to re-read the updated configuration, or implementing a more robust fix by explicitly defining DNS servers in the Docker daemon’s configuration file.
🎯 Key Takeaways
- The Docker daemon reads and caches the host’s /etc/resolv.conf at startup, making it unaware of subsequent changes to the host’s DNS configuration.
- Restarting the Docker daemon (sudo systemctl restart docker) is the quickest way to force it to re-read the current /etc/resolv.conf and resolve container DNS issues.
- For a permanent and more robust solution, configure explicit DNS servers in /etc/docker/daemon.json using the ‘dns’ key, which makes container networking independent of host-level /etc/resolv.conf changes.
A simple change to /etc/resolv.conf can silently break container DNS resolution, causing CI/CD pipelines and applications to fail. The key is understanding that the Docker daemon needs to be restarted or properly configured to see host-level network changes.
That Time an AI Suggestion Took Down Our Entire CI Pipeline
I remember it like it was yesterday. It was 3 PM on a Thursday. A P2 incident ticket drops into the queue: “All builds failing with ‘Could not resolve host’ errors.” I grab the ticket, jump on a call with a panicked junior engineer, and start digging. He’s sweating bullets, telling me he just made a “tiny change” to the `/etc/resolv.conf` on one of the runners, `ci-runner-pool-04`, to test something. The change was suggested by a popular AI assistant. It looked completely harmless. On the host, `ping google.com` worked. `nslookup internal-artifactory.techresolve.local` worked. But inside any new Docker container? Nothing. The entire deployment pipeline was dead in the water because our build containers couldn’t reach our internal artifact repository. This, right here, is the kind of “vibe-designed” problem that looks simple on the surface but shows a fundamental misunderstanding of how containers and host networking interact.
The Root of the Problem: Docker’s DNS Disconnect
So, what’s actually going on here? It’s not magic, and it’s not a bug. It’s by design. The Docker daemon, by default, reads the host’s `/etc/resolv.conf` file at startup and uses that configuration to provide DNS services to all the containers it launches. When you or an AI script manually edits `/etc/resolv.conf`, the running Docker daemon has no idea you did that. It’s still holding onto the old, cached configuration. Any new container you spin up gets the old, stale, and now-incorrect DNS settings. The host can resolve addresses just fine, but the containers are living in the past, completely blind to the outside world.
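If you want to see the disconnect for yourself, compare the nameservers on the host with the ones a fresh container is handed. This is a minimal sketch, not something from the original incident: the function names are made up for illustration, and `busybox` is assumed only because it is a small image that ships `cat`.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Docker.

# Print the nameserver addresses from a resolv.conf-style file, one per line.
nameservers_in() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Print the nameservers that a brand-new container is handed.
# Assumption: the busybox image is available locally or can be pulled.
container_nameservers() {
    docker run --rm busybox cat /etc/resolv.conf |
        awk '$1 == "nameserver" { print $2 }'
}

# Compare the host's view with the container's view and report the result.
compare_dns() {
    host_ns=$(nameservers_in /etc/resolv.conf)
    ctr_ns=$(container_nameservers)
    if [ "$host_ns" = "$ctr_ns" ]; then
        echo "match: host and container list the same nameservers"
    else
        echo "mismatch:"
        # Unquoted on purpose so multi-line lists are joined onto one line.
        echo "  host:      $(echo $host_ns)"
        echo "  container: $(echo $ctr_ns)"
    fi
}
```

Run `compare_dns` on the affected host. Treat a mismatch as a hint rather than proof: containers on user-defined networks list Docker's embedded resolver at `127.0.0.11`, and loopback resolvers on the host are not passed through to containers as-is.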
Heads Up: This gets even more complicated on modern systems using `systemd-resolved`, where `/etc/resolv.conf` is often a symlink to a stub file like `/run/systemd/resolve/stub-resolv.conf`. Directly editing the file behind that symlink is a recipe for disaster: your changes will be overwritten on the next reboot or network service restart.
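Before touching the file at all, it is worth checking which situation a host is in. Here is a rough sketch of that check; the function name `describe_resolv_conf` is hypothetical, and it assumes a `readlink` that supports `-f` (GNU coreutils and BusyBox both do).

```shell
#!/bin/sh
# Sketch only: report whether a resolv.conf path is a symlink, a plain file,
# or missing. Defaults to /etc/resolv.conf when no argument is given.
describe_resolv_conf() {
    path=${1:-/etc/resolv.conf}
    if [ -L "$path" ]; then
        # readlink -f follows the link chain to the final target.
        echo "symlink -> $(readlink -f "$path")"
    elif [ -f "$path" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}
```

If this reports a symlink into `/run/systemd/resolve/`, the file is managed for you and hand edits are unlikely to stick.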
The Fixes: From Triage to Architecture
Okay, you’re in the hot seat and builds are failing. Let’s get this fixed. We have a few options, ranging from “get it working NOW” to “let’s make sure this never happens again.”
1. The Quick Fix: “Turn It Off and On Again”
This is your emergency-level, get-production-unblocked fix. The goal is to simply force the Docker daemon to re-read the now-correct `/etc/resolv.conf` file. It’s crude, but it’s fast and effective.
Just restart the Docker service:
sudo systemctl restart docker
This will cause a brief interruption for any containers running on that host, but it will immediately solve the DNS issue for any newly launched containers. In our incident with `ci-runner-pool-04`, this is exactly what we did to get the pipeline flowing again in under 60 seconds.
2. The Permanent Fix: “Configure the Daemon Properly”
Relying on the host’s `resolv.conf` is brittle. A better, more architectural solution is to tell the Docker daemon exactly which DNS servers to use, permanently. This makes your container networking independent of the host’s transient configuration. You do this by creating or editing the Docker daemon’s configuration file at `/etc/docker/daemon.json`.
Add a `dns` key with your company’s DNS servers (and maybe a public fallback):
{
  "dns": ["10.1.1.5", "10.1.1.6", "8.8.8.8"]
}
After saving this file, you still need to restart the daemon for it to take effect, but you only have to do it once. A reload is not enough here, because the `dns` setting is only applied on a full restart.
sudo systemctl restart docker
Now, every container started by this daemon will automatically use these DNS servers, regardless of what’s in `/etc/resolv.conf`.
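One thing to guard against: a syntax error in `daemon.json` will stop the daemon from starting at all, which turns a DNS problem into a bigger outage. A minimal sketch of a pre-restart check follows. It assumes `python3` is installed on the host and uses it purely as a JSON parser; both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Return 0 if the file parses as JSON, non-zero otherwise.
# Assumption: python3 is on PATH; json.tool is used only as a parser.
is_valid_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Restart Docker only when the config file parses cleanly.
safe_restart_docker() {
    config=${1:-/etc/docker/daemon.json}
    if is_valid_json "$config"; then
        sudo systemctl restart docker
    else
        echo "refusing to restart: $config is not valid JSON" >&2
        return 1
    fi
}
```

Calling `safe_restart_docker` with no arguments checks `/etc/docker/daemon.json`. Note that this only catches malformed JSON, not misspelled or unsupported keys.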
3. The ‘Nuclear’ Option: “Per-Container Overrides”
Sometimes you can’t touch the daemon configuration. Maybe it’s a locked-down environment, or you only need a specific DNS for one particular application container trying to reach a special database like `prod-db-01.internal`. In these cases, you can specify the DNS server at runtime.
For a single `docker run` command:
docker run --dns=10.1.1.5 my-app-image
Or in a `docker-compose.yml` file:
version: '3.8'
services:
  webapp:
    image: my-app-image
    dns:
      - 10.1.1.5
      - 8.8.8.8
I call this the “nuclear” option because it’s a bit of a hack. It creates configuration drift and requires every developer to remember to add this flag. It solves the immediate problem for that one container, but it doesn’t fix the underlying platform issue. Use it sparingly.
Choosing Your Path
To wrap it up, here’s a quick cheat sheet for when to use each approach.
| Solution | Best For | Downside |
|---|---|---|
| 1. Restart Daemon | Emergency incident response. | Doesn’t prevent the problem from happening again. |
| 2. Configure daemon.json | Permanent, architectural solution for your hosts. | Requires root access and a service restart. |
| 3. Per-Container Flag | One-off exceptions or locked-down environments. | Brittle, high maintenance, easy to forget. |
The lesson here isn’t that AI is bad. It’s that tools provide answers, but they often lack context. They don’t know about your stateful Docker daemon or your complex internal network. Understanding the “why” behind a problem is what separates a junior engineer following a script from a senior engineer who can build a resilient system.
🤖 Frequently Asked Questions
❓ Why do my Docker containers fail DNS resolution when the host’s /etc/resolv.conf has been updated?
Docker containers fail DNS resolution because the Docker daemon reads and caches the host’s /etc/resolv.conf file only at startup. Any changes made to /etc/resolv.conf after the daemon is running will not be reflected in new containers until the daemon is restarted.
❓ What are the different methods to fix Docker container DNS resolution issues and their trade-offs?
The quick fix is to restart the Docker daemon (sudo systemctl restart docker), which is fast but does not prevent a recurrence. A permanent solution is to configure DNS servers in /etc/docker/daemon.json, providing host-independent resolution. A ‘nuclear’ option is per-container DNS settings (the --dns flag in docker run or the dns key in docker-compose), suitable for specific cases but prone to configuration drift.
❓ What is a common pitfall when attempting to resolve Docker DNS issues by editing /etc/resolv.conf?
A common pitfall is directly editing /etc/resolv.conf on systems using systemd-resolved, as it’s often a symlink to a stub file; direct edits can be reverted or cause further issues. More importantly, the Docker daemon will not pick up these changes without a restart, so the edit has no effect on the containers it launches.