🚀 Executive Summary
TL;DR: When NGINX acts as a reverse proxy for Docker Compose services, it resolves the service name once, when it loads its configuration, and keeps those IP addresses, so traffic never reaches replicas added later. The fix is to make NGINX re-resolve the name at runtime: point the `resolver` directive at Docker’s embedded DNS (`127.0.0.11`) and either pass the service name to `proxy_pass` through a variable or mark the upstream server with `resolve`.
🎯 Key Takeaways
- NGINX performs a one-time DNS lookup for upstream services and caches the resolved IP, leading to traffic being sent to only one Docker Compose replica even when the service is scaled.
- The recommended permanent fix is to remove the NGINX `upstream` block, add `resolver 127.0.0.11 valid=10s;`, and pass the Docker Compose service name to `proxy_pass` through a variable (e.g., `set $api_upstream http://api-service:8000; proxy_pass $api_upstream;`), so NGINX looks the name up at request time instead of once at startup.
- For advanced NGINX upstream features or maximum resilience, keep the `upstream` block, give it a shared-memory `zone`, and add the `resolve` parameter to the server line alongside `resolver 127.0.0.11 valid=10s;`. In open-source NGINX this requires version 1.27.3 or later; before that it was an NGINX Plus feature.
Your Docker Compose service scales up, but NGINX only sends traffic to one container because it cached the DNS answer it got at startup. The fix is to make NGINX re-resolve the service name at runtime through Docker’s embedded DNS.
Why Aren’t All My Docker Compose Replicas Getting Traffic From NGINX? A DevOps War Story
I remember it like it was yesterday. It was 2 AM, and my phone was buzzing off the nightstand. PagerDuty, of course. A new feature had just deployed, and our API was getting hammered. “No problem,” I thought, “we built this to scale.” I ran the magic command: docker-compose up -d --scale api-service=10. I watched the logs as ten shiny new containers spun up. But the alerts didn’t stop. The latency charts were still screaming. Digging in, I saw it: only one of our ten containers was actually taking traffic. The other nine were just sitting there, burning CPU cycles and doing absolutely nothing. If you’ve hit this wall, don’t worry. We’ve all been there. It’s an infuriating rite of passage, and it comes down to a simple, fundamental misunderstanding of how Docker and NGINX talk to each other.
The Root of the Problem: NGINX’s One-Time-Only DNS Lookup
Here’s the deal. When your NGINX container starts, it reads its configuration file. It sees something like proxy_pass http://api-service:8000;. NGINX, trying to be efficient, does a DNS lookup for “api-service” right then, while loading the config. Docker’s internal DNS kindly responds with the addresses of the containers that exist at that moment. With a single replica running, that is one IP, let’s say 172.18.0.5.
NGINX then caches that IP address. For the rest of its life.
When you scale up to 10 replicas, Docker’s DNS knows about all ten IPs. But NGINX doesn’t care. It’s already made up its mind. It’s going to send every single request to 172.18.0.5 until it’s restarted. This is why your first container is melting while the others are on vacation.
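To see the mismatch for yourself, you can ask Docker's DNS what it returns right now and compare that with where the traffic is going. This is a minimal sketch, assuming the proxy service is called `nginx-proxy` (as in the Compose file used later in this post) and that `getent` exists in the image, which is normally the case for the Debian-based `nginx:latest`. The helper name `service_ips` and the `COMPOSE` variable are made up for illustration.

```shell
# Which Compose CLI to call; override with COMPOSE="docker compose" if needed.
COMPOSE="${COMPOSE:-docker-compose}"

# service_ips SERVICE [PROXY]
# Prints the distinct IPv4 addresses that Docker's DNS returns for SERVICE,
# looked up from inside the PROXY container (default: nginx-proxy).
service_ips() {
  # -T disables TTY allocation so the output can be piped cleanly.
  $COMPOSE exec -T "${2:-nginx-proxy}" getent ahostsv4 "$1" \
    | awk '{ print $1 }' | sort -u
}

# Usage (after scaling): service_ips api-service
```

If this prints ten addresses while only one container is busy, NGINX is still working from the lookup it did at startup.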
The Solutions: From a Quick Kick to a Permanent Fix
Alright, enough theory. Let’s get this fixed so you can get back to building things (or sleeping). I’ll give you three ways to solve this, from the “I need this working 5 minutes ago” band-aid to the “set it and forget it” permanent solution.
1. The Quick Fix: “Have You Tried Turning It Off and On Again?”
I’m not kidding. The fastest way to get all your replicas in the game is to simply restart NGINX after you’ve scaled your application.
# First, scale your app service
docker-compose up -d --scale api-service=5
# THEN, give the NGINX container a kick
docker-compose restart nginx-proxy
Why it works: Restarting the NGINX process forces it to re-read its configuration and, crucially, perform a new DNS lookup for api-service. This time, Docker’s DNS will return a list of all available IPs, and NGINX’s upstream module will begin to round-robin traffic between them. It’s hacky, requires manual intervention, and isn’t a real solution, but it’ll get you out of a 2 AM jam.
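If a full restart feels too heavy-handed, a reload should have the same effect on DNS: NGINX re-reads its configuration, resolves the upstream names again, and starts fresh workers without dropping in-flight connections. A sketch, with `reload_proxy` and `COMPOSE` as illustrative names and `nginx-proxy` assumed to be the proxy service:

```shell
# Which Compose CLI to call; override with COMPOSE="docker compose" if needed.
COMPOSE="${COMPOSE:-docker-compose}"

# reload_proxy [PROXY]
# Validates the configuration, then signals NGINX to reload it.
# The reload is skipped if the syntax check fails.
reload_proxy() {
  proxy="${1:-nginx-proxy}"
  $COMPOSE exec -T "$proxy" nginx -t && $COMPOSE exec -T "$proxy" nginx -s reload
}

# Usage: reload_proxy
```

It is still a manual step after every scale event, so it shares the main drawback of the restart.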
2. The Permanent Fix: Let Docker Do The Work
This is the solution you should be using 99% of the time. The secret is that Docker’s embedded DNS server doesn’t just return one IP for a Docker Compose service name (e.g., api-service). It returns the address of every running replica. The trick is getting NGINX to ask again instead of trusting the answer it got at startup, and NGINX only does that when a resolver is configured and the proxy_pass target contains a variable.
Let’s look at a typical broken setup.
docker-compose.yml:
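version: '3.8'
services:
  api-service:
    build: ./api
    # No ports exposed to the host, NGINX will handle it
  nginx-proxy:
    image: nginx:latest
    volumes:
      # Mounted into conf.d so the image's stock nginx.conf includes it
      # inside its http block (a bare server block is not valid at top level).
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
    ports:
      - "80:80"
    depends_on:
      - api-service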
The broken nginx.conf:
# This is the problematic part
upstream backend {
    server api-service:8000; # NGINX caches this to a single IP!
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}
To fix this, we drop the upstream block, point NGINX at Docker’s DNS, and pass the service name to proxy_pass through a variable. A literal hostname in proxy_pass is resolved once at startup, exactly like the upstream block. The variable is what makes NGINX resolve the name while it handles requests.
The CORRECT nginx.conf:
server {
    listen 80;

    # Docker's embedded DNS; trust each answer for at most 10 seconds.
    resolver 127.0.0.11 valid=10s;

    location / {
        # A variable in proxy_pass makes NGINX resolve the name at request
        # time through the resolver above, instead of once at startup.
        set $api_upstream http://api-service:8000;
        proxy_pass $api_upstream;
    }
}
By removing the upstream block and resolving the service name at request time, you are tapping into Docker’s built-in service discovery. Each lookup returns the current set of replicas and NGINX spreads requests across them, so new containers start receiving traffic within the valid window. Simple, clean, and it just works.
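A quick way to check that traffic is actually spreading out is to fire a batch of requests at the proxy and count who answered. This sketch assumes your API exposes some endpoint that returns an identifier unique to the container, such as its hostname. The `/whoami` path below is purely hypothetical, as is the helper name `count_backends`.

```shell
# count_backends URL [N]
# Sends N requests (default 20) to URL and prints how many responses came
# from each distinct backend, assuming the body identifies the container.
count_backends() {
  url="$1"
  n="${2:-20}"
  i=0
  while [ "$i" -lt "$n" ]; do
    # -s silences curl's progress output; printf emits one line per response.
    printf '%s\n' "$(curl -s "$url")"
    i=$((i + 1))
  done | sort | uniq -c
}

# Usage: count_backends http://localhost/whoami 50
```

A single line of output means one container is still taking everything. Several lines with roughly similar counts mean the replicas are sharing the load.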
3. The Bulletproof Fix: The ‘resolver’ Directive
Sometimes, the simple fix isn’t enough. Maybe you have a complex upstream block with weights or specific settings, or maybe you’re just paranoid. In these cases, you can tell NGINX to be smarter about its DNS resolution. You can explicitly tell it to use Docker’s internal DNS server and re-resolve the name periodically.
Pro Tip: Docker’s internal DNS resolver is always available at the IP address 127.0.0.11 from within any container in a user-defined network. It’s a magic number worth remembering.
Here’s how you implement it. You keep your upstream block, but you give it a shared-memory zone, mark the server with the resolve parameter, and add a resolver directive. A resolver on its own does nothing for names inside an upstream block. One catch: in open-source NGINX, resolve needs version 1.27.3 or later; before that it was an NGINX Plus feature.
The BULLETPROOF nginx.conf:
upstream backend {
    # Shared memory zone, required for 'resolve'
    zone backend 64k;

    # The magic line:
    # Tell NGINX to use Docker's DNS (127.0.0.11) and re-check every 10 seconds.
    resolver 127.0.0.11 valid=10s;

    # 'resolve' makes NGINX track DNS changes for this name
    server api-service:8000 resolve;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}
Why it works: The resolver directive tells NGINX which DNS server to use (Docker’s internal one) and how long to trust an answer (the valid=10s part). The resolve parameter tells it to keep watching the name and update the upstream group when the answer changes. This means if you scale your service up or down, within about 10 seconds NGINX will get the new list of IPs and adjust its load balancing pool accordingly. This is the most resilient solution and is great for environments where containers might be replaced or scaled frequently.
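Whether NGINX will re-resolve names inside an `upstream` block depends on the version you are running: as far as I know, open-source NGINX only gained that ability (through the `resolve` server parameter) in 1.27.3, and earlier releases reserved it for NGINX Plus. Here is a small sketch for checking the image you deploy. The helper name is made up, and it relies on `sort -V` (GNU coreutils or a recent BusyBox).

```shell
# has_upstream_resolve VERSION
# Succeeds if VERSION (e.g. "1.27.3") is at least 1.27.3, the first
# open-source release where 'server ... resolve' is available.
has_upstream_resolve() {
  lowest="$(printf '%s\n' "1.27.3" "$1" | sort -V | head -n 1)"
  [ "$lowest" = "1.27.3" ]
}

# Usage: 'nginx -v' prints e.g. "nginx version: nginx/1.27.3" on stderr.
# ver="$(docker-compose exec -T nginx-proxy nginx -v 2>&1 | sed 's|.*nginx/||' | tr -d '\r')"
# has_upstream_resolve "$ver" && echo "resolve is supported"
```

On an older image, the variable-in-proxy_pass approach is the one to fall back on.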
Which One Should I Use?
Let’s break it down.
| Solution | When to Use It | Pros | Cons |
|---|---|---|---|
| 1. Restart NGINX | In a production emergency when you need a quick fix NOW. | Fastest way to solve an immediate outage. | Manual, not a real solution, doesn’t handle containers dying. |
| 2. Use Service Name Directly | This should be your default. For 99% of Docker Compose setups. | Simple, clean, idiomatic, leverages built-in Docker features. | Doesn’t allow for complex NGINX upstream features like weights. |
| 3. Use ‘resolver’ Directive | When you need advanced upstream features (weights, keepalives) or maximum resilience. | Highly resilient, handles scaling and container replacement automatically. | Slightly more complex configuration, introduces a small delay (the ‘valid’ time) in service discovery, and re-resolving names inside an upstream block needs NGINX 1.27.3+ or NGINX Plus. |
So next time your scaled services aren’t getting traffic, don’t panic. Take a deep breath, check your NGINX config, and remember that sometimes the most frustrating problems come from a simple, cached IP address. Happy deploying!
🤖 Frequently Asked Questions
❓ Why does NGINX only send traffic to one Docker container after scaling?
NGINX caches the IP address of the first Docker container it resolves for a service, preventing it from discovering and routing traffic to additional replicas that are scaled up later, unless NGINX is restarted or explicitly configured for dynamic resolution.
❓ How do the different NGINX configuration solutions compare for Docker Compose load balancing?
Passing the service name to `proxy_pass` through a variable, with a `resolver` pointing at Docker’s DNS, is the simplest setup that follows scaling events. An `upstream` block with the `resolve` parameter offers more control (weights, keepalives) for complex configurations. Restarting NGINX is a temporary, manual workaround for immediate issues.
❓ What is a common implementation pitfall when configuring NGINX with Docker Compose services?
A common pitfall is defining an `upstream` block in NGINX with `server api-service:8000;` (or using a literal hostname in `proxy_pass`) and expecting NGINX to notice new replicas. NGINX resolves ‘api-service’ once, when it loads the configuration, so containers added later receive no traffic until NGINX is reloaded, leading to uneven traffic distribution.