🚀 Executive Summary
TL;DR: When encountering ‘Could not resolve host’ errors for Cloudflare-proxied services, the issue is often a stale local DNS cache on your machine, not a global Cloudflare outage. This can be resolved by flushing your operating system’s DNS cache or, in critical situations, temporarily overriding DNS resolution via the `/etc/hosts` file.
🎯 Key Takeaways
- Local DNS resolvers (e.g., systemd-resolved) cache DNS lookups based on TTL, which can lead to your machine holding onto stale or incorrect IP addresses even after global network issues are resolved.
- Flushing your OS’s DNS cache (e.g., `sudo systemd-resolve --flush-caches` on Linux, `ipconfig /flushdns` on Windows, `sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder` on macOS) is the primary, safest, and most effective first step to resolve local DNS resolution failures.
- The `/etc/hosts` file provides a ‘nuclear’ option to manually map hostnames to IP addresses, bypassing DNS entirely for immediate access during critical incidents, but carries a high risk if entries are not removed promptly after use.
When your own tools fail to connect to services behind Cloudflare, it’s often a local DNS caching issue, not a global outage. This guide walks you through diagnosing and fixing the problem, from quick cache flushes to manual host file overrides.
So, You Think Cloudflare is Down Again? (It’s Probably Your DNS)
I remember it vividly. 2:17 AM. PagerDuty screaming bloody murder. Our primary customer-facing dashboard, `status.techresolve.io`, was apparently offline. My heart sank. I VPN’d in, SSH’d to a jump box, and ran a quick cURL. Everything looked fine. The site was up. I checked our external monitoring—all green. But on my own machine? `curl: (6) Could not resolve host: status.techresolve.io`. Absolute panic for a solid five minutes until I remembered an old ghost in the machine: my local DNS cache.
If you’re reading this, you’ve probably been there. You’re trying to hit an API, an internal tool, or even a public site that uses Cloudflare, and your machine is the only one in the world that can’t see it. Before you jump on Slack and declare a P1 incident, let’s walk through why this happens and how to fix it, from the easy way to the “break glass in case of emergency” way.
The “Why”: Your Computer is Holding a Grudge
The root cause is almost always your operating system’s local DNS resolver. On modern Linux systems, this is often `systemd-resolved`. When you ask to connect to `api.techresolve.io`, your OS does the following:
- Checks its local cache to see if it already knows the IP address.
- If not, it asks the upstream DNS server (like your router, 1.1.1.1, or 8.8.8.8).
- It gets the IP back and—this is the important part—caches the result for a certain amount of time (the TTL, or Time To Live).
If there was a momentary blip where Cloudflare’s DNS or network had an issue, your resolver might have cached a failure or a bad IP. Even after Cloudflare fixed the problem for the rest of the world, your machine will stubbornly cling to that bad cached entry until the TTL expires. You’re living in the past, and it’s keeping you from your services.
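The caching behavior described above can be boiled down to a tiny sketch. This is illustrative pseudologic, not how `systemd-resolved` is actually implemented: an entry cached at time T with TTL t keeps being served until T + t, regardless of what has changed upstream.

```shell
# Minimal sketch of the TTL rule a caching resolver applies (illustrative only):
# an answer cached at $cached_at with TTL $ttl is served from cache until it expires.
is_cache_fresh() {
  cached_at=$1; ttl=$2; now=$3
  [ $((now - cached_at)) -lt "$ttl" ]
}

# Entry cached at t=1000 with a 300-second TTL:
is_cache_fresh 1000 300 1200 && echo "still served from cache (even if stale)"
is_cache_fresh 1000 300 1400 || echo "expired; resolver re-queries upstream"
```

If a bad answer gets cached at t=1000, nothing upstream can help you until the TTL runs out; that window is exactly what a manual flush cuts short.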
The Solutions: From a Gentle Nudge to a Sledgehammer
Here are three ways to deal with this, in order of my personal preference. We’ll compare them in a table later.
1. The Quick Fix: “The DNS Flush”
This is the DevOps equivalent of “turning it off and on again.” You’re just telling your OS to forget everything it thinks it knows about DNS lookups and start fresh. It’s safe, fast, and solves the problem 90% of the time.
On Linux (with systemd):
# Modern systemd (v239 and later):
sudo resolvectl flush-caches
# Older releases use the legacy name:
sudo systemd-resolve --flush-caches
You can verify it worked with:
resolvectl statistics   # or, on older systemd: systemd-resolve --statistics
On macOS:
sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder
On Windows:
ipconfig /flushdns
After running the appropriate command, try to access the resource again. It will likely just work.
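If you bounce between machines a lot, the per-OS commands above can be wrapped in one helper. This is a hypothetical convenience function (the name `flush_cmd_for` is mine, not a standard tool), and it assumes `systemd-resolved` on Linux; adjust for `dnsmasq` or `nscd` setups:

```shell
# Hypothetical helper: return the DNS-flush command for a given kernel name,
# as reported by `uname -s`. Unknown platforms get "unknown".
flush_cmd_for() {
  case "$1" in
    Linux)  echo "sudo resolvectl flush-caches" ;;
    Darwin) echo "sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder" ;;
    *)      echo "unknown" ;;
  esac
}

# Run the right one for this machine:
# eval "$(flush_cmd_for "$(uname -s)")"
```

Keeping the dispatch in a function also makes the logic easy to drop into a team runbook or dotfiles.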
2. The Diagnostic Step: “Ask Someone Else”
If the flush didn’t work, it’s time to put on your detective hat. Is the problem your local cache, your upstream DNS provider (like your ISP), or is it actually Cloudflare? We can bypass your local resolver entirely using tools like `dig` or `nslookup` to ask a public DNS server directly.
Let’s query Cloudflare’s own DNS server (1.1.1.1) and Google’s (8.8.8.8) to see what they think the IP for our service is.
# Ask Cloudflare's DNS directly
dig @1.1.1.1 api.techresolve.io
# Ask Google's DNS directly
dig @8.8.8.8 api.techresolve.io
If both of these commands return a valid IP address in the `ANSWER SECTION`, you have solid confirmation that the domain is resolving correctly out on the internet. The problem is isolated to your local machine’s configuration.
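To compare the two resolvers mechanically rather than eyeballing dig output, a small filter helps. The helper name `a_records` is my own; it just keeps lines that look like IPv4 addresses, dropping any CNAME targets that `dig +short` prints first:

```shell
# Hypothetical helper: keep only IPv4 addresses from dig +short output,
# so answers from two resolvers can be diffed or compared directly.
a_records() {
  grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage (network required), e.g. against a Cloudflare-proxied hostname:
# dig @1.1.1.1 +short api.techresolve.io | a_records
# dig @8.8.8.8 +short api.techresolve.io | a_records
```

If both resolvers return overlapping address sets, the public DNS picture is healthy and your local resolver is the suspect.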
3. The ‘Nuclear’ Option: The `/etc/hosts` Override
Okay, you’re in the middle of an incident, you don’t have time to figure out *why* your resolver is broken, you just need to connect to `api.techresolve.io` *right now*. This is where the `/etc/hosts` file comes in. It’s a plain text file that lets you manually map hostnames to IP addresses, bypassing DNS entirely.
First, use the `dig` command from the previous step to get the correct IP address. Let’s say it’s `104.18.25.123`.
Next, edit your hosts file (you’ll need `sudo`):
sudo nano /etc/hosts
And add a line at the bottom:
# Temporary fix for incident INC-123, REMOVE AFTER
104.18.25.123 api.techresolve.io
CRITICAL WARNING: This is a powerful but dangerous tool. The IP addresses for Cloudflare-proxied sites can and do change. If you forget to remove this line, you will be pointing to a stale IP address in a week, a month, or a year, and you will cause another, even more confusing outage for yourself. Always add a comment explaining why the line is there and remove it as soon as the crisis is over.
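One way to make the cleanup step harder to forget is to tag every temporary entry with a fixed marker string so it can be found and stripped mechanically. This is a hypothetical convention (the marker and function names are mine); the functions take the hosts file as an argument so you can test them on a scratch file before pointing them at `/etc/hosts` with `sudo`:

```shell
# Hypothetical convention: tag temporary hosts entries with a marker comment
# so they can be located and removed in one pass after the incident.
MARKER="# TEMP-DNS-OVERRIDE"

add_override() {       # add_override <hosts-file> <ip> <hostname> <ticket>
  printf '%s %s %s %s\n' "$2" "$3" "$MARKER" "$4" >> "$1"
}

remove_overrides() {   # remove_overrides <hosts-file>
  sed -i.bak "/$MARKER/d" "$1"   # keeps a .bak copy, just in case
}

# Real use (with sudo):
# sudo sh -c '. ./hosts-helpers.sh; add_override /etc/hosts 104.18.25.123 api.techresolve.io INC-123'
```

`/etc/hosts` treats everything after `#` on a line as a comment, so the marker and ticket number ride along harmlessly until `remove_overrides` sweeps them out.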
Comparing the Approaches
Let’s break down when to use each method.
| Method | When to Use It | Risk Level |
|---|---|---|
| 1. DNS Flush | Always the first step. It’s fast, easy, and non-destructive. | Very Low |
| 2. Diagnostic (`dig`) | When the flush fails. Use this to prove the problem is local. | None (Read-only command) |
| 3. `/etc/hosts` Override | During a critical incident when you need access *immediately* and other methods have failed. | High (If you forget to remove the entry) |
Final Thoughts
That 2 AM PagerDuty call taught me a valuable lesson: before you declare that the sky is falling, always ask, “Is it down for everyone, or just for me?” Nine times out of ten, a quick DNS cache flush is all you need to get back online. Understanding how to diagnose and override your local DNS is a fundamental skill that separates a junior from a senior engineer. It saves you time, saves your team from false alarms, and ultimately, lets you get back to sleep faster.
🤖 Frequently Asked Questions
❓ Why am I getting ‘Could not resolve host’ errors for a Cloudflare site when others can access it?
This typically indicates a local DNS caching issue on your machine. Your operating system’s DNS resolver might be holding onto a stale or incorrect IP address for the domain, preventing resolution even if Cloudflare’s services are fully operational globally.
❓ How do the different DNS troubleshooting methods compare in terms of effectiveness and risk?
DNS flushing is the safest and most common first step, resolving most issues with very low risk. Diagnostic tools like `dig` or `nslookup` are read-only and confirm if the problem is local or upstream with no risk. The `/etc/hosts` override is a high-risk, temporary ‘break glass’ solution for critical incidents, bypassing DNS entirely but requiring diligent removal to prevent future issues.
❓ What is a common implementation pitfall when using the `/etc/hosts` file for DNS resolution?
The most critical pitfall is forgetting to remove temporary entries from `/etc/hosts`. Cloudflare-proxied IP addresses can and do change, and a stale entry will cause future connectivity failures, making diagnosis much harder. Always add comments and remove entries promptly after the crisis is over.