🚀 Executive Summary

TL;DR: Prometheus’s `azure_sd_configs` fails to discover Azure App Services because it’s designed for IaaS Virtual Machines, not PaaS Web Apps. The recommended solution involves implementing file-based service discovery, where a script queries Azure APIs for tagged Web Apps and generates a Prometheus-readable JSON target file.

🎯 Key Takeaways

  • `azure_sd_configs` is specifically designed to discover IaaS resources like Virtual Machines, not PaaS offerings such as Azure App Services.
  • Azure App Services lack dedicated NICs or static private IPs like VMs, instead being exposed via FQDNs or private endpoints, which `azure_sd_configs` does not query.
  • File-based service discovery, where a script uses the Azure CLI to find tagged Web Apps and writes a JSON target file for `file_sd_configs`, is the ‘right’ and scalable solution for most use cases.

Can azure_sd_configs reach Web Apps?

Struggling to make Prometheus discover your Azure App Services using `azure_sd_configs`? This guide cuts through the confusion, explaining why it fails and providing three real-world, battle-tested solutions to get your metrics flowing.

Prometheus Can’t See My Azure Web App: A Field Guide to Fixing `azure_sd_configs`

I still remember the 2 AM alert. It was a Tuesday. It’s always a Tuesday. PagerDuty was screaming that our primary e-commerce checkout service, `prod-checkout-webapp`, had vanished from our dashboards. Grafana was a sea of “No Data” panels. My first thought was that the app was down, but a quick check showed it was serving traffic just fine. The problem? Prometheus, our all-seeing eye, had gone blind to it. A junior engineer had just migrated the service from a VM to an Azure App Service, assuming our standard Prometheus `azure_sd_configs` would pick it up automatically. It didn’t. That night, I learned a valuable, sleep-deprived lesson about the assumptions we make with cloud tooling.

The “Why”: A Tale of Two Azure APIs

This is one of those problems that feels like it should be simple. You point a discovery tool at Azure, and it should find… well, Azure things. Right? The root of the problem is that `azure_sd_configs` is designed to discover IaaS resources, specifically Virtual Machines.

Under the hood, it queries Azure APIs that list VMs and their associated network interfaces to find IP addresses. Azure App Services, being a PaaS (Platform-as-a-Service) offering, don’t live in that world. They don’t have a dedicated NIC or a static private IP in the same way a VM does. They exist in a managed “App Service Plan” and are exposed via a public FQDN or through private endpoints, which are different resource types altogether. Simply put, `azure_sd_configs` is knocking on the front door asking for the VM list, while your Web App lives in a penthouse apartment with a separate entrance.
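
One way to see the split for yourself is to list both resource types side by side. The sketch below assumes the Azure CLI is installed and logged in; `list_resource_names` is a hypothetical helper name, not something from Prometheus or Azure.

```shell
#!/bin/bash
# List the names of all resources of one ARM resource type in the
# current subscription. Assumes `az login` has already been done.
list_resource_names() {
  local resource_type="$1"
  az resource list --resource-type "$resource_type" --query "[].name" --output tsv
}

# Usage (run these yourself; they call Azure):
#   list_resource_names "Microsoft.Compute/virtualMachines"   # the VM world azure_sd_configs looks at
#   list_resource_names "Microsoft.Web/sites"                 # App Services live under a different type
```

If your Web App only shows up in the second list, that is the whole story: it is a `Microsoft.Web/sites` resource, not a Virtual Machine.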

Pro Tip: Never assume a cloud provider’s discovery mechanism for one service type (like VMs) will magically work for another (like PaaS Web Apps). Always check the documentation for what resource types are explicitly supported.

The Fixes: From Duct Tape to a New Engine

So, how do we get Prometheus to see our Web App? We have a few options, ranging from a quick fix to get you through the night to a proper, scalable solution. I’ve used all three in different situations.

1. The Quick Fix: Static Configs to the Rescue

This is the duct tape solution. It’s fast, it’s ugly, but it will stop the bleeding at 2 AM. You explicitly tell Prometheus where to find the target by adding its address to a `static_configs` block in your `prometheus.yml`.

Let’s say your web app is `https://prod-checkout-webapp.azurewebsites.net` and it exposes a `/metrics` endpoint. You’d add this to your scrape job:

- job_name: 'azure-webapps-static'
  metrics_path: /metrics
  scheme: https
  static_configs:
    - targets: ['prod-checkout-webapp.azurewebsites.net']
      labels:
        app: 'checkout-service'
        env: 'production'

Warning: This is a brittle solution! If the URL changes, or if you spin up new environments, you have to manually edit this file. Use this to restore monitoring immediately, but plan to replace it with Solution 2 as soon as you’ve had some coffee.
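
Before trusting the static target, it can help to confirm from the Prometheus host that the endpoint really answers with metrics. This is a minimal sketch, assuming `curl` is available and the app serves the standard text exposition format; `check_metrics_endpoint` is a hypothetical helper name.

```shell
#!/bin/bash
# Fetch https://<host><path> and check that the body looks like
# Prometheus text format (it normally contains "# HELP" / "# TYPE" lines).
check_metrics_endpoint() {
  local host="$1" path="${2:-/metrics}" body
  # -f: treat HTTP errors as failures, -sS: quiet but still print errors
  body=$(curl -fsS --max-time 10 "https://${host}${path}") || {
    echo "FAIL: could not fetch https://${host}${path}" >&2
    return 1
  }
  if printf '%s\n' "$body" | grep -Eq '^# (HELP|TYPE) '; then
    echo "OK: ${host}${path} looks like Prometheus metrics"
  else
    echo "WARN: ${host}${path} responded, but no HELP/TYPE lines found" >&2
    return 2
  fi
}

# Usage:
#   check_metrics_endpoint prod-checkout-webapp.azurewebsites.net
```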

2. The ‘Right’ Way: File-Based Service Discovery

This is my preferred method for 90% of use cases. It combines the flexibility of dynamic discovery with the simplicity of a standard Prometheus feature: `file_sd_configs`. The idea is to have a separate process that queries the Azure API for your Web Apps and writes the targets to a JSON file that Prometheus reads.

Step 1: Tag your resources.
In Azure, add a tag to the App Services you want to monitor. Something like `prometheus-scrape: "true"`.
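
If you would rather tag from the command line than the portal, something along these lines should work. It is a sketch, assuming the Azure CLI is logged in; `tag_webapp_for_scrape` is a hypothetical helper name, and the `Merge` operation is chosen so existing tags on the app are kept.

```shell
#!/bin/bash
# Add the prometheus-scrape=true tag to one App Service without
# removing the tags it already has.
tag_webapp_for_scrape() {
  local rg="$1" name="$2" id
  # Look up the full resource ID of the web app.
  id=$(az webapp show --resource-group "$rg" --name "$name" --query id --output tsv) || return 1
  # Merge the new tag into the existing tag set.
  az tag update --resource-id "$id" --operation Merge --tags prometheus-scrape=true
}

# Usage (the resource group name here is a placeholder):
#   tag_webapp_for_scrape my-resource-group prod-checkout-webapp
```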

Step 2: Create a discovery script.
This can be a simple shell script using the Azure CLI, running on a cron job every 5-10 minutes. This script finds all App Services with your tag and formats them into a JSON file.

Here’s a bare-bones `generate-webapp-targets.sh` script:

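#!/bin/bash
# A simple script to generate a file_sd_config for Azure Web Apps
# WARNING: This requires the Azure CLI (logged in) and jq to be installed!
# Stop on any error so a failed az/jq call never replaces a good target file.
set -euo pipefail

TARGET_FILE="/etc/prometheus/targets/azure_webapps.json"
# Create the temp file next to the target so the final mv is a rename
# on the same filesystem (a temp file under /tmp may not be).
TEMP_FILE=$(mktemp "${TARGET_FILE}.XXXXXX")
trap 'rm -f "$TEMP_FILE"' EXIT

# Query Azure for all web apps with the 'prometheus-scrape' tag
# and format the output as a JSON array of objects for file_sd.
az webapp list --output json \
  --query "[?tags.\"prometheus-scrape\"=='true'].{host:defaultHostName, name:name, rg:resourceGroup}" | \
jq '[.[] | { "targets": [.host], "labels": { "instance": .name, "job": "azure-webapp", "resource_group": .rg } }]' > "$TEMP_FILE"

# mktemp creates the file with mode 0600; make it readable by Prometheus
chmod 644 "$TEMP_FILE"

# Atomically replace the old file with the new one
mv "$TEMP_FILE" "$TARGET_FILE"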
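
For reference, with a single tagged app the generated file should look roughly like the following. The hostname is the example app from earlier in this guide; the resource group name `rg-prod-checkout` is a made-up placeholder.

```json
[
  {
    "targets": [
      "prod-checkout-webapp.azurewebsites.net"
    ],
    "labels": {
      "instance": "prod-checkout-webapp",
      "job": "azure-webapp",
      "resource_group": "rg-prod-checkout"
    }
  }
]
```

Note that the `job` label set here takes precedence over the `job_name` of the scrape job that reads the file, so queries will see `job="azure-webapp"`.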

Step 3: Configure Prometheus.
Now, just point a Prometheus job at that generated file.

- job_name: 'azure-webapps-file-sd'
  metrics_path: /metrics
  scheme: https
  file_sd_configs:
    - files:
      - '/etc/prometheus/targets/azure_webapps.json'
      refresh_interval: 2m

This is a robust solution. Targets are added and removed automatically as you create or delete App Services, as long as you’re consistent with your tagging.
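
To confirm that Prometheus actually picked up the generated targets, you can ask its HTTP API. This is a rough sketch, assuming Prometheus listens on `http://localhost:9090` and that `curl` is installed; `list_scrape_urls` is a hypothetical helper name, and the `grep`/`cut` parsing is deliberately crude (use `jq` if you want something sturdier).

```shell
#!/bin/bash
# Print the scrape URL of every active target known to Prometheus.
list_scrape_urls() {
  local prom_url="${1:-http://localhost:9090}"
  curl -fsS "${prom_url}/api/v1/targets?state=active" \
    | grep -o '"scrapeUrl":"[^"]*"' \
    | cut -d'"' -f4 \
    | sort -u
}

# Usage:
#   list_scrape_urls | grep azurewebsites.net
```

If your Web App hostnames show up in that list, discovery is working and any remaining problem is on the scrape side (TLS, auth, or the `/metrics` path).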

3. The ‘Enterprise’ Option: Custom SD Adapters

If you’re running a massive environment, probably on Kubernetes, and using the Prometheus Operator, you might eventually outgrow the simple file-based approach. At this scale, you might consider a more integrated solution.

This involves building or using a “service discovery adapter”: a small application that runs alongside Prometheus and acts as a bridge. It talks to the Azure API on one side and serves the resulting target list to Prometheus on the other (for example over HTTP via `http_sd_configs`), providing targets dynamically without intermediate files.
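
On the Prometheus side, one plausible shape for this is HTTP-based service discovery. The fragment below is only a sketch: it assumes an adapter you have built yourself, and the URL `http://localhost:8080/targets` is a hypothetical endpoint. The adapter would need to return the same JSON structure that `file_sd_configs` reads (a list of objects with `targets` and `labels`).

```yaml
- job_name: 'azure-webapps-http-sd'
  metrics_path: /metrics
  scheme: https
  http_sd_configs:
    # Hypothetical adapter endpoint; Prometheus polls it for the target list.
    - url: 'http://localhost:8080/targets'
      refresh_interval: 1m
```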

Pros:
  • Most scalable and integrated solution.
  • Real-time updates without file I/O lag.
  • Can be extended to discover other non-standard Azure resources.

Cons:
  • Massive overkill for most teams.
  • Adds another piece of software to build, deploy, and maintain.
  • High complexity compared to a simple cron job.

Frankly, I’ve only seen this implemented once or twice. It’s the “nuclear option” for when you have dedicated observability engineers and a scale that justifies the maintenance overhead. For everyone else, Solution 2 is the sweet spot.

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


🤖 Frequently Asked Questions

❓ Why doesn’t `azure_sd_configs` find my Azure Web Apps?

`azure_sd_configs` is built to discover IaaS Virtual Machines by querying Azure APIs for network interfaces. Azure Web Apps are PaaS services that do not have dedicated NICs or static private IPs in the same way, making them undiscoverable by this method.

❓ How does file-based service discovery compare to static configs or custom SD adapters for Azure Web Apps?

File-based service discovery offers dynamic target management and scalability without the brittleness of manual static configs or the high complexity and maintenance overhead of custom SD adapters, positioning it as the optimal balance for most environments.

❓ What is a common implementation pitfall when trying to monitor Azure Web Apps with Prometheus?

A common pitfall is assuming that a discovery mechanism for one cloud service type (e.g., `azure_sd_configs` for VMs) will automatically work for another (e.g., PaaS Web Apps). The solution is to always verify supported resource types and use specific discovery methods like file-based service discovery for App Services.
