Solved: Azure VM Extension Provisioning Failed (Windows/Linux 2025)

Quick Fix Summary

TL;DR

Check the VM Agent status, verify network connectivity to Azure endpoints, and review the extension's detailed status in the Azure portal.

An Azure VM extension provisioning failure (error code VMExtensionProvisioningError) means that an extension such as Custom Script, Monitoring, or Security could not be installed or configured on a virtual machine. This halts automated configuration, security hardening, and monitoring, and can leave the VM in a non-compliant or insecure state.

Diagnosis & Causes

  • Azure VM Agent is stopped or unresponsive.
  • Network security rules block access to required Azure endpoints.
  • Insufficient permissions on the VM or storage account.
  • Invalid or malformed extension settings (e.g., script URI).
  • Underlying VM OS is in a failed or unhealthy state.
Recovery Steps

    Step 1: Diagnose with Azure CLI & Check VM Agent

    First, get the precise error details and verify the health of the Windows Guest Agent (service name WindowsAzureGuestAgent) or the Azure Linux Agent (waagent, packaged as WALinuxAgent).

    bash
    # Get detailed extension status for a specific VM (run from a workstation or Cloud Shell)
    az vm get-instance-view --resource-group MyResourceGroup --name MyVM --query "instanceView.extensions"
    # Check VM Agent status on Linux (the unit is walinuxagent on Ubuntu, waagent on RHEL/CentOS)
    systemctl status walinuxagent
    # Check VM Agent status on Windows (PowerShell, via Azure Serial Console or RDP)
    Get-Service WindowsAzureGuestAgent
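
The full instance view can be long, so it may help to narrow it to the extensions that report an error and to the agent's own status. The following is a minimal sketch, assuming the Azure CLI is installed and logged in; the function names `failed_extensions` and `vm_agent_status` are hypothetical helpers, not part of the CLI.

```shell
# Hypothetical helper: show only extensions that report at least one
# status entry with level "Error" (name, status code, and message).
failed_extensions() {
  az vm get-instance-view \
    --resource-group "$1" --name "$2" \
    --query "instanceView.extensions[?statuses[?level=='Error']].{name:name, code:statuses[0].code, message:statuses[0].message}" \
    --output table
}

# Hypothetical helper: print the VM Agent display status as reported
# by the platform (for example "Ready" or "Not Ready").
vm_agent_status() {
  az vm get-instance-view \
    --resource-group "$1" --name "$2" \
    --query "instanceView.vmAgent.statuses[0].displayStatus" \
    --output tsv
}

# Usage (resource names are placeholders):
#   failed_extensions MyResourceGroup MyVM
#   vm_agent_status   MyResourceGroup MyVM
```

If the agent status is missing or not ready, the agent steps that follow are the likely starting point; if the agent is ready, the extension's own message is usually more informative.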

    Step 2: Force Reinstall the VM Agent (Linux)

    If the agent is corrupted, reinstall it. This is a common fix for persistent extension failures.

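    bash
    # Ubuntu/Debian: --reinstall forces a reinstall even if the package is already present
    sudo apt update && \
      sudo apt install --reinstall walinuxagent -y
    sudo systemctl restart walinuxagent
    # For RHEL/CentOS (use "yum install" instead if the package is missing):
    sudo yum reinstall WALinuxAgent -y
    sudo systemctl restart waagent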
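
Because the service name differs between distributions, a script that runs on mixed fleets needs to pick the right unit. The sketch below shows one way to do that. It assumes Ubuntu and Debian use the `walinuxagent` unit and everything else uses `waagent`, which may not hold for every image; both function names are hypothetical.

```shell
# Map a distro ID (the ID= field of /etc/os-release) to the agent's systemd unit.
# Assumption: Ubuntu and Debian use "walinuxagent"; other distros use "waagent".
agent_service_name() {
  case "$1" in
    ubuntu|debian) echo "walinuxagent" ;;
    *)             echo "waagent" ;;
  esac
}

# Print the distro ID of the current machine, or "unknown" if it cannot be read.
current_distro_id() {
  if [ -r /etc/os-release ]; then
    # Source in a subshell so os-release variables do not leak into this shell
    ( . /etc/os-release && echo "${ID:-unknown}" )
  else
    echo "unknown"
  fi
}

# Usage:
#   sudo systemctl restart "$(agent_service_name "$(current_distro_id)")"
```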

    Step 3: Force Reinstall the VM Agent (Windows)

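    On Windows, download the latest VM Agent installer (MSI) from Microsoft and run it from an elevated PowerShell session.

    powershell
    # Quote the URI so the query string is passed through unchanged
    Invoke-WebRequest -Uri "https://go.microsoft.com/fwlink/?LinkID=394789" -OutFile WindowsAzureVmAgent.msi
    # -Wait blocks until the installer finishes, so the restart below acts on the new agent
    Start-Process msiexec.exe -ArgumentList '/i WindowsAzureVmAgent.msi /quiet /norestart /log agent_install.log' -Wait
    Restart-Service WindowsAzureGuestAgent -Force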

    Step 4: Retry or Re-deploy the Failed Extension

    Once the agent is healthy, retry the extension deployment. First, remove the failed extension state, then reapply it.

    bash
    # Remove the failed extension instance (use the extension name shown in the Step 1 output)
    az vm extension delete --resource-group MyResourceGroup --vm-name MyVM --name CustomScript
    # Re-deploy the extension (Example: Custom Script for Linux)
    az vm extension set \
      --resource-group MyResourceGroup \
      --vm-name MyVM \
      --name CustomScript \
      --publisher Microsoft.Azure.Extensions \
      --version 2.1 \
      --settings '{"fileUris":["https://myscript.blob.core.windows.net/scripts/myscript.sh"],"commandToExecute":"./myscript.sh"}'
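
If the extension is deployed with `--no-wait`, or from a pipeline that needs an explicit pass/fail result, the provisioning state can be polled afterwards. This is a minimal sketch, assuming the Azure CLI is logged in; `wait_for_extension` is a hypothetical helper, and the default of 20 attempts 15 seconds apart is an arbitrary choice.

```shell
# Hypothetical helper: poll an extension's provisioningState until it settles.
# Args: resource group, VM name, extension name, [max attempts], [delay in seconds]
# Prints the final state; returns 0 on Succeeded, 1 on Failed, 2 on timeout.
wait_for_extension() {
  rg=$1; vm=$2; ext=$3
  attempts=${4:-20}
  delay=${5:-15}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(az vm extension show \
      --resource-group "$rg" --vm-name "$vm" --name "$ext" \
      --query provisioningState --output tsv 2>/dev/null)
    case "$state" in
      Succeeded) echo "Succeeded"; return 0 ;;
      Failed)    echo "Failed";    return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "TimedOut"
  return 2
}

# Usage (names are placeholders):
#   wait_for_extension MyResourceGroup MyVM CustomScript || echo "extension did not succeed"
```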

    Step 5: Verify Network Connectivity to Azure Endpoints

    Extensions often fail if the VM cannot reach Azure storage (for scripts) or management endpoints. Test connectivity.

    bash
    # Test connectivity to key Azure endpoints from within the VM
    # Linux (use the storage account host from your fileUris):
    nc -zv management.azure.com 443
    nc -zv myscript.blob.core.windows.net 443
    # Windows (PowerShell):
    Test-NetConnection management.azure.com -Port 443
    Test-NetConnection myscript.blob.core.windows.net -Port 443
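
The checks above can be wrapped so that the storage host is taken from the same script URI the extension uses, and so that the platform address the VM Agent talks to (168.63.129.16, ports 80 and 32526) is tested as well. This is a rough sketch for Linux, assuming `nc` is installed; `host_from_uri` and `check_endpoint` are hypothetical helper names.

```shell
# Extract the host name from a URI such as https://host:port/path
host_from_uri() {
  rest=${1#*://}     # drop the scheme
  rest=${rest%%/*}   # drop the path
  rest=${rest##*@}   # drop any userinfo
  printf '%s\n' "${rest%%:*}"   # drop any port
}

# Probe one host and port with a 5-second timeout; prints OK or BLOCKED.
check_endpoint() {
  if nc -z -w 5 "$1" "$2" >/dev/null 2>&1; then
    echo "OK      $1:$2"
  else
    echo "BLOCKED $1:$2"
  fi
}

# Usage from inside the VM (the script URI is a placeholder):
#   check_endpoint 168.63.129.16 80
#   check_endpoint 168.63.129.16 32526
#   check_endpoint "$(host_from_uri "https://myscript.blob.core.windows.net/scripts/myscript.sh")" 443
```

A BLOCKED result only shows that the TCP connection failed; the network security group, route table, or a guest firewall still has to be inspected to find out why.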

    Step 6: Inspect Extension Logs for Root Cause

    The most detailed error messages are written to local logs on the VM. Examine them directly.

    bash
    # Linux Extension Logs (General)
    sudo cat /var/log/azure/*/*.log | tail -100
    # Linux Custom Script Extension Logs
    sudo cat /var/log/azure/custom-script/handler.log
    # Windows Extension Logs (PowerShell, via RDP/Serial Console): the five most recently written logs
    Get-ChildItem "C:\WindowsAzure\Logs\Plugins\" -Recurse -Filter *.log | Sort-Object LastWriteTime | Select-Object -Last 5 | Get-Content
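
On Linux, reading whole log files is often unnecessary; pulling out only the lines that mention an error is usually a faster first pass. The following is a small sketch under that assumption; `recent_extension_errors` is a hypothetical helper, and the pattern `error|fail` is a guess at what handlers write, so a clean result does not prove the logs are clean.

```shell
# Hypothetical helper: print the last N lines containing "error" or "fail"
# (case-insensitive) from every *.log file under a log directory.
# Args: [log root, default /var/log/azure], [max lines, default 20]
recent_extension_errors() {
  log_root=${1:-/var/log/azure}
  max_lines=${2:-20}
  # -H prefixes each match with its file name so the failing handler is visible
  find "$log_root" -type f -name '*.log' \
    -exec grep -iHE 'error|fail' {} + 2>/dev/null | tail -n "$max_lines"
}

# Usage (run as root, since these logs are typically not world-readable):
#   recent_extension_errors /var/log/azure 40
```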

    Architect's Pro Tip

    "For time-critical recovery in a Virtual Machine Scale Set with autoscaling, deploy a new, corrected VM image version and terminate the faulty instances. Fixing extensions on hundreds of live VMs is slower and riskier."
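
As a rough illustration of that approach, the faulty instances can be listed before anything is deleted. The sketch assumes that a failed extension leaves the instance's `provisioningState` as `Failed`, which should be confirmed against your own scale set first; `failed_vmss_instances` is a hypothetical helper.

```shell
# Hypothetical helper: print the instance IDs of scale set members whose
# provisioning state is "Failed". It only lists; it deletes nothing.
failed_vmss_instances() {
  az vmss list-instances \
    --resource-group "$1" --name "$2" \
    --query "[?provisioningState=='Failed'].instanceId" \
    --output tsv
}

# Usage (names are placeholders). Review the list before removing anything:
#   failed_vmss_instances MyResourceGroup MyScaleSet
#   az vmss delete-instances --resource-group MyResourceGroup --name MyScaleSet \
#     --instance-ids <ids from the list above>
```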

    Frequently Asked Questions

    How long should I wait before troubleshooting a VMExtensionProvisioningError?

    Do not wait. In production, this error signifies a hard failure. Begin diagnostics immediately, as automated processes (backup, monitoring, security) are likely broken.

    Can I ignore this error if the VM itself is running?

    No. The VM may be running, but critical post-deployment configuration (software installation, security policies, monitoring agents) is incomplete, violating operational and compliance standards.

    Where is the most accurate error message for this failure?

    The Azure Portal VM 'Extensions' blade shows a status message, but the definitive details are in the extension-specific logs on the VM's OS at /var/log/azure/ (Linux) or C:\WindowsAzure\Logs\ (Windows).
