Verified by Garnet Grid

How to Optimize Azure Cloud Costs: A Step-by-Step Guide

Reduce your Azure bill by 30-60% with this tactical guide. Covers reserved instances, right-sizing, spot VMs, storage tiering, and automated cleanup scripts.

Most organizations overspend on Azure by 30-60%. The problem isn’t the platform; it’s the defaults teams accept. Resources are typically provisioned for peak capacity at launch, and nobody goes back to right-size them. This guide walks you through the steps to find and eliminate that waste.


Step 1: Audit Your Current Spend

Before optimizing, you need visibility. Azure Cost Management is free and built-in, but most teams never configure it properly.

1.1 Enable Cost Management + Billing

  1. Navigate to Azure Portal → Cost Management + Billing
  2. Select your subscription
  3. Go to Cost Analysis → Set timeframe to Last 3 Months
  4. Group by Resource Type, then by Resource Group

# Export cost data via CLI for deeper analysis
az consumption usage list \
  --start-date 2026-01-01 \
  --end-date 2026-03-01 \
  --output json > azure-costs.json
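
Once the export exists, a short script can rank the cost drivers. The following is a minimal sketch, not a definitive tool: it assumes each exported record carries `consumedService` and `pretaxCost` fields (check your own `azure-costs.json`, since field names can vary by CLI version), and the helper names `top_cost_drivers` and `load_records` are hypothetical.

```python
import json
from collections import defaultdict


def top_cost_drivers(records, n=5):
    """Sum pre-tax cost per consumed service and return the n largest totals."""
    totals = defaultdict(float)
    for record in records:
        # Assumed field names; adjust if your export uses different keys.
        service = record.get("consumedService") or "unknown"
        totals[service] += float(record.get("pretaxCost") or 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def load_records(path="azure-costs.json"):
    """Read the JSON file written by the export command."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `top_cost_drivers(load_records())` returns a list of `(service, total_cost)` pairs, largest first.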

1.2 Identify the Top 5 Cost Drivers

In most Azure environments, the cost breakdown follows a predictable pattern:

| Resource Type | Typical % of Bill | Optimization Potential |
|---|---|---|
| Virtual Machines | 35-45% | High (right-sizing, reserved) |
| Storage | 15-25% | Medium (tiering, lifecycle) |
| SQL/Cosmos DB | 10-20% | High (DTU right-sizing) |
| Networking | 5-15% | Low (egress reduction) |
| App Services | 5-10% | Medium (plan consolidation) |

:::tip[Quick Win]
Run `az advisor recommendation list --category Cost` to get Azure’s own cost-saving recommendations. Most environments have $5K-$20K/month in advisor suggestions that nobody has acted on.
:::


Step 2: Right-Size Virtual Machines

Right-sizing is the single highest-impact optimization. Most VMs run at 10-20% CPU utilization.

2.1 Identify Oversized VMs

# Pull hourly average and peak CPU for one VM (February 2026)
az monitor metrics list \
  --resource "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/{vm}" \
  --metric "Percentage CPU" \
  --interval PT1H \
  --aggregation Average Maximum \
  --start-time 2026-02-01T00:00:00Z \
  --end-time 2026-03-01T00:00:00Z

Decision Matrix:

| Avg CPU | Peak CPU | Action |
|---|---|---|
| < 5% | < 20% | Downsize by 2 tiers or decommission |
| 5-20% | < 40% | Downsize by 1 tier |
| 20-60% | < 80% | Current size is appropriate |
| > 60% | > 80% | Consider upsizing |
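
The matrix translates directly into a small function, which is handy when you are classifying dozens of VMs from exported metrics. This is a sketch under two assumptions of mine: an average of exactly 20% is treated as "appropriate", and any combination the matrix does not list (for example a low average with a high peak) is flagged for manual review. The function name `right_size_action` is hypothetical.

```python
def right_size_action(avg_cpu, peak_cpu):
    """Map average and peak CPU percentages to the decision matrix's action."""
    if avg_cpu < 5 and peak_cpu < 20:
        return "Downsize by 2 tiers or decommission"
    if 5 <= avg_cpu < 20 and peak_cpu < 40:
        return "Downsize by 1 tier"
    if 20 <= avg_cpu <= 60 and peak_cpu < 80:
        return "Current size is appropriate"
    if avg_cpu > 60 and peak_cpu > 80:
        return "Consider upsizing"
    # Combinations the matrix does not cover, such as low average but high peak.
    return "Review manually"
```

Bursty workloads tend to land in the "Review manually" bucket, which is usually the right outcome: a VM that idles most of the day but spikes to 90% may be a better fit for a burstable SKU than for a straight downsize.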

2.2 Execute the Resize

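# Check which sizes this VM can move to
az vm list-vm-resize-options --resource-group myRG --name myVM --output table

# Deallocate the VM (required when the target size is not available on the current hardware cluster)
az vm deallocate --resource-group myRG --name myVM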

# Resize to a smaller SKU
az vm resize --resource-group myRG --name myVM --size Standard_B2ms

# Restart
az vm start --resource-group myRG --name myVM

:::caution[Downtime Required]
Most VM resizes require a stop/start cycle. Schedule during maintenance windows. For production workloads, use VM Scale Sets with mixed instance sizes instead.
:::


Step 3: Purchase Reserved Instances

Reserved Instances (RIs) can save up to 72% compared with pay-as-you-go pricing for predictable workloads.

3.1 Identify RI Candidates

A VM is a good RI candidate if:

  • It runs 24/7 (or at least 16 hours/day)
  • The workload is stable (not scaling up/down frequently)
  • You’re confident it’ll run for 1-3 years

3.2 Calculate Savings

Pay-as-you-go: Standard_D4s_v5 = $140.16/month
1-Year RI:     Standard_D4s_v5 = $89.79/month  (36% savings)
3-Year RI:     Standard_D4s_v5 = $57.67/month  (59% savings)
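
The percentages above, and the "16 hours/day" rule of thumb from 3.1, both fall out of simple arithmetic. The sketch below reproduces them from the three monthly prices quoted here. It assumes that pay-as-you-go compute stops billing while a VM is deallocated, whereas a reservation is paid for every hour whether the VM runs or not. The function names are hypothetical.

```python
HOURS_PER_DAY = 24


def savings_percent(payg_monthly, reserved_monthly):
    """Percentage saved by the reservation when the VM runs around the clock."""
    return (1 - reserved_monthly / payg_monthly) * 100


def break_even_hours_per_day(payg_monthly, reserved_monthly):
    """Daily runtime above which the reservation beats pay-as-you-go.

    Assumes pay-as-you-go is billed only for running hours, while the
    reservation is billed for all 24 hours regardless of usage.
    """
    return reserved_monthly / payg_monthly * HOURS_PER_DAY


PAYG_MONTHLY = 140.16  # Standard_D4s_v5 price quoted above
for label, reserved in (("1-year", 89.79), ("3-year", 57.67)):
    print(
        f"{label}: {savings_percent(PAYG_MONTHLY, reserved):.0f}% savings, "
        f"break-even at {break_even_hours_per_day(PAYG_MONTHLY, reserved):.1f} h/day"
    )
```

With these prices the 1-year reservation breaks even at a little over 15 hours of runtime per day, which is where the 16-hour guideline comes from. The 3-year term breaks even at just under 10 hours per day, but only if the workload really lasts three years.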

3.3 Purchase via Portal or CLI

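# Purchase five 3-year reservations for Standard_D4s_v5 in East US
# (flag names can differ between CLI extension versions; confirm with --help)
az reservations reservation-order purchase \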
  --reserved-resource-type VirtualMachines \
  --sku Standard_D4s_v5 \
  --location eastus \
  --quantity 5 \
  --term P3Y \
  --billing-scope-id "/subscriptions/{sub-id}"

Step 4: Implement Storage Lifecycle Policies

Azure Storage costs creep up because nobody deletes old data. Lifecycle policies automate tiering and deletion.

4.1 Configure Automatic Tiering

{
  "rules": [
    {
      "name": "TierToCool30Days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          },
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["logs/", "backups/", "exports/"]
        }
      }
    }
  ]
}

4.2 Savings Estimates

| Tier | Cost per GB/month | Use For |
|---|---|---|
| Hot | $0.018 | Active data (< 30 days) |
| Cool | $0.010 | Infrequent access (30-180 days) |
| Archive | $0.002 | Compliance/backup (180+ days) |

A typical storage account with 10 TB of data can save $1,200-$1,800/year by implementing lifecycle policies.
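
To see how an estimate like this is built, the sketch below compares keeping everything in Hot against a tiered split, using the per-GB prices from the table. The 20/30/50 split across Hot, Cool, and Archive is purely an illustrative assumption, as is treating 10 TB as 10,000 GB. Retrieval, transaction, and early-deletion charges are ignored, so treat the result as an upper bound on real savings.

```python
PRICE_PER_GB_MONTH = {"hot": 0.018, "cool": 0.010, "archive": 0.002}


def annual_cost(gb_by_tier):
    """Yearly storage cost for a mapping of tier name to gigabytes stored."""
    monthly = sum(PRICE_PER_GB_MONTH[tier] * gb for tier, gb in gb_by_tier.items())
    return monthly * 12


def annual_savings(total_gb, share_by_tier):
    """Savings from tiering versus leaving all data in the Hot tier."""
    all_hot = annual_cost({"hot": total_gb})
    tiered = annual_cost({tier: total_gb * share for tier, share in share_by_tier.items()})
    return all_hot - tiered


# Hypothetical age profile: 20% recent, 30% older than 30 days, 50% older than 180 days.
example = annual_savings(10_000, {"hot": 0.2, "cool": 0.3, "archive": 0.5})
print(f"Estimated savings: ${example:,.0f}/year")
```

Under that assumed split the estimate lands at the low end of the range quoted above. Accounts where more of the data has aged past 180 days move toward the high end.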


Step 5: Use Spot VMs for Batch Processing

Spot VMs provide up to 90% savings for interruptible workloads like batch processing, CI/CD, and dev/test.

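# Create a Spot VM
# --max-price is the hourly cap in USD; use -1 to cap at the pay-as-you-go price
az vm create \
  --resource-group myRG \
  --name batch-worker-01 \
  --image Ubuntu2204 \
  --size Standard_D8s_v5 \
  --generate-ssh-keys \
  --priority Spot \
  --max-price 0.10 \
  --eviction-policy Deallocate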

:::note[Eviction Handling]
Spot VMs can be evicted with 30 seconds notice. Your workloads must be idempotent and checkpoint-capable. Use Azure Batch or Kubernetes with spot node pools for automatic retry logic.
:::
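
If you manage Spot workers yourself rather than through Azure Batch or Kubernetes, the worker has to notice the eviction and stop cleanly. The sketch below shows one way that could look: it polls the Azure Instance Metadata Service Scheduled Events endpoint for a `Preempt` event and checkpoints after every item so a restarted worker can resume. The helper names (`eviction_pending`, `run_batch`) and the checkpoint callback are hypothetical; the endpoint is reachable only from inside an Azure VM.

```python
import json
import urllib.request

# Scheduled Events endpoint of the Azure Instance Metadata Service.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout=5):
    """Fetch the current Scheduled Events document (works only on an Azure VM)."""
    request = urllib.request.Request(SCHEDULED_EVENTS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def eviction_pending(events_document):
    """Return True if any scheduled event is a Spot preemption."""
    events = events_document.get("Events", [])
    return any(event.get("EventType") == "Preempt" for event in events)


def run_batch(items, process, save_checkpoint, start_index=0, should_stop=lambda: False):
    """Process items in order, checkpointing after each so a restart can resume.

    Returns the index of the next unprocessed item.
    """
    for index in range(start_index, len(items)):
        if should_stop():
            return index  # resume from here once the VM is running again
        process(items[index])
        save_checkpoint(index + 1)
    return len(items)
```

On a real worker you would pass `should_stop=lambda: eviction_pending(fetch_scheduled_events())` and persist the checkpoint somewhere that survives deallocation, such as a blob or a queue. `process` still needs to be idempotent, because an item can be interrupted after it runs but before its checkpoint is saved.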


Step 6: Set Up Budget Alerts

Prevention beats optimization. Set up alerts before the bill arrives.

# Create a monthly cost budget with email alerts
az consumption budget create \
  --budget-name "Monthly-Limit" \
  --category cost \
  --amount 15000 \
  --time-grain Monthly \
  --start-date 2026-03-01 \
  --end-date 2027-03-01 \
  --resource-group myRG \
  --notifications '{
    "80Percent": {
      "enabled": true,
      "operator": "GreaterThan",
      "threshold": 80,
      "contactEmails": ["ops@company.com"]
    },
    "100Percent": {
      "enabled": true,
      "operator": "GreaterThan",
      "threshold": 100,
      "contactEmails": ["ops@company.com", "cfo@company.com"]
    }
  }'
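
The two notification rules boil down to a percentage check against the budget amount. The sketch below mirrors that logic so you can sanity-check thresholds before deploying them. The simple straight-line projection is my own addition for illustration and is not part of the budget definition above. Function names are hypothetical.

```python
def triggered_alerts(spend, budget, thresholds=(80, 100)):
    """Return the threshold percentages that spend has exceeded.

    Uses a strict comparison to match the "GreaterThan" operator.
    """
    return [t for t in thresholds if spend * 100 > t * budget]


def projected_month_end(spend_to_date, day_of_month, days_in_month):
    """Straight-line projection of month-end spend from spend so far."""
    return spend_to_date / day_of_month * days_in_month


budget = 15000
spend_on_day_10 = 6000
forecast = projected_month_end(spend_on_day_10, day_of_month=10, days_in_month=30)
print(triggered_alerts(spend_on_day_10, budget))  # alerts on actual spend
print(triggered_alerts(forecast, budget))         # alerts on projected spend
```

Checking the projection as well as the actual figure shows why the 80% alert matters: by the time actual spend crosses 100%, the overrun has already happened.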

Step 7: Automate Cleanup with Azure Automation

Create a runbook that identifies and deallocates idle resources on a schedule.

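# Azure Automation Runbook: stop (deallocate) idle VMs
# Assumes the Automation account has a managed identity with rights on the VMs
Connect-AzAccount -Identity | Out-Null

$vms = Get-AzVM -Status
foreach ($vm in $vms) {
    # Skip VMs that are not running; there is nothing to stop
    if ($vm.PowerState -ne "VM running") { continue }

    # Hourly average CPU for the past 7 days
    $metrics = Get-AzMetric -ResourceId $vm.Id `
        -MetricName "Percentage CPU" `
        -TimeGrain 01:00:00 `
        -AggregationType Average `
        -StartTime (Get-Date).AddDays(-7) `
        -EndTime (Get-Date)

    $avgCpu = ($metrics.Data | Measure-Object -Property Average -Average).Average

    # Require real data: a missing average would otherwise compare as less than 5
    if ($null -ne $avgCpu -and $avgCpu -lt 5) {
        Write-Output "Stopping idle VM: $($vm.Name) (Avg CPU: $avgCpu%)"
        Stop-AzVM -ResourceGroupName $vm.ResourceGroupName `
            -Name $vm.Name -Force
    }
}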

Optimization Checklist

  • Run Azure Advisor cost recommendations
  • Right-size VMs with < 20% average CPU
  • Purchase RIs for 24/7 production workloads
  • Implement storage lifecycle policies
  • Use Spot VMs for batch/CI/CD workloads
  • Set budget alerts at 80% and 100%
  • Schedule automated idle resource cleanup
  • Enforce tag governance (untagged resources = unaccountable costs)
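
For the tagging item, a quick audit is often enough to start the conversation. The sketch below scans a resource list, such as the JSON produced by `az resource list --output json`, and reports which resources are missing required tags. The tag names `owner` and `cost-center` are hypothetical placeholders for your own policy, and the function name is illustrative.

```python
def untagged_resources(resources, required_tags=("owner", "cost-center")):
    """Return (resource id, missing tag names) for resources lacking required tags."""
    findings = []
    for resource in resources:
        tags = resource.get("tags") or {}  # tags is null when none are set
        present = {key.lower() for key in tags}  # Azure tag names are case-insensitive
        missing = [tag for tag in required_tags if tag.lower() not in present]
        if missing:
            findings.append((resource.get("id", "<unknown>"), missing))
    return findings
```

Feeding the findings into a weekly report, grouped by resource group, gives each team a concrete list to fix rather than a general reminder to tag things.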

:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For a strategic analysis of cloud cost optimization patterns, see the full enterprise playbook.
:::