How to Optimize Azure Cloud Costs: A Step-by-Step Guide
Reduce your Azure bill by 30-60% with this tactical guide. Covers reserved instances, right-sizing, spot VMs, storage tiering, and automated cleanup scripts.
Most organizations overspend on Azure by 30-60%. The problem isn’t the platform; it’s sizing decisions that never get revisited. Resources are typically provisioned with generous headroom at launch, and nobody goes back to right-size them. This guide walks you through the exact steps to find and eliminate waste.
Step 1: Audit Your Current Spend
Before optimizing, you need visibility. Azure Cost Management is free and built-in, but most teams never configure it properly.
1.1 Enable Cost Management + Billing
- Navigate to Azure Portal → Cost Management + Billing
- Select your subscription
- Go to Cost Analysis → Set timeframe to Last 3 Months
- Group by Resource Type, then by Resource Group
# Export the last three months of usage and cost records for deeper analysis
az consumption usage list \
  --start-date 2025-12-01 \
  --end-date 2026-03-01 \
  --output json > azure-costs.json
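Once the export exists, a short script can rank the biggest cost drivers. The following is a minimal sketch, assuming each exported record carries a `consumedService` field and a `pretaxCost` field. Both field names are assumptions about the export schema, so check them against your own `azure-costs.json` before trusting the totals. The helper names `top_cost_drivers` and `load_records` are hypothetical.

```python
import json
from collections import defaultdict


def top_cost_drivers(records, group_key="consumedService", cost_key="pretaxCost", n=5):
    """Sum cost per group and return the n most expensive groups as (name, total) pairs."""
    totals = defaultdict(float)
    for record in records:
        group = record.get(group_key) or "unknown"
        # Costs may arrive as strings or be missing, so coerce defensively.
        totals[group] += float(record.get(cost_key) or 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def load_records(path="azure-costs.json"):
    """Read the JSON array produced by the export command."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Illustrative records only; field names are assumed, not confirmed by the export.
sample = [
    {"consumedService": "Microsoft.Compute", "pretaxCost": "120.50"},
    {"consumedService": "Microsoft.Storage", "pretaxCost": "30.00"},
    {"consumedService": "Microsoft.Compute", "pretaxCost": "79.50"},
]
for name, total in top_cost_drivers(sample):
    print(f"{name:30s} {total:10.2f}")
```

To run it against real data, pass `load_records()` instead of `sample`, and change `group_key` if you would rather group by resource group or instance name.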
1.2 Identify the Top 5 Cost Drivers
In most Azure environments, the cost breakdown follows a predictable pattern:
| Resource Type | Typical % of Bill | Optimization Potential |
|---|---|---|
| Virtual Machines | 35-45% | High (right-sizing, reserved) |
| Storage | 15-25% | Medium (tiering, lifecycle) |
| SQL/Cosmos DB | 10-20% | High (DTU and RU/s right-sizing) |
| Networking | 5-15% | Low (egress reduction) |
| App Services | 5-10% | Medium (plan consolidation) |
:::tip[Quick Win]
Run az advisor recommendation list --category Cost to get Azure’s own cost-saving recommendations. Most environments have $5K-$20K/month in advisor suggestions that nobody has acted on.
:::
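To see how much money is sitting in unactioned recommendations, you can total the savings figures in the Advisor output. This is a rough sketch, assuming the command was run with `--output json` and that cost recommendations expose an `extendedProperties.annualSavingsAmount` value as a numeric string. That field name is an assumption, so inspect one recommendation from your own output first. `total_annual_savings` is a hypothetical helper.

```python
def total_annual_savings(recommendations, field="annualSavingsAmount"):
    """Return (sum of annual savings, number of recommendations that reported a figure)."""
    total = 0.0
    counted = 0
    for rec in recommendations:
        props = rec.get("extendedProperties") or {}
        raw = props.get(field)
        if raw is None:
            continue  # recommendation carries no savings estimate
        try:
            total += float(raw)
        except (TypeError, ValueError):
            continue  # non-numeric value, skip rather than guess
        counted += 1
    return total, counted


# Illustrative input; real recommendations contain many more fields.
sample = [
    {"extendedProperties": {"annualSavingsAmount": "1200"}},
    {"extendedProperties": {"annualSavingsAmount": "300.5"}},
    {"extendedProperties": None},
]
savings, count = total_annual_savings(sample)
print(f"{count} recommendations, {savings:.2f} per year")
```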
Step 2: Right-Size Virtual Machines
Right-sizing is the single highest-impact optimization. Most VMs run at 10-20% CPU utilization.
2.1 Identify Oversized VMs
# Pull hourly average and peak CPU for one VM over the past month
az monitor metrics list \
  --resource "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/{vm}" \
  --metrics "Percentage CPU" \
  --interval PT1H \
  --aggregation Average Maximum \
  --start-time 2026-02-01 \
  --end-time 2026-03-01
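The raw metrics response is one data point per hour, so it helps to reduce it to two numbers per VM. The sketch below assumes the JSON output nests data points under `value` → `timeseries` → `data`, with an `average` key per point and a `maximum` key when that aggregation was requested. This shape is an assumption about the CLI output, and `summarize_cpu` is a hypothetical helper.

```python
def summarize_cpu(metrics):
    """Return (mean of hourly averages, peak) from a metrics response, or None if no data."""
    averages, maxima = [], []
    for metric in metrics.get("value", []):
        for series in metric.get("timeseries", []):
            for point in series.get("data", []):
                # Hours with no samples come back with null values; skip them.
                if point.get("average") is not None:
                    averages.append(point["average"])
                if point.get("maximum") is not None:
                    maxima.append(point["maximum"])
    if not averages:
        return None
    # Without a Maximum aggregation, fall back to the highest hourly average,
    # which understates short spikes.
    peak = max(maxima) if maxima else max(averages)
    return sum(averages) / len(averages), peak


# Illustrative response fragment.
sample = {"value": [{"timeseries": [{"data": [
    {"average": 10.0, "maximum": 35.0},
    {"average": 20.0, "maximum": 50.0},
    {"average": None},
]}]}]}
print(summarize_cpu(sample))
```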
Decision Matrix:
| Avg CPU | Peak CPU | Action |
|---|---|---|
| < 5% | < 20% | Downsize by 2 tiers or decommission |
| 5-20% | < 40% | Downsize by 1 tier |
| 20-60% | < 80% | Current size is appropriate |
| > 60% | > 80% | Consider upsizing |
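The matrix translates directly into a small function, which is handy when you are classifying dozens of VMs. This is one possible encoding, not a definitive rule set: the table leaves some combinations uncovered (for example a very low average with a high peak), and this sketch sends those to manual review rather than guessing. `recommend_action` is a hypothetical name.

```python
def recommend_action(avg_cpu, peak_cpu):
    """Map average and peak CPU percentages to the action in the decision matrix."""
    if avg_cpu < 5 and peak_cpu < 20:
        return "Downsize by 2 tiers or decommission"
    if 5 <= avg_cpu <= 20 and peak_cpu < 40:
        return "Downsize by 1 tier"
    # An average of exactly 20 with a peak of 40 or more lands here.
    if 20 <= avg_cpu <= 60 and peak_cpu < 80:
        return "Current size is appropriate"
    if avg_cpu > 60 and peak_cpu > 80:
        return "Consider upsizing"
    # Combinations the matrix does not cover, such as low average with high peak.
    return "Review manually"


for avg, peak in [(3, 10), (15, 30), (40, 70), (75, 95), (3, 50)]:
    print(f"avg={avg:>2}% peak={peak:>2}% -> {recommend_action(avg, peak)}")
```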
2.2 Execute the Resize
# Deallocate the VM so the resize works even if the new size is unavailable on its current hardware cluster
az vm deallocate --resource-group myRG --name myVM
# Resize to a smaller SKU
az vm resize --resource-group myRG --name myVM --size Standard_B2ms
# Restart
az vm start --resource-group myRG --name myVM
:::caution[Downtime Required]
Most VM resizes require a stop/start cycle. Schedule them during maintenance windows. For production workloads, use VM Scale Sets with mixed instance sizes instead.
:::
Step 3: Purchase Reserved Instances
Reserved Instances (RIs) provide up to 72% savings over pay-as-you-go pricing for predictable workloads, depending on VM series, region, and term.
3.1 Identify RI Candidates
A VM is a good RI candidate if:
- It runs 24/7 (or at least 16+ hours/day)
- The workload is stable (not scaling up/down frequently)
- You’re confident it’ll run for 1-3 years
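The hours-per-day guideline can be derived rather than taken on faith. A reservation is a fixed monthly charge, while pay-as-you-go cost scales with the hours a VM is actually allocated, so the break-even point is simply the ratio of the two prices. The sketch below assumes the VM would otherwise be deallocated when idle, and uses the Standard_D4s_v5 prices quoted in section 3.2 as inputs. `break_even_hours_per_day` is a hypothetical helper.

```python
HOURS_PER_DAY = 24


def break_even_hours_per_day(payg_monthly, ri_monthly):
    """Daily running hours above which the reservation is cheaper than pay-as-you-go."""
    if payg_monthly <= 0:
        raise ValueError("payg_monthly must be positive")
    # Pay-as-you-go cost at h hours/day is payg_monthly * h / 24; set it equal to ri_monthly.
    return HOURS_PER_DAY * ri_monthly / payg_monthly


one_year = break_even_hours_per_day(140.16, 89.79)
three_year = break_even_hours_per_day(140.16, 57.67)
print(f"1-year RI pays off above {one_year:.1f} h/day")
print(f"3-year RI pays off above {three_year:.1f} h/day")
```

At these prices the 1-year term breaks even a little above 15 hours a day, which is where the 16+ hours guideline comes from. The 3-year term breaks even just under 10 hours a day.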
3.2 Calculate Savings
Pay-as-you-go: Standard_D4s_v5 = $140.16/month
1-Year RI: Standard_D4s_v5 = $89.79/month (36% savings)
3-Year RI: Standard_D4s_v5 = $57.67/month (59% savings)
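The percentages above follow from the monthly prices, and the same arithmetic scales to a fleet. A minimal sketch, assuming the quoted prices hold for the whole term and ignoring any upfront-versus-monthly billing differences. `ri_savings` is a hypothetical helper.

```python
def ri_savings(payg_monthly, ri_monthly, term_years, quantity=1):
    """Return (savings percent rounded to a whole number, total savings over the term)."""
    monthly_saving = payg_monthly - ri_monthly
    percent = round(monthly_saving / payg_monthly * 100)
    total = round(monthly_saving * 12 * term_years * quantity, 2)
    return percent, total


# Prices for Standard_D4s_v5 as quoted above.
print(ri_savings(140.16, 89.79, term_years=1))
print(ri_savings(140.16, 57.67, term_years=3))
# Five reserved VMs on the 3-year term.
print(ri_savings(140.16, 57.67, term_years=3, quantity=5))
```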
3.3 Purchase via Portal or CLI
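# Assumes the reservations CLI extension; flag names vary between extension versions, so confirm with --help
# {order-id} comes from a prior `az reservations reservation-order calculate` call
az reservations reservation-order purchase \
  --reservation-order-id {order-id} \
  --reserved-resource-type VirtualMachines \
  --sku Standard_D4s_v5 \
  --location eastus \
  --quantity 5 \
  --term P3Y \
  --applied-scope-type Shared \
  --display-name "d4s-v5-eastus-3y" \
  --billing-scope-id "/subscriptions/{sub-id}"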
Step 4: Implement Storage Lifecycle Policies
Azure Storage costs creep up because nobody deletes old data. Lifecycle policies automate tiering and deletion.
4.1 Configure Automatic Tiering
{
  "rules": [
    {
      "name": "TierToCool30Days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          },
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["logs/", "backups/", "exports/"]
        }
      }
    }
  ]
}
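Before applying a policy like this, it is worth checking what it would do to blobs of a given age. The sketch below mirrors the rule's thresholds in plain Python. It is a simplified model, not the service's evaluation logic: it assumes `prefixMatch` is compared against a `container/blob-name` path and that the "GreaterThan" conditions are strict. `policy_applies` and `tier_for_age` are hypothetical helpers.

```python
PREFIXES = ("logs/", "backups/", "exports/")


def policy_applies(path, prefixes=PREFIXES):
    """True if a 'container/blob-name' path starts with one of the rule's prefixes."""
    return path.startswith(tuple(prefixes))


def tier_for_age(days_since_modification, cool_after=30, archive_after=180, delete_after=365):
    """Return where a matching base blob ends up once it is the given number of days old."""
    # Check the oldest threshold first so each blob gets its final state.
    if days_since_modification > delete_after:
        return "deleted"
    if days_since_modification > archive_after:
        return "archive"
    if days_since_modification > cool_after:
        return "cool"
    return "hot"


for age in (10, 31, 200, 400):
    print(age, tier_for_age(age))
print(policy_applies("logs/app/2026-01-01.log"), policy_applies("images/logo.png"))
```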
4.2 Savings Estimates
| Tier | Cost per GB/month | Use For |
|---|---|---|
| Hot | $0.018 | Active data (< 30 days) |
| Cool | $0.010 | Infrequent access (30-180 days) |
| Archive | $0.002 | Compliance/backup (180+ days) |
At these prices, a storage account holding 10 TB can save roughly $1,200-$1,800/year once most of its data has aged into the Cool and Archive tiers, before retrieval and transaction charges.
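The savings estimate depends heavily on how the data ends up distributed across tiers, so it is worth running the numbers for your own mix. The sketch below uses the per-GB prices from the table and a hypothetical split of 20% Hot, 30% Cool, and 50% Archive. That split is an illustration, not a measurement, and the calculation ignores retrieval, transaction, and early-deletion charges.

```python
# Prices per GB per month, taken from the table above.
PRICE_PER_GB = {"hot": 0.018, "cool": 0.010, "archive": 0.002}


def monthly_cost(gb_by_tier):
    """Monthly storage cost for a mapping of tier name to gigabytes stored."""
    return sum(PRICE_PER_GB[tier] * gb for tier, gb in gb_by_tier.items())


def annual_savings(total_gb, share_by_tier):
    """Yearly saving from tiering, compared with keeping everything in Hot."""
    if abs(sum(share_by_tier.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must add up to 1")
    all_hot = monthly_cost({"hot": total_gb})
    tiered = monthly_cost({tier: total_gb * share for tier, share in share_by_tier.items()})
    return (all_hot - tiered) * 12


# 10 TB treated as 10,000 GB; the tier split is hypothetical.
mix = {"hot": 0.2, "cool": 0.3, "archive": 0.5}
print(f"Estimated saving: ${annual_savings(10_000, mix):,.0f}/year")
```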
Step 5: Use Spot VMs for Batch Processing
Spot VMs provide up to 90% savings for interruptible workloads like batch processing, CI/CD, and dev/test.
# Create a Spot VM
az vm create \
  --resource-group myRG \
  --name batch-worker-01 \
  --image Ubuntu2204 \
  --size Standard_D8s_v5 \
  --priority Spot \
  --max-price 0.10 \
  --eviction-policy Deallocate
:::note[Eviction Handling]
Spot VMs can be evicted with 30 seconds of notice. Your workloads must be idempotent and checkpoint-capable. Use Azure Batch or Kubernetes with spot node pools for automatic retry logic.
:::
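To make "checkpoint-capable" concrete, here is a minimal sketch of a batch loop that records its progress after every item and resumes from that point after an eviction. It is an illustration under stated assumptions, not a production pattern: the checkpoint file is assumed to live on storage that survives eviction, `handle` is assumed to be idempotent because an item can be processed twice if the VM disappears between the work and the checkpoint write, and `should_stop` is a hypothetical hook where an eviction signal would be checked.

```python
import json
import os


def _save_checkpoint(path, next_index):
    """Persist progress atomically so a mid-write eviction cannot corrupt the file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump({"next_index": next_index}, handle)
    os.replace(tmp_path, path)


def process_items(items, checkpoint_path, handle, should_stop=lambda: False):
    """Process items in order, resuming from the checkpoint if one exists.

    Returns True when every item is done, False if stopped early.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as saved:
            start = json.load(saved)["next_index"]
    for index in range(start, len(items)):
        if should_stop():
            return False
        handle(items[index])
        # Record the next item to run only after the current one has finished.
        _save_checkpoint(checkpoint_path, index + 1)
    return True
```

A worker would call `process_items` on startup every time. The first run starts at zero, and any run after an eviction picks up at the saved index.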
Step 6: Set Up Budget Alerts
Prevention beats optimization. Set up alerts before the bill arrives.
# Create a monthly cost budget with email alerts at 80% and 100% of the limit
# Note: this command group is in preview; confirm with `az consumption budget create --help` that your CLI version accepts --notifications
az consumption budget create \
  --budget-name "Monthly-Limit" \
  --amount 15000 \
  --category cost \
  --time-grain Monthly \
  --start-date 2026-03-01 \
  --end-date 2027-03-01 \
  --resource-group myRG \
  --notifications '{
    "80Percent": {
      "enabled": true,
      "operator": "GreaterThan",
      "threshold": 80,
      "contactEmails": ["ops@company.com"]
    },
    "100Percent": {
      "enabled": true,
      "operator": "GreaterThan",
      "threshold": 100,
      "contactEmails": ["ops@company.com", "cfo@company.com"]
    }
  }'
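The thresholds are percentages of the budget amount, and the "GreaterThan" operator means an alert fires only once spend passes the line, not when it lands exactly on it. The sketch below models that comparison so you can sanity-check where the alerts fall for a given budget. It is a simplified model of the rule, and `triggered_alerts` is a hypothetical helper.

```python
def triggered_alerts(spend, budget, thresholds=(80, 100)):
    """Return the percentage thresholds that the current spend has exceeded."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Compare spend * 100 against threshold * budget to avoid float rounding at the boundary.
    return [t for t in sorted(thresholds) if spend * 100 > t * budget]


for spend in (9_000, 12_000, 12_001, 15_500):
    print(spend, triggered_alerts(spend, 15_000))
```

With the $15,000 budget above, the first alert fires just past $12,000 and the second just past $15,000.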
Step 7: Automate Cleanup with Azure Automation
Create a runbook that identifies idle VMs and deallocates them on a schedule.
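# Azure Automation Runbook: stop idle VMs
# Assumes the Automation account has a managed identity allowed to read metrics and stop VMs
Connect-AzAccount -Identity | Out-Null
$vms = Get-AzVM -Status
foreach ($vm in $vms) {
    # Hourly average CPU over the last 7 days
    $metrics = Get-AzMetric -ResourceId $vm.Id `
        -MetricName "Percentage CPU" `
        -TimeGrain 01:00:00 `
        -AggregationType Average `
        -StartTime (Get-Date).AddDays(-7) `
        -EndTime (Get-Date)
    $avgCpu = ($metrics.Data | Measure-Object -Property Average -Average).Average
    # Skip VMs with no metric data: $null compares as less than 5 and would otherwise match
    if ($null -ne $avgCpu -and $avgCpu -lt 5 -and $vm.PowerState -eq "VM running") {
        Write-Output "Stopping idle VM: $($vm.Name) (Avg CPU: $avgCpu%)"
        # Stop-AzVM deallocates by default, which ends compute billing
        Stop-AzVM -ResourceGroupName $vm.ResourceGroupName `
            -Name $vm.Name -Force
    }
}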
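Before letting a runbook stop anything, it helps to dry-run the selection rule against a list of VMs and review the result. The sketch below reproduces the idle test (running, with average CPU under 5%) and adds two safeguards: VMs with no metric data are skipped, and VMs carrying an opt-out tag are left alone. The input dictionary shape and the `keep-running` tag are hypothetical conventions chosen for illustration.

```python
def select_idle_vms(vms, cpu_threshold=5.0, exclude_tag="keep-running"):
    """Return the names of running VMs whose average CPU is below the threshold."""
    selected = []
    for vm in vms:
        avg_cpu = vm.get("avg_cpu")
        if avg_cpu is None:
            continue  # no metric data, so idleness cannot be judged
        if vm.get("power_state") != "VM running":
            continue  # already stopped or deallocated
        if exclude_tag in (vm.get("tags") or {}):
            continue  # owner has opted this VM out of cleanup
        if avg_cpu < cpu_threshold:
            selected.append(vm["name"])
    return selected


# Illustrative inventory.
inventory = [
    {"name": "web-01", "avg_cpu": 42.0, "power_state": "VM running"},
    {"name": "test-07", "avg_cpu": 1.2, "power_state": "VM running"},
    {"name": "new-vm", "avg_cpu": None, "power_state": "VM running"},
    {"name": "jumpbox", "avg_cpu": 0.8, "power_state": "VM running", "tags": {"keep-running": "true"}},
]
print(select_idle_vms(inventory))
```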
Optimization Checklist
- Run Azure Advisor cost recommendations
- Right-size VMs with < 20% average CPU
- Purchase RIs for 24/7 production workloads
- Implement storage lifecycle policies
- Use Spot VMs for batch/CI/CD workloads
- Set budget alerts at 80% and 100%
- Schedule automated idle resource cleanup
- Enforce tag governance (untagged resources = unaccountable costs)
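For the tagging item, a quick report of which resources lack ownership tags is a practical starting point. The sketch below assumes you have the JSON output of `az resource list`, where each resource has an `id` and a `tags` field that is null when nothing is set. The required tag names `owner` and `cost-center` are hypothetical, so substitute your own policy. Tag names are compared case-insensitively.

```python
def missing_tags(resources, required_tags=("owner", "cost-center")):
    """Map each resource id to the sorted list of required tags it lacks."""
    report = {}
    for resource in resources:
        # `tags` is null in the CLI output when a resource has no tags at all.
        present = {name.lower() for name in (resource.get("tags") or {})}
        missing = sorted(tag for tag in required_tags if tag.lower() not in present)
        if missing:
            report[resource.get("id", "<unknown>")] = missing
    return report


# Illustrative resources with shortened ids.
resources = [
    {"id": "vm-a", "tags": None},
    {"id": "vm-b", "tags": {"Owner": "data-team", "cost-center": "1234"}},
    {"id": "disk-c", "tags": {"owner": "web-team"}},
]
for resource_id, tags in missing_tags(resources).items():
    print(resource_id, "is missing", ", ".join(tags))
```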
:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For a strategic analysis of cloud cost optimization patterns, see the full enterprise playbook.
:::