Eight ways to reduce cloud costs on AWS and Azure

January 20, 2026 • By KPThink

Cloud bills grow in a predictable pattern: a team spins up resources, delivers a project, and forgets to clean up. Infrastructure gets over-provisioned for peak load and left running at that size permanently. Nobody owns the bill. By the time someone looks at it, the waste has been accumulating for months.

The eight strategies below are the most effective ways to identify and eliminate that waste. Some require one-time configuration changes; others require ongoing processes. All of them depend on having accurate visibility into what you're running and what it costs.

1. Monitor and analyze cloud usage

Get visibility before you try to cut

You can't reduce costs you can't see. AWS Cost Explorer and Azure Cost Management both show spending broken down by service, region, account, and resource. The first step is enabling these dashboards and understanding which services are responsible for the majority of your spend.
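
As a rough illustration of finding the services behind most of the spend, the sketch below groups billing line items by service and returns the smallest set that covers a chosen share of the total. The `line_items` structure, the `top_services` helper, and the 80% cutoff are assumptions for illustration; a real export from Cost Explorer or Azure Cost Management will have a different shape.

```python
from collections import defaultdict

def top_services(line_items, share=0.8):
    """Return (service, cost) pairs, largest first, until `share` of spend is covered."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item["service"]] += item["cost"]

    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    selected, running = [], 0.0
    for service, cost in ranked:
        # Stop once the services picked so far already cover the target share.
        if grand_total == 0 or running / grand_total >= share:
            break
        selected.append((service, cost))
        running += cost
    return selected

# Hypothetical line items; the field names are assumptions, not a real export format.
items = [
    {"service": "EC2", "cost": 6200.0},
    {"service": "RDS", "cost": 2100.0},
    {"service": "S3", "cost": 900.0},
    {"service": "EC2", "cost": 300.0},
    {"service": "CloudWatch", "cost": 500.0},
]
print(top_services(items))
```

With the sample data, two services account for more than 80% of the bill, which is where the first round of investigation would go.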

Tag every resource (instances, storage volumes, databases, load balancers) with a project, environment (prod/staging/dev), team, and cost center. Without tags, you can see total spend but not which team or workload is responsible for it. Tags are the foundation of all cost attribution and chargeback.
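
A tag audit can be as simple as comparing each resource's tags against a required set. The sketch below assumes a hypothetical inventory of dictionaries with `id` and `tags` fields; the tag keys and the `missing_tags` and `untagged_report` helpers are illustrative names, not part of any AWS or Azure API.

```python
# Assumed tag keys, mirroring the four attributes named in the text.
REQUIRED_TAGS = frozenset({"project", "environment", "team", "cost-center"})

def missing_tags(resource, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty on one resource."""
    tags = resource.get("tags") or {}
    present = {key for key, value in tags.items() if value}
    return sorted(required - present)

def untagged_report(resources, required=REQUIRED_TAGS):
    """Map resource id -> missing tag keys, skipping fully tagged resources."""
    report = {}
    for resource in resources:
        gaps = missing_tags(resource, required)
        if gaps:
            report[resource["id"]] = gaps
    return report

# Hypothetical inventory; ids and values are made up for the example.
inventory = [
    {"id": "i-0aaa", "tags": {"project": "checkout", "environment": "prod",
                              "team": "payments", "cost-center": "cc-114"}},
    {"id": "vol-0bbb", "tags": {"project": "checkout", "environment": ""}},
    {"id": "lb-0ccc"},
]
print(untagged_report(inventory))
```

Treating an empty tag value as missing is a design choice here: an empty `environment` tag attributes nothing, so it is reported alongside absent keys.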

Set up budget alerts in AWS Budgets or Azure Cost Management so you're notified when spending exceeds a threshold, before it becomes a surprise at the end of the month. Alerts don't cut costs on their own, but they catch runaway spend before it compounds.
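
To make the idea of catching runaway spend early concrete, here is a minimal sketch that combines a fixed threshold with a naive straight-line projection of month-to-date spend. The function names, the 80% warning level, and the linear forecast are assumptions for illustration; the managed budget tools use their own forecasting and should be preferred in practice.

```python
import calendar
from datetime import date

def projected_month_end(mtd_spend, today):
    """Extrapolate month-to-date spend linearly to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return mtd_spend / today.day * days_in_month

def budget_status(mtd_spend, budget, today, warn_at=0.8):
    """Classify spend as 'exceeded', 'warning', 'forecast_over', or 'ok'."""
    if mtd_spend >= budget:
        return "exceeded"
    if mtd_spend >= warn_at * budget:
        return "warning"
    # Below the threshold today, but on course to overshoot by month end.
    if projected_month_end(mtd_spend, today) > budget:
        return "forecast_over"
    return "ok"

# Hypothetical numbers: $4,000 spent by the 10th against a $10,000 budget.
print(budget_status(4000, 10000, date(2026, 1, 10)))
```

The forecast branch is what gives early warning: spend can sit well under the alert threshold on the 10th and still be on track to exceed the budget.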

2. Right-size your instances

Right-sizing means matching your instance type and size to what the workload actually needs, not to what you estimated when you provisioned it or to whatever was available when you were in a hurry.

AWS Compute Optimizer and Azure Advisor both analyze CPU, memory, and network utilization data and recommend whether an instance should be downsized, terminated, or replaced with a different instance family. A common finding: a workload averaging 10–15% CPU utilization on an instance that was sized for an expected 80% is a candidate for a smaller instance type.

Right-sizing recommendations should be reviewed rather than applied automatically. A database that's idle most of the day but needs burst capacity for end-of-day reporting should not be downsized to match average load. Context matters: match recommendations against actual workload patterns before acting.
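
The difference between average and peak load can be sketched as a simple check. The thresholds below (15% average, 40% peak) and the `rightsizing_verdict` helper are assumptions chosen for illustration, not values used by Compute Optimizer or Azure Advisor.

```python
def rightsizing_verdict(cpu_samples, low_avg=15.0, safe_peak=40.0):
    """Classify an instance from CPU utilization samples (percent).

    A low average alone is not enough to downsize: the peak must also be low,
    otherwise the workload is bursty and needs a human review.
    """
    if not cpu_samples:
        raise ValueError("need at least one utilization sample")
    average = sum(cpu_samples) / len(cpu_samples)
    peak = max(cpu_samples)

    if average <= low_avg and peak <= safe_peak:
        return "downsize candidate"
    if average <= low_avg:
        return "review: low average but bursty"
    return "keep"

# Hypothetical hourly samples: steady low usage vs. an end-of-day reporting burst.
steady = [8, 10, 12, 9]
reporting_db = [5] * 23 + [90]
print(rightsizing_verdict(steady))
print(rightsizing_verdict(reporting_db))
```

The reporting database averages under 10% CPU, yet its single 90% hour is exactly the burst capacity that downsizing to the average would remove.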

3. Use reserved instances and savings plans

On-demand pricing is the most expensive way to run predictable, long-running workloads. For compute that runs continuously, committing to reserved capacity reduces the effective hourly rate by roughly 30–60%, depending on the commitment length (1 year or 3 years) and payment option (all upfront, partial upfront, or no upfront with monthly payments).

On AWS, Savings Plans offer more flexibility than Reserved Instances. A Compute Savings Plan applies across any EC2 instance family, region, operating system, and tenancy, making it easier to retain the discount even as your instance types change over time. Reserved Instances are instance-type specific and require more careful planning.

Azure Reserved VM Instances work similarly: commit to a 1- or 3-year term for a specific VM size in a specific region. Azure Hybrid Benefit applies on top of this, letting you use existing Windows Server and SQL Server licenses to reduce the software cost component of the VM price.

The rule of thumb: buy reserved capacity for the baseline compute you're confident you'll need continuously. Run variable workloads on-demand or on spot/preemptible instances.
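
The rule of thumb can be checked with a little arithmetic. The sketch below assumes a hypothetical hourly demand profile and made-up rates ($0.10 on-demand, $0.06 effective reserved rate, a 40% discount); it tries every reservation size and picks the cheapest. The helper names are illustrative.

```python
def total_cost(demand, reserved, on_demand_rate, reserved_rate):
    """Cost of covering hourly `demand` with `reserved` committed instances.

    Reserved instances are paid for every hour whether used or not;
    demand above the reservation is billed at the on-demand rate.
    """
    committed = reserved * reserved_rate * len(demand)
    overflow_hours = sum(max(0, needed - reserved) for needed in demand)
    return committed + overflow_hours * on_demand_rate

def best_reservation(demand, on_demand_rate, reserved_rate):
    """Return the reservation count with the lowest total cost."""
    return min(
        range(max(demand) + 1),
        key=lambda r: total_cost(demand, r, on_demand_rate, reserved_rate),
    )

# Hypothetical day: 4 instances needed for 16 hours, 10 during an 8-hour peak.
demand = [4] * 16 + [10] * 8
print(best_reservation(demand, on_demand_rate=0.10, reserved_rate=0.06))
```

Under these assumed rates the cheapest option is to reserve the 4-instance baseline and leave the peak on-demand; reserving for the peak would mean paying for 6 idle instances two-thirds of the day.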

4. Use auto-scaling to eliminate idle capacity

If you size your compute for peak load and run it at that size continuously, you're paying for peak capacity during off-peak hours. Auto-scaling solves this by adding capacity when demand increases and removing it when demand drops.

AWS Auto Scaling Groups and Azure Virtual Machine Scale Sets scale on metrics from CloudWatch and Azure Monitor respectively: CPU utilization, request rate, queue depth, or custom application metrics. Web tier compute that handles ten times normal traffic during business hours and almost nothing overnight is a good candidate for aggressive scaling.

The prerequisite: the application must be horizontally scalable. Stateless application layers (web servers, API servers) scale easily. Stateful components (databases, caching layers) need separate handling, usually by externalizing state rather than scaling the stateful component itself.

For non-production environments, scheduled scaling is effective: shut down dev and staging environments outside working hours. A dev environment running 24/7 when developers are only active 8 hours a day wastes two-thirds of its compute budget.
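
The two-thirds figure follows directly from the hours involved, and the saving grows if the environment is also off at weekends. A minimal sketch, with `idle_fraction` as an illustrative helper name:

```python
HOURS_PER_WEEK = 24 * 7

def idle_fraction(active_hours_per_day, active_days_per_week=7):
    """Fraction of a week's compute hours during which nobody uses the environment."""
    active_hours = active_hours_per_day * active_days_per_week
    return 1 - active_hours / HOURS_PER_WEEK

# 8 hours a day, every day: 56 of 168 hours are active.
print(round(idle_fraction(8), 3))
# 8 hours a day, weekdays only: 40 of 168 hours are active.
print(round(idle_fraction(8, active_days_per_week=5), 3))
```

This counts compute hours only. Disks attached to stopped instances are still billed, so the saving on the total bill is somewhat smaller than the idle fraction.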

5. Use spot and preemptible instances for interruptible workloads

AWS Spot Instances and Azure Spot Virtual Machines offer unused capacity at discounts of 70–90% compared to on-demand prices. The trade-off: these instances can be reclaimed by the cloud provider on short notice (2 minutes on AWS, 30 seconds on Azure) when capacity is needed elsewhere.

This makes spot/preemptible instances well-suited for fault-tolerant, stateless workloads: batch data processing jobs, CI/CD build agents, ML training runs, video encoding, and load testing. These workloads can checkpoint progress, handle interruptions gracefully, and retry failed tasks without losing work.

Spot instances are not appropriate for production web servers, databases, or any workload where sudden termination would cause a user-facing outage. The discount is large, but the interruption risk rules spot capacity out for anything requiring continuous availability.
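
What "checkpoint and retry" means for a batch job can be sketched as follows. This is a simulation under stated assumptions: the `Interrupted` exception, the `interrupted` callback, and the in-memory `checkpoint` dictionary are hypothetical stand-ins. A real job would check the provider's interruption notice and write its checkpoint to durable storage that survives the instance.

```python
class Interrupted(Exception):
    """Raised when the (simulated) provider reclaims the instance."""

def run_batch(items, checkpoint, process, interrupted):
    """Process items starting at the checkpoint, recording progress after each one."""
    for index in range(checkpoint.get("next", 0), len(items)):
        if interrupted():
            raise Interrupted(f"reclaimed before item {index}")
        process(items[index])
        checkpoint["next"] = index + 1  # durable storage in a real job

def run_until_done(items, checkpoint, process, interrupted, max_attempts=10):
    """Retry the batch after each interruption; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            run_batch(items, checkpoint, process, interrupted)
            return attempt
        except Interrupted:
            continue  # a replacement instance would resume from the checkpoint
    raise RuntimeError("job did not finish within max_attempts")

# Simulate one interruption just before the third item.
signals = iter([False, False, True])
processed = []
attempts = run_until_done(
    items=[1, 2, 3, 4, 5],
    checkpoint={},
    process=processed.append,
    interrupted=lambda: next(signals, False),
)
print(attempts, processed)
```

Because progress is recorded after every item, the second attempt resumes at the third item rather than starting over, so no item is processed twice.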

6. Optimize data storage costs

Storage costs accumulate quietly. Common sources of waste: snapshots that were created for a migration three years ago and never deleted, log files written to S3 Standard when they're never accessed after 30 days, and databases retaining years of audit data that nobody queries.

Both AWS and Azure offer tiered storage with significantly different prices per GB:

AWS S3 Intelligent-Tiering automatically moves objects between frequent-access and infrequent-access tiers based on access patterns. S3 Glacier Instant/Flexible Retrieval handles archive storage at a fraction of S3 Standard cost.
Azure Blob Storage has Hot, Cool, and Archive tiers. Cool costs ~50% less than Hot per GB for data accessed less than once a month. Archive is ~95% cheaper but requires hours to retrieve.

Lifecycle policies automate the tier transitions: objects move from Standard to Infrequent Access after 30 days, then to Glacier after 90 days, based on rules you define. This can be configured in minutes and takes effect on the existing dataset without manual intervention.
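
The 30-day and 90-day rule described above could be expressed as an S3 lifecycle configuration along these lines. The `lifecycle_rule` helper, the rule ID scheme, and the `logs/` prefix are assumptions for illustration; the sketch only builds and prints the configuration and does not call AWS.

```python
import json

def lifecycle_rule(prefix, ia_after_days=30, glacier_after_days=90):
    """Build one S3 lifecycle rule: Standard -> Standard-IA -> Glacier."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the IA transition")
    return {
        "ID": f"tier-down-{prefix.strip('/') or 'all'}",  # hypothetical naming scheme
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }

# Hypothetical prefix; applies the rule only to objects under logs/.
config = {"Rules": [lifecycle_rule("logs/")]}
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, a dict of this shape can be
# passed as LifecycleConfiguration to put_bucket_lifecycle_configuration on an
# S3 client. That call is not made here.
```

Scoping the rule to a prefix is a cautious starting point: it lets you tier down logs without touching objects elsewhere in the bucket that may still be read frequently.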

7. Optimize resource allocation

Beyond instance sizing (covered in strategy 2), resource allocation optimization covers the broader question of what should be running at all. Common findings in cloud environments:

Unused load balancers attached to no instances, still incurring hourly charges
Unattached EBS volumes or Azure Managed Disks from terminated instances
Old AMI snapshots or Azure VM images from instances that no longer exist
Elastic IPs (AWS) or Public IP addresses (Azure) reserved but not in use
RDS instances or Azure SQL databases left running for testing that were never shut down

AWS Trusted Advisor and Azure Advisor both surface these idle or underutilized resources. Running a resource cleanup audit quarterly and terminating orphaned resources typically produces immediate savings without any architectural change.
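
For one item on the list, unattached disks, a cleanup audit might look like the sketch below. It assumes volume records shaped like the EC2 volume descriptions (`VolumeId`, `State`, `Size` in GiB), where an unattached volume has the state `available`. The sample data and the $0.08 per GB-month price are assumptions; check your own region and volume type.

```python
def unattached_volumes(volumes):
    """Return volumes that exist but are not attached to any instance."""
    return [volume for volume in volumes if volume["State"] == "available"]

def monthly_waste(volumes, price_per_gb_month):
    """Estimate the monthly cost of keeping unattached volumes around."""
    orphaned_gb = sum(volume["Size"] for volume in unattached_volumes(volumes))
    return orphaned_gb * price_per_gb_month

# Hypothetical inventory; ids and sizes are made up for the example.
volumes = [
    {"VolumeId": "vol-0a1", "State": "available", "Size": 100},
    {"VolumeId": "vol-0b2", "State": "in-use", "Size": 200},
    {"VolumeId": "vol-0c3", "State": "available", "Size": 500},
]
orphans = unattached_volumes(volumes)
print([volume["VolumeId"] for volume in orphans])
print(round(monthly_waste(volumes, price_per_gb_month=0.08), 2))
```

A report like this is best reviewed before anything is deleted; snapshotting a volume first is a cheap safeguard if there is any doubt about whether its data is still needed.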

8. Implement cost governance

Most of the optimizations above are point-in-time fixes. Cost governance is the process that keeps waste from accumulating again.

A basic governance framework includes:

Mandatory tagging policy: Resources without required tags (project, environment, owner) cannot be created. AWS Service Control Policies and Azure Policy both enforce this at the account/subscription level.
Budget alerts per team or project: Each team sees their own spend and gets alerted when it approaches their budget. This distributes cost ownership rather than centralizing it in one finance review per quarter.
Monthly cost review: A scheduled review of the previous month's spend against budget, identifying new anomalies and tracking whether previous optimization actions reduced spend as expected.
Approved instance types list: Prevents engineers from spinning up unnecessarily large instances when a smaller one would suffice. Implemented via IAM policies or Azure Policy that restrict which instance families can be launched.
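
As an illustration of the approved instance types item on the AWS side, the sketch below generates an IAM policy document that denies `ec2:RunInstances` for any instance type outside an allow-list, using the `ec2:InstanceType` condition key. The `approved_instance_policy` helper, the statement ID, and the example instance types are assumptions; test any such policy in a non-production account before rolling it out.

```python
import json

def approved_instance_policy(allowed_types):
    """Build an IAM policy that denies launching instance types not in the list."""
    if not allowed_types:
        raise ValueError("an empty allow-list would block every launch")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnapprovedInstanceTypes",  # hypothetical statement id
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:InstanceType": sorted(allowed_types)}
                },
            }
        ],
    }

# Hypothetical allow-list of small general-purpose types.
policy = approved_instance_policy({"t3.micro", "t3.small", "t3.medium"})
print(json.dumps(policy, indent=2))
```

Writing the rule as an explicit deny means it still applies to engineers who hold broad EC2 permissions from other policies, which is usually the point of a guardrail.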

The goal of governance is not to restrict what teams can build. It's to make cost visible and owned at the team level, so that spending decisions are made with cost information rather than without it.

Where to start

If you're not sure where to begin, the order matters. Without tagging (strategy 1), you can't attribute spend to teams or projects, which makes strategies 2–8 harder to prioritize. Start with visibility and attribution, then move to the quick wins: right-sizing recommendations from Advisor/Compute Optimizer, deleting idle resources, and applying lifecycle policies to storage.

If you're spending more than $10,000/month on cloud and haven't run a cost review, KPThink's DevOps engagements include infrastructure audits as part of the onboarding. See the pricing page for what's included.