AWS Auto Scaling Overview
- AWS Auto Scaling Service: A central service for configuring scaling across the scalable AWS resources it supports.
- Resources Supported:
- EC2 Instances: Through Auto Scaling groups, instances can be launched or terminated.
- Spot Fleet Requests: Instances can be launched or terminated, and instances interrupted for price or capacity reasons are replaced automatically.
- Amazon ECS: Adjusts the ECS service desired count.
- DynamoDB: Modifies Write Capacity Units (WCU) and Read Capacity Units (RCU) for tables and global secondary indexes.
- Aurora: Dynamically adjusts the number of Aurora Read Replicas in a cluster.
- Note: Other services may be added to AWS Auto Scaling over time.
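
As a rough reference, the resource types listed above can be matched to the service namespaces and scalable dimensions used by the scaling plans API. The lookup table below is only a sketch: the names SCALABLE_RESOURCES and dimensions_for are hypothetical, and the namespace and dimension strings should be checked against the current API reference before use.

```python
# Hypothetical lookup table: resource type -> (service namespace, scalable dimensions).
# The strings follow the scaling plans API naming; verify them against the API reference.
SCALABLE_RESOURCES = {
    "EC2 Auto Scaling group": (
        "autoscaling",
        ["autoscaling:autoScalingGroup:DesiredCapacity"],
    ),
    "Spot Fleet request": (
        "ec2",
        ["ec2:spot-fleet-request:TargetCapacity"],
    ),
    "ECS service": (
        "ecs",
        ["ecs:service:DesiredCount"],
    ),
    "DynamoDB table": (
        "dynamodb",
        ["dynamodb:table:ReadCapacityUnits", "dynamodb:table:WriteCapacityUnits"],
    ),
    "DynamoDB global secondary index": (
        "dynamodb",
        ["dynamodb:index:ReadCapacityUnits", "dynamodb:index:WriteCapacityUnits"],
    ),
    "Aurora cluster": (
        "rds",
        ["rds:cluster:ReadReplicaCount"],
    ),
}


def dimensions_for(resource_type):
    """Return the scalable dimensions recorded for a resource type."""
    _namespace, dimensions = SCALABLE_RESOURCES[resource_type]
    return dimensions


if __name__ == "__main__":
    for name, (namespace, dims) in SCALABLE_RESOURCES.items():
        print(f"{name}: namespace={namespace}, dimensions={dims}")
```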
Scaling Plans
- Dynamic Scaling: Adjusts capacity to keep resource utilization at a target value, set by a predefined strategy or a custom metric.
- Optimize for Availability: Targets 40% resource utilization.
- Balance Availability and Cost: Targets 50% resource utilization.
- Optimize for Cost: Targets 70% resource utilization.
- Note: Approaching 100% utilization can lead to performance bottlenecks, so even the cost-optimized strategy leaves headroom.
- Custom Metrics: Users can choose their own metric and target value instead of a predefined strategy.
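
To see how the three target values change the outcome, the sketch below estimates the capacity needed to bring utilization back to the target. It assumes a simple proportional rule (capacity scales with utilization divided by target), which is an approximation for illustration and not the service's exact algorithm. The names STRATEGY_TARGETS and desired_capacity are hypothetical.

```python
import math

# Target utilization (percent) for each predefined strategy, taken from the notes above.
STRATEGY_TARGETS = {"availability": 40.0, "balance": 50.0, "cost": 70.0}


def desired_capacity(current_capacity, current_utilization, strategy,
                     min_capacity=1, max_capacity=100):
    """Estimate the capacity that would bring utilization back to the strategy's target.

    Assumption: load is spread evenly, so utilization is inversely
    proportional to capacity. The result is clamped to [min_capacity, max_capacity].
    """
    target = STRATEGY_TARGETS[strategy]
    raw = math.ceil(current_capacity * current_utilization / target)
    return max(min_capacity, min(max_capacity, raw))


if __name__ == "__main__":
    # 10 instances currently running at 80% utilization.
    for strategy in STRATEGY_TARGETS:
        print(strategy, desired_capacity(10, 80.0, strategy))
```

With 10 instances at 80% utilization, the availability strategy asks for the most capacity and the cost strategy for the least, which is the trade-off the strategy names describe.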
Dynamic Scaling Options
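- Disable Scale-In: Prevents the policy from removing capacity, so the resource only scales out.
- Cooldown Period: Specifies how long to wait after a scaling activity completes before another one can start.
- Warmup Time: Defines how long a newly launched instance takes to warm up; until then, the Auto Scaling group (ASG) does not count it toward the group's aggregated metrics.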
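
These options map onto fields of a target tracking configuration in the scaling plans API. The helper below is a sketch that only builds the dictionary and makes no AWS call. The function name target_tracking_config and its defaults are hypothetical, and the split between warmup (Auto Scaling groups) and cooldowns (other resources) reflects my reading of the API reference, so treat it as an assumption to verify.

```python
def target_tracking_config(target_value, metric_type, is_asg,
                           disable_scale_in=False,
                           cooldown_seconds=300, warmup_seconds=300):
    """Build a target tracking configuration dictionary (no API call is made).

    Assumption: warmup applies only to Auto Scaling groups, while
    scale-out/scale-in cooldowns apply to the other resource types.
    """
    config = {
        "PredefinedScalingMetricSpecification": {
            "PredefinedScalingMetricType": metric_type,
        },
        "TargetValue": float(target_value),
        "DisableScaleIn": disable_scale_in,
    }
    if is_asg:
        # Time until a new instance contributes to the group's metrics.
        config["EstimatedInstanceWarmup"] = warmup_seconds
    else:
        # Time to wait after a scaling activity before starting another.
        config["ScaleOutCooldown"] = cooldown_seconds
        config["ScaleInCooldown"] = cooldown_seconds
    return config


if __name__ == "__main__":
    # An Auto Scaling group that may only scale out, targeting 50% CPU.
    print(target_tracking_config(50, "ASGAverageCPUUtilization",
                                 is_asg=True, disable_scale_in=True))
    # An ECS service with a 120-second cooldown in both directions.
    print(target_tracking_config(70, "ECSServiceAverageCPUUtilization",
                                 is_asg=False, cooldown_seconds=120))
```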
Predictive Scaling
- Uses machine learning algorithms to analyze historical load data.
- Generates forecasts and schedules scaling actions based on predictions.
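
The forecast-then-schedule idea can be illustrated with a toy example. The sketch below is not the service's machine learning model: it substitutes a naive hour-of-day average over past days, then turns the forecast into scheduled capacity changes. The function names and the load_per_instance parameter are hypothetical.

```python
import math
from statistics import mean


def forecast_by_hour(history):
    """Average the load seen at each hour of the day.

    history: list of days, each a list of 24 hourly load values.
    Returns a list of 24 forecast values (a naive stand-in for a real model).
    """
    return [mean(day[hour] for day in history) for hour in range(24)]


def schedule_actions(forecast, load_per_instance, min_capacity=1):
    """Convert an hourly load forecast into (hour, capacity) scheduled actions.

    An action is emitted only when the required capacity changes.
    """
    actions = []
    previous = None
    for hour, load in enumerate(forecast):
        capacity = max(min_capacity, math.ceil(load / load_per_instance))
        if capacity != previous:
            actions.append((hour, capacity))
            previous = capacity
    return actions


if __name__ == "__main__":
    # Two days of made-up load: quiet nights, busy from 08:00 to 18:00.
    history = [
        [100] * 8 + [400] * 10 + [100] * 6,
        [120] * 8 + [440] * 10 + [80] * 6,
    ]
    forecast = forecast_by_hour(history)
    print(schedule_actions(forecast, load_per_instance=100))
```

In this made-up data set the schedule adds capacity at 08:00, ahead of the daily peak, and removes it at 18:00, which is the behavior predictive scaling aims for.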
Summary
AWS Auto Scaling supports scaling for several AWS resources: EC2 instances, Spot Fleet requests, Amazon ECS services, DynamoDB tables and indexes, and Aurora read replicas. It offers dynamic scaling, which adjusts resources based on actual usage, and predictive scaling, which uses machine learning to anticipate future demand. Users can optimize for availability, for cost, or for a balance of the two, and can supply custom metrics and target values.