AWS Auto Scaling Policies Summary
Auto Scaling Groups (ASG) in AWS can use various scaling policies to adjust their capacity based on demand. Here are the main types of scaling policies:
- Dynamic Scaling: Adjusts the number of instances within an ASG in response to real-time changes in demand. Target tracking and simple/step scaling are both forms of dynamic scaling.
- Target Tracking Scaling: Automatically adjusts the ASG to maintain a specified average metric value (e.g., CPU utilization at 40%).
- Simple/Step Scaling: Uses CloudWatch alarms to scale out or in when a metric crosses a specified threshold. Step scaling varies the size of the adjustment with the size of the alarm breach.
- Scheduled Scaling: Adjusts the ASG ahead of time based on predictable usage patterns (e.g., scaling out at 5:00 PM on Fridays).
- Predictive Scaling: Uses machine learning to forecast demand and schedule scaling actions in advance.
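The target tracking and scheduled scaling examples above (CPU at 40%, 5:00 PM on Fridays) can be sketched as request parameters. This is a minimal sketch, not a definitive setup: the helper names, the ASG name "web-asg", and the policy naming scheme are hypothetical. The dictionary keys follow the Auto Scaling PutScalingPolicy and PutScheduledUpdateGroupAction request shapes as exposed by boto3. Nothing here calls AWS; the functions only build the arguments.

```python
def target_tracking_policy(asg_name: str, target_cpu: float = 40.0) -> dict:
    """Build keyword arguments for a target tracking policy on average CPU."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
    }


def friday_evening_action(asg_name: str, desired: int) -> dict:
    """Build keyword arguments for a recurring scheduled action."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": f"{asg_name}-friday-5pm",  # hypothetical name
        # Cron format: minute hour day-of-month month day-of-week.
        # 17:00 every Friday, in UTC unless a TimeZone is also supplied.
        "Recurrence": "0 17 * * 5",
        "DesiredCapacity": desired,
    }


policy_args = target_tracking_policy("web-asg")
schedule_args = friday_evening_action("web-asg", desired=6)

# With credentials configured, these could be applied with boto3, e.g.:
#   import boto3
#   client = boto3.client("autoscaling")
#   client.put_scaling_policy(**policy_args)
#   client.put_scheduled_update_group_action(**schedule_args)
```

Keeping the argument construction separate from the API call makes the configuration easy to review and unit test before it touches a real group.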
Metrics for Scaling
Choosing the right metric to scale on depends on the application's behavior:
- CPU Utilization: Indicates the compute load on instances.
- RequestCountPerTarget: Measures the number of requests each target (instance) behind an Application Load Balancer is handling.
- Network In/Out: Useful for network-bound applications with significant data transfer.
- Custom Metrics: Application-specific metrics pushed to CloudWatch.
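To get a feel for how a chosen metric translates into capacity, the following rough estimate scales the instance count in proportion to how far the metric is from its target. This is a rule-of-thumb sketch, not the algorithm AWS uses internally; the function name and the default size limits are assumptions made for illustration.

```python
import math


def estimate_desired_capacity(
    current_instances: int,
    metric_value: float,
    target_value: float,
    min_size: int = 1,
    max_size: int = 10,
) -> int:
    """Proportional estimate: capacity grows with metric_value / target_value."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_instances * metric_value / target_value)
    # Clamp to the group's minimum and maximum size.
    return max(min_size, min(max_size, raw))


# 4 instances averaging 80% CPU against a 40% target: roughly double the fleet.
cpu_estimate = estimate_desired_capacity(4, 80.0, 40.0)

# 4 instances at 1500 requests per target against a target of 1000,
# capped by a maximum group size of 5.
request_estimate = estimate_desired_capacity(4, 1500, 1000, max_size=5)
```

The same arithmetic works for any metric that falls roughly in proportion as instances are added, which is a useful test when deciding whether a metric is suitable to scale on.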
Scaling Cooldown
- After a scaling activity, a cooldown period blocks further scaling actions so that metrics can stabilize. Cooldowns apply to simple scaling policies; target tracking and step scaling rely on an instance warmup time instead.
- The default cooldown period is 5 minutes (300 seconds).
- Launch instances from ready-to-use AMIs to reduce configuration time, which allows a shorter cooldown period.
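The cooldown rule above amounts to a simple time check. The sketch below illustrates that logic only; the function name and timestamps are hypothetical, and the real service tracks this state for you.

```python
from datetime import datetime, timedelta

DEFAULT_COOLDOWN = timedelta(seconds=300)  # 5 minutes, the default noted above


def in_cooldown(
    last_activity_end: datetime,
    now: datetime,
    cooldown: timedelta = DEFAULT_COOLDOWN,
) -> bool:
    """Return True while a new scaling action should still be held back."""
    return now - last_activity_end < cooldown


last_scale_out = datetime(2024, 1, 5, 17, 0, 0)  # illustrative timestamp

# Two minutes after the last activity: still cooling down with the default.
blocked = in_cooldown(last_scale_out, last_scale_out + timedelta(minutes=2))

# The same moment with a 90-second cooldown, as might be reasonable when
# instances boot quickly from a pre-configured AMI: scaling is allowed again.
allowed_with_short_cooldown = not in_cooldown(
    last_scale_out,
    last_scale_out + timedelta(minutes=2),
    cooldown=timedelta(seconds=90),
)
```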

Best Practices
- Enable detailed monitoring so instance metrics are published at 1-minute intervals (basic monitoring uses 5-minute intervals), which makes scaling more responsive.
- Optimize instance launch times with pre-configured AMIs to enhance scaling responsiveness.