Summary: Troubleshooting CloudFormation Wait Conditions
When using AWS CloudFormation, a wait condition may not receive the required number of signals from an Amazon EC2 instance. Common causes, and the checks that help diagnose them, include:
- AMI Lacks Helper Scripts: The Amazon Machine Image (AMI) may not have the CloudFormation helper scripts installed. These scripts can be manually installed if they are missing.
- Check cfn-init and cfn-signal Outputs: Review the log files on the instance to see how the cfn-init and cfn-signal commands executed (on Linux, typically /var/log/cfn-init.log and /var/log/cloud-init-output.log).
- Disable Rollback: To debug issues, disable the rollback feature in CloudFormation. This prevents the automatic deletion of failed EC2 instances, allowing you to access them for troubleshooting.
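- Internet Access: Ensure that the EC2 instance has outbound internet access so that it can send its signal to the CloudFormation service. This matters most for an instance in a private subnet, which has no direct route to the internet and needs one provided, for example through a NAT gateway.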
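To make "the required number of signals" concrete, here is a minimal sketch of a wait condition and its handle. The logical names MyWaitHandle and MyWaitCondition and the Timeout and Count values are illustrative assumptions, not taken from the original; MyInstance refers to the instance used in the example later in this summary.

```yaml
Resources:
  # Hypothetical handle; !Ref on it returns the presigned URL the instance signals.
  MyWaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle

  # Hypothetical wait condition tied to the handle above.
  MyWaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: MyInstance      # start waiting only after the instance resource is created
    Properties:
      Handle: !Ref MyWaitHandle
      Timeout: "600"           # seconds to wait for signals (illustrative value)
      Count: 1                 # number of success signals required (illustrative value)
```

If fewer than Count success signals arrive before Timeout elapses, or a failure signal arrives, the wait condition fails and stack creation fails with it.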
Steps to Debug CloudFormation Wait Condition Failures
- Disable Rollback on Failure
  - When creating or updating a stack, choose to preserve successfully provisioned resources instead of rolling back all resources. This allows you to access the instance for debugging.
- Check Internet Connectivity
  - Perform a connectivity test from the instance to verify it can reach the internet, which is necessary for signaling CloudFormation.
- Review Log Files
  - Access the instance using tools like EC2 Instance Connect and review the log files to identify the root cause of the failure.
- Simulate Failure
  - You can simulate a failure by modifying the CloudFormation template to include a command that exits with a non-zero status code, triggering a failure signal to be sent to CloudFormation.
- Observe Stack Creation
  - Create a stack with the modified template to observe the failure behavior. The stack should fail and roll back if configured to do so.
- Use Stack Failure Options
  - When creating the stack, under stack failure options, select the option to preserve successfully provisioned resources for debugging purposes.
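If the connectivity check fails for an instance in a private subnet, one possible fix is a default route through a NAT gateway. The fragment below is only a sketch: PrivateDefaultRoute, PrivateRouteTable, and NatGateway are hypothetical logical names assumed to be declared elsewhere in the template, and your network layout may call for a different approach.

```yaml
Resources:
  # Hypothetical default route for the private subnet's route table.
  PrivateDefaultRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable   # assumed: route table associated with the private subnet
      DestinationCidrBlock: 0.0.0.0/0        # send all non-local traffic to the NAT gateway
      NatGatewayId: !Ref NatGateway          # assumed: NAT gateway living in a public subnet
```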
Example: CloudFormation Template Modification to Simulate Failure
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Metadata:
      AWS::CloudFormation::Init:
        config:
          commands:
            test_failure:
              command: "echo boom && exit 1"
In the above example, the echo boom && exit 1 command exits with a non-zero status, which causes cfn-init to fail. If the instance's user data script then passes cfn-init's exit code to cfn-signal (for example, with cfn-signal -e $?), cfn-signal sends a failure signal to the wait condition.
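The example above shows only the metadata; the failure signal is actually sent from the instance's user data. The following is a minimal sketch of how that could look, assuming an Amazon Linux AMI (where the helper scripts come from the aws-cfn-bootstrap package and install under /opt/aws/bin) and a wait condition handle with the hypothetical logical name MyWaitHandle declared elsewhere in the template.

```yaml
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    # Metadata with AWS::CloudFormation::Init goes here, as in the example above.
    Properties:
      # ImageId, InstanceType, and networking properties omitted for brevity.
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -x
          # Install the helper scripts in case the AMI does not ship them.
          yum install -y aws-cfn-bootstrap
          # Run the commands defined under AWS::CloudFormation::Init for this resource.
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource MyInstance --region ${AWS::Region}
          # Forward cfn-init's exit code; a non-zero value is reported as a failure signal.
          /opt/aws/bin/cfn-signal -e $? '${MyWaitHandle}'
```

Note that the script deliberately does not use bash's -e option: with it, the script would stop as soon as cfn-init fails, cfn-signal would never run, and the wait condition would time out instead of receiving an explicit failure signal.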
Conclusion
Understanding the use of cfn-signal and wait conditions is crucial for managing CloudFormation stacks. Properly configuring rollback settings and knowing how to access and debug EC2 instances can help identify issues during stack creation and development cycles.