Amazon Web Services has added a small but important resiliency feature: its Elastic Compute Cloud (EC2) instances now include automatic recovery by default.
EC2 instances could previously be configured to recover automatically by setting an alarm in Amazon CloudWatch – AWS’s monitoring and observability service.
But AWS has now enabled recovery by default on EC2 instances.
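For context, the older opt-in approach amounted to a CloudWatch alarm on the instance's system status check, with the EC2 recover action attached. The following is a minimal sketch of that setup, assuming the boto3 SDK; the helper names, alarm name, period, and evaluation count are illustrative choices rather than anything AWS prescribes.

```python
def build_recovery_alarm(instance_id: str, region: str) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    The alarm name, period, and evaluation count are assumptions
    chosen for illustration; adjust them to taste.
    """
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # Fails when the underlying host, not the guest OS, is impaired.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,             # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,   # consecutive failing windows (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in action that asks EC2 to recover the instance.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def create_recovery_alarm(cloudwatch_client, instance_id: str, region: str) -> None:
    """Create the alarm using a boto3 CloudWatch client supplied by the caller."""
    cloudwatch_client.put_metric_alarm(**build_recovery_alarm(instance_id, region))


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("cloudwatch", region_name="us-east-1")
#   create_recovery_alarm(client, "i-0123456789abcdef0", "us-east-1")
```

Keeping the parameter construction separate from the API call makes the alarm definition easy to inspect without touching a live account.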
AWS states that if an underlying hardware issue occurs, EC2 instances will find themselves a new home in the Amazon cloud – complete with their instance ID, private IP addresses, public IPv4 address, Elastic IP addresses, and all instance metadata. However, data in memory is lost.
AWS’s brief announcement of the new feature, and its other documents, make no mention of the recovery point for data on disk, or of how long a self-recovered server takes to resume operations.
It is always possible to disable the automatic recovery function. This might seem like an odd choice to make, but the AWS documentation offers a reason to consider it: instances in placement groups (an AWS feature that lets you ensure that instances run on a particular group of hardware) are restored to the same placement group. Server hardware failures can easily indicate a wider problem in a rack or row. Recovering automatically into that same unstable environment may therefore be a less attractive option than recovering in another Availability Zone or Region.
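Opting out can be scripted. The sketch below assumes the boto3 SDK and its EC2 `modify_instance_maintenance_options` call, whose `AutoRecovery` parameter accepts `disabled` or `default`; the helper name `set_auto_recovery` is hypothetical.

```python
def set_auto_recovery(ec2_client, instance_id: str, enabled: bool):
    """Switch simplified automatic recovery on or off for one instance.

    `ec2_client` is expected to be a boto3 EC2 client supplied by the caller.
    """
    return ec2_client.modify_instance_maintenance_options(
        InstanceId=instance_id,
        # "default" restores AWS's default behaviour; "disabled" opts out.
        AutoRecovery="default" if enabled else "disabled",
    )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   set_auto_recovery(ec2, "i-0123456789abcdef0", enabled=False)
```

An operator who disables recovery this way would then need their own failover plan, for example relaunching in a different Availability Zone.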
AWS recommends working across multiple Availability Zones for resiliency, while failures in large regions like the venerable US-EAST-1 have shown that building resiliency across multiple regions is also quite reasonable. ®