How VMware Cloud on AWS Handles Host Failures Automatically
Here is how VMware Cloud on AWS handles a host failure automatically, based on one I watched happen in production. When you run a large enough infrastructure, failure is inevitable. How you handle that can be a big differentiator. With VMware Cloud on AWS, the hosts are monitored 24×7 by VMware/AWS Support all as part of the service. If you pay for X number of hosts you should have X. That includes during maintenance and failure operations.
Watching a Host Failure Get Remediated
I’m not sure lucky is the right word but I did witness a host failure with a customer I was working with. True to the marketing, it was picked up and automatically remediated.

Looking at the log extract above a new host was being provisioned the same minute the issue was identified. Obviously, this needed to boot and join the VMware/vSAN cluster before a full data evacuation takes place on the faulty host and finally, the host is removed.
All of this was seamless to the customer. I noticed it as a few HA alarms tripped in the vCenter (these were cosmetic only).
Just another reason why you should look at the VMware Cloud on AWS Service… For more on the hardware behind this, see my VMware Cloud on AWS hosts deep-dive.






