When you run a large enough infrastructure, failure is inevitable. How you handle it can be a big differentiator. With VMware Cloud on AWS, the hosts are monitored 24×7 by VMware/AWS Support as part of the service. If you pay for X hosts, you should have X hosts, including during maintenance and failure operations.
I’m not sure lucky is the right word, but I did witness a host issue with a customer I was working with. True to the marketing, it was picked up and automatically remediated.
Looking at the log extract above, a new host was being provisioned the same minute the issue was identified. The new host needed to boot and join the VMware/vSAN cluster first. Only then did a full data evacuation take place on the faulty host, after which the faulty host was removed.
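The ordering matters here: the replacement joins before the faulty host leaves, so the cluster never has fewer hosts than you pay for. Below is a minimal sketch of that sequence as a toy model. The `Cluster` class, the `remediate` function and the host names are all hypothetical and only illustrate the order of steps; this is not a VMware or AWS API, and the real service does this for you.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster (hypothetical, not a VMware API)."""
    hosts: list = field(default_factory=list)
    history: list = field(default_factory=list)  # host count after each step

    def record(self):
        self.history.append(len(self.hosts))


def remediate(cluster, faulty, replacement):
    """Replay the remediation order described in the post and return the steps."""
    steps = []

    # 1. A replacement host is provisioned as soon as the issue is identified.
    steps.append(f"provision {replacement}")
    cluster.record()

    # 2. The replacement boots and joins the cluster.
    cluster.hosts.append(replacement)
    steps.append(f"join {replacement}")
    cluster.record()

    # 3. Data is fully evacuated from the faulty host.
    steps.append(f"evacuate {faulty}")
    cluster.record()

    # 4. Only now is the faulty host removed.
    cluster.hosts.remove(faulty)
    steps.append(f"remove {faulty}")
    cluster.record()

    return steps


if __name__ == "__main__":
    demo = Cluster(hosts=["host-1", "host-2", "host-3"])
    for step in remediate(demo, faulty="host-2", replacement="host-4"):
        print(step)
    print("lowest host count during remediation:", min(demo.history))
```

In this model the cluster briefly holds one extra host while the evacuation runs, which is the price of never dropping below the original count.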
All of this was seamless to the customer. I only noticed it because a few HA alarms tripped in vCenter (these were cosmetic only).
It's just another reason why you should look at the VMware Cloud on AWS service.