Today HP Cloud scheduled a nine-hour maintenance window for our AZ. All of our servers lost network connectivity and were offline for a while. This shouldn’t happen. We should come up with a plan to make sure it doesn’t.
Here’s my idea:
- At least two instances of everything, in different AZs. phab-web1 and phab-web2 shouldn’t live in the same AZ. Mrz told me that HP’s LBaaS balances by IP address, not by instance the way ELB does on AWS.
- Distributed monitoring. This is something I’ll try to work on soon.
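As a rough sketch of how the first point could be checked, assuming we keep a simple hostname-to-AZ mapping somewhere: the inventory format, the AZ names, and the phab-db hosts below are made up for illustration and are not pulled from any HP Cloud API.

```python
import re
from collections import defaultdict


def service_name(hostname):
    """Strip the trailing instance number, e.g. 'phab-web1' -> 'phab-web'."""
    return re.sub(r"\d+$", "", hostname)


def single_az_services(inventory):
    """Return the services whose instances span fewer than two AZs."""
    zones = defaultdict(set)
    for hostname, az in inventory.items():
        zones[service_name(hostname)].add(az)
    return sorted(name for name, azs in zones.items() if len(azs) < 2)


# Hypothetical inventory: hostname -> availability zone (placeholder names).
inventory = {
    "phab-web1": "az1",
    "phab-web2": "az2",
    "phab-db1": "az1",
    "phab-db2": "az1",
}

print(single_az_services(inventory))  # prints ['phab-db']
```

Something like this could run as a periodic check, so we find out that a service is stuck in one AZ before the next maintenance window does it for us.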
I’d like to get some additional input on this, as well.