Test your backups! Easier said than done, of course.
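For what it's worth, "testing" can start as small as restoring the backup somewhere disposable and comparing it against the source. Here is a minimal sketch of that idea, assuming a plain tar.gz backup of a directory. The function names (`make_backup`, `verify_backup`, `file_digests`) are made up for illustration, and a real setup would restore on a separate machine and check the application actually starts on the restored data.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def make_backup(source_dir: Path, archive: Path) -> None:
    """Hypothetical backup step: pack source_dir into a gzip-compressed tarball.

    The archive must live outside source_dir, or it would try to include itself.
    """
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to source_dir rather than absolute ones.
        tar.add(source_dir, arcname=".")


def verify_backup(source_dir: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it to the source."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)
        return file_digests(Path(scratch)) == file_digests(source_dir)
```

This only proves the archive can be read back and matches the source at the time of the check. Running it on a schedule is the point: the restore path gets exercised regularly instead of only on the day you need it.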

Experience with 'small' high-availability, safety-critical systems says:

1- 'failover often, failover safely'. Things that run only once a month, or only 'just in case', are the most likely to fail when you actually need them.

2- people (customers) often aren't ready to pay what it costs to design and operate systems at the availability levels they want.