The arithmetic in the current recommendations did not seem to give the right results, particularly for `U`: you need at least `U+1` to stay available in the face of `U` failures.
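As a quick sanity check of the `U+1` claim, here is a minimal sketch. It assumes `U` counts simultaneous failures of interchangeable units (clusters or replicas; the text above does not say which) and that "available" means at least one unit is still up. The names `min_units` and `survives` are hypothetical and do not come from the docs.

```python
def min_units(u: int) -> int:
    """Smallest unit count that leaves one unit up after u simultaneous failures.

    Assumption: units are interchangeable and one survivor is enough.
    """
    if u < 0:
        raise ValueError("u must be non-negative")
    return u + 1


def survives(total: int, failed: int) -> bool:
    """True if at least one unit is still up after `failed` units go down."""
    return total - failed >= 1


if __name__ == "__main__":
    for u in range(4):
        n = min_units(u)
        # With exactly U units, U failures take everything down;
        # with U+1 units, one unit remains.
        print(u, n, survives(u, u), survives(n, u))
```

With only `U` units, `U` failures leave nothing running, which is why the lower bound is `U+1` under these assumptions.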
- extract the portion related to multi-cluster operation into a new `multi-cluster.md` doc
- merge the remainder (which was basically high-level troubleshooting advice) into `cluster-troubleshooting.md`