Friday, April 8, 2022

NSX-T 3.2.0.1 - Upgrade failed from 3.1.3.6

Issue


Today, while upgrading an NSX-T Data Center infrastructure from version 3.1.3.6 to 3.2.0.1, I ran into the following issue.
All the NSX-T Manager appliances had been upgraded to version 3.2.0.1, but while the last appliance was being upgraded the result was as follows:


Looking in System > Lifecycle Management > Upgrade:



It was not possible to connect to the NSX-T Manager appliances via the UI. Via SSH, however, the appliances were reachable and upgraded, but the output of the “get cluster status” NSX Manager CLI command clearly showed that the group status was degraded and that two nodes were down.

Solution


Disclaimer: Some of the procedures described below may not be officially supported by VMware. Use them at your own risk.

To solve the issue, I decided to keep the good NSX-T Manager appliance, deactivate the cluster, and deploy new appliances from the good one.
As described in this link, if two of the three NSX-T Manager cluster nodes are lost, we must deactivate the cluster.
An interesting guide on NSX-T recoverability was written by Rutger Blom.

But let's proceed step by step.
  • We first need to deactivate the cluster. This operation must be performed from the good (surviving) NSX-T Manager appliance by running the CLI command "deactivate cluster".

  • We can now delete the failed NSX-T Manager appliances from the UI.
    If something goes wrong, you also need to detach the node.

  • Let's now reset the NSX-T upgrade plan via the API, as shown in KB82042.

    DELETE https://NSX_MGR/api/v1/upgrade-mgmt/plan

    For this to take effect, SSH to the Manager node controlling the upgrade and restart the upgrade service:

    > restart service install-upgrade

  • After refreshing the UI, we can continue with a fake upgrade, clicking "NEXT - NEXT - DONE" through to the end.
  • At this point we have a single operational manager/controller node, upgraded and without errors or pending tasks.

    From here, we should be able to deploy two new NSX-T Manager appliances from the UI, join them to the active cluster node, and get back to this:


That's it.
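
For anyone who prefers to script the API part of the steps above instead of using a REST client, here is a minimal Python sketch. Only the DELETE on /api/v1/upgrade-mgmt/plan comes from the procedure in this post. Everything else is an assumption for illustration: the helper names (build_request, send, management_cluster_is_stable) are hypothetical, "NSX_MGR" and the credentials are placeholders, and the status check assumes that GET /api/v1/cluster/status returns a "mgmt_cluster_status" object with "status" and "offline_nodes" fields, which should be verified against the API guide for your NSX-T version. The "restart service install-upgrade" step is not covered here and still has to be run over SSH.

```python
import base64
import json
import ssl
import urllib.request


def build_request(manager, path, method, user, password):
    """Build (but do not send) an NSX-T API request using HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{manager}{path}",
        method=method,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )


def send(request, verify_tls=True):
    """Send a prepared request and return (HTTP status, parsed JSON or None)."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Lab setups with self-signed certificates only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as resp:
        body = resp.read()
        return resp.status, (json.loads(body) if body else None)


def management_cluster_is_stable(payload):
    """Return True if the (assumed) cluster status payload reports STABLE
    and lists no offline nodes."""
    mgmt = payload.get("mgmt_cluster_status", {})
    return mgmt.get("status") == "STABLE" and not mgmt.get("offline_nodes")


# Reset the upgrade plan (the DELETE call from the step above).
# "NSX_MGR", "admin" and "CHANGE_ME" are placeholders.
reset_plan = build_request(
    "NSX_MGR", "/api/v1/upgrade-mgmt/plan", "DELETE", "admin", "CHANGE_ME"
)

# Assumed endpoint for checking the cluster after the new nodes have joined.
cluster_status = build_request(
    "NSX_MGR", "/api/v1/cluster/status", "GET", "admin", "CHANGE_ME"
)

# Nothing is sent until send() is called explicitly, for example:
#   status, _ = send(reset_plan)
#   _, payload = send(cluster_status)
#   print(management_cluster_is_stable(payload))
```

The requests are only built at import time, so the sketch can be reviewed and adapted before anything touches the manager.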
