vCenter Upgrade Error: Exception Occurred in install precheck phase

Error presented by VAMI Interface

Installation Error

Caveat

This procedure almost certainly bypasses some form of pre-check. If this is a production system, contact VMware support instead!

Troubleshooting

VCSA 7.0 has moved the upgrade process logging to a new location. The log is now at /storage/log/vmware/applmgmt/update_microservice.log (actual path), also reachable as /var/log/vmware/applmgmt/update_microservice.log (symlink).
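
To confirm which of the two paths is the real file on a given appliance, a small helper along these lines could be used. This is only a sketch: `resolve_log_path` is a hypothetical name, and it assumes GNU `readlink` with the `-f` option is available in the appliance's Bash shell.

```shell
# Print the real path behind a log location, following symlinks.
# Usage: resolve_log_path <path>
resolve_log_path() {
    # readlink -f canonicalizes the path, resolving every symlink along the way
    readlink -f "$1"
}
```

For example, `resolve_log_path /var/log/vmware/applmgmt/update_microservice.log` should print the /storage/log/... path if the symlink is laid out as described above.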

update_microservice

This appears to be a rough order of operations with this new update process:

  • Pre-Checks: First, the upgrade tries to identify the system being upgraded:
update_microservice::          precheckEventHandler: 148 -     INFO - Precheck event happens
update_b2b::                      precheck: 709 -    DEBUG - Running update prechecks
update_b2b::               b2bRequirements: 479 -    DEBUG - Running B2B Requirements hook and processing the results
update_b2b::                _runScriptHook: 330 -    DEBUG - Running B2B script with hook CollectRequirementsHook
update_b2b::                _runScriptHook: 339 -    DEBUG - update script output to file /var/log/vmware/applmgmt/upgrade_hook_CollectRequirementsHook
extensions::                _findExtension:  83 -    DEBUG - Found script hook <module 'update_script' from '/storage/core/software-update/updates/7.0.1.00200/scripts/update_script.py'>:CollectRequirementsHook'
update_utils::                     isGateway:  83 -    DEBUG - Not running on a VMC Gateway appliance.
update_utils::                  isB2BUpgrade:  72 -    DEBUG - Bundle will execute upgrade: False
update_script::           collectRequirements: 492 -    DEBUG - Checking verisons
update_script::           collectRequirements: 496 -    DEBUG - Source VCSA version = 7.0.1.00100
update_script::           collectRequirements: 500 -     INFO - Target VCSA version = 7.0.1.00200
update_utils::               getRPMBlacklist: 185 -    DEBUG - vCSA deployment Type: embedded
update_b2b::               b2bRequirements: 493 -    DEBUG - Getting packages excluding the ones in blacklist

From there, it picks up the scope for the upgrade, and verifies against common upgrade issues:

update_b2b::               b2bRequirements: 528 -    DEBUG - Calculated packages list
update_b2b::                     checkDisk: 423 -    DEBUG - Checking for disk utilization
update_b2b::                     checkDisk: 467 -    DEBUG - CheckDisk completed, returning with selected disk partition /storage/updatemgr
update_b2b::                      precheck: 740 -    DEBUG - Estimating time to install..
update_b2b::                 estimate_time: 679 -    DEBUG - Estimating time required for rpm-update, services start-stop and reboot time if its required
update_b2b::                 estimate_time: 682 -    DEBUG - Calculating RPM installation time
update_b2b::              rpm_install_time: 587 -    DEBUG - Reading all rpms present in rpm-manifest.json
update_b2b::              rpm_install_time: 588 -    DEBUG - Estimating installation time for installed rpms and new rpms
update_b2b::       get_installed_rpms_list: 564 -    DEBUG - Getting the list of installed RPMs along with the time of install
update_b2b::       get_installed_rpms_list: 578 -    DEBUG - Completed getting the list of rpms, returning with the list: <class 'list'>
update_b2b::              rpm_install_time: 610 -    DEBUG - Installation time estimated successfully, returning with time for installation 23
update_b2b::                 estimate_time: 684 -    DEBUG - Calculating time to start and stop services
update_b2b::        estimate_time_services: 620 -    DEBUG - Estimating time for services-start and services-stop
update_b2b::        estimate_time_services: 640 -    DEBUG - Completed estimating time for starting and stopping services, returning with the required time: 2
task_manager::                        update:  80 -    DEBUG - UpdateTask: status=SUCCEEDED, progress=100, message={'id': 'com.vmware.appliance.update.prechecks_task_ok', 'default_message': 'Prechecks completed', 'args': []}

In this case, everything looks good. I'm not really sure why it needs the SSO Administrator password, and there isn't much online about this. Once the install actually starts, we see three errors:

update_b2b::                   resumeStage:3431 -    DEBUG - 'download' phase is 100% completed. checkAllRpmsArePresent
rpmfunctions::        checkAllRpmsArePresent: 308 -    ERROR - Empty Stage location passed. This cannot be empty.
update_b2b::                   resumeStage:3497 -    ERROR - Exception in resume stage. Exception : {Package discrepency error, Cannot resume!}
task_manager::                        update:  80 -    DEBUG - UpdateTask: status=FAILED, progress=0, message={'id': 'com.vmware.appliance.plain_message', 'default_message': '%s', 'args': ['Package discrepency error, Cannot resume!']}
dbfunctions::                       execute:  81 -    DEBUG - Executing {SELECT CASE WHEN count(*) == 0 THEN 0 ELSE 1 END as status FROM progress WHERE _stagekey = 'patch-state' AND _message = 'Stage successful'}
functions::              get_resume_state: 340 -    DEBUG - Resume needed in Stage phase
update_b2b::           install_with_resume:2477 -    DEBUG - Installing version 7.0.1.00200
update_functions::                  readJsonFile: 224 -    ERROR - Can't read JSON file /storage/core/software-update/stage/stageDir.json [Errno 2] No such file or directory: '/storage/core/software-update/stage/stageDir.json'
task_manager::                        update:  80 -    DEBUG - UpdateTask: status=FAILED, progress=0, message={'id': 'com.vmware.appliance.not_staged', 'default_message': 'The update is not staged', 'args': []}
update_b2b::              installPrechecks:2146 -    DEBUG - Exception occurred while checking for discrepancies Update not staged
task_manager::                        update:  80 -    DEBUG - UpdateTask: status=RESUMABLE, progress=0, message={'id': 'com.vmware.appliance.plain_message', 'default_message': '%s', 'args': ['Exception occurred in install precheck phase']}

This is pretty odd: the task reports a "resumable" error, yet it cannot actually resume until a state file is removed (see Remediation). Here are the errors I see:

  • Empty Stage Location: Unsure what this means, given the context. Odds are the upgrade script cannot determine where to stage the RPM (Red Hat Package Manager) packages.
  • Package discrepancy error: It could be related to the above, or it could be a failed checksum. The agent logs nothing else to indicate what's wrong.
  • Can't read JSON file /storage/core/software-update/stage/stageDir.json: This one's more actionable! It looks like this file doesn't exist, which suggests the staging directory was never populated.
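
To pull just these lines out of a noisy log, something like the following could help. It is a rough sketch based only on the log format visible in the excerpts above (a level column such as `ERROR -` and `UpdateTask: status=...` messages); the function names `update_log_problems` and `last_task_status` are made up for illustration.

```shell
# Print ERROR-level lines and failed or resumable task updates from an update log.
# Usage: update_log_problems <logfile>
update_log_problems() {
    # -E enables extended regex so the two patterns can be alternated with |
    grep -E ' ERROR - |UpdateTask: status=(FAILED|RESUMABLE)' "$1"
}

# Print the most recent task status recorded in the log (e.g. SUCCEEDED, FAILED).
# Usage: last_task_status <logfile>
last_task_status() {
    # -o prints only the matching part; cut keeps what follows the "="
    grep -o 'UpdateTask: status=[A-Z_]*' "$1" | tail -n 1 | cut -d= -f2
}
```

On the appliance, both would be pointed at /var/log/vmware/applmgmt/update_microservice.log.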

Easter Egg: statsmoitor probably should be statsmonitor

Remediation

Allow the update to resume

VAMI saves the installation state as a file in /etc/applmgmt/appliance/software_update_state.conf:

{
    "state": "INSTALL_FAILED",
    "version": "7.0.1.00200",
    "latest_query_time": "2020-12-21T00:19:32Z",
    "operation_id": "/storage/core/software-update/install_operation"
}

VAMI will be stuck in a loop until you remove this file as root:

rm -rf /etc/applmgmt/appliance/software_update_state.conf

This only clears the stuck state. It does not necessarily resolve the issue that caused the failure, so more work still needs to be done.
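
A slightly more cautious variant is to record the state and move the file aside rather than delete it outright. The sketch below is not a VMware-provided procedure: `park_update_state` is a hypothetical helper, and it assumes VAMI only looks at the exact file name, so a copy parked elsewhere is ignored.

```shell
# Print the "state" value from a state file, then move the file out of the way.
# Usage: park_update_state <state-file> <backup-path>
park_update_state() {
    if [ ! -f "$1" ]; then
        echo "no state file at $1" >&2
        return 1
    fi
    # Extract the value of the "state" key (assumes the simple JSON layout shown above)
    sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
    # Move instead of delete, so the file can be put back if needed
    mv "$1" "$2"
}
```

For example: `park_update_state /etc/applmgmt/appliance/software_update_state.conf /root/software_update_state.conf.bak` (the backup location is arbitrary).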

Install via ISO

EDIT: The update ISO can be found at: https://my.vmware.com/group/vmware/patch#search

We're going to try a fallback method, attaching the upgrade ISO. The following snippet is from the vSphere UI, modifying vCenter's VM Hardware:

VM Hardware

From there, simply click "Check CD-ROM" and it will immediately appear.

This time, we know what directories to search, so I'm going to watch the logs:

tail -f /var/log/vmware/applmgmt/update_microservice.log | grep -i err

Attempt via Command-line with ISO

VMware documents the following method to update via the command line: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vcenter.upgrade.doc/GUID-8466F019-C57C-4344-9E15-8CFF74A6E4C2.html

Stage Packages

We're going to clear the (empty) workspace and start fresh, auto-accepting the EULAs:

Command> software-packages unstage
Command> software-packages stage --iso --acceptEulas
 [2020-12-20T17:49:54.355] : ISO mounted successfully
 [2020-12-20T17:49:54.355] : UpdateInfo: Using product version 7.0.1.00100 and build 17004997
 [2020-12-20T17:49:55.355] : Target VCSA version = 7.0.1.00200
 [2020-12-20 17:49:55,169] : Running requirements script.....
 [2020-12-20T17:50:12.355] : Evaluating packages to stage...
 [2020-12-20T17:50:12.355] : Verifying staging area
 [2020-12-20T17:50:12.355] : ISO unmounted successfully
 [2020-12-20T17:50:12.355] : Staging process completed successfully
 [2020-12-20T17:50:12.355] : Answers for following questions have to be provided to install phase:
        Question:
                ID: vmdir.password
                Text: Single Sign-On administrator password
                Description: For the first instance of the identity domain, this is the password given to the Administrator account.  Otherwise, this is the password of the Administrator account of the replication partner.
                Allowed values:
                Default value:

 [2020-12-20T17:50:12.355] : Execute software-packages validate to validate your input

Let's take a look at the update:

Command> software-packages list --staged
[2020-12-20T17:52:00.355] :
 category: Bugfix
 kb: https://docs.vmware.com/en/VMware-vSphere/7.0/rn/vsphere-vcenter-server-70u1c-release-notes.html
 leaf_services: ['vmware-pod', 'vsphere-ui', 'wcp']
 vendor: VMware, Inc.
 name: VC-7.0U1c
 tags: []
 version_supported: []
    size in MB: 5107
 releasedate: December 17, 2020
 executeurl: https://my.vmware.com/group/vmware/get-download?downloadGroup=VC70U1C
 version: 7.0.1.00200
 updateversion: True
 allowedSourceVersions: [7.0.0.0,]
 buildnumber: 17327517
 rebootrequired: False
 productname: VMware vCenter Server
 type: Update
 summary: {'id': 'patch.summary', 'translatable': 'In-place upgrade for vCenter appliances.', 'localized': 'In-place upgrade for vCenter appliances.'}
 severity: Critical
 TPP_ISO: False
 thirdPartyInstallation: False
 timeToInstall: 0
 requiredDiskSpace: {'/storage/core': 6.286324043273925, '/storage/seat': 228.3861328125}
 eulaAcceptTime: 2020-12-20 17:50:12 AKST
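
The `allowedSourceVersions: [7.0.0.0,]` field reads like an open-ended range, i.e. any source version at or above 7.0.0.0. That interpretation is an assumption, not something the output states. Under that assumption, a quick comparison against the source version reported during staging could look like the sketch below. `version_ge` is a hypothetical helper, and it relies on `sort -V` (version sort) from GNU coreutils.

```shell
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# Usage: version_ge <version> <minimum>
version_ge() {
    # sort -V orders version strings field by field; if the minimum sorts
    # first (or both are identical), $1 is at least $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example: `version_ge 7.0.1.00100 7.0.0.0 && echo "source version allowed"`.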

Let's run it!

Command> software-packages install --staged
 [2020-12-20T17:53:52.355] : For the first instance of the identity domain, this is the password given to the Administrator account.  Otherwise, this is the password of the Administrator account of the replication partner.
Enter Single Sign-On administrator password:

 [2020-12-20T17:54:02.355] : Validating software update payload
 [2020-12-20T17:54:02.355] : UpdateInfo: Using product version 7.0.1.00100 and build 17004997
 [2020-12-20 17:54:02,095] : Running validate script.....
 [2020-12-20T17:54:09.355] : Validation successful
 [2020-12-20 17:54:09,125] : Copying software packages  [2020-12-20T17:54:09.355] : ISO mounted successfully
166/166
 [2020-12-20T17:57:31.355] : ISO unmounted successfully
 [2020-12-20 17:57:31,238] : Running system-prepare script.....
 [2020-12-20 17:57:40,289] : Running test transaction ....
 [2020-12-20 17:57:54,344] : Running prepatch script.....
 [2020-12-20 18:01:22,731] : Upgrading software packages ....
 [2020-12-20T18:07:39.355] : Setting appliance version to 7.0.1.00200 build 17327517
 [2020-12-20 18:07:39,538] : Running patch script.
....
 [2020-12-20 18:28:42,743] : Starting all services ....
 [2020-12-20T18:28:46.355] : Services started.
 [2020-12-20T18:28:46.355] : Installation process completed successfully
 [2020-12-20T18:28:46.355] : The following warnings have been found:
['\tWarning: \n\t\tsummary: Failed to start all services, will retry operation.\n']
Command> shutdown reboot -r "patch reboot"

Looks like the manual install worked for me - 7.0 U1c

TL;DR

rm -rf /etc/applmgmt/appliance/software_update_state.conf
grep -i error /var/log/vmware/applmgmt/update_microservice.log
exit
software-packages unstage
software-packages stage --iso --acceptEulas
software-packages list --staged
software-packages install --staged
shutdown reboot -r "patch reboot"