HPC Pack 2016 Update 3 fixes

This update fixes a number of known issues in HPC Pack 2016 Update 3.

  • Version:

    5.3.6450

    Date Published:

    6/8/2021

    File Name (File Size):

    KB4537169_x86.exe (4.8 MB)
    UpdateCompactCN.ps1 (16.5 KB)
    Upgrade-HpcApplication.ps1 (19.5 KB)
    HpcApplicationType.sfpkg (75.5 MB)
    HpcCompute_x64.msi (15.3 MB)
    hpcnodeagent.tar.gz (10.7 MB)
    KB4537169_x64.exe (4.9 MB)

    This update fixes the following known issues in HPC Pack 2016 Update 3:

    - Improve SOA performance and reliability.
    - Fix an issue where the cluster utilization chart sometimes shows a utilization rate greater than 100%.
    - Fix an issue where NAT does not work when the head node runs Windows Server 2016 or later.
    - Support UEFI boot for bare metal deployment.
    - Fix an issue where, in some situations, HPCUsers cannot call the service registration REST API.
    - Fix an issue where, in rare conditions, some compute nodes are in the OK state but cannot run tasks.
    - Fix an issue where a parametric sweep job does not finish after all of its tasks have finished.
    - Fix an issue where old jobs are not cleaned up in time.
    - Fix an issue where the “SortByNodes” query parameter does not work in the HPC Web API.
    - Support pagination in the HPC Web API at endpoint “/hpc”: a “StartRow” query parameter can be used when getting a list of jobs/tasks/nodes. The first row has index 0. When this parameter is present in the request:
    a) The server returns the requested range of rows defined by “StartRow” and “RowsPerRead” (default 10).
    b) The total row count is returned in the response header “x-ms-row-count”.
    c) The response headers "x-ms-continuation-QueryId" and "x-ms-continuation-CurrentObjectNumber" are not returned.
    - Support sort order in the HPC Web API at endpoint “/hpc”: an “Asc” query parameter can be used when getting a list of jobs/tasks/nodes. The parameter defaults to “True”, which sorts the list in ascending order. When it is set to “False”, the returned list is in descending order. The sort field is specified by the “SortNodesBy”/“SortJobsBy”/“SortTasks” query parameters as before.
    - Enforce authorization of SignalR Hubs for job and task events.
    - Increase the lifetime of the continuation token/rowset in the web service to 60 minutes.
    - Fix an issue where starting or stopping IaaS nodes may fail unexpectedly.
    - Fix an error in auto grow shrink when the monitoring service is temporarily unavailable.
    - Fix a possible job scheduler hang when the scheduler service restarts.
    - Fix an issue where SOA message level traces cannot be viewed after a system reboot.
    - Fix a Web Portal connection and port leak.
    - Fix an issue where the HpcMonitoring service may stop persisting minute counters.
    - Sort nodes by node group and node name for auto grow, and avoid retrying the same batch of failed nodes.
    - Add the change role button back to the context menu for Azure IaaS nodes.
    - Fix a scheduler SQL insert NULL exception, caused by a changed node GUID, when adding job/task allocation history.
    - Fix an issue where a Linux node with an FQDN host name may not be recognized by the scheduler.
    - Improve the readability of Linux node GPU instance names in metric info.
    - Fix an issue where Azure IaaS nodes are occasionally stuck in the Provisioning state when auto grow shrink is enabled.
    - Fix an issue where a compute node sometimes cannot join the cluster and reports a “node name already exists” error.
    - Fix an issue where Cluster Manager cannot view reporting charts after the HPCReporting database is moved to another SQL Server instance.
    - Fix some issues on French operating systems.
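
The “StartRow” pagination and “Asc” ordering described in the list above could be driven from a client roughly as follows. This is a minimal Python sketch under stated assumptions, not part of HPC Pack: the helper names `page_url` and `iter_rows`, the caller-supplied `fetch` function, and any URL passed in are hypothetical. Only the query parameter names and the “x-ms-row-count” header are taken from the notes above.

```python
from urllib.parse import urlencode


def page_url(base_url, start_row, rows_per_read=10, asc=True):
    """Build a list URL carrying the paging and ordering query parameters."""
    params = {
        "StartRow": start_row,             # the first row has index 0
        "RowsPerRead": rows_per_read,      # the notes give 10 as the default
        "Asc": "True" if asc else "False",
    }
    return f"{base_url}?{urlencode(params)}"


def iter_rows(fetch, base_url, rows_per_read=10, asc=True):
    """Yield every row, one page at a time.

    `fetch` is a hypothetical caller-supplied function that performs the
    authenticated GET and returns (rows, response_headers).
    """
    start = 0
    while True:
        rows, headers = fetch(page_url(base_url, start, rows_per_read, asc))
        yield from rows
        total = int(headers["x-ms-row-count"])  # total row count from the server
        start += rows_per_read
        if not rows or start >= total:
            break
```

Keeping the HTTP call behind `fetch` leaves authentication and certificate handling to the caller. HTTP header names are case-insensitive, so a real `fetch` should return a case-insensitive mapping or normalize the header name before the lookup.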
  • Supported Operating Systems

    Windows Server 2012, Windows Server 2012 R2, Windows Server 2016, Windows Server 2019, Windows 7, Windows 8, Windows 8.1, Windows 10, SUSE, Red Hat, Ubuntu, Linux

    Prerequisite: HPC Pack 2016 Update 3 (build 5.3.6435.0) must be installed.
  • Installation Instructions
    This update must be run on all head nodes and broker nodes. It is optional for workstation nodes, compute nodes (Windows and Linux), Azure IaaS nodes and clients, depending on which of the fixes listed below you need:

    Workstation and unmanaged server nodes, Windows compute nodes, Azure IaaS nodes:
    - Fix an issue where, in some situations, HPCUsers cannot call the service registration REST API.
    - Fix some issues on French operating systems.
    Linux compute nodes:
    - Fix an issue where a Linux node with an FQDN host name may not be recognized by the scheduler.
    - Improve the readability of Linux node GPU instance names in metric info.
    Clients:
    - Add the change role button back to the context menu for Azure IaaS nodes.

    Before applying the update, confirm that HPC Pack 2016 Update 3 is installed: the version number shown in HPC Cluster Manager (Help->About) should be 5.3.6435.0. Take all nodes offline and make sure that all active jobs have finished or been canceled. If the cluster has Azure PaaS nodes, make sure they are stopped before applying this patch. After all active operations on the cluster have stopped, back up the head node (or head nodes) and all HPC databases using a backup method of your choice.
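
The version check in the paragraph above can be expressed as a small pre-flight helper. The following Python sketch is illustrative only: the function names and the three state labels are hypothetical, and the version string is assumed to have been read by hand from Help->About. The two build numbers are the ones stated on this page.

```python
EXPECTED_BEFORE = (5, 3, 6435, 0)  # HPC Pack 2016 Update 3, before this update
EXPECTED_AFTER = (5, 3, 6450, 0)   # version reported after this update


def parse_version(text):
    """Turn '5.3.6435.0' into a tuple of ints, padding to four parts."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts)


def patch_state(version_text):
    """Classify a reported version relative to this update."""
    version = parse_version(version_text)
    if version == EXPECTED_BEFORE:
        return "ready"       # Update 3 installed, update not yet applied
    if version == EXPECTED_AFTER:
        return "patched"     # update already applied
    return "unexpected"      # some other build: do not apply this update
```

Padding short strings to four parts lets the helper accept both "5.3.6450" and "5.3.6450.0", which are the two forms this page uses for the same build.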

    To start the download, click the Download button next to the appropriate file (KB4537169_x64.exe for the 64-bit version, KB4537169_x86.exe for the 32-bit version), and then follow the steps below that match your deployment:

    Applying the update on a single head node
    1. Click Save to copy the download to your computer.
    2. Close any open HPC Cluster Manager or HPC Job Manager windows.
      Note: If left open, instances of HPC Cluster Manager or HPC Job Manager may quit unexpectedly or show an error message during the update. This does not affect installation of the update.
    3. Run the download on the head node using an administrator account, and then reboot the head node.
    4. Verify that the version number (start HPC Cluster Manager, click Help->About) is now 5.3.6450.0.
    5. To revert the patch, go to Control Panel --> Programs and Features --> View installed updates and uninstall the following updates in order (do not reboot in between): KB4537169 under “Microsoft ® HPC Pack 2016 Web Components”, “Microsoft ® HPC Pack 2016 Client Components” and then “Microsoft ® HPC Pack 2016 Server Components”. Then reboot.

    Applying the update on three head nodes
    Note: If you have three or more head nodes, you also need to download HpcApplicationType.sfpkg and Upgrade-HpcApplication.ps1 and put them together in one folder on one of the head nodes, for example c:\HPCPatch. Then run through the steps below:
    1. To upgrade the Service Fabric application, open an elevated PowerShell command prompt window and run:
           Upgrade-HpcApplication.ps1 -ApplicationPackagePath C:\HPCPatch\HpcApplicationType.sfpkg
    2. After the Service Fabric application upgrade is done, run KB4537169_x64.exe on the head nodes one by one.
    3. Reboot the head nodes one by one. Check https://localhost:10400 to make sure the previously rebooted head node is back in a healthy state before you reboot the next one, so that your services stay available.
    4. Verify that the version number (start HPC Cluster Manager, click Help->About) is now 5.3.6450.0 on all head nodes.
    5. To revert the patch, on every head node go to Control Panel --> Programs and Features --> View installed updates and uninstall the following updates in order (do not reboot in between): KB4537169 under “Microsoft ® HPC Pack 2016 Web Components”, “Microsoft ® HPC Pack 2016 Client Components” and then “Microsoft ® HPC Pack 2016 Server Components”. Then downgrade the Service Fabric application with the commands below:
           Connect-ServiceFabricCluster
           $hpcApplication = Get-ServiceFabricApplication -ApplicationName fabric:/HpcApplication
           $appParameters = @{}
           foreach($appParam in $hpcApplication.ApplicationParameters)
           {
               $appParameters[$appParam.Name] = $appParam.Value
           }
          
           Start-ServiceFabricApplicationUpgrade -ApplicationName fabric:/HpcApplication -ApplicationTypeVersion 1.3.0 -ApplicationParameter $appParameters -HealthCheckStableDurationSec 60 -UpgradeDomainTimeoutSec 1800 -UpgradeTimeout 3000 -FailureAction Rollback -Monitored | Out-Null
    You can run "Get-ServiceFabricApplicationUpgrade -ApplicationName fabric:/HpcApplication" to track the upgrade status. If it is stuck at "PreUpgradeSafetyCheck" because a service failed to cancel, you can try to manually kill the corresponding process on the affected node. When the downgrade has finished, reboot the head nodes one by one, waiting until the previously rebooted node is fully healthy in the Service Fabric cluster before you reboot the next one.

    You can open "Service Fabric Explorer" at "https://localhost:10400" to monitor the progress of the Service Fabric cluster upgrade. (You need to import the pfx certificate used during cluster setup into CurrentUser\My to avoid an HTTP 403 error.) In rare cases a service may get stuck on a head node during patching. If that happens, you can manually kill the process on that node.
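
The one-by-one reboot described in this section follows a simple pattern: reboot one node, wait until it is healthy again, then move on to the next. A minimal Python sketch of that loop is shown here. It is an illustration only: `rolling_reboot`, `reboot` and `is_healthy` are hypothetical names, and the two callables must be supplied by the administrator (for example, a health check based on what Service Fabric Explorer reports). The polling interval and timeout defaults are arbitrary assumptions.

```python
import time


def rolling_reboot(nodes, reboot, is_healthy, poll_seconds=30, timeout_seconds=1800):
    """Reboot nodes one at a time, waiting for each to become healthy again.

    `reboot(node)` and `is_healthy(node)` are caller-supplied (hypothetical)
    callables. Returns the list of nodes that were rebooted successfully.
    """
    done = []
    for node in nodes:
        reboot(node)
        deadline = time.monotonic() + timeout_seconds
        while not is_healthy(node):
            if time.monotonic() >= deadline:
                # Stop here so the remaining head nodes keep the services available.
                raise TimeoutError(f"{node} did not become healthy; rebooted so far: {done}")
            time.sleep(poll_seconds)
        done.append(node)
    return done
```

A real `is_healthy` should only report success after the node has actually gone down and come back. A check that still sees the pre-reboot state would let the loop advance too early.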

    Applying the update on Windows nodes
    1. Log on interactively, or use clusrun, to deploy the fix to the compute nodes, broker nodes, unmanaged server nodes and workstation nodes.
    To use clusrun to apply the update on the compute nodes, broker nodes, unmanaged server nodes, Azure IaaS nodes and workstation nodes:
           a. Copy the appropriate version of the update to a shared folder such as \\<headnode>\HPCUpdates
           b. Open an elevated command prompt window and type the clusrun command for each node group, using the update file that matches the operating system of the nodes, e.g.:
                 clusrun /nodegroup:ComputeNodes \\<headnode>\HPCUpdates\KB4537169_x64.exe -unattend -SystemReboot
                 clusrun /nodegroup:BrokerNodes \\<headnode>\HPCUpdates\KB4537169_x64.exe -unattend -SystemReboot
    Note: HPC Pack updates, other than Service Packs, are not applied automatically when you add a new node to the cluster or re-image an existing node. After adding or re-imaging a node, you must either apply the update manually or with clusrun, or modify your node template to include a line that installs the appropriate updates from a file share on your head node.
    Note: If the cluster administrator does not have administrative privileges on the workstation nodes and unmanaged server nodes, the clusrun utility may not be able to apply the update. In that case the update should be applied by the administrator of those workstations and unmanaged servers.
    2. To revert the patch, go to Control Panel --> Programs and Features --> View installed updates and uninstall the following updates in order (do not reboot in between): KB4537169 under “Microsoft ® HPC Pack 2016 Web Components”, “Microsoft ® HPC Pack 2016 Client Components” and then “Microsoft ® HPC Pack 2016 Server Components”. Then reboot.
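
When several node groups need the update, the clusrun command lines shown in this section can be generated rather than typed by hand. The Python sketch below only assembles the argument lists. The function names are hypothetical; the clusrun switches and the default installer name are copied from the examples above, and the share path is whatever folder the update was copied to.

```python
def clusrun_update_command(node_group, share, installer="KB4537169_x64.exe"):
    """Return the clusrun argument list for an unattended install on one node group."""
    return [
        "clusrun",
        f"/nodegroup:{node_group}",
        f"{share}\\{installer}",   # UNC path to the update on the shared folder
        "-unattend",
        "-SystemReboot",
    ]


def commands_for(node_groups, share):
    """Build one clusrun command per node group, in the order given."""
    return [clusrun_update_command(group, share) for group in node_groups]
```

Each list can then be passed to `subprocess.run(cmd, check=True)` from an elevated prompt on a machine where clusrun is available, which avoids quoting problems with the UNC path.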

    Applying the update on Azure IaaS nodes
    From HPC Pack 2016 Update 3 on, Azure IaaS compute nodes are deployed by default with “Microsoft ® HPC Pack 2016 ComputeNode Components” (you can check this under Control Panel->Programs on the compute node). For these Azure IaaS nodes, follow the steps below to apply the patch:
    1. Download “HpcCompute_x64.msi” and "UpdateCompactCN.ps1" and copy them to the remote install share on the head node, \\<headnode>\REMINST, or to any file share that can be accessed by the compute nodes.
    2. Run the following command:
           Clusrun /nodegroup:AzureIaaSNodes PowerShell.exe -ExecutionPolicy ByPass -Command "\\<headnode>\REMINST\UpdateCompactCN.ps1 -NewPackage \\<headnode>\REMINST\HpcCompute_x64.msi -RunAsScheduledTask"
    Note: If your Azure IaaS compute nodes were upgraded from a previous version (HPC Pack 2016 Update 1/2) and do not have “Microsoft ® HPC Pack 2016 ComputeNode Components” installed, follow the section Applying the update on Windows nodes instead.

    Applying the update on Linux nodes
    1. Download “hpcnodeagent.tar.gz” and copy it to the remote install share of the HPC cluster (by default \\<HN>\REMINST\LinuxNodeAgent), overwriting the existing file. Back up the existing file first so that you can downgrade to the original version.
    2. Mount the share on the Linux nodes (this example uses /mnt/share as the mount point on all Linux nodes):
           Clusrun /env:CCP_MAP_ADMIN_USER=0 /user:system /NodeGroup:LinuxNodes mkdir /mnt/share ^& mount -t cifs //<yourheadnode>/REMINST/LinuxNodeAgent /mnt/share -o vers=2.1,domain=<domainname>,username=<username>,password='<password>',dir_mode=0777,file_mode=0777
    3. Use clusrun as root on all Linux nodes to update the package:
           Clusrun /env:CCP_MAP_ADMIN_USER=0 /user:system /NodeGroup:LinuxNodes /workdir:/mnt/share echo "python /mnt/share/setup.py -update" ^| at now + 1 minute
    4. Wait for clusrun to complete; the actual update starts about a minute later on each Linux node. After the update completes, you can check the Linux agent version with the command below (the version should now be 2.3.7.0):
           Clusrun /env:CCP_MAP_ADMIN_USER=0 /user:system /NodeGroup:LinuxNodes /workdir:/opt/hpcnodemanager ./nodemanager -v
    5. To revert to the original Linux agent version, restore the old "hpcnodeagent.tar.gz" on the share and repeat the steps above.

    Applying the update on Linux nodes on Azure
    If your Linux nodes are on Azure and were deployed through our ARM template, they will update their Linux Agent Extension automatically.

    Applying the update on Client
    1. To update computers that run HPC Pack client applications:
           a. Stop any HPC client applications, including HPC Job Manager and HPC Cluster Manager.
           b. Run the update executable.
           c. Reboot the client computer.
    2. To revert the patch, go to Control Panel --> Programs and Features --> View installed updates, uninstall KB4537169 under “Microsoft ® HPC Pack 2016 Client Components”, and then reboot.
