
    HPC Pack 2008 R2 SP4 Fix for Windows Azure compute node issues

    This update fixes a set of issues related to Windows Azure compute nodes
    • Note: There are multiple files available for this download. Once you click the "Download" button, you will be prompted to select the files you need.
      Version: 3.04.4226
      File Name: KB2816845-x64.exe, KB2816845-x86.exe
      Date Published: 4/10/2013
      File Size: 34.7 MB, 3.1 MB

      KB Articles: KB2816845

        This update fixes the following issues that may occur when you manage a Microsoft Windows HPC Pack 2008 R2 cluster that contains Windows Azure compute nodes. All HPC Pack 2008 R2 clusters that use Windows Azure nodes should apply this update, even if these issues have not yet been encountered.

        Job Scheduling Issues
        • Job state is incorrect or appears to be “stuck”. This hotfix provides greater tolerance for network latency and for failures in communication between HPC services and Windows Azure.
          • Job remains in running state with tasks completed or failed
          • Jobs fail with "parent job cannot be validated" exception
          • Job completes successfully but marked failed after an HA failover
          • Jobs remain in running state after database access errors
          • Jobs remain in draining state and prevent taking compute nodes offline
          • Job in running state cannot be cancelled
          • Cannot cancel jobs when the compute node is running a CPU intensive job
          • Tasks fail on Windows Azure compute nodes with error message: Exception 'Safe handle has been closed' when creating the task
          • Clusrun jobs fail on compute nodes in Windows Azure
        • Issues with exceptions and memory leaks have been addressed.
          • Crash in job scheduler during a large deployment to Windows Azure
          • Job scheduler memory leak after cancelling multiple jobs running in Windows Azure
          • Database timeout exceptions with large deployments to Windows Azure
          • Exception occurs when viewing job state from the command line
          • Exception that the job identifier is invalid when creating a task
          • Error message "Object reference not set to an instance of an object." for a failed job
          • Job fails validation with message “Node AZURECN-xxxx specified in required/requested nodes could not be found. Check the required/requested nodes to ensure the names are correct and try again”

        Cluster Management Issues
        • Windows Azure compute nodes fail to deploy or their state is incorrect. This hotfix provides greater tolerance for network latency and for failures in communication between HPC services and Windows Azure.
          • Compute nodes in Windows Azure appear unreachable but are available in the portal
          • Windows Azure compute nodes remain in online state and cannot be deleted or stopped if the head node in a high availability cluster fails
          • List of Windows Azure compute nodes becomes out of sync between the management and job scheduler service for multiple deployments if one deployment fails
          • Compute nodes in Windows Azure repeatedly change between reachable and unreachable state because a wrong deployment ID is reported if there was a failure creating the deployment and the action is retried
          • There is a long delay between trying to stop a Windows Azure compute node and the operation failing
          • Deployed Windows Azure compute node in an offline state does not come online when an availability policy is enabled after the start time has passed
          • Cannot add Azure compute nodes after high availability failover during a large deployment
          • Configuration package not applied after Windows Azure compute node is ready
          • Failure or timeout uploading proxy certificates to Windows Azure
        • Issues with exceptions and memory leaks have been addressed.
          • Invalid XML in Windows Azure configuration file when startup script parameter contains special characters
          • PowerShell cmdlets may leak memory or hang with certain operations
          • Memory leak in hpcmanagement.exe with a large number of node templates
          • Crash in admin console when reconnecting to an HA head node with the virtual cluster name

        SOA Runtime Issues
        • When Windows Azure nodes start, they fail to synchronize the SOA service package or application packages from Windows Azure Storage. This issue is more likely to occur in deployments with a large number of compute node role instances or a large deployment package to upload. This hotfix makes the synchronization more resilient to failures when accessing Windows Azure Storage.
        • When a session’s broker is running on a clustered broker node, SOA message-level preemption is enabled, and an SOA task is preempted, the task does not exit as expected. When the task cancel grace period expires, the task is killed by the scheduler. This hotfix resolves the problem by making the task exit gracefully.
        • When a session’s broker is running on a clustered broker node, the auto shrink feature of SOA doesn’t work because the broker fails to make a task exit. This hotfix resolves the problem by making the task exit gracefully.
        • The BrokerResponseEnumerator.MoveNext() method and the BrokerResponse.Result property return the error message “Heartbeat lost for broker node” when clients using the SOA session API attempt to retrieve more than 632 responses.
    • Supported Operating System

      Windows 7, Windows HPC Server 2008 R2, Windows Server 2008 R2, Windows Vista, Windows XP

        An HPC Pack 2008 R2 / HPC Server 2008 R2 cluster
        HPC Pack 2008 R2 Service Pack 4 must be installed
      • This update needs to be run on all Compute Nodes, Broker Nodes, Head Nodes, Workstation/Unmanaged Server Nodes, and computers running the HPC Pack Client Utilities.

        Perform the following actions before installing the update
        1. Take all compute, workstation and unmanaged server nodes offline and wait for all current jobs to drain
        2. Change Node Template availability policy setting to manual
        3. Stop all existing Windows Azure compute nodes
        4. Close any HPC Cluster Manager and HPC Job Manager applications that are connected to the cluster head node
        5. Back up all HPC databases once all active operations on the cluster have stopped

        Applying the update
        To start the download, click the Download button next to the appropriate file (KB2816845-x64.exe for the 64-bit version, KB2816845-x86.exe for the 32-bit version) and then:
        1. Click Save to copy the download to your computer.
        2. Close any open HPC Cluster Manager or HPC Job Manager windows.
          Note: Any open HPC Cluster Manager or HPC Job Manager windows may unexpectedly quit or show an error message during the update process if left open. This does not affect installation of the update.
        3. Run the download on the headnode using an administrator account, and reboot the headnode.
          Note: If you have HA headnodes, run the fix on the Active node, move the node to Passive, and after failover occurs run the fix on the new Active node. Do this for all head nodes in the cluster.
        4. Log on interactively, or use clusrun to deploy the fix to the compute/broker/workstation nodes.
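
        As a rough illustration of choosing between the two files named above, the following Python sketch maps an architecture string to the matching file name. The function name pick_installer and the set of accepted architecture strings are assumptions made for this example; they are not part of the update or of HPC Pack.

```python
# File names are taken from the download listing; everything else is illustrative.
INSTALLERS = {"x64": "KB2816845-x64.exe", "x86": "KB2816845-x86.exe"}


def pick_installer(machine: str) -> str:
    """Return the update file name for an architecture string (hypothetical helper)."""
    m = machine.strip().lower()
    if m in ("amd64", "x86_64", "x64"):
        return INSTALLERS["x64"]  # 64-bit version
    if m in ("x86", "i386", "i686"):
        return INSTALLERS["x86"]  # 32-bit version
    raise ValueError(f"unsupported architecture: {machine!r}")
```

        On the target computer, the architecture string could come from Python's platform.machine(), for example pick_installer(platform.machine()).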

        To use clusrun to deploy an HPC Pack 2008 R2 update on clusters that are running Service Pack 2, or higher:
        1. Copy the appropriate version of the update to a shared folder such as \\headnodename\HPCUpdates.
        2. Open an elevated command prompt window and type the appropriate clusrun command for the operating system of the patch, e.g.:
          clusrun /nodegroup:ComputeNodes \\headnodename\HPCUpdates\KB2816845-x64.exe -unattend -SystemReboot
          clusrun /nodegroup:BrokerNodes \\headnodename\HPCUpdates\KB2816845-x64.exe -unattend -SystemReboot
          Note: HPC Pack updates, other than Service Packs, do not get automatically applied when you add a new node to the cluster or re-image an existing node. You must either manually/clusrun apply the update after adding/reimaging a node or modify your node template to include a line to install the appropriate updates from a file share on your head node.
          Note: If the cluster administrator doesn’t have administrative privileges on workstation nodes and unmanaged server nodes, the clusrun utility may not be able to apply the update. In these cases the update should be performed by the administrator of the workstations and unmanaged servers.
        3. To update workstation nodes and unmanaged server nodes you may need to reboot.
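
        When several node groups need the update, the command lines can be generated rather than typed by hand. The Python sketch below only builds the strings, in the same shape as the two clusrun examples above; it does not run them. The helper name clusrun_command is an assumption for this example, and the share path is the placeholder from step 1.

```python
def clusrun_command(node_group: str, share: str, installer: str) -> str:
    """Build one clusrun line using the switches shown in the examples above."""
    path = share.rstrip("\\") + "\\" + installer
    return f"clusrun /nodegroup:{node_group} {path} -unattend -SystemReboot"


share = r"\\headnodename\HPCUpdates"  # placeholder share from step 1
for group in ("ComputeNodes", "BrokerNodes"):
    print(clusrun_command(group, share, "KB2816845-x64.exe"))
```

        The printed lines can then be pasted into an elevated command prompt on the head node.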

        To update computers that run HPC Pack client applications, perform the following actions:
        1. Stop any HPC client applications including HPC Job Management console and HPC Cluster Management console
        2. Run the update executable
        3. Reboot your client computer


        Uninstalling the update
        To uninstall the update
        1. Take all compute nodes, workstation nodes, and unmanaged server nodes offline and wait for all current jobs to drain
        2. Change Node Template availability policy setting to manual
        3. Back up all HPC databases
        4. Stop existing Windows Azure compute nodes. You need to redeploy Azure nodes after uninstalling the update, but you do not need to delete them if you want to redeploy them in the future.
        5. You can uninstall the update in any order across all types of nodes.

        Some updates may apply to more than one piece of HPC Pack software. In order to uninstall those updates, remove them in the following order:
        1. Update for HPC Pack 2008 R2 Services for Excel 2010
        2. Update for HPC Pack 2008 R2 Client Components
        3. Update for HPC Pack 2008 R2 Server Components
        4. Update for HPC Pack 2008 R2 MS-MPI Redistributable Pack
        Note: If you don't follow this order, you might not be able to uninstall the update, because some updates have dependencies across components.
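
        The removal order above can be applied mechanically to whatever subset of these updates is installed on a given computer. The Python sketch below is one way to do that; the helper name removal_sequence is an assumption for this example, and how the list of installed updates is obtained is left open.

```python
# Removal order exactly as listed above.
UNINSTALL_ORDER = [
    "Update for HPC Pack 2008 R2 Services for Excel 2010",
    "Update for HPC Pack 2008 R2 Client Components",
    "Update for HPC Pack 2008 R2 Server Components",
    "Update for HPC Pack 2008 R2 MS-MPI Redistributable Pack",
]


def removal_sequence(installed):
    """Return the installed updates sorted into the documented removal order."""
    rank = {name: i for i, name in enumerate(UNINSTALL_ORDER)}
    unknown = [name for name in installed if name not in rank]
    if unknown:
        # Refuse to guess a position for names the list does not cover.
        raise ValueError(f"not in the documented order: {unknown}")
    return sorted(installed, key=rank.__getitem__)
```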

        Note: If you have HA headnodes, uninstall the updates on a passive head node, move the node to Active, and then repeat uninstallation on the new passive head node. Do this for all head nodes in the cluster.
