
Author: Steve Sumichrast

UCS FI bootflash clean but with errors

Last month we began the process of upgrading all our UCS domains to the newer 4.1 code release trains to enable new functionality/hardware and resolve some minor bugs. After we completed our first domain, we were treated to a new fault code on each Fabric Interconnect approximately 45 minutes after its individual upgrade completed. The fault thrown was “Partition bootflash on fabric interconnect X is clean but with errors.” As this domain happened to be the very first UCS domain we ever had, we assumed it might be an issue with the actual NVRAM in the FI. We worked the case with TAC and determined that this was just enhanced file system checking in the 4.1 code train and that the fix is to run an e2fsck against the bootflash. In previous versions of UCS this would require the debug utility and manually running some commands; however, in 4.1(2a) and 4.0(4k) Cisco added the ability to run an e2fsck from the UCS CLI on the Fabric Interconnect. We figured this would be a one-off case, isolated to this domain, and didn’t think much of it.

However, we just completed our second domain upgrade last night and, lo and behold, one of the two FIs raised the same fault. Now that we’ve encountered this on two of our domains (the second of which is one of our newer domains, and only one of its FIs raised the fault), I’m documenting this for future reference!

Cisco has a public bug report (CCO account required, though) documenting the release of the enhancement: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvq17291

The process to run an e2fsck is as follows:

  1. Log in to the UCS CLI.
  2. Connect to the local-mgmt shell for the FI that raised the fault.
    connect local-mgmt <a|b>
  3. Issue the reboot command with the e2fsck argument. This will trigger the FI to reload and run an e2fsck at bootup (see the example session below).
    reboot e2fsck
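
Putting those steps together, a session looks roughly like the following. This is only a sketch, assuming fabric interconnect B raised the fault; the show cluster state step is just a sanity check that the cluster reports HA READY before you reboot anything.

    UCS-A# connect local-mgmt b
    UCS-B(local-mgmt)# show cluster state
    UCS-B(local-mgmt)# reboot e2fsck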

Note: This will obviously cause one of your Fabric Interconnects to be unavailable while it reloads, so ensure you have a maintenance window and have verified your equipment is properly connected and set up for failover to survive the reboot.

Note 2: It may still take some time for the fault to clear after the FI reboots from its e2fsck. This is normal. If it hasn’t cleared within a few days, open a TAC case as you may have faulty bootflash.
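
Once the FI is back up and has rejoined the cluster, a simple way to keep an eye on this is to list the active faults from the UCS Manager CLI and confirm the bootflash entry eventually disappears:

    UCS-A# show fault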


UCS Reserved VLANs

Anyone who has spent any time with Cisco equipment should come to expect that there are a number of VLANs Cisco reserves for internal use. Cisco UCS is no exception. However, in UCS land there are a few curveballs you need to be aware of, depending on version numbers and hardware types.

As I just completed internal documentation of these VLAN IDs for our department, and since they are documented in a few different places on Cisco’s web site, I felt it made sense to put them here for easy consumption later. In addition to the official Cisco documentation, I’ve included my recommendations for reserving additional VLANs to make your life easier.

So, without further blabbering, here’s my current list of reserved VLANs on Cisco UCS, what they’re used for, and whether they can be changed.

Official Cisco UCS VLAN Reservations

VLAN ID   | Description                                                                                                          | Modifiable
3915-4042 | Only on Cisco UCS 6454 Fabric Interconnects; used for internal system communication (see Cisco UCS Configuration Limits) | Yes
4030-4047 | Used for internal system communication (see Cisco UCS Manager Network Management Guide)                             | No
4048      | Cisco UCS 2.0 and later; default VSAN’s FCoE VLAN ID (see Cisco UCS Manager Network Management Guide)               | Yes
4049      | Cisco UCS 2.0 and later; default FCoE Storage Port Native VLAN ID (see Cisco UCS Manager Network Management Guide)   | Yes
4093      | Cisco UCS 4.0.1(c) and earlier; used for internal system communication (see Cisco UCS Manager Network Management Guide) | No
4094-4095 | Used for internal system communication (see Cisco UCS Manager Network Management Guide)                             | No

My Bonus Recommended Reservations

Cisco UCS domains that use Fibre Channel, whether by attaching to an existing Fibre Channel SAN fabric or by connecting an array directly to the Fabric Interconnects, will also require VLAN IDs for the VSANs within UCS. Since I always design my storage fabrics as an A and a B fabric, I also create a separate VSAN ID for each (typically 11 and 12, respectively). Therefore, in my UCS domains I create the VSANs below and assign each a unique VLAN ID for its FCoE traffic to run in (a CLI sketch follows the table).

VLAN ID | VSAN ID | Description
3211    | 11      | Fibre Channel Fabric A
3212    | 12      | Fibre Channel Fabric B
3213    | 13      | Direct-Attached Array FI-A
3214    | 14      | Direct-Attached Array FI-B
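
For completeness, here is a rough sketch of creating one of these VSANs and its FCoE VLAN from the UCS Manager CLI. This assumes fabric-scoped VSANs and the create vsan <name> <vsan-id> <fcoe-vlan-id> syntax; adjust the names and IDs for your own environment.

    UCS-A# scope fc-uplink
    UCS-A /fc-uplink # scope fabric a
    UCS-A /fc-uplink/fabric # create vsan Fabric-A 11 3211
    UCS-A /fc-uplink/fabric/vsan # commit-buffer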

FlashArray Host Personalities for ESXi

When Pure Storage released Purity 5.1 for the FlashArray, they introduced a new host feature called Host Personality. Host Personalities offer a way to customize how the FlashArray presents storage to the physical host. Today Pure offers five Host Personalities:

  • ESXi
  • Hitachi VSP
  • HP-UX
  • Oracle VM Server
  • VMS

This post focuses specifically on what the ESXi host personality provides and how to set it using PowerShell, VMware PowerCLI, and the Pure Storage PowerShell SDK.

Benefits of the ESXi Personality

Before going into the specific enhancements, I think it is important to point out that you should set the personality regardless of whether you need these enhancements today. As Pure Storage continues to evolve Purity, other enhancements will likely be added to the personality, and by having it set now you can avoid a maintenance window later.

The ESXi personality on the FlashArray provides two major enhancements:

  1. Changes the LUN addressing from flat LUN IDs to peripheral LUN IDs. This allows ESXi to see LUNs presented with an ID of 256 or higher. With the host personality set to ESXi on the FlashArray, hosts can use the maximum number of LUN IDs documented in VMware’s Configuration Maximums. This applies to both ActiveCluster-enabled environments and normal environments (a quick PowerCLI check is sketched after this list).
  2. Issues a Permanent Device Loss (PDL) whenever an ActiveCluster pod marks itself offline due to the loss of a mediator. This is a critical enhancement for vSphere clusters. Without this setting, when a pod goes offline the hosts are not notified and the affected VMs are never restarted. With the ESXi personality enabled, the FlashArray properly sends a PDL sense code to the ESXi hosts so they know the path is gone, is not coming back, and that they need to execute HA failovers.
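
As a quick sanity check on the first point, you can look at the runtime names PowerCLI reports for the Pure devices on a host; the trailing L value in the runtime name is the LUN ID, so IDs above 255 showing up here means peripheral addressing is in effect. A minimal sketch (the host name is hypothetical):

# List Pure Storage LUNs on a host; RuntimeName ends in the LUN ID (e.g. vmhba1:C0:T0:L256)
Get-VMHost -Name esxi01.contoso.com |
    Get-ScsiLun -LunType disk |
    Where-Object { $_.Vendor -eq "PURE" } |
    Select-Object CanonicalName, RuntimeName, CapacityGB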

Configuring ESXi Host Personality

The Pure Storage documentation covers how to set the Host Personality via the GUI, so I’m not going to cover that here. Since I like to automate whenever possible, I assembled a PowerShell script to handle this process.

Please note: It is critical that the hosts are not servicing VMs when you set this. When the personality is set, the ESXi host will temporarily lose connectivity to the array. The script provided below places each host into Maintenance Mode before changing the personality to avoid impacting running workloads.

First you will need the Pure Storage PowerShell SDK and VMware PowerCLI modules. To install them, execute the following from a PowerShell prompt:

Find-Module PureStoragePowerShellSDK,VMware.PowerCLI | Install-Module

Here’s the full script. I’ll explain what’s happening below.

#Requires -Modules PureStoragePowerShellSDK,VMware.VimAutomation.Core
<#
    .SYNOPSIS
    Configure the ESXi Host Personality on a FlashArray

    .DESCRIPTION
    This script will place a list of ESXi hosts into Maintenance Mode and then
    set the appropriate personality flags on one or more FlashArrays.

    .PARAMETER FlashArrays
    One or more FlashArray IPs or host names to check for ESXi host objects.

    .PARAMETER vCenter
    FQDN of the vCenter server used to obtain ESXi host information from.

    .PARAMETER ESXiHosts
    List of ESXi hosts to place into maintenance mode and set the Personality
    for.  If omitted, all hosts found in the connected vCenter will be operated
    on.
    Note: This must match the name of the host in vCenter exactly.

    .EXAMPLE
    .\Set-ESXiPersonality.ps1 -FlashArrays FlashArray1.contoso.com,FlashArray2.contoso.com -vCenter test-vcenter.contoso.com

    .NOTES
    Author: Steve Sumichrast <ssumichrast@sumistev.com>
    
    Version: 1.1 
    
    Changelog:
    1.1 - Added support for multiple FlashArrays and to target to specific VM
    hosts. 
    1.0 - First version of script
#>

[cmdletbinding()]
Param(
    [Parameter(Mandatory = $true)]
    [string[]]$FlashArrays,
    [Parameter(Mandatory = $true)]
    [string]$vCenter,
    [string[]]$ESXiHosts
)

# Attempt to connect to the FlashArray(s)
try {
    $Credentials = Get-Credential -Message "Login to FlashArray"
    $pfa = $FlashArrays | Foreach-Object -Process {
        New-PfaArray -EndPoint $_ -Credentials $Credentials -IgnoreCertificateError
    }
    Write-Host "Successfully connected to the following FlashArrays:"
    $pfa | ForEach-Object -Process {
        Write-Host " $($_.EndPoint)"
    }
}
catch {
    throw "Failed to log in to FlashArrays."
}


# Attempt to connect to vCenter
try {
    # -ErrorAction Stop ensures a connection failure is caught by the catch block below.
    $vCenterObj = Connect-VIServer -Server $vCenter -ErrorAction Stop
    Write-Host "Successfully connected to vCenter Server $($vCenter)"
}
catch {
    throw "Failed to connect to vCenter Server $($vCenter)."
}

# Obtain VMHosts
try {
    if (!$ESXiHosts) {
        $ESXiHosts = Get-VMHost -Server $vCenter
    }
    $VMhosts = Get-VMHost -Name $ESXiHosts -ErrorAction Stop
    
    Write-Host "Found the following ESXi hosts to operate on:"
    $VMhosts | Foreach-Object -Process {
        Write-Host " $($_.name)"
    }
}
catch {
    throw "Failed to obtain ESXi hosts."
}

# Set the Host Personality on the FlashArrays
try {
    $VMhosts | Foreach-Object -Process {
        Write-host "Configuring $($_.name)."
        Write-host " Placing $($_.name) into Maintenance Mode"
        $_ | Set-VMHost -State Maintenance

        Write-Host " Obtaining WWPN for host"
        # Obtain a WWPN from the HBA to match it up to Purity. Using HBA
        # identifiers since they have to match exactly between ESXi and Purity,
        # unlike the Name field.
        
        #Note that the 0:X is to convert the output to Hex.
        $wwpn = $_ | Get-VMHostHBA -Type FibreChannel | Select-Object -First 1 @{N = "WWPN"; E = {"{0:X}" -f $_.PortWorldWideName}} | Select-Object -ExpandProperty WWPN

        # Set the Purity personality for each FlashArray we're connected to.
        $pfa | Foreach-Object -Process {
            Write-Host " Setting personality for $($PFAHost.Name) on $($_.Endpoint)"
            $PFAHost = Get-PfaHosts -Array $_ -Filter "wwn = '$($wwpn)'"
            Set-PfaHostPersonality -Array $_ -Name $PFAHost.Name -Personality esxi
        }

        Write-Host " Exiting Maintenance Mode for $($_.name)"
        $_ | Set-VMhost -State Connected
        Write-Host " $($_.Name) configured successfully."
        Write-Host ""
    }
}
catch {
    throw "Failed to configure personalities."
}

# Disconnect from vCenter
Disconnect-VIServer $vCenterObj -Confirm:$false

Connecting to the FlashArrays and vCenter
In this block we are just connecting to the Pure Storage FlashArray(s) and to the respective VMware vCenter server so we can obtain ESXi host information.

Gathering the ESXi hosts
Here the script gathers the matching ESXi host objects so we can iterate over them and obtain HBA data later.

Setting the personality
This is the heavy lifting in the script. The script processes each ESXi host found in the previous step. Each host is placed into Maintenance Mode, and the script waits for that to complete. Once the host enters Maintenance Mode, the script finds a WWPN from the first Fibre Channel HBA on that ESXi server; note that the output from PowerCLI has to be converted to hex so it can be matched up in the FlashArray. The inner loop then runs once for each FlashArray specified (in case the host is connected to more than one). Using the array connections obtained in the first section, the script gets the host object on that array matching the WWPN filter and sets it to use the personality “esxi”. Note: it must be specified exactly that way; the field is case sensitive. Afterwards, the host exits Maintenance Mode and the process repeats for any remaining ESXi hosts.
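
As a usage example, running the script against two specific hosts on two arrays would look something like this (the array names come from the script’s built-in example; the ESXi host names are placeholders):

.\Set-ESXiPersonality.ps1 -FlashArrays FlashArray1.contoso.com,FlashArray2.contoso.com `
    -vCenter test-vcenter.contoso.com `
    -ESXiHosts esxi01.contoso.com,esxi02.contoso.com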

For additional information on Host Personalities, I’d recommend checking out Cody Hosterman’s blog post on 6.7 and Flat LUN IDs.
