By Ryan Weldon, Dell Virtualization Solutions Engineering
To showcase some of the features of Microsoft® Windows Server® 2008 R2 Hyper-V™, a configuration was set up in the Dell Virtualization Solutions Engineering lab that implements our networking best practice recommendations for a Hyper-V R2 environment. These best practices are the result of internal validation and direct interaction with Microsoft, and they build on our work with Hyper-V R1. Our best practices are based on the following design principles:
- Redundant hardware to eliminate a single point of failure
- Load balancing and failover for Internet SCSI (iSCSI) and virtual machine traffic
- Redundant paths for the cluster, Cluster Shared Volume (CSV), and live migration traffic
- Separation of each traffic type for security and availability
- Ease of use and implementation
The hardware utilized in this configuration was modeled as closely as possible on a customer configuration that was identified during a TechCenter chat session. The customer reported having purchased:
- Dell™ PowerEdge™ M1000e with 3 PowerEdge M710 servers
- Four Dell PowerConnect™ M6220 switches
- A Dell EqualLogic™ PS6000 series iSCSI storage area network (SAN) array
The customer noted that the two open I/O module slots are targeted for 10 Gigabit Ethernet (GbE) networking in the future.
The only exception to our best practice recommendation in this design is the use of logical separation rather than physical separation of the iSCSI traffic. We recommend that customers use physical separation for iSCSI traffic; however, we understand that there are some environments where that is not feasible. If this is the case, then logical separation through virtual LANs (VLANs) can be implemented. To support all of the design principles identified previously with the hardware defined, logical separation was required.
This design could be easily modified by adding mezzanine cards and switches to the unused fabric and dedicating that hardware to iSCSI. In this case, the adapters identified in this design for iSCSI traffic could be repurposed for other traffic types (we recommend virtual machine traffic). The switch and Hyper-V server configuration would need to be updated accordingly.
Note: This configuration does not focus on best practices associated with SCVMM R2.
Hyper-V R2 Networking Overview
The traffic types in a Hyper-V R2 implementation (with failover clustering) are listed in the table below.
| Type | Details | Redundancy |
| Virtual machine | Allows virtual machines to communicate with clients. Enabled by creating a Hyper-V virtual switch on a network adapter. The failover cluster should be disabled from managing this network. | Provided by establishing the Hyper-V virtual switch on a network team. The team can provide load balancing, link aggregation, and failover capabilities to the virtual network. |
| Live migration | Live migration traffic can flow over any network made available to the cluster. By default, live migration traffic will prefer private networks over public networks. In a configuration with more than one private network, live migration traffic will flow on the private network that is not being used by cluster private/CSV. The priority of the networks can be set in the failover cluster manager interface or through WMI/PowerShell. | Provided by the failover cluster. If the primary network for live migration is unavailable, then the next network in the priority list is utilized. |
| Cluster Shared Volumes (CSV) | CSV traffic will flow over the same network identified for use by cluster private. Although CSV is not a requirement for supporting live migration, implementing CSV provides additional redundancy and provides an easier management experience for administrators. | See cluster private. |
| Cluster private | The cluster private network provides the primary interface for inter-node communication in the cluster. Cluster private traffic will flow over the private network with the lowest cluster metric (typically a value of 1000). To view the cluster network metrics that have been assigned, run the following PowerShell command: Get-ClusterNetwork \| ft Name, Metric | Provided by the failover cluster. If the primary network for cluster private is unavailable, then an alternate private network will be utilized. |
| Cluster public/management | In a Hyper-V environment, cluster public is the network that provides a management interface to the cluster. External management applications (SCVMM, DMC, Backup/Restore, etc.) communicate with the cluster through this network. | Provided by the failover cluster. If an individual node loses access to the public network, the cluster can maintain the group cluster IP address. If retaining access to each node is critical for your environment, a network team can be established for this network. |
In addition, failover clustering has a requirement to attach to a shared storage device (typically iSCSI, Fibre Channel, or Serial Attached SCSI (SAS)). In an iSCSI environment, the failover cluster should be disabled from managing the iSCSI networks. Redundancy for the storage network is typically provided by Microsoft's MPIO framework.
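As a minimal sketch of how a network can be excluded from cluster management once the failover cluster exists, the Role property of a cluster network can be set through the FailoverClusters PowerShell module. The network names "iSCSI 1" and "iSCSI 2" are hypothetical placeholders, not names from this configuration; substitute the names that Get-ClusterNetwork reports in your environment.

```powershell
Import-Module FailoverClusters

# Hypothetical cluster network names; replace with the names shown by Get-ClusterNetwork.
$notManaged = "iSCSI 1", "iSCSI 2"

foreach ($name in $notManaged) {
    $net = Get-ClusterNetwork -Name $name
    # Role values: 0 = not used by the cluster,
    #              1 = internal cluster communication only,
    #              3 = internal cluster communication and client access
    $net.Role = 0
}

# Confirm the result for every cluster network.
Get-ClusterNetwork | Format-Table Name, Role -AutoSize
```

The same approach applies to the network that carries the Hyper-V virtual switch, which the table above also lists as a network the cluster should not manage.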
Demo Hardware
PowerEdge M1000e Blade Chassis
| Component | Role | Quantity |
| PowerEdge M710 | Hyper-V R2 host | 2 |
| PowerEdge M805 | SCVMM R2 | 1 |
| PowerConnect M6220 | -- | 4 |
| Chassis Management Controllers (CMC) with FlexAddress enabled | -- | 2 |
The PowerEdge M1000e has three separate fabrics (A, B, and C). Each fabric has two associated I/O slots that can be populated with supported I/O modules. The slots are referred to as A1 and A2, B1 and B2, and C1 and C2. The A fabric is dedicated to the on-board network adapters for each blade. Fabrics B and C can be populated with any supported set of mezzanine cards in the servers and corresponding I/O modules (Ethernet, Fibre Channel, InfiniBand, etc.).
Component Details
PowerEdge M710
- Processor and memory for this configuration were not critical, as the number of virtual machines utilized was very small and performance was not under consideration. As of the time that this article was published, the PowerEdge M710 can support two Intel® Xeon® X5570 2.93 GHz processors and up to 192 GB of memory.
- Each PowerEdge M710 contains two dual-port 5709 LAN on Motherboards (LOMs)
- Each blade was populated with two Broadcom 5709 Dual Port 1Gb Mezzanine Cards for M-Series blades
Internal view of a PowerEdge M710 server
PowerConnect M6220
- I/O slots A1, A2, B1, and B2 have been populated with the PowerConnect M6220 switches
- Each switch has a stacking module, and the switches have been stacked together to create a single virtual switch
- The switches in I/O module slots A1 and B1 also have a 10 Gb CX4 uplink module to connect to our non-iSCSI lab infrastructure
Back end of a PowerEdge M1000e chassis with PowerConnect M6220 switches
EqualLogic PS5000 Series Array
Although the customer has Dell EqualLogic PS6000 series arrays and our lab used a PS5000 series array, the design principles remain the same. This design provides the ability to directly connect up to two EqualLogic PS6000 series arrays to the switches. Other scenarios with external switches can be supported with this hardware, but they are outside the scope of this design. Both the EqualLogic PS5000 and EqualLogic PS6000 series arrays offer redundant controllers to handle a storage controller failure. In addition, they support RAID configurations that can tolerate drive failures and have drives identified as hot spares to automatically begin a rebuild in the event that a drive fails.
Existing Lab Infrastructure
The components utilized from our existing lab infrastructure are the following:
- Active Directory, DHCP, and DNS servers
- PowerConnect 6248 switches to interconnect the servers mentioned previously to the blade chassis
Determining the Network Configuration
To implement our best practices, you will need a minimum of five adapters per blade for a non-iSCSI environment and seven for an iSCSI environment:
- 1 public for cluster public
- 2 private: 1 for cluster private/CSV primary, 1 for live migration primary
- 2 for virtual machines
- 2 for iSCSI
With eight network ports available per blade on our hardware and seven required for an iSCSI environment, we considered three different options for the remaining port:
| Option | When |
| Dedicate for virtual machines | Workload is heavy between virtual machines and external clients/servers |
| Dedicate for iSCSI | Workload is storage intensive |
| Dedicate for cluster public | Management application availability (SCVMM, Dell Management Console Powered by Altiris™ by Symantec™, backup/restore, etc.) to each server is critical. Backup/restore operations are time sensitive (for those that perform copy over the wire). |
For our configuration, we decided to choose the first option and dedicate the port to virtual machine traffic.
PowerConnect M6220 Configuration
Stacking
All four Dell PowerConnect M6220 switches are stacked together to form a single logical switch, which allows all traffic to flow between the four physical switches with 12 Gb of bandwidth on each stacking link. In addition, to implement an LACP team across switches, a single logical switch must be created.
10 Gb Uplinks
The 10 Gb uplink ports are both uplinked to a single logical PowerConnect fabric. These ports are members of an LACP team, and the upstream ports have a corresponding LACP configuration. These ports have been configured to pass only traffic tagged with the VLANs identified for cluster public/management and virtual machines.
1 Gb Uplinks
Each PowerConnect M6220 has four 1 Gb ports that have been configured to connect directly either to a physically separate iSCSI infrastructure or to the EqualLogic array(s). iSCSI traffic has been configured to be tagged on ingress (to keep it segregated within the switch) and untagged on egress (outside the switch). Although outside this design, 10 Gb uplink modules could be added to the configuration and set up to pass iSCSI traffic.
Internal Ports
Each PowerConnect M6220 switch has 16 internal ports. In the case of full-height blades, like the PowerEdge M710, 2 ports from each switch are mapped to each blade server. The mapping of these ports to the physical network adapters is critical to the design to help ensure that the loss of a single switch or a single network adapter/LOM does not result in a loss of service. As an example of the mappings, LOM 1 of the blade server in slot 1 would map one port to switch port 1 on A1 and one port to switch port 2 on A2.
Adapter-to-switch port mappings
Based on our network design and the understanding of the adapter to switch port mappings, the configuration shown in this figure was implemented. Take note of each individual traffic type and how redundancy is provided across physical adapters (for example, live migration/cluster private).
Traffic-to-switch port mappings
Understanding the VLAN and Switch Implementation
Each traffic type has been placed on its own VLAN. In addition, we had to take into account our existing VLAN implementation for traffic that we wanted to flow outside the chassis. Our current lab infrastructure is set up with the following VLANs:
| Type | VLAN |
| Management | 52 |
| Client/server workloads | 152, 153 |
With those details in mind, here is the VLAN configuration we implemented:
| Type | VLAN | Egress Chassis? |
| Cluster public/management | 52 | Yes, tagged |
| Cluster private/CSV | 53 | No* |
| Live migration | 81 | No* |
| Virtual machine/virtual networking | 152, 153 | Yes, tagged |
| iSCSI | 20 | Yes, untagged** |
*With our failover cluster being completely contained within the chassis, there is no need to externally route live migration or cluster private/CSV traffic. As such, the uplink ports on the switch do not allow those VLANs.
**With our EqualLogic array being directly attached to the switch, there was no need to keep the traffic tagged coming out of the switch.
For some customers, it may be desirable to provide the virtual machines with access to the management network (in our case VLAN 52). If that is required, do not place a virtual switch on the network port dedicated to cluster public. Instead, add an additional virtual adapter on the management VLAN, place it on the existing Hyper-V virtual switch, and update the physical switch settings accordingly.
Who is tagging?
Traffic coming into the switch from outside the chassis (VLANs 52, 152, and 153) is tagged by the sender, which is either a switch or the server/hypervisor. For traffic coming from our blades, we implemented the following settings:
| Type | VLAN | Tagged by |
| Cluster public/management | 52 | Switch |
| Cluster private/CSV | 53 | Switch |
| Live migration | 81 | Switch |
| Virtual machine/virtual networking | 152, 153 | Hyper-V virtual adapters* |
| iSCSI | 20 | Switch |
*The tagging for virtual machines occurs on a per-virtual adapter basis. As such, any individual virtual machine can have an adapter on VLAN 152, an adapter on VLAN 153, or two adapters to provide access to both VLANs.
We chose to implement tagging at the switch for all traffic types that would support that method because it provides a single location for management of the VLANs. The alternative would have been to tag each physical adapter by setting the VLAN in software; however, that method is more tedious and prone to mistakes. If you only have a single VLAN for virtual machine traffic, then switch-based tagging can be utilized there as well.
Switch Configuration Details
Complete details on the switch settings are available by viewing the switch configuration (www.delltechcenter.com/page/Sample+Switch+Configuration) that is loaded on the stack. These settings include Jumbo frame support on the ports that pass iSCSI traffic, LACP team configuration for the uplink and virtual networking ports, switch stacking, and VLANs. Use the configuration as an example and update it to meet your specific environment.
Hyper-V Management Partition Network Configuration
At this point, we have all the information to understand what our adapter setup should be on each blade. We know what function each adapter is dedicated to and the VLAN implementation for each.
Mapping Adapters to Switch Ports
The first step is to map the network devices as seen in the management partition to their corresponding switch ports. "Local Area Connection X" doesn't really tell us much. To perform the mappings, we are going to rely upon the MAC addresses for each adapter.
The MAC addresses as they relate to the switch ports can be viewed on the CMC Web interface. In a FlexAddress environment, the MAC addresses can be viewed on a per server, per port basis. The iSCSI MAC addresses seen in this figure are applicable if an iSCSI offload adapter is being utilized.
With the information we have now, we just need to determine the MAC addresses of the management partition devices shown in the "Network Connections" interface. To aid in that effort, we wrote a PowerShell script that renames all adapters that begin with "Local Area Connection" to their respective MAC addresses (available at www.delltechcenter.com/page/Helpful+PowerShell+Scripts). To ease management in the future, we then renamed our adapters to identify the function they are performing.
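The renaming approach can be illustrated with a short sketch. This is not the script published on the TechCenter page; it is an independent example that assumes the WMI class Win32_NetworkAdapter, whose NetConnectionID property is writable on Windows Server 2008 and later, and it must be run from an elevated PowerShell session.

```powershell
# Find connections that still carry the default "Local Area Connection" name
# and that report a MAC address.
$adapters = Get-WmiObject -Class Win32_NetworkAdapter |
    Where-Object { $_.NetConnectionID -like "Local Area Connection*" -and $_.MACAddress }

foreach ($adapter in $adapters) {
    # Connection names cannot contain colons, so hyphens are used as the separator.
    $adapter.NetConnectionID = $adapter.MACAddress -replace ":", "-"
    # Write the changed property back to the system.
    $adapter.Put() | Out-Null
}

# List the connections with their MAC addresses to compare against the CMC view.
Get-WmiObject -Class Win32_NetworkAdapter |
    Where-Object { $_.NetConnectionID } |
    Sort-Object NetConnectionID |
    Format-Table NetConnectionID, MACAddress, Name -AutoSize
```

Once each connection has been matched to its switch port, the same NetConnectionID assignment can be used to give it a functional name such as the traffic type it carries.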
The figure below displays an example of how the complete end-to-end mappings are accomplished.
Completing the Network Configuration
The last steps to implementing our network design were the following:
- Establish the LACP network team in Broadcom Advanced Control Suite (BACS) on the three interfaces identified for virtual machines and create a Hyper-V virtual switch on the resulting teamed adapter
- Assign IP addresses to the network interfaces
- Enable Jumbo frames on the two interfaces identified for iSCSI
- Run the Dell EqualLogic Remote Setup Wizard and connect to the array through the Microsoft iSCSI initiator interface
- After the failover cluster is established:
  - Configure the cluster to manage/not manage the different networks
  - Set the priority of the live migration networks
  - Ensure that the network identified for cluster private has the lowest private cluster network metric and live migration a higher private metric
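The Jumbo frame step in the list above can be spot-checked from the management partition. The following is a sketch only: the target address is a hypothetical placeholder for an array port on the iSCSI VLAN, and the payload size assumes a 9000-byte MTU end to end, which may differ from the value configured on your adapters and switch ports.

```powershell
# Hypothetical address of an EqualLogic array port on the iSCSI VLAN; replace with your own.
$target = "192.168.20.10"

# Show the MTU currently in effect on each interface.
netsh interface ipv4 show subinterfaces

# 8972 bytes of ICMP payload plus 28 bytes of IP and ICMP headers make a 9000-byte packet.
# -f sets the Don't Fragment flag, so the ping fails if any device in the path
# cannot pass a frame of that size.
ping.exe -f -l 8972 -n 4 $target
```

If the ping reports that the packet needs to be fragmented, one of the adapters, switch ports, or array ports in the path is likely still using a standard MTU.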
To view the cluster network metric settings, run the following PowerShell commands:
Import-Module FailoverClusters
Get-ClusterNetwork | ft Name, Metric, AutoMetric
If the automatically assigned metrics are not the desired values, then the following PowerShell commands can be executed to manually set the metric values:
Get-ClusterNetwork | ft Name, Metric, AutoMetric
# Note the name of each network whose metric you want to set (used in the next command)
$cn = Get-ClusterNetwork "<cluster network name>"
$cn.Metric = <value>
# Cluster private/CSV should have a value of 1000
# Live migration should have a value of 1100
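When more than one network needs a manual metric, the commands above can be driven from a small table. The sketch below assumes two hypothetical cluster network names, "Cluster Private" and "Live Migration"; use the names that Get-ClusterNetwork reports for your cluster.

```powershell
Import-Module FailoverClusters

# Hypothetical network names mapped to the metric values recommended above.
$desired = @{
    "Cluster Private" = 1000
    "Live Migration"  = 1100
}

foreach ($name in $desired.Keys) {
    $net = Get-ClusterNetwork -Name $name
    # Assigning Metric manually also switches AutoMetric to False for that network.
    $net.Metric = $desired[$name]
}

# List the networks lowest metric first; the cluster private/CSV network
# should have the lowest metric among the private networks.
Get-ClusterNetwork | Sort-Object Metric | Format-Table Name, Metric, AutoMetric, Role -AutoSize
```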