by Brian K. Matheson
Published April 2013
Oracle Virtual Networking offers advantages for virtualized server environments. With server virtualization, the bandwidth, network diversity, and storage performance requirements for the physical hosts can be accommodated by using multiple virtual NICs (vNICs) and virtual HBAs (vHBAs), all of which are delivered over a single pair of cables connecting to a purpose-built fabric. These resources can be added or changed on the fly without disrupting service, making it easy to virtualize new applications. The networking resources can also be configured such that they communicate directly across the fabric, allowing for unprecedented server-to-server performance.
In this article, we will examine guidelines and best practices for deploying Oracle Virtual Networking in a virtualized environment. The recommendations focus on bringing the most simplicity, agility, and performance to the environment. This article is a supplement to, not a replacement for, the official documentation. It is assumed that readers are familiar with the administration of virtualized environments and with the basic concepts of Oracle Virtual Networking.
The first step in deploying any data center equipment should be planning out the details. Because Oracle Virtual Networking allows for easy separation of networks, deployments used by virtualized servers commonly have multiple Ethernet networks and potentially FC networks. These networks are typically defined by their purpose and, as pictured in Figure 1, might include separate networks for migration, management, backup, and the SAN. Additional networks might include a DMZ, an isolated database network, a dedicated iSCSI SAN, and an FC tape SAN. All of these networks can and should be planned out before the deployment.
For a simple example, take a virtualized environment that needs one network for virtual machines (VMs), one network for management connections for the host, one network for iSCSI traffic, and one network for VM migration. We can define a link aggregation group (LAG) including the first four ports on a 10-port GbE card and use that for VMs and management, and then use the last two ports for separate dedicated networks for iSCSI and migration, as shown in Figure 2:
Table 1 and Table 2 show planning worksheets for the example configuration. Doing such planning ahead of time helps with understanding the rest of the associated hypervisor and Ethernet switch configuration that needs to be done.

Table 1. Planning Worksheet: I/O Port Configuration Parameters

| Port Name | Mode | Flow Control | Maximum Transmission Unit (MTU) |
| --- | --- | --- | --- |

Table 2. Planning Worksheet: Cloud Configuration Parameters

| Cloud Name | Mode | VLAN ID | Quality of Service (QoS) | Policy | Ports |
| --- | --- | --- | --- | --- | --- |
| VMNet | trunk | All VLAN IDs (for example, 2, 3, 4, and so on) | none | random | xsigo1/6.1, xsigo2/6.1 |
The initial rack and stack of the Oracle Fabric Interconnects is much like that of any other data center device. Airflow is front to back, as with most server hardware. The I/O ports and InfiniBand (IB) fabric connections are on the back of the system to allow for easy cabling to the servers on the same side of the rack. Therefore, it is recommended that the Oracle Fabric Interconnects be racked in the server racks rather than in a networking cabinet (where the ports are typically located on the air-intake side).
In most large deployments (more than 20 servers), intermediary switches are used to grow the fabric. These switches are typically deployed in pairs, and they have InfiniBand ports to accommodate a number of servers connected to them, as well as uplinks into the Oracle Fabric Interconnects. Typically, it is recommended that multiple uplinks to the Oracle Fabric Interconnects be provided for redundancy and additional bandwidth. In an InfiniBand fabric, traffic is balanced across parallel links between switches without any additional configuration.
For many deployments, two InfiniBand uplinks are enough to accommodate the traffic demands of the connected servers. In other cases, four uplinks are required to provide enough bandwidth for the installed I/O cards. An additional factor in the decision is the amount of aggregate bandwidth desired for server-to-server communication over the uplinks. If in doubt, four uplinks from each external fabric switch is a good number to choose.
The servers should be connected to the fabric with exactly two InfiniBand ports. These ports can be on the same dual-port InfiniBand host channel adapter (HCA) or on two separate HCAs. In blade server environments, it is typical to have only one HCA installed. When using rack-mounted servers, either one HCA or two HCAs can be installed. In the case of two HCAs, use only one port from each card.
Care should be taken to ensure that the IB cables are seated properly. IB cables lock into place when they are seated. Removal of the cable requires the use of the extraction tab. Once the cables are plugged in, pull on them firmly to ensure that they are seated.
Whenever modifications to the physical connections are made or new I/O cards are installed, it is important to check for any errors on the IB fabric using the ibcheckerrors command. Log in to the Oracle Fabric Interconnect using the root account. Run the ibclearerrors command, wait a minute or more with the system in a steady state and running traffic, and then run the ibcheckerrors command. In general, symbol errors and VL15 dropped errors do not indicate a problem, but other errors, such as receive errors and discards, can indicate problems. If any of these are detected, check the new cabling first.
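The check cycle described above can be sketched as a short shell sequence, assuming the infiniband-diags utilities ibclearerrors and ibcheckerrors are available in the Oracle Fabric Interconnect's root shell:

```shell
# Zero the fabric's error counters, let traffic run in a steady
# state for a minute or more, then report any counters that
# incremented during the interval.
ibclearerrors
sleep 60
ibcheckerrors
# Symbol errors and VL15 drops are usually benign; receive errors
# and discards point first at new cabling or poorly seated ports.
```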
The initial setup of the system is very straightforward. The Oracle Fabric Interconnects are connected to a management Ethernet network. They will use DHCP to request an address, or they can be given a specific network configuration via the configuration script on the serial console. Both Oracle Fabric Interconnects should run an IB subnet manager, and they will elect a master on their own. Once the Oracle Fabric Interconnects are set up, physical servers with the host drivers installed will appear by name on the fabric.
Initial setup of the Oracle Fabric Manager server is similarly straightforward. Please refer to the product documentation for information on how to install the software on a server or VM running Linux or Microsoft Windows. Once the Oracle Fabric Manager server is installed and licensed, associate it with the deployed Oracle Fabric Interconnects using the Discover or Scan buttons on the dashboard page. Any additional configuration for the Oracle Fabric Interconnects can be done at this point. The typical workflow for the initial configuration is discussed in more detail below.
It is recommended that the phone-home functionality be turned on and tested. This functionality alerts Oracle Support engineers of activity on the system, including hardware failures. The best way to ensure that the data is reaching Oracle is to open a low-priority support case asking for verification of the phone-home configuration, and then manually initiate a phone-home event. Weekly phone-home events are recommended so that the support organization has a current baseline to use for comparison if an issue arises.
Monitoring of traffic flow can be accomplished within the Oracle Fabric Manager software using the monitoring plug-in. It can also be done via SNMP. Oracle recommends monitoring standard system characteristics—such as system uptime and so on—via SNMP. Oracle also recommends setting a trap host to catch traps and decode them with the provided management information base (MIB).
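As an illustration of SNMP monitoring, standard MIB-II objects such as system uptime can be polled with the net-snmp tools. The hostname, community string, and MIB file path below are hypothetical; substitute your own values:

```shell
# Poll system uptime on a Fabric Interconnect (hypothetical
# management address and community string).
snmpget -v2c -c public fabric1.example.com SNMPv2-MIB::sysUpTime.0

# Run a trap receiver on the management station, loading the MIB
# provided with the product so traps decode to readable names
# (the MIB file path shown is hypothetical).
snmptrapd -f -Lo -m /usr/share/snmp/mibs/XSIGO-MIB.txt
```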
As with any networking device, the best practice is to add descriptions to your interfaces. Description fields are available for I/O cards, I/O ports, and LAGs. Oracle recommends using the port description field to indicate which network switch and port is connected (for example, switch1-ge1/1/1). The LAG description should indicate the associated logical interface (for example, switch1-portchannel40). Other information can be added as desired.
A number of Ethernet characteristics are set at the port level. These settings must be done when no virtual resources are assigned to the port in question. Proper planning will allow for these characteristics to be determined ahead of time and configured at deployment time.
The MTU is set at the port level, and it is inherited by all vNICs that terminate on that port at vNIC creation time. Oracle recommends using the default MTU (1500 bytes) for "forward facing" interfaces that carry traffic bound for an IP router. Examples of these would be a management interface for the server and an interface for virtual machine networks.
Oracle recommends using a larger MTU (typically, 9000 bytes) for "internal" networks such as a dedicated storage network, VM migration network, or anything that does not need to route traffic to the network at large.
The MTU must be the same for all the NIC devices speaking on that network, and the MTU for interswitch links must be large enough to accommodate the frames. For example, for a VLAN that is dedicated to iSCSI traffic and is used by virtual machines, set the MTU to 9000 in the following places:

- On the Ethernet I/O port of each Oracle Fabric Interconnect
- On the virtual switch and the VM network interfaces in the hypervisor
- On the intermediary Ethernet switch ports and any interswitch links
- On the storage array's network interfaces

vNICs created on that Ethernet port on the Oracle Fabric Interconnect will then be able to pass traffic to the storage array using a frame size of 9000.
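A quick way to verify jumbo frames end to end from a Linux host is a non-fragmenting ping sized to the MTU minus the IPv4 and ICMP headers. A minimal sketch; the target address is hypothetical:

```shell
# For a 9000-byte MTU, the largest ICMP payload that fits in one
# frame is 9000 - 20 (IPv4 header) - 8 (ICMP header) = 8972 bytes.
MTU=9000
PAYLOAD=$((MTU - 20 - 8))
# -M do sets the don't-fragment bit, so an undersized link in the
# path will fail the ping instead of silently fragmenting.
echo "ping -M do -s ${PAYLOAD} <storage-array-ip>"
```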
Ethernet flow control support must be enabled on the Ethernet port prior to deploying vNICs on it. Flow control should also be set on the vNICs that terminate on this port. For the best performance, most storage array vendors recommend using flow control on dedicated storage networks. Flow control should be enabled on the intermediary switch ports as well.
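On a Linux host, the corresponding pause-frame settings can be inspected and enabled with ethtool; the interface name below is hypothetical:

```shell
ethtool -a eth2              # show current flow control (pause) settings
ethtool -A eth2 rx on tx on  # enable receive and transmit flow control
```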
The port's mode must be set before deploying vNICs on the port. By default, all Ethernet ports are in access mode. To carry VLAN-tagged traffic across the port, the mode must be changed to trunk. The mode of the port is independent of the mode that the vNIC has configured, such that it is possible to have access-mode vNICs terminated on a trunk-mode Ethernet port. It is also possible to configure trunk-mode vNICs on a trunk-mode port, such that the whole path supports VLAN-tagged Ethernet frames.
For virtualized environments, it's typically best to use trunk-mode vNICs terminating on trunk-mode Ethernet ports for virtual machine networks. This allows for splitting the virtual networks into different VLANs at the hypervisor layer.
Once the Oracle Fabric Interconnects are deployed and configured, you can define the network and storage clouds within the Oracle Fabric Manager software. Earlier, we discussed planning the different networks within a data center; let's go into more detail about configuring the interfaces that make up the clouds.
Note: A specific port can be in multiple clouds.
There are three types of clouds that can be defined: storage, network, and private virtual infrastructures (PVIs). In all cases, the clouds should be specific to one fabric. When considering naming conventions, it is important to keep in mind the possibility of a single instance of Oracle Fabric Manager managing multiple fabrics (typically at multiple sites).
For all cloud types, it is possible to add multiple ports from multiple Oracle Fabric Interconnects. In situations in which high availability (HA) vNICs or HA vHBAs are used, the Oracle Fabric Manager software will terminate the constituent resources on the most diverse set of ports possible (that is, if it is possible to configure the HA resource such that the active resource is on Oracle Fabric Interconnect A and the standby resource is on Oracle Fabric Interconnect B, it will do so). Therefore, if you are using HA vNICs and vHBAs, it is recommended that you create clouds that include ports from both Oracle Fabric Interconnect A and Oracle Fabric Interconnect B.
Avoid the use of HA resources in virtualized environments for clouds other than PVIs. Redundancy can be configured explicitly in the hypervisor, resulting in more deterministic failover behaviors. If you are connecting hypervisors and bare-metal servers to the resources, create only one set of clouds for the hypervisors and create a different set of clouds using the same ports for the bare-metal servers.
For storage clouds, Oracle recommends creating one cloud for each Oracle Fabric Interconnect/FC-fabric combination. In a typical scenario, there will be four links to the FC SAN fabrics. For example, Figure 4 shows the following clouds:
For optimal redundancy, create one cloud for each of these connectivity paths, add one vHBA per cloud on each server, and ensure the multipathing software on the host is able to use all four vHBAs. When adding FC cards to the environment, add them in multiples of four, and connect them in a similar fashion so that one port can be added to each cloud. Typically, there is no need to set other parameters for the storage clouds.
For network clouds, the optimally redundant configuration is a mesh from each Oracle Fabric Interconnect to each of a pair of stacked switches or a virtual switching system (VSS). LAGs are created on each Oracle Fabric Interconnect to bond the two ports together, and the LAGs can be added to one cloud for each Oracle Fabric Interconnect. vNICs are terminated on the LAGs that are providing switch-level redundancy, and a pair of vNICs are bonded together in the hypervisor to provide Oracle Fabric Interconnect–level redundancy. Ports are added in a similar fashion, with the configured LAG on Oracle Fabric Interconnect A being added to the cloud for Oracle Fabric Interconnect A and the LAG on Oracle Fabric Interconnect B being added to the cloud for Oracle Fabric Interconnect B. LAGs must be configured before adding the LAGs to the network clouds.
Note: It is not possible to create a LAG that spans different I/O cards. Also, each operating system handles NIC names differently. To avoid compatibility issues, use unique interface names that have fewer than seven characters.
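On a Linux-based hypervisor, the vNIC pair can be bonded to provide the Oracle Fabric Interconnect-level redundancy described above. A sketch using iproute2, with hypothetical vNIC names kept under seven characters as noted; run as root on the host:

```shell
# Create an active-backup bond and enslave one vNIC from each
# Oracle Fabric Interconnect (vnic_a and vnic_b are hypothetical
# names; interfaces must be down before being enslaved).
ip link add bond0 type bond mode active-backup miimon 100
ip link set vnic_a down; ip link set vnic_a master bond0
ip link set vnic_b down; ip link set vnic_b master bond0
ip link set bond0 up
```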
PVI clouds require the least amount of planning, because they exist entirely within the IB fabric. For purposes of redundancy, it is necessary to use HA vNICs. It is also recommended that you use as large an MTU as your operating system supports (typically, 9000).
Once the clouds are created, build a template for your virtual host, as shown in Figure 5.
Template naming conventions should be based on location, since a single Oracle Fabric Manager instance may manage multiple fabrics. The template will include vNIC objects that connect to clouds. For virtualized environments, using HA virtual resources is possible but not recommended. I/O resources should be deployed in pairs, connected to the associated clouds, and given short names that reflect the purpose of the interface. Unless you are using a remote booting technique, such as booting from PXE or SAN, the other parameters in the template editor need not be specified.
Save the template and then use it to create I/O profiles, which can be attached to physical servers as they appear on the fabric.
Brian K. Matheson is an Oracle Sales Consultant specializing in the Oracle Virtual Networking product line. Brian has contributed to Oracle and Xsigo in a variety of ways over the years, including systems administration, QA, and professional services. He currently resides in New York City.
|Revision 1.0, 04/22/2013|