With the increasing demand for high-performance networking in virtualized environments, configuring InfiniBand on VMware vSphere 8 has become a critical task for many IT teams. This blog explores the updated process and considerations for setting up InfiniBand using the latest tools and practices, providing insights into performance, SR-IOV configuration, and common troubleshooting scenarios.
The integration of InfiniBand on vSphere 8 has been streamlined with enhancements to the vSphere Lifecycle Manager and improved native driver support. For those familiar with vSphere 7, many procedures remain consistent, but vSphere 8 introduces some important changes worth noting.
The first step is to confirm that the native Mellanox driver is already present on the ESXi host. With vSphere 8, the driver comes pre-installed, and verification can be done using a simple esxcli command.
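As a quick illustration, checks along the following lines confirm the inbox driver and the adapters it has claimed (the nmlx grep patterns are assumptions based on typical ConnectX deployments rather than commands quoted from the paper):

```shell
# List installed VIBs and look for the native Mellanox (nmlx) driver package
esxcli software vib list | grep nmlx

# Confirm the nmlx5_core module is present and loaded
esxcli system module list | grep nmlx5

# Show the NICs ESXi has claimed, including the driver bound to each one
esxcli network nic list
```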
Installing the Mellanox Firmware Tools, specifically the MFT and NMST packages, is now much easier thanks to vSphere Lifecycle Manager. Instead of deploying the bundles manually on each host, admins can import them into the Lifecycle Manager depot and remediate the whole cluster in one pass. The packages are downloaded from NVIDIA’s website and uploaded through the vSphere Client, after which the entire cluster can be updated with minimal downtime.
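For a lab host, or where Lifecycle Manager is not in play, the same bundles can be applied by hand. A hedged sketch, with placeholder paths standing in for the actual MFT and NMST offline bundles downloaded from NVIDIA:

```shell
# Install the MFT and NMST offline bundles copied to a datastore on the host
# (replace the placeholder filenames with the versions downloaded from NVIDIA)
esxcli software vib install -d /vmfs/volumes/datastore1/MFT-offline-bundle.zip
esxcli software vib install -d /vmfs/volumes/datastore1/NMST-offline-bundle.zip

# Reboot so the tools and their kernel modules are picked up
reboot
```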
In some cases, InfiniBand cards may not be visible via the mst status command after installation. This can typically be resolved by putting the native Mellanox driver into recovery mode using specific esxcli module parameters, followed by a couple of host reboots.
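As a sketch of that workaround: mst_recovery is the module parameter commonly cited in NVIDIA’s MFT documentation for ESXi, and /opt/mellanox/bin is the usual install path for the tools, but both should be checked against the driver and MFT versions actually in use:

```shell
# Load the native driver in recovery mode so MFT can enumerate the card
esxcli system module parameters set -m nmlx5_core -p "mst_recovery=1"

# Reboot the host, then check whether the device is now visible
reboot
/opt/mellanox/bin/mst status
```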
Enabling SR-IOV is particularly relevant for workloads like large language model training that require multiple IB cards. A script using mlxconfig can be used to enable advanced PCI settings and configure virtual functions on each device. It's important to remember that in vSphere 8, InfiniBand VFs may appear as 'Down' in the UI, which is expected behavior. The actual link state should be verified at the VM level.
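A minimal sketch of such a script, assuming MFT is installed, eight VFs per card, and the nmlx5_core max_vfs module parameter; the device-name pattern, paths, and VF counts are placeholders to adapt to the actual hardware rather than the script from the paper:

```shell
# Enable advanced PCI settings, SR-IOV, and eight VFs in firmware on every device mst reports
for dev in $(/opt/mellanox/bin/mst status | grep -o 'mt[0-9]*_pciconf[0-9]*'); do
    /opt/mellanox/bin/mlxconfig -d "$dev" -y set ADVANCED_PCI_SETTINGS=1 SRIOV_EN=1 NUM_OF_VFS=8
done

# Ask the ESXi driver to expose eight VFs per adapter, then reboot so the firmware change takes effect
esxcli system module parameters set -m nmlx5_core -p "max_vfs=8,8"
reboot
```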
For environments that continue to use PCI passthrough rather than SR-IOV, disabling SR-IOV can be done with a similar script that reverts the card settings and resets the ESXi module parameters.
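A corresponding sketch for reverting to a passthrough-only setup, with the same caveats about placeholder device names and parameters:

```shell
# Turn SR-IOV back off in firmware on each device
for dev in $(/opt/mellanox/bin/mst status | grep -o 'mt[0-9]*_pciconf[0-9]*'); do
    /opt/mellanox/bin/mlxconfig -d "$dev" -y set SRIOV_EN=0 NUM_OF_VFS=0
done

# Clear the driver's VF setting and reboot
esxcli system module parameters set -m nmlx5_core -p ""
reboot
```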
On the switching side, administrators should ensure their IB switches are running an up-to-date MLNX-OS release. Compatibility between switch firmware and adapter firmware is key to avoiding communication issues, and enabling OpenSM virtualization support on the switches is critical for SR-IOV to work.
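On a managed switch that runs the subnet manager, enabling virtualization support generally looks something like the console sequence below; the exact MLNX-OS syntax differs between releases, so treat this as an illustrative sketch rather than the paper’s verbatim commands:

```
# On the switch CLI, in configuration mode: enable virtualization in the onboard SM,
# then restart the SM so the setting takes effect
ib sm virt enable
no ib sm
ib sm
```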
The paper outlines several behavioral nuances of passthrough versus SR-IOV. For instance, certain OpenSM utilities may not function when a virtual function is passed to a VM, which is expected. Likewise, mst status behaves differently depending on whether a physical or virtual function is in use.
Troubleshooting steps are also provided for cases where MLX cards fail to appear in the mst status output. These include reloading the appropriate kernel modules and signaling system processes, after which recovery mode is temporarily enabled until the next host reboot.
Performance tests revealed that unidirectional bandwidth reached 396.5 Gbps using four queue pairs, nearly saturating the theoretical line rate of the InfiniBand cards. Bidirectional bandwidth tests showed performance scaling up to 790 Gbps with two cards, confirming the setup’s ability to handle demanding HPC and AI workloads.
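Those figures line up with what the standard perftest utilities report; a hedged example of the kind of run involved, with illustrative device names and flags rather than the paper’s exact command lines:

```shell
# Server side, inside the VM that owns the VF or passthrough device
ib_write_bw -d mlx5_0 -q 4 --report_gbits

# Client side, pointing at the server; add -b for the bidirectional measurement
ib_write_bw -d mlx5_0 -q 4 --report_gbits <server-address>
ib_write_bw -d mlx5_0 -q 4 -b --report_gbits <server-address>
```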
In conclusion, VMware vSphere 8 enhances the experience of deploying InfiniBand by introducing automation through Lifecycle Manager and retaining robust performance tuning capabilities. With updated best practices, simplified installation, and clear guidance on SR-IOV and troubleshooting, IT teams can now fully leverage InfiniBand’s potential in virtualized environments, including VMware Cloud Foundation.
This blog captures the essence of the technical paper authored by Yuankun Fu, who has a strong background in HPC and AI performance optimization within VMware. His guidance in this paper provides both practical instructions and valuable performance data for teams looking to adopt or enhance InfiniBand in their vSphere environments.