* RECOMMENDED * HPE Mellanox RoCE (RDMA over Converged Ethernet) Driver for Red Hat Enterprise Linux 6 Update 9 (x86_64)

By downloading, you agree to the terms and conditions of the Hewlett Packard Enterprise Software License Agreement.
Note:  Some software requires a valid warranty, current Hewlett Packard Enterprise support contract, or a license fee.

Type: Driver - Network
Version: 4.1 (25 Sep 2017)
Operating System(s):
Red Hat Enterprise Linux 6 Server (x86-64)
Multi-part download
File name: kmod-mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.compsig (2.0 KB)
File name: kmod-mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm (1.1 MB)
File name: mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.compsig (2.0 KB)
File name: mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm (1.8 MB)
This RPM contains the HPE tested and approved Linux-based Mellanox RoCE (RDMA over Converged Ethernet) driver for supported HPE Mellanox adapter cards.

HPE Mellanox RoCE driver version 4.1 contains the following changes and new features:

  • Support for additional RoCE diagnostics and ECN congestion counters under the /sys/class/infiniband/mlx5_0/ports/1/hw_counters/ directory.
  • Support for rx-fcs ethtool offload configuration. Normally, the ASIC hardware truncates the FCS of a packet before passing the packet to the application socket buffer (skb). With ethtool, rx-fcs can be set so that the FCS is not truncated but is passed to the application for analysis.
  • Option to enable PFC based on the DSCP value. With this option, VLAN headers are no longer mandatory.
  • ECN parameters have been moved to the following directory: /sys/kernel/debug/mlx5/<PCI BUS>/cc_params/
  • Support for mlx_fs_dump, a Python tool that prints the steering rules in a readable manner.
  • Ability to open a device and create a context while giving PCI peer attributes such as name and ID.
  • Ability to disable probed VFs on the hypervisor.
  • Improved performance: the mlx5 driver now disables local loopback (unicast and multicast) by default while local loopback is not in use. The mlx5 driver keeps track of the number of transport domains that are opened by user-space applications. If more than one user-space transport domain is open, local loopback is automatically enabled.
  • Support for One Pulse Per Second (1PPS), a time synchronization feature that allows the adapter to send or receive 1 pulse per second on a dedicated pin on the adapter card.
  • Support for fast driver teardown in shutdown and kexec flows.
  • Support for NVMe over Fabrics (NVMEoF) offload, an implementation of the target (server) side of the new NVMEoF standard in hardware.
  • Changed the default RoCE mode on which RDMA CM runs from RoCEv1 to RoCEv2. An RDMA_CM session requires both the client and server sides to support the same RoCE mode; otherwise, the client fails to connect to the server.

HPE Mellanox RoCE driver version 3.4 contains the following changes and new features:

  • Added the following kernel module parameters:
    • mlx4_en_only_mode
    • udev_dev_port_dev_id
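
To check whether one of these parameters is present on a running system, the sysfs module tree can be searched by parameter name. This is a sketch under assumptions: the helper name find_module_param is hypothetical, and a parameter appears under /sys/module only while its module is loaded and only if the driver exports it there. Searching by name avoids assuming which mlx4 module owns each parameter.

```shell
# Locate a kernel module parameter under /sys/module and print "path=value".
# Returns 0 if at least one match was found, 1 otherwise.
find_module_param() {
    local name="$1" root="${2:-/sys/module}"
    local p found=1
    for p in "$root"/*/parameters/"$name"; do
        [ -f "$p" ] || continue          # an unmatched glob stays literal; skip it
        printf '%s=%s\n' "$p" "$(cat "$p")"
        found=0
    done
    return "$found"
}
```

For example, find_module_param mlx4_en_only_mode prints the owning module's path and the current value, or returns a non-zero status if no loaded module exposes that parameter.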

To ensure the integrity of your download, HPE recommends verifying the downloaded files against the following SHA-256 checksum values:

34cbce9b13821a6ad88426412df2f1c4ab64ed6ddfca992fdf8091cda317f2f4 kmod-mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm
ffe4feeab8fd8037e7f02b82101f595c941e72bc420ef0d6a1f44f5187884710 mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.compsig
1e9615138358179f076eee54569f2144bf254583a3ea8ecb50a9387b84318647 mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm
c1a3a4a8272c3f5c4a8fa6111566e39a16d9063e9cd760352c4daa30339acc5f kmod-mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.compsig
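
One way to perform this check is with the sha256sum utility from GNU coreutils. The sketch below assumes sha256sum and cut are available on the system; the function name verify_sha256 is hypothetical. The expected digest is whatever value is published for the file being checked.

```shell
# Compare a file's SHA-256 digest with an expected hex string.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_sha256() {
    local expected="$1" file="$2" actual
    if [ ! -r "$file" ]; then
        echo "cannot read: $file" >&2
        return 2
    fi
    # sha256sum prints "<digest>  <file>"; keep only the digest field
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

Call it once per downloaded file, as verify_sha256 <published-digest> <file-name>. A non-zero return status means the file should be downloaded again before installation.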

Reboot Requirement:
Reboot is required after installation for updates to take effect and hardware stability to be maintained.


Installation:

Log in as root, download the RPMs to a directory on your hard drive, and change to that directory.

Note: Irrespective of the kernel version or type used, the "mlnx-ofa_kernel-4.0" RPM must be installed to enable the user space functionality for RoCE. The RoCE user space library RPM (mlnx-ofa_kernel) may conflict with the OpenMPI RPMs included with the OS distribution. This is known behavior. Ensure that the OpenMPI RPMs from the Linux distribution are not installed on the target node before installing "mlnx-ofa_kernel". If they are already installed, uninstall them before installing the HPE Mellanox RoCE driver packages.

To install or upgrade the driver:

If using kernel version 2.6.32-696.el6 or any future errata:

# rpm -Uvh kmod-mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm  mlnx-ofa_kernel-4.1-OFED.4.1.1.0.2.1.gc22af88.rhel6u9.x86_64.rpm

Setup is now complete. Reboot your computer for the driver to take effect.
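
The OpenMPI check described in the note above can be scripted before running the rpm command. This is a minimal sketch: the function name openmpi_conflicts is hypothetical, and matching on the substring "openmpi" is an assumption about how the distribution names its packages.

```shell
# Read a package list (one name per line) on stdin and print the entries
# that look like OpenMPI packages. Exit status follows grep: 0 if any
# match was printed, non-zero otherwise.
openmpi_conflicts() {
    grep -i 'openmpi'
}

# Typical use on the target node, before rpm -Uvh:
#   if rpm -qa | openmpi_conflicts; then
#       echo "Remove the packages listed above (rpm -e <package>) first." >&2
#   fi
```

Because the function only filters its input, it can be tried against a saved package list before it is used on a live node.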


End User License Agreements:
HPE Software License Agreement v1


Upgrade Requirement:
Recommended - HPE recommends users update to this version at their earliest convenience.


Supported Devices and Features:

SUPPORTED KERNELS:
The kernels of Red Hat Enterprise Linux 6 Update 9 (x86_64) supported by this binary rpm are:
2.6.32-696.el6 (x86_64) and future update kernels.
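
Whether a running kernel falls in this range can be checked from its release string. The sketch below assumes that "future update kernels" means RHEL 6 kernels of the form 2.6.32-<N>...el6 with N of 696 or higher. That reading, and the function name kernel_supported, are assumptions rather than HPE's stated rule.

```shell
# Return 0 if a kernel release string looks like 2.6.32-<N>[...].el6
# with N >= 696, and 1 otherwise.
kernel_supported() {
    local rel="$1" build
    case "$rel" in
        2.6.32-[0-9]*.el6*) ;;           # RHEL 6 kernel naming
        *) return 1 ;;
    esac
    build=${rel#2.6.32-}                 # drop the base version
    build=${build%%.*}                   # keep the digits before the first dot
    case "$build" in
        ''|*[!0-9]*) return 1 ;;         # not a plain number
    esac
    [ "$build" -ge 696 ]
}
```

On the target node, kernel_supported "$(uname -r)" && echo "kernel is in the supported range" performs the check against the running kernel.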


The following issues are fixed in version 4.1:

  • IPv6 procedures were called when they were not supported by the underlying kernel.
  • Fixed a memory leak that was introduced in kernel 4.11, and added warning messages to the Soft RoCE driver for easy detection of future SKB leaks.
  • Kernel crash used to occur when an RXE device was coupled with a virtual (dummy) device.
  • A race condition in the RoCE GID cache used to cause the loss of IP-based GIDs.
  • "rdma_cm" connection between a client and a server that were on the same host was not possible when working over VLAN interfaces.
  • RDMACM connection used to fail at high connection rates, accompanied by the error message: RDMA_CM_EVENT_UNREACHABLE.
  • SR-IOV (Single Root I/O Virtualization) was not supported in systems with a page size greater than 16KB.

The following issues are fixed in version 4.0:

  • The kernel occasionally ran out of memory upon driver start on SLES12 SP2.
  • Spoof-check was turned on for MAC address 00:00:00:00:00:00.
  • TCP packets were received out of order when Large Receive Offload (LRO) was on.
  • Memory allocation for CQ buffers used to fail when increasing the RX ring size.
  • MLNX_EN driver failed to load on 4K page ARM architecture.

The following issues are fixed in version 3.4:

  • The "ethtool" self test used to fail on the interrupt test after a timeout if the mlx4_ib module was not loaded.
  • On rare occasions, a kernel panic occurred during system reboot, caused by mlx4_en_get_drvinfo() being called from an asynchronous event handler.
  • Attempting to disable SR-IOV while any VF netdevs were open failed.

Revision History

Version: 4.3 (26 Jun 2018)
Fixes

Upgrade Requirement:
Recommended - HPE recommends users update to this version at their earliest convenience.


The following issues have been fixed in version 4.3:

  • Sending Work Requests (WRs) with multiple entries where the first entry was less than 18 bytes used to fail.
  • When the interface was down, ethtool counters ceased to increase. As a result, RoCE traffic counters were not always incremented.
  • Compilation errors occurred when building MLNX_OFED over a kernel in which the CONFIG_PTP_1588_CLOCK parameter was not set.
  • System used to hang when trying to allocate multiple device memory buffers from different processes simultaneously.

Enhancements

Changes and new features in HPE Mellanox RoCE driver version 4.3:

  • For ConnectX-5 adapters, added support for the following multi-packet Work Requests related verbs for control path:
    • ibv_exp_query_device
    • ibv_exp_create_srq
  • Added support for the following new features:
    • RDMA atomic commands offload so that when an RDMA write operation is issued, the payload indicates which atomic operation to perform, instead of being written to the Memory Region (MR).
    • Out-of-box RoCE LAG support for Red Hat Enterprise Linux 7 Update 2 and Red Hat Enterprise Linux 6 Update 9.
    • A new counter rx_steer_missed_packets which provides the number of packets that were received by the NIC, yet were discarded/dropped since they did not match any flow in the NIC steering flow table.
    • Ability for SR-IOV counter rx_dropped to count the number of packets that were dropped while vport was down.
    • RSYNC feature to ensure correct ordering of memory operations between the GPU and HCA. 
    • Triggering software reset for firmware/driver recovery. When fatal errors occur, firmware can be reset and driver reloaded.
    • Option to retrieve the Hardware timestamp when polling for completions from a completion queue that is attached to a multi-packet RQ (Striding RQ).
    • The following advanced burst control parameters:
      • max_burst_sz - for indicating the maximal burst size of packets
      • typical_pkt_sz - for improving the accuracy of the rate limiter
  • Removed support for Virtual MAC feature.
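
A counter such as rx_steer_missed_packets can be extracted from interface statistics with a small filter. The sketch below assumes the counter appears in the "name: value" output of ethtool -S <interface>; the function name ethtool_counter is hypothetical.

```shell
# Read "ethtool -S <interface>" output on stdin and print the value of one
# counter. Exit status is 0 if the counter was found, 1 otherwise.
ethtool_counter() {
    local name="$1"
    awk -v n="$name" -F': *' '
        { gsub(/^[ \t]+/, "", $1) }      # strip the leading indentation
        $1 == n { print $2; found = 1 }
        END { exit !found }
    '
}

# Typical use (the interface name eth0 is only an example):
#   ethtool -S eth0 | ethtool_counter rx_steer_missed_packets
```

The filter works on any saved ethtool -S output, which is useful when comparing snapshots taken before and after a traffic test.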

Version: 4.2 (2 Mar 2018)
Fixes

Upgrade Requirement:
Recommended - HPE recommends users update to this version at their earliest convenience.


The following issues have been fixed in version 4.2:

  • RPM commands occasionally used to fail and create a core file after reboot, with messages such as “Bus error (core dumped)”, causing the openibd service to fail to start.
  • RoCEv2 multicast traffic using RDMA-CM with an IPv4 address was not received by the adapter.
  • ethtool -P output was 00:00:00:00:00:00 when using old kernels.
  • Replaced a few “GPL only” legacy libibverbs functions with upstream implementations that conform to the libibverbs GPL/BSD dual license model.
  • An ACCESS_REG command failure used to appear in dmesg upon RoCE Multihost driver restart.
  • Concurrent client requests got corrupted when working in persistent server mode due to a race condition on the server side.
  • Client side did not exit gracefully in RTT mode when the server side was not reachable.

Enhancements

HPE Mellanox RoCE driver version 4.2 contains the following changes and new features:

  • Added a feature that allows registering a specific physical address range.
  • Added support for PTP feature over PKEY interfaces. This feature allows for accurate synchronization between the distributed entities over the network. The synchronization is based on symmetric Round Trip Time (RTT) between the master and slave devices, and is enabled by default. 
  • Support for Virtual MAC feature, which allows users to add up to 4 virtual MACs (VMACs) per Virtual Function (VF).
  • Added the option to change receive buffer size and cable length. Changing cable length will adjust the receive buffer's xon and xoff thresholds. 
  • Added support for the following GRE tunnel offloads:
    • TSO over GRE tunnels
    • Checksum offloads over GRE tunnels
    • RSS spread for GRE packets
  • Added support for the host side (RDMA initiator) in Red Hat Enterprise Linux 7 Update 2 and above.
  • Added support for the driver to notify the Firmware when Software receive queues are overloaded.
  • Added support for configuring PFC stall prevention in cases where the device unexpectedly becomes unresponsive for a long period of time. PFC stall prevention disables flow control mechanisms when the device is stalled for a period longer than the default pre-configured timeout. Users now have the ability to change the default timeout by moving to auto mode.
  • Added support for Q-in-Q VST feature in ConnectX-5 adapter cards family.
  • Added support for VGT+ in ConnectX-4/ConnectX-5 HCAs. This feature is an advanced mode of Virtual Guest Tagging (VGT), in which a VF is allowed to tag its own packets as in VGT, but is still subject to an administrative VLAN trunk policy. The policy determines which VLAN IDs are allowed to be transmitted or received. The policy does not determine the user priority, which is left unchanged.
  • Added support for hardware Tag Matching offload with Dynamically Connected Transport (DCT).
  • Added support for the driver to take an automatic snapshot of the device’s CR-Space in cases of critical failures.
