Struct RdmaBackendConfig
Defined in File config.h
Struct Documentation
-
struct switchml::RdmaBackendConfig
Configuration options specific to the RDMA backend.
Public Members
-
uint32_t msg_numel
The number of elements in a single RDMA message. RDMA sends whole messages, and the NIC splits each message into multiple packets, so the number of elements in a message must be a multiple of the number of elements in a packet. Larger messages reduce the overhead of sending packet by packet. However, they also make losses more costly under UC transport, since losing a single packet forces retransmission of the whole message. Tune this value until you find the best trade-off for your network.
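The multiple-of-a-packet constraint above can be checked before a value is written to the configuration. The following is a minimal sketch, not part of SwitchML: the helper names AlignMsgNumel and IsValidMsgNumel, and the packet_numel parameter (the number of elements carried by one packet), are assumptions introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not a SwitchML API): returns true when msg_numel is a
// non-zero multiple of packet_numel, as the msg_numel description requires.
constexpr bool IsValidMsgNumel(std::uint32_t msg_numel, std::uint32_t packet_numel) {
    return packet_numel != 0 && msg_numel != 0 && msg_numel % packet_numel == 0;
}

// Hypothetical helper (not a SwitchML API): rounds a desired message size down
// to the nearest multiple of packet_numel, but never below one full packet.
// Returns 0 if packet_numel is 0, since no valid message size exists then.
constexpr std::uint32_t AlignMsgNumel(std::uint32_t desired_numel, std::uint32_t packet_numel) {
    if (packet_numel == 0) {
        return 0;
    }
    const std::uint32_t aligned = (desired_numel / packet_numel) * packet_numel;
    return aligned == 0 ? packet_numel : aligned;
}
```

When sweeping msg_numel to find a good value, stepping through multiples of the packet size (for example 1x, 2x, 4x, and so on) keeps every candidate valid by construction.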
-
std::string device_name
The name of the InfiniBand device to use, for example mlx5_0. You can run the ibv_devices command to list your available devices.
-
uint16_t device_port_id
Each InfiniBand device can have multiple ports; this value selects a specific one. Use the ibv_devinfo command to list the ports of each device along with their id/index. The id is the first number in a port's description: “port: 1” means you should set this variable to 1.
-
uint16_t gid_index
The GID index to use. Choose one of the following:
0: RoCEv1 with MAC-based GID
1: RoCEv2 with MAC-based GID
2: RoCEv1 with IP-based GID
3: RoCEv2 with IP-based GID
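The four gid_index values listed above follow a regular pattern: the RoCE version selects the low bit and the GID type selects the next bit. The sketch below encodes that table. The enum GidIndex and the function SelectGidIndex are hypothetical names introduced for illustration, not SwitchML APIs, and the mapping is taken solely from the list above.

```cpp
#include <cstdint>

// Hypothetical names for the four documented gid_index values.
enum class GidIndex : std::uint16_t {
    kRoCEv1MacBased = 0,
    kRoCEv2MacBased = 1,
    kRoCEv1IpBased = 2,
    kRoCEv2IpBased = 3,
};

// Hypothetical helper: computes the gid_index for a RoCE version and GID type,
// following the documented table (IP-based adds 2, RoCEv2 adds 1).
constexpr std::uint16_t SelectGidIndex(bool use_rocev2, bool ip_based) {
    return static_cast<std::uint16_t>((ip_based ? 2 : 0) + (use_rocev2 ? 1 : 0));
}
```

For example, SelectGidIndex(true, true) yields 3, which the table above describes as RoCEv2 with an IP-based GID.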
-
bool use_gdr
(Not implemented yet) Whether to try to use GPU Direct. If the submitted job's data resides on the GPU, GPU Direct allows the registered buffer to live in GPU memory as well, so data can be sent directly from the GPU instead of first being copied to a registered CPU buffer.
-
uint32_t