Remote Direct Memory Access (RDMA) is a network protocol that offers low latency, high bandwidth, and low CPU utilization. So recently, RDMA has been expected to improve performance of applications which need frequent network communications, and there are several attempts applying RDMA to OSS. For instance, Tensorflow (deep learning framework) and Apache Spark (data analytics platform) have RDMA implementations.
However, RDMA programming offers many decisions for programmers which may affect performance (e.g. RDMA operation, application's data layout on memory region). Depending on them, extra overheads such as memcpy() will occur and reduce RDMA's benefits. To make matters more complex, they rely on target application's features.
This presentation shows overviews of RDMA, choices of design, and example of applying RDMA to MXNet, an OSS deep learning framework, by using Infiniband and its native API, ibverbs.