The performance tests carried out by the rCUDA team leveraging the new Connect-IB cards, along with the new NVIDIA Tesla K40 GPU, revealed that rCUDA is able to provide almost the same bandwidth as original CUDA. Our testbed was composed of two Ivy Bridge Xeon based computers equipped with the new dual-port Mellanox Connect-IB cards and a Tesla K40 GPU installed in one of the systems. In this hardware configuration, the maximum data transfer rate achieved by the bandwidthTest benchmark from the NVIDIA CUDA Samples when using local CUDA was 10.06 GB/s. When rCUDA was used to transfer data from the main memory of one computer to the GPU installed in the other computer, the same bandwidthTest benchmark achieved a maximum data transfer rate of 9.91 GB/s. This means that rCUDA attains 98.5% of CUDA's bandwidth, thus introducing a negligible performance loss.
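For reference, the following minimal sketch shows how a host-to-device bandwidth figure like the ones above can be obtained with the CUDA runtime API. It mirrors the kind of pinned-memory transfer that bandwidthTest measures, but the buffer size and iteration count are illustrative assumptions, not the benchmark's actual parameters. Since rCUDA intercepts the CUDA runtime API, code like this runs unmodified whether the GPU is local or remote.

/* Minimal host-to-device bandwidth sketch (illustrative, not the
 * actual bandwidthTest source). Compile with: nvcc bw_sketch.cu */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    const size_t bytes = 64 * 1024 * 1024;   /* 64 MiB per transfer (assumed size) */
    const int    iters = 100;                /* number of repetitions (assumed) */

    float *h_buf, *d_buf;
    /* Pinned (page-locked) host memory allows the fastest DMA transfers */
    cudaMallocHost((void **)&h_buf, bytes);
    cudaMalloc((void **)&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    /* Bandwidth in GB/s: total bytes moved divided by elapsed seconds */
    double gbps = ((double)bytes * iters) / (ms / 1000.0) / 1e9;
    printf("Host-to-device bandwidth: %.2f GB/s\n", gbps);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}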
Presentation titled "rCUDA: Share and Aggregate GPUs in Your Cluster" at the Mellanox theatre at SC13 in Denver
We are pleased to provide you with the link to the rCUDA presentation given at the Mellanox theatre during the Supercomputing Conference 2013 (SC13) in Denver, CO, last November. The presentation was titled "rCUDA: Share and Aggregate GPUs in Your Cluster" and provided a quick overview of the main features of rCUDA. You can find the presentation here.
In addition to the rCUDA demo that you can attend next week at the Mellanox Technologies booth (#2722) at SC13 in Denver, CO, we will be glad to answer your questions about this remote GPU virtualization technology. Furthermore, to help you better understand rCUDA, we have prepared several white papers, which will be available at SC13. You can find one of them under the "Support" tab of this website.